
1:

2: \documentclass{aadebug}

3:

4: % Uncomment the following line for the CoRR proceedings. The first number

5: % will be e-mailed to you after the workshop, the second is the page number

6: % of the first page of your article. Please look this number up in the

7: % proceedings.

8: \corr{0309027}{237}

9:

10: \newtheorem{Def}{Definition}

11:

12: \begin{document}

13:

14: \runningheads{Schaubschl\"ager, Kranzlm\"uller, Volkert}

15: {Event-based Program Analysis with DeWiz}

16:

17: \title{Event-based Program Analysis with DeWiz}

18:

19: \author{

20: Christian~Schaubschl\"ager\addressnum{1},

21: Dieter~Kranzlm\"uller\addressnum{1},

22: Jens~Volkert\addressnum{1}

23: }

24:

25: \address{1}{

26: GUP,

27: Joh. Kepler University Linz,

28: Altenbergerstr. 69,

29: A-4040 Linz,

30: Austria/Europe

31: schaubschlaeger@gup.uni-linz.ac.at

32: }

33:

34:

35: % This information will show up in `Document Properties' in Acrobat Reader

36: \pdfinfo{

37: /Title (Event-based Program Analysis with DeWiz)

38: /Author (Christian Schaubschl\"ager et al.)

39: }

40:

41: \begin{abstract}

42: Due to the increased complexity of parallel and distributed programs, debugging

43: them is considered to be the most difficult and time-consuming part of the

44: software lifecycle. Tool support is hence a crucial necessity to hide complexity

45: from the user. However, most existing tools seem inadequate as soon as the

46: program under consideration exploits more than a few processors over a long

47: execution time. This problem is addressed by the novel debugging tool DeWiz

48: (Debugging Wizard), whose focus lies on scalability. DeWiz has a modular,

49: scalable architecture, and uses the event graph model as a representation of the

50: investigated program. DeWiz provides a set of modules, which can be combined to

51: generate, analyze, and visualize event graph data. Within this processing

52: pipeline the toolset tries to extract useful information, which is presented to

53: the user at an arbitrary level of abstraction. Additionally, DeWiz is a

54: framework, which can be used to easily implement arbitrary

55: user-defined modules.

56: \end{abstract}

57:

58: \keywords{Program analysis; Debugging; Parallel computing; Distributed computing}

59:

60: \section{Introduction}

61:

62: It is well known that performance analysis and program debugging are

63: two of the most time-consuming and complex parts of the software life-cycle. This

64: is especially true for parallel or distributed programs, since parallelism and

65: (inter-process) communication introduce new obstacles which are unknown in

66: sequential programs, and increase the complexity of the software development

67: process.

68:

69: During the past years many program analysis and debugging tools have

70: been developed, using different approaches to hide the complexity of the analyzed

71: or debugged program from the user. Due to the (at least) two-dimensional nature

72: of the analysis data, namely time and space (in terms of processes), some kind of graphical

73: representation has turned out to be the most useful way to present the analysis

74: data to the user. Several approaches to graphical representation have been

75: proposed; most of them visualize a given program execution as a two-dimensional

76: space-time diagram. There is a broad range of tools in this field, for example

77: Vampir \cite{Nage96} and Paradyn \cite{Mill94}, just to list two. Some tools use

78: three-dimensional environments such as a CAVE to visualize a program execution,

79: for example as a Time Tunnel as described in \cite{Reed95}.

80:

81: A characteristic of parallel programs, which is becoming

82: increasingly important for tool developers, is scalability. With multiprocessor

83: machines and clusters deploying hundreds or thousands of processors, and grid

84: infrastructures combining large numbers of distributed resources, scalability of

85: program analysis tools seems a basic necessity.

86: An important factor that limits the scalability of tools is the sheer amount of

87: analysis data. Therefore it is essential for any analysis tool to keep

88: the amount of data presented to the user at a manageable size. This can be achieved in

89: two ways: firstly by addressing the data collection phase, i.e. by reducing the

90: actual amount of collected data. This approach is utilized in Paradyn, where

91: the amount of collected data is reduced through dynamic

92: instrumentation \cite{HoMi94}. The underlying idea is to extract only those data

93: items that are actually needed for program analysis. This reduces the

94: total amount of analysis data and thus permits the investigation of even large-scale

95: programs.

96:

97: On the other hand, even with data reduction applied in the collection

98: phase, the amount of trace data can grow to an enormous size for large-scale

99: programs that utilize a large number of processors over a long execution time,

100: which may exceed days, weeks, or even months. This makes it necessary to

101: focus on scalability also during the data analysis phase. Obviously trace

102: data must be analyzed in a reasonable time and the results must be presented

103: to the user in a meaningful way. Abstraction and graphical representation are the

104: two most important concepts for achieving scalability. An example of such an

105: abstraction mechanism can be found in EDL, the Event Definition Language

106: introduced by Bates and Wileden \cite{BaWi83}. EDL uses two essential mechanisms for event

107: abstraction: filtering and clustering. With filtering, all but a designated

108: subset of events can be deleted from the original event stream. Clustering means

109: that one or more primitive events are gathered together into a higher-level

110: event. EDL has led to the high-level debugging approach EBBA, Event Based

111: Behavioural Abstraction \cite{Bat95} and the program behaviour models of

112: FORMAN \cite{Aug98}. Both

113: models follow the idea that the behaviour observed in parallel programs

114: may reveal useful patterns, which can be evaluated during program analysis.

115: Another, more recent approach to program monitoring is EARL, which has

116: been proposed by Wolf and Mohr \cite{WoMo98}. EARL stands for Event

117: Analysis and Recognition Language; it allows target-independent

118: monitoring and analysis tools to be constructed by writing scripts in the EARL language.

119:

120: In this

121: paper we describe the scalable and modular debugging tool DeWiz (Debugging

122: Wizard), which uses the event graph model to represent a program's execution.

123: Data analysis and presentation are done by independent modules, which try to

124: automatically extract useful information. In Section 2 the architecture of DeWiz

125: is discussed, while in Section 3 we give some examples that show how DeWiz can be

126: used for program analysis. Finally, an outlook on future work concludes the paper.

127:

128: \section{Tool Architecture}

129: The approach of DeWiz stems from our work on the Monitoring and Debugging

130: environment MAD \cite{KrGr97}. MAD is a collection of software tools for debugging message

131: passing programs based on the MPI standard \cite{Mpi95}. At the core of this toolset are

132: the monitoring tool NOPE and the visualization tool ATEMPT. Although originally

133: developed for message passing programs, the toolset, especially the

134: monitor NOPE, has recently been extended so that shared memory codes can also be traced.

135: The motivation for this extension was that some of today's architectures are

136: best utilized by using a hybrid MPI/OpenMP programming style \cite{Rab02}.

137:

138: In the following we will describe the architecture, the theoretical model, as

139: well as some implementation aspects of DeWiz in more detail.

140:

141: \subsection{Event Graph}

142:

143: As mentioned above, in DeWiz program executions as recorded with NOPE

144: or event streams generated by online monitors

145: are represented as an event graph, which can be defined as follows:

146:

147: \begin{Def}[Event Graph \cite{Kra00}]

148: An event graph is a directed graph $G=(E,\rightarrow)$, where $E$

149: is the non-empty set of events $e \in E$, while $\rightarrow$ is a relation

150: connecting events, such that $x \rightarrow y$ means that

151: there is an edge from event $x$ to event $y$ in $G$ with the ``tail'' at event

152: $x$ and the ``head'' at event $y$.

153: \end{Def}

154:

155: The events $e \in E$ of an event graph are the events observed during a program's

156: execution, like for example send or receive events in message passing programs,

157: and read or write memory accesses in a shared memory program. In case of NOPE

158: there is a standard set of events that will be traced, namely (amongst others)

159: all MPI point-to-point communication events. However, it is easily possible

160: to specify additional user-defined events to be recorded with NOPE, which adds

161: great flexibility to the tool.

162:

163: The relation connecting the events of an event graph is the

164: {\em happened-before relation},

165: which is the transitive, irreflexive closure of the union of the

166: relations $\stackrel{S}{\rightarrow}$ and $\stackrel{C}{\rightarrow}$. It

167: has been defined as follows:

168:

169: \begin{Def}[Happened-before relation \cite{Lam78}]

170: The happened-before relation $\rightarrow$ is defined as\\

171: \begin{center}

172: $\rightarrow = (\stackrel{S}{\rightarrow} \cup \stackrel{C}{\rightarrow})^{+}$\\

173: \end{center}

174: where $\stackrel{S}{\rightarrow}$ is the sequential order of events

175: relative to a particular responsible object,

176: while $\stackrel{C}{\rightarrow}$ is the concurrent order relation connecting

177: events on arbitrary responsible objects.

178: \end{Def}

179:

180: In other words, the relation $\stackrel{S}{\rightarrow}$ defines the sequential

181: order of events on a particular process, with the meaning that if two events

182: $e_p^i$ and $e_p^j$ occur on the same process and $e_p^i$ occurs before $e_p^j$

183: then $e_p^i \stackrel{S}{\rightarrow} e_p^j$.

184: The concurrent order relation $\stackrel{C}{\rightarrow}$ describes the order

185: of corresponding events on different processes, which is established by

186: communication and synchronization. If $e_p^i$ is a send event on process $p$

187: and $e_q^j$ is the corresponding receive event on process $q$, then

188: $e_p^i \stackrel{C}{\rightarrow} e_q^j$.
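
As a small illustration (the concrete events are chosen here for the example and are not taken from a recorded trace), consider two processes $p$ and $q$, where $e_p^1$ is a send event whose corresponding receive event is $e_q^2$. The two base relations then contain
\[
e_p^1 \stackrel{S}{\rightarrow} e_p^2, \qquad
e_q^1 \stackrel{S}{\rightarrow} e_q^2, \qquad
e_q^2 \stackrel{S}{\rightarrow} e_q^3, \qquad
e_p^1 \stackrel{C}{\rightarrow} e_q^2 .
\]
Taking the transitive closure additionally yields $e_p^1 \rightarrow e_q^3$ (via $e_q^2$) and $e_q^1 \rightarrow e_q^3$. Neither $e_p^2 \rightarrow e_q^3$ nor $e_q^3 \rightarrow e_p^2$ holds, so these two events are not ordered by the happened-before relation.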

189:

190: The DeWiz toolset uses the event graph model as its theoretical foundation. The tool

191: itself consists of three main components, the {\em modules}, the

192: {\em protocol}, and a {\em framework}, which are required to construct a DeWiz

193: system for a concrete analysis task.

194:

195: \subsection{DeWiz System}

196:

197: A DeWiz system is built by connecting a set of DeWiz modules, which then act

198: as a kind of event-graph processing pipeline, i.e. the DeWiz modules are

199: responsible for the actual work in a DeWiz system. This modular approach

200: has several advantages. It makes the DeWiz system flexible and easily

201: extensible. Users can utilize existing modules or, if needed, implement their

202: own modules, hence adding arbitrary functionality to the system.

203:

204: Basically we distinguish three kinds of modules:

205:

206: \begin{itemize}

207:   \item Event graph generation modules

208:   \item Automatic analysis modules

209:   \item Data access modules

210: \end{itemize}

211:

212: The modules in a DeWiz system communicate with each other using

213: a specialized protocol,

214: the DeWiz protocol. This protocol is based upon TCP/IP, which makes it

215: possible to distribute a DeWiz system across several computers.

216: Due to this approach, the monitoring and analysis tasks themselves can

217: utilize a potentially large number of resources, e.g. by putting

218: the analysis tasks on the grid \cite{Foster}.

219: For example it would be feasible to execute only the

220: monitoring module on the computer where the monitored application

221: is running. The monitoring module would then send the collected

222: events to an analysis module which is executed on some other computer, and

223: so on. Since analysis or processing of monitored events in general can be

224: very time-consuming tasks, the distribution of these tasks can speed up

225: the analysis process significantly.

226:

227: As mentioned above we distinguish three types of modules. These will be

228: described in more detail in the following sections.

229:

230: \subsubsection{Event Graph Generation Modules}

231:

232: Event graph generation modules are those that produce the event graph data

233: stream from a given program execution. This can be done in two ways, either online or

234: post-mortem. In the case of online tracing, a DeWiz module connects to a running,

235: instrumented program, collects events which are generated by the online

236: monitor, and forwards these events to the next module in the DeWiz system.

237: Currently DeWiz supports online monitors which correspond to the

238: OMIS Compliant Monitor OCM \cite{WiTr98}. There is also an interface to the

239: OpenMP Pragma and Region Instrumentor OPARI \cite{MoMa01}.

240:

241: In the case of post-mortem tracing, events are read from tracefiles by an appropriate

242: DeWiz module. Currently there is a module for reading tracefiles generated

243: by NOPE.

244:

245: \subsubsection{Automatic Analysis Modules}

246:

247: Automatic analysis modules process an event graph stream and try to extract

248: useful information, for example communication patterns, or erroneous

249: behaviour such as communication errors. The latter is relatively easy,

250: for example by simply comparing the message lengths at a send event

251: and at the corresponding receive event. If the lengths differ, this is an

252: indication of a possible communication error. A more challenging task

253: is to try to find communication patterns in an event graph. By applying

254: pattern-matching algorithms to the event graph, we try to identify patterns

255: such as loops. If any

256: irregularities are found in the pattern, this would again indicate a possible

257: error in the investigated program.
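
The message-length check can be stated compactly. The notation $\mathit{len}(e)$ for the message length recorded at a communication event $e$ is introduced here only for illustration and is not part of the DeWiz protocol. Under this assumption, a pair of corresponding events is flagged whenever
\[
e_p^i \stackrel{C}{\rightarrow} e_q^j
\quad\mbox{and}\quad
\mathit{len}(e_p^i) \neq \mathit{len}(e_q^j).
\]
For instance, a send event with $\mathit{len}(e_p^i) = 1024$ whose corresponding receive event records $\mathit{len}(e_q^j) = 512$ would be reported as a possible communication error (the values are hypothetical).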

258:

259: \subsubsection{Data Access Modules}

260:

261: At the end of the processing pipeline we have data access modules. Their

262: purpose is to present the various analysis results, which were generated by

263: the preceding modules, to the user. Depending on the kind of analysis data,

264: a suitable form of visualization will be chosen. In most cases this will be some

265: form of graphical representation, for example a space-time diagram of

266: the event graph.

267: Figure~\ref{comm_failures} shows a visualization of an example message-passing

268: event-graph.

269: The participating processes are displayed on the vertical axis, whereas

270: the horizontal axis represents time. The black arrows represent messages

271: which are sent from one process to another, with the tail of the arrow at

272: the send event on the source process, and the tip of the arrow at

273: the receive event on the destination process.

274: The colored arrows indicate possible communication errors; these will be

275: described in more detail below.

276:

277: \subsection{The DeWiz Protocol and Framework}

278:

279: The DeWiz Protocol is used between modules to transport the event graph stream.

280: For this purpose it is necessary to define data structures which represent the

281: observed events. In our case the following two data structures have been defined:\\

282:

283: \begin{center}

284:   event: $e_p^i = (p,i,\mathit{type},\mathit{data})$\\

285:   \vspace{\baselineskip}

286:   concurrent order relation: $e_p^i \stackrel{C}{\rightarrow} e_q^j = (p,i,q,j)$\\

287: \end{center}

288:

289: The variables $p$ and $i$ represent the responsible object (e.g. a process) on which

290: the event occurred and its sequential order, respectively. The variable $type$

291: denotes the kind of event, in case of a message passing code a send or a receive

292: operation for example, or a semaphore lock in a shared memory environment. Currently

293: only message-passing and shared-memory events are supported, but due to its

294: flexibility, the event graph can be used to model any kind of software system.

295: Table~\ref{evt_table} gives a short overview of several possible software

296: systems, their corresponding event types and event data.

297: The $data$ variable can be used to store additional information concerning the event,

298: for example timestamps or the parameters of the function call that caused the

299: event.
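
For illustration, a send event that is the third event on process $0$ could be represented as
\[
e_0^3 = (0,\, 3,\, \mathit{send},\, \{\mathit{destination} = 1,\ \mathit{length} = 1024\}) .
\]
If the corresponding receive event is the second event on process $1$, the two events are connected by the concurrent order tuple $(0,3,1,2)$. The field names and values inside the data component are assumptions made for this example; the actual encoding is not specified here.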

300:

301: \begin{table}

302: \begin{center}

303: \begin{tabular}{|p{4cm}|p{3cm}|p{5cm}|}

304: \hline

305:    target system & event type & event data \\[0.6cm]

306: \hline\hline

307:   parallel/distributed message-passing program &

308:   send &

309:   message data, message-length, destination, message-type, data-type,...\\[0.6cm]

310: \hline

311:   multi-threaded shared memory program & lock & semaphore, waiting time,...\\[0.6cm]

312: \hline

313:   database/transaction system & read record & table, location of table, access time,...\\[0.6cm]

314: \hline

315:   file input/output & write & filename, device, buffer size,...\\[0.6cm]

316: \hline

317: \end{tabular}

318: \caption{Example events and event attributes}

319: \label{evt_table}

320:

321: \end{center}

322: \end{table}

323:

324: The concurrent order relation connects corresponding events as described above.

325: In DeWiz we use logical vector clocks as described by Fidge \cite{Fidg91} to

326: implement the concurrent order relation.
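
As a small worked example of this idea (a sketch in one common formulation of vector clocks; the concrete events and the way DeWiz stores the clocks are assumptions made here for illustration), consider two processes $p$ and $q$ with clocks $V = (v_p, v_q)$. Each event increments the component of its own process, and a receive event first takes the component-wise maximum with the clock attached to the message:
\begin{center}
\begin{tabular}{ll}
$V(e_p^1) = (1,0)$ & send on $p$ \\
$V(e_p^2) = (2,0)$ & local event on $p$ \\
$V(e_q^1) = (0,1)$ & local event on $q$ \\
$V(e_q^2) = (1,2)$ & receive of the message sent at $e_p^1$ \\
\end{tabular}
\end{center}
Since $V(e_p^1) = (1,0)$ is component-wise less than or equal to $V(e_q^2) = (1,2)$ and the two clocks differ, $e_p^1 \rightarrow e_q^2$ holds. In contrast, $V(e_p^2) = (2,0)$ and $V(e_q^2) = (1,2)$ are incomparable, so neither of these two events happened before the other.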

327:

328: With the DeWiz Framework it is possible to implement DeWiz modules for any

329: desired functionality. The Framework is written in the Java

330: programming language and provides a set of API functions which simplify

331: the development of user-defined modules, for example by hiding the

332: DeWiz protocol from the user.

333:

334: \section{Examples}

335:

336: \subsection{Overview}

337:

338: \begin{figure}

339: \centering

340: \includegraphics[scale=0.4]{dewiz.eps}

341: \caption{An Example DeWiz System}

342: \label{dewiz}

343: \end{figure}

344:

345: In this section we present an example DeWiz system. If the modules for

346: a concrete analysis task are available, the user may start to construct a

347: corresponding DeWiz-System. The modules are placed and initialized on arbitrary

348: networked computing nodes. A dedicated module, the DeWiz Sentinel, is used to

349: control a particular DeWiz System. With a controller interface, available

350: modules may be arbitrarily interconnected by identifying corresponding input and

351: output interfaces.

352: An example for the DeWiz controller interface is shown in Figure~\ref{dewiz}. The smaller

353: window in front shows the module table, including all registered modules (by id

354: and name), their available interfaces and status, the implemented features

355: (send, receive, or none), and the ids of corresponding consumer or producer

356: modules. The larger background window of Figure~\ref{dewiz} provides the same information

357: in the form of a module diagram.

358:

359: To use DeWiz in a particular programming environment, dedicated event graph

360: generation modules have been implemented. As mentioned above, currently there

361: is a trace-reader module for NOPE, as well as an interface to OMIS-compliant

362: monitors and an extension to OPARI.

363:

364: Concerning data access modules, DeWiz provides an interface to the analysis tool

365: ATEMPT (Figure~\ref{comm_failures}), a Java applet to display the event graph stream

366: in arbitrary web

367: browsers (Figure~\ref{iedewiz}), and an SMS notifier for critical failures

368: during program execution (Figure~\ref{handy}).

369:

370: \begin{figure}

371: \centering

372: \includegraphics[scale=0.5]{iedewiz.eps}

373: \caption{Visualization of an event-graph in a Java applet}

374: \label{iedewiz}

375: \end{figure}

376:

377: \begin{figure}

378: \centering

379: \includegraphics[scale=0.9]{handy.eps}

380: \caption{DeWiz SMS notifier}

381: \label{handy}

382: \end{figure}

383:

384: The analysis functionality already implemented in DeWiz is illustrated with the

385: following two examples:

386:

387: \begin{itemize}

388:   \item Extraction of communication failures

389:   \item Pattern matching and loop detection

390: \end{itemize}

391:

392: \subsection{Communication Failures}

393:

394: Communication failures can be detected by pairwise analysis of communication

395: events. An example of a possible communication failure is the detection of

396: different message lengths at a send event and the corresponding receive event.

397: Though this is not necessarily a communication failure, the default event-graph

398: visualization module of DeWiz highlights such

399: send or receive events, respectively, and the user can easily check whether

400: this is intended or not. Another more obvious example of a communication

401: failure is the detection of pending send or receive events, which are also

402: highlighted in the event-graph visualization. Isolated events can originate for

403: example from a wrong destination address given at a send event. The consequence

404: would be that the corresponding receive event (in case it is a blocking

405: receive event) would wait forever for the message, thus blocking the

406: receiving process forever.

407: In Figure~\ref{comm_failures}

408: an example event-graph with several possible communication errors is shown.

409:

410: \begin{figure}

411: \centering

412: \includegraphics[scale=0.7]{commerr.eps}

413: \caption{Possible communication errors in a message-passing program}

414: \label{comm_failures}

415: \end{figure}

416:

417: \subsection{Pattern Matching - Loop Detection}

418:

419: A more complex analysis activity compared to the extraction of communication

420: failures is pattern matching and loop detection. The goal of the corresponding

421: DeWiz modules is to identify repeated process interaction patterns in the event

422: graph. An example event graph is shown in Figure~\ref{sim_ex}. This pattern is called the

423: {\em simple exchange} pattern and can be defined as the event graph\\

424:

425: \begin{center}

426:

427: $EX(i,p,q) = (EX_{ev}(i,p,q),EX_{rel}(i,p,q))$ with\\

428: \vspace{\baselineskip}

429: $EX_{ev}(i,p,q) = \{e_p^i,e_p^{i+1},e_q^i,e_q^{i+1} \}$ and\\

430: \vspace{\baselineskip}

431: $EX_{rel}(i,p,q)=\{(e_p^i \stackrel{S}{\rightarrow} e_p^{i+1}),

432:                  (e_q^i \stackrel{S}{\rightarrow} e_q^{i+1}),

433:                  (e_p^i \stackrel{C}{\rightarrow} e_q^{i+1}),

434:                  (e_q^i \stackrel{C}{\rightarrow} e_p^{i+1}) \}$

435: \end{center}

436:

437: where events $e_p^i, e_p^{i+1}$ occur on

438: process $p$ and events $e_q^i, e_q^{i+1}$ occur on process $q$ with $p \neq q$.

439: The existence of this simple pattern in an event graph can easily be verified

440: within a DeWiz module. More complex

441: patterns can be specified and provided in a pattern database according to

442: the needs of users and the characteristics of their programs.
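
As a concrete instance of the definition (the indices are chosen here purely for illustration), the simple exchange pattern for $i=1$ on processes $0$ and $1$ consists of the events
\[
EX_{ev}(1,0,1) = \{ e_0^1, e_0^2, e_1^1, e_1^2 \}
\]
and the relations
\[
EX_{rel}(1,0,1) = \{ (e_0^1 \stackrel{S}{\rightarrow} e_0^2),\
(e_1^1 \stackrel{S}{\rightarrow} e_1^2),\
(e_0^1 \stackrel{C}{\rightarrow} e_1^2),\
(e_1^1 \stackrel{C}{\rightarrow} e_0^2) \} .
\]
Read with the definition of $\stackrel{C}{\rightarrow}$ given earlier, both processes first send ($e_0^1$, $e_1^1$) and then receive the message of the other process ($e_0^2$, $e_1^2$). Verifying the pattern amounts to checking that these four relations are present in the event graph.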

443:

444: \begin{figure}

445: \centering

446: \includegraphics{sim_ex.eps}

447: \caption{Simple exchange}

448: \label{sim_ex}

449: \end{figure}

450:

451: The purpose of detecting patterns in an event-graph is two-fold. Firstly,

452: if it is possible to detect repeated iterations of a pattern in an event

453: graph, this knowledge can be used when the event-graph is visualized,

454: e.g. as a space-time diagram. By replacing the possibly complicated patterns

455: with simpler symbols, the complexity of the visual representation of the

456: event-graph can be reduced greatly, which would give the user a better

457: overview of the investigated program.

458:

459: Secondly, the user could specify a communication pattern which is

460: expected to occur in the investigated program. DeWiz will compare the

461: given pattern with the event-graph and detect possible deviations, which

462: could possibly originate from an error in the program. Another example is

463: the repeated occurrence of any pattern, possibly within a loop. DeWiz will

464: in a first step detect the pattern, and then check for irregularities in

465: the sequence of this pattern. Figure~\ref{pattern} illustrates such a situation.

466: We see a relatively complex event-graph, which is the trace of an

467: execution of a finite-element message-passing program executed on 16 processes.

468: Despite its complexity, one can relatively easily see the iterations of a pattern,

469: as well as a significant irregularity (in the middle of the diagram). Again,

470: this is an indication of a possible communication error.

471:

472: \begin{figure}

473: \centering

474: \includegraphics[scale=0.6]{pattern.eps}

475: \caption{Event-graph with iterations of a pattern}

476: \label{pattern}

477: \end{figure}

478:

479:

480: \section{Conclusion and Future Work}

481:

482: Performance analysis and debugging of parallel and distributed programs are

483: difficult activities. The problems increase further if program executions

484: with large numbers of processes need to be investigated. For that reason,

485: scalability of software analysis tools is an important characteristic.

486:

487: The modular approach of DeWiz provides scalable parallel program analysis by

488: abstracting the program's behavior as an event graph and distributing the

489: analysis activities of this graph across existing resources. With this approach,

490: DeWiz is able to cope with very large amounts of analysis data, while providing

491: capabilities comparable to existing analysis tools.

492: The current implementation of DeWiz represents a first proof of concept.

493: However, for practical application of DeWiz, more experiments with real-world

494: applications are needed. In addition, some more interfaces to existing analysis

495: tools are required. With the flexible structure of DeWiz and the well-defined

496: protocol, an interface to an already existing analysis tool can easily be

497: established. In this way, the analysis tool benefits from the capabilities of

498: DeWiz and achieves a higher level of scalability.

499:

500: \bibliography{paper}

501:

502: \end{document}

503: