1: \documentstyle[epsfig,11pt]{article}
2: \setlength\textwidth{6.5in}
3: \setlength\oddsidemargin{0in}
4: \setlength\evensidemargin{0in}
5: \setlength\topmargin{0in}
6: \setlength\headheight{0in}
7: \setlength\headsep{0in}
8: \setlength\textheight{8.5in}
9: \newcommand{\spin}{{\sc Spin}}
10: \newcommand{\promela}{{\sc Promela}}
11: \newcommand{\otter}{{\sc Otter}}
12: \newcommand{\imp}{{\longrightarrow}}
13: \begin{document}
14: \pagenumbering{roman}
15: \thispagestyle{empty}
16: \setcounter{page}{1}
17: \begin{center}
18:
19: \vspace{.2in}
20: \rule{1.5in}{.01in}\\ [1ex]
21: ANL/MCS-TM-261 \\
22: \rule{1.5in}{.01in}
23:
24: \vspace{1.5in}
25: {\Large\bf Methods to Model-Check Parallel Systems Software}\\[2ex]
26: by \\ [2ex]
27:
28: {\Large\it Olga Shumsky Matlin, William McCune, and Ewing Lusk}\\
29: {\tt{\{matlin,mccune,lusk\}@mcs.anl.gov}}
30: %
31:
32: \vspace{1.5in}
33: Mathematics and Computer Science Division
34:
35: \bigskip
36:
37: Technical Memorandum No. 261
38:
39: \vspace{1in}
40: April 2003
41: \end{center}
42:
43: \vfill
44: \noindent
45: This work was supported by the Mathematical, Information, and Computational
46: Sciences Division subprogram of the Office of Advanced Scientific Computing
47: Research, Office of Science, U.S. Department of Energy, under Contract
48: W-31-109-ENG-38.
49: %
50:
51: %
52: %
53: %
54: %
55: \newpage
56: \noindent
57: Argonne National Laboratory, with facilities in the states of Illinois
58: and Idaho, is owned by the United States Government and operated by The
59: University of Chicago under the provisions of a contract with the
60: Department of Energy.
61:
62: \vspace{2in}
63:
64: \begin{center}
65: {\bf DISCLAIMER}
66: \end{center}
67:
68: \noindent
69: This report was prepared as an account of work sponsored by an agency
70: of the United States Government. Neither the United States Government
71: nor any agency thereof, nor The University of Chicago, nor any of
72: their employees or officers, makes any warranty, express or implied,
73: or assumes any legal liability or responsibility for the accuracy,
74: completeness, or usefulness of any information, apparatus, product, or
75: process disclosed, or represents that its use would not infringe
76: privately-owned rights. Reference herein to any specific commercial
77: product, process, or service by trade name, trademark, manufacturer,
78: or otherwise, does not necessarily constitute or imply its
79: endorsement, recommendation, or favoring by the United States
80: Government or any agency thereof. The views and opinions of document
81: authors expressed herein do not necessarily state or reflect those of
82: the United States Government or any agency thereof.
83: \vspace{1in}
84: \begin{center}
85: Available electronically at http://www.doe.gov/bridge
86:
87: Available for a processing fee to U.S. Dept.\ of
88: Energy and its contractors, in paper, from:
89:
90: U.S. Department of Energy\\
91: Office of Scientific and Technical Information\\
92: P.O. Box 62\\
93: Oak Ridge, TN 37831-0062\\
94: phone: (865) 576-8401\\
95: fax: (865) 576-5728\\
96: email: reports@adonis.osti.gov
97: \end{center}
98: \newpage
99:
100: \tableofcontents
101:
102: \newpage
103:
104: \setcounter{page}{0}
105: \pagenumbering{arabic}
106: \begin{center}
107: {\Large\bf Methods to Model-Check Parallel Systems Software}\\ [1ex]
108: by \\ [1ex]
109: Olga Shumsky Matlin, William McCune, and Ewing Lusk
110: \end{center}
111:
112: \addcontentsline{toc}{section}{Abstract}
113: \abstract{We report on an effort to develop methodologies for formal
114: verification of parts of the Multi-Purpose Daemon (MPD) parallel
115: process management system. MPD is a distributed collection of
116: communicating processes. While the individual components of the
117: collection execute simple algorithms, their interaction leads to
118: unexpected errors that are difficult to uncover by conventional
119: means. Two verification approaches are discussed here: the standard
120: model checking approach using the software model checker \spin\ and
121: the nonstandard use of a general-purpose first-order
122: resolution-style theorem prover \otter\ to conduct the traditional
123: state space exploration. We compare modeling methodology and
124: analyze performance and scalability of the two methods with respect
125: to verification of MPD.}
126:
127: \section{Introduction}
128: \label{sec:introduction}
129: Reasoning about parallel programs is surprisingly difficult. Even
130: small parallel programs are difficult to write correctly, and an
131: incorrect parallel program is equally difficult to debug, as we
132: experienced while writing the Multi-Purpose Daemon (MPD), a process
133: manager for parallel programs
134: \cite{bgl00:mpd:pvmmpi00,butler-lusk-gropp:mpd-parcomp}. Despite
135: MPD's small size and apparent simplicity, errors have impeded progress
136: toward code in which we have complete confidence. Such a situation
137: motivates us to explore program verification techniques.
138:
139: MPD is itself a parallel program. Its function is to start the
140: processes of a parallel job in a scalable way, manage input and
141: output, handle faults, provide services to the application, and
142: terminate jobs cleanly. MPD is the sort of process manager needed to
143: run applications that use the standard
144: MPI~\cite{mpi-forum:journal,mpi-forum:mpi2-journal} library for
145: parallelism, although it is not MPI specific. MPD is distributed as
146: part of the portable and publicly available
147: MPICH~\cite{gropp-lusk:mpich-www,gropp-lusk-doss-skjellum:mpich}
148: implementation of MPI.
149:
150: Our first attempt to use formal verification to ensure correctness of
151: MPD algorithms \cite{mpd-acl2} employed the ACL2 \cite{acl2:book}
152: theorem prover, which allows one to both simulate and verify a model
153: within a single environment. Components of the MPD system,
154: as well as the elements of the Unix socket library on which MPD is
155: based, were formalized in a subset of Lisp. The formalization was based on
156: a so-called oracle \cite{acl2:oracle}, which allows analysis of a
157: parallel system in a sequential environment. The oracle specifies an
158: execution interleaving of concurrent processes and is randomly
159: generated for simulations. Verification is conducted with respect to
160: an arbitrary oracle (i.e., an arbitrary execution interleaving);
161: thus, a property proved in such a way holds for all possible
162: executions of a collection of concurrent processes. In this approach
163: parsing simulation results, formulating desired properties of models
164: of MPD algorithms and reasoning about such models proved difficult,
165: leading us to abandon this traditional theorem-proving method of
166: verification and to try instead model-checking techniques. Two such
167: techniques are described here.
168:
169: The first technique employed the model checker
170: \spin~\cite{spin:book,spin:article}, which
171: supports design and verification of asynchronous distributed
172: communicating systems. Models of such systems are recorded in the
173: special high-level verification language \promela, which can also be
174: used to state some correctness properties of the models. Additional
175: correctness properties are specified by using linear temporal logic.
176: The verification engine of \spin\ is based on on-the-fly reachability
177: analysis with several optimizations and heuristics, such as
178: partial-order reduction and bitstate hashing. The system also
179: includes a simulation environment and a graphical user interface.
180: \spin\ has been used in various settings (see~\cite{spin-web-page} for
181: all of the proceedings of the \spin\ workshops). Because the dynamic
182: nature of MPD is easily expressible in the \spin/\promela\ framework,
183: the tool is a natural choice for verification of our system.
184: However, our early experiences with \spin\ \cite{spin-mpd} suggest
185: that the most natural formalization of certain MPD algorithms in
186: \promela\ leads to performance and scalability challenges. While we
187: have since addressed some of these challenges, as described in
188: Section~\ref{sec:spin}, they led us to explore other ways to formalize
189: and model check MPD algorithms.
190:
191: Specifically, because of considerable in-house theorem-proving
192: expertise, we were led to explore whether a theorem prover can be
193: successfully adapted for our purposes. We have used the
194: general-purpose first-order resolution and paramodulation theorem
195: prover \otter\ \cite{McCune94a}. While the tool is widely used, its
196: primary application is in proof search, mainly in mathematics and
197: logic. The input language is that of first-order logic. Otter does,
198: however, contain extensions for evaluable terms, which are essential
199: to our unusual use of the theorem prover for state-space exploration.
200:
201: The remainder of the paper is structured as follows. In
202: Section~\ref{sec:mpd} we describe MPD in more detail and briefly
203: present the algorithms, along with their correctness properties, for
204: which the verification approaches are evaluated. In
205: Sections~\ref{sec:spin} and \ref{sec:otter} we outline formalizations
206: of the MPD algorithms in \promela\ and the input language of \otter. In
207: Section~\ref{sec:results} we compare and analyze concrete results of
208: specific verification experiments. We conclude with a summary in
209: Section~\ref{sec:summary}.
210:
211: \section{The Multi-Purpose Daemon}
212: The MPD system comprises several types of processes. The {\em
213: daemons\/} are persistent (may run for weeks or months at a time,
214: starting many jobs) and are connected in a ring. {\em Manager\/}
215: processes, started by the daemons to control the application
216: processes ({\em clients\/}) of a single parallel job, provide most of
217: the MPD features and are also connected in a ring. A separate set of
218: managers supports an individual process environment for each user
219: process. A {\em console\/} process is an interface between a user
220: and the daemon ring. A representative topology of the MPD system is
221: shown in Figure~\ref{fig:mpds-all}.
222: \label{sec:mpd}
223: \begin{figure*}%
224: \centerline{ \epsfxsize=4.75in \epsfbox{newmpds1.eps}}
225: \caption{Daemons with console process, managers, and clients}
226: \label{fig:mpds-all}
227: \end{figure*}
228: The vertical solid lines represent
229: connections based on pipes; the remaining solid lines represent
230: connections based on Unix sockets. The dashed and dotted lines are
231: potential or special-purpose connections.
232:
233: Each of the daemon, manager, and console process types has essentially
234: the same pattern of behavior. This feature important because it
235: allows us to model these processes in a consistent manner. After
236: initialization, the process enters an infinite, mostly
237: idle, loop, implemented by the Unix socket function \texttt{select}.
238: When a message arrives on one of its sockets, the process calls the
239: appropriate message handler routine and reenters the idle {\tt
240: select} state. The handler does a small amount of processing,
241: creating new sockets or sending messages on existing ones. The
242: logic of the distributed algorithms executed by the system as a whole
243: is contained primarily in the handlers, and this is where the
244: difficult bugs appear.
245:
246: In the following we compare the two verification approaches on two
247: algorithms: a daemon-level ring establishment algorithm and a
248: manager-level barrier algorithm. We present an outline of each
249: algorithm and highlights of the modeling approach.
250:
251: \subsection{Ring Establishment Algorithm}
252: Establishment and maintenance of a ring of daemons are central to the
253: operation of MPD. Informally, daemon ring creation proceeds as
254: follows. The initial daemon establishes a listening port to which
255: subsequent connections are made. The daemon connects to its own
256: listening port, creating a ring of one daemon. The listening port of
257: the first daemon and the name of the host processor are queried from
258: the console. The desired number of daemons is then initiated and
259: directed to enter the ring by connecting to the first daemon.
260: Figure~\ref{fig:insert}
261: \begin{figure*}%
262: \centerline{ \epsfxsize=4.75in \epsfbox{create1.eps} }
263: \caption{Ring insertion}
264: \label{fig:insert}
265: \end{figure*}
266: shows the result of inserting a new daemon into an existing ring.
267: After the insertion is completed, the old connection between
268: daemons on the right and left of the new daemon is disconnected (shown
269: in the figure by the dashed line). Note that in the special case of
270: insertion into a ring of one daemon, the daemon plays both the left
271: and the right roles.
272:
273: The algorithm may be viewed as consisting of two parts, each of which
274: may potentially contain errors. One part deals with establishment of
275: listening ports and sockets to enable bidirectional communication
276: between processes. The second part concerns the passage of messages
277: over the established communication links and their handling upon
278: receipt. Correct socket establishment depends to a large degree on
279: the correct implementation of the Unix socket library. The \promela\
280: and \otter\ models described below do not check correctness of this
281: part of the algorithm but rather concentrate on the correctness of
282: passing messages between processors, correctness of message handling,
283: and correctness and consistency of the global system state. Under
284: this partitioning of the algorithm, and assuming that each processor
285: records the identity of its right and left neighbors, the correctness
286: property, stated as a postcondition, can be formulated as follows.
287:
288: The algorithm is correct for $i$ daemons when, upon termination of the
289: algorithm, the ring has $i$ distinct connected components and when, for
290: any processors $i$ and $j$ in the ring, if the right neighbor of $i$
291: is $j$, then the left neighbor of $j$ is $i$.
292:
293: \subsection{Barrier Algorithm}
294:
295: Parallel programs frequently rely on a \textit{barrier} mechanism to
296: ensure that all processes of the job reach a certain point (complete
297: initialization, for example) before any are allowed to proceed
298: further. Parallel jobs, that is, programs running on the clients, rely
299: on the manager processes to implement the barrier service. The
300: algorithm proceeds as follows. A manager process is designated as
301: the leader of the algorithm and is given a rank of 0. When the
302: leader reads a request from its client to provide the barrier service,
303: it sends a message \texttt{barrier\_in} to its right-hand side neighbor
304: in the ring. When a non-leader manager receives the
305: \texttt{barrier\_in} message, its behavior is determined by whether its
306: client has already requested the barrier service. If the client has
307: done so, the manager forwards the message to the right-hand side
308: manager. Otherwise, it holds the \texttt{barrier\_in} message until
309: the request from the client arrives. While the \texttt{barrier\_in}
310: message is held, a bit variable \texttt{holding\_barrier\_in} is set.
311: Once the \texttt{barrier\_in} message traverses the entire manager ring
312: and arrives back in the leader, meaning that each client has reached
313: the barrier and notified its manager, the leader sends a
314: \texttt{barrier\_out} message around the ring. When a manager receives
315: the \texttt{barrier\_out} message, it notifies its client to proceed
316: past the barrier. The leader can be either the first or the last
317: manager to allow its client to proceed.
318:
319: The barrier algorithm, as the ring establishment algorithm, can be
320: viewed as consisting of two parts: socket handling and message
321: handling. Here, however, the socket handling portion of the algorithm
322: is largely unimportant to the verification effort, since the
323: communication paths between processes are completely established
324: before the algorithm begins execution. The socket portion may become
325: important in future verification efforts that concentrate on
326: \textit{interaction} of MPD algorithms and examine the barrier
327: algorithm in conjunction with an algorithm that manages --- that is,
328: establishes, breaks down, or restores after a fault --- connections
329: between processes.
330:
331: Two correctness properties, a postcondition and an invariant, are
332: verified for the barrier algorithm. The postcondition property is that
333: all clients have been released from the barrier. The invariant
334: demands that no client be released until every client
335: has reached the barrier, that is, every client has requested the
336: barrier service from its manager.
337:
338: \section{Formalization and Verification of MPD with \spin}
339: \label{sec:spin}
340:
341: To formalize an MPD algorithm in \promela, the modeling language of
342: \spin, one has to make three related decisions: what variables
343: need be defined globally and locally to record the necessary
344: information during the execution of the algorithm, how to model
345: the communication network, and whether and how to model the Unix
346: socket functions. When considered together, the three modeling
347: decisions determine how abstract the resulting model will be.
348:
349: The original approach \cite{spin-mpd} to model checking MPD algorithms
350: with \spin\ produced models that were too literal, meaning that the
351: \promela\ modeling language was used in a way that closely resembled
352: C. The motivation for such an approach was to enable automated
353: extraction of executable C code from verified \promela\ models. As a
354: consequence of such a modeling methodology it was possible to
355: formalize within a single model both the socket-handling and the
356: message-handling portions of the MPD algorithms. Unfortunately,
357: another consequence of the approach was poor verification performance.
358: We were able to verify the ring establishment algorithm on only a few (
359: less than five) daemons, far short of the desired goal to verify
360: models with ten to twenty processors (see Section~\ref{sec:size-note}
361: for discussion of why verifying models of this size would be
362: interesting). During verification attempts on
363: larger models, the number of states in the search space was so large
364: that \spin\ ran out of nearly 1 GB of available memory. The current
365: formalization approach produces more abstract models, which enables
366: verification to complete within the constraints of available memory on
367: larger models, but at the expense of not considering
368: correctness of the socket-handling portion of the algorithms.
369:
370: Components of the MPD system map naturally to predefined {\sc Promela}
371: entities: a
372: {\tt proctype} is defined for each different MPD process type;
373: individual daemon, manager, console, and client processes correspond to
374: active instances of the corresponding {\tt proctype}s; sockets map to
375: channels; and messages that are read and written over the sockets
376: correspond to messages traveling on the channels. The functionality of
377: the Unix {\tt select} \cite{stevens-unp1} is implemented by the
378: nondeterministic {\tt do} construct: when a message arrives in the
379: input queue of a process instance, the appropriate guard of the {\tt
380: do} construct triggers its handling; in the absence of new messages
381: in the input queue, the instance is blocked.
382:
383: The distinguishing characteristic of the \promela\ model of the
384: barrier algorithm is that the clients are not represented by separate
385: process instances. To do otherwise would waste of precious memory. In
386: \spin, each process instance requires a certain amount of space in the
387: description of the entire system state, otherwise known as the state
388: vector. The functionality of a client is to send a request message to
389: its manager and then wait for a reply. Both these actions can be
390: adequately modeled by two binary variables, \verb2client_barrier_in2
391: and \verb2client_barrier_out2, per manager process. We note that the
392: algorithm depends on the fact that a manager records whether it has
393: already received a barrier request from its client, so one of these
394: variables is present in the model regardless of how clients are
395: formalized. Omission of clients reduces the complexity of the model
396: and the size of the state vector in another way: communication links
397: between managers and clients are not necessary. Connections between
398: managers are the only communication links that have to be modeled. To
399: this end we define a global vector of $N$ channels, where $N$ is the
400: number of managers in the model. Each manager process has local
401: pointers, \texttt{left} and \texttt{right}, to the appropriate
402: channels in the global array. To manage the bit variables, we rely on
403: an implementation of bit arrays by Ruys \cite{spin:low-fat}, which
404: defines the \texttt{IS\_0} and \texttt{SET\_1} macros. The
405: formalization ensures that the handler of a barrier request message
406: from the client is executed only once. The use of the {\tt do}
407: construct ensures that all messages that arrive at the manager have
408: equal precedence and can be handled in any order. We show a partial
409: \promela\ model of the barrier algorithm:
410: {\small
411: \begin{verbatim}
412: end_select:
413: do
414: :: IS_0(client_barrier_in,_pid) ->
415: SET_1(client_barrier_in,_pid);
416: /* remainder of client request handling */
417: :: left?barrier_in ->
418: /* barrier_in handling */
419: :: left?barrier_out ->
420: /* remainder barrier_out handling */
421: SET_1(client_barrier_out,_pid);
422: if
423: :: _pid == 0 -> skip
424: :: else -> right!barrier_out
425: fi;
426: od
427: \end{verbatim}}
428: %
429: %
430: %
431:
432: The socket-based communication network presented the most challenges
433: in the formalization of the ring establishment algorithm. Unlike the
434: static topology of the communication links in the barrier algorithm,
435: the topology of connections in the ring algorithm is dynamic. As
436: daemons enter the ring, new connections are established, and the old
437: ones are broken down, as illustrated in Figure~\ref{fig:insert}. In
438: addition, because all execution interleavings are exhaustively
439: considered during the model search, the network has to be set up so
440: that any two processes can communicate with each other. Numerous
441: different formalizations of the communication network were tried
442: in our effort to verify a model of meaningful size. In particular, we
443: tried the bus and matrix of channels formalizations, as suggested
444: in \cite{phd:ruys}.
445:
446: Our experiments showed that, with respect to verification, the best
447: approach to modeling the communication network of the ring
448: establishment algorithm depends on a vector of global channels. Each
449: daemon process owns a channel, which serves as an input queue for any
450: messages addressed to that process. In order to further reduce
451: complexity, as suggested by \spin's documentation, each daemon retains
452: exclusive rights, by using the \texttt{xr} designation, to read from
453: its owned channel. In this approach, illustrated in
454: Figure~\ref{fig:channels},
455: \begin{figure}%
456: \centerline{ \epsfxsize=1.6in \epsfbox{channels.eps} }
457: \caption{Model for message sending and receiving}
458: \label{fig:channels}
459: \end{figure}
460: a single input queue is used by a node to receive messages from all
461: nodes, but different queues are used to send messages addressed to
462: different nodes.
463: We find it interesting that this approach is explicitly advised
464: against in the \spin\ help file: to reduce complexity, the
465: documentation suggests one ``avoid sharing channels between multiple
466: receivers or multiple senders.'' This approach nonetheless succeeds in
467: our model because, while the network is set up to allow sharing
468: channels between multiple senders, most communications take place
469: between distinct pairs of daemons, so channel sharing very rarely occurs
470: during exploration of a single execution path. Thus, the complexity
471: of the model is not adversely affected.
472:
473: \section{Formalization and Verification of MPD with \otter}
474: \label{sec:otter}
475:
476: \otter\ \cite{ McCune94a} is an automated theorem-proving system for
477: first-order logic with equality. Most successful applications of
478: \otter\ have been in abstract algebra and logic, and its strength is
479: the ability to quickly explore large search spaces. It can
480: efficiently manage large sets of formulas (hundreds of thousands),
481: which allows it to automatically prove many difficult theorems.
482:
483: \otter's basic operations are (1) to apply various inference rules
484: to formulas, (2) to apply rewrite rules to inferred formulas, and (3)
485: to determine whether newly inferred formulas are already in the
486: database of formulas. These operations can be applied to
487: state space searches as well as to the heuristic searches
488: used in traditional automated theorem proving.
489:
490: Program verification is a nonstandard application of \otter.
491: Fortunately, \otter's language, data structures, and operations allow
492: reasonably intuitive and efficient implementations of model checking
493: by state space search. One of the languages \otter\ accepts is a
494: sequent language with which we can write rules, assertions, and goals
495: as follows.
496:
497: \[
498: \begin{array}{l}
499: \mbox{\it hypothesis}_1, \cdots, \mbox{\it hypothesis}_n \imp conclusion.\\
500: \imp \mbox{\it assertion}.\\
501: \mbox{\it goal} \imp.
502: \end{array}
503: \]
504:
505: \noindent
506: Given an input consisting of statements of this type, \otter\
507: can apply the rules to the assertions, generating new assertions,
508: and so on, until it derives one of the goals or until reaches a
509: fixed point (i.e., runs out of things to do). All of the assertions
510: (initial and derived) are stored in the database.
511:
512: A feature of \otter\ that allows it to conduct an efficient
513: state-space search is the ability to evaluate hypotheses by rewriting,
514: as well as to match the hypotheses with assertions. For example, the
515: hypotheses $X == 3$, where $X$ is instantiated by a preceding hypothesis,
516: is evaluated to a Boolean value. The rewrite mechanism includes a
517: simple but general equational programming language so that the evaluable
518: hypotheses can be arbitrary function calls. In addition,
519: the conclusion of the rule can have function calls to transform
520: the result.
521:
522: One can use this mechanism to implement a state space search in
523: a straightforward way. The initial states are initial assertions
524: (usually there is just one), say,
525: \[
526: \imp \mbox{\it State($s_0$)},
527: \]
528: where $s_0$ is a data structure representing the initial state.
529: The rules take states to successor states,
530: \[
531: \mbox{\it State(S)}, \mbox{\it P(S)} \imp \mbox{\it State(f(S))},
532: \]
533: where {\it P(S)} are evaluable hypotheses specifying the states to
534: which the rule applies, and {\it f(S)} generates the successor state.
535:
536: When applying this method to model-checking concurrent computations,
537: one frequently needs the hypothesis ``Let $N$ be an arbitrary process.''
538: To implement this capability, we include initial assertions giving the
539: set of process IDs, say, for a three-processor simulation,
540: \[
541: \begin{array}{l}
542: \imp \mbox{\it PID(0)}.\\
543: \imp \mbox{\it PID(1)}.\\
544: \imp \mbox{\it PID(2)}.
545: \end{array}
546: \]
547: Then, the rules can be given as
548: \[
549: \mbox{\it State(S)}, \mbox{\it PID(N)}, \mbox{\it P(S,N)} \imp \mbox{\it State(f(S,N))}.
550: \]
551:
552: In its basic form, the \otter\ search mechanism is designed to be
553: complete; that is, if a conjecture is a theorem, then \otter\ will
554: eventually find a proof. (In the practice of traditional automated
555: theorem proving, restrictions are frequently imposed by the user,
556: resulting in incomplete search procedures.) In the context of
557: state space search, this completeness means that (given enough
558: time and memory) all states that can be reached from the initial
559: states will eventually be derived and stored by \otter.
560:
561: To determine whether an illegal state can be reached, one can include
562: a rule such as
563: \[
564: \mbox{\it State(S)}, \mbox{\it P(S)} \imp \mbox{\it Bad\_State(S)}
565: \]
566: and the goal
567: \[
568: Bad\_State(S) \imp,
569: \]
570: with the effect that if any illegal state is derived, \otter\
571: will report it and print the path of states from an initial state to
572: the illegal one. From \otter's point of view, it has simply found and
573: printed a proof of a goal. From the point of view of verification,
574: the path of states represents an erroneous execution scenario.
575: The \otter\ proof of the reachability of an illegal state can be examined
576: to determine which state transition rule leads to the error. If \otter\
577: reaches a fixed point without deriving an illegal state, the terminal
578: states or the entire state space can be examined and processed by the
579: user or by another system.
580:
581: We show a
582: %
583: partial three-element \otter\ model\ of the barrier algorithm:
584: %
585: %
586: {\small
587: \begin{verbatim}
588: %
589: -> PID(0).
590: -> PID(1).
591: -> PID(2).
592:
593: %
594: -> State(PS(0,0,0,[]),PS(0,0,0,[]),PS(0,0,0,[])).
595:
596: %
597: State(S), PID(X), X == 0, $TRUE(barrier_out_arrived(S,X)) ->
598: State(assign_client_return(receive_message(S,X),X,1)).
599: \end{verbatim}
600: \newpage
601: \begin{verbatim}
602: State(S), PID(X), X != 0, $TRUE(barrier_out_arrived(S,X)) ->
603: State(assign_client_return(send_message
604: (receive_message(S,X),next(X),barrier_out),X,1)).
605:
606: %
607: -> barrier_out_arrived([],I) = false.
608: -> barrier_out_arrived([PS(CLI_IN,CLI_OUT,MSG,Q)|T],I) =
609: $IF(I == 0,
610: first(Q) == barrier _out,
611: barrier_out_arrived(T,I-1)).
612:
613: -> assign_client_return([],I,V) = [].
614: -> assign_client_return([PS(CLI_IN,CLI_OUT,MSG,Q)|T],I,V) =
615: $IF(I == 0,
616: [PS(CLI_IN,V,MSG,Q)|T],
617: [PS(CLI_IN,CLI_OUT,MSG,Q)|assign_client_return(T,I-1,V)]).
618:
619: -> dequeue([]) = [].
620: -> dequeue([H|T]) = T.
621:
622: -> receive_message([],I) = [].
623: -> receive_message([PS(CLI_IN,CLI_OUT,MSG,Q)|T],I) =
624: $IF(I == 0,
625: [PS(CLI_IN,CLI_OUT,MSG,dequeue(Q))|T],
626: [PS(CLI_IN,CLI_OUT,MSG,Q)|receive_message(T,I-1)]).
627: \end{verbatim}}
628:
629: %
630: %
631: %
632:
633:
634: The model is
635: based on the same simplification as the \promela\ model of the
636: algorithm: clients are not explicitly represented. A valid system
637: state consists of three process states. Each process state {\tt PS}
638: is a 4-tuple, whose every element implicitly corresponds to either one
639: of the variables used by the algorithm or the input queue. In this
640: model the process state template is
641:
642: \begin{verbatim}
643: PS[client_barrier_in, client_barrier_out,
644: holding_barrier_in, input_queue].
645: \end{verbatim}
646:
647: In the initial state of each process, every variable has a zero value,
648: and the input queue is empty. The two transition rules correspond to
649: the handling operations for the \verb2barrier_out2 message, as also
650: shown in
651: the \promela\ model above. The rules are applicable when the client
652: barrier requests have been received by all processes, that is, the
653: \verb2client_barrier_in2 bit has been set by every process, any
654: \verb2holding_barrier_in2 bits that might have been set are again
655: reset. In this formalization, the manager with 0 rank is the last to
656: release its client, so the state of the system to which the first of
657: the two state transition rules applies is
658:
659: \begin{verbatim}
660: (PS[1,0,0,(barrier_out)],
661: PS[1,1,0,[]],
662: PS[1,1,0,[]]),
663: \end{verbatim}
664:
665: \noindent
666: and the result of the application of the rule, including evaluation of
667: the functions
668: \verb2assign_cli-2 \verb2ent_return2 to assign the value of the
669: \verb2client_barrier_out2 bit and \verb2receive_message2 to remove a
670: message from the input queue, is the overall system state
671:
672: \begin{verbatim}
673: (PS[1,1,0,[]],
674: PS[1,1,0,[]],
675: PS[1,1,0,[]]).
676: \end{verbatim}
677:
678: Formalization of MPD algorithms in the sequent language of \otter\ is
679: similar to formalizations in the input language of the Mur$\phi$
680: model checker, especially as presented in the simplified form in
681: \cite{murphi:format}. Similarities are most evident in the definition
682: of state transition rules.
683:
684: \section{Results and Comparisons}
685: \label{sec:results}
686:
687: At this point in the project the goal is to discover whether existing
688: tools and methodologies can be used for verification of MPD
689: algorithms. We are also interested in developing formalization
690: techniques that simultaneously produce models abstract enough to allow
691: us to verify models of meaningful size and detailed enough to enable
692: automated model construction and/or code generation. We apply
693: verification approaches to MPD algorithms that have been developed
694: some time ago, and are well understood and well debugged. That
695: is, bug hunting is currently only a secondary goal. (We have
696: nonetheless found a minor error in the ring insertion algorithm using
697: the literal formalization of our earlier approach to using \spin.)
698: Rather, we are interested in developing an arsenal of verification
699: techniques to use for the analysis of the recent and future MPD
700: algorithms. Therefore, criteria for evaluation of verification
701: methods include the ease of modeling, correlation of the model to the
702: design and/or implementation of the MPD algorithm, and verification
703: performance.
704:
705: \subsection{Comparison of Formalizations}
706:
707: The \spin/\promela\ approach is the more natural one for the MPD
708: application, but the \otter\ approach does offer some, albeit possibly
709: subjective, advantages.
710:
711: \spin\ and its input language \promela\ are specifically developed for
712: verification of concurrent communicating processes. Therefore, the
713: language and the tool include special built-in constructs and
714: algorithms for message handling, communication path definition, and
715: variable declaration and manipulation. As a result, there is a
716: natural mapping between MPD components and \promela\ entities. By
717: contrast, \otter, as a general-purpose tool, has to be programmed from
718: scratch to handle these common operations of MPD algorithms.
719: Although, once constructed, this auxiliary portion of the \otter\
720: model can be reused,
721: the \spin\ models are much more concise, as demonstrated
722: by the model extracts above.
723:
724: From our point of view, the ability to model an algorithm as a set of
725: state transition rules is the main advantage of the \otter\ approach.
726: The set of rules, which is the main component of the \otter\ model,
727: can be easily extracted from a flow chart representation of the MPD
728: algorithms or from other notations typically used in the early stages
729: of algorithm design. Construction of \promela\ models looks and feels
730: much more like constructing another implementation of the algorithm,
731: only using a language much less powerful than C or Python, which are
732: used for actual implementations. Furthermore, all intermediate states
733: produced by the \otter\ search are transparent to the user and can be
734: examined. Such examination is not possible in \spin\ and may be a
735: limitation when trying to understand a particularly complicated error
736: trace.
737:
738: Correctness properties of \otter\ models of MPD algorithms have to be
739: formulated within the confines of first-order logic. \spin, on the
740: other hand, allows one to record correctness properties in linear
741: temporal logic. Although we have not yet encountered a situation where
742: limitations of first order logic prevented us from stating the desired
743: correctness property, such a situation is conceivable and may prove a
744: disqualifying drawback of the \otter\ approach.
745:
746: \subsection{Performance Comparison}
747:
748: All verification runs were conducted on a 933 MHz Pentium~III
749: processor with 970 MB of usable RAM. We used default \spin\
750: settings for all verification attempts, except when we increased the
751: memory limit to allow the search to complete. In cases where
752: verification did not complete with default parameters within physical
753: memory limits, verification with compression (using \spin's
754: \texttt{-DCOLLAPSE} compile-time directive) was performed. The input
755: files for the \otter\ experiments included settings to optimize the
756: state space exploration. Some default flags that are usually needed
757: in standard \otter\ experiments but irrelevant for our application
758: were turned off.
759:
760: Table~\ref{tbl:stats} shows performance statistics of applying \spin\
761: \begin{table*}[t]
762: \caption{Verification statistics}
763: \vspace{.2in}
764: \centerline{
765: \begin{tabular}{lc|c|c|c|c|c}
766: & & Model & Time & Memory & States & States \\
767: Problem & Method & Size & (s) & (MB) & Stored & Matched \\\hline
768: Unordered ring & Otter & 7 & 3979.08 & 391 & 7.83654e+05 & 1.908330e+06 \\
769: & Spin & 7 & 30.32 & 375 & 2.27008e+06 &1.90833e+06 \\
770: \hline
771: Ordered ring & Otter & 14 & 15676 & 561 & 1.028351e+06 & 6.318042e+06 \\
772: & Spin & 14 & 263.68 & 734 & 6.50332e+06 & 8.66146e+06 \\
773: \hline
774: Barrier & Otter & 19 & 27082 & 512 & 1.048593e+06 & 8.912898e+06\\
775: & Spin & 21 & 2045 & 746 & 8.38865e+06 & 8.38861e+07
776: \end{tabular}}
777: \vspace{0.1in}
778: \label{tbl:stats}
779: \end{table*}
780: and \otter\ to three problems. Two problems are variants of the ring
781: insertion algorithm. In both versions the first daemon establishes a
782: singleton ring. In the unordered version the subsequent daemons may
783: enter the ring in any order, resulting in many possible final
784: topologies with respect to the relative position of the processors in
785: the ring, and hence a much greater state search space and poorer
786: verification performance. In the ordered version, the order in which
787: the processes enter the ring is fixed, resulting in a single possible
788: final topology. In essence, the difference between the two versions
789: is that in the unordered case the daemons are numbered {\it before}
790: they begin to enter the ring, whereas in the ordered case they are
791: numbered {\it after} they have done so. As a result, in the ordered
792: case, the processor that enters the ring first is always numbered with
793: one, the processor that is second, with two, and so on. Because the
794: algorithm is independent of the identities of the processors, the
795: ordered version is less complex.
796:
797: Table 1 shows statistics for the largest model sizes on which a
798: particular verification approach succeeded. On complex algorithms,
799: such as ring insertion, \otter\ matches \spin\ with respect to the
800: largest verifiable model. In terms of speed and memory usage per
801: examined state, \otter\ is far behind \spin. This result is not
802: surprising because \spin\ is a special-purpose tool, specifically
803: designed for applications like ours, while our use of \otter\ is
804: unusual in this case. In fact, the performance of \otter\ far
805: exceeded our expectations.
806:
807: \otter\ could not verify models of as many processors as \spin\ could
808: for thr barrier algorithm. The explanation for such a difference in
809: performance lies in the way states are represented in each method.
810: In \spin, the three local variables that contribute to individual
811: process states are bits, and the communications channels are
812: essentially arrays of bits, which allows \spin\ to store state vectors
813: very efficiently. In addition, in the \spin\ verification run on a
814: model of twenty-one processors, state vectors were compressed,
815: resulting in further improved performance. In \otter, the
816: variables and the input queue are terms, which are not stored as
817: efficiently as bits. Thus, \otter\ requires much more memory to store
818: an individual system state. As a result, \otter\ is able to examine a
819: much smaller state spaces.
820:
821: The \otter\ search engine is not optimized for the kind of search that
822: takes place in this application. It is therefore not surprising that
823: verification with \otter\ is several times slower than with
824: \spin. But, since memory, not time, is the main limitation in this
825: application specifically and in model checking in general, the speed
826: of verification is only of minor concern.
827:
828: \subsection{A Note on Model Sizes}
829: \label{sec:size-note}
830: Scalability was one of the design goals of MPD. The daemon is
831: intended to run on hundreds, and eventually on thousands, of
832: processors. Unexpected limitations of the underlying operating
833: systems aside, it is assumed that the same algorithms that execute
834: successfully in a daemon of just a few processors will execute
835: smoothly in a
836: much larger system. Given the current state of the model-checking
837: technology, it is impossible to formally verify the algorithms of MPD
838: on very large models sizes, even if it were desirable to do so.
839: Luckily, the empirical evidence obtained while debugging the MPD code
840: using the traditional means suggests that the errors, even the most
841: difficult and obscure ones, exhibit themselves in daemons of just a few
842: processors. It also appears to be an accepted view of the
843: model-checking community that to verify systems with unbounded
844: potential number of elements, it is sufficient to verify a limited( with
845: respect to the number of elements) model of the system
846: \cite{inca-compare,dill-protocol-hardware-92,jackson-counter-detector}.
847:
848: Our goal is to devise a verification approach and a modeling
849: methodology that allows us to verify complex MPD algorithms and
850: interactions of these algorithms on models of ten to twenty daemons.
851: For example, MPD contains an algorithm that merges two daemons that
852: are running parallel jobs. We want to verify the algorithm on a
853: meaningful model consisting of two MPD structures, as shown
854: in Figure~\ref{fig:mpds-all}, each having a console process and
855: three of each of daemon, manager, and client processes.
856:
857: \section{Summary}
858: \label{sec:summary}
859:
860: We described here two approaches to verification of the algorithms of
861: the parallel process management system called MPD. One approach is
862: based on the software model checker \spin, the other on the
863: general-purpose first-order theorem prover \otter. Both approaches
864: are based on model checking, and the use of \otter\ in the model
865: checking-capacity is unusual. The aim was to model algorithms of MPD
866: in both the \spin\ and the \otter\ approaches so as to enable
867: verification of the largest possible model.
868:
869: The two approaches were compared with respect to the ease of
870: formalization and verification performance characteristics. Overall,
871: \spin\ is more efficient in terms of absolute time and memory
872: requirements and relative time and memory requirements per stored
873: system state. In terms of the size of models that each approach
874: allows us to verify, both tools perform roughly the same, with \spin\
875: occasionally demonstrating better performance. Neither approach allows
876: us to verify complicated algorithms on models of about twenty daemons,
877: which is our goal. In terms of formalization methodology, the two
878: approaches are too different to compare, and both exhibit
879: advantages and disadvantages.
880:
881: The main goal of this technical note is to document the current
882: approach to modeling MPD in \promela, and to describe how \otter\ can
883: be used to simulate model-checking style search. We have been
884: applying model checking techniques to MPD algorithms that have been
885: under development and testing for some time. Consequently, it should
886: come as no surprise that no errors have been discovered. Even though
887: the current modeling methodologies limit verification to models of
888: only a few entities, applying them to new MPD algorithms could still
889: be beneficial.
890:
891: For further details the reader is referred to the
892: complete models of the presented algorithms. This information is
893: available at \texttt{http://www.mcs.anl.gov/\~{ }matlin/spin-mpd}.
894:
895: %
896:
897: \addcontentsline{toc}{section}{References}
898: \bibliographystyle{plain}
899: \bibliography{master}
900:
901: %
902:
903: %
904: \end{document}
905: