1: \section{Applications}\label{Se:Applications}
2:
3: % Executive view
4: The experiments discussed in this section demonstrate how
5: the $\gammaHat$ operation can be harnessed in the context of program analysis:
6: the results described below go beyond what previous systems were capable of.
7: In \secref{SPASS}, we discuss the use existing theorem provers and their limitations.
8: In \secref{EADTC}, we suggest a way to overcome these limitations, using decidable logic.
9:
10: We present two examples that use $\gammaHat$ to read out
11: information from $3$-valued structures in a conservative, but rather precise way.
12: The first example demonstrates how supervaluational semantics allows us to obtain more precise
13: information from a $3$-valued structure than we would have using compositional semantics. %otherwise.
14: The second example demonstrates how to use the $3$-valued structures obtained from a TVLA analysis
15: to construct a loop invariant; this is then used to show
16: that certain properties of a linked data structure hold on each
17: loop iteration.
18: In addition, we briefly describe how $\gammaHat$ can be used in algorithms for computing most-precise
19: abstraction operations for shape analysis.
20: Finally, we report on other work that employs $\gammaHat$ to generate a
21: concrete counter-example for shape analysis.
22:
23: \begin{Remark}
24: The $\gammaHat$ operation defines a symbolic concretization with respect to a given abstract domain.
25: In~\secref{Classic}, we defined $\gammaHat$ for the abstract domain of sets of $3$-valued
26: structures. %ordered by the Hoare ordering on sets, where structures are ordered by the embedding ordering.
27: In \appref{Canonic}, we describe a related abstract domain and define $\gammaHat$ for it.
28: The applications described in this section can be used with any domain
29: for which $\gammaHat$ is defined in some logic and a theorem prover for that logic exists.
30: In our examples, we use $\gammaHat$ defined in \secref{Classic} and
31: the first-order logic with transitive closure.
32: \end{Remark}
33:
34: %\subsection{Implementation Issues}\label{Se:SPASS}
35: \subsection{Using the First-Order Theorem Prover SPASS}\label{Se:SPASS}
36:
37: %% What is TVLA and what was done
38: The TVLA (\cite{SAS:LS00}) system performs an iterative fixed-point computation, which
39: yields at every program point $p$ a set $X_p$ of bounded structures.
40: It guarantees that $\gamma(X_p)$ is a superset of the $2$-valued structures that can
41: arise at $p$ in any execution.
42: %To carry out these experiments,
43: We have implemented the $\gammaHat$ operation in TVLA, and
44: employed SPASS~\cite{SPASS} to check, using the formula $\gammaHat(X_p)$,
45: that certain properties of the heap hold at program point $p$.
46: Also, we implemented the supervaluational procedure described in \secref{Supervaluation},
47: employing SPASS.
48: The enhanced version of TVLA generates the formula
49: $\gammaHat(S)$ and makes at most two calls to SPASS
50: to compute the supervaluational value of a query $\varphi$
51: in structure $S$.
52: In this section, we report on our experience in using SPASS and the problems we have encountered.
53:
54: %SPASS
55: First, calls to SPASS theorem prover need not terminate, because
56: first-order logic is undecidable in general.
57: However, in the examples described below, SPASS always terminated.
58:
59: \begin{example}
60: In \exref{SupExample} we (manually) proved that the supervaluational value of
61: the formula $\varphi_{ {\tt x\rightarrow next \neq NULL} }$ on the structure $S$
62: from \figref{threeStruct}(d) is $1$.
63: To check this automatically, we used SPASS to determine the validity of
64: $\widehat{\gamma}(S) \implies \varphi_{ {\tt x\rightarrow next \neq NULL} }$;
65: SPASS indicated that the formula is valid.
66: This guarantees that the formula $\varphi_{ {\tt x\rightarrow next \neq NULL} }$
67: evaluates to $1$ on all of the $2$-valued structures that embed into $S$.
68:
69: In contrast, TVLA uses Kleene semantics for $3$-valued formulas, and
70: will evaluate $\varphi_{ {\tt x\rightarrow next \neq NULL} }$ to $1/2$:
71: under the assignment $[v_1 \mapsto u_1, v_2 \mapsto u_2]$,
72: $x(v_1) \land n(v_1, v_2)$ evaluates to $1 \land 1/2$, which equals $1/2$.
73: \end{example}
74:
75: \subsubsection{Generating and Querying a Loop Invariant}\label{Se:loopExperiment}
76: We used TVLA to compute, for each program point $p$,
77: a set $X_p$ of bounded structures that overapproximate
78: the set of stores that may occur at that point.
79: We then generated $\widehat{\gamma}(X_p)$.
80: Because TVLA is sound,
81: $\widehat{\gamma}(X_p)$ must be an invariant
82: that holds at program point $p$, according to \theref{RepresentingStructuresByFormulae}.
83: In particular, when $p$ is a program point that begins a loop,
84: $\widehat{\gamma}(X_p)$ is a loop invariant.
85:
86: \begin{example}\label{Ex:loopInvariantEx}
87: Let $X = \{S_i \mid i=1,\ldots,5\}$ denote
88: the set of five $3$-valued structures that TVLA
89: found at the beginning of the loop in the
90: \tinsert\/ program from \figref{threeStruct}.
91: \tableref{StructFormula} and \tableref{StructFormulaTwo}
92: of \appref{AppendixTable} show the $S_i$ and their
93: characteristic formulas.
94: The loop invariant is defined by
95: \[
96: \widehat{\gamma}(X) = F_{List} \land (\Lor_{i=1}^5 \characteristicFormula{S_i})
97: \]
98:
99: Using SPASS, this formula was then used to check that in every
100: structure that can occur at the beginning of the loop, \tx\/ points
101: to a valid list, i.e., one that is acyclic and unshared.
102: This property is defined by the following formulas:
103: \[
104: \begin{array}{@{\hspace{0ex}}l@{\hspace{.5ex}}l@{\hspace{.5ex}}l@{\hspace{0ex}}}
105: \text{acyc}_x & \eqdef & \forall v_1, v_2: r_x(v_1) \land n^+(v_1, v_2) \implies \neg n^+(v_2, v_1) \\
106: \text{uns}_x & \eqdef & \forall v \colon r_x(v) \implies
107: ~\neg (\exists w_1, w_2 \colon \neg eq(w_1, w_2) \land n(w_1, v) \land n(w_2, v)) \\
108: \text{list}_x & \eqdef & \text{acyc}_x \land \text{uns}_x
109: \end{array}
110: \]
111: We applied SPASS to check the validity of $\gammaHat(S) \implies list_x$; SPASS indicated that
112: the formula is valid.\footnote{SPASS input is available from {\tt www.cs.tau.ac.il/$\sim$gretay}.}
113: \end{example}
114:
115: In addition to the termination issue, a second obstacle is that SPASS considers infinite structures, which
116: are not allowed in our setting.\footnote{Our
117: intended structures are finite, because they represent memory configurations,
118: which are guaranteed to be finite, although their size is not bounded.}
119: As a consequence, SPASS can fail to verify that a formula is valid for
120: our intended set of structures;
121: however, the opposite can never happen: whenever SPASS indicates that
122: a formula is valid, it is indeed valid for our intended set of
123: structures.
124: \begin{example}
125: We tried to verify that every concrete linked-list represented by the $3$-valued structure
126: $S$ from \figref{threeStruct}(d)
127: has a last element.
128: This condition is expressed by the formula
129: $\varphi_{last} \eqdef \exists v_1 \forall v_2 : \neg n(v_1, v_2)$.
130: The supervaluational value of $\varphi_{last}$ on a structure $S$ is
131: $\superval{\varphi}(S) = 1$, for the following reasons.
132: Because $r_x$ has the definite value $1$ on $u_2$ in $S$,
133: all concrete nodes represented by the summary node $u_2$
134: must be reachable from $x$.
135: Thus, these nodes must form a linked list, i.e.,
136: each of these concrete nodes, except for one node that is the ``last'',
137: has an $n$-edge to another concrete node represented by $u_2$.
138: The last node does not have an $n$-edge back to any of the nodes represented by $u_2$,
139: because that would create sharing, whereas the value of predicate $is$ in $S$ is $0$ on $u_2$.
140: Also, the last node cannot have an $n$-edge to the concrete node represented by $u_1$,
141: because the value of predicate $n$ on the pair $\B{u_2, u_1}$ in $S$ is $0$.
142: Therefore, the last element cannot have an outgoing $n$-edge.
143:
144: We used SPASS to determine the validity of
145: $\gammaHat(S) \implies \varphi_{last}$;
146: SPASS indicated that the formula is \emph{not} valid, because it considered
147: a structure that has infinitely many concrete nodes, all represented by $u_2$.
148: Each of these concrete nodes has an $n$-edge to the next node.
149:
150: The validity test of the formula $\gammaHat(S) \implies \neg \varphi_{last}$
151: failed, of course, because there exists a finite structure that is represented by $S$
152: (and thus satisfies $\gammaHat(S)$) and has a last element. For example, the structure
153: in \figref{threeStruct}(a) that represents a list of size $2$.
154: Therefore, the procedure \textit{Supervaluation}$(\varphi_{last}, {S})$ implemented using SPASS
155: returns $1/2$, even though the supervaluational value of $\varphi_{last}$ on $S$ is $1$.
156: \end{example}
157:
158: The third, and most severe, problem that we face is that SPASS does not support transitive closure.
159: Because transitive closure is not expressible in first-order logic,
160: we could only partially model transitive closure in SPASS,
161: %integrity rules,
162: as described below.
163:
164: SPASS follows other theorem provers in allowing axioms to express
165: requirements on the set of structures considered.
166: We used SPASS axioms to model integrity rules.
167: To partially model transitive closure, %integrity rules,
168: we replaced uses of $n^+(v_1, v_2)$ by uses of a new designated
169: predicate $t[n](v_1, v_2)$.
170: Therefore, SPASS will consider some structures that do not represent
171: possible stores.
172: As a consequence, SPASS can fail to verify that a formula is valid for
173: our intended set of structures;
174: however, the opposite can never happen: whenever SPASS indicates that
175: a formula is valid, it is indeed valid for our intended set of
176: structures.
177: To avoid some of the spurious failures to prove validity, we added
178: axioms to guarantee that
179: (i)\ $t[n](v_1, v_2)$ is transitive and
180: (ii)\ $t[n](v_1, v_2)$ includes all of $n(v_1, v_2)$;
181: thus, $t[n](v_1, v_2)$ includes all of $n^+(v_1, v_2)$.
182: Because transitive closure requires a minimal set, which is not
183: expressible in first-order logic, this approach provides a
184: looser set of integrity rules than we would like.
185: However, it is still the case that whenever SPASS indicates that a
186: formula is valid, it is indeed valid for the set of structures in
187: which $t[n](v_1, v_2)$ is exactly $n^+(v_1, v_2)$.
188:
189: \begin{example}
190: SPASS takes into account the structure shown in \figref{SPASS_TC},
191: in which the value of $t[n](u_1, u_3)$ is $1$,
192: but the value of $n^+(u_1, u_3)$ is $0$
193: because there is no $n$-edge from $u_2$ to $u_3$.
194: \begin{figure}
195: \begin{center}
196: %\framebox{
197: \begin{tabular}{|@{\hspace{1ex}}c@{\hspace{1ex}}|}
198: \hline
199: \xymatrix@R9pt@C8pt{
200: \xyNonSummaryNode{u_1}
201: \ar[rr]^{n}
202: \ar@/^0.8pc/[rr]^{t[n]}
203: \ar@/^2.5pc/[rrr]^{t[n]}
204: &&
205: \xyNonSummaryNode{u_2}
206: &
207: \xyNonSummaryNode{u_3} \\
208: \tx, r_{x}\ar[u] &&
209: r_{x}\ar[u] &
210: r_{x}\ar[u]
211: }\\
212: \hline
213: \end{tabular}
214: \end{center}
215: \caption{\label{Fi:SPASS_TC} SPASS takes into account structures in which the $t[n]$ predicate
216: overapproximates the $n^+$ predicate, such as the structure shown in this figure.}
217: \end{figure}
218: \end{example}
219:
220:
221: \subsection{Decidable Logic}\label{Se:EADTC}
222: %% Decidable logics
223:
224: The obstacles mentioned in \secref{SPASS} are not specific to SPASS.
225: They occur in all theorem provers for first-order logic that we are aware of.
226: To address these obstacles, we are investigating the use of a decidable logic.
227: To reason about linked data structures, we need a notion of reachability
228: to be expressible, for example, using transitive closure.
229: However, a logic that is both decidable and includes reachability
230: must be limited in other aspects.
231: %Because this notion tends to make logics undecidable, e.g., \cite{GOR99,lics:IRRSY},
232: %only limited logics with reachability are decidable.
233:
234: One such example is the decidable second-order theory of two successors \textit{WS2S} \cite{Rabin};
235: its decision procedure is implemented in a tool called MONA \cite{KlaEtAl:Mona}.
236: Second-order quantification suffices to express reachability,
237: but there are still two problems.
238: First, the decision procedure for \textit{WS2S} is necessarily non-elementary \cite{Meyer}.
239: Second, \textit{WS2S} only applies to trees, or,
240: equivalently, to function graphs (graphs with at most one edge leaving any vertex).
241:
242: Another example is $\Lone$, which is a subset of first-order logic with transitive closure,
243: in which the following restriction are imposed on formulas:
244: (i)~they must be in existential-universal form,
245: and (ii)~they must use at most a single unary function $f$, but can use an arbitrary number of unary predicates.
246: \cite{csl04:IRRSY} shows that the decision procedure for satisfiability of $\Lone$ is NEXPTIME-complete.
247:
248: In spite of their limitations, both \textit{WS2S} and $\Lone$ can be useful for
249: reasoning about shape invariants and mutation operations on data structures, such as
250: singly and doubly linked lists, (shared) trees, and graph types \cite{kn:KS93}.
251: The key is the \textit{simulation technique} \cite{cav04:IRRSY}, which encodes
252: complex data-structures using \emph{tractable} structures,
253: e.g., function graphs or simple trees, where we can reason with decidable logics.
254: %The idea is to detect that a mutation is to violate
255: %the correctness of the simulation and forbid such mutations
256: %from being applied.
257:
258: For example, given a suitable simulation,
259: $\gammaHat$ formula can be expressed in \textit{WS2S} and $\Lone$ if the integrity formula $F$ can.
260: This follows from the definition of $\gammaHat$ in \equref{gammaHatSet} and the fact
261: that $\characteristicFormula{S}$ does not contain
262: quantifier alternation.
263: This makes $\Lone$ and \textit{WS2S} candidate implementations
264: for the decision procedure used in the supervaluational semantics and in the algorithms
265: described below.
266:
267: %\cite{lics:IRRSY} describes the logic $\Lone$ that is a subset of first-order logic with transitive closure,
268: %imposed by the following limitations on a $\Lone$ formula: (i)~existential-universal form,
269: %and (ii)~a single unary function $f$ and an arbitrary number of unary predicates.
270: %
271: %In spite of its limitations, in \cite{cav:IRRSY} the logic $\Lone$ is shown to be useful for
272: %reasoning about shape invariants and mutation operations on data structures, such as
273: %singly and doubly linked lists, (shared) trees, and graph types \cite{kn:KS93}.
274: %The key is the \textit{simulation technique} that encodes
275: %complex data-structures using $\Lone$ logic, including mutations of these data-structures.
276: %It detects that a mutation is to violate the correctness of the simulation and forbids such mutations
277: %from being applied.
278: %
279: %Also, the satisfiability of $\Lone$ formulas is decidable and NEXPTIME-complete,
280: %hence the $\Lone$ decision procedure is a candidate implementation for the {\it isSatifiable}
281: %function.\footnote{
282: %Another candidate is the decision procedure for monadic 2-nd order logic over trees \cite{KlaEtAl:Mona}, MONA,
283: %which has non-elementary complexity.}
284:
285: %\subsection{Querying Using Supervaluational Semantics}\label{Se:supExperiment}
286:
287: \subsection{Assume-Guarantee Shape Analysis} % using $\Lone$ Logic}
288: \label{Se:Assume}
289: %%Assume algorithm
290: The $\gammaHat$ operation is useful beyond computing supervaluational semantics: it is a
291: necessary operation used in the algorithms described in~\cite{TACAS:YRS04,VMCAI:RSY04}.
292: These algorithms perform abstract operations symbolically by representing abstract values
293: as logical formulas, and use a theorem prover to check validity of these formulas.
294: These algorithms improve on existing shape-analysis techniques by:
295: \begin{itemize}
296: \item conducting abstract interpretation in the most-precise fashion,
297: improving the technique used in the TVLA system~\cite{SAS:LS00,TOPLAS:SRW02},
298: which provides no guarantees about the precision of its basic mechanisms.
299: \item performing modular verification using assume-guarantee reasoning and procedure specifications.
300: This is perhaps the most-exciting potential application of $\gammaHat$ (and $\Lone$ logic),
301: because existing mechanisms for shape analysis, including TVLA, do not support
302: assume-guarantee reasoning.
303: \end{itemize}
304: %For example,
305: %given a suitable simulation of vocabularies \cite{cav:IRRSY},
306: %$\gammaHat$ can be expressed in $\Lone$ if the integrity formula $F$ can.
307: %It follows from the definition of $\gammaHat$ in \equref{gammaHatSet} and the fact
308: %that $\characteristicFormula{S}$ is in $\Lone$, because it does not contain
309: %quantifier alternation and transitive closure.
310:
311: %These algorithm is based on multiple queries.
312: %The simulation technique \cite{cav:IRRSY} allows us to translate these queries to a decidable logic,
313: %and utilize a decision procedure for that logic to implement the algorithms.
314: %The main part of the translation involves translating $\gammaHat$ formulas to a decidable logic.
315: %For example, $\gammaHat(S)$ can be expressed in $\Lone$ if the integrity rules $IR$ can.
316: %The reason is that $\gammaHat(a) \eqdef \Land_{S \in a} \xi^S \land IR$, and
317: %$\xi$ is in $\Lone$ because it does not contain
318: %quantifier alternation and transitive closure.
319: %In \cite{lics:IRRSY} we provide a sufficient condition under which this can be done automatically.
320: %
321:
322:
323: \subsection{Counter-example Generation}\label{Se:CounterExample}
324:
325: %The techniques presented in this paper have already been explored in order to improve the applicability of
326: %TVLA~\cite{Unpublished:ESY03,Master:Erez04}.
327: Some preliminary work to use the techniques presented in this paper to improve the applicability of TVLA
328: has been carried out.
329: The tool described in ~\cite{Unpublished:ESY03,Master:Erez04} uses the $\gammaHat$ operation to generate a concrete counter-example for
330: a potential error message produced by TVLA
331: for an intermediate $3$-valued structure $S$ at a program point $p$.
332: Such a tool is useful to check if a reported error is a real error or a false-alarm,
333: i.e., it never occurs on any concrete store.
334:
335: Generation of concrete counter-examples from $S$ proceeds as follows.
336: First, $S$ is converted to the formula $\gammaHat(S)$.
337: Then, the tool uses weakest precondition to generate a formula that represents the stores at the entry
338: point that lead to an execution trace that reaches $p$ with a store that satisfies $\gammaHat(S)$.
339: %Then, the tool enumerates the traces starting at the program entry that lead
340: %to a concrete store in $S$ by computing the weakest
341: %precondition which guarantees that the trace is executed with a resultant store which
342: %satisfies the generated formula holds.
343: Finally, a separate tool \cite{MACE} generates a concrete store that satisfies the
344: formula for the entry point.
345: %The idea is to discover a program trace that leads to the program location $p$ with
346: %%an erroneous structure $S$,
347: %by going backwards from $p$ using weakest precondition.
348: %The validity of the weakest precondition for the structures in each program location is
349: %checked using supervaluation. If a counter-example exists, it is taken into account in the further steps of
350: %trace generation.
351:
352:
353:
354:
355:
356:
357:
358:
359:
360:
361:
362:
363:
364:
365:
366:
367:
368:
369:
370:
371:
372:
373:
374:
375:
376:
377: %Contributions:
378: %
379: %The results of this paper
380: %can be viewed as a further step in evolution of
381: %representations for shape analysis dataflow facts:
382: %we show how to represent three-valued structures of
383: %with formulas of classical two-valued logic. Although
384: %conceptually not surprising, such representation has many
385: %methodological benefits:
386: %
387: %1. The use of formulas of classical logic allows replacing
388: %potentially incomplete operations on three-valued structures
389: %in [36] whose correctness is not immediately evident, with
390: %valid transformations on first-order formulas. This makes
391: %it easier both to prove correctness of shape analysis
392: %operations, and to to develop proof-generating shape
393: %analyses.
394: %
395: %2. The operations on formulas can be relegated to
396: %first-order theorem provers that have been extensively
397: %studied and developed over the past decades. Authors
398: %present some experience in using theorem prover SPASS and
399: %suggest the use of decision procedures for decidable logics
400: %that can express interesting properties of the heap.
401: %
402: %3. Because program annotations are typically written as
403: %formulas, representation of dataflow facts as formulas enables
404: %the communication of information between the programmer and
405: %the analysis, which is essential for compositional and
406: %therefore more scalable approaches to shape analysis.
407: