1: \documentclass{tlp}
2:
3: \usepackage{latexsym}
4:
5: \newtheorem{theorem}{Theorem}[section]
6: \newtheorem{lemma}[theorem]{Lemma}
7: \newtheorem{example}[theorem]{Example}
8: \newtheorem{definition}[theorem]{Definition}
9: \newtheorem{proposition}[theorem]{Proposition}
10: \newtheorem{corollary}[theorem]{Corollary}
11: \newtheorem{property}[theorem]{Property}
12: \newtheorem{remark}[theorem]{Remark}
13: \newtheorem{claim}{Claim}
14: \newtheorem{fact}{Fact}
15:
16:
17:
18: \newcommand{\myif}{\ \mbox{\rm if}\ }
19: \newcommand{\when}{\ \mbox{\rm when}\ }
20: \newcommand{\myiff}{\ \mbox{\rm iff}\ }
21: \renewcommand{\and}{\ \mbox{\rm and}\ }
22: \newcommand{\myor}{\ \mbox{\rm or}\ }
23: \newcommand{\mynot}{\mbox{\rm not}\, }
24: \newcommand{\otherwise}{\ \mbox{\rm otherwise}}
25: \newcommand{\subc}{\succ}
26: \newcommand{\comp}{\circ}
27: \newcommand{\cleq}{\sqsubseteq}
28: \newcommand{\clt}{\sqsubset}
29: \newcommand{\sclt}{\lhd}
30: \newcommand{\scleq}{\unlhd}
31: \newcommand{\eqass}{\approx}
32: \newcommand{\UP}[1]{#1
33: \mbox{\raisebox{0.02in}{$\Uparrow$}}}
34: \newcommand{\DOWN}[1]{#1\mbox{\raisebox{0.0in}{$\Downarrow$}}}
35: \newcommand{\Cc}[1]{{\it Cc}(#1)}
36: \newcommand{\Abs}[1]{{\it Abs}(#1)}
37: \newcommand{\PSS}{{\it PSS}}
38: \newcommand{\ASSD}{{\it ASS}_D}
39: \newcommand{\TABJ}{{\it TAB}_{\mbox{\tiny \rm JOIN}}}
40: \newcommand{\TABR}{{\it TAB}_{\mbox{\tiny \rm REFINE}}}
41: \newcommand{\TAB}{{\it TAB}}
42: \newcommand{\Nstar}{{\mbox{\bf N}^\star}}
43: \newcommand{\N}{{\mbox{\bf N}}}
44:
45:
46: \newcommand{\wi}{$\bigtriangledown$}
47: \newcommand{\disj}{$\; | \;$}
48: \newcommand{\go}{{\tt g$_o$}}
49: \newcommand{\gr}{{\tt g$_r$}}
50: \newcommand{\gn}{{\tt g$_n$}}
51: \newcommand{\gf}{{\tt g$_1$}}
52: \newcommand{\gs}{{\tt g$_2$}}
53: \newcommand{\gi}{{\tt g$_i$}}
54: \newcommand{\gp}{{\tt g$_{i+1}$}}
55: \newcommand{\vo}{{\tt v$_o$}}
56: \newcommand{\vn}{{\tt v$_n$}}
57: \newcommand{\va}{{\tt v$_a$}}
58: \newcommand{\vf}{{\tt v$_1$}}
59: \newcommand{\vs}{{\tt v$_2$}}
60: \newcommand{\tyo}{{\tt T$_o$}}
61: \newcommand{\tn}{{\tt T$_n$}}
62: \newcommand{\tnew}{{\tt T$_{new}$}}
63: \newcommand{\widen}{\bigtriangledown}
64:
65: \newcommand{\pag}{\mbox{\tt Pat($\cal R$)}}
66:
67:
68:
69:
70: \title[Sequence-Based Abstract Interpretation of Prolog]
71: {Sequence-Based Abstract Interpretation of Prolog}
72: %{SEQUENCE-BASED ABSTRACT INTERPRETATION OF PROLOG}
73:
74: \author[B. Le Charlier, S. Rossi and P. Van Hentenryck]
75: {BAUDOUIN LE CHARLIER\\
76: Institut d'Informatique, University of Namur,\\
77: 21 rue Grandgagnage, B-5000 Namur, Belgium
78: %\email{ble@info.fundp.ac.be}
79: \and SABINA ROSSI\\
80: Dipartimento di Informatica, Universit\`a di Venezia,\\
81: via Torino 155, 30172 Venezia, Italy
82: %\email{srossi@dsi.unive.it}
83: \and PASCAL VAN HENTENRYCK\\
84: Department of Computer Science,
85: Brown University,\\ P.O. Box 1910, Providence
86: RI 02912, Usa
87: %\email{pvh@cs.brown.edu}
88: }
89:
90: \begin{document}
91: \maketitle
92:
93: \begin{abstract}
94:
95:
96: Abstract interpretation is a general methodology for
97: systematic development of program analyses. An abstract interpretation
98: framework is centered around a parametrized
99: non-standard semantics that can be instantia\-ted by various domains to approximate different
100: program properties.
101:
102: Many abstract interpretation frameworks and analyses for Prolog
103: have been proposed, which seek to extract information useful for program optimization.
104: Although motivated by practical considerations, notably making Prolog competitive with imperative
105: languages, such frameworks fail to capture some of the
106: control structures of existing implementations
107: of the language.
108:
109: In this paper we propose a novel framework for the abstract interpretation
110: of Prolog which handles the depth-first search rule and the cut operator.
111: It relies on the notion of {\em substitution sequence} to model the result of
112: the execution of a goal.
113: The framework consists of (i) a denotational
114: concrete semantics, (ii) a safe abstraction of the concrete semantics
115: defined in terms of a class of post-fixpoints, and (iii) a generic abstract
116: interpretation algorithm. We show that traditional abstract domains of
117: substitutions may easily be adapted to the new framework, and provide
118: experimental evidence of the effectiveness of our approach.
119: We also show that previous work on determinacy analysis, that was not
120: expressible by existing abstract interpretation frameworks, can be seen as an
121: instance of our framework.
122:
123: The ideas developed in this paper can be applied to other logic
124: languages, notably to constraint logic languages,
125: and the theoretical approach should
126: be of general interest for the analysis of many
127: non-deterministic programming languages.
128:
129: \end{abstract}
130:
131:
132: %%%%%%%%%%%%%%%%%%%%%%
133: \section{Introduction}
134: \label{introduction}
135: %%%%%%%%%%%%%%%%%%%%%%
136:
137:
138:
139: {\it Abstract interpretation} \cite{Cousot77} is a general
140: methodology for systematic development of program analyses.
141: It has been applied to
142: various formalisms and paradigms including flow-charts and imperative,
143: functional, logic, and constraint programming.
144: %An abstract interpretation framework is centered around the definition of a parametrized
145: %non-standard semantics that can be instantiated by various domains to approximate different
146: %program properties.
147:
148: %\subsection{Abstract Interpretation of Prolog}
149:
150: Abstract interpretation of Prolog and, more generally, of logic
151: programming was initiated by Mellish \shortcite{Mellish87}
152: and further developed by numerous researchers,
153: e.g., Bruynooghe \shortcite{Bruynooghe91},
154: Cousot and Cousot \shortcite{Cousot92},
155: Jones and S{\o}ndergaard \shortcite{Jones87},
156: Le Charlier \emph{et al.}
157: \shortcite{ICLP91AI}, Marriott and S{\o}ndergaard \shortcite{Marriott89b}.
158: Many different kinds of practical analyses
159: and optimizations have been proposed,
160: a detailed description of which can be found in
161: \cite{Cousot92,Getzinger.SAS94}. Briefly, mode
162: \cite{Cortesi91,D89,Debray88c,So86}, type
163: \cite{BG89b,Graph.JLP,GZ86,Janssens92,KH85,K83,Kluzniak87,L83,MK84,XW88,YS91},
164: and aliasing \cite{Codish91,Jacobs89} analyses collect information
165: about the state of variables during the execution and are useful to
166: speed up term unification and make memory allocation more efficient
167: \cite{Hermenegildo92,Warren88}. Sharing analysis
168: \cite{Corsini91,Cortesi,Kluzniak88,Muthukumar91} is similar to
169: aliasing except that it refers to the sharing of memory structures to
170: which program variables are instantiated; it is useful to perform
171: compile-time garbage collection \cite{Jensen90,Kluzniak88,Mulkers90}
172: and automatic parallelization \cite{SAS94.Gras,Chang85,Giacobazzi90,Jacobs92}.
173: Reference chain analysis \cite{Marien89b,Vanroy92} attempts to determine an
174: upper bound to the length of the pointer chain for a program variable.
175: Trai\-ling analysis \cite{Taylor89} aims at detecting variables which do
176: not need to be trailed. Liveness analysis \cite{Mulkers91a}
177: determines when memory structures can be reused
178: and is useful to perform update-in-place.
179:
180: All these analyses approximate the set of values (i.e., terms or memory
181: structures) to which program variables can be instantiated at some
182: given program point. It is thus not surprising that almost all
183: frameworks for the abstract interpretation of Prolog, e.g.,
184: \cite{Barbuti90,Bruynooghe91,Jones87,Marriott93,Marriott89b,Mellish87,Nilsson90a},
185: are based on abstractions of sets of substitutions. Such
186: traditional
187: frameworks ignore
188: important control features of the language, like
189: the depth-first
190: search strategy and the cut operator.
191: The reason is that these control features
192: are difficult to model accurately,
193: and yet not strictly necessary for a {\em variable level} analysis.
194: %as the ones cited above.
195: However, modeling Prolog control features has two main advantages. First,
196: it allows one to perform so-called
197: {\em predicate level} analyses, like determinacy
198: \cite{Giacobazzi92,Sahlin91,Ueda,Vanroy87,Vanroy92} and local stack
199: \cite{Marien89,Meier91} analyses.
200: These analyses are not captured by traditional abstract interpretation
201: frameworks;
202: they usually rely on some ad hoc te\-chni\-que
203: and require special-purpose proofs of correctness,
204: e.g., \cite{Debray89a,Sahlin91},
205: which may be rather involved.
206: They are useful to perform
207: optimizations, such as the
208: choice point removal and the simplification of environment creation.
209: Second,
210: the analysis of some classes
211: of programs, like programs containing
212: multi-directional procedures which use
213: cuts and meta-predicates
214: to select among different versions,
215: may be widely improved.
216: This may provide the compiler with more
217: chances to perform important optimizations such
218: as dead-code elimination.
219:
220:
221:
222: Abstract interpretation of Prolog with control
223: has been investigated by other authors.
224: In particular, we know of three main different approaches.
225: The approach of Barbuti {\em et al.} \shortcite{BarbutiControl}
226: is based on an abstract semantics for logic programs
227: with control which is
228: %sema\-ntics for logic programs aiming at modeling the
229: %Prolog search rule has been proposed by
230: %Barbuti {\em et al.} in \cite{BarbutiControl}.
231: %Their semantics is
232: parametric with respect to a ``termination theo\-ry''. The latter
233: is intended to be provided from outside, for instance
234: by applying proofs procedures.
235: Fil\`e and Rossi \shortcite{FileRossi}
236: propose
237: an operational and non-compositional
238: abstract interpretation framework for Prolog with cut
239: %has been defined by Fil\`e and Rossi \cite{FileRossi}.
240: consisting of a tabled interpreter to visit OLDT abstract trees
241: decorated with information about sure success or failure of goals.
242: Finally, Spoto \shortcite{Spoto}
243: define
244: an abstract goal-independent denotational
245: semantics for Prolog handling control rules and cut.
246: %has been introduced by F. Spoto and G. Levi in \cite{Spoto}.
247: Program denotations are adorned with
248: ``observability'' constraints giving information about
249: divergent computations and cut executions.
250: We know of no experimental results
251: validating the effectiveness of these
252: approaches.
253:
254: In this paper we present a novel abstract interpretation framework for Prolog
255: which models the depth-first search rule and the cut
256: operator. It relies on the notion of {\em substitution sequence}
257: which allows us to collect the solutions to a goal
258: together with
259: information such as sure success and failure, the number of solutions,
260: and/or termination.
261: The framework that we propose can be applied to perform predicate
262: level analyses, such as determinacy, which were not expressible by classical frameworks, and
263: can be also used to improve the accuracy of
264: existing analyses.
265: Experiments on a sample analysis, namely cardinality analysis, will be
266: discussed.
267:
268:
269:
270: \subsection{Some Motivating Examples}
271:
272: In this section we illustrate by means of
273: small examples the
274: functionality of our static analyzer
275: and we discuss
276: how it improves on previous abstract interpretation frameworks.
277: Experimental results on medium-size programs will be reported later.
278:
279:
280: The first two examples show that predicate level properties,
281: such as determinacy,
282: which are out of the scope of traditional
283: abstract interpretation frameworks can be captured by our analyzer.
284: To the best of our knowledge,
285: does not exist any specific analysis which
286: can infer determinacy of all the programs that are discussed hereafter.
287:
288: Consider first the procedure {\tt is\_last}:
289:
290: \begin{tt} \begin{tabbing} 12\=1234\=\kill
291: \> is\_last(X,[X]). \\
292: \> is\_last(X,[\_|T]) :- is\_last(X,T). \end{tabbing} \end{tt}
293:
294: %\noindent
295: When given the input pattern {\tt is\_last(var,ground)}, where {\tt var}
296: and {\tt ground}
297: denote the set of all variables and
298: the set of all ground terms respectively, our analy\-sis
299: returns the abstract sequence
300: $\langle${\tt is\_last(ground,[ground|ground]),0,1,pt}$\rangle$, where
301: {\tt is\_last(ground,[ground|ground])} is the pattern characterizing the
302: output substitutions,
303: 0 and 1 are,
304: respectively,
305: the minimum and the maximum number of
306: returned
307: output substitutions, and
308: {\tt pt} stands for
309: ``possible termination''.
310: %We know of no abstract interpretation framework
311: %able to infer determinacy of such program.
312: % Moreover,
313: % also the analysis
314: %developed by Debray and Warren in \cite{Debray89a}
315: %to infer functional computations of a logic program
316: %cannot infer determinacy of this program.
317:
318: Consider now the following two versions of the procedure {\tt partition}.
319:
320: \begin{tt}
321: \begin{tabbing}
322: 12\=1234\=\kill
323: \> partition([],P,[],[]). \\
324: \> partition([S|T],P,[S|Ss],Bs) :-
325: S $\leq$ P, !,
326: partition(T,P,Ss,Bs). \\
327: \> partition([B|T],P,Ss,[B|Bs]) :-
328: partition(T,P,Ss,Bs).
329: \end{tabbing}
330: \end{tt}
331:
332: \begin{tt}
333: \begin{tabbing}
334: 12\=1234\=\kill
335: \> partition([],P,[],[]). \\
336: \> partition([S|T],P,[S|Ss],Bs) :-
337: leq(S,P), partition(T,P,Ss,Bs). \\
338: \> partition([B|T],P,Ss,[B|Bs]) :-
339: gt(B,P), partition(T,P,Ss,Bs). \\
340: \> leq(K1-V1,K2-V2) :- K1 $\leq$ K2. \\
341: \> gt(K1-V1,K2-V2) :- K1 $>$ K2.
342: \end{tabbing}
343: \end{tt}
344:
345: %\noindent
346: Note that the second version of the procedure
347: calls arithmetic predicates through an
348: auxiliary predicate and is appropriate for a key sort. Given an input
349: pattern {\tt partition(ground,ground,var,var)}, our analysis returns in both
350: cases the abstract sequence
351: $\langle${\tt partition(ground,ground,ground,ground),0,1,pt}$\rangle$.
352: Input/out\-put patterns are used to
353: determine that the first clause and the two others are mutually
354: exclusive in both programs, while the cut (in the first version) and
355: the abstraction of arithmetic predicates (in the second version)
356: determine the mutual exclusion of the second and the third clause.
357: Thus we can infer determinacy of both versions of the procedure
358: {\tt partition}.
359:
360: As stated above, we don't know of any
361: static analysis for logic programs
362: which can infer
363: determinacy of all these programs.
364: For instance, the analysis
365: developed by Debray and Warren \shortcite{Debray89a}
366: to detect functional computations of a logic program
367: cannot infer determinacy of the procedure ${\tt is\_last}$;
368: the determinacy analysis proposed by Dawson {\em et al.}
369: \shortcite{Ramakrishnan93},
370: while it can handle the second version of the procedure
371: {\tt partition}, it cannot handle the first version of it since it does not
372: deal with the cut; for the same reason,
373: the analysis of Giacobazzi and Ricci
374: \shortcite{Giacobazzi92} cannot treat the first version of the procedure
375: {\tt partition};
376: and the
377: cardinali\-ty analysis defined by Sahlin \shortcite{Sahlin91} cannot handle
378: any of the examples discussed above since it ignores predicate
379: arguments.
380: %Finally, the only system we know
381: %that can handle the second version of the procedure
382: %{\tt partition} is the one developed by Dawson {\em et al.}
383: %\shortcite{Ramakrishnan93}.
384:
385: The next example shows that the use of abstract sequences can improve
386: on the analysis of variable level properties such as modes.
387:
388: Consider the procedure {\tt compress(L,Lc)}, which relates two
389: lists {\tt Lc} and {\tt L}
390: such that {\tt Lc} is a compressed version of {\tt L}.
391: For instance, the
392: compressed version of the list
393: {\tt [a, b, b, c, c, c]} is {\tt [a, 1, b, 2, c, 3]}. A
394: library can contain the definition of a single procedure to handle
395: both compression and decompression as follows.
396:
397: \begin{tt}
398: \begin{tabbing}
399: 12\=1234\=\kill
400: \> compress(A,B) :-
401: var(A), !, decmp(A,B). \\
402: \> compress(A,B) :-
403: cmp(A,B).
404: \end{tabbing}
405: \end{tt}
406:
407: \begin{tt}
408: \begin{tabbing}
409: 12\=1234\=\kill
410: \> cmp([],[]). \\
411: \> cmp([C],[C,1]). \\
412: \> cmp([C1,C2|T],[C1,1,C2,N|Rest]) :-
413: C1 \hskip-0.14cm <> \hskip-0.14cm C2,
414: cmp([C2|T],[C2,N|Rest]). \\
415: \> cmp([C1,C1|T],[C1,N1|Rest]) :-
416: cmp([C1|T],[C1,N|Rest]),
417: N1 \hskip-0.14cm := \hskip-0.14cm N \hskip-0.14cm + \hskip-0.14cm 1.
418: \end{tabbing}
419: \end{tt}
420:
421: \begin{tt}
422: \begin{tabbing}
423: 12\=1234\=\kill
424: \> decmp([],[]). \\
425: \> decmp([C],[C,1]). \\
426: \> decmp([C1,C2|T],[C1,1,C2,N|Rest]) \hskip-0.2cm :- \hskip-0.15cm
427: decomp([C2|T],[C2,N|Rest]),
428: C1 \hskip-0.14cm <> \hskip-0.14cm C2. \\
429: \> decmp([C1,C1|T],[C1,N1|Rest]) \hskip-0.2cm {:-} \hskip-0.15cm
430: N1 \hskip-0.14cm > \hskip-0.14cm 1, \hskip-0.1cm
431: N \hskip-0.14cm := \hskip-0.14cm N1 \hskip-0.14cm - \hskip-0.14cm 1,\\
432: \hskip6.1cm
433: decmp([C1|T],[C1,N|Rest]).
434: \end{tabbing}
435: \end{tt}
436:
437:
438: %\noindent
439: Given the input patterns {\tt compress(ground,var)} and {\tt compress(var,ground)}, our
440: analysis returns the abstract sequence $\langle${\tt
441: compress(ground,ground),0,1,pt}$\rangle$ for both the inputs.
442: This example
443: illustrates many of the functionalities of our sy\-stem, including
444: input/output patterns, abstraction of arithmetic and meta-pre\-di\-ca\-tes,
445: and the cut, all of which are necessary to obtain the optimal
446: precision. In addition, it shows that taking the cut into account
447: improves the analysis of modes.
448: Indeed, a mode analysis
449: ignoring the cut would return the output pattern ${\tt
450: compress(novar,ground)}$ for the input pattern {\tt compress(var,ground)}, losing
451: the groundness information. None of the
452: abstract interpretation algorithms for logic programs we know of
453: %systems or proposals
454: can handle this example with an optimal result.
455: Moreover, if
456: a program only uses the input pattern {\tt compress(var,ground)}, our
457: analysis detects that the second clause of {\tt compress} is dead code
458: without any extra processing since no input/output pattern exists for
459: {\tt comp}. The second clause, the test {\tt var}, and the cut
460: of the first clause can then be removed by an optimizer.
461:
462: Notice that there exist implemented tools for the static
463: analysis of Prolog programs, such as PLAI
464: \cite{Muthukumar92}, which can achieve as accurate success
465: and dead-code information
466: as our analyzer. However, such tools usually
467: integrate
468: several analyses based on different techniques which
469: are
470: not all justified by the
471: abstract interpretation framework.
472: The example of the procedure {\tt compress} shows that
473: our analy\-zer can
474: handle control features of the language
475: within the abstract interpretation framework
476: without the need of any extra consideration.
477:
478:
479:
480:
481:
482: \subsection{Sequence-Based Abstract Interpretation of Prolog}
483:
484: An abstract interpretation framework \cite{CousotJLC92} is centered around the
485: definition of a non-standard
486: %(or {\em abstract})
487: semantics
488: approximating a concrete semantics of the language.
489: %This makes it possible to compute in
490: %finite time a description of a (possibly infinite) set of program
491: %run-time behaviors.
492: %The abstract semantics is defined in terms of
493: %abstract values and abstract operations. Intuitively, each abstract
494: %value describes a set of concrete values while each abstract operation
495: %is a safe approximation of a concrete operation.
496:
497: Most top-down abstract interpretation
498: frameworks for logic programs, see, for instance,
499: \cite{Bruynooghe91,Codognet92a,%
500: Jones87,TOPLAS,Marriott89,%
501: Mellish87,Muthukumar92,Nilsson90a,Warren92,Winsborough92},
502: can be viewed as abstractions of
503: a concrete structural operational semantics \cite{Plotkin81}.
504: Such a semantics defines the meaning of a program as a
505: {\em transition relation} described in terms of transition rules of the form
506: $\langle \theta, o \rangle \longmapsto \theta'$, where the latter expresses the fact
507: that $\theta'$ is a possible output from the execution of
508: the construct $o$ (i.e., a procedure, a clause, etc.) called with input
509: $\theta$.
510: %Similarly to a denotational semantics, a structural operational
511: %semantics is {\em compositional}.
512: This structural operational semantics can easily be
513: rephrased as a fixpoint semantics
514: mapping any input pattern $\langle \theta, o \rangle$ to the set of all corresponding outputs $\theta'$. The fixpoint semantics
515: can then be lifted to a {\em collecting semantics} that maps
516: sets of inputs to sets of outputs and is defined as
517: the least fixpoint of a set-based
518: transformation.
519: The {\em non-standard} (or {\em abstract}) {\em semantics} is identical to the
520: collecting one except that it uses abstract values instead of
521: sets and abstract operations instead of
522: operations over sets.
523: Finally, an abstract interpretation algorithm can be derived by
524: instantiating a generic fixpoint algorithm
525: \cite{universal} to the abstract semantics.
526:
527: The limitations of traditional top-down frameworks for Prolog stem
528: from the fact that structural operational semantics
529: are unable to take the depth-first search rule into account.
530: Control operators such as the cut cannot be modeled
531: and are thus simply ignored. To overcome these limitations, we
532: propose a concrete semantics of Prolog which describes the result of program executions
533: in terms of substitution sequences. This allows us to
534: model the
535: depth-first search rule and the cut operator. The semantics
536: is defined in the denotational setting to deal
537: with sequences resul\-ting from the execution of infinite computations.
538: Moreover, it is still compositional
539: allowing us to reuse most of the material of our previous works,
540: i.e., the abstract domains and the generic algorithm \cite{TOPLAS}.
541: \\
542: However, technical proble\-ms arise when applying the
543: abstract interpretation approach
544: described above. Let us informally explain
545: the main ideas behind the definition of our framework.
546:
547: First,
548: we define a {\em concrete semantics} as the
549: least fixpoint of a concrete transformation {\it TCB}
550: mapping every so-called concrete behavior $\longmapsto$ to another
551: concrete behavior $\stackrel{{\it TCB}}\longmapsto$. The notion
552: of concrete behavior is our
553: denotation choice for a Prolog program: it is a function
554: that maps pairs of the form $\langle \theta, p \rangle $
555: % , where $p$
556: % in a n-ary procedure name and $\theta$ is a substitution over the
557: % variables $x_1,\dots,x_n$
558: to a substitution sequence $S$, which
559: intuitively represents the sequence of computed answer substitutions
560: returned by the query $p(x_1,\dots,x_n)\theta$.
561: % Although concrete
562: % behaviors are functions, we use relational symbols to denote them
563: % in order to emphasize the similarities between our concrete
564: % semantics and the structural operational semantics for Prolog used
565: % in \cite{ACTA95}. We write $ \langle \theta, {\it p} \rangle
566: % \longmapsto S $ to express that the concrete behavior $\longmapsto$
567: % maps the pair $ \langle \theta, {\it p} \rangle $ to the sequence $S$.
568: The fixpoint construction of the concrete semantics relies on
569: a suitable ordering $\sqsubseteq$ defined on sequences.
570: % Our concrete semantics is strictly {\em
571: % instrumental} in that it is only a basis for the abstract
572: % interpretation framework. Moreover, its operations
573: % are easily obtained by extending the concrete operations of the
574: % structural operational semantics \cite{ACTA95} upon which our
575: % previous frameworks are based. This allows us to
576: % reuse much of the material from our existing abstract domains and
577: % generic algorithms (e.g., \cite{SPE,ICLP91AI,TOPLAS,ACTA95}).
578:
579: Second,
580: a collecting transformation {\it TCD} is obtained by lifting the concrete
581: transformation {\it TCB} to sets of substitutions and sets of
582: sequences. The transformation
583: {\it TCD} is monotonic with respect to set inclusion. However,
584: its least fixpoint does not safely
585: approximate the concrete semantics. In fact, the least set
586: with respect to inclusion, that is the empty set $\{\}$, does not contain the
587: least substitution sequence with respect to $\sqsubseteq$, which is a
588: special sequence denoted by $<\bot>$. The problem relies on the fact that
589: an ordering
590: on sets of sequences that ``combines'' both the ordering
591: $\sqsubseteq$ on sequences and the ordering $\subseteq$ on sets is needed.
592: This is an instance of the power domain construction problem
593: \cite{Schmidt88}, which is difficult in general.
594: We choose a more pragmatic solution which
595: consists in restricting to {\em chain-closed}
596: sets of sequences, i.e., sets containing the limit of
597: every increasing chain, with respect to $\sqsubseteq$, of their elements.
598: We also introduce the notion of
599: {\em pre-consistent} collecting behavior which, roughly
600: speaking, contains a lower approximation, with respect to
601: $\sqsubseteq$, of the concrete semantics (the least fixpoint of
602: {\it TCB}). The transformation {\it TCD} maps
603: pre-consistent collecting behaviors to other pre-consistent ones.
604: Moreover, assuming that sets of sequences
605: are {\em chain-closed}, any pre-consistent post-fixpoint, with
606: respect to set inclusion, of {\it TCD} safely approximates the
607: concrete semantics. These results imply that a safe collecting
608: behavior can be constructed by iterating on {\it TCD} from any
609: initial pre-consistent collecting behavior and by applying some
610: widening techniques \cite{Cousot92c} in order to reach a
611: post-fixpoint.
612:
613:
614: Third,
615: the {\it abstract semantics} is defined exactly as
616: the collecting one except that it is parametric with respect to the
617: abstract domains.
618: % whose concretizations are assumed to be sets of
619: % substitutions or sets of substitution sequences.
620: In fact, we do not
621: explicitly distinguish between the collecting and the abstract
622: semantics: in our presentation, the collecting transformation {\it
623: TCD} is just a particular instance of the (generic) abstract
624: transformation {\it TAB}.
625:
626:
627: Finally, a generic abstract interpretation algorithm
628: % that is parametric over abstract domains,
629: is derived from the abstract semantics. The algorithm is
630: essentially an instantiation of the universal fixpoint algorithm
631: described in \cite{universal}.
632: % The latter is top-down and computes a subset of
633: % the fixpoint containing the output value for a particular
634: % input and also all input/output pairs needed to
635: % compute the result.
636: % Although the instantiation of
637: % \cite{universal} to our abstract semantics is as mechanical as in
638: % our previous works \cite{TOPLAS}, the correctness of the novel algorithm
639: % involves new theoretical issues.
640: % Indeed, the algorithm in
641: % \cite{universal} is able to cope with non-monotonic transformations by
642: % computing post-fixpoints
643: % instead of regular (least) fixpoints. Nevertheless, the correctness
644: % proofs in \cite{universal} do not address this point since it does
645: % not apply to traditional analyses. Instead,
646: % In the new framework we additionally need to prove
647: % that the computed post-fixpoint is
648: % pre-consistent and thus safe.
649:
650:
651: \subsection{Plan of the Paper}
652: The paper is organized as follows.
653: Section \ref{CS} and Section \ref{sec:AS} describe, respectively,
654: our concrete and abstract semantics for pure Prolog augmented
655: with the cut.
656: %The correctness results are stated and proven.
657: The generic abstract interpretation algorithm is discussed in Section
658: \ref{sec:GAIA}.
659: Section \ref{sec:FASAS} is a revised and extended version
660: of \cite{cardinality}.
661: It describes an instantiation of our abstract interpretation
662: framework
663: %to perform so-called cardinality analysis,
664: to approximate the
665: number of solutions to a goal. Experimental results are reported.
666: %In particular, first it shows how a
667: %generic domain of abstract sequences can be built from {\em any}
668: % domain of abstract substitutions; then it
669: %describes the instantiation of such a
670: % generic domain
671: %to (an extended version of) the abstract domain {\tt
672: % Pattern} \cite{ICLP91AI,TOPLAS}; finally, it discusses
673: %experimental results.
674: %Our cardinality analysis subsumes both variable level analyses,
675: %like mode and sharing,
676: %and also predicate level analyses, like determinacy. To
677: %clarify this benefit,
678: In Section \ref{sec:RDASBAI}
679: we consider related works on determinacy
680: analysis.
681: % We show that the works of Sahlin
682: %\cite{Sahlin.PhD,Sahlin90} and of Debray and Warren \cite{Debray89a} can
683: %be seen as instances of our framework. We also
684: %discuss the work of Giacobazzi and Ricci \cite{Giacobazzi92}, which is
685: %based on a {\em bottom-up} abstract interpretation \cite{Barbuti90}.
686: %Their abstract domain can also be integrated to our system.
687: %Nevertheless, they
688: %do not handle the depth-first search rule
689: %of Prolog and so the cut operator.
690: Section \ref{conclusion} concludes the paper.
691:
692:
693:
694: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
695: \section{Concrete semantics}
696: \label{CS}
697: %%%%%%%%%%%%%%%%%%%%%%%%%%%%
698:
699: This section describes a concrete semantics
700: for pure Prolog augmented with the cut.
701: The concrete semantics is the link between the standard semantics
702: of the language and
703: the abstract one.
704: Our concrete semantics is denotational and is based on the notion of
705: substitution sequence.
706: Correctness of the concrete semantics with respect to Prolog
707: standard semantics,
708: i.e., OLD-resolution, is discussed.
709: Most proofs are omitted here; all details can be found in
710: \cite{Cut_rep95}.
711:
712: %\noindent
713: %The description of the concrete semantics
714: %is organized as follows.
715: %Section~\ref{syntax} provides an abstract syntax of normalized programs,
716: %which is suited for the compositional description of the
717: %semantics.
718: %Section~\ref{main:BSD} describes the basic semantic domains of substitutions.
719: % Section~\ref{PSS} introduces program
720: %substitution sequences. Program substitution sequences with cut information are studied
721: %in Section~\ref{pss_cut}. Concrete behaviors, which model the semantics
722: %of programs, are defined in Section~\ref{CB}. The operations
723: %of the concrete semantics are defined in Section~\ref{CO}.
724: %The concrete semantic rules are given in Section~\ref{CSR}.
725: %The concrete semantics itself is defined in Section~\ref{TCS}.
726: %The equivalence between the concrete semantics and the standard
727: %semantics of Prolog is discussed in Section~\ref{CCS}.
728: %Finally, Section~\ref{RW_CS} briefly discusses
729: %related works.
730:
731:
732:
733:
734:
735:
736: \subsection{Syntax}
737: \label{syntax}
738:
739: The abstract interpretation framework presented in this paper
740: assumes that programs are normalized according to the abstract
741: syntax given in
742: Fig.~\ref{abstract_syntax}. The variables occurring in a literal are distinct;
743: distinct procedures have distinct names; all clauses of a procedure
744: have exactly the same head; if a clause uses $m$ different program
745: variables, these variables are $x_1$, \dots, $x_m$.\\
746:
747:
748: \begin{figure}[h]
749: \figrule
750: \begin{it}
751: %
752: \begin{tabular}{lclllcl}
753: P &$ \in$& Programs & & P & ::= & pr $|$ pr P \\
754: pr & $ \in$& Procedures & & pr & ::=& c $|$ c pr \\
755: c &$ \in$& Clauses & & c & ::=& h :- g. \\
756: h &$ \in$& ClauseHeads & & h & ::=& p($x_1$, \ldots, $x_n$) \\
757: g &$ \in$& ClauseBodyPrefixes & & g & ::=& $<>$ $|$ g , l \\
758: l &$ \in$& Literals & & l & ::=& p($x_{i_1}$, \ldots, $x_{i_n}$) $|$ b \\
759: b &$ \in$& Built-ins & & b & ::=& $x_i$=$x_j$ $|$ $x_{i_1}$=f($x_{i_2}$, \ldots, $x_{i_n}$) $|$ ! \\
760: p &$ \in$& ProcedureNames\\
761: f &$ \in$& Functors\\
762: $x_i$ &$ \in$& ProgramVariables (PV)
763: \end{tabular}
764: \end{it}
765: \caption{Abstract syntax of normalized programs}
766: \label{abstract_syntax}
767: \end{figure}
768:
769:
770: %----------------------------------------------------------------------------%
771:
772: \subsection{Basic Semantic Domains}
773: \label{main:BSD}
774:
775: This section presents the basic semantic domains of substitutions.
776: Note that we assume a preliminary knowledge of
777: logic programming; see, for instance \cite{Apt97,Lloyd}.\\
778:
779: \noindent
780: {\bf Variables and Terms.}
781: %\label{Variablesterms}
782: We assume the existence of two disjoint and infinite sets of variables, denoted by
783: ${\it PV}$ and ${\it SV}$.
784: Elements of ${\it PV}$ are called {\it program variables} and are denoted by
785: $x_1$, $x_2$, \dots, $x_i$, \dots. The set ${\it PV}$ is totally ordered;
786: $x_i$ is the $i$-th element of ${\it PV}$.
787: % and the index
788: %$i$ in the symbol $x_i$ means that $x_i$ is the $i$-th element of ${\it PV}$.
789: Elements of ${\it SV}$ are called {\em standard variables} and are denoted by
790: letters $y$ and $z$ (possibly subscripted).
791: %Finite subsets of ${\it PV}$ are denoted by $D$.
792: Terms are built using standard variables only.
793: % they are denoted by the letter $t$.
794: \\
795:
796: \noindent
797: {\bf Standard Substitutions.}
798: Standard substitutions are substitutions in the usual
799: sense
800: %\cite{Apt97,Lloyd}
801: which use standard variables only.
802: The set of standard substitutions is denoted by ${\it SS}$.
803: %Standard substitutions are normally denoted by
804: %the letter $\sigma$.
805: Renamings are standard substitutions that define a permutation
806: of standard variables.
807: % are denoted by the letter $\rho$.
808: The domain and the codomain of a standard substitution $\sigma$
809: are denoted by ${\it dom}(\sigma)$ and ${\it codom}(\sigma)$, respectively.
810: We denote by ${\it mgu}(t_1,t_2)$ the set of standard substitutions
811: that are a most general unifier of terms $t_1$ and $t_2$.\\
812:
813: \noindent
814: {\bf Program Substitutions.}
815: A program substitution is a set
816: $\{x_{i_1}/t_1,\dots,x_{i_n}/t_n\}$, where $x_{i_1},\ldots,x_{i_n}$ are distinct program
817: variables and
818: $t_1$, \dots, $t_n$ are terms.
819: %The variables occurring in $t_1$, \dots, $t_n$ are standard variables.
820: Program substitutions are not substitutions in the usual sense;
821: they are best understood as a form of program store
822: which expresses the state of the computation at a given
823: program point. It is meaningless to compose them as usual substitutions
824: or to use them to express most general unifiers.
825: %Program substitutions are denoted by the letter $\theta$.
826: The domain of a program substitution
827: $\theta=\{x_{i_1}/t_1,\dots,x_{i_n}/t_n\}$, denoted by
828: ${\it dom}(\theta)$, is the set of
829: program variables $\{x_{i_1},\dots,x_{i_n}\}$.
830: The codomain of $\theta$, denoted by
831: ${\it codom}(\theta)$, is the set of
832: standard variables occurring in $t_1,\dots,t_n$.
833: Program and standard substitutions
834: cannot be composed. Instead, standard substitutions
835: are {\em applied} to program substitutions.
836: The application of a standard substitution $\sigma$ to
837: a program substitution $\theta=\{x_{i_1}/t_1,\dots,x_{i_n}/t_n\}$
838: %denoted by $\theta\sigma$,
839: is the program substitution
840: $\theta\sigma=\{x_{i_1}/t_1\sigma,\dots,x_{i_n}/t_n\sigma\}$.
841: The set of program substitutions is denoted by ${\it PS}$.
842: The application $x_i\theta$ of a program substitution $\theta$
843: to a program variable $x_i$ is defined only if $x_i\in{\it dom}(\theta)$;
844: it denotes the term bound to $x_i$ in $\theta$.
845: Let $D$ be a finite subset of ${\it PV}$ and
846: $\theta$ be a program substitution such that
847: $ D \subseteq {\it dom}(\theta)$.
848: The {\em restriction} of $\theta$ to $D$, denoted by $\theta_{/D}$,
849: is the program substitution such that ${\it dom}(\theta_{/D})= D$ and
850: $x_i(\theta_{/D})=x_i\theta$,
851: for all $x_i \in D$.
852: We denote by ${\it PS}_D$ the set of program substitutions with
853: domain $D$.\\
854:
855: \noindent
856: {\bf Canonical Program Substitutions.}
857: %\label{main:CPS}
858: We say that two program substitutions
859: $\theta$ and $\theta'$ are {\it equivalent} if and only if
860: there exists a renaming
861: $\rho$ such that $\theta\rho=\theta'$.
862: We assume that,
863: for each program substitution $\theta$, we are given a canonical
864: representative, denoted by $[\![\theta]\!]$,
865: of the set of all program substitutions that are equivalent to $\theta$.
866: We denote by ${\it CPS}$
867: the set of all
868: %canonical representatives, called
869: {\em canonical
870: program substitutions} $[\![\theta]\!]$.
871: %is denoted by ${\it CPS}$.
872: For any finite set of program variables $D$,
873: we denote by ${\it CPS}_D$ the set ${\it PS}_D\cap {\it CPS}$.
874:
875:
876: \subsection{Program Substitution Sequences}
877: \label{PSS}
878: Program substitution sequences are intended to model the sequence of
879: computed answer substitutions returned by a goal, a clause, or a procedure.
880: %(see Sections \ref{pss_f} and \ref{pss_nf} for a detailed motivation).
881: %In this section,
882: %we formally describe the semantic domains of program substitution sequences.
883: \\
884:
885: \noindent
886: {\bf Program Substitution Sequences.}
887: Let us denote by $\Nstar$ the set of positive natural numbers.
888: A program substitution sequence
889: is either a {\em finite}
890: sequence of the form $<\theta_1, \ldots, \theta_n>$ ($n \geq 0$) or
891: an {\em incomplete} sequence of the form $<\theta_1, \ldots, \theta_n,\bot>$
892: ($n \geq 0$) or an {\em infinite} sequence of the form
893: $<\theta_1, \ldots, \theta_i, \ldots>$ ($i\!\in\!\Nstar $), where the
894: $\theta_i$ are program substitutions with the same domain.
895: %Program substitution sequences are denoted by $S$.
896: We use the notation $<\theta_1, \ldots, \theta_i, \_>$ to represent
897: a program substitution sequence when it is not known whether it is finite,
898: incomplete or infinite.
899: Let $S$ be a program substitution sequence.
900: We denote by ${\it Subst}(S)$ the
901: set of program substitutions that are elements of $S$.
902: The domain of
903: $S$ is defined when $S\neq<>$ and $S\neq<\bot>$. In this case,
904: %the domain of $S$, denoted by
905: ${\it dom}(S)$ is the domain of
906: the program substitutions belonging to ${\it Subst}(S)$.
907: The set of all program substitution sequences is denoted by {\it PSS}.
908: Let $D$ be a finite set of program variables. We denote by
909: {\it PSS}$_D$ the set of all program substitution sequences
910: with domain $D$ augmented
911: with $<>$ and $<\bot>$.
912: Let $S\in{\it PSS}_D$ be a sequence $<\theta_1, \ldots, \theta_i, \_>$
913: and $D'\subseteq D$.
914: The {\em restriction} of $S$ to $D'$, denoted by $S_{/{D'}}$,
915: is the program substitution sequence $<{\theta_1}_{/{D'}}, \ldots, {\theta_i}_{/{D'}}, \_>$.
916: The number of elements of $S$,
917: including the special element $\bot$, is denoted by ${\it Ne}(S)$.
918: The number of elements of $S$ that are substitutions
919: is denoted by ${\it Ns}(S)$.
920: %Hence,
921: %if $S$ is incomplete then
922: %${\it Ne}(S)={\it Ns}(S)+1$; in the other cases,
923: %if $S$ is finite or infinite then
924: %${\it Ne}(S)={\it Ns}(S)$.
925: % otherwise, i.e., if $S$ is incomplete,
926: %${\it Ne}(S)={\it Ns}(S)+1$.
927: Sequence concatenation is denoted by $::$ and it is used only when its
928: first argument is a finite sequence.\\
929:
930:
931: \noindent
932: {\bf Canonical Substitution Sequences.}
933: %\label{main:CSS}
934: The canonical mapping $[\![\cdot]\!]$ is lifted to sequences as
935: follows.
936: Let $S$ be a program substitution sequence $<\theta_1, \ldots, \theta_i, \_>$.
937: We define
938: $[\![S]\!]=
939: %as the program substitution sequence
940: <[\![\theta_1]\!], \ldots, [\![\theta_i]\!], \_>$.
941: We denote by {\it CPSS} the set of all
942: {\em canonical substitution sequences} $[\![S]\!]$
943: and by {\it CPSS}$_D$ the set ${\it PSS}_D \cap {\it CPSS}$,
944: for any finite subset $D$ of {\it PV}.
945: \\
946:
947: \noindent
948: {\bf CPO's of Program Substitution Sequences.}
949: %\label{main:CPSS}
950: The sets {\it PSS}, {\it PSS}$_D$, {\it CPSS} and {\it CPSS}$_D$
951: can be endowed with a structure of {\it pointed cpo} as described below.
952: %Let us first introduce the relation $\sqsubseteq$ on program substitution sequences.
953:
954:
955: \begin{definition}[Relation $\sqsubseteq$ on Program Substitution Sequences]
956: \label{seq_ordering}
957: %
958: Let $S_1,S_2\!\in\!{\it PSS}$. We define
959: \begin{center}
960: \begin{tabular}{llll}
961: $S_1 \sqsubseteq S_2$ & iff & either & $S_1=S_2$ \\
962: & & or &there exists $S,S'\!\in\!
963: {\it PSS}$ such that $S$ is finite,\\
964: & & &$S_1=S::<\bot>$ and $S_2=S::S'$.
965: \end{tabular}
966: \end{center}
967: \end{definition}
968: %
969: %\noindent
970:
971: The relation $\sqsubseteq$ on program substitution sequences is an ordering
972: and
973: the pairs $\langle {\it PSS},\sqsubseteq\rangle $,
974: $\langle {\it CPSS},\sqsubseteq\rangle $,
975: $\langle {\it PSS}_D,\sqsubseteq\rangle $, and
976: $\langle {\it CPSS}_D,\sqsubseteq\rangle $
977: are all pointed cpo's.
978: % (see \cite{Cut_rep95}).
979: \\
980:
981:
982: We denote by
983: $(S_i)_{i\in\N}$ an increasing chain,
984: $S_0 \sqsubseteq S_1 \sqsubseteq \ldots \sqsubseteq S_i \sqsubseteq
985: \ldots$ in
986: {\it PSS}; whereas we denote by $\{S_i\}_{i\in\N}$ a,
987: non necessarily increasing,
988: sequence of elements of
989: {\it PSS}.
990: \\
991:
992: \noindent
993: {\bf Lazy Concatenation.}
994: %\label{lazy_conc}
995: %In the definition of the concrete semantics,
996: Program substitution sequences are combined
997: through the operation $\Box$ and its extensions $\Box_{k=1}^{n}$ and $\Box_{k=1}^{\infty}$
998: defined below.
999:
1000:
1001:
1002: %
1003: \begin{definition}[Operation $\Box$]
1004: Let $S_1$, $S_2\!\in\! {\it PSS}$.
1005: \begin{center}
1006: \begin{tabular}{llcll}
1007: $S_1 \Box S_2$ & $=$ & $S_1::S_2$ & if & $S_1$ is finite\\
1008: & $=$ & $S_1$ & if & $S_1$ is incomplete or infinite.\\
1009: \\
1010: \end{tabular}
1011: \end{center}
1012: \end{definition}
1013:
1014:
1015: \begin{definition}[Operation $\Box_{k=1}^{n}$]
1016: Let $\{S_k\}_{{k\in\Nstar}}$ be an
1017: infinite sequence of program substitution sequences (not neces\-sarily a chain).
1018: For any $n \geq 1$, we define:
1019: \begin{center}
1020: \begin{tabular}{lll}
1021: $\Box_{k=1}^{0}S_k$ & $=$ & $<\;>$ \\
1022: $\Box_{k=1}^{n}S_k$ & $=$ & $(\Box_{k=1}^{n-1}S_k)\Box S_n.$\\
1023: \\
1024: \end{tabular}
1025: \end{center}
1026: \end{definition}
1027:
1028:
1029:
1030: \begin{definition}[Operation $\Box_{k=1}^{\infty}$]
1031: Let $\{S_k\}_{{k\in\Nstar}}$ be an
1032: infinite sequence of program substitution sequences.
1033: The infinite sequence $\{S'_i\}_{i\in\N}$ where
1034: $S'_i= (\Box_{k=1}^{i}S_k)\Box<\bot>$ $({i\in\N})$ is
1035: a chain. So we are allowed to define:
1036: \begin{center}
1037: \begin{tabular}{lllll}
1038: $\Box_{k=1}^{\infty}S_k$ & = & $\sqcup_{i=0}^{\infty} S'_i$
1039: & = &
1040: $\sqcup_{i=0}^{\infty}((\Box_{k=1}^{i}S_k)\Box
1041: <\bot>)$.\\
1042: \\
1043: \end{tabular}
1044: \end{center}
1045: \end{definition}
1046: %
1047: %\noindent
1048:
1049: The operation $\Box$ is associative; hence, it is meaningful to
1050: write $S_1 \Box \ldots \Box S_n$ instead of
1051: $\Box_{k=1}^{n} S_k$. Operations $\Box$,
1052: $\Box_{k=1}^{n}$, and $\Box_{k=1}^{\infty}$ are continuous
1053: with respect to the ordering $\sqsubseteq$ on program substitution sequences.\\
1054: %Moreover, the sets ${\it CPSS}$, ${\it PSS}_D$,
1055: %and ${\it CPSS}_D$ are closed with respect to such operations.\\
1056:
1057:
1058: %
1059:
1060: \noindent
1061: {\bf Program Substitution Sequences with Cut Information.}
1062: %\label{pss_cut}
1063: Program substitution sequences with cut information
1064: are used to model the result of a clause together with
1065: information on cut executions.
1066:
1067: Let ${\it CF }$ be the set of cut flags $\{{\it cut},
1068: {\it nocut}\}$.
1069: A program substitution sequence with cut information
1070: is a pair $\langle S, cf\rangle$
1071: where $S\!\in\!{\it PSS}$ and $cf\!\in\!{\it CF }$.
1072: %The pair $\langle {\it PSS}\times {\it CF},\sqsubseteq \rangle $
1073: %is a pointed cpo for the relation $\sqsubseteq$ defined below.
1074: %We define the following relation.
1075: %We endow the sets
1076: %${\it PSS}\times {\it CF}$,
1077: %${\it PSS}_D\times {\it CF}$,
1078: %${\it CPSS}\times {\it CF}$ and
1079: %${\it CPSS}_D\times {\it CF}$
1080: %with a structure of pointed cpo as follows.
1081:
1082:
1083: \begin{definition} [Relation $\sqsubseteq$ on Substitution Sequences
1084: with~Cut~Information]
1085: \label{leqpssc}
1086: %
1087: Let $\langle S_1,{\it cf}_1 \rangle ,
1088: \langle S_2,{\it cf}_2 \rangle \!\in\! {\it PSS}\times {\it CF}$.
1089: We define
1090: \begin{center}
1091: \begin{tabular}{llll}
1092: $\langle S_1,{\it cf}_1 \rangle
1093: \sqsubseteq \langle S_2,{\it cf}_2 \rangle $
1094: %
1095: & iff & either &$S_1\sqsubseteq S_2$ and ${\it cf}_1 = {\it cf}_2$ \\
1096: & & or & $S_1=<\bot>$ and ${\it cf}_1={\it nocut}$.
1097: \\
1098: \\
1099: \end{tabular}
1100: \end{center}
1101: \end{definition}
1102: %\noindent
1103:
1104: The relation $\sqsubseteq$ on program substitution sequences
1105: with cut information is an or\-de\-ring.
1106: Moreover,
1107: the pairs
1108: $\langle {\it PSS}\times {\it CF},\sqsubseteq \rangle $,
1109: $\langle {\it PSS}_D\times {\it CF},\sqsubseteq \rangle $,
1110: $\langle {\it CPSS}\times {\it CF},\sqsubseteq \rangle $ and
1111: $\langle {\it CPSS}_D\times {\it CF},\sqsubseteq \rangle $
1112: are all pointed cpo's.
1113:
1114: We extend the definition of the operation $\Box$ to
1115: program substitution sequences with cut information.
1116: The extension is continuous
1117: in both the arguments.
1118:
1119: \begin{definition}[Operation $\Box$ with Cut Information]
1120: \label{OCI}
1121: Let $\langle S_1,{\it cf}\rangle\!\in\!{\it PSS}\times {\it CF}$
1122: and $S_2\!\in\!{\it PSS}$. We define
1123: \begin{center}
1124: \begin{tabular}{lllll}
1125: $\langle S_1,{\it cf}\rangle \Box S_2$
1126: & $=$ & $S_1\Box S_2$ & if & {\it cf} = {\it nocut}\\
1127: & & $S_1$ & if & {\it cf} = {\it cut}.
1128: \end{tabular}
1129: \end{center}
1130: \end{definition}
1131:
1132: %\vskip-1.2cm
1133: \subsection{Concrete Behaviors}
1134: \label{CB}
1135:
1136: The notion of concrete behavior
1137: provides a mathema\-ti\-cal model for the input/output behavior of programs.
1138: To simplify the presentation, we do not parameterize the semantics
1139: with respect to programs. Instead, we assume a given fixed underlying program
1140: {\it P}.
1141: %
1142:
1143: \begin{definition} [Concrete Underlying Domain]
1144: The {\em concrete underlying domain},
1145: denoted by {\it CUD},
1146: is the set of all pairs $\langle \theta,p \rangle$ such that
1147: $p$ is the name of a procedure {\it pr}
1148: of $P$ and
1149: $\theta\!\in\!{\it CPS}_{\{x_1,\dots,x_n\}}$,
1150: where $x_1,\dots,x_n$ are the variables
1151: occurring in the head of every clause of {\it pr}.
1152: \end{definition}
1153: %
1154:
1155: Concrete behaviors are functions but we
1156: denote them by the relation symbol $\longmapsto$ in order to
1157: stress the similarities between the concrete semantics and a
1158: structural operational semantics for logic programs defined in \cite{ACTA95}.
1159:
1160: \begin{definition} [Concrete Behaviors]
1161: A {\em concrete behavior}
1162: is a total function $\longmapsto: {\it CUD}\longrightarrow {\it CPSS}$
1163: mapping every pair $\langle \theta,p \rangle\!\in\!{\it CUD}$ to a
1164: canonical program substitution
1165: sequence $S$ such that,
1166: for every $\theta'\!\in\!{\it Subst}(S)$, there exists a standard substitution
1167: $\sigma$ such that $\theta'=\theta \sigma$.
1168: We denote by $\langle \theta,p \rangle\longmapsto S$ the fact that
1169: $\longmapsto$ maps the pair $\langle \theta,p \rangle$ to $S$.
1170: The set of all concrete behaviors is denoted by
1171: {\it CB}.
1172: \end{definition}
1173: %
1174: %
1175:
1176: The ordering $\sqsubseteq$ on program substitution sequences is
1177: lifted to concrete behaviors in a standard way \cite{Schmidt88}.
1178:
1179:
1180: \begin{definition} [Relation $\sqsubseteq$ on Concrete Behaviors]
1181: Let $\longmapsto_1 , \longmapsto_2\in\!{\it CB}$. We define
1182: \begin{center}
1183: \begin{tabular}{lclll}
1184: $ \longmapsto_1 \sqsubseteq \longmapsto_2$ & iff &
1185: ($\langle \theta,p \rangle\longmapsto_1 S_1$ and
1186: $\langle \theta,p \rangle\longmapsto_2 S_2$)
1187: imply
1188: $\ S_1 \sqsubseteq S_2$, \\
1189: & & for all
1190: $\langle \theta,p \rangle\!\in
1191: \!{\it CUD}.$
1192: \end{tabular}
1193: \end{center}
1194: \end{definition}
1195: %
1196:
1197: The following result is straightforward.
1198:
1199: \begin{proposition}
1200: \label{CB-cpo}
1201: $\langle {\it CB}, \sqsubseteq \rangle$ is a pointed cpo, i.e.,
1202: \begin{enumerate}
1203: \itemsep 2pt
1204: \item the relation $\sqsubseteq$ on
1205: {\it CB} is a partial order;
1206: \item {\it CB} has a minimum element, which is
1207: the concrete behavior
1208: $\longmapsto_\bot$ such that
1209: for all $\langle \theta,p \rangle\!\in\! {\it CUD}$,
1210: $\langle \theta,p \rangle\longmapsto_\bot<\bot>$;
1211: \item every chain $(\longmapsto_i)_{i\in\N}$ in {\it CB}
1212: has a least upper bound, denoted by
1213: $\sqcup_{i=0}^{\infty}\longmapsto_i$;
1214: $\sqcup_{i=0}^{\infty}\longmapsto_i$ is the concrete behavior
1215: $\longmapsto$ such that,
1216: for all $ \langle \theta,p \rangle\!\in \!{\it CUD}$,
1217: $\langle \theta,p \rangle\longmapsto \sqcup_{i=0}^{\infty}S_i$,
1218: where $\langle \theta,p \rangle\longmapsto_i S_i$
1219: $(\forall i\!\in\!\N)$.
1220:
1221: \end{enumerate}
1222: \end{proposition}
1223:
1224:
1225: \subsection{Concrete Operations}
1226: \label{CO}
1227:
1228: We specify here the concrete operations which are used in the
1229: definition of the concrete semantics.
1230: The choice of these particular operations is motivated by the fact that
1231: they have useful (i.e., practical)
1232: abstract counterparts (see Sections~\ref{sec:AS}, \ref{sec:GAIA}
1233: and~\ref{sec:FASAS}).
1234: %
1235: %The following notations will be used:
1236: %$c$ denotes a clause of {\it P};
1237: % $l$ denotes a literal occurring in the
1238: %body of $c$ (it can be a procedure call or a unification literal);
1239: %$D=\{x_1,\ldots,x_n\}$ is the set of all variables occurring in
1240: %the head of $c$;
1241: %$D'=\{x_1,\ldots,x_m\}$ is the set of all variables
1242: %occurring in $c$ (hence, $n\leq m$);
1243: %$D''=\{x_{i_1},\ldots,x_{i_{r}}\}$ is the set of all
1244: %variables occurring in
1245: %$l$;
1246: %$D'''$ is equal to $\{x_{1},\ldots,x_{r}\}$.
1247: The concrete operations are polymorphic since their
1248: exact signature depends on a clause $c$ or a literal $l$ or both.\\
1249:
1250: Let $c$ be a clause, $D=\{x_1,\ldots,x_n\}$ be the set of all variables
1251: occurring in the head of $c$, and
1252: $D'=\{x_1,\ldots,x_m\}$ ($n\leq m$) be the set of all variables
1253: occurring in $c$.\\
1254:
1255: \noindent
1256: {\bf Extension at Clause Entry}\ :\ \
1257: {\tt EXTC}$(c,\cdot):{\it CPS}_D
1258: \rightarrow({\it CPSS}_{D'}\times {\it CF}) $\ \\
1259: This operation extends a substitution $\theta$ on
1260: the set of variables in $D$
1261: to the set of variables in $D'$.
1262: Let $\theta\!\in\!{\it CPS}_{D}$.
1263: %
1264: \begin{quote}
1265: \begin{tabular}{llll}
1266: ${\tt EXTC}(c,\theta)$ & $ = $ &
1267: $ \langle < [\![\theta']\!]>, {\it nocut} \rangle$
1268: & \\
1269:
1270: \end{tabular}
1271: \end{quote}
1272: \noindent
1273: where $x_i\theta'=x_i\theta$ $(\forall i:1\leq i \leq n)$ and
1274: $x_{n+1}\theta'$, \dots , $x_{m}\theta'$ are
1275: distinct standard variables not belonging to ${\it codom}(\theta)$.
1276:
1277: %
1278: \vskip0.3cm
1279: %
1280:
1281: \noindent
1282: {\bf Restriction at Clause Exit}\ :\ \
1283: {\tt RESTRC}$(c,\cdot):({\it CPSS}_{D'}\times {\it CF})
1284: \rightarrow({\it CPSS}_D\times {\it CF}) $ \ \\
1285: This operation restricts a pair $\langle S,{\it cf}\rangle$,
1286: representing the result of the execution of $c$ on the set of
1287: variables in $D'$,
1288: to the set of variables in $D$.
1289: Let $\langle S, {\it cf}\rangle
1290: \!\in\!({\it CPSS}_D'\times {\it CF})$.
1291: \begin{quote}
1292: \begin{tabular}{llll}
1293: ${\tt RESTRC}(c,\langle S,{\it cf}\rangle)$ & $=$ &
1294: $ \langle [\![S']\!],{\it cf}\rangle$ &
1295: where $ S'=S_{/{D}} $.
1296: \end{tabular}
1297: \end{quote}
1298: %
1299: \vskip0.3cm
1300:
1301: Let $l$ be a literal occurring in the body of $c$,
1302: $D''=\{x_{i_1},\ldots,x_{i_{r}}\}$ be the set of
1303: variables occurring in $l$, and
1304: $D'''$ be equal to $\{x_{1},\ldots,x_{r}\}$.
1305:
1306: \vskip0.3cm
1307: \noindent
1308: {\bf Restriction before a Call}\ :\ \
1309: {\tt RESTRG}$(l,\cdot):{\it CPS}_{D''}
1310: \rightarrow{\it CPS}_{D'''} $\ \\
1311: This operation expresses a substitution $\theta$ on the
1312: parameters $x_{i_1},\ldots,x_{i_r}$ of a call $l$
1313: in terms of the formal parameters $x_1,\ldots,x_r$ of $l$. Let
1314: $\theta \!\in\!{\it CPS}_{D''}$.
1315: \begin{quote}
1316: \begin{tabular}{lll}
1317: ${\tt RESTRG}(l,\theta)$ & $=$ &
1318: $ [\![\{x_1/x_{i_1}\theta,\ldots,x_{r}/x_{i_{r}}\theta\}]\!] .$
1319: \end{tabular}
1320: \end{quote}
1321: %
1322:
1323:
1324: \vskip0.3cm
1325: \noindent
1326: {\bf Extension of the Result of a Call}\ :\ \
1327: {\tt EXTG}$(l,\cdot,\cdot): {\it CPS}_{D'}\times {\it CPSS}_{D'''}
1328: \not\rightarrow{\it CPSS}_{D'} $ \ \\
1329: \noindent
1330: This operation extends a substitution $\theta$
1331: with a substitution sequence $S$ repre\-senting the result of
1332: executing a call $l$ on $\theta$.
1333: Hence, it is only used in contexts where the substitutions that are elements
1334: of $S$ are (roughly speaking) instances of $\theta$.
1335: %We leave it undefined in the other cases.
1336: %Assume that $l$ is of the form $p'(x_{i_1},\ldots,x_{i_{r}}$
1337: %(unification literals are dealt with in the same way).
1338: Let
1339: $\theta\in {\it CPS}_{D'}$.
1340: Let $S\in {\it CPSS}_{D'''} $ be of the form
1341: $<\theta'\sigma_1,\ldots,\theta'\sigma_i,\_>$ where
1342: %$\theta'={\tt RESTRG}(l,\theta)$
1343: $x_j\theta'=x_{i_j}\theta$
1344: $(1\leq j\leq {r})$
1345: and the
1346: $\sigma_i$ are standard substitutions such that
1347: ${\it dom}(\sigma_i)\subseteq {\it codom}(\theta')$.
1348: %
1349: %\footnote{
1350: %Assuming that $S$ is of the form
1351: %$<\theta'\sigma'_1,\ldots,\theta'\sigma'_i,\_>$
1352: %where the $\sigma'_i$ are {\em arbitrary} standard substitutions,
1353: %the $\sigma_i$ exist and they are unique.}
1354: %
1355: %{\tt EXTG}$(l,\theta,S)$ is computed as follows.
1356: %
1357: %Let $\rho$ be a renaming such that
1358: %$x_j\theta'=x_{i_j}\theta\rho$
1359: %$(1\leq j\leq {n'})$.
1360: Let $\{z_1,\ldots,z_s\}={\it codom}(\theta)
1361: \setminus{\it codom}(\theta')$.
1362: Let $y_{i\mbox{\rm \tiny ,}1},\ldots,y_{i\mbox{\rm \tiny ,}s}$
1363: be distinct standard variables
1364: not belonging to ${\it codom}(\theta)\cup {\it codom}(\sigma_i)$
1365: $(1\leq i\leq {\it Ns}(S))$.
1366: Let $\rho_i$ be a renaming of the form $\{z_1/y_{i\mbox{\rm \tiny ,}1},\ldots,
1367: z_s/y_{i\mbox{\rm \tiny ,}s},
1368: y_{i\mbox{\rm \tiny ,}1}/z_1,\ldots,
1369: y_{i\mbox{\rm \tiny ,}s}/z_s\}$.
1370: \begin{quote}
1371: \begin{tabular}{lll}
1372: ${\tt EXTG}(l,\theta,S)$ & $=$ &
1373: $ [\![<\theta\rho_1\sigma_1,\ldots,\theta\rho_i\sigma_i,\_>]\!] .$
1374: \end{tabular}
1375: \end{quote}
1376: %
1377: %\vskip0.2cm
1378:
1379: It is easy to see that the value of ${\tt EXTG}(l,\theta,S)$ does not depend on the
1380: choice of the $y_{i\mbox{\rm \tiny ,}j}$.
1381: Moreover, it is not defined when $S$ is not
1382: of the above mentioned form.\\
1383:
1384: %
1385: \noindent
1386: {\bf Unification of Two Variables}\ :\ \
1387: {\tt UNIF-VAR}$: {\it CPS}_{\{x_1,x_2\}}
1388: \rightarrow {\it CPSS}_{\{x_1,x_2\}}$\ \\
1389: Let $\theta\!\in\!{\it CPS}_{\{x_1,x_2\}}$.
1390: This operation unifies $x_1\theta$ with $x_2\theta$.
1391: %
1392: \begin{quote}
1393: \begin{tabular}{llll}
1394: {\tt UNIF-VAR}$(\theta)$ & $ =$ & $<> $ &
1395: if $x_1\theta$ and $x_2\theta$ are not unifiable,\\
1396: %
1397: & $ =$ & $[\![<\theta\sigma>]\!]$ & where $\sigma\!\in\! {\it mgu}
1398: (x_1\theta,x_2\theta)$,
1399: otherwise.
1400: \end{tabular}
1401: \end{quote}
1402:
1403: \vskip0.3cm
1404: %
1405: \noindent
1406: {\bf Unification of a Variable and a Functor}\ :\ \
1407: {\tt UNIF-FUNC}$(f,\cdot): {\it CPS}_{D}
1408: \rightarrow {\it CPSS}_{D}$\ \\
1409: Given a functor $f$ of arity $n-1$ and a substitution
1410: $\theta\!\in\!{\it CPS}_{D}$ where $D=\{x_1,\ldots,x_n\}$, the
1411: {\tt UNIF-FUNC} operation unifies $x_1\theta$ with $f(x_2,\ldots,x_n)
1412: \theta$.
1413: %
1414: \begin{quote}
1415: %\hskip-1.2cm
1416: \begin{tabular}{llll}
1417: {\tt UNIF-FUNC}$(f,\theta)$ & $ =$ & $<> $ &
1418: if $x_1\theta$ and $f(x_2,\ldots,x_{n}\theta)$ are not unifiable,\\
1419: %
1420: & $ =$ & $[\![<\theta\sigma>]\!]$ & where $\sigma\!\in\! {\it mgu}
1421: (x_1\theta,f(x_2,\ldots,
1422: x_{n})\theta)$,
1423: otherwise.
1424: \end{tabular}
1425: \end{quote}
1426: \vskip0.3cm
1427:
1428: All operations above
1429: %defined in this section
1430: are monotonic and continuous.
1431: We assume that
1432: Sets of program substitutions are endowed with the %trivial
1433: ordering $\sqsubseteq$ such that $\theta\sqsubseteq\theta'$ iff
1434: $\theta=\theta'$.
1435:
1436: \subsection{Concrete Semantic Rules}
1437: \label{CSR}
1438:
1439: The concrete semantics of the underlying program $P$ is the least fixpoint
1440: of a conti\-nu\-ous transformation on {\it CB}
1441: (the set of concrete behaviors).
1442: This transformation is defined in terms of a set of semantic rules
1443: that naturally extend a concrete behavior to a continuous function
1444: defining the input/output behavior of every prefix of the body of a
1445: clause,
1446: every clause, every suffix
1447: %\footnote{as defined by the abstract syntax}
1448: of a procedure and every procedure of $P$.
1449: This function is called {\em extended concrete behavior}
1450: and maps each element of the extended concrete underlying domain
1451: to a substitution sequence, possibly with cut information, as defined below.
1452:
1453: \begin{definition} [Extended Concrete Underlying Domain]
1454: The {\em extended concrete underlying domain},
1455: denoted by {\it ECUD}, consists of
1456: \begin{enumerate}
1457: \itemsep 2pt
1458: \item
1459: all triples $\langle \theta, g, c \rangle $, where
1460: $c$ is a clause of $P$,
1461: $g$ is a prefix of the body of $c$,
1462: and $\theta$ is a canonical program substitution over the variables
1463: in the head of $c$;
1464: \item
1465: all pairs $\langle \theta, c \rangle $, where
1466: $c$ is a clause of $P$
1467: and $\theta$ is a canonical program substitution over the variables
1468: in the head of $c$;
1469: \item
1470: all pairs $\langle \theta, {\it pr} \rangle $, where
1471: ${\it pr}$ is a suffix of a procedure of $P$
1472: % or a suffix of a procedure of $P$
1473: and $\theta$ is a canonical program substitution over the variables
1474: in the head of the
1475: clauses of {\it pr}.
1476: \end{enumerate}
1477: \end{definition}
1478:
1479: \begin{definition} [Extended Concrete Behaviors]
1480: An {\em extended concrete behavior} is a total function from
1481: {\it ECUD} to the set ${\it CPSS}\cup({\it CPSS}\times{\it CF})$ such that
1482: \begin{enumerate}
1483: \itemsep 2pt
1484: \item
1485: every triple $\langle \theta, g, c \rangle $ from {\it ECUD}
1486: is mapped to a program substitution sequence with cut information $\langle S, {\it cf}\rangle $ such that
1487: ${\it dom}(S)$ is the set of all varia\-bles in $c$;
1488: \item every pair $\langle \theta, c \rangle $ from {\it ECUD}
1489: is mapped to a program substitution sequence with cut information $\langle S, {\it cf}\rangle $ such that
1490: ${\it dom}(S)$ is the set of variables in the head of $c$;
1491: \item every pair $\langle \theta, {\it pr} \rangle $ from {\it ECUD}
1492: is mapped to a program substitution sequence $S$ such that
1493: ${\it dom}(S)$ is the set of variables in the head of the
1494: clauses of {\it pr}.
1495: \end{enumerate}
1496:
1497: \end{definition}
1498:
1499:
1500: The set of extended concrete behaviors is endowed with a structure of
1501: pointed cpo in the obvious way. It is denoted by {\it ECB}; its
1502: elements are denoted by $\longmapsto$.
1503:
1504:
1505:
1506: Let $\longmapsto$ be a concrete behavior. The concrete semantic rules
1507: depicted in Figu\-re~\ref{CSRF} define an extended concrete behavior
1508: derived from $\longmapsto$. This extended concrete behavior is
1509: denoted by the same symbol $\longmapsto$. This does not lead to
1510: confusion since the inputs of the two functions belong to different
1511: sets. The definition proceeds by induction on the syntactic structure
1512: of $P$.
1513:
1514:
1515: %\noindent
1516: The concrete semantic rules model Prolog operational semantics
1517: through the notion of program substitution sequence.
1518: Rule {\bf R1} defines the program substitution sequence with
1519: cut information at the entry point of a clause.
1520: Rules {\bf R2} and {\bf R3} define the effect of the execution of a cut
1521: at the clause level. Rules {\bf R4},
1522: {\bf R5} and {\bf R6} deal with execution of literals;
1523: procedure calls are solved by using the concrete behavior
1524: $\longmapsto$ as an oracle.
1525: Rule {\bf R7} defines the result of a clause.
1526: Rules {\bf R8} and {\bf R9} define the result of a procedure
1527: by structural induction on its suffixes.
1528: Rule {\bf R8} deals
1529: with the suffix consisting of the last clause only: it simply
1530: forgets the cut information, which is not meaningful at the procedure
1531: level.
1532: Rule {\bf R9} combines the result of a clause with the
1533: (combined) result of the next clauses in the same procedure:
1534: it deals with the execution of a cut at the procedure level.
1535: The expression ${\Box}^{{\it Ne}(S)}_{k=1} S_k$ used in
1536: Rules {\bf R4},
1537: {\bf R5} and {\bf R6} deserves an
1538: explanation: when the sequence $S$ is incomplete, it is assumed that
1539: $S_{_{{\it Ne}(S)}}=<\bot>$. This convention
1540: is necessary to propagate the non-termination of $ g'$
1541: to $g$.
1542:
1543:
1544:
1545: %\noindent
1546:
1547: The following results are instrumental for proving the well-definedness
1548: of the concrete semantics.
1549: %
1550:
1551: \begin{proposition}[Properties of the Concrete Semantic Rules]
1552: %\label{PCR}
1553: %\vskip0.2cm
1554: \begin{enumerate}
1555: \itemsep 2pt
1556: \item Given a concrete behavior, the concrete semantic rules define
1557: a unique extended concrete behavior, i.e.,
1558: a unique mapping from
1559: {\it CB} to {\it ECB}. %The mapping is denoted by overloading
1560: %the symbols denoting corresponding elements from
1561: %{\it CB} and {\it ECB}.
1562: This mapping is continuous.
1563: \item Rules {\bf R1} to {\bf R6} have a conclusion of the form
1564: $\langle \theta, g, c \rangle \longmapsto
1565: \langle S, {\it cf}\rangle $.
1566: In all cases, $S$ is of the
1567: form $<\theta'\sigma_1,\dots,\theta'\sigma_i,\_>$,
1568: where the $\sigma_i$ are standard substitutions and
1569: $\langle\theta',{\it nocut}\rangle={\tt EXTC}(c,\theta)$.
1570: \\ Rules {\bf R7} to {\bf R9} have a conclusion
1571: of the form $\langle \theta, \cdot\rangle \longmapsto S$.
1572: In all cases, $S$ is of the form
1573: \mbox{$<\theta\sigma_1,\dots,\theta\sigma_i,\_>$},
1574: where the $\sigma_i$ are standard substitutions.
1575: \end{enumerate}
1576: \end{proposition}
1577:
1578:
1579:
1580:
1581: \subsection{Concrete Semantics}
1582: \label{TCS}
1583:
1584:
1585: The concrete semantics of the underlying program $P$ is
1586: defined as the least fixpoint of the following concrete transformation.
1587:
1588: \begin{definition}[Concrete Transformation]
1589: \label{CT}
1590:
1591: \noindent
1592: The transformation
1593: %$\stackrel{{\it TCB}}\cdot : {\it CB}\rightarrow{\it CB}$
1594: ${\it TCB} : {\it CB}\rightarrow{\it CB}$
1595: is defined
1596: as follows: for all $\longmapsto \in {\it CB}$,
1597: %
1598: \vskip0.2cm
1599: \begin{center}
1600: \begin{tabular}{rcl}
1601: & {\it pr} is a procedure of {\it P} & \\
1602: & $p$ is the name of {\it pr} &\\
1603: & $\langle \theta, {\it pr} \rangle
1604: \longmapsto S$ & \\
1605: {\bf T1} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}
1606: & \\
1607: & $\langle \theta, p \rangle
1608: \stackrel{{\it TCB}}\longmapsto
1609: S$ \\
1610: \end{tabular}
1611: \end{center}
1612: where $\stackrel{{\it TCB}}\longmapsto $ stands for ${\it TCB}(\longmapsto)$.
1613: Remember that
1614: $\langle \theta, {\it pr} \rangle
1615: \longmapsto S$ is defined by means of the previous rules
1616: which use the concrete behavior $\longmapsto$ as an oracle to solve the procedure calls.
1617: \end{definition}
1618: %
1619:
1620:
1621:
1622: %
1623: {\begin{figure}[top]
1624: \figrule
1625: \begin{it}
1626:
1627: \begin{center}
1628: \begin{tabular}{rc}
1629: \\
1630: & g\ ::=\ $<>$ \\
1631: {\bf R1} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}
1632: \\
1633: & $\langle \theta, g, c \rangle
1634: \longmapsto
1635: {\tt EXTC}(c,\theta)$ \\ \\
1636: \end{tabular}
1637: \end{center}
1638: %
1639: %
1640: %
1641: \hspace{-7cm}
1642: \begin{tabular}{cc}
1643: %
1644: %
1645: %
1646: \begin{tabular}{lc}
1647: & g\ ::=\ g' , ! \\
1648: & $\langle \theta, g',c \rangle
1649: \longmapsto
1650: \langle S, {\it cf} \rangle$ \\
1651: & $S\in\{<\bot>,<>\}$ \\
1652: {\bf R2} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}
1653: \\
1654: & $\langle \theta, g, c \rangle
1655: \longmapsto
1656: \langle S, {\it cf} \rangle$ \\
1657: \end{tabular}
1658: %
1659: &
1660: %
1661: \hspace{-6cm}
1662: \begin{tabular}{lc}
1663: & g\ ::=\ g' , ! \\
1664: & $\langle \theta, g',c \rangle
1665: \longmapsto
1666: \langle S, {\it cf} \rangle$ \\
1667: & $S = <\theta'>::S'$ \\
1668: {\bf R3} &\raisebox{.7ex}{\underline{\hspace*{3.7cm}}}
1669: \\
1670: & $\langle \theta, g, c \rangle
1671: \longmapsto
1672: \langle <\theta'>, {\it cut} \rangle$ \\
1673: \end{tabular}
1674: %
1675: \\ \\ \\
1676: %
1677: \begin{tabular}{lc}
1678: & g\ ::=\ g' , l \\
1679: & l\ ::=\ $x_i$=$x_j$ \\
1680:
1681: & $\langle \theta, g',c \rangle
1682: \longmapsto
1683: \langle S, {\it cf} \rangle$ \\
1684: & $S = <\theta_1,\dots,\theta_i,\_>$ \\
1685:
1686: & $\left\{\begin{array}{c}
1687: \theta'_k = {\tt RESTRG}(l,\theta_k) \\
1688: S_{k}^{' }=\mbox{\tt UNIF-VAR}(\theta'_k) \\
1689: S_k = {\tt EXTG}(l,\theta_k,S_{k}^{' }) \\
1690: (1\leq k\leq {\it Ns}(S)) \end{array}\right\}$ \\
1691:
1692: {\bf R4} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}
1693: \\
1694: & $\langle\theta, g, c \rangle
1695: \longmapsto
1696: \langle {\Box}^{{\it Ne}(S)}_{k=1} S_k,
1697: {\it cf} \rangle$ \\
1698: \end{tabular}
1699: %
1700: &
1701: %
1702: \hspace{-6cm}
1703: \begin{tabular}{lc}
1704: & g\ ::=\ g' , l \\
1705: & l\ ::=\
1706: $x_{i_1}$=f($x_{i_2}$, \ldots, $x_{i_n}$) \\
1707:
1708: & $\langle \theta, g',c \rangle
1709: \longmapsto
1710: \langle S, {\it cf} \rangle$ \\
1711: & $S = <\theta_1,\dots,\theta_i,\_>$ \\
1712:
1713: & $\left\{\begin{array}{c}
1714: \theta'_k = {\tt RESTRG}(l,\theta_k) \\
1715: S_{k}^{' }=\mbox{\tt UNIF-FUNC}(f,\theta'_k) \\
1716: S_k = {\tt EXTG}(l,\theta_k,S_{k}^{' }) \\
1717: (1\leq k\leq {\it Ns}(S) ) \end{array}\right\}$ \\
1718:
1719: {\bf R5} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}
1720: \\
1721: & $\langle\theta, g, c \rangle
1722: \longmapsto
1723: \langle {\Box}^{{\it Ne}(S)}_{k=1} S_k,
1724: {\it cf} \rangle$ \\
1725: \end{tabular}
1726: %
1727: \\ \\ \\
1728: %
1729: %
1730: \begin{tabular}{lc}
1731: & g\ ::=\ g' , l \\
1732: & l\ ::=\
1733: p($x_{i_1}$, \ldots, $x_{i_n}$) \\
1734:
1735: & $\langle \theta, g',c \rangle
1736: \longmapsto
1737: \langle S, {\it cf} \rangle$ \\
1738: & $S = <\theta_1,\dots,\theta_i,\_>$ \\
1739:
1740: & $\left\{\begin{array}{c}
1741: \theta'_k = {\tt RESTRG}(l,\theta_k) \\
1742: \langle \theta'_k ,p\rangle
1743: \longmapsto S_{k}^{'} \\
1744: S_k = {\tt EXTG}(l,\theta_k,S_{k}^{' }) \\
1745: (1\leq k\leq {\it Ns}(S)) \end{array}\right\}$ \\
1746:
1747: {\bf R6} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}
1748: \\
1749: & $\langle\theta, g, c \rangle
1750: \longmapsto
1751: \langle {\Box}^{{\it Ne}(S)}_{k=1} S_k,
1752: {\it cf} \rangle$ \\
1753: \end{tabular}
1754: %
1755: &
1756: %
1757: \hspace{-6cm}
1758: \begin{tabular}{lc}
1759: \\
1760: \\
1761: \\
1762: \\
1763: \\
1764: \\
1765: & c \ ::=\ h :- g. \\
1766: & $\langle \theta, g ,c \rangle
1767: \longmapsto
1768: \langle S, {\it cf} \rangle$ \\
1769: {\bf R7} &\raisebox{.7ex}{\underline{\hspace*{3.7cm}}}
1770: \\
1771: & $\langle \theta, c \rangle
1772: \longmapsto
1773: {\tt RESTRC}(c, \langle S, {\it cf} \rangle)$ \\
1774: \end{tabular}
1775: %
1776: \\ \\ \\
1777: %
1778: %
1779: \begin{tabular}{lc}
1780: \\
1781: & pr \ ::=\ c \\
1782: & $\langle \theta, c \rangle
1783: \longmapsto
1784: \langle S, {\it cf} \rangle$ \\
1785: {\bf R8} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}
1786: \\
1787: & $\langle \theta, {\it pr} \rangle
1788: \longmapsto S$ \\
1789: \end{tabular}
1790: %
1791: &
1792: %
1793: \hspace{-6cm}
1794: \begin{tabular}{lc}
1795: & pr \ ::=\ c {\it pr}' \\
1796: & $\langle \theta, c \rangle
1797: \longmapsto
1798: \langle S, {\it cf} \rangle$ \\
1799: & $\langle \theta, {\it pr}' \rangle
1800: \longmapsto S'$ \\
1801: {\bf R9} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}
1802: \\
1803: & $\langle \theta, {\it pr} \rangle
1804: \longmapsto
1805: \langle S, {\it cf} \rangle \Box S'$ \\
1806: \end{tabular}\\ \\
1807:
1808:
1809: \end{tabular}
1810: \end{it}
1811: \caption{Concrete semantic rules}
1812: \label{CSRF}
1813: \end{figure}}
1814:
1815:
1816: The transformation ${\it TCB}$
1817: %$\stackrel{{\it TCB}}\cdot$
1818: is well-defined and continuous.
1819:
1820: \begin{definition}[Concrete Semantics]
1821: \label{DCS}
1822:
1823: \noindent The concrete semantics of the underlying program $P$ is the least
1824: concrete beha\-vior $\longmapsto$ such that
1825: $$ \longmapsto\ =\ \stackrel{{\it TCB}}\longmapsto.$$
1826: \end{definition}
1827: %
1828: %
1829: %
1830:
1831:
1832:
1833: \subsection{Correctness of the Concrete Semantics}
1834: \label{CCS}
1835:
1836: Since OLD-resolution \cite{Lloyd,Tamaki86} is the standard semantics of
1837: pure Prolog augmented with cut, our concrete
1838: semantics and OLD-resolution have to be proven equivalent.
1839: The proof is fairly complex because OLD-resolution is not compositional.
1840: Consequently, the two semantics do not naturally match.
1841: The equivalence proof is given in \cite{Cut_rep95}. In this section, we
1842: only give the principle of the proof.
1843: \begin{enumerate}
1844: \itemsep 2pt
1845: \item We assume that OLD-resolution uses standard variables to
1846: rename clauses apart.
1847: The initial queries are also assumed to contain
1848: standard variables only.
1849: %
1850: \item The notion of {\em incomplete OLD-tree limited to depth $k$}
1851: is defined ({\it IOLD}$_k$-tree, for short).
1852: Intuitively, an {\it IOLD}$_k$-tree is an OLD-tree modified according
1853: to the following rules:
1854: \begin{enumerate}
1855: \itemsep 2pt
1856: \item procedure calls may be unfolded only down to depth $k$;
1857: \item branches that end at a node whose leftmost literal may not
1858: be unfolded are called {\em incomplete};
1859: \item a depth-first left-to-right traversal of the tree is performed in
1860: order to determine the cuts that are reached by the standard
1861: execution and to prune the tree accordingly; see \cite{Lloyd};
1862: \item the traversal ends when the whole tree has been visited or
1863: when a node that may not be unfolded is reached;
1864: \item the branches on the right of the left-most incomplete branch
1865: are pruned (if such a branch exists).
1866: \end{enumerate}
1867: \item Assuming a query of the form $p(t_1,\dots,t_n)$ and denoting
1868: the concrete beha\-vior
1869: %$\stackrel{_{_{{\it TCB}^k}}}{\longmapsto_{\bot}}$
1870: ${\it TCB}^k({\longmapsto_{\bot}})$
1871: by $\longmapsto_k$, it can be shown that the sequence of computed
1872: answer substitutions $<\sigma_1,\dots,\sigma_i,\_>$ for the
1873: {\it IOLD}$_k$-tree of $p(t_1,\dots,t_n)$ is such that
1874: $\langle \theta, p\ \rangle \longmapsto_k
1875: [\![ <\theta\sigma_1,\dots,\theta\sigma_i,\_>]\!]$
1876: where $\theta=\{x_1/t_1,\dots,x_n/t_n\}$.
1877: \item The equivalence of our concrete semantics and OLD-resolution
1878: is a simple consequence of the previous result.\\ For every query
1879: $p(t_1,\dots,t_n)$, $<\sigma_1,\dots,\sigma_i,\_\!>$
1880: is the sequence of computed answer substitutions of $p(t_1,\dots,t_n)$
1881: according to OLD-resolution if and only if
1882: $\langle \theta, p\ \rangle \longmapsto
1883: [\![ <\theta\sigma_1,\dots,\theta\sigma_i,\_>]\!]$
1884: where $\theta=\{x_1/t_1,\dots,x_n/t_n\}$ and
1885: $\longmapsto$ is the concrete behavior of the program
1886: according to our concrete semantics.
1887: %Assuming the same
1888: % notation as above, the relation
1889: % $\langle \theta, p\ \rangle \longmapsto
1890: % [\![ <\theta\sigma_1,\dots,\theta\sigma_i,\_>]\!]$
1891: % holds, where $<\sigma_1,\dots,\sigma_i,\_>$ is the sequence of computed
1892: % answer substitutions for the query $p(t_1,\dots,t_n)$
1893: % (according to OLD-resolution)
1894: % and $\longmapsto$ is the concrete behavior of the program
1895: % (according to our concrete semantics).
1896: \end{enumerate}
1897:
1898:
1899: In fact, the correctness of our concrete semantics should be
1900: close to obvious to anyone who knows about both Prolog and denotational
1901: semantics.
1902: %Moreover, the semantics of the cut is naturally
1903: % expressed in our formalization.
1904: So, the equivalence proof
1905: is
1906: a formal technical exercise, which adds little to our basic
1907: understan\-ding of the concrete semantics.
1908:
1909: \subsection{Related Works}
1910: \label{RW_CS}
1911:
1912: Denotational semantics for Prolog have been proposed before
1913: \cite{debruin,Debray88b,JonesMy}. Our concrete semantics is not
1914: intended to improve on these works from the language understanding
1915: standpoint. Instead, it is merely designed as a basis for an abstract
1916: interpretation framework; in particular, it uses concrete operations
1917: that are as close as possible to the operations used by the structural
1918: operational semantics presented in \cite{ACTA95} upon which
1919: our previous frameworks
1920: are based. This allows us to reuse
1921: much of the material from our existing abstract domains and generic
1922: algorithms; see, \cite{SPE,ICLP91AI,TOPLAS,ACTA95}.
1923: The idea of distinguishing between
1924: finite, incomplete, and infinite sequences is originally due to
1925: Baudinet \shortcite{Baudinet}.
1926: %
1927: %
1928:
1929:
1930: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1931: \section{Abstract semantics}
1932: \label{sec:AS}
1933: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1934:
1935: %We are now in position to define the abstract semantics.
1936: As we have already
1937: explained in the introduction, our abstract semantics is not
1938: defined as a
1939: least fixpoint of an abstract transformation
1940: but instead as a set of post-fixpoints
1941: that fulfill a safety requirement, namely
1942: pre-consistency. Moreover, the abstract domains are assumed to
1943: represent so-called
1944: chain-closed sets of concrete elements as specified below.
1945:
1946: %This section is organized as follows: Section \ref{sub:AD} describes the abstract domains.
1947: %Section \ref{sub:AO} specifies the abstract operations.
1948: % The abstract semantics is defined in Section \ref{sub:ASem}
1949: %and its safety properties are stated and proven in Section \ref{sub:SAS}. Finally, in
1950: %Section \ref{sub:D} our approach is discussed and compared with other ones.
1951:
1952:
1953:
1954: \subsection{Abstract Domains}
1955: \label{sub:AD}
1956:
1957:
1958:
1959: %In this section, we describe the abstract domains that are used by our abstract
1960: %interpretation framework. In particular
1961: We state here
1962: the mathematical assumptions that are required to be satisfied by the
1963: abstract domains.
1964: Specific abstract domains will be described
1965: in Section \ref{sec:FASAS}.\\
1966:
1967: \noindent
1968: {\bf Abstract Substitutions.}
1969: %\label{subsub:ASub}
1970: For every finite set $D$ of program variables, we denote by
1971: ${\it CS}_D$ the set
1972: %of program substitutions
1973: $\wp({\it PS}_D)$.
1974: %{ Abstract substitutions} denote sets of program substitutions,
1975: %i.e., elements of ${\it CS}_D$, as follows.
1976: %
1977: %\begin{definition}%
1978: %\label{def:ASu}%
1979: %[Abstract Substitutions]
1980: %
1981: \noindent A domain of abstract substitutions is a family of sets ${\it AS}_D$
1982: indexed by the finite sets $D$ of program variables.
1983: Elements of ${\it AS}_D$ are called {\em abstract substitutions};
1984: they are denoted by $\beta$.
1985: Each set ${\it AS}_D$ is endowed with a partial order $\leq$
1986: and a monotonic {\em concretization} function
1987: ${\it Cc}:{\it AS}_D\rightarrow {\it CS}_D$
1988: associating to each abstract substitution $\beta$ the set
1989: ${\it Cc}(\beta)$ of program
1990: substitutions it denotes.\\
1991: %\end{definition}
1992:
1993: \noindent
1994: {\bf Abstract Sequences.}
1995: %\label{subsub:ASeq}
1996: For every finite set $D$ of program variables, we denote by
1997: ${\it CSS}_D$ the set $\wp({\it PSS}_D)$.
1998: Abstract sequences denote chain-closed subsets of ${\it CSS}_D$.\\
1999: %
2000: %
2001: %\begin{definition}%
2002: %\label{def:ASeq}%
2003: %[Abstract Sequences]
2004: %
2005: \noindent A domain of abstract sequences is a family of sets ${\it ASS}_D$
2006: indexed by the finite sets $D$ of program variables.
2007: Elements of ${\it ASS}_D$ are called {\em abstract sequences};
2008: they are denoted by $B$.
2009: Each set ${\it ASS}_D$ is endowed with a partial order $\leq$
2010: and a monotonic {\em concretization} function
2011: ${\it Cc}:{\it ASS}_D\rightarrow {\it CSS}_D$.
2012: Moreover, the following properties are required to be satisfied: (1)
2013: %\begin{enumerate}
2014: %\itemsep 0pt
2015: %\item
2016: every ${\it ASS}_D$ contains an abstract sequence $B_{\bot}$
2017: such that $<\bot>\,\in {\it Cc}(B_{\bot})$;
2018: (2)
2019: %\item
2020: for every $B\in{\it ASS}_D$, ${\it Cc}(B)$ is {\em chain-closed}, i.e.,
2021: for every chain $(S_i)_{i\in {\bf N}}$ of elements of ${\it Cc}(B)$,
2022: the limit $\sqcup_{i=0}^{\infty}S_i$ also belongs to ${\it Cc}(B)$.
2023: %\end{enumerate}
2024: The disjoint union of all the ${\it ASS}_D$ is denoted by ${\it ASS}$.\\
2025: %${\it ASS}$ is assumed to be endowed with the sum ordering
2026: %\cite{Schmidt88}.
2027: %\end{definition}
2028:
2029: \noindent
2030: {\bf Abstract Sequences with Cut Information.}
2031: %\label{ASCI}
2032: Let ${\it CSSC}_D$ denote $\wp({\it PSS}_D\times {\it CF})$.
2033: %For every finite set $D$ of program variables, we denote by ${\it CSSC}_D$
2034: %the set $\wp({\it PSS}_D\times {\it CF})$.
2035: %{ Abstract sequences with cut information} denote sets of
2036: %program substitution sequences with cut information,
2037: %i.e., elements of ${\it CSSC}_D$, as defined below.
2038: %
2039: %\begin{definition}%
2040: %\label{def:ASeqCI}%
2041: %[Abstract Sequences with Cut Information]
2042: %
2043: \noindent A domain of abstract sequences
2044: with cut information is a family of sets ${\it ASSC}_D$
2045: indexed by the finite sets $D$ of program variables.
2046: Elements of ${\it ASSC}_D$ are called
2047: {\em abstract sequences with cut information};
2048: they are denoted by $C$.
2049: Every set ${\it ASSC}_D$ is endowed with a partial order $\leq$
2050: and a monotonic {\em concretization} function
2051: ${\it Cc}:{\it ASSC}_D\rightarrow {\it CSSC}_D$.
2052: The disjoint union of all the ${\it ASSC}_D$ is denoted by ${\it ASSC}$.\\
2053: %and it is assumed to be endowed with the sum ordering.
2054: %
2055: %\end{definition}
2056:
2057: \noindent
2058: {\bf Abstract Behaviors.}
2059: %\label{AB}
2060: Abstract behaviors are the abstract counterpart of the concrete
2061: behaviors introduced in Section \ref{CB}. They are endowed with a
2062: weaker mathe\-matical structure as described below.
2063: As in the case of concrete behaviors, a fixed underlying program $P$ is
2064: assumed.
2065:
2066: %
2067: \begin{definition} [Abstract Underlying Domain]
2068: \label{AAUD}
2069: The {\em abstract underlying domain},
2070: denoted by {\it AUD},
2071: is the set of all pairs $\langle \beta,p \rangle$ such that
2072: $p$ is a procedure name
2073: in $P$ of arity $n$ and
2074: $\beta\!\in\!{\it AS}_{\{x_1,\dots,x_n\}}$.
2075: %where $n$ is the arity of $p$.
2076: \end{definition}
2077: %
2078: %
2079: \begin{definition}[Abstract Behaviors]
2080: \label{def:A}
2081: An {\em abstract behavior}
2082: is a total function ${\it sat}: {\it AUD}\longrightarrow {\it ASS}$
2083: mapping each pair $\langle \beta,p \rangle\!\in\!{\it AUD}$ to an
2084: abstract sequence $B$ with $B\in{\it ASS}_{\{x_1,\dots,x_n\}} $,
2085: where $n$ is the arity of $p$.
2086: The set of all abstract behaviors is denoted by
2087: {\it AB}.
2088: The set {\it AB} is endowed with the partial ordering $\leq$ such that,
2089: for all ${\it sat}_1,{\it sat}_2\in{\it AB}$:
2090: %
2091: $$
2092: \begin{array}{lclr}
2093: {\it sat}_1 \leq {\it sat}_2
2094: & \;
2095: \mbox{\rm iff} \;
2096: &
2097: {\it sat}_1\langle \beta,p \rangle \leq
2098: {\it sat}_2\langle \beta,p \rangle, &\;
2099: \forall \langle \beta,p \rangle \in {\it AUD}.
2100: \end{array}
2101: $$
2102: \end{definition}
2103:
2104: It would be reasonable to assume that abstract behaviors
2105: are monotonic functions but this is not necessary for the safety
2106: results. The notation {\it sat} stands for ``set of
2107: abstract tuples''. It is used because the abstract interpretation
2108: algorithm, derived from the abstract semantics, actually computes a set
2109: of tuples of the form $\langle \beta,p ,B\rangle $, i.e., a part of the
2110: table of an abstract behavior.
2111:
2112:
2113: \subsection{Abstract Operations}
2114: \label{sub:AO}
2115:
2116: In this section, we give the specification of the primitive
2117: abstract operations
2118: used by the abstract semantics.
2119: The specifications are safety assumptions which, roughly speaking,
2120: state that the abstract operations safely simulate the corresponding concrete
2121: ones.
2122: %To ease the presentation, we classify abstract operations in three
2123: %sets.
2124: In particular,
2125: operations {\tt EXTC}, {\tt RESTRG}, {\tt RESTRC},
2126: {\tt UNIF-VAR}, {\tt UNIF-FUNC} are faithful abstract counterparts of
2127: the corresponding concrete operations. Hence,
2128: their specification simply states
2129: that, if some concrete input belongs to the concretization of their
2130: (abstract) input, then the corresponding concrete output belongs to
2131: the concretization
2132: of their (abstract) output. Moreover, overloading the opera\-tion
2133: names is natural in these cases.
2134: Operation {\tt AI-CUT}
2135: deals with the cut; its specification is also straightforward.
2136: Operations {\tt EXTGS} and {\tt CONC} are related to the concrete
2137: operations {\tt EXTG} and $\Box$ in a more involved way. We will discuss them in more detail.
2138: Finally, operations {\tt SUBST} and {\tt SEQ} are simple conversion
2139: operations to convert an abstract domain into another.
2140:
2141:
2142: Let us specify the operations, using the notations of
2143: Section \ref{CO}.
2144: \\
2145: \\
2146: \noindent
2147: {\bf Extension at Clause Entry}\ :\ \
2148: {\tt EXTC}$(c,\cdot):{\it AS}_D
2149: \rightarrow{\it ASSC}_{D'} $\ \\
2150: %
2151: Let $\beta \in {\it AS}_{D}$ and $\theta \in {\it CPS}_{D}$.
2152: The following property is required to hold.
2153: %
2154: \begin{quote}$
2155: \begin{array}{lll}
2156: \theta\in{\it Cc}(\beta)
2157: &
2158: \Rightarrow
2159: &
2160: {\tt EXTC}(c,\theta)\in {\it Cc}({\tt EXTC}(c,\beta)).
2161: \end{array}$
2162: \end{quote}
2163: %
2164: %
2165: \vskip0.3cm
2166: \noindent
2167: {\bf Restriction at Clause Exit}\ :\ \
2168: {\tt RESTRC}$(c,\cdot):{\it ASSC}_{D'}
2169: \rightarrow{\it ASSC}_D $\ \\
2170: Let $C\in {\it ASSC}_{D'}$ and $\langle S, {\it cf}\rangle
2171: \in({\it CPSS}_D'\times {\it CF})$.
2172: %
2173: \begin{quote}$
2174: \begin{array}{lll}
2175: \langle S, {\it cf}\rangle \in{\it Cc}(C)
2176: &
2177: \Rightarrow
2178: &
2179: {\tt RESTRC}(c,\langle S, {\it cf}\rangle)\in
2180: {\it Cc}({\tt RESTRC}(c,C)).
2181: \end{array}$
2182: \end{quote}
2183: %
2184: %
2185: \vskip0.3cm
2186: \noindent
2187: {\bf Restriction before a Call}\ :\ \
2188: {\tt RESTRG}$(l,\cdot):{\it AS}_{D''}
2189: \rightarrow{\it AS}_{D'''} $\ \\
2190: Let $\beta \in{\it AS}_{D''}$ and $\theta \in{\it CPS}_{D''}$.
2191: %
2192: \begin{quote}$
2193: \begin{array}{lll}
2194: \theta \in{\it Cc}(\beta)
2195: &
2196: \Rightarrow
2197: &
2198: {\tt RESTRG}(l,\theta)\in
2199: {\it Cc}({\tt RESTRG}(l,\beta)).
2200: \end{array}$
2201: \end{quote}
2202: %
2203: \vskip0.3cm
2204: \noindent
2205: {\bf Unification of Two Variables}\ :\ \
2206: {\tt UNIF-VAR}$: {\it AS}_{\{x_1,x_2\}}
2207: \rightarrow {\it ASS}_{\{x_1,x_2\}}$\ \\
2208: Let $\beta \in {\it AS}_{\{x_1,x_2\}}$ and
2209: $\theta\in{\it CPS}_{\{x_1,x_2\}}$.
2210: %
2211: \begin{quote}$
2212: \begin{array}{lll}
2213: \theta \in{\it Cc}(\beta)
2214: &
2215: \Rightarrow
2216: &
2217: \mbox{\tt UNIF-VAR}(\theta)\in
2218: {\it Cc}(\mbox{\tt UNIF-VAR}(\beta)).
2219: \end{array}$
2220: \end{quote}
2221: %
2222: \vskip0.3cm
2223: \noindent
2224: {\bf Unification of a Variable and a Functor}\ :\ \
2225: {\tt UNIF-FUNC}$(f,\cdot): {\it AS}_{D}
2226: \rightarrow {\it ASS}_{{D}}$\ \\
2227: Let $\beta \in {\it AS}_{D}$
2228: and $\theta \in {\it CPS}_{D}$. Let also $f$ be a functor
2229: of arity $n-1$.
2230: %
2231: %
2232: \begin{quote}$
2233: \begin{array}{lll}
2234: \theta \in{\it Cc}(\beta)
2235: &
2236: \Rightarrow
2237: &
2238: \mbox{\tt UNIF-FUNC}(f,\theta)\in
2239: {\it Cc}(\mbox{\tt UNIF-FUNC}(f,\beta)).
2240: \end{array}$
2241: \end{quote}
2242: %
2243: \vskip0.3cm
2244: \noindent
2245: {\bf Abstract Interpretation of the Cut}\ :\ \
2246: {\tt AI-CUT}$: {\it ASSC}_{{D'}}
2247: \rightarrow {\it ASSC}_{{D'}}$\ \\
2248: Let $C \in {\it ASSC}_{D'}$,
2249: $\theta \in {\it CPS}_{D'}$,
2250: $S \in {\it CPSS}_{D'}$,
2251: ${\it cf} \in {\it CF}$.
2252: %
2253: \begin{quote}$
2254: \begin{array}{rcl}
2255: \langle <>,{\it cf}\rangle \in {\it Cc}(C) & \Rightarrow &
2256: \langle <>,{\it cf}\rangle \in {\it Cc}(\mbox{\tt AI-CUT}(C)), \\
2257: \langle <\bot>,{\it cf}\rangle \in {\it Cc}(C) & \Rightarrow &
2258: \langle <\bot>,{\it cf}\rangle \in {\it Cc}(\mbox{\tt AI-CUT}(C)),\\
2259: \langle <\theta>::S ,{\it cf}\rangle \in {\it Cc}(C) & \Rightarrow &
2260: \langle <\theta>,{\it cut}\rangle \in {\it Cc}(\mbox{\tt AI-CUT}(C)).
2261: \end{array}$
2262: \end{quote}
2263: %
2264: \vskip0.3cm
2265: \noindent
2266: {\bf Extension of the Result of a Call}\ :\ \
2267: {\tt EXTGS}$(l,\cdot,\cdot):{\it ASSC}_{D'}\times {\it ASS}_{D'''}
2268: \rightarrow{\it ASSC}_{D'} $\ \\
2269: The specification of this operation is more complex because it
2270: abstracts in a single operation the calculation of all sequences
2271: $S_k = {\tt EXTG}(l,\theta_k,S'_{k})$ and of their concatenation
2272: ${\Box}^{{\it Ne}(S)}_{k=1} S_k$, performed by the
2273: rules {\bf R4}, {\bf R5}, {\bf R6}
2274: (see Figure \ref{CSRF}).
2275: At the abstract level, it may be too expensive or even impossible
2276: to simulate the execution of $l$ for all elements of $S$,
2277: as defined in the rules. Therefore,
2278: we abstract $S$ to its substitutions,
2279: losing the ordering. The abstract execution will be the following.
2280: Assuming that $C$ abstracts the
2281: program substitution sequence with cut information
2282: $\langle S,{\it cf}\rangle$ before $l$, we compute $\beta={\tt SUBST}(C)$;
2283: then we compute $\beta'={\tt RESTRG}(l,\beta)$ and, subsequently,
2284: we get the abstract sequence $B$ resulting from the abstract exe\-cution of
2285: $l$ with
2286: input $\beta'$.
2287: %applied to the procedure corresponding to $l$.
2288: The set ${\it Cc}(B)$ contains
2289: all sequences $S'_k$ of rules {\bf R4}, {\bf R5}, {\bf R6}. Then,
2290: an over approximation of the set of all possible values
2291: ${\Box}^{{\it Ne}(S)}_{k=1} S_k$ is computed from the information provided
2292: by $C$
2293: and $B$.
2294: This is realized by the following operation {\tt EXTGS}.
2295: Let $C\in {\it ASSC}_{D'}$, $B\in {\it ASS}_{D'''}$,
2296: $\langle S,{\it cf}\rangle\in ({\it CPSS}_{D'}\times {\it CF})$ and
2297: $S'_1,\dots,S'_{{\it Ns}(S)}\in{\it CPSS}_{D'''}$.
2298: %
2299: \begin{quote}$
2300: \hskip-0.8cm
2301: \begin{array}{rcl}
2302: \left.
2303: \begin{array}{c}
2304: \langle S,{\it cf}\rangle \in {\it Cc}(C),\\
2305: S=<\theta_1,\dots,\theta_i,\_>,\\
2306: % \left\{
2307: % \begin{array}{c}
2308: \left( \begin{array}{c}
2309: \forall k:1\leq k\leq {\it Ns}(S):
2310: S'_k\in{\it Cc}(B)\\
2311: \mbox{ and } S_k={\tt EXTG}(l,\theta_k,S'_k)
2312: \end{array}
2313: \right)
2314: % \end{array}
2315: % \right\}
2316: \end{array}
2317: \right\}
2318: &\Rightarrow&
2319: \langle \Box_{k=1}^{{\it Ne}(S)}S_k ,{\it cf}\rangle \in
2320: {\it Cc}(\mbox{\tt EXTGS}(l,C,B)).
2321: \end{array}$
2322: \end{quote}
2323: %
2324: %Remember that the notation $\Box_{k=1}^{{\it Ne}(S)}S_k$ assumes that
2325: %$S_{{\it Ne}(S)}=<\bot>$ when $S$ is incomplete;
2326: %moreover, the equality $S_k={\tt EXTG}(l,\theta_k,S'_k)$
2327: %implies that all substitutions of
2328: %$S'_k$ are instances of $\theta_k$ in the sense made precise in the
2329: %definition of {\tt EXTG}.
2330: % (see Section \ref{CO}).
2331: % The latter remark suggests that, provided the abstract
2332: %domain is expressive enough, operation {\tt EXTGS} can recover
2333: %(part of) the information lost by executing operation {\tt SUBST}$(C)$,
2334: %by matching the sequences in ${\it Cc}(B)$ with the
2335: %elements of the sequences in ${\it Cc}(C)$.
2336: %
2337: \vskip0.3cm
2338: \noindent
2339: {\bf Abstract Lazy Concatenation}\ :\ \
2340: ${\tt CONC}: ({\it AS}_D \times {\it ASSC}_D \times {\it ASS}_D)
2341: \rightarrow {\it ASS}_D$\ \\
2342: This operation is the abstract counterpart of the concatenation operation
2343: $\Box$. It is however extended with an
2344: additional argument to increase the accuracy.
2345: Let $B'={\tt CONC}(\beta,C,B)$ where
2346: $\beta$ describes a set of input substitutions for
2347: a procedure; $C$ describes the set of substitution sequences with cut
2348: information
2349: obtained by executing a clause of the procedure on $\beta$; $B$ describes the set of
2350: substitution sequences obtained by executing the
2351: subsequent clauses of the procedure on $\beta$.
2352: Then, $B'$ describes the set of substitution sequences obtained by
2353: concatenating the results according to the concrete
2354: concatenation operation $\Box$.
2355:
2356: Let us discuss a simple example to
2357: understand the role of $\beta$.
2358: Assume that
2359: $$
2360: \begin{array}{rcl}
2361: {\it Cc}(C)=\{ \langle <>, {\it nocut} \rangle ,
2362: \langle <\{x_1/a\}>, {\it nocut} \rangle\} &
2363: \mbox{\rm and } &
2364: {\it Cc}(B)=\{ <>,
2365: <\{x_1/b\}> \}.
2366: \end{array}
2367: $$
2368: If the input mode of $x_1$ is unknown, it must be assumed that
2369: all combinations of elements in ${\it Cc}(C)$ and ${\it Cc}(B)$ are
2370: possible. Thus,
2371: $${\it Cc}(B')=\{ <>, <\{x_1/a\}>,
2372: <\{x_1/b\}>, <\{x_1/a\},\{x_1/b\}> \}.$$
2373: On the contrary, if the input mode of $x_1$ is known to be ground,
2374: the outputs $\langle <\{x_1/a\}>, {\it nocut} \rangle$ and
2375: $<\{x_1/b\}>$ are incompatible since $x_1$
2376: cannot be bound to both $a$ and $b$ in the input substitution.
2377: In this case, we have
2378: $${\it Cc}(B')=\{ <>, <\{x_1/a\}>,
2379: <\{x_1/b\}>\}.$$
2380: The first argument $\beta$ of the operation {\tt CONC}
2381: provides information on the input values: it may be useful to improve
2382: the accuracy of the result.
2383: The above discussion motivates the following specification of
2384: operation {\tt CONC}.
2385: Note that the statement
2386: \mbox{$(\exists\sigma\in{\it SS}: \theta' = \theta\sigma)$} is abbreviated by
2387: $\theta'\leq \theta$ in the specification. Let
2388: $\beta\in{\it AS}_D$,
2389: $C\in{\it ASSC}_D$,
2390: $B\in{\it ASS}_D$,
2391: $\theta\in{\it CPS}_D$,
2392: $\langle S_1,{\it cf}\rangle\in ({\it CPSS}_{D}\times {\it CF})$ and
2393: $S_2\in{\it CPSS}_{D}$. %Then,
2394: %
2395: \begin{quote}$
2396: \begin{array}{rcl}
2397: \left.
2398: \begin{array}{c}
2399: \theta\in {\it Cc}(\beta),\\
2400: \langle S_1,{\it cf}\rangle \in {\it Cc}(C),\\
2401: S_2\in {\it Cc}(B),\\
2402: \forall \theta'\in {\it Subst}(S_1)\cup {\it Subst}(S_2):
2403: \theta'\leq \theta
2404: \end{array}
2405: \right\}
2406: &\Rightarrow&
2407: \langle S_1,{\it cf}\rangle \Box S_2 \in {\it Cc}({\tt CONC}(\beta,C,B)).
2408: \end{array}$
2409: \end{quote}
2410:
2411:
2412:
2413: \vskip0.3cm
2414: \noindent
2415: {\bf Operation}\ \ ${\tt SEQ}: {\it ASSC}_{D}
2416: \rightarrow{\it ASS}_{D}$\ \\
2417: This operation forgets the cut information
2418: contained in an abstract sequence with cut information $C$.
2419: It is applied to the
2420: result of the last clause of a procedure before
2421: combining this result with
2422: the results of the other clauses. \\
2423: Let $C\in {\it ASSC}_{D}$ and
2424: $\langle S,{\it cf}\rangle\in ({\it CPSS}_{D}\times {\it CF})$.
2425: \begin{quote}$
2426: \begin{array}{rcl}
2427: \langle S,{\it cf}\rangle \in {\it Cc}(C) &\Rightarrow &
2428: S\in {\it Cc}({\tt SEQ}(C)).
2429: \end{array}$
2430: \end{quote}
2431: %
2432: \vskip0.3cm
2433: \noindent
2434: {\bf Operation}\ \ ${\tt SUBST}: {\it ASSC}_{D'}
2435: \rightarrow{\it AS}_{D'}$\ \\
2436: %
2437: This operation forgets still more
2438: information. It extracts the ``abstract substitution part'' of $C$.
2439: It is applied before executing a literal in a clause. See operation
2440: {\tt EXTGS}.
2441: Let $C\in {\it ASSC}_{D'}$ and
2442: $\langle S,{\it cf}\rangle\in ({\it CPSS}_{D'}\times {\it CF})$.
2443: \begin{quote}$
2444: \begin{array}{rcl}
2445: \langle S,{\it cf}\rangle \in {\it Cc}(C)
2446: &\Rightarrow &
2447: {\it Subst(S)}\subseteq {\it Cc}({\tt SUBST}(C)).
2448: \end{array}$
2449: \end{quote}
2450: %
2451: %
2452: \subsection{Abstract Semantics}
2453: \label{sub:ASem}
2454:
2455: We are now in position to present the abstract semantics.
2456: %As just stated in the introduction, the abstract semantics is not defined
2457: %as a least fixpoint but as a set of post-fixpoints that fulfill a safety
2458: %requirement called pre-consistency.
2459: Note that we are not concerned with
2460: algorithmic issues here: they are dealt with in Section~\ref{sec:GAIA}.
2461: \\
2462:
2463:
2464: \noindent
2465: {\bf Extended Abstract Behaviors.}
2466: %\label{EAB}
2467: Extended abstract behaviors are the abstract counterpart of the
2468: concrete extended behaviors defined in Section \ref{CSR}.
2469: %They are defined in exactly the same way.
2470: %
2471:
2472: \begin{definition} [Extended Abstract Underlying Domain]
2473: \label{DEAUD}
2474: The {\em extended abstract underlying domain},
2475: denoted by {\it EAUD},
2476: consists of
2477: \begin{enumerate}
2478: \itemsep 2pt
2479: \item all triples $\langle \beta, g, c \rangle $, where
2480: $c$ is a clause of $P$,
2481: $g$ is a prefix of the body of $c$,
2482: $\beta\in{\it AS}_{D}$, and
2483: $D$ is the set of variables
2484: in the head of $c$;
2485: \item all pairs $\langle \beta, c \rangle $, where
2486: $c$ is a clause of $P$,
2487: $\beta \in{\it AS}_{D}$, and
2488: $D$ is the set of variables
2489: in the head of $c$;
2490: \item all pairs $\langle \beta, {\it pr} \rangle $, where
2491: ${\it pr}$ is a procedure of $P$
2492: or a suffix of a procedure of $P$,
2493: $\beta\in{\it AS}_{D}$, and
2494: $D$ is the set of variables
2495: in the head of the
2496: clauses of {\it pr}.
2497: \end{enumerate}
2498: \end{definition}
2499: %
2500: \begin{definition} [Extended Abstract Behaviors]
2501: An {\em extended abstract behavior} is a function from
2502: {\it EAUD} to ${\it ASS}\cup{\it ASSC}$ such~that
2503: %
2504: \begin{enumerate}
2505: \itemsep 2pt
2506: \item every triple $\langle \beta, g, c \rangle $ from {\it EAUD}
2507: is mapped to an abstract sequence with cut
2508: information
2509: $C\in {\it ASSC}_{D'} $, where $D'$ is the set of all variables in $c$;
2510: %
2511: \item every pair $\langle \beta, c \rangle $ from {\it EAUD}
2512: is mapped to an abstract sequence with cut information
2513: $C\in {\it ASSC}_D $, where $D$ is the set of variables
2514: in the head of $c$;
2515: %
2516: \item every pair $\langle \beta, {\it pr} \rangle $ from {\it EAUD}
2517: is mapped to an abstract sequence $B\in {\it ASS}_D$,
2518: where $D$ is the set of variables
2519: in the head of the
2520: clauses of {\it pr}.
2521: \end{enumerate}
2522: \end{definition}
2523:
2524: The set of extended abstract behaviors is endowed with a structure of
2525: partial or\-der in the obvious way. It is denoted by {\it EAB} and its
2526: elements are denoted by~{\it esat}.\\
2527: %\end{definition}
2528: %
2529:
2530: \noindent
2531: {\bf Abstract Transformation.}
2532: %\label{AT}
2533: The abstract semantics is defined in terms of two semantic functions
2534: that are depicted in Figure \ref{fig:TAT}.
2535: The first function $E: {\it AB}\rightarrow {\it EAB}$ maps
2536: abstract behaviors to extended abstract behaviors.
2537: It is the abstract counterpart of the concrete semantic rules of
2538: Figure \ref{CSRF}.
2539: The second function ${\it TAB} :{\it AB}\rightarrow{\it AB}$
2540: transforms an abstract behavior into another abstract behavior.
2541: It is the abstract counterpart of Rule {\bf T1} in Definition
2542: \ref{CT}.\\
2543:
2544: \begin{figure}[t]%\langle \rangle
2545: \figrule
2546: %\small
2547: \begin{tabbing}
2548: 123\=4567891\=23123\=456123456789012345\=12347891234123456789123456789 \kill
2549: %$TAB(sat) = \{(\beta,p,B): (\beta,p) \in UD \mbox{ and }
2550: %B = T(\beta,p,sat)\}$ \\ \\
2551:
2552: ${\it TAB}({\it sat})\langle \beta,p \rangle
2553: = E({\it sat})\langle \beta,{\it pr}\rangle$ \\
2554: \>{\bf where} \> ${\it pr}$ is the procedure defining $p$,\\ \\
2555:
2556: $E({\it sat})\langle \beta,{\it pr} \rangle
2557: = ${\tt SEQ}$(C)$ \\
2558: \>{\bf where}
2559: \> $C=E({\it sat})\langle \beta,c \rangle $
2560: \>\> if {\it pr $::=$ c}\\ \\
2561:
2562: $E({\it sat})\langle \beta,{\it pr} \rangle
2563: = {\tt CONC}(\beta,C,B)$ \\
2564: \>{\bf where}
2565: \> $B=E({\it sat})\langle \beta,{\it pr}' \rangle $ \\
2566: \> \> $C=E({\it sat})\langle \beta,c \rangle $
2567: \> \>if {\it pr $::=$ c,pr}$'$\\ \\
2568:
2569: $E({\it sat})\langle \beta,c \rangle
2570: =$ {\tt RESTRC}$(c,C)$ \\
2571: \>{\bf where}
2572: \> $C = E({\it sat})\langle \beta,g,c\rangle$\\
2573: \> \> $g$ is the body of $c$\\ \\
2574:
2575: $E({\it sat})\langle\beta,<>,c \rangle = {\tt EXTC}(c,\beta)$ \\ \\
2576:
2577:
2578: $E({\it sat})\langle \beta,( g , ! ) ,c\rangle
2579: = \mbox{\tt AI-CUT}(C)$ \\
2580: \>{\bf where}
2581: \> $C = E({\it sat})\langle \beta,g,c\rangle$ \\ \\
2582:
2583: $E({\it sat})\langle \beta, ( g , l ) ,c\rangle
2584: = {\tt EXTGS}(l,C,B)$ \\
2585: \>{\bf where}
2586: \> $B=$
2587: \> {\tt UNIF-VAR}$(\beta')$
2588: \> if {\it l $::=$ x$_i$=x$_j$} \\
2589: \> \> \> {\tt UNIF-FUNC}$(f,\beta')$
2590: \> if {\it l $::=$ x$_i$=f$(\ldots)$} \\
2591: \> \> \> ${\it sat}\langle \beta',p \rangle$
2592: \> if {\it l $::=$ p$(\ldots)$} \\
2593: \> \> $\beta' = {\tt RESTRG}(l,\beta'')$\\
2594: \> \> $\beta''= {\tt SUBST}(C)$\\
2595: \> \> $C= E({\it sat})\langle \beta,g,c \rangle$.
2596: \end{tabbing}
2597: \caption{The abstract transformation}
2598: \label{fig:TAT}
2599: \end{figure}
2600:
2601:
2602: \noindent
2603: {\bf Abstract Semantics.}
2604: %\label{subsub:ASem}
2605: The abstract semantics is defined as the set of all abstract
2606: behaviors that
2607: are both post-fixpoints of the abstract transformation
2608: {\it TAB} and pre-consistent.
2609: The corresponding definitions are given first; then
2610: the rationale underlying
2611: the definitions is discussed.
2612:
2613: \begin{definition}[Post-Fixpoints of {\it TAB}]
2614: %\label{def:PFT}
2615: An abstract behavior ${\it sat}\in{\it AB}$ is called
2616: a {\em post-fixpoint} of {\it TAB} if and only if
2617: ${\it TAB}({\it sat})\leq{\it sat} $, i.e., if and only if
2618: $$
2619: \begin{array}{llll}
2620: {\it TAB}({\it sat})\langle \beta,p \rangle
2621: &\leq
2622: &{\it sat}\langle \beta,p \rangle,
2623: &\;\;\forall \langle \beta,p \rangle \in {\it AUD}.\\
2624: \end{array}
2625: $$
2626: \end{definition}
2627: %
2628: \begin{definition}[Pre-Consistent Abstract Behaviors]
2629: \label{def:PCAB}
2630: \noindent
2631: Let $\longmapsto$ be the concrete semantics of the underlying
2632: program, according to Defi\-nition \ref{DCS}.
2633: An abstract behavior ${\it sat}\in{\it AB}$ is said to be
2634: {\em pre-consistent} with respect to $\longmapsto$
2635: if and only if
2636: there exists a concrete behavior $\longmapsto'$
2637: such that
2638: $$\longmapsto'\ \sqsubseteq \longmapsto$$
2639: and such that,
2640: for all $\langle \beta,p \rangle \in {\it AUD}$ and
2641: $\langle \theta,p \rangle \in {\it CUD}$,
2642: $$
2643: \begin{array}{lll}
2644: \left.
2645: \begin{array}{c}
2646: \theta \in {\it Cc}(\beta),\\
2647: \langle \theta,p \rangle \longmapsto' S
2648: \end{array}
2649: \right\}
2650: &\Rightarrow
2651: & S \in {\it Cc}({\it sat}\langle \beta,p \rangle).
2652: \\
2653: \end{array}
2654: $$
2655: \end{definition}
2656:
2657: In the next section, we show that any pre-consistent
2658: post-fixpoint {\it sat} of {\it TAB} is a safe approximation of
2659: the concrete semantics, i.e.,
2660: it is such that for all $\langle \beta,p \rangle \in {\it AUD}$ and
2661: $\langle \theta,p \rangle \in {\it CUD}$,
2662: $$
2663: \begin{array}{lll}
2664: \left.
2665: \begin{array}{c}
2666: \theta \in {\it Cc}(\beta),\\
2667: \langle \theta,p \rangle \longmapsto S
2668: \end{array}
2669: \right\}
2670: &\Rightarrow
2671: & S \in {\it Cc}({\it sat}\langle \beta,p \rangle).\\
2672: \\
2673: \end{array}
2674: $$
2675:
2676:
2677: %\noindent
2678: The abstract semantics is defined as the set of all pre-consistent post-fixpoints.
2679: Indeed, under the current hypotheses on the abstract domains, there is no straightforward way to choose a
2680: ``best'' abstract behavior among all pre-consistent post-fixpoints.
2681: Thus, we consider the problem of computing a reasonably accurate
2682: post-fixpoint as a pragmatic issue to be solved at the algorithmic level.
2683: In fact, the abstract interpretation algorithm
2684: presented in Section \ref{sec:GAIA} is an improvement of the following construction:
2685: define the abstract behavior ${\it sat}_\bot$ by
2686: $$
2687: \begin{array}{llll}
2688: {\it sat}_\bot \langle \beta, p \rangle
2689: & =
2690: & B_\bot,
2691: & \; \;\forall \langle \beta, p \rangle \in {\it AUD}.
2692: \end{array}
2693: $$
2694: Assume that the domain of abstract sequences is endowed with
2695: an % (polymorphic)
2696: upper-bound operation ${\tt UB}: {\it ASS}_D \times {\it ASS}_D \rightarrow {\it ASS}_D$
2697: (not necessarily a {\em least} upper bound).
2698: For every ${\it sat}_1,{\it sat}_2\in {\it AB}$, we define
2699: ${\tt UB}({\it sat}_1,{\it sat}_2)$ by
2700: $$
2701: \begin{array}{llll}
2702: {\tt UB}({\it sat}_1,{\it sat}_2)\langle \beta, p \rangle
2703: & =
2704: & {\tt UB}({\it sat}_1\langle \beta, p \rangle,
2705: {\it sat}_2\langle \beta, p \rangle),
2706: & \;\;\forall \langle \beta, p \rangle \in {\it AUD}.
2707: \end{array}
2708: $$
2709: Let j be an arbitrarily chosen natural number.
2710: An infinite sequence of pre-consistent abstract behaviors
2711: ${\it sat}_0,\ldots,{\it sat}_i,\ldots$ is defined as follows:
2712: $$
2713: \begin{array}{llll}
2714: {\it sat}_0 & = & {\it sat}_\bot, \\
2715: {\it sat}_{i+1}& = & {\it TAB}({\it sat}_i)
2716: & (0\leq i <j ),\\
2717: {\it sat}_{i+1}& = & {\tt UB}({\it sat}_i,{\it TAB}({\it sat}_i))
2718: & ( j\leq i). \\
2719: \end{array}
2720: $$
2721: The abstract behaviors ${\it sat}_i$ are all pre-consistent because
2722: ${\it sat}_\bot$ is pre-consistent by construction,
2723: every application of {\it TAB} maintains pre-consistency (as proven
2724: in the next section),
2725: and each application of {\tt UB} produces an abstract behavior
2726: whose concretization contains the concretizations of the arguments.
2727: Moreover, assuming that every partial order ${\it ASS}_D$ is finite
2728: or satisfies the finite ascending chain proper\-ty, the
2729: sequence ${\it sat}_0,\ldots,{\it sat}_i,\ldots$ has a least
2730: upper bound which is the desired pre-consistent post-fixpoint.
2731: In case the ${\it ASS}_D$ contains chains with infinitely many distinct
2732: elements, {\tt UB} must be a widening operator \cite{Cousot92c}.
2733: \\
2734: The sequence from ${\it sat}_0$ to ${\it sat}_j$
2735: is not ascending in general. In fact, ${\it sat}_\bot$ is not the minimum of
2736: {\it AB} and {\it TAB} is not necessarily monotonic nor extensive
2737: (i.e., ${\it sat}\leq {\it TAB}({\it sat})$ does not always hold).
2738: From step $0$ to $j$, the computation of the ${\it sat}_i$
2739: simulates as closely as possible the computation
2740: of the least fixpoint of the concrete transformation.
2741: From step $j$ to convergence, all iterates are ``lumped'' together.
2742: All concrete behaviors
2743: $\longmapsto_j, \longmapsto_{j+1}, \ldots$
2744: of the Kleene sequence of the concrete semantics,
2745: are thus included in the concretization of the final post-fixpoint {\it sat}.
2746: So, {\it sat} describes properties that are true not only for the
2747: concrete $\longmapsto$
2748: semantics but also for its approximations
2749: $\longmapsto_j, \longmapsto_{j+1}, \ldots$.
2750: The choice of $j$ is a compromise: a low value ensures a faster
2751: convergence while a high value provides a better accuracy.
2752: The abstract interpretation algorithm presented in Section
2753: \ref{sec:GAIA} does not iterate globally over {\it TAB}. It locally
2754: iterates over $E$ for every needed input
2755: pattern $\langle \beta, p \rangle$
2756: and uses different values of $j$ for different input patterns.
2757: Depending on the particular abstract domain, the value can be guessed
2758: more or less cleverly. This is the role of the
2759: special widening operator of Definition~\ref{def:EW}.
2760: A sample widening operator is described in Section \ref{sub:GAO},
2761: showing how the value of $j$ can be guessed in the case of a practical
2762: abstract domain.
2763:
2764: \subsection{Safety of the Abstract Semantics}
2765: \label{sub:SAS}
2766:
2767: We prove here the safety of our abstract semantics.
2768: %The safety results about our abstract semantics are now stated.
2769: %The details of the proofs can be found in \cite{Cut_rep95}.
2770: First, we formally define the notion of safe approximation. Then, we show
2771: that the abstract transformation is safe in the sense that, whenever
2772: {\it sat} safely approximates $\longmapsto$,
2773: ${\it TAB}({\it sat})$ safely approximates
2774: $\stackrel{{\it TCB}}\longmapsto$ (Theorem \ref{th:SAT}).
2775: From this basic result, we deduce that
2776: {\it TAB} transforms pre-consistent abstract
2777: behaviors
2778: into other pre-consistent abstract behaviors (Theorem \ref{th:p-c}),
2779: and that, when {\it sat} is a post-fixpoint
2780: of the abstract transformation which safely approximates
2781: a concrete behavior $\longmapsto$, it also safely approximates
2782: the concrete behavior $\stackrel{{\it TCB}}\longmapsto$
2783: (Theorem \ref{th:cppost}).
2784: Theorem \ref{th:cpab} states that abstract beha\-viors are,
2785: roughly speaking, chain-closed with respect to concrete behaviors.
2786: Finally, Theorem \ref{th:SAS} states our main result, i.e.,
2787: every pre-consistent post-fixpoint of the abstract transformation
2788: safely approximates the concrete semantics.
2789:
2790:
2791:
2792: \begin{definition}[Safe Approximation]
2793: \label{def:SA}
2794: \noindent Let $\longmapsto \in{\it CB}$ and
2795: ${\it sat}\in {\it AB}$.
2796: The abstract behavior {\it sat} {\em safely approximates} the concrete behavior
2797: $\longmapsto$ if and only if, for all
2798: $\langle \theta, p \rangle \in {\it CUD}$ and
2799: $\langle \beta, p \rangle \in {\it AUD}$, the following
2800: implication holds:
2801: $$
2802: \begin{array}{lll}
2803: \left.
2804: \begin{array}{c}
2805: \theta\in{\it Cc}(\beta),\\
2806: \langle \theta, p \rangle\longmapsto S
2807: \end{array}
2808: \right\}
2809: &\Rightarrow&
2810: S\in{\it Cc}({\it sat}\langle \beta ,p \rangle).
2811: \end{array}
2812: $$
2813:
2814: Similarly, let $\longmapsto \in{\it ECB}$ and
2815: ${\it esat}\in {\it EAB}$.
2816: The extended abstract behavior {\it esat} {\em safely approximates}
2817: $\longmapsto$ if and only if, for all
2818: $\langle \theta, {\it pr} \rangle,
2819: \langle \theta, c \rangle,
2820: \langle \theta, g, c \rangle
2821: \in {\it ECUD}$ and
2822: $\langle \beta, {\it pr} \rangle,
2823: \langle \beta, c \rangle,
2824: \langle \beta, g, c \rangle
2825: \in {\it EAUD}$, the following
2826: implications hold:
2827: $$
2828: \begin{array}{c}
2829: \begin{array}{lll}
2830: \left.
2831: \begin{array}{c}
2832: \theta\in{\it Cc}(\beta),\\
2833: \langle \theta, {\it pr} \rangle\longmapsto S
2834: \end{array}
2835: \right\}
2836:
2837: &\Rightarrow&
2838: S\in{\it Cc}({\it esat}\langle \beta , {\it pr} \rangle),
2839: \end{array}\\ \\
2840: \begin{array}{lll}
2841: \left.
2842: \begin{array}{c}
2843: \theta\in{\it Cc}(\beta),\\
2844: \langle \theta, c \rangle\longmapsto \langle S, {\it cf} \rangle
2845: \end{array}
2846: \right\}
2847:
2848: &\Rightarrow&
2849: \langle S, {\it cf} \rangle\in{\it Cc}({\it esat}\langle \beta ,c \rangle),
2850: \end{array}\\ \\
2851: \begin{array}{lll}
2852: \left.
2853: \begin{array}{c}
2854: \theta\in{\it Cc}(\beta),\\
2855: \langle \theta, g, c \rangle\longmapsto \langle S, {\it cf} \rangle
2856: \end{array}
2857: \right\}
2858:
2859: &\Rightarrow&
2860: \langle S, {\it cf} \rangle \in{\it Cc}({\it esat}\langle \beta , g, c \rangle).
2861: \end{array}\\
2862: \\
2863: \end{array}
2864: $$
2865:
2866: \end{definition}
2867:
2868: \begin{theorem}[Safety of the Abstract Transformation]
2869: \label{th:SAT}
2870: \noindent
2871: Let $\longmapsto\in{\it CB}$ and
2872: ${\it sat}\in {\it AB}$. If {\it sat} safely approximates
2873: $\longmapsto$, then ${\it TAB}({\it sat})$ safely approximates
2874: $\stackrel{{\it TCB}}\longmapsto$.
2875: %
2876: \end{theorem}
2877:
2878:
2879: We first establish the following result.
2880: Remember that if $\longmapsto\in{\it CB}$, its extension
2881: in {\it ECB} is also denoted by $\longmapsto$ (see Section \ref{CSR}).
2882:
2883: \begin{lemma}[Safety of $E$]
2884: \label{le:SE}
2885:
2886: \noindent
2887: Let $\longmapsto\in{\it CB}$ and
2888: ${\it sat}\in {\it AB}$. If {\it sat} safely approximates
2889: $\longmapsto$, then $E({\it sat})$ safely approximates $\longmapsto$
2890: (the extension of $\longmapsto$ in {\it ECB}).
2891: \end{lemma}
2892:
2893: \begin{proof*}[Proof of Lemma \ref{le:SE}]
2894: \label{pr:SE}
2895:
2896: We prove the lemma by structural induction on the syntax of the
2897: underlying program. It uses the concrete semantic rules of
2898: Figure~\ref{CSRF}, the definition of $E$ in Figure~\ref{fig:TAT},
2899: and the specifications of the abstract operations given in
2900: Section \ref{sub:AO}. The proof is straightforward due to the close
2901: correspondence of the concrete and the abstract semantics.
2902: We only detail the reasoning for the base case and for the case of
2903: a goal $(g , l)$ where $l$ is an atom of the form
2904: $p(x_{i_1},\dots,x_{i_n})$.
2905: The other cases are similar.
2906:
2907: \vskip0.2cm
2908: \noindent
2909: {\bf Base case}.
2910: Let $\langle \theta, <>, c \rangle
2911: \!\in\! {\it ECUD}$ and
2912: $\langle \beta, <>, c \rangle
2913: \!\in\! {\it EAUD}$. Assume that $\theta\!\in\!{\it Cc}(\beta)$ and
2914: \mbox{
2915: $\langle \theta, <>, c \rangle\longmapsto \langle S, {\it cf}\rangle$}.
2916: It must be proven that
2917: $$\langle S, {\it cf}\rangle\in
2918: {\it Cc}(E({\it sat})\langle\beta,<>,c \rangle). $$
2919: This relation holds because of the three following facts:
2920: \begin{center}
2921: \begin{tabular}{rllll}
2922: $\langle S, {\it cf}\rangle$ & $=$ &
2923: ${\tt EXTC}(c,\theta)$
2924: & &(by {\bf R2}),\\
2925: %
2926: ${\tt EXTC}(c,\theta)$&$\in$&$ {\it Cc}({\tt EXTC}(c,\beta))$
2927: & &(by specification of {\tt EXTC}),\\
2928:
2929: %
2930: $E({\it sat})\langle\beta,<>,c \rangle$& $=$ &${\tt EXTC}(c,\beta)$
2931: & &(by definition of $E$).
2932: \end{tabular}
2933: \end{center}
2934: \vskip0.2cm
2935: \noindent
2936: {\bf Induction step}.
2937: %
2938: Let $\langle \theta, ( g , l ), c \rangle
2939: \in {\it ECUD}$ and
2940: $\langle \beta, ( g , l ), c \rangle
2941: \in {\it EAUD}$,
2942: where $l$ is an atom of the form $p(x_{i_1},\dots,x_{i_n})$.
2943: %
2944: Assume that $\theta\in{\it Cc}(\beta)$ and
2945: \mbox{
2946: $\langle \theta, ( g , l),
2947: c \rangle\longmapsto \langle S, {\it cf}\rangle$}.
2948: %
2949: It must be proven that
2950: $$\langle S, {\it cf}\rangle\in
2951: {\it Cc}(C), \mbox{
2952: where }
2953: C = E({\it sat})\langle\beta, ( g , l),c \rangle.$$
2954: By Rule {\bf R6},
2955: % of the concrete semantics,
2956: there exist program substitutions and program sequences such that
2957: $$
2958: \begin{array}{llll}
2959: \langle \theta, g,c \rangle
2960: \longmapsto
2961: \langle S', {\it cf} \rangle & && (C1)\\
2962: S' = <\theta_1,\dots,\theta_i,\_> & && (C2) \\
2963: \theta'_k = {\tt RESTRG}(l,\theta_k) & (1\leq k\leq {\it Ns}(S)) && (C3) \\
2964: \langle \theta'_k ,p\rangle
2965: \longmapsto S_{k}^{'} & (1\leq k\leq {\it Ns}(S)) && (C4)\\
2966: S_k = {\tt EXTG}(l,\theta_k,S_{k}^{' })
2967: & (1\leq k\leq {\it Ns}(S)) && (C5)\\
2968: S={\Box}^{{\it Ne}(S)}_{k=1} S_k & && (C6)\\
2969: \end{array}
2970: $$
2971: %
2972: Moreover, by definition of
2973: $E({\it sat})$, there exist abstract values such that
2974: $$
2975: \begin{array}{rclll}
2976: C & = &{\tt EXTGS}(l,C',B) && (A1)\\
2977: B & = &{\it sat}\langle \beta',p \rangle && (A2)\\
2978: \beta' & = &{\tt RESTRG}(l,\beta'') && (A3)\\
2979: \beta'' & = &{\tt SUBST}(C') && (A4)\\
2980: C' & = &E({\it sat})\langle \beta,g ,c \rangle && (A5)
2981: \end{array}
2982: $$
2983: %
2984: The following assertions hold.
2985: %The last one is the
2986: %required result.
2987: By $A5$, $C1$, and the induction hypothesis,
2988: $$
2989: \begin{array}{rclll}
2990: \langle S', {\it cf} \rangle
2991: & \in & {\it Cc}(C') && (B1).
2992: \end{array}
2993: $$
2994: %
2995: By $A4$, $B1$, $C2$, and the specification of {\tt SUBST},
2996: $$
2997: \begin{array}{rcllll}
2998: \theta_k & \in & {\it Cc}(\beta'') & (1\leq k\leq {\it Ns}(S))
2999: && (B2).
3000: \end{array}
3001: $$
3002: %
3003: By $A3$, $B2$, $C3$, and the specification of {\tt RESTRG},
3004: $$
3005: \begin{array}{rcllll}
3006: \theta'_k & \in & {\it Cc}(\beta') & (1\leq k\leq {\it Ns}(S))
3007: && (B3).
3008: \end{array}
3009: $$
3010: %
3011: By $A2$, $B3$, $C4$, and the hypothesis that {\it sat} safely approximates
3012: $\longmapsto$,
3013: $$
3014: \begin{array}{rcllll}
3015: S'_{k}& \in & {\it Cc}(B) & (1\leq k\leq {\it Ns}(S))
3016: && (B4).
3017: \end{array}
3018: $$
3019: Finally, by $A1$, $B1$, $B4$, $C2$, $C5$, $C6$, and the specification
3020: of {\tt EXTGS},
3021: $$
3022: \begin{array}{rcllll}
3023: \langle S, {\it cf} \rangle & \in & {\it Cc}(C).
3024: \end{array}
3025: \mathproofbox $$
3026: \end{proof*}
3027: %
3028: %
3029: \begin{proof}[Proof of Theorem \ref{th:SAT}]
3030: \label{pr:SAT}
3031:
3032: The result follows from the definition of {\it TAB} in
3033: Figure~\ref{fig:TAT},
3034: the definition of {\it TCB} in Section \ref{CT}, and
3035: Lemma \ref{le:SE}.
3036: \end{proof}
3037:
3038:
3039:
3040: %
3041: %The proof of Theorem \ref{th:SAT} uses an important lemma
3042: %which states a safety result about the function $E$
3043: %and is given in Section \ref{app:sub:SAS}.
3044: %
3045: %
3046: %
3047: %
3048:
3049:
3050: The next theorem states that the transformation {\it TAB}
3051: maintains pre-consistency.
3052:
3053: \begin{theorem}\label{th:p-c}
3054: Let ${\it sat}\in {\it AB}$. If {\it sat} is
3055: pre-consistent, then ${\it TAB}({\it sat})$
3056: is also pre-consistent.
3057: \end{theorem}
3058: \begin{proof}
3059: Let $\longmapsto$ be the concrete semantics of the underlying program.
3060: Since {\it sat} is
3061: pre-consistent, there exists a concrete behavior $\longmapsto'$ such
3062: that
3063: \begin{enumerate}
3064:
3065: \itemsep 2pt
3066:
3067: \item $\longmapsto'\ \sqsubseteq\ \longmapsto$, and
3068:
3069: \item {\it sat} safely approximates $\longmapsto'$.
3070: \end{enumerate}
3071: The first condition implies that
3072: $$\stackrel{{\it TCB}}{\longmapsto'}\ \sqsubseteq\ \longmapsto,$$
3073: since {\it TCB} is monotonic and
3074: $\stackrel{{\it TCB}}{\longmapsto}\ =\ \longmapsto$.
3075: %the first condition implies that
3076: %$$\stackrel{{\it TCB}}{\longmapsto'}\ \sqsubseteq\ \longmapsto.$$
3077: %since {\it TCB} is monotonic (see Section \ref {TCS}) and
3078: %$\stackrel{{\it TCB}}{\longmapsto}\ =\ \longmapsto$ (see Definition
3079: %\ref{DCS}).
3080: \\The second condition and Theorem \ref{th:SAT} imply that
3081: \begin{center}
3082: ${\it TAB}({\it sat})$\ safely approximates\
3083: $\stackrel{{\it TCB}}{\longmapsto'}.$
3084:
3085: \end{center}
3086: The result follows from the two implied statements and Definition
3087: \ref{def:PCAB}.
3088: \end{proof}
3089:
3090:
3091: The next two theorems state closure properties of
3092: abstract behaviors, which are used to prove the
3093: safety of the abstract semantics.
3094:
3095: \begin{theorem}\label{th:cppost}
3096:
3097: Let {\it sat} be a post-fixpoint of {\it TAB}.
3098: Let $\longmapsto \in{\it CB}$. If {\it sat} safely approxi\-mates
3099: $\longmapsto$, then {\it sat} also safely approximates
3100: $\stackrel{{\it TCB}}\longmapsto$.
3101: \end{theorem}
3102: \begin{proof*}
3103: Assume that {\it sat} safely approximate
3104: $\longmapsto$.
3105: Let
3106: $\langle \theta, p \rangle \in {\it CUD}$ and
3107: $\langle \beta, p \rangle \in{\it AUD}$.
3108: It must be proven that$$
3109: \begin{array}{lll}
3110: \left.
3111: \begin{array}{c}
3112: \theta\in{\it Cc}(\beta),\\
3113: \langle \theta, p \rangle \stackrel{{\it TCB}}\longmapsto S
3114: \end{array}
3115: \right\}
3116: &\Rightarrow&
3117: S\in{\it Cc}({\it sat}\langle \beta ,p \rangle).\end{array}
3118: $$
3119: Assume that the left part of the implication holds.
3120: Theorem \ref{th:SAT} implies that
3121: $$S\in{\it Cc}({\it TAB}({\it sat})\langle \beta ,p \rangle).$$
3122: Since {\it sat} is a post-fixpoint and {\it Cc} is monotonic,
3123: $$\begin{array}{rcl}
3124: {\it Cc}({\it TAB}({\it sat})\langle \beta ,p \rangle)
3125: \subseteq
3126: {\it Cc}({\it sat}\langle \beta ,p \rangle),
3127: \end{array}$$
3128: and then
3129: $$S\in{\it Cc}({\it sat}\langle \beta ,p \rangle).
3130: \mathproofbox $$
3131: \end{proof*}
3132:
3133:
3134: \begin{theorem}\label{th:cpab}
3135: Let ${(\longmapsto_i)}_{i\in{\bf N}}$ be a chain of concrete behaviors.
3136: Let ${\it sat}\in {\it AB}$. If {\it sat} safely appro\-xi\-mates
3137: $\longmapsto_i$, for all $i\in{\bf N}$, then
3138: {\it sat} safely approximates
3139: $(\sqcup_{i=0}^{\infty}\longmapsto_i).$
3140: \end{theorem}
3141: \begin{proof*}
3142: Let us abbreviate $(\sqcup_{i=0}^{\infty}\longmapsto_i)$ by
3143: $\longmapsto$.
3144: It is sufficient to prove that, for any
3145: $\langle \beta, p \rangle \in {\it AUD}$ and
3146: any $\langle \theta, p \rangle \in {\it CUD}$,
3147: $$
3148: \begin{array}{lll}
3149: \left.
3150: \begin{array}{c}
3151: \theta\in{\it Cc}(\beta),\\
3152: \langle \theta, p \rangle\longmapsto S
3153: \end{array}
3154: \right\}
3155: &\Rightarrow&
3156: S\in{\it Cc}({\it sat}\langle \beta ,p \rangle).
3157: \end{array}
3158: $$
3159: Fix $\langle \beta, p \rangle $,
3160: $\langle \theta, p \rangle $, and $S$ satisfying the left part of the
3161: implication. By Theorem \ref{CB-cpo},
3162: $$
3163: \begin{array}{llllll}
3164: S = \sqcup_{i=0}^{\infty} S_i &&
3165: \mbox{\rm where} &&
3166: \langle \theta, p \rangle\longmapsto_i S_i
3167: &
3168: \forall i\in{\bf N}.
3169: \end{array}
3170: $$
3171: Since {\it sat} safely approximates every $\longmapsto_i$,
3172: \begin{center}
3173: $S_i\in{\it Cc}({\it sat}\langle \beta ,p \rangle)$ for all
3174: $i\in{\bf N}$.
3175: \end{center}
3176: Finally,
3177: since ${\it Cc}({\it sat}\langle \beta ,p \rangle)$ is chained-closed,
3178: $$S\in{\it Cc}({\it sat}\langle \beta ,p \rangle).
3179: \mathproofbox $$
3180: \end{proof*}
3181:
3182:
3183:
3184: The last theorem states our main result.
3185: %It uses Lemma \ref{le:SAS}, which is
3186: %stated and proven in Section \ref{app:sub:SAS}.
3187:
3188: \begin{theorem}[Safety of the Abstract Semantics]
3189: \label{th:SAS}
3190: \noindent
3191: Let {\it sat} be a pre-consistent post-fixpoint of {\it TAB}.
3192: Then {\it sat} safely approximates $\longmapsto$ where $\longmapsto$
3193: is the concrete semantics of the underlying program.
3194: \end{theorem}
3195:
3196: We first establish the following statement.
3197:
3198: \begin{lemma}\label{le:SAS}
3199: Let {\it sat} be a pre-consistent post-fixpoint of {\it TAB}.
3200: There exists a chain of concrete behaviors
3201: ${(\longmapsto_i)}_{i\in{\bf N}}$ such that {\it sat} safely approximates
3202: $\longmapsto_i$, for all $i\in{\bf N}$ and
3203: $(\sqcup_{i=0}^{\infty}\longmapsto_i)\ =\ \longmapsto$ where $\longmapsto$
3204: is the concrete semantics of the underlying program.
3205: \end{lemma}
3206: \begin{proof*}[Proof of Lemma \ref{le:SAS}]
3207: The proof is in three steps. First we construct a sequence
3208: ${\{\longmapsto'_i\}}_{i\in{\bf N}}$ of lower-approximations of
3209: $\longmapsto$ which is not necessarily a chain; then we
3210: modify it to get a chain ${(\longmapsto_i)}_{i\in{\bf N}}$;
3211: finally, we show that $(\sqcup_{i=0}^{\infty}\longmapsto_i)\ =\
3212: \longmapsto$. The proof uses the following property of
3213: program substitution sequences, whose proof is left to the reader.
3214: If $S_1$, $S_2$ and $S$ are program substitution sequences such that
3215: $S_1\sqsubseteq S$ and $S_2\sqsubseteq S$, then $S_1$ and $S_2$ have
3216: a least upper-bound, which is either $S_1$ or $S_2$. The least
3217: upper-bound is
3218: denoted by $S_1\sqcup S_2$ in the proof.
3219: \begin{enumerate}
3220: \itemsep 2pt
3221: \item Since {\it sat} is pre-consistent, there exists a concrete
3222: behavior $\longmapsto'$ such that
3223: {\it sat} safely approximate $\longmapsto'$ and
3224: $\longmapsto'\ \sqsubseteq\ \longmapsto$. The sequence
3225: ${\{\longmapsto'_i\}}_{i\in{\bf N}}$ is defined by
3226: $$\begin{array}{llll}
3227: \longmapsto'_0\ =\ \longmapsto' & \mbox{ and } &
3228: \longmapsto'_{i+1}\ =\ \stackrel{{\it TCB}}{\longmapsto'_i}
3229: & ({i\in{\bf N}}).
3230: \end{array}$$
3231: Since $\longmapsto'\ \sqsubseteq\ \longmapsto$,
3232: {\it TCB} is monotonic and $\longmapsto$ is a fixpoint of {\it TCB},
3233: it follows that
3234: $$\begin{array}{ll}
3235: \longmapsto'_{i}\ \sqsubseteq\ \longmapsto
3236: & (\forall{i\in{\bf N}}).
3237: \end{array}$$
3238: Moreover, by Theorem \ref{th:cppost}, {\it sat}
3239: safely approximates every $\longmapsto'_{i}$.
3240: \item ${(\longmapsto_i)}_{i\in{\bf N}}$ is now constructed by
3241: induction over $i$. The correctness of the construction process
3242: requires to prove that, after each induction step, the relation
3243: $\longmapsto_{i}\ \sqsubseteq\ \longmapsto$ holds. We first define
3244: $$\begin{array}{ll}
3245: \longmapsto_0\ =\ \longmapsto'_0.
3246: \end{array}$$
3247: Let $i\in {\bf N}$. Assume, by induction, that
3248: $\longmapsto_0\ \sqsubseteq\
3249: \ldots\ \sqsubseteq\ \longmapsto_i
3250: \ \sqsubseteq\ \longmapsto$.
3251: For every $\langle \theta, p\rangle \in {\it CUD}$, we define
3252: $$\begin{array}{lll}
3253: \langle \theta, p\rangle
3254: \longmapsto_{i+1} (S_1\sqcup S_2) & \mbox{\rm where } &
3255: \left\{
3256: \begin{array}{l}
3257: \langle \theta, p\rangle
3258: \longmapsto_{i} S_1,\\
3259: \langle \theta, p\rangle
3260: \longmapsto'_{i+1} S_2 .\end{array}\right.
3261: \end{array}$$
3262: Since
3263: $\longmapsto'_{i+1}\ \sqsubseteq\ \longmapsto$ and
3264: $\longmapsto_i \ \sqsubseteq\ \longmapsto$, we have that
3265: $\longmapsto_{i+1}$ is well-defined and
3266: $\longmapsto_{i+1}\ \sqsubseteq\ \longmapsto$.
3267: Moreover, since {\it sat} safely approximates
3268: $\longmapsto_i$ (by induction)
3269: and $\longmapsto'_{i+1}$, and $S_1\sqcup S_2$ is
3270: equal either to $S_1$ or $S_2$, in the definition of
3271: $\longmapsto_{i+1}$, we have that {\it sat} safely approximates
3272: every $\longmapsto_{i+1}$.
3273: \item The Kleene sequence of the concrete semantics is a chain
3274: ${(\longmapsto''_i)}_{i\in{\bf N}}$ defined as follows:
3275: $$\begin{array}{llll}
3276: \longmapsto''_0\ =\ \longmapsto_\bot & \mbox{ and } &
3277: \longmapsto''_{i+1}\ =\ \stackrel{{\it TCB}}{\longmapsto''_i}
3278: & ({i\in{\bf N}}).
3279: \end{array}$$
3280: Since $\longmapsto_\bot\ \sqsubseteq\ \longmapsto' $ and
3281: {\it TCB} is monotonic,
3282: it follows, by induction, that
3283: $$ \begin{array}{ll}
3284: \longmapsto''_i\ \sqsubseteq\ \longmapsto'_i
3285: \ \sqsubseteq\ \longmapsto_i
3286: \ \sqsubseteq\ \longmapsto
3287: & (\forall{i\in{\bf N}}).
3288: \end{array}$$
3289: Therefore, by definition of the least upper bound and since
3290: the least fixpoint is the limit of the Kleene sequence,
3291: $$ \begin{array}{l}
3292: \longmapsto\ =\ (\sqcup_{i=0}^\infty \longmapsto''_i)
3293: \ \sqsubseteq( \sqcup_{i=0}^\infty \longmapsto_i)
3294: \ \sqsubseteq\ \longmapsto.
3295: \end{array}$$
3296: Thus,
3297: $$ \begin{array}{l}
3298: \longmapsto\ =\
3299: (\sqcup_{i=0}^\infty \longmapsto_i).
3300: \end{array} \mathproofbox$$
3301: \end{enumerate}
3302: \end{proof*}
3303:
3304: \begin{proof}[Proof of Theorem \ref{th:SAS}]
3305:
3306: \noindent The result is an immediate consequence of Theorem \ref{th:cpab}
3307: and Lemma \ref{le:SAS}
3308: \end{proof}
3309:
3310:
3311: \subsection{Related Works}
3312: \label{sub:D}
3313: In this section we first discuss
3314: the mathematical approach underlying our abstract semantics and relate it with
3315: the higher-order abstract interpretation frameworks
3316: advocated by Cousot and Cousot \shortcite{CousotICCL94}.
3317: Then, we compare our approach with
3318: the abstract semantics for Prolog with control proposed by Barbuti
3319: {\em et al.} \shortcite{BarbutiControl},
3320: by Fil\`e and Rossi \shortcite{FileRossi},
3321: and by Spoto \shortcite{Spoto}.
3322: %\cite{BarbutiControl,FileRossi,Spoto}, which has been designed with the
3323: %aim of providing a basis
3324: %for the abstract interpretation of Prolog with control.
3325: \\
3326:
3327: \noindent
3328: {\bf Cousot and Cousot's Higher-order Abstract Interpretation Frameworks.}
3329: %
3330: %
3331: %
3332: %the abstract semantics for Prolog with control proposed by Barbuti
3333: %{\em et al.} in \cite{BarbutiControl},
3334: %by Fil\`e and Rossi in \cite{FileRossi},
3335: %and by Spoto and Levi in \cite{Spoto}.
3336: %\\
3337: %other approaches to abstract interpretation
3338: %We do not try to be exhaustive here. Our goal is merely (1) to explain
3339: %why the traditional approach to abstract interpretation,
3340: %which uses a simple set-based collecting semantics, does not work
3341: %for program substitution sequences; (2) to relate our
3342: %proposal with the higher-order abstract interpretation framework
3343: %advocated by Cousots in \cite{CousotICCL94}.
3344: %An account of the possible
3345: %mathematical approaches
3346: %to abstract interpretation can be found in \cite{CousotJLC92}.
3347: %\noindent
3348: As mentioned in the introduction,
3349: the traditional approach to abstract interpretation can not be applied
3350: to approximate the concrete semantics
3351: of Section~\ref{CS}.
3352: Indeed, we can define a set-based collecting transformation
3353: by lifting the concrete semantics to
3354: sets of program substitution sequences.
3355: However, the
3356: least fixpoint of the collecting transformation
3357: does not safely approximate the concrete semantics.
3358: The problem can be solved by restricting to sets of
3359: $\wp({\it CPS}_D)$ and $\wp({\it CPSS}_D)$
3360: that enjoy some closure properties
3361: ensuring safeness of the least fixpoint.
3362: This solution is simi\-lar to the choice of
3363: a power-domain structure in denotational semantics
3364: \cite{Schmidt88,Stoy77}: the needed constructions can in fact be
3365: viewed as power-domains.
3366: However
3367: %it has a
3368: %This solution has however a
3369: %drawback:
3370: there is no best way
3371: to choose the closure pro\-per\-ties.
3372: Different closure properties are
3373: adequate for different sorts of information.
3374: It is therefore advocated by Cousot and Cousot
3375: in \shortcite{CousotICCL94} that, for higher-order
3376: languages, different collec\-ting semantics should be defined
3377: for the same language
3378: depending on the kind of properties to be inferred.
3379: In our case, at least two dual collecting semantics could be defined. Both of
3380: them use sets of program substitution sequences that are
3381: chain-closed.
3382: \begin{enumerate}
3383: \itemsep 0pt
3384: \item The first semantics considers {\em downwards-closed} sets
3385: of program substitution
3386: sequences, i.e., such that
3387: for any $S,S'\in{\it CPSS}_D$,
3388: %and $\Sigma\in\wp({\it CPSS}_D)$,
3389: $$
3390: \begin{array}{lll}
3391: \left.
3392: \begin{array}{c}
3393: S\in\Sigma,\\
3394: S' \sqsubseteq S
3395: \end{array}
3396: \right\}
3397: &\Rightarrow
3398: & S'\in \Sigma.
3399: \end{array}
3400: $$
3401: This domain is ordered by inclusion and its minimum is $\{<\bot>\}$.
3402: It is adequate to infer non-termination and upper bounds to the
3403: length of sequences. In particular, it is adequate for determinacy
3404: analysis. However,
3405: it is unable to infer termination since $<\bot>$
3406: belongs to any set of sequences.
3407: \item The second semantics considers {\em upwards-closed} sets
3408: of program substitution sequences, i.e.,
3409: such that
3410: for any $S,S'\in{\it CPSS}_D$,
3411: $$
3412: \begin{array}{lll}
3413: \left.
3414: \begin{array}{c}
3415: S\in\Sigma, \\
3416: S\sqsubseteq S'
3417: \end{array}
3418: \right\}
3419: &\Rightarrow
3420: & S'\in \Sigma.
3421: \end{array}
3422: $$
3423: This domain is ordered by
3424: $\Sigma\leq\Sigma'\ \Leftrightarrow\ \Sigma'\subseteq\Sigma$ and
3425: its minimum is ${\it CPSS}_D$.
3426: It is able to infer termination and lower bounds to the
3427: length of sequences. It is less adequate
3428: than the previous one to infer precise information about the
3429: substitutions in the sequences because its least fixpoint
3430: corresponds to a greatest fixpoint in a traditional framework
3431: ignoring the sequence structure.
3432: \end{enumerate}
3433:
3434: \noindent In both cases, the least fixpoint is well-defined because
3435: the collecting versions of the operations are monotonic, since
3436: they have to ensure the closure properties. Moreover, the least fixpoint
3437: of the collecting semantics safely approximates the concrete
3438: semantics because all iterates are pre-consistent and the sets are
3439: chain-closed.
3440: Nevertheless, our formalization has some advantages.
3441: \begin{enumerate}
3442: \itemsep 0pt
3443: \item It can be more efficient: a single analysis is able to infer
3444: all the information that can be inferred by the two
3445: collecting semantics.
3446: \item It can be more accurate: there are pre-consistent post-fixpoints
3447: that are more precise than the intersection of the two
3448: collecting semantics.
3449: %In practice, an
3450: %abstract interpretation
3451: % algorithm based on our approach can always be at least as accurate as the
3452: % combination of
3453: % two algorithms based on the two collecting semantics.
3454: \end{enumerate}
3455:
3456:
3457: \vskip0.3cm
3458: \noindent
3459: {\bf Barbuti et al.'s Abstract Semantics}.
3460: The abstract semantics proposed by
3461: Barbuti {\em et al.} \shortcite{BarbutiControl} aims at modeling
3462: control aspects of logic programs
3463: such as search strate\-gy and selection rule.
3464: Their semantics is parametric with respect to
3465: a ``termination theory''. The meaning of a program is obtained by
3466: composing the meaning of its ``logic component'' together with a
3467: corresponding ``termination theory'' (the ``control component'').
3468: %An approximate model for termination
3469: The latter can be provided either by applying
3470: techniques of abstract interpretation or by applying proof procedures.
3471: In all cases, control information is deduced from outside
3472: in the form of a separated termination analy\-sis.
3473: This is the main difference with
3474: our framework, where control information, i.e. information relative to
3475: termination or non-termination,
3476: is modeled
3477: within the semantic domains through the notion of substitution sequence.
3478: \\
3479:
3480: \noindent
3481: {\bf Fil\`e and Rossi's Abstract Interpretation Framework}.
3482: The framework proposed by
3483: Fil\`e and Rossi \shortcite{FileRossi}
3484: consists of a tabled interpreter
3485: which explores OLDT abstract trees decorated with
3486: control information about
3487: sure success or failure of the goals.
3488: Such information is used by the cut operation to prune
3489: the OLDT-tree whenever a cut is reached.
3490: Sure success is modeled in our framework by abstract sequences
3491: representing
3492: only non-empty sequences.
3493: The abstract semantics defined by Fil\`e and Rossi is
3494: operational
3495: and non-compositional while ours is compositional and based on the
3496: fixpoint approach. Moreover,
3497: the abstract execution of a goal $(g,!)$
3498: is different.
3499: Whenever is known that $g$ surely succeeds, their framework stops after
3500: generating the first "sure" solution, while ours computes the entire
3501: abstract sequence for $g$ and then cuts it to maintain at most
3502: one solution. Our approach may thus imply some redundant work.
3503: However, if $g$ is used in several contexts, their framework
3504: should recognize this situation and expand the OLDT-tree
3505: further.\\
3506:
3507: \noindent
3508: {\bf Spoto's Denotational Abstract Semantics}.
3509: The related work closest to ours is
3510: the denotational abstract semantics proposed by
3511: Spoto \shortcite{Spoto}.
3512: He defines a
3513: goal-independent and compositional abstract
3514: semantics of Prolog modeling the depth-first search rule and
3515: the cut. His semantics associates to any Prolog program
3516: a sequence of pairs consisting of a ``kernel''
3517: constraint
3518: and its ``observability'' part. Intuitively, kernel constraints
3519: denote computed answers, while observability constraints give
3520: information about
3521: divergent computations and cut executions.
3522: The main difference with our approach is that his semantics is
3523: goal-independent
3524: while ours is not.
3525: This is due to the fact that our abstract
3526: semantics is {\em functional}, i.e., it
3527: associates to each program P a function (an abstract behavior)
3528: mapping every pair $\langle \beta,p\rangle$ to an abstract sequence $B$.
3529: However, this choice is unrelated to our concrete semantics: we could
3530: as well abstract the concrete semantics by a {\em relational}
3531: abstract semantics \cite{CousotJLC92},
3532: making it possible to express dependencies between
3533: input substitutions and the corresponding output substitution sequences.
3534: This is the approach of \cite{RP-98-002}
3535: where we express dependencies between the size of
3536: input terms and the number of corresponding output substitutions.
3537: We will go back to this issue at
3538: the end of Section \ref{sub:GRADC}.
3539:
3540: %We will go back to this issue at
3541: %the end of Section \ref{sub:GRADC}.
3542: %but notice that we have applied the relational approach in
3543: %\cite{RP-98-002}
3544: %to express dependencies between the size of
3545: %input terms and the number of corresponding output substitutions.
3546:
3547:
3548:
3549: %However, we can easily rephrase our semantics
3550: %into
3551: %a {\em relational} one, abstracting the program
3552: %to a set
3553: %of triples of the form $\langle \beta,p,B\rangle$.
3554: %The difference between the functional and relational
3555: % approach
3556: %will be also discussed at
3557: %has been discussed
3558: %by Cousot and Cousot in \cite{CousotJLC92}.
3559: %We will go back to this issue at
3560: %the end of Section \ref{sub:GRADC}.
3561:
3562:
3563: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3564: \section{Generic abstract interpretation algorithm}
3565: \label{sec:GAIA}
3566: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3567:
3568: A generic abstract interpretation algorithm
3569: is an algorithm that is
3570: parametric with respect to the abstract domains. It can
3571: be instantiated by various domains
3572: to obtain different data-flow analyses.
3573: Several such algorithms have been proposed for Prolog
3574: \cite{Bruynooghe91,SPE,ICLP91AI,WSA93,TOPLAS,ACTA95,Mellish87,Muthukumar92},
3575: but they do not handle the control features of the language such that
3576: Prolog search rule and cut.
3577:
3578: The algorithm presented here is essentially an instantiation of the
3579: universal fixpoint algorithm described in \cite{universal}
3580: to the abstract semantics of Section \ref{sec:AS}.
3581: In particular,
3582: it is quite similar to the algorithm presented in \cite{ICLP91AI,TOPLAS}:
3583: in fact,
3584: %it can proper\-ly be viewed as
3585: %a generalization
3586: %of this algorithm since
3587: the abstract semantics of Section
3588: \ref{sec:AS}
3589: can be
3590: viewed as a proper generalization of the abstract semantics
3591: described in those papers,
3592: %\cite{ICLP91AI,TOPLAS},
3593: where the sequences
3594: of computed answer substitutions are no longer abstracted to sets of
3595: substitutions.\\
3596: The universal algorithm in \cite{universal} is top-down, i.e.,
3597: it computes a subset of the fixpoint (in the form of a set of tuples)
3598: contai\-ning the output value corresponding to a
3599: distinguished input together with
3600: all the tuples needed
3601: to compute it.
3602: Top-down algorithms are naturally used to perform data-flow analyses,
3603: where one is interested in collecting the abstract information
3604: corresponding to a class of initial queries described by the
3605: distinguished input. It is more efficient in general to
3606: compute a part of the fixpoint only and this allows one to use
3607: infinite abstract domains, which are more expressive
3608: \cite{Cousot92c}.
3609: Although the instantiation of \cite{universal} to our
3610: abstract semantics is as mechanical as
3611: in our previous works (a slightly more general
3612: widening operator is needed however),
3613: the correctness of the algorithm involves some new theoretical
3614: issues: the pre-consistency of the post-fixpoint has now to be proven.
3615: Nevertheless, since the novel algorithm is in practice very similar to
3616: the algorithm presented in \cite{TOPLAS}, we only discuss here the extended
3617: widening operator which ensures a good compromise between
3618: efficiency and accuracy. A detailed description of the algorithm and
3619: its correctness proof can be found
3620: in \cite{sequence}.
3621:
3622:
3623:
3624: \subsection{Extended Widening}
3625: \label{sub:EW}
3626: The extended widening operation used by the novel algorithm is defined as follows.
3627:
3628:
3629:
3630: \begin{definition}[Extended Widening]
3631: \label{def:EW}
3632: \noindent
3633: An {\em extended widening}
3634: on abstract sequences is a (polymorphic%
3635: \footnote{It is parametrized over $D$.})
3636: operation
3637: $\nabla:{\it ASS}_D \times{\it ASS}_D \rightarrow {\it ASS}_D$ that
3638: enjoys the following properties.
3639: Let $\{B_i\}_{i\in{\bf N}}$ be a sequence of elements of ${\it ASS}_D$.
3640: Consider the sequence $\{B'_i\}_{i\in{\bf N}}$ defined by
3641: \vskip-0.1cm
3642: $$
3643: \begin{array}{llll}
3644: B'_0 & = & B_0, \\
3645: {B'}_{i+1} & = & B_{i+1}\nabla{B'}_{i} & (i\in{\bf N}).
3646: \end{array}
3647: $$
3648: The following conditions hold:
3649: \begin{enumerate}
3650: \itemsep 1pt
3651: \item $B'_i \geq B_i$ $(i\in{\bf N})$;
3652: \item the sequence $\{B'_i\}_{i\in{\bf N}}$ is stationary, i.e.,
3653: there exists $j\ge 0$ such that ${B'}_i={B'}_j$
3654: for all $i$ such that $j\leq i$.
3655: \end{enumerate}
3656: \end{definition}
3657:
3658:
3659: An extended widening is slightly more general than a widening
3660: \cite{Cousot92c} because the sequence $\{B'_i\}_{i\in{\bf N}}$
3661: is {\em not} required to be a chain.
3662:
3663: Let us now explain how the extended widening is used
3664: by the algorithm.
3665: Given an input
3666: pair $\langle\beta,p \rangle$, the
3667: algorithm iterates on the computation of
3668: ${\it TAB}({\it sat})\langle\beta,p \rangle$ until
3669: convergence, and concurrently updates {\it sat}, as follows
3670: (recursive calls -- which also modify {\it sat} -- are ignored
3671: in the discussion):
3672: \begin{enumerate}
3673: \itemsep 0pt
3674: \item $B'_0$ $=$ $B_{\bot}$ is stored in the initial ${\it sat}$ as
3675: the output for $\langle\beta,p\rangle$;
3676:
3677: \item $B_i$ results from the $i$-$th$ execution of
3678: ${\it TAB}({\it sat})\langle\beta,p \rangle$;
3679:
3680: \item ${B'}_{i}=B_{i}\nabla{B'}_{i-1}$ is stored in the current ${\it
3681: sat}$ after the $i$-$th$ execution of
3682: ${\it TAB}({\it sat})\langle\beta,p \rangle$;
3683:
3684: \item the loop is exited when $B_{i+1}\leq B'_i$.
3685: \end{enumerate}
3686:
3687: \noindent
3688: The loop terminates because
3689: there must be some $i$ such that $B'_{i+1}=B'_i$
3690: (otherwise Condition 2 of Definition \ref{def:EW} would
3691: be violated), and, hence, $B_{i+1}\leq B'_i$ since
3692: $B'_{i+1}\geq B_{i+1}$ by Condition 1.
3693: The loop can be resumed later on
3694: because some values in {\it sat} have been updated
3695: (Step 1 is omitted in these subsequent executions);
3696: all re-executions of the loop terminate for
3697: the same reasons as the first one; moreover,
3698: the loop can only be resumed finitely many times because
3699: no element in {\it sat} can be improved infinitely many often,
3700: since
3701: there is a $j$ such that $B'_i=B'_j$
3702: for all $i$ greater or equal to $j$.
3703: Note that a local post-fixpoint
3704: is attained each time the loop is exited.
3705: Thus a global post-fixpoint is obtained when all loops
3706: are terminated for all values in {\it sat}.
3707: The formal characterization of Definition \ref{def:EW} elegantly
3708: captures the idea that the algorithm
3709: sticks as closely as possible to the abstract semantics
3710: during the first iterations, and starts lumping
3711: the results together only when enough accuracy is obtained,
3712: in order to ensure convergence. The advantage
3713: of this characterization is that no particular value of $j$ is
3714: fixed. So we can think of ``intelligent''
3715: extended widenings that observe how the successive iterates
3716: behave and that enforce convergence exactly at the right time.
3717: %Consistency
3718: %of the result is guaranteed because each $B'_i$ is pre-consistent and
3719: %the algorithm terminates with a post-fixpoint. Pre-consistency of the
3720: %$B'_i$ follows from ${B'}_i \geq B_i$ and the pre-consistency of $B_i$
3721: %due to Lemma \ref{lemma2}.
3722: The extended widening used
3723: in our experimental evaluation is based on
3724: this intuitive idea (see Section \ref{sub:GAO}).
3725:
3726:
3727:
3728: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3729: \section{Cardinality analysis}
3730: \label{sec:FASAS}
3731: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3732:
3733: The abstract interpretation framework for Prolog presented in previous sections
3734: has been instantiated by a domain of
3735: abstract sequences
3736: to perform so-called cardinality analysis; see \cite{cardinality}.
3737: Cardinality analysis
3738: approximates the number of solutions to a goal and is useful for many
3739: purposes such as indexing, cut insertion and elimination
3740: \cite{D89,Sahlin91}, dead code elimination, and memory management
3741: and scheduling in parallel systems \cite{Bueno91b,Hermenegildo86}.
3742: The analysis subsumes traditional determinacy analysis
3743: such as those of \cite{Ramakrishnan93,D89,Giacobazzi92,Sahlin91}.
3744:
3745:
3746: This section is organized as follows.
3747: First we describe how a generic abstract domain
3748: for cardinality analysis, which is parametric with respect to
3749: {\em any} domain of
3750: abstract substitutions, can be built. Then,
3751: we instantiate this generic domain to the domain of abstract substitutions
3752: {\tt Pattern}
3753: \cite{TOPLAS}.
3754: Finally, we discuss experimental evaluations of the
3755: analysis from both accuracy and efficiency standpoints.
3756:
3757:
3758: \subsection{Generic Abstract Domains for Cardinality Analysis}
3759: \label{sub:GADCA}
3760:
3761: In this section, generic domains of abstract sequences
3762: and abstract sequences with cut information
3763: are built.
3764: The domains are generic with respect to the information on
3765: the substitutions in the sequences, but they provide
3766: specific information about the sequence structure.
3767: The latter consists of
3768: lower and upper bounds to the number of substitutions in the
3769: sequences
3770: and information about the nature
3771: (i.e., finite,
3772: incomplete or infinite) of the sequences.
3773: This information allows us to
3774: perform non-termination analysis
3775: and a limited form of termination analysis.
3776: %Moreover, it may improve
3777: %the accuracy of the substitution level analysis for some class of programs,
3778: %like those containing cuts.
3779: %These
3780: %abstract domains %described here
3781: %are adequate for
3782: %can also be used to perform
3783: Predi\-cate level analyses, like
3784: determinacy
3785: %\cite{Ramakrishnan93,Debray89a,Giacobazzi92,Sahlin91}
3786: and functionality
3787: \cite{Debray89a}, which were previously
3788: considered falling outside the scope of
3789: abstract interpretation, can be performed.
3790: \\
3791:
3792: \noindent
3793: {\bf Abstract Substitutions.}
3794: The substitution part of our generic domain of abstract sequences
3795: is assumed to be an element of an arbitrary domain of abstract substitutions
3796: ${\it AS}_D$. The only requirement on ${\it AS}_D$ is that it contains a minimum element
3797: $\beta_\emptyset$ such that ${\it Cc}(\beta_\emptyset)=\emptyset$.
3798: %This is
3799: %a trivial requirement since
3800: An abstract domain can always be
3801: enhanced with such an element.\\
3802:
3803: \noindent
3804: {\bf Abstract Sequences.}
3805: %\label{subsub:ASe}
3806: The generic domain of abstract sequences manipulates
3807: termination information whose domain is defined below.
3808:
3809: \begin{definition}[Termination Information]
3810: \label{def:TI}
3811: \noindent
3812: A termination information $t$ is an element of the set
3813: ${\it TI}=\{ {\it st}, \;{\it snt}, \;{\it pt}\}$
3814: endowed with the ordering $\leq$ defined by
3815: $$
3816: \begin{array}{llll}
3817: t_1\leq t_2 & \Leftrightarrow & \mbox{\rm either } t_1 = t_2\ \mbox{\rm or}\ t_2 = {\it pt}
3818: & \;\;\;\forall t_1,t_2\in{\it TI}.
3819: \end{array}
3820: $$
3821: The symbol {\it st} stands for
3822: ``sure termination'' and it characterizes finite sequences;
3823: {\it snt} stands for
3824: ``sure non termination'' and characterizes incomplete and infinite
3825: sequences; {\it pt} stands for ``possible
3826: termination'' and corresponds to absence of information.
3827: \end{definition}
3828:
3829: \noindent
3830:
3831:
3832: The domain of abstract substitution sequences is defined as follows.
3833:
3834:
3835:
3836: \begin{definition}[Abstract Sequences]
3837: \label{def:ASS_D}
3838: \noindent
3839: Let $D$ be a finite set of program variables. We denote by
3840: ${\it ASS}_D$ the set of all 4-tuples
3841: $\langle \beta, m, M, t \rangle$ such that
3842: % \begin{enumerate}
3843: % \item
3844: $\beta \in {\it AS}_D$,
3845: $m \in {\bf N}$,
3846: $M \in {\bf N}\cup \{\infty\}$,
3847: and $t \in {\it TI}$.%;
3848: % \item $m\leq M$;
3849: % \item $M=0\ \Leftrightarrow\ \beta=\beta_\emptyset$.
3850: % \end{enumerate}
3851: \end{definition}
3852:
3853: Informally, $\beta$ describes all substitutions in the
3854: sequences,
3855: $m$ and $M$ are
3856: lower and upper bounds on the number of substitutions in the
3857: sequences, and
3858: $t$ is an information on termination.
3859: %In the following, abstract sequences are denoted by the letter $B$
3860: %possibly subscripted.
3861:
3862: The ordering on abstract sequences is defined as follows.
3863:
3864: \begin{definition}[Ordering on Abstract Sequences]
3865: \label{def:OAS}
3866: \noindent
3867: Let $B_1,B_2 \in {\it ASS}_D$.
3868: $$
3869: \begin{array}{llll}
3870: B_1 \leq B_2 &\mbox{\rm iff } & \beta_1 \leq \beta_2 \mbox{\rm \ and\ }
3871: m_1 \geq m_2 \mbox{\rm \ and\ }
3872: M_1 \leq M_2 \mbox{\rm \ and\ }
3873: t_1 \leq t_2.
3874: \end{array}
3875: $$
3876: \end{definition}
3877:
3878:
3879: The set of program substitution sequences described by an abstract sequence
3880: $B$ is formally defined as follows.
3881:
3882:
3883: \begin{definition}[Concretization for Abstract Sequences]
3884: \label{def:CASS_D}
3885:
3886: \noindent
3887: Let $B\!\!=\!\!\langle \beta, m,M, t \rangle\!\in\! {\it ASS}_D$.
3888: We define
3889: \begin{center}
3890: \begin{tabbing}
3891: 123\=456\=789\=123456\=12345\kill
3892: ${\it Cc}(B)={\it Sseq}_1(\beta)\cap{\it
3893: Sseq}_2(m,M) \cap{\it Sseq}_3(t)$ \\
3894: { where} \\
3895: \>\> \>
3896: ${\it Sseq}_1(\beta) = \{S: S\in {\it PSS}_D\ \mbox{\rm and}\
3897: {\it Subst}(S)\subseteq{\it Cc}(\beta)\},$\\
3898: \>\> \>
3899: ${\it Sseq}_2(m,M) = \{S: S\in {\it PSS}\ \mbox{\rm and}\
3900: m \leq {\it Ns}(S) \leq M\},$\\
3901: \>\> \>
3902: ${\it Sseq}_3({\it snt}) = \{S: S\in {\it PSS}\ \mbox{\rm and}\
3903: S$ is incomplete or infinite\},\\
3904: \>\> \>
3905: ${\it Sseq}_3({\it st}) = \{S: S\in {\it PSS}\ \mbox{\rm and}\
3906: S$ is finite\},\\
3907: \>\>\>
3908: ${\it Sseq}_3({\it pt}) = {\it PSS}.$ \end{tabbing}
3909: \end{center}
3910: \end{definition}
3911:
3912:
3913:
3914: Monotonicity of the concretization function is a simple
3915: consequence of the definition.
3916:
3917:
3918:
3919: We denote by $B_\bot$ the special abstract sequence
3920: $\langle\beta_\emptyset,0,0,{\it snt}\rangle$ which is such that
3921: ${\it Cc}(B_\bot)=\{<\bot>\}$ as required in Section \ref{sub:AD}.
3922: %Notice that
3923: %, assuming that $\beta_\emptyset$ is the only abstract substitution
3924: %such that ${\it Cc}(\beta_\emptyset)=\emptyset$,
3925: %there is no minimum
3926: %abstract sequence in the domain (even no minimal
3927: %one, since the component $m$ is unbounded).
3928: %In fact, because we assume that ${\it AS}_D$ contains an abstract substitution
3929: %whose concretization is empty, we should sensibly assume that
3930: %${\it ASS}_D$ and ${\it ASSC}_D$ contain corresponding special elements
3931: %$B_\emptyset$ and $C_\emptyset$, because ``no input produces no output''.
3932: %These special elements are not needed actually because
3933: %they can be safely replaced by {\em any} other element (from the same
3934: %domain). Dealing with elements $B_\emptyset$ and $C_\emptyset$ would
3935: %require to add special cases in the implementation of each abstract
3936: %operation, which is cumbersome.
3937: %Thus, for the sake of simplicity, we do not consider such elements and
3938: %we replace them by arbitrary approximations where appropriate.
3939: It is easy to prove that for all abstract sequences
3940: $B\in{\it ASS}_D$, the set ${\it Cc}(B)$ is chain-closed; see \cite{sequence}.
3941:
3942: \vskip0.3cm
3943:
3944: \noindent
3945: {\bf Abstract Sequences with Cut Information.}
3946: %\label{subsub:ASCI}
3947: Abstract sequences with cut information are obtained by enhancing
3948: abstract sequences with information about
3949: execution of cuts.
3950:
3951:
3952: Let us first define the abstract domain for cut information.
3953:
3954: \begin{definition}[Abstract Cut Information]
3955: \label{def:ACI}
3956: \noindent
3957: %We require the existence of a new object denoted by {\it weakcut}.
3958: An abstract cut information {\it acf} is an element of the set
3959: ${\it ACF}=\{{\it cut},{\it nocut}, {\it weakcut}\}$.
3960: \end{definition}
3961:
3962:
3963:
3964:
3965: \begin{definition}[Abstract Sequences with Cut Information]
3966: \label{def:ASSC}
3967: \noindent
3968: Let $D$ be a finite set of program variables.
3969: We denote by $ {\it ASSC}_D $ the set of pairs
3970: $\langle B, {\it acf} \rangle$ where
3971: $B\in{\it ASS}_D $ and ${\it acf} \in {\it ACF}$.
3972: \end{definition}
3973:
3974:
3975: Informally, {\it cut} indicates that a cut has been executed in all
3976: sequences, {\it nocut} that no cut has been executed in any sequence,
3977: and {\it weakcut} that a cut has been executed for all sequences
3978: producing at least one solution. More formally, the concretization
3979: of an abstract sequence with cut information is defined as follows.
3980:
3981: \begin{definition}[Concretization for Abstract Sequences
3982: with Cut Information]
3983: \label{def:CASCI}
3984: \noindent
3985: Let $ B\in{\it ASS}_D$. We define\\
3986: \\
3987: $\begin{array}{llllll}
3988: {\it Cc}(\langle B,{\it cut}\rangle) & =
3989: & \{ \langle S, cut\rangle : S\in {\it Cc}(B)\}, \\
3990: {\it Cc}(\langle B,{\it nocut}\rangle) &=&
3991: \{ \langle S, nocut \rangle: S\in {\it Cc}(B)\}, \\
3992: {\it Cc}(\langle B,{\it weakcut}\rangle) &=&
3993: \{ \langle S, cut \rangle : S\in {\it Cc}(B)\}
3994: \cup \\
3995: & & \{ \langle S, nocut \rangle :\
3996: S\in {\it Cc}(B)
3997: \mbox{\rm\ and\ }S\in \{<>,<\bot>\}\}.
3998: \end{array} $
3999: \end{definition}
4000:
4001: %\noindent
4002: %Abstract sequences with cut information are denoted by the letter $C$.
4003:
4004: \subsection{Abstract Operations}
4005: \label{sub:GAO}
4006: Our next task is to provide definitions of all abstract operations
4007: specified in Section~\ref{sub:AO}.
4008: For space reasons, we describe here a subset of the
4009: operations, i.e., extended widening,
4010: unification, operation treating cut,
4011: and concatenation.
4012: The other operations are described in the appendix.
4013: The reader is referred to \cite{sequence} for the correctness proofs.
4014:
4015: The operations on abstract substitutions which are used in the
4016: definition of the
4017: operations on abstract sequences
4018: will be recalled when needed.
4019:
4020:
4021:
4022:
4023: \vskip0.3cm
4024:
4025: \noindent
4026: {\bf Extended Widening:} $\nabla: {\it ASS}_D\times {\it ASS}_D\rightarrow {\it ASS}_D$\\
4027: %\label{subsub:AS:EW}
4028: We require that the abstract domain ${\it AS}_D$ is equipped with a
4029: widening operation
4030: $\nabla': {\it AS}_D\times {\it AS}_D\rightarrow {\it AS}_D$.
4031: It can be an extended widening, a normal widening, or,
4032: if ${\it AS}_D$ is finite or
4033: enjoys the finite ascending chain property,
4034: any upper bound operation.
4035: The widening
4036: on sequences is obtained by taking the least upper bound of
4037: the termination components, the minimum of the lower bounds
4038: and setting the upper bound to infinity. \\
4039: Assume that $B_{\it old} = \langle \beta_{\it old}, m_{\it old},
4040: M_{\it old}, t_{\it old} \rangle$ and $B_{\it new} = \langle \beta_{\it new},
4041: m_{\it new},
4042: M_{\it new}, t_{\it new} \rangle$.\\
4043: The
4044: operation
4045: $\nabla: {\it ASS}_D\times {\it ASS}_D\rightarrow {\it ASS}_D$
4046: is defined
4047: as follows.
4048:
4049:
4050:
4051: \begin{small}
4052: $$
4053: \begin{array}{lllll}
4054: B_{\it new}\nabla B_{\it old} & = & \langle \beta_{\it new}\nabla'\beta_{\it old}, m_{\it new}, M_{\it new}, t_{\it new}\rangle
4055: &\myif &\! \beta_{\it new}\not\leq\beta_{\it old}
4056: \\
4057: % & = & \langle \beta_{\it old}, 0, M_{\it new}, {\it pt}\rangle
4058: % &\myif &\! \beta = \beta_\emptyset
4059: % \and t\not\leq t_{\it o}, \\
4060: & = & \langle \beta_{\it old}, m_{\it new}, M_{\it new}, {\it pt}\rangle
4061: &\myif &\! \beta_{\it new}\leq\beta_{\it old}
4062: \and t_{\it new}\not\leq t_{\it old} \\
4063: & = & \langle \beta_{\it old},
4064: \min(m_{\it new},m_{\it old}),
4065: \infty, t_{\it old}\rangle
4066: &\myif &\! \beta_{\it new}\leq\beta_{\it old}
4067: \and t_{\it new}\leq t_{\it old} \and \\
4068: & & & &
4069:
4070: (m_{\it new} \!< \!m_{\it old}
4071: \myor
4072: M_{\it new} \!> \!M_{\it old}) \\
4073: & = & B_{\it old}
4074: &\myif &\! B_{\it new}\leq B_{\it old}. \\
4075: \end{array}
4076: $$
4077: \end{small}
4078:
4079:
4080: The first case makes sure that the algorithm iterates until the
4081: abstract substitution part stabilizes. When it is stable,
4082: the widening is applied on sequences.
4083: \\
4084:
4085:
4086:
4087: \noindent
4088: {\it Example}.
4089: Consider the following program:\\
4090: \\
4091: \(
4092: \begin{array}{lll}
4093: {\tt repeat}.\\
4094: {\tt repeat} & \mbox{\tt :-}& {\tt repeat}.
4095: \end{array}
4096: \)\\
4097:
4098:
4099: The concrete semantics of this program maps the input
4100: $\langle \epsilon, {\tt repeat}\rangle$, where $\epsilon$ is the empty
4101: substitution, to the infinite sequence $<\epsilon,\ldots,\epsilon,\ldots>$.
4102:
4103: On this example, because the program has no variables,
4104: our domain of abstract substitutions only contains two values, say
4105: $\beta_{\emptyset}$ and $\beta_{\top}$, such that\\
4106: \\
4107: \(\begin{array}{llll}
4108: {\it Cc}(\beta_{\emptyset})& =& \emptyset\\
4109: {\it Cc}(\beta_{\top})& =&\{\epsilon\}.
4110: \end{array}\)\\
4111:
4112:
4113: Let $B_{\bot}=\langle \beta_{\emptyset},0,0,{\it snt}\rangle$.
4114: Starting from $B_{\bot}$, the algorithm computes the abstract sequences\\
4115: \\
4116: \(\begin{array}{llllllllllllll}
4117: B_0& =& B_{\bot}& \hskip0.8cm B'_0& = & B_{\bot}\\
4118: B_1& =& \langle \beta_{\top},1,1,{\it snt}\rangle&
4119: \hskip0.8cm
4120: B'_1& =& B_1 \nabla B'_{0}=\langle \beta_{\top},1,1,{\it snt}\rangle \\
4121: B_2& =& \langle \beta_{\top},2,2,{\it snt}\rangle
4122: &\hskip0.8cm
4123: B'_2& =& B_2 \nabla B'_{1}=\langle \beta_{\top},1,\infty,{\it snt}\rangle \\
4124: B_3& =& \langle \beta_{\top},2,\infty,{\it snt}\rangle
4125: \end{array}
4126: \)
4127: \\
4128:
4129:
4130: Notice that the widening on sequences is applied when the
4131: abstract substitution part stabilizes, i.e., after the computation of the
4132: abstract sequence $B_2$. The next iterate $B_3$ satisfies the property that
4133: $B_3\leq B'_2$. Hence, according to the discussion in Section
4134: \ref{sub:EW}, the execution terminates
4135: returning the final value
4136: \begin{center}
4137: $B'_2=\langle \beta_{\top}, 1,\infty,{\it snt}\rangle.$
4138: \end{center}
4139:
4140: Observe that $B'_2$
4141: safely approximates the concrete
4142: infinite sequence
4143: $<\epsilon,\ldots,\epsilon,\ldots>$. Moreover, it
4144: expresses the fact that
4145: the execution of {\tt repeat} {\em surely} succeeds at least once
4146: and {\em surely} does not terminate\footnote{This example also shows that
4147: our framework can express non-failure properties such as the ones
4148: described in \cite{BC99,DLH97}.}.
4149:
4150: \vskip0.3cm
4151: \noindent
4152: {\bf Unification of Two Variables:}\ \
4153: {\tt UNIF-VAR}$: {\it AS}_{\{x_1,x_2\}}
4154: \rightarrow {\it ASS}_{\{x_1,x_2\}}$\ \\
4155: \noindent
4156: Given an abstract substitution $\beta$ with domain
4157: $\{x_1,x_2\}$, this operation returns an abstract sequence
4158: which represents a set of substitution sequences
4159: of length 0 or~1 (depending upon the
4160: success or failure of the unification). The terms bound to $x_1$ and
4161: $x_2$ are unified in all these sequences.
4162: The operation {\tt UNIF-VAR} on abstract sequences uses an upgraded
4163: version of the operation {\tt UNIF-VAR}
4164: on abstract substitutions defined in \cite{ICLP91AI,TOPLAS}.
4165: The latter, in addition
4166: to the resulting abstract substitution,
4167: produces now two flags indicating whether the unification always
4168: succeeds, always fails, or can both succeed and fail.
4169: The additional information
4170: is expressed by the boolean values {\it ss} and {\it sf} as specified below.\\
4171:
4172: \noindent
4173: {\em Operation} $\;${\tt UNIF-VAR}$: {\it AS}_{\{x_1,x_2\}}
4174: \rightarrow ({\it AS}_{\{x_1,x_2\}}
4175: \times{\it Bool}\times{\it Bool})$
4176:
4177: \noindent
4178: Let $\beta\in{\it AS}_{\{x_1,x_2\}}$ and
4179: $\langle\beta',{\it ss},{\it sf}\rangle=
4180: \mbox{\tt UNIF-VAR}(\beta)$. The following conditions
4181: hold:
4182: \begin{enumerate}
4183: \itemsep 3pt
4184: \item $
4185: \begin{array}{llll}
4186: \forall\theta\in{\it Cc}(\beta):
4187: \forall\sigma\in{\it SS}:
4188: (\sigma\in{\it mgu}(x_1\theta,x_2\theta)&\Rightarrow&
4189: [\![\theta\sigma]\!]\in{\it Cc}(\beta'));
4190: \end{array}$
4191: \item
4192: $\begin{array}{lll}
4193: {\it ss}={\it true}&\Rightarrow& %{\it Cc}(\beta)\neq \emptyset\mbox{\rm\ and\ }
4194: (\forall\theta\in{\it Cc}(\beta):\
4195: x_1\theta\mbox{\rm\ and\ }
4196: x_2\theta\mbox{\rm\ \ are~unifiable);}
4197: \end{array}$
4198: \item
4199: $\begin{array}{lll}
4200: {\it sf}={\it true}&\Rightarrow&
4201: (\forall\theta\in{\it Cc}(\beta):\
4202: x_1\theta\mbox{\rm\ and\ }
4203: x_2\theta\mbox{\rm\ \ are~not~unifiable})
4204: %\mbox{\rm\ and\ }\beta'=\beta_\emptyset
4205: .
4206: \end{array}$
4207: \end{enumerate}
4208:
4209: \vskip0.2cm
4210:
4211:
4212: Based on the upgraded operation {\tt UNIF-VAR}
4213: for abstract substitutions, we provide an implementation of the operation
4214: {\tt UNIF-VAR} for abstract sequences,
4215: which is correct with respect to the corresponding
4216: specification given in
4217: Section \ref{sub:AO}.
4218:
4219: The operation {\tt UNIF-VAR}$: {\it AS}_{\{x_1,x_2\}}
4220: \rightarrow {\it ASS}_{\{x_1,x_2\}}$ on abstract sequences
4221: is defined as follows.
4222: Let $\beta\in {\it AS}_{\{x_1,x_2\}} $ and
4223: $\langle\beta'',{\it ss},{\it sf}\rangle=
4224: \mbox{\tt UNIF-VAR}(\beta)$. We have that {\tt UNIF-VAR}$(\beta)=B'$ where
4225: $B'$ is the abstract sequence $\langle \beta',m',M',t'\rangle$
4226: such that
4227: $$
4228: \begin{array}{lll}
4229: \beta' & = & \beta'' \\
4230: m' & = & \myif~{\it ss}~\mbox{\rm then}~1~\mbox{\rm else}~0\\
4231: M' & = & \myif~{\it sf}~\mbox{\rm then}~0~\mbox{\rm else}~1\\
4232: t' & = & {\it st}.
4233: \end{array}
4234: $$
4235: %$$
4236: %\begin{array}{llllll}
4237: %\beta' & = & \beta'' \\
4238: %m' & = & 1 & \myif~{\it ss=true}\\
4239: %& =& 0 & \mbox{ otherwise}\\
4240: %M' & = & 0 & \myif~{\it sf=true}\\
4241: %& =& 1 & \mbox{ otherwise}\\
4242: %t' & = & {\it st}.
4243: %%
4244: %%\myif~{\it ss}~\mbox{\rm then}~1~\mbox{\rm else}~0;\\
4245: %M' & = & \myif~{\it sf}~\mbox{\rm then}~0~\mbox{\rm else}~1;\\
4246: %t' & = & {\it st}.
4247: %\end{array}
4248: %$$
4249:
4250:
4251:
4252:
4253: %\vskip0.2cm
4254: \noindent
4255: {\bf Abstract Interpretation of the Cut:}\ \
4256: {\tt AI-CUT}$: {\it ASSC}_{{D'}}
4257: \rightarrow {\it ASSC}_{{D'}}$\
4258:
4259: \noindent
4260: Let $C=\langle\langle \beta,m,M,t\rangle,{\it acf}\rangle$.
4261: {\tt AI-CUT}$(C)=\langle\langle \beta',m,'M',t'\rangle,{\it acf'}\rangle$
4262: where
4263: %
4264: %and $C'=\langle\langle \beta',m,'M',t'\rangle,{\it acf'}\rangle$.
4265: %Then $C'=${\tt AI-CUT}$(C)$ where
4266: \begin{quote}
4267: $ \begin{array}{llll}
4268: \beta' & = & \beta \\
4269:
4270: m' & = & \min(1,m) \\
4271:
4272: M' & = & \min(1,M) \\
4273:
4274: t' & = & {\it st} &\myif m\geq 1 \myor {\it t}={\it st} \\
4275: & = & {\it snt} &\myif M=0 \and {\it t}={\it snt} \\
4276: & = & {\it pt} &\otherwise \\
4277:
4278: {\it acf}'
4279: & = & {\it cut} &\myif m\geq 1 \myor {\it acf}={\it cut}\\
4280: & = & {\it nocut} &\myif M=0 \and {\it acf}={\it nocut}\\
4281: & = & {\it weakcut} &\otherwise.
4282: \end{array}$
4283: \end{quote}
4284:
4285:
4286: \noindent
4287: {\it Example}. Consider the program
4288: \\
4289: \\
4290: \(\begin{array}{lllll}
4291: {\tt p(X)} & \mbox{\tt :-} & {\tt q(X),\;!}.\\
4292: {\tt q(X)} & \mbox{\tt :-} & {\tt X=a}.\\
4293: {\tt q(X)} & \mbox{\tt :-} & {\tt X=b}.
4294: \end{array}
4295: \)\\
4296: \\
4297:
4298: For the sake of simplicity we use a
4299: simple domain of abstract substitutions
4300: which can be seen as the mode component of the
4301: {\tt Pattern} domain \cite{RP-98-002,TOPLAS}.
4302: The example is intended to illustrate the abstract execution
4303: of the operation
4304: {\tt AI-CUT}. Hence, we do not enter here into the details of
4305: the other operations, but the reader is referred
4306: to the appendix for
4307: their definition.
4308:
4309:
4310: The abstract execution of the procedure ${\tt p}$ called with
4311: its argument being a variable is as follows.
4312: Let $$\beta={\tt X}\mapsto {\tt var}$$ be the initial abstract substitution.
4313: Let $c$ be the clause of the program defining ${\tt p}$.
4314: \\
4315: First, the abstract sequence with cut information $C$ is computed by
4316: $$C={\tt EXTC}(c,\beta)=\langle\langle {\tt X}\mapsto {\tt var}, 1,1,
4317: {\it st}\rangle , {\it nocut}\rangle.$$
4318: Then, the procedure {\tt q} that occurs
4319: in the body of $c$ is executed with
4320: $\beta={\tt SUBST}(C)$ returning the
4321: abstract sequence
4322: $$B=\langle {\tt X}\mapsto {\tt ground}, 2, 2, {\it st}\rangle.$$
4323: Hence, the abstract sequence
4324: with cut information $C'$ is computed as follows
4325: $$C'={\tt EXTGS}({\tt q(X)}, C,B)=\langle \langle {\tt X}\mapsto {\tt ground}, 2, 2, {\it st}\rangle,
4326: {\it nocut}\rangle.$$ Now, the operation
4327: {\tt AI-CUT}$(C')$ is applied. Following the definition above, one obtains
4328: $$\mbox{{\tt AI-CUT}}(C')=\langle \langle {\tt X}\mapsto {\tt ground}, 1, 1, {\it st}\rangle,
4329: {\it cut}\rangle$$
4330: expressing~the fact that a cut in the body of $c$ is {\em surely} executed.
4331: The final result~is
4332: $$B'={\tt SEQ}(C')=\langle {\tt X}\mapsto {\tt ground}, 1, 1, {\it st}\rangle$$
4333: stating that the execution of ${\tt p} $ called with
4334: its argument being a variable surely terminates and
4335: succeeds exactly once.
4336:
4337:
4338: Consider now the abstract execution of the procedure ${\tt p}$ called with
4339: a ground argument.
4340: Let $$\beta={\tt X}\mapsto {\tt ground}$$ be the initial abstract substitution.
4341: In this case, the abstract sequence with cut information $C$ is first
4342: computed by
4343: $$C={\tt EXTC}(c,\beta)=\langle\langle {\tt X}\mapsto {\tt ground}, 1,1,
4344: {\it st}\rangle , {\it nocut}\rangle.$$
4345: Then, the procedure {\tt q} is executed with
4346: $\beta={\tt SUBST}(C)$ returning
4347: $$B=\langle {\tt X}\mapsto {\tt ground}, 0, 1, {\it st}\rangle.$$
4348: The abstract sequence
4349: with cut information $C'$ is computed as follows
4350: $$C'={\tt EXTGS}({\tt q(X)}, C,B)=
4351: \langle \langle {\tt X}\mapsto {\tt ground}, 0, 1, {\it st}\rangle,
4352: {\it nocut}\rangle.$$ The operation
4353: {\tt AI-CUT}$(C')$ returns
4354: $$\mbox{\tt AI-CUT}(C')=\langle \langle {\tt X}\mapsto {\tt ground}, 0, 1, {\it
4355: st}\rangle,
4356: {\it weakcut}\rangle$$ expressing the fact
4357: that, in this case, the computation either fails without executing the cut
4358: or succeeds once after executing the cut.
4359: The final result is
4360: $$B'={\tt SEQ}(C')=\langle {\tt X}\mapsto {\tt ground}, 0, 1, {\it
4361: st}\rangle$$ stating that the execution of ${\tt p}$
4362: called with a ground argument
4363: succeeds at most once and surely terminates.
4364:
4365:
4366: The {\tt Pattern} domain
4367: used in our experiments
4368: is more elaborated than
4369: the simple domain of abstract substitutions
4370: used in this example.
4371: However, it does not provide more precision in these
4372: cases.
4373: A more sophisticated domain where an
4374: abstract sequence is represented
4375: as $\langle<\beta_1,\ldots,\beta_n>,m,M,t\rangle$
4376: with $<\beta_1,\ldots,\beta_n>$ being an explicit sequence of
4377: abstract substitutions
4378: could return in the first case
4379: a more precise result. Indeed, one could obtain
4380: $B=\langle \{{\tt X}\mapsto {\tt a}\},\{{\tt X}\mapsto {\tt b}\}
4381: , 2, 2, {\it st}\rangle$ and then
4382: $B'=\langle \{{\tt X}\mapsto {\tt a}\}, 1, 1, {\it st}\rangle$.
4383: However, such a domain could not improve the result in the second
4384: case since the fact that the output substitution can be
4385: either ${\tt X}\mapsto {\tt a}$ or
4386: ${\tt X}\mapsto {\tt b}$ would be represented by
4387: ${\tt X}\mapsto {\tt ground}$ as we have done above.
4388:
4389:
4390: \vskip0.3cm
4391: \noindent
4392: {\bf Abstract Lazy Concatenation.}
4393: %\label{subsub:ALC}
4394: The implementation of the operation {\tt CONC} is complicated here,
4395: in order to get accurate results when the domain ${AS}_D$ is instantiated to the domain {\tt Pattern}.
4396: The implementation works on {\em
4397: enhanced} sets of abstract sequences which allow us to keep
4398: individual structural information about the results of every clause in order
4399: to detect mutual exclusion of the clauses.
4400:
4401: Let us motivate the lifting of
4402: abstract sequences to enhanced abstract sequences.
4403: Lifting an abstract domain to its power set,
4404: see, for instance, \cite{Cousot79a,FileILPS94}, is sometimes
4405: useful when the original abstract domain is not expressive enough to
4406: gain a given level of
4407: accuracy.
4408: Replacing an abstract domain by its power set is computationally
4409: expensive
4410: however; see \cite{WSA93_Granularity}.
4411: Sometimes, the accuracy is lost only inside a few operations;
4412: thus, a good compromise can be to lift the domain only locally,
4413: when these operations are executed, and to go back to the simple
4414: domain afterwards. This is exactly what we are going to do for the
4415: operation {\tt CONC}. The lifted version
4416: of the abstract domain
4417: that we are about to define
4418: is useful when the abstract domain is able to express
4419: definite, but not disjunctive,
4420: structural information about
4421: terms.
4422: % but not disjunctive structural information.
4423: In such a domain, for instance,
4424: the principal functor of the term bound to a program
4425: variable
4426: can be either
4427: definitely known or not known at all; it is not possible to express
4428: that it belongs to a given finite set.
4429: The domain {\tt Pattern} used in our experiments is an abstract domain
4430: of this kind. Disjunctive structural information is however essential
4431: to implement the operation {\tt CONC} accurately: it allows us to
4432: detect
4433: mutually exclusive abstract sequences, i.e., abstract sequences that
4434: should not be ``abstractly concatenated'' since they correspond to
4435: different concrete inputs.
4436: In order to keep disjunctive structural information, our
4437: implementation of {\tt CONC} works on a finite set of abstract sequences.
4438: This set is ``normalized'' in some way, in order to simplify the
4439: case analysis in the implementation.
4440: Basically, we differentiate between ``surely empty'' abstract sequences,
4441: approximating only sequences of the form $<>$ or $<\bot>$,
4442: and ``surely non empty'' abstract
4443: sequences, approximating
4444: only sequences of the form $<\theta>::S$. This is useful
4445: because sequences such as $<>$ or $<\bot>$ are possible outputs
4446: for any input, while sequences of the form $<\theta>::S$ are only
4447: possible for some inputs. Therefore we only have to check
4448: incompatibility of ``surely non empty'' abstract
4449: sequences.
4450: This discussion motivates
4451: the following definitions of semi-simple abstract sequences and simple
4452: abstract sequences.
4453:
4454: \begin{definition}[Semi-Simple Abstract Sequences]
4455: \label{def:SNAS}
4456: \noindent
4457: Let $B\in{\it ASS}_D$. We say that $B$ is a {\em semi-simple} abstract sequence
4458: if
4459: \begin{enumerate}
4460: \itemsep 0pt
4461: \item either, $\beta=\beta_\emptyset$ and $m=M=0$
4462: \item or, $\beta\neq\beta_\emptyset$ and $1\leq m \leq M$.
4463: \end{enumerate}
4464: \end{definition}
4465:
4466: \begin{definition}[Simple Abstract Sequences]
4467: \label{def:NAS}
4468: \noindent
4469: Let $B\in{\it ASS}_D$. We say that $B$ is a {\em simple} abstract sequence
4470: if it is semi-simple and $t\in\{{\it snt},{\it st}\}$.
4471: \end{definition}
4472:
4473:
4474: Semi-simple abstract sequences formalize our idea of distinguishing
4475: between ``surely empty'' and ``surely non empty'' abstract sequences.
4476: Note that,
4477: assuming that $\beta_\emptyset$ is the only abstract substitution such that
4478: ${\it Cc}(\beta_\emptyset)=\emptyset$, we have that ${\it Cc}(B)\neq\emptyset$ for
4479: any semi-simple abstract sequence $B$.
4480: %Enhanced abstract sequences
4481: %are then defined as follows.
4482:
4483: \begin{definition}[Enhanced Abstract Sequences]
4484: \label{def:EnhAS}
4485: \noindent
4486: Let $D$ be a finite set of program variables. We denote by
4487: ${\it ASS}^{\it enh}_D$ the set of all sets of the form
4488: $\{B_1,\dots,B_n\}$, where $n\geq 0$ and
4489: $B_1,\dots,B_n$ are semi-simple abstract sequences from ${\it ASS}_D$.
4490: Elements of ${\it ASS}^{\it enh}_D$ are called {\em enhanced abstract
4491: sequences}; they are denoted by {\it SB} in the following.
4492: The concretization function
4493: ${\it Cc}:{\it ASS}^{\it enh}_D \rightarrow {\it CSS}_D$ is defined by
4494: ${\it Cc}({\it SB}) =\bigcup_{B\in{\it SB}}{\it Cc}(B).$
4495: \end{definition}
4496:
4497:
4498: The operation $ {\tt SPLIT1}$ transforms an
4499: arbitrary abstract sequence into an equivalent enhanced abstract
4500: sequence.
4501: %This is realized by the operation $ {\tt SPLIT1}$ described below.
4502:
4503: \vskip0.3cm
4504: \noindent
4505: {\em Operation} $ \;{\tt SPLIT1}: {\it ASS}_D \rightarrow {\it ASS}^{\it enh}_D$\\
4506: \noindent
4507: This operation is required to satisfy the property that
4508: for every $B\in{\it ASS}_D$,
4509: $ {\it Cc}({\tt SPLIT1}(B))={\it Cc}(B)$.
4510: %It is defined as follows.
4511: Let $B=\langle \beta,m,M,t\rangle$. We define ${\it SB}'={\tt SPLIT1}(B)$
4512: as
4513: ${\it SB}'={\it SB}_1\cup{\it SB}_2$ where
4514: \vskip0.2cm
4515: \begin{quote}
4516: $\begin{array}{llll}
4517: {\it SB}_1 & = & \{\langle \beta_\emptyset,0,0,t\rangle\} &
4518: \myif m=0 \\
4519: & = & \emptyset & \otherwise \\
4520: {\it SB}_2 & = & \{\langle \beta,\max(1,m),M,t\rangle\} &
4521: \myif \beta\neq\beta_\emptyset
4522: \and \max(1,m)\leq M \\
4523: & = & \emptyset & \otherwise . \\
4524: \end{array}$
4525: \end{quote}
4526:
4527:
4528: \vskip0.2cm
4529:
4530: The operation {\tt MERGE} is the converse of {\tt SPLIT1}: it
4531: %
4532: %We also need a conversion operation that
4533: transforms an enhanced
4534: abstract sequence into a plain abstract sequence.
4535: %This operation is called
4536: %{\tt MERGE}.
4537: % and it is the
4538: %converse of {\tt SPLIT1}.
4539: Most of the time, this operation loses
4540: part of the information expressed by the enhanced abstract
4541: substitution sequence; but it does not lose any information
4542: when the enhanced
4543: abstract sequence results from a single application of {\tt SPLIT1}.
4544:
4545: \vskip0.3cm
4546: \noindent
4547: {\em Operation} $ \;{\tt MERGE}: {\it ASS}^{\it enh}_D \rightarrow {\it ASS}_D$\\
4548: The operation
4549: $ {\tt MERGE}$
4550: satisfies the following properties:
4551:
4552: \begin{enumerate}
4553: \itemsep 2pt
4554: \item For every ${\it SB}\in{\it ASS}^{\it enh}_D$,
4555: $ {\it Cc}({\it SB})\subseteq{\it Cc}({\tt MERGE}({\it SB}))$
4556: \item For every $B\in{\it ASS}_D$,
4557: $ {\it Cc}({\tt MERGE}({\tt SPLIT1}(B)))={\it Cc}(B)$.
4558: \end{enumerate}
4559:
4560:
4561:
4562:
4563:
4564: The definition of {\tt MERGE} requires choosing a particular abstract
4565: sequence $B_\emptyset$ such that ${\it Cc}(B_\emptyset)=\emptyset$.
4566: We decide that $B_\emptyset=\langle\beta_\emptyset,1,0,{\it st}\rangle$.
4567: This choice is arbitrary since there is no best (least)
4568: representation of the empty set of abstract sequences in this domain.
4569: Moreover, it uses the binary operation
4570: $ {\tt UNION}: ({\it AS}_D \times {\it AS}_D) \rightarrow {\it AS}_D$,
4571: which is inherited from
4572: our previous framework. The latter is
4573: extended to finite sequences
4574: of abstract substitutions as follows:
4575:
4576:
4577: \begin{quote}
4578: \begin{tabular}{lllll}
4579: ${\tt UNION}(<>)$ & $=$ & $\beta_\emptyset$ & \\
4580: ${\tt UNION}(<\beta>)$ & $=$ & $\beta,\;$
4581: for every
4582: $\beta\in {\it AS}_D$ \\
4583: ${\tt UNION}(<\beta_1,\dots,\beta_n>)$ & $=$ &
4584: ${\tt UNION}(\beta_1,{\tt UNION}(<\beta_2,\dots,\beta_n>))$, \\
4585: & & for all
4586: $\beta_1,\dots,\beta_n\in{\it AS}_D$ $(n\geq 2)$. \\
4587: \end{tabular}
4588: \end{quote}
4589:
4590:
4591: The operation {\tt MERGE} can now be defined.
4592: Let $\sqcup$ denote the least upper bound on {\it TI}.
4593: Let ${\it SB}\in{\it ASS}^{\it enh}_D$ such that
4594: ${\it SB}=\{B_1,\dots,B_n\}$ and
4595: $B_i=\langle \beta_i,m_i,M_i,t_i\rangle$ $(1\leq i\leq n)$.
4596: The abstract sequence $B'={\tt MERGE}({\it SB})$ is such that
4597:
4598: \begin{quote}
4599: $\begin{array}{llll}
4600: B' & = & B_\emptyset & \myif n=0 \\
4601:
4602: & = & B_1 & \myif n=1 \\
4603:
4604: & = & \langle {\tt UNION}(<\beta_1,\dots,\beta_n>) ,
4605: \min(m_1,\dots,m_n),\\
4606: & & \ \max(M_1,\dots,M_n),
4607: t_1\sqcup\dots \sqcup t_n\rangle & \myif n\geq 2.
4608: \end{array}$
4609: \end{quote}
4610:
4611: %The composition of {\tt MERGE} and {\tt SPLIT1} actually
4612: %{\em is} the normalizing operation.\\
4613:
4614: The notion of simple abstract sequence with cut information
4615: is also useful to simplify the case analysis in the
4616: implementation of {\tt CONC}.
4617:
4618:
4619: \begin{definition}[Simple Abstract Sequences with Cut Information]
4620:
4621: \noindent
4622: Let $B\in{ASS}_D$ and $ {\it acf}\in{\it ACF}$. The abstract sequence
4623: with cut information $\langle B,{\it acf} \rangle$ is said to be {\em simple}
4624: if $B$ is simple and ${\it acf}\in{\it CF}$.
4625: \end{definition}
4626:
4627:
4628: The operation {\tt SPLIT2}
4629: %Hence,
4630: %we need an operation to
4631: converts an arbitrary
4632: abstract sequence with cut information into an equivalent set of
4633: simple abstract sequences with cut information.
4634: %We call such operation
4635: %{\tt SPLIT2}.
4636:
4637: \vskip0.3cm
4638: \noindent
4639: {\em Operation} $ {\tt SPLIT2}: {\it ASSC}_D \rightarrow \wp({\it ASSC}_D)$\\
4640: The operation {\tt SPLIT2} satisfies the following
4641: properties.
4642: For every $C\in{\it ASSC}_D$,
4643: \begin{enumerate}
4644: \itemsep 2pt
4645: \item $\bigcup_{C'\in{\mbox{\footnotesize \tt SPLIT2}} (C)}
4646: %
4647: {\it Cc}(C') \,=\,{\it Cc}(C)$;
4648: \item all abstract sequences with cut information in ${\tt SPLIT2}(C)$
4649: are simple.
4650: \end{enumerate}
4651:
4652: \noindent
4653: Its definition is simple.
4654: We first
4655: apply the operation {\tt SPLIT1} to the abstract sequence part of $C$.
4656: Then we split the cut information.
4657: Finally we split the termination information. Formally,
4658: ${\tt SPLIT2}(C)$ is defined as follows.
4659:
4660: \begin{enumerate}
4661: \itemsep 0pt
4662: \item Let $C=\langle B, {\it acf}\rangle\in{\it ASSC}_D$. We define
4663: \begin{quote}
4664: %\hskip-0.7cm
4665: $\begin{array}{lll}
4666: {\tt SPLIT2}(C) & = & \bigcup_{B'\in
4667: \mbox{\footnotesize \tt SPLIT1}(B)}{\tt
4668: SPLIT2}(\langle B',{\it acf}\rangle).
4669: \end{array}$
4670: \end{quote}
4671: \item Let $B=\langle \beta,m,M,t\rangle\in{\it ASS}_D$. Assume that $B$ is semi-simple. We define
4672: \begin{quote}
4673: %\hskip-0.7cm
4674: $\begin{array}{llll}
4675: {\tt SPLIT2}(\langle B,{\it weakcut}\rangle) & = &
4676: {\tt SPLIT2}(\langle B,{\it nocut}\rangle)\cup
4677: {\tt SPLIT2}(\langle B,{\it cut}\rangle) & \myif m=0\\
4678: & = &
4679: {\tt SPLIT2}(\langle B,{\it cut}\rangle) & \myif m\geq 1.
4680: \end{array}$
4681: \end{quote}
4682:
4683: \noindent (Remember that, by Definition \ref{def:SNAS}, we
4684: also have $\beta=\beta_\emptyset$ and $M=0$, in the first case, and
4685: $\beta\neq\beta_\emptyset$ and $m\leq M$, in the second case.)
4686:
4687: \item Let $B=\langle \beta,m,M,t\rangle\in{\it ASS}_D$ and ${\it cf}\in{\it CF}$.
4688: Assume that $B$ is semi-simple. We define
4689: \begin{quote}
4690: %\hskip-0.7cm
4691: $\begin{array}{llll}
4692: {\tt SPLIT2}(\langle B,{\it cf}\rangle) & = &
4693: \{\langle B,{\it cf}\rangle\}
4694: & \myif t\in\{{\it snt},{\it st}\};\\
4695: & = &
4696: \{\langle \langle \beta,m,M,{\it snt}\rangle,{\it cf}\rangle,
4697: \langle \langle \beta,m,M,{\it st}\rangle,{\it cf}\rangle\}
4698: & \myif t={\it pt}.
4699: \end{array}$
4700: \end{quote}
4701: \end{enumerate}
4702:
4703:
4704:
4705: Before presenting the implementation of {\tt CONC},
4706: we still need to specify the operation {\tt EXCLUSIVE}, which
4707: is aimed at detecting incompatible outputs.
4708: An implementation of this operation for the domain {\tt Pattern} is
4709: given in Section \ref{sub:IP}.
4710:
4711: \vskip0.3cm
4712: \noindent
4713: {\em Operation} ${\;\tt EXCLUSIVE}:({\it AS}_D\times {\it AS}_D\times {\it AS}_D)
4714: \rightarrow {\it Bool}$\\
4715: The operation {\tt EXCLUSIVE} satisfies the following
4716: property.
4717: For all $\beta,\beta_1,\beta_2\in{\it AS}_D$,
4718: \begin{center}
4719: $\begin{array}{lll}
4720: {\tt EXCLUSIVE}(\beta,\beta_1,\beta_2) & \Rightarrow &
4721: \neg(\exists \theta\in{\it Cc}(\beta)
4722: ,\theta_1\in{\it Cc}(\beta_1)
4723: ,\theta_2\in{\it Cc}(\beta_2),
4724: \sigma_1,\sigma_2\in{\it SS}:\\\
4725: & & \theta\sigma_1=\theta_1
4726: \and \theta\sigma_2=\theta_2).
4727: \end{array}$
4728: \end{center}
4729:
4730:
4731:
4732: We are now ready to describe the operation {\tt CONC}.
4733:
4734: \vskip0.3cm
4735: \noindent
4736: {\em Operation} $\;{\tt CONC}: ({\it AS}_D \times {\it ASSC}_D
4737: \times {\it ASS}^{\it enh}_D)
4738: \rightarrow {\it ASS}^{\it enh}_D$.\\
4739: \noindent
4740: Let $\beta\in {\it AS}_D$, $C_1\in {\it ASSC}_D $ and
4741: ${\it SB}_2\in {\it ASS}^{\it enh}_D$.
4742: ${\it SB}'={\tt CONC}(\beta,C_1,{\it SB}_2)$ is defined as follows.
4743: We assume that $B_i=\langle \beta_i,m_i,M_i,t_i\rangle$.
4744: %The implementation of the operation {\tt CONC} is as follows.
4745: %When the second argument is simple and the third one is a singleton,
4746: %we perform a careful case analysis; in the general case, we split both
4747: %arguments, we compute the individual results, and we put all
4748: %the individual results
4749: %together in the final resulting set.
4750: %Formally, ${\it SB}'={\tt CONC}(\beta,C_1,{\it SB}_2)$
4751: %is defined as below where $B_i=\langle \beta_i,m_i,M_i,t_i\rangle$.
4752:
4753: \begin{enumerate}
4754: \itemsep 0pt
4755: \item
4756: Let us assume first that $C_1=\langle B_1,{\it acf}_1\rangle$ is simple and ${\it SB}_2=\{B_2\}
4757: $.
4758: \begin{enumerate}
4759: \itemsep 0pt
4760: \item Suppose that ${\it acf}_1={\it cut}$ or $t_1={\it snt}$. In
4761: this case, we define
4762: \begin{quote}
4763: %\hskip-0.8cm
4764: $\begin{array}{lll}
4765: {\it SB}'&= & \{B_1\}.
4766: \end{array}$
4767: \end{quote}
4768: \item Suppose, on the contrary, that ${\it acf}_1={\it nocut}$
4769: and $t_1={\it st}$. We define
4770: \begin{quote}
4771: \hskip-1.0cm
4772: $\begin{array}{llll}
4773: {\it SB}'&= & \{B_2\} & \myif M_1 = 0\\
4774:
4775: &= & \{\langle \beta_1,m_1,M_1,t_2\rangle\}
4776: & \myif M_1 \geq 1 \and M_2 =0\\
4777:
4778: &= & \{\langle {\tt UNION}(\beta_1,\beta_2),
4779: m_1+m_2,M_1+M_2,t_2\rangle\}
4780: & \myif M_1 \geq 1 \and M_2 \geq 1\\
4781: &&&\and \neg{\tt EXCLUSIVE}(\beta,\beta_1,\beta_2)\\
4782:
4783: &= & \emptyset
4784: & \myif M_1 \geq 1 \and M_2 \geq 1\\
4785: &&&\and {\tt EXCLUSIVE}(\beta,\beta_1,\beta_2).
4786: \end{array}$
4787: \end{quote}
4788: \end{enumerate}
4789: \item In the general case, we define
4790: \begin{quote}
4791: $
4792: \begin{array}{lll}
4793: {\it SB}'&= &
4794: \displaystyle\bigcup_{\renewcommand{\arraystretch}{0.1}
4795: \begin{array}{c}
4796: _{C\in\mbox
4797: {\footnotesize \tt SPLIT2}(C_1)}\\
4798: _{B\in{\it SB}_2}
4799: \end{array}
4800: \renewcommand{\arraystretch}{1}}
4801: {\tt CONC}(\beta,C,\{B\}).
4802: \end{array}$
4803: \end{quote}
4804: \end{enumerate}
4805:
4806:
4807: \subsection{Instantiation to {\tt Pattern}}
4808: \label{sub:IP}
4809:
4810: The domain of abstract substitutions {\tt Pattern} has been introduced
4811: in \cite{MusumbuThesis} and it has been used in many of our
4812: previous works, e.g., \cite{SPE,ACTA95}. The reader is referred
4813: to \cite{TOPLAS} for a detailed description of the domain and of its
4814: abstract operations.
4815:
4816: \vskip0.3cm
4817: \noindent
4818: {\bf The Abstract Domain Pattern.}
4819: %\label{subsub:ADP}
4820: \noindent
4821: The version of {\tt Pattern} used in the experimental
4822: evaluation of Section \ref{sub:EE}
4823: %is more expressive than the
4824: %version described in \cite{TOPLAS}. It contains an additional
4825: %so-called arithmetic component that keeps information about
4826: %executions of arithmetic built-ins such as {\tt >}.
4827: % this
4828: %information is used by the operation {\tt EXCLUSIVE}.
4829: %The domain is
4830: can be best viewed as an instantiation of the generic
4831: pattern domain $\pag$ \cite{POPL94,SCP00} with mode, sha\-ring, and
4832: arithmetic components.
4833:
4834: The key intuition behind $\pag$ is to represent information on some
4835: subterms occurring in a substitution instead of information on terms
4836: bound to variables only. More precisely, $\pag$ may associate the
4837: following information with each
4838: conside\-red
4839: subterm: (1)
4840: %\begin{enumerate}
4841: %\itemsep 0pt
4842: %\item
4843: its {\em pattern}, which specifies the main functor of the
4844: subterm (if any) and the subterms which are its arguments;
4845: %\item
4846: its {\em properties}, which are left unspecified and are
4847: given in the domain $\cal R$.
4848: %\end{enumerate}
4849: In addition
4850: to the above information, each variable in the domain of the
4851: substitution is associated with one of the subterms. It can be
4852: expressed that two arguments have the same value (and hence
4853: that two variables are bound together) by associating both arguments
4854: with the same subterm. It should be emphasized that the pattern
4855: information may be void. In theory, information on all subterms could
4856: be kept but the requirement for a finite analysis makes this
4857: impossible for almost all applications. As a consequence, the domain
4858: shares some features with the depth-k abstraction \cite{Kanamori87},
4859: although $\pag$ does not impose a fixed depth but adjusts it
4860: dynamically through upper bound and widening operations. Note that the
4861: identification of subterms (and hence the link between the structural
4862: components and the $\cal R$-domain) is a somewhat arbitrary choice. In
4863: \pag{}, subterms are identified by integer indices, say $1,\ldots,n$
4864: if $n$ subterms are considered, and we denote sets of
4865: indices by the symbol $I$.
4866:
4867: More formally,
4868: the pattern and same-value component
4869: can be described as follows.
4870: The {\em pattern} component is a partial function
4871: ${\it frm}: I \not\rightarrow {\it Pat}_I$,
4872: from the set of indices $I$ to the set
4873: of patterns over $I$, i.e., elements of the form $f(i_1,\ldots,i_n)$,
4874: where $f\in{\cal F}$ is a functor symbol of arity $n$
4875: and $i_1,\ldots,i_n \in I$. When the
4876: pattern is undefined for an index $i$, we write {\it frm($i$) $=$ undef}.
4877: The {\em same-value} component is a total function
4878: ${\it sv}: D \rightarrow I$,
4879: where $D = \{x_1,\ldots,x_n\}$ is the domain of the abstract
4880: substitution.
4881:
4882: A pattern component ${\it frm}: I \not\rightarrow {\it Pat}_I$
4883: denotes a set of families $(t_i)_{i\in I}$
4884: of terms as defined below.
4885: %Let
4886: %${\it frm}: I \not\rightarrow {\it Pat}_I$
4887: %and ${\cal F}$ be the set of all functor symbols.
4888: %
4889: \begin{quote}
4890: $\begin{array}{llll}
4891: {\it Cc}({\it frm}) & = &
4892: \{ (t_i)_{i\in I}\ | &
4893: {\it frm}(i) = f(i_1, \dots, i_n) \,\Rightarrow\,
4894: t_i = f(t_{i_1}, \dots, t_{i_n}),\\
4895: & & &
4896: \forall i, i_1, \dots, i_n \in I,
4897: \forall f \in {\cal F}\}.
4898: \end{array}$
4899: \end{quote}
4900:
4901: \noindent
4902: In order to simulate
4903: unification with
4904: occur-check, we also assume that every pattern
4905: component {\it frm} satisfies the following condition:
4906: the relation $\subc\subseteq I\times I$ such that
4907: $i\subc j$ if and only if ${\it frm}(i)$ is of the form
4908: $f(\dots, j, \dots)$ must be
4909: well-founded.
4910:
4911: A pair $\langle {\it sv},{\it frm} \rangle$ with
4912: ${\it sv}: D \rightarrow I$ and ${\it frm}: I \not\rightarrow {\it Pat}_I$
4913: is called
4914: {\em structural abstract substitution};
4915: it denotes a set of program substitutions as follows:
4916: \begin{quote}
4917: $\begin{array}{lll}
4918: \Cc{\langle {\it sv},{\it frm} \rangle} & = &
4919: \{ \theta \in {\it PS}_D \ | \
4920: \exists
4921: (t_i)_{i\in I}\in{\it Cc}({\it frm}):\;
4922: x_j\theta = t_{{\it sv}(x_j)},\;
4923: \forall x_j \in D \}.
4924: \end{array}$
4925: \end{quote}
4926:
4927:
4928:
4929: The $\cal R$-domain is the generic part which specifies subterm
4930: information by descri\-bing properties of a set of tuples $<t_1, \ldots,
4931: t_n>$ where $t_1, \ldots, t_n$ are terms. As a consequence, defining
4932: the $\cal R$-domain amounts essentially to defining a traditional
4933: domain on substitutions and its operations. We now describe the
4934: various components of the $\cal R$-domain which can be built as an
4935: open product \cite{POPL94,SCP00}.
4936:
4937: The {\em mode} component is described in \cite{TOPLAS} and associates a
4938: mode from the set $ {\it Modes} =
4939: \{ {\tt var}, {\tt ground}, {\tt novar}, {\tt noground}, {\tt ngv},
4940: {\tt gv}, {\tt any}\}$ with each subterm.
4941: Formally, it is a total function ${\it mo}: I \rightarrow {\it Modes}$
4942: whose concretization is defined as
4943:
4944: \begin{quote}
4945: $\begin{array}{lll}
4946: {\it Cc}({\it mo}) & = &
4947: \{ (t_i)_{i\in I}\ |\; \ t_i \in{\it Cc}({\it mo}(i)), \;\forall i \in I \}.
4948: \end{array}$
4949: \end{quote}
4950:
4951:
4952:
4953: The {\em sharing} component maintains information about possible sharing
4954: between pairs of subterms and is also described in \cite{TOPLAS}.
4955: Formally, it is a symmetrical relation ${\it ps}\subseteq I\times I$
4956: whose concretization is defined as
4957:
4958: \begin{quote}
4959: $\begin{array}{lll}
4960: {\it Cc}({\it ps}) & = &
4961: \{ (t_i)_{i\in I}\ |\; \
4962: {\it var}(t_i) \cap {\it var}(t_j) \Rightarrow {\it ps}(i,j),
4963: \;\forall i,j \in I \}.
4964: \end{array}$\\
4965: \end{quote}
4966:
4967:
4968:
4969: The {\em arithmetic} component is novel and aims at using arithmetic
4970: predicates to detect mutual exclusion between clauses. It approximates
4971: information about arithmetic relationships by rational order
4972: constraints, i.e., binary constraints of the form $i \ \delta \ j$ and
4973: unary constraints of the form $i \ \delta \ c$, where $i,j$ are
4974: indices, $\delta \in \{>,\geq,=,\neq,\leq,<\}$ and $c$ is an integer
4975: constant. For instance, a built-in $X \geq Y + 2$ is approximated by a
4976: constraint $X > Y$.
4977: Formally, an element {\it arithm}
4978: is a set of rational order constraints over indices, whose
4979: concretization is defined as follows (a constraint being
4980: satisfied only if the terms are numbers).
4981:
4982: \begin{quote}
4983: $\begin{array}{lll}
4984: {\it Cc}({\it arithm}) &= & \{
4985: (t_i)_{i\in I}\ |\; \ \forall \ i \ \delta \ j \in arithm:
4986: \; t_i \ \delta \ t_j \and \forall \ i \ \delta \ c \in
4987: arithm: \; t_i \ \delta \ c \}.
4988: \end{array}$
4989: \end{quote}
4990:
4991:
4992:
4993:
4994:
4995: \vskip0.3cm
4996: \noindent
4997: {\bf The Operation EXCLUSIVE.}
4998: We describe here the implementation of the ope\-ration {\tt EXCLUSIVE} on our
4999: domain of abstract substitutions. This operation was not present in our
5000: previous works.
5001: It aims at detecting situations where
5002: two output abstract sequences $B_1$ and $B_2$ are incompatible, given
5003: that they both originate from the same abstract input
5004: substitution $\beta$. Only the abstract substitution components
5005: $\beta_1$ and $\beta_2$ of $B_1$ and $B_2$ are useful to detect such
5006: situations. Thus the operation {\tt EXCLUSIVE} has three arguments
5007: $\beta$, $\beta_1$, and $\beta_2$. (See its specification in
5008: Section~\ref{sub:GAO}.)
5009:
5010: Let us first introduce the notion of
5011: decomposition of a program substitution
5012: with respect to a structural abstract substitution.
5013: It represents the family
5014: of terms, occurring in the program substitution, that are given an
5015: index by the structural abstract substitution.
5016:
5017: \begin{definition}[Decomposition of a Program Substitution]
5018: \label{def:DPS}
5019:
5020: \noindent
5021: Let $\langle {\it sv},{\it frm}\rangle $ be a structural abstract
5022: substitution over domain
5023: $D=\{x_1,\dots,x_n\}$ and set of indices $I$.
5024: Let also $\theta\in{\it Cc}\langle {\it sv},{\it frm}\rangle$.
5025: The {\em decomposition of $\theta$
5026: with respect to $\langle {\it sv},{\it frm}\rangle $} is the (unique)
5027: family of terms $(t_i)_{i\in I}$ such that
5028: \begin{quote}
5029: $\begin{array}{lll}
5030: \theta=\{x_1/t_{{\it sv}(x_1)},\dots,x_n/t_{{\it sv}(x_n)}\} & \and
5031: &
5032: (t_i)_{i\in I}\in{\it Cc}({\it frm}).
5033: \end{array}$
5034: \end{quote}
5035:
5036: Existence and unicity of the family $(t_i)_{i\in I}$ can be
5037: proven by an induction argument that uses the fact that
5038: the relation $\subc$ over $I$ is well-founded. Unicity holds conditional to the fact that $I$ does not contain any "useless" element, i.e., for eve\-ry $i \in I$, there exists a variable $x_j \in D$ and a set of indices $i_1,\dots,i_k$ such that
5039: $i_1 = {\it sv}(x_j)$,
5040: $i_1 \succ \dots \succ i_k$,
5041: and $i_k = i$. From now on we assume that this condition always holds.
5042: \end{definition}
5043:
5044:
5045: The next definition models a property of the structural abstract
5046: substitutions obtained by performing any number of abstract
5047: unification steps on another structural abstract substitution.
5048:
5049: %The following concept of instance of a structural
5050: %abstract substitution is also useful.
5051:
5052: \begin{definition}[Instance of a Structural Abstract Substitution]
5053: \label{def:ISAS}
5054: \noindent
5055: Let $\langle {\it sv},{\it frm}\rangle $ and
5056: $\langle {\it sv}',{\it frm}'\rangle $ be two structural abstract
5057: substitutions over the same domain
5058: $D=\{x_1,\dots,x_n\}$ and respective sets of indices $I$ and $I'$.
5059: Let also ${\it im}:I \rightarrow I'$ be a total function.
5060: We say that
5061: {\em $\langle {\it sv}',{\it frm}'\rangle $ is an instance of
5062: $\langle {\it sv},{\it frm}\rangle $ with respect to {\it im}} if the
5063: following conditions hold:
5064:
5065: \begin{enumerate}
5066: \itemsep 2pt
5067: \item ${\it sv}' = {\it im} \ \comp\ {\it sv}$;
5068: \item for all $i,i_1,\dots,i_m\in I,\\
5069: \begin{array}{lll}
5070: {\it frm}(i)= f(i_1,\dots,i_m) & \Rightarrow &
5071: {\it frm}'({\it im}(i))= f({\it im}(i_1),\dots,{\it im}(i_m)).
5072: \end{array}$
5073: \end{enumerate}
5074: \noindent
5075: Moreover, we say that
5076: {\em $\langle {\it sv}',{\it frm}'\rangle $ is an instance of
5077: $\langle {\it sv},{\it frm}\rangle $} if there exists a
5078: function ${\it im}$ such that the conditions hold.
5079: \end{definition}
5080:
5081:
5082: The next property holds.
5083:
5084:
5085:
5086: \begin{property}\label{prop:PI}
5087: Let $\langle {\it sv},{\it frm}\rangle $ and
5088: $\langle {\it sv}',{\it frm}'\rangle $
5089: be two structural abstract substitutions, and let
5090: ${\it im}: I \rightarrow I'$ be such that
5091: $\langle {\it sv}',{\it frm}'\rangle $ is an instance of
5092: $\langle {\it sv},{\it frm}\rangle $ with respect to {\it im}.
5093: Let also $\theta\in{\it Cc}\langle {\it sv},{\it frm}\rangle $,
5094: $\theta'\in{\it Cc}\langle {\it sv}',{\it frm}'\rangle $, and
5095: $\sigma \in {\it SS}$.
5096: Finally, let $(t_i)_{i\in I}$ and $(t'_{i})_{i\in I'}$ be the
5097: decompositions of $\theta$ and $\theta'$ with respect to
5098: $\langle {\it sv},{\it frm}\rangle $ and
5099: $\langle {\it sv}',{\it frm}'\rangle $, respectively.
5100: Then we have
5101: \begin{quote}
5102: $\begin{array}{lll}
5103: \theta' = \theta\sigma &\Rightarrow&
5104: (t_i\sigma)_{i\in I} = (t'_{{\it im}(i)})_{i\in I}.
5105: \end{array}$
5106: \end{quote}
5107: \end{property}
5108: %\vskip-0.5cm
5109: \noindent
5110: The proof is a simple induction on the well-founded relation $\subc$,
5111: induced on $I$ by~${\it frm}$.
5112:
5113:
5114:
5115: The next definitions and properties are instrumental to the
5116: implementation and correctness proof of the operation {\tt EXCLUSIVE}.
5117:
5118:
5119: \begin{definition}[Exclusive Pair of Indices]
5120: \label{def:EPI}
5121: \noindent
5122: Let $ {\it frm}_1$ and $ {\it frm}_2$ be two pattern components over sets
5123: of indices $I$ and $J$, respectively. Let also $i\in I$ and $j\in J$.
5124: \begin{enumerate}
5125: \itemsep 2pt
5126: \item We say that $\langle i,j\rangle$ is {\em directly exclusive} with respect to
5127: $\langle{\it frm}_1,{\it frm}_2\rangle$ iff ${\it frm}_1(i)=
5128: f(i_1,\ldots,i_p)$, ${\it frm}_2(j)=g(j_1,\ldots,j_{q})$
5129: and either $f \neq g$ or $p \neq q$.
5130: \item We say that $\langle i,j\rangle$ is {\em exclusive} with respect to
5131: $\langle{\it frm}_1,{\it frm}_2\rangle$ iff $\langle i,j\rangle$ is
5132: {\em directly exclusive} with respect to $\langle{\it frm}_1,{\it frm}_2\rangle$,
5133: or ${\it frm}_1(i)=f(i_1,\ldots,i_p)$, ${\it frm}_2(j)=f(j_1,\ldots,j_p)$ and there exists
5134: $k: 1 \leq k \leq p$ such that $\langle i_k,j_k\rangle$ is exclusive with respect to
5135: $\langle{\it frm}_1,{\it frm}_2\rangle$.
5136: \end{enumerate}
5137: \end{definition}
5138:
5139: \vskip-0.3cm
5140: \begin{property}\label{prop:EPI}
5141: Let $ {\it frm}_1$ and $ {\it frm}_2$ be two pattern components over sets
5142: of indices $I$ and $J$, respectively.
5143: Let $(t_i)_{i\in I}\in {Cc}({\it frm}_1)$ and
5144: $(t_j)_{j\in J}\in {Cc}({\it frm}_2)$. Let also $i\in I$ and~$j\in J$.
5145: \begin{enumerate}
5146: \itemsep 2pt
5147: \item If the pair $\langle i,j\rangle$ is {\em directly exclusive} with respect to
5148: $\langle{\it frm}_1,{\it frm}_2\rangle$, then the terms $t_i$
5149: and $t_{j}$ are compound and they have distinct principal functors.
5150: \item If the pair $\langle i,j\rangle$ is {\em exclusive} with respect to
5151: $\langle{\it frm}_1,{\it frm}_2\rangle$, then the terms $t_i$
5152: and $t_{j}$ are distinct ($t_i \neq t_{j}$).
5153: \end{enumerate}
5154: \end{property}
5155:
5156:
5157:
5158: We are now in position to provide the implementation of the operation
5159: {\tt EXCLUSIVE} for the domain {\tt Pattern}.
5160: We just show here a partial implementation which only uses the pattern,
5161: same-value, and mode components but it gives the
5162: idea behind the complete implementation. For additional details, the
5163: reader is referred to
5164: \cite{Braem.Modard94}.
5165:
5166: \vskip0.3cm
5167: \noindent
5168: {\em Operation} $\;{\tt
5169: EXCLUSIVE}:{\tt Pattern}\times {\tt Pattern}\times{\tt Pattern}\rightarrow {\it Bool}$\\
5170: \noindent
5171: Let $\beta,\beta_1,\beta_2$ be abstract substitutions over the same
5172: domain $D$ and sets of indices $I$, $I_1$, and $I_2$, respectively.
5173: Assume that $\langle {\it sv}_1, {\it frm}_1 \rangle$ and
5174: $\langle {\it sv}_2, {\it frm}_2 \rangle$
5175: are instances of $\langle {\it sv}, {\it frm} \rangle$ with respect to
5176: ${\it im}_1$ and ${\it im}_2$, respectively.
5177: The value of
5178: {\tt EXCLUSIVE}$(\beta,\beta_1,\beta_2)$ is {\it true} if and only if there
5179: exists
5180: $i \in I $ such that
5181: \begin{enumerate}
5182: \itemsep 2pt
5183: \item ${\it mo}(i) \in \{{\tt ngv},{\tt novar}\}$ and the pair
5184: $\langle{\it im}_1(i),{\it im}_2(i)\rangle$ is directly exclusive
5185: with respect to $\langle {\it frm}_1,{\it frm}_2\rangle$, or
5186: \item ${\it mo(i)} = {\tt ground}$ and the pair
5187: $\langle{\it im}_1(i),{\it im}_2(i)\rangle$ is exclusive
5188: with respect to $\langle {\it frm}_1,{\it frm}_2\rangle$.
5189: \end{enumerate}
5190:
5191: %\noindent
5192: %The value of
5193: %{\tt EXCLUSIVE}$(\beta,\beta_1,\beta_2)$ is {\it false}, otherwise.
5194:
5195: \noindent
5196: Correctness of the implementation
5197: follows from Properties \ref{prop:PI} and \ref{prop:EPI}; see
5198: \cite{sequence}.\\
5199:
5200: %\noindent The implementation of {\tt EXCLUSIVE} presented here does
5201: %not make use of the {\it arithm} component while the implementation
5202: %used in the experimental evaluation of the next section does.
5203: %Additional details are available in \cite{Braem.Modard94}.
5204: %\\
5205:
5206: \noindent
5207: {\bf Prolog's Built-in Predicates.}
5208: Prolog's built-in predicates such as test predicates
5209: ({\tt var}, {\tt ground}, and the like) or arithmetic predicates
5210: ({\tt is}, {\tt <}, \dots) can be handled in essentially the same way
5211: as abstract unification. Our implementation actually includes abstract
5212: operations that deal with test and arithmetic predicates
5213: \cite{Braem.Modard94}. Other built-in predicates can be
5214: accommodated as well, including the predicates {\tt assert}
5215: and {\tt retract}. However, the treatment of the latter predicates
5216: assumes that dynamic predicates are disjoint from static predicates,
5217: i.e., it assumes that the underlying program $P$ is not modified.
5218: A more satisfactory treatment of dynamic predicates requires to
5219: introduce a new abstract object representing the dynamic program;
5220: this improvement is a topic for further work.
5221:
5222: \subsection{Experimental Evaluation}\label{sub:EE}
5223: The experimental results presented in this section provide evidence of
5224: the fact that the approach presented in this paper allows one
5225: to integrate predicate level analysis to existing variable
5226: level analysis at a reasonable implementation cost.
5227: Compari\-sons with other cardinality and determinacy analyses
5228: can be found in Section \ref{sec:RDASBAI}.\\
5229:
5230:
5231: \begin{table}
5232: \caption{Efficiency of the Cardinality Analysis}
5233: \label{efficiency}
5234: %\footnotesize
5235: \centering
5236: \begin{tabular}{||l||r|r||r|r|r|r||r|r|r|r||}
5237: %\hline \hline
5238: \multicolumn{11}{c}{} \\
5239: \cline{1-11}
5240: \multicolumn{1}{||c||}{} & \multicolumn{2}{c||}{} & \multicolumn{4}{c||}{} &
5241: \multicolumn{4}{c||}{} \\
5242: \multicolumn{1}{||c||}{} &\multicolumn{2}{c||}{OR} & \multicolumn{4}{c||}{PC} &
5243: \multicolumn{4}{c||}{PCA} \\
5244: \cline{1-11}
5245: & & & & & & & & & & \\
5246: Programs & I & T & I & T & IR & TR & I & T & IR & TR \\
5247: %\hline
5248: \cline{1-11}
5249: & & & & & & & & & & \\
5250: Qsort &
5251: 13 & 0.08 & 17 & 0.12 & 1.31 & 1.50 & 13 & 0.08 & 1.00 & 1.00\\
5252: Qsort2 &
5253: 15 & 0.08 & 19 & 0.12 & 1.27 & 1.50 & 15 & 0.09 & 1.00 & 1.13 \\
5254: Queens &
5255: 15 & 0.07 & 18 & 0.08 & 1.20 & 1.14 & 18 & 0.10 & 1.20 & 1.43 \\
5256: Press1 &
5257: 532 & 11.77 & 581 & 13.11 & 1.09 & 1.11 & 581 & 13.45 & 1.09 & 1.14
5258: \\
5259: Press2 &
5260: 197 & 3.27 & 200 & 3.56 & 1.02 & 1.09 & 200 & 3.56 & 1.02 & 1.09 \\
5261: Gabriel &
5262: 78
5263: & 0.90 & 84 & 1.00 & 1.08 & 1.11 & 84 & 0.98 & 1.08 & 1.09 \\
5264: Peep &
5265: 132 & 3.21 & 131 & 18.85 & 0.99 & 5.87 & 131 & 19.08 & 0.99 & 5.94 \\
5266: Read &
5267: 432 & 23.91 & 458 & 25.32 & 1.06 & 1.06 & 458 & 25.37 & 1.06 & 1.06
5268: \\
5269: Kalah &
5270: 115 & 1.90 & 121 & 2.09 & 1.05 & 1.10 & 120 & 2.11 & 1.04 & 1.11 \\
5271: Cs &
5272: 79 & 2.19 & 91 & 3.05 & 1.15 & 1.39 & 90 & 3.02 & 1.14 & 1.38 \\
5273: Plan &
5274: 36 & 0.21 & 38 & 0.30 & 1.06 & 1.43 & 38 & 0.27 & 1.06 & 1.29 \\
5275: Disj &
5276: 64 & 1.95 & 68 & 2.14 & 1.06 & 1.10 & 68 & 2.12 & 1.06 & 1.09 \\
5277: Pg &
5278: 38 & 0.32 & 40 & 0.36 & 1.05 & 1.13 & 39 & 0.35 & 1.03 & 1.09 \\
5279: Boyer &
5280: 56 & 0.76 & 56 & 1.15 & 1.00 & 1.51 & 56 & 1.17 & 1.00 & 1.54 \\
5281: Credit &
5282: 63 & 0.57 & 64 & 0.81 & 1.02 & 1.42 & 64 & 0.80 & 1.02 & 1.40\\
5283: & & & & & & & & & &
5284: %\hline
5285: \\
5286: \cline{1-11}
5287: & & & & & & & & & &\\
5288: Mean & & & & & 1.09 & 1.56 & & & 1.05 & 1.52 \\
5289: %
5290: \cline{1-11}
5291: \end{tabular}
5292: %\caption{Efficiency of the Cardinality Analysis}
5293: %\label{efficiency}
5294: \end{table}
5295:
5296: \noindent
5297: {\bf Benchmarks.} Our experiments use our traditional benchmarks
5298: except that cuts have been reinserted as in the original versions.
5299: In addition, some
5300: new programs have been added. {\tt Boyer} is a theorem-prover from the
5301: DEC-10 benchmarks, {\tt Credit} is an expert system from \cite{Sterling86}.
5302: There are two versions of {\tt Qsort}
5303: which differ in procedure {\tt Partition} which uses or does not use
5304: auxiliary predicates for the arithmetic built-ins.
5305: All the benchmarks are available by anonymous ftp from
5306: ftp://ftp.info.fundp.ac.be/pub/users/ble/bench.p.
5307: They have been run on a SUN SS-10/20.
5308:
5309:
5310: \vskip0.3cm
5311: \noindent
5312:
5313: \noindent
5314: {\bf Efficiency.} The efficiency results are reported in Table
5315: \ref{efficiency}. Several algorithms are compared: {\tt OR} is the
5316: original
5317: {\tt GAIA} algorithm on {\tt Pattern} \cite{TOPLAS}, {\tt PC} is the
5318: cardinality analysis with {\tt Pattern} and {\tt PCA} is {\tt PC} with
5319: the abstraction for arithmetic predicates.
5320: I, T, IR and TR are the number of ite\-ra\-tions, the execution time
5321: (in seconds),
5322: the iteration's ratio and the time's ratio respectively.
5323: The first interesting
5324: point to notice is the slight increase (about 5\% on {\tt PCA}) in
5325: iterations when moving from abstract substitutions to abstract
5326: sequences, sho\-wing the effectiveness of our widening operator. Even
5327: more important perhaps is the fact that the time overhead of the
5328: cardinality analysis is small with respect to the traditional analysis:
5329: {\tt PCA}
5330: is 1.52 slower than {\tt OR}. Note that in fact most programs enjoys
5331: an even smaller overhead but {\tt Peep} is about 6 times slower than
5332: {\tt
5333: OR} in {\tt PCA}. This comes from many procedures with many clauses,
5334: most of which being not surely cut; much time is spent in the
5335: concatenation operation. Finally, note that adding
5336: more functionality in the domain did not slow down the analysis by
5337: much.
5338:
5339:
5340:
5341:
5342: \begin{table}
5343: \caption{Accuracy of the Cardinality Analysis}
5344: \label{accuracy}
5345: %\footnotesize
5346: \centering
5347: \begin{tabular}{||l|l|r||r|r||r|r||r|r||r|r||}
5348: %\hline \hline
5349: \multicolumn{11}{c}{} \\
5350: \cline{1-11}
5351: \multicolumn{1}{||c|}{} &\multicolumn{1}{c|}{} & \multicolumn{1}{c||}{} & \multicolumn{2}{c||}{} & \multicolumn{2}{c||}{} &
5352: \multicolumn{2}{c||}{}
5353: & \multicolumn{2}{c||}{} \\
5354: \multicolumn{1}{||c|}{} &\multicolumn{1}{c|}{} & \multicolumn{1}{c||}{} & \multicolumn{2}{c||}{P} & \multicolumn{2}{c||}{C} &
5355: \multicolumn{2}{c||}{PC}
5356: & \multicolumn{2}{c||}{PCA} \\
5357: \cline{1-11}
5358: & & & & & & & & & & \\
5359: Programs & Query & NP & D & \%D & D & \%D & D & \%D & D & \%D \\
5360: %\hline
5361: \cline{1-11}
5362: & & & & & & & & & & \\
5363: Qsort & {\tt qsort(g,v)} & 3 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 100 \\
5364: Qsort2 & {\tt qsort(g,v)} & 5 & 2 & 40 & 2 & 40 & 2 & 40 & 5 & 100 \\
5365: Queens &{\tt queens(g,v)} &5 & 2 & 40 & 0 & 0 & 2 & 40 & 2 & 40 \\
5366: Press1 & {\tt test\_press(v,v)}&
5367: 47 & 8 & 17 & 19 & 40 & 19 & 40 & 19 & 40 \\
5368: Press2 & {\tt test\_press(v,v)}&
5369: 47 & 12 & 26 & 19 & 40 & 28 & 60 & 28 & 60 \\
5370: Gabriel & {\tt main(v,v)}&
5371: 17 & 0 & 0 & 4 & 24 & 4 & 24 & 4 & 24 \\
5372: Peep & {\tt comppeeppopt(g,v,g)}&
5373: 24 & 4 & 17 & 7 & 29 & 16 & 67 & 16 & 67 \\
5374: Read & {\tt read(v,v)} &
5375: 46 & 11 & 24 & 27 & 59 & 31 & 67 & 31 & 67 \\
5376: Kalah & {\tt play(v,v)}& 46 & 16 & 35 & 20 & 43 & 33 & 72 & 40 & 87 \\
5377: Cs & {\tt pgenconfig(v)} & 32 & 11 & 34 & 7 & 22 & 11 & 34 & 13 & 41 \\
5378: Plan & {\tt transform(g,g,v)} &
5379: 13 & 1 & 8 & 0 & 0 & 1 & 8 & 1 & 8 \\
5380: Disj & {\tt top(v)} & 28 & 13 & 46 & 11 & 39 & 13 & 46 & 13 & 46 \\
5381: Pg & {\tt pdsbm(g,v)} & 10 & 2 & 20 & 3 & 30 & 5 & 50 & 6 & 50 \\
5382: Boyer & {\tt boyer(g)} & 24 & 0 & 0 & 20 & 83 & 20 & 83 & 20 & 83 \\
5383: Credit & {\tt credit(a,a)} &
5384: 26 & 14 & 58 & 11 & 42 & 14 & 54 & 16 & 62 \\
5385: & & & & & & & & & & \\
5386: %\hline
5387: \cline{1-11}
5388: & & & & & & & & & &\\
5389: Mean & & & & 24 & & 33 & & 46 & & 58 \\
5390: %\hline
5391: \cline{1-11}
5392: \end{tabular}
5393: %\caption{Accuracy of the Cardinality Analysis}
5394: %\label{accuracy}
5395: \end{table}
5396:
5397:
5398: \vskip0.3cm
5399: \noindent
5400:
5401: \noindent
5402: {\bf Accuracy.}
5403: The accuracy results are reported in Table \ref{accuracy}.
5404: For each program we speci\-fy the initial query to which the abstract
5405: interpretation algorithm is applied (we denote by {\tt a}, {\tt g}
5406: and {\tt v} the modes {\tt any}, {\tt ground} and {\tt var}, respectively).
5407: Several
5408: versions of the algorithm are compared with respect to their ability to
5409: detect
5410: determinacy of procedures, which was our primary motivation. {\tt P}
5411: is using only the domain {\tt Pattern} (i.e., cuts are ignored), {\tt
5412: C} is only using the cut (i.e., {\tt EXCLUSIVE} always returns false),
5413: and {\tt PC, PCA} are defined as previously.
5414: In the table, NP stands for the number of procedures and
5415: D and \%D denote
5416: the number of procedures and the percentage of
5417: procedures, respectively, that are detected to be
5418: deterministic
5419: by the algorithms.
5420: There are several
5421: interesting points to notice. First, {\tt PCA} detects that 58\% of
5422: the procedures are deterministic, although many of these programs in
5423: fact use heavily the nondeterminism of Prolog. Most of the results are
5424: optimal and a nice example is the program {\tt Kalah}. Second, the
5425: cut and input/output patterns are really complementary to improve the
5426: analysis. Input/output patterns alone give 41\% of the deterministic
5427: procedures (i.e., those detected by {\tt PCA}),
5428: while the cut detects 57\% of the deterministic
5429: procedures. The abstraction of arithmetic predicates
5430: adds 21\% of deterministic procedures\footnote{Notice that
5431: 24/58=0.41, 33/58=0.57 and (58-46)/58=0.21.
5432: The inequality 41+57+21$\not=$100 can be understood by the fact
5433: that the analysis computed by P, C and A (the latter being the algorithm that
5434: only considers the
5435: arithmetic predicates)
5436: are not completely exclusive.}.
5437: The main lesson here is that all
5438: components are of primary importance to obtain precise results.
5439:
5440:
5441:
5442:
5443: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
5444: \section{Retaled works on determinacy analysis}
5445: \label{sec:RDASBAI}
5446: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
5447:
5448: Determinacy of logic programs in general and of Prolog programs in
5449: particular
5450: is an important research topic because determinate programs can be
5451: implemented more efficiently than non-determinate programs
5452: (often, much more efficiently).
5453: Several forms of determinacy have been identified, which lead to
5454: different kinds of optimizations.
5455: In this section, we review a few
5456: interesting papers on determinacy analysis at the light of our novel
5457: framework for the abstract interpretation of Prolog.
5458: The benefit of this study is twofold:
5459: first, it sheds new light on these analyses in the
5460: context of abstract interpretation;
5461: second, it supports the claim that our proposal is appropriate to
5462: integrate most existing analyses into a single framework.
5463:
5464:
5465:
5466: \subsection{Sahlin's Determinacy Analysis for Full Prolog}\label{sub:SDAFP}
5467:
5468:
5469: The analysis proposed by D. Sahlin \shortcite{Sahlin91} aims at
5470: detecting procedures of a (full) Prolog program that are determinate
5471: (i.e., they succeed at most once) or fully-determinate
5472: (i.e., they succeed exactly once).
5473: The analysis is developed in the context of the partial evaluator
5474: Mixtus \cite{Sahlin.PhD} in order to detect situations where cuts
5475: can be ``executed'' or removed.
5476: Sahlin's analysis is not based on abstract interpretation;
5477: hence he provides a specific correctness proof for it. \\
5478: In this section, we show that the determinacy analysis
5479: proposed by Sahlin \shortcite{Sahlin91} is indeed an instance of our
5480: framework over his abstract domain.
5481:
5482:
5483: \vskip0.3cm
5484: \noindent
5485: {\bf Abstract Domains.}
5486: \noindent
5487: Sahlin's analysis completely ignores information
5488: on program variables. The
5489: abstract domains are
5490: concerned with the sequence structure only: substitutions are completely
5491: ignored. Note that no abstract interpretation framework
5492: available at the time of his writing
5493: was adequate to his needs.
5494:
5495: \vskip0.3cm
5496: \noindent
5497: {\bf Abstract Substitutions.}
5498: %According to Sahlin's analysis, the transformations performed by
5499: %the partial evaluator ensure that the program is sufficiently
5500: %``instantiated'' to obtain an accurate analysis. This is modeled
5501: %in our framework by choosing sets ${\it AS}_D$ consisting of a single element.
5502: Since program variables are ignored, we can assume a domain
5503: ${\it AS}$ consisting of an arbitrary single element.
5504: %also drop the subscript
5505: %$\scriptstyle D$ in the following.
5506:
5507: \vskip0.3cm
5508: \noindent
5509: {\bf Abstract Sequences.}
5510: Sahlin's analysis can be formalized in our framework by defi\-ning
5511: ${\it ASS}=\wp({\it AASS})$, where
5512: ${\it AASS}=\{{\cal L}, 0, 1, 1', 2, 2'\}$\footnote{We
5513: choose to denote the elements of {\it AASS} by the same
5514: symbols as in \cite{Sahlin91}.}.
5515: We call elements of {\it AASS}, {\em atomic abstract sequences}.
5516: Their concretization is defined as follows:
5517: %function ${\it Cc}:{\it AASS}\rightarrow \wp({\it
5518: % PSS})$ is defined by:
5519: \vskip0.2cm
5520: \begin{quote}
5521: $\begin{array}{lllllll}
5522: {\it Cc}({\cal L}) & = & \{ <\bot> \} \\
5523: {\it Cc}(0) & = & \{ \ <> \ \} \\
5524: {\it Cc}(1) & = & \{ S\in {\it PSS} \mid {\it Ns}(S)=1
5525: \and S\mbox{\rm ~is~finite}\} \\
5526: {\it Cc}(1') & = & \{ S\in {\it PSS} \mid {\it Ns}(S)=1
5527: \and S\mbox{\rm ~is~incomplete}\} \\
5528: {\it Cc}(2) & = &
5529: \{ S\in {\it PSS} \mid {\it Ns}(S)>1
5530: \and S\mbox{\rm ~is~finite}\} \\
5531: {\it Cc}(2') & = &
5532: \{ S\in {\it PSS} \mid {\it Ns}(S)>1
5533: \and S\mbox{\rm ~is~incomplete~or~infinite}\}
5534: \end{array}$
5535: \end{quote}
5536: \vskip0.2cm
5537: \noindent
5538: %An abstract sequence $B$ represents a set of program substitution
5539: %sequences as below.
5540: The concretization function ${\it Cc}:{\it ASS}\rightarrow \wp({\it
5541: PSS})$ is defined by:
5542: %\vskip0.2cm
5543: \begin{quote}
5544: $\begin{array}{lll}
5545: {\it Cc}(B) & = &
5546:
5547: \bigcup_{b\in B} {\it Cc}(b).
5548:
5549: \end{array}$
5550: \end{quote}
5551: %\vskip0.2cm
5552: \noindent The relation $\leq$ on {\it ASS} is naturally defined as
5553: being set inclusion. The
5554: concretization function is thus clearly monotonic.
5555:
5556: \vskip0.3cm
5557: \noindent
5558: {\bf Abstract Sequences with Cut Information.}
5559: We define the set ${\it ASSC}$ as being
5560: equal to $\wp({\it AASS}\times {\it CF})$.
5561: The elements of ${\it ASSC}$ are denoted by
5562: ${\cal L}_{n}$, $0_{n}$, $1_{n}$, $1'_{n}$, $2_{n}$, $2'_{n}$,
5563: ${\cal L}_{c}$, $0_{c}$, $1_{c}$, $1'_{c}$, $2_{c}$, $2'_{c}$,
5564: in \cite{Sahlin91}, where the index $n$ stands for {\it nocut}, while the index
5565: $c$ stands for {\it cut}.
5566: The concretization function is defined in the obvious way.
5567:
5568: \vskip0.3cm
5569: \noindent
5570: {\bf Extended Widening.}
5571: %\label{subsub:EWS}
5572: \noindent
5573: In order to instantiate our generic abstract interpretation algorithm
5574: to the above domains, it remains to provide an implementation of the
5575: various abstract operations.
5576: This can be done systematically from the specifications of the
5577: operations
5578: and the domain
5579: definitions; we leave it as an exercise to the reader, except for
5580: the extended widening, whose implementation is not obvious.
5581: The basic intuition behind the extended widening is that it should
5582: ``observe'' how the abstract sequences evolve between the consecutive iterations
5583: in order to ensure convergence when enough accuracy seems to be
5584: attained.
5585: In this abstract domain, the abstract sequence $B_i$ produced at
5586: step $i$ may intuitively differ from $B_{i-1}$ by the fact that some
5587: ``incomplete'' elements (i.e., ${\cal L}$, $1'$, $2'$) can be removed
5588: and replaced by more ``complete'' ones.
5589: Of course the computation starts with $B_0=\{{\cal L}\}$.
5590: Thus the algorithm waits until ``enough incomplete elements have been removed''
5591: and then accumulates the next iteration results to enforce termination.
5592: This can be formalized by defining a pre-order
5593: $\cleq$ over {\it ASS}
5594: such that $B_1\cleq B_2$ holds when $B_2$ only contains elements that
5595: are ``more complete'' than some elements of $B_1$ and when, conversely,
5596: $B_1$ only contains elements that
5597: are ``less complete'' than some elements of $B_2$.
5598: We first define the relation {\em is strictly less complete than} between
5599: atomic abstract sequences by the table:
5600:
5601: \begin{quote}
5602: $\begin{array}{llllllllll}
5603:
5604: {\cal L} \clt 0 &\; {\cal L} \clt 1 & \;{\cal L} \clt 1' & \;
5605: {\cal L} \clt 2 &\;
5606: {\cal L} \clt 2'&\;
5607: 1' \clt 1 & \; 1' \clt 2 & \; 1' \clt 2' &\;
5608: 2' \clt 2.
5609: \end{array}$
5610: \end{quote}
5611:
5612: \noindent
5613: Then, for all atomic abstract sequences $b_1$ and $b_2$, we say that
5614: {\em $b_1$ is less complete than $b_2$}, denoted by
5615: $b_1\cleq b_2$, if $b_1 = b_2$ or $b_1\clt b_2$.
5616: This relation is lifted to general abstract sequences as follows:
5617:
5618: \begin{definition}[Computational Pre-Ordering]
5619: \label{def:CO}
5620: \noindent Let $B_1,B_2\in {\it ASS}$. By definition,
5621: \begin{quote}
5622: $\begin{array}{lll}
5623: B_1\cleq B_2 & \myiff & (\forall b_1\in B_1,\,
5624: \exists b_2\in B_2 \mbox{ such that }
5625: b_1 \cleq b_2) \and\\
5626: & & (\forall b_2\in B_2,\,
5627: \exists b_1\in B_1
5628: \mbox{ such that } b_1 \cleq b_2).
5629: \end{array}$
5630: \end{quote}
5631:
5632: %\noindent
5633: We write $B_1\clt B_2$ to denote the condition
5634: ($B_1 \cleq B_2 \and B_2 \not\cleq B_1$).
5635: \end{definition}
5636: \vskip-0.3cm
5637:
5638: We are now in position to define the extended widening.
5639:
5640: \begin{definition}
5641: [Extended Widening for Sahlin's Domain: $B'=B_{\it new}\nabla B_{\it old}$]
5642: \label{imp:EWSD}
5643: \begin{quote}
5644: $\begin{array}{llll}
5645: B' & = & B_{\it new} & \myif B_{\it old} \clt B_{\it new},\\
5646: & = & B_{\it new}\cup B_{\it old} & \otherwise.
5647: \end{array}$
5648: \end{quote}
5649: \end{definition}
5650:
5651: %\noindent
5652: In fact, the above operation does not
5653: fulfill, strictly speaking, the requirements for being an extended
5654: widening. It works however if we have $B_{\it old}\cleq B_{\it new}$ each time it is
5655: applied. This is normally the case if the other abstract
5656: operations are carefully implemented, since each iteration of the
5657: abstract interpretation algorithm should
5658: replace every element in
5659: $B_{\it old}$ by one or several more complete elements.
5660: Before stating
5661: %precisely
5662: what is it actually achieved by the
5663: operation $\nabla$, we need two definitions.
5664:
5665: \begin{definition}[Equivalent Abstract Sequences]
5666: \label{def:EAS}
5667: Let $B_1, B_2\in {\it ASS}$. By definition,
5668: \begin{quote}
5669: $\begin{array}{lll}
5670: B_1 \eqass B_2 & \myiff & B_1 \cleq B_2 \and B_2 \cleq B_1 .
5671: \end{array}$
5672: \end{quote}
5673: \end{definition}
5674:
5675: %\noindent
5676: The relation $\eqass$ is an equivalence because $\cleq$ is a
5677: pre-order. It can be shown that $\eqass$ determines $42$ equivalence
5678: classes, of which 28 are a singleton (e.g., $\{\{{\cal L}, 0, 1'\}\}$),
5679: 10 have 2 elements (e.g., $\{\{{\cal L}, 0, 2'\},\{{\cal L}, 0, 1',
5680: 2'\}\}$),
5681: and 4 have 4 elements (e.g., $\{\{{\cal L}, 0, 2\},
5682: \{{\cal L}, 0, 2, 2'\},
5683: \{{\cal L}, 0, 1', 2\},
5684: \{{\cal L}, 0, 1', 2, 2'\}\}$).
5685: It is also important to note that distinct equivalent abstract
5686: sequences always have different concretizations.
5687:
5688:
5689: \begin{definition}[Strengthened Computational Ordering]
5690: \label{prop:SCO}
5691: \noindent
5692: Let $B_1, B_2\in {\it ASS}$. By definition,
5693: \begin{quote}
5694: $\begin{array}{lll}
5695: B_1 \scleq B_2 & \myiff & B_1\clt B_2 \myor
5696: (B_1 \eqass B_2 \and B_1 \subseteq B_2).
5697: \end{array}$
5698: \end{quote}
5699: \end{definition}
5700:
5701: %\noindent
5702: The relation $\scleq$ is an order; every ascending
5703: sequence
5704: $B_1 \scleq B_2 \scleq \dots \scleq B_i \dots\,$ is stationary
5705: since {\it ASS} is finite.
5706:
5707: \begin{property}[Conditional Convergence of the
5708: Extended Widening]
5709: \label{prop:CCEW}
5710: Let $\{B_i\}_{i\in{\bf N}}$ and $\{B'_i\}_{i\in{\bf N}}$ be two
5711: sequences of elements of {\it ASS} such that
5712: \begin{enumerate}
5713: \itemsep 3pt
5714: \item $B'_i \cleq B_{i+1},$ for all $i\in{\bf N}$;
5715: \item $B'_{i+1}=B_{i+1}\nabla B'_i,$ for all $i\in{\bf N}$.
5716: \end{enumerate}
5717:
5718: \noindent Then we have $B_i\leq B'_i$, for all $i\in{\bf N}^\ast$,
5719: and the sequence
5720: $\{B'_i\}_{i\in{\bf N}}$ is stationary.
5721: \end{property}
5722:
5723: \begin{proof}
5724: The fact that $B_i\leq B'_i$, for all $i\in{\bf N}^\ast$, is a direct
5725: consequence of the definition of the operation $\nabla$.
5726: Moreover, the hypotheses on the sequences ensure that
5727: $B'_1\scleq B'_2 \scleq \dots \scleq B'_i \dots\,$; thus the sequence
5728: $\{B'_i\}_{i\in{\bf N}}$ is stationary.
5729: \end{proof}
5730:
5731: \vskip0.3cm
5732:
5733: If all abstract operations are congruent with respect to $\cleq$
5734: \footnote{We would have written {\em monotonic} if the relation $\cleq$ was an
5735: order, not a pre-order only.},
5736: each iteration of the
5737: abstract interpretation algorithm ensures that $B_{\it old}\cleq B_{\it new}$,
5738: where $B_{\it old}$ is the current value in {\it sat} and $B_{\it new}$ is the newly
5739: computed
5740: abstract sequence. Thus, Proper\-ty~\ref{prop:CCEW} guarantees
5741: termination of the abstract interpretation algorithm.
5742: Congruence of the abstract operations with respect to $\cleq$
5743: is ensured if they are
5744: ``as accurate as possible'' (which is achieved in \cite{Sahlin91});
5745: however,
5746: proving this property entails a lot of work.
5747: A
5748: simpler solution consists of testing whether $B_{\it old}\cleq B_{\it new}$
5749: actually holds before each application of the extending widening. If
5750: the condition does not hold, we switch to a cruder form of widening, which
5751: simply merges all successive results.
5752:
5753: \vskip0.3cm
5754: \noindent
5755: {\bf
5756: Comparison with our Cardinality Analysis.}
5757: The determinacy information inferred by means of Sahlin's domain is in general
5758: less accurate than our cardinality analysis (except maybe in some
5759: partial evaluation contexts). For instance, with the former domain, it
5760: is not possible to detect mutually exclusive clauses except when cuts
5761: occur in the clauses. As illustrated in Section \ref{sub:IP}, the
5762: information provided by the abstract substitution
5763: component of our domain is instrumental to detect sure failure, sure
5764: success, and mutual exclusion, which all contribute to improve the
5765: accuracy of the determinacy (or cardinality) analysis.
5766: Nevertheless, the specific information about the sequence structure is
5767: finer grained in Sahlin's domain than in ours. Consider the abstract
5768: sequence $\{ {\cal L}, 1\}$; it is approximated, in our domain, by
5769: $\langle 0, 1, {\it pt} \rangle$, which is actually equivalent to
5770: $\{ {\cal L}, 0, 1, 1'\}$. Thus, it could be interesting to design a
5771: domain for abstract sequences similar to our cardinality domain,
5772: where
5773: the sequence component coincides with Sahlin's domain.
5774:
5775:
5776: \vspace{-0.2cm}
5777: \subsection{Giacobazzi and Ricci's Analysis of Determinate Computations}
5778: \label{sub:GRADC}
5779:
5780: The work of Giacobazzi and Ricci \shortcite{Giacobazzi92},
5781: is also worth being reviewed in our context.
5782: They propose an analysis of functional dependencies \cite{Mendelzon85}
5783: between procedure arguments
5784: of the success set of pure logic programs. Their
5785: ana\-ly\-sis is a bottom-up abstract interpretation, based on
5786: \cite{Barbuti90,FLMP89}. The analysis also infers groundness
5787: information and is intended to be used for parallel logic program
5788: optimization.
5789: In our comparison, we focus on the functional dependencies and we
5790: simplify the presentation in order to concentrate on the
5791: salient points.
5792: First, we provide a definition of functional dependency tailored to
5793: our framework. The definitions use some notions from Section \ref{sub:IP}.
5794:
5795: \begin{definition}[Functional Dependency]
5796: \label{def:FD}
5797: \noindent
5798: Let $\langle {\it sv}, {\it frm} \rangle$ be a structural abstract
5799: substitution over domain $D$ and set of indices $I$. A {\em functional
5800: dependency for} $\langle {\it sv}, {\it frm} \rangle$,
5801: denoted by $J\rightarrow j$, is a pair consisting of a subset $J$ of
5802: $I$ and an index $j\in I$.
5803:
5804: \noindent Let $S\in{\it PSS}_D$ be a program substitution sequence
5805: such that ${\it Subst}(S)\subseteq {\it Cc}\langle {\it sv}, {\it frm}
5806: \rangle$.
5807: We say that the functional dependency $J\rightarrow j$ {\em holds in
5808: $S$ for $\langle {\it sv}, {\it frm} \rangle$}, if for all fami\-lies of
5809: terms $(t_i)_{i\in I}$, $(t'_i)_{i\in I}$ that are decompositions of
5810: some program substitutions of ${\it Subst}(S)$, the
5811: following implication is true:
5812: %
5813: \begin{quote}
5814: $\begin{array}{lll}
5815: (t_i)_{i\in J}=(t'_i)_{i\in J} & \Rightarrow & t_j=t'_j.
5816: \end{array}$
5817: \end{quote}
5818: %
5819: \end{definition}
5820:
5821: %\noindent
5822: Then we define an abstract domain to express functional
5823: dependencies.
5824: % in program substitution sequences.
5825:
5826: \begin{definition}[Abstract Sequences with Functional
5827: Dependencies]
5828: \label{def:ASFD}
5829:
5830: \noindent {\em An abstract sequence with functional
5831: dependencies} is a triple $\langle {\it sv}, {\it frm}, {\it fd}
5832: \rangle$ where $\langle {\it sv}, {\it frm} \rangle$ is a structural abstract
5833: substitution over domain $D$ and set of indices $I$, and {\it fd}
5834: is a set of functional dependencies for $\langle {\it sv}, {\it frm} \rangle$.
5835: The concretization function for abstract sequences with functional
5836: dependencies is defined by
5837: \vskip0.2cm
5838: \begin{quote}
5839:
5840: $\begin{array}{lll}
5841: {\it Cc}\langle {\it sv}, {\it frm}, {\it fd} \rangle & = &
5842:
5843: \left\{\begin{array}{l}
5844:
5845: S\in {\it PSS}_D \ \
5846:
5847: \begin{array}{|l}
5848:
5849: {\it Subst}(S)\subseteq \Cc{\langle {\it sv}, {\it frm} \rangle} \and\\
5850:
5851: J\rightarrow j \mbox{\rm ~holds~in~}
5852: S \mbox{\rm~for~} \langle {\it sv}, {\it frm} \rangle,\\
5853:
5854: \mbox{\rm for~every~}J\rightarrow j\in {\it fd}.
5855:
5856: \end{array}
5857:
5858: \end{array}\right\}.
5859:
5860: \end{array}$
5861: \end{quote}
5862:
5863: \end{definition}
5864: \vskip0.3cm
5865: %\noindent
5866:
5867: In fact, the functional dependency component {\it fd} is best
5868: viewed as an additional component to the cardinality domain defined in Section
5869: \ref{sec:FASAS}, since its usefulness for determinacy analysis depends
5870: on the availability of mode information. Let $S\in{\it CPSS}_D$ be a
5871: canonical program substitution sequence. We say that $S$ is {\em functional}
5872: if the set ${\it Subst}(S)$ is empty or is a singleton.
5873: %\footnote{Here we stick to the terminology of \cite{Debray89a}.}
5874: Such sequences model the behavior of procedures that cannot produce
5875: two or more distinct solutions. Assume that $S$ is the output sequence
5876: corresponding to the input substitution $\theta$, for some procedure $p$.
5877: Assume that $\theta\in{\it Cc}\langle {\it sv}, {\it frm} \rangle$ and
5878: $S\in{\it Cc}\langle {\it sv}', {\it frm}', {\it fd}' \rangle$ where
5879: $\langle {\it sv}', {\it frm}' \rangle$ is more instantiated than
5880: $\langle {\it sv}, {\it frm} \rangle$% with respect to {\it im}
5881: . We can
5882: infer that $S$ is functional if there exists $J\subseteq I'$ such
5883: that ${\it fd}'$ contains a functional dependency of the form
5884: $J\rightarrow i$, for every $i\in {\it sv}'(D)$, and if every term $t_j$
5885: corresponding to an index $j\in J$ in a program substitution of $S$ is
5886: not more instantiated than the corresponding term in $\theta$. The
5887: latter information is easily deduced if we know, for instance, that
5888: $t_j$ is ground or is a variable. Thus adding a functional dependency
5889: component to our cardinality domain allows us to infer that output
5890: program substitution sequences are functional.
5891:
5892:
5893:
5894: It is important to point out that the new component {\it fd} expresses a
5895: property of program substitution sequences, not a property of (single) program
5896: substitutions.
5897: It is meaningless to use functional dependencies
5898: in a domain of abstract substitutions, because a set of functional
5899: dependencies determines a (two valued) condition
5900: on a set of program substitution.
5901: Either the set verifies the condition, then no constraint is added, or
5902: it does not and the set is rejected as a whole.
5903: Thus, a component {\it fd} defines a {\em set of sets of} program
5904: substitutions.
5905: As a consequence, functional
5906: dependencies cannot be handled by previous top-down abstract
5907: interpretation frameworks such as \cite{Bruynooghe91,TOPLAS,Marriott89,%
5908: Mellish87,Muthukumar92,Warren92,Winsborough92}.
5909: However the abstract interpretation framework used by \cite{Giacobazzi92}
5910: is bottom-up and abstracts the success set of the program. The result
5911: of an analysis represents a set of possible success sets, i.e.,
5912: {\em a set of sets of output patterns}, which is similar to a set of
5913: sets of program substitutions.
5914: As far as we know, it is the first time that this difference of
5915: expressivity between bottom-up and (previous) top-down abstract
5916: interpretation frameworks is pointed out in the literature.
5917: The comparison usually concentrates on the fact that bottom-up
5918: frameworks are {\em goal independent}, i.e., they provide information on
5919: the program as a whole, while top-down frameworks are {\em goal dependent},
5920: i.e., they provide information about the program {\em and} a given initial
5921: goal. We believe that a more fundamental difference lies in the fact that
5922: top-down frameworks are {\em functional}, i.e., they abstract
5923: the behavior of a program by a function between sets of sets, while bottom-up
5924: frameworks are {\em relational}, i.e., they abstract the behavior of
5925: a program by a set of relations. The difference between the two
5926: approaches has been previously put forward
5927: by Cousot and Cousot \shortcite{CousotJLC92}, but not in the context of logic programs.
5928: The functional approach can easily focus on small parts of the program
5929: behavior but looses the dependencies between inputs and outputs;
5930: the converse holds for the relational approach.
5931: Our novel framework is basically functional, but the domain of
5932: abstract sequences is in some sense relational; thus the framework allows us to combine the
5933: advantages of both approaches.
5934:
5935:
5936: \subsection{Debray and Warren's Analysis of Functional Computations}
5937:
5938: In the previous section, we have shown that functional dependencies
5939: are useful to infer that an output program substitution sequence is functional,
5940: i.e., does not contain two or more distinct program substitutions.
5941: Such a sequence may contain several occurrences of the same program
5942: substitution, however.
5943: The importance of functional computations for logic program
5944: optimization was advocated early by Debray and Warren
5945: \shortcite{Debray89a}.
5946: In this paper, these authors propose a sophisticated algorithm to infer
5947: functional computations of a logic program. The analysis exploits
5948: functional dependencies and mode information, as well as a set of
5949: sufficient conditions to detect mutually exclusive clauses.
5950: Their algorithm is not based on abstract interpretation and assumes
5951: that functional dependencies and mode information are given from
5952: outside.
5953: Thus the algorithm considers an annotated program; it uses a set
5954: $\{ \bot, {\bf true}, {\bf false} \}$ where $\bot$ is an initializing value,
5955: {\bf true} means that a procedure is functional and {\bf false} means
5956: that it is not known whether the procedure is functional. Hence, the
5957: set can be viewed as a domain of abstract sequences, with
5958: concretization function
5959: ${\it Cc} : \{ \bot, {\bf true}, {\bf false}\} \rightarrow \wp({\it CPSS})$
5960: defined by
5961: \begin{quote}
5962: $\begin{array}{lll}
5963: {\it Cc}(\bot) & = & \{<\bot>\};\\
5964: {\it Cc}({\bf true}) & = & \{S\in{\it CPSS}\mid \
5965: {\it Subst}(S)
5966: \mbox{\rm~is~empty~or~is~a~singleton.}
5967: \};\\
5968: {\it Cc}({\bf false}) & = & {\it CPSS}.
5969: \end{array}$
5970: \end{quote}
5971:
5972: \noindent
5973: All aspects of their analysis can be accommodated in our approach by providing
5974: suitable abstract domains. An abstract domain consisting of our cardinality
5975: domain augmented with a functional dependency component would
5976: probably be fairly accurate.
5977: Moreover, in our approach, all analyses can be performed at the same
5978: time and interact with each other, making it possible to get
5979: a better accuracy.
5980:
5981:
5982:
5983: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
5984: \section{Conclusion}
5985: \label{conclusion}
5986: %%%%%%%%%%%%%%%%%%%%%%%%%%
5987:
5988: This paper has introduced a novel abstract interpretation framework,
5989: capturing the depth-first search strategy and the cut operation of
5990: Prolog. The framework is based on the notion of substitution sequences
5991: and the abstract semantics is defined as a pre-consistent post-fixpoint
5992: of the abstract transformation. Abstract interpretation algorithms
5993: need chain-closed domains and a special widening operator to compute
5994: the semantics. This approach overcomes some of the limitations of
5995: previous frameworks. In particular, it broadens the applicability of
5996: the abstract interpretation approach to new analyses and
5997: can potentially improve
5998: the precision of existing analyses.
5999: On the practical side, in this paper, we have only shown
6000: that our approach allows one to integrate -
6001: efficiently and at a low conceptual cost - a predicate
6002: level analysis (i.e., determinacy analysis)
6003: to variable level analyses classically handled by abstract
6004: interpretation. However, the improvement on classical analyses
6005: is marginal because, due to our design choices for the
6006: abstract sequence domain
6007: (i.e., a simple extension of {\tt Pattern}), the new system
6008: behaves almost as the previous
6009: version of GAIA for variable level analyses.
6010: Nevertheless, the new framework opens a door for defining
6011: and exploiting more sophisticated domains for abstract sequences.
6012:
6013: \bibliographystyle{tlp}
6014: \bibliography{Biblio}
6015:
6016:
6017: \newpage
6018: \appendix
6019: \section*{Appendix}
6020: We complete here the description of the abstract operations
6021: started in Section \ref{sub:GAO}.
6022: The correctness proofs of all the abstract operations can be found in
6023: \cite{sequence}. The definitions below have been added in order to allow
6024: the reader to check the details of the examples in Section \ref{sub:GAO}.
6025: \\
6026:
6027: \noindent
6028: {\bf Extension at Clause Entry}:\ \
6029: {\tt EXTC}$(c,\cdot):{\it AS}_D
6030: \rightarrow{\it ASSC}_{D'} $ \ \\
6031: The implementation reuses the homonymous operation
6032: from the previous framework,
6033: which is specified as follows.
6034: \\
6035:
6036: \noindent
6037: {\it Operation}$\;$
6038: {\tt EXTC}$(c,\cdot):{\it AS}_D
6039: \rightarrow{\it AS}_{D'} $\\
6040: Let $\beta\!\in\!{\it AS}_{D}$,
6041: $\theta\!\in\!{\it CPS}_{D}$, and $\theta'\!\in\!{\it PS}_{D'}$
6042: such that $x_i\theta'=x_i\theta$
6043: $(\forall i:1\leq i \leq n)$ and
6044: $x_{n+1}\theta'$, \dots , $x_{m}\theta'$ are
6045: distinct standard variables
6046: not belonging to ${\it codom}(\theta).$
6047: Then
6048: $$
6049: \begin{array}{lll}
6050: \theta\in{\it Cc}(\beta) & \Rightarrow & [\![\theta']\!]\in
6051: {\it Cc}({\tt EXTC}(c,\beta)).
6052: \end{array}
6053: $$
6054: Hence, the {\tt EXTC} operation on sequences is defined by
6055: $$
6056: \begin{array}{lll}
6057: {\tt EXTC}(c,\beta) & = &
6058: \langle \langle {\tt EXTC}(c,\beta), 1, 1, {\it st} \rangle
6059: , {\it nocut} \rangle.
6060: \end{array}
6061: $$
6062:
6063: \vskip0.3cm
6064: \noindent
6065: {\bf Restriction at Clause Exit}:\ \
6066: {\tt RESTRC}$(c,\cdot):{\it ASSC}_{D'}
6067: \rightarrow{\it ASSC}_D $ \ \\
6068: The treatment of this operation is similar to the previous one.
6069: We first specify the abstract substitution version of the operation.
6070: \\
6071:
6072: \noindent
6073: {\it Operation}$\;$
6074: {\tt RESTRC}$(c,\cdot):{\it AS}_{D'}
6075: \rightarrow{\it AS}_{D} $
6076: \\
6077: Let $\beta\!\in\!{\it AS}_{D'}$ and
6078: $\theta\!\in\!{\it CPS}_{D'}$. We have
6079: $$
6080: \begin{array}{lll}
6081: \theta\in{\it Cc}(\beta) & \Rightarrow & [\![\theta_{|D}]\!]\in
6082: {\it Cc}({\tt RESTRC}(c,\beta)).
6083: \end{array}
6084: $$
6085: Hence, the {\tt RESTRC} operation on sequences is defined by
6086: $$
6087: \begin{array}{lll}
6088: {\tt RESTRC}(c,C) & = &
6089: \langle {\tt RESTRC}(c,\beta), m, M, {\it acf} \rangle.
6090: \end{array}
6091: $$
6092:
6093: \vskip0.3cm
6094: \noindent
6095: {\bf Restriction before a Call}:\ \
6096: {\tt RESTRG}$(l,\cdot):{\it AS}_{D'}
6097: \rightarrow{\it AS}_{D'''} $\ \\
6098: This operation is simply inherited from the previous framework.
6099: \\
6100:
6101: \noindent
6102: {\bf Unification of a Variable and a Functor}:\ \
6103: {\tt UNIF-FUNC}$(f,\cdot): {\it AS}_{D}
6104: \rightarrow {\it ASS}_{{D}}$\ \\
6105: The treatment of this operation is identical to the treatment of the
6106: {\tt UNIF-VAR} ope\-ration and is thus omitted.
6107: \\
6108:
6109: \noindent
6110: {\bf Extension of the Result of a Call}:\ \
6111: {\tt EXTGS}$(l,\cdot,\cdot):{\it ASSC}_{D'}\times {\it ASS}_{D'''}
6112: \rightarrow{\it ASSC}_{D'} $\
6113: \\
6114: This operation reuses the operation {\tt EXTG} from the previous framework.
6115: The reused operation has to fulfill the specification just below.
6116: \\
6117:
6118: \noindent
6119: {\it Operation}$\;$
6120: {\tt EXTG}$(l,\cdot,\cdot):{\it AS}_{D'}\times {\it AS}_{D'''}
6121: \rightarrow{\it AS}_{D'} $
6122: \\
6123: Let $\beta_1\in{\it AS}_{D'}$ and $\beta_2\in{\it AS}_{D'''}$.
6124: Let $\theta_1\in {\it CPS}_{D'}$ and $\theta_2\in {\it PS}_{D'''}$ be
6125: such that $x_{i_j}\theta_1=x_j\theta_2$
6126: $(\forall j : 1\leq j\leq {n'})$.
6127: Let $\sigma\in {\it SS}$ such that
6128: ${\it dom}(\sigma)\subseteq {\it codom}(\theta_2)$.
6129: Let $\{z_1,\ldots,z_r\}={\it codom}(\theta_1)
6130: \setminus{\it codom}(\theta_2)$.
6131: Let $y_{1},\ldots,y_{r}$ be distinct standard variables
6132: not belonging to ${\it codom}(\theta_1)\cup {\it
6133: codom}(\sigma)$.
6134: Let $\rho=\{z_1/y_{1},\ldots,
6135: z_r/y_{r},
6136: y_{1}/z_1,\ldots,
6137: y_{r}/z_r\}$.
6138: Under these assumptions,
6139: \begin{quote}$
6140: \begin{array}{rcl}
6141: \left.
6142: \begin{array}{r}
6143: \theta_1\in {\it Cc}(\beta_1),\\
6144: \theta_2\sigma\in{\it Cc}(\beta_2)
6145: \end{array}
6146: \right\}
6147: &\Rightarrow&
6148: [\![\theta_1\rho\sigma]\!] \in
6149: {\it Cc}({\tt EXTG}(l,\beta_1,\beta_2)).
6150: \end{array}$
6151: \end{quote}
6152:
6153: \noindent
6154: The implementation of {\tt EXTGS} is as follows.
6155:
6156:
6157: \begin{quote}
6158: $\begin{array}{llll}
6159: \beta' & = & {\tt EXTG}(l,\beta_1,\beta_2); \\
6160:
6161: m' & = & m_1 m_2 & \myif t_2={\it st}, \\
6162: & = & \min(1,m_1) m_2 & \otherwise; \\
6163:
6164: M' & = & \min(1,M_1) M_2 & \myif t_2={\it snt}, \\
6165: & = & M_1 M_2 & \otherwise; \\
6166:
6167: t' & = & {\it snt} & \myif t_1={\it snt}
6168: \myor (t_2={\it snt}
6169: \and m_1\geq 1), \\
6170: & = & {\it st} & \myif t_1={\it st}
6171: \and (t_2={\it st}
6172: \myor M_1=0), \\
6173: & = & {\it pt} & \otherwise; \\
6174:
6175: {\it acf}'
6176: & = & {\it acf}.
6177: \end{array}$
6178: \end{quote}
6179:
6180:
6181: \vskip0.3cm
6182: \noindent
6183: {\bf Operation}$\;$ ${\tt SEQ}: {\it ASSC}_{D}
6184: \rightarrow{\it ASS}_{D}$\
6185: \\
6186: We define
6187: $${\tt SEQ}(\langle B, {\it acf}\rangle) \ = \ B.$$
6188:
6189: \vskip0.3cm
6190: \noindent
6191: {\bf Operation} $\;$ ${\tt SUBST}: {\it ASSC}_{D'}
6192: \rightarrow{\it AS}_{D'}$\\
6193: We define
6194: $$
6195: {\tt SUBST}(\langle \langle \beta, m, M, t \rangle, {\it acf}\rangle)\
6196: =\ \beta.$$
6197:
6198:
6199: \end{document}
6200:
6201:
6202:
6203:
6204:
6205: