0010:cs0010028/main.tex

1: \documentclass{tlp}

2:

3: \usepackage{latexsym}

4:

5: \newtheorem{theorem}{Theorem}[section]

6: \newtheorem{lemma}[theorem]{Lemma}

7: \newtheorem{example}[theorem]{Example}

8: \newtheorem{definition}[theorem]{Definition}

9: \newtheorem{proposition}[theorem]{Proposition}

10: \newtheorem{corollary}[theorem]{Corollary}

11: \newtheorem{property}[theorem]{Property}

12: \newtheorem{remark}[theorem]{Remark}

13: \newtheorem{claim}{Claim}

14: \newtheorem{fact}{Fact}

15:

16:

17:

18: \newcommand{\myif}{\ \mbox{\rm if}\ }

19: \newcommand{\when}{\ \mbox{\rm when}\ }

20: \newcommand{\myiff}{\ \mbox{\rm iff}\ }

21: \renewcommand{\and}{\ \mbox{\rm and}\ }

22: \newcommand{\myor}{\ \mbox{\rm or}\ }

23: \newcommand{\mynot}{\mbox{\rm not}\, }

24: \newcommand{\otherwise}{\ \mbox{\rm otherwise}}

25: \newcommand{\subc}{\succ}

26: \newcommand{\comp}{\circ}

27: \newcommand{\cleq}{\sqsubseteq}

28: \newcommand{\clt}{\sqsubset}

29: \newcommand{\sclt}{\lhd}

30: \newcommand{\scleq}{\unlhd}

31: \newcommand{\eqass}{\approx}

32: \newcommand{\UP}[1]{#1

33:                      \mbox{\raisebox{0.02in}{$\Uparrow$}}}

34: \newcommand{\DOWN}[1]{#1\mbox{\raisebox{0.0in}{$\Downarrow$}}}

35: \newcommand{\Cc}[1]{{\it Cc}(#1)}

36: \newcommand{\Abs}[1]{{\it Abs}(#1)}

37: \newcommand{\PSS}{{\it PSS}}

38: \newcommand{\ASSD}{{\it ASS}_D}

39: \newcommand{\TABJ}{{\it TAB}_{\mbox{\tiny \rm JOIN}}}

40: \newcommand{\TABR}{{\it TAB}_{\mbox{\tiny \rm  REFINE}}}

41: \newcommand{\TAB}{{\it TAB}}

42: \newcommand{\Nstar}{{\mbox{\bf N}^\star}}

43: \newcommand{\N}{{\mbox{\bf N}}}

44:

45:

46: \newcommand{\wi}{$\bigtriangledown$}

47: \newcommand{\disj}{$\; | \;$}

48: \newcommand{\go}{{\tt g$_o$}}

49: \newcommand{\gr}{{\tt g$_r$}}

50: \newcommand{\gn}{{\tt g$_n$}}

51: \newcommand{\gf}{{\tt g$_1$}}

52: \newcommand{\gs}{{\tt g$_2$}}

53: \newcommand{\gi}{{\tt g$_i$}}

54: \newcommand{\gp}{{\tt g$_{i+1}$}}

55: \newcommand{\vo}{{\tt v$_o$}}

56: \newcommand{\vn}{{\tt v$_n$}}

57: \newcommand{\va}{{\tt v$_a$}}

58: \newcommand{\vf}{{\tt v$_1$}}

59: \newcommand{\vs}{{\tt v$_2$}}

60: \newcommand{\tyo}{{\tt T$_o$}}

61: \newcommand{\tn}{{\tt T$_n$}}

62: \newcommand{\tnew}{{\tt T$_{new}$}}

63: \newcommand{\widen}{\bigtriangledown}

64:

65: \newcommand{\pag}{\mbox{\tt Pat($\cal R$)}}

66:

67:

68:

69:

70: \title[Sequence-Based Abstract Interpretation of Prolog]

71: 	{Sequence-Based Abstract Interpretation of Prolog}

72: %{SEQUENCE-BASED ABSTRACT INTERPRETATION OF PROLOG}

73:

74: \author[B. Le Charlier, S. Rossi and P. Van Hentenryck]

75: {BAUDOUIN LE CHARLIER\\

76: Institut d'Informatique, University of Namur,\\

77: 21 rue Grandgagnage, B-5000 Namur, Belgium

78: %\email{ble@info.fundp.ac.be}

79: \and SABINA ROSSI\\

80: Dipartimento di Informatica, Universit\`a di Venezia,\\

81: via Torino 155, 30172 Venezia, Italy

82: %\email{srossi@dsi.unive.it}

83:  \and PASCAL VAN HENTENRYCK\\

84: Department of Computer Science,

85: Brown University,\\ P.O. Box 1910, Providence

86: RI 02912, Usa

87: %\email{pvh@cs.brown.edu}

88: }

89:

90: \begin{document}

91: \maketitle

92:

93: \begin{abstract}

94:

95:

96: Abstract interpretation is a general methodology for

97: systematic development of program analyses. An abstract interpretation

98: framework is centered around a parametrized

99: non-standard semantics that can be instantia\-ted by various domains to approximate different

100: program properties.

101:

102: Many abstract interpretation frameworks and analyses for Prolog

103: have been proposed, which seek to extract information useful for program optimization.

104: Although motivated by practical considerations, notably making Prolog competitive with imperative

105: languages, such frameworks fail to capture some of the

106: control structures of existing implementations

107: of the language.

108:

109:  In this paper we propose a novel framework for the abstract interpretation

110: of Prolog which  handles the depth-first search rule and the cut operator.

111:  It relies on the notion of {\em substitution sequence} to model the result of

112: the execution of a goal.

113: The framework consists of  (i) a denotational

114: concrete semantics, (ii) a  safe abstraction of the concrete semantics

115: defined in terms of a class of post-fixpoints, and (iii) a generic abstract

116: interpretation algorithm. We show that traditional abstract domains of

117: substitutions may easily be adapted  to the new framework, and provide

118: experimental evidence of the effectiveness of our approach.

119: We also show that previous work on determinacy analysis, that was not

120: expressible by existing abstract interpretation frameworks, can be seen as an

121: instance of our framework.

122:

123: The ideas developed in this paper can be applied to other logic

124: languages, notably to  constraint logic languages,

125: and the theoretical approach  should

126: be of general interest for the analysis of many

127: non-deterministic programming languages.

128:

129: \end{abstract}

130:

131:

132: %%%%%%%%%%%%%%%%%%%%%%

133: \section{Introduction}

134: \label{introduction}

135: %%%%%%%%%%%%%%%%%%%%%%

136:

137:

138:

139: {\it Abstract interpretation} \cite{Cousot77} is a general

140: methodology for systematic development of program analyses.

141: It has been applied to

142: various   formalisms and paradigms  including flow-charts and imperative,

143: functional, logic, and constraint programming.

144: %An abstract interpretation framework is centered around the definition of a parametrized

145: %non-standard semantics that can be instantiated by various domains to approximate different

146: %program properties.

147:

148: %\subsection{Abstract Interpretation of Prolog}

149:

150: Abstract interpretation of Prolog and, more generally, of logic

151: programming was initiated by Mellish \shortcite{Mellish87}

152: and further developed by numerous researchers,

153: e.g., Bruynooghe \shortcite{Bruynooghe91},

154: Cousot and Cousot \shortcite{Cousot92},

155: Jones and S{\o}ndergaard \shortcite{Jones87},

156: Le Charlier \emph{et al.}

157: \shortcite{ICLP91AI}, Marriott and S{\o}ndergaard \shortcite{Marriott89b}.

158: Many different kinds of practical analyses

159: and optimizations have been proposed,

160: a detailed description of which can be found in

161: \cite{Cousot92,Getzinger.SAS94}.  Briefly, mode

162: \cite{Cortesi91,D89,Debray88c,So86}, type

163: \cite{BG89b,Graph.JLP,GZ86,Janssens92,KH85,K83,Kluzniak87,L83,MK84,XW88,YS91},

164: and aliasing \cite{Codish91,Jacobs89} analyses collect information

165: about the state of variables during the execution and are useful to

166: speed up term unification and make memory allocation more efficient

167: \cite{Hermenegildo92,Warren88}. Sharing analysis

168: \cite{Corsini91,Cortesi,Kluzniak88,Muthukumar91} is similar to

169: aliasing except that it refers to the sharing of memory structures to

170: which program variables are instantiated; it is useful to perform

171: compile-time garbage collection \cite{Jensen90,Kluzniak88,Mulkers90}

172: and automatic parallelization \cite{SAS94.Gras,Chang85,Giacobazzi90,Jacobs92}.

173: Reference chain analysis \cite{Marien89b,Vanroy92} attempts to determine an

174: upper bound to the length of the pointer chain for a program variable.

175: Trai\-ling analysis \cite{Taylor89} aims at detecting variables which do

176: not need to be trailed.  Liveness analysis \cite{Mulkers91a}

177: determines when memory structures can be reused

178:  and is useful to perform update-in-place.

179:

180: All these analyses  approximate the set of values (i.e., terms or memory

181: structures) to which program variables can be instantiated at some

182: given program point. It is thus not surprising   that almost all

183: frameworks for the abstract interpretation of Prolog, e.g.,

184: \cite{Barbuti90,Bruynooghe91,Jones87,Marriott93,Marriott89b,Mellish87,Nilsson90a},

185: are based on abstractions of  sets of substitutions.  Such

186: traditional

187: frameworks ignore

188: important control features of the language, like

189:  the depth-first

190: search strategy and the cut operator.

191: The reason is that  these control features

192: are difficult to model accurately,

193: and yet not strictly necessary for a {\em variable level} analysis.

194: %as the ones cited  above.

195: However, modeling Prolog control features has two main advantages. First,

196: it allows one to perform so-called

197: {\em predicate level} analyses, like determinacy

198: \cite{Giacobazzi92,Sahlin91,Ueda,Vanroy87,Vanroy92} and local stack

199: \cite{Marien89,Meier91} analyses.

200: These analyses are not captured by traditional abstract interpretation

201: frameworks;

202: they usually rely on some ad hoc te\-chni\-que

203: and  require special-purpose proofs of correctness,

204: e.g., \cite{Debray89a,Sahlin91},

205:   which may be rather involved.

206: They are useful to perform

207: optimizations, such as  the

208: choice point removal and the simplification of environment creation.

209: Second,

210:  the analysis of some classes

211: of programs, like  programs containing

212: multi-directional procedures which use

213: cuts and meta-predicates

214: to select among different versions,

215: may be widely improved.

216: This may provide the compiler with more

217: chances to perform important optimizations such

218: as dead-code elimination.

219:

220:

221:

222: Abstract interpretation of Prolog with control

223: has been investigated by other authors.

224: In particular, we know of three main different approaches.

225: The  approach of  Barbuti {\em et al.}   \shortcite{BarbutiControl}

226: is based on  an abstract semantics for logic programs

227: with control which is

228: %sema\-ntics for logic programs aiming at modeling the

229: %Prolog search rule has been proposed by

230: %Barbuti {\em et al.} in  \cite{BarbutiControl}.

231: %Their semantics is

232: parametric with respect to a ``termination theo\-ry''. The latter

233:  is intended to be provided from outside, for instance

234: by applying proofs procedures.

235: Fil\`e and  Rossi \shortcite{FileRossi}

236: propose

237: an operational and non-compositional

238: abstract interpretation framework for  Prolog with cut

239: %has been defined by Fil\`e and Rossi \cite{FileRossi}.

240:  consisting of a tabled interpreter to visit OLDT abstract trees

241: decorated with information about sure success or failure of goals.

242: Finally,  Spoto   \shortcite{Spoto}

243: define

244:  an abstract goal-independent denotational

245: semantics for Prolog  handling  control rules  and  cut.

246: %has been introduced by F. Spoto and G. Levi  in \cite{Spoto}.

247: Program denotations are adorned with

248: ``observability'' constraints giving information about

249:   divergent computations and cut executions.

250: We know of no experimental results

251:  validating the effectiveness of  these

252:  approaches.

253:

254: In this paper we present a novel abstract interpretation framework for Prolog

255: which models the depth-first search rule and the cut

256: operator. It relies on the notion of {\em substitution sequence}

257: which allows us to collect the solutions to a goal

258: together with

259: information such as sure success and failure, the number of solutions,

260: and/or termination.

261: The framework that we propose can  be applied to perform predicate

262: level analyses, such as determinacy, which were not expressible by classical frameworks, and

263: can be also used  to improve the accuracy of

264: existing analyses.

265: Experiments on a  sample analysis, namely cardinality analysis, will be

266: discussed.

267:

268:

269:

270: \subsection{Some Motivating Examples}

271:

272: In this section we illustrate by means of

273:  small examples the

274:  functionality of our static analyzer

275:  and  we discuss

276:  how it improves on previous abstract interpretation frameworks.

277: Experimental results on medium-size programs will be reported  later.

278:

279:

280: The first two examples show that  predicate level properties,

281: such as determinacy,

282: which are out of the scope of traditional

283:  abstract interpretation frameworks can be captured by our analyzer.

284: To the best of our knowledge,

285: does not exist any specific analysis which

286: can infer determinacy of all the programs that are discussed hereafter.

287:

288: Consider first  the procedure {\tt is\_last}:

289:

290: \begin{tt} \begin{tabbing} 12\=1234\=\kill

291: \> is\_last(X,[X]). \\

292: \> is\_last(X,[\_|T]) :- is\_last(X,T).  \end{tabbing} \end{tt}

293:

294: %\noindent

295: When given the input pattern {\tt is\_last(var,ground)}, where {\tt var}

296: and {\tt ground}

297: denote the set of all variables and

298: the set of all  ground terms respectively, our analy\-sis

299: returns the  abstract sequence

300: $\langle${\tt is\_last(ground,[ground|ground]),0,1,pt}$\rangle$, where

301: {\tt is\_last(ground,[ground|ground])}  is the pattern characterizing  the

302: output substitutions,

303: 0 and 1 are,

304: respectively,

305: the minimum and the maximum number of

306: returned

307:  output substitutions, and

308:  {\tt pt} stands for

309: ``possible termination''.

310: %We know of no abstract interpretation framework

311: %able to infer determinacy of such program.

312: % Moreover,

313: % also the analysis

314: %developed by Debray and Warren in  \cite{Debray89a}

315: %to infer functional computations of a logic program

316: %cannot infer determinacy of this program.

317:

318: Consider now the following two versions of the procedure {\tt partition}.

319:

320: \begin{tt}

321: \begin{tabbing}

322: 12\=1234\=\kill

323: \> partition([],P,[],[]). \\

324: \> partition([S|T],P,[S|Ss],Bs) :-

325:       S $\leq$ P, !,

326:       partition(T,P,Ss,Bs). \\

327: \> partition([B|T],P,Ss,[B|Bs]) :-

328:     partition(T,P,Ss,Bs).

329: \end{tabbing}

330: \end{tt}

331:

332: \begin{tt}

333: \begin{tabbing}

334: 12\=1234\=\kill

335: \> partition([],P,[],[]). \\

336: \> partition([S|T],P,[S|Ss],Bs) :-

337:   leq(S,P), partition(T,P,Ss,Bs). \\

338: \> partition([B|T],P,Ss,[B|Bs]) :-

339:       gt(B,P), partition(T,P,Ss,Bs). \\

340: \> leq(K1-V1,K2-V2) :- K1 $\leq$ K2. \\

341: \> gt(K1-V1,K2-V2) :- K1 $>$ K2.

342: \end{tabbing}

343: \end{tt}

344:

345: %\noindent

346: Note that the second version of the procedure

347:  calls arithmetic predicates through an

348: auxiliary predicate and is appropriate for a key sort. Given an input

349: pattern {\tt partition(ground,ground,var,var)}, our analysis returns in both

350: cases the abstract sequence

351: $\langle${\tt partition(ground,ground,ground,ground),0,1,pt}$\rangle$.

352: Input/out\-put patterns are used to

353: determine that the first clause and the two others are mutually

354: exclusive in both programs, while the cut (in the first version) and

355: the abstraction of arithmetic predicates (in the second version)

356: determine the mutual exclusion of the second and the third clause.

357: Thus we can infer determinacy of both versions of the procedure

358: {\tt partition}.

359:

360: As stated above,  we don't know of any

361:  static analysis for logic programs

362: which can infer

363: determinacy of all these programs.

364: For instance, the analysis

365: developed by Debray and Warren   \shortcite{Debray89a}

366: to detect functional computations of a logic program

367: cannot infer determinacy of the procedure ${\tt is\_last}$;

368: the  determinacy analysis proposed by Dawson {\em et al.}

369: \shortcite{Ramakrishnan93},

370: while  it can handle the second version of the procedure

371: {\tt partition}, it  cannot handle  the first version of it since it does not

372: deal with the cut; for the same reason,

373: the  analysis of Giacobazzi and Ricci

374: \shortcite{Giacobazzi92} cannot treat the first version of the procedure

375: {\tt partition};

376: and the

377:  cardinali\-ty analysis defined by Sahlin  \shortcite{Sahlin91} cannot handle

378: any of the examples discussed above since it ignores predicate

379: arguments.

380:  %Finally, the only system we know

381: %that can handle the second version of the procedure

382: %{\tt partition} is the one developed by  Dawson {\em et al.}

383: %\shortcite{Ramakrishnan93}.

384:

385: The next example shows that the use of abstract sequences can improve

386: on the analysis of variable level properties such as modes.

387:

388: Consider  the procedure {\tt compress(L,Lc)}, which relates two

389: lists  {\tt Lc} and {\tt L}

390:  such that {\tt Lc} is a compressed version of {\tt L}.

391: For instance, the

392: compressed version of  the list

393: {\tt [a, b, b, c, c, c]} is {\tt [a, 1, b, 2, c, 3]}. A

394: library can contain the definition of a single procedure to handle

395: both compression and decompression as follows.

396:

397: \begin{tt}

398: \begin{tabbing}

399: 12\=1234\=\kill

400: \> compress(A,B) :-

401:       var(A), !, decmp(A,B). \\

402: \> compress(A,B) :-

403:       cmp(A,B).

404: \end{tabbing}

405: \end{tt}

406:

407: \begin{tt}

408: \begin{tabbing}

409: 12\=1234\=\kill

410: \> cmp([],[]). \\

411: \> cmp([C],[C,1]). \\

412: \> cmp([C1,C2|T],[C1,1,C2,N|Rest])  :-

413:        C1 \hskip-0.14cm <> \hskip-0.14cm C2,

414:       cmp([C2|T],[C2,N|Rest]). \\

415: \> cmp([C1,C1|T],[C1,N1|Rest]) :-

416:      cmp([C1|T],[C1,N|Rest]),

417:        N1 \hskip-0.14cm  := \hskip-0.14cm  N \hskip-0.14cm + \hskip-0.14cm 1.

418: \end{tabbing}

419: \end{tt}

420:

421: \begin{tt}

422: \begin{tabbing}

423: 12\=1234\=\kill

424: \> decmp([],[]).  \\

425: \> decmp([C],[C,1]). \\

426: \> decmp([C1,C2|T],[C1,1,C2,N|Rest]) \hskip-0.2cm :- \hskip-0.15cm

427:       decomp([C2|T],[C2,N|Rest]),

428:       C1 \hskip-0.14cm <> \hskip-0.14cm C2. \\

429: \> decmp([C1,C1|T],[C1,N1|Rest])  \hskip-0.2cm {:-} \hskip-0.15cm

430:       N1 \hskip-0.14cm   > \hskip-0.14cm 1,  \hskip-0.1cm

431:       N \hskip-0.14cm  := \hskip-0.14cm  N1 \hskip-0.14cm - \hskip-0.14cm 1,\\

432:   \hskip6.1cm

433:       decmp([C1|T],[C1,N|Rest]).

434: \end{tabbing}

435: \end{tt}

436:

437:

438: %\noindent

439: Given the input patterns {\tt compress(ground,var)} and {\tt compress(var,ground)}, our

440: analysis returns the abstract sequence $\langle${\tt

441: compress(ground,ground),0,1,pt}$\rangle$ for both the inputs.

442: This example

443: illustrates many of the functionalities of our sy\-stem, including

444: input/output patterns, abstraction of arithmetic and meta-pre\-di\-ca\-tes,

445: and the cut, all of which are necessary to obtain the optimal

446: precision. In addition, it shows that taking the cut into account

447: improves the analysis of modes.

448: Indeed,  a mode analysis

449: ignoring the cut would return the output pattern ${\tt

450: compress(novar,ground)}$ for the input pattern {\tt compress(var,ground)}, losing

451: the groundness information. None of the

452: abstract interpretation algorithms for logic programs we know of

453: %systems or proposals

454:  can handle this example with an optimal result.

455:  Moreover, if

456: a program only uses the input pattern {\tt compress(var,ground)}, our

457: analysis detects that the second clause of {\tt compress} is dead code

458: without any extra processing since no input/output pattern exists for

459: {\tt comp}. The second clause, the test {\tt var}, and the cut

460: of the first clause can  then be removed by an optimizer.

461:

462: Notice that there exist implemented tools for the static

463: analysis  of Prolog programs, such as PLAI

464: \cite{Muthukumar92}, which can achieve as accurate  success

465: and dead-code information

466: as our analyzer. However, such tools usually

467: integrate

468:   several analyses based on different techniques which

469: are

470: not all justified by the

471: abstract interpretation framework.

472: The example of the procedure {\tt compress}  shows that

473: our analy\-zer can

474:  handle control features of the language

475: within the abstract interpretation framework

476: without the need of any extra consideration.

477:

478:

479:

480:

481:

482: \subsection{Sequence-Based Abstract Interpretation of Prolog}

483:

484: An abstract interpretation framework \cite{CousotJLC92} is centered around the

485: definition of a  non-standard

486: %(or {\em abstract})

487:  semantics

488: approximating a concrete semantics of the language.

489: %This  makes it possible to compute in

490: %finite time a description of a (possibly infinite) set of program

491: %run-time behaviors.

492: %The abstract semantics is defined in terms of

493: %abstract values and abstract operations.  Intuitively, each abstract

494: %value describes a set of concrete values while each abstract operation

495: %is a safe approximation of a concrete operation.

496:

497: Most top-down abstract interpretation

498: frameworks for logic programs, see, for instance,

499:   \cite{Bruynooghe91,Codognet92a,%

500: Jones87,TOPLAS,Marriott89,%

501: Mellish87,Muthukumar92,Nilsson90a,Warren92,Winsborough92},

502: can be viewed as abstractions of

503: a concrete structural operational semantics \cite{Plotkin81}.

504: Such a semantics defines the meaning of a program as a

505: {\em transition relation} described in terms of transition rules of the form

506:  $\langle \theta, o \rangle \longmapsto \theta'$, where the latter expresses the fact

507:  that $\theta'$ is a possible output from the execution of

508: the construct $o$ (i.e., a procedure, a clause, etc.) called with input

509: $\theta$.

510: %Similarly to a denotational semantics, a structural operational

511: %semantics is {\em compositional}.

512: This structural operational semantics can  easily be

513: rephrased as a fixpoint semantics

514: mapping any input pattern  $\langle \theta, o \rangle$ to the set of all corresponding outputs $\theta'$. The fixpoint semantics

515: can then be lifted to a {\em collecting semantics} that maps

516: sets of inputs to sets of outputs and is defined as

517:  the least fixpoint of a set-based

518: transformation.

519: The {\em non-standard} (or {\em abstract}) {\em semantics} is identical to the

520: collecting one except that it uses abstract values instead of

521: sets and abstract operations instead of

522: operations over sets.

523:  Finally, an abstract interpretation algorithm can be  derived by

524: instantiating a generic fixpoint algorithm

525:  \cite{universal} to the abstract semantics.

526:

527: The limitations of traditional top-down frameworks for Prolog stem

528: from the fact that  structural operational semantics

529:  are unable to take the depth-first search rule into account.

530: Control operators such as the cut cannot be modeled

531:  and are thus simply ignored. To overcome these limitations, we

532: propose a concrete semantics of Prolog which describes the result of program executions

533: in terms of substitution sequences. This allows us to

534: model the

535: depth-first search rule and the cut operator. The semantics

536:  is defined in the denotational setting to deal

537: with sequences resul\-ting from the execution of infinite computations.

538:  Moreover, it is still compositional

539: allowing us to reuse most of the material of our previous works,

540: i.e., the abstract domains and the generic algorithm \cite{TOPLAS}.

541: \\

542: However, technical proble\-ms arise when applying the

543: abstract interpretation approach

544:   described above. Let us informally explain

545: the main ideas behind the definition of our framework.

546:

547: First,

548:   we define a {\em concrete semantics} as the

549:   least fixpoint of a concrete transformation {\it TCB}

550:   mapping every so-called concrete behavior $\longmapsto$ to another

551:   concrete behavior $\stackrel{{\it TCB}}\longmapsto$.  The notion

552:   of concrete behavior is our

553:   denotation choice for a Prolog program: it  is a function

554:   that maps pairs of the form $\langle \theta, p \rangle $

555: % , where $p$

556: % in a n-ary procedure name and $\theta$ is a substitution over the

557: % variables $x_1,\dots,x_n$

558:   to a substitution sequence $S$, which

559:   intuitively represents the sequence of computed answer substitutions

560:   returned by the query $p(x_1,\dots,x_n)\theta$.

561: % Although concrete

562: % behaviors are functions, we use relational symbols to denote them

563: % in order to emphasize the similarities between our concrete

564: % semantics and the structural operational semantics for Prolog used

565: % in \cite{ACTA95}.  We write $ \langle \theta, {\it p} \rangle

566: % \longmapsto S $ to express that the concrete behavior $\longmapsto$

567: % maps the pair $ \langle \theta, {\it p} \rangle $ to the sequence $S$.

568:   The fixpoint construction of the concrete semantics relies on

569:   a suitable ordering $\sqsubseteq$  defined on sequences.

570: %  Our concrete semantics is strictly {\em

571: %    instrumental} in that it is only a basis for the abstract

572: %  interpretation framework. Moreover, its operations

573: %  are easily obtained  by extending the concrete operations of the

574: %  structural operational semantics \cite{ACTA95} upon which our

575: %  previous frameworks are based. This allows us to

576: %  reuse much of the material from our existing abstract domains and

577: %  generic algorithms (e.g., \cite{SPE,ICLP91AI,TOPLAS,ACTA95}).

578:

579: Second,

580:   a collecting transformation {\it TCD} is obtained by lifting the concrete

581:   transformation {\it TCB} to  sets of substitutions and sets of

582:   sequences.  The transformation

583:   {\it TCD} is monotonic with respect to set inclusion.  However,

584:   its least fixpoint does not safely

585:   approximate the concrete semantics. In fact,  the least set

586:   with respect to inclusion, that is the empty set $\{\}$, does not contain the

587:   least substitution sequence with respect to $\sqsubseteq$, which is a

588:   special sequence denoted by $<\bot>$. The problem relies on the fact that

589:   an ordering

590:   on sets of sequences that ``combines'' both the ordering

591:   $\sqsubseteq$ on sequences and the ordering $\subseteq$ on sets is needed.

592:   This is an instance of the power domain construction problem

593:   \cite{Schmidt88}, which is difficult in general.

594:   We choose a more pragmatic solution which

595:  consists in restricting to {\em chain-closed}

596:   sets of sequences, i.e., sets  containing the limit of

597:   every increasing chain, with respect to $\sqsubseteq$, of their elements.

598:   We also introduce the notion of

599:   {\em pre-consistent} collecting behavior which, roughly

600:   speaking, contains a lower approximation, with respect to

601:   $\sqsubseteq$, of the concrete semantics (the least fixpoint of

602:   {\it TCB}). The transformation {\it TCD} maps

603:   pre-consistent collecting behaviors to other pre-consistent ones.

604:   Moreover, assuming that sets of sequences

605:   are {\em chain-closed},  any pre-consistent post-fixpoint, with

606:   respect to set inclusion, of {\it TCD} safely approximates the

607:   concrete semantics.  These results imply that a safe collecting

608:   behavior can be constructed by iterating on {\it TCD} from any

609:   initial pre-consistent collecting behavior and by applying some

610:   widening techniques \cite{Cousot92c} in order to reach a

611:   post-fixpoint.

612:

613:

614: Third,

615:   the {\it abstract semantics} is defined exactly  as

616:   the collecting one  except that it is parametric with respect to the

617:   abstract domains.

618: %  whose concretizations are assumed to be sets of

619: %  substitutions or sets of substitution sequences.

620:   In fact, we do not

621:   explicitly distinguish between the collecting and the abstract

622:   semantics: in our presentation, the collecting transformation {\it

623:     TCD} is just a particular instance of the (generic) abstract

624:   transformation {\it TAB}.

625:

626:

627:   Finally, a generic abstract interpretation algorithm

628: % that is parametric  over abstract domains,

629:   is derived from the abstract semantics. The algorithm  is

630:   essentially an instantiation of the universal fixpoint algorithm

631:   described in \cite{universal}.

632: %  The latter is top-down and  computes a subset of

633: %  the fixpoint  containing the output value for a particular

634: %  input and also all input/output pairs needed to

635: %  compute the result.

636: %  Although the instantiation of

637: %  \cite{universal} to our abstract semantics is as mechanical as in

638: %  our previous works \cite{TOPLAS},  the correctness of the novel algorithm

639: %  involves new theoretical issues.

640: %  Indeed, the algorithm in

641: %  \cite{universal} is able to cope with non-monotonic transformations by

642: %  computing  post-fixpoints

643: %  instead of regular (least) fixpoints.  Nevertheless, the correctness

644: %  proofs in \cite{universal} do not address this point since it does

645: %  not apply to traditional analyses. Instead,

646: %  In the new framework we additionally need  to prove

647: %   that the computed post-fixpoint is

648: %   pre-consistent and thus safe.

649:

650:

651: \subsection{Plan of the Paper}

652: The paper is organized as follows.

653: Section \ref{CS} and  Section \ref{sec:AS} describe, respectively,

654: our concrete and abstract semantics for pure Prolog augmented

655: with  the cut.

656: %The correctness results are stated and proven.

657: The generic abstract interpretation algorithm is discussed in Section

658: \ref{sec:GAIA}.

659: Section \ref{sec:FASAS} is a revised and extended version

660: of \cite{cardinality}.

661: It describes an instantiation of our abstract interpretation

662: framework

663: %to perform so-called cardinality analysis,

664: to approximate the

665: number of solutions to a goal. Experimental results are reported.

666: %In particular, first it  shows how a

667: %generic domain of abstract sequences can be built from {\em any}

668: % domain of abstract substitutions; then it

669: %describes the instantiation of such a

670: % generic domain

671: %to (an extended version of) the abstract domain {\tt

672: %  Pattern} \cite{ICLP91AI,TOPLAS}; finally, it discusses

673: %experimental results.

674: %Our cardinality analysis subsumes both variable level analyses,

675: %like mode and sharing,

676: %and  also  predicate level analyses, like determinacy. To

677: %clarify this benefit,

678: In Section \ref{sec:RDASBAI}

679: we consider related works on determinacy

680: analysis.

681: % We show that the works of Sahlin

682: %\cite{Sahlin.PhD,Sahlin90} and of Debray and Warren \cite{Debray89a} can

683: %be seen as  instances of  our framework. We also

684: %discuss the work of Giacobazzi and Ricci \cite{Giacobazzi92}, which is

685: %based on a {\em bottom-up} abstract interpretation \cite{Barbuti90}.

686: %Their abstract domain can also be integrated to our system.

687: %Nevertheless, they

688: %do not handle the depth-first search rule

689: %of Prolog and so the cut operator.

690:  Section \ref{conclusion} concludes the paper.

691:

692:

693:

694: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%

695: \section{Concrete semantics}

696: \label{CS}

697: %%%%%%%%%%%%%%%%%%%%%%%%%%%%

698:

699: This section  describes a concrete semantics

700:  for pure Prolog augmented with the cut.

701: The concrete semantics is the link between the standard semantics

702: of the language and

703: the abstract one.

704: Our concrete semantics is denotational and is based on the notion of

705: substitution sequence.

706: Correctness of the concrete semantics with respect to  Prolog

707: standard semantics,

708: i.e., OLD-resolution, is discussed.

709: Most proofs are omitted here; all details can be found in

710:  \cite{Cut_rep95}.

711:

712: %\noindent

713: %The description of the concrete semantics

714: %is organized as follows.

715: %Section~\ref{syntax} provides an abstract syntax of normalized programs,

716: %which is suited for the compositional description of the

717: %semantics.

718: %Section~\ref{main:BSD} describes the basic semantic domains of substitutions.

719: % Section~\ref{PSS} introduces program

720: %substitution sequences. Program substitution sequences with cut information are studied

721: %in Section~\ref{pss_cut}. Concrete behaviors, which model the semantics

722: %of programs, are defined in Section~\ref{CB}. The  operations

723: %of the concrete semantics are defined in Section~\ref{CO}.

724: %The concrete semantic rules are given in Section~\ref{CSR}.

725: %The concrete semantics itself is defined in Section~\ref{TCS}.

726: %The equivalence between  the concrete semantics and the standard

727: %semantics of Prolog is discussed in Section~\ref{CCS}.

728: %Finally, Section~\ref{RW_CS} briefly discusses

729: %related works.

730:

731:

732:

733:

734:

735:

736: \subsection{Syntax}

737: \label{syntax}

738:

739: The abstract interpretation framework presented in this paper

740: assumes that programs are normalized according to the abstract

741: syntax given in

742: Fig.~\ref{abstract_syntax}.  The variables occurring in a literal are distinct;

743: distinct procedures have distinct names; all clauses of a procedure

744: have exactly the same head; if a clause uses $m$ different program

745: variables, these variables are $x_1$, \dots, $x_m$.\\

746:

747:

748: \begin{figure}[h]

749: \figrule

750: \begin{it}

751: %

752: \begin{tabular}{lclllcl}

753: P  &$ \in$&  Programs & & P  & ::= & pr $|$ pr P \\

754: pr & $ \in$& Procedures & & pr & ::=& c $|$ c pr  \\

755: c  &$ \in$&  Clauses & & c  & ::=& h  :- g.      \\

756: h  &$ \in$&  ClauseHeads & & h  & ::=& p($x_1$, \ldots, $x_n$)    \\

757: g  &$ \in$&  ClauseBodyPrefixes & & g  & ::=& $<>$ $|$ g , l \\

758: l  &$ \in$&  Literals  & & l  & ::=& p($x_{i_1}$, \ldots, $x_{i_n}$) $|$ b  \\

759: b  &$ \in$&  Built-ins & & b  & ::=& $x_i$=$x_j$ $|$ $x_{i_1}$=f($x_{i_2}$, \ldots, $x_{i_n}$) $|$ ! \\

760: p  &$ \in$&  ProcedureNames\\

761: f  &$ \in$&  Functors\\

762: $x_i$ &$ \in$&  ProgramVariables (PV)

763: \end{tabular}

764: \end{it}

765: \caption{Abstract syntax of normalized programs}

766: \label{abstract_syntax}

767: \end{figure}

768:

769:

770: %----------------------------------------------------------------------------%

771:

772: \subsection{Basic Semantic Domains}

773: \label{main:BSD}

774:

775: This section presents the basic semantic domains of substitutions.

776:  Note that we assume a preliminary knowledge of

777: logic programming; see, for instance \cite{Apt97,Lloyd}.\\

778:

779: \noindent

780: {\bf Variables and Terms.}

781: %\label{Variablesterms}

782: We assume the existence of two disjoint and infinite sets of variables, denoted by

783: ${\it PV}$ and ${\it SV}$.

784: Elements of ${\it PV}$ are called {\it program variables} and are denoted by

785: $x_1$, $x_2$, \dots, $x_i$, \dots. The set ${\it PV}$ is totally ordered;

786:  $x_i$ is the $i$-th element of ${\it PV}$.

787: % and the index

788: %$i$ in the symbol $x_i$ means that $x_i$ is the $i$-th element of ${\it PV}$.

789: Elements of ${\it SV}$ are called {\em standard variables} and  are denoted by

790:  letters $y$ and $z$ (possibly subscripted).

791: %Finite subsets of ${\it PV}$ are denoted by $D$.

792: Terms are built using standard variables only.

793: % they  are denoted by the letter $t$.

794: \\

795:

796: \noindent

797: {\bf Standard Substitutions.}

798: Standard substitutions are substitutions in the usual

799: sense

800: %\cite{Apt97,Lloyd}

801: which use standard variables only.

802: The set of standard substitutions is denoted by ${\it SS}$.

803: %Standard substitutions are normally denoted by

804: %the letter $\sigma$.

805: Renamings are standard substitutions that define a permutation

806: of standard variables.

807: % are denoted by the letter $\rho$.

808: The domain and the codomain of a standard substitution $\sigma$

809: are denoted by ${\it dom}(\sigma)$ and ${\it codom}(\sigma)$, respectively.

810: We denote by ${\it mgu}(t_1,t_2)$ the set of standard substitutions

811: that are a most general unifier of terms $t_1$ and $t_2$.\\

812:

813: \noindent

814: {\bf Program Substitutions.}

815:  A program substitution  is a  set

816: $\{x_{i_1}/t_1,\dots,x_{i_n}/t_n\}$, where $x_{i_1},\ldots,x_{i_n}$ are distinct program

817: variables and

818: $t_1$, \dots, $t_n$ are terms.

819: %The variables occurring in $t_1$, \dots, $t_n$ are standard variables.

820: Program substitutions are not substitutions in the usual sense;

821: they are best understood as a form of program store

822: which expresses the  state of the computation at a given

823: program point. It is meaningless to compose them as usual substitutions

824: or to use them to express most general unifiers.

825: %Program substitutions are denoted by the letter $\theta$.

826: The domain of a program substitution

827: $\theta=\{x_{i_1}/t_1,\dots,x_{i_n}/t_n\}$, denoted by

828: ${\it dom}(\theta)$, is the set of

829: program variables $\{x_{i_1},\dots,x_{i_n}\}$.

830: The codomain of $\theta$, denoted by

831: ${\it codom}(\theta)$, is the set of

832: standard variables occurring in $t_1,\dots,t_n$.

833: Program and standard substitutions

834: cannot be composed. Instead, standard substitutions

835: are {\em applied} to program substitutions.

836: The application of a standard substitution $\sigma$ to

837: a program substitution $\theta=\{x_{i_1}/t_1,\dots,x_{i_n}/t_n\}$

838: %denoted by $\theta\sigma$,

839: is the program substitution

840: $\theta\sigma=\{x_{i_1}/t_1\sigma,\dots,x_{i_n}/t_n\sigma\}$.

841: The set of program substitutions is denoted by ${\it PS}$.

842: The application $x_i\theta$ of a program substitution $\theta$

843: to a program variable $x_i$ is defined only if  $x_i\in{\it dom}(\theta)$;

844: it denotes the term bound to $x_i$ in $\theta$.

845: Let $D$ be a finite subset of ${\it PV}$ and

846:  $\theta$ be a program substitution such that

847: $ D \subseteq {\it dom}(\theta)$.

848: The {\em restriction} of $\theta$ to  $D$, denoted by $\theta_{/D}$,

849: is the program substitution such that ${\it dom}(\theta_{/D})= D$ and

850: $x_i(\theta_{/D})=x_i\theta$,

851: for all $x_i \in D$.

852: We denote by ${\it PS}_D$ the set of program substitutions with

853: domain $D$.\\

854:

855: \noindent

856: {\bf Canonical Program Substitutions.}

857: %\label{main:CPS}

858: We say that two program substitutions

859:  $\theta$ and $\theta'$  are {\it equivalent} if and only if

860:  there exists a  renaming

861: $\rho$ such that $\theta\rho=\theta'$.

862: We assume that,

863: for each program substitution $\theta$, we are given a canonical

864: representative, denoted  by $[\![\theta]\!]$,

865: of the set of all program substitutions that are equivalent to $\theta$.

866: We denote by ${\it CPS}$

867: the set of all

868: %canonical representatives, called

869: {\em canonical

870: program substitutions} $[\![\theta]\!]$.

871: %is denoted by ${\it CPS}$.

872: For any finite set of program variables $D$,

873: we denote by ${\it CPS}_D$ the set ${\it PS}_D\cap {\it CPS}$.

874:

875:

876: \subsection{Program Substitution Sequences}

877: \label{PSS}

878: Program substitution sequences are intended to model the sequence of

879: computed answer substitutions returned by a goal, a clause, or a procedure.

880: %(see Sections \ref{pss_f} and \ref{pss_nf} for a detailed motivation).

881: %In this section,

882: %we formally describe the semantic domains of program substitution sequences.

883: \\

884:

885: \noindent

886: {\bf Program Substitution Sequences.}

887: Let us denote by  $\Nstar$  the set of positive natural numbers.

888: A program substitution sequence

889:  is either a {\em finite}

890: sequence of the form $<\theta_1, \ldots, \theta_n>$ ($n \geq 0$) or

891: an {\em incomplete} sequence of the form $<\theta_1, \ldots, \theta_n,\bot>$

892: ($n \geq 0$) or an {\em infinite} sequence of the form

893: $<\theta_1, \ldots, \theta_i, \ldots>$ ($i\!\in\!\Nstar $), where the

894: $\theta_i$ are program substitutions with the same domain.

895: %Program substitution sequences are denoted by $S$.

896: We use the notation  $<\theta_1, \ldots, \theta_i, \_>$ to represent

897: a program substitution sequence when it is not known whether it is finite,

898:  incomplete or infinite.

899: Let $S$ be a program substitution sequence.

900: We denote by   ${\it Subst}(S)$  the

901: set of program substitutions that are elements of  $S$.

902: The domain of

903: $S$ is defined when $S\neq<>$ and $S\neq<\bot>$. In this case,

904: %the domain of $S$, denoted by

905: ${\it dom}(S)$ is the domain of

906: the program substitutions belonging to ${\it Subst}(S)$.

907: The set of all program substitution sequences is denoted by {\it PSS}.

908: Let $D$ be a finite set of program variables. We denote by

909: {\it  PSS}$_D$  the set of all program substitution sequences

910: with domain $D$ augmented

911: with $<>$ and $<\bot>$.

912: Let $S\in{\it  PSS}_D$ be a sequence $<\theta_1, \ldots, \theta_i, \_>$

913: and $D'\subseteq D$.

914: The {\em restriction} of $S$ to $D'$, denoted by $S_{/{D'}}$,

915: is the program substitution sequence $<{\theta_1}_{/{D'}}, \ldots, {\theta_i}_{/{D'}}, \_>$.

916: The number of elements of $S$,

917: including the special element $\bot$, is denoted by ${\it Ne}(S)$.

918: The number of elements of $S$ that are substitutions

919: is denoted by ${\it Ns}(S)$.

920: %Hence,

921: %if $S$ is incomplete then

922: %${\it Ne}(S)={\it Ns}(S)+1$; in the other cases,

923: %if $S$ is finite or infinite then

924: %${\it Ne}(S)={\it Ns}(S)$.

925: % otherwise, i.e., if $S$ is incomplete,

926: %${\it Ne}(S)={\it Ns}(S)+1$.

927: Sequence concatenation is denoted by $::$ and it is used only when its

928: first argument is a  finite sequence.\\

929:

930:

931: \noindent

932: {\bf Canonical Substitution Sequences.}

933: %\label{main:CSS}

934: The canonical mapping $[\![\cdot]\!]$ is lifted to sequences as

935: follows.

936:  Let $S$ be a program substitution sequence $<\theta_1, \ldots, \theta_i, \_>$.

937: We define

938:  $[\![S]\!]=

939: %as  the program substitution sequence

940: <[\![\theta_1]\!], \ldots, [\![\theta_i]\!], \_>$.

941: We denote  by {\it CPSS} the set of all

942: {\em canonical substitution sequences} $[\![S]\!]$

943: and by {\it  CPSS}$_D$ the set ${\it  PSS}_D \cap {\it CPSS}$,

944: for any finite subset  $D$ of {\it PV}.

945: \\

946:

947: \noindent

948: {\bf CPO's of Program Substitution Sequences.}

949: %\label{main:CPSS}

950: The sets {\it  PSS}, {\it  PSS}$_D$, {\it  CPSS} and {\it  CPSS}$_D$

951: can be endowed with a structure of {\it pointed cpo} as described below.

952: %Let us first introduce the relation $\sqsubseteq$ on program substitution sequences.

953:

954:

955: \begin{definition}[Relation $\sqsubseteq$ on Program Substitution Sequences]

956: \label{seq_ordering}

957: %

958: Let $S_1,S_2\!\in\!{\it PSS}$. We define

959: \begin{center}

960: \begin{tabular}{llll}

961: $S_1 \sqsubseteq S_2$ &  iff & either & $S_1=S_2$ \\

962:  & & or &there exists $S,S'\!\in\!

963: {\it PSS}$  such that $S$ is finite,\\

964: & & &$S_1=S::<\bot>$ and  $S_2=S::S'$.

965: \end{tabular}

966: \end{center}

967: \end{definition}

968: %

969: %\noindent

970:

971: The relation $\sqsubseteq$ on program substitution sequences is an ordering

972: and

973:  the pairs $\langle {\it PSS},\sqsubseteq\rangle $,

974: $\langle {\it CPSS},\sqsubseteq\rangle $,

975: $\langle {\it PSS}_D,\sqsubseteq\rangle $, and

976: $\langle {\it CPSS}_D,\sqsubseteq\rangle $

977: are all pointed cpo's.

978: % (see   \cite{Cut_rep95}).

979: \\

980:

981:

982: We denote by

983:   $(S_i)_{i\in\N}$  an increasing  chain,

984: $S_0  \sqsubseteq S_1  \sqsubseteq \ldots \sqsubseteq S_i  \sqsubseteq

985: \ldots$ in

986:   {\it PSS}; whereas we denote by  $\{S_i\}_{i\in\N}$ a,

987:  non necessarily increasing,

988: sequence  of elements of

989:   {\it PSS}.

990: \\

991:

992: \noindent

993: {\bf Lazy Concatenation.}

994: %\label{lazy_conc}

995: %In the definition of the concrete semantics,

996: Program substitution sequences are combined

997: through the  operation $\Box$ and its extensions $\Box_{k=1}^{n}$ and $\Box_{k=1}^{\infty}$

998: defined below.

999:

1000:

1001:

1002: %

1003: \begin{definition}[Operation $\Box$]

1004: Let $S_1$, $S_2\!\in\!  {\it PSS}$.

1005: \begin{center}

1006: \begin{tabular}{llcll}

1007: $S_1 \Box S_2$ & $=$ & $S_1::S_2$ & if & $S_1$ is finite\\

1008:                & $=$ & $S_1$      & if & $S_1$ is incomplete or infinite.\\

1009: \\

1010: \end{tabular}

1011: \end{center}

1012: \end{definition}

1013:

1014:

1015: \begin{definition}[Operation $\Box_{k=1}^{n}$]

1016: Let $\{S_k\}_{{k\in\Nstar}}$ be an

1017: infinite sequence of program substitution sequences (not neces\-sarily a chain).

1018: For any $n \geq 1$, we define:

1019: \begin{center}

1020: \begin{tabular}{lll}

1021: $\Box_{k=1}^{0}S_k$ & $=$ & $<\;>$  \\

1022:  $\Box_{k=1}^{n}S_k$ & $=$ & $(\Box_{k=1}^{n-1}S_k)\Box S_n.$\\

1023: \\

1024: \end{tabular}

1025: \end{center}

1026: \end{definition}

1027:

1028:

1029:

1030: \begin{definition}[Operation $\Box_{k=1}^{\infty}$]

1031: Let $\{S_k\}_{{k\in\Nstar}}$ be an

1032: infinite sequence of program substitution sequences.

1033: The infinite sequence $\{S'_i\}_{i\in\N}$ where

1034: $S'_i= (\Box_{k=1}^{i}S_k)\Box<\bot>$ $({i\in\N})$ is

1035: a chain. So we are allowed to define:

1036: \begin{center}

1037: \begin{tabular}{lllll}

1038: $\Box_{k=1}^{\infty}S_k$ & = & $\sqcup_{i=0}^{\infty} S'_i$

1039:                          & = &

1040:                                $\sqcup_{i=0}^{\infty}((\Box_{k=1}^{i}S_k)\Box

1041:                                                                       <\bot>)$.\\

1042: \\

1043: \end{tabular}

1044: \end{center}

1045: \end{definition}

1046: %

1047: %\noindent

1048:

1049: The operation $\Box$ is associative; hence,  it is meaningful to

1050: write $S_1 \Box \ldots \Box S_n$ instead of

1051: $\Box_{k=1}^{n} S_k$. Operations $\Box$,

1052: $\Box_{k=1}^{n}$, and $\Box_{k=1}^{\infty}$ are continuous

1053: with respect to the ordering $\sqsubseteq$ on program substitution sequences.\\

1054: %Moreover, the sets ${\it CPSS}$, ${\it  PSS}_D$,

1055: %and ${\it CPSS}_D$ are closed with respect to such operations.\\

1056:

1057:

1058: %

1059:

1060: \noindent

1061: {\bf Program Substitution Sequences with Cut Information.}

1062: %\label{pss_cut}

1063: Program substitution sequences with cut information

1064:  are used to model the result of a  clause together with

1065:   information on cut executions.

1066:

1067: Let ${\it CF }$ be the set of cut flags $\{{\it cut},

1068:  {\it nocut}\}$.

1069: A  program substitution sequence with cut information

1070:  is a pair $\langle S, cf\rangle$

1071: where $S\!\in\!{\it PSS}$ and $cf\!\in\!{\it CF }$.

1072: %The pair $\langle {\it PSS}\times {\it CF},\sqsubseteq \rangle $

1073: %is a pointed cpo for the relation $\sqsubseteq$ defined below.

1074: %We define the following relation.

1075: %We endow the sets

1076: %${\it PSS}\times {\it CF}$,

1077: %${\it PSS}_D\times {\it CF}$,

1078: %${\it CPSS}\times {\it CF}$ and

1079: %${\it CPSS}_D\times {\it CF}$

1080: %with a structure of pointed cpo as follows.

1081:

1082:

1083: \begin{definition} [Relation $\sqsubseteq$ on  Substitution Sequences

1084:                                 with~Cut~Information]

1085: \label{leqpssc}

1086: %

1087: Let $\langle S_1,{\it cf}_1 \rangle ,

1088:      \langle S_2,{\it cf}_2 \rangle \!\in\! {\it PSS}\times {\it CF}$.

1089: We define

1090: \begin{center}

1091: \begin{tabular}{llll}

1092:  $\langle S_1,{\it cf}_1 \rangle

1093: \sqsubseteq \langle S_2,{\it cf}_2 \rangle $

1094: %

1095:       & iff & either &$S_1\sqsubseteq S_2$ and ${\it cf}_1 = {\it cf}_2$ \\

1096:       &     & or    & $S_1=<\bot>$ and ${\it cf}_1={\it nocut}$.

1097: \\

1098: \\

1099: \end{tabular}

1100: \end{center}

1101: \end{definition}

1102: %\noindent

1103:

1104: The relation $\sqsubseteq$ on program substitution sequences

1105: with cut information is an or\-de\-ring.

1106:  Moreover,

1107:  the pairs

1108:  $\langle {\it PSS}\times {\it CF},\sqsubseteq \rangle $,

1109: $\langle {\it PSS}_D\times {\it CF},\sqsubseteq \rangle $,

1110: $\langle {\it CPSS}\times {\it CF},\sqsubseteq \rangle $ and

1111: $\langle {\it CPSS}_D\times {\it CF},\sqsubseteq \rangle $

1112: are all  pointed cpo's.

1113:

1114: We extend the definition of the operation $\Box$  to

1115: program substitution sequences with cut information.

1116: The extension is  continuous

1117: in both the arguments.

1118:

1119: \begin{definition}[Operation $\Box$ with Cut Information]

1120: \label{OCI}

1121: Let $\langle S_1,{\it cf}\rangle\!\in\!{\it PSS}\times {\it CF}$

1122:                                    and $S_2\!\in\!{\it PSS}$. We define

1123: \begin{center}

1124: \begin{tabular}{lllll}

1125:  $\langle S_1,{\it cf}\rangle \Box S_2$

1126:             & $=$ & $S_1\Box S_2$ & if & {\it cf} = {\it nocut}\\

1127:             &     & $S_1$         & if & {\it cf} = {\it cut}.

1128: \end{tabular}

1129: \end{center}

1130: \end{definition}

1131:

1132: %\vskip-1.2cm

1133: \subsection{Concrete Behaviors}

1134: \label{CB}

1135:

1136: The notion of concrete behavior

1137:  provides a mathema\-ti\-cal model for the input/output behavior of programs.

1138: To simplify the presentation, we do not parameterize the semantics

1139: with respect to programs. Instead, we assume a given fixed underlying program

1140: {\it P}.

1141: %

1142:

1143: \begin{definition} [Concrete Underlying Domain]

1144: The {\em concrete underlying domain},

1145: denoted by {\it CUD},

1146:  is the set of all pairs $\langle \theta,p \rangle$ such that

1147: $p$ is the name of a procedure {\it pr}

1148: of $P$ and

1149: $\theta\!\in\!{\it CPS}_{\{x_1,\dots,x_n\}}$,

1150: where $x_1,\dots,x_n$ are the variables

1151:  occurring in the head of every clause of {\it pr}.

1152: \end{definition}

1153: %

1154:

1155: Concrete behaviors are functions but we

1156:  denote them by the relation symbol $\longmapsto$ in order to

1157: stress the similarities  between the concrete semantics and a

1158: structural operational semantics for logic programs defined in  \cite{ACTA95}.

1159:

1160: \begin{definition} [Concrete Behaviors]

1161: A {\em concrete behavior}

1162: is a total function $\longmapsto: {\it CUD}\longrightarrow {\it CPSS}$

1163: mapping every pair  $\langle \theta,p \rangle\!\in\!{\it CUD}$  to a

1164: canonical program  substitution

1165: sequence $S$ such that,

1166: for every  $\theta'\!\in\!{\it Subst}(S)$, there exists a standard substitution

1167: $\sigma$ such that $\theta'=\theta \sigma$.

1168: We denote by $\langle \theta,p \rangle\longmapsto S$  the fact that

1169: $\longmapsto$ maps the pair $\langle \theta,p \rangle$ to $S$.

1170: The set of all concrete behaviors is denoted by

1171: {\it CB}.

1172: \end{definition}

1173: %

1174: %

1175:

1176: The ordering $\sqsubseteq$ on program substitution sequences is

1177: lifted to concrete behaviors in a standard way \cite{Schmidt88}.

1178:

1179:

1180: \begin{definition}  [Relation $\sqsubseteq$ on Concrete Behaviors]

1181: Let $\longmapsto_1 , \longmapsto_2\in\!{\it CB}$. We define

1182: \begin{center}

1183: \begin{tabular}{lclll}

1184: $ \longmapsto_1 \sqsubseteq \longmapsto_2$ & iff &

1185:                    ($\langle \theta,p \rangle\longmapsto_1 S_1$ and

1186:                     $\langle \theta,p \rangle\longmapsto_2 S_2$)

1187:              imply

1188:    $\ S_1 \sqsubseteq S_2$,  \\

1189:             & & for all

1190:                                   $\langle \theta,p \rangle\!\in

1191:                                   \!{\it CUD}.$

1192: \end{tabular}

1193: \end{center}

1194: \end{definition}

1195: %

1196:

1197: The following result is straightforward.

1198:

1199: \begin{proposition}

1200: \label{CB-cpo}

1201: $\langle {\it CB}, \sqsubseteq \rangle$  is a pointed cpo, i.e.,

1202: \begin{enumerate}

1203: \itemsep 2pt

1204: \item the relation $\sqsubseteq$  on

1205:           {\it CB} is a partial order;

1206: \item {\it CB} has a minimum element, which is

1207:                          the concrete behavior

1208:                          $\longmapsto_\bot$ such that

1209:                       for all $\langle \theta,p \rangle\!\in\! {\it CUD}$,

1210:                             $\langle \theta,p \rangle\longmapsto_\bot<\bot>$;

1211: \item every chain $(\longmapsto_i)_{i\in\N}$ in {\it CB}

1212:        has a least upper bound, denoted by

1213:                                 $\sqcup_{i=0}^{\infty}\longmapsto_i$;

1214:       $\sqcup_{i=0}^{\infty}\longmapsto_i$ is the concrete behavior

1215:       $\longmapsto$  such that,

1216:        for all $ \langle \theta,p \rangle\!\in \!{\it CUD}$,

1217:       $\langle \theta,p \rangle\longmapsto \sqcup_{i=0}^{\infty}S_i$,

1218:       where $\langle \theta,p \rangle\longmapsto_i S_i$

1219:                               $(\forall i\!\in\!\N)$.

1220:

1221: \end{enumerate}

1222: \end{proposition}

1223:

1224:

1225: \subsection{Concrete Operations}

1226: \label{CO}

1227:

1228: We specify here the concrete operations which are used in the

1229: definition of the concrete semantics.

1230: The choice of these particular operations is motivated by the fact that

1231: they have useful (i.e., practical)

1232: abstract counterparts  (see Sections~\ref{sec:AS}, \ref{sec:GAIA}

1233: and~\ref{sec:FASAS}).

1234: %

1235: %The following notations will be used:

1236: %$c$ denotes a clause of {\it P};

1237: % $l$ denotes a literal occurring in the

1238: %body of $c$ (it can be a procedure call or a unification literal);

1239: %$D=\{x_1,\ldots,x_n\}$ is the set of all variables occurring in

1240: %the head of $c$;

1241: %$D'=\{x_1,\ldots,x_m\}$ is the set of all variables

1242: %occurring in $c$ (hence, $n\leq m$);

1243: %$D''=\{x_{i_1},\ldots,x_{i_{r}}\}$ is the set of all

1244: %variables occurring in

1245: %$l$;

1246: %$D'''$ is equal to $\{x_{1},\ldots,x_{r}\}$.

1247: The concrete operations are polymorphic since their

1248: exact signature depends on a clause $c$ or a literal $l$ or both.\\

1249:

1250: Let $c$ be a clause,  $D=\{x_1,\ldots,x_n\}$ be the set of all variables

1251: occurring in the head of $c$, and

1252: $D'=\{x_1,\ldots,x_m\}$ ($n\leq m$) be the set of all variables

1253: occurring in $c$.\\

1254:

1255: \noindent

1256: {\bf Extension at  Clause Entry}\  :\ \

1257:      {\tt EXTC}$(c,\cdot):{\it CPS}_D

1258: \rightarrow({\it CPSS}_{D'}\times {\it CF}) $\ \\

1259: This operation extends a substitution $\theta$ on

1260: the set of variables in $D$

1261: to the set of variables in $D'$.

1262: Let $\theta\!\in\!{\it CPS}_{D}$.

1263: %

1264: \begin{quote}

1265: \begin{tabular}{llll}

1266: ${\tt EXTC}(c,\theta)$ & $ = $ &

1267: $ \langle  < [\![\theta']\!]>, {\it nocut} \rangle$

1268:                    & \\

1269:

1270: \end{tabular}

1271: \end{quote}

1272: \noindent

1273:  where $x_i\theta'=x_i\theta$ $(\forall i:1\leq i \leq n)$ and

1274:   $x_{n+1}\theta'$, \dots , $x_{m}\theta'$ are

1275:                distinct standard    variables    not belonging to ${\it codom}(\theta)$.

1276:

1277: %

1278: \vskip0.3cm

1279: %

1280:

1281: \noindent

1282: {\bf Restriction at Clause Exit}\ :\ \

1283:  {\tt RESTRC}$(c,\cdot):({\it CPSS}_{D'}\times {\it CF})

1284:                          \rightarrow({\it CPSS}_D\times {\it CF}) $ \ \\

1285: This operation restricts a pair $\langle S,{\it cf}\rangle$,

1286: representing the result of the execution of $c$ on the set of

1287: variables in $D'$,

1288: to the set of variables in $D$.

1289: Let  $\langle S, {\it cf}\rangle

1290: \!\in\!({\it CPSS}_D'\times {\it CF})$.

1291: \begin{quote}

1292: \begin{tabular}{llll}

1293: ${\tt RESTRC}(c,\langle S,{\it cf}\rangle)$ & $=$ &

1294:                   $  \langle [\![S']\!],{\it cf}\rangle$ &

1295: where $ S'=S_{/{D}} $.

1296: \end{tabular}

1297: \end{quote}

1298: %

1299: \vskip0.3cm

1300:

1301: Let $l$ be a literal occurring in the body of $c$,

1302: $D''=\{x_{i_1},\ldots,x_{i_{r}}\}$ be the set of

1303: variables occurring in $l$, and

1304: $D'''$ be equal to $\{x_{1},\ldots,x_{r}\}$.

1305:

1306: \vskip0.3cm

1307: \noindent

1308: {\bf Restriction before a Call}\ :\ \

1309:  {\tt RESTRG}$(l,\cdot):{\it CPS}_{D''}

1310:                          \rightarrow{\it CPS}_{D'''} $\ \\

1311: This operation expresses a substitution $\theta$ on the

1312: parameters $x_{i_1},\ldots,x_{i_r}$ of a call $l$

1313: in terms of the formal parameters $x_1,\ldots,x_r$ of $l$. Let

1314:   $\theta \!\in\!{\it CPS}_{D''}$.

1315: \begin{quote}

1316: \begin{tabular}{lll}

1317: ${\tt RESTRG}(l,\theta)$ & $=$ &

1318:      $  [\![\{x_1/x_{i_1}\theta,\ldots,x_{r}/x_{i_{r}}\theta\}]\!]  .$

1319: \end{tabular}

1320: \end{quote}

1321: %

1322:

1323:

1324: \vskip0.3cm

1325: \noindent

1326: {\bf Extension of the Result of a Call}\ :\ \

1327:  {\tt EXTG}$(l,\cdot,\cdot): {\it CPS}_{D'}\times {\it CPSS}_{D'''}

1328:                          \not\rightarrow{\it CPSS}_{D'} $ \ \\

1329: \noindent

1330: This operation extends a substitution $\theta$

1331: with a substitution sequence $S$ repre\-senting the result of

1332: executing a call $l$ on $\theta$.

1333: Hence, it is only used in contexts where the substitutions that are elements

1334: of $S$ are (roughly speaking) instances of $\theta$.

1335: %We leave it undefined in the other cases.

1336: %Assume that $l$ is of the form $p'(x_{i_1},\ldots,x_{i_{r}}$

1337: %(unification literals are dealt with in the same way).

1338:  Let

1339:  $\theta\in {\it CPS}_{D'}$.

1340: Let $S\in {\it CPSS}_{D'''} $ be of the form

1341: $<\theta'\sigma_1,\ldots,\theta'\sigma_i,\_>$ where

1342: %$\theta'={\tt RESTRG}(l,\theta)$

1343: $x_j\theta'=x_{i_j}\theta$

1344: $(1\leq j\leq {r})$

1345: and the

1346: $\sigma_i$ are standard substitutions such that

1347: ${\it dom}(\sigma_i)\subseteq {\it codom}(\theta')$.

1348: %

1349: %\footnote{

1350: %Assuming that $S$ is of the form

1351: %$<\theta'\sigma'_1,\ldots,\theta'\sigma'_i,\_>$

1352: %where the $\sigma'_i$ are {\em arbitrary} standard substitutions,

1353: %the $\sigma_i$ exist and they are unique.}

1354: %

1355: %{\tt EXTG}$(l,\theta,S)$ is computed as follows.

1356: %

1357: %Let $\rho$ be a renaming such that

1358: %$x_j\theta'=x_{i_j}\theta\rho$

1359: %$(1\leq j\leq {n'})$.

1360: Let $\{z_1,\ldots,z_s\}={\it codom}(\theta)

1361:                              \setminus{\it codom}(\theta')$.

1362:       Let $y_{i\mbox{\rm \tiny ,}1},\ldots,y_{i\mbox{\rm \tiny ,}s}$

1363:       be distinct standard variables

1364:       not belonging to ${\it codom}(\theta)\cup {\it codom}(\sigma_i)$

1365:       $(1\leq i\leq {\it Ns}(S))$.

1366:       Let $\rho_i$ be a renaming of the form $\{z_1/y_{i\mbox{\rm \tiny ,}1},\ldots,

1367:                     z_s/y_{i\mbox{\rm \tiny ,}s},

1368:                     y_{i\mbox{\rm \tiny ,}1}/z_1,\ldots,

1369:                     y_{i\mbox{\rm \tiny ,}s}/z_s\}$.

1370: \begin{quote}

1371: \begin{tabular}{lll}

1372: ${\tt EXTG}(l,\theta,S)$ & $=$ &

1373:  $  [\![<\theta\rho_1\sigma_1,\ldots,\theta\rho_i\sigma_i,\_>]\!]  .$

1374: \end{tabular}

1375: \end{quote}

1376: %

1377: %\vskip0.2cm

1378:

1379: It is easy to see that the value of  ${\tt EXTG}(l,\theta,S)$  does not depend on the

1380: choice of the $y_{i\mbox{\rm \tiny ,}j}$.

1381: Moreover,  it  is not defined when $S$ is not

1382: of the above mentioned form.\\

1383:

1384: %

1385: \noindent

1386: {\bf Unification of Two Variables}\ :\ \

1387:    {\tt UNIF-VAR}$: {\it CPS}_{\{x_1,x_2\}}

1388:                      \rightarrow {\it CPSS}_{\{x_1,x_2\}}$\  \\

1389: Let $\theta\!\in\!{\it CPS}_{\{x_1,x_2\}}$.

1390: This operation unifies $x_1\theta$ with $x_2\theta$.

1391: %

1392: \begin{quote}

1393: \begin{tabular}{llll}

1394: {\tt UNIF-VAR}$(\theta)$ & $ =$  & $<> $ &

1395:                        if $x_1\theta$ and $x_2\theta$ are not unifiable,\\

1396: %

1397:   &  $ =$ & $[\![<\theta\sigma>]\!]$ & where $\sigma\!\in\! {\it mgu}

1398:                                                    (x_1\theta,x_2\theta)$,

1399:                                        otherwise.

1400: \end{tabular}

1401: \end{quote}

1402:

1403: \vskip0.3cm

1404: %

1405: \noindent

1406: {\bf Unification of a Variable and a Functor}\ :\ \

1407:    {\tt UNIF-FUNC}$(f,\cdot): {\it CPS}_{D}

1408:                      \rightarrow {\it CPSS}_{D}$\ \\

1409: Given a functor $f$ of arity $n-1$ and a substitution

1410: $\theta\!\in\!{\it CPS}_{D}$ where $D=\{x_1,\ldots,x_n\}$, the

1411: {\tt UNIF-FUNC} operation unifies $x_1\theta$ with $f(x_2,\ldots,x_n)

1412: \theta$.

1413: %

1414: \begin{quote}

1415: %\hskip-1.2cm

1416: \begin{tabular}{llll}

1417: {\tt UNIF-FUNC}$(f,\theta)$ & $ =$  & $<> $ &

1418:                        if $x_1\theta$ and $f(x_2,\ldots,x_{n}\theta)$ are not unifiable,\\

1419: %

1420:   &  $ =$ & $[\![<\theta\sigma>]\!]$ & where $\sigma\!\in\! {\it mgu}

1421:                                                    (x_1\theta,f(x_2,\ldots,

1422: x_{n})\theta)$,

1423:                                        otherwise.

1424: \end{tabular}

1425: \end{quote}

1426: \vskip0.3cm

1427:

1428: All   operations above

1429: %defined in this section

1430: are monotonic and continuous.

1431: We assume that

1432: Sets of program substitutions are  endowed with the %trivial

1433: ordering $\sqsubseteq$ such that $\theta\sqsubseteq\theta'$ iff

1434: $\theta=\theta'$.

1435:

1436: \subsection{Concrete Semantic Rules}

1437: \label{CSR}

1438:

1439: The concrete semantics of the underlying program $P$ is the least fixpoint

1440: of a conti\-nu\-ous transformation on {\it CB}

1441: (the set of concrete behaviors).

1442: This transformation is defined in terms of a set of semantic rules

1443: that naturally extend a concrete behavior to a continuous function

1444: defining the input/output behavior of every prefix of the body of a

1445: clause,

1446: every clause, every suffix

1447: %\footnote{as defined by the abstract syntax}

1448:  of a procedure and every procedure of $P$.

1449: This function is called {\em extended concrete behavior}

1450: and maps each element of the extended concrete underlying domain

1451: to a substitution sequence, possibly with cut information, as defined below.

1452:

1453: \begin{definition} [Extended Concrete Underlying Domain]

1454: The {\em extended concrete underlying domain},

1455: denoted by {\it ECUD}, consists of

1456: \begin{enumerate}

1457: \itemsep 2pt

1458: \item

1459:  all triples $\langle \theta, g, c \rangle $, where

1460:   $c$ is a clause of $P$,

1461:   $g$ is a prefix of the body of $c$,

1462:   and $\theta$ is a canonical program substitution over the variables

1463:   in the head of $c$;

1464: \item

1465:  all pairs $\langle \theta,  c \rangle $, where

1466:   $c$ is a clause of $P$

1467:   and $\theta$ is a canonical program substitution over the variables

1468:   in the head of $c$;

1469: \item

1470:  all pairs $\langle \theta,  {\it pr} \rangle $, where

1471:   ${\it pr}$ is a suffix of a procedure of $P$

1472: %  or a suffix of a procedure of $P$

1473:   and $\theta$ is a canonical program substitution over the variables

1474:   in the head of the

1475:   clauses of {\it pr}.

1476: \end{enumerate}

1477: \end{definition}

1478:

1479: \begin{definition} [Extended Concrete Behaviors]

1480: An {\em extended concrete behavior} is a total function from

1481: {\it ECUD} to the set ${\it CPSS}\cup({\it CPSS}\times{\it CF})$ such that

1482: \begin{enumerate}

1483: \itemsep 2pt

1484: \item

1485: every triple $\langle \theta, g, c \rangle $ from {\it ECUD}

1486:       is mapped to a program substitution sequence with cut information $\langle S, {\it cf}\rangle $ such that

1487:       ${\it dom}(S)$ is the set of all varia\-bles in  $c$;

1488: \item every pair $\langle \theta,  c \rangle $ from {\it ECUD}

1489:       is mapped to a program substitution sequence with cut information $\langle S, {\it cf}\rangle $ such that

1490:       ${\it dom}(S)$ is the set of variables in the head of  $c$;

1491: \item every pair $\langle \theta,  {\it pr} \rangle $ from {\it ECUD}

1492:       is mapped to a program substitution sequence $S$ such that

1493:       ${\it dom}(S)$ is the set of variables in the head of the

1494:       clauses of  {\it pr}.

1495: \end{enumerate}

1496:

1497: \end{definition}

1498:

1499:

1500: The set of extended concrete behaviors is endowed with a structure of

1501: pointed cpo in the obvious way. It is denoted by {\it ECB}; its

1502: elements  are denoted by $\longmapsto$.

1503:

1504:

1505:

1506: Let $\longmapsto$ be a concrete behavior. The concrete semantic rules

1507: depicted in Figu\-re~\ref{CSRF} define an extended concrete behavior

1508: derived from $\longmapsto$. This extended concrete behavior is

1509: denoted by the same symbol $\longmapsto$. This does not lead to

1510: confusion since the inputs of the two functions belong to different

1511: sets. The definition proceeds by induction on the syntactic structure

1512: of $P$.

1513:

1514:

1515: %\noindent

1516: The concrete semantic rules  model Prolog operational semantics

1517: through the notion of program substitution sequence.

1518: Rule {\bf R1} defines the program substitution sequence with

1519: cut information at the entry point of a clause.

1520: Rules {\bf R2} and {\bf R3} define the effect of the execution of a cut

1521: at the clause level. Rules {\bf R4},

1522: {\bf R5} and {\bf R6} deal with execution of literals;

1523: procedure calls are solved by using the concrete behavior

1524: $\longmapsto$ as an oracle.

1525: Rule {\bf R7} defines the result of a clause.

1526: Rules {\bf R8} and {\bf R9} define the result of a procedure

1527: by structural induction on its suffixes.

1528: Rule {\bf R8} deals

1529: with the suffix consisting of the last clause only: it simply

1530: forgets the cut information, which is not meaningful at the procedure

1531: level.

1532: Rule {\bf R9} combines the result of a clause with the

1533: (combined) result of the next clauses in the same procedure:

1534:  it deals with the execution of a cut at the procedure level.

1535: The expression ${\Box}^{{\it Ne}(S)}_{k=1} S_k$ used in

1536:  Rules {\bf R4},

1537: {\bf R5} and {\bf R6}  deserves an

1538: explanation: when the sequence $S$ is incomplete, it is assumed that

1539: $S_{_{{\it Ne}(S)}}=<\bot>$. This convention

1540: is necessary to propagate the non-termination of $ g'$

1541: to $g$.

1542:

1543:

1544:

1545: %\noindent

1546:

1547: The following results are instrumental for  proving the well-definedness

1548: of the concrete semantics.

1549: %

1550:

1551: \begin{proposition}[Properties of the Concrete Semantic Rules]

1552: %\label{PCR}

1553: %\vskip0.2cm

1554:   \begin{enumerate}

1555: \itemsep 2pt

1556:   \item Given a concrete behavior, the concrete semantic rules define

1557:         a unique extended concrete behavior, i.e.,

1558:         a unique mapping from

1559:         {\it CB} to {\it ECB}. %The mapping is denoted by overloading

1560:         %the symbols denoting corresponding elements from

1561:          %{\it CB} and {\it ECB}.

1562: This mapping is continuous.

1563:   \item Rules {\bf R1} to {\bf R6} have a conclusion of the form

1564:         $\langle \theta, g, c \rangle \longmapsto

1565:          \langle S, {\it cf}\rangle $.

1566:         In all cases, $S$ is of the

1567:         form $<\theta'\sigma_1,\dots,\theta'\sigma_i,\_>$,

1568:         where the $\sigma_i$ are standard substitutions and

1569:         $\langle\theta',{\it nocut}\rangle={\tt EXTC}(c,\theta)$.

1570:   \\  Rules {\bf R7} to {\bf R9} have a conclusion

1571:     of the form  $\langle \theta, \cdot\rangle \longmapsto  S$.

1572: In all cases, $S$ is of the form

1573: 	 \mbox{$<\theta\sigma_1,\dots,\theta\sigma_i,\_>$},

1574: 	 where the $\sigma_i$ are standard substitutions.

1575:       \end{enumerate}

1576:  \end{proposition}

1577:

1578:

1579:

1580:

1581:  \subsection{Concrete Semantics}

1582:  \label{TCS}

1583:

1584:

1585:  The concrete semantics of the underlying program $P$ is

1586: defined as the least fixpoint of the following concrete transformation.

1587:

1588:  \begin{definition}[Concrete Transformation]

1589:  \label{CT}

1590:

1591:  \noindent

1592: The transformation

1593:  %$\stackrel{{\it TCB}}\cdot : {\it CB}\rightarrow{\it CB}$

1594: ${\it TCB} : {\it CB}\rightarrow{\it CB}$

1595: is defined

1596: as follows: for all $\longmapsto \in {\it CB}$,

1597:  %

1598: \vskip0.2cm

1599:  \begin{center}

1600:  \begin{tabular}{rcl}

1601:    & {\it pr} is a procedure of {\it P}                         & \\

1602:    & $p$ is the name of  {\it pr}                                &\\

1603:    & $\langle \theta, {\it pr} \rangle

1604:     \longmapsto S$                                            &  \\

1605:  {\bf T1} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}

1606: 							 &    \\

1607:   &  $\langle \theta, p \rangle

1608:      \stackrel{{\it TCB}}\longmapsto

1609: 		   S$                                             \\

1610:  \end{tabular}

1611:  \end{center}

1612: where $\stackrel{{\it TCB}}\longmapsto $ stands for ${\it TCB}(\longmapsto)$.

1613: Remember that

1614: $\langle \theta, {\it pr} \rangle

1615:     \longmapsto S$ is defined by means of the previous rules

1616:  which use the concrete behavior $\longmapsto$ as an oracle to solve the procedure calls.

1617:  \end{definition}

1618:  %

1619:

1620:

1621:

1622: %

1623: {\begin{figure}[top]

1624: \figrule

1625: \begin{it}

1626:

1627: \begin{center}

1628: \begin{tabular}{rc}

1629: \\

1630: &  g\  ::=\ $<>$                                                 \\

1631: {\bf R1}  & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}

1632:     \\

1633: & $\langle \theta, g, c \rangle

1634:    \longmapsto

1635:  {\tt EXTC}(c,\theta)$    \\                                     \\

1636: \end{tabular}

1637: \end{center}

1638: %

1639: %

1640: %

1641: \hspace{-7cm}

1642: \begin{tabular}{cc}

1643: %

1644: %

1645: %

1646: \begin{tabular}{lc}

1647: &  g\  ::=\ g' , !                                             \\

1648: &  $\langle \theta, g',c \rangle

1649:    \longmapsto

1650:    \langle S, {\it cf} \rangle$                               \\

1651:  & $S\in\{<\bot>,<>\}$                                         \\

1652: {\bf R2}  & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}

1653:  \\

1654: & $\langle \theta, g, c \rangle

1655:    \longmapsto

1656:    \langle S, {\it cf} \rangle$                                \\

1657: \end{tabular}

1658: %

1659: &

1660: %

1661: \hspace{-6cm}

1662: \begin{tabular}{lc}

1663:  & g\  ::=\ g' , !                                             \\

1664:  & $\langle \theta, g',c \rangle

1665:    \longmapsto

1666:    \langle S, {\it cf} \rangle$                               \\

1667:   & $S = <\theta'>::S'$                                         \\

1668: {\bf R3} &\raisebox{.7ex}{\underline{\hspace*{3.7cm}}}

1669:     \\

1670: & $\langle \theta, g, c \rangle

1671:    \longmapsto

1672:    \langle <\theta'>, {\it cut} \rangle$                       \\

1673: \end{tabular}

1674: %

1675: \\ \\ \\

1676: %

1677: \begin{tabular}{lc}

1678:  & g\  ::=\ g' , l                                             \\

1679:   & l\  ::=\ $x_i$=$x_j$                                        \\

1680:

1681:   & $\langle \theta, g',c \rangle

1682:    \longmapsto

1683:    \langle S, {\it cf} \rangle$                               \\

1684:   & $S = <\theta_1,\dots,\theta_i,\_>$                          \\

1685:

1686: & $\left\{\begin{array}{c}

1687:  \theta'_k = {\tt RESTRG}(l,\theta_k)                        \\

1688: S_{k}^{' }=\mbox{\tt UNIF-VAR}(\theta'_k)                      \\

1689: S_k = {\tt EXTG}(l,\theta_k,S_{k}^{' })                     \\

1690:  (1\leq k\leq {\it Ns}(S))           \end{array}\right\}$       \\

1691:

1692: {\bf R4} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}

1693:                 \\

1694: & $\langle\theta, g, c \rangle

1695:   \longmapsto

1696:  \langle {\Box}^{{\it Ne}(S)}_{k=1} S_k,

1697:                           {\it cf} \rangle$                   \\

1698: \end{tabular}

1699: %

1700: &

1701: %

1702: \hspace{-6cm}

1703: \begin{tabular}{lc}

1704:  & g\  ::=\ g' , l                                             \\

1705:  & l\  ::=\

1706:    $x_{i_1}$=f($x_{i_2}$, \ldots, $x_{i_n}$)                  \\

1707:

1708:   & $\langle \theta, g',c \rangle

1709:    \longmapsto

1710:    \langle S, {\it cf} \rangle$                               \\

1711:   & $S = <\theta_1,\dots,\theta_i,\_>$                          \\

1712:

1713: & $\left\{\begin{array}{c}

1714: \theta'_k = {\tt RESTRG}(l,\theta_k)                        \\

1715: S_{k}^{' }=\mbox{\tt UNIF-FUNC}(f,\theta'_k)                   \\

1716: S_k = {\tt EXTG}(l,\theta_k,S_{k}^{' })                     \\

1717:   (1\leq k\leq {\it Ns}(S) )   \end{array}\right\}$         \\

1718:

1719: {\bf R5} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}

1720:           \\

1721: & $\langle\theta, g, c \rangle

1722:   \longmapsto

1723:  \langle {\Box}^{{\it Ne}(S)}_{k=1} S_k,

1724:                           {\it cf} \rangle$                   \\

1725: \end{tabular}

1726: %

1727: \\ \\ \\

1728: %

1729: %

1730: \begin{tabular}{lc}

1731:  &  g\  ::=\ g' , l                                             \\

1732:   & l\  ::=\

1733:     p($x_{i_1}$, \ldots, $x_{i_n}$)                           \\

1734:

1735:   & $\langle \theta, g',c \rangle

1736:    \longmapsto

1737:    \langle S, {\it cf} \rangle$                               \\

1738:   & $S = <\theta_1,\dots,\theta_i,\_>$                          \\

1739:

1740:  & $\left\{\begin{array}{c}

1741: \theta'_k = {\tt RESTRG}(l,\theta_k)                        \\

1742: \langle  \theta'_k ,p\rangle

1743:     \longmapsto  S_{k}^{'}                                   \\

1744: S_k = {\tt EXTG}(l,\theta_k,S_{k}^{' })                     \\

1745:    (1\leq k\leq {\it Ns}(S))       \end{array}\right\}$  \\

1746:

1747: {\bf R6} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}

1748:                         \\

1749: & $\langle\theta, g, c \rangle

1750:   \longmapsto

1751:  \langle {\Box}^{{\it Ne}(S)}_{k=1} S_k,

1752:                           {\it cf} \rangle$                   \\

1753: \end{tabular}

1754: %

1755: &

1756: %

1757: \hspace{-6cm}

1758: \begin{tabular}{lc}

1759: \\

1760: \\

1761: \\

1762: \\

1763: \\

1764:                                                               \\

1765:   & c \  ::=\  h  :- g.                                         \\

1766:   & $\langle \theta, g ,c \rangle

1767:    \longmapsto

1768:    \langle S, {\it cf} \rangle$                               \\

1769: {\bf R7} &\raisebox{.7ex}{\underline{\hspace*{3.7cm}}}

1770:                             \\

1771:   & $\langle \theta, c \rangle

1772:    \longmapsto

1773: {\tt RESTRC}(c, \langle S, {\it cf} \rangle)$                 \\

1774: \end{tabular}

1775: %

1776: \\ \\ \\

1777: %

1778: %

1779: \begin{tabular}{lc}

1780:                                                               \\

1781:  & pr \  ::=\  c                                               \\

1782:   & $\langle \theta, c \rangle

1783:    \longmapsto

1784:    \langle S, {\it cf} \rangle$                               \\

1785: {\bf R8} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}

1786:                                       \\

1787:   & $\langle \theta, {\it pr} \rangle

1788:    \longmapsto S$                                             \\

1789: \end{tabular}

1790: %

1791: &

1792: %

1793: \hspace{-6cm}

1794: \begin{tabular}{lc}

1795:   & pr \  ::=\  c {\it pr}'                                     \\

1796:   & $\langle \theta, c \rangle

1797:    \longmapsto

1798:    \langle S, {\it cf} \rangle$                               \\

1799:   & $\langle \theta, {\it pr}' \rangle

1800:    \longmapsto S'$                                            \\

1801: {\bf R9} & \raisebox{.7ex}{\underline{\hspace*{3.7cm}}}

1802:                                     \\

1803:   & $\langle \theta, {\it pr} \rangle

1804:    \longmapsto

1805:    \langle S, {\it cf} \rangle \Box S'$                       \\

1806: \end{tabular}\\ \\

1807:

1808:

1809: \end{tabular}

1810: \end{it}

1811: \caption{Concrete semantic rules}

1812: \label{CSRF}

1813: \end{figure}}

1814:

1815:

1816: The transformation ${\it TCB}$

1817: %$\stackrel{{\it TCB}}\cdot$

1818:  is well-defined  and continuous.

1819:

1820:  \begin{definition}[Concrete Semantics]

1821:  \label{DCS}

1822:

1823:  \noindent The concrete semantics of the underlying program $P$ is the least

1824:  concrete beha\-vior $\longmapsto$ such that

1825:  $$ \longmapsto\ =\ \stackrel{{\it TCB}}\longmapsto.$$

1826:  \end{definition}

1827:  %

1828:  %

1829:  %

1830:

1831:

1832:

1833:  \subsection{Correctness of the Concrete Semantics}

1834:  \label{CCS}

1835:

1836:  Since OLD-resolution \cite{Lloyd,Tamaki86} is the standard semantics of

1837:  pure Prolog augmented with cut,  our concrete

1838:  semantics and OLD-resolution  have to be proven equivalent.

1839:  The proof is fairly complex because OLD-resolution is not compositional.

1840:  Consequently, the two semantics do not  naturally  match.

1841:  The equivalence proof is given in \cite{Cut_rep95}. In this section, we

1842:  only give the principle of the proof.

1843:  \begin{enumerate}

1844:  \itemsep 2pt

1845:  \item We assume that OLD-resolution uses standard variables to

1846:  rename clauses apart.

1847:  The  initial queries are also assumed to contain

1848:  standard variables only.

1849:  %

1850:  \item The notion of {\em incomplete OLD-tree limited to depth $k$}

1851:  is defined ({\it IOLD}$_k$-tree, for short).

1852:  Intuitively, an {\it IOLD}$_k$-tree is an OLD-tree modified according

1853:  to the following rules:

1854:        \begin{enumerate}

1855: \itemsep 2pt

1856:        \item procedure calls may be unfolded only down to depth $k$;

1857:        \item branches that end at a node whose leftmost literal may not

1858: 	    be unfolded are called {\em incomplete};

1859:        \item a  depth-first left-to-right traversal of the tree is performed in

1860: 	    order to determine the cuts that are reached by the standard

1861: 	    execution and to prune the tree accordingly; see \cite{Lloyd};

1862:        \item the traversal ends when the whole tree has been visited  or

1863: 	     when a node that may not be unfolded is reached;

1864:        \item the branches on the right of the left-most incomplete branch

1865: 	     are pruned (if such a branch exists).

1866:  \end{enumerate}

1867:  \item Assuming a query of the form $p(t_1,\dots,t_n)$ and denoting

1868:  the concrete beha\-vior

1869:   %$\stackrel{_{_{{\it TCB}^k}}}{\longmapsto_{\bot}}$

1870: ${\it TCB}^k({\longmapsto_{\bot}})$

1871:  by $\longmapsto_k$, it can be shown that the sequence of computed

1872:  answer substitutions $<\sigma_1,\dots,\sigma_i,\_>$ for the

1873:  {\it IOLD}$_k$-tree of $p(t_1,\dots,t_n)$ is such that

1874:  $\langle \theta, p\  \rangle \longmapsto_k

1875: 	 [\![ <\theta\sigma_1,\dots,\theta\sigma_i,\_>]\!]$

1876:  where $\theta=\{x_1/t_1,\dots,x_n/t_n\}$.

1877:  \item The equivalence of our concrete semantics and OLD-resolution

1878:     is a simple consequence of the previous result.\\ For every query

1879: $p(t_1,\dots,t_n)$, $<\sigma_1,\dots,\sigma_i,\_\!>$

1880: is the sequence of computed answer substitutions of $p(t_1,\dots,t_n)$

1881: according to OLD-resolution if and only if

1882: $\langle \theta, p\  \rangle \longmapsto

1883: 	 [\![ <\theta\sigma_1,\dots,\theta\sigma_i,\_>]\!]$

1884: where $\theta=\{x_1/t_1,\dots,x_n/t_n\}$ and

1885: $\longmapsto$ is the concrete behavior of the program

1886:     according to our concrete semantics.

1887: %Assuming the same

1888: %    notation as above, the relation

1889: %    $\langle \theta, p\  \rangle \longmapsto

1890: %	 [\![ <\theta\sigma_1,\dots,\theta\sigma_i,\_>]\!]$

1891: %    holds, where $<\sigma_1,\dots,\sigma_i,\_>$ is the sequence of computed

1892: %    answer substitutions for the query $p(t_1,\dots,t_n)$

1893: %    (according to OLD-resolution)

1894: %    and $\longmapsto$ is the concrete behavior of the program

1895: %    (according to our concrete semantics).

1896:  \end{enumerate}

1897:

1898:

1899:  In fact, the correctness of our concrete semantics should be

1900:  close to obvious to anyone who knows about both Prolog and denotational

1901:  semantics.

1902: %Moreover, the semantics of the cut is  naturally

1903: % expressed in our formalization.

1904: So, the equivalence proof

1905:  is

1906:  a formal technical exercise, which adds little to our basic

1907:  understan\-ding of the concrete semantics.

1908:

1909:  \subsection{Related Works}

1910:  \label{RW_CS}

1911:

1912:  Denotational semantics for Prolog have been proposed before

1913:   \cite{debruin,Debray88b,JonesMy}. Our concrete semantics is not

1914:  intended to improve on these works from the language understanding

1915:  standpoint. Instead, it is merely designed as a basis for an abstract

1916:  interpretation framework; in particular, it uses concrete operations

1917:  that are as close as possible to the operations used by the structural

1918:  operational semantics presented in \cite{ACTA95} upon which

1919:  our previous frameworks

1920:  are based. This allows us to reuse

1921:  much of the material from our existing abstract domains and generic

1922:  algorithms; see, \cite{SPE,ICLP91AI,TOPLAS,ACTA95}.

1923:  The idea of distinguishing between

1924:  finite, incomplete, and infinite sequences is originally due to

1925:   Baudinet \shortcite{Baudinet}.

1926:  %

1927:  %

1928:

1929:

1930: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1931: \section{Abstract semantics}

1932: \label{sec:AS}

1933: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1934:

1935: %We are now in position to define the abstract semantics.

1936: As we have already

1937:  explained in the introduction, our abstract semantics is  not

1938:  defined as a

1939: least fixpoint of an abstract transformation

1940: but instead as a set of post-fixpoints

1941:  that fulfill a safety requirement, namely

1942: pre-consistency. Moreover, the abstract domains are assumed to

1943:  represent so-called

1944: chain-closed sets of concrete elements as specified below.

1945:

1946: %This section is organized as follows: Section \ref{sub:AD} describes the abstract domains.

1947: %Section \ref{sub:AO} specifies the abstract operations.

1948: % The abstract semantics is defined in Section \ref{sub:ASem}

1949: %and its safety properties are stated and proven in Section \ref{sub:SAS}. Finally, in

1950: %Section \ref{sub:D}   our approach is discussed and compared with other  ones.

1951:

1952:

1953:

1954: \subsection{Abstract Domains}

1955: \label{sub:AD}

1956:

1957:

1958:

1959: %In this section, we describe the abstract domains that are used by our abstract

1960: %interpretation framework. In particular

1961: We  state here

1962: the mathematical assumptions that are required to be satisfied by the

1963: abstract domains.

1964: Specific abstract domains will be described

1965: in  Section \ref{sec:FASAS}.\\

1966:

1967: \noindent

1968: {\bf Abstract Substitutions.}

1969: %\label{subsub:ASub}

1970: For   every finite set $D$ of program variables, we denote by

1971: ${\it CS}_D$ the set

1972: %of program substitutions

1973: $\wp({\it PS}_D)$.

1974: %{ Abstract substitutions} denote sets of program substitutions,

1975: %i.e., elements of ${\it CS}_D$, as follows.

1976: %

1977: %\begin{definition}%

1978: %\label{def:ASu}%

1979: %[Abstract Substitutions]

1980: %

1981: \noindent A domain of abstract substitutions is a family of sets ${\it AS}_D$

1982: indexed by the finite sets $D$ of program variables.

1983: Elements of ${\it AS}_D$ are called {\em abstract substitutions};

1984: they are denoted by $\beta$.

1985: Each set ${\it AS}_D$ is endowed with a partial order $\leq$

1986: and a monotonic  {\em concretization} function

1987: ${\it Cc}:{\it AS}_D\rightarrow {\it CS}_D$

1988: associating to each  abstract substitution $\beta$ the set

1989: ${\it Cc}(\beta)$ of program

1990: substitutions it denotes.\\

1991: %\end{definition}

1992:

1993: \noindent

1994: {\bf Abstract Sequences.}

1995: %\label{subsub:ASeq}

1996: For every finite set $D$ of program variables, we denote by

1997: ${\it CSS}_D$ the set $\wp({\it PSS}_D)$.

1998:  Abstract sequences denote chain-closed subsets of ${\it CSS}_D$.\\

1999: %

2000: %

2001: %\begin{definition}%

2002: %\label{def:ASeq}%

2003: %[Abstract Sequences]

2004: %

2005: \noindent A domain of abstract sequences is a family of sets ${\it ASS}_D$

2006: indexed by the finite sets $D$ of program variables.

2007: Elements of ${\it ASS}_D$ are called {\em abstract sequences};

2008: they are denoted by $B$.

2009: Each set ${\it ASS}_D$ is endowed with a partial order $\leq$

2010: and a monotonic {\em concretization} function

2011: ${\it Cc}:{\it ASS}_D\rightarrow {\it CSS}_D$.

2012: Moreover, the following properties are required to be satisfied: (1)

2013: %\begin{enumerate}

2014: %\itemsep 0pt

2015: %\item

2016: every ${\it ASS}_D$ contains an abstract sequence $B_{\bot}$

2017: such that $<\bot>\,\in {\it Cc}(B_{\bot})$;

2018: (2)

2019: %\item

2020: for every $B\in{\it ASS}_D$, ${\it Cc}(B)$ is {\em chain-closed}, i.e.,

2021:       for every chain $(S_i)_{i\in {\bf N}}$ of elements of ${\it Cc}(B)$,

2022:       the limit $\sqcup_{i=0}^{\infty}S_i$ also belongs to ${\it Cc}(B)$.

2023: %\end{enumerate}

2024: The disjoint union of all the ${\it ASS}_D$ is denoted by ${\it ASS}$.\\

2025: %${\it ASS}$ is assumed to be endowed with the sum ordering

2026: %\cite{Schmidt88}.

2027: %\end{definition}

2028:

2029: \noindent

2030: {\bf Abstract Sequences with Cut Information.}

2031: %\label{ASCI}

2032: Let ${\it CSSC}_D$ denote $\wp({\it PSS}_D\times {\it CF})$.

2033: %For every finite set $D$ of program variables, we denote by  ${\it CSSC}_D$

2034: %the set $\wp({\it PSS}_D\times {\it CF})$.

2035: %{ Abstract sequences with cut information} denote sets of

2036: %program substitution sequences with cut information,

2037: %i.e., elements of ${\it CSSC}_D$, as defined below.

2038: %

2039: %\begin{definition}%

2040: %\label{def:ASeqCI}%

2041: %[Abstract Sequences with Cut Information]

2042: %

2043: \noindent A domain of abstract sequences

2044: with cut information is a family of sets ${\it ASSC}_D$

2045: indexed by the finite sets $D$ of program variables.

2046: Elements of ${\it ASSC}_D$ are called

2047: {\em abstract sequences with cut information};

2048: they are denoted by $C$.

2049: Every set ${\it ASSC}_D$ is endowed with a partial order $\leq$

2050: and a monotonic {\em concretization} function

2051: ${\it Cc}:{\it ASSC}_D\rightarrow {\it CSSC}_D$.

2052: The disjoint union of all the ${\it ASSC}_D$ is denoted by ${\it ASSC}$.\\

2053: %and it is assumed to be endowed with the sum ordering.

2054: %

2055: %\end{definition}

2056:

2057: \noindent

2058: {\bf Abstract Behaviors.}

2059: %\label{AB}

2060: Abstract behaviors are the abstract counterpart of the concrete

2061: behaviors introduced in Section \ref{CB}. They are endowed with a

2062: weaker mathe\-matical structure as described below.

2063: As in the case of concrete behaviors, a fixed underlying program $P$ is

2064: assumed.

2065:

2066: %

2067: \begin{definition} [Abstract Underlying Domain]

2068: \label{AAUD}

2069: The {\em abstract underlying domain},

2070: denoted by {\it AUD},

2071:  is the set of all pairs $\langle \beta,p \rangle$ such that

2072: $p$ is  a procedure name

2073: in $P$ of arity $n$ and

2074: $\beta\!\in\!{\it AS}_{\{x_1,\dots,x_n\}}$.

2075: %where $n$ is the arity of $p$.

2076: \end{definition}

2077: %

2078: %

2079: \begin{definition}[Abstract Behaviors]

2080: \label{def:A}

2081: An {\em abstract behavior}

2082: is a total function ${\it sat}: {\it AUD}\longrightarrow {\it ASS}$

2083: mapping each pair  $\langle \beta,p \rangle\!\in\!{\it AUD}$  to an

2084: abstract sequence $B$ with $B\in{\it ASS}_{\{x_1,\dots,x_n\}} $,

2085: where $n$ is the arity of $p$.

2086: The set of all abstract behaviors is denoted by

2087: {\it AB}.

2088: The set {\it AB} is endowed with the partial ordering $\leq$ such that,

2089: for all ${\it sat}_1,{\it sat}_2\in{\it AB}$:

2090: %

2091: $$

2092: \begin{array}{lclr}

2093: {\it sat}_1 \leq {\it sat}_2

2094: & \;

2095: \mbox{\rm iff} \;

2096: &

2097: {\it sat}_1\langle \beta,p \rangle \leq

2098: {\it sat}_2\langle \beta,p \rangle, &\;

2099:    \forall \langle \beta,p \rangle \in {\it AUD}.

2100: \end{array}

2101: $$

2102: \end{definition}

2103:

2104: It would be reasonable to assume that abstract behaviors

2105: are monotonic functions but this is not necessary for the safety

2106: results. The notation {\it sat} stands for ``set of

2107: abstract tuples''. It is used because the abstract interpretation

2108: algorithm, derived from the abstract semantics, actually computes a set

2109: of tuples of the form $\langle \beta,p ,B\rangle $, i.e., a part of the

2110: table of an abstract behavior.

2111:

2112:

2113: \subsection{Abstract Operations}

2114: \label{sub:AO}

2115:

2116: In this section, we give the specification of the primitive

2117:   abstract operations

2118: used  by the abstract semantics.

2119: The specifications are safety assumptions which, roughly speaking,

2120: state that the abstract operations safely simulate the corresponding concrete

2121: ones.

2122: %To ease the presentation, we classify abstract operations in three

2123: %sets.

2124: In particular,

2125: operations {\tt EXTC}, {\tt RESTRG}, {\tt RESTRC},

2126:       {\tt UNIF-VAR}, {\tt UNIF-FUNC} are faithful abstract counterparts of

2127:       the corresponding concrete operations. Hence,

2128:       their specification simply states

2129:       that, if some concrete input belongs to the concretization of their

2130:     (abstract) input, then the corresponding concrete output  belongs to

2131:       the concretization

2132:        of their (abstract) output. Moreover, overloading the opera\-tion

2133:       names is natural in these cases.

2134:        Operation  {\tt AI-CUT}

2135:       deals with the cut; its specification is also straightforward.

2136: Operations {\tt EXTGS} and {\tt CONC} are related to the concrete

2137:       operations {\tt EXTG} and $\Box$ in a more involved way. We will discuss them  in more detail.

2138: Finally, operations {\tt SUBST} and {\tt SEQ} are simple conversion

2139:       operations to convert  an abstract domain into another.

2140:

2141:

2142:  Let us  specify the operations, using the notations of

2143: Section \ref{CO}.

2144: \\

2145: \\

2146: \noindent

2147: {\bf Extension at  Clause Entry}\ :\ \

2148:      {\tt EXTC}$(c,\cdot):{\it AS}_D

2149: \rightarrow{\it ASSC}_{D'} $\  \\

2150: %

2151: Let $\beta \in {\it AS}_{D}$ and $\theta \in {\it CPS}_{D}$.

2152: The following property is required to hold.

2153: %

2154: \begin{quote}$

2155: \begin{array}{lll}

2156: \theta\in{\it Cc}(\beta)

2157: &

2158: \Rightarrow

2159: &

2160: {\tt EXTC}(c,\theta)\in {\it Cc}({\tt EXTC}(c,\beta)).

2161: \end{array}$

2162: \end{quote}

2163: %

2164: %

2165: \vskip0.3cm

2166: \noindent

2167: {\bf Restriction at Clause Exit}\ :\ \

2168:  {\tt RESTRC}$(c,\cdot):{\it ASSC}_{D'}

2169:                          \rightarrow{\it ASSC}_D $\  \\

2170: Let $C\in {\it ASSC}_{D'}$ and $\langle S, {\it cf}\rangle

2171: \in({\it CPSS}_D'\times {\it CF})$.

2172: %

2173: \begin{quote}$

2174: \begin{array}{lll}

2175: \langle S, {\it cf}\rangle \in{\it Cc}(C)

2176: &

2177: \Rightarrow

2178: &

2179: {\tt RESTRC}(c,\langle S, {\it cf}\rangle)\in

2180: {\it Cc}({\tt RESTRC}(c,C)).

2181: \end{array}$

2182: \end{quote}

2183: %

2184: %

2185: \vskip0.3cm

2186: \noindent

2187: {\bf Restriction before a Call}\ :\ \

2188:  {\tt RESTRG}$(l,\cdot):{\it AS}_{D''}

2189:                          \rightarrow{\it AS}_{D'''} $\ \\

2190: Let $\beta \in{\it AS}_{D''}$  and $\theta \in{\it CPS}_{D''}$.

2191: %

2192: \begin{quote}$

2193: \begin{array}{lll}

2194: \theta \in{\it Cc}(\beta)

2195: &

2196: \Rightarrow

2197: &

2198: {\tt RESTRG}(l,\theta)\in

2199: {\it Cc}({\tt RESTRG}(l,\beta)).

2200: \end{array}$

2201: \end{quote}

2202: %

2203: \vskip0.3cm

2204: \noindent

2205: {\bf Unification of Two Variables}\ :\ \

2206:    {\tt UNIF-VAR}$: {\it AS}_{\{x_1,x_2\}}

2207:                      \rightarrow {\it ASS}_{\{x_1,x_2\}}$\  \\

2208: Let $\beta \in {\it AS}_{\{x_1,x_2\}}$ and

2209:  $\theta\in{\it CPS}_{\{x_1,x_2\}}$.

2210: %

2211: \begin{quote}$

2212: \begin{array}{lll}

2213: \theta \in{\it Cc}(\beta)

2214: &

2215: \Rightarrow

2216: &

2217: \mbox{\tt UNIF-VAR}(\theta)\in

2218: {\it Cc}(\mbox{\tt UNIF-VAR}(\beta)).

2219: \end{array}$

2220: \end{quote}

2221: %

2222: \vskip0.3cm

2223: \noindent

2224: {\bf Unification of a Variable and a Functor}\ :\ \

2225:    {\tt UNIF-FUNC}$(f,\cdot): {\it AS}_{D}

2226:                      \rightarrow {\it ASS}_{{D}}$\  \\

2227: Let $\beta \in {\it AS}_{D}$

2228: and  $\theta \in {\it CPS}_{D}$. Let also $f$ be a functor

2229: of arity $n-1$.

2230: %

2231: %

2232: \begin{quote}$

2233: \begin{array}{lll}

2234: \theta \in{\it Cc}(\beta)

2235: &

2236: \Rightarrow

2237: &

2238: \mbox{\tt UNIF-FUNC}(f,\theta)\in

2239: {\it Cc}(\mbox{\tt UNIF-FUNC}(f,\beta)).

2240: \end{array}$

2241: \end{quote}

2242: %

2243: \vskip0.3cm

2244: \noindent

2245: {\bf Abstract Interpretation of the Cut}\ :\ \

2246:    {\tt AI-CUT}$: {\it ASSC}_{{D'}}

2247:                          \rightarrow {\it ASSC}_{{D'}}$\  \\

2248: Let $C \in {\it ASSC}_{D'}$,

2249:     $\theta \in {\it CPS}_{D'}$,

2250:     $S \in {\it CPSS}_{D'}$,

2251:     ${\it cf} \in {\it CF}$.

2252: %

2253: \begin{quote}$

2254: \begin{array}{rcl}

2255: \langle <>,{\it cf}\rangle \in {\it Cc}(C) & \Rightarrow &

2256: \langle <>,{\it cf}\rangle \in {\it Cc}(\mbox{\tt AI-CUT}(C)), \\

2257: \langle <\bot>,{\it cf}\rangle \in {\it Cc}(C) & \Rightarrow &

2258: \langle <\bot>,{\it cf}\rangle \in {\it Cc}(\mbox{\tt AI-CUT}(C)),\\

2259: \langle <\theta>::S ,{\it cf}\rangle \in {\it Cc}(C) & \Rightarrow &

2260: \langle <\theta>,{\it cut}\rangle \in {\it Cc}(\mbox{\tt AI-CUT}(C)).

2261: \end{array}$

2262: \end{quote}

2263: %

2264: \vskip0.3cm

2265: \noindent

2266: {\bf Extension of the Result of a Call}\ :\ \

2267: {\tt EXTGS}$(l,\cdot,\cdot):{\it ASSC}_{D'}\times {\it ASS}_{D'''}

2268:                          \rightarrow{\it ASSC}_{D'} $\  \\

2269: The specification of this operation is more complex because it

2270: abstracts in a single operation the calculation of all sequences

2271: $S_k = {\tt EXTG}(l,\theta_k,S'_{k})$ and of their concatenation

2272: ${\Box}^{{\it Ne}(S)}_{k=1} S_k$, performed by the

2273:  rules {\bf R4}, {\bf R5}, {\bf R6}

2274: (see Figure \ref{CSRF}).

2275: At the abstract level, it may be too expensive or even impossible

2276: to simulate the execution of $l$ for all elements of $S$,

2277: as defined in the rules. Therefore,

2278: we abstract $S$ to its substitutions,

2279: losing the ordering. The abstract execution will be the following.

2280:  Assuming that $C$ abstracts the

2281: program substitution sequence with cut information

2282: $\langle S,{\it cf}\rangle$ before $l$, we compute $\beta={\tt SUBST}(C)$;

2283: then we compute $\beta'={\tt RESTRG}(l,\beta)$ and, subsequently,

2284: we get the abstract sequence $B$ resulting from the abstract exe\-cution of

2285:  $l$ with

2286: input $\beta'$.

2287: %applied to the procedure corresponding to $l$.

2288: The set ${\it Cc}(B)$ contains

2289: all sequences $S'_k$ of rules {\bf R4}, {\bf R5}, {\bf R6}. Then,

2290: an over approximation of the set of all possible values

2291: ${\Box}^{{\it Ne}(S)}_{k=1} S_k$ is computed from the information provided

2292: by $C$

2293: and $B$.

2294: This is realized by the following operation {\tt EXTGS}.

2295:  Let $C\in {\it ASSC}_{D'}$, $B\in {\it ASS}_{D'''}$,

2296:  $\langle S,{\it cf}\rangle\in ({\it CPSS}_{D'}\times {\it CF})$ and

2297: $S'_1,\dots,S'_{{\it Ns}(S)}\in{\it CPSS}_{D'''}$.

2298: %

2299: \begin{quote}$

2300: \hskip-0.8cm

2301: \begin{array}{rcl}

2302: \left.

2303: \begin{array}{c}

2304: \langle S,{\it cf}\rangle  \in {\it Cc}(C),\\

2305: S=<\theta_1,\dots,\theta_i,\_>,\\

2306:    %    \left\{

2307:    %    \begin{array}{c}

2308:        \left( \begin{array}{c}

2309: \forall k:1\leq k\leq  {\it Ns}(S):

2310:        S'_k\in{\it Cc}(B)\\

2311:        \mbox{ and } S_k={\tt EXTG}(l,\theta_k,S'_k)

2312: \end{array}

2313: \right)

2314:    %    \end{array}

2315:     %   \right\}

2316: \end{array}

2317: \right\}

2318:                 &\Rightarrow&

2319: \langle \Box_{k=1}^{{\it Ne}(S)}S_k ,{\it cf}\rangle \in

2320:                        {\it  Cc}(\mbox{\tt EXTGS}(l,C,B)).

2321: \end{array}$

2322: \end{quote}

2323: %

2324: %Remember that the notation $\Box_{k=1}^{{\it Ne}(S)}S_k$ assumes that

2325: %$S_{{\it Ne}(S)}=<\bot>$ when $S$ is incomplete;

2326: %moreover, the equality $S_k={\tt EXTG}(l,\theta_k,S'_k)$

2327: %implies that all substitutions of

2328: %$S'_k$ are instances of $\theta_k$ in the sense made precise in the

2329: %definition of {\tt EXTG}.

2330:  % (see Section \ref{CO}).

2331: % The latter remark suggests that, provided the abstract

2332: %domain is expressive enough, operation {\tt EXTGS} can recover

2333: %(part of) the information lost by executing operation {\tt SUBST}$(C)$,

2334: %by matching the sequences in ${\it Cc}(B)$ with the

2335: %elements of the sequences in ${\it Cc}(C)$.

2336: %

2337: \vskip0.3cm

2338: \noindent

2339: {\bf Abstract Lazy Concatenation}\ :\ \

2340:     ${\tt CONC}: ({\it AS}_D \times {\it ASSC}_D  \times {\it ASS}_D)

2341:                  \rightarrow {\it ASS}_D$\  \\

2342: This operation is the abstract counterpart of the concatenation operation

2343: $\Box$. It is however extended with an

2344: additional argument to increase the accuracy.

2345:  Let $B'={\tt CONC}(\beta,C,B)$ where

2346:  $\beta$ describes a set of input substitutions for

2347: a procedure; $C$ describes the set of substitution sequences with cut

2348: information

2349: obtained by executing a clause of the procedure on $\beta$;  $B$ describes the set of

2350: substitution sequences obtained by executing the

2351: subsequent clauses of the procedure on $\beta$.

2352: Then, $B'$ describes the set of substitution sequences obtained by

2353: concatenating the results according to the concrete

2354: concatenation operation $\Box$.

2355:

2356: Let us discuss a simple example to

2357:  understand the role of $\beta$.

2358:   Assume that

2359: $$

2360: \begin{array}{rcl}

2361: {\it Cc}(C)=\{ \langle <>, {\it nocut} \rangle ,

2362:             \langle <\{x_1/a\}>, {\it nocut} \rangle\} &

2363: \mbox{\rm and }                                         &

2364: {\it Cc}(B)=\{ <>,

2365:                <\{x_1/b\}> \}.

2366: \end{array}

2367: $$

2368: If the input mode of $x_1$ is unknown, it must be assumed that

2369: all combinations of elements in ${\it Cc}(C)$ and ${\it Cc}(B)$ are

2370: possible. Thus,

2371: $${\it Cc}(B')=\{ <>, <\{x_1/a\}>,

2372:                <\{x_1/b\}>,  <\{x_1/a\},\{x_1/b\}> \}.$$

2373: On the contrary, if the input mode of $x_1$ is known to be ground,

2374: the outputs $\langle <\{x_1/a\}>, {\it nocut} \rangle$ and

2375: $<\{x_1/b\}>$ are incompatible since $x_1$

2376: cannot be bound to both $a$ and $b$ in the input substitution.

2377: In this case, we have

2378: $${\it Cc}(B')=\{ <>, <\{x_1/a\}>,

2379:                <\{x_1/b\}>\}.$$

2380: The first argument $\beta$ of the operation {\tt CONC}

2381:  provides information on the input values: it may be useful to improve

2382:  the accuracy of the result.

2383: The above discussion motivates the following specification of

2384: operation {\tt CONC}.

2385: Note that the statement

2386: \mbox{$(\exists\sigma\in{\it SS}: \theta' = \theta\sigma)$} is abbreviated by

2387: $\theta'\leq \theta$ in the specification. Let

2388: $\beta\in{\it AS}_D$,

2389: $C\in{\it ASSC}_D$,

2390: $B\in{\it ASS}_D$,

2391: $\theta\in{\it CPS}_D$,

2392: $\langle S_1,{\it cf}\rangle\in ({\it CPSS}_{D}\times {\it CF})$ and

2393: $S_2\in{\it CPSS}_{D}$. %Then,

2394: %

2395: \begin{quote}$

2396: \begin{array}{rcl}

2397:       \left.

2398:       \begin{array}{c}

2399:          \theta\in {\it Cc}(\beta),\\

2400:          \langle S_1,{\it cf}\rangle \in {\it Cc}(C),\\

2401:          S_2\in {\it Cc}(B),\\

2402:          \forall \theta'\in {\it Subst}(S_1)\cup {\it Subst}(S_2):

2403:                                                \theta'\leq \theta

2404:       \end{array}

2405:       \right\}

2406:                 &\Rightarrow&

2407: \langle S_1,{\it cf}\rangle \Box S_2 \in {\it Cc}({\tt CONC}(\beta,C,B)).

2408: \end{array}$

2409: \end{quote}

2410:

2411:

2412:

2413: \vskip0.3cm

2414: \noindent

2415: {\bf Operation}\ \ ${\tt SEQ}: {\it ASSC}_{D}

2416:                          \rightarrow{\it ASS}_{D}$\  \\

2417: This operation forgets the cut information

2418: contained in an abstract sequence with cut information $C$.

2419: It is applied to the

2420: result of the last clause of a procedure before

2421: combining this result with

2422: the results of the other clauses. \\

2423: Let $C\in {\it ASSC}_{D}$ and

2424:  $\langle S,{\it cf}\rangle\in ({\it CPSS}_{D}\times {\it CF})$.

2425: \begin{quote}$

2426: \begin{array}{rcl}

2427: \langle S,{\it cf}\rangle \in {\it Cc}(C)       &\Rightarrow &

2428: S\in {\it Cc}({\tt SEQ}(C)).

2429: \end{array}$

2430: \end{quote}

2431: %

2432: \vskip0.3cm

2433: \noindent

2434: {\bf Operation}\ \ ${\tt SUBST}: {\it ASSC}_{D'}

2435:                          \rightarrow{\it AS}_{D'}$\  \\

2436: %

2437: This operation forgets still more

2438: information.  It extracts the ``abstract substitution part'' of $C$.

2439: It is applied before executing a literal in a clause. See operation

2440: {\tt EXTGS}.

2441: Let $C\in {\it ASSC}_{D'}$ and

2442:  $\langle S,{\it cf}\rangle\in ({\it CPSS}_{D'}\times {\it CF})$.

2443: \begin{quote}$

2444: \begin{array}{rcl}

2445:          \langle S,{\it cf}\rangle \in {\it Cc}(C)

2446:         &\Rightarrow &

2447: {\it Subst(S)}\subseteq {\it Cc}({\tt SUBST}(C)).

2448: \end{array}$

2449: \end{quote}

2450: %

2451: %

2452: \subsection{Abstract Semantics}

2453: \label{sub:ASem}

2454:

2455: We are now in position to present the abstract semantics.

2456: %As just stated in the introduction, the abstract semantics is not defined

2457: %as a least fixpoint but as a set of post-fixpoints that fulfill a safety

2458: %requirement called pre-consistency.

2459: Note that we are not concerned with

2460: algorithmic issues here: they are dealt with in Section~\ref{sec:GAIA}.

2461: \\

2462:

2463:

2464: \noindent

2465: {\bf Extended Abstract Behaviors.}

2466: %\label{EAB}

2467: Extended abstract behaviors are the abstract counterpart of the

2468: concrete extended behaviors defined in Section \ref{CSR}.

2469: %They are defined in exactly the same way.

2470: %

2471:

2472: \begin{definition} [Extended Abstract Underlying Domain]

2473: \label{DEAUD}

2474: The {\em extended abstract underlying domain},

2475: denoted by {\it EAUD},

2476: consists of

2477: \begin{enumerate}

2478: \itemsep 2pt

2479: \item all triples $\langle \beta, g, c \rangle $, where

2480:   $c$ is a clause of $P$,

2481:   $g$ is a prefix of the body of $c$,

2482:   $\beta\in{\it AS}_{D}$, and

2483:   $D$ is the set of variables

2484:   in  the head of $c$;

2485: \item all pairs $\langle \beta,  c \rangle $, where

2486:   $c$ is a clause of $P$,

2487:   $\beta \in{\it AS}_{D}$, and

2488:   $D$ is the set of variables

2489:   in  the head of $c$;

2490: \item all pairs $\langle \beta,  {\it pr} \rangle $, where

2491:   ${\it pr}$ is a procedure of $P$

2492:   or a suffix of a procedure of $P$,

2493:   $\beta\in{\it AS}_{D}$, and

2494:   $D$ is the set of variables

2495:   in the head of the

2496:   clauses of {\it pr}.

2497: \end{enumerate}

2498: \end{definition}

2499: %

2500: \begin{definition} [Extended Abstract Behaviors]

2501: An {\em extended abstract behavior} is a  function from

2502: {\it EAUD} to ${\it ASS}\cup{\it ASSC}$ such~that

2503: %

2504: \begin{enumerate}

2505: \itemsep 2pt

2506: \item every triple $\langle \beta, g, c \rangle $ from {\it EAUD}

2507:       is mapped to an abstract sequence with cut

2508:       information

2509:       $C\in {\it ASSC}_{D'} $, where $D'$ is the set of all variables in $c$;

2510: %

2511: \item every pair $\langle \beta,  c \rangle $ from {\it EAUD}

2512:       is mapped to an abstract sequence with cut information

2513:       $C\in {\it ASSC}_D $, where $D$ is the set of variables

2514:       in the head of $c$;

2515: %

2516: \item every pair $\langle \beta,  {\it pr} \rangle $ from {\it EAUD}

2517:       is mapped to an abstract sequence $B\in {\it ASS}_D$,

2518:        where $D$ is the set of variables

2519:        in the head of the

2520:        clauses of {\it pr}.

2521: \end{enumerate}

2522: \end{definition}

2523:

2524: The set of extended abstract behaviors is endowed with a structure of

2525: partial or\-der in the obvious way. It is denoted by {\it EAB} and its

2526: elements are denoted by~{\it esat}.\\

2527: %\end{definition}

2528: %

2529:

2530: \noindent

2531: {\bf Abstract Transformation.}

2532: %\label{AT}

2533: The abstract semantics is defined in terms of two semantic functions

2534: that are depicted in Figure \ref{fig:TAT}.

2535: The first function $E: {\it AB}\rightarrow {\it EAB}$ maps

2536: abstract behaviors to extended abstract behaviors.

2537: It is the abstract counterpart of the concrete semantic rules of

2538: Figure \ref{CSRF}.

2539: The second function ${\it TAB} :{\it AB}\rightarrow{\it AB}$

2540: transforms an abstract behavior into another abstract behavior.

2541: It is the abstract counterpart of  Rule {\bf T1} in Definition

2542: \ref{CT}.\\

2543:

2544: \begin{figure}[t]%\langle \rangle

2545: \figrule

2546: %\small

2547: \begin{tabbing}

2548: 123\=4567891\=23123\=456123456789012345\=12347891234123456789123456789 \kill

2549: %$TAB(sat) = \{(\beta,p,B): (\beta,p) \in UD \mbox{ and }

2550: %B = T(\beta,p,sat)\}$ \\ \\

2551:

2552: ${\it TAB}({\it sat})\langle \beta,p \rangle

2553:           = E({\it sat})\langle \beta,{\it pr}\rangle$ \\

2554: \>{\bf where}   \> ${\it pr}$  is the procedure defining $p$,\\ \\

2555:

2556: $E({\it sat})\langle \beta,{\it pr} \rangle

2557:           = ${\tt SEQ}$(C)$ \\

2558: \>{\bf where}

2559:           \> $C=E({\it sat})\langle \beta,c \rangle $

2560:              \>\> if {\it pr $::=$   c}\\ \\

2561:

2562: $E({\it sat})\langle \beta,{\it pr} \rangle

2563:           =  {\tt CONC}(\beta,C,B)$ \\

2564: \>{\bf where}

2565:           \> $B=E({\it sat})\langle \beta,{\it pr}' \rangle $ \\

2566: \>        \> $C=E({\it sat})\langle \beta,c \rangle $

2567:              \> \>if {\it pr $::=$ c,pr}$'$\\ \\

2568:

2569: $E({\it sat})\langle \beta,c \rangle

2570:           =$ {\tt RESTRC}$(c,C)$ \\

2571: \>{\bf where}

2572:           \> $C = E({\it sat})\langle \beta,g,c\rangle$\\

2573: \>        \> $g$ is the body of $c$\\ \\

2574:

2575: $E({\it sat})\langle\beta,<>,c \rangle = {\tt EXTC}(c,\beta)$ \\ \\

2576:

2577:

2578: $E({\it sat})\langle \beta,( g , ! ) ,c\rangle

2579:           = \mbox{\tt AI-CUT}(C)$ \\

2580: \>{\bf where}

2581:           \> $C = E({\it sat})\langle \beta,g,c\rangle$ \\ \\

2582:

2583: $E({\it sat})\langle \beta, ( g , l ) ,c\rangle

2584:           = {\tt EXTGS}(l,C,B)$ \\

2585: \>{\bf where}

2586:           \> $B=$

2587:              \> {\tt UNIF-VAR}$(\beta')$

2588:                 \> if {\it l $::=$ x$_i$=x$_j$} \\

2589: \>        \> \> {\tt UNIF-FUNC}$(f,\beta')$

2590:                 \> if  {\it l $::=$ x$_i$=f$(\ldots)$} \\

2591: \>        \> \> ${\it sat}\langle \beta',p \rangle$

2592:                 \> if  {\it l $::=$ p$(\ldots)$} \\

2593: \>        \> $\beta' = {\tt RESTRG}(l,\beta'')$\\

2594: \>        \> $\beta''= {\tt SUBST}(C)$\\

2595: \>        \> $C= E({\it sat})\langle \beta,g,c \rangle$.

2596: \end{tabbing}

2597: \caption{The  abstract transformation}

2598: \label{fig:TAT}

2599: \end{figure}

2600:

2601:

2602: \noindent

2603: {\bf Abstract Semantics.}

2604: %\label{subsub:ASem}

2605: The abstract semantics is defined as the set of all abstract

2606:    behaviors that

2607: are both post-fixpoints of the abstract transformation

2608: {\it TAB} and pre-consistent.

2609: The corresponding definitions are given first; then

2610:   the rationale underlying

2611: the definitions  is   discussed.

2612:

2613: \begin{definition}[Post-Fixpoints of {\it TAB}]

2614: %\label{def:PFT}

2615: An abstract behavior ${\it sat}\in{\it AB}$ is called

2616: a {\em post-fixpoint} of {\it TAB} if and only if

2617: ${\it TAB}({\it sat})\leq{\it sat} $, i.e., if and only if

2618: $$

2619: \begin{array}{llll}

2620: {\it TAB}({\it sat})\langle \beta,p \rangle

2621: &\leq

2622: &{\it sat}\langle \beta,p \rangle,

2623: &\;\;\forall \langle \beta,p \rangle \in {\it AUD}.\\

2624: \end{array}

2625: $$

2626: \end{definition}

2627: %

2628: \begin{definition}[Pre-Consistent Abstract Behaviors]

2629: \label{def:PCAB}

2630: \noindent

2631: Let $\longmapsto$ be the concrete semantics of the underlying

2632: program, according to Defi\-nition \ref{DCS}.

2633: An abstract behavior ${\it sat}\in{\it AB}$ is said to be

2634:  {\em pre-consistent} with respect to $\longmapsto$

2635: if and only if

2636: there exists a concrete behavior $\longmapsto'$

2637: such that

2638: $$\longmapsto'\ \sqsubseteq \longmapsto$$

2639: and such that,

2640: for all $\langle \beta,p \rangle \in {\it AUD}$ and

2641: $\langle \theta,p \rangle \in {\it CUD}$,

2642: $$

2643: \begin{array}{lll}

2644:   \left.

2645:   \begin{array}{c}

2646:    \theta \in {\it Cc}(\beta),\\

2647:    \langle \theta,p \rangle \longmapsto' S

2648:   \end{array}

2649:    \right\}

2650: &\Rightarrow

2651: &  S \in {\it Cc}({\it sat}\langle \beta,p \rangle).

2652: \\

2653: \end{array}

2654: $$

2655: \end{definition}

2656:

2657: In the next section, we show that any pre-consistent

2658: post-fixpoint {\it sat} of {\it TAB} is a safe approximation of

2659: the concrete semantics, i.e.,

2660: it is such that for all $\langle \beta,p \rangle \in {\it AUD}$ and

2661: $\langle \theta,p \rangle \in {\it CUD}$,

2662: $$

2663: \begin{array}{lll}

2664:   \left.

2665:   \begin{array}{c}

2666:    \theta \in {\it Cc}(\beta),\\

2667:    \langle \theta,p \rangle \longmapsto S

2668:   \end{array}

2669:    \right\}

2670: &\Rightarrow

2671: &  S \in {\it Cc}({\it sat}\langle \beta,p \rangle).\\

2672: \\

2673: \end{array}

2674: $$

2675:

2676:

2677: %\noindent

2678: The abstract semantics is defined as the set of all pre-consistent post-fixpoints.

2679: Indeed, under the current hypotheses on the abstract domains, there is no straightforward way to choose a

2680: ``best'' abstract behavior among all pre-consistent post-fixpoints.

2681: Thus, we consider the problem of computing a reasonably accurate

2682: post-fixpoint as a pragmatic issue to be solved at the algorithmic level.

2683: In fact, the abstract interpretation algorithm

2684: presented in Section \ref{sec:GAIA} is an improvement of the following construction:

2685: define the abstract behavior ${\it sat}_\bot$ by

2686: $$

2687: \begin{array}{llll}

2688: {\it sat}_\bot \langle \beta, p \rangle

2689: & =

2690: & B_\bot,

2691: & \; \;\forall \langle \beta, p \rangle \in {\it AUD}.

2692: \end{array}

2693: $$

2694: Assume that the domain of abstract sequences is endowed with

2695: an % (polymorphic)

2696: upper-bound operation ${\tt UB}: {\it ASS}_D \times {\it ASS}_D \rightarrow {\it ASS}_D$

2697: (not necessarily a {\em least} upper bound).

2698: For every ${\it sat}_1,{\it sat}_2\in {\it AB}$, we define

2699: ${\tt UB}({\it sat}_1,{\it sat}_2)$ by

2700: $$

2701: \begin{array}{llll}

2702: {\tt UB}({\it sat}_1,{\it sat}_2)\langle \beta, p \rangle

2703: & =

2704: & {\tt UB}({\it sat}_1\langle \beta, p \rangle,

2705:            {\it sat}_2\langle \beta, p \rangle),

2706: & \;\;\forall \langle \beta, p \rangle \in {\it AUD}.

2707: \end{array}

2708: $$

2709: Let j be an arbitrarily chosen natural number.

2710: An infinite sequence of pre-consistent abstract behaviors

2711: ${\it sat}_0,\ldots,{\it sat}_i,\ldots$ is defined as follows:

2712: $$

2713: \begin{array}{llll}

2714: {\it sat}_0 & = & {\it sat}_\bot, \\

2715: {\it sat}_{i+1}& = & {\it TAB}({\it sat}_i)

2716:                    & (0\leq i <j ),\\

2717: {\it sat}_{i+1}& = & {\tt UB}({\it sat}_i,{\it TAB}({\it sat}_i))

2718:                    & ( j\leq i).  \\

2719: \end{array}

2720: $$

2721: The abstract behaviors ${\it sat}_i$ are all pre-consistent because

2722: ${\it sat}_\bot$ is pre-consistent by construction,

2723: every  application of {\it TAB} maintains pre-consistency (as proven

2724: in the next section),

2725: and each application of {\tt UB} produces an abstract behavior

2726: whose concretization contains the concretizations of the arguments.

2727: Moreover, assuming that every  partial order ${\it ASS}_D$ is finite

2728: or satisfies the finite ascending chain proper\-ty, the

2729: sequence ${\it sat}_0,\ldots,{\it sat}_i,\ldots$ has a least

2730: upper bound which is the desired pre-consistent post-fixpoint.

2731: In  case the ${\it ASS}_D$ contains chains with infinitely many distinct

2732: elements, {\tt UB} must be a widening operator \cite{Cousot92c}.

2733: \\

2734: The sequence from ${\it sat}_0$ to ${\it sat}_j$

2735: is not ascending in general. In fact, ${\it sat}_\bot$ is not the minimum of

2736: {\it AB} and {\it TAB} is not necessarily monotonic nor extensive

2737: (i.e., ${\it sat}\leq {\it TAB}({\it sat})$ does not always hold).

2738: From step $0$ to $j$, the  computation of the ${\it sat}_i$

2739: simulates as closely as possible the computation

2740: of the least fixpoint of the concrete transformation.

2741: From step $j$ to convergence, all iterates are ``lumped'' together.

2742: All concrete behaviors

2743: $\longmapsto_j, \longmapsto_{j+1}, \ldots$

2744: of the Kleene sequence of the concrete semantics,

2745: are thus included in the concretization of the final post-fixpoint {\it sat}.

2746: So, {\it sat} describes properties that are true not only for the

2747: concrete $\longmapsto$

2748: semantics but also for its approximations

2749: $\longmapsto_j, \longmapsto_{j+1}, \ldots$.

2750: The choice of $j$ is a compromise: a low value  ensures a faster

2751: convergence while a high value provides a better accuracy.

2752: The abstract interpretation algorithm presented in  Section

2753: \ref{sec:GAIA} does not iterate globally over {\it TAB}. It locally

2754: iterates over $E$ for every needed input

2755: pattern $\langle \beta, p \rangle$

2756: and uses different values of $j$ for different input patterns.

2757: Depending on the particular abstract domain, the value can be guessed

2758: more or less cleverly. This is the role of the

2759: special widening operator of  Definition~\ref{def:EW}.

2760: A sample widening operator is described in Section \ref{sub:GAO},

2761: showing how the value of $j$ can be guessed in the case of a practical

2762: abstract domain.

2763:

2764: \subsection{Safety of the Abstract Semantics}

2765: \label{sub:SAS}

2766:

2767: We prove here the safety of our abstract semantics.

2768: %The safety results about our abstract semantics are now stated.

2769: %The details of the proofs can be found in \cite{Cut_rep95}.

2770: First, we formally define the notion of safe approximation. Then, we show

2771: that the abstract transformation is safe in the sense that, whenever

2772: {\it sat} safely approximates $\longmapsto$,

2773: ${\it TAB}({\it sat})$ safely approximates

2774: $\stackrel{{\it TCB}}\longmapsto$ (Theorem \ref{th:SAT}).

2775: From this basic result, we deduce that

2776: {\it TAB} transforms pre-consistent abstract

2777: behaviors

2778: into other pre-consistent abstract behaviors (Theorem \ref{th:p-c}),

2779: and that, when {\it sat} is a post-fixpoint

2780: of the abstract transformation which safely approximates

2781: a concrete behavior $\longmapsto$, it also safely approximates

2782: the concrete behavior  $\stackrel{{\it TCB}}\longmapsto$

2783: (Theorem \ref{th:cppost}).

2784: Theorem \ref{th:cpab} states that abstract beha\-viors are,

2785: roughly speaking, chain-closed with respect to concrete behaviors.

2786: Finally, Theorem \ref{th:SAS} states our main result, i.e.,

2787: every pre-consistent post-fixpoint of the abstract transformation

2788: safely approximates the concrete semantics.

2789:

2790:

2791:

2792: \begin{definition}[Safe Approximation]

2793: \label{def:SA}

2794: \noindent Let $\longmapsto \in{\it CB}$ and

2795:     ${\it sat}\in {\it AB}$.

2796: The abstract behavior {\it sat} {\em safely approximates} the concrete behavior

2797: $\longmapsto$ if and only if, for all

2798: $\langle \theta, p \rangle \in {\it CUD}$ and

2799:     $\langle \beta,  p \rangle \in {\it AUD}$, the following

2800: implication holds:

2801: $$

2802: \begin{array}{lll}

2803:   \left.

2804:   \begin{array}{c}

2805:    \theta\in{\it Cc}(\beta),\\

2806:    \langle \theta, p \rangle\longmapsto S

2807:   \end{array}

2808:    \right\}

2809:    &\Rightarrow&

2810: S\in{\it Cc}({\it sat}\langle \beta ,p \rangle).

2811: \end{array}

2812: $$

2813:

2814:  Similarly, let $\longmapsto \in{\it ECB}$ and

2815:     ${\it esat}\in {\it EAB}$.

2816: The extended abstract behavior {\it esat} {\em safely approximates}

2817: $\longmapsto$ if and only if, for all

2818: $\langle \theta, {\it pr} \rangle,

2819: \langle \theta, c \rangle,

2820: \langle \theta, g, c \rangle

2821: \in {\it ECUD}$ and

2822:     $\langle \beta, {\it pr} \rangle,

2823: \langle \beta, c \rangle,

2824: \langle \beta, g, c \rangle

2825:  \in {\it EAUD}$, the following

2826: implications hold:

2827: $$

2828: \begin{array}{c}

2829: \begin{array}{lll}

2830:   \left.

2831:   \begin{array}{c}

2832:    \theta\in{\it Cc}(\beta),\\

2833:    \langle \theta,  {\it pr} \rangle\longmapsto S

2834:   \end{array}

2835:    \right\}

2836:

2837: &\Rightarrow&

2838: S\in{\it Cc}({\it esat}\langle \beta , {\it pr} \rangle),

2839: \end{array}\\ \\

2840: \begin{array}{lll}

2841:   \left.

2842:   \begin{array}{c}

2843:    \theta\in{\it Cc}(\beta),\\

2844:    \langle \theta, c \rangle\longmapsto \langle S, {\it cf} \rangle

2845:   \end{array}

2846:    \right\}

2847:

2848: &\Rightarrow&

2849: \langle S, {\it cf} \rangle\in{\it Cc}({\it esat}\langle \beta ,c \rangle),

2850: \end{array}\\ \\

2851: \begin{array}{lll}

2852:   \left.

2853:   \begin{array}{c}

2854:    \theta\in{\it Cc}(\beta),\\

2855:    \langle \theta, g, c \rangle\longmapsto \langle S, {\it cf} \rangle

2856:   \end{array}

2857:    \right\}

2858:

2859: &\Rightarrow&

2860:  \langle S, {\it cf} \rangle \in{\it Cc}({\it esat}\langle \beta , g, c \rangle).

2861: \end{array}\\

2862: \\

2863: \end{array}

2864: $$

2865:

2866: \end{definition}

2867:

2868: \begin{theorem}[Safety of the Abstract Transformation]

2869: \label{th:SAT}

2870: \noindent

2871: Let $\longmapsto\in{\it CB}$ and

2872:     ${\it sat}\in {\it AB}$. If {\it sat} safely approximates

2873: $\longmapsto$, then ${\it TAB}({\it sat})$ safely approximates

2874: $\stackrel{{\it TCB}}\longmapsto$.

2875: %

2876: \end{theorem}

2877:

2878:

2879: We first establish the following result.

2880:  Remember that if $\longmapsto\in{\it CB}$, its extension

2881: in {\it ECB} is also denoted by $\longmapsto$ (see Section \ref{CSR}).

2882:

2883: \begin{lemma}[Safety of $E$]

2884: \label{le:SE}

2885:

2886: \noindent

2887: Let $\longmapsto\in{\it CB}$ and

2888:     ${\it sat}\in {\it AB}$. If {\it sat} safely approximates

2889: $\longmapsto$, then $E({\it sat})$ safely approximates $\longmapsto$

2890: (the extension of $\longmapsto$ in {\it ECB}).

2891: \end{lemma}

2892:

2893: \begin{proof*}[Proof of Lemma \ref{le:SE}]

2894: \label{pr:SE}

2895:

2896: We prove the lemma by structural  induction on the syntax of the

2897: underlying program. It uses the concrete semantic rules of

2898: Figure~\ref{CSRF},  the definition of $E$ in Figure~\ref{fig:TAT},

2899: and  the specifications of the abstract operations given in

2900: Section \ref{sub:AO}. The proof is straightforward due to the close

2901: correspondence of the concrete and the abstract semantics.

2902: We only detail the reasoning for the base case and for the case of

2903: a goal $(g , l)$ where $l$ is an atom of the form

2904: $p(x_{i_1},\dots,x_{i_n})$.

2905: The other cases are similar.

2906:

2907: \vskip0.2cm

2908: \noindent

2909: {\bf Base case}.

2910:  Let $\langle \theta, <>, c \rangle

2911: \!\in\! {\it ECUD}$ and

2912:     $\langle \beta, <>, c \rangle

2913:  \!\in\! {\it EAUD}$. Assume that  $\theta\!\in\!{\it Cc}(\beta)$ and

2914: \mbox{

2915: $\langle \theta, <>, c \rangle\longmapsto \langle S, {\it cf}\rangle$}.

2916: It must be proven that

2917: $$\langle S, {\it cf}\rangle\in

2918: {\it Cc}(E({\it sat})\langle\beta,<>,c \rangle). $$

2919: This  relation holds because of the three following facts:

2920: \begin{center}

2921:   \begin{tabular}{rllll}

2922:    $\langle S, {\it cf}\rangle$ & $=$ &

2923:    ${\tt EXTC}(c,\theta)$

2924:     & &(by {\bf R2}),\\

2925: %

2926:    ${\tt EXTC}(c,\theta)$&$\in$&$ {\it Cc}({\tt EXTC}(c,\beta))$

2927:     & &(by specification of {\tt EXTC}),\\

2928:

2929: %

2930:    $E({\it sat})\langle\beta,<>,c \rangle$& $=$ &${\tt EXTC}(c,\beta)$

2931:     & &(by definition of $E$).

2932:   \end{tabular}

2933: \end{center}

2934: \vskip0.2cm

2935: \noindent

2936: {\bf Induction step}.

2937: %

2938: Let $\langle \theta,  ( g , l ), c \rangle

2939:     \in {\it ECUD}$ and

2940: $\langle \beta,  ( g , l ), c \rangle

2941:     \in {\it EAUD}$,

2942: where $l$ is an atom of the form $p(x_{i_1},\dots,x_{i_n})$.

2943: %

2944: Assume  that $\theta\in{\it Cc}(\beta)$ and

2945: \mbox{

2946: $\langle \theta,  ( g , l),

2947:               c \rangle\longmapsto \langle S, {\it cf}\rangle$}.

2948: %

2949: It must be proven  that

2950: $$\langle S, {\it cf}\rangle\in

2951: {\it Cc}(C), \mbox{

2952: where }

2953:  C = E({\it sat})\langle\beta, ( g , l),c \rangle.$$

2954: By Rule {\bf R6},

2955: % of the concrete semantics,

2956: there exist program substitutions and program sequences such that

2957: $$

2958: \begin{array}{llll}

2959:   \langle \theta, g,c \rangle

2960:    \longmapsto

2961:    \langle S', {\it cf} \rangle      &                           && (C1)\\

2962: S' = <\theta_1,\dots,\theta_i,\_>    &                           && (C2) \\

2963: \theta'_k = {\tt RESTRG}(l,\theta_k) & (1\leq k\leq {\it Ns}(S)) && (C3) \\

2964: \langle  \theta'_k ,p\rangle

2965:     \longmapsto  S_{k}^{'}           & (1\leq k\leq {\it Ns}(S)) && (C4)\\

2966: S_k = {\tt EXTG}(l,\theta_k,S_{k}^{' })

2967:                                      & (1\leq k\leq {\it Ns}(S)) && (C5)\\

2968: S={\Box}^{{\it Ne}(S)}_{k=1} S_k     &                           && (C6)\\

2969: \end{array}

2970: $$

2971: %

2972: Moreover, by  definition of

2973: $E({\it sat})$, there exist abstract values such that

2974: $$

2975: \begin{array}{rclll}

2976: C        & = &{\tt EXTGS}(l,C',B)                               && (A1)\\

2977: B        & = &{\it sat}\langle \beta',p \rangle                 && (A2)\\

2978: \beta'   & = &{\tt RESTRG}(l,\beta'')                           && (A3)\\

2979: \beta''  & = &{\tt SUBST}(C')                                   && (A4)\\

2980: C'       & = &E({\it sat})\langle \beta,g ,c \rangle            && (A5)

2981: \end{array}

2982: $$

2983: %

2984: The following assertions hold.

2985: %The last one is the

2986: %required result.

2987: By $A5$, $C1$, and the induction hypothesis,

2988: $$

2989: \begin{array}{rclll}

2990:  \langle S', {\it cf} \rangle

2991:         & \in & {\it Cc}(C') && (B1).

2992: \end{array}

2993: $$

2994: %

2995: By $A4$, $B1$, $C2$, and the specification of {\tt SUBST},

2996: $$

2997: \begin{array}{rcllll}

2998: \theta_k & \in &  {\it Cc}(\beta'') & (1\leq k\leq {\it Ns}(S))

2999:                                                                 && (B2).

3000: \end{array}

3001: $$

3002: %

3003: By $A3$, $B2$, $C3$, and the specification of {\tt RESTRG},

3004: $$

3005: \begin{array}{rcllll}

3006: \theta'_k & \in &  {\it Cc}(\beta') & (1\leq k\leq {\it Ns}(S))

3007:                                                                 && (B3).

3008: \end{array}

3009: $$

3010: %

3011: By $A2$, $B3$, $C4$, and the hypothesis that {\it sat} safely approximates

3012: $\longmapsto$,

3013: $$

3014: \begin{array}{rcllll}

3015: S'_{k}& \in &  {\it Cc}(B) & (1\leq k\leq {\it Ns}(S))

3016:                                                                 && (B4).

3017: \end{array}

3018: $$

3019: Finally, by $A1$, $B1$, $B4$,  $C2$, $C5$, $C6$, and the specification

3020: of {\tt EXTGS},

3021: $$

3022: \begin{array}{rcllll}

3023: \langle S, {\it cf} \rangle & \in &  {\it Cc}(C).

3024: \end{array}

3025: \mathproofbox $$

3026: \end{proof*}

3027: %

3028: %

3029: \begin{proof}[Proof of Theorem \ref{th:SAT}]

3030: \label{pr:SAT}

3031:

3032: The result follows from the definition of {\it TAB} in

3033: Figure~\ref{fig:TAT},

3034:  the definition of {\it TCB} in Section \ref{CT}, and

3035:  Lemma \ref{le:SE}.

3036: \end{proof}

3037:

3038:

3039:

3040: %

3041: %The proof of Theorem \ref{th:SAT} uses an important lemma

3042: %which states a safety result about the function $E$

3043: %and is given in Section \ref{app:sub:SAS}.

3044: %

3045: %

3046: %

3047: %

3048:

3049:

3050: The next theorem states that the transformation {\it TAB}

3051: maintains pre-consistency.

3052:

3053: \begin{theorem}\label{th:p-c}

3054:  Let ${\it sat}\in {\it AB}$. If {\it sat} is

3055: pre-consistent, then ${\it TAB}({\it sat})$

3056: is also pre-consistent.

3057: \end{theorem}

3058: \begin{proof}

3059: Let $\longmapsto$ be the concrete semantics of the underlying program.

3060:  Since {\it sat} is

3061: pre-consistent, there exists a concrete behavior $\longmapsto'$ such

3062: that

3063: \begin{enumerate}

3064:

3065: \itemsep 2pt

3066:

3067: \item $\longmapsto'\ \sqsubseteq\ \longmapsto$, and

3068:

3069: \item {\it sat} safely approximates $\longmapsto'$.

3070: \end{enumerate}

3071: The first condition implies that

3072: $$\stackrel{{\it TCB}}{\longmapsto'}\ \sqsubseteq\ \longmapsto,$$

3073: since {\it TCB} is monotonic and

3074: $\stackrel{{\it TCB}}{\longmapsto}\ =\ \longmapsto$.

3075: %the first condition implies that

3076: %$$\stackrel{{\it TCB}}{\longmapsto'}\ \sqsubseteq\ \longmapsto.$$

3077: %since {\it TCB} is monotonic (see Section \ref {TCS}) and

3078: %$\stackrel{{\it TCB}}{\longmapsto}\ =\ \longmapsto$ (see Definition

3079: %\ref{DCS}).

3080: \\The second condition and Theorem \ref{th:SAT} imply that

3081: \begin{center}

3082: ${\it TAB}({\it sat})$\  safely approximates\

3083:                               $\stackrel{{\it TCB}}{\longmapsto'}.$

3084:

3085: \end{center}

3086: The result follows from the two implied statements and Definition

3087: \ref{def:PCAB}.

3088: \end{proof}

3089:

3090:

3091:  The next two theorems state closure properties of

3092: abstract behaviors, which are used to prove the

3093: safety of the abstract semantics.

3094:

3095: \begin{theorem}\label{th:cppost}

3096:

3097: Let {\it sat} be a post-fixpoint of {\it TAB}.

3098: Let $\longmapsto \in{\it CB}$. If {\it sat} safely approxi\-mates

3099: $\longmapsto$, then {\it sat} also safely approximates

3100: $\stackrel{{\it TCB}}\longmapsto$.

3101: \end{theorem}

3102: \begin{proof*}

3103: Assume that {\it sat} safely approximate

3104: $\longmapsto$.

3105: Let

3106: $\langle \theta, p \rangle \in {\it CUD}$ and

3107:     $\langle \beta,  p \rangle \in{\it AUD}$.

3108: It must be proven that$$

3109: \begin{array}{lll}

3110:   \left.

3111:   \begin{array}{c}

3112:    \theta\in{\it Cc}(\beta),\\

3113:    \langle \theta, p \rangle \stackrel{{\it TCB}}\longmapsto S

3114:   \end{array}

3115:    \right\}

3116:    &\Rightarrow&

3117: S\in{\it Cc}({\it sat}\langle \beta ,p \rangle).\end{array}

3118: $$

3119: Assume that the left part of the implication holds.

3120: Theorem \ref{th:SAT} implies that

3121: $$S\in{\it Cc}({\it TAB}({\it sat})\langle \beta ,p \rangle).$$

3122: Since {\it sat} is a post-fixpoint and {\it Cc} is monotonic,

3123: $$\begin{array}{rcl}

3124: {\it Cc}({\it TAB}({\it sat})\langle \beta ,p \rangle)

3125: \subseteq

3126: {\it Cc}({\it sat}\langle \beta ,p \rangle),

3127: \end{array}$$

3128: and then

3129: $$S\in{\it Cc}({\it sat}\langle \beta ,p \rangle).

3130: \mathproofbox $$

3131: \end{proof*}

3132:

3133:

3134: \begin{theorem}\label{th:cpab}

3135: Let ${(\longmapsto_i)}_{i\in{\bf N}}$ be a chain of concrete behaviors.

3136: Let ${\it sat}\in {\it AB}$. If {\it sat} safely appro\-xi\-mates

3137: $\longmapsto_i$, for all $i\in{\bf N}$, then

3138: {\it sat} safely approximates

3139: $(\sqcup_{i=0}^{\infty}\longmapsto_i).$

3140: \end{theorem}

3141: \begin{proof*}

3142: Let us abbreviate $(\sqcup_{i=0}^{\infty}\longmapsto_i)$ by

3143: $\longmapsto$.

3144: It is sufficient to prove that, for any

3145: $\langle \beta, p \rangle \in {\it AUD}$ and

3146: any $\langle \theta, p \rangle \in {\it CUD}$,

3147: $$

3148: \begin{array}{lll}

3149:   \left.

3150:   \begin{array}{c}

3151:    \theta\in{\it Cc}(\beta),\\

3152:    \langle \theta, p \rangle\longmapsto S

3153:   \end{array}

3154:    \right\}

3155:    &\Rightarrow&

3156: S\in{\it Cc}({\it sat}\langle \beta ,p \rangle).

3157: \end{array}

3158: $$

3159: Fix  $\langle \beta, p \rangle $,

3160: $\langle \theta, p \rangle $, and $S$ satisfying the left part of the

3161: implication. By Theorem \ref{CB-cpo},

3162: $$

3163: \begin{array}{llllll}

3164: S = \sqcup_{i=0}^{\infty} S_i &&

3165: \mbox{\rm where}              &&

3166: \langle \theta, p \rangle\longmapsto_i S_i

3167: &

3168: \forall i\in{\bf N}.

3169:  \end{array}

3170: $$

3171: Since {\it sat} safely approximates every $\longmapsto_i$,

3172: \begin{center}

3173: $S_i\in{\it Cc}({\it sat}\langle \beta ,p \rangle)$ for all

3174: $i\in{\bf N}$.

3175: \end{center}

3176:  Finally,

3177: since ${\it Cc}({\it sat}\langle \beta ,p \rangle)$ is chained-closed,

3178: $$S\in{\it Cc}({\it sat}\langle \beta ,p \rangle).

3179: \mathproofbox $$

3180: \end{proof*}

3181:

3182:

3183:

3184: The last theorem states our main result.

3185: %It uses Lemma \ref{le:SAS}, which is

3186: %stated and proven in Section \ref{app:sub:SAS}.

3187:

3188: \begin{theorem}[Safety of the Abstract Semantics]

3189: \label{th:SAS}

3190: \noindent

3191: Let {\it sat} be a pre-consistent post-fixpoint of {\it  TAB}.

3192: Then {\it sat} safely approximates $\longmapsto$ where $\longmapsto$

3193: is the concrete semantics of the underlying program.

3194: \end{theorem}

3195:

3196: We first establish the following statement.

3197:

3198: \begin{lemma}\label{le:SAS}

3199: Let {\it sat} be a pre-consistent post-fixpoint of {\it  TAB}.

3200: There exists a chain of concrete behaviors

3201: ${(\longmapsto_i)}_{i\in{\bf N}}$ such that {\it sat} safely approximates

3202: $\longmapsto_i$, for all $i\in{\bf N}$ and

3203: $(\sqcup_{i=0}^{\infty}\longmapsto_i)\ =\ \longmapsto$ where $\longmapsto$

3204: is the concrete semantics of the underlying program.

3205: \end{lemma}

3206: \begin{proof*}[Proof of Lemma \ref{le:SAS}]

3207: The proof is in three steps. First we construct a sequence

3208: ${\{\longmapsto'_i\}}_{i\in{\bf N}}$ of lower-approximations of

3209: $\longmapsto$ which is not necessarily a chain; then we

3210: modify it to get a chain ${(\longmapsto_i)}_{i\in{\bf N}}$;

3211: finally, we show that $(\sqcup_{i=0}^{\infty}\longmapsto_i)\ =\

3212: \longmapsto$. The proof uses the following property of

3213: program substitution sequences, whose proof is left to the reader.

3214: If $S_1$, $S_2$ and $S$ are program substitution sequences such that

3215: $S_1\sqsubseteq S$ and $S_2\sqsubseteq S$, then $S_1$ and $S_2$ have

3216: a least upper-bound, which is either $S_1$ or $S_2$. The least

3217: upper-bound is

3218: denoted by $S_1\sqcup S_2$ in the proof.

3219: \begin{enumerate}

3220: \itemsep 2pt

3221: \item Since {\it sat} is pre-consistent, there exists a concrete

3222:       behavior $\longmapsto'$ such that

3223:       {\it sat} safely approximate $\longmapsto'$ and

3224:       $\longmapsto'\ \sqsubseteq\ \longmapsto$. The sequence

3225:       ${\{\longmapsto'_i\}}_{i\in{\bf N}}$ is defined by

3226:       $$\begin{array}{llll}

3227:         \longmapsto'_0\ =\ \longmapsto' & \mbox{ and } &

3228:         \longmapsto'_{i+1}\ =\ \stackrel{{\it TCB}}{\longmapsto'_i}

3229:                       & ({i\in{\bf N}}).

3230:       \end{array}$$

3231:       Since $\longmapsto'\ \sqsubseteq\ \longmapsto$,

3232:       {\it TCB} is monotonic and $\longmapsto$ is a fixpoint of {\it TCB},

3233:       it follows that

3234:       $$\begin{array}{ll}

3235:         \longmapsto'_{i}\ \sqsubseteq\ \longmapsto

3236:                       & (\forall{i\in{\bf N}}).

3237:       \end{array}$$

3238:       Moreover, by Theorem \ref{th:cppost}, {\it sat}

3239:       safely approximates every $\longmapsto'_{i}$.

3240: \item ${(\longmapsto_i)}_{i\in{\bf N}}$ is now constructed by

3241:       induction over $i$. The correctness of the construction process

3242:       requires to prove that, after each induction step, the relation

3243:       $\longmapsto_{i}\ \sqsubseteq\ \longmapsto$ holds. We first define

3244:       $$\begin{array}{ll}

3245:         \longmapsto_0\ =\ \longmapsto'_0.

3246:       \end{array}$$

3247:       Let $i\in {\bf N}$. Assume, by induction, that

3248:       $\longmapsto_0\ \sqsubseteq\

3249:                       \ldots\ \sqsubseteq\ \longmapsto_i

3250:                        \ \sqsubseteq\ \longmapsto$.

3251:       For every $\langle \theta, p\rangle \in {\it CUD}$, we define

3252:       $$\begin{array}{lll}

3253:         \langle \theta, p\rangle

3254:         \longmapsto_{i+1} (S_1\sqcup S_2) & \mbox{\rm where } &

3255:         \left\{

3256:         \begin{array}{l}

3257:         \langle \theta, p\rangle

3258:         \longmapsto_{i} S_1,\\

3259:         \langle \theta, p\rangle

3260:         \longmapsto'_{i+1} S_2 .\end{array}\right.

3261:       \end{array}$$

3262:        Since

3263:        $\longmapsto'_{i+1}\ \sqsubseteq\ \longmapsto$ and

3264:        $\longmapsto_i \ \sqsubseteq\ \longmapsto$, we have that

3265:        $\longmapsto_{i+1}$ is well-defined and

3266:        $\longmapsto_{i+1}\ \sqsubseteq\ \longmapsto$.

3267: Moreover, since {\it sat} safely approximates

3268:        $\longmapsto_i$ (by induction)

3269: and $\longmapsto'_{i+1}$, and  $S_1\sqcup S_2$ is

3270:        equal either to $S_1$ or $S_2$, in the definition of

3271:        $\longmapsto_{i+1}$, we have that {\it sat} safely approximates

3272:        every $\longmapsto_{i+1}$.

3273: \item  The Kleene sequence of the concrete semantics is a chain

3274:        ${(\longmapsto''_i)}_{i\in{\bf N}}$ defined as follows:

3275:        $$\begin{array}{llll}

3276:         \longmapsto''_0\ =\ \longmapsto_\bot & \mbox{ and } &

3277:         \longmapsto''_{i+1}\ =\ \stackrel{{\it TCB}}{\longmapsto''_i}

3278:                       & ({i\in{\bf N}}).

3279:       \end{array}$$

3280:        Since $\longmapsto_\bot\ \sqsubseteq\ \longmapsto' $ and

3281:        {\it TCB} is monotonic,

3282:  it follows, by induction, that

3283:         $$ \begin{array}{ll}

3284:         \longmapsto''_i\ \sqsubseteq\ \longmapsto'_i

3285:                        \ \sqsubseteq\ \longmapsto_i

3286:                        \ \sqsubseteq\ \longmapsto

3287:                       & (\forall{i\in{\bf N}}).

3288:       \end{array}$$

3289:       Therefore, by definition of the least upper bound and since

3290:       the least fixpoint is the limit of the Kleene sequence,

3291:       $$ \begin{array}{l}

3292:         \longmapsto\ =\ (\sqcup_{i=0}^\infty \longmapsto''_i)

3293:                        \ \sqsubseteq( \sqcup_{i=0}^\infty \longmapsto_i)

3294:                        \ \sqsubseteq\ \longmapsto.

3295:       \end{array}$$

3296:       Thus,

3297:        $$ \begin{array}{l}

3298:         \longmapsto\ =\

3299:                        (\sqcup_{i=0}^\infty \longmapsto_i).

3300:       \end{array} \mathproofbox$$

3301: \end{enumerate}

3302: \end{proof*}

3303:

3304: \begin{proof}[Proof of Theorem \ref{th:SAS}]

3305:

3306: \noindent The result is an immediate consequence of Theorem \ref{th:cpab}

3307: and Lemma \ref{le:SAS}

3308: \end{proof}

3309:

3310:

3311: \subsection{Related Works}

3312: \label{sub:D}

3313: In this section we first discuss

3314: the mathematical approach underlying our abstract semantics and relate it with

3315: the higher-order abstract interpretation frameworks

3316: advocated by  Cousot and  Cousot  \shortcite{CousotICCL94}.

3317: Then, we compare our approach with

3318: the abstract  semantics  for Prolog with control proposed by  Barbuti

3319: {\em et al.}  \shortcite{BarbutiControl},

3320: by  Fil\`e and Rossi  \shortcite{FileRossi},

3321: and by Spoto  \shortcite{Spoto}.

3322: %\cite{BarbutiControl,FileRossi,Spoto}, which has been designed with the

3323: %aim of  providing a basis

3324: %for the abstract interpretation of Prolog with control.

3325: \\

3326:

3327: \noindent

3328: {\bf Cousot and Cousot's Higher-order Abstract Interpretation Frameworks.}

3329: %

3330: %

3331: %

3332: %the abstract semantics for Prolog with control proposed by Barbuti

3333: %{\em et al.} in \cite{BarbutiControl},

3334: %by Fil\`e and Rossi in \cite{FileRossi},

3335: %and by Spoto and Levi in \cite{Spoto}.

3336: %\\

3337: %other approaches  to abstract interpretation

3338: %We do not try to be exhaustive here. Our goal is merely (1) to explain

3339: %why the traditional approach to abstract interpretation,

3340: %which uses a simple set-based collecting semantics, does not work

3341: %for program substitution sequences; (2) to relate our

3342: %proposal with the higher-order abstract interpretation framework

3343: %advocated by  Cousots in \cite{CousotICCL94}.

3344: %An account of the possible

3345: %mathematical approaches

3346: %to abstract interpretation can be found in \cite{CousotJLC92}.

3347: %\noindent

3348: As mentioned in the introduction,

3349: the traditional approach to abstract interpretation can not be applied

3350: to approximate  the concrete semantics

3351: of Section~\ref{CS}.

3352: Indeed, we can define a set-based collecting transformation

3353: by lifting the concrete semantics to

3354: sets of program substitution sequences.

3355: However, the

3356:  least fixpoint of the collecting transformation

3357: does not safely approximate the concrete semantics.

3358: The problem can be solved by restricting to sets of

3359: $\wp({\it CPS}_D)$ and $\wp({\it CPSS}_D)$

3360:  that enjoy  some closure properties

3361: ensuring safeness of  the least fixpoint.

3362:  This solution is simi\-lar to the choice of

3363: a power-domain structure in denotational semantics

3364: \cite{Schmidt88,Stoy77}: the needed constructions can  in fact be

3365: viewed as power-domains.

3366: However

3367: %it has a

3368: %This solution has however a

3369: %drawback:

3370:  there is no best way

3371: to choose the closure pro\-per\-ties.

3372: Different closure properties are

3373: adequate for different sorts of information.

3374: It is therefore advocated by  Cousot and Cousot

3375:  in \shortcite{CousotICCL94} that, for higher-order

3376: languages, different collec\-ting semantics should be defined

3377: for the same language

3378: depending on the kind of properties to be inferred.

3379: In our case, at least two dual collecting semantics could be defined. Both of

3380: them use sets of program  substitution sequences that are

3381: chain-closed.

3382: \begin{enumerate}

3383: \itemsep 0pt

3384: \item The first semantics considers {\em downwards-closed} sets

3385:  of program substitution

3386:   sequences, i.e., such that

3387:   for any $S,S'\in{\it CPSS}_D$,

3388: %and $\Sigma\in\wp({\it CPSS}_D)$,

3389:   $$

3390:   \begin{array}{lll}

3391:   \left.

3392:   \begin{array}{c}

3393:    S\in\Sigma,\\

3394:    S' \sqsubseteq S

3395:   \end{array}

3396:    \right\}

3397:   &\Rightarrow

3398:   & S'\in \Sigma.

3399:    \end{array}

3400:    $$

3401:    This domain is ordered by inclusion and its minimum is $\{<\bot>\}$.

3402:    It is adequate to infer non-termination and upper bounds to the

3403:    length of sequences. In particular, it is adequate for determinacy

3404:    analysis. However,

3405:    it is unable to infer  termination since $<\bot>$

3406:    belongs to any set of sequences.

3407: \item   The second semantics considers {\em upwards-closed} sets

3408: of program substitution sequences, i.e.,

3409:    such that

3410:   for any $S,S'\in{\it CPSS}_D$,

3411:   $$

3412:   \begin{array}{lll}

3413:   \left.

3414:   \begin{array}{c}

3415:    S\in\Sigma, \\

3416:    S\sqsubseteq S'

3417:   \end{array}

3418:    \right\}

3419:   &\Rightarrow

3420:   & S'\in \Sigma.

3421:    \end{array}

3422:    $$

3423:    This domain is ordered by

3424:    $\Sigma\leq\Sigma'\ \Leftrightarrow\ \Sigma'\subseteq\Sigma$ and

3425:    its minimum is ${\it CPSS}_D$.

3426:    It is able to infer termination and lower bounds to the

3427:    length of sequences. It is less adequate

3428:    than the previous one to infer precise information about the

3429:    substitutions in the sequences because its least fixpoint

3430:     corresponds to  a greatest fixpoint in a traditional framework

3431:    ignoring the sequence structure.

3432: \end{enumerate}

3433:

3434: \noindent   In both cases, the least fixpoint is well-defined because

3435:    the collecting versions of the operations are monotonic, since

3436:    they have to ensure the closure properties. Moreover, the least fixpoint

3437:    of the collecting semantics safely approximates the concrete

3438:    semantics because all iterates are pre-consistent and the sets are

3439:    chain-closed.

3440:    Nevertheless, our formalization has some advantages.

3441:    \begin{enumerate}

3442: \itemsep 0pt

3443:    \item It can be more efficient:  a single analysis is able to infer

3444:         all the information that can be inferred by the two

3445:         collecting semantics.

3446:     \item It can be more accurate: there are pre-consistent post-fixpoints

3447:         that are more precise than the intersection of the two

3448:         collecting semantics.

3449: %In practice, an

3450: %abstract interpretation

3451: %        algorithm based on our approach can always be at least as accurate as the

3452: %        combination of

3453: %        two algorithms based on the two collecting semantics.

3454:    \end{enumerate}

3455:

3456:

3457: \vskip0.3cm

3458: \noindent

3459: {\bf Barbuti et al.'s Abstract Semantics}.

3460: The abstract semantics proposed by

3461:  Barbuti {\em et al.}   \shortcite{BarbutiControl} aims at modeling

3462:  control aspects of logic programs

3463:  such as search strate\-gy and selection rule.

3464: Their semantics is  parametric with respect to

3465: a ``termination theory''. The meaning of a program is  obtained by

3466: composing the meaning of its  ``logic component'' together with a

3467: corresponding ``termination theory'' (the ``control component'').

3468: %An approximate model for termination

3469: The latter can be provided either by applying

3470: techniques of abstract interpretation or by applying proof procedures.

3471: In all cases, control information is deduced from outside

3472: in the form of a separated termination analy\-sis.

3473: This is the main difference with

3474: our framework, where control information, i.e. information relative to

3475: termination or non-termination,

3476: is   modeled

3477: within the semantic domains through the notion of substitution sequence.

3478: \\

3479:

3480: \noindent

3481: {\bf Fil\`e and Rossi's Abstract Interpretation Framework}.

3482: The  framework proposed by

3483:  Fil\`e and  Rossi   \shortcite{FileRossi}

3484: consists of a tabled interpreter

3485: which  explores OLDT abstract trees decorated with

3486:  control information about

3487: sure success or failure of the goals.

3488: Such information is used by the cut operation to prune

3489: the OLDT-tree whenever a cut is reached.

3490: Sure success is modeled in our framework by abstract sequences

3491: representing

3492:  only non-empty sequences.

3493: The abstract semantics defined by Fil\`e and Rossi is

3494:  operational

3495: and non-compositional while ours is compositional and based on the

3496: fixpoint approach. Moreover,

3497:  the abstract execution of a goal  $(g,!)$

3498: is  different.

3499: Whenever is known that $g$ surely succeeds, their framework stops after

3500: generating the first "sure" solution, while ours computes the entire

3501: abstract sequence for $g$  and then cuts it to maintain at most

3502: one solution. Our approach may thus imply some redundant work.

3503: However, if $g$  is used in several contexts, their framework

3504: should recognize this situation and expand the  OLDT-tree

3505: further.\\

3506:

3507: \noindent

3508: {\bf Spoto's Denotational Abstract Semantics}.

3509: The  related work closest to ours is

3510: the denotational abstract semantics proposed by

3511:  Spoto \shortcite{Spoto}.

3512: He defines a

3513:  goal-independent and compositional abstract

3514: semantics of Prolog modeling the depth-first search rule and

3515: the cut. His semantics associates to any Prolog program

3516: a sequence of pairs consisting of a ``kernel''

3517: constraint

3518: and its ``observability'' part. Intuitively,  kernel constraints

3519: denote  computed answers, while  observability constraints give

3520:  information about

3521:   divergent computations and cut executions.

3522: The main difference with our approach is that his semantics is

3523: goal-independent

3524: while ours is not.

3525: This is due to the fact that our abstract

3526: semantics is {\em functional}, i.e., it

3527: associates to each program P a function (an abstract behavior)

3528: mapping every pair $\langle \beta,p\rangle$ to an abstract sequence $B$.

3529: However, this choice is unrelated to our  concrete semantics: we could

3530: as well abstract the concrete semantics  by a {\em relational}

3531: abstract semantics \cite{CousotJLC92},

3532:  making it possible to express dependencies between

3533: input substitutions and the corresponding output substitution sequences.

3534: This is the approach of \cite{RP-98-002}

3535: where we express dependencies between the size of

3536: input terms and the number of corresponding output substitutions.

3537: We will go back  to this issue at

3538: the end of Section \ref{sub:GRADC}.

3539:

3540: %We will go back  to this issue at

3541: %the end of Section \ref{sub:GRADC}.

3542: %but notice that we have applied the relational approach in

3543: %\cite{RP-98-002}

3544: %to express dependencies between the size of

3545: %input terms and the number of corresponding output substitutions.

3546:

3547:

3548:

3549: %However, we can easily rephrase our  semantics

3550: %into

3551: %a {\em relational} one, abstracting the program

3552: %to a set

3553: %of triples of the form $\langle \beta,p,B\rangle$.

3554: %The difference between the functional and relational

3555: % approach

3556: %will be also discussed at

3557: %has been  discussed

3558: %by Cousot and Cousot in \cite{CousotJLC92}.

3559: %We will go back  to this issue at

3560: %the end of Section \ref{sub:GRADC}.

3561:

3562:

3563: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

3564: \section{Generic abstract interpretation algorithm}

3565: \label{sec:GAIA}

3566: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

3567:

3568: A generic abstract interpretation algorithm

3569: is an algorithm that is

3570: parametric with respect to the  abstract domains. It can

3571: be instantiated by various domains

3572: to obtain different data-flow analyses.

3573: Several such algorithms have been proposed for Prolog

3574: \cite{Bruynooghe91,SPE,ICLP91AI,WSA93,TOPLAS,ACTA95,Mellish87,Muthukumar92},

3575: but they do not handle the control features of the language such that

3576:  Prolog search rule and cut.

3577:

3578: The algorithm presented here is essentially an instantiation of the

3579: universal fixpoint algorithm described in \cite{universal}

3580: to the abstract semantics of Section \ref{sec:AS}.

3581: In particular,

3582: it is  quite similar to the algorithm presented  in \cite{ICLP91AI,TOPLAS}:

3583: in fact,

3584: %it can proper\-ly be viewed as

3585: %a generalization

3586: %of this algorithm since

3587:  the abstract semantics  of Section

3588: \ref{sec:AS}

3589: can be

3590: viewed as a proper generalization of the abstract semantics

3591: described in those papers,

3592: %\cite{ICLP91AI,TOPLAS},

3593: where the sequences

3594: of computed answer substitutions are no longer abstracted to sets of

3595: substitutions.\\

3596: The universal algorithm in \cite{universal} is top-down, i.e.,

3597: it computes a subset of the fixpoint (in the form of a set of tuples)

3598: contai\-ning the output value corresponding to a

3599: distinguished input together with

3600:  all the tuples needed

3601: to compute it.

3602: Top-down algorithms are naturally used to perform data-flow analyses,

3603: where one is interested in collecting the abstract information

3604: corresponding to a class of initial queries described by the

3605: distinguished input. It is more efficient in general to

3606: compute a part of the fixpoint only and this allows one to use

3607: infinite abstract domains, which are more expressive

3608: \cite{Cousot92c}.

3609: Although the instantiation of \cite{universal} to our

3610: abstract semantics is as mechanical as

3611: in our previous works (a slightly more general

3612: widening operator is needed however),

3613: the correctness of the algorithm  involves some new theoretical

3614: issues:  the pre-consistency of the post-fixpoint has now to be proven.

3615: Nevertheless, since the novel algorithm is in practice very similar to

3616: the algorithm presented in \cite{TOPLAS}, we only discuss here the extended

3617: widening operator which  ensures a good compromise between

3618: efficiency and accuracy. A detailed description of the algorithm and

3619: its correctness proof can be found

3620: in \cite{sequence}.

3621:

3622:

3623:

3624: \subsection{Extended Widening}

3625: \label{sub:EW}

3626: The extended widening operation used by the novel algorithm is defined as follows.

3627:

3628:

3629:

3630: \begin{definition}[Extended Widening]

3631: \label{def:EW}

3632: \noindent

3633: An {\em extended widening}

3634: on abstract sequences is a (polymorphic%

3635: \footnote{It is parametrized over $D$.})

3636: operation

3637: $\nabla:{\it ASS}_D \times{\it ASS}_D \rightarrow {\it ASS}_D$ that

3638: enjoys the following properties.

3639: Let $\{B_i\}_{i\in{\bf N}}$ be a sequence of elements of ${\it ASS}_D$.

3640: Consider the sequence $\{B'_i\}_{i\in{\bf N}}$ defined by

3641: \vskip-0.1cm

3642: $$

3643: \begin{array}{llll}

3644: B'_0 & = & B_0,                               \\

3645: {B'}_{i+1} & = & B_{i+1}\nabla{B'}_{i} & (i\in{\bf N}).

3646: \end{array}

3647: $$

3648: The following conditions hold:

3649: \begin{enumerate}

3650: \itemsep 1pt

3651: \item $B'_i \geq B_i$ $(i\in{\bf N})$;

3652: \item the sequence $\{B'_i\}_{i\in{\bf N}}$ is stationary, i.e.,

3653: there exists  $j\ge 0$ such that ${B'}_i={B'}_j$

3654: for all $i$ such that $j\leq i$.

3655: \end{enumerate}

3656: \end{definition}

3657:

3658:

3659: An extended widening is slightly more general than a widening

3660: \cite{Cousot92c} because the sequence $\{B'_i\}_{i\in{\bf N}}$

3661: is {\em not} required to be a chain.

3662:

3663: Let us now explain how the extended widening is used

3664: by the algorithm.

3665: Given an input

3666: pair $\langle\beta,p \rangle$, the

3667: algorithm iterates on the computation of

3668: ${\it TAB}({\it sat})\langle\beta,p \rangle$ until

3669: convergence, and concurrently updates {\it sat}, as follows

3670: (recursive calls -- which also modify {\it sat} -- are ignored

3671: in the discussion):

3672: \begin{enumerate}

3673: \itemsep 0pt

3674: \item $B'_0$ $=$ $B_{\bot}$ is stored in the initial ${\it sat}$ as

3675: the output for $\langle\beta,p\rangle$;

3676:

3677: \item $B_i$ results from the $i$-$th$  execution of

3678: ${\it TAB}({\it sat})\langle\beta,p \rangle$;

3679:

3680: \item ${B'}_{i}=B_{i}\nabla{B'}_{i-1}$ is stored in the current ${\it

3681: sat}$ after the $i$-$th$  execution of

3682: ${\it TAB}({\it sat})\langle\beta,p \rangle$;

3683:

3684: \item the loop is exited when $B_{i+1}\leq B'_i$.

3685: \end{enumerate}

3686:

3687: \noindent

3688: The loop terminates because

3689: there must be some $i$ such that $B'_{i+1}=B'_i$

3690: (otherwise Condition 2 of Definition \ref{def:EW} would

3691: be violated), and, hence, $B_{i+1}\leq B'_i$ since

3692: $B'_{i+1}\geq B_{i+1}$ by Condition 1.

3693: The loop can be resumed later on

3694: because some values in {\it sat} have been updated

3695: (Step 1 is omitted in these subsequent executions);

3696: all re-executions of the loop terminate for

3697: the same reasons as the first one; moreover,

3698: the loop can only be resumed finitely many times because

3699: no element in {\it sat} can be improved infinitely many often,

3700: since

3701: there is a $j$ such that $B'_i=B'_j$

3702: for all $i$ greater or equal to $j$.

3703: Note that a local post-fixpoint

3704: is attained  each time the loop is exited.

3705: Thus a global post-fixpoint is obtained when all loops

3706: are terminated for all values in {\it sat}.

3707: The formal characterization of Definition \ref{def:EW} elegantly

3708: captures the idea that the algorithm

3709: sticks as closely as possible to the abstract semantics

3710: during the first iterations, and starts lumping

3711:  the results together only when enough accuracy is obtained,

3712: in order to ensure convergence. The advantage

3713: of this characterization is that no particular value of $j$ is

3714: fixed. So we can think of ``intelligent''

3715: extended widenings that observe how the successive iterates

3716: behave and that enforce convergence exactly at the right time.

3717: %Consistency

3718: %of the result is guaranteed because each $B'_i$ is pre-consistent and

3719: %the algorithm terminates with a post-fixpoint.  Pre-consistency of the

3720: %$B'_i$ follows from ${B'}_i \geq B_i$ and the pre-consistency of $B_i$

3721: %due to Lemma \ref{lemma2}.

3722: The extended widening  used

3723: in our experimental evaluation is based on

3724: this intuitive idea (see Section \ref{sub:GAO}).

3725:

3726:

3727:

3728: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

3729: \section{Cardinality analysis}

3730: \label{sec:FASAS}

3731: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

3732:

3733: The abstract interpretation framework for Prolog presented in previous sections

3734: has been instantiated by a domain of

3735:  abstract  sequences

3736: to perform so-called cardinality analysis; see \cite{cardinality}.

3737:  Cardinality analysis

3738: approximates the number of solutions to a goal and is useful for  many

3739: purposes such as indexing, cut insertion and elimination

3740: \cite{D89,Sahlin91}, dead code elimination, and memory management

3741: and scheduling in parallel systems \cite{Bueno91b,Hermenegildo86}.

3742: The analysis  subsumes traditional determinacy analysis

3743:  such as those of \cite{Ramakrishnan93,D89,Giacobazzi92,Sahlin91}.

3744:

3745:

3746: This section is organized as follows.

3747: First we describe how  a generic abstract domain

3748: for cardinality analysis, which is parametric with respect to

3749:  {\em any} domain of

3750: abstract substitutions, can be built. Then,

3751: we instantiate this generic domain to the domain of abstract substitutions

3752: {\tt Pattern}

3753: \cite{TOPLAS}.

3754: Finally, we discuss  experimental evaluations of the

3755: analysis from both accuracy and efficiency standpoints.

3756:

3757:

3758: \subsection{Generic Abstract Domains for Cardinality Analysis}

3759: \label{sub:GADCA}

3760:

3761: In this section,  generic  domains of  abstract sequences

3762: and abstract sequences with cut information

3763: are  built.

3764: The domains are generic with respect to the information on

3765: the substitutions in the sequences, but they provide

3766: specific information about the sequence structure.

3767: The latter  consists of

3768: lower and upper bounds to the number of substitutions in the

3769: sequences

3770: and  information about the nature

3771: (i.e.,  finite,

3772: incomplete or infinite) of the sequences.

3773: This information  allows us to

3774: perform non-termination analysis

3775:  and a limited form of termination analysis.

3776: %Moreover, it may improve

3777: %the accuracy of the substitution level analysis for some class of programs,

3778: %like those containing cuts.

3779: %These

3780: %abstract domains  %described here

3781: %are adequate for

3782: %can also be used to perform

3783:  Predi\-cate level analyses, like

3784:  determinacy

3785: %\cite{Ramakrishnan93,Debray89a,Giacobazzi92,Sahlin91}

3786: and functionality

3787: \cite{Debray89a}, which  were previously

3788: considered falling outside the scope of

3789: abstract interpretation, can be performed.

3790: \\

3791:

3792: \noindent

3793: {\bf Abstract Substitutions.}

3794: The substitution part of our generic  domain of abstract sequences

3795: is assumed to be an element of an arbitrary domain of abstract substitutions

3796: ${\it AS}_D$. The only requirement on ${\it AS}_D$ is that it  contains a minimum element

3797: $\beta_\emptyset$ such that ${\it Cc}(\beta_\emptyset)=\emptyset$.

3798: %This is

3799: %a trivial requirement since

3800: An abstract domain can always be

3801: enhanced with such an element.\\

3802:

3803: \noindent

3804: {\bf Abstract  Sequences.}

3805: %\label{subsub:ASe}

3806: The generic domain of abstract  sequences manipulates

3807: termination information whose domain is defined below.

3808:

3809: \begin{definition}[Termination Information]

3810: \label{def:TI}

3811: \noindent

3812: A termination information $t$ is an element of the set

3813:  ${\it TI}=\{ {\it st}, \;{\it snt}, \;{\it pt}\}$

3814:  endowed with the ordering $\leq$ defined by

3815: $$

3816: \begin{array}{llll}

3817: t_1\leq t_2 &  \Leftrightarrow & \mbox{\rm either } t_1 = t_2\ \mbox{\rm or}\ t_2 = {\it pt}

3818:              & \;\;\;\forall t_1,t_2\in{\it TI}.

3819: \end{array}

3820: $$

3821: The symbol {\it st} stands for

3822: ``sure termination'' and it characterizes finite sequences;

3823: {\it snt} stands for

3824: ``sure non termination'' and characterizes incomplete and infinite

3825: sequences; {\it pt} stands for ``possible

3826: termination'' and corresponds to  absence of information.

3827: \end{definition}

3828:

3829: \noindent

3830:

3831:

3832: The domain of abstract substitution sequences is  defined as follows.

3833:

3834:

3835:

3836: \begin{definition}[Abstract Sequences]

3837: \label{def:ASS_D}

3838: \noindent

3839: Let $D$ be a finite set of program variables. We denote by

3840: ${\it ASS}_D$ the set of all 4-tuples

3841:  $\langle \beta, m, M, t \rangle$ such that

3842: % \begin{enumerate}

3843: % \item

3844:        $\beta \in {\it AS}_D$,

3845:        $m \in {\bf N}$,

3846:        $M \in {\bf N}\cup \{\infty\}$,

3847:        and $t \in {\it  TI}$.%;

3848: % \item $m\leq M$;

3849: % \item $M=0\ \Leftrightarrow\ \beta=\beta_\emptyset$.

3850: %  \end{enumerate}

3851: \end{definition}

3852:

3853:   Informally, $\beta$ describes all substitutions in the

3854: sequences,

3855: $m$ and $M$ are

3856: lower and upper bounds on the number of substitutions in the

3857: sequences,  and

3858: $t$ is an  information on termination.

3859:  %In the following, abstract sequences are denoted by the letter $B$

3860: %possibly subscripted.

3861:

3862: The ordering on abstract sequences is defined as follows.

3863:

3864: \begin{definition}[Ordering on Abstract Sequences]

3865: \label{def:OAS}

3866: \noindent

3867: Let $B_1,B_2 \in {\it ASS}_D$.

3868: $$

3869: \begin{array}{llll}

3870: B_1 \leq B_2 &\mbox{\rm iff } & \beta_1 \leq \beta_2 \mbox{\rm \ and\ }

3871:                                  m_1 \geq m_2 \mbox{\rm \ and\ }

3872:                                  M_1 \leq M_2 \mbox{\rm \ and\ }

3873:                                  t_1 \leq t_2.

3874: \end{array}

3875: $$

3876: \end{definition}

3877:

3878:

3879: The set of program substitution sequences described by an abstract sequence

3880: $B$ is formally defined as follows.

3881:

3882:

3883: \begin{definition}[Concretization for Abstract Sequences]

3884: \label{def:CASS_D}

3885:

3886: \noindent

3887: Let $B\!\!=\!\!\langle \beta, m,M, t \rangle\!\in\! {\it ASS}_D$.

3888: We define

3889: \begin{center}

3890: \begin{tabbing}

3891: 123\=456\=789\=123456\=12345\kill

3892: ${\it Cc}(B)={\it Sseq}_1(\beta)\cap{\it

3893: Sseq}_2(m,M) \cap{\it Sseq}_3(t)$ \\

3894:  { where} \\

3895: \>\> \>

3896: ${\it Sseq}_1(\beta) = \{S: S\in {\it PSS}_D\ \mbox{\rm and}\

3897:                      {\it Subst}(S)\subseteq{\it Cc}(\beta)\},$\\

3898:         \>\> \>

3899: ${\it Sseq}_2(m,M) = \{S: S\in {\it PSS}\ \mbox{\rm and}\

3900:                     m \leq {\it Ns}(S) \leq M\},$\\

3901:         \>\> \>

3902: ${\it Sseq}_3({\it snt}) = \{S: S\in {\it PSS}\ \mbox{\rm and}\

3903:                     S$ is incomplete or infinite\},\\

3904:         \>\> \>

3905: ${\it Sseq}_3({\it st}) = \{S: S\in {\it PSS}\ \mbox{\rm and}\

3906:                     S$ is finite\},\\

3907:         \>\>\>

3908: ${\it Sseq}_3({\it pt}) = {\it PSS}.$ \end{tabbing}

3909: \end{center}

3910: \end{definition}

3911:

3912:

3913:

3914: Monotonicity of the concretization function is a simple

3915: consequence of the definition.

3916:

3917:

3918:

3919: We denote by $B_\bot$  the special abstract sequence

3920: $\langle\beta_\emptyset,0,0,{\it snt}\rangle$ which is such that

3921:  ${\it Cc}(B_\bot)=\{<\bot>\}$ as required in Section \ref{sub:AD}.

3922: %Notice that

3923: %, assuming that $\beta_\emptyset$ is the only abstract substitution

3924: %such that ${\it Cc}(\beta_\emptyset)=\emptyset$,

3925: %there is no minimum

3926: %abstract sequence in the domain (even no minimal

3927: %one, since the component $m$ is unbounded).

3928: %In fact, because we assume that ${\it AS}_D$ contains an abstract substitution

3929: %whose concretization is empty, we should sensibly assume that

3930: %${\it ASS}_D$ and ${\it ASSC}_D$ contain corresponding special elements

3931: %$B_\emptyset$ and $C_\emptyset$, because ``no input produces no output''.

3932: %These special elements are not needed actually because

3933: %they can be safely replaced by {\em any} other element (from the same

3934: %domain). Dealing with elements $B_\emptyset$ and $C_\emptyset$ would

3935: %require to add special cases in the implementation of each abstract

3936: %operation, which is cumbersome.

3937: %Thus, for the sake of simplicity, we do not consider such elements and

3938: %we replace them by arbitrary approximations where appropriate.

3939: It is easy to prove that for all abstract sequences

3940: $B\in{\it ASS}_D$, the set ${\it Cc}(B)$ is chain-closed; see \cite{sequence}.

3941:

3942: \vskip0.3cm

3943:

3944: \noindent

3945: {\bf Abstract Sequences with Cut Information.}

3946: %\label{subsub:ASCI}

3947: Abstract sequences with cut information are obtained by enhancing

3948: abstract sequences  with  information about

3949: execution of cuts.

3950:

3951:

3952: Let us first define the abstract domain for cut information.

3953:

3954: \begin{definition}[Abstract Cut Information]

3955: \label{def:ACI}

3956: \noindent

3957: %We require the existence of a new object denoted by {\it weakcut}.

3958: An abstract cut information {\it acf} is an element of the set

3959: ${\it ACF}=\{{\it cut},{\it  nocut}, {\it weakcut}\}$.

3960: \end{definition}

3961:

3962:

3963:

3964:

3965: \begin{definition}[Abstract Sequences with Cut Information]

3966: \label{def:ASSC}

3967: \noindent

3968: Let $D$ be a finite set of program variables.

3969: We denote by $ {\it ASSC}_D $  the set of pairs

3970: $\langle B, {\it acf} \rangle$ where

3971: $B\in{\it ASS}_D $ and ${\it acf} \in {\it ACF}$.

3972: \end{definition}

3973:

3974:

3975: Informally, {\it cut} indicates that a cut has been executed in all

3976: sequences, {\it nocut} that no cut has been executed in any sequence,

3977: and {\it weakcut} that a cut has been executed for all sequences

3978: producing at least one solution. More formally, the concretization

3979: of an abstract sequence with cut information is defined as follows.

3980:

3981: \begin{definition}[Concretization for Abstract Sequences

3982:   with Cut Information]

3983: \label{def:CASCI}

3984: \noindent

3985: Let $ B\in{\it ASS}_D$. We define\\

3986: \\

3987: $\begin{array}{llllll}

3988:  {\it  Cc}(\langle B,{\it cut}\rangle) & =

3989: & \{ \langle S, cut\rangle : S\in {\it Cc}(B)\}, \\

3990:  {\it Cc}(\langle B,{\it nocut}\rangle) &=&

3991: \{  \langle S, nocut \rangle: S\in {\it Cc}(B)\}, \\

3992:  {\it Cc}(\langle B,{\it weakcut}\rangle) &=&

3993: \{  \langle S, cut \rangle : S\in {\it Cc}(B)\}

3994:                 \cup  \\

3995: & &           \{  \langle S, nocut \rangle :\

3996:                           S\in {\it Cc}(B)

3997:                      \mbox{\rm\ and\ }S\in  \{<>,<\bot>\}\}.

3998: \end{array} $

3999: \end{definition}

4000:

4001: %\noindent

4002: %Abstract sequences with cut information are denoted by the letter $C$.

4003:

4004: \subsection{Abstract Operations}

4005: \label{sub:GAO}

4006: Our next task is to provide definitions of all abstract operations

4007: specified in Section~\ref{sub:AO}.

4008: For space reasons, we describe here a subset of the

4009: operations, i.e.,  extended widening,

4010:  unification,   operation  treating cut,

4011: and  concatenation.

4012: The other operations are described in the appendix.

4013: The reader is referred to \cite{sequence} for the correctness proofs.

4014:

4015: The operations on abstract substitutions which are used in the

4016: definition of the

4017: operations on abstract sequences

4018:  will be recalled when needed.

4019:

4020:

4021:

4022:

4023: \vskip0.3cm

4024:

4025: \noindent

4026: {\bf Extended Widening:} $\nabla: {\it ASS}_D\times {\it ASS}_D\rightarrow {\it ASS}_D$\\

4027: %\label{subsub:AS:EW}

4028: We require that the abstract domain ${\it AS}_D$ is equipped with a

4029: widening operation

4030: $\nabla': {\it AS}_D\times {\it AS}_D\rightarrow {\it AS}_D$.

4031: It can be an extended widening, a normal widening, or,

4032: if ${\it AS}_D$ is finite or

4033: enjoys the finite ascending chain property,

4034: any upper bound operation.

4035:  The widening

4036: on sequences is obtained by taking the least upper bound of

4037: the termination components, the minimum of the lower bounds

4038: and setting the upper bound to infinity. \\

4039: Assume that $B_{\it old} = \langle \beta_{\it old}, m_{\it old},

4040: M_{\it old}, t_{\it old} \rangle$ and $B_{\it new} = \langle \beta_{\it new},

4041:  m_{\it new},

4042: M_{\it new}, t_{\it new} \rangle$.\\

4043: The

4044: operation

4045:  $\nabla: {\it ASS}_D\times {\it ASS}_D\rightarrow {\it ASS}_D$

4046:  is defined

4047: as follows.

4048:

4049:

4050:

4051: \begin{small}

4052: $$

4053: \begin{array}{lllll}

4054: B_{\it new}\nabla B_{\it old} & = & \langle \beta_{\it new}\nabla'\beta_{\it old}, m_{\it new}, M_{\it new}, t_{\it new}\rangle

4055:   &\myif &\! \beta_{\it new}\not\leq\beta_{\it old}

4056:         \\

4057: %           & = & \langle \beta_{\it old}, 0, M_{\it new}, {\it pt}\rangle

4058: %  &\myif &\! \beta = \beta_\emptyset

4059: %                   \and t\not\leq   t_{\it o},     \\

4060:            & = & \langle \beta_{\it old}, m_{\it new}, M_{\it new}, {\it pt}\rangle

4061:   &\myif &\! \beta_{\it new}\leq\beta_{\it old}

4062:                    \and t_{\it new}\not\leq   t_{\it old}    \\

4063:            & = & \langle \beta_{\it old},

4064:                          \min(m_{\it new},m_{\it old}),

4065:                                      \infty, t_{\it old}\rangle

4066:   &\myif &\! \beta_{\it new}\leq\beta_{\it old}

4067:                    \and t_{\it new}\leq   t_{\it old}  \and \\

4068: & & & &

4069:

4070:                         (m_{\it new} \!< \!m_{\it old}

4071:                                 \myor

4072:                          M_{\it new} \!> \!M_{\it old})    \\

4073:            & = & B_{\it old}

4074:   &\myif &\! B_{\it new}\leq   B_{\it old}.     \\

4075: \end{array}

4076: $$

4077: \end{small}

4078:

4079:

4080: The first case makes sure that the algorithm iterates until the

4081: abstract substitution  part stabilizes. When it is stable,

4082: the widening is applied on sequences.

4083: \\

4084:

4085:

4086:

4087: \noindent

4088: {\it Example}.

4089: Consider the following program:\\

4090: \\

4091: \(

4092: \begin{array}{lll}

4093: {\tt repeat}.\\

4094: {\tt repeat} & \mbox{\tt :-}& {\tt repeat}.

4095: \end{array}

4096: \)\\

4097:

4098:

4099: The concrete semantics of this program maps the input

4100: $\langle \epsilon, {\tt repeat}\rangle$, where $\epsilon$ is the empty

4101: substitution, to the infinite sequence $<\epsilon,\ldots,\epsilon,\ldots>$.

4102:

4103: On this example, because the program has no variables,

4104: our domain of abstract substitutions only contains   two values, say

4105: $\beta_{\emptyset}$ and $\beta_{\top}$, such that\\

4106: \\

4107: \(\begin{array}{llll}

4108: {\it Cc}(\beta_{\emptyset})& =& \emptyset\\

4109:  {\it Cc}(\beta_{\top})& =&\{\epsilon\}.

4110: \end{array}\)\\

4111:

4112:

4113: Let $B_{\bot}=\langle \beta_{\emptyset},0,0,{\it snt}\rangle$.

4114:  Starting from $B_{\bot}$, the algorithm computes the abstract sequences\\

4115: \\

4116: \(\begin{array}{llllllllllllll}

4117: B_0& =& B_{\bot}& \hskip0.8cm B'_0& = & B_{\bot}\\

4118: B_1& =& \langle \beta_{\top},1,1,{\it snt}\rangle&

4119: \hskip0.8cm

4120: B'_1& =& B_1 \nabla B'_{0}=\langle \beta_{\top},1,1,{\it snt}\rangle \\

4121: B_2& =& \langle \beta_{\top},2,2,{\it snt}\rangle

4122: &\hskip0.8cm

4123:  B'_2& =& B_2 \nabla B'_{1}=\langle \beta_{\top},1,\infty,{\it snt}\rangle \\

4124: B_3& =& \langle \beta_{\top},2,\infty,{\it snt}\rangle

4125: \end{array}

4126: \)

4127: \\

4128:

4129:

4130: Notice that the widening on sequences is applied when the

4131: abstract substitution part stabilizes, i.e., after the computation of the

4132: abstract sequence $B_2$. The next iterate $B_3$ satisfies the property that

4133: $B_3\leq B'_2$. Hence, according to the discussion in Section

4134: \ref{sub:EW},  the execution terminates

4135: returning the final value

4136: \begin{center}

4137: $B'_2=\langle \beta_{\top}, 1,\infty,{\it snt}\rangle.$

4138: \end{center}

4139:

4140: Observe that $B'_2$

4141:     safely approximates the concrete

4142: infinite sequence

4143: $<\epsilon,\ldots,\epsilon,\ldots>$. Moreover, it

4144:  expresses the fact that

4145: the execution of {\tt repeat} {\em surely} succeeds at least once

4146: and  {\em surely} does not terminate\footnote{This example also shows that

4147: our framework can express non-failure properties such as the ones

4148: described in \cite{BC99,DLH97}.}.

4149:

4150: \vskip0.3cm

4151: \noindent

4152: {\bf Unification of Two Variables:}\ \

4153:    {\tt UNIF-VAR}$: {\it AS}_{\{x_1,x_2\}}

4154:                      \rightarrow {\it ASS}_{\{x_1,x_2\}}$\ \\

4155: \noindent

4156: Given an abstract substitution $\beta$ with domain

4157: $\{x_1,x_2\}$, this operation returns an abstract  sequence

4158: which represents a set of substitution sequences

4159:  of length 0 or~1 (depending upon the

4160: success or failure of the unification). The terms bound to $x_1$ and

4161: $x_2$ are unified in all these sequences.

4162: The operation {\tt UNIF-VAR} on abstract sequences uses an upgraded

4163: version of the operation {\tt UNIF-VAR}

4164: on abstract substitutions defined in \cite{ICLP91AI,TOPLAS}.

4165: The latter,  in addition

4166: to the resulting abstract substitution,

4167:  produces now two flags indicating whether the unification always

4168: succeeds, always fails, or can both succeed and fail.

4169:  The additional information

4170: is expressed by the boolean values {\it ss} and {\it sf} as specified below.\\

4171:

4172: \noindent

4173: {\em Operation}  $\;${\tt UNIF-VAR}$: {\it AS}_{\{x_1,x_2\}}

4174:                      \rightarrow ({\it AS}_{\{x_1,x_2\}}

4175:                      \times{\it Bool}\times{\it Bool})$

4176:

4177: \noindent

4178: Let $\beta\in{\it AS}_{\{x_1,x_2\}}$ and

4179:     $\langle\beta',{\it ss},{\it  sf}\rangle=

4180:      \mbox{\tt UNIF-VAR}(\beta)$. The following  conditions

4181: hold:

4182: \begin{enumerate}

4183: \itemsep 3pt

4184: \item $

4185:     \begin{array}{llll}

4186:       \forall\theta\in{\it Cc}(\beta):

4187:        \forall\sigma\in{\it SS}:

4188:         (\sigma\in{\it mgu}(x_1\theta,x_2\theta)&\Rightarrow&

4189:      [\![\theta\sigma]\!]\in{\it Cc}(\beta'));

4190:     \end{array}$

4191: \item

4192:   $\begin{array}{lll}

4193:    {\it ss}={\it true}&\Rightarrow& %{\it Cc}(\beta)\neq \emptyset\mbox{\rm\ and\ }

4194:                          (\forall\theta\in{\it Cc}(\beta):\

4195:                           x_1\theta\mbox{\rm\ and\ }

4196:                           x_2\theta\mbox{\rm\ \ are~unifiable);}

4197:    \end{array}$

4198: \item

4199:   $\begin{array}{lll}

4200:    {\it sf}={\it true}&\Rightarrow&

4201:                          (\forall\theta\in{\it Cc}(\beta):\

4202:                           x_1\theta\mbox{\rm\ and\ }

4203:                           x_2\theta\mbox{\rm\ \ are~not~unifiable})

4204:                           %\mbox{\rm\ and\ }\beta'=\beta_\emptyset

4205:                           .

4206:    \end{array}$

4207: \end{enumerate}

4208:

4209: \vskip0.2cm

4210:

4211:

4212:  Based on the upgraded operation {\tt UNIF-VAR}

4213:  for abstract substitutions, we provide an implementation of the operation

4214: {\tt UNIF-VAR} for abstract sequences,

4215: which is correct with respect to the corresponding

4216: specification given in

4217: Section \ref{sub:AO}.

4218:

4219: The operation {\tt UNIF-VAR}$: {\it AS}_{\{x_1,x_2\}}

4220:                      \rightarrow {\it ASS}_{\{x_1,x_2\}}$ on abstract sequences

4221: is  defined as follows.

4222:  Let $\beta\in {\it AS}_{\{x_1,x_2\}} $ and

4223: $\langle\beta'',{\it ss},{\it  sf}\rangle=

4224:      \mbox{\tt UNIF-VAR}(\beta)$. We have that {\tt UNIF-VAR}$(\beta)=B'$ where

4225:  $B'$ is the abstract sequence $\langle \beta',m',M',t'\rangle$

4226: such that

4227: $$

4228: \begin{array}{lll}

4229: \beta'   & = & \beta''                  \\

4230: m'      & = & \myif~{\it ss}~\mbox{\rm then}~1~\mbox{\rm else}~0\\

4231: M'      & = & \myif~{\it sf}~\mbox{\rm then}~0~\mbox{\rm else}~1\\

4232: t'      & = & {\it st}.

4233: \end{array}

4234: $$

4235: %$$

4236: %\begin{array}{llllll}

4237: %\beta'   & = & \beta''                     \\

4238: %m'      & = & 1 & \myif~{\it ss=true}\\

4239: %& =& 0 & \mbox{ otherwise}\\

4240: %M'      & = & 0 & \myif~{\it sf=true}\\

4241: %& =& 1 & \mbox{ otherwise}\\

4242: %t'      & = & {\it st}.

4243: %%

4244: %%\myif~{\it ss}~\mbox{\rm then}~1~\mbox{\rm else}~0;\\

4245: %M'      & = & \myif~{\it sf}~\mbox{\rm then}~0~\mbox{\rm else}~1;\\

4246: %t'      & = & {\it st}.

4247: %\end{array}

4248: %$$

4249:

4250:

4251:

4252:

4253: %\vskip0.2cm

4254: \noindent

4255: {\bf Abstract Interpretation of the Cut:}\ \

4256:    {\tt AI-CUT}$: {\it ASSC}_{{D'}}

4257:                         \rightarrow {\it ASSC}_{{D'}}$\

4258:

4259: \noindent

4260: Let $C=\langle\langle \beta,m,M,t\rangle,{\it acf}\rangle$.

4261: {\tt AI-CUT}$(C)=\langle\langle \beta',m,'M',t'\rangle,{\it acf'}\rangle$

4262: where

4263: %

4264: %and $C'=\langle\langle \beta',m,'M',t'\rangle,{\it acf'}\rangle$.

4265: %Then  $C'=${\tt AI-CUT}$(C)$ where

4266: \begin{quote}

4267:  $ \begin{array}{llll}

4268: \beta' & = & \beta                                         \\

4269:

4270: m'     & = & \min(1,m)                                      \\

4271:

4272: M'     & = & \min(1,M)                                      \\

4273:

4274: t'     & = & {\it st}  &\myif m\geq 1 \myor {\it t}={\it st}  \\

4275:        & = & {\it snt} &\myif M=0 \and {\it t}={\it snt}   \\

4276:        & = & {\it pt}  &\otherwise                          \\

4277:

4278: {\it acf}'

4279:        & = & {\it cut}  &\myif m\geq 1 \myor {\it acf}={\it cut}\\

4280:        & = & {\it nocut} &\myif M=0 \and {\it acf}={\it nocut}\\

4281:        & = & {\it weakcut} &\otherwise.

4282:   \end{array}$

4283: \end{quote}

4284:

4285:

4286: \noindent

4287: {\it Example}. Consider the program

4288: \\

4289: \\

4290: \(\begin{array}{lllll}

4291: {\tt p(X)} & \mbox{\tt :-} & {\tt q(X),\;!}.\\

4292: {\tt q(X)} & \mbox{\tt :-} & {\tt X=a}.\\

4293: {\tt q(X)} & \mbox{\tt :-} & {\tt X=b}.

4294: \end{array}

4295: \)\\

4296: \\

4297:

4298: For the sake of simplicity we use a

4299: simple domain of abstract substitutions

4300: which can be seen as the mode component of the

4301: {\tt Pattern} domain \cite{RP-98-002,TOPLAS}.

4302:  The example is intended to illustrate the abstract execution

4303: of the operation

4304: {\tt AI-CUT}. Hence, we do not enter here into the details of

4305: the other  operations, but the reader is referred

4306: to the appendix for

4307: their definition.

4308:

4309:

4310: The abstract execution of  the procedure ${\tt p}$ called with

4311: its argument being a variable is as follows.

4312: Let $$\beta={\tt X}\mapsto {\tt var}$$ be the initial abstract substitution.

4313: Let  $c$ be  the clause of the program defining ${\tt p}$.

4314: \\

4315: First, the abstract sequence with cut information $C$ is computed by

4316: $$C={\tt EXTC}(c,\beta)=\langle\langle {\tt X}\mapsto {\tt var}, 1,1,

4317: {\it st}\rangle , {\it nocut}\rangle.$$

4318: Then, the procedure {\tt q} that occurs

4319: in the body of $c$ is executed with

4320: $\beta={\tt SUBST}(C)$ returning the

4321: abstract sequence

4322: $$B=\langle {\tt X}\mapsto {\tt ground}, 2, 2, {\it st}\rangle.$$

4323: Hence, the abstract sequence

4324: with cut information $C'$ is computed as follows

4325: $$C'={\tt EXTGS}({\tt q(X)}, C,B)=\langle \langle {\tt X}\mapsto {\tt ground}, 2, 2, {\it st}\rangle,

4326: {\it nocut}\rangle.$$ Now, the operation

4327: {\tt AI-CUT}$(C')$ is applied. Following the definition above, one obtains

4328:  $$\mbox{{\tt AI-CUT}}(C')=\langle \langle {\tt X}\mapsto {\tt ground}, 1, 1, {\it st}\rangle,

4329: {\it cut}\rangle$$

4330: expressing~the fact that a cut in the body of $c$ is {\em surely} executed.

4331: The final result~is

4332: $$B'={\tt SEQ}(C')=\langle {\tt X}\mapsto {\tt ground}, 1, 1, {\it st}\rangle$$

4333:  stating that the execution of ${\tt p} $ called with

4334: its argument being  a variable surely terminates and

4335: succeeds exactly once.

4336:

4337:

4338: Consider now the abstract execution of  the procedure ${\tt p}$ called with

4339: a ground argument.

4340: Let $$\beta={\tt X}\mapsto {\tt ground}$$ be the initial abstract substitution.

4341: In this case, the abstract sequence with cut information $C$ is first

4342: computed by

4343: $$C={\tt EXTC}(c,\beta)=\langle\langle {\tt X}\mapsto {\tt ground}, 1,1,

4344: {\it st}\rangle , {\it nocut}\rangle.$$

4345: Then, the procedure {\tt q}  is executed with

4346: $\beta={\tt SUBST}(C)$ returning

4347: $$B=\langle {\tt X}\mapsto {\tt ground}, 0, 1, {\it st}\rangle.$$

4348: The abstract sequence

4349: with cut information $C'$ is computed as follows

4350: $$C'={\tt EXTGS}({\tt q(X)}, C,B)=

4351: \langle \langle {\tt X}\mapsto {\tt ground}, 0, 1, {\it st}\rangle,

4352: {\it nocut}\rangle.$$ The operation

4353: {\tt AI-CUT}$(C')$ returns

4354:  $$\mbox{\tt AI-CUT}(C')=\langle \langle {\tt X}\mapsto {\tt ground}, 0, 1, {\it

4355: st}\rangle,

4356: {\it weakcut}\rangle$$ expressing the fact

4357:  that, in this case, the computation either fails without executing the cut

4358: or succeeds  once after executing the cut.

4359: The final result is

4360: $$B'={\tt SEQ}(C')=\langle {\tt X}\mapsto {\tt ground}, 0, 1, {\it

4361: st}\rangle$$ stating that the execution of ${\tt p}$

4362: called with a ground argument

4363: succeeds at most once and surely terminates.

4364:

4365:

4366: The {\tt Pattern} domain

4367: used in our experiments

4368: is more elaborated than

4369: the simple domain of abstract substitutions

4370: used in this example.

4371: However, it does not provide more precision in these

4372: cases.

4373: A more sophisticated domain where an

4374: abstract sequence is represented

4375: as $\langle<\beta_1,\ldots,\beta_n>,m,M,t\rangle$

4376: with $<\beta_1,\ldots,\beta_n>$ being an explicit sequence of

4377:  abstract substitutions

4378:  could return in the first case

4379: a more precise result. Indeed, one could obtain

4380: $B=\langle \{{\tt X}\mapsto {\tt a}\},\{{\tt X}\mapsto {\tt b}\}

4381: , 2, 2, {\it st}\rangle$ and then

4382: $B'=\langle \{{\tt X}\mapsto {\tt a}\}, 1, 1, {\it st}\rangle$.

4383: However, such a domain could not improve the result in the second

4384: case since the fact that the output substitution can be

4385:  either ${\tt X}\mapsto {\tt a}$ or

4386: ${\tt X}\mapsto {\tt b}$ would be represented by

4387: ${\tt X}\mapsto {\tt ground}$ as we have done above.

4388:

4389:

4390: \vskip0.3cm

4391: \noindent

4392: {\bf Abstract Lazy Concatenation.}

4393: %\label{subsub:ALC}

4394: The implementation of the operation {\tt CONC}  is complicated here,

4395: in order  to get accurate results when the domain ${AS}_D$ is instantiated to the domain {\tt Pattern}.

4396: The implementation works on {\em

4397:   enhanced} sets of abstract sequences which allow us to keep

4398: individual structural information about the results of every clause in order

4399: to detect mutual exclusion of the clauses.

4400:

4401: Let us  motivate the lifting of

4402: abstract sequences to enhanced abstract sequences.

4403: Lifting an abstract domain to its power set,

4404: see, for instance, \cite{Cousot79a,FileILPS94}, is sometimes

4405: useful when the original abstract domain is not expressive enough to

4406: gain a given level of

4407: accuracy.

4408: Replacing an abstract domain by its power set is computationally

4409: expensive

4410: however; see \cite{WSA93_Granularity}.

4411: Sometimes, the accuracy is lost only inside a few operations;

4412: thus, a good compromise can be to lift the domain only locally,

4413: when these operations are executed, and to go back to the simple

4414: domain afterwards. This is exactly what we are going to do for the

4415: operation {\tt CONC}. The lifted version

4416: of the abstract domain

4417: that we are about to define

4418: is useful when the abstract domain  is able to express

4419: definite, but not disjunctive,

4420:  structural information about

4421:  terms.

4422: % but not disjunctive structural information.

4423: In such a domain, for instance,

4424: the principal functor of the term bound to a program

4425: variable

4426: can be either

4427: definitely known or not known at all; it is not possible to express

4428: that it belongs to a given finite set.

4429: The domain {\tt Pattern} used in our experiments is an abstract domain

4430: of this kind. Disjunctive structural information is however essential

4431: to implement the operation {\tt CONC} accurately: it allows us to

4432: detect

4433: mutually exclusive abstract sequences, i.e., abstract sequences that

4434: should not be ``abstractly concatenated'' since they correspond to

4435: different concrete inputs.

4436: In order to keep disjunctive structural information, our

4437: implementation of {\tt CONC} works on a finite set of abstract sequences.

4438: This set is ``normalized'' in some way, in order to simplify the

4439: case analysis in the implementation.

4440: Basically, we differentiate between ``surely empty'' abstract sequences,

4441: approximating only sequences of the form  $<>$ or $<\bot>$,

4442:  and ``surely non empty'' abstract

4443: sequences, approximating

4444:  only sequences of the form $<\theta>::S$. This is useful

4445: because sequences such as $<>$ or $<\bot>$ are possible outputs

4446: for any input, while sequences of the form $<\theta>::S$ are only

4447: possible for some inputs. Therefore we only have to check

4448: incompatibility of ``surely non empty'' abstract

4449: sequences.

4450: This discussion motivates

4451: the following definitions of semi-simple abstract sequences and simple

4452: abstract sequences.

4453:

4454: \begin{definition}[Semi-Simple Abstract Sequences]

4455: \label{def:SNAS}

4456: \noindent

4457: Let $B\in{\it ASS}_D$. We say that $B$ is a {\em semi-simple} abstract sequence

4458: if

4459: \begin{enumerate}

4460: \itemsep 0pt

4461: \item either, $\beta=\beta_\emptyset$ and $m=M=0$

4462: \item or, $\beta\neq\beta_\emptyset$ and $1\leq m \leq M$.

4463: \end{enumerate}

4464: \end{definition}

4465:

4466: \begin{definition}[Simple Abstract Sequences]

4467: \label{def:NAS}

4468: \noindent

4469: Let $B\in{\it ASS}_D$. We say that $B$ is a {\em simple} abstract sequence

4470: if it is semi-simple and $t\in\{{\it snt},{\it st}\}$.

4471: \end{definition}

4472:

4473:

4474: Semi-simple abstract sequences formalize our idea of distinguishing

4475: between ``surely empty'' and ``surely non empty'' abstract sequences.

4476: Note that,

4477: assuming that $\beta_\emptyset$ is the only abstract substitution such that

4478: ${\it Cc}(\beta_\emptyset)=\emptyset$, we have that ${\it Cc}(B)\neq\emptyset$ for

4479: any semi-simple abstract sequence $B$.

4480: %Enhanced abstract sequences

4481: %are  then defined as follows.

4482:

4483: \begin{definition}[Enhanced Abstract Sequences]

4484: \label{def:EnhAS}

4485: \noindent

4486: Let $D$ be a finite set of program variables.  We denote by

4487: ${\it ASS}^{\it enh}_D$  the set of all sets of the form

4488: $\{B_1,\dots,B_n\}$, where $n\geq 0$ and

4489: $B_1,\dots,B_n$ are semi-simple abstract sequences from ${\it ASS}_D$.

4490: Elements of ${\it ASS}^{\it enh}_D$ are called {\em enhanced abstract

4491:   sequences}; they are denoted by {\it SB} in the following.

4492: The concretization function

4493: ${\it Cc}:{\it ASS}^{\it enh}_D \rightarrow {\it CSS}_D$ is defined by

4494: ${\it Cc}({\it SB}) =\bigcup_{B\in{\it SB}}{\it Cc}(B).$

4495: \end{definition}

4496:

4497:

4498: The operation $ {\tt SPLIT1}$ transforms an

4499: arbitrary abstract sequence into an equivalent enhanced abstract

4500: sequence.

4501: %This is realized by the operation $ {\tt SPLIT1}$ described  below.

4502:

4503: \vskip0.3cm

4504: \noindent

4505: {\em Operation} $ \;{\tt SPLIT1}: {\it ASS}_D \rightarrow {\it ASS}^{\it enh}_D$\\

4506: \noindent

4507: This operation is required to satisfy the property that

4508: for every $B\in{\it ASS}_D$,

4509: $ {\it Cc}({\tt SPLIT1}(B))={\it Cc}(B)$.

4510: %It is defined as follows.

4511: Let $B=\langle \beta,m,M,t\rangle$. We define ${\it SB}'={\tt SPLIT1}(B)$

4512: as

4513:  ${\it SB}'={\it SB}_1\cup{\it SB}_2$ where

4514: \vskip0.2cm

4515: \begin{quote}

4516:   $\begin{array}{llll}

4517:    {\it SB}_1 & = & \{\langle \beta_\emptyset,0,0,t\rangle\} &

4518:                     \myif m=0                             \\

4519:               & = & \emptyset & \otherwise                  \\

4520:    {\it SB}_2 & = & \{\langle \beta,\max(1,m),M,t\rangle\} &

4521:                     \myif \beta\neq\beta_\emptyset

4522:                     \and \max(1,m)\leq M                   \\

4523:               & = & \emptyset & \otherwise .                 \\

4524:   \end{array}$

4525: \end{quote}

4526:

4527:

4528: \vskip0.2cm

4529:

4530: The operation {\tt MERGE} is the converse of {\tt SPLIT1}: it

4531: %

4532: %We also need a conversion operation that

4533: transforms an enhanced

4534: abstract sequence into a plain abstract sequence.

4535: %This operation is called

4536: %{\tt MERGE}.

4537: % and it is the

4538: %converse of {\tt SPLIT1}.

4539: Most of the time, this operation loses

4540: part of the information expressed by the enhanced abstract

4541: substitution sequence; but it does not lose any information

4542: when the enhanced

4543: abstract sequence results from a single application of {\tt SPLIT1}.

4544:

4545: \vskip0.3cm

4546: \noindent

4547: {\em Operation} $ \;{\tt MERGE}: {\it ASS}^{\it enh}_D \rightarrow {\it ASS}_D$\\

4548: The operation

4549: $ {\tt MERGE}$

4550:  satisfies the following properties:

4551:

4552: \begin{enumerate}

4553: \itemsep 2pt

4554: \item For every ${\it SB}\in{\it ASS}^{\it enh}_D$,

4555: $ {\it Cc}({\it SB})\subseteq{\it Cc}({\tt MERGE}({\it SB}))$

4556: \item For every $B\in{\it ASS}_D$,

4557: $ {\it Cc}({\tt MERGE}({\tt SPLIT1}(B)))={\it Cc}(B)$.

4558: \end{enumerate}

4559:

4560:

4561:

4562:

4563:

4564: The definition of {\tt MERGE} requires choosing a particular abstract

4565: sequence $B_\emptyset$ such that ${\it Cc}(B_\emptyset)=\emptyset$.

4566: We decide that $B_\emptyset=\langle\beta_\emptyset,1,0,{\it st}\rangle$.

4567: This choice is arbitrary since there is no best (least)

4568: representation of the empty set of abstract sequences in this domain.

4569: Moreover, it uses the binary operation

4570: $ {\tt UNION}: ({\it AS}_D \times {\it AS}_D) \rightarrow {\it AS}_D$,

4571:  which is inherited from

4572: our previous framework. The latter is

4573:  extended  to finite sequences

4574: of abstract substitutions as follows:

4575:

4576:

4577: \begin{quote}

4578:   \begin{tabular}{lllll}

4579:    ${\tt UNION}(<>)$ & $=$ & $\beta_\emptyset$ &       \\

4580:    ${\tt UNION}(<\beta>)$ & $=$ & $\beta,\;$

4581:  for every

4582:                                         $\beta\in {\it AS}_D$  \\

4583:    ${\tt UNION}(<\beta_1,\dots,\beta_n>)$ & $=$ &

4584:    ${\tt UNION}(\beta_1,{\tt UNION}(<\beta_2,\dots,\beta_n>))$, \\

4585: & &  for all

4586:     $\beta_1,\dots,\beta_n\in{\it AS}_D$ $(n\geq 2)$.  \\

4587:   \end{tabular}

4588: \end{quote}

4589:

4590:

4591: The operation  {\tt MERGE} can now be defined.

4592: Let $\sqcup$ denote the least upper bound on {\it TI}.

4593: Let  ${\it SB}\in{\it ASS}^{\it enh}_D$ such that

4594: ${\it SB}=\{B_1,\dots,B_n\}$ and

4595: $B_i=\langle \beta_i,m_i,M_i,t_i\rangle$ $(1\leq i\leq n)$.

4596: The abstract sequence $B'={\tt MERGE}({\it SB})$ is such that

4597:

4598: \begin{quote}

4599:   $\begin{array}{llll}

4600:      B' & = & B_\emptyset  & \myif n=0  \\

4601:

4602:         & = & B_1 & \myif  n=1 \\

4603:

4604:         & = & \langle {\tt UNION}(<\beta_1,\dots,\beta_n>) ,

4605:                       \min(m_1,\dots,m_n),\\

4606:         &   &       \ \max(M_1,\dots,M_n),

4607:                       t_1\sqcup\dots \sqcup t_n\rangle & \myif n\geq 2.

4608:   \end{array}$

4609: \end{quote}

4610:

4611: %The composition of {\tt MERGE} and {\tt SPLIT1} actually

4612: %{\em is} the normalizing operation.\\

4613:

4614: The notion of simple abstract sequence with cut information

4615:  is also useful to simplify the case analysis in the

4616: implementation of {\tt CONC}.

4617:

4618:

4619: \begin{definition}[Simple Abstract Sequences with Cut Information]

4620:

4621: \noindent

4622: Let $B\in{ASS}_D$ and $ {\it acf}\in{\it ACF}$. The abstract sequence

4623: with cut information $\langle B,{\it acf} \rangle$ is said to be {\em simple}

4624: if  $B$ is simple and ${\it acf}\in{\it CF}$.

4625: \end{definition}

4626:

4627:

4628: The operation {\tt SPLIT2}

4629: %Hence,

4630: %we need an operation  to

4631: converts an arbitrary

4632: abstract sequence with cut information into an equivalent set of

4633:  simple abstract sequences with cut information.

4634: %We call such operation

4635:  %{\tt SPLIT2}.

4636:

4637: \vskip0.3cm

4638: \noindent

4639: {\em Operation} $ {\tt SPLIT2}: {\it ASSC}_D \rightarrow \wp({\it ASSC}_D)$\\

4640: The operation {\tt SPLIT2}  satisfies the following

4641: properties.

4642: For every $C\in{\it ASSC}_D$,

4643: \begin{enumerate}

4644: \itemsep 2pt

4645: \item $\bigcup_{C'\in{\mbox{\footnotesize \tt SPLIT2}} (C)}

4646: %

4647:                {\it Cc}(C') \,=\,{\it Cc}(C)$;

4648: \item all abstract sequences with cut information in ${\tt SPLIT2}(C)$

4649:   are simple.

4650: \end{enumerate}

4651:

4652: \noindent

4653: Its definition is simple.

4654:  We first

4655: apply the operation {\tt SPLIT1} to the abstract sequence part of $C$.

4656: Then we split the cut information.

4657: Finally we split the termination information. Formally,

4658: ${\tt SPLIT2}(C)$ is defined as follows.

4659:

4660: \begin{enumerate}

4661: \itemsep 0pt

4662: \item Let $C=\langle B, {\it acf}\rangle\in{\it ASSC}_D$. We define

4663:   \begin{quote}

4664: %\hskip-0.7cm

4665:     $\begin{array}{lll}

4666:         {\tt SPLIT2}(C) & = & \bigcup_{B'\in

4667:               \mbox{\footnotesize \tt SPLIT1}(B)}{\tt

4668:         SPLIT2}(\langle B',{\it acf}\rangle).

4669:     \end{array}$

4670:          \end{quote}

4671: \item Let $B=\langle \beta,m,M,t\rangle\in{\it ASS}_D$. Assume that $B$ is semi-simple. We define

4672:   \begin{quote}

4673: %\hskip-0.7cm

4674:     $\begin{array}{llll}

4675:           {\tt SPLIT2}(\langle B,{\it weakcut}\rangle) & = &

4676:                 {\tt SPLIT2}(\langle B,{\it nocut}\rangle)\cup

4677:                 {\tt SPLIT2}(\langle B,{\it cut}\rangle) & \myif m=0\\

4678:                                               & = &

4679:              {\tt SPLIT2}(\langle B,{\it cut}\rangle) & \myif m\geq 1.

4680:   \end{array}$

4681:  \end{quote}

4682:

4683: \noindent (Remember that, by Definition \ref{def:SNAS}, we

4684: also have $\beta=\beta_\emptyset$ and $M=0$, in the first case, and

4685: $\beta\neq\beta_\emptyset$ and $m\leq M$, in the second case.)

4686:

4687: \item Let $B=\langle \beta,m,M,t\rangle\in{\it ASS}_D$ and ${\it cf}\in{\it CF}$.

4688:       Assume that $B$ is semi-simple. We define

4689:       \begin{quote}

4690: %\hskip-0.7cm

4691:            $\begin{array}{llll}

4692:                 {\tt SPLIT2}(\langle B,{\it cf}\rangle) & = &

4693:                                           \{\langle B,{\it cf}\rangle\}

4694:                        & \myif t\in\{{\it snt},{\it st}\};\\

4695:                                         & = &

4696:             \{\langle \langle \beta,m,M,{\it snt}\rangle,{\it cf}\rangle,

4697:                \langle \langle \beta,m,M,{\it st}\rangle,{\it cf}\rangle\}

4698:                        & \myif t={\it pt}.

4699:              \end{array}$

4700:        \end{quote}

4701: \end{enumerate}

4702:

4703:

4704:

4705: Before presenting the implementation of {\tt CONC},

4706:  we still need to specify the operation {\tt EXCLUSIVE}, which

4707: is aimed at detecting incompatible outputs.

4708: An implementation of this operation for the domain {\tt Pattern} is

4709: given in Section \ref{sub:IP}.

4710:

4711: \vskip0.3cm

4712: \noindent

4713: {\em Operation} ${\;\tt EXCLUSIVE}:({\it AS}_D\times {\it AS}_D\times {\it AS}_D)

4714: \rightarrow {\it Bool}$\\

4715: The operation {\tt EXCLUSIVE}  satisfies the following

4716: property.

4717: For all $\beta,\beta_1,\beta_2\in{\it AS}_D$,

4718: \begin{center}

4719: $\begin{array}{lll}

4720: {\tt EXCLUSIVE}(\beta,\beta_1,\beta_2) & \Rightarrow &

4721: \neg(\exists \theta\in{\it Cc}(\beta)

4722:              ,\theta_1\in{\it Cc}(\beta_1)

4723:              ,\theta_2\in{\it Cc}(\beta_2),

4724:               \sigma_1,\sigma_2\in{\it SS}:\\\

4725: & &  \theta\sigma_1=\theta_1

4726:                                       \and                                              \theta\sigma_2=\theta_2).

4727:   \end{array}$

4728: \end{center}

4729:

4730:

4731:

4732:  We are now ready to describe the operation {\tt CONC}.

4733:

4734: \vskip0.3cm

4735: \noindent

4736: {\em Operation} $\;{\tt CONC}: ({\it AS}_D \times {\it ASSC}_D

4737:                          \times {\it ASS}^{\it enh}_D)

4738:                  \rightarrow {\it ASS}^{\it enh}_D$.\\

4739: \noindent

4740: Let $\beta\in {\it AS}_D$, $C_1\in {\it ASSC}_D  $ and

4741: ${\it SB}_2\in {\it ASS}^{\it enh}_D$.

4742: ${\it SB}'={\tt CONC}(\beta,C_1,{\it SB}_2)$ is defined as follows.

4743: We assume that $B_i=\langle \beta_i,m_i,M_i,t_i\rangle$.

4744: %The implementation of the operation {\tt CONC} is as follows.

4745: %When the second argument is simple and the third one is a singleton,

4746: %we perform a careful case analysis; in the general case, we split both

4747: %arguments, we compute the individual results, and we put all

4748: %the individual results

4749: %together in the final resulting set.

4750: %Formally, ${\it SB}'={\tt CONC}(\beta,C_1,{\it SB}_2)$

4751: %is defined as below where $B_i=\langle \beta_i,m_i,M_i,t_i\rangle$.

4752:

4753: \begin{enumerate}

4754: \itemsep 0pt

4755: \item

4756: Let us assume first that  $C_1=\langle B_1,{\it acf}_1\rangle$ is simple and ${\it SB}_2=\{B_2\}

4757:   $.

4758:   \begin{enumerate}

4759: \itemsep 0pt

4760:   \item Suppose that ${\it acf}_1={\it cut}$ or $t_1={\it snt}$. In

4761:     this case, we define

4762:     \begin{quote}

4763: %\hskip-0.8cm

4764:       $\begin{array}{lll}

4765:         {\it SB}'&= & \{B_1\}.

4766:       \end{array}$

4767:     \end{quote}

4768:   \item Suppose, on the contrary, that ${\it acf}_1={\it nocut}$

4769:                 and $t_1={\it st}$. We define

4770:     \begin{quote}

4771: \hskip-1.0cm

4772:      $\begin{array}{llll}

4773:         {\it SB}'&= & \{B_2\} & \myif M_1 = 0\\

4774:

4775:                  &= & \{\langle \beta_1,m_1,M_1,t_2\rangle\}

4776:                     & \myif M_1 \geq 1 \and M_2 =0\\

4777:

4778:                  &= & \{\langle {\tt UNION}(\beta_1,\beta_2),

4779:                                  m_1+m_2,M_1+M_2,t_2\rangle\}

4780:                     & \myif M_1 \geq 1 \and M_2 \geq 1\\

4781:                  &&&\and \neg{\tt EXCLUSIVE}(\beta,\beta_1,\beta_2)\\

4782:

4783:                  &= & \emptyset

4784:                     & \myif M_1 \geq 1 \and M_2 \geq 1\\

4785:                    &&&\and {\tt EXCLUSIVE}(\beta,\beta_1,\beta_2).

4786:      \end{array}$

4787:     \end{quote}

4788:   \end{enumerate}

4789: \item In the general case, we define

4790:   \begin{quote}

4791:     $

4792:     \begin{array}{lll}

4793:      {\it SB}'&= &

4794:      \displaystyle\bigcup_{\renewcommand{\arraystretch}{0.1}

4795:                                          \begin{array}{c}

4796:                                         _{C\in\mbox

4797:                                        {\footnotesize \tt SPLIT2}(C_1)}\\

4798:                                         _{B\in{\it SB}_2}

4799:                                         \end{array}

4800:                            \renewcommand{\arraystretch}{1}}

4801:                    {\tt CONC}(\beta,C,\{B\}).

4802:     \end{array}$

4803:   \end{quote}

4804: \end{enumerate}

4805:

4806:

4807: \subsection{Instantiation to {\tt Pattern}}

4808: \label{sub:IP}

4809:

4810: The domain of abstract substitutions {\tt Pattern} has been introduced

4811: in \cite{MusumbuThesis} and it has been used in many of our

4812: previous works, e.g., \cite{SPE,ACTA95}. The reader is referred

4813: to \cite{TOPLAS} for a detailed description of the domain and of its

4814:   abstract operations.

4815:

4816: \vskip0.3cm

4817: \noindent

4818: {\bf The Abstract Domain  Pattern.}

4819: %\label{subsub:ADP}

4820: \noindent

4821: The version of {\tt Pattern} used in the experimental

4822: evaluation of Section \ref{sub:EE}

4823: %is more expressive than the

4824: %version described in \cite{TOPLAS}. It contains an additional

4825: %so-called arithmetic component that keeps information about

4826: %executions of arithmetic built-ins such as {\tt >}.

4827: % this

4828: %information is used by the operation {\tt EXCLUSIVE}.

4829: %The domain is

4830: can be best viewed as an instantiation of the generic

4831: pattern domain $\pag$  \cite{POPL94,SCP00} with  mode, sha\-ring, and

4832: arithmetic  components.

4833:

4834: The key intuition behind $\pag$ is to represent information on some

4835: subterms occurring in a substitution instead of information on terms

4836: bound to variables only.  More precisely, $\pag$ may associate the

4837: following information with each

4838: conside\-red

4839: subterm: (1)

4840: %\begin{enumerate}

4841: %\itemsep 0pt

4842: %\item

4843: its {\em pattern}, which specifies the main functor of the

4844:   subterm (if any) and the subterms which are its arguments;

4845: %\item

4846: its {\em properties}, which are left unspecified and are

4847: given in the domain $\cal R$.

4848: %\end{enumerate}

4849:  In addition

4850: to the above information, each variable in the domain of the

4851: substitution is associated with one of the subterms. It can be

4852: expressed that two arguments have the same value (and hence

4853: that two variables are bound together) by associating both arguments

4854: with the same subterm.  It should be emphasized that the pattern

4855: information may be void.  In theory, information on all subterms could

4856: be kept but the requirement for a finite analysis makes this

4857: impossible for almost all applications. As a consequence, the domain

4858: shares some features with the depth-k abstraction \cite{Kanamori87},

4859: although $\pag$ does not impose a fixed depth but adjusts it

4860: dynamically through upper bound and widening operations. Note that the

4861: identification of subterms (and hence the link between the structural

4862: components and the $\cal R$-domain) is a somewhat arbitrary choice. In

4863: \pag{}, subterms are identified by integer indices, say $1,\ldots,n$

4864: if $n$ subterms are considered, and we denote sets of

4865: indices by the symbol $I$.

4866:

4867: More formally,

4868: the pattern and same-value component

4869: can be described as follows.

4870:   The {\em pattern} component is a partial function

4871: ${\it frm}: I \not\rightarrow {\it Pat}_I$,

4872: from the set of indices $I$ to the set

4873: of patterns over $I$, i.e., elements of the form $f(i_1,\ldots,i_n)$,

4874: where $f\in{\cal F}$ is a functor symbol of arity $n$

4875:  and $i_1,\ldots,i_n \in I$. When the

4876: pattern is undefined for an index $i$, we write {\it frm($i$) $=$ undef}.

4877: The {\em same-value} component is a total function

4878: ${\it sv}: D \rightarrow I$,

4879: where $D = \{x_1,\ldots,x_n\}$ is the domain of the abstract

4880: substitution.

4881:

4882: A pattern component ${\it frm}: I \not\rightarrow {\it Pat}_I$

4883: denotes a set of families $(t_i)_{i\in I}$

4884: of terms as defined below.

4885: %Let

4886: %${\it frm}: I \not\rightarrow {\it Pat}_I$

4887: %and ${\cal F}$ be the set of all functor symbols.

4888: %

4889: \begin{quote}

4890:   $\begin{array}{llll}

4891:  {\it Cc}({\it frm})   & = &

4892:  \{ (t_i)_{i\in I}\ |  &

4893:  {\it frm}(i) = f(i_1, \dots, i_n) \,\Rightarrow\,

4894:      t_i     = f(t_{i_1}, \dots, t_{i_n}),\\

4895: & & &

4896:  \forall i, i_1, \dots, i_n \in I,

4897:  \forall f \in {\cal F}\}.

4898:   \end{array}$

4899: \end{quote}

4900:

4901: \noindent

4902: In order to simulate

4903:  unification with

4904: occur-check,  we also assume that every pattern

4905: component {\it frm}  satisfies the following condition:

4906:  the relation $\subc\subseteq I\times I$ such that

4907: $i\subc j$ if and only if ${\it frm}(i)$ is of the form

4908: $f(\dots, j, \dots)$ must be

4909: well-founded.

4910:

4911:  A pair  $\langle {\it sv},{\it frm} \rangle$ with

4912: ${\it sv}: D \rightarrow I$ and ${\it frm}: I \not\rightarrow {\it Pat}_I$

4913: is called

4914: {\em structural abstract substitution};

4915: it denotes a set of program substitutions as follows:

4916: \begin{quote}

4917:   $\begin{array}{lll}

4918:  \Cc{\langle {\it sv},{\it frm} \rangle}   & = &

4919:  \{ \theta \in {\it PS}_D \ | \

4920: \exists

4921: (t_i)_{i\in I}\in{\it Cc}({\it frm}):\;

4922: x_j\theta = t_{{\it sv}(x_j)},\;

4923:  \forall x_j \in D \}.

4924:   \end{array}$

4925: \end{quote}

4926:

4927:

4928:

4929: The $\cal R$-domain is the generic part which specifies subterm

4930: information by descri\-bing properties of a set of tuples $<t_1, \ldots,

4931: t_n>$ where $t_1, \ldots, t_n$ are terms. As a consequence, defining

4932: the $\cal R$-domain amounts essentially to defining a traditional

4933: domain on substitutions and its operations. We now describe the

4934: various components of the $\cal R$-domain which can be built as an

4935: open product \cite{POPL94,SCP00}.

4936:

4937: The {\em mode} component is described in \cite{TOPLAS} and associates a

4938: mode from the set $ {\it Modes} =

4939: \{ {\tt var}, {\tt ground}, {\tt novar}, {\tt noground}, {\tt ngv},

4940: {\tt gv}, {\tt any}\}$ with each subterm.

4941: Formally, it is a total function ${\it mo}: I \rightarrow {\it Modes}$

4942: whose concretization is defined as

4943:

4944: \begin{quote}

4945:   $\begin{array}{lll}

4946:  {\it Cc}({\it mo}) & = &

4947:    \{ (t_i)_{i\in I}\ |\; \ t_i \in{\it  Cc}({\it mo}(i)), \;\forall  i \in I \}.

4948:   \end{array}$

4949: \end{quote}

4950:

4951:

4952:

4953: The {\em sharing} component maintains information about possible sharing

4954: between pairs of subterms and is also described in \cite{TOPLAS}.

4955: Formally, it is a symmetrical relation ${\it ps}\subseteq I\times I$

4956: whose concretization is defined as

4957:

4958: \begin{quote}

4959:   $\begin{array}{lll}

4960:  {\it Cc}({\it ps}) & = &

4961:    \{ (t_i)_{i\in I}\ |\; \

4962:    {\it var}(t_i) \cap {\it var}(t_j) \Rightarrow {\it ps}(i,j),

4963:    \;\forall  i,j \in I \}.

4964:   \end{array}$\\

4965: \end{quote}

4966:

4967:

4968:

4969: The {\em arithmetic} component is novel and aims at using arithmetic

4970: predicates to detect mutual exclusion between clauses. It  approximates

4971: information about arithmetic relationships by rational order

4972: constraints, i.e., binary constraints of the form $i \ \delta \ j$ and

4973: unary constraints of the form $i \ \delta \ c$, where $i,j$ are

4974: indices, $\delta \in \{>,\geq,=,\neq,\leq,<\}$ and $c$ is an integer

4975: constant. For instance, a built-in $X \geq Y + 2$ is approximated by a

4976: constraint $X > Y$.

4977: Formally, an element {\it arithm}

4978: is a set of rational order constraints over indices, whose

4979: concretization  is defined as follows (a constraint being

4980: satisfied only if the terms are numbers).

4981:

4982: \begin{quote}

4983:   $\begin{array}{lll}

4984:  {\it Cc}({\it arithm}) &= & \{

4985:   (t_i)_{i\in I}\ |\; \ \forall \ i \ \delta \ j  \in  arithm:

4986: \; t_i \ \delta \ t_j  \and  \forall \ i \ \delta \ c  \in

4987: arithm: \; t_i \ \delta \ c \}.

4988:   \end{array}$

4989: \end{quote}

4990:

4991:

4992:

4993:

4994:

4995: \vskip0.3cm

4996: \noindent

4997: {\bf The Operation EXCLUSIVE.}

4998: We describe here the implementation of the ope\-ration {\tt EXCLUSIVE} on our

4999: domain of abstract substitutions. This operation was not present in our

5000: previous  works.

5001: It aims at detecting situations where

5002: two output abstract sequences $B_1$ and $B_2$ are incompatible, given

5003: that they both originate from the same abstract input

5004: substitution $\beta$. Only the abstract substitution components

5005: $\beta_1$ and $\beta_2$ of $B_1$ and $B_2$ are useful to detect such

5006: situations. Thus the operation {\tt EXCLUSIVE} has three arguments

5007: $\beta$, $\beta_1$, and $\beta_2$. (See its specification in

5008:  Section~\ref{sub:GAO}.)

5009:

5010: Let us first introduce the notion of

5011: decomposition of a program substitution

5012: with respect to a structural abstract substitution.

5013: It represents the family

5014: of terms, occurring in the program substitution, that are given an

5015: index by the structural abstract substitution.

5016:

5017: \begin{definition}[Decomposition of a Program Substitution]

5018: \label{def:DPS}

5019:

5020: \noindent

5021: Let $\langle {\it sv},{\it frm}\rangle $ be a structural abstract

5022: substitution over domain

5023: $D=\{x_1,\dots,x_n\}$ and set of indices $I$.

5024: Let also $\theta\in{\it Cc}\langle {\it sv},{\it frm}\rangle$.

5025: The {\em decomposition of $\theta$

5026:           with respect to $\langle {\it sv},{\it frm}\rangle $} is the (unique)

5027: family of terms $(t_i)_{i\in I}$ such that

5028: \begin{quote}

5029:   $\begin{array}{lll}

5030:    \theta=\{x_1/t_{{\it sv}(x_1)},\dots,x_n/t_{{\it sv}(x_n)}\} & \and

5031:    &

5032:    (t_i)_{i\in I}\in{\it Cc}({\it frm}).

5033:   \end{array}$

5034: \end{quote}

5035:

5036:  Existence and unicity of the family $(t_i)_{i\in I}$ can be

5037: proven by an induction argument that uses the fact that

5038: the relation $\subc$ over $I$ is well-founded. Unicity holds conditional to the fact that $I$ does not contain any "useless" element, i.e., for eve\-ry $i \in I$, there exists a variable $x_j \in D$ and a set of indices $i_1,\dots,i_k$ such that

5039: $i_1 = {\it sv}(x_j)$,

5040: $i_1 \succ \dots \succ i_k$,

5041: and $i_k = i$. From now on we assume that this condition always holds.

5042: \end{definition}

5043:

5044:

5045: The next definition models a property of the structural abstract

5046: substitutions obtained by performing any number of abstract

5047: unification steps on another structural abstract substitution.

5048:

5049: %The following concept of instance of a structural

5050: %abstract substitution is also useful.

5051:

5052: \begin{definition}[Instance of a Structural Abstract Substitution]

5053: \label{def:ISAS}

5054: \noindent

5055: Let $\langle {\it sv},{\it frm}\rangle $ and

5056: $\langle {\it sv}',{\it frm}'\rangle $ be two structural abstract

5057: substitutions over the same domain

5058: $D=\{x_1,\dots,x_n\}$ and respective sets of indices $I$ and $I'$.

5059: Let also ${\it im}:I \rightarrow I'$ be a total function.

5060: We say that

5061: {\em $\langle {\it sv}',{\it frm}'\rangle $ is an instance of

5062:      $\langle {\it sv},{\it frm}\rangle $ with respect to {\it im}} if the

5063:  following conditions hold:

5064:

5065: \begin{enumerate}

5066: \itemsep 2pt

5067: \item ${\it sv}' = {\it im} \ \comp\  {\it sv}$;

5068: \item for all $i,i_1,\dots,i_m\in I,\\

5069:   \begin{array}{lll}

5070: {\it frm}(i)= f(i_1,\dots,i_m) & \Rightarrow &

5071: {\it frm}'({\it im}(i))= f({\it im}(i_1),\dots,{\it im}(i_m)).

5072:   \end{array}$

5073: \end{enumerate}

5074: \noindent

5075: Moreover, we say that

5076: {\em $\langle {\it sv}',{\it frm}'\rangle $ is an instance of

5077:      $\langle {\it sv},{\it frm}\rangle $} if there exists a

5078:    function ${\it im}$ such that the conditions hold.

5079: \end{definition}

5080:

5081:

5082: The next property holds.

5083:

5084:

5085:

5086: \begin{property}\label{prop:PI}

5087: Let $\langle {\it sv},{\it frm}\rangle $ and

5088:      $\langle {\it sv}',{\it frm}'\rangle $

5089: be two structural abstract substitutions, and let

5090:  ${\it im}: I \rightarrow I'$ be such that

5091: $\langle {\it sv}',{\it frm}'\rangle $ is an instance of

5092: $\langle {\it sv},{\it frm}\rangle $ with respect to {\it im}.

5093: Let also $\theta\in{\it Cc}\langle {\it sv},{\it frm}\rangle $,

5094: $\theta'\in{\it Cc}\langle {\it sv}',{\it frm}'\rangle $,  and

5095:  $\sigma \in {\it SS}$.

5096: Finally, let $(t_i)_{i\in I}$ and $(t'_{i})_{i\in I'}$ be the

5097: decompositions of $\theta$ and $\theta'$ with respect to

5098: $\langle {\it sv},{\it frm}\rangle $ and

5099:      $\langle {\it sv}',{\it frm}'\rangle $, respectively.

5100: Then we have

5101:  \begin{quote}

5102:    $\begin{array}{lll}

5103: \theta' = \theta\sigma &\Rightarrow&

5104: (t_i\sigma)_{i\in I} = (t'_{{\it im}(i)})_{i\in I}.

5105:    \end{array}$

5106:  \end{quote}

5107: \end{property}

5108: %\vskip-0.5cm

5109: \noindent

5110: The proof is a simple induction on the well-founded relation $\subc$,

5111: induced on $I$ by~${\it frm}$.

5112:

5113:

5114:

5115: The next definitions and properties are instrumental to the

5116: implementation and correctness proof of the operation {\tt EXCLUSIVE}.

5117:

5118:

5119: \begin{definition}[Exclusive Pair of Indices]

5120: \label{def:EPI}

5121: \noindent

5122: Let $ {\it frm}_1$ and $ {\it frm}_2$ be two pattern components over sets

5123: of indices $I$ and $J$, respectively. Let also $i\in I$ and  $j\in J$.

5124: \begin{enumerate}

5125: \itemsep 2pt

5126: \item We say that $\langle i,j\rangle$ is {\em directly exclusive} with respect to

5127: $\langle{\it frm}_1,{\it frm}_2\rangle$ iff ${\it frm}_1(i)=

5128: f(i_1,\ldots,i_p)$,  ${\it frm}_2(j)=g(j_1,\ldots,j_{q})$

5129: and either $f \neq g$ or $p \neq q$.

5130: \item We say that $\langle i,j\rangle$ is {\em exclusive} with respect to

5131: $\langle{\it frm}_1,{\it frm}_2\rangle$ iff $\langle i,j\rangle$ is

5132: {\em directly exclusive} with respect to $\langle{\it frm}_1,{\it frm}_2\rangle$,

5133: or ${\it frm}_1(i)=f(i_1,\ldots,i_p)$, ${\it frm}_2(j)=f(j_1,\ldots,j_p)$ and there exists

5134: $k: 1 \leq k \leq p$ such that $\langle i_k,j_k\rangle$ is exclusive with respect to

5135: $\langle{\it frm}_1,{\it frm}_2\rangle$.

5136: \end{enumerate}

5137: \end{definition}

5138:

5139: \vskip-0.3cm

5140: \begin{property}\label{prop:EPI}

5141: Let $ {\it frm}_1$ and $ {\it frm}_2$ be two pattern components over sets

5142: of indices $I$ and $J$, respectively.

5143: Let $(t_i)_{i\in I}\in {Cc}({\it frm}_1)$ and

5144:     $(t_j)_{j\in J}\in {Cc}({\it frm}_2)$. Let also  $i\in I$ and~$j\in J$.

5145:     \begin{enumerate}

5146: \itemsep 2pt

5147:     \item If the pair $\langle i,j\rangle$ is {\em directly exclusive} with respect to

5148:           $\langle{\it frm}_1,{\it frm}_2\rangle$, then the terms $t_i$

5149:           and $t_{j}$ are compound and they have distinct principal functors.

5150:     \item If the pair $\langle i,j\rangle$ is {\em exclusive} with respect to

5151:           $\langle{\it frm}_1,{\it frm}_2\rangle$, then the terms $t_i$

5152:           and $t_{j}$ are distinct ($t_i \neq t_{j}$).

5153:     \end{enumerate}

5154: \end{property}

5155:

5156:

5157:

5158: We are now in position to provide the implementation of the operation

5159: {\tt EXCLUSIVE} for the domain {\tt Pattern}.

5160: We just show here a partial implementation which only uses the pattern,

5161: same-value, and mode components but it gives the

5162: idea behind the complete implementation. For additional details, the

5163:  reader is referred to

5164: \cite{Braem.Modard94}.

5165:

5166: \vskip0.3cm

5167: \noindent

5168: {\em Operation} $\;{\tt

5169:     EXCLUSIVE}:{\tt Pattern}\times {\tt Pattern}\times{\tt Pattern}\rightarrow {\it Bool}$\\

5170: \noindent

5171: Let $\beta,\beta_1,\beta_2$ be abstract substitutions over the same

5172: domain $D$ and sets of indices $I$, $I_1$, and $I_2$, respectively.

5173: Assume that $\langle {\it sv}_1, {\it frm}_1 \rangle$ and

5174: $\langle {\it sv}_2, {\it frm}_2 \rangle$

5175: are instances of $\langle {\it sv}, {\it frm} \rangle$ with respect to

5176: ${\it im}_1$ and ${\it im}_2$, respectively.

5177: The value of

5178: {\tt EXCLUSIVE}$(\beta,\beta_1,\beta_2)$ is {\it true} if and only if there

5179: exists

5180: $i \in I $ such that

5181: \begin{enumerate}

5182: \itemsep 2pt

5183: \item ${\it mo}(i) \in \{{\tt ngv},{\tt novar}\}$ and the pair

5184: $\langle{\it im}_1(i),{\it im}_2(i)\rangle$ is directly exclusive

5185: with respect to $\langle {\it frm}_1,{\it frm}_2\rangle$, or

5186: \item ${\it mo(i)} = {\tt ground}$ and the pair

5187: $\langle{\it im}_1(i),{\it im}_2(i)\rangle$ is exclusive

5188: with respect to $\langle {\it frm}_1,{\it frm}_2\rangle$.

5189: \end{enumerate}

5190:

5191: %\noindent

5192: %The value of

5193: %{\tt EXCLUSIVE}$(\beta,\beta_1,\beta_2)$ is {\it false}, otherwise.

5194:

5195: \noindent

5196: Correctness of the implementation

5197: follows from Properties \ref{prop:PI} and \ref{prop:EPI}; see

5198: \cite{sequence}.\\

5199:

5200: %\noindent The implementation of {\tt EXCLUSIVE} presented here does

5201: %not make use of the {\it arithm} component while the implementation

5202: %used in the experimental evaluation of the next section does.

5203: %Additional details are available in \cite{Braem.Modard94}.

5204: %\\

5205:

5206: \noindent

5207: {\bf Prolog's Built-in Predicates.}

5208: Prolog's built-in predicates such as test predicates

5209: ({\tt var}, {\tt ground}, and the like) or arithmetic predicates

5210: ({\tt is}, {\tt <}, \dots) can be handled in essentially the same way

5211: as abstract unification. Our implementation actually includes abstract

5212: operations that deal with test and arithmetic predicates

5213:  \cite{Braem.Modard94}. Other built-in predicates can be

5214: accommodated as well, including the predicates {\tt assert}

5215: and {\tt retract}. However, the treatment of the latter predicates

5216: assumes that dynamic predicates are disjoint from static predicates,

5217: i.e., it assumes that the underlying program $P$ is not modified.

5218: A more satisfactory treatment of dynamic predicates requires to

5219: introduce a new abstract object representing the dynamic program;

5220: this improvement is a topic for further work.

5221:

5222: \subsection{Experimental Evaluation}\label{sub:EE}

5223: The experimental results presented in this section provide evidence of

5224: the fact that the approach presented in this paper allows one

5225: to integrate predicate level analysis to existing variable

5226: level analysis at a reasonable implementation cost.

5227: Compari\-sons with other cardinality and determinacy analyses

5228: can be found in Section \ref{sec:RDASBAI}.\\

5229:

5230:

5231: \begin{table}

5232:  \caption{Efficiency of the Cardinality Analysis}

5233: \label{efficiency}

5234: %\footnotesize

5235: \centering

5236: \begin{tabular}{||l||r|r||r|r|r|r||r|r|r|r||}

5237: %\hline \hline

5238: \multicolumn{11}{c}{}  \\

5239: \cline{1-11}

5240:  \multicolumn{1}{||c||}{} & \multicolumn{2}{c||}{} & \multicolumn{4}{c||}{} &

5241: \multicolumn{4}{c||}{}  \\

5242:  \multicolumn{1}{||c||}{} &\multicolumn{2}{c||}{OR} & \multicolumn{4}{c||}{PC} &

5243: \multicolumn{4}{c||}{PCA} \\

5244: \cline{1-11}

5245: & & & & & & & & & & \\

5246: Programs &  I & T & I & T & IR & TR & I & T & IR & TR \\

5247: %\hline

5248: \cline{1-11}

5249: & & & & & & & & & & \\

5250: Qsort &

5251: 13 & 0.08 & 17 & 0.12 & 1.31 & 1.50 & 13 & 0.08 & 1.00 & 1.00\\

5252: Qsort2 &

5253:  15 & 0.08 & 19 & 0.12 & 1.27 &  1.50 & 15 & 0.09 & 1.00 & 1.13 \\

5254: Queens &

5255: 15 & 0.07 & 18 & 0.08 & 1.20 &  1.14 & 18 & 0.10 &  1.20 & 1.43 \\

5256: Press1 &

5257: 532 & 11.77 & 581 & 13.11 & 1.09 &  1.11 & 581 & 13.45 &  1.09 & 1.14

5258: \\

5259: Press2 &

5260: 197 & 3.27 & 200 & 3.56 & 1.02 &  1.09 & 200 & 3.56 &  1.02 & 1.09 \\

5261: Gabriel &

5262:  78

5263:  & 0.90 & 84 & 1.00 & 1.08 &  1.11 & 84 & 0.98 &  1.08 & 1.09 \\

5264: Peep &

5265: 132 & 3.21 & 131 & 18.85 & 0.99 &  5.87 & 131 & 19.08 &  0.99 & 5.94 \\

5266: Read &

5267:  432 & 23.91 & 458 & 25.32 & 1.06 &  1.06 & 458 & 25.37 &  1.06 & 1.06

5268: \\

5269: Kalah &

5270: 115 & 1.90 & 121 & 2.09 & 1.05 &  1.10 & 120 & 2.11 &  1.04 & 1.11 \\

5271: Cs &

5272: 79 & 2.19 & 91 & 3.05 & 1.15 &  1.39 & 90 & 3.02 &  1.14 & 1.38 \\

5273: Plan &

5274: 36 & 0.21 & 38 & 0.30 & 1.06 &  1.43 & 38 & 0.27 &  1.06 & 1.29 \\

5275: Disj &

5276:  64 & 1.95 & 68 & 2.14 & 1.06 &  1.10 & 68 & 2.12 &  1.06 & 1.09 \\

5277: Pg &

5278: 38  & 0.32 & 40 & 0.36 & 1.05 &  1.13 & 39 & 0.35 &  1.03 & 1.09 \\

5279: Boyer &

5280: 56  & 0.76 & 56 & 1.15 & 1.00 &  1.51 & 56 & 1.17 &  1.00 & 1.54 \\

5281: Credit &

5282: 63 & 0.57 & 64 & 0.81 & 1.02 &  1.42 & 64 & 0.80 &  1.02 & 1.40\\

5283:     &     & &    &      &      &        &    &      &      &

5284: %\hline

5285: \\

5286: \cline{1-11}

5287: & & & & & & & & & &\\

5288: Mean & &   &   &   &  1.09  & 1.56  &   &   & 1.05  & 1.52  \\

5289: %

5290: \cline{1-11}

5291: \end{tabular}

5292: %\caption{Efficiency of the Cardinality Analysis}

5293: %\label{efficiency}

5294: \end{table}

5295:

5296: \noindent

5297: {\bf Benchmarks.} Our experiments use our traditional benchmarks

5298: except that cuts have been reinserted as in the original versions.

5299:  In addition, some

5300: new programs have been added.  {\tt Boyer} is a theorem-prover from the

5301: DEC-10 benchmarks, {\tt Credit} is an expert system from \cite{Sterling86}.

5302:   There are two versions of {\tt Qsort}

5303: which differ in procedure {\tt Partition} which uses or does not use

5304: auxiliary predicates for the arithmetic built-ins.

5305: All the benchmarks are available by anonymous ftp from

5306: ftp://ftp.info.fundp.ac.be/pub/users/ble/bench.p.

5307: They have been run on a SUN SS-10/20.

5308:

5309:

5310: \vskip0.3cm

5311: \noindent

5312:

5313: \noindent

5314: {\bf Efficiency.} The efficiency results are reported in Table

5315: \ref{efficiency}. Several algorithms are compared: {\tt OR} is the

5316: original

5317: {\tt GAIA} algorithm on {\tt Pattern} \cite{TOPLAS}, {\tt PC} is the

5318: cardinality analysis with {\tt Pattern} and {\tt PCA} is {\tt PC} with

5319: the abstraction for arithmetic predicates.

5320: I, T, IR and TR are the number of ite\-ra\-tions, the execution time

5321: (in seconds),

5322: the iteration's ratio and the time's ratio respectively.

5323: The first interesting

5324: point to notice is the slight increase (about 5\% on {\tt PCA}) in

5325: iterations when moving from abstract substitutions to abstract

5326: sequences, sho\-wing the effectiveness of our widening operator. Even

5327: more important perhaps is the fact that the time overhead of the

5328: cardinality analysis is small with respect to the traditional analysis:

5329: {\tt PCA}

5330: is 1.52 slower than {\tt OR}.  Note that in fact most programs enjoys

5331: an even smaller overhead but {\tt Peep} is about 6 times slower than

5332: {\tt

5333: OR} in {\tt PCA}. This comes from many procedures with many clauses,

5334: most of which being not surely cut; much time is spent in the

5335: concatenation operation. Finally, note that adding

5336: more functionality in the domain did not slow down the analysis by

5337: much.

5338:

5339:

5340:

5341:

5342: \begin{table}

5343: \caption{Accuracy of the Cardinality Analysis}

5344: \label{accuracy}

5345: %\footnotesize

5346: \centering

5347: \begin{tabular}{||l|l|r||r|r||r|r||r|r||r|r||}

5348: %\hline \hline

5349:  \multicolumn{11}{c}{}  \\

5350: \cline{1-11}

5351: \multicolumn{1}{||c|}{}  &\multicolumn{1}{c|}{}   & \multicolumn{1}{c||}{} & \multicolumn{2}{c||}{} & \multicolumn{2}{c||}{} &

5352: \multicolumn{2}{c||}{}

5353: & \multicolumn{2}{c||}{} \\

5354: \multicolumn{1}{||c|}{}  &\multicolumn{1}{c|}{}   & \multicolumn{1}{c||}{} & \multicolumn{2}{c||}{P} & \multicolumn{2}{c||}{C} &

5355: \multicolumn{2}{c||}{PC}

5356: & \multicolumn{2}{c||}{PCA} \\

5357: \cline{1-11}

5358: & & & & & & & & & & \\

5359: Programs &  Query & NP & D & \%D & D & \%D & D & \%D & D & \%D \\

5360: %\hline

5361:  \cline{1-11}

5362: & & & & & & & & & & \\

5363: Qsort & {\tt qsort(g,v)} & 3 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 100 \\

5364: Qsort2 & {\tt qsort(g,v)} & 5 & 2 & 40 & 2 & 40 & 2 & 40 & 5 & 100 \\

5365: Queens &{\tt queens(g,v)} &5 & 2 & 40 & 0 & 0 & 2 & 40 & 2 & 40 \\

5366: Press1 & {\tt test\_press(v,v)}&

5367: 47 & 8 & 17 & 19 & 40 & 19 & 40 & 19 & 40 \\

5368: Press2 & {\tt test\_press(v,v)}&

5369: 47 & 12 & 26 & 19 & 40 & 28 & 60 & 28 & 60 \\

5370: Gabriel &  {\tt main(v,v)}&

5371: 17 & 0 & 0 & 4 & 24 & 4 & 24 & 4 & 24 \\

5372: Peep &  {\tt comppeeppopt(g,v,g)}&

5373: 24 & 4 & 17 & 7 & 29 & 16 & 67 & 16 & 67 \\

5374: Read & {\tt read(v,v)} &

5375: 46 & 11 & 24 & 27 & 59 & 31 & 67 & 31 & 67 \\

5376: Kalah & {\tt play(v,v)}& 46 & 16 & 35 & 20 & 43 & 33 & 72 & 40 & 87 \\

5377: Cs & {\tt pgenconfig(v)} & 32 & 11 & 34 & 7 & 22 & 11 & 34 & 13 & 41 \\

5378: Plan & {\tt transform(g,g,v)} &

5379: 13 & 1 & 8 & 0 & 0 & 1 & 8 & 1 & 8 \\

5380: Disj & {\tt top(v)} & 28 & 13 & 46 & 11 & 39 & 13 & 46 & 13 & 46 \\

5381: Pg & {\tt pdsbm(g,v)} & 10 & 2 & 20 & 3 & 30 & 5 & 50 & 6 & 50 \\

5382: Boyer & {\tt boyer(g)} & 24 & 0 & 0 & 20 & 83 & 20 & 83 & 20 & 83 \\

5383: Credit & {\tt credit(a,a)} &

5384: 26 & 14 & 58 & 11 & 42 & 14 & 54 & 16 & 62 \\

5385:        &   &   &      &    &    &  &  &  &       &         \\

5386: %\hline

5387: \cline{1-11}

5388:  & & & & & & & & & &\\

5389: Mean &    &   & & 24 &   &  33 &  & 46 &   & 58 \\

5390: %\hline

5391: \cline{1-11}

5392: \end{tabular}

5393: %\caption{Accuracy of the Cardinality Analysis}

5394: %\label{accuracy}

5395: \end{table}

5396:

5397:

5398: \vskip0.3cm

5399: \noindent

5400:

5401: \noindent

5402: {\bf Accuracy.}

5403: The accuracy results are reported in Table \ref{accuracy}.

5404: For each program we speci\-fy the initial query to which the abstract

5405: interpretation algorithm is applied (we denote  by {\tt a}, {\tt g}

5406: and {\tt v} the modes {\tt any}, {\tt ground} and {\tt var}, respectively).

5407: Several

5408: versions of the algorithm are compared with respect to their ability to

5409: detect

5410: determinacy of procedures, which was our primary motivation. {\tt P}

5411: is using only the domain {\tt Pattern} (i.e., cuts are ignored), {\tt

5412: C} is only using the cut (i.e.,  {\tt EXCLUSIVE} always returns false),

5413: and {\tt PC, PCA} are defined as previously.

5414: In the table, NP stands for the number of procedures and

5415:  D and \%D denote

5416: the number of  procedures and the percentage of

5417: procedures, respectively,  that are detected to be

5418: deterministic

5419:  by the algorithms.

5420:  There are several

5421: interesting points to notice.  First, {\tt PCA} detects that 58\% of

5422: the procedures are deterministic, although many of these programs in

5423: fact use heavily the nondeterminism of Prolog. Most of the results are

5424: optimal and a nice example is the program {\tt Kalah}. Second, the

5425: cut and input/output patterns are really complementary to improve the

5426: analysis. Input/output patterns alone give 41\% of the deterministic

5427: procedures (i.e., those detected by {\tt PCA}),

5428: while the cut detects 57\%  of the deterministic

5429: procedures. The abstraction of arithmetic predicates

5430: adds 21\% of deterministic procedures\footnote{Notice that

5431: 24/58=0.41, 33/58=0.57 and (58-46)/58=0.21.

5432:  The inequality  41+57+21$\not=$100  can be understood by the fact

5433:  that the analysis computed by P, C and A (the latter being the algorithm that

5434: only considers the

5435:  arithmetic predicates)

5436:  are not completely exclusive.}.

5437: The main lesson here is that all

5438:  components are of primary importance to obtain precise results.

5439:

5440:

5441:

5442:

5443: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

5444: \section{Retaled works on determinacy analysis}

5445: \label{sec:RDASBAI}

5446: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

5447:

5448: Determinacy of logic programs in general and of Prolog programs in

5449: particular

5450: is an important research topic because determinate programs can be

5451: implemented more efficiently than non-determinate programs

5452: (often, much more efficiently).

5453: Several forms of determinacy have been identified, which lead to

5454: different kinds of optimizations.

5455:  In this section, we review a few

5456: interesting papers on determinacy analysis at the light of our novel

5457: framework for the abstract interpretation of Prolog.

5458: The benefit of this study is twofold:

5459: first, it sheds new light on these analyses in the

5460: context of abstract interpretation;

5461: second, it supports the claim that our proposal is appropriate to

5462: integrate most existing analyses into a single framework.

5463:

5464:

5465:

5466: \subsection{Sahlin's Determinacy Analysis for Full Prolog}\label{sub:SDAFP}

5467:

5468:

5469: The analysis proposed by D. Sahlin  \shortcite{Sahlin91} aims at

5470: detecting  procedures of a (full) Prolog program that are determinate

5471: (i.e., they succeed at most once) or fully-determinate

5472: (i.e., they succeed exactly once).

5473: The analysis is developed in the context of the partial evaluator

5474: Mixtus \cite{Sahlin.PhD} in order to detect situations where cuts

5475: can be ``executed'' or removed.

5476:  Sahlin's analysis  is not based on abstract interpretation;

5477:  hence he provides a specific correctness proof for it. \\

5478: In this section, we show that the determinacy analysis

5479: proposed by Sahlin  \shortcite{Sahlin91}  is indeed an instance of our

5480: framework over his abstract domain.

5481:

5482:

5483: \vskip0.3cm

5484: \noindent

5485: {\bf Abstract Domains.}

5486: \noindent

5487:  Sahlin's analysis completely ignores information

5488: on program variables. The

5489:  abstract domains are

5490: concerned with the sequence structure  only: substitutions are completely

5491: ignored. Note that no abstract interpretation framework

5492: available at the time of his writing

5493: was adequate to his needs.

5494:

5495: \vskip0.3cm

5496: \noindent

5497: {\bf Abstract Substitutions.}

5498: %According to Sahlin's analysis, the transformations performed by

5499: %the partial evaluator ensure that the program is sufficiently

5500: %``instantiated''  to obtain an accurate analysis. This is modeled

5501: %in our framework by choosing sets ${\it AS}_D$ consisting of a single element.

5502: Since program variables are ignored, we can assume a domain

5503: ${\it AS}$ consisting of an arbitrary single element.

5504: %also drop the subscript

5505: %$\scriptstyle D$ in the following.

5506:

5507: \vskip0.3cm

5508: \noindent

5509: {\bf Abstract Sequences.}

5510: Sahlin's analysis can be formalized in our framework by defi\-ning

5511:  ${\it ASS}=\wp({\it AASS})$, where

5512: ${\it AASS}=\{{\cal L}, 0, 1, 1', 2, 2'\}$\footnote{We

5513: choose to denote the elements of {\it AASS} by the same

5514:   symbols as in \cite{Sahlin91}.}.

5515: We call elements of {\it AASS}, {\em atomic abstract sequences}.

5516: Their concretization is defined as follows:

5517: %function ${\it Cc}:{\it AASS}\rightarrow \wp({\it

5518: %  PSS})$ is defined by:

5519: \vskip0.2cm

5520: \begin{quote}

5521:   $\begin{array}{lllllll}

5522:      {\it Cc}({\cal L}) & = & \{ <\bot> \} \\

5523:      {\it Cc}(0)       & = & \{  \ <>  \   \} \\

5524:    {\it Cc}(1)       & = & \{  S\in {\it PSS} \mid {\it Ns}(S)=1

5525:                                \and S\mbox{\rm ~is~finite}\} \\

5526:      {\it Cc}(1')      & = & \{  S\in {\it PSS} \mid {\it Ns}(S)=1

5527:                                \and S\mbox{\rm ~is~incomplete}\} \\

5528:       {\it Cc}(2)      & = &

5529:           \{  S\in {\it PSS} \mid {\it Ns}(S)>1

5530:                                \and S\mbox{\rm ~is~finite}\} \\

5531:      {\it Cc}(2')      & = &

5532:           \{  S\in {\it PSS} \mid {\it Ns}(S)>1

5533:                                \and S\mbox{\rm ~is~incomplete~or~infinite}\}

5534:   \end{array}$

5535: \end{quote}

5536: \vskip0.2cm

5537: \noindent

5538: %An abstract sequence $B$ represents a set of program substitution

5539: %sequences as below.

5540: The concretization function ${\it Cc}:{\it ASS}\rightarrow \wp({\it

5541:   PSS})$ is defined by:

5542: %\vskip0.2cm

5543: \begin{quote}

5544:   $\begin{array}{lll}

5545:      {\it Cc}(B) &  = &

5546:

5547:       \bigcup_{b\in B} {\it Cc}(b).

5548:

5549:   \end{array}$

5550: \end{quote}

5551: %\vskip0.2cm

5552: \noindent The relation $\leq$ on {\it ASS} is naturally defined as

5553: being set inclusion. The

5554: concretization function is thus clearly monotonic.

5555:

5556: \vskip0.3cm

5557: \noindent

5558: {\bf Abstract Sequences with Cut Information.}

5559: We define the set ${\it ASSC}$ as being

5560:  equal to $\wp({\it AASS}\times {\it CF})$.

5561:  The elements of ${\it ASSC}$ are denoted by

5562: ${\cal L}_{n}$, $0_{n}$, $1_{n}$, $1'_{n}$, $2_{n}$, $2'_{n}$,

5563: ${\cal L}_{c}$, $0_{c}$, $1_{c}$, $1'_{c}$, $2_{c}$, $2'_{c}$,

5564: in \cite{Sahlin91}, where the index $n$ stands for {\it nocut}, while the index

5565: $c$ stands for {\it cut}.

5566: The concretization function is defined in the obvious way.

5567:

5568: \vskip0.3cm

5569: \noindent

5570: {\bf Extended Widening.}

5571: %\label{subsub:EWS}

5572: \noindent

5573: In order to instantiate our generic abstract interpretation algorithm

5574: to the above domains, it remains to provide an implementation of the

5575: various abstract operations.

5576: This can be done systematically from the specifications of the

5577: operations

5578: and the domain

5579: definitions; we leave it as an exercise to the reader, except for

5580: the extended widening, whose implementation is not obvious.

5581: The basic intuition behind the extended widening is that it should

5582: ``observe'' how the abstract sequences evolve between the consecutive iterations

5583:  in order to ensure convergence when enough accuracy seems to be

5584: attained.

5585: In this abstract domain, the abstract sequence $B_i$ produced at

5586: step $i$ may intuitively differ from $B_{i-1}$ by the fact that some

5587: ``incomplete'' elements (i.e., ${\cal L}$,  $1'$,  $2'$) can be removed

5588: and replaced by more ``complete'' ones.

5589: Of course the computation starts with $B_0=\{{\cal L}\}$.

5590: Thus the algorithm waits until ``enough incomplete elements have been removed''

5591: and then  accumulates the next iteration results to enforce termination.

5592: This can be formalized by defining a pre-order

5593: $\cleq$ over {\it ASS}

5594: such that $B_1\cleq B_2$ holds when $B_2$ only contains elements that

5595: are ``more complete'' than some elements of $B_1$ and when, conversely,

5596: $B_1$ only contains elements that

5597: are ``less complete'' than some elements of $B_2$.

5598: We first define the relation {\em is strictly less complete than} between

5599: atomic abstract sequences by the table:

5600:

5601: \begin{quote}

5602:   $\begin{array}{llllllllll}

5603:

5604:    {\cal L} \clt 0 &\; {\cal L} \clt 1 & \;{\cal L} \clt 1' & \;

5605: {\cal L}   \clt 2 &\;

5606:    {\cal L} \clt 2'&\;

5607:    1' \clt 1      & \; 1'   \clt 2        & \; 1'  \clt  2' &\;

5608:    2' \clt 2.

5609:   \end{array}$

5610: \end{quote}

5611:

5612: \noindent

5613: Then, for all atomic abstract sequences $b_1$ and $b_2$, we say that

5614: {\em $b_1$ is less complete than $b_2$}, denoted by

5615: $b_1\cleq b_2$, if $b_1 = b_2$ or $b_1\clt b_2$.

5616: This relation is lifted to general abstract sequences as follows:

5617:

5618: \begin{definition}[Computational Pre-Ordering]

5619: \label{def:CO}

5620: \noindent Let $B_1,B_2\in {\it ASS}$. By definition,

5621: \begin{quote}

5622:   $\begin{array}{lll}

5623:    B_1\cleq B_2 & \myiff & (\forall  b_1\in B_1,\,

5624:                          \exists b_2\in B_2 \mbox{ such that }

5625:  b_1 \cleq b_2) \and\\

5626:                       & &   (\forall  b_2\in B_2,\,

5627:                          \exists b_1\in B_1

5628: \mbox{ such that } b_1 \cleq b_2).

5629: \end{array}$

5630: \end{quote}

5631:

5632: %\noindent

5633:  We  write $B_1\clt B_2$ to denote the condition

5634: ($B_1 \cleq B_2 \and B_2 \not\cleq B_1$).

5635: \end{definition}

5636: \vskip-0.3cm

5637:

5638: We are now in position to define the extended widening.

5639:

5640: \begin{definition}

5641: [Extended Widening for Sahlin's Domain: $B'=B_{\it new}\nabla B_{\it old}$]

5642: \label{imp:EWSD}

5643: \begin{quote}

5644:   $\begin{array}{llll}

5645:    B' & = & B_{\it new}         & \myif   B_{\it old} \clt  B_{\it new},\\

5646:        & = & B_{\it new}\cup B_{\it old} & \otherwise.

5647:   \end{array}$

5648: \end{quote}

5649: \end{definition}

5650:

5651: %\noindent

5652: In fact, the above operation does not

5653: fulfill, strictly speaking, the requirements for being an extended

5654: widening. It works however if we have $B_{\it old}\cleq B_{\it new}$ each time it is

5655: applied. This is normally the case if the other abstract

5656: operations are carefully implemented, since each iteration of the

5657: abstract interpretation algorithm should

5658:  replace every element in

5659: $B_{\it old}$ by one or several more complete elements.

5660: Before stating

5661: %precisely

5662: what  is it actually achieved by the

5663: operation $\nabla$, we need two definitions.

5664:

5665: \begin{definition}[Equivalent Abstract Sequences]

5666: \label{def:EAS}

5667: Let $B_1, B_2\in {\it ASS}$. By definition,

5668: \begin{quote}

5669:   $\begin{array}{lll}

5670:    B_1 \eqass B_2 & \myiff & B_1 \cleq B_2 \and     B_2 \cleq B_1  .

5671:   \end{array}$

5672: \end{quote}

5673: \end{definition}

5674:

5675: %\noindent

5676:  The relation $\eqass$ is an equivalence because $\cleq$ is a

5677: pre-order. It can be shown that $\eqass$ determines $42$ equivalence

5678: classes, of which 28 are a singleton (e.g., $\{\{{\cal L}, 0, 1'\}\}$),

5679: 10 have 2 elements (e.g., $\{\{{\cal L}, 0, 2'\},\{{\cal L}, 0, 1',

5680: 2'\}\}$),

5681: and 4 have 4 elements (e.g., $\{\{{\cal L}, 0, 2\},

5682:                                 \{{\cal L}, 0, 2, 2'\},

5683:                                 \{{\cal L}, 0, 1', 2\},

5684:                                \{{\cal L}, 0, 1', 2, 2'\}\}$).

5685: It is also important to note that distinct equivalent abstract

5686: sequences always have different concretizations.

5687:

5688:

5689: \begin{definition}[Strengthened Computational Ordering]

5690: \label{prop:SCO}

5691: \noindent

5692: Let $B_1, B_2\in {\it ASS}$. By definition,

5693: \begin{quote}

5694:   $\begin{array}{lll}

5695:      B_1 \scleq B_2 & \myiff & B_1\clt B_2 \myor

5696:                           (B_1 \eqass B_2 \and B_1 \subseteq B_2).

5697:    \end{array}$

5698: \end{quote}

5699: \end{definition}

5700:

5701: %\noindent

5702: The relation $\scleq$ is an order;  every ascending

5703: sequence

5704: $B_1 \scleq B_2 \scleq \dots \scleq B_i \dots\,$ is stationary

5705:  since {\it ASS} is finite.

5706:

5707: \begin{property}[Conditional Convergence of the

5708:   Extended Widening]

5709: \label{prop:CCEW}

5710: Let $\{B_i\}_{i\in{\bf N}}$ and $\{B'_i\}_{i\in{\bf N}}$ be two

5711: sequences of elements of {\it ASS} such that

5712: \begin{enumerate}

5713: \itemsep 3pt

5714: \item $B'_i \cleq B_{i+1},$ for all $i\in{\bf N}$;

5715: \item $B'_{i+1}=B_{i+1}\nabla B'_i,$ for all $i\in{\bf N}$.

5716: \end{enumerate}

5717:

5718: \noindent Then we have $B_i\leq B'_i$, for all $i\in{\bf N}^\ast$,

5719: and the sequence

5720: $\{B'_i\}_{i\in{\bf N}}$ is stationary.

5721: \end{property}

5722:

5723: \begin{proof}

5724: The fact that $B_i\leq B'_i$, for all $i\in{\bf N}^\ast$, is a direct

5725: consequence of the definition of the operation $\nabla$.

5726: Moreover, the hypotheses on the sequences ensure that

5727: $B'_1\scleq B'_2 \scleq \dots \scleq B'_i \dots\,$; thus the sequence

5728: $\{B'_i\}_{i\in{\bf N}}$ is stationary.

5729: \end{proof}

5730:

5731: \vskip0.3cm

5732:

5733: If all abstract operations are congruent with respect to $\cleq$

5734: \footnote{We would have written {\em monotonic} if the relation $\cleq$ was an

5735:   order, not a pre-order only.},

5736: each iteration of the

5737: abstract interpretation algorithm ensures that $B_{\it old}\cleq B_{\it new}$,

5738: where $B_{\it old}$ is the current value in {\it sat} and $B_{\it new}$ is the newly

5739: computed

5740: abstract sequence. Thus, Proper\-ty~\ref{prop:CCEW} guarantees

5741: termination of the abstract interpretation algorithm.

5742: Congruence of the abstract operations with respect to $\cleq$

5743: is ensured if they are

5744: ``as accurate as possible'' (which is achieved in \cite{Sahlin91});

5745: however,

5746: proving this property entails a lot of work.

5747:  A

5748: simpler solution  consists of testing whether $B_{\it old}\cleq B_{\it new}$

5749: actually holds before each application of the extending widening. If

5750: the condition does not hold, we switch to a cruder form of widening, which

5751: simply merges all successive results.

5752:

5753: \vskip0.3cm

5754: \noindent

5755: {\bf

5756: Comparison with our Cardinality Analysis.}

5757: The determinacy information inferred by means of Sahlin's domain is in general

5758: less accurate than our cardinality analysis (except maybe in some

5759: partial evaluation contexts). For instance, with the former domain, it

5760: is not possible to detect mutually exclusive clauses except when cuts

5761: occur in the clauses. As illustrated in Section \ref{sub:IP}, the

5762:  information provided by the abstract substitution

5763: component of our domain is instrumental to detect sure failure, sure

5764: success, and mutual exclusion, which all contribute to improve the

5765: accuracy of the determinacy (or cardinality) analysis.

5766: Nevertheless, the specific information about the sequence structure is

5767: finer grained in Sahlin's domain than in ours. Consider the abstract

5768: sequence $\{ {\cal L}, 1\}$; it is approximated, in our domain, by

5769: $\langle 0, 1, {\it pt} \rangle$, which is actually equivalent to

5770: $\{ {\cal L}, 0, 1, 1'\}$. Thus, it could be interesting to design a

5771: domain for abstract sequences similar to our cardinality domain,

5772: where

5773: the sequence component coincides with Sahlin's domain.

5774:

5775:

5776: \vspace{-0.2cm}

5777: \subsection{Giacobazzi and Ricci's Analysis of Determinate Computations}

5778: \label{sub:GRADC}

5779:

5780: The work of  Giacobazzi and  Ricci  \shortcite{Giacobazzi92},

5781: is also worth being reviewed in our context.

5782: They propose an analysis of functional dependencies \cite{Mendelzon85}

5783: between procedure arguments

5784: of the success set of pure logic programs. Their

5785: ana\-ly\-sis is a bottom-up abstract interpretation, based on

5786: \cite{Barbuti90,FLMP89}. The analysis also infers groundness

5787: information and is intended to be used for parallel logic program

5788: optimization.

5789: In our comparison, we focus on the functional dependencies and we

5790:  simplify the presentation in order to concentrate on the

5791: salient points.

5792: First, we provide a definition of functional dependency tailored to

5793: our framework. The definitions use some notions from Section \ref{sub:IP}.

5794:

5795: \begin{definition}[Functional Dependency]

5796: \label{def:FD}

5797: \noindent

5798: Let $\langle {\it sv}, {\it frm} \rangle$ be a structural abstract

5799: substitution over domain $D$ and set of indices $I$. A {\em functional

5800: dependency for} $\langle {\it sv}, {\it frm} \rangle$,

5801: denoted by $J\rightarrow j$, is a pair consisting of a subset $J$ of

5802: $I$ and an index $j\in I$.

5803:

5804: \noindent Let $S\in{\it PSS}_D$ be a program substitution sequence

5805: such that ${\it Subst}(S)\subseteq {\it Cc}\langle {\it sv}, {\it frm}

5806: \rangle$.

5807: We say that the functional dependency $J\rightarrow j$ {\em holds  in

5808: $S$ for $\langle {\it sv}, {\it frm} \rangle$}, if for all fami\-lies of

5809: terms $(t_i)_{i\in I}$, $(t'_i)_{i\in I}$ that are decompositions of

5810: some program substitutions of ${\it Subst}(S)$, the

5811: following implication is true:

5812: %

5813: \begin{quote}

5814:   $\begin{array}{lll}

5815: (t_i)_{i\in J}=(t'_i)_{i\in J} & \Rightarrow & t_j=t'_j.

5816:   \end{array}$

5817: \end{quote}

5818: %

5819: \end{definition}

5820:

5821: %\noindent

5822: Then we  define an abstract domain to express functional

5823: dependencies.

5824: % in program substitution sequences.

5825:

5826: \begin{definition}[Abstract Sequences with Functional

5827:   Dependencies]

5828: \label{def:ASFD}

5829:

5830: \noindent  {\em An abstract sequence with functional

5831:   dependencies} is a triple $\langle {\it sv}, {\it frm}, {\it fd}

5832: \rangle$ where $\langle {\it sv}, {\it frm} \rangle$ is a structural abstract

5833: substitution over  domain $D$ and set of indices $I$, and {\it fd}

5834: is a set of functional dependencies for $\langle {\it sv}, {\it frm} \rangle$.

5835: The concretization function for abstract sequences with functional

5836:  dependencies is defined by

5837: \vskip0.2cm

5838:  \begin{quote}

5839:

5840:    $\begin{array}{lll}

5841:    {\it Cc}\langle {\it sv}, {\it frm}, {\it fd} \rangle & = &

5842:

5843: \left\{\begin{array}{l}

5844:

5845:       S\in {\it PSS}_D \ \

5846:

5847:      \begin{array}{|l}

5848:

5849:          {\it Subst}(S)\subseteq \Cc{\langle {\it sv}, {\it frm} \rangle} \and\\

5850:

5851:          J\rightarrow j \mbox{\rm ~holds~in~}

5852:          S \mbox{\rm~for~} \langle {\it sv}, {\it frm} \rangle,\\

5853:

5854:          \mbox{\rm for~every~}J\rightarrow j\in {\it fd}.

5855:

5856:      \end{array}

5857:

5858: \end{array}\right\}.

5859:

5860:    \end{array}$

5861:  \end{quote}

5862:

5863: \end{definition}

5864: \vskip0.3cm

5865: %\noindent

5866:

5867: In fact, the functional dependency component {\it fd} is best

5868: viewed as an additional component to the cardinality domain defined in Section

5869: \ref{sec:FASAS}, since its usefulness for determinacy analysis depends

5870: on the availability of mode information. Let $S\in{\it CPSS}_D$ be a

5871: canonical program substitution sequence. We say that $S$ is {\em functional}

5872: if the set ${\it Subst}(S)$ is empty or is a singleton.

5873: %\footnote{Here we stick to the terminology of \cite{Debray89a}.}

5874: Such sequences model the behavior of procedures that cannot produce

5875: two or more distinct solutions. Assume that $S$ is the output sequence

5876: corresponding to the input substitution $\theta$, for some procedure $p$.

5877: Assume that $\theta\in{\it Cc}\langle {\it sv}, {\it frm} \rangle$ and

5878: $S\in{\it Cc}\langle {\it sv}', {\it frm}', {\it fd}' \rangle$ where

5879: $\langle {\it sv}', {\it frm}' \rangle$ is more instantiated than

5880: $\langle {\it sv}, {\it frm} \rangle$% with respect to {\it im}

5881: . We can

5882: infer that $S$ is functional if there exists $J\subseteq I'$ such

5883: that ${\it fd}'$ contains a functional dependency of the form

5884: $J\rightarrow i$, for every $i\in {\it sv}'(D)$, and if every term $t_j$

5885: corresponding to an index $j\in J$ in a program substitution of $S$ is

5886: not more instantiated than the corresponding term in $\theta$. The

5887: latter information is easily deduced if we know, for instance, that

5888: $t_j$ is ground or is a variable. Thus adding a functional dependency

5889: component to our cardinality domain allows us to infer that output

5890: program substitution sequences are functional.

5891:

5892:

5893:

5894: It is important to point out that the new component {\it fd} expresses a

5895: property of program substitution sequences, not a property of (single) program

5896: substitutions.

5897:  It is meaningless  to use functional dependencies

5898: in a domain of abstract substitutions, because a set of functional

5899: dependencies determines a (two valued) condition

5900:  on a set of program substitution.

5901: Either the set verifies the condition, then no constraint is added, or

5902: it does not and the set is rejected as a whole.

5903: Thus, a component {\it fd} defines a {\em set of sets of} program

5904: substitutions.

5905: As a consequence, functional

5906: dependencies cannot be handled by previous top-down abstract

5907: interpretation frameworks such as \cite{Bruynooghe91,TOPLAS,Marriott89,%

5908: Mellish87,Muthukumar92,Warren92,Winsborough92}.

5909: However the abstract interpretation framework used by \cite{Giacobazzi92}

5910: is bottom-up and abstracts the success set of the program. The result

5911: of an analysis represents a set of possible success sets, i.e.,

5912: {\em a set of sets of output patterns}, which is similar to a set of

5913: sets of program substitutions.

5914: As far as we know, it is the first time that this difference of

5915: expressivity between bottom-up and (previous) top-down abstract

5916: interpretation frameworks is pointed out in the literature.

5917: The comparison usually concentrates on the fact that bottom-up

5918: frameworks are {\em goal independent}, i.e., they provide information on

5919: the program as a whole, while top-down frameworks are {\em goal dependent},

5920: i.e., they provide information about the program {\em and} a given initial

5921: goal. We believe that a more fundamental difference lies in the fact that

5922:  top-down frameworks are {\em functional}, i.e., they abstract

5923: the behavior of a program by a function between sets of sets, while bottom-up

5924: frameworks are {\em relational}, i.e., they abstract  the behavior of

5925: a program by a set of relations. The difference between the two

5926: approaches has been previously put forward

5927: by Cousot and Cousot \shortcite{CousotJLC92}, but not in the context of logic programs.

5928: The functional approach can easily focus on small parts of the program

5929: behavior but looses the dependencies between inputs and outputs;

5930: the converse holds for the relational approach.

5931: Our novel framework is basically functional, but the domain of

5932: abstract sequences is in some sense relational; thus the framework allows us to combine the

5933: advantages of both approaches.

5934:

5935:

5936: \subsection{Debray and Warren's Analysis of Functional Computations}

5937:

5938: In the previous section, we have shown that functional dependencies

5939: are useful to infer that an output program substitution sequence is functional,

5940:  i.e., does not contain two or more distinct program substitutions.

5941: Such a sequence may contain several occurrences of the same program

5942: substitution, however.

5943: The importance of functional computations for logic program

5944: optimization was advocated early by Debray and Warren

5945: \shortcite{Debray89a}.

5946: In this paper, these authors propose a sophisticated algorithm to infer

5947: functional computations of a logic program. The analysis exploits

5948: functional dependencies and mode information, as well as a set of

5949: sufficient conditions to detect mutually exclusive clauses.

5950: Their algorithm is not based on abstract interpretation and assumes

5951: that functional dependencies and mode information are given from

5952: outside.

5953: Thus the algorithm considers an annotated program; it uses a set

5954: $\{ \bot, {\bf true}, {\bf false} \}$ where $\bot$ is an initializing value,

5955: {\bf true} means that a procedure is functional and {\bf false} means

5956: that it is not known whether the procedure is functional. Hence, the

5957: set can be viewed as a domain of abstract sequences, with

5958: concretization function

5959: ${\it Cc} : \{ \bot, {\bf true}, {\bf false}\} \rightarrow \wp({\it CPSS})$

5960: defined by

5961: \begin{quote}

5962:   $\begin{array}{lll}

5963:    {\it Cc}(\bot)       & = & \{<\bot>\};\\

5964:    {\it Cc}({\bf true})  & = & \{S\in{\it CPSS}\mid \

5965:                                {\it Subst}(S)

5966:                                 \mbox{\rm~is~empty~or~is~a~singleton.}

5967:                              \};\\

5968:    {\it Cc}({\bf false}) & = & {\it CPSS}.

5969:         \end{array}$

5970: \end{quote}

5971:

5972: \noindent

5973: All aspects of their analysis can be accommodated in our approach by providing

5974: suitable abstract domains. An abstract domain consisting of our cardinality

5975: domain augmented with a functional dependency component  would

5976: probably be fairly accurate.

5977: Moreover, in our approach,  all analyses can be performed at the same

5978: time and interact with each other, making it possible to get

5979: a better accuracy.

5980:

5981:

5982:

5983: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

5984: \section{Conclusion}

5985: \label{conclusion}

5986: %%%%%%%%%%%%%%%%%%%%%%%%%%

5987:

5988: This paper has introduced a novel abstract interpretation framework,

5989: capturing the depth-first search strategy and the cut operation of

5990: Prolog. The framework is based on the notion of substitution sequences

5991: and the abstract semantics is defined as a pre-consistent post-fixpoint

5992: of the abstract transformation. Abstract interpretation algorithms

5993: need chain-closed domains and a special widening operator to compute

5994: the semantics. This approach overcomes some of the limitations of

5995: previous frameworks. In particular, it broadens the applicability of

5996: the abstract interpretation approach to new analyses and

5997: can potentially improve

5998: the precision of existing analyses.

5999: On the practical side, in this paper, we have only shown

6000: that our approach allows one to integrate -

6001: efficiently and at a low conceptual cost - a predicate

6002: level analysis (i.e., determinacy analysis)

6003: to variable level analyses classically handled by abstract

6004: interpretation. However, the improvement on classical analyses

6005: is marginal because, due to our design choices for the

6006: abstract sequence domain

6007: (i.e., a simple extension of {\tt Pattern}), the new system

6008: behaves almost as the previous

6009: version of GAIA for variable level analyses.

6010: Nevertheless, the new framework opens a door for defining

6011: and exploiting more sophisticated domains for abstract sequences.

6012:

6013: \bibliographystyle{tlp}

6014: \bibliography{Biblio}

6015:

6016:

6017: \newpage

6018: \appendix

6019: \section*{Appendix}

6020: We complete here the description of the abstract operations

6021: started in Section \ref{sub:GAO}.

6022: The correctness proofs of all the abstract operations can be found in

6023: \cite{sequence}. The definitions below have been added in order to allow

6024: the reader to check the details of the examples in Section \ref{sub:GAO}.

6025: \\

6026:

6027: \noindent

6028: {\bf Extension at  Clause Entry}:\ \

6029:      {\tt EXTC}$(c,\cdot):{\it AS}_D

6030: \rightarrow{\it ASSC}_{D'} $ \ \\

6031: The implementation reuses the homonymous operation

6032: from the previous framework,

6033: which is specified as follows.

6034: \\

6035:

6036: \noindent

6037: {\it Operation}$\;$

6038: {\tt EXTC}$(c,\cdot):{\it AS}_D

6039:                        \rightarrow{\it AS}_{D'} $\\

6040: Let $\beta\!\in\!{\it AS}_{D}$,

6041: $\theta\!\in\!{\it CPS}_{D}$, and $\theta'\!\in\!{\it PS}_{D'}$

6042: such that $x_i\theta'=x_i\theta$

6043:                    $(\forall i:1\leq i \leq n)$ and

6044:                     $x_{n+1}\theta'$, \dots , $x_{m}\theta'$ are

6045:                distinct standard variables

6046:                    not belonging to ${\it codom}(\theta).$

6047: Then

6048: $$

6049: \begin{array}{lll}

6050: \theta\in{\it Cc}(\beta) & \Rightarrow & [\![\theta']\!]\in

6051: {\it Cc}({\tt EXTC}(c,\beta)).

6052: \end{array}

6053: $$

6054: Hence, the {\tt EXTC} operation on sequences is defined by

6055: $$

6056: \begin{array}{lll}

6057: {\tt EXTC}(c,\beta) & = &

6058:   \langle \langle {\tt EXTC}(c,\beta), 1, 1, {\it st} \rangle

6059:          , {\it nocut} \rangle.

6060: \end{array}

6061: $$

6062:

6063: \vskip0.3cm

6064: \noindent

6065: {\bf Restriction at Clause Exit}:\ \

6066:  {\tt RESTRC}$(c,\cdot):{\it ASSC}_{D'}

6067:                          \rightarrow{\it ASSC}_D $ \ \\

6068: The treatment of this operation is similar to the previous one.

6069: We first specify the abstract substitution version of the operation.

6070: \\

6071:

6072: \noindent

6073: {\it Operation}$\;$

6074: {\tt RESTRC}$(c,\cdot):{\it AS}_{D'}

6075:                        \rightarrow{\it AS}_{D} $

6076: \\

6077: Let $\beta\!\in\!{\it AS}_{D'}$ and

6078: $\theta\!\in\!{\it CPS}_{D'}$. We have

6079: $$

6080: \begin{array}{lll}

6081: \theta\in{\it Cc}(\beta) & \Rightarrow & [\![\theta_{|D}]\!]\in

6082: {\it Cc}({\tt RESTRC}(c,\beta)).

6083: \end{array}

6084: $$

6085: Hence, the {\tt RESTRC} operation on sequences is defined by

6086: $$

6087: \begin{array}{lll}

6088: {\tt RESTRC}(c,C) & = &

6089:   \langle {\tt RESTRC}(c,\beta), m, M, {\it acf} \rangle.

6090: \end{array}

6091: $$

6092:

6093: \vskip0.3cm

6094: \noindent

6095: {\bf Restriction before a Call}:\ \

6096:  {\tt RESTRG}$(l,\cdot):{\it AS}_{D'}

6097:                          \rightarrow{\it AS}_{D'''} $\ \\

6098: This  operation is simply inherited from the previous framework.

6099: \\

6100:

6101: \noindent

6102: {\bf Unification of a Variable and a Functor}:\ \

6103:    {\tt UNIF-FUNC}$(f,\cdot): {\it AS}_{D}

6104:                      \rightarrow {\it ASS}_{{D}}$\ \\

6105: The treatment of this operation is identical to the treatment of the

6106: {\tt UNIF-VAR} ope\-ration and  is thus omitted.

6107: \\

6108:

6109: \noindent

6110: {\bf Extension of the Result of a Call}:\ \

6111: {\tt EXTGS}$(l,\cdot,\cdot):{\it ASSC}_{D'}\times {\it ASS}_{D'''}

6112:                          \rightarrow{\it ASSC}_{D'} $\

6113: \\

6114: This operation reuses the operation {\tt EXTG} from the previous framework.

6115: The reused operation has to fulfill the specification just below.

6116: \\

6117:

6118: \noindent

6119: {\it Operation}$\;$

6120: {\tt EXTG}$(l,\cdot,\cdot):{\it AS}_{D'}\times {\it AS}_{D'''}

6121:                          \rightarrow{\it AS}_{D'} $

6122: \\

6123: Let $\beta_1\in{\it AS}_{D'}$ and $\beta_2\in{\it AS}_{D'''}$.

6124: Let $\theta_1\in {\it CPS}_{D'}$ and $\theta_2\in {\it PS}_{D'''}$ be

6125: such that $x_{i_j}\theta_1=x_j\theta_2$

6126: $(\forall j : 1\leq j\leq {n'})$.

6127: Let $\sigma\in {\it SS}$ such that

6128: ${\it dom}(\sigma)\subseteq {\it codom}(\theta_2)$.

6129: Let $\{z_1,\ldots,z_r\}={\it codom}(\theta_1)

6130:                              \setminus{\it codom}(\theta_2)$.

6131: Let $y_{1},\ldots,y_{r}$ be distinct standard variables

6132:       not belonging to ${\it codom}(\theta_1)\cup {\it

6133:         codom}(\sigma)$.

6134: Let $\rho=\{z_1/y_{1},\ldots,

6135:                     z_r/y_{r},

6136:                     y_{1}/z_1,\ldots,

6137:                     y_{r}/z_r\}$.

6138: Under these assumptions,

6139: \begin{quote}$

6140: \begin{array}{rcl}

6141:        \left.

6142:        \begin{array}{r}

6143:        \theta_1\in {\it Cc}(\beta_1),\\

6144:        \theta_2\sigma\in{\it Cc}(\beta_2)

6145:        \end{array}

6146:        \right\}

6147:                &\Rightarrow&

6148:         [\![\theta_1\rho\sigma]\!] \in

6149:                        {\it  Cc}({\tt EXTG}(l,\beta_1,\beta_2)).

6150: \end{array}$

6151: \end{quote}

6152:

6153: \noindent

6154: The implementation of {\tt EXTGS} is as follows.

6155:

6156:

6157: \begin{quote}

6158:   $\begin{array}{llll}

6159:    \beta' & = & {\tt EXTG}(l,\beta_1,\beta_2);                     \\

6160:

6161:    m'     & = & m_1 m_2         & \myif t_2={\it st},              \\

6162:           & = & \min(1,m_1) m_2 & \otherwise;                      \\

6163:

6164:    M'     & = & \min(1,M_1) M_2 & \myif t_2={\it snt},             \\

6165:           & = & M_1 M_2         & \otherwise;                      \\

6166:

6167:    t'     & = & {\it snt}       & \myif t_1={\it snt}

6168:                                   \myor (t_2={\it snt}

6169:                                          \and m_1\geq 1),         \\

6170:           & = & {\it st}        & \myif t_1={\it st}

6171:                                   \and (t_2={\it st}

6172:                                          \myor M_1=0),            \\

6173:           & = & {\it pt}        & \otherwise;                     \\

6174:

6175:    {\it acf}'

6176:           & = & {\it acf}.

6177:    \end{array}$

6178: \end{quote}

6179:

6180:

6181: \vskip0.3cm

6182: \noindent

6183: {\bf Operation}$\;$ ${\tt SEQ}: {\it ASSC}_{D}

6184:                          \rightarrow{\it ASS}_{D}$\

6185: \\

6186: We define

6187: $${\tt SEQ}(\langle B, {\it acf}\rangle) \  = \  B.$$

6188:

6189: \vskip0.3cm

6190: \noindent

6191: {\bf Operation} $\;$ ${\tt SUBST}: {\it ASSC}_{D'}

6192:                          \rightarrow{\it AS}_{D'}$\\

6193: We define

6194: $$

6195: {\tt SUBST}(\langle \langle \beta, m, M, t \rangle, {\it acf}\rangle)\

6196: =\  \beta.$$

6197:

6198:

6199: \end{document}

6200:

6201:

6202:

6203:

6204:

6205: