0010:cs0010034/cs0010034

1: \documentstyle[12pt,fullpage]{article}

2: \setlength{\unitlength}{1.0cm}

3: \newtheorem{lem}{Lemma}

4: \newtheorem{lemaux}{Auxiliary Lemma}

5: \newtheorem{fact}{Fact}

6: \newtheorem{ax}{Axiom}

7: \newtheorem{defn}{Definition}

8: \newtheorem{theo}{Theorem}

9: \newtheorem{nota}{Notation}

10: \newtheorem{prop}{Proposition}

11: \newtheorem{cor}{Corollary}

12: \newtheorem{rem}{Remark}

13: \newtheorem{eg}{Example}

14: \newcommand{\rarr}{\rightarrow}

15: \setlength{\oddsidemargin}{-.26in}   	%for dvi-imagen, a bit -ve needed.

16: \setlength{\topmargin}{-0.7in}		%for qms

17: \setlength{\textheight}{9.0in}

18: \setlength{\textwidth}{6.5in} % 5.5in = 14 cm

19: \setlength{\marginparwidth}{0in}

20: \setlength{\marginparsep}{0in}

21: \setlength{\marginparpush}{0in}

22:

23: \newcommand{\smaran}{{\sf Smaran}}

24: \newcommand{\tgr}{{\sf TGR}}

25:

26: \setcounter{secnumdepth}{3}	%to get numbers on subsubsection in text

27: 				%default is 2.

28: \setcounter{tocdepth}{3}	%to get numbers on subsubsection in toc

29: 				%default is 2.

30: \setlength{\unitlength}{1.0cm}

31: \renewcommand{\baselinestretch}{1.0}

32: \begin{document}

33: \thispagestyle{empty}

34: \newtheorem{example}{Example}

35: \newtheorem{progs}{Program}

36: \begin{center}

37: {\Large \bf Static Analysis Techniques for Equational Logic

38: Programs\footnote{Partially supported by NSF grant CCR-9732186}}

39: \vspace{2mm}

40:

41: {\large Rakesh Verma} \vspace{1mm}

42:

43: Computer Science Department \\

44: University of Houston \\

45: Houston, TX 77204 \\

46: Ph: (713)743-3348 \\

47: Fax: (713)743-3335 \\

48: Email: rmverma@cs.uh.edu \\

49: \end{center}

50: \begin{center}

51:

52:

53: {\bf Abstract}

54: \end{center}

55: An equational logic program is a set of directed equations or rules,

56: which are used to compute in the obvious way (by replacing equals with

57: ``simpler'' equals). We present static analysis techniques for efficient

58: equational logic programming, some of which have been implemented

59: in  $LR^2$, a laboratory for developing and evaluating fast,

60: efficient, and practical rewriting techniques.  Two novel

61: features of $LR^2$ are that non-left-linear rules are allowed in most

62: contexts and it has a tabling option based on the congruence-closure based

63: algorithm to compute normal forms. Although, the focus of this

64: research is on the tabling approach some of the techniques are

65: applicable to the untabled approach as well. Our presentation

66: is in the context of $LR^2$, which is an interpreter, but some of the

67: techniques apply to compilation as well.

68:

69:

70: \newpage

71: \pagenumbering{arabic}

72:

73: \section{Introduction}

74: Equational logic programming has received increasing attention in

75: recent years and there are now fast compilers such as Elan, Maude,

76: etc., and fast interpreters such as $LR^2$. The objective of this

77: paper is to enhance the efficiency of interpretation (and possibly

78: also compilation) of equational logic programming by analyzing the input

79: program. Since we want fast and efficient interpreters, our focus is on

80: practical and effective techniques that can be implemented very

81: cheaply (linear or close to linear time) in terms of their time and

82: space cost.

83: To describe the techniques we assume the usual scenario, viz., an

84: equational program, or a set of rules,  $P$ and a term $t$ are given

85: and the goal is to compute a normal form of $t$. We  assume here

86: that $P$ is confluent and that $t$ is well-formed but we do not assume termination. Although most

87: of the techniques we present are of general significance, their immediate

88: motivation is to enhance the efficiency of $LR^2$, especially its

89: tabling mechanism, so we give a brief

90: self-contained introduction to $LR^2$  and  a description of

91: the tabling algorithm. For more details on $LR^2$, including useful optimizations and some performance results,  the reader can consult

92: \cite{rakshal}. Throughout this paper, when we use the term tabling we

93: are specifically referring to this algorithm.

94: \subsection{$LR^2$}

95: $LR^2$ is a laboratory for developing and evaluating fast, efficient,

96: and practical rewriting techniques.

97: 	$LR^2$ consists of an expression graph interpreter \tgr,

98: and an expression graph rewriter that tables the history of its reductions,

99: called \smaran\ , based on the  congruence closure approach. The input to $LR^2$ is a program

100: representing either an orthogonal, not necessarily terminating,

101: rewrite system, or a convergent (terminating and confluent) rewrite

102: system,  and an input   expression. First-order orthogonal systems can

103: easily express first-order functions and it is a well-known fact that

104: parallel-outermost reduction or lazy evaluation is necessary and

105: sufficient to compute with such systems. Observe that since convergent

106: systems can be given to $LR^2$, hence, it allows non-left-linear rules

107: such as $x + -x \rarr 0$ and $equal(x,x) \rarr true$, etc.  Similar to

108: algebraic  specification languages like OBJ, ASF+SDF and Elan a

109: program is composed from modules. Each module defines its own

110: signature and rewriting rules.  A module can import other

111: modules. Expressions in $LR^2$ are written in prefix form. The

112: language of $LR^2$ contains several built-in datatypes, viz.,

113: integers, floating-point arithmetic, booleans, characters, sets,

114: multisets, and strings with associated operations. The set datatype

115: supports the insertion, deletion, membership, union, etc., operations.

116: The string datatype supports membership and indexing operations.

117:

118: $LR^2$ also includes a variant detector that can determine if a new expression

119: is an alphabetic variant of an existing expression, which is currently usable with

120: the tabling option. If so, the appropriate

121: variant of the result computed for the existing expression is used for

122: further rewriting instead of starting from scratch.

123: $LR^2$ also allows a  compact form for storing lists of arithmetic

124: progressions as these occur frequently. Instead of storing the

125: entire list, $LR^2$ stores the initial value, the final value and the

126: difference.  $LR^2$ provides a set of commands so that it can  be called by other systems for symbolic computation. This currently requires the UNIX message

127: passing mechanism.

128:

129:

130:    $LR^2$ provides a variety of options for controlling

131: the amount of history that is stored by the system, if

132: the user chooses the tabling option. The default option using \smaran\

133: is to save the results of each rewrite

134: step in a compact data structure. The language of $LR^2$ allows annotating  specific functions with the keyword ``memo''. This

135: allows to save all the results of  rewriting expressions that have the

136: specified function symbol at the root. The Delete (history) option in

137: $LR^2$ allows to delete the entire

138: history of rewrites performed so far except for the given expression and its

139: latest reduct after every  $i^{th}$   rewrite step, where  $i$  can be

140: specified by the user. Another possibility allowed is to delete the

141: entire history except the given expression and the latest reduct as soon as

142: the free memory available to the user drops to a user-specified

143: percentage of the total available memory.     Options can be combined in any way to suit the application. However, the delete option overrides the

144: other options.

145:

146:

147: Operators can also be declared associative (A) or commutative (C) or

148: both (AC) in $LR^2$. However, currently only

149: left-linear rules with A, C, or AC operators can be handled; efficient matching

150: algorithms/heuristics for nonlinear rules with such operators are

151: fairly involved and is currently in the testing

152: phase. It is known that A-matching, C-matching and AC-matching of

153: general patterns are all NP-complete even with some

154: restrictions on occurrences of variables \cite{rakinfcom}.

155:

156: The rest of this paper is organized as follows. In Section 2, we

157: give a description of the main algorithm for \smaran\ ,

158: in Section 3 we discuss techniques to enhance the efficiency of

159: $LR^2$,  in Section 4 we

160: mention the techniques that have already been implemented in $LR^2$. We

161: conclude with some promising directions for future research.

162: We omit here all proofs of the results stated in this paper.

163:

164: \section{Smaran's Core Algorithm } \label{ccnaalg}

165: The basic algorithm at the core of  \smaran\ was proposed in \cite{chewalg}

166: for orthogonal rules and later extended in \cite{rakdiss,rakfocs} to non-left-linear

167: rewrite systems under fairly general conditions. The basic algorithm

168: in turn is an extension of the

169:  well-known congruence closure algorithm (CCA) \cite{dst,no,shostak,kozen}) for ground equations.

170:

171: Recall that CCA divides the set of terms

172: into numbered equivalence classes. Membership

173: of a term in an equivalence class is decided by its {\em

174: signature}. The signature of a term $f(t_1,\ldots ,

175: t_n)$ is the tuple  $\langle f~ \#[t_1]~\ldots \#[t_n]

176: \rangle$, where $\#[t_i]$ is the number of the equivalence class

177: containing the signature representing $t_i$. Equivalence class

178: $C$ {\em represents} term $t$ if $C$ contains a signature representing

179: $t$. CCA operates by merging equivalence class

180: representing terms whose equivalence follows from the given equations.

181:

182:

183: To extend CCA for normalization we make use of the concept of

184: a distinguished signature in each equivalence class called the {\it

185: unreduced signature} \cite{chewalg,rakdiss}. Using this signature we

186: construct a distinguished term. It has been shown in

187: \cite{chewalg,rakdiss,rakfocs} that it is enough to examine this

188: term to select useful rule instances, and that it is sufficient

189: to check this term for irreducibility for rewrite systems satisfying

190: fairly general conditions. If it exists and is irreducible, then the class

191: containing this term has a normal form.

192: Thus, instances of left hand sides  represented by reduced

193: signatures, do not lead to any progress towards the  normal form, and

194: any term represented by a reduced signature cannot be in normal

195: form.

196: Whenever a rule instance $A \rarr B$ can be applied to

197: the distinguished term of a class, $C$, because of a match, the

198: signature  representing $A$ is marked reduced and the class

199: representing $B$ (if any) is merged with  $C$. If there is no

200: class representing $B$, then a signature  representing $B$ is

201: constructed, inserted into $C$, and marked the

202: unreduced signature of $C$.

203:

204: The algorithm starts by constructing the signature of the given term. This signature is then

205: inserted into a class and marked the unreduced signature of the class.

206: This class number is tracked throughout the process of normalization.

207: Signatures of terms are constructed in the obvious bottom-up way.

208: We illustrate this algorithm with a

209: small example. Consider the  convergent rewrite system with three rules that are numbered for convenience,

210: \[

211: 1: fib(x)  \rarr f(x   > 1, x), 2: f(true,x)  \rarr  fib(x-1) +

212: fib(x-2), 3: f(false,x)  \rarr  1, \] that

213: uses built-in arithmetic to define the fibonacci function and

214: let the input term be $fib(2)$.

215:

216: The initial set of classes is: \[ 0:\{2^{*}\}\hspace{0.05in} 1:\{\langle fib~

217: 0 \rangle^{*}\} \] (the symbol `*' indicates unreduced

218: signatures). Now the match procedure  is called to find a match

219: between the unreduced signature of any class and the left-hand side of

220: any rule. A match occurs between  class 1 and rule 1.

221: Now, the signature representing  the corresponding instance of the

222: right-hand side of rule 1   is constructed, inserted into class 1 and

223: marked as its unreduced signature. Note that here  we do

224: not show the signatures, corresponding to the built-in datatypes, that

225: can be evaluated directly.  At this stage the classes are as follows:

226:

227: \centerline{ $0:\{2^{*}\}\hspace{0.05in} 1:\{\langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle^{*}\}

228: \hspace{0.05in}2:\{1^{*}\}\hspace{0.05in}3:\{true^* \}$}

229:

230: During the next iteration class 1 matches rule 2. The right-hand side

231: instance is $fib(1) + fib(0)$. The signature representing this is

232: created and inserted into class 1 as its new unreduced signature. At

233: the end of the second iteration the new/changed classes

234: are as follows:

235:

236: \centerline{$ 1:\{\langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle, \langle +~ 4~

237: 6 \rangle^{*}\}\hspace{0.05in}4:\{\langle fib~ 2 \rangle^*\} \hspace{0.05in}5:\{0^*\}

238: \hspace{0.05in} 6:\{\langle fib~ 5 \rangle^*\}$}

239:

240: After two rewrite steps, using rules 1 and 3 respectively, the

241: term $fib(1)$ represented by class 4 reduces to 1, which is represented by

242: class 2. Hence classes 4 and 2 are merged, say into 2, and we have:

243:

244: \centerline{ $0:\{2^{*}\}\hspace{0.02in} 1:\{\langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle, \langle +~ 2~ 6 \rangle^{*}\}

245: \hspace{0.02in}2:\{\langle fib~ 2 \rangle, \langle f~ 7~ 2 \rangle,

246: 1^{*}\} $ }

247: \centerline{$3:\{true^* \} \hspace{0.02in}5:\{0^*\}

248: \hspace{0.02in} 6:\{\langle fib~ 5 \rangle^*\} \hspace{0.02in} 7:\{false^*\}$}

249:

250: Note that signatures containing the class number 4 have been

251: updated to contain class number 2. After two more rewrite steps, the

252: term $fib(0)$ represented by class 6 reduces to 1. Hence classes 6

253: and 2 are merged, say into 2, and we get:

254:

255: \centerline{$0:\{2^{*}\}\hspace{0.02in} 1:\{\langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle, \langle +~ 2~ 2 \rangle^{*}\}

256: \hspace{0.02in}2:\{\langle fib~ 2 \rangle, \langle fib~ 5 \rangle, \langle f~ 7~ 5 \rangle, \langle f~ 7~ 2 \rangle,

257: 1^{*}\}$}

258: \centerline{$3:\{true^* \} \hspace{0.02in}5:\{0^*\}  \hspace{0.02in}

259:  \hspace{0.02in} 7:\{false^*\}$}

260:

261: Now, the unreduced signature of class 1 can be evaluated to yield the

262: term 2, which is in class 0. Hence, classes 1 and 0 are merged, say

263: into 0, and we get:

264:

265: \centerline{$1:\{2^{*}, \langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle, \langle +~ 2~ 2 \rangle\}

266: \hspace{0.02in}2:\{\langle fib~ 2 \rangle, \langle fib~ 5 \rangle, \langle f~ 7~ 5 \rangle, \langle f~ 7~ 2 \rangle,

267: 1^{*}\}$}

268: \centerline{$3:\{true^* \} \hspace{0.02in}5:\{0^*\}  \hspace{0.02in}

269:  \hspace{0.02in} 7:\{false^*\}$}

270:

271: No more matches are found. Therefore, the algorithm checks for the

272: existence of a normal form for the given term. Class 1 contains the unreduced

273: signature {\it 2}, which is irreducible. Thus the normal

274: form of $fib(2)$ is 2.  Note that if the normal form

275: of $fib(fib(2))$ is needed, no more computations are needed

276: since this term is also represented by the signature $\langle fib~

277: 0 \rangle$ in class 1. It is  simply extracted from class 1.  The

278: normal form of this term is also $2$. On the other hand, an

279: interpreter that does not store history would compute $fib(2)$ twice

280: to normalize this term.

281: This  compact data structure helps exploit the advantages of

282: storing history and can also speed up  normalization.

283:

284: \section{When does tabling help?}

285: In this section, we present techniques that analyze the program $P$ and

286: term $t$ to determine whether tabling could  be helpful in  computing

287: the normal form of $t$. Recall that tabling can help in two important

288: ways:

289: \begin{enumerate}

290: \item It can improve the termination characteristics, i.e., tabled

291: rewriting can halt in cases where untabled would fail to halt because

292: there is no normal form. For example, consider the rule $a \rarr

293: f(a)$.

294: \item It can improve the efficiency of the computation. Each rule

295: instance is applied at most once and its results  are stored. This is

296: extremely  useful for: (a) problems that recursively solve lots of

297: overlapping subproblems, e.g., dynamic programming problems, (b)

298: proving theorems of the form $A = B$, and determining whether a

299: critical-pair ($A, B$) generated during KB-completion \cite{kb} is

300: trivial. (It is possible that $A$ and $B$ were reduced to a common

301: term albeit at different times, since  this common term is remembered

302: in tabling, it will terminate immediately whereas untabled methods may

303: go on forever.), and (c) incremental computation.

304: \end{enumerate}

305:

306: We now develop two sets of necessary conditions that a program $P$

307: must satisfy so that tabling can potentially be useful for $P$, one

308: set for each way in which tabling can help.

309: \subsection{When does tabling improve termination?}

310: First we give necessary conditions on $P$ alone for tabling to help

311: termination. Next, we will analyze both $P$ and $t$ to develop even

312: stronger necessary conditions.

313: As usual, we partition the constants and function symbols of $P$ into two classes:

314: constructors and defined symbols.

315: \begin{defn}

316: A symbol is called {\em defined}

317: if it is the outermost symbol of the left-hand side of some rule in

318: $P$,

319: otherwise it is a constructor.

320: \end{defn}

321: For example, $fib$ is a defined symbol in the program given above,

322: whereas $true$ is a constructor.

323: Note that constructors include the constants from the predefined types

324: such as bool, float, int, char, etc., (we know that there can be no rules with integer, float, char and

325: bool constants as left-hand sides).

326: The standard operators over these predefined

327:  types  are treated as constructor functions in this section.

328: \begin{defn}

329: Let $f$ be a defined symbol of $P$  and let $g$ be any function symbol

330: or constant of $P$. We say

331: that $f$ {\em needs} $g$ if $g$ appears {\em in} the right-hand side of any

332: rule that has $f$ as its defined symbol.

333: \end{defn}

334:

335: Now we define the {\em needs} graph of $P$ as the directed graph $G= (V,E)$,

336: where $V$ contains a vertex for  each function symbol or constant of

337: $P$ and $E$ contains a directed edge from the vertex for  $f$ to the

338: vertex for $g$ precisely when $f$ needs $g$.  Note that this graph may

339: have self-loops, i.e., cycles with just one edge.

340: \begin{prop}[Necessary Condition]

341: A necessary condition for tabling to improve termination of the

342: computation of normal form of $t$ with respect to $P$ is the existence

343: of a cycle in the needs graph of $P$.

344: \end{prop}

345: This condition can be checked in time that is linear in the sum of the

346: lengths of the right-hand sides of $P$ using a standard search

347: algorithm. We only introduce  vertices in the graph that are

348: necessary, i.e., they represent either defined symbols with at least

349: one edge or a symbol that is adjacent to a defined symbol. The reader

350: may wonder if it is possible to have a defined symbol with no

351: originating edge. This happens when all the rules defining this symbol

352: are  collapsing, i.e., the

353: right-hand side is just a variable.

354:

355: Clearly, the above condition is not sufficient for two different

356: kinds of reasons: (i) for instance,

357: $t$ could be a normal form and $P$ can have many cycles, but obviously

358: tabling is of no help, (ii) $P$ may have cycles but still represent a

359: terminating system (e.g., the fibonacci system above). Trying to

360: decide termination of $P$ is in general undecidable and even when it is

361: decidable it can be prohibitively expensive.

362: Therefore, we only try to ameliorate the first source of insufficiency

363: by  analyzing $t$ as well.

364: \begin{prop}[Necessary Condition]

365: A necessary condition for tabling to improve termination of the

366: computation of normal form of $t$ with respect to $P$ is the existence

367: of a cycle in the needs graph of $P$  containing a vertex that is

368: reachable by a directed path from a vertex representing a function

369: symbol or constant that appears also in $t$.

370: \end{prop}

371: This condition can be checked by marking the vertices that appear on

372: some cycle in the needs graph and then running a modified search

373: algorithm that looks for marked vertices.

374:

375: The needs graph defined above is a refinement of the call graph from

376: classical compilation techniques, where procedure parameters are also

377: taken into account. Variations of these graphs have appeared in

378: other  contexts in rewriting (e.g., for narrowing

379: a related but different graph is used in \cite{dershosiva}),

380: functional and logic programming (e.g., for termination). However,

381: generally speaking, there are differences between such graphs in

382: existing literature (that we are aware of) and our approach.

383: Many of the  applications here appear to be new as well.

384: \subsection{When does tabling help efficiency?}

385: Now we focus on the second benefit of tabling, i.e., the possibility

386: that the computation of the normal form of $t$ would need the same

387: reduction step more than once. Since the tabling approach here

388: shares common subexpressions to the extent possible, repeated subterms

389: in $t$ or its reducts do not necessarily lead to repeated reduction

390: steps.

391: \begin{prop}[Necessary Condition]

392: A necessary condition for tabling to improve the efficiency of the

393: sequential computation of normal form of $t$ with respect to $P$ is

394: the existence of a node with in-degree at least one and outdegree at

395: least one in the needs  graph of $P$.

396: \end{prop}

397: This condition is easily checked in time that is linear in the sum of

398: right-hand sides of $P$. Again, this condition is not sufficient for

399: two kinds of reasons. Informally, they are: (i)  $t$ has no ``need''

400: for this vertex, or (ii) $P$ does not ``allow'' to use this vertex from

401: different terms.  Again, it is quite complex to decide (ii). Hence, we

402: only try to ameliorate the first by analyzing $t$.

403: \begin{prop}[Necessary Condition]

404: A necessary condition for tabling to improve the efficiency of the

405: sequential computation of normal form of $t$ with respect to $P$ is

406: the existence of a node with in-degree more than one and outdegree at

407: least one in the needs  graph of $P$ that is reachable via

408: directed paths from two different symbols of $t$.

409: \end{prop}

410: To check this condition we can use a standard search algorithm.

411: Note that we are looking for a very ``short'' chain in the needs graph,

412: i.e., we are looking for the possibility that just one reduction step

413: can be saved. The length of the desired chain can be adjusted based on

414: the ratio of the time it takes to do a reduction step versus the time

415: it takes to lookup the results of a reduction step in the table, which

416: grows with the table size.

417:

418: \subsection{\bf Techniques for Efficient Reduction and Tabling}

419: For technical reasons, the standard operators over the predefined

420:  types  are not treated as constructor functions in this section,

421:  even though we do not

422: have any rules in the system for them.

423: \subsubsection{Optimizing Reduction}

424: We define the following class of signatures called the

425: don't-reduce  signatures that represent a subclass of normal forms

426: that are formed from only constructors, or constructor terms.

427:

428: \begin{defn}

429: (a) Base case: All constructor constants are don't-reduce signatures.

430: (b) Inductive case: If classes $c_1,...,c_n$  contain only

431: don't-reduce signatures {\em as unreduced signatures},

432: then $f(c_1,...,c_n)$ is also a don't-reduce signature if $f$ is a

433: constructor.

434: \end{defn}

435:

436: Now note that no rule can match a don't-reduce signature.

437: This means that when we do leftmost-outermost reduction and we reach a class

438: that contains an unreduced signature which is a don't-reduce signature, then

439: we need not go down (below this class) recursively to find any

440: matches.  Actually, if a class contains a normal form, then, of

441: course,  we need not explore the structure below to find any

442: matches. However, to detect normal forms one must match rules against

443: the unreduced signature of a class, whereas the detection of

444: don't-reduce signatures is easier since it can be done in a bottom-up

445: manner without any matching.  Moreover, our situation is even more

446: complicated with respect to detection of normal forms, since we have

447: built-in arithmetic and other types. It is easy to construct a

448: scenario where  no rules match an unreduced signature $s$ because $s$

449: depends on a class which contains an unevaluated arithmetic signature

450: as unreduced signature and once it is evaluated rules may or may not

451: match. Although we state this in terms of signatures here, it is clear

452: that this idea applies to the untabled approach as well by marking

453: subterms of terms that are constructor terms.

454: \subsubsection{Optimizing Matching}

455: By analyzing the term $t$ together with the program $P$ we can

456: eliminate unnecessary rules, i.e. those rules that are never needed in

457: the computation of the normal form of $t$. For this we find all the

458: defined symbols of $P$ that are reachable from all the symbols in $t$

459: using a standard search algorithm on the needs graph of $P$. We then

460: include only rules for the reachable defined symbols in building

461: efficient data structures for matching, e.g., discrimination nets. In

462: compilation this corresponds to not generating code for the

463: unnecessary rules and could speed up the compilation process

464: itself (of course, the set of terms to be normalized must be known in

465: advance and specified before compilation). The generated code uses

466: less space.

467: \subsubsection{Optimizing dependency lists in Tabling}

468: To identify the signatures that must be updated in the event a class

469: is unioned into another class, a dependency list is usually associated

470: with each class. It

471: contains all the signatures that ``depend'' on this class. Our next

472: optimizations are intended to reduce the size of these lists, since

473: processing them can get quite expensive.

474:

475: For this purpose, we  define a class of signatures called the don't-add signatures,

476: which is a subclass of the class of don't-reduce signatures.

477: \begin{defn}

478: (a)  Base case: All constructor constants are don't-add signatures.

479: (b) Inductive case: If classes $c_1,...,c_n$  contain only

480: don't-add signatures, then $f(c_1,...,c_n)$ is also a don't-add

481: signature if $f$ is a constructor.

482: \end{defn}

483: Note the difference in case (b) between don't-add vs. don't-reduce signatures.

484: All constants are, of course, never added to any dependency

485: list since they do not depend any class. We need (a) for the inductive

486: step.

487:

488: Now we have the following rule for don't-add signatures: do not add a

489: don't-add signature to dependency lists. However, there is a practical

490: difficulty in  implementing this rule, since it is quite possible that

491: $f(c_1,..., c_n)$ is a don't-add signature, but because of the insertion

492: of a signature into class $c_i$, class $c_i$ contains a signature

493: which is an ``add'' signature and hence $f(c_1,...,c_n)$ is also now an

494: ``add'' signature. It may be possible to do this efficiently, but it

495: will require time and also the programming may be somewhat complex.

496:

497: Therefore, we identify a subclass of don't-add signatures, called

498: never-add signatures.

499: \subsubsection{Never-add signatures}

500: In the following we assume that the rewrite system is also

501: non-collapsing, i.e., does not have a rule whose right-hand side is a

502: variable.

503:

504: While we are parsing the rules, we can

505: also look for: (i) if any constructor constant is the right-hand side

506: of some rule (note "is" the rhs and not "is in" the rhs)

507: and (ii) if any standard operators over the predefined types  appear in the

508: right-hand side of some rule.

509:

510: Now, we define the class of never-add signatures.

511:

512: \begin{defn}

513: (a) Base case: Every user-defined constructor constant that does not

514: appear as the rhs of any rule is a never-add constant.  If $t$ is

515: a constant from one of predefined  types in (ii) above,   then $t$ is a

516: never-add constant only if

517: there is no rule containing an operator that takes an

518: argument of the type of $t$ in the right-hand-side and the term to be

519: normalised also does not contain any operator that takes an argument

520: of type t. (b) Inductive case: If classes $c_1, \ldots, c_n$ contain only

521: never-add signatures and $f$ is a constructor that does not

522: appear as the outermost symbol of the right-hand side of any rule,

523: then  $f(c_1,\ldots,c_n)$ is a never-add signature.

524: \end{defn}

525:

526: Never-add signatures are a proper subclass of don't-add signatures,

527: which are easily identified in a bottom-up manner and they can never

528: change on the union of two classes or on the insertion of a signature

529: in a class containing a never-add signature. So in \smaran\, we never

530: add a never-add signature to the dependency list of any class.

531:

532: If the rewrite system can contain collapsing rules, then it is not

533: sufficient to check that no constructor constant and no

534: constructor function appears as the outermost symbol of any

535: rule. Consider, the following rewrite system for example:

536: \begin{eqnarray}

537: from(x,y) & \rarr & if(y>0,cons(x,from(x+1,y-1)),nil) \\

538: if(true,x,y) & \rarr & x \\

539: if(false,x,y) & \rarr & y

540: \end{eqnarray}

541:

542: In this system both nil and cons do not appear as the outermost symbol

543: of any rule but there are terms $T$ that reduce to terms of the form nil

544: and cons(...). Hence, it is possible that because of a union one of

545: the terms from $T$ changes and so on.

546: In fact, it is in general undecidable whether given a term $t$ and a

547: rewrite system $R$ whether $t$ reduces to a constructor term.

548:

549: \section{Implementation in $LR^2$}

550: Some of the techniques discussed above have been implemented in $LR^2$

551: and have proven their effectiveness. The techniques that have been

552: implemented are  the identification

553: of don't-reduce and never-add signatures. We are currently in the

554: process of implementing the needs-graph analysis to help the user with

555: choosing the tabling or the non-tabling option in $LR^2$.

556: \section{Discussion and Future Work}

557: In this paper, we have presented some easily computed static analysis

558: techniques to enhance the efficiency of tabled and untabled

559: computations with equational logic programs. These techniques have

560: implications for interpreters (and  sometimes compilers) and some are

561: applicable in a wider context, e.g., functional or logic

562: programming. The analysis described here is a first approximation since it

563: ignores the arguments of functions symbols. A useful direction for

564: future research is to use constraint-based analysis with the needs

565: graph to take into account the arguments as well. Care must be taken

566: to ensure that the benefits of the analyses outweigh the costs.

567: Other avenues are transformation of the program including partial

568: evaluation.

569: \bibliographystyle{plain}

570: \bibliography{/auto/disk01/faculty/cosc/rmverma/latex/space}

571: \end{document}

572:

573:

574:

575:

576:

577:

578:

579:

580: