1: \documentstyle[12pt,fullpage]{article}
2: \setlength{\unitlength}{1.0cm}
3: \newtheorem{lem}{Lemma}
4: \newtheorem{lemaux}{Auxiliary Lemma}
5: \newtheorem{fact}{Fact}
6: \newtheorem{ax}{Axiom}
7: \newtheorem{defn}{Definition}
8: \newtheorem{theo}{Theorem}
9: \newtheorem{nota}{Notation}
10: \newtheorem{prop}{Proposition}
11: \newtheorem{cor}{Corollary}
12: \newtheorem{rem}{Remark}
13: \newtheorem{eg}{Example}
14: \newcommand{\rarr}{\rightarrow}
15: \setlength{\oddsidemargin}{-.26in} %for dvi-imagen, a bit -ve needed.
16: \setlength{\topmargin}{-0.7in} %for qms
17: \setlength{\textheight}{9.0in}
18: \setlength{\textwidth}{6.5in} % 5.5in = 14 cm
19: \setlength{\marginparwidth}{0in}
20: \setlength{\marginparsep}{0in}
21: \setlength{\marginparpush}{0in}
22:
23: \newcommand{\smaran}{{\sf Smaran}}
24: \newcommand{\tgr}{{\sf TGR}}
25:
26: \setcounter{secnumdepth}{3} %to get numbers on subsubsection in text
27: %default is 2.
28: \setcounter{tocdepth}{3} %to get numbers on subsubsection in toc
29: %default is 2.
30: \setlength{\unitlength}{1.0cm}
31: \renewcommand{\baselinestretch}{1.0}
32: \begin{document}
33: \thispagestyle{empty}
34: \newtheorem{example}{Example}
35: \newtheorem{progs}{Program}
36: \begin{center}
37: {\Large \bf Static Analysis Techniques for Equational Logic
38: Programs\footnote{Partially supported by NSF grant CCR-9732186}}
39: \vspace{2mm}
40:
41: {\large Rakesh Verma} \vspace{1mm}
42:
43: Computer Science Department \\
44: University of Houston \\
45: Houston, TX 77204 \\
46: Ph: (713)743-3348 \\
47: Fax: (713)743-3335 \\
48: Email: rmverma@cs.uh.edu \\
49: \end{center}
50: \begin{center}
51:
52:
53: {\bf Abstract}
54: \end{center}
55: An equational logic program is a set of directed equations or rules,
56: which are used to compute in the obvious way (by replacing equals with
57: ``simpler'' equals). We present static analysis techniques for efficient
58: equational logic programming, some of which have been implemented
59: in $LR^2$, a laboratory for developing and evaluating fast,
60: efficient, and practical rewriting techniques. Two novel
61: features of $LR^2$ are that non-left-linear rules are allowed in most
62: contexts and it has a tabling option based on the congruence-closure based
63: algorithm to compute normal forms. Although, the focus of this
64: research is on the tabling approach some of the techniques are
65: applicable to the untabled approach as well. Our presentation
66: is in the context of $LR^2$, which is an interpreter, but some of the
67: techniques apply to compilation as well.
68:
69:
70: \newpage
71: \pagenumbering{arabic}
72:
73: \section{Introduction}
74: Equational logic programming has received increasing attention in
75: recent years and there are now fast compilers such as Elan, Maude,
76: etc., and fast interpreters such as $LR^2$. The objective of this
77: paper is to enhance the efficiency of interpretation (and possibly
78: also compilation) of equational logic programming by analyzing the input
79: program. Since we want fast and efficient interpreters, our focus is on
80: practical and effective techniques that can be implemented very
81: cheaply (linear or close to linear time) in terms of their time and
82: space cost.
83: To describe the techniques we assume the usual scenario, viz., an
84: equational program, or a set of rules, $P$ and a term $t$ are given
85: and the goal is to compute a normal form of $t$. We assume here
86: that $P$ is confluent and that $t$ is well-formed but we do not assume termination. Although most
87: of the techniques we present are of general significance, their immediate
88: motivation is to enhance the efficiency of $LR^2$, especially its
89: tabling mechanism, so we give a brief
90: self-contained introduction to $LR^2$ and a description of
91: the tabling algorithm. For more details on $LR^2$, including useful optimizations and some performance results, the reader can consult
92: \cite{rakshal}. Throughout this paper, when we use the term tabling we
93: are specifically referring to this algorithm.
94: \subsection{$LR^2$}
95: $LR^2$ is a laboratory for developing and evaluating fast, efficient,
96: and practical rewriting techniques.
97: $LR^2$ consists of an expression graph interpreter \tgr,
98: and an expression graph rewriter that tables the history of its reductions,
99: called \smaran\ , based on the congruence closure approach. The input to $LR^2$ is a program
100: representing either an orthogonal, not necessarily terminating,
101: rewrite system, or a convergent (terminating and confluent) rewrite
102: system, and an input expression. First-order orthogonal systems can
103: easily express first-order functions and it is a well-known fact that
104: parallel-outermost reduction or lazy evaluation is necessary and
105: sufficient to compute with such systems. Observe that since convergent
106: systems can be given to $LR^2$, hence, it allows non-left-linear rules
107: such as $x + -x \rarr 0$ and $equal(x,x) \rarr true$, etc. Similar to
108: algebraic specification languages like OBJ, ASF+SDF and Elan a
109: program is composed from modules. Each module defines its own
110: signature and rewriting rules. A module can import other
111: modules. Expressions in $LR^2$ are written in prefix form. The
112: language of $LR^2$ contains several built-in datatypes, viz.,
113: integers, floating-point arithmetic, booleans, characters, sets,
114: multisets, and strings with associated operations. The set datatype
115: supports the insertion, deletion, membership, union, etc., operations.
116: The string datatype supports membership and indexing operations.
117:
118: $LR^2$ also includes a variant detector that can determine if a new expression
119: is an alphabetic variant of an existing expression, which is currently usable with
120: the tabling option. If so, the appropriate
121: variant of the result computed for the existing expression is used for
122: further rewriting instead of starting from scratch.
123: $LR^2$ also allows a compact form for storing lists of arithmetic
124: progressions as these occur frequently. Instead of storing the
125: entire list, $LR^2$ stores the initial value, the final value and the
126: difference. $LR^2$ provides a set of commands so that it can be called by other systems for symbolic computation. This currently requires the UNIX message
127: passing mechanism.
128:
129:
130: $LR^2$ provides a variety of options for controlling
131: the amount of history that is stored by the system, if
132: the user chooses the tabling option. The default option using \smaran\
133: is to save the results of each rewrite
134: step in a compact data structure. The language of $LR^2$ allows annotating specific functions with the keyword ``memo''. This
135: allows to save all the results of rewriting expressions that have the
136: specified function symbol at the root. The Delete (history) option in
137: $LR^2$ allows to delete the entire
138: history of rewrites performed so far except for the given expression and its
139: latest reduct after every $i^{th}$ rewrite step, where $i$ can be
140: specified by the user. Another possibility allowed is to delete the
141: entire history except the given expression and the latest reduct as soon as
142: the free memory available to the user drops to a user-specified
143: percentage of the total available memory. Options can be combined in any way to suit the application. However, the delete option overrides the
144: other options.
145:
146:
147: Operators can also be declared associative (A) or commutative (C) or
148: both (AC) in $LR^2$. However, currently only
149: left-linear rules with A, C, or AC operators can be handled; efficient matching
150: algorithms/heuristics for nonlinear rules with such operators are
151: fairly involved and is currently in the testing
152: phase. It is known that A-matching, C-matching and AC-matching of
153: general patterns are all NP-complete even with some
154: restrictions on occurrences of variables \cite{rakinfcom}.
155:
156: The rest of this paper is organized as follows. In Section 2, we
157: give a description of the main algorithm for \smaran\ ,
158: in Section 3 we discuss techniques to enhance the efficiency of
159: $LR^2$, in Section 4 we
160: mention the techniques that have already been implemented in $LR^2$. We
161: conclude with some promising directions for future research.
162: We omit here all proofs of the results stated in this paper.
163:
164: \section{Smaran's Core Algorithm } \label{ccnaalg}
165: The basic algorithm at the core of \smaran\ was proposed in \cite{chewalg}
166: for orthogonal rules and later extended in \cite{rakdiss,rakfocs} to non-left-linear
167: rewrite systems under fairly general conditions. The basic algorithm
168: in turn is an extension of the
169: well-known congruence closure algorithm (CCA) \cite{dst,no,shostak,kozen}) for ground equations.
170:
171: Recall that CCA divides the set of terms
172: into numbered equivalence classes. Membership
173: of a term in an equivalence class is decided by its {\em
174: signature}. The signature of a term $f(t_1,\ldots ,
175: t_n)$ is the tuple $\langle f~ \#[t_1]~\ldots \#[t_n]
176: \rangle$, where $\#[t_i]$ is the number of the equivalence class
177: containing the signature representing $t_i$. Equivalence class
178: $C$ {\em represents} term $t$ if $C$ contains a signature representing
179: $t$. CCA operates by merging equivalence class
180: representing terms whose equivalence follows from the given equations.
181:
182:
183: To extend CCA for normalization we make use of the concept of
184: a distinguished signature in each equivalence class called the {\it
185: unreduced signature} \cite{chewalg,rakdiss}. Using this signature we
186: construct a distinguished term. It has been shown in
187: \cite{chewalg,rakdiss,rakfocs} that it is enough to examine this
188: term to select useful rule instances, and that it is sufficient
189: to check this term for irreducibility for rewrite systems satisfying
190: fairly general conditions. If it exists and is irreducible, then the class
191: containing this term has a normal form.
192: Thus, instances of left hand sides represented by reduced
193: signatures, do not lead to any progress towards the normal form, and
194: any term represented by a reduced signature cannot be in normal
195: form.
196: Whenever a rule instance $A \rarr B$ can be applied to
197: the distinguished term of a class, $C$, because of a match, the
198: signature representing $A$ is marked reduced and the class
199: representing $B$ (if any) is merged with $C$. If there is no
200: class representing $B$, then a signature representing $B$ is
201: constructed, inserted into $C$, and marked the
202: unreduced signature of $C$.
203:
204: The algorithm starts by constructing the signature of the given term. This signature is then
205: inserted into a class and marked the unreduced signature of the class.
206: This class number is tracked throughout the process of normalization.
207: Signatures of terms are constructed in the obvious bottom-up way.
208: We illustrate this algorithm with a
209: small example. Consider the convergent rewrite system with three rules that are numbered for convenience,
210: \[
211: 1: fib(x) \rarr f(x > 1, x), 2: f(true,x) \rarr fib(x-1) +
212: fib(x-2), 3: f(false,x) \rarr 1, \] that
213: uses built-in arithmetic to define the fibonacci function and
214: let the input term be $fib(2)$.
215:
216: The initial set of classes is: \[ 0:\{2^{*}\}\hspace{0.05in} 1:\{\langle fib~
217: 0 \rangle^{*}\} \] (the symbol `*' indicates unreduced
218: signatures). Now the match procedure is called to find a match
219: between the unreduced signature of any class and the left-hand side of
220: any rule. A match occurs between class 1 and rule 1.
221: Now, the signature representing the corresponding instance of the
222: right-hand side of rule 1 is constructed, inserted into class 1 and
223: marked as its unreduced signature. Note that here we do
224: not show the signatures, corresponding to the built-in datatypes, that
225: can be evaluated directly. At this stage the classes are as follows:
226:
227: \centerline{ $0:\{2^{*}\}\hspace{0.05in} 1:\{\langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle^{*}\}
228: \hspace{0.05in}2:\{1^{*}\}\hspace{0.05in}3:\{true^* \}$}
229:
230: During the next iteration class 1 matches rule 2. The right-hand side
231: instance is $fib(1) + fib(0)$. The signature representing this is
232: created and inserted into class 1 as its new unreduced signature. At
233: the end of the second iteration the new/changed classes
234: are as follows:
235:
236: \centerline{$ 1:\{\langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle, \langle +~ 4~
237: 6 \rangle^{*}\}\hspace{0.05in}4:\{\langle fib~ 2 \rangle^*\} \hspace{0.05in}5:\{0^*\}
238: \hspace{0.05in} 6:\{\langle fib~ 5 \rangle^*\}$}
239:
240: After two rewrite steps, using rules 1 and 3 respectively, the
241: term $fib(1)$ represented by class 4 reduces to 1, which is represented by
242: class 2. Hence classes 4 and 2 are merged, say into 2, and we have:
243:
244: \centerline{ $0:\{2^{*}\}\hspace{0.02in} 1:\{\langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle, \langle +~ 2~ 6 \rangle^{*}\}
245: \hspace{0.02in}2:\{\langle fib~ 2 \rangle, \langle f~ 7~ 2 \rangle,
246: 1^{*}\} $ }
247: \centerline{$3:\{true^* \} \hspace{0.02in}5:\{0^*\}
248: \hspace{0.02in} 6:\{\langle fib~ 5 \rangle^*\} \hspace{0.02in} 7:\{false^*\}$}
249:
250: Note that signatures containing the class number 4 have been
251: updated to contain class number 2. After two more rewrite steps, the
252: term $fib(0)$ represented by class 6 reduces to 1. Hence classes 6
253: and 2 are merged, say into 2, and we get:
254:
255: \centerline{$0:\{2^{*}\}\hspace{0.02in} 1:\{\langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle, \langle +~ 2~ 2 \rangle^{*}\}
256: \hspace{0.02in}2:\{\langle fib~ 2 \rangle, \langle fib~ 5 \rangle, \langle f~ 7~ 5 \rangle, \langle f~ 7~ 2 \rangle,
257: 1^{*}\}$}
258: \centerline{$3:\{true^* \} \hspace{0.02in}5:\{0^*\} \hspace{0.02in}
259: \hspace{0.02in} 7:\{false^*\}$}
260:
261: Now, the unreduced signature of class 1 can be evaluated to yield the
262: term 2, which is in class 0. Hence, classes 1 and 0 are merged, say
263: into 0, and we get:
264:
265: \centerline{$1:\{2^{*}, \langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle, \langle +~ 2~ 2 \rangle\}
266: \hspace{0.02in}2:\{\langle fib~ 2 \rangle, \langle fib~ 5 \rangle, \langle f~ 7~ 5 \rangle, \langle f~ 7~ 2 \rangle,
267: 1^{*}\}$}
268: \centerline{$3:\{true^* \} \hspace{0.02in}5:\{0^*\} \hspace{0.02in}
269: \hspace{0.02in} 7:\{false^*\}$}
270:
271: No more matches are found. Therefore, the algorithm checks for the
272: existence of a normal form for the given term. Class 1 contains the unreduced
273: signature {\it 2}, which is irreducible. Thus the normal
274: form of $fib(2)$ is 2. Note that if the normal form
275: of $fib(fib(2))$ is needed, no more computations are needed
276: since this term is also represented by the signature $\langle fib~
277: 0 \rangle$ in class 1. It is simply extracted from class 1. The
278: normal form of this term is also $2$. On the other hand, an
279: interpreter that does not store history would compute $fib(2)$ twice
280: to normalize this term.
281: This compact data structure helps exploit the advantages of
282: storing history and can also speed up normalization.
283:
284: \section{When does tabling help?}
285: In this section, we present techniques that analyze the program $P$ and
286: term $t$ to determine whether tabling could be helpful in computing
287: the normal form of $t$. Recall that tabling can help in two important
288: ways:
289: \begin{enumerate}
290: \item It can improve the termination characteristics, i.e., tabled
291: rewriting can halt in cases where untabled would fail to halt because
292: there is no normal form. For example, consider the rule $a \rarr
293: f(a)$.
294: \item It can improve the efficiency of the computation. Each rule
295: instance is applied at most once and its results are stored. This is
296: extremely useful for: (a) problems that recursively solve lots of
297: overlapping subproblems, e.g., dynamic programming problems, (b)
298: proving theorems of the form $A = B$, and determining whether a
299: critical-pair ($A, B$) generated during KB-completion \cite{kb} is
300: trivial. (It is possible that $A$ and $B$ were reduced to a common
301: term albeit at different times, since this common term is remembered
302: in tabling, it will terminate immediately whereas untabled methods may
303: go on forever.), and (c) incremental computation.
304: \end{enumerate}
305:
306: We now develop two sets of necessary conditions that a program $P$
307: must satisfy so that tabling can potentially be useful for $P$, one
308: set for each way in which tabling can help.
309: \subsection{When does tabling improve termination?}
310: First we give necessary conditions on $P$ alone for tabling to help
311: termination. Next, we will analyze both $P$ and $t$ to develop even
312: stronger necessary conditions.
313: As usual, we partition the constants and function symbols of $P$ into two classes:
314: constructors and defined symbols.
315: \begin{defn}
316: A symbol is called {\em defined}
317: if it is the outermost symbol of the left-hand side of some rule in
318: $P$,
319: otherwise it is a constructor.
320: \end{defn}
321: For example, $fib$ is a defined symbol in the program given above,
322: whereas $true$ is a constructor.
323: Note that constructors include the constants from the predefined types
324: such as bool, float, int, char, etc., (we know that there can be no rules with integer, float, char and
325: bool constants as left-hand sides).
326: The standard operators over these predefined
327: types are treated as constructor functions in this section.
328: \begin{defn}
329: Let $f$ be a defined symbol of $P$ and let $g$ be any function symbol
330: or constant of $P$. We say
331: that $f$ {\em needs} $g$ if $g$ appears {\em in} the right-hand side of any
332: rule that has $f$ as its defined symbol.
333: \end{defn}
334:
335: Now we define the {\em needs} graph of $P$ as the directed graph $G= (V,E)$,
336: where $V$ contains a vertex for each function symbol or constant of
337: $P$ and $E$ contains a directed edge from the vertex for $f$ to the
338: vertex for $g$ precisely when $f$ needs $g$. Note that this graph may
339: have self-loops, i.e., cycles with just one edge.
340: \begin{prop}[Necessary Condition]
341: A necessary condition for tabling to improve termination of the
342: computation of normal form of $t$ with respect to $P$ is the existence
343: of a cycle in the needs graph of $P$.
344: \end{prop}
345: This condition can be checked in time that is linear in the sum of the
346: lengths of the right-hand sides of $P$ using a standard search
347: algorithm. We only introduce vertices in the graph that are
348: necessary, i.e., they represent either defined symbols with at least
349: one edge or a symbol that is adjacent to a defined symbol. The reader
350: may wonder if it is possible to have a defined symbol with no
351: originating edge. This happens when all the rules defining this symbol
352: are collapsing, i.e., the
353: right-hand side is just a variable.
354:
355: Clearly, the above condition is not sufficient for two different
356: kinds of reasons: (i) for instance,
357: $t$ could be a normal form and $P$ can have many cycles, but obviously
358: tabling is of no help, (ii) $P$ may have cycles but still represent a
359: terminating system (e.g., the fibonacci system above). Trying to
360: decide termination of $P$ is in general undecidable and even when it is
361: decidable it can be prohibitively expensive.
362: Therefore, we only try to ameliorate the first source of insufficiency
363: by analyzing $t$ as well.
364: \begin{prop}[Necessary Condition]
365: A necessary condition for tabling to improve termination of the
366: computation of normal form of $t$ with respect to $P$ is the existence
367: of a cycle in the needs graph of $P$ containing a vertex that is
368: reachable by a directed path from a vertex representing a function
369: symbol or constant that appears also in $t$.
370: \end{prop}
371: This condition can be checked by marking the vertices that appear on
372: some cycle in the needs graph and then running a modified search
373: algorithm that looks for marked vertices.
374:
375: The needs graph defined above is a refinement of the call graph from
376: classical compilation techniques, where procedure parameters are also
377: taken into account. Variations of these graphs have appeared in
378: other contexts in rewriting (e.g., for narrowing
379: a related but different graph is used in \cite{dershosiva}),
380: functional and logic programming (e.g., for termination). However,
381: generally speaking, there are differences between such graphs in
382: existing literature (that we are aware of) and our approach.
383: Many of the applications here appear to be new as well.
384: \subsection{When does tabling help efficiency?}
385: Now we focus on the second benefit of tabling, i.e., the possibility
386: that the computation of the normal form of $t$ would need the same
387: reduction step more than once. Since the tabling approach here
388: shares common subexpressions to the extent possible, repeated subterms
389: in $t$ or its reducts do not necessarily lead to repeated reduction
390: steps.
391: \begin{prop}[Necessary Condition]
392: A necessary condition for tabling to improve the efficiency of the
393: sequential computation of normal form of $t$ with respect to $P$ is
394: the existence of a node with in-degree at least one and outdegree at
395: least one in the needs graph of $P$.
396: \end{prop}
397: This condition is easily checked in time that is linear in the sum of
398: right-hand sides of $P$. Again, this condition is not sufficient for
399: two kinds of reasons. Informally, they are: (i) $t$ has no ``need''
400: for this vertex, or (ii) $P$ does not ``allow'' to use this vertex from
401: different terms. Again, it is quite complex to decide (ii). Hence, we
402: only try to ameliorate the first by analyzing $t$.
403: \begin{prop}[Necessary Condition]
404: A necessary condition for tabling to improve the efficiency of the
405: sequential computation of normal form of $t$ with respect to $P$ is
406: the existence of a node with in-degree more than one and outdegree at
407: least one in the needs graph of $P$ that is reachable via
408: directed paths from two different symbols of $t$.
409: \end{prop}
410: To check this condition we can use a standard search algorithm.
411: Note that we are looking for a very ``short'' chain in the needs graph,
412: i.e., we are looking for the possibility that just one reduction step
413: can be saved. The length of the desired chain can be adjusted based on
414: the ratio of the time it takes to do a reduction step versus the time
415: it takes to lookup the results of a reduction step in the table, which
416: grows with the table size.
417:
418: \subsection{\bf Techniques for Efficient Reduction and Tabling}
419: For technical reasons, the standard operators over the predefined
420: types are not treated as constructor functions in this section,
421: even though we do not
422: have any rules in the system for them.
423: \subsubsection{Optimizing Reduction}
424: We define the following class of signatures called the
425: don't-reduce signatures that represent a subclass of normal forms
426: that are formed from only constructors, or constructor terms.
427:
428: \begin{defn}
429: (a) Base case: All constructor constants are don't-reduce signatures.
430: (b) Inductive case: If classes $c_1,...,c_n$ contain only
431: don't-reduce signatures {\em as unreduced signatures},
432: then $f(c_1,...,c_n)$ is also a don't-reduce signature if $f$ is a
433: constructor.
434: \end{defn}
435:
436: Now note that no rule can match a don't-reduce signature.
437: This means that when we do leftmost-outermost reduction and we reach a class
438: that contains an unreduced signature which is a don't-reduce signature, then
439: we need not go down (below this class) recursively to find any
440: matches. Actually, if a class contains a normal form, then, of
441: course, we need not explore the structure below to find any
442: matches. However, to detect normal forms one must match rules against
443: the unreduced signature of a class, whereas the detection of
444: don't-reduce signatures is easier since it can be done in a bottom-up
445: manner without any matching. Moreover, our situation is even more
446: complicated with respect to detection of normal forms, since we have
447: built-in arithmetic and other types. It is easy to construct a
448: scenario where no rules match an unreduced signature $s$ because $s$
449: depends on a class which contains an unevaluated arithmetic signature
450: as unreduced signature and once it is evaluated rules may or may not
451: match. Although we state this in terms of signatures here, it is clear
452: that this idea applies to the untabled approach as well by marking
453: subterms of terms that are constructor terms.
454: \subsubsection{Optimizing Matching}
455: By analyzing the term $t$ together with the program $P$ we can
456: eliminate unnecessary rules, i.e. those rules that are never needed in
457: the computation of the normal form of $t$. For this we find all the
458: defined symbols of $P$ that are reachable from all the symbols in $t$
459: using a standard search algorithm on the needs graph of $P$. We then
460: include only rules for the reachable defined symbols in building
461: efficient data structures for matching, e.g., discrimination nets. In
462: compilation this corresponds to not generating code for the
463: unnecessary rules and could speed up the compilation process
464: itself (of course, the set of terms to be normalized must be known in
465: advance and specified before compilation). The generated code uses
466: less space.
467: \subsubsection{Optimizing dependency lists in Tabling}
468: To identify the signatures that must be updated in the event a class
469: is unioned into another class, a dependency list is usually associated
470: with each class. It
471: contains all the signatures that ``depend'' on this class. Our next
472: optimizations are intended to reduce the size of these lists, since
473: processing them can get quite expensive.
474:
475: For this purpose, we define a class of signatures called the don't-add signatures,
476: which is a subclass of the class of don't-reduce signatures.
477: \begin{defn}
478: (a) Base case: All constructor constants are don't-add signatures.
479: (b) Inductive case: If classes $c_1,...,c_n$ contain only
480: don't-add signatures, then $f(c_1,...,c_n)$ is also a don't-add
481: signature if $f$ is a constructor.
482: \end{defn}
483: Note the difference in case (b) between don't-add vs. don't-reduce signatures.
484: All constants are, of course, never added to any dependency
485: list since they do not depend any class. We need (a) for the inductive
486: step.
487:
488: Now we have the following rule for don't-add signatures: do not add a
489: don't-add signature to dependency lists. However, there is a practical
490: difficulty in implementing this rule, since it is quite possible that
491: $f(c_1,..., c_n)$ is a don't-add signature, but because of the insertion
492: of a signature into class $c_i$, class $c_i$ contains a signature
493: which is an ``add'' signature and hence $f(c_1,...,c_n)$ is also now an
494: ``add'' signature. It may be possible to do this efficiently, but it
495: will require time and also the programming may be somewhat complex.
496:
497: Therefore, we identify a subclass of don't-add signatures, called
498: never-add signatures.
499: \subsubsection{Never-add signatures}
500: In the following we assume that the rewrite system is also
501: non-collapsing, i.e., does not have a rule whose right-hand side is a
502: variable.
503:
504: While we are parsing the rules, we can
505: also look for: (i) if any constructor constant is the right-hand side
506: of some rule (note "is" the rhs and not "is in" the rhs)
507: and (ii) if any standard operators over the predefined types appear in the
508: right-hand side of some rule.
509:
510: Now, we define the class of never-add signatures.
511:
512: \begin{defn}
513: (a) Base case: Every user-defined constructor constant that does not
514: appear as the rhs of any rule is a never-add constant. If $t$ is
515: a constant from one of predefined types in (ii) above, then $t$ is a
516: never-add constant only if
517: there is no rule containing an operator that takes an
518: argument of the type of $t$ in the right-hand-side and the term to be
519: normalised also does not contain any operator that takes an argument
520: of type t. (b) Inductive case: If classes $c_1, \ldots, c_n$ contain only
521: never-add signatures and $f$ is a constructor that does not
522: appear as the outermost symbol of the right-hand side of any rule,
523: then $f(c_1,\ldots,c_n)$ is a never-add signature.
524: \end{defn}
525:
526: Never-add signatures are a proper subclass of don't-add signatures,
527: which are easily identified in a bottom-up manner and they can never
528: change on the union of two classes or on the insertion of a signature
529: in a class containing a never-add signature. So in \smaran\, we never
530: add a never-add signature to the dependency list of any class.
531:
532: If the rewrite system can contain collapsing rules, then it is not
533: sufficient to check that no constructor constant and no
534: constructor function appears as the outermost symbol of any
535: rule. Consider, the following rewrite system for example:
536: \begin{eqnarray}
537: from(x,y) & \rarr & if(y>0,cons(x,from(x+1,y-1)),nil) \\
538: if(true,x,y) & \rarr & x \\
539: if(false,x,y) & \rarr & y
540: \end{eqnarray}
541:
542: In this system both nil and cons do not appear as the outermost symbol
543: of any rule but there are terms $T$ that reduce to terms of the form nil
544: and cons(...). Hence, it is possible that because of a union one of
545: the terms from $T$ changes and so on.
546: In fact, it is in general undecidable whether given a term $t$ and a
547: rewrite system $R$ whether $t$ reduces to a constructor term.
548:
549: \section{Implementation in $LR^2$}
550: Some of the techniques discussed above have been implemented in $LR^2$
551: and have proven their effectiveness. The techniques that have been
552: implemented are the identification
553: of don't-reduce and never-add signatures. We are currently in the
554: process of implementing the needs-graph analysis to help the user with
555: choosing the tabling or the non-tabling option in $LR^2$.
556: \section{Discussion and Future Work}
557: In this paper, we have presented some easily computed static analysis
558: techniques to enhance the efficiency of tabled and untabled
559: computations with equational logic programs. These techniques have
560: implications for interpreters (and sometimes compilers) and some are
561: applicable in a wider context, e.g., functional or logic
562: programming. The analysis described here is a first approximation since it
563: ignores the arguments of functions symbols. A useful direction for
564: future research is to use constraint-based analysis with the needs
565: graph to take into account the arguments as well. Care must be taken
566: to ensure that the benefits of the analyses outweigh the costs.
567: Other avenues are transformation of the program including partial
568: evaluation.
569: \bibliographystyle{plain}
570: \bibliography{/auto/disk01/faculty/cosc/rmverma/latex/space}
571: \end{document}
572:
573:
574:
575:
576:
577:
578:
579:
580: