cs0010034/cs0010034
1: \documentstyle[12pt,fullpage]{article}
2: \setlength{\unitlength}{1.0cm}
3: \newtheorem{lem}{Lemma}
4: \newtheorem{lemaux}{Auxiliary Lemma}
5: \newtheorem{fact}{Fact}
6: \newtheorem{ax}{Axiom}
7: \newtheorem{defn}{Definition}
8: \newtheorem{theo}{Theorem}
9: \newtheorem{nota}{Notation}
10: \newtheorem{prop}{Proposition}
11: \newtheorem{cor}{Corollary}
12: \newtheorem{rem}{Remark}
13: \newtheorem{eg}{Example}
14: \newcommand{\rarr}{\rightarrow}
15: \setlength{\oddsidemargin}{-.26in}   	%for dvi-imagen, a bit -ve needed.
16: \setlength{\topmargin}{-0.7in}		%for qms
17: \setlength{\textheight}{9.0in}
18: \setlength{\textwidth}{6.5in} % 5.5in = 14 cm
19: \setlength{\marginparwidth}{0in}
20: \setlength{\marginparsep}{0in}
21: \setlength{\marginparpush}{0in}
22: 
23: \newcommand{\smaran}{{\sf Smaran}}
24: \newcommand{\tgr}{{\sf TGR}}
25: 
26: \setcounter{secnumdepth}{3}	%to get numbers on subsubsection in text
27: 				%default is 2.
28: \setcounter{tocdepth}{3}	%to get numbers on subsubsection in toc
29: 				%default is 2.
30: \setlength{\unitlength}{1.0cm}
31: \renewcommand{\baselinestretch}{1.0}
32: \begin{document}
33: \thispagestyle{empty}
34: \newtheorem{example}{Example}
35: \newtheorem{progs}{Program}
36: \begin{center}
37: {\Large \bf Static Analysis Techniques for Equational Logic
38: Programs\footnote{Partially supported by NSF grant CCR-9732186}}  
39: \vspace{2mm}  
40: 
41: {\large Rakesh Verma} \vspace{1mm}
42: 
43: Computer Science Department \\
44: University of Houston \\
45: Houston, TX 77204 \\
46: Ph: (713)743-3348 \\
47: Fax: (713)743-3335 \\
48: Email: rmverma@cs.uh.edu \\
49: \end{center}
50: \begin{center}
51: 
52: 
53: {\bf Abstract}
54: \end{center}
55: An equational logic program is a set of directed equations or rules,
56: which are used to compute in the obvious way (by replacing equals with
57: ``simpler'' equals). We present static analysis techniques for efficient
58: equational logic programming, some of which have been implemented 
59: in  $LR^2$, a laboratory for developing and evaluating fast,
60: efficient, and practical rewriting techniques.  Two novel
61: features of $LR^2$ are that non-left-linear rules are allowed in most
62: contexts and it has a tabling option based on the congruence-closure based
63: algorithm to compute normal forms. Although, the focus of this
64: research is on the tabling approach some of the techniques are
65: applicable to the untabled approach as well. Our presentation 
66: is in the context of $LR^2$, which is an interpreter, but some of the 
67: techniques apply to compilation as well. 
68: 
69: 
70: \newpage
71: \pagenumbering{arabic}
72: 
73: \section{Introduction}
74: Equational logic programming has received increasing attention in
75: recent years and there are now fast compilers such as Elan, Maude,
76: etc., and fast interpreters such as $LR^2$. The objective of this
77: paper is to enhance the efficiency of interpretation (and possibly 
78: also compilation) of equational logic programming by analyzing the input
79: program. Since we want fast and efficient interpreters, our focus is on
80: practical and effective techniques that can be implemented very
81: cheaply (linear or close to linear time) in terms of their time and
82: space cost.  
83: To describe the techniques we assume the usual scenario, viz., an
84: equational program, or a set of rules,  $P$ and a term $t$ are given
85: and the goal is to compute a normal form of $t$. We  assume here
86: that $P$ is confluent and that $t$ is well-formed but we do not assume termination. Although most
87: of the techniques we present are of general significance, their immediate
88: motivation is to enhance the efficiency of $LR^2$, especially its
89: tabling mechanism, so we give a brief
90: self-contained introduction to $LR^2$  and  a description of
91: the tabling algorithm. For more details on $LR^2$, including useful optimizations and some performance results,  the reader can consult
92: \cite{rakshal}. Throughout this paper, when we use the term tabling we
93: are specifically referring to this algorithm. 
94: \subsection{$LR^2$}
95: $LR^2$ is a laboratory for developing and evaluating fast, efficient, 
96: and practical rewriting techniques. 
97: 	$LR^2$ consists of an expression graph interpreter \tgr,
98: and an expression graph rewriter that tables the history of its reductions,
99: called \smaran\ , based on the  congruence closure approach. The input to $LR^2$ is a program
100: representing either an orthogonal, not necessarily terminating,
101: rewrite system, or a convergent (terminating and confluent) rewrite
102: system,  and an input   expression. First-order orthogonal systems can
103: easily express first-order functions and it is a well-known fact that
104: parallel-outermost reduction or lazy evaluation is necessary and
105: sufficient to compute with such systems. Observe that since convergent
106: systems can be given to $LR^2$, hence, it allows non-left-linear rules
107: such as $x + -x \rarr 0$ and $equal(x,x) \rarr true$, etc.  Similar to
108: algebraic  specification languages like OBJ, ASF+SDF and Elan a
109: program is composed from modules. Each module defines its own
110: signature and rewriting rules.  A module can import other
111: modules. Expressions in $LR^2$ are written in prefix form. The
112: language of $LR^2$ contains several built-in datatypes, viz.,
113: integers, floating-point arithmetic, booleans, characters, sets,
114: multisets, and strings with associated operations. The set datatype
115: supports the insertion, deletion, membership, union, etc., operations.
116: The string datatype supports membership and indexing operations.  
117: 
118: $LR^2$ also includes a variant detector that can determine if a new expression
119: is an alphabetic variant of an existing expression, which is currently usable with 
120: the tabling option. If so, the appropriate
121: variant of the result computed for the existing expression is used for
122: further rewriting instead of starting from scratch. 
123: $LR^2$ also allows a  compact form for storing lists of arithmetic
124: progressions as these occur frequently. Instead of storing the
125: entire list, $LR^2$ stores the initial value, the final value and the
126: difference.  $LR^2$ provides a set of commands so that it can  be called by other systems for symbolic computation. This currently requires the UNIX message  
127: passing mechanism. 
128: 
129: 
130:    $LR^2$ provides a variety of options for controlling
131: the amount of history that is stored by the system, if 
132: the user chooses the tabling option. The default option using \smaran\
133: is to save the results of each rewrite 
134: step in a compact data structure. The language of $LR^2$ allows annotating  specific functions with the keyword ``memo''. This  
135: allows to save all the results of  rewriting expressions that have the
136: specified function symbol at the root. The Delete (history) option in
137: $LR^2$ allows to delete the entire 
138: history of rewrites performed so far except for the given expression and its
139: latest reduct after every  $i^{th}$   rewrite step, where  $i$  can be
140: specified by the user. Another possibility allowed is to delete the
141: entire history except the given expression and the latest reduct as soon as
142: the free memory available to the user drops to a user-specified
143: percentage of the total available memory.     Options can be combined in any way to suit the application. However, the delete option overrides the
144: other options.  
145: 
146:    
147: Operators can also be declared associative (A) or commutative (C) or
148: both (AC) in $LR^2$. However, currently only
149: left-linear rules with A, C, or AC operators can be handled; efficient matching
150: algorithms/heuristics for nonlinear rules with such operators are
151: fairly involved and is currently in the testing
152: phase. It is known that A-matching, C-matching and AC-matching of
153: general patterns are all NP-complete even with some 
154: restrictions on occurrences of variables \cite{rakinfcom}. 
155: 
156: The rest of this paper is organized as follows. In Section 2, we
157: give a description of the main algorithm for \smaran\ ,  
158: in Section 3 we discuss techniques to enhance the efficiency of 
159: $LR^2$,  in Section 4 we
160: mention the techniques that have already been implemented in $LR^2$. We
161: conclude with some promising directions for future research. 
162: We omit here all proofs of the results stated in this paper. 
163: 
164: \section{Smaran's Core Algorithm } \label{ccnaalg}
165: The basic algorithm at the core of  \smaran\ was proposed in \cite{chewalg}
166: for orthogonal rules and later extended in \cite{rakdiss,rakfocs} to non-left-linear
167: rewrite systems under fairly general conditions. The basic algorithm
168: in turn is an extension of the 
169:  well-known congruence closure algorithm (CCA) \cite{dst,no,shostak,kozen}) for ground equations.  
170: 
171: Recall that CCA divides the set of terms
172: into numbered equivalence classes. Membership
173: of a term in an equivalence class is decided by its {\em
174: signature}. The signature of a term $f(t_1,\ldots , 
175: t_n)$ is the tuple  $\langle f~ \#[t_1]~\ldots \#[t_n]
176: \rangle$, where $\#[t_i]$ is the number of the equivalence class
177: containing the signature representing $t_i$. Equivalence class 
178: $C$ {\em represents} term $t$ if $C$ contains a signature representing
179: $t$. CCA operates by merging equivalence class 
180: representing terms whose equivalence follows from the given equations. 
181: 
182: 
183: To extend CCA for normalization we make use of the concept of 
184: a distinguished signature in each equivalence class called the {\it
185: unreduced signature} \cite{chewalg,rakdiss}. Using this signature we
186: construct a distinguished term. It has been shown in
187: \cite{chewalg,rakdiss,rakfocs} that it is enough to examine this  
188: term to select useful rule instances, and that it is sufficient
189: to check this term for irreducibility for rewrite systems satisfying
190: fairly general conditions. If it exists and is irreducible, then the class 
191: containing this term has a normal form. 
192: Thus, instances of left hand sides  represented by reduced
193: signatures, do not lead to any progress towards the  normal form, and
194: any term represented by a reduced signature cannot be in normal
195: form. 
196: Whenever a rule instance $A \rarr B$ can be applied to
197: the distinguished term of a class, $C$, because of a match, the
198: signature  representing $A$ is marked reduced and the class 
199: representing $B$ (if any) is merged with  $C$. If there is no
200: class representing $B$, then a signature  representing $B$ is
201: constructed, inserted into $C$, and marked the
202: unreduced signature of $C$. 
203: 
204: The algorithm starts by constructing the signature of the given term. This signature is then 
205: inserted into a class and marked the unreduced signature of the class.
206: This class number is tracked throughout the process of normalization.
207: Signatures of terms are constructed in the obvious bottom-up way. 
208: We illustrate this algorithm with a
209: small example. Consider the  convergent rewrite system with three rules that are numbered for convenience, 
210: \[
211: 1: fib(x)  \rarr f(x   > 1, x), 2: f(true,x)  \rarr  fib(x-1) +
212: fib(x-2), 3: f(false,x)  \rarr  1, \] that
213: uses built-in arithmetic to define the fibonacci function and 
214: let the input term be $fib(2)$. 
215: 
216: The initial set of classes is: \[ 0:\{2^{*}\}\hspace{0.05in} 1:\{\langle fib~
217: 0 \rangle^{*}\} \] (the symbol `*' indicates unreduced
218: signatures). Now the match procedure  is called to find a match  
219: between the unreduced signature of any class and the left-hand side of
220: any rule. A match occurs between  class 1 and rule 1.
221: Now, the signature representing  the corresponding instance of the
222: right-hand side of rule 1   is constructed, inserted into class 1 and
223: marked as its unreduced signature. Note that here  we do
224: not show the signatures, corresponding to the built-in datatypes, that
225: can be evaluated directly.  At this stage the classes are as follows: 
226: 
227: \centerline{ $0:\{2^{*}\}\hspace{0.05in} 1:\{\langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle^{*}\}
228: \hspace{0.05in}2:\{1^{*}\}\hspace{0.05in}3:\{true^* \}$}
229: 
230: During the next iteration class 1 matches rule 2. The right-hand side
231: instance is $fib(1) + fib(0)$. The signature representing this is
232: created and inserted into class 1 as its new unreduced signature. At
233: the end of the second iteration the new/changed classes 
234: are as follows: 
235: 
236: \centerline{$ 1:\{\langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle, \langle +~ 4~
237: 6 \rangle^{*}\}\hspace{0.05in}4:\{\langle fib~ 2 \rangle^*\} \hspace{0.05in}5:\{0^*\}
238: \hspace{0.05in} 6:\{\langle fib~ 5 \rangle^*\}$}
239: 
240: After two rewrite steps, using rules 1 and 3 respectively, the
241: term $fib(1)$ represented by class 4 reduces to 1, which is represented by
242: class 2. Hence classes 4 and 2 are merged, say into 2, and we have:
243: 
244: \centerline{ $0:\{2^{*}\}\hspace{0.02in} 1:\{\langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle, \langle +~ 2~ 6 \rangle^{*}\}
245: \hspace{0.02in}2:\{\langle fib~ 2 \rangle, \langle f~ 7~ 2 \rangle,
246: 1^{*}\} $ }
247: \centerline{$3:\{true^* \} \hspace{0.02in}5:\{0^*\}
248: \hspace{0.02in} 6:\{\langle fib~ 5 \rangle^*\} \hspace{0.02in} 7:\{false^*\}$}
249: 
250: Note that signatures containing the class number 4 have been
251: updated to contain class number 2. After two more rewrite steps, the
252: term $fib(0)$ represented by class 6 reduces to 1. Hence classes 6
253: and 2 are merged, say into 2, and we get:
254: 
255: \centerline{$0:\{2^{*}\}\hspace{0.02in} 1:\{\langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle, \langle +~ 2~ 2 \rangle^{*}\}
256: \hspace{0.02in}2:\{\langle fib~ 2 \rangle, \langle fib~ 5 \rangle, \langle f~ 7~ 5 \rangle, \langle f~ 7~ 2 \rangle,
257: 1^{*}\}$} 
258: \centerline{$3:\{true^* \} \hspace{0.02in}5:\{0^*\}  \hspace{0.02in}
259:  \hspace{0.02in} 7:\{false^*\}$}
260: 
261: Now, the unreduced signature of class 1 can be evaluated to yield the
262: term 2, which is in class 0. Hence, classes 1 and 0 are merged, say
263: into 0, and we get:
264: 
265: \centerline{$1:\{2^{*}, \langle fib~ 0 \rangle, \langle f~ 3~ 0 \rangle, \langle +~ 2~ 2 \rangle\}
266: \hspace{0.02in}2:\{\langle fib~ 2 \rangle, \langle fib~ 5 \rangle, \langle f~ 7~ 5 \rangle, \langle f~ 7~ 2 \rangle,
267: 1^{*}\}$}
268: \centerline{$3:\{true^* \} \hspace{0.02in}5:\{0^*\}  \hspace{0.02in}
269:  \hspace{0.02in} 7:\{false^*\}$}
270: 
271: No more matches are found. Therefore, the algorithm checks for the
272: existence of a normal form for the given term. Class 1 contains the unreduced
273: signature {\it 2}, which is irreducible. Thus the normal
274: form of $fib(2)$ is 2.  Note that if the normal form 
275: of $fib(fib(2))$ is needed, no more computations are needed 
276: since this term is also represented by the signature $\langle fib~
277: 0 \rangle$ in class 1. It is  simply extracted from class 1.  The
278: normal form of this term is also $2$. On the other hand, an
279: interpreter that does not store history would compute $fib(2)$ twice
280: to normalize this term.
281: This  compact data structure helps exploit the advantages of
282: storing history and can also speed up  normalization. 
283: 
284: \section{When does tabling help?}
285: In this section, we present techniques that analyze the program $P$ and
286: term $t$ to determine whether tabling could  be helpful in  computing
287: the normal form of $t$. Recall that tabling can help in two important
288: ways: 
289: \begin{enumerate}
290: \item It can improve the termination characteristics, i.e., tabled
291: rewriting can halt in cases where untabled would fail to halt because
292: there is no normal form. For example, consider the rule $a \rarr
293: f(a)$. 
294: \item It can improve the efficiency of the computation. Each rule
295: instance is applied at most once and its results  are stored. This is
296: extremely  useful for: (a) problems that recursively solve lots of
297: overlapping subproblems, e.g., dynamic programming problems, (b)
298: proving theorems of the form $A = B$, and determining whether a
299: critical-pair ($A, B$) generated during KB-completion \cite{kb} is
300: trivial. (It is possible that $A$ and $B$ were reduced to a common
301: term albeit at different times, since  this common term is remembered
302: in tabling, it will terminate immediately whereas untabled methods may
303: go on forever.), and (c) incremental computation. 
304: \end{enumerate}
305: 
306: We now develop two sets of necessary conditions that a program $P$
307: must satisfy so that tabling can potentially be useful for $P$, one
308: set for each way in which tabling can help.
309: \subsection{When does tabling improve termination?}
310: First we give necessary conditions on $P$ alone for tabling to help
311: termination. Next, we will analyze both $P$ and $t$ to develop even
312: stronger necessary conditions. 
313: As usual, we partition the constants and function symbols of $P$ into two classes:
314: constructors and defined symbols. 
315: \begin{defn}
316: A symbol is called {\em defined}
317: if it is the outermost symbol of the left-hand side of some rule in
318: $P$,
319: otherwise it is a constructor. 
320: \end{defn}
321: For example, $fib$ is a defined symbol in the program given above,
322: whereas $true$ is a constructor. 
323: Note that constructors include the constants from the predefined types
324: such as bool, float, int, char, etc., (we know that there can be no rules with integer, float, char and
325: bool constants as left-hand sides). 
326: The standard operators over these predefined 
327:  types  are treated as constructor functions in this section.
328: \begin{defn}
329: Let $f$ be a defined symbol of $P$  and let $g$ be any function symbol
330: or constant of $P$. We say
331: that $f$ {\em needs} $g$ if $g$ appears {\em in} the right-hand side of any
332: rule that has $f$ as its defined symbol.
333: \end{defn}
334: 
335: Now we define the {\em needs} graph of $P$ as the directed graph $G= (V,E)$,
336: where $V$ contains a vertex for  each function symbol or constant of
337: $P$ and $E$ contains a directed edge from the vertex for  $f$ to the
338: vertex for $g$ precisely when $f$ needs $g$.  Note that this graph may
339: have self-loops, i.e., cycles with just one edge. 
340: \begin{prop}[Necessary Condition]
341: A necessary condition for tabling to improve termination of the
342: computation of normal form of $t$ with respect to $P$ is the existence
343: of a cycle in the needs graph of $P$. 
344: \end{prop}
345: This condition can be checked in time that is linear in the sum of the
346: lengths of the right-hand sides of $P$ using a standard search
347: algorithm. We only introduce  vertices in the graph that are
348: necessary, i.e., they represent either defined symbols with at least
349: one edge or a symbol that is adjacent to a defined symbol. The reader
350: may wonder if it is possible to have a defined symbol with no
351: originating edge. This happens when all the rules defining this symbol
352: are  collapsing, i.e., the
353: right-hand side is just a variable. 
354: 
355: Clearly, the above condition is not sufficient for two different
356: kinds of reasons: (i) for instance,
357: $t$ could be a normal form and $P$ can have many cycles, but obviously
358: tabling is of no help, (ii) $P$ may have cycles but still represent a
359: terminating system (e.g., the fibonacci system above). Trying to
360: decide termination of $P$ is in general undecidable and even when it is
361: decidable it can be prohibitively expensive.  
362: Therefore, we only try to ameliorate the first source of insufficiency
363: by  analyzing $t$ as well.  
364: \begin{prop}[Necessary Condition]
365: A necessary condition for tabling to improve termination of the
366: computation of normal form of $t$ with respect to $P$ is the existence
367: of a cycle in the needs graph of $P$  containing a vertex that is
368: reachable by a directed path from a vertex representing a function
369: symbol or constant that appears also in $t$.  
370: \end{prop}
371: This condition can be checked by marking the vertices that appear on
372: some cycle in the needs graph and then running a modified search
373: algorithm that looks for marked vertices. 
374: 
375: The needs graph defined above is a refinement of the call graph from
376: classical compilation techniques, where procedure parameters are also
377: taken into account. Variations of these graphs have appeared in
378: other  contexts in rewriting (e.g., for narrowing
379: a related but different graph is used in \cite{dershosiva}),
380: functional and logic programming (e.g., for termination). However,
381: generally speaking, there are differences between such graphs in
382: existing literature (that we are aware of) and our approach. 
383: Many of the  applications here appear to be new as well. 
384: \subsection{When does tabling help efficiency?}
385: Now we focus on the second benefit of tabling, i.e., the possibility
386: that the computation of the normal form of $t$ would need the same
387: reduction step more than once. Since the tabling approach here 
388: shares common subexpressions to the extent possible, repeated subterms
389: in $t$ or its reducts do not necessarily lead to repeated reduction
390: steps.  
391: \begin{prop}[Necessary Condition]
392: A necessary condition for tabling to improve the efficiency of the
393: sequential computation of normal form of $t$ with respect to $P$ is
394: the existence of a node with in-degree at least one and outdegree at
395: least one in the needs  graph of $P$.  
396: \end{prop}
397: This condition is easily checked in time that is linear in the sum of
398: right-hand sides of $P$. Again, this condition is not sufficient for
399: two kinds of reasons. Informally, they are: (i)  $t$ has no ``need''
400: for this vertex, or (ii) $P$ does not ``allow'' to use this vertex from
401: different terms.  Again, it is quite complex to decide (ii). Hence, we
402: only try to ameliorate the first by analyzing $t$. 
403: \begin{prop}[Necessary Condition]
404: A necessary condition for tabling to improve the efficiency of the
405: sequential computation of normal form of $t$ with respect to $P$ is
406: the existence of a node with in-degree more than one and outdegree at
407: least one in the needs  graph of $P$ that is reachable via 
408: directed paths from two different symbols of $t$.  
409: \end{prop}
410: To check this condition we can use a standard search algorithm. 
411: Note that we are looking for a very ``short'' chain in the needs graph,
412: i.e., we are looking for the possibility that just one reduction step
413: can be saved. The length of the desired chain can be adjusted based on
414: the ratio of the time it takes to do a reduction step versus the time
415: it takes to lookup the results of a reduction step in the table, which
416: grows with the table size. 
417: 
418: \subsection{\bf Techniques for Efficient Reduction and Tabling}
419: For technical reasons, the standard operators over the predefined 
420:  types  are not treated as constructor functions in this section, 
421:  even though we do not
422: have any rules in the system for them. 
423: \subsubsection{Optimizing Reduction}
424: We define the following class of signatures called the 
425: don't-reduce  signatures that represent a subclass of normal forms
426: that are formed from only constructors, or constructor terms. 
427: 
428: \begin{defn}
429: (a) Base case: All constructor constants are don't-reduce signatures. 
430: (b) Inductive case: If classes $c_1,...,c_n$  contain only
431: don't-reduce signatures {\em as unreduced signatures}, 
432: then $f(c_1,...,c_n)$ is also a don't-reduce signature if $f$ is a
433: constructor.   
434: \end{defn}
435: 
436: Now note that no rule can match a don't-reduce signature.
437: This means that when we do leftmost-outermost reduction and we reach a class
438: that contains an unreduced signature which is a don't-reduce signature, then 
439: we need not go down (below this class) recursively to find any
440: matches.  Actually, if a class contains a normal form, then, of
441: course,  we need not explore the structure below to find any
442: matches. However, to detect normal forms one must match rules against
443: the unreduced signature of a class, whereas the detection of
444: don't-reduce signatures is easier since it can be done in a bottom-up
445: manner without any matching.  Moreover, our situation is even more
446: complicated with respect to detection of normal forms, since we have
447: built-in arithmetic and other types. It is easy to construct a
448: scenario where  no rules match an unreduced signature $s$ because $s$
449: depends on a class which contains an unevaluated arithmetic signature
450: as unreduced signature and once it is evaluated rules may or may not
451: match. Although we state this in terms of signatures here, it is clear
452: that this idea applies to the untabled approach as well by marking
453: subterms of terms that are constructor terms. 
454: \subsubsection{Optimizing Matching}
455: By analyzing the term $t$ together with the program $P$ we can
456: eliminate unnecessary rules, i.e. those rules that are never needed in
457: the computation of the normal form of $t$. For this we find all the
458: defined symbols of $P$ that are reachable from all the symbols in $t$ 
459: using a standard search algorithm on the needs graph of $P$. We then
460: include only rules for the reachable defined symbols in building
461: efficient data structures for matching, e.g., discrimination nets. In
462: compilation this corresponds to not generating code for the
463: unnecessary rules and could speed up the compilation process
464: itself (of course, the set of terms to be normalized must be known in
465: advance and specified before compilation). The generated code uses
466: less space.   
467: \subsubsection{Optimizing dependency lists in Tabling}
468: To identify the signatures that must be updated in the event a class
469: is unioned into another class, a dependency list is usually associated
470: with each class. It
471: contains all the signatures that ``depend'' on this class. Our next
472: optimizations are intended to reduce the size of these lists, since
473: processing them can get quite expensive. 
474: 
475: For this purpose, we  define a class of signatures called the don't-add signatures,
476: which is a subclass of the class of don't-reduce signatures. 
477: \begin{defn}
478: (a)  Base case: All constructor constants are don't-add signatures.
479: (b) Inductive case: If classes $c_1,...,c_n$  contain only
480: don't-add signatures, then $f(c_1,...,c_n)$ is also a don't-add
481: signature if $f$ is a constructor. 
482: \end{defn}
483: Note the difference in case (b) between don't-add vs. don't-reduce signatures.
484: All constants are, of course, never added to any dependency
485: list since they do not depend any class. We need (a) for the inductive
486: step. 
487: 
488: Now we have the following rule for don't-add signatures: do not add a 
489: don't-add signature to dependency lists. However, there is a practical
490: difficulty in  implementing this rule, since it is quite possible that 
491: $f(c_1,..., c_n)$ is a don't-add signature, but because of the insertion
492: of a signature into class $c_i$, class $c_i$ contains a signature
493: which is an ``add'' signature and hence $f(c_1,...,c_n)$ is also now an
494: ``add'' signature. It may be possible to do this efficiently, but it
495: will require time and also the programming may be somewhat complex. 
496: 
497: Therefore, we identify a subclass of don't-add signatures, called
498: never-add signatures. 
499: \subsubsection{Never-add signatures}
500: In the following we assume that the rewrite system is also
501: non-collapsing, i.e., does not have a rule whose right-hand side is a
502: variable. 
503: 
504: While we are parsing the rules, we can 
505: also look for: (i) if any constructor constant is the right-hand side
506: of some rule (note "is" the rhs and not "is in" the rhs)
507: and (ii) if any standard operators over the predefined types  appear in the
508: right-hand side of some rule. 
509: 
510: Now, we define the class of never-add signatures.
511: 
512: \begin{defn}
513: (a) Base case: Every user-defined constructor constant that does not
514: appear as the rhs of any rule is a never-add constant.  If $t$ is  
515: a constant from one of predefined  types in (ii) above,   then $t$ is a
516: never-add constant only if 
517: there is no rule containing an operator that takes an
518: argument of the type of $t$ in the right-hand-side and the term to be
519: normalised also does not contain any operator that takes an argument
520: of type t. (b) Inductive case: If classes $c_1, \ldots, c_n$ contain only
521: never-add signatures and $f$ is a constructor that does not
522: appear as the outermost symbol of the right-hand side of any rule,
523: then  $f(c_1,\ldots,c_n)$ is a never-add signature. 
524: \end{defn}
525: 
526: Never-add signatures are a proper subclass of don't-add signatures,
527: which are easily identified in a bottom-up manner and they can never
528: change on the union of two classes or on the insertion of a signature
529: in a class containing a never-add signature. So in \smaran\, we never
530: add a never-add signature to the dependency list of any class. 
531: 
532: If the rewrite system can contain collapsing rules, then it is not
533: sufficient to check that no constructor constant and no
534: constructor function appears as the outermost symbol of any
535: rule. Consider, the following rewrite system for example:
536: \begin{eqnarray}
537: from(x,y) & \rarr & if(y>0,cons(x,from(x+1,y-1)),nil) \\
538: if(true,x,y) & \rarr & x \\
539: if(false,x,y) & \rarr & y
540: \end{eqnarray}
541: 
542: In this system both nil and cons do not appear as the outermost symbol
543: of any rule but there are terms $T$ that reduce to terms of the form nil
544: and cons(...). Hence, it is possible that because of a union one of
545: the terms from $T$ changes and so on.
546: In fact, it is in general undecidable whether given a term $t$ and a
547: rewrite system $R$ whether $t$ reduces to a constructor term. 
548: 
549: \section{Implementation in $LR^2$}
550: Some of the techniques discussed above have been implemented in $LR^2$
551: and have proven their effectiveness. The techniques that have been
552: implemented are  the identification
553: of don't-reduce and never-add signatures. We are currently in the
554: process of implementing the needs-graph analysis to help the user with
555: choosing the tabling or the non-tabling option in $LR^2$. 
556: \section{Discussion and Future Work}
557: In this paper, we have presented some easily computed static analysis
558: techniques to enhance the efficiency of tabled and untabled
559: computations with equational logic programs. These techniques have
560: implications for interpreters (and  sometimes compilers) and some are 
561: applicable in a wider context, e.g., functional or logic
562: programming. The analysis described here is a first approximation since it
563: ignores the arguments of functions symbols. A useful direction for
564: future research is to use constraint-based analysis with the needs
565: graph to take into account the arguments as well. Care must be taken
566: to ensure that the benefits of the analyses outweigh the costs. 
567: Other avenues are transformation of the program including partial
568: evaluation.  
569: \bibliographystyle{plain}
570: \bibliography{/auto/disk01/faculty/cosc/rmverma/latex/space}
571: \end{document}
572: 
573: 
574: 
575: 
576: 
577: 
578: 
579: 
580: