cs0608104/hra.tex
1: \documentclass{acmtrans2e}
2: \usepackage{epsf}
3: \usepackage{pst-rel-points}
4: \usepackage{colortab}
5: \usepackage{colordvi}
6: \usepackage{ifthen}
7: \usepackage{longtable}
8: \usepackage[normalem]{ulem}
9: \usepackage{amsmath}
10: \usepackage{amssymb}
11: \usepackage{amsfonts}
12: \usepackage{mathptm}
13: \usepackage{graphpap}
14: \usepackage{color,pstcol}
15: \usepackage{psboxit}
16: \usepackage{rotating}
17: \usepackage{pstcol}
18: \usepackage{pst-grad}
19: \usepackage{pstricks}
20: \usepackage{pst-text}
21: \usepackage{pst-node}
22: \usepackage{pst-coil}
23: \usepackage{epsfig}
24: \usepackage{url}
25: 
26: \input{def.tex}
27: 
28: \bibliographystyle{acmtrans}
29: 
30: \markboth{Uday Khedker et al.}{Heap Reference Analysis Using Access Graphs} 
31: \title{Heap Reference Analysis Using Access Graphs} 
32: \author{UDAY P. KHEDKER, 
33: 	AMITABHA SANYAL and 
34: 	AMEY KARKARE \\
35: 	Department of Computer Science \& Engg., IIT Bombay.}
36: 
37: \begin{abstract}
38: Despite significant progress in the theory and practice of program
39: analysis, analyzing properties of heap data has not reached the same
40: level of maturity as the analysis of static and stack data.  The
41: spatial and temporal structure of stack and static data is well
42: understood while that of heap data seems arbitrary and is unbounded.
43: We devise bounded representations which summarize properties of the
44: heap data. This summarization is based on the structure of the program
45: which manipulates the heap.  The resulting summary representations are
46: certain kinds of graphs called {\em access graphs}. The boundedness of
47: these representations and the monotonicity of the operations to
48: manipulate them make it possible to compute them through data flow
49: analysis.
50: 
51: An important  application which benefits from  heap reference analysis
52: is  garbage  collection, where  currently  liveness is  conservatively
53: approximated by reachability from program variables. As a consequence,
54: current garbage collectors leave a  lot of garbage uncollected, a fact
55: which has been confirmed by several empirical studies.  We propose the
56: first ever end-to-end static analysis to distinguish live objects from
57: reachable  objects.  We  use  this information  to  make dead  objects
58: unreachable by modifying the  program. This application is interesting
59: because  it requires  discovering data  flow  information representing
60: complex  semantics.  In particular,  we  formulate  the following  new
61: analyses for heap data: liveness, availability, and anticipability and
62: propose  solution  methods for  them.   Together,  they cover  various
63: combinations of directions of analysis (i.e. forward and backward) and
64: confluence of information (i.e. union and intersection).
65: %
66: Our analysis can also be used for plugging memory leaks in 
67: C/C++ programs.
68: \end{abstract}
69: %
70: \category{D.3.4}{Programming Languages}{Processors}[Memory management
71:   (garbage collection) \and Optimization]
72: %
73: \category{F.3.2}{Logics and Meanings Of Programs}{Semantics of
74:   Programming Languages}[Program analysis]
75: %
76: \terms{Algorithms, Languages, Theory}
77: %
78: \keywords{Aliasing, Data Flow Analysis, Heap References, Liveness}
79: 
80: \boldmath
81: 
82: \begin{document}
83: \begin{bottomstuff}
84: \end{bottomstuff}
85: 
86: \maketitle
87: 
88: 
89: 
90: %\tableofcontents
91: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
92: \section{Introduction}
93: \label{sec:intro}
94: 
95: Conceptually, data in a program is allocated in either the static data
96: area, stack, or heap.  Despite significant progress in the theory and
97: practice of program analysis, analyzing the properties of heap data
98: has not reached the same level of maturity as the analysis of static
99: and stack data.  Section~\ref{sec:back} investigates possible reasons.
100: 
101: In order to facilitate a systematic analysis, we devise bounded
102: representations which summarize properties of the heap data. This
103: summarization is based on the structure of the program which
104: manipulates the heap.  The resulting summary representations are
105: certain kinds of graphs, called {\em access graphs}, which are obtained
106: through data flow analysis. We believe that our technique of
107: summarization is general enough to be also used in contexts other than
108: heap reference analysis.
109: 
110: 
111: \subsection{Improving Garbage Collection through Heap Reference Analysis}
112: 
113: An important application which benefits from heap reference analysis
114: is garbage collection, where liveness of heap data is conservatively
115: approximated by reachability. This amounts to approximating the future
116: of an execution with its past.  Since current garbage collectors
117: cannot distinguish live data from data that is reachable but not live,
118: they leave a lot of garbage uncollected.  This has been confirmed by
119: empirical
120: studies~\cite{hirz02,Hirzel.liveness.02,shah00,shah01,shah02} which
121: show that a large number (24\% to 76\%) of heap objects  which are
122: reachable at a program point are actually not accessed beyond that
123: point.  In order to collect such objects, we perform static analyses
124: to make dead objects unreachable by setting appropriate references to
125: \NULL. The idea that doing so would facilitate better garbage
126: collection is well known as ``Cedar Mesa Folk Wisdom''~\cite{gcfaq}.
127: Empirical attempts at achieving this have been
128: reported in~\cite{shah01,shah02}.
129: 
130: Garbage collection is an interesting application for us because it
131: requires discovering data flow information representing complex
132: semantics.  In particular, we need to discover four properties of heap
133: references: liveness, aliasing, availability, and anticipability.
134: Liveness captures references that may be used beyond the program point
135: under consideration. Only the references that are not live can be
136: considered for \NULL\ assignments.  Safety of \NULL\ assignments
137: further requires (a) discovering all possible ways of accessing a
138: given heap memory cell (aliasing), and (b) ensuring that the reference
139: being nullified is accessible (availability and anticipability).
140: 
141: 
142: \begin{figure}[t]
143: \rule{\textwidth}{.2mm}
144: \begin{tabular}{@{}c@{}|@{}c}
145: \begin{tabular}{@{}c@{}@{}}
146: {
147: $\begin{array}{ll}
148: \\
149:     1. & w = x 
150:     {\parbox{1.75in}{\raggedleft  // $x$ points to $m_a$ }}\\
151:     2. & \mbox{\em while } (x.getdata() < max) \\
152:        & \{ \\
153:     3. & \;\;\;\;\;\;\;\; x = x.\rptr  \\
154:        & \} \\
155:     4. & y = x.\lptr  \\ 
156:     5. & z = \mbox{\em New } \mbox{\em class\_of\_z}
157:     \parbox{1.4in}{\raggedleft  // Possible GC Point} \\
158:     6. & y = y.\lptr  \\
159:     7. & z.sum = x.{\lptr}.getdata() + y.getdata()   \\ \\
160:   \end{array}$
161: }
162: \\
163: {(a) A Program Fragment}
164: \\ \hline
165: \multicolumn{1}{@{}l@{}}{
166: \begin{pspicture}(-6.6,-6)(-.5,-.9)
167: \psset{xunit=.6cm}
168: \psset{yunit=.6cm}
169: {
170: \psline[linestyle=dashed](-9.75,-7.5)(-9.75,-1.75)
171: \psline(-9.25,-7.2)(-7,-7.2) \rput(-6.25,-7.2){Heap} \psline{->}(-5.5,-7.2)(-3,-7.2)
172: \rput(-10.5,-7.2){Stack} %\psline{->}(.25,.25)(-.25,.25)
173: \rput(-10.5,-2.5){\rnode{n0}{\psframebox{$z$}}}
174: \rput(-10.5,-3.75){\rnode{n0}{\psframebox{$x$}}}
175: \rput(-10.5,-4.75){\rnode{w}{\psframebox[framesep=.09]{$w$}}}
176: \rput(-9,-4.5){\rnode{n1}{\pscirclebox[fillcolor=lightgray,framesep=0]{$m_a$}}}
177: \ncline{->}{w}{n1}
178: \rput(-10.5,-6){\rnode{n00}{\psframebox{$y$}}}
179: \rput(-9,-2.5){\rnode{n11}{\pscirclebox[framesep=0]{$m_k$}}}
180: \rput(-7.5,-3.75){\rnode{n2}{\pscirclebox[fillcolor=lightgray,framesep=0]{$m_b$}}}
181: \nccurve[angleA=0,angleB=170,linewidth=.6mm,linestyle=dashed,dash=1mm .6mm]{->}{n0}{n2}
182: \nccurve[angleA=0,angleB=150,linewidth=.6mm,linestyle=dashed,
183: 	dash=1mm .5mm]{->}{n0}{n1}
184: \ncline{->}{n1}{n2}
185: \aput[0]{:U}{$\rptr$}
186: \rput(-6,-3){\rnode{n3}{\pscirclebox[framesep=0]{$m_c$}}}
187: \ncline{->}{n2}{n3}
188: \aput[0]{:U}{$\rptr$}
189: \nccurve[angleA=2,angleB=170,linewidth=.6mm,linestyle=dashed,
190: 	dash=1mm .6mm]{->}{n0}{n3}
191: \rput(-4.5,-2.25){\rnode{n4}{\pscirclebox[framesep=0]{$m_d$}}}
192: \ncline{->}{n3}{n4}
193: \aput[0]{:U}{$\rptr$}
194: \nccurve[angleA=10,angleB=170,linewidth=.6mm,linestyle=dashed,
195: 	dash=1mm .6mm]{->}{n0}{n4}
196: \rput(-3,-3){\rnode{n5}{\pscirclebox[framesep=0]{$m_e$}}}
197: \ncline[linewidth=.7mm]{->}{n4}{n5}
198: \aput[0]{:U}{$\lptr$}
199: \rput(-6,-4.5){\rnode{n6}{\pscirclebox[fillcolor=lightgray,framesep=0]{$m_f$}}}
200: \ncline[linewidth=.7mm]{->}{n2}{n6}
201: \aput[0]{:U}{$\lptr$}
202: \rput(-4.5,-3.75){\rnode{n7}{\pscirclebox[framesep=0]{$m_g$}}}
203: \ncline[linewidth=.7mm]{->}{n3}{n7}
204: \aput[0]{:U}{$\lptr$}
205: \rput(-4.5,-5.15){\rnode{n8}{\pscirclebox[fillcolor=lightgray,framesep=0]{$m_h$}}}
206: \ncline[linewidth=.7mm]{->}{n6}{n8}
207: \aput[0]{:U}{$\lptr$}
208: \rput(-7.5,-5.25){\rnode{n9}{\pscirclebox[fillcolor=lightgray,framesep=0]{$m_i$}}}
209: \ncline[linewidth=.7mm]{->}{n1}{n9}
210: \aput[0]{:U}{$\lptr$}
211: \rput(-5.75,-6){\rnode{n10}{\pscirclebox[framesep=0]{$m_j$}}}
212: \ncline[linewidth=.7mm]{->}{n9}{n10}
213: \aput[0]{:U}(.3){$\lptr$}
214: \nccurve[angleA=0,angleB=240,linewidth=.6mm,linestyle=dashed,dash=1mm .6mm,nodesepB=-.08]{->}{n00}{n6}
215: \nccurve[angleA=0,angleB=200,linewidth=.6mm,linestyle=dashed,dash=1mm .6mm]{->}{n00}{n9}
216: \nccurve[angleA=-10,angleB=240,linewidth=.6mm,linestyle=dashed,dash=1mm .6mm,nodesepB=-.08]{->}{n00}{n7}
217: \nccurve[angleA=-20,angleB=250,linewidth=.6mm,linestyle=dashed,dash=1mm .6mm,ncurv=.95]{->}{n00}{n5}
218: \rput(-2.25,-2.65){\rnode{n51}{}}
219: \ncline{->}{n5}{n51}
220: \rput(-1.65,-3.75){\rnode{n51}{\pscirclebox[framesep=0]{$m_l$}}}
221: \ncline[linewidth=.7mm]{->}{n5}{n51}
222: \aput[0]{:U}(.5){$\lptr$} % By karkare
223: 
224: \rput(-2.75,-4.5){\rnode{n51}{\pscirclebox[framesep=0]{$m_m$}}}
225: \ncline[linewidth=.7mm]{->}{n7}{n51}
226: \aput[0]{:U}(.5){$\lptr$} % By karkare
227: 
228: \rput(-3.75,-3.4){\rnode{n51}{}}
229: \ncline{->}{n7}{n51}
230: \rput(-3.75,-5.6){\rnode{n51}{}}
231: \ncline{->}{n8}{n51}
232: \rput(-3.75,-4.9){\rnode{n51}{}}
233: \ncline{->}{n8}{n51}
234: \rput(-4.85,-6.35){\rnode{n51}{}}
235: \ncline{->}{n10}{n51}
236: \rput(-4.85,-5.65){\rnode{n51}{}}
237: \ncline{->}{n10}{n51}
238: \rput(-8.25,-2.85){\rnode{n51}{}}
239: \ncline{->}{n11}{n51}
240: \rput(-8.25,-2.15){\rnode{n51}{}}
241: \ncline{->}{n11}{n51}
242: \rput[l](-11.,-9.){(b) \parbox[t]{2.2in}{
243: Superimposition of memory graphs before line 5.  Dashed arrows capture
244: the effect of different iterations of the {\em while} loop.  All thick
245: arrows (both dashed and solid) are live links.}}  }
246: \end{pspicture}
247: }
248: \end{tabular}
249: &
250: \renewcommand{\arraystretch}{1.1}
251: \begin{tabular}{@{}ll@{}}
252: \\
253: \LCC & \lightgray\\
254:    & {$y = z = \NULL$}\\
255: \ECC 
256: 1. & $w = x $\\
257: \LCC & \lightgray\\
258:  & {$w = \NULL$}\\
259: \ECC 
260: 2. & {\em while } $(x.getdata() < max)$\\
261: \LCC & \lightgray\\
262:    & \{$\;\;\;\;\;\; 
263:       x.\lptr = \NULL$\\
264: \ECC
265: 3. & $\;\;\;\;\;\;\;\;\, x = x.\rptr$\\
266: & \} \\
267: \LCC & \lightgray\\
268:   & {$x.\rptr = x.\lptr.\rptr = \NULL$}\\
269:   %x.\lptr.\lptr.\rptr =  \NULL$}\\
270: %   & {$x.\lptr.\rptr = \NULL$}\\
271:    & {$x.\lptr.\lptr.\lptr = \NULL$}\\
272:    & {$x.\lptr.\lptr.\rptr = \NULL$}\\
273: \ECC
274: 4. & $y = x.\lptr$\\ 
275: \LCC & \lightgray\\
276:    & {$y.\rptr = y.\lptr.\lptr = y.\lptr.\rptr = \NULL$}\\
277: \ECC
278: 5. & $z$ = {\em New } {\em class\_of\_z}\\
279: \LCC & \lightgray\\
280:    & {$z.\lptr = z.\rptr = \NULL$}\\
281: \ECC
282: 6. & $y = y.\lptr$\\
283: \LCC & \lightgray\\
284:    & $x.\lptr.\lptr = y.\lptr = y.\rptr = \NULL$\\
285: \ECC
286: 7. & $z.sum = x.\lptr.getdata() + y.getdata()$\\
287: \LCC & \lightgray\\
288:    & {$x = y = z = \NULL$}\\
289: \ECC
290: \\
291: \multicolumn{2}{@{}c@{}}{(c) \parbox[t]{2.25in}{The
292:     modified program. Highlighted statements indicate the \NULL\
293:     assignments inserted in the program using our method. (More details
294:     in Section~\ref{sec:nullability})}}
295: \end{tabular}
296: \end{tabular}
297: \caption{A motivating example.}
298: \label{fig:memory.graph}
299: \rule{\textwidth}{.2mm}
300: \end{figure}
301: 
302: For simplicity of exposition, we present our method using a memory model
303: similar to that of Java.  {Extensions required for handling the C/C++ model of
304: heap usage are easy and are explained in Section~\ref{sec:c++ext}.}
305: %
306: We assume that
307: root variable references are on the stack and the actual objects
308: corresponding to the root variables are in the heap.  In the rest of
309: the paper we ignore non-reference variables.  We view the heap at a
310: program point as a directed graph called {\em memory graph}. Root
311: variables form the entry nodes of a memory graph.  Other nodes in the
312: graph correspond to objects on the heap and edges correspond to
313: references. The out-edges of entry nodes are labeled by root variable
314: names while out-edges of other nodes are labeled by field names. The
315: edges in the memory graph are  called {\em links}.
316: 
317: \begin{example}
318: \label{exmp:motivation}
319: Figure~\ref{fig:memory.graph} shows a program fragment and its memory
320: graphs before line 5. Depending upon the number of times the {\em
321: while} loop is executed $x$ points to $m_a$, $m_b$, $m_c$ etc.
322: Correspondingly, $y$ points to $m_i$, $m_f$, $m_g$ etc.  The call to
323: {\em New\/} on line 5 may require garbage collection.  A conventional
324: copying collector will preserve all nodes except $m_{k}$. However,
325: only a few of them are used beyond line 5.
326: 
327: The modified program is evidence of the strength of our approach.
328: It makes the unused nodes unreachable by nullifying relevant links.
329: The modifications in the program are general enough to nullify
330: appropriate links for any number of iterations of the loop.  Observe
331: that a \NULL\ assignment has also been inserted within the loop body
332: thereby making some memory unreachable in each iteration of the loop.
333: \mybox
334: \end{example}
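
The effect described in the example can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
dictionary encoding of the memory graph and the helper names
\texttt{reachable} and \texttt{nullify} are ours, only the nodes named in
Figure~\ref{fig:memory.graph}(b) are modeled, and the state is the one
before line 5 when the {\em while\/} loop body is never executed.

```python
# Memory graph of the motivating figure, zero loop iterations, before line 5.
# Node and field names follow the figure; the encoding is an assumption.
heap = {
    "m_a": {"lptr": "m_i", "rptr": "m_b"},
    "m_b": {"lptr": "m_f", "rptr": "m_c"},
    "m_c": {"lptr": "m_g", "rptr": "m_d"},
    "m_d": {"lptr": "m_e"},
    "m_e": {"lptr": "m_l"},
    "m_f": {"lptr": "m_h"},
    "m_g": {"lptr": "m_m"},
    "m_i": {"lptr": "m_j"},
    "m_h": {}, "m_j": {}, "m_k": {}, "m_l": {}, "m_m": {},
}
roots = {"w": "m_a", "x": "m_a", "y": "m_i", "z": None}


def reachable(roots, heap):
    """Return the set of heap nodes reachable from the root variables."""
    seen = set()
    stack = [n for n in roots.values() if n is not None]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(t for t in heap[node].values() if t is not None)
    return seen


def nullify(roots, heap, path):
    """Set the link named by a path such as ("x", "lptr", "rptr") to None.

    Silently does nothing if a prefix of the path is already None; this
    guard is a simplification of the accessibility checks the text mentions.
    """
    var, *fields = path
    if not fields:
        roots[var] = None
        return
    node = roots[var]
    for field in fields[:-1]:
        if node is None:
            return
        node = heap[node].get(field)
    if node is not None:
        heap[node][fields[-1]] = None


# What a reachability-based collector preserves at line 5.
before = reachable(roots, heap)

# NULL assignments of the modified program that execute before line 5
# when the loop body is skipped.
for path in [("w",),
             ("x", "rptr"), ("x", "lptr", "rptr"),
             ("x", "lptr", "lptr", "lptr"), ("x", "lptr", "lptr", "rptr"),
             ("y", "rptr"), ("y", "lptr", "lptr"), ("y", "lptr", "rptr")]:
    nullify(roots, heap, path)

after = reachable(roots, heap)
print(sorted(before))
print(sorted(after))
```

In this sketch every modeled node except $m_k$ is reachable before the
\NULL\ assignments, and only the nodes actually used from line 5 onwards
remain reachable after them.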
335: 
336: After such modifications, a garbage collector will collect a lot more
337: garbage. Further, since copying collectors process only live data,
338: garbage collection by such collectors will be faster. Both these
339: facts are corroborated by our empirical measurements
340: (Section~\ref{sec:measurements}).
341: 
342: In the context of C/C++, instead of setting the references to \NULL,
343: allocated memory will have to be explicitly deallocated after checking
344: that no alias is live.
345: 
346: \subsection{Difficulties in Analyzing Heap Data}
347: \label{sec:back}
348: 
349: 
350: A program accesses data through expressions which have l-values and
351: hence are called {\em access expressions}. They can be scalar
352: variables such as $x$, or may involve an array access such as
353: $a[2*i]$, or can be a reference expression such as $x.l.r$.
354: 
355: An important question that any program analysis has to answer is: {\em
356: Can an access expression $\alpha_1$ at program point~$p_1$ have the
357: same l-value as $\alpha_2$ at program point~$p_2$?}  Note that the
358: access expressions {or} program points could be identical.  The
359: precision of the analysis depends on the precision of the answer to
360: the above question.
361: 
362: When the access expressions are simple and correspond to scalar data,
363: answering the above question is often easy because the mapping of
364: access expressions to l-values remains fixed in a given scope
365: throughout the execution of a program.  However, in the case of array
366: or reference expressions, the mapping between an access expression and
367: its l-value is likely to change during execution. From now on, we
368: shall limit our attention to reference expressions, since these are
369: the expressions that are primarily used to access the heap.  Observe that
370: manipulation of the heap is nothing but changing the mapping between
371: reference expressions and their l-values. For example, in
372: Figure~\ref{fig:memory.graph}, access expression $x.\lptr$ refers to
373: $m_i$ when the execution reaches line number 2 and may refer to $m_i$,
374: $m_f$, $m_g$, or $m_e$ at line 4.
375: 
376: This implies that, subject to type compatibility, any access
377: expression can correspond to any heap data, making it difficult to
378: answer the question mentioned above.  The problem is compounded
379: because the program may contain loops implying that the same access
380: expression appearing at the same program point may refer to different
381: l-values at different points of time.  Besides, the heap data may
382: contain cycles, causing an infinite number of access expressions to
383: refer to the same l-value.  All these make analysis of programs
384: involving heaps difficult.
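
The remark about cycles can be made concrete with a small sketch. The
two-node cyclic heap, the field name \texttt{next}, and the helper
\texttt{paths\_to} below are hypothetical and chosen only for illustration;
the point is that the number of access paths reaching the same node keeps
growing as longer paths are allowed.

```python
# Hypothetical heap with a cycle: n1 --next--> n2 --next--> n1.
roots = {"x": "n1"}
heap = {"n1": {"next": "n2"}, "n2": {"next": "n1"}}


def paths_to(roots, heap, goal, max_fields):
    """List access paths with at most max_fields field names that end at goal."""
    found = []
    frontier = [((var,), node) for var, node in roots.items()]
    for _ in range(max_fields + 1):
        next_frontier = []
        for path, node in frontier:
            if node == goal:
                found.append(path)
            for field, succ in heap[node].items():
                next_frontier.append((path + (field,), succ))
        frontier = next_frontier
    return found


# Every even number of "next" fields leads back to n1, so raising the
# bound always yields more paths to the same object.
for bound in (4, 10):
    print(bound, len(paths_to(roots, heap, "n1", bound)))
```

No finite enumeration of access paths is complete for such a heap, which
is one reason a bounded summary representation is needed.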
385: 
386: 
387: \subsection{Contributions of This Paper}
388: 
389: The contributions of this paper fall into the following two categories:
390: \begin{itemize}
391: \item {\em  Contributions in Data  Flow Analysis.}  We present  a data
392: flow framework in which the data flow values represent abstractions of
393: heap.   An interesting  aspect  of our  method  is the  way we  obtain
394: bounded representations  of the properties  by using the  structure of
395: the  program which  manipulates the  heap.  As  a consequence  of this
396: summarization,  the  values  of  data flow  information  constitute  a
397: complete  lattice  with  finite  height. Further,  we  have  carefully
398: identified a set of monotonic  operations to manipulate this data flow
399: information.  Hence, the standard results of data flow analysis can be
400: extended to  heap reference  analysis. Due to  the generality  of this
401: approach, it can be applied to other analyses as well.
402: 
403: \item {\em Contributions in Heap Data Analysis.}  We propose the first
404: ever end-to-end solution (in the intraprocedural context) for
405: statically discovering heap references which can be made \NULL\ to
406: improve garbage collection. The only approach which comes close to our
407: approach is the {\em heap safety automaton\/} based
408: approach~\cite{ran.shaham-sas03}.  However, our approach is superior
409: to their approach in terms of completeness, effectiveness, and
410: efficiency (details in Section~\ref{sec:ran.comparison}).
411: \end{itemize}
412: 
413: The concept which unifies the contributions is the
414: summarization of heap properties which uses the fact that {\em the heap
415: manipulations consist of repeating patterns which bear a close
416: resemblance to the program structure.}  Our approach to summarization
417: is more natural and more precise than other approaches because it does
418: not depend on an a-priori
419: bound~\cite{DBLP:conf/popl/JonesM79,jones82flexible,LarusH1988,ChaseWegZad90}.
420: 
421: \subsection{Organization of the Paper}
422: 
423: The rest of the paper is organized as follows.
424: %
425: Section~\ref{sec:liveness} defines the concept of explicit liveness
426: of heap objects and formulates a data flow analysis
427: by using access graphs as data flow values.
428: %
429: Section~\ref{sec:other.analyses} defines other properties required for ensuring
430: safety of \NULL\ assignment insertion.
431: %
432: Section~\ref{sec:nullability} explains how \NULL\ assignments are
433: inserted.
434: %
435: Section~\ref{sec:termination} discusses convergence and complexity
436: issues.
437: %
438: Section~\ref{sec:soundness} shows the soundness of our approach.
439: %
440: Section~\ref{sec:measurements} presents empirical results.
441: %
442: %
443: Section~\ref{sec:c++ext} extends the approach to C++. 
444: Section~\ref{sec:related} reviews related work while
445: Section~\ref{sec:conclusions} concludes the paper.
446: 
447: 
448: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
449: \section{Explicit Liveness Analysis of Heap References}
450: \label{sec:liveness}
451: 
452: Our method discovers live links at each program point, i.e., links
453: which may be used in the program beyond the point under consideration.
454: Links which are not live can be set to \NULL. 
455: This section describes the liveness analysis. In particular, we define
456: liveness of heap references, devise a bounded representation called an 
457: {\em access graph\/} for liveness, and
458: then propose a data flow analysis for discovering liveness. 
459: Other analyses required for safety of \NULL\ insertion are described in 
460: Section~\ref{sec:other.analyses}.
461: 
462: Our method is flow sensitive but context insensitive. This means that 
463: we compute point-specific information in each procedure by taking into
464: account the flow of control at the intraprocedural level and 
465: by approximating the interprocedural information such that it is
466: not context-specific but is safe in all calling contexts.
467: For the purpose of analysis, arrays are handled by approximating any 
468: occurrence of an array element by the entire array.
469: The current version models exception handling by
470: explicating possible control flows. However, programs containing 
471: threads are not covered.
472: 
473: \subsection{Access Paths}
474: 
475: In order to discover
476: liveness and other properties of heap, we need a way of naming links in the
477: memory graph.  We do this using  access paths.
478: 
479: \label{sec:concepts.def}
480: An {\em access path\/} is a root variable name followed by a sequence
481: of zero or more field names and is denoted by \mbox{$\rho_x \equiv
482: x\myarrow f_1\myarrow f_2\myarrow\cdots\myarrow f_k$}.  Since an
483: access path represents a path in a memory graph, it can be used for
484: naming links and nodes.  An access path consisting of just a root
485: variable name is called a {\em simple\/} access path; it represents a
486: path consisting of a single link corresponding to the root
487: variable. \Empty\ denotes an empty access path.
488: 
489: The last field name in an access path $\rho$ is called its {\em
490: frontier\/} and is denoted by $\Front(\rho)$. The frontier of a simple
491: access path is the root variable name.  The access path corresponding
492: to the longest sequence of names in $\rho$ excluding its frontier is
493: called its {\em base\/} and is denoted by $\Base(\rho)$.  The base of a
494: simple access path is the empty access path \Empty. The object reached by traversing an
495: access path $\rho$ is called the {\em target\/} of the access path and
496: is denoted by $\Target(\rho)$.  When we use an access path $\rho$ to refer
497: to a link in a memory graph, it  denotes the last link in $\rho$, i.e. the
498: link corresponding to $\Front(\rho)$.
499: %
500: \begin{example}
501: \label{exmp:access.path.1}
502: As explained earlier, Figure~\ref{fig:memory.graph}(b) is the
503: superimposition of memory graphs that can result before line 5 for
504: different executions of the program.  For the access path \mbox{$\rho_x \equiv
505: x\myarrow\lptr\myarrow\lptr$}, depending on whether the {\em while\/}
506: loop is executed 0, 1, 2, or 3 times, $\Target(\rho_x)$ denotes nodes
507: $m_j$, $m_h$, $m_m$, or $m_l$.  $\Front(\rho_x)$ denotes one of the
508: links \mbox{$m_i\rightarrow m_j$}, \mbox{$m_f\rightarrow m_{h}$},
509: \mbox{$m_g\rightarrow m_{m}$} or \mbox{$m_e\rightarrow m_{l}$}.
510: $\Base(\rho_x)$ represents the following paths in the heap memory:
511: \mbox{$x\rightarrow m_a\rightarrow m_i$}, \mbox{$x\rightarrow
512: m_b\rightarrow m_f$}, \mbox{$x\rightarrow m_c\rightarrow m_g$} or
513: \mbox{$x\rightarrow m_d\rightarrow m_e$}. 
514: \mybox
515: \end{example}
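
The definitions above can be exercised with a minimal Python sketch. The
tuple encoding of access paths and the helper names \texttt{front},
\texttt{base}, \texttt{target}, and \texttt{run\_loop} are assumptions made
for illustration; the heap contains only the links of
Figure~\ref{fig:memory.graph}(b) that the example traverses.

```python
# Links of the motivating figure that are needed for x -> lptr -> lptr.
heap = {
    "m_a": {"lptr": "m_i", "rptr": "m_b"},
    "m_b": {"lptr": "m_f", "rptr": "m_c"},
    "m_c": {"lptr": "m_g", "rptr": "m_d"},
    "m_d": {"lptr": "m_e"},
    "m_i": {"lptr": "m_j"},
    "m_f": {"lptr": "m_h"},
    "m_g": {"lptr": "m_m"},
    "m_e": {"lptr": "m_l"},
}


def run_loop(iterations):
    """Root variable x after the while loop has executed `iterations` times."""
    x = "m_a"
    for _ in range(iterations):
        x = heap[x]["rptr"]
    return {"x": x}


def front(path):
    """Frontier: the last name in the access path."""
    return path[-1]


def base(path):
    """Base: the path without its frontier; () stands for the empty path."""
    return path[:-1]


def target(roots, path):
    """Object reached by traversing the access path from its root variable."""
    var, *fields = path
    node = roots[var]
    for field in fields:
        node = heap[node][field]
    return node


rho = ("x", "lptr", "lptr")
targets = [target(run_loop(k), rho) for k in range(4)]
base_targets = [target(run_loop(k), base(rho)) for k in range(4)]
print(targets)
print(base_targets)
```

The two lists correspond to the targets of $\rho_x$ and of
$\Base(\rho_x)$ for 0, 1, 2, and 3 iterations of the loop.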
516: 
517: In the rest of the paper, $\alpha$ denotes an access expression,
518: $\rho$ denotes an access path and $\sigma$ denotes a (possibly empty)
519: sequence of field names separated by $\myarrow$. Let the 
520: %If the  
521: access expression 
522: %denoted by 
523: $\alpha_x$ be $x.f_1.f_2\ldots f_n$. Then, the
524: corresponding access path $\rho_x$ is $x\myarrow f_1\myarrow f_2\myarrow\cdots\myarrow f_n$.
525: %is denoted by $\rho_x$. 
526: When the root variable name is not required,
527: we drop the subscripts from $\alpha_x$ and $\rho_x$.
528: 
529: \subsection{Program Flow Graph}
530: 
531: Since the current version of our method involves context insensitive analysis, 
532: each procedure is analyzed separately and only once. Thus there is no need to 
533: maintain a call graph, and we use the terms program and procedure
534: interchangeably.
535: 
536: To simplify the description of analysis we make the following assumptions:
537: \begin{itemize}
538: \item The  program flow  graph has  a unique  $\entrynode$ and  a unique
539:       $\exitnode$ node. We assume that there is a distinguished {\tt main\/} procedure.
540: \item Each  statement  forms  a basic  block.%\footnote{As  explained  in
541:       %Section~\ref{sec:complexity}, multiple  statements can  be grouped
542:       %together in a larger basic block.}
543: \item The conditions  that alter  flow of  control are  made up  only of
544:       simple variables.  If not,  the offending reference  expression is
545:       assigned to  a fresh simple  variable before the condition  and is
546:       replaced by the fresh variable in the condition.
547: \end{itemize}
548: With these simplifications, each statement falls in one of the following
549: categories:
550: %
551: \begin{itemize}
552: \item {\em Function Calls\/}. These are statements 
553: 	\mbox{$x = f(\alpha_y, \alpha_z, \ldots)$} where the functions 
554:       involve access expressions in arguments. The type of $x$ does not matter.
555: \item {\em Assignment Statements\/}.  These are assignments to
556:   references and are denoted by \mbox{$\alpha_x = \alpha_y$}.  Only
557:   these statements can modify the structure of the heap.
558: \item {\em Use Statements\/}. These statements use heap references to
559:   access heap data but do not modify heap references. For the purpose
560:   of analysis, these statements are abstracted as lists of expressions
561:   $\alpha_y.d$ where $\alpha_y$ is an access expression
562:   and $d$ is a non-reference. 
563: \item {\em Return Statement\/} of the type $\return \; \alpha_x$ involving
564: reference variable $x$.
565: \item {\em Other Statements\/}.  These statements include all
566:   statements which do not refer to the heap.  We ignore these
567:   statements since they do not influence heap reference analysis.
568: \end{itemize}
569: 
570: \newcommand{\onepath}{\psi}
571: 
572: 
573: 
574: When we talk  about the execution path, we shall  refer to the execution
575: of the program derived by  retaining all function calls, assignments and
576: use statements and ignoring the condition checks in the path.
577: 
578: For    simplicity   of    exposition,    we    present   the    analyses
579: assuming   that   there   are   no  cycles   in   the   heap.   
580: This assumption does not limit the theory in any way because 
581: our analyses inherently compute conservative information in the presence
582: of cycles without requiring any special treatment.
583: 
584: \subsection{Liveness of Access Paths}
585: \label{sec:liveness.specs}
586: 
587: A link  $l$ is {\em live\/}  at a program point  {$p$} if it is  used in
588: some control flow path starting from {$p$}. Note that $l$ may be used in
589: two different ways. It may be dereferenced to access an object or tested
590: for comparison. An  erroneous nullification of $l$ would  affect the two
591: uses in different  ways: Dereferencing $l$ would result  in an exception
592: being raised whereas testing $l$ for  comparison may alter the result of
593: the condition and thereby the execution path.
594: 
595: Figure~\ref{fig:memory.graph}(b) shows links that are live before line 5
596: by thick arrows. For  a link $l$ to be live, there must  be at least one
597: access path from some root variable to  $l$ such that every link in this
598: path is  live. This is the  path that is actually  traversed while using
599: $l$.
600: 
601: 
602: 
603: Since our  technique involves nullification of  access paths, we  need to extend
604: the notion of liveness from links to  access paths. An access path is defined to
605: be  {\em  live\/}  {at  $p$}  if  the link  corresponding  to  its  frontier  is
606:  live  along  some  path  starting at  $p$.   Safety  of  \NULL\
607: assignments  requires that the  access paths  which are  live are  excluded from
608: nullification.
609: 
610: 
611: We initially  limit ourselves to a  subset of live access  paths, whose liveness
612: can be  determined without taking into  account the aliases  created before $p$.
613: These  access paths  are live  solely because  of the  execution of  the program
614: beyond  $p$.   We call  access  paths  which are  live  in  this  sense  {\em
615: explicitly live} access paths. An interesting property of explicitly live access
616: paths is that they form the minimal set covering every live link. 
617: 
618: \begin{example}
619: \label{exmp:liveness}
620: If the body of the {\em while\/} loop in Figure~\ref{fig:memory.graph}(a) is not
621: executed   even  once,   \mbox{$\Target(y)=m_i$}  at   line  5   and   the  link
622: \mbox{$m_i\rightarrow m_j$} is live at line 5  because it is used in line 6. The
623: access paths \mbox{$y$} and \mbox{$y\myarrow \lptr$} are explicitly live because
624: their liveness at 5 can be determined solely from the statements from 5 onwards.
625: In contrast, the access path \mbox{$w\myarrow\lptr\myarrow\lptr$} is live without
626: being explicitly live.  It becomes  live because of the alias between \mbox{$y$}
627: and \mbox{$w\myarrow\lptr$} and this alias was created before 5.  Also note that
628: if an access path is explicitly live, so are all its prefixes.  \mybox
629: \end{example}
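
The example can be reproduced with a small backward pass over lines 5 to 7
of the program fragment. This is a sketch under simplifying assumptions,
not the analysis developed in this paper: it handles straight-line code
only, represents liveness as a finite set of access paths rather than as
access graphs, ignores aliasing, and the statement encoding and the name
\texttt{live\_before} are ours.

```python
def prefixes(path):
    """All non-empty prefixes of an access path given as a tuple of names."""
    return {path[:i] for i in range(1, len(path) + 1)}


def live_before(stmt, live_after):
    """Explicitly live paths before stmt, given those live after it."""
    kind = stmt[0]
    if kind == "use":
        # ("use", [path, ...]): each path is traversed to reach data.
        live = set(live_after)
        for path in stmt[1]:
            live |= prefixes(path)
        return live
    lhs = stmt[1]
    # Paths through the redefined link do not survive the statement.
    survive = {p for p in live_after if p[:len(lhs)] != lhs}
    if kind == "new":
        # ("new", lhs): lhs = New ...; only the base of lhs is traversed.
        return survive | prefixes(lhs[:-1])
    if kind == "assign":
        # ("assign", lhs, rhs): liveness of lhs.sigma becomes rhs.sigma.
        rhs = stmt[2]
        moved = set()
        for p in live_after:
            if p[:len(lhs)] == lhs:
                moved |= prefixes(rhs + p[len(lhs):])
        return survive | moved | prefixes(lhs[:-1]) | prefixes(rhs[:-1])
    raise ValueError("unknown statement kind: %r" % (kind,))


program = [
    ("new", ("z",)),                                 # 5. z = New class_of_z
    ("assign", ("y",), ("y", "lptr")),               # 6. y = y.lptr
    ("use", [("x", "lptr"), ("y",), ("z",)]),        # 7. z.sum = ...
]

live = set()                  # assume nothing is live after line 7
for stmt in reversed(program):
    live = live_before(stmt, live)
print(sorted(live))
```

The resulting set contains $y$ and \mbox{$y\myarrow\lptr$} but not
\mbox{$w\myarrow\lptr\myarrow\lptr$}, whose liveness depends on an alias
created before line 5. It is also closed under prefixes, as noted in the
example.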
630: 
631: 
632: 
633: %% \begin{figure}[t]
634: %% \centering
635: %% \begin{tabular}{ll}
636: %% \small
637: %% \psset{unit=.95mm}
638: %% \begin{pspicture}(-2,-28)(40,16)
639: %%   \psrelpoint{origin}{x0}{1}{1}
640: %%   \rput(\x{x0},\y{x0}){\rnode{x0}{\psframebox[framesep=1]{$x$}}}
641: %%   \psrelpoint{x0}{u}{0}{6}
642: %%   \rput(\x{u},\y{u}){\rnode{u}{\psframebox[framesep=1]{$u$}}}
643: %%   \psrelpoint{u}{v}{0}{-18}
644: %%   \rput(\x{v},\y{v}){\rnode{v}{\psframebox[framesep=1]{$v$}}}
645: %%   \psrelpoint{u}{w}{0}{6}
646: %%   \rput(\x{w},\y{w}){\rnode{w}{\psframebox[framesep=.8]{$w$}}}
647: %%   \psrelpoint{v}{y0}{0}{6}
648: %%   \rput(\x{y0},\y{y0}){\rnode{y0}{\psframebox[framesep=.8]{$y$}}}
649: %%   \psrelpoint{v}{p}{0}{-6}
650: %%   \rput(\x{p},\y{p}){\rnode{p}{\psframebox[framesep=.8]{$p$}}}
651: %%   \psrelpoint{p}{q}{0}{-6}
652: %%   \rput(\x{q},\y{q}){\rnode{q}{\psframebox[framesep=1]{$q$}}}
653: %%   %%
654: %%   \psrelpoint{x0}{mx0}{8}{3}
655: %%   \cnode(\x{mx0},\y{mx0}){2.5}{mx0}
656: %%   \ncline{->}{x0}{mx0}
657: %%   \ncline{->}{u}{mx0}
658: %%   %%
659: %%   \psrelpoint{mx0}{mx1}{8}{0}
660: %%   \cnode(\x{mx1},\y{mx1}){2.5}{mx1}
661: %%   \ncline{->}{mx0}{mx1}
662: %%   \Aput[0.1]{$r$}
663: %%   %%
664: %%   \psrelpoint{mx1}{mx2}{8}{0}
665: %%   \cnode(\x{mx2},\y{mx2}){2.5}{mx2}
666: %%   \ncline[linestyle=dashed,dash=.6 .6,linewidth=.5]{->}{mx1}{mx2}
667: %%   \Aput[0.1]{$n$}
668: %%   %%
669: %%   \psrelpoint{y0}{mz0}{-1}{-3}
670: %%   \cnode(\x{mz0},\y{mz0}){2.5}{mz0}
671: %%   \ncline{->}{v}{mz0}
672: %%   \ncline{->}{y0}{mz0}
673: %%   %%
674: %%   \psrelpoint{mz0}{mz1}{8}{0}
675: %%   \cnode(\x{mz1},\y{mz1}){2.5}{mz1}
676: %%   \ncline{->}{mz0}{mz1}
677: %%   \Aput[0.1]{$n$}
678: %%   %%
679: %%   \psrelpoint{mz1}{mz2}{8}{0}
680: %%   \cnode(\x{mz2},\y{mz2}){2.5}{mz2}
681: %%   \ncline{->}{mz1}{mz2}
682: %%   \Aput[0.1]{$n$}
683: %%   %%
684: %%   \psrelpoint{mz2}{mz3}{8}{0}
685: %%   \cnode(\x{mz3},\y{mz3}){2.5}{mz3}
686: %%   \ncline{->}{mz2}{mz3}
687: %%   \Aput[0.1]{$n$}
688: %%   \ncline[linewidth=.6]{->}{mx1}{mz2}
689: %%   \nccurve[angleA=10,angleB=135]{->}{w}{mx2}
690: %%   \nccurve[angleA=60,angleB=125]{->}{mz2}{mz3}
691: %%   \Aput[0.1]{$r$}
692: %%   \nccurve[angleA=-10,angleB=225]{->}{p}{mz2}
693: %%   \nccurve[angleA=-10,angleB=225]{->}{q}{mz3}
694: %% \end{pspicture}  
695: %% &
696: %% \psset{unit=.95mm}
697: %% \small
698: %% \begin{pspicture}(-2,-28)(40,16)
699: %%   \psrelpoint{origin}{x0}{1}{1}
700: %%   \rput(\x{x0},\y{x0}){\rnode{x0}{\psframebox[framesep=1]{$x$}}}
701: %%   \psrelpoint{x0}{u}{0}{6}
702: %%   \rput(\x{u},\y{u}){\rnode{u}{\psframebox[framesep=1]{$u$}}}
703: %%   \psrelpoint{u}{v}{0}{-18}
704: %%   \rput(\x{v},\y{v}){\rnode{v}{\psframebox[framesep=1]{$v$}}}
705: %%   \psrelpoint{u}{w}{0}{6}
706: %%   \rput(\x{w},\y{w}){\rnode{w}{\psframebox[framesep=.8]{$w$}}}
707: %%   \psrelpoint{v}{y0}{0}{6}
708: %%   \rput(\x{y0},\y{y0}){\rnode{y0}{\psframebox[framesep=.8]{$y$}}}
709: %%   \psrelpoint{v}{p}{0}{-6}
710: %%   \rput(\x{p},\y{p}){\rnode{p}{\psframebox[framesep=.8]{$p$}}}
711: %%   \psrelpoint{p}{q}{0}{-6}
712: %%   \rput(\x{q},\y{q}){\rnode{q}{\psframebox[framesep=1]{$q$}}}
713: %%   %%
714: %%   \psrelpoint{x0}{mx0}{8}{3}
715: %%   \cnode(\x{mx0},\y{mx0}){2.5}{mx0}
716: %%   \ncline{->}{x0}{mx0}
717: %%   \ncline{->}{u}{mx0}
718: %%   %%
719: %%   \psrelpoint{y0}{mz0}{5}{-3}
720: %%   \cnode(\x{mz0},\y{mz0}){2.5}{mz0}
721: %%   \ncline{->}{v}{mz0}
722: %%   \ncline{->}{y0}{mz0}
723: %%   \ncline{->}{mx0}{mz0}
724: %%   \Aput[0.1]{$r$}
725: %%   %%
726: %%   \psrelpoint{mz0}{mz1}{8}{0}
727: %%   \cnode(\x{mz1},\y{mz1}){2.5}{mz1}
728: %%   \ncline[linestyle=dashed,dash=.6 .6,linewidth=.5]{->}{mz0}{mz1}
729: %%   \Aput[0.1]{$n$}
730: %%   %%
731: %%   \psrelpoint{mz1}{mz2}{8}{0}
732: %%   \cnode(\x{mz2},\y{mz2}){2.5}{mz2}
733: %%   \ncline{->}{mz1}{mz2}
734: %%   \Aput[0.1]{$n$}
735: %%   %%
736: %%   \psrelpoint{mz2}{mz3}{8}{0}
737: %%   \cnode(\x{mz3},\y{mz3}){2.5}{mz3}
738: %%   \ncline{->}{mz2}{mz3}
739: %%   \Aput[0.1]{$n$}
740: %%   \nccurve[angleA=5,angleB=90]{->}{w}{mz1}
741: %%   \nccurve[angleA=60,angleB=125]{->}{mz2}{mz3}
742: %%   \Aput[0.1]{$r$}
743: %%   \nccurve[angleA=-10,angleB=230]{->}{p}{mz2}
744: %%   \nccurve[angleA=-10,angleB=225]{->}{q}{mz3}
745: %%   \nccurve[angleA=325,angleB=210,linewidth=.6]{->}{mz0}{mz2}
746: %%   \Bput[0.2]{$n$}
747: %% \end{pspicture}  
748: %%  \\
749: 
750: %% \textRed
751: %% (a) Assignment modifying left hand side only \& (b) Assignment
752: %%  modifying both left and right hand sides
753: %% \end{tabular}
754: %% \caption{Effect of assignment \mbox{$x.r.n = y.n.n$} on liveness.  The dotted and
755: %% the thick arrows represents the links before and after the assignment.}
756: %% \label{fig:aliasing.motivation}
757: %% \rule{\textwidth}{.2mm}
758: %% \end{figure}
759: 
760: \begin{example}
761: \label{exmp:liveness.defn}
762: We illustrate  the issues  in determining explicit  liveness of access  paths by
763: considering the assignment \mbox{$x.r.n = y.n.n$}.
764: 
765: 
766: \begin{itemize}
767: \item {\em Killed Access Paths}. Since the assignment modifies $\Front(x\myarrow
768: r\myarrow  n)$,  any  access  path  which  is  live  after  the  assignment  and
769:  has $x\myarrow r\myarrow n$ as prefix will cease to be  live before the 
770:  assignment. Access paths that are live after  the assignment and not killed  
771:  by it are live  before the assignment also.
772: \item {\em Directly Generated Access Paths}. All prefixes of $x\myarrow r$ 
773: and $y\myarrow n$ are explicitly live before the assignment due to the local effect
774: of the assignment.
775: \item {\em Transferred Access Paths}. If 
776: \mbox{$x\myarrow r \myarrow n \myarrow \sigma$} is live after
777:   the assignment, then \mbox{$y\myarrow n \myarrow n \myarrow \sigma$}
778:   will be live before the assignment. For example, if \mbox{$x\myarrow
779:   r \myarrow n \myarrow n$} is live after the assignment, then
780:   \mbox{$y\myarrow n \myarrow n \myarrow n$} will be live before the
781:   assignment. The sequence of field names $\sigma$ is viewed as being
782:   {\em transferred\/} from {$x\myarrow r \myarrow n$} to
783:   \mbox{$y\myarrow n \myarrow n$}.
784: \mybox
785: \end{itemize}
786: \end{example}
787: 
788: We now define liveness by generalizing the above observations.
789: We use the notation \mbox{$\rho_x\myarrow *$} to enumerate all access paths
790: which have $\rho_x$ as a prefix. The summary liveness information for a set $S$ of
791: reference variables is defined as follows:
792: \begin{eqnarray*}
793: \Summary(S) & = & 
794:     \bigcup_{x \in S} \{ x\myarrow * \} 
795: \end{eqnarray*}
796: Further, the set of  all global variables is denoted by \Global\  and the set of
797: formal parameters of the function being analyzed is denoted by \param.
798: 
799: 
800: 
801: \begin{definition}{\rm\bf Explicit Liveness}.
802: \label{def:explicit.liveness}
803: The set of { explicitly} live access paths at a program point~$p$,
804: denoted by $\live_p$ is defined as follows.
805:   \begin{eqnarray*}
806:     \live_p& = & 
807:     \displaystyle\bigcup_{\onepath \in Paths(p)}(\Plive^{\onepath}_{p}) 
808:   \end{eqnarray*}
where $\onepath \in Paths(p)$ is a control flow path from $p$ to \exitnode\ and
$\Plive^{\onepath}_{p}$ denotes the
liveness at $p$ along $\onepath$, defined as follows. 
If $p$ is not the program exit, then let the statement which follows it be denoted by
813:   $s$ and the program point immediately following $s$ be denoted by $p'$. Then,
814:   \begin{eqnarray*}
815:     \Plive^{\onepath}_{p}& = & \left\{\begin{array}{cl}
    \emptyset & p \mbox{ is  \exitnode\ of {\tt main}} \\
817:     \Summary(\Global) & p \mbox{ is  \exitnode\ of some procedure} \\
818:     \Slive_{s}(\Plive^{\onepath}_{p'})  & \mbox{otherwise}
819:     \end{array}\right. 
820:   \end{eqnarray*}
821: where the flow function for $s$ is defined as follows:
822:   \begin{eqnarray*}
823:     \Slive_{s}(X) & = & (X - \ELPK_s) \; \cup \; \ELPD_s \;\cup\;\ELPT_s(X) 
824:   \end{eqnarray*}
$\ELPK_s$ denotes the set of access paths which cease to be live before 
statement $s$, $\ELPD_s$ denotes the set of access paths which become live due 
to the local effect of $s$, and $\ELPT_s(X)$ denotes
the set of access paths which become live before $s$ due to transfer of liveness from
live access paths after $s$. 
830: They are defined in Figure~\ref{fig:flow.fun.liveness}. \mybox
831: \end{definition}
832: %
Observe that
the    definitions   of   $\ELPK_s$,    $\ELPD_s$,   and
$\ELPT_s(X)$ ensure that $\live_{p}$ is prefix-closed.
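
To make the flow function concrete, consider again the assignment
\mbox{$x.r.n = y.n.n$} of Example~\ref{exmp:liveness.defn}, for which
\mbox{$\rho_x = x\myarrow r\myarrow n$} and
\mbox{$\rho_y = y\myarrow n\myarrow n$}.  The set $X$ used below is a
hypothetical choice made only for illustration, and we read
\mbox{$\rho_x\myarrow\sigma$} as allowing an empty $\sigma$.  Suppose that
\mbox{$X = \{x,\; x\myarrow r,\; x\myarrow r\myarrow n,\;
x\myarrow r\myarrow n\myarrow n\}$} is live after the assignment.  Under
these assumptions, the entries of Figure~\ref{fig:flow.fun.liveness} give
\begin{eqnarray*}
  X - \ELPK_s & = & \{x,\; x\myarrow r\} \\
  \ELPD_s & = & \{x,\; x\myarrow r,\; y,\; y\myarrow n\} \\
  \ELPT_s(X) & = & \{y\myarrow n\myarrow n,\;
                     y\myarrow n\myarrow n\myarrow n\} \\
  \Slive_{s}(X) & = & \{x,\; x\myarrow r,\; y,\; y\myarrow n,\;
                     y\myarrow n\myarrow n,\;
                     y\myarrow n\myarrow n\myarrow n\}
\end{eqnarray*}
The resulting set contains every prefix of each of its members, so it is
prefix-closed in this instance.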
836: 
837: \begin{figure}[t]
838: \begin{center}
839: \scalebox{1.2}{%
840: $
841: \begin{array}{|l|c|c|c|}
842: \hline
843: \mbox{Statement } s & \ELPK_s & \ELPD_s & \ELPT_s(X) \\  \hline\hline
844: \alpha_x = \alpha_y & \{ \rho_x\myarrow * \} & 
845: 	\prefix(\Base(\rho_x))\cup \prefix(
846: 	\Base(\rho_y))  &
847: 	\{ \rho_y\myarrow \sigma \mid \rho_x\myarrow \sigma \in X\} 
848:  		\\ \hline
849:  \alpha_x = f(\alpha_y) & \{\rho_x\myarrow *\} & 
850:  \prefix(\Base(\rho_x))  & \emptyset \\ 
851: & & \cup\; \Summary(\{\rho_y\}\cup\Global) & 
852:  		\\ \hline
853: \alpha_x = \new & \{ \rho_x\myarrow * \} & \prefix(\Base(\rho_x)) & \emptyset
854:  		\\ \hline
855: \alpha_x = \NULL & \{ \rho_x\myarrow * \} & \prefix(\Base(\rho_x)) & \emptyset
856:  		\\ \hline
857: %\use  \; \alpha_y  & \emptyset & \prefix( \Base(\rho_y)) & \emptyset \\ \hline
858: \use  \; \alpha_y.d  & \emptyset & \prefix(\rho_y) & \emptyset
859:  		\\ \hline
860: \return \;  \alpha_y & \emptyset &  \Summary(\{\rho_y\}) & \emptyset 
861:  		\\ \hline
862: \mbox{other}  & \emptyset &  \emptyset & \emptyset 
863:  		\\ \hline
864: \end{array}
865: $}
866: \end{center}
867: \caption{Defining Flow Functions for Liveness. \Global\ denotes the set of global 
868: 	references and \param\ denotes the set of formal parameters. 
869: 	For simplicity, we have shown 
870: 	a single access expression on the RHS.}
871: \label{fig:flow.fun.liveness}
872: \rule{\textwidth}{.2mm}
873: \end{figure}
874: 
875: 
876: \begin{example}
877: \label{exmp:unbounded.ap}
878: In Figure~\ref{fig:memory.graph},  it cannot be  statically determined
879: which  link is  represented by  access expression  \mbox{$x.\lptr$} at
880: line 4. Depending  upon the number of iterations  of the {\em while\/}
881: loop, it may be any of  the links represented by thick arrows. Thus at
882: line    1,   we    have   to    assume   that    all    access   paths
883: \mbox{\{\mbox{$x\myarrow\lptr\myarrow\lptr$},
884: \mbox{$x\myarrow\rptr\myarrow\lptr\myarrow\lptr$},
885: \mbox{$x\myarrow\rptr\myarrow\rptr\myarrow\lptr\myarrow\lptr$},
886: \ldots\}} are explicitly live.  \mybox
887: \end{example}
888: 
889: In general, an infinite number  of access paths with unbounded lengths
890: may be live before a  loop. Clearly, performing data flow analysis for
891: access    paths   requires    a   suitable    finite   representation.
892: Section~\ref{sec:access.graphs} defines access graphs for the purpose.
893: 
894: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
895: \subsection{Representing Sets of Access Paths by Access Graphs}
896: \label{sec:access.graphs}
897: 
898: In the  presence of loops, the set  of access paths may  be infinite and
899: the  lengths of access  paths may  be unbounded.   If the  algorithm for
900: analysis tries  to compute sets of access  paths explicitly, termination
901: cannot be  guaranteed.  We solve this  problem by representing  a set of
902: access paths by a graph of bounded size.
903: 
904: \subsubsection{Defining Access Graphs}
905: \label{sec:dependence}
906: An  {\em  access   graph},  denoted  by  $G_v$,  is   a  directed  graph
907: \mbox{$\langle n_0,  N, E \rangle$}  representing a set of  access paths
908: starting from a root variable $v$.\footnote{Where the root variable name
909: is not required, we drop the  subscript $v$ from $G_v$.}  $N$ is the set
of nodes, $n_0  \in N$ is the  entry node with no in-edges  and $E$ is
911: the set  of edges. Every  path in the  graph represents an  access path.
912: The {\em empty graph\/} $\Empty\!_G$ has  no nodes or edges and does not
913: accept any access path.
914: 
The entry node of an access graph is labeled with the name of the root
916: variable  while the  non-entry nodes  are  labeled with  a unique  label
917: created as  follows: If a  field name $f$  is referenced in  basic block
918: $b$,  we  create  an  access  graph node  with  a  label  \mbox{$\langle
919: f,b,i\rangle$} where $i$ is  the instance number used for distinguishing
920: multiple occurrences of the field name $f$ in block $b$.  Note that this
921: implies that  the nodes  with the same  label are treated  as identical.
922: Often, $i$ is 0 and in such a case we denote the label \mbox{$\langle f,
923: b, 0\rangle$} by $f_b$  for brevity.  Access paths \mbox{$\rho_x\myarrow
924: *$} are  represented by  including a summary  node denoted $n_*$  with a
925: self loop over it.  It is  distinct from all other nodes but matches the
926: field name of any other node.
927: 
928: A node  in the access graph represents  one or more links  in the memory
929: graph.  Additionally,  during analysis, it represents a  state of access
930: graph  construction (explained in  Section~\ref{sec:summarisation}).  An
931: edge \mbox{$  f_n\rightarrow g_m$} in  an access graph at  program point
932: $p$ indicates  that a  link corresponding to  field $f$  dereferenced in
933: block $n$ may  be used to dereference a link  corresponding to field $g$
934: in  block $m$  on  some path  starting at  $p$.  This has  been used  in
935: Section~\ref{sec:complexity} to argue that  the size of access graphs in
936: practical programs is small.
937: 
938: Pictorially, the entry node of an access graph is indicated by an
939: incoming double arrow.
940: 
941: \begin{figure}[t]
942: \small
943: \begin{pspicture}(2.,1.35)(10,4.15)
944: \psset{unit=1mm}
945: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Gx2 %%%%
946: \psrelpoint{origin}{base}{0}{0}
947: \psrelpoint{base}{n0}{30}{25}
948: \rput(\x{n0},\y{n0}){\rnode{n0}{}}
949: \psrelpoint{n0}{n1}{0}{-6}
950: \psrelpoint{n1}{n2}{0}{-9}
951: \rput(\x{n1},\y{n1}){\rnode{n1}{1 \psframebox{$x = x.r$} \white 1}}
952: \rput(\x{n2},\y{n2}){\rnode{n2}{2 \psframebox{$x = x.r$} \white 2}}
953: \psrelpoint{n2}{n3}{0}{-6}
954: \rput(\x{n3},\y{n3}){\rnode{n3}{}}
955: \ncline[offset=.1]{->}{n0}{n1}
956: \ncline[offset=.1]{->}{n1}{n2}
957: \ncline[offset=.1]{->}{n2}{n3}
958: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
959: \psrelpoint{n0}{t1}{20}{-4}
960: \psrelpoint{t1}{t2}{0}{-7}
961: \rput[l](\x{t1},\y{t1}){Live access paths at entry of block 1:
962:  $\{ x,\, x\myarrow r,\, x\myarrow r\myarrow r \}$}
963: \rput[l](\x{t2}, \y{t2}){Corresponding access graph: $G_x^2$}
964: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
965: \psrelpoint{t2}{n0}{50}{0}
966: \rput(\x{n0},\y{n0}){\rnode{n0}{}}
967: \psrelpoint{n0}{x0}{7}{0}
968: \psrelpoint{x0}{n1}{10}{0}
969: \psrelpoint{n1}{n2}{10}{0}
970: \rput(\x{x0},\y{x0}){\rnode{x0}{\pscirclebox[framesep=.9]{$x$}}}
971: \rput(\x{n1},\y{n1}){\rnode{n1}{\pscirclebox[framesep=.2]{$r_1$}}}
972: \rput(\x{n2},\y{n2}){\rnode{n2}{\pscirclebox[framesep=.2]{$r_2$}}}
973: \ncline[doubleline=true]{->}{n0}{x0}
974: \ncline{->}{x0}{n1}
975: %\Aput[0.1]{$n$}
976: \ncline{->}{n1}{n2}
977: %\Aput[0.1]{$n$}
978: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
979: \rput[l](20,26){\rule{\columnwidth}{.2mm}}
980: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
981: \psrelpoint{base}{n0}{30}{42}
982: \rput(\x{n0},\y{n0}){\rnode{n0}{}}
983: \psrelpoint{n0}{n1}{0}{-7}
984: \rput(\x{n1},\y{n1}){\rnode{n1}{1 \psframebox{$x = x.r$} \white 1}}
985: \psrelpoint{n1}{n2}{0}{-7}
986: \rput(\x{n2},\y{n2}){\rnode{n00}{}}
987: %{2 \psframebox{$\cdots = x.d$}}}
988: \ncline[offset=.1]{->}{n0}{n1}
989: \ncline[offset=.1]{->}{n1}{n00}
990: \ncloop[angleA=270,angleB=90,loopsize=-8,arm=3,offset=3,
991:   linearc=.1]{->}{n1}{n1}
992: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
993: \psrelpoint{n0}{t1}{20}{-4}
994: \psrelpoint{t1}{t2}{0}{-7}
995: \rput[l](\x{t1},\y{t1}){Live access paths at entry of block 1: 
996: $\{ x,\, x\myarrow r,\, x\myarrow r\myarrow r,\,x\myarrow r\myarrow
997:   r\myarrow r,\,\ldots \}$}
998: \rput[l](\x{t2},\y{t2}){Corresponding access graph: $G_x^1$}
999: %
1000: \psrelpoint{t2}{n0}{50}{0}
1001: \rput(\x{n0},\y{n0}){\rnode{n0}{}}
1002: \psrelpoint{n0}{x0}{7}{0}
1003: \psrelpoint{x0}{n1}{10}{0}
1004: \rput(\x{x0},\y{x0}){\rnode{x0}{\pscirclebox[framesep=0.9]{$x$}}}
1005: \rput(\x{n1},\y{n1}){\rnode{n1}{\pscirclebox[framesep=0.2]{$r_1$}}}
1006: \ncline[doubleline=true]{->}{n0}{x0}
1007: \ncline{->}{x0}{n1}
1008: %\Aput[0.1]{$n$}
1009: \nccurve[nodesepA=-.2mm,nodesepB=-.3mm,angleA=320,angleB=40,
1010:   ncurv=3]{->}{n1}{n1}
1011: %\Bput[0.1]{$n$}
1012: \end{pspicture}
1013: \caption{Approximations in Access Graphs}
1014: \label{fig:access.graphs.first}
1015: \rule{\textwidth}{.2mm}
1016: \end{figure}
1017: 
1018: 
1019: \subsubsection{Summarization}
1020: \label{sec:summarisation}
1021: 
1022: Recall that a link is live at a program point~$p$ if it is used along
1023: some control flow path from $p$ to \exitnode.  Since different access
1024: paths may be live along different control flow paths and there may be
1025: infinitely many control flow paths in the case of a loop following
1026: $p$, there may be infinitely many access paths which are live at $p$.
1027: Hence, the lengths of access paths will be unbounded. In such a case
1028: summarization is required.
1029: 
1030: Summarization is achieved by merging appropriate nodes in access
1031: graphs, retaining all in and out edges of merged nodes.  We explain
1032: merging with the help of
1033: Figure~\ref{fig:access.graphs.first}:
1034: \begin{itemize} 
\item Node $r_1$ in access graph $G_x^1$ indicates references of $r$
  at {\em different execution instances of the same\/} program point.
  Every time this program point is visited during analysis, the same
  state is reached in that the pattern of references after $r_1$ is
  repeated.  Thus all occurrences of $r_1$ are merged into a single
  state.  This creates a cycle which captures the repeating pattern of
  references.
  
\item In $G_x^2$, nodes $r_1$ and $r_2$ indicate referencing $r$ at
  {\em different\/} program points.  Since the references made after
  these program points may be different, $r_1$ and $r_2$ are not
  merged.
1047: \end{itemize}
1048: 
1049: Summarization captures the pattern of heap traversal in the most
1050: straightforward way.  Traversing a path in the heap requires the
1051: presence of reference assignments \mbox{$\alpha_x = \alpha_y$} such
1052: that $\rho_x$ is a proper prefix of 
1053: $\rho_y$. Assignments in 
1054: Figure~\ref{fig:access.graphs.first} are examples of such
1055: assignments. The structure of the flow of control between such
1056: assignments in a program determines the pattern of heap traversal.
1057: Summarization captures this pattern without the need of control flow
1058: analysis and the resulting structure is reflected in the access graphs
1059: as can be seen in Figure~\ref{fig:access.graphs.first}.  More examples
1060: of the resemblance of program structure and access graph structure can
1061: be seen in the access graphs in Figure~\ref{fig:liveness.info.1}.
1062: 
1063: \subsubsection{Operations on Access Graphs}
1064: \label{sec:access.graph.operations}
1065: 
1066: Section~\ref{sec:liveness.specs}  defined liveness  by  applying certain
1067: operations on  access paths.   In this subsection  we define  the corresponding
1068: operations on  access graphs.  Unless  specified otherwise,  the binary
operations  are applied only  to access  graphs having  the same root  variable. The
1070: auxiliary operations and associated notations are:
1071: 
1072: \renewcommand{\graphA}[1]{\mbox{{\sf\em\small G\/}$({#1})$}}
1073: \renewcommand{\graphO}[1]{\mbox{{\sf\em\small\magenta GOnly\/}$({#1})$}}
1074: 
1075: \begin{itemize} 
1076: \item \RootVar($\rho$)  denotes the root  variable of access path  $\rho$, while
1077:   \RootVar($G$) denotes the root variable of access graph $G$.
1078: \item \FieldName($n$) for a node $n$ denotes the field name component
1079:   of the label of $n$.
\item  \graphA{\rho} constructs the access graph corresponding to $\rho$.
1081:   It uses the  current basic block number and the  field names to create
1082:   appropriate  labels for  nodes.  The  instance number  depends  on the
1083:   number    of     occurrences    of    a    field     name    in    the
1084:   block.  \graphA{\rho\myarrow  *} creates  an  access  graph with  root
1085:   variable $x$ and the summary node $n_*$ with an edge from $x$ to $n_*$
1086:   and a self loop over $n_*$.
1087: \item   \lNode{G}  returns   the   last   node  of   a   {\em  linear   graph\/}
1088: $G$ constructed from a given $\rho$.
1089: \item $\clean(G)$ deletes the nodes which are not reachable from the
1090:   entry node. % or which do not have a path to a final node.
1091: 
1092: \newcommand{\cn}[2]{\mbox{{\sf\em ACN\/}$({#1},{#2})$}}
1093: \item  \cNodes{G}{G'}{S}  computes  the   set  of  nodes  of  $G$  which
1094:   correspond to  the nodes of $G'$  specified  in  the set $S$.
1095:   To compute  \cNodes{G}{G'}{S}, we define \cn{G}{G'}, the  set of pairs
1096:   of {\em all corresponding nodes}.  Let \mbox{$G \equiv \langle n_0, N,
1097:   E\rangle$} and \mbox{$G' \equiv  \langle n'_0, N', E'\rangle$}. A node
  $n$ in $G$ corresponds to a node $n'$ in $G'$ if there exists an
1099:   access path $\rho$ which is represented by a path from $n_0$ to $n$ in
1100:   $G$ and a path from $n'_0$ to $n'$ in $G'$.
1101: 
1102:   Formally, \cn{G}{G'} is the least solution of the following  equation:
1103:   \begin{eqnarray*}
1104:      \cn{G}{G'} &=&
1105:        \left\{\begin{array}{llr@{}}
1106:        \emptyset & & \hskip -1cm\RootVar(G) \not= \RootVar(G') \\
1107:        \{\langle n_0, n'_0 \rangle\}
1108:        \cup \{\langle n_j, n'_j \rangle \mid 
1109:        \FieldName(n_j) = \FieldName(n'_j),
1110:        & & \mbox{otherwise}\\
1111: 		          {\hskip 3cm} n_i \rightarrow n_j \in E, 
1112: 		          n'_i \rightarrow n'_j \in E', \\
1113: 			  {\hskip 3cm} \langle n_i, n'_i \rangle\ \in \cn{G}{G'} \}
1114:        \end{array}\right.  \\
1115:        \cNodes{G}{G'}{S} &=& \{ n \mid \langle n, n' \rangle \in \cn{G}{G'}, \,
1116:                           n' \in S \}
1117:   \end{eqnarray*}
1118: \end{itemize}
1119: Note that $\FieldName(n_j) = \FieldName(n'_j)$ would hold even when $n_j$ or $n'_j$ is
1120: the summary node $n_*$.
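
As a small illustration of how the least solution can be reached, take $G$
to be the graph $G_x^1$ of Figure~\ref{fig:access.graphs.first}, with edges
$x \rightarrow r_1$ and $r_1 \rightarrow r_1$, and $G'$ to be $G_x^2$, with
edges $x \rightarrow r_1$ and $r_1 \rightarrow r_2$.  The sets $P_k$ below
are our own notation for successive approximations of the set of
corresponding pairs; they are introduced only for this sketch.
\begin{eqnarray*}
  P_0 & = & \{\langle x, x \rangle\} \\
  P_1 & = & P_0 \cup \{\langle r_1, r_1 \rangle\} \\
  P_2 & = & P_1 \cup \{\langle r_1, r_2 \rangle\} \\
  P_3 & = & P_2
\end{eqnarray*}
The pair \mbox{$\langle r_1, r_2 \rangle$} arises from the self loop
$r_1 \rightarrow r_1$ in $G$ and the edge $r_1 \rightarrow r_2$ in $G'$.
No further pair can be added because $r_2$ has no out-edges in $G'$.
Consequently, \mbox{$\cNodes{G}{G'}{\{r_2\}} = \{r_1\}$}: the single node
$r_1$ of $G$ corresponds to both $r_1$ and $r_2$ of $G'$, since the cycle in
$G$ represents \mbox{$x\myarrow r$} as well as
\mbox{$x\myarrow r\myarrow r$}.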
1121: 
1122: Let \mbox{$G \equiv \langle n_0, N, E\rangle$} and \mbox{$G' \equiv
1123: \langle n_0, N', E'\rangle$} be access graphs (having the same
1124: entry node). $G$ and $G'$ are equal if $N=N'$ and $E=E'$.
1125: 
1126:  The main operations of interest are defined below and
1127: are illustrated in Figure~\ref{fig:exmp.ops.ag}.
1128: 
1129: \begin{figure}[t]
1130: {\includegraphics{fig-new-graph-operations.epsi}}
1131: \caption{Examples of operations on access graphs.} 
1132: \rule{\textwidth}{.2mm}
1133: \label{fig:exmp.ops.ag}
1134: \end{figure}
1135: 
1136: 
1137: \begin{enumerate}
1138: \item
1139: {\em Union} ($\cupG$).  $G \cupG G'$ combines access graphs $G$ and
1140: $G'$ such that any access path contained in $G$ or $G'$ is contained
1141: in the resulting graph.
1142: \begin{eqnarray*}
1143:   G \cupG G' &=& \left\langle n_0, N \cup N',
1144:    E \cup E'
1145:   \right\rangle
1146: \end{eqnarray*}
The operation $N\cup N'$ treats the nodes with the same label as identical.
Because of associativity, $\cupG$ can be generalized to an arbitrary
number of arguments in an obvious manner.
1150: \item
1151: 
1152: {\em Path Removal} ($\minus$). The operation \mbox{$G\minus\rho$}
1153: removes those access paths in $G$ which have $\rho$ as a prefix.
1154: \begin{eqnarray*}
1155:   G \minus \rho & = & \left\{\begin{array}{ll}
1156:   G & \rho = \Empty \mbox{ or } \RootVar(\rho) \neq \RootVar(G) \\
1157:   \Empty\!_G & \rho \mbox{ is a simple access path} \\
1158:   \clean(\langle n_0, N, E - E_{del} \rangle) \rule{.3cm}{0cm} &
  \mbox{otherwise}
1160:   \end{array} \right.
1161: \end{eqnarray*}
1162: where
1163: %
1164: \[\begin{array}{rcl}
1165: %\begin{eqnarray*}
1166:   E_{del} & = & \{ n_i \rightarrow n_j \mid
1167:                      n_i \rightarrow n_j \in E,
1168: 		     n_i \in \cNodes{G}{G^B}{\{\lNode{G^B}\}}, \\
1169:             &   & 
1170: {\white \{ n_i \mathop{\rightarrow}^f n_j \mid}
1171: \FieldName(n_j) = \Front(\rho), 
1172: 		     G^B = \graphA{\Base(\rho)}, \\
1173:             &   & 
1174: {\white \{ n_i \mathop{\rightarrow}^f n_j \mid}
1175: 		     \uniquepath( G,n_i) \}
1176: %\end{eqnarray*}
1177:   \end{array}\]
1178: 
1179: \uniquepath($G$, $n$) returns  true if in $G$, all  paths from the entry
1180: node to node $n$ represent the  same access path. Note that path removal
1181: is conservative  in that some paths  having $\rho$ as prefix  may not be
1182: removed. Since  an access graph edge  may be contained in  more than one
access path,  we have  to ensure  that access paths  which do  not have
1184: $\rho$ as prefix are not erroneously deleted.
1185: 
1186: \item {\em Factorization} (/).  Recall that the {\em Transfer\/} term in
1187: Definition  \ref{def:explicit.liveness} requires extracting  suffixes of
1188: access  paths  and  attaching  them  to some  other  access  paths.  The
1189: corresponding   operations  on   access  graphs   are   performed  using
1190: factorization and extension.
1191: %
1192: Given a node \mbox{$m \in (N  - \{n_0\})$} of an access graph $G$, the
1193: {\em Remainder Graph\/} of $G$ at $m$ is the subgraph of $G$ rooted at
1194: $m$ and is denoted by $\subG{G}{m}$. If $m$ does not have any outgoing
1195: edges,  then the  result is  the  empty remainder  graph $\EFG$. 
1196: Let  $M$ be a subset
1197: of the  nodes of $G'$  and $M'$ be  the set of corresponding  nodes in
1198: $G$. Then,  $G/(G',M)$ computes the  set of remainder  graphs of
1199: the successors of nodes in $M'$.
1200: %
1201: \begin{eqnarray}
1202:   G/(G',M) &=& \{\subG{G}{n_j} \mid n_i \rightarrow n_j \in E, n_i \in
1203:   \cNodes{G}{G'}{M}\}
1204:   \label{eq:factorization}
1205: \end{eqnarray}
1206: 
1207: A remainder  graph is similar  to an  access graph  except that  (a) its
1208: entry node does not correspond to  a root variable but to a field name
1209: and (b) the  entry node can have incoming edges.  
1210: 
1211: \item{\em  Extension}. Extending an empty access graph $\Empty\!_G$
1212: results in the empty access graph $\Empty\!_G$. For non-empty graphs,
1213: this operation is defined as follows.
1214:   \begin{enumerate}
1215:   \item{\em Extension with a remainder graph}
1216:     ($\cdot$). 
1217: Let $M$ be a subset of the nodes of $G$ and
1218:     \mbox{$R \equiv \langle n',\, N^{R}, \,E^{R} \rangle$} be a remainder graph.
1219: Then,    \appendG{(G,M)}{R} appends the suffixes in $R$ to the access paths ending
1220:     on nodes in $M$.
1221:     \begin{eqnarray}
1222:       \appendG{(G,M)}{\EFG} &=& G \nonumber \\
1223:       \appendG{(G,M)}{R} &=&
1224:       \left\langle n_0, N \cup N^{R}, 
1225:       E \cup E^{R} \cup 
1226:     \{n_i \rightarrow n' \mid n_i \in M\}
1227:       \right\rangle
1228:   \label{eq:extension.1}
1229:     \end{eqnarray}
1230: 
1231:   \item {\em Extension with a set of remainder graphs}
    ($\#$). Let $S$ be a set of remainder graphs. Then, \extend{(G,M)}{S} extends access 
    graph $G$ at the nodes in $M$ with every
    remainder graph in $S$.
1235:     \begin{eqnarray}
1236:       \extend{(G,M)}{\emptyset} &=& \Empty\!_G \nonumber \\
1237:       \extend{(G,M)}{S} &=&
1238:       \displaystyle\mathop{\bigcupG}_{R \in S}\;
1239:       \appendG{(G,M)}{R}
1240:   \label{eq:extension.2}
1241:     \end{eqnarray}
1242:   \end{enumerate}
1243: \end{enumerate}
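
To illustrate path removal and extension, we work through small
hypothetical graphs; the node labels and field names below are chosen for
illustration and are not taken from any figure.  Let $G$ be the access graph
with entry node $x$ and edges $x \rightarrow r_1$, $r_1 \rightarrow n_2$ and
$n_2 \rightarrow n_3$, and let \mbox{$\rho = x\myarrow r\myarrow n$}.  Then
$\Base(\rho)$ is \mbox{$x\myarrow r$}, the only node of $G$ corresponding to
the last node of $G^B$ is $r_1$, every path from the entry node to $r_1$
represents \mbox{$x\myarrow r$}, and the edge $r_1 \rightarrow n_2$ satisfies
\mbox{$\FieldName(n_2) = \Front(\rho)$}.  Hence
\begin{eqnarray*}
  E_{del} & = & \{ r_1 \rightarrow n_2 \} \\
  G \minus \rho & = & \clean(\langle x,\, \{x, r_1, n_2, n_3\},\,
        \{x \rightarrow r_1,\, n_2 \rightarrow n_3\} \rangle) \\
    & = & \langle x,\, \{x, r_1\},\, \{x \rightarrow r_1\} \rangle
\end{eqnarray*}
If $G$ also contained the self loop $r_1 \rightarrow r_1$, the paths from
the entry node to $r_1$ would represent \mbox{$x\myarrow r$},
\mbox{$x\myarrow r\myarrow r$}, and so on.  The uniqueness condition would
then fail, giving $E_{del} = \emptyset$ and $G \minus \rho = G$, so that
\mbox{$x\myarrow r\myarrow n$} would be retained even though it has $\rho$
as a prefix.

For extension with a remainder graph, let $G'$ have entry node $y$ and the
single edge $y \rightarrow n_4$, let $M = \{n_4\}$, and let
\mbox{$R \equiv \langle n_5,\, \{n_5, r_6\},\, \{n_5 \rightarrow r_6\} \rangle$}.
Instantiating the definition gives
\begin{eqnarray*}
  \appendG{(G',M)}{R} & = & \langle y,\, \{y, n_4, n_5, r_6\},\,
        \{y \rightarrow n_4,\, n_5 \rightarrow r_6,\,
          n_4 \rightarrow n_5\} \rangle
\end{eqnarray*}
which represents the access paths $y$, \mbox{$y\myarrow n$},
\mbox{$y\myarrow n\myarrow n$} and \mbox{$y\myarrow n\myarrow n\myarrow r$}.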
1244: 
1245: \subsubsection{Safety of Access Graph Operations}
1246: 
1247: \begin{figure}[t]
1248: \[
1249: \renewcommand{\arraystretch}{1.2}
1250: \begin{array}{|l|l|l|}
1251: \hline 
1252: \mbox{Operation} &
1253: \mbox{Access Graphs} &
1254: \mbox{Access Paths} 
1255: \\ \hline
1256: \hline 
1257: \mbox{Union} & G_3 = G_1 \cupG\> G_2 
1258: & \AP{G_3,M_3} \supseteq \AP{G_1,M_1} \cup \; \AP{G_2,M_2} 
1259: \\ \hline
1260: \mbox{Path Removal} & G_2 = G_1 \minus\; \rho
1261: & \AP{G_2,M_2} \supseteq 
1262: \AP{G_1,M_1} - \; \{\rho\myarrow\sigma \mid \rho\myarrow\sigma \in \AP{G_1,M_1}\}
1263: \\ \hline
1264: \mbox{Factorization} & S = G_1/(G_2,M)	
1265: & \AP{S,M_s} = \{ \sigma \mid \rho'\myarrow\sigma \in \AP{G_1,M_1},
1266:                 \rho' \in \AP{G_2,M} \} 
1267: \\ \hline
1268: \mbox{Extension} & G_2 = \extend{(G_1,M)}{S} &
1269: \AP{G_2,M_2} \supseteq \AP{G_1,M_1} \cup \{ \rho\myarrow\sigma \mid  \rho \in \AP{G_1,M},
1270:                       \sigma \in \AP{S,M_s} \}
1271: \\ \hline
1272: \end{array}
1273: \]
1274: \caption{Safety of Access Graph Operations. $\AP{G,M}$ is the set of
1275:   paths in graph $G$ terminating on nodes in $M$.  
1276:   For graph $G_i$, $M_i$ is the set of all nodes in $G_i$. $S$ is the
1277:   set of remainder graphs and \AP{S,M_s} is the
1278:   set of all paths in all remainder graphs in $S$.}
1279: \label{fig:G.ops.properties}
1280: \rule{\textwidth}{.2mm}
1281: \end{figure}
1282: 
1283: Since access graphs are not exact representations of sets of access
1284: paths, the safety of approximations needs to be defined
1285: explicitly.  The constraints defined in
1286: Figure~\ref{fig:G.ops.properties} 
1287: capture safety in the context of liveness in the following sense: Every access
1288: path which can possibly be live should be retained by each operation.
1289: Since the complement of liveness is used for nullification, this ensures
1290: that no live access path is considered for nullification.
1291: These properties have been
1292: proved~\cite{hra.AG.Safety} using the PVS 
1293: theorem prover\footnote{Available from \url{http://pvs.csl.sri.com}.}.
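
The following sketch suggests why the constraint for union is stated with
$\supseteq$ rather than equality.  The graphs and the field names $a$, $b$
and $c$ are hypothetical and serve only as an illustration.  Let $G_1$ have
edges $x \rightarrow a_1$ and $a_1 \rightarrow b_2$, and let $G_2$ have edges
$x \rightarrow c_3$ and $c_3 \rightarrow a_1$.  Since nodes with the same
label are treated as identical, $G_3 = G_1 \cupG G_2$ is
\begin{eqnarray*}
  G_3 & = & \langle x,\, \{x, a_1, b_2, c_3\},\,
        \{x \rightarrow a_1,\, a_1 \rightarrow b_2,\,
          x \rightarrow c_3,\, c_3 \rightarrow a_1\} \rangle \\
  \AP{G_1,M_1} \cup \AP{G_2,M_2} & = & \{x,\; x\myarrow a,\;
        x\myarrow a\myarrow b,\; x\myarrow c,\; x\myarrow c\myarrow a\} \\
  \AP{G_3,M_3} & = & \AP{G_1,M_1} \cup \AP{G_2,M_2} \cup
        \{x\myarrow c\myarrow a\myarrow b\}
\end{eqnarray*}
The path \mbox{$x\myarrow c\myarrow a\myarrow b$} belongs to neither
argument; it appears because the shared node $a_1$ joins the edge
$c_3 \rightarrow a_1$ of $G_2$ with the edge $a_1 \rightarrow b_2$ of $G_1$.
In this instance the union adds a path but drops none, which is the
direction that safety for liveness requires.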
1294: 
1295: \subsection{Data Flow Analysis for Discovering Explicit Liveness}
1296: 
1297: \label{sec:live-analysis}
1298: %
1299: For a given root variable \mbox{$v$}, $\Lin{v}(i)$ and $\Lout{v}(i)$
1300: denote the access graphs representing explicitly live access paths at
1301: the entry and exit of basic block $i$.
1302: We use $\Empty\!_G$ as the initial value for $\Lin{v}(i)/\Lout{v}(i)$.
1303: 
1304: \begin{eqnarray}
1305: \Lin{v}(i) & = & 
1306: \left( \Lout{v}(i) \minus \Lkill{v}(i)\right) \cupG \Lgen{v}(i)
1307: 	\label{eq:Elin}
1308: \\
1309: \Lout{v}(i) & = & \left \{ \begin{array}{l@{\ \ \ }l}
1310: 		\graphA{v\myarrow *} & i = \exitnode,\; v \in \Global \\ 
1311: 		\Empty\!_G & i = \exitnode, \; v \not\in \Global \\
1312: 		\displaystyle\mathop{\bigcupG}_{s \in succ(i)}
1313: 		\; \Lin{v}(s) & \mbox{otherwise}
1314: 		\end{array}\right. 
1315: 	\label{eq:Elout}	
1316: \end{eqnarray}
1317: %
1318: where
1319: \begin{equation*}
1320: \Lgen{v}(i) = \LD{v}(i) \cupG \LT{v}(i)
1321: \end{equation*}
1322: 
1323: 
1324: We define $\Lkill{v}(i)$, $\LD{v}(i)$, and $\LT{v}(i)$ depending upon
1325: the statement.
1326: 
1327: \begin{enumerate}
1328: \item {\em Assignment statement\/} \mbox{$\alpha_x = \alpha_y$}. Apart
1329: from defining the desired terms for $x$ and $y$, we also need to
1330: define them for any other variable $z$. In the following equations,
1331: $G_x$ and $G_y$ denote $\graphA{\rho_x}$ and
1332: $\graphA{\rho_y}$ respectively, whereas $M_x$ and $M_y$ denote
1333: $\lNode{\graphA{\rho_x}}$ and  $\lNode{\graphA{\rho_y}}$ respectively.
1334: %
1335: \begin{eqnarray}
1336: \LD{x}(i) & \!\!=\!\!& \graphA{\Base(\rho_x)} \nonumber \\
1337: \LD{y}(i) & \!\!=\!\!& \left\{\!\! \renewcommand{\arraystretch}{1.1}
1338: 		\begin{array}{l@{\ \ \ }l}
1339: 		 \Empty\!_G &  \alpha_y {\mbox{ is {\em New\/} \ldots\ or \NULL}} \\
1340: 		\graphA{\Base(\rho_y)}  & \mbox{otherwise}
1341: 		\end{array}
1342: 		\right.
1343: 		\nonumber   \\
1344: \LD{z}(i) & \!\!=\!\!& \Empty\!_G, \mbox{for any variable } z \mbox{ other than } x \mbox{ and }
1345: 	y \nonumber  \\
1346: \LT{y}(i) & \!\!=\!\! & \left\{\!\! \renewcommand{\arraystretch}{1.1}
1347: 		\begin{array}{l@{\ }l}
1348: 		 \Empty\!_G &  \alpha_y \mbox{ is {\em New\/} or \NULL} \\
1349: 		\extend{(G_y, M_y)}{} & \mbox{otherwise} \\
1350: 		\;\;\;\;\;\;\;\;
1351: 		{(\Lout{x}(i)/( G_x, M_x))} 
1352: 		\end{array}
1353: 		\right.
1354: \label{eq:xfer.y.asgn}\\
1355: \LT{z}(i) & \!\!=\!\! & \Empty\!_G, \mbox{ for any variable } z 
1356: 	\mbox{ other than } y \nonumber  \\
1357: %\end{eqnarray*}
1358: %\begin{eqnarray*}
1359: \Lkill{x}(i) & \!\!=\!\! & \rho_x \nonumber \\
1360: \Lkill{z}(i) & \!\!=\!\! &\Empty, \mbox{ for any variable } z 
1361: 	\mbox{ other than } x \nonumber   
1362: \end{eqnarray}
1363: %
1364: 
1365: As stated earlier, the path removal operation deletes an edge only if it
1366: is  contained in  a unique  path. Thus  fewer paths  may be  killed than
1367: desired. This  is a safe approximation.  Another  approximation which is
1368: also  safe is  that only  the  paths rooted  at $x$  are killed.   Since
1369: assignment   to    $\alpha_x$   changes   the    link   represented   by
1370: $\Front(\rho_x)$, for precision, any path which is guaranteed to contain
1371: the link  represented by $\Front(\rho_x)$  should also be  killed.  Such
1372: paths  can be  discovered through  must-alias analysis  which we  do not
1373: perform.
1374: 
1375: \begin{figure}[t]
1376: \hfill{\includegraphics{fig-liveness-info-1.epsi}}\hfill\mbox{}
1377: \caption{Explicit liveness for the program in Figure
1378:   \ref{fig:memory.graph} under 
1379: the assumption that all variables are local variables.}
1380: \label{fig:liveness.info.1}
1381: \rule{\textwidth}{.2mm}
1382: \end{figure}
1383: 
1384: 
1385: \item  {\em  Function  call\/}  \mbox{$\alpha_x =  f(\alpha_y)$}.   We
1386:   conservatively assume that a function  call may make any access path
1387:   rooted at $y$ or any global reference variable live. 
1388:   Thus this version of our analysis is context insensitive. 
1389: \begin{eqnarray*}
1390: \LD{x}(i) & \!\!=\!\!& \graphA{\Base(\rho_x)}  \\
1391: \LD{y}(i) & \!\!=\!\!&  \graphA{\rho_y\myarrow *} \\
1392: \LD{z}(i) & \!\!=\!\!& \left\{\!\! \renewcommand{\arraystretch}{1.1}
1393: 		\begin{array}{l@{\ }l}
1394: 	\graphA{z\myarrow *} & \mbox{if $z$ is a global variable}   \\
1395: 	\Empty\!_G & \mbox{otherwise}
1396: 		\end{array}
1397: 		\right.
1398: 		\\
1399: \LT{z}(i) & \!\!=\!\! & \Empty\!_G, \mbox{ for all variables } z  \\
1400: %\end{eqnarray*}
1401: %\begin{eqnarray*}
1402: \Lkill{x}(i) & \!\!=\!\! & \rho_x \nonumber \\
1403: \Lkill{z}(i) & \!\!=\!\! &\Empty, \mbox{ for any variable } z 
1404: 	\mbox{ other than } x \nonumber   
1405: \end{eqnarray*}
1406: %
1407: \item {\em Return Statement} $\return \; \alpha_x$.
1408: \begin{eqnarray*}
1409: \LD{x}(i) & \!\!=\!\!&  \graphA{\rho_x\myarrow *} \\
1410: \LD{z}(i) & \!\!=\!\!& \left\{\!\! \renewcommand{\arraystretch}{1.1}
1411: 		\begin{array}{l@{\ }l}
1412: 	\graphA{z\myarrow *} & \mbox{if $z$ is a global variable }   \\
1413: 	\Empty\!_G & \mbox{otherwise}
1414: 		\end{array}
1415: 		\right.
1416: 		\\
1417: \LT{z}(i) & \!\!=\!\! & \Empty\!_G, \mbox{ for any variable } z  \\
1418: \Lkill{z}(i) & \!\!=\!\! & \Empty, \mbox{ for any variable } z  
1419: \end{eqnarray*}
1420: 
1421: \item {\em Use Statements}
1422: \begin{eqnarray*}
1423: %\LD{y}(i) & = & \bigcupG\; \graphA{\Base(\rho_y)} \mbox{ \ for every } \alpha_y
1424: 		 %\mbox{ used in } i \label{eq:direct.use.1} \\
1425: \LD{x}(i) & = & \bigcupG\; \graphA{\rho_x} \mbox{ \ for every } 
1426: 		\alpha_x.d \mbox{ used in } i \label{eq:direct.use.2} \\
\LD{z}(i) & = & \Empty\!_G \mbox{ \ for any variable $z$ other than $x$}
1428: 		\label{eq:direct.use.z} \\
1429: \LT{z}(i) & = & \Empty\!_G, \mbox{ for every variable } z  \label{eq:xfer.z.use} \\
1430: \Lkill{z}(i) & = & \Empty, \mbox{ for every variable } z  \label{eq:kill.z.use} 
1431: \end{eqnarray*}
1432: \end{enumerate}
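
As an illustration of the assignment case above (a sketch only, assuming
a hypothetical statement $i$ of the form \mbox{$x.n = y.r$} and ignoring
aliasing), $\rho_x$ is \mbox{$x\myarrow n$} and $\rho_y$ is
\mbox{$y\myarrow r$}, so that $\Base(\rho_x)$ is $x$ and $\Base(\rho_y)$
is $y$. Instantiating the equations gives
\begin{eqnarray*}
\LD{x}(i) & = & \graphA{x} \\
\LD{y}(i) & = & \graphA{y} \\
\Lkill{x}(i) & = & x\myarrow n
\end{eqnarray*}
Read in terms of access paths rather than access graphs, the transfer
term carries a path of the form \mbox{$x\myarrow n\myarrow\sigma$} that
is live after $i$ over to \mbox{$y\myarrow r\myarrow\sigma$} before $i$.
For instance, if \mbox{$x\myarrow n\myarrow n$} is live at the exit of
$i$, then \mbox{$y\myarrow r\myarrow n$} is live at its entry.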
1433: 
1434: 
1435: %%\del{
1436: %%Once explicitly live access graphs are computed, implicitly live
1437: %%access graphs at a given program point can be discovered by computing
1438: %%may-link-aliases of explicitly live access graphs at that
1439: %%point.
1440: %%%%
1441: %%{ Mathematically, 
1442: %%
1443: %%\begin{eqnarray*}
1444: %%\CLin{v}(i) & = & 
1445: %%\bigcupG\;\{ G_v \mid G_v \in \LAliases(\Lin{u}(i), \NodeAin(i))
1446: %%   \mbox{ for some variable $u$} \}
1447: %%\\
1448: %%\CLout{v}(i) & = & 
1449: %%\bigcupG\;\{ G_v \mid G_v \in \LAliases(\Lout{u}(i), \NodeAout(i))
1450: %%   \mbox{ for some variable $u$} \}
1451: %%\end{eqnarray*}
1452: %%}
1453: %%}
1454: %%
1455: 
1456: \begin{example}
1457: \label{exmp:liveness.info.1}
1458: Figure~\ref{fig:liveness.info.1}  lists explicit  liveness information
1459: at different  points of  the program in  Figure \ref{fig:memory.graph}
1460: under the  assumption that all  variables are local  variables.
1461: %
1462: \mybox
1463: \end{example}
1464: 
1465: %%\akadd{
1466: Observe that computing liveness using equations~(\ref{eq:Elin}) and (\ref{eq:Elout}) 
1467: results in an MFP (Maximum Fixed Point) solution of data flow analysis whereas
1468: definition~(\ref{def:explicit.liveness}) specifies an MoP (Meet over Paths) solution
1469: of data flow analysis. 
1470: Since the flow functions are non-distributive (see appendix~\ref{sec:non-distributivity}),
1471: the two solutions may be different. 
1472: %%}
1473: 
1474: 
1475: 
1476: 
1477: \section{Other Analyses for Inserting \NULL\ Assignments}
1478: \label{sec:other.analyses}
1479: 
1480: \begin{figure}
1481: \begin{center}
1482: %\begin{tabular}{l}
1483: {\psset{unit=.8mm}
1484: %\renewcommand{\arraystretch}{.8}
1485: \begin{pspicture}(-3,4)(45,50)
1486: %\psframe(-3,4)(45,50)
1487: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1488: \psrelpoint{origin}{n1}{20}{45}
1489: \rput(\x{n1},\y{n1}){\rnode{n1}{1 \psframebox{$x = y$}\white 1\ }}
1490: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1491: \psrelpoint{n1}{n2}{-12}{-10}
1492: \rput(\x{n2},\y{n2}){\rnode{n2}{2 \psframebox{\white$x = \mbox{\em New}$}\white 2\ }}
1493: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1494: \psrelpoint{n1}{n3}{12}{-10}
1495: \rput(\x{n3},\y{n3}){\rnode{n3}{3 \psframebox{$x.n = \mbox{\em New}$}\white 3\ }}
1496: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1497: \psrelpoint{n3}{n4}{-12}{-10}
1498: \rput(\x{n4},\y{n4}){\rnode{n4}{4 \psframebox{$x.n = \NULL?$}\white 4\ }}
1499: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1500: \psrelpoint{n4}{n5}{-12}{-12}
1501: \rput(\x{n5},\y{n5}){\rnode{n5}{5 \psframebox{\white$y = x.r$}\white 5\ }}
1502: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1503: \psrelpoint{n4}{n6}{12}{-12}
1504: \rput(\x{n6},\y{n6}){\rnode{n6}{6 \white\psframebox{$y = x.r$}\white 6\ }}
1505: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1506: \psrelpoint{n6}{n7}{0}{-10}
1507: \rput(\x{n7},\y{n7}){\rnode{n7}{7 \psframebox{$x.n.n = \mbox{\em New}$}\white 7\ }}
1508: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1509: \ncline{->}{n1}{n2}
1510: \ncline{->}{n1}{n3}
1511: \ncline{->}{n2}{n4}
1512: \ncline{->}{n3}{n4}
1513: \ncline{->}{n4}{n5}
1514: \bput[0pt](.7){\small T}
1515: \ncline{->}{n4}{n6}
1516: \aput[0pt](.7){\small F}
1517: \ncline{->}{n6}{n7}
1518: \end{pspicture}
1519: }
1520: %\end{tabular}
1521: \end{center}
1522: \caption{Explicit liveness information is not sufficient for nullification.}
1523: 
1524: \label{fig:pgm.nullasgn}
1525: \rule{\textwidth}{.2mm}
1526: \end{figure}
1527: 
1528: 
1529: Explicit liveness alone is not  enough to decide whether an assignment
1530: \mbox{$\alpha_x = \NULL$}  can be safely inserted at  $p$.  We have to
1531: additionally ensure that:
1532: 
1533: \begin{itemize}
1534: \item $\Front(\rho_x)$ is not live through an alias created before the
1535: program  point $p$.  The extensions  required to  find all  live access
paths, including those created due  to aliases, are discussed in section
1537: \ref{sec:aliasing.def}.
1538: \item Dereferencing links  during the
1539: execution of the inserted statement \mbox{$\alpha_x = \NULL$} does not
1540: cause an exception.  This is done through {\em  availability} and {\em
1541: anticipability}    analysis   and    is    described   in    section
1542: \ref{sec:availability.def}.
1543: \end{itemize}
1544: 
1545: Both these requirements are illustrated through the example shown
1546: below:
1547: 
1548: 
1549: \begin{example}
1550: \label{exmp.illegal.dereference}
1551: In   Figure~\ref{fig:pgm.nullasgn},  access  path \mbox{$y\myarrow n$}  
1552: is not explicitly live in block 6. However,  \mbox{$\Front(y\myarrow n)$} and 
1553: \mbox{$\Front(x\myarrow n)$} represent the same link due to the 
1554: assignment \mbox{$x=y$}. Thus \mbox{$y\myarrow n$} is implicitly live
1555: and setting it to \NULL\ in block 6 will raise an exception in block 7. 
1556: Also, \mbox{$x\myarrow n\myarrow n$}  
1557: is not live  in block 2.  However, it cannot be  set to
1558: \NULL\ since  the object pointed  to by \mbox{$x\myarrow n$}  does not
1559: exist  in  memory when  the  execution  reaches  block 2.   Therefore,
1560: insertion of \mbox{$x.n.n = \NULL$} in block 2 will raise an exception
1561: at run-time.  \mybox
1562: \end{example}
1563: 
1564: 
1565: 
1566: \subsection{Computing Live Access Paths}
1567: \label{sec:aliasing.def}
1568: 
1569: Recall that an access path is  live if it is either explicitly live or
1570:   shares its \Front\ with some  explicitly live path.  The property of
1571:   sharing is  captured by {\em  aliasing}.  Two access  paths $\rho_x$
1572:   and  $\rho_y$  are  {\em   aliased\/}  at  a  program  point~$p$  if
  $\Target(\rho_x)$ is  the same as  $\Target(\rho_y)$ at $p$  during some
1574:   execution of  the program.  They  are {\em link-aliased\/}  if their
1575:   frontiers represent the same  link; they are {\em node-aliased\/} if
1576:   they are aliased but their frontiers do not represent the same link.
1577:   Link-aliases   can   be   derived   from  node-aliases   (or   other
1578:   link-aliases)  by adding  the  same field  names  to aliased  access
1579:   paths.
1580: 
1581: Alias  information  is {\em  flow-sensitive\/}  if  the  aliases at  a
1582: program  point  depend on  the  statements  along  control flow  paths
reaching  the point.  Otherwise  it is  flow-insensitive.   Among flow-sensitive
aliases, two access paths are {\em must-aliased\/} at $p$ if
1585: they are aliased along every  control flow path reaching $p$; they are
1586: {\em may-aliased\/} if  they are aliased along some  control flow path
1587: reaching  $p$.   As   an  example,  in  Figure~\ref{fig:memory.graph},
1588: \mbox{$x\myarrow\lptr$}   and    \mbox{$y$}   are   must-node-aliases,
1589: \mbox{$x\myarrow\lptr\myarrow\lptr$}  and  \mbox{$y\myarrow\lptr$} are
1590: must-link-aliases, and $w$ and $x$ are node-aliases at line 5.
1591: 
1592: 
We compute flow-sensitive may-aliases (without kills) using the algorithm described by
\citeN{hind99interprocedural}  and  use  pairs  of  access  graphs  for
compact  representation of  aliases.  Liveness  is computed  through a
backward  propagation much  in the  same manner  as  explicit liveness
except that it is ensured that the set of live paths at each program point is
closed under may-aliasing. This  requires the following two changes in
the earlier  scheme.
1600: \begin{enumerate}
1601: \item {\em Inclusion of Intermediate Nodes in Access Graphs\/}. Unlike
1602: explicit liveness, live  access paths may not be  prefix closed.  This
1603: is because the frontier of a live access path $\rho_x$ may be accessed
1604: using  some  other  access  path  and  not  through  the  links  which
constitute $\rho_x$. Hence prefixes of $\rho_x$ may not be live, and not
every path in an access graph representing liveness represents a live
link. We therefore modify the access graph so that such paths are
not described by it. In order to make this distinction,
we divide the nodes in an access graph into two categories: {\em
final\/} and {\em intermediate\/}.  The only access paths described by
the access graph are those which end at final nodes.
1612: \footnote{These  two  categories  are  completely  orthogonal  to  the
1613: labeling  criterion of  the nodes.}   
1614: This change  affects  the access
1615: graph operations in the following manner:
1616:       \begin{itemize}
1617:       \item The equality of graphs now must consider equality of the 
1618:             sets of intermediate nodes and the sets of final nodes
1619:             separately.
1620:       \item Graph  constructor \graphA{\rho_x} marks all  nodes in the
1621:             resulting  graph  as  final  implying that  all  non-empty
1622:             prefixes  of  $\rho_x$ are  contained  in  the graph.   We
1623:             define  a new  constructor \newGraphO{\rho_x}  which marks
1624:             only  the  last node  as  final  and  all other  nodes  as
1625:             intermediate implying  that only $\rho_x$  is contained in
1626:             the graph.
1627:       \item Whenever multiple nodes with identical labels are combined, if any 
1628:             instance of
1629:             the node is final then the resulting node is treated as final. This 
1630: 	    influences union $(\!\!\cupG\!\!)$ and extension (\extend{}{}).
1631: 	\item The set $M$ used in defining factorization and extension 
1632: 	(equations~\ref{eq:factorization}, \ref{eq:extension.1}, \ref{eq:extension.2})
1633: 	and the safety properties of access graph operations 
1634: 	(Figure~\ref{fig:G.ops.properties})
	contains final nodes only.
1636: 	\item Extension $\appendG{G}{RG}$ marks all nodes in $G$ as intermediate.
1637: 	      If $G$ and $RG$ have a common node then the status of the node is 
1638: 	       governed by its status in $RG$.
1639:         \item The $\clean(G)$ operation is modified to delete those intermediate nodes
1640:               which do not have a path leading to a final node.
1641:       \end{itemize}
\item {\em Link Alias Closure\/}. To  discover all  link  aliases of  a  
      live access path, we compute the link alias closure as defined below.
1644: Given an alias set \Aset, the set of link aliases of an access path 
1645: $\rho_x\myarrow f$ is the least solution of:
1646: \begin{equation*}
1647: \LnA(\rho_x\myarrow f,\Aset)  = \{ \rho_y\myarrow f \mid 
1648: 	\langle \rho_x,\rho_y\rangle \in \Aset \mbox{ or }
	\rho_y \in \LnA(\rho_x,\Aset) 
1650: 	\}
1651: \end{equation*}
1652: 
 Given an alias pair $\langle g_x,g_y\rangle$, the
link aliases of $G_x$ rooted at $y$ are included
in the access graph $G_y$ as follows:
1656: \begin{equation}
1657:   \LnG(G_y,G_x, \langle g_x,g_y\rangle)= G_y  \cupG
\extend{(g_y,m_y)}{((G_x/(g_x,m_x)) \!-\!\EFG)}
1659:   %\rho_x \in G_x,
1660: 	\label{eq:link.alias.computation} 
1661: \end{equation}
1662: where $m_y$ and $m_x$ are the singleton sets containing the final nodes of
1663: $g_y$ and $g_x$ respectively.
$\EFG$ has to be removed from the set of remainder graphs because we want
to transfer non-empty links only. 
Complete liveness is computed as the least solution of the following
equations:
1668: %
1669: \begin{eqnarray*}
1670: %\TLin{v}(i) & = & 
1671: %\left( \CLout{v}(i) \minus \Lkill{v}(i)\right) \cupG \Lgen{v}(i) \\
1672: \CLin{v}(i) & = & \TLin{v}(i) 
1673: 	\displaystyle\mathop{\bigcupG}_{\raisebox{.1mm}{$\scriptstyle
1674: 		\langle g_v,g_u\rangle$} 
1675: 		\in 
1676: 		\raisebox{-.2mm}{\NodeAin(i)}}
1677: \LnG\left(\CLin{v}(i),\CLin{u}(i), \langle g_v,g_u\rangle \right)
1678: \\
1679: \CLout{v}(i) & = & \left \{ \begin{array}{l@{\ \ \ }l}
1680: 	\graphA{v\myarrow *}
1681:                 & i = \exitnode, \; v \in \Global \\
1682:                 &\mbox{or }  v \in \param\\
1683: \Empty\!_G \cupG 
1684: 	\LnG\left( \CLout{v}(i),
1685:                    \CLout{u}(i), \langle g_v,g_u\rangle 
1686:            \right)
1687: 		& i = \exitnode,\; v \not\in \Global,
1688:  \\ 
1689: 		& \langle g_v,g_u\rangle \in \NodeAout(i)
1690: \\
1691: 		\displaystyle\mathop{\bigcupG}_{s \in succ(i)}
1692: 		\; \CLin{v}(s) & \mbox{otherwise}
1693: 		\end{array}\right. 
1694: \end{eqnarray*}
1695: %
where $\TLin{v}(i)$ is the same as $\Lin{v}(i)$
1697: except that $\Lout{v}(i)$ is replaced by
1698: $\CLout{v}(i)$ in the main equation (equation \ref{eq:Elin}) and in
1699: the computation of {\em Transfer\/} (equation \ref{eq:xfer.y.asgn}).
1700: 
1701: \end{enumerate}
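
As an illustration of the two changes above (a sketch, assuming that the
alias set is $\Aset = \{\langle x, y\rangle\}$, as would result from the
assignment \mbox{$x = y$} in Figure~\ref{fig:pgm.nullasgn}), consider
first the two constructors: \graphA{x\myarrow n\myarrow n} describes the
paths $x$, \mbox{$x\myarrow n$} and \mbox{$x\myarrow n\myarrow n$},
whereas \newGraphO{x\myarrow n\myarrow n} describes only
\mbox{$x\myarrow n\myarrow n$}. For link alias closure, the defining
equation yields
\begin{eqnarray*}
\LnA(x\myarrow n,\Aset) & = & \{ y\myarrow n \} \\
\LnA(x\myarrow n\myarrow n,\Aset) & = & \{ y\myarrow n\myarrow n \}
\end{eqnarray*}
The first set follows directly from the pair $\langle x, y\rangle$ in
\Aset. The second follows because \mbox{$y\myarrow n$} is a link alias
of \mbox{$x\myarrow n$}, so appending the field $n$ to both paths again
gives link-aliased paths. Consequently, if \mbox{$x\myarrow n$} is
explicitly live, then \mbox{$y\myarrow n$} is live too, which is the
kind of implicit liveness described in
Example~\ref{exmp.illegal.dereference}.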
1702: 
1703: 
1704: \begin{figure}[t]
1705: \newcommand{\pair}[2]{\mbox{$\left\langle \raisebox{-1.3mm}{#1}, 
1706: 	\raisebox{-1.3mm}{#2}\right\rangle$}}
1707: \newcommand{\Gx}
1708: {\scalebox{.95}{
1709: {\psset{unit=.8mm}
1710: 		\begin{pspicture}(1,.25)(9,5.5)
1711: 		%\psframe(0,0)(11,6)
1712: 		\putnode{a}{origin}{0}{3}{}
1713: 		\putnode{x}{a}{8}{0}{\pscirclebox{$\,x\,$}}
1714: 		\ncline[doubleline=true]{->}{a}{x}
1715: 		\end{pspicture}
1716: 		}
1717: }
1718: }
1719: \newcommand{\Gxr}
1720: {\scalebox{.95}{
1721: {\psset{unit=.8mm}
1722: 		\begin{pspicture}(1,.5)(17.5,5.5)
1723: 		%\psframe(0,0)(20,6)
1724: 		\putnode{a}{origin}{0}{3}{}
1725: 		\putnode{x}{a}{8}{0}{\pscirclebox[linestyle=dashed, dash=.6 .6]{$x$}}
1726: 		\putnode{r}{x}{9}{0}{\pscirclebox[framesep=.8]{$r_3$}}
1727: 		\ncline[doubleline=true]{->}{a}{x}
1728: 		\ncline{->}{x}{r}
1729: 		\end{pspicture}
1730: 		}
1731: }
1732: }
1733: \newcommand{\Gxl}
1734: {\scalebox{.95}{
1735: {\psset{unit=.8mm}
1736: 		\begin{pspicture}(1,.5)(17.5,5.5)
1737: 		%\psframe(0,0)(20,6)
1738: 		\putnode{a}{origin}{0}{3}{}
1739: 		\putnode{x}{a}{8}{0}{\pscirclebox[linestyle=dashed, dash=.6 .6]{$x$}}
1740: 		\putnode{r}{x}{9}{0}{\pscirclebox[framesep=.7]{$l_4$}}
1741: 		\ncline[doubleline=true]{->}{a}{x}
1742: 		\ncline{->}{x}{r}
1743: 		\end{pspicture}
1744: 		}
1745: }
1746: }
1747: \newcommand{\Gw}
1748: {\scalebox{.95}{
1749: {\psset{unit=.8mm}
1750: 		\begin{pspicture}(1,.5)(9,5.5)
1751: 		%\psframe(0,0)(11,6)
1752: 		\putnode{a}{origin}{0}{3}{}
1753: 		\putnode{x}{a}{8}{0}{\pscirclebox{$w$}}
1754: 		\ncline[doubleline=true]{->}{a}{x}
1755: 		\end{pspicture}
1756: 		}
1757: }
1758: }
1759: \newcommand{\Gy}
1760: {\scalebox{.95}{
1761: {\psset{unit=.8mm}
1762: 		\begin{pspicture}(1,.5)(9,5.5)
1763: 		%\psframe(0,0)(11,6)
1764: 		\putnode{a}{origin}{0}{3}{}
1765: 		\putnode{x}{a}{8}{0}{\pscirclebox{$y$}}
1766: 		\ncline[doubleline=true]{->}{a}{x}
1767: 		\end{pspicture}
1768: 		}
1769: }
1770: }
1771: \newcommand{\Gyl}
1772: {\scalebox{.95}{
1773: {\psset{unit=.8mm}
1774: 		\begin{pspicture}(1,.5)(17.5,5.5)
1775: 		%\psframe(0,0)(20,6)
1776: 		\putnode{a}{origin}{0}{3}{}
1777: 		\putnode{x}{a}{8}{0}{\pscirclebox[linestyle=dashed, dash=.6 .6]{$y$}}
1778: 		\putnode{r}{x}{9}{0}{\pscirclebox[framesep=.7]{$l_6$}}
1779: 		\ncline[doubleline=true]{->}{a}{x}
1780: 		\ncline{->}{x}{r}
1781: 		\end{pspicture}
1782: 		}
1783: }
1784: }
1785: 
1786: \[
1787: \begin{array}{|c|l|l|} \hline
1788: i & \multicolumn{1}{|c|}{\NodeAin(i)}  & \multicolumn{1}{|c|}{\NodeAout(i)} \\ \hline
1789: \hline
1790: \rule[-.75em]{0em}{2em}% 
1791: 1 & \multicolumn{1}{|c|}{\emptyset}          & \pair{\Gx}{\Gw} \\\hline
1792: \rule[-.75em]{0em}{2em}%
1793: 2 & \pair{\Gx}{\Gw}, \pair{\Gx}{\Gxr}
1794:   & \pair{\Gx}{\Gw}, \pair{\Gx}{\Gxr} \\\hline
1795: \rule[-.75em]{0em}{2em}%
1796: 3 & \pair{\Gx}{\Gw}, \pair{\Gx}{\Gxr}
1797:   & \pair{\Gx}{\Gw}, \pair{\Gx}{\Gxr} \\\hline
1798: \rule[-.75em]{0em}{2em}%
1799: 4 & \pair{\Gx}{\Gw}, \pair{\Gx}{\Gxr}
1800:   & \pair{\Gx}{\Gw}, \pair{\Gx}{\Gxr}, \\
1801: \rule[-.75em]{0em}{2em}%
1802:  & &  \pair{\Gy}{\Gxl} \\\hline
1803: \rule[-.75em]{0em}{2em}%
1804: 5 & \pair{\Gx}{\Gw}, \pair{\Gx}{\Gxr}, 
1805:   & \pair{\Gx}{\Gw}, \pair{\Gx}{\Gxr}\\
1806: \rule[-.75em]{0em}{2em}%
1807:  & \; \pair{\Gy}{\Gxl}
1808:   &  \;\pair{\Gy}{\Gxl} \\\hline
1809: \rule[-.75em]{0em}{2em}%
1810: 6 & \pair{\Gx}{\Gw}, \pair{\Gx}{\Gxr},
1811:   & \pair{\Gx}{\Gw}, \pair{\Gx}{\Gxr},\\
1812: \rule[-.75em]{0em}{2em}%
1813:  &  \;\pair{\Gy}{\Gxl}
1814:   &  \;\pair{\Gy}{\Gxl}, \pair{\Gy}{\Gyl} \\\hline
1815: \rule[-.75em]{0em}{2em}%
1816: 7 & \pair{\Gx}{\Gw}, \pair{\Gx}{\Gxr},
1817:   & \pair{\Gx}{\Gw}, \pair{\Gx}{\Gxr},\\
1818: \rule[-.75em]{0em}{2em}%
1819:  &  \;\pair{\Gy}{\Gxl}, \pair{\Gy}{\Gyl}
1820:   &  \;\pair{\Gy}{\Gxl}, \pair{\Gy}{\Gyl} \\\hline
1821: \end{array}
1822: \]
1823: 
1824:  
\caption{Alias pairs for the running example from 
	Figure~\protect\ref{fig:memory.graph}.}
\label{fig:alias.pairs}
\end{figure}

\begin{example}
\label{exmp.alias.info}
Figure~\ref{fig:alias.pairs} shows the may-alias information for our
running example from Figure~\ref{fig:memory.graph}. Observe that
1833: the access graphs used for storing alias information have only the last node
1834: as final and all other nodes as intermediate. 
1835: Figure~\ref{fig:link-alias.info} shows the liveness access graphs augmented
1836: with implicit liveness.  \mybox
1837: \end{example}
1838: 
1839: \begin{figure}[t]
1840: \begin{center}
1841: \includegraphics{fig-link-alias.epsi}
1842: \end{center}
1843: 
1844: 
1845: \caption{Liveness access graphs including implicit liveness
1846:   information for the program in Figure~\ref{fig:memory.graph}.
1847: Gray nodes are nodes included by link-alias computation. Intermediate nodes
1848: are shown with dotted lines.
1849: }
1850: \label{fig:link-alias.info}
1851: \rule{\textwidth}{.2mm}
1852: 
1853: \end{figure}
1854: 
1855: Observe that in the presence of cyclic data structures, we will get alias pairs of
1856: the form \mbox{$\langle \rho,\rho\myarrow\sigma\rangle$}. If a link in the cycle is 
1857: live then the link alias closure will ensure that all possible links are marked live by creating
1858: cycles in the access graphs. This may cause approximation but would be safe.
1859: 
1860: \subsection{Availability and Anticipability of Access Paths}
1861: \label{sec:availability.def}
1862: 
1863: Example~\ref{exmp.illegal.dereference} shows  that safety of inserting
1864: an  assignment  \mbox{$\alpha_x  =  \NULL$}  at  a  program  point~$p$
1865: requires  that   whenever  control   reaches  $p$,  every   prefix  of
1866: $\Base(\rho_x)$ has a non-\NULL\ l-value.  Such an access path is said
1867: to be {\em  accessible} at $p$.  Our use  of accessibility ensures the
1868: preservation  of  semantics  in   the  following  sense:  Consider  an
1869: execution path  which does not  have a dereferencing exception  in the
1870: unoptimized  program.  Then  the proposed  optimization will  also not
1871: have any dereferencing exception in the same execution path.
1872: 
1873: \subsubsection{Defining Availability and Anticipability}
1874: 
1875: We define  an access path $\rho_x$ to  be accessible at $p$  if all of
1876: its prefixes are  {\em available} or {\em anticipable} at $p$:
1877: \begin{itemize}
1878: \item  An access  path  $\rho_x$  is {\em  available\/}  at a  program
1879:   point~$p$, if along every path  reaching $p$, there exists a program
1880:   point~$p'$  such  that $\Front(\rho_x)$  is  either dereferenced  or
1881:   assigned a non-\NULL\ l-value at $p'$ and is not made \NULL\ between
  $p'$ and $p$.
1883: \item  An access  path $\rho_x$  is {\em  anticipable\/} at  a program
1884:   point~$p$, if  along every path starting  from $p$, $\Front(\rho_x)$
1885:   is dereferenced before being assigned.
1886: \end{itemize}
1887: Since both these properties are {\em all paths\/} properties, all
1888: may-link aliases of the left hand side of an assignment need to be killed.
1889: Conversely, these properties can be made more precise by including
1890: must-aliases in the set of anticipable or available paths.
1891: 
1892: 
Recall that  comparisons in conditionals consist of simple variables
1894: only.   The   use   of   these   variables  does   not   involve   any
1895: dereferencing.  Hence a  comparison $x  == y$  does not  contribute to
1896: accessibility of $x$ or $y$. 
1897: 
1898: \begin{figure}[t]
1899: \begin{center}
1900: \scalebox{1.2}{%
1901: $
1902: \begin{array}{|@{\ }l@{\ }|@{\ }c@{\ }|@{\ }c@{\ }|@{\ }c@{\ }|}
1903: \hline
1904: \mbox{Statement } s & \AVK_s & \AVD_s & \AVT_s(X) \\  \hline\hline
1905: \alpha_x = \alpha_y & 
1906: 	\{ \rho_z\myarrow * \mid  \rho_z \in \ \LnA(\rho_x,\NodeAin(s)) \} & 
1907: 	\prefix(\Base(\rho_x)) &
1908: 	\{ \rho_x\myarrow \sigma \mid \rho_y\myarrow \sigma \in X\} 
1909: 	\\
1910: & & \cup \prefix(\Base(\rho_y))  &
1911:  		\\ \hline
1912:  \alpha_x = f(\alpha_y) & 
1913: 	\{ \rho_z\myarrow * \mid  \rho_z \in \ \LnA(\rho_x,\NodeAin(s)) \} & 
1914: 	\prefix(\Base(\rho_x))
1915:  & \emptyset \\ 
1916:  		 \hline
1917: \alpha_x = \new & 
1918: 	\{ \rho_z\myarrow * \mid  \rho_z \in \ \LnA(\rho_x,\NodeAin(s)) \} & 
1919: 	\prefix(\rho_x) & \emptyset
1920:  		\\ \hline
1921: \alpha_x = \NULL & 
1922: 	\{ \rho_z\myarrow * \mid  \rho_z \in \ \LnA(\rho_x,\NodeAin(s)) \} & 
1923: 	\prefix(\Base(\rho_x)) & \emptyset
1924:  		\\ \hline
1925: %\use  \; \alpha_y  & \emptyset & \prefix(\Base(\rho_y)) & \emptyset \\ \hline
1926: \use  \; \alpha_y.d  & \emptyset & \prefix(\rho_y) & \emptyset
1927:  		\\ \hline
1928: \return \;  \alpha_y & \emptyset &  \prefix(\Base(\rho_y))
1929: & \emptyset 
1930:  		\\ \hline
1931: \mbox{other}  & \emptyset &  \emptyset & \emptyset 
1932:  		\\ \hline
1933: \end{array}
1934: $}
1935: \end{center}
1936: \caption{Flow functions for availability. $\NodeAin(s)$ denotes the set of
1937: may-aliases at the entry of $s$.}
1938: \label{fig:flow.fun.availability}
1939: \rule{\textwidth}{.2mm}
1940: \end{figure}
1941: 
1942: 
1943: \begin{definition}{\rm\bf Availability}.
1944: \label{def:availability}
1945: The set of paths which are available at a program point~$p$,
1946: denoted by $\avail_{p}$, is defined as follows.
1947:   \begin{eqnarray*}
1948:     \avail_p& = & 
1949:     \displaystyle\bigcap_{\onepath \in Paths(p)}(\Pavail^{\onepath}_{p}) 
1950:   \end{eqnarray*}
where $\onepath \in Paths(p)$ is a control flow path from \entrynode\ to $p$ and
1952: $\Pavail^{\onepath}_{p}$ denotes the
1953: availability at $p$ along $\onepath$ and is defined as follows. 
1954: If $p$ is not \entrynode\ of the procedure being analyzed, then let the statement 
1955: which precedes it be denoted by
1956:   $s$ and the program point immediately preceding $s$ be denoted by $p'$. Then,
1957:   \begin{eqnarray*}
1958:     \Pavail^{\onepath}_{p}& = & \left\{\begin{array}{cl}
1959:     %\emptyset & p \mbox{ is  \exitnode\ of {\tt main}} \\\
1960:     \emptyset & p \mbox{ is  \entrynode} \\
1961:     \Savail_{s}(\Pavail^{\onepath}_{p'})  & \mbox{otherwise}
1962:     \end{array}\right. 
1963:   \end{eqnarray*}
1964: where the flow function for $s$ is defined as follows:
1965:   \begin{eqnarray*}
1966:     \Savail_{s}(X) & = & (X - \AVK_s) \; \cup \; \AVD_s \;\cup\;\AVT_s(X) 
1967:   \end{eqnarray*}
$\AVK_s$ denotes the set of access paths which cease to be available after 
statement $s$, $\AVD_s$ denotes the set of access paths which become available due 
to the local effect of $s$, and $\AVT_s(X)$ denotes
the set of access paths which become available after $s$ due to transfer. 
1972: They are defined in Figure~\ref{fig:flow.fun.availability}. \mybox
1973: \end{definition}
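
As an illustration of the flow function (a sketch which reads
$\prefix(\rho)$ as the set of all non-empty prefixes of $\rho$,
including $\rho$ itself), consider the statement $s$:
\mbox{$x.n = \new$} of block 3 in Figure~\ref{fig:pgm.nullasgn}, for
which $\rho_x$ is \mbox{$x\myarrow n$}. From
Figure~\ref{fig:flow.fun.availability},
\mbox{$\AVD_s = \prefix(x\myarrow n) = \{x,\; x\myarrow n\}$} and
\mbox{$\AVT_s(X) = \emptyset$}. If only $x$ is available before $s$, then
\[
\Savail_{s}(\{x\}) \;=\; (\{x\} - \AVK_s) \;\cup\; \{x,\; x\myarrow n\} \;\cup\; \emptyset
	\;=\; \{x,\; x\myarrow n\}
\]
irrespective of $\AVK_s$, because $x$ is also contained in $\AVD_s$.
Thus \mbox{$x\myarrow n$} is available immediately after block 3.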
1974: 
1975: 
1976: In a similar manner, we define anticipability of access paths.
1977: 
1978: \begin{figure}[t]
1979: \begin{center}
1980: \scalebox{1.2}{%
1981: $
1982: \begin{array}{|@{\ }l@{\ }|@{\ }c@{\ }|@{\ }c@{\ }|@{\ }c@{\ }|}
1983: \hline
1984: \mbox{Statement } s & \ANTK_s & \ANTD_s & \ANTT_s(X) \\  \hline\hline
1985: \alpha_x = \alpha_y & 
1986: 	\{ \rho_z\myarrow * \mid  \rho_z \in \ \LnA(\rho_x,\NodeAout(s)) \} & 
1987: 	\prefix(\Base(\rho_x))  &
1988: 	\{ \rho_y\myarrow \sigma \mid \rho_x\myarrow \sigma \in X\} 
1989: 	\\
1990: & & \cup \prefix(\Base(\rho_y))  &
1991:  		\\ \hline
1992:  \alpha_x = f(\alpha_y) & 
1993: 	\{ \rho_z\myarrow * \mid  \rho_z \in \ \LnA(\rho_x,\NodeAout(s)) \} & 
1994: 	\prefix(\Base(\rho_x))
1995:  & \emptyset \\ 
1996: & & \cup \prefix(\Base(\rho_y))  &
1997: 		\\ 	 \hline
1998: \alpha_x = \new & 
1999: 	\{ \rho_z\myarrow * \mid  \rho_z \in \ \LnA(\rho_x,\NodeAout(s)) \} & 
2000: 	\prefix(\Base(\rho_x)) & \emptyset
2001:  		\\ \hline
2002: \alpha_x = \NULL & 
2003: 	\{ \rho_z\myarrow * \mid  \rho_z \in \ \LnA(\rho_x,\NodeAout(s)) \} & 
2004: 	\prefix(\Base(\rho_x)) & \emptyset
2005:  		\\ \hline
2006: %\use  \; \alpha_y  & \emptyset & \prefix(\Base(\rho_y)) & \emptyset \\ \hline
2007: \use  \; \alpha_y.d  & \emptyset & \prefix(\rho_y) & \emptyset
2008:  		\\ \hline
2009: \return \;  \alpha_y & \emptyset &  \prefix(\Base(\rho_y))
2010: & \emptyset 
2011:  		\\ \hline
2012: \mbox{other}  & \emptyset &  \emptyset & \emptyset 
2013:  		\\ \hline
2014: \end{array}
2015: $}
2016: \end{center}
2017: \caption{Flow functions for anticipability. $\NodeAout(s)$ denotes the set of
2018: may-aliases at the exit of $s$.}
2019: \label{fig:flow.fun.anticipability}
2020: \rule{\textwidth}{.2mm}
2021: \end{figure}
2022: 
2023: \begin{definition}{\rm\bf Anticipability}.
2024: \label{def:anticipability}
2025: The set of paths which are anticipable at a program point~$p$,
2026: denoted by $\ant_p$ is defined as follows.
2027:   \begin{eqnarray*}
2028:     \ant_p& = & 
2029:     \displaystyle\bigcap_{\onepath \in Paths(p)}(\Pant^{\onepath}_{p}) 
2030:   \end{eqnarray*}
where $\onepath \in Paths(p)$ is a control flow path from $p$ to \exitnode\ and
$\Pant^{\onepath}_{p}$ denotes the
anticipability at $p$ along $\onepath$ and is defined as follows. 
If $p$ is not \exitnode\ then let the statement which follows it be denoted by
  $s$ and the program point immediately following $s$ be denoted by $p'$. Then,
2036:   \begin{eqnarray*}
2037:     \Pant^{\onepath}_{p}& = & \left\{\begin{array}{cl}
    \emptyset & p \mbox{ is  \exitnode} \\
2039:     \Sant_{s}(\Pant^{\onepath}_{p'})  & \mbox{otherwise}
2040:     \end{array}\right. 
2041:   \end{eqnarray*}
2042: where the flow function for $s$ is defined as follows:
2043:   \begin{eqnarray*}
2044:     \Sant_{s}(X) & = & (X - \ANTK_s) \; \cup \; \ANTD_s \;\cup\;\ANTT_s(X) 
2045:   \end{eqnarray*}
$\ANTK_s$ denotes the set of access paths which cease to be anticipable before 
statement $s$, $\ANTD_s$ denotes the set of access paths which become anticipable due 
to the local effect of $s$, and $\ANTT_s(X)$ denotes
the set of access paths which become anticipable before $s$ due to transfer.
2050: They are defined in Figure~\ref{fig:flow.fun.anticipability}. \mybox
2051: \end{definition}
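
As an illustration (a sketch which reads $\prefix(\rho)$ as the set of
all non-empty prefixes of $\rho$, including $\rho$ itself, and which
assumes that nothing is anticipable after the statement, i.e.\
\mbox{$X = \emptyset$}), consider the statement $s$:
\mbox{$x.n.n = \new$} of block 7 in Figure~\ref{fig:pgm.nullasgn}, for
which $\rho_x$ is \mbox{$x\myarrow n\myarrow n$} and $\Base(\rho_x)$ is
\mbox{$x\myarrow n$}. From Figure~\ref{fig:flow.fun.anticipability},
\mbox{$\ANTD_s = \prefix(x\myarrow n) = \{x,\; x\myarrow n\}$} and
\mbox{$\ANTT_s(X) = \emptyset$}, so that
\[
\Sant_{s}(\emptyset) \;=\; (\emptyset - \ANTK_s) \;\cup\; \{x,\; x\myarrow n\} \;\cup\; \emptyset
	\;=\; \{x,\; x\myarrow n\}
\]
Thus $x$ and \mbox{$x\myarrow n$} are anticipable at the entry of block
7, because executing $s$ dereferences both. Note that, unlike the
availability entry for the same kind of statement, the path
\mbox{$x\myarrow n\myarrow n$} itself is not generated here.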
2052: 
2053: Observe that both $\avail_{p}$ and $\ant_{p}$ are prefix-closed.
2054: 
2055: \subsubsection{Data Flow Analyses for Availability and Anticipability}
2056: 
2057: Availability and Anticipability are {\em all (control-flow) paths\/}
2058: properties in that the desired property must hold along every path
2059: reaching/leaving the program point under consideration.  Thus these
2060: analyses identify access paths which are common to all control flow
2061: paths {\em including acyclic control flow paths}.  Since acyclic
2062: control flow paths can generate only acyclic\footnote{In the presence
2063: of cycles in heap, considering only acyclic access paths results in
2064: an approximation which is safe for availability and anticipability.} and hence
2065: finite access paths, anticipability and availability analyses deal
2066: with a finite number of access paths and summarization is not
2067: required.
2068: 
2069: Thus  there is  no  need to  use  access graphs  for availability  and
2070: anticipability analyses. The data flow analysis can be performed using
2071: a set  of access paths  because the access  paths are bounded  and the
2072: sets would be finite.  Moreover, since the access paths resulting from
2073: anticipability and availability are  prefix-closed, they can be
2074: represented efficiently.
2075: 
The data flow equations are the same as the definitions of these analyses
except that the definitions are path-based 
{(i.e., they define the MoP solution)}
while the data flow equations are edge-based 
{(i.e., they define the MFP solution)}
as is customary in data flow analysis. In other words,
the data flow information is merged at the intermediate 
points and availability and anticipability information is derived
from the corresponding information at the preceding and following program
points respectively. 
{As observed in appendix~\ref{sec:non-distributivity},
the flow functions in availability and anticipability analyses are
non-distributive; hence the MoP and MFP solutions may be different.
}
2090: 
2091: For brevity, we omit the data flow equations.
2092: We use the universal set of
2093: access paths as the initial value for all blocks other than
2094: \entrynode\ for availability analysis and \exitnode\ for
2095: anticipability analysis.
2096: 
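As a hedged illustration of the edge-based formulation, the availability
equations can be expected to take the following general shape. The
boundary value $\mbox{\em BI\/}$ and the flow function $f_i$ of statement
$i$ are placeholder names assumed here for illustration only; they are not
the definitions used in this paper, which remain omitted.
\begin{eqnarray*}
\Avin{}(i) & = & \left\{ \begin{array}{l@{\ \ \ }l}
    \mbox{\em BI\/} & i = \entrynode \\
    \displaystyle\bigcap_{j \in pred(i)} \Avout{}(j) & \mbox{otherwise}
  \end{array}\right. \\
\Avout{}(i) & = & f_i(\Avin{}(i))
\end{eqnarray*}
Intersection is the merge operation because availability is an all-paths
property, which is also why the universal set serves as the initial value.
Under the same assumptions, anticipability would be the backward dual, with
$\Antout{}(i)$ obtained by intersecting $\Antin{}(s)$ over the successors
$s$ of $i$.
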
2097: \begin{figure}[t]
2098: \hfill{\includegraphics{fig-av-ant.epsi}}\hfill\mbox{}
2099: \caption{Availability and anticipability for the program in Figure
2100:   \ref{fig:memory.graph}.}
2101: \label{fig:av-ant-info}
2102: \rule{\textwidth}{.2mm}
2103: \end{figure}
2104: 
2105: 
2106: 
2107: \begin{example} \label{exmp:av-ant-info}
  Figure~\ref{fig:av-ant-info} gives the availability and
  anticipability information for the program in Figure~\ref{fig:memory.graph}.
$\Avin{}(i)$ and $\Avout{}(i)$ denote the sets
of available access paths before and after the statement $i$, while
$\Antin{}(i)$ and $\Antout{}(i)$ denote the sets of anticipable access
paths before and after the statement $i$.
2114: %
2115: \mybox
2116: \end{example}
2117: 
2118: %%\subsection{Nullability of Access Paths}
2119: %%\label{sec:nullability.def}
2120: %%
2121: %%An access path $\rho_x$ is {\em nullable\/} at a program point~$p$ if
2122: %%the assignment \mbox{$\alpha_x = \NULL$} can be inserted at $p$
2123: %%without affecting the semantics of the program in any way.  As
2124: %%observed in Example~\ref{exmp.illegal.dereference}, safety of
2125: %%inserting \mbox{$\alpha_x = \NULL$} at $p$ requires that (a) $\rho_x$
2126: %%should not be live at $p$ and (b) every prefix of $\Base(\rho_x)$
2127: %%should be accessible.  Further, from considerations of efficiency,
2128: %%inserting \mbox{$\alpha_x = \NULL$} at $p$ is redundant, if (a) a
2129: %%proper prefix of $\rho_x$ is nullable at $p$, or (b) link
2130: %%corresponding to $\Front(\rho_x)$ has already been nullified before
2131: %%$p$.
2132: %%
2133: %%{The candidate access paths for \NULL\ assignment at program point $p$
2134: %%are created using the notion of accessibility as follows:
2135: %%All prefixes of accessible paths at $p$ are extended by the relevant field 
2136: %%names and the paths which are live at $p$ are excluded.
2137: %%Additional criteria capturing profitability is used for the final
2138: %%decision (section~\ref{sec:nullability}). 
2139: %%}
2140: %%
2141: %%\begin{example}
2142: %%\label{exmp:nullable.def.1}
2143: %%{Consider line 3 in the program in Figure~\ref{fig:memory.graph}(a).
2144: %%It is easy to see that the access path $x$ is both available and
2145: %%anticipable just before line 3. We extend $x$ and observe that
2146: %%\mbox{$x\myarrow\rptr$} is live and cannot be nullified. However,
2147: %%\mbox{$x\myarrow\lptr$} can be set to \NULL.  } \mybox
2148: %%\end{example}
2149: 
2150: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2151: %%\section{Data Flow Analysis for Heap References}
2152: %%\label{sec:dfe}
2153: %%
2154: %%In this section we define data flow analyses for capturing the
2155: %%properties of aliasing, liveness, availability, and anticipability of
2156: %%heap references.  
2157: %%The data flow equations approximate the specifications in
2158: %%Section~\ref{sec:heap.properties}.
2159: %%%We compute the {\em maximal fixed point\/} (MFP) solutions
2160: %%%of the data flow
2161: %%%equations.
2162: %%%
2163: %%\Boundary\ denotes a safe approximation of interprocedural
2164: %%information.  We do not perform must-alias analysis and conservatively
2165: %%assume that every access path is must-link-aliased only to itself.
2166: %%
2167: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2168: %%\subsection{Alias Analysis}
2169: %%\label{sec:alias-analysis}
2170: %%The data flow equations for alias analysis compute the may-node-alias
2171: %%relations defined in Section~\ref{sec:aliasing.def} using access
2172: %%graphs. The alias information is stored in the form of a set of access
2173: %%graph pairs. A pair \mbox{$ \langle G^i, G^j \rangle$} indicates that
2174: %%all access paths in $G^i$ are aliased to all access paths in
2175: %%$G^j$. Though the alias relation represented as access graph pairs is
2176: %%symmetric, we explicitly store only one of the pairs \mbox{$\langle
2177: %%G^i, G^j \rangle$} and \mbox{$\langle G^j, G^i \rangle$}.  A pair
2178: %%\mbox{$\langle G^i,G^j\rangle$} is removed from the set of alias pairs
2179: %%if $G^i$ or $G^j$ is $\Empty\!_G$.
2180: %%
2181: %%
2182: %%$\NodeAin(i)$ and $\NodeAout(i)$ denote the set of may-node-aliases
2183: %%before and after the statement $i$.  Their initial values are
2184: %%$\emptyset$.
2185: %%\begin{eqnarray}
2186: %%\NodeAin(i) & = & \left\{ \begin{array}{c@{\hspace*{1cm}}l}
2187: %%		\Boundary &  i = \entrynode\\
2188: %%		\displaystyle\bigcup_{p \in pred(i)}
2189: %%		\NodeAout(p) & \mbox{otherwise}
2190: %%			\end{array}
2191: %%		\right. \\
2192: %%\NodeAout(i) & = & (\NodeAin(i)\; - \; \NodeAkill(i)) \; \cup \; \NodeAgen(i)
2193: %%\label{eq:nodeaout}
2194: %%\end{eqnarray}
2195: %%
2196: %%Since all our analyses require link-aliases, we derive them from
2197: %%node-aliases.  Given a set of \Aset\ of node-aliases,
2198: %%$\LAliases(G_v,\Aset)$ computes a set of graphs representing
2199: %%link-aliases of an access graph $G_v$.  All link-aliases of every
2200: %%access path in $G_v$ are contained in the resulting access graphs.
2201: %%%
2202: %%\begin{eqnarray}
2203: %%  \LAliases(G_v,\Aset) & = & { \left\{G_v\right\} \cup }
2204: %%  \left\{ G_u \mid G_u = \extend{G'_u}{(G_v/G'_v - \{\EFG\} )},
2205: %%  \langle G'_v,G'_u\rangle \in \Aset \right\}
2206: %%\label{eq:link.aliases}
2207: %%\end{eqnarray}
2208: %%%
2209: %%Note that link-alias  computation should add at least one link to
2210: %%node-aliases and hence should exclude empty suffixes from extension.
2211: %%
2212: %%We now define the flow functions for a statement $i$.  Since use
2213: %%statements do not modify heap references, both $\NodeAgen(i)$ and
2214: %%$\NodeAkill(i)$ are $\emptyset$.  Thus, $\NodeAout(i) = \NodeAin(i)$
2215: %%for such statements.  For an assignment \mbox{$\alpha_x = \alpha_y$},
2216: %%access graphs capturing the sets of access paths \Alhs\ and \Arhs\
2217: %%(Definition~\ref{def:may.node.alias}) are:
2218: %%%%
2219: %%\begin{eqnarray*}
2220: %%\Glhs  & = & \LAliases(\gox,\NodeAin(i)) \\
2221: %%\Grhs  & = & \LAliases(\goy,\NodeAin(i)) 
2222: %%\end{eqnarray*}
2223: %%\Arhsp\ defined by the specifications includes all paths which have a
2224: %%prefix which is must-link-aliased to $\rho_x$.  Since we do not
2225: %%compute must-aliases, we conservatively assume that an access path is
2226: %%must-aliased only to itself. Hence we approximate \Arhsp\ to contain
2227: %%all access paths which have $\rho_x$ as a prefix.  In order to remove
2228: %%such paths, the path removal operation uses $\rho_x$.  Note that the
2229: %%absence of explicit must-alias information and the path removal
2230: %%operation both introduce (safe) approximations.  Effectively, we kill
2231: %%fewer paths than are required by the specifications.
2232: %%
2233: %%
2234: %%\begin{figure}[t]
2235: %%\newcommand{\Gax}
2236: %%{\scalebox{.7}{
2237: %%{\psset{unit=.8mm}
2238: %%		\begin{pspicture}(2,-4)(9,1)
2239: %%%\psframe(-1,-3)(12,2)
2240: %%		\psrelpoint{origin}{x0}{8}{-2}
2241: %%		\rput(\x{x0},\y{x0}){\rnode{x0}{\pscirclebox{$x\,$}}}
2242: %%		\psrelpoint{origin}{n00}{1}{-2}
2243: %%		\rput(\x{n00},\y{n00}){\rnode{n00}{}}
2244: %%		\ncline[doubleline=true]{->}{n00}{x0}
2245: %%		\end{pspicture}
2246: %%		}
2247: %%}
2248: %%}
2249: %%\newcommand{\Gaw}
2250: %%{\scalebox{.7}{
2251: %%{\psset{unit=.8mm}
2252: %%		\begin{pspicture}(1,-4)(10,0)
2253: %%%\psframe(-1,-3)(12,2)
2254: %%		\psrelpoint{origin}{x0}{8}{-2}
2255: %%		\rput(\x{x0},\y{x0}){\rnode{x0}{\pscirclebox[framesep=.7]{$w$}}}
2256: %%		\psrelpoint{origin}{n00}{1}{-2}
2257: %%		\rput(\x{n00},\y{n00}){\rnode{n00}{}}
2258: %%		\ncline[doubleline=true]{->}{n00}{x0}
2259: %%		\end{pspicture}
2260: %%		}
2261: %%}
2262: %%}
2263: %%
2264: %%\newcommand{\Gbw}
2265: %%{\scalebox{.7}{
2266: %%{\psset{unit=.8mm}
2267: %%		\begin{pspicture}(1,1)(19,4)
2268: %%%\psframe(0,1)(22,4)
2269: %%		\psrelpoint{origin}{w0}{8}{2}
2270: %%		\rput(\x{w0},\y{w0}){\rnode{w0}{\pscirclebox[linestyle=dashed,dash=.4 .4,linewidth=.5,framesep=.7]{$w$}}}
2271: %%		\psrelpoint{origin}{n00}{1}{2}
2272: %%		\rput(\x{n00},\y{n00}){\rnode{n00}{}}
2273: %%		\ncline[doubleline=true]{->}{n00}{w0}
2274: %%		%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2275: %%		\psrelpoint{w0}{r3}{10}{0}
2276: %%		\rput(\x{r3},\y{r3}){\rnode{r3}{\pscirclebox[framesep=.6]{$r_3$}}}
2277: %%		\ncline{->}{w0}{r3}
2278: %%		%\aput[0pt]{0}{$r$}
2279: %%		\end{pspicture}
2280: %%		}
2281: %%}
2282: %%}
2283: %%\newcommand{\Gcw}
2284: %%{\scalebox{.7}{
2285: %%{\psset{unit=.8mm}
2286: %%		\begin{pspicture}(2,0)(20,3)
2287: %%%\psframe(0,1)(22,3)
2288: %%		\psrelpoint{origin}{w0}{8}{2}
2289: %%		\rput(\x{w0},\y{w0}){\rnode{w0}{\pscirclebox[framesep=.7]{$w$}}}
2290: %%		\psrelpoint{origin}{n00}{1}{2}
2291: %%		\rput(\x{n00},\y{n00}){\rnode{n00}{}}
2292: %%		\ncline[doubleline=true]{->}{n00}{w0}
2293: %%		%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2294: %%		\psrelpoint{w0}{r3}{10}{0}
2295: %%		\rput(\x{r3},\y{r3}){\rnode{r3}{\pscirclebox[framesep=.6]{$r_3$}}}
2296: %%		\ncline{->}{w0}{r3}
2297: %%		%\aput[0pt]{0}{$r$}
2298: %%		\end{pspicture}
2299: %%		}
2300: %%}
2301: %%}
2302: %%\newcommand{\Gdw}
2303: %%{\scalebox{.7}{
2304: %%{\psset{unit=.8mm}
2305: %%		\begin{pspicture}(0,-0)(19,8)
2306: %%%\psframe(-1,-2)(22,9)
2307: %%		\psrelpoint{origin}{w0}{8}{2}
2308: %%		\rput(\x{w0},\y{w0}){\rnode{w0}{\pscirclebox[linestyle=dashed,dash=.4 .4,linewidth=.5,framesep=.7]{$w$}}}
2309: %%		\psrelpoint{origin}{n00}{1}{2}
2310: %%		\rput(\x{n00},\y{n00}){\rnode{n00}{}}
2311: %%		\ncline[doubleline=true]{->}{n00}{w0}
2312: %%		%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2313: %%		\psrelpoint{w0}{r3}{10}{0}
2314: %%		\rput(\x{r3},\y{r3}){\rnode{r3}{\pscirclebox[framesep=.6]{$r_3$}}}
2315: %%		\ncline{->}{w0}{r3}
2316: %%		%\aput[0pt]{0}{$r$}
2317: %%		\nccurve[angleA=45,angleB=135,nodesep=-1,ncurv=3]{->}{r3}{r3}
2318: %%		\end{pspicture}
2319: %%		}
2320: %%}
2321: %%}
2322: %%\newcommand{\Gew}
2323: %%{\scalebox{.7}{
2324: %%{\psset{unit=.8mm}
2325: %%		\begin{pspicture}(2,0)(19,4)
2326: %%%\psframe(0,1)(20,6)
2327: %%		\psrelpoint{origin}{w0}{8}{2}
2328: %%		\rput(\x{w0},\y{w0}){\rnode{w0}{\pscirclebox[framesep=.7]{$w$}}}
2329: %%		\psrelpoint{origin}{n00}{1}{2}
2330: %%		\rput(\x{n00},\y{n00}){\rnode{n00}{}}
2331: %%		\ncline[doubleline=true]{->}{n00}{w0}
2332: %%		%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2333: %%		\psrelpoint{w0}{r3}{10}{0}
2334: %%		\rput(\x{r3},\y{r3}){\rnode{r3}{\pscirclebox[framesep=.6]{$r_3$}}}
2335: %%		\ncline{->}{w0}{r3}
2336: %%		%\aput[0pt]{0}{$r$}
2337: %%		\nccurve[angleA=45,angleB=135,nodesep=-1,ncurv=3]{->}{r3}{r3}
2338: %%		\end{pspicture}
2339: %%		}
2340: %%}
2341: %%}
2342: %%
2343: %%\begin{center}
2344: %%  \includegraphics{fig-alias-info-1.epsi}  
2345: %%\end{center}
2346: %%
2347: %%Field names \lptr \ and \rptr \ have been abbreviated by $l$ and
2348: %%$r$. For convenience, multiple pairs have been merged into a single
2349: %%pair where possible, e.g. alias pair
2350: %%%
2351: %%\mbox{$\langle\; \raisebox{-.08cm}{\Gax},\rule{0cm}{.4cm} \raisebox{-.08cm}{\Gew}\! \rangle$} 
2352: %%%
2353: %%implies three alias pairs
2354: %%%
2355: %%\mbox{$\langle\; \raisebox{-.08cm}{\Gax},\rule{0cm}{.4cm} \raisebox{-.08cm}{\Gaw}\! \rangle$}, 
2356: %%\mbox{$\langle\; \raisebox{-.08cm}{\Gax},\rule{0cm}{.4cm} \raisebox{.0cm}{\Gbw}\! \rangle$}, 
2357: %%and
2358: %%\mbox{$\langle\; \raisebox{-.08cm}{\Gax},\rule{0cm}{.4cm} \raisebox{-.08cm}{\Gdw}\! \rangle$}.
2359: %%%
2360: %%\caption{Aliases for the program in Figure \ref{fig:memory.graph}. }
2361: %%\label{fig:alias.info.1}
2362: %%\rule{\textwidth}{.2mm}
2363: %%\end{figure}
2364: %%%
2365: %%The assignment \mbox{$\alpha_x = \alpha_y$} kills the aliases
2366: %%involving the access path which contain the link corresponding to
2367: %%$\Front(\rho_x)$.  Instead of computing $\NodeAkill(i)$ explicitly, we
2368: %%calculate \mbox{$\NodeAin(i) - \NodeAkill(i)$} directly as:
2369: %%%
2370: %%\begin{eqnarray*}
2371: %%\left\{ \left\langle G \minus \Grhsp, G' \minus \Grhsp\right\rangle \mid
2372: %%\langle G, G' \rangle \in \NodeAin(i) 
2373: %%\right\}
2374: %%%\label{eq:alias.preserve}
2375: %%\end{eqnarray*}
2376: %%
2377: %%
2378: %% $\NodeAgen(i)$ for the assignment is defined as:
2379: %%\begin{eqnarray*}
2380: %%\NodeAgen(i) & = & \left\{ \begin{array}{ll}
2381: %%		\emptyset & \mbox{$\rho_y$ is \NULL\ or {\em New\/} \ldots} \\
2382: %%                \mbox{\sf\em ADirect\/}(i) \cup \mbox{\sf\em ATransfer\/}(i),
2383: %%		     & \mbox{otherwise} 
2384: %%		     \end{array}\right.
2385: %%\end{eqnarray*}
2386: %%where
2387: %%\begin{eqnarray*}
2388: %%\mbox{\sf\em ADirect\/}(i) & = & \Glhs \times 
2389: %%	\left\{G \minus \Grhsp \mid G \in \Grhs \right\} \\
2390: %%   \mbox{\sf\em ATransfer\/}(i) & = & 
2391: %%   \mbox{\sf\em ATransfer\/}_1(i) \cup \mbox{\sf\em ATransfer\/}_2(i) \nonumber\\
2392: %%   \mbox{\sf\em ATransfer\/}_1(i) & = &  \left\{ 
2393: %%   		\left\langle G_z \minus \Grhsp, \extend{G_u}{(G_y/\goy)}\right\rangle \mid \right. \\
2394: %%		& & \;\;\;\left.
2395: %%		\left\langle G_z,G_y\right\rangle \in \NodeAin(i),
2396: %%			G_u \in \Glhs
2397: %%			\right\} \nonumber\\
2398: %%   \mbox{\sf\em ATransfer\/}_2(i) & = &  \left\{ 
2399: %%   			\left\langle \extend{G_u}{(G^1_y/\goy)},
2400: %%					\extend{G_v}{(G^2_y/\goy)}
2401: %%   			\right\rangle  \mid \left\langle
2402: %%			G^1_y,G^2_y\right\rangle \in \NodeAin(i), \right.\nonumber \\
2403: %%			& & \; \; \; G_u \in \Glhs, 
2404: %%			\left. G_v \in \Glhs
2405: %%			\right\} \nonumber 
2406: %%\end{eqnarray*}
2407: %%
2408: %%
2409: %%
2410: %%\begin{example}
2411: %%\label{exmp:alias.1}
2412: %%Figure~\ref{fig:alias.info.1} lists the alias information for the
2413: %%program in Figure~\ref{fig:memory.graph}.  We have shown only the
2414: %%final result.  The alias pair
2415: %%%
2416: %%\scalebox{.8}{\mbox{$\rule{0cm}{.6cm}\left\langle
2417: %%    \raisebox{.7mm}{\rnode{n00}{}} \;\;\;\;\;
2418: %%    \circlenode[framesep=.8mm]{n1}{x} \,,\,
2419: %%    \ncline[doubleline=true]{->}{n00}{n1}
2420: %%    \raisebox{.7mm}{\rnode{n00}{}} \;\;\;\;\;
2421: %%    \circlenode[framesep=.8mm,linestyle=dashed,dash=.4mm .4mm]{n2}{w}\;\;\;\,
2422: %%    \raisebox{.35mm}{\circlenode[framesep=.1mm]{n3}{$r_3$}}
2423: %%    \ncline[doubleline=true]{->}{n00}{n2}
2424: %%    \nccurve[angleA=45,angleB=135,ncurv=3]{->}{n3}{n3}
2425: %%    \ncline{->}{n2}{n3}\;
2426: %%    \right\rangle$}}
2427: %%%
2428: %%in \NodeAin{}($3$) represents an infinite number of aliases
2429: %%\mbox{$\left\langle x, w\myarrow \rptr\right\rangle$},
2430: %%\mbox{$\left\langle x, w\myarrow \rptr\myarrow \rptr \right\rangle$},
2431: %%\mbox{$\left\langle x, w\myarrow \rptr\myarrow \rptr \myarrow \ldots
2432: %%\right\rangle$}.  created in different execution instances of line 3.
2433: %%Alias \mbox{$\left\langle x, w\right\rangle$} is created at line 1 is
2434: %%represented by the pair
2435: %%%
2436: %%\raisebox{.5mm}{%
2437: %%  \scalebox{.8}{\mbox{$\rule{0cm}{.6cm}\left\langle
2438: %%      \raisebox{.7mm}{\rnode{n00}{}} \;\;\;\;\;
2439: %%      \circlenode[framesep=.8mm]{n1}{x} \,,\,
2440: %%      \ncline[doubleline=true]{->}{n00}{n1}
2441: %%      \raisebox{.7mm}{\rnode{n00}{}} \;\;\;\;\;
2442: %%      \circlenode[framesep=.8mm,linestyle=solid,dash=.4mm .4mm]{n2}{w} %karkare
2443: %%      \ncline[doubleline=true]{->}{n00}{n2}
2444: %%      \right\rangle$}}}.
2445: %%%
2446: %%\mybox
2447: %%\end{example}
2448: 
2449: 
2450: 
2451: %%
2452: %%\begin{eqnarray*}
2453: %%\Lin{v}(i) & = & 
2454: %%\left( \Lout{v}(i) \minus \Lkill{v}(i)\right) \cupG \Lgen{v}(i)
2455: %%\\
2456: %%\Lout{v}(i) & = & \left \{ \begin{array}{c@{\ \ \ }l}
2457: %%		\Boundary & i = \exitnode \\
2458: %%		\displaystyle\mathop{\bigcupG}_{s \in succ(i)}
2459: %%		\; \Lin{v}(s) & \mbox{otherwise}
2460: %%		\end{array}\right. 
2461: %%\end{eqnarray*}
2462: %%%
2463: %%where
2464: %%\begin{equation*}
2465: %%\Lgen{v}(i) = \LD{v}(i) \cupG \LT{v}(i)
2466: %%\end{equation*}
2467: %%
2468: %%We define $\Lkill{v}(i)$, $\LD{v}(i)$, and $\LT{v}(i)$ depending upon
2469: %%the statement.
2470: %%
2471: %%\begin{enumerate}
2472: %%\item {\em Assignment statement\/} \mbox{$\alpha_x = \alpha_y$}. Apart
2473: %%from defining the desired terms for $x$ and $y$, we also need to
2474: %%define them for any other variable $z$.
2475: %%%
2476: %%\begin{eqnarray*}
2477: %%\LD{x}(i) & \!\!=\!\!& \graphA{\Base(\rho_x)} \label{eq:direct.x} \\
2478: %%\LD{y}(i) & \!\!=\!\!& \left\{\!\! \renewcommand{\arraystretch}{1.1}
2479: %%		\begin{array}{l@{\ \ \ }l}
2480: %%		 \Empty\!_G &  \alpha_y \mbox{ is {\em New\/} \ldots\ or \NULL} \\
2481: %%		\graphA{{\Base(\rho_y)}}  & \mbox{otherwise}
2482: %%		\end{array}
2483: %%		\right.
2484: %%		\label{eq:direct.y}   \\
2485: %%\LD{z}(i) & \!\!=\!\!& \Empty\!_G, \mbox{for any variable } z \mbox{ other than } x \mbox{ and }
2486: %%	y \label{eq:direct.z}  \\
2487: %%\LT{y}(i) & \!\!=\!\! & \left\{\!\! \renewcommand{\arraystretch}{1.1}
2488: %%		\begin{array}{l@{\ }l}
2489: %%		 \Empty\!_G &  \alpha_y \mbox{ is {\em New\/} or \NULL} \\
2490: %%		\extend{\graphO{\rho_y}}{(\Lout{x}(i)/\graphO{\rho_x})} 
2491: %%			& \mbox{otherwise}
2492: %%		\end{array}
2493: %%		\right.
2494: %%\label{eq:xfer.y}\\
2495: %%\LT{z}(i) & \!\!=\!\! & \Empty\!_G, \mbox{ for any variable } z 
2496: %%	\mbox{ other than } y \label{eq:xfer.z}  \\
2497: %%%\end{eqnarray*}
2498: %%%\begin{eqnarray*}
2499: %%\Lkill{x}(i) & \!\!=\!\! & \rho_x \label{eq:kill.x} \\
2500: %%\Lkill{z}(i) & \!\!=\!\! &\Empty, \mbox{ for any variable } z 
2501: %%	\mbox{ other than } x \label{eq:kill.z}   
2502: %%\end{eqnarray*}
2503: %%%
2504: %%For the same reasons as in alias analysis, we may kill fewer access paths
2505: %%than are required by the specifications.
2506: %%
2507: %%\item {\em Use Statements}
2508: %%\begin{eqnarray*}
2509: %%\LD{y}(i) & = & \bigcupG\; \graphA{\rho_y} \mbox{ \ for every } \alpha_y
2510: %%		\mbox{ or } \alpha_y.d \mbox{ used in } i \label{eq:direct.use} \\
2511: %%\LD{z}(i) & = & \Empty\!_G \mbox{ \ for any variable $z$ other than } y
2512: %%		\label{eq:direct.use.z} \\
2513: %%\LT{v}(i) & = & \Empty\!_G, \mbox{ for every variable } v  \label{eq:xfer.v} \\
2514: %%\Lkill{v}(i) & = & \Empty, \mbox{ for every variable } v  \label{eq:kill.v} 
2515: %%\end{eqnarray*}
2516: %%\end{enumerate}
2517: %%
2518: %%Once explicitly live access graphs are computed, implicitly live
2519: %%access graphs at a given program point can be discovered by computing
2520: %%may-link-aliases of explicitly live access graphs at that
2521: %%point.
2522: %%%%
2523: %%{ Mathematically, 
2524: %%
2525: %%\begin{eqnarray*}
2526: %%\CLin{v}(i) & = & 
2527: %%\bigcupG\;\{ G_v \mid G_v \in \LAliases(\Lin{u}(i), \NodeAin(i))
2528: %%   \mbox{ for some variable $u$} \}
2529: %%\\
2530: %%\CLout{v}(i) & = & 
2531: %%\bigcupG\;\{ G_v \mid G_v \in \LAliases(\Lout{u}(i), \NodeAout(i))
2532: %%   \mbox{ for some variable $u$} \}
2533: %%\end{eqnarray*}
2534: %%}
2535: %%
2536: %%
2537: %%
2538: %%\begin{example}
2539: %%\label{exmp:liveness.info.1}
2540: %%Figure~\ref{fig:liveness.info.1} lists the explicit liveness
2541: %%information, while Figure~\ref{fig:link-alias.info} gives complete
2542: %%liveness information for program in Figure~\ref{fig:memory.graph}.
2543: %%%
2544: %%\mybox
2545: %%\end{example}
2546: %%
2547: %%
2594: 
2595: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2596: 
2597: \section{\NULL\ Assignment Insertion} 
2598: \label{sec:nullability}
2599: 
We now explain how the analyses described in the preceding sections can be
used to insert appropriate \NULL\ assignments to nullify dead links.
The inserted assignments should be safe and profitable, as defined below.
2603: 
2604: %
2605: \begin{definition}{\rm\bf Safety}.
2606: \label{def:safety}
2607: It is safe to insert an assignment \mbox{$\alpha = \NULL$} at a program
2608: point~$p$ if and only if $\rho$ is not live at $p$ and $\Base(\rho)$ can
2609: be dereferenced without raising an exception.
2610: \end{definition}
2611: %
2612: 
An access path $\rho$ is {\em nullable} at a program point~$p$ if and
only if it is safe to insert the assignment \mbox{$\alpha = \NULL$} at
$p$.
2616: 
2617: 
2618: %
2619: \begin{definition}{\rm\bf Profitability}.  
2620: \label{def:profitability}
2621: It is profitable to insert an assignment \mbox{$\alpha = \NULL$} at a
2622: program point~$p$ if and only if no proper prefix of $\rho$ is nullable
2623: at $p$ and the link corresponding to $\Front(\rho)$ is not made \NULL\
2624: before execution reaches $p$.
2625: \end{definition}
2626: 
Note that the profitability definition is strict in that every control flow
path may nullify a particular link only once.  Redundant \NULL\
assignments on any path are prohibited.  Since control flow paths have
common segments, a \NULL\ assignment may be partially redundant
in the sense that it may be redundant along one path but not along some
other path.  Such \NULL\ assignments are deemed unprofitable by
Definition~\ref{def:profitability}.  Our algorithm may not be able to
avoid all redundant assignments.
2635: 
2636: %
2637: \begin{example}
2638: We  illustrate  some situations  of  safety  and  profitability for  the
2639: program in Figure~\ref{fig:memory.graph}.
2640: \begin{itemize}
2641: \item Access path \mbox{$x\myarrow\lptr\myarrow\lptr$} is not nullable
2642:   at the entry of 6. This is because
2643:   \mbox{$x\myarrow\lptr\myarrow\lptr$} is implicitly live, due to the
2644:   use of \mbox{$y\myarrow\lptr$} in 6. Hence it is not safe to insert
2645:   \mbox{$x.\lptr.\lptr = \NULL$} at the entry of 6.
2646: \item Access path \mbox{$x\myarrow\rptr$} is nullable at the entry of
2647:   4, and continues to be so on the path from the entry of 4 to the
2648:   entry of 7.  The assignment \mbox{$x.\rptr=\NULL$} is profitable
2649:   only at the entry of 4.
2650: %
2651: \mybox
2652: \end{itemize}
2653: \end{example}
2654: 
2655: \newcommand{\LIVE}[1]{\mbox{{\em Live\/}($#1$)}}
2656: \newcommand{\RCH}[1]{\mbox{{\em Available\/}($#1$) $\cup$ {\em Anticipable\/}($#1$)}}
2657: 
2658: Section~\ref{sec:null.criteria} describes the criteria for deciding
2659: whether a given path $\rho$ should be considered for a \NULL\
2660: assignment at a program point $p$.  Section~\ref{sec:null.candidates}
2661: describes how we create the set of candidate access paths.
Let \LIVE{p}, \mbox{\em Available\/}($p$), and \mbox{\em
Anticipable\/}($p$) denote the sets of live, available, and
anticipable access paths, respectively, at
program point $p$.\footnote{Because the availability and anticipability properties are
  prefix-closed, $\Base(\rho)\in\RCH{p}$ guarantees that all proper
  prefixes of $\rho$ are either available or anticipable.}
They refer to
$\CLin{}(i)$, $\Avin{}(i)$, and $\Antin{}(i)$, respectively, when $p$ is \In{i},
and to $\CLout{}(i)$, $\Avout{}(i)$, and
$\Antout{}(i)$, respectively, when $p$ is \Out{i}.
2672: 
2673: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2674: \subsection{Computing Safety and Profitability}
2675: \label{sec:null.criteria} 
2676: 
To find out if $\rho$ can be nullified at $p$, we compute two
predicates: \cannullify\ and \nullify. $\cannullify(\rho, p)$ captures
the safety property: it is true if inserting the assignment
\mbox{$\alpha = \NULL$} at program point $p$ is safe.
2681: %
2682: %
2683: \begin{eqnarray}
2684:   \cannullify(\rho, p) & = & 
2685:   \rho\not\in\LIVE{p}\; \wedge\; \Base(\rho)\in\RCH{p}
2686:   \label{eq:cannullify}
2687: \end{eqnarray}
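As a worked reading of equation~(\ref{eq:cannullify}), consider the two
access paths discussed in the example earlier in this section; only the
facts stated there are used, and the truth values below are inferred from
that example rather than recomputed. For $\rho = x\myarrow\rptr$ and
$p = \In{4}$, the path is reported as nullable, and since
$\Base(x\myarrow\rptr) = x$, both conjuncts must hold:
\begin{eqnarray*}
\cannullify(x\myarrow\rptr, \In{4}) & = &
  x\myarrow\rptr \not\in \LIVE{\In{4}} \;\wedge\; x \in \RCH{\In{4}} \\
  & = & \mbox{\em true\/} \wedge \mbox{\em true\/} \;=\; \mbox{\em true\/}
\end{eqnarray*}
In contrast, for $\rho = x\myarrow\lptr\myarrow\lptr$ and $p = \In{6}$ the
first conjunct fails, because the path is implicitly live there, so
$\cannullify(x\myarrow\lptr\myarrow\lptr, \In{6})$ is false irrespective of
the second conjunct.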
2688: 
$\nullify(\rho, p)$ captures the profitability property: it is true if
inserting the assignment \mbox{$\alpha = \NULL$} at program point $p$ is
profitable. To compute \nullify, we note that it is most profitable to
set a link to \NULL\ at the earliest point where it ceases to be
live.  Therefore,
the \nullify\ predicate at a point has to take into account the
possibility of \NULL\ assignment insertion at the preceding point(s). For a
statement $i$ in the program, let $\In{i}$ and $\Out{i}$ denote the
program points immediately before and after $i$. Then,
2698: %
2699: \begin{eqnarray}
2700:   \nullify(\rho, \Out{i}) &=& \cannullify(\rho, \Out{i})
2701:   %\wedge \neg \cannullify(\Base(\rho), \Out{i}) \nonumber \\
2702:   \wedge(\bigwedge_{\rho' \in \mbox{\scriptsize\em ProperPrefix}(\rho)}
2703:   \rho' \not\in\LIVE{\Out{i}})
2704:   \nonumber \\
2705:   &&\wedge\; (\neg \cannullify(\rho, \In{i}) \vee
2706:   \neg \Transp(\rho, i))
2707:   \label{eq:nullify:A}\\
2708:   %
2709:   \nullify(\rho, \In{i}) &=& \cannullify(\rho, \In{i})
2710:   %\wedge \neg \cannullify(\Base(\rho), \In{i}) \nonumber \\
2711:   \wedge(\bigwedge_{\rho' \in \mbox{\scriptsize\em ProperPrefix}(\rho)}
2712:   \rho' \not\in\LIVE{\In{i}})
2713:   \nonumber \\
2714:   && \wedge\; \rho\neq\Lhs(i) \wedge
2715:   (\neg\!\!\!\! \bigwedge_{j \in pred(i)}\!\!\!\!\!\!
2716:   \cannullify(\rho, \Out{j}))
2717:   \label{eq:nullify:B}
2718: \end{eqnarray}
where \Transp($\rho$, $i$) denotes that $\rho$ is transparent with respect to
statement $i$, i.e., no prefix of $\rho$ is may-link-aliased to the access path
corresponding to the lhs of statement $i$ at $\In{i}$; \Lhs($i$) denotes the
access path corresponding to the lhs access expression of the assignment in
statement $i$; $pred(i)$ is the set of predecessors of statement $i$ in the
program; and $\mbox{\em ProperPrefix}(\rho)$ is the set of all proper prefixes of
$\rho$.
2726: 
We insert the assignment \mbox{$\alpha = \NULL$} at program point $p$ if
$\nullify(\rho, p)$ is true.
2729: 
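As a hedged reading of the last conjuncts of
equations~(\ref{eq:nullify:A}) and~(\ref{eq:nullify:B}), consider again
$\rho = x\myarrow\rptr$, which the example earlier in this section reports
as nullable from the entry of 4 up to the entry of 7. Assume, purely for
illustration, a statement $i$ on this path such that
$\cannullify(\rho, \In{i})$ holds, every predecessor $j$ of $i$ satisfies
$\cannullify(\rho, \Out{j})$, and $\Transp(\rho, i)$ holds. Then
\begin{eqnarray*}
\neg\!\!\bigwedge_{j \in pred(i)}\!\! \cannullify(\rho, \Out{j})
  & = & \neg\,\mbox{\em true\/} \;=\; \mbox{\em false\/} \\
\neg \cannullify(\rho, \In{i}) \vee \neg \Transp(\rho, i)
  & = & \mbox{\em false\/} \vee \mbox{\em false\/} \;=\; \mbox{\em false\/}
\end{eqnarray*}
so both $\nullify(\rho, \In{i})$ and $\nullify(\rho, \Out{i})$ are false
and no further assignment is inserted at $i$. Under these assumptions the
only insertion happens at the earliest point where $\rho$ becomes nullable,
which is consistent with the example's observation that
\mbox{$x.\rptr=\NULL$} is profitable only at the entry of 4.
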
2730: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2731: \subsection{Computing Candidate Access Paths for \NULL\ Insertion}
2732: \label{sec:null.candidates} 
2733: The method described above only checks whether a given
2734: access path $\rho$ can be nullified at a given program point $p$.
2735: %
2736: We can generate the {\em candidate} set of access paths for
2737: \NULL\ insertion at $p$ as follows: For any candidate access path
2738: $\rho$, $\Base(\rho)$ must either be available or anticipable at
2739: $p$. Additionally, all simple access paths are also candidates for
2740: \NULL\ insertions. Therefore,
2741: \begin{eqnarray}
2742:   \cand(p) &=& \left\{ \rho\myarrow f \mid \rho \in \RCH{p},
2743:   f \in \outfield(\rho)\right\} \nonumber\\
2744:   && \cup \left\{\rho \mid \rho \mbox{ is a simple access path }
2745:   \right\}\label{eq:candidate}
2746: \end{eqnarray}
where $\outfield(\rho)$ is the set of fields that can be used to
extend access path $\rho$ at $p$. It can be obtained from the
type information of the object $\Target(\rho)$ at $p$.
2750: 
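As a small hypothetical instance of equation~(\ref{eq:candidate}), suppose
that $\RCH{p} = \{x\}$ at some program point $p$, that
$\outfield(x) = \{\lptr, \rptr\}$, and that the simple access paths are $x$
and $y$, taking a simple access path to consist of a root variable only.
All three are assumptions made for illustration and are not values taken
from the figures of this paper. Then
\begin{eqnarray*}
\cand(p) & = & \{ x\myarrow\lptr,\; x\myarrow\rptr \} \;\cup\; \{ x,\; y \}
\end{eqnarray*}
Each of these four paths would then be tested with $\nullify(\rho, p)$ to
decide whether a \NULL\ assignment is actually inserted at $p$.
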
2751: \begin{figure}[p]
2752: \begin{center}
2753:   \includegraphics{fig-null-insertion.epsi}
2754: \end{center}
2755: 
2756: \caption{Null insertion for the program in
2757:   Figure~\ref{fig:memory.graph}.}
2758: \label{fig:null-insertion}
2759: \rule{\textwidth}{.2mm}
2760: 
2761: \end{figure}
2762: 
2763: 
2764: %
Note that all the information required for
equations~(\ref{eq:cannullify}), (\ref{eq:nullify:A}),
(\ref{eq:nullify:B}), and~(\ref{eq:candidate}) is obtained from the results
of the data flow analyses described in the preceding sections. Type
information of objects required by equation~(\ref{eq:candidate}) can be
obtained from the front end of the compiler.
%%
\Transp\ uses may-alias information computed as pairs of access
graphs.
2774: 
2775: 
2776: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2777: \begin{example}
2778: \label{exmp:null-insertion}
2779: Figure~\ref{fig:null-insertion} lists a trace of the null insertion
2780: algorithm for the program in Figure~\ref{fig:memory.graph}.
2781: %
2782: \mybox
2783: \end{example}
2784: 
2785: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2786: \subsection{Reducing Redundant \NULL\ Insertions}
2787: Consider a program with an assignment statement \mbox{$i: \alpha_x =
2788: \alpha_y$}.  Assume a situation where, for some non-empty suffix
2789: $\sigma$, both \mbox{$\nullify(\rho_y\myarrow\sigma, \In{i})$} and
2790: \mbox{$\nullify(\rho_x\myarrow\sigma,\Out{i})$} are true. In that
2791: case, we will be inserting $\alpha_y.\sigma = \NULL$ at $\In{i}$ and
2792: $\alpha_x.\sigma = \NULL$ at $\Out{i}$. Clearly, the latter \NULL\
2793: assignment is redundant in this case and can be avoided by checking if
2794: $\rho_y\myarrow\sigma$ is nullable at $\In{i}$.
2795: 
If must-alias analysis is performed, redundant assignments can be
reduced further. Since the
must-link-alias relation is symmetric, reflexive, and transitive, and
hence an equivalence relation, the set of candidate paths
at a program point can be divided into equivalence classes based on
the must-link-alias relation. Redundant \NULL\ assignments can be reduced
by nullifying at most one access path in any equivalence class.
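As a hypothetical illustration (the paths below are assumptions made for
this sketch only), suppose the candidate paths at a program point are
$x\myarrow n$, $y\myarrow n$ and $z\myarrow m$, and that must-alias
analysis reports $x\myarrow n$ and $y\myarrow n$ to be
must-link-aliased. The candidates then fall into two equivalence classes,
\[
\{x\myarrow n,\; y\myarrow n\} \quad \mbox{and} \quad \{z\myarrow m\},
\]
so two \NULL\ assignments (say \mbox{$x.n = \NULL$} and
\mbox{$z.m = \NULL$}) suffice instead of three.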
2803: 
2804: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2805: \section{Convergence of Heap Reference Analysis}
2806: \label{sec:termination}
2807: 
2808: The \NULL\ assignment insertion algorithm makes a single traversal
2809: over the control flow graph. We show the termination of 
2810: liveness analysis using the properties of access graph
operations. Termination of availability and anticipability analyses can be
shown by similar arguments over finite sets of bounded access paths.
2813: Termination of alias analysis follows from \citeN{hind99interprocedural}.
2814: 
2815: \subsection{Monotonicity}
2816: \label{sec:access.graph.properties}
2817: 
2818: For a program there are a finite number of basic blocks, a finite
2819: number of fields for any root variable, and a finite number of field
2820: names in any access expression. Hence the number of access graphs for
a program is finite. Further, the number of nodes, and hence the size
of each access graph, is bounded by the number of labels which can be
created for a program.
2824: 
2825: Access graphs for a variable $x$ form a complete lattice with a
2826: partial order $\sqsubseteq_G$ induced by $\cupG$.  Note that $\cupG$
2827: is commutative, idempotent, and associative.
2828: %
2829: Let \mbox{$G = \langle x,N_F,N_I,E\rangle$} and \mbox{$G' = \langle
2830: x,N'_F,N'_I,E'\rangle$} where subscripts $F$ and $I$ distinguish
2831: between the final and intermediate nodes. The partial 
2832: order $\sqsubseteq_G$ is defined as
2833: %
2834: \begin{equation*}
2835:   G \sqsubseteq_G G' \Leftrightarrow 
2836:     \left(N'_F \subseteq N_F\right) \wedge 
2837:     \left(N'_I \subseteq \left(N_F \cup N_I\right)\right) \wedge
2838:     \left(E' \subseteq E\right)
2839: \end{equation*}
2840: %
2841: Clearly, $G \sqsubseteq_G G'$ implies that $G$ contains all access
2842: paths of $G'$.  We extend $\sqsubseteq_G $ to a set of access graphs
2843: as follows:
2844: \[
2845: S_1 \sqsubseteq_{S} S_2 \Leftrightarrow
2846:    \forall G_2 \in S_2, \exists G_1 \in S_1
2847:    \mbox{ s.t. } G_1 \sqsubseteq_G G_2
2848: \]
2849: It is easy to verify that $\sqsubseteq_G$ is reflexive, transitive,
2850: and antisymmetric.  For a given variable $x$, the access graph
2851: $\Empty\!_G$ forms the $\top$ element of the lattice while the $\bot$
2852: element is a greatest lower bound of all access graphs.
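The following sketch instantiates the definition on two small graphs.
The node names $n_1$ and $n_2$ and the representation of edges as pairs
(including an edge leaving the root variable) are assumptions made only
for this illustration. Let
\begin{eqnarray*}
G  & = & \langle x, \{n_1, n_2\}, \emptyset, \{(x,n_1), (n_1,n_2)\}\rangle \\
G' & = & \langle x, \{n_1\}, \emptyset, \{(x,n_1)\}\rangle
\end{eqnarray*}
Then \mbox{$N'_F = \{n_1\} \subseteq \{n_1,n_2\} = N_F$},
\mbox{$N'_I = \emptyset \subseteq N_F \cup N_I$} and
\mbox{$E' \subseteq E$}, so \mbox{$G \sqsubseteq_G G'$}. The converse
fails because \mbox{$N_F \not\subseteq N'_F$}. Consequently
\mbox{$\{G\} \sqsubseteq_S \{G'\}$} holds, whereas
\mbox{$\{G'\} \sqsubseteq_S \{G\}$} does not.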
2853: 
2854: The partial order over access graphs and their sets can be carried
2855: over unaltered to remainder graphs ($\sqsubseteq_{RG}$) and their
sets ($\sqsubseteq_{RS}$), with the added condition that $\EFG$
is incomparable to any other non-empty remainder graph. 
2858: 
2859: \begin{figure}[t]
2860: %\begin{center}
2861: \begin{tabular}{|l|r@{\ \ }c@{\ \ }l|}
2862: \hline
2863: Operation & \multicolumn{3}{|c|}{Monotonicity} \\ \hline \hline
2864: Union &  $G_1\sqsubseteq_G G_1' \wedge G_2\sqsubseteq_G G_2'$
2865: 	& $\Rightarrow $ 
2866: 	& $G_1 \cupG G_2 \sqsubseteq_G G_1' \cupG G_2'$
2867:           \rule[-.15cm]{0cm}{.45cm}
2868: 	\\ \hline
2869: Path Removal & $G_1\sqsubseteq_G G_2 $
2870:         &$\Rightarrow $
2871: 	&$G_1\minus\rho \sqsubseteq_G G_2 \minus\rho$\rule[-.15cm]{0cm}{.45cm} 
2872: 	\\ \hline
2873: Factorization & $G_1\sqsubseteq_G G_2 $
2874:         &$\Rightarrow $
2875: 	&$G_1/(G,M) \sqsubseteq_{RS} G_2/(G,M)$\rule[-.15cm]{0cm}{.45cm} 
2876: 	\\ \hline
2877: Extension & $RS_1 \sqsubseteq_{RS} RS_2 \wedge G_1 \sqsubseteq_G G_2  \wedge
2878: 	M_1 \subseteq M_2$
2879:         &$\Rightarrow$
2880: 	&$\extend{(G_1,M_1)}{RS_1} \sqsubseteq_{G} \extend{(G_2,M_2)}{RS_2} $
2881:            \rule[-.15cm]{0cm}{.45cm} 
2882: 	\\ \hline
2883: %%Link-Alias & $\Aset_1\subseteq \Aset_2$
2884: %%	&$\Rightarrow$
2885: %%	&$\LnA(\rho_x,\Aset_1)\subseteq\LnA(\rho_x,\Aset_2)$
2886: %%         \rule[-.1cm]{0cm}{.40cm}
2887: %%	\\ \cline{2-4}
2888: \renewcommand{\arraystretch}{.8}%
2889: \begin{tabular}{@{}l@{}}
2890: Link-Alias\\
2891: Closure
2892: \end{tabular}
2893:         & $G_1\sqsubseteq_G G'_1 \wedge G_2\sqsubseteq_G G'_2 $
2894: 	&$\Rightarrow$
2895: 	&$\LnG(G_1,G_2,\langle g_x,g_y\rangle)\sqsubseteq_S\LnG(G'_1,G'_2,\langle g_x,g_y\rangle)$
2896:          \rule[-.15cm]{0cm}{.6cm}
2897: 	\\ \hline
2898: \end{tabular}
2899: %\end{center}
2900: \caption{Monotonicity of Access Graph Operations}
2901: \rule{\textwidth}{.2mm}
\label{fig:monotonicity.ag}
2903: \end{figure}
2904: 
2905: 
2906: Access graph operations are monotonic as described in
2907: Figure~\ref{fig:monotonicity.ag}.  Path removal is monotonic
in the first argument but not in the second argument. Similarly, factorization is
monotonic in the first argument but not in the second and third
arguments. However, we show that in each context where they are used,
2911: the resulting functions are monotonic:
2912: \begin{enumerate}
2913: \item Path removal is used only for an assignment
2914:   \mbox{$\alpha_x=\alpha_y$}.  It is used in liveness analysis and its second argument 
2915:   is $\rho_x$ which is constant for
2916:   any assignment statement \mbox{$\alpha_x=\alpha_y$}. Thus the resulting
2917:   flow functions are monotonic.
2918: \item Factorization is used in the following situations:
2919:   \begin{enumerate}
2920:   \item {\em Link-alias closure of access graphs}.  
	From equation~(\ref{eq:link.alias.computation}) it is clear that \LnG\ is
2922: 	monotonic in the first argument (because it is used in $\cupG$) and
2923: 	the second argument (because it is supplied as the first argument of
2924: 	factorization). The third and the fourth arguments of \LnG\ are linear 
2925: 	access graphs containing a single path and hence are incomparable with
2926: 	any other linear access graph.
2927:     Thus link-alias computation is monotonic in all its arguments.
2928: 
2929:   \item {\em Liveness analysis}.  Factorization is used for the flow
2930:     function corresponding to an assignment \mbox{$\alpha_x=\alpha_y$}
2931:     and its second argument is $\graphA{\rho_x}$ while its third
2932:     argument is $\lNode{\graphA{\rho_x}}$ both of which are
2933:     constant for any assignment statement
2934:     \mbox{$\alpha_x=\alpha_y$}. Thus, the resulting flow functions are
2935:     monotonic.
2936:   \end{enumerate}
2937: \end{enumerate}
2938: Thus we conclude that all flow functions are monotonic.  Since
2939: lattices are finite, termination of heap reference analysis follows.
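The termination argument can be spelled out as a standard chain
argument. The sketch below assumes that every data flow value is
initialized to $\top$ and writes $f$ for the combined effect of one
round of flow function applications over the finite lattice $L$ of all
data flow values; both are simplifying notational assumptions of this
illustration rather than definitions from the preceding sections. Define
\mbox{$X_0 = \top$} and \mbox{$X_{k+1} = f(X_k)$}. Since
\mbox{$X_1 \sqsubseteq \top = X_0$}, monotonicity of $f$ gives, by
induction on $k$,
\[
X_{k+1} = f(X_k) \sqsubseteq f(X_{k-1}) = X_k
\]
so that \mbox{$X_0 \sqsupseteq X_1 \sqsupseteq X_2 \sqsupseteq \cdots$}
is a descending chain. Because $L$ is finite, the chain cannot descend
strictly forever, and hence \mbox{$X_{k+1} = X_k$} for some $k$.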
2940: 
2941: {
2942: Appendix~\ref{sec:non-distributivity} discusses the distributivity of 
2943: flow functions.
2944: }
2945: 
2946: \subsection{Complexity}
2947: \label{sec:complexity}
2948: 
2949: This section discusses the issues which influence the complexity and
2950: efficiency of performing heap reference analysis. Empirical
2951: measurements which corroborate the observations made in this section
2952: are presented in Section~\ref{sec:measurements}.
2953: 
2954: The data flow frameworks defined in this paper are not {\em
2955: separable\/}~\cite{dfa.chap} because the data flow information of a
2956: variable depends on the data flow information of other variables.
2957: Thus the number of iterations over control flow graph is not bounded
2958: by the depth of the graph~\cite{asu,hect,dfa.chap} but would also
2959: depend on the number of root variables which depend on each other.
2960: 
2961: Although we consider each statement to be a basic block, our control
2962: flow graphs retain only statements involving references. A further
2963: reduction in the size of control flow graphs follows from the fact
2964: that successive use statements need not be kept separate and can be
2965: grouped together into a block which ends on a reference assignment.
2966: 
2967: The amount of work done in each iteration is not fixed but depends on
2968: the size of access graphs. Of all operations performed in an
iteration, only $\states{G}{G'}$ is costly. Conversion to
deterministic access graphs is also a costly operation but is
2971: performed for a single pass during \NULL\ assignment insertion.  In
2972: practice, the access graphs are quite small because of the following
2973: reason: Recall that edges in access graphs capture dependence of a
2974: reference made at one program point on some other reference made at
2975: another point (Section~\ref{sec:dependence}). 
2976: %
2977: In real  programs, traversals  involving long dependences  are performed
2978: using  iterative constructs  in the  program.  In  such  situations, the
2979: length  of  the  chain of  dependences  is  limited  by the  process  of
2980: summarization because summarization treats nodes with the same
2981: label  as  being identical.  Thus,  in  real  programs chains  of  such
2982: dependences, and hence the access graphs, are quite small in size.
2983: %
2984:  This is corroborated by
2985: Figure~\ref{tab:space.time.data} which provides the empirical data for
2986: the access graphs in our examples.  The average number of nodes in
2987: these access graphs is less than 7 while the average number of edges
is less than 12.  These numbers are even smaller in the interprocedural
analysis. Hence the complexity of access graph operations is
not a matter of concern.
2991: 
2992: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2993: \section{Safety of \NULL\ Assignment Insertion}
2994: \label{sec:soundness}
2995: 
2996: We have to prove that the \NULL\ assignments inserted by our algorithm
2997: (Section~\ref{sec:nullability}) in a program are safe in that they do
2998: not alter the result of executing the program. We do this by showing
2999: that (a) an inserted statement itself does not raise a dereferencing exception, and
(b) an inserted statement does not affect any other statement, whether
original or inserted.
3002: 
3003: We use the subscripts $b$ and $a$ for a program point~$p$ to denote
3004: ``before'' and ``after'' in an execution order. Further, the
3005: corresponding program points in the original and modified program are
3006: distinguished by the superscript $o$ and $m$. The correspondence is
3007: defined as follows: If \Pmod\ is immediately before or after an
3008: inserted assignment \mbox{$\alpha = \NULL$}, \Porg\ is the point where
3009: the decision to insert the \NULL\ assignment is taken.  For any other
3010: \Pmod, there is an obvious \Porg.
3011: 
3012: We first assert the soundness of availability, anticipability and
3013: alias analyses without proving them.
3014: 
3015: \begin{lemma}
3016: \label{lemm:avail.sound}
3017: {\rm (Soundness of Availability Analysis).} Let $AV_{\!\!\Pa}$ be the
3018: set of access paths available at program point \Pa. Let $\rho \in
3019: AV_{\!\!\Pa}$.  Then along every path reaching \Pa, there exists a
3020: program point \Pb, such that {the link represented by} $\Front(\rho)$
3021: is either dereferenced or assigned a non-\NULL\ l-value at \Pb\ and is
3022: not made \NULL\ between \Pb\ and \Pa.
3023: \end{lemma}
3024: 
3025: \begin{lemma}
3026: \label{lemm:ant.sound}
3027: {\rm (Soundness of Anticipability Analysis).} Let $AN_p$ be the set of
3028: access paths anticipable at program point~$p$. Let $\rho \in
3029: AN_p$. Then along every path starting from $p$, {the link represented
3030: by $\Front(\rho)$} is dereferenced before being assigned.
3031: \end{lemma}
3032: 
3033: For  semantically valid input  programs (i.e.\  programs which  do not
3034: generate               dereferencing              exceptions),
3035: Lemma~\ref{lemm:avail.sound}  and Lemma~\ref{lemm:ant.sound} guarantee
3036: that {if}  $\rho$ is available or anticipable  at $p$, $\Target(\rho)$
3037: can be dereferenced at $p$.
3038: 
3039: \begin{lemma}
3040: \label{lemm:soundness.aliasing} 
{\rm (Soundness of Alias Analysis).} Let $\Front(\rho_x)$ represent
3042: the same link as $\Front(\rho_y)$ at a program point~$p$ during some
3043: execution of the program. Then link-alias computation of $\rho_x$ at
3044: $p$ would discover $\rho_y$ to be link-aliased to $\rho_x$.
3045: \end{lemma}
3046: 
3047: For the main claim, we relate the access paths at \Pa\ to the
3048: access paths at \Pb\ by incorporating the effect of intervening
3049: statements only, regardless of the statements executed before \Pb.  In
3050: some execution of a program, let $\rho$ be the access path of interest
3051: at \Pa\ and the sequence of statements between \Pb\ and \Pa\ be $s$.\footnote{
3052:   When $s$ is a function call $\alpha_x = f(\alpha_y)$, \Pa\ is the entry point
3053:   of $f$ and \Pb\ is the program point just before the statement $s$ in the
  caller's body. An analogous remark holds for the return statement.}
3055: Then \mbox{$T(s,\rho)$} represents the access path at \Pb\ which, if
3056: non-\Empty, can be used to access the link represented by
3057: $\Front(\rho)$.   \mbox{$T(s, \rho)$} captures the
3058: transitive effect of backward transfers of $\rho$ through $s$.  $T$ is
3059: defined as follows:
3060: %
3061: \begin{eqnarray*}
3062: T(s, \rho ) & = & 
3063: \left\{\begin{array}{ll}
3064: 	\rho & s \mbox{ is a use statement } \\
3065: 	\rho & s \mbox{ is } \alpha_x = \ldots \mbox{ and }
3066: 		\rho_x \mbox{ is not a prefix of } \rho \\
3067: 	\Empty & s \mbox{ is } \alpha_x = \mbox{{\em New\/} and } 
3068: 		\rho=\rho_x\myarrow\sigma \\
3069: 	\Empty & s \mbox{ is } \alpha_x = \NULL\ \mbox{ and }
3070: 		\rho=\rho_x\myarrow\sigma \\
3071: 	\rho_y\myarrow\sigma & s \mbox{ is } \alpha_x = \alpha_y \mbox{ and }
3072: 		\rho=\rho_x\myarrow\sigma \\ %%\textBlue
3073:     \rho & s \mbox{ is the function call } \alpha_x = f(\alpha_y) \mbox{ and }
3074: 	\\ & \RootVar(\rho) \mbox{ is a global variable}\\
3075:     \rho_y\myarrow\sigma & s \mbox{ is the function call } \alpha_x = f(\alpha_y),
3076: 	\rho = z\myarrow\sigma \mbox{ and }\\ & z \mbox{ is the formal parameter of } f\\
3077:     \rho & s \mbox{ is the return statement } return(\alpha_z) \mbox{ and }
3078: 	\\ & \RootVar(\rho) \mbox{ is a global variable}\\
3079:     \rho_z\myarrow\sigma & s \mbox{ is the return statement } 
3080:     return(\alpha_z), \rho = \rho_x\myarrow\sigma \mbox{ and}\\
3081:          & \mbox{the corresponding  call is } \alpha_x = f(\alpha_y)\\
3082: 
3083: %%\textBlack
3084: 	T(s_1, T(s_2,\rho)) & s \mbox{ is  a sequence } s_1;s_2
3085: \end{array}
3086: \right. 
3087: \end{eqnarray*}
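As a worked illustration of this definition, consider a hypothetical
sequence \mbox{$s = s_1; s_2$} (not taken from the examples of this
paper) in which $s_1$ is \mbox{$x = y.n$} and $s_2$ is
$w = \mbox{\em New}$, and let \mbox{$\rho = x\myarrow d$} be the access
path of interest at \Pa. Here \mbox{$\rho_x = x$},
\mbox{$\rho_y = y\myarrow n$} and \mbox{$\rho_w = w$}. Then
\begin{eqnarray*}
T(s_2, x\myarrow d) & = & x\myarrow d
   \qquad \mbox{($\rho_w$ is not a prefix of $x\myarrow d$)} \\
T(s_1, x\myarrow d) & = & y\myarrow n\myarrow d
   \qquad \mbox{($\rho = \rho_x\myarrow\sigma$ with $\sigma = d$)} \\
T(s, x\myarrow d) & = & T(s_1, T(s_2, x\myarrow d)) \;=\; y\myarrow n\myarrow d
\end{eqnarray*}
Thus the link represented by the frontier of $x\myarrow d$ at \Pa\ can
be accessed using $y\myarrow n\myarrow d$ at \Pb. Had $s_2$ been
\mbox{$x = \NULL$} instead, the result would have been
\mbox{$T(s_2, x\myarrow d) = \Empty$}.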
3088: 
3089: \begin{lemma}{\em (Liveness Propagation)}.
3090: \label{lemm:liveness.prop}
3091: Let $\rho^a$ be in some explicit liveness graph at \Pa. Let the
sequence of statements between \Pb\ and \Pa\ be $s$. If
\mbox{$T(s,\rho^a) = \rho^b$} and $\rho^b$ is not \Empty, then
$\rho^b$ is in some explicit liveness graph at \Pb.
3095: \end{lemma}
3096: \begin{proof}
3097: The proof is by structural induction on $s$.
3098: %Since $\rho^b$ is
3099: %assumed to be non-\Empty, the relevant base cases are:
3100: Since $\rho^b$ is
3101: non-\Empty, the base cases are:
3102: \begin{enumerate}
3103: \item $s$ is a use statement. In this case $\rho^b = \rho^a$. \label{base.step.1}
3104: \item $s$ is an assignment \mbox{$\alpha_x = \ldots$} such that
3105:   $\rho_x$ is not a prefix of $\rho^a$. Here also $\rho^b = \rho^a$.
3106:   \label{base.step.2}
3107: \item $s$ is an assignment \mbox{$\alpha_x = \alpha_y$} such that
3108:   \mbox{$\rho^a = \rho_x\myarrow\sigma$}. In this case $\rho^b = \rho_y\myarrow\sigma$.
3109:   \label{base.step.3}
3110: %%\textBlue
3111: \item \label{base.step.4} $s$ is the function call $\alpha_x = f(\alpha_y)$. The
3112:   only interesting  case is  when $\rho^a =  z\myarrow\sigma$, where $z$  is the
3113:   formal parameter of $f$. In this case, $\rho^b = \rho_y\myarrow\sigma$.
3114: \item \label{base.step.5}  $s$ is  the return statement  $return(\alpha_z)$. The
3115: only  interesting  case  is   when  $\rho^a  =  \rho_x\myarrow\sigma$,  and  the
3116: corresponding  call  is  $\alpha_x  =   f(\alpha_y)$.  In  this  case,  $\rho^b  =
3117: \rho_z\myarrow\sigma$.
3118: \end{enumerate}
3119: %%\textBlack
3120: For (\ref{base.step.1}) and (\ref{base.step.2}), since
3121: $\rho^a$ is not in \Lkill{}, $\rho^b$ is in some explicit liveness
3122: graph at \Pb. For (\ref{base.step.3}), 
3123: from Equation~(\ref{eq:xfer.y.asgn}), $\rho^b$ is in some explicit liveness 
3124: graph at \Pb.
3125: %%\textBlue
3126:  For (\ref{base.step.4}) and (\ref{base.step.5}), the result
3127: follows from the fact that $\Summary(\rho_y)$ and $\Summary(\rho_z)$ are
3128: in the explicit liveness graph of the program points before the call and return
3129: statements respectively.
3130: 
3131: %%\textBlack
3132: For the inductive step, assume that the lemma holds for $s_1$ and
3133: $s_2$.  From the definition of $T$, there exists a non-\Empty\
3134: $\rho^i$ at the intermediate point~$p_i$ between $s_1$ and $s_2$, such
3135: that \mbox{$\rho^i = T(s_2,\rho^a)$} and \mbox{$\rho^b =
3136: T(s_1,\rho^i)$}. Since $\rho^a$ is in some explicit liveness graph at
3137: \Pa, by the induction hypothesis, $\rho^i$ must be in some explicit
3138: liveness graph at $p_i$.  Further, by the induction hypothesis,
3139: $\rho^b$ must be in some explicit liveness graph at \Pb.
3140: \end{proof}
3141: %
3142: \begin{lemma}
3143: \label{lemm:corresponding.liveness}
3144: Every access path which is in some  liveness graph at \Pbmod\
3145: is also in some liveness graph at \Pborg.
3146: \end{lemma}
3147: \begin{proof}
3148: If an extra explicitly live access path is  introduced in the
3149: modified program, it could be only because of an inserted assignment
3150: \mbox{$\alpha = \NULL$} at some \Pamod.  The only access paths which
3151: this statement can add to an explicit liveness graph are the paths
corresponding to the proper prefixes of $\alpha$. However, the algorithm
3153: selects $\alpha$ for nullification only if the access paths
3154: corresponding to all its proper prefixes are in some explicit liveness
3155: graph.  Therefore every access path which is in some explicit liveness
3156: graph at \Pamod\ is also in some explicit liveness graph at
3157: \Paorg. The same relation would hold at \Pbmod\ and \Pborg.
3158: 
3159: %%\textBlue
3160:  If an extra live access path is introduced in the modified program, it
3161: could be only because of an  inserted assignment \mbox{$\alpha = \NULL$} at some
\Pamod.   The only  access paths  which this  statement can  add to  a liveness
graph are  $\LnA(\rho', \Aset^m)$, where $\rho'$  is a proper  prefix of $\rho$
3164: and  $\Aset^m$ represents  the  alias  set at  \Pamod.   However, the  algorithm
3165: selects  $\alpha$  for  nullification  at  \Pamod\  only  if  the  access  paths
3166: corresponding  to  all  its  proper  prefixes  are in  some  liveness  graph  at
3167: \Paorg. As liveness graphs are closed under link aliasing, this implies that the
3168: liveness graph at \Paorg\ includes paths $\LnA(\rho', \Aset^o)$, where $\Aset^o$
3169: represents the  alias set  at \Paorg.  Since  inserted statements can  only kill
3170: aliases, $\Aset^m  \subseteq \Aset^o$.  Thus, $\LnA(\rho',  \Aset^m)$, the paths
3171: resulting out of insertion, are also in the liveness graph at \Paorg.  Therefore
3172: every access  path which is in  some liveness graph  at \Pamod\ is also  in some
3173: liveness graph at \Paorg.  The same relation would hold at \Pbmod\ and \Pborg.
3174: %
3175: %%\textBlack
3176: \end{proof}
3177: %
3178: \begin{theorem}{\em  (Safety of \NULL\ insertion)}.  Let the  assignment
3179: \mbox{$\alpha^b = \NULL$} be inserted by the algorithm immediately before
3180: \Pbmod. Then:
3181: \begin{enumerate}
3182: \item Execution of \mbox{$\alpha^b = \NULL$} does not raise any
3183:   exception due to dereferencing.
3184: \item Let $\alpha^a$ be used immediately after \Pamod\ (in an original
3185:   statement or an inserted \NULL\ assignment).  Then, execution of
3186:   \mbox{$\alpha^b = \NULL$} cannot nullify any link used in
3187:   $\alpha^a$.
3188: \end{enumerate}
3189: \end{theorem}
3190: \begin{proof} We prove the two parts separately. 
3191: \begin{enumerate}
3192: \item If $\alpha^b$ is a root variable, then the execution of
3193:   \mbox{$\alpha^b = \NULL$} cannot raise an exception. When $\alpha^b$
3194:   is not a root variable, from the null assignment algorithm, every
3195:   proper prefix $\rho'$ of $\rho^b$ is either anticipable or
3196:   available. From the soundness of both these analyses,
3197:   $\Target(\rho')$ exists and the execution of \mbox{$\alpha^b =
3198:     \NULL$} cannot raise an exception.
3199: %
3200: \item We prove this by contradiction.  Let $s$ denote the sequence of statements
3201:   between \Pbmod\ and \Pamod.  Assume that \mbox{$\alpha^b = \NULL$} nullifies a
3202:   link  used in  $\alpha^a$.  This  is possible  only if  there exists  a prefix
3203:   $\rho'$ of $\rho^a$  such that $T(s,\rho')$ shares its  frontier with $\rho^b$
3204:   at  \Pbmod.
3205: %% \textBlue
3206:  By  Lemma~\ref{lemm:liveness.prop}, $T(s,\rho')$  must be  in some
3207:   explicit liveness  graph at \Pbmod.   From Lemma~\ref{lemm:soundness.aliasing}
3208:   and  the  definition  of liveness,  $\rho^b$  is  in  some liveness  graph  at
3209:   \Pbmod. By  Lemma~\ref{lemm:corresponding.liveness}, $\rho^b$ is  also in some
3210:   liveness graph at \Pborg. Thus  a decision to insert \mbox{$\alpha^b = \NULL$}
3211:   cannot be taken at \Pborg.
3212: %%\textBlack
3213: \end{enumerate}
3214: \end{proof}
3215: 
3216: 
3217: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3218: \section{Empirical Measurements}
3219: \label{sec:measurements}
3220: 
3221: \begin{figure}[t]
3222: %\small
3223: \begin{tabular}{c|c}
3224:   \includegraphics{Loop.epsi} &
3225:   \includegraphics{DLoop.epsi} \\ \hline & \\
3226:   \includegraphics{CReverse.epsi} &
3227:   \includegraphics{BiSort.epsi} \\\hline & \\
3228:   \includegraphics{TreeAdd.epsi} &
3229:   \includegraphics{GCBench.epsi} \\ \hline
3230:   \multicolumn{2}{c}{}
3231: \end{tabular}
3232: 
3233: X axis indicates measurement instants in milliseconds.  Y axis
3234: indicates heap usage in KB. Solid line represents memory required for
3235: original program while dashed line represents memory for the modified
3236: program. Observe that the modified program executed faster than the
3237: original program in each case.
3238: \caption{Temporal plots of memory usages.}  
3239: \label{fig:plots}
3240: \rule{\textwidth}{.2mm}
3241: \end{figure}
3242: 
3243: In order to show the effectiveness of heap reference analysis, we have
3244: developed proof-of-concept implementations
of heap reference analysis at two levels:
one at the interprocedural level and the other at the intraprocedural level.
3247: 
3248: \subsection{Experimentation Methodology}
3249: 
Our intraprocedural analyzer, which predates the interprocedural version,
provides evidence of the effectiveness of intraprocedural analysis.
It was implemented using 
XSB-Prolog\footnote{Available from \url{http://xsb.sourceforge.net}.}.
The measurements were made
on an 800 MHz Pentium III machine with 128 MB memory running Fedora
3256: Core release 2. The benchmarks used were \texttt{Loop},
3257: \texttt{DLoop}, \texttt{CReverse}, \texttt{BiSort}, \texttt{TreeAdd}
3258: and \texttt{GCBench}.  Three of these (\texttt{Loop}, \texttt{DLoop}
3259: and \texttt{CReverse}) are similar to those
3260: in~\cite{ran.shaham-sas03}.  \texttt{Loop} creates a singly linked
list and traverses it, \texttt{DLoop} is a doubly linked list variation
of the same program, and \texttt{CReverse} reverses a singly linked
list. \texttt{BiSort} and \texttt{TreeAdd} are taken from the Java version
of the Olden benchmark suite~\cite{jolden}.  \texttt{GCBench} is taken
3265: from~\cite{gcbench}. 
3266: 
3267: For measurements on this implementation, the function of interest in a
3268: given Java  program was manually translated  to Prolog representation.
This allowed us to avoid redundant information like temporaries, empty
statements etc., resulting in compact representations of programs.
3271: The interprocedural information for  this function was approximated in
3272: the  Prolog   representations  in  the  following   manner:  Calls  to
3273: non-recursive functions were inlined  and calls to recursive functions
3274: were replaced by iterative  constructs which approximated the liveness
3275: property of heap manipulations in  the function bodies.  The result of
3276: the analysis  was used  to manually insert  \NULL\ assignments  in the
3277: original Java programs to create modified Java programs.
3278: 
3279: Manual interventions allowed us to handle procedure calls
3280: without performing interprocedural analysis. In order to automate the
3281: analysis and extend it to interprocedural level, we  
used SOOT~\cite{vall99soot} which has built-in support for
3283: many of our requirements. However,
3284: compared to the Prolog representation of programs,
3285: the default Jimple representation used by SOOT is 
3286: not efficient for our purposes because it introduces a large number of
3287: temporaries and contains all statements even if they do not affect
3288: heap reference analysis.
3289: 
3290: As was described earlier, our interprocedural analysis is very simplistic.
3291: Our experience shows that imprecision of interprocedural
3292: alias analysis increases the size of alias information
3293: thereby making the analysis inefficient apart from reducing the precision 
3294: of the resulting information. 
3295: This effect has been worsened by the fact that SOOT introduces 
3296: a large number of temporary variables. 
3297: Besides, the complete alias information is not required for our purposes.
3298: 
3299: 
3300: We believe that our approach can be made much more scalable by 
3301: \begin{itemize}
3302: \item Devising a method of avoiding full alias analysis and computing only 
3303:        the required alias information, and
3304: \item Improving the Jimple representation by eliminating redundant 
3305:        information, combining multiple successive uses into a single
3306:        statement etc.
3307: \end{itemize} 
3308: 
3309: The implementations, along with the test programs (with their
3310: original, modified, and Prolog versions) are available
3311: at~\cite{hra.prototype}.
3312: 
3313: \newcommand{\NIter}{\mbox{\#Iter}}
3314: \newcommand{\NGph}{\mbox{\#G}}
3315: \newcommand{\Nnull}{\mbox{\#\NULL}}
3316: 
3317: \begin{figure}[t]
3318: \begin{center}
3319: \renewcommand{\arraystretch}{1}
3320: \renewcommand{\rotatebox}[2]{#2}
3321: \begin{tabular}{|@{}c@{}|}
3322: \hline
3323: Intraprocedural analysis of selected method (Prolog Implementation) \\ \hline \hline
3324: \begin{tabular}{@{\ }l@{\ }|r|r|r|r|r|r|r|r|r|r}
3325: Program
3326: & \multicolumn{2}{|c|}{Analysis} 
3327: & \multicolumn{5}{|c|}{Access Graphs}
3328: & \multicolumn{1}{|c|}{}
3329: & \multicolumn{2}{|c@{}}{Execution} 
3330: \\ \cline{2-3}\cline{4-8}
3331: Name.Function
3332: & \multicolumn{1}{|c|}{} 
3333: & \multicolumn{1}{|c|}{Time} 
3334: & \multicolumn{1}{|c|}{}
3335: & \multicolumn{2}{|c|}{Nodes}
3336: & \multicolumn{2}{|c|}{Edges}
3337: & \multicolumn{1}{|c|}{}
3338: & \multicolumn{2}{|c@{}}{Time (sec)} 
3339: \\ \cline{5-8}\cline{10-11}
3340: & \multicolumn{1}{|c|}{\rotatebox{90}{\NIter}}
3341: & \multicolumn{1}{|c|}{(sec)} 
3342: & \multicolumn{1}{|c|}{\NGph}
3343: & \multicolumn{1}{|c|}{\rotatebox{90}{Avg}}
3344: & \multicolumn{1}{|c|}{\rotatebox{90}{Max}}
3345: & \multicolumn{1}{|c|}{\rotatebox{90}{Avg}}
3346: & \multicolumn{1}{|c|}{\rotatebox{90}{Max}}
3347: & \multicolumn{1}{|c|}{\rotatebox{90}{\Nnull}}
3348: & \multicolumn{1}{|c|}{Orig.} 
3349: & \multicolumn{1}{|c@{}}{Mod.}
3350: \\
3351: \hline\hline
3352: \texttt{Loop.main}    & 5& 0.082 & 172 & 1.13 & 2 &  0.78 &  2 &  9
3353: & 1.503 & 1.388 \\ \hline
3354: \texttt{DLoop.main}   & 5& 1.290 & 332 & 2.74 & 4 &  5.80 & 10 & 11
3355: & 1.594 & 1.470 \\ \hline
3356: \texttt{CReverse.main}& 5& 0.199 & 242 & 1.41 & 4 &  1.10 &  6 &  8
3357: & 1.512 & 1.414 \\ \hline
3358: \texttt{BiSort.main}  & 6& 0.083 &  63 & 2.16 & 3 &  3.81 &  6 &  5
3359: & 3.664 & 3.646 \\ \hline
3360: \texttt{TreeAdd.addtree} & 6& 0.255 & 132 & 2.84 & 7 &  4.87 & 14 &  7
3361: & 1.976 & 1.772 \\ \hline
3362: \texttt{GCBench.Populate} & 6& 0.247 & 136 & 2.73 & 7 &  4.63 & 14 &  7
3363: & 132.99 & 88.86 %\\ \hline
3364: %\multicolumn{11}{c}{}
3365: \end{tabular}
3366: \\
3367: \hline \hline
3368: Interprocedural analysis of all methods (SOOT Implementation) \\ \hline
3369: \hline
3370: \begin{tabular}{l|r|r|r|r|r|r|r|r|r|r}
3371: \multicolumn{1}{c|}{Program}
3372: & \multicolumn{1}{|c|}{LOC}
3373: & \multicolumn{1}{|c|}{\#}
3374: & \multicolumn{1}{|c|}{Analysis}
3375: & \multicolumn{4}{|c|}{Access Graph Stats}
3376: & \multicolumn{1}{|c|}{} 
3377: & \multicolumn{2}{|c}{Execution} 
3378: \\ \cline{5-8}
3379: \multicolumn{1}{c|}{Name}
3380: & \multicolumn{1}{|c|}{in}
3381: & \multicolumn{1}{|c|}{methods}
3382: & \multicolumn{1}{|c|}{Time}
3383: & \multicolumn{2}{|c|}{Nodes}
3384: & \multicolumn{2}{|c|}{Edges}
3385: & \multicolumn{1}{|c|}{\Nnull} 
3386: & \multicolumn{2}{|c}{Time (sec)} 
3387: \\ \cline{5-8}\cline{10-11}
3388: \multicolumn{1}{c|}{}
3389: & \multicolumn{1}{|c|}{Jimple}
3390: & \multicolumn{1}{|c|}{}
3391: & \multicolumn{1}{|c|}{(sec.)}
3392: & \multicolumn{1}{|c|}{Max}
3393: & \multicolumn{1}{|c|}{Avg}
3394: & \multicolumn{1}{|c|}{Max}
3395: & \multicolumn{1}{|c|}{Avg}
3396: & \multicolumn{1}{|c|}{} 
3397: & \multicolumn{1}{|c|}{Orig.} 
3398: & \multicolumn{1}{|c}{Mod.}
3399: \\ \hline\hline
3400: \texttt{Loop}    &  83 &  2 &  0.558  & 2 & 1.24 & 2 & 0.39 & 12 & 1.868 & 1.711 \\ \hline
3401: \texttt{DLoop}   &  78 &  2 & 20.660  & 5 & 1.45 & 12 & 0.76 & 12 & 1.898 & 1.772\\ \hline
3402: \texttt{CReverse}&  85 &  2 &  1.833  & 3 & 1.39 & 4 & 0.51 & 12 & 1.930 & 1.929 \\ \hline
3403: \texttt{BiSort}  & 466 & 12 &  1.498  & 7 & 1.29 & 10 & 0.40 & 77 & 1.519 & 1.524 \\ \hline
3404: \texttt{TreeAdd} & 228 &  4 &  0.797  & 6 & 1.29 & 7 & 0.46 & 34 & 2.704 & 2.716\\ \hline
3405: \texttt{GCBench} & 226 &  9 &  1.447  & 4 & 1.13 & 5 & 0.16 & 56 & 122.731 & 60.372 \\ 
3406: \hline
3407: \end{tabular}
3408: \\ \multicolumn{1}{c}{}
3409: \end{tabular}
3410: 
3411: \hspace*{.25in}
3412: %\begin{minipage}{\textwidth}
3413: \begin{itemize}
3414: \item[-] \NIter\ is the maximum number of iterations taken by any analysis. 
3415: \item[-] Analysis Time is the total time taken by all analyses. 
\item[-] \NGph\ is the total number of access graphs created by alias
analysis and liveness analysis. The Prolog implementation also performs alias analysis
using access graphs.
\item[-] Max nodes (edges) is the maximum number of nodes (edges) over
all access graphs.
In some cases, the maximum number of nodes/edges is larger in the intraprocedural 
analysis due to the presence of longer paths in explicitly
supplied boundary information, which gets replaced by a single *
node in interprocedural analysis.
3425: \item[-] Avg nodes (edges) is the average number of nodes (edges) over
3426: all access graphs.
3427: \item[-] \Nnull\ is the number of inserted \NULL\ assignments.
3428: \end{itemize}
3429: %\end{minipage}
3430: \end{center}
3431: \vspace*{-.15in}
\caption{Empirical measurements of the proof-of-concept implementations of the heap reference
analyzer.}
3434: \label{tab:space.time.data}
3435: \rule{\textwidth}{.2mm}
3436: \end{figure}
3437: 
3438: 
3439: \subsection{Measurements and Observations}
3440: \label{sec:measurements.obs}
3441: 
3442: 
3443: Our experiments were directed at measuring:
3444: \begin{enumerate}
3445: \item {\em The efficiency of analysis}.  We measured the total time
3446:   required, number of iterations of round robin analyses, and the
3447:   number and sizes of access graphs.
3448: \item {\em The effectiveness of \NULL\ assignment insertions}.  The
3449:   programs were made to create huge data structures.  Memory usage was
  measured by explicit calls to the garbage collector in both the modified and
  the original Java programs at specific probing points such as call
3452:   sites, call returns, loop begins and loop ends.  The overall
3453:   execution time for the original and the modified programs was also
3454:   measured.
3455: \end{enumerate}
3456: 
3457: The results of our experiments are shown in Figure~\ref{fig:plots} and
3458: Figure~\ref{tab:space.time.data}. As can be seen from
3459: Figure~\ref{fig:plots}, nullification of links helped the garbage
3460: collector to collect a lot more garbage, thereby reducing the
allocated heap memory. In the case of \texttt{BiSort}, however, the links were last
used within a recursive procedure which was called multiple times.
Hence, the safety criteria prevented \NULL\ assignment insertion within
3464: the called procedure. Our analysis could only nullify the root of the
3465: data structure at the end of the program. Thus the memory was released
3466: only at the end of the program.
3467: 
For interprocedural analysis, class files for both the original and the
modified programs were generated using SOOT.  As can be seen from
Figure~\ref{tab:space.time.data}, the modified programs generally executed
faster. In general, a reduction in execution time can be attributed to
the following two factors: (a) a decrease in the number of calls to
the garbage collector and (b) a reduction in the time taken for garbage
collection in each call. The former is possible because a larger
amount of free memory is available; the latter is possible
because less reachable memory needs to be copied.\footnote{This
happens because the Java Virtual Machine uses a copying garbage
collector.} In our experiments, factor~(a) was absent because
the number of (explicit) calls to the garbage collector was kept the
same. \texttt{GCBench} showed a large improvement in execution time
after \NULL\ assignment insertion.  This is because \texttt{GCBench}
creates large trees in the heap which are not used in the program after
creation, and our implementation was able to nullify the left and right
subtrees of these trees immediately after their creation.  This also
reduced the high water mark of the heap memory requirement.
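
As a rough illustration of factor~(b), consider the following
back-of-the-envelope model; the symbols $T$, $T_{m}$, $k$ and $c_i$ are
notation introduced only for this sketch and were not measured
individually. If a run performs $k$ explicit collections, the $i$th
of which costs $c_i$ (roughly proportional to the reachable memory
that is copied), and the remaining work takes time $T_{m}$, then
\[
T \;=\; T_{m} + \sum_{i=1}^{k} c_i .
\]
Nullification leaves $k$ unchanged in our experiments, so, under this
model, a gain is expected to come from smaller $c_i$. Taking the last
two columns of Figure~\ref{tab:space.time.data} for \texttt{GCBench}
and \texttt{Loop},
\[
\frac{122.731}{60.372} \approx 2.03, \qquad
1 - \frac{1.711}{1.868} \approx 0.08 ,
\]
i.e.\ the execution time of \texttt{GCBench} is roughly halved, whereas
that of \texttt{Loop} drops by about 8\%.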
3486: 
As explained in Section~\ref{sec:complexity}, the access
graphs are small in terms of the average number of nodes and edges. This can be
verified from Figure~\ref{tab:space.time.data}. The analysis of
\texttt{DLoop} creates a large number of access graphs because of the
presence of cycles in the heap. In such a case, a large number of alias
pairs are generated, many of which are redundant. Though it is
possible to reduce analysis time by eliminating redundant alias pairs,
our implementation, being a proof-of-concept implementation, does not do so
for the sake of simplicity.
3496: 
Our technique and implementation compare well with the
technique and results described in~\cite{ran.shaham-sas03}.
A conceptual comparison with this method is included in Section~\ref{sec:ran.comparison}.
The implementation described in~\cite{ran.shaham-sas03} runs on a 900 MHz
P-III with 512 MB RAM running Windows 2000. It takes 1.76 seconds,
2.68 seconds and 4.79 seconds respectively for \texttt{Loop},
\texttt{DLoop} and \texttt{CReverse} for \NULL\ assignment insertion.
The time required by our implementation for the above mentioned
programs is given in Figure~\ref{tab:space.time.data}. Our
implementation automatically computes the program points for \NULL\
insertion whereas their method cannot do so.
Our implementation performs much
better in all cases.
3510: 
3511: \section{Extensions for C++ }
3512: \label{sec:c++ext}
3513: 
3514: This approach becomes applicable to C++ by extending the concept of access graphs
3515: to faithfully represent the C++ memory model. It is assumed that the memory 
3516: which becomes unreachable due to nullification of pointers is reclaimed by
3517: an independent garbage collector. Otherwise, explicit reclamation of
3518: memory can be performed by checking that no node-alias of a nullified
3519: pointer is live.
3520: 
3521: In order to extend the concept of access graphs to C++, 
3522: we need to account for two major differences between the C++ and
3523: the Java memory model:
3524: \begin{enumerate}
\item Unlike Java, C++ has explicit pointers.  A field of a structure
3526:   ({\tt struct} or {\tt class}) can be accessed in two different ways
3527:   in C++: 
3528:   \begin{itemize}
3529:   \item using pointer dereferencing ({$*.$}),
3530:   e.g. {$(*x).\lptr$}\footnote{This is equivalently written as
3531:   $x\!-\!\!\!\!\!>\!\!\lptr$.} or 
  \item using simple dereferencing
  ({$.$}), e.g.\ {$y.\rptr$}.
3534:   \end{itemize} 
3535:   We need to distinguish between the two.
3536:   
\item Although root variables are allocated on the stack in both C++ and
  Java, C++ allows a pointer/reference to point to root variables on the stack
  through the use of the address-of ({\tt\&}) operator, whereas
  Java does not allow a reference to point to the stack.
  Since the root nodes in access graphs do not have an incoming edge by
  definition, it is not possible to use access graphs
  directly to represent memory links in C++.
3544:  % However, this
3545:  % difficulty is very easily circumvented through the following
3546:  % extension:
3547:  % We need to create a view
3548:  % of C++ memory model such that this view follows Java model.
3549: \end{enumerate}
3550: 
3551: \newcommand{\deref}{\mbox{\sf\em deref}}
3552: 
We create access graphs for the C++ memory model as follows:
3554: \begin{enumerate}
\item We treat the dereference of a pointer as a field reference, i.e., $*$
is considered as a field named \deref. For example, the access expression $(*x).\lptr$
is viewed as $x.\deref.\lptr$, and the corresponding access path is
\mbox{$x\myarrow\deref\myarrow\lptr$}. The access path for $x.\lptr$ is
\mbox{$x\myarrow\lptr$}.
3560: 
\item Though a pointer can point to a variable $x$, it is not
possible to extract the address of $\&x$, i.e.\ no pointer can point to $\&x$.
For Java, we partitioned memory into stack and
heap, and made the root variables of access graphs correspond to stack
variables. In C++, we partition the memory into {\em addresses of variables}
and the rest of the memory (stack and heap together). We make the roots of
access graphs correspond to addresses of variables. A root variable $y$
is represented as $\deref\,(\&y)$. Thus,
3569: %
3570: \scalebox{.9}{%
3571: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3572: \rule[-.2cm]{0cm}{.6cm}%
3573: \raisebox{1.35mm}{\rnode{n0}{}} \ \ \  \
3574: \rule{0cm}{.4cm}\raisebox{.6mm}{\circlenode[framesep=.4mm]{n1}{\small$\&y$}}
3575: \ \  \
3576: \raisebox{.6mm}{\circlenode[framesep=.5mm]{n2}{$d_1$}}
3577: \ncline{->}{n1}{n2}
3578: \ \  \
3579: \raisebox{.6mm}{\circlenode[framesep=.7mm]{n3}{$l_2$}}
3580: \ncline{->}{n2}{n3}
3581: \ncline[doubleline=true]{->}{n0}{n1}
3582: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3583: }
3584: %
represents the access paths $\&y$, \mbox{$\&y\myarrow \deref$} and
\mbox{$\&y\myarrow \deref\myarrow l$}, which correspond to the access expressions
$\&y$, $y$ and $y.l$ respectively.
3588: 
3589: \end{enumerate}
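
Composing the two rules above gives a mechanical translation from
access expressions to access paths. The following table is a sketch
obtained by applying rule~2 to the root variable and rule~1 to each
explicit $*$; it is meant as an illustration rather than an exhaustive
definition, and it assumes that $x$ is a pointer variable and $y$ is a
structure variable:
\[
\begin{array}{lcl}
\&y         & \mapsto & \&y \\
y           & \mapsto & \&y \myarrow \deref \\
y.\rptr     & \mapsto & \&y \myarrow \deref \myarrow \rptr \\
{*x}        & \mapsto & \&x \myarrow \deref \myarrow \deref \\
(*x).\lptr  & \mapsto & \&x \myarrow \deref \myarrow \deref \myarrow \lptr
\end{array}
\]
In this reading, each use of a variable contributes one \deref\ edge
from the node for its address, and each explicit $*$ contributes one
more.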
3590: 
Handling pointer arithmetic and type casting in C++ is orthogonal to
the above discussion, and requires techniques similar
to those of~\cite{yong99pointer,cheng00modular}.
3594: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3595: 
3596: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3597: \section{Related Work}
3598: \label{sec:related}
3599: 
Several properties of the heap (viz.\ reachability, sharing, liveness etc.) have
been explored in the past; a good review has been provided by~\citeN{shape.chap}.
In this section, we review the related work on the main
property of interest: liveness.  We are not aware of
past work on availability and anticipability analysis of heap
references.
3606: 
3607: \subsection{Liveness Analysis of Heap Data}
Most of the reported literature on liveness analysis of heap data
either does not address liveness of individual objects or
addresses liveness of objects identified by their allocation sites.
Our method, by contrast, does not need knowledge of the allocation site.
Since the precision of a
garbage collector depends on its ability to distinguish between
reachable heap objects and live heap objects, even state-of-the-art
garbage collectors leave a significant amount of garbage
uncollected~\cite{AgesenDetMos98,shah00,shah01,shah02}.  All reported
attempts to incorporate liveness in garbage collection have been quite
approximate. The known approaches have been:
3619: \begin{enumerate}
\item {\em Liveness of root variables.}  A popular approach (which has
  also been used in some state-of-the-art garbage collectors) involves
  identifying liveness of root variables on the stack. All heap objects
  reachable from the live root variables are considered
  live~\cite{AgesenDetMos98}.
3625: \item {\em Imposing stack discipline on heap objects.}  These
3626:   approaches try to change the statically unpredictable lifetimes of
3627:   heap objects into predictable lifetimes similar to stack data. They
3628:   can be further classified as
3629:   \begin{itemize}
  \item {\em Allocating objects on call stack}.  These approaches try to
    detect which objects can be allocated on the stack frame so that they
    are automatically deallocated without the need for traditional
    garbage collection. A profile based approach which tracks the last
    use of an object is reported in~\cite{mcdo98}, while a static
    analysis based approach is reported in~\cite{reid99}.

    Some approaches ask the converse question: which objects are
    unstackable (i.e.\ their lifetimes outlive the procedure which
    created them)?  They use abstract interpretation and perform {\em
    escape analysis\/} to discover objects which {\em escape\/} a
    procedure~\cite{Blanchet:1999:EAO,Blanchet:2003:EAJ,choi99escape}.
    All other objects are allocated on the stack.
  \item {\em Associating objects with call stack}~\cite{cann00}. This
    approach identifies the stackability of objects. The objects are allocated in
3645:     the heap but are associated with a stack frame and the runtime
3646:     support is modified to deallocate these (heap) objects when the
3647:     associated stack frame is popped.
3648:   \item {\em Allocating objects on separate stack}. This approach uses
3649:     a static analysis called {\em region
3650:     inference\/}~\cite{tofte98region,tofte-region-pldi-02} to identify
3651:     {\em regions\/} which are storages for objects. These regions are
3652:     allocated on a separate region stack.
3653:   \end{itemize}
3654:   All these approaches require modifying the runtime support for the
3655:   programs.
\item {\em Liveness analysis of locally allocated objects.} The
  Free-Me approach~\cite{guyer06free} combines a lightweight pointer
  analysis with liveness information to detect when allocated objects
  die, and inserts statements to free such objects.  The analysis is
  simpler and cheaper as its scope is limited, but it frees
  locally allocated objects only, by separating objects which escape
  the procedure call from those which do not.  The objects which do not
  escape the procedure which creates them become unreachable at the end of the
  procedure anyway and would be garbage collected.
  Thus their method merely advances the work
  of garbage collection instead of creating new garbage. Further, this
  does not happen in the called method. Moreover, their method uses
  traditional liveness analysis for root variables only and hence
  cannot free objects that are stored in field references.
\item {\em Shape analysis based approaches.} The two approaches
     in this category are
3672:      \begin{itemize}
      \item The Heap Safety Automaton approach~\cite{ran.shaham-sas03} is a
      recently reported work which
      comes closest to our approach since it tries to determine if a
      reference can be made \NULL.  We discuss this approach in
      Section~\ref{sec:ran.comparison}.
      \item \citeN{cherem06compile} use a shape analysis
      framework~\cite{hackett05region} to analyze a single heap cell to
      discover the point in the
      program where the object becomes unreachable. Their method reclaims the objects
      at such points, thereby reducing the work of the garbage collector. They
      use equivalence classes of expressions to store definite points-to and
      definitely-not points-to information in order to increase the
      precision of abstract reference counts.  However, multiple iterations
      of the analysis and the optimization steps are required, since freeing
      a cell might result in opportunities for more deallocations.  Their
      method does not take into account the last use of an object, and
      therefore does not make additional objects unreachable.
3690:      \end{itemize}
3691: \end{enumerate}
3692: 
3693: \subsection{Heap Safety Automaton Based Approach}
3694: \label{sec:ran.comparison}
3695: 
This approach models the safety of inserting a \NULL\ assignment at a given
point by an automaton. A shape graph based abstraction of the program
is then model-checked against the heap safety automaton.
They additionally consider freeing the object; our approach can
be easily extended to include freeing.
3701: 
3702: The fundamental differences between the two approaches are
3703: \begin{itemize}
3704: \item Their method answers the following question: Given an access
3705:   expression and a program point, can the access expression be set to
3706:   \NULL \ immediately after that program point? However, they leave a
3707:   very important question unanswered: Which access expressions should
3708:   we consider and at which point in the program?  It is impractical to
3709:   use their method to ask this question for every pair of access
3710:   expression and program point.  Our method answers both the questions
3711:   by finding out appropriate access expressions and program points.
3712: \item We insert \NULL\ assignments at the earliest possible point.
3713:   The effectiveness of any method to improve garbage collection
3714:   depends crucially on this aspect.  Their method does not address
3715:   this issue directly.
\item As noted in Section~\ref{sec:measurements.obs}, their method is
  inefficient in practice. For a simple Java program containing 11
  lines of executable statements, it requires over 1.37 MB of storage and
  1.76 seconds to answer the question: Can the variable $y$
  be set to \NULL \ after line 10?
3721: \end{itemize}
3722: Hence our approach is superior to their approach in terms of
3723: completeness, effectiveness, and efficiency.
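
To see why exhaustive querying is impractical, a rough estimate may
help; the counts used below are hypothetical and are chosen only for
illustration. If a program has $|P|$ program points and $|E|$
candidate access expressions, and each query takes about $t_q$, then
asking the question for every pair costs about
\[
|P| \times |E| \times t_q .
\]
Even for the 11-statement program mentioned above, assuming (say) 5
candidate access expressions and $t_q = 1.76$ seconds per query, this
amounts to
\[
11 \times 5 \times 1.76 = 96.8 \mbox{ seconds},
\]
and the product grows with both the size of the program and the
number of access expressions considered.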
3724: 
3752: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3753: \section{Conclusions and Further Work }
3754: \label{sec:conclusions}
3755: 
3756: Two fundamental challenges in analyzing heap data are that the
3757: temporal and spatial structures of heap data seem arbitrary and are
3758: unbounded.  The apparent arbitrariness arises due to the fact that the
3759: mapping between access expressions and l-values varies dynamically.
3760: 
3761: The two key insights which allow us to overcome the above problems in the
3762: context of liveness analysis of heap data are: 
3763: \begin{itemize}
3764: \item {\em Creating finite representations for properties of heap data using
3765:   program structure}.
We create an abstract representation of the heap in terms of sets of
access paths.  Further, a bounded representation, called access
graphs, is used for summarizing sets of access paths. Summarization is
based on the fact that the heap can be viewed as consisting of
repeating patterns which bear a close resemblance to the program
structure. Access graphs capture this fact directly by tagging
access graph nodes with program points.  Unlike
3773: \cite{horw89dependence,ChaseWegZad90,choi93efficient,wilson95efficient,hind99interprocedural}
3774: where only memory allocation points are remembered, we remember all
3775: program points where references are used.  This allows us to combine
3776: data flow information arising out of the same program point, resulting
3777: in bounded representations of heap data. These representations are
3778: simple, precise, and natural. 
3779: 
3780: The dynamically varying mapping between access expressions and
3781: l-values is handled by abstracting out regions in the heap which can
3782: possibly be accessed by a program.  These regions are represented by
3783: sets of access paths and access graphs which are manipulated using a
3784: carefully chosen set of operations.  The computation of access graphs
3785: and access paths using data flow analysis is possible because of their
3786: finiteness and the monotonicity of the chosen operations.  We define
3787: data flow analyses for liveness, availability and
anticipability of heap references.  Liveness analysis is an
any-path problem; hence it involves unbounded information, requiring
access graphs as data flow values. Availability and anticipability
analyses are all-paths problems; hence they involve bounded
information, which is represented by finite sets of access paths.
3793: 
3794: \item {\em Identifying the minimal information which covers every live link
3795:   in the heap}. An interesting aspect of our liveness analysis is that the
3796:   property of explicit liveness captures the minimal information which covers
3797:   every link which can possibly be live. Complete liveness is computed by
3798:   incorporating alias information in explicit liveness.
3799: \end{itemize}
3800: 
3801: 
3802: 
An immediate application of these analyses is a technique to improve
garbage collection. This technique works by identifying objects which
are dead and rendering them unreachable by setting the links to them to \NULL\ as
early as possible. Though this idea was previously known to yield
benefits~\cite{gcfaq}, nullification of dead objects was based on
profiling~\cite{shah01,shah02}.  Our method, instead, is based on
static analysis.
3810: 
3811: %We intend to pursue future work in the following two directions:
For future work, we see scope for improvement at both the conceptual
level and the implementation level.
3814: \begin{enumerate}
3815: \item {\em Conceptual Aspects\/}.
3816: \begin{enumerate}
3817: \item Since the scalability of  our method critically depends on the scalability
3818:       of alias  analysis, we would like  to explore the  possibility of avoiding
3819:       computation  of complete alias  information at  each program  point. Since
3820:       explicit  liveness  does not  require  alias  information, an  interesting
3821:       question for further investigation is:  Just how much alias information is
3822:       enough to compute complete liveness from explicit liveness?  This question
3823:       is important because:
3824:       \begin{itemize} 
3825:       \item Not all aliases contribute to complete liveness.
3826:       \item  Even  when  an  alias  contributes  to liveness,  it  needs  to  be
3827:             propagated over a limited region of the program.
3828:       \end{itemize}
3829:       
3830:       \item  We have  proposed an  efficient version  of  call strings
3831:       based  interprocedural  data  flow  analysis in  an  independent
3832:       work~\cite{bageshri07phd}.    It  is   a  generic
3833:       approach which  retains full context sensitivity.  We would like
3834:       to use it for heap reference analysis.
3835:       \item We would like to improve the \NULL\ insertion algorithm so
3836:         that the same link is not nullified more than once.
3837:       \item  We  would like  to  analyze  array  fragments instead  of
3838:         treating an entire array as  a scalar (and hence, all elements
3839:         as equivalent).
3840:       \item  We  would also  like to  extend the  scope  of heap
      \item We would also like to extend the scope of heap
	reference analysis to functional languages.  The basic method
	and the details of the liveness analysis are already
	finalized~\cite{karkare07liveness}.  The details of other
	analyses are being finalized~\cite{karkare07hra}.
3846: \end{enumerate}
3847: \item {\em Implementation Related Aspects\/}.
3848:   \begin{enumerate}
3849:   \item We would also like  to implement this approach for C/C++
3850:     and use it for plugging memory leaks statically.
3851:   \item  Our  experience  with  our  proof-of-concept  implementations
3852:     indicates that the engineering  choices made in the implementation
3853:     have a significant  bearing on the performance of  our method. For
3854:     example, we would like to use a better representation than the one
3855:     provided by SOOT.
3856:   \end{enumerate}
3857: \end{enumerate}
3858: 
3859: We would also like to apply the summarization heuristic to other
3860: analyses. Our initial explorations indicate that a similar approach
3861: would be useful for extending static inferencing of flow-sensitive
3862: monomorphic types~\cite{kdm.typeinferencing} to include polymorphic
3863: types.  This is possible because polymorphic types represent an
3864: infinite set of types and hence discovering them requires
3865: summarizing unbounded information.
3866: 
3867: \begin{acks}
Several people have contributed to this work over the past few years. We would
particularly like to thank Smita Ranjan, Asit Varma, Deepti Gupta, Neha
Adkar, C.\ Arunkumar, Mugdha Jain, and Reena Kharat.  Neha's work was
supported by Tata Infotech Ltd.  An initial version of the prototype
implementation of this work was partially supported by Synopsys India
Ltd.  Amey Karkare has been supported by an Infosys Fellowship. We are
thankful to the anonymous referee for detailed remarks.
3875: \end{acks}
3876: 
3877: \bibliography{gc}
3878: 
3879: \appendix
3880: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3881: 
3882: \section{Non-Distributivity in Heap Reference Analysis}
3883: 
3884: \label{sec:non-distributivity} 
3885: 
3886: Explicit liveness analysis defined in this paper is not distributive
3887: whereas availability and anticipability analyses are distributive.
3888: %
3889: %
3890: %%%\begin{example}
3891: %%%\label{exmp:non-distributivity.alias}
3892: %%%Figure~\ref{fig:spurious.cyclicity} shows an example of
3893: %%%non-distributivity of aliasing. Since every may-node-alias of
3894: %%%$y\myarrow n$ should be may-node-aliased to every may-node-alias of
3895: %%%$z$, our analysis concludes that $x$ is aliased to $x\myarrow
3896: %%%n$. However, it can be verified from the memory graphs that this is
3897: %%%not possible.  Let $f_4$ denote the flow function of block 4. Then, %
3898: %%%it is easy to see that
3899: %%%\[
3900: %%%f_4 (\NodeAout(2) \cup \NodeAout(4))
3901: %%%\supset f_4 (\NodeAout(2)) \cup f_4(\NodeAout(4))
3902: %%%\]
3903: %%%because of the spurious alias \mbox{$\langle x, x\myarrow n\rangle$}.
3904: %%%%disappears when the flow function is applied before merging data flow
3905: %%%%information. 
3906: %%%The spurious node-alias \mbox{$\langle x, x\myarrow n\rangle$}
3907: %%%generates further spurious aliases due the cycle closure.
3908: %%%\mybox
3909: %%%\end{example}
3910: %%%
3911: %%%
3912: %%%
3913: %%%As Example~\ref{exmp:non-distributivity.alias} illustrates, aliasing
3914: %%%is non-distributive because though the confluence ($\cup$) is an exact
3915: %%%operation, the flow functions are not exact since they work on a
3916: %%%combination of elements in the input set.  The flow functions in
3917: %%%liveness analysis are exact because they work on individual elements
3918: %%%in the set of the paths represented by the access graph. However, it
3919: Explicit liveness analysis
3920: is non-distributive because of the approximation introduced by the $\cupG$
3921: operation.  \mbox{$G_1 \cupG G_2$} may contain access paths which are
3922: neither in $G_1$ nor in $G_2$.
3923: \begin{example}
3924: \label{exmp:non-distributivity.liveness}
3925: Figure~\ref{fig:liveness.non-distributivity} illustrates the
non-distributivity of explicit liveness analysis.  The liveness graphs associated
with the entry of each block are shown in shaded boxes.  Let $f_1$ denote
3928: the flow function which computes $x$-rooted liveness graphs at the
3929: entry of block 1.  Neither $\Lin{x}(2)$ nor $\Lin{x}(4)$ contains the
3930: access path \mbox{$x\myarrow r\myarrow n\myarrow r$} but their union
3931: contains it.  It is easy to see that
3932: \[
 f_1 (\Lin{x}(2) \cupG \Lin{x}(4)) \sqsubseteq_G
  f_1 (\Lin{x}(2))  \cupG f_1(\Lin{x}(4))
3935: \]
3936: \mybox
3937: \end{example}
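
The effect of $\cupG$ can also be seen on a small instance. The graphs
below are a simplified sketch with node labels chosen for
illustration; they are not claimed to be the exact graphs of
Figure~\ref{fig:liveness.non-distributivity}, and we assume that $\cupG$
identifies nodes carrying the same label. Let
\[
G_1:\; x \myarrow n_2 \myarrow n_6 \myarrow r_8
\qquad \mbox{and} \qquad
G_2:\; x \myarrow r_4 \myarrow n_6 \myarrow n_7 .
\]
Both graphs contain the node $n_6$, so in \mbox{$G_1 \cupG G_2$} this
node has two predecessors ($n_2$ and $r_4$) and two successors ($r_8$
and $n_7$). Dropping the program point subscripts, the union therefore
represents, among others, the access paths
\[
x \myarrow n \myarrow n \myarrow r, \quad
x \myarrow r \myarrow n \myarrow n, \quad
x \myarrow r \myarrow n \myarrow r, \quad
x \myarrow n \myarrow n \myarrow n ,
\]
of which the last two belong to neither $G_1$ nor $G_2$.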
3938: 
3939: \begin{figure}[t]
3940: \newcommand{\glx}{
3941: \scalebox{1}{%
3942: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3943: 	\raisebox{1.4mm}{\rnode{n0}{}} \ \ \ 
3944: 	\raisebox{.6mm}{\circlenode[framesep=.8mm]{n1}{$x$}}
3945: 	\ncline[doubleline=true]{->}{n0}{n1}
3946: 	\  \  \ \ 
3947: 	\raisebox{.8mm}{\circlenode[framesep=0]{n2}{$n_7$}}
3948: 	\ncline{->}{n1}{n2}%
3949: %	\Aput[.2mm]{$n$}%
3950: 	\rule{0cm}{.4cm}%
3951: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3952: }}
3953: \newcommand{\grx}{
3954: \scalebox{1}{%
3955: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3956: 	\raisebox{1.4mm}{\rnode{n0}{}} \ \ \ 
3957: 	\raisebox{.6mm}{\circlenode[framesep=.8mm]{n1}{$x$}}
3958: 	\ncline[doubleline=true]{->}{n0}{n1}
3959: 	\  \  \ \ 
3960: 	\raisebox{.8mm}{\circlenode[framesep=0]{n2}{$r_8$}}
3961: 	\ncline{->}{n1}{n2}%
3962: %	\Aput[.2mm]{$n$}%
3963: 	\rule{0cm}{.4cm}%
3964: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3965: }}
3966: \newcommand{\gnnrx}{
3967: \scalebox{1}{%
3968: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3969: \psset{unit=1mm}
3970: \begin{pspicture}(-3,1)(17,11)
3971: %\psframe(-3,1)(17,11)
3972: \psrelpoint{origin}{n00}{-2}{11}
3973: \rput(\x{n00},\y{n00}){\rnode{n00}{}}
3974: \psrelpoint{n00}{n01}{4}{-5}
3975: \rput(\x{n01},\y{n01}){\rnode{n01}{\pscirclebox[framesep=.8]{$x$}}}
3976: \psrelpoint{n01}{n0}{6}{0}
3977: \rput(\x{n0},\y{n0}){\rnode{n0}{\pscirclebox[framesep=0]{$n_6$}}}
3978: \psrelpoint{n0}{n1}{6}{3}
3979: \rput(\x{n1},\y{n1}){\rnode{n1}{\pscirclebox[framesep=.2]{$r_8$}}}
3980: \ncline{->}{n0}{n1}
3981: \psrelpoint{n0}{n1}{6}{-3}
3982: \rput(\x{n1},\y{n1}){\rnode{n1}{\pscirclebox[framesep=0]{$n_7$}}}
3983: \ncline{->}{n0}{n1}
3984: \ncline[doubleline=true]{->}{n00}{n01}
3985: \ncline{->}{n01}{n0}
3986: \end{pspicture}
3987: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3988: }}
3989: 
3990: \newcommand{\gnrx}{
3991: \scalebox{1}{%
3992: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
3993: \psset{unit=1mm}
3994: \begin{pspicture}(-3,1)(17,11)
3995: %\psframe(-3,1)(17,11)
3996: \psrelpoint{origin}{n00}{-2}{11}
3997: \rput(\x{n00},\y{n00}){\rnode{n00}{}}
3998: \psrelpoint{n00}{n01}{4}{-5}
3999: \rput(\x{n01},\y{n01}){\rnode{n01}{\pscirclebox[framesep=.8]{$x$}}}
4000: \psrelpoint{n01}{n0}{6}{-3}
4001: \rput(\x{n0},\y{n0}){\rnode{n0}{\pscirclebox[framesep=0]{$n_6$}}}
4002: \psrelpoint{n0}{n1}{6}{0}
4003: \rput(\x{n1},\y{n1}){\rnode{n1}{\pscirclebox[framesep=.2]{$r_8$}}}
4004: \ncline{->}{n0}{n1}
4005: \psrelpoint{n01}{n2}{6}{3}
4006: \rput(\x{n2},\y{n2}){\rnode{n2}{\pscirclebox[framesep=0]{$n_3$}}}
4007: \ncline{->}{n0}{n1}
4008: \ncline[doubleline=true]{->}{n00}{n01}
4009: \ncline{->}{n01}{n0}
4010: \ncline{->}{n01}{n2}
4011: \end{pspicture}
4012: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4013: }}
4014: 
4015: \newcommand{\gnnx}{
4016: \scalebox{1}{%
4017: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4018: \psset{unit=1mm}
4019: \begin{pspicture}(-3,1)(17,11)
4020: %\psframe(-3,1)(17,11)
4021: \psrelpoint{origin}{n00}{-2}{11}
4022: \rput(\x{n00},\y{n00}){\rnode{n00}{}}
4023: \psrelpoint{n00}{n01}{4}{-5}
4024: \rput(\x{n01},\y{n01}){\rnode{n01}{\pscirclebox[framesep=.8]{$x$}}}
4025: \psrelpoint{n01}{n0}{6}{-3}
4026: \rput(\x{n0},\y{n0}){\rnode{n0}{\pscirclebox[framesep=0]{$n_6$}}}
4027: \psrelpoint{n0}{n1}{6}{0}
4028: \rput(\x{n1},\y{n1}){\rnode{n1}{\pscirclebox[framesep=.2]{$n_7$}}}
4029: \ncline{->}{n0}{n1}
4030: \psrelpoint{n01}{n2}{6}{3}
4031: \rput(\x{n2},\y{n2}){\rnode{n2}{\pscirclebox[framesep=0]{$n_5$}}}
4032: \ncline{->}{n0}{n1}
4033: \ncline[doubleline=true]{->}{n00}{n01}
4034: \ncline{->}{n01}{n0}
4035: \ncline{->}{n01}{n2}
4036: \end{pspicture}
4037: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4038: }}
4039: 
4040: \newcommand{\gNnrx}{
4041: \scalebox{1}{%
4042: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4043: \psset{unit=1mm}
4044: \begin{pspicture}(-3,1)(22,12)
4045: %\psframe(-3,1)(22,12)
4046: \psrelpoint{origin}{n00}{-2}{11}
4047: \rput(\x{n00},\y{n00}){\rnode{n00}{}}
4048: \psrelpoint{n00}{n000}{4}{-5}
4049: \rput(\x{n000},\y{n000}){\rnode{n000}{\pscirclebox[framesep=.8]{$x$}}}
4050: \psrelpoint{n000}{n01}{6}{0}
4051: \rput(\x{n01},\y{n01}){\rnode{n01}{\pscirclebox[framesep=.1]{$n_2$}}}
4052: \psrelpoint{n01}{n0}{6}{-3}
4053: \rput(\x{n0},\y{n0}){\rnode{n0}{\pscirclebox[framesep=0]{$n_6$}}}
4054: \psrelpoint{n0}{n1}{6}{0}
4055: \rput(\x{n1},\y{n1}){\rnode{n1}{\pscirclebox[framesep=.2]{$r_8$}}}
4056: \ncline{->}{n0}{n1}
4057: \psrelpoint{n01}{n2}{6}{3}
4058: \rput(\x{n2},\y{n2}){\rnode{n2}{\pscirclebox[framesep=0]{$n_3$}}}
4059: \ncline{->}{n0}{n1}
4060: \ncline[doubleline=true]{->}{n00}{n000}
4061: \ncline{->}{n01}{n0}
4062: \ncline{->}{n01}{n2}
4063: \ncline{->}{n000}{n01}
4064: \end{pspicture}
4065: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4066: }}
4067: 
4068: \newcommand{\gRnnx}{
4069: \scalebox{1}{%
4070: \psset{unit=1mm}
4071: \begin{pspicture}(-3,-1)(22,11)
4072: %\psframe(-3,-1)(22,11)
4073: \psrelpoint{origin}{n00}{-2}{10}
4074: \rput(\x{n00},\y{n00}){\rnode{n00}{}}
4075: \psrelpoint{n00}{n000}{4}{-5}
4076: \rput(\x{n000},\y{n000}){\rnode{n000}{\pscirclebox[framesep=.8]{$x$}}}
4077: \psrelpoint{n000}{n01}{6}{0}
4078: \rput(\x{n01},\y{n01}){\rnode{n01}{\pscirclebox[framesep=.1]{$r_4$}}}
4079: \psrelpoint{n01}{n0}{6}{3}
4080: \rput(\x{n0},\y{n0}){\rnode{n0}{\pscirclebox[framesep=0]{$n_6$}}}
4081: \psrelpoint{n0}{n1}{6}{0}
4082: \rput(\x{n1},\y{n1}){\rnode{n1}{\pscirclebox[framesep=.2]{$n_7$}}}
4083: \ncline{->}{n0}{n1}
4084: \psrelpoint{n01}{n2}{6}{-3}
4085: \rput(\x{n2},\y{n2}){\rnode{n2}{\pscirclebox[framesep=0]{$n_5$}}}
4086: \ncline{->}{n0}{n1}
4087: \ncline[doubleline=true]{->}{n00}{n000}
4088: \ncline{->}{n01}{n0}
4089: \ncline{->}{n01}{n2}
4090: \ncline{->}{n000}{n01}
4091: \end{pspicture}
4092: }}
4093: \newcommand{\lout}{
4094: \scalebox{1}{%
4095: \psset{unit=1mm}
4096: \begin{pspicture}(-3,-4)(22,18)
4097: %\psframe(-3,-4)(22,18)
4098: \psrelpoint{origin}{n0}{-2}{11}
4099: \rput(\x{n0},\y{n0}){\rnode{n0}{}}
4100: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4101: \psrelpoint{n0}{n1}{3}{-6}
4102: \rput(\x{n1},\y{n1}){\rnode{n1}{\pscirclebox[framesep=1]{$x$}}}
4103: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4104: \psrelpoint{n1}{n2}{6}{3}
4105: \rput(\x{n2},\y{n2}){\rnode{n2}{\pscirclebox[framesep=.2]{$n_2$}}}
4106: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4107: \psrelpoint{n2}{n3}{6}{3}
4108: \rput(\x{n3},\y{n3}){\rnode{n3}{\pscirclebox[framesep=.2]{$n_3$}}}
4109: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4110: \psrelpoint{n1}{r4}{6}{-3}
4111: \rput(\x{r4},\y{r4}){\rnode{r4}{\pscirclebox[framesep=.4]{$r_4$}}}
4112: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4113: \psrelpoint{r4}{n5}{6}{-3}
4114: \rput(\x{n5},\y{n5}){\rnode{n5}{\pscirclebox[framesep=.2]{$n_5$}}}
4115: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4116: \psrelpoint{r4}{n6}{6}{3}
4117: \rput(\x{n6},\y{n6}){\rnode{n6}{\pscirclebox[framesep=.2]{$n_6$}}}
4118: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4119: \psrelpoint{n6}{n7}{6}{3}
4120: \rput(\x{n7},\y{n7}){\rnode{n7}{\pscirclebox[framesep=.2]{$n_7$}}}
4121: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4122: \psrelpoint{n6}{r8}{6}{-3}
4123: \rput(\x{r8},\y{r8}){\rnode{r8}{\pscirclebox[framesep=.2]{$r_8$}}}
4124: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4125: \ncline[doubleline=true]{->}{n0}{n1}
4126: \ncline[nodesep=-.5]{->}{n1}{n2}
4127: \ncline[nodesep=-.5]{->}{n1}{r4}
4128: \ncline[nodesep=-.5]{->}{n2}{n3}
4129: \ncline[nodesep=-.5]{->}{n2}{n6}
4130: \ncline[nodesep=-.5]{->}{r4}{n5}
4131: \ncline[nodesep=-.5]{->}{r4}{n6}
4132: \ncline[nodesep=-.5]{->}{n6}{n7}
4133: \ncline[nodesep=-.5]{->}{n6}{r8}
4134: \psrelpoint{origin}{n0}{4}{15}
4135: \rput(\x{n0},\y{n0}){\rnode{n0}{$\Lout{x}(1)$}}
4136: \end{pspicture}
4137: }}
4138: 
4139: \newcommand{\lina}{
4140: \scalebox{1}{%
4141: \psset{unit=1mm}
4142: \begin{pspicture}(-3,-3)(22,18)
4143: %\psframe(-3,-3)(22,18)
4144: \psrelpoint{origin}{n0}{-2}{10}
4145: \rput(\x{n0},\y{n0}){\rnode{n0}{}}
4146: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4147: \psrelpoint{n0}{n1}{4}{-6}
4148: \rput(\x{n1},\y{n1}){\rnode{n1}{\pscirclebox[framesep=1]{$x$}}}
4149: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4150: %\psrelpoint{n1}{n2}{7}{4}
4151: %\rput(\x{n2},\y{n2}){\rnode{n2}{\pscirclebox[framesep=.2]{$n_2$}}}
4152: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4153: %\psrelpoint{n2}{n3}{7}{4}
4154: %\rput(\x{n3},\y{n3}){\rnode{n3}{\pscirclebox[framesep=.2]{$n_3$}}}
4155: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4156: \psrelpoint{n1}{r4}{6}{0}
4157: \rput(\x{r4},\y{r4}){\rnode{r4}{\pscirclebox[framesep=.4]{$r_4$}}}
4158: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4159: \psrelpoint{r4}{n5}{6}{-3}
4160: \rput(\x{n5},\y{n5}){\rnode{n5}{\pscirclebox[framesep=.2]{$n_5$}}}
4161: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4162: \psrelpoint{r4}{n6}{6}{3}
4163: \rput(\x{n6},\y{n6}){\rnode{n6}{\pscirclebox[framesep=.2]{$n_6$}}}
4164: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4165: \psrelpoint{n6}{n7}{6}{3}
4166: \rput(\x{n7},\y{n7}){\rnode{n7}{\pscirclebox[framesep=.2]{$n_7$}}}
4167: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4168: \psrelpoint{n6}{r8}{6}{-3}
4169: \rput(\x{r8},\y{r8}){\rnode{r8}{\pscirclebox[framesep=.2]{$r_8$}}}
4170: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4171: \ncline[doubleline=true]{->}{n0}{n1}
4172: %\ncline[nodesep=-.5]{->}{n1}{n2}
4173: \ncline[nodesep=-.5]{->}{n1}{r4}
4174: %\ncline[nodesep=-.5]{->}{n2}{n3}
4175: %\ncline[nodesep=-.5]{->}{n2}{n6}
4176: \ncline[nodesep=-.5]{->}{r4}{n5}
4177: \ncline[nodesep=-.5]{->}{r4}{n6}
4178: \ncline[nodesep=-.5]{->}{n6}{n7}
4179: \ncline[nodesep=-.5]{->}{n6}{r8}
4180: \psrelpoint{origin}{n0}{10}{16}
4181: \rput(\x{n0},\y{n0}){\rnode{n0}{$f_1 (\Lin{x}(2) \cupG \Lin{x}(4))$}}
4182: \end{pspicture}
4183: }}
4184: 
4185: \newcommand{\linb}{
4186: \scalebox{1}{%
4187: \psset{unit=1mm}
4188: \begin{pspicture}(-3,4)(24,20)
4189: %\psframe(-3,-5)(24,15)
4190: \psrelpoint{origin}{n0}{-20}{0}
4191: \rput(\x{n0},\y{n0}){{\scalebox{1.1}{\gRnnx}\ \ \ \ \ \ \ \ \ \ \ }}
4192: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4193: %\psrelpoint{origin}{n0}{4}{17}
4194: %\rput(\x{n0},\y{n0}){\rnode{n0}{$\Lin{x}^B(1)$}}
4195: \psrelpoint{origin}{n0}{10}{17}
4196: \rput(\x{n0},\y{n0}){\rnode{n0}{$f_1 (\Lin{x}(2)) \cupG f_1(\Lin{x}(4))$}}
4197: \end{pspicture}
4198: }}
4199: 
4200: 
4201: \psset{unit=1mm}
4202: \begin{tabular}{cc}
4203: \begin{pspicture}(0,15)(90,110)
4204: \small
4205: %\psframe(0,15)(90,100)
4206: \psrelpoint{origin}{n1}{45}{95}
4207: \rput(\x{n1},\y{n1}){\rnode{n1}{1\ \ \psframebox{$x.n=\NULL$}\white \ \ 1}}
4208: %%%%%%%%%%%%%%%%%%%%%%%
4209: \psrelpoint{n1}{n2}{-25}{-20}
4210: \rput(\x{n2},\y{n2}){\rnode{n2}{2\ \ \psframebox{$x=x.n$}\white \ \ 2}}
4211: %%%%%%%%%%%%%%%%%%%%%%%
4212: \psrelpoint{n1}{n3}{25}{-20}
4213: \rput(\x{n3},\y{n3}){\rnode{n3}{\white 4\ \ \black\psframebox{$x=x.r$}\ \ 4}}
4214: %%%%%%%%%%%%%%%%%%%%%%%
4215: \psrelpoint{n2}{n4}{0}{-20}
4216: \rput(\x{n4},\y{n4}){\rnode{n4}{3\ \ \psframebox{$x.n.n=\NULL$}\white \ \ 3}}
4217: %%%%%%%%%%%%%%%%%%%%%%%
4218: \psrelpoint{n3}{n5}{0}{-20}
4219: \rput(\x{n5},\y{n5}){\rnode{n5}{\white 5\ \ \black\psframebox{$x.n.r=\NULL$}\ \ 5}}
4220: %%%%%%%%%%%%%%%%%%%%%%%
4221: \psrelpoint{n5}{n6}{-25}{-22}
4222: \rput(\x{n6},\y{n6}){\rnode{n6}{6\ \ \psframebox{$x=x.n$}\white \ \ 6}}
4223: %%%%%%%%%%%%%%%%%%%%%%%
4224: \psrelpoint{n6}{n7}{-25}{-16}
4225: \rput(\x{n7},\y{n7}){\rnode{n7}{7\ \ \psframebox{$z=x.n$}\white \ \ 7}}
4228: %%%%%%%%%%%%%%%%%%%%%%%
4229: \psrelpoint{n6}{n8}{25}{-16}
4230: \rput(\x{n8},\y{n8}){\rnode{n8}{\white 8\ \ \black\psframebox{$z=x.r$}\ \ 8}}
4231: %%%%%%%%%%%%%%%%%%%%%%%
4232: \ncdiag[armA=12.25,armB=3.3,linearc=.25,offsetA=-2,angleA=270,angleB=90]{->}{n1}{n2}
4233: \ncdiag[armA=12.25,armB=3.3,linearc=.25,offsetA=2,angleA=270,angleB=90]{->}{n1}{n3}
4234: %\ncline{->}{n1}{n3}
4235: \ncline{->}{n2}{n4}
4236: %\ncline{->}{n4}{n6}
4237: \ncdiag[armA=14.4,armB=3.1,linearc=.25,offsetB=-2,angleA=270,angleB=90]{->}{n4}{n6}
4238: \ncdiag[armA=14.4,armB=3.1,linearc=.25,offsetB=2,angleA=270,angleB=90]{->}{n5}{n6}
4239: \ncline{->}{n3}{n5}
4240: %\ncline{->}{n6}{n7}
4241: \ncdiag[armA=9.2,armB=2.6,linearc=.25,offsetA=-2,angleA=270,angleB=90]{->}{n6}{n7}
4242: \ncdiag[armA=9.2,armB=2.6,linearc=.25,offsetA=2,angleA=270,angleB=90]{->}{n6}{n8}
4243: %%%%%%%%%%%%%%%%%%%%%%%
4244: \psrelpoint{n7}{np}{-5}{9}
4245: \rput(\x{np},\y{np}){\psframebox[framesep=0,linestyle=none,framearc=.5,fillstyle=solid,fillcolor=lightgray]{\glx}}
4246: %%%%%%%%%%%%%%%%%%%%%%%
4247: \psrelpoint{n8}{np}{5}{9}
4248: \rput(\x{np},\y{np}){\psframebox[framesep=0,linestyle=none,framearc=.5,fillstyle=solid,fillcolor=lightgray]{\grx}}
4249: %%%%%%%%%%%%%%%%%%%%%%%
4250: \psrelpoint{n6}{np}{-1}{11}
4251: \rput(\x{np},\y{np}){\psframebox[framesep=0,linestyle=none,framearc=.5,fillstyle=solid,fillcolor=lightgray]{\gnnrx}}
4252: %%%%%%%%%%%%%%%%%%%%%%%
4253: \psrelpoint{n4}{np}{12}{10}
4254: \rput(\x{np},\y{np}){\psframebox[framesep=0,linestyle=none,framearc=.5,fillstyle=solid,fillcolor=lightgray]{\gnrx}}
4255: %%%%%%%%%%%%%%%%%%%%%%%
4256: \psrelpoint{n5}{np}{-12}{10}
4257: \rput(\x{np},\y{np}){\psframebox[framesep=0,linestyle=none,framearc=.5,fillstyle=solid,fillcolor=lightgray]{\gnnx}}
4258: %%%%%%%%%%%%%%%%%%%%%%%
4259: \psrelpoint{n3}{np}{-55}{12}
4260: \rput(\x{np},\y{np}){\psframebox[framesep=0,linestyle=none,framearc=.5,fillstyle=solid,fillcolor=lightgray]{\gNnrx}}
4261: %%%%%%%%%%%%%%%%%%%%%%%
4262: \psrelpoint{n3}{np}{5}{12}
4263: \rput(\x{np},\y{np}){\psframebox[framesep=0,linestyle=none,framearc=.5,fillstyle=solid,fillcolor=lightgray]{\gRnnx}}
4264: %%%%%%%%%%%%%%%%%%%%%%%
4265: \end{pspicture}
4266: &\begin{tabular}[b]{@{}l@{}}
4267: \lout\\ \\ \hline\\
4268: \lina\\ \\ \hline\\
4269: \linb \\\\
4270: \end{tabular}
4271: \end{tabular}
4272: %%%%%%%%%%%%%%%%%%%%%%%
4273: \caption{Non-distributivity of liveness analysis.  The access path
4274: \mbox{$x\protect\myarrow r\protect\myarrow n\protect\myarrow r$} is
4275: spurious, yet it is not killed by the assignment in
4276: block 1.}
4277: \label{fig:liveness.non-distributivity}
4278: \rule{\textwidth}{.2mm}
4279: \end{figure}
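
The comparison drawn in Figure~\ref{fig:liveness.non-distributivity} can be
sketched as follows, reading the graphs off the figure.  The symbol $\rho$ and
the shorthand $\rho \in G$ (``graph $G$ represents path $\rho$'') are
assumptions introduced only for this sketch.  Merging $\Lin{x}(2)$ and
$\Lin{x}(4)$ shares the node $n_6$, so the merged graph represents the path
\mbox{$\rho = x\myarrow r\myarrow n\myarrow r$}, which neither operand
represents.  Since $\rho$ does not begin with \mbox{$x\myarrow n$}, the
assignment in block 1 leaves it intact:
\[
\rho \in f_1\bigl(\Lin{x}(2) \cupG \Lin{x}(4)\bigr),
\qquad
\rho \notin f_1\bigl(\Lin{x}(2)\bigr) \cupG f_1\bigl(\Lin{x}(4)\bigr),
\]
and hence
\mbox{$f_1(\Lin{x}(2) \cupG \Lin{x}(4)) \neq f_1(\Lin{x}(2)) \cupG f_1(\Lin{x}(4))$}.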
4280: 
4281: Availability and anticipability analyses are non-distributive because they
4282: depend on may-alias analysis, which is itself non-distributive.
4283: 
4284: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
4285: \end{document}