1: \documentclass[final]{siamltex}
2: \usepackage{times,algorithm,algorithmic,comment,psfig,latexsym}
3: 
4:  
5:  
6:  
7:  
8: \def\AND{\wedge}
9: \def\OR{\vee}
10: \def\oper{\circ}
11: \def\goesto{\rightarrow}
12: \def\implies{\Rightarrow}
13: \def\zeroone{\{0,1\}} 
14:  \def\sstar{\zeroone^{*}} 
15: \def\L{\langle}
16: \def\R{\rangle}
17: \def\HYP{\hbox{-}}
18: \def\IFF{\leftrightarrow}
19: \def\Ldef{\buildrel \rm def \over \leftrightarrow}
20: \def\Edef{\buildrel \rm def \over =}
21: \def\almostall{\hbox{\rlap{$_{\thinspace\forall}$}{$^{^\infty}$}}}
22: \def\infoften{\hbox{\rlap{$_{\thinspace\exists}$}{$^{^\infty}$}}}
23: \def\N{{\bf N}}
24: \def\cminus{\dot{-}} 
25: \def\plusminus{\pm} 
26: \def\PR{{\rm Pr}}
27: \def\HSAT{{\rm HORN}\hbox{-}{\rm SAT}}
28: \def\PUR{{\rm PUR}}
29: \def\beginproof{\noindent{\bf Proof.}\quad}
30: \def\endproof{}
31: 
32:  
33: \makeatletter
34: \newcommand{\singlespacing}{\let\CS=
35:         \@currsize\renewcommand{\baselinestretch}{1}\tiny\CS}
36: \newcommand{\singlespacingplus}{\let\CS=
37:         \@currsize\renewcommand{\baselinestretch}{1.15}\tiny\CS}
38: \newcommand{\doublespacing}{\let\CS=
39:         \@currsize\renewcommand{\baselinestretch}{1.75}\tiny\CS}
40: \newcommand{\draftspacing}{\let\CS=
41:         \@currsize\renewcommand{\baselinestretch}{2.0}\tiny\CS}
42: \newcommand{\normalspacing}{\singlespacing}
43: \makeatother
44:  
45:  
46:  
47: %%%%%%%%%%desc
48: \def\desclabel#1{\bf #1\hfil}
49: \def\desc{\list{}{%
50: \labelwidth=\leftmargin
51: \advance \labelwidth by -\labelsep
52: \let \makelabel=\desclabel}}
53: \let\enddesc=\endlist
54: 
55: 
56: 
57: %\newtheorem{lemma}{Lemma}[section]
58: %\newtheorem{theorem}[lemma]{Theorem}
59: \newtheorem{corrolary}{Corollary}
60: %\newtheorem{proposition}[lemma]{Proposition}
61: %\newtheorem{fact}[lemma]{Fact}
62: \newtheorem{example}{Example}
63: \newtheorem{observation}{Observation}
64: \newtheorem{claim}{Claim}
65: %\newtheorem{definition}{Definition}
66: %\newtheorem{obs}[lemma]{Observation}
67: 
68: \def\qed{\hfill$\Box$\newline\vspace{5mm}}
69: \newenvironment{PROOF}{\noindent{\bf Proof:}}{{\qed}}
70: 
71: 
72:  
73: \author{Gabriel Istrate\thanks{
74:         Center for Nonlinear Science and CIC-3 Division,      
75:         Los Alamos National Laboratory,
76:         Los Alamos, NM 87545, 
77:         gistrate@cnls.lanl.gov}} 
78: 
79: \title{Dimension-dependent behavior in the satisfiability of 
80: random $k$-Horn formulae}
81: \date{}
82: \pagestyle{empty}
83:  
84:  
85:  
86: \sloppy
87:  
88: \begin{document}
89: 
90: \bibliographystyle{plain}
91: 
92:  
93: \maketitle
\begin{abstract} We determine the asymptotic satisfiability
95: probability of a random at-most-$k$-Horn formula, via a probabilistic 
96: analysis of a simple version, called \PUR, of positive unit resolution.
97: We show that for $k=k(n)\goesto \infty$ the problem
98: can be ``reduced'' to the case $k(n)=n$, that was solved in
99: \cite{istrate:cs.DS/9912001}. On 
100: the other hand, in the case $k=$ constant the behavior of \PUR\ is
101: modeled by a simple queuing chain, leading to a closed-form
102: solution when $k=2$. Our analysis predicts an ``easy-hard-easy''
103: pattern in this latter case. 
104: Under a rescaled parameter, the graphs of satisfaction probability
105: corresponding to finite values of $k$ converge to the one for the
106: uniform case, a ``dimension-dependent behavior'' similar to the one found
107: experimentally in \cite{kirkpatrick:selman:scaling} for $k$-SAT. 
The phenomenon is qualitatively explained by a threshold property for 
the number of 
iterations that \PUR\ makes on random {\em satisfiable} Horn
formulas. Also, for $k=2$ \PUR\ has a peak in its average complexity at
112: the critical point.    
113: \end{abstract}
114: 
115: 
116: \begin{keywords}
117: random Horn satisfiability, critical behavior, probabilistic analysis.
118: \end{keywords}
119:  
120: \begin{AMS}
121: 68Q25,82B27
122: \end{AMS}
123:  
124: \pagestyle{myheadings}
125: \thispagestyle{plain}
126: \markboth{G. ISTRATE}{DIMENSION DEPENDENT BEHAVIOR OF RANDOM HORN 
127: SATISFIABILITY}
128:          
129: 
130: \section{Introduction}
131: 
132: Finding the ground state (state of minimum energy) of a physical
133: system and computing an optimal solution to a combinatorial 
134: optimization
135: problem 
136: are intuitively two very similar tasks. This simple observation, that
137: motivated the development of  {\em simulated annealing} 
138: \cite{simmulated:annealing}, a simple general-purpose heuristic for combinatorial
139: optimization, lies behind the
140: recent birth of a new field at the crossroads of Statistical
141: Mechanics, Theoretical Computer Science and Artificial Intelligence,
142: that studies {\em phase transitions in combinatorial problems} (see
143: \cite{hayes:cant:get:sat} for a readable introduction). The transfer of 
144: principles and
145: methods from Physics (mainly from Spin Glass Theory 
146: \cite{virasoro-parisi-mezard}) to
147: Computer Science has already been quite successful, and is responsible
148: for a couple of interesting results, such as a better understanding of
149: the factors that account for computational intractability
150: \cite{2+p:rsa, 2+p:nature},
151: strikingly accurate predictions of the average running time of various
152: algorithms \cite{scaling:search:cost:2,scaling:search:cost}, or of 
153: expected values of optimal solutions
154: \cite{mezard:parisi:matching}. 
155: 
156: The need for a rigorous validation of these insights is quite
157: obvious. The theory of spin glasses is a relatively young field, which
still presents many heuristic, unsolved or plainly controversial aspects 
159: (for example see
160: \cite{non:mean:field:1,non:mean:field:2,non:mean:field:3} for a debate
161: on the validity and scope of the so-called Parisi solution of the
162: Sherrington--Kirkpatrick model). Moreover, while physical intuition can
163: guide the development of the theory for ``physical'' models, by corroborating (or 
164: falsifying) 
165: some of its predictions (e.g. see \cite{virasoro-parisi-mezard},
166: for a discussion of the demise, on physical grounds,
167: of the first formulation of the so-called {\em replica method}), such
intuition is not available when applying ideas of this type to 
combinatorial problems. Given that rigorous results are hard to come
by in the case of spin glasses proper, it is not surprising that, while there has
recently been some progress (see e.g. 
\cite{talagrand:verres}), an analysis of most interesting 
combinatorial problems is still out of reach. 
174: 
175: An approach that was popular in Statistical Mechanics was to gather
176: intuition through the systematic study of {\em exactly solved models}
\cite{baxter:rigorous}. These are ``toy'' versions of the original models that
are simple to deal with, but retain many of the properties of the
originals. We advocate such an approach for problems in 
180: Computer Science as well, and the purpose of this paper is to present
181: a (hopefully
182: nontrivial) ``exactly solvable satisfiability model'' that displays a 
183: {\em dimension-dependent behavior} fairly similar to the one observed 
184: previously in various
185: contexts such as percolation \cite{hara:slade:critical}, self-avoiding
186: walks, and recently for $k$-satisfiability by Kirkpatrick and Selman 
187: \cite{kirkpatrick:selman:scaling}. The problem
188: we investigate is {\em random Horn satisfiability}, and the 
189: ``dimensionality'' of a formula is taken to be the 
{\em maximum length} of its clauses.\footnote{For technical 
convenience, throughout 
the paper {\em random $k$-Horn satisfiability} is understood as 
{\em random {\bf at-most-$k$}-Horn satisfiability}.} 
194: 
195: \section{Overview}
196: There are actually two different notions of phase transition
in a combinatorial problem. The first of them, called 
{\em order-disorder phase transition}, applies to optimization
problems and directly parallels the approach from Statistical 
Mechanics.
Potential solutions for an instance of an optimization problem $P$ are viewed as ``states'' of
202: a system. One defines an abstract {\em Hamiltonian (energy) function},
203: that measures the ``quality'' of a given solution, and applies methods
204: from the theory of spin glasses \cite{virasoro-parisi-mezard} to make 
205: predictions on the typical
206: structure of optimal solutions. In this setting a
207: phase transition is defined as non-analytical behavior of a certain
208: ``order parameter'' called free energy,
and a discontinuity in this parameter, manifested by the sudden
210: emergence of a {\em backbone} of constrained ``degrees of freedom''
211: \cite{2+p:rsa} is responsible for the exponential slow-down of many 
212: natural algorithms.
213:  
214: The second definition is combinatorial and pertains to decision
215: problems. It relies on the concept of {\em threshold property} from
216: random graph theory, more precisely a restricted version of this
217: notion, called {\em sharp threshold}.
218: A satisfiability threshold always exists for monotone problems 
219: \cite{bollob-thomasson}, but may or may
220: not be sharp (we speak of a {\em coarse threshold} in the latter
221: case). 
222: 
223: The layout of the paper is as follows: in section~\ref{section:1} we 
224: review 
225: the results of Kirkpatrick and Selman, in particular discussing the 
226: concept of {\em critical behavior}, as well as some objectionable aspects 
227: of their results. 
228:   We then define the type of dimension 
229: dependent behavior we are interested in, argue that it captures to a 
230: large 
231: extent the results presented in \cite{kirkpatrick:selman:scaling}, and 
232: contrast it with  
233: critical behavior.  
234: Our results are presented and discussed in section~\ref{section:3}, 
235: while in
236: section~\ref{section:4} we further discuss their significance. 
237: 
238: 
Finally, for $k=2$, the case in which the satisfaction probability has a 
singularity, we are able to rigorously display another phenomenon that 
241: is 
242: believed to be characteristic of phase transitions: in many cases the 
243: ``hardest on the average'' instances appear at the transition point 
244: (even if we only 
245: consider satisfiable instances \cite{achlioptas-sat-instances,mammen-hogg}); this feature is 
246: quite robust with respect to the choice of the particular algorithm 
247: \cite{cheeseman-kanefsky-taylor}. 
248: We are able to prove that for a {\em particular problem}, random 
249: at-most-2-Horn satisfiability,  the average 
250: running time of a {\em particular algorithm}, when restricted to 
251: satisfiable 
252: instances (the ones that are statistically significant on both sides of
253: the critical point) is finite outside the critical point, and it 
254: diverges as 
255: we approach this point, thus providing some evidence for the 
256: experimental 
257: wisdom. 
258: 
259: 
260: \section{Phase transitions and critical behavior}\label{section:1}
261: 
262: We first discuss, briefly and limited to our interests, threshold
263: phenomena. Perhaps the best way to introduce them is through a concrete
264: example. To do this, we will use one ``canonical'' NP-complete
265: problem, {\em $k$-CNF satisfiability}. 
266: 
267: To generate random formulas we use a 
268: model with one parameter, {\em the constraint
269: density $c$}, defined as the ratio between
270: the number of clauses $m$ and the number of variables $n$ of the
271: formula. A random formula is obtained by choosing $m$ random clauses. 
272: If we plot the probability that such a random formula is satisfiable 
273: against the constraint density $c$, we notice the existence of a
274: critical value $c_{k}$ such that the satisfaction probability drops
275: (as $n\goesto \infty$) from one to zero at $c_{k}$. Such a ``sudden
276: change'' is an illustration of the mathematical concept of {\em sharp
277:   threshold}, qualitatively illustrated in Figure~\ref{sharp:thr}. The
278: existence of a critical value $c_{k}$ has not been rigorously
279: established (except for $c_{2}=1$), even though Friedgut
280: \cite{friedgut:k:sat} has shown
281: that the transition is ``sharp'' for every $k$.  
\begin{figure}
\centerline{
\psfig{figure=fig3.ps,width=3.5in}}
\caption{Qualitative picture of a (rescaled) sharp threshold}
\label{sharp:thr}
\end{figure}    
288: 
289: Of special interest will also be the width of the so-called {\em scaling window (a.k.a. critical region)}. To define it consider, for $0 <\delta < 1$, 
290: $\alpha_-(n,\delta)$, the supremum over
291: $\alpha$ such that for $m=\alpha n$, the
292: probability of a random formula being
293: satisfiable is at least $1-\delta$.
294: Similarly, let
295: $\alpha_+(n,\delta)$ be the infimum over
296: $\alpha$ such that for $m=\alpha n$, the
297: probability of a random formula being
298: satisfiable is at most $\delta$.
299: Then, for $\alpha$  within the {\em $\delta$-scaling window}
300: \begin{equation}
301: W(n,\delta) = (\alpha_-(n,\delta),
302: \alpha_+(n,\delta)),
303: \end{equation}
304: the probability that
305: a random formula is satisfiable is
306: between $\delta$ and $1-\delta$.
307: 
308: We will be interested in the width of the window
309: $W(n,\delta)$ as a function of
$n$. It is generally believed that $|W(n,\delta)|=\Theta(n^{-1/\nu})$
311: for some $\nu=\nu_{k}\geq 1$ independent of $\delta$, even though the existence of $\nu_{k}$ has
312: only been established for $k=2$ \cite{scaling:window:2sat}.  
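
To see why a width of this form is natural, suppose, purely as an
illustrative assumption and not as an established fact, that near the
threshold the satisfaction probability has the finite-size scaling form
\[
\Pr[\Phi \in SAT]\approx F\left(n^{1/\nu}(\alpha-\alpha_{c})\right),
\]
where $\alpha_{c}$ is a hypothetical critical density and $F$ is a
continuous, strictly decreasing function with limits $1$ at $-\infty$ and
$0$ at $+\infty$ (both $\alpha_{c}$ and $F$ are notation introduced only for
this sketch). Solving for the two endpoints of the window gives
\[
\alpha_{-}(n,\delta)\approx \alpha_{c}+n^{-1/\nu}F^{-1}(1-\delta),
\qquad
\alpha_{+}(n,\delta)\approx \alpha_{c}+n^{-1/\nu}F^{-1}(\delta),
\]
so that, for $\delta<1/2$,
\[
|W(n,\delta)|\approx n^{-1/\nu}\left[F^{-1}(\delta)-F^{-1}(1-\delta)\right].
\]
Under this assumption $\delta$ affects only the constant factor and not the
exponent $1/\nu$.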
313: 
314: \subsection{Order/disorder phase transitions}
315: 
316: Statistical mechanics deals with the description of systems having a 
317: large 
318: number of degrees of freedom. One of its fundamental predictions 
319: concerns the 
fact that at thermal equilibrium each state $\sigma$ of the system occurs with 
probability 
proportional to $\exp(-\beta H(\sigma))$, where $\beta$ is an {\em 
inverse 
temperature}, and $H$ is a {\em Hamiltonian function}, describing the 
325: energy of 
326: the particular state $\sigma$. The resulting distribution is called 
327: {\em 
328: the Gibbs distribution $G_{\beta}$} given by 
\[
\Pr[\sigma]=\frac{\exp(-\beta\cdot H(\Phi;\sigma))}{Z[\Phi]},
\]
where
\[
Z[\Phi]= \sum_{\sigma\in \{0,1\}^{n}}\exp(-\beta\cdot H(\Phi;\sigma))
\]
336: is the so-called {\em partition function}.
337:  
338: Changes in the order properties of the system, 
339: which characterize order-disorder phase transitions, manifest 
340: themselves as  
341: non-analytical behavior of thermal averages (i.e. averages over the 
342: Gibbs distribution) of a certain {\em order parameter}.
We want to emphasize that the physicists' use of the term order
parameter is quite different from its use in combinatorics. 
345: An order parameter is a quantity that is zero on one side of the 
346: phase transition and becomes non-zero on the other side (for instance 
347: the satisfaction probability could be an order parameter).   
348:  
349: One of the simplest illustrations of these 
350: concepts is the {\em two-dimensional Ising model} (see 
351: \cite{baxter:rigorous} for a 
352: thorough treatment).  In this model we
353: have a number of {\em spins}, that are small magnets located on the
354: vertices of the two-dimensional lattice, and pointing either                   
355: up or down. The spins interact with their neighbors and with an {\em 
356: external
357: magnetic field $h\in {\bf R}$}, which will tend to align the spins in one of the
358: two directions. The energy of a state $\sigma$ is 
359: \[
360: H(\sigma)=- \sum_{i\sim j}\sigma_{i}\cdot \sigma_{j} + h\cdot 
361: \left(\sum_{i}\sigma_{i}\right).
362: \]
363: 
364: 
365: 
366: The order parameter is called {\em free energy}, is a function of
367: temperature, and is formally defined as 
368: 
369: \[
370: f = -\frac{1}{\beta n} \ln Z[\Phi].
371: \]
372: 
373: It measures the fraction of spins that are ``frozen'' 
374: when the field is turned off. 
375: 
376: We now briefly describe the essence of the phase transition: 
377: above a certain temperature $T_{c}$, {\em the Curie-Weiss point}, when 
378: the magnetic field is turned to zero
379: the proportion of spins that point in each direction is about
380: $\frac{1}{2}$ (the so-called {\em disordered phase}). But for
381: temperatures below $T_{c}$ when we turn the field to zero some
382: orientation still dominates (the  {\em ordered phase}), and the proportion of
spins pointing up (down) changes discontinuously as $h$ passes through zero.
384: 
385:  The connection with combinatorial optimization follows from the
386: observation that when $\beta \goesto \infty$ (that is the temperature
387: approaches 0 K), the Gibbs distribution $G_{\beta}$ converges to a
388: uniform distribution $G$ on the set of states of minimal energy
389: (ground states). Thus, based on  this analogy, one can hope that 
390: ideas from Statistical Mechanics are able to provide insight into the 
391: structure of optimal solutions to an instance of a problem in 
Combinatorial Optimization. Rather than providing a complete discussion (which 
would require rigorously defining the notion of optimization problem) we will discuss 
this in the
context of MAX 3-SAT, the optimization version of satisfiability. For 
now 
it suffices to mention the three main ingredients of an optimization 
problem: 
its {\em instances}, {\em solutions} to instances of a problem, and a 
{\em 
cost function}, that measures the quality of a solution for a certain 
instance. 
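
The zero-temperature limit invoked above can be checked directly. As a
sketch, write $H_{\min}=\min_{\sigma}H(\Phi;\sigma)$ and let $N_{0}$ be the
number of ground states (both symbols are introduced here only for
illustration). Multiplying the numerator and the denominator of the Gibbs
weight by $\exp(\beta H_{\min})$ gives
\[
\Pr[\sigma]=\frac{\exp(-\beta (H(\Phi;\sigma)-H_{\min}))}
{N_{0}+\sum_{\tau:\, H(\Phi;\tau)>H_{\min}}\exp(-\beta (H(\Phi;\tau)-H_{\min}))}.
\]
As $\beta\goesto \infty$ every exponential with a positive energy gap
vanishes, so $\Pr[\sigma]\goesto 1/N_{0}$ if $\sigma$ is a ground state and
$\Pr[\sigma]\goesto 0$ otherwise.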
404: 
405: \begin{example}(MAX 3-SAT)
406:  
407:  
408: {\bf Input:} A propositional formula $\Phi$ in conjunctive normal form, 
409: such that every
410: clause has length exactly 3.
411:  
412: {\bf Solution:} A truth assignment $\sigma$ for the propositional 
413: variables in $\Phi$
414: that maximizes the number of satisfied clauses.
415:  
416: {\bf Cost function:} The cost $C(\Phi,\sigma)$ of a truth assignment 
417: $\sigma$ for an instance 
418: $\Phi$ of MAX 3-SAT is the number of clauses of $\Phi$ that are 
419: violated by 
420: $\sigma$. 
421: \end{example}
422:  
423: Let $Q$ be an optimization problem and let  $\Phi$ be an instance of 
424: $Q$ ``on
425: $n$ variables'' (i.e., all solutions have length $n$). We view the
426: set of all assignments on $\{0,1\}^{n}$ as ``states of a system.'' To 
427: each such
428: state $\sigma$ we associate the Hamiltonian (energy function)
429: \[
H(\Phi;\sigma)=\mbox{ the cost of solution }\sigma\mbox{ for instance }\Phi\mbox{ of }Q.
431: \]                                                                              
432: \begin{example}
433: Let $\Phi$ be a 3-CNF formula, and let $\sigma$ be an assignment. 
434: According 
435: to the previous definition $H(\Phi;\sigma)=C(\Phi;\sigma)$. $H$ can be 
436: formally expressed \cite{monasson:zecchina} as
437:  
438: \[
439: H(\Phi;\sigma)=\sum_{l=1}^{m}\delta\left[\sum_{i=1}^{n}C_{l,i}\cdot 
440: (-1)^{\sigma_{i}};-3\right],
441: \]
442: \end{example}
443: where $\delta[i;j]= 1_{\{i=j\}}$ is the Kronecker symbol and $C_{l,i}$ 
444: is 1 if the $l$th
445: clause contains the literal $x_{i}$, $-1$ if it contains 
446: $\overline{x_{i}}$ and
447: zero otherwise.
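
As a small worked illustration (the formula below is a hypothetical
instance chosen here, not one taken from the references), let $n=3$ and
\[
\Phi=(x_{1}\OR x_{2}\OR x_{3})\AND
(\overline{x_{1}}\OR \overline{x_{2}}\OR \overline{x_{3}}).
\]
The first clause is violated only by the all-false assignment and the second
only by the all-true assignment, so $H(\Phi;\sigma)=1$ for these two
assignments and $H(\Phi;\sigma)=0$ for the remaining six. Consequently
\[
Z[\Phi]=6+2e^{-\beta},\qquad
f=-\frac{1}{3\beta}\ln\left(6+2e^{-\beta}\right),
\]
and as $\beta\goesto \infty$ the Gibbs distribution concentrates uniformly on
the six satisfying assignments.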
448: 
449: 
450: For the case of problems of interest to Computer Science the instance           
451: $\Phi$ is not fixed, but rather is a sample from a certain
452: distribution. This is very similar to the context of {\em spin-glass 
453: theory}, 
454: a subfield of Statistical Mechanics. 
455: The extra ingredient of this theory is that the coupling coefficients 
456: are no 
457: longer considered fixed, but are rather independent samples from a 
458: certain 
distribution. In the language of the theory of spin glasses, $\Phi$ is 
called a {\em quenched quantity}.
461:  
462: As in the case of the Ising model, the order parameter 
is the {\em ground state free energy}, more precisely its expected value 
464: \[
465: \overline{f}=-\frac{1}{\beta n}\overline{\ln(Z)},
466: \]
467: where $\overline{(\ldots)}$ stands for the average over the random
468: distribution of $\Phi$.                                                         
469: \begin{definition}
470: A {\em physical (order/disorder)
471: phase transition} in a combinatorial optimization problem
472: is a point where $\overline{f}$ is not analytical.
473: \end{definition}
474:  
475:  
476: Free energy has  an especially crisp intuitive
477: interpretation in the case of the problem MAX 3-SAT 
478: \cite{monasson:zecchina}:
479:  
480: \begin{example}\label{3sat:expl}
481: Let $\Phi_{n}$ be an instance of MAX 3-SAT, let $A$ be the set of 
482: optimal
483: assignments to $\Phi_{n}$, endowed with the uniform measure $\mu_{n}$.
Statistical Mechanics predicts that, as $n\goesto \infty$, $\mu_{n}$ is
``close'' to a product measure $\mu_{1,n}\times \cdots \times
\mu_{n,n}$ on $\{0,1\}^{n}$. The {\em free energy per site} $f$ is the fraction of
variables $x_{i}$ that are (asymptotically) {\em fully constrained} 
(that is,
$\mu_{i,n}$ converges in distribution to a measure having all its
weight on one of the two points 0, 1).
491: \end{example}                                           
492:  
493: 
494: \section{Critical behavior and the mean-field 
495: approximation}\label{section:2}
496: 
497: An important feature that order/disorder
phase transitions share with the combinatorial notion of {\em threshold 
499: properties} (that are usually the type of phase transition of interest 
500: in combinatorics) is that the various quantities of interest,
501: such as the satisfaction probability, the ground state energy, and the
502: location of the phase transition are hard to compute. No
503: general-purpose  methods exist, and in some cases even obtaining good
504: non-rigorous estimates is a challenging open problem. 
505: 
506: A technique  that often provides realistic approximate values for
507: these quantities
came to be known as the {\em mean-field (annealed) approximation}. In a nutshell, 
suppose that we are trying to compute the
average (over a certain discrete probability space) of a certain
expression $f\circ (g_{1}, \ldots, g_{n})$. Then the mean-field 
approximation amounts to taking 

\[
E[f(g_{1}(x),\ldots, g_{n}(x))]\sim f(E[g_{1}(x)],\ldots, E[g_{n}(x)]). 
516: \]
517: 
This technical definition of the mean-field approximation does not by itself
convey the useful intuition behind it: suppose we want to solve a  
combinatorial problem whose objective function depends on 
simultaneously satisfying several ``constraints'' whose effects are 
usually not independent. The mean-field approximation ignores
 the dependencies between the various constraints, and treats them
as independent.  
525: \vspace{5mm}
526: \begin{example}
527: Let us return to the case of spin glasses. Each configuration of spins 
528: $\sigma$
529: has an energy specified by a {\em Hamiltonian} $H(\sigma)$. A typical
530: expression for $H(\sigma)$ is 
531: \[
532: H(\sigma)=\sum_{i\sim j}a_{i,j}\sigma_{i}\sigma_{j},
533: \]
534: 
535: where the $a_{i,j}$'s are interaction coefficients between adjacent 
536: spins
537: (according to some adjacency graph specific to the considered model).
538: The quantity of interest, {\em average free energy} $\overline{f}$
539: is hard to compute directly because of the logarithmic function present 
540: in 
541: the definition of the free energy. In this context the mean-field
542: approximation amounts to 
543: 
544: \[
545: \overline{f}\sim -\frac{1}{\beta n}\ln[ \overline{Z[\Phi]}]. 
546: \]
547: \end{example}
548: \vspace{5mm}
549: 
550: The advantage of this heuristic is that the average on the right-hand 
551: side is one that is usually much easier
552: to compute. 
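
One general fact about this approximation, recalled here as a side remark
rather than as part of the argument, follows from Jensen's inequality. Since
$\ln$ is concave,
\[
\overline{\ln Z[\Phi]}\leq \ln \overline{Z[\Phi]},
\qquad\mbox{hence}\qquad
-\frac{1}{\beta n}\ln \overline{Z[\Phi]}\leq \overline{f}
\quad\mbox{for }\beta>0,
\]
so the annealed expression is always a lower bound on the average free
energy, with equality precisely when $Z[\Phi]$ does not fluctuate.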
553: 
554: 
555: For combinatorial phase transitions, the mean-field approach usually
amounts to an approximation using the so-called {\em first-moment
method}.
558: 
559: \vspace{5mm}
560: 
561: \begin{example} {\bf ($k$-Satisfiability)}
562: 
563: The reason that the satisfiability probability of a random formula is
564: hard to compute is that, for two assignments $A,B$ the events $A\models
565: \Phi$ and $B\models\Phi$ are not independent. 
566: One way to construct a mean-field theory for $k$-SAT is to ignore the
567: dependencies between these events.  More precisely, we have
568: 
569: \[ 1_{SAT}[\Phi] = f(g_{A_{1}}[\Phi], \ldots, g_{A_{2^{n}}}[\Phi]),
570: \]
571: where 
572: \[ f(x_{1}, x_{2}, \ldots, x_{2^{n}})= 1 - \prod _{i=1}^{2^{n}}x_{i},
573: \]
574: and 
575: 
576: \[ g_{A}[\Phi]= \left \{\begin{array}{ll}
577:                  1, & \mbox{ if }A\not \models \Phi,
578:                  \\
579:  
580:                  0,  & \mbox{ otherwise.}\\ 
581:         \end{array}
582: \right.
583: \]
584: Define $\gamma_{k}=1-2^{-k}$. The mean-field approximation amounts to 
\[ 
\Pr[\Phi \in SAT] = E[1_{SAT}[\Phi]]\sim f(E[g_{A_{1}}[\Phi]], \ldots,
E[g_{A_{2^{n}}}[\Phi]]).
\]
Since each of the $cn$ independently chosen clauses is satisfied by a fixed
assignment with probability $\gamma_{k}$, we have
\[
E[g_{A_{1}}[\Phi]]= \ldots = E[g_{A_{2^{n}}}[\Phi]]= 1-\gamma_{k}^{cn},
\]
so the approximation reads
\[ 
\Pr[\Phi \in SAT]\sim 1- \left[1-\gamma_{k}^{cn}\right]^{2^{n}}\sim 
1-e^{-2^{n}\cdot \gamma_{k}^{cn}}= 1- e^{-E[\#_{SAT}[\Phi]]},
\]
600: where $\#_{SAT}[\Phi]$ is the number of satisfying assignments for
601: $\Phi$. Thus (neglecting the case $E[\#_{SAT}[\Phi]]=1$) 
602: 
603: \[ 
604: \Pr[\Phi \in SAT]= \left \{\begin{array}{ll}
605:                  1, & \mbox{ if }E[\#_{SAT}[\Phi]]\goesto \infty,
606:                  \\
607:  
608:                  0,  & \mbox{ if }E[\#_{SAT}[\Phi]]\goesto 0.\\ 
609: \end{array}
610: \right. 
611: \]
612: 
613: 
614: \end{example}
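
The location of the annealed transition can be read off from the computation
in the example above. As a worked calculation (the symbol
$c_{k}^{\rm ann}$ is notation introduced here for illustration), write
\[
E[\#_{SAT}[\Phi]]=2^{n}\gamma_{k}^{cn}=\left(2\gamma_{k}^{c}\right)^{n},
\]
which tends to infinity when $2\gamma_{k}^{c}>1$ and to zero when
$2\gamma_{k}^{c}<1$. The two regimes are separated by
\[
c_{k}^{\rm ann}=\frac{\ln 2}{-\ln(1-2^{-k})}.
\]
For $k=3$ this gives $c_{3}^{\rm ann}=\ln 2/\ln(8/7)\approx 5.19$, and since
$-\ln(1-2^{-k})\approx 2^{-k}$ for large $k$, $c_{k}^{\rm ann}\approx
2^{k}\ln 2$. Because $\Pr[\Phi\in SAT]\leq E[\#_{SAT}[\Phi]]$ by Markov's
inequality, random formulas with $c>c_{k}^{\rm ann}$ are unsatisfiable with
probability tending to one; the annealed value is thus an upper bound on the
location of any true threshold.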
615: \vspace{5mm}
616: \subsection{Critical exponents and behavior}
617: 
618: 
619: A phenomenon that has been observed in various contexts is 
620: {\em critical behavior}. In these cases the class of problems under
621: study has an intrinsic notion of 
622: dimensionality $d$, and in the limit $d\goesto \infty$ (or sometimes 
623: even when $d$ is greater than a so-called {\em critical dimension}) 
624: ``the annealed approximation becomes exact''.
625: 
626: A way to give precise meaning to the above quote comes from the
concept of {\em universality}. In Statistical Mechanics one defines
628: certain {\em critical exponents}, that describe the behavior of the
629: system near the critical points; universality predicts that phase 
630: transitions
631: with the same critical exponents are ``structurally similar''. 
632: 
633: Since critical exponents can be defined for the mean-field versions of
634: the physical models too, critical behavior means that as $d\goesto 
635: \infty$
636: (or, sometimes, for $d$ larger than a value called {\em the upper 
637: critical dimension}) the critical
638: exponents of the $d$-dimensional system coincide with the critical
639: exponents of the $d$-dimensional mean-field model.   
640: 
641: \vspace{5mm}
642: \begin{example}
643: {\bf (Bond) percolation on the lattice ${\bf Z}^{d}$.}
644: Percolation \cite{grimmett:percolation} is a mathematical theory that 
645: models the flow of  liquids in random porous media. In our case 
646: the flow is on the
647: lattice ${\bf Z}^{d}$ of dimension $d$, and the model has one
648: parameter, the edge probability $p\in [0,1]$. Each bond (grid
649: edge of the lattice ${\bf Z}^{d}$) is considered open with
650: probability $p$ (independently of the other bonds) and the order
651: parameter is the probability $P_{d}(p)$ that the origin lies in an
652: infinite cluster. $P_{d}$ is a monotonically increasing function of
653: $p$. It is  believed that $P_{d}(p)$ is  zero up to a {\em critical 
654: value $p_{c}(d)$} (known
655: rigorously only for $d=2$), greater than zero beyond that point, and
656: non-analytical but continuous (at least for $d=2$) at $p_{c}(d)$.
657: It is also believed that above 
658: (and around the critical value) $P_{d}(p)\sim (p-p_{c}(d))^{\beta}$ where $\beta$ is a 
659: {\em critical exponent} that depends on $d$ but {\em not} on
660: the explicit lattice considered (i.e. it would be the same if we choose
661: another $d$-dimensional lattice instead of ${\bf Z}^{d}$). This is 
662: only one of the several critical exponents that are believed to
663: structurally characterize percolation on $d$-dimensional lattices (see
664: \cite{grimmett:percolation}).  
665: 
666: Without going into further details, we note that 
667: the ``mean-field approximation''
668: corresponds to considering percolation on the {\em $d$-dimensional
  Bethe lattice}, and the critical behavior 
671: amounts to the observation that for $d$ greater than a {\em critical
672: dimension} (known to be at most 16 \cite{hara:slade:critical}, and is 
673: believed to be 6) the 
674: critical exponents of percolation on ${\bf Z}^{d}$ are those of
675: percolation on the Bethe lattice.   
676: 
677: \end{example}
678: 
679: 
680: \subsection{Rescaling and critical behavior}
681: \label{discuss}
An example of critical behavior has recently been observed
683: experimentally by Kirkpatrick and Selman
684: \cite{kirkpatrick:selman:scaling} for satisfiability problems. 
685: 
Their results do not mention 
critical exponents (although they are closely related to them).  To explain 
these results, 
we need to 
introduce first another concept from Statistical Mechanics: {\em 
691: finite-size
692: scaling}. The intuition behind it is that
693: \cite{kirkpatrick:selman:scaling} ``sufficiently close to a threshold
694: or critical point, systems of all sizes are indistinguishable except
695: for an overall change of scale.'' In
696: mathematical terms this amounts to defining a new order parameter
that ``opens up'' the {\em scaling window}, 
the region where the probability decreases from 1 to 0.  
699: \vspace{5mm}
700: \begin{example} {\bf Hamiltonian Cycle.}
701: 
702: The random model has one parameter $m$, the number of edges. A random
703: sample is obtained by choosing uniformly at random a set of $m$
704: distinct edges of a complete graph with $n$ vertices. The following 
705: result (obtained by Koml\'{o}s and Szemer\'{e}di \cite{hamcyclerand}) 
706: describes the phase
707: transition in this problem: 
708: 
709: Let $m=m(n)= \frac{1}{2}n\cdot \log(n)+\frac{1}{2}n\cdot \log 
710: \log(n)+c_{n}\cdot n$. Then
711: 
712: \[
\lim_{n\goesto \infty}\Pr[G\mbox{ has a Hamiltonian cycle}]=\left 
 \{\begin{array}{ll}
715:                   0, & \mbox{ if $c_{n}\goesto -\infty$,}
716:                  \\
717:                  e^{-e^{-2c}},  & \mbox{if $c_{n}\goesto c$,}\\
718:                  1, & \mbox{ if $c_{n}\goesto \infty$.}
719:                  \\  
720:         \end{array}
721: \right.
722: \] 
723: 
724: A rescaled parameter for the Hamiltonian cycle problem can be defined
725: by $c_{n}=\frac{1}{n}\cdot [m-\frac{1}{2}n\cdot
726: \log(n)-\frac{1}{2}n\cdot \log \log(n)]$. This parameter yields a
727: rescaled limit probability function $f(c)=e^{-e^{-2c}}$. 
728: \end{example}
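
In terms of this rescaled parameter the scaling window of the Hamiltonian
cycle problem can be computed explicitly. As a worked illustration (the
symbol $c_{\delta}$ is introduced here only for this computation), solving
$e^{-e^{-2c}}=\delta$ gives
\[
c_{\delta}=-\frac{1}{2}\ln\ln\frac{1}{\delta},
\]
so the limit probability lies between $\delta$ and $1-\delta$ exactly when
$c_{\delta}<c<c_{1-\delta}$. For $\delta=0.1$ one finds
$c_{0.1}\approx -0.42$ and $c_{0.9}\approx 1.13$, a window of width about
$1.54$ in $c$. Since $c$ multiplies $n$ in the expression for $m$, the window
has width $\Theta(n)$ when measured in edges, which is of lower order than the
threshold value $\frac{1}{2}n\cdot \log(n)$.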
729: \vspace{5mm}
730: 
731: It is important to note that, since an annealed approximation yields 
732: an expression for the order parameter (in our case satisfaction
733: probability) that will usually display a phase transition as well, 
734: a rescaled parameter can be defined for the mean-field version of the 
735: problem as well. 
736: 
737: The definition of the rescaled parameter allows a precise formulation
738: of the intuition that an annealed approximation becomes exact in the
739: limit $d\goesto \infty$. Let $P_{d}$ be a class of satisfiability
740: problems indexed by a dimensionality parameter $d$, let $F_{d}$
741: be the rescaled satisfaction probability graph of $P_{d}$, and let 
742: $F_{ann,d}$ be 
743: the rescaled graph corresponding to the annealed approximation. 
Kirkpatrick and Selman observe experimentally that {\em as $d\goesto 
\infty$, 
the function sequences $F_{d}$, $F_{ann,d}$ converge pointwise to a 
common limit $F_{\infty}$}. 
748: \vspace{5mm}
749: \begin{example}
750: We present in detail the experimental results of Kirkpatrick 
751: and Selman. They define an (approximate) rescaled parameter for $k$-SAT
752: \[
753: y_{k} = n^{1/\nu_{k}}\frac{(c-c_{k})}{c_{k}},
754: \]
755: where $c=m/n$, $c_{k}$ is the critical threshold for $k$-SAT, and
756: $\nu_{k}$ is the scaling width coefficient.  
Also, define the ``annealed rescaled parameter'' 
\[
y_{\infty,k} = n\frac{(c-c_{k})}{c_{k}}.
\]
761: 
762: The rescaled limit probability graphs (and, see below, the rescaled
763: versions of the mean-field versions) seem to converge (see Fig. 4 in
764: that paper) to the ``annealed limit'' 
765: 
766: \[
767: f_{\infty}(y) = e^{-2^{-y}}. 
768: \]
769: \end{example}
770: \vspace{5mm}
771: 
772: 
773: \vspace{5mm}
774: \begin{definition}
In this paper {\em dimension-dependent 
behavior} refers to the above-mentioned type of phenomenon: convergence 
of the ``rescaled'' probability functions (and their annealed
counterparts) to some common {\em annealed limit}.
779: \end{definition}
780: 
781: \vspace{5mm}
782: \begin{observation}
783: 
784: It is important to note that dimension-dependent behavior is at the
785: same time more and less demanding than critical behavior. 
786: 
787: 
788: It is more demanding since it requires that the 
annealed approximation be exact {\em throughout the (rescaled version 
of the) critical region}. In contrast, critical exponents only provide a
792: qualitative picture of this region, rather than uniquely determine the
793: limit probability throughout it; for instance the width of the scaling 
794: window $\nu$ is equal to $2\beta+\gamma$, where  $\beta$ 
795: is the
796: so-called {\em order-parameter exponent}, that characterizes the
797: asymptotic behavior of the order parameter close to the transition
798: point, and $\gamma$ is called {\em susceptibility exponent} (see
799: e.g. \cite{scaling:window:2sat}). 
800: 
801: It is less demanding since it does
802: not assume the existence of critical exponents, therefore 
803: {\em it makes sense for problems having coarse thresholds,
804: including those that have no singular/critical points}.   
805: 
806: 
807: \end{observation}
808: \vspace{5mm}
809: 
810: 
811: 
Why should we expect critical behavior and the above form for the 
annealed limit? The intuition is very simple: the major difficulty in 
computing the probability that a random $k$-SAT formula is satisfiable 
is the fact that, for two assignments $A$ and $B$, the events 
``$A\models \Phi$'' and ``$B\models \Phi$'' are not generally 
independent, because there exist clauses of length $k$ that are 
falsified by both $A$ and $B$. On the other hand, qualitatively, as 
$k\goesto \infty$ clausal constraints become progressively ``looser'', 
so that in the limit we can neglect such correlations.
825: 
826: As to the exact expression for $f_{\infty}(y)$, for a $k$-CNF formula 
827: the mean-field approximation implies
828: 
829: \[
830: \Pr[\Phi \in \overline{SAT}]\sim (1-\gamma_{k}^{cn})^{2^n}\sim 
831: e^{-2^{n}\cdot \gamma_{k}^{cn}}. 
832: \]
833: 
834: But since $c_{k}$ is specified (in the mean-field approximation) by 
835: $E[\# SAT]\sim 1$, i.e. $2^{n}\cdot \gamma_{k}^{c_{k}n}\sim 1$, 
836: or $1+c_{k}\log_{2}\gamma_{k}=0$, this implies that as $k\goesto 
837: \infty$ 
838: \[
839: \Pr[\Phi \in \overline{SAT}]\sim e^{-2^{n\cdot [1-c/c_{k}]}}\sim 
840: f_{\infty}(y_{\infty,k}).
841: \]
842: 
In other words, when plotted against the annealed rescaled parameters 
$y_{\infty,k}$ the rescaled satisfaction probability graphs (and their 
annealed counterparts) converge pointwise to the graph of $f_{\infty}$. 
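
For the reader's convenience, the algebra behind the last display can be 
spelled out as follows; it uses only the relation 
$1+c_{k}\log_{2}\gamma_{k}=0$ and the definition of $y_{\infty,k}$ 
given above:
\[
2^{n}\cdot \gamma_{k}^{cn}=2^{n\cdot [1+c\log_{2}\gamma_{k}]}
=2^{n\cdot [1-c/c_{k}]}=2^{-n\frac{(c-c_{k})}{c_{k}}}=2^{-y_{\infty,k}},
\]
so that $e^{-2^{n}\cdot \gamma_{k}^{cn}}=e^{-2^{-y_{\infty,k}}}=
f_{\infty}(y_{\infty,k})$. In particular, at the mean-field threshold 
$c=c_{k}$ we have $y_{\infty,k}=0$ and the expression equals 
$f_{\infty}(0)=e^{-1}$.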
847: 
\section{Does critical behavior really exist?}
849: 
The intuitive argument sketched in the preceding paragraph seems to provide 
a beautiful explanation of the experimental results from \cite{kirkpatrick:selman:scaling}. 
Wilson \cite{wilson:ksat:wrong} has shown, however, that this 
intuition is problematic. First 
note that if the previous argument were true, we would have 
$\nu_{k}=1$ 
for any large enough $k$, since this is the width of the scaling
window that the mean-field versions of $k$-SAT predict.  
On the other hand, Wilson 
presented a simple argument that implies that $\nu_{k}\geq 2$. 
Hence the above explanation is not rigorously valid.  
861: 
We stress that Wilson's observation does {\em not} rule out the 
existence of critical behavior: we, in fact, believe that the 
qualitative intuition that motivated \cite{kirkpatrick:selman:scaling}, 
that versions of $k$-SAT become more and more ``similar'' as $k$ goes to 
infinity, is correct. {\em It is the notion of annealed approximation that 
needs to be changed}.  
And, certainly, {\bf his results do not rule out the possibility that the rescaled 
limit probabilities converge, as $k\goesto \infty$, to a 
suitably defined limit}. A rigorous example where this holds, one 
that identifies a 
suitable ``annealed approximation that becomes exact'' and also 
explains this convergence, could hopefully  
offer insights on how to address this problem 
for random $k$-SAT as well. This is what our theorems in the next section 
provide.   
878: 
879: \section{Our results}\label{section:3}
880: A {\em Horn clause} is a disjunction of literals containing {\em at
881: most one positive literal}. It will be called {\em positive} if it
882: contains a positive literal and {\em negative} otherwise.
883: A Horn formula is a conjunction of Horn
884: clauses. {\em Horn satisfiability} (denoted by $\HSAT$) is the
885: problem of deciding whether a given Horn formula has a satisfying
886: assignment.                                                
887: 
In this paper we prove a result that displays 
dimension-dependent behavior for (at most) $k$-Horn satisfiability, the 
natural version of Horn
satisfiability parameterized by the maximum clause length.  
892: This problem is also of practical
893: interest in Artificial Intelligence, 
894: mainly in connection to {\em theory approximation}
895: \cite{kautz-selman-kc}. 
896: The results can be summarized as
897:  follows:
898:  
899: \begin{enumerate}
900: \item For an unbounded $k=k(n)$ the threshold phenomenon
901: is essentially the one from the ``uniform case'' $k(n)=n$. 
902: In particular there exists a
903: ``rescaled'' parameter that makes the graphs of the limit probabilities 
904: superimpose (Theorem~\ref{k:infinite}). 
905:  
906: \item For any constant $k$ the threshold phenomenon is qualitatively
907: described by a suitably chosen queuing model
908: (Theorem~\ref{k:3etc}). This yields a 
909: closed-form expression for the satisfaction probability when 
910: $k=2$ (Theorem~\ref{k:2}). This expression has a singularity (though $k=2$
911: is likely the only case that does so).  
912: \item The rescaled limit probabilities from the
913: cases when $k$ is a constant converge to the one from the ``infinite'' 
914: case, that can in turn be seen as the result of a mean-field approximation
915: (thus the problem displays what we have called dimension-dependent behavior). 
916: 
\item Somewhat surprisingly, the explanation for this convergence (an
intrinsic feature of the problem) is
a threshold property for the number of iterations of \PUR\ 
(a particular algorithm) on random satisfiable Horn formulas 
``in the critical range.'' 
922: 
923: \item In the case when $k=2$ \PUR\ displays an 
924: ``easy-hard-easy'' pattern for the average number of iterations on 
925: satisfiable instances, peaked at the point where the limit probability
926: has a singularity (Theorem~\ref{k:2:runtime}). 
927: \end{enumerate}
928: \vspace{5mm}
929: 
930: 
931: Note, however, the important difference between 
932: random $k$-SAT and random at-most-$k$-\HSAT: for every $k\geq 2$,
933: $k$-SAT has a sharp threshold
934: \cite{friedgut:k:sat}. All versions of \HSAT\ have coarse thresholds. 
935: 
936:  
937: \vspace{5mm}
938: \begin{definition}
939: Let $k=k(n):\N \goesto \N$ be monotonically increasing, $1\leq
940: k(n)\leq n$. We define the following random model $\Omega(k,n,m)$:
941: {\em formula $\Phi$ on $n$ variables 
942: is obtained by selecting (uniformly at random
943: and with repetition) $m$ clauses from the set of all (non-empty) Horn
944: clauses in the given variables of length {\em at most $k(n)$}.}
945: \end{definition}
946: \vspace{5mm}
947: 
948: The following are our results (whose proofs are only sketched):
949: \vspace{5mm}
950: 
951: \begin{theorem} \label{k:infinite}
952: If $k(n)\goesto \infty$, $c>0$, 
953: $H_{k(n)}$ is the number of Horn clauses on $n$ variables
954: having length at most $k(n)$,  and $m(n)= c\cdot \frac{H_{k(n)}}{n}$ 
955: then 
956: \begin{equation}
957: \label{formula:1}
958: p_{\infty}(c):=\lim_{n\goesto \infty} Pr_{\Phi \in 
959: \Omega(k(n),n,m)}(\Phi \in \mbox{HORN-SAT}\/) =
960: 1-F_{1}(e^{-c}).
961: \end{equation}
962: \end{theorem}
963: 
964: \vspace{5mm}
965: 
966: \begin{theorem}\label{k:2}
967: If $c>0$, and $F_{2}:(0,1)\goesto (1,\infty)$,
968: $F_{2}(x)=\ln x/(x-1)$, then 
969: \begin{equation}
970: \label{formula:2}
971: p_{2}(c):=\lim_{n\goesto \infty} Pr_{\Phi \in \Omega(2,n,cn)}(\Phi \in 
972: \mbox{HORN-SAT}\/) =
973: \left \{\begin{array}{ll}
974:                  1, & \mbox{ if $c\leq \frac{3}{2}$,}
975:                  \\
976:  
977:                  F_{2}^{-1}(2c/3),  & \mbox{ otherwise.}\\ 
978:         \end{array}
979: \right.
980: \end{equation} 
981: \end{theorem}
982: \vspace{5mm}
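
As an illustration of formula~(\ref{formula:2}), note that $F_{2}$ is 
decreasing on $(0,1)$, with $F_{2}(x)\goesto \infty$ as $x\goesto 0$ 
and $F_{2}(x)\goesto 1$ as $x\goesto 1$; hence 
$F_{2}^{-1}(2c/3)\goesto 1$ as $c$ decreases to $\frac{3}{2}$, and 
$p_{2}$ is continuous. Some values of $p_{2}$ can be read off directly 
by choosing $x$ first:
\[
F_{2}\left(\frac{1}{2}\right)=\frac{\ln (1/2)}{1/2-1}=2\ln 2 
\;\Longrightarrow\; p_{2}(3\ln 2)=\frac{1}{2}, 
\qquad 
F_{2}\left(\frac{1}{4}\right)=\frac{8}{3}\ln 2 
\;\Longrightarrow\; p_{2}(4\ln 2)=\frac{1}{4},
\]
where $3\ln 2\approx 2.08$ and $4\ln 2\approx 2.77$.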
983: 
984: More generally, define $\lambda_{k}=\frac{k!}{k+1}$ and
985: $S_{j}^{i}={{i}\choose {0}}+{{i}\choose {1}}+\ldots+{{i}\choose
986: {j}}$ (with the usual convention ${{i}\choose{j}}=0$ for $i<j$). Then
987: 
988: \begin{theorem}\label{k:3etc}
989: The limit probability $p_{k}(c):=\lim_{n\goesto \infty}
990: Pr_{\Phi \in \Omega(k,n,c\cdot n^{k-1})}(\Phi \in \mbox{HORN-SAT}\/)$
991: is equal to the probability that the following Markov chain
992: ever hits state zero:
993: \begin{equation}\label{eq:3etc}
994: \left \{\begin{array}{l}
995:         Q_{0}=1,\\
996:         Q_{i+1}=Q_{i}\cminus 1+Po(c\cdot \lambda_{k}\cdot 
997: S_{k-2}^{i+1}),\\
998: \end{array}
999: \right.
1000: \end{equation}
1001: \end{theorem}
1002: \vspace{5mm}
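
To make the chain~(\ref{eq:3etc}) concrete, here are its first two 
instances, computed directly from the definitions of $\lambda_{k}$ and 
$S_{j}^{i}$. For $k=2$ we have $\lambda_{2}=\frac{2}{3}$ and 
$S_{0}^{i+1}=1$, so 
\[
Q_{i+1}=Q_{i}\cminus 1+Po\left(\frac{2c}{3}\right),
\]
a queue with one departure per step and a constant arrival rate, which 
equals $1$ exactly when $c=\frac{3}{2}$. 
For $k=3$ we have $\lambda_{3}=\frac{3}{2}$ and $S_{1}^{i+1}=i+2$, so 
\[
Q_{i+1}=Q_{i}\cminus 1+Po\left(\frac{3c}{2}\cdot (i+2)\right),
\]
and the arrival rate grows linearly with the step index. 
For $k=2$ the chain coincides, up to its first visit to zero, with the 
exploration of a Galton-Watson process with $Po(\frac{2c}{3})$ 
offspring. The standard fixed-point equation for the extinction 
probability $x$ of such a process (quoted here as a consistency check, 
not as part of the proof) is $x=e^{\frac{2c}{3}(x-1)}$, that is 
$\ln x/(x-1)=\frac{2c}{3}$ when $x<1$, in agreement with 
Theorem~\ref{k:2}.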
1003: 
1004: To get a better intuition on the threshold phenomenon, as displayed by
1005: Theorems~\ref{k:infinite}, \ref{k:2} and \ref{k:3etc}, we have plotted
1006: (in Fig. 1) the limit probability functions 
1007: $p_{2}(\cdot),p_{3}(\cdot),p_{\infty}(\cdot)$, against the ``rescaled'' parameter 
1008: (inspired by Theorem~\ref{k:infinite}) $\hat{c}=\frac{m\cdot
1009: n}{H_{k(n)}}$. This rescaling has the pleasant property that it
absorbs the factor $\lambda_{k}$ from the right-hand side
of~(\ref{eq:3etc}), in particular mapping the critical point in
Theorem~\ref{k:2} to $\hat{c}=1$. 
The graphs of $p_{2}$ (continuous) and $p_{\infty}$
(dashed) are obtained from their formulas in the previous results,
while $p_{3}$ (dotted) is obtained via simulations. The figure makes
apparent that the graphs of $p_{2}, p_{3}, \ldots$ 
converge to
the graph of $p_{\infty}$. This statement can be
proved rigorously:
1020: 
1021: \begin{theorem}\label{annealed}
For every $\hat{c}>0$, $\lim_{k\goesto
\infty}p_{k}(\hat{c})=p_{\infty}(\hat{c})$. 
1024: \end{theorem}
1025: \vspace{5mm}
1026:  
1027: 
1028: \begin{figure}
1029: \centerline{
1030: \psfig{figure=fig1.ps,width=3.5in}}
1031: \caption{Rescaled threshold functions}
1032:  \end{figure}
1033: 
1034: 
1035: 
1036: 
1037: As a bonus our analysis yields the following result:
1038: \vspace{5mm}
1039: \begin{theorem}\label{k:2:runtime}
1040: Let $q$ be the limit of the 
1041: expected number of iterations of \PUR\ on a random formula
1042: $\Phi \in \Omega(2,n,cn)$, conditional on $\Phi$ being
1043: satisfiable. Then 
1044: \begin{equation}
1045: \label{q:2}
1046: q=
1047: \left \{\begin{array}{ll}
1048:                  \frac{1}{1-p_{2}\lambda_{2}c} & \mbox{, if $c\neq 
1049: \frac{3}{2}$,}
1050:                  \\
1051:  
1052:                  \infty,  & \mbox{ otherwise.}\\ 
1053:         \end{array}
1054: \right.
1055: \end{equation} 
1056: \end{theorem}
1057: \vspace{5mm}
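
For instance, with $\lambda_{2}=\frac{2}{3}$ and the values of 
$p_{2}=p_{2}(c)$ given by Theorem~\ref{k:2}, formula~(\ref{q:2}) 
evaluates as follows. For $c=\frac{3}{4}$ we have $p_{2}=1$, and for 
$c=3\ln 2$ we have $p_{2}=\frac{1}{2}$ (since 
$F_{2}(\frac{1}{2})=2\ln 2=\frac{2c}{3}$), so 
\[
q\left(\frac{3}{4}\right)=\frac{1}{1-\frac{1}{2}}=2, 
\qquad 
q(3\ln 2)=\frac{1}{1-\ln 2}\approx 3.26 .
\]
Moreover $q\goesto 1$ as $c\goesto 0$, while $q$ blows up as 
$c$ increases to $\frac{3}{2}$ (where $p_{2}=1$ and 
$q=\frac{1}{1-2c/3}$).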
1058: 
1059: This theorem suggests (see Fig.2) and explains the ``easy-hard-easy''
1060: pattern for the average running time of
1061: \PUR\ 
1062: in this case. Experiments we performed confirm this prediction.
1063:  \begin{figure}
1064:  \centerline{
1065:  \psfig{figure=fig2.ps,width=3.5in}}
1066:  \caption{The ``easy-hard-easy'' pattern.}
1067:  \label{figure-2}
1068:  \end{figure}
1069: 
1070: \section{Preliminaries}
1071: 
1072: Throughout this paper we use ``with high probability'' (w.h.p.)
1073: as a substitute for ``with probability $1-o(1)$''. 
We denote (sometimes abusing notation) by $B(n,p)$ (respectively 
$Po(\lambda)$) a random
variable having a binomial (respectively Poisson) distribution with the 
corresponding
parameter(s), and by $a\cminus b$ the value $\max(a-b,0)$. 
We will use the following version of the Chernoff bound:
1080: \vspace{5mm}
1081: 
1082: \begin{theorem}
1083: If $0<\theta <1/4$ then
1084: $\PR[|B(n,p)-np|>\theta np ] \leq e^{-np\frac{\theta^{2}}{4}}$.
1085: \end{theorem}
1086: \vspace{5mm}
1087: 
1088: as well as the related inequality from \cite{probabilistic-method} : 
1089:  \vspace{5mm}
1090: 
1091: \begin{proposition}\label{chernoff:poisson}
1092: Let $P$ have Poisson distribution with mean $\mu$. For $\epsilon >0$,
1093:  
\[ \Pr[P\leq \mu \cdot (1-\epsilon)] \leq e^{-\epsilon^{2}\cdot \mu
/2},
1096: \]
1097:  
1098: \[ \Pr[P\geq \mu \cdot (1+\epsilon)] \leq
1099: [e^{\epsilon}(1+\epsilon)^{-(1+\epsilon)}]^{\mu}.
1100: \]
1101: \end{proposition}    
1102: \vspace{5mm}
1103: 
1104: We also use the following inequality:
1105: \vspace{5mm}
1106: 
1107: \begin{proposition}
1108: Let $k\in \N$ and $p\in [0,1]$. Then for every $n\geq k$ 
1109: \begin{equation}
1110: 1-\sum_{i=0}^{k-1} {{n}\choose {i}}p^{i}(1-p)^{n-i}\leq {{n}\choose 
1111: {k}}p^{k}.
1112: \end{equation} 
1113: \end{proposition}
1114: \vspace{5mm}
1115: 
1116: 
1117: \begin{PROOF} Define $f:[0,1]\goesto R$, $f(p)=1-\sum_{i=0}^{k-1} 
1118: {{n}\choose
1119: {i}}p^{i}(1-p)^{n-i} -{{n}\choose {k}}p^{k}$. It is easy to see that 
1120: $f^{\prime}(p)=n{{n-1}\choose {k-1}}p^{k-1}[(1-p)^{n-k}-1]\leq 0$,
therefore $f$ is monotonically nonincreasing on $[0,1]$. Since 
$f(0)=0$, we get $f(p)\leq 0$ for every $p\in [0,1]$, which is the 
claimed inequality.
1122: \end{PROOF}
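
As a small check of the inequality, take $n=3$, $k=2$ and 
$p=\frac{1}{2}$: the left-hand side is 
\[
1-\left[\left(\frac{1}{2}\right)^{3}+3\cdot \frac{1}{2}\cdot 
\left(\frac{1}{2}\right)^{2}\right]=\frac{1}{2},
\]
while the right-hand side is 
${{3}\choose {2}}\cdot \frac{1}{4}=\frac{3}{4}$. 
The left-hand side is the probability that $B(n,p)\geq k$, so the 
proposition can also be read as a union bound over the 
${{n}\choose {k}}$ possible sets of $k$ successes.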
1123: 
1124: 
1125: 
1126: 
1127: 
1128: 
1129:  
1130: We will also employ {\em couplings of Markov
1131: chains} (see \cite{lindvall:coupling}) to assert stochastic
1132: domination. The following is the definition of the type of 
1133: coupling we employ in
1134: this paper:
1135: \vspace{5mm}
1136: \begin{definition}
1137: Let $(X_{t})_{t}$ and $(Y_{t})_{t}$ be two Markov chains on ${\bf Z}$. 
1138: A {\em coupling of $X$ and $Y$ such that $X_{t}\leq Y_{t}$} is a
1139: Markov chain $Z=(Z_{t,1},Z_{t,2})$ such that:
1140: \begin{itemize}
1141: \item $Z_{t,1}$ is distributed like $X_{t}$ given $X_{0}$. 
1142: \item $Z_{t,2}$ is distributed like $Y_{t}$ given $Y_{0}$.
1143: \item for every $i\geq 0$, $Z_{i,1}\leq Z_{i,2}$.
1144: \end{itemize} 
1145: \end{definition}
1146: \vspace{5mm}
1147: 
1148: We use such couplings to bound the probability that a Markov 
1149: chain $Y_{t}$ ever decreases below a certain value $a$ by coupling it 
1150: with a chain $X_{t}$ such that $X_{t}\leq Y_{t}$ and using the
1151: estimate $\Pr[\exists t: Y_{t}\leq a]\leq \Pr[\exists t: X_{t}\leq a]$ 
1152: (that follows from the coupling). The couplings we construct employ the 
1153: following ideas:   
1154: \begin{itemize}
1155: \item Suppose the recurrences 
1156: describing $\Delta X_{t}$ and $\Delta Y_{t}$ are identical, 
1157: except for one term, which is $B(m_{1},\tau)$ in $X_{t}$ and 
1158: $B(m_{2},\tau)$ in $Y_{t}$, 
1159: where $m_{1}\leq m_{2}$ are positive integers and $\tau \in (0,1)$. 
1160: Obtain a coupling by identifying $B(m_{1},\tau)$ with the outcome of 
1161: the first $m_{1}$ Bernoulli experiments in $B(m_{2},\tau)$. 
1162: \item Suppose now that $\Delta X_{t}$ and $\Delta Y_{t}$ differ by
1163: exactly one term which is $B(m,p)$ in $\Delta X_{t}$ and  
$B(m,q)$ in $\Delta Y_{t}$, $p \leq q$. Let $A_{i}$ and $B_{i}$,
$i=1,\ldots,m$,  
1166: be independent $0/1$ experiments with success probabilities $p$
1167: and $\frac{q-p}{1-p}$ respectively. Define the pair $(Z_{t,1}, 
1168: Z_{t,2})$ so that 
1169: \begin{enumerate}
1170: \item $Z_{t,1}$ is the number of times $A_{i}$ succeeds. 
1171: \item $Z_{t,2}$ is the number of times at least one of $A_{i}$ and 
1172: $B_{i}$
1173: succeeds. 
1174: \end{enumerate}
1175: \end{itemize}
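
That the second construction has the right marginals can be verified 
in one line (assuming $p<1$, so that $\frac{q-p}{1-p}$ is well 
defined): for each $i$, 
\[
\Pr[A_{i}\mbox{ or }B_{i}\mbox{ succeeds}]=p+(1-p)\cdot 
\frac{q-p}{1-p}=q,
\]
so $Z_{t,1}$ is distributed as $B(m,p)$, $Z_{t,2}$ is distributed as 
$B(m,q)$, and $Z_{t,1}\leq Z_{t,2}$ holds on every outcome. 
For example, $p=\frac{1}{4}$ and $q=\frac{1}{2}$ give success 
probability $\frac{1}{3}$ for the auxiliary experiments $B_{i}$.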
1176: 
1177: %We will also explicitly refer to  
1178: %the following stochastic dominance inequality obtained by the first 
1179: %coupling. 
1180: %Let $0<m_{1}\leq m_{2}$ and $\tau >0$. Then, for every $a>0$, 
1181: %\begin{equation}\label{couple} 
1182: %\Pr[B(m_{1},\tau)\geq a]\leq \Pr[B(m_{2},\tau)\geq a]. 
1183: %\end{equation}
1184: 
1185: We measure the distance between two probability distributions 
1186: $P$ and $Q$ by {\em the total variation distance}, 
1187: denoted by $d_{TV}(P,Q)$,  and recall the following results, 
1188: (see \cite{sheu:poisson} and \cite{barbour:holst:janson}, page
1189: 2 and Remark 1.4):
1190: \vspace{5mm}
1191: 
1192: \begin{lemma}\label{b:h:j}If $n,p,\lambda, \mu >0$ then 
1193: $d_{TV}(B(n,p),Po(np))\leq \min\{np^{2},\frac{3p}{2}\}$ and
1194: $d_{TV}(Po(\lambda), Po(\mu))\leq |\mu - \lambda|$. 
1195: \end{lemma}
1196: \vspace{5mm}
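
For instance, when $p=\lambda/n$ for a constant $\lambda>0$, 
the first bound of Lemma~\ref{b:h:j} reads 
\[
d_{TV}(B(n,\lambda/n),Po(\lambda))\leq 
\min\left\{\frac{\lambda^{2}}{n},\frac{3\lambda}{2n}\right\}
=O\left(\frac{1}{n}\right).
\]
With $\lambda=2$ and $n=1000$ this gives the bound 
$\min\{0.004,0.003\}=0.003$.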
1197: 
1198: We will also need the following simple lemma:
1199: 
1200: \begin{lemma}\label{approximation}
Let $c$ be a fixed positive integer. For every $t\in \N$ let
1202: $\xi_{t}$, $\eta_{t}$ be two probability distributions. Define the
1203: Markov chains $(X_{t})_{t}$ and $(Y_{t})_{t}$ by recurrences 
1204: \begin{equation}
1205: \left\{\begin{array}{l}
1206: X_{t+1}=X_{t}\cminus c + \xi_{t}, \\
1207: Y_{t+1}=Y_{t}\cminus c + \eta_{t}.\\
1208: \end{array}
1209: \right.
1210: \end{equation}
1211: 
1212: Then, for every $t\geq 0$, $d_{TV}(X_{t},Y_{t})\leq
1213: d_{TV}(X_{0},Y_{0})+ \sum_{i=0}^{t-1} d_{TV}(\xi_{i}, \eta_{i}).$  
1214: \end{lemma}
1215: 
1216: \beginproof 
1217: 
The following result gives a more convenient inequality that
immediately implies Lemma~\ref{approximation} (apply it at each step 
and add up the resulting bounds).
1220: \vspace{5mm}
1221: 
1222: \begin{lemma}\label{easy:approximation}
Let $c$ be a fixed positive integer. Let
$X$, $Y$, $\xi$, $\eta$ be random variables with nonnegative integer
values. Define the 
random variables $Z$ and $T$ by 
1227: \begin{equation}
1228: \left\{\begin{array}{l}
1229: Z=X\cminus c + \xi, \\
1230: T=Y\cminus c + \eta.\\
1231: \end{array}
1232: \right.
1233: \end{equation}
Then $d_{TV}(Z,T)\leq
d_{TV}(X,Y)+ d_{TV}(\xi, \eta).$  
1236: \end{lemma}
1237: \vspace{5mm}
1238: 
1239: \beginproof
1240: 
1241: To prove this result, we will denote (for the ``generic'' r.v. $A$) by
1242: $A_{i}$ the probability that $A$ takes value $i$. We also employ the 
1243: following simple inequality, valid for $a,b,c,d\geq 0$: $|ad-bc|\leq
1244: a|d-c|+|a-b|c$. 
1245: 
1246: For every $a\geq 0$ we have:
1247: \[
1248: Z_{a}=\sum_{i=0}^{c} X_{i}\xi_{a}+\sum_{i=c+1}^{c+a} X_{i}\xi_{a+c-i},
1249: \]
1250: \[
1251: T_{a}=\sum_{i=0}^{c} Y_{i}\eta_{a}+\sum_{i=c+1}^{c+a} 
Y_{i}\eta_{a+c-i}.
1253: \]
1254: 
1255: Applying the above-mentioned inequality and summing we get:
1256: 
\begin{eqnarray*}
d_{TV}(Z,T) \\ & \leq &
\frac{1}{2} \{
\sum_{a=0}^{\infty}\sum_{i=0}^{c}X_{i}|\xi_{a}-\eta_{a}|
+\sum_{a=0}^{\infty}\sum_{i=0}^{c}|X_{i}-Y_{i}|\eta_{a}+ \\
& + & 
\sum_{a=0}^{\infty}\sum_{i=c+1}^{c+a}X_{i}|\xi_{c+a-i}-\eta_{c+a-i}|
+\sum_{a=0}^{\infty}\sum_{i=c+1}^{c+a}|X_{i}-Y_{i}|\eta_{c+a-i}\}.
\end{eqnarray*}
1265: \end{eqnarray*}
1266: 
Let $A$, $B$, $C$, $D$ be the four terms of the sum (each taken with 
the factor $\frac{1}{2}$). Exchanging the order of summation we obtain:
1269: \[
1270: \begin{array}{lcl}
1271:  A = (\sum_{i=0}^{c}X_{i})\cdot d_{TV}(\xi,\eta), &\hspace{5mm} & B = 
1272: \frac{1}{2}\sum_{i=0}^{c} |X_{i}-Y_{i}|,\\ 
1273: C = (\sum_{i=c+1}^{\infty}X_{i})\cdot d_{TV}(\xi,\eta),
1274: & \hspace{5mm} & D =
1275:  \frac{1}{2}\sum_{i=c+1}^{\infty}|X_{i}-Y_{i}|,
1276: \end{array}
1277: \]
Since $A+C=d_{TV}(\xi,\eta)$ and $B+D=d_{TV}(X,Y)$, the result follows.
1279: \qed
1280: 
1281: 
1282: Finally, we need the following trivial occupancy property:
1283: \vspace{5mm}
1284: 
1285: \begin{lemma}\label{occupancy}
1286: Let $a$ white balls and $b$ black balls be thrown uniformly at random
1287: in $n$ bins. 
1288: \begin{enumerate}
1289: \item if $r=\max(a,b)=o(n^{1/2})$ then the probability that there is a 
1290: bin that contains both white and black balls is at most 
1291: $\frac{4r^2}{n}=o(1)$.
1292: \item if $s=\min(a,b)=\omega(n^{1/2})$ then the probability that there 
1293: is a 
1294: bin that contains both white and black balls is $1-o(1/poly)$.
1295: \end{enumerate}
1296: \end{lemma}
1297: \vspace{5mm}
1298: 
1299: \beginproof
1300: The first part is easy: the probability that two balls (of any color) 
1301: end up in the same bin is at most ${{a+b}\choose {2}}\cdot 
1302: \frac{1}{n}$.
1303: For the second part, let $A$ be the event that no two balls of
different colors end up in the same bin, and let $B$ be the event that at
1305: least $\sqrt{n}$ bins contain white balls. We have:
1306: \[ \Pr[A]\leq \Pr[A|B]+\Pr[\overline{B}].\]
1307: But
\[ \Pr[\overline{B}]\leq {{n}\choose {\sqrt{n}}}\cdot
(\frac{1}{\sqrt{n}})^{a}\leq n^{\sqrt{n}-a/2}=o(\frac{1}{poly}), \]
1310: and 
1311: \[\Pr[A|B]\leq (1-\frac{1}{\sqrt{n}})^{b}\sim
1312: e^{-b/\sqrt{n}}=o(\frac{1}{poly}). \]
1313: \qed
1314: 
1315: The algorithm \PUR\ is displayed in Figure 3. 
1316: We regard \PUR\ as working in stages, indexed by the
1317: number of variables still left unassigned; thus, the stage number
1318: decreases as \PUR\ moves on. We say that {\em formula $\Phi$ survives
1319: Stage $t$} if \PUR\ on input $\Phi$ does not halt at Stage $t$ or
1320: earlier. Let $\Phi_i$ be the formula at the
1321: beginning of stage $i$, and let $N_{i}$ denote the number of its
1322: clauses. We will also denote by $P_{i,t} (N_{i,t})$, the number of 
1323: clauses of
1324: $\Phi_{t}$ of size $i$ and containing one (no) positive
1325: literal. Define $\Phi_{i,t}^{P}$ ($\Phi_{i,t}^{N}$) to be the
1326: subformula of $\Phi_{t}$ containing the clauses counted by $P_{i,t} 
1327: (N_{i,t})$.
1328: 
1329: The following lemmas were proved in \cite{istrate:cs.DS/9912001}, in 
1330: the
1331: context of analyzing the behavior of \PUR\ on $\Phi\in
1332: \Omega(n,n,m)$, $m=c\cdot 2^n$.
1333: \vspace{5mm}
1334: 
1335: \begin{lemma}\label{k:inf:recurrence}
1336: \begin{enumerate}
1337: \item 
1338: Suppose $\PUR$ does not halt before stage $t$. Then, conditional on $N_{t}$,
1339: the clauses of $\Phi_{t}$ are random and independent. 
1340: \item   
Suppose now that we condition on $\Gamma_{t}=(N_{1,t},N_{2,t},P_{1,t},
P_{2,t})$ and on the fact that $\Phi$
survives Stage $t$ as well. Then  we have 
1344: 
1345: \begin{equation}\label{eq:markovchain}
1346: N_{t-1}=N_{t}-\Delta_{1,P}(t)-\Delta_{2,P}(t), 
1347: \end{equation}
1348: 
1349: where 
1350: \begin{itemize}
\item $\Delta_{1,P}(t)$, the number of positive unit clauses that are
satisfied at stage $t$, has the distribution $1+B\left(P_{1,t}-1,\frac{1}{t}\right)$. 
1353: \item  
1354: $\Delta_{2,P}(t)$, the number of positive non-unit clauses 
1355: that are satisfied at stage $t$, has the binomial distribution
1356: $B\left(P_{2,t},\frac{1}{t}\right)$.
1357: \end{itemize}
1358: \end{enumerate}
1359: \end{lemma}
1360: 
1361: \vspace{5mm}
1362: 
1363: \begin{lemma}\label{k:inf:bounds}
For every $c>0$ and every $t$ with $n-c\sqrt{n} \leq t \leq n$, 
1365: the conditional probability that the inequality 
1366: \begin{equation}\label{concentrate}
1367: N_{n}-(n-t)\left[1+\frac{2(N_{n}-1)}{t}\right]\leq N_{j}\leq 
1368: N_{n}\end{equation}
1369: holds for all $t\leq j \leq n$, in the event that $\PUR$ reaches stage 
1370: $t$,
1371: is $1-o(1)$.
1372: \end{lemma}
1373: \vspace{5mm}
1374: 
1375: \begin{lemma}\label{k:inf:prob}
Let $X_{n}\in [0,n]$ be the r.v. denoting the number of iterations of 
\PUR\ on
a random {\em satisfiable} formula $\Phi\in \Omega(n,n,c\cdot
2^{n})$. Then $X_{n}$ converges in distribution to a distribution 
$\rho$ supported on the nonnegative integers, 
$\rho=(\rho_{k})_{k\geq
0}$, $\rho_{k}= \Pr[\rho = k]$, 
1383: given by
1384: \[ \rho_{k}=\frac{e^{-2^{k}c}}{1-F(e^{-c})}\cdot \prod_{i=1}^{k-1}
1385: (1-e^{-2^{i}c}).
1386: \]
1387: \end{lemma}
1388: \vspace{5mm}
1389: 
1390: \begin{center}
1391: \begin{figure}
1392: {\tt
1393: \begin{tabbing}
1394: 
1395: Pr\=ogram PUR($\Phi$): \\
   \> if \= ($\Phi$ contains no positive unit clause)\\
   \> \>then \= return TRUE \\
   \> \>else \\
   \> \> \>choose a positive unit clause $x$ \\
1400:    \> \> \>if \= ($\Phi$ contains $\overline{x}$ as a clause)\\
1401:    \> \> \> \>then \= \\
1402:    \> \> \> \> \>return FALSE \\
1403:    \> \> \> \>else \\
1404:    \> \> \> \> \>let $\Phi^{\prime}$ be the formula \\
1405:    \> \> \> \> \>obtained by setting
1406:    $x$ to 1 \\
1407:     \> \> \> \> \>return \PUR($\Phi^{'}$) \\
1408: \end{tabbing}
1409: }
1410: \caption{Algorithm PUR}
1411: \end{figure}
1412: \end{center}
1413: \section{The proof of Theorem~\ref{k:infinite}}
1414: 
1415: Let $c_{1}<c_{2}<c_{3}$ be arbitrary constants. Consider three
1416:  random formulas $\Phi_{1}\in \Omega(n,{\bf k(n)},c_{1}\cdot
1417:  \frac{H_{k(n)}}{n})$,$\Phi_{2} \in \Omega(n,{\bf n},c_{2}\cdot
1418:  2^{n})$ and  $\Phi_{3}\in\Omega(n,{\bf k(n)}, c_{3}\cdot 
1419: \frac{H_{k(n)}}{n})$,
1420: and let $\Phi^{\prime}$ be the subformula of $\Phi_{2}$ consisting of
1421:  the clauses of size at
1422: most $k(n)$. By the Chernoff bound, with high probability,
1423: $m^{\prime}$, the number of clauses of $\Phi^{\prime}$, is in the 
1424: interval
1425: $[c_{1}\cdot \frac{H_{k(n)}}{n},c_{3}\cdot \frac{H_{k(n)}}{n}] $. 
1426: When $n\goesto \infty$ the probability that $\Phi_{2} \in \HSAT$ tends
1427: to $1-F_{1}(e^{-c_{2}})$. 
1428: 
1429: From Lemma~\ref{k:inf:prob} we infer the following easy consequence
1430: \vspace{5mm}
1431: \begin{claim}
1432: The probability that \PUR\  accepts $\Phi_{2}$ 
1433: after stage $n-k(n)+1$ is $o(1)$. 
1434: \end{claim}
1435: 
1436: 
1437:  
1438: Since in the first $k(n)-1$ stages 
1439: of \PUR\  {\em only the clauses of $\Phi^{\prime}$ can influence the
1440: algorithm acceptance/rejection of  $\Phi_{2}$ 
1441: (because \PUR\  accepts/rejects
1442: at Stage $i$ based only on the unit clauses, and 
1443: each non-simplified clause loses at most one literal at each phase)},
1444: \[ |\Pr[\Phi_{2}\in \HSAT]- \Pr[\Phi^{\prime}\in \HSAT]|= o(1).
1445: \]
By the monotonicity of SAT (adding clauses can only destroy
satisfiability) and the randomness of
$\Phi_{1},\Phi_{3}, \Phi^{\prime}$ we have 

\[ \Pr[\Phi_{3}\in \HSAT]-o(1)\leq \Pr [\Phi_{2} \in \HSAT] \leq
\Pr[\Phi_{1}\in \HSAT]+o(1). 
\]
Taking limits it follows that 

\begin{eqnarray*}
{\overline{\lim}_{n\goesto \infty} \Pr}_{\Phi\in
\Omega(n,k(n),c_{3}H_{k(n)}/n)} [\Phi \in \HSAT] & \leq & 
1-F_{1}(e^{-c_2}) \\
 & \leq & {\underline{\lim}_{n\goesto \infty} \Pr}_{\Phi \in
\Omega(n,k(n),c_{1}H_{k(n)}/n)} [\Phi \in \HSAT] .
\end{eqnarray*}
Since $c_{1}<c_{2}<c_{3}$ were chosen arbitrarily, 
by choosing $c_{1}=c, c_{2}=c+\epsilon$, and $c_{2}=c-\epsilon, c_{3}=
c$, respectively, we infer that 

\begin{eqnarray*}
1-F_{1}(e^{-(c+\epsilon)}) & \leq & {\underline{\lim}_{n\goesto
\infty} \Pr}_{\Phi\in \Omega(n,k(n),cH_{k(n)}/n)}[\Phi \in
\HSAT] \\ 
 & \leq & {\overline{\lim}_{n\goesto \infty}\Pr}_{\Phi \in
\Omega(n,k(n),cH_{k(n)}/n)}[\Phi \in \HSAT]\leq 
1-F_{1}(e^{-(c-\epsilon)}).
\end{eqnarray*}
As $\epsilon$ is arbitrary, we get the desired result.
1474: \qed
1475: 
1476: \begin{observation}\label{obs:coupling}
1477: One point about the previous proof that is intuitively clear, but gets
1478: somewhat obscured by the technical details of the proof, is that if
1479: $\Phi_{2} \in \Omega(n,{\bf n},c_{2}\cdot 2^{n})$
1480: then $\Phi^{'}$ behaves ``for every practical purpose'' 
1481: as if it were a uniform formula in $\Omega(n,{\bf k(n)},c_{2}\cdot
1482:  \frac{H_{k(n)}}{n})$. We will use a similar
intuition in the proof of Theorem~\ref{annealed}. 
1484: \end{observation}
1485:  \vspace{5mm}
1486: 
1487: \section{The uniformity lemma}
1488: 
1489: The following lemma is the analog of Lemma~\ref{k:inf:recurrence}
1490: for the case $k=2$, and the basis for our analysis of this case:
1491: \vspace{5mm}
1492: 
1493: \begin{lemma}\label{k:2:recurrence}
1494: Suppose that $\Phi$ survives up to stage $t$. Then, conditional on 
1495: $(P_{1,t}, N_{1,t}, P_{2,t}, N_{2,t})$, the clauses in  
1496: $\Phi_{1,t}^{P},
1497: \Phi_{1,t}^{N}, \Phi_{2,t}^{P}, \Phi_{2,t}^{N}$ are chosen uniformly
1498: at random and are independent. Also, conditional on the
1499: fact that $\Phi$ survives stage $t$ as well, the following recurrences
1500: hold:
1501: \begin{equation}\label{k:2:markovchain}
1502: \left \{\begin{array}{l}
1503:          P_{1,t-1}=P_{1,t}-1-\Delta_{1,t}^{P}+\Delta_{12,t}^{P}, \\        
1504:          N_{1,t-1}=N_{1,t}+\Delta_{12,t}^{N},                    \\
1505:          P_{2,t-1}=P_{2,t}-\Delta_{12,t}^{P}-\Delta_{02,t}^{P},  \\
1506:          N_{2,t-1}=N_{2,t}-\Delta_{12,t}^{N},                    \\
1507:         \end{array}
1508: \right.
1509: \end{equation}
1510: where (in distribution)
1511: \begin{equation}\label{k:2:distribution}
1512: \left \{\begin{array}{l}
1513: \Delta_{1,t}^{P} =B(P_{1,t}-1,1/t),\\
1514: \Delta_{12,t}^{P}=B(P_{2,t},1/t),\\
1515: \Delta_{02,t}^{P}=B(P_{2,t}-\Delta_{12,t}^{P},1/t),\\
1516: \Delta_{12,t}^{N}=B(N_{2,t},2/t).\\
1517: \end{array}
1518: \right.
1519: \end{equation}
1520: \end{lemma}
1521:  \vspace{5mm}
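
As a heuristic consistency check (a back-of-the-envelope computation 
that is not used in the proofs), consider the first stage $t=n$ for 
$\Phi\in \Omega(2,n,cn)$. Of the Horn clauses of length at most two, 
$2n$ have length one, ${{n}\choose {2}}$ are negative of length two and 
$n(n-1)$ are positive of length two, so a random clause is a positive 
clause of length two with probability $\frac{2}{3}-o(1)$, and 
$P_{2,n}\approx \frac{2}{3}cn$. By~(\ref{k:2:distribution}), 
\[
E[\Delta_{12,n}^{P}]=\frac{P_{2,n}}{n}\approx \frac{2c}{3}
=\lambda_{2}\cdot c,
\]
which matches the arrival rate of the queue in Theorem~\ref{k:3etc} 
for $k=2$.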
1522: 
1523: \beginproof
1524: A formula will be represented by an
1525: $m\times 2$ table. The rows
1526: of the table correspond to clauses in the formula and the entries are
1527: its literals. They are gradually unveiled as the algorithm proceeds. 
1528: We assume that when generating $\Phi$ we mark those
1529: clauses containing only one literal (so that we know their location,
1530: but not their content).
1531: We say that a row (or a clause) is ``blocked'' either if the clause is
1532: already satisfied or the clause has been turned into the empty
1533: clause.
1534: Suppose $\PUR$ arrives at stage $t$ on $\Phi$.  Then in stages
1535: $i=n, n-1, \ldots, t+1$, $\Phi_i$ should contain a unit clause
1536: consisting of a positive literal but should not have contained
1537: complementary unit clauses of the same variable.
1538: To carry out the disclosure at stage $i$, let $x$ be the variable set
1539: to one in this stage. We assume that the formula unveils
1540: all occurrences of $x$ or $\overline{x}$ in $\Phi$. For each clause we 
1541: perform the following:
1542: 
1543: \begin{enumerate}
1544: \item if it contains $x$ we unveil all its literals and block;
1545: \item otherwise we do nothing. 
1546: \end{enumerate}
1547: The clauses of $\Phi_{t}$ having size two correspond to the rows of 
1548: $\Phi$
1549: that contain no unveiled literal. 
1550: The clauses of size one are either the clauses of
1551: size one in $\Phi$ that contain none of the chosen literals, or the 
1552: clauses of size two that contain the negation of one chosen variable 
1553: and another is yet to be chosen. 
1554: Given these observations the uniformity and independence follow from
1555: the way we construct $\Phi$. 
1556: 
1557: To prove the recurrences, let $x$ be the variable set to
1558: one in stage $t$ (it exists since \PUR\ does not halt at
1559: this stage). By uniformity and independence, each of the $P_{1,t}-1$
1560: positive unit clauses of $\Phi_{t}$, other than the chosen one, is
1561: equal to $x$ with probability $1/t$ (since there are $t$ variables
1562: left at this stage). On the other hand, the positive unit clauses of 
1563: $\Phi_{t-1}$ that are not present
1564: in $\Phi_{t}$ can only come from clauses of size two of $\Phi_{t}$
1565: that contain $\overline{x}$ and a positive literal (therefore counted
by $P_{2,t}$). Uniformity and independence therefore imply that
$\Delta_{1,t}^{P}$ and $\Delta_{12,t}^{P}$ have the distributions claimed in
(\ref{k:2:distribution}). The other relations can be
1569: justified similarly (noting that, since \PUR\ does not reject at this
1570: stage, every negative unit clause of $\Phi_{t}$ is also present in 
1571: $\Phi_{t-1}$).  
1572: 
1573: 
1574:  
1575: It will be useful to consider the Markov chain
1576: (\ref{k:2:markovchain}) for all
1577: values of $t=n,\ldots, 0$ (even when the algorithm halts). To
1578: accomplish that, the ``minus'' signs in the first equation of
1579: (\ref{k:2:markovchain}) and the definition of $\Delta_{1,t}^{P}$ 
1580: should be replaced by $\cminus$. We also need to specify the
1581: distribution of each component of
1582: the tuple $(P_{1,n}, N_{1,n}, P_{2,n}, N_{2,n})$. Let $\Delta_{n}$ be
a random variable having the binomial distribution $B(cn,
1584: \frac{2n}{2n+3{{n}\choose {2}}})$. It is easy to see that in
1585: distribution
1586:  
1587: \begin{equation}\label{k:2:initial:condition}
1588: \left \{\begin{array}{l}
1589:          P_{1,n}=B(\Delta_{n},1/2),\\
1590:          N_{1,n}=\Delta_{n}-P_{1,n},\\
         P_{2,n}=B(cn-\Delta_{n},2/3),\\
1592:          N_{2,n}=cn-\Delta_{n}-P_{2,n}.\\
1593:         \end{array}
1594: \right.
1595: \end{equation}
1596: \endproof
1597: \qed 
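
As a quick sanity check on the initial condition (a worked computation added for the reader, using only the parameters of $\Delta_{n}$ stated above), the expected number of unit clauses in $\Phi$ is
\[
E[\Delta_{n}]=cn\cdot \frac{2n}{2n+3{{n}\choose{2}}}=cn\cdot\frac{4}{3n+1}\goesto \frac{4c}{3},
\]
since $2n+3{{n}\choose{2}}=\frac{n(3n+1)}{2}$. Hence $E[P_{1,n}]=E[N_{1,n}]\goesto 2c/3$, while $E[P_{2,n}]=\frac{2}{3}(cn-E[\Delta_{n}])=\frac{2}{3}cn-O(1)$ and $E[N_{2,n}]=\frac{1}{3}cn-O(1)$.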
1598: 
1599: \section{Proof of Theorem~\ref{k:2}}
1600: 
1601: 
1602:  
1603: The main intuition for the proof is that in ``most interesting stages''
1604: $\Delta_{1,t}^{P}=0$ and $\Delta_{12,t}^{P}$ is approximately
1605:  Poisson distributed. Therefore,  $P_{1,t}$ qualitatively
1606: behaves like the Markov Chain $(Q_{t})_{t}$ defined by 
1607: \begin{equation}
1608: \left \{\begin{array}{l}
1609:         Q_{n+1}=1,\\
1610:         Q_{t-1}=Q_{t}\cminus 1+Po(\lambda),\\
1611: \end{array}
1612: \right.
1613: \end{equation}
1614: where $\lambda=2c/3$.
1615: This explains the closed form of the limit probability: a well-known 
1616: result 
1617: states that $\rho$, the probability that the queuing chain $Q_{t}$ 
1618: reaches 
1619: state 0, satisfies the equation $\rho= \Phi(\rho)$, where
1620: $\Phi(t)=e^{\lambda(t-1)}$ is the generating function of the
1621: arrival distribution $Po(\lambda)$.  
1622: We will define a suitable value $\omega_{0}$ such that:
1623: \begin{enumerate}
1624: \item With high probability \PUR\ does not reject in any of stages $n,
1625: \ldots, n-\omega_{0}$. 
1626: \item \PUR\ accepts ``mostly before or at stage $n-\omega_{0}$'' (i.e. 
1627: the 
1628: probability that \PUR\ accepts after stage $n-\omega_{0}$, given that
1629: $\Phi$ survives this far is $o(1)$). 
\item With high probability, for every $t\in \{n, \ldots, n-\omega_{0}\}$, 
1631: $\Delta_{1,t}^{P}=0$.  
1632: \item At stages $n,\ldots, n-\omega_{0}$, $P_{1,t}$ is ``very close''
1633: to $Q_{t}$, with respect to total variation distance. 
1634: \end{enumerate}
1635:  
1636: This program can be accomplished as described if $c< 3/2$. To prove
1637: Property 4 we make use of Lemmas~\ref{b:h:j} and
1638: \ref{occupancy}. Property 2 is proved only implicitly: in this
1639: case (see \cite{hoel:port:stone}) the probability that $Q_{i}=0$ for
1640: some $i$ tends to one, and, in fact, by a technical result due to
1641: Frieze and Suen (Lemma 3.1 in \cite{frieze-suen}), $\Pr[Q_{i}=0\mbox{ 
1642: for some
1643: }i\geq n - \log n]$ is $1-o(1)$.
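
The fixed-point equation quoted above can be recovered by a standard first-step argument, sketched here for convenience (it is not taken from the cited references, and it assumes that the arrivals at different steps are independent $Po(\lambda)$ variables). Starting from state $1$, the chain serves one client and receives $j$ arrivals with probability $e^{-\lambda}\lambda^{j}/j!$; to reach state $0$ from state $j$ it must descend $j$ levels one at a time, which happens with probability $\rho^{j}$. Therefore
\[
\rho=\sum_{j\geq 0}e^{-\lambda}\frac{\lambda^{j}}{j!}\rho^{j}=e^{\lambda(\rho-1)}.
\]
For $\lambda\leq 1$ the only solution in $[0,1]$ is $\rho=1$, whereas for $\lambda>1$ the relevant solution is the smallest root, which is strictly less than $1$. With $\lambda=2c/3$ the transition occurs at $c=3/2$.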
1644:  
1645:  
1646: Let us now concentrate on the case when $c>3/2$ (the case when $c=3/2$ 
1647: will
1648: follow by a monotonicity argument). In the previous argument we only 
1649: used 
1650: the fact
1651: that $c<3/2$ when deriving the probability that $Q_{t}$ hits state 0,
1652: hence the
arguments from above carry over, and the conclusion is that the
1654: probability that \PUR\ accepts at one of the stages $n,\ldots,
1655: n-\omega_{0}$ differs by $o(1)$ from the probability that $Q_{t}=0$
1656: somewhere in this range. We now, however, have to consider the
1657: probability that \PUR\ accepts at some stage later than $n-\omega_{0}$
1658: and aim to prove that this probability is $o(1)$. It is conceptually 
1659: simpler to divide
1660: the interval $[n-\omega_{0},0]$ into two subintervals, $[n-\omega_{0},
1661: n-\omega_{1}]$ and its complement, such that
1662: w.h.p. $\Phi_{n-\omega_{1}}$ (if defined) contains two opposite unit
1663: clauses, therefore
1664: the probability that \PUR\ accepts after stage $n-\omega_{1}$ is
1665: $o(1)$. In the range $[n-\omega_{0},n-\omega_{1}]$ we would like to
1666: prove that ``most of the time'' $\Delta_{1,t}^{P}$ is zero and
1667: $P_{1,t}$ is ``close'' to $Q_{t}$ and to reduce the problem to the
1668: analysis of $Q_{t}$. Unfortunately there are two problems with this
1669: approach: although the probability that each individual
1670: $\Delta_{1,t}^{P}>0$ is fairly small, to make $\Phi_{n-\omega_{1}}$
1671: unsatisfiable w.h.p., $\omega_{1}$ has to
1672: be $\omega(\sqrt n)$. This implies
1673: that we cannot sum these probabilities over
1674: $[n-\omega_{0},n-\omega_{1}]$ and expect the sum to be $o(1)$; a
1675: similar problem arises if we want to sum the upper bounds for
1676: $d_{TV}(\Delta_{12,t}^{P},Po(\lambda))$. 
1677:  
1678:  
1679: Fortunately there is a way to circumvent this, avoiding the use
1680: of total variation distance altogether: although we cannot guarantee 
1681: that
1682: w.h.p. each $\Delta_{1,t}^{P}=0$, we can arrange that w.h.p. for every
sequence of $p$ consecutive stages $t, t-1, \ldots, t-p+1$, 
1684: $\Delta_{1,t}^{P}+\Delta_{1,t-1}^{P}+\ldots +\Delta_{1,t-p+1}^{P}\leq
1685: 3$ (*). Intuitively, in any sequence of $p$ consecutive steps at most
1686: $p+3$
1687: clients leave the queue, and the number of those who arrive is the sum
1688: of $p$ approximately Poisson variables, thus approximately Poisson
1689: with parameter $p\lambda$. Choosing $p$ large enough so that $\lambda
1690: >1+\frac{3}{p}$ ensures that in any $p$ steps {\em the average number 
1691: of  
1692: customers that arrive is strictly larger than the number of customers
1693: that are served in this time span}. Therefore we will seek to 
1694: approximate
1695: $P_{1,t}$ by a queuing chain $\overline{Q}_{t}$ with this
1696: property. Since $P_{1,n-\omega_{0}}=\overline{Q}_{n-\omega_{0}}$ is
1697: ``large,'' an elementary analysis of the queuing chain implies 
1698: that the probability that 
1699: $\overline{Q}_{t}$ hits state 0 in the interval
1700: $[n-\omega_{0},n-\omega_{1}]$ is exponentially small. So we obtain the
1701: desired result if $\overline{Q}_{t}$ is constructed so that it is
1702: stochastically dominated by $P_{1,t}$. 
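
For a concrete instance of the choice of $p$ (the numbers are illustrative only), take $c=2$, so that $\lambda=2c/3=4/3$. The condition $\lambda>1+\frac{3}{p}$ reads $\frac{3}{p}<\frac{1}{3}$, i.e. $p>9$. With $p=10$, at most $p+3=13$ clients leave the queue in any $10$ consecutive steps, while the expected number of arrivals is approximately
\[
p\lambda=10\cdot\frac{4}{3}\approx 13.3>13 .
\]
As $c$ approaches $3/2$ from above, the required $p$ grows like $3/(\lambda-1)$.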
1703:  
1704: \subsection{The case $c<3/2$} 
1705:   Define $\omega_{0}=n^{0.1}$.
1706: The following are the main steps of the proof in this case:
1707: \vspace{5mm}
1708: 
1709: \begin{lemma}\label{small:p2}  
1710: With probability $1-o(1/poly)$ for every $t\in [n,\ldots , n/2]$ we
1711: have  $$\Delta_{12,t}^{P},\Delta_{02,t}^{P}, \Delta_{12,t}^{N}\leq
1712: \frac{1}{2}n^{0.1}.$$ 
1713: \end{lemma}
1714: \vspace{5mm}
1715: 
1716: \beginproof
Use the coupling with $m_{1}=P_{2,t}$ (respectively $N_{2,t}$),
$m_{2}=cn$, $\tau = 1/t$, and apply the Chernoff bound to
$B(cn,1/t)$.
1720: \endproof
1721: \qed
1722: \vspace{5mm}
1723: 
1724: \begin{corrolary}\label{p:2:t}  
1725: Consider $\omega \leq n/2$. 
1726: If for every $t\in [n,\ldots , n/2]$,
1727: $\Delta_{12,t}^{P},\Delta_{02,t}^{P}, \Delta_{12,t}^{N}\leq 
1728: \frac{1}{2}n^{0.1}$ then, for all $t\in
1729: [n,\ldots, n-\omega]$, $P_{1,t}, N_{1,t}, |P_{2,t}-P_{2,n}|, 
1730: |N_{2,t}-N_{2,n}| <
1731: (n-t)\cdot n^{0.1}$. 
1732: \end{corrolary}
1733: \vspace{5mm}
1734: 
1735: \begin{lemma}\label{small:delta1}
1736: If for all $t\in
1737: [n,\ldots, n-\omega]$, $P_{1,t}, N_{1,t}, |P_{2,t}-P_{2,n}|, 
1738: |N_{2,t}-N_{2,n}| <
1739: (n-t)\cdot n^{0.1}$ holds then 
1740: w.h.p. $\Delta_{1,t}^{P}=0$ for every $t\in [n,\ldots , n-\omega_{0}]$. 
1741: \end{lemma}
1742: \vspace{5mm}
1743: 
1744: \beginproof
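For a fixed stage $t\in [n,\ldots , n-\omega_{0}]$ the hypothesis gives
$P_{1,t}<\omega_{0}n^{0.1}=\omega_{0}^{2}$, so
$$\Pr[B(P_{1,t}-1,\frac{1}{t})>0] = 1-\Pr[B(P_{1,t}-1,\frac{1}{t})=0]= 
1-(1-\frac{1}{t})^{P_{1,t}-1}\leq \frac{P_{1,t}-1}{t}< \frac{\omega_{0}^{2}}{n-\omega_{0}}=O(n^{-0.8}).$$
Summing over the $\omega_{0}+1$ stages in the range bounds the failure
probability by $O(n^{-0.7})=o(1)$.  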
1747: \endproof
1748: \qed
1749: \vspace{5mm}
1750: 
1751: \begin{lemma}\label{p:2:n}
1752: W.h.p., $|P_{2,n}-\frac{2}{3}cn|, |N_{2,n}-\frac{1}{3}cn|  < n^{0.6}$. 
1753: \end{lemma}
1754: \vspace{5mm}
1755: 
1756: \beginproof
1757: Directly from the Chernoff bounds on $\Delta_{n}$ and $P_{2,n}$.
1758: \endproof
1759: \qed
1760: \vspace{5mm}
1761: 
1762: \begin{lemma}\label{p:2:t:poisson}
If the events in the conclusions of Corollary~\ref{p:2:t} and Lemma~\ref{p:2:n} 
1764: hold for 
1765: $\omega = \omega_{0}$, $\epsilon_{1}=1/6$ and $\epsilon_{2}=0.1$, then 
1766: there exists a constant
1767: $r>0$ such that for every $t=n, \ldots,n-\omega_{0}$,
1768: $|\frac{P_{2,t}}{t}-\frac{2}{3}c| \leq r n^{-0.4}$. 
1769: \end{lemma}
1770: \vspace{5mm}
1771: 
1772: \beginproof
1773:  We have
$$\left|\frac{P_{2,t}}{t}-\frac{2}{3}c\right| \leq
P_{2,t}\left|\frac{1}{t}-\frac{1}{n}\right|+
\frac{|P_{2,t}-P_{2,n}|}{n}+ 
\left|\frac{P_{2,n}}{n}-\frac{2}{3}c\right | \leq
P_{2,n}\frac{\omega_{0}}{n(n-\omega_{0})}+\frac{n^{0.2}}{n}+ 
n^{0.6-1},$$
by Corollary~\ref{p:2:t}, Lemma~\ref{p:2:n}, the fact that $P_{2,t}$ is
nonincreasing along the chain (\ref{k:2:markovchain}), and $n-\omega_{0}\leq t\leq n$.
Since $P_{2,n}=O(n)$, each of the three terms is $O(n^{-0.4})$, and the 
result
follows.
1783: \endproof
1784: \qed
1785: \vspace{5mm}
1786: 
1787: \begin{lemma}\label{distance}
1788: If the conclusions of Lemmas~\ref{p:2:t:poisson} and
1789: \ref{small:delta1} are true  then 
1790: $$\sum_{t=n-\omega_{0}}^{n}d_{TV}(P_{1,t},Q_{t})=o(1/\omega_{0}).$$
1791: \end{lemma}
1792: \vspace{5mm}
1793: 
1794: \beginproof
1795: By
1796: Lemma~\ref{p:2:t:poisson} and the inequalities on
1797: total variation distance there exist $r_{1},r_{2}>0$ such that 
1798: \begin{eqnarray*}
1799: d_{TV}(\Delta_{12,t}^{P},Po(\lambda)) & \leq & 
1800: d_{TV}\left(\Delta_{12,t}^{P}, Po\left(\frac{P_{2,t}}{t}\right)\right)+
1801: d_{TV}\left(Po\left(\frac{P_{2,t}}{t}\right) 
1802: ,Po\left(\frac{2}{3}c\right)\right) \\ & \leq & r_{1}\frac{1}{t}+r_{2}n^{-0.4}\leq 
1803: r_{3}n^{-0.4}, 
1804: \end{eqnarray*} where $r_{3}=r_{1}+r_{2}$. Employing
1805: Lemma~\ref{approximation} it follows that
$$\sum_{t=n-\omega_{0}}^{n}d_{TV}(P_{1,t},Q_{t})\leq
r_{3}\sum_{t=n-\omega_{0}}^{n}(n-t)n^{-0.4}\leq
r_{3}n^{-0.4}\omega_{0}^{2},$$ and this amount is 
$o(1/\omega_{0})$. 
1810: \endproof
1811: \qed
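
To make the final estimate explicit (a worked computation with $\omega_{0}=n^{0.1}$; the constant $r$ below is generic): a bound of the form $r\,n^{-0.4}\omega_{0}^{2}$ satisfies
\[
\omega_{0}\cdot r\,n^{-0.4}\omega_{0}^{2}=r\,n^{0.3-0.4}=r\,n^{-0.1}\goesto 0,
\]
which is exactly the statement that the bound is $o(1/\omega_{0})$.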
1812: \vspace{5mm}
1813: 
1814: \begin{observation}
1815: The probability that the conditions in the previous lemma are not
1816: fulfilled is at most $\omega_{0}^{4}/n = n^{-0.6}$. Indeed, 
1817: the events that ensure the applicability of the previous lemma are: 
1818: \begin{enumerate}
1819: \item for every $t\in [n,\ldots , n/2]$, 
1820: $\Delta_{12,t}^{P},\Delta_{02,t}^{P}, \Delta_{12,t}^{N}\leq
1821: \frac{1}{2}n^{0.1}$,
1822: \item for all $t\in
1823: [n,\ldots, n-\omega_{0}]$, $\Delta_{1,t}^{P}=0$, and
\item  $|P_{2,n}-\frac{2}{3}cn|, |N_{2,n}-\frac{1}{3}cn| < n^{0.6}$.
1825: \end{enumerate}
1826: The first and the third events have probability $1-o(1/poly)$ (as they
1827: come from applying Chernoff bounds). The second fails (for a specific
$t$) with probability at most $\frac{P_{1,t}}{t}\leq 
\omega_{0}^{2}/(n-\omega_{0})$, so by the union bound its total failure
probability is at most $(\omega_{0}+1)\cdot
\omega_{0}^{2}/(n-\omega_{0})$. Both terms can be absorbed into
1832: $\omega_{0}^{4}/n$. 
1833: 
1834: \end{observation}
1835: \vspace{5mm}
1836:  
1837: 
1838: \begin{lemma}\label{no:reject}
If the event in Corollary~\ref{p:2:t} holds then 
1840: w.h.p. \PUR\ does not reject at stage $t$, for every $t$ in the range 
1841: $n$, $n-1, \ldots, n-\omega_{0}$, given that $\Phi$ survives up to
1842: this stage.   
1843: \end{lemma}
1844: \vspace{5mm}
1845: 
1846: \beginproof
1847: To prove Lemma~\ref{no:reject} we show that, 
1848: with high probability the unit clauses of each $\Phi_{t}$ involve
1849: different variables. This can be seen as follows: consider
1850: $P_{1,t}+N_{1,t}$ balls to be thrown into $t$ urns. The probability
1851: that two of them arrive in the same urn is at most
1852: ${{P_{1,t}+N_{1,t}}\choose {2}}\cdot \frac{1}{t}$. This is upper 
1853: bounded
1854: by $\frac{(\omega_{0}n^{0.1})^{2}}{2(n-\omega_{0})}$. Summing this for 
1855: $t=n, \ldots,
1856: n-\omega_{0}$ yields an upper bound, which is $o(1)$. 
1857: \endproof
1858: \qed
1859: \vspace{5mm}
1860: 
1861: The proof for the case $c<3/2$ follows easily from these results: 
1862: with probability $1-o(1)$ all the events in Lemmas~\ref{small:p2}, 
1863: \ref{p:2:t}, \ref{small:delta1}, \ref{p:2:n}, \ref{distance}, and 
1864: \ref{no:reject} take place, therefore 
1865: \PUR\ does not reject at any of the stages $n$ to $n-\omega_{0}$ and 
1866: $P_{1,t}$ is close to $Q_{t}$ in the sense of
1867: Lemma~\ref{distance}. Therefore the probability that for some $t$ in
1868: this range $P_{1,t}=0$ (i.e. \PUR\ accepts) differs by $o(1)$ from the
1869: corresponding probability for $Q_{t}$. But according to the result by
1870: Frieze and Suen \cite{frieze-suen} this latter probability is $1-o(1)$.
1871: \endproof
1872: 
1873:  
1874: \subsection{The case $c>3/2$}
1875: Define $\omega_{1}=n^{0.51}$. The following
1876: are the auxiliary results we use in this case:
1877: 
1878: \vspace{5mm}
1879: 
1880: \begin{lemma}\label{trick:delta} Let $A=n^{0.61}$. 
For every $k>0$ there exists a constant $c_{k}>0$ such that for every 
$r>0$ the probability that there exists $t\in [n-\omega_{0},
n-\omega_{1}]$ with $\Delta_{1,t}^{P}+ \Delta_{1,t-1}^{P}+\ldots +
\Delta_{1,t-r+1}^{P}\geq k$ is at most 
$c_{k}(\omega_{1}-\omega_{0})(rA/n)^{k}$. 
1886: \end{lemma}
1887: \vspace{5mm}
1888: 
1889: \beginproof
1890: By Corollary~\ref{p:2:t} we can assume that $P_{1,t}\leq A$. Then for 
1891: every $i$, 
1892: 
\[\Pr[\Delta_{1,t}^{P}\geq i]=\Pr[B(P_{1,t}-1,\frac{1}{t})\geq i]\leq
\Pr[B(A,\frac{1}{t})\geq i]
\]

\[ = 1 -
\sum_{j=0}^{i-1}{{A}\choose{j}}\left(\frac{1}{t}\right)^{j}\left(1-\frac{1}{t}\right)^{A-j}
\]

\[ \leq {{A}\choose{i}}\left(\frac{1}{t}\right)^{i},
\]
where the last step is a union bound over the ${{A}\choose{i}}$ sets of
$i$ trials that could all succeed. 
The event $\Delta_{1,t}^{P}+ \Delta_{1,t-1}^{P}+\ldots +
\Delta_{1,t-r+1}^{P}\geq k$ happens when:
\begin{itemize}
\item one of the terms is at least $k$, or
\item one of the terms is at least $k-1$, and another one is at
least 1,  or
\item \ldots
\item at least $k$ of the terms are at least one. 
\end{itemize}
(a finite number of possibilities). Applying the previous inequality 
to each possibility, using ${{A}\choose{i}}t^{-i}\leq (A/t)^{i}$ with
$t=n-o(n)$, and taking a union bound over the starting stages $t$ in the
range proves the lemma (recall that $r$ and $k$ are fixed).
1915: \endproof
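
To see what the bound gives for the parameters used here (a worked computation, assuming $r$ is a fixed constant such as the window length $p$), substitute $A=n^{0.61}$ and $\omega_{1}=n^{0.51}$:
\[
c_{k}(\omega_{1}-\omega_{0})\left(\frac{rA}{n}\right)^{k}\leq c_{k}r^{k}\,n^{0.51-0.39k}.
\]
This is $O(n^{-0.27})$ for $k=2$ and $O(n^{-1.05})$ for $k=4$. In particular, with $r=p$ and $k=4$, the event complementary to condition (*), in which some window of $p$ stages has sum at least $4$, has probability $o(1/n)$.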
1916: \vspace{5mm}
1917: 
1918: To flesh out the argument outlined before we construct a
1919: succession of Markov chains running along $P_{1,t}$, 
1920: that provide better and better ``approximations'' to 
1921: $\overline{Q}_{t}$. 
1922: Our use of indices will be slightly nonstandard (to reflect
1923: the connection with $P_{1,t}$), in that 
1924: the sequence of indices starts with $n-\omega_{0}$ and is decreasing.
1925: \vspace{5mm}
1926:  \begin{definition}
1927: Let
1928: $X_{n-\omega_{0}}=Y_{n-\omega_{0}}=Z_{n-\omega_{0}}=\overline{Q}_{n-\omega_{0}}=
1929: P_{1,n-\omega_{0}}$
1930: and 
1931: \begin{equation}\label{sequences}
1932: \left \{\begin{array}{l}
         X_{t-1}=X_{t}-(p+3)\chi_{p{\bf Z}+1}(n-\omega_{0}-t)+\Delta_{12,t}^{P},\\
         Y_{t-1}=Y_{t}-(p+3)\chi_{p{\bf Z}+1}(n-\omega_{0}-t)+B(P_{2,n-\omega_{1}}, 1/t),\\
         Z_{t-1}=Z_{t}-(p+3)\chi_{p{\bf Z}+1}(n-\omega_{0}-t)+B(P_{2,n-\omega_{1}},\frac{1}{n}),\\
        \overline{Q}_{t-1}=\overline{Q}_{t}-1+B(p\lfloor \frac{P_{2,n-\omega_{1}}}{p+3}\rfloor,\frac{1}{n}).\\
1945:         \end{array}
1946: \right.
1947: \end{equation}
1948: \end{definition}    
1949:   
Let $\alpha = \Pr[(\exists t\in [n-\omega_{0},n-\omega_{1}]): P_{1,t}=0]$. 
1951: Note 
1952: that the amount $p+3$
1953: is subtracted from $X_{t}, Y_{t}, Z_{t}$ exactly once in every
1954: $p$ consecutive steps, so 
1955: whenever the condition (*) is satisfied it holds that $X_{t}\leq
1956: P_{1,t}$ for every $t\in [n-\omega_{0},n-\omega_{1}]$. By coupling
1957: $\Delta_{12,t}^{P}(= B(P_{2,t}, 1/t))$ with $B(P_{2,n-\omega_{1}},1/t)$ 
1958: we
1959: deduce that we can couple $X_{t}$ and $Y_{t}$ so that $Y_{t}\leq
1960: X_{t}$. We can also couple $Y_{t}$ and $Z_{t}$ such that $Z_{t}\leq 
1961: Y_{t}$. 
1962: Finally, notice that we can couple $Z_{n-\omega_{0}-jp}$ and
1963: $\overline{Q}_{n-\omega_{0}-j(p+3)}$ such that
1964: $\overline{Q}_{n-\omega_{0}-j(p+3)}\leq  Z_{n-\omega_{0}-jp}$. 
So an upper bound on $\alpha$ is $\Pr[(\exists t\in [0,n-\omega_{0}]):
\overline{Q}_{t}=0]$. With high probability the binomial distribution
in the definition of the chain $\overline{Q}_{t}$ has mean
strictly greater than
one (because the flow from $P_{2,t}$ is approximately Poisson), and 
1970: $\overline{Q}_{n-\omega_{0}}=\Omega(\omega_{0})$, 
1971: therefore, by an elementary property of the queuing chain, the 
1972: probability that $\overline{Q}_t$ hits state 0 is exponentially
1973: small. This yields the desired conclusion, that $\alpha =o(1)$.   
1974:  
1975: One word about the way to prove the fact that $\Phi_{n-\omega_{1}}$ is
1976: unsatisfiable (if defined): one can prove that w.h.p. both
1977: $P_{1,n-\omega_{1}}$ and $N_{1,n-\omega_{1}}$ are
$\Omega(\omega_{1})$. By the uniformity statement of Lemma~\ref{k:2:recurrence} 
1979: we are left with the following instance of the occupancy problem:
1980: there are 
1981: $P_{1,n-\omega_{1}}$ white balls, $N_{1,n-\omega_{1}}$ black balls and
1982: $n-\omega_{1}$ bins. The desired fact now follows from the second part
1983: of Lemma~\ref{occupancy}. 
1984:  
1985: 
1986: \section{Proof of Theorem~\ref{k:3etc}}
1987: 
1988: 
1989: Theorem~\ref{k:3etc} is proved along lines very similar to the proof of
1990: Theorem~\ref{k:2}. The basis is the following generalization of
1991: Lemma~\ref{k:2:recurrence}: 
1992: \vspace{5mm}
1993: 
1994: \begin{lemma}\label{k:3etc:recurrence}
1995: Suppose that $\Phi$ survives up to stage $t$. Then, conditional on the
1996: values 
1997: $(P_{1,t}, N_{1,t},\ldots, P_{k,t}, N_{k,t})$, the clauses in  
1998: $\Phi_{1,t}^{P},
1999: \Phi_{1,t}^{N},\ldots,  \Phi_{k,t}^{P}, \Phi_{k,t}^{N}$ are chosen 
2000: uniformly
2001: at random and are independent. Also, conditional on the
2002: fact that $\Phi$ survives stage $t$ as well, the following recurrences
2003: hold:
2004: \begin{equation}\label{k:3etc:markovchain}
2005: \left \{\begin{array}{l}
2006:          P_{1,t-1}=P_{1,t}-1-\Delta_{01,t}^{P}+\Delta_{12,t}^{P}, \\        
2007:          N_{1,t-1}=N_{1,t}+\Delta_{12,t}^{N},                    \\
         P_{i,t-1}=P_{i,t}-\Delta_{0i,t}^{P}-\Delta_{(i-1)i,t}^{P}+\Delta_{i(i+1),t}^{P}\mbox{, for }i=\overline{2,k},  \\
         N_{i,t-1}=N_{i,t}-\Delta_{(i-1)i,t}^{N}+\Delta_{i(i+1),t}^{N}\mbox{, for }i=\overline{2,k},  \\
2014:         \end{array}
2015: \right.
2016: \end{equation}
2017: where (in distribution)
2018: \begin{equation}\label{k:3etc:distribution}
2019: \left \{\begin{array}{l}
2020: \Delta_{01,t}^{P} =B(P_{1,t}-1,1/t),\\
2021: \Delta_{(i-1)i,t}^{P}=B(P_{i,t},(i-1)/t),\\
2022: \Delta_{0i,t}^{P}=B(P_{i,t}-\Delta_{(i-1)i,t}^{P},1/t),\\
2023: \Delta_{(i-1)i,t}^{N}=B(N_{i,t},i/t),\\
2024: \Delta_{k(k+1),t}^{P}=\Delta_{k(k+1),t}^{N}=0.\\
2025: \end{array}
2026: \right.
2027: \end{equation}
2028: \end{lemma}
2029: \vspace{5mm}
2030: 
2031: \beginproof
2032: 
The uniformity condition and the justification of the recurrences are 
entirely analogous to those in Lemma~\ref{k:2:recurrence}. 
The additional technical complication is that now there is a ``positive 
flow 
into $P_{2,t}, N_{2,t}$.'' 
2038: \endproof
2039: \qed
2040: \vspace{5mm}
2041: 
2042: \begin{lemma}
2043: With high probability it holds that 
2044: 
2045: \[
2046: P_{i,t}=(1+o(1))\cdot \frac{c}{n}\cdot \lambda_{k}\cdot i\cdot 
2047: {{t}\choose
2048: {i}}\cdot S^{n+1-t}_{k-i},
2049: \]
2050:  and 
2051: \[
2052: N_{i,t}=(1+o(1))\cdot \frac{c}{n}\cdot \lambda_{k}\cdot {
2053: {t}\choose
2054: {i}}\cdot S^{n+1-t}_{k-i},
2055: \]
2056: for every $i\geq 2$, and uniformly on $t=n-o(n)$. 
2057: \end{lemma}
2058: \vspace{5mm}
2059: 
2060: \beginproof
2061: 
2062: Let us first heuristically derive a formula for $x_{i,t}$, $y_{i,t}$, 
2063: the expected values of $P_{i,t}$, $N_{i,t}$,
2064: obtained by replacing the binomial distributions in the equations by
2065: their expected values. 
2066: 
2067: We have: 
2068: \begin{equation}\label{k:3etc:markovchain:avg}
2069: \left \{\begin{array}{l}
2070:                  x_{i,t-1}=x_{i,t}- 
2071: \frac{x_{i,t}}{t}-\frac{(i-1)x_{i,t}}{t}+\frac{ix_{i+1,t}}{t}\mbox{, for }i=\overline{2,k},  \\
2072:          
2073: y_{i,t-1}=y_{i,t}-\frac{iy_{i,t}}{t}+\frac{(i+1)y_{(i+1),t}}{t}\mbox{,\ \
2074:  \  for }i=\overline{2,k},  \\
2075:         \end{array}
2076: \right.
2077: \end{equation}
2078: Rearranging terms the recurrences become 
2079: \begin{equation}\label{k:3etc:markovchain:avg:simple}
2080: \left \{\begin{array}{l}
2081:                  
2082: x_{i,t-1}=x_{i,t}(1-\frac{i}{t})+x_{i+1,t}\frac{i}{t}\mbox{, for }i=\overline{2,k},  \\
2083:          
2084: y_{i,t-1}=y_{i,t}(1-\frac{i}{t})+y_{(i+1),t}\frac{(i+1)}{t}\mbox{,\ \
2085:  \  for }i=\overline{2,k}. \\
2086:         \end{array}
2087: \right.
2088: \end{equation}
2089: Also, 
2090: \begin{equation}\label{k:3etc:markovchain:begin}
2091: \left \{\begin{array}{l}
2092:          x_{i,n}= \frac{i{{n}\choose{i}}}{H_{k}}\cdot
2093:          c\lambda_{k}\cdot \frac{H_{k}}{n}=
2094:          \frac{c}{n}\lambda_{k}\cdot i{{n}\choose{i}},\\
2095:          y_{i,n}= \frac{{{n}\choose{i}}}{H_{k}}\cdot c\lambda_{k}\cdot
2096:          \frac{H_{k}}{n}= \frac{c}{n}\lambda_{k}\cdot
2097:          {{n}\choose{i}}.\\
2098: \end{array}
2099: \right.
2100: \end{equation}
2101: A simple induction shows that these expected
2102: values are $x_{i,t}= \frac{c}{n}\cdot \lambda_{k}\cdot i\cdot 
2103: {{t}\choose
2104: {i}}\cdot S^{n+1-t}_{k-i}$, and $y_{i,t}= \frac{c}{n}\cdot 
2105: \lambda_{k}\cdot {
2106: {t}\choose
2107: {i}}\cdot S^{n+1-t}_{k-i}$. 
2108: 
2109: 
The concentration property can be proved inductively, starting from
$i=k$ and working down to $i=3$, by noting that the expected values of the
binomial terms in the recurrence are
$\omega(n)$, hence, by the Chernoff bound,
 the probability that any of them
deviates significantly from its expected value is exponentially
small. 

Almost the same argument holds for 
$P_{2,t}$ and for $N_{2,t}$. 
2120: The only amounts to be handled differently are ``the
2121: clause flows out of $P_{2,t}, N_{2,t}$,'' but they are approximately
2122: Poisson distributed, hence ``small'' with high probability by 
2123: Proposition~\ref{chernoff:poisson}. Therefore $P_{2,t}=(1+o(1)) \frac{c}{n}\cdot
2124: \lambda_{k}\cdot 2\cdot {{t}\choose {2}}\cdot S^{n+1-t}_{k-2}$. 
2125: \endproof
2126: \qed 
2127: 
The previous lemma implies that $\Delta_{12,t}^{P}\sim Po(c\cdot
\lambda_{k}\cdot S^{n+1-t}_{k-2})$ (for $t=n-o(n)$); thus in this range 
$P_{1,t-1}\sim P_{1,t}-1+Po(c\cdot
\lambda_{k}\cdot S^{n+1-t}_{k-2})$. The proof follows exactly the same
pattern as in the case $c<3/2$ for $k=2$: the conclusion for the stages
$[n,n-\omega_{0}]$ is that the probability that $P_{1,t}$ is zero 
somewhere in this range differs by $o(1)$ from the corresponding
probability for the queuing chain in (\ref{eq:3etc}). The fact that the
stages after $[n,n-\omega_{0}]$ contribute only $o(1)$ to the
final accepting probability follows because it is
possible to couple the Markov chain $M_{1}$, describing the evolution of
\PUR\ on a random $k$-SAT formula, with the chain $M_{2}$ that runs on the 2-CNF
2140: component of the formula, such that for every $t$ we have 
2141: $P_{1,t}^{M_{2}}\leq
2142: P_{1,t}^{M_{1}}$. Perhaps the most intuitive way to see this coupling
2143: is to ``paint'' the initial clauses of the formula having size at most
2144: two in red, and the other clauses in blue. At every step $t$
2145: $P_{1,t}^{M_{2}}$ will count only red clauses having unit size at step
2146: $t$, while $P_{1,t}^{M_{1}}$ will count clauses of both colors. 
2147: 
2148: Given the stochastic domination, the desired result follows from the
2149: corresponding proof in the case $k=2$.  
2150: \qed
2151: 
2152: \section{Proof of Proposition~\ref{annealed}}
2153: 
2154: The idea of the proof is to consider \PUR\ on a random at-most-$k$-Horn 
2155: formula $\Phi$ with 
2156: $\hat{c}\cdot \frac{H_{k}}{n}$ clauses and prove that there exists a
2157: function $\phi(k)$ with $\lim_{k\goesto \infty}\phi(k)=0$ such
2158: that 
2159: \[
2160: \lim_{n\goesto \infty} 
2161: \Pr[\PUR\ \mbox{ accepts in at least } k\mbox{ steps }]\leq \phi(k). 
2162: \]
Indeed, from the previous proof it follows that $\lim_{n\goesto
\infty}\Pr[\PUR\ \mbox{ accepts in }\geq k\mbox{ steps }]$ is governed by 
the chain defined by the recurrence
\begin{equation}\label{rec}
x_{t+1}= x_{t}-1+Po(\hat{c}\cdot S^{t+1+k}_{k-2}),
\end{equation}
2169: where
2170: \[
2171: x_{0}= P_{1,k}\geq 1.
2172: \]
2173: We define $\phi(k)$ to be the probability that the sequence in the
2174: recurrence (\ref{rec}) hits zero. Trivially $\lim_{k\goesto
2175: \infty} S^{k+1}_{k-2}=\infty$, so the expected values of the Poisson
2176: distributions in (\ref{rec}) can be made larger than any given constant
$\lambda$. Using the fact that the sum of two independent Poisson random variables
with parameters $a$ and $b$ has a Poisson distribution with parameter
$a+b$ it follows that, for large enough $k$, one can couple $x_{t}$ 
with the queuing chain
\begin{equation}\label{rec2}
y_{t+1}= y_{t}-1+Po(\lambda),
\end{equation}
2185: 
2186: \[
2187: y_{0}= 1,
2188: \]
2189: such that $y_{t}\leq x_{t}$. It follows that, for large $k$,
2190: $\phi(k)\leq \Pr[\mbox{ the chain $y_{t}$ hits state
2191: zero}]$. Since $\lambda$ was arbitrary, it follows that $\lim_{k\goesto
2192: \infty}\phi(k)=0$. 
2193: 
2194: Now consider a random {\em uniform} Horn formula $\Phi$ with 
2195: $\hat{c}\cdot \frac{H_{n}}{n}$ clauses, and let $\overline \Phi$ be
2196: its subformula consisting of clauses of size at most $k$. It is easily 
2197: seen
2198: that the behavior of \PUR\ on the first $k-1$ steps depends only on
2199: the clauses of $\overline \Phi$, so 
2200: \[
2201: \Pr[\PUR\ \mbox{ accepts }\Phi\mbox{ in less than }k\mbox{ 
2202: steps}]=\Pr[\PUR\ \mbox{ accepts }\overline
2203:  \Phi\mbox{ in less than }k\mbox{ steps}].
2204: \]
2205: On the other hand we have
2206: 
2207: \[0\leq \Pr[\PUR\ \mbox{ accepts }\Phi\mbox{ in at least }k\mbox{ 
2208: steps}]\leq \Pr[\PUR\ \mbox{ accepts }\overline
2209:  \Phi\mbox{ in at least }k\mbox{ steps}].
2210: \]
2211: The fact that ``$\overline
2212:  \Phi$ is close to a random formula in $\Omega(n,k,c\cdot
2213:  \frac{H_{k}}{n})$'' (see the discussion in 
2214: Observation~\ref{obs:coupling})
2215: implies that
2216: the right-hand side
2217: term can be made less than any fixed constant $\epsilon$ (for $n,k$
2218: big enough). It follows that 
2219: 
2220: \[
2221: |\Pr[\PUR\ \mbox{ accepts }\Phi]-\Pr[\PUR\ \mbox{ accepts }\overline
2222:  \Phi]|\leq 2\cdot \epsilon,
2223: \]
2224: for large enough values of $n,k$. This immediately implies the desired 
2225: result.
2226: \endproof
2227: \qed
2228: 
2229: 
2230: \section{Proof of Theorem~\ref{k:2:runtime}}
Theorem~\ref{k:2:runtime} is based on the
proof of Theorem~\ref{k:2} 
and an elementary property of the queuing chain $Q_{t}$
(the expected time to hit state zero, conditional on actually hitting
it, has the desired form). 

The crucial point is to prove that the events in which any of the
conditions we have employed in our analysis fails have a negligible
effect on the running time.

This is easy to see for stages smaller than $n-\omega_{0}$: since the
probabilities that the various steps of the analysis fail 
are either exponentially small or can be made $o(1/n)$ (by choosing a
large enough $k$ in Lemma~\ref{trick:delta}),  
the probability that $P_{1,t}$ hits state zero after
stage $n-\omega_{0}$ is $o(1/n)$, therefore its influence on
the average running time of \PUR\ is $o(1)$. 
The corresponding observation  is not true for stages before
$n-\omega_{0}$, but these stages can be handled directly, using the
statement from 
Lemma~\ref{distance}. 
2252:  
2253: \endproof
2254: \qed
2255: 
2256:  
2257: \section{Random Horn satisfiability as a mean-field 
2258: approximation}\label{section:4}
2259: 
What we have shown so far is that (under a suitable rescaling)
the probability graphs for random at-most-$k$ Horn
satisfiability converge to the graph for random Horn satisfiability. 
To argue that our results display critical behavior, we
have to show that this latter probability $p_{\infty}$ 
is indeed the one predicted by some mean-field
approximation. 
2267: 
2268: 
In the sequel 
we will show that this is indeed the case. However, the mean-field
approximation is {\em not} the one from \cite{kirkpatrick:selman:scaling};
it incorporates a correction specific to the
properties of random Horn satisfiability. 
2274: 
2275: Let us
2276: first see that it is not accurate if no correction is taken into
2277: account. Indeed, were it true we would have 
2278: 
2279: \[ 
\lim_{n\goesto \infty} \Pr[\Phi\in \HSAT]= 1 - \lim_{n\goesto
2281: \infty}\prod_{A\in \{0,1\}^{n}} \left(1-\Pr[A \models \Phi]\right). 
2282: \]
2283: Since, for an assignment $A$ of Hamming weight $i$ there are exactly
2284: $2^{i}-1+(n-i)\cdot 2^{i}$ Horn clauses that $A$ falsifies, we have
2285: 
2286: \[
2287: \Pr[A \models \Phi]= \left(1 - \frac{2^{i}-1+(n-i)\cdot 
2288: 2^{i}}{(n+2)\cdot
2289: 2^{n}-1}\right)^{c\cdot 2^{n}},
2290: \]
2291: so the mean-field prediction reads
2292: 
2293: \[\lim_{n\goesto \infty} \Pr[\Phi\in \HSAT]=1- \lim_{n\goesto 
2294: \infty}\prod_{j=0}^{n}\left(1-\left(1 - \frac{2^{j}-1+(n-j)\cdot 
2295: 2^{j}}{(n+2)\cdot 2^{n}-1}\right)^{c\cdot 2^{n}}\right)^{{{n}\choose {j}}}. 
2296: \]
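
The count of falsified clauses used above can be checked directly. The
following is only a sketch, under the assumption (not restated in this
section) that a Horn clause consists of a set of negated variables
together with at most one positive literal. An assignment $A$ of
Hamming weight $i$ falsifies a clause exactly when every negated
variable is set to one by $A$ and the positive literal, if present, is
on a variable set to zero. Splitting according to whether a positive
literal is present gives
\[
\left(2^{i}-1\right) + (n-i)\cdot 2^{i},
\]
where the first term counts the nonempty sets of negated variables
among the $i$ variables set to one, and the second term counts a
choice of positive literal among the $n-i$ variables set to zero
together with an arbitrary set of negated variables among the $i$
variables set to one. For instance, $i=1$ gives $1+2\cdot (n-1)$.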
2297: 
All terms in the product are less than 1. Since the term corresponding
to $j=1$, namely $\left(1-\left(1 - \frac{1+2\cdot (n-1)}{(n+2)\cdot
2^{n}-1}\right)^{c\cdot 2^{n}}\right)^{n}$, has limit 0, the product
tends to 0, and the mean-field prediction
would imply that $\lim_{n\goesto \infty} \Pr[\Phi\in \HSAT]=1$.
On the other hand, let us observe that if we drop
the exponent ${{n}\choose
{j}}$ in the product we obtain the right
result: it is a simple but tedious task to prove that
2307: 
2308: \[\lim_{n\goesto \infty} \prod_{j=0}^{n}
2309: \left(1-\left(1 - \frac{2^{j}-1+(n-j)\cdot 2^{j}}{(n+2)\cdot 
2310: 2^{n}-1}\right)^{c\cdot
2311: 2^{n}}\right)=  \prod_{j=0}^{\infty}\left(1-e^{-c \cdot 2^{j}}\right). 
2312: \]
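
A sketch of why each factor converges for fixed $j$ may be helpful
(the uniform control over all $j\leq n$ needed to exchange the limit
with the product is the tedious part, and is not reproduced here).
Write $a_{n,j}= \frac{(n-j+1)\cdot 2^{j}-1}{(n+2)\cdot 2^{n}-1}$, a
shorthand introduced only for this remark; it equals the fraction in
the display above, since $2^{j}-1+(n-j)\cdot 2^{j}=(n-j+1)\cdot
2^{j}-1$. Because $a_{n,j}\goesto 0$ and $\ln(1-a)=-a\,(1+O(a))$ as
$a\goesto 0$, we have
\[
\left(1-a_{n,j}\right)^{c\cdot 2^{n}}= \exp\left(-c\cdot 2^{n}\,
a_{n,j}\,(1+O(a_{n,j}))\right), \qquad
c\cdot 2^{n}\, a_{n,j}= c\cdot \frac{(n-j+1)\cdot
2^{j}-1}{n+2-2^{-n}}\goesto c\cdot 2^{j},
\]
so the $j$-th factor tends to $1-e^{-c\cdot 2^{j}}$.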
2313: 
Intuitively, this means that ``there exists a correction of the
 mean-field approximation that only considers a single assignment of
each
 weight, and is accurate.'' The following simple result makes
 the above intuition precise:
2319: 
2320: \begin{lemma}
2321: Suppose $\Phi$ is given as a union of 
2322: formulas $\Phi_{1}, \ldots, \Phi_{n}$, where $\Phi_{i}$ contains all
2323: clauses of length {\em exactly} $i$. Then there is a set 
2324: $T=\{T_{0}, \ldots, T_{n-1}\}$ of assignments, with {\em $T_{i}$ of 
2325: Hamming
2326: weight exactly $i$ and depending
2327: only on $\Phi_{1}\cup \ldots \cup \Phi_{i+1}$}, 
2328: such that $\Phi$ is satisfiable if and only if it is
2329: satisfied by some assignment in $T$. 
2330: \end{lemma}
2331: 
2332: \beginproof
2333: 
2334: Let $\overline{y_{1}\ldots y_{k}}$ denote the assignment that makes 
2335: $y_{1}=\ldots = y_{k}=1$, and all the other variables equal to zero. 
2336: 
2337: The set $T$ has two parts: the first is simply the set of 
2338: assignments implicitly examined by the algorithm \PUR\ in testing
2339: satisfiability. That is, if $x_{1}, \ldots, x_{k}$ are the variables
2340: assigned by \PUR\  in this order, the first part includes the
2341: assignments $00000$, $\overline{x_{1}}, \ldots,\overline{x_{1},\ldots,
2342: x_{k}}$. The second part contains a random assignment for each 
2343: remaining weight. 
2344: \endproof
2345: \qed
The result has a ``mean-field'' interpretation: as before, define
$f(x_{1}, \ldots, x_{n})= 1-
\prod_{i=1}^{n} x_{i}$, and let
$g_{k}[\Phi]$ be the indicator function for the event ``$T_{k}
\not \models \Phi$, given that event $\overline{A}_{n}\AND \ldots \AND
\overline{A}_{n-k+1}$ happens,'' normalized by the probability of the
conditioning event, i.e.,
2352: 
2353: \[g_{k}[\Phi]= \frac{1}{\Pr[\overline{A}_{n}\AND \ldots \AND
2354: \overline{A}_{n-k+1}]}\cdot \left \{\begin{array}{ll}
2355:                  1, & \mbox{ if } T_{k} \not \models \Phi \AND 
2356: \overline{A}_{n}\AND \ldots \AND
2357: \overline{A}_{n-k+1}
2358:                  \\
2359:  
2360:                  0,  & \mbox{ otherwise.}\\ 
2361:         \end{array}
2362: \right.
2363: \]
2364: We have 
2365: 
2366: \[
2367: E[g_{k}[\Phi]]= \Pr[\overline{A_{n-k}}|\overline{A}_{n}\AND \ldots \AND
2368: \overline{A}_{n-k+1}].
2369: \]
Indeed, $g_{k}[\Phi]\neq 0$ exactly when $R_{n}\OR \ldots \OR
R_{n-k+1}$ or $T_{k}\not \models \Phi \AND S_{n}\AND
\ldots \AND S_{n-k+1}$. The second event is equivalent to
$\overline{A_{n-k}}\AND S_{n}\AND \ldots \AND S_{n-k+1}$, hence we have
$g_{k}[\Phi]\neq 0$ exactly when $\overline{A_{n-k}}\AND
\overline{A}_{n}\AND \ldots \AND\overline{A}_{n-k+1}$ holds.
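
Written out as a computation, with $B_{k}$ denoting
$\overline{A}_{n}\AND \ldots \AND \overline{A}_{n-k+1}$ (a shorthand
used only in this remark), and taking for granted the equivalence
stated above between the support of $g_{k}[\Phi]$ and the event
$\overline{A_{n-k}}\AND B_{k}$, the expectation reads
\[
E[g_{k}[\Phi]]= \frac{\Pr[g_{k}[\Phi]\neq 0]}{\Pr[B_{k}]}=
\frac{\Pr[\overline{A_{n-k}}\AND B_{k}]}{\Pr[B_{k}]}=
\Pr[\overline{A_{n-k}}|B_{k}].
\]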
2376: 
2377: Thus we have, by the discussion in the previous chapter, 
2378: \[ 
2379: f(E[g_{1}[\Phi]], \ldots,E[g_{n}[\Phi]])= 
2380: 1-\prod_{k=0}^{n}\Pr[\overline{A_{n-k}}|\overline{A}_{n}\AND \ldots \AND
2381: \overline{A}_{n-k+1}]= \Pr[\Phi \in \HSAT]. 
2382: \]
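
The product in the display above can be collapsed by the chain rule
for conditional probabilities. As a sketch, for any events $B_{0},
\ldots, B_{m}$ with $\Pr[B_{0}\AND \ldots \AND B_{m-1}]>0$ (generic
names, not tied to the notation of this section),
\[
\prod_{k=0}^{m}\Pr[B_{k}|B_{0}\AND \ldots \AND B_{k-1}]=
\Pr[B_{0}\AND \ldots \AND B_{m}],
\]
where the factor for $k=0$ is the unconditional probability
$\Pr[B_{0}]$. Applied with $B_{k}=\overline{A_{n-k}}$, the product
equals $\Pr[\overline{A}_{n}\AND \ldots \AND \overline{A}_{0}]$.
Identifying its complement with $\Pr[\Phi \in \HSAT]$ relies on the
earlier discussion and is not rederived here.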
2383: 
2384: The above correction seems
2385: to be specific to the random model for Horn satisfiability, which
2386: allows clauses of varying lengths.
2387: 
2388: To sum up: {\em the mean-field approximation is true, modulo a
2389: correction that takes into account some particular features of the
2390: random model for Horn satisfiability}. 
2391: 
2392: \section{Discussion}
2393: 
We have characterized the asymptotic
satisfiability probability of a random $k$-Horn formula, and
shown that it exhibits behavior very similar to that uncovered
experimentally in \cite{kirkpatrick:selman:scaling}.
2398: 
2399: We have also displayed an ``easy-hard-easy'' pattern similar to the
2400: ones observed experimentally in the AI literature. In our case the 
2401: pattern is fully explained by elementary properties of the queuing 
2402: chain. 
2403: 
2404: As for an explanation of the ``critical behavior'', 
2405: consider an intermediate stage $i$ of \PUR\ and 
2406: let $C_{j}$ be the set of clauses of $\Phi_{i,j}^{P}$.  
Whether \PUR\ accepts clearly depends
only on the number of clauses in $C_{1}$. The
restriction on the clause length acts like a ``dampening''
perturbation (in that it eliminates the ``clause flow into $C_{k}$'').
The proof of Theorem~\ref{k:infinite} shows that when
 $k(n)\goesto \infty$, with high probability \PUR\ accepts (if
$\Phi$ is satisfiable) ``before the perturbation reaches $C_{1}$'';
therefore the satisfiability probability is the one from the uniform
case. On the other hand, for any constant $k$, with positive
probability \PUR\ does not
halt during the first $k$ iterations (for the exact value see
\cite{istrate:cs.DS/9912001}), and the dampening has a
2419: significant influence. Thus {\em the explanation for  
2420: the occurrence (and specific form of) critical behavior is a threshold 
2421: property for the number of iterations of \PUR\ on random satisfiable 
2422: Horn
2423: formulas ``in the critical region''}. 
2424: 
2425: 
2426: A related, and somewhat controversial, open issue is whether random
Horn satisfiability properly displays critical behavior. Problems with
a sharp threshold display ``critical'' (i.e., singular) behavior in at
least one parameter, the satisfaction probability, which conceivably
allows the definition of critical exponents. This is not so for random
$k$-Horn satisfiability, which has a coarse
threshold and no criticality for $k>2$; hence the question seems not
to be
meaningful. Note, however, that the order parameter involved in the
recent study of the phase transition in 2-SAT \cite{scaling:window:2sat}
is {\bf not} the satisfaction probability, but the (expected) size of the
so-called {\em backbone} (or its more tractable version, the {\em spine}) of a
2438: random formula. The ``window'' that we use to peek at the threshold 
2439: behavior of random 
2440: Horn satisfiability does not seem to be ``naturally required'' by any 
2441: physical 
2442: considerations, and it is possible in principle that the random 
2443: Horn formulas display critical behavior if we take the spine as the 
2444: order parameter.
2445: 
2446: \section{Acknowledgments}
2447: 
2448: This paper is part of the author's Ph.D. thesis at the University of
2449: Rochester. 
2450: Support for this work has come from the
2451: NSF CAREER Award CCR-9701911 and the
2452: NSF Grant 9725021.
2453: 
2454: {\small
2455: %\bibliography{bibtheory}
2456: \bibliography{/home/gistrate/bib/bibtheory}
2457: \clearpage
2458:  }
2459: 
2460: 
2461: 
2462: 
2463: \end{document}