1: \documentclass{elsart}
2: \usepackage{natbib}
3: \usepackage{epsfig}
4: \usepackage{amssymb}
5: \newtheorem{approks}[thm]{Approximation}
6:
7: \begin{document}
8:
9: \begin{frontmatter}
10:
11: \title{Heuristic average-case analysis of the backtrack resolution
12: of random 3-Satisfiability instances.}
13:
14: \author{Simona Cocco}
15: \address{CNRS-Laboratoire de Dynamique des Fluides Complexes,
16: 3 rue de l'universit\'e, 67000 Strasbourg, France}
17: \ead{cocco@ldfc.u-strasbg.fr}
18: \ead[url]{http://ludfc39.u-strasbg.fr/cocco/index.html}
19:
20: \author{R\'emi Monasson}
21: \address{CNRS-Laboratoire de Physique Th{\'e}orique de l'ENS,
22: 24 rue Lhomond, 75005 Paris, France;\\
23: CNRS-Laboratoire de Physique Th{\'e}orique, 3 rue de l'universit\'e,
24: 67000 Strasbourg, France}
25: \ead{monasson@lpt.ens.fr}
26: \ead[url]{http://www.lptens.fr/$\sim$monasson}
27:
28: \begin{abstract}
29: An analysis of the average-case
30: complexity of solving random 3-Satisfiability (SAT)
31: instances with backtrack algorithms is presented. We first interpret
32: previous rigorous works in a unifying framework based on the statistical
33: physics notions of dynamical trajectories, phase diagram and growth
34: process. It is argued that, under
35: the action of the Davis--Putnam--Loveland--Logemann (DPLL) algorithm,
36: 3-SAT instances are turned into $2+p$-SAT instances whose characteristic
37: parameters (ratio $\alpha$ of clauses per variable, fraction $p$ of
38: 3-clauses) can be followed during the operation, and define resolution
39: trajectories. Depending on the location of trajectories in the
40: phase diagram of the 2+p-SAT model, easy (polynomial) or hard
41: (exponential) resolutions are generated. Three regimes are identified,
42: depending on the ratio $\alpha$ of the 3-SAT instance to be
43: solved. Lower sat phase: for small ratios, DPLL almost surely finds a
44: solution in a time growing linearly with the number $N$ of
45: variables. Upper sat phase: for intermediate ratios, instances are
46: almost surely satisfiable but finding a solution requires exponential
47: time ($\sim 2 ^{N\,\omega}$ with $\omega>0$)
48: with high probability. Unsat phase: for
49: large ratios, there is almost always no solution and proofs of
50: refutation are exponential. An analysis of the growth of the search
51: tree in both upper sat and unsat regimes is presented, and allows us
52: to estimate $\omega$ as a function of $\alpha$. This analysis is based
53: on an exact relationship between the average size of the search tree
54: and the powers of the evolution operator encoding the elementary
55: steps of the search heuristic.
56: \end{abstract}
57:
58: \begin{keyword}
59: satisfiability, analysis of algorithms, backtrack.
60: % keywords here, in the form: keyword \sep keyword
61:
62: % PACS codes here, in the form: \PACS code \sep code
63:
64: \end{keyword}
65:
66: \end{frontmatter}
67:
68: \section{Introduction.}
69:
70: This paper focuses on the average complexity of solving random 3-SAT
71: instances using backtrack algorithms. Being an NP-complete problem,
72: 3-SAT is not thought to be solvable in an efficient way, {\em i.e.}
73: in time growing at most polynomially with $N$. In practice, one
74: therefore resorts to methods that need, {\em a priori}, exponentially
75: large computational resources. One of these algorithms is the
76: ubiquitous Davis--Putnam--Loveland--Logemann (DPLL) solving
77: procedure\citep{DP,survey}. DPLL is a complete search algorithm based on
78: backtracking; its operation is briefly recalled in Figure~1.
79: The sequence of assignments of variables made by DPLL
80: in the course of instance solving can be represented as a search tree,
81: whose size $Q$ (number of nodes) is a convenient measure of the
82: hardness of resolution.
83: Some examples of search trees are presented in Figure~\ref{trees}.
84:
85: In the past few years, much experimental and theoretical progress
86: has been made on the probabilistic analysis of 3-SAT \citep{AI,Hans}.
87: Distributions of random instances controlled by few parameters are
88: particularly useful in shedding light on the onset of complexity. An
89: example that has attracted a lot of attention over the past years is
90: random 3-SAT: all clauses are drawn randomly and each variable negated
91: or left unchanged with equal probabilities. Experiments
92: \citep{AI,Cra,Mit,Kir} and theory \citep{Friedgut,Dub,Dubtcs} indicate
93: that clauses can almost surely always (respectively never) be
94: simultaneously satisfied if $\alpha$ is smaller (resp. larger) than a
95: critical threshold $\alpha _C \simeq 4.3$ as soon as the numbers $M$
96: of clauses and $N$ of variables go to
97: infinity at a fixed ratio $\alpha$. This phase
98: transition \citep{Sta1} is accompanied by a drastic peak in
99: hardness at threshold \citep{AI,Mit,Cra}. The emerging pattern of
100: complexity is as follows. At small ratios $\alpha < \alpha _L$, where
101: $\alpha _L$ depends on the heuristic used by DPLL, instances are
102: almost surely satisfiable (sat), see \citet{Fra2} and \citet{Achl}
103: for recent
104: reviews. The size $Q$ of the associated search tree scales, with high
105: probability, linearly with the number $N$ of variables, and almost no
106: backtracking is present \citep{Fri} (Figure~\ref{trees}A). Above the
107: critical ratio, that is when $\alpha >\alpha _C$, instances are
108: a.s. unsatisfiable (unsat) and proofs of refutation are obtained
109: through massive backtracking (Figure~\ref{trees}B), leading to an
110: exponential hardness: $Q = 2^{N\omega}$ with $\omega >0$~\citep{Chv}.
111: In the intermediate range, $\alpha _L <\alpha <\alpha _C$, finding a
112: solution a.s. requires exponential effort ($\omega >0$)
113: \citep{Vardi1,Achl3,Coc}.
114:
115: The aim of this article is two-fold. First, we propose a simple and
116: intuitive framework to unify the above findings. This framework is
117: presented in Section 2. It is based on the statistical physics notions
118: of dynamical trajectories and phase diagram, and was, to some extent,
119: implicitly contained in the pioneering analysis of search heuristics
120: by \citet{fra2,Fra}. Secondly, we present in Section 3 a quantitative
121: study of the growth of the search tree in the unsat regime. Such a
122: study has been lacking so far due to the formidable difficulty in
123: taking into account the effect of massive backtracking on the
124: operation of DPLL. We first establish an exact relationship
125: between the average size of the search tree and the powers
126: of the evolution operator encoding the elementary steps of the
127: search heuristic. This equivalence is then used (in a non rigorous way) to
128: accurately estimate the logarithm $\omega$ of the average complexity
129: $Q$ as a function of $\alpha$,
130: \begin{equation}
131: \omega (\alpha) = \lim _{N\to \infty}\; \frac 1N\;
132: \log _2 \; E_{(\alpha,N)} [Q] \quad ,
133: \end{equation}
134: where $E_{(N,\alpha)}$ denotes the expectation value for given $N$
135: and $\alpha$. The approach emphasizes the relevance of
136: partial differential equations to analyse algorithms in presence of
137: massive backtracking, as opposed to ordinary differential equations in
138: the absence of the latter \citep{Worm,Achl}. In Section 4, we focus upon
139: the upper sat regime {\em i.e.} upon ratios $\alpha _L < \alpha <
140: \alpha_C$. Combining the framework of Section 2 and the analysis of
141: Section 3 we unveil the structure of the search tree
142: (Figure~\ref{trees}C) and calculate $\omega$ as a function of the
143: ratio $\alpha$ of the 3-SAT instance to be solved.
144:
145: For the sake of clarity and since the style of our approach may look
146: unusual to the computer scientist reader, the status of the different
147: calculations and results (experimental, exact, conjectured,
148: approximate, ...) are made explicit throughout the article.
149:
150: \begin{figure}
151: \begin{center}
152: \includegraphics[width=280pt,angle=-0]{fig1.eps}
153: \vskip .5cm
154: \caption{DPLL algorithm. When a variable has been chosen at step (1)
155: e.g. $x=T$, at step (2) some clauses are satisfied e.g. $C=(x \;\hbox
156: {\rm OR}\;{y} \;\hbox {\rm OR}\; z)$ and eliminated, other are reduced
157: e.g. $C=(\hbox{\rm not}\, x\; \hbox {\rm OR}\;{y}\; \hbox {\rm OR}\;
158: {z}) \to C=({y}\; \hbox {\rm OR}\; {z})$. If some clauses include one
159: variable only e.g. $C=y$, the corresponding variable is automatically
160: fixed to satisfy the clause ($y=T$). This
161: propagation (2c) is repeated up to the exhaustion of all
162: unit clauses. Contradictions result from the presence of two opposite
163: unit clauses
164: e.g. $C=(y), C'=(\hbox{\rm not}\, y)$. A solution is found when no clauses
165: are left. The search process of DPLL is represented by a tree
166: (Figure~\ref{trees}) whose
167: nodes correspond to (1), and edges to (2). Branch extremities are
168: marked with contradictions C (2B,2C), or by a solution S (2A,2C).}
169: \end{center}
170: \end{figure}
171:
172:
173: \begin{figure}
174: \begin{center}
175: \includegraphics[width=190pt,angle=-90]{fig2.eps}
176: \end{center}
177: \vskip .5cm
178: \caption{Types of search trees generated by the DPLL solving procedure
179: on random 3-SAT.
180: {\bf A.} {\em simple branch:} the algorithm finds
181: easily a solution without ever backtracking. {\bf B.} {\em dense tree:}
182: in the absence of solution, DPLL builds a tree, including many branches
183: ending with contradictory leaves, before stopping. {\bf C.} {\em
184: mixed case, branch + tree:} if many contradictions arise before
185: reaching a solution, the resulting search tree can be decomposed into a
186: single branch followed by a dense tree. G is the highest
187: node in the tree reached by DPLL through backtracking.}
188: \label{trees}
189: \end{figure}
190:
191: \section{Phase diagram and trajectories.}
192:
193: \subsection{The 2+p-SAT distribution and split heuristics}
194:
195: The action of DPLL on an instance of 3-SAT causes changes to the
196: overall numbers of variables and clauses, and thus of the ratio
197: $\alpha$. Furthermore, DPLL reduces some 3-clauses to 2-clauses. A
198: mixed 2+p-SAT distribution, where $p$ is the fraction of 3-clauses,
199: can be used to model what remains of the input instance at a node of
200: the search tree. Using experiments and methods from statistical
201: mechanics \citep{Sta1}, the threshold line $\alpha _C (p)$, separating
202: sat from unsat phases, may be estimated with the results shown in
203: Figure~\ref{diag}. For $p \le p_0 = 2/5$, {\em i.e.} to the left of
204: point T, the threshold line is given by $\alpha _C(p)=1/(1-p)$, as
205: rigorously confirmed by \citet{Achl1}, and saturates the upper bound
206: for the satisfaction of 2-clauses. Above $p_0$, no exact value for
207: $\alpha _C (p)$ is known. Note that $\alpha _C \simeq 4.3$ corresponds to
208: $p=1$.
209:
210: The phase diagram of 2+p-SAT is the natural space in which DPLL
211: dynamic takes place. An input 3-SAT instance with ratio $\alpha$ shows
212: up on the right vertical boundary of Figure~\ref{diag} as a point of
213: coordinates $(p=1,\alpha )$. Under the action of DPLL, the
214: representative point moves aside from the 3-SAT axis and follows a
215: trajectory. This trajectory obviously depends on the heuristic of split
216: followed by DPLL (Figure~1).
217: Possible simple heuristics are \citep{fra2,Fra},
218: \begin{itemize}
219: \item{\em Unit-Clause (UC):}
220: randomly pick up a literal among a unit clause
221: if any, or any unset variable otherwise.
222:
223: \item{\em Generalized Unit-Clause (GUC):}
224: randomly pick up a literal among the shortest avalaible clauses.
225:
226: \item{\em Short Clause With Majority (SC$_1$):}
227: randomly pick up a literal among unit clauses if any; otherwise
228: randomly pick up
229: an unset variable $v$, count the numbers of occurences $\ell,
230: \bar \ell$ of $v$, $\bar v$ in 3-clauses, and choose $v$ (respectively
231: $\bar v$) if $\ell > \bar \ell$ (resp. $\ell < \bar \ell$). When
232: $\ell=\bar \ell$, $v$ and $\bar v$ are equally likely to be chosen.
233: \end{itemize}
234:
235: Rigorous mathematical analysis, undertaken to provide rigorous bounds to
236: the critical threshold $\alpha _C$, have so far been restricted to
237: the action of DPLL prior to any backtracking, that is, to the
238: first descent of the algorithm in the search tree\footnote{The analysis of
239: \cite{Fri} however includes a very limited version of backtracking, see
240: Section 2.2}.
241: The corresponding search branch is drawn on Figure~\ref{trees}A.
242: These studies rely on the two following facts:
243:
244: First, the representative point
245: of the instance treated by DPLL does not ``leave'' the 2+p-SAT phase
246: diagram. In other words, the instance is, at any stage of the search
247: process, uniformly distributed from the 2+p-SAT
248: distribution conditioned to its clause--per--variable ratio $\alpha$ and
249: fraction of 3-clauses $p$. This assumption is not true for
250: all heuristics of split, but holds for the above examples ($UC$, $GUC$,
251: $SC_1$) \citep{fra2}. Analysis of more sophisticated heuristics
252: require to handle more complex instance distributions \citep{Kap}.
253:
254: Secondly, the trajectory followed by an instance in the course of
255: resolution is a stochastic object, due to the randomness of
256: the instance and of the assignments done by DPLL.
257: In the large size limit ($N\to\infty$), this trajectory gets
258: concentrated around its average locus in the 2+p-SAT phase diagram.
259: This concentration phenomenon results from general properties of
260: Markov chains \citep{Worm, Achl}.
261:
262: \begin{center}
263: \begin{figure}
264: \includegraphics[width=300pt,angle=-90]{fig3.eps}
265: \vskip 1cm
266: \caption{Phase diagram of 2+p-SAT and dynamical trajectories of DPLL.
267: The threshold line $\alpha_C (p)$ (bold full line) separates sat
268: (lower part of the plane) from unsat (upper part) phases. Extremities
269: lie on the vertical 2-SAT (left) and 3-SAT (right) axis at coordinates
270: ($p=0,\alpha _C=1$) and ($p=1,\alpha _C\simeq 4.3$) respectively.
271: Departure points for DPLL trajectories are located on the
272: 3-SAT vertical axis and the corresponding values of $\alpha$ are
273: explicitely given. Dashed curves represent tree trajectories in the
274: unsat region (thick lines, black arrows) and branch
275: trajectories in the sat phase (thin lines, empty
276: arrows). Arrows indicate the direction of "motion" along trajectories
277: parametrized by the fraction $t$ of variables set by DPLL. For small
278: ratios $\alpha < \alpha _L$, branch trajectories remain
279: confined in the sat phase, end in S of coordinates $(1,0)$, where a
280: solution is found. At $\alpha_L$ ($\simeq 3.003$ for the GUC heuristic),
281: the single branch trajectory hits
282: tangentially the threshold line in T of coordinates $(2/5,5/3)$. In
283: the intermediate range $\alpha _L < \alpha < \alpha_C$, the branch
284: trajectory intersects the threshold line at some point G (which depends
285: on $\alpha$). A dense tree then grows in the unsat phase, as happens
286: when 3-SAT departure ratios are above threshold $\alpha > \alpha _C
287: \simeq 4.3$. The tree trajectory halts on the dot-dashed curve
288: $\alpha \simeq 1.259/(1-p)$ where the tree growth process stops.
289: At this point, DPLL has reached back the highest backtracking node in
290: the search tree, that is, the first node when $\alpha > \alpha _C$, or node G
291: for $\alpha _L < \alpha < \alpha_C$.
292: In the latter case, a solution can be reached from a new descending branch
293: while, in the former case, unsatisfiability is proven, see Figure~2.}
294: \label{diag}
295: \end{figure}
296: \end{center}
297:
298: \subsection{Trajectories associated to search branches}
299:
300: Let us briefly recall \citet{fra2} analysis of the
301: average trajectory corresponding to the action of DPLL prior to
302: backtracking. The ratio of clauses per variable of the
303: 3-SAT instance to be solved will be denoted by $\alpha _0$.
304: The numbers of 2 and 3-clauses are initially equal to $C_2=0, C_3=\alpha _0
305: \, N$ respectively. Under the action of DPLL, $C_2$ and $C_3$ follow
306: a Markovian stochastic evolution process, as the depth $T$ along the branch
307: (number of assigned variables) increases. Both $C_2$ and $C_3$ are
308: concentrated around their expectation values, the densities
309: $c_j (t)= E[C_j( T=t\,N)]$ ($j=2,3$) of which obey a set of
310: coupled ordinary differential equations (ODE) \citep{fra2,Fra,Achl},
311: \begin{equation}
312: \frac{d c_3}{dt} = - \frac{ 3\, c_3}{1-t} \qquad , \qquad
313: \frac{d c_2}{dt} = \frac{ 3\, c_3}{2(1-t)} - \frac{ 2\, c_2}{1-t} -
314: \rho _1 (t) \; h(t) \qquad , \label{ode}
315: \end{equation}
316: where $\rho _1 (t) = 1 - c_2(t)/(1-t)$ is the probability that DPLL fixes a
317: variable at depth $t$ (fraction of assigned variables)
318: through unit-propagation. Function $h$ depends upon the
319: heuristic: $h_{UC} (t)=0$, $h_{GUC} (t)=1$ (if $\alpha _0> 2/3$;
320: for $\alpha _0<2/3$, see \citet{Fra}),
321: $h_{SC_1}
322: (t)=a\, e^{-a}\, (I_0(a)+I_1(a))/2$ where $a\equiv 3\, c_3(t)/(1-t)$
323: and $I_\ell$ is the $\ell^{th}$ modified Bessel function.
324: To obtain the single branch trajectory in the phase diagram of
325: Figure~\ref{diag},
326: we solve ODEs (\ref{ode}) with initial conditions $c_2(0)=0,
327: c_3(0)=\alpha_0$, and perform the change of variables
328: \begin{equation}
329: p(t) = \frac{c_3(t)}{c_2(t)+c_3(t)} \qquad , \qquad
330: \alpha (t) = \frac{c_2(t)+c_3(t)}{1-t} \quad . \label{change}
331: \end{equation}
332:
333: Results are shown for the GUC heuristics and starting ratios $\alpha_0 =2$
334: and 2.8 in Figure~\ref{diag}. The trajectory,
335: indicated by a light dashed line, first heads to the left and then
336: reverses to the right until reaching a point on the 3-SAT axis at
337: a small ratio. Further action of
338: DPLL leads to a rapid elimination of the remaining clauses and the
339: trajectory ends up at the right lower corner S, where a solution is
340: found.
341:
342: \citet{Fri} have shown that, for ratios $\alpha _0 < \alpha _L \simeq 3.003$
343: (for the GUC heuristics), the full search tree essentially reduces
344: to a single branch, and is thus entirely described by the ODEs (\ref{ode}).
345: The amount of backtracking necessary to reach a solution is bounded from above
346: by a power of $\log N$. The average size of the branch, $Q$, scales linearly
347: with $N$ with a multiplicative factor $\gamma (\alpha _0)=Q/N$ that can
348: be calculated \citep{Coc}.
349: The boundary $\alpha _L$ of this easy sat region can be defined as the largest
350: initial ratio $\alpha _0$ such that the branch trajectory $(p(t),\alpha (t))$
351: issued from $(1,\alpha _0)$ never leaves the sat phase during DPLL action.
352: In other words, the instance essentially keeps being sat throughout the
353: resolution process. We shall see in Section 4 this does not hold for sat
354: instances with ratios $\alpha _L < \alpha _0 < \alpha _C$.
355:
356: \section{Analysis of the search tree growth in the unsat phase.}
357:
358: In this Section, we present an analysis of search trees
359: corresponding to unsat instances, that is, in presence of massive
360: backtracking. We first report results from numerical experiments,
361: then expose our analytical approach to compute the complexity of
362: resolution (size of search tree).
363:
364: \subsection{Numerical experiments}
365:
366: For ratios above threshold ($\alpha _0 > \alpha _C\simeq 4.3$),
367: instances almost never have a solution but a considerable amount of
368: backtracking is necessary before proving that clauses are
369: incompatible. Figure~\ref{trees}B shows a generic unsat, or
370: refutation, tree. In contrast to the previous section, the sequence
371: of points $(p,\alpha)$ attached to the nodes of the search tree
372: do not arrange along a line any longer, but
373: rather form a cloud with a finite extension in the phase diagram of
374: Figure~\ref{diag}. Examples of clouds are provided on
375: Figure~\ref{patch}.
376:
377: \begin{center}
378: \begin{figure}
379: \hskip 1.5cm \includegraphics[height=250pt,angle=-90]{fig4.eps}
380: \caption{Clouds associated to search trees obtained from the resolution
381: of three unsat
382: instances with initial ratios $\alpha _0= 4.3, 7$ and 10
383: respectively. Each point in the cloud corresponds to a splitting node
384: in the search tree. Sizes of instances and search trees are
385: $N=120,Q=7597$ for $\alpha _0=4.3$, $N=200,Q=6335$ for $\alpha _0 =7$,
386: and $N=300,Q=6610$ for $\alpha _0=10$.}
387: \label{patch}
388: \end{figure}
389: \end{center}
390: \begin{table}
391: $$
392: \begin{array}{|c|c|c|c|c|c||c|}
393: \hline
394: \multicolumn{1}{|c} {\alpha _0}&
395: \multicolumn{1}{|c} {4.3} & \multicolumn{1}{|c} 7 &
396: \multicolumn{1}{|c} {10} & \multicolumn{1}{|c} {15} &
397: \multicolumn{1}{|c||} {20} & \multicolumn{1}{|c|} {3.5} \\
398: \hline
399: \omega _{EXP} &
400: 0.089 & 0.0477 & 0.0320 &0.0207 & 0.0153 & 0.034 \\
401: & \pm 0.001 & \pm 0.0005 &\pm 0.0005 &
402: \pm 0.0002 & \pm 0.0002 & \pm 0.003 \\
403: \hline
404: \omega _{THE} & 0.0916 & 0.0486 & 0.0323 &
405: 0.0207 & 0.0153 & 0.035 \\
406: \hline
407: \end{array}
408: $$
409: \vskip .3cm
410: \caption{Logarithm of the complexity $\omega$ from experiments (EXP) and
411: theory (THE) as a function of the ratio $\alpha _0$ of clauses per
412: variable of the 3-SAT instance.
413: Ratios above 4.3 correspond to unsat instances; the rightmost ratio
414: lies in the upper sat phase.}
415: \end{table}
416:
417:
418: The number of points in a cloud {\em i.e.} the size $Q$ of its
419: associated search tree grows exponentially with $N$ \citep{Chv}. It is
420: thus convenient to define its logarithm $\omega$ through $Q=2^{N
421: \omega}$. We experimentally measured $Q$, and averaged its
422: logarithm $\omega$ over a large number of
423: instances. Results have then be extrapolated to the $N\to \infty$
424: limit \citep{Coc} and are reported in Table~1. $\omega$ is a
425: decreasing function of $\alpha _0$ \citep{Bea}: the larger $\alpha
426: _0$, the larger the number of clauses affected by a split, and the
427: earlier a contradiction is detected. We will use the vocable
428: ``branch'' to denote a path in the refutation tree which joins the top
429: node (root) to a contradiction (leaf). The number of branches, $B$, is
430: related to the number of nodes, $Q$, through the relation $Q=B-1$
431: valid for any complete binary tree. As far as exponential (in $N$)
432: scalings are concerned, the logarithm of $B$ (divided by $N$) equals
433: $\omega$. In the following paragraph, we show how $B$ can be
434: estimated through the use of a matrix formalism.
435:
436: \begin{center}
437: \begin{figure}
438: \hskip 1.5cm \includegraphics[height=150pt,angle=0]{fig5.eps}
439: \vskip .5cm
440: \caption{Imaginary, parallel growth process of an unsat search tree used in the
441: theoretical analysis. Variables are fixed through unit-propagation, or
442: by the splitting heuristic as in the DPLL
443: procedure, but branches evolve in parallel. $T$ denotes the depth in the
444: tree, that is the number of variables assigned by DPLL along each
445: branch. At depth $T$, one literal is chosen on each branch among 1-clauses
446: (unit-propagation, grey circles not represented on Figure 2), or 2,3-clauses
447: (splitting, black circles as in Figure~2).
448: If a contradiction occurs as a result of unit-propagation, the branch gets
449: marked with C and dies out. The growth of the tree proceeds
450: until all branches carry C leaves. The resulting tree is identical to the one
451: built through the usual, sequential operation of DPLL. }
452: \label{struct}
453: \end{figure}
454: \end{center}
455:
456:
457: \subsection{Parallel growth process and Markovian evolution matrix}
458:
459: The probabilistic analysis of DPLL in the unsat regime appears to be a
460: formidable task since the search tree of Figure~\ref{trees}B is the
461: output of a complex, sequential process: nodes and edges are added by
462: DPLL through successive descents and backtrackings (depth-first
463: search). We have imagined a
464: different building up of the refutation tree, which results in the
465: same complete tree but can be mathematically analyzed. In our
466: imaginary process (Figure~\ref{struct}), the tree grows in parallel,
467: layer after layer (breadth-first search).
468: At time $T=0$, the tree reduces to a root
469: node, to which is attached the 3-SAT instance to be solved, and an
470: attached outgoing edge. At time $T$, that is, after having assigned
471: $T$ variables in the instance attached to each branch, the tree is
472: made of $B(T) \ (\le 2^T)$ branches, each one carrying a partial
473: assignment of variables. At
474: next time step $T\to T+1$, a new layer is added by assigning,
475: according to DPLL heuristic, one more variable along every branch. As a
476: result, a branch may keep growing through unitary propagation, get hit
477: by a contradiction and die out, or split if the partial assignment
478: does not induce unit clauses. This parallel growth process is
479: Markovian, and can be encoded in an instance--dependent matrix we now
480: construct.
481:
482: To do so, we need some preliminary definitions:
483:
484: \begin{defn} Partial state of variables.
485:
486: The partial state $s$ of a Boolean variable $x$ is one of the three
487: following possibilities: undetermined ($u$) if the variable has not
488: been assigned by the search heuristic yet, true ($t$) if the variable
489: is partially assigned to true, false ($f$) if the variable is
490: partially assigned to false. The partial state $S$ of a set of Boolean
491: variables $X=\{x_1, x_2, \ldots , x_N\}$ is the collection of the
492: states of its elements, $S=\{s_1, s_2, \ldots , s_N \}$.
493:
494: Let $\bf I$ be an instance of the SAT problem, defined over a set of
495: Boolean variables $X$ with partial state $S$. A clause of $\bf I$ is
496: said to be
497: \begin{itemize}
498: \item satisfied if at least one of its literals is true according to $S$;
499: \item unsatisfied, or violated if all its literals are false according to $S$;
500: \item undetermined otherwise; then its `type' is the number $(=1,2,3)$ of
501: undetermined variables it includes.
502: \end{itemize}
503: The instance $\bf I$ is said to be satisfied if all its clauses are
504: satisfied, unsatisfied
505: if one (at least) of its clauses is violated, undetermined otherwise.
506: The set of partial states that violate ${\bf I}$ is denoted by $W$.
507: \end{defn}
508:
509: \begin{defn} Vector space attached to a variable.
510:
511: To each Boolean variable $x$ is associated
512: a three dimensional vector space ${\bf v}$ with spanning basis $|u\rangle$,
513: $|t\rangle$, $|f\rangle$, orthonormal with respect
514: to the dot (inner) product denoted by $\langle . | . \rangle$,
515: \begin{equation}
516: \langle u | u \rangle = \langle t | t \rangle = \langle f | f \rangle =
517: 1\ ,\ \langle u |
518: t \rangle = \langle u | f \rangle = \langle t | f \rangle = 0 \ .
519: \end{equation}
520: The partial state attached to a basis vector $|s\rangle$ is $s$ (=$\,u,t,f)$.
521: \end{defn}
522:
523: Letters $u$, $t$, $f$ stand for the different partial states the variable may
524: acquire in the course of the search process. Note that the coefficients
525: of the decomposition of any vector $|x\rangle \in {\bf v}$
526: over the spanning basis,
527: \begin{equation}
528: |x\rangle = x ^{(u)} \, |u\rangle + x^{(t)} \, |t\rangle + x^{(f)} \, |f\rangle
529: \quad ,
530: \end{equation}
531: can be obtained through use of the dot product: $x^{(s)} = \langle
532: s | x \rangle$ with $s=u,t,f$. By extension, $\langle S|$ denotes the
533: transposed of vector $|S\rangle$.
534:
535:
536: \begin{defn} Vector space attached to a set of variables.
537:
538: We associate to the set $X=\{x_1, x_2, \ldots , x_N\}$ of $N$ Boolean
539: variables the $3^N$--dimensional vector space ${\bf V}= {\bf v}_1
540: \otimes {\bf v}_2 \otimes \ldots \otimes {\bf v}_N$. The spanning
541: basis of ${\bf V}$ is the tensor product of the spanning basis of the
542: ${\bf v}_i$'s. To lighten notations, we shall write $|s_1 , s_2, ...,
543: s _N \rangle$ for $|s _1\rangle \otimes |s _2 \rangle \otimes \ldots
544: \otimes |s _N\rangle$. The partial state attached to a basis vector
545: $|S\rangle= |s_1 , s_2, ..., s _N \rangle$ is $S=(s_1 , s_2, ..., s
546: _N)$. The dot product naturally extends over ${\bf V}$: $\langle s'_1
547: , s'_2 , \ldots , s'_N | s_1, s_2 , \ldots , s_N \rangle = 1$ if
548: $s_i=s' _i\ \forall i$, 0 otherwise.
549: \end{defn}
550:
551:
552: Any element $|X\rangle \in {\bf V}$ can be uniquely decomposed
553: as a linear combination of vectors from the spanning basis.
554: Two examples of vectors are $|\Sigma\rangle$ and $|U\rangle$,
555: respectively the sum of all vectors in the
556: spanning basis and the fully undetermined vector,
557: \begin{eqnarray} \label{sigvec}
558: |\Sigma\rangle &=& \big( |u \rangle + |t \rangle + |f \rangle \big) \otimes
559: \big( |u \rangle + |t \rangle + |f \rangle \big) \otimes \ldots \otimes
560: \big( |u \rangle + |t \rangle + |f \rangle \big) \ , \\
561: |U\rangle &=& |u,u,\ldots , u\rangle \qquad . \label{uvec}
562: \end{eqnarray}
563: Basis vectors fulfill the closure identity
564: \begin{equation} \label{closure}
565: \sum _{S} | S\rangle \; \langle S | = {\bf 1} \quad ,
566: \end{equation}
567: where ${\bf 1}$ is the identity operator on ${\bf V}$. To establish identity
568: (\ref{closure}), apply the left hand side
569: operator to any vector $|S'\rangle$ and take advantage of
570: the orthonormality of the spanning basis .
571:
572: \begin{defn} (Heuristic-induced) Transition probabilities
573:
574: Let $S=(s_1 , s_2, ..., s _N)$ be a partial state which does not violate
575: instance ${\bf I}$.
576: Call $S^{(j,x)}$, with $j=1,\ldots, N$ and $x=t,f$,
577: the partial state obtained from $S$ by replacing $s_j$ with $x$.
578: The probability that the heuristic under consideration (UC, GUC, ...)
579: chooses to assign variable $x_j$ when presented partial state $S$
580: is denoted by $h(j|S)$.
581: The probability that the heuristic under consideration
582: then fixes variable $x_j$ to $x\, (=\, t,f)$ is denoted by $g(x|S,j)$.
583: \end{defn}
584:
585: A few elementary facts about transition probabilities are:
586: \begin{enumerate}
587: \item $h(j|S)=0$ if $s_j\ne u$.
588: \item $g(x|S,j)+g(\bar x|S,j)=1$.
589: \item Assume that
590: the number $C_1 (S)$ of undetermined clauses of type 1 (unit
591: clauses) is larger or equal to unity. Call $C_1(j|S)$ the number of
592: unit clauses containing variable $x_j$, and $C_1(x|S,j)$ the number
593: of unit clauses satisfied if $x_j$ equals
594: $x\,(=\,t,f)$. Clearly $C_1(j|S)= C_1(t|S,j)+C_1(f|S,j)$.
595: Then, as a result of unit--propagation,
596: \begin{eqnarray}
597: h(j|S) &=&\frac{C_1(j|S)}{C_1(S)} \quad , \nonumber \\
598: g(x|S,j) &=& \frac{C_1 (x|S,j)}{C_1(j|S)} \quad
599: \hbox{\rm for}\ x=t,f \
600: \hbox{\rm and} \ C_1(j|S) \ge 1 \ .
601: \end{eqnarray}
602: \item In the absence of unitary clause ($C_1(S)=0$), transition probabilities depend on the details
603: of the heuristic. For instance, in the case of the UC heuristic,
604: \begin{enumerate}
605: \item if $s_j=u$, $h(j|S)=\frac{1}{u(S)}$ and $g(x|S,j)=\frac 12$,
606: \item if $s_j\ne u$, $h(j|S)=0$,
607: \end{enumerate}
608: where $u(S)$ is the number of undetermined variables in partial state $S$.
609: \item The sum of transition probabilities from a partial state $S$
610: is equal to unity,
611: \begin{equation}
612: \sum _{j=1} ^N h (j|S) \; \bigg[ g(t|S,j) + g(f|S,j) \bigg] =1 \quad .
613: \end{equation}
614: \end{enumerate}
615: It is important to stress that the definition of the transition probabilities
616: does not make any reference to any type of backtracking. It relies
617: on the notion of variable assignement through the heuristic of search only.
618:
619: Let us now introduce the
620:
621: \begin{defn} (Heuristic-induced) Evolution operator.
622:
623: The evolution operator is a linear operator $\bf H$ acting on $\bf V$
624: encoding the action of DPLL for a given unsatisfiable instance ${\bf I}$. Its
625: matrix elements in the spanning basis are
626: \begin{enumerate}
627: \item if $S$ violates $\bf I$,
628: \begin{equation} \label{defmu}
629: \langle S'| {\bf H} | S\rangle
630: = \left\{ \begin{array} {c c}
631: 1 & \hbox{\rm if} \ S'=S \\
632: 0 & \hbox{\rm if} \ S' \ne S
633: \end{array} \right. \qquad ,
634: \end{equation}
635: \item if $S$ does not violate $\bf I$,
636: \begin{equation} \label{defms}
637: \langle S'| {\bf H} | S\rangle
638: = \left\{ \begin{array} {c l}
639: h(j|S)\times g(x|S,j) & \hbox{\rm if} \ C_1(S) \ge 1
640: \ \hbox{\rm and} \ S'=S^{(j,x)}\\
641: h(j |S) & \hbox{\rm if} \ C_1(S) =0\ \hbox{\rm and} \\
642: \ & \big(S'=S^{(j,x)}\ \hbox{\rm or} \ S'=S^{(j,\bar x)}\big)
643: \\
644: 0 & \hbox{\rm otherwise}
645: \end{array} \right. \nonumber
646: \end{equation}
647: \end{enumerate}
648: where $S,S'$ are the attached partial states to $|S\rangle, |S'\rangle$,
649: and $C_1(S)$ is the number of undetermined clauses of type 1 (unitary clauses)
650: for partial state $S$.
651: \end{defn}
652:
653: Notice that we use the same notation, $\bf H$, for the operator and its matrix
654: in the spanning basis. The different cases encountered in the above
655: definition of $\bf H$ are symbolized in Figure~\ref{thm}.
656: We may now conclude:
657:
658: \begin{thm} Branch function and average size of refutation tree
659:
660: Call branch function the function $B$ with integer-valued argument $T$,
661: \begin{equation} \label{funcq}
662: B(T) = \langle \Sigma | {\bf H}^T | U \rangle \quad ,
663: \end{equation}
664: where $\bf H$ is the evolution operator associated to the unsatisfiable
665: instance $\bf I$, ${\bf H}^T$ denotes the $T^{th}$ (matricial)
666: power of ${\bf H}$, and vectors $|\Sigma\rangle, |U\rangle$ are
667: defined in (\ref{sigvec},\ref{uvec}).
668: Then, there exist two instance--dependent
669: integers $T^* \ (\le N)$ and $B^* \ (\le 2^N)$ such that,
670: \begin{equation} \label{statq}
671: B(T) = B^* \ , \quad \forall \ T \ge T^* \ .
672: \end{equation}
673: Furthermore, $B^*$ is the expectation value over the random assignments
674: of variables of the size (number of leaves) of the search tree
675: produced by DPLL to refute ${\bf I}$. The smallest non zero $T^*$ for
676: which (\ref{statq}) holds is the largest number
677: of variables that the heuristic needs to assign to reach a contradiction.
678: \end{thm}
679:
680: \begin{pf*}{Proof of Theorem 6}
681:
682: Let $S$ be a partial state. We call refutation tree built from
683: $S$ a complete search tree that proves the unsatisfiability of ${\bf I}$
684: conditioned to the fact that DPLL is allowed to assign only variables
685: which are undetermined in $S$. The height of the search tree is the
686: maximal number of assignments leading from the root node (attached to partial
687: state $S$) to a contradictory leaf.
688:
689: Let $T$ be a positive integer. We call
690: $b_T (S)$ the average size (number of leaves) of refutation trees
691: of height $\le T$ that can be built from partial state $S$. Clearly,
692: $b_T(S)=1$ for all $S\in W$, and $b_T(S)\ge 2$ if $S\notin W$. Recall
693: $W$ is the set of violating partial states from Definition 1.
694:
695: Assume now $T$ is an integer larger or equal to 1, $S$ a partial
696: state with $C_1(S)$ unitary clauses. Our parallel representation of
697: DPLL allows us to write simple recursion relations:
698: \begin{enumerate}
699: \item if $S \in W$, $b_T (S)=1=b_{T-1} (S)$.
700: \item if $S \notin W$ and $C_1(S)\ge 1$,
701: \begin{equation}
702: b_T (S) = \sum _{j=1}^N \sum _{x=t,f} h(j |S)\, g(x |S,j)
703: \; b_{T-1} \big(S ^{(j,x)} \big) \quad .
704: \end{equation}
705: \item if $S \notin W$ and $C_1(S)= 0$,
706: \begin{equation}
707: b_T (S) = \sum _{j=1}^N h(j|S) \; \bigg[
708: b_{T-1} \big(S ^ {(j,t)} \big) + b_{T-1} \big(S ^ {(j,f)} \big)
709: \bigg] \quad .
710: \end{equation}
711: \end{enumerate}
712: These three different cases are symbolized on Figure~\ref{thm}A, B and
713: C respectively. From definitions (\ref{defmu},\ref{defms}), these
714: recursion relations are equivalent to
715: \begin{equation} \label{transpo}
716: b_T (S) = \sum _{S'} \langle S' | {\bf H} |S\rangle \;
717: b_{T-1} (S') \quad ,
718: \end{equation}
719: for any partial state $S$. Let $|b_T\rangle$ be the vector of $\bf V$ whose
720: coefficients on the spanning basis $\{|S\rangle\}$ are the $b_T(S)$'s.
721: In particular,
722: \begin{equation}
723: |b_0\rangle = \sum _{S_0 \in W} |S_0\rangle \quad .
724: \end{equation}
725: Then identity (\ref{transpo}) can be written as
726: $|b _T\rangle = {\bf H}^\dagger\; |b _{T-1} \rangle$
727: where ${\bf H}^\dagger$ is the transposed of the evolution operator.
728: Note that the branch function (\ref{funcq}) is simply $B(T)=\langle U|b_T
729: \rangle$. We deduce
730: \begin{equation} \label{funcqprim}
731: |b _T\rangle = ({\bf H}^\dagger)^T\; |b_0 \rangle = \sum _{S_0\in W}
732: \sum _{\sigma _T} p\big( \sigma _T;S_0\big) |S _ 0\rangle
733: \quad ,
734: \end{equation}
735: where the second sum runs over all $3^{N\times T}$ sequences $\sigma _T =
736: (S_1, S_2 , \ldots , S_{T-1}, S _T)$
737: of $T$ partial states with associated weight
738: \begin{eqnarray}
739: p\big( \sigma _T;S_0\big) &=& \langle S_T |{\bf H}^\dagger
740: |S_{T-1}\rangle \times \ldots \times
741: \langle S_2 |{\bf H}^\dagger|S_{1}\rangle \times \langle
742: S _{1} |{\bf H}^\dagger|S _0\rangle \nonumber \\
743: &=& \langle S _0 |{\bf H} |S_{1}\rangle
744: \times \langle S _1 |{\bf H} |S _{2}\rangle \ldots
745: \times \langle S _{T-1} |{\bf H}|S _T\rangle
746: \ , \label{weightseq}
747: \end{eqnarray}
748: The length of a sequence is the number of partial states it includes.
749: We call $S_0$--genuine a sequence of partial states $\sigma _T$ with
750: non zero weight (\ref{weightseq}). The second sum
751: on the right hand side of equation (\ref{funcqprim}) may be
752: rewritten as a sum over all $S_0$--genuine sequences $\sigma _T$
753: of length $T$ only.
754:
755: \begin{lem}
756: Take $S_0\in W$.
757: Any $S_0$-genuine sequence $\sigma _{N+1}$ of length $N+1$
758: includes at least one partial state belonging to $W$.
759: \end{lem}
760: Suppose this is not true. There exists a genuine sequence
761: $\sigma _{{N+1}}$ with $S_{T}\notin W$,
762: $\forall\ 1\le T\le N+1$. Call $u _T$ the number of undetermined variables in
763: partial state $S_{T}$. Since the sequence is genuine,
764: $\langle S_{T-1} |{\bf H} | S_{T}\rangle \ne 0$
765: for every $T$ comprised between 1 and $N+1$. From the evolution operator
766: definition (\ref{defms}), $S_{T}$ contains exactly one more
767: undetermined variable than $S_{T-1}$, and $u _T=u _{T-1}+1$ for all
768: $1\le T\le N+1$. Hence $u _{N+1}-u_0=N+1$. But
769: $u_0$ and $u_{N+1}$ are, by definition,
770: integer numbers comprised between 0 and $N$.
771: \qed
772:
773: From Lemma 7, the index $\nu$ of a $S_0$--genuine sequence $\sigma _{N+1}$
774: of length $N+1$,
775: \begin{equation}
776: \nu = \sup \big\{ T : 1\le T\le N+1\ \hbox{\rm and}\
777: S_{T} \in \sigma _{N+1} \ \hbox{\rm and}\ S_{T}\ \in \ W \big\} \quad ,
778: \end{equation}
779: exists and is larger, or equal, to $1$. Let us define
780: \begin{equation}
781: \hat \sigma _{N+1} = (S_{\nu +1}, S_{\nu+2},
782: \ldots, S_{N}, S_{N+1} )\quad .
783: \end{equation}
784: From definition
785: (\ref{defmu}), $\sigma _{N+1}$ is simply $S_0$ repeated $\nu$ times
786: followed by $\hat \sigma _{N+1}$, and $p (\sigma _{N+1})= p(\hat \sigma
787: _{N+1})$. Call $\nu^*(S_0)$ the smallest index of all $S_0$-genuine
788: sequences of length $N+1$, and $\nu^*$ the minimum of $\nu^* (S_0)$ over
789: $S_0\in W$.
790: Then, from equation (\ref{funcqprim}), $|b_{N+1}\rangle = |b_N\rangle =
791: \ldots = |b_{T^*}\rangle $
792: where $T^*=N+1-\nu^*\le N$. Thus $|b_{T^*}\rangle$ is a right eigenvector
793: of ${\bf H}^\dagger$ with eigenvalue unity, and
794: $|b _{T}\rangle= |b_{T^*}\rangle$ for all $T\ge T^*$.
795: $T^*$, which depends upon instance ${\bf I}$,
796: is the length of the longest genuine sequence without repetition.
797: It is the maximal
798: number of (undetermined) variables to be fixed before a contradiction is
799: found.
800:
801: \begin{lem} Take $S\notin W$.
802: Then there is no $S$-genuine sequence of length $T^*$.
803: \end{lem}
804: Suppose this is not true. There exist $S\notin W$ and a $S$--genuine
805: sequence $\sigma _{T^*}$ of length $T^*$. As $S$ does not violate ${\bf I}$,
806: and ${\bf I}$ is not satisfiable, there are still some undetermined
807: variables in partial state $S$. A certain number of them, say $T'\ge 1$,
808: must be assigned to some $t,f$ values to reach a contradiction, that is,
809: a partial state $S_0\in W$. Therefore there exists a $S_0$--genuine
810: sequence, $\tilde \sigma$, of length $T'\ge 1$ ending with $S$
811: and with no repeated partial state.
812: Concatenating $\tilde \sigma$ and $\sigma _{T^*}$, we obtain a
813: $S_0$--genuine sequence of length $T^*+T '>T^*$ and without repetition,
814: in contradiction with the above result.
815: \qed
816:
817: Using Lemma 8, we may replace $|b_0\rangle$ in equation (\ref{funcqprim})
818: with $|\Sigma\rangle$, and find
819: \begin{equation}
820: B(T)\equiv \langle \Sigma |{\bf H}^T |U\rangle =
821: \langle U |({\bf H}^\dagger)^T |\Sigma\rangle = \langle U | b_{T^*}\rangle
822: = b_{T^*} (U) \ ,
823: \end{equation}
824: for all $T\ge T^*$. Hence, $B^*=b_{T^*} (U)$ is the average size
825: (over the random assignments made by the heuristic) of the refutation tree
826: to instance $\bf I$ generated from the fully undetermined partial state.
827: \qed
828: \end{pf*}
829:
830: \begin{center}
831: \begin{figure}
832: \includegraphics[height=150pt,angle=0]{fig6.eps}
833: \vskip .5cm
834: \caption{Transitions allowed by the heuristic-induced evolution
835: operator. Grey and black nodes correspond to variables assigned through
836: unit-propagation and splitting respectively, as in Figure~\ref{struct}.
837: {\bf A}. If the partial state $S$ already violates the instance
838: ${\bf I}$, it is left
839: unchanged. {\bf B}. If the partial state does not violate $\bf I$ and
840: there is at least one unitary clause, a variable is fixed through unit
841: propagation (grey node) e.g. $x_j=x$. The output partial state is $S^{j,x}$.
842: {\bf C}. If the partial state does not violate $\bf I$ and
843: there is no unitary clause, a variable $x_j$ is fixed through splitting
844: (black node). Two partial states are generated, $S^{j,t}$ and $S^{j,f}$.}
845: \label{thm}
846: \end{figure}
847: \end{center}
848:
849: \subsection{Some examples of short instances and associated matrices}
850:
851: We illustrate the above definitions and results with three explicit
852: examples of instances involving few variables:
853:
854:
855: \begin{exmp}
856: Instance over $N=1$ variable
857:
858: Consider the following unsat instance built from a single variable,
859: \begin{equation}
860: {\bf I}_1 = x_1 \wedge \bar x_1 \qquad .
861: \end{equation}
862: The 3--dimensional vector space ${\bf v}_1$ is spanned by vectors $|u\rangle, |t\rangle,
863: |f\rangle$. The evolution matrix reads
864: \begin{equation}
865: {\bf H} = \left( \begin{array}{c c c}
866: 0 & 0 & 0 \\ \frac 12 & 1 & 0 \\ \frac 12 & 0 &1
867: \end{array} \right) \ \hbox{\rm with} \quad
868: |u\rangle = \left( \begin{array}{c} 1 \\ 0 \\ 0 \end{array} \right) \ ,\quad
869: |t\rangle = \left( \begin{array}{c} 0 \\ 1 \\ 0 \end{array} \right) \ ,\quad
870: |f \rangle = \left( \begin{array}{c} 0 \\ 0 \\ 1 \end{array} \right) \ .
871: \end{equation}
872: Entries can be interpreted as follows.
873: Starting from the $u$ state, variable $x_1$ will be set through unit-propagation to $t$
874: or $f$ with equal probabilities: $\langle t |{\bf H }|
875: u \rangle =\langle f |{\bf H} | u \rangle
876: =1/2$. Once the variable has reached this state, the instance is violated:
877: $\langle t |{\bf H} | t \rangle =\langle f |{\bf H} | f \rangle =1$. All other entries are null.
878: In particular, state $u$ can never be reached from any state,
879: so the first line of the matrix is filled in with zeroes: $\langle u |{\bf
880: H}| s \rangle =0,
881: \forall s$. Function (\ref{funcq}) is easily calculated
882: \begin{equation}
883: B (T) = \left( \begin{array}{c} 1 \\ 1 \\ 1 \end{array} \right)
884: ^\dagger \;.\; {\bf H}
885: ^T\; . \; \left( \begin{array}{c} 1 \\ 0 \\ 0 \end{array} \right) =
886: 1\ , \quad \forall\ T\ge 0 \ .
887: \end{equation}
888: Therefore, $T^*=B^*=1$. Indeed, refutation is obtained without
889: any split, and the search tree involves a unique branch of length 1
890: (Figure~\ref{tree-exemple}A).
891: \end{exmp}
892:
893: Our next example is a 2-SAT instance whose refutation requires to split one
894: variable.
895:
896: \begin{exmp} Instance over $N=2$ variables, with a unique refutation tree.
897:
898: \begin{equation}
899: {\bf I}_2 =(x_1 \vee x_2) \wedge (\bar x_1 \vee x_2) \wedge
900: (x_1 \vee \bar x_2 ) \wedge (\bar x_1 \vee \bar x_2)
901: \end{equation}
902: The evolution matrix $\bf H$ is a $9\times 9$ matrix with 16 non zero entries,
903: \begin{eqnarray}
904: \langle s,u | {\bf H} | u,u \rangle &=&
905: \langle u,s | {\bf H} | u,u \rangle = \frac 12 \ , \qquad \forall\ s=t,f
906: \label{m21} \\ \label{m22}
907: \langle s,s' | {\bf H} | s,u \rangle &=&
908: \langle s,s' | {\bf H} | u,s' \rangle = \frac 12 \ , \qquad \forall\ s,s'=t,f\\
909: \label{m23}
910: \langle s',s | {\bf H} | s',s \rangle &=& 1\ , \qquad \forall \ s,s'=t,f \quad .
911: \end{eqnarray}
912: We now explain how these matrix elements were obtained. From the
913: undetermined state $|u,u\rangle$, any of the four clause can be chosen
914: by the heuristic. Thus, any of the two literals $x_1$, $x_2$ has a
915: probability $1/2$ to be chosen: $h(1|u,u)=h(2|u,u)=\frac 12$.
916: Next, unit-propagation will set the unassigned variable to true, or
917: false with equal
918: probabilities $1/2$ (\ref{m22}). Finally, entries corresponding to
919: violating states in eqn (\ref{m23}) are calculated according to
920: rule (\ref{defmu}).
921:
922: The branch function $B(T)$ equals 1 for $T=0$, 2 for any $T\ge 1$;
923: thus, $T^*=1$ and $B^*=2$, in agreement with the
924: associated search tree symbolized in Figure~\ref{tree-exemple}B.
925: \end{exmp}
926:
927: We now introduce an instance with a non unique refutation tree.
928:
929: \begin{exmp} Instance with $N=3$ variables, and two refutation trees.
930: \begin{equation}
931: {\bf I}_3 =(x_1 \vee x_2) \wedge (\bar x_1 \vee x_2) \wedge
932: (x_1 \vee \bar x_2 ) \wedge (\bar x_1 \vee \bar x_2) \wedge
933: (x_3 \vee \bar x_3)
934: \end{equation}
935: Notice the presence of a (trivial) clause containing opposite
936: literals, which allows
937: us to obtain a variety in the search trees without considering more than three
938: variables. The evolution matrix $\bf H$ is a $27\times 27$ matrix with 56 non
939: zero entries (for the GUC heuristic),
940:
941: \begin{eqnarray} \label{m31}
942: \langle s,u,u |{\bf H} | u,u,u \rangle &=&
943: \langle u,s,u | {\bf H} | u,u,u \rangle = \frac 25 \ , \qquad \forall\ s=t,f \\
944: \label{m32}
945: \langle u,u,s | {\bf H} | u,u,u \rangle &=& \frac 1{5} \ , \qquad \forall\ s=t,f \\
946: \langle s,s',s'' | {\bf H} | s,u,s'' \rangle &=&
947: \langle s,s',s'' | {\bf H} | u,s',s'' \rangle = \frac 1{2} \ , \quad \forall\ s,s'=t,f;
948: s''=u,t,f \nonumber \\
949: \langle s',u,s | {\bf H} | u,u,s \rangle &=&
950: \langle u,s',s | {\bf H} | u,u,s \rangle = \frac 1{2} \ , \qquad \forall\ s,s'=t,f
951: \nonumber \\
952: \langle s,s',s'' | {\bf H} | s,s',s'' \rangle &=&1 \ , \qquad \forall\ s,s'=t,f;
953: \ s''=u,t,f \nonumber
954: \end{eqnarray}
955: The first split variable is $x_3$ if the last clause is chosen (probability $1/5$),
956: or $x_1$ or $x_2$ otherwise (with probability $2/5$ each), leading to
957: expressions (\ref{m31}) and (\ref{m32}). The remaining entries of $\bf H$ are
958: obtained in the same way as explained in Example 10.
959:
960: We obtain $B(0)=1$, $B(1)=2$ and $B(T\ge 2)=12/5$.
961: Therefore, $T^*=2$ and
962: \begin{equation}
963: B^* = \frac{12}5 = \frac 45 \times 2 + \frac 15 \times 4 \qquad ,
964: \end{equation}
965: where the different contributions to $B^*$ and their probabilities are explicitely
966: written down, see Figures~\ref{tree-exemple}B and \ref{tree-exemple}C.
967: \end{exmp}
968:
969: \begin{center}
970: \begin{figure}
971: \includegraphics[height=150pt,angle=0]{fig7.eps}
972: \vskip .5cm
973: \caption{Refutation search trees associated to instances ${\bf I}_1$,
974: ${\bf I}_2$ and ${\bf I}_3$. Grey and black nodes correspond to
975: variables assigned through unit-propagation and split respectively, as
976: in Figure~\ref{struct}. {\bf A}. Example 9: refutation of instance
977: ${\bf I}_1$ is obtained as a result of unit-propagation. The size
978: (number of leaves) of the search tree is $B=1$. {\bf B}. Example 10:
979: search tree generated by DPLL on instance ${\bf I}_2$. The black and
980: grey node correspond to the split of $x_1$ and unit-propagation over
981: $x_2$, or vice-versa. The size of the tree is $B=2$. {\bf C}. Example
982: 11: search tree corresponding to the instance ${\bf I}_3$ when DPLL
983: first splits variable $x_3$. The size of the tree is $B=4$. If the
984: first split variable is $x_1$ or $x_2$, the refutation search tree of
985: instance ${\bf I}_3$ corresponds to case {\bf B}. }
986: \label{tree-exemple}
987: \end{figure}
988: \end{center}
989:
990:
991:
992: \subsection{Dynamical annealing approximation}
993:
994: Let us denotes by $\overline q$ the expectation value of a
995: function $q$ of the instance ${\bf I}$ over the random 3-SAT distribution,
996: at given numbers of variable, $N$, and clauses, $\alpha \, N$.
997: From Theorem 6, the expectation value of the size of the refutation tree is
998: \begin{equation}
999: B^*(\alpha, N) \equiv \overline{ B^* }=
1000: \langle \Sigma | \overline{ {\bf H} ^N} | U \rangle
1001: \quad .
1002: \end{equation}
1003: Calculation of the expectation value of the $N^{th}$ power of ${\bf H}$ is
1004: a hard task that we were unable to perform for large sizes $N$. We therefore turned
1005: to a simplifying approximation, hereafter called dynamical annealing.
1006: This approximation is not thought to be justified in general, but may be
1007: asymptotically exact in some limiting cases we will expose later on.
1008:
1009: A first temptation is to approximate the expectation of the $N^{th}$ power of ${\bf H}$
1010: with the $N^{th}$ power of the expectation of ${\bf H}$. This is however too a brutal
1011: approximation to be meaningful, and a more refined scheme is needed.
1012:
1013: \begin{defn} Clause projection operator
1014:
1015: Consider an instance ${\bf I}$ of the 3-SAT problem. The clause vector
1016: $\vec C (S)$ of a partial state $S$ is a three dimensional vector
1017: $\vec C = (C_1, C_2, C_3)$ where $C_j$ is the number of undetermined
1018: clauses of ${\bf I}$ of type $j$. The clause projection operator,
1019: ${\bf P} ({\vec C})$, is the operator acting on $\bf V$ and
1020: projecting onto the subspace of partial state
1021: vectors with clause vectors $\vec C$,
1022: \begin{equation}
1023: {\bf P} ({\vec C}) \; |S\rangle = \left[ \prod _{j=1}^3 \delta _{ C_j - C_j(S)}\right] \
1024: |S\rangle \quad ,
1025: \end{equation}
1026: where $\delta$ is the Kronecker function. The sum of all state
1027: vectors in the spanning basis with clause vector $\vec C$ is denoted by
1028: $| \Sigma (\vec C) \rangle ={\bf P} ({\vec C})\; | \Sigma \rangle$.
1029: The sum of all state
1030: vectors in the spanning basis with clause vector $\vec C$ and
1031: $U$ undetermined variables is denoted by
1032: $| \Sigma _U (\vec C) \rangle$.
1033: \end{defn}
1034:
1035: It is an easy check that ${\bf P}$ is indeed a projection operator:
1036: ${\bf P}^2({\vec C})={\bf P}({\vec C})$.
1037: As the set of partial states can be
1038: partitioned according to their clause vectors,
1039: \begin{equation} \label{identriv}
1040: \sum _{\vec C} {\bf P} ({\vec C}) = \sum _{\vec C} {\bf P}^2 ({\vec C})
1041: ={\bf 1} \quad .
1042: \end{equation}
1043: We now introduce the clause vector-dependent branch function
1044: \begin{equation}
1045: B ( \vec C , T) = \langle \Sigma (\vec C) | {\bf H}^T |U \rangle
1046: \quad .
1047: \end{equation}
1048: Summation of the $B$'s over all $\vec C$ gives back function (\ref{funcq})
1049: from identity (\ref{identriv}).
1050: The evolution equation for $B ( \vec C , T)$ is,
1051: \begin{eqnarray} \label{ap2}
1052: B ( \vec C , T+1) &=& \langle \Sigma (\vec C)| {\bf H} \;
1053: \times {\bf H}^T |U\rangle
1054: \nonumber \\
1055: &=& \langle \Sigma (\vec C)| {\bf H} \times
1056: \left( \sum _{\vec C'} {\bf P} ^2 ({\vec C'}) \right) \times {\bf H}^T |U\rangle
1057: \nonumber \\
1058: &=& \sum _{\vec C'}\; \langle \Sigma (\vec C)| {\bf H} \times
1059: {\bf P} ({\vec C'}) \times \left( \sum _S |S\rangle \langle S|
1060: \right)\times {\bf P} ({\vec C'}) \times {\bf H}^T |U\rangle
1061: \nonumber \\
1062: &=& \sum _{\vec C'} \sum _{S} \langle \Sigma (\vec C)|
1063: {\bf H} \times {\bf P} ({\vec C'}) | S\rangle \; \langle S|
1064: {\bf P} ({\vec C'})\times {\bf H}^T |U\rangle
1065: \end{eqnarray}
1066: where we have made use of identities (\ref{closure}) and (\ref{identriv}).
1067: We are now ready to do the two following approximation steps:
1068:
1069:
1070: \begin{approks} Dynamical annealing (step A)
1071:
1072: Substitute in equation (\ref{ap2}) the partial state vector
1073: \begin{equation}
1074: {\bf P} (\vec C')\, |S\rangle \quad \hbox{ with} \quad
1075: \frac{1}{\langle \Sigma | \Sigma _{N-T} (\vec C') \rangle} \;
1076: |\Sigma _{N-T} (\vec C') \rangle \quad ,
1077: \end{equation}
1078: that is, with its average over the set of basis vectors with clause
1079: vector $\vec C'$ and $N-T$ undetermined variables.
1080: \end{approks}
1081:
1082: Following step A, equation (\ref{ap2}) becomes an approximated
1083: evolution equation for $B$,
1084: \begin{equation} \label{ap3}
1085: B ( \vec C , T+1) = \sum _{\vec C'} {\bf \hat H} [ \vec C , \vec C' ;T]
1086: \; B ( \vec C , T) \quad ,
1087: \end{equation}
1088: where the new evolution matrix ${\bf \hat H}$, not to be confused
1089: with ${\bf H}$, is
1090: \begin{equation} \label{ap4}
1091: {\bf \hat H} [ \vec C , \vec C' ;T ] = \frac{ \langle \Sigma (\vec C)|
1092: {\bf H} | \Sigma _{N-T}(\vec C') \rangle} { \langle \Sigma |
1093: \Sigma _{N-T}(\vec C') \rangle}
1094: \quad .
1095: \end{equation}
1096: Then,
1097:
1098:
1099: \begin{approks} Dynamical annealing (step B)
1100:
1101: Substitute in equation (\ref{ap3}) the evolution matrix ${\bf \hat H}$ with
1102: \begin{equation}
1103: {\bf \bar H} [ \vec C , \vec C' ;T] = \frac{ \overline{\langle \Sigma
1104: (\vec C)| {\bf H} | \Sigma _{N-T}(\vec C') \rangle}}{\overline{\langle
1105: \Sigma | \Sigma _{N-T}(\vec C') \rangle} }
1106: \end{equation}
1107: that is, consider the instance ${\bf I}$ is redrawn at each time
1108: step $T\to T+1$, keeping information about clause vectors at time $T$
1109: only.
1110: \end{approks}
1111:
1112: Let us interpret what we have done so far. The quantity we focus on
1113: is $\bar B(\vec C;T+1)$, the expectation number of branches at depth
1114: $T$ in the search tree (Figure~\ref{struct}) carrying partial states
1115: with clause vector
1116: $\vec C = (C_1,C_2,C_3)$. Within the dynamical annealing approximation,
1117: the evolution of the $\bar B$'s is Markovian,
1118: \begin{equation}
1119: \label{bradp}
1120: \bar B(\vec C;T+1)=\sum_{\vec C'}\;{\bf \bar H}\;[\vec C,\vec C';T]\;
1121: \bar B(\vec C';T) \ .
1122: \end{equation}
1123: The entries of the evolution matrix ${\bf \bar H}[\vec C, \vec C';T]$
1124: can be interpreted as the average number of branches with clause vector
1125: $\vec C$ that DPLL will generate through the assignment of one variable
1126: from a partial assignment (partial state) of variables
1127: with clause vector $\vec C'$.
1128:
1129: For the GUC heuristic, we find \citep{Coc},
1130: \begin{eqnarray}
1131: \label{bbra}
1132: && {\bf\bar H} [\vec C,\vec C'; T] = {C_3' \choose C_3'-C_3} \; \left
1133: (\frac 3{N-T}\right)^{C_3'-C_3} \; \left(1-\frac 3{N-T}\right)^{C_3}
1134: \times \nonumber \\ &&\qquad \qquad \qquad \qquad \sum_{w_2=0}^{C_3'-C_3} \left (\frac
1135: 1 2 \right)^{C_3'-C_3} {C_3'-C_3\choose w_2}\times \nonumber \\
1136: && \left \{(1 - \delta _{C'_1} ) \; \left( 1 - \frac1{2(N-T)} \right)^{C'_1-1}
1137: \sum_{z_2=0}^{C_2'} {C_2' \choose z_2} \left (\frac
1138: 2{N-T}\right)^{z_2} \right. \times \nonumber \\ &&
1139: \left (1- \frac 2{N-T}\right)^{C_2'-z_2}
1140: \sum_{w_1=0}^{z_2} \left (\frac 1 2
1141: \right)^{z_2} {z_2\choose w_1}\; \delta_{C_2-C_2'-w_2+z_2}
1142: \;\delta_{C_1-C_1'-w_1+1} + \nonumber \\&&
1143: \delta_{C_1'}
1144: \sum_{z_2=0}^{C_2'-1} {C_2'-1 \choose z_2} \left (\frac
1145: 2{N-T}\right)^{z_2}\, \left( 1- \frac 2{N-T}\right)^{C_2'-1-z_2}
1146: \times \nonumber \\ && \left.
1147: \sum_{w_1=0}^{z_2} \left(\frac 1 2 \right)^{z_2} {z_2\choose w_1}\;
1148: \delta_{C_2-C_2'-w_2+z_2+1} \; [\delta_{C_1-w_1} + \delta_{C_1-1-w_1}]
1149: \right\} \ ,
1150: \end{eqnarray}
1151: where $\delta _X$ denotes the Kronecker delta function over integers $X$:
1152: $\delta _X=1$ if $X=0$, $\delta _X=0$ otherwise. Expression (\ref{bbra}) is
1153: easy to obtain from the interpretation following equation (\ref{bradp}).
1154:
1155: \subsection{Generating functions and asymptotic scalings at large $N$}
1156:
1157: Let us introduce the generating function $G (\,{\vec y}\,;T\,)$ of
1158: the average number of branches $\bar B(\,{\vec C}\,;T\,)$
1159: where $\vec y\equiv(y_1,y_2,y_3)$, through
1160: \begin{equation}
1161: \label{gener}
1162: G (\,{\vec y}\,;T\,) =\sum_{\vec C}\; e^{\,{\vec y} \cdot
1163: {\vec C}}\; \bar B(\,{\vec C}\,,T\,)\quad ,
1164: \quad
1165: {\vec y} \cdot {\vec C} \equiv \sum _{j=1}^3 y_j \, C_j \quad .
1166: \end{equation}
1167: Evolution equation (\ref{ap3}) for the $\bar B$'s can be rewritten in
1168: term of the generating function $G$,
1169: \begin{eqnarray}
1170: \label{eqev}
1171: G(\,{\vec y}\,;T+1\,) &=& e^{-\gamma_1({\vec y})} \; G \big(
1172: \,{\vec \gamma}({\vec y})\,; T\,\big) + \nonumber \\
1173: && \left ( e^{- \gamma_2({\vec y})}
1174: (e^{y_1}+1) - e^{- \gamma_1({\vec y})}\right) \;
1175: G \big(-\infty,\, \gamma_2({\vec y}),\, \gamma_3({\vec y})\,; T\, \big)
1176: \end{eqnarray}
1177: where ${\vec \gamma}$ is a vectorial function of argument ${\vec y}$ whose
1178: components read
1179: \begin{eqnarray}
1180: \gamma_1 ({\vec y}) &=&y_1+\ln\left[1-\frac 1 {2(N-T)} \right]
1181: \quad , \nonumber\\
1182: \gamma_2({\vec y})&=&y_2+\ln\left[1+\frac 2 {N-T} \left (\frac{e^{-y_2}}{2}
1183: \left(1+ e^{y_1}\right) -1\right)\right]
1184: \quad , \nonumber \\
1185: \gamma_3 ({\vec y})&=&y_3+\ln\left[1+\frac 3 {N-T} \left (\frac{e^{-y_3}}{2}
1186: \left(1+ e^{y_2}\right) -1\right)\right] \quad .
1187: \end{eqnarray}
1188: To solve equation (\ref{eqev}), we infer the large $N$ behaviour of
1189: $G$ from the following remarks:
1190:
1191: \begin{enumerate}
1192: \item Each time DPLL assigns
1193: variables through splitting or unit-propagation, the numbers $C_j$ of
1194: clauses of length $j$ undergo $O(1)$ changes. It is thus sensible to
1195: assume that, when the number of assigned variables increases from $T_1 = t\,
1196: N$ to $T_2=t\, N + \Delta T$
1197: with $\Delta T $ very large but $o(N)$ e.g. $\Delta T = \sqrt N$,
1198: the densities $c_2=C_2/N$ and $c_3=C_3/N$ of 2- and 3-clauses have
1199: been modified by $o(1)$.
1200:
1201: \item On the same time interval $T_1<T<T_2$, we expect
1202: the number of unit-clauses $C_1$ to vary at each time step. But its
1203: distribution $\rho (C_1|c_2,c_3;t)$, conditioned to the densities
1204: $c_2$, $c_3$ and the reduced time $t$, should reach some well defined limit
1205: distribution. This claim is a generalization of the result obtained
1206: by \citet{Fri} for the analysis of the GUC heuristic in the absence
1207: of backtracking.
1208:
1209: \item As long as a partial state does not violate the instance,
1210: very few unit-clauses
1211: are generated, and splitting frequently occurs. In other words, the
1212: probability that $C_1=0$ is strictly positive as $N$ gets
1213: large.
1214:
1215: \end{enumerate}
1216:
1217: The above arguments entice us to make the following
1218:
1219: \begin{claim} Asymptotic expression for the generating function $G$
1220:
1221: For large $N,T$ at fixed ratio $t=T/N$, the generating function
1222: (\ref{gener}) of the average numbers $\bar B$ of branches
1223: is expected to behave as
1224: \begin{equation}
1225: \label{scalinghyp2}
1226: G(y_1,y_2, y_3;t\,N ) = \exp \bigg[\, N \, \varphi ( y_2, y_3 ; t)
1227: + \psi ( y_1,y_2, y_3 ; t) + o(1) \bigg] \quad .
1228: \end{equation}
1229: \end{claim}
1230:
1231: Hypothesis (\ref{scalinghyp2}) expresses in a concise way some important
1232: information on the distribution of clause populations
1233: during the search process that we now extract.
1234: Call $\omega$ the Legendre transform of $\varphi$,
1235: \begin{equation}
1236: \omega (\,c_2,c_3\,;t\,) = \min _{y_2,y_3} \bigg[ \varphi
1237: (\, y_2,y_3 \, ; t\, ) - y_2\, c_2 - y_3\, c_3\bigg]
1238: \qquad . \label{inversion}
1239: \end{equation}
1240: Then, combining equations (\ref{gener}), (\ref{scalinghyp2}) and
1241: (\ref{inversion}), we obtain
1242: \begin{equation}
1243: \label{scalinghyp}
1244: \sum _{C_1\ge 0}\rho (C_1|c_2,c_3;t)\; \bar B(C_1,c_2\,N,
1245: c_3\, N; t\, N ) \asymp \exp \big[\,N \, \omega ( c_2,c_3 ; t\,)
1246: \big] \quad ,
1247: \end{equation}
1248: up to non exponential in $N$ corrections. In other words,
1249: the expectation value of the number of branches carrying partial states
1250: with $(1-t)\, N$ undetermined variables and
1251: $c_j\,N$ $j$-clauses ($j=2,3$) scales exponentially with $N$,
1252: with a growth function $\omega (c_2,c_3;t)$ related to $\varphi (y_2,y_3;t)$
1253: through identity (\ref{inversion}).
1254: Moreover, $\varphi (0,0;t)$ is the logarithm of the
1255: number of branches (divided by $N$) after a fraction $t$ of variables
1256: have been assigned. The most probable values of the densities $c_j(t)$
1257: of $j$-clauses are then obtained from the partial derivatives of
1258: $\varphi$: $c_j(t)=\partial \varphi /\partial y_j (0,0)$ for $j=2,3$.
1259:
1260: Let us emphasize that $\varphi$ in equation (\ref{scalinghyp2}) does
1261: not depend on $y_1$. This hypothesis simply expresses that, as far as
1262: non violating partial states are concerned,
1263: both terms on the right hand side of (\ref{eqev}) are of the same order,
1264: and that the density of unit-clauses, $c_1=\partial \varphi/\partial y_1$,
1265: identically vanishes.
1266:
1267: Similarly, function $\psi (y_1,y_2,y_3;t)$ is related to the generating
1268: function of distribution $\rho (C_1|c_2,c_3;t)$,
1269: \begin{equation} \label{psigen}
1270: \sum _{C_1\ge 0} \rho (C_1|c_2,c_3;t)\, \e^{y_1\, C_1} =
1271: e^{\psi (y_1,y_2,y_3;t) - \psi(0,y_2,y_3;t)} \quad ,
1272: \end{equation}
1273: where $c_j=\partial \varphi /\partial y_j (y_2,y_3;t)$ ($j=2,3$)
1274: on the left hand side of the above formula.
1275:
1276: Inserting expression (\ref{scalinghyp2}) into the evolution equation
1277: (\ref{eqev}), we find
1278: \begin{eqnarray} \label{mdar}
1279: \frac{\partial \varphi } {\partial t} (y_2,y_3;t) &=&
1280: -y_1 + \frac 2{1-t} \left[ e^{-y_2} \left( \frac{1+e^{y_1}}2 \right)
1281: -1 \right] \frac{\partial \varphi } {\partial y_2} (y_2,y_3;t)
1282: \nonumber \\
1283: &+& \frac 3{1-t} \left[ e^{-y_3} \left( \frac{1+e^{y_2}}2 \right)
1284: -1 \right] \frac{\partial \varphi } {\partial y_3} (y_2,y_3;t)
1285: \nonumber \\
1286: &+& \ln \left[ 1+ K(y_1,y_2) \; e^{\psi (-\infty,y_2,y_3;t)
1287: - \psi(y_1,y_2,y_3;t)} \right]
1288: \end{eqnarray}
1289: where $K(y_1,y_2)=e^{-y_2}(e^{2 \, y_1} + e^{y_1}) -1$.
1290: As $\varphi$ does not depend upon $y _1$, the latter may be chosen
1291: at our convenience e.g. to cancel $K$ and the contribution from the last
1292: term in equation (\ref{mdar}),
1293: \begin{equation}
1294: y_1 =
1295: Y_1 (y_2 ) \equiv y_2 - \ln \left(\frac{ 1 +\sqrt{1+ 4 e^{y_2}}}2 \right)
1296: \quad .
1297: \end{equation}
1298: Such a procedure, sometimes called kernel method and, to our knowledge,
1299: first proposed by \citet{knu}, is correct in the major
1300: part of the $y_2,y_3$ space and, in particular, in the vicinity of
1301: $(0,0)$ we focus on in this paper\footnote{It has however to be
1302: to modified in a small region of the $y_2,y_3$ space; a complete
1303: analysis of this case was carried out by \citet{Coc}.}.
1304: We end up with the following partial differential equation (PDE) for
1305: $\varphi$,
1306: \begin{equation} \label{croi2}
1307: \frac{\partial \varphi } {\partial t} (y_2,y_3;t) = { H} \left[
1308: \frac{\partial \varphi } {\partial y_2} , \frac{\partial \varphi }
1309: {\partial y_3 } , y_2, y_3 , t \right] \quad ,
1310: \end{equation}
1311: where ${H}$ incorporates the details of the splitting
1312: heuristic\footnote{For the UC heuristic,
1313: \begin{equation}
1314: {H} _{UC} = {\ln2} + \frac {3\, c_3}{1-t}\; \left[ e^{-y_3}\left(
1315: \;\frac{1+e^{y_2}}{2}\right) -1 \right]+
1316: \frac{c_2}{1-t} \; \left( \frac 32 e^{-y_2}
1317: -2 \right) \qquad .
1318: \end{equation}},
1319: \begin{eqnarray}
1320: {H} _{GUC} [c_2 ,c_3, y_2, y_3 , t ] &=&
1321: -Y_1(y_2) +\frac {3\, c_3}{1-t}\; \left[ e^{-y_3}\;\left(
1322: \frac{1+e^{y_2}}{2}\right) -1 \right]\nonumber \\
1323: &+& \frac{c_2}{1-t} \; \left( e^{ -Y_1(y_2)}
1324: -2 \right) \qquad .
1325: \end{eqnarray}
1326: We must therefore solve the partial differential equation (PDE) (\ref{croi2})
1327: with the initial condition,
1328: \begin{equation}\label{initphi}
1329: \varphi(y_2,y_3,t=0) = \alpha _0 \; y_3 \quad ,
1330: \end{equation}
1331: obtained through inverse Legendre transform (\ref{inversion}) of the
1332: initial condition over $\bar B$, or equivalently over $\omega$,
1333: \begin{equation}
1334: \omega (c_2,c_3;t=0)= \left\{ \begin{array} {c c c}
1335: 0 &\hbox{\rm if} & c_3=\alpha _0\ ,\nonumber \\ -\infty
1336: &\hbox{\rm if} & c_3\ne\alpha _0 \ .\end{array}
1337: \right . \label{condi}
1338: \end{equation}
1339:
1340:
1341: \begin{center}
1342: \begin{figure}
1343: \includegraphics[height=200pt,angle=-90]{fig8a.ps}
1344: \includegraphics[height=200pt,angle=-90]{fig8b.ps}
1345: $^{\ }$\hskip 3cm \includegraphics[height=200pt,angle=-90]{fig8c.ps}
1346: \vskip .5cm
1347: \caption{Snapshots of the surface $\omega (p,\alpha;t )$ for $\alpha
1348: _0=10$ at three different times {\em i.e.} depths in the tree,
1349: $t=0.01$, 0.05 and 0.09 (from left to right, top to
1350: down). The height $\omega ^*(t)$ of the top of the surface, with
1351: coordinates $p^*(t), \alpha^*(t)$, is the logarithm (divided by $N$) of the
1352: number of branches. The coordinates $(p^*(t),\alpha ^*(t))$ define the tree
1353: trajectory shown in Figure~\ref{diag}. The halt line is hit at $t_h \simeq 0.094$.
1354: Note the overall growth of the surface $\omega (p,\alpha;t)$ with time
1355: (beware of the change of scales between figures).}
1356: \label{dome}
1357: \end{figure}
1358: \end{center}
1359:
1360:
1361: \subsection{Interpretation in terms of growth process}
1362:
1363: We can interpret the dynamical annealing approximation made in the previous
1364: paragraphs, and the resulting PDE (\ref{croi2}) as a description of
1365: the growth process of the search tree resulting from DPLL operation.
1366: Using Legendre transform (\ref{inversion}), PDE (\ref{croi2}) can
1367: be written as an evolution equation for the
1368: logarithm $\omega(c_2,c_3,t)$ of the average number of branches with
1369: parameters $c_2,c_3$ as the depth $t=T/N$ increases,
1370: \begin{equation}
1371: \frac{\partial \omega } {\partial t} (c_2,c_3,t) = { H} \left[ c_2, c_3,
1372: -\frac{\partial \omega } {\partial c_2} , -\frac{\partial \omega }
1373: {\partial c_3 } ,t \right] \qquad . \label{croi}
1374: \end{equation}
1375: Partial differential equation (PDE) (\ref{croi}) is
1376: analogous to growth processes encountered in statistical physics
1377: \citep{Gro}. The surface $\omega$, growing with ``time'' $t$ above the
1378: plane $c_2,c_3$, or equivalently from (\ref{change}), above the plane
1379: $p,\alpha$ (Figure~\ref{dome}), describes the whole distribution of branches.
1380: The average number of branches at depth $t$ in the tree equals
1381: \begin{equation}
1382: B(t) = \int _0 ^1 dp\; \int _0 d\alpha \; e^{N\, \omega (p,\alpha;t)} \asymp
1383: e^{N\, \omega ^* (t)} \quad ,
1384: \end{equation}
1385: where $\omega ^* (t)$ is the maximum over $p,\alpha$ of $\omega (p,\alpha;t)$
1386: reached in $p^*(t), \alpha^*(t)$.
1387: In other words, the exponentially dominant contribution to $B(t)$
1388: comes from branches carrying 2+p-SAT instances with parameters
1389: $p^*(t), \alpha^*(t)$, that is clause densities
1390: $c^*_2(t)= \alpha^*(t) (1-p^*(t))$, $c^*_3(t)= \alpha^*(t) p^*(t)$.
1391: Parametric plot of $p^*(t),\alpha ^*(t)$ as a function of $t$
1392: defines the tree trajectories on Figure~\ref{diag}.
1393:
1394: The hyperbolic line in Figure~\ref{diag} indicates the halt points, where
1395: contradictions prevent dominant branches from further growing.
1396: Each time DPLL assigns a variable through
1397: unit-propagation, an average number $u(p,\alpha)$ of new 1-clauses is
1398: produced, resulting in a net rate of $u-1$ additional 1-clauses.
1399: As long as $u< 1$, 1-clauses are quickly eliminated and do not
1400: accumulate. Conversely, if $u >1$, 1-clauses tend to accumulate.
1401: Opposite 1-clauses $x$ and $\bar x$ are likely to appear,
1402: leading to a contradiction \citep{Fra,Fri}. The halt line is defined through
1403: $u (p,\alpha)=1$, and reads \citep{Coc},
1404: \begin{equation}
1405: \alpha = \left( \frac{3+\sqrt 5}2 \right)
1406: \ln \left[ \frac{1+\sqrt 5}2 \right]\;\frac 1{1-p}
1407: \qquad .
1408: \end{equation}
1409: It differs from the halt line $\alpha=1/(1-p)$ corresponding to
1410: a single branch \citep{Fri}. As far as dominant branches are concerned,
1411: an alternative and simpler way of obtaining the halt criterion
1412: is through calculation of the probability
1413: $\rho ^*_S (t) \equiv \rho (C_1=0|c^*_2(t),c^*_3(t);t)$ that a split occurs
1414: when a variable is assigned by DPLL,
1415: \begin{equation}
1416: \rho ^*_S (t) = \exp \left(\frac{\partial \varphi}{\partial t}
1417: (0,0;t)\right) -1 \quad ,
1418: \end{equation}
1419: from equations (\ref{psigen},\ref{mdar}).
1420: The probability of split vanishes, and unit-clauses accumulate till
1421: a contradiction is obtained, when the tree stops growing.
1422: Along the tree trajectory, $\omega ^*(t)$ grows thus from 0, on the
1423: right vertical axis, up to some final positive value, $\omega_{THE}$,
1424: on the halt line. $\omega _{THE}$ is our theoretical prediction for the
1425: logarithm of the complexity (divided by $N$)\footnote{Notice that
1426: we have to divide the theoretical value by $\ln 2$ to match the definition
1427: used for numerical experiments; this is done in Table~1}.
1428:
1429:
1430: Equation (\ref{croi}) was solved using the method of characteristics.
1431: Using eqn. (\ref{change}), we have plotted the surface $\omega$ at different
1432: times, with the results shown in Figure~\ref{dome} for
1433: $\alpha _0=10$. Values of $\omega _{THE}$, obtained for
1434: $4.3<\alpha<20$ by solving equation
1435: (\ref{croi}) compare very well with numerical results (Table~1).
1436: We stress that, though our calculation is not rigorous,
1437: it provides a very good quantitative estimate of the complexity.
1438: It is therefore expected that our dynamical annealing approximation
1439: be quantitavely accurate. It is a reasonable conjecture that it becomes
1440: exact at large ratios $\alpha_0$, where PDE (\ref{croi2}) can be exactly solved:
1441:
1442: \begin{conj} Asymptotic equivalent of $\omega$ for large ratios
1443:
1444: Resolution of PDE (\ref{croi}) in the large ratio $\alpha_0$ limit
1445: gives (for the GUC heuristic),
1446: \begin{equation}
1447: \omega _{THE} (\alpha _0 ) \asymp \frac {3+\sqrt{5}}{6 \,\ln 2}\;
1448: \left[ \ln \left( \frac{1+\sqrt 5}{2} \right) \right]^2 \; \frac 1{\alpha_0}
1449: \quad .
1450: \end{equation}
1451: This result exhibits the $1/\alpha _0$ scaling proven by \citet{Bea}, and
1452: is conjectured to be exact.
1453: \end{conj}
1454:
1455: As $\alpha _0$ increases, search trees become smaller and smaller, and
1456: correlations between branches, weaker and weaker, making dynamical
1457: annealing more and more accurate.
1458:
1459: \begin{center}
1460: \begin{figure}
1461: \includegraphics[width=400pt,height=250pt,angle=0]{fig9.eps}
1462: \vskip .3cm
1463: \caption{Detailed structure of the search tree in the
1464: upper sat phase ($\alpha _L < \alpha < \alpha _C$). DPLL starts with a
1465: satisfiable 3-SAT instance and transforms it into a sequence of
1466: 2+p-SAT instances. The leftmost branch in the tree
1467: symbolizes the first descent made by DPLL. Above node $G_0$,
1468: instances are satisfiable while below
1469: $G_1$, instances have no solutions. A grey triangle accounts for the
1470: (exponentially) large refutation subtree that DPLL has to go through
1471: before backtracking above $G_1$ and reaching $G_0$. By definition, the highest
1472: node reached back by DPLL is $G_0$. Further backtracking, below $G_0$, will
1473: be necessary but a solution will be eventually found (right subtree), see
1474: Figure~\ref{trees}C.}
1475: \label{treeinter}
1476: \end{figure}
1477: \end{center}
1478:
1479: \section{Upper phase and mixed branch--tree trajectories.}
1480:
1481: The interest of the trajectory framework proposed in this paper is best seen
1482: in the upper sat phase, that is, for
1483: ratios $\alpha _0$ ranging from $\alpha _L$
1484: to $\alpha _C$. This intermediate region juxtaposes branch and tree
1485: behaviors, see search tree in Figures~\ref{trees}C and \ref{treeinter}.
1486:
1487: The branch trajectory,
1488: started from the point $(p=1,\alpha _0)$ corresponding to the initial
1489: 3-SAT instance, hits the critical line $\alpha_c(p)$
1490: at some point G with coordinates ($p_G,\alpha_G$) after $N\;t_G$ variables
1491: have been assigned by DPLL, see Figure~\ref{diag}. The
1492: algorithm then enters the unsat phase and, with high probability,
1493: generates a 2+p-SAT instance
1494: with no solution. A dense subtree that DPLL has to go through
1495: entirely, forms beyond G till the halt line (left subtree in
1496: Figure~\ref{treeinter}). The size of this subtree
1497: can be analytically predicted from the theory exposed in
1498: Section~3. All calculations are identical, except
1499: initial condition (\ref{initphi}) which has to be changed into
1500: \begin{equation}
1501: \varphi(y_2,y_3,t=0) = \alpha _G \; (1-p_G) \; y_2
1502: +\alpha _G \; p_G \; y_3 \quad .
1503: \end{equation}
1504: As a result we obtain the size $2^{N_G \; \omega _G}$ of the unsatisfiable
1505: subtree to be backtracked (leftmost subtree in Figure~\ref{treeinter}).
1506: $N_G = N\,(1-t_G)$ denotes the number of undetermined variables
1507: at point $G$.
1508:
1509: $G$ is the highest backtracking node in the tree (Figures~\ref{trees}C
1510: and \ref{treeinter}) reached back by
1511: DPLL, since nodes above G are located in the sat phase and carry
1512: 2+p-SAT instances with solutions. DPLL will eventually reach
1513: a solution. The corresponding branch (rightmost
1514: path in Figure~\ref{trees}C) is highly non typical and does not contribute
1515: to the complexity, since almost all branches in the search tree
1516: are described by the tree trajectory issued from G (Figure~\ref{diag}).
1517: We expect that the computational effort DPLL requires to find
1518: a solution will, to exponential order in $N$, be given by the
1519: size of the left unsatisfiable subtree of Figure~\ref{treeinter}.
1520: In other words, massive backtracking will certainly be present
1521: in the right subtree (the one leading to the solution), and no
1522: significant statistical difference is expected between both subtrees.
1523:
1524: We have experimentally checked this scenario for
1525: $\alpha _0=3.5$. The average coordinates of the highest
1526: backtracking node, $(p_G\simeq 0.78, \alpha _G \simeq 3.02$), coincide
1527: with the computed intersection of the single branch trajectory
1528: (Section 2.2) and the estimated critical line $\alpha_c(p)$ \citep{Coc}.
1529: As for complexity, experimental measures of $\omega$ from
1530: 3-SAT instances at $\alpha _0= 3.5$, and of $\omega _G$ from
1531: 2+0.78-SAT instances at $\alpha _G =3.02$, obey the
1532: expected identity
1533: \begin{equation}
1534: \omega _{THE} = \omega _G \times (1-t_G) \quad ,
1535: \end{equation}
1536: and are in very good agreement with theory (Table~1). Therefore,
1537: the structure of search trees corresponding to instances of 3-SAT
1538: in the upper sat regime reflects the
1539: existence of a critical line for 2+p-SAT instances.
1540:
1541:
1542: \section{Conclusions.}
1543:
1544: In this paper, we have exposed a procedure to understand the
1545: complexity pattern of the backtrack resolution of the random
1546: Satisfiability problem (Figure~\ref{sche}). Main steps are:
1547: \begin{enumerate}
1548: \item Identify the space of parameters in which
1549: the dynamical evolution takes place; this space will be generally larger
1550: than the initial parameter space since the algorithm modifies the instance
1551: structure. While the distribution of 3-SAT instances is characterized by
1552: the clause per variable ratio
1553: $\alpha$ only, another parameter $p$ accounting for
1554: the emergence of 2-clauses has to be considered.
1555: \item Divide the parameter space
1556: into different regions (phases) depending on the output
1557: of the resolution e.g. sat/unsat phases for 2+p-SAT.
1558: \item Represent the action of the algorithm as trajectories in this
1559: phase diagram. Intersection of trajectories with the phase boundaries
1560: allow to distinguish hard from easy regimes (Figure~\ref{sche}).
1561: \end{enumerate}
1562:
1563: In addition, we have also presented
1564: a non rigorous study of the search tree growth, which allows us to
1565: accurately estimate the complexity of resolution in presence
1566: of massive backtracking. From a mathematical point of view, it is worth
1567: noticing that monitoring the growth of the search tree requires
1568: a PDE, while ODEs are sufficient to account for the evolution of a single
1569: branch~\citep{Achl}.
1570:
1571: An interesting question raised by this picture is the robustness of
1572: the polynomial/exponential crossover point T (Figure 3). While the
1573: ratio $\alpha_L$ separating easy (polynomial) from hard (exponential)
1574: resolutions depends on the heuristics used by DPLL ($\alpha _L ^{GUC} \simeq
1575: 3.003$, $\alpha _L ^{UC} = 8/3$), T appears to be located at the same
1576: coordinates $(p_T=2/5,\alpha _T =5/3)$ for all three UC, GUC, and
1577: SC$_1$ heuristics. From a technical point of view, the robustness of T
1578: comes from the structure of the ODEs (\ref{ode}). The coordinates of
1579: T, and the time $t_T$ at which the branch trajectory issued from
1580: $(p=1,\alpha _0=\alpha _L)$ hits the critical line $\alpha _C(p)$
1581: tangentially, obey the equations $\rho _1 = \partial \rho _1 /\partial
1582: t = 0$ with $\rho _1 = 1- \alpha(t) \big(1-p(t)\big)$. The set of
1583: ODEs (\ref{ode}), combined with the previous conditions, gives
1584: $p_T=2/5$ \citep{Achl}.
1585:
1586: This robustness explains why the polynomial/exponential crossover
1587: location of critically constrained 2+p-SAT instances, which should
1588: {\em a priori} depend on the algorithm used, was found by
1589: \citet{Sta1} to coincide roughly with the algorithm--independent,
1590: tricritical point on the $\alpha_C(p)$ line.
1591:
1592: Our approach has already been extended to other decision problems, e.g. the
1593: vertex covering of random graphs \citep{Wei} or the coloring
1594: of random graphs \citep{Liat} (see \citep{Cris} for recent rigorous
1595: results on backtracking in this case). It is important to stress that it is
1596: not limited to the determination of the average solving time, but may
1597: also be used to capture its distribution \citep{Gent2,Coc3,Mont}
1598: and to understand the efficiency of restarts techniques \citep{Gomes}.
1599: Finally, we emphasize that theorem 6 relates the computational effort
1600: to the evolution operator representing the elementary steps
1601: of the search heuristic for a {\em given} instance. It is expected
1602: that this approach will be useful to obtain results on the average-case
1603: complexity of DPLL at fixed instance, where the average is performed over the
1604: random choices done by the algorithm only \citep{momo}.
1605:
1606: \begin{center}
1607: \begin{figure}
1608: \includegraphics[height=250pt,angle=0]{fig10.eps}
1609: \caption{Schematic representation of the resolution trajectories in the
1610: sat (branch trajectories symbolized with dashed line) and unsat (tree
1611: trajectories represented by hatched regions) phases. DPLL goes along
1612: branch trajectories in a linear time, but takes an exponential time
1613: to go through tree trajectories. The mixed case
1614: of hard sat instances correspond to the crossing of the boundary separating
1615: the two phases (bold line), which leads to the exploration of unsat subtrees
1616: before a solution is finally found.}
1617: \label{sche}
1618: \end{figure}
1619: \end{center}
1620:
1621:
1622: \begin{ack}
1623: We thank J. Franco for his constant support during
1624: the completion of this work. R. Monasson was in part supported by
1625: the ACI Jeunes Chercheurs ``Algorithmes d'optimisation et
1626: syst\`emes d\'esordonn\'es quantiques'' from the French Ministry of Research.
1627: \end{ack}
1628:
1629: \begin{thebibliography}{}
1630:
1631: \harvarditem{Achlioptas et al.}{2001a}{Achl1}
1632: Achlioptas, D., Kirousis, L., Kranakis, E. and Krizanc, D.
1633: Rigorous results for random (2+p)-SAT, {\em Theor. Comp.
1634: Sci.} {\bf 265}, 109-129 (2001).
1635:
1636: \harvarditem{Achlioptas}{2001b}{Achl}
1637: Achlioptas, D.
1638: Lower bounds for random 3-SAT via differential equations,
1639: {\em Theor. Comp. Sci.} {\bf 265}, 159--185 (2001).
1640:
1641: \harvarditem{Achlioptas, Beame and Molloy}{2001c}{Achl3}
1642: Achlioptas, D., Beame, P. and Molloy, M.
1643: A Sharp Threshold in Proof Complexity.
1644: in {\em Proceedings of STOC 01}, p.337-346 (2001).
1645:
1646: \harvarditem{Beame et al.}{1998}{Bea}
1647: Beame, P., Karp, R., Pitassi, T. and Saks, M.
1648: {\em ACM Symp.\ on Theory of Computing (STOC98)}, 561--571
1649: Assoc. Comput. Mach., New York (1998).
1650:
1651: \harvarditem{Chao and Franco}{1986}{fra2}
1652: Chao, M.T. and Franco, J.
1653: Probabilistic analysis of two heuristics for the 3-satisfiability problem,
1654: {\em SIAM Journal on Computing} {\bf 15}, 1106-1118 (1986).
1655:
1656: \harvarditem{Chao and Franco}{1990}{Fra}
1657: Chao, M.T. and Franco, J.
1658: Probabilistic analysis of a generalization of the unit-clause literal
1659: selection heuristics for the k-satisfiability problem,
1660: {\em Information Science} {\bf 51}, 289--314 (1990).
1661:
1662: \harvarditem{Chv\`atal and Szmeredi}{1988}{Chv}
1663: Chv{\`a}tal, V. and Szmeredi, E.
1664: Many hard examples for resolution,
1665: {\em Journal of the ACM} {\bf 35}, 759--768 (1988).
1666:
1667: \harvarditem{Coarfa et al.}{2000}{Vardi1}
1668: Coarfa, C., Dernopoulos, D.D., San Miguel Aguirre, A., Subramanian, D.
1669: and Vardi, M.Y. Random 3-SAT: The plot thickens. In R. Dechter, editor,
1670: {\em Proc. Principles and Practice of Constraint Programming
1671: (CP'2000)}, Lecture Notes in Computer Science 1894, 143-159 (2000).
1672:
1673: \harvarditem{Cocco and Monasson}{2001}{Coc}
1674: Cocco, S. and Monasson, R.
1675: Trajectories in phase diagrams, growth processes
1676: and computational complexity: how search algorithms solve the
1677: 3-Satisfiability problem, {\em Phys. Rev. Lett.} {\bf 86}, 1654 (2001);
1678: Analysis of the computational complexity of solving random satisfiability
1679: problems using branch and bound search algorithms, {\em Eur. Phys. J. B}
1680: {\bf 22}, 505 (2001).
1681:
1682: \harvarditem{Cocco and Monasson}{2002}{Coc3}
1683: Cocco, S. and Monasson R.
1684: Exponentially hard problems are sometimes polynomial, a large deviation
1685: analysis of search algorithms for the random satisfiability problem,
1686: and its application to stop-and-restart resolutions,
1687: {\em Phys. Rev. E} {\bf 66}, 037101 (2002).
1688:
1689: \harvarditem{Crawford and Auton}{1996}{Cra}
1690: Crawford, J. and Auton, L.
1691: Experimental Results on the Cross-Over Point in Satisfiability Problems,
1692: {\em Proc.\ 11th Natl.\ Conference on Artificial Intelligence
1693: (AAAI-93),} 21--27, The AAAI Press / MIT Press, Cambridge, MA (1993);
1694: {\em Artificial Intelligence} {\bf 81} (1996).
1695:
1696: \harvarditem{Davis, Logemann and Loveland}{1962}{DP}
1697: Davis, M., Logemann, G., Loveland, D.
1698: {A machine program for theorem proving.}
1699: {\em Communications of the ACM} {\bf 5}, 394-397 (1962).
1700:
1701: \harvarditem{Dubois, Boufkhad and Mandler}{2000}{Dub}
1702: Dubois, O., Boufkhad, Y. and Mandler, J.
1703: Typical random 3-SAT formulae and the satisfiability threshold.
1704: {\em SODA}, p. 126-127 (2000).
1705:
1706: \harvarditem{Dubois et al.}{2001}{Dubtcs}
1707: Dubois, O., Monasson, R., Selman, B. and Zecchina, R. (eds)
1708: Phase transitions in combinatorial problems.
1709: {\em Theor. Comp. Sci.} {\bf 265} (2001).
1710:
1711: \harvarditem{Franco}{2001}{Fra2}
1712: Franco, J.
1713: Results related to thresholds phenomena research in satisfiability:
1714: lower bounds.
1715: {\em Theor. Comp. Sci.} {\bf 265}, 147--157 (2001).
1716:
1717: \harvarditem{Friedgut}{1999}{Friedgut}
1718: Friedgut, E.
1719: Sharp thresholds of graph properties, and the k-sat problem,
1720: {\em Journal of the A.M.S.} {\bf 12}, 1017 (1999).
1721:
1722: \harvarditem{Frieze and Suen}{1996}{Fri}
1723: Frieze, A. and Suen, S.
1724: Analysis of two simple heuristics on a random instance of k-SAT,
1725: {\em Journal of Algorithms} {\bf 20}, 312--335 (1996).
1726:
1727: \harvarditem{Gent and Walsh}{1994}{Gent2}
1728: Gent, I.P. and Walsh, T. Easy problems are sometimes hard,
1729: {\em Artificial Intelligence} {\bf 70}, 335-345 (1994).
1730:
1731: \harvarditem{Gent, van Maaren and Walsh}{2000}{Hans}
1732: Gent , I., van Maaren, H. and Walsh, T. (eds).
1733: SAT2000: Highlights of Satisfiability Research in the Year 2000,
1734: {\em Frontiers in Artificial Intelligence and Applications},
1735: vol. 63, IOS Press, Amsterdam (2000).
1736:
1737: \harvarditem{Gomes et al.}{2000}{Gomes}
1738: Gomes, C.P., Selman, B., Crato, N. and Kautz, H. {\em J. Automated
1739: Reasoning} {\bf 24}, 67 (2000).
1740:
1741: \harvarditem{Gu, Purdom, Franco and Wah}{1997}{survey}
1742: Gu, J., Purdom, P.W., Franco, J. and Wah, B.W.
1743: Algorithms for satisfiability (SAT) problem: a survey.
1744: {\em DIMACS Series on Discrete Mathematics
1745: and Theoretical Computer Science} {\bf 35}, 19-151,
1746: American Mathematical Society (1997).
1747:
1748: \harvarditem{Hartmann and Weigt}{2001}{Wei}
1749: Hartmann, A. and Weigt, M.
1750: Typical solution time for a vertex-covering algorithm on
1751: finite-connectivity random graphs,
1752: {\em Phys. Rev. Lett.} {\bf 86}, 1658 (2001).
1753:
1754: \harvarditem{Hogg, Huberman and Williams}{1996}{AI}
1755: Hogg, T., Huberman, B.A. and Williams, C. (eds).
1756: Frontiers in problem solving: phase transitions and complexity.
1757: {\em Artificial Intelligence} {\bf 81} I \& II (1996).
1758:
1759: \harvarditem{Jia and Moore}{2003}{Cris}
1760: Jia, H. and Moore, C.
1761: How much backtracking does it take to 3-color a random graph?
1762: {\em preprint} (2003).
1763:
1764: \harvarditem{Kaporis, Kirousis and Lalas}{2002}{Kap}
1765: Kaporis, A.C., Kirousis, L.M. and Lalas, E.G.
1766: The Probabilistic Analysis of a Greedy Satisfiability Algorithm.
1767: {\em ESA}, p. 574-585 (2002).
1768:
1769: \harvarditem{Knuth}{1968}{knu}
1770: Knuth, D.E.
1771: The art of computer programming, vol. 1: Fundamental algorithms,
1772: Section 2.2.1, Addison-Wesley, New York (1968).
1773:
1774: \harvarditem{Ein-Dor and Monasson}{2003}{Liat}
1775: Ein-Dor, L. and Monasson, R.
1776: The dynamics of proving uncolorability of large random graphs.
1777: {\em J. Phys. A} {\bf 36} 11055 (2003).
1778:
1779: \harvarditem{McKane, Droz, Vannimenus and Wolf}{1995}{Gro}
1780: McKane, A. Droz, M. Vannimenus, J. and Wolf D. (eds),
1781: {Scale invariance, interfaces, and non--equilibrium dynamics},
1782: {\em Nato Asi Series B: Physics}, vol. 344, Plenum Press, New-York (1995).
1783:
1784: \harvarditem{Mitchell, Selman and Levesque}{1992}{Mit}
1785: Mitchell, D., Selman, B. and Levesque, H.
1786: Hard and Easy Distributions of SAT Problems,
1787: {\em Proc.\ of the Tenth Natl.\ Conf.\ on Artificial Intelligence
1788: (AAAI-92)}, 440-446,
1789: The AAAI Press / MIT Press, Cambridge, MA (1992).
1790:
1791: \harvarditem{Monasson et al.}{1999}{Sta1}
1792: Monasson, R., Zecchina, R., Kirkpatrick, S., Selman, B. and Troyansky, L.
1793: Determining computational complexity from characteristic 'phase transitions'.
1794: {\em Nature} {\bf 400}, 133--137 (1999);
1795: 2+p-SAT: Relation of Typical-Case Complexity to the Nature of
1796: the Phase Transition,
1797: {\em Random Structure and Algorithms} {\bf 15}, 414 (1999).
1798:
1799: \harvarditem{Monasson}{2003}{momo}
1800: Monasson, R. On the analysis of backtrack procedures for the coloring of
1801: random graphs. {\em preprint} (2003).
1802:
1803: \harvarditem{Montanari and Zecchina}{2002}{Mont}
1804: Montanari, A. and Zecchina, R.
1805: Optimizing searches via rare events.
1806: {\em Phys. Rev. Lett.} {\bf 88}, 178701 (2002)
1807:
1808: \harvarditem{Selman and Kirkpatrick}{1994}{Kir}
1809: Selman, B. and Kirkpatrick, S.
1810: Critical Behavior in the Satisfiability of Random Boolean Expressions.
1811: {\em Science} {\bf 264}, 1297--1301 (1994).
1812:
1813: \harvarditem{Wormald}{1995}{Worm}
1814: Wormald, N.
1815: Differential equations for random processes and random graphs.
1816: {\em Ann. Appl. Probab.} {\bf 5}, 1217-1235 (1995).
1817:
1818: \end{thebibliography}
1819:
1820: \end{document}
1821:
1822: