0401:cs0401011/paper.tex

1: \documentclass{elsart}

2: \usepackage{natbib}

3: \usepackage{epsfig}

4: \usepackage{amssymb}

5: \newtheorem{approks}[thm]{Approximation}

6:

7: \begin{document}

8:

9: \begin{frontmatter}

10:

11: \title{Heuristic average-case analysis of the backtrack resolution

12: of random 3-Satisfiability instances.}

13:

14: \author{Simona Cocco}

15: \address{CNRS-Laboratoire de Dynamique des Fluides Complexes,

16: 3 rue de l'universit\'e, 67000 Strasbourg, France}

17: \ead{cocco@ldfc.u-strasbg.fr}

18: \ead[url]{http://ludfc39.u-strasbg.fr/cocco/index.html}

19:

20: \author{R\'emi Monasson}

21: \address{CNRS-Laboratoire de Physique Th{\'e}orique de l'ENS,

22: 24 rue Lhomond, 75005 Paris, France;\\

23: CNRS-Laboratoire de Physique Th{\'e}orique, 3 rue de l'universit\'e,

24: 67000 Strasbourg, France}

25: \ead{monasson@lpt.ens.fr}

26: \ead[url]{http://www.lptens.fr/$\sim$monasson}

27:

28: \begin{abstract}

29: An analysis of the average-case

30: complexity of solving random 3-Satisfiability (SAT)

31: instances with backtrack algorithms is presented. We first interpret

32: previous rigorous works in a unifying framework based on the statistical

33: physics notions of dynamical trajectories, phase diagram and growth

34: process. It is argued that, under

35: the action of the Davis--Putnam--Loveland--Logemann (DPLL) algorithm,

36: 3-SAT instances are turned into $2+p$-SAT instances whose characteristic

37: parameters (ratio $\alpha$ of clauses per variable, fraction $p$ of

38: 3-clauses) can be followed during the operation, and define resolution

39: trajectories. Depending on the location of trajectories in the

40: phase diagram of the 2+p-SAT model, easy (polynomial) or hard

41: (exponential) resolutions are generated.  Three regimes are identified,

42: depending on the ratio $\alpha$ of the 3-SAT instance to be

43: solved. Lower sat phase: for small ratios, DPLL almost surely finds a

44: solution in a time growing linearly with the number $N$ of

45: variables. Upper sat phase: for intermediate ratios, instances are

46: almost surely satisfiable but finding a solution requires exponential

47: time ($\sim 2 ^{N\,\omega}$ with $\omega>0$)

48: with high probability.  Unsat phase: for

49: large ratios, there is almost always no solution and proofs of

50: refutation are exponential. An analysis of the growth of the search

51: tree in both upper sat and unsat regimes is presented, and allows us

52: to estimate $\omega$ as a function of $\alpha$. This analysis is based

53: on an exact relationship between the average size of the search tree

54: and the powers of the evolution operator encoding the elementary

55: steps of the search heuristic.

56: \end{abstract}

57:

58: \begin{keyword}

59: satisfiability, analysis of algorithms, backtrack.

60: % keywords here, in the form: keyword \sep keyword

61:

62: % PACS codes here, in the form: \PACS code \sep code

63:

64: \end{keyword}

65:

66: \end{frontmatter}

67:

68: \section{Introduction.}

69:

70: This paper focuses on the average complexity of solving random 3-SAT

71: instances using backtrack algorithms.  Being an NP-complete problem,

72: 3-SAT is not thought to be solvable in an efficient way, {\em i.e.}

73: in time growing at most polynomially with $N$. In practice, one

74: therefore resorts to methods that need, {\em a priori}, exponentially

75: large computational resources. One of these algorithms is the

76: ubiquitous Davis--Putnam--Loveland--Logemann (DPLL) solving

77: procedure\citep{DP,survey}. DPLL is a complete search algorithm based on

78: backtracking; its operation is briefly recalled in Figure~1.

79: The sequence of assignments of variables made by DPLL

80: in the course of instance solving can be represented as a search tree,

81: whose size $Q$ (number of nodes) is a convenient measure of the

82: hardness of resolution.

83: Some examples of search trees are presented in Figure~\ref{trees}.

84:

85: In the past few years, much experimental and theoretical progress

86: has been made on the probabilistic analysis of 3-SAT \citep{AI,Hans}.

87: Distributions of random instances controlled by few parameters are

88: particularly useful in shedding light on the onset of complexity. An

89: example that has attracted a lot of attention over the past years is

90: random 3-SAT: all clauses are drawn randomly and each variable negated

91: or left unchanged with equal probabilities. Experiments

92: \citep{AI,Cra,Mit,Kir} and theory \citep{Friedgut,Dub,Dubtcs} indicate

93: that clauses can almost surely always (respectively never) be

94: simultaneously satisfied if $\alpha$ is smaller (resp. larger) than a

95: critical threshold $\alpha _C \simeq 4.3$ as soon as the numbers $M$

96: of clauses and $N$ of variables go to

97: infinity at a fixed ratio $\alpha$. This phase

98: transition \citep{Sta1} is accompanied by a drastic peak in

99: hardness at threshold \citep{AI,Mit,Cra}. The emerging pattern of

100: complexity is as follows.  At small ratios $\alpha < \alpha _L$, where

101: $\alpha _L$ depends on the heuristic used by DPLL, instances are

102: almost surely satisfiable (sat), see \citet{Fra2} and \citet{Achl}

103: for recent

104: reviews.  The size $Q$ of the associated search tree scales, with high

105: probability, linearly with the number $N$ of variables, and almost no

106: backtracking is present \citep{Fri} (Figure~\ref{trees}A). Above the

107: critical ratio, that is when $\alpha >\alpha _C$, instances are

108: a.s. unsatisfiable (unsat) and proofs of refutation are obtained

109: through massive backtracking (Figure~\ref{trees}B), leading to an

110: exponential hardness: $Q = 2^{N\omega}$ with $\omega >0$~\citep{Chv}.

111: In the intermediate range, $\alpha _L <\alpha <\alpha _C$, finding a

112: solution a.s. requires exponential effort ($\omega >0$)

113: \citep{Vardi1,Achl3,Coc}.

114:

115: The aim of this article is two-fold. First, we propose a simple and

116: intuitive framework to unify the above findings. This framework is

117: presented in Section 2. It is based on the statistical physics notions

118: of dynamical trajectories and phase diagram, and was, to some extent,

119: implicitly contained in the pioneering analysis of search heuristics

120: by \citet{fra2,Fra}.  Secondly, we present in Section 3 a quantitative

121: study of the growth of the search tree in the unsat regime. Such a

122: study has been lacking so far due to the formidable difficulty in

123: taking into account the effect of massive backtracking on the

124: operation of DPLL. We first establish an exact relationship

125: between the average size of the search tree and the powers

126: of the evolution operator encoding the elementary steps of the

127: search heuristic. This equivalence is then used (in a non rigorous way) to

128: accurately estimate the logarithm $\omega$ of the average complexity

129: $Q$ as a function of $\alpha$,

130: \begin{equation}

131: \omega (\alpha) = \lim _{N\to \infty}\; \frac 1N\;

132: \log _2 \; E_{(\alpha,N)} [Q] \quad ,

133: \end{equation}

134: where $E_{(N,\alpha)}$ denotes the expectation value for given $N$

135: and $\alpha$. The approach emphasizes the relevance of

136: partial differential equations to analyse algorithms in presence of

137: massive backtracking, as opposed to ordinary differential equations in

138: the absence of the latter \citep{Worm,Achl}. In Section 4, we focus upon

139: the upper sat regime {\em i.e.} upon ratios $\alpha _L < \alpha <

140: \alpha_C$.  Combining the framework of Section 2 and the analysis of

141: Section 3 we unveil the structure of the search tree

142: (Figure~\ref{trees}C) and calculate $\omega$ as a function of the

143: ratio $\alpha$ of the 3-SAT instance to be solved.

144:

145: For the sake of clarity and since the style of our approach may look

146: unusual to the computer scientist reader, the status of the different

147: calculations and results (experimental, exact, conjectured,

148: approximate, ...) are made explicit throughout the article.

149:

150: \begin{figure}

151: \begin{center}

152: \includegraphics[width=280pt,angle=-0]{fig1.eps}

153: \vskip .5cm

154: \caption{DPLL algorithm.  When a variable has been chosen at step (1)

155: e.g. $x=T$, at step (2) some clauses are satisfied e.g. $C=(x \;\hbox

156: {\rm OR}\;{y} \;\hbox {\rm OR}\; z)$ and eliminated, other are reduced

157: e.g. $C=(\hbox{\rm not}\, x\; \hbox {\rm OR}\;{y}\; \hbox {\rm OR}\;

158: {z}) \to C=({y}\; \hbox {\rm OR}\; {z})$. If some clauses include one

159: variable only e.g. $C=y$, the corresponding variable is automatically

160: fixed to satisfy the clause ($y=T$).  This

161: propagation (2c) is repeated up to the exhaustion of all

162: unit clauses. Contradictions result from the presence of two opposite

163: unit clauses

164: e.g. $C=(y), C'=(\hbox{\rm not}\, y)$. A solution is found when no clauses

165: are left.  The search process of DPLL is represented by a tree

166: (Figure~\ref{trees}) whose

167: nodes correspond to (1), and edges to (2).  Branch extremities are

168: marked with contradictions C (2B,2C), or by a solution S (2A,2C).}

169: \end{center}

170: \end{figure}

171:

172:

173: \begin{figure}

174: \begin{center}

175: \includegraphics[width=190pt,angle=-90]{fig2.eps}

176: \end{center}

177: \vskip .5cm

178: \caption{Types of search trees generated by the DPLL solving procedure

179: on random 3-SAT.

180: {\bf A.} {\em simple branch:} the algorithm finds

181: easily a solution without ever backtracking. {\bf B.} {\em dense tree:}

182: in the absence of solution, DPLL builds a tree, including many branches

183: ending with contradictory leaves, before stopping.  {\bf C.} {\em

184: mixed case, branch + tree:} if many contradictions arise before

185: reaching a solution, the resulting search tree can be decomposed into a

186: single branch followed by a dense tree. G is the highest

187: node in the tree reached by DPLL through backtracking.}

188: \label{trees}

189: \end{figure}

190:

191: \section{Phase diagram and trajectories.}

192:

193: \subsection{The 2+p-SAT distribution and split heuristics}

194:

195: The action of DPLL on an instance of 3-SAT causes changes to the

196: overall numbers of variables and clauses, and thus of the ratio

197: $\alpha$.  Furthermore, DPLL reduces some 3-clauses to 2-clauses.  A

198: mixed 2+p-SAT distribution, where $p$ is the fraction of 3-clauses,

199: can be used to model what remains of the input instance at a node of

200: the search tree. Using experiments and methods from statistical

201: mechanics \citep{Sta1}, the threshold line $\alpha _C (p)$, separating

202: sat from unsat phases, may be estimated with the results shown in

203: Figure~\ref{diag}. For $p \le p_0 = 2/5$, {\em i.e.} to the left of

204: point T, the threshold line is given by $\alpha _C(p)=1/(1-p)$, as

205: rigorously confirmed by \citet{Achl1}, and saturates the upper bound

206: for the satisfaction of 2-clauses. Above $p_0$, no exact value for

207: $\alpha _C (p)$ is known. Note that $\alpha _C \simeq 4.3$ corresponds to

208: $p=1$.

209:

210: The phase diagram of 2+p-SAT is the natural space in which DPLL

211: dynamic takes place. An input 3-SAT instance with ratio $\alpha$ shows

212: up on the right vertical boundary of Figure~\ref{diag} as a point of

213: coordinates $(p=1,\alpha )$.  Under the action of DPLL, the

214: representative point moves aside from the 3-SAT axis and follows a

215: trajectory. This trajectory obviously depends on the heuristic of split

216: followed by DPLL (Figure~1).

217: Possible simple heuristics are \citep{fra2,Fra},

218: \begin{itemize}

219: \item{\em Unit-Clause (UC):}

220: randomly pick up a literal among a unit clause

221: if any, or any unset variable otherwise.

222:

223: \item{\em Generalized Unit-Clause (GUC):}

224: randomly pick up a literal  among the shortest avalaible clauses.

225:

226: \item{\em Short Clause With Majority (SC$_1$):}

227: randomly pick up a literal among unit clauses if any; otherwise

228: randomly pick up

229: an unset variable $v$, count the numbers of occurences $\ell,

230: \bar \ell$ of $v$, $\bar v$ in 3-clauses, and choose $v$ (respectively

231: $\bar v$) if $\ell > \bar \ell$ (resp.  $\ell < \bar \ell$). When

232: $\ell=\bar \ell$, $v$ and $\bar v$ are equally likely to be chosen.

233: \end{itemize}

234:

235: Rigorous mathematical analysis, undertaken to provide rigorous bounds to

236: the critical threshold $\alpha _C$,  have so far been restricted to

237: the action of DPLL prior to any backtracking, that is, to the

238: first descent of the algorithm in the search tree\footnote{The analysis of

239: \cite{Fri} however includes a very limited version of backtracking, see

240: Section 2.2}.

241: The corresponding search branch is drawn on Figure~\ref{trees}A.

242: These studies rely on the two following facts:

243:

244: First, the representative point

245: of the instance treated by DPLL does not ``leave'' the 2+p-SAT phase

246: diagram. In other words, the instance is, at any stage of the search

247: process, uniformly distributed from the 2+p-SAT

248: distribution conditioned to its clause--per--variable ratio $\alpha$ and

249: fraction of 3-clauses $p$. This assumption is not true for

250: all heuristics of split, but holds for the above examples ($UC$, $GUC$,

251: $SC_1$) \citep{fra2}. Analysis of more sophisticated heuristics

252: require to handle more complex instance distributions \citep{Kap}.

253:

254: Secondly, the trajectory followed by an instance in the course of

255: resolution is a stochastic object, due to the randomness of

256: the instance and of the assignments done by DPLL.

257: In the large size limit ($N\to\infty$), this trajectory gets

258: concentrated around its average locus in the 2+p-SAT phase diagram.

259: This concentration phenomenon results from general properties of

260: Markov chains \citep{Worm, Achl}.

261:

262: \begin{center}

263: \begin{figure}

264: \includegraphics[width=300pt,angle=-90]{fig3.eps}

265: \vskip 1cm

266: \caption{Phase diagram of 2+p-SAT and dynamical trajectories of DPLL.

267: The threshold line $\alpha_C (p)$ (bold full line) separates sat

268: (lower part of the plane) from unsat (upper part) phases. Extremities

269: lie on the vertical 2-SAT (left) and 3-SAT (right) axis at coordinates

270: ($p=0,\alpha _C=1$) and ($p=1,\alpha _C\simeq 4.3$) respectively.

271: Departure points for DPLL trajectories are located on the

272: 3-SAT vertical axis and the corresponding values of $\alpha$ are

273: explicitely given. Dashed curves represent tree trajectories in the

274: unsat region (thick lines, black arrows) and branch

275: trajectories in the sat phase (thin lines, empty

276: arrows). Arrows indicate the direction of "motion" along trajectories

277: parametrized by the fraction $t$ of variables set by DPLL.  For small

278: ratios $\alpha < \alpha _L$, branch trajectories remain

279: confined in the sat phase, end in S of coordinates $(1,0)$, where a

280: solution is found. At $\alpha_L$ ($\simeq 3.003$ for the GUC heuristic),

281: the single branch trajectory hits

282: tangentially the threshold line in T of coordinates $(2/5,5/3)$. In

283: the intermediate range $\alpha _L < \alpha < \alpha_C$, the branch

284: trajectory intersects the threshold line at some point G (which depends

285: on $\alpha$). A dense tree then grows in the unsat phase, as happens

286: when 3-SAT departure ratios are above threshold $\alpha > \alpha _C

287: \simeq 4.3$. The tree trajectory halts on the dot-dashed curve

288: $\alpha \simeq 1.259/(1-p)$ where the tree growth process stops.

289: At this point, DPLL has reached back the highest backtracking node in

290: the search tree, that is, the first node when $\alpha > \alpha _C$, or node G

291: for $\alpha _L < \alpha < \alpha_C$.

292: In the latter case, a solution can be reached from a new descending branch

293: while, in the former case, unsatisfiability is proven, see Figure~2.}

294: \label{diag}

295: \end{figure}

296: \end{center}

297:

298: \subsection{Trajectories associated to search branches}

299:

300: Let us briefly recall \citet{fra2} analysis of the

301: average trajectory corresponding to the action of DPLL prior to

302: backtracking. The ratio of clauses per variable of the

303: 3-SAT instance to be solved will be denoted by $\alpha _0$.

304: The numbers of 2 and 3-clauses are initially equal to $C_2=0, C_3=\alpha _0

305: \, N$ respectively. Under the action of DPLL, $C_2$ and $C_3$ follow

306: a Markovian stochastic evolution process, as the depth $T$ along the branch

307: (number of assigned variables) increases. Both $C_2$ and $C_3$ are

308: concentrated around their expectation values, the densities

309: $c_j (t)= E[C_j( T=t\,N)]$ ($j=2,3$) of which obey a set of

310: coupled ordinary differential equations (ODE) \citep{fra2,Fra,Achl},

311: \begin{equation}

312: \frac{d c_3}{dt} = - \frac{ 3\, c_3}{1-t} \qquad , \qquad

313: \frac{d c_2}{dt} = \frac{ 3\, c_3}{2(1-t)} - \frac{ 2\, c_2}{1-t} -

314: \rho _1 (t) \; h(t) \qquad , \label{ode}

315: \end{equation}

316: where $\rho _1 (t) = 1 - c_2(t)/(1-t)$ is the probability that DPLL fixes a

317: variable  at depth $t$ (fraction of assigned variables)

318: through unit-propagation. Function $h$ depends upon the

319: heuristic: $h_{UC} (t)=0$, $h_{GUC} (t)=1$ (if $\alpha _0> 2/3$;

320: for $\alpha _0<2/3$, see \citet{Fra}),

321: $h_{SC_1}

322: (t)=a\, e^{-a}\, (I_0(a)+I_1(a))/2$  where $a\equiv 3\, c_3(t)/(1-t)$

323: and $I_\ell$ is the $\ell^{th}$ modified Bessel function.

324: To obtain the single branch trajectory in the phase diagram of

325: Figure~\ref{diag},

326: we solve ODEs (\ref{ode}) with initial conditions $c_2(0)=0,

327: c_3(0)=\alpha_0$, and perform the change of variables

328: \begin{equation}

329: p(t) = \frac{c_3(t)}{c_2(t)+c_3(t)} \qquad , \qquad

330: \alpha (t) = \frac{c_2(t)+c_3(t)}{1-t} \quad . \label{change}

331: \end{equation}

332:

333: Results are shown for the GUC heuristics and starting ratios $\alpha_0 =2$

334: and 2.8 in Figure~\ref{diag}. The trajectory,

335: indicated by a light dashed line, first heads to the left and then

336: reverses to the right until reaching a point on the 3-SAT axis at

337: a small ratio. Further action of

338: DPLL leads to a rapid elimination of the remaining clauses and the

339: trajectory ends up at the right lower corner S, where a solution is

340: found.

341:

342: \citet{Fri} have shown that, for ratios $\alpha _0 < \alpha _L \simeq 3.003$

343: (for the GUC heuristics), the full search tree essentially reduces

344: to a single branch, and is thus entirely described by the ODEs (\ref{ode}).

345: The amount of backtracking necessary to reach a solution is bounded from above

346: by a power of $\log N$. The average size of the branch, $Q$, scales linearly

347: with $N$ with a multiplicative factor $\gamma (\alpha _0)=Q/N$ that can

348: be calculated \citep{Coc}.

349: The boundary $\alpha _L$ of this easy sat region can be defined as the largest

350: initial ratio $\alpha _0$ such that the branch trajectory $(p(t),\alpha (t))$

351: issued from $(1,\alpha _0)$ never leaves the sat phase during DPLL action.

352: In other words, the instance essentially keeps being sat throughout the

353: resolution process. We shall see in Section 4 this does not hold  for sat

354: instances with ratios $\alpha _L < \alpha _0 < \alpha _C$.

355:

356: \section{Analysis of the search tree growth in the unsat phase.}

357:

358: In this Section, we present an analysis of search trees

359: corresponding to unsat instances, that is, in presence of massive

360: backtracking. We first report results from numerical experiments,

361: then expose our analytical approach to compute the complexity of

362: resolution (size of search tree).

363:

364: \subsection{Numerical experiments}

365:

366: For ratios above threshold ($\alpha _0 > \alpha _C\simeq 4.3$),

367: instances almost never have a solution but a considerable amount of

368: backtracking is necessary before proving that clauses are

369: incompatible. Figure~\ref{trees}B shows a generic unsat, or

370: refutation, tree.  In contrast to the previous section, the sequence

371: of points $(p,\alpha)$ attached to the nodes of the search tree

372: do not arrange along a line any longer, but

373: rather form a cloud with a finite extension in the phase diagram of

374: Figure~\ref{diag}.  Examples of clouds are provided on

375: Figure~\ref{patch}.

376:

377: \begin{center}

378: \begin{figure}

379: \hskip 1.5cm \includegraphics[height=250pt,angle=-90]{fig4.eps}

380: \caption{Clouds associated to search trees obtained from the resolution

381: of three unsat

382: instances with initial ratios $\alpha _0= 4.3, 7$ and 10

383: respectively. Each point in the cloud corresponds to a splitting node

384: in the search tree. Sizes of instances and search trees are

385: $N=120,Q=7597$ for $\alpha _0=4.3$, $N=200,Q=6335$ for $\alpha _0 =7$,

386: and $N=300,Q=6610$ for $\alpha _0=10$.}

387: \label{patch}

388: \end{figure}

389: \end{center}

390: \begin{table}

391: $$

392: \begin{array}{|c|c|c|c|c|c||c|}

393: \hline

394: \multicolumn{1}{|c} {\alpha _0}&

395: \multicolumn{1}{|c} {4.3} & \multicolumn{1}{|c} 7 &

396: \multicolumn{1}{|c} {10} & \multicolumn{1}{|c} {15} &

397: \multicolumn{1}{|c||} {20} & \multicolumn{1}{|c|} {3.5} \\

398: \hline

399: \omega _{EXP} &

400: 0.089  & 0.0477  & 0.0320  &0.0207  & 0.0153 & 0.034 \\

401: & \pm 0.001  & \pm 0.0005 &\pm 0.0005 &

402: \pm 0.0002 & \pm 0.0002 & \pm 0.003   \\

403: \hline

404: \omega _{THE} & 0.0916 & 0.0486 & 0.0323 &

405: 0.0207  & 0.0153 &  0.035  \\

406: \hline

407: \end{array}

408: $$

409: \vskip .3cm

410: \caption{Logarithm of the complexity $\omega$ from experiments (EXP) and

411: theory (THE) as a function of the ratio $\alpha _0$ of clauses per

412: variable of the 3-SAT instance.

413: Ratios above 4.3 correspond to unsat instances; the rightmost ratio

414: lies in the upper sat phase.}

415: \end{table}

416:

417:

418: The number of points in a cloud {\em i.e.} the size $Q$ of its

419: associated search tree grows exponentially with $N$ \citep{Chv}. It is

420: thus convenient to define its logarithm $\omega$ through $Q=2^{N

421: \omega}$. We experimentally measured $Q$, and averaged its

422: logarithm $\omega$ over a large number of

423: instances. Results have then be extrapolated to the $N\to \infty$

424: limit \citep{Coc} and are reported in Table~1.  $\omega$ is a

425: decreasing function of $\alpha _0$ \citep{Bea}: the larger $\alpha

426: _0$, the larger the number of clauses affected by a split, and the

427: earlier a contradiction is detected.  We will use the vocable

428: ``branch'' to denote a path in the refutation tree which joins the top

429: node (root) to a contradiction (leaf). The number of branches, $B$, is

430: related to the number of nodes, $Q$, through the relation $Q=B-1$

431: valid for any complete binary tree. As far as exponential (in $N$)

432: scalings are concerned, the logarithm of $B$ (divided by $N$) equals

433: $\omega$.  In the following paragraph, we show how $B$ can be

434: estimated through the use of a matrix formalism.

435:

436: \begin{center}

437: \begin{figure}

438: \hskip 1.5cm \includegraphics[height=150pt,angle=0]{fig5.eps}

439: \vskip .5cm

440: \caption{Imaginary, parallel growth process of an unsat search tree used in the

441: theoretical analysis. Variables are fixed through unit-propagation, or

442: by the splitting heuristic as in the DPLL

443: procedure, but branches evolve in parallel. $T$ denotes the depth in the

444: tree, that is the number of variables assigned by DPLL along each

445: branch. At depth $T$, one literal is chosen on each branch among 1-clauses

446: (unit-propagation, grey circles not represented on Figure 2), or 2,3-clauses

447: (splitting, black circles as in Figure~2).

448: If a contradiction occurs as a result of unit-propagation, the branch gets

449: marked with C and dies out. The growth of the tree proceeds

450: until all branches carry C leaves. The resulting tree is identical to the one

451: built through the usual, sequential operation of DPLL. }

452: \label{struct}

453: \end{figure}

454: \end{center}

455:

456:

457: \subsection{Parallel growth process and Markovian evolution matrix}

458:

459: The probabilistic analysis of DPLL in the unsat regime appears to be a

460: formidable task since the search tree of Figure~\ref{trees}B is the

461: output of a complex, sequential process: nodes and edges are added by

462: DPLL through successive descents and backtrackings (depth-first

463: search). We have imagined a

464: different building up of the refutation tree, which results in the

465: same complete tree but can be mathematically analyzed. In our

466: imaginary process (Figure~\ref{struct}), the tree grows in parallel,

467: layer after layer (breadth-first search).

468: At time $T=0$, the tree reduces to a root

469: node, to which is attached the 3-SAT instance to be solved, and an

470: attached outgoing edge. At time $T$, that is, after having assigned

471: $T$ variables in the instance attached to each branch, the tree is

472: made of $B(T) \ (\le 2^T)$ branches, each one carrying a partial

473: assignment of variables. At

474: next time step $T\to T+1$, a new layer is added by assigning,

475: according to DPLL heuristic, one more variable along every branch. As a

476: result, a branch may keep growing through unitary propagation, get hit

477: by a contradiction and die out, or split if the partial assignment

478: does not induce unit clauses. This parallel growth process is

479: Markovian, and can be encoded in an instance--dependent matrix we now

480: construct.

481:

482: To do so, we need some preliminary definitions:

483:

484: \begin{defn} Partial state of variables.

485:

486: The partial state $s$ of a Boolean variable $x$ is one of the three

487: following possibilities: undetermined ($u$) if the variable has not

488: been assigned by the search heuristic yet, true ($t$) if the variable

489: is partially assigned to true, false ($f$) if the variable is

490: partially assigned to false. The partial state $S$ of a set of Boolean

491: variables $X=\{x_1, x_2, \ldots , x_N\}$ is the collection of the

492: states of its elements, $S=\{s_1, s_2, \ldots , s_N \}$.

493:

494: Let $\bf I$ be an instance of the SAT problem, defined over a set of

495: Boolean variables $X$ with partial state $S$. A clause of $\bf I$ is

496: said to be

497: \begin{itemize}

498: \item satisfied if at least one of its literals is true according to $S$;

499: \item unsatisfied, or violated if all its literals are false according to $S$;

500: \item undetermined otherwise; then its `type' is the number $(=1,2,3)$ of

501: undetermined variables it includes.

502: \end{itemize}

503: The instance $\bf I$ is said to be satisfied if all its clauses are

504: satisfied, unsatisfied

505: if one (at least) of its clauses is violated, undetermined otherwise.

506: The set of partial states that violate ${\bf I}$ is denoted by $W$.

507: \end{defn}

508:

509: \begin{defn} Vector space attached to a variable.

510:

511: To each Boolean variable $x$ is associated

512: a three dimensional vector space ${\bf v}$ with spanning basis $|u\rangle$,

513: $|t\rangle$, $|f\rangle$, orthonormal with respect

514: to the dot (inner) product denoted by $\langle . | . \rangle$,

515: \begin{equation}

516: \langle u | u \rangle = \langle t | t \rangle = \langle f | f \rangle =

517: 1\ ,\ \langle u |

518: t \rangle = \langle u | f \rangle = \langle t | f \rangle = 0 \ .

519: \end{equation}

520: The partial state attached to a basis vector $|s\rangle$ is $s$ (=$\,u,t,f)$.

521: \end{defn}

522:

523: Letters $u$, $t$, $f$ stand for the different partial states the variable may

524: acquire in the course of the search process. Note that the coefficients

525: of the decomposition of any vector $|x\rangle \in {\bf v}$

526: over the spanning basis,

527: \begin{equation}

528: |x\rangle = x ^{(u)} \, |u\rangle + x^{(t)} \, |t\rangle  + x^{(f)} \, |f\rangle

529: \quad ,

530: \end{equation}

531: can be obtained through use of the dot product: $x^{(s)} = \langle

532: s | x \rangle$ with $s=u,t,f$. By extension, $\langle S|$ denotes the

533: transposed of vector $|S\rangle$.

534:

535:

536: \begin{defn} Vector space attached to a set of variables.

537:

538: We associate to the set $X=\{x_1, x_2, \ldots , x_N\}$ of $N$ Boolean

539: variables the $3^N$--dimensional vector space ${\bf V}= {\bf v}_1

540: \otimes {\bf v}_2 \otimes \ldots \otimes {\bf v}_N$. The spanning

541: basis of ${\bf V}$ is the tensor product of the spanning basis of the

542: ${\bf v}_i$'s. To lighten notations, we shall write $|s_1 , s_2, ...,

543: s _N \rangle$ for $|s _1\rangle \otimes |s _2 \rangle \otimes \ldots

544: \otimes |s _N\rangle$.  The partial state attached to a basis vector

545: $|S\rangle= |s_1 , s_2, ..., s _N \rangle$ is $S=(s_1 , s_2, ..., s

546: _N)$.  The dot product naturally extends over ${\bf V}$: $\langle s'_1

547: , s'_2 , \ldots , s'_N | s_1, s_2 , \ldots , s_N \rangle = 1$ if

548: $s_i=s' _i\ \forall i$, 0 otherwise.

549: \end{defn}

550:

551:

552: Any element $|X\rangle \in {\bf V}$ can be uniquely decomposed

553: as a linear combination of vectors from the spanning basis.

554: Two examples of vectors are $|\Sigma\rangle$ and $|U\rangle$,

555: respectively the sum of all vectors in the

556: spanning basis and the fully undetermined vector,

557: \begin{eqnarray} \label{sigvec}

558: |\Sigma\rangle &=& \big( |u \rangle + |t \rangle + |f \rangle \big) \otimes

559: \big( |u \rangle + |t \rangle + |f \rangle \big) \otimes \ldots \otimes

560: \big( |u \rangle + |t \rangle + |f \rangle \big) \ , \\

561: |U\rangle &=& |u,u,\ldots , u\rangle \qquad . \label{uvec}

562: \end{eqnarray}

563: Basis vectors fulfill the closure identity

564: \begin{equation} \label{closure}

565: \sum _{S} | S\rangle \; \langle S | = {\bf 1} \quad ,

566: \end{equation}

567: where ${\bf 1}$ is the identity operator on ${\bf V}$. To establish identity

568: (\ref{closure}), apply the left hand side

569: operator to any vector $|S'\rangle$ and take advantage of

570: the orthonormality of the spanning basis .

571:

572: \begin{defn} (Heuristic-induced) Transition probabilities

573:

574: Let $S=(s_1 , s_2, ..., s _N)$ be a partial state which does not violate

575: instance ${\bf I}$.

576: Call $S^{(j,x)}$, with $j=1,\ldots, N$ and $x=t,f$,

577: the partial state obtained from $S$ by replacing $s_j$ with $x$.

578: The probability that the heuristic under consideration (UC, GUC, ...)

579: chooses to assign variable $x_j$ when presented partial state $S$

580: is denoted by $h(j|S)$.

581: The probability that the heuristic under consideration

582: then fixes variable $x_j$ to $x\, (=\, t,f)$ is denoted by $g(x|S,j)$.

583: \end{defn}

584:

585: A few elementary facts about  transition probabilities are:

586: \begin{enumerate}

587: \item $h(j|S)=0$ if $s_j\ne u$.

588: \item $g(x|S,j)+g(\bar x|S,j)=1$.

589: \item Assume that

590: the number $C_1 (S)$ of undetermined clauses of type 1 (unit

591: clauses) is larger or equal to unity. Call $C_1(j|S)$ the number of

592: unit clauses containing variable $x_j$, and $C_1(x|S,j)$ the number

593: of unit clauses satisfied if $x_j$ equals

594: $x\,(=\,t,f)$. Clearly  $C_1(j|S)= C_1(t|S,j)+C_1(f|S,j)$.

595: Then, as a result of unit--propagation,

596: \begin{eqnarray}

597: h(j|S) &=&\frac{C_1(j|S)}{C_1(S)} \quad , \nonumber \\

598: g(x|S,j) &=& \frac{C_1 (x|S,j)}{C_1(j|S)} \quad

599: \hbox{\rm for}\ x=t,f \

600: \hbox{\rm and} \ C_1(j|S) \ge 1 \ .

601: \end{eqnarray}

602: \item In the absence of unitary clause ($C_1(S)=0$), transition probabilities depend on the details

603: of the heuristic. For instance, in the case of the UC heuristic,

604: \begin{enumerate}

605: \item if $s_j=u$, $h(j|S)=\frac{1}{u(S)}$ and $g(x|S,j)=\frac 12$,

606: \item if $s_j\ne u$, $h(j|S)=0$,

607: \end{enumerate}

608: where $u(S)$ is the number of undetermined variables in partial state $S$.

609: \item The sum of transition probabilities from a partial state $S$

610: is equal to unity,

611: \begin{equation}

612: \sum _{j=1} ^N h (j|S) \; \bigg[ g(t|S,j) + g(f|S,j) \bigg] =1 \quad .

613: \end{equation}

614: \end{enumerate}

615: It is important to stress that the definition of the transition probabilities

616: does not make any reference to any type of backtracking. It relies

617: on the notion of variable assignement through the heuristic of search only.

618:

619: Let us now introduce the

620:

621: \begin{defn} (Heuristic-induced) Evolution operator.

622:

623: The evolution operator is a linear operator $\bf H$ acting on $\bf V$

624: encoding the action of DPLL for a given unsatisfiable instance ${\bf I}$. Its

625: matrix elements in the spanning basis are

626: \begin{enumerate}

627: \item if $S$ violates $\bf I$,

628: \begin{equation} \label{defmu}

629: \langle S'| {\bf H} | S\rangle

630: = \left\{ \begin{array} {c c}

631: 1 & \hbox{\rm if} \ S'=S \\

632: 0 & \hbox{\rm if} \ S' \ne S

633: \end{array} \right. \qquad ,

634: \end{equation}

635: \item if $S$ does not violate $\bf I$,

636: \begin{equation} \label{defms}

637: \langle S'| {\bf H} | S\rangle

638: = \left\{ \begin{array} {c l}

639: h(j|S)\times g(x|S,j) & \hbox{\rm if} \ C_1(S) \ge 1

640: \ \hbox{\rm and} \ S'=S^{(j,x)}\\

641: h(j |S) & \hbox{\rm if} \ C_1(S) =0\ \hbox{\rm and} \\

642: \ & \big(S'=S^{(j,x)}\ \hbox{\rm or} \ S'=S^{(j,\bar x)}\big)

643: \\

644: 0 & \hbox{\rm otherwise}

645: \end{array} \right. \nonumber

646: \end{equation}

647: \end{enumerate}

648: where $S,S'$ are the attached partial states to $|S\rangle, |S'\rangle$,

649: and $C_1(S)$ is the number of undetermined clauses of type 1 (unitary clauses)

650: for partial state $S$.

651: \end{defn}

652:

653: Notice that we use the same notation, $\bf H$, for the operator and its matrix

654: in the spanning basis. The different cases encountered in the above

655: definition of $\bf H$ are symbolized in Figure~\ref{thm}.

656: We may now conclude:

657:

658: \begin{thm} Branch function and average size of refutation tree

659:

660: Call branch function the function $B$ with integer-valued argument $T$,

661: \begin{equation} \label{funcq}

662: B(T) = \langle \Sigma | {\bf H}^T | U \rangle \quad ,

663: \end{equation}

664: where $\bf H$ is the evolution operator associated to the unsatisfiable

665: instance $\bf I$, ${\bf H}^T$ denotes the $T^{th}$ (matricial)

666: power of ${\bf H}$, and vectors $|\Sigma\rangle, |U\rangle$ are

667: defined in (\ref{sigvec},\ref{uvec}).

668: Then, there exist two instance--dependent

669: integers $T^* \ (\le N)$ and $B^* \ (\le 2^N)$ such that,

670: \begin{equation} \label{statq}

671: B(T) = B^* \ , \quad \forall \ T \ge T^*  \ .

672: \end{equation}

673: Furthermore, $B^*$ is the expectation value over the random assignments

674: of variables of the size (number of leaves) of the search tree

675: produced by DPLL to refute ${\bf I}$. The smallest non zero $T^*$ for

676: which (\ref{statq}) holds is the largest number

677: of variables that the heuristic needs to assign to reach a contradiction.

678: \end{thm}

679:

680: \begin{pf*}{Proof of Theorem 6}

681:

682: Let $S$ be a partial state. We call refutation tree built from

683: $S$ a complete search tree that proves the unsatisfiability of ${\bf I}$

684: conditioned to the fact that DPLL is allowed to assign only variables

685: which are undetermined in $S$. The height of the search tree is the

686: maximal number of assignments leading from the root node (attached to partial

687: state $S$) to a contradictory leaf.

688:

689: Let $T$ be a positive integer. We call

690: $b_T (S)$ the average size (number of leaves) of refutation trees

691: of height $\le T$ that can be built from partial state $S$. Clearly,

692: $b_T(S)=1$ for all $S\in W$, and $b_T(S)\ge 2$ if $S\notin W$. Recall

693: $W$ is the set of violating partial states from Definition 1.

694:

695: Assume now $T$ is an integer larger or equal to 1, $S$ a partial

696: state with $C_1(S)$ unitary clauses. Our parallel representation of

697: DPLL allows us to write simple recursion relations:

698: \begin{enumerate}

699: \item if $S \in W$, $b_T (S)=1=b_{T-1} (S)$.

700: \item if $S \notin W$ and $C_1(S)\ge 1$,

701: \begin{equation}

702: b_T (S) = \sum _{j=1}^N \sum _{x=t,f} h(j |S)\, g(x |S,j)

703: \; b_{T-1} \big(S ^{(j,x)} \big) \quad .

704: \end{equation}

705: \item if $S \notin W$ and $C_1(S)= 0$,

706: \begin{equation}

707: b_T (S) = \sum _{j=1}^N  h(j|S) \; \bigg[

708: b_{T-1} \big(S ^ {(j,t)} \big) + b_{T-1} \big(S ^ {(j,f)} \big)

709: \bigg] \quad .

710: \end{equation}

711: \end{enumerate}

712: These three different cases are symbolized on Figure~\ref{thm}A, B and

713: C respectively. From definitions (\ref{defmu},\ref{defms}), these

714: recursion relations are equivalent to

715: \begin{equation} \label{transpo}

716: b_T (S) = \sum _{S'} \langle S' | {\bf H} |S\rangle \;

717: b_{T-1} (S') \quad ,

718: \end{equation}

719: for any partial state $S$. Let $|b_T\rangle$ be the vector of $\bf V$ whose

720: coefficients on the spanning basis $\{|S\rangle\}$ are the $b_T(S)$'s.

721: In particular,

722: \begin{equation}

723: |b_0\rangle = \sum _{S_0 \in W} |S_0\rangle \quad .

724: \end{equation}

725: Then identity (\ref{transpo}) can be written as

726: $|b _T\rangle = {\bf H}^\dagger\; |b _{T-1} \rangle$

727: where ${\bf H}^\dagger$ is the transposed of the evolution operator.

728: Note that the branch function (\ref{funcq}) is simply $B(T)=\langle U|b_T

729: \rangle$. We deduce

730: \begin{equation} \label{funcqprim}

731: |b _T\rangle = ({\bf H}^\dagger)^T\; |b_0 \rangle = \sum _{S_0\in W}

732: \sum _{\sigma _T}  p\big( \sigma _T;S_0\big) |S _ 0\rangle

733: \quad ,

734: \end{equation}

735: where the second sum runs over all $3^{N\times T}$ sequences $\sigma _T =

736: (S_1, S_2 , \ldots , S_{T-1}, S _T)$

737: of $T$ partial states with associated weight

738: \begin{eqnarray}

739: p\big( \sigma _T;S_0\big) &=& \langle S_T |{\bf H}^\dagger

740: |S_{T-1}\rangle \times  \ldots \times

741: \langle S_2 |{\bf H}^\dagger|S_{1}\rangle \times \langle

742: S _{1} |{\bf H}^\dagger|S _0\rangle \nonumber \\

743:  &=& \langle S _0 |{\bf H} |S_{1}\rangle

744: \times  \langle S _1 |{\bf H} |S _{2}\rangle \ldots

745: \times \langle S _{T-1} |{\bf H}|S _T\rangle

746: \ , \label{weightseq}

747: \end{eqnarray}

748: The length of a sequence is the number of partial states it includes.

749: We call $S_0$--genuine a sequence of partial states $\sigma _T$ with

750: non zero weight (\ref{weightseq}). The second sum

751: on the right hand side of equation (\ref{funcqprim}) may be

752: rewritten as a sum over all $S_0$--genuine sequences $\sigma _T$

753: of length $T$ only.

754:

755: \begin{lem}

756: Take $S_0\in W$.

757: Any $S_0$-genuine sequence $\sigma _{N+1}$ of length $N+1$

758: includes at least one partial state belonging to $W$.

759: \end{lem}

760: Suppose this is not true. There exists a genuine sequence

761: $\sigma _{{N+1}}$ with $S_{T}\notin W$,

762: $\forall\ 1\le T\le N+1$. Call $u _T$ the number of undetermined variables in

763: partial state $S_{T}$. Since the sequence is genuine,

764: $\langle S_{T-1} |{\bf H} | S_{T}\rangle \ne 0$

765: for every $T$ comprised between 1 and $N+1$. From the evolution operator

766: definition (\ref{defms}), $S_{T}$ contains exactly one more

767: undetermined variable than $S_{T-1}$, and $u _T=u _{T-1}+1$ for all

768: $1\le T\le N+1$. Hence $u _{N+1}-u_0=N+1$. But

769: $u_0$ and $u_{N+1}$ are, by definition,

770: integer numbers comprised between 0 and $N$.

771: \qed

772:

773: From Lemma 7, the index $\nu$ of a $S_0$--genuine sequence $\sigma _{N+1}$

774: of length $N+1$,

775: \begin{equation}

776: \nu = \sup \big\{ T : 1\le T\le N+1\ \hbox{\rm and}\

777: S_{T} \in \sigma _{N+1} \ \hbox{\rm and}\ S_{T}\ \in \ W \big\} \quad ,

778: \end{equation}

779: exists and is larger, or equal, to $1$. Let us define

780: \begin{equation}

781: \hat \sigma _{N+1} = (S_{\nu +1}, S_{\nu+2},

782: \ldots, S_{N}, S_{N+1} )\quad .

783: \end{equation}

784: From definition

785: (\ref{defmu}), $\sigma _{N+1}$ is simply $S_0$ repeated $\nu$ times

786: followed by $\hat \sigma _{N+1}$, and $p (\sigma _{N+1})= p(\hat \sigma

787: _{N+1})$. Call $\nu^*(S_0)$ the smallest index of all $S_0$-genuine

788: sequences of length $N+1$, and $\nu^*$ the minimum of $\nu^* (S_0)$ over

789: $S_0\in W$.

790: Then, from equation (\ref{funcqprim}), $|b_{N+1}\rangle = |b_N\rangle =

791: \ldots = |b_{T^*}\rangle $

792: where $T^*=N+1-\nu^*\le N$. Thus $|b_{T^*}\rangle$ is a right eigenvector

793: of ${\bf H}^\dagger$ with eigenvalue unity, and

794: $|b _{T}\rangle= |b_{T^*}\rangle$ for all $T\ge T^*$.

795: $T^*$, which depends upon instance ${\bf I}$,

796: is the length of the longest genuine sequence without repetition.

797: It is the maximal

798: number of (undetermined) variables to be fixed before a contradiction is

799: found.

800:

801: \begin{lem} Take $S\notin W$.

802: Then there is no $S$-genuine sequence of length $T^*$.

803: \end{lem}

804: Suppose this is not true. There exist $S\notin W$ and a $S$--genuine

805: sequence $\sigma _{T^*}$ of length $T^*$. As $S$ does not violate ${\bf I}$,

806: and ${\bf I}$ is not satisfiable, there are still some undetermined

807: variables in partial state $S$. A certain number of them, say $T'\ge 1$,

808: must be assigned to some $t,f$ values to reach a contradiction, that is,

809: a partial state $S_0\in W$. Therefore there exists a $S_0$--genuine

810: sequence, $\tilde \sigma$, of length $T'\ge 1$ ending with $S$

811: and with no repeated partial state.

812: Concatenating $\tilde \sigma$ and $\sigma _{T^*}$, we obtain a

813: $S_0$--genuine sequence of length $T^*+T '>T^*$ and without repetition,

814: in contradiction with the above result.

815: \qed

816:

817: Using Lemma 8, we may replace $|b_0\rangle$ in equation (\ref{funcqprim})

818: with $|\Sigma\rangle$, and find

819: \begin{equation}

820: B(T)\equiv \langle \Sigma |{\bf H}^T |U\rangle =

821: \langle U |({\bf H}^\dagger)^T |\Sigma\rangle =  \langle U | b_{T^*}\rangle

822: = b_{T^*} (U) \ ,

823: \end{equation}

824: for all $T\ge T^*$. Hence, $B^*=b_{T^*} (U)$ is the average size

825: (over the random assignments made by the heuristic) of the refutation tree

826: to instance $\bf I$ generated from the fully undetermined partial state.

827: \qed

828: \end{pf*}

829:

830: \begin{center}

831: \begin{figure}

832: \includegraphics[height=150pt,angle=0]{fig6.eps}

833: \vskip .5cm

834: \caption{Transitions allowed by the heuristic-induced evolution

835: operator. Grey and black nodes correspond to variables assigned through

836: unit-propagation and splitting respectively, as in Figure~\ref{struct}.

837: {\bf A}. If the partial state $S$ already violates the instance

838: ${\bf I}$, it is left

839: unchanged. {\bf B}. If the partial state does not violate $\bf I$ and

840: there is at least one unitary clause, a variable is fixed through unit

841: propagation (grey node) e.g. $x_j=x$. The output partial state is $S^{j,x}$.

842: {\bf C}. If the partial state does not violate $\bf I$ and

843: there is no unitary clause, a variable $x_j$ is fixed through splitting

844: (black node). Two partial states are generated, $S^{j,t}$ and $S^{j,f}$.}

845: \label{thm}

846: \end{figure}

847: \end{center}

848:

849: \subsection{Some examples of short instances and associated matrices}

850:

851: We illustrate the above definitions and results with three explicit

852: examples of instances involving few variables:

853:

854:

855: \begin{exmp}

856: Instance over $N=1$ variable

857:

858: Consider the following unsat instance built from a single variable,

859: \begin{equation}

860: {\bf I}_1 = x_1 \wedge  \bar x_1 \qquad .

861: \end{equation}

862: The 3--dimensional vector space ${\bf v}_1$ is spanned by vectors $|u\rangle, |t\rangle,

863: |f\rangle$. The evolution matrix reads

864: \begin{equation}

865: {\bf H} = \left( \begin{array}{c c c}

866: 0 & 0 & 0 \\ \frac 12 & 1 & 0 \\  \frac 12 &  0 &1

867: \end{array} \right) \  \hbox{\rm with} \quad

868: |u\rangle = \left( \begin{array}{c} 1 \\ 0 \\ 0 \end{array} \right) \ ,\quad

869: |t\rangle = \left( \begin{array}{c} 0 \\ 1 \\ 0 \end{array} \right) \ ,\quad

870: |f \rangle = \left( \begin{array}{c} 0 \\ 0 \\ 1 \end{array} \right) \ .

871: \end{equation}

872: Entries can be interpreted as follows.

873: Starting from the $u$ state, variable $x_1$ will be set through unit-propagation to $t$

874: or $f$ with equal probabilities: $\langle t |{\bf H }|

875: u \rangle =\langle f |{\bf H} | u \rangle

876: =1/2$. Once the variable has reached this state, the instance is violated:

877: $\langle t |{\bf H} | t \rangle =\langle f |{\bf H} | f \rangle =1$. All other entries are null.

878: In particular, state $u$ can never be reached from any state,

879: so the first line of the matrix is filled in with zeroes: $\langle u |{\bf

880: H}| s \rangle =0,

881: \forall s$. Function (\ref{funcq}) is easily calculated

882: \begin{equation}

883: B (T)  =  \left( \begin{array}{c} 1 \\ 1 \\ 1 \end{array} \right)

884: ^\dagger \;.\; {\bf H}

885: ^T\; . \; \left( \begin{array}{c} 1 \\ 0 \\ 0 \end{array} \right) =

886: 1\ , \quad \forall\ T\ge 0 \ .

887: \end{equation}

888: Therefore, $T^*=B^*=1$. Indeed, refutation is obtained without

889: any split, and the search tree involves a unique branch of length 1

890: (Figure~\ref{tree-exemple}A).

891: \end{exmp}

892:

893: Our next example is a 2-SAT instance whose refutation requires to split one

894: variable.

895:

896: \begin{exmp} Instance over $N=2$ variables, with a unique refutation tree.

897:

898: \begin{equation}

899: {\bf I}_2 =(x_1 \vee x_2) \wedge (\bar x_1 \vee x_2) \wedge

900: (x_1 \vee \bar x_2 ) \wedge (\bar  x_1 \vee \bar x_2)

901: \end{equation}

902: The evolution matrix $\bf H$ is a $9\times 9$ matrix with 16 non zero entries,

903: \begin{eqnarray}

904: \langle s,u | {\bf H} | u,u \rangle &=&

905: \langle u,s | {\bf H} | u,u \rangle = \frac 12 \ , \qquad \forall\ s=t,f

906: \label{m21} \\ \label{m22}

907: \langle s,s' | {\bf H} | s,u \rangle &=&

908: \langle s,s' | {\bf H} | u,s' \rangle = \frac 12 \ , \qquad \forall\ s,s'=t,f\\

909: \label{m23}

910: \langle s',s | {\bf H} | s',s \rangle &=& 1\ , \qquad \forall \ s,s'=t,f \quad .

911: \end{eqnarray}

912: We now explain how these matrix elements were obtained. From the

913: undetermined state $|u,u\rangle$, any of the four clause can be chosen

914: by the heuristic. Thus, any of the two literals $x_1$, $x_2$ has a

915: probability $1/2$ to be chosen: $h(1|u,u)=h(2|u,u)=\frac 12$.

916: Next, unit-propagation will set the unassigned variable to true, or

917: false with equal

918: probabilities $1/2$ (\ref{m22}). Finally, entries corresponding to

919: violating states in eqn (\ref{m23}) are calculated according to

920: rule (\ref{defmu}).

921:

922: The branch function $B(T)$ equals 1 for $T=0$, 2 for any $T\ge 1$;

923: thus, $T^*=1$ and $B^*=2$, in agreement with the

924: associated search tree symbolized in Figure~\ref{tree-exemple}B.

925: \end{exmp}

926:

927: We now introduce an instance with a non unique refutation tree.

928:

929: \begin{exmp} Instance with $N=3$ variables, and two refutation trees.

930: \begin{equation}

931: {\bf I}_3 =(x_1 \vee x_2) \wedge (\bar x_1 \vee x_2) \wedge

932: (x_1 \vee \bar x_2 ) \wedge (\bar  x_1 \vee \bar x_2) \wedge

933: (x_3 \vee \bar x_3)

934: \end{equation}

935: Notice the presence of a (trivial) clause containing opposite

936: literals, which allows

937: us to obtain a variety in the search trees without considering more than three

938: variables. The evolution matrix $\bf H$ is a $27\times 27$ matrix with 56 non

939: zero entries (for the GUC heuristic),

940:

941: \begin{eqnarray} \label{m31}

942: \langle s,u,u |{\bf H} | u,u,u \rangle &=&

943: \langle u,s,u | {\bf H} | u,u,u \rangle = \frac 25  \ , \qquad \forall\ s=t,f \\

944: \label{m32}

945: \langle u,u,s | {\bf H} | u,u,u \rangle &=&  \frac 1{5} \ , \qquad \forall\ s=t,f \\

946: \langle s,s',s'' | {\bf H} | s,u,s'' \rangle &=&

947: \langle s,s',s'' | {\bf H} | u,s',s'' \rangle = \frac 1{2} \ , \quad \forall\ s,s'=t,f;

948: s''=u,t,f \nonumber \\

949: \langle s',u,s | {\bf H} | u,u,s \rangle &=&

950: \langle u,s',s | {\bf H} | u,u,s \rangle = \frac 1{2} \ , \qquad \forall\ s,s'=t,f

951: \nonumber \\

952: \langle s,s',s'' | {\bf H} | s,s',s'' \rangle &=&1 \ , \qquad \forall\ s,s'=t,f;

953: \ s''=u,t,f \nonumber

954: \end{eqnarray}

955: The first split variable is $x_3$ if the last clause is chosen (probability $1/5$),

956: or $x_1$ or $x_2$ otherwise (with probability $2/5$ each), leading to

957: expressions (\ref{m31}) and (\ref{m32}). The remaining entries of $\bf H$ are

958: obtained in the same way as explained in Example 10.

959:

960: We obtain $B(0)=1$, $B(1)=2$ and $B(T\ge 2)=12/5$.

961: Therefore, $T^*=2$ and

962: \begin{equation}

963: B^* = \frac{12}5 = \frac 45 \times 2 + \frac 15 \times 4 \qquad ,

964: \end{equation}

965: where the different contributions to $B^*$ and their probabilities are explicitely

966: written down, see Figures~\ref{tree-exemple}B and \ref{tree-exemple}C.

967: \end{exmp}

968:

969: \begin{center}

970: \begin{figure}

971: \includegraphics[height=150pt,angle=0]{fig7.eps}

972: \vskip .5cm

973: \caption{Refutation search trees associated to instances ${\bf I}_1$,

974: ${\bf I}_2$ and ${\bf I}_3$. Grey and black nodes correspond to

975: variables assigned through unit-propagation and split respectively, as

976: in Figure~\ref{struct}.  {\bf A}. Example 9: refutation of instance

977: ${\bf I}_1$ is obtained as a result of unit-propagation. The size

978: (number of leaves) of the search tree is $B=1$. {\bf B}. Example 10:

979: search tree generated by DPLL on instance ${\bf I}_2$. The black and

980: grey node correspond to the split of $x_1$ and unit-propagation over

981: $x_2$, or vice-versa.  The size of the tree is $B=2$. {\bf C}. Example

982: 11: search tree corresponding to the instance ${\bf I}_3$ when DPLL

983: first splits variable $x_3$. The size of the tree is $B=4$. If the

984: first split variable is $x_1$ or $x_2$, the refutation search tree of

985: instance ${\bf I}_3$ corresponds to case {\bf B}.  }

986: \label{tree-exemple}

987: \end{figure}

988: \end{center}

989:

990:

991:

992: \subsection{Dynamical annealing approximation}

993:

994: Let us denotes by $\overline q$ the expectation value of a

995: function $q$ of the instance ${\bf I}$ over the random 3-SAT distribution,

996: at given numbers of variable, $N$, and clauses, $\alpha \, N$.

997: From Theorem 6, the expectation value of the size of the refutation tree is

998: \begin{equation}

999: B^*(\alpha, N) \equiv \overline{ B^* }=

1000: \langle \Sigma | \overline{ {\bf H} ^N} | U \rangle

1001: \quad .

1002: \end{equation}

1003: Calculation of the expectation value of the $N^{th}$ power of ${\bf H}$ is

1004: a hard task that we were unable to perform for large sizes $N$. We therefore turned

1005: to a simplifying approximation, hereafter called dynamical annealing.

1006: This approximation is not thought to be justified in general, but may be

1007: asymptotically exact in some limiting cases we will expose later on.

1008:

1009: A first temptation is to approximate the expectation of the $N^{th}$ power of ${\bf H}$

1010: with the $N^{th}$ power of the expectation of ${\bf H}$. This is however too a brutal

1011: approximation to be meaningful, and a more refined scheme is needed.

1012:

1013: \begin{defn} Clause projection operator

1014:

1015: Consider an instance ${\bf I}$ of the 3-SAT problem. The clause vector

1016: $\vec C (S)$ of a partial state $S$ is a three dimensional vector

1017: $\vec C = (C_1, C_2, C_3)$ where $C_j$ is the number of undetermined

1018: clauses of ${\bf I}$ of type $j$. The clause projection operator,

1019: ${\bf P} ({\vec C})$, is the operator acting on $\bf V$ and

1020: projecting onto the subspace of partial state

1021: vectors with clause vectors $\vec C$,

1022: \begin{equation}

1023: {\bf P} ({\vec C}) \; |S\rangle = \left[ \prod _{j=1}^3 \delta _{ C_j - C_j(S)}\right] \

1024: |S\rangle \quad ,

1025: \end{equation}

1026: where $\delta$ is the Kronecker function. The sum of all state

1027: vectors in the spanning basis with clause vector $\vec C$ is denoted by

1028: $| \Sigma (\vec C) \rangle ={\bf P} ({\vec C})\; | \Sigma \rangle$.

1029: The sum of all state

1030: vectors in the spanning basis with clause vector $\vec C$ and

1031: $U$ undetermined variables is denoted by

1032: $| \Sigma _U (\vec C) \rangle$.

1033: \end{defn}

1034:

1035: It is an easy check that ${\bf P}$ is indeed a projection operator:

1036: ${\bf P}^2({\vec C})={\bf P}({\vec C})$.

1037: As the set of partial states can be

1038: partitioned according to their clause vectors,

1039: \begin{equation} \label{identriv}

1040: \sum _{\vec C}  {\bf P} ({\vec C}) = \sum _{\vec C}  {\bf P}^2 ({\vec C})

1041: ={\bf 1} \quad .

1042: \end{equation}

1043: We now introduce the clause vector-dependent branch function

1044: \begin{equation}

1045: B ( \vec C ,  T) = \langle \Sigma (\vec C) | {\bf H}^T |U \rangle

1046: \quad .

1047: \end{equation}

1048: Summation of the $B$'s over all $\vec C$ gives back function (\ref{funcq})

1049: from identity (\ref{identriv}).

1050: The evolution equation for $B ( \vec C , T)$ is,

1051: \begin{eqnarray} \label{ap2}

1052: B ( \vec C , T+1)  &=& \langle \Sigma (\vec C)| {\bf H} \;

1053: \times {\bf H}^T |U\rangle

1054: \nonumber \\

1055: &=& \langle \Sigma (\vec C)| {\bf H} \times

1056: \left( \sum _{\vec C'} {\bf P} ^2 ({\vec C'}) \right) \times {\bf H}^T |U\rangle

1057: \nonumber \\

1058: &=& \sum _{\vec C'}\;  \langle \Sigma (\vec C)| {\bf H} \times

1059:  {\bf P}  ({\vec C'}) \times \left( \sum _S |S\rangle \langle S|

1060: \right)\times  {\bf P} ({\vec C'})     \times {\bf H}^T |U\rangle

1061: \nonumber \\

1062: &=& \sum _{\vec C'} \sum _{S} \langle \Sigma (\vec C)|

1063: {\bf H} \times  {\bf P} ({\vec C'}) | S\rangle \; \langle S|

1064:  {\bf P} ({\vec C'})\times {\bf H}^T |U\rangle

1065: \end{eqnarray}

1066: where we have made use of identities (\ref{closure}) and (\ref{identriv}).

1067: We are now ready to do the two following approximation steps:

1068:

1069:

1070: \begin{approks} Dynamical annealing (step A)

1071:

1072: Substitute in equation (\ref{ap2}) the partial state vector

1073: \begin{equation}

1074: {\bf P} (\vec C')\, |S\rangle \quad \hbox{ with} \quad

1075: \frac{1}{\langle \Sigma | \Sigma _{N-T} (\vec C') \rangle} \;

1076: |\Sigma _{N-T} (\vec C') \rangle \quad ,

1077: \end{equation}

1078: that is, with its average over the set of basis vectors with clause

1079: vector $\vec C'$ and $N-T$ undetermined variables.

1080: \end{approks}

1081:

1082: Following step A, equation (\ref{ap2}) becomes an approximated

1083: evolution equation for $B$,

1084: \begin{equation} \label{ap3}

1085: B ( \vec C , T+1)  = \sum _{\vec C'}  {\bf \hat H} [ \vec C , \vec C' ;T]

1086: \;  B ( \vec C , T) \quad ,

1087: \end{equation}

1088: where the new evolution matrix ${\bf \hat H}$, not to be confused

1089: with ${\bf H}$, is

1090: \begin{equation} \label{ap4}

1091: {\bf \hat H} [ \vec C , \vec C' ;T ] =  \frac{ \langle \Sigma (\vec C)|

1092: {\bf H} | \Sigma _{N-T}(\vec C') \rangle} { \langle \Sigma |

1093: \Sigma _{N-T}(\vec C') \rangle}

1094: \quad .

1095: \end{equation}

1096: Then,

1097:

1098:

1099: \begin{approks} Dynamical annealing (step B)

1100:

1101: Substitute in equation (\ref{ap3}) the evolution matrix ${\bf \hat H}$ with

1102: \begin{equation}

1103: {\bf \bar H} [ \vec C , \vec C' ;T] =  \frac{ \overline{\langle \Sigma

1104: (\vec C)| {\bf H} | \Sigma _{N-T}(\vec C') \rangle}}{\overline{\langle

1105: \Sigma | \Sigma _{N-T}(\vec C') \rangle} }

1106: \end{equation}

1107: that is, consider the instance ${\bf I}$ is redrawn at each time

1108: step $T\to T+1$, keeping information about clause vectors at time $T$

1109: only.

1110: \end{approks}

1111:

1112: Let us interpret what we have done so far. The quantity we focus on

1113: is $\bar B(\vec C;T+1)$, the expectation number of branches at depth

1114: $T$ in the search tree (Figure~\ref{struct}) carrying partial states

1115: with clause vector

1116: $\vec C = (C_1,C_2,C_3)$. Within the dynamical annealing approximation,

1117: the evolution of the $\bar B$'s is Markovian,

1118: \begin{equation}

1119: \label{bradp}

1120: \bar B(\vec C;T+1)=\sum_{\vec C'}\;{\bf \bar H}\;[\vec C,\vec C';T]\;

1121: \bar B(\vec C';T) \ .

1122: \end{equation}

1123: The entries of the evolution matrix ${\bf \bar H}[\vec C, \vec C';T]$

1124: can be interpreted as the average number of branches with clause vector

1125: $\vec C$ that DPLL will generate through the assignment of one variable

1126: from a partial assignment (partial state) of variables

1127: with clause vector $\vec C'$.

1128:

1129: For the GUC heuristic, we find \citep{Coc},

1130: \begin{eqnarray}

1131: \label{bbra}

1132: && {\bf\bar H} [\vec C,\vec C'; T] = {C_3' \choose C_3'-C_3} \; \left

1133: (\frac 3{N-T}\right)^{C_3'-C_3} \; \left(1-\frac 3{N-T}\right)^{C_3}

1134: \times \nonumber \\ &&\qquad \qquad \qquad \qquad \sum_{w_2=0}^{C_3'-C_3} \left (\frac

1135: 1 2 \right)^{C_3'-C_3} {C_3'-C_3\choose w_2}\times \nonumber \\

1136: && \left \{(1 - \delta _{C'_1} ) \; \left( 1 - \frac1{2(N-T)} \right)^{C'_1-1}

1137: \sum_{z_2=0}^{C_2'} {C_2' \choose z_2} \left (\frac

1138: 2{N-T}\right)^{z_2} \right. \times \nonumber \\ &&

1139:  \left (1- \frac 2{N-T}\right)^{C_2'-z_2}

1140: \sum_{w_1=0}^{z_2} \left (\frac 1 2

1141: \right)^{z_2} {z_2\choose w_1}\; \delta_{C_2-C_2'-w_2+z_2}

1142: \;\delta_{C_1-C_1'-w_1+1}  + \nonumber \\&&

1143: \delta_{C_1'}

1144: \sum_{z_2=0}^{C_2'-1} {C_2'-1 \choose z_2} \left (\frac

1145: 2{N-T}\right)^{z_2}\, \left( 1- \frac 2{N-T}\right)^{C_2'-1-z_2}

1146: \times \nonumber \\ && \left.

1147: \sum_{w_1=0}^{z_2} \left(\frac 1 2 \right)^{z_2} {z_2\choose w_1}\;

1148: \delta_{C_2-C_2'-w_2+z_2+1} \; [\delta_{C_1-w_1} + \delta_{C_1-1-w_1}]

1149:  \right\} \ ,

1150: \end{eqnarray}

1151: where $\delta _X$ denotes the Kronecker delta function over integers $X$:

1152: $\delta _X=1$ if $X=0$, $\delta _X=0$ otherwise. Expression (\ref{bbra}) is

1153: easy to obtain from the interpretation following equation (\ref{bradp}).

1154:

1155: \subsection{Generating functions and asymptotic scalings at large $N$}

1156:

1157: Let us introduce the generating function $G (\,{\vec y}\,;T\,)$ of

1158: the average number of branches $\bar B(\,{\vec C}\,;T\,)$

1159: where $\vec y\equiv(y_1,y_2,y_3)$, through

1160: \begin{equation}

1161: \label{gener}

1162: G (\,{\vec y}\,;T\,) =\sum_{\vec C}\; e^{\,{\vec y} \cdot

1163: {\vec C}}\; \bar B(\,{\vec C}\,,T\,)\quad ,

1164: \quad

1165: {\vec y} \cdot  {\vec C} \equiv \sum _{j=1}^3 y_j \, C_j \quad .

1166: \end{equation}

1167: Evolution equation (\ref{ap3}) for the $\bar B$'s can be rewritten in

1168: term of the generating function $G$,

1169: \begin{eqnarray}

1170: \label{eqev}

1171: G(\,{\vec y}\,;T+1\,) &=& e^{-\gamma_1({\vec y})} \; G \big(

1172: \,{\vec \gamma}({\vec y})\,; T\,\big) + \nonumber \\

1173: && \left ( e^{- \gamma_2({\vec y})}

1174: (e^{y_1}+1) - e^{- \gamma_1({\vec y})}\right) \;

1175: G \big(-\infty,\, \gamma_2({\vec y}),\, \gamma_3({\vec y})\,; T\, \big)

1176: \end{eqnarray}

1177: where ${\vec \gamma}$ is a vectorial function of argument ${\vec y}$ whose

1178: components read

1179: \begin{eqnarray}

1180: \gamma_1 ({\vec y}) &=&y_1+\ln\left[1-\frac 1 {2(N-T)} \right]

1181: \quad , \nonumber\\

1182: \gamma_2({\vec y})&=&y_2+\ln\left[1+\frac 2 {N-T} \left (\frac{e^{-y_2}}{2}

1183: \left(1+ e^{y_1}\right) -1\right)\right]

1184: \quad , \nonumber \\

1185: \gamma_3 ({\vec y})&=&y_3+\ln\left[1+\frac 3 {N-T}  \left (\frac{e^{-y_3}}{2}

1186: \left(1+ e^{y_2}\right) -1\right)\right] \quad .

1187: \end{eqnarray}

1188: To solve equation (\ref{eqev}), we infer the large $N$ behaviour of

1189: $G$ from the following remarks:

1190:

1191: \begin{enumerate}

1192: \item Each time DPLL assigns

1193: variables through splitting or unit-propagation, the numbers $C_j$ of

1194: clauses of length $j$ undergo $O(1)$ changes. It is thus sensible to

1195: assume that, when the number of assigned variables increases from $T_1 = t\,

1196: N$ to $T_2=t\, N + \Delta T$

1197: with $\Delta T $ very large but $o(N)$ e.g. $\Delta T = \sqrt N$,

1198: the densities $c_2=C_2/N$ and $c_3=C_3/N$ of 2- and 3-clauses have

1199: been modified by $o(1)$.

1200:

1201: \item On the same time interval $T_1<T<T_2$, we expect

1202: the number of unit-clauses $C_1$ to vary at each time step. But its

1203: distribution $\rho (C_1|c_2,c_3;t)$, conditioned to the densities

1204: $c_2$, $c_3$ and the reduced time $t$, should reach some well defined limit

1205: distribution. This claim is a generalization of the result obtained

1206: by \citet{Fri} for the analysis of the GUC heuristic in the absence

1207: of backtracking.

1208:

1209: \item As long as a partial state does not violate the instance,

1210: very few unit-clauses

1211: are generated, and splitting frequently occurs. In other words, the

1212: probability that $C_1=0$ is strictly positive as $N$ gets

1213: large.

1214:

1215: \end{enumerate}

1216:

1217: The above arguments entice us to make the following

1218:

1219: \begin{claim} Asymptotic expression for the generating function $G$

1220:

1221: For large $N,T$ at fixed ratio $t=T/N$, the generating function

1222: (\ref{gener}) of the average numbers $\bar B$ of branches

1223: is expected to behave as

1224: \begin{equation}

1225: \label{scalinghyp2}

1226: G(y_1,y_2, y_3;t\,N ) = \exp \bigg[\, N \, \varphi ( y_2, y_3 ; t)

1227: + \psi ( y_1,y_2, y_3 ; t) + o(1) \bigg] \quad .

1228: \end{equation}

1229: \end{claim}

1230:

1231: Hypothesis (\ref{scalinghyp2}) expresses in a concise way some important

1232: information on the distribution of clause populations

1233: during the search process that we now extract.

1234: Call $\omega$ the Legendre transform of $\varphi$,

1235: \begin{equation}

1236: \omega (\,c_2,c_3\,;t\,) = \min _{y_2,y_3} \bigg[ \varphi

1237: (\, y_2,y_3 \, ; t\, ) - y_2\, c_2 - y_3\, c_3\bigg]

1238: \qquad . \label{inversion}

1239: \end{equation}

1240: Then, combining equations (\ref{gener}), (\ref{scalinghyp2}) and

1241: (\ref{inversion}), we obtain

1242: \begin{equation}

1243: \label{scalinghyp}

1244: \sum _{C_1\ge 0}\rho (C_1|c_2,c_3;t)\; \bar B(C_1,c_2\,N,

1245: c_3\, N; t\, N ) \asymp \exp \big[\,N \, \omega ( c_2,c_3 ; t\,)

1246: \big] \quad ,

1247: \end{equation}

1248: up to non exponential in $N$ corrections. In other words,

1249: the expectation value of the number of branches carrying partial states

1250: with $(1-t)\, N$ undetermined variables and

1251: $c_j\,N$ $j$-clauses ($j=2,3$) scales exponentially with $N$,

1252: with a growth function $\omega (c_2,c_3;t)$ related to $\varphi (y_2,y_3;t)$

1253: through identity (\ref{inversion}).

1254: Moreover, $\varphi (0,0;t)$ is the logarithm of the

1255: number of branches (divided by $N$) after a fraction $t$ of variables

1256: have been assigned. The most probable values of the densities $c_j(t)$

1257: of $j$-clauses are then obtained from the partial derivatives of

1258: $\varphi$: $c_j(t)=\partial \varphi /\partial y_j (0,0)$ for $j=2,3$.

1259:

1260: Let us emphasize that $\varphi$ in equation (\ref{scalinghyp2}) does

1261: not depend on $y_1$. This hypothesis simply expresses that, as far as

1262: non violating partial states are concerned,

1263: both terms on the right hand side of (\ref{eqev}) are of the same order,

1264: and that the density of unit-clauses, $c_1=\partial \varphi/\partial y_1$,

1265: identically vanishes.

1266:

1267: Similarly, function $\psi (y_1,y_2,y_3;t)$ is related to the generating

1268: function of distribution $\rho (C_1|c_2,c_3;t)$,

1269: \begin{equation} \label{psigen}

1270: \sum _{C_1\ge 0} \rho (C_1|c_2,c_3;t)\, \e^{y_1\, C_1} =

1271: e^{\psi (y_1,y_2,y_3;t) - \psi(0,y_2,y_3;t)} \quad ,

1272: \end{equation}

1273: where $c_j=\partial \varphi /\partial y_j (y_2,y_3;t)$ ($j=2,3$)

1274: on the left hand side of the above formula.

1275:

1276: Inserting expression (\ref{scalinghyp2}) into the evolution equation

1277: (\ref{eqev}), we find

1278: \begin{eqnarray} \label{mdar}

1279: \frac{\partial \varphi } {\partial t} (y_2,y_3;t) &=&

1280: -y_1 + \frac 2{1-t} \left[ e^{-y_2} \left( \frac{1+e^{y_1}}2 \right)

1281: -1 \right] \frac{\partial \varphi } {\partial y_2} (y_2,y_3;t)

1282: \nonumber \\

1283: &+& \frac 3{1-t} \left[ e^{-y_3} \left( \frac{1+e^{y_2}}2 \right)

1284: -1 \right] \frac{\partial \varphi } {\partial y_3} (y_2,y_3;t)

1285: \nonumber \\

1286: &+& \ln \left[ 1+ K(y_1,y_2) \; e^{\psi (-\infty,y_2,y_3;t)

1287: - \psi(y_1,y_2,y_3;t)} \right]

1288: \end{eqnarray}

1289: where $K(y_1,y_2)=e^{-y_2}(e^{2 \, y_1} + e^{y_1}) -1$.

1290: As $\varphi$ does not depend upon $y _1$, the latter may be chosen

1291: at our convenience e.g. to cancel $K$ and the contribution from the last

1292: term in equation (\ref{mdar}),

1293: \begin{equation}

1294: y_1 =

1295: Y_1 (y_2 ) \equiv y_2 - \ln \left(\frac{  1 +\sqrt{1+ 4 e^{y_2}}}2 \right)

1296: \quad .

1297: \end{equation}

1298: Such a procedure, sometimes called kernel method and, to our knowledge,

1299: first proposed by \citet{knu}, is correct in the major

1300: part of the $y_2,y_3$ space and, in particular, in the vicinity of

1301: $(0,0)$ we focus on in this paper\footnote{It has however to be

1302: to modified in a small region of the $y_2,y_3$ space; a complete

1303: analysis of this case was carried out by \citet{Coc}.}.

1304: We end up with the following partial differential equation (PDE) for

1305: $\varphi$,

1306: \begin{equation} \label{croi2}

1307: \frac{\partial \varphi } {\partial t} (y_2,y_3;t) = { H} \left[

1308: \frac{\partial \varphi } {\partial y_2} , \frac{\partial \varphi }

1309: {\partial y_3 } , y_2, y_3 , t \right] \quad ,

1310: \end{equation}

1311: where ${H}$ incorporates the details of the splitting

1312: heuristic\footnote{For the UC heuristic,

1313: \begin{equation}

1314: {H} _{UC} = {\ln2} +  \frac {3\, c_3}{1-t}\; \left[ e^{-y_3}\left(

1315: \;\frac{1+e^{y_2}}{2}\right) -1 \right]+

1316: \frac{c_2}{1-t}  \; \left(  \frac 32 e^{-y_2}

1317: -2 \right)   \qquad .

1318: \end{equation}},

1319: \begin{eqnarray}

1320: {H} _{GUC} [c_2 ,c_3, y_2, y_3 , t ] &=&

1321: -Y_1(y_2) +\frac {3\, c_3}{1-t}\; \left[ e^{-y_3}\;\left(

1322: \frac{1+e^{y_2}}{2}\right) -1 \right]\nonumber \\

1323: &+& \frac{c_2}{1-t}  \; \left( e^{ -Y_1(y_2)}

1324: -2 \right)  \qquad .

1325: \end{eqnarray}

1326: We must therefore solve the partial differential equation (PDE) (\ref{croi2})

1327: with the initial condition,

1328: \begin{equation}\label{initphi}

1329: \varphi(y_2,y_3,t=0) = \alpha _0 \; y_3 \quad ,

1330: \end{equation}

1331: obtained through inverse Legendre transform (\ref{inversion}) of the

1332: initial condition over $\bar B$, or equivalently over $\omega$,

1333: \begin{equation}

1334: \omega (c_2,c_3;t=0)= \left\{ \begin{array} {c c c}

1335: 0 &\hbox{\rm if} & c_3=\alpha _0\ ,\nonumber \\ -\infty

1336: &\hbox{\rm if} & c_3\ne\alpha _0 \ .\end{array}

1337: \right . \label{condi}

1338: \end{equation}

1339:

1340:

1341: \begin{center}

1342: \begin{figure}

1343: \includegraphics[height=200pt,angle=-90]{fig8a.ps}

1344: \includegraphics[height=200pt,angle=-90]{fig8b.ps}

1345: $^{\ }$\hskip 3cm \includegraphics[height=200pt,angle=-90]{fig8c.ps}

1346: \vskip .5cm

1347: \caption{Snapshots of the surface $\omega (p,\alpha;t )$ for $\alpha

1348: _0=10$ at three different times {\em i.e.} depths in the tree,

1349: $t=0.01$, 0.05 and 0.09 (from left to right, top to

1350: down). The height $\omega ^*(t)$ of the top of the surface, with

1351: coordinates $p^*(t), \alpha^*(t)$, is the logarithm (divided by $N$) of the

1352: number of branches. The coordinates $(p^*(t),\alpha ^*(t))$ define the tree

1353: trajectory shown in Figure~\ref{diag}. The halt line is hit at $t_h \simeq 0.094$.

1354: Note the overall growth of the surface $\omega (p,\alpha;t)$ with time

1355: (beware of the change of scales between figures).}

1356: \label{dome}

1357: \end{figure}

1358: \end{center}

1359:

1360:

1361: \subsection{Interpretation in terms of growth process}

1362:

1363: We can interpret the dynamical annealing approximation made in the previous

1364: paragraphs, and the resulting PDE (\ref{croi2}) as a description of

1365: the growth process of the search tree resulting from DPLL operation.

1366: Using Legendre transform (\ref{inversion}), PDE (\ref{croi2}) can

1367: be written as an evolution equation for the

1368: logarithm $\omega(c_2,c_3,t)$ of the average number of branches with

1369: parameters $c_2,c_3$ as the depth $t=T/N$ increases,

1370: \begin{equation}

1371: \frac{\partial \omega } {\partial t} (c_2,c_3,t) = { H} \left[ c_2, c_3,

1372: -\frac{\partial \omega } {\partial c_2} , -\frac{\partial \omega }

1373: {\partial c_3 } ,t \right] \qquad . \label{croi}

1374: \end{equation}

1375: Partial differential equation (PDE) (\ref{croi}) is

1376: analogous to growth processes encountered in statistical physics

1377: \citep{Gro}.  The surface $\omega$, growing with ``time'' $t$ above the

1378: plane $c_2,c_3$, or equivalently from (\ref{change}), above the plane

1379: $p,\alpha$ (Figure~\ref{dome}), describes the whole distribution of branches.

1380: The average number  of branches at depth $t$ in the tree equals

1381: \begin{equation}

1382: B(t) = \int _0 ^1 dp\; \int _0 d\alpha \; e^{N\, \omega (p,\alpha;t)} \asymp

1383: e^{N\, \omega ^* (t)} \quad ,

1384: \end{equation}

1385: where $\omega ^* (t)$ is the maximum over $p,\alpha$ of $\omega (p,\alpha;t)$

1386: reached in $p^*(t), \alpha^*(t)$.

1387: In other words, the exponentially dominant contribution to $B(t)$

1388: comes from branches carrying 2+p-SAT instances with parameters

1389: $p^*(t), \alpha^*(t)$, that is clause densities

1390: $c^*_2(t)= \alpha^*(t) (1-p^*(t))$, $c^*_3(t)= \alpha^*(t) p^*(t)$.

1391: Parametric plot of $p^*(t),\alpha ^*(t)$ as a function of $t$

1392: defines the tree trajectories on Figure~\ref{diag}.

1393:

1394: The hyperbolic line in Figure~\ref{diag} indicates the halt points, where

1395: contradictions prevent dominant branches from further growing.

1396: Each time DPLL assigns a variable through

1397: unit-propagation, an average number $u(p,\alpha)$ of new 1-clauses is

1398: produced, resulting in a net rate of $u-1$ additional 1-clauses.

1399: As long as $u< 1$, 1-clauses are quickly eliminated and do not

1400: accumulate.  Conversely, if $u  >1$, 1-clauses tend to accumulate.

1401: Opposite 1-clauses $x$ and $\bar x$ are likely to appear,

1402: leading to a contradiction \citep{Fra,Fri}. The halt line is defined through

1403: $u (p,\alpha)=1$, and reads \citep{Coc},

1404: \begin{equation}

1405: \alpha = \left( \frac{3+\sqrt 5}2 \right)

1406: \ln \left[ \frac{1+\sqrt 5}2 \right]\;\frac 1{1-p}

1407: \qquad .

1408: \end{equation}

1409: It differs from the halt line $\alpha=1/(1-p)$ corresponding to

1410: a single branch \citep{Fri}. As far as dominant branches are concerned,

1411: an alternative and simpler way of obtaining the halt criterion

1412: is through calculation of the probability

1413: $\rho ^*_S (t) \equiv \rho (C_1=0|c^*_2(t),c^*_3(t);t)$ that a split occurs

1414: when a variable is assigned by DPLL,

1415: \begin{equation}

1416: \rho ^*_S (t) = \exp \left(\frac{\partial \varphi}{\partial t}

1417: (0,0;t)\right) -1  \quad ,

1418: \end{equation}

1419: from equations (\ref{psigen},\ref{mdar}).

1420: The probability of split vanishes, and unit-clauses accumulate till

1421: a contradiction is obtained, when the tree stops growing.

1422: Along the tree trajectory, $\omega ^*(t)$ grows thus from 0, on the

1423: right vertical axis, up to some final positive value, $\omega_{THE}$,

1424: on the halt line. $\omega _{THE}$ is our theoretical prediction for the

1425: logarithm of the complexity (divided by $N$)\footnote{Notice that

1426: we have to divide the theoretical value by $\ln 2$ to match the definition

1427: used for numerical experiments; this is done in Table~1}.

1428:

1429:

1430: Equation (\ref{croi}) was solved using the method of characteristics.

1431: Using eqn. (\ref{change}), we have plotted the surface $\omega$ at different

1432: times, with the results shown in Figure~\ref{dome} for

1433: $\alpha _0=10$. Values of $\omega _{THE}$, obtained for

1434: $4.3<\alpha<20$ by solving equation

1435: (\ref{croi}) compare very well with numerical results (Table~1).

1436: We stress that, though our calculation is not rigorous,

1437: it provides a very good quantitative estimate of the complexity.

1438: It is therefore expected that our dynamical annealing approximation

1439: be quantitavely accurate. It is a reasonable conjecture that it becomes

1440: exact at large ratios $\alpha_0$, where PDE (\ref{croi2}) can be exactly solved:

1441:

1442: \begin{conj} Asymptotic equivalent of $\omega$ for large ratios

1443:

1444: Resolution of PDE (\ref{croi}) in the large ratio $\alpha_0$ limit

1445: gives (for the GUC heuristic),

1446: \begin{equation}

1447: \omega _{THE} (\alpha _0 ) \asymp \frac {3+\sqrt{5}}{6 \,\ln 2}\;

1448: \left[ \ln \left( \frac{1+\sqrt 5}{2} \right) \right]^2 \; \frac 1{\alpha_0}

1449: \quad .

1450: \end{equation}

1451: This result exhibits the $1/\alpha _0$ scaling proven by \citet{Bea}, and

1452: is conjectured to be exact.

1453: \end{conj}

1454:

1455: As $\alpha _0$ increases, search trees become smaller and smaller, and

1456: correlations between branches, weaker and weaker, making dynamical

1457: annealing more and more accurate.

1458:

1459: \begin{center}

1460: \begin{figure}

1461: \includegraphics[width=400pt,height=250pt,angle=0]{fig9.eps}

1462: \vskip .3cm

1463: \caption{Detailed structure of the search tree in the

1464: upper sat phase ($\alpha _L < \alpha < \alpha _C$). DPLL starts with a

1465: satisfiable 3-SAT instance and transforms it into a sequence of

1466: 2+p-SAT instances. The leftmost branch in the tree

1467: symbolizes the first descent made by DPLL. Above node $G_0$,

1468: instances are satisfiable while below

1469: $G_1$, instances have no solutions. A grey triangle accounts for the

1470: (exponentially) large refutation subtree that DPLL has to go through

1471: before backtracking above $G_1$ and reaching $G_0$. By definition, the highest

1472: node reached back by DPLL is $G_0$. Further backtracking, below $G_0$, will

1473: be necessary but a solution will be eventually found (right subtree), see

1474: Figure~\ref{trees}C.}

1475: \label{treeinter}

1476: \end{figure}

1477: \end{center}

1478:

1479: \section{Upper phase and mixed branch--tree trajectories.}

1480:

1481: The interest of the trajectory framework proposed in this paper is best seen

1482: in the upper sat phase, that is, for

1483: ratios $\alpha _0$ ranging from $\alpha _L$

1484: to $\alpha _C$. This intermediate region juxtaposes branch and tree

1485: behaviors, see search tree in Figures~\ref{trees}C and \ref{treeinter}.

1486:

1487: The branch trajectory,

1488: started from the point $(p=1,\alpha _0)$ corresponding to the initial

1489: 3-SAT instance, hits the critical line $\alpha_c(p)$

1490: at some point G with coordinates ($p_G,\alpha_G$) after $N\;t_G$ variables

1491: have been assigned by DPLL, see Figure~\ref{diag}.  The

1492: algorithm then enters the unsat phase and, with high probability,

1493: generates a 2+p-SAT instance

1494: with no solution. A dense subtree that DPLL has to go through

1495: entirely, forms beyond G till the halt line (left subtree in

1496: Figure~\ref{treeinter}).  The size of this subtree

1497: can be analytically predicted from the theory exposed in

1498: Section~3. All calculations are identical, except

1499: initial condition (\ref{initphi}) which has to be changed into

1500: \begin{equation}

1501: \varphi(y_2,y_3,t=0) = \alpha _G \; (1-p_G) \; y_2

1502: +\alpha _G \; p_G \; y_3 \quad .

1503: \end{equation}

1504: As a result we obtain the size $2^{N_G \; \omega _G}$ of the unsatisfiable

1505: subtree to be backtracked (leftmost subtree in Figure~\ref{treeinter}).

1506: $N_G = N\,(1-t_G)$ denotes the number of undetermined variables

1507: at point $G$.

1508:

1509: $G$ is the highest backtracking node in the tree (Figures~\ref{trees}C

1510: and \ref{treeinter}) reached back by

1511: DPLL, since nodes above G are located in the sat phase and carry

1512: 2+p-SAT instances with solutions. DPLL will eventually reach

1513: a solution. The corresponding branch (rightmost

1514: path in Figure~\ref{trees}C) is highly non typical and does not contribute

1515: to the complexity, since almost all branches in the search tree

1516: are described by the tree trajectory issued from G (Figure~\ref{diag}).

1517: We expect that the computational effort DPLL requires to find

1518: a solution will, to exponential order in $N$, be given by the

1519: size of the left unsatisfiable subtree of Figure~\ref{treeinter}.

1520: In other words, massive backtracking will certainly be present

1521: in the right subtree (the one leading to the solution), and no

1522: significant statistical difference is expected between both subtrees.

1523:

1524: We have experimentally checked this scenario for

1525: $\alpha _0=3.5$. The average coordinates of the highest

1526: backtracking node, $(p_G\simeq 0.78, \alpha _G \simeq 3.02$), coincide

1527: with the computed intersection of the single branch trajectory

1528: (Section 2.2) and the estimated critical line $\alpha_c(p)$ \citep{Coc}.

1529: As for complexity, experimental measures of $\omega$ from

1530: 3-SAT instances at $\alpha _0= 3.5$, and of $\omega _G$ from

1531: 2+0.78-SAT instances at $\alpha _G =3.02$, obey the

1532: expected identity

1533: \begin{equation}

1534: \omega _{THE} = \omega _G \times (1-t_G) \quad ,

1535: \end{equation}

1536: and are in very good agreement with theory (Table~1). Therefore,

1537: the structure of search trees corresponding to instances of 3-SAT

1538: in the upper sat regime reflects the

1539: existence of a critical line for 2+p-SAT instances.

1540:

1541:

1542: \section{Conclusions.}

1543:

1544: In this paper, we have exposed a procedure to understand the

1545: complexity pattern of the backtrack resolution of the random

1546: Satisfiability problem (Figure~\ref{sche}). Main steps are:

1547: \begin{enumerate}

1548: \item  Identify the  space of parameters in which

1549: the dynamical evolution takes place; this space will be generally larger

1550: than the initial parameter space since the algorithm modifies the instance

1551: structure. While the distribution of 3-SAT instances is characterized by

1552: the clause per variable ratio

1553: $\alpha$ only, another parameter $p$ accounting for

1554: the emergence of 2-clauses has to be considered.

1555: \item Divide the parameter space

1556: into different regions (phases) depending on the output

1557: of the resolution e.g. sat/unsat phases for 2+p-SAT.

1558: \item Represent the action of  the algorithm as trajectories in this

1559: phase diagram. Intersection of trajectories with the phase boundaries

1560: allow to distinguish hard from easy regimes (Figure~\ref{sche}).

1561: \end{enumerate}

1562:

1563: In addition, we have also presented

1564: a non rigorous study of the search tree growth, which allows us to

1565: accurately estimate the complexity of resolution in presence

1566: of massive backtracking. From a mathematical point of view, it is worth

1567: noticing that monitoring the growth of the search tree requires

1568: a PDE, while ODEs are sufficient to account for the evolution of a single

1569: branch~\citep{Achl}.

1570:

1571: An interesting question raised by this picture is the robustness of

1572: the polynomial/exponential crossover point T (Figure 3). While the

1573: ratio $\alpha_L$ separating easy (polynomial) from hard (exponential)

1574: resolutions depends on the heuristics used by DPLL ($\alpha _L ^{GUC} \simeq

1575: 3.003$, $\alpha _L ^{UC} = 8/3$), T appears to be located at the same

1576: coordinates $(p_T=2/5,\alpha _T =5/3)$ for all three UC, GUC, and

1577: SC$_1$ heuristics. From a technical point of view, the robustness of T

1578: comes from the structure of the ODEs (\ref{ode}). The coordinates of

1579: T, and the time $t_T$ at which the branch trajectory issued from

1580: $(p=1,\alpha _0=\alpha _L)$ hits the critical line $\alpha _C(p)$

1581: tangentially, obey the equations $\rho _1 = \partial \rho _1 /\partial

1582: t = 0$ with $\rho _1 = 1- \alpha(t) \big(1-p(t)\big)$.  The set of

1583: ODEs (\ref{ode}), combined with the previous conditions, gives

1584: $p_T=2/5$ \citep{Achl}.

1585:

1586: This robustness explains why the polynomial/exponential crossover

1587: location of critically constrained 2+p-SAT instances, which should

1588: {\em a priori} depend  on the algorithm used, was found by

1589: \citet{Sta1} to coincide roughly with the algorithm--independent,

1590: tricritical point on the $\alpha_C(p)$ line.

1591:

1592: Our approach has already been extended to other decision problems, e.g. the

1593: vertex covering of random graphs \citep{Wei} or the coloring

1594: of random graphs \citep{Liat} (see \citep{Cris} for recent rigorous

1595: results on backtracking in this case). It is important to stress that it is

1596: not limited to the determination of the average solving time, but may

1597: also be used to capture its distribution \citep{Gent2,Coc3,Mont}

1598: and to understand the efficiency of restarts techniques \citep{Gomes}.

1599: Finally, we emphasize that theorem 6 relates the computational effort

1600: to the evolution operator representing the elementary steps

1601: of the search heuristic for a {\em given} instance. It is expected

1602: that this approach will be useful to obtain results on the average-case

1603: complexity of DPLL at fixed instance, where the average is performed over the

1604: random choices done by the algorithm only \citep{momo}.

1605:

1606: \begin{center}

1607: \begin{figure}

1608: \includegraphics[height=250pt,angle=0]{fig10.eps}

1609: \caption{Schematic representation of the resolution trajectories in the

1610: sat (branch trajectories symbolized with dashed line) and unsat (tree

1611: trajectories represented by hatched regions) phases. DPLL goes along

1612: branch trajectories in a linear time, but takes an exponential time

1613: to go through tree trajectories. The mixed case

1614: of hard sat instances correspond to the crossing of the boundary separating

1615: the two phases (bold line), which leads to the exploration of unsat subtrees

1616: before a solution is finally found.}

1617: \label{sche}

1618: \end{figure}

1619: \end{center}

1620:

1621:

1622: \begin{ack}

1623: We thank J. Franco for his constant support during

1624: the completion of this work. R. Monasson was in part supported by

1625: the ACI Jeunes Chercheurs ``Algorithmes d'optimisation et

1626: syst\`emes d\'esordonn\'es quantiques'' from the French Ministry of Research.

1627: \end{ack}

1628:

1629: \begin{thebibliography}{}

1630:

1631: \harvarditem{Achlioptas et al.}{2001a}{Achl1}

1632: Achlioptas, D., Kirousis, L., Kranakis, E. and Krizanc, D.

1633: Rigorous results for random (2+p)-SAT, {\em Theor. Comp.

1634: Sci.} {\bf 265}, 109-129  (2001).

1635:

1636: \harvarditem{Achlioptas}{2001b}{Achl}

1637: Achlioptas, D.

1638: Lower bounds for random 3-SAT via differential equations,

1639: {\em Theor. Comp. Sci.} {\bf 265}, 159--185 (2001).

1640:

1641: \harvarditem{Achlioptas, Beame and Molloy}{2001c}{Achl3}

1642: Achlioptas, D., Beame, P. and Molloy, M.

1643: A Sharp Threshold in Proof Complexity.

1644: in {\em Proceedings of STOC 01}, p.337-346 (2001).

1645:

1646: \harvarditem{Beame et al.}{1998}{Bea}

1647: Beame, P., Karp, R., Pitassi, T. and Saks, M.

1648: {\em ACM Symp.\ on Theory of Computing (STOC98)}, 561--571

1649: Assoc. Comput. Mach., New York (1998).

1650:

1651: \harvarditem{Chao and Franco}{1986}{fra2}

1652: Chao, M.T. and Franco, J.

1653: Probabilistic analysis of two heuristics for the 3-satisfiability problem,

1654: {\em SIAM Journal on Computing} {\bf 15}, 1106-1118 (1986).

1655:

1656: \harvarditem{Chao and Franco}{1990}{Fra}

1657: Chao, M.T. and Franco, J.

1658: Probabilistic analysis of a generalization of the unit-clause literal

1659: selection heuristics for the k-satisfiability problem,

1660: {\em Information Science} {\bf 51}, 289--314 (1990).

1661:

1662: \harvarditem{Chv\`atal and Szmeredi}{1988}{Chv}

1663: Chv{\`a}tal, V. and Szmeredi, E.

1664: Many hard examples for resolution,

1665: {\em Journal of the ACM} {\bf 35}, 759--768 (1988).

1666:

1667: \harvarditem{Coarfa et al.}{2000}{Vardi1}

1668: Coarfa, C., Dernopoulos, D.D., San Miguel Aguirre, A., Subramanian, D.

1669: and Vardi, M.Y. Random 3-SAT: The plot thickens. In R. Dechter, editor,

1670: {\em Proc. Principles and Practice of Constraint Programming

1671: (CP'2000)}, Lecture Notes in Computer Science 1894, 143-159 (2000).

1672:

1673: \harvarditem{Cocco and Monasson}{2001}{Coc}

1674: Cocco, S. and Monasson, R.

1675: Trajectories in phase diagrams, growth processes

1676: and computational complexity: how search algorithms solve the

1677: 3-Satisfiability problem, {\em Phys. Rev. Lett.} {\bf 86}, 1654 (2001);

1678: Analysis of the computational complexity of solving random satisfiability

1679: problems using branch and bound search algorithms, {\em Eur. Phys. J. B}

1680: {\bf 22}, 505 (2001).

1681:

1682: \harvarditem{Cocco and Monasson}{2002}{Coc3}

1683: Cocco, S. and Monasson R.

1684: Exponentially hard problems are sometimes polynomial, a large deviation

1685: analysis of search algorithms for the random satisfiability problem,

1686: and its application to stop-and-restart resolutions,

1687: {\em Phys. Rev. E} {\bf 66}, 037101 (2002).

1688:

1689: \harvarditem{Crawford and Auton}{1996}{Cra}

1690: Crawford, J. and Auton, L.

1691: Experimental Results on the Cross-Over Point in Satisfiability Problems,

1692: {\em Proc.\ 11th Natl.\ Conference on Artificial Intelligence

1693: (AAAI-93),} 21--27, The AAAI Press / MIT Press, Cambridge, MA (1993);

1694: {\em Artificial Intelligence} {\bf  81} (1996).

1695:

1696: \harvarditem{Davis, Logemann and Loveland}{1962}{DP}

1697: Davis, M., Logemann, G., Loveland, D.

1698: {A machine program for theorem proving.}

1699: {\em Communications of the ACM} {\bf 5}, 394-397 (1962).

1700:

1701: \harvarditem{Dubois, Boufkhad and Mandler}{2000}{Dub}

1702: Dubois, O., Boufkhad, Y. and Mandler, J.

1703: Typical random 3-SAT formulae and the satisfiability threshold.

1704: {\em SODA}, p. 126-127 (2000).

1705:

1706: \harvarditem{Dubois et al.}{2001}{Dubtcs}

1707: Dubois, O., Monasson, R., Selman, B. and Zecchina, R. (eds)

1708: Phase transitions in combinatorial problems.

1709: {\em Theor. Comp. Sci.} {\bf 265} (2001).

1710:

1711: \harvarditem{Franco}{2001}{Fra2}

1712: Franco, J.

1713: Results related to thresholds phenomena research in satisfiability:

1714: lower bounds.

1715: {\em Theor. Comp. Sci.} {\bf 265}, 147--157 (2001).

1716:

1717: \harvarditem{Friedgut}{1999}{Friedgut}

1718: Friedgut, E.

1719: Sharp thresholds of graph properties, and the k-sat problem,

1720: {\em Journal of the A.M.S.} {\bf 12}, 1017 (1999).

1721:

1722: \harvarditem{Frieze and Suen}{1996}{Fri}

1723: Frieze, A. and Suen, S.

1724: Analysis of two simple heuristics on a random instance of k-SAT,

1725: {\em Journal of Algorithms} {\bf 20}, 312--335 (1996).

1726:

1727: \harvarditem{Gent and Walsh}{1994}{Gent2}

1728: Gent, I.P. and Walsh, T. Easy problems are sometimes hard,

1729: {\em Artificial Intelligence} {\bf 70}, 335-345 (1994).

1730:

1731: \harvarditem{Gent, van Maaren and Walsh}{2000}{Hans}

1732: Gent , I., van Maaren, H. and Walsh, T. (eds).

1733: SAT2000: Highlights of Satisfiability Research in the Year 2000,

1734: {\em Frontiers in Artificial Intelligence and Applications},

1735: vol. 63, IOS Press, Amsterdam (2000).

1736:

1737: \harvarditem{Gomes et al.}{2000}{Gomes}

1738: Gomes, C.P., Selman, B., Crato, N. and Kautz, H. {\em J. Automated

1739: Reasoning} {\bf 24}, 67 (2000).

1740:

1741: \harvarditem{Gu, Purdom, Franco and Wah}{1997}{survey}

1742: Gu, J., Purdom, P.W., Franco, J. and Wah, B.W.

1743: Algorithms for satisfiability (SAT) problem: a survey.

1744: {\em DIMACS Series on Discrete Mathematics

1745: and Theoretical Computer Science} {\bf 35}, 19-151,

1746: American Mathematical Society (1997).

1747:

1748: \harvarditem{Hartmann and Weigt}{2001}{Wei}

1749: Hartmann, A. and Weigt, M.

1750: Typical solution time for a vertex-covering algorithm on

1751: finite-connectivity random graphs,

1752: {\em Phys. Rev. Lett.} {\bf 86}, 1658 (2001).

1753:

1754: \harvarditem{Hogg, Huberman and Williams}{1996}{AI}

1755: Hogg, T., Huberman, B.A. and Williams, C. (eds).

1756: Frontiers in problem solving: phase transitions and complexity.

1757: {\em Artificial Intelligence} {\bf 81} I \& II (1996).

1758:

1759: \harvarditem{Jia and Moore}{2003}{Cris}

1760: Jia, H. and Moore, C.

1761: How much backtracking does it take to 3-color a random graph?

1762: {\em preprint} (2003).

1763:

1764: \harvarditem{Kaporis, Kirousis and Lalas}{2002}{Kap}

1765: Kaporis, A.C., Kirousis, L.M. and Lalas, E.G.

1766: The Probabilistic Analysis of a Greedy Satisfiability Algorithm.

1767: {\em ESA}, p. 574-585 (2002).

1768:

1769: \harvarditem{Knuth}{1968}{knu}

1770: Knuth, D.E.

1771: The art of computer programming, vol. 1: Fundamental algorithms,

1772: Section 2.2.1, Addison-Wesley, New York (1968).

1773:

1774: \harvarditem{Ein-Dor and Monasson}{2003}{Liat}

1775: Ein-Dor, L. and Monasson, R.

1776: The dynamics of proving uncolorability of large random graphs.

1777: {\em J. Phys. A} {\bf 36} 11055 (2003).

1778:

1779: \harvarditem{McKane, Droz, Vannimenus and Wolf}{1995}{Gro}

1780: McKane, A. Droz, M. Vannimenus, J. and Wolf D. (eds),

1781: {Scale invariance, interfaces, and non--equilibrium dynamics},

1782: {\em Nato Asi Series B: Physics}, vol. 344, Plenum Press, New-York (1995).

1783:

1784: \harvarditem{Mitchell, Selman and Levesque}{1992}{Mit}

1785: Mitchell, D., Selman, B. and Levesque, H.

1786: Hard and Easy Distributions of SAT Problems,

1787: {\em Proc.\ of the Tenth Natl.\ Conf.\ on Artificial Intelligence

1788: (AAAI-92)}, 440-446,

1789: The AAAI Press / MIT Press, Cambridge, MA (1992).

1790:

1791: \harvarditem{Monasson et al.}{1999}{Sta1}

1792: Monasson, R., Zecchina, R., Kirkpatrick, S., Selman, B. and Troyansky, L.

1793: Determining computational complexity from characteristic 'phase transitions'.

1794: {\em Nature} {\bf 400}, 133--137 (1999);

1795: 2+p-SAT: Relation of Typical-Case Complexity to the Nature of

1796: the Phase Transition,

1797: {\em Random Structure and Algorithms} {\bf 15}, 414 (1999).

1798:

1799: \harvarditem{Monasson}{2003}{momo}

1800: Monasson, R. On the analysis of backtrack procedures for the coloring of

1801: random graphs. {\em preprint} (2003).

1802:

1803: \harvarditem{Montanari and Zecchina}{2002}{Mont}

1804: Montanari, A. and Zecchina, R.

1805: Optimizing searches via rare events.

1806: {\em Phys. Rev. Lett.} {\bf 88}, 178701 (2002)

1807:

1808: \harvarditem{Selman and Kirkpatrick}{1994}{Kir}

1809: Selman, B. and Kirkpatrick, S.

1810: Critical Behavior in the Satisfiability of Random Boolean Expressions.

1811: {\em Science} {\bf 264}, 1297--1301 (1994).

1812:

1813: \harvarditem{Wormald}{1995}{Worm}

1814: Wormald, N.

1815: Differential equations for random processes and random graphs.

1816: {\em Ann. Appl. Probab.} {\bf 5}, 1217-1235 (1995).

1817:

1818: \end{thebibliography}

1819:

1820: \end{document}

1821:

1822: