1: \documentclass[pra,floats,aps,nofootinbib,superscriptaddress]{revtex4}
2: \usepackage{graphicx}
3: \usepackage{amssymb}
4: \usepackage{epsfig}
5: \input psfig.sty
6: \begin{document}
7: \title{Approximate analysis of search algorithms with ``physical'' methods}
8: \author{S. Cocco}
9: \affiliation{CNRS-Laboratoire de Dynamique des Fluides Complexes,
10: 3 rue de l'universit\'e 67084 Strasbourg, France}
11: \author{R. Monasson}
12: \affiliation{CNRS-Laboratoire de Physique Th{\'e}orique de l'ENS,
13: 24 rue Lhomond, 75005 Paris, France}
14: \affiliation{CNRS-Laboratoire de Physique Th{\'e}orique,
15: 3 rue de l'universit\'e, 67084 Strasbourg, France}
16: \author{A. Montanari}
17: \affiliation{CNRS-Laboratoire de Physique Th{\'e}orique de l'ENS,
18: 24 rue Lhomond, 75005 Paris, France}
19: \author{G. Semerjian}
20: \affiliation{CNRS-Laboratoire de Physique Th{\'e}orique de l'ENS,
21: 24 rue Lhomond, 75005 Paris, France}
22:
23: \begin{abstract}
An overview of some methods of statistical physics applied to the analysis
of algorithms for optimization problems (satisfiability of Boolean
constraints, vertex cover of graphs, decoding, ...)
with random distributions of inputs is presented.
28: Two types of algorithms are analyzed: complete procedures with
backtracking (Davis-Putnam-Loveland-Logemann algorithm) and incomplete,
30: local search procedures (gradient descent, random walksat, ...).
31: The study of complete algorithms makes use of physical concepts such as
32: phase transitions, dynamical renormalization flow, growth processes, ...
33: As for local search procedures, the connection between computational
34: complexity and the structure of the cost function landscape is questioned,
35: with emphasis on the notion of metastability.
36: \end{abstract}
37:
38: \maketitle
39:
40: \section{Introduction}
41:
The computational effort needed to deal with large combinatorial
structures varies considerably with the task to be performed and the
resolution procedure used\cite{papadimi}. The worst-case complexity
of a task, more precisely an optimization or decision problem, is
defined as the time required by the best algorithm to treat any
possible input to the problem. For instance, the problem of sorting a
list of $N$ numbers has worst-case complexity $\sim N\log N$: there
exist several algorithms that can order any list in at most $\sim N\log
N$ elementary operations, and none with asymptotically
fewer operations. Unfortunately, the worst-case complexities
of many important computational problems, called NP-complete, are not
known. Partitioning a list of $N$ numbers into two sets with equal
partial sums is one among hundreds of such NP-complete problems. It
is a fundamental conjecture of theoretical computer science that there
exists no algorithm capable of partitioning any list of length $N$,
or of solving any other NP-complete problem with inputs of size $N$, in
a time bounded by a polynomial in $N$. Therefore, when dealing with
such a problem, one necessarily uses algorithms which may take
exponential time on some inputs. Quantifying how `frequent' these
hard inputs are for a given algorithm is the question addressed by the
analysis of algorithms. In this paper, we present an overview
of recent work done by physicists to address this point, and more
precisely to characterize the average performance, hereafter called
complexity, of a given
algorithm over a distribution of inputs to an optimization problem.
67:
68: The history of algorithm analysis by physical methods/ideas is at least as
69: old as the use of computers by physicists. One well-established chapter in
70: this history is, for instance, the analysis of Monte Carlo sampling algorithms
71: for statistical mechanics models. In this context, it is well known that
72: phase transitions, {\em i.e.} abrupt changes in the physical
73: properties of the model, can imply a dramatic increase in the time
necessary to the sampling procedure. This phenomenon is commonly
known as critical slowing down. The physicists' insight into this problem
comes mainly from the analogy between the dynamics of algorithms and the
physical dynamics of the system. This analogy is quite natural:
in fact many algorithms mimic the physical dynamics itself.
79:
80: A quite new idea is instead to abstract from physically motivated problems
81: and use statistical mechanics ideas for analyzing the dynamics of algorithms.
82: In effect there are many reasons which suggest that analysis of algorithms and
83: statistical physics should be considered close relatives. In
84: both cases one would like to understand the asymptotic behavior
85: of dynamical processes acting on exponentially large (in the size
86: of the problem) configuration spaces. The differences between the two
87: disciplines mainly lie in the methods (and, we are tempted to say, the style)
88: of investigation. Theoretical computer science derives rigorous results
89: based on probability theory. However these results are sometimes too weak
90: for a complete characterization of the algorithm. Physicists provide instead
91: heuristic results based on intuitively sensible approximations. These
92: approximations are eventually validated by a comparison with numerical
93: experiments. In some lucky cases, approximations are asymptotically
94: irrelevant: estimates are turned into conjectures left for future rigorous
95: derivations.
96:
Perhaps more interesting than stylistic differences is the {\it point of
view} which physics brings with it. Let us highlight two consequences of
this point of view.
100:
101: First, a particular importance is attributed to ``complexity phase
102: transitions'' {\em i.e.} abrupt changes in the resolution
103: complexity as some parameter defining the input distribution
104: is varied\cite{AI,Friedgut}.
105: We shall consider two examples in the next Sections:
106: \begin{itemize}
107: \item Random Satisfiability of Boolean constraints (SAT).
108: In $K$-SAT one is given an instance, that is, a
109: set of $M$ logical constraints (clauses)
among $N$ Boolean variables, and wants to find a truth assignment for the
variables which fulfills all the constraints. Each clause is the logical OR of
$K$ literals, a literal being one of the $N$ variables or its
negation, e.g. $(x_1 \vee x_{17} \vee \overline{x_{31}})$ for 3-SAT.
114: Random $K$-SAT is the $K$-SAT problem supplied with a distribution of
115: inputs uniform over all instances having fixed values of $N$ and $M$. The
116: limit of interest is $N,M\to \infty$ at fixed ratio $\alpha=M/N$ of
117: clauses per variable\cite{Mit,Hans}.
118: \item Vertex cover of random graphs (VC).
An input instance of the VC decision problem consists of a
graph $G$ and an integer $X$.
The problem is to find a way to distribute $X$ covering marks over the
vertices in such a way that every edge of the graph is
covered, that is, has at least one of its end vertices marked.
A possible distribution of inputs is provided by drawing random graphs
$G$ {\em \`a la} Erd\H{o}s-R\'enyi, {\em i.e.}
with uniform probability among all the graphs having
$N$ vertices and $E$ edges. The limit of interest is
$N,E\to\infty$ at fixed average degree $c=2E/N$.
129: \end{itemize}
130: The algorithms for random SAT and VC we shall consider in the next
131: Sections undergo a complexity phase transition as the input parameter
132: $\pi$ ($=\alpha$ for SAT, $c$ for VC) crosses some critical threshold
133: $\pi_{\rm alg}$. Typically resolution of a randomly drawn
134: instance requires linear time below the threshold
135: $\pi<\pi_{\rm alg}$ and exponential time above
136: $\pi>\pi_{\rm alg}$. The observation that most difficult
137: instances are located near the phase boundary confirms the relevance of
138: the phase-transition phenomenon.
139:
140: Secondly, a key role is played by the intrinsic (algorithm
141: independent) properties of the instance under study. The intuition is
142: that, underlying the dramatic slowing down of a particular algorithm,
143: there can be some {\it qualitative} change in some structural property
144: of the problem e.g. the geometry of the space of solutions. While
145: there is no general understanding of this question, we can further
146: specify the above statements case-by-case. Let us consider, for
147: instance, a local search algorithm for a combinatorial optimization
148: problem. If the algorithm never increases the value of the cost
149: function $F(C)$ where $C$ is the configuration (assignment) of variables to be
150: optimized over, the number and geometry of the local minima of $F(C)$
151: will be crucial for the understanding of the dynamics of the algorithm. This
152: example is illustrated in Sec. \ref{gradxor}. While the ``dynamical''
153: behavior of a particular algorithm is not necessarily related to any
154: ``static'' property of the instance, this approach is nevertheless of
155: great interest because it could provide us with some `universal'
156: results. Some properties of the instance, for example, may imply the
157: ineffectiveness of an entire class of algorithms.
158:
While we shall mainly study in this paper
the performance of search algorithms applied to
hard combinatorial problems such as SAT and VC, we will also consider easy, that is,
polynomial, problems as benchmarks for these algorithms.
The reason is that we want to understand whether the average hardness of
solving NP-complete problems with a given distribution
of instances and a given algorithm truly reflects the
intrinsic hardness of these combinatorial problems or is
simply due to some lack of
efficiency of the algorithm under study. The benchmark problem we shall
consider is random XORSAT.
It is a version of the satisfiability problem, much simpler than SAT
from a computational complexity point of
view\cite{Crei}. The only, but essential,
difference from SAT is that a
clause is said to be satisfied if the exclusive, and not inclusive,
disjunction of its literals is true.
176: XORSAT may be recast as a linear algebra problem, where a set of
177: $M$ equations involving $N$ Boolean variables must be satisfied modulo
178: 2, and is therefore solvable in polynomial time by
various methods, e.g. Gaussian elimination. Nevertheless, it is
legitimate to ask how general search algorithms perform
on this kind of polynomial computational problem. In particular,
182: we shall see that some algorithms requiring exponential times to solve
183: random SAT instances behave badly on random XORSAT instances too.
184: A related question we shall focus on in Sec.~\ref{CodeSection} is
185: decoding, which may also, in some cases, be expressed as the
186: resolution of a set of Boolean equations.
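
As a minimal illustration of the recasting of XORSAT as a linear algebra
problem (the clauses below are a toy example of our own, with the assumed
convention true $=1$, false $=0$, and are not taken from any instance studied
in this paper), note that a negated literal reads $\overline{x}=1+x$ modulo 2.
The XOR clause on the literals $x_1$, $x_2$, $\overline{x_3}$ is thus satisfied
if and only if
\begin{equation}
x_1+x_2+(1+x_3) = 1 \quad ({\rm mod}\ 2) \qquad \Longleftrightarrow \qquad
x_1+x_2+x_3 = 0 \quad ({\rm mod}\ 2) \ .
\end{equation}
Adding a second clause, say on the literals $x_1$, $x_2$, $x_4$, that is,
$x_1+x_2+x_4 = 1$ modulo 2, and summing the two equations eliminates $x_1$ and
$x_2$, leaving $x_3+x_4 = 1$ modulo 2. Repeating such elimination steps is
Gaussian elimination over the two-element field, which requires a number of
operations polynomial in $N$ and $M$.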
187:
188: The paper is organized as follows. In Sec.~\ref{DpllSection} we shall review
189: backtracking search algorithms, which, roughly speaking, work in the
190: space of instances. We explain the general ideas and then illustrate them on
191: random SAT (Sec.~\ref{DpllSatSection}) and VC (Sec.~\ref{DpllVcSection}).
192: In Sec.~\ref{DpllFlucSection} we consider the fluctuations in running times of
193: these algorithms and analyze the possibility of exploiting these fluctuations
194: in random restart strategies. In Sec.~\ref{LocalSection} we turn to local
195: search algorithms, which work in the space of configurations.
196: We review the analysis of such algorithms for decoding problems
197: (Sec.~\ref{CodeSection}), random XORSAT (Sec.~\ref{gradxor}),
198: and SAT (Sec.~\ref{WalkSatSection}).
199: Finally in the Conclusion we suggest some possible future developments
200: in the field.
201:
\section{Analysis of the Davis-Putnam-Loveland-Logemann search procedure}
203:
204: \subsection{Overview of the algorithm and physical concepts}
205: \label{DpllSection}
206:
207: In this section, we briefly review the Davis-Putnam-Loveland-Logemann
(DPLL) procedure~\cite{DP,survey}. A decision problem can be formulated as a
constraint satisfaction problem, in which one seeks an assignment of a set of
variables that fulfills some given constraints. For simplicity, we
suppose here that variables may take a finite set of values with
cardinality $v$, e.g. $v=2$ for SAT or VC. DPLL is an exhaustive search
procedure operating by trial and error, the sequence of which can be
graphically represented by a search tree (Fig.~\ref{trees}). The tree
215: is defined as follows: {\bf (1)} A node in the tree corresponds to a
216: choice of a variable. {\bf (2)} An outgoing branch (edge) codes for
217: the value of the variable and the logical implications of this choice
218: upon not yet assigned variables and clauses. Obviously a node gives
219: birth to $v$ branches at most. {\bf (3)} Implications can lead to:
220: {\bf (3.1)} a violated constraint, then the branch ends with $C$
221: (contradiction), the last choice is modified (backtracking of the
222: tree) and the procedure goes on along a new branch (point 2 above);
223: {\bf (3.2)} a solution when all constraints are satisfied, then the
search process is over; {\bf (3.3)} otherwise, some constraints remain
and further assumptions on the variables have to be made (loop back to
point 1).
227: \begin{figure}
228: \centerline{\includegraphics[scale=0.38,angle=-90]{trees.eps}}
229: \caption{Types of search trees generated by the DPLL solving procedure
230: for variables taking $v=2$ values at most.
231: Nodes (black dots) stand for the choices of variables made by the
232: heuristic, and edges between nodes denote the elimination of
233: unitary clauses.
234: {\bf A.} {\em simple branch:} the algorithm finds
235: easily a solution without ever backtracking. {\bf B.} {\em dense tree:}
236: in the absence of solution, the algorithm builds a ``bushy'' tree,
237: with many branches of various lengths, before stopping. {\bf C.} {\em
238: mixed case, branch + tree:} if many contradictions arise before
239: reaching a solution, the resulting search tree can be decomposed in a
240: single branch followed by a dense tree. The junction G is the highest
241: backtracking node reached back by DPLL.}
242: \label{trees}
243: \end{figure}
244:
A computer-independent measure of computational complexity, that is,
the number of operations necessary to solve the instance, is given by
the size $Q$ of the search tree, {\em i.e.} the number of nodes it
contains. Performance can be improved by designing sophisticated
heuristic rules for choosing variables (point 1). The resolution time
(or complexity) is a stochastic variable depending on the instance
under consideration and on the choices made by the variable assignment
procedure. Its average value, $\bar Q$, is a function of the input
253: distribution parameters $\pi$ e.g. the ratio $\alpha$ of clauses per
254: variable for SAT, or the average degree $c$ for the VC of random
255: graphs, which can be measured experimentally and that we want to calculate
256: theoretically. More precisely, our aim is to determine the values of the
257: input parameters for which the complexity is linear, $\bar Q=\gamma
258: \,N$ or exponential, $\bar Q = 2^{N\,\omega}$, in the size $N$ of the
259: instance and to calculate the coefficients $\gamma, \omega$ as
260: functions of $\pi$.
261:
262: The DPLL algorithm gives rise to a dynamical process. Indeed, the
263: initial instance is modified during the search through the assignment
264: of some variables and the simplification of the constraints that
265: contain these variables. Therefore, the parameters of the input
266: distribution are modified as the algorithm runs. This dynamical
267: process has been rigorously studied and understood in the case of a
268: search tree reducing to one branch (tree A in
269: Figure~\ref{trees})\cite{fra2,Fra,Achl,Fri,Kir,kir2}. Study of trees
270: with massive backtracking e.g. trees B and C in Fig.~\ref{trees} is
271: much more difficult. Backtracking introduces strong correlations
272: between nodes visited by DPLL at very different times, but close in
273: the tree. In addition, the process is non Markovian since instances
274: attached to each node are memorized to allow the search to resume
275: after a backtracking step.
276:
The study of the operation of DPLL is based on the following elementary
observation. Since instances are modified when treated by DPLL,
the description of their statistical properties
generally requires additional parameters
with respect to the defining parameters $\pi$ of the input distribution.
Our task therefore consists in
283: \begin{enumerate}
284: \item identifying these extra parameters $\pi'$\cite{kir2};
285: \item deriving the phase diagram of this new, extended distribution
286: $\pi,\pi'$ to identify, in the $\pi,\pi'$ space, the critical surface
287: separating instances having solution with high probability
288: (satisfiable phase) from instances having generally
no solution (unsatisfiable phase), see Fig.~\ref{schemoins};
290: \item tracking the evolution of an instance under resolution with time
291: $t$ (number of steps of the algorithm), that is, the trajectory of its
292: characteristic parameters $\pi(t),\pi'(t)$ in the phase diagram.
293: \end{enumerate}
Whether this trajectory remains confined to one of the two phases or
crosses the boundary in between has dramatic consequences for the
resolution complexity. We find three average behaviours, sketched
in Fig.~\ref{schemoins}:
298: \begin{itemize}
299: \item if the initial instance has a solution and the trajectory
300: remains in the sat phase, resolution is typically linear,
301: and almost no backtracking is present (Fig.~\ref{trees}A).
302: The coordinates of the trajectory $\pi(t),\pi'(t)$ of the instance
303: in the course of the resolution obey a set of coupled ordinary
304: differential equations accounting for the changes in the distribution
305: parameters done by DPLL.
306: \item if the initial instance has no solution,
307: solving the instance, that is, finding a proof of unsatisfiability,
308: takes exponentially large time and makes use of
309: massive backtracking (Fig.~\ref{trees}B). Analysis of the search tree
310: is much more complicated than in the linear regime, and requires a partial
311: differential equation that gives information on the population of
312: branches with parameters $\pi,\pi'$ throughout the growth of the search tree.
\item in some intermediate regime, instances have solutions but finding
one requires an exponentially large time (Fig.~\ref{trees}C).
This may be related to the instance trajectory crossing the boundary
between the sat and unsat phases. We have therefore a mixed
behaviour which can be understood by combining the two above
cases.
319: \end{itemize}
We now explain how to apply this approach concretely to the cases of
random SAT and VC.
322:
323: \begin{figure}
324: \begin{center}
325: \includegraphics[height=250pt,angle=0]{sche2.eps}
326: \end{center}
327: \caption{Schematic representation of the resolution trajectories in the
328: sat (branch trajectories symbolized by dashed lines) and unsat (tree
329: trajectories represented by hatched regions) phases. For simplicity
330: we have considered the case where both $\pi$ and $\pi'$ are scalar
331: and not vectorial parameters. Vertical
332: axis is the instance distribution defining parameter $\pi$. Instances
333: are almost always satisfiable if $\pi < \pi _c$, unsatisfiable if
334: $\pi > \pi _c$. Under the action of DPLL, the distribution of
335: instances is modified and requires another parameter $\pi'$ to be
336: characterized (horizontal axis), equal to, say, zero prior to any action
337: of DPLL. For non zero values of $\pi'$, the critical value
338: of the defining parameter $\pi$ obviously changes; the line
339: $\pi _c (\pi')$ defines a boundary separating typically sat from unsat
340: instances (bold line).
341: When the instance is unsat (point U), DPLL takes an exponential time
342: to go through the tree trajectory. For satisfiable and easy
343: instances, DPLL goes along a branch trajectory in a linear time
344: (point S). The mixed case
345: of hard sat instances (point MS)
corresponds to the branch trajectory crossing the boundary separating
347: the two phases (bold line), which leads to the exploration of unsat subtrees
348: before a solution is finally found.}
349: \label{schemoins}
350: \end{figure}
351: \subsection{Average analysis of the Random SAT problem}
352: \label{DpllSatSection}
353:
354: \begin{figure}
355: \centerline{\includegraphics[scale=0.5,angle=0]{algo.eps}}
\caption{Example of 3--SAT instance and Davis-Putnam-Loveland-Logemann
357: resolution.
358: {\bf Step~0.} The instance consists of $M=5$ clauses
359: involving $N=4$ variables $x,y,w,z$, which can be assigned to true (T) or false
360: (F). $\bar w$ means (NOT $w$) and $\vee$ denotes the logical OR. The search
361: tree is empty. {\bf 1.} DPLL randomly selects a clause among the
362: shortest ones, and assigns a variable in the clause
363: to satisfy it, e.g. $w=$T
364: (splitting with the Generalized Unit Clause --GUC-- heuristic \cite{fra2}).
365: A node and an edge symbolizing respectively the variable chosen ($w$)
366: and its value (T) are added to the tree. {\bf 2.} The logical
367: implications of the last choice are extracted: clauses containing $w$
368: are satisfied and eliminated, clauses including $\bar w$ are
369: simplified and the remaining ones are left unchanged. If no unitary
370: clause ({\em i.e.} with a single variable) is present, a new choice of
371: variable has to be made. {\bf 3.} Splitting takes over. Another node
372: and another edge are added to the tree. {\bf 4.} Same as step 2 but
373: now unitary clauses are present. The variables they contain have to
374: be fixed accordingly. {\bf 5.} The propagation of the unitary
375: clauses results in a contradiction. The current branch dies out and
376: gets marked with C. {\bf 6.} DPLL backtracks to the last split
377: variable ($x$), inverts it (F) and creates a new edge. {\bf 7.} Same
378: as step 4. {\bf 8.} The propagation of the unitary clauses
379: eliminates all the clauses. A solution S is found and the instance is
380: satisfiable. For an
381: unsatisfiable instance, unsatisfiability is proven when backtracking
382: (see step 6) is not possible anymore since all split variables have
383: already been inverted. In this case, all the nodes in the final search
384: tree have two descendent edges and all branches terminate by a
385: contradiction C.}
386: \label{algo}
387: \end{figure}
388:
389: \begin{figure}
390: \begin{center}
391: \includegraphics[height=400pt,angle=-90]{diag2.eps}
392: \end{center}
393: \caption{Phase diagram of 2+p-SAT and resolution trajectories under
394: DPLL action. The threshold line $\alpha_C (p)$ (bold full line)
395: separates sat (lower part of the plane) from unsat (upper part)
phases. Departure points for DPLL trajectories are located on the
3-SAT vertical axis. Arrows indicate the direction of ``motion'' along
trajectories (dashed curves) parameterized by the fraction $t$ of
variables set by DPLL. For small ratios $\alpha < \alpha _L$ ($\simeq
3.003$ for the GUC heuristic), branch trajectories remain confined to
the sat phase and end at the point $S$ of coordinates $(1,0)$, where a solution is
found (with a search process shown in Fig.~\ref{trees}A). For
403: $\alpha > \alpha _C \simeq 4.3$, proofs of unsatisfiability are given
404: by complete search trees with all leaves carrying contradictions
405: (Fig.~\ref{trees}B). The corresponding tree trajectories are
406: represented by bold dashed lines (full arrows), which end up on the
407: halting (dot-dashed) line, see text. For ratios $\alpha _L < \alpha <
\alpha_C$, the branch trajectory intersects the threshold line at some
point $G$. A contradiction almost surely arises, and extensive backtracking up
to $G$ allows a solution to be found (Fig.~\ref{trees}C). With
exponentially small probability, the search tree looks like
Fig.~\ref{trees}A instead: the trajectory (dashed curve) crosses
the ``dangerous'' region where contradictions are likely to occur,
then exits from this region and ends up with a solution (lowest dashed
trajectory). Inset: Resolution time of 3-SAT instances as a function of
416: the ratio of clauses per variable $\alpha$ and for three different
417: sizes. Data correspond
418: to the median resolution time of 10,000 instances by DPLL; the average time
419: may be somewhat larger due to the presence of rare, exceptionally
420: hard instances, cf. Sec.~\ref{DpllFlucSection}.
421: The computational complexity is linear for $\alpha < \alpha _L \simeq 3.003$,
422: exponential above. }
423: \label{sche}
424: \end{figure}
425:
426: The input distribution of 3-SAT is characterized by a single parameter
427: $\pi$, the ratio $\alpha$ of clauses per variable.
The action of DPLL on an instance of 3-SAT, illustrated
in Fig.~\ref{algo}, changes
the overall numbers of variables and clauses, and thus
$\alpha$. Furthermore, DPLL reduces some 3-clauses to 2-clauses. We
432: use a mixed 2+p-SAT distribution\cite{Sta}, where $p (=\pi')$ is the fraction
433: of 3-clauses, to model what remains of the input instance at a node of
434: the search tree. Using experiments and methods from statistical
435: mechanics\cite{Sta} and rigorous calculations\cite{Achl1},
436: the threshold line $\alpha _C (p)$, separating
437: sat from unsat phases, may be estimated with the results shown in
Fig.~\ref{sche}. For $p \le p_0 = 2/5$, {\em i.e.} to the left of point T,
the threshold line is
given by $\alpha _C(p)=1/(1-p)$, and saturates the upper bound
imposed by the satisfiability of the 2-clauses alone. Above $p_0$, no exact
value for $\alpha _C (p)$ is known.
443: The phase diagram of 2+p-SAT is the natural space in which the DPLL
444: dynamics takes place. An input 3-SAT instance with ratio $\alpha$ shows
445: up on the right vertical boundary of Fig.~\ref{sche} as a point of
446: coordinates $(p=1,\alpha )$. Under the action of DPLL, the
447: representative point moves aside from the 3-SAT axis and follows a
448: trajectory in the $(p,\alpha )$ plane.
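
The origin of the upper bound $1/(1-p)$ quoted above can be sketched as
follows. This is a short heuristic argument of our own, which assumes the
known fact that random 2-SAT is satisfiable with high probability below, and
unsatisfiable above, a ratio of one clause per variable. A 2+p-SAT instance
with ratio $\alpha$ contains $\alpha\,(1-p)\,N$ 2-clauses, which form by
themselves a random 2-SAT instance. The whole instance cannot be satisfiable
if this sub-instance is not, hence
\begin{equation}
\alpha\,(1-p) \le 1 \quad \mbox{in the sat phase} \qquad \Longrightarrow
\qquad \alpha_C(p) \le \frac 1{1-p} \ .
\end{equation}
For instance, at $p=p_0=2/5$ the bound reads $\alpha_C \le 5/3$.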
449:
450: In this section, we show that the location of this trajectory in
451: the phase diagram allows a precise understanding of the search tree
452: structure and of complexity as a function of the ratio $\alpha$
453: of the instance to be solved (Inset of Fig.~\ref{sche}).
454: In addition, we shall present an
455: approximate calculation of trajectories accounting for the case of
456: massive backtracking, that is for unsat instances, and slightly below
457: the threshold in the sat phase. Our approach is based on a non
458: rigorous extension of works by Chao and Franco who first studied the
459: action of DPLL (without backtracking) on easy, sat
460: instances\cite{fra2,Fra} as a way to obtain lower bounds to the
461: threshold $\alpha_C$, see \cite{Achl} for a recent review.
462:
Let us emphasize that the notion of trajectory is made possible by an
important statistical property of the splitting heuristics we
consider \cite{fra2,Fra}:
\begin{itemize}
\item{Unit-Clause (UC) heuristic:}
pick at random a literal from a unit clause
if any, or any unset variable otherwise.

\item{Generalized Unit-Clause (GUC) heuristic:}
pick at random a literal from one of the shortest available clauses.

\item{Short Clause With Majority (SC$_1$) heuristic:}
pick at random a literal from a unit clause if any; otherwise pick at random
an unset variable $v$, count
the numbers of occurrences $\ell, \bar \ell$ of $v$, $\bar v$ in 3-clauses,
and choose $v$ (respectively $\bar v$) if $\ell > \bar \ell$ (resp.
$\ell < \bar \ell$). When
$\ell=\bar \ell$, $v$ and $\bar v$ are equally likely to be chosen.
\end{itemize}

These heuristics induce neither bias nor correlation in the
distribution of instances\cite{fra2,kir2}. Such a statistical
485: ``invariance'' is required to ensure that the dynamical evolution
486: generated by DPLL remains confined to the phase diagram of
487: Fig.~\ref{sche}. In the following, the initial ratio of clauses per
488: variable of the instance to be solved will be denoted by $\alpha _0$.
489:
490: \subsubsection{Lower sat phase and branch trajectories.}
491:
492: Let us consider the first descent of the algorithm, that is the action
493: of DPLL in the absence of backtracking. The search tree is a single
494: branch (Fig.~\ref{trees}A). The numbers of 2 and 3-clauses
495: are initially equal to $C_2=0, C_3=\alpha _0
\, N$ respectively. Under the action of DPLL, $C_2$ and $C_3$ follow
a Markovian stochastic evolution process as the depth $T$ along the branch
(number of assigned variables) increases. Both $C_2$ and $C_3$ are
concentrated around their average values, and the corresponding densities
$c_j (t)= E[C_j( t N)/N]$ ($j=2,3$), with $t=T/N$, obey a set of
coupled ordinary differential equations (ODEs)\cite{fra2,Fra,Achl},
502: \begin{equation}
503: \frac{d c_3}{dt} = - \frac{ 3\, c_3}{1-t} \qquad , \qquad
504: \frac{d c_2}{dt} = \frac{ 3\, c_3}{2(1-t)} - \frac{ 2\, c_2}{1-t} -
505: \rho _1 (t) \; h(t) \qquad , \label{ode}
506: \end{equation}
where $\rho _1 (t) = 1 - c_2(t)/(1-t)$ is the probability that no unit clause
is present at depth $t$, so that DPLL fixes a variable through splitting rather
than unit-propagation. The function $h$ depends upon the
509: heuristic: $h_{UC} (t)=0$, $h_{GUC} (t)=1$ (if $\alpha _0> 2/3$),
510: $h_{SC_1}
(t)=a\, e^{-a}\, (I_0(a)+I_1(a))/2$ with $a\equiv 3\, c_3(t)/(1-t)$,
where $I_\ell$ denotes the $\ell^{\rm th}$ modified Bessel function.
513: To obtain the single branch trajectory in the phase diagram of Fig.~\ref{sche},
514: we solve the ODEs (\ref{ode}) with initial conditions $c_2(0)=0,
515: c_3(0)=\alpha_0$, and perform the change of variables
516: \begin{equation}
517: p(t) = \frac{c_3(t)}{c_2(t)+c_3(t)} \qquad , \qquad
518: \alpha (t) = \frac{c_2(t)+c_3(t)}{1-t} \qquad . \label{change}
519: \end{equation}
520:
Results are shown for the GUC heuristic and starting ratios $\alpha_0 =2$
and 2.8 in Fig.~\ref{sche}. Trajectories,
indicated by light dashed lines, first head to the left and then
turn back to the right until reaching a point on the 3-SAT axis at
a small ratio. Further action of
DPLL leads to a rapid elimination of the remaining clauses and the
trajectory ends up at the lower right corner S, where a solution is
found.
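
As a worked illustration of Eqs.~(\ref{ode}) and (\ref{change}), consider the
simplest case, the UC heuristic with $h_{UC}=0$, for which the ODEs can be
integrated by hand. This is a sketch for UC only; the GUC trajectories of
Fig.~\ref{sche} require solving the equation for $c_2$ with $h_{GUC}=1$
instead, which we do not attempt here. With $c_2(0)=0$, $c_3(0)=\alpha_0$
one finds
\begin{equation}
c_3(t) = \alpha_0\,(1-t)^3 \qquad , \qquad
c_2(t) = \frac 32\, \alpha_0\, t\, (1-t)^2 \qquad ,
\end{equation}
as can be checked by direct substitution, and therefore
\begin{equation}
p(t) = \frac{2\,(1-t)}{2+t} \qquad , \qquad
\alpha(t) = \frac{\alpha_0}2\, (1-t)\,(2+t) \qquad .
\end{equation}
The density of 2-clauses per unset variable,
$c_2(t)/(1-t) = \alpha(t)\,[1-p(t)] = \frac 32\,\alpha_0\,t\,(1-t)$,
is maximal at $t=1/2$, where it equals $3\,\alpha_0/8$ and $p=2/5$. Within this
calculation the UC trajectory thus stays below the line $\alpha=1/(1-p)$
provided $\alpha_0<8/3$, and touches it at $p=2/5$, $\alpha=5/3$ when
$\alpha_0=8/3$.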
529:
Frieze and Suen \cite{Fri}
have shown that, for ratios $\alpha _0 < \alpha _L \simeq 3.003$
(for the GUC heuristic), the full search tree essentially reduces
to a single branch, and is thus entirely described by the ODEs (\ref{ode}).
The number of backtrackings necessary to reach a solution
is bounded from above
by a power of $\log N$. The average size $\bar Q$ of the branch then
scales linearly
with $N$, with a multiplicative factor $\gamma (\alpha _0)=\bar Q/N$ that can
be computed analytically \cite{Coc}.
540:
541: The boundary $\alpha _L$ of this easy sat region can be defined as the largest
542: initial ratio $\alpha _0$ such that the branch trajectory $p(t),\alpha (t)$
543: issued from $\alpha _0$ never leaves the sat phase in the course of DPLL
544: resolution.
545:
546:
547: \subsubsection{Unsat phase and tree trajectories.}
548:
549: For ratios above threshold ($\alpha _0 > \alpha _C\simeq 4.3$), instances
550: almost never have a solution but a considerable amount of
551: backtracking is necessary before proving that clauses are
552: incompatible. As shown in Fig.~\ref{trees}B, a generic unsat tree includes
many branches. The number of branches (leaves), $B$, and the number
of nodes, $Q=B-1$, grow exponentially with $N$\cite{Chv}.
It is convenient to define the logarithm (divided by $N$), $\omega$, of
the tree size through $B=2^{N \omega}$.
557: Contrary to the previous section, the sequence of points $(p,\alpha)$
558: characterizing the evolution of the 2+p-SAT instance solved by DPLL does not
559: define a line any longer, but rather a patch, or cloud of points with
560: a finite extension in the phase diagram of Fig.~\ref{schemoins}.
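
The relation $Q=B-1$ and the definition of $\omega$ can be made explicit
with a short counting argument (a sketch, assuming that every internal node
of the search tree is a split with exactly two descendants). Each split
increases the number of leaves by one, starting from a single leaf, hence
$Q$ splits produce $B=Q+1$ leaves and
\begin{equation}
\frac 1N \log_2 Q = \omega + \frac 1N \log_2 \left( 1 - 2^{-N\omega} \right)
\; \longrightarrow \; \omega \qquad (N\to\infty,\ \omega>0) \qquad .
\end{equation}
Counting nodes or leaves therefore leads to the same exponent $\omega$.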
561:
562: We have analytically computed the logarithm $\omega$ of the size
563: of these patches, as a function of $\alpha_0$,
564: extending to the unsat region the probabilistic analysis of DPLL. This
565: is, {\em a priori}, a very difficult task since the
search tree of Fig.~\ref{trees}B is the output of a complex, sequential process: nodes
and edges are added by DPLL through successive descents and
backtrackings. We have imagined a different construction, which results
in the same complete tree but can be mathematically analyzed: the tree
grows in parallel, layer after layer (Fig.~\ref{struct}).
571:
572: \begin{figure}
573: \begin{center}
574: \includegraphics[height=150pt,angle=0]{struct.eps}
575: \end{center}
576: \caption{Imaginary, parallel growth process of an unsat search tree used in the
577: theoretical analysis of the computational complexity. Variables
578: are fixed through unit-propagation, or the splitting heuristics as in the DPLL
579: procedure, but branches evolve in parallel. $T$ denotes the depth in the
580: tree, that is the number of variables assigned by DPLL along each (living)
581: branch. At depth $T$, one literal is chosen on each branch among 1-clauses
582: (unit-propagation, grey circles not represented on Figure 1), or 2,3-clauses
583: (split, black circles as in Figure 1).
584: If a contradiction occurs as a result of unit-propagation, the branch gets
585: marked with C and dies out. The growth of the tree proceeds
586: until all branches carry C leaves. The resulting tree is identical to the one
587: built through the usual, sequential operation of DPLL. }
588: \label{struct}
589: \end{figure}
590:
A new layer is added by assigning, according to the DPLL heuristic, one
more variable along each living branch. As a result, a branch may
split (case 1), keep growing (case 2) or carry a contradiction and die
out (case 3). Cases 1, 2 and 3 are stochastic events, the
probabilities of which depend on the characteristic parameters
$c_2,c_3$ defining the 2+p-SAT instance carried by the branch, and
on the depth (fraction of assigned variables) $t$ in the tree. We
have taken into account the correlations between the parameters
$c_2,c_3$ on each of the two branches issued from splitting (case 1),
but have neglected any further correlations which appear between
different branches at different levels in the tree\cite{Coc}. This
Markovian approximation allows us to write an evolution equation for the
logarithm $\omega(c_2,c_3,t)$ of the average number of branches with
parameters $c_2,c_3$ as the depth $t$ increases,
605:
606: \begin{equation}
607: \frac{\partial \omega } {\partial t} (c_2,c_3,t) = { H} \left[ c_2, c_3,
608: \frac{\partial \omega } {\partial c_2} , \frac{\partial \omega }
609: {\partial c_3 } ,t \right] \qquad . \label{croi}
610: \end{equation}
611: ${H}$ incorporates the details of the splitting
612: heuristics. In terms of the partial
613: derivatives $y_2=\partial \omega/ \partial c_2$, $y_3=\partial \omega/
614: \partial c_3$, we have for the UC and GUC heuristics
615: \begin{eqnarray}
616: {H} _{UC}
617: %(c_2 ,c_3, y_2, y_3 , t )
618: &=& 1 + \frac{1}{\ln2} \left[ \frac {3\, c_3}{1-t}\; \left( e^{y_3}
619: \frac{1+e^{-y_2}}{2} -1 \right)+ \frac{c_2}{1-t} \; \left( \frac 32 e^{-y_2}
620: -2 \right) \right] \nonumber \\
621: {H} _{GUC}
622: %(c_2 ,c_3, y_2, y_3 , t )
623: &=& \log _2 \nu (y_2)
624: + \frac{1}{\ln2} \left[ \frac {3\, c_3}{1-t}\; \left( e^{y_3}
625: \frac{1+e^{-y_2}}{2} -1 \right)+ \frac{c_2}{1-t} \; \left( \nu(y_2)
626: -2 \right) \right] \nonumber \\ \hbox{\rm where} &&
627: \nu (y_2 ) = \frac 12\; e^{y_2}\left( 1 +\sqrt{1+ 4 e^{-y_2}} \right)\qquad .
628: \end{eqnarray}
The partial differential equation (PDE) (\ref{croi}) is
analogous to those describing growth processes encountered in statistical physics
\cite{Gro}. The surface $\omega$, growing with ``time'' $t$ above the
632: plane $(c_2,c_3)$, or equivalently from (\ref{change}), above the plane
633: $(p,\alpha)$ (Fig.~\ref{dome}), describes the whole distribution of branches.
634: The average number of branches at depth $t$ in the tree equals
635: $B(t) = \int dp\; d\alpha \; 2^{N\, \omega (p,\alpha,t)} \simeq
636: 2^{N\, \omega ^*(t)}$,
where $\omega ^*(t)$ is the maximum over $p,\alpha$ of $\omega (p,\alpha,t)$,
reached at $p^*(t), \alpha^*(t)$.
639: In other words, the exponentially dominant contribution to $B(t)$
640: comes from branches carrying 2+p-SAT instances with parameters
641: $p^*(t), \alpha^*(t)$, which define the tree trajectories on
642: Fig.~\ref{sche}.
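
The growth rate of the top of the surface follows directly from
Eq.~(\ref{croi}). The following is a sketch, assuming that the maximum of
$\omega$ is attained at an interior point of the $(c_2,c_3)$ plane where
$\omega$ is smooth, so that $y_2=y_3=0$ there; the symbols $c_2^*,c_3^*$
are introduced here to denote the location of this maximum. Since the
first-order variation of $\omega$ with respect to $c_2,c_3$ vanishes at the
maximum, only the explicit time dependence contributes,
\begin{equation}
\frac{d \omega ^*}{dt} (t) = H \left[ c_2^*(t), c_3^*(t), 0, 0, t \right]
\qquad .
\end{equation}
For the UC heuristic, for instance, the expression of $H_{UC}$ given above
reduces to
\begin{equation}
\frac{d \omega ^*}{dt} = 1 - \frac 1{2 \ln 2}\; \frac{c_2^*}{1-t} \qquad ,
\end{equation}
that is, the number of dominant branches doubles with each assigned
variable when no 2-clauses are present, and grows more slowly as 2-clauses
accumulate.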
643:
The hyperbolic line in Fig.~\ref{sche} indicates the halt points, where
contradictions prevent dominant branches from growing further.
646: Each time DPLL assigns a variable through
647: unit-propagation, an average number $u(p,\alpha)$ of new 1-clauses is
648: produced, resulting in a net rate of $u-1$ additional 1-clauses.
649: As long as $u< 1$, 1-clauses are quickly eliminated and do not
650: accumulate. Conversely, if $u >1$, 1-clauses tend to accumulate.
651: Opposite 1-clauses $x$ and $\bar x$ are likely to appear,
652: leading to a contradiction \cite{Fra,Fri}. The halt line is defined through
653: $u (p,\alpha)=1$. As far as dominant branches are concerned,
654: the equation of the halt line reads
\begin{equation}
\alpha = \left( \frac{3+\sqrt 5}2 \right)
\ln \left[ \frac{1+\sqrt 5}2 \right]\;\frac 1{1-p}
\simeq \frac{1.260}{1-p}\qquad .
\end{equation}
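
A short consistency check relates this expression to the growth equation
(\ref{croi}). It is a sketch that uses only the formula for $H_{GUC}$
given above, and assumes that dominant branches sit at the top of the
surface $\omega$, where $y_2=y_3=0$. With $\nu (0)=(1+\sqrt 5)/2$ and
$c_2/(1-t)=\alpha \, (1-p)$ from (\ref{change}),
\begin{equation}
H_{GUC} \Big| _{y_2=y_3=0} = \log _2 \left( \frac{1+\sqrt 5}2 \right)
- \frac 1{\ln 2} \; \frac{3-\sqrt 5}2 \; \alpha \, (1-p) \qquad ,
\end{equation}
which vanishes when $\alpha \,(1-p) = \frac 2{3-\sqrt 5}\,
\ln [ (1+\sqrt 5)/2 ] = \frac{3+\sqrt 5}2 \, \ln [ (1+\sqrt 5)/2 ]$.
Under these assumptions, the growth rate of the number of dominant branches
drops to zero precisely on the halt line.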
660:
661: Along the tree trajectory, $\omega ^*(t)$ grows from 0, on the
662: right vertical axis, up to some final positive value, $\hat \omega$,
663: on the halt line. $\hat \omega $ is our theoretical prediction for the
664: logarithm of the complexity (divided by $N$). Values of $\hat \omega
665: $ obtained for $4.3<\alpha_0<20$ by solving equation (\ref{croi})
666: compare very well with numerical results \cite{Coc}.
667:
668: \begin{center}
669: \begin{figure}
670: \includegraphics[height=160pt,angle=-90]{a10t.01.ps}
671: \includegraphics[height=160pt,angle=-90]{a10t.05.ps}
672: \includegraphics[height=160pt,angle=-90]{a10t.09.ps}
673: \vskip .5cm
674: \caption{Snapshots of the surface $\omega (p,\alpha )$ for $\alpha
675: _0=10$ at three different depths, $t=0.01$, 0.05 and 0.09 (from left to
676: right). The height $\omega ^*(t)$ of the top of the surface, with
677: coordinates $p^*(t), \alpha^*(t)$, is the logarithm (divided by $N$) of the
678: number of branches. The halt line is hit at $t_h \simeq 0.094$.}
679: \label{dome}
680: \end{figure}
681: \end{center}
682:
The surface $\omega$ above the $(p,\alpha)$ plane is plotted in
Fig.~\ref{dome}.
686: It must be stressed that, though our calculation is not rigorous, it provides
687: a very good quantitative estimate of the complexity. Furthermore,
688: complexity is found to scale asymptotically as
689: \begin{equation}
690: \hat \omega (\alpha _0) \sim \frac {3+\sqrt{5}}{(6 \,\ln 2) \;
691: \alpha _0}\; \left[ \ln \left( \frac{1+\sqrt 5}{2} \right)
692: \right]^2 \simeq \frac{0.292}{\alpha _0} \qquad (\alpha _0 \gg \alpha _C ) .
693: \end{equation}
694: This result exhibits the expected scaling\cite{Bea}, and
695: could indeed be exact. As $\alpha _0$ increases, search
696: trees become smaller and smaller, and
697: correlations between branches, weaker and weaker.
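
The numerical prefactor can be checked by direct evaluation of the
closed-form expression (simple arithmetic, with rounded intermediate
values):
\begin{equation}
\frac{3+\sqrt 5}{6\, \ln 2} \simeq 1.2590 \qquad , \qquad
\left[ \ln \left( \frac{1+\sqrt 5}2 \right) \right]^2 \simeq 0.23157
\qquad , \qquad
1.2590 \times 0.23157 \simeq 0.2915 \qquad ,
\end{equation}
in agreement with the value $0.292$ quoted above. In this asymptotic
regime the size of the search tree behaves as $2^{\, 0.292\, N/\alpha _0}$,
so that doubling the ratio $\alpha _0$ reduces the tree size to its square
root.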
698:
699:
700:
701:
702: \subsubsection{Upper sat phase and mixed branch--tree trajectories.}
703:
The interest of the trajectory approach proposed in this paper is best
seen in the upper sat phase, that is, for ratios $\alpha _0$ ranging from
$\alpha _L$ to $\alpha _C$. This intermediate region juxtaposes branch
707: and tree behaviors, see Fig.~\ref{trees}C. The branch trajectory starts
708: from the point $(p=1,\alpha _0)$ corresponding to the initial 3-SAT
709: instance and hits the critical line $\alpha_c(p)$ at some point G with
710: coordinates ($p_G,\alpha_G$) after $N\;t_G$ variables have been
711: assigned by DPLL (Fig.~\ref{sche}). The algorithm then enters the unsat
712: phase and generates 2+p-SAT instances with no solution. A dense
713: subtree, that DPLL has to go through entirely, forms beyond G till the
714: halt line (Fig.~\ref{sche}). The size of this subtree, $2^{N\,(1-t_G)\,\hat
715: \omega _G}$, can be analytically predicted from our theory. G is the
716: highest backtracking node in the tree (Fig.~\ref{trees}C) reached back by
717: DPLL, since nodes above G are located in the sat phase and carry
718: 2+p-SAT instances with solutions. DPLL will eventually reach a
719: solution. The corresponding branch (rightmost path in Fig.~\ref{trees}C) is
highly atypical and does not contribute to the complexity, since
almost all branches in the search tree are described by the tree
trajectory issued from G (Fig.~\ref{sche}). We have checked this scenario
experimentally for $\alpha _0=3.5$. The coordinates of the average
highest backtracking node, $(p_G\simeq 0.78, \alpha _G \simeq 3.02)$,
725: coincide with the analytically computed intersection of the single
726: branch trajectory and the critical line $\alpha_c(p)$\cite{Coc}. As
727: for complexity, experimental measures of $\omega$ from 3-SAT instances
728: at $\alpha _0= 3.5$, and of $\omega _G$ from 2+0.78-SAT instances at
729: $\alpha _G =3.02$, obey the expected identity $\omega = \omega _G \;
730: (1-t_G)$ and are in very good agreement with theory\cite{Coc}.
731: Therefore, the structure of search trees for 3-SAT reflects
732: the existence of a critical line for 2+p-SAT instances.
733:
734: \subsection{Average analysis of the vertex cover of random graphs}
735: \label{DpllVcSection}
736:
737: We now consider the VC problem, where inputs are random graphs
738: drawn from the $G(N,p=c/N)$ ensemble\cite{Bollo}. In other words,
graphs have $N$ vertices and the probability that a pair of vertices
is linked through an edge is $c/N$, independently of other edges.
When the number $X=xN$ of covering marks is lowered, the model undergoes
a COV/UNCOV transition at some critical density of covers
$x_{\rm c}(c)$ for $N\to \infty$.
For $x>x_{\rm c}(c)$, vertex covers of size $Nx$
exist with probability one; for $x<x_{\rm c}(c)$, the available covering
marks are not sufficient.
747: The statistical mechanics analysis of Ref. \cite{WeHa1} gave the result
748:
749: \begin{eqnarray}
750: x_{\rm c}(c)= 1- \frac{2W(c)+W(c)^2}{2c}\, , \;\;\;\;\;\;\;\;
751: \mbox{for}\;\; c<e\, ,\label{Critical_VC}
752: \end{eqnarray}
753:
754: where $W(c)$ solves the equation $We^W=c$.
755: This result is compatible with the bounds of Refs. \cite{Ga_VC,Fr_VC},
756: and was later shown to be exact \cite{Bauer}.
757: For $c>e$, Eq.~(\ref{Critical_VC}) only gives an approximate estimate
758: of $x_{\rm c}(c)$. More sophisticated calculations can be found in Ref.
759: \cite{WeHaLong}.
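
Two simple limits provide a check of Eq.~(\ref{Critical_VC}). They follow
from standard properties of $W$ (the Lambert function), namely $W(e)=1$ and
$W(c)=c-c^2+O(c^3)$ at small $c$:
\begin{equation}
x_{\rm c}(e) = 1 - \frac 3{2e} \simeq 0.448 \qquad , \qquad
x_{\rm c}(c) = \frac c2 + O(c^2) \quad (c\to 0) \qquad .
\end{equation}
The small-$c$ behavior is the expected one: a very sparse graph has about
$cN/2$ edges, most of which are isolated from each other, and each isolated
edge requires one covering mark.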
760:
761: \begin{figure}
762: \begin{center}
763: \includegraphics[height=220pt,angle=0] {traj_VC.eps}
764: \end{center}
765: \caption{Phase diagram of VC. The low-$x$, high-$c$ UNCOV phase is
766: separated by the dashed line, cf. Eq. (\ref{Critical_VC}), from
767: the high-$x$, low-$c$ COV phase. The symbols (numerics) and continuous lines
768: (analytical prediction, cf. Eq. (\ref{EqTrajVC}))
769: refer to the simple search algorithm described in the text.
770: The dotted line is the separatrix between two types of
771: trajectories.}
772: \label{traj_VC}
773: \end{figure}
774:
775: Let us consider a simple implementation of the DPLL procedure for the present
776: problem.
777: During the computation, vertices can be {\em covered},
778: {\em uncovered} or just {\em free},
779: meaning that the algorithm has not yet assigned any value to that
780: vertex. At the beginning all the vertices are set {\it free}.
781: At each step the algorithm chooses a vertex $i$ at
782: random among those which are {\em free}.
783: If $i$ has neighboring vertices which are either {\em free} or {\em
784: uncovered}, then the vertex $i$ is declared {\em covered} first. In case $i$
785: has only covered neighbors, the vertex is declared {\em uncovered}. The
786: process continues unless the number of covered vertices exceeds $X$.
787: In this case the algorithm backtracks and
788: the opposite choice is taken for the vertex
789: $i$ unless this corresponds to declaring {\em uncovered} a vertex that
790: has one or more {\em uncovered} neighbors.
791: The algorithm halts if it finds a solution
792: (and declares the graph to be COV) or after exploring all the search tree (in
793: this case it declares the graph to be UNCOV).
794:
795: \begin{figure}
796: \begin{center}
797: \includegraphics[height=220pt,angle=0] {time_VC.eps}
798: \end{center}
799: \caption{Number of operations required to solve (or to show that no solution
800: exists to) the VC decision problem with the search algorithm described in
801: the text. The logarithm of the number of nodes
802: of the backtracking tree divided by the size $N$, is plotted versus the
803: number of covering marks. Here we consider random instances
804: with average connectivity $c=2$. The phase transition is at
805: $x_{\rm c}(c=2)\approx 0.3919$ and corresponds to the peak in computational
806: complexity.}
807: \label{time_VC}
808: \end{figure}
809:
810: Of course one can improve over this algorithm by using smarter
811: heuristics \cite{WeHeur}. One remarkable example is the
812: ``leaf-removal'' algorithm defined in Ref. \cite{Bauer}.
Instead of picking any vertex randomly, one chooses a connectivity-one
vertex, declares it {\it uncovered}, and declares its neighbor {\it covered}.
815: This procedure is repeated iteratively on the subgraph
816: of {\it free} nodes, until no connectivity-one nodes are left.
817: In the low-connectivity, COV region $\{ c<e, x>x_{\rm c}(c)\}$,
818: it stops only when the graph is completely covered.
819: As a consequence, this algorithm can solve VC in linear time
with high probability throughout this region. No equally good heuristic exists
for higher connectivity, $c>e$.
822:
823: \subsubsection{Branch trajectories}
824:
825: Under the action of one of the above algorithms, the instance is progressively
826: modified and the number of variables is reduced.
In fact, at each step a vertex
is selected and can be eliminated from the graph regardless of whether it is
declared {\it covered} or {\it uncovered}.
830: The analysis of the first algorithm is greatly simplified by the remark
831: that, as long as backtracking has not begun, the new vertex is selected
832: randomly. This implies that the modified instance
833: produced by the algorithm is still a random graph.
834: Its evolution can be effectively described by a
835: trajectory in the $(c,x)$ space.
If one starts from the parameters $c_0$, $x_0$,
after $Nt$ steps of the algorithm one ends up
with a new instance of size $N(1-t)$ and parameters \cite{WeHa2}
839:
840: \begin{eqnarray}
841: c(t)=c_0(1-t)\, , \;\;\;\;\; x(t)=
842: \frac{x_0-t}{1-t}+\frac{e^{-c_0(1-t)}-e^{-c_0}}{c_0(1-t)}\, .\label{EqTrajVC}
843: \end{eqnarray}
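
Equation (\ref{EqTrajVC}) can be recovered by a short rate argument (a
sketch in the spirit of the discussion above, not a substitute for the
derivation of Ref.~\cite{WeHa2}). After $Nt$ steps the subgraph of free
vertices is a random graph with $N(1-t)$ vertices and edge probability
$c_0/N$, hence average connectivity $c(t)=c_0(1-t)$. During the first
descent a visited {\it uncovered} vertex has only covered neighbors, so the
selected vertex is declared {\it uncovered}, at no cost, only if it has no
free neighbor, which happens with probability $e^{-c(t)}$; otherwise one
mark is used. Calling $X(t)$ the number of marks still available (a
notation introduced here for convenience),
\begin{equation}
\frac 1N \frac{dX}{dt} = - \left( 1 - e^{-c_0(1-t)} \right)
\quad \Longrightarrow \quad
\frac{X(t)}N = x_0 - t + \frac{e^{-c_0(1-t)}-e^{-c_0}}{c_0} \qquad ,
\end{equation}
and dividing by the fraction $1-t$ of free vertices gives $x(t)$.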
844:
845: Some examples of the two types of trajectories (the ones leading to a solution
846: and the ones which eventually enter the UNCOV region) are
847: shown in Fig. \ref{traj_VC}. The separatrix is given by
848:
849: \begin{eqnarray}
850: x_{\rm s}(c) = 1-\frac{1-e^{-c}}{c}\, ,\label{Separatrix_VC}
851: \end{eqnarray}
852:
853: and corresponds to the dotted line in Fig. \ref{traj_VC}. Above this line the
854: algorithm solves the problem in linear time.
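
The separatrix can be understood as the trajectory that exhausts its marks
exactly when the last vertex is visited. A brief check, using only
Eq.~(\ref{EqTrajVC}): the number of leftover marks per initial vertex is
$(1-t)\,x(t)$, which decreases monotonically with $t$, and
\begin{equation}
\lim_{t\to 1} (1-t)\, x(t) = x_0 - 1 + \frac{1-e^{-c_0}}{c_0}
= x_0 - x_{\rm s}(c_0) \qquad ,
\end{equation}
which is non-negative if and only if $x_0\ge x_{\rm s}(c_0)$. Moreover,
inserting $x_0=x_{\rm s}(c_0)$ in Eq.~(\ref{EqTrajVC}) gives
$x(t)=x_{\rm s}(c(t))$ at all times, so that the curve
(\ref{Separatrix_VC}) is itself a trajectory.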
855:
856: For more general heuristics the analysis becomes less
857: straightforward because the graph produced by the algorithm
858: does not belong to the standard random-graph {\it ensemble}.
859: It may be necessary to augment the number of parameters which
860: describe the evolution of the instance. As an example,
861: the leaf-removal algorithm mentioned in the previous Section
862: is conveniently described by keeping track of three numbers
863: which parametrize the degree profile (i.e. the
864: fraction of vertices $p_d(t)$ having a given degree $d$) of the
865: graph \cite{WeHeur}.
866:
867: \subsubsection{Tree trajectories}
868:
869: Below the critical line $x_{\rm c}(c)$, cf. Eq. (\ref{Critical_VC}),
870: no solution exists to the typical random instance of VC. Our algorithm
871: must explore a large backtracking tree to prove it
872: and this takes an exponential time.
873: The size of the backtracking tree could be computed along the lines
874: of Sec. II.B.2. However a good result can be obtained with a
875: simple ``static'' calculation \cite{WeHa1}.
876:
877: As explained in Sec. II.B.2, we imagine the evolution of the backtracking
878: tree as proceeding ``in parallel''.
879: At the level $M$ of the tree a set of $M$ vertices has been visited.
880: Call ${\cal G}_M$ the subgraph induced by these vertices.
Since we always put a covering mark on a vertex which has at least one
neighbor declared {\it uncovered}, each node
of the backtracking tree will carry a vertex cover
of the associated subgraph ${\cal G}_M$. Therefore the number of backtracking
nodes is given by
886:
887: \begin{eqnarray}
888: Q = \sum_{M=1}^N {\cal N}_{\rm VC}({\cal G}_M;X)\, ,
889: \end{eqnarray}
890:
891: where ${\cal N}_{\rm VC}({\cal G}_M;X)$ is the number of VC's of
892: ${\cal G}_M$ using at most $X$ marks. A very crude estimate of the
893: right-hand side of the above equation is:
894:
895: \begin{eqnarray}
896: Q \le \sum_{M=1}^N \sum_{X'=0}^{{\rm min}(X,M)}
897: \left(\begin{array}{c}M\\X'\end{array}\right)\, ,
898: \end{eqnarray}
899:
900: where we bounded the number of VC's of size $X'$ on ${\cal G}_M$
901: with the number of ways of placing $X'$ marks on $M$ vertices.
902: The authors of \cite{WeHa2} provided a refined estimate based on
903: the {\it annealed approximation} of statistical mechanics.
904: The results of this calculation are compared in Fig. \ref{time_VC}
905: with the numerics.
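
The exponential order of the crude bound above is easily extracted (a
sketch based on Stirling's formula; the entropy function $s$ is a notation
introduced here). Writing $X=xN$, the double sum contains at most $N(N+1)$
terms and is dominated by its largest term, obtained for $M=N$ and
$X'={\rm min}(X,N/2)$, so that
\begin{equation}
\frac 1N \ln Q \le s\big( {\rm min}(x,1/2) \big) + o(1) \qquad , \qquad
s(x) = -x\ln x - (1-x) \ln (1-x) \qquad .
\end{equation}
For $x\ge 1/2$ the bound saturates at $\ln 2$, the trivial value
corresponding to the $2^N$ possible assignments of the vertices.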
906:
907: \subsubsection{Mixed trajectories}
908:
909: If the parameters which characterize an instance of VC lie in the region
910: between $x_{\rm c}(c)$, cf. Eq. (\ref{Critical_VC}),
911: and $x_{\rm s}(c)$, cf. Eq. (\ref{Separatrix_VC}), the problem is
912: still soluble but our algorithm takes an exponential time to solve it.
913: In practice, after a certain number of vertices has been visited and
914: declared either {\it covered} or {\it uncovered}, the remaining subgraph
915: ${\cal G}_{free}$ cannot be any longer covered with the leftover marks.
916: This happens typically when the first descent trajectory (\ref{EqTrajVC})
917: crosses the critical line (\ref{Critical_VC}).
918:
919: It takes some time for the algorithm to realize this fact. More precisely,
920: it takes exactly the time necessary to prove that ${\cal G}_{free}$
921: is uncoverable. This time dominates the computational complexity in
922: this region and can be calculated along the lines sketched in the
923: previous Section. The result is, once again, reported in Fig. \ref{time_VC},
924: which clearly shows a computational peak at the phase boundary.
925:
926: Finally, let us notice that this mixed behavior disappears in the
927: entire $c<e$ region if the leaf-removal heuristics is adopted for the
928: first descent.
929:
930: \subsection {Distribution of resolution times}
931: \label{DpllFlucSection}
932:
933: \begin{figure}
934: \begin{center}
935: \includegraphics[height=220pt,angle=-90] {historun.eps}
936: \end{center}
937: \caption{Probability distributions of the logarithm $\omega$ of the
938: resolution complexity from 20,000 runs of DPLL on random 3-SAT instances
939: with ratio $\alpha =3.5$. Each
940: distribution corresponds
941: to one randomly drawn instance of size $N=300$.}
942: \label{historun}
943: \end{figure}
944:
945:
946: \begin{figure}
947: \begin{center}
948: \includegraphics[height=220pt,angle=-90] {complex.eps}
949: \end{center}
950: \caption{Resolution of random 3-SAT instances in the upper sat phase:
951: logarithm of complexity with DPLL ($\omega$ -- simulations:
952: circles, theory: dotted line) and restarts
953: ($\zeta$ -- simulations: squares, theory: full line)
954: as a function of ratio $\alpha$. Inset: Minus log. of
955: the cumulative probability $P_{lin}$ of complexities
956: $Q\le N$ as a function of the size for $100 \le N\le 400$
957: (full line); log. of the number of restarts $N_{rest}$
958: necessary to find a solution for $100\le N \le 1000$
959: (dotted line) for $\alpha=3.5$. Slopes are $\zeta = 0.0011$ and $\bar
960: \zeta = 0.00115$ respectively.}
961: \label{histolin}
962: \end{figure}
963:
Up to now we have studied the typical resolution complexity. The study of
fluctuations of resolution times is interesting too, particularly in
the upper sat phase where solutions exist but are found at the price of
a large computational effort. We may expect that there exist lucky but
rare resolutions able to find a solution in a time much smaller
than the typical one.
Due to the stochastic character of DPLL,
complexity indeed fluctuates from run to run of the algorithm on the same
instance. In Fig.~\ref{historun} we show this run-to-run distribution of
the logarithm $\omega$ of the resolution complexity for four
instances of random 3-SAT with the same ratio $\alpha =3.5$.
The run-to-run distributions are qualitatively independent of the
particular instance, and exhibit two bumps. The wide right one,
located at $\omega \simeq 0.035$, corresponds to the major part
of resolutions. It acquires more and more weight as $N$ increases and
corresponds to the typical behavior analysed in Section~II.B.3.
The left peak corresponds to much faster resolutions, taking place in
linear time. The weight of this peak (fraction of runs with complexities
falling in the peak) decreases exponentially fast with $N$, and can be
estimated numerically as $W_{lin} = 2^{- N \zeta}$
with $\zeta \simeq 0.011$.
985: Therefore, instances at $\alpha =3.5$ are typically solved in
986: exponential time but a tiny (exponentially small) fraction of runs
987: are able to find a solution in linear time only.
988:
989: A systematic stop-and-restart procedure may be introduced to take
990: advantage of this fluctuation phenomenon and speed up resolution. If
991: a solution is not found before $N$ splits, DPLL is stopped and rerun
992: after some random permutations of the variables and clauses. The
993: expected number $N_{rest}$ of restarts necessary to find a solution
994: being equal to the inverse probability $1/W_{lin}$ of linear
995: resolutions, the resulting complexity scales as $N\; W_{lin}^{-1} \sim
996: 2^{N\, \zeta}$.
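
The statement on the number of restarts is the standard mean of a
geometric law. The following lines are a sketch, assuming that successive
runs are statistically independent and that each one ends in linear time
with the same probability $W_{lin}$. The probability that the first
success occurs at the $k$-th run is $W_{lin}\,(1-W_{lin})^{k-1}$, hence
\begin{equation}
N_{rest} = \sum_{k\ge 1} k\; W_{lin}\, (1-W_{lin})^{k-1}
= \frac 1{W_{lin}} = 2^{N\,\zeta} \qquad .
\end{equation}
Each run being interrupted after at most $N$ splits, the total cost is of
the order of $N\, 2^{N\zeta}$, to be compared to $2^{N\omega}$ without
restarts. With the values quoted above for $\alpha =3.5$,
$\omega \simeq 0.035$ and $\zeta \simeq 0.011$, the gain in the exponent
is $\omega -\zeta \simeq 0.024$.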
997:
998: To calculate $\zeta$ we have analyzed, along the lines of the
999: study of the growth of the search tree in the unsat phase,
1000: the whole distribution of the complexity
1001: for a given ratio $\alpha$ in the upper sat phase.
1002: Calculations can be found in \cite{Cocrs}.
1003: Linear resolutions are found to correspond to branch trajectories that
1004: cross the unsat phase without being hit by a contradiction,
1005: see Fig.~\ref{sche}.
1006: Results are reported in Fig.~\ref{histolin} and compare very well
1007: with the experimentally measured number $N_{rest}$
1008: of restarts necessary to find a solution.
1009: In the whole upper sat phase, the use of restarts offers an exponential gain
1010: with respect to usual DPLL resolution (see Fig.~\ref{histolin}
1011: for comparison between $\zeta$ and $\omega$), but the completeness
1012: of DPLL is lost.
1013:
1014: \begin{figure}
1015: \centerline{\includegraphics[height=220pt,angle=-90] {c32x06.eps}}
1016: \caption{The computational complexity of the search
1017: algorithm for VC, with restarts after $\exp(N\omega'_R)$
1018: backtracking steps. The complexity is defined as the
1019: logarithm of the total number of visited nodes, divided by the
1020: size $N$ of the graph. Symbols refer to $N=30$ (circles), $60$ (triangles),
1021: and $120$ (diamonds). The stars are the result
1022: of an $N\to\infty$ extrapolation.
1023: The continuous (dashed) line reproduces the theoretical
1024: prediction with (without) taking into account fluctuations of the first
1025: descent trajectory.}
1026: \label{time_RVC}
1027: \end{figure}
1028:
1029:
1030: A slightly more general restart strategy consists in stopping the backtracking
1031: procedure after a fixed number of nodes $Q_R = e^{N\omega'_R}$
1032: has been visited.
1033: A new (and statistically independent) DPLL procedure is then
1034: started from the beginning.
1035: In this case one exploits lucky, but still
1036: exponential, stochastic runs. The tradeoff between
1037: the exponential gain of time and the exponential number of restarts,
1038: can be optimized by tuning the parameter
1039: $\omega'_R$. This approach has been analyzed in Ref. \cite{VCRestart}
1040: taking VC as a working example.
In Fig. \ref{time_RVC} we show the computational complexity of
1042: such a strategy as a function of the restart parameter $\omega'_R$. We
1043: compare the numerics with an approximate calculation \cite{VCRestart}.
1044: The instances were random graphs with average connectivity
1045: $c=3.2$, and $x=0.6$ covering marks per vertex.
1046: The optimal choice of the parameter seems to be (in this case)
1047: $\omega'_R\approx 0$, corresponding to polynomial runs.
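
The tradeoff can be phrased as a one-parameter minimization. The following
is only a schematic sketch: the rate function $\sigma$ below is a
hypothetical notation, introduced here for illustration, and its
computation is not reproduced here. Suppose that a single run finds a
solution within $Q_R=e^{N\omega'_R}$ nodes with probability
$e^{-N\sigma (\omega'_R)}$. About $e^{N\sigma (\omega'_R)}$ independent
runs are then needed, each costing at most $e^{N\omega'_R}$ nodes, and the
total complexity is, to exponential order,
\begin{equation}
\exp \left\{ N \left[ \omega'_R + \sigma (\omega'_R) \right] \right\}
\qquad ,
\end{equation}
so that the optimal restart parameter minimizes
$\omega'_R+\sigma (\omega'_R)$. Since $\sigma$ cannot increase when the
cutoff is raised, the minimum results from a balance between the two terms.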
1048:
1049: \begin{figure}
1050: \begin{center}
1051: \includegraphics[height=220pt,angle=-90] {cx.eps}
1052: \end{center}
1053: \caption{Restart experiments for VC with initial condition at
1054: $c_0=3.2$, $x_0=0.6$
1055: (empty circle). The long-dashed line is the critical line
1056: (\ref{Critical_VC}). The rightmost dotted line is the typical trajectory.
1057: The leftmost one is the rare trajectory followed by the last
1058: (successful) restart of the algorithm when $\omega'_R=0.1$.
1059: The symbols are numerical results for the
1060: $(c,x)$ coordinates of the root of the backtrack tree generated by the
1061: algorithm since the last restart.
1062: Triangles, squares and stars correspond,
1063: respectively, to $N=30$, $60$, $120$ (in each case we considered
1064: several values for $\omega'_R$, each one corresponding to a symbol).
1065: The continuous line is an approximate analytical
1066: prediction for the same quantity.}
1067: \label{root_RVC}
1068: \end{figure}
1069:
1070: The analytical prediction reported in Fig. \ref{time_RVC}
1071: requires, as for 3-SAT, an estimate of the execution-time fluctuations
1072: of the DPLL procedure (without restart).
1073: It turns out that one major source of fluctuations is, in the present case,
1074: the location in the $(c,x)$ plane of the highest node
1075: in the backtracking tree.
1076: In the typical run this coincides with the intersection $(c_G,x_G)$ between
1077: the first descent trajectory (\ref{EqTrajVC})
1078: and the critical line (\ref{Critical_VC}). One can estimate the probability
1079: $P(c,x) \sim \exp\{-N\psi(c,x)\}$ for this node to have coordinates
1080: $(c,x)$ (obviously $\psi(c_G,x_G) = 0$).
1081:
1082: When an upper bound $\omega'_R$ on the backtracking time is fixed,
1083: the problem is solved in those lucky runs which are characterized
1084: by an atypical highest backtracking node. Roughly speaking, this means
1085: that the algorithm has made some very good (random) choices in its first
1086: steps. In Fig. \ref{root_RVC} we plot the position of the highest
1087: backtracking point in the (last) successful runs for several values of
1088: $\omega'_R$. Once again the numerics compare favourably with
1089: an approximate calculation.
1090:
1091: \section{Analysis of local search algorithms}
1092: \label{LocalSection}
1093:
1094: We now turn to the description and study of algorithms of another
1095: type, namely local search algorithms. As a common feature, these
1096: algorithms start from a configuration (assignment) of the variables,
and then make successive improvements by changing at each step a few
of the variables in the configuration (local move).
1099: For instance, in the SAT
1100: problem, one variable is flipped from being true to false, or {\em vice
1101: versa}, at each step. Whereas complete algorithms of the DPLL type
1102: give a definitive answer to any instance of a decision problem,
1103: exhibiting either a solution or a proof of unsatisfiability, local
1104: search algorithms give a sure answer when a solution is found but
1105: cannot prove unsatisfiability. However, these algorithms can sometimes be
1106: turned into one-sided probabilistic algorithms, with an
1107: upper bound on the probability that a solution exists and has not
1108: been found after $T$ steps of the algorithm, decreasing to zero
1109: when $T\to\infty$\cite{rando}.
1110:
1111: \subsection{Landscape and search dynamics}
1112:
1113: Local search algorithms perform repeated changes of a configuration $C$
1114: of variables (values of the Boolean variables for SAT, status --marked
1115: or unmarked-- of vertices for VC) according to some criterion, usually
1116: based on the comparison of the cost function $F$ (number of
1117: unsatisfied clauses for SAT, of uncovered
1118: edges for VC) evaluated at $C$ and over its neighborhood.
1119: It is therefore clear that the shape of the multidimensional surface
1120: $C \to F(C)$, called cost function landscape, is of high importance.
1121: On intuitive grounds, if this landscape is relatively smooth with a
unique minimum, local procedures such as gradient descent should be very
1123: efficient, while the presence of many local minima could hinder the
1124: search process (Fig.~\ref{landscape}). The fundamental underlying
question is whether the performance of the
1126: dynamical process (ability to find the global minimum, time needed to
1127: reach it) can be understood in terms of an analysis of the
1128: cost function landscape only.
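
For concreteness, the two cost functions mentioned above can be written
explicitly. The notations $s_i$, $E$ and $\mathbb{I}$ are introduced here
for illustration only: $s_i\in\{0,1\}$ indicates whether vertex $i$
carries a covering mark, $E$ is the set of edges of the graph, $M$ is the
number of clauses, and $\mathbb{I}[\cdot]$ equals one if its argument is
true and zero otherwise. Then
\begin{equation}
F_{\rm SAT} (C) = \sum_{a=1}^{M} \mathbb{I} \left[
\mbox{clause $a$ is violated by $C$} \right]
\qquad , \qquad
F_{\rm VC} (C) = \sum_{(i,j)\in E} (1-s_i)\,(1-s_j) \qquad .
\end{equation}
In both cases solutions correspond to $F(C)=0$ (with the prescribed number
of marks in the VC case), and a local move changes $F$ only through the
clauses, or edges, that contain the modified variable.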
1129:
1130: This question was intensively studied and answered for a limited class
1131: of cost functions, called mean field spin glass models, some years
1132: ago\cite{leti}.
1133: The characterization of landscapes is indeed of huge importance in
1134: physical systems. There, the cost function is simply the physical
1135: energy, and local dynamics are usually low or zero temperature Monte
1136: Carlo dynamics, essentially equivalent to gradient descent.
1137: Depending on the parameters of the input distribution, the minima
1138: of the cost functions may undergo structural changes, a phenomenon
1139: called clustering in physics.
1140:
1141: Clustering has been rigorously shown to take place in the random 3-XORSAT
1142: problem\cite{Crei,xorsat1,xorsat2},
1143: and is likely to exist in many other random combinatorial
problems such as 3-SAT\cite{varia,cavi}. Instances of the 3-XORSAT problem with
1145: $M=\alpha\,N$ clauses and $N$ variables have almost surely
1146: solutions as long as $\alpha < \alpha_c \simeq 0.918$\cite{xorsat1,xorsat2}.
1147: The clustering
1148: phase transition takes place at $\alpha_s \simeq 0.818$ and is related
1149: to a change in the geometric structure of the space of solutions,
1150: see Fig.~\ref{landscape}:
1151: \begin{itemize}
1152: \item when $\alpha < \alpha_s$, the space of solutions is connected. Given
1153: a pair of solutions $C,C'$, {\em i.e.} two assignments of the $N$
1154: Boolean variables that satisfy the clauses, there almost surely exists
1155: a sequence of solutions, $C_j, j=0,1,2,\ldots, J$, with $C_0\equiv C$,
1156: $C_J\equiv C'$, $J=O(N)$,
1157: connecting the two solutions such that the Hamming distance
1158: (number of different variables) between $C_j$ and $C_{j+1}$ is bounded
1159: from above by some finite constant when $N\to \infty$.
1160: \item when $\alpha_s<\alpha<\alpha_c$,
1161: the space of solutions is not connected any longer.
1162: It is made of an exponential (in $N$) number of connected components,
1163: called clusters, each containing an exponentially large number of
1164: solutions. Clusters are separated by large voids: the Hamming distance
between two clusters, that is, the smallest Hamming distance between
1166: pairs of solutions belonging to these clusters, is of the order of $N$.
1167: \end{itemize}
On intuitive grounds,
changes of the statistical properties of the cost function landscape, e.g.
of the structure of the solution space, may potentially affect
1171: the search dynamics. This connection between dynamics and static properties
1172: was established in numerous works in the context of
1173: mean field models of spin glasses \cite{leti}, and
1174: subsequently also put forward in some studies of local search algorithms in
1175: combinatorial optimization problems\cite{varia,suedois,cavi}.
So far, there is no satisfying explanation of when and why features of
{\em a priori} algorithm dependent dynamical phenomena
should be related to, or predictable from,
some statistical properties of the cost function landscape.
We shall see some examples in the following where such a connection
indeed exists (Sec.~III.B) and other ones where its presence is far less
obvious (Sec.~III.C,D).
1183:
1184: \begin{figure}
1185: \centerline{
1186: \includegraphics[scale=0.3,angle=0]{landscape1.eps}
1187: \includegraphics[scale=0.3,angle=0]{landscape2.eps}
1188: \includegraphics[scale=0.3,angle=0]{landscape3.eps}
1189: }
1190: \caption{Landscapes corresponding to three different cost functions.
The horizontal axis represents the space of configurations $C$, while the vertical
axis is the associated cost $F(C)$. Left: smooth cost function,
with a single minimum easily reachable with local search procedures,
e.g. gradient descent. Middle: rough cost function with many
local minima whose presence may degrade the performance of local search
algorithms. The various global minima are spread out homogeneously
1197: over the configuration space. Right: rough cost function with
1198: global minima clustered in some portions of the configuration space only. }
1199: \label{landscape}
1200: \end{figure}
1201:
1202: \subsection{Algorithms for error correcting codes}
1203: \label{CodeSection}
1204:
1205: \def\utx{\underline{\tx}}
1206: \def\tx{{\tt x}}
1207: \def\utz{\underline{\tz}}
1208: \def\tz{{\tt z}}
1209: \def\ut0{\underline{\t0}}
1210: \def\t0{{\tt 0}}
1211:
1212: \begin{figure}
1213: \centerline{
1214: \includegraphics[height=220pt,angle=0] {tanner.eps}
1215: }
1216: \caption{Tanner graph of a {\it regular}
1217: linear code. A left-hand node is associated to each
1218: variable, and a right hand node to each parity check.
1219: A link is drawn between two nodes whenever the variable associated to
1220: the left-hand one enters in the parity check corresponding to the right-hand
1221: one.}
1222: \label{Tanner}
1223: \end{figure}
1224:
1225: Coding theory is a rich source of computational problems (and algorithms)
1226: for which the average case analysis is relevant
1227: \cite{Barg,Spielman}. Let us focus,
for the sake of concreteness, on the decoding problem. Codewords are sequences
1229: of symbols with some built-in redundancy. If we consider the case of
1230: linear codes on a binary alphabet, this redundancy can be implemented as a set
1231: of linear constraints. In practice, a codeword is a vector
1232: $\utx\in \{0,1\}^N$ (with $N\gg 1$) which satisfies the equation
1233:
1234: \begin{eqnarray}
1235: {\mathbb H}\, \utx = \ut0\;\;\;\; ({\rm mod}\;\;\; 2)\, ,
1236: \label{ParityCheckMatrix}
1237: \end{eqnarray}
1238:
1239: where ${\mathbb H}$ is an $M\times N$ binary matrix
1240: ({\it parity check matrix}). Each one of the $M$ linear equations
1241: involved in Eq. (\ref{ParityCheckMatrix}) is called a {\it parity check}.
This set of equations can be represented graphically by a {\it Tanner graph},
1243: cf. Fig. \ref{Tanner}. This is a bipartite graph
1244: highlighting the relations between the variables ${\tt x}_i$ and the constraints
1245: (parity checks) acting on them.
1246: The decoding problem consists in finding, among the solutions of
1247: Eq. (\ref{ParityCheckMatrix}), the ``closest'' one $\utx_{\rm d}$ to
1248: the output $\utx_{\rm out}$ of
1249: some communication channel. This problem is, in general, NP-hard
1250: \cite{Berlekamp}.
1251:
1252: The precise meaning of ``closest'' depends upon the nature of the
communication channel. Let us give two examples:
1254: \begin{itemize}
1255: \item The binary symmetric channel (BSC). In this case the output of
1256: the communication channel $\utx_{\rm out}$ is a codeword, i.e. a solution of
1257: (\ref{ParityCheckMatrix}), in which a fraction $p$ of the entries has
1258: been flipped. ``Closest'' has to be understood in the Hamming-distance
1259: sense. $\utx_{\rm d}$ is the solution of Eq. (\ref{ParityCheckMatrix})
1260: which minimizes the Hamming distance from $\utx_{\rm out}$.
1261: \item The binary erasure channel (BEC). The output $\utx_{\rm out}$
1262: is a codeword
1263: in which a fraction $p$ of the entries has been erased. One has to
1264: find a solution $\utx_{\rm d}$
1265: of Eq. (\ref{ParityCheckMatrix}) which is compatible with
1266: the remaining entries. Such a problem has a {\it unique}
1267: solution for small enough erasure probability $p$.
1268: \end{itemize}
1269: There are two sources of randomness in the decoding problem: $(i)$ the matrix
1270: ${\mathbb H}$ which defines the code is usually drawn from
1271: some random {\it ensemble}; $(ii)$ the received message which is distributed
1272: according to some probabilistic model of the communication channel
1273: (in the two examples above, the bits to be flipped/erased were
1274: chosen randomly). Unlike many other combinatorial problems,
1275: there is therefore a ``natural'' probability distribution defined
1276: on the instances. Average case analysis with respect to this
1277: distribution is of great practical relevance.
1278:
1279: Recently, amazingly good performances have been obtained by using
1280: low-density parity check (LDPC) codes \cite{Chung}. LDPC codes
1281: are defined by parity check matrices ${\mathbb H}$ which are large and sparse.
1282: As an example we can consider Gallager {\it regular} codes
1283: \cite{GallagerThesis}. In this case ${\mathbb H}$
1284: is chosen with flat probability distribution within
1285: the family of matrices having $l$ ones per column,
1286: and $k$ ones per row.
1287: These are decoded using a suboptimal linear-time algorithm known as
1288: ``belief-propagation'' or ``sum-product'' algorithm
1289: \cite{GallagerThesis,Pearl}.
1290: This is an iterative algorithm which takes advantage of the locally tree-like
1291: structure of the Tanner graph, see Fig. \ref{Tanner}, for LDPC codes.
1292: After $n$ iterations it incorporates the information
1293: conveyed by the variables up to distance $n$ from the one to be decoded.
1294: This can be done in a recursive fashion allowing for linear-time
1295: decoding.
1296:
1297: Belief-propagation decoding shows a striking threshold phenomenon
1298: as the noise level $p$ crosses some critical (code-dependent) value $p_d$.
1299: While for $p<p_d$ the transmitted codeword is recovered with
1300: high probability, for $p>p_d$ decoding will fail almost always.
1301: The threshold noise $p_d$ is, in general, smaller
1302: than the threshold $p_c$ for optimal decoding (with unbounded
1303: computational resources).
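
As an illustration of how such a threshold arises, we recall the standard
density-evolution recursion for belief propagation on the binary erasure
channel. It is quoted from the coding literature as a sketch and is not derived
in this text; the symbol $x_t$ (probability that a message sent from a variable
to a parity check is still an erasure after $t$ iterations, on a tree-like
neighborhood) is notation introduced only for this example. For a regular code
with $l$ ones per column and $k$ ones per row,
\[
x_{t+1} = p\,\left[ 1-(1-x_t)^{k-1}\right]^{l-1} \, , \qquad x_0=p \, .
\]
Decoding succeeds when $x_t\to 0$, which requires the right-hand side to stay
below $x$ for all $x$ in $(0,p]$. Under these assumptions the threshold reads
\[
p_d = \inf_{0<x\le 1} \frac{x}{\left[ 1-(1-x)^{k-1}\right]^{l-1}} \, ,
\]
which evaluates to $p_d\approx 0.4294$ for $k=6$, $l=3$.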
1304:
1305: The rigorous analysis of Ref.~\cite{RichardsonUrbankeIntroduction}
1306: allows a precise determination of the critical noise $p_d$ under
1307: quite general circumstances. Nevertheless some important theoretical
1308: questions remain open:
1309: Can we find some smarter linear-time algorithm whose threshold
1310: is greater than $p_d$? Is there any ``intrinsic'' (i.e. algorithm independent)
1311: characterization of the threshold phenomenon
1312: taking place at $p_d$?
1313: As a first step towards the answer to these questions,
1314: Ref.~\cite{DynamicCodes} explored the dynamics of local optimization
1315: algorithms by using statistical mechanics techniques.
1316: The interesting point is that ``belief propagation'' is by no means a local
1317: search algorithm.
1318:
For the sake of concreteness, we shall focus on the binary erasure channel.
1320: In this case we can treat decoding as a combinatorial
1321: optimization problem within the space of bit sequences of length $Np$
1322: (the number of erased bits, the others being fixed by the received
1323: message).
1324: The function to be minimized is the {\it energy density}
1325:
1326: \begin{eqnarray}
1327: \epsilon(\utx) = \frac{2}{N}d_H({\mathbb H}\utx,\ut0)\, ,\label{CostFunction}
1328: \end{eqnarray}
1329:
1330: where we denote as $d_{H}(\utx_1,\utx_2)$ the Hamming distance between
1331: two vectors $\utx_1$ and $\utx_2$, and we
1332: introduced the normalizing factor for future convenience.
1333: Notice that both arguments of $d_H(\cdot,\cdot)$ in Eq. (\ref{CostFunction})
1334: are vectors in $\{0,1\}^{M}$.
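
One possible reading of this normalization, given as a short worked relation
rather than as a motivation stated in the original analysis, is the following.
Counting the ones of ${\mathbb H}$ by columns and by rows gives $N\,l = M\,k$
for a regular code with $l$ ones per column and $k$ ones per row. Since
$d_H({\mathbb H}\utx,\ut0)$ is the number of violated parity checks,
\[
M = \frac{l}{k}\, N \, , \qquad
\epsilon(\utx) = \frac{2\,l}{k} \;
\frac{d_H({\mathbb H}\utx,\ut0)}{M} \, .
\]
For $k=6$, $l=3$ one has $M=N/2$, so that $\epsilon$ is simply the fraction of
violated parity checks and $0\le \epsilon \le 1$.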
1335:
1336: We can define the $R$-neighborhood of a given sequence $\utx$ as the set
1337: of sequences $\utz$ such that $d_{\rm H}(\utx,\utz)\le R$,
1338: and we call $R$-stable states the bit sequences which are optima
1339: of the decoding problem within their $R$-neighborhood.
1340:
1341: \begin{figure}
1342: \centerline{
1343: \includegraphics[height=220pt,angle=-90] {glauber.eps}
1344: }
1345: \caption{The $(6,3)$ Gallager code decoded by local search with
1346: $1$-neighborhoods. At each time step, the algorithm looks for a
bit (among the ones incorrectly received)
1348: such that flipping it decreases the cost function (\ref{CostFunction}).
1349: We plot the average number of violated parity checks (multiplied by $2/N$)
1350: after the algorithm halts, as a
1351: function of the erasure probability $p$.}
1352: \label{Glauber}
1353: \end{figure}
1354:
1355: One can easily invent local search algorithms
1356: \cite{papadimi} for the decoding problem
which use the $R$-neighborhoods. The algorithm starts from a random
sequence and, at each step, optimizes it within its $R$-neighborhood.
1359: This algorithm is clearly suboptimal and halts on $R$-stable states.
1360: Let us consider, for instance, a $(k=6,l=3)$ regular code
1361: and decode it by local search in $1$-neighborhoods.
1362: In Fig.~\ref{Glauber} we report the resulting energy density $\epsilon$ after
1363: the local search algorithm halts, as a function of the erasure
1364: probability $p$. We averaged over 100 different realizations of the
1365: noise and of the matrix ${\mathbb H}$. For sake of comparison
1366: we recall that the threshold for belief-propagation decoding
1367: is $p_d\approx 0.429440$ \cite{RichardsonUrbankeIntroduction},
1368: while the threshold for optimal
1369: decoding is at $p_c\approx 0.488151$ \cite{DynamicCodes}.
1370: It is evident that local search by $1$-neighborhoods performs quite poorly.
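
These two thresholds can be compared with an elementary counting bound, given
here as a sketch; the symbol $r$ for the design rate is notation introduced for
this example. A regular code with $M=(l/k)\,N$ parity checks has rate at least
$r=1-l/k$. On the BEC, the $Np$ erased bits have to be determined by at most
$M$ independent linear equations, so that a unique solution can exist only if
\[
N\, p \le M \quad \Longleftrightarrow \quad p \le \frac{l}{k} = 1-r \, .
\]
For the $(k=6,l=3)$ code this gives $p\le 1/2$, consistent with the ordering
\[
p_d \approx 0.429 \; < \; p_c \approx 0.488 \; < \; \frac 12 \, .
\]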
1371:
A natural question is whether, and by how much, the performance of local
search improves when $R$ is increased.
1374: It is therefore quite natural to study {\it metastable} states.
1375: These are $R$-stable states
1376: for any $R = o(N)$\footnote{We use the standard notation
1377: $f_N=o(N)$ if $\lim_{N\to\infty}f_N/N = 0$.}. There exists no
1378: completely satisfying definition of such states: here we
1379: shall just suggest a possibility among others.
1380: The tricky point is that we do not know how to compare
1381: $R$-stable states for different values of $N$.
1382: This forbids us to make use of the above asymptotic statement.
One possibility is to count them without really defining them.
This can be done, at least in principle, by counting $R$-stable
states, taking the $N\to\infty$ limit and, at the end, the $R\to\infty$
limit\cite{giu}.
1387: On physical grounds, we expect $R$-stable states to be exponentially
1388: numerous. In particular, if we call ${\cal N}_{R}(\epsilon)$ the number of
1389: $R$-stable states taking a value $\epsilon$ of the cost function
1390: (\ref{CostFunction}), we have
1391:
1392: \begin{eqnarray}
1393: {\cal N}_{R}(\epsilon)\sim \exp\{N S_R(\epsilon)\}\, .
1394: \end{eqnarray}
1395:
1396: We can therefore define the so called
1397: (physical) complexity $\Sigma(\epsilon)$ as follows,
1398:
1399: \begin{eqnarray}
1400: \Sigma(\epsilon) \equiv \lim_{R\to\infty} S_R(\epsilon)\, .
1401: \end{eqnarray}
1402:
1403: Roughly speaking we can say that the number of metastable states is
1404: $\exp\{N\Sigma(\epsilon)\}$.
1405: Of course there are several alternative ways of taking the
1406: limits $R\to\infty$, $N\to\infty$, and we do not yet have a proof
1407: that these procedures give the same result for $\Sigma(\epsilon)$.
1408: Nevertheless it is quite clear that the existence of an exponential number of
1409: metastable states should affect dramatically the behavior of local search
1410: algorithms.
1411:
1412: \begin{figure}
1413: \centerline{
1414: \includegraphics[height=220pt,angle=-90] {cplxdie.eps}}
1415: \caption{The complexity $\Sigma(\epsilon)$ of a $(6,3)$ code on the BEC,
1416: for (from top to bottom)
1417: $p=0.45$ (below $p_c$), $p = 0.5$, and $p=0.55$ (above $p_c$).
1418: Recall that $\Sigma(\epsilon)$ is positive only above $p_d\approx 0.429440$.}
1419: \label{Complexity}
1420: \end{figure}
1421:
Statistical mechanics methods \cite{DynamicCodes}
allow one to determine the complexity $\Sigma(\epsilon)$ \cite{Remi}.
1424: In ``difficult'' cases (such as for error-correcting codes),
1425: the actual computation may involve some approximation,
1426: e.g. the use of a variational Ansatz. Nevertheless the outcome
1427: is usually quite accurate.
1428: In Fig. \ref{Complexity} we consider a $(6,3)$ regular code on the
1429: binary erasure channel.
1430: We report the resulting complexity
1431: for three different values of the erasure probability $p$.
1432: The general picture is as follows. Below $p_d$ there is no metastable
1433: state, except the one corresponding to the correct
1434: codeword. Between $p_d$ and $p_c$ there is an exponential number
1435: of metastable states with energy density belonging to the interval
1436: $\epsilon_{GS}<\epsilon<\epsilon_D$ ($\Sigma(\epsilon)$ is strictly positive
1437: in this interval).
1438: Above $p_c$, $\epsilon_{GS}=0$.
1439: The maximum of $\Sigma(\epsilon)$ is always at $\epsilon_D$.
1440:
1441: \begin{figure}
1442: \centerline{
1443: \includegraphics[height=220pt,angle=-90] {annealing.eps}}
\caption{The $(6,3)$ LDPC code on the BEC decoded by simulated annealing.
1445: The circles give the number of violated checks in the resulting sequence.
1446: The continuous line is the analytical result for the typical energy
1447: density of metastable states ($\epsilon_D$ in Fig. \ref{Complexity}).}
1448: \label{Annealing}
1449: \end{figure}
1450:
The above picture tells us that any local algorithm will run into
1452: difficulties above $p_d$. In order to confirm this picture,
1453: the authors of Ref.~\cite{DynamicCodes} made
1454: some numerical computations using simulated annealing as decoding
1455: algorithm for
1456: quite large codes ($N=10^4$ bits). For each value of
1457: $p$, we start the simulation fixing a fraction $(1-p)$ of spins to
1458: $\sigma_i = +1$ (this part will be kept fixed all along the run). The
1459: remaining $p N$ spins are the dynamical variables we change during the
1460: annealing in order to try to satisfy all the parity checks. The
1461: energy of the system counts the number of unsatisfied parity checks.
1462:
1463: The cooling schedule has been chosen in the following way: $\tau$
1464: Monte Carlo sweeps (MCS)~\footnote{Each Monte Carlo sweep consists in
1465: $N$ proposed spin flips. Each proposed spin flip is accepted or not
according to a standard Metropolis test.}
1467: at each of the 1000 equidistant temperatures
1468: between $T=1$ and $T=0$. The highest temperature is such that the
1469: system very rapidly equilibrates.
1470: Typical values for $\tau$ are from 1 to $10^3$.
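
The cost of such a schedule follows from simple counting, sketched here under
the assumption (taken from the footnote) of $N$ proposed flips per sweep; the
symbols $n_{\rm prop}$ and $\Delta E$ are notation introduced for this example.
The total number of proposed flips is
\[
n_{\rm prop} = 1000 \times \tau \times N \, ,
\]
that is $10^{7}\,\tau$ proposals for $N=10^4$, ranging from $10^7$ ($\tau=1$)
to $10^{10}$ ($\tau=10^3$). In a standard Metropolis test at temperature $T$,
and with the energy measured in units of one unsatisfied parity check, a
proposed flip changing the energy by $\Delta E$ is accepted with probability
\[
\min \left\{ 1 , e^{-\Delta E/T} \right\} \, ,
\]
so that in the limit $T\to 0$ only flips with $\Delta E\le 0$ are accepted.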
1471:
1472: Notice that, for any fixed cooling schedule, the computational complexity of
1473: the simulated annealing method is linear in $N$. Then we expect it to be
1474: affected by metastable states of energy $\epsilon_D$, which are
1475: present for $p>p_d$: the energy relaxation should be strongly reduced
1476: around $\epsilon_D$ and eventually be completely blocked.
1477: Some results are plotted in
1478: Fig. \ref{Annealing} together with the theoretical prediction for
$\epsilon_D$. The good agreement confirms our picture: the algorithm
gets stuck in metastable states, which have, in the great majority
1481: of cases, energy density $\epsilon_D$.
1482:
1483: Both ``belief propagation'' and local search algorithms fail to
1484: decode correctly between $p_d$ and $p_c$. This leads naturally to the
1485: following conjecture: no linear time algorithm
1486: can decode in this regime of noise. The (typical case)
1487: computational complexity changes from
1488: being linear below $p_d$ to superlinear above $p_d$. In the case of
1489: the binary erasure channel it remains polynomial between $p_d$ and
1490: $p_c$ (since optimal decoding can be realized with linear algebra methods).
1491: However it is plausible that for a general channel it becomes
1492: non-polynomial.
1493:
1494: \subsection{Gradient descent and XORSAT}
1495: \label{gradxor}
1496:
1497: In this section the local procedure we consider is gradient descent (GD).
1498: GD is defined as follows. {\bf (1)} Start from an initial randomly chosen
1499: configuration of the variables. Call $E$ the number of unsatisfied
1500: clauses. {\bf (2)} If $E=0$ then stop (a solution is found). Otherwise,
pick randomly one variable, say $x_i$, and compute the number $E'$ of
unsatisfied clauses when this variable is negated; if $E'\le E$
then accept this change {\em i.e.} replace $x_i$ with $\bar x_i$ and
$E$ with $E'$; if $E' >E$, do not do anything. Then go to step 2.
The performance of GD in finding the minima of cost
functions related to statistical physics models has recently
motivated various studies\cite{gdphys,gdproof}. Numerics indicate that
1508: GD is typically able to solve random 3-SAT instances with ratios $\alpha
1509: < 3.9$ \cite{suedois,cavi} close to the onset of clustering
1510: \cite{varia,cavi,par}. We shall rigorously show below that this is not so
1511: for 3-XORSAT.
1512:
1513: Let us apply GD to an instance of XORSAT. The instance has a graph
1514: representation explained in Fig.~\ref{xorgr}. Vertices are in
1515: one--to--one correspondence with variables. A clause is
1516: fully represented
1517: by a plaquette joining three variables and a Boolean label
1518: equal to the number of negated variables it contains modulo 2
1519: (not represented on Fig.~\ref{xorgr}).
Once a configuration of the variables is chosen, each plaquette
may be labelled by its status, S or U, depending on whether the associated
clause is satisfied or unsatisfied, respectively. A fundamental
1523: property of XORSAT is that each time a variable is changed, {\em i.e.}
1524: its value is negated, the clauses it belongs to change status too.
1525:
1526: This property makes the analysis of some properties of GD easy.
1527: Consider the hypergraph made of 15 vertices and 7 plaquettes in
1528: Fig.~\ref{bi},
1529: and suppose the central plaquette is violated (U) while all other
1530: plaquettes are satisfied (S). The number of unsatisfied clauses
1531: is $E=1$. Now run GD on this special instance of XORSAT.
Two cases arise, symbolized in Fig.~\ref{bi}, depending on whether the vertex
attached to the variable to be flipped
belongs, or not, to the central plaquette.
1535: It is an easy check that, in both cases, $E'=2$ and the change is not
1536: permitted by GD. The hypergraph of Fig.~\ref{bi} will be called hereafter
1537: island. When the status of the plaquettes is U for the central one
1538: and S for the other ones, the island is called blocked. Though the
1539: instance of the XORSAT problem encoded by a blocked island is obviously
1540: satisfiable (think of negating at the same time one variable attached
1541: to a vertex $V$ of the central plaquette and one variable in each of
1542: the two peripherical plaquettes joining the central plaquette at $V$),
1543: GD will never be able to find a solution and will be blocked forever
1544: in the local minimum with height $E=1$.
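
The energy bookkeeping behind this check can be written explicitly. The
notation $n_U(i)$ and $n_S(i)$, for the numbers of unsatisfied and satisfied
plaquettes containing vertex $i$, is introduced only for this illustration.
Since flipping $x_i$ changes the status of every clause it belongs to,
\[
E' = E + n_S(i) - n_U(i) \, .
\]
In the blocked island, a vertex of the central plaquette has $n_U=1$ and
$n_S=2$, while a vertex belonging to a peripherical plaquette only has $n_U=0$
and $n_S=1$:
\[
E' = 1 + 2 - 1 = 2 \quad \mbox{(central)} \, , \qquad
E' = 1 + 1 - 0 = 2 \quad \mbox{(peripherical)} \, .
\]
In both cases $E'>E$, and GD rejects the flip.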
1545:
1546: The purpose of this section is to show that this situation typically
1547: happens for random instances of XORSAT. More precisely, while almost all
1548: instances of XORSAT with a ratio of clauses per variables smaller than
1549: $\alpha \simeq 0.918$ have a lot of solutions, GD is almost never able to find
1550: one. Even worse, the number of violated clauses reached by GD
1551: is bounded from below by $\Psi (\alpha) \, N$ where
1552: \begin{equation} \label{teh}
1553: \Psi (\alpha) = \frac{729}{1024} \, \alpha^7 \, e^{-45\, \alpha} \qquad .
1554: \end{equation}
1555: In other words, the number of clauses remaining unsatisfied at the
1556: end of a typical GD run is of the order of $N$.
Our demonstration, inspired by \cite{gdproof},
1558: is based on the fact that, with high probability, a
1559: random instance of XORSAT contains a large number of blocked islands
1560: of the type of Fig.~\ref{bi}.
1561:
1562: To make the proof easier, we shall study the following fixed clause
1563: probability ensemble. Instead of imposing
1564: the number of clauses to be equal to $M (=\alpha N)$, any triplet $\tau$
1565: of three vertices (among $N$) is allowed to carry a plaquette with
1566: probability $\mu = \alpha N/{N \choose 3} = 6\alpha /N^2 + O(1/N^3)$. Notice
1567: that this probability ensures that, on the average, the number of
1568: plaquettes equals $\alpha N$.
1569: Let us now draw a hypergraph with this distribution.
For each triplet $\tau$ of vertices, we define $I_{\tau}=1$ if
$\tau$ is the center of an island, 0 otherwise. We shall show
1572: that the total number of islands,
1573: $I = \sum _{\tau} I_{\tau}$, is highly concentrated
1574: in the large $N$ limit, and calculate its average value.
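
The scaling of $\mu$ and the equivalence with the fixed-$M$ ensemble at large
$N$ can be checked with a short expansion, given here as a sketch and assuming
that the triplets carry plaquettes independently of each other:
\[
\mu = \frac{\alpha N}{{N \choose 3}} = \frac{6\,\alpha}{(N-1)(N-2)}
= \frac{6\,\alpha}{N^2} \left[ 1 + \frac 3N + O\left( \frac 1{N^2} \right)
\right] \, .
\]
The number of plaquettes is then a binomial variable with ${N \choose 3}$
trials, mean $\alpha N$ and variance $\alpha N (1-\mu) \le \alpha N$, so that
its relative fluctuations are of the order of $N^{-1/2}$ and vanish in the
large $N$ limit.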
1575:
1576: The expectation value of $I_{\tau}$ is equal to
1577: \begin{equation} \label{expitau}
1578: E [I_{\tau} ] = \frac{(N-3)\times (N-4) \times \ldots \times
1579: (N-13)\times (N-14)}{8\times 8\times 8} \times \mu ^A \, (1-\mu) ^B
1580: \ ,
1581: \end{equation}
1582: where $A=7$ is the number of plaquettes in the island, and
1583: \begin{equation}
1584: B = {N \choose 3} - {N -15 \choose 3} - 7 \qquad ,
1585: \end{equation}
is the number of triplets with at least one vertex among the set of 15 vertices
of the island that do not carry a plaquette. The significance of the
1588: terms in Eq.~(\ref{expitau}) is transparent. The central
triplet $\tau$ occupying three vertices, we choose 2 vertices among $N-3$
to draw the first peripherical plaquette of the island, then 2 other
vertices among $N-5$ for the other peripherical plaquette having a common
vertex with the first one. The order in which these two plaquettes are
built does not matter and a factor $1/2$ permits us to avoid double
counting. The other four peripherical plaquettes have multiplicities
calculable in the same way (with fewer and fewer available vertices).
1596: The terms in
1597: $\mu$ and $1-\mu$ correspond to the probability that such a 7 plaquettes
1598: configuration is drawn on the 15 vertices of the island, and is disconnected
1599: from the remaining $N-15$ vertices. The expectation value of the
1600: number $i=I/N$ of islands per vertex thus reads,
1601: \begin{equation} \label{eq89}
1602: \lim _{N\to\infty} E[ i] = \lim _{N\to\infty} \frac
1603: 1N \, {N \choose 3}\, E[I_{\tau}] = \frac{729}{8} \, \alpha^7 \, e^{-45\,
1604: \alpha} \quad .
1605: \end{equation}
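The limit can be checked term by term; the following is a sketch of the
arithmetic, keeping only the leading order in $N$ and using
$B \simeq 15\, N^2/2$:
\[
\frac 1N {N \choose 3} \simeq \frac{N^2}{6} \, , \quad
\frac{(N-3) \ldots (N-14)}{8^3} \simeq \frac{N^{12}}{512} \, , \quad
\mu ^7 \simeq \frac{6^7 \alpha ^7}{N^{14}} \, , \quad
(1-\mu)^B \simeq \exp \left( - \frac{6\,\alpha}{N^2} \,
\frac{15\, N^2}{2} \right) = e^{-45\, \alpha} \, .
\]
The powers of $N$ cancel in the product, which leaves
\[
\frac{6^6}{512} \, \alpha^7 \, e^{-45\, \alpha} =
\frac{46656}{512} \, \alpha^7 \, e^{-45\, \alpha} =
\frac{729}{8} \, \alpha^7 \, e^{-45\, \alpha} \, .
\]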
Chebyshev's inequality can be used to show that $i$ is concentrated
around the average value above. Let us calculate the second moment
1608: of the number of islands, $E[I^2]= \sum _{\tau , \sigma }
1609: E[ I_{\tau} I_{\sigma}] $. Clearly, $E[ I_{\tau} I_{\sigma}]$ depends
1610: only on the number $\ell=0,1,2,3$ of vertices common to triplets $\tau$
1611: and $\sigma$. It is obvious that no two triplets of vertices
1612: can be centers of islands when they have $\ell=1$ or $\ell=2$ common
vertices. If $\ell=3$, $\tau=\sigma$ and $E_{\ell =3} \equiv
E[ I_{\tau} ^2]=E[ I_{\tau}]$ has been calculated above.
1615: For $\ell=0$, a similar calculation gives
1616: \begin{eqnarray}
E_{\ell =0} &=& \frac{(N-6) (N-7) \ldots (N-29)}{2^{18}} \, \mu ^{14}
1618: (1-\mu) ^{B'} \\ \nonumber
1619: B' &=& {N \choose 3} - {N -30 \choose 3} - 14
1620: \qquad .
1621: \end{eqnarray}
1622: Finally, we obtain
1623: \begin{equation}
1624: E[i^2] = \frac 1{N^2} \left[ {N \choose 3} E_{\ell =3} +
1625: {N \choose 3}{N -3\choose 3} E_{\ell =0} \right] =
1626: E[i]^2 + O\left( \frac 1N \right)
1627: \quad .
1628: \end{equation}
1629: Therefore the variance of $i$ vanishes and $i$ is, with high
1630: probability, equal to its average value given by (\ref{eq89}).
1631: To conclude, an island has a probability $1/2^7=1/128$ to be blocked
1632: by definition. Therefore the number (per vertex)
1633: of blocked islands in a random
1634: XORSAT instance with ratio $\alpha$ is almost surely equal to $\Psi (
1635: \alpha)$
1636: given by Eq.~(\ref{teh}). Since each blocked island has one unsatisfied
1637: clause, this is also a lower bound to the number of violated
1638: clauses per variable. Notice however that $\Psi (\alpha )$ is very small
and bounded from above by $1.5\times 10^{-9}$ over the range of interest,
1640: $0<\alpha<0.918$. Therefore, one would in principle need to deal
1641: with billions of variables not to reach solutions and be in the true
1642: asymptotic regime of GD.
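
The numerical bound quoted above can be recovered with a short worked check;
the symbol $\alpha^*$ for the location of the maximum is notation introduced
here. Setting
$d \ln \Psi / d\alpha = 7/\alpha - 45 = 0$ gives
$\alpha^* = 7/45 \simeq 0.156$, which lies inside the range of interest, and
\[
\Psi (\alpha^*) = \frac{729}{1024} \left( \frac 7{45} \right)^7 e^{-7}
\simeq 1.4 \times 10^{-9} \, .
\]
The expected number $\Psi (\alpha )\, N$ of blocked islands therefore reaches
one only for $N \gtrsim 1/\Psi (\alpha^*) \simeq 7 \times 10^{8}$ variables,
even at the most favorable ratio.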
1643:
1644: The proof is easily generalizable to gradient descent with more than
1645: one look ahead. To extend the notion of blocked island to
1646: the case where GD is allowed to invert $R$, and not only 1, variables
1647: at a time, it is sufficient to have $R+1$, and not 2, peripherical
1648: plaquettes attached to each vertex of the central plaquette. The
1649: calculation of the lower bound $\Psi (\alpha ,R)$ to the number of violated
1650: clauses (divided by $N$) reached by GD is straightforward and not reproduced
here. As a consequence, GD, even with $R$ simultaneous
flips that permit it to overcome local barriers, remains almost surely
trapped at an extensive (in $N$) level of violated clauses for any
1654: finite $R$. Actually the lower bound $\Psi (\alpha ,R) N$ tends
1655: to zero only if $R$ is of the order of $\log N$.
1656:
1657: We stress that the statistical physics calculation of
physical `complexity' $\Sigma$ (see Sec.~\ref{CodeSection}) predicts that there
are no metastable states for $\alpha < 0.818$\cite{xorsat1},
1660: while GD is almost surely
1661: trapped by the presence of blocked islands for any $\alpha >0$.
1662: This apparent discrepancy comes
from the fact that GD is sensitive to the presence of configurations
1664: blocked for finite $R$, while the physical `complexity' considers states
1665: metastable in the limit $R\to\infty$ only\cite{giu}.
1666:
1667: \begin{figure}
1668: \centerline{\includegraphics[scale=0.5,angle=0]{xorgr.eps}}
1669: \caption{Graphical representation of the XORSAT instance with two
1670: clauses involving variables
1671: $x_1,x_2,x_3$ and $x_2,x_4,x_5$. Each clause or equation
1672: is represented by a plaquette whose vertices are the attached variables.
1673: When the variables are assigned some values, the clauses can
1674: be satisfied (S) or unsatisfied (U). }
1675: \label{xorgr}
1676: \end{figure}
1677:
1678: \begin{figure}
1679: \centerline{\includegraphics[scale=0.5,angle=0]{bi.eps}}
1680: \caption{A blocked island (left) is an instance of 7 clauses (1 central, 6
1681: peripherical) with
1682: variables such that the central plaquette is unsatisfied and all
1683: peripherical plaquettes are satisfied. Inversion of any variable (grey vertex)
1684: increases the number of unsatisfied clauses by 1, be it attached to the central
1685: (middle) or to a peripherical (right) plaquette.}
1686: \label{bi}
1687: \end{figure}
1688:
1689: \subsection{The WalkSAT procedure}
1690: \label{WalkSatSection}
1691:
1692: The Pure Random WalkSAT (PRWSAT) algorithm for solving $K$-SAT is
1693: defined by the following rules\cite{papa2}.
1694:
1695: \begin{enumerate}
1696: \item Choose randomly a configuration of the Boolean variables.
1697: \item If all clauses are satisfied, output ``Satisfiable''.
1698: \item If not, choose randomly one of the unsatisfied clauses, and one
1699: among the $K$ variables of this clause. Flip (invert) the chosen
1700: variable. Notice that the selected clause is now satisfied, but the
1701: flip operation may have violated other clauses which were previously
1702: satisfied.
1703: \item Go to step 2, until a limit on the number of flips fixed
beforehand has been reached. Then output ``Don't know''.
1705: \end{enumerate}
1706:
1707: What is the output of the algorithm? Either ``Satisfiable'' and a
1708: solution is exhibited, or ``Don't know'' and no certainty on the
1709: status of the formula is achieved. Papadimitriou introduced this
1710: procedure for $K=2$, and showed that it solves with high probability
1711: any satisfiable 2-SAT instance in a number of steps (flips) of the
1712: order of $N^2$\cite{papa2}. Recently Sch\"oning was able to prove the
1713: following very interesting result for 3-SAT\cite{scho}. Call `trial' a
1714: run of PRWSAT consisting of the random choice of an initial
1715: configuration followed by $3\times N$ steps of the procedure. If none
1716: of $T$ successive trials on a given instance has been successful (has
1717: provided a solution), then the probability that this instance is
1718: satisfiable is lower than $\exp( - T \times (3/4)^N)$. In other words,
1719: after $T\gg (4/3)^N$ trials of PRWSAT, most of the configuration space
1720: has been `probed': if there were a solution, it would have been found.
1721: Though this local search algorithm is not complete, the uncertainty on
1722: its output can be made as small as desired and it can be used to prove
1723: unsatisfiability (in a probabilistic sense).
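
The orders of magnitude involved can be made explicit with a little
arithmetic; the constant $c$ is notation introduced for this example. Taking
$T = c\, (4/3)^N$ trials, the bound above becomes
\[
\exp \left( - T \times (3/4)^N \right) = e^{-c} \, ,
\]
for a total number of flips equal to $3\,N\,T = 3\, c\, N\, (4/3)^N$. Since
\[
\left( \frac 43 \right)^N = 2^{\, N \log _2 (4/3)} \simeq 2^{\, 0.415\, N} \, ,
\]
this remains exponential in $N$, but with an exponent well below that of the
$2^N$ configurations an exhaustive enumeration would have to visit.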
1724:
1725: Sch\"oning's bound is true for any instance. Restriction to special
input distributions allows one to strengthen this result. Alekhnovich and
1727: Ben-Sasson showed that instances drawn from the random
1728: 3-Satisfiability ensemble described above are solved in polynomial
1729: time with high probability when $\alpha$ is smaller than
1730: $1.63$\cite{ben}.
1731:
1732: \subsubsection{Behaviour of the algorithm}
1733: In this section, we briefly sketch the behaviour of PRWSAT, as seen
1734: from numerical experiments~\cite{Parkes} and the analysis of
1735: \cite{notrewsat,leurwsat}. A dynamical threshold $\alpha _d $ ($\simeq
1736: 2.7$ for 3-SAT) is found, which separates two regimes:
1737: \begin{itemize}
1738: \item for $\alpha < \alpha _d$, the algorithm finds a solution very
1739: quickly, namely with a number of flips growing linearly with the
1740: number of variables $N$. Figure~\ref{wsat_phen}A shows the plot of
1741: the fraction $\varphi _0$ of unsatisfied clauses as a function of time
1742: $t$ (number of flips divided by $M$) for one instance with ratio
1743: $\alpha=2$ and $N=500$ variables. The curve shows a fast decrease from
1744: the initial value ($\varphi _0(t=0)=1/8$ in the large $N$ limit
1745: independently of $\alpha$) down to zero on a time scale
1746: $t_{res}=O(1)$. Fluctuations are smaller and smaller as $N$ grows.
1747: $t_{res}$ is an increasing function of $\alpha$. This {\em
1748: relaxation} regime corresponds to the one studied by Alekhnovich and
1749: Ben-Sasson, and $\alpha _d > 1.63$ as expected\cite{ben}.
1750:
1751: \item for instances in the $\alpha_d < \alpha < \alpha _c$ range, the
1752: initial relaxation phase taking place on $t=O(1)$ time scale is not
1753: sufficient to reach a solution (Fig.~\ref{wsat_phen}B). The fraction
1754: $\varphi_0$ of unsat clauses then fluctuates around some plateau value
1755: for a very long time. On the plateau, the system is trapped in a {\em
1756: metastable} state. The life time of this metastable state (trapping
1757: time) is so huge that it is possible to define a (quasi) equilibrium
probability distribution $p_N(\varphi _0)$ for the fraction
$\varphi_0$ of unsat clauses (inset of Fig.~\ref{wsat_phen}B). The
1760: distribution of fractions is well peaked around some average value
1761: (height of the plateau), with left and right tails decreasing
1762: exponentially fast with $N$, $p_N(\varphi _0) \sim \exp ( N \bar \zeta
1763: (\varphi_0))$ with $\bar \zeta \le 0$. Eventually a large negative
fluctuation will bring the system to a solution ($\varphi
_0=0$). Assuming that these fluctuations are independent random events
occurring with probability $p_N(0)$ on an interval of time of order
1767: $1$, the resolution time is a stochastic variable with exponential
1768: distribution. Its average is, to leading exponential order, the
1769: inverse of the probability of resolution on the $O(1)$ time scale:
1770: $[t_{res}] \sim \exp (N \zeta)$ with $\zeta = - \bar \zeta (0)$.
1771: Escape from the metastable state therefore takes place on
1772: exponentially large--in--$N$ time scales, as confirmed by numerical
simulations for different sizes. Sch\"oning's result\cite{scho} can
be interpreted as a lower bound on the exponent of this probability,
$\bar \zeta (0)> \ln (3/4)$, true for any instance.
1776: \end{itemize}
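
The estimate of the average resolution time quoted above can be made explicit
with a minimal sketch, assuming (as in the text) independent escape attempts,
and assuming in addition that time is discretized in intervals of order one;
the symbols $p$ and $n$ are introduced here for illustration only. If each
interval leads to a solution with probability $p = p_N(0) \ll 1$, then
\[
\mbox{Prob} ( t_{res} > n ) = (1-p)^n \simeq e^{-n\, p} \ , \qquad
[t_{res}] = \sum_{n \ge 0} (1-p)^n = \frac 1p \sim e^{- N \bar \zeta (0)} =
e^{N \zeta} \ .
\]
Combined with the bound $\bar \zeta (0) > \ln (3/4)$, this gives
$\zeta < \ln (4/3) \simeq 0.288$ for any satisfiable instance.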
1777:
1778: The plateau energy, that is, the fraction of unsatisfied clauses
1779: reached by PRWSAT on the linear time scale is plotted on
1780: Fig.~\ref{wsat_plateau}. Notice that the ``dynamic'' critical value
1781: $\alpha_d$ above which the plateau energy is positive (PRWSAT stops
1782: finding a solution in linear time) is strictly smaller than the
1783: ``static'' ratio $\alpha_c$, where formulas go from satisfiable with
1784: high probability to unsatisfiable with high probability. In the
1785: intermediate range $\alpha_d < \alpha <\alpha_c$, instances are almost
1786: surely satisfiable but PRWSAT needs an exponentially large time to
prove so. Interestingly, $\alpha _d$ and $\alpha_c$ coincide for
1788: 2-SAT in agreement with Papadimitriou's
1789: result\cite{papa2}. Furthermore, the dynamical transition is
1790: apparently not related to the onset of clustering taking place at
1791: $\alpha _s \simeq 3.9$.
1792:
1793: \begin{center}
1794: \begin{figure}
1795: A\epsfig{file=wsat_phena.eps,angle=-90,width=7cm} \hskip 1cm
1796: B\epsfig{file=wsat_phenb.eps,angle=-90,width=7cm} \vskip .5cm
1797: \caption{Fraction $\varphi _0$ of unsatisfied clauses as a function of time
1798: $t$ (number of flips over $M$) during the action of PRWSAT on two randomly drawn instances of
1799: 3-SAT with ratios $\alpha =2$ ({\bf A}) and $\alpha =3$ ({\bf B}) with
1800: $N=500$ variables. Note the difference of time scales between the two
1801: figures. Insets of figure B: left: blow up of the initial relaxation
1802: of $\varphi_0$, taking place on the $O(1)$ time scale as in ({\bf A});
1803: right: histogram $p_{500} (\varphi _0 )$ of the fluctuations of
1804: $\varphi _0$ on the plateau $1\le t\le 130$. }
1805: \label{wsat_phen}
1806: \end{figure}
1807: \end{center}
1808:
1809:
1810: \begin{figure}
1811: \begin{center}
1812: \epsfig{file=wsat_plateau.eps,angle=-90,width=7cm} \vskip .5cm
1813: \end{center}\caption{Fraction $\varphi _0$ of unsatisfied clauses on
1814: the metastable plateau of PRWSAT on 3-SAT as a function of the ratio $\alpha$ of clauses
1815: per variable. Diamonds are the output of numerical experiments, and
1816: have been obtained through average of data from simulations at a given
1817: size $N$ (nb. of variables) over 1,000 samples of 3-SAT, and
1818: extrapolation to infinite sizes (dotted line serves as a guide to the
1819: eye). The ratio at which $\varphi _0$ begins being positive, $\alpha
1820: _d \simeq 2.7$, is smaller than the thresholds $\alpha _s\simeq 3.9$
1821: and $\alpha _c\simeq 4.3$ above which solutions gather into distinct
1822: clusters and instances have almost surely no solution respectively.
1823: The full line is the prediction of the Markovian approximation of
1824: Section~\ref{wsat_expphase}.}
1825: \label{wsat_plateau}
1826: \end{figure}
1827:
1828:
1829: \subsubsection{Results for the linear phase $\alpha < \alpha _d$}
1830:
When PRWSAT easily finds a solution, the number of steps it requires
1832: is of the order of $N$, or equivalently, $M$. Let us call
1833: $t_{res}(\alpha,K)$ the average of this number divided by the number
1834: of clauses $M$. By definition of the dynamic threshold, $t_{res}$
1835: diverges when $\alpha \to \alpha_d^-$.
1836: Assuming that $t_{res}(\alpha ,K)$ can be expressed as a series of
1837: powers of $\alpha$, we find the following expansion\cite{notrewsat}
1838: \begin{equation}
1839: t_{res} (\alpha , K)= \frac{1}{2^K} + \frac{K(K+1)}{K-1}
1840: \frac{1}{2^{2K+1}} \, \alpha + \frac{4K^6 + K^5 +6 K^3 -10 K^2 + 2
1841: K}{3(K-1)(2K-1)(K^2-2)} \frac{1}{2^{3K+1}} \, \alpha^2 + O(\alpha^3)
1842: \quad .\label{dev_cluster_tresK}
1843: \end{equation}
1844: around $\alpha=0$. As only a finite number of terms in this
1845: expansion have been computed,
1846: we do not control its radius of convergence, yet as shown
1847: in Fig.~\ref{wsat_fig_tres_q1} the numerical experiments provide
1848: convincing evidence in favour of its validity.
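
As a simple check of the orders of magnitude, expansion
(\ref{dev_cluster_tresK}) can be evaluated for $K=3$. The arithmetic below
only substitutes $K=3$ in the three coefficients; the value $\alpha=1$ is
chosen for illustration, and the error due to the neglected $O(\alpha^3)$
terms is not controlled:
\[
t_{res} (\alpha , 3) = \frac 18 + \frac{3}{64} \, \alpha +
\frac{1079}{71680} \, \alpha^2 + O(\alpha^3)
\simeq 0.125 + 0.0469 \, \alpha + 0.0151 \, \alpha ^2 + O(\alpha^3) \ .
\]
The zeroth order term is the initial fraction $1/8$ of unsatisfied clauses,
each of which requires a single flip when clauses share no variable. At
$\alpha =1$ the truncated series gives $t_{res} \simeq 0.187$.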
1849:
1850: The above calculation is based on two facts. First, for
1851: $\alpha < 1/(K(K-1))$ the instance under consideration splits into
independent subinstances (involving no common variable)
that contain a number of variables of the order
of $\log N$ at most. Moreover,
the number of connected components containing $m$ clauses,
computed with probabilistic arguments very similar to those of
Section~\ref{gradxor}, contributes to a power expansion in $\alpha$
only at order $\alpha^m$. Secondly, the number of steps the
1859: algorithm needs to solve the instance is simply equal to the sum
1860: of the numbers of steps needed for each of its independent subinstances.
1861: This additivity remains true when one averages over the
initial configuration and the choices made by the algorithm.
1863:
1864: One is then left with the enumeration of the different subinstances with a
1865: given size and the calculation of the average number of steps for
1866: their resolution. A detailed presentation of
1867: this method has been given in a general case in~\cite{clusters}, and
1868: applied more specifically to this problem in~\cite{notrewsat}; the
1869: reader is referred to these previous works for more details.
1870: Equation (\ref{dev_cluster_tresK}) is the output of the enumeration of subinstances
1871: with up to three clauses.
1872:
1873:
1874: \begin{figure}
1875: \begin{center}
1876: \epsfig{file=K3_time_res.eps,width=7cm}
1877: \end{center}
1878: \caption{Average resolution time $t_{res}(\alpha , 3)$ for PRWSAT on 3-SAT. Symbols: numerical simulations, averaged over $1,000$ runs for
1879: $N=10,000$. Solid line: prediction from the cluster expansion
1880: (\ref{dev_cluster_tresK}).}
1881: \label{wsat_fig_tres_q1}
1882: \end{figure}
1883:
1884:
1885: \subsubsection{Results for the exponential phase $\alpha > \alpha _d$}
1886: \label{wsat_expphase}
1887:
1888: The above small $\alpha$ expansion does not allow us to investigate the
1889: $\alpha > \alpha _d$ regime.
1890: We turn now to an approximate method more adapted to this situation.
1891:
Let us denote by $C$ an assignment of the Boolean variables.
PRWSAT defines a Markov process on the space of the
configurations $C$, a discrete set of cardinality $2^N$. It is a
formidable task to follow the probabilities of all these
configurations as a function of the number of steps $T$ of the
algorithm, so one can look for a simpler
description of the state of the system during the evolution of the
algorithm. The simplest, and crucial, quantity to follow is the number
of clauses unsatisfied by the current assignment of the Boolean
variables, $M_0(C)$. Indeed, as soon as this value vanishes, the
algorithm has found a solution and stops.
1903:
A crude approximation consists in assuming that, at each time step
$T$, all configurations with a
given number of unsatisfied clauses are equiprobable, even though the
Hamming distance between two configurations visited at steps $T$ and
$T+k$ of the algorithm is at most $k$. However, the results obtained
are much more sensible than one could fear.
1910: Within this simplification, a Markovian evolution equation for
1911: the probability that $M_0$ clauses are unsatisfied after $T$ steps
1912: can be written.
1913: Using methods similar to the ones in Section~\ref{DpllSatSection},
1914: we obtain (see \cite{notrewsat} for more
1915: details and \cite{leurwsat} for an alternative way of presenting the
1916: approximation):
1917: \begin{itemize}
1918: \item the average fraction of unsatisfied clauses,
1919: $\varphi_0(t)$, after $T=t\, M$ steps of the algorithm.
1920: For ratios $\alpha > \alpha_d(K)= (2^K - 1) /K$,
1921: $\varphi_0$ remains positive at large times, which means that typically
1922: a large formula will not be solved by PRWSAT, and that the fraction of
1923: unsat clauses on the plateau is
1924: $\varphi_0(t \to \infty)$.
1925: The predicted value for $K=3$, $\alpha_d=7/3$, is in
1926: good but not perfect agreement with the estimates from numerical
1927: simulations, around $2.7$. The plateau height,
1928: $2^{-K}(1-\alpha_d(K)/\alpha)$, is compared to
1929: numerics in Fig.~\ref{wsat_plateau}.
1930:
1931: \item the probability $p_N(\varphi _0) \sim \exp ( N \bar \zeta
1932: (\varphi_0))$ that the fraction of unsatisfied clauses is $\varphi _0$.
1933: It has been argued above that
1934: the distribution of resolution times in the $\alpha > \alpha_d$ phase
1935: is expected to be, at leading order, an exponential distribution of
1936: average $e^{N \zeta}$, with $\zeta = -\bar \zeta (0)$. Predictions for
1937: $\bar \zeta (0)$ are plotted and compared to experimental measures
1938: of $\zeta$ in Fig.~\ref{wsat_fig_zeta}. Despite the roughness of our Markovian
1939: approximation, theoretical predictions are in qualitative agreement
1940: with numerical experiments.
1941: \end{itemize}
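
To fix ideas, the two formulas of the Markovian approximation quoted above
can be evaluated for $K=3$; the choices $\alpha=3$ and $N=500$ are made here
for illustration only, and the numbers are predictions of the approximation,
not results of simulations:
\[
\alpha _d (3) = \frac{2^3-1}{3} = \frac 73 \simeq 2.33 \ , \qquad
\varphi _0 (t\to\infty) \Big| _{\alpha =3} = \frac 18 \left( 1 -
\frac{7/3}{3} \right) = \frac 1{36} \simeq 0.028 \ ,
\]
that is, about $42$ unsatisfied clauses out of $M= \alpha N = 1500$. The
predicted plateau height vanishes at $\alpha = \alpha_d(K)$ and tends to
$2^{-K}$, the fraction of clauses violated by a random configuration, when
$\alpha \to \infty$.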
1942:
A similar study of the behaviour of PRWSAT on XORSAT problems has
also been performed in~\cite{notrewsat,leurwsat}, with qualitatively
1945: similar conclusions: there exists a dynamic threshold $\alpha_d$ for
1946: the algorithm, smaller both than the satisfiability and clustering
1947: thresholds (known exactly in this case~\cite{xorsat2}). For low
1948: values of $\alpha$, the resolution time is linear in the size of the
1949: formula; between $\alpha_d$ and $\alpha_c$ resolution occurs on
1950: exponentially large time scales, through fluctuations around a plateau
1951: value for the number of unsatisfied clauses. In the XORSAT case, the
1952: agreement between numerical experiments and this approximate study
1953: (which predicts $\alpha_d=1/K$) is quantitatively better and seems to
1954: improve with growing $K$.
1955:
1956: \begin{figure}
1957: \begin{center}
1958: \epsfig{file=wsat_zeta.eps,angle=-90,width=7cm}
1959: \end{center}
1960: \caption{Large deviations for the action of PRWSAT on 3-SAT. The logarithm
1961: $\bar \zeta $ of the probability of successful resolution (over the
1962: linear in $N$ time scale) is plotted as a function of the ratio
$\alpha$ of clauses per variable. Prediction for $\bar \zeta (\alpha
,3)$ has been obtained within the approximations of
Section~\ref{wsat_expphase}. Diamonds correspond to (minus) the
1966: logarithm $\zeta$ of the average resolution times (averaged over 2,000
1967: to 10,000 samples depending on the values of $\alpha,N$, divided by
1968: $N$ and extrapolated to $N\to\infty$) obtained from numerical
1969: simulations. Error bars are of the order of the size of the diamond
1970: symbol. Sch\"oning's bound is $\bar \zeta \ge \ln(3/4) \simeq
1971: -0.288$.}
1972: \label{wsat_fig_zeta}
1973: \end{figure}
1974:
1975: \section{Conclusion and perspectives}
1976:
1977: In this article, we have tried to give an overview of the studies that
1978: physicists have devoted to the analysis of algorithms. This
presentation is certainly not exhaustive. Let us mention that the use of
statistical physics ideas has made it possible to obtain very interesting
results on related issues such as number partitioning\cite{mertens},
binary search trees \cite{majum}, learning in neural networks
\cite{nn}, extremal optimization \cite{boe} ...
1984:
It may be objected that algorithms are mathematical and well defined
objects and, as such, should be analysed with rigorous techniques only.
Though this point of view should ultimately prevail, the current state
of available probabilistic or combinatorial techniques compared to
the sophisticated nature of algorithms used in computer science makes
this goal unrealistic nowadays. We hope the reader is now convinced that
statistical physics ideas, techniques, ... may be of help
to acquire a quantitative intuition or even formulate conjectures on the
average performances of search algorithms.
A wealth of concepts and methods
familiar to physicists, e.g. phase transitions
and diagrams, dynamical renormalization flow, out-of-equilibrium
growth phenomena, metastability, perturbative approaches, ... are found
to be useful to understand
the behaviour of algorithms. It is a simple bet that this list will
get longer in the near future and that more and more powerful techniques
and ideas coming from modern theoretical physics will find their place
in the field.
2003:
2004: Open questions are numerous. Variants of DPLL with complex
2005: splitting heuristics, random
2006: backtrackings\cite{marq} or applied to combinatorial problems
with internal symmetries\cite{liat} would be worth studying.
2008: As for local search algorithms, it would be very interesting to
2009: study refined versions of the Pure WalkSAT procedure that alternate
2010: random and greedy steps \cite{defwsat,selmantuning,HoosStutz} to understand
2011: the observed existence and properties of optimal strategies.
2012: One of the main open questions in this context is
2013: to what extent performances are related to intrinsic
2014: features of the combinatorial problems and not to the details
2015: of the search algorithm\cite{lancas}. This raises the question of
2016: how the structure of the cost function landscape may induce
2017: some trapping or slowing down of search algorithms\cite{par}.
2018: Last of all, the input distributions of instances we have focused on here
2019: are far from being realistic. Real instances have a lot of structure
2020: which will strongly reflect on the performances of algorithms.
2021: Going towards more realistic distributions or, even better, obtaining
2022: results true for any instance would be of great interest.
2023:
2024:
2025: \vskip .5cm
2026: {\bf Acknowledgments.}
2027: This work was partly funded by the ACI Jeunes Chercheurs
2028: ``Algorithmes d'optimisation et syst\`emes d\'esordonn\'es quantiques''
2029: from the French Ministry of Research.
2030:
2031: \begin{thebibliography}{99}
2032:
2033: \bibitem{papadimi}
2034: C.H. Papadimitriou, K. Steiglitz, {\em Combinatorial Optimization:
2035: Algorithms and Complexity}, Prentice-Hall, Englewood Cliffs, NJ (1982).
2036:
2037: \bibitem{AI}
T. Hogg, B.A. Huberman, C. Williams (eds),
2039: Frontiers in problem solving: phase transitions and complexity.
2040: {\em Artificial Intelligence} {\bf 81} I \& II (1996).
2041:
2042: \bibitem{Friedgut}
2043: E. Friedgut,
2044: Sharp thresholds of graph properties, and the k-sat problem,
2045: {\em Journal of the A.M.S.} {\bf 12}, 1017 (1999).
2046:
2047: \bibitem{Mit}
2048: D. Mitchell, B. Selman, H. Levesque,
2049: Hard and Easy Distributions of SAT Problems,
2050: {\em Proc.\ of the Tenth Natl.\ Conf.\ on Artificial Intelligence
2051: (AAAI-92)}, 440-446,
2052: The AAAI Press / MIT Press, Cambridge, MA (1992).
2053:
2054: \bibitem{Hans}
2055: I. Gent , H. van Maaren, T. Walsh (eds),
2056: SAT2000: Highlights of Satisfiability Research in the Year 2000,
2057: {\em Frontiers in Artificial Intelligence and Applications},
2058: vol. 63, IOS Press, Amsterdam (2000).
2059:
2060:
2061: \bibitem{Crei}
2062: N. Creignou, H. Daud\'e, Satisfiability threshold for random XOR CNF
2063: formulae,
2064: {\em Discrete Applied Mathematics} {\bf 96-97}, 41 (1999).
2065:
2066: \bibitem{DP}
2067: M. Davis, G. Logemann, D. Loveland,
2068: {A machine program for theorem proving.}
2069: {\em Communications of the ACM} {\bf 5}, 394-397 (1962).
2070:
2071: \bibitem{survey}
2072: J. Gu, P.W. Purdom, J. Franco, B.W. Wah,
2073: Algorithms for satisfiability (SAT) problem: a survey.
2074: {\em DIMACS Series on Discrete Mathematics
2075: and Theoretical Computer Science} {\bf 35}, 19-151,
2076: American Mathematical Society (1997).
2077:
2078: \bibitem{fra2}
2079: M.T. Chao, J. Franco,
2080: Probabilistic analysis of two heuristics for the 3-satisfiability problem,
2081: {\em SIAM Journal on Computing} {\bf 15}, 1106-1118 (1986).
2082:
2083: \bibitem{Fra}
2084: M.T. Chao, J. Franco,
2085: Probabilistic analysis of a generalization of the unit-clause literal
2086: selection heuristics for the k-satisfiability problem,
2087: {\em Information Science} {\bf 51}, 289--314 (1990).
2088:
2089: \bibitem{Achl}
2090: for a review on the analysis of search heuristics in the absence of
2091: backtracking, see: \\
2092: D. Achlioptas,
2093: Lower bounds for random 3-SAT via differential equations,
2094: {\em Theor. Comput. Sci.} {\bf 265}, 159-186 (2001).
2095:
2096: \bibitem{Kir}
2097: A.C. Kaporis, L.M. Kirousis, E.G. Lalas,
2098: The probabilistic analysis of a greedy satisfiability algorithm,
2099: {\em Proceedings of SAT 2002}, pp 362--376 (2002), available at
2100: http://gauss.ececs.uc.edu/Conferences/SAT2002/Abstracts/kaporis.ps
2101:
2102: \bibitem{kir2}
2103: A.C. Kaporis, L.M. Kirousis, Y.C. Stamatiou,
2104: How to prove conditional randomness using the principle of deferred
2105: decisions, technical report, Computer technology Institute,
2106: Greece (2002),
2107: available at http://www.ceid.upatras.fr/faculty/kirousis/kks-pdd02.ps
2108:
2109: \bibitem{Fri}
2110: A. Frieze, S. Suen,
2111: Analysis of two simple heuristics on a random instance of k-SAT,
2112: {\em Journal of Algorithms} {\bf 20}, 312--335 (1996).
2113:
2114: \bibitem{Sta}
2115: R. Monasson, R. Zecchina, S. Kirkpatrick, B. Selman, L. Troyansky,
2116: 2+p-SAT: Relation of Typical-Case Complexity to the Nature of
2117: the Phase Transition,
2118: {\em Random Structure and Algorithms} {\bf 15}, 414 (1999).
2119:
2120: \bibitem{Achl1}
2121: D. Achlioptas, L. Kirousis, E. Kranakis, D. Krizanc,
2122: Rigorous results for random (2+p)-SAT,
2123: {\em Theor. Comput. Sci.} {\bf 265}, 109--130 (2001).
2124:
2125: \bibitem{Coc}
2126: S. Cocco, R. Monasson,
2127: Trajectories in phase diagrams, growth processes
2128: and computational complexity: how search algorithms solve the
2129: 3-Satisfiability problem, {\em Phys. Rev. Lett.} {\bf 86}, 1654 (2001);
2130: Analysis of the computational complexity of solving random satisfiability
2131: problems using branch and bound search algorithms, {\em Eur. Phys. J. B}
2132: {\bf 22}, 505 (2001).
2133:
2134: \bibitem{Chv}
V. Chv\'atal, E. Szemer\'edi, Many hard examples for resolution,
2136: {\em Journal of the ACM} {\bf 35}, 759--768 (1988).
2137:
2138: \bibitem{Gro}
2139: A. McKane, M. Droz, J. Vannimenus, D. Wolf (eds),
2140: {Scale invariance, interfaces, and non--equilibrium dynamics},
2141: {\em Nato Asi Series B: Physics}, vol. 344, Plenum Press, New-York (1995).
2142:
2143: \bibitem{Bea}
2144: P. Beame, R. Karp, T. Pitassi, M. Saks,
2145: {\em ACM Symp.\ on Theory of Computing (STOC98)}, 561--571
2146: Assoc. Comput. Mach., New York (1998).
2147:
2148: \bibitem{Bollo}
B. Bollob\'as, {\em Random Graphs}, 2nd edition,
2150: (Cambridge University Press, Cambridge, 2001).
2151:
2152: \bibitem{WeHa1} M.~Weigt and A.~K.~Hartmann, The number of guards
2153: needed by a museum: A phase transition in vertex covering of random graphs,
2154: {\it Phys.~Rev.~Lett.} {\bf 84}, 6118 (2000)
2155:
2156: \bibitem{Ga_VC} P.~G.~Gazmuri, Independent sets in random sparse graphs,
2157: {\it Networks} {\bf 14}, 367 (1984);
2158:
2159: \bibitem{Fr_VC} A.~M.~Frieze, On the independence number of random
2160: graphs, {\it Discr.~Math.} {\bf 81}, 171 (1990)
2161:
2162: \bibitem{Bauer} M.~Bauer and O.~Golinelli,
2163: Core percolation in random graphs: a critical phenomena analysis,
2164: {\it Eur.~Phys.~J.} {\bf B 24}, 339 (2001)
2165:
2166: \bibitem{WeHaLong}
2167: M.~Weigt, A.K.~Hartmann, Minimal vertex covers
2168: on finite-connectivity random graphs - a hard-sphere lattice-gas picture,
2169: {\em Phys.~Rev. E} {\bf 63}, 056127 (2001).
2170:
2171: \bibitem{WeHeur} M.~Weigt, Dynamics of heuristic optimization
2172: algorithms on random graphs, {\it Eur.~Phys.~J.} {\bf B 28}, 369 (2002)
2173:
2174: \bibitem{WeHa2} M.~Weigt and A.~K.~Hartmann, Typical
2175: solution time for a vertex-covering algorithm on finite-connectivity
2176: random graphs, {\it Phys.~Rev.~Lett.} {\bf 86}, 1658 (2001)
2177:
2178: \bibitem{Cocrs}
2179: S. Cocco, R. Monasson,
2180: Exponentially hard problems are sometimes polynomial, a large
2181: deviation analysis of search algorithms for the random
2182: Satisfiability problem, and its application to stop-and-restart
2183: resolutions, {\em Phys. Rev. E} {\bf 66}, 037101 (2002);
2184: Restart method and exponential acceleration of random 3-SAT
2185: instances resolutions: a
2186: large deviation analysis of the Davis--Putnam--Loveland--Logemann algorithm,
2187: to appear in AMAI (2003).
2188:
2189: \bibitem{VCRestart} A.~Montanari and R.~Zecchina,
Optimizing Searches via Rare Events,
2191: {\it Phys.~Rev.~Lett.} {\bf 88}, 178701 (2002)
2192:
2193: \bibitem{rando}
2194: R. Motwani, P. Raghavan,
2195: {\em Randomized Algorithms} (Cambridge University Press, Cambridge,
2196: 2000).
2197:
2198: \bibitem{leti}
2199: L.F. Cugliandolo, Dynamics of glassy systems, Lecture notes,
2200: Les Houches, {\em preprint cond-mat/0210312} (2002).
2201:
2202: \bibitem{xorsat1}
F. Ricci-Tersenghi, M. Weigt, R. Zecchina,
2204: Simplest random K-satisfiability problem, {\em Phys. Rev. E} {\bf 63},
2205: 026702 (1999).
2206:
2207: S. Franz {\em et al.},
2208: Exact Solutions for Diluted Spin Glasses and Optimization Problems,
2209: {\em Phys. Rev. Lett.} {\bf 87}, 127209 (2001).
2210:
2211: \bibitem{xorsat2}
2212: O. Dubois, J. Mandler, The 3-XORSAT threshold, {\em Proc. of the 43rd
2213: annual IEEE symposium on Foundations of Computer Science}, Vancouver,
2214: 769--778 (2002).
2215:
2216: S. Cocco, O. Dubois, J. Mandler, R. Monasson,
2217: Rigorous decimation-based construction of ground pure states for
2218: spin glass models on random lattices,
2219: {\em Phys. Rev. Lett.} {\bf 90}, 047205 (2003).
2220:
2221: M. M\'ezard, F. Ricci-Tersenghi, R. Zecchina,
2222: Alternative solutions to diluted p-spin models and XORSAT problems,
2223: cond-mat/0207140 (2002), to appear in {\em J. Stat. Phys.} (2003).
2224:
2225: \bibitem{varia}
G. Biroli, R. Monasson, M. Weigt, A variational description of the ground state structure in random satisfiability problems,
2227: {\em Eur. Phys. J. B} {\bf 14}, 551 (2000).
2228:
2229: \bibitem{cavi}
2230: M. M\'ezard, R. Zecchina,
2231: Random K-satisfiability problem: From an analytic solution to an
2232: efficient algorithm, {\em Phys. Rev. E} {\bf 66}, 056126 (2002).
2233:
2234: \bibitem{suedois}
P. Svenson, M.G. Nordahl, Relaxation in graph coloring and satisfiability
2236: problems, {\em Phys. Rev. E} {\bf 59}, 3983 (1999).
2237:
2238: \bibitem{Barg} A.~Barg, Complexity Issues in Coding Theory,
2239: in {\it Handbook of Coding Theory},
2240: edited by V.~S.~Pless and W.~C.~Huffman, (Elsevier Science, Amsterdam, 1998).
2241:
2242: \bibitem{Spielman} D.~A.~Spielman, The Complexity of Error-Correcting Codes,
2243: in {\it Lecture Notes in Computer Science} {\bf 1279}, pp. 67-84 (1997).
2244:
2245: \bibitem{Berlekamp} E.~R.~Berlekamp, R.~J.~McEliece, and
2246: H.~C.~A.~van~Tilborg,
2247: On the Inherent Intractability of Certain Coding Problems,
2248: {\it IEEE Trans. Inform. Theory}, {\bf 24}, 384 (1978)
2249:
2250: \bibitem{Chung} S.-Y.~Chung, G.~D.~Forney,~Jr., T.~J.~Richardson and
2251: R.~Urbanke, On the design of low-density parity-check codes within
2252: 0.0045 dB of the Shannon limit, {\it IEEE Comm. Letters}, {\bf 5},
2253: 58 (2001).
2254:
2255: \bibitem{GallagerThesis} R.~G.~Gallager, Low Density Parity-Check
2256: Codes (MIT Press, Cambridge, MA, 1963)
2257:
2258: \bibitem{Pearl} J.~Pearl, Probabilistic reasoning in intelligent
2259: systems: network of plausible inference (Morgan Kaufmann, San Francisco,1988).
2260:
2261: \bibitem{RichardsonUrbankeIntroduction} T.~Richardson and R.~Urbanke,
2262: An introduction to the analysis of iterative coding systems,
2263: in {\it Codes, Systems, and Graphical Models}, edited by
2264: B.~Marcus and J.~Rosenthal (Springer, New York, 2001).
2265:
2266: \bibitem{DynamicCodes} S.~Franz, M.~Leone, A.~Montanari, and
2267: F.~Ricci-Tersenghi, The dynamic phase transition for
2268: decoding algorithms, {\it Phys. Rev.} {\bf E 66}, 046120 (2002)
2269:
2270: \bibitem{giu}
2271: G. Biroli, R. Monasson,
2272: From inherent structures to pure states: some simple remarks and examples,
{\em Europhys. Lett.} {\bf 50}, 155 (2000).
2274:
2275: \bibitem{Remi} R.~Monasson, Structural Glass Transition and the
2276: Entropy of the Metastable States, {\it Phys.~Rev.~Lett.} {\bf 75}, 2847 (1995)
2277:
2278: \bibitem{gdphys}
2279: R. Melin, J.C. Angles d'Auriac, P. Chandra, B. Dou\c cot,
2280: Glassy behaviour in the ferromagnetic Ising model on a Cayley tree,
2281: {\em J. Phys. A} {\bf 29}, 5773 (1996).
2282:
2283: D.S. Dean, Metastable states of spin glasses on random thin graphs,
2284: {\em Eur. Phys. J. B} {\bf 15}, 493 (2000).
2285:
2286: P. Svenson, Freezing in random graph ferromagnets,
2287: {\em Phys. Rev. E} {\bf 64}, 036122 (2001).
2288:
2289: V. Spirin, P.L. Krapivsky, S. Redner, Freezing in Ising ferromagnets,
2290: {\em Phys. Rev. E} {\bf 65}, 016119 (2001).
2291:
2292: \bibitem{gdproof}
2293: O. H\"aggstr\"om, Zero-temperature dynamics for the ferromagnetic Ising
2294: model on random graph, {\em Physica A} {\bf 310}, 275 (2002).
2295:
2296: \bibitem{par}
2297: G. Parisi, Some remarks on the survey decimation algorithm for
2298: K-satisfiability, preprint cs.CC/0301015 (2003).
2299:
2300: \bibitem{papa2}
2301: C.H. Papadimitriou, On Selecting a Satisfying Truth Assignment,
2302: {\em Proceedings of the 32nd Annual IEEE Symposium on
2303: Foundations of Computer Science}, 163-169 (1991).
2304:
2305: \bibitem{scho}
2306: U. Sch\"oning, A Probabilistic algorithm for k-SAT based on limited
2307: local search and restart, {\em Algorithmica} {\bf 32}, 615-623 (2002).
2308:
2309: \bibitem{ben}
2310: M. Alekhnovich, E. Ben-Sasson, Analysis of the Random Walk Algorithm
2311: on Random 3-CNFs,
2312: preprint (2002).
2313:
2314: \bibitem{Parkes}
2315: A.J. Parkes, Scaling Properties of Pure Random Walk on Random 3-SAT, {\em Lecture Notes in Computer Science} {\bf 2470}, 708 (2002).
2316:
2317: \bibitem{notrewsat}
2318: G. Semerjian and R. Monasson, Relaxation and Metastability in the Random WalkSAT search procedure, cond-mat/0301272, preprint (2003).
2319:
2320: \bibitem{leurwsat}
2321: W. Barthel, A. Hartmann, M. Weigt, Solving satisfiability problems
2322: by fluctuations: An approximate description of the dynamics of
2323: stochastic local search algorithms, cond-mat/0301271, preprint (2003).
2324:
2325: \bibitem{clusters}
2326: G. Semerjian, L.F. Cugliandolo, Cluster expansions in dilute systems: applications to satisfiability problems and spin glasses, {\em Phys. Rev. E} {\bf 64}, 036115 (2001).
2327:
2328: \bibitem{mertens}
2329: S. Mertens, Phase Transition in the Number Partitioning Problem,
2330: {\em Phys. Rev. Lett.} {\bf 81}, 4281 (1998);
2331: Random Costs in Combinatorial Optimization
2332: {\em Phys. Rev. Lett.} {\bf 84}, 1347 (2000).
2333:
2334: \bibitem{majum}
2335: S.N. Majumdar, P.L. Krapivsky,
2336: Extreme value statistics and traveling fronts: Application to computer
2337: science, {\em Phys. Rev. E} {\bf 65}, 036127 (2002).
2338:
2339: \bibitem{nn}
2340: A. Engel, C. van den Broeck, Statistical mechanics of
2341: learning (Cambridge University Press, Cambridge, 2001).
2342:
2343: \bibitem{boe}
2344: S. Boettcher, M. Grigni,
2345: Jamming Model for the Extremal Optimization Heuristic,
2346: {\em J. Phys. A} {\bf 35}, 1109-1123 (2002). \\
2347: S. Boettcher, A.G. Percus,
2348: Extremal Optimization: an Evolutionary Local-Search Algorithm,
2349: in Computational Modeling and Problem Solving in the Networked World, eds. H.
M. Bhargava and N. Ye (Kluwer, Boston, 2003).
2351:
2352: \bibitem{marq}
L. Baptista, J.P. Marques-Silva, Using randomization and learning to solve hard real-world instances of satisfiability, in {\em International Conference
of Principles and Practice of Constraint Programming}, 489--491 (2000).
2355:
2356: \bibitem{liat}
2357: L. Ein-Dor, R. Monasson, The dynamics of proving uncolorability of
2358: random graphs, in preparation (2003).
2359:
2360: \bibitem{defwsat}
2361: B. Selman, H. Kautz and B. Cohen, Noise Strategies for Improving Local Search, {\em Proc. AAAI-94}, Seattle, WA, 337-343 (1994).
2362:
2363: \bibitem{selmantuning}
2364: D. McAllester, B. Selman and H. Kautz, Evidence for Invariants in Local Search, Proc. AAAI-97, Providence, RI, 1997.
2365:
2366: \bibitem{HoosStutz}
2367: H. H. Hoos and T. St\"utzle, Local Search Algorithms for SAT: An Empirical Evaluation, J. Automated reasoning {\bf 24}, 421 (2000).
2368:
2369: \bibitem{lancas}
2370: D. Lancaster, Some statistical mechanical models based on permutations,
2371: in preparation (2003).
2372:
2373: \end{thebibliography}
2374:
2375: \end{document}