1: \documentclass[aps,showpacs,superscriptaddress]{revtex4}
2: %\documentclass[aps,pre,preprint,superscriptaddress,showpacs]{revtex4}
3: %\documentclass[aps,prl,twocolumn,superscriptaddress,showpacs]{revtex4}
4: 
5: \usepackage{graphicx}
6: 
7: \begin{document}
8: 
9: \title{Survey Propagation as local equilibrium equations}
10: 
11: \author{Alfredo Braunstein}
12: \email[]{abraunst@sissa.it}
13: \affiliation{SISSA, Via Beirut 9, 34100 Trieste, Italy} 
14: \affiliation{ICTP, Strada Costiera 11, I-34100 Trieste, Italy}
15: 
16: \author{Riccardo Zecchina}
17: \email[]{zecchina@ictp.trieste.it}
18: \affiliation{ICTP, Strada Costiera 11, I-34100 Trieste, Italy}
19: 
20: 
21: \begin{abstract}
22: It has been shown experimentally that a decimation algorithm based on
23: Survey Propagation (SP) equations allows one to solve efficiently some
24: combinatorial problems over random graphs.  We show that these
25: equations can be derived as sum-product equations for the computation
26: of marginals in an extended space where the variables are allowed to
27: take an additional value -- $*$ -- when they are not forced by the
28: combinatorial constraints.  An appropriate ``local equilibrium
29: condition'' cost/energy function is introduced and its entropy is
30: shown to coincide with the expected logarithm of the number of
31: clusters of solutions as computed by SP. These results may help to
32: clarify the geometrical notion of clusters assumed by SP for the
33: random K-SAT or random graph coloring (where it is conjectured to be
34: exact) and help to explain which kind of clustering operation or
35: approximation is enforced in general/small sized models in which it is
36: known to be inexact.
37: 
38: \end{abstract}
39: 
40: \pacs{89.20.Ff, 75.10.Nr, 02.60.Pn, 05.20.-y}
41: 
42: \maketitle
43: 
44: \section{Introduction}
45: 
46: Recent developments in statistical physics of disordered systems have
47: shown a remarkable convergence of themes with other disciplines such
48: as computer science (e.g.\ combinatorial optimization~\cite{TCS}),
49: information theory (e.g.\ error correcting codes~\cite{Codes}) and
50: discrete mathematics (e.g. random
51: structures~\cite{Aldous,Guerra:Talagrand,Aldous_z2}).  While the study
52: of a typical static measure characterizing the slow dynamics of both
53: physical and algorithmic processes is the unifying issue in
54: out-of-equilibrium problems, the study of the geometrical structure of
55: ground states of spin-glass-like energy functions $E$ is central to
56: the understanding of the onset of computational complexity in random
57: combinatorial problems.  The combinatorial problem of satisfying a
58: given set of constraints is viewed in the physics framework as the
59: problem of minimizing $E$ and ``ground state configurations'',
60: ``solutions'' or ``satisfying assignments'' should be understood as
61: synonymous.
62: 
63: Central to any attempt to provide a complete theory of random
64: combinatorial problems is the notion of pure states, or clusters of
65: configurations, on which the probability measure over optimal
66: configurations is assumed to concentrate.  Recently, a new class of
67: algorithms has been proposed \cite{MZ,science,BMZ} that have shown
68: surprising capabilities in dealing with the (exponential)
69: proliferation of clusters of metastable states and therefore in
70: solving random instances of combinatorial problems which are difficult
71: to solve for local search heuristics.  Such algorithms are based on
72: the so called Survey Propagation (SP) equations in which indeed a
73: decomposition of the ground states probability distribution -- the
74: Gibbs measure -- into an exponential number of clusters is assumed
75: from the beginning.  The SP equations can be viewed as zero
76: temperature cavity equations~\cite{cavity} formulated for single
77: instances at a level equivalent to the one-step of replica symmetry
78: breaking (1-RSB) scenario~\cite{states}.
79: 
80: 
81: The SP algorithm is a message-passing technique which is
82: closely related to another message-passing method -- known as the
83: sum-product or Belief Propagation (BP)~\cite{Gallager,Pearl} algorithm
84: -- which has shown remarkable performance in solving the decoding
85: problem~\cite{Spielman} in error correcting codes based on sparse
86: graph
87: encodings~\cite{Sourlas,turbo,Forney,good_codes1,good_codes2,MacKay}.
88: 
89: The aim of this study is to discuss the precise (finite size)
90: structure of the SP equations, linking them to the BP formalism. This
91: is a well defined mathematical issue, independent of the physical
92: origin of the equations. Due to the algorithmic relevance of both BP
93: and SP for coding theory and combinatorial optimization, it is a basic
94: question to understand what these equations are doing for a finite
95: number of variables $N$ since this is the regime in which they are
96: used.
97: 
98: As we shall see, the SP ``algorithmic'' equations at finite $N$ are
99: performing a very specific clustering operation over the solution
100: space.  Moreover, the number of such clusters in the Bethe
101: approximation will be shown to coincide with the prediction of the
102: cavity theory.
103: 
104: These results will be obtained by showing that the SP equations are
105: the BP equations for a modified combinatorial problem. By this mapping
106: we clarify how the hypothesis making BP exact (that is, the absence of
107: correlations between distant variables) translates into a condition of uncorrelation of
108: ``frozen'' variables belonging to different clusters: SP produces a
109: collapse of the internal structure of clusters and eliminates
110: correlations among the unfrozen parts.
111: 
112: We shall present the results in the case of the K-SAT problem even
113: though the method could be applied to any discrete combinatorial model
114: defined over locally tree--like graphs.  The results concerning the
115: cluster entropy will be compared with the prediction of the 1-RSB
116: cavity analysis for random K-SAT.
117: 
118: The line of reasoning of the paper consists in showing that the SP
119: equations can be re-derived as sum-product or BP equations --
120: i.e. simple replica symmetric (RS) cavity equations -- over an
121: extended configuration space.  The definition of this space consists
122: in associating to each binary variable a new extra value ``$*$'' which
123: will correspond to the possibility that the variable is not forced to
124: take one of the binary values $\{ -1,+1 \}$ in a given solution
125: \cite{pspin-note}.  We will introduce a {\it local equilibrium
126: condition} (LEC) cost-energy function $\hat E$ derived from $E$,
127: acting over the extended space, together with a (technical) duality
128: transformation needed to preserve the locality of the interactions for
129: implementing properly the BP equations. The following two statements
130: will hold: {\it
131: \begin{itemize}
132: \item[{\bf (I)}] Marginals given by the BP equations derived from
133: $\hat E$ coincide with the marginals given by SP on the original
134: problem. 
135: 
136: \item[{\bf (II)}] Bethe approximation to the entropy of $\hat E$ in
137: the enlarged space as computed by BP coincides with the logarithm of
138: the number of clusters of solutions -- the so called ``complexity'' --
139: predicted by SP on the original problem.
140: \end{itemize}
141: } 
142: 
143: The proof of {\bf (I)} will be achieved by finding a direct connection
144: between quantities (``messages'') propagated by the two algorithms at
145: each iteration step.  We recall that the Bethe approximation to the
146: entropy is exact over trees, both with and without boundary conditions,
147: i.e. with leaf variables taking given values.
148: 
149: The possibility of interpreting SP as appropriate BP equations may
150: have consequences for their rigorous probabilistic analysis, through a
151: proper application/generalization of the known methods for the
152: analysis of convergence of BP-like equations over random graphs (as it
153: has already been done for problems like the random matching
154: \cite{Aldous_z2}). Some preliminary exact numerical results that we
155: give in the concluding section are in support of this possibility.
156: 
157: Throughout the paper we heavily rely on the notations of
158: refs.~\cite{MZ,BMZ} for what concerns the SP equations.
159: 
160: \section{Survey Propagation, Belief Propagation and K-SAT}
161: 
162: SP and BP (or sum-product) are examples of message-passing procedures.
163: In BP the unknowns which are evaluated by iteration are the marginals
164: over the solution space of the variables characterizing the
165: combinatorial problem (e.g. binary ``spin'' variables). According to
166: the physical interpretation, the quantities that are evaluated by SP
167: are the probability distributions of local fields over the set of
168: clusters. That is, while BP performs a ``white'' average over
169: solutions, SP takes care of cluster-to-cluster fluctuations, telling
170: us what the probability is of picking a cluster at random and
171: finding a given variable completely biased (frozen) in a certain
172: direction -- that is forced to take the same value within the cluster
173: -- or unfrozen.
174: 
175: In both SP and BP one assumes knowledge of the marginals of all variables in
176: the temporary absence of one of them and then writes the marginal
177: probability induced on this ``cavity'' variable in absence of another
178: third variable interacting with it (i.e. the so called Bethe lattice
179: approximation for the problem).  These relations define a closed set of
180: equations for such cavity marginals that can be solved iteratively
181: (this procedure is known as the message-passing technique). The equations
182: become exact if the cavity variables acting as inputs are
183: uncorrelated. They are conjectured to be an asymptotically exact
184: approximation over some random locally tree--like structures~\cite{MZ}.
185: 
186: The $K$-satisfiability problem ($K$-SAT) is easily stated: Given $N$
187: Boolean variables each of which can be assigned the value True (1) or
188: False ($-1$), and $M$ clauses between them, is there a `SAT-assignment',
189: i.e. an assignment of the Boolean variables which satisfies all
190: constraints? A clause takes the form of an `OR' function of $K$
191: variables in the ensemble (or their negations).  A SAT formula in
192: conjunctive normal form over $N$ Boolean variables $\{\sigma_i =\pm 1\}$ can
193: be written as
194: \begin{equation}
195: \mathcal{F}=\prod_{a\in A}C_{a}
196: \label{F}
197: \end{equation}
198: where 
199: \begin{equation}
200: C_a = 1 - E_a \; \; \; , \; \; E_a \equiv \prod_{i\in
201: a}\delta(J_{a,i},\sigma_i)
202: \label{eq:clause}
203: \end{equation}
204: where $\delta(x,y)$ is the Kronecker function (also written as
205: $\delta_{x,y}$ in the rest of the paper) and $\{ C_a \}$ are the
206: clauses encoded by the parameters $J_{a,i}$ as follows: $J_{a,i}=\pm
207: 1$ if respectively $\pm \sigma_i$ appears in clause $a$ (in Boolean
208: notation we would have $J_{a,i}=-1$ (resp. $+1$) if the Boolean
209: variable $x_i$ (resp. $\neg x_i$) appears in clause $a$).  We call
210: $E_a$ the ``energy'' of a clause.  The symbol $i\in a$ will denote the
211: set of variables participating in clause $a$. Additionally it will be
212: useful to use the symbol $a\in i$ to denote the set of clauses
213: depending on variable $i$. The clause size $|\{i:i\in a\}|$ will be
214: denoted by $n_a$ ($n_a\equiv K$ for $K$-SAT), and the variable
215: connectivity $|\{a:a\in i\}|$ will be denoted by $n_i$.
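
As a concrete illustration of this encoding (a hypothetical clause,
chosen here only for the sake of the example), take $K=3$ and the
clause $x_1 \vee \neg x_2 \vee x_3$.  With the Boolean convention stated
above one has $J_{a,1}=-1$, $J_{a,2}=+1$, $J_{a,3}=-1$, so that
\[
E_a = \delta(-1,\sigma_1)\,\delta(+1,\sigma_2)\,\delta(-1,\sigma_3) .
\]
This product equals $1$ only for
$(\sigma_1,\sigma_2,\sigma_3)=(-1,+1,-1)$, i.e. for the single
assignment ($x_1$ False, $x_2$ True, $x_3$ False) that violates the
clause; for the remaining seven assignments $E_a=0$ and $C_a=1$.  In
this example $n_a=3$, and each of the three variables counts $a$ among
its $n_i$ clauses.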
216: 
217: The satisfiability problem consists in determining the existence of an
218: assignment to the Boolean variables which satisfies all clauses at the
219: same time, that is such that $\mathcal{F}=1$.  We may write the energy
220: function which counts the number of violated clauses as $E=\sum_a E_a$
221: so that the satisfiability problem becomes finding the zero energy
222: ground states of $E$. The random version of $K$-SAT corresponds to the
223: case in which the variables appearing in each clause are chosen
224: uniformly at random, and negated with probability $\frac12$. For the
225: sake of simplicity, hereafter we concentrate mostly on the $3$-SAT
226: case.
227: 
228: The energy function $E$ of a random $3$-SAT formula is a spin glass
229: model defined over a locally tree-like graph that can be studied
230: with the techniques of statistical physics of random systems, namely
231: the replica and cavity methods.
232: 
233: Numerical experiments have shown that a decimation algorithm based on
234: SP equations allows one to find satisfying assignments of critically
235: constrained random $3$-SAT instances -- that is random formulas with
236: $\alpha=M/N$ just below a critical ratio $\alpha_c \simeq 4.267$ where
237: formulas are conjectured to become unsatisfiable with high probability
238: -- with a computational cost roughly scaling as $N \log N$~\cite{BMZ}
239: while the other known algorithms typically take times that are
240: exponential in $N$~\cite{Cook_review,nature}.  According to the cavity
241: -- or SP -- analysis, in this hard region (more precisely for $\alpha
242: \in [4.15,4.267]$~\cite{MZ,MPR}) there is a genuine one step RSB
243: phase, in which the space of solutions decomposes into an exponential
244: number of clusters and where metastable states are even more numerous.
245: 
246: As discussed in great detail in ref.~\cite{MZ}, one crucial feature
247: that comes out from the SP analysis is the distinction between frozen
248: and unfrozen variables within the different clusters and we shall
249: introduce a formalism which naturally incorporates such phenomenon
250: (see also refs.~\cite{joker}).
251: 
252: We want to represent the condition that a variable is not forced
253: to take any specific value in a given ground state (unfrozen) and to
254: this end we consider the configuration space of $3$-valued variables
255: $s_{i}\in\left\{-1,*,1\right\} $ instead of $\sigma_i\in \{-1,1\}$.
256: 
257: We observe that $C_{a}$ as defined in Eq.~(\ref{eq:clause}) can be
258: evaluated also in extended variables: it behaves as if variables with
259: the $*$ value could be chosen to the best of $-1$ or $1$ and thus
260: satisfy the clause.  This gives the name ``joker state'' to the value
261: $*$. For a configuration $s^{(i,x)}$ such that $s^{(i,x)}_i = x$ and
262: $s^{(i,x)}_j = s_j$ for $j\neq i$ call
263: \begin{equation}
264: C_{a}^{i,x}(s)=C_{a}(s^{(i,x)})
265: \end{equation}
266: and introduce the constraint over $\{-1,*,1\}^{n}$ configurations given
267: by
268: \begin{equation}
269: V_{i}=\delta_{s_{i},*}\prod_{a\in i} C_{a}^{i,-1} C_{a}^{i,1} +
270:   \sum_{\sigma=\pm 1}\delta_{s_{i},\sigma} \prod_{a\in i}
271:   C_{a}^{i,\sigma}\left(1- \prod_{a\in i} C_{a}^{i,-\sigma}\right)
272: \label{eq:general2}
273: \end{equation}
274: The LEC formula derived from $\mathcal{F}$ will be defined as
275: \begin{equation}
276: \mathcal{G}=\prod_{i}V_{i}.  
277: \label{G}
278: \end{equation}
279: Note that $V_i$ depends only on $(s_{j})_{j\in a, a\in i}$ and
280: therefore preserves the ``locality'' of the structure, if any, of the
281: original formula.  A solution of the LEC problem is a configuration
282: $\mathbf{s}=(s_{i}) _{i\in I}\in\left\{-1,*,1\right\} ^{n}$ such that
283: $\mathcal{G}\left(\mathbf{s}\right)=1$. As a particular case, a
284: solution $\mathcal{G}(\mathbf{s})=1$ such that $s_{i}\in\left\{ \pm
285: 1\right\} $ is also a solution of $\mathcal{F}$.
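
To see how the constraints $V_i$ act, the following is a minimal
worked example under assumptions of our own choosing: a formula made
of a single clause $a$ over two variables (so $n_a=2$ rather than
$3$), violated only by $(\sigma_1,\sigma_2)=(-1,-1)$, that is
$x_1\vee x_2$.  Since $a$ is the only clause containing variable $1$,
Eq.~(\ref{eq:general2}) involves only $C_a^{1,\pm 1}(s)=C_a(\pm 1,s_2)$:
\[
V_1 = \left\{
\begin{array}{ll}
\delta_{s_1,*} & \mbox{if } s_2\in\{1,*\} \quad (C_a^{1,1}=C_a^{1,-1}=1),\\
\delta_{s_1,1} & \mbox{if } s_2=-1 \quad (C_a^{1,1}=1,\ C_a^{1,-1}=0),
\end{array}\right.
\]
and $V_2$ is obtained by exchanging the indices $1$ and $2$.  If
$s_2\in\{1,*\}$ then $V_1=1$ forces $s_1=*$, and $V_2=1$ in turn
forces $s_2=*$.  If $s_2=-1$ then $V_1=1$ forces $s_1=1$, but with
$s_1=1$ the constraint $V_2$ requires $s_2=*$, a contradiction.  Hence
in this toy case the only solution of $\mathcal{G}=1$ is $(*,*)$: no
variable is forced, and the three solutions $(1,1)$, $(1,-1)$, $(-1,1)$
of $\mathcal{F}$ are all represented by a single extended configuration.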
286: 
287: To fix ideas it might be useful to compare the LEC cost-energy
288: function with the original 3-SAT one. To this end we adopt
289: the so--called factor graph representation~\cite{factor_graph}: Given
290: a formula $\mathcal{F}$, we define its associated \emph{factor graph}
291: as a bipartite undirected graph $G=\left(V;E\right)$, having two types
292: of nodes, and edges only between nodes of different type: {\bf (i)}
293: Variable nodes, each one labeled by a variable index in $I=\left\{
294: 1,\dots,N\right\} $ and {\bf (ii)} Function nodes, each one labeled by
295: a clause index $a\in A$ ($|A|=M$). An edge $\left(a,i\right)$ will
296: belong to the graph if and only if $a\in i$ or equivalently $i\in a$.
297: For instance, the factor graph representation of the random $3$-SAT
298: problem consists in a bipartite graph with $N$ variable nodes having a
299: Poisson random connectivity of mean $3 \alpha$ and $M$ function nodes
300: with energy $E_a$ of uniform connectivity $3$ (a portion is shown in
301: part (a) of Fig.~\ref{duality}).  The extended LEC spin glass energy
302: function reads:
303: \begin{equation}
304: \hat E =  \sum_{a=1}^M \hat E_a +\sum_{i=1}^N A_i 
305: \end{equation}
306: where now $\hat E_a = 1-C_a$ is evaluated in the extended configuration
307: space and
308: \begin{equation}
309: A_i=\delta_{s_{i},*}\left(1-\delta_{E_i^{-1},E_i^1}\right) +
310: \sum_{\sigma=\pm 1}\delta_{s_{i},\sigma}\theta\left(E_i^\sigma -
311: E_i^{-\sigma}\right)
312: \end{equation}
313: with $E_i^\sigma=\sum_{a\in i}(1-C^{i,\sigma}_a)$ and $\theta(x)=1$ if
314: $x \geq 0$ and $0$ otherwise.  The factor graph of the LEC has $N$
315: additional function nodes (the $A_i$ terms enforcing the joker
316: condition) that extend over the second neighbors (inset (b) in
317: Fig. \ref{duality}). 
318: 
319: By inspecting Eq.~(\ref{G}) we notice a first problem, namely that we
320: have lost the local tree-likeness of the original graph. There are
321: interaction terms between every (ordered) pair of neighboring variable
322: nodes $i,j\in a$ (in the original graph), and thus for instance every
323: such pair shares two constraints $V_i,V_j$ (making an effective
324: 2-loop). This introduces an obvious problem for implementing BP over
325: this combinatorial problem, and moreover would make it difficult to
326: compare both algorithms, as the underlying geometry is now
327: different. Fortunately, there is an easy (but unfortunately
328: notationally somewhat involved) way out. We will group together
329: neighbor variables, effectively performing a sort of duality
330: transformation over the graph. We describe the procedure explicitly
331: below (Note that this is a particularly simple case of a Kikuchi or
332: ``generalized belief propagation''-type approximation \cite{GBP}).
333: 
334: We will define: {\bf (i.)}  $M$ multi-state variables, each one
335: corresponding to a tuple $t_a=\{t^{(i)}_a\}_{i\in a}$ ($t^{(i)}_a\in
336: \{-1,*,1\}$), ``centered'' on a clause $a$ and having (uniform)
337: connectivity $n_a$ ((c) in Fig.~\ref{duality}), and {\bf (ii.)} $N$
338: function nodes $\chi^{dbp}_i$ having Poisson connectivity, depending
339: on $T_i\equiv \{t_a\}_{a\in i}$ and enforcing the joker state
340: condition as well as identifying the values of the single variables
341: $t^{(i)}_a$ shared by different tuples $a\in i$ ((d) in
342: Fig.~\ref{duality}).  An explicit expression of $\chi_i^{dbp}(T_i)$
343: (cf. Eq.~(\ref{eq:general2})) is
344: \begin{equation}
345: \chi^{dbp}_{i} = \sum_{\{s_i\}}\left(\prod_{a\in
346: i}\delta_{t_a^{(i)},s_i}\right) \left(\delta_{s_i,*}\prod_{a\in
347: i} C_{a}^{i,-1} C_{a}^{i,1} + \sum_{\sigma=\pm 1}\delta_{s_i,\sigma}
348: \prod_{a\in i} C_{a}^{i,\sigma}\left(1 - \prod_{a\in i}
349: C_{a}^{i,-\sigma}\right)\right)
350: \label{eq:dualenergy}
351: \end{equation}
352: We shall refer to the BP equations over the dual graph as {\it Dual
353: BP} (DBP).
354: \begin{figure} 
355: \begin{center} 
356: \includegraphics[width=0.5\textwidth,height=0.3\textheight]{duality} 
357: \caption{(a) Portion of the original factor graph, (b) LEC graph with
358: 3-state variables and additional constraints $A_i$ (black nodes), (c)
359: duality transformation, (d) dual graph}
360: \label{duality}
361: \end{center} 
362: \end{figure}
363: 
364: \section{SP equations as BP equations over the dual graph}
365: 
366: Basic SP and DBP iterations can be thought of as transformations in
367: the space of probability distributions of the signs $h_i\in\{-1,0,1\}$
368: of the effective fields acting on the single spin variables and of the
369: tuples $t_a\in\{-1,*,1\}^{n_a}$ in the dual graph.  In the cavity
370: notation the quantities that are iterated refer to a graph in which a
371: given node and all its neighbor nodes are temporarily eliminated (see
372: Fig. \ref{duality} (a) and (d)) and all quantities are labeled by
373: oriented indices of the type $a \to i$ or $i \to a$ where the node on
374: the right of the arrow is the one eliminated.  Therefore the equations
375: describe a local transformation of some input probability
376: distributions into an output distribution in which a characteristic
377: function $\chi$ eliminates contributions from those combinations of
378: input and output fields or variables that violate some kind of local
379: constraints (it is worth noticing that these cavity equations are
380: closely related to the iterative local equations of the so called
381: Objective Method~\cite{Aldous} of combinatorial
382: probability). Explicitly we have:\\
383: 
384: 
385: {\bf DBP equations:}
386: \begin{eqnarray}
387: P_{a\to i}^{dbp}\left(t_a\right) & \propto & \sum_{\left\{
388: t_{b}\right\}} \prod_{j\in a\setminus i} \chi^{dbp}_j
389: \left(t_a,\left\{t_b\right\}\right) \prod_{b\in j\setminus a} P_{b\to
390: j}^{dbp}\left(t_{b}\right)
391: \end{eqnarray}
392: \\
393: 
394: {\bf SP equations:} ~\cite{MZ,BMZ}
395: \begin{eqnarray}
396: P_{j\to a}^{sp}\left(h_j\right) & \propto & \sum_{\left\{ h_k\right\}}
397: \chi^{sp}_{j\to a}\left(h_{j},\left\{h_{k}\right\}\right) \prod_{b\in
398: j\setminus a} \prod_{k\in b\setminus j} P^{sp}_{k\to
399: b}\left(h_{k}\right)
400: \end{eqnarray}
401: where 
402: \begin{eqnarray}
403: \chi^{sp}_{j\to a} & = & \delta_{h_{j},*}\prod_{b\in j\setminus
404: a}C_{b}^{j,1}C_{b}^{j,-1} + \sum_{\sigma=\pm
405: 1}\delta_{h_{j},\sigma}\prod_{b\in j\setminus
406: a}C_{b}^{j,\sigma}\left(1-\prod_{b\in j\setminus
407: a}C_{b}^{j,-\sigma}\right)
408: \end{eqnarray}
409: $C_{b}$ clauses are here evaluated in $\left(\left(h_{k}\right)_{k\in
410: b\setminus j},h_{j}\right)$.\\
411: 
412: In order to show the connection between the above equations it is
413: convenient to introduce an auxiliary transformation $\tau$ of a
414: similar type:\\
415: 
416: {\bf $\tau$ transformation:}
417: \begin{eqnarray}
418: P_{a\to i}^{\tau}\left(t_a\right) & \propto & \sum_{\left\{
419:   h_j\right\}}\prod_{j\in a\setminus i} \chi^{\tau}_{j\to
420:   a}\left(t_a,h_j\right) P_{j\to a}\left(h_j\right)
421: \end{eqnarray}
422: and
423: \begin{eqnarray} 
424: \chi^{\tau}_{j\to a} = \sum_{\sigma=\pm 1} C_a \delta_{h_j,\sigma}
425: \delta_{t_a^{(j)},\sigma} + \delta_{h_j,*} \left[
426: \delta_{t_a^{(j)},*} C_{a}^{j,-1}C_{a}^{j,1} + \sum_{\sigma=\pm 1}
427: \delta_{t_a^{(j)},\sigma} C_{a}^{j,\sigma}\left(1 -
428: C_{a}^{j,-\sigma}\right)\right]
429: \label{eq:tau}
430: \end{eqnarray}
431: $C_{a}$ terms are evaluated here in $t_a$.\\
432: 
433: 
434: We will now drop the argument dependence of the measures $P_{j\to
435: a}^{sp}$, $P_{a\to i}^{dbp}$ and $P_{a\to i}^{\tau}$ and make instead
436: explicit the dependence on the input probability measures
437: $\left\{P_{k\to b}\right\},\left\{P_{b\to j}\right\},\left\{P_{j\to
438: a}\right\}$ respectively.
439: 
440: The connection between DBP and SP can be written as follows:
441: \begin{equation}
442: P_{a\to i}^{dbp}\left(\left\{P_{b\to j}^{\tau}\right\}\right) \equiv
443:   P_{a\to i}^{\tau}\left(\left\{P_{j\to a}^{sp}\right\}\right)
444: \label{eq:p_equiv}
445: \end{equation}
446: where both sides of the (functional) equality in turn depend on some
447: arbitrary set of probability distributions $\left\{P_k(h_k)\right\}$
448: where $k\in b\setminus j$ for $b\in j\setminus a$ and finally $j\in
449: a\setminus i$. In short,
450: \begin{equation}
451: P^{dbp}\circ P^{\tau}\equiv P^{\tau}\circ P^{sp}
452: \label{eq:equiv}
453: \end{equation}
454: %%%%%%%%%%%%%%%%%%%%%% proof tau o sp = bp o tau
455: 
456: In order to check the validity of the above identity we observe
457: that a direct inspection of the composition shows that it is true if
458: for every $j\in a\setminus i$ the following condition among the
459: characteristic functions holds:
460: \begin{equation}
461: \sum_{\{h_j\}}\chi^{\tau}_{j\to a} \chi_{j\to a}^{sp} =
462: \sum_{\{t_b\}}\chi^{dbp}_j\prod_{b\in
463: j\setminus a} \prod_{k\in b\setminus j}\chi^{\tau}_{k\to
464: b}\label{eq:compo}
465: \end{equation}
466: In Appendix~\ref{proof} we display the proof that this identity holds
467: and, as a consequence, that also identity Eq.~(\ref{eq:equiv}) is
468: valid. Eq.~(\ref{eq:equiv}) in turn implies that
469: \begin{equation}
470: \left(P^{dbp}\right)^{\left(k\right)}\circ P^{\tau}\equiv
471: P^{\tau}\circ\left(P^{sp}\right)^{\left(k\right)} \; \; \;,
472: \end{equation}
473: where the $\left(k\right)$ exponent means composition. This in turn
474: implies that we have a direct step-by-step connection between the
475: elementary quantities used in the DBP equations and those used in the
476: SP equations: convergence is obtained simultaneously and
477: Eq.~(\ref{eq:equiv}) holds for the respective fixed points.  It is
478: straightforward to compute from the DBP equations the marginals
479: $P_{i}^{dbp}\left(s_{i}\right)$ of the single variables as a
480: marginalization of $P_{a}^{dbp}\left(t_{a}\right)$ for some $a\in i$
481: with respect to all other variables in the clause (at a fixed point,
482: it does not matter which $a\in i$ one chooses). One finds that the
483: marginals predicted by DBP are in one to one correspondence with the
484: local fields given by SP, that is $P_i^{dbp}(s_i=-1,*,1)$ coincides
485: respectively with $P_i^{sp}(H_i=-1,0,1)$ (see refs.~\cite{MZ,BMZ}).
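
For completeness, the two steps used in the paragraph above can be
spelled out; this is only a sketch in the notation already introduced.
The $k$-fold identity follows from Eq.~(\ref{eq:equiv}) by induction on
$k$: assuming it holds for $k-1$,
\[
\left(P^{dbp}\right)^{(k)}\circ P^{\tau}
 = P^{dbp}\circ\left(P^{dbp}\right)^{(k-1)}\circ P^{\tau}
 = P^{dbp}\circ P^{\tau}\circ\left(P^{sp}\right)^{(k-1)}
 = P^{\tau}\circ P^{sp}\circ\left(P^{sp}\right)^{(k-1)}
 = P^{\tau}\circ\left(P^{sp}\right)^{(k)} ,
\]
where the third equality is Eq.~(\ref{eq:equiv}) itself.  The
single-variable marginal is obtained from the tuple marginal of any
clause $a\in i$ by summing over the other entries of the tuple,
\[
P_{i}^{dbp}\left(s_{i}\right)=\sum_{\left\{ t_a\right\}}
\delta_{t_a^{(i)},s_i}\,P_{a}^{dbp}\left(t_{a}\right) .
\]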
486: 
487: 
488: \subsection{Clustering and whitening}
489: 
490: The marginals over $\{1,*,-1 \}^N$ given by SP/DBP acquire a
491: computational/physical significance once we interpret what solutions
492: of the combinatorial problem defined by Eq.~(\ref{G}) mean in terms of
493: clusters (or groups) of solutions of the original problem defined by
494: Eq.~(\ref{F}). We will first define the Hamming distance between
495: configurations $s,t\in \{1,*,-1\}^n$, $H(s,t)=|\{i:s_i\neq t_i\}|$ and
496: an ordering relation over $\{-1,*,1\}$ configurations: if $s,t\in
497: \{1,*,-1\}^n$ we say that $s\leq t$ iff $t_i \neq s_i$ implies that
498: $t_i=*$. For instance, $(-1,1)\leq(-1,*)$ and $(1,1,1)\leq(1,*,*)$ but
499: $(-1,1)\not\leq(1,*)$.
500: 
501: 
502: We will say that a configuration $s\in \{\pm 1\}^n$ is {\it contained}
503: in $t\in \{1,*,-1\}^n$ if $s\leq t$. In this sense, ``clustering'' would mean,
504: starting with some set $S\subset \{\pm 1\}^n$ of solutions of the
505: original combinatorial problem, to find some set $T \subset
506: \{1,*,-1\}^n$ such that every $s\in S$ is contained in some $t\in
507: T$. Of course, one would like to do so in some maximal way, but
508: satisfying some kind of separation between different clusters.
509: 
510: One trivial observation about the set ${\mathcal G}=1$ is that
511: solutions are necessarily separated, in the sense that $H(s,t) > 1$ if
512: ${\mathcal G}(s)={\mathcal G}(t)=1$ and $s\neq t$. To prove this,
513: suppose that $H(s,t)=1$. If their difference comes because $s_i=\pm 1$
514: and $t_i=*$ then necessarily one of $V_i(t)$ or $V_i(s)$ is
515: violated. If on the contrary, it comes because $s_i=1$ and $t_i=-1$ or
516: vice versa, then necessarily both $V_i(t)$ and $V_i(s)$ are violated
517: and the only possible ``correct'' value for $s_i$ is $*$.
518: 
519: A more important observation is that every solution of ${\mathcal
520: F}=1$ is {\it contained} in a solution of ${\mathcal G}=1$ with the
521: minimal number of $*$, and that solution can be easily found. Take a
522: solution $x$ of ${\mathcal F}=1$, and suppose that ${\mathcal G}(x)=0$.
523: Choose a $V_i$ such that $V_i=0$. It can be easily seen that
524: replacing $x_i$ by $*$ makes $V_i$ equal to $1$. Then we pick another
525: violated constraint and repeat the process, until ${\mathcal G}=1$. We
526: will call the resulting configuration $w(x)$ (this procedure has been
527: already used under the name of {\it whitening} in the context of graph
528: coloring by G. Parisi in~\cite{joker}). It is easy to prove that the
529: result of this procedure does not depend on the order in which the
530: violated nodes $V_i$ are picked (the proof being that any
531: violated $V_i$ will continue to be violated in the procedure, exactly
532: until we switch $x_i$ to $*$), and so $w(x)$ is uniquely defined. Note
533: that two configurations $x,y$ at Hamming distance $H(x,y)=1$ will have
534: $w(x)=w(y)$ and so every solution in a fixed connected component of
535: the solution space will end up inside the same ``cluster''. An example
536: of the whitening procedure for some set of solutions is depicted in
537: Figure~(\ref{whitening-good}).
538: \begin{figure} 
539: \begin{center} 
540: \includegraphics[height=2cm]{whitening2}
541: \caption{The whitening procedure from left to right: the original set
542: of solutions $\{(-1,-1,-1), (1,1,-1), (1,1,1)\}$ and the set of
543: whitened clusters in the final step $\{(-1,-1,-1),(1,1,*)\}$}
544: \label{whitening-good}
545: \end{center} 
546: \end{figure}
547: An interesting point of view is that if one tries to build from
548: scratch a Hamiltonian to describe the behaviour of the outcomes of the
549: whitening procedure of some SAT formula, Eq.~(\ref{G}) comes
550: naturally.
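
As a minimal illustration of the whitening map $w(x)$ defined above,
consider a hypothetical two-variable formula (chosen here only as an
example, with $n_a=2$) made of two clauses: $a$, violated only by
$(\sigma_1,\sigma_2)=(-1,1)$, and $b$, violated only by
$(\sigma_1,\sigma_2)=(1,-1)$.  Its solutions are $(1,1)$ and $(-1,-1)$.
Take $x=(1,1)$.  Setting $s_1=-1$ would violate $a$, so
$\prod_{c\in 1}C_c^{1,1}=1$ and $\prod_{c\in 1}C_c^{1,-1}=0$, and
Eq.~(\ref{eq:general2}) gives $V_1=\delta_{s_1,1}=1$.  Setting $s_2=-1$
would violate $b$, so likewise $V_2=\delta_{s_2,1}=1$.  Hence
\[
\mathcal{G}\left((1,1)\right)=1 , \qquad w\left((1,1)\right)=(1,1) ,
\]
and by the same argument $w\left((-1,-1)\right)=(-1,-1)$.  Both
variables are frozen in each solution, and the two whitened
configurations are at Hamming distance $2$, consistently with the
separation property stated above.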
551: 
552: The reader should note however that the presented definition of
553: clustering is far from perfect in the worst case: there is a number of
554: systematic errors produced by the whitening. For instance, in
555: Figure~(\ref{whitening-errors}) we can see one cluster claiming an
556: incorrectly large volume. And there is of course also another problem:
557: unfortunately, there is no guarantee that the sole solutions of
558: ${\mathcal G}=1$ are the ones of the whitening, and in fact small
559: counter-examples can be easily constructed. Numerical work is being
560: done to quantify these two types of errors
561: \cite{napolano}.
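
Two hypothetical two-variable formulas (our own toy examples, with
clauses of size $2$) make both kinds of error explicit.  {\it (i)} Let
clause $a$ be violated only by $(-1,-1)$ and clause $b$ only by
$(-1,1)$; the solutions are $(1,1)$ and $(1,-1)$, so $\sigma_1=1$ in
every solution.  Starting from $x=(1,1)$, neither value of $s_2$
violates a clause, so $V_2=\delta_{s_2,*}$ is violated and $x$ becomes
$(1,*)$.  Now $C_a(-1,*)=C_b(-1,*)=1$, so $V_1=\delta_{s_1,*}$ is
violated as well, and the procedure ends in
\[
w\left((1,1)\right)=(*,*) ,
\]
which also contains the non-solutions $(-1,1)$ and $(-1,-1)$.  {\it
(ii)} Let clause $a$ be violated only by $(-1,1)$ and clause $b$ only
by $(1,-1)$; the solutions are $(1,1)$ and $(-1,-1)$, and each of them
already satisfies $\mathcal{G}=1$, so each is its own whitening.
However, with $s_2=*$ both clauses are satisfied for either value of
$s_1$, so $V_1=\delta_{s_1,*}$, and symmetrically $V_2=\delta_{s_2,*}$.
Therefore $(*,*)$ is a third solution of $\mathcal{G}=1$ which is not
the whitening of any solution of $\mathcal{F}$.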
562: 
563: \begin{figure} 
564: \begin{center} 
565: \includegraphics[height=2cm]{whitening}
566: \caption{A systematic error of the whitening $w((1,1,-1))$ (the dark
567: solution in the left). From left to right: the original sets of
568: solutions $\{(1,1,-1), (1,1,1), (1,-1,1), (-1,-1,-1)\}$ and first step
569: $(1,1,-1)$, second step $(1,1,*)$, third step $\{(1,*,*)\}$ and final
570: step $\{(*,*,*)\}$}
571: \label{whitening-errors}
572: \end{center} 
573: \end{figure}
574: 
575: 
576: \section{Entropy and complexity}
577: 
578: The equivalence between the DBP marginals and the SP local field
579: probability distributions has the direct consequence that the Bethe
580: approximation to the entropy on the dual graph, $S^{dbp}$, coincides
581: with the logarithm of the number of clusters of solutions predicted by
582: SP, the so called complexity $\Sigma$.
583: 
584: On general grounds the Bethe approximation to the entropy of a problem
585: is exact if correlations among cavity variables can be neglected
586: (i.e. the global joint probability distribution takes a factorized
587: form). This is certainly true over tree graphs and it is conjectured
588: to be true in some cases for locally tree-like random graphs in the
589: limit of large size (one informal explanation is that distance between
590: cavity variables diverges with probability tending to one).
591: Factorization of marginal probabilities over our dual factor graph
592: amounts to writing $P(\{t_a\})=\prod_{i\in I} P^{dbp}_{i}(T_i)
593: \prod_{a \in A} [ P_a^{dbp}(t_a)]^{1-n_a}$ where $P_i^{dbp}(T_i)$ is
594: the joint probability distribution of the triples connected to node
595: $i$ ($T_i \equiv \{ t_b\}_{b \in i}$) and $P_a^{dbp}(t_a)$ is the
596: single triple marginal. Under this condition the entropy reads
597: \begin{eqnarray}
598: S = -\sum_{i}\sum_{\left\{ T_{i}\right\} }P^{dbp}_{i}(T_{i})\log
599:  P_{i}^{dbp}(T_{i}) + \sum_{a}\left(n_a - 1\right)\sum_{\left\{
600:  t_{a}\right\} }P^{dbp}_{a}(t_{a})\log P^{dbp}_{a}(t_{a}) \; .
601: \label{eq:entropy}
602: \end{eqnarray}
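
As an illustrative sanity check of Eq.~(\ref{eq:entropy}) (a minimal
example chosen here, not part of the derivation), consider a dual graph
made of two nodes $i$ and $j$ that share a single triple $t_a$, so that
$n_a=2$, while every other triple belongs to one node only and drops
out because its exponent $1-n_b$ vanishes. The factorized form is then
\[
P(\{t\}) = \frac{P^{dbp}_{i}(T_i)\,P^{dbp}_{j}(T_j)}{P^{dbp}_{a}(t_a)}
\]
and Eq.~(\ref{eq:entropy}) reduces to
\[
S = -\sum_{\{T_i\}}P^{dbp}_{i}\log P^{dbp}_{i}
    -\sum_{\{T_j\}}P^{dbp}_{j}\log P^{dbp}_{j}
    +\sum_{\{t_a\}}P^{dbp}_{a}\log P^{dbp}_{a} \; ,
\]
that is, the sum of the two node entropies minus the entropy of the
shared triple, which is the exact joint entropy whenever $T_i$ and
$T_j$ are independent once $t_a$ is given.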
603: 
Showing $S=\Sigma$ is a straightforward calculation that we
report in appendix~\ref{entropy-complexity}. It requires expressing the
entropy in terms of the cavity fields given by SP, exploiting both
Eq.~(\ref{eq:equiv}) and the fixed point conditions. One finds
608: \begin{eqnarray}
609: S =  \sum_{i}\log c_{i}-\sum_{a}\left(n_{a} - 1\right)\log
610:  c_{a}-\sum_{i}\sum_{a\in i}\log D_{a\to i}
611: \label{eq:Sconst}
612: \end{eqnarray}
613: where the three normalization constants are defined by
614: \begin{eqnarray}
615: c_{i} & = & 
616: \sum_{\left\{ T_i\right\}}\prod_{a\in i}P_{a\to
617: i}\left(t_{a}\right)\chi_{i}\left(T_{i}\right)
618: \label{ci} \\
619: c_{a} & = & \sum_{t_a}\sum_{\left\{ h_{j}\right\}}\prod_{j\in
620: a}P_{j\to a}\left(h_{j}\right)\chi^{\tau}_{j\to
621: a}\left(h_{j},t_a\right)
622: \label{ca}\\
D_{a\to i}^{-1} & = & \sum_{t_a}\sum_{\left\{ h_{j}\right\}}\prod_{j\in
a\setminus i}P_{j\to a}\left(h_{j}\right)\chi^{\tau}_{j\to
a}\left(h_{j},t_a\right)
626: \label{D}
627: \end{eqnarray}
628: These constants are not independent and the explicit expressions of
629: the first two are sufficient for writing $S$ in terms of SP
630: quantities:
631: \begin{eqnarray}
632: c_{a} & = & \sum_{\left\{ h_{j}\right\}} \prod_{j\in a} P_{j\to a}
633:  \left(h_{j}\right)\sum_{\left\{ t_a\right\}} \prod_{j\in a}
634:  \chi^{\tau}_{j\to a} \left(h_{j},t_a\right) \\ & = & 1 -
635:  \sum_{\left\{ h_{j}\right\}} \prod_{j\in a} P_{j\to a}
636:  \left(h_{j}\right)\left(1-\sum_{\left\{ t_{a}\right\}}\prod_{j\in a}
637:  \chi^{\tau}_{j\to a} \left(h_{j},t_a\right) \right) \\ & = & 1 -
638:  \prod_{j\in a}P_{j\to a}\left(J_{a,j}\right)\\ & = & 1- \prod_{j\in
639:  a} \frac{\Pi_{j\to a}^{u}}{\left(\Pi_{j\to a}^{s}+\Pi_{j\to
640:  a}^{0}+\Pi_{j\to a}^{u}\right)}
641: \end{eqnarray}
642: where we have borrowed the notation of Eq.~(18) in~\cite{BMZ}. For
643: computing $c_i$ we first notice that
644: \begin{equation} 
645: P_{a\to i}\left(t_{a}\right)=D_{a\to i}\sum_{\left\{ h_{j}\right\}
646: _{j\in a\setminus i}}\chi^{\tau}_{j\to
647: a}\left(t_{a},h_{j}\right)\prod_{j\in a\setminus i}P_{j\to
648: a}\left(h_{j}\right)
649: \end{equation}
650: so that Eq.~(\ref{ci}) reads
651: \begin{eqnarray} 
652: c_{i} & = & \prod_{a\in i} D_{a\to i}\sum_{\left\{ H_{i}\right\}
653: }\sum_{\left\{ T_{i}\right\} } \chi_{i} \left(T_{i}\right)
654: \prod_{a}\prod_{j\in a\setminus i}\chi^{\tau}_{j\to
655: a}\left(t_{a},h_{j}\right) P_{j\to a}\left(h_{j}\right)\nonumber \\ & = &
656: \prod_{a\in i}D_{a\to i}\sum_{\left\{ H_{i}\right\}
657: }\chi_i^{sp}(H_{i})\prod_{a}\prod_{j\in a\setminus i}P_{j\to
658: a}\left(h_{j}\right)\nonumber \\ & = & \prod_{a\in i}D_{a\to
659: i}\left(\hat{\Pi}_{i}^{+} + \hat{\Pi}_{i}^{0} +
660: \hat{\Pi}_{i}^{-}\right)
661: \end{eqnarray}
662: in the notations of Eq.~(21) in~\cite{BMZ}. Finally, plugging these
663: expressions into Eq.~(\ref{eq:Sconst}) and calling
\begin{eqnarray} w_{i} & = &
\hat{\Pi}_{i}^{+}+\hat{\Pi}_{i}^{0}+\hat{\Pi}_{i}^{-} \nonumber \\ x_{i\to a} & =
& \Pi_{i\to a}^{s}+\Pi_{i\to a}^{0}+\Pi_{i\to a}^{u}\nonumber \\ y_{i\to a} & =
& \Pi_{i\to a}^{u}
\end{eqnarray} 
we get
670: \begin{eqnarray}
S = \sum_{i}\log w_{i}- \sum_{a} \left(n_{a} - 1\right)
\log\left(1-\prod_{i\in a}\frac{y_{i\to a}}{x_{i\to a}}\right)
673: \label{esse1}
674: \end{eqnarray}
In this expression, $w_i$ represents the probability that the local field
acting on the spin variable $i$ does not produce a contradiction and
$1 - \prod_{i\in a}\frac{y_{i\to a}}{x_{i\to a}}$ is the probability
that the cavity fields satisfy clause $a$.
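
As a purely illustrative numerical case (the values are hypothetical
and not taken from any SP run), take a $3$-clause $a$ whose three
cavity ratios are all $y_{i\to a}/x_{i\to a}=1/2$. Then
\[
c_a = 1-\prod_{i\in a}\frac{y_{i\to a}}{x_{i\to a}}
    = 1-\left(\frac{1}{2}\right)^{3} = \frac{7}{8} \; ,
\]
and the corresponding term of Eq.~(\ref{eq:Sconst}) is
$-(n_a-1)\log c_a = 2\log(8/7)\simeq 0.27$ (natural logarithm). The
closer each ratio is to one, i.e. the more likely every variable of the
clause is forced in the unsatisfying direction, the larger this term
becomes.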
679: 
We recall that the expression of the SP complexity $\Sigma$ defined
in Eqs.~(25--27) of~\cite{BMZ} is
682: \begin{eqnarray}
683: \Sigma & = & \sum_{i}\left(1-n_{i}\right)\log w_{i} +
684:  \sum_{a}\log\left(\prod_{i\in a}x_{i\to a} - \prod_{i\in a}y_{i\to
685:  a}\right) \nonumber \\ & = & \sum_{i}\log w_{i} - \sum_{a}\sum_{i\in a}\log
686:  w_{i} + \sum_{a}\log\left(\prod_{i\in a}x_{i\to
687:  a}-\prod_{i\in a}y_{i\to a}\right)
688: \label{sigma1} 
689: \end{eqnarray}
Despite their different appearance, Eq.~(\ref{esse1}) and
Eq.~(\ref{sigma1}) coincide when evaluated at a fixed point of the SP
equations.  Their difference
693: \begin{eqnarray}
694: \label{eq:sigma-esse}
\Sigma - S = \sum_{a}\left\{ -\sum_{i\in a}\log w_{i} +
n_{a}\log\left(1-\prod_{i\in a}\frac{y_{i\to a}}{x_{i\to
a}}\right) + \sum_{i\in a}\log x_{i\to a}\right\}
698: \end{eqnarray}
is zero since at the fixed point every term inside the curly brackets
vanishes: using Eq.~(17) in~\cite{BMZ} we have that $\eta_{a\to
i}=\prod_{j\in a\setminus i}\frac{y_{j\to a}}{x_{j\to a}}$,
i.e. $\prod_{j\in a}\frac{y_{j\to a}}{x_{j\to a}}=\eta_{a\to
i}\frac{y_{i\to a}}{x_{i\to a}}$ for every $i\in a$ and hence
\begin{equation}
n_{a}\log\left(1-\prod_{j\in a}\frac{y_{j\to a}}{x_{j\to
a}}\right)=\sum_{i\in a}\log\left(1-\eta_{a\to i}\frac{y_{i\to
a}}{x_{i\to a}}\right)
\end{equation}
A simple calculation shows that $w_{i} = x_{i\to a}-\eta_{a\to
i}y_{i\to a}$ for every $a\in i$ and therefore we get $\Sigma=S$ as
desired.
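
To make the last step explicit (a short worked check in the notation
above, assuming $x_{i\to a}>0$; it is not an additional result), one
can factor $x_{i\to a}$ out of $w_i = x_{i\to a}-\eta_{a\to i}\,y_{i\to
a}$, take logarithms and sum over the variables of clause $a$:
\begin{eqnarray*}
\sum_{i\in a}\log w_{i} & = & \sum_{i\in a}\log x_{i\to a} +
 \sum_{i\in a}\log\left(1-\eta_{a\to i}\frac{y_{i\to a}}{x_{i\to
 a}}\right) \\ & = & \sum_{i\in a}\log x_{i\to a} +
 n_{a}\log\left(1-\prod_{j\in a}\frac{y_{j\to a}}{x_{j\to a}}\right)
\end{eqnarray*}
Hence, clause by clause, the $\log w_i$ terms are exactly balanced by
the $\log x_{i\to a}$ terms together with the $n_a\log(\cdot)$ term.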
712: 
713: %%%%%%%%%%%%%%%%%%%%%
714: \section{Discussion and Conclusions}
715: 
716: In this work we have shown by elementary means that the SP equations
717: can be interpreted and derived as sum-product equations for the
718: marginals over a modified combinatorial problem. An important
consequence of this fact is a clarification of the hypothesis behind
the algorithm. It is to be expected that the essential hypothesis
making sum-product work is the absence of correlations between the
marginals of distant (or cavity) variables. Under the mapping shown
here, this directly implies that the hypothesis behind SP (and, in a
way, behind its definition of clusters) is the absence of correlations
between the frozen parts of distant variables, that is the absence of
correlations between {\bf different} clusters.
726: 
In this light one can think of the SP procedure of obtaining $\hat
728: E$ from $E$ as a way of collapsing the internal structure of pure
729: states: the resulting problem ${\mathcal G}$ has many pure states but
730: with zero internal entropy. Note that this is a completely different
731: limit case with respect to the ``one pure state''  in which BP
732: (more precisely DBP) is shown to work correctly and to predict an
accurate entropy (which, we recall, is the complexity of the original
734: $E$).
735: 
736: As far as the connection between solutions of the modified problem and
737: the original one is concerned, things are particularly simple over
738: tree factor graphs (see also \cite{BMZ} for results concerning
propagation of messages): indeed, for any fixed boundary condition
740: (i.e. an assignment for the leaf variables), there is at most one
741: solution with $\hat{E} = 0$, and it is easy to prove (see
742: appendix~\ref{tree}) that all solutions of $E=0$ correspond to the
743: same connected component of the solution space (i.e. every two
744: solutions can be joined by a path of solutions in which successive
745: configurations in the path differ by exactly one spin flip). 
746: 
747: The situation on loopy graphs (corresponding for instance to random
748: formulae) is obviously more complicated. A coherent interpretation
would be that not only the recursive DBP/SP equations themselves
750: are accurate in a probabilistic sense (i.e. when the factorization of
751: the corresponding input joint probability is sound) to compute the
752: statistics of the ground states of $\hat E$, but also that the
753: exactness of the interpretation of the ground states of $\hat E$ in
754: terms of clustering of the ground states of $E$ relies on this
755: hypothesis being true.
756: 
In this respect we mention that exact enumerations on a large number
(thousands) of small random 3-SAT formulas (up to $N=100$) showed that
all the zero energy configurations of $\hat E$ which are stable under
SP iterations can be extended to real solutions of the original
problem.  Spurious ground states (i.e. configurations that cannot be
extended to real solutions) do exist with a non-negligible
probability for small $N$; however, they always turn out to be unstable
fixed points of SP, that is unsat configurations which are irrelevant
for the SP marginals \cite{napolano}.  While such a result was
expected to hold for tree-like graphs, it is somewhat surprising to
observe it numerically on small, loopy, random factor graphs.  The
robustness of this result calls for a finite $N$ probabilistic analysis
which would represent a building block for the rigorous analysis of SP
(of course, small ad-hoc counterexamples on improbable formulae can be
easily constructed).
772: 
773: As a concluding remark we notice that the discussed formalism can be
774: generalized to take care of the non-zero energy regime where not all
775: constraints can be satisfied simultaneously (``frustrated'' case). The
776: LEC energy function takes the form $\hat E= \lambda \sum_{a\in A} \hat
777: E_a + \sum_{i\in I} A_i$, where $\lambda$~\cite{note} plays the role
778: of the so called Parisi re-weighting parameter~\cite{cavity}. \\
779: 
780: 
781: \section{Acknowledgments}
782: 
783: We thank D. Achlioptas, M. Mezard, G. Parisi, A. Pelizzola and M.
784: Pretti for very fruitful discussions. This work has been supported in
785: part by the European Community's Human Potential Programme under
786: contract HPRN-CT-2002-00319, STIPCO.
787: 
788: 
789: \appendix
790: 
791: \section{Proof of equivalence}
792: 
793: \label{proof}
794: 
795: For the LHS of Eq.~(\ref{eq:compo}) we have:\\
796: 
797: \noindent
798: If $h_j = \sigma\in\{\pm 1\}$ then
799: \begin{equation}
800:  \chi^{\tau}_{j\to a}= C_a\delta_{t_a^{(j)},\sigma} \; \; \; , \; \; 
801:  \chi^{sp}_{j\to a} = \prod_{b\in
802: j\setminus a}C_b \left(1-\prod_{b\in j\setminus a}
803: C_{b}^{j,-\sigma}\right)
804: \end{equation}
805: \noindent
806: If $h_j = *$ then
807: \begin{equation}
808: \chi^{\tau}_{j\to a} = \delta_{t^{(j)}_a,*}C_{a}^{j,-1}C_{a}^{j,1} +
809: \sum_{\sigma=\pm 1}\delta_{t_a^{(j)},\sigma} C_{a}^{j,\sigma}\left(1 -
810: C_{a}^{j,-\sigma}\right) \; \; \; , \; \; \chi^{sp}_{j\to a} =
811: \prod_{b\in j\setminus a}C_{b}^{j,-1}C_{b}^{j,1}.
812: \end{equation}
813: 
Summing up both products and regrouping, the LHS of Eq.~(\ref{eq:compo}) reads:
815: \begin{eqnarray}
816: \sum_{\sigma=\pm 1}\delta_{t_a^{(j)},\sigma} \prod_{b\in j}
817: C_{b}^{j,\sigma} \left(1 - \prod_{b\in j} C_{b}^{j,-\sigma}\right) +
818: \delta_{t_a^{(j)},*} \prod_{b\in j} C_{b}^{j,-1}
819: C_{b}^{j,1}\label{eq:sptau}
820: \end{eqnarray}
821: where $C_{b}$ for $b\in j\setminus a$ is evaluated here in
822: $\left(\{h_k\}_{k\in b\setminus j},t_a^{(j)}\right)$ and $C_{a}$ is
823: evaluated in $t_a$.
824: 
825: For the RHS of Eq.~(\ref{eq:compo}) we first notice that as the
826: $\chi^{dbp}_{j}$ term includes $\prod_{a\in j} \delta_{t_a^{(j)},s_j}$
827: we will simply replace all occurrences of $t_b^{(j)}$ and $s_j$
828: variables by $t_a^{(j)}$ and drop the outer sum and the product term
829: itself. For instance, the sum over $\{ t_b\}_{b\in j}$ thus reduces to
830: a sum over $\left\{\{ t_b^{(k)}\}_{k\in b\setminus
831: j},{t_a^{(j)}}\right\}$. Let's evaluate the RHS of
832: Eq.~(\ref{eq:compo}) on the three possible values of $t_a^{(j)}$:\\
833: \noindent
834: If $t_a^{(j)} = *$ then by Eq.~(\ref{eq:dualenergy}) $\chi^{dbp}_j =
835: \prod_{b\in j}C_{b}^{j,-1}C_{b}^{j,1}$.  Moreover, just by looking at
836: its definition Eq.~(\ref{eq:tau}), one finds that in
837: $\chi^{\tau}_{k\to b}$ all $C$ terms are equal to $1$ since their $j$
838: coordinate $t_b^{(j)}=t_a^{(j)}$ is $*$.  Then $\chi^{\tau}_{k\to b} =
839: \delta_{t_b^{(k)},h_k}$ and the RHS of Eq.~(\ref{eq:compo}) becomes
840: \begin{equation}
841: C_{a}^{j,-1} C_{a}^{j,1} \prod_{b\in j\setminus a} C_{b}^{j,-1}
842: C_{b}^{j,1} \prod_{k\in b\setminus j} \delta_{t_b^{(k)},h_k}
843: \end{equation}
844: which is exactly the term in Eq.~(\ref{eq:sptau}) corresponding to
845: $t_a^{(j)} = *$ (remember that $C_{b}$ clauses here are evaluated in
846: $t_b$).\\
847: \noindent
If $t_a^{(j)} = \sigma\in\{\pm 1\}$ then it is convenient to split
$\chi^{dbp}_j$ into two addends:
850: \begin{equation}
851: \prod_{b\in j} C_{b}-\prod_{b\in j} C_{b} C_{b}^{j,-\sigma}
852: \end{equation}
853: so that the RHS of Eq.~(\ref{eq:compo}) becomes
854: \begin{eqnarray}
855: C_{a}\prod_{b\in j\setminus a} \left(\sum_{\{t_{b}\}}C_{b}\prod_{k\in
856:   b\setminus j}{\chi^{\tau}_{k\to b}}\right) - C_{a}C_{a}^{j,-\sigma}
857:   \prod_{b\in j\setminus
858:   a}\left(\sum_{\{t_{b}\}}C_{b}C_{b}^{j,-\sigma}\prod_{k\in b\setminus
859:   j}\chi^{\tau}_{k\to b}\right)
860: \end{eqnarray}
861: Finally, both sums can be computed explicitly and the result is
862: again exactly the corresponding term in Eq.~(\ref{eq:sptau}). This ends
863: the proof of the identity Eq.~(\ref{eq:equiv}).
864: 
865: 
866: 
867: \section{Computation of the entropy}
868: \label{entropy-complexity}
869: For simplicity of notation, in what follows we write $P_{a}(t_{a}),
870: P_{a\to i}(t_a), P_i(T_i)$ and $\chi_i(T_i)$ in place of
871: $P_{a}^{dbp}(t_{a}), P_{a\to i}^{dbp}(t_a), P_i^{dbp}(T_i)$ and
872: $\chi_i^{dbp}(T_i)$ respectively and $P_{i\to a}(h_i)$ in place of
873: $P_{i\to a}^{sp}(h_i)$.
874: 
875: To compute the entropy (\ref{eq:entropy}) we first need
876: \begin{eqnarray*}
877: P_{a}(t_{a}) & = & c_{a}^{-1}\sum_{\left\{ h_i \right\}} \prod_{i\in
878:  a}P_{i\to a}\left(h_i\right)\prod_{i\in a}\chi^{\tau}_{i\to
879:  a}\left(t_{a},h_i\right)\\ & = & c_{a}^{-1}\prod_{i\in
880:  a}\sum_{\left\{ h_i\right\}} P_{i\to
881:  a}\left(h_i\right)\chi^{\tau}_{i\to a}\left(t_{a},h_i\right)
882: \end{eqnarray*} 
883: Thus calling
884: \begin{equation}
885: f_{a\to i}=\sum_{\left\{ h_i\right\} }P_{i\to
886: a}\left(h_i\right)\chi^{\tau}_{i\to a}\left(t_{a},h_i\right)
887: \end{equation}
888: we have that
889: \begin{eqnarray}
\sum_{\left\{ t_{a}\right\} }P_{a}(t_{a})\log P_{a}(t_{a}) =
-\log c_{a}+\sum_{\left\{ t_{a}\right\}
}P_{a}(t_{a})\sum_{i\in a}\log f_{a\to i}\nonumber \\ =
-\log c_{a}+\sum_{i\in a}\sum_{\left\{ t_{a}\right\}
}P_{a}(t_{a})\log f_{a\to i}\label{eq:ca}
895: \end{eqnarray}
896: Writing  $\omega_{a\to i}=\sum_{\left\{ t_{a}\right\} }P_{a}(t_{a})\log
897: f_{a\to i}$ we get
898: \begin{eqnarray}
\sum_{a}\left(n_{a} - 1\right)\sum_{i\in a}\omega_{a\to i} &=&
\sum_{i}\sum_{a\in i}\sum_{j\in a\setminus i}\omega_{a\to j}\nonumber
\\ &=& \sum_{i}\sum_{a\in i}\sum_{j\in a\setminus i}\sum_{\left\{
t_{a}\right\} }P_{a}(t_{a})\log f_{a\to j}\nonumber \\
 &=& \sum_{i}\sum_{a\in i}\sum_{\left\{ t_{a}\right\} }
P_{a}(t_{a})\log\prod_{j\in a\setminus i} f_{a\to j}\nonumber \\
 &=& \sum_{i}\sum_{a\in i}\sum_{\left\{ t_{a}\right\} }\sum_{\left\{
t_{b}\right\} _{b\in i\setminus a}}P_{i}(T_{i})\log\prod_{j\in a\setminus
i} f_{a\to j}\nonumber \\  &=& \sum_{i}\sum_{a\in i}\sum_{\left\{
T_{i}\right\} } P_{i}(T_{i}) \log\prod_{j\in a\setminus i}f_{a\to
j}\label{eq:star}
910: \end{eqnarray} 
911: The term inside the logarithm above reads
912: \begin{eqnarray} 
\prod_{j\in a\setminus i}f_{a\to j} =  \sum_{\left\{ h_j\right\}}
 \prod_{j\in a\setminus i} \chi^{\tau}_{j\to a} \left(t_{a},h_j\right)
 \prod_{j\in a\setminus i} P_{j\to a}(h_j)  =  \frac{1}{D_{a\to
 i}}P_{a\to i}(t_{a})
917: \end{eqnarray}
918: where $D_{a\to i}$ is an appropriate normalization constant. Going
919: back to Eq.~(\ref{eq:star}), we have
920: \begin{eqnarray}
921: \sum_{a}(n_{a}-1)\sum_{i\in a}\omega_{a\to i} = -\sum_{i}\sum_{a\in
922: i}\log D_{a\to i} + \sum_{i}\sum_{a\in i}\sum_{\left\{ T_{i}\right\}
923: }P_{i}\left(T_{i}\right)\log P_{a\to i}(t_{a})
924: \label{eq:D+P}
925: \end{eqnarray} 
926: The second term in the right-hand side equals
927: \begin{eqnarray}
928: \sum_{i}\sum_{\left\{ T_{i}\right\}
929:  }P_{i}\left(T_{i}\right)\log\prod_{a\in i}P_{a\to i}(t_{a})
930:   & = & \sum_{i}\sum_{\left\{ T_{i}\right\}
931:  }P_{i}\left(T_{i}\right)\log\chi_{i}(T_{i})\prod_{a\in i}P_{a\to
932:  i}(t_{a})\nonumber \\ & = & \sum_{i}\sum_{\left\{ T_{i}\right\}
933:  }P_{i}\left(T_{i}\right)\log Q_{i}(T_{i})\nonumber \\ & = &
934:  \sum_{i}\sum_{\left\{ T_{i}\right\} }P_{i}\left(T_{i}\right)\log
935:  P_{i}(T_{i})+\sum_{i}\sum_{\left\{ T_{i}\right\}
936:  }P_{i}\left(T_{i}\right)\log c_{i}
937: \label{eq:P}
938: \end{eqnarray}
939: where in the second step above $\chi_{i}(T_{i})$ has been artificially
940: multiplied inside the logarithm (we can do it because there is a
941: $P_{i}(T_{i})$ outside) and $P_{i}(T_{i}) =
942: \frac{1}{c_{i}}Q_{i}(T_{i})$. Eqs.~(\ref{eq:D+P}),(\ref{eq:P}) give:
943: \begin{eqnarray} 
944: \sum_{a}(n_{a}-1)\sum_{i\in a}\omega_{a\to i} = -\sum_{i}\sum_{a\in
945:   i}\log D_{a\to i} +\sum_{i}\sum_{\left\{ T_{i}\right\}
946:   }P_{i}\left(T_{i}\right)\log P_{i}(T_{i})+\sum_{i}\log
947:   c_{i}
948: \label{eq:D}
949: \end{eqnarray}
950: Going back to the first expression of the entropy
951: Eq.~(\ref{eq:entropy}), and using Eq.~(\ref{eq:ca}) and
952: Eq.~(\ref{eq:D}) we get:
953: \begin{eqnarray}
954: S & = & -\sum_{i}\sum_{\left\{ T_{i}\right\} } P_{i}(T_{i})\log
955:  P_{i}(T_{i}) + \sum_{a}\left(n_{a} - 1\right)\sum_{\left\{
956:  t_{a}\right\} }P_{a}(t_{a})\log P_{a}(t_{a})\nonumber \\ & = &
957:  \sum_{i}\log c_{i}-\sum_{i}\sum_{\left\{ T_{i}\right\}
958:  }P_{i}(T_{i})\log Q_{i}(T_{i}) +
959:  \sum_{a}\left(n_{a}-1\right)\sum_{\left\{ t_{a}\right\}
960:  }P_{a}\left(t_{a}\right)\log P_{a}(t_{a})\nonumber \\ & = &
961:  \sum_{i}\log c_{i}-\sum_{a}\left(n_{a}-1\right)\log
962:  c_{a}-\sum_{i}\sum_{a\in i}\log D_{a\to i}
963: \end{eqnarray}
where the constants are defined in Eqs.~(\ref{ci})--(\ref{D}).
965: 
966: \section{Tree factor graphs}
967: \label{tree} 
The argument is similar to the one given in an analogous
``tutorial'' appendix in Ref.~\cite{Barthel_Hartmann} for the Vertex
Cover problem.\\ 
We will first build a reference solution ${\mathbf x}$, and then show
that every solution of $E=0$ is connected to it. ${\mathbf x}$ will be
built from the leaves to the root. Suppose the variables are labeled
in an ordering that respects distances to the root, such that the
first ones are the leaves and the last one is the root. In such an
ordering, the parents (resp. child) of $i$ are neighbors with labels
$j<i$ (resp.  $j>i$). We will fix $x_i$ iteratively: once $x_j$ for
$j<i$ are fixed, all parents of $i$ are fixed; then for $x_i$ there
are two possibilities: either its parents force it to take a specific
value, or they do not. In the first case we choose $x_i$ to take the
forced value; in the second one we choose the value that satisfies the
child clause. Now we can show that ${\mathbf x}$ is connected with
every other solution ${\mathbf s}$ (and thus every two solutions are
984: connected). It is easy to see that the configurations ${\mathbf
985: y}^{(k)}$ defined by ${\mathbf y}^{(k)}_j = s_j$ if $j<k$ and
986: ${\mathbf y}^{(k)}_j = x_j$ if $j\geq k$ form a path of configurations
987: connecting ${\mathbf x}$ and ${\mathbf s}$. Clearly ${\mathbf
988: y}^{(1)}={\mathbf x}$ and ${\mathbf y}^{(n)}={\mathbf s}$. Also they
989: are all solutions, since if ${ \mathbf y}^{(k)}$ is a solution, then
990: clearly ${\mathbf y}^{(k+1)}$ is also a solution: if they are
991: different it is because ${\mathbf y}^{(k+1)}_{k+1}$ has been chosen to
satisfy the child clause (and it was not forced by its parents in
${\mathbf s}$, and thus not in ${\mathbf y}^{(k+1)}$ either).
994: 
We can now look for solutions of $\hat E = 0$ on a satisfiable tree (with
boundary conditions). Let us start with a free-boundary tree with $2$-
and $3$-clauses: it is easy to see that the configuration with all $*$
assignments has $\hat E = 0$. It is also clearly unique: suppose that
there is a solution with some variable set to $\sigma\neq *$. Then
there is necessarily one of its neighboring clauses in which the two
(or one) remaining variables are fixed so as not to satisfy the
clause. Repeating the argument recursively for one of them, we
obtain a never-ending path of fixed variables in the tree. But as
trees have no loops, this is a contradiction.
1005: 
There is also exactly one such solution for a satisfiable tree with
boundary conditions (if we disregard the $V_i$ constraints on the
variables with assigned boundary values). We will build it explicitly
using the so-called unit clause propagation (UCP).  The UCP procedure
consists in removing (in this case starting from the boundary) every
fixed variable by (a) removing all clauses satisfied by the variable
and (b) removing the variable from all clauses in which it appears
without satisfying the clause (if the original tree is satisfiable,
no $0$-clause can appear in this erasure step).  Then every
$1$-clause that appears is taken and its variable fixed in order to
satisfy the clause, and the procedure starts again from the beginning
until no more $1$-clauses show up. The resulting graph is
boundary-free and has no $1$-clauses.
1019: 
1020: The promised solution will be built by taking all variables fixed by
1021: UCP with their assigned value, and by assigning the value $*$ to the
1022: remaining ones. The resulting configuration $\hat x$ has $\hat E(\hat
1023: x)=0$.  Clearly the constraints $V_i$ (see Eq.~(\ref{eq:general2}))
1024: are satisfied by $\hat x$ for all $i$ fixed by UCP (because they are
1025: ``frozen'' by their neighbors). We easily see that this partial
assignment is the unique one that can give $\hat E = 0$. Using the
1027: fact that the subgraph produced by UCP has no boundary condition and
1028: that the unique solution for $\hat E=0$ on that subgraph is the
1029: all-$*$ one, we see that the proposed configuration is indeed the
1030: unique solution.
1031: 
1032: Note also that every solution of $E=0$ will coincide with $\hat x$ in
1033: the $-1,1$-assigned variables of the latter, because these variables
1034: were fixed by UCP and thus are forced in every satisfying
1035: configuration. Moreover, if one takes an index $i$ such that $\hat
1036: x_i$ is $*$, then there is at least one solution of $E(s)=0$ with
1037: $s_i=1$ (resp. $-1$): by fixing $s_i$ and applying again UCP one
1038: cannot get any contradiction (i.e. a $0$-clause) because the subgraph
has neither loops nor $1$-clauses. The remaining graph is still loop-free,
1040: and thus trivially satisfiable.
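
As a small worked illustration of the construction above (the formula
and the partial boundary assignment are hypothetical choices made here,
with the convention that $x_i=1$ means true), consider the chain-like
tree formula $(x_1\vee x_2)\wedge(\bar x_2\vee x_3)$ and fix only the
leaf $x_1$. If $x_1=-1$, removing $x_1$ turns the first clause into the
$1$-clause $(x_2)$, so UCP fixes $x_2=1$; removing $x_2$ then turns the
second clause into the $1$-clause $(x_3)$, so $x_3=1$, and $\hat
x=(-1,1,1)$ coincides with the only solution of $E=0$. If instead
$x_1=1$, the first clause is satisfied and removed, no $1$-clause
appears, and $\hat x=(1,*,*)$. In agreement with the statements above,
the solutions of $E=0$ compatible with $x_1=1$ are
\[
(x_2,x_3)\in\left\{(-1,-1),\,(-1,1),\,(1,1)\right\} ,
\]
so that each of the two $*$ variables takes both values $\pm 1$ in
some solution.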
1041: 
1042: 
1043: \begin{thebibliography}{99}
1044: 
1045: \bibitem{TCS} Special Issue on {\it NP-hardness and Phase transitions},
1046: O. Dubois, R. Monasson, B. Selman and R. Zecchina (eds.),
1047: Theor. Comp. Sci. \textbf{265}, Issue: 1-2, August 28 (2001).
1048: 
1049: \bibitem{Codes} H. Nishimori, {\it Statistical Physics of Spin Glasses and
1050: Information Processing}, Oxford University Press, 2001
1051: 
1052: \bibitem{Aldous} D. Aldous, J. M. Steele, Probability on Discrete
1053: Structures (Vol. 110 of Encyclopaedia of Mathematical Sciences),
1054: ed. H. Kesten, p. 1-72. Springer, 2003.
1055: 
1056: \bibitem{Guerra:Talagrand} F. Guerra, Comm. Math. Phys. {\bf
1057: 233}, 1 (2003); M. Talagrand, C.R. Acad. Sci. Paris, Ser. I {\bf 337},
1058: 111 (2003)
1059: 
1060: \bibitem{Aldous_z2} D. Aldous, Random Structures and Algorithms {\bf
1061: 18} 381 (2001)
1062: 
1063: 
1064: \bibitem{MPV} M. Mezard, G. Parisi, M.A. Virasoro, {\it Spin Glass
1065: Theory and Beyond}, World Scientific, (1987)
1066: 
1067: \bibitem{pspin} S. Cocco, O. Dubois, J. Mandler, R. Monasson.
1068: Phys. Rev. Lett. {\bf 90}, 047205 (2003); M. Mezard, F. Ricci-Tersenghi,
1069: R. Zecchina, J. Stat. Phys. {\bf 111}, 505 (2003)
1070: 
1071: \bibitem{science} M. Mezard, G. Parisi, R. Zecchina, Science {\bf
1072: 297}, 812 (2002)
1073: 
1074: \bibitem{MZ} M. Mezard and R. Zecchina, Phys.Rev. {\bf E 66}, 056126 (2002)
1075: 
1076: \bibitem{BMZ} A. Braunstein, M. Mezard, R. Zecchina, {\it Survey
1077: propagation: an algorithm for satisfiability}, ArXiv:
1078: xxx.lanl.gov/ps/cs.CC/0212002 (2002)
1079: 
1080: \bibitem{Gallager} R.G. Gallager, Information Theory and Reliable
1081:   Communications, Wiley, New York, 1968
1082: 
1083: \bibitem{Pearl} J. Pearl, {\sl Probabilistic Reasoning in Intelligent Systems},
1084: 2nd ed. (San Francisco, MorganKaufmann,1988)
1085: 
1086: \bibitem{Spielman} D.A. Spielman, in {\sl Lecture Notes in Computer
1087:   Science} {\bf 1279}, 67 (1997) 
1088: 
1089: \bibitem{Sourlas} N. Sourlas, in {\sl From Statistical Physics to
1090: Statistical Inference and Back}, P. Grassberger and J-P. Nadal Edts.,
1091: Kluwer Academic, Dordrecht (1994)
1092: 
1093: \bibitem{turbo} C. Berrou, A. Glavieux and P. Thitimajshima,
1094:   Proc. Int. Conf. Comm, 1064-1070 (1993)
1095: 
1096: \bibitem{Forney} G.D. Forney, Jr., IEEE Trans. Inform. Theory, {\bf
1097:   47}, 520 (2001)
1098: 
1099: \bibitem{good_codes1} M.G. Luby, M. Mitzenmacher, M.A. Shokrollahi and
1100:   D.A. Spielman, IEEE Trans. Inform. Theory, {\bf 47}, 569 (2001)
1101: 
1102: \bibitem{good_codes2} S-Y. Chung, G.D. Forney,Jr., T.J. Richardson and
1103:   R. Urbanke, IEEE Comm. Letters {\bf 5}, 58 (2001)
1104: 
1105: \bibitem{MacKay} D.J.C. MacKay, IEEE Trans. Inform. Theory {\bf 45},
1106:   399 (1999)
1107: 
1108: \bibitem{cavity} M. Mezard, G. Parisi, M.A. Virasoro, Europhys. Lett. {\bf
1109: 1}, 77 (1986); M. Mezard, G. Parisi, Eur. Phys. J. {\bf B 20}, 217
1110: (2001); M. Mezard, G. Parisi, J. Stat. Phys.  {\bf 111}, 1 (2003)
1111: 
1112: 
1113: \bibitem{Cook_review} S.A. Cook, D.G. Mitchell, {\it Finding Hard
1114: Instances of the Satisfiability Problem: A Survey}, In: {\sl
1115: Satisfiability Problem: Theory and Applications}, Du, Gu and Pardalos
1116: (Eds).  DIMACS Series in Discrete Mathematics and Theoretical Computer
1117: Science, Volume 35, (1997)
1118: 
1119: \bibitem{nature} R. Monasson, R. Zecchina, S. Kirkpatrick,
1120: B. Selman, and L.  Troyansky, Nature \textbf{400}, 133 (1999);
1121: 
1122: \bibitem{MPR} A. Montanari, G. Parisi, F. Ricci-Tersenghi, ArXiv:
1123: xxx.lanl.gov/ps/cond-mat/0308147 (2003)
1124: 
1125: \bibitem{joker} 
1126: A. Braunstein, M. Mezard, M. Weigt, R. Zecchina, {\it Constraint
1127: Satisfaction by Survey Propagation}, ArXiv
1128: lanl.arXiv.org/ps/cond-mat/0212451 (2002);
1129: G. Parisi, {\it On the survey-propagation equations for the random
1130: K-satisfiability problem}, ArXiv: xxx.lanl.gov/ps/cs.CC/0212009
1131: (2002);
1132: G. Parisi, {\it On local equilibrium equations for clustering states}
1133: ArXiv: xxx.lanl.gov/ps/cs.CC/0212047 (2002);
1134: 
1135: \bibitem{factor_graph} F.R. Kschischang, B.J. Frey, H.-A. Loeliger,
{\it IEEE Trans. Inform. Theory} {\bf 47}, 498 (2001).
1137: 
1138: \bibitem{GBP} Yedidia, J.S.; Freeman, W.T.; Weiss, Y., {\it Generalized
1139: Belief Propagation}, Advances in Neural Information Processing Systems
1140: (NIPS) {\bf 13}, 689 (2000)
1141: 
1142: \bibitem{states} There exist multiple definitions of states (clusters)
for finite sizes (e.g. $k$-flip stable, with $\lim_{N \to \infty} k/N
=0$, \cite{cavity,Biroli:Monasson}) which lead to equivalent
1145: thermodynamical limits in which the SP-cavity formalism is assumed to
1146: hold.
1147: 
1148: \bibitem{pspin-note} In particularly simple cases like the so called
1149: diluted p-spin glasses (or random sparse parity check
1150: equations)~\cite{pspin}, the introduction of $*$ states has allowed
1151: for an explicit construction of an exponential number of clusters of
1152: solutions and to prove the exactness of the so called one step replica
1153: symmetry breaking (RSB) solution in the scheme of Parisi~\cite{MPV}.
1154: However, for such models the $*$ variables are in a sense trivial in
1155: that they do not depend on the cluster and their (recursive)
1156: elimination leads to a residual model which can be solved exactly by a
1157: simple annealed/first-moment calculation.  For K-SAT the situation is
1158: more complex (and more general) in that variables are expected to
1159: become $*$ depending on the clusters.
1160: 
1161: 
1162: \bibitem{note} In the computation of the free energy $\lambda$ should
1163: be taken proportional to the temperature $T$ in the limit $T \to 0$.
1164: 
\bibitem{Biroli:Monasson} G. Biroli, R. Monasson, Europhys. Lett. {\bf 50},
155 (2000)
1167: 
1168: 
\bibitem{Barthel_Hartmann} W. Barthel, A.K. Hartmann, {\it Clustering
1170: analysis of the ground-state structure of the vertex cover problem},
1171: cond-mat/0403193
1172: 
1173: \bibitem{napolano} A. Braunstein, V. Napolano, R. Zecchina {\it
1174: Clustering in random SAT}, in preparation
1175: 
1176: 
1177: \end{thebibliography} 
1178: \end{document}
1179: 
1180: