
1: \documentclass[aps,showpacs,superscriptaddress]{revtex4}

2: %\documentclass[aps,pre,preprint,superscriptaddress,showpacs]{revtex4}

3: %\documentclass[aps,prl,twocolumn,superscriptaddress,showpacs]{revtex4}

4:

5: \usepackage{graphicx}

6:

7: \begin{document}

8:

9: \title{Survey Propagation as local equilibrium equations}

10:

11: \author{Alfredo Braunstein}

12: \email[]{abraunst@sissa.it}

13: \affiliation{SISSA, Via Beirut 9, 34100 Trieste, Italy}

14: \affiliation{ICTP, Strada Costiera 11, I-34100 Trieste, Italy}

15:

16: \author{Riccardo Zecchina}

17: \email[]{zecchina@ictp.trieste.it}

18: \affiliation{ICTP, Strada Costiera 11, I-34100 Trieste, Italy}

19:

20:

21: \begin{abstract}

22: It has been shown experimentally that a decimation algorithm based on

23: Survey Propagation (SP) equations allows one to solve efficiently some

24: combinatorial problems over random graphs.  We show that these

25: equations can be derived as sum-product equations for the computation

26: of marginals in an extended space where the variables are allowed to

27: take an additional value -- $*$ -- when they are not forced by the

28: combinatorial constraints.  An appropriate ``local equilibrium

29: condition'' cost/energy function is introduced and its entropy is

30: shown to coincide with the expected logarithm of the number of

31: clusters of solutions as computed by SP. These results may help to

32: clarify the geometrical notion of clusters assumed by SP for the

33: random K-SAT or random graph coloring (where it is conjectured to be

34: exact) and to explain which kind of clustering operation or

35: approximation is enforced in general/small sized models in which it is

36: known to be inexact.

37:

38: \end{abstract}

39:

40: \pacs{89.20.Ff, 75.10.Nr, 02.60.Pn, 05.20.-y}

41:

42: \maketitle

43:

44: \section{Introduction}

45:

46: Recent developments in statistical physics of disordered systems have

47: shown a remarkable convergence of themes with other disciplines such

48: as computer science (e.g. combinatorial optimization~\cite{TCS}),

49: information theory (e.g. error correcting codes~\cite{Codes}) and

50: discrete mathematics (e.g. random

51: structures~\cite{Aldous,Guerra:Talagrand,Aldous_z2}).  While the study

52: of a typical static measure characterizing the slow dynamics of both

53: physical and algorithmic processes is the unifying issue in

54: out-of-equilibrium problems, the study of the geometrical structure of

55: ground states of spin-glass-like energy functions $E$ is central to

56: the understanding of the onset of computational complexity in random

57: combinatorial problems.  The combinatorial problem of satisfying a

58: given set of constraints is viewed in the physics framework as the

59: problem of minimizing $E$ and ``ground state configurations'',

60: ``solutions'' or ``satisfying assignments'' should be understood as

61: synonymous.

62:

63: Central to any attempt to provide a complete theory of random

64: combinatorial problems is the notion of pure states, or clusters of

65: configurations, on which the probability measure over optimal

66: configurations is assumed to concentrate.  Recently, a new class of

67: algorithms has been proposed \cite{MZ,science,BMZ} that have shown

68: surprising capabilities in dealing with the (exponential)

69: proliferation of clusters of metastable states and therefore in

70: solving random instances of combinatorial problems which are difficult

71: to solve for local search heuristics.  Such algorithms are based on

72: the so called Survey Propagation (SP) equations in which indeed a

73: decomposition of the ground states probability distribution -- the

74: Gibbs measure -- into an exponential number of clusters is assumed

75: from the beginning.  The SP equations can be viewed as zero

76: temperature cavity equations~\cite{cavity} formulated for single

77: instances at a level equivalent to the one-step replica symmetry

78: breaking (1-RSB) scenario~\cite{states}.

79:

80:

81: The SP algorithm consists in a message-passing technique which is

82: closely related to another message-passing method -- known as

83: sum-product or Belief Propagation (BP)~\cite{Gallager,Pearl} algorithm

84: -- which has shown amazing performance for solving the decoding

85: problem~\cite{Spielman} in error correcting codes based on sparse

86: graph

87: encodings~\cite{Sourlas,turbo,Forney,good_codes1,good_codes2,MacKay}.

88:

89: The aim of this study is to discuss the precise (finite size)

90: structure of the SP equations, linking them to the BP formalism. This

91: is a well defined mathematical issue, independent of the physical

92: origin of the equations. Due to the algorithmic relevance of both BP

93: and SP for coding theory and combinatorial optimization, it is a basic

94: question to understand what these equations are doing for a finite

95: number of variables $N$ since this is the regime in which they are

96: used.

97:

98: As we shall see, the SP ``algorithmic'' equations at finite $N$ are

99: performing a very specific clustering operation over the solution

100: space.  Moreover, the number of such clusters in the Bethe

101: approximation will be shown to coincide with the prediction of the

102: cavity theory.

103:

104: These results will be obtained by showing that the SP equations are

105: the BP equations for a modified combinatorial problem. By this mapping

106: we clarify how the hypothesis making BP exact (that is, the absence of correlations

107: between distant variables) translates into the absence of correlations between

108: ``frozen'' variables belonging to different clusters: SP produces a

109: collapse of the internal structure of clusters and eliminates

110: correlations among the unfrozen parts.

111:

112: We shall present the results in the case of the K-SAT problem even

113: though the method could be applied to any discrete combinatorial model

114: defined over locally tree--like graphs.  The results concerning the

115: cluster entropy will be compared with the prediction of the 1-RSB

116: cavity analysis for random K-SAT.

117:

118: The line of reasoning of the paper consists in showing that the SP

119: equations can be re-derived as sum-product or BP equations --

120: i.e. simple replica symmetric (RS) cavity equations -- over an

121: extended configuration space.  The definition of this space consists

122: in associating to each binary variable a new extra value ``$*$'' which

123: will correspond to the possibility that the variable is not forced to

124: take one of the binary values $\{ -1,+1 \}$ in a given solution

125: \cite{pspin-note}.  We will introduce a {\it local equilibrium

126: condition} (LEC) cost-energy function $\hat E$ derived from $E$,

127: acting over the extended space, together with a (technical) duality

128: transformation needed to preserve the locality of the interactions for

129: implementing properly the BP equations. The following two statements

130: will hold: {\it

131: \begin{itemize}

132: \item[{\bf (I)}] Marginals given by the BP equations derived from

133: $\hat E$ coincide with the marginals given by SP on the original

134: problem.

135:

136: \item[{\bf (II)}] Bethe approximation to the entropy of $\hat E$ in

137: the enlarged space as computed by BP coincides with the logarithm of

138: the number of clusters of solutions -- the so called ``complexity'' --

139: predicted by SP on the original problem.

140: \end{itemize}

141: }

142:

143: The proof of {\bf (I)} will be achieved by finding a direct connection

144: between quantities (``messages'') propagated by the two algorithms at

145: each iteration step.  We recall that the Bethe approximation to the

146: entropy is exact over trees without and with boundary conditions,

147: i.e. with leaf variables taking given values.

148:

149: The possibility of interpreting SP as appropriate BP equations may

150: have consequences for their rigorous probabilistic analysis, through a

151: proper application/generalization of the known methods for the

152: analysis of convergence of BP like equations over random graphs (as it

153: has already been done for problems like the random matching

154: \cite{Aldous_z2}). Some preliminary exact numerical results that we

155: give in the concluding section are in support of this possibility.

156:

157: Throughout the paper we heavily rely on the notations of

158: refs.~\cite{MZ,BMZ} for what concerns the SP equations.

159:

160: \section{Survey Propagation, Belief Propagation and K-SAT}

161:

162: SP and BP (or sum-product) are examples of message-passing procedures.

163: In BP the unknowns which are evaluated by iteration are the marginals

164: over the solution space of the variables characterizing the

165: combinatorial problem (e.g. binary ``spin'' variables). According to

166: the physical interpretation, the quantities that are evaluated by SP

167: are the probability distributions of local fields over the set of

168: clusters. That is, while BP performs a ``white'' average over

169: solutions, SP takes care of cluster to cluster fluctuations, telling

170: us which is the probability of picking up a cluster at random and

171: finding a given variable completely biased (frozen) in a certain

172: direction -- that is forced to take the same value within the cluster

173: -- or unfrozen.

174:

175: In both SP and BP one assumes knowledge of the marginals of all variables in

176: the temporary absence of one of them and then writes the marginal

177: probability induced on this ``cavity'' variable in absence of another

178: third variable interacting with it (i.e. the so called Bethe lattice

179: approximation for the problem).  These relations define a closed set of

180: equations for such cavity marginals that can be solved iteratively

181: (this procedure is known as the message-passing technique). The equations

182: become exact if the cavity variables acting as inputs are

183: uncorrelated. They are conjectured to be an asymptotically exact

184: approximation over some random locally tree--like structures~\cite{MZ}.

185:

186: The $K$-satisfiability problem ($K$-SAT) is easily stated: Given $N$

187: Boolean variables each of which can be assigned the value True ($1$) or

188: False ($-1$), and $M$ clauses between them, is there a ``SAT-assignment'',

189: i.e. an assignment of the Boolean variables which satisfies all

190: constraints? A clause takes the form of an ``OR'' function of $K$

191: variables in the ensemble (or their negations).  A SAT formula in

192: conjunctive normal form over $N$ Boolean variables $\{\sigma_i =\pm 1\}$ can

193: be written as

194: \begin{equation}

195: \mathcal{F}=\prod_{a\in A}C_{a}

196: \label{F}

197: \end{equation}

198: where

199: \begin{equation}

200: C_a = 1 - E_a \; \; \; , \; \; E_a \equiv \prod_{i\in

201: a}\delta(J_{a,i},\sigma_i)

202: \label{eq:clause}

203: \end{equation}

204: where $\delta(x,y)$ is the Kronecker function (also written as

205: $\delta_{x,y}$ in the rest of the paper) and $\{ C_a \}$ are the

206: clauses encoded by the parameters $J_{a,i}$ as follows: $J_{a,i}=\pm

207: 1$ if respectively $\pm \sigma_i$ appears in clause $a$ (in Boolean

208: notation we would have $J_{a,i}=-1$ (resp. $+1$) if the Boolean

209: variable $x_i$ (resp. $\neg x_i$) appears in clause $a$).  We call

210: $E_a$ the ``energy'' of a clause.  The symbol $i\in a$ will denote the

211: set of variables participating in clause $a$. Additionally it will be

212: useful to use the symbol $a\in i$ to denote the set of clauses

213: depending on variable $i$. The clause size $|\{i:i\in a\}|$ will be

214: denoted by $n_a$ ($n_a\equiv K$ for $K$-SAT), and the variable

215: connectivity $|\{a:a\in i\}|$ will be denoted by $n_i$.
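
As a concrete illustration of this encoding (the clause below is a hypothetical example, not taken from any specific instance), consider in Boolean notation the clause $x_1 \vee \neg x_2 \vee x_3$. With the Boolean convention stated above one has $J_{a,1}=-1$, $J_{a,2}=+1$ and $J_{a,3}=-1$, so that
\begin{displaymath}
E_a = \delta(-1,\sigma_1)\,\delta(+1,\sigma_2)\,\delta(-1,\sigma_3) \; .
\end{displaymath}
Hence $E_a=1$ (and $C_a=0$) only for the single assignment $(\sigma_1,\sigma_2,\sigma_3)=(-1,1,-1)$, the one that falsifies all three literals, while the remaining $2^3-1=7$ assignments give $C_a=1$. For this clause $n_a=3$.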

216:

217: The satisfiability problem consists in determining the existence of an

218: assignment to the Boolean variables which satisfies all clauses at the

219: same time, that is such that $\mathcal{F}=1$.  We may write the energy

220: function which counts the number of violated clauses as $E=\sum_a E_a$

221: so that the satisfiability problem becomes finding the zero energy

222: ground states of $E$. The random version of $K$-SAT corresponds to the

223: case in which the variables appearing in each clause are chosen

224: uniformly at random, and negated with probability $\frac12$. For the

225: sake of simplicity, hereafter we concentrate mostly on the $3$-SAT

226: case.

227:

228: The energy function $E$ of a random $3$-SAT formula is a spin glass

229: model defined over a locally tree-like graph that can be studied

230: with the techniques of statistical physics of random systems, namely

231: the replica and cavity methods.

232:

233: Numerical experiments have shown that a decimation algorithm based on

234: SP equations allows one to find satisfying assignments of critically

235: constrained random $3$-SAT instances -- that is random formulas with

236: $\alpha=M/N$ just below a critical ratio $\alpha_c \simeq 4.267$ where

237: formulas are conjectured to become unsatisfiable with high probability

238: -- with a computational cost roughly scaling as $N \log N$~\cite{BMZ}

239: while the other known algorithms typically take times that are

240: exponential in $N$~\cite{Cook_review,nature}.  According to the cavity

241: -- or SP -- analysis, in this hard region (more precisely for $\alpha

242: \in [4.15,4.267]$~\cite{MZ,MPR}) there is a genuine one step RSB

243: phase, in which the space of solutions decomposes into an exponential

244: number of clusters and where metastable states are even more numerous.

245:

246: As discussed in great detail in ref.~\cite{MZ}, one crucial feature

247: that comes out from the SP analysis is the distinction between frozen

248: and unfrozen variables within the different clusters and we shall

249: introduce a formalism which naturally incorporates such phenomenon

250: (see also refs.~\cite{joker}).

251:

252: We want to represent the condition that a variable is not forced

253: to take any specific value in a given ground state (unfrozen), and to

254: this end we consider the configuration space of three-valued variables

255: $s_{i}\in\left\{-1,*,1\right\}$ instead of $\sigma_i\in \{-1,1\}$.

256:

257: We observe that $C_{a}$ as defined in Eq.~(\ref{eq:clause}) can be

258: evaluated also in extended variables: it behaves as if variables with

259: the $*$ value could be chosen to the best of $-1$ or $1$ and thus

260: satisfy the clause.  This gives the name ``joker state'' to the value

261: $*$. For a configuration $s^{(i,x)}$ such that $s^{(i,x)}_i = x$ and

262: $s^{(i,x)}_j = s_j$ for $j\neq i$ call

263: \begin{equation}

264: C_{a}^{i,x}(s)=C_{a}(s^{(i,x)})

265: \end{equation}

266: and introduce the constraint over $\{-1,*,1\}^{n}$ configurations given

267: by

268: \begin{equation}

269: V_{i}=\delta_{s_{i},*}\prod_{a\in i} C_{a}^{i,-1} C_{a}^{i,1} +

270:   \sum_{\sigma=\pm 1}\delta_{s_{i},\sigma} \prod_{a\in i}

271:   C_{a}^{i,\sigma}\left(1- \prod_{a\in i} C_{a}^{i,-\sigma}\right)

272: \label{eq:general2}

273: \end{equation}

274: The LEC formula derived from $\mathcal{F}$ will be defined as

275: \begin{equation}

276: \mathcal{G}=\prod_{i}V_{i}.

277: \label{G}

278: \end{equation}

279: Note that $V_i$ depends only on $(s_{j})_{j\in a, a\in i}$ and

280: therefore preserves the ``locality'' of the structure, if any, of the

281: original formula.  A solution of the LEC problem is a configuration

282: $\mathbf{s}=(s_{i}) _{i\in I}\in\left\{-1,*,1\right\} ^{n}$ such that

283: $\mathcal{G}\left(\mathbf{s}\right)=1$. As a particular case, a

284: solution $\mathcal{G}(\mathbf{s})=1$ such that $s_{i}\in\left\{ \pm

285: 1\right\} $ is also a solution of $\mathcal{F}$.
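
A minimal worked example may help to fix ideas. The formula is hypothetical and uses the Boolean convention for $J_{a,i}$ stated after Eq.~(\ref{eq:clause}). Take a single clause $a = x_1 \vee x_2$, so that $J_{a,1}=J_{a,2}=-1$ and, on extended configurations, $C_a = 1-\delta_{s_1,-1}\delta_{s_2,-1}$ (a $*$ never matches $J_{a,i}$ and therefore satisfies the clause). Then $C_a^{1,1}=1$ and $C_a^{1,-1}=1-\delta_{s_2,-1}$, and Eq.~(\ref{eq:general2}) gives
\begin{displaymath}
V_1 = \delta_{s_1,*}\left(1-\delta_{s_2,-1}\right) + \delta_{s_1,1}\,\delta_{s_2,-1} \; ,
\end{displaymath}
the term with $s_1=-1$ vanishing because $1-C_a^{1,1}=0$. By symmetry,
\begin{displaymath}
V_2 = \delta_{s_2,*}\left(1-\delta_{s_1,-1}\right) + \delta_{s_2,1}\,\delta_{s_1,-1} \; .
\end{displaymath}
The product $V_1 V_2$ equals $1$ only for $\mathbf{s}=(*,*)$: the choice $s_1=1$ requires $s_2=-1$, which $V_2$ forbids, and symmetrically for $s_2=1$. None of the three solutions $(1,1)$, $(1,-1)$, $(-1,1)$ of the original clause is a solution of the LEC problem, since no variable is forced in any of them; all three are recovered from $(*,*)$ by assigning values to the joker variables.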

286:

287: To fix ideas it might be useful to compare the LEC cost-energy

288: function with the original 3-SAT one. To this end we adopt

289: the so--called factor graph representation~\cite{factor_graph}: Given

290: a formula $\mathcal{F}$, we define its associated \emph{factor graph}

291: as a bipartite undirected graph $G=\left(V;E\right)$, having two types

292: of nodes, and edges only between nodes of different type: {\bf (i)}

293: Variable nodes, each one labeled by a variable index in $I=\left\{

294: 1,\dots,N\right\} $ and {\bf (ii)} Function nodes, each one labeled by

295: a clause index $a\in A$ ($|A|=M$). An edge $\left(a,i\right)$ will

296: belong to the graph if and only if $a\in i$ or equivalently $i\in a$.

297: For instance, the factor graph representation of the random $3$-SAT

298: problem consists in a bipartite graph with $N$ variable nodes having a

299: Poisson random connectivity of mean $3 \alpha$ and $M$ function nodes

300: with energy $E_a$ of uniform connectivity $3$ (a portion is shown in

301: part (a) of Fig.\ref{duality}).  The extended LEC spin glass energy

302: function reads:

303: \begin{equation}

304: \hat E =  \sum_{a=1}^M \hat E_a +\sum_{i=1}^N A_i

305: \end{equation}

306: where now $\hat E_a = 1-C_a$ is evaluated in the extended configuration

307: space and

308: \begin{equation}

309: A_i=\delta_{s_{i},*}\left(1-\delta_{E_i^{-1},E_i^1}\right) +

310: \sum_{\sigma=\pm 1}\delta_{s_{i},\sigma}\theta\left(E_i^\sigma -

311: E_i^{-\sigma}\right)

312: \end{equation}

313: with $E_i^\sigma=\sum_{a\in i}(1-C^{i,\sigma}_a)$ and $\theta(x)=1$ if

314: $x > 0$ and $0$ otherwise.  The factor graph of the LEC has $N$

315: additional function nodes (the $A_i$ terms enforcing the joker

316: condition) that extend over the second neighbors (inset (b) in

317: Fig. \ref{duality}).

318:

319: By inspecting Eq.~(\ref{G}) we notice a first problem, namely that we

320: have lost the locally tree-like structure of the original graph. There are

321: interaction terms between every (ordered) pair of neighboring variable

322: nodes $i,j\in a$ (in the original graph), and thus for instance every

323: such pair shares two constraints $V_i,V_j$ (making an effective

324: 2-loop). This introduces an obvious problem for implementing BP over

325: this combinatorial problem, and moreover would make it difficult to

326: compare both algorithms, as the underlying geometry is now

327: different. Fortunately, there is an easy (but unfortunately

328: notationally somewhat involved) way out. We will group together

329: neighbor variables, effectively performing a sort of duality

330: transformation over the graph. We describe the procedure explicitly

331: below (Note that this is a particularly simple case of a Kikuchi or

332: ``generalized belief propagation''-type approximation \cite{GBP}).

333:

334: We will define: {\bf (i.)}  $M$ multi-state variables, each one

335: corresponding to a tuple $t_a=\{t^{(i)}_a\}_{i\in a}$ ($t^{(i)}_a\in

336: \{-1,*,1\}$) ``centered'' on a clause $a$ and having (uniform)

337: connectivity $n_a$ ((c) in Fig.~\ref{duality}), and {\bf (ii.)} $N$

338: function nodes $\chi^{dbp}_i$ having Poisson connectivity, depending

339: on $T_i\equiv \{t_a\}_{a\in i}$ and both enforcing the joker state

340: condition and identifying the values of the single variables

341: $t^{(i)}_a$ shared by different tuples $a\in i$ ((d) in

342: Fig.~\ref{duality}).  An explicit expression of $\chi_i^{dbp}(T_i)$

343: (cf. Eq.~(\ref{eq:general2})) is

344: \begin{equation}

345: \chi^{dbp}_{i} = \sum_{\{s_i\}}\left(\prod_{a\in

346: i}\delta_{t_a^{(i)},s_i}\right) \left(\delta_{s_i,*}\prod_{a\in

347: i} C_{a}^{i,-1} C_{a}^{i,1} + \sum_{\sigma=\pm 1}\delta_{s_i,\sigma}

348: \prod_{a\in i} C_{a}^{i,\sigma}\left(1 - \prod_{a\in i}

349: C_{a}^{i,-\sigma}\right)\right)

350: \label{eq:dualenergy}

351: \end{equation}

352: We shall refer to the BP equations over the dual graph as {\it Dual

353: BP} (DBP).

354: \begin{figure}

355: \begin{center}

356: \includegraphics[width=0.5\textwidth,height=0.3\textheight]{duality}

357: \caption{(a) Portion of the original factor graphs, (b) LEC graph with

358: 3-state variables and additional constraints $A_i$ (black nodes) (c)

359: duality transformation (d) dual graph}

360: \label{duality}

361: \end{center}

362: \end{figure}

363:

364: \section{SP equations as BP equations over the dual graph}

365:

366: Basic SP and DBP iterations can be thought of as transformations in

367: the space of probability distributions of the signs $h_i\in\{-1,0,1\}$

368: of the effective fields acting on the single spin variables and of the

369: tuples $t_a\in\{-1,*,1\}^{n_a}$ in the dual graph.  In the cavity

370: notation the quantities that are iterated refer to a graph in which a

371: given node and all its neighbor nodes are temporarily eliminated (see

372: Fig. \ref{duality} (a) and (d)) and all quantities are labeled by

373: oriented indices of the type $a \to i$ or $i \to a$ where the node on

374: the right of the arrow is the one eliminated.  Therefore the equations

375: describe a local transformation of some input probability

376: distributions into an output distribution in which a characteristic

377: function $\chi$ eliminates contributions from those combinations of

378: input and output fields or variables that violate some kind of local

379: constraints (it is worth noticing that these cavity equations are

380: closely related to the iterative local equations of the so called

381: Objective Method~\cite{Aldous} of combinatorial

382: probability). Explicitly we have:\\

383:

384:

385: {\bf DBP equations:}

386: \begin{eqnarray}

387: P_{a\to i}^{dbp}\left(t_a\right) & \propto & \sum_{\left\{

388: t_{b}\right\}} \prod_{j\in a\setminus i} \chi^{dbp}_j

389: \left(t_a,\left\{t_b\right\}\right) \prod_{b\in j\setminus a} P_{b\to

390: j}^{dbp}\left(t_{b}\right)

391: \end{eqnarray}

392: \\

393:

394: {\bf SP equations:} ~\cite{MZ,BMZ}

395: \begin{eqnarray}

396: P_{j\to a}^{sp}\left(h_j\right) & \propto & \sum_{\left\{ h_k\right\}}

397: \chi^{sp}_{j\to a}\left(h_{j},\left\{h_{k}\right\}\right) \prod_{b\in

398: j\setminus a} \prod_{k\in b\setminus j} P^{sp}_{k\to

399: b}\left(h_{k}\right)

400: \end{eqnarray}

401: where

402: \begin{eqnarray}

403: \chi^{sp}_{j\to a} & = & \delta_{h_{j},*}\prod_{b\in j\setminus

404: a}C_{b}^{j,1}C_{b}^{j,-1} + \sum_{\sigma=\pm

405: 1}\delta_{h_{j},\sigma}\prod_{b\in j\setminus

406: a}C_{b}^{j,\sigma}\left(1-\prod_{b\in j\setminus

407: a}C_{b}^{j,-\sigma}\right)

408: \end{eqnarray}

409: $C_{b}$ clauses are here evaluated in $\left(\left(h_{k}\right)_{k\in

410: b\setminus j},h_{j}\right)$.\\

411:

412: In order to show the connection between the above equations it is

413: convenient to introduce an auxiliary transformation $\tau$ of a

414: similar type:\\

415:

416: {\bf $\tau$ transformation:}

417: \begin{eqnarray}

418: P_{a\to i}^{\tau}\left(t_a\right) & \propto & \sum_{\left\{

419:   h_j\right\}}\prod_{j\in a\setminus i} \chi^{\tau}_{j\to

420:   a}\left(t_a,h_j\right) P_{j\to a}\left(h_j\right)

421: \end{eqnarray}

422: and

423: \begin{eqnarray}

424: \chi^{\tau}_{j\to a} = \sum_{\sigma=\pm 1} C_a \delta_{h_j,\sigma}

425: \delta_{t_a^{(j)},\sigma} + \delta_{h_j,*} \left[

426: \delta_{t_a^{(j)},*} C_{a}^{j,-1}C_{a}^{j,1} + \sum_{\sigma=\pm 1}

427: \delta_{t_a^{(j)},\sigma} C_{a}^{j,\sigma}\left(1 -

428: C_{a}^{j,-\sigma}\right)\right]

429: \label{eq:tau}

430: \end{eqnarray}

431: $C_{a}$ terms are evaluated here in $t_a$.\\

432:

433:

434: We now drop the arguments of the measures $P_{j\to

435: a}^{sp}$, $P_{a\to i}^{dbp}$ and $P_{a\to i}^{\tau}$ and instead make

436: explicit their dependence on the input probability measures

437: $\left\{P_{k\to b}\right\},\left\{P_{b\to j}\right\},\left\{P_{j\to

438: a}\right\}$ respectively.

439:

440: The connection between DBP and SP can be written as follows:

441: \begin{equation}

442: P_{a\to i}^{dbp}\left(\left\{P_{k\to b}^{\tau}\right\}\right) \equiv

443:   P_{a\to i}^{\tau}\left(\left\{P_{j\to a}^{sp}\right\}\right)

444: \label{eq:p_equiv}

445: \end{equation}

446: where both sides of the (functional) equality in turn depend on some

447: arbitrary set of probability distributions $\left\{P_k(h_k)\right\}$

448: where $k\in b\setminus j$ for $b\in j\setminus a$ and finally $j\in

449: a\setminus i$. In short,

450: \begin{equation}

451: P^{dbp}\circ P^{\tau}\equiv P^{\tau}\circ P^{sp}

452: \label{eq:equiv}

453: \end{equation}

454: %%%%%%%%%%%%%%%%%%%%%% proof tau o sp = bp o tau

455:

456: In order to check the validity of the above identity we observe

457: that a direct inspection of the composition shows that it is true if

458: for every $j\in a\setminus i$ the following condition among the

459: characteristic functions holds:

460: \begin{equation}

461: \sum_{\{h_j\}}\chi^{\tau}_{j\to a} \chi_{j\to a}^{sp} =

462: \sum_{\{t_b\}}\chi^{dbp}_j\prod_{b\in

463: j\setminus a} \prod_{k\in b\setminus j}\chi^{\tau}_{k\to

464: b}\label{eq:compo}

465: \end{equation}

466: In Appendix~\ref{proof} we display the proof that this identity holds

467: and, as a consequence, that also identity Eq.~(\ref{eq:equiv}) is

468: valid. Eq.~(\ref{eq:equiv}) in turn implies that

469: \begin{equation}

470: \left(P^{dbp}\right)^{\left(k\right)}\circ P^{\tau}\equiv

471: P^{\tau}\circ\left(P^{sp}\right)^{\left(k\right)} \; \; \;,

472: \end{equation}

473: where the $\left(k\right)$ exponent means composition. This in turn

474: implies that we have a direct step-by-step connection between the

475: elementary quantities used in the DBP equations and those used in the

476: SP equations: convergence is obtained simultaneously and

477: Eq.~(\ref{eq:equiv}) holds for the respective fixed points.  It is

478: straightforward to compute from the DBP equations the marginals

479: $P_{i}^{dbp}\left(s_{i}\right)$ of the single variables as a

480: marginalization of $P_{a}^{dbp}\left(t_{a}\right)$ for some $a\in i$

481: with respect to all other variables in the clause (at a fixed point,

482: it does not matter which $a\in i$ one chooses). One finds that the

483: marginals predicted by DBP are in one to one correspondence with the

484: local fields given by SP, that is $P_i^{dbp}(s_i=-1,*,1)$ coincides

485: respectively with $P_i^{sp}(H_i=-1,0,1)$ (see refs.~\cite{MZ,BMZ}).
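
The step from Eq.~(\ref{eq:equiv}) to its iterated form is an induction on $k$, which can be spelled out as follows. Assuming the identity for $k-1$ compositions,
\begin{eqnarray*}
\left(P^{dbp}\right)^{(k)}\circ P^{\tau} & \equiv & P^{dbp}\circ\left(P^{dbp}\right)^{(k-1)}\circ P^{\tau} \\
 & \equiv & P^{dbp}\circ P^{\tau}\circ\left(P^{sp}\right)^{(k-1)} \\
 & \equiv & P^{\tau}\circ P^{sp}\circ\left(P^{sp}\right)^{(k-1)} \; \equiv \; P^{\tau}\circ\left(P^{sp}\right)^{(k)} \; ,
\end{eqnarray*}
where the second line uses the induction hypothesis and the third uses Eq.~(\ref{eq:equiv}). In particular, if $P^{*}$ denotes a fixed point of SP, so that $P^{sp}(P^{*})=P^{*}$, then $P^{dbp}(P^{\tau}(P^{*}))=P^{\tau}(P^{sp}(P^{*}))=P^{\tau}(P^{*})$, that is, $P^{\tau}(P^{*})$ is a fixed point of DBP. The symbol $P^{*}$ is introduced here only for this illustration.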

486:

487:

488: \subsection{Clustering and whitening}

489:

490: The marginals over $\{1,*,-1 \}^N$ given by SP/DBP acquire a

491: computational/physical significance once we interpret what solutions

492: of the combinatorial problem defined by Eq.~(\ref{G}) mean in terms of

493: clusters (or groups) of solutions of the original problem defined by

494: Eq.~(\ref{F}). We will first define the Hamming distance between

495: configurations $s,t\in \{1,*,-1\}^n$, $H(s,t)=|\{i:s_i\neq t_i\}|$ and

496: an ordering relation over $\{-1,*,1\}$ configurations: if $s,t\in

497: \{1,*,-1\}^n$ we say that $s\leq t$ iff $t_i \neq s_i$ implies that

498: $t_i=*$. For instance, $(-1,1)\leq(-1,*)$ and $(1,1,1)\leq(1,*,*)$ but

499: $(-1,1)\not\leq(1,*)$.

500:

501:

502: We will say that a configuration $s\in \{\pm 1\}^n$ is {\it contained}

503: in $t\in\{1,*,-1\}^n$ if $s\leq t$. In this sense, ``clustering'' would mean,

504: starting with some set $S\subset \{\pm 1\}^n$ of solutions of the

505: original combinatorial problem, to find some set $T \subset

506: \{1,*,-1\}^n$ such that every $s\in S$ is contained in some $t\in

507: T$. Of course, one would like to do so in some maximal way, but

508: satisfying some kind of separation between different clusters.

509:

510: One trivial observation about the set ${\mathcal G}=1$ is that

511: solutions are by force separated, in the sense that $H(s,t) > 1$ if

512: ${\mathcal G}(s)={\mathcal G}(t)=1$ and $s\neq t$. To prove this,

513: suppose that $H(s,t)=1$. If their difference comes because $s_i=\pm 1$

514: and $t_i=*$ then by force one of $V_i(t)$ or $V_i(s)$ is clearly

515: violated. If on the contrary, it comes because $s_i=1$ and $t_i=-1$ or

516: vice versa, then by force both of $V_i(t)$ and $V_i(s)$ are violated

517: and the only possible ``correct'' value for $s_i$ is $*$.

518:

519: A more important observation is that every solution of ${\mathcal

520: F}=1$ is {\it contained} in a solution of ${\mathcal G}=1$ with the

521: minimal number of $*$, and that solution can be easily found. Take a

522: solution $x$ of ${\mathcal F}=1$, and suppose that ${\mathcal G}(x)=0$.

523: Choose a $V_i$ such that $V_i=0$. It can be easily seen that by

524: replacing $x_i$ by $*$, $V_i$ becomes $1$. Then we pick another

525: violated constraint and repeat the process, until ${\mathcal G}=1$. We

526: will call the resulting configuration $w(x)$ (this procedure has been

527: already used under the name of {\it whitening} in the context of graph

528: coloring by G. Parisi in~\cite{joker}). It is easy to prove that the

529: result of this procedure does not depend on the order in which the

530: violated nodes $V_i$ are picked (the proof being that any

531: violated $V_i$ will continue to be violated in the procedure, exactly

532: until we switch $x_i$ to $*$), and so $w(x)$ is uniquely defined. Note

533: that two configurations $x,y$ at Hamming distance $H(x,y)=1$ will have

534: $w(x)=w(y)$ and so every solution in a fixed connected component of

535: the solution space will end up inside the same ``cluster''. An example

536: of the whitening procedure for some set of solutions is depicted in

537: Figure~(\ref{whitening-good}).

538: \begin{figure}

539: \begin{center}

540: \includegraphics[height=2cm]{whitening2}

541: \caption{The whitening procedure from left to right: the original set

542: of solutions $\{(-1,-1,-1), (1,1,-1), (1,1,1)\}$ and the set of

543: whitened clusters in the final step $\{(-1,-1,-1),(1,1,*)\}$}

544: \label{whitening-good}

545: \end{center}

546: \end{figure}

547: An interesting point of view is that if one tries to build from

548: scratch a Hamiltonian to describe the behaviour of the outcomes of the

549: whitening procedure of some SAT formula, Eq.~(\ref{G}) comes

550: naturally.
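
The whitening procedure can be followed by hand on a small formula. The one below is a hypothetical example, written with the Boolean convention for $J_{a,i}$ given after Eq.~(\ref{eq:clause}):
\begin{displaymath}
\mathcal{F} = (x_1)\wedge(\neg x_1\vee x_2)\wedge(x_2\vee x_3) \; ,
\end{displaymath}
with clauses labeled $a$, $b$, $c$ from left to right. Its only solutions are $x=(1,1,1)$ and $x'=(1,1,-1)$. Start from $x'$. Clause $c$ is the only one containing variable $3$, and it is already satisfied by $s_2=1$, so $C_c^{3,1}=C_c^{3,-1}=1$; the term of $V_3$ with $s_3=-1$ then contains the factor $1-C_c^{3,1}=0$ and $V_3=0$. Replacing $s_3$ by $*$ gives $V_3=C_c^{3,-1}C_c^{3,1}=1$. In the resulting configuration $(1,1,*)$ one finds $V_1=1$, since $C_a^{1,1}=C_b^{1,1}=1$ while $C_a^{1,-1}=0$, and $V_2=1$, since $C_b^{2,1}=C_c^{2,1}=1$ while $C_b^{2,-1}=0$. Hence ${\mathcal G}(1,1,*)=1$ and $w(x')=(1,1,*)$. Starting from $x$ instead, $V_3$ is again the only violated constraint (now because $1-C_c^{3,-1}=0$), so that $w(x)=w(x')=(1,1,*)$, as expected for two solutions at Hamming distance $1$: variables $1$ and $2$ are frozen and variable $3$ takes the joker value.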

551:

552: The reader should note however that the presented definition of

553: clustering is far from perfect in the worst case: there is a number of

554: systematic errors produced by the whitening. For instance, in

555: Figure~(\ref{whitening-errors}) we can see one cluster claiming an

556: incorrectly large volume. And there is of course also another problem:

557: unfortunately, there is no guarantee that the sole solutions of

558: ${\mathcal G}=1$ are the ones of the whitening, and in fact small

559: counter-examples can be easily constructed. Numerical work is being

560: done to quantify these two types of errors

561: \cite{napolano}.

562:

563: \begin{figure}

564: \begin{center}

565: \includegraphics[height=2cm]{whitening}

566: \caption{A systematic error of the whitening $w((1,1,-1))$ (the dark

567: solution on the left). From left to right: the original sets of

568: solutions $\{(1,1,-1), (1,1,1), (1,-1,1), (-1,-1,-1)\}$ and first step

569: $(1,1,-1)$, second step $(1,1,*)$, third step $\{(1,*,*)\}$ and final

570: step $\{(*,*,*)\}$}

571: \label{whitening-errors}

572: \end{center}

573: \end{figure}

574:

575:

576: \section{Entropy and complexity}

577:

578: The equivalence between the DBP marginals and the SP local field

579: probability distributions has the direct consequence that the Bethe

580: approximation to the entropy on the dual graph, $S^{dbp}$, coincides

581: with the logarithm of the number of clusters of solutions predicted by

582: SP, the so called complexity $\Sigma$.

583:

584: On general grounds the Bethe approximation to the entropy of a problem

585: is exact if correlations among cavity variables can be neglected

586: (i.e. the global joint probability distribution takes a factorized

587: form). This is certainly true over tree graphs and it is conjectured

588: to be true in some cases for locally tree-like random graphs in the

589: limit of large size (one informal explanation is that distance between

590: cavity variables diverges with probability tending to one).

Factorization of marginal probabilities over our dual factor graph
amounts to writing $P(\{t_a\})=\prod_{i\in I} P^{dbp}_{i}(T_i)
\prod_{a \in A} [ P_a^{dbp}(t_a)]^{1-n_a}$, where $P_i^{dbp}(T_i)$ is
the joint probability distribution of the triples connected to node
$i$ ($T_i \equiv \{ t_b\}_{b \in i}$) and $P_a^{dbp}(t_a)$ is the
single-triple marginal. Under this condition the entropy reads

597: \begin{eqnarray}

598: S = -\sum_{i}\sum_{\left\{ T_{i}\right\} }P^{dbp}_{i}(T_{i})\log

599:  P_{i}^{dbp}(T_{i}) + \sum_{a}\left(n_a - 1\right)\sum_{\left\{

600:  t_{a}\right\} }P^{dbp}_{a}(t_{a})\log P^{dbp}_{a}(t_{a}) \; .

601: \label{eq:entropy}

602: \end{eqnarray}
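
As a brief sketch of where this expression comes from (assuming, as in
the Bethe approximation, that $P^{dbp}_i$ and $P^{dbp}_a$ are the
marginals of the factorized $P$, and writing ${\mathbf t}\equiv\{t_b\}_{b\in A}$,
a shorthand introduced only here), one can take the logarithm of the
factorized form and average it over $P$ itself:
\begin{eqnarray*}
S & = & -\sum_{{\mathbf t}} P({\mathbf t}) \log P({\mathbf t}) \\
 & = & -\sum_{i}\sum_{{\mathbf t}} P({\mathbf t}) \log P^{dbp}_{i}(T_{i})
 - \sum_{a}\left(1-n_a\right)\sum_{{\mathbf t}} P({\mathbf t}) \log P^{dbp}_{a}(t_{a}) \\
 & = & -\sum_{i}\sum_{\left\{ T_{i}\right\} }P^{dbp}_{i}(T_{i})\log P^{dbp}_{i}(T_{i})
 + \sum_{a}\left(n_a-1\right)\sum_{\left\{ t_{a}\right\} }P^{dbp}_{a}(t_{a})\log P^{dbp}_{a}(t_{a}) \; ,
\end{eqnarray*}
where the last step sums out all the triples that do not appear in the
argument of each logarithm.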

603:

Showing $S=\Sigma$ is a straightforward calculation that we
report in Appendix~\ref{entropy-complexity}. It requires expressing the entropy in terms of
the cavity fields given by SP, exploiting both Eq.~(\ref{eq:equiv}) and
the fixed point conditions. One finds

608: \begin{eqnarray}

609: S =  \sum_{i}\log c_{i}-\sum_{a}\left(n_{a} - 1\right)\log

610:  c_{a}-\sum_{i}\sum_{a\in i}\log D_{a\to i}

611: \label{eq:Sconst}

612: \end{eqnarray}

613: where the three normalization constants are defined by

614: \begin{eqnarray}

615: c_{i} & = &

616: \sum_{\left\{ T_i\right\}}\prod_{a\in i}P_{a\to

617: i}\left(t_{a}\right)\chi_{i}\left(T_{i}\right)

618: \label{ci} \\

619: c_{a} & = & \sum_{t_a}\sum_{\left\{ h_{j}\right\}}\prod_{j\in

620: a}P_{j\to a}\left(h_{j}\right)\chi^{\tau}_{j\to

621: a}\left(h_{j},t_a\right)

622: \label{ca}\\

D_{a\to i}^{-1} & = & \sum_{t_a}\sum_{\left\{ h_{j}\right\}}\prod_{j\in
a\setminus i}P_{j\to a}\left(h_{j}\right)\chi^{\tau}_{j\to
a}\left(h_{j},t_a\right)

626: \label{D}

627: \end{eqnarray}

628: These constants are not independent and the explicit expressions of

629: the first two are sufficient for writing $S$ in terms of SP

630: quantities:

631: \begin{eqnarray}

632: c_{a} & = & \sum_{\left\{ h_{j}\right\}} \prod_{j\in a} P_{j\to a}

633:  \left(h_{j}\right)\sum_{\left\{ t_a\right\}} \prod_{j\in a}

634:  \chi^{\tau}_{j\to a} \left(h_{j},t_a\right) \\ & = & 1 -

635:  \sum_{\left\{ h_{j}\right\}} \prod_{j\in a} P_{j\to a}

636:  \left(h_{j}\right)\left(1-\sum_{\left\{ t_{a}\right\}}\prod_{j\in a}

637:  \chi^{\tau}_{j\to a} \left(h_{j},t_a\right) \right) \\ & = & 1 -

638:  \prod_{j\in a}P_{j\to a}\left(J_{a,j}\right)\\ & = & 1- \prod_{j\in

639:  a} \frac{\Pi_{j\to a}^{u}}{\left(\Pi_{j\to a}^{s}+\Pi_{j\to

640:  a}^{0}+\Pi_{j\to a}^{u}\right)}

641: \end{eqnarray}

642: where we have borrowed the notation of Eq.~(18) in~\cite{BMZ}. For

643: computing $c_i$ we first notice that

644: \begin{equation}

P_{a\to i}\left(t_{a}\right)=D_{a\to i}\sum_{\left\{ h_{j}\right\}
_{j\in a\setminus i}}\prod_{j\in a\setminus i}\chi^{\tau}_{j\to
a}\left(t_{a},h_{j}\right)P_{j\to
a}\left(h_{j}\right)

649: \end{equation}

650: so that Eq.~(\ref{ci}) reads

651: \begin{eqnarray}

c_{i} & = & \prod_{a\in i} D_{a\to i}\sum_{\left\{ H_{i}\right\}
}\sum_{\left\{ T_{i}\right\} } \chi_{i} \left(T_{i}\right)
\prod_{a\in i}\prod_{j\in a\setminus i}\chi^{\tau}_{j\to
a}\left(t_{a},h_{j}\right) P_{j\to a}\left(h_{j}\right)\nonumber \\
& = & \prod_{a\in i}D_{a\to i}\sum_{\left\{ H_{i}\right\}
}\chi_i^{sp}(H_{i})\prod_{a\in i}\prod_{j\in a\setminus i}P_{j\to
a}\left(h_{j}\right)\nonumber \\
& = & \prod_{a\in i}D_{a\to
i}\left(\hat{\Pi}_{i}^{+} + \hat{\Pi}_{i}^{0} +
\hat{\Pi}_{i}^{-}\right)

661: \end{eqnarray}

662: in the notations of Eq.~(21) in~\cite{BMZ}. Finally, plugging these

663: expressions into Eq.~(\ref{eq:Sconst}) and calling

\begin{eqnarray} w_{i} & = &
\hat{\Pi}_{i}^{+}+\hat{\Pi}_{i}^{0}+\hat{\Pi}_{i}^{-} \nonumber \\
x_{i\to a} & = & \Pi_{i\to a}^{s}+\Pi_{i\to a}^{0}+\Pi_{i\to a}^{u}\nonumber \\
y_{i\to a} & = & \Pi_{i\to a}^{u}

668: \end{eqnarray}

we get
\begin{eqnarray}
S = \sum_{i}\log w_{i}- \sum_{a}\left(n_{a} - 1\right)
\log\left(1-\prod_{j\in a}\frac{y_{j\to a}}{x_{j\to a}}\right)

673: \label{esse1}

674: \end{eqnarray}

In this expression, $w_i$ represents the probability that the local field
acting on the spin variable $i$ does not produce a contradiction, and
$1 - \prod_{j\in a}\frac{y_{j\to a}}{x_{j\to a}}$ is the probability that the cavity
fields satisfy clause $a$.

679:

We recall that the expression of the SP complexity $\Sigma$ defined
in Eqs.~(25)--(27) of~\cite{BMZ} is

682: \begin{eqnarray}

683: \Sigma & = & \sum_{i}\left(1-n_{i}\right)\log w_{i} +

684:  \sum_{a}\log\left(\prod_{i\in a}x_{i\to a} - \prod_{i\in a}y_{i\to

685:  a}\right) \nonumber \\ & = & \sum_{i}\log w_{i} - \sum_{a}\sum_{i\in a}\log

686:  w_{i} + \sum_{a}\log\left(\prod_{i\in a}x_{i\to

687:  a}-\prod_{i\in a}y_{i\to a}\right)

688: \label{sigma1}

689: \end{eqnarray}
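
The comparison with Eq.~(\ref{esse1}) becomes easier after two
elementary rewritings, sketched here under the assumptions that $n_i$
denotes the number of clauses containing variable $i$ and that all
$x_{i\to a}$ are strictly positive. The second line above is a double
counting over the edges of the factor graph, and the argument of the
last logarithm can be factorized:
\begin{eqnarray*}
\sum_{i} n_{i}\log w_{i} & = & \sum_{a}\sum_{i\in a}\log w_{i} \; , \\
\log\left(\prod_{i\in a}x_{i\to a}-\prod_{i\in a}y_{i\to a}\right) & = &
\sum_{i\in a}\log x_{i\to a} +
\log\left(1-\prod_{i\in a}\frac{y_{i\to a}}{x_{i\to a}}\right) \; .
\end{eqnarray*}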

690: Despite their different look, it turns out that Eq.~(\ref{esse1}) and

691: Eq.~(\ref{sigma1}) are identical if evaluated in a fixed point of the SP

692: equations.  Their difference

693: \begin{eqnarray}

694: \label{eq:sigma-esse}

\Sigma - S = \sum_{a}\left\{ -\sum_{i\in a}\log w_{i} +
n_{a}\log\left(1-\prod_{i\in a}\frac{y_{i\to a}}{x_{i\to
a}}\right) + \sum_{i\in a}\log x_{i\to a}\right\}

698: \end{eqnarray}

is zero since at the fixed point every term inside the curly brackets
vanishes: using Eq.~(17) in~\cite{BMZ} we have that $\eta_{a\to
i}=\prod_{j\in a\setminus i}\frac{y_{j\to a}}{x_{j\to a}}$,
i.e. $\prod_{j\in a}\frac{y_{j\to a}}{x_{j\to a}}=\eta_{a\to
i}\frac{y_{i\to a}}{x_{i\to a}}$ for every $i\in a$, and hence
\begin{equation}
n_{a}\log\left(1-\prod_{j\in a}\frac{y_{j\to a}}{x_{j\to
a}}\right)=\sum_{i\in a}\log\left(1-\eta_{a\to i}\frac{y_{i\to
a}}{x_{i\to a}}\right)
\end{equation}
A simple calculation shows that $w_{i} = x_{i\to a}-\eta_{a\to
i}y_{i\to a}$ for every $a\in i$, so that $\log w_{i} = \log x_{i\to a}
+ \log\left(1-\eta_{a\to i}\frac{y_{i\to a}}{x_{i\to a}}\right)$,
and therefore we get $\Sigma=S$ as

711: desired.
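
For completeness, the ``simple calculation'' can be sketched as
follows, assuming that the SP quantities of~\cite{BMZ} take the product
form recalled here (the shorthands $A$ and $B$ are introduced only for
this sketch). Let $A$ (resp. $B$) be the product of $(1-\eta_{b\to i})$
over the clauses $b\neq a$ containing $i$ that push $i$ in the same
direction as $a$ (resp. in the opposite direction). Then $\Pi_{i\to
a}^{u}=(1-B)A$, $\Pi_{i\to a}^{s}=(1-A)B$ and $\Pi_{i\to a}^{0}=AB$, so
that $x_{i\to a}=A+B-AB$ and $y_{i\to a}=A(1-B)$. Including clause $a$
replaces $A$ by $(1-\eta_{a\to i})A$ in the site quantities
$\hat{\Pi}_{i}^{\pm}$, $\hat{\Pi}_{i}^{0}$, and
\begin{eqnarray*}
w_{i} & = & (1-\eta_{a\to i})A + B - (1-\eta_{a\to i})AB \\
 & = & \left(A+B-AB\right) - \eta_{a\to i}A\left(1-B\right) \\
 & = & x_{i\to a}-\eta_{a\to i}\,y_{i\to a} \; .
\end{eqnarray*}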

712:

713: %%%%%%%%%%%%%%%%%%%%%

714: \section{Discussion and Conclusions}

715:

716: In this work we have shown by elementary means that the SP equations

717: can be interpreted and derived as sum-product equations for the

marginals over a modified combinatorial problem. An important
consequence of this fact is a clarification of the hypotheses behind
the algorithm. It is to be expected that the essential hypothesis
that makes sum-product work is the absence of correlation between the marginals of
distant (or cavity) variables. Under the mapping shown here, this directly
implies that the hypothesis behind SP (and, in a way, behind its definition
of clusters) is the absence of correlation between the frozen parts of distant
variables, that is, the absence of correlation between {\bf different} clusters.

726:

In this light one can think of the SP procedure of obtaining $\hat
E$ from $E$ as a way of collapsing the internal structure of pure
states: the resulting problem ${\mathcal G}$ has many pure states but
with zero internal entropy. Note that this is a completely different
limit case with respect to the ``one pure state'' in which BP
(more precisely DBP) is shown to work correctly and to predict an
accurate entropy (which, we recall, is the complexity of the original
$E$).

735:

736: As far as the connection between solutions of the modified problem and

737: the original one is concerned, things are particularly simple over

738: tree factor graphs (see also \cite{BMZ} for results concerning

propagation of messages): indeed, for any fixed boundary condition
(i.e. an assignment for the leaf variables), there is at most one
solution with $\hat{E} = 0$, and it is easy to prove (see
Appendix~\ref{tree}) that all solutions of $E=0$ correspond to the

743: same connected component of the solution space (i.e. every two

744: solutions can be joined by a path of solutions in which successive

745: configurations in the path differ by exactly one spin flip).

746:

747: The situation on loopy graphs (corresponding for instance to random

748: formulae) is obviously more complicated. A coherent interpretation

would be that not only the recursive DBP/SP equations themselves

750: are accurate in a probabilistic sense (i.e. when the factorization of

751: the corresponding input joint probability is sound) to compute the

752: statistics of the ground states of $\hat E$, but also that the

753: exactness of the interpretation of the ground states of $\hat E$ in

754: terms of clustering of the ground states of $E$ relies on this

755: hypothesis being true.

756:

In this respect we mention that exact enumerations on a large number
(thousands) of small random 3-SAT formulas (up to $N=100$) showed that
all the zero-energy configurations of $\hat E$ which are stable under
SP iterations can be extended to real solutions of the original
problem.  Spurious ground states (i.e. configurations that are not
extensible to real solutions) do exist with a non-negligible
probability for small $N$; however, they always turn out to be unstable
fixed points of SP, that is, unsat configurations which are irrelevant
for the SP marginals~\cite{napolano}.  While such a result was
expected to hold for tree-like graphs, it is somewhat surprising to
observe it numerically on small, loopy, random factor graphs.  The
robustness of this result calls for a finite-$N$ probabilistic analysis,
which would represent a building block for the rigorous analysis of SP
(of course, small ad-hoc counterexamples on improbable formulae can be
easily constructed).

772:

As a concluding remark we notice that the discussed formalism can be
generalized to take care of the non-zero energy regime where not all
constraints can be satisfied simultaneously (``frustrated'' case). The
LEC energy function takes the form $\hat E= \lambda \sum_{a\in A} \hat
E_a + \sum_{i\in I} A_i$, where $\lambda$~\cite{note} plays the role
of the so-called Parisi re-weighting parameter~\cite{cavity}.

779:

780:

781: \section{Acknowledgments}

782:

783: We thank D. Achlioptas, M. Mezard, G. Parisi, A. Pelizzola and M.

784: Pretti for very fruitful discussions. This work has been supported in

785: part by the European Community's Human Potential Programme under

786: contract HPRN-CT-2002-00319, STIPCO.

787:

788:

789: \appendix

790:

791: \section{Proof of equivalence}

792:

793: \label{proof}

794:

795: For the LHS of Eq.~(\ref{eq:compo}) we have:\\

796:

797: \noindent

798: If $h_j = \sigma\in\{\pm 1\}$ then

799: \begin{equation}

800:  \chi^{\tau}_{j\to a}= C_a\delta_{t_a^{(j)},\sigma} \; \; \; , \; \;

801:  \chi^{sp}_{j\to a} = \prod_{b\in

802: j\setminus a}C_b \left(1-\prod_{b\in j\setminus a}

803: C_{b}^{j,-\sigma}\right)

804: \end{equation}

805: \noindent

806: If $h_j = *$ then

807: \begin{equation}

808: \chi^{\tau}_{j\to a} = \delta_{t^{(j)}_a,*}C_{a}^{j,-1}C_{a}^{j,1} +

809: \sum_{\sigma=\pm 1}\delta_{t_a^{(j)},\sigma} C_{a}^{j,\sigma}\left(1 -

810: C_{a}^{j,-\sigma}\right) \; \; \; , \; \; \chi^{sp}_{j\to a} =

811: \prod_{b\in j\setminus a}C_{b}^{j,-1}C_{b}^{j,1}.

812: \end{equation}

813:

Summing up both products and regrouping, the LHS of Eq.~(\ref{eq:compo}) reads:

815: \begin{eqnarray}

816: \sum_{\sigma=\pm 1}\delta_{t_a^{(j)},\sigma} \prod_{b\in j}

817: C_{b}^{j,\sigma} \left(1 - \prod_{b\in j} C_{b}^{j,-\sigma}\right) +

818: \delta_{t_a^{(j)},*} \prod_{b\in j} C_{b}^{j,-1}

819: C_{b}^{j,1}\label{eq:sptau}

820: \end{eqnarray}

821: where $C_{b}$ for $b\in j\setminus a$ is evaluated here in

822: $\left(\{h_k\}_{k\in b\setminus j},t_a^{(j)}\right)$ and $C_{a}$ is

823: evaluated in $t_a$.

824:

825: For the RHS of Eq.~(\ref{eq:compo}) we first notice that as the

826: $\chi^{dbp}_{j}$ term includes $\prod_{a\in j} \delta_{t_a^{(j)},s_j}$

827: we will simply replace all occurrences of $t_b^{(j)}$ and $s_j$

828: variables by $t_a^{(j)}$ and drop the outer sum and the product term

829: itself. For instance, the sum over $\{ t_b\}_{b\in j}$ thus reduces to

830: a sum over $\left\{\{ t_b^{(k)}\}_{k\in b\setminus

j},{t_a^{(j)}}\right\}$. Let us evaluate the RHS of

832: Eq.~(\ref{eq:compo}) on the three possible values of $t_a^{(j)}$:\\

833: \noindent

834: If $t_a^{(j)} = *$ then by Eq.~(\ref{eq:dualenergy}) $\chi^{dbp}_j =

835: \prod_{b\in j}C_{b}^{j,-1}C_{b}^{j,1}$.  Moreover, just by looking at

836: its definition Eq.~(\ref{eq:tau}), one finds that in

837: $\chi^{\tau}_{k\to b}$ all $C$ terms are equal to $1$ since their $j$

838: coordinate $t_b^{(j)}=t_a^{(j)}$ is $*$.  Then $\chi^{\tau}_{k\to b} =

839: \delta_{t_b^{(k)},h_k}$ and the RHS of Eq.~(\ref{eq:compo}) becomes

840: \begin{equation}

841: C_{a}^{j,-1} C_{a}^{j,1} \prod_{b\in j\setminus a} C_{b}^{j,-1}

842: C_{b}^{j,1} \prod_{k\in b\setminus j} \delta_{t_b^{(k)},h_k}

843: \end{equation}

844: which is exactly the term in Eq.~(\ref{eq:sptau}) corresponding to

845: $t_a^{(j)} = *$ (remember that $C_{b}$ clauses here are evaluated in

846: $t_b$).\\

847: \noindent

If $t_a^{(j)} = \sigma\in\{\pm 1\}$ then it is convenient to split
$\chi^{dbp}_j$ into two terms:

850: \begin{equation}

851: \prod_{b\in j} C_{b}-\prod_{b\in j} C_{b} C_{b}^{j,-\sigma}

852: \end{equation}

853: so that the RHS of Eq.~(\ref{eq:compo}) becomes

854: \begin{eqnarray}

855: C_{a}\prod_{b\in j\setminus a} \left(\sum_{\{t_{b}\}}C_{b}\prod_{k\in

856:   b\setminus j}{\chi^{\tau}_{k\to b}}\right) - C_{a}C_{a}^{j,-\sigma}

857:   \prod_{b\in j\setminus

858:   a}\left(\sum_{\{t_{b}\}}C_{b}C_{b}^{j,-\sigma}\prod_{k\in b\setminus

859:   j}\chi^{\tau}_{k\to b}\right)

860: \end{eqnarray}

861: Finally, both sums can be computed explicitly and the result is

862: again exactly the corresponding term in Eq.~(\ref{eq:sptau}). This ends

863: the proof of the identity Eq.~(\ref{eq:equiv}).

864:

865:

866:

867: \section{Computation of the entropy}

868: \label{entropy-complexity}

869: For simplicity of notation, in what follows we write $P_{a}(t_{a}),

870: P_{a\to i}(t_a), P_i(T_i)$ and $\chi_i(T_i)$ in place of

871: $P_{a}^{dbp}(t_{a}), P_{a\to i}^{dbp}(t_a), P_i^{dbp}(T_i)$ and

872: $\chi_i^{dbp}(T_i)$ respectively and $P_{i\to a}(h_i)$ in place of

873: $P_{i\to a}^{sp}(h_i)$.

874:

875: To compute the entropy (\ref{eq:entropy}) we first need

876: \begin{eqnarray*}

877: P_{a}(t_{a}) & = & c_{a}^{-1}\sum_{\left\{ h_i \right\}} \prod_{i\in

878:  a}P_{i\to a}\left(h_i\right)\prod_{i\in a}\chi^{\tau}_{i\to

879:  a}\left(t_{a},h_i\right)\\ & = & c_{a}^{-1}\prod_{i\in

880:  a}\sum_{\left\{ h_i\right\}} P_{i\to

881:  a}\left(h_i\right)\chi^{\tau}_{i\to a}\left(t_{a},h_i\right)

882: \end{eqnarray*}

883: Thus calling

884: \begin{equation}

885: f_{a\to i}=\sum_{\left\{ h_i\right\} }P_{i\to

886: a}\left(h_i\right)\chi^{\tau}_{i\to a}\left(t_{a},h_i\right)

887: \end{equation}

888: we have that

889: \begin{eqnarray}

\sum_{\left\{ t_{a}\right\} }P_{a}(t_{a})\log P_{a}(t_{a}) & = &
-\log c_{a}+\sum_{\left\{ t_{a}\right\}
}P_{a}(t_{a})\sum_{i\in a}\log f_{a\to i}\nonumber \\ & = &
-\log c_{a}+\sum_{i\in a}\sum_{\left\{ t_{a}\right\}
}P_{a}(t_{a})\log f_{a\to i}\label{eq:ca}

895: \end{eqnarray}

896: Writing  $\omega_{a\to i}=\sum_{\left\{ t_{a}\right\} }P_{a}(t_{a})\log

897: f_{a\to i}$ we get

898: \begin{eqnarray}

899: \sum_{a}\left(n_{a} - 1\right)\sum_{i\in a}\omega_{a\to i} &=&

900: \sum_{i}\sum_{a\in i}\sum_{j\in a\setminus i}\omega_{a\to j}\nonumber

901: \\ &=& \sum_{i}\sum_{a\in i}\sum_{j\in a\setminus i}\sum_{\left\{

902: t_{a}\right\} }P_{a}(t_{a})\log f_{a\to j}\nonumber \\

903:  &=& \sum_{i}\sum_{a\in i}\sum_{\left\{ t_{a}\right\} }

P_{a}(t_{a})\log\prod_{j\in a\setminus i} f_{a\to j}\nonumber \\
 &=& \sum_{i}\sum_{a\in i}\sum_{\left\{ t_{a}\right\} }\sum_{\left\{
t_{b}\right\} _{b\in i\setminus a}}P_{i}(T_{i})\log\prod_{j\in a\setminus
i} f_{a\to j}\nonumber \\  &=& \sum_{i}\sum_{a\in i}\sum_{\left\{

908: T_{i}\right\} } P_{i}(T_{i}) \log\prod_{j\in a\setminus i}f_{a\to

909: j}\label{eq:star}

910: \end{eqnarray}

911: The term inside the logarithm above reads

912: \begin{eqnarray}

913: \prod_{j\in a\setminus i}f_{a\to j} =  \sum_{\left\{ h_j\right\}}

 \prod_{j\in a\setminus i} \chi^{\tau}_{j\to a} \left(t_{a},h_j\right)

915:  \prod_{j\in a\setminus i} P_{j\to a}(h_j)  =  \frac{1}{D_{a\to

916:  i}}P_{a\to i}(t_{a})

917: \end{eqnarray}

918: where $D_{a\to i}$ is an appropriate normalization constant. Going

919: back to Eq.~(\ref{eq:star}), we have

920: \begin{eqnarray}

921: \sum_{a}(n_{a}-1)\sum_{i\in a}\omega_{a\to i} = -\sum_{i}\sum_{a\in

922: i}\log D_{a\to i} + \sum_{i}\sum_{a\in i}\sum_{\left\{ T_{i}\right\}

923: }P_{i}\left(T_{i}\right)\log P_{a\to i}(t_{a})

924: \label{eq:D+P}

925: \end{eqnarray}

926: The second term in the right-hand side equals

927: \begin{eqnarray}

928: \sum_{i}\sum_{\left\{ T_{i}\right\}

929:  }P_{i}\left(T_{i}\right)\log\prod_{a\in i}P_{a\to i}(t_{a})

930:   & = & \sum_{i}\sum_{\left\{ T_{i}\right\}

931:  }P_{i}\left(T_{i}\right)\log\chi_{i}(T_{i})\prod_{a\in i}P_{a\to

932:  i}(t_{a})\nonumber \\ & = & \sum_{i}\sum_{\left\{ T_{i}\right\}

933:  }P_{i}\left(T_{i}\right)\log Q_{i}(T_{i})\nonumber \\ & = &

934:  \sum_{i}\sum_{\left\{ T_{i}\right\} }P_{i}\left(T_{i}\right)\log

935:  P_{i}(T_{i})+\sum_{i}\sum_{\left\{ T_{i}\right\}

936:  }P_{i}\left(T_{i}\right)\log c_{i}

937: \label{eq:P}

938: \end{eqnarray}

939: where in the second step above $\chi_{i}(T_{i})$ has been artificially

940: multiplied inside the logarithm (we can do it because there is a

941: $P_{i}(T_{i})$ outside) and $P_{i}(T_{i}) =

942: \frac{1}{c_{i}}Q_{i}(T_{i})$. Eqs.~(\ref{eq:D+P}),(\ref{eq:P}) give:

943: \begin{eqnarray}

944: \sum_{a}(n_{a}-1)\sum_{i\in a}\omega_{a\to i} = -\sum_{i}\sum_{a\in

945:   i}\log D_{a\to i} +\sum_{i}\sum_{\left\{ T_{i}\right\}

946:   }P_{i}\left(T_{i}\right)\log P_{i}(T_{i})+\sum_{i}\log

947:   c_{i}

948: \label{eq:D}

949: \end{eqnarray}

950: Going back to the first expression of the entropy

951: Eq.~(\ref{eq:entropy}), and using Eq.~(\ref{eq:ca}) and

952: Eq.~(\ref{eq:D}) we get:

953: \begin{eqnarray}

954: S & = & -\sum_{i}\sum_{\left\{ T_{i}\right\} } P_{i}(T_{i})\log

955:  P_{i}(T_{i}) + \sum_{a}\left(n_{a} - 1\right)\sum_{\left\{

956:  t_{a}\right\} }P_{a}(t_{a})\log P_{a}(t_{a})\nonumber \\ & = &

957:  \sum_{i}\log c_{i}-\sum_{i}\sum_{\left\{ T_{i}\right\}

958:  }P_{i}(T_{i})\log Q_{i}(T_{i}) +

959:  \sum_{a}\left(n_{a}-1\right)\sum_{\left\{ t_{a}\right\}

960:  }P_{a}\left(t_{a}\right)\log P_{a}(t_{a})\nonumber \\ & = &

961:  \sum_{i}\log c_{i}-\sum_{a}\left(n_{a}-1\right)\log

962:  c_{a}-\sum_{i}\sum_{a\in i}\log D_{a\to i}

963: \end{eqnarray}

where the constants are defined in Eqs.~(\ref{ci})--(\ref{D}).

965:

966: \section{Tree factor graphs}

967: \label{tree}

The argument turns out to be similar to the one given in an analogous
``tutorial'' appendix in Ref.~\cite{Barthel_Hartmann} for the Vertex
Cover problem.\\
We will first build a reference solution ${\mathbf x}$, and then show
that every solution of $E=0$ is connected to it. ${\mathbf x}$ will be
built from the leaves to the root. Suppose the variables are labeled
in an ordering that respects distances to the root, such that the
first ones are the leaves and the last one is the root. In such an
ordering, the parents (resp. child) of $i$ are neighbors with labels
$j<i$ (resp.  $j>i$). We will fix $x_i$ iteratively: once $x_j$ for
$j<i$ are fixed, all parents of $i$ are fixed; then for $x_i$ there
are two possibilities: either its parents force it to take a specific
value, or they do not. In the first case we choose $x_i$ to take the
forced value; in the second one we choose the value that satisfies the
child clause. Now we can show that ${\mathbf x}$ is connected with
every other solution ${\mathbf s}$ (and thus every two solutions are
connected). It is easy to see that the configurations ${\mathbf
y}^{(k)}$ defined by ${\mathbf y}^{(k)}_j = s_j$ if $j<k$ and
${\mathbf y}^{(k)}_j = x_j$ if $j\geq k$ form a path of configurations
connecting ${\mathbf x}$ and ${\mathbf s}$. Clearly ${\mathbf
y}^{(1)}={\mathbf x}$ and ${\mathbf y}^{(n)}={\mathbf s}$. Also they
are all solutions, since if ${ \mathbf y}^{(k)}$ is a solution, then
clearly ${\mathbf y}^{(k+1)}$ is also a solution: if they are
different it is because ${\mathbf y}^{(k+1)}_{k+1}$ has been chosen to
satisfy the child clause (and it was not forced by its parents in ${\mathbf s}$
and thus neither in ${\mathbf y}^{(k+1)}$).

994:

We can now look for solutions of $\hat E=0$ on a satisfiable tree (with
boundary conditions). Let us start with a free-boundary tree with $2$-
and $3$-clauses: it is easy to see that the solution with all $*$
assignments has $\hat E = 0$. It is also clearly unique: suppose that
there is a solution with some variable set to $\sigma\neq *$. Then
there is necessarily one of its neighboring clauses in which the two
(or one) remaining variables are fixed so as not to satisfy the
clause. Repeating the argument recursively for one of them, we
get a never-ending path of fixed variables in the tree. But as
trees have no loops, this is a contradiction.

1005:

There is also exactly one such solution for a satisfiable tree with
boundary conditions (if we disregard the $V_i$ constraints on the
variables with assigned boundary values). We will build it explicitly
using the so-called unit clause propagation (UCP).  The UCP procedure
consists in removing (in this case starting from the boundary) every
fixed variable by (a) removing all clauses satisfied by the variable
and (b) removing the variable from all clauses in which it appears
without satisfying the clause (if the original tree is satisfiable,
no $0$-clause can appear in this erasure step).  Then every
$1$-clause that appears is taken and its variable fixed in order to
satisfy the clause, and the procedure starts again from the beginning
until no more $1$-clauses show up. The resulting graph is
boundary-free and has no $1$-clauses.

1019:

1020: The promised solution will be built by taking all variables fixed by

1021: UCP with their assigned value, and by assigning the value $*$ to the

1022: remaining ones. The resulting configuration $\hat x$ has $\hat E(\hat

1023: x)=0$.  Clearly the constraints $V_i$ (see Eq.~(\ref{eq:general2}))

1024: are satisfied by $\hat x$ for all $i$ fixed by UCP (because they are

1025: ``frozen'' by their neighbors). We easily see that this partial

assignment is the unique one that can give $\hat E = 0$. Using the

1027: fact that the subgraph produced by UCP has no boundary condition and

1028: that the unique solution for $\hat E=0$ on that subgraph is the

1029: all-$*$ one, we see that the proposed configuration is indeed the

1030: unique solution.
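
As a small illustration (a hypothetical toy formula chosen here, not
taken from the numerical work, and using the assumed convention that
$-1$ means ``false''), consider the tree with the two clauses
$(x_1\vee x_2)$ and $(\neg x_2\vee x_3\vee x_4)$, where only the leaf
$x_1$ is assigned a boundary value, $x_1=-1$. UCP then proceeds as
\begin{eqnarray*}
x_1=-1 & : & (x_1\vee x_2)\;\to\;(x_2)
 \quad \mbox{a $1$-clause, hence } x_2=+1 \; , \\
x_2=+1 & : & (\neg x_2\vee x_3\vee x_4)\;\to\;(x_3\vee x_4)
 \quad \mbox{no further $1$-clauses} \; ,
\end{eqnarray*}
so that the construction gives $\hat x=(-1,+1,*,*)$. The leftover
graph is the single $2$-clause $(x_3\vee x_4)$: it is boundary-free,
and each of $x_3$ and $x_4$ takes both values among the solutions of
$E=0$ compatible with the boundary condition.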

1031:

1032: Note also that every solution of $E=0$ will coincide with $\hat x$ in

1033: the $-1,1$-assigned variables of the latter, because these variables

1034: were fixed by UCP and thus are forced in every satisfying

1035: configuration. Moreover, if one takes an index $i$ such that $\hat

1036: x_i$ is $*$, then there is at least one solution of $E(s)=0$ with

1037: $s_i=1$ (resp. $-1$): by fixing $s_i$ and applying again UCP one

1038: cannot get any contradiction (i.e. a $0$-clause) because the subgraph

has neither loops nor $1$-clauses. The remaining graph is still loop-free,

1040: and thus trivially satisfiable.

1041:

1042:

1043: \begin{thebibliography}{99}

1044:

1045: \bibitem{TCS} Special Issue on {\it NP-hardness and Phase transitions},

1046: O. Dubois, R. Monasson, B. Selman and R. Zecchina (eds.),

1047: Theor. Comp. Sci. \textbf{265}, Issue: 1-2, August 28 (2001).

1048:

1049: \bibitem{Codes} H. Nishimori, {\it Statistical Physics of Spin Glasses and

1050: Information Processing}, Oxford University Press, 2001

1051:

1052: \bibitem{Aldous} D. Aldous, J. M. Steele, Probability on Discrete

1053: Structures (Vol. 110 of Encyclopaedia of Mathematical Sciences),

1054: ed. H. Kesten, p. 1-72. Springer, 2003.

1055:

1056: \bibitem{Guerra:Talagrand} F. Guerra, Comm. Math. Phys. {\bf

1057: 233}, 1 (2003); M. Talagrand, C.R. Acad. Sci. Paris, Ser. I {\bf 337},

1058: 111 (2003)

1059:

1060: \bibitem{Aldous_z2} D. Aldous, Random Structures and Algorithms {\bf

1061: 18} 381 (2001)

1062:

1063:

1064: \bibitem{MPV} M. Mezard, G. Parisi, M.A. Virasoro, {\it Spin Glass

1065: Theory and Beyond}, World Scientific, (1987)

1066:

1067: \bibitem{pspin} S. Cocco, O. Dubois, J. Mandler, R. Monasson.

1068: Phys. Rev. Lett. {\bf 90}, 047205 (2003); M. Mezard, F. Ricci-Tersenghi,

1069: R. Zecchina, J. Stat. Phys. {\bf 111}, 505 (2003)

1070:

1071: \bibitem{science} M. Mezard, G. Parisi, R. Zecchina, Science {\bf

1072: 297}, 812 (2002)

1073:

1074: \bibitem{MZ} M. Mezard and R. Zecchina, Phys.Rev. {\bf E 66}, 056126 (2002)

1075:

1076: \bibitem{BMZ} A. Braunstein, M. Mezard, R. Zecchina, {\it Survey

1077: propagation: an algorithm for satisfiability}, ArXiv:

1078: xxx.lanl.gov/ps/cs.CC/0212002 (2002)

1079:

1080: \bibitem{Gallager} R.G. Gallager, Information Theory and Reliable

1081:   Communications, Wiley, New York, 1968

1082:

1083: \bibitem{Pearl} J. Pearl, {\sl Probabilistic Reasoning in Intelligent Systems},

2nd ed. (Morgan Kaufmann, San Francisco, 1988)

1085:

1086: \bibitem{Spielman} D.A. Spielman, in {\sl Lecture Notes in Computer

1087:   Science} {\bf 1279}, 67 (1997)

1088:

1089: \bibitem{Sourlas} N. Sourlas, in {\sl From Statistical Physics to

1090: Statistical Inference and Back}, P. Grassberger and J-P. Nadal Edts.,

1091: Kluwer Academic, Dordrecht (1994)

1092:

1093: \bibitem{turbo} C. Berrou, A. Glavieux and P. Thitimajshima,

1094:   Proc. Int. Conf. Comm, 1064-1070 (1993)

1095:

1096: \bibitem{Forney} G.D. Forney, Jr., IEEE Trans. Inform. Theory, {\bf

1097:   47}, 520 (2001)

1098:

1099: \bibitem{good_codes1} M.G. Luby, M. Mitzenmacher, M.A. Shokrollahi and

1100:   D.A. Spielman, IEEE Trans. Inform. Theory, {\bf 47}, 569 (2001)

1101:

1102: \bibitem{good_codes2} S-Y. Chung, G.D. Forney,Jr., T.J. Richardson and

1103:   R. Urbanke, IEEE Comm. Letters {\bf 5}, 58 (2001)

1104:

1105: \bibitem{MacKay} D.J.C. MacKay, IEEE Trans. Inform. Theory {\bf 45},

1106:   399 (1999)

1107:

1108: \bibitem{cavity} M. Mezard, G. Parisi, M.A. Virasoro, Europhys. Lett. {\bf

1109: 1}, 77 (1986); M. Mezard, G. Parisi, Eur. Phys. J. {\bf B 20}, 217

1110: (2001); M. Mezard, G. Parisi, J. Stat. Phys.  {\bf 111}, 1 (2003)

1111:

1112:

1113: \bibitem{Cook_review} S.A. Cook, D.G. Mitchell, {\it Finding Hard

1114: Instances of the Satisfiability Problem: A Survey}, In: {\sl

1115: Satisfiability Problem: Theory and Applications}, Du, Gu and Pardalos

1116: (Eds).  DIMACS Series in Discrete Mathematics and Theoretical Computer

1117: Science, Volume 35, (1997)

1118:

1119: \bibitem{nature} R. Monasson, R. Zecchina, S. Kirkpatrick,

1120: B. Selman, and L.  Troyansky, Nature \textbf{400}, 133 (1999);

1121:

1122: \bibitem{MPR} A. Montanari, G. Parisi, F. Ricci-Tersenghi, ArXiv:

1123: xxx.lanl.gov/ps/cond-mat/0308147 (2003)

1124:

1125: \bibitem{joker}

1126: A. Braunstein, M. Mezard, M. Weigt, R. Zecchina, {\it Constraint

1127: Satisfaction by Survey Propagation}, ArXiv

1128: lanl.arXiv.org/ps/cond-mat/0212451 (2002);

1129: G. Parisi, {\it On the survey-propagation equations for the random

1130: K-satisfiability problem}, ArXiv: xxx.lanl.gov/ps/cs.CC/0212009

1131: (2002);

1132: G. Parisi, {\it On local equilibrium equations for clustering states}

1133: ArXiv: xxx.lanl.gov/ps/cs.CC/0212047 (2002);

1134:

1135: \bibitem{factor_graph} F.R. Kschischang, B.J. Frey, H.-A. Loeliger,

1136: {\it IEEE Trans. Infor. Theory} {\bf 47}, 498 (2002).

1137:

1138: \bibitem{GBP} Yedidia, J.S.; Freeman, W.T.; Weiss, Y., {\it Generalized

1139: Belief Propagation}, Advances in Neural Information Processing Systems

1140: (NIPS) {\bf 13}, 689 (2000)

1141:

1142: \bibitem{states} There exist multiple definitions of states (clusters)

for finite sizes (e.g. $k$-flip stable, with $\lim_{N \to \infty} k/N

1144: =0$, \cite{cavity,Biroli:Monasson}) which lead to equivalent

1145: thermodynamical limits in which the SP-cavity formalism is assumed to

1146: hold.

1147:

1148: \bibitem{pspin-note} In particularly simple cases like the so called

1149: diluted p-spin glasses (or random sparse parity check

1150: equations)~\cite{pspin}, the introduction of $*$ states has allowed

1151: for an explicit construction of an exponential number of clusters of

1152: solutions and to prove the exactness of the so called one step replica

1153: symmetry breaking (RSB) solution in the scheme of Parisi~\cite{MPV}.

1154: However, for such models the $*$ variables are in a sense trivial in

1155: that they do not depend on the cluster and their (recursive)

1156: elimination leads to a residual model which can be solved exactly by a

1157: simple annealed/first-moment calculation.  For K-SAT the situation is

1158: more complex (and more general) in that variables are expected to

1159: become $*$ depending on the clusters.

1160:

1161:

1162: \bibitem{note} In the computation of the free energy $\lambda$ should

1163: be taken proportional to the temperature $T$ in the limit $T \to 0$.

1164:

1165: \bibitem{Biroli:Monasson} G. Biroli, R. Monasson, Europhys. Lett. 50,

1166: 155 (2000)

1167:

1168:

1169: \bibitem{Barthel_Hartmann} W. Barthel, A.K. Hartmann,{\it Clustering

1170: analysis of the ground-state structure of the vertex cover problem},

1171: cond-mat/0403193

1172:

1173: \bibitem{napolano} A. Braunstein, V. Napolano, R. Zecchina {\it

1174: Clustering in random SAT}, in preparation

1175:

1176:

1177: \end{thebibliography}

1178: \end{document}

1179:

1180: