1: \documentclass[preprint,11pt]{article}
2: \usepackage{amssymb,amsfonts,amsmath,amsthm,amscd}
3: \usepackage{graphicx}% Include figure files
4: \usepackage{epsfig}% Include figure files
5:
6: \footnotesep 14pt
7: \floatsep 27pt plus 2pt minus 4pt % Nominal is double what is in art12.sty
8: \textfloatsep 40pt plus 2pt minus 4pt
9: \intextsep 27pt plus 4pt minus 4pt
10:
11: % Somewhat wider and taller page than in art12.sty
12: \topmargin -0.4in \headsep 0.4in \textheight 9.0in
13: \oddsidemargin -0.15in \evensidemargin -0.15in \textwidth 7.in
14:
15: \newtheorem{df}{Definition}
16: \newtheorem{thm}{Theorem}
17: \newtheorem{coro}{Corollary}
18: \newtheorem{lemma}{Lemma}
19: \newtheorem{propo}{Proposition}
20: \newtheorem{conj}{Conjecture}
21:
22: \newenvironment{example}[1]{\begin{list}{\setlength{\rightmargin}{\leftmargin}}\item{\bf Example {#1}:}}{\end{list}}
23:
24: \def\gt{\tilde{g}}
25: \def\Z{Z_N}
26: \def\ve{\varepsilon}
27: \def\da{{\partial a}}
28: \def\di{{\partial i}}
29: \def\dpi{{\partial_{+} i}}
30: \def\dpj{{\partial_{+} j}}
31: \def\dmi{{\partial_- i}}
32: \def\dmj{{\partial_- j}}
33: \def\ed{\stackrel{\rm d}{=}}
34: \def\ob{\overline{\beta}}
35:
36: \def\uh{\underline{h}}
37: \def\oh{\overline{h}}
38: \def\uu{\underline{u}}
39: \def\ou{\overline{u}}
40:
41: \def\D{{\mathcal D}}
42: \def\R{{\mathbb R}}
43: \def\E{{\mathbb E}}
44: \def\prob{{\mathbb P}}
45:
46: \def\ux{\underline{x}}
47: \def\uy{\underline{y}}
48: \def\uz{\underline{z}}
49:
50: \def\sBP{{\rm {\tiny BP} }}
51: \def\sTV{{\rm {\tiny TV} }}
52:
53: \def\l|{\left|\left|}
54: \def\r|{\right|\right|}
55:
56: \def\pro{{\sf CD}}
57: \def\Ball{{\sf B}}
58: \def\T{{\sf T}}
59: \def\hT{\widehat{\sf T}}
60: \def\poisson{{\sf Poisson}}
61: \def\ind{{\mathbb I}}
62:
63: \def\proof{\hspace{1.cm}{\bf Proof:}\hspace{0.1cm}}
64: \def\prooft{\hspace{0.5cm}{\bf Proof:}\hspace{0.1cm}}
65: \def\endproof{\hfill$\Box$\vspace{0.4cm}}
66:
67: \newcommand{\<}{\langle}
68: \renewcommand{\>}{\rangle}
69:
70: \begin{document}
71:
72: \title{Counting good truth assignments of random $k$-SAT formulae}
73:
74: \author{Andrea Montanari\thanks{Laboratoire de Physique
75: Th\'{e}orique de l'Ecole Normale Sup\'{e}rieure, Paris.
76: Research is partially supported by European Union under the ip EVERGROW. Email: {\tt montanar@lpt.ens.fr}} \;\; and\;
77: Devavrat Shah\thanks{LIDS, MIT. Research is partially supported by NSF CAREER.
78: Email: {\tt devavrat@mit.edu}.
79: \newline
80: {\bf Keywords:} Random $k$-SAT, Correlation Decay, Uniqueness, Gibbs Distribution}}
81:
82: \date{\today}
83: \maketitle
84:
85: \thispagestyle{empty}
86:
\abstract{We present a deterministic approximation algorithm that computes
the \emph{logarithm} of the number of `good' truth assignments for
a random $k$-satisfiability ($k$-SAT) formula in polynomial time
(by `good' we mean assignments that violate only a small fraction of clauses). The
relative error is bounded above by an arbitrarily small constant
$\epsilon$ with high probability\footnote{In this paper, by the term ``with high probability'' (whp) we
mean with probability $1-o_N(1)$.} as long as the clause density (ratio of
94: clauses to variables) $\alpha<\alpha_{\rm u}(k) = 2k^{-1}\log k(1+o(1))$.
The algorithm is based on the computation of marginal distributions via belief
propagation and on the use of an interpolation procedure. This scheme
replaces the traditional one based on approximation of
marginal probabilities via MCMC, in conjunction with self-reduction,
which is not easy to extend to the present problem.

To establish our results, we derive $2k^{-1}\log k (1+o(1))$ as the threshold for uniqueness of the
Gibbs distribution on satisfying assignments of random infinite tree
$k$-SAT formulae, which is of interest in its own right.
104: }
105: %
106: %*******************************************************************
107: %
108: \section{Introduction}
109:
110: \noindent{\bf Setup and Problem Statement.}
Given $N$ boolean variables $x_i, 1\leq i\leq N$, an $M$-clause
$k$-satisfiability ($k$-SAT) formula has the form $F = \wedge_{j=1}^M C_j$,
where $C_j = \vee_{\ell=1}^k z_{j_\ell}$ with each literal $z_{j_\ell}$ being
either $x_i$ or $\bar{x}_i$ for some $1\leq i\leq N$. An assignment
$\ux \in \{0,1\}^N$
of the variables $x_i, 1\leq i\leq N$ satisfies clause $C_j$ if at least
one of the $k$ literals of $C_j$ evaluates to true. We will denote
true by ``1'' and false by ``0''. For a given $F$, let $E(\ux)$ denote
the number of unsatisfied clauses of $F$ under assignment $\ux$.
Given $\beta \in {\mathbb R}_+$ (called the {\em inverse temperature} in
statistical physics), define the {\em partition function} as
122: %
123: \begin{eqnarray}
124: %
125: \Z(\beta,F) \equiv \sum_{\ux \in \{0,1\}^N}e^{-\beta E(\ux)}\, .\label{eq:ZDefinition}
126: %
127: \end{eqnarray}
128: %
Notice that $\Z(\beta, F)$ gives more weight to ``good'' assignments,
i.e. assignments that violate fewer clauses. As $\beta \to \infty$,
$\Z(\beta, F)$ converges to the number of assignments that satisfy
(all clauses of) $F$.
The partition function naturally arises as the normalizing constant in
the following probability measure on $\{0,1\}^N$, often called the
{\em Boltzmann}
distribution \cite{Georgii} related to $F$: for $\ux \in \{0,1\}^N$,
137: \begin{eqnarray}
138: \mu_{\beta, F}(\ux) = \frac{1}{\Z(\beta, F)}\prod_{j=1}^M\psi_j(\ux) ~~=~~ \frac{e^{-\beta E(\ux)}}{\Z(\beta, F)} \, ,\;\;\;\;\mbox{where}
139: \;\;\;\;\;\psi_j(\ux) = \left\{\begin{array}{ll}
1 & \mbox{ if $\ux$ satisfies clause $C_j$,}\\
141: e^{-\beta} & \mbox{ otherwise.}
142: \end{array}\right.\label{eq:GraphicalModel}
143: %
144: \end{eqnarray}
We shall write $\mu(\,\cdot\,) =
\mu_{\beta,F}(\,\cdot\,)$ whenever it is not necessary
to specify the formula and inverse temperature.
148: We further denote by $\<\,\cdot\,\> =\<\,\cdot\,\>_{\beta,F}$
149: expectations with respect to the measure $\mu$.
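
As a simple illustration of these definitions (a worked example added for concreteness, not used in the sequel), consider a formula consisting of a single clause, $M=1$, involving $k$ distinct variables out of $N$. Exactly $2^{N-k}$ assignments violate the clause, whence
\begin{eqnarray*}
\Z(\beta,F) = \left(2^N-2^{N-k}\right)+2^{N-k}e^{-\beta}
= 2^N\left\{1-2^{-k}(1-e^{-\beta})\right\}\, ,\;\;\;\;\;\;
\<E(\ux)\> = \frac{2^{-k}e^{-\beta}}{1-2^{-k}(1-e^{-\beta})}\, .
\end{eqnarray*}
At $\beta=0$ this gives $\Z=2^N$, the total number of assignments, while for $\beta\to\infty$ it converges to $2^N(1-2^{-k})$, the number of satisfying assignments.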
150:
151: In this paper, we are interested in {\em random $k$-SAT} formulas.
152: These are generated by
153: selecting $M$ clauses independently and uniformly at random from all possible
154: $2^k {N \choose k}$ $k$-clauses. Specifically, let $M$ scale linearly in $N$, i.e.
155: $M = \alpha N$ for $\alpha \in {\mathbb R}_+$.
156:
Our main goal in this paper is to describe an
efficient algorithm to compute a good approximation of $\Z(\beta, F)$
for such random formulas.
An important open problem is to show that, for any
$\alpha, \beta \in {\mathbb R}_+$,
under the probability distribution induced by random $k$-SAT formulas,
the limit $\lim_{N\to\infty} \frac{1}{N} \log \Z(\beta, F)$
exists with probability $1$. The analysis of our algorithm implies
such a result for all finite $\beta$, and $\alpha$ smaller than a critical
value.
167:
168: \vspace{.1in}
169: \noindent{\bf Related Previous Work.} The well-known threshold conjecture
170: for random $k$-SAT states that for all $k \ge 2$, there exists
171: $\alpha_{\rm c}(k)$ such
172: that for $\alpha < \alpha_{\rm c}(k)$ (resp. $\alpha > \alpha_{\rm c}(k)$)
the randomly generated formula is satisfiable (resp. not satisfiable) with
probability tending to $1$ as $N\to\infty$. There has been a great deal of
interesting work on this topic, and a convergence of
methods from different communities \cite{Monasson,Mezard,ANPNature}.
Due to space limitations, we recall only some of the most relevant results.
178:
Friedgut \cite{Friedgut} established the existence of a sharp threshold.
More precisely, he proved that there exists
$\alpha_{\rm c}(k, N)$ such that, for any $\eta>0$, the satisfiability probability
tends to $1$ (to $0$) if $\alpha<\alpha_{\rm c}(k,N)(1-\eta)$
(respectively $\alpha>\alpha_{\rm c}(k,N)(1+\eta)$).
While it is expected that $\lim_{N\to\infty} \alpha_{\rm c}(k,N)$ exists,
a proof has remained elusive. Recently, Achlioptas and Peres \cite{AP04}
established that $\alpha_{\rm c}(k, N) = 2^k \ln 2\, (1+o_k(1))$,
thus implying that $\alpha_{\rm c}(k, N)$ can be taken $N$-independent
to first order for large $k$.
189:
The existence of $\lim_{N\to\infty} \lim_{\beta\to\infty}
\frac{1}{N} \log \Z(\beta, F)$ with probability $1$, for all
$\alpha \in {\mathbb R}_+$ and $k$, would naturally establish the threshold
conjecture.
More generally, the log-partition function at $\beta=\infty$
provides detailed information about the satisfying assignments
(computing it exactly is of course \#P-complete).
197: In \cite{MonassonZecchina} a formula for the limit log-partition
198: function was derived through the non-rigorous replica method
199: from statistical physics.
200: The existence of the $N\to\infty$ limit was proved by
201: Franz, Leone and Toninelli \cite{FL03, FLT03} for even $k$ and all values of
202: $\alpha$. These authors also provided an upper bound on
203: $\lim_{N\to\infty}\frac{1}{N} \log \Z(\beta, F)$.
However, evaluating the bound requires solving an {\em a priori} complex
optimization problem, and a matching lower bound was not proved there.
Talagrand \cite{T01} established the existence
of the limit and its value for very small values of $\beta$ (depending
on $k$).
209:
210: \vspace{.1in}
\noindent{\bf Overview of Results.} In this paper, we essentially prove that
the Boltzmann distribution (\ref{eq:GraphicalModel}) is a
{\em pure state} \cite{Georgii} by establishing appropriate worst-case
{\em correlation decay} for tree formulae. The approach of Talagrand \cite{T01}
also crucially relied on proving correlation decay,
albeit by different means. This resulted in a limitation to small values of
$\beta$, thus leaving out the interesting regime of large $\beta$.
218:
219: An analogy can be drawn with the Markov Chain Monte Carlo (MCMC) approach
220: to the approximate computation of partition functions
221: (see, for example, work by Jerrum and Sinclair \cite{JS93}).
222: In that case, the crucial step consists in proving an appropriate
223: mixing condition (`temporal' correlation decay) for some Markov Chain.
224: The same role is played here by `spatial' correlation decay with respect
225: to the measure (\ref{eq:GraphicalModel}).
226:
In this paper, we establish correlation decay for random $k$-SAT formulae for a
range of $\alpha$ and all $\beta$.
This allows us to establish that the deterministic Belief Propagation algorithm
provides a good approximation of the marginals with respect to
the distribution (\ref{eq:GraphicalModel}), cf. Section \ref{sec2}.
In the usual MCMC approach, marginals are used to approximate the partition
function by recursively fixing the variables and exploiting self-reducibility.
This cannot be done in the present case because the reduced SAT
formulae are no longer random.
Instead, we use {\em interpolation} in $\beta$ to obtain
$\log \Z(\beta, F)$ approximately (Theorem \ref{thm:KSAT}).
238: The analysis of the approximation scheme implies the existence of the limit
239: $\lim_{N\to\infty}\frac{1}{N} \log \Z(\beta, F)$
240: (Theorem \ref{thm:LimitPartFun}).
241: We hope that our novel approach for counting will find applications in
242: other hard combinatorial problems.
243: Similar schemes were recently discussed by Weitz \cite{W06}, and
244: Bandyopadhyay and
245: Gamarnik \cite{BG06} for counting independent sets approximately
246: via deterministic algorithms.
247:
Finally, we show that the computation of the partition function
leads to an estimate of the number of truth assignments that violate at most
$N\ve$ clauses, for small $\ve$ (Theorem \ref{thm:AlmostSAT}).
As a byproduct, we obtain the asymptotic (in $k$) threshold for
uniqueness of the Gibbs measure on infinite $k$-SAT tree formulae
(Theorem \ref{thm:UniquenessTrees}).
254:
255:
256: \vspace{.1in}
257:
\noindent{\bf Organization.} Section \ref{sec1} presents preliminaries and
statements of the main results. Section \ref{sec2} describes the
approximate counting algorithm and the proofs of the key lemmas related
to the correlation decay (or uniqueness) of the Gibbs distribution on random
{\em tree} $k$-SAT. Section \ref{sec3} completes the proofs of all
main results stated in Section \ref{sec1}. We present directions for future work in
Section \ref{sec4}.
265:
266: \section{Preliminaries and Main Results}
267: \label{sec1}
268:
Given $k$, define $\alpha_*(k)$ to be
the smallest positive root of the equation $\kappa(\alpha) = 1$,
where
272: %
273: \begin{eqnarray}
274: %
275: \kappa(\alpha) \equiv k(k-1)\alpha\left(1-\frac{1}{4}\,e^{-k\alpha/2}\right)
276: \left(1-\frac{1}{2}\,e^{-k\alpha/2}\right)^{k-2} \, .
277: \label{eq:ContractionRate}
278: %
279: \end{eqnarray}
280: %
For $k=2, 3, 4, 6$, the value of $\alpha_*(k)$ is approximately
$0.58126, 0.293, 0.217, 0.16670$, respectively. Asymptotically,
$\alpha_*(k) = 2k^{-1}\log k \left(1+ O\left(\frac{\log \log k}{\log k}\right)\right)$.
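
The root $\alpha_*(k)$ is easy to evaluate numerically. The following Python sketch is an illustration only: the names \texttt{kappa} and \texttt{alpha\_star} and the bracketing strategy are our own choices. It relies on the observation that each factor in (\ref{eq:ContractionRate}) is non-negative and increasing in $\alpha$, so that $\kappa(\alpha)=1$ has a single positive root, which bisection locates.

```python
import math

def kappa(alpha, k):
    """Contraction rate kappa(alpha) from Eq. (eq:ContractionRate)."""
    e = math.exp(-k * alpha / 2.0)
    return k * (k - 1) * alpha * (1.0 - 0.25 * e) * (1.0 - 0.5 * e) ** (k - 2)

def alpha_star(k, tol=1e-10):
    """Root of kappa(alpha) = 1 found by bisection (hypothetical helper)."""
    lo, hi = 0.0, 1.0            # kappa(0) = 0 < 1
    while kappa(hi, k) < 1.0:    # enlarge the bracket until kappa(hi) >= 1
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kappa(mid, k) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for k in (2, 3, 4, 6):
    print(k, round(alpha_star(k), 5))
```
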
We now state the main result of this paper, on approximating the logarithm of
the partition function.
286: \begin{thm}\label{thm:KSAT}
Given $\ve > 0$ and $\alpha < \alpha_*(k)$, there exist $\delta'>0$ and
a polynomial (in $N$, independent of $\ve$) time algorithm that computes
a number $\Phi(\beta,F)$ (the input being $\beta\in {\mathbb R}$ and a
satisfiability formula $F$) such that the following is true.
If $\beta \in [0, N^{\delta'}]$ and $F$ is
a random $k$-SAT formula with $N$ variables and $M = N\alpha$ clauses,
then, with high probability,
294: %
295: \begin{eqnarray}
296: %
297: \Phi(\beta,F)\,(1-\ve)\le\log \Z(\beta,F) \le \Phi(\beta,F)\,(1+\ve). \,
298: \label{eq:Guarantee}
299: %
300: \end{eqnarray}
301: %
302: \end{thm}
303: %
304: The proof of Theorem \ref{thm:KSAT} requires us to prove uniqueness of Gibbs
305: measure for the model (\ref{eq:GraphicalModel}) on infinite tree random $k$-SAT formulae. To state this result, we
306: first need some definitions. An appropriate model for tree random $k$-SAT,
307: $\T_*(r)$ is described as follows: For $r=0$, it is the graph containing
a unique variable node. For any $r \ge 1$, start with a single variable node
(the root) and add $l\ed \poisson(k\alpha)$ clauses, each one
including the root and $k-1$ new variables (first generation variables).
For each one of the $l$ clauses, the corresponding literals are
non-negated or negated independently with equal probability. If $r \ge 2$,
generate an independent copy of $\T_*(r-1)$ for each variable node
in the first generation and attach it to that node. By construction,
315: for any $r'<r$ the first $r'$ generations of a tree from
316: $\T_*(r)$ are distributed according to the model $\T_*(r')$.
317: As a consequence, the infinite tree distribution $\T_*(\infty)$
318: is also well defined. In what follows, we denote
the root of $\T_*(\cdot)$ as $0$. Let $\mu$ denote the Gibbs distribution
of the random formula on $\T_*(r)$ (cf. (\ref{eq:GraphicalModel})) and
let $\mu_{0|r}(x_0|\ux_r)$ be the conditional distribution of the root
variable given the assignment $\ux_r$ of the $r$-th generation
nodes of $\T_*(r)$. The key property for most of
the results of this paper is correlation decay with respect to
random tree formulas $\T_*(\cdot)$.
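
The construction of $\T_*(r)$ can be sketched in a few lines of Python. This is only an illustration under our own conventions: the function names, the representation of a clause as a list of (variable, negated) pairs, and the use of Knuth's multiplication method for Poisson sampling are assumptions, not part of the model definition.

```python
import math
import random

def poisson(mean, rng):
    """Poisson sample via Knuth's multiplication method (small means only)."""
    limit, count, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def sample_tree(r, k, alpha, rng):
    """Sample r generations of the tree model; variable 0 is the root.

    Returns (number of variables, list of clauses); each clause is a list
    of (variable, negated) pairs with independent fair-coin negations.
    """
    clauses, n_vars, frontier = [], 1, [0]
    for _ in range(r):
        next_frontier = []
        for v in frontier:
            for _ in range(poisson(k * alpha, rng)):
                children = list(range(n_vars, n_vars + k - 1))
                n_vars += k - 1
                clauses.append([(w, rng.random() < 0.5) for w in [v] + children])
                next_frontier.extend(children)
        frontier = next_frontier
    return n_vars, clauses

n_vars, clauses = sample_tree(3, 3, 0.2, random.Random(0))
print(n_vars, len(clauses))
```

Each clause introduces $k-1$ new variables, so the number of variables always equals $1+(k-1)$ times the number of clauses.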
326: %
327: \begin{df}\label{def:Uniqueness}
328: Given $\alpha, \beta \in {\mathbb R}_+$ and $k\ge 2$, the Gibbs distribution
329: defined by (\ref{eq:GraphicalModel}) on the random tree $\T_*(\cdot)$
is unique with exponential correlation decay if there exist positive constants $A,\gamma > 0$,
331: such that
332: %
333: \begin{eqnarray}
334: %
335: \E\left[\sup_{\ux_r,\uz_r}
336: \l|\mu_{0|r}(\,\cdot\,|\ux_r)-\mu_{0|r}(\,\cdot\,|\uz_r)\r|_{\sTV} \right]
337: \le A\, e^{-\gamma r}\, ,
338: %
339: \end{eqnarray}
340: %
341: for any $r\ge 0$. The uniqueness threshold $\alpha_{\rm u}(k)$ is the supremum value
342: of $\alpha$ such that the above condition is verified for any
343: $\beta\in[0,\infty]$.
344: %
345: \end{df}
346: %
The property defined here is considerably stronger than the usual notion of
correlation decay, which only requires
$\l|\mu_{0|r}(\,\cdot\,|\ux_r)-\mu_{0|r}(\,\cdot\,|\uz_r)\r|_{\sTV}\to 0$ as $r\to\infty$
almost surely. Let $\alpha_{\rm u}'(k)$ denote the threshold for this
weaker property. To the best of our knowledge, nothing was previously known
about the precise values of $\alpha_{\rm u}(k), \alpha_{\rm u}'(k)$ or the
relation between them, other than the trivial lower bound of $\Omega(k^{-2})$
from the percolation threshold. We establish the precise asymptotic behavior
of $\alpha_{\rm u}(k)$ and show that $\alpha_{\rm u}(k) =
\alpha_{\rm u}'(k)(1+o_k(1))$
as stated below.
358: \begin{thm}\label{thm:UniquenessTrees}
359: For the Gibbs distribution (\ref{eq:GraphicalModel}) defined on $\T_*(\cdot)$ as above,
360: %
361: \begin{eqnarray}
362: %
363: \alpha_{\rm u}(k) = \frac{2\log k}{k}
364: \left\{1+O\left(\frac{\log\log k}{\log k}\right)\right\}, \;\;\;\;\;\;\;\;
365: \alpha_{\rm u}'(k) = \frac{2\log k}{k}
366: \left\{1+O\left(\frac{\log\log k}{\log k}\right)\right\}\, .
367: %
368: \end{eqnarray}
369: %
370: \end{thm}
371: %
Although algorithmically we only obtain an approximation of $\log \Z(\beta, F)$,
it is possible to establish the convergence of $\frac{1}{N} \log \Z(\beta, F)$
with probability $1$. Before stating this result, we need some definitions.
In what follows, define the function $f:\R^{k-1}\to\R$ as
376: %
377: \begin{eqnarray}
378: %
379: f(x_1,\dots,x_{k-1}) = -\frac{1}{2}\log\left\{1-
380: \frac{1-e^{-\beta}}{2^{k-1}}\prod_{i=1}^{k-1}(1-\tanh x_i)\right\}\, .
381: \label{eq:FDefinition}
382: %
383: \end{eqnarray}
384: %
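
As a quick sanity check on (\ref{eq:FDefinition}) (a worked evaluation added for illustration), set all arguments to zero, so that the product equals $1$:
\begin{eqnarray*}
f(0,\dots,0) = -\frac{1}{2}\log\left\{1-\frac{1-e^{-\beta}}{2^{k-1}}\right\}\, .
\end{eqnarray*}
This vanishes at $\beta=0$, and as $\beta\to\infty$ it increases to $-\frac{1}{2}\log(1-2^{1-k})$, which for $k=3$ equals $\frac{1}{2}\log(4/3)\approx 0.144$. More generally, $f\ge 0$ for all real arguments, since the quantity inside the logarithm never exceeds $1$.
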
Let $\D$ denote the space of probability distributions on the
real line $\R$. Define maps
$S, S_1, S_2: \D \to \D$ as follows. Given $\mu \in \D$, let
$u = f(h_1,\dots, h_{k-1})$, where $h_1,\dots, h_{k-1}$ are
i.i.d. with distribution $\mu$, and let $S_1(\mu)$ be the distribution of $u$.
Given a distribution $\nu \in \D$, let $h_0 = \sum_{a=1}^{\ell^+} u_a - \sum_{b=1}^{\ell^-} u_b$,
where $\ell^+, \ell^-$ are independent Poisson random variables with
mean $k\alpha/2$ and $u_a, u_b$ are i.i.d. with distribution $\nu$, and let
$S_2(\nu)$ be the distribution of $h_0$. Define $S \equiv S_1 \circ S_2$. We can now
state the result.
395: \begin{thm}\label{thm:LimitPartFun}
396: %
Given $k$, let $\alpha< \alpha_*(k)$ and $\beta\in [0,\infty)$. Then,
the function $S : \D \to \D$ as defined above has a unique fixed point, say
$\mu^*$. Let $\nu^* = S_2(\mu^*)$. Then,
400: \begin{eqnarray}
401: %
402: \frac{1}{N}\log Z(\beta,F_N)\stackrel{{\rm a.s.}}{\to} \phi(\beta)\, ,\label{eq:AlmostSure}
403: %
404: \end{eqnarray}
405: \begin{eqnarray}
406: %
407: \mbox{where}~\phi(\beta) &=& -k\alpha\E\log[1+\tanh h\tanh u]+
408: \alpha\E\log \left\{1-\frac{1}{2^{k}}(1-e^{-\beta})
409: \prod_{i=1}^{k}(1-\tanh h_i)\right\}
410: +\label{eq:phi}\\
411: &&+\E\log\left\{\prod_{i=1}^{\ell_+}(1+\tanh u^+_i)
412: \prod_{i=1}^{\ell_-}(1-\tanh u^-_i)+\prod_{i=1}^{\ell_+}(1-\tanh u^+_i)
413: \prod_{i=1}^{\ell_-}(1+\tanh u^-_i)\right\}\, ,\nonumber
414: %
415: \end{eqnarray}
416: %
where $u, u^{\pm}_i$ are i.i.d. with distribution $\mu^*$, $h, h_i$ are i.i.d.
with distribution $\nu^*$ and $\ell_{\pm}$ are independent Poisson random variables of mean $k\alpha/2$.
419:
420: \end{thm}
421: %
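
A simple consistency check of (\ref{eq:phi}), added here for illustration, is the case $\beta=0$. Then $f\equiv 0$ by (\ref{eq:FDefinition}), so $S_1$ maps every distribution to the point mass at $0$; hence $\mu^*$ and $\nu^*=S_2(\mu^*)$ are both point masses at $0$. All hyperbolic tangents in (\ref{eq:phi}) vanish and
\begin{eqnarray*}
\phi(0) = -k\alpha\log 1+\alpha\log 1+\log\{1+1\} = \log 2\, ,
\end{eqnarray*}
in agreement with the exact value $\frac{1}{N}\log Z(0,F_N)=\log 2$.
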
Finally, define $\Xi(\zeta,F)$ to be the number of assignments that violate at
423: most $\zeta$ clauses. The next result formalizes the relation between the
424: approximation of $\Z(\beta, F)$ and counting the number of truth assignments
425: that violate a small fraction of clauses.
426: \begin{thm}\label{thm:AlmostSAT}
427: %
For any $k\ge 2$, $\ve > 0$, and $\alpha<\alpha_*(k)$
there exist $A,C>0$, $a>0$ such that the following is true.
If $F$ is a random $k$-SAT formula with $M=N\alpha$ clauses over $N$ variables,
and $\beta = A\log 1/\ve$, then
432: %
433: \begin{eqnarray}
434: %
435: \left|\log \Xi(N\ve,F)-\Phi(\beta,F)\right|\le NC\ve^a\, ,
436: %
437: \end{eqnarray}
438: %
with high probability, where $\Phi(\beta, F)$ is as defined in Theorem
\ref{thm:KSAT}.
441: %
442: \end{thm}
443: %
444:
445: \section{Algorithm and Key Lemmas}
446: \label{sec2}
447:
448: \subsection{Algorithm}
449: \label{ssec:algorithm}
450:
We first define a factor graph $G_F$ for a given formula $F$: each
variable is represented by a (circle) variable node and each clause
by a (square) clause node, with an edge between a variable node and
a clause node only if the corresponding variable belongs to the clause. The
edge is solid if the variable is non-negated and dashed if it is
negated. The Belief Propagation (BP) algorithm is a heuristic (exact for
tree factor graphs) to estimate the marginal distributions of the variables
for any factor graph. Specifically, we will use BP to approximately compute
marginals of the distribution (\ref{eq:GraphicalModel}).
460:
We now briefly recall BP for our specific setup, and refer the reader to
\cite{Pearl, WJ} for further details on the algorithm. BP is a
message passing algorithm in which at each iteration messages are
sent from variable nodes to neighboring clause nodes and vice versa. The
messages at iteration $t+1$ are functions of messages received at iteration
$t$. To describe the message update equations, we need some notation. Let
$\da$ denote the set of all variables that belong to clause $a$.
If variable $x_i$ is involved in clause $a$ as literal $z$
(either $z=x_i$ or $z=\bar{x}_i$), then define $\dpi(a)$ as the set
of all clauses (minus $a$) in which $x_i$ appears as $z$. Similarly,
$\dmi(a)$ denotes the set of all clauses in which $x_i$ appears as $\bar{z}$.
Let $\{h^{(t)}_{i\to a}\}$, $\{u^{(t)}_{a\to i}\}$ denote the messages
(ideally they are half log-likelihood ratios) that are passed along the directed edges
$i\to a$ and $a\to i$
respectively at time $t$. The precise update equations are
476: %
477: \begin{eqnarray}
478: %
479: h_{i\to a}^{(t+1)} = \sum_{b\in\dpi(a)}u^{(t)}_{b\to i}
480: -\sum_{b\in\dmi(a)}u^{(t)}_{b\to i}\,
481: , \;\;\;\;\;\;
482: u^{(t)}_{a\to i}= f(\{h^{(t)}_{j\to a};j\in\da\backslash i\})\, ,
483: \label{eq:BPUpdate}
484: %
485: \end{eqnarray}
486: %
where the function $f(\, \cdot\,)$ has been defined in
Eq.~(\ref{eq:FDefinition}). We shall assume\footnote{In fact an arbitrary
initial condition and a smaller number of iterations would not change
our main results.} that the update equations are initialized by $h^{(0)}_{i\to a} =0$
and that the algorithm stops at iteration $t_{\rm max}$, which is equal to the diameter of
$G_F$. Let $(h_{i\to a}, u_{a\to i})$ be the messages passed in the last iteration of BP.
Using these messages, an estimate of the probability that
a clause is satisfied can
be obtained as follows. Let $E_a(\ux_{\da})$ be the indicator function for
the $a$-th clause not being satisfied. As mentioned above,
$h_{i\to a}$ is thought of as the half log-likelihood ratio for $i$ satisfying
$a$ versus $i$ not satisfying $a$, in the absence of clause $a$ itself.
A little algebra then shows that the BP estimate for the expectation
of $E_a(\ux_{\da})$ is
501: %
502: \begin{eqnarray}
503: %
\<E_a(\ux_{\da})\>_{\sBP} = \frac{\sum_{\ux_{\da}}
E_a(\ux_{\da})\; \exp\{-\beta E_a(\ux_{\da})+ \sum_{i\in\da}h_{i\to a}\sigma_{ai}(x_i)\}}
{\sum_{\ux_{\da}} \exp\{-\beta E_a(\ux_{\da})+ \sum_{i\in\da}h_{i\to a}\sigma_{ai}(x_i)\}}
507: \, ,
508: %
509: \end{eqnarray}
510: %
511: where $\sigma_{ai}(x) = +1$ if setting $x_i=x$ satisfies clause $a$,
512: and $=-1$ otherwise. We further introduce the number of clauses violated by
513: $\ux$, $E(\ux) = \sum_a E_a(\ux_{\da})$, and its BP estimate
$\<E(\ux)\>_{\sBP} = \sum_a \<E_a(\ux_{\da})\>_{\sBP}$.
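
The update rules (\ref{eq:BPUpdate}) and the resulting clause estimate can be checked on a small tree formula. The Python sketch below is illustrative only: the encoding of literals as signed integers, all function names, and the toy formula are our own assumptions, each variable is assumed to appear at most once per clause, and $\beta$ is taken finite. The clause estimate is written in closed form, obtained by carrying out the sums over $\ux_{\da}$ with the weight $\exp\{-\beta E_a+\sum_{i\in\da}h_{i\to a}\sigma_{ai}(x_i)\}$.

```python
import itertools
import math

def f(hs, beta):
    """Clause-to-variable update, Eq. (eq:FDefinition)."""
    prod = 1.0
    for h in hs:
        prod *= 1.0 - math.tanh(h)
    return -0.5 * math.log(1.0 - (1.0 - math.exp(-beta)) * prod / 2 ** len(hs))

def run_bp(clauses, beta, iters):
    """Iterate Eq. (eq:BPUpdate) from h = 0; literal +v is x_v, -v its negation."""
    edges = [(a, lit) for a, c in enumerate(clauses) for lit in c]
    h = {(abs(lit), a): 0.0 for a, lit in edges}
    for _ in range(iters):
        u = {(a, abs(lit)): f([h[(abs(l), a)] for l in clauses[a] if l != lit], beta)
             for a, lit in edges}
        new_h = {}
        for a, lit in edges:
            total = 0.0
            for b, lit_b in edges:
                if b != a and abs(lit_b) == abs(lit):
                    # same sign as in a: add u; opposite sign: subtract u
                    total += u[(b, abs(lit))] if lit_b == lit else -u[(b, abs(lit))]
            new_h[(abs(lit), a)] = total
        h = new_h
    return h

def bp_violation(clauses, a, h, beta):
    """BP estimate of the probability that clause a is violated."""
    p = 1.0
    for lit in clauses[a]:
        p *= 0.5 * (1.0 - math.tanh(h[(abs(lit), a)]))
    w = math.exp(-beta)
    return w * p / (1.0 - (1.0 - w) * p)

def exact_violation(clauses, a, n_vars, beta):
    """Brute-force expectation of E_a under the Boltzmann measure."""
    num = den = 0.0
    for x in itertools.product((0, 1), repeat=n_vars):
        violated = [all((x[abs(l) - 1] == 1) != (l > 0) for l in c) for c in clauses]
        weight = math.exp(-beta * sum(violated))
        den += weight
        num += weight * violated[a]
    return num / den

clauses = [(1, -2, 3), (-3, 4, 5)]   # toy tree formula on 5 variables
h = run_bp(clauses, 1.5, iters=4)
print(bp_violation(clauses, 0, h, 1.5), exact_violation(clauses, 0, 5, 1.5))
```

On this tree the two printed numbers agree up to rounding, as expected from the exactness of BP on trees.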
515:
516: Given $\beta>0$, we let $\beta_i = i\beta/N^{2}$, for $i=0,\dots,n\equiv N^{2}$.
517: Then,
518: %
519: \begin{eqnarray}
520: %
521: \log Z(\beta,F) & = & \log Z(0,F) +
522: \sum_{i=0}^{n-1}\log \frac{Z(\beta_{i+1},F)}{Z(\beta_{i},F)} =
523: N\log 2+\sum_{i=0}^{n-1}\log \<e^{-\Delta E(\ux)}\>_i\, ,
524: \label{eq:Telescopic}
525: %
526: \end{eqnarray}
527: %
where $\Delta \equiv \beta_{i+1}-\beta_i = \beta/n$, and
$\<\,\cdot\,\>_i$ is a shorthand for expectation with respect
to the measure $\mu_{\beta_i,F}(\,\cdot\,)$. The above
expression is difficult to evaluate exactly. However, since $\Delta$
is small, $-\Delta\,\<E(\ux)\>_i$ is a good estimate of
$\log \<e^{-\Delta E(\ux)}\>_i$. Hence, we define the algorithm estimate as
534: %
535: \begin{eqnarray}
536: %
\Phi(\beta,F) = N\log 2 - \sum_{i=0}^{n-1}\Delta\, \<E(\ux)\>_{\sBP,i}\, ,
538: \label{eq:AlgorithmOutput}
539: %
540: \end{eqnarray}
541: %
542: where the subscript in $\<\,\cdot\,\>_{\sBP,i}$ emphasizes that the BP
543: computation must be performed at inverse temperature $\beta_i$.
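
The interpolation step can be illustrated on a formula small enough for brute-force enumeration. In the Python sketch below, the exact expectations $\<E(\ux)\>_i$ replace the BP estimates, so it tests only the linearization $\log \<e^{-\Delta E(\ux)}\>_i\approx -\Delta\,\<E(\ux)\>_i$ summed over $i=0,\dots,n-1$ as in (\ref{eq:Telescopic}), not the BP approximation itself. The function names, the signed-integer encoding of literals, the toy formula and the number of steps are our own assumptions.

```python
import itertools
import math

def energies(clauses, n_vars):
    """E(x) for every assignment x; literal +v is x_v, -v its negation."""
    out = []
    for x in itertools.product((0, 1), repeat=n_vars):
        out.append(sum(all((x[abs(l) - 1] == 1) != (l > 0) for l in c)
                       for c in clauses))
    return out

def log_z(es, beta):
    """Exact log-partition function from the list of energies."""
    return math.log(sum(math.exp(-beta * e) for e in es))

def mean_energy(es, beta):
    """Exact expectation of E under the Boltzmann measure at beta."""
    weights = [math.exp(-beta * e) for e in es]
    return sum(w * e for w, e in zip(weights, es)) / sum(weights)

def interpolate(es, n_vars, beta, n_steps):
    """N log 2 - sum_i Delta <E>_i with beta_i = i * beta / n_steps."""
    delta = beta / n_steps
    total = sum(mean_energy(es, i * delta) for i in range(n_steps))
    return n_vars * math.log(2.0) - delta * total

clauses = [(1, -2, 3), (-1, 2, 4), (2, 3, -4)]   # toy formula, N = 4
es = energies(clauses, 4)
print(log_z(es, 2.0), interpolate(es, 4, 2.0, 1000))
```

Since each term of the telescopic sum is approximated with an error of order $\Delta^2$ times the variance of $E(\ux)$, the total discrepancy is of order $\beta^2M^2/n$, which is why a fine grid in $\beta$ is used.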
544:
545: \subsection{Key Lemmas}
546: \label{ssec2}
547:
Before presenting the key lemmas, let us introduce some notation.
Given the factor graph $G_F$
and a variable node $i$, $1\leq i \leq N$, let $\Ball_i(r)$ denote the subgraph
induced by the set of all variables that are within shortest path distance $r$
of node $i$ (the distance between two variables sharing a clause is one).
Analogously, for a clause node $a$, $\Ball_a(r)$ is the union
of $\Ball_i(r)$ with $i$ running over the variables involved in $a$. Let
$A$ be a subset of variable nodes, and let $\ux_A$ denote an
assignment to the corresponding variables. Given two such subsets
$A,B\subseteq [N]$ and assignments $\ux_A$, $\ux_B$, let
$\mu_{A|B}(\ux_A|\ux_B)$ be the conditional probability under
the distribution (\ref{eq:GraphicalModel}) of the variables in $A$,
given assignment $\ux_B$ on $B$. The following is a well-known
result about the BP algorithm (see \cite{TJ99}).
562: %
563: \begin{lemma}\label{lemma:BoundBP}
564: Given a clause $a$ and $r$, let $\Ball_a(r+1)$ be a tree.
565: Let $U = \Ball_a(r)$ and $V=[N]\backslash U$. Then
566: %
567: \begin{eqnarray}
568: %
569: \left|\<E_a(\ux_{\da})\>-\<E_a(\ux_{\da})\>_{\sBP}\right|
570: \le \sup_{\uy,\uz}\l|\mu_{\da|V}(\,\cdot\,|\uy_{V})-
571: \mu_{\da|V}(\,\cdot\,|\uz_{V})\r|_{\sTV}\, ,\\
572: 0\le\<E_a(\ux_{\da})\>,\,\<E_a(\ux_{\da})\>_{\sBP}\le
\max_{\uz_{V}}\left\{\sum_{\ux_{\da}}E_a(\ux_{\da})\mu_{\da|V}(\ux_{\da}|\uz_V)
574: \right\}
575: %
576: \end{eqnarray}
577: \end{lemma}
578:
Next, we present a known result about the locally tree-like structure of random
$k$-SAT formulae (an analogous result concerns the local structure of
sparse random graphs).
582: %
583: \begin{lemma}\label{lemma:LocalTree}
584: %
585: Consider $k\ge 2$, $\alpha\in [0,\infty)$ and a random $k$-SAT formula
586: $F$ with clause density $\alpha$. For $r\ge 0$, let $\Ball_i(r)$ be the ball
587: of radius $r$ centered at a uniformly random variable node $i$. Let
$S(r)$ be an $r$-generation tree with the same distribution as $\T_*(r)$
(with the same values of $k$ and $\alpha$). Then, there exist constants $A, \rho$ (dependent
on $\alpha, k$) such that
591: %
592: \begin{eqnarray}
593: %
594: \l|\,\prob\{\Ball_i(r)\in\,\cdot\, \}-\prob\{S(r)\in\,\cdot\, \}\,
595: \r|_{\sTV}\le \frac{A\,e^{\rho r}}{N}\, .
596: %
597: \end{eqnarray}
598: %
599: \end{lemma}
600: %
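
To see how large a neighborhood is covered by Lemma \ref{lemma:LocalTree} (a remark added for illustration, assuming $\rho>0$), take $r = c\log N$ for a constant $c>0$. The right-hand side of the bound becomes
\begin{eqnarray*}
\frac{A\,e^{\rho c\log N}}{N} = A\, N^{c\rho-1}\, ,
\end{eqnarray*}
which vanishes as $N\to\infty$ whenever $c<1/\rho$. Hence balls of radius up to $c\log N$ around a uniformly random variable node are, with high probability, distributed as the tree $\T_*(r)$.
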
601: %
602: \begin{figure}
603: \center{\includegraphics[width=6.cm]{twogen.eps}}
604: \put(-182,5){$b$}
605: \put(-192,25){$u_{b\to j_1}$}
606: \put(-155,44){$j_1$}
607: \put(-72,85){$a$}
608: \put(-73,107){$u_{a\to i}$}
609: \put(-72,130){$i$}
610: \caption{Pictorial representation of the recursion (\ref{eq:RecursionU})
611: on the factor graph $G_F$: filled squares represent function nodes and
612: empty circles variable nodes. Dashed edges correspond to negations.}
613: \label{fig:TwoGen}
614: \end{figure}
615: %
616:
617: \begin{lemma}\label{lemma:Key}
618: %
619: Let $\alpha_*(k)$ be the smallest positive root of the equation
620: $\kappa(\alpha) = 1$, with $\kappa(\alpha)$ defined as
621: in Eq.~(\ref{eq:ContractionRate}).
622: Then $\alpha_*(k)\le\alpha_{\rm u}(k)$.
623: %
624: \end{lemma}
625: %
626: \prooft
Given an $r$-generation tree formula $F$, consider an edge
628: $i\to a$ directed toward the root
629: and the subtree rooted at $i$ and \emph{not} containing $a$.
630: Denote by $\mu_{i\to a}( \, \cdot\,)$ the marginal distribution of
631: $x_i$ with respect to the model associated to this subtree, and
632: let $h_{i\to a}\in[-\infty,\infty]$ be the corresponding log-likelihood ratio
633: %
634: \begin{eqnarray}
635: %
636: h_{i\to a} \equiv \frac{1}{2}\log \left\{
637: \frac{\mu_{i\to a}(\mbox{$x_i$ satisfies $a$})}
638: {\mu_{i\to a}(\mbox{$x_i$ doesn't satisfy $a$})}\right\}\, .\label{eq:DefLLR}
639: %
640: \end{eqnarray}
641: %
642: Analogously, given an edge $a\to i$, we consider the subtree rooted at $i$ and
643: containing only $a$ among the clauses involving $i$. We denote
644: by $\mu_{a\to i}( \, \cdot\,)$ the corresponding marginal distribution at
645: $i$, and let
646: %
647: \begin{eqnarray}
648: %
649: u_{a\to i} \equiv \frac{1}{2}\log \left\{
650: \frac{\mu_{a\to i}(\mbox{$x_i$ satisfies $a$})}
651: {\mu_{a\to i}(\mbox{$x_i$ doesn't satisfy $a$})}\right\}\, .
652: %
653: \end{eqnarray}
654: %
655: It is easy to show that these log-likelihoods satisfy the
656: recursions\footnote{The reader will notice that these coincide with
657: the BP update equations, cf. Eq.~(\ref{eq:BPUpdate}), which are known to
658: be exact on trees.}
659: %
660: \begin{eqnarray}
661: %
662: h_{j\to a} = \sum_{b\in\dpj(a)}u_{b\to j}
663: -\sum_{b\in\dmj(a)}u_{b\to j}\,
664: , \;\;\;\;\;\;\;\;\;\;
665: u_{a\to i}= f(\{h_{j\to a};j\in\da\backslash i\})\, ,
666: %
667: \end{eqnarray}
668: %
669: with the function $f(\,\cdot\,)$ being defined as in
670: Eq.~(\ref{eq:FDefinition}).
For the calculations below, it is convenient
to eliminate the $h_{i\to a}$ variables, to get
673: %
674: \begin{eqnarray}
675: %
676: u_{a\to i}= f\left( \sum_{b\in\dpj_1(a)}u_{b\to j_1}\!
677: -\!\!\!\sum_{b\in\dmj_1(a)}u_{b\to j_1};\;\; \dots\;\;;
678: \sum_{b\in\dpj_{k-1}(a)}u_{b\to j_{k-1}}\!
679: -\!\!\!\sum_{b\in\dmj_{k-1}(a)}u_{b\to j_{k-1}}\right)\, ,\label{eq:RecursionU}
680: %
681: \end{eqnarray}
682: %
683: where we denoted by $j_1,\dots, j_{k-1}$ the indices of variables involved in
684: clause $a$ (other than $i$).
685: A pictorial representation of this recursion is provided in
686: Fig.~\ref{fig:TwoGen}.
687:
Notice that the above recursions hold irrespective of whether
one considers the unconditional measure $\mu(\,\cdot\,)$ or
the conditional one $\mu(\, \cdot\,|\ux_r)$. What changes
between the two cases is the initial condition for the recursion,
i.e. the value of $h_{i\to a}$ associated with the variables
$i$ at the $r$-th generation. For the unconditional measure
(`free boundary condition'), the appropriate initialization is $h_{i\to a}=0$.
If one conditions on $\ux_r$, then $h_{i\to a}= +\infty$ or
$-\infty$ depending (respectively) on whether $x_i$ satisfies clause
$a$ or not.
698:
In the rest of the proof, we shall always consider the conditional
measure $\mu(\, \cdot\,|\ux_r)$. As a consequence, the log-likelihoods
are, implicitly, functions of $\ux_r$:
$u_{a\to i} = u_{a\to i}(\ux_r)$
(indeed of the restriction of $\ux_r$ to the subtree rooted at $i$
and only containing $a$).
705: We then let
706: %
707: \begin{eqnarray}
708: %
709: \ou_{a\to i} = \max_{\ux_r}\; u_{a\to i}(\ux_r)\, ,\;\;\;\;\;
710: \uu_{a\to i} = \min_{\ux_r}\; u_{a\to i}(\ux_r)\, .
711: %
712: \end{eqnarray}
713: %
In the case $\beta = \infty$, the maximum (minimum) is taken over
all boundary conditions $\ux_r$ such that the sub-formula rooted
at $i$ admits at least one solution under the condition $\ux_r$
(there is always at least one such boundary condition).
718: We further let $\Delta_{a\to i} = \ou_{a\to i}-\uu_{a\to i}\ge 0$.
719:
Consider a random tree distributed as $\T_*(r)$, conditioned on the root
having degree $1$, i.e. on the root variable being involved in a unique
clause, to
be denoted by $a$. Let $\Delta^{(r)} = \Delta_{a\to i}$ be the corresponding
log-likelihood interval. We will show that
725: $\E\tanh\Delta^{(r)}\le e^{-\gamma r}$ for some positive constant
726: $\gamma$. Before proving this claim, let
727: us show that it indeed implies the thesis.
728: Denoting by $\partial_+0$ the set of clauses in which the root is involved
729: as the direct literal, and by $\partial_-0$ the set in which it is
730: involved as negated, we have
731: %
732: \begin{eqnarray}
733: %
734: \l|\mu_{0|r}(\,\cdot\,|\ux_r)-\mu_{0|r}(\,\cdot\,|\uz_r)\r|_{\sTV} &=&
735: \frac{1}{2}\big|\tanh h_0(\ux_r)-\tanh h_0(\uz_r)\big| \, ,\\
736: h_0(\ux_r) &\equiv &\sum_{a\in\partial_+0} u_{a\to 0}(\ux_r)-
737: \sum_{a\in\partial_-0} u_{a\to 0}(\ux_r)\, .
738: %
739: \end{eqnarray}
740: %
741: Since $x\mapsto \tanh(x)$ is monotonically increasing in $x$, we have
742: %
743: \begin{eqnarray}
744: %
745: &&\l|\mu_{0|r}(\,\cdot\,|\ux_r)-\mu_{0|r}(\,\cdot\,|\uz_r)\r|_{\sTV} \le
746: \frac{1}{2}\big\{\tanh \oh_0-\tanh \uh_0\big\} \, ,\\
747: &&\oh_0 \equiv \sum_{a\in\partial_+0} \ou_{a\to 0}-
748: \sum_{a\in\partial_-0} \uu_{a\to 0}\, ,\;\;\;\;\;\;\;\;
749: \uh_0 \equiv \sum_{a\in\partial_+0} \uu_{a\to 0}-
750: \sum_{a\in\partial_-0} \ou_{a\to 0}\, .\label{eq:h0}
751: %
752: \end{eqnarray}
753: %
754: Using the elementary properties $\tanh x-\tanh y\le 2\tanh(x-y)$
755: for any $x\ge y$, and $\tanh(x+y)\le \tanh x+\tanh y$ for $x,y\ge 0$,
756: we get
757: \begin{eqnarray}
758: %
759: \l|\mu_{0|r}(\,\cdot\,|\ux_r)-\mu_{0|r}(\,\cdot\,|\uz_r)\r|_{\sTV} \le
760: \tanh\left\{\sum_{a\in\partial 0}\Delta_{a\to 0}\right\}
761: \le \sum_{a\in\partial 0} \tanh \Delta_{a\to 0}\, .
762: %
763: \end{eqnarray}
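To make the two elementary steps explicit (a short expansion of the argument above, using only Eq.~(\ref{eq:h0}) and the definition of $\Delta_{a\to 0}$), subtract the two definitions in Eq.~(\ref{eq:h0}):
%
\begin{eqnarray}
%
\oh_0-\uh_0 = \sum_{a\in\partial_+0}(\ou_{a\to 0}-\uu_{a\to 0})+
\sum_{a\in\partial_-0}(\ou_{a\to 0}-\uu_{a\to 0})
= \sum_{a\in\partial 0}\Delta_{a\to 0}\, .\nonumber
%
\end{eqnarray}
%
The first property then gives
$\frac{1}{2}\{\tanh\oh_0-\tanh\uh_0\}\le \tanh(\oh_0-\uh_0)$, and the second
property, applied repeatedly to the non-negative terms $\Delta_{a\to 0}$,
yields the sum of hyperbolic tangents.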
764: %
We can now take the maximum over boundary conditions and
766: the expectation with respect to the tree ensemble. Recalling
767: that $|\partial 0|$ is a Poisson random variable of mean $k\alpha$, we get
768: %
769: \begin{eqnarray}
770: %
771: \E\max_{\ux,\uz} \l|\mu_{0|r}(\,\cdot\,|\ux_r)-\mu_{0|r}
772: (\,\cdot\,|\uz_r)\r|_{\sTV} \le k\alpha\, \E
773: \tanh\Delta^{(r)} \, ,
774: %
775: \end{eqnarray}
776: %
777: which implies the thesis upon taking $A=k\alpha $.
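
The expectation step can be spelled out as follows. This is a sketch, assuming
(as in the construction of $\T_*(r)$) that, given $|\partial 0|$, the subtrees
hanging from the clauses $a\in\partial 0$ are i.i.d., and that each
$\Delta_{a\to 0}$ is distributed as $\Delta^{(r)}$:
%
\begin{eqnarray}
%
\E\sum_{a\in\partial 0}\tanh\Delta_{a\to 0} =
\E\big\{|\partial 0|\big\}\; \E\tanh\Delta^{(r)} =
k\alpha\, \E\tanh\Delta^{(r)}\, .\nonumber
%
\end{eqnarray}
%
Since the intervals $\Delta_{a\to 0}$ are defined through a maximum and a minimum
over boundary conditions, they do not depend on $\ux_r$, $\uz_r$, and the maximum
on the left hand side can be taken before the expectation.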
778:
779: We are now left with the task of proving $\E\tanh\Delta^{(r)}\le
780: e^{-\gamma r}$. It is easy to realize that $f(x_1,\dots, x_{k-1})$
781: is monotonically decreasing in each of its arguments.
782: Therefore Eq.~(\ref{eq:RecursionU}) yields the following recursion
783: for upper/lower bounds
784: %
785: \begin{eqnarray}
786: %
787: \ou_{a\to i}=f\left( \sum_{b\in\dpj_{1}(a)}\uu_{b\to j_1}\!
788: -\!\!\!\sum_{b\in\dmj_1(a)}\ou_{b\to j_1};\;\; \dots\;\;;
789: \sum_{b\in\dpj_{k-1}(a)}\uu_{b\to j_{k-1}}\!
790: -\!\!\!\sum_{b\in\dmj_{k-1}(a)}\ou_{b\to j_{k-1}}\right)\, ,
791: %
792: \end{eqnarray}
793: %
794: together with the equation obtained by interchanging $\uu_{\cdots}$
795: and $\ou_{\cdots}$. By taking the difference of these two equations, we
796: get
797: %
798: \begin{eqnarray}
799: %
800: \Delta_{a\to i} = f(\uh_{1};\;\dots\; ;\uh_{k-1})-
801: f(\oh_{1};\;\dots\; ;\oh_{k-1})\, ,
802: %
803: \end{eqnarray}
804: %
805: where we defined $\uh_{i} = \sum_{b\in\dpj_{i}(a)}\uu_{b\to j_i}\!
806: -\!\!\!\sum_{b\in\dmj_i(a)}\ou_{b\to j_i}$
807: and $\oh_{i} = \sum_{b\in\dpj_{i}(a)}\ou_{b\to j_i}\!
808: -\!\!\!\sum_{b\in\dmj_i(a)}\uu_{b\to j_i}$
(note that $\uh_i\le \oh_i$, since $\uu_{b\to j_i}\le \ou_{b\to j_i}$).
810:
Suppose now that $n$ out of the $k-1$ variables $x_{j_1},\dots, x_{j_{k-1}}$
are pure literals, say variables $x_{j_1},\dots,x_{j_n}$.
This means that $\dmj_1(a)= \dots= \dmj_n(a) = \emptyset$, and therefore,
since the log-likelihoods $u_{b\to j}$ are non-negative (because $f$ is
non-negative), $\uh_{1},\dots, \uh_n\ge 0$.
It is an easy exercise in analysis to show that, if $x_1,\dots,x_n\ge 0$,
817: %
818: \begin{eqnarray}
819: %
820: 0\le-\frac{\partial f}{\partial x_i}(x_1,\dots,x_{k-1})\le \frac{1}{2^n}\, .
821: %
822: \end{eqnarray}
823: %
824: Therefore, by the Mean Value Theorem
825: %
826: \begin{eqnarray}
827: %
828: \Delta_{a\to i} \le \frac{1}{2^n}\sum_{l=1}^{k-1} (\oh_l-\uh_l)
829: =\frac{1}{2^n}\sum_{l=1}^{k-1} \sum_{b\in\partial j_l}
830: \Delta_{b\to j_l}
\, .
832: %
833: \end{eqnarray}
834: %
835: Next we take the hyperbolic tangent of both sides, and
836: use again $\tanh(x+y)\le \tanh x+\tanh y$, for $x,y\ge 0$
837: to get
838: %
839: \begin{eqnarray}
840: %
841: \tanh\Delta_{a\to i} \le \frac{1}{2^n}\sum_{l=1}^{k-1} \sum_{b\in\partial j_l}
842: \tanh \Delta_{b\to j_l}
843: \, .
844: %
845: \end{eqnarray}
846: %
Finally we take the expectation of this inequality. To do so,
we recall that $n$ is just the number of pure literals
among $x_{j_1},\dots, x_{j_{k-1}}$. In our notation this can be written
as $n = \sum_{l=1}^{k-1}\ind(|\dmj_l(a)| =0)$. We further assume that
$i$ is the root of a tree from $\T_*(r+1)$, $r\ge 0$, and therefore
$\Delta_{a\to i}$ is distributed as $\Delta^{(r+1)}$. Furthermore,
the differences $\Delta_{b\to j_{l}}$ are distributed as
$\Delta^{(r)}$. We thus obtain
855: %
856: \begin{eqnarray}
857: %
858: \E\tanh\Delta^{(r+1)} &\le &\E\left\{
859: \prod_{l=1}^{k-1}\frac{1}{2^{\ind(|\dmj_l(a)| =0)}}
860: \sum_{l=1}^{k-1}
861: \sum_{b\in\partial j_l(a)}
862: \tanh \Delta^{(r)}\right\} =\\
863: &=& (k-1)\,\E\left\{\frac{1}{2^{\ind(|\dmj| =0)}}
864: |\partial j|\right\}\,
865: \left\{\E\,2^{-\ind(|\dmj| =0)}\right\}^{k-2}
866: \, \E\tanh\Delta^{(r)}
867: \, .
868: %
869: \end{eqnarray}
870: %
The expectations over $|\dpj|$, $|\dmj|$ are easily evaluated by recalling that
these are independent Poisson random variables of mean $k\alpha/2$.
873: One finally obtains $\E\tanh\Delta^{(r+1)} \le\kappa(\alpha)
874: \E\tanh\Delta^{(r)}$. The thesis follows (with $\gamma = -\log\kappa(\alpha)$)
875: by noticing that $\E\tanh \Delta^{(0)}\le 1$, and recalling that
876: $\kappa(\alpha)<1$ for $\alpha<\alpha_*(k)$.
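
For concreteness, the two Poisson expectations can be worked out as follows.
This is a sketch under the assumption that $|\partial j| = |\dpj|+|\dmj|$, with
$|\dpj|$, $|\dmj|$ independent Poisson variables of mean $k\alpha/2$; the
resulting factors should be compared with the definition of $\kappa(\alpha)$
in Eq.~(\ref{eq:ContractionRate}). Since $\prob\{|\dmj|=0\} = e^{-k\alpha/2}$,
%
\begin{eqnarray}
%
\E\,2^{-\ind(|\dmj| =0)} &=& 1-\frac{1}{2}\,e^{-k\alpha/2}\, ,\nonumber\\
\E\left\{2^{-\ind(|\dmj| =0)}\,|\partial j|\right\} &=&
\frac{k\alpha}{2}\Big(1-\frac{1}{2}\,e^{-k\alpha/2}\Big)+\frac{k\alpha}{2}
= k\alpha\Big(1-\frac{1}{4}\,e^{-k\alpha/2}\Big)\, .\nonumber
%
\end{eqnarray}
%
In the second line, the first term uses the independence of $|\dpj|$ and
$|\dmj|$, and the second uses the identity
$|\dmj|\, 2^{-\ind(|\dmj| =0)} = |\dmj|$.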
877: %
878: \endproof
879:
Next, we state a result on the error of the BP estimate of the expectation of
whether a clause is satisfied or not. To bound the error of the BP estimate of $\<E_a(\ux)\>$,
we need to study the error in the estimation of the joint distribution
of the $k$ variables in a clause. For this, we first choose a clause at
random and treat each of its $k$ variables as the root of one of $k$ independent rooted random
trees (of suitable depth $r$), as before. Note that this does not asymptotically
bias the distribution of the original random formula, as
this process tilts the original distribution by at most $O(1/N)$.
888:
To this end, let $\ux_r$ be an
assignment for the $r$-th generation variables. We shall denote by
891: $\<\,\cdot\,\>^{(r)}$ the expectation with respect to the graphical model
892: (\ref{eq:GraphicalModel}) associated to a formula constructed as follows.
893: First we generate a uniformly random clause over variables $x_1,\dots, x_k$.
894: Then we sample $k$ independent trees according to $\T_*(r)$ and root them
895: at $x_1,\dots, x_k$.
896: We let $\<\,\cdot\,\>_{\ux_r}^{(r)}$ be
897: the corresponding conditional expectation,
898: given the assignment to the $r$-th generation.
899: %
900: \begin{lemma}\label{lemma:EnergyTree}
901: %
902: Let $k\ge 2$, $\alpha<\alpha_*(k)$ and $\beta\in[0,\infty]$.
903: Then there exist two positive constants $A$, $\gamma$, such that
904: %
905: \begin{eqnarray}
906: %
907: \E\max_{\ux_r,\uz_r}
908: \left|\<E_a(\ux)\>_{\ux_r}^{(r)}-\<E_a(\ux)\>_{\uz_r}^{(r)}\right|\le
909: A\, e^{-\gamma r}\, .
910: %
911: \end{eqnarray}
912: %
913: \end{lemma}
914: %
915: \prooft
916: Denote by $\ux_{\da} = \{x_1,\dots ,x_{k}\}$ the zeroth generation
917: variables, by $T_1,\dots, T_k$ the tree factor graphs drawn from $\T_*(r)$ and rooted,
918: respectively, at variable nodes $1,\dots,k$.
919: We then denote by $\mu_i(x_i|\ux_r)$, $i\in\{1,\dots,k\}$ the conditional
920: distribution for variable $x_i$ with respect to the model associated with the
921: tree $T_i$. We also let
922: $h_i(\ux_r)$ be the associated log-likelihoods
923: (defined analogously to Eq.~(\ref{eq:DefLLR})), and
$\oh_i = \max_{\ux_r}h_i(\ux_r)$ ($\uh_i = \min_{\ux_r}h_i(\ux_r)$)
be their maximum (minimum) values with respect to the boundary condition.
926:
927: It is not hard to show that
928: $\<E_a(\ux)\>_{\ux_r}^{(r)} = g(h_1(\ux_r),\dots,h_k(\ux_r))$
929: where the function $g:\R^k\to\R$ is defined as follows
930: %
931: \begin{eqnarray}
932: %
933: g(x_1,\dots,x_k) \equiv \frac{e^{-\beta}\prod_{i=1}^k\frac{1}{2}(1-\tanh x_i)}
934: {1-(1-e^{-\beta})\prod_{i=1}^k\frac{1}{2}(1-\tanh x_i)}\, .
935: \label{eq:Gdef}
936: %
937: \end{eqnarray}
938: %
939: Since $g(x_1,\dots,x_k)$ is monotonically decreasing in each of its arguments,
940: we have
941: %
942: \begin{eqnarray}
943: %
944: \E\max_{\ux_r,\uz_r}
945: \left|\<E_a(\ux)\>_{\ux_r}^{(r)}-\<E_a(\ux)\>_{\uz_r}^{(r)}\right|
946: \le \E\left\{g(\uh_1,\dots,\uh_k)- g(\oh_1,\dots,\oh_k)\right\}\, ,
947: \label{eq:GDiff}
948: %
949: \end{eqnarray}
950: %
where the pairs $(\uh_1,\oh_1),\dots,(\uh_k,\oh_k)$ are i.i.d.
952: and distributed as $(\uh_0,\oh_0)$ in the proof of Lemma \ref{lemma:Key},
953: cf. Eq.~(\ref{eq:h0}). In particular, proceeding as in that proof,
954: we deduce that $\E\tanh(\oh_i-\uh_i)\le A\, e^{-\gamma r}$.
955: We are left with the task of proving that this implies an analogous bound
956: on the right hand side of Eq.~(\ref{eq:GDiff}).
957:
958: To this end, we first consider a single variable function $\gt:\R\to \R$
959: with $0\le \gt(x)\le 1$ and $-1\le\gt'(x)\le 0$. Then
960: %
961: \begin{eqnarray}
962: %
963: \E\{\gt(\uh_1)-\gt(\oh_1)\} & \le & \prob\{\oh_1-\uh_1\ge \Delta\}+
964: \E\{(\oh_1-\uh_1)\, \ind(\oh_1-\uh_1<\Delta)\}\le\\
965: &\le &\frac{1}{\tanh\Delta}\E\tanh(\oh_1-\uh_1)+\frac{\Delta}{\tanh\Delta}
966: \E\{\tanh(\oh_1-\uh_1)\, \ind(\oh_1-\uh_1<\Delta)\}\le \nonumber\\
967: &\le&\frac{1+\Delta}{\tanh\Delta}\E\tanh(\oh_1-\uh_1)\, .\nonumber
968: %
969: \end{eqnarray}
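%
The steps in this chain of inequalities can be justified as follows (a brief
expansion of the argument, valid for any threshold $\Delta>0$). Since
$0\le\gt\le 1$ and $|\gt'|\le 1$, one has
$0\le \gt(\uh_1)-\gt(\oh_1)\le\min(1,\oh_1-\uh_1)$, which gives the first line
upon distinguishing whether $\oh_1-\uh_1\ge\Delta$ or not. For the second line,
Markov's inequality applied to the increasing function $\tanh$ gives
%
\begin{eqnarray}
%
\prob\{\oh_1-\uh_1\ge \Delta\}\le
\frac{\E\tanh(\oh_1-\uh_1)}{\tanh\Delta}\, ,\nonumber
%
\end{eqnarray}
%
while $x\le \frac{\Delta}{\tanh\Delta}\tanh x$ for $0\le x<\Delta$, because
$x/\tanh x$ is increasing on $(0,\infty)$.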
970: %
The proof is completed by writing
$\E\{g(\uh_1,\dots,\uh_k)- g(\oh_1,\dots,\oh_k)\} =
\sum_{i=1}^k \E\{\gt_i(\uh_i)-\gt_i(\oh_i)\}$, where
$\gt_i(x) \equiv g(\oh_1,\dots, \oh_{i-1},x,\uh_{i+1},\dots,\uh_k)$, and
noticing that $-1\le\frac{\partial g}{\partial x_i}\le 0$ (the last statement
is proved in the appendix).
977: %
978: \endproof
979: %
980: \iffalse
981: Because of Lemma \ref{lemma:BoundBP}, the above two results, namely
982: Lemmas \ref{lemma:LocalTree} and \ref{lemma:EnergyTree}, imply that belief
983: propagation provides a good approximation of averages with
984: respect to the measure $\mu_{\beta,F}(\,\cdot\,)$.
985: For the sake of concreteness we limit ourselves to bounding the
986: expected error on the average cost $\<E(\ux)\>$. Any local marginal can
987: however be treated along the same lines.
988: %
989: \fi
990:
Finally, we combine the above observations to bound the overall
error of the BP estimate.
993: \begin{lemma}\label{lemma:EnergyEstimate}
994: %
995: Let $k\ge 2$, $\alpha<\alpha_*(k)$ and $\beta\in[0,\infty]$.
Then there exist two positive constants $C$ and $\delta<1$
997: such that for any $N$,
998: %
999: \begin{eqnarray}
1000: %
1001: \E\left|\<E(\ux)\>-\<E(\ux)\>_{\sBP}\right|\le CN^{\delta}\, .
1002: \label{eq:BoundTotalEnergy}
1003: %
1004: \end{eqnarray}
1005: %
1006: \end{lemma}
1007: %
1008: \prooft
1009: By linearity of expectation and using Lemma \ref{lemma:BoundBP},
1010: we get
1011: %
1012: \begin{eqnarray}
1013: %
1014: \E\left|\<E(\ux)\>-\<E(\ux)\>_{\sBP}\right| \le
1015: M\E\left|\<E_a(\ux)\>-\<E_a(\ux)\>_{\sBP}\right|\le
1016: M\E\left\{\max_{\ux,\uz}\left|\<E_a(\ux)\>^{(r)}_{\ux_r}-
1017: \<E_a(\ux)\>^{(r)}_{\uz_r}\right|\right\}\, .
1018: %
1019: \end{eqnarray}
1020: %
1021: We would like to apply Lemma \ref{lemma:EnergyTree}, but the expectation
1022: in the last expression is taken with respect to the formula $F$ drawn from the
1023: random $k$-SAT ensemble, instead of the tree model $\hT_*(r)$. However,
the quantity in curly brackets depends only on the radius-$r$ neighborhood
$\Ball_a(r)$ of vertex $a$ in $G_F$. Furthermore, it is non-negative and
upper bounded by 1. We can therefore apply Lemmas
\ref{lemma:LocalTree} and \ref{lemma:EnergyTree} to upper bound the last
expression by (here
1029: $\E_{\hT}$ denotes expectation with respect to the tree ensemble):
1030: %
1031: \begin{align}
1032: %
1033: M\l|\,\prob\{\Ball_i(r)\in\,\cdot\, \}-\prob\{S(r)\in\,\cdot\, \}\,
1034: \r|_{\sTV}+ M\E_{\hT}\left\{\max_{\ux,\uz}\left|\<E_a(\ux)\>^{(r)}_{\ux_r}-
1035: \<E_a(\ux)\>^{(r)}_{\uz_r}\right|\right\}&\le\\
1036: \le A\alpha\,& e^{\rho r}+NA'\alpha \, e^{-\gamma r}\nonumber
1037: %
1038: \end{align}
1039: %
1040: The proof is completed by setting $r = \frac{1}{\rho+\gamma}\log N$,
1041: which yields Eq.~(\ref{eq:BoundTotalEnergy}) with $\delta =
1042: \frac{\rho}{\gamma+\rho}$.
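As a check of this choice (simple arithmetic, with the constants as in the
display above and assuming $\rho,\gamma>0$), for $r = \frac{1}{\rho+\gamma}\log N$
the two terms are of the same order:
%
\begin{eqnarray}
%
e^{\rho r} = N^{\frac{\rho}{\rho+\gamma}}\, ,\;\;\;\;\;\;\;\;
N\,e^{-\gamma r} = N^{1-\frac{\gamma}{\rho+\gamma}}
= N^{\frac{\rho}{\rho+\gamma}}\, ,\nonumber
%
\end{eqnarray}
%
so that the right hand side is at most $(A+A')\,\alpha\, N^{\delta}$, and
$\delta<1$ because $\gamma>0$.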
1043: %
1044: \endproof
1045: %.
1046:
1047: \section{Proofs of Theorems}
1048: \label{sec3}
1049:
1050: \subsection{Proof of Theorem \ref{thm:KSAT}}
Clearly, the running time of the algorithm described in Section \ref{sec2} is
$O(N^4)$, as the total number of BP runs is $O(N^2)$ and each BP
run takes $O(N)$ iterations, i.e. $O(N^2)$ serial operations. We now
prove Eq.~(\ref{eq:Guarantee}).
1055:
Using the existing lower bounds on $\alpha_{\rm c}(k,N)$ (see \cite{AP04}
and references therein), it is not hard to show that
$\alpha_*(k)\le\alpha_{\rm c}(k,N)(1-\eta)$ for some $\eta>0$,
all $k\ge 2$ and $N$ large enough.
By definition, for $\alpha<\alpha_{\rm c}(k,N)(1-\eta)$ and
$\beta\in[0,\infty]$ there exists a constant $C(\alpha)>0$ such that
$\log Z(\beta,F)\ge C(\alpha)N\log 2$ whp. This follows from the following
two facts, for appropriate $C(\alpha)$: (1) at least $C(\alpha)N$ variables
do not appear in any clause whp, and (2) at least one satisfying
assignment exists whp, as $\alpha < \alpha_{\rm c}(k,N)(1-\eta)$. Thus, there are
at least $2^{C(\alpha) N}$ satisfying assignments, whence
$\Z(\beta, F) \geq 2^{C(\alpha) N}$.
Given this, it is sufficient to show that
$\left|\log Z(\beta,F)-\Phi(\beta,F)\right|\le N\ve$ whp
for any $\ve>0$ and $N$ large enough.
1071:
1072: Now, Eqs.~(\ref{eq:Telescopic}) and (\ref{eq:AlgorithmOutput}) imply that
1073: %
1074: \begin{eqnarray}
1075: %
1076: \left|\log Z(\beta,F)-\Phi(\beta,F)\right| &\le &
1077: \sum_{i=0}^{n-1}\left|\log \<e^{-\Delta E(\ux)}\>_i+
1078: \Delta\, \<E(\ux)\>_{\sBP,i}\right|\nonumber\\
1079: &\le &
1080: \sum_{i=0}^{n-1}\left|\log \<e^{-\Delta E(\ux)}\>_i+
1081: \Delta\, \<E(\ux)\>_{i}\right|+\sum_{i=0}^{n-1}
1082: \Delta\left|\<E(\ux)\>_{i}-\<E(\ux)\>_{\sBP,i}\right|\, .
1083: \label{eq:BoundDiff}
1084: %
1085: \end{eqnarray}
1086: %
1087: Consider the first term in (\ref{eq:BoundDiff}): for any non-negative
1088: random variable $X$,
1089: $ \log \< e^{-X} \> \leq \< e^{-X} \> - 1 \leq$ $\<1 - X + X^2 \> - 1 \leq - \< X \> + \<X^2 \>$.
1090: As a consequence, we obtain
1091: %
1092: \begin{eqnarray}
1093: %
1094: \sum_{i=0}^{n-1}\left|\log \<e^{-\Delta E(\ux)}\>_i+
1095: \Delta\, \<E(\ux)\>_{i}\right| ~\leq~\sum_{i=0}^{n-1}\Delta^2\<E(\ux)^2\>_i\le \beta\Delta\sup_i\<E(\ux)^2\>_i\le
1096: N^{2\delta'}\alpha^2\, ,
1097: %
1098: \end{eqnarray}
1099: %
1100: where we used $\beta\le N^{\delta'}$, $\Delta = \beta/N^2\le N^{\delta'-2}$
1101: and $0\le E(\ux)\le N\alpha$. If we choose $\delta'<1/2$, this
1102: contribution is smaller than $N\ve/2$ for all $N$ large enough.
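The arithmetic behind the last bound is the following (a sketch, assuming that
the interpolation uses $n=N^2$ steps of size $\Delta=\beta/N^2$, so that
$n\Delta=\beta$):
%
\begin{eqnarray}
%
\sum_{i=0}^{n-1}\Delta^2\<E(\ux)^2\>_i \le n\Delta^2\, (N\alpha)^2
= \beta\Delta\, N^2\alpha^2 = \beta^2\alpha^2\le N^{2\delta'}\alpha^2\, ,\nonumber
%
\end{eqnarray}
%
which is $o(N)$ precisely when $\delta'<1/2$.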
1103:
Now consider the second term in Eq.~(\ref{eq:BoundDiff}): the bound
(\ref{eq:BoundTotalEnergy}) holds for any $\beta$ in the compact region
$[0,\infty]$. Further, the left hand side is uniformly bounded (in terms of $N$) and
continuous in $\beta$. Hence, there exists a $C$ such that the bound (\ref{eq:BoundTotalEnergy})
holds uniformly for $\beta\in[0,\infty]$. This implies that
1109: %
1110: \begin{eqnarray}
1111: %
1112: \sum_{i=0}^{n-1}
1113: \Delta\,\E\left|\<E(\ux)\>_{i}-\<E(\ux)\>_{\sBP,i}\right|
1114: &\le & \beta C N^{\delta} \le C N^{\delta+\delta'}
1115: %
1116: \end{eqnarray}
1117: %
Choosing $\delta'\in (0,1-\delta)$ and applying Markov's inequality, we conclude
that the second term is also bounded above by $N\ve/2$ whp. This
completes the proof of Theorem \ref{thm:KSAT}.
1121: %
1122: \endproof
1123: %
1124: %************************************************************************
1125: %
1126: \subsection{Proofs of Theorems \ref{thm:UniquenessTrees}, \ref{thm:LimitPartFun},
1127: and \ref{thm:AlmostSAT}}
1128: \label{sec:Sketch}
1129:
Due to space limitations, these proofs are deferred to Appendix \ref{ap1}.
1131:
1132: \section{Discussion and Future Work}
1133: \label{sec4}
1134:
We presented a novel deterministic algorithm for approximately counting
good truth assignments
of random $k$-SAT formulas with high probability. The algorithm builds upon the well-known
Belief Propagation heuristic and an interpolation method for the log-partition function. In
the process of establishing the correctness of the algorithm, we obtained the threshold
for uniqueness of the Gibbs distribution for random $k$-SAT formulas as $2k^{-1}\log k (1+o_k(1))$.
This result is of interest in its own right.

We believe that our result can be extended to a reasonable class of non-random $k$-SAT
formulas. We also believe that the approximation guarantees of Theorem \ref{thm:KSAT}
1145: should hold for any $\beta \in [0,\infty]$.
1146:
1147: %
1148: %************************************************************************
1149: %
1150: \begin{thebibliography}{99}
1151:
1152: \bibitem{Georgii} H.~O.~Georgii,
1153: ``Gibbs Measures and Phase Transitions''.
1154: Berlin, Walter de Gruyter and Co., 1988
1155:
1156: \bibitem{Monasson} R.~Monasson, R.~Zecchina, S.~Kirkpatrick, B.~Selman,
1157: and L.~Troyansky, ``Determining computational complexity
1158: from characteristic `phase transitions' '',
Nature 400 (1999), 133--137
1160:
1161: \bibitem{Mezard} M.~M\'ezard, G.~Parisi, and R.~Zecchina,
1162: ``Analytic and Algorithmic Solution of Random Satisfiability Problems'',
1163: Science 297 (2002), 812--815
1164:
1165: \bibitem{ANPNature} D.~Achlioptas, A.~Naor and Y.~Peres,
1166: ``Rigorous location of phase transitions in hard optimization problems'',
1167: Nature 435 (2005), 759--764
1168:
1169: \bibitem{T01} M.~Talagrand,
``The high temperature case of the K-sat problem'',
1171: Probability Theory and Related Fields 119, 2001, 187-212.
1172:
1173: \bibitem{AP04} D. Achlioptas and Y. Peres,
1174: ``The threshold for random $k$-SAT is $2k \log 2 - O(k)$",
1175: Journal of the AMS, 17 (2004), 947-973.
1176:
1177: \bibitem{MonassonZecchina} R.~Monasson and R.~Zecchina,
1178: ``Entropy of the $K$-Satisfiability Problem'',
Phys. Rev. Lett. 76 (1996), 3881--3885
1180: %
1181: %\bibitem{ANP05} D. Achlioptas, A. Naor and Y. Peres, ``The Fraction of Satisfiable
1182: %Clauses in a Random Formula", Preliminary version in Proceedings of IEEE FOCS 2003. Long
1183: %version to appear in the Journal of ACM.
1184:
1185: \bibitem{FL03} S.~Franz and M.~Leone,
1186: ``Replica bounds for optimization problems and diluted spin systems",
1187: Journal of Statistical Physics, 111 (2003), 535.
1188:
1189: \bibitem{FLT03} S.~Franz, M.~Leone and F.~L.~Toninelli,
1190: ``Replica bounds for diluted non-Poissonian spin systems",
1191: Journal of Physics, A 36 (2003) 10967 .
1192:
1193: \bibitem{JS93} M.~Jerrum and A.~Sinclair,
1194: ``Polynomial-time Approximation Algorithms for the Ising Model",
1195: SIAM Journal on Computing 22 (1993), pp. 1087-1116.
1196:
1197: \bibitem{W06} D.~Weitz,
1198: ``Counting independent sets up to the tree threshold'',
1199: In Proceedings of STOC, 2006.
1200:
1201: \bibitem{BG06} A.~Bandyopadhyay and D.~Gamarnik, ``Counting without sampling. New algorithms for
1202: enumeration problems using statistical physics'', In Proceedings of SODA, 2006.
1203:
1204: \bibitem{Future} A.~Montanari and D.~Shah,
1205: ``$k$-SAT: Counting Satisfying Assignment and Threshold for Correlation Decay",
1206: Longer version, in preparation.
1207:
1208: \bibitem{Friedgut} E.~Friedgut,
``Sharp Thresholds of Graph Properties, and the $k$-sat Problem'',
1210: Journal of American Mathematical Society, 12 (1999), no. 4, 1017--1054.
1211:
1212: \bibitem{Pearl} J.~Pearl,
1213: ``Probabilistic Reasoning in Intelligent Systems: Networks
1214: of Plausible Inference'',
1215: San Francisco, CA: Morgan Kaufmann, 1988.
1216:
1217: \bibitem{WJ} M.~Wainwright and M.~Jordan,
1218: ``Graphical models, exponential families,
1219: and variational inference,'' \emph{Tech.\ Report}, Dept. of
1220: Stat.,University of Cal., Berkeley, 2003.
1221:
1222: \bibitem{TJ99} S.~Tatikonda and M.~Jordan, ``Loopy Belief
1223: Propagation and Gibbs Measure,'' Berkeley Working Paper, 2002.
1224:
1225: \end{thebibliography}
1226:
1227: %
1228: %************************************************************************
1229: %
1230: \appendix
1231: \section{Proof Sketches: Theorems \ref{thm:UniquenessTrees}, \ref{thm:LimitPartFun} and
1232: \ref{thm:AlmostSAT}}
1233: \label{ap1}
1234:
Due to space limitations, we only provide proof sketches for
Theorems \ref{thm:UniquenessTrees}, \ref{thm:LimitPartFun} and
\ref{thm:AlmostSAT}.
1238:
1239: \vspace{.1in}
1240: \noindent{\bf Proof sketch for Theorem \ref{thm:UniquenessTrees}.}
1241: By using the definition
1242: $\kappa(\alpha_*) = 1$ (with $\kappa(\alpha)$ being defined as in
1243: Eq.~(\ref{eq:ContractionRate})), it is easy to show that
$\alpha_*(k) = 2k^{-1}\log k\{1+O(\log\log k/\log k)\}$. To complete
the proof, we need an (asymptotically in $k$) matching upper bound.
In order to obtain such an upper bound, we consider the case $\beta =\infty$, i.e. only
satisfying assignments have positive weight. Consider a tree formula
distributed as $\T_*(r)$. Let $P_r$ be the probability
that there exist two boundary conditions $\ux^{(0)}_r$, $\ux^{(1)}_r$
such that the root takes values, respectively, $0$ or $1$ in all the
satisfying assignments with the respective boundary conditions.
Clearly, for the Gibbs measure to be unique (or have correlation decay)
in the sense of Definition \ref{def:Uniqueness}
(but also in the weaker sense corresponding to the threshold
$\alpha'_{\rm u}(k)$), it must be that $P_r\to 0$
as $r\to\infty$. Hence, if we establish that for
$ \alpha > 2k^{-1}\log k\{1+O(\log\log k/\log k)\}$
such boundary conditions exist with positive probability, then the
proof will be complete. We do this next.
1260:
1261: For this, consider a tree from $\T_*(r)$ with the root having degree
1262: $1$. Given
1263: such a tree, let $\rho_r$ be the probability that there
1264: exists a boundary condition $\ux_r$, such that the root variable is
1265: the only variable that satisfies the only clause in which it belongs (recall
1266: that the root variable has degree $1$) for all possible satisfying assignments
1267: with the given boundary condition. If $P_r\to 0$, then $\rho_r\to 0$.
To prove this claim, assume by contradiction that $\rho_r$ remains
bounded away from zero (say $\rho_r\ge \underline{\rho}>0$) and consider
a tree from $\T_*(r)$ (without conditioning). With finite probability
the root belongs to two clauses in which it appears, respectively,
directed and negated. With probability at least $\underline{\rho}^2>0$, for
each of the corresponding subtrees there exists a boundary condition
that fixes the root variable to be (respectively) directed or negated.
By extending these boundary conditions arbitrarily to the full tree,
1276: we obtain the desired $\ux_r^{(1)}$, $\ux_r^{(0)}$.
1277:
1278: It turns out that $\rho_r$ can be determined recursively. Set $\rho_0=1$
1279: and $\rho_{r+1} = \{1-\exp(-k\alpha\rho_r/2)\}^{k-1}$. Recursively, $\rho_r \to 0$
as $r\to\infty$ only if $\alpha < \alpha^*(k)$, where $\alpha^*(k)$ for the
above recursion (with a little algebra) evaluates to
1282: $\alpha^*(k) = 2k^{-1}\log k\{1+O(\log\log k/\log k)\}$. This completes
1283: the proof sketch of Theorem \ref{thm:UniquenessTrees}.
1284:
1285: \vspace{.1in}
1286: \noindent{\bf Proof sketch for Theorem \ref{thm:LimitPartFun}.} First notice
1287: that, if $F$ and $F'$ differ in a single clause, then $|\log Z(\beta,F)-\log Z(\beta,F')|\le 2\beta$.
1288: Hence, by application of Azuma-Hoeffding's inequality, it follows that
1289: $|\log Z-\E\log Z|\le N\delta$ with probability at least $1-e^{-NC_\beta \delta^2}$,
1290: for some $C_\beta >0$ for any $\beta \in [0,\infty)$. Given this, to obtain the almost
1291: sure convergence as in (\ref{eq:AlmostSure}), it is sufficient to prove that
1292: $\lim_{N\to\infty}N^{-1}\E\,\Phi(\beta,F) = \phi(\beta)$,
1293: in light of Theorem \ref{thm:KSAT} and Borel-Cantelli's Lemma.
1294:
1295: To do so, first we need to establish that
1296: %
1297: \begin{eqnarray}
1298: %
1299: \lim_{N\to\infty}\frac{1}{N}\E\<E(\ux)\>_{\sBP,\beta} = \alpha
1300: \E g(h_1,\dots,h_k) \, ,\label{eq:LimitEnergy}
1301: %
1302: \end{eqnarray}
1303: %
where $g$ is defined as in Eq.~(\ref{eq:Gdef}), and the random variables $h_1,\dots,h_k$
are i.i.d. with distribution $\nu^*$, the fixed point of the operator $S$ defined in
the statement of Theorem \ref{thm:LimitPartFun}. We claimed that the fixed point
1307: is unique for $S$. To justify this claim, first note that the
1308: image of $S$ is contained in the space of distributions supported on
1309: $[0,\beta/2]$, call it ${\cal D}_{\beta}$, which is a compact space with respect to
1310: the weak topology. Being continuous on ${\cal D}_{\beta}$, $S$ admits at least one
1311: fixed point in it. Moreover,
1312: the contraction condition implied by the correlation decay (proved as a part of
1313: Theorem \ref{thm:UniquenessTrees}) implies the attractiveness as well as
1314: the uniqueness of the fixed point of $S$.
1315:
Once we establish the existence of the unique fixed point, Eq.~(\ref{eq:LimitEnergy})
follows from Lemma \ref{lemma:LocalTree} and the correlation decay established in
1318: Theorem \ref{thm:UniquenessTrees}. Now, by integrating Eq.~(\ref{eq:LimitEnergy})
1319: over $\beta$ and observing that $\beta_{i+1} -\beta_i = \beta/N^2$ (hence integration
1320: error is negligible at scale $1/N$) one gets
1321: %
1322: \begin{eqnarray}
1323: \lim_{N\to\infty}N^{-1}\E\,\Phi(\beta,F) = \log 2-\alpha\int_{0}^{\beta}
1324: \E_{\beta'} g(h_1,\dots,h_k)\, {\rm d}\beta'\, ,
1325: \end{eqnarray}
1326: %
1327: where a subscript has been added in $\E_{\beta'}$ to stress that the fixed
1328: point distribution has to be taken at inverse temperature $\beta'$.
1329: The proof of Theorem \ref{thm:LimitPartFun} is completed
1330: by showing that the integral on the
1331: right hand side of the last equation is given by $\phi(\beta)$
as in Eq.~(\ref{eq:phi}). In fact, by taking the derivative of this expression
with respect to $\beta$, one gets a contribution coming from the explicit $\beta$
dependence, which evaluates to $-\alpha\E g(h_1,\dots,h_k)$, and one from
the $\beta$ dependence of the fixed point distribution, which can be shown
to vanish.
1337:
1338: \vspace{.1in}
1339: \noindent{\bf Proof Sketch of Theorem \ref{thm:AlmostSAT}. }
1340: For the ease of notation, let $Z(\beta) \equiv \Z(\beta,F)$, $\Xi(\zeta) \equiv\Xi(\zeta,F)$ and
1341: $U(\beta)\equiv\<E(\ux)\>_{\beta,F}$. Because of Theorem \ref{thm:KSAT}, it is sufficient to prove
1342: that $|\log \Xi(N\epsilon)-\log Z(\beta)|\le N\epsilon^a$ whp.
1343: This follows from two inequalities.
1344:
1345: First inequality. For any $\zeta \geq 0$,
1346: \begin{eqnarray}
1347: Z(\beta) & = & \sum_{\ux: E(\ux) \geq \zeta} e^{-\beta E(\ux)} + \sum_{\ux: E(\ux) < \zeta} e^{-\beta E(\ux)}
1348: \geq e^{-\beta \zeta} \Xi(\zeta). \label{eq:zz2}
1349: \end{eqnarray}
1350: Second inequality. For any $\zeta \geq 0$ and using the first equality in (\ref{eq:zz2}), we
1351: obtain
1352: \begin{eqnarray}
1353: Z(\beta) & \leq & \sum_{\ux: E(\ux) \geq \zeta} e^{-\beta E(\ux)} + \Xi(\zeta). \nonumber
1354: \end{eqnarray}
1355: Equivalently,
1356: $Z(\beta) \mu(E(\ux) < \zeta) \leq \Xi(\zeta)$.
Now, taking $\zeta = 2U(\beta)$ and using Markov's inequality, we get
1358: \begin{eqnarray}
1359: \mu\{E(\ux) < 2 U(\beta) \} & \geq & 1 - \frac{U(\beta)}{2U(\beta)} ~=~ \frac{1}{2}. \label{eq:zz4}
1360: \end{eqnarray}
From (\ref{eq:zz2}) and (\ref{eq:zz4}), we obtain
1362: \begin{eqnarray}
1363: \log Z(\beta) - \log 2 & \leq & \log \Xi(2U(\beta)) ~\leq~ \log Z(\beta) + 2\beta U(\beta). \label{eq:zz5}
1364: \end{eqnarray}
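For completeness, Eq.~(\ref{eq:zz5}) follows from the two inequalities above
evaluated at $\zeta=2U(\beta)$ (a short expansion of the step):
\begin{eqnarray}
\Xi(2U(\beta)) & \geq & Z(\beta)\,\mu\{E(\ux) < 2U(\beta)\} ~\geq~ \frac{1}{2}\, Z(\beta)\, , \nonumber\\
\Xi(2U(\beta)) & \leq & e^{2\beta U(\beta)}\, Z(\beta)\, . \nonumber
\end{eqnarray}
Taking logarithms of the two lines gives the lower and the upper bound, respectively.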
1365: %
1366:
The next step consists in controlling $U(\beta)$ at large $\beta$.
Arguing analogously to the proof of Theorem \ref{thm:KSAT},
one can show that there exist constants $C_1$, $C_2$, $C_3$, $b>0$ such
that, for any $\beta\in [0,\infty]$,
$NC_1 e^{-2\beta}\le U(\beta)\le N C_2e^{-b\beta}+C_3N^{\delta}$
whp.
1373:
1374: Fix $\beta_1$ in such a way that $2C_1e^{-2\beta_1} = \ve$.
1375: Then $2U(\beta_1)\ge N\ve$ whp. By the upper bound in Eq.~(\ref{eq:zz5})
and the monotonicity of $\Xi(\zeta)$, we get
1377: %
1378: \begin{eqnarray}
1379: %
1380: \log\Xi(N\ve)\le \log Z(\beta_1)+2\beta_1 U(\beta_1)\le
1381: \log Z(\beta_1)+2\beta_1 N C_2e^{-b\beta_1}+2\beta_1C_3N^{\delta}\, .
1382: %
1383: \end{eqnarray}
1384: %
1385: Using the definition of $\beta_1$, which gives
1386: $\beta_1 = \frac{1}{2}\log\frac{2C_1}{\ve}$, we get that there exists
1387: $C, a>0$ such that
1388: %
1389: \begin{eqnarray}
1390: %
1391: \log\Xi(N\ve)\le \log Z(\beta_1)+NC\ve^{a}\, .
1392: %
1393: \end{eqnarray}
1394: %
1395: with high probability.
1396:
1397: The lower bound on $\log\Xi(N\ve)$ is proved analogously by taking
$\beta_2$ such that $2C_2e^{-b\beta_2}+2C_3N^{-1+\delta}=\ve$,
1399: thus getting $\log\Xi(N\ve)\ge \log Z(\beta_2)-NC\ve^{a}$ whp.
1400: One concludes by bounding the difference of the two partition
1401: functions:
1402: $|\log Z(\beta_2)-\log Z(\beta_1)|\le U(\beta_2)|\beta_1-\beta_2|\le
1403: NC\ve^{a}$ whp.
1404: \endproof
1405: \end{document}
1406: