0007:cs0007029/main.tex

1: \documentclass[final]{siamltex}

2: \usepackage{times,algorithm,algorithmic,comment,psfig,latexsym}

3:

4:

5:

6:

7:

8: \def\AND{\wedge}

9: \def\OR{\vee}

10: \def\oper{\circ}

11: \def\goesto{\rightarrow}

12: \def\implies{\Rightarrow}

13: \def\zeroone{\{0,1\}}

14:  \def\sstar{\zeroone^{*}}

15: \def\L{\langle}

16: \def\R{\rangle}

17: \def\HYP{\hbox{-}}

18: \def\IFF{\leftrightarrow}

19: \def\Ldef{\buildrel \rm def \over \leftrightarrow}

20: \def\Edef{\buildrel \rm def \over =}

21: \def\almostall{\hbox{\rlap{$_{\thinspace\forall}$}{$^{^\infty}$}}}

22: \def\infoften{\hbox{\rlap{$_{\thinspace\exists}$}{$^{^\infty}$}}}

23: \def\N{{\bf N}}

24: \def\cminus{\dot{-}}

25: \def\plusminus{\pm}

26: \def\PR{{\rm Pr}}

27: \def\HSAT{{\rm HORN}\hbox{-}{\rm SAT}}

28: \def\PUR{{\rm PUR}}

29: \def\beginproof{\noindent{\bf Proof.}\quad}

30: \def\endproof{}

31:

32:

33: \makeatletter

34: \newcommand{\singlespacing}{\let\CS=

35:         \@currsize\renewcommand{\baselinestretch}{1}\tiny\CS}

36: \newcommand{\singlespacingplus}{\let\CS=

37:         \@currsize\renewcommand{\baselinestretch}{1.15}\tiny\CS}

38: \newcommand{\doublespacing}{\let\CS=

39:         \@currsize\renewcommand{\baselinestretch}{1.75}\tiny\CS}

40: \newcommand{\draftspacing}{\let\CS=

41:         \@currsize\renewcommand{\baselinestretch}{2.0}\tiny\CS}

42: \newcommand{\normalspacing}{\singlespacing}

43: \makeatother

44:

45:

46:

47: %%%%%%%%%%desc

48: \def\desclabel#1{\bf #1\hfil}

49: \def\desc{\list{}{%

50: \labelwidth=\leftmargin

51: \advance \labelwidth by -\labelsep

52: \let \makelabel=\desclabel}}

53: \let\enddesc=\endlist

54:

55:

56:

57: %\newtheorem{lemma}{Lemma}[section]

58: %\newtheorem{theorem}[lemma]{Theorem}

59: \newtheorem{corrolary}{Corollary}

60: %\newtheorem{proposition}[lemma]{Proposition}

61: %\newtheorem{fact}[lemma]{Fact}

62: \newtheorem{example}{Example}

63: \newtheorem{observation}{Observation}

64: \newtheorem{claim}{Claim}

65: %\newtheorem{definition}{Definition}

66: %\newtheorem{obs}[lemma]{Observation}

67:

68: \def\qed{\hfill$\Box$\newline\vspace{5mm}}

69: \newenvironment{PROOF}{\noindent{\bf Proof:}}{{\qed}}

70:

71:

72:

73: \author{Gabriel Istrate\thanks{

74:         Center for Nonlinear Science and CIC-3 Division,

75:         Los Alamos National Laboratory,

76:         Los Alamos, NM 87545,

77:         gistrate@cnls.lanl.gov}}

78:

79: \title{Dimension-dependent behavior in the satisfiability of

80: random $k$-Horn formulae}

81: \date{}

82: \pagestyle{empty}

83:

84:

85:

86: \sloppy

87:

88: \begin{document}

89:

90: \bibliographystyle{plain}

91:

92:

93: \maketitle

94: \begin{abstract} We determine the asymptotic satisfiability

95: probability of a random at-most-$k$-Horn formula, via a probabilistic

96: analysis of a simple version, called \PUR, of positive unit resolution.

97: We show that for $k=k(n)\goesto \infty$ the problem

98: can be ``reduced'' to the case $k(n)=n$, that was solved in

99: \cite{istrate:cs.DS/9912001}. On

100: the other hand, in the case $k=$ constant the behavior of \PUR\ is

101: modeled by a simple queuing chain, leading to a closed-form

102: solution when $k=2$. Our analysis predicts an ``easy-hard-easy''

103: pattern in this latter case.

104: Under a rescaled parameter, the graphs of satisfaction probability

105: corresponding to finite values of $k$ converge to the one for the

106: uniform case, a ``dimension-dependent behavior'' similar to the one found

107: experimentally in \cite{kirkpatrick:selman:scaling} for $k$-SAT.

108: The phenomenon is qualitatively explained by a threshold property for

109: the number of

110: iterations that \PUR\ makes on random {\em satisfiable} Horn

111: formulas. Also, for $k=2$ \PUR\ has a peak in its average complexity at

112: the critical point.

113: \end{abstract}

114:

115:

116: \begin{keywords}

117: random Horn satisfiability, critical behavior, probabilistic analysis.

118: \end{keywords}

119:

120: \begin{AMS}

121: 68Q25,82B27

122: \end{AMS}

123:

124: \pagestyle{myheadings}

125: \thispagestyle{plain}

126: \markboth{G. ISTRATE}{DIMENSION DEPENDENT BEHAVIOR OF RANDOM HORN

127: SATISFIABILITY}

128:

129:

130: \section{Introduction}

131:

132: Finding the ground state (state of minimum energy) of a physical

133: system and computing an optimal solution to a combinatorial

134: optimization

135: problem

136: are intuitively two very similar tasks. This simple observation, that

137: motivated the development of  {\em simulated annealing}

138: \cite{simmulated:annealing}, a simple general-purpose heuristic for combinatorial

139: optimization, lies behind the

140: recent birth of a new field at the crossroads of Statistical

141: Mechanics, Theoretical Computer Science and Artificial Intelligence,

142: that studies {\em phase transitions in combinatorial problems} (see

143: \cite{hayes:cant:get:sat} for a readable introduction). The transfer of

144: principles and

145: methods from Physics (mainly from Spin Glass Theory

146: \cite{virasoro-parisi-mezard}) to

147: Computer Science has already been quite successful, and is responsible

148: for a couple of interesting results, such as a better understanding of

149: the factors that account for computational intractability

150: \cite{2+p:rsa, 2+p:nature},

151: strikingly accurate predictions of the average running time of various

152: algorithms \cite{scaling:search:cost:2,scaling:search:cost}, or of

153: expected values of optimal solutions

154: \cite{mezard:parisi:matching}.

155:

156: The need for a rigorous validation of these insights is quite

157: obvious. The theory of spin glasses is a relatively young field, which

158: still presents many heuristic, unsolved or plain controversial aspects

159: (for example see

160: \cite{non:mean:field:1,non:mean:field:2,non:mean:field:3} for a debate

161: on the validity and scope of the so-called Parisi solution of the

162: Sherrington--Kirkpatrick model). Moreover, while physical intuition can

163: guide the development of the theory for ``physical'' models, by corroborating (or

164: falsifying)

165: some of its predictions (e.g. see \cite{virasoro-parisi-mezard},

166: for a discussion of the demise, on physical grounds,

167: of the first formulation of the so-called {\em replica method}), such

168: intuition is not available when applying this type of ideas to

169: combinatorial problems. Given that rigorous results are hard to come

170: by in the case of spin glasses proper, it is not surprising that while there has

171: been recently some progress (see e.g.

172: \cite{talagrand:verres}), an analysis of most interesting

173: combinatorial problems is still out of reach.

174:

175: An approach that was popular in Statistical Mechanics was to gather

176: intuition through the systematic study of {\em exactly solved models}

177: \cite{baxter:rigorous}. These are ``toy'' versions of the original models that

178: are simple to deal with, but retain many of the properties of the

179: former ones. We advocate such an approach for problems in

180: Computer Science as well, and the purpose of this paper is to present

181: a (hopefully

182: nontrivial) ``exactly solvable satisfiability model'' that displays a

183: {\em dimension-dependent behavior} fairly similar to the one observed

184: previously in various

185: contexts such as percolation \cite{hara:slade:critical}, self-avoiding

186: walks, and recently for $k$-satisfiability by Kirkpatrick and Selman

187: \cite{kirkpatrick:selman:scaling}. The problem

188: we investigate is {\em random Horn satisfiability}, and the

189: ``dimensionality'' of a formula is taken to be the

190: {\em maximum length} of its clauses.\footnote{For technical

191: convenience, throughout

192: the paper {\em random $k$-Horn satisfiability} is understood as

193: {\em random {\bf at-most-$k$}-Horn satisfiability}.}

194:

195: \section{Overview}

196: There are actually two different notions of phase transition

197: in a combinatorial problem. The first of them, called

198: {\em order-disorder phase transition}, applies to optimization

199: problems and directly parallels the approach from Statistical

200: Mechanics.

201: Potential solutions for an instance of an optimization problem $P$ are viewed as ``states'' of

202: a system. One defines an abstract {\em Hamiltonian (energy) function},

203: that measures the ``quality'' of a given solution, and applies methods

204: from the theory of spin glasses \cite{virasoro-parisi-mezard} to make

205: predictions on the typical

206: structure of optimal solutions. In this setting a

207: phase transition is defined as non-analytical behavior of a certain

208: ``order parameter'' called free energy,

209: and a discontinuity in this parameter, manifested by the sudden

210: emergence of a {\em backbone} of constrained ``degrees of freedom''

211: \cite{2+p:rsa} is responsible for the exponential slow-down of many

212: natural algorithms.

213:

214: The second definition is combinatorial and pertains to decision

215: problems. It relies on the concept of {\em threshold property} from

216: random graph theory, more precisely a restricted version of this

217: notion, called {\em sharp threshold}.

218: A satisfiability threshold always exists for monotone problems

219: \cite{bollob-thomasson}, but may or may

220: not be sharp (we speak of a {\em coarse threshold} in the latter

221: case).

222:

223: The layout of the paper is as follows: in section~\ref{section:1} we

224: review

225: the results of Kirkpatrick and Selman, in particular discussing the

226: concept of {\em critical behavior}, as well as some objectionable aspects

227: of their results.

228:   We then define the type of dimension

229: dependent behavior we are interested in, argue that it captures to a

230: large

231: extent the results presented in \cite{kirkpatrick:selman:scaling}, and

232: contrast it with

233: critical behavior.

234: Our results are presented and discussed in section~\ref{section:3},

235: while in

236: section~\ref{section:4} we further discuss their significance.

237:

238:

239: Finally, for $k=2$, the case where the satisfaction probability has a

240: singularity, we are able to rigorously display another phenomenon that

241: is

242: believed to be characteristic of phase transitions: in many cases the

243: ``hardest on the average'' instances appear at the transition point

244: (even if we only

245: consider satisfiable instances \cite{achlioptas-sat-instances,mammen-hogg}); this feature is

246: quite robust with respect to the choice of the particular algorithm

247: \cite{cheeseman-kanefsky-taylor}.

248: We are able to prove that for a {\em particular problem}, random

249: at-most-2-Horn satisfiability,  the average

250: running time of a {\em particular algorithm}, when restricted to

251: satisfiable

252: instances (the ones that are statistically significant on both sides of

253: the critical point) is finite outside the critical point, and it

254: diverges as

255: we approach this point, thus providing some evidence for the

256: experimental

257: wisdom.

258:

259:

260: \section{Phase transitions and critical behavior}\label{section:1}

261:

262: We first discuss, briefly and limited to our interests, threshold

263: phenomena. Perhaps the best way to introduce them is through a concrete

264: example. To do this, we will use one ``canonical'' NP-complete

265: problem, {\em $k$-CNF satisfiability}.

266:

267: To generate random formulas we use a

268: model with one parameter, {\em the constraint

269: density $c$}, defined as the ratio between

270: the number of clauses $m$ and the number of variables $n$ of the

271: formula. A random formula is obtained by choosing $m$ random clauses.

272: If we plot the probability that such a random formula is satisfiable

273: against the constraint density $c$, we notice the existence of a

274: critical value $c_{k}$ such that the satisfaction probability drops

275: (as $n\goesto \infty$) from one to zero at $c_{k}$. Such a ``sudden

276: change'' is an illustration of the mathematical concept of {\em sharp

277:   threshold}, qualitatively illustrated in Figure~\ref{sharp:thr}. The

278: existence of a critical value $c_{k}$ has not been rigorously

279: established (except for $c_{2}=1$), even though Friedgut

280: \cite{friedgut:k:sat} has shown

281: that the transition is ``sharp'' for every $k$.

282: \begin{figure}

283: \centerline{

284: \psfig{figure=fig3.ps,width=3.5in}}

285: \caption{Qualitative picture of a (rescaled) sharp threshold}

286: \label{sharp:thr}

287: \end{figure}

288:

289: Of special interest will also be the width of the so-called {\em scaling window (a.k.a. critical region)}. To define it, consider, for $0 <\delta < 1$,

290: $\alpha_-(n,\delta)$, the supremum over

291: $\alpha$ such that for $m=\alpha n$, the

292: probability of a random formula being

293: satisfiable is at least $1-\delta$.

294: Similarly, let

295: $\alpha_+(n,\delta)$ be the infimum over

296: $\alpha$ such that for $m=\alpha n$, the

297: probability of a random formula being

298: satisfiable is at most $\delta$.

299: Then, for $\alpha$  within the {\em $\delta$-scaling window}

300: \begin{equation}

301: W(n,\delta) = (\alpha_-(n,\delta),

302: \alpha_+(n,\delta)),

303: \end{equation}

304: the probability that

305: a random formula is satisfiable is

306: between $\delta$ and $1-\delta$.

307:

308: We will be interested in the width of the window

309: $W(n,\delta)$ as a function of

310: $n$. It is generally believed that $|W(n,\delta)|=\Theta(n^{-1/\nu})$

311: for some $\nu=\nu_{k}\geq 1$ independent of $\delta$, even though the existence of $\nu_{k}$ has

312: only been established for $k=2$ \cite{scaling:window:2sat}.
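
To illustrate how such a width exponent can arise, here is a short sketch
under an assumed finite-size scaling form. The symbols $F$ and $\alpha_{c}$
are hypothetical and introduced only for this illustration. Suppose that for
$m=\alpha n$ the satisfaction probability equals
$F\left(n^{1/\nu}(\alpha-\alpha_{c})\right)$, where $F$ is continuous and
strictly decreasing from $1$ to $0$. Then, for $0<\delta<1/2$,
\[
\alpha_{-}(n,\delta)=\alpha_{c}+n^{-1/\nu}F^{-1}(1-\delta),\qquad
\alpha_{+}(n,\delta)=\alpha_{c}+n^{-1/\nu}F^{-1}(\delta),
\]
and therefore
\[
|W(n,\delta)|=n^{-1/\nu}\left(F^{-1}(\delta)-F^{-1}(1-\delta)\right).
\]
Under this assumption $\delta$ affects only the multiplicative constant,
while the exponent $1/\nu$ is determined by the rescaling alone.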

313:

314: \subsection{Order/disorder phase transitions}

315:

316: Statistical mechanics deals with the description of systems having a

317: large

318: number of degrees of freedom. One of its fundamental predictions

319: concerns the

320: fact that at thermal equilibrium each such state occurs with

321: probability

322: proportional to $\exp(-\beta H(\sigma))$, where $\beta$ is an {\em

323: inverse

324: temperature}, and $H$ is a {\em Hamiltonian function}, describing the

325: energy of

326: the particular state $\sigma$. The resulting distribution is called

327: {\em

328: the Gibbs distribution $G_{\beta}$} given by

329: \[

330: \Pr[\sigma]=\frac{\exp(-\beta\cdot H(\Phi;\sigma))}{Z[\Phi]},

331: \]

332: where

333: \[

334: Z[\Phi]= \sum_{\sigma\in \{0,1\}^{n}}\exp(-\beta\cdot H(\Phi;\sigma))

335: \]

336: is the so-called {\em partition function}.

337:

338: Changes in the order properties of the system,

339: which characterize order-disorder phase transitions, manifest

340: themselves as

341: non-analytical behavior of thermal averages (i.e. averages over the

342: Gibbs distribution) of a certain {\em order parameter}.

343: We want to emphasize that the physicists' use of the term order

344: parameter is quite different from the one in combinatorics.

345: An order parameter is a quantity that is zero on one side of the

346: phase transition and becomes non-zero on the other side (for instance

347: the satisfaction probability could be an order parameter).

348:

349: One of the simplest illustrations of these

350: concepts is the {\em two-dimensional Ising model} (see

351: \cite{baxter:rigorous} for a

352: thorough treatment).  In this model we

353: have a number of {\em spins}, that are small magnets located on the

354: vertices of the two-dimensional lattice, and pointing either

355: up or down. The spins interact with their neighbors and with an {\em

356: external

357: magnetic field $h\in {\bf R}$}, which will tend to align the spins in one of the

358: two directions. The energy of a state $\sigma$ is

359: \[

360: H(\sigma)=- \sum_{i\sim j}\sigma_{i}\cdot \sigma_{j} + h\cdot

361: \left(\sum_{i}\sigma_{i}\right).

362: \]

363:

364:

365:

366: The order parameter is called {\em free energy}, is a function of

367: temperature, and is formally defined as

368:

369: \[

370: f = -\frac{1}{\beta n} \ln Z[\Phi].

371: \]

372:

373: It measures the fraction of spins that are ``frozen''

374: when the field is turned off.

375:

376: We now briefly describe the essence of the phase transition:

377: above a certain temperature $T_{c}$, {\em the Curie-Weiss point}, when

378: the magnetic field is turned to zero

379: the proportion of spins that point in each direction is about

380: $\frac{1}{2}$ (the so-called {\em disordered phase}). But for

381: temperatures below $T_{c}$ when we turn the field to zero some

382: orientation still dominates (the  {\em ordered phase}), and the proportion of

383: spins pointing up (down) changes discontinuously as $h$ passes through zero.

384:

385:  The connection with combinatorial optimization follows from the

386: observation that when $\beta \goesto \infty$ (that is, as the temperature

387: approaches 0 K), the Gibbs distribution $G_{\beta}$ converges to a

388: uniform distribution $G$ on the set of states of minimal energy

389: (ground states). Thus, based on  this analogy, one can hope that

390: ideas from Statistical Mechanics are able to provide insight into the

391: structure of optimal solutions to an instance of a problem in

392: Combinatorial Optimization. Rather than providing a complete discussion (which

393: would require us to

394: rigorously define the notion of optimization problem) we will discuss

395: this in the

396: context of MAX 3-SAT, the optimization version of satisfiability. For

397: now

398: it suffices to mention the three main ingredients of an optimization

399: problem,

400: its {\em instances}, {\em solutions} to instances of a problem, and a

401: {\em

402: cost function}, that measures the quality of a solution for a certain

403: instance.
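
The convergence of $G_{\beta}$ to the uniform distribution on ground states,
invoked at the start of the preceding paragraph, can be checked directly.
The following is a short sketch; the symbols $E_{0}$ and $N_{0}$ are our
notation and do not appear elsewhere in the text. Let
$E_{0}=\min_{\sigma}H(\Phi;\sigma)$ and let $N_{0}$ be the number of states
attaining this minimum. Multiplying the numerator and the denominator of the
Gibbs weight by $e^{\beta E_{0}}$ gives
\[
\Pr[\sigma]=\frac{e^{-\beta (H(\Phi;\sigma)-E_{0})}}
{N_{0}+\sum_{\tau:\, H(\Phi;\tau)>E_{0}}e^{-\beta (H(\Phi;\tau)-E_{0})}}.
\]
Since the state space is finite, every term of the sum in the denominator
tends to zero as $\beta\goesto\infty$. Hence $\Pr[\sigma]\goesto 1/N_{0}$
when $H(\Phi;\sigma)=E_{0}$, and $\Pr[\sigma]\goesto 0$ otherwise.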

404:

405: \begin{example}(MAX 3-SAT)

406:

407:

408: {\bf Input:} A propositional formula $\Phi$ in conjunctive normal form,

409: such that every

410: clause has length exactly 3.

411:

412: {\bf Solution:} A truth assignment $\sigma$ for the propositional

413: variables in $\Phi$

414: that maximizes the number of satisfied clauses.

415:

416: {\bf Cost function:} The cost $C(\Phi;\sigma)$ of a truth assignment

417: $\sigma$ for an instance

418: $\Phi$ of MAX 3-SAT is the number of clauses of $\Phi$ that are

419: violated by

420: $\sigma$.

421: \end{example}

422:

423: Let $Q$ be an optimization problem and let  $\Phi$ be an instance of

424: $Q$ ``on

425: $n$ variables'' (i.e., all solutions have length $n$). We view the

426: set of all assignments on $\{0,1\}^{n}$ as ``states of a system.'' To

427: each such

428: state $\sigma$ we associate the Hamiltonian (energy function)

429: \[

430: H(\Phi;\sigma)=\mbox{ the cost of instance }(\Phi;\sigma)\mbox{ of }Q.

431: \]

432: \begin{example}

433: Let $\Phi$ be a 3-CNF formula, and let $\sigma$ be an assignment.

434: According

435: to the previous definition $H(\Phi;\sigma)=C(\Phi;\sigma)$. $H$ can be

436: formally expressed \cite{monasson:zecchina} as

437:

438: \[

439: H(\Phi;\sigma)=\sum_{l=1}^{m}\delta\left[\sum_{i=1}^{n}C_{l,i}\cdot

440: (-1)^{\sigma_{i}};-3\right],

441: \]

442: \end{example}

443: where $\delta[i;j]= 1_{\{i=j\}}$ is the Kronecker symbol and $C_{l,i}$

444: is 1 if the $l$th

445: clause contains the literal $x_{i}$, $-1$ if it contains

446: $\overline{x_{i}}$ and

447: zero otherwise.

448:

449:

450: For the case of problems of interest to Computer Science the instance

451: $\Phi$ is not fixed, but rather is a sample from a certain

452: distribution. This is very similar to the context of {\em spin-glass

453: theory},

454: a subfield of Statistical Mechanics.

455: The extra ingredient of this theory is that the coupling coefficients

456: are no

457: longer considered fixed, but are rather independent samples from a

458: certain

459: distribution. In the language of the theory of spin glasses, $\Phi$ is

460: called a {\em quenched quantity}.

461:

462: As in the case of the Ising model, the order parameter

463: is the {\em ground state free energy}, more precisely its expected value

464: \[

465: \overline{f}=-\frac{1}{\beta n}\overline{\ln(Z)},

466: \]

467: where $\overline{(\ldots)}$ stands for the average over the random

468: distribution of $\Phi$.

469: \begin{definition}

470: A {\em physical (order/disorder)

471: phase transition} in a combinatorial optimization problem

472: is a point where $\overline{f}$ is not analytical.

473: \end{definition}

474:

475:

476: Free energy has  an especially crisp intuitive

477: interpretation in the case of the problem MAX 3-SAT

478: \cite{monasson:zecchina}:

479:

480: \begin{example}\label{3sat:expl}

481: Let $\Phi_{n}$ be an instance of MAX 3-SAT, let $A$ be the set of

482: optimal

483: assignments to $\Phi_{n}$, endowed with the uniform measure $\mu_{n}$.

484: Statistical Mechanics predicts that, as $n\goesto \infty$, $\mu_{n}$ is

485: ``close'' to a product measure on $\{0,1\}^{n}$, $\mu_{1,n} \ldots

486: \mu_{n,n}$. The {\em free energy per site} $f$ is the fraction of

487: variables $x_{i}$ that are (asymptotically) {\em fully constrained}

488: (that is

489: $\mu_{i,n}$ converges in distribution to a measure having all its

490: weight on one of the two points 0,1).

491: \end{example}

492:

493:

494: \section{Critical behavior and the mean-field

495: approximation}\label{section:2}

496:

497: An important feature that order/disorder

498: phase transitions share with the combinatorial notion of {\em threshold

499: properties} (that are usually the type of phase transition of interest

500: in combinatorics) is that the various quantities of interest,

501: such as the satisfaction probability, the ground state energy, and the

502: location of the phase transition are hard to compute. No

503: general-purpose  methods exist, and in some cases even obtaining good

504: non-rigorous estimates is a challenging open problem.

505:

506: A technique  that often provides realistic approximate values for

507: these quantities

508: came to be known as the {\em mean-field (annealed) approximation}. In a nutshell

509: a mean-field approximation assumes that we are trying to compute the

510: average (over a certain discrete probability space) of a certain

511: expression $f\circ (g_{1}, \ldots, g_{n})$. Then the mean-field

512: approximation amounts to taking

513:

514: \[

515: E[f(g_{1}(x),\ldots, g_{n}(x))]\sim f(E[g_{1}(x)],\ldots, E[g_{n}(x)]).

516: \]

517:

518: This technical definition of the mean-field approximation does not

519: by itself convey the underlying intuition: suppose we want to solve a

520: combinatorial problem whose objective function depends on

521: simultaneously satisfying several ``constraints'' whose effects are

522: usually not independent. The mean-field approximation ignores

523:  the dependencies between the various constraints, and treats them

524: as independent.

525: \vspace{5mm}

526: \begin{example}

527: Let us return to the case of spin glasses. Each configuration of spins

528: $\sigma$

529: has an energy specified by a {\em Hamiltonian} $H(\sigma)$. A typical

530: expression for $H(\sigma)$ is

531: \[

532: H(\sigma)=\sum_{i\sim j}a_{i,j}\sigma_{i}\sigma_{j},

533: \]

534:

535: where the $a_{i,j}$'s are interaction coefficients between adjacent

536: spins

537: (according to some adjacency graph specific to the considered model).

538: The quantity of interest, {\em average free energy} $\overline{f}$

539: is hard to compute directly because of the logarithmic function present

540: in

541: the definition of the free energy. In this context the mean-field

542: approximation amounts to

543:

544: \[

545: \overline{f}\sim -\frac{1}{\beta n}\ln[ \overline{Z[\Phi]}].

546: \]

547: \end{example}

548: \vspace{5mm}

549:

550: The advantage of this heuristic is that the average on the right-hand

551: side is one that is usually much easier

552: to compute.
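
The direction in which this approximation errs can be seen from a standard
remark, added here as a sketch and not taken from the cited sources. Since
the logarithm is concave, Jensen's inequality gives
\[
\overline{\ln(Z)}\leq \ln\left(\overline{Z}\right),
\]
and therefore, for $\beta>0$,
\[
-\frac{1}{\beta n}\ln\left(\overline{Z}\right)\leq
-\frac{1}{\beta n}\overline{\ln(Z)}=\overline{f}.
\]
In other words, the annealed value is always a lower bound on the average
free energy, and the two coincide only to the extent that $\ln(Z)$
concentrates around $\ln(\overline{Z})$.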

553:

554:

555: For combinatorial phase transitions, the mean-field approach usually

556: amounts to an approximation using the so-called {\em first-moment

557: method}.

558:

559: \vspace{5mm}

560:

561: \begin{example} {\bf ($k$-Satisfiability)}

562:

563: The reason that the satisfiability probability of a random formula is

564: hard to compute is that, for two assignments $A,B$ the events $A\models

565: \Phi$ and $B\models\Phi$ are not independent.

566: One way to construct a mean-field theory for $k$-SAT is to ignore the

567: dependencies between these events.  More precisely, we have

568:

569: \[ 1_{SAT}[\Phi] = f(g_{A_{1}}[\Phi], \ldots, g_{A_{2^{n}}}[\Phi]),

570: \]

571: where

572: \[ f(x_{1}, x_{2}, \ldots, x_{2^{n}})= 1 - \prod _{i=1}^{2^{n}}x_{i},

573: \]

574: and

575:

576: \[ g_{A}[\Phi]= \left \{\begin{array}{ll}

577:                  1, & \mbox{ if }A\not \models \Phi,

578:                  \\

579:

580:                  0,  & \mbox{ otherwise.}\\

581:         \end{array}

582: \right.

583: \]

584: Define $\gamma_{k}=1-2^{-k}$. The mean-field approximation amounts to

585: \[

586: \Pr[\Phi \in SAT] = E[1_{SAT}[\Phi]]\sim f(E[g_{A_{1}}[\Phi]], \ldots,

587: E[g_{A_{2^{n}}}[\Phi]]).

588: \]

589: Since

590:

591: \[

592: E[g_{A_{1}}[\Phi]]= \ldots = E[g_{A_{2^{n}}}[\Phi]]= 1-\gamma_{k}^{cn},

593: \]

594: this reads,

595: \[

596: \Pr[\Phi \in SAT]\sim 1- \left[1-\gamma_{k}^{cn}\right]^{2^{n}}\sim

597: 1-e^{-

598: 2^{n}\cdot \gamma_{k}^{cn}}= 1- e^{-E[\#_{SAT}[\Phi]]}

599: \]

600: where $\#_{SAT}[\Phi]$ is the number of satisfying assignments for

601: $\Phi$. Thus (neglecting the case $E[\#_{SAT}[\Phi]]=1$)

602:

603: \[

604: \Pr[\Phi \in SAT]= \left \{\begin{array}{ll}

605:                  1, & \mbox{ if }E[\#_{SAT}[\Phi]]\goesto \infty,

606:                  \\

607:

608:                  0,  & \mbox{ if }E[\#_{SAT}[\Phi]]\goesto 0.\\

609: \end{array}

610: \right.

611: \]

612:

613:

614: \end{example}
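
To make the first-moment estimate of the preceding example concrete, here is
a short worked computation. The symbol $c^{\rm ann}_{k}$ is our notation,
introduced only for this illustration. Writing
\[
E[\#_{SAT}[\Phi]]=2^{n}\cdot \gamma_{k}^{cn}
=\exp\left(n\left(\ln 2+c\ln \gamma_{k}\right)\right),
\]
we see that the expectation tends to infinity when $c<c^{\rm ann}_{k}$ and
to zero when $c>c^{\rm ann}_{k}$, where
\[
c^{\rm ann}_{k}=\frac{\ln 2}{\ln (1/\gamma_{k})}
=-\frac{\ln 2}{\ln (1-2^{-k})}.
\]
For $k=3$ this gives $c^{\rm ann}_{3}=\ln 2/\ln(8/7)\approx 5.19$, and for
large $k$ one has $c^{\rm ann}_{k}\sim 2^{k}\ln 2$, because
$-\ln(1-x)\sim x$ as $x\goesto 0$. By Markov's inequality,
$\Pr[\Phi \in SAT]\leq E[\#_{SAT}[\Phi]]$, so the half of the heuristic
that concerns $c>c^{\rm ann}_{k}$ is rigorous. The other half is not: a
large expected number of satisfying assignments does not by itself imply
that a formula is satisfiable with high probability. For instance, for $k=2$
the computation gives $c^{\rm ann}_{2}=\ln 2/\ln(4/3)\approx 2.41$, whereas
the true threshold quoted in Section~\ref{section:1} is $c_{2}=1$.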

615: \vspace{5mm}

616: \subsection{Critical exponents and behavior}

617:

618:

619: A phenomenon that has been observed in various contexts is

620: {\em critical behavior}. In these cases the class of problems under

621: study has an intrinsic notion of

622: dimensionality $d$, and in the limit $d\goesto \infty$ (or sometimes

623: even when $d$ is greater than a so-called {\em critical dimension})

624: ``the annealed approximation becomes exact''.

625:

626: A way to give precise meaning to the above quote comes from the

627: concept of {\em universality}. In Statistical Mechanics one defines

628: certain {\em critical exponents}, that describe the behavior of the

629: system near the critical points; universality predicts that phase

630: transitions

631: with the same critical exponents are ``structurally similar''.

632:

633: Since critical exponents can be defined for the mean-field versions of

634: the physical models too, critical behavior means that as $d\goesto

635: \infty$

636: (or, sometimes, for $d$ larger than a value called {\em the upper

637: critical dimension}) the critical

638: exponents of the $d$-dimensional system coincide with the critical

639: exponents of the $d$-dimensional mean-field model.

640:

641: \vspace{5mm}

642: \begin{example}

643: {\bf (Bond) percolation on the lattice ${\bf Z}^{d}$.}

644: Percolation \cite{grimmett:percolation} is a mathematical theory that

645: models the flow of  liquids in random porous media. In our case

646: the flow is on the

647: lattice ${\bf Z}^{d}$ of dimension $d$, and the model has one

648: parameter, the edge probability $p\in [0,1]$. Each bond (grid

649: edge of the lattice ${\bf Z}^{d}$) is considered open with

650: probability $p$ (independently of the other bonds) and the order

651: parameter is the probability $P_{d}(p)$ that the origin lies in an

652: infinite cluster. $P_{d}$ is a monotonically increasing function of

653: $p$. It is  believed that $P_{d}(p)$ is  zero up to a {\em critical

654: value $p_{c}(d)$} (known

655: rigorously only for $d=2$), greater than zero beyond that point, and

656: non-analytical but continuous (at least for $d=2$) at $p_{c}(d)$.

657: It is also believed that above

658: (and around the critical value) $P_{d}(p)\sim (p-p_{c}(d))^{\beta}$ where $\beta$ is a

659: {\em critical exponent} that depends on $d$ but {\em not} on

660: the explicit lattice considered (i.e. it would be the same if we choose

661: another $d$-dimensional lattice instead of ${\bf Z}^{d}$). This is

662: only one of the several critical exponents that are believed to

663: structurally characterize percolation on $d$-dimensional lattices (see

664: \cite{grimmett:percolation}).

665:

666: Without going into further details, we note that

667: the ``mean-field approximation''

668: corresponds to considering percolation on the {\em $d$-dimensional

669:   Bethe lattice}, and

670: the critical behavior

671: amounts to the observation that for $d$ greater than a {\em critical

672: dimension} (known to be at most 16 \cite{hara:slade:critical}, and

673: believed to be 6) the

674: critical exponents of percolation on ${\bf Z}^{d}$ are those of

675: percolation on the Bethe lattice.

676:

677: \end{example}

678:

679:

680: \subsection{Rescaling and critical behavior}

681: \label{discuss}

682: An example of critical behavior has recently been observed

683: experimentally by Kirkpatrick and Selman

684: \cite{kirkpatrick:selman:scaling} for satisfiability problems.

685:

686: Their results do not mention

687: critical exponents (although they are closely related).  To explain

688: them,

689: we need to

690: introduce first another concept from Statistical Mechanics: {\em

691: finite-size

692: scaling}. The intuition behind it is that

693: \cite{kirkpatrick:selman:scaling} ``sufficiently close to a threshold

694: or critical point, systems of all sizes are indistinguishable except

695: for an overall change of scale.'' In

696: mathematical terms this amounts to defining a new order parameter

697: that ``opens up'' the {\em scaling window,

698: the region where the probability decreases from 1 to 0.}

699: \vspace{5mm}

700: \begin{example} {\bf Hamiltonian Cycle.}

701:

702: The random model has one parameter $m$, the number of edges. A random

703: sample is obtained by choosing uniformly at random a set of $m$

704: distinct edges of a complete graph with $n$ vertices. The following

705: result (obtained by Koml\'{o}s and Szemer\'{e}di \cite{hamcyclerand})

706: describes the phase

707: transition in this problem:

708:

709: Let $m=m(n)= \frac{1}{2}n\cdot \log(n)+\frac{1}{2}n\cdot \log

710: \log(n)+c_{n}\cdot n$. Then

711:

712: \[

713: \lim_{n\goesto \infty}\Pr[G\mbox{ has a Hamiltonian cycle}]=\left

714:  \{\begin{array}{ll}

715:                   0, & \mbox{ if $c_{n}\goesto -\infty$,}

716:                  \\

717:                  e^{-e^{-2c}},  & \mbox{if $c_{n}\goesto c$,}\\

718:                  1, & \mbox{ if $c_{n}\goesto \infty$.}

719:                  \\

720:         \end{array}

721: \right.

722: \]

723:

724: A rescaled parameter for the Hamiltonian cycle problem can be defined

725: by $c_{n}=\frac{1}{n}\cdot [m-\frac{1}{2}n\cdot

726: \log(n)-\frac{1}{2}n\cdot \log \log(n)]$. This parameter yields a

727: rescaled limit probability function $f(c)=e^{-e^{-2c}}$.

728: \end{example}
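
The rescaled limit function of the preceding example also determines the
width of the scaling window explicitly. The following is a worked sketch;
the symbols $c_{-}(\delta)$ and $c_{+}(\delta)$ are our notation. Solving
$e^{-e^{-2c}}=p$ for $c$ gives $c=-\frac{1}{2}\ln \ln (1/p)$, so the limit
probability lies between $\delta$ and $1-\delta$ (for $0<\delta<1/2$)
exactly when $c_{-}(\delta)<c<c_{+}(\delta)$, where
\[
c_{-}(\delta)=-\frac{1}{2}\ln \ln \frac{1}{\delta},\qquad
c_{+}(\delta)=-\frac{1}{2}\ln \ln \frac{1}{1-\delta}.
\]
In terms of the number of edges, the window therefore has width
\[
n\cdot \left(c_{+}(\delta)-c_{-}(\delta)\right)=
\frac{n}{2}\ln \frac{\ln (1/\delta)}{\ln (1/(1-\delta))},
\]
which is linear in $n$ for every fixed $\delta$; for $\delta=0.1$ it is
approximately $1.54\, n$. Since the transition is located at
$m\sim \frac{1}{2}n\log(n)$, the width of the window relative to the
location of the threshold is of order $1/\log(n)$. At the center of the
rescaled picture, $f(0)=e^{-1}\approx 0.368$.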

729: \vspace{5mm}

730:

731: It is important to note that, since an annealed approximation yields

732: an expression for the order parameter (in our case satisfaction

733: probability) that usually displays a phase transition too,

734: a rescaled parameter can also be defined for the mean-field version of the

735: problem.

736:

737: The definition of the rescaled parameter allows a precise formulation

738: of the intuition that an annealed approximation becomes exact in the

739: limit $d\goesto \infty$. Let $P_{d}$ be a class of satisfiability

740: problems indexed by a dimensionality parameter $d$, let $F_{d}$

741: be the rescaled satisfaction probability graph of $P_{d}$, and let

742: $F_{ann,d}$ be

743: the rescaled graph corresponding to the annealed approximation.

744: Kirkpatrick and Selman observe experimentally that {\em as $d\goesto

745: \infty$,

746: the function sequences $F_{d}$, $F_{ann,d}$ converge pointwise to a

747: common limit $F_{\infty}$}.

748: \vspace{5mm}

749: \begin{example}

750: We present in detail the experimental results of Kirkpatrick

751: and Selman. They define an (approximate) rescaled parameter for $k$-SAT

752: \[

753: y_{k} = n^{1/\nu_{k}}\frac{(c-c_{k})}{c_{k}},

754: \]

755: where $c=m/n$, $c_{k}$ is the critical threshold for $k$-SAT, and

756: $\nu_{k}$ is the scaling width coefficient.

757: Also, define the ``annealed rescaled parameter''

758: \[

759: y_{\infty,k} = n\frac{(c-c_{k})}{c_{k}}.

760: \]

761:

762: The rescaled limit probability graphs (and, see below, the rescaled

763: versions of the mean-field versions) seem to converge (see Fig. 4 in

764: that paper) to the ``annealed limit''

765:

766: \[

767: f_{\infty}(y) = e^{-2^{-y}}.

768: \]

769: \end{example}

770: \vspace{5mm}

771:

772:

773: \vspace{5mm}

774: \begin{definition}

775: In this paper {\em dimension-dependent

776: behavior} refers to the above-mentioned type of phenomenon: convergence

777: of the ``rescaled'' probability functions (and their annealed

778: counterparts) to some common {\em annealed limit}.

779: \end{definition}

780:

781: \vspace{5mm}

782: \begin{observation}

783:

784: It is important to note that dimension-dependent behavior is at the

785: same time more and less demanding than critical behavior.

786:

787:

788: It is more demanding since it requires that the

789: annealed approximation be exact {\em throughout the (rescaled version

790: of the)

791: critical region}. In contrast, critical exponents only provide a

792: qualitative picture of this region, rather than uniquely determining the

793: limit probability throughout it; for instance the width of the scaling

794: window $\nu$ is equal to $2\beta+\gamma$, where  $\beta$

795: is the

796: so-called {\em order-parameter exponent}, that characterizes the

797: asymptotic behavior of the order parameter close to the transition

798: point, and $\gamma$ is called {\em susceptibility exponent} (see

799: e.g. \cite{scaling:window:2sat}).

800:

801: It is less demanding since it does

802: not assume the existence of critical exponents, therefore

803: {\em it makes sense for problems having coarse thresholds,

804: including those that have no singular/critical points}.

805:

806:

807: \end{observation}

808: \vspace{5mm}

809:

810:

811:

812: Why should we expect critical behavior and the above form for the

813: annealed

814: limit? The intuition is very simple: the major difficulty in computing

815: the

816: probability that a random $k$-SAT formula is satisfiable is the fact

817: that, for two assignments $A$ and $B$, the events ``$A\models \Phi$''

818: and

819: ``$B\models \Phi$'' are not generally independent, because there exist

820: clauses of length $k$ that are falsified by both $A$ and $B$. On the

821: other hand,

822: qualitatively, as $k\goesto \infty$ clausal constraints become

823: progressively ``looser'', so that in the limit we can neglect such

824: correlations.

825:

826: As to the exact expression for $f_{\infty}(y)$, for a $k$-CNF formula

827: the mean-field approximation implies

828:

829: \[

830: \Pr[\Phi \in \overline{SAT}]\sim (1-\gamma_{k}^{cn})^{2^n}\sim

831: e^{-2^{n}\cdot \gamma_{k}^{cn}}.

832: \]

833:

834: But since $c_{k}$ is specified (in the mean-field approximation) by

835: $E[\# SAT]\sim 1$, i.e. $2^{n}\cdot \gamma_{k}^{c_{k}n}\sim 1$,

836: or $1+c_{k}\log_{2}\gamma_{k}=0$, this implies that as $k\goesto

837: \infty$

838: \[

839: \Pr[\Phi \in \overline{SAT}]\sim e^{-2^{n\cdot [1-c/c_{k}]}}\sim

840: f_{\infty}(y_{\infty,k}).

841: \]

842:

843: In other words, when plotted against the annealed order parameters

844: $y_{\infty,k}$ the rescaled satisfaction probability graphs (and their

845: annealed

846: counterparts) converge pointwise to the graph of $f_{\infty}$.
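
The middle step of the last display can be spelled out as follows (a sketch that only uses the definition of $c_{k}$ given above). Since $1+c_{k}\log_{2}\gamma_{k}=0$ we have $\log_{2}\gamma_{k}=-1/c_{k}$, hence
\[
2^{n}\cdot \gamma_{k}^{cn}=2^{n\cdot [1+c\log_{2}\gamma_{k}]}=2^{n\cdot [1-c/c_{k}]}=2^{-n\cdot \frac{c-c_{k}}{c_{k}}}=2^{-y_{\infty,k}},
\]
so that $e^{-2^{n}\cdot \gamma_{k}^{cn}}=e^{-2^{-y_{\infty,k}}}=f_{\infty}(y_{\infty,k})$.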

847:

848: \section{Does critical behavior really exist?}

849:

850: The intuitive argument sketched in the preceding paragraph seems to provide

851: a beautiful explanation of the experimental results from \cite{kirkpatrick:selman:scaling}. That this

852: intuition is, however, problematic has been shown by Wilson

853: \cite{wilson:ksat:wrong}. First

854: note that if the previous argument were true, we would have

855: $\nu_{k}=1$

856: for any large enough $k$, since this is the width of the scaling

857: window that the mean-field versions of $k$-SAT predict.

858: On the other hand, Wilson

859: presented a simple argument that implies that $\nu_{k}\geq 2$.

860: Hence the above explanation is not rigorously valid.

861:

862: We stress that Wilson's observation does {\em not} rule out the

863: existence of critical behavior: we, in fact, believe that the

864: qualitative intuition that motivated \cite{kirkpatrick:selman:scaling},

865: that versions of $k$-SAT become more and more ``similar'' as $k$ goes to

866: infinity, is correct. {\em It is the notion of annealed approximation that

867: needs to be changed}.

868: And, certainly, {\bf his results do not rule out the possibility that the rescaled

869: limit probabilities converge, as $k\goesto \infty$, to a

870: suitably defined limit}. Obtaining a rigorous example where this holds,

871: that identifies a

872: suitable ``annealed approximation that becomes exact'' and also obtains

873: an

874: explanation for this convergence,  could hopefully

875: offer insights on how to address this problem

876: for random $k$-SAT as well. This is what our theorems in the next section

877: provide.

878:

879: \section{Our results}\label{section:3}

880: A {\em Horn clause} is a disjunction of literals containing {\em at

881: most one positive literal}. It will be called {\em positive} if it

882: contains a positive literal and {\em negative} otherwise.

883: A Horn formula is a conjunction of Horn

884: clauses. {\em Horn satisfiability} (denoted by $\HSAT$) is the

885: problem of deciding whether a given Horn formula has a satisfying

886: assignment.

887:

888: In this paper we prove results that display

889: dimension-dependent behavior for (at most) $k$-Horn satisfiability, the

890: natural version of Horn

891: satisfiability parameterized by the maximum clause length.

892: This problem is also of practical

893: interest in Artificial Intelligence,

894: mainly in connection to {\em theory approximation}

895: \cite{kautz-selman-kc}.

896: The results can be summarized as

897:  follows:

898:

899: \begin{enumerate}

900: \item For an unbounded $k=k(n)$ the threshold phenomenon

901: is essentially the one from the ``uniform case'' $k(n)=n$.

902: In particular there exists a

903: ``rescaled'' parameter that makes the graphs of the limit probabilities

904: superimpose (Theorem~\ref{k:infinite}).

905:

906: \item For any constant $k$ the threshold phenomenon is qualitatively

907: described by a suitably chosen queuing model

908: (Theorem~\ref{k:3etc}). This yields a

909: closed-form expression for the satisfaction probability when

910: $k=2$ (Theorem~\ref{k:2}). This expression has a singularity (though $k=2$

911: is likely the only case that does so).

912: \item The rescaled limit probabilities from the

913: cases when $k$ is a constant converge to the one from the ``infinite''

914: case, that can in turn be seen as the result of a mean-field approximation

915: (thus the problem displays what we have called dimension-dependent behavior).

916:

917: \item Somewhat surprisingly, the explanation for this convergence (an

918: intrinsic feature of the problem) is

919: a threshold property for the number of iterations of PUR

920: (a particular algorithm) on random satisfiable Horn formulas

921: ``in the critical range.''

922:

923: \item In the case when $k=2$ \PUR\ displays an

924: ``easy-hard-easy'' pattern for the average number of iterations on

925: satisfiable instances, peaked at the point where the limit probability

926: has a singularity (Theorem~\ref{k:2:runtime}).

927: \end{enumerate}

928: \vspace{5mm}

929:

930:

931: Note, however, the important difference between

932: random $k$-SAT and random at-most-$k$-\HSAT: for every $k\geq 2$,

933: $k$-SAT has a sharp threshold

934: \cite{friedgut:k:sat}. All versions of \HSAT\ have coarse thresholds.

935:

936:

937: \vspace{5mm}

938: \begin{definition}

939: Let $k=k(n):\N \goesto \N$ be monotonically increasing, $1\leq

940: k(n)\leq n$. We define the following random model $\Omega(k,n,m)$:

941: {\em formula $\Phi$ on $n$ variables

942: is obtained by selecting (uniformly at random

943: and with repetition) $m$ clauses from the set of all (non-empty) Horn

944: clauses in the given variables of length {\em at most $k(n)$}.}

945: \end{definition}

946: \vspace{5mm}

947:

948: The following are our results (whose proofs are only sketched):

949: \vspace{5mm}

950:

951: \begin{theorem} \label{k:infinite}

952: If $k(n)\goesto \infty$, $c>0$,

953: $H_{k(n)}$ is the number of Horn clauses on $n$ variables

954: having length at most $k(n)$,  and $m(n)= c\cdot \frac{H_{k(n)}}{n}$

955: then

956: \begin{equation}

957: \label{formula:1}

958: p_{\infty}(c):=\lim_{n\goesto \infty} Pr_{\Phi \in

959: \Omega(k(n),n,m)}(\Phi \in \mbox{HORN-SAT}\/) =

960: 1-F_{1}(e^{-c}).

961: \end{equation}

962: \end{theorem}

963:

964: \vspace{5mm}

965:

966: \begin{theorem}\label{k:2}

967: If $c>0$, and $F_{2}:(0,1)\goesto (1,\infty)$,

968: $F_{2}(x)=\ln x/(x-1)$, then

969: \begin{equation}

970: \label{formula:2}

971: p_{2}(c):=\lim_{n\goesto \infty} Pr_{\Phi \in \Omega(2,n,cn)}(\Phi \in

972: \mbox{HORN-SAT}\/) =

973: \left \{\begin{array}{ll}

974:                  1, & \mbox{ if $c\leq \frac{3}{2}$,}

975:                  \\

976:

977:                  F_{2}^{-1}(2c/3),  & \mbox{ otherwise.}\\

978:         \end{array}

979: \right.

980: \end{equation}

981: \end{theorem}

982: \vspace{5mm}

983:

984: More generally, define $\lambda_{k}=\frac{k!}{k+1}$ and

985: $S_{j}^{i}={{i}\choose {0}}+{{i}\choose {1}}+\ldots+{{i}\choose

986: {j}}$ (with the usual convention ${{i}\choose{j}}=0$ for $i<j$). Then

987:

988: \begin{theorem}\label{k:3etc}

989: The limit probability $p_{k}(c):=\lim_{n\goesto \infty}

990: Pr_{\Phi \in \Omega(k,n,c\cdot n^{k-1})}(\Phi \in \mbox{HORN-SAT}\/)$

991: is equal to the probability that the following Markov chain

992: ever hits state zero:

993: \begin{equation}\label{eq:3etc}

994: \left \{\begin{array}{l}

995:         Q_{0}=1,\\

996:         Q_{i+1}=Q_{i}\cminus 1+Po(c\cdot \lambda_{k}\cdot

997: S_{k-2}^{i+1}),\\

998: \end{array}

999: \right.

1000: \end{equation}

1001: \end{theorem}
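
As a sanity check (a heuristic sketch, not a substitute for the proofs), Theorem~\ref{k:3etc} can be specialized to $k=2$. Then $\lambda_{2}=\frac{2}{3}$ and $S_{0}^{i+1}={{i+1}\choose{0}}=1$, so the chain becomes $Q_{i+1}=Q_{i}\cminus 1+Po(\frac{2c}{3})$, with increments that do not depend on $i$. Assuming the standard correspondence between such a queue and a branching process with $Po(\mu)$ offspring, $\mu=\frac{2c}{3}$, the probability $p$ of ever hitting zero is the smallest solution in $(0,1]$ of
\[
p=e^{-\mu(1-p)},\qquad \mbox{i.e.}\qquad \frac{\ln p}{p-1}=\mu \quad \mbox{ when } p<1,
\]
which is $p=1$ for $\mu\leq 1$ and $p=F_{2}^{-1}(2c/3)$ for $\mu>1$, in agreement with Theorem~\ref{k:2}.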

1002: \vspace{5mm}

1003:

1004: To get a better intuition on the threshold phenomenon, as displayed by

1005: Theorems~\ref{k:infinite}, \ref{k:2} and \ref{k:3etc}, we have plotted

1006: (in Fig. 1) the limit probability functions

1007: $p_{2}(\cdot),p_{3}(\cdot),p_{\infty}(\cdot)$, against the ``rescaled'' parameter

1008: (inspired by Theorem~\ref{k:infinite}) $\hat{c}=\frac{m\cdot

1009: n}{H_{k(n)}}$. This rescaling has the pleasant property that it

1010: absorbs the factor $\lambda_{k}$ on the right-hand side

1011: of~(\ref{eq:3etc}), in particular mapping the critical point in

1012: Theorem~\ref{k:2} to $\hat{c}=1$.

1013: The graphs of $p_{2}$ (continuous) and $p_{\infty}$

1014: (dashed) are obtained from their formulas in the previous results,

1015: while $p_{3}$ (dotted) is obtained via simulations. The figure makes

1016: apparent that the graphs of $p_{2}, p_{3}, \ldots$

1017: converge to

1018: the graph of $p_{\infty}$. This statement can be

1019: proved rigorously:

1020:

1021: \begin{theorem}\label{annealed}

1022: For every $\hat{c}>0$, $\lim_{k\goesto

1023: \infty}p_{k}(\hat{c})=p_{\infty}(\hat{c})$.

1024: \end{theorem}
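
The relation between $\hat{c}$ and the factor $\lambda_{k}$ mentioned before Theorem~\ref{annealed} can be checked by a short count (a sketch; it assumes that a clause is a set of literals over distinct variables). A Horn clause of length exactly $j$ is either negative, in ${{n}\choose{j}}$ ways, or has one positive literal, in $j{{n}\choose{j}}$ ways, so for constant $k$
\[
H_{k}=\sum_{j=1}^{k}(j+1){{n}\choose{j}}\sim \frac{k+1}{k!}\cdot n^{k}=\frac{n^{k}}{\lambda_{k}},
\qquad
\hat{c}=\frac{c\cdot n^{k-1}\cdot n}{H_{k}}\sim c\cdot \lambda_{k}.
\]
For $k=2$ this gives $\hat{c}=\frac{2c}{3}$, so $c=\frac{3}{2}$ corresponds to $\hat{c}=1$.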

1025: \vspace{5mm}

1026:

1027:

1028: \begin{figure}

1029: \centerline{

1030: \psfig{figure=fig1.ps,width=3.5in}}

1031: \caption{Rescaled threshold functions}

1032:  \end{figure}

1033:

1034:

1035:

1036:

1037: As a bonus our analysis yields the following result:

1038: \vspace{5mm}

1039: \begin{theorem}\label{k:2:runtime}

1040: Let $q$ be the limit of the

1041: expected number of iterations of \PUR\ on a random formula

1042: $\Phi \in \Omega(2,n,cn)$, conditional on $\Phi$ being

1043: satisfiable. Then

1044: \begin{equation}

1045: \label{q:2}

1046: q=

1047: \left \{\begin{array}{ll}

1048:                  \frac{1}{1-p_{2}\lambda_{2}c} & \mbox{, if $c\neq

1049: \frac{3}{2}$,}

1050:                  \\

1051:

1052:                  \infty,  & \mbox{ otherwise.}\\

1053:         \end{array}

1054: \right.

1055: \end{equation}

1056: \end{theorem}
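
Two sample evaluations of (\ref{q:2}) may help (the values of $c$ are chosen for illustration only; recall $\lambda_{2}=\frac{2}{3}$). For $c=\frac{3}{4}$ Theorem~\ref{k:2} gives $p_{2}=1$, so
\[
q=\frac{1}{1-\frac{2}{3}\cdot \frac{3}{4}}=2 .
\]
For $c=3\ln 2\approx 2.08$ we have $\frac{2c}{3}=2\ln 2=F_{2}(\frac{1}{2})$, hence $p_{2}=\frac{1}{2}$ and
\[
q=\frac{1}{1-\frac{1}{2}\cdot 2\ln 2}=\frac{1}{1-\ln 2}\approx 3.26 .
\]
On both sides of the threshold $p_{2}\lambda_{2}c\goesto 1$ as $c\goesto \frac{3}{2}$, which is where $q$ blows up.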

1057: \vspace{5mm}

1058:

1059: This theorem suggests (see Fig.~2) and explains the ``easy-hard-easy''

1060: pattern for the average running time of

1061: \PUR\

1062: in this case. Experiments we performed confirm this prediction.

1063:  \begin{figure}

1064:  \centerline{

1065:  \psfig{figure=fig2.ps,width=3.5in}}

1066:  \caption{The ``easy-hard-easy'' pattern.}

1067:  \label{figure-2}

1068:  \end{figure}

1069:

1070: \section{Preliminaries}

1071:

1072: Throughout this paper we use ``with high probability'' (w.h.p.)

1073: as a substitute for ``with probability $1-o(1)$''.

1074: We denote (sometimes abusing notation) by $B(n,p)$ ($Po(\lambda)$) a

1075: random

1076: variable having a binomial (Poisson) distribution with the

1077: corresponding

1078: parameter(s), and by $a\cminus b$ the value $\max(a-b,0)$.

1079: We will use the following version of the Chernoff bound

1080: \vspace{5mm}

1081:

1082: \begin{theorem}

1083: If $0<\theta <1/4$ then

1084: $\PR[|B(n,p)-np|>\theta np ] \leq e^{-np\frac{\theta^{2}}{4}}$.

1085: \end{theorem}

1086: \vspace{5mm}

1087:

1088: as well as the related inequality from \cite{probabilistic-method}:

1089:  \vspace{5mm}

1090:

1091: \begin{proposition}\label{chernoff:poisson}

1092: Let $P$ have Poisson distribution with mean $\mu$. For $\epsilon >0$,

1093:

1094: \[ \Pr[P\leq \mu \cdot (1-\epsilon)] \leq e^{-\epsilon^{2}\cdot \mu

1095: /2},

1096: \]

1097:

1098: \[ \Pr[P\geq \mu \cdot (1+\epsilon)] \leq

1099: [e^{\epsilon}(1+\epsilon)^{-(1+\epsilon)}]^{\mu}.

1100: \]

1101: \end{proposition}

1102: \vspace{5mm}

1103:

1104: We also use the following inequality:

1105: \vspace{5mm}

1106:

1107: \begin{proposition}

1108: Let $k\in \N$ and $p\in [0,1]$. Then for every $n\geq k$

1109: \begin{equation}

1110: 1-\sum_{i=0}^{k-1} {{n}\choose {i}}p^{i}(1-p)^{n-i}\leq {{n}\choose

1111: {k}}p^{k}.

1112: \end{equation}

1113: \end{proposition}

1114: \vspace{5mm}

1115:

1116:

1117: \begin{PROOF} Define $f:[0,1]\goesto {\bf R}$, $f(p)=1-\sum_{i=0}^{k-1}

1118: {{n}\choose

1119: {i}}p^{i}(1-p)^{n-i} -{{n}\choose {k}}p^{k}$. It is easy to see that

1120: $f^{\prime}(p)=n{{n-1}\choose {k-1}}p^{k-1}[(1-p)^{n-k}-1]\leq 0$,

1121: therefore $f$ is monotonically decreasing, so $f(p)\leq f(0)=0$.

1122: \end{PROOF}
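
For a quick numerical illustration (the values $n=10$, $p=0.1$, $k=2$ are arbitrary), the left-hand side is
\[
1-(0.9)^{10}-10\cdot 0.1\cdot (0.9)^{9}\approx 1-0.349-0.387=0.264,
\]
while the right-hand side is ${{10}\choose{2}}\cdot (0.1)^{2}=0.45$.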

1123:

1124:

1125:

1126:

1127:

1128:

1129:

1130: We will also employ {\em couplings of Markov

1131: chains} (see \cite{lindvall:coupling}) to assert stochastic

1132: domination. The following is the definition of the type of

1133: coupling we employ in

1134: this paper:

1135: \vspace{5mm}

1136: \begin{definition}

1137: Let $(X_{t})_{t}$ and $(Y_{t})_{t}$ be two Markov chains on ${\bf Z}$.

1138: A {\em coupling of $X$ and $Y$ such that $X_{t}\leq Y_{t}$} is a

1139: Markov chain $Z=(Z_{t,1},Z_{t,2})$ such that:

1140: \begin{itemize}

1141: \item $Z_{t,1}$ is distributed like $X_{t}$ given $X_{0}$.

1142: \item $Z_{t,2}$ is distributed like $Y_{t}$ given $Y_{0}$.

1143: \item for every $i\geq 0$, $Z_{i,1}\leq Z_{i,2}$.

1144: \end{itemize}

1145: \end{definition}

1146: \vspace{5mm}

1147:

1148: We use such couplings to bound the probability that a Markov

1149: chain $Y_{t}$ ever decreases below a certain value $a$ by coupling it

1150: with a chain $X_{t}$ such that $X_{t}\leq Y_{t}$ and using the

1151: estimate $\Pr[\exists t: Y_{t}\leq a]\leq \Pr[\exists t: X_{t}\leq a]$

1152: (that follows from the coupling). The couplings we construct employ the

1153: following ideas:

1154: \begin{itemize}

1155: \item Suppose the recurrences

1156: describing $\Delta X_{t}$ and $\Delta Y_{t}$ are identical,

1157: except for one term, which is $B(m_{1},\tau)$ in $X_{t}$ and

1158: $B(m_{2},\tau)$ in $Y_{t}$,

1159: where $m_{1}\leq m_{2}$ are positive integers and $\tau \in (0,1)$.

1160: Obtain a coupling by identifying $B(m_{1},\tau)$ with the outcome of

1161: the first $m_{1}$ Bernoulli experiments in $B(m_{2},\tau)$.

1162: \item Suppose now that $\Delta X_{t}$ and $\Delta Y_{t}$ differ by

1163: exactly one term which is $B(m,p)$ in $\Delta X_{t}$ and

1164: $B(m,q)$ in $\Delta Y_{t}$, $p \leq q$. Let $A_{i}$ and $B_{i}$,

1165: $i=1,\ldots,m$,

1166: be independent $0/1$ experiments with success probabilities $p$

1167: and $\frac{q-p}{1-p}$ respectively. Define the pair $(Z_{t,1},

1168: Z_{t,2})$ so that

1169: \begin{enumerate}

1170: \item $Z_{t,1}$ is the number of times $A_{i}$ succeeds.

1171: \item $Z_{t,2}$ is the number of times at least one of $A_{i}$ and

1172: $B_{i}$

1173: succeeds.

1174: \end{enumerate}

1175: \end{itemize}
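
That the second construction has the right marginals can be verified directly (a one-line check, assuming $p<1$ so that $\frac{q-p}{1-p}$ is well defined). For each $i$,
\[
\Pr[A_{i}\mbox{ or }B_{i}\mbox{ succeeds}]=1-(1-p)\cdot \left(1-\frac{q-p}{1-p}\right)=1-(1-q)=q,
\]
so $Z_{t,1}$ is distributed as $B(m,p)$, $Z_{t,2}$ as $B(m,q)$, and $Z_{t,1}\leq Z_{t,2}$ holds by construction.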

1176:

1177: %We will also explicitly refer to

1178: %the following stochastic dominance inequality obtained by the first

1179: %coupling.

1180: %Let $0<m_{1}\leq m_{2}$ and $\tau >0$. Then, for every $a>0$,

1181: %\begin{equation}\label{couple}

1182: %\Pr[B(m_{1},\tau)\geq a]\leq \Pr[B(m_{2},\tau)\geq a].

1183: %\end{equation}

1184:

1185: We measure the distance between two probability distributions

1186: $P$ and $Q$ by {\em the total variation distance},

1187: denoted by $d_{TV}(P,Q)$,  and recall the following results,

1188: (see \cite{sheu:poisson} and \cite{barbour:holst:janson}, page

1189: 2 and Remark 1.4):

1190: \vspace{5mm}

1191:

1192: \begin{lemma}\label{b:h:j}If $n,p,\lambda, \mu >0$ then

1193: $d_{TV}(B(n,p),Po(np))\leq \min\{np^{2},\frac{3p}{2}\}$ and

1194: $d_{TV}(Po(\lambda), Po(\mu))\leq |\mu - \lambda|$.

1195: \end{lemma}

1196: \vspace{5mm}

1197:

1198: We will also need the following simple lemma:

1199:

1200: \begin{lemma}\label{approximation}

1201: Let c be a fixed positive integer. For every $t\in \N$ let

1202: $\xi_{t}$, $\eta_{t}$ be two probability distributions. Define the

1203: Markov chains $(X_{t})_{t}$ and $(Y_{t})_{t}$ by recurrences

1204: \begin{equation}

1205: \left\{\begin{array}{l}

1206: X_{t+1}=X_{t}\cminus c + \xi_{t}, \\

1207: Y_{t+1}=Y_{t}\cminus c + \eta_{t}.\\

1208: \end{array}

1209: \right.

1210: \end{equation}

1211:

1212: Then, for every $t\geq 0$, $d_{TV}(X_{t},Y_{t})\leq

1213: d_{TV}(X_{0},Y_{0})+ \sum_{i=0}^{t-1} d_{TV}(\xi_{i}, \eta_{i}).$

1214: \end{lemma}

1215:

1216: \beginproof

1217:

1218: The following result gives a more convenient inequality that

1219: immediately implies Lemma~\ref{approximation}:

1220: \vspace{5mm}

1221:

1222: \begin{lemma}\label{easy:approximation}

1223: Let $c$ be a fixed positive integer. Let

1224: $X$, $Y$, $\xi$, $\eta$ be random variables with nonnegative integer

1225: values. Define the

1226: random variables $Z$ and $T$ by recurrences

1227: \begin{equation}

1228: \left\{\begin{array}{l}

1229: Z=X\cminus c + \xi, \\

1230: T=Y\cminus c + \eta.\\

1231: \end{array}

1232: \right.

1233: \end{equation}

1234: Then $d_{TV}(Z,T)\leq

1235: d_{TV}(X,Y)+ d_{TV}(\xi, \eta).$

1236: \end{lemma}

1237: \vspace{5mm}

1238:

1239: \beginproof

1240:

1241: To prove this result, we will denote (for the ``generic'' r.v. $A$) by

1242: $A_{i}$ the probability that $A$ takes value $i$. We also employ the

1243: following simple inequality, valid for $a,b,c,d\geq 0$: $|ad-bc|\leq

1244: a|d-c|+|a-b|c$.

1245:

1246: For every $a\geq 0$ we have:

1247: \[

1248: Z_{a}=\sum_{i=0}^{c} X_{i}\xi_{a}+\sum_{i=c+1}^{c+a} X_{i}\xi_{a+c-i},

1249: \]

1250: \[

1251: T_{a}=\sum_{i=0}^{c} Y_{i}\eta_{a}+\sum_{i=c+1}^{c+a}

1252: Y_{i}\eta_{a+c-i}.

1253: \]

1254:

1255: Applying the above-mentioned inequality and summing we get:

1256:

1257: \begin{eqnarray*}

1258: d_{TV}(Z,T) \\ & \leq &

1259: \frac{1}{2} \{

1260: \sum_{i=0}^{c}\sum_{a=0}^{\infty}X_{i}|\xi_{a}-\eta_{a}|

1261: +\sum_{i=0}^{c}\sum_{a=0}^{\infty}|X_{i}-Y_{i}|\eta_{a} \\

1262: & + &

1263: \sum_{a=0}^{\infty}\sum_{i=c+1}^{c+a}X_{i}|\xi_{c+a-i}-\eta_{c+a-i}|

1264: +\sum_{a=0}^{\infty}\sum_{i=c+1}^{c+a}|X_{i}-Y_{i}|\eta_{c+a-i}\}.

1265: \end{eqnarray*}

1266:

1267: Let $A,B,C,D$ be the four terms of the sum (each including the factor $\frac{1}{2}$). By simple algebraic

1268: manipulations we obtain:

1269: \[

1270: \begin{array}{lcl}

1271:  A = (\sum_{i=0}^{c}X_{i})\cdot d_{TV}(\xi,\eta), &\hspace{5mm} & B =

1272: \frac{1}{2}\sum_{i=0}^{c} |X_{i}-Y_{i}|,\\

1273: C = (\sum_{i=c+1}^{\infty}X_{i})\cdot d_{TV}(\xi,\eta),

1274: & \hspace{5mm} & D =

1275:  \frac{1}{2}\sum_{i=c+1}^{\infty}|X_{i}-Y_{i}|,

1276: \end{array}

1277: \]

1278: and the result follows, since $A+C=d_{TV}(\xi,\eta)$ and $B+D=d_{TV}(X,Y)$.

1279: \qed
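
For completeness, here is how Lemma~\ref{easy:approximation} yields Lemma~\ref{approximation} (a sketch, under the assumption implicit in the recurrences that $\xi_{t}$ and $\eta_{t}$ are drawn independently of $X_{t}$ and $Y_{t}$). Applying it with $X=X_{t}$, $Y=Y_{t}$, $\xi=\xi_{t}$, $\eta=\eta_{t}$ gives
\[
d_{TV}(X_{t+1},Y_{t+1})\leq d_{TV}(X_{t},Y_{t})+d_{TV}(\xi_{t},\eta_{t}),
\]
and iterating this inequality from $t-1$ down to $0$ gives the claimed bound.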

1280:

1281:

1282: Finally, we need the following trivial occupancy property:

1283: \vspace{5mm}

1284:

1285: \begin{lemma}\label{occupancy}

1286: Let $a$ white balls and $b$ black balls be thrown uniformly at random

1287: in $n$ bins.

1288: \begin{enumerate}

1289: \item if $r=\max(a,b)=o(n^{1/2})$ then the probability that there is a

1290: bin that contains both white and black balls is at most

1291: $\frac{4r^2}{n}=o(1)$.

1292: \item if $s=\min(a,b)=\omega(n^{1/2})$ then the probability that there

1293: is a

1294: bin that contains both white and black balls is $1-o(1/poly)$.

1295: \end{enumerate}

1296: \end{lemma}

1297: \vspace{5mm}

1298:

1299: \beginproof

1300: The first part is easy: the probability that two balls (of any color)

1301: end up in the same bin is at most ${{a+b}\choose {2}}\cdot

1302: \frac{1}{n}$.

1303: For the second part, let $A$ be the event that no two balls of

1304: different colors end up in the same bin, and let $B$ be the event that at

1305: least $\sqrt{n}$ bins contain white balls. We have:

1306: \[ \Pr[A]\leq \Pr[A|B]+\Pr[\overline{B}].\]

1307: But

1308: \[ \Pr[\overline{B}]\leq {{n}\choose {\sqrt{n}}}\cdot

1309: (\frac{1}{\sqrt{n}})^{a}\leq n^{\sqrt{n}-a/2}=o(\frac{1}{poly}), \]

1310: and

1311: \[\Pr[A|B]\leq (1-\frac{1}{\sqrt{n}})^{b}\sim

1312: e^{-b/\sqrt{n}}=o(\frac{1}{poly}). \]

1313: \qed
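
As a concrete instance of the first part (numbers chosen for illustration only), take $n=10^{6}$ and $a=b=30$. Then $r=30$ and the bound is
\[
\frac{4r^{2}}{n}=\frac{4\cdot 900}{10^{6}}=3.6\cdot 10^{-3}.
\]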

1314:

1315: The algorithm \PUR\ is displayed in Figure 3.

1316: We regard \PUR\ as working in stages, indexed by the

1317: number of variables still left unassigned; thus, the stage number

1318: decreases as \PUR\ moves on. We say that {\em formula $\Phi$ survives

1319: Stage $t$} if \PUR\ on input $\Phi$ does not halt at Stage $t$ or

1320: earlier. Let $\Phi_i$ be the formula at the

1321: beginning of stage $i$, and let $N_{i}$ denote the number of its

1322: clauses. We will also denote by $P_{i,t}$ ($N_{i,t}$) the number of

1323: clauses of

1324: $\Phi_{t}$ of size $i$ and containing one (no) positive

1325: literal. Define $\Phi_{i,t}^{P}$ ($\Phi_{i,t}^{N}$) to be the

1326: subformula of $\Phi_{t}$ containing the clauses counted by $P_{i,t}$

1327: ($N_{i,t}$).

1328:

1329: The following lemmas were proved in \cite{istrate:cs.DS/9912001}, in

1330: the

1331: context of analyzing the behavior of \PUR\ on $\Phi\in

1332: \Omega(n,n,m)$, $m=c\cdot 2^n$.

1333: \vspace{5mm}

1334:

1335: \begin{lemma}\label{k:inf:recurrence}

1336: \begin{enumerate}

1337: \item

1338: Suppose $\PUR$ does not halt before stage $t$. Then, conditional on $N_{t}$,

1339: the clauses of $\Phi_{t}$ are random and independent.

1340: \item

1341: Suppose now that we condition on $\Gamma_{t}=(N_{1,t},N_{2,t},P_{1,t},

1342: P_{2,t})$ and on the fact that $\Phi$

1343: survives Stage $t$ as well. Then  we have

1344:

1345: \begin{equation}\label{eq:markovchain}

1346: N_{t-1}=N_{t}-\Delta_{1,P}(t)-\Delta_{2,P}(t),

1347: \end{equation}

1348:

1349: where

1350: \begin{itemize}

1351: \item $\Delta_{1,P}(t)$, the number of positive unit clauses that are

1352: satisfied at stage $t$, has the distribution $1+B\left(P_{1,t}-1,\frac{1}{t}\right)$.

1353: \item

1354: $\Delta_{2,P}(t)$, the number of positive non-unit clauses

1355: that are satisfied at stage $t$, has the binomial distribution

1356: $B\left(P_{2,t},\frac{1}{t}\right)$.

1357: \end{itemize}

1358: \end{enumerate}

1359: \end{lemma}

1360:

1361: \vspace{5mm}

1362:

1363: \begin{lemma}\label{k:inf:bounds}

1364: For every $c>0$ and every $t$ with $n-c\sqrt{n} \leq t \leq n$,

1365: the conditional probability that the inequality

1366: \begin{equation}\label{concentrate}

1367: N_{n}-(n-t)\left[1+\frac{2(N_{n}-1)}{t}\right]\leq N_{j}\leq

1368: N_{n}\end{equation}

1369: holds for all $t\leq j \leq n$, in the event that $\PUR$ reaches stage

1370: $t$,

1371: is $1-o(1)$.

1372: \end{lemma}

1373: \vspace{5mm}

1374:

1375: \begin{lemma}\label{k:inf:prob}

1376: Let $X_{n}\in [0,n]$ be the r.v. denoting the number of iterations of

1377: \PUR\ on

1378: a random {\em satisfiable} formula $\Phi\in \Omega(n,n,c\cdot

1379: 2^{n})$. Then $X_{n}$ converges in distribution to a distribution

1380: $\rho$ on $[0,n]$ having support on the nonnegative integers,

1381: $\rho=(\rho_{k})_{k\geq

1382: 0}$, $\rho_{k}= Prob[\rho = k]$,

1383: given by

1384: \[ \rho_{k}=\frac{e^{-2^{k}c}}{1-F(e^{-c})}\cdot \prod_{i=1}^{k-1}

1385: (1-e^{-2^{i}c}).

1386: \]

1387: \end{lemma}

1388: \vspace{5mm}

1389:

1390: \begin{center}

1391: \begin{figure}

1392: {\tt

1393: \begin{tabbing}

1394:

1395: Pr\=ogram PUR($\Phi$): \\

1396:    \> if \= ($\Phi$ contains no positive literal as a clause)\\

1397:    \> \>then \= return TRUE \\

1398:    \> \>else \\

1399:    \> \> \>choose a positive unit clause $x$ \\

1400:    \> \> \>if \= ($\Phi$ contains $\overline{x}$ as a clause)\\

1401:    \> \> \> \>then \= \\

1402:    \> \> \> \> \>return FALSE \\

1403:    \> \> \> \>else \\

1404:    \> \> \> \> \>let $\Phi^{\prime}$ be the formula \\

1405:    \> \> \> \> \>obtained by setting

1406:    $x$ to 1 \\

1407:     \> \> \> \> \>return \PUR($\Phi^{'}$) \\

1408: \end{tabbing}

1409: }

1410: \caption{Algorithm PUR}

1411: \end{figure}

1412: \end{center}
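
To illustrate the algorithm, here is a trace on a small example (the formula is hypothetical and chosen only for illustration). Let
\[
\Phi=x_{1}\AND (\overline{x_{1}}\OR x_{2})\AND (\overline{x_{2}}\OR \overline{x_{3}}).
\]
In the first iteration \PUR\ chooses the positive unit clause $x_{1}$; since $\overline{x_{1}}$ is not a clause of $\Phi$, it sets $x_{1}$ to 1 and obtains $x_{2}\AND (\overline{x_{2}}\OR \overline{x_{3}})$. In the second iteration it chooses $x_{2}$ and obtains the single clause $\overline{x_{3}}$. This formula contains no positive unit clause, so \PUR\ returns TRUE.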

1413: \section{The proof of Theorem~\ref{k:infinite}}

1414:

1415: Let $c_{1}<c_{2}<c_{3}$ be arbitrary constants. Consider three

1416:  random formulas $\Phi_{1}\in \Omega({\bf k(n)},n,c_{1}\cdot

1417:  \frac{H_{k(n)}}{n})$, $\Phi_{2} \in \Omega({\bf n},n,c_{2}\cdot

1418:  2^{n})$ and  $\Phi_{3}\in\Omega({\bf k(n)},n, c_{3}\cdot

1419: \frac{H_{k(n)}}{n})$,

1420: and let $\Phi^{\prime}$ be the subformula of $\Phi_{2}$ consisting of

1421:  the clauses of size at

1422: most $k(n)$. By the Chernoff bound, with high probability,

1423: $m^{\prime}$, the number of clauses of $\Phi^{\prime}$, is in the

1424: interval

1425: $[c_{1}\cdot \frac{H_{k(n)}}{n},c_{3}\cdot \frac{H_{k(n)}}{n}] $.

1426: When $n\goesto \infty$ the probability that $\Phi_{2} \in \HSAT$ tends

1427: to $1-F_{1}(e^{-c_{2}})$.

1428:

1429: From Lemma~\ref{k:inf:prob} we infer the following easy consequence

1430: \vspace{5mm}

1431: \begin{claim}

1432: The probability that \PUR\  accepts $\Phi_{2}$

1433: after stage $n-k(n)+1$ is $o(1)$.

1434: \end{claim}

1435:

1436:

1437:

1438: Since in the first $k(n)-1$ stages

1439: of \PUR\  {\em only the clauses of $\Phi^{\prime}$ can influence the

1440: algorithm's acceptance/rejection of  $\Phi_{2}$

1441: (because \PUR\  accepts/rejects

1442: at Stage $i$ based only on the unit clauses, and

1443: each non-simplified clause loses at most one literal at each stage)},

1444: \[ |\Pr[\Phi_{2}\in \HSAT]- \Pr[\Phi^{\prime}\in \HSAT]|= o(1).

1445: \]

1446: By the monotonicity of SAT and the randomness of

1447: $\Phi_{1},\Phi_{2}, \Phi^{'}$ we have

1448:

1449: \[ \Pr[\Phi_{3}\in \HSAT]-o(1)\leq \Pr [\Phi_{2} \in \HSAT] \leq

1450: \Pr[\Phi_{1}\in \HSAT]+o(1).

1451: \]

1452: Taking limits it follows that

1453:

1454: \begin{eqnarray*}

1455: {\overline{\lim}_{n\goesto \infty} \Pr}_{\Phi\in

1456: \Omega(k(n),n,c_{3}H_{k(n)}/n)} [\Phi \in \HSAT] & \leq 1-F_{1}(e^{-c_2})

1457: \leq & \\

1458: {\underline{\lim}_{n\goesto \infty} \Pr}_{\Phi \in

1459: \Omega(k(n),n,c_{1}H_{k(n)}/n)} [\Phi \in \HSAT] .

1460: \end{eqnarray*}

1461: Since $c_{1},c_{2},c_{3}$ were chosen arbitrarily,

1462: by choosing $c_{1}=c, c_{2}=c+\epsilon$, and $c_{2}=c-\epsilon, c_{3}=

1463: c$, respectively, we infer that

1464:

1465: \begin{eqnarray*}

1466: 1-F_{1}(e^{-(c+\epsilon)})\leq  {\underline{\lim}_{n\goesto

1467: \infty} \Pr}_{\Phi\in \Omega(k(n),n,cH_{k(n)}/n)}[\Phi \in

1468: \HSAT]  \leq & \\

1469: {\overline{\lim}_{n\goesto \infty}\Pr}_{\Phi \in

1470: \Omega(k(n),n,cH_{k(n)}/n)}[\Phi \in \HSAT]\leq

1471: 1-F_{1}(e^{-(c-\epsilon)}).

1472: \end{eqnarray*}

1473: As $\epsilon$ is arbitrary, we get the desired result.

1474: \qed

1475:

1476: \begin{observation}\label{obs:coupling}

1477: One point about the previous proof that is intuitively clear, but gets

1478: somewhat obscured by the technical details of the proof, is that if

1479: $\Phi_{2} \in \Omega({\bf n},n,c_{2}\cdot 2^{n})$

1480: then $\Phi^{'}$ behaves ``for all practical purposes''

1481: as if it were a uniform formula in $\Omega({\bf k(n)},n,c_{2}\cdot

1482:  \frac{H_{k(n)}}{n})$. We will use a similar

1483: intuition in the proof of Theorem~\ref{annealed}.

1484: \end{observation}

1485:  \vspace{5mm}

1486:

1487: \section{The uniformity lemma}

1488:

1489: The following lemma is the analog of Lemma~\ref{k:inf:recurrence}

1490: for the case $k=2$, and the basis for our analysis of this case:

1491: \vspace{5mm}

1492:

1493: \begin{lemma}\label{k:2:recurrence}

1494: Suppose that $\Phi$ survives up to stage $t$. Then, conditional on

1495: $(P_{1,t}, N_{1,t}, P_{2,t}, N_{2,t})$, the clauses in

1496: $\Phi_{1,t}^{P},

1497: \Phi_{1,t}^{N}, \Phi_{2,t}^{P}, \Phi_{2,t}^{N}$ are chosen uniformly

1498: at random and are independent. Also, conditional on the

1499: fact that $\Phi$ survives stage $t$ as well, the following recurrences

1500: hold:

1501: \begin{equation}\label{k:2:markovchain}

1502: \left \{\begin{array}{l}

1503:          P_{1,t-1}=P_{1,t}-1-\Delta_{1,t}^{P}+\Delta_{12,t}^{P}, \\

1504:          N_{1,t-1}=N_{1,t}+\Delta_{12,t}^{N},                    \\

1505:          P_{2,t-1}=P_{2,t}-\Delta_{12,t}^{P}-\Delta_{02,t}^{P},  \\

1506:          N_{2,t-1}=N_{2,t}-\Delta_{12,t}^{N},                    \\

1507:         \end{array}

1508: \right.

1509: \end{equation}

1510: where (in distribution)

1511: \begin{equation}\label{k:2:distribution}

1512: \left \{\begin{array}{l}

1513: \Delta_{1,t}^{P} =B(P_{1,t}-1,1/t),\\

1514: \Delta_{12,t}^{P}=B(P_{2,t},1/t),\\

1515: \Delta_{02,t}^{P}=B(P_{2,t}-\Delta_{12,t}^{P},1/t),\\

1516: \Delta_{12,t}^{N}=B(N_{2,t},2/t).\\

1517: \end{array}

1518: \right.

1519: \end{equation}

1520: \end{lemma}

1521:  \vspace{5mm}

1522:

1523: \beginproof

1524: A formula will be represented by an

1525: $m\times 2$ table. The rows

1526: of the table correspond to clauses in the formula and the entries are

1527: its literals. They are gradually unveiled as the algorithm proceeds.

1528: We assume that when generating $\Phi$ we mark those

1529: clauses containing only one literal (so that we know their location,

1530: but not their content).

We say that a row (or a clause) is ``blocked'' if the clause is either
already satisfied or has been turned into the empty
clause.
Suppose $\PUR$ arrives at stage $t$ on $\Phi$.  Then in each of the stages
$i=n, n-1, \ldots, t+1$, $\Phi_i$ must have contained a unit clause
consisting of a positive literal, but must not have contained
complementary unit clauses of the same variable.

1538: To carry out the disclosure at stage $i$, let $x$ be the variable set

1539: to one in this stage. We assume that the formula unveils

1540: all occurrences of $x$ or $\overline{x}$ in $\Phi$. For each clause we

1541: perform the following:

1542:

1543: \begin{enumerate}

1544: \item if it contains $x$ we unveil all its literals and block;

1545: \item otherwise we do nothing.

1546: \end{enumerate}

1547: The clauses of $\Phi_{t}$ having size two correspond to the rows of

1548: $\Phi$

1549: that contain no unveiled literal.

The clauses of size one are either the clauses of
size one in $\Phi$ that contain none of the chosen literals, or the
clauses of size two that contain the negation of one chosen variable
and another variable that is yet to be chosen.

1554: Given these observations the uniformity and independence follow from

1555: the way we construct $\Phi$.

1556:

1557: To prove the recurrences, let $x$ be the variable set to

1558: one in stage $t$ (it exists since \PUR\ does not halt at

1559: this stage). By uniformity and independence, each of the $P_{1,t}-1$

1560: positive unit clauses of $\Phi_{t}$, other than the chosen one, is

1561: equal to $x$ with probability $1/t$ (since there are $t$ variables

1562: left at this stage). On the other hand, the positive unit clauses of

1563: $\Phi_{t-1}$ that are not present

1564: in $\Phi_{t}$ can only come from clauses of size two of $\Phi_{t}$

1565: that contain $\overline{x}$ and a positive literal (therefore counted

by $P_{2,t}$). Uniformity and independence therefore imply that
$\Delta_{1,t}^{P}$ and $\Delta_{12,t}^{P}$ have the distributions claimed in
(\ref{k:2:distribution}). The other relations can be

1569: justified similarly (noting that, since \PUR\ does not reject at this

1570: stage, every negative unit clause of $\Phi_{t}$ is also present in

1571: $\Phi_{t-1}$).

1572:

1573:

1574:

1575: It will be useful to consider the Markov chain

1576: (\ref{k:2:markovchain}) for all

1577: values of $t=n,\ldots, 0$ (even when the algorithm halts). To

1578: accomplish that, the ``minus'' signs in the first equation of

1579: (\ref{k:2:markovchain}) and the definition of $\Delta_{1,t}^{P}$

1580: should be replaced by $\cminus$. We also need to specify the

1581: distribution of each component of

the tuple $(P_{1,n}, N_{1,n}, P_{2,n}, N_{2,n})$. Let $\Delta_{n}$ be
a random variable having the binomial distribution $B(cn,
\frac{2n}{2n+3{{n}\choose {2}}})$. It is easy to see that, in
distribution,
\begin{equation}\label{k:2:initial:condition}
\left \{\begin{array}{l}
         P_{1,n}=B(\Delta_{n},1/2),\\
         N_{1,n}=\Delta_{n}-P_{1,n},\\
         P_{2,n}=B(cn-\Delta_{n},2/3),\\
         N_{2,n}=cn-\Delta_{n}-P_{2,n}.\\
        \end{array}
\right.
\end{equation}

1596: \endproof

1597: \qed
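
To make the initial condition (\ref{k:2:initial:condition}) concrete, the following short computation sketches where its parameters come from. It assumes, as the parameter of $\Delta_{n}$ suggests, that each of the $cn$ clauses is drawn uniformly from the $2n$ unit clauses and the $3{{n}\choose{2}}$ Horn clauses of size two (three for each pair of variables, two of which contain a positive literal):
\[
\frac{2n}{2n+3{{n}\choose{2}}}=\frac{2n}{\frac{n(3n+1)}{2}}=\frac{4}{3n+1},
\qquad
E[\Delta_{n}]=\frac{4cn}{3n+1}\goesto \frac{4c}{3},
\qquad
E[P_{2,n}]=\frac{2}{3}\left(cn-E[\Delta_{n}]\right)=\frac{2}{3}cn-O(1).
\]
Under this reading the formula starts with a bounded expected number of unit clauses, roughly half of them positive, while about two thirds of the remaining clauses are counted by $P_{2,n}$.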

1598:

1599: \section{Proof of Theorem~\ref{k:2}}

1600:

1601:

1602:

1603: The main intuition for the proof is that in ``most interesting stages''

1604: $\Delta_{1,t}^{P}=0$ and $\Delta_{12,t}^{P}$ is approximately

1605:  Poisson distributed. Therefore,  $P_{1,t}$ qualitatively

behaves like the Markov chain $(Q_{t})_{t}$ defined by

1607: \begin{equation}

1608: \left \{\begin{array}{l}

1609:         Q_{n+1}=1,\\

1610:         Q_{t-1}=Q_{t}\cminus 1+Po(\lambda),\\

1611: \end{array}

1612: \right.

1613: \end{equation}

1614: where $\lambda=2c/3$.

1615: This explains the closed form of the limit probability: a well-known

1616: result

1617: states that $\rho$, the probability that the queuing chain $Q_{t}$

1618: reaches

1619: state 0, satisfies the equation $\rho= \Phi(\rho)$, where

1620: $\Phi(t)=e^{\lambda(t-1)}$ is the generating function of the

1621: arrival distribution $Po(\lambda)$.
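
The value $\lambda=2c/3$ and the fixed-point equation can be illustrated by a short calculation; this is a heuristic sketch based on the recurrences (\ref{k:2:markovchain}) and (\ref{k:2:distribution}), not a step of the proof. Taking conditional expectations in the first recurrence, and using $P_{2,t}\approx \frac{2}{3}cn$ and $t\approx n$ in the early stages,
\[
E[P_{1,t-1}-P_{1,t}\mid P_{1,t},P_{2,t}]=-1-\frac{P_{1,t}-1}{t}+\frac{P_{2,t}}{t}\approx -1+\frac{2c}{3}=\lambda-1,
\]
so the drift changes sign exactly at $c=3/2$. For the chain $Q_{t}$ the relevant solution of $\rho=e^{\lambda(\rho-1)}$ is the smallest one in $[0,1]$: it equals $1$ when $\lambda\leq 1$, while for instance $\lambda=2$ (that is, $c=3$) gives $\rho\approx 0.2032$, since $e^{2(0.2032-1)}=e^{-1.5936}\approx 0.2032$.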

1622: We will define a suitable value $\omega_{0}$ such that:

1623: \begin{enumerate}

1624: \item With high probability \PUR\ does not reject in any of stages $n,

1625: \ldots, n-\omega_{0}$.

1626: \item \PUR\ accepts ``mostly before or at stage $n-\omega_{0}$'' (i.e.

1627: the

1628: probability that \PUR\ accepts after stage $n-\omega_{0}$, given that

1629: $\Phi$ survives this far is $o(1)$).

\item With high probability, for every $t\in \{n, \ldots, n-\omega_{0}\}$,

1631: $\Delta_{1,t}^{P}=0$.

1632: \item At stages $n,\ldots, n-\omega_{0}$, $P_{1,t}$ is ``very close''

1633: to $Q_{t}$, with respect to total variation distance.

1634: \end{enumerate}

1635:

1636: This program can be accomplished as described if $c< 3/2$. To prove

1637: Property 4 we make use of Lemmas~\ref{b:h:j} and

1638: \ref{occupancy}. Property 2 is proved only implicitly: in this

1639: case (see \cite{hoel:port:stone}) the probability that $Q_{i}=0$ for

1640: some $i$ tends to one, and, in fact, by a technical result due to

1641: Frieze and Suen (Lemma 3.1 in \cite{frieze-suen}), $\Pr[Q_{i}=0\mbox{

1642: for some

1643: }i\geq n - \log n]$ is $1-o(1)$.

1644:

1645:

1646: Let us now concentrate on the case when $c>3/2$ (the case when $c=3/2$

1647: will

1648: follow by a monotonicity argument). In the previous argument we only

1649: used

1650: the fact

1651: that $c<3/2$ when deriving the probability that $Q_{t}$ hits state 0,

1652: hence the

arguments from above carry over, and the conclusion is that the

1654: probability that \PUR\ accepts at one of the stages $n,\ldots,

1655: n-\omega_{0}$ differs by $o(1)$ from the probability that $Q_{t}=0$

1656: somewhere in this range. We now, however, have to consider the

1657: probability that \PUR\ accepts at some stage later than $n-\omega_{0}$

1658: and aim to prove that this probability is $o(1)$. It is conceptually

1659: simpler to divide

1660: the interval $[n-\omega_{0},0]$ into two subintervals, $[n-\omega_{0},

1661: n-\omega_{1}]$ and its complement, such that

1662: w.h.p. $\Phi_{n-\omega_{1}}$ (if defined) contains two opposite unit

1663: clauses, therefore

1664: the probability that \PUR\ accepts after stage $n-\omega_{1}$ is

1665: $o(1)$. In the range $[n-\omega_{0},n-\omega_{1}]$ we would like to

1666: prove that ``most of the time'' $\Delta_{1,t}^{P}$ is zero and

1667: $P_{1,t}$ is ``close'' to $Q_{t}$ and to reduce the problem to the

1668: analysis of $Q_{t}$. Unfortunately there are two problems with this

1669: approach: although the probability that each individual

1670: $\Delta_{1,t}^{P}>0$ is fairly small, to make $\Phi_{n-\omega_{1}}$

1671: unsatisfiable w.h.p., $\omega_{1}$ has to

1672: be $\omega(\sqrt n)$. This implies

1673: that we cannot sum these probabilities over

1674: $[n-\omega_{0},n-\omega_{1}]$ and expect the sum to be $o(1)$; a

1675: similar problem arises if we want to sum the upper bounds for

1676: $d_{TV}(\Delta_{12,t}^{P},Po(\lambda))$.

1677:

1678:

1679: Fortunately there is a way to circumvent this, avoiding the use

1680: of total variation distance altogether: although we cannot guarantee

1681: that

1682: w.h.p. each $\Delta_{1,t}^{P}=0$, we can arrange that w.h.p. for every

1683: sequence of $p$ consecutive stages $t, t-1, \ldots t-p+1$,

1684: $\Delta_{1,t}^{P}+\Delta_{1,t-1}^{P}+\ldots +\Delta_{1,t-p+1}^{P}\leq

1685: 3$ (*). Intuitively, in any sequence of $p$ consecutive steps at most

1686: $p+3$

1687: clients leave the queue, and the number of those who arrive is the sum

1688: of $p$ approximately Poisson variables, thus approximately Poisson

1689: with parameter $p\lambda$. Choosing $p$ large enough so that $\lambda

1690: >1+\frac{3}{p}$ ensures that in any $p$ steps {\em the average number

1691: of

1692: customers that arrive is strictly larger than the number of customers

1693: that are served in this time span}. Therefore we will seek to

1694: approximate

1695: $P_{1,t}$ by a queuing chain $\overline{Q}_{t}$ with this

1696: property. Since $P_{1,n-\omega_{0}}=\overline{Q}_{n-\omega_{0}}$ is

1697: ``large,'' an elementary analysis of the queuing chain implies

1698: that the probability that

1699: $\overline{Q}_{t}$ hits state 0 in the interval

1700: $[n-\omega_{0},n-\omega_{1}]$ is exponentially small. So we obtain the

1701: desired result if $\overline{Q}_{t}$ is constructed so that it is

1702: stochastically dominated by $P_{1,t}$.
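
A numerical instance may help fix the role of $p$ (the specific values are chosen only for illustration). For $c=2$ we have $\lambda=2c/3=4/3$, so the requirement $\lambda>1+\frac{3}{p}$ reads $\frac{3}{p}<\frac{1}{3}$, that is, $p>9$. With $p=10$,
\[
\underbrace{p+3}_{\mbox{\scriptsize maximum departures}}=13
\quad < \quad
\underbrace{p\lambda}_{\mbox{\scriptsize mean arrivals}}=\frac{40}{3}\approx 13.33,
\]
so over every block of ten consecutive stages the queue has positive drift even when condition (*) is used at full strength.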

1703:

1704: \subsection{The case $c<3/2$}

1705:   Define $\omega_{0}=n^{0.1}$.

1706: The following are the main steps of the proof in this case:

1707: \vspace{5mm}

1708:

1709: \begin{lemma}\label{small:p2}

1710: With probability $1-o(1/poly)$ for every $t\in [n,\ldots , n/2]$ we

1711: have  $$\Delta_{12,t}^{P},\Delta_{02,t}^{P}, \Delta_{12,t}^{N}\leq

1712: \frac{1}{2}n^{0.1}.$$

1713: \end{lemma}

1714: \vspace{5mm}

1715:

1716: \beginproof

Use the coupling with $m_{1}=P_{2,t}$ (respectively $N_{2,t}$),
$m_{2}=cn$, $\tau = 1/t$, and apply the Chernoff bound to
$B(cn,1/t)$.

1720: \endproof

1721: \qed
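
The Chernoff step can be made explicit as follows; this sketch uses the standard form $\Pr[X\geq a]\leq (e\mu/a)^{a}$, valid for a binomial variable $X$ with mean $\mu<a$, which is one of several bounds that would do. For $t\geq n/2$ the variable $B(cn,1/t)$ has mean $\mu\leq 2c$, so with $a=\frac{1}{2}n^{0.1}$
\[
\Pr\left[B(cn,1/t)\geq \frac{1}{2}n^{0.1}\right]\leq \left(\frac{4ec}{n^{0.1}}\right)^{\frac{1}{2}n^{0.1}},
\]
which decays faster than any fixed power of $n$. The same computation with mean at most $4c$ handles $B(cn,2/t)$, and a union bound over the at most $n/2+1$ stages and the three variables preserves the $o(1/poly)$ estimate.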

1722: \vspace{5mm}

1723:

1724: \begin{corrolary}\label{p:2:t}

1725: Consider $\omega \leq n/2$.

1726: If for every $t\in [n,\ldots , n/2]$,

1727: $\Delta_{12,t}^{P},\Delta_{02,t}^{P}, \Delta_{12,t}^{N}\leq

1728: \frac{1}{2}n^{0.1}$ then, for all $t\in

1729: [n,\ldots, n-\omega]$, $P_{1,t}, N_{1,t}, |P_{2,t}-P_{2,n}|,

1730: |N_{2,t}-N_{2,n}| <

1731: (n-t)\cdot n^{0.1}$.

1732: \end{corrolary}

1733: \vspace{5mm}

1734:

1735: \begin{lemma}\label{small:delta1}

1736: If for all $t\in

1737: [n,\ldots, n-\omega]$, $P_{1,t}, N_{1,t}, |P_{2,t}-P_{2,n}|,

1738: |N_{2,t}-N_{2,n}| <

1739: (n-t)\cdot n^{0.1}$ holds then

1740: w.h.p. $\Delta_{1,t}^{P}=0$ for every $t\in [n,\ldots , n-\omega_{0}]$.

1741: \end{lemma}

1742: \vspace{5mm}

1743:

1744: \beginproof

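For each such $t$,
$\Pr[B(P_{1,t}-1,\frac{1}{t})>0] = 1-\Pr[B(P_{1,t}-1,\frac{1}{t})=0]=
1-(1-\frac{1}{t})^{P_{1,t}-1}\leq \frac{P_{1,t}-1}{t}<
\frac{\omega_{0}n^{0.1}}{n-\omega_{0}}\leq 2n^{-0.8}$ for large $n$.
Summing over the $\omega_{0}+1$ stages gives a failure probability of
$O(n^{-0.7})=o(1)$.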

1747: \endproof

1748: \qed

1749: \vspace{5mm}

1750:

1751: \begin{lemma}\label{p:2:n}

1752: W.h.p., $|P_{2,n}-\frac{2}{3}cn|, |N_{2,n}-\frac{1}{3}cn|  < n^{0.6}$.

1753: \end{lemma}

1754: \vspace{5mm}

1755:

1756: \beginproof

1757: Directly from the Chernoff bounds on $\Delta_{n}$ and $P_{2,n}$.

1758: \endproof

1759: \qed

1760: \vspace{5mm}

1761:

1762: \begin{lemma}\label{p:2:t:poisson}

If the events in the conclusions of Corollary~\ref{p:2:t} and Lemma~\ref{p:2:n}
hold for

1765: $\omega = \omega_{0}$, $\epsilon_{1}=1/6$ and $\epsilon_{2}=0.1$, then

1766: there exists a constant

1767: $r>0$ such that for every $t=n, \ldots,n-\omega_{0}$,

1768: $|\frac{P_{2,t}}{t}-\frac{2}{3}c| \leq r n^{-0.4}$.

1769: \end{lemma}

1770: \vspace{5mm}

1771:

1772: \beginproof

1773:  We have

$$\left|\frac{P_{2,t}}{t}-\frac{2}{3}c\right| \leq
P_{2,t}\left|\frac{1}{t}-\frac{1}{n}\right|+
\frac{|P_{2,t}-P_{2,n}|}{n}+
\left|\frac{P_{2,n}}{n}-\frac{2}{3}c\right | \leq
P_{2,n}\frac{\omega_{0}}{n(n-\omega_{0})}+\frac{n^{0.2}}{n}+
n^{0.6-1},$$
by Corollary~\ref{p:2:t}, Lemma~\ref{p:2:n} and $n-\omega_{0}\leq t\leq n$, and the
result
immediately follows.

1783: \endproof

1784: \qed
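
For the reader's convenience, the orders of the three terms can be read off as follows (a sketch using only $\omega_{0}=n^{0.1}$ and the trivial bound $P_{2,n}\leq cn$):
\[
P_{2,n}\frac{\omega_{0}}{n(n-\omega_{0})}\leq \frac{c\,n^{0.1}}{n-n^{0.1}}=O(n^{-0.9}),
\qquad
\frac{n^{0.2}}{n}=n^{-0.8},
\qquad
n^{0.6-1}=n^{-0.4}.
\]
The last term dominates, so any constant $r>1$ works for large enough $n$.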

1785: \vspace{5mm}

1786:

1787: \begin{lemma}\label{distance}

1788: If the conclusions of Lemmas~\ref{p:2:t:poisson} and

1789: \ref{small:delta1} are true  then

1790: $$\sum_{t=n-\omega_{0}}^{n}d_{TV}(P_{1,t},Q_{t})=o(1/\omega_{0}).$$

1791: \end{lemma}

1792: \vspace{5mm}

1793:

1794: \beginproof

1795: By

1796: Lemma~\ref{p:2:t:poisson} and the inequalities on

1797: total variation distance there exist $r_{1},r_{2}>0$ such that

1798: \begin{eqnarray*}

1799: d_{TV}(\Delta_{12,t}^{P},Po(\lambda)) & \leq &

1800: d_{TV}\left(\Delta_{12,t}^{P}, Po\left(\frac{P_{2,t}}{t}\right)\right)+

1801: d_{TV}\left(Po\left(\frac{P_{2,t}}{t}\right)

1802: ,Po\left(\frac{2}{3}c\right)\right) \\ & \leq & r_{1}\frac{1}{t}+r_{2}n^{-0.4}\leq

1803: r_{3}n^{-0.4},

1804: \end{eqnarray*} where $r_{3}=r_{1}+r_{2}$. Employing

1805: Lemma~\ref{approximation} it follows that

$$\sum_{t=n-\omega_{0}}^{n}d_{TV}(P_{1,t},Q_{t})\leq
r_{3}\sum_{t=n-\omega_{0}}^{n}(n-t)n^{-0.4}=
r_{3}n^{-0.4}\frac{\omega_{0}(\omega_{0}+1)}{2},$$ and this amount is
$o(1/\omega_{0})$.

1810: \endproof

1811: \qed
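
The final estimate is a matter of comparing exponents (recall $\omega_{0}=n^{0.1}$):
\[
n^{-0.4}\,\omega_{0}^{2}=n^{-0.4+0.2}=n^{-0.2},
\qquad
\frac{1}{\omega_{0}}=n^{-0.1},
\qquad
\frac{n^{-0.2}}{n^{-0.1}}=n^{-0.1}\goesto 0,
\]
so a bound of order $n^{-0.4}\omega_{0}^{2}$ is indeed $o(1/\omega_{0})$.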

1812: \vspace{5mm}

1813:

1814: \begin{observation}

1815: The probability that the conditions in the previous lemma are not

1816: fulfilled is at most $\omega_{0}^{4}/n = n^{-0.6}$. Indeed,

1817: the events that ensure the applicability of the previous lemma are:

1818: \begin{enumerate}

1819: \item for every $t\in [n,\ldots , n/2]$,

1820: $\Delta_{12,t}^{P},\Delta_{02,t}^{P}, \Delta_{12,t}^{N}\leq

1821: \frac{1}{2}n^{0.1}$,

1822: \item for all $t\in

1823: [n,\ldots, n-\omega_{0}]$, $\Delta_{1,t}^{P}=0$, and

\item  $|P_{2,n}-\frac{2}{3}cn|, |N_{2,n}-\frac{1}{3}cn| < n^{0.6}$.

1825: \end{enumerate}

1826: The first and the third events have probability $1-o(1/poly)$ (as they

1827: come from applying Chernoff bounds). The second fails (for a specific

$t$) with probability at most $\frac{P_{1,t}}{t}\leq

1829: \omega_{0}^{2}/(n-\omega_{0})$, so its total

1830: probability is at most $\omega_{0}\cdot

1831: \omega_{0}^{2}/(n-\omega_{0})$. Both terms can be absorbed into

1832: $\omega_{0}^{4}/n$.

1833:

1834: \end{observation}

1835: \vspace{5mm}

1836:

1837:

1838: \begin{lemma}\label{no:reject}

If the event in Corollary~\ref{p:2:t} holds then

1840: w.h.p. \PUR\ does not reject at stage $t$, for every $t$ in the range

1841: $n$, $n-1, \ldots, n-\omega_{0}$, given that $\Phi$ survives up to

1842: this stage.

1843: \end{lemma}

1844: \vspace{5mm}

1845:

1846: \beginproof

1847: To prove Lemma~\ref{no:reject} we show that,

1848: with high probability the unit clauses of each $\Phi_{t}$ involve

1849: different variables. This can be seen as follows: consider

1850: $P_{1,t}+N_{1,t}$ balls to be thrown into $t$ urns. The probability

1851: that two of them arrive in the same urn is at most

${{P_{1,t}+N_{1,t}}\choose {2}}\cdot \frac{1}{t}$. Since
$P_{1,t}+N_{1,t}<2\omega_{0}n^{0.1}$, this is upper
bounded
by $\frac{2(\omega_{0}n^{0.1})^{2}}{n-\omega_{0}}$. Summing this for
$t=n, \ldots,
n-\omega_{0}$ yields an upper bound of $O(n^{-0.5})$, which is $o(1)$.

1857: \endproof

1858: \qed

1859: \vspace{5mm}

1860:

1861: The proof for the case $c<3/2$ follows easily from these results:

1862: with probability $1-o(1)$ all the events in Lemmas~\ref{small:p2},

1863: \ref{p:2:t}, \ref{small:delta1}, \ref{p:2:n}, \ref{distance}, and

1864: \ref{no:reject} take place, therefore

1865: \PUR\ does not reject at any of the stages $n$ to $n-\omega_{0}$ and

1866: $P_{1,t}$ is close to $Q_{t}$ in the sense of

1867: Lemma~\ref{distance}. Therefore the probability that for some $t$ in

1868: this range $P_{1,t}=0$ (i.e. \PUR\ accepts) differs by $o(1)$ from the

1869: corresponding probability for $Q_{t}$. But according to the result by

1870: Frieze and Suen \cite{frieze-suen} this latter probability is $1-o(1)$.

1871: \endproof

1872:

1873:

1874: \subsection{The case $c>3/2$}

1875: Define $\omega_{1}=n^{0.51}$. The following

1876: are the auxiliary results we use in this case:

1877:

1878: \vspace{5mm}

1879:

1880: \begin{lemma}\label{trick:delta} Let $A=n^{0.61}$.

1881: For every $k>0$ there exists a constant $c_{k}>0$ such that for every

$r>0$ the probability that there exists $t\in [n-\omega_{0},
n-\omega_{1}]$ with $\Delta_{1,t}^{P}+ \Delta_{1,t-1}^{P}+\ldots +

1884: \Delta_{1,t-r+1}^{P}\geq k$ is at most

1885: $c_{k}(\omega_{1}-\omega_{0})(rA/n)^{k}$.

1886: \end{lemma}

1887: \vspace{5mm}

1888:

1889: \beginproof

1890: By Corollary~\ref{p:2:t} we can assume that $P_{1,t}\leq A$. Then for

1891: every $i$,

1892:

\begin{eqnarray*}
\Pr[\Delta_{1,t}^{P}\geq i] & = & \Pr[B(P_{1,t}-1,\frac{1}{t})\geq i]\leq
\Pr[B(A,\frac{1}{t})\geq i] \\
& = & 1 -
\sum_{j=0}^{i-1}{{A}\choose{j}}\left(\frac{1}{t}\right)^{j}\left(1-\frac{1}{t}\right)^{A-j} \\
& \leq & {{A}\choose{i}}\left(\frac{1}{t}\right)^{i}.
\end{eqnarray*}

The event $\Delta_{1,t}^{P}+ \Delta_{1,t-1}^{P}+\ldots +
\Delta_{1,t-r+1}^{P}\geq k$ happens when:
\begin{itemize}
\item one of the terms is at least $k$, or
\item one of the terms is at least $k-1$, and another one is at
least 1,  or
\item \ldots
\item at least $k$ of the terms are at least one.
\end{itemize}
(a finite number of possibilities). Applying the previous inequality,
and taking into account that $r,k$ are fixed, immediately proves the
lemma.

1915: \endproof
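
To see what the bound gives for the values used here, substitute $A=n^{0.61}$ and $\omega_{1}=n^{0.51}$ (a sketch for fixed $r$ and $k$, using $\omega_{1}-\omega_{0}\leq \omega_{1}$):
\[
c_{k}(\omega_{1}-\omega_{0})\left(\frac{rA}{n}\right)^{k}\leq c_{k}r^{k}\,n^{0.51-0.39k}.
\]
For $k=2$ the exponent is $-0.27$, so the bound is already $o(1)$; for $k=4$, the threshold relevant to condition (*), the exponent is $-1.05$ and the bound is $o(1/n)$.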

1916: \vspace{5mm}

1917:

1918: To flesh out the argument outlined before we construct a

1919: succession of Markov chains running along $P_{1,t}$,

1920: that provide better and better ``approximations'' to

1921: $\overline{Q}_{t}$.

1922: Our use of indices will be slightly nonstandard (to reflect

1923: the connection with $P_{1,t}$), in that

1924: the sequence of indices starts with $n-\omega_{0}$ and is decreasing.

1925: \vspace{5mm}

1926:  \begin{definition}

1927: Let

1928: $X_{n-\omega_{0}}=Y_{n-\omega_{0}}=Z_{n-\omega_{0}}=\overline{Q}_{n-\omega_{0}}=

1929: P_{1,n-\omega_{0}}$

1930: and

\begin{equation}\label{sequences}
\left \{\begin{array}{l}
         X_{t-1}=X_{t}-(p+3)\chi_{p{\bf Z}+1}(n-\omega_{0}-t)+\Delta_{12,t}^{P},\\
         Y_{t-1}=Y_{t}-(p+3)\chi_{p{\bf Z}+1}(n-\omega_{0}-t)+B(P_{2,n-\omega_{1}}, 1/t),\\
         Z_{t-1}=Z_{t}-(p+3)\chi_{p{\bf Z}+1}(n-\omega_{0}-t)+B(P_{2,n-\omega_{1}},\frac{1}{n}),\\
         \overline{Q}_{t-1}=\overline{Q}_{t}-1+B(p\lfloor \frac{P_{2,n-\omega_{1}}}{p+3}\rfloor,\frac{1}{n}).\\
        \end{array}
\right.
\end{equation}

1948: \end{definition}

1949:

Let $\alpha = \Pr[(\exists t\in [n-\omega_{0},n-\omega_{1}]): P_{1,t}=0]$.

1951: Note

1952: that the amount $p+3$

1953: is subtracted from $X_{t}, Y_{t}, Z_{t}$ exactly once in every

1954: $p$ consecutive steps, so

1955: whenever the condition (*) is satisfied it holds that $X_{t}\leq

1956: P_{1,t}$ for every $t\in [n-\omega_{0},n-\omega_{1}]$. By coupling

1957: $\Delta_{12,t}^{P}(= B(P_{2,t}, 1/t))$ with $B(P_{2,n-\omega_{1}},1/t)$

1958: we

1959: deduce that we can couple $X_{t}$ and $Y_{t}$ so that $Y_{t}\leq

1960: X_{t}$. We can also couple $Y_{t}$ and $Z_{t}$ such that $Z_{t}\leq

1961: Y_{t}$.

1962: Finally, notice that we can couple $Z_{n-\omega_{0}-jp}$ and

1963: $\overline{Q}_{n-\omega_{0}-j(p+3)}$ such that

1964: $\overline{Q}_{n-\omega_{0}-j(p+3)}\leq  Z_{n-\omega_{0}-jp}$.

1965: So an upper bound on $\alpha$ is $\Pr[(\exists t\in [0,n-\omega_{0}]):

\overline{Q}_{t}=0]$. With high probability the binomial distribution
in the definition of the chain $\overline{Q}_{t}$ has mean
strictly greater than
one (because the flow from $P_{2,t}$ is approximately Poisson), and

1970: $\overline{Q}_{n-\omega_{0}}=\Omega(\omega_{0})$,

1971: therefore, by an elementary property of the queuing chain, the

1972: probability that $\overline{Q}_t$ hits state 0 is exponentially

1973: small. This yields the desired conclusion, that $\alpha =o(1)$.

1974:

1975: One word about the way to prove the fact that $\Phi_{n-\omega_{1}}$ is

1976: unsatisfiable (if defined): one can prove that w.h.p. both

1977: $P_{1,n-\omega_{1}}$ and $N_{1,n-\omega_{1}}$ are

$\Omega(\omega_{1})$. By the uniformity lemma (Lemma~\ref{k:2:recurrence})

1979: we are left with the following instance of the occupancy problem:

1980: there are

1981: $P_{1,n-\omega_{1}}$ white balls, $N_{1,n-\omega_{1}}$ black balls and

1982: $n-\omega_{1}$ bins. The desired fact now follows from the second part

1983: of Lemma~\ref{occupancy}.

1984:

1985:

1986: \section{Proof of Theorem~\ref{k:3etc}}

1987:

1988:

1989: Theorem~\ref{k:3etc} is proved along lines very similar to the proof of

1990: Theorem~\ref{k:2}. The basis is the following generalization of

1991: Lemma~\ref{k:2:recurrence}:

1992: \vspace{5mm}

1993:

1994: \begin{lemma}\label{k:3etc:recurrence}

1995: Suppose that $\Phi$ survives up to stage $t$. Then, conditional on the

1996: values

1997: $(P_{1,t}, N_{1,t},\ldots, P_{k,t}, N_{k,t})$, the clauses in

1998: $\Phi_{1,t}^{P},

1999: \Phi_{1,t}^{N},\ldots,  \Phi_{k,t}^{P}, \Phi_{k,t}^{N}$ are chosen

2000: uniformly

2001: at random and are independent. Also, conditional on the

2002: fact that $\Phi$ survives stage $t$ as well, the following recurrences

2003: hold:

\begin{equation}\label{k:3etc:markovchain}
\left \{\begin{array}{l}
         P_{1,t-1}=P_{1,t}-1-\Delta_{01,t}^{P}+\Delta_{12,t}^{P}, \\
         N_{1,t-1}=N_{1,t}+\Delta_{12,t}^{N},                    \\
         P_{i,t-1}=P_{i,t}-\Delta_{0i,t}^{P}-\Delta_{(i-1)i,t}^{P}+\Delta_{i(i+1),t}^{P}\mbox{, for }i=\overline{2,k},  \\
         N_{i,t-1}=N_{i,t}-\Delta_{(i-1)i,t}^{N}+\Delta_{i(i+1),t}^{N}\mbox{, for }i=\overline{2,k},  \\
        \end{array}
\right.
\end{equation}

2017: where (in distribution)

2018: \begin{equation}\label{k:3etc:distribution}

2019: \left \{\begin{array}{l}

2020: \Delta_{01,t}^{P} =B(P_{1,t}-1,1/t),\\

2021: \Delta_{(i-1)i,t}^{P}=B(P_{i,t},(i-1)/t),\\

2022: \Delta_{0i,t}^{P}=B(P_{i,t}-\Delta_{(i-1)i,t}^{P},1/t),\\

2023: \Delta_{(i-1)i,t}^{N}=B(N_{i,t},i/t),\\

2024: \Delta_{k(k+1),t}^{P}=\Delta_{k(k+1),t}^{N}=0.\\

2025: \end{array}

2026: \right.

2027: \end{equation}

2028: \end{lemma}

2029: \vspace{5mm}

2030:

2031: \beginproof

2032:

The uniformity condition and the justification of the recurrences are
entirely analogous to the ones from Lemma~\ref{k:2:recurrence}.
The additional technical complication is that now there is a ``positive
flow'' into $P_{2,t}, N_{2,t}$.

2038: \endproof

2039: \qed

2040: \vspace{5mm}

2041:

2042: \begin{lemma}

2043: With high probability it holds that

2044:

2045: \[

2046: P_{i,t}=(1+o(1))\cdot \frac{c}{n}\cdot \lambda_{k}\cdot i\cdot

2047: {{t}\choose

2048: {i}}\cdot S^{n+1-t}_{k-i},

2049: \]

2050:  and

2051: \[

2052: N_{i,t}=(1+o(1))\cdot \frac{c}{n}\cdot \lambda_{k}\cdot {

2053: {t}\choose

2054: {i}}\cdot S^{n+1-t}_{k-i},

2055: \]

2056: for every $i\geq 2$, and uniformly on $t=n-o(n)$.

2057: \end{lemma}

2058: \vspace{5mm}

2059:

2060: \beginproof

2061:

2062: Let us first heuristically derive a formula for $x_{i,t}$, $y_{i,t}$,

2063: the expected values of $P_{i,t}$, $N_{i,t}$,

2064: obtained by replacing the binomial distributions in the equations by

2065: their expected values.

2066:

2067: We have:

2068: \begin{equation}\label{k:3etc:markovchain:avg}

2069: \left \{\begin{array}{l}

2070:                  x_{i,t-1}=x_{i,t}-

2071: \frac{x_{i,t}}{t}-\frac{(i-1)x_{i,t}}{t}+\frac{ix_{i+1,t}}{t}\mbox{, for }i=\overline{2,k},  \\

2072:

2073: y_{i,t-1}=y_{i,t}-\frac{iy_{i,t}}{t}+\frac{(i+1)y_{(i+1),t}}{t}\mbox{,\ \

2074:  \  for }i=\overline{2,k},  \\

2075:         \end{array}

2076: \right.

2077: \end{equation}

2078: Rearranging terms the recurrences become

2079: \begin{equation}\label{k:3etc:markovchain:avg:simple}

2080: \left \{\begin{array}{l}

2081:

2082: x_{i,t-1}=x_{i,t}(1-\frac{i}{t})+x_{i+1,t}\frac{i}{t}\mbox{, for }i=\overline{2,k},  \\

2083:

2084: y_{i,t-1}=y_{i,t}(1-\frac{i}{t})+y_{(i+1),t}\frac{(i+1)}{t}\mbox{,\ \

2085:  \  for }i=\overline{2,k}. \\

2086:         \end{array}

2087: \right.

2088: \end{equation}

2089: Also,

2090: \begin{equation}\label{k:3etc:markovchain:begin}

2091: \left \{\begin{array}{l}

2092:          x_{i,n}= \frac{i{{n}\choose{i}}}{H_{k}}\cdot

2093:          c\lambda_{k}\cdot \frac{H_{k}}{n}=

2094:          \frac{c}{n}\lambda_{k}\cdot i{{n}\choose{i}},\\

2095:          y_{i,n}= \frac{{{n}\choose{i}}}{H_{k}}\cdot c\lambda_{k}\cdot

2096:          \frac{H_{k}}{n}= \frac{c}{n}\lambda_{k}\cdot

2097:          {{n}\choose{i}}.\\

2098: \end{array}

2099: \right.

2100: \end{equation}

2101: A simple induction shows that these expected

2102: values are $x_{i,t}= \frac{c}{n}\cdot \lambda_{k}\cdot i\cdot

2103: {{t}\choose

2104: {i}}\cdot S^{n+1-t}_{k-i}$, and $y_{i,t}= \frac{c}{n}\cdot

2105: \lambda_{k}\cdot {

2106: {t}\choose

2107: {i}}\cdot S^{n+1-t}_{k-i}$.

2108:

2109:

2110: The concentration property can be proved inductively, starting from

2111: $i=k$ towards $3$, by noting that the expected values of the

2112: binomial terms in the recurrence are

$\omega(n)$, hence, by the Chernoff bound,
the probability that they
deviate significantly from their expected values is exponentially
small.

2117:

2118: Almost the same argument holds for

$P_{2,t}$ and for $N_{2,t}$.

2120: The only amounts to be handled differently are ``the

2121: clause flows out of $P_{2,t}, N_{2,t}$,'' but they are approximately

2122: Poisson distributed, hence ``small'' with high probability by

2123: Proposition~\ref{chernoff:poisson}. Therefore $P_{2,t}=(1+o(1)) \frac{c}{n}\cdot

2124: \lambda_{k}\cdot 2\cdot {{t}\choose {2}}\cdot S^{n+1-t}_{k-2}$.

2125: \endproof

2126: \qed

2127:

The previous lemma implies that $\Delta_{12,t}^{P}\sim Po(c\cdot
\lambda_{k}\cdot S^{n+1-t}_{k-2})$ (for $t=n-o(n)$); thus in this range
$P_{1,t-1}\sim P_{1,t}-1+Po(c\cdot
\lambda_{k}\cdot S^{n+1-t}_{k-2})$. The proof follows exactly the same
pattern as in the case $c<3/2$ for $k=2$: the conclusion for the stages
$[n,n-\omega_{0}]$ is that the probability that $P_{1,t}$ is zero
somewhere in this range differs by $o(1)$ from the corresponding
probability for the queuing chain in (\ref{eq:3etc}). That the
stages after $[n,n-\omega_{0}]$ contribute $o(1)$ to the
final accepting probability follows from the fact that it is
possible to couple the Markov chain $M_{1}$, describing the evolution of
\PUR\ on a random $k$-SAT formula, and the chain $M_{2}$ that runs on the  2-CNF

2140: component of the formula, such that for every $t$ we have

2141: $P_{1,t}^{M_{2}}\leq

2142: P_{1,t}^{M_{1}}$. Perhaps the most intuitive way to see this coupling

2143: is to ``paint'' the initial clauses of the formula having size at most

2144: two in red, and the other clauses in blue. At every step $t$

2145: $P_{1,t}^{M_{2}}$ will count only red clauses having unit size at step

2146: $t$, while $P_{1,t}^{M_{1}}$ will count clauses of both colors.
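
The Poisson parameter quoted at the beginning of this argument can be checked directly from the preceding lemma; the following is a sketch that treats the binomial flow $B(P_{2,t},1/t)$ from clauses of size two into positive unit clauses through its mean:
\[
\frac{P_{2,t}}{t}=(1+o(1))\cdot \frac{c}{n}\cdot \lambda_{k}\cdot \frac{2}{t}{{t}\choose{2}}\cdot S^{n+1-t}_{k-2}
=(1+o(1))\cdot c\cdot \lambda_{k}\cdot \frac{t-1}{n}\cdot S^{n+1-t}_{k-2},
\]
and $\frac{t-1}{n}=1-o(1)$ for $t=n-o(n)$, which gives the parameter $c\cdot \lambda_{k}\cdot S^{n+1-t}_{k-2}$.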

2147:

2148: Given the stochastic domination, the desired result follows from the

2149: corresponding proof in the case $k=2$.

2150: \qed

2151:

2152: \section{Proof of Proposition~\ref{annealed}}

2153:

2154: The idea of the proof is to consider \PUR\ on a random at-most-$k$-Horn

2155: formula $\Phi$ with

2156: $\hat{c}\cdot \frac{H_{k}}{n}$ clauses and prove that there exists a

2157: function $\phi(k)$ with $\lim_{k\goesto \infty}\phi(k)=0$ such

2158: that

2159: \[

2160: \lim_{n\goesto \infty}

2161: \Pr[\PUR\ \mbox{ accepts in at least } k\mbox{ steps }]\leq \phi(k).

2162: \]

Indeed, from the previous proof it follows that $\lim_{n\goesto
\infty}\Pr[\PUR\ \mbox{ accepts in }\geq k\mbox{ steps }]$ is governed by
the recurrence
\begin{equation}\label{rec}
x_{t+1}= x_{t}-1+Po(\hat{c}\cdot S^{t+1+k}_{k-2}),
\end{equation}

2169: where

2170: \[

2171: x_{0}= P_{1,k}\geq 1.

2172: \]

2173: We define $\phi(k)$ to be the probability that the sequence in the

2174: recurrence (\ref{rec}) hits zero. Trivially $\lim_{k\goesto

2175: \infty} S^{k+1}_{k-2}=\infty$, so the expected values of the Poisson

2176: distributions in (\ref{rec}) can be made larger than any given constant

$\lambda$. Using the fact that the sum of two independent Poisson random variables
with parameters $a$ and $b$ has a Poisson distribution with parameter
$a+b$ it follows that, for large enough $k$, one can couple $x_{t}$
with the
queuing chain
\begin{equation}\label{rec2}
y_{t+1}= y_{t}-1+Po(\lambda),
\end{equation}

2185:

2186: \[

2187: y_{0}= 1,

2188: \]

2189: such that $y_{t}\leq x_{t}$. It follows that, for large $k$,

2190: $\phi(k)\leq \Pr[\mbox{ the chain $y_{t}$ hits state

2191: zero}]$. Since $\lambda$ was arbitrary, it follows that $\lim_{k\goesto

2192: \infty}\phi(k)=0$.
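
The coupling invoked above can be written out in one line; the symbols $\mu$, $U$ and $V$ are introduced here only for illustration. If $\mu\geq \lambda$, let $U$ and $V$ be independent with distributions $Po(\lambda)$ and $Po(\mu-\lambda)$. Then
\[
U+V\ \mbox{ has distribution }\ Po(\mu)\qquad \mbox{and}\qquad U\leq U+V,
\]
so feeding $U$ to the chain $y_{t}$ and $U+V$ to the chain $x_{t}$ at every step, and starting from $y_{0}=1\leq x_{0}$, keeps $y_{t}\leq x_{t}$ for as long as $y_{t}$ has not hit zero.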

2193:

2194: Now consider a random {\em uniform} Horn formula $\Phi$ with

2195: $\hat{c}\cdot \frac{H_{n}}{n}$ clauses, and let $\overline \Phi$ be

2196: its subformula consisting of clauses of size at most $k$. It is easily

2197: seen

2198: that the behavior of \PUR\ on the first $k-1$ steps depends only on

2199: the clauses of $\overline \Phi$, so

2200: \[

2201: \Pr[\PUR\ \mbox{ accepts }\Phi\mbox{ in less than }k\mbox{

2202: steps}]=\Pr[\PUR\ \mbox{ accepts }\overline

2203:  \Phi\mbox{ in less than }k\mbox{ steps}].

2204: \]

2205: On the other hand we have

2206:

2207: \[0\leq \Pr[\PUR\ \mbox{ accepts }\Phi\mbox{ in at least }k\mbox{

2208: steps}]\leq \Pr[\PUR\ \mbox{ accepts }\overline

2209:  \Phi\mbox{ in at least }k\mbox{ steps}].

2210: \]

2211: The fact that ``$\overline

2212:  \Phi$ is close to a random formula in $\Omega(n,k,c\cdot

2213:  \frac{H_{k}}{n})$'' (see the discussion in

2214: Observation~\ref{obs:coupling})

2215: implies that

2216: the right-hand side

2217: term can be made less than any fixed constant $\epsilon$ (for $n,k$

2218: big enough). It follows that

2219:

2220: \[

2221: |\Pr[\PUR\ \mbox{ accepts }\Phi]-\Pr[\PUR\ \mbox{ accepts }\overline

2222:  \Phi]|\leq 2\cdot \epsilon,

2223: \]

2224: for large enough values of $n,k$. This immediately implies the desired

2225: result.

2226: \endproof

2227: \qed

2228:

2229:

2230: \section{Proof of Theorem~\ref{k:2:runtime}}

The proof of Theorem~\ref{k:2:runtime} is based on the
proof of Theorem~\ref{k:2}
and an elementary property of the queuing chain $Q_{t}$
(the expected time to hit state zero, conditional on actually hitting
it, has the desired form).

2236:

2237: The crucial point is to prove that the probabilities that any of the

2238: conditions we have employed in our analysis fails have a negligible

2239: effect on the running time.

2240:

This is easy to see for stages smaller than $n-\omega_{0}$: since the
probabilities that the various steps of the analysis
fail are either exponentially small or can be made $o(1/n)$ (by choosing a
large enough $k$ in Lemma~\ref{trick:delta}),
the probability that $P_{1,t}$ hits state zero after
stage $n-\omega_{0}$ is $o(1/n)$, and therefore its influence on
the average running time of \PUR\ is $o(1)$.

2248: The corresponding observation  is not true for stages before

2249: $n-\omega_{0}$, but these stages can be handled directly, using the

2250: statement from

2251: Lemma~\ref{distance}.

2252:

2253: \endproof

2254: \qed

2255:

2256:

2257: \section{Random Horn satisfiability as a mean-field

2258: approximation}\label{section:4}

2259:

What we have shown so far is that (under a suitably rescaled
picture) the rescaled probability graphs for random at-most-$k$ Horn
satisfiability converge to the graph for random Horn satisfiability.
To be able to argue that our results display critical behavior, we
have to show that this latter probability, $p_{\infty}$,
is indeed the one predicted by some  mean-field

2266: approximation.

2267:

2268:

In the sequel
we will show that this is indeed the case. However, the mean-field
approximation is {\em not} the one from
\cite{kirkpatrick:selman:scaling}, and incorporates a correction
specific to the
properties of random Horn satisfiability.

2274:

Let us
first see that the mean-field approximation is not accurate if no
correction is taken into
account. Indeed, were it accurate, we would have

2278:

\[
\lim_{n\goesto \infty} \Pr[\Phi\in \HSAT]= 1 - \lim_{n\goesto
\infty}\prod_{A\in \{0,1\}^{n}} \left(1-\Pr[A \models \Phi]\right).
\]

Since, for an assignment $A$ of Hamming weight $i$, there are exactly
$2^{i}-1+(n-i)\cdot 2^{i}$ Horn clauses that $A$ falsifies, we have

2285:

2286: \[

2287: \Pr[A \models \Phi]= \left(1 - \frac{2^{i}-1+(n-i)\cdot

2288: 2^{i}}{(n+2)\cdot

2289: 2^{n}-1}\right)^{c\cdot 2^{n}},

2290: \]

2291: so the mean-field prediction reads

2292:

2293: \[\lim_{n\goesto \infty} \Pr[\Phi\in \HSAT]=1- \lim_{n\goesto

2294: \infty}\prod_{j=0}^{n}\left(1-\left(1 - \frac{2^{j}-1+(n-j)\cdot

2295: 2^{j}}{(n+2)\cdot 2^{n}-1}\right)^{c\cdot 2^{n}}\right)^{{{n}\choose {j}}}.

2296: \]
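
The count of falsified clauses and the exponent ${{n}\choose {j}}$
can be checked directly; the following is a sketch of the
bookkeeping, using only the definitions above. An assignment $A$ of
Hamming weight $i$ falsifies a Horn clause exactly when every negated
variable of the clause is set to 1 by $A$ and its positive literal
(if any) is set to 0. Hence the number of Horn clauses falsified by
$A$ is
\[
\underbrace{(2^{i}-1)}_{\mbox{\scriptsize no positive literal}}+
\underbrace{(n-i)\cdot 2^{i}}_{\mbox{\scriptsize one positive literal}}.
\]
For instance, $i=0$ gives the $n$ positive unit clauses, and $i=n$
gives the $2^{n}-1$ nonempty purely negative clauses. Since this count
depends only on the weight of $A$, and there are ${{n}\choose {j}}$
assignments of weight $j$, the product over all $A\in \{0,1\}^{n}$
collapses to a product over $j$ with exponent ${{n}\choose {j}}$.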

2297:

All terms in the product are less than 1. Since the term corresponding
to $j=1$, namely $\left(1-\left(1 - \frac{1+2\cdot (n-1)}{(n+2)\cdot
2^{n}-1}\right)^{c\cdot 2^{n}}\right)^{n}$, has limit 0, the mean-field
prediction
would imply that $\lim_{n\goesto \infty} \Pr[\Phi\in \HSAT]=1$.
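
The limit of the $j=1$ term can be verified as follows (a sketch, for
fixed $c>0$; the symbols $x_{n}$, $m_{n}$ and $\lambda$ are
introduced here for illustration only). We have
\[
c\cdot 2^{n}\cdot \frac{1+2\cdot (n-1)}{(n+2)\cdot 2^{n}-1}\goesto 2c,
\quad \mbox{ and } \quad
(1-x_{n})^{m_{n}}\goesto e^{-\lambda} \mbox{ whenever }
x_{n}\goesto 0 \mbox{ and } m_{n}\cdot x_{n}\goesto \lambda ,
\]
so the inner power converges to $e^{-2c}>0$. Consequently, for large
$n$ the $j=1$ term is at most
$\left(1-\frac{1}{2}e^{-2c}\right)^{n}$, which tends to 0.
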

On the other hand, let us observe that if we drop
the exponent ${{n}\choose
{j}}$ in the product we obtain the right
result: it is a simple but tedious task to prove that

2307:

2308: \[\lim_{n\goesto \infty} \prod_{j=0}^{n}

2309: \left(1-\left(1 - \frac{2^{j}-1+(n-j)\cdot 2^{j}}{(n+2)\cdot

2310: 2^{n}-1}\right)^{c\cdot

2311: 2^{n}}\right)=  \prod_{j=0}^{\infty}\left(1-e^{-c \cdot 2^{j}}\right).

2312: \]
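
A termwise check of this identity can be sketched as follows (it does
not address the interchange of limit and product, which a full proof
would have to justify). For fixed $j$,
\[
c\cdot 2^{n}\cdot \frac{2^{j}-1+(n-j)\cdot 2^{j}}{(n+2)\cdot 2^{n}-1}
= c\cdot \frac{(n-j+1)\cdot 2^{j}-1}{(n+2)-2^{-n}}\goesto c\cdot 2^{j}
\quad (n\goesto \infty),
\]
so the $j$-th factor converges to $1-e^{-c\cdot 2^{j}}$. For example,
the factors with $j=0,1,2$ converge to $1-e^{-c}$, $1-e^{-2c}$ and
$1-e^{-4c}$, respectively. Thus each weight $j$ contributes a single
factor to the limit.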

2313:

Intuitively, this means that ``there exists a correction of the
mean-field approximation that only considers a single assignment of
each
weight, and is accurate.'' The following simple result makes
this intuition precise:

2319:

2320: \begin{lemma}

2321: Suppose $\Phi$ is given as a union of

2322: formulas $\Phi_{1}, \ldots, \Phi_{n}$, where $\Phi_{i}$ contains all

2323: clauses of length {\em exactly} $i$. Then there is a set

2324: $T=\{T_{0}, \ldots, T_{n-1}\}$ of assignments, with {\em $T_{i}$ of

2325: Hamming

2326: weight exactly $i$ and depending

2327: only on $\Phi_{1}\cup \ldots \cup \Phi_{i+1}$},

2328: such that $\Phi$ is satisfiable if and only if it is

2329: satisfied by some assignment in $T$.

2330: \end{lemma}

2331:

2332: \beginproof

2333:

2334: Let $\overline{y_{1}\ldots y_{k}}$ denote the assignment that makes

2335: $y_{1}=\ldots = y_{k}=1$, and all the other variables equal to zero.

2336:

The set $T$ has two parts: the first is simply the set of
assignments implicitly examined by the algorithm \PUR\ in testing
satisfiability. That is, if $x_{1}, \ldots, x_{k}$ are the variables
assigned by \PUR, in this order, the first part consists of the
all-zero assignment and the assignments
$\overline{x_{1}}, \overline{x_{1}x_{2}}, \ldots,\overline{x_{1}\ldots
x_{k}}$. The second part contains a random assignment for each
remaining weight.

2344: \endproof

2345: \qed

The result has a ``mean-field'' interpretation: as before, define
$f(x_{1}, \ldots, x_{n})= 1-
\prod_{i=1}^{n} x_{i}$, and let
$g_{k}[\Phi]$ be the (rescaled) indicator function of the event ``$T_{k}
\not \models \Phi$, given that the event $\overline{A}_{n}\AND \ldots \AND
\overline{A}_{n-k+1}$ happens,'' i.e.

2352:

2353: \[g_{k}[\Phi]= \frac{1}{\Pr[\overline{A}_{n}\AND \ldots \AND

2354: \overline{A}_{n-k+1}]}\cdot \left \{\begin{array}{ll}

2355:                  1, & \mbox{ if } T_{k} \not \models \Phi \AND

2356: \overline{A}_{n}\AND \ldots \AND

2357: \overline{A}_{n-k+1}

2358:                  \\

2359:

2360:                  0,  & \mbox{ otherwise.}\\

2361:         \end{array}

2362: \right.

2363: \]

2364: We have

2365:

2366: \[

2367: E[g_{k}[\Phi]]= \Pr[\overline{A_{n-k}}|\overline{A}_{n}\AND \ldots \AND

2368: \overline{A}_{n-k+1}].

2369: \]

Indeed, $g_{k}[\Phi]\neq 0$ exactly when $R_{n}\OR \ldots \OR
R_{n-k+1}$ or $T_{k}\not \models \Phi \AND S_{n}\AND
\ldots \AND S_{n-k+1}$. The second event is equivalent to
$\overline{A_{n-k}}\AND S_{n}\AND \ldots \AND S_{n-k+1}$, hence we have
$g_{k}[\Phi]\neq 0$ exactly when $\overline{A_{n-k}}\AND
\overline{A}_{n}\AND \ldots \AND\overline{A}_{n-k+1}$ holds.

2376:

Thus we have, by the discussion in the previous section,

2378: \[

2379: f(E[g_{1}[\Phi]], \ldots,E[g_{n}[\Phi]])=

2380: 1-\prod_{k=0}^{n}\Pr[\overline{A_{n-k}}|\overline{A}_{n}\AND \ldots \AND

2381: \overline{A}_{n-k+1}]= \Pr[\Phi \in \HSAT].

2382: \]
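
The last equality rests on the chain rule for conditional
probabilities; as a sketch, in the notation above,
\[
\prod_{k=0}^{n}\Pr[\overline{A}_{n-k}|\overline{A}_{n}\AND \ldots \AND
\overline{A}_{n-k+1}]
= \Pr[\overline{A}_{n}\AND \overline{A}_{n-1}\AND \ldots \AND
\overline{A}_{0}],
\]
where the $k=0$ factor is read as the unconditional probability
$\Pr[\overline{A}_{n}]$ and all conditioning events are assumed to
have positive probability. One minus this quantity is
$\Pr[A_{n}\OR \ldots \OR A_{0}]$; its identification with
$\Pr[\Phi \in \HSAT]$ relies on the meaning of the events $A_{i}$
fixed earlier in the paper.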

2383:

2384: The above correction seems

2385: to be specific to the random model for Horn satisfiability, which

2386: allows clauses of varying lengths.

2387:

2388: To sum up: {\em the mean-field approximation is true, modulo a

2389: correction that takes into account some particular features of the

2390: random model for Horn satisfiability}.

2391:

2392: \section{Discussion}

2393:

We have characterized the asymptotic
satisfiability probability of a random $k$-Horn formula, and
shown that it exhibits behavior very similar to that uncovered
experimentally in \cite{kirkpatrick:selman:scaling}.

2398:

2399: We have also displayed an ``easy-hard-easy'' pattern similar to the

2400: ones observed experimentally in the AI literature. In our case the

2401: pattern is fully explained by elementary properties of the queuing

2402: chain.

2403:

As for an explanation of the ``critical behavior'',
consider an intermediate stage $i$ of \PUR\ and
let $C_{j}$ be the set of clauses of $\Phi_{i,j}^{P}$.
Clearly, whether \PUR\ accepts
depends only on the number of clauses in $C_{1}$. The
restriction on the clause length acts like a ``dampening''
perturbation (in that it eliminates the ``clause flow into $C_{k}$'').
The proof of Theorem~\ref{k:infinite} shows that when
$k(n)\goesto \infty$, with high probability \PUR\ accepts (if
$\Phi$ is satisfiable) ``before the perturbation reaches $C_{1}$'',
therefore the satisfiability probability is the one from the uniform
case. On the other hand, for any constant $k$, with positive
probability \PUR\ does not
halt during the first $k$ iterations (for the exact value see
\cite{istrate:cs.DS/9912001}), and the dampening has a
significant influence. Thus {\em the explanation for
the occurrence (and specific form) of critical behavior is a threshold
property for the number of iterations of \PUR\ on random satisfiable
Horn
formulas ``in the critical region''}.

2424:

2425:

A related, and somewhat controversial, open issue is whether random
Horn satisfiability properly displays critical behavior. Problems with
a sharp threshold display ``critical'' (i.e., singular) behavior in at
least one parameter, the satisfaction probability, which conceivably
allows the definition of critical exponents. This is not so for random
$k$-Horn satisfiability, which has a coarse
threshold and no criticality for $k>2$, hence the question seems not
to be
meaningful. Note, however, that the order parameter involved in the
recent study of the phase transition in 2-SAT \cite{scaling:window:2sat}
is {\bf not} the satisfaction probability, but the (expected) size of the
so-called {\em backbone} (or its more tractable version, the {\em spine}) of a
random formula. The ``window'' that we use to peek at the threshold
behavior of random
Horn satisfiability does not seem to be ``naturally required'' by any
physical
considerations, and it is possible in principle that random
Horn formulas display critical behavior if we take the spine as the
order parameter.

2445:

2446: \section{Acknowledgments}

2447:

2448: This paper is part of the author's Ph.D. thesis at the University of

2449: Rochester.

2450: Support for this work has come from the

2451: NSF CAREER Award CCR-9701911 and the

2452: NSF Grant 9725021.

2453:

2454: {\small

2455: %\bibliography{bibtheory}

2456: \bibliography{/home/gistrate/bib/bibtheory}

2457: \clearpage

2458:  }

2459:

2460:

2461:

2462:

2463: \end{document}