0404:cond-mat0404654/nim.tex

1: %\documentclass[a4paper,twocolumn,showpacs]{revtex4}

2: \documentclass[a4paper,preprint]{revtex4}

3:

4: \usepackage{amsmath,amssymb,amsfonts}

5: \usepackage{graphicx}

6:

7: % comandi

8: \newcommand{\fe}{^{\,\!}}

9:

10:

11: \begin{document}

12:

13: \title{On the convergence of Kikuchi's natural iteration method}

14:

15: \author{Marco Pretti}

16:

17: \affiliation{Istituto Nazionale per la Fisica della Materia (INFM)

18: and Dipartimento di Fisica, \\ Politecnico di Torino, Corso Duca

19: degli Abruzzi 24, I-10129 Torino, Italy}

20:

21: \date{\today}

22:

23: \begin{abstract}

24: In this article we investigate the convergence of the natural

25: iteration method, a numerical procedure widely employed in the

26: statistical mechanics of lattice systems to minimize Kikuchi's

27: cluster variational free energies. We discuss a sufficient

28: condition for the convergence, based on the coefficients of the

29: cluster entropy expansion, depending on the lattice geometry. We

30: also show that such a condition is satisfied for many lattices

31: usually studied in applications. Finally, we consider a recently

32: proposed general method for the minimization of non-convex

33: functionals, showing that the natural iteration method turns out

34: to be a particular case of that method.

35: \end{abstract}

36:

37: %\pacs{}

38:

39: \maketitle

40:

41:

42: \section{Introduction}

43:

44: The cluster variation method (CVM) is a powerful approximate

45: technique for the statistical mechanics of lattice systems, which

46: can improve on the simple mean field and Bethe theories by taking

47: into account correlations over larger and larger distances. It was

48: first proposed by Kikuchi in 1951~\cite{Kikuchi1951} as an

49: approximate evaluation of the thermodynamic weight of the system,

50: and since then it has been reformulated several

51: times~\cite{Morita1972,Schlijper1983,An1988}, mainly to clarify

52: the nature of the approximation and to simplify the way to work it

53: out. A relatively recent formulation~\cite{An1988} shows that the CVM

54: consists of a truncation of the cumulant entropy expansion. Each

55: cumulant is associated with a cluster of sites and the truncation is

56: justified by the expected rapid vanishing of the cumulants upon

57: increasing the cluster size. In this way the CVM can be viewed as

58: a hierarchy of approximations, each one defined by the set of

59: maximal clusters retained in the cumulant expansion, usually

60: denoted as basic clusters. If pairs of nearest neighbor sites are

61: chosen as basic clusters, the CVM coincides with the Bethe

62: approximation. Generally, using larger basic clusters improves the

63: approximation, although the convergence of the cumulant expansion

64: to the exact entropy has been rigorously proved only in a few

65: cases~\cite{Schlijper1983,Kikuchi1994}.

66:

67: Due to its relative simplicity and accuracy, the CVM is widely

68: used in many kinds of statistical mechanical applications, to

69: determine both thermodynamic

70: properties~\cite{GiaconiaPagotTetot2000,AstaHoyt2000,SchonInden1996}

71: and phase

72: diagrams~\cite{Kentzinger2000,Lopez-Sandoval1999,Oates1999,BuzanoPretti1997}.

73: The CVM results generally compare well with those of Monte Carlo

74: simulations~\cite{Lopez-Sandoval1999,Oates1999,Lapinskas1998} as

75: well as experimental

76: ones~\cite{ClouetNastarSigli2004,SchonInden2001,Kentzinger2000,GiaconiaPagotTetot2000,Lapinskas1998,SchonInden1996}.

77: Making use of suitable series of CVM approximations, it is also

78: possible to extrapolate quite accurate estimates of critical

79: exponents~\cite{Pelizzola2000,Pelizzola1994,KatoriSuzuki1994,KatoriSuzuki1988}.

80: Recently, it has been shown that the belief propagation algorithm,

81: an approximate method for statistical inference, employed for

82: many technologically relevant problems

83: (image~\cite{TanakaInoueTitterington2003} and signal

84: processing~\cite{Kschnischang2002}, decoding of error-correcting

85: codes~\cite{Kschnischang2002,Frey1998}, machine

86: learning~\cite{Frey1998}), is actually equivalent to the

87: minimization of a Bethe free energy for statistical mechanical

88: models defined on graphs~\cite{YedidiaFreemanWeiss2002}. This fact

89: has opened new research areas both to the application of the CVM

90: as an improvement of the

91: approximation~\cite{YedidiaFreemanWeiss2002}, and to the analysis

92: of efficient minimization

93: algorithms~\cite{Yuille2001,HeskesAlbersKappen2003,PrettiPelizzola2003},

94: mainly due to the fact that belief propagation sometimes fails to

95: converge.

96:

97: Let us introduce the problem from the CVM point of view. Once the

98: approximate entropy (and hence free energy) for the chosen set of

99: basic clusters has been obtained, one has to face the problem of

100: minimizing a complicated non-convex functional in the basic

101: cluster probability distributions. An algorithm for minimizing

102: such a functional has been proposed by Kikuchi

103: himself~\cite{Kikuchi1974}, and is known as the natural iteration

104: method (NIM). A proof of convergence of this algorithm was

105: given in the original paper, essentially for the Bethe

106: approximation, and it can be easily extended to the Husimi

107: tree~\cite{Pretti2003}. Nevertheless, the range of convergent

108: cases seems to be much wider, so that the natural iteration method

109: might also be of interest for the unconventional applications

110: mentioned above.

111:

112: In this article we analyze a sufficient condition for the

113: convergence of the NIM. Such a condition is a requirement on the

114: coefficients of the cluster entropy expansion (obtained from the

115: cumulant expansion through a M{\"o}bius inversion~\cite{An1988})

116: and is shown to hold for quite a large variety of approximations

117: that are generally used to treat thermodynamic systems. Namely, we

118: consider: a set of ``plaquette'' approximations on different

119: lattices~\cite{Kikuchi1974,SchonInden1996,BuzanoPretti1997,KingChen1999},

120: Kikuchi's B~and C~hierarchies for the

121: square~\cite{KikuchiBrush1967} and

122: triangular~\cite{PelizzolaPretti1999} lattices, the cube

123: approximation for the simple cubic lattice. As far as the latter

124: case is concerned, we actually analyze a generic hypercube

125: approximation on the hypercubic lattice in $d$~dimensions, showing

126: that the sufficient condition holds for $d \leq 3$. Finally we

127: take into account a recently proposed algorithm for the

128: minimization of the CVM free energy~\cite{HeskesAlbersKappen2003},

129: which allows several alternatives, depending on the possibility of

130: bounding the free energy from above with convex (easy to minimize)

131: functions. We show that one of the best choices is actually

132: equivalent to the natural iteration method.

133:

134:

135: \section{The CVM free energy}

136:

137: As mentioned in the Introduction, the approximate CVM entropy can

138: be written as a linear combination of cluster

139: entropies~\cite{An1988}

140: \begin{equation}

141:   S = \sum_\alpha a_\alpha S_\alpha

142:   ,

144: \end{equation}

145: where the sum index~$\alpha$ runs over all basic clusters and

146: their subclusters. We shall always consider clusters in this set

147: only. The cluster entropies are defined as usual

148: \begin{equation}

149:   S_\alpha

150:   =

151:   - \sum_{x_\alpha}

152:   p_\alpha(x_\alpha) \log p_\alpha(x_\alpha)

153:   ,

154: \end{equation}

155: where $p_\alpha(x_\alpha)$ denotes the probability of the

156: configuration~$x_\alpha$ for the cluster~$\alpha$, the sum runs

157: over all possible configurations, and the Boltzmann constant~$k$

158: is set to~$1$ (entropy is measured in natural units). The

159: coefficients can be determined recursively, starting from basic

160: clusters down to subclusters, making use of the following

161: property~\cite{An1988}

162: \begin{equation} \label{eq:sumrule}

163:   \sum_{\alpha' \supseteq \alpha} a_{\alpha'} = 1

164:   \ \ \forall \alpha

165:   .

166: \end{equation}

167: Due to the fact that a basic cluster~$\gamma$ never contains (by

168: definition) another basic cluster, from the above formula we

169: immediately get $a_\gamma = 1 \ \forall \gamma$. Here and in the

170: following, $\gamma$ denotes basic clusters. As far as the

171: Hamiltonian is concerned, we assume that it can be written as a

172: sum of contributions $h_\gamma$ from all basic clusters as

173: \begin{equation}

174:   \mathcal{H} = \sum_{\gamma} h_\gamma(x_\gamma)

175:   ,

176: \end{equation}

177: where $x_\gamma$ denotes the configuration of the basic cluster~$\gamma$.

178: Let us write the whole CVM free energy as a sum over

179: basic clusters, splitting the entropy contribution of each

180: subcluster in equal parts among all basic clusters that contain

181: it. Assuming energies normalized to $kT$, we obtain

182: \begin{equation}

183:   F[p] =

184:   \sum_{\gamma}

185:   \sum_{x_\gamma} p_\gamma(x_\gamma)

186:   \left[

187:   h_\gamma(x_\gamma)

188:   + \log p_\gamma(x_\gamma)

189:   + \sum_{\alpha \subset \gamma} b_\alpha \log p_\gamma(x_\alpha)

190:   \right]

191:   ,

192:   \label{eq:f1}

193: \end{equation}

194: where

195: \begin{equation}

196:   p_\gamma(x_\alpha) \equiv

197:   \sum_{x_{\gamma \setminus \alpha}}

198:   p_\gamma(x_\gamma)

199:   .

200:   \label{eq:margin1}

201: \end{equation}

202: Let us notice that we have defined new coefficients $b_\alpha

203: \equiv a_\alpha / c_\alpha$, where $c_\alpha$ denotes the number

204: of basic clusters that contain~$\alpha$, and we have expressed

205: subcluster probability distributions as marginals of basic cluster

206: distributions, according to Eq.~\eqref{eq:margin1} (the sum runs

207: over configurations $x_{\gamma \setminus \alpha}$ of the basic

208: cluster $\gamma$ minus the subcluster $\alpha$).

209:

210:

211: \section{The natural iteration method}

212:

213: In the above formulation, basic cluster distributions

214: $\{p_\gamma(x_\gamma)\}$ are the variational parameters of the

215: free energy (which is denoted in short by $F[p]$), and the

216: thermodynamic equilibrium state can be determined by minimization

217: with respect to these parameters, subject to suitable normalization and

218: compatibility constraints. By compatibility we mean of course that

219: marginal distributions $p_\gamma(x_\alpha)$ must be the same for

220: all basic clusters $\gamma \supset \alpha$. Let us notice that,

221: for most thermodynamic applications, one usually makes some

222: homogeneity assumption on the system, and this generally reduces

223: the problem to only one or few different basic cluster

224: distributions. Compatibility constraints may still be necessary to

225: impose the required symmetry. We go on with the complete

226: formulation, without loss of generality. The important thing is

227: that in any case we deal with constraints that are linear in the

228: probability distributions (compatibility), possibly with an

229: additive constant (unit) term (normalization). According to the

230: Lagrange method, we transform the constrained minimum problem with

231: respect to $\{p_\gamma(x_\gamma)\}$ to a free minimum problem for

232: an extended functional which depends on additional parameters

233: (Lagrange multipliers). Due to linearity, the extended functional

234: can be written in the form

235: \begin{equation}

236:   \tilde{F}[p,\lambda] = F[p]

237:   - \sum_{\gamma} \sum_{x_\gamma} p_\gamma(x_\gamma) \lambda_\gamma(x_\gamma)

238:   ,

239:   \label{eq:f2}

240: \end{equation}

241: where $\{\lambda_\gamma(x_\gamma)\}$ are the Lagrange multipliers.

242: Of course, $\{\lambda_\gamma(x_\gamma)\}$ are not all independent

243: variables, but internal relationships are system dependent, and we

244: do not analyze them. Let us only notice, for future use, that the

245: difference between the new functional and the original one (the

246: last term in Eq.~\eqref{eq:f2}) is actually independent of the

247: $\{p_\gamma(x_\gamma)\}$ distributions, provided they satisfy the

248: required constraints.

249:

250: The derivatives of~$\tilde{F}$ with respect

251: to~$p_\gamma(x_\gamma)$ turn out to be

252: \begin{equation}

253:   \frac{\partial \tilde{F}[p,\lambda]}{\partial p_\gamma(x_\gamma)}

254:   =

255:   h_\gamma(x_\gamma)

256:   + \log p_\gamma(x_\gamma)

257:   + \sum_{\alpha \subset \gamma} b_\alpha \log p_\gamma(x_\alpha)

258:   - \lambda_\gamma(x_\gamma)

259:   + \text{const.}

260:   ,

261: \end{equation}

262: where the additive constant is irrelevant and we can absorb it

263: into the Lagrange multipliers. Setting the above derivatives to

264: zero gives the stationarity conditions with respect to the probability

265: distributions. The natural iteration method consists in rewriting

266: such equations in a fixed point form, that is

267: \begin{equation}

268:   \hat{p}_\gamma(x_\gamma) =

269:   e^{\lambda_\gamma(x_\gamma) - h_\gamma(x_\gamma)}

270:   \prod_{\alpha \subset \gamma} \left[ p_\gamma(x_\alpha) \right]^{-b_\alpha}

271:   ,

272:   \label{eq:nim}

273: \end{equation}

274: and then solving them by simple iteration. A new estimate of the

275: basic cluster probability distribution $\hat{p}_\gamma(x_\gamma)$

276: is obtained from the previous one $p_\gamma(x_\gamma)$ through its

277: marginals $p_\gamma(x_\alpha)$. The Lagrange multipliers must be

278: determined at each iteration, so that also

279: $\hat{p}_\gamma(x_\gamma)$ satisfies the required constraints.

280: This job can be done in different ways by a nested procedure

281: (inner loop), for instance a Newton-Raphson method or a suitable

282: fixed point method~\cite{Kikuchi1976,PelizzolaPretti1999}. In this

283: paper we do not deal with the determination of Lagrange

284: multipliers, but we only focus on the convergence of the main

285: loop.
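
As an illustration, let us sketch the form taken by Eq.~\eqref{eq:nim} in the simplest case, the Bethe (pair) approximation on a lattice with coordination number~$q$. The symbols $a_1,c_1,b_1$ for the site coefficients and the labels $i,j$ for the two sites of a pair are introduced here only for this example. Basic clusters are nearest neighbor pairs and the only subclusters are single sites, each shared by $c_1 = q$ pairs. The property of the $a$-coefficients recalled in the previous section gives $a_1 + q = 1$, whence
\begin{equation*}
  b_1 = \frac{a_1}{c_1} = -\frac{q-1}{q}
  ,
\end{equation*}
and the iteration for a pair $\gamma = \{i,j\}$ reads
\begin{equation*}
  \hat{p}_\gamma(x_i,x_j) =
  e^{\lambda_\gamma(x_i,x_j) - h_\gamma(x_i,x_j)}
  \left[ p_\gamma(x_i) \, p_\gamma(x_j) \right]^{(q-1)/q}
  .
\end{equation*}
In this case the only subcluster coefficient is negative, so that the site marginals enter the update with a positive exponent.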

286:

287:

288: \section{Sufficient condition for the convergence}

289:

290: As usual for iterative algorithms designed to minimize functionals

291: that are bounded from below, convergence can be proved by showing

292: that the functional value decreases at each iteration. This

293: is actually the case for the natural iteration method. Let us

294: consider the free energy difference $F[\hat{p}]-F[p]$ for two

295: subsequent iterations $p,\hat{p}$, where $F[p]$ is defined by

296: Eqs.~\eqref{eq:f1} and~\eqref{eq:margin1}. Taking the logarithm of

297: both sides of Eq.~\eqref{eq:nim}, we can rewrite the NIM equations

298: in two different ways, namely

299: \begin{equation}

300:   \log \hat{p}_\gamma(x_\gamma) =

301:   \lambda_\gamma(x_\gamma) - h_\gamma(x_\gamma)

302:   - \sum_{\alpha \subset \gamma} b_\alpha \log p_\gamma(x_\alpha)

303: \end{equation}

304: \begin{equation}

305:   \sum_{\alpha \subset \gamma} b_\alpha \log p_\gamma(x_\alpha)

306:   = \lambda_\gamma(x_\gamma) - h_\gamma(x_\gamma)

307:   - \log \hat{p}_\gamma(x_\gamma)

308:   .

309: \end{equation}

310: Let us substitute the former into $F[\hat{p}]$ and the latter into

311: $F[p]$. Remembering that probability distributions satisfy the

312: constraints, whence the last term on the right hand side of

313: Eq.~\eqref{eq:f2} depends on Lagrange multipliers only, we obtain

314: \begin{equation}

315:   F[\hat{p}]-F[p]

316:   = \sum_\gamma \sum_{x_\gamma}

317:   \left\{

318:   p_\gamma(x_\gamma)

319:   \log \frac{\hat{p}_\gamma(x_\gamma)}{p_\gamma(x_\gamma)}

320:   - \hat{p}_\gamma(x_\gamma) \sum_{\alpha \subset \gamma}

321:   b_\alpha \log \frac{p_\gamma(x_\alpha)}{\hat{p}_\gamma(x_\alpha)}

322:   \right\}

323:   .

324:   \label{eq:deltaf2}

325: \end{equation}
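
For completeness, the intermediate step leading to Eq.~\eqref{eq:deltaf2} can be sketched as follows. Substituting the two forms of the NIM equations into the definition~\eqref{eq:f1}, the hamiltonian terms cancel and one is left with
\begin{align*}
  F[\hat{p}] & =
  \sum_\gamma \sum_{x_\gamma} \hat{p}_\gamma(x_\gamma)
  \left[
  \lambda_\gamma(x_\gamma)
  - \sum_{\alpha \subset \gamma}
  b_\alpha \log \frac{p_\gamma(x_\alpha)}{\hat{p}_\gamma(x_\alpha)}
  \right]
  , \\
  F[p] & =
  \sum_\gamma \sum_{x_\gamma} p_\gamma(x_\gamma)
  \left[
  \lambda_\gamma(x_\gamma)
  - \log \frac{\hat{p}_\gamma(x_\gamma)}{p_\gamma(x_\gamma)}
  \right]
  .
\end{align*}
The terms containing the Lagrange multipliers take the same value for $\hat{p}$ and for $p$, because both sets of distributions satisfy the constraints, and therefore they cancel in the difference.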

326: Let us consider the inequality $\log \xi \le \xi-1$, observing

327: that equality holds if and only if $\xi=1$. By applying this

328: inequality to the first logarithm (the one involving basic cluster

329: probability distributions) in Eq.~\eqref{eq:deltaf2}, and taking

330: into account that distributions are normalized, we obtain

331: \begin{equation}

332:   F[\hat{p}]-F[p]

333:   \leq

334:   - \sum_\gamma \sum_{x_\gamma}

335:   \hat{p}_\gamma(x_\gamma) \sum_{\alpha \subset \gamma}

336:   b_\alpha \log \frac{p_\gamma(x_\alpha)}{\hat{p}_\gamma(x_\alpha)}

337:   ,

338:   \label{eq:deltaf3}

339: \end{equation}

340: where equality holds if and only if $\hat{p}_\gamma(x_\gamma) =

341: p_\gamma(x_\gamma) \ \forall \gamma,x_\gamma$. The same result

342: could be obtained by observing that actually the upperbounded

343: terms coincide with (minus) the Kullback-Leibler distances between

344: the probability distributions $p_\gamma(x_\gamma)$ and

345: $\hat{p}_\gamma(x_\gamma)$. If all subcluster coefficients

346: $b_\alpha$ were negative, we could apply the same argument to all

347: terms, and the upper bound would be zero. Such a situation occurs

348: for instance in the Bethe~\cite{Kikuchi1974} and Husimi

349: tree~\cite{Pretti2003} approximations, for which the proof of

350: convergence is then complete. In the general case we have to

351: require a condition on the $b_\alpha$~coefficients. The basic idea

352: is to ``couple'' smaller cluster terms with a positive coefficient

353: to larger cluster terms with a negative coefficient, yielding a

354: sum of ``negative'' Kullback-Leibler distances (some between

355: conditional probability distributions), which can then be

356: bounded from above by zero. The details are given in the following.

357:

358: \noindent {\bf Theorem (sufficient condition for the

359: convergence):} Let $\{b_{\alpha^-|\alpha^+}\}$ be a set of

360: nonnegative coefficients (allocation coefficients), defined for each

361: pair of subclusters $\alpha^-,\alpha^+$, such that

362: $b_{\alpha^-}<0$, $b_{\alpha^+}>0$, and $\alpha^- \supset

363: \alpha^+$. If the following properties hold for all basic

364: clusters~$\gamma$

365: \begin{eqnarray}

366:   b_{\alpha^+} =

367:   & \displaystyle \sum_{\alpha^+ \subset \alpha^- \subset \gamma}

368:   & b_{\alpha^-|\alpha^+}

369:   \ \ \ \ \ \ \forall \alpha^+ \subset \gamma

370:   \label{eq:suffcondplus} \\

371:   -b_{\alpha^-} \geq

372:   & \displaystyle \sum_{\alpha^+ \subset \alpha^-}

373:   & b_{\alpha^-|\alpha^+}

374:   \ \ \ \ \ \ \forall \alpha^- \subset \gamma

375:   ,

376:   \label{eq:suffcondminus}

377: \end{eqnarray}

378: then

379: \begin{eqnarray}

380:   &&

381:   F[\hat{p}]-F[p] \le 0

382:   \label{eq:deltafle0} \\

383:   &&

384:   F[\hat{p}]-F[p] = 0 \ \ \Longleftrightarrow \ \

385:   \hat{p} = p

386:   .

387:   \label{eq:deltafeq0iff}

388: \end{eqnarray}

389: Eq.~\eqref{eq:deltafle0} means that the free energy can only

390: decrease or remain constant during the procedure, while

391: Eq.~\eqref{eq:deltafeq0iff} ensures that it is constant only if

392: the procedure has already reached convergence (i.e., the free

393: energy can only decrease during the procedure). A relevant

394: consequence of Eq.~\eqref{eq:deltafeq0iff} is that it prevents the

395: dynamical system defined by the NIM equations from having limit

396: cycles at constant free energy, which could occur in principle.

397:

398: \noindent {\bf Proof:} Let us consider the right hand side of

399: Eq.~\eqref{eq:deltaf3} and split the sum over subclusters $\alpha

400: \subset \gamma$ in two sums over subclusters $\alpha^+,\alpha^-$

401: with positive or negative coefficients respectively. Positive

402: coefficients $b_{\alpha^+}$ can be replaced by

403: Eq.~\eqref{eq:suffcondplus}, while, according to

404: Eq.~\eqref{eq:suffcondminus}, negative coefficients can be

405: replaced by

406: \begin{equation}

407:   b_{\alpha^-} =

408:   - \sum_{\alpha^+ \subset \alpha^-}

409:   b_{\alpha^-|\alpha^+}

410:   - d_{\alpha^-}

411:   ,

412: \end{equation}

413: for certain $d_{\alpha^-} \geq 0$. Defining, for each $\alpha^-

414: \supset \alpha^+$, the conditional probability distributions

415: \begin{equation}

416:   p_\gamma(x_{\alpha^-}|x_{\alpha^+})

417:   \equiv

418:   \frac{p_\gamma(x_{\alpha^-})}{p_\gamma(x_{\alpha^+})}

419:   ,

420: \end{equation}

421: after some simple manipulations we obtain

422: \begin{equation}

423:   - \sum_{\alpha \subset \gamma}

424:   b_\alpha \log \frac{p_\gamma(x_\alpha)}{\hat{p}_\gamma(x_\alpha)}

425:   = \sum_{\alpha^- \subset \gamma}

426:   \left[

427:   d_{\alpha^-}

428:   \log \frac{p_\gamma(x_{\alpha^-})}{\hat{p}_\gamma(x_{\alpha^-})}

429:   + \sum_{\alpha^+ \subset \alpha^-}

430:   b_{\alpha^-|\alpha^+}

431:   \log \frac{p_\gamma(x_{\alpha^-}|x_{\alpha^+})}{\hat{p}_\gamma(x_{\alpha^-}|x_{\alpha^+})}

432:   \right]

433:   .

434: \end{equation}
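
The manipulations can be sketched as follows, using the shorthand $\ell_\alpha \equiv \log [ p_\gamma(x_\alpha) / \hat{p}_\gamma(x_\alpha) ]$, which is introduced here only for this purpose. Splitting the sum, replacing the coefficients, and exchanging the order of the double sum over $\alpha^+$ and $\alpha^-$, we have
\begin{align*}
  - \sum_{\alpha \subset \gamma} b_\alpha \ell_\alpha
  & =
  - \sum_{\alpha^+ \subset \gamma} b_{\alpha^+} \ell_{\alpha^+}
  - \sum_{\alpha^- \subset \gamma} b_{\alpha^-} \ell_{\alpha^-}
  \\ & =
  - \sum_{\alpha^- \subset \gamma} \sum_{\alpha^+ \subset \alpha^-}
  b_{\alpha^-|\alpha^+} \ell_{\alpha^+}
  + \sum_{\alpha^- \subset \gamma}
  \left[
  d_{\alpha^-}
  + \sum_{\alpha^+ \subset \alpha^-} b_{\alpha^-|\alpha^+}
  \right] \ell_{\alpha^-}
  \\ & =
  \sum_{\alpha^- \subset \gamma}
  \left[
  d_{\alpha^-} \ell_{\alpha^-}
  + \sum_{\alpha^+ \subset \alpha^-}
  b_{\alpha^-|\alpha^+}
  \left( \ell_{\alpha^-} - \ell_{\alpha^+} \right)
  \right]
  ,
\end{align*}
and $\ell_{\alpha^-} - \ell_{\alpha^+}$ is the logarithm of the ratio between the conditional distributions $p_\gamma(x_{\alpha^-}|x_{\alpha^+})$ and $\hat{p}_\gamma(x_{\alpha^-}|x_{\alpha^+})$.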

435: The logarithm inequality $\log \xi \le \xi-1$ can now be applied

436: to all terms in the previous equation, because all coefficients

437: are nonnegative (or equivalently we get a sum of Kullback-Leibler

438: terms), and the zero upper bound of Eq.~\eqref{eq:deltafle0} is

439: obtained. As previously mentioned, Eq.~\eqref{eq:deltafeq0iff} is

440: proved by the fact that the logarithm inequality holds as an equality if and only

441: if $\xi = 1$, i.e., the Kullback-Leibler distance between two

442: probability distributions is zero if and only if the two

443: distributions are equal.~$\blacksquare$
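
The bound used for the conditional terms can be checked explicitly. The following sketch assumes strictly positive distributions, so that all ratios are well defined. Since the logarithm depends on $x_{\alpha^-}$ only, the average over $\hat{p}_\gamma(x_\gamma)$ reduces to an average over the marginal $\hat{p}_\gamma(x_{\alpha^-})$, and
\begin{align*}
  \sum_{x_{\alpha^-}} \hat{p}_\gamma(x_{\alpha^-})
  \log \frac{p_\gamma(x_{\alpha^-}|x_{\alpha^+})}{\hat{p}_\gamma(x_{\alpha^-}|x_{\alpha^+})}
  & \leq
  \sum_{x_{\alpha^-}} \hat{p}_\gamma(x_{\alpha^-})
  \left[
  \frac{p_\gamma(x_{\alpha^-}|x_{\alpha^+})}{\hat{p}_\gamma(x_{\alpha^-}|x_{\alpha^+})} - 1
  \right]
  \\ & =
  \sum_{x_{\alpha^-}} \hat{p}_\gamma(x_{\alpha^+}) \,
  p_\gamma(x_{\alpha^-}|x_{\alpha^+}) - 1
  \\ & =
  \sum_{x_{\alpha^+}} \hat{p}_\gamma(x_{\alpha^+}) - 1
  = 0
  ,
\end{align*}
where we have used $\hat{p}_\gamma(x_{\alpha^-}) = \hat{p}_\gamma(x_{\alpha^+}) \, \hat{p}_\gamma(x_{\alpha^-}|x_{\alpha^+})$ and the normalization of $p_\gamma(x_{\alpha^-}|x_{\alpha^+})$ with respect to the configurations of $\alpha^- \setminus \alpha^+$.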

444:

445:

446: \section{Some particular cases}

447:

448: In this section we consider some particular choices of basic

449: clusters, that is, some particular CVM approximations for regular

450: lattices on which several model systems are defined.

451:

452: \subsection{``Plaquette'' approximations}

453:

454: By ``plaquette'' approximations we mean a class of approximations

455: in which basic clusters are of a unique type (which we denote as

456: plaquette, for example a square on a square lattice), while

457: subclusters with nonzero coefficients are only single sites and

458: nearest neighbor pairs. Let us denote such clusters by $1$ and $2$

459: respectively, and, according to the notation introduced in

460: Sec.~II, let us denote by $a_1$ and $a_2$ the coefficients of the

461: cluster entropy expansion, by $c_1$ and $c_2$ the numbers of

462: plaquettes sharing a given subcluster, and by $b_i = a_i/c_i$ the

463: normalized coefficients. In this class of approximations, it is

464: possible to show that all the coefficients can be obtained as a

465: function of $c_1,c_2$ and of the lattice coordination number~$q$.

466: Making use of Eq.~\eqref{eq:sumrule}, and remembering that basic

467: clusters (plaquettes) have unit $a$-coefficient, we can write

468: \begin{eqnarray}

469:   &&

470:   a_2 + c_2 = 1

471:   \\

472:   &&

473:   a_1 + qa_2 + c_1 = 1

474:   ,

475: \end{eqnarray}

476: from which $b_i = a_i/c_i$ are easily obtained:

477: \begin{eqnarray}

478:   b_2 & = & -\frac{c_2-1}{c_2}

479:   \label{eq:b2plaq} \\

480:   b_1 & = & \frac{q(c_2-1)-(c_1-1)}{c_1}

481:   .

482: \end{eqnarray}

483: Then, we have to impose the sufficient conditions on the

484: coefficients, Eqs.~\eqref{eq:suffcondplus}

485: and~\eqref{eq:suffcondminus}. From Eq.~\eqref{eq:b2plaq} we easily

486: see that~$b_2 \leq 0$, which is fine for the upper bound, but

487: usually~$b_1 \geq 0$. We then have to couple each site to the pairs

488: that contain it and are contained in a given plaquette. Let us

489: adopt the strategy of splitting the site coefficient among such

490: pairs in equal parts, so that, with $b_{2|1}$ the only allocation

491: coefficient and $r$ the number of such pairs,

492: Eqs.~\eqref{eq:suffcondplus} and~\eqref{eq:suffcondminus} read

493: \begin{eqnarray}

494:   b_1 & = & r b_{2|1}

495:   \\

496:   -b_2 & \geq & 2 b_{2|1}

497:   .

498: \end{eqnarray}

499: The allocation coefficient may be easily eliminated, yielding the

500: single condition

501: \begin{equation}

502:   \frac{b_1}{r} + \frac{b_2}{2} \leq 0

503:   .

504:   \label{eq:condsuffplaq}

505: \end{equation}

506: It is possible to show that the $r$~parameter also depends on

507: $c_1,c_2,q$ only. Let us multiply the number~$q$ of

508: nearest neighbor pairs sharing a site by the number $c_2$ of

509: plaquettes sharing a pair. It is easy to see that in this way

510: we have {\em overcounted} $r$~times the number $c_1$ of plaquettes

511: sharing the given site, i.e.,

512: \begin{equation}

513:   rc_1 = qc_2

514:   .

515: \end{equation}

516: With the above manipulation, the condition~\eqref{eq:condsuffplaq}

517: can be rewritten as

518: \begin{equation}

519:   q(c_2-1) \leq 2(c_1-1)

520:   .

521: \end{equation}

522: In this form we can easily verify its validity, which is done in

523: Tab.~\ref{tab:coefficients} for a set of typical plaquette

524: approximations. We have considered: the 2d square, triangular, and

525: honeycomb lattices with a 4-site

526: square~\cite{BuzanoPretti1997,KingChen1999}, a 3-site

527: triangle~\cite{KingChen1999}, and an elementary hexagon as basic

528: cluster respectively, the simple cubic (sc) lattice with a 4-site

529: square~\cite{KingChen1999} as basic cluster, and the face-centered

530: cubic (fcc) lattice with a 3-site triangle~\cite{KingChen1999} or

531: a 4-site tetrahedron~\cite{Kikuchi1974,SchonInden1996} as basic

532: cluster.
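
As a worked example, let us consider the square plaquette approximation on the 2d square lattice. The geometric parameters $q=4$, $c_1=4$, $c_2=2$ used below are the standard ones for this lattice and are quoted here for illustration. From the formulas above we get
\begin{align*}
  a_2 & = 1 - c_2 = -1 ,
  & a_1 & = 1 - c_1 - q a_2 = 1 ,
  \\
  b_2 & = -\frac{1}{2} ,
  & b_1 & = \frac{1}{4} ,
  \\
  r & = \frac{q c_2}{c_1} = 2 ,
  & b_{2|1} & = \frac{b_1}{r} = \frac{1}{8} ,
\end{align*}
so that $-b_2 = 1/2 \geq 2 b_{2|1} = 1/4$, or equivalently $q(c_2-1) = 4 \leq 2(c_1-1) = 6$, and the sufficient condition is satisfied.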

533:

534: \subsection{B and C hierarchies}

535:

536: The B and C~hierarchies, originally proposed by Kikuchi and

537: Brush~\cite{KikuchiBrush1967}, are series of approximations with

538: increasing cluster size, suitable for 2d

539: square~\cite{KikuchiBrush1967} and

540: triangular~\cite{PelizzolaPretti1999} lattices. They are

541: interesting mainly because they converge towards the exact free

542: energy, in spite of the fact that the cluster size increases only

543: in one direction. This result has been proved rigorously only for

544: the C~hierarchy~\cite{Schlijper1983}, but there is numerical

545: evidence for both~\cite{KikuchiBrush1967,PelizzolaPretti1999}.

546: Such results~\cite{Schlijper1983} are related to the transfer

547: matrix concept: just as the Bethe approximation solves exactly an

548: Ising-like chain, the CVM, with infinitely long 1d stripes as

549: basic clusters (to which the B and C~hierarchies tend), solves

550: exactly a 2d lattice. Here we are interested in showing that these

551: approximations verify the sufficient condition for the convergence

552: discussed above. Let us consider for instance the B~hierarchy on

553: the triangular lattice (a completely analogous treatment holds for

554: the C~hierarchy and/or for the square lattice). The basic

555: clusters, shown in Fig.~\ref{fig:gerb} (top row, left column), are

556: made up of a sequence of $L-1$~up- and $L$~down-pointing

557: triangles, where $L$ is an adjustable parameter. Of course, also

558: corresponding clusters with $L$~up- and $L-1$~down-pointing

559: triangles are allowed, but all basic clusters always extend only

560: in one direction. This choice can be viewed as a generalization of

561: the triangle plaquette approximation (see Fig.~\ref{fig:gerb}, top

562: row, right column), where of course also up-pointing triangles are

563: included in the set of basic clusters. In the following rows of

564: Fig.~\ref{fig:gerb} also the subclusters of the given basic

565: cluster, having nonzero coefficients in the cluster entropy

566: expansion ($a$-coefficients), are displayed. They are divided in

567: pair-like and site-like subclusters, in that they can be put in

568: one-to-one correspondence with pair and site subclusters for the

569: triangle plaquette approximations. Such analogy is not only a

570: pictorial one. In fact, it is possible to show (for instance

571: making use of Eq.~\eqref{eq:sumrule}, but see also

572: Ref.~\cite{KikuchiBrush1967}) that the $a$-coefficients are

573: $a_2=-1$ for pair-like clusters and $a_1=1$ for site-like

574: clusters, as for the triangle plaquette approximation. The same

575: holds for $c$-coefficients, i.e., the numbers of basic clusters

576: sharing a given subcluster, which turn out to be $c_2=2$ and

577: $c_1=6$ respectively, whence $b_2=-1/2$ and $b_1=1/6$. Finally,

578: from Fig.~\ref{fig:gerb} one easily sees that also the same

579: ``allocation'' technique as for the plaquette approximation can be

580: used. Inside a given basic cluster, each site-like subcluster is

581: shared by $r=2$ pair-like clusters, and each pair-like cluster

582: contains 2 site-like subclusters, whence

583: inequality~\eqref{eq:condsuffplaq} is satisfied.
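
Explicitly, with the values quoted above ($b_1 = 1/6$, $b_2 = -1/2$, $r = 2$), the allocation coefficient is $b_{2|1} = b_1/r = 1/12$, and the check reads
\begin{equation*}
  \frac{b_1}{r} + \frac{b_2}{2}
  = \frac{1}{12} - \frac{1}{4}
  = -\frac{1}{6} \leq 0
  .
\end{equation*}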

584:

585: \subsection{Hypercube approximation in $d$ dimensions}

586:

587: Finally, let us consider the case of a hypercubic lattice in

588: $d$~dimensions, and let us choose a $d$-dimensional hypercube

589: ($d$-cube) as basic cluster. Of course, the relevant cases are

590: $d=2,3$, the former of which coincides with the square plaquette

591: approximation, mentioned above, but the interest of a general

592: treatment will be clearer later. It is possible to show, by

593: repeated use of Eq.~\eqref{eq:sumrule}, that clusters with

594: nonzero coefficients are only $i$-cubes, for $i=0,\dots,d$, and the

595: $i$-cube coefficient in $d$ dimensions is $a_i^{(d)} =

596: (-1)^{d-i}$. Moreover, the number of $d$-cubes sharing a given

597: $i$-cube (in $d$ dimensions) is $c_i^{(d)} = 2^{d-i}$. As a

598: consequence, the normalized coefficients turn out to be

599: \begin{equation}

600:   b_i^{(d)} = \left( -\frac{1}{2} \right)^{d-i}

601:   .

602:   \label{eq:bcoeff_hcube}

603: \end{equation}

604: Let us now impose the sufficient conditions,

605: Eqs.~\eqref{eq:suffcondplus} and~\eqref{eq:suffcondminus}. Let us

606: notice that the positive coefficients, those that cause problems for

607: the upper bound, have the $i$~index with the same parity as~$d$,

608: that is $i=d-2,d-4,\dots$. Then we can couple each $i$-cube with

609: $(i+1)$-cubes that contain it and are contained in a given

610: $d$-cube. As for plaquette approximations, let us split the

611: $i$-cube coefficient in equal parts, so that we have a single

612: $b_{i+1|i}^{(d)}$ allocation coefficient. We still have to observe

613: that each $i$-cube is shared by $d-i$ $(i+1)$-cubes contained in

614: the same $d$-cube (the equivalent of the $r$~parameter for

615: plaquette approximations), and that each $(i+1)$-cube contains

616: $2(i+1)$ different $i$-cubes (the equivalent of $2$~sites in a

617: pair). We can then rewrite Eqs.~\eqref{eq:suffcondplus}

618: and~\eqref{eq:suffcondminus} as

619: \begin{eqnarray}

620:   b_i^{(d)} & = & (d-i) \, b_{i+1|i}^{(d)}

621:   \\

622:   -b_{i+1}^{(d)} & \geq & 2(i+1) \, b_{i+1|i}^{(d)}

623:   .

624: \end{eqnarray}

625: By eliminating the allocation coefficient, we obtain

626: \begin{equation}

627:   \frac{b_i^{(d)}}{d-i} + \frac{b_{i+1}^{(d)}}{2(i+1)} \leq 0

628:   ,

629:   \label{eq:condsuffhcube}

630: \end{equation}

631: which, substituting Eq.~\eqref{eq:bcoeff_hcube} and taking into

632: account that $d-i$ is always even (as previously mentioned),

633: becomes

634: \begin{equation}

635:   2i \leq d-1

636:   .

637: \end{equation}

638: This inequality becomes more and more difficult to satisfy as

639: the subcluster index~$i$ increases. Therefore we have to consider

640: the worst case, that is $i=d-2$, leading to

641: \begin{equation}

642:   d \leq 3

643:   .

644: \end{equation}

645: This results essentially proves the convergence for $d=3$, because

646: the $d=2$ case coincides with the square plaquette approximation.

647: Nevertheless, it is mainly interesting in that it gives us the

648: opportunity to experiment the natural iteration method in a case

649: in which the sufficient condition is not verified. We have

650: actually implemented the procedure for the simple Ising model on

651: the $d=4$ hypercubic lattice, easily finding cases in which the

652: behavior is non-convergent (oscillating). This fact led us to

653: conjecture that the sufficient condition might also be a

654: necessary one.
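
As an explicit check of the condition $2i \leq d-1$ (a simple arithmetic illustration, restricted to the equal-split allocation considered above), for the lowest dimensions the subcluster indices $i=d-2,d-4,\dots$ give
\begin{eqnarray*}
  d=2: & \quad & i=0 , \quad 0 \leq 1 \\
  d=3: & \quad & i=1 , \quad 2 \leq 2 \\
  d=4: & \quad & i=0 , \quad 0 \leq 3 ; \qquad i=2 , \quad 4 > 3 .
\end{eqnarray*}
Hence for $d=3$ the condition holds as an equality, whereas for $d=4$ it is violated only by the $i=2$ term.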

655:

656:

657: \section{An equivalent formulation}

658:

659: In a recent paper~\cite{HeskesAlbersKappen2003}, a general method

660: for the minimization of non-convex functionals, related to the

661: existence of suitable upperbounds to the functional to be

662: minimized, is proposed and applied to the case of the CVM free

663: energy. Different possible choices for the upperbounding

664: functional are investigated. Hereafter, we show that one choice

665: proposed there, which incidentally turns out to be quite convenient

666: in terms of computation time, is equivalent to the natural

667: iteration method. First, let us briefly recall the general method,

668: which is based on the following.

669:

670: \noindent {\bf Theorem:} \ Let $F[p]$ be a continuous functional

671: in the set of variables $p$, defined in some compact

672: domain~$\Omega$, and $\bar{F}[p,p']$ an auxiliary continuous

673: functional in a pair of variable sets $p,p'$, defined in the

674: domain~$\Omega^2$, having a unique minimum with respect to~$p'$

675: for each fixed~$p$. Let the auxiliary functional satisfy the

676: following requirements:

677: \begin{eqnarray}

678:   &&

679:   F[p'] \leq \bar{F}[p,p']

680:   \label{eq:flefbar} \\ &&

681:   F[p'] = \bar{F}[p,p'] \ \ \Longleftrightarrow \ \ p' = p

682:   ,

683:   \label{eq:feqfbariff}

684: \end{eqnarray}

685: that is, the auxiliary functional is an upperbound to the original

686: functional, and equality holds if and only if the two arguments of

687: the former are equal. Then the application $\varphi: p \mapsto

688: \hat{p}$ defined by

689: \begin{equation}

690:   \hat{p} = \arg\min_{p' \in \Omega} \bar{F}[p,p']

691:   \label{eq:application}

692: \end{equation}

693: enjoys the properties

694: \begin{eqnarray}

695:   &&

696:   F[\hat{p}] \leq F[p]

697:   \label{eq:flef} \\ &&

698:   F[\hat{p}] = F[p] \ \ \Longleftrightarrow \ \ \hat{p} = p

699:   .

700:   \label{eq:feqfiff}

701: \end{eqnarray}

702: Therefore, it defines an iterative method to minimize the original

703: functional.

704:

705: \noindent {\bf Proof:} \ It is easy to obtain the following

706: inequality chain

707: \begin{equation}

708:   F[\hat{p}] \leq \bar{F}[p,\hat{p}] \leq \bar{F}[p,p] = F[p]

709:   ,

710:   \label{eq:ineqchain}

711: \end{equation}

712: proving immediately Eq.~\eqref{eq:flef}. The first inequality is

713: the first hypothesis on the auxiliary functional~$\bar{F}$,

714: Eq.~\eqref{eq:flefbar}; the second inequality is a consequence of

715: the definition of~$\varphi$, Eq.~\eqref{eq:application}; the

716: equality follows from the second hypothesis on~$\bar{F}$,

717: Eq.~\eqref{eq:feqfbariff}. In order to prove also

718: Eq.~\eqref{eq:feqfiff}, we have to show that both inequalities

719: hold as equalities if and only if~$\hat{p} = p$. As far as the

720: former is concerned, this is a direct consequence of the

721: hypothesis Eq.~\eqref{eq:feqfbariff}, while the latter is proved

722: by the fact that~$\bar{F}[p,p']$ has a unique minimum, which is

723: also the absolute minimum, with respect to~$p'$.~$\blacksquare$
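
To spell out the last statement of the theorem, one may consider (as a sketch, with the iteration index~$n$ being our own notation, introduced here only for illustration) the sequence generated by repeated application of~$\varphi$,
\begin{eqnarray*}
  p^{(n+1)} & = & \varphi\bigl(p^{(n)}\bigr) , \qquad n = 0,1,2,\dots \\
  F[p^{(n+1)}] & \leq & F[p^{(n)}] .
\end{eqnarray*}
Since $F$ is continuous on the compact domain~$\Omega$, it is bounded from below, so that the non-increasing sequence $F[p^{(n)}]$ has a limit; moreover, by Eq.~\eqref{eq:feqfiff}, the decrease is strict unless $p^{(n)}$ is a fixed point of~$\varphi$.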

724:

725: Let us now consider the auxiliary functional defined by

726: \begin{equation}

727:   \bar{F}[p,p'] =

728:   \sum_{\gamma}

729:   \sum_{x_\gamma} p'_\gamma(x_\gamma)

730:   \left[

731:   h_\gamma(x_\gamma)

732:   + \log p'_\gamma(x_\gamma)

733:   + \sum_{\alpha \subset \gamma} b_\alpha \log p_\gamma(x_\alpha)

734:   \right]

735:   .

736: \end{equation}

737: First of all, it is easy to see that $\bar{F}[p,p] = F[p]$, where

738: $F[p]$ is the CVM free energy~\eqref{eq:f1}. Moreover, $\bar{F}[p,p']$

739: is easily seen to be strictly convex with respect to~$p'$; therefore, if it

740: has a stationary point, that point is unique and is a minimum.

741: Finally, let us observe that making this functional stationary

742: with respect to~$p'$, under the usual linear constraints, yields

743: precisely the NIM equations~\eqref{eq:nim}, which in this way

744: can be used to define the application~$\varphi$. In order to show

745: that $\varphi$ actually performs a minimization of~$F$, a

746: sufficient condition is given by

747: Eqs.~\eqref{eq:flefbar},\eqref{eq:feqfbariff} in the above

748: theorem, that is, we have to upperbound the quantity

749: \begin{equation}

750:   F[p'] - \bar{F}[p,p'] =

751:   - \sum_\gamma \sum_{x_\gamma}

752:   p'_\gamma(x_\gamma)

753:   \sum_{\alpha \subset \gamma}

754:   b_\alpha \log \frac{p_\gamma(x_\alpha)}{p'_\gamma(x_\alpha)}

755:   \label{eq:deltaf4}

756: \end{equation}

757: by zero. Going back to (the right hand side of)

758: Eq.~\eqref{eq:deltaf3}, it easily turns out that this is exactly

759: the same upperbound we have proved with the sufficient condition

760: for the convergence of the NIM.
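
For completeness, Eq.~\eqref{eq:deltaf4} can be checked by direct subtraction, using only the identity $F[p'] = \bar{F}[p',p']$ noted above: the terms in $h_\gamma(x_\gamma)$ and in $\log p'_\gamma(x_\gamma)$ cancel, leaving
\begin{eqnarray*}
  F[p'] - \bar{F}[p,p'] & = & \bar{F}[p',p'] - \bar{F}[p,p'] \\
  & = & \sum_\gamma \sum_{x_\gamma} p'_\gamma(x_\gamma)
  \sum_{\alpha \subset \gamma} b_\alpha
  \left[ \log p'_\gamma(x_\alpha) - \log p_\gamma(x_\alpha) \right] ,
\end{eqnarray*}
which coincides with the right hand side of Eq.~\eqref{eq:deltaf4}.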

761:

762:

763: \section{Conclusions}

764:

765: Let us finally summarize our results. We have investigated the

766: convergence of the natural iteration method, proposed by Kikuchi

767: as a minimization procedure for cluster variational free energies

768: and widely employed in applications of the CVM. We have

769: discussed a condition on the coefficients of the cluster entropy

770: expansion, which is sufficient to prove that the free energy

771: decreases at each iteration, ensuring the convergence of the

772: method. Such a condition is based on the idea of pairing

773: subcluster entropies with a positive coefficient to larger

774: subcluster terms with a negative coefficient, yielding a set of

775: conditional entropy terms with negative coefficients. It had

776: already been proved by Kikuchi in the original

777: paper~\cite{Kikuchi1974} that negative coefficient terms give

778: decreasing contributions to the free energy. We have also taken

779: into account a set of common CVM approximations defined on various

780: regular lattices, frequently encountered in applications, showing

781: that the sufficient condition is always satisfied. In particular,

782: we have devoted some attention to the class of hypercube

783: approximations on the generic ($d$-dimensional) hypercubic

784: lattice, showing that the sufficient condition is verified for $d

785: \leq 3$. We have also implemented the natural iteration method for

786: $d=4$ on the simple Ising model, and found that several

787: (random as well as uniform) initial conditions give rise to

788: non-convergent (oscillating) behavior. This fact has led us to

789: conjecture that the sufficient condition may also be a necessary

790: one. Finally we have established a connection with a recently

791: proposed method for the minimization of non-convex functionals,

792: which can be applied to the CVM free

793: energy~\cite{HeskesAlbersKappen2003}. Such a method is based on

794: the existence of suitable upperbounding functionals to the

795: functional to be minimized. In Ref.~\cite{HeskesAlbersKappen2003}

796: several choices of upperbounding functionals are proposed and

797: applied to simple inhomogeneous systems. We have shown that one of

798: the upperbounding choices proposed there (indeed quite a good

799: choice in terms of computation time) is actually equivalent to

800: Kikuchi's natural iteration method. It turns out explicitly that

801: the upperbounding condition implies a decreasing free energy, whence

802: convergence.

803:

804: \begin{acknowledgments}

805: I would like to express my thanks to Dr. Alessandro Pelizzola for

806: many helpful suggestions and discussions.

807: \end{acknowledgments}

808:

809: %\bibliography{../../bibliography}

810:

811: \input{nim.bbl}

812:

813: \clearpage

814:

815: \begin{table}[p]

816:   \caption{

817:     Coefficients for different plaquette approximations. The first

818:     two columns report respectively the lattice and plaquette (basic cluster) type.

819:     The following three columns display the independent

820:     coefficients: $q$ (coordination number), $c_2,c_1$ (number of

821:     plaquettes sharing a given pair, site). The last two columns

822:     verify the sufficient condition, in that $q(c_2-1) < 2(c_1-1)$.

823:   }

824:   \begin{ruledtabular}

825:   \begin{tabular}{ll|rrr|rr}

826:     lattice    & plaquette   & $q$ & $c_2$ & $c_1$ & $q(c_2-1)$ & $2(c_1-1)$ \cr

827:     \hline

828:     square     & square      &  4 & 2 &  4 &  4 &  6 \cr

829:     triangular & triangle    &  6 & 2 &  6 &  6 & 10 \cr

830:     honeycomb  & hexagon     &  3 & 2 &  3 &  3 &  4 \cr

831:     sc         & square      &  6 & 4 & 12 & 18 & 22 \cr

832:     fcc        & triangle    & 12 & 4 & 24 & 36 & 46 \cr

833:     fcc        & tetrahedron & 12 & 2 &  8 & 12 & 14

834:   \end{tabular}

835:   \end{ruledtabular}

836:   \label{tab:coefficients}

837: \end{table}

838:

839: \clearpage

840:

841: \begin{figure}[p]

842: %  \includegraphics*[10mm,100mm][110mm,260mm]{gerb.ps}

843:

844:   \setlength{\unitlength}{1.2mm}

845:

846:   \begin{picture}(150,140)(-10,-140)

847:

848:   \thicklines

849:

850:   % basic clusters

851:   \put(-3,-12){\makebox(15,2)[lb]{\sf BASIC CLUSTER}}

852:   \put(52,-12){\makebox(15,2)[lb]{\sf PLAQUETTE}}

853:

854:   % C-hierarchy

855:   % sites

856:   \multiput(0,-20)(9,0){5}{\circle*{2}}

857:   \multiput(4.5,-27.5)(9,0){4}{\circle*{2}}

858:   % horizontal bonds

859:   \put(0,-20){\line(1,0){9}}

860:   \put(9,-20){\line(1,0){9}}

861:   \put(18,-20){\line(1,0){3.5}}

862:   \put(27,-20){\line(-1,0){3.5}}

863:   \put(36,-20){\line(-1,0){9}}

864:   \put(4.5,-27.5){\line(1,0){9}}

865:   \put(13.5,-27.5){\line(1,0){9}}

866:   \put(22.5,-27.5){\line(1,0){3.5}}

867:   \put(31.5,-27.5){\line(-1,0){3.5}}

868:   % oblique bonds

869:   \multiput(0,-20)(9,0){4}{\line(3,-5){4.5}}

870:   \put(4.5,-27.5){\line(3,5){4.5}}

871:   \put(13.5,-27.5){\line(3,5){4.5}}

872:   \put(22.5,-27.5){\line(3,5){1.8}}

873:   \put(27,-20){\line(-3,-5){1.8}}

874:   \put(31.5,-27.5){\line(3,5){4.5}}

875:   % indices

876:   \put(-1,-18){\makebox(2,2)[lb]{$1$}}

877:   \put(8,-18){\makebox(2,2)[lb]{$3$}}

878:   \put(17,-18){\makebox(2,2)[lb]{$5$}}

879:   \put(19.5,-18){\makebox(3,2)[lb]{$\dots$}}

880:   \put(24,-18){\makebox(5,2)[lb]{$2L-1$}}

881:   \put(35,-18){\makebox(5,2)[lb]{$2L+1$}}

882:   \put(3.5,-31.5){\makebox(2,2)[lt]{$2$}}

883:   \put(12.5,-31.5){\makebox(2,2)[lt]{$4$}}

884:   \put(21.5,-31.5){\makebox(2,2)[lt]{$6$}}

885:   \put(25,-33){\makebox(3,2)[lt]{$\dots$}}

886:   \put(30.5,-31.5){\makebox(3,2)[lt]{$2L$}}

887:

888:   % triangle approximation

889:   % sites

890:   \multiput(55,-20)(9,0){2}{\circle*{2}}

891:   \put(59.5,-27.5){\circle*{2}}

892:   % horizontal bonds

893:   \put(55,-20){\line(1,0){9}}

894:   % oblique bonds

895:   \multiput(55,-20)(9,0){1}{\line(3,-5){4.5}}

896:   \put(59.5,-27.5){\line(3,5){4.5}}

897:   % indices

898:   \put(54,-18){\makebox(2,2)[lb]{$1$}}

899:   \put(63,-18){\makebox(2,2)[lb]{$3$}}

900:   \put(58.5,-31.5){\makebox(2,2)[lt]{$2$}}

901:

902:   % pair-like clusters

903:   \put(-3,-42){\makebox(15,2)[lb]{\sf PAIR-LIKE CLUSTERS}}

904:   \put(52,-42){\makebox(15,2)[lb]{\sf PAIRS}}

905:

906:   % cluster 12 (b-hierarchy)

907:   % sites

908:   \multiput(0,-50)(9,0){4}{\circle*{2}}

909:   \multiput(4.5,-57.5)(9,0){4}{\circle*{2}}

910:   % horizontal bonds

911:   \put(0,-50){\line(1,0){9}}

912:   \put(9,-50){\line(1,0){9}}

913:   \put(18,-50){\line(1,0){3.5}}

914:   \put(27,-50){\line(-1,0){3.5}}

915:   \put(4.5,-57.5){\line(1,0){9}}

916:   \put(13.5,-57.5){\line(1,0){9}}

917:   \put(22.5,-57.5){\line(1,0){3.5}}

918:   \put(31.5,-57.5){\line(-1,0){3.5}}

919:   % oblique bonds

920:   \multiput(0,-50)(9,0){4}{\line(3,-5){4.5}}

921:   \put(4.5,-57.5){\line(3,5){4.5}}

922:   \put(13.5,-57.5){\line(3,5){4.5}}

923:   \put(22.5,-57.5){\line(3,5){1.8}}

924:   \put(27,-50){\line(-3,-5){1.8}}

925:   % indices

926:   \put(-1,-48){\makebox(2,2)[lb]{$1$}}

927:   \put(8,-48){\makebox(2,2)[lb]{$3$}}

928:   \put(17,-48){\makebox(2,2)[lb]{$5$}}

929:   \put(19.5,-48){\makebox(3,2)[lb]{$\dots$}}

930:   \put(24,-48){\makebox(5,2)[lb]{$2L-1$}}

931:   \put(3.5,-61.5){\makebox(2,2)[lt]{$2$}}

932:   \put(12.5,-61.5){\makebox(2,2)[lt]{$4$}}

933:   \put(21.5,-61.5){\makebox(2,2)[lt]{$6$}}

934:   \put(25,-63){\makebox(3,2)[lt]{$\dots$}}

935:   \put(30.5,-61.5){\makebox(3,2)[lt]{$2L$}}

936:

937:   % cluster 12 (triangle approximation)

938:   % sites

939:   \put(55,-50){\circle*{2}}

940:   \put(59.5,-57.5){\circle*{2}}

941:   % oblique bonds

942:   \put(55,-50){\line(3,-5){4.5}}

943:   % indices

944:   \put(54,-48){\makebox(2,2)[lb]{$1$}}

945:   \put(58.5,-61.5){\makebox(2,2)[lt]{$2$}}

946:

947:   % cluster 23 (C-hierarchy)

948:   % sites

949:   \multiput(9,-70)(9,0){4}{\circle*{2}}

950:   \multiput(4.5,-77.5)(9,0){4}{\circle*{2}}

951:   % horizontal bonds

952:   \put(9,-70){\line(1,0){9}}

953:   \put(18,-70){\line(1,0){3.5}}

954:   \put(27,-70){\line(-1,0){3.5}}

955:   \put(36,-70){\line(-1,0){9}}

956:   \put(4.5,-77.5){\line(1,0){9}}

957:   \put(13.5,-77.5){\line(1,0){9}}

958:   \put(22.5,-77.5){\line(1,0){3.5}}

959:   \put(31.5,-77.5){\line(-1,0){3.5}}

960:   % oblique bonds

961:   \multiput(9,-70)(9,0){3}{\line(3,-5){4.5}}

962:   \put(4.5,-77.5){\line(3,5){4.5}}

963:   \put(13.5,-77.5){\line(3,5){4.5}}

964:   \put(22.5,-77.5){\line(3,5){1.8}}

965:   \put(27,-70){\line(-3,-5){1.8}}

966:   \put(31.5,-77.5){\line(3,5){4.5}}

967:   % indices

968:   \put(8,-68){\makebox(2,2)[lb]{$3$}}

969:   \put(17,-68){\makebox(2,2)[lb]{$5$}}

970:   \put(19.5,-68){\makebox(3,2)[lb]{$\dots$}}

971:   \put(24,-68){\makebox(5,2)[lb]{$2L-1$}}

972:   \put(35,-68){\makebox(5,2)[lb]{$2L+1$}}

973:   \put(3.5,-81.5){\makebox(2,2)[lt]{$2$}}

974:   \put(12.5,-81.5){\makebox(2,2)[lt]{$4$}}

975:   \put(21.5,-81.5){\makebox(2,2)[lt]{$6$}}

976:   \put(25,-83){\makebox(3,2)[lt]{$\dots$}}

977:   \put(30.5,-81.5){\makebox(3,2)[lt]{$2L$}}

978:

979:   % cluster 23 (triangle approximation)

980:   % sites

981:   \put(64,-70){\circle*{2}}

982:   \put(59.5,-77.5){\circle*{2}}

983:   % oblique bonds

984:   \put(59.5,-77.5){\line(3,5){4.5}}

985:   % indices

986:   \put(63,-68){\makebox(2,2)[lb]{$3$}}

987:   \put(58.5,-81.5){\makebox(2,2)[lt]{$2$}}

988:

989:   % cluster 13 (C-hierarchy)

990:   % sites

991:   \multiput(0,-90)(9,0){5}{\circle*{2}}

992:   % horizontal bonds

993:   \put(0,-90){\line(1,0){9}}

994:   \put(9,-90){\line(1,0){9}}

995:   \put(18,-90){\line(1,0){3.5}}

996:   \put(27,-90){\line(-1,0){3.5}}

997:   \put(36,-90){\line(-1,0){9}}

998:   % indices

999:   \put(-1,-88){\makebox(2,2)[lb]{$1$}}

1000:   \put(8,-88){\makebox(2,2)[lb]{$3$}}

1001:   \put(17,-88){\makebox(2,2)[lb]{$5$}}

1002:   \put(19.5,-88){\makebox(3,2)[lb]{$\dots$}}

1003:   \put(24,-88){\makebox(5,2)[lb]{$2L-1$}}

1004:   \put(35,-88){\makebox(5,2)[lb]{$2L+1$}}

1005:

1006:   % cluster 13 (triangle approximation)

1007:   % sites

1008:   \multiput(55,-90)(9,0){2}{\circle*{2}}

1009:   % horizontal bonds

1010:   \put(55,-90){\line(1,0){9}}

1011:   % indices

1012:   \put(54,-88){\makebox(2,2)[lb]{$1$}}

1013:   \put(63,-88){\makebox(2,2)[lb]{$3$}}

1014:

1015:

1016:   % site-like clusters

1017:   \put(-3,-102){\makebox(15,2)[lb]{\sf SITE-LIKE CLUSTERS}}

1018:   \put(52,-102){\makebox(15,2)[lb]{\sf SITES}}

1019:

1020:   % cluster 1 (C-hierarchy)

1021:   % sites

1022:   \multiput(0,-110)(9,0){4}{\circle*{2}}

1023:   % horizontal bonds

1024:   \put(0,-110){\line(1,0){9}}

1025:   \put(9,-110){\line(1,0){9}}

1026:   \put(18,-110){\line(1,0){3.5}}

1027:   \put(27,-110){\line(-1,0){3.5}}

1028:   % indices

1029:   \put(-1,-108){\makebox(2,2)[lb]{$1$}}

1030:   \put(8,-108){\makebox(2,2)[lb]{$3$}}

1031:   \put(17,-108){\makebox(2,2)[lb]{$5$}}

1032:   \put(19.5,-108){\makebox(3,2)[lb]{$\dots$}}

1033:   \put(24,-108){\makebox(5,2)[lb]{$2L-1$}}

1034:

1035:   % cluster 1 (triangle approximation)

1036:   % sites

1037:   \put(55,-110){\circle*{2}}

1038:   % indices

1039:   \put(54,-108){\makebox(2,2)[lb]{$1$}}

1040:

1041:   % cluster 2 (C-hierarchy)

1042:   % sites

1043:   \multiput(4.5,-120)(9,0){4}{\circle*{2}}

1044:   % horizontal bonds

1045:   \put(4.5,-120){\line(1,0){9}}

1046:   \put(13.5,-120){\line(1,0){9}}

1047:   \put(22.5,-120){\line(1,0){3.5}}

1048:   \put(31.5,-120){\line(-1,0){3.5}}

1049:   % indices

1050:   \put(3.5,-118){\makebox(2,2)[lb]{$2$}}

1051:   \put(12.5,-118){\makebox(2,2)[lb]{$4$}}

1052:   \put(21.5,-118){\makebox(2,2)[lb]{$6$}}

1053:   \put(25.5,-118){\makebox(3,2)[lb]{$\dots$}}

1054:   \put(30.5,-118){\makebox(2,2)[lb]{$2L$}}

1055:

1056:   % cluster 2 (triangle approximation)

1057:   % sites

1058:   \put(59.5,-120){\circle*{2}}

1059:   % indices

1060:   \put(58.5,-118){\makebox(2,2)[lb]{$2$}}

1061:

1062:   % cluster 3 (C-hierarchy)

1063:   % sites

1064:   \multiput(9,-130)(9,0){4}{\circle*{2}}

1065:   % horizontal bonds

1066:   \put(9,-130){\line(1,0){9}}

1067:   \put(18,-130){\line(1,0){3.5}}

1068:   \put(27,-130){\line(-1,0){3.5}}

1069:   \put(36,-130){\line(-1,0){9}}

1070:   % indices

1071:   \put(8,-128){\makebox(2,2)[lb]{$3$}}

1072:   \put(17,-128){\makebox(2,2)[lb]{$5$}}

1073:   \put(19.5,-128){\makebox(3,2)[lb]{$\dots$}}

1074:   \put(24,-128){\makebox(5,2)[lb]{$2L-1$}}

1075:   \put(35,-128){\makebox(5,2)[lb]{$2L+1$}}

1076:

1077:   % cluster 3 (triangle approximation)

1078:   % sites

1079:   \put(64,-130){\circle*{2}}

1080:   % indices

1081:   \put(63,-128){\makebox(2,2)[lb]{$3$}}

1082:

1083:   \end{picture}

1084:

1085:   \caption{

1086:     Basic cluster and subclusters for the B~hierarchy (left side) and

1087:     for the corresponding (triangle) plaquette approximation (right side).

1088:   }

1089:   \label{fig:gerb}

1090: \end{figure}

1091:

1092:

1093: \end{document}

1094: