cond-mat0404654/nim.tex
1: %\documentclass[a4paper,twocolumn,showpacs]{revtex4}
2: \documentclass[a4paper,preprint]{revtex4}
3: 
4: \usepackage{amsmath,amssymb,amsfonts}
5: \usepackage{graphicx}
6: 
7: % commands
8: \newcommand{\fe}{^{\,\!}}
9: 
10: 
11: \begin{document}
12: 
13: \title{On the convergence of Kikuchi's natural iteration method}
14: 
15: \author{Marco Pretti}
16: 
17: \affiliation{Istituto Nazionale per la Fisica della Materia (INFM)
18: and Dipartimento di Fisica, \\ Politecnico di Torino, Corso Duca
19: degli Abruzzi 24, I-10129 Torino, Italy}
20: 
21: \date{\today}
22: 
23: \begin{abstract}
24: In this article we investigate the convergence of the natural
25: iteration method, a numerical procedure widely employed in the
26: statistical mechanics of lattice systems to minimize Kikuchi's
27: cluster variational free energies. We discuss a sufficient
28: condition for the convergence, based on the coefficients of the
29: cluster entropy expansion, depending on the lattice geometry. We
30: also show that such a condition is satisfied for many lattices
31: usually studied in applications. Finally, we consider a recently
32: proposed general method for the minimization of non-convex
33: functionals, showing that the natural iteration method turns out
34: to be a particular case of that method.
35: \end{abstract}
36: 
37: %\pacs{}
38: 
39: \maketitle
40: 
41: 
42: \section{Introduction}
43: 
44: The cluster variation method (CVM) is a powerful approximate
45: technique for the statistical mechanics of lattice systems, which
46: can improve the simple mean field and Bethe theories, by taking
47: into account correlations on larger and larger distances. It was
48: first proposed by Kikuchi in 1951~\cite{Kikuchi1951} as an
49: approximate evaluation of the thermodynamic weight of the system,
50: and since then it has been reformulated several
51: times~\cite{Morita1972,Schlijper1983,An1988}, mainly to clarify
52: the nature of the approximation and to simplify the way to work it
53: out. Quite a recent formulation~\cite{An1988} shows that the CVM
54: consists in a truncation of the cumulant entropy expansion. Each
55: cumulant is associated to a cluster of sites and the truncation is
56: justified by the expected rapid vanishing of the cumulants upon
57: increasing the cluster size. In this way the CVM can be viewed as
58: a hierarchy of approximations, each one defined by the set of
59: maximal clusters retained in the cumulant expansion, usually
60: denoted as basic clusters. If pairs of nearest neighbor sites are
61: chosen as basic clusters, the CVM coincides with the Bethe
62: approximation. Generally, using larger basic clusters improves the
63: approximation, even if the convergence of the cumulant expansion
64: to the exact entropy has been rigorously proved just in a few
65: cases~\cite{Schlijper1983,Kikuchi1994}.
66: 
67: Due to its relative simplicity and accuracy, the CVM is widely
68: used in a wide range of statistical mechanical applications, to
69: determine both thermodynamic
70: properties~\cite{GiaconiaPagotTetot2000,AstaHoyt2000,SchonInden1996}
71: and phase
72: diagrams~\cite{Kentzinger2000,Lopez-Sandoval1999,Oates1999,BuzanoPretti1997}.
73: The CVM results generally compare well with those of Monte Carlo
74: simulations~\cite{Lopez-Sandoval1999,Oates1999,Lapinskas1998} as
75: well as experimental
76: ones~\cite{ClouetNastarSigli2004,SchonInden2001,Kentzinger2000,GiaconiaPagotTetot2000,Lapinskas1998,SchonInden1996}.
77: Making use of suitable series of CVM approximations, it is also
78: possible to extrapolate quite accurate estimates of critical
79: exponents~\cite{Pelizzola2000,Pelizzola1994,KatoriSuzuki1994,KatoriSuzuki1988}.
80: Recently, it has been shown that the belief propagation algorithm,
81: an approximate method for statistical inference, employed for
82: many technologically relevant problems
83: (image~\cite{TanakaInoueTitterington2003} and signal
84: processing~\cite{Kschnischang2002}, decoding of error-correcting
85: codes~\cite{Kschnischang2002,Frey1998}, machine
86: learning~\cite{Frey1998}), is actually equivalent to the
87: minimization of a Bethe free energy for statistical mechanical
88: models defined on graphs~\cite{YedidiaFreemanWeiss2002}. This fact
89: has opened new research areas both to the application of the CVM
90: as an improvement of the
91: approximation~\cite{YedidiaFreemanWeiss2002}, and to the analysis
92: of efficient minimization
93: algorithms~\cite{Yuille2001,HeskesAlbersKappen2003,PrettiPelizzola2003},
94: mainly due to the fact that belief propagation sometimes fails to
95: converge.
96: 
97: Let us introduce the problem from the CVM point of view. Once the
98: approximate entropy (and hence free energy) for the chosen set of
99: basic clusters has been obtained, one has to face the problem of
100: minimizing a complicated non-convex functional in the basic
101: cluster probability distributions. An algorithm for minimizing
102: such a functional has been proposed by Kikuchi
103: himself~\cite{Kikuchi1974}, and is known as the natural iteration
104: method (NIM). A proof of convergence of this algorithm has been
105: given in the original paper, essentially for the Bethe
106: approximation, which can be easily extended to the Husimi
107: tree~\cite{Pretti2003}. Nevertheless, the range of convergent
108: cases seems to be much wider, so that the natural iteration method
109: might also be of interest for the unconventional applications
110: mentioned above.
111: 
112: In this article we analyze a sufficient condition for the
113: convergence of the NIM. Such a condition is a requirement on the
114: coefficients of the cluster entropy expansion (obtained from the
115: cumulant expansion through a M{\"o}bius inversion~\cite{An1988})
116: and is shown to hold for quite a large variety of approximations
117: that are generally used to treat thermodynamic systems. Namely, we
118: consider: a set of ``plaquette'' approximations on different
119: lattices~\cite{Kikuchi1974,SchonInden1996,BuzanoPretti1997,KingChen1999},
120: Kikuchi's B~and C~hierarchies for the
121: square~\cite{KikuchiBrush1967} and
122: triangular~\cite{PelizzolaPretti1999} lattices, the cube
123: approximation for the simple cubic lattice. As far as the latter
124: case is concerned, we actually analyze a generic hypercube
125: approximation on the hypercubic lattice in $d$~dimensions, showing
126: that the sufficient condition holds for $d \leq 3$. Finally we
127: take into account a recently proposed algorithm for the
128: minimization of the CVM free energy~\cite{HeskesAlbersKappen2003},
129: which allows several alternatives, depending on the possibility of
130: upperbounding the free energy with convex (easy to minimize)
131: functions. We show that one of the best choices is actually
132: equivalent to the natural iteration method.
133: 
134: 
135: \section{The CVM free energy}
136: 
137: As mentioned in the Introduction, the approximate CVM entropy can
138: be written as a linear combination of cluster
139: entropies~\cite{An1988}
140: \begin{equation}
141:   S = \sum_\alpha a_\alpha S_\alpha
142:   ,
143:   \label{eq:entropy}
144: \end{equation}
145: where the sum index~$\alpha$ runs over all basic clusters and
146: their subclusters. We shall always consider clusters in this set
147: only. The cluster entropies are defined as usual
148: \begin{equation}
149:   S_\alpha
150:   =
151:   - \sum_{x_\alpha}
152:   p_\alpha(x_\alpha) \log p_\alpha(x_\alpha)
153:   ,
154: \end{equation}
155: where $p_\alpha(x_\alpha)$ denotes the probability of the
156: configuration~$x_\alpha$ for the cluster~$\alpha$, the sum runs
157: over all possible configurations, and the Boltzmann constant~$k$
158: is set to~$1$ (entropy is measured in natural units). The
159: coefficients can be determined recursively, starting from basic
160: clusters down to subclusters, making use of the following
161: property~\cite{An1988}
162: \begin{equation}
163:   \sum_{\alpha' \supseteq \alpha} a_{\alpha'} = 1
164:   \ \ \forall \alpha
165:   . \label{eq:sumrule}
166: \end{equation}
167: Due to the fact that a basic cluster~$\gamma$ never contains (by
168: definition) another basic cluster, from the above formula we
169: immediately get $a_\gamma = 1 \ \forall \gamma$. Here and in the
170: following, $\gamma$ denotes basic clusters. As far as the
171: hamiltonian is concerned, we assume that it can be written as a
172: sum of contributions $h_\gamma$ from all basic clusters as
173: \begin{equation}
174:   \mathcal{H} = \sum_{\gamma} h_\gamma(x_\gamma)
175:   ,
176: \end{equation}
177: where of course $x_\gamma$ denotes basic cluster configurations.
178: Let us write the whole CVM free energy as a sum over
179: basic clusters, splitting entropy contributions from each
180: subcluster among all basic clusters that contain it (in equal
181: parts). Assuming energies normalized to $kT$, we obtain
182: \begin{equation}
183:   F[p] =
184:   \sum_{\gamma}
185:   \sum_{x_\gamma} p_\gamma(x_\gamma)
186:   \left[
187:   h_\gamma(x_\gamma)
188:   + \log p_\gamma(x_\gamma)
189:   + \sum_{\alpha \subset \gamma} b_\alpha \log p_\gamma(x_\alpha)
190:   \right]
191:   ,
192:   \label{eq:f1}
193: \end{equation}
194: where
195: \begin{equation}
196:   p_\gamma(x_\alpha) \equiv
197:   \sum_{x_{\gamma \setminus \alpha}}
198:   p_\gamma(x_\gamma)
199:   .
200:   \label{eq:margin1}
201: \end{equation}
202: Let us notice that we have defined new coefficients $b_\alpha
203: \equiv a_\alpha / c_\alpha$, where $c_\alpha$ denotes the number
204: of basic clusters that contain~$\alpha$, and we have expressed
205: subcluster probability distributions as marginals of basic cluster
206: distributions, according to Eq.~\eqref{eq:margin1} (the sum runs
207: over configurations $x_{\gamma \setminus \alpha}$ of the basic
208: cluster $\gamma$ minus the subcluster $\alpha$).
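
As an illustration of the above definitions (a worked example added
here, with the labels ``square'', ``pair'' and ``site'' introduced
only for this purpose), consider the square lattice with elementary
squares as basic clusters. A nearest neighbor pair is contained in
$2$~squares, while a site is contained in $4$~pairs and $4$~squares.
Applying the property $\sum_{\alpha' \supseteq \alpha} a_{\alpha'} =
1$ first to pairs and then to sites, with $a_{\mathrm{square}} = 1$,
we get
\begin{eqnarray*}
  &&
  a_{\mathrm{pair}} + 2 = 1
  \ \ \Longrightarrow \ \
  a_{\mathrm{pair}} = -1
  \\
  &&
  a_{\mathrm{site}} + 4 a_{\mathrm{pair}} + 4 = 1
  \ \ \Longrightarrow \ \
  a_{\mathrm{site}} = 1
  .
\end{eqnarray*}
Since $c_{\mathrm{pair}} = 2$ and $c_{\mathrm{site}} = 4$, the
normalized coefficients entering Eq.~\eqref{eq:f1} are
$b_{\mathrm{pair}} = -1/2$ and $b_{\mathrm{site}} = 1/4$.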
209: 
210: 
211: \section{The natural iteration method}
212: 
213: In the above formulation, basic cluster distributions
214: $\{p_\gamma(x_\gamma)\}$ are the variational parameters of the
215: free energy (which is denoted in short by $F[p]$), and the
216: thermodynamic equilibrium state can be determined by minimization
217: with respect to these parameters with suitable normalization and
218: compatibility constraints. By compatibility we mean of course that
219: marginal distributions $p_\gamma(x_\alpha)$ must be the same for
220: all basic clusters $\gamma \supset \alpha$. Let us notice that,
221: for most thermodynamic applications, one usually makes some
222: homogeneity assumption on the system, and this generally reduces
223: the problem to only one or few different basic cluster
224: distributions. Compatibility constraints may be still necessary to
225: impose the required symmetry. We go on with the complete
226: formulation, without loss of generality. The important thing is
227: that in any case we deal with constraints that are linear in the
228: probability distributions (compatibility), possibly with an
229: additive constant (unit) term (normalization). According to the
230: Lagrange method, we transform the constrained minimum problem with
231: respect to $\{p_\gamma(x_\gamma)\}$ to a free minimum problem for
232: an extended functional which depends on additional parameters
233: (Lagrange multipliers). Due to linearity, the extended functional
234: can be written in the form
235: \begin{equation}
236:   \tilde{F}[p,\lambda] = F[p]
237:   - \sum_{\gamma} \sum_{x_\gamma} p_\gamma(x_\gamma) \lambda_\gamma(x_\gamma)
238:   ,
239:   \label{eq:f2}
240: \end{equation}
241: where $\{\lambda_\gamma(x_\gamma)\}$ are the Lagrange multipliers.
242: Of course, $\{\lambda_\gamma(x_\gamma)\}$ are not all independent
243: variables, but internal relationships are system dependent, and we
244: do not analyze them. Let us only notice, for future use, that the
245: difference between the new functional and the original one (the
246: last term in Eq.~\eqref{eq:f2}) is actually independent of the
247: $\{p_\gamma(x_\gamma)\}$ distributions, provided they satisfy the
248: required constraints.
249: 
250: The derivatives of~$\tilde{F}$ with respect
251: to~$p_\gamma(x_\gamma)$ turn out to be
252: \begin{equation}
253:   \frac{\partial \tilde{F}[p,\lambda]}{\partial p_\gamma(x_\gamma)}
254:   =
255:   h_\gamma(x_\gamma)
256:   + \log p_\gamma(x_\gamma)
257:   + \sum_{\alpha \subset \gamma} b_\alpha \log p_\gamma(x_\alpha)
258:   - \lambda_\gamma(x_\gamma)
259:   + \text{const.}
260:   ,
261: \end{equation}
262: where the additive constant is irrelevant and we can absorb it
263: into the Lagrange multipliers. Setting the above derivatives to
264: zero yields the stationarity conditions with respect to probability
265: distributions. The natural iteration method consists in rewriting
266: such equations in a fixed point form, that is
267: \begin{equation}
268:   \hat{p}_\gamma(x_\gamma) =
269:   e^{\lambda_\gamma(x_\gamma) - h_\gamma(x_\gamma)}
270:   \prod_{\alpha \subset \gamma} \left[ p_\gamma(x_\alpha) \right]^{-b_\alpha}
271:   ,
272:   \label{eq:nim}
273: \end{equation}
274: and then solving them by simple iteration. A new estimate of the
275: basic cluster probability distribution $\hat{p}_\gamma(x_\gamma)$
276: is obtained from the previous one $p_\gamma(x_\gamma)$ through its
277: marginals $p_\gamma(x_\alpha)$. The Lagrange multipliers must be
278: determined at each iteration, so that also
279: $\hat{p}_\gamma(x_\gamma)$ satisfies the required constraints.
280: This job can be done in different ways by a nested procedure
281: (inner loop), for instance a Newton-Raphson method or a suitable
282: fixed point method~\cite{Kikuchi1976,PelizzolaPretti1999}. In this
283: paper we do not deal with the determination of Lagrange
284: multipliers, but we only focus on the convergence of the main
285: loop.
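
As a minimal illustration of Eq.~\eqref{eq:nim} (a sketch added here,
assuming a homogeneous lattice with coordination number~$q$ and
nearest neighbor pairs as basic clusters, i.e., the Bethe
approximation), note that each site is contained in $q$~pairs, so
that $a_{\mathrm{site}} = 1-q$, $c_{\mathrm{site}} = q$, and
$b_{\mathrm{site}} = -(q-1)/q$. Denoting by $x_i,x_j$ the
configurations of the two sites of a pair~$\gamma$ (a notation
introduced only for this example), the iteration reads
\begin{equation*}
  \hat{p}_\gamma(x_i,x_j) =
  e^{\lambda_\gamma(x_i,x_j) - h_\gamma(x_i,x_j)}
  \left[ p_\gamma(x_i) \, p_\gamma(x_j) \right]^{(q-1)/q}
  ,
\end{equation*}
where $p_\gamma(x_i)$ and $p_\gamma(x_j)$ are the site marginals of
the previous pair distribution. In this case the only subcluster
coefficient is negative.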
286: 
287: 
288: \section{Sufficient condition for the convergence}
289: 
290: As usual for iterative algorithms designed to minimize functionals
291: that are bounded from below, a proof of convergence can be given
292: by the decreasing of the functional value at each iteration. This
293: is actually the case for the natural iteration method. Let us
294: consider the free energy difference $F[\hat{p}]-F[p]$ for two
295: subsequent iterations $p,\hat{p}$, where $F[p]$ is defined by
296: Eqs.~\eqref{eq:f1} and~\eqref{eq:margin1}. Taking the logarithm of
297: both sides of Eq.~\eqref{eq:nim}, we can rewrite the NIM equations
298: in two different ways, namely
299: \begin{equation}
300:   \log \hat{p}_\gamma(x_\gamma) =
301:   \lambda_\gamma(x_\gamma) - h_\gamma(x_\gamma)
302:   - \sum_{\alpha \subset \gamma} b_\alpha \log p_\gamma(x_\alpha)
303: \end{equation}
304: \begin{equation}
305:   \sum_{\alpha \subset \gamma} b_\alpha \log p_\gamma(x_\alpha)
306:   = \lambda_\gamma(x_\gamma) - h_\gamma(x_\gamma)
307:   - \log \hat{p}_\gamma(x_\gamma)
308:   .
309: \end{equation}
310: Let us substitute the former into $F[\hat{p}]$ and the latter into
311: $F[p]$. Remembering that probability distributions satisfy the
312: constraints, whence the last term on the right hand side of
313: Eq.~\eqref{eq:f2} depends on Lagrange multipliers only, we obtain
314: \begin{equation}
315:   F[\hat{p}]-F[p]
316:   = \sum_\gamma \sum_{x_\gamma}
317:   \left\{
318:   p_\gamma(x_\gamma)
319:   \log \frac{\hat{p}_\gamma(x_\gamma)}{p_\gamma(x_\gamma)}
320:   - \hat{p}_\gamma(x_\gamma) \sum_{\alpha \subset \gamma}
321:   b_\alpha \log \frac{p_\gamma(x_\alpha)}{\hat{p}_\gamma(x_\alpha)}
322:   \right\}
323:   .
324:   \label{eq:deltaf2}
325: \end{equation}
326: Let us consider the inequality $\log \xi \le \xi-1$, observing
327: that equality holds if and only if $\xi=1$. By applying this
328: inequality to the first logarithm (the one involving basic cluster
329: probability distributions) in Eq.~\eqref{eq:deltaf2}, and taking
330: into account that distributions are normalized, we obtain
331: \begin{equation}
332:   F[\hat{p}]-F[p]
333:   \leq
334:   - \sum_\gamma \sum_{x_\gamma}
335:   \hat{p}_\gamma(x_\gamma) \sum_{\alpha \subset \gamma}
336:   b_\alpha \log \frac{p_\gamma(x_\alpha)}{\hat{p}_\gamma(x_\alpha)}
337:   ,
338:   \label{eq:deltaf3}
339: \end{equation}
340: where equality holds if and only if $\hat{p}_\gamma(x_\gamma) =
341: p_\gamma(x_\gamma) \ \forall \gamma,x_\gamma$. The same result
342: could be obtained by observing that actually the upperbounded
343: terms coincide with (minus) the Kullback-Leibler distances between
344: the probability distributions $p_\gamma(x_\gamma)$ and
345: $\hat{p}_\gamma(x_\gamma)$. If all subcluster coefficients
346: $b_\alpha$ were negative, we could apply the same argument to all
347: terms, and the upperbound would be zero. Such a situation occurs
348: for instance in the Bethe~\cite{Kikuchi1974} and Husimi
349: tree~\cite{Pretti2003} approximations, and the proof of
350: convergence would be complete. In a general case we have to
351: require a condition on the $b_\alpha$~coefficients. The basic idea
352: is to ``couple'' smaller cluster terms with a positive coefficient
353: to larger cluster terms with a negative coefficient, yielding a
354: sum of ``negative'' Kullback-Leibler distances (some between
355: conditional probability distributions), which can then be
356: upperbounded by zero. The details are given in the following.
357: 
358: \noindent {\bf Theorem (sufficient condition for the
359: convergence):} Let $\{b_{\alpha^-|\alpha^+}\}$ be a set of non
360: negative coefficients (allocation coefficients), defined for each
361: pair of subclusters $\alpha^-,\alpha^+$, such that
362: $b_{\alpha^-}<0$, $b_{\alpha^+}>0$, and $\alpha^- \supset
363: \alpha^+$. If the following properties hold for all basic
364: clusters~$\gamma$
365: \begin{eqnarray}
366:   b_{\alpha^+} =
367:   & \displaystyle \sum_{\alpha^+ \subset \alpha^- \subset \gamma}
368:   & b_{\alpha^-|\alpha^+}
369:   \ \ \ \ \ \ \forall \alpha^+ \subset \gamma
370:   \label{eq:suffcondplus} \\
371:   -b_{\alpha^-} \geq
372:   & \displaystyle \sum_{\alpha^+ \subset \alpha^-}
373:   & b_{\alpha^-|\alpha^+}
374:   \ \ \ \ \ \ \forall \alpha^- \subset \gamma
375:   ,
376:   \label{eq:suffcondminus}
377: \end{eqnarray}
378: then
379: \begin{eqnarray}
380:   &&
381:   F[\hat{p}]-F[p] \le 0
382:   \label{eq:deltafle0} \\
383:   &&
384:   F[\hat{p}]-F[p] = 0 \ \ \Longleftrightarrow \ \
385:   \hat{p} = p
386:   .
387:   \label{eq:deltafeq0iff}
388: \end{eqnarray}
389: Eq.~(\ref{eq:deltafle0}) means that the free energy can be
390: decreasing or constant during the procedure, while
391: Eq.~(\ref{eq:deltafeq0iff}) ensures that it is constant only if
392: the procedure has already reached a fixed point (i.e., the free
393: energy strictly decreases until then). A relevant
394: consequence of Eq.~\eqref{eq:deltafeq0iff} is that it prevents the
395: dynamical system defined by the NIM equations from having limit
396: cycles at constant free energy, which could occur in principle.
397: 
398: \noindent {\bf Proof:} Let us consider the right hand side of
399: Eq.~\eqref{eq:deltaf3} and split the sum over subclusters $\alpha
400: \subset \gamma$ in two sums over subclusters $\alpha^+,\alpha^-$
401: with positive or negative coefficients respectively. Positive
402: coefficients $b_{\alpha^+}$ can be replaced by
403: Eq.~\eqref{eq:suffcondplus}, while, according to
404: Eq.~\eqref{eq:suffcondminus}, negative coefficients can be
405: replaced by
406: \begin{equation}
407:   b_{\alpha^-} =
408:   - \sum_{\alpha^+ \subset \alpha^-}
409:   b_{\alpha^-|\alpha^+}
410:   - d_{\alpha^-}
411:   ,
412: \end{equation}
413: for certain $d_{\alpha^-} \geq 0$. Defining, for each $\alpha^-
414: \supset \alpha^+$, the conditional probability distributions
415: \begin{equation}
416:   p_\gamma(x_{\alpha^-}|x_{\alpha^+})
417:   \equiv
418:   \frac{p_\gamma(x_{\alpha^-})}{p_\gamma(x_{\alpha^+})}
419:   ,
420: \end{equation}
421: after some simple manipulations we obtain
422: \begin{equation}
423:   - \sum_{\alpha \subset \gamma}
424:   b_\alpha \log \frac{p_\gamma(x_\alpha)}{\hat{p}_\gamma(x_\alpha)}
425:   = \sum_{\alpha^- \subset \gamma}
426:   \left[
427:   d_{\alpha^-}
428:   \log \frac{p_\gamma(x_{\alpha^-})}{\hat{p}_\gamma(x_{\alpha^-})}
429:   + \sum_{\alpha^+ \subset \alpha^-}
430:   b_{\alpha^-|\alpha^+}
431:   \log \frac{p_\gamma(x_{\alpha^-}|x_{\alpha^+})}{\hat{p}_\gamma(x_{\alpha^-}|x_{\alpha^+})}
432:   \right]
433:   .
434: \end{equation}
435: The logarithm inequality $\log \xi \le \xi-1$ can now be applied
436: to all terms in the previous equation, because all coefficients
437: are positive (or equivalently we get a sum of Kullback-Leibler
438: terms), and the zero upperbound of Eq.~\eqref{eq:deltafle0} is
439: obtained. As previously mentioned, Eq.~\eqref{eq:deltafeq0iff} is
440: proved by the fact that the logarithm inequality holds if and only
441: if $\xi = 1$, i.e., the Kullback-Leibler distance between two
442: probability distributions is zero if and only if the two
443: distributions are equal.~$\blacksquare$
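
\noindent {\bf Remark:} For completeness, the last step can be made
explicit for a single conditional term (a sketch added for
illustration, assuming strictly positive marginals so that all
ratios are well defined). Since the logarithm depends on $x_\gamma$
only through $x_{\alpha^-}$, the inequality $\log \xi \le \xi-1$
gives
\begin{eqnarray*}
  \sum_{x_\gamma} \hat{p}_\gamma(x_\gamma)
  \log \frac{p_\gamma(x_{\alpha^-}|x_{\alpha^+})}{\hat{p}_\gamma(x_{\alpha^-}|x_{\alpha^+})}
  & \leq &
  \sum_{x_{\alpha^-}} \hat{p}_\gamma(x_{\alpha^-})
  \left[
  \frac{p_\gamma(x_{\alpha^-}|x_{\alpha^+})}{\hat{p}_\gamma(x_{\alpha^-}|x_{\alpha^+})}
  - 1
  \right]
  \\
  & = &
  \sum_{x_{\alpha^+}} \hat{p}_\gamma(x_{\alpha^+})
  \sum_{x_{\alpha^- \setminus \alpha^+}}
  p_\gamma(x_{\alpha^-}|x_{\alpha^+})
  - 1
  = 0
  ,
\end{eqnarray*}
where we have used $\hat{p}_\gamma(x_{\alpha^-}) =
\hat{p}_\gamma(x_{\alpha^+}) \,
\hat{p}_\gamma(x_{\alpha^-}|x_{\alpha^+})$ and the fact that both
the conditional distribution $p_\gamma(x_{\alpha^-}|x_{\alpha^+})$
and the marginals of $\hat{p}_\gamma$ are normalized.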
444: 
445: 
446: \section{Some particular cases}
447: 
448: In this section we consider some particular choices of basic
449: clusters, that is, some particular CVM approximations for regular
450: lattices on which several model systems are defined.
451: 
452: \subsection{``Plaquette'' approximations}
453: 
454: By ``plaquette'' approximations we mean a class of approximations
455: in which basic clusters are of a unique type (which we denote as
456: plaquette, for example a square on a square lattice), while
457: subclusters with non zero coefficients are only single sites and
458: nearest neighbor pairs. Let us denote such clusters by $1$ and $2$
459: respectively, and, according to the notation introduced in
460: Sec.~II, let us denote by $a_1$ and $a_2$ the coefficients of the
461: cluster entropy expansion, by $c_1$ and $c_2$ the numbers of
462: plaquettes sharing a given subcluster, and by $b_i = a_i/c_i$ the
463: normalized coefficients. In this class of approximations, it is
464: possible to show that all the coefficients can be obtained as a
465: function of $c_1,c_2$ and of the lattice coordination number~$q$.
466: Making use of Eq.~\eqref{eq:sumrule}, and remembering that basic
467: clusters (plaquettes) have unit $a$-coefficient, we can write
468: \begin{eqnarray}
469:   &&
470:   a_2 + c_2 = 1
471:   \\
472:   &&
473:   a_1 + qa_2 + c_1 = 1
474:   ,
475: \end{eqnarray}
476: from which $b_i = a_i/c_i$ are easily obtained:
477: \begin{eqnarray}
478:   b_2 & = & -\frac{c_2-1}{c_2}
479:   \label{eq:b2plaq} \\
480:   b_1 & = & \frac{q(c_2-1)-(c_1-1)}{c_1}
481:   .
482: \end{eqnarray}
483: Then, we have to impose the sufficient conditions on the
484: coefficients, Eqs.~\eqref{eq:suffcondplus}
485: and~\eqref{eq:suffcondminus}. From Eq.~\eqref{eq:b2plaq} we easily
486: see that~$b_2 \leq 0$, which is suitable for upperbounding, but
487: usually~$b_1 \geq 0$. We then have to couple each site to pairs
488: that contain it and are contained in a given plaquette. Let us
489: adopt the strategy of splitting the site coefficient among such
490: pairs in equal parts, so that, with $b_{2|1}$ the only allocation
491: coefficient and $r$ the number of pairs,
492: Eqs.~\eqref{eq:suffcondplus} and~\eqref{eq:suffcondminus} read
493: \begin{eqnarray}
494:   b_1 & = & r b_{2|1}
495:   \\
496:   -b_2 & \geq & 2 b_{2|1}
497:   .
498: \end{eqnarray}
499: The allocation coefficient may be easily eliminated, yielding the
500: single condition
501: \begin{equation}
502:   \frac{b_1}{r} + \frac{b_2}{2} \leq 0
503:   .
504:   \label{eq:condsuffplaq}
505: \end{equation}
506: It is possible to show that also the $r$~parameter depends on
507: $c_1,c_2,q$ only. Let us multiply the number~$q$ of
508: nearest neighbor pairs sharing a site by the number $c_2$ of
509: plaquettes sharing a pair. It is easy to see that in this way
510: we have {\em overcounted} $r$~times the number $c_1$ of plaquettes
511: sharing the given site, i.e.,
512: \begin{equation}
513:   rc_1 = qc_2
514:   .
515: \end{equation}
516: With the above manipulation, the condition~\eqref{eq:condsuffplaq}
517: can be rewritten as
518: \begin{equation}
519:   q(c_2-1) \leq 2(c_1-1)
520:   .
521: \end{equation}
522: In this form we can easily verify its validity, which is done in
523: Tab.~\ref{tab:coefficients} for a set of typical plaquette
524: approximations. We have considered: the 2d square, triangular, and
525: honeycomb lattices with a 4-site
526: square~\cite{BuzanoPretti1997,KingChen1999}, a 3-site
527: triangle~\cite{KingChen1999}, and an elementary hexagon as basic
528: cluster respectively, the simple cubic (sc) lattice with a 4-site
529: square~\cite{KingChen1999} as basic cluster, and the face-centered
530: cubic (fcc) lattice with a 3-site triangle~\cite{KingChen1999} or
531: a 4-site tetrahedron~\cite{Kikuchi1974,SchonInden1996} as basic
532: cluster.
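
As a worked instance (added here for illustration; the counting
below is standard fcc lattice geometry, stated as an assumption of
this example), consider the tetrahedron approximation on the fcc
lattice. Each site has $q=12$ nearest neighbors, each nearest
neighbor pair is shared by $c_2=2$ tetrahedra, and each site by
$c_1=8$ tetrahedra, so that
\begin{equation*}
  b_2 = -\frac{1}{2}
  , \qquad
  b_1 = \frac{12 \cdot 1 - 7}{8} = \frac{5}{8}
  , \qquad
  r = \frac{q c_2}{c_1} = 3
  .
\end{equation*}
The condition~\eqref{eq:condsuffplaq} then reads $5/24 - 1/4 = -1/24
\leq 0$, or equivalently $q(c_2-1) = 12 \leq 14 = 2(c_1-1)$, and is
therefore satisfied.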
533: 
534: \subsection{B and C hierarchies}
535: 
536: The B and C~hierarchies, originally proposed by Kikuchi and
537: Brush~\cite{KikuchiBrush1967}, are series of approximations with
538: increasing cluster size, suitable for 2d
539: square~\cite{KikuchiBrush1967} and
540: triangular~\cite{PelizzolaPretti1999} lattices. They are
541: interesting mainly because they converge towards the exact free
542: energy, in spite of the fact that the cluster size increases only
543: in one direction. This result has been proved rigorously only for
544: the C~hierarchy~\cite{Schlijper1983}, but there is numerical
545: evidence for both~\cite{KikuchiBrush1967,PelizzolaPretti1999}.
546: Such results~\cite{Schlijper1983} are related to the transfer
547: matrix concept: As the Bethe approximation solves exactly an
548: Ising-like chain, the CVM, with infinitely long 1d stripes as
549: basic clusters (to which the B and C~hierarchies tend), solves
550: exactly a 2d lattice. Here we are interested in showing that these
551: approximations verify the sufficient condition for the convergence
552: discussed above. Let us consider for instance the B~hierarchy on
553: the triangular lattice (a completely analogous treatment holds for
554: the C~hierarchy and/or for the square lattice). The basic
555: clusters, shown in Fig.~\ref{fig:gerb} (top row, left column), are
556: made up of a sequence of $L-1$~up- and $L$~down-pointing
557: triangles, where $L$ is an adjustable parameter. Of course, also
558: corresponding clusters with $L$~up- and $L-1$~down-pointing
559: triangles are allowed, but all basic clusters always extend only
560: in one direction. This choice can be viewed as a generalization of
561: the triangle plaquette approximation (see Fig.~\ref{fig:gerb}, top
562: row, right column), where of course also up-pointing triangles are
563: included in the set of basic clusters. In the following rows of
564: Fig.~\ref{fig:gerb} also the subclusters of the given basic
565: cluster, having nonzero coefficients in the cluster entropy
566: expansion ($a$-coefficients), are displayed. They are divided in
567: pair-like and site-like subclusters, in that they can be put in
568: one-to-one correspondence with pair and site subclusters for the
569: triangle plaquette approximations. Such an analogy is not only a
570: pictorial one. In fact, it is possible to show (for instance
571: making use of Eq.~\eqref{eq:sumrule}, but see also
572: Ref.~\cite{KikuchiBrush1967}) that the $a$-coefficients are
573: $a_2=-1$ for pair-like clusters and $a_1=1$ for site-like
574: clusters, like for the triangle plaquette approximation. The same
575: holds for $c$-coefficients, i.e., the numbers of basic clusters
576: sharing a given subcluster, which turn out to be $c_2=2$ and
577: $c_1=6$ respectively, whence $b_2=-1/2$ and $b_1=1/6$. Finally,
578: from Fig.~\ref{fig:gerb} one easily sees that also the same
579: ``allocation'' technique as for the plaquette approximation can be
580: used. Inside a given basic cluster, each site-like subcluster is
581: shared by $r=2$ pair-like clusters, and each pair-like cluster
582: contains 2 site-like subclusters, whence
583: inequality~\eqref{eq:condsuffplaq} is satisfied.
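
Explicitly (a short check added for illustration), with
$b_1=1/6$, $b_2=-1/2$ and $r=2$ the allocation coefficient is
$b_{2|1} = b_1/r = 1/12$, and
\begin{equation*}
  \frac{b_1}{r} + \frac{b_2}{2}
  = \frac{1}{12} - \frac{1}{4}
  = -\frac{1}{6}
  \leq 0
  .
\end{equation*}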
584: 
585: \subsection{Hypercube approximation in $d$ dimensions}
586: 
587: Finally, let us consider the case of a hypercubic lattice in
588: $d$~dimensions, and let us choose a $d$-dimensional hypercube
589: ($d$-cube) as basic cluster. Of course, the relevant cases are
590: $d=2,3$, the former of which coincides with the square plaquette
591: approximation, mentioned above, but the interest of a general
592: treatment will be clearer later. It is possible to show, by
593: repeated use of Eq.~\eqref{eq:sumrule}, that clusters with non
594: zero coefficients are only $i$-cubes, for $i=0,\dots,d$, and the
595: $i$-cube coefficient in $d$ dimensions is $a_i^{(d)} =
596: (-1)^{d-i}$. Moreover, the number of $d$-cubes sharing a given
597: $i$-cube (in $d$ dimensions) is $c_i^{(d)} = 2^{d-i}$. As a
598: consequence, the normalized coefficients turn out to be
599: \begin{equation}
600:   b_i^{(d)} = \left( -\frac{1}{2} \right)^{d-i}
601:   .
602:   \label{eq:bcoeff_hcube}
603: \end{equation}
604: Let us now impose the sufficient conditions,
605: Eqs.~\eqref{eq:suffcondplus} and~\eqref{eq:suffcondminus}. Let us
606: notice that the positive coefficients, those which cause problems for
607: upperbounding, have the $i$~index with the same parity as~$d$,
608: that is $i=d-2,d-4,\dots$. Then we can couple each $i$-cube with
609: $(i+1)$-cubes that contain it and are contained in a given
610: $d$-cube. As for plaquette approximations, let us split the
611: $i$-cube coefficient in equal parts, so that we have a single
612: $b_{i+1|i}^{(d)}$ allocation coefficient. We still have to observe
613: that each $i$-cube is shared by $d-i$ $(i+1)$-cubes contained in
614: the same $d$-cube (the equivalent of the $r$~parameter for
615: plaquette approximations), and that each $(i+1)$-cube contains
616: $2(i+1)$ different $i$-cubes (the equivalent of $2$~sites in a
617: pair). We can then rewrite Eqs.~\eqref{eq:suffcondplus}
618: and~\eqref{eq:suffcondminus} as
619: \begin{eqnarray}
620:   b_i^{(d)} & = & (d-i) \, b_{i+1|i}^{(d)}
621:   \\
622:   -b_{i+1}^{(d)} & \geq & 2(i+1) \, b_{i+1|i}^{(d)}
623:   .
624: \end{eqnarray}
625: By eliminating the allocation coefficient, we obtain
626: \begin{equation}
627:   \frac{b_i^{(d)}}{d-i} + \frac{b_{i+1}^{(d)}}{2(i+1)} \leq 0
628:   ,
629:   \label{eq:condsuffhcube}
630: \end{equation}
631: which, replacing Eq.~\eqref{eq:bcoeff_hcube} and taking into
632: account that $d-i$ is always even (as previously mentioned),
633: becomes
634: \begin{equation}
635:   2i \leq d-1
636:   .
637: \end{equation}
638: Such an inequality becomes more and more difficult to satisfy as
639: the subcluster index~$i$ increases. Therefore we have to consider
640: the worst case, that is $i=d-2$, leading to
641: \begin{equation}
642:   d \leq 3
643:   .
644: \end{equation}
645: This result essentially proves the convergence for $d=3$, because
646: the $d=2$ case coincides with the square plaquette approximation.
647: Nevertheless, it is mainly interesting in that it gives us the
648: opportunity to test the natural iteration method in a case
649: in which the sufficient condition is not verified. We have
650: actually implemented the procedure for the simple Ising model on
651: the $d=4$ hypercubic lattice, easily finding cases in which the
652: behavior is non convergent (oscillating). This fact leads us to
653: conjecture that actually the sufficient condition might be also a
654: necessary one.
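
To make the borderline explicit (a numerical check added for
illustration, with sites counted as $0$-cubes and the equal-splitting
allocation used above), let us evaluate the left hand side of
Eq.~\eqref{eq:condsuffhcube} with the coefficients of
Eq.~\eqref{eq:bcoeff_hcube}. For $d=3$ the only positive subcluster
coefficient is $b_1^{(3)}=1/4$, and
\begin{equation*}
  \frac{b_1^{(3)}}{2} + \frac{b_2^{(3)}}{4}
  = \frac{1}{8} - \frac{1}{8}
  = 0
  ,
\end{equation*}
so that the condition is satisfied as an equality. For $d=4$ the
positive subcluster coefficients are $b_2^{(4)}=1/4$ and
$b_0^{(4)}=1/16$, and
\begin{eqnarray*}
  \frac{b_2^{(4)}}{2} + \frac{b_3^{(4)}}{6}
  & = & \frac{1}{8} - \frac{1}{12}
  = \frac{1}{24} > 0
  \\
  \frac{b_0^{(4)}}{4} + \frac{b_1^{(4)}}{2}
  & = & \frac{1}{64} - \frac{1}{16}
  = -\frac{3}{64} \leq 0
  ,
\end{eqnarray*}
so that the condition fails for $i=2$, while it still holds for
$i=0$.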
655: 
656: 
657: \section{An equivalent formulation}
658: 
In a recent paper~\cite{HeskesAlbersKappen2003}, a general method
for the minimization of non-convex functionals, based on the
existence of suitable upper bounds to the functional to be
minimized, is proposed and applied to the case of the CVM free
energy. Different possible choices for the upper-bounding
functional are investigated. Hereafter, we show that one choice
proposed there, which incidentally turns out to be quite convenient
in terms of computation time, is equivalent to the natural
iteration method. First, let us briefly recall the general method,
which is based on the following theorem.
669: 
670: \noindent {\bf Theorem:} \ Let $F[p]$ be a continuous functional
671: in the set of variables $p$, defined in some compact
672: domain~$\Omega$, and $\bar{F}[p,p']$ an auxiliary continuous
673: functional in a pair of variable sets $p,p'$, defined in the
674: domain~$\Omega^2$, having a unique minimum with respect to~$p'$
675: for each fixed~$p$. Let the auxiliary functional satisfy the
676: following requirements:
677: \begin{eqnarray}
678:   &&
679:   F[p'] \leq \bar{F}[p,p']
680:   \label{eq:flefbar} \\ &&
681:   F[p'] = \bar{F}[p,p'] \ \ \Longleftrightarrow \ \ p' = p
682:   ,
683:   \label{eq:feqfbariff}
684: \end{eqnarray}
that is, the auxiliary functional is an upper bound to the original
functional, and equality holds if and only if the two arguments of
the former are equal. Then the map $\varphi: p \mapsto
\hat{p}$ defined by
689: \begin{equation}
690:   \hat{p} = \arg\min_{p' \in \Omega} \bar{F}[p,p']
691:   \label{eq:application}
692: \end{equation}
693: enjoys the properties
694: \begin{eqnarray}
695:   &&
696:   F[\hat{p}] \leq F[p]
697:   \label{eq:flef} \\ &&
698:   F[\hat{p}] = F[p] \ \ \Longleftrightarrow \ \ \hat{p} = p
699:   .
700:   \label{eq:feqfiff}
701: \end{eqnarray}
702: Therefore, it defines an iterative method to minimize the original
703: functional.
704: 
705: \noindent {\bf Proof:} \ It is easy to obtain the following
706: inequality chain
707: \begin{equation}
708:   F[\hat{p}] \leq \bar{F}[p,\hat{p}] \leq \bar{F}[p,p] = F[p]
709:   ,
710:   \label{eq:ineqchain}
711: \end{equation}
which immediately proves Eq.~\eqref{eq:flef}. The first inequality is
the first hypothesis on the auxiliary functional~$\bar{F}$,
Eq.~\eqref{eq:flefbar}; the second inequality is a consequence of
the definition of~$\varphi$, Eq.~\eqref{eq:application}; the
equality follows from the second hypothesis on~$\bar{F}$,
Eq.~\eqref{eq:feqfbariff}. In order to prove also
Eq.~\eqref{eq:feqfiff}, we have to show that both inequalities
hold as equalities if and only if~$\hat{p} = p$. For the first
inequality, this is a direct consequence of the
hypothesis Eq.~\eqref{eq:feqfbariff}, while for the second it follows
from the fact that~$\bar{F}[p,p']$ has a unique minimum, which is
also the absolute minimum, with respect to~$p'$.~$\blacksquare$
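
\noindent {\bf Example:} \ As a minimal illustration of the theorem
(a toy case chosen here only for clarity; it is not taken from
Ref.~\cite{HeskesAlbersKappen2003} and is unrelated to the CVM free
energy), consider a single real variable $p \in \Omega = [-2,2]$
with
\begin{equation*}
  F[p] = \frac{p^4}{4} - \frac{p^2}{2}
  , \qquad
  \bar{F}[p,p'] = \frac{(p')^4}{4} - p \, p' + \frac{p^2}{2}
  ,
\end{equation*}
where $\bar{F}$ is obtained by linearizing the concave term
$-(p')^2/2$ around $p'=p$. One finds
\begin{equation*}
  F[p'] - \bar{F}[p,p'] = - \frac{(p'-p)^2}{2} \leq 0
  ,
\end{equation*}
which vanishes if and only if $p'=p$, so that both hypotheses on the
auxiliary functional hold. Since $\bar{F}$ is strictly convex
in~$p'$, its unique minimum is the solution of $\hat{p}^3 = p$, that
is $\hat{p} = \mathrm{sgn}(p) \, |p|^{1/3}$, which lies
inside~$\Omega$. Iterating this map from any $p \neq 0$ drives $p$
towards $\pm 1$, the two minima of~$F$, with $F$ decreasing at each
step; for instance, $p=2$ is mapped to $2^{1/3} \approx 1.26$. The
point $p=0$, a local maximum of~$F$, is a fixed point of the map.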
724: 
725: Let us now consider the auxiliary functional defined by
726: \begin{equation}
727:   \bar{F}[p,p'] =
728:   \sum_{\gamma}
729:   \sum_{x_\gamma} p'_\gamma(x_\gamma)
730:   \left[
731:   h_\gamma(x_\gamma)
732:   + \log p'_\gamma(x_\gamma)
733:   + \sum_{\alpha \subset \gamma} b_\alpha \log p_\gamma(x_\alpha)
734:   \right]
735:   .
736: \end{equation}
First of all, it is easy to see that $\bar{F}[p,p] = F[p]$, where
$F[p]$ is the CVM free energy~\eqref{eq:f1}. Moreover, $\bar{F}[p,p']$
is easily seen to be convex with respect to~$p'$; therefore, if it
has a stationary point, this is unique and is a minimum.
Finally, let us observe that making this functional stationary
with respect to~$p'$, with the usual linear constraints, gives
rise precisely to the NIM equations~\eqref{eq:nim}, which can thus
be used to define the map~$\varphi$. In order to show
that $\varphi$ actually performs a minimization of~$F$, a
sufficient condition is given by
Eqs.~\eqref{eq:flefbar} and~\eqref{eq:feqfbariff} in the above
theorem, that is, we have to bound from above the quantity
\begin{equation}
  F[p'] - \bar{F}[p,p'] =
  - \sum_\gamma \sum_{x_\gamma}
  p'_\gamma(x_\gamma)
  \sum_{\alpha \subset \gamma}
  b_\alpha \log \frac{p_\gamma(x_\alpha)}{p'_\gamma(x_\alpha)}
  \label{eq:deltaf4}
\end{equation}
by zero. Going back to (the right-hand side of)
Eq.~\eqref{eq:deltaf3}, it easily turns out that this is exactly
the same upper bound we have proved with the sufficient condition
for the convergence of the NIM.
761: 
762: 
763: \section{Conclusions}
764: 
Let us finally summarize our results. We have investigated the
convergence of the natural iteration method, proposed by Kikuchi
as a minimization procedure for cluster variational free energies
and widely employed in applications of the CVM. We have
769: discussed a condition on the coefficients of the cluster entropy
770: expansion, which is sufficient to prove that the free energy
771: decreases at each iteration, ensuring the convergence of the
772: method. Such a condition is based on the idea of pairing
773: subcluster entropies with a positive coefficient to larger
774: subcluster terms with a negative coefficient, yielding a set of
775: conditional entropy terms with negative coefficients. It had
776: already been proved by Kikuchi in the original
777: paper~\cite{Kikuchi1974} that negative coefficient terms give
778: decreasing contributions to the free energy. We have also taken
779: into account a set of common CVM approximations defined on various
780: regular lattices, frequently encountered in applications, showing
781: that the sufficient condition is always satisfied. In particular,
782: we have devoted some attention to the class of hypercube
783: approximations on the generic ($d$-dimensional) hypercubic
784: lattice, showing that the sufficient condition is verified for $d
785: \leq 3$. We have also implemented the natural iteration method for
786: $d=4$ on the simple Ising model, and found out that several
787: (random as well as uniform) initial conditions give rise to non
788: convergent (oscillating) behavior. This fact has led us to
789: conjecture that the sufficient condition may be also a necessary
790: one. Finally we have established a connection with a recently
791: proposed method for the minimization of non-convex functionals,
792: which can be applied to the CVM free
energy~\cite{HeskesAlbersKappen2003}. Such a method is based on
the existence of suitable upper bounds to the
functional to be minimized. In Ref.~\cite{HeskesAlbersKappen2003}
several choices of upper-bounding functionals are proposed and
applied to simple inhomogeneous systems. We have shown that one of
the choices proposed there (indeed quite a good
choice in terms of computation time) is actually equivalent to
Kikuchi's natural iteration method. It turns out explicitly that
the upper-bounding condition implies that the free energy decreases
at each iteration, whence convergence.
803: 
804: \begin{acknowledgments}
805: I would like to express my thanks to Dr. Alessandro Pelizzola for
806: many helpful suggestions and discussions.
807: \end{acknowledgments}
808: 
809: %\bibliography{../../bibliography}
810: 
811: \input{nim.bbl}
812: 
813: \clearpage
814: 
815: \begin{table}[p]
816:   \caption{
817:     Coefficients for different plaquette approximations. The first
818:     two columns report respectively the lattice and plaquette (basic cluster) type.
819:     The following three columns display the independent
820:     coefficients: $q$ (coordination number), $c_2,c_1$ (number of
821:     plaquettes sharing a given pair, site). The last two columns
822:     verify the sufficient condition, in that $q(c_2-1) < 2(c_1-1)$.
823:   }
824:   \begin{ruledtabular}
825:   \begin{tabular}{ll|rrr|rr}
826:     lattice    & plaquette   & $q$ & $c_2$ & $c_1$ & $q(c_2-1)$ & $2(c_1-1)$ \cr
827:     \hline
828:     square     & square      &  4 & 2 &  4 &  4 &  6 \cr
829:     triangular & triangle    &  6 & 2 &  6 &  6 & 10 \cr
830:     honeycomb  & hexagon     &  3 & 2 &  3 &  3 &  4 \cr
831:     sc         & square      &  6 & 4 & 12 & 18 & 22 \cr
832:     fcc        & triangle    & 12 & 4 & 24 & 36 & 46 \cr
833:     fcc        & tetrahedron & 12 & 2 &  8 & 12 & 14
834:   \end{tabular}
835:   \end{ruledtabular}
836:   \label{tab:coefficients}
837: \end{table}
838: 
839: \clearpage
840: 
841: \begin{figure}[p]
842: %  \includegraphics*[10mm,100mm][110mm,260mm]{gerb.ps}
843: 
844:   \setlength{\unitlength}{1.2mm}
845: 
846:   \begin{picture}(150,140)(-10,-140)
847: 
848:   \thicklines
849: 
850:   % basic clusters
851:   \put(-3,-12){\makebox(15,2)[lb]{\sf BASIC CLUSTER}}
852:   \put(52,-12){\makebox(15,2)[lb]{\sf PLAQUETTE}}
853: 
854:   % C-hierarchy
855:   % sites
856:   \multiput(0,-20)(9,0){5}{\circle*{2}}
857:   \multiput(4.5,-27.5)(9,0){4}{\circle*{2}}
858:   % horizontal bonds
859:   \put(0,-20){\line(1,0){9}}
860:   \put(9,-20){\line(1,0){9}}
861:   \put(18,-20){\line(1,0){3.5}}
862:   \put(27,-20){\line(-1,0){3.5}}
863:   \put(36,-20){\line(-1,0){9}}
864:   \put(4.5,-27.5){\line(1,0){9}}
865:   \put(13.5,-27.5){\line(1,0){9}}
866:   \put(22.5,-27.5){\line(1,0){3.5}}
867:   \put(31.5,-27.5){\line(-1,0){3.5}}
868:   % oblique bonds
869:   \multiput(0,-20)(9,0){4}{\line(3,-5){4.5}}
870:   \put(4.5,-27.5){\line(3,5){4.5}}
871:   \put(13.5,-27.5){\line(3,5){4.5}}
872:   \put(22.5,-27.5){\line(3,5){1.8}}
873:   \put(27,-20){\line(-3,-5){1.8}}
874:   \put(31.5,-27.5){\line(3,5){4.5}}
875:   % indices
876:   \put(-1,-18){\makebox(2,2)[lb]{$1$}}
877:   \put(8,-18){\makebox(2,2)[lb]{$3$}}
878:   \put(17,-18){\makebox(2,2)[lb]{$5$}}
879:   \put(19.5,-18){\makebox(3,2)[lb]{$\dots$}}
880:   \put(24,-18){\makebox(5,2)[lb]{$2L-1$}}
881:   \put(35,-18){\makebox(5,2)[lb]{$2L+1$}}
882:   \put(3.5,-31.5){\makebox(2,2)[lt]{$2$}}
883:   \put(12.5,-31.5){\makebox(2,2)[lt]{$4$}}
884:   \put(21.5,-31.5){\makebox(2,2)[lt]{$6$}}
885:   \put(25,-33){\makebox(3,2)[lt]{$\dots$}}
886:   \put(30.5,-31.5){\makebox(3,2)[lt]{$2L$}}
887: 
888:   % triangle approximation
889:   % sites
890:   \multiput(55,-20)(9,0){2}{\circle*{2}}
891:   \put(59.5,-27.5){\circle*{2}}
892:   % horizontal bonds
893:   \put(55,-20){\line(1,0){9}}
894:   % oblique bonds
895:   \multiput(55,-20)(9,0){1}{\line(3,-5){4.5}}
896:   \put(59.5,-27.5){\line(3,5){4.5}}
897:   % indices
898:   \put(54,-18){\makebox(2,2)[lb]{$1$}}
899:   \put(63,-18){\makebox(2,2)[lb]{$3$}}
900:   \put(58.5,-31.5){\makebox(2,2)[lt]{$2$}}
901: 
902:   % pair-like clusters
903:   \put(-3,-42){\makebox(15,2)[lb]{\sf PAIR-LIKE CLUSTERS}}
904:   \put(52,-42){\makebox(15,2)[lb]{\sf PAIRS}}
905: 
906:   % cluster 12 (b-hierarchy)
907:   % sites
908:   \multiput(0,-50)(9,0){4}{\circle*{2}}
909:   \multiput(4.5,-57.5)(9,0){4}{\circle*{2}}
910:   % horizontal bonds
911:   \put(0,-50){\line(1,0){9}}
912:   \put(9,-50){\line(1,0){9}}
913:   \put(18,-50){\line(1,0){3.5}}
914:   \put(27,-50){\line(-1,0){3.5}}
915:   \put(4.5,-57.5){\line(1,0){9}}
916:   \put(13.5,-57.5){\line(1,0){9}}
917:   \put(22.5,-57.5){\line(1,0){3.5}}
918:   \put(31.5,-57.5){\line(-1,0){3.5}}
919:   % oblique bonds
920:   \multiput(0,-50)(9,0){4}{\line(3,-5){4.5}}
921:   \put(4.5,-57.5){\line(3,5){4.5}}
922:   \put(13.5,-57.5){\line(3,5){4.5}}
923:   \put(22.5,-57.5){\line(3,5){1.8}}
924:   \put(27,-50){\line(-3,-5){1.8}}
925:   % indices
926:   \put(-1,-48){\makebox(2,2)[lb]{$1$}}
927:   \put(8,-48){\makebox(2,2)[lb]{$3$}}
928:   \put(17,-48){\makebox(2,2)[lb]{$5$}}
929:   \put(19.5,-48){\makebox(3,2)[lb]{$\dots$}}
930:   \put(24,-48){\makebox(5,2)[lb]{$2L-1$}}
931:   \put(3.5,-61.5){\makebox(2,2)[lt]{$2$}}
932:   \put(12.5,-61.5){\makebox(2,2)[lt]{$4$}}
933:   \put(21.5,-61.5){\makebox(2,2)[lt]{$6$}}
934:   \put(25,-63){\makebox(3,2)[lt]{$\dots$}}
935:   \put(30.5,-61.5){\makebox(3,2)[lt]{$2L$}}
936: 
937:   % cluster 12 (triangle approximation)
938:   % sites
939:   \put(55,-50){\circle*{2}}
940:   \put(59.5,-57.5){\circle*{2}}
941:   % oblique bonds
942:   \put(55,-50){\line(3,-5){4.5}}
943:   % indices
944:   \put(54,-48){\makebox(2,2)[lb]{$1$}}
945:   \put(58.5,-61.5){\makebox(2,2)[lt]{$2$}}
946: 
947:   % cluster 23 (C-hierarchy)
948:   % sites
949:   \multiput(9,-70)(9,0){4}{\circle*{2}}
950:   \multiput(4.5,-77.5)(9,0){4}{\circle*{2}}
951:   % horizontal bonds
952:   \put(9,-70){\line(1,0){9}}
953:   \put(18,-70){\line(1,0){3.5}}
954:   \put(27,-70){\line(-1,0){3.5}}
955:   \put(36,-70){\line(-1,0){9}}
956:   \put(4.5,-77.5){\line(1,0){9}}
957:   \put(13.5,-77.5){\line(1,0){9}}
958:   \put(22.5,-77.5){\line(1,0){3.5}}
959:   \put(31.5,-77.5){\line(-1,0){3.5}}
960:   % oblique bonds
961:   \multiput(9,-70)(9,0){3}{\line(3,-5){4.5}}
962:   \put(4.5,-77.5){\line(3,5){4.5}}
963:   \put(13.5,-77.5){\line(3,5){4.5}}
964:   \put(22.5,-77.5){\line(3,5){1.8}}
965:   \put(27,-70){\line(-3,-5){1.8}}
966:   \put(31.5,-77.5){\line(3,5){4.5}}
967:   % indices
968:   \put(8,-68){\makebox(2,2)[lb]{$3$}}
969:   \put(17,-68){\makebox(2,2)[lb]{$5$}}
970:   \put(19.5,-68){\makebox(3,2)[lb]{$\dots$}}
971:   \put(24,-68){\makebox(5,2)[lb]{$2L-1$}}
972:   \put(35,-68){\makebox(5,2)[lb]{$2L+1$}}
973:   \put(3.5,-81.5){\makebox(2,2)[lt]{$2$}}
974:   \put(12.5,-81.5){\makebox(2,2)[lt]{$4$}}
975:   \put(21.5,-81.5){\makebox(2,2)[lt]{$6$}}
976:   \put(25,-83){\makebox(3,2)[lt]{$\dots$}}
977:   \put(30.5,-81.5){\makebox(3,2)[lt]{$2L$}}
978: 
979:   % cluster 23 (triangle approximation)
980:   % sites
981:   \put(64,-70){\circle*{2}}
982:   \put(59.5,-77.5){\circle*{2}}
983:   % oblique bonds
984:   \put(59.5,-77.5){\line(3,5){4.5}}
985:   % indices
986:   \put(63,-68){\makebox(2,2)[lb]{$3$}}
987:   \put(58.5,-81.5){\makebox(2,2)[lt]{$2$}}
988: 
989:   % cluster 13 (C-hierarchy)
990:   % sites
991:   \multiput(0,-90)(9,0){5}{\circle*{2}}
992:   % horizontal bonds
993:   \put(0,-90){\line(1,0){9}}
994:   \put(9,-90){\line(1,0){9}}
995:   \put(18,-90){\line(1,0){3.5}}
996:   \put(27,-90){\line(-1,0){3.5}}
997:   \put(36,-90){\line(-1,0){9}}
998:   % indices
999:   \put(-1,-88){\makebox(2,2)[lb]{$1$}}
1000:   \put(8,-88){\makebox(2,2)[lb]{$3$}}
1001:   \put(17,-88){\makebox(2,2)[lb]{$5$}}
1002:   \put(19.5,-88){\makebox(3,2)[lb]{$\dots$}}
1003:   \put(24,-88){\makebox(5,2)[lb]{$2L-1$}}
1004:   \put(35,-88){\makebox(5,2)[lb]{$2L+1$}}
1005: 
1006:   % cluster 13 (triangle approximation)
1007:   % sites
1008:   \multiput(55,-90)(9,0){2}{\circle*{2}}
1009:   % horizontal bonds
1010:   \put(55,-90){\line(1,0){9}}
1011:   % indices
1012:   \put(54,-88){\makebox(2,2)[lb]{$1$}}
1013:   \put(63,-88){\makebox(2,2)[lb]{$3$}}
1014: 
1015: 
1016:   % site-like clusters
1017:   \put(-3,-102){\makebox(15,2)[lb]{\sf SITE-LIKE CLUSTERS}}
1018:   \put(52,-102){\makebox(15,2)[lb]{\sf SITES}}
1019: 
1020:   % cluster 1 (C-hierarchy)
1021:   % sites
1022:   \multiput(0,-110)(9,0){4}{\circle*{2}}
1023:   % horizontal bonds
1024:   \put(0,-110){\line(1,0){9}}
1025:   \put(9,-110){\line(1,0){9}}
1026:   \put(18,-110){\line(1,0){3.5}}
1027:   \put(27,-110){\line(-1,0){3.5}}
1028:   % indices
1029:   \put(-1,-108){\makebox(2,2)[lb]{$1$}}
1030:   \put(8,-108){\makebox(2,2)[lb]{$3$}}
1031:   \put(17,-108){\makebox(2,2)[lb]{$5$}}
1032:   \put(19.5,-108){\makebox(3,2)[lb]{$\dots$}}
1033:   \put(24,-108){\makebox(5,2)[lb]{$2L-1$}}
1034: 
1035:   % cluster 1 (triangle approximation)
1036:   % sites
1037:   \put(55,-110){\circle*{2}}
1038:   % indices
1039:   \put(54,-108){\makebox(2,2)[lb]{$1$}}
1040: 
1041:   % cluster 2 (C-hierarchy)
1042:   % sites
1043:   \multiput(4.5,-120)(9,0){4}{\circle*{2}}
1044:   % horizontal bonds
1045:   \put(4.5,-120){\line(1,0){9}}
1046:   \put(13.5,-120){\line(1,0){9}}
1047:   \put(22.5,-120){\line(1,0){3.5}}
1048:   \put(31.5,-120){\line(-1,0){3.5}}
1049:   % indices
1050:   \put(3.5,-118){\makebox(2,2)[lb]{$2$}}
1051:   \put(12.5,-118){\makebox(2,2)[lb]{$4$}}
1052:   \put(21.5,-118){\makebox(2,2)[lb]{$6$}}
1053:   \put(25.5,-118){\makebox(3,2)[lb]{$\dots$}}
1054:   \put(30.5,-118){\makebox(2,2)[lb]{$2L$}}
1055: 
1056:   % cluster 2 (triangle approximation)
1057:   % sites
1058:   \put(59.5,-120){\circle*{2}}
1059:   % indices
1060:   \put(58.5,-118){\makebox(2,2)[lb]{$2$}}
1061: 
1062:   % cluster 3 (C-hierarchy)
1063:   % sites
1064:   \multiput(9,-130)(9,0){4}{\circle*{2}}
1065:   % horizontal bonds
1066:   \put(9,-130){\line(1,0){9}}
1067:   \put(18,-130){\line(1,0){3.5}}
1068:   \put(27,-130){\line(-1,0){3.5}}
1069:   \put(36,-130){\line(-1,0){9}}
1070:   % indices
1071:   \put(8,-128){\makebox(2,2)[lb]{$3$}}
1072:   \put(17,-128){\makebox(2,2)[lb]{$5$}}
1073:   \put(19.5,-128){\makebox(3,2)[lb]{$\dots$}}
1074:   \put(24,-128){\makebox(5,2)[lb]{$2L-1$}}
1075:   \put(35,-128){\makebox(5,2)[lb]{$2L+1$}}
1076: 
1077:   % cluster 3 (triangle approximation)
1078:   % sites
1079:   \put(64,-130){\circle*{2}}
1080:   % indices
1081:   \put(63,-128){\makebox(2,2)[lb]{$3$}}
1082: 
1083:   \end{picture}
1084: 
1085:   \caption{
    Basic cluster and subclusters for the B~hierarchy (left side) and
    for the corresponding (triangle) plaquette approximation (right side).
1088:   }
1089:   \label{fig:gerb}
1090: \end{figure}
1091: 
1092: 
1093: \end{document}
1094: