0803.4444/text.tex
1: \documentclass[aps,pre,twocolumn,showpacs,floatfix]{revtex4}
2: \usepackage{graphicx}
3: \usepackage{array}
4: \usepackage[usenames]{color}
5: \usepackage{amsmath}
6: \usepackage{ulem}
7: \normalem
8: 
9: \newcommand{\tens}[1]{\,\raisebox{0ex}[0ex][0ex]{\uuline{\mbox{$#1$}}}\,}
10: \newcommand{\Tr}{\ensuremath{\operatorname{Tr}}}
11: \newcommand{\red}[1]{{\color{Red} #1}}\begin{document}  
12: \newcommand{\eigena}{a}
13: \newcommand{\eigenb}{b}
14: \newcommand{\auto}[4]{\ensuremath{\left<#1\left(#2\right)#3\left(#4\right)\right>}}
15: 
16: 
17: \title{Conjugate gradient heatbath for ill-conditioned actions}
18: \author{Michele Ceriotti}\email{michele.ceriotti@phys.chem.ethz.ch}
19: \author{Giovanni Bussi and Michele Parrinello}
20: \affiliation{Computational Science, Department of Chemistry and Applied Biosciences,
21: ETH Zurich, USI Campus, Via Giuseppe Buffi 13, CH-6900 Lugano, Switzerland}
22: \begin{abstract}
23: We present a method for performing sampling from a Boltzmann distribution
24: of an ill-conditioned quadratic action.
25: This method is based on heatbath 
26: thermalization along a set of conjugate directions, generated via 
27: a conjugate-gradient procedure. The resulting scheme outperforms 
28: local updates for matrices with very high condition number, since it avoids
29: the slowing down of modes with lower eigenvalue, and 
30: has some advantages over the global heatbath approach, compared to 
31: which it is more stable and allows for more freedom in devising 
32: case-specific optimizations.
33: \end{abstract}
34: \pacs{
35: 02.50.Ng %Distribution theory and Monte Carlo studies
36: 02.70.Tt %Justifications or modifications of Monte Carlo methods
37: }
38: \maketitle
39: 
40: A common problem in many branches of statistical physics is the sampling of 
41: distributions of the type $p\propto \exp\left(-\frac{1}{2}{\bf x\tens{A} x}\right)$
42: where $\tens{\bf A}$ is a positive definite $N\times N$ matrix and the
43: random variable ${\bf x}$ an $N$-dimensional vector.
44: Areas in which such sampling is needed are for instance QCD\cite{lusc94npb,divi+95npb,forc99parc} 
45: and a recently developed linear scaling electronic structure 
46: method\cite{kraj-parr05prb,kraj-parr06prb}.
47: In principle sampling $p$ is straightforward, if diagonalizing $\tens{\bf A}$ is
48: an option. However, in many cases, $N$ is so large that circumventing the 
49: $\mathcal{O}\left(N^3\right)$ diagonalization step becomes mandatory.
50: Different approaches have been proposed. In the so-called global heatbath
51: method  one writes $\tens{\bf A}=\tens{\bf M}^T\tens{\bf M}$,
52: and obtains a series of statistically independent vectors by solving  the linear
53: system $\tens{\bf M}{\bf x}={\bf R}$, where ${\bf R}$ is a vector whose components
54: are distributed according to a Gaussian with zero mean and 
55: unit variance $\left<R^2\right>=1$. The advantage of this method is that
56: the algorithmic complexity of the problem can be reduced
57: by using an iterative solver for the linear system.
58:  In order to expedite sampling a Metropolis-like
59: criterion has been suggested that leads to correct sampling
60: without having to bring the iterative process to full convergence\cite{forc99pre,wilc02npb}.
61: Unfortunately, when the ratio between the largest and smallest eigenvalues is large 
62: (ill-conditioned matrices) the acceptance of 
63: this scheme drops to zero unless full convergence is achieved.
64: An alternative approach is the local heatbath algorithm, in which 
65: at every step one single component of the state vector ${\bf x}$ 
66: is thermalized in turn, keeping the others fixed.
67: It has been pointed out elsewhere\cite{good-soka89prd,adle89npb} that there is a 
68: close analogy between this second method and the Gauss-Seidel minimization technique.
69: This approach is relatively inexpensive, but becomes very inefficient when the 
70: condition number of $\tens{\bf A}$ is large, and even more inefficient when 
71: the observable of interest depends strongly on the eigenvectors  corresponding to smaller eigenvalues.
72: 
73: In this paper we propose a heatbath algorithm in which moves are performed along 
74: mutually conjugated
75: directions. This choice is based on the analogy between various heatbath methods 
76: (see e.g. Ref.~\cite{good-soka89prd}) and directional minimization techniques.
77: We show both analytically and numerically that the choice of conjugate directions
78: allows all the degrees of freedom to become  decorrelated on the same time scale, 
79: independent of their associated eigenvalue. We also discuss the cases in which
80: the improved efficiency outweighs the additional computational cost. 
81: Our method can be interpreted as the subdivision of the global heatbath matrix 
82: inversion process into $N$ intermediate steps, all of which guarantee an 
83: exact sampling of the probability distribution.
84: 
85: In section~\ref{sec:cartaepenna} we introduce a simple 
86: formalism to treat heatbath moves along general directions, 
87: discuss the properties of a sweep through
88: a set of conjugate directions, and describe a couple of 
89: algorithms to obtain such a set with reasonable effort. In 
90: section~\ref{sec:numerico} we present some numerical tests on 
91: a model action and compare the efficiency of conjugate
92: directions heatbath with local moves for a model observable.
93: In section~\ref{sec:comp-global} we compare our method with global
94: heatbath, and in section~\ref{sec:conclusions} we present our conclusions.
95: 
96: \section{\label{sec:cartaepenna}Collective modes heatbath}
97: Given a probability distribution
98: \begin{equation}
99: P({\bf x})\propto \exp\left[-\left(\frac{1}{2}{\bf x}\tens{\bf A}{\bf x}-{\bf b}\cdot{\bf x}\right)\right]
100: \end{equation}
101: a generic heatbath algorithm can be described as a stochastic process
102: in which the vector ${\bf x}\left(t+1\right)$ is related to the vector at 
103: the previous step ${\bf x}\left(t\right)$ by
104: \begin{equation}
105: {\bf x}\left(t+1\right)={\bf x}\left(t\right)+\tau{\bf d} \label{eq:step},
106: \end{equation}
107: where ${\bf d}$ is a direction in the ${\bf x}$ space and 
108: \begin{equation}
109: \tau = -\frac{{\bf d}\left(\tens{\bf A}{\bf x}-{\bf b}\right)}{{\bf d}\tens{\bf A}{\bf d}}+
110: \left(\beta {\bf d}\tens{\bf A}{\bf d}\right)^{-1/2} R \label{eq:hb-tau}
111: \end{equation}
112: where $R$ is a Gaussian random number with zero mean and unit variance $\left<R^2\right>=1$, and $\beta$ is the inverse temperature at which the 
113: sampling is performed.
114: The application of this algorithm does not require inversion of the matrix
115: $\tens{\bf A}$.
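As a quick check of Eq.~(\ref{eq:hb-tau}), and assuming that the distribution above is sampled with weight $\exp\left[-\beta S\left({\bf x}\right)\right]$, where $S\left({\bf x}\right)=\frac{1}{2}{\bf x}\tens{\bf A}{\bf x}-{\bf b}\cdot{\bf x}$ (the symbol $S$ is introduced here only for illustration) and $\tens{\bf A}$ is symmetric, the action along the line ${\bf x}+\tau{\bf d}$ reads
\[
S\left({\bf x}+\tau{\bf d}\right)=S\left({\bf x}\right)
+\tau\,{\bf d}\left(\tens{\bf A}{\bf x}-{\bf b}\right)
+\frac{\tau^2}{2}\,{\bf d}\tens{\bf A}{\bf d}.
\]
This is quadratic in $\tau$, so the conditional distribution of $\tau$ is a Gaussian with mean
$-{\bf d}\left(\tens{\bf A}{\bf x}-{\bf b}\right)/{\bf d}\tens{\bf A}{\bf d}$ and variance
$\left(\beta{\bf d}\tens{\bf A}{\bf d}\right)^{-1}$, which is the distribution sampled by Eq.~(\ref{eq:hb-tau}).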
116: The sequence of directions ${\bf d}$ is rather arbitrary, and could be
117: a random sequence or a predefined deterministic sequence.
118: Strictly speaking, detailed balance is satisfied only if
119: the directions are randomly chosen at each step.
120: Nevertheless it has been shown in Ref.~\cite{mano-deem99jcp} that correct sampling
121: can be achieved if every Monte Carlo move leaves the
122: equilibrium distribution unchanged. In Appendix~\ref{sec:stationary} we show
123: that this is the case, provided that direction ${\bf d}$ is chosen independently
124: from position ${\bf x}$. 
125: Still, different choices of directions can lead to different sampling efficiency.
126: Our final choice will be to select for ${\bf d}$ a sequence of conjugate directions 
127: (Section~\ref{sub:conj-dir}). However, we shall first analyze the choice
128: of random, uncorrelated directions, and a sequential sweep along a set of orthogonal
129: directions.
130: 
131: For the sake of simplicity, we take ${\bf b}=0$ and we choose the basis into 
132: which $\tens{\bf A}$ is diagonal, $A_{ij}=\eigena_i\delta_{ij}$.
133: Since these properties are subsequently never used, no loss of generality is implied.
134: To compare the efficiency of the different choices of directions we shall consider
135: the autocorrelation matrix for the components along the eigenmodes 
136: $\auto{x_i}{0}{x_j}{t}$. A quantitative measure of the speed of decorrelation of
137: $\auto{x_i}{0}{x_j}{t}$ can be obtained from its slope at the origin. Since in Monte Carlo
138: one progresses in discrete steps, this quantity is given by
139: \begin{equation}
140: \left<x_i\left(0\right) x_j\left(1\right)\right>=
141: \sqrt{\left<x_i\left(0\right)^2\right>\left<x_j\left(0\right)^2\right>}
142: \left[\delta_{ij}-\Delta_{ij}\left({\bf d}\right)\right]\label{eq:slopefirst}
143: \end{equation}
144: In Eq. (\ref{eq:slopefirst}) we have introduced the normalized slope 
145: tensor $\tens{\boldsymbol\Delta}$, which can 
146: be expressed as a function of the eigenvalues of $\tens{\bf A}$ and of the components
147: of ${\bf d}$, using equations (\ref{eq:step}) and (\ref{eq:hb-tau}):
148: \begin{equation}
149: \Delta_{ij}\left({\bf d}\right)=\frac{\eigena_i d_i d_j}{\sum_k \eigena_k d_k^2}
150: \frac{\left<x_i\left(0\right)^2\right>}
151: {\sqrt{\left<x_i\left(0\right)^2\right>\left<x_j\left(0\right)^2\right>}}=
152: \frac{\sqrt{\eigena_i\eigena_j} d_i d_j}{\sum_k \eigena_k d_k^2}
153: \label{eq:slope-tens}
154: \end{equation}
155: Therefore, depending on the choice of direction ${\bf d}$, the different
156: components of the vector ${\bf x}$ decorrelate at different speeds.
157: However, since $\Tr \tens{\boldsymbol\Delta}=1$, the sum of these
158: normalized speeds does not depend on the direction chosen.
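The trace follows directly from Eq.~(\ref{eq:slope-tens}),
\[
\Tr \tens{\boldsymbol\Delta}=\frac{\sum_i \eigena_i d_i^2}{\sum_k \eigena_k d_k^2}=1 .
\]
As an illustrative two-mode case (an assumption made here only for the sake of example), take $N=2$, eigenvalues $\eigena_1=1$ and $\eigena_2=\kappa$, and ${\bf d}=\left(1,1\right)$. Then $\Delta_{11}=1/\left(1+\kappa\right)$ and $\Delta_{22}=\kappa/\left(1+\kappa\right)$, so that for large $\kappa$ almost all of the decorrelation is spent on the stiff mode.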
159: The same quantity $\tens{\boldsymbol\Delta}$ also enters a recursion relation 
160: for the autocorrelation functions at a generic Monte Carlo step $t$, 
161: \begin{eqnarray}
162: \auto{x_i}{0}{x_j}{t+1}=\nonumber\\
163: \auto{x_i}{0}{x_j}{t}-
164: \sum_k\left[\auto{x_i}{0}{x_k}{t}\sqrt{\frac{\eigena_k}{\eigena_j}}
165: \Delta_{kj}\left({\bf d}\right)\right]\label{eq:auto-induction}
166: \end{eqnarray}
167: Use of this equation requires that one appropriately averages over the direction
168: ${\bf d}$, as we shall discuss in the following.
169: 
170: We will begin our analysis with the simplest case, in which the direction ${\bf d}$
171: is chosen at every step to be equal to a stochastic vector ${\bf R}$, whose 
172: components are distributed as Gaussian random numbers with zero mean and 
173: standard deviation one. The normalized slope tensor (\ref{eq:slope-tens}) in this 
174: case results from an average over the possible directions, 
175: \begin{equation}
176: \left<\Delta_{ij}\left({\bf d}={\bf R}\right)\right>=
177: \eigena_i \delta_{ij} \left<\frac{R_i^2}{\sum_k\eigena_k R_k^2}\right>
178: \overset{N\rightarrow\infty}{\approx}\frac{\eigena_i \delta_{ij}}{\Tr\tens{\bf A}}.
179: \label{eq:random-slope}
180: \end{equation}
181: The limit expression holds for the size $N$ of the matrix going to infinity 
182: (see Appendix~\ref{sec:rnd-asymptotic}), 
183: under the hypothesis that the largest eigenvalue of $\tens{\bf A}$ does not grow
184: with $N$ and that $\Tr \tens{\bf A}$ is $\mathcal{O}\left(N\right)$, 
185: hypotheses which are relevant to many physical problems.
186: Since in this case the direction chosen at every step is independent of all
187: the previous choices, the same average enters equation (\ref{eq:auto-induction}) 
188: at any time, so that proceeding by induction one can easily obtain the 
189: entire autocorrelation function,
190: \begin{equation}
191: \auto{x_i}{0}{x_j}{t}=
192: \delta_{ij}\left<x_i\left(0\right)^2\right>
193: \left[1-\left<\Delta_{ij}\right>\right]^t \label{eq:random-full}
194: \end{equation}
195: where $\left<\Delta_{ij}\right>$ is the quantity obtained in 
196: equation~(\ref{eq:random-slope}). From (\ref{eq:random-full}) we can 
197: calculate the autocorrelation time for mode $i$, 
198: \[
199: \tau_i=\frac{\sum_{t=0}^\infty \auto{x_i}{0}{x_i}{t}}{\left<x_i^2\right>}
200: =\left[\eigena_i\left<\frac{R_i^2}{\sum_k\eigena_k R_k^2}\right>\right]^{-1}
201: \overset{N\rightarrow\infty}{\approx}
202: \frac{\Tr\tens{\bf A}}{\eigena_i}.
203: \]
204: In the case of large $N$, the decorrelation speed of the components along normal modes
205: is directly proportional to the corresponding eigenvalue, so that in ill-conditioned
206: cases a critical slowing down for the softer normal modes will be present.
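To make the slowdown explicit, the large-$N$ estimate above can be evaluated for the softest and the stiffest mode. Writing $\tau_{soft}$ and $\tau_{stiff}$ for the two autocorrelation times (a notation introduced here only for illustration) and $\kappa=\eigena_{max}/\eigena_{min}$ for the condition number,
\[
\frac{\tau_{soft}}{\tau_{stiff}}\approx
\frac{\Tr\tens{\bf A}/\eigena_{min}}{\Tr\tens{\bf A}/\eigena_{max}}=\kappa .
\]
For example, $\kappa=10^3$ means that the softest mode needs roughly $10^3$ times as many moves as the stiffest one to decorrelate.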
207: 
208: 
209: Let us now consider moves along a predefined set of orthogonal directions 
210: $\left\{{\bf u}^{(m)}\right\}_{m=0\ldots N-1}$. This is done  to mimic 
211: the case in which one performs a sweep along Cartesian directions.
212: In our reference frame, where $\tens{\bf A}$ is taken to be diagonal, this
213: would be trivial, hence the choice of an arbitrarily oriented set of orthogonal 
214: directions. As in standard local heatbath, the outcome will depend on the
215: orientation of the $\left\{{\bf u}^{(m)}\right\}$ relative to the
216: eigenvectors of $\tens{\bf A}$.
217: Averaging over all the possible choices of initial direction, we find the
218: slope at $t=0$,
219: \begin{equation}
220: \left<\Delta_{ij}\right>=\frac{1}{N}\sum_m\Delta_{ij}\left({\bf u}^{(m)}\right)=
221: \frac{1}{N}\sum_m\frac{\sqrt{\eigena_i \eigena_j} u_i^{(m)}u_j^{(m)}}{\sum_k\eigena_k \left(u_k^{(m)}\right)^2}\label{eq:localdelta}
222: \end{equation}
223: Obviously, it is not possible to reduce this result to an expression which
224: does not depend on the particular set of orthogonal directions.
225: However, the following inequality holds
226: \begin{equation}
227: \frac{\eigena_i \delta_{ij}}{N\eigena_{max}} 
228: \le \left<\Delta_{ij}\right> \le
229: \frac{\eigena_i \delta_{ij}}{N\eigena_{min}}  \label{eq:localuneq}
230: \end{equation}
231: Equation~(\ref{eq:localuneq}) does not put rigid constraints on the 
232: value of $\left<\Delta_{ij}\right>$, but demonstrates that also in this case 
233: $\tens{\boldsymbol\Delta}$ is diagonal and
234: suggests that in practice the convergence will be faster for the higher eigenvalues,
235: and that the spread in the relaxation speed for different modes is 
236: larger when the condition number $\kappa = \eigena_{max}/\eigena_{min}$ is higher.
237: 
238: In the case where directions $\left\{{\bf u}^{(m)}\right\}$ are swept sequentially
239: we have not been able to derive a closed expression for $\auto{x_i}{0}{x_j}{t}$
240: because of the dependence of ${\bf d}\left(t\right)$ on the previous history.
241: If, on the other hand, a random direction is drawn from $\left\{{\bf u}^{(m)}\right\}$
242: at every step,  $\auto{x_i}{0}{x_j}{t}$ is given by expression (\ref{eq:random-full})
243: where $\left<\Delta_{ij}\right>$ has the value in equation~(\ref{eq:localdelta}).
244: 
245: \subsection{\label{sub:conj-dir}Moves along conjugate directions}
246: 
247: It is clear from equation (\ref{eq:random-full}) that a random choice of the directions
248: ${\bf d}$ leads to fast decorrelation of the components relative to the eigenvectors 
249: with high eigenvalues. On the other hand, the components relative to the eigenvectors with 
250: low eigenvalues will decorrelate more slowly. 
251: Similar behavior is expected for the local heatbath method, unless particular
252: relations hold between the eigenvectors and the Cartesian axes. 
253: If the operator $\tens{\bf A}$ is ill-conditioned, the practical
254: consequence is that the slow modes will be accurately sampled only after a very large
255: number of steps.
256: As we have already discussed, the sum of the decorrelation slopes of the different
257: components does not depend on the choice of the directions ${\bf d}$.
258: However, with a proper choice of the directions ${\bf d}$ this sum could
259: be spread in a uniform way among the different modes.
260: A similar problem arises in minimization algorithms based on directional search,
261: and is often solved by choosing a sequence of conjugated directions\cite{numerical-recipes}.
262: In the same spirit, we can compute the decorrelation speed of the different
263: modes when the ${\bf d}$'s are chosen to be conjugated directions.
264: Let us consider a set of conjugated directions $\left\{{\bf h}^{(i)}\right\}$,
265: such that ${\bf h}^{(i)}\tens{\bf A}{\bf h}^{(j)}=\delta_{ij}$. 
266: The set $\left\{{\bf h}^{(i)}\right\}$
267: can be generated with various algorithms, such as a Gram-Schmidt 
268: orthogonalization that uses the positive definite $\tens{\bf A}$ matrix as a metric,
269: or a conjugate gradient procedure, as described in Section~\ref{sub:tricks}.
270: 
271: Using the fact that
272: $\sum_k h_i^{(k)}h_j^{(k)}=\eigena_i^{-1}\delta_{ij}$,
273: the slope at $t=0$ is
274: \begin{align*}
275: \left<\Delta_{ij}\right>=\frac{1}{N}\sum_m
276: \frac{\sqrt{\eigena_i \eigena_j} h^{(m)}_i h^{(m)}_j}{{\bf h}^{(m)}\tens{\bf A}{\bf h}^{(m)}}=\\
277: =\frac{1}{N}\sqrt{\frac{\eigena_j}{\eigena_i}}\sum_m
278: \frac{\eigena_i h^{(m)}_i h^{(m)}_j}{{\bf h}^{(m)}\tens{\bf A}{\bf h}^{(m)}}=\frac{\delta_{ij}}{N}
279: \end{align*}
280: With this choice, the decorrelation slopes of the different
281: modes are independent of the eigenvalue.
282: If one chooses one conjugate direction at random at each step it is 
283: straightforward to show that overall the autocorrelation function decays 
284: exponentially as
285: \[
286: \left<x_i\left(0\right) x_j\left(t\right)\right>=
287: \delta_{ij}\left<x_i\left(0\right)^2\right>
288: \left[1-\frac{1}{N}\right]^t
289: \]
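Summing this geometric decay, under the same assumption of a random choice among the $N$ conjugate directions, gives the same autocorrelation time for every mode,
\[
\tau_i=\sum_{t=0}^{\infty}\left[1-\frac{1}{N}\right]^t=N,
\]
irrespective of $\eigena_i$.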
290: 
291: This derivation shows that if matrix $\tens{\bf A}$ is ill-conditioned and
292: one wishes to decorrelate the slow modes, then the choice of performing
293: the heatbath using a sequence of conjugated directions can improve
294: the sampling quality dramatically. Of course, the slow modes are accelerated
295: and the fast modes are decelerated.
296: However, it is clear that a completely independent vector ${\bf x}$ is
297: obtained only when all the modes are decorrelated.
298: A heatbath on conjugate directions allows all the modes to be decorrelated
299: with the same efficiency, irrespective of their stiffness.
300: Even better efficiency can be obtained by sequentially sweeping a set of
301: conjugated directions. At first sight it would appear that the dependence
302: of ${\bf h}\left(t\right)$ on ${\bf h}\left(t-1\right)$ would make it
303: very difficult if not impossible to obtain the autocorrelation function 
304: in a closed form.
305: However, conjugate directions have a redeeming feature.
306: If we expand the position vector on the non-orthogonal basis 
307: $\left\{{\bf h}^{(m)}\right\}$, ${\bf x}=\sum_i \alpha^i {\bf h}^{(i)}$,
308: and we evaluate the correlation matrix between
309: the contravariant components $\alpha^i$, we find that 
310: $\left<\alpha^i\alpha^j\right>=\delta_{ij}$. 
311: This property can be easily demonstrated taking into account that the
312: ensemble average $\left<x_i x_j\right>=A^{-1}_{ij}$,
313: and that conjugacy implies
314: ${\bf h}^{(i)}\tens{\bf A}{\bf h}^{(j)}=\delta_{ij}$.
315: Thus, effectively, every time we perform a heatbath move
316: along direction ${\bf h}^{(i)}$ the component $\alpha^i$ is randomized,
317: without affecting the others. After a complete sweep across the set of directions
318: a completely independent state is obtained. 
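The argument can be sketched in a few lines, taking $\beta=1$ and ${\bf b}=0$ as in the text above. Conjugacy gives $\alpha^i={\bf h}^{(i)}\tens{\bf A}{\bf x}$, so that
\begin{align*}
\left<\alpha^i\alpha^j\right>&=
\sum_{kl}\left(\tens{\bf A}{\bf h}^{(i)}\right)_k\left<x_k x_l\right>\left(\tens{\bf A}{\bf h}^{(j)}\right)_l\\
&={\bf h}^{(i)}\tens{\bf A}\tens{\bf A}^{-1}\tens{\bf A}{\bf h}^{(j)}
={\bf h}^{(i)}\tens{\bf A}{\bf h}^{(j)}=\delta_{ij}.
\end{align*}
Moreover, for ${\bf d}={\bf h}^{(i)}$ Eq.~(\ref{eq:hb-tau}) reduces to $\tau=-\alpha^i+R$, so that the move replaces $\alpha^i$ by the fresh Gaussian number $R$ and leaves all the other components unchanged.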
319: 
320: A more formal proof is provided in appendix~\ref{sec:cd-formal}, where it is
321: also demonstrated that the autocorrelation function is
322: \begin{equation}
323: \left<x_i\left(0\right) x_i\left(t\right)\right>=
324: \left<x_i\left(0\right)^2\right>
325: \left\{
326: \begin{array}{cc}
327: \left[1-\frac{t}{N}\right] 	& t<N\\
328: 0				& t\ge N
329: \end{array}
330: \right.\label{eq:cd-autofun}
331: \end{equation}
332: Therefore the corresponding autocorrelation time is $\tau_i=\left(N+1\right)/2$.
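Explicitly, using the definition of the autocorrelation time as the normalized sum of the autocorrelation function,
\[
\tau_i=\sum_{t=0}^{N-1}\left[1-\frac{t}{N}\right]=N-\frac{N-1}{2}=\frac{N+1}{2},
\]
which is about half of the value $\sum_{t=0}^{\infty}\left[1-1/N\right]^t=N$ that results when the conjugate directions are picked at random.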
333: A remarkable feature of equation (\ref{eq:cd-autofun}) is that the
334: autocorrelation function is linear, and that after $N$ moves 
335: a completely independent vector is obtained. 
336: This property holds also for the global heatbath method. 
337: In Section~\ref{sec:comp-global} we shall discuss the relation between
338: our approach and  global heatbath sampling.
339: 
340: \subsection{\label{sub:tricks}Conjugate-gradient approach to generate conjugate directions}
341: In the last section we have shown how a heatbath algorithm based on
342: conjugate directions can dramatically improve the sampling of the slow modes
343: for an ill-conditioned action. 
344: An efficient strategy to generate these directions is the application of the 
345: conjugate gradient procedure\cite{numerical-recipes}.  
346: For the sake of completeness and to introduce a consistent notation we give 
347: here an outline of the CG algorithm.
348: One starts from a random configuration and search direction, 
349: ${\bf h}^{(0)}={\bf g}^{(0)}={\bf R}$, so that the directions obtained and 
350: the sample vector ${\bf x}$ are independent as required. Then, 
351: a series of directions ${\bf h}^{(m)}$ and residuals ${\bf g}^{(m)}$ are 
352: generated using the recurrence relations 
353: \[ 
354: {\bf g}^{(i+1)}={\bf g}^{(i)}-\lambda_i \tens{\bf A}\cdot{\bf h}^{(i)}\quad
355: {\bf h}^{(i+1)}={\bf g}^{(i+1)}+\gamma_i \cdot{\bf h}^{(i)}
356: \]
357: \[
358: \lambda_i=\frac{{\bf g}^{(i)}\cdot {\bf g}^{(i)}}{{\bf h}^{(i)}\tens{\bf A}{\bf h}^{(i)}}
359: \quad 
360: \gamma_i=\frac{{\bf g}^{(i+1)}\cdot {\bf g}^{(i+1)}}{{\bf g}^{(i)}\cdot {\bf g}^{(i)}}
361: \]
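For the first pair of directions, conjugacy can be verified directly from these relations (a short check that uses only the symmetry of $\tens{\bf A}$ and ${\bf h}^{(0)}={\bf g}^{(0)}$). The definition of $\lambda_0$ gives ${\bf g}^{(1)}\cdot{\bf g}^{(0)}=0$ and $\tens{\bf A}{\bf h}^{(0)}=\left({\bf g}^{(0)}-{\bf g}^{(1)}\right)/\lambda_0$, so that
\begin{align*}
{\bf h}^{(1)}\tens{\bf A}{\bf h}^{(0)}&=
{\bf g}^{(1)}\cdot\tens{\bf A}{\bf h}^{(0)}+\gamma_0\,{\bf h}^{(0)}\tens{\bf A}{\bf h}^{(0)}\\
&=-\frac{{\bf g}^{(1)}\cdot{\bf g}^{(1)}}{\lambda_0}
+\frac{{\bf g}^{(1)}\cdot{\bf g}^{(1)}}{{\bf g}^{(0)}\cdot{\bf g}^{(0)}}\,
\frac{{\bf g}^{(0)}\cdot{\bf g}^{(0)}}{\lambda_0}=0 .
\end{align*}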
362: This procedure generates at every step a new direction ${\bf h}^{(i)}$, conjugated to all
363: the previous ones, and it can be used to perform a directional heatbath move on
364: ${\bf x}$. It should be stressed that
365: there is no need to store all the  ${\bf h}^{(i)}$ if the heatbath moves 
366: are performed concurrently with the CG minimization.
367: The ``force'' $\tens{\bf A}{\bf h}^{(i)}$ can be reused for performing 
368:  the heatbath update (cf. Eq. (\ref{eq:step})).
369: At a certain point the CG procedure will be over, with the residual ${\bf g}$ 
370: dropping to zero. The sequential sweep algorithm described in the 
371: previous section can be implemented starting again from the same 
372: ${\bf g}^{(0)}$.
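Putting the pieces together, one iteration of the combined scheme could be organized as follows. This is only a sketch, under the assumption that $\tens{\bf A}$ is symmetric; the auxiliary vector ${\bf f}^{(i)}$ is a notation introduced here for illustration.
\begin{align*}
{\bf f}^{(i)}&=\tens{\bf A}{\bf h}^{(i)},\\
\tau&=-\frac{{\bf f}^{(i)}\cdot{\bf x}-{\bf h}^{(i)}\cdot{\bf b}}{{\bf h}^{(i)}\cdot{\bf f}^{(i)}}
+\left(\beta\,{\bf h}^{(i)}\cdot{\bf f}^{(i)}\right)^{-1/2}R,\\
{\bf x}&\leftarrow{\bf x}+\tau{\bf h}^{(i)},\\
\lambda_i&=\frac{{\bf g}^{(i)}\cdot{\bf g}^{(i)}}{{\bf h}^{(i)}\cdot{\bf f}^{(i)}},\qquad
{\bf g}^{(i+1)}={\bf g}^{(i)}-\lambda_i{\bf f}^{(i)},\\
\gamma_i&=\frac{{\bf g}^{(i+1)}\cdot{\bf g}^{(i+1)}}{{\bf g}^{(i)}\cdot{\bf g}^{(i)}},\qquad
{\bf h}^{(i+1)}={\bf g}^{(i+1)}+\gamma_i{\bf h}^{(i)}.
\end{align*}
The single matrix-vector product ${\bf f}^{(i)}$ serves both the heatbath move and the CG update, so each step costs one application of $\tens{\bf A}$.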
373: 
374: In contrast to the global heatbath method, numerical stability is not 
375: a major issue, since the accuracy of the sampling does not depend
376: on the search directions being exactly conjugated. The only effect 
377: of imperfect conjugation would be to slightly reduce 
378:  the decorrelation efficiency.
379: There is however a  drawback to this approach. In order to be ergodic, 
380: the set of directions must span the whole space. 
381: The problem arises when there are degenerate 
382: eigenvalues, as CG converges to zero in a number $p$ of iterations equal to the
383: number of distinct eigenvalues. If we keep reusing the same set of $p<N$ 
384: directions, only a part of the subspaces corresponding to degenerate eigenvalues
385: will be explored, and the sampling will not be ergodic.
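An extreme illustration, chosen here only as an example, is $\tens{\bf A}=a\tens{\bf 1}$ with $a>0$, for which all the eigenvalues coincide and $p=1$. Starting from ${\bf h}^{(0)}={\bf g}^{(0)}={\bf R}$,
\[
\lambda_0=\frac{{\bf R}\cdot{\bf R}}{a\,{\bf R}\cdot{\bf R}}=\frac{1}{a},\qquad
{\bf g}^{(1)}={\bf R}-\frac{1}{a}\,a{\bf R}=0,
\]
so the CG procedure stops after a single iteration, and reusing the same ${\bf R}$ would only ever move ${\bf x}$ along one line.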
386: 
387: \begin{figure}
388: \caption{\label{fig:hybrid}Scheme of the block algorithm 
389: described in paragraph \ref{sub:tricks}; squares represent 
390: eigenvectors of the action matrix, which need to be refreshed in order
391: to obtain a statistically independent sample point; modes on 
392: the same column correspond to the same, degenerate eigenvalue.
393: At every step, one of the vectors of a set with the same 
394: size as the biggest degenerate subspace is used in a conjugate
395: gradient minimization, while the remaining ones are made 
396: orthogonal to the search directions that are generated in 
397: the process. When the first vector approaches zero, one can start
398: back on the second one (Figure~b)), and the process can be 
399: continued (Figures~c) and d)) until the refresh is complete.
400: }
401: \includegraphics[width=0.9\columnwidth]{fig1.eps}
402: \end{figure}
403: 
404: We have considered two possible ways of recovering ergodicity.
405: The simplest consists in 
406: drawing a new random point ${\bf g}^{(0)}={\bf R}$ every time 
407: we reset the CG search. 
408: This causes a deviation from the linear behavior of the autocorrelation
409: functions for $t\approx N$.
410: Non-degenerate eigenvalues will initially converge with $-1/p$ instead of 
411: $-1/N$ slope, but degenerate ones will converge more slowly, and with exponential
412: trend, as we are sampling random directions within every degenerate 
413: subspace.
414: 
415: In order to improve the efficiency, we mix CG with Gram-Schmidt
416: orthogonalization of a small set of vectors, ideally of the same size $d$ 
417: as the largest degeneracy present. 
418: As discussed earlier, here Gram-Schmidt 
419: orthogonalization has to be performed using the metric of $\tens{\bf A}$,
420: which amounts to imposing conjugacy.
421: The procedure is illustrated in 
422: Figure~\ref{fig:hybrid}. 
423: We start from $d$ random vectors, $\left\{{\bf v}^{(j)}\right\}_{j=0\ldots d-1}$. We
424: set  ${\bf h}^{(0)}={\bf g}^{(0)}={\bf v}^{(0)}$ and begin a CG minimization.
425: At each step we obtain a search direction ${\bf h}^{(i)}$, and make each 
426: of the other $d-1$ vectors conjugate to ${\bf h}^{(i)}$ with 
427: a Gram-Schmidt procedure. 
428: This does not require any matrix-vector product other than the one necessary 
429: for the heatbath step. 
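In formulas, and writing ${\bf f}^{(i)}=\tens{\bf A}{\bf h}^{(i)}$ for the product already computed for the heatbath move (a notation introduced here for illustration), the conjugation step for each remaining vector can be sketched as
\[
{\bf v}^{(j)}\leftarrow{\bf v}^{(j)}-
\frac{{\bf v}^{(j)}\cdot{\bf f}^{(i)}}{{\bf h}^{(i)}\cdot{\bf f}^{(i)}}\,{\bf h}^{(i)},
\]
after which ${\bf v}^{(j)}\tens{\bf A}{\bf h}^{(i)}=0$ by construction.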
430: After $p$ iterations the conjugate gradient will have converged and ${\bf g}$ 
431: will be close to zero. We can start again
432: from the second vector in the pool, which meanwhile has 
433: become $\bar{\bf v}^{(1)}$, and is conjugate to all the directions visited so far. 
434: Thus, we  set ${\bf h}^{(0)}={\bf g}^{(0)}=\bar{\bf v}^{(1)}$ 
435: and start again the CG procedure, orthogonalizing the $d-2$ remaining vectors 
436: to ${\bf h}^{(i)}$, and so on and so forth.
437: After $N$ steps the procedure will have converged. At the next sweep, one
438: can generate again a set of random initial $\left\{{\bf v}^{(j)}\right\}$. This
439: can make the method more stable, at the cost of some loss in performance.
440: Some savings can be made if one stores the conjugated $\bar{\bf v}^{(i)}$, 
441: and uses them in the subsequent sweeps, avoiding the need to repeat the GS 
442: orthogonalizations (see figure~\ref{fig:hybrid}).
443: In practice, where more than one complete sweep is affordable, it is easy to devise
444: adaptive variations of this scheme, in which the pool of vectors  
445: $\left\{{\bf v}^{(j)}\right\}$ is enlarged whenever the CG minimization converges 
446: in fewer than $N$ steps, so that in a few sweeps the optimal size to guarantee
447: ergodicity is attained.
448: 
449: \section{\label{sec:numerico}Benchmarks and comparison with local heatbath}
450: In the previous section we have discussed a collective modes heatbath method
451: that could outperform standard local heatbath techniques when
452: the Hamiltonian has a very large condition number and 
453: sampling along the slower eigenmodes is required.
454: In this section we illustrate the efficiency of our
455: algorithm using numerical experiments on a simple
456: model for $\tens{\bf A}$,
457: \begin{equation}
458: \tens{\bf A}=\tens{\bf 1}+\left(\begin{array}{cccccc}
459: 2b 	& -b	& 0	&\cdots	& 0	& -b	\\
460: -b	& 2b 	& -b	& 0	&\cdots	& 0	\\
461: 0	& -b 	& 2b	& -b	&\ddots 	&\vdots	\\
462: \vdots	& 0	& -b	& 2b 	&\ddots	& 0	\\
463: 0	& \vdots& \ddots&\ddots	&\ddots	& -b	\\
464: -b	& 0	&\cdots	& 0	& -b	& 2b	
465: \end{array}\right)
466: \label{eq:mat-phonon}
467: \end{equation}
468: This matrix corresponds to the dynamical matrix of a linear chain of 
469: spring-connected masses, with periodic
470: boundary conditions and an additional diagonal term to make  the acoustic mode
471: nonzero. $b$ can be chosen so as to obtain the desired condition number.
472: Eigenmodes and eigenvalues for such a matrix are easily obtained, 
473: \[
474: \eigena_k=1+2b\left(1-\cos\frac{2k\pi}{N}\right)
475: \]
476: \[
477: u^{(k)}_l=\sqrt{\frac{2-\delta_{0k}-\delta_{N/2,k}}{N}}
478: \left\{
479: \begin{array}{lc}
480: \cos\frac{2kl\pi}{N} & k\le N/2 \\
481: \sin\frac{2kl\pi}{N} & k > N/2
482: \end{array}
483: \right.
484: \]
485: and projection of a state on the eigenvectors is quickly 
486: done via fast-Fourier transform.
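As an example of how $b$ fixes the conditioning, and assuming $b>0$ and even $N$ so that the stiffest mode $k=N/2$ is present, the eigenvalue expression above gives $\eigena_{min}=\eigena_0=1$ and $\eigena_{max}=\eigena_{N/2}=1+4b$, hence
\[
\kappa=\frac{\eigena_{max}}{\eigena_{min}}=1+4b,\qquad b=\frac{\kappa-1}{4};
\]
for instance, $\kappa=10^3$ corresponds to $b=249.75$.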
487: In Figure~\ref{fig:curve} we compare the autocorrelation functions 
488: obtained with different algorithms for a matrix of the form (\ref{eq:mat-phonon}).
489: Figure~\ref{fig:curve} also highlights the ergodicity problems connected with the 
490: naive use of the conjugate gradient algorithm to generate the search directions, 
491: and shows how both the suggestions of paragraph~\ref{sub:tricks} can help in
492: solving this problem. 
493: In general, a conjugate directions search speeds up
494: decorrelation for the slower modes, but is less efficient than local heatbath 
495: for the modes with a high eigenvalue. This is a direct consequence of
496: the fact that $\Tr\tens{\boldsymbol{\Delta}}=1$. An additional advantage of 
497: our method is the linear rate of decorrelation, which allows complete 
498: decorrelation just like the direct inversion of $\tens{\bf M}$, whereas moves along the 
499: Cartesian axes lead to approximately exponential autocorrelation functions.
500: 
501: \begin{figure}
502: \caption{\label{fig:curve}Autocorrelation functions for 
503: a) the projection along the mode $\eigena_0=1$;
504: b) the projection along the mode $\eigena_4\approx 9.8$
505: for a matrix of the form (\ref{eq:mat-phonon})
506: with $N=100$ and condition number $\kappa=10^3$.
507: Line {\bf A} corresponds to local heatbath moves (one step 
508: stands for a complete sweep of the $N$ coordinates), lines
509: {\bf B} to {\bf D} to conjugate directions moves:
510: {\bf B} is the hybrid conjugate gradient/Gram-Schmidt block algorithm;
511: {\bf C} corresponds to CG sweeps, with the search direction 
512: randomized at the beginning of every sweep; 
513: curve {\bf D} corresponds to CG sweeps starting from the same 
514: initial vector. 
515: Conjugate direction moves decorrelate faster than local heatbath for the slow mode, but are
516: less efficient for modes with higher eigenvalue. For degenerate 
517: eigenmodes, the method used for curve {\bf D} is not ergodic (and thus
518: gives incorrect values for $\left<x_i^2\right>$), and
519: random restarts (curve {\bf C}) are much less efficient than the hybrid (curve {\bf B})
520: algorithm.
521: }
522: {\centering
523: \includegraphics[width=0.9\columnwidth]{fig2.1.eps}\\~\\
524: \includegraphics[width=0.9\columnwidth]{fig2.2.eps}
525: }
526: \end{figure}
527: 
528: We stress again that the relative efficiency of the two methods
529: depends strongly on the observable being calculated and on the
530: actual spectrum of the Hamiltonian of the system.
531: As a more realistic benchmark we will consider the evaluation of
532: the trace of the inverse matrix, i.e.
533: \begin{equation}
534: \Omega=\Tr \left(\tens{\bf A}^{-1}\right)=\left<{\bf x}^2\right> 
535: \label{eq:omega}
536: \end{equation}
537: This observable is strongly dependent on the slow modes.
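Indeed, in the eigenbasis and for $\beta=1$ one has $\left<x_i^2\right>=1/\eigena_i$, so that
\[
\Omega=\sum_i\left<x_i^2\right>=\sum_i\frac{1}{\eigena_i},
\]
and each mode contributes in inverse proportion to its eigenvalue. With the model spectrum above (assuming $b>0$), the softest mode contributes $1$ while the stiffest ones contribute only about $1/\left(1+4b\right)$ each.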
538: 
539: \begin{figure}
540: \caption{\label{fig:benchmark}(Color online) Comparison of the 
541: efficiency of local heatbath versus conjugate-gradient moves.
542: The graph represents $\tau_{CG}/\tau_{loc}$,
543: the ratio of the autocorrelation times for the observable $\Omega$ (\ref{eq:omega}); 
544: $\tau_{loc}$ corresponds to the value obtained from standard local heatbath moves 
545: (one unit of
546: Monte Carlo time corresponds to a whole sweep over the coordinates), while
547: $\tau_{CG}$ corresponds to the value obtained with moves along 
548: conjugate directions, as obtained from our block algorithm with random restarts.
549: The plotted data result from a linear interpolation of some
550: simulations (labeled by $\otimes$) performed for an 
551: action of the form (\ref{eq:mat-phonon}), 
552: with varying size $N$ and condition number $\kappa$.
553: }
554: \includegraphics[width=0.9\columnwidth]{fig3.eps}
555: \end{figure}
556: 
557: In Figure~\ref{fig:benchmark} we plot the ratio of the autocorrelation times 
558: $\tau\left[\Omega\right]$ obtained with
559: local heatbath moves and with the block conjugate gradient version of 
560: our algorithm, as a function of condition number and system size.
561: 
562: \section{\label{sec:comp-global}Comparison with global heatbath}
563: It remains for us to discuss how our method fares in comparison with global heatbath.
564: The latter requires that the matrix $\tens{\bf A}$ be decomposable in the 
565: form $\tens{\bf A}=\tens{\bf M}^T\tens{\bf M}$. This is the case in many
566: fields\cite{kraj-parr05prb}, but if $\tens{\bf A}$ had to be decomposed 
567: explicitly, this would add extra cost. 
568: Here we make our comparison assuming that $\tens{\bf M}$ is already
569: available. In such a case, the two algorithms are on paper equally efficient
570: in producing statistically independent samples. 
571: The global heatbath might offer some numerical advantages when
572: the spectrum of $\tens{\bf M}$ is highly degenerate, 
573: since the number of CG iterations needed to solve
574: the $\tens{\bf M}{\bf x}={\bf R}$ linear system is $p<N$, as discussed earlier. 
575: Whenever a good preconditioner for the linear system is available, other 
576: inversion algorithms such as the stabilized bi-conjugate gradient\cite{vors92jssc}
577: or the generalized conjugate residual may allow one to solve the linear system
578: with sufficient accuracy more efficiently than using CG. In this
579: paper we make the comparison with conjugate gradient because of the 
580: close analogy with our scheme and because our method is aimed at problems
581: where ill-conditioning cannot be otherwise relieved.
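
For reference, the correctness of the global heatbath move can be checked in one line. 
As a minimal sketch, assuming that ${\bf R}$ is a vector of independent Gaussian numbers with 
zero mean and unit variance, so that $\left<{\bf R}{\bf R}^T\right>$ is the identity, the sample 
${\bf x}=\tens{\bf M}^{-1}{\bf R}$ has covariance
\[
\left<{\bf x}{\bf x}^T\right>=\tens{\bf M}^{-1}\left<{\bf R}{\bf R}^T\right>
\left(\tens{\bf M}^{-1}\right)^T=
\left(\tens{\bf M}^T\tens{\bf M}\right)^{-1}=\tens{\bf A}^{-1}.
\]
Each sample thus costs one solution of the linear system and is independent of the previous one, 
provided that the solution is accurate.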
582: 
583: In this respect, our method displays significant advantages.
584: Firstly, it is more stable, because every move preserves the probability 
585: distribution, and the conjugate gradient procedure (which is known to be 
586: quite delicate in problems with large condition number) is only used
587: to generate search directions.
588: Instabilities in the procedure, which would cause incorrect sampling in the global
589: heatbath, affect only the efficiency, and not the accuracy.
590: Moreover, dividing the $N$ steps of an iterative inversion process into 
591: separate heatbath moves greatly improves the flexibility of the sampling scheme.
592: To give some examples, if one needs to perform an average on a slowly 
593: varying $\tens{\bf A}$, it is possible to perform only a partial sweep with 
594: fixed action, then continue with the new $\tens{\bf A}$, assuming that eigenmodes
595: will change slowly. It is also straightforward to tailor the choice of
596: directions in order to optimize the convergence speed for the observable
597: of interest. Adler's overrelaxation\cite{adle81prd} can be included
598: naturally, and can help in further optimizing the autocorrelation time.
599: As an example of possible fine-tunings, 
600: let us recall the observable $\Omega$ introduced 
601: in the previous section (equation (\ref{eq:omega})). This observable depends 
602: strongly on the softest eigenvectors of $\tens{\bf{A}}$. We have then modified 
603: our algorithm in the following way: we perform block conjugate gradient 
604: sweeps, with random resets, and we monitor the curvature along the direction being
605: thermalized, ${\bf h}\tens{\bf A}{\bf h}/{\bf h}\cdot{\bf h}$.
606: We save the direction of minimum curvature encountered along the sweep, 
607: ${\bf h}_{min}$; during the following sweep, every $m$ 
608: moves along the CG directions, one move is performed along ${\bf h}_{min}$.
609: As is evident from Figure~\ref{fig:omegatails}, this trick considerably reduces the
610: autocorrelation time for $\Omega$. Even smarter combinations of moves can be 
611: devised, and the one we suggest is just an example of how the additional flexibility
612: gained through subdividing the inversion process into $N$ exact sampling
613: moves can be exploited.
614: In Table~\ref{tab:refvalues} we report some numerical estimates of the error in the
615: evaluation of $\Omega$, which can serve as a reference to compare our method
616: to other approaches.
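
The curvature monitored in this scheme is a Rayleigh quotient, which explains why it is a 
useful indicator. As a sketch, writing ${\bf h}=\sum_k c_k {\bf u}_k$ on the normalized 
eigenvectors ${\bf u}_k$ of $\tens{\bf A}$ with eigenvalues $\eigena_k$ (the symbols $c_k$ and 
${\bf u}_k$ are introduced here only for illustration), one has
\[
\frac{{\bf h}\tens{\bf A}{\bf h}}{{\bf h}\cdot{\bf h}}=
\frac{\sum_k \eigena_k c_k^2}{\sum_k c_k^2}\ge \min_k \eigena_k,
\]
with equality only when ${\bf h}$ lies entirely in the eigenspace of the smallest eigenvalue. 
A direction of low curvature therefore carries a large weight on the soft modes, which are the 
ones that dominate $\Omega$.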
617: 
618: 
619: \begin{table}
620: \caption{\label{tab:refvalues}
621: Percentage errors in the evaluation of $\Omega=\left<{\bf x}^2\right>$  (equation (\ref{eq:omega})),
622: estimated using a blocking analysis,  
623: for different sampling methods. {\bf A} corresponds to local heatbath, {\bf B} 
624: corresponds to the ``hybrid'' version of our CG algorithm,
625: with a pool of two vectors with random restarts, while {\bf C} is obtained 
626: including the tricks described in section~\ref{sec:comp-global} 
627: with $m=50$.
628: Different tests are performed with varying matrix size $N$, number of sampling steps $T$ 
629: and condition number $\kappa$. 
630: Due to the large autocorrelation time, the values of the error for local heatbath 
631: with $N=100$ and $T=10^6$ could not be estimated as reliably as in the other cases,
632: and are only indicative.}
633: \begin{ruledtabular}
634: %N     1000 1000   100    100
635: %k     50k  5k     50k    5k
636: %<x2_exact> 4.473 14.142 1.0657 1.5918
637: \begin{tabular}[c]{c c c c c c}
638:   $N$  &   $\kappa$    &   $T$  & {\bf A} & {\bf B} & {\bf C} 	\\ \hline
639: $10^3$ &$5\times10^4$ & $10^6$ & 4.0  	 &	1.5	 &  1.4		\\ 
640: $10^3$ &$5\times10^4$ & $10^7$ & 1.3	 &	0.51	 &	0.45		\\ 
641: $10^3$ &$5\times10^3$ & $10^6$ & 0.78		 &	0.85   &  0.85	   	\\ 
642: $10^3$ &$5\times10^3$ & $10^7$ & 0.24	 &	0.28	 &	0.28  	\\ 
643: $100 $ &$5\times10^4$ & $10^6$ & $\sim$11 	 &	1.2	 &  1.1 		\\ 
644: $100 $ &$5\times10^4$ & $10^7$ & 4.9 	 &	0.44 &	0.34		\\ 
645: $100 $ &$5\times10^3$ & $10^6$ & $\sim$3   &	0.88  &  0.82 	   	\\ 
646: $100 $ &$5\times10^3$ & $10^7$ & 1.1	 &	0.30 &	0.25     	\\ 
647: \end{tabular}
648: \end{ruledtabular}
649: \end{table}
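
As a rough consistency check on Table~\ref{tab:refvalues}, recall that for a run much longer 
than the autocorrelation time the statistical error of an average is expected to scale as 
$T^{-1/2}$. Increasing $T$ from $10^6$ to $10^7$ should then reduce the error by a factor
\[
\sqrt{10}\approx 3.16 .
\]
The tabulated values are broadly compatible with this expectation: for $N=10^3$ and 
$\kappa=5\times10^4$, for instance, the ratios are $4.0/1.3\approx 3.1$ ({\bf A}), 
$1.5/0.51\approx 2.9$ ({\bf B}) and $1.4/0.45\approx 3.1$ ({\bf C}).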
650: 
651: \begin{figure}
652: \caption{\label{fig:omegatails}Autocorrelation function for the 
653: observable (\ref{eq:omega}) for an action of the form (\ref{eq:mat-phonon}),
654: with size $N=100$ and condition number $\kappa=5\times 10^3$.
655: Line {\bf A} corresponds to local heatbath, line {\bf B}
656: to the ``hybrid'' version of our CG algorithm,
657: with a pool of two vectors with random restarts, while line {\bf C} is obtained 
658: including the tricks described in section~\ref{sec:comp-global} 
659: with $m=5$.}
660: \includegraphics[width=0.9\columnwidth]{fig4.eps}
661: \end{figure}
662: 
663: \section{\label{sec:conclusions}Conclusions}
664: We have presented an algorithm for performing collective-mode heatbath along 
665: conjugate directions for a quadratic action, which allows the components of
666: the sampling vector along all modes to be decorrelated in $N$ steps, 
667: with a linear decay to zero.
668: This method is more computationally demanding than local updates, 
669: but becomes competitive for ill-conditioned actions, when one needs to 
670: compute observables which depend on modes with low eigenvalues, or when the 
671: spectrum of the action matrix has only a few high-eigenvalue modes which would 
672: slow down Cartesian moves.
673: In fact, this method has an efficiency comparable with that of direct 
674: inversion of the matrix, but presents various advantages, such as 
675: improved stability, as the numerical issues connected with the
676: conjugate gradient method do not affect the accuracy of the sampling, 
677: and the possibility of exploiting some additional
678: flexibility to improve the sampling on a case-by-case basis. Lastly, global
679: heatbath requires the knowledge of the square root of the action $\tens{\bf A}$,
680: so our scheme should be considered whenever the square root is difficult to 
681: compute or its use is inefficient with respect to the original action.
682: 
683: The geometrical simplicity of this approach, with its close analogy 
684: with minimization methods, also suggests that it might be extended 
685: to the sampling of anharmonic systems.
686: 
687: \appendix
688: \section{\label{sec:stationary}}
689: We report here a simple demonstration of the fact that heatbath moves
690: along a generic direction ${\bf d}$ leave an equilibrium probability
691: distribution unchanged. 
692: We will use the fact that if ${\bf R}$, ${\bf R}'$ and ${\bf R}''$ are vectors 
693: of mutually independent Gaussian variables with zero mean and unit variance, then
694: $\tens{\bf B}{\bf R}+\tens{\bf C}{\bf R}'$ is distributed as $\tens{\bf D}{\bf R}''$ 
695: where $\tens{\bf D}\tens{\bf D}^T=\tens{\bf B}\tens{\bf B}^T+\tens{\bf C}\tens{\bf C}^T$.
696: Since ${\bf x}$ is drawn from the equilibrium distribution, 
697: i.e. ${\bf x}=\tens{\bf M}^{-1}{\bf R}$,
698: we can cast Eq. (\ref{eq:hb-tau}) and (\ref{eq:step}) into the form
699: \begin{align}
700: x_j'=\sum_m P_{jm} R_m + \sum_m Q_{jm} R_m' \nonumber\\ 
701: P_{jm}=\left(\tens{\bf M}^{-1}\right)_{jm}-d_j \sum_k M_{mk}d_k \quad\quad
702: Q_{jm}=d_j \delta_{m0}\nonumber
703: \end{align}
704: where we have put ${\bf b}=0$ into Eq. (\ref{eq:hb-tau}) and normalized
705: the direction so that ${\bf d}\tens{\bf A}{\bf d}=1$ in order to simplify the notation.
706: We can then compute
707: \[
708: \sum_m P_{jm}P_{lm}=\left(\tens{\bf A}^{-1}\right)_{jl}-d_j d_l \quad\quad
709: \sum_m Q_{jm}Q_{lm}=d_j d_l
710: \]
711: so that $\tens{\bf P}\tens{\bf P}^T+\tens{\bf Q}\tens{\bf Q}^T=
712: \tens{\bf M}^{-1}\left(\tens{\bf M}^{-1}\right)^T=\tens{\bf A}^{-1}$, i.e. also ${\bf x'}$ may be  
713: written as $\tens{\bf M}^{-1}{\bf R}$, and is therefore correctly distributed.
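
For completeness, the intermediate steps of this computation can be sketched as follows. 
We write $P_{jm}=\left(\tens{\bf M}^{-1}\right)_{jm}-d_j u_m$, where $u_m$ is a shorthand 
(introduced here only for illustration) for the sum over $k$ appearing in the definition of 
$P_{jm}$, and we assume the two identities 
$\sum_m \left(\tens{\bf M}^{-1}\right)_{jm}u_m=d_j$ and 
$\sum_m u_m^2={\bf d}\tens{\bf A}{\bf d}=1$, which express the decomposition 
$\tens{\bf A}=\tens{\bf M}^T\tens{\bf M}$ and the normalization of ${\bf d}$. Then
\begin{align*}
\sum_m P_{jm}P_{lm}={}&\sum_m \left(\tens{\bf M}^{-1}\right)_{jm}\left(\tens{\bf M}^{-1}\right)_{lm}\\
&-d_l\sum_m \left(\tens{\bf M}^{-1}\right)_{jm}u_m
-d_j\sum_m \left(\tens{\bf M}^{-1}\right)_{lm}u_m\\
&+d_j d_l\sum_m u_m^2\\
={}&\left(\tens{\bf A}^{-1}\right)_{jl}-d_j d_l,
\end{align*}
while $\tens{\bf Q}$ has a single non-zero column, so that $\sum_m Q_{jm}Q_{lm}=d_j d_l$.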
714: 
715: \section{\label{sec:rnd-asymptotic}}
716: We shall here discuss briefly the derivation of the asymptotic form of equation 
717: (\ref{eq:random-slope}) when the size $N$ of the action matrix tends to infinity.
718: The quantity to be computed is 
719: \[
720: Q_i=\left<\frac{R_i^2}{\sum_k\eigena_k R_k^2}\right>\propto
721: \int\mathrm{d}{\bf x} \frac{x_i^2}{\sum_k x_k^2 \eigena_k} 
722: \exp\left[-\frac{1}{2}\sum_k x_k^2\right]
723: \]
724: The integral can be transformed as follows:
725: \begin{align*}
726: Q_i\propto\int_0^{\infty}\mathrm{d}t\int\mathrm{d}{\bf x} x_i^2
727: \exp\left[-\frac{1}{2}\sum_k \left(1+\eigena_k t\right)x_k^2\right]=\\
728: =\int_0^{\infty}\mathrm{d}t \frac{1}{\eigena_i t+1}\prod_k \frac{1}{\sqrt{\eigena_k t +1}},
729: \end{align*}
730: and the resulting expression, including the correct normalization, is
731: \begin{equation}
732: Q_i=\frac{1}{2}\int_0^{\infty}\mathrm{d}t \frac{1}{\eigena_i t+1} f\left(t\right),
733: \qquad f\left(t\right)=\prod_k \frac{1}{\sqrt{\eigena_k t +1}} \label{eq:asy-qi}
734: \end{equation}
735: Let us focus on $F=\int_0^{\infty}f\left(t\right){\rm d}t$, since all the $Q_i$ 
736: can be computed as 
737: $Q_i=\eigena_i \frac{\partial F}{\partial \eigena_i}+\frac{1}{2}F$. 
738: We perform the change 
739: of variables $Nt\rightarrow t$, so that 
740: \[
741: \int_0^{\infty}f\left(t\right){\rm d}t=
742: \frac{1}{N}\int_0^{\infty}\tilde{f}\left(t\right){\rm d}t,\qquad
743: \tilde{f}\left(t\right)=\prod_k \frac{1}{\sqrt{\frac{\eigena_k}{N} t +1}}.
744: \]
745: Under the physically reasonable assumption that $\Tr \tens{\bf A} = 
746: \mathcal{O}\left(N\right)$, and that the maximum eigenvalue 
747: does not scale with the system size, we can use $1/N$ as a small parameter.
748: Expanding $\log\tilde{f}$ one finds
749: \begin{eqnarray*}
750: \log\tilde{f}\left(t\right)=-\frac{1}{2}\sum_k \log\left(1+\frac{\eigena_k}{N}t\right)=\\
751: \sum_{n=1} \frac{\left(-1\right)^n t^n}{2n} \sum_k \left[\frac{\eigena_k}{N}\right]^n
752: =-\sum_k\frac{\eigena_k}{N}\frac{t}{2} + \sum_{n=1} t^{n+1} \mathcal{O}\left(\frac{1}{N^n}\right).
753: \end{eqnarray*}
754: All but the leading term become negligible for $N\rightarrow \infty$.
755: This suggests separating out from $\tilde{f}\left(t\right)$ the term of order zero
756: in $1/N$, and writing for $F$ the expression 
757: \begin{eqnarray}
758: \frac{1}{N}\int_0^{\infty}\exp\left(-\frac{t}{2}\frac{\Tr \tens{\bf A}}{N}\right)\times
759: \nonumber\\
760: \times\left[1+\frac{1}{4}\sum_k\left(\frac{\eigena_k}{N}\right)^2 t^2 + 
761: \mathcal{O}\left(\frac{1}{N^2}\right)t^3+\ldots
762: \right]
763: {\rm d} t
764: \end{eqnarray}
765: which leads to the asymptotic result $F=\frac{2}{\Tr\tens{\bf A}}+\mathcal{O}\left(N^{-2}\right)$.
766: Correspondingly, dropping the higher order terms in $1/N$, we have
767: $Q_i=\frac{1}{\Tr\tens{\bf A}}+\mathcal{O}\left(N^{-2}\right)$, which is the
768: desired result.
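
As a simple check of this result, which is not needed for the derivation, one may consider 
the fully degenerate case $\eigena_k=\eigena$ for all $k$ (with $N>2$), where the integrals 
can be evaluated exactly:
\begin{align*}
F&=\int_0^{\infty}\frac{\mathrm{d}t}{\left(1+\eigena t\right)^{N/2}}
=\frac{2}{\eigena\left(N-2\right)},\\
Q_i&=\frac{1}{2}\int_0^{\infty}\frac{\mathrm{d}t}{\left(1+\eigena t\right)^{N/2+1}}
=\frac{1}{N\eigena}.
\end{align*}
Since $\Tr\tens{\bf A}=N\eigena$ in this case, $Q_i$ coincides with $1/\Tr\tens{\bf A}$, while 
$F$ differs from $2/\Tr\tens{\bf A}$ by $4/\left[\eigena N\left(N-2\right)\right]=
\mathcal{O}\left(N^{-2}\right)$, in agreement with the expansion.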
769: 
770: 
771: \section{\label{sec:cd-formal}}
772: We obtain here the autocorrelation function for the components along the 
773: eigenmodes of the action matrix $\tens{\bf A}$, when performing heatbath
774: sweeps along a set of conjugate directions
775:  $\left\{{\bf h}^{(m)}\right\}_{m=0\ldots N-1}$. In this
776: section, the indices of the directions are defined modulo $N$, i.e.
777: ${\bf h}^{(j+N)}={\bf h}^{(j)}$.
778: In this case, one can write Eq. (\ref{eq:auto-induction}) as 
779: \begin{align}
780: \left<x_i\left(0\right) x_i\left(t+1\right)\right>=
781: \left<x_i\left(0\right) x_i\left(t\right)\right>-\nonumber\\
782: \frac{1}{N}\sum_m\sum_k\left[\left<x_i\left(0\right) x_k\left(t\right)\right>
783: \sqrt{\frac{\eigena_k}{\eigena_i}}\Delta_{ki}\left({\bf h}^{(m)}\right)\right].
784: \label{eq:cg-induction}
785: \end{align}
786: Explicit calculations for small values of $t$ suggest for $t<N$ the ansatz 
787: \begin{equation}
788: \left<x_i\left(0\right) x_i\left(t\right)\right>=
789: \left<x_i\left(0\right)^2\right>
790: \left[1-\frac{t}{N}\right].\label{eq:cg-ansatz}
791: \end{equation}
792: Since the first term in Eq. (\ref{eq:cg-induction}) does not contain the new
793: direction, we can substitute the ansatz without concern. On the other hand, 
794: the second term contains reference to ${\bf h}^{(m)}$, so that the average 
795: that led to (\ref{eq:cg-ansatz}) cannot be performed separately, and one 
796: should rather write:
797: \begin{align}
798: \frac{1}{N}\sum_m\sum_k\left[
799: \sum_{k'}\left<x_i\left(0\right) x_{k'}\left(t-1\right)\right>\right.
800: \nonumber\\
801: \left.\left(\delta_{k'k}-\sqrt{\frac{\eigena_{k'}}{\eigena_k}}
802: \Delta_{k'k}\left({\bf h}^{(m-1)}\right)\right)
803: \sqrt{\frac{\eigena_k}{\eigena_i}}
804: \Delta_{ki}\left({\bf h}^{(m)}\right)\right].
805: \end{align}
806: which is split into
807: \begin{align}
808: \frac{1}{N}\sum_m\sum_k\left[\left<x_i\left(0\right) x_k\left(t-1\right)\right>
809: \sqrt{\frac{\eigena_k}{\eigena_i}}\Delta_{ki}\left({\bf h}^{(m)}\right)\right],
810: \label{eq:cg-one}\\
811: \frac{1}{N}\sum_{mkk'}\left[\left<x_i\left(0\right) x_{k'}\left(t-1\right)\right>
812: \sqrt{\frac{\eigena_{k'}}{\eigena_i}}
813: \Delta_{k'k}\left({\bf h}^{(m-1)}\right)
814: \Delta_{ki}\left({\bf h}^{(m)}\right)\right]\label{eq:cg-two}
815: \end{align}
816: The term (\ref{eq:cg-two}) goes to zero, since
817: \[
818: \sum_k\sum_m \Delta_{ik}\left({\bf h}^{(m-n)}\right)
819: \Delta_{kj}\left({\bf h}^{(m)}\right)=\delta_{n,pN}\delta_{ij}
820: \]
821: while (\ref{eq:cg-one}) can be expanded again, giving rise to the $t-2$ analogue and
822: to a term containing $\Delta_{k'k}\left({\bf h}^{(m-2)}\right)\Delta_{kj}
823: \left({\bf h}^{(m)}\right)$. One iterates this process recursively until it reaches 
824: $\left<x_i\left(0\right)^2\right>$, thus contributing another $-1/N$ to the 
825: autocorrelation function.
826: Things are different for $t\ge N$, since terms involving products of
827: the slopes for the same direction will enter the procedure at a certain point
828: in the iteration.
829: Because of these terms, for $t\ge N$ autocorrelation functions will be 
830: identically zero.
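
As an illustration of what this result implies, one can sum the normalized autocorrelation 
function. Assuming that the integrated autocorrelation time is defined as 
$\tau=\sum_{t\ge 0}\left<x_i\left(0\right)x_i\left(t\right)\right>/
\left<x_i\left(0\right)^2\right>$ (a convention chosen here for illustration, which may differ 
by constant factors from the one used in the main text), the ansatz (\ref{eq:cg-ansatz}) 
for $t<N$ and the vanishing of the autocorrelation function for $t\ge N$ give
\[
\tau=\sum_{t=0}^{N-1}\left[1-\frac{t}{N}\right]=N-\frac{N-1}{2}=\frac{N+1}{2},
\]
in the time units of Eq. (\ref{eq:cg-ansatz}), independently of the eigenvalue $\eigena_i$ of the mode.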
831: 
832: \section*{Acknowledgments}
833: It is a pleasure to acknowledge useful discussions with Fulvio Ricci and 
834: Nazario Tantalo, whose suggestions have helped to improve the manuscript.
835: 
836: \begin{thebibliography}{13}
837: \expandafter\ifx\csname natexlab\endcsname\relax\def\natexlab#1{#1}\fi
838: \expandafter\ifx\csname bibnamefont\endcsname\relax
839:   \def\bibnamefont#1{#1}\fi
840: \expandafter\ifx\csname bibfnamefont\endcsname\relax
841:   \def\bibfnamefont#1{#1}\fi
842: \expandafter\ifx\csname citenamefont\endcsname\relax
843:   \def\citenamefont#1{#1}\fi
844: \expandafter\ifx\csname url\endcsname\relax
845:   \def\url#1{\texttt{#1}}\fi
846: \expandafter\ifx\csname urlprefix\endcsname\relax\def\urlprefix{URL }\fi
847: \providecommand{\bibinfo}[2]{#2}
848: \providecommand{\eprint}[2][]{\url{#2}}
849: 
850: \bibitem[{\citenamefont{L{\"u}scher}(1994)}]{lusc94npb}
851: \bibinfo{author}{\bibfnamefont{M.}~\bibnamefont{L{\"u}scher}},
852:   \bibinfo{journal}{Nucl. Phys. B} \textbf{\bibinfo{volume}{418}},
853:   \bibinfo{pages}{637} (\bibinfo{year}{1994}).
854: 
855: \bibitem[{\citenamefont{{de Divitiis} et~al.}(1995)\citenamefont{{de Divitiis},
856:   Frezzotti, Guagnelli, Masetti, and Petronzio}}]{divi+95npb}
857: \bibinfo{author}{\bibfnamefont{G.~M.} \bibnamefont{{de Divitiis}}},
858:   \bibinfo{author}{\bibfnamefont{R.}~\bibnamefont{Frezzotti}},
859:   \bibinfo{author}{\bibfnamefont{M.}~\bibnamefont{Guagnelli}},
860:   \bibinfo{author}{\bibfnamefont{M.}~\bibnamefont{Masetti}}, \bibnamefont{and}
861:   \bibinfo{author}{\bibfnamefont{R.}~\bibnamefont{Petronzio}},
862:   \bibinfo{journal}{Nucl. Phys. B} \textbf{\bibinfo{volume}{455}},
863:   \bibinfo{pages}{274} (\bibinfo{year}{1995}).
864: 
865: \bibitem[{\citenamefont{{de Forcrand}}(1999{\natexlab{a}})}]{forc99parc}
866: \bibinfo{author}{\bibfnamefont{P.}~\bibnamefont{{de Forcrand}}},
867:   \bibinfo{journal}{Parallel Comp.} \textbf{\bibinfo{volume}{25}},
868:   \bibinfo{pages}{1341} (\bibinfo{year}{1999}{\natexlab{a}}).
869: 
870: \bibitem[{\citenamefont{Krajewski and Parrinello}(2005)}]{kraj-parr05prb}
871: \bibinfo{author}{\bibfnamefont{F.~R.} \bibnamefont{Krajewski}}
872:   \bibnamefont{and}
873:   \bibinfo{author}{\bibfnamefont{M.}~\bibnamefont{Parrinello}},
874:   \bibinfo{journal}{Phys. Rev. B} \textbf{\bibinfo{volume}{71}},
875:   \bibinfo{pages}{233105} (\bibinfo{year}{2005}).
876: 
877: \bibitem[{\citenamefont{Krajewski and Parrinello}(2006)}]{kraj-parr06prb}
878: \bibinfo{author}{\bibfnamefont{F.~R.} \bibnamefont{Krajewski}}
879:   \bibnamefont{and}
880:   \bibinfo{author}{\bibfnamefont{M.}~\bibnamefont{Parrinello}},
881:   \bibinfo{journal}{Phys. Rev. B} \textbf{\bibinfo{volume}{73}},
882:   \bibinfo{pages}{041105} (\bibinfo{year}{2006}).
883: 
884: \bibitem[{\citenamefont{{de Forcrand}}(1999{\natexlab{b}})}]{forc99pre}
885: \bibinfo{author}{\bibfnamefont{P.}~\bibnamefont{{de Forcrand}}},
886:   \bibinfo{journal}{Phys. Rev. E} \textbf{\bibinfo{volume}{59}},
887:   \bibinfo{pages}{3698} (\bibinfo{year}{1999}{\natexlab{b}}).
888: 
889: \bibitem[{\citenamefont{Wilcox}(2002)}]{wilc02npb}
890: \bibinfo{author}{\bibfnamefont{W.}~\bibnamefont{Wilcox}},
891:   \bibinfo{journal}{Nucl. Phys. B} \textbf{\bibinfo{volume}{106}},
892:   \bibinfo{pages}{1064} (\bibinfo{year}{2002}).
893: 
894: \bibitem[{\citenamefont{Goodman and Sokal}(1989)}]{good-soka89prd}
895: \bibinfo{author}{\bibfnamefont{J.}~\bibnamefont{Goodman}} \bibnamefont{and}
896:   \bibinfo{author}{\bibfnamefont{A.~D.} \bibnamefont{Sokal}},
897:   \bibinfo{journal}{Phys. Rev. D} \textbf{\bibinfo{volume}{40}},
898:   \bibinfo{pages}{2035} (\bibinfo{year}{1989}).
899: 
900: \bibitem[{\citenamefont{Adler}(1989)}]{adle89npb}
901: \bibinfo{author}{\bibfnamefont{S.~L.} \bibnamefont{Adler}},
902:   \bibinfo{journal}{Nucl. Phys. B} \textbf{\bibinfo{volume}{9}},
903:   \bibinfo{pages}{437} (\bibinfo{year}{1989}).
904: 
905: \bibitem[{\citenamefont{Manousiouthakis and Deem}(1999)}]{mano-deem99jcp}
906: \bibinfo{author}{\bibfnamefont{V.~I.} \bibnamefont{Manousiouthakis}}
907:   \bibnamefont{and} \bibinfo{author}{\bibfnamefont{M.~W.} \bibnamefont{Deem}},
908:   \bibinfo{journal}{J. Chem. Phys.} \textbf{\bibinfo{volume}{110}},
909:   \bibinfo{pages}{2753} (\bibinfo{year}{1999}).
910: 
911: \bibitem[{\citenamefont{Press et~al.}(1986)\citenamefont{Press, Teukolsky,
912:   Vetterling, and Flannery}}]{numerical-recipes}
913: \bibinfo{author}{\bibfnamefont{W.~H.} \bibnamefont{Press}},
914:   \bibinfo{author}{\bibfnamefont{S.~A.} \bibnamefont{Teukolsky}},
915:   \bibinfo{author}{\bibfnamefont{W.~T.} \bibnamefont{Vetterling}},
916:   \bibnamefont{and} \bibinfo{author}{\bibfnamefont{B.~P.}
917:   \bibnamefont{Flannery}}, \emph{\bibinfo{title}{Numerical Recipes in Fortran
918:   77}} (\bibinfo{publisher}{Cambridge University Press}, \bibinfo{year}{1986}).
919: 
920: \bibitem[{\citenamefont{{Van Der Vorst}}(1992)}]{vors92jssc}
921: \bibinfo{author}{\bibfnamefont{H.~A.} \bibnamefont{{Van Der Vorst}}},
922:   \bibinfo{journal}{J. Sci. Stat. Comput.} \textbf{\bibinfo{volume}{13}},
923:   \bibinfo{pages}{631} (\bibinfo{year}{1992}).
924: 
925: \bibitem[{\citenamefont{Adler}(1981)}]{adle81prd}
926: \bibinfo{author}{\bibfnamefont{S.~L.} \bibnamefont{Adler}},
927:   \bibinfo{journal}{Phys. Rev. D} \textbf{\bibinfo{volume}{23}},
928:   \bibinfo{pages}{2901} (\bibinfo{year}{1981}).
929: 
930: \end{thebibliography}
931: \end{document}
932: