cond-mat0011039/c10.tex
1: \documentstyle[graphicx,multicol,prl,aps]{revtex}
2: 
3: \begin{document}
4: 
5: \draft
6: 
7: \title{On the Use of Optimized Monte Carlo Methods for
8: Studying Spin Glasses}
9: 
10: \author{E. Marinari$^1$, G. Parisi$^1$, F. Ricci-Tersenghi$^2$ and
11: F. Zuliani$^1$}
12: 
13: \address{
14: $^1$ Dipartimento di Fisica, INFN and INFM, 
15: Universit\`a di Roma {\em La Sapienza},\\
16: P. A. Moro 2, 00185 Roma, Italy.}
17: 
18: \address{
19: $^2$ The Abdus Salam International Center for Theoretical Physics,
20: Condensed Matter Group,\\
21: Strada Costiera 11, P.O. Box 586, I-34100 Trieste, Italy.}
22: 
23: \date{October $31$, $2000$}
24: 
25: \maketitle
26: 
27: \begin{abstract}  
28: We start from recently published numerical data by Hatano and
29: Gubernatis~\cite{HATGUB} to discuss properties of convergence to
30: equilibrium of optimized Monte Carlo methods (bivariate
31: multi-canonical and parallel tempering).  We show that these data are
32: not thermalized, and that they lead to an erroneous physical picture.
33: We shed some light on why the bivariate multi-canonical Monte Carlo
34: method can fail.
35: \end{abstract}
36: 
37: \pacs{PACS numbers: 75.50.Lk, 75.10.Nr, 75.40.Gb}
38: 
39: One of the main problems with numerical results originating from large
40: scale numerical simulations is that checking them is frequently a task
41: of the same order of magnitude as checking a real experiment: only
42: repeating the full simulation, which demands the availability of
43: computer time and codes, allows a full check of the results.
44: 
45: Here we will use as a starting point the work of reference
46: \cite{HATGUB} to discuss a few points both about optimized Monte Carlo
47: algorithms and about the behavior of $3D$ Edwards-Anderson (EA) spin
48: glasses in the low $T$ phase.  We will start by showing that the
49: numerical results reported in reference \cite{HATGUB}, as far as the
50: low $T$ values are concerned, are wrong: they are not equilibrium
51: averages over the Boltzmann probability. Because of that the physical
52: conclusions reached in the paper, supporting a trivial behavior of the
53: broken phase of $3D$ spin glasses, are wrong. On the contrary recent
54: numerical simulations \cite{RECENT} support, in this respect, a
55: behavior of the system consistent with the Replica Symmetry Breaking
56: (RSB) picture \cite{PARISI}. We will also shed some light on why the
57: optimized Monte Carlo method used in \cite{HATGUB} can fail.
58: 
59: In the following we will first analyze our numerical data obtained by
60: the {\em Parallel Tempering} Monte Carlo method \cite{PT}, focusing on
61: the analysis needed to establish that thermal equilibrium has been
62: reached \cite{OPTIMIZED}: we will use a large number of severe
63: criteria that ensure that thermalization has been reached.  After
64: showing that the results of \cite{HATGUB} are not correct in the low
65: $T$ region we will discuss some preliminary simulations done using the
66: same method used in \cite{HATGUB}, a bivariate version of the
67: Multi-Canonical Monte Carlo \cite{BERG_NEU}, and we will point out a
68: series of reasons why a careless implementation of this
69: strategy can fail.
70: 
71: %This is figure 1.
72: \begin{figure}
73: \centering\includegraphics[width=0.6\textwidth,angle=0]{f05.eps}
74: \caption[a]{The Binder parameter, $B(t)$, averaged over logarithmic
75: time windows, as a function of time, at $T=0.5$.}
76: \protect\label{F-05}
77: \end{figure}
78: 
79: Let us start from our numerical data obtained through parallel
80: tempering\footnote{For the sake of complete reliability, and without fear
81: of appearing overcautious, we have chosen to rewrite all our codes in
82: a double blind pattern, with two different sets of programmers, using
83: different programming languages and different random number
84: generators: they always give statistically compatible results.}.  We
85: have simulated a $3D$ Edwards-Anderson spin glass, with binary random
86: quenched couplings, linear size $L=8$ (the largest size used in
87: \cite{HATGUB}), down to $T=0.5\simeq 0.5\, T_c$: let us note that in
88: our simulations for the same $T$ values we are able to thermalize
89: reliably lattices up to $L=16$, and that we just discuss here results
90: about the $L=8$ lattice, where we are completely confident about
91: thermalization, only because this is the largest lattice studied in
92: \cite{HATGUB}. We use a minimum value of the temperature
93: $T_{\mbox{min}}=0.5$, a number of temperatures $N_T=49$ and a constant
94: temperature step $\delta T = \frac{1}{30}$.  The measured correlation
95: times are always smooth functions of $T$ and no anomalies are
96: detected.
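
For concreteness, assuming that the temperatures form a uniform grid
starting at $T_{\mbox{min}}$ (the indexing used here is only
illustrative), the parameters quoted above correspond to
\begin{equation}
  T_k = T_{\mbox{min}} + k\,\delta T\ , \quad k=0,\ldots,N_T-1\ ,
\end{equation}
so that the highest temperature of the set would be
$T_{N_T-1}=0.5+48/30=2.1$.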
97: 
98: Our data at high $T$ turn out to be statistically compatible with the
99: ones of \cite{HATGUB}: in the high $T$ region there are no problems.
100: 
101: In figure \ref{F-05} we plot the value of the Binder parameter,
102: 
103: \begin{equation}
104:   B(t)\equiv\frac12\left(3-
105:   \frac
106:   {  \overline{\langle q^4(t)\rangle}    }
107:   { {\overline{\langle q^2(t)\rangle}}^2 }
108:   \right)\ ,
109: \end{equation}
110: averaged over logarithmic time windows, as a function of time at
111: $T=0.5$ (close to $0.5\,T_c$).  Averaging over logarithmic windows is
112: the safe approach to check convergence in time. We first average over
113: the last half of the total time extent of the run: this is the last
114: point on the right of the plot. We subdivide in the same way the other
115: half of the data, and the second point on the right is the average
116: over the second half of this time span: we continue in this way till
117: the origin of our Monte Carlo run. With a straight line we plot the
118: asymptotic data from \cite{HATGUB} as extracted from figure $7$ in the
119: paper (since we were estimating by hand we have been generous on the
120: statistical error): here there is no time dependence, we only plot
121: with a straight line the asymptotic value. The discrepancy between our
122: data and the data of \cite{HATGUB} is very large and statistically very
123: significant: definitely not an accident.
124: 
125: %This is figure 2.
126: \begin{figure}
127: \centering\includegraphics[width=0.6\textwidth,angle=0]{f06.eps}
128: \caption[a]{As in figure \ref{F-05}, but for $T=0.6$.}
129: \protect\label{F-06}
130: \end{figure}
131: 
132: In figure \ref{F-06} we plot the $T=0.6$ data from the same run,
133: always for the Binder parameter averaged over logarithmic time
134: windows: here $T$ is higher, and one could feel safer about
135: thermalization, but again there is a clear and significant discrepancy
136: between our data and those of \cite{HATGUB}. The remarkable stability
137: of our data for $B(t)$ at low $T$ is already a very good indicator of
138: a high level of thermalization. The results are stable at least during
139: the last eight subdivisions of our two million step runs, i.e. at least
140: from times going from $10^4$ to $2\cdot10^6$.
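
The windowing can be made explicit as follows (a sketch in a notation
introduced only for this illustration: $t_{\mbox{max}}$ is the total
length of the run and $k=0,1,2,\ldots$ labels the windows starting from
the right of the plot).  The $k$-th window collects the measurements
taken at times
\begin{equation}
  \frac{t_{\mbox{max}}}{2^{k+1}} < t \le \frac{t_{\mbox{max}}}{2^{k}}\ ,
\end{equation}
so that every window is as long as all the earlier times together.
With $t_{\mbox{max}}=2\cdot10^6$ the eight rightmost windows
($k=0,\ldots,7$) start at $t_{\mbox{max}}/2^{8}\simeq 7.8\cdot10^3$,
i.e. at a time of order $10^4$.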
141: 
142: In order to be sure we are not trapped in some metastable situation we
143: have to check standard criteria for convergence, which in the case of
144: optimized dynamics can be quite difficult to assess
145: \cite{OPTIMIZED}. Let us note for example that in recent numerical
146: simulations \cite{RECENT} a careful discussion shows that weaker
147: criteria can be sufficient to guarantee thermalization, making it
148: possible to simulate more disorder samples with the same amount of
149: computer time (since one needs fewer thermal sweeps per sample). Here,
150: since thermalization is the main issue, we will check all of the most
151: stringent criteria.
152: 
153: First of all we have checked the acceptance rates of the tempering
154: sweeps in temperature: a bad choice of the $T$ values can make the
155: swap of the temperature values too rare. In our case the rates are very
156: high, of the order of $0.7$ over the whole temperature range: our parallel
157: tempering scheme is performing very well.
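
For reference, in the usual formulation of parallel tempering
\cite{PT} (quoted here only as an illustration, with $\beta_k=1/T_k$
and $E_k$ the energy of the configuration currently at $\beta_k$; this
notation is not used elsewhere in the text) the swap of the
configurations at two neighboring temperatures is accepted with
probability
\begin{equation}
  p_{\mbox{swap}} = \min\left\{1,
  \exp\left[(\beta_k-\beta_{k+1})(E_k-E_{k+1})\right]\right\}\ .
\end{equation}
A temperature spacing that is too large makes the exponent typically
large and negative, and accepted swaps correspondingly rare.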
158: 
159: Secondly we have checked, as customary, if all configurations
160: (we have, as we said, $49$ of them) have spent a similar amount of
161: time in each one of the $49$ allowed $T$ values. This criterion is
162: important, since the first one might not be sufficient: spin
163: configurations could be spending time swapping 
164: among neighboring
165: $T$ values locally,
166: but never leave the high or the low $T$ region. Our {\em permanence
167: histograms} are very good: because of the large time extent of the
168: runs all configurations have visited all regions of the $T$ phase
169: space, and the permanence histograms are very flat. Again, this
170: is a powerful test of thermalization.
171: 
172: The last point we have checked is the symmetry under the exchange
173: $q\to-q$ of the $P_J(q)$ for the {\em individual} samples. Since the
174: overall flip of all spins is supposed to be a very slow mode of the
175: dynamics, once we have good statistics on this mode we expect to have
176: reached all the relevant regions of the phase space. Again, the
177: symmetry is excellent for all individual samples (even for the more
178: complex samples where the $P_J(q)$ has a non-trivial structure).
179: 
180: We consider this body of evidence clear: our data are thermalized,
181: the numerical data give evidence in favor of the RSB picture (as
182: confirmed by the data of \cite{RECENT}, where even at very low $T$
183: values one sees that $P(0)$ does not depend on $L$) and the method
184: used in \cite{HATGUB} did not allow a proper thermalization.
185: 
186: In order to get a better understanding of the situation, and some
187: hints about the reason for the failure of \cite{HATGUB}, we have
188: implemented a code for rerunning their bivariate multi-canonical
189: simulations.
190: 
191: Our simulations closely follow the description given in the Appendix
192: of reference \cite{HATGUB} and by Hatano himself \cite{HATANO}.  The
193: analysis of a few samples of sizes $L=4$, $6$, $8$ has been sufficient
194: in order to understand where the thermalization problems may come
195: from.  Unless otherwise specified we have always used $10^6$ Monte
196: Carlo Sweeps (MCS) for thermalizing and $10^7$ MCS for taking
197: measurements in {\em each} multi-canonical cycle.  The same number of
198: MCS has been used by the authors of \cite{HATGUB} only for $L=10$
199: \cite{HATANO} (fewer iterations have been used for smaller lattice
200: sizes).
201: 
202: The most delicate point during the thermalization process is the role
203: played by the {\em entropic barriers} during the multi-canonical
204: simulation.  In a model which undergoes a first order transition the
205: slowing down of the simulation at the critical point is essentially
206: due to the presence of a huge {\em energetic barrier} between the two
207: free energy minima. In this case the multi-canonical simulation works
208: fine \cite{BERG_NEU}, and it rapidly converges towards a regime where
209: every energy is sampled with the right probability, i.e. uniformly.
210: Problems may arise when the multi-canonical method is applied to spin
211: glasses or in general to models where entropic barriers play a central
212: role. In this respect a study of its performance in models with
213: only entropic barriers (e.g. the backgammon model \cite{BACKGAMMON}) would
214: be illuminating.
215: 
216: \begin{figure}
217: \centering\includegraphics[width=0.6\textwidth,angle=0]{plot2.eps}
218: \caption[a]{The fraction of $(e,q)$ space where the histogram $h(e,q)$
219: is different from zero as a function of the multi-canonical cycle
220: number.  Even for a very small system ($L=6$) strong convergence
221: problems arise.}
222: \label{F-FRAZ}
223: \end{figure}
224: 
225: Let us focus now specifically on the $3D$ EA model, and see how the
226: estimated density of states (DoS), $D(e,q)$, converges to the exact
227: one.  In particular we are interested in the histogram $h(e,q)$ which
228: counts the number of times, during a multi-canonical cycle, the system
229: is in a macroscopic state $(e,q)$ with energy $e$ and overlap $q$.
230: Thermalization is achieved when $h(e,q)$ is flat and much larger than
231: $1$ for all the physically allowed pairs $(e,q)$.  Starting from a
232: flat DoS, the region where $h(e,q) \gg 1$ broadens with the number of
233: multi-canonical cycles and eventually reaches the boundaries of the
234: allowed domain, $e \in [-e_0,e_0]$, $q \in [-1,1]$, where $-e_0$ is
235: the ground-state energy (see the first two snapshots in figure
236: \ref{F-ISTO}, that we will discuss in more detail later on).  In
237: order to describe quantitatively the histogram evolution we plot in
238: figure \ref{F-FRAZ} the fraction of the $(e,q)$ space where $h(e,q)
239: \neq 0$, that is the fraction of macroscopic $(e,q)$ configurations
240: visited by the system during a multi-canonical cycle.  We expect this
241: fraction to increase more or less linearly during the first
242: multi-canonical cycles and then to reach a plateau when the simulation is
243: thermalized (see figure \ref{F-FRAZ}.a, where things look good).  For
244: all the $L=4$ samples simulated we have observed this correct
245: behavior.  For the $L=6$ samples, on the contrary, problems arise.
246: First, if the number of MCS is not large enough the simulation does
247: not converge at all.  In figure \ref{F-FRAZ}.b we show the results for
248: the same sample shown in figure \ref{F-FRAZ}.a, with the only
249: difference that $10^6$ MCS were used instead of $10^7$: here
250: thermalization problems are evident, since in some situations the
251: system simply gets trapped in a very small region of the phase space.
252: In different samples we have found analogous problems also when using
253: $10^7$ MCS (see figures \ref{F-FRAZ}.c and \ref{F-FRAZ}.d).  With
254: $10^6$ MCS the parallel tempering method is able to thermalize samples
255: up to $L=8$ for temperatures down to $T=0.3$ (for example at the
256: lowest $T$ value the Binder parameter thermalizes in $10^6$ MCS):
257: the bivariate multi-canonical method does not seem to be very
258: efficient for spin glasses.
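
The visited fraction plotted in figure \ref{F-FRAZ} can be written,
for instance, as follows (the symbols $f$ and ${\cal A}$ are introduced
only here, and the normalization by the number of allowed pairs is an
assumption):
\begin{equation}
  f = \frac{1}{|{\cal A}|}\sum_{(e,q)\in{\cal A}}
  \theta\left(h(e,q)\right)\ ,
\end{equation}
where ${\cal A}$ is the set of physically allowed $(e,q)$ pairs, and
$\theta(x)=1$ for $x>0$ and $0$ otherwise.  With this definition a
cycle in which the system gets trapped in a small region of the
$(e,q)$ space gives $f\ll 1$.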
259: 
260: \begin{figure}
261: \centering\includegraphics[width=0.6\textwidth,angle=0]{plot1.eps}
262: \caption[a]{The evolution of the histogram $h(e,q)$ as a function of
263: multi-canonical cycles (sample \#2 in figure \ref{F-FRAZ}).}
264: \label{F-ISTO}
265: \end{figure}
266: 
267: In figure \ref{F-ISTO} we show the histogram evolution for sample \#2
268: (the same used in figure \ref{F-FRAZ}.c).  The four snapshots
269: correspond to the black dots in figure \ref{F-FRAZ}.c and clearly show
270: that the system, after reaching an apparently thermalized state with a
271: flat and broad $h(e,q)$, instead of keeping it for all subsequent
272: times, gets trapped in very small regions of the $(e,q)$ space (the
273: third and fourth snapshots in figure \ref{F-ISTO}).
274: 
275: How can we explain this behavior? During the first multi-canonical
276: cycles the dynamics of the system in the $(e,q)$ space is diffusive in
277: character, while when approaching the boundaries of the $e-q$ plane
278: (especially the energy ones) the system often gets trapped for very
279: long times.  The end of the diffusive behavior near the ground
280: states can be easily explained in terms of accessibility, that is the
281: probability of decreasing the energy when the system is in an $(e,q)$
282: configuration and makes a random move to a neighboring configuration.
283: For not too low energies the accessibility is high: in this case a
284: random walk in the configuration space corresponds to a random walk
285: in the $(e,q)$ space, which is a projection of the previous one.  On
286: the contrary for energies close to the one of the ground states the
287: accessibility is very low, due to the presence of a large number of
288: higher local minima. For example if the system is at the bottom of a
289: valley in the space of microscopic configurations, in order to further
290: decrease its energy (a little step in the macroscopic $(e,q)$ space)
291: it may need a long time, the time to find a deeper valley.  The
292: dynamics turns out to be strongly constrained for energies close to
293: the boundaries.
294: 
295: Having in mind that the dynamics becomes slower and slower close to
296: the energy boundaries, one can easily explain the peaks in figure
297: \ref{F-ISTO}.  The system first relaxes in a uniform way on a large
298: part of the $(e,q)$ space, the more accessible one.  Many
299: allowed $(e,q)$ values are still unvisited (because of the low
300: accessibility): their DoS estimate becomes very small and their
301: corresponding weights, $W(e,q) = 1 / D(e,q)$, huge.  When the system
302: reaches one of these configurations it cannot leave it until the end
303: of the multi-canonical cycle, when $W(e,q)$ will be updated again.
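
This trapping can be read off from the acceptance rule.  Assuming a
standard Metropolis update with the multi-canonical weights (an
assumption made here for illustration, since the details of the update
are not spelled out in the text), a move from $(e,q)$ to $(e',q')$ is
accepted with probability
\begin{equation}
  p = \min\left\{1,\frac{W(e',q')}{W(e,q)}\right\}
    = \min\left\{1,\frac{D(e,q)}{D(e',q')}\right\}\ ,
\end{equation}
so that leaving a state whose estimated $D(e,q)$ is much smaller than
the one of all its neighbors is strongly suppressed until the weights
are updated.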
304: 
305: In order to improve the convergence we have also tried to start with a
306: DoS estimated from the one of a thermalized $L=4$ sample.  The
307: convergence seems to be faster; however, the problems giving rise to
308: the peak structure in the histogram remain unaltered.
309: 
310: Given that the thermalization task appears to be very hard, one should
311: at least try to use all thermalization checks available.  For example
312: the one based on the symmetry of the overlap distribution for every
313: sample, $P_J(q)$, should always be carefully checked: this analysis is
314: lacking in \cite{HATGUB}.
315: 
316: \begin{figure}
317: \centering\includegraphics[width=0.6\textwidth,angle=0]{plot4.eps}
318: \caption[a]{For a given $L=6$ sample (sample \#3 in figure \ref{F-FRAZ}) the
319: $P(q)$ measured with parallel tempering (top left) is symmetric, while
320: it may become much narrower when a multi-canonical method is
321: employed.}
322: \label{F-PQ}
323: \end{figure}
324: 
325: In figure \ref{F-PQ} we show the overlap distribution $P_J(q)$ for the
326: single $L=6$ sample considered in figure \ref{F-FRAZ}.d at a low
327: temperature $T=0.3$ (these data come from a further parallel tempering
328: simulation, pushed to lower $T$ values).  In figure \ref{F-PQ}.a we
329: show the $P(q)$ measured with a parallel tempering simulation.  Its
330: very accurate symmetry is strong evidence of complete
331: thermalization.  In the next $3$ plots (b, c and d) we show with
332: continuous lines the $P(q)$ measured with the multi-canonical method
333: (the chosen times correspond to the dots in figure \ref{F-FRAZ}.d).
334: We always superimpose the thermalized $P(q)$ for comparison.  It is
335: clear that, in the best case (see figure \ref{F-PQ}.b), the
336: multi-canonical method is not able to give results as good as the
337: parallel tempering does: in the worst cases it just gives a completely
338: wrong $P_J(q)$, with a single or a double peak.  The system may very
339: easily get stuck somewhere, and in these cases the estimated $P(q)$
340: would look much narrower than the correct one (see figure \ref{F-PQ}.c
341: and figure \ref{F-PQ}.d): measurements taken in such a biased
342: situation give spurious evidence in favor of a single peak $P(q)$,
343: and consequently of the droplet scenario.
344: 
345: As a last piece of evidence we consider the samples where the
346: bivariate multi-canonical method has been well behaved: the scaling of the
347: visited fraction of the $(e,q)$ phase space (for well thermalized
348: samples) reported in figure \ref{F-SCALING} supports the picture of a
349: diffusion-like evolution of the histogram.  The area of support of the
350: histogram grows more or less linearly with the number of
351: multi-canonical cycles (the best exponent estimate is 0.9).  Moreover,
352: the time for reaching the plateau (equilibration time) grows as
353: $\tau \propto L^{3.37} \propto N^{1.12}$, which seems to be very close
354: to the theoretical lower bound ($\tau \propto N$).  However this
355: result would hold {\em only if} the number of MCS per multi-canonical
356: cycle necessary for a proper thermalization were independent of the
357: system size $N$.  As we have already seen this is not true.  Indeed,
358: using the same $10^7$ MCS per multi-canonical cycle, the fraction of
359: well thermalized samples we have obtained is 100\% for $L=4$, around
360: 40\% for $L=6$ and 0\% for $L=8$.  Because the required number of MCS
361: per multi-canonical cycle grows with $N$ (apparently very fast), our
362: conclusion is that $\tau$ grows much faster than $N$ (simple arguments
363: by Berg \cite{BERG} suggest at least as $N^2$).
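
The conversion between the two exponents quoted above follows from
taking $N=L^3$ for the number of spins of the $3D$ lattice:
\begin{equation}
  \tau \propto L^{3.37} = \left(N^{1/3}\right)^{3.37}
  = N^{3.37/3} \simeq N^{1.12}\ .
\end{equation}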
364: 
365: Concluding, we have seen how difficult it is to bring a bivariate
366: multi-canonical simulation of spin glasses to equilibrium and,
367: consequently, one possible reason for the failure of \cite{HATGUB} to
368: thermalize for $L=8$ (we have checked the failure of thermalization
369: with independent parallel tempering simulations).  When we say that
370: the simulation is not thermalized we mean that we cannot use the
371: resulting DoS\footnote{Note that the DoS estimate actually used in
372: the measurements in \cite{HATGUB} is $D(e,q) h(e,q)$ and so it is
373: strongly affected by non-uniformities in the histogram.}  in order to
374: estimate the averages of the observables at all temperatures.  In
375: particular, as long as the simulation does not visit the
376: ground states many times, we cannot claim to have enough information on the
377: ground-state structure.  However it may well be that, after a
378: certain number of multi-canonical cycles, the estimated DoS gives good
379: averages at higher temperatures, which do not change if new low energy
380: states are reached.  We believe this is the case in \cite{HATGUB},
381: where data at not too low temperatures are perfectly compatible with
382: the ones obtained in previous work and fit the RSB scenario.
383: 
384: \begin{figure}
385: \centering\includegraphics[width=0.6\textwidth,angle=0]{plot3.eps}
386: \caption[a]{The scaling of the visited fraction of the $(e,q)$ phase
387: space (for well thermalized samples) shows that the equilibration time
388: must grow with the system size faster than $\tau \propto N^{1.1}$.}
389: \label{F-SCALING}
390: \end{figure}
391: 
392: We thank N. Hatano for useful correspondence regarding the
393: bivariate method.
394: 
395: \begin{references}
396: 
397: \bibitem{HATGUB}
398:   N. Hatano and J.E. Gubernatis,
399:   preprint {\tt cond-mat/0008115}.
400: 
401: \bibitem{RECENT}
402:   H.G. Katzgraber, M. Palassini and A.P. Young,
403:   preprint {\tt cond-mat/0007113}.
404: 
405: \bibitem{PARISI}
406:   G. Parisi, 
407:   Phys. Rev. Lett. {\bf 43}, 1754 (1979);
408:   J. Phys. A {\bf 13}, 1101, 1887, L115 (1980); 
409:   Phys. Rev. Lett. {\bf 50}, 1946 (1983); 
410:   M. M\'ezard, G. Parisi and M.A. Virasoro,
411:   {\em Spin Glass Theory and Beyond}
412:   (World Scientific, Singapore 1987).
413: 
414: \bibitem{PT}
415: K. Hukushima and K. Nemoto,
416: J. Phys. Soc. Japan {\bf 65}, 1604 (1996);
417: M.C. Tesi, E.J. Janse van Rensburg, E. Orlandini and S.G.~Whittington, 
418: J. Stat. Phys. {\bf 82}, 155 (1996).
419: 
420: \bibitem{OPTIMIZED}
421: E. Marinari, {\em Optimized Monte Carlo Methods}
422: in {\em Advances in Computer Simulation}, 
423: edited by J. Kert\'esz and I. Kondor, Springer-Verlag (1997).
424: 
425: \bibitem{BERG_NEU}
426:   B.A. Berg and T. Neuhaus, Phys. Rev. Lett. {\bf 68}, 9 (1992).
427: 
428: \bibitem{HATANO}
429:   N. Hatano, private communication.
430: 
431: \bibitem{BACKGAMMON}
432:   F. Ritort, Phys. Rev. Lett. {\bf 75}, 1190 (1995).
433: 
434: \bibitem{BERG}
435:   B.A. Berg, private communication.
436: 
437: \end{references}
438: \end{document}                                      
439: