
\documentstyle[graphicx,multicol,prl,aps]{revtex}

\begin{document}

\draft

\title{On the Use of Optimized Monte Carlo Methods for
Studying Spin Glasses}

\author{E. Marinari$^1$, G. Parisi$^1$, F. Ricci-Tersenghi$^2$ and
F. Zuliani$^1$}

\address{
$^1$ Dipartimento di Fisica, INFN and INFM,
Universit\`a di Roma {\em La Sapienza},\\
P. A. Moro 2, 00185 Roma, Italy.}

\address{
$^2$ The Abdus Salam International Center for Theoretical Physics,
Condensed Matter Group,\\
Strada Costiera 11, P.O. Box 586, I-34100 Trieste, Italy.}

\date{October $31$, $2000$}

\maketitle

\begin{abstract}
We start from numerical data recently published by Hatano and
Gubernatis~\cite{HATGUB} to discuss the convergence to equilibrium of
optimized Monte Carlo methods (bivariate multi-canonical and parallel
tempering).  We show that these data are not thermalized, and that
they lead to an erroneous physical picture.  We shed some light on why
the bivariate multi-canonical Monte Carlo method can fail.
\end{abstract}

\pacs{PACS numbers: 75.50.Lk, 75.10.Nr, 75.40.Gb}

One of the main problems with numerical results obtained from
large-scale simulations is that checking them is frequently as
demanding as checking a real experiment: only repeating the full
simulation, which requires both computer time and codes, allows a
full check of the results.

Here we use the work of reference \cite{HATGUB} as a starting point to
discuss a few points about both optimized Monte Carlo algorithms and
the behavior of $3D$ Edwards-Anderson (EA) spin glasses in the low $T$
phase.  We start by showing that the numerical results reported in
reference \cite{HATGUB} are wrong as far as the low $T$ values are
concerned: they are not equilibrium averages over the Boltzmann
probability.  Because of that, the physical conclusions reached in the
paper, supporting a trivial behavior of the broken phase of $3D$ spin
glasses, are wrong.  On the contrary, recent numerical simulations
\cite{RECENT} support, in this respect, a behavior of the system
consistent with the Replica Symmetry Breaking (RSB) picture
\cite{PARISI}.  We also shed some light on why the optimized Monte
Carlo method used in \cite{HATGUB} can fail.

In the following we first analyze our numerical data, obtained with
the {\em Parallel Tempering} Monte Carlo method \cite{PT}, focusing on
the analysis needed to establish that thermal equilibrium has been
reached \cite{OPTIMIZED}: we use a large number of severe criteria to
ensure thermalization.  After showing that the results of
\cite{HATGUB} are not correct in the low $T$ region, we discuss some
preliminary simulations done with the same method used in
\cite{HATGUB}, a bivariate version of the Multi-Canonical Monte Carlo
\cite{BERG_NEU}, and we point out a series of reasons why a careless
implementation of this strategy can fail.

%This is figure 1.
\begin{figure}
\centering\includegraphics[width=0.6\textwidth,angle=0]{f05.eps}
\caption[a]{The Binder parameter, $B(t)$, averaged over logarithmic
time windows, as a function of time, at $T=0.5$.}
\protect\label{F-05}
\end{figure}

Let us start from our numerical data obtained through parallel
tempering\footnote{For the sake of complete reliability, and without
fear of appearing overcautious, we have chosen to rewrite all our
codes in a double-blind fashion, with two different sets of
programmers using different programming languages and different
random number generators: they always give statistically compatible
results.}.  We have simulated a $3D$ Edwards-Anderson spin glass with
binary random quenched couplings and linear size $L=8$ (the largest
size used in \cite{HATGUB}), down to $T=0.5\simeq 0.5\, T_c$.  Let us
note that for the same $T$ values our simulations reliably thermalize
lattices up to $L=16$; we discuss here only the $L=8$ lattice, where
we are completely confident about thermalization, because this is the
largest lattice studied in \cite{HATGUB}.  We use a minimum
temperature $T_{\mbox{min}}=0.5$, a number of temperatures $N_T=49$
and a constant temperature step $\delta T = \frac{1}{30}$.  The
measured correlation times are always smooth functions of $T$ and no
anomalies are detected.
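
As a quick check of the temperature grid quoted above (assuming, as
the constant step suggests, equally spaced values starting at
$T_{\mbox{min}}$; the index $k$ and the symbol $T_{\mbox{max}}$ are
introduced here only for illustration), the set of temperatures is
\begin{equation}
  T_k = T_{\mbox{min}} + k\,\delta T = 0.5 + \frac{k}{30}\ ,
  \qquad k=0,1,\ldots,N_T-1=48\ ,
\end{equation}
so that the highest temperature is $T_{\mbox{max}} = 0.5 + 48/30 =
2.1$, roughly twice the critical temperature implied by
$T=0.5\simeq 0.5\,T_c$.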

Our data at high $T$ turn out to be statistically compatible with
those of \cite{HATGUB}: in the high $T$ region there are no problems.

In figure \ref{F-05} we plot the value of the Binder parameter,
\begin{equation}
  B(t)\equiv\frac12\left(3-
  \frac
  {  \overline{\langle q^4(t)\rangle}    }
  { {\overline{\langle q^2(t)\rangle}}^2 }
  \right)\ ,
\end{equation}
averaged over logarithmic time windows, as a function of time at
$T=0.5$ (close to $0.5\,T_c$).  Averaging over logarithmic windows is
a safe way to check convergence in time.  We first average over the
last half of the total time extent of the run: this is the last point
on the right of the plot.  We subdivide the remaining half of the data
in the same way, so that the second point from the right is the
average over the second half of that time span, and we continue in
this way down to the origin of our Monte Carlo run.  With a straight
line we plot the asymptotic value from \cite{HATGUB}, as extracted
from figure $7$ of that paper (since we estimated it by hand we have
been generous with the statistical error): this value carries no time
dependence.  The discrepancy between our data and the data of
\cite{HATGUB} is very large and statistically very significant:
definitely not an accident.
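
Two limiting cases, which follow directly from the definition of the
Binder parameter and are added here only as an illustration, help in
reading the plot.  If the averaged moments satisfy the Gaussian
relation $\overline{\langle q^4\rangle} = 3\,{\overline{\langle
q^2\rangle}}^2$ one finds $B=0$, while if the overlap only takes two
symmetric values $q=\pm q_0$ in every sample (the symbol $q_0$ is
introduced here for illustration) one has $\overline{\langle
q^4\rangle} = {\overline{\langle q^2\rangle}}^2$ and $B=1$:
\begin{equation}
  B_{\mbox{Gauss}} = \frac12\left(3-3\right) = 0\ ,
  \qquad
  B_{\pm q_0} = \frac12\left(3-1\right) = 1\ .
\end{equation}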

%This is figure 2.
\begin{figure}
\centering\includegraphics[width=0.6\textwidth,angle=0]{f06.eps}
\caption[a]{As in figure \ref{F-05}, but for $T=0.6$.}
\protect\label{F-06}
\end{figure}

In figure \ref{F-06} we plot the $T=0.6$ data from the same run, again
for the Binder parameter averaged over logarithmic time windows.  Here
$T$ is higher, and one could feel safer about thermalization, but
again there is a clear and significant discrepancy between our data
and those of \cite{HATGUB}.  The remarkable stability of our data for
$B(t)$ at low $T$ is already a very good indicator of a high level of
thermalization.  The results are stable at least during the last
eight subdivisions of our two million step runs, i.e. at least for
times going from $10^4$ to $2\cdot10^6$.
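
The logarithmic windows can be written explicitly.  As a sketch (the
symbols $t_{\mbox{max}}$ and $k$ are introduced here only for
illustration), for a run of $t_{\mbox{max}}$ steps the $k$-th window,
counted from the end of the run, is
\begin{equation}
  \left(\frac{t_{\mbox{max}}}{2^{k}},
        \frac{t_{\mbox{max}}}{2^{k-1}}\right]\ ,
  \qquad k=1,2,\ldots\ .
\end{equation}
For $t_{\mbox{max}}=2\cdot 10^6$ the eighth window starts at
$t_{\mbox{max}}/2^8 \simeq 7.8\cdot 10^3$, so that the last eight
windows cover times from about $10^4$ to $2\cdot 10^6$.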

In order to be sure that we are not trapped in some metastable
situation we have to check standard convergence criteria, which in
the case of optimized dynamics can be quite difficult to assess
\cite{OPTIMIZED}.  Let us note, for example, that in recent numerical
simulations \cite{RECENT} a careful discussion shows that weaker
criteria can be sufficient to guarantee thermalization, which makes
it possible to simulate more disorder samples with the same amount of
computer time (since one needs fewer thermal sweeps per sample).
Here, since thermalization is the main issue, we check all of the
most stringent criteria.

First of all we have checked the acceptance rates of the tempering
swaps in temperature: a bad choice of the $T$ values can make the
swap of the temperature value too rare.  In our case the rates are
very high, of the order of $0.7$ over the whole temperature range:
our parallel tempering scheme is performing very well.
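
For reference, the monitored quantity can be made explicit.  Assuming
the usual formulation of parallel tempering \cite{PT} (the rule is
not spelled out in the text, so the notation below is an assumption:
$\beta_k=1/T_k$, and $E_k$ is the energy of the configuration
currently at $T_k$), a swap between two neighboring temperatures is
accepted with probability
\begin{equation}
  p_{\mbox{swap}} = \min\left\{1,
  \exp\left[(\beta_{k+1}-\beta_k)(E_{k+1}-E_k)\right]\right\}\ ,
\end{equation}
and the acceptance rate is the average of $p_{\mbox{swap}}$ over the
run.  A temperature step that is too large makes the typical exponent
large and negative, and the swaps become rare.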

Secondly we have checked, as customary, whether all configurations
(we have, as we said, $49$ of them) have spent a similar amount of
time at each one of the $49$ allowed $T$ values.  This criterion is
important, since the first one might not be sufficient: spin
configurations could spend their time swapping locally among
neighboring $T$ values without ever leaving the high or the low $T$
region.  Our {\em permanence histograms} are very good: because of
the large time extent of the runs all configurations have visited all
regions of the $T$ space, and the permanence histograms are very
flat.  Again, this is a powerful test of thermalization.

The last point we have checked is the symmetry under the exchange
$q\to-q$ of $P_J(q)$ for the {\em individual} samples.  Since the
overall flip of all spins is expected to be a very slow mode of the
dynamics, once we have good statistics on this mode we expect to have
reached all the relevant regions of phase space.  Again, the symmetry
is excellent for all individual samples (even for the more complex
samples where $P_J(q)$ has a non-trivial structure).

We consider this body of evidence to be clear: our data are
thermalized, the numerical data provide evidence in favor of the RSB
picture (as confirmed by the data of \cite{RECENT}, where even at
very low $T$ values one sees that $P(0)$ does not depend on $L$), and
the method used in \cite{HATGUB} did not allow a proper
thermalization.

In order to get a better understanding of the situation, and some
hints about the reason for the failure of \cite{HATGUB}, we have
implemented a code to rerun their bivariate multi-canonical
simulations.

Our simulations closely follow the description given in the Appendix
of reference \cite{HATGUB} and by Hatano himself \cite{HATANO}.  The
analysis of a few samples of sizes $L=4$, $6$, $8$ has been
sufficient to understand where the thermalization problems may come
from.  Unless otherwise specified we have always used $10^6$ Monte
Carlo Sweeps (MCS) for thermalizing and $10^7$ MCS for taking
measurements in {\em each} multi-canonical cycle.  The same number of
MCS has been used by the authors of \cite{HATGUB} only for $L=10$
\cite{HATANO} (fewer iterations have been used for smaller lattice
sizes).

The most delicate point during the thermalization process is the role
played by {\em entropic barriers} during the multi-canonical
simulation.  In a model which undergoes a first order transition the
slowing down of the simulation at the critical point is essentially
due to the presence of a huge {\em energetic barrier} between the two
free energy minima.  In this case the multi-canonical simulation
works fine \cite{BERG_NEU}, and it rapidly converges towards a regime
where every energy is sampled with the right probability, i.e.
uniformly.  Problems may arise when the multi-canonical method is
applied to spin glasses or, in general, to models where entropic
barriers play a central role.  In this respect a study of its
performance in models with only entropic barriers (e.g. the
backgammon model \cite{BACKGAMMON}) would be illuminating.

\begin{figure}
\centering\includegraphics[width=0.6\textwidth,angle=0]{plot2.eps}
\caption[a]{The fraction of $(e,q)$ space where the histogram $h(e,q)$
is different from zero as a function of the multi-canonical cycle
number.  Even for a very small system ($L=6$) strong convergence
problems arise.}
\label{F-FRAZ}
\end{figure}

Let us now focus specifically on the $3D$ EA model, and see how the
estimated density of states (DoS), $D(e,q)$, converges to the exact
one.  In particular we are interested in the histogram $h(e,q)$,
which counts the number of times, during a multi-canonical cycle, the
system is in a macroscopic state $(e,q)$ with energy $e$ and overlap
$q$.  Thermalization is achieved when $h(e,q)$ is flat and much
larger than $1$ for all the physically allowed pairs $(e,q)$.
Starting from a flat DoS, the region where $h(e,q) \gg 1$ broadens
with the number of multi-canonical cycles and eventually reaches the
boundaries of the allowed domain, $e \in [-e_0,e_0]$, $q \in [-1,1]$,
where $-e_0$ is the ground-state energy (see the first two snapshots
in figure \ref{F-ISTO}, which we discuss in more detail later on).
In order to describe the histogram evolution quantitatively we plot
in figure \ref{F-FRAZ} the fraction of the $(e,q)$ space where
$h(e,q) \neq 0$, that is the fraction of macroscopic $(e,q)$
configurations visited by the system during a multi-canonical cycle.
We expect this fraction to increase more or less linearly during the
first multi-canonical cycles and then to reach a plateau when the
simulation is thermalized (see figure \ref{F-FRAZ}.a, where things
look good).  For all the $L=4$ samples simulated we have observed
this correct behavior.  For the $L=6$ samples, on the contrary,
problems arise.  First, if the number of MCS is not large enough the
simulation does not converge at all.  In figure \ref{F-FRAZ}.b we
show the results for the same sample shown in figure \ref{F-FRAZ}.a,
with the only difference that $10^6$ MCS were used instead of $10^7$:
here thermalization problems are evident, since in some situations
the system simply gets trapped in a very small region of the phase
space.  In different samples we have found analogous problems also
when using $10^7$ MCS (see figures \ref{F-FRAZ}.c and
\ref{F-FRAZ}.d).  With $10^6$ MCS the parallel tempering method is
able to thermalize samples up to $L=8$ for temperatures down to
$T=0.3$ (for example at the lowest $T$ value the Binder parameter
thermalizes in $10^6$ MCS): the bivariate multi-canonical method does
not seem to be very efficient for spin glasses.
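
A possible way to write the quantity plotted in figure \ref{F-FRAZ}
(a sketch: the symbols $f_n$, $\Omega$ and $\theta$ are introduced
here for illustration, and the normalization over the allowed pairs
is an assumption) is
\begin{equation}
  f_n = \frac{1}{|\Omega|}\sum_{(e,q)\in\Omega}
        \theta\left(h_n(e,q)\right)\ ,
\end{equation}
where $n$ labels the multi-canonical cycle, $\Omega$ is the set of
allowed $(e,q)$ pairs, $h_n$ is the histogram collected during cycle
$n$, and $\theta(x)=1$ for $x>0$ and $0$ otherwise.  A thermalized
simulation corresponds to $f_n$ staying on its plateau for all
subsequent cycles.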

\begin{figure}
\centering\includegraphics[width=0.6\textwidth,angle=0]{plot1.eps}
\caption[a]{The evolution of the histogram $h(e,q)$ as a function of
multi-canonical cycles (sample \#2 in figure \ref{F-FRAZ}).}
\label{F-ISTO}
\end{figure}

In figure \ref{F-ISTO} we show the histogram evolution for sample \#2
(the same used in figure \ref{F-FRAZ}.c).  The four snapshots
correspond to the black dots in figure \ref{F-FRAZ}.c and clearly
show that the system, after reaching an apparently thermalized state
with a flat and broad $h(e,q)$, does not keep it for all subsequent
times but gets trapped in very small regions of the $(e,q)$ space
(the third and fourth snapshots in figure \ref{F-ISTO}).

How can we explain this behavior?  During the first multi-canonical
cycles the dynamics of the system in the $(e,q)$ space is diffusive
in character, while when approaching the boundaries of the $(e,q)$
plane (especially the energy ones) the system often gets trapped for
very long times.  The end of the diffusive behavior near the ground
states can be easily explained in terms of accessibility, that is the
probability of decreasing the energy when the system is in an $(e,q)$
configuration and makes a random move to a neighboring configuration.
For not too low energies the accessibility is high: in this case a
random walk in the configuration space corresponds to a random walk
in the $(e,q)$ space, which is a projection of the former.  On the
contrary, for energies close to that of the ground states the
accessibility is very low, due to the presence of a large number of
higher local minima.  For example, if the system is at the bottom of
a valley in the space of microscopic configurations, it may need a
long time to further decrease its energy (a small step in the
macroscopic $(e,q)$ space): the time needed to find a deeper valley.
The dynamics turns out to be strongly constrained for energies close
to the boundaries.

Keeping in mind that the dynamics becomes slower and slower close to
the energy boundaries, one can easily explain the peaks in figure
\ref{F-ISTO}.  The system first relaxes in a uniform way over a large
part of the $(e,q)$ space, the more accessible one.  Many allowed
$(e,q)$ values are still unvisited (because of the low
accessibility): their DoS estimate becomes very small and their
corresponding weights, $W(e,q) = 1 / D(e,q)$, huge.  When the system
reaches one of these configurations it cannot leave it until the end
of the multi-canonical cycle, when $W(e,q)$ is updated again.
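
The trapping mechanism can be made explicit under the assumption (not
stated in the text, and used here only as an illustration) that moves
are accepted with a Metropolis rule based on the multi-canonical
weights.  A move from $(e,q)$ to $(e',q')$ is then accepted with
probability
\begin{equation}
  p = \min\left\{1,\frac{W(e',q')}{W(e,q)}\right\}
    = \min\left\{1,\frac{D(e,q)}{D(e',q')}\right\}\ ,
\end{equation}
so that a state whose estimated $D(e,q)$ is much smaller than that of
all its neighbors is left with a probability $p \ll 1$ per attempted
move.  With these weights the expected histogram is proportional to
$D_{\mbox{ex}}(e,q)/D(e,q)$, where $D_{\mbox{ex}}$ denotes the exact
DoS (a symbol introduced here for illustration): it is flat only when
the estimate has converged, and it develops a sharp peak wherever
$D(e,q)$ is strongly underestimated.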

In order to improve the convergence we have also tried to start with
a DoS estimated from that of a thermalized $L=4$ sample.  The
convergence seems to be faster; however, the problems giving rise to
the peak structure in the histogram remain unaltered.

Given that the thermalization task appears to be very hard, one
should at least try to use all the thermalization checks available.
For example, the one based on the symmetry of the overlap
distribution of every sample, $P_J(q)$, should always be carefully
checked: this analysis is lacking in \cite{HATGUB}.

\begin{figure}
\centering\includegraphics[width=0.6\textwidth,angle=0]{plot4.eps}
\caption[a]{For a given $L=6$ sample (sample \#3 in figure
\ref{F-FRAZ}) the $P(q)$ measured with parallel tempering (top left)
is symmetric, while it may become much narrower when a
multi-canonical method is employed.}
\label{F-PQ}
\end{figure}

In figure \ref{F-PQ} we show the overlap distribution $P_J(q)$ for
the single $L=6$ sample considered in figure \ref{F-FRAZ}.d at a low
temperature $T=0.3$ (these data come from a further parallel
tempering simulation, pushed to lower $T$ values).  In figure
\ref{F-PQ}.a we show the $P(q)$ measured with a parallel tempering
simulation.  Its very accurate symmetry is strong evidence of
complete thermalization.  In the next $3$ plots (b, c and d) we show
with continuous lines the $P(q)$ measured with the multi-canonical
method (the chosen times correspond to the dots in figure
\ref{F-FRAZ}.d).  We always superimpose the thermalized $P(q)$ for
comparison.  It is clear that, even in the best case (see figure
\ref{F-PQ}.b), the multi-canonical method is not able to give results
as good as those of parallel tempering: in the worst cases it just
gives a completely wrong $P_J(q)$, with a single or a double peak.
The system may very easily get stuck somewhere, and in these cases
the estimated $P(q)$ looks much narrower than the correct one (see
figure \ref{F-PQ}.c and figure \ref{F-PQ}.d): measurements taken in
such a biased situation provide spurious evidence in favor of a
single peak $P(q)$, and consequently of the droplet scenario.

As a last piece of evidence we consider the samples where the
bivariate multi-canonical method has been well behaved: the scaling
of the visited fraction of the $(e,q)$ phase space (for well
thermalized samples) reported in figure \ref{F-SCALING} supports the
picture of a diffusion-like evolution of the histogram.  The area of
the support of the histogram grows more or less linearly with the
number of multi-canonical cycles (the best exponent estimate is
$0.9$).  Moreover, the time for reaching the plateau (equilibration
time) grows as $\tau \propto L^{3.37} \propto N^{1.12}$, which seems
to be very close to the theoretical lower bound ($\tau \propto N$).
However this result would hold {\em only if} the number of MCS per
multi-canonical cycle necessary for a proper thermalization were
independent of the system size $N$.  As we have already seen this is
not true.  Indeed, using the same $10^7$ MCS per multi-canonical
cycle, the fraction of well thermalized samples we have obtained is
100\% for $L=4$, around 40\% for $L=6$ and 0\% for $L=8$.  Because
the required number of MCS per multi-canonical cycle grows with $N$
(apparently very fast), our conclusion is that $\tau$ grows much
faster than $N$ (simple arguments by Berg \cite{BERG} suggest at
least as $N^2$).
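
The conversion between the two exponents uses only $N=L^3$:
\begin{equation}
  \tau \propto L^{3.37} = \left(N^{1/3}\right)^{3.37}
       = N^{3.37/3} \simeq N^{1.12}\ ,
\end{equation}
while the lower bound $\tau\propto N$ corresponds to $\tau \propto
L^{3}$, and a growth $\tau \propto N^2$ corresponds to $\tau\propto
L^{6}$.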

In conclusion, we have seen how difficult it is to bring a bivariate
multi-canonical simulation of spin glasses to equilibrium and,
consequently, one possible reason for the failure of \cite{HATGUB} to
thermalize for $L=8$ (we have checked the failure of thermalization
with independent parallel tempering simulations).  When we say that
the simulation is not thermalized we mean that we cannot use the
resulting DoS\footnote{Note that the DoS estimate actually used in
the measurements in \cite{HATGUB} is $D(e,q) h(e,q)$, and so it is
strongly affected by non-uniformities in the histogram.} to estimate
the averages of the observables at all temperatures.  In particular,
as long as the simulation does not visit the ground states many
times, we cannot be confident that we have enough information on the
ground-state structure.  However it may well be that, after a certain
number of multi-canonical cycles, the estimated DoS gives good
averages at higher temperatures, which do not change if new low
energy states are reached.  We believe this is the case in
\cite{HATGUB}, where data at not too low temperatures are perfectly
compatible with those obtained in previous work and fit the RSB
scenario.

\begin{figure}
\centering\includegraphics[width=0.6\textwidth,angle=0]{plot3.eps}
\caption[a]{The scaling of the visited fraction of the $(e,q)$ phase
space (for well thermalized samples) shows that the equilibration time
must grow with the system size faster than $\tau \propto N^{1.1}$.}
\label{F-SCALING}
\end{figure}

We thank N. Hatano for useful correspondence regarding the bivariate
method.

\begin{references}

\bibitem{HATGUB}
  N. Hatano and J.E. Gubernatis,
  preprint {\tt cond-mat/0008115}.

\bibitem{RECENT}
  H.G. Katzgraber, M. Palassini and A.P. Young,
  preprint {\tt cond-mat/0007113}.

\bibitem{PARISI}
  G. Parisi,
  Phys. Rev. Lett. {\bf 43}, 1754 (1979);
  J. Phys. A {\bf 13}, 1101, 1887, L115 (1980);
  Phys. Rev. Lett. {\bf 50}, 1946 (1983);
  M. M\'ezard, G. Parisi and M.A. Virasoro,
  {\em Spin Glass Theory and Beyond}
  (World Scientific, Singapore 1987).

\bibitem{PT}
  K. Hukushima and K. Nemoto,
  J. Phys. Soc. Japan {\bf 65}, 1604 (1996);
  M.C. Tesi, E.J. Janse van Rensburg, E. Orlandini and S.G.~Whittington,
  J. Stat. Phys. {\bf 82}, 155 (1996).

\bibitem{OPTIMIZED}
  E. Marinari, {\em Optimized Monte Carlo Methods},
  in {\em Advances in Computer Simulation},
  edited by J. Kertesz and I. Kondor, Springer-Verlag (1997).

\bibitem{BERG_NEU}
  B.A. Berg and T. Neuhaus, Phys. Rev. Lett. {\bf 68}, 9 (1992).

\bibitem{HATANO}
  N. Hatano, private communication.

\bibitem{BACKGAMMON}
  F. Ritort, Phys. Rev. Lett. {\bf 75}, 1190 (1995).

\bibitem{BERG}
  B.A. Berg, private communication.

\end{references}
\end{document}