0304:cond-mat0304273/u3.tex

1: \documentclass[pre,aps]{revtex4}

2: \usepackage{psfig}

3: \newcommand{\be}{\begin{equation}}

4: \newcommand{\ee}{\end{equation}}

5: \newcommand{\e}{\emph}

6: \newcommand{\bb}{\textbf}

7: % \textwidth = 490pt

8: % \textheight = 680pt

9: % \hoffset = -10pt

10: % \voffset = 10pt

11:

12: %\linespread{1.1}

13:

14: \begin{document}

15:

16: \title{A Quantitative Clustering Approach to

17: Ultrametricity in Spin Glasses}

18:

19: \author{Stefano Ciliberti and Enzo Marinari}

20: \affiliation

21: {Dipartimento di Fisica,

22: SMC and UdR1 of INFM, INFN,

23: Universit\`a di Roma {\em La Sapienza},

24: P.le A. Moro 2, 00185 Roma, Italy}

25:

26: \begin{abstract}

27: We discuss the problem of ultrametricity in mean field spin glasses by

28: means of a hierarchical clustering algorithm.  We complement the

29: clustering approach with quantitative testing: we discuss both in some

30: detail.  We show that the elimination of the (in this context

31: accidental) spin flip symmetry plays a crucial role in the analysis,

32: since the symmetry hides the real nature of the data.  We are able to

33: use disorder averaged quantities in the analysis.  We are also able to

34: exhibit a number of features of the low $T$ phase of the mean field

35: theory, and to claim that the full hierarchical structure can be

36: observed without ambiguities only on very large lattice volumes, not

37: currently accessible by numerical simulations.

38: \end{abstract}

39:

40: \date{2003, April 10th}

41:

42: \maketitle

43:

44: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

45: \section*{Happy Birthday}

46:

47: This paper is in honor of Giovanni Jona-Lasinio's birthday. We are grateful

48: to him since he has taught us, as he has so many other people in Rome

49: and in other places, a lot of physics and much about the way to love

50: good physics. Thanks, and Happy Birthday!

51:

52: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

53: \section{Introduction\label{S-INTROD}}

54:

55: The use of clustering methods to qualify the low temperature phase of

56: spin glass systems has been recently advocated in a group of very

57: interesting papers \cite{domany}. It is indeed well known that the

58: broken phase of mean field spin glasses has a high level of

59: complexity, which translates statically into Parisi's spontaneous Replica

60: Symmetry Breaking (RSB) and dynamically into a series of dramatic

61: phenomena that range from a severe critical slowing down for all $T<T_c$

62: to memory effects, aging phenomena and violations of the

63: fluctuation-dissipation theorem \cite{books}.

64:

65: Ultrametricity of states \cite{ultra} is one of the key features of

66: the mean field Parisi picture: states of the system turn out to be

67: endowed with an ultrametric distance, and the phase space is organized

68: hierarchically. Do finite dimensional spin glass systems share these

69: properties, and can we find a way to check that? This is an important

70: issue of the persistent debate \cite{review} about the physics of the

71: low temperature phase of finite dimensional spin glasses.

72:

73: Detecting ultrametricity on finite volume systems turns out to be very

74: difficult \cite{camapa,fraric}: the introduction of constrained Monte

75: Carlo methods \cite{camapa} and the analysis of the dynamical behavior

76: of the system \cite{fraric} help only marginally. Finite size effects

77: are very strong, and make the asymptotic potential emergence of a

78: hierarchical structure difficult to observe.

79:

80: Here we introduce some new analysis techniques and we study the

81: Sherrington-Kirkpatrick (SK) mean field model, where we know that for

82: low $T$ a non-trivial ultrametric structure emerges in the infinite

83: volume limit. We will find out that this is a difficult task, sharing

84: all the problems one observes in finite dimensional systems

85: \cite{domany,camapa}. Our main points can be summarized in four basic

86: issues:

87:

88: \begin{enumerate}

89:

90: \item We find that, to be most useful,

91: the approach based on hierarchical clustering has to be

92: complemented by the use of testing techniques that have been developed

93: in the field of numerical taxonomy \cite{jaidub}. We discuss some of

94: these techniques and we show how they can be applied to our problem.

95:

96: \item We discuss the role of the $Z_2$ symmetry of the phase space. We

97: find that removing this symmetry (that in this context is accidental)

98: is crucial to get sensible results from quantitative tests. We

99: introduce and discuss the way to remove the symmetry from equilibrium

100: configurations obtained in zero magnetic field.

101:

102: \item Thanks to these techniques we are able to clarify how a finite

103: volume SK system behaves as far as ultrametricity is concerned, by

104: working out strengths and limitations of the method. We find that on

105: the (medium-large) lattice sizes that we are able to analyze one can

106: establish that a structure is emerging, but that one cannot get

107: compelling evidence that this structure is ultrametric. This is

108: exactly the same kind of phenomenon one observes when studying finite

109: dimensional systems \cite{domany}.

110:

111: \item We analyze systematically finite size effects (by studying

112: systems on different lattice sizes) and the dependence of our results

113: on $T$. Thanks to the quantitative analysis techniques that we

114: introduce we are able to use hierarchical clustering techniques to

115: discuss also quantities that are {\em averaged over the disorder},

116: opening in this way a large information window.

117:

118: \end{enumerate}

119:

120: The low temperature mean field behavior of spin glass systems is

121: understood in the framework of the Parisi RSB scheme \cite{books}.

122: The prototype of mean field spin glass models is the SK fully

123: connected Ising model where coupling constants are \emph{quenched}

124: random variables:

125: \begin{equation}

126: {\mathcal{H}}_J[\sigma]=-\sum_{i,k=1}^N \sigma_i J_{i,k}\sigma_k \ ,

127: \label{E-H}

128: \end{equation}

129: where $\sigma_i=\pm 1$ are spin variables and the $J_{i,k}$ are

130: distributed according to an even distribution function.  For example

131: we can use a Gaussian distribution with $\overline{J_{ik}}=0$ (since

132: we want to avoid ferromagnetic effects) and

133: $\overline{J^2_{ik}}=\frac1N$ (to ensure that the energy is

134: extensive). As we have already recalled, the Parisi RSB solution of the

135: SK model, which is believed to be the correct solution of mean field

136: theory at low $T$, exhibits an ultrametric organization of the states

137: \cite{ultra}. This means that in the infinite volume limit for any

138: triple of equilibrium spin configurations $\alpha,\beta,\gamma$ we

139: have that:

140: \begin{displaymath}

141: q_{\alpha\beta}\geq \min

142: \{q_{\alpha\gamma}, q_{\beta\gamma}\} \ ,

143: \end{displaymath}

144: where $q_{\alpha\beta}$ is the overlap between

145: configurations $\alpha$ and $\beta$, defined as

146: \begin{equation}

147: \label{E-OVERLAP}

148: q_{\alpha\beta}\equiv\frac1{N}\sum_{i=1}^{N}

149: \sigma_i^\alpha \sigma_i^\beta

150: \end{equation}

151: (here configurations $\alpha$ and $\beta$ are independent

152: configurations at equilibrium under the same Hamiltonian: they are

153: coupled only by the fact of sharing the same quenched realization

154: of the random couplings). The overlap $q_{\alpha\beta}$ is a

155: similarity index, and

156: the distance is connected to one minus the overlap.
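
The overlap (\ref{E-OVERLAP}) and the ultrametric inequality can be made concrete with a minimal Python sketch. It is an illustration only, not the code used for this work: the function names \texttt{overlap} and \texttt{is\_ultrametric\_triple}, the tolerance \texttt{tol} and the three toy configurations are assumptions introduced here.

```python
def overlap(a, b):
    """Overlap q = (1/N) * sum_i a_i * b_i of two +/-1 spin configurations."""
    if len(a) != len(b):
        raise ValueError("configurations must have the same number of spins")
    return sum(x * y for x, y in zip(a, b)) / len(a)


def is_ultrametric_triple(q_ab, q_ac, q_bc, tol=1e-12):
    """True if each overlap is at least the minimum of the other two.

    `tol` is a hypothetical numerical tolerance, not a quantity from the text.
    """
    return (q_ab >= min(q_ac, q_bc) - tol
            and q_ac >= min(q_ab, q_bc) - tol
            and q_bc >= min(q_ab, q_ac) - tol)


# Toy configurations of N = 4 spins (illustrative values, not simulation data).
s1 = [1, 1, 1, 1]
s2 = [1, 1, 1, -1]
s3 = [1, 1, -1, 1]
q12, q13, q23 = overlap(s1, s2), overlap(s1, s3), overlap(s2, s3)
```

For this toy triple the two largest overlaps coincide but the smallest one is associated with the wrong pair, so the inequality fails for $q_{23}$: on small systems such violations are the rule rather than the exception.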

157:

158: We will analyze in detail the fact that revealing numerically an

159: emerging ultrametric structure on finite systems is difficult. The

160: question is even more relevant since detecting reliable

161: signs of an ultrametric structure could be crucial in finite

162: dimensional systems, where the behavior of the system in the low $T$

163: phase is not yet understood \cite{review}.

164:

165: Clustering \cite{jaidub} is a powerful technique for analyzing data

166: (for interesting applications of statistical mechanical ideas to

167: clustering see \cite{rogufo,blwido,stibia}).

168: Since producing a valid hierarchical clustering is equivalent to showing

169: the existence of a true ultrametric structure of the data, this kind of

170: approach can give crucial evidence. We will discuss here what happens

171: in the infinite range mean field SK model, where we know that

172: eventually, in the infinite volume limit, ultrametricity of states

173: will emerge. We believe this is needed to help in interpreting the

174: results obtained in the analysis of finite dimensional models

175: \cite{domany}. We will see that some important hints do indeed

176: emerge.

177:

178: In this note we introduce some new ideas relevant for hierarchical

179: clustering as applied to the analysis of disordered and complex systems,

180: and we discuss numerical results obtained from a clustering analysis

181: of equilibrium spin glass configurations, with a particular emphasis

182: on the study of the ultrametric nature of these states.  We explain

183: why a detailed analysis requires an appropriate elimination of the

184: spin flip symmetry and we investigate the dependence of our results on

185: the number of degrees of freedom of the system, showing that finite

186: size effects are actually very large.

187:

188: The paper is organized as follows. In section \ref{S-CLUSTER} we

189: introduce the clustering procedure and we explain the motivations for

190: our precise choice of a given clustering algorithm.  In section

191: \ref{S-ANALYSIS} we apply this technique to the SK model; we discuss

192: our findings about ultrametricity, also by comparing them with those

193: that one obtains by using standard techniques.  Here we will introduce

194: and use quantitative ways to state the significance of the results

195: obtained by clustering (mainly in section \ref{SS-QT}).  As we said

196: before, a more detailed analysis requires the prior elimination of the

197: $Z_2$ symmetry, and this is done in section \ref{SS-REVERSE}: in

198: section \ref{SS-OTHER} we will also say a few words about using

199: different clustering schemes. Section \ref{S-SPINS} is dedicated to

200: the clustering of the spins.  We report our conclusions in the last

201: section.

202:

203: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

204: \section{The Clustering Algorithm\label{S-CLUSTER}}

205:

206: Clustering is a special kind of (potentially very powerful)

207: classification tool. We will give here only the basic

208: information we need for our analysis, and we advise the reader to look

209: at \cite{jaidub}  for further details.

210:

211: Let us consider a sample made of $M$ data $x^\mu$, where each data

212: point $x^\mu\equiv\{ x_1^\mu,\ldots x_N^\mu\}$ is a vector in a

213: $N$-dimensional space.  We want to study the underlying organization

214: of the data, i.e. we want to find out whether the data are organized

215: according to some non-trivial structure.  A problem of this type is

216: closely related to pattern recognition analysis and to Bayes decision

217: theory \cite{dudhar}: it is of very general interest, since it emerges

218: in many relevant contexts.

219:

220: The main ingredient for the analysis is the {\em proximity matrix}

221: $d_{\mu\nu}\equiv d(x^\mu,x^\nu)$.  $d(x^\mu,x^\nu)$ is some measure

222: of the dissimilarity of data $\mu$ and $\nu$. It is such that

223: $d_{\mu\mu}=0$ and $d_{\mu\nu}=d_{\nu\mu}\geq 0$.  $d$ does not need

224: to be a distance (for example the triangle inequality might not

225: be satisfied) but usually it is one.
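
As a minimal sketch of these requirements, the following Python fragment builds a proximity matrix from a list of data vectors and checks the stated properties, together with the optional triangle inequality. All names are hypothetical, and the dissimilarity \texttt{hamming\_fraction} (the fraction of differing components, which for $\pm 1$ vectors equals one half of one minus the overlap) is only an illustrative choice.

```python
def proximity_matrix(data, dissimilarity):
    """Return the M x M matrix d[mu][nu] = dissimilarity(data[mu], data[nu])."""
    m = len(data)
    d = [[0.0] * m for _ in range(m)]
    for mu in range(m):
        for nu in range(mu + 1, m):
            d[mu][nu] = d[nu][mu] = dissimilarity(data[mu], data[nu])
    return d


def is_valid_proximity(d, tol=1e-12):
    """Check d[mu][mu] = 0 and d[mu][nu] = d[nu][mu] >= 0."""
    m = len(d)
    zero_diagonal = all(abs(d[i][i]) <= tol for i in range(m))
    symmetric_nonnegative = all(
        d[i][j] >= -tol and abs(d[i][j] - d[j][i]) <= tol
        for i in range(m) for j in range(m))
    return zero_diagonal and symmetric_nonnegative


def satisfies_triangle_inequality(d, tol=1e-12):
    """Check d[i][j] <= d[i][k] + d[k][j] for all triples (not required of d)."""
    m = len(d)
    return all(d[i][j] <= d[i][k] + d[k][j] + tol
               for i in range(m) for j in range(m) for k in range(m))


def hamming_fraction(a, b):
    """Fraction of components in which two vectors differ (illustrative choice)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


data = [[1, 1, 1, 1], [1, 1, 1, -1], [1, 1, -1, 1]]
d = proximity_matrix(data, hamming_fraction)
# A valid proximity matrix that is not a distance: 5 > 1 + 1.
not_a_distance = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]
```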

226:

227: By clustering we group the data in sets that can be related among them

228: in different ways. Here we will use the exclusive (each data point belongs

229: to exactly one cluster), intrinsic (i.e. based only on the proximity

230: matrix $d$) classification known as {\em hierarchical clustering}.

231: Hierarchical clustering is a nested sequence of partitions obtained

232: through a classification technique based on one of many possible

233: algorithms.  The output of the algorithm can be represented by a

234: hierarchical tree (a so-called {\em dendrogram}).

235:

236: A generic (even random) set of data can always be arranged to fit a

237: tree-like structure: this is indeed what clustering does. After doing

238: such (potentially arbitrary) clustering we are left with the relevant

239: question of deciding if the hierarchical structure that has been

240: reconstructed was somehow intrinsic to the data set: this requires an

241: analysis \emph{a posteriori}.

242:

243: So, in hierarchical clustering we start from a set of data, we group

244: them by some algorithm (that we will specify in the following)

245: building in this way a hierarchical tree. Comparison of this tree and

246: the original data can lead to quantitative conclusions about the

247: presence of a true hierarchical structure in the data.

248:

249: In the course of a cluster analysis one usually faces two main

250: problems.

251:

252: \begin{itemize}

253:

254: \item The first important step is the definition of the dissimilarity

255: index $d_{\mu\nu}$ which is not always naturally induced from the

256: context (data do not necessarily belong to an Euclidean space).

257:

258: In our case this is an easy problem. Starting from the usual notion of

259: overlap (\ref{E-OVERLAP}) the distance between two spin configurations

260: can be for example naturally and easily defined as

261: \begin{displaymath}

262: d_{\mu\nu}\equiv\frac{1-q_{\mu\nu}}{2}\ .

263: \end{displaymath}

264:

265: \item The second problem is how to update distances among

266: elements. When we fuse elements $\alpha$ and $\beta$ in element

267: $\gamma$ (so joining two smaller clusters in a larger one) we have to

268: define all distances from the new cluster $\gamma$ to all other

269: clusters of the system $\eta$.  This step is crucial since it can play

270: a dramatic role in the structure of the iteration, even if in

271: situations where hierarchical clustering turns out to be {\em natural},

272: i.e. an intrinsic property of the data set, results have to be

273: independent of this issue (there exist alternative approaches which

274: allow one to avoid such an explicit choice by means of a priori

275: hypotheses \cite{blwido,giamar}).

276:

277: Most of our results have been obtained with the \emph{Ward

278: method} (or \emph{minimum variance method}) \cite{ward,jaidub}.  The

279: method is based on minimizing the square error, and is empirically

280: known to outperform other hierarchical clustering methods.

281:

282: When we merge the two clusters that have the smallest distance we

283: define the new distance using the following rule:

284: if $\rho$ and

285: $\sigma$ merge to form $\rho'$,

286: and $n_\alpha$ is the number of elements in the cluster $\alpha$,

287: then for any other cluster $\tau$:

288: \begin{equation}

289: d_{\tau\rho'}=\frac{

290: (n_\tau+n_\rho)d_{\tau\rho}+

291: (n_\tau+n_\sigma)d_{\tau\sigma}-

292: n_\tau d_{\rho\sigma}

293: }

294: {n_\tau+n_\rho+n_\sigma}\ .

295: \label{E-WARD}

296: \end{equation}

297: Let ${\mathcal{C}_\alpha}$ stand for one of the clusters of the system

298: and consider the quantity

299: \begin{displaymath}

300: S=\sum_{\mathcal{C}_\alpha}\tau(\alpha)\ ,

301: \end{displaymath}

302: where the sum is over all the clusters defined in the system

303: and where

304: \begin{equation}

305: \tau(\alpha)=\sum_{\;\mu,\nu\in

306: \mathcal{C}_\alpha} d_{\mu\nu}^2\ .

307: \label{E-TAU}

308: \end{equation}

309: The choice of the Ward algorithm ensures that when merging two

310: clusters to form a new one $S$ increases by a minimal amount. In other

311: words this definition of distance is the one induced from the maximum

312: likelihood principle.

313: \end{itemize}

314: This defines the clustering scheme that we will follow. We will

315: discuss next how these ideas can be applied to mean field spin glass

316: models, and how the result can be understood and quantified by testing

317: the cluster validity.
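
The agglomeration loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used for the analysis: the update uses the standard Lance-Williams coefficients of the Ward method (weights $n_\tau+n_\rho$, $n_\tau+n_\sigma$ and $-n_\tau$, normalized by $n_\tau+n_\rho+n_\sigma$), the names \texttt{ward\_update} and \texttt{ward\_clustering} are hypothetical, and the toy matrix holds the squared separations of three points placed at $0$, $1$ and $3$ on a line.

```python
def ward_update(d_tr, d_ts, d_rs, n_t, n_r, n_s):
    """Distance from cluster tau to the cluster formed by merging rho and sigma
    (standard Lance-Williams coefficients of the Ward method)."""
    return ((n_t + n_r) * d_tr + (n_t + n_s) * d_ts - n_t * d_rs) / (n_t + n_r + n_s)


def ward_clustering(d):
    """Agglomerate the items of a symmetric dissimilarity matrix d.

    Returns the merges in order, each as (members_a, members_b, distance).
    """
    m = len(d)
    clusters = {i: (i,) for i in range(m)}  # cluster id -> original items
    dist = {(i, j): d[i][j] for i in range(m) for j in range(i + 1, m)}
    next_id = m
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        (r, s), d_rs = min(dist.items(), key=lambda kv: kv[1])
        new = {}
        for t in clusters:
            if t in (r, s):
                continue
            d_tr = dist[(min(t, r), max(t, r))]
            d_ts = dist[(min(t, s), max(t, s))]
            new[(t, next_id)] = ward_update(
                d_tr, d_ts, d_rs,
                len(clusters[t]), len(clusters[r]), len(clusters[s]))
        merges.append((clusters[r], clusters[s], d_rs))
        # Drop every distance involving the two merged clusters.
        dist = {k: v for k, v in dist.items() if r not in k and s not in k}
        dist.update(new)
        clusters[next_id] = clusters.pop(r) + clusters.pop(s)
        next_id += 1
    return merges


# Squared separations of three points at 0, 1 and 3 on a line (toy data).
toy = [[0, 1, 9], [1, 0, 4], [9, 4, 0]]
merges = ward_clustering(toy)
```

The quadratic search for the closest pair keeps the sketch short; it is adequate for the order of $10^3$ items but is not an optimized implementation.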

318:

319: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

320: \section{Cluster Analysis of the SK Mean Field Spin Glass\label{S-ANALYSIS}}

321:

322: As we have said we have decided to analyze numerically the mean field

323: SK model.  Since here the infinite volume scenario is under full

324: control we believe this is a crucial step in understanding what one can

325: learn from numerical simulations on finite lattices, and in assessing the

326: implications of the results obtained on finite dimensional models

327: \cite{domany} where, on the contrary, the theoretical scenario is far

328: from clear.

329:

330: We started by generating, with an optimized Monte Carlo method, a

331: large number of uncorrelated spin configurations on lattices of

332: different sizes and for a number of different realizations of the

333: quenched disorder (on which we eventually average), under the

334: Hamiltonian (\ref{E-H}), with quenched random couplings assigned under

335: a Gaussian distribution. We analyze systems with $N$, number of spins,

336: equal to $128$, $256$ and $512$ ($N=512$ is typical of a medium size

337: numerical simulation, corresponding for example to a linear size of $8$

338: in three dimensions). We thermalize our systems at a set of different

339: values of the temperature, typically starting from $0.1\ T_c$ (a very low

340: value, that we can reach only thanks to the power of parallel tempering

341: \cite{PT,PTREV}), and in all cases we analyze $20$ different

342: realizations of the quenched couplings. For all lattice sizes,

343: relevant temperatures and disorder realizations we first thermalize

344: the system.  After doing that we record one spin configuration after

345: every new set of $1000$ combined full Monte Carlo sweeps and parallel

346: tempering updates of the system.  The large ``computer time''

347: separation among different configurations guarantees a very high level

348: of statistical independence. Residual possible (very small)

349: correlations would not spoil our analysis but would only make it a bit

350: less effective. We have recorded $1024$ such independent spin

351: configurations for each value of the parameters: such configurations

352: are the basic set of objects that we have clustered.
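
The ingredients of such a simulation, Gaussian couplings of variance $1/N$ and the energy (\ref{E-H}), can be sketched as follows. This is a hedged illustration and not the production code: the names \texttt{sk\_couplings} and \texttt{sk\_energy} are hypothetical, the couplings are assumed symmetric with zero diagonal, and each pair is counted once in the energy, which is a convention chosen here.

```python
import math
import random


def sk_couplings(n, rng):
    """Symmetric Gaussian couplings, zero mean, variance 1/n, zero diagonal."""
    j = [[0.0] * n for _ in range(n)]
    width = 1.0 / math.sqrt(n)
    for i in range(n):
        for k in range(i + 1, n):
            j[i][k] = j[k][i] = rng.gauss(0.0, width)
    return j


def sk_energy(spins, j):
    """Energy -sum_{i<k} s_i J_ik s_k (each pair counted once: an assumption)."""
    n = len(spins)
    return -sum(spins[i] * j[i][k] * spins[k]
                for i in range(n) for k in range(i + 1, n))


rng = random.Random(0)  # fixed seed so that the example is reproducible
n = 16
couplings = sk_couplings(n, rng)
spins = [rng.choice((-1, 1)) for _ in range(n)]
flipped = [-s for s in spins]  # global spin flip
```

The configurations \texttt{spins} and \texttt{flipped} have the same energy: this is the $Z_2$ symmetry that plays such an important role in the analysis.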

353:

354: Parallel tempering \cite{PT,PTREV} has been crucial in allowing us to

355: bring spin configurations to thermal equilibrium at such low

356: temperature values on acceptable lattice volumes.  The method is based

357: on simulating in parallel copies of the system at different

358: temperature $T$ values, allowing the different copies to swap $T$

359: among them (with a standard Metropolis weight).  This reduces the free

360: energy barriers, always keeping the different copies at Boltzmann

361: equilibrium: tempering can be seen as an annealing where the basic

362: quantity is not energy but free energy.
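
The temperature swap can be illustrated by a minimal Python sketch, assuming the usual Metropolis exchange rule $\min\{1,\exp[(\beta_a-\beta_b)(E_a-E_b)]\}$ between two copies with inverse temperatures $\beta_a,\beta_b$ and energies $E_a,E_b$. The names \texttt{swap\_probability} and \texttt{attempt\_swaps} and the toy values are hypothetical; the details of the actual simulation may differ.

```python
import math
import random


def swap_probability(beta_a, beta_b, energy_a, energy_b):
    """Metropolis probability of exchanging the configurations held at two
    inverse temperatures."""
    delta = (beta_a - beta_b) * (energy_a - energy_b)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swaps(betas, energies, configs, rng):
    """Try to exchange configurations between neighbouring temperature slots.

    betas[i] is the fixed inverse temperature of slot i; energies[i] and
    configs[i] describe the configuration currently sitting in that slot.
    Returns the number of accepted exchanges.
    """
    accepted = 0
    for i in range(len(betas) - 1):
        p = swap_probability(betas[i], betas[i + 1], energies[i], energies[i + 1])
        if rng.random() < p:
            energies[i], energies[i + 1] = energies[i + 1], energies[i]
            configs[i], configs[i + 1] = configs[i + 1], configs[i]
            accepted += 1
    return accepted


# Toy example: the colder slot holds the higher energy, so the swap is certain.
betas = [2.0, 1.0]
energies = [-1.0, -3.0]
configs = ["a", "b"]
accepted = attempt_swaps(betas, energies, configs, random.Random(0))
```

Monitoring the fraction of accepted exchanges for each pair of neighbouring temperatures is one of the standard checks of the method.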

363:

364: We have used all standard criteria to check that, when using the

365: Parallel Tempering optimized Monte Carlo scheme, we have really

366: reached thermal equilibrium \cite{PTREV}: we have checked that our

367: sample dependent overlap probability distributions $P_J(q)$ are indeed

368: well symmetric under $q\longrightarrow -q$, we have checked that all

369: copies of the system have visited a number of times all available

370: temperature values, we have checked that the acceptance rate of the

371: temperature swap has been of order $0.5$.

372:

373: In the rest of this note we will work on {\em clustering} these

374: configurations and on using quantitative testing to extract the

375: implications of the hierarchical structure that we obtain.

376:

377: We first introduce a standard graphical way to get a qualitative

378: feeling about the set of data. We consider the proximity matrix $\cal

379: P$, where we have the set of data (in some order to be specified) on

380: the $x$ and on the $y$ axis, and where we plot with darker colors

381: points with higher overlap: the diagonal constitutes by definition the

382: darkest set of the matrix. In figure \ref{F-MATRIX-A} we start by

383: showing, on the left, the matrix $\cal P$ for a given disorder

384: realization at $N=512$ and $T=0.1\ T_c$ (a very low value of $T$, the

385: lowest we have analyzed: here the system is basically in its ground

386: state) where configurations have been ordered at random.  A clearly

387: random pattern emerges.

388:

389: \begin{figure}

390: \centerline{\psfig{figure=F/fig1.ps,width=0.7\textwidth,angle=90}}

391: \caption{An example of the clustering procedure as applied to a very

392: low temperature set of configurations.  In the left part of the figure

393: we show a proximity matrix $\cal P$ built over $M=512$ configurations

394: of $N=512$ spins at $T=0.1\ T_c$, ordered at random.

395: Darker colors correspond to smaller distances.  On the right

396: part of the figure we draw the dendrogram that results from our

397: clustering, and the resulting $\cal P$.

398: The distance on the dendrogram is proportional to $\tau(\alpha)$

399: defined in equation (\protect\ref{E-TAU}).

400: The method recovers very well

401: the structure of two giant clusters related by the $Z_2$ symmetry.}

402: \label{F-MATRIX-A}

403: \end{figure}

404:

405: We apply the Ward algorithm to these configurations in order to obtain a

406: hierarchical tree (as we have discussed before) \footnote{For

407: clustering we have used the very flexible set of programs developed

408: by P. Kleiweg, available from {\tt http://

409: odur.let.rug.nl/$\tilde{\ }$kleiweg/clustering/clustering.html }}.  The

410: hierarchical tree that contains the information about the clustering,

411: the so-called {\em dendrogram} \footnote{In a dendrogram longer lines

412: correspond to more distant clusters. In most of our drawings, when we are not

413: interested in analyzing this specific information, we use an

414: appropriate power law deformation of the scale to make the graph more

415: readable and telling.}  is shown in the upper part of the right side

416: of figure \ref{F-MATRIX-A}.  In the lower part of the right hand side

417: of figure \ref{F-MATRIX-A} we show the matrix obtained by ordering the

418: configurations \emph{as from the dendrogram} on the $x$ and on the $y$

419: axis. Now the two reflected states appear very clearly (at such a low

420: $T$ value there are basically two $\delta$ functions at values $\pm

421: \overline{q}$, where $\overline{q}$ is close to one). We cannot

422: observe any further structure, since $T$ is too low (the ideal

423: temperature value for observing hints of ultrametric effects will turn

424: out to be, for our lattice sizes, of the order of $0.5\ T_c$).  As we

425: increase the temperature we observe that well defined structures

426: emerge (see figure \ref{F-3T}, where we show results for a single

427: sample, with $N=512$, at $T=0.3\ T_c$, $T=0.5\ T_c$ and $T=2.0\ T_c$):

428: when we reach the critical temperature $T_c$ and we go deeper into the

429: warm phase we obtain \emph{a homogeneous matrix}: here spins are

430: equally likely to be up or down, and as a consequence the overlap

431: between two configurations is zero on average.

432:

433: \begin{figure}

434: \centerline{\psfig{figure=F/fig2.ps,width=0.8\textwidth,angle=90}}

435: \caption{The dendrogram and the related $\cal P$ matrix obtained

436: from the clustering of $M=256$ configurations at three different

437: temperature values.  On the left $T=0.3\ T_c$ (where $T$ is very low

438: and no significant structure but the $Z_2$ degeneracy can be

439: observed), in the center $T=0.5\ T_c$ (that is the best $T$ region for

440: observing the non-trivial state structure), and on the right $T=2.0\

441: T_c$, where there is no structure since we are deep in the high $T$

442: phase.}

443: \label{F-3T}

444: \end{figure}

445:

446: We stress that the information about the $Z_2$ symmetry is a trivial,

447: well known one, that does not give us further insight: still, it is

448: interesting that the clustering algorithm is able to reconstruct it.

449: We will discuss at length the fact that, on the other hand, the

450: presence of the symmetry is deeply annoying in that it makes it more

451: difficult to get quantitative information about the structure in one

452: of the two $Z_2$ sectors, hiding many features of the data, and making

453: interesting predictions impossible.

454:

455: We also use figure \ref{F-3T} to make a further point. The dendrograms,

456: which make it possible to visualize the hierarchical structure built by

457: the clustering, do not give much unambiguous information about the

458: underlying structure. The picture from $T=0.3\ T_c$ is not so

459: different, but for some power rescaling of the lengths, from the one

460: at very high $T$ ($T=2.0\ T_c$) where we do not expect a non trivial

461: ultrametricity to appear. Clusters at high $T$ are, indeed, more

462: balanced, but one can only get some qualitative feelings about it.

463:

464: In the following we will work on trying to quantify the qualitative

465: statements about the possible presence of a (maybe hierarchical)

466: definite structure in the low $T$ phase.

467:

468:

469: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

470: \subsection{Quantitative Testing\label{SS-QT}}

471:

472: Before discussing our approach toward a quantitative analysis based

473: on hierarchical clustering techniques and aimed to check whether the

474: spin configurations (our original data set) are really organized

475: according to an ultrametric structure, we analyze the system by

476: applying a more standard statistical mechanical approach.  Following

477: \cite{domany,camapa} we analyze the probability distribution of the

478: variable

479: \begin{displaymath}

480: k_{\mu\nu\rho}\equiv \frac{d_{\mu\nu}-d_{\mu\rho}} {d_{\nu\rho}} \ ,

481: \end{displaymath}

482: where we have ordered the three distances to satisfy the condition

483: $C\equiv\{d_{\mu\nu}\geq d_{\mu\rho}\geq d_{\nu\rho}\}$.  Together with

484: the triangle inequality, this implies that $K\in [0,1]$. In an ultrametric space we would get that

485: $P(k=K|C)=\delta(K)$.
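
A minimal Python sketch of this measurement: for every triple of items the three mutual distances are sorted, and the difference of the two largest is divided by the smallest. The function name \texttt{k\_values} and the two toy matrices are assumptions made for illustration; triples with a vanishing smallest distance are skipped, which is a choice made here.

```python
from itertools import combinations


def k_values(d):
    """k = (d_max - d_mid) / d_min for every triple of distinct items."""
    ks = []
    for a, b, c in combinations(range(len(d)), 3):
        d_min, d_mid, d_max = sorted((d[a][b], d[a][c], d[b][c]))
        if d_min > 0:  # skip degenerate triples with coinciding items
            ks.append((d_max - d_mid) / d_min)
    return ks


# An ultrametric triple: the two largest distances coincide, so k = 0.
ultrametric = [[0, 1, 2], [1, 0, 2], [2, 2, 0]]
# Three points at 0, 1 and 3 on a line: the triangle inequality is
# saturated, so k takes its maximal value 1.
on_a_line = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
```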

486:

487: On our finite $N$ lattices we assume the following dependence of $P$

488: over $K$:

489: \begin{displaymath}

490: P(k=K|C) \sim \exp\left\{-\frac{K^2}{2\sigma^2}\right\}\ \theta(K)\ ,

491: \end{displaymath}

492: where $\theta(\cdot)$ is the step function.  We analyze the behavior

493: of the variance $\sigma^2$ with the size $N$.  We show our results in

494: figure \ref{F-SIGMA}. In the upper plot we select $T=0.5\ T_c$ and we

495: plot $\sigma$ as a function of $N$. $\sigma$ decreases with $N$, but

496: very slowly (as we expected from the results of \cite{camapa}, where

497: even with a tuned up Monte Carlo procedure one finds that similar

498: analyses are very difficult): it is not even easy to get a reasonable

499: fit to a zero limit of $\sigma$ (but the very large statistical error

500: allows for it). In the inset of the upper part of the plot we show

501: $P(k=K|C)$ for one single sample. In the bottom plot we show how

502: $\sigma$ depends on $T$ on our largest lattice size, $N=512$. Nothing

503: dramatic happens when increasing $T$: again, only some qualitative low

504: key effect is taking place.
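
Under the half-Gaussian assumption the width can be estimated directly from the measured values of $K$, since the maximum likelihood estimate is $\sigma^2=\frac1n\sum_i K_i^2$. The Python sketch below illustrates this on synthetic data. It is one possible estimator, stated as an assumption, and not necessarily the fitting procedure used for the figure; the name \texttt{half\_gaussian\_sigma} and the sample are hypothetical.

```python
import math
import random


def half_gaussian_sigma(samples):
    """Maximum-likelihood width of P(K) ~ exp(-K^2 / (2 sigma^2)) for K >= 0."""
    if not samples:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(k * k for k in samples) / len(samples))


# Synthetic check: draw half-Gaussian values with a known width.
rng = random.Random(1)
true_sigma = 0.3
samples = [abs(rng.gauss(0.0, true_sigma)) for _ in range(20000)]
estimate = half_gaussian_sigma(samples)
```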

505:

506: \begin{figure}

507: \centerline{\psfig{figure=F/fig3.ps,width=0.7\textwidth,angle=270}}

508: \caption{In the upper part of the figure we plot the variance of

509: the distribution $P(k=K|C)$ versus the base two logarithm of the

510: number of spins $N$ at fixed temperature $T=0.5\ T_c$.  In the inset

511: we plot $P(k=K|C)$ as a function of $K$ for a single sample of the

512: quenched disorder.  In the lower part of the figure we plot $\sigma$

513: vs. $\frac{T}{T_c}$ for $N=512$.}

514: \label{F-SIGMA}

515: \end{figure}

516:

517: We now turn to the analysis of the results of our cluster

518: reconstruction. We have used our data (spin configurations for a given

519: lattice size and temperature, together with their mutual distances

520: obtained from their mutual overlap) to produce a hierarchical tree,

521: and we want to test if this tree is connected to intrinsic properties

522: of our data (as we have already clarified, an ultrametric tree can

523: always be superimposed even on random data). We will adapt standard

524: techniques \cite{jaidub} to judge about the validity of the structure

525: we have found and about the statement that data are organized

526: according to an ultrametric structure.

527:

528: The general testing procedure has a simple structure: given a starting

529: proximity matrix $\cal P$, we end our clustering procedure with a

530: particular ordering of elements of $\cal P$, i.e. with a particular

531: permutation of the $M$ data. This is what our clustering scheme achieves

532: (transforming the left part of figure \ref{F-MATRIX-A} in the right

533: bottom matrix). Now we have the problem of deciding if what we did was

534: sensible: we can rephrase this question by saying that we have to

535: choose between the {\em randomness hypothesis} ($H_0$: all

536: permutations of the labels of the $M$ data are equally likely) and the {\em

537: alternative hypothesis} ($H_1$: the data have some structure that has

538: been at least partially reconstructed by the clustering). In order to

539: check that we:

540: \begin{enumerate}

541: \item define a variable $T$ that we expect to be

542: ``small'' under the null hypothesis $H_0$;

543: \item assign a {\em confidence level} $\alpha$ for $H_1$ and

544: define a threshold $t_\alpha$ by solving the equation

545: $$

546: P(T\ge t_\alpha | H_0) = 1-\alpha\ ;

547: $$

548: \item measure from the data

549: the value of $T$, that we call  $t^*$. If

550: \begin{enumerate}

551: \item $t^* \ge t_\alpha$ $\Rightarrow$ reject $\ H_0$ at level $\alpha$;

552: \item $t^* < t_\alpha$ $\Rightarrow$ accept $H_0$ at level $\alpha$.

553: \end{enumerate}

554: \end{enumerate}

555: $\alpha$ is a confidence level, i.e. it is connected to

556: the probability that by accepting $H_1$ as true we are not

557: making a mistake.
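
The three steps can be sketched with a Monte Carlo estimate of the null distribution, obtained by relabelling the data at random. In the Python fragment below the statistic is the plain sum $\sum_{\mu<\nu}d_{\mu\nu}f_{\mu\nu}$ between a distance matrix $d$ and a $0/1$ matrix $f$ that marks pairs assigned to different groups: this choice, the names \texttt{gamma\_raw} and \texttt{permutation\_test}, the number of relabellings and the toy data are all assumptions made for illustration.

```python
import math
import random


def gamma_raw(d, f):
    """Sum over pairs mu < nu of d[mu][nu] * f[mu][nu] (illustrative statistic)."""
    m = len(d)
    return sum(d[i][j] * f[i][j] for i in range(m) for j in range(i + 1, m))


def permuted(f, perm):
    """Apply the same relabelling to the rows and columns of f."""
    m = len(f)
    return [[f[perm[i]][perm[j]] for j in range(m)] for i in range(m)]


def permutation_test(d, f, alpha, n_perm, rng):
    """Return (t_star, t_alpha, reject_h0) for confidence level alpha."""
    t_star = gamma_raw(d, f)
    labels = list(range(len(d)))
    null = []
    for _ in range(n_perm):
        rng.shuffle(labels)  # under H0 every relabelling is equally likely
        null.append(gamma_raw(d, permuted(f, labels)))
    null.sort()
    # Empirical alpha-quantile of the sampled null distribution.
    index = min(len(null) - 1, math.ceil(alpha * len(null)))
    t_alpha = null[index]
    return t_star, t_alpha, t_star >= t_alpha


# Toy data: two well separated groups of four items each.
groups = [0, 0, 0, 0, 1, 1, 1, 1]
m = len(groups)
f = [[0 if groups[i] == groups[j] else 1 for j in range(m)] for i in range(m)]
d = [[0 if i == j else (1 if groups[i] == groups[j] else 9) for j in range(m)]
     for i in range(m)]
t_star, t_alpha, reject = permutation_test(
    d, f, alpha=0.95, n_perm=200, rng=random.Random(0))
```

In this toy case the measured value is the largest that any relabelling can produce, so the randomness hypothesis is rejected whatever the sampled threshold.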

558:

559: The first tool that we introduce is based on \emph{Hubert's $\Gamma$

560: Statistics} \cite{jaidub,hubsch}, and it is useful to validate

561: clustering. This is done by checking the correlation of the data with

562: a structure we define {\em a priori}.

563:

564: We consider our measured distance matrix $d_{\mu\nu}$, and we

565: introduce the matrix $f_{\mu\nu}$ by

566: \begin{equation}

567: f_{\mu,\nu}=

568: \left\{

569: \begin{array}{cl}

570: 0 & \textrm{if $\mu,\nu\,{{\in}}$

571: same cluster}\\

572: 1 & \textrm{if not}

573: \end{array}

574: \right.

575: \label{E-HUBERT}

576: \end{equation}

577: We will study the correlations among  $d_{\mu\nu}$ and

578: $f_{\mu\nu}$. Clearly we have also to specify the definition of {\em

579: being in the same cluster}. This introduces a parameter that allows us to

580: decide how deeply we want to test the clusterization features of the

581: data. We will introduce a threshold, that defines the refinement level

582: that we want to use to check our description.

583:

584: We then have to define the a priori structure that we will compare to

585: the data.  Let us call $d_{\mbox{max}}$ the maximum distance (on the

586: hierarchical tree) between two configurations of our set: we say that

587: {\em two configurations belong to the same cluster if their distance

588: is smaller than a certain fraction of $d_{\mbox{max}}$, say than

589: ${d_{\mbox{max}}}/{z}$}.  We show in the right part of figure

590: \ref{F-HUBERT} how the number of clusters $N_c$ depends on $z$. At

591: very low $T$ we find a linear dependence of $N_c$ on $z$, while at

592: temperatures of the order of $\frac12 T_c$, $N_c$ grows faster than linearly.

593: In figure \ref{F-HUBERT} we also show, for one sample of the quenched

594: disorder, the true distance matrix $d_{\mu,\nu}$ and four different

595: matrices $f_{\mu,\nu}$ obtained with an increasing value of $z$ (from

596: the upper left corner going rightward and then to the lower line and

597: rightward again), $z=$ $4$, $8$, $12$ and $16$. The difference among

598: the structures that we are testing in the different cases is obvious.

599: The careful reader will be able to recognize by eye that the three

600: valley structure implied by the threshold level $z=4$ can indeed be

601: found in the raw distance data of the leftmost matrix.

602:

603: \begin{figure}

604: \centerline{

605: \psfig{figure=F/fig4.ps,width=1.0\textwidth,angle=90}

606: }

607: \caption{On the left we plot the true distance matrix for a single

608: disorder sample at $T=0.5\ T_c$, and in the center four matrices

609: $f_{\mu,\nu}$ obtained for four different values of the threshold as

610: defined in equation (\protect\ref{E-HUBERT}). On the right we plot the

611: number of clusters $N_c$ versus $z$, i.e. how the number of

612: valleys depends upon the value of threshold we fix in order to test

613: the hypothesis. It turns out to be linear for small $T/T_c$,

614: exponential if $T\gtrsim T_c/2$ }

615: \label{F-HUBERT}

616: \end{figure}

617:

618: The main ingredient needed for

619: analyzing Hubert's

620: $\Gamma$ statistics is the correlation function

621: \begin{equation}

622: \Gamma=

623: \frac 1{M^2}\sum_{\mu=1}^M\sum_{\nu=1}^M

624: \frac{

625: \left(d_{\mu,\nu}-m_{D}\right)\left(f_{\mu,\nu}-m_{F}\right)

626: }{

627: {s_{D}\,s_{F}}

628: }\ ,

629: \label{E-GAMMA}

630: \end{equation}

631: where (for $X=D,F$, $x=d,f$)

632: \begin{displaymath}

633: m_X \equiv \frac 1{M^2}\sum_{\mu=1}^M\sum_{\nu=1}^M

634: x_{\mu,\nu}\quad\quad,\quad\quad

635: s^2_X \equiv \frac 1{M^2}\sum_{\mu=1}^M\sum_{\nu=1}^M

636: x^2_{\mu,\nu}-m_X^2\ .

637: \end{displaymath}
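It may help to note a simple consequence of these definitions (the
symbols $p$, $\overline{d}_{\rm in}$ and $\overline{d}_{\rm out}$ are
introduced here only for illustration). Since $f_{\mu,\nu}$ takes only
the values $0$ and $1$ we have $f^2_{\mu,\nu}=f_{\mu,\nu}$. Calling
$p\equiv m_F$ the fraction of pairs that belong to different clusters
(with $0<p<1$), and $\overline{d}_{\rm in}$ and $\overline{d}_{\rm out}$
the average of $d_{\mu,\nu}$ over pairs in the same cluster and in
different clusters respectively, one finds
\begin{displaymath}
s_F^2 = p\,(1-p)\ ,\quad\quad
\Gamma = \sqrt{p\,(1-p)}\;
\frac{\overline{d}_{\rm out}-\overline{d}_{\rm in}}{s_D}\ ,
\end{displaymath}
and the Cauchy-Schwarz inequality gives $-1\le\Gamma\le 1$. A large
positive $\Gamma$ hence means that pairs assigned to different clusters
are, on average and in units of $s_D$, clearly more distant than pairs
assigned to the same cluster.
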

638: Let us say that when looking at the output of the clustering we

639: observe a value of $\Gamma$ equal to $\Gamma^*$. In order to estimate

640: if this value hints at the hierarchical structure being intrinsic to

641: the data we have used a number of tests. The first test amounts to

642: little more than checking if our procedures are correct: we take as

643: $H_0$ the randomness hypothesis, i.e. we compare our ordered distance

644: matrix to a matrix where the configurations are in random order. We would

645: find that the configuration is not atypical only if our programming

646: was wrong. We compute a histogram $P(\Gamma|H_0)$, i.e. the

647: distribution of $\Gamma$ under the null hypothesis of randomness, by

648: evaluating

649: \begin{displaymath}

650: \Gamma(\pi)=

651: \frac 1{M^2}\sum_{\mu=1}^M\sum_{\nu=1}^M

652: \frac{

653: \left( d_{\mu,\nu}          - m_D \right)

654: \left( f_{\pi(\mu),\pi(\nu)}- m_F \right)

655: }{s_D\,s_F}\ ,

656: \end{displaymath}

657: where the $\pi$ are random permutations of the $M$ configurations.  A

658: clustering is not consistent with the hypothesis $H_0$ (in this case the

659: hypothesis that configurations have not been ordered) if it is

660: ``unusual''. In order to quantify this statement, we introduce an

661: indicator $\Delta$ defined as

662: \begin{displaymath}

663:   \Delta\equiv

664:   \frac{\Gamma^*-\langle \Gamma\rangle}

665:    {\sqrt{\langle(\Delta\Gamma)^2\rangle}}

666: \end{displaymath}

667: where $\Gamma^*$ is the value of $\Gamma$ that we have observed in our sample and

668: where the averages are taken with respect to the conditioned

669: probability distribution $P(\Gamma|H_0)$.  As expected we always find

670: a very high value of $\Delta$ for all reasonable values of the

671: threshold $z$ (i.e., say, values of $z$ that produce from two to of order

672: a hundred valleys): $\Delta$ is of order $10^{2}$ and it is only

673: weakly dependent on the temperature (even at $T=\infty$ this test

674: tells us that, yes, we have ordered the configurations, rejecting in this

675: way $H_0$ in a very clear cut way, since we are dealing with a large

676: matrix). As expected this procedure gives positive results both on the

677: original set of configurations and after applying the reversing

678: procedure described in section \ref{SS-REVERSE}.

679:
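In practice the averages entering $\Delta$ can be estimated by
sampling. As a sketch (the number $K$ of sampled permutations is a free
parameter that we introduce here only for illustration), after drawing
$K$ independent random permutations $\pi_1,\ldots,\pi_K$ one can use
\begin{displaymath}
\langle\Gamma\rangle \simeq \frac 1K\sum_{k=1}^K \Gamma(\pi_k)\ ,
\quad\quad
\langle(\Delta\Gamma)^2\rangle \simeq
\frac 1K\sum_{k=1}^K \Gamma(\pi_k)^2
-\left(\frac 1K\sum_{k=1}^K \Gamma(\pi_k)\right)^2\ .
\end{displaymath}
Notice that a simultaneous permutation of the rows and of the columns
of $f$ does not change $m_F$ and $s_F$, so that only the cross term
$\sum_{\mu,\nu} d_{\mu,\nu}\,f_{\pi(\mu),\pi(\nu)}$ has to be recomputed
for each $\pi_k$.
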

680: The rest of the (more crucial) testing of Hubert's $\Gamma$

681: statistics has been done on the set of reversed configurations, where

682: the $Z_2$ symmetry has been eliminated (see section \ref{SS-REVERSE}).

683: We will discuss it later on, after introducing some other important

684: objects and methods.

685:

686: The second tool we use to establish whether the particular

687: hierarchical structure we find is the correct one is based on the

688: evaluation of the so-called \emph{cophenetic correlation coefficient}

689: $\cal K$. It is defined as

690: $$

691:   {\cal K}\equiv

692:   \langle d\cdot d_C\rangle-\langle d\rangle \langle

693:   d_C\rangle\ ,

694: $$

695: where the cophenetic distance $d_C(\mu,\nu)$ is measured on the

696: dendrogram (and because of that it is ultrametric by definition). For

697: example, in the case of Ward clustering, it is the quantity

698: defined in (\ref{E-WARD}). A high level of correlation of true

699: distance and cophenetic distance  implies that the data have an

700: intrinsic ultrametric organization. On the contrary a low level of

701: correlation suggests that a true ultrametric structure cannot be

702: detected. $\cal K$ is a natural measure of the ultrametricity built into

703: our data set.

704:
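For reference, we recall that ultrametricity of the cophenetic distance
means that for any three configurations $\mu$, $\nu$ and $\lambda$
\begin{displaymath}
d_C(\mu,\nu)\le
\max\left\{d_C(\mu,\lambda),d_C(\lambda,\nu)\right\}\ ,
\end{displaymath}
i.e. all triangles are isosceles, and the two equal sides are not
shorter than the third one. For example three mutual distances equal to
$(1,2,2)$ are allowed, while $(1,2,3)$ are not. We also note that in
the clustering literature \cite{jaidub} the cophenetic coefficient is
usually normalized as a correlation coefficient,
\begin{displaymath}
\frac{\langle d\cdot d_C\rangle-\langle d\rangle \langle d_C\rangle}
{s_D\,s_C}\ ,
\end{displaymath}
where $s_C$ is defined for $d_C$ exactly as $s_D$ is defined for $d$:
with this normalization the coefficient cannot exceed $1$, and it is
equal to $1$ only if $d$ is an increasing linear function of $d_C$. We
state this normalization here as an assumption, consistent with
comparing the measured values of $\cal K$ to $1$.
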

705: If we try to analyze our original configuration set without removing

706: the $Z_2$ symmetry (each configuration $\cal C$ has a corresponding

707: configuration ${\cal C}'=-{\cal C}$ which appears with the same probability)

708: we measure a high value of $\cal K$, always higher than $0.97$.

709: Interpreting this result as a confirmation of the detection of an

710: ultrametric structure would be wrong: the $Z_2$ symmetry implies a very

711: primitive form of hierarchical organization (states are grouped in two

712: well separated sectors of the phase space) and on finite, medium size

713: volumes, this is what we are measuring.

714:

715: \begin{figure}

716: \centerline{\psfig{figure=F/fig5.ps,width=0.8\textwidth,angle=270}}

717: \caption{Plot of the true distance $d(i_0,j)$ (solid lines with wiggles) and

718: of the ultrametric cophenetic distance $d_C(i_0,j)$ (solid straight lines)

719: versus $j$ for different values of $i_0$.}

720: \label{cfr}

721: \end{figure}

722:

723: One way to clarify this issue is to look at figure \ref{cfr}, where we

724: plot, for a given sample of the quenched disorder, at $N=512$ and low

725: temperature $T=0.3 T_c$, both the true distance $d(i_0,j)$ and the

726: cophenetic distance $d_C(i_0,j)$ as a function of $j$ for various

727: values of $i_0$. It is clear that the $Z_2$ symmetry makes the two

728: distances similar in a trivial way, by drawing the same step: this

729: is the reason that makes ${\cal K}\lesssim 1$. The real physical

730: differences are in the wavy behavior of the true distance: it is its

731: difference from the constant behavior of the cophenetic distance that

732: has to be analyzed. This is what we will do in the next section.

733:

734: We will now apply a spin reversal procedure that allows us to obtain a

735: set of configurations that have, in the infinite volume limit, a

736: positive definite mutual overlap. This is a very useful procedure

737: \cite{MAMAZU} that makes our set of configurations equivalent to a set

738: of configurations obtained in an infinitesimal magnetic field (without

739: the drawback of having to keep under control the smallness of the

740: field). Only after doing that will we come back to the evaluation of

741: the cophenetic coefficient $\cal K$.

742:

743: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

744: \subsection{The Reversing Procedure and Our Main Results\label{SS-REVERSE}}

745:

746: \begin{figure}

747: \centerline{\psfig{figure=F/fig6.ps,width=0.6\textwidth,angle=270}}

748: \caption{The probability distribution $P_J(q)$

749: for different realizations of the quenched disorder ($T=0.4$),

750: before and after applying the reversing procedure.

751: Here we use $M=512$ configurations of a $N=512$ spin system.}

752: \label{pdq}

753: \end{figure}

754:

755: In the infinite volume limit the question of identifying in our set of

756: configurations two subsets, $|+\rangle$ and $|-\rangle$ is well

757: posed. After doing that we can flip all signs of the configurations in

758: $|-\rangle$, obtaining in this way a set of configurations with a

759: positive definite overlap.

760:

761: We use here the approach introduced in \cite{MAMAZU}. We take one

762: configuration as starting point, $\cal S$. We consider now a new

763: configuration, and if its overlap with $\cal S$ is negative we flip

764: it. For a third configuration we consider the average overlap with the

765: first two, and we flip it if this is negative. We do that for all

766: configurations. This procedure works quite well, and it can be

767: improved in a number of ways (for example we can repeat it by starting

768: from the new set and considering a different reference configuration

769: and a different order).

770:
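The sequential rule described above can be written compactly as
follows. This is only a sketch: we assume the usual definition of the
overlap, $q(\sigma,\tau)\equiv\frac 1N\sum_{i=1}^N\sigma_i\tau_i$, we
assume that the average is taken over the configurations that have
already been processed (and possibly reversed), and the signs
$\epsilon_k$ are a notation introduced here only for illustration.
Setting $\epsilon_1=1$ for the starting configuration, for
$k=2,\ldots,M$ we define
\begin{displaymath}
\epsilon_k = \mathrm{sign}\left(
\frac 1{k-1}\sum_{j=1}^{k-1}\epsilon_j\, q(\sigma^j,\sigma^k)
\right)\ ,\quad\quad
\sigma^k\to\epsilon_k\,\sigma^k\ ,
\end{displaymath}
so that a configuration is reversed exactly when its average overlap
with the previous ones is negative (a vanishing average can be assigned
to either sign). With this rule the result depends on the starting
configuration and on the order in which the configurations are
visited, which is why repeating the procedure with a different
reference and a different order can improve it.
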

771: In figure \ref{pdq} we show the $P_J(q)$ for several samples, before

772: and after the reversing procedure. It is clear that the procedure

773: works quite well. The main problems are for samples where different

774: valleys are quite similar (we are on finite lattices and there are

775: intrinsic ambiguities that disappear in the thermodynamic limit). A

776: good example of a troublesome sample is the second sample from the

777: top on the right, where the reconstructed $P_J(q)$ has, even after our

778: reversal procedure, a long tail at negative $q$ values. We have

779: verified (see also \cite{MAMAZU}) that when increasing the volume size

780: these spurious effects become smaller.

781:

782: We have also found that a second effective approach to the separation

783: of the phase space is based on using the same clusterization procedure

784: we will eventually use for analyzing the hierarchical structure. We

785: first use clusterization (based for example on the Ward algorithm) to

786: identify the two $Z_2$ subsets. We then flip all spins of all

787: configurations of one of the two, and repeat the clusterization to

788: find a new (hopefully faithful) hierarchical structure. This second

789: approach gives results that are very similar to the ones of the first

790: approach \cite{MAMAZU} that we have discussed before: for example the

791: resulting $P_J(q)$ are basically indistinguishable.

792:

793: In the following we will use spin configurations {\em ``reversed''}

794: using this technique.

795:

796: \begin{figure}

797: \centerline{

798: \psfig{figure=F/fig7A.ps,width=0.5\textwidth}

799: \psfig{figure=F/fig7B.ps,width=0.5\textwidth}

800: }

801: \caption{Proximity matrix for two $N=512$ samples in the left and

802: right parts of the plot (at $T=0.2\ T_c$ on the left for each of the

803: two samples and at $T=0.6\ T_c$ on the right for each of the two

804: samples) ordered according to the output of the clustering procedure

805: (i.e. as from the dendrogram, in the bottom) and the corresponding

806: cophenetic matrix implied by the same dendrogram (in the top).}

807: \label{A16}

808: \end{figure}

809:

810: In figure \ref{A16} we show the proximity matrix for two $N=512$

811: samples (at $T=0.2\ T_c$ and at $T=0.6\ T_c$) ordered according to the

812: output of the clustering procedure (i.e. as from the dendrogram) and

813: the corresponding cophenetic matrix implied by the same dendrogram.

814:

815: \begin{figure}

816: \centerline{\psfig{figure=F/fig8.ps,width=0.8\textwidth,angle=0}}

817: \caption{ In figures \protect\ref{corr}.a, \protect\ref{corr}.b and

818: \protect\ref{corr}.c we plot $\cal K$ as a function of $\frac{T}{T_c}$

819: for $N=128$, $N=256$ and $N=512$.  In figure \protect\ref{corr}.d we

820: plot $\langle\Gamma\rangle$ versus the assumed density of valleys,

821: i.e. the number of valleys divided by the number of configurations

822: $M$: a large difference from the high $T$ data implies a plausible

823: hypothesis.  In figure \protect\ref{corr}.e we compare single and

824: complete link clustering: see the text for further details.}

825: \label{corr}

826: \end{figure}

827:

828: When the hierarchical, ultrametric structure is intrinsic to the data

829: set the matrices in the bottom line of figure \ref{A16} become equal

830: to the ones in the top line. Now that the accidental $Z_2$

831: symmetry has been removed we are able to look at the real, relevant

832: physical effects. We have investigated the issue in a systematic

833: way. We average over $20$ different quenched realizations of the

834: disorder, and analyze the system for different lattice volumes as a

835: function of the temperature.

836:

837: In figures \ref{corr}.a, \ref{corr}.b and \ref{corr}.c we plot $\cal

838: K$ as a function of $\frac{T}{T_c}$ for $N=128$, $N=256$ and $N=512$.

839: The upper sets of points with smaller errors are from the analysis

840: done {\em before} the spin reversal ($Z$), the lower sets of points

841: with larger error are from the analysis of the spin reversed

842: configurations ($R$). We have already discussed the fake detection of

843: ultrametricity induced by the $Z_2$ symmetry. We discuss now the data

844: obtained after removing the symmetry. In no case does clear evidence for

845: the existence of a true ultrametric structure emerge. $\cal K$ is

846: always small, and for $T<T_c$ it does not even increase clearly with

847: $N$ (finite size effects are very large and uncontrolled). It is

848: interesting that in the set of $Z$ data the phase transition is

849: detected quite clearly (but, as we have explained, what we observed is

850: not connected to a hierarchical structure, but only to the

851: usual breaking of the $Z_2$ symmetry). At high $T$ values, for $T>T_c$

852: the $Z$ and the $R$ sets of data coincide: here there is one single

853: state.

854:

855: This analysis shows clearly that on medium size lattices it is

856: impossible to detect more than hints toward a hierarchical structure:

857: in our mean field model we know that ultrametricity will eventually

858: emerge, but very large lattices are needed for that.

859:

860: In figure \ref{corr}.d we try a further test to improve the level of

861: our quantitative understanding. We could phrase our goal by saying

862: that we are trying to understand how many valleys we can be sure are

863: present in the phase space (we repeat that since we are studying the

864: mean field Sherrington-Kirkpatrick theory in the Parisi broken phase

865: we know that asymptotically an infinite number of such valleys will

866: emerge). We go back to $\Gamma$ defined in equation (\ref{E-GAMMA}).  At

867: different $T$ values we change the threshold value $z$ and monitor the

868: number of valleys we are building for a given $z$ value (this depends

869: on $T$: we have discussed this procedure when commenting on figure

870: \ref{F-HUBERT}). We measure $\langle\Gamma\rangle$ and we plot it

871: versus the average number of valleys per sample (all data are for

872: reversed configurations, except for one set of non-reversed data at

873: $T=0.5 T_c$ that we plot for the sake of comparison). We use the high $T$

874: ($T=1.9 T_c$) curve as a reference curve, and we consider it as the

875: randomness threshold: if at a given temperature $T$ the value of

876: $\langle\Gamma\rangle$ is very different from the high $T$ value we

877: consider that as evidence for the existence of this number of valleys.

878:

879: Using the high $T$ limit as the reference line looks to us like a

880: sensible choice (we have already discussed that using unordered

881: matrix lines is basically just a check of the correctness of our

882: procedure). If, for example, we select a value of the $x$ variable

883: (number of clusters divided by $M$) $x=0.002$, that in the case of

884: $N=512$ assumes the presence of two valleys ({\em after} removal of

885: the $Z_2$ symmetry) we see that at low $T$ the data are quite

886: different from the high $T$ ones, suggesting that we are probably

887: already detecting this (quite low) level of organization. When we try

888: a threshold implying a larger number of valleys (already for example

889: for three or four valleys on our largest lattice, $N=512$) the data are

890: not far from the high $T$ ones, implying a failure in supporting the

891: hypothesis.

892:

893: We will discuss figure \ref{corr}.e in the next section.

894:

895: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

896: \subsection{Other Clustering Algorithms\label{SS-OTHER}}

897:

898: As we have discussed in some detail in section \ref{S-CLUSTER} the

899: cluster reconstruction algorithm is defined by selecting the rule used

900: to join two elements at different levels of the partitioning, and to

901: update the distance matrix after each step of refining the

902: partitioning level.

903:

904: In our analysis we have used the Ward scheme \cite{ward} (that updates

905: the distances as in equation \ref{E-WARD}): this is believed to be an

906: optimal choice when there is no information \emph{a priori} on the

907: data \cite{jaidub}.

908:

909: Basic clustering algorithms are the {\em single link} scheme and the

910: \emph{complete link} one.  We will not go into much detail here (see

911: \cite{jaidub} for further information), but let us say that in the

912: single link scheme one just demands a weak connectivity to merge two

913: subsets, and joins them to form a new cluster as early as possible,

914: while in the complete link scheme the opposite happens, and subsets

915: are joined to form a new cluster ``as late as possible''. Both methods

916: have advantages and drawbacks. The crucial observation that we will

917: use now is that when a real hierarchical structure is present all

918: these methods end up giving the same result, and reconstructing the

919: same classification.

920:

921: In these two algorithms we have that, if as before $\rho$ and $\sigma$

922: merge to form the new cluster $\rho'$, for all other clusters $\tau$:

923: \begin{eqnarray*}

924: d_{\tau,\rho'}=&

925: \min\{d_{\tau,\rho},d_{\tau,\sigma}\}&

926: \;\;\;\;\;\textrm{(single link)}\ ,\\

927: d_{\tau,\rho'}=&

928: \max\{d_{\tau,\rho},d_{\tau,\sigma}\}&

929: \;\;\;\textrm{(complete link)}\ .

930: \end{eqnarray*}

931: The reason for the names is in the graph theory interpretation of the

932: algorithms \cite{jaidub}. As we have already said it is not difficult

933: to show that if the true distance matrix is actually ultrametric the

934: optimal permutation with respect to these two algorithms is exactly

935: the same.

936:
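The argument can be sketched as follows, using only the ultrametric
inequality and the fact that at each step the two closest clusters are
merged. If $d$ is ultrametric and $\rho$ and $\sigma$ are the two
closest clusters, then for any other cluster $\tau$ we have
$d_{\rho,\sigma}\le d_{\tau,\rho}$ and
$d_{\rho,\sigma}\le d_{\tau,\sigma}$, so that
\begin{displaymath}
d_{\tau,\rho}\le\max\{d_{\tau,\sigma},d_{\sigma,\rho}\}=d_{\tau,\sigma}
\ ,\quad\quad
d_{\tau,\sigma}\le\max\{d_{\tau,\rho},d_{\rho,\sigma}\}=d_{\tau,\rho}
\ ,
\end{displaymath}
i.e. $d_{\tau,\rho}=d_{\tau,\sigma}$. The minimum and the maximum in the
two update rules hence coincide. The updated matrix is the original one
with the row and the column of $\sigma$ removed, so it is again
ultrametric and the argument can be iterated: up to the arbitrary
choices made when several pairs are at the same minimal distance, the
two algorithms build the same tree.
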

937: In this framework we have introduced a last test of the structure of

938: our data: we check how different the outputs of the two algorithms are,

939: to try to understand if we can detect further hints for an emerging

940: ultrametric structure.  We have analyzed 20 samples at several

941: temperature values, and we show in figure \ref{corr}.e the average

942: correlation between the two output distance matrices, that is

943: \begin{displaymath}

944: \omega

945: \equiv \overline{\langle d_{SL}\cdot d_{CL}\rangle} \ .

946: \end{displaymath}

947: The correlation is very high at low $T$, and decreases toward the high

948: $T$ value around $T\sim 0.8\ T_c$. Again, on medium large lattice sizes

949: we can detect hints toward an emerging ultrametric structure but we

950: cannot in any way get a clear cut answer.

951:

952: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

953: \section{Clustering the Spins\label{S-SPINS}}

954:

955: \begin{figure}

956: \centerline{\psfig{figure=F/fig9.ps,width=0.8\textwidth}}

957: \caption{Clustering the spins: for a given sample of the quenched

958: disordered couplings we look at the spins of our configurations as a

959: set made of $N=512$ elements (one per lattice site), each element

960: being a $M=512$ dimensional vector (all the values

961: taken by the spin in the given site on our $M$ independent

962: configurations). After clustering these data vectors we plot the

963: distance matrix $d_{ij}$ between spin $i$ and spin $j$ according to

964: the ordering found in the cluster.  The plots correspond to $T=0.1\

965: T_c, T=0.2\ T_c, ...,0.9\ T_c$. At very low temperatures a large

966: ($O(N)$) spin domain structure emerges. The structure disappears when

967: increasing the temperature.}

968: \label{spin}

969: \end{figure}

970:

971: An interesting question (discussed in detail in \cite{domany})

972: concerns a possible clustering of the {\em spins} of our system.

973: The issue is clearly very relevant in the finite dimensional systems

974: studied in \cite{domany} where spatial structures can be very

975: relevant. Here, in mean field, there is no notion of distance, but

976: still spins can be aggregated in different groups that have different

977: degrees of correlation.

978:

979: We will look for the possible presence of some kind of structure (in

980: this case not hierarchical since there is no reason for this) now in

981: the space of the elementary spins rather than in configuration space.

982: In the analysis of configurations we were considering the $N\times M$

983: data matrix $\{\sigma_i^\mu\}$ as representing $M$ configurations,

984: where each data point was an $N$-dimensional vector.  Now we change

985: our point of view; we regard each of the $N$ spins as a data point,

986: that is as a vector in a $M$-dimensional space.  Since we expect

987: highly correlated spins to be in the same cluster, following

988: \cite{domany}

989: we define

990: the distance between spin $i$ and spin $j$ as

991: $$

992: d_{ij}=1-c_{ij}^2\ ,

993: $$

994: where

995: $$

996: c_{ij}\equiv\langle \sigma_i\sigma_j\rangle

997: \equiv\frac 1M\sum_{\mu=1}^M \sigma_i^\mu\sigma_j^\mu

998: $$

999: is the spin correlation matrix that we can evaluate using our

1000: spin configurations generated in a Monte Carlo run.

1001:
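A few properties of this distance follow directly from the definition;
we list them as a check, under the only assumption that the spins are
Ising variables, $\sigma_i^\mu=\pm 1$. Since $|c_{ij}|\le 1$ and
$c_{ii}=1$ we have
\begin{displaymath}
0\le d_{ij}\le 1\ ,\quad\quad d_{ii}=0\ ,\quad\quad d_{ij}=d_{ji}\ .
\end{displaymath}
Moreover $d_{ij}$ does not change if spin $i$ is reversed in all
configurations (since $c_{ij}\to -c_{ij}$), nor if a whole
configuration $\mu$ is reversed (since each product
$\sigma_i^\mu\sigma_j^\mu$ is unchanged). Two perfectly correlated or
perfectly anticorrelated spins are at distance zero, while two
uncorrelated spins, with $c_{ij}=0$, are at the maximal distance
$d_{ij}=1$. In particular this distance is not sensitive to the global
$Z_2$ symmetry.
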

1002: It is interesting to follow the evolution in temperature

1003: of the ordered spin matrix for a given sample: we show it in figure

1004: \ref{spin}. At intermediate temperature values a large group of spins

1005: is clearly very correlated: here $O(N)$ spins are grouped

1006: together. This structure disappears at high $T$ values. It is

1007: remarkable how this picture is similar to figure 11.d of the second

1008: paper of reference \cite{domany}. This is a severe warning against

1009: misleading interpretations of the data analysis: here we are in mean

1010: field, and there are no spatial local domains.

1011:

1012:

1013: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1014: \section{Conclusions\label{S-CONCLUSIONS}}

1015:

1016: The configuration space of an $N$-spin system contains $2^N$

1017: configurations and it is very difficult to represent it in order to catch the

1018: main physical features~\footnote{Only for limited purposes a principal

1019: component analysis (PCA) can be adapted to help in this task

1020: \protect\cite{domany}.}.  We have shown that cluster analysis not only

1021: allows one to visualize in a physically meaningful way the structure of

1022: the configuration space, but also allows for quantitative testing of a

1023: priori hypotheses about the structure of the data set.

1024:

1025: We have discussed the role of the $Z_2$ symmetry of the system, and

1026: how its removal is necessary to study the relevant physical

1027: phenomena. Our main point is that quantitative testing is mandatory to

1028: make clustering techniques a useful tool. We have introduced some

1029: of these techniques by designing tests that are useful in the

1030: context of (disordered) statistical mechanics.

1031:

1032: As a crucial benchmark we have analyzed the mean field theory in the

1033: low $T$ replica broken phase, where we know that eventually, in the

1034: infinite volume limit, a hierarchical structure of states emerges. We

1035: are able to observe many hints toward the emergence of such a structure,

1036: but on the lattice sizes where we are able to work these indications

1037: cannot be considered as unambiguous. Detecting ultrametricity is very

1038: difficult, and demands very large lattice sizes: this turns out to be

1039: true in mean field, and we expect it to be probably true also in

1040: finite dimensional models, where the very existence of mean field

1041: like states remains to be checked. We believe that the findings and the

1042: techniques that we have reported here will be important to use in the

1043: finite dimensional context. Like many other features (we have in mind

1044: for example temperature chaos \cite{CHAOS}, that is very difficult to

1045: detect numerically and emerges only at very high orders in

1046: perturbation theory) ultrametricity emerges, already in mean field,

1047: only on very large lattices.

1048:

1049: We also believe it is important that in this ``quantitative'' approach

1050: to clustering we have been able to introduce a natural way to consider

1051: not only sample dependent but also disorder average quantities.

1052:

1053: A next step is to apply, by continuing the work of \cite{domany},

1054: these techniques to finite dimensional disordered systems (defined on

1055: very large lattices!) on the one side and to glassy systems on the

1056: other side: since here a crucial goal is to try to understand the

1057: details of the spatial, time dependent organization of the system,

1058: techniques like the ones introduced here could turn out to be very

1059: useful.

1060:

1061: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1062: \section*{Acknowledgments\label{S-ACK}}

1063:

1064: We acknowledge the precious contribution of Loredana Correale to a

1065: first phase of this work. We thank Eytan Domany and Peter Young for many

1066: useful conversations that have motivated us toward this problem.

1067:

1068: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1069: \begin{thebibliography}{99}

1070:

1071: \bibitem{domany}

1072: G. Hed, A. K. Hartmann, D. Stauffer and E. Domany,

1073: Phys. Rev. Lett. {\bf 86}, 3148 (2001);

1074: E. Domany, G. Hed, M. Palassini and

1075: A. P. Young, Phys. Rev. B {\bf 64}, 224406 (2001).

1076:

1077: \bibitem{books}

1078: M. M\'ezard, G. Parisi and  M. A. Virasoro,

1079: \emph{Spin Glass Theory and Beyond}

1080: (World Scientific, Singapore 1987);

1081: K. Binder and A. P. Young,

1082: Rev. Mod. Phys. {\bf 58}, 801 (1986);

1083: K. H. Fischer and J. A. Hertz,

1084: \emph{Spin Glasses}

1085: (Cambridge University Press, Cambridge, UK 1993);

1086: \emph{Spin Glasses and Random Fields},

1087: edited by A. P. Young

1088: (World Scientific, Singapore 1998).

1089:

1090: \bibitem{ultra}

1091: See for example

1092: R. Rammal, G. Toulouse and M. A. Virasoro,

1093: Rev. Mod. Phys. {\bf 58}, 765 (1986),

1094: and references therein.

1095:

1096: \bibitem{review}

1097: E. Marinari, G. Parisi, F. Ricci-Tersenghi,

1098: J. J. Ruiz-Lorenzo and F. Zuliani,

1099: J. Stat. Phys. {\bf 98}, 973 (2000).

1100:

1101: \bibitem{camapa}

1102: A. Cacciuto, E. Marinari and G. Parisi,

1103: J. Phys. A {\bf 30}, L263 (1997).

1104:

1105: \bibitem{fraric}

1106: S. Franz and F. Ricci-Tersenghi,

1107: Phys. Rev. E {\bf 61}, 1121 (2000).

1108:

1109: \bibitem{jaidub}

1110: A. K. Jain and R. C. Dubes,

1111: {\em Algorithms for Clustering Data}

1112: (Prentice-Hall, Englewood Cliffs, USA 1988).

1113:

1114: \bibitem{rogufo}

1115: K. Rose, E. Gurewitz and G. Fox,

1116: Phys. Rev. Lett. {\bf 65}, 945 (1990).

1117:

1118: \bibitem{blwido}

1119: M. Blatt, S. Wiseman and E. Domany,

1120: Phys. Rev. Lett. {\bf 76}, 3251 (1996);

1121: S. Wiseman, M. Blatt and E. Domany,

1122: Phys. Rev. E {\bf 57}, 3767 (1997).

1123:

1124: \bibitem{stibia}

1125: S. Still and W. Bialek,

1126: preprint physics/0303011 (March 2003).

1127:

1128: \bibitem{dudhar} R. O. Duda and P. E. Hart,

1129: \emph{Pattern Classification and Scene Analysis}

1130: (John Wiley \& Sons, New York 1973).

1131:

1132: \bibitem{giamar}

1133: L. Giada and M. Marsili,

1134: Phys. Rev. E {\bf 63}, 061101 (2001);

1135: Physica A {\bf 315}, 57 (2002).

1136:

1137: \bibitem{ward}

1138: J. H. Ward, Jr.,

1139: Journal of the American Statistical Association {\bf 58}, 236 (1963).

1140:

1141: \bibitem{PT}

1142: M. C. Tesi, E. J. Janse van Rensburg, E. Orlandini and

1143: S. G. Whittington,

1144: J. Stat. Phys. {\bf 82}, 155 (1996);

1145: K. Hukushima and K. Nemoto,

1146: J. Phys. Soc. Japan {\bf 65}, 1604 (1996).

1147:

1148: \bibitem{PTREV}

1149: E. Marinari,

1150: {\em Optimized Monte Carlo Methods,}

1151: in

1152: {\em Advances in Computer Simulations,}

1153: edited by J. Kert\'esz and I. Kondor

1154: (Springer-Verlag, Berlin 1998), p.50.

1155:

1156: \bibitem{hubsch}

1157: L. J. Hubert and J. Schultz,

1158: British Journal of Mathematical and Statistical Psychology

1159: {\bf 29}, 190 (1976).

1160:

1161: \bibitem{MAMAZU}

1162: E. Marinari, O. Martin and F. Zuliani,

1163: Phys. Rev. B {\bf 64}, 184413 (2001).

1164:

1165: \bibitem{CHAOS}

1166: A. Billoire and E. Marinari,

1167: Europhys. Lett. {\bf 60}, 775 (2002);

1168: A. Crisanti and T. Rizzo,

1169: Phys. Rev. Lett. {\bf 90}, 137201 (2003).

1170:

1171: \end{thebibliography}

1172:

1173: \end{document}

1174: