1: \documentclass[pre,aps]{revtex4}
2: \usepackage{psfig}
3: \newcommand{\be}{\begin{equation}}
4: \newcommand{\ee}{\end{equation}}
5: \newcommand{\e}{\emph}
6: \newcommand{\bb}{\textbf}
7: % \textwidth = 490pt
8: % \textheight = 680pt
9: % \hoffset = -10pt
10: % \voffset = 10pt
11:
12: %\linespread{1.1}
13:
14: \begin{document}
15:
16: \title{A Quantitative Clustering Approach to
17: Ultrametricity in Spin Glasses}
18:
19: \author{Stefano Ciliberti and Enzo Marinari}
20: \affiliation
21: {Dipartimento di Fisica,
22: SMC and UdR1 of INFM, INFN,
23: Universit\`a di Roma {\em La Sapienza},
24: P.le A. Moro 2, 00185 Roma, Italy}
25:
26: \begin{abstract}
27: We discuss the problem of ultrametricity in mean field spin glasses by
28: means of a hierarchical clustering algorithm. We complement the
29: clustering approach with quantitative testing: we discuss both in some
30: detail. We show that the elimination of the (in this context
31: accidental) spin flip symmetry plays a crucial role in the analysis,
32: since the symmetry hides the real nature of the data. We are able to
33: use in the analysis disorder averaged quantities. We are able to
34: exhibit a number of features of the low $T$ phase of the mean field
35: theory, and to claim that the full hierarchical structure can be
36: observed without ambiguities only on very large lattice volumes, not
37: currently accessible by numerical simulations.
38: \end{abstract}
39:
40: \date{2003, April 10th}
41:
42: \maketitle
43:
44: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
45: \section*{Happy Birthday}
46:
This paper honors Giovanni Jona-Lasinio on his birthday. We are grateful
to him, since he has taught us, as he has so many other people in Rome
and elsewhere, a lot of physics and much about how to love
good physics. Thanks, and Happy Birthday!
51:
52: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
53: \section{Introduction\label{S-INTROD}}
54:
The use of clustering methods to characterize the low temperature phase of
spin glass systems has recently been advocated in a group of very
interesting papers \cite{domany}. It is indeed well known that the
broken phase of mean field spin glasses has a high level of
complexity, which translates statically into Parisi's spontaneous Replica
Symmetry Breaking (RSB) and dynamically into a series of dramatic
phenomena that range from a severe critical slowing down for all $T<T_c$
to memory effects, aging phenomena and violations of the
fluctuation-dissipation theorem \cite{books}.
64:
Ultrametricity of states \cite{ultra} is one of the key features of
the mean field Parisi picture: states of the system turn out to be
endowed with an ultrametric distance, and the phase space is organized
hierarchically. Do finite dimensional spin glass systems share these
properties, and can we find a way to check that? This is an important
issue in the persistent debate \cite{review} about the physics of the
low temperature phase of finite dimensional spin glasses.
72:
73: Detecting ultrametricity on finite volume systems turns out to be very
74: difficult \cite{camapa,fraric}: the introduction of constrained Monte
75: Carlo methods \cite{camapa} and the analysis of the dynamical behavior
76: of the system \cite{fraric} help only marginally. Finite size effects
are very strong, and make the potential asymptotic emergence of a
78: hierarchical structure difficult to observe.
79:
80: Here we introduce some new analysis techniques and we study the
81: Sherrington-Kirkpatrick (SK) mean field model, where we know that for
82: low $T$ a non-trivial ultrametric structure emerges in the infinite
83: volume limit. We will find out that this is a difficult task, sharing
84: all the problems one observes in finite dimensional systems
85: \cite{domany,camapa}. Our main points can be summarized in four basic
86: issues:
87:
88: \begin{enumerate}
89:
\item We find that, to be more useful,
the approach based on hierarchical clustering has to be
complemented by testing techniques that have been developed
in the field of numerical taxonomy \cite{jaidub}. We discuss some of
these techniques and we show how they can be applied to our problem.
95:
96: \item We discuss the role of the $Z_2$ symmetry of the phase space. We
97: find that removing this symmetry (that in this context is accidental)
98: is crucial to get sensible results from quantitative tests. We
99: introduce and discuss the way to remove the symmetry from equilibrium
100: configurations obtained in zero magnetic field.
101:
102: \item Thanks to these techniques we are able to clarify how a finite
103: volume SK system behaves as far as ultrametricity is concerned, by
104: working out strengths and limitations of the method. We find that on
105: the (medium-large) lattice sizes that we are able to analyze one can
establish that a structure is emerging, but one cannot get
compelling evidence that this structure is ultrametric. This is
108: exactly the same kind of phenomenon one observes when studying finite
109: dimensional systems \cite{domany}.
110:
\item We systematically analyze finite size effects (by studying
systems of different sizes) and the dependence of our results
on $T$. Thanks to the quantitative analysis techniques that we
114: introduce we are able to use hierarchical clustering techniques to
115: discuss also quantities that are {\em averaged over the disorder},
116: opening in this way a large information window.
117:
118: \end{enumerate}
119:
120: The low temperature mean field behavior of spin glass systems is
121: understood in the framework of the Parisi RSB scheme \cite{books}.
122: The prototype of mean field spin glass models is the SK fully
123: connected Ising model where coupling constants are \emph{quenched}
124: random variables:
125: \begin{equation}
126: {\mathcal{H}}_J[\sigma]=-\sum_{i,k=1}^N \sigma_i J_{i,k}\sigma_k \ ,
127: \label{E-H}
128: \end{equation}
129: where $\sigma_i=\pm 1$ are spin variables and the $J_{i,k}$ are
130: distributed according to an even distribution function. For example
131: we can use a Gaussian distribution with $\overline{J_{ik}}=0$ (since
132: we want to avoid ferromagnetic effects) and
133: $\overline{J^2_{ik}}=\frac1N$ (to ensure that the energy is
extensive). As we have already recalled, the Parisi RSB solution of the
135: SK model, which is believed to be the correct solution of mean field
136: theory at low $T$, exhibits an ultrametric organization of the states
137: \cite{ultra}. This means that in the infinite volume limit for any
138: triple of equilibrium spin configurations $\alpha,\beta,\gamma$ we
139: have that:
140: \begin{displaymath}
141: q_{\alpha\beta}\geq \min
142: \{q_{\alpha\gamma}, q_{\beta\gamma}\} \ ,
143: \end{displaymath}
where $q_{\alpha\beta}$ is the overlap between
configurations $\alpha$ and $\beta$, defined as
146: \begin{equation}
147: \label{E-OVERLAP}
148: q_{\alpha\beta}\equiv\frac1{N}\sum_{i=1}^{N}
149: \sigma_i^\alpha \sigma_i^\beta
150: \end{equation}
(here $\alpha$ and $\beta$ are independent equilibrium configurations
under the same Hamiltonian: they are coupled only through sharing the
same quenched realization of the random couplings). The overlap
$q_{\alpha\beta}$ is a similarity index, and
the distance is connected to one minus the overlap.
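As a simple illustration of this last remark (a sketch, assuming the
linear map $d_{\alpha\beta}=(1-q_{\alpha\beta})/2$ that is used below as
the distance between configurations), the overlap inequality becomes the
strong triangle inequality for distances,
\begin{displaymath}
  q_{\alpha\beta}\geq \min\{q_{\alpha\gamma}, q_{\beta\gamma}\}
  \quad\Longleftrightarrow\quad
  d_{\alpha\beta}\leq \max\{d_{\alpha\gamma}, d_{\beta\gamma}\}\ ,
\end{displaymath}
since $d$ is a decreasing function of $q$. Applied to the three pairs of
a triple, this means that the two smallest overlaps have to coincide:
for example the overlaps $(0.8,\,0.3,\,0.3)$ are compatible with
ultrametricity, while $(0.8,\,0.5,\,0.3)$ are not, since $0.3<\min\{0.8,0.5\}$.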
157:
We will analyze in detail the fact that numerically revealing an
emerging ultrametric structure on finite systems is difficult. The
160: question is even more relevant since detecting reliable
161: signs of an ultrametric structure could be crucial in finite
162: dimensional systems, where the behavior of the system in the low $T$
163: phase is not yet understood \cite{review}.
164:
165: Clustering \cite{jaidub} is a powerful technique for analyzing data
166: (for interesting applications of statistical mechanical ideas to
167: clustering see \cite{rogufo,blwido,stibia}).
Since producing a valid hierarchical clustering is equivalent to showing
the existence of a true ultrametric structure of the data, this kind of
approach can give crucial evidence. We will discuss here what happens
171: in the infinite range mean field SK model, where we know that
172: eventually, in the infinite volume limit, ultrametricity of states
173: will emerge. We believe this is needed to help in interpreting the
174: results obtained in the analysis of finite dimensional models
175: \cite{domany}. We will see that some important hints do indeed
176: emerge.
177:
In this note we introduce some new ideas relevant for hierarchical
clustering as applied to the analysis of disordered and complex systems,
180: and we discuss numerical results obtained from a clustering analysis
181: of equilibrium spin glass configurations, with a particular emphasis
182: on the study of the ultrametric nature of these states. We explain
183: why a detailed analysis requires an appropriate elimination of the
184: spin flip symmetry and we investigate the dependence of our results on
185: the number of degrees of freedom of the system, showing that finite
186: size effects are actually very large.
187:
188: The paper is organized as follows. In section \ref{S-CLUSTER} we
189: introduce the clustering procedure and we explain the motivations for
190: our precise choice of a given clustering algorithm. In section
191: \ref{S-ANALYSIS} we apply this technique to the SK model; we discuss
192: our findings about ultrametricity, also by comparing them with those
193: that one obtains by using standard techniques. Here we will introduce
194: and use quantitative ways to state the significance of the results
obtained by clustering (mainly in section \ref{SS-QT}). As we said
before, a more detailed analysis requires the prior elimination of the
197: $Z_2$ symmetry, and this is done in section \ref{SS-REVERSE}: in
198: section \ref{SS-OTHER} we will also say a few words about using
199: different clustering schemes. Section \ref{S-SPINS} is dedicated to
200: the clustering of the spins. We report our conclusions in the last
201: section.
202:
203: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
204: \section{The Clustering Algorithm\label{S-CLUSTER}}
205:
Clustering is a special kind of (potentially very powerful)
classification tool. We will give here only the basic
information we need for our analysis, and we advise the reader to look
at \cite{jaidub} for further details.
210:
Let us consider a sample made of $M$ data points $x^\mu$, where each data
point $x^\mu\equiv\{ x_1^\mu,\ldots, x_N^\mu\}$ is a vector in an
$N$-dimensional space. We want to study the underlying organization
of the data, i.e. we want to find out whether the data are organized
according to some non-trivial structure. A problem of this type is
closely related to pattern recognition analysis and to Bayes decision
theory \cite{dudhar}: it is of very general interest, since it emerges
in many relevant contexts.
219:
The main ingredient for the analysis is the {\em proximity matrix}
$d_{\mu\nu}\equiv d(x^\mu,x^\nu)$, where $d(x^\mu,x^\nu)$ is some measure
of the dissimilarity of data points $\mu$ and $\nu$. It is such that
$d_{\mu\mu}=0$ and $d_{\mu\nu}=d_{\nu\mu}\geq 0$. $d$ does not need
to be a distance (for example the triangle inequality might not
be satisfied) but usually it is one.
226:
By clustering we group the data in sets that can be related to one another
in different ways. Here we will use the exclusive (each data point belongs
to exactly one cluster), intrinsic (i.e. based only on the proximity
matrix $d$) classification known as {\em hierarchical clustering}.
Hierarchical clustering is a nested sequence of partitions obtained
through a classification technique based on one of many possible
algorithms. The output of the algorithm can be represented by a
hierarchical tree (a so-called {\em dendrogram}).
235:
236: A generic (even random) set of data can always be arranged to fit a
237: tree-like structure: this is indeed what clustering does. After doing
238: such (potentially arbitrary) clustering we are left with the relevant
239: question of deciding if the hierarchical structure that has been
240: reconstructed was somehow intrinsic to the data set: this requires an
241: analysis \emph{a posteriori}.
242:
243: So, in hierarchical clustering we start from a set of data, we group
244: them by some algorithm (that we will specify in the following)
245: building in this way a hierarchical tree. Comparison of this tree and
246: the original data can lead to quantitative conclusions about the
247: presence of a true hierarchical structure in the data.
248:
249: In the course of a cluster analysis one usually faces two main
250: problems.
251:
252: \begin{itemize}
253:
\item The first important step is the definition of the dissimilarity
index $d_{\mu\nu}$, which is not always naturally induced by the
context (data do not necessarily belong to a Euclidean space).
257:
In our case this is an easy problem. Starting from the usual notion of
overlap (\ref{E-OVERLAP}), the distance between two spin configurations
can be defined naturally, for example as
261: \begin{displaymath}
262: d_{\mu\nu}\equiv\frac{1-q_{\mu\nu}}{2}\ .
263: \end{displaymath}
264:
\item The second problem is how to update distances among
elements. When we fuse elements $\alpha$ and $\beta$ into element
$\gamma$ (so joining two smaller clusters into a larger one) we have to
define the distances from the new cluster $\gamma$ to all other
clusters $\eta$ of the system. This step is crucial since it can play
a dramatic role in the structure of the iteration, even if in
situations where hierarchical clustering turns out to be {\em natural},
i.e. an intrinsic property of the data set, results have to be
independent of this issue (there exist alternative approaches that
allow one to avoid such an explicit choice by means of a priori
hypotheses \cite{blwido,giamar}).
276:
Most of our results have been obtained by the \emph{Ward
method} (or \emph{minimum variance method}) \cite{ward,jaidub}. The
method is based on minimizing the square error, and is empirically
known to outperform other hierarchical clustering methods.

When we merge the two clusters that have the smallest distance we
define the new distance using the following rule:
284: if $\rho$ and
285: $\sigma$ merge to form $\rho'$,
286: and $n_\alpha$ is the number of elements in the cluster $\alpha$,
287: then for any other cluster $\tau$:
288: \begin{equation}
289: d_{\tau\rho'}=\frac{
290: (n_\tau+n_\rho)d_{\tau\rho}+
291: (n_\tau+n_\sigma)d_{\tau\sigma}-
n_\tau\,d_{\rho\sigma}
293: }
294: {n_\tau+n_\rho+n_\sigma}\ .
295: \label{E-WARD}
296: \end{equation}
297: Let ${\mathcal{C}_\alpha}$ stand for one of the clusters of the system
298: and consider the quantity
299: \begin{displaymath}
300: S=\sum_{\mathcal{C}_\alpha}\tau(\alpha)\ ,
301: \end{displaymath}
302: where the sum is over all the clusters defined in the system
303: and where
304: \begin{equation}
305: \tau(\alpha)=\sum_{\;\mu,\nu\in
306: \mathcal{C}_\alpha} d_{\mu\nu}^2\ .
307: \label{E-TAU}
308: \end{equation}
The choice of the Ward algorithm ensures that when merging two
clusters to form a new one $S$ increases by a minimal amount. In other
words this definition of distance is the one induced by the maximum
likelihood principle.
313: \end{itemize}
314: This defines the clustering scheme that we will follow. We will
315: discuss next how these ideas can be applied to mean field spin glass
316: models, and how the result can be understood and quantified by testing
317: the cluster validity.
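
The merging rule described above can be sketched in a few lines of
Python. This is only an illustrative sketch, not the program used for the
analysis: the function names {\tt ward\_update} and {\tt ward\_clustering}
are hypothetical, and the update is written in the standard
Lance-Williams form of the Ward rule, in which the last term carries the
weight $n_\tau$. The recurrence is exact for squared Euclidean distances;
for $\pm 1$ spins $\sum_i(\sigma_i^\mu-\sigma_i^\nu)^2=4N d_{\mu\nu}$, so
that $d_{\mu\nu}=(1-q_{\mu\nu})/2$ is itself proportional to a squared
Euclidean distance.

```python
def ward_update(d_tr, d_ts, d_rs, n_t, n_r, n_s):
    """Distance from cluster t to the merger of clusters r and s (Ward rule)."""
    total = n_t + n_r + n_s
    return ((n_t + n_r) * d_tr + (n_t + n_s) * d_ts - n_t * d_rs) / total


def ward_clustering(dist):
    """Agglomerate M points given a symmetric M x M dissimilarity matrix.

    Returns the list of merges (label_a, label_b, height, new_label); new
    clusters are labelled M, M+1, ... in order of creation.
    """
    m = len(dist)
    sizes = {i: 1 for i in range(m)}
    # d[a][b] is stored for every pair of active clusters a != b.
    d = {i: {j: dist[i][j] for j in range(m) if j != i} for i in range(m)}
    merges = []
    next_label = m
    while len(sizes) > 1:
        # Find the closest pair of active clusters.
        r, s = min(
            ((a, b) for a in d for b in d[a] if a < b),
            key=lambda pair: d[pair[0]][pair[1]],
        )
        d_rs = d[r][s]
        # Distances from every other cluster to the merged one.
        new_row = {}
        for t in sizes:
            if t in (r, s):
                continue
            new_row[t] = ward_update(
                d[t][r], d[t][s], d_rs, sizes[t], sizes[r], sizes[s]
            )
        # Remove r and s, then register the merged cluster.
        for t in new_row:
            del d[t][r], d[t][s]
            d[t][next_label] = new_row[t]
        del d[r], d[s]
        d[next_label] = new_row
        sizes[next_label] = sizes.pop(r) + sizes.pop(s)
        merges.append((r, s, d_rs, next_label))
        next_label += 1
    return merges
```

The sketch searches all pairs at every step, so it is meant for small
examples only. The returned list of merges, with their heights, contains
the information needed to draw the hierarchical tree.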
318:
319: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
320: \section{Cluster Analysis of the SK Mean Field Spin Glass\label{S-ANALYSIS}}
321:
322: As we have said we have decided to analyze numerically the mean field
323: SK model. Since here the infinite volume scenario is under full
324: control we believe this is a crucial step in understanding what one can
325: learn from numerical simulations on finite lattices, and to control the
326: consequences of such results obtained on finite dimensional models
327: \cite{domany} where, on the contrary, the theoretical scenario is far
328: from clear.
329:
330: We have started by generating by an optimized Monte Carlo method a
331: large number of uncorrelated spin configurations on lattices of
332: different sizes and for a number of different realizations of the
333: quenched disorder (on which we eventually average), under the
334: Hamiltonian (\ref{E-H}), with quenched random couplings assigned under
335: a Gaussian distribution. We analyze systems with $N$, number of spins,
336: equal to $128$, $256$ and $512$ ($N=512$ is typical of a medium size
337: numerical simulation, corresponding for example to a linear size of $8$
338: in three dimensions). We thermalize our systems at a set of different
values of the temperature, typically starting from $0.1\ T_c$ (a very low
value, which we can reach only thanks to the power of parallel tempering
341: \cite{PT,PTREV}), and in all cases we analyze $20$ different
342: realizations of the quenched couplings. For all lattice sizes,
343: relevant temperatures and disorder realizations we first thermalize
the system. After doing that we record one spin configuration after
every new set of $1000$ combined full Monte Carlo sweeps and parallel
346: tempering updates of the system. The large ``computer time''
347: separation among different configurations guarantees a very high level
348: of statistical independence. Residual possible (very small)
349: correlations would not spoil our analysis but would only make it a bit
less effective. We have recorded $1024$ such independent spin
configurations for each value of the parameters: such configurations
352: are the basic set of objects that we have clustered.
353:
Parallel tempering \cite{PT,PTREV} has been crucial in bringing
spin configurations to thermal equilibrium at such low
temperature values on acceptable lattice volumes. The method is based
on simulating in parallel copies of the system at different
values of the temperature $T$, allowing the different copies to swap $T$
among them (with a standard Metropolis weight). This reduces the free
energy barriers, while always keeping the different copies at Boltzmann
equilibrium: tempering can be seen as an annealing where the basic
quantity is not energy but free energy.
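
The temperature swap can be illustrated with a minimal sketch. It assumes
the standard Metropolis rule for exchanging two copies at inverse
temperatures $\beta_1$ and $\beta_2$ with energies $E_1$ and $E_2$, which
accepts the swap with probability
$\min\{1,\exp[(\beta_1-\beta_2)(E_1-E_2)]\}$. The function names are
hypothetical, and this is not the code used for the simulations.

```python
import math
import random


def swap_probability(beta_1, energy_1, beta_2, energy_2):
    """Metropolis probability of exchanging the temperatures of two copies."""
    exponent = (beta_1 - beta_2) * (energy_1 - energy_2)
    # min(1, exp(x)) = exp(min(0, x)); capping at zero avoids overflow.
    return math.exp(min(0.0, exponent))


def attempt_swap(betas, energies, i, j, rng=random):
    """Swap the inverse temperatures of copies i and j in place if accepted."""
    p = swap_probability(betas[i], energies[i], betas[j], energies[j])
    if rng.random() < p:
        betas[i], betas[j] = betas[j], betas[i]
        return True
    return False
```

With this rule a swap is always accepted when the colder copy has the
higher energy, and it is exponentially suppressed otherwise.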
363:
We have used all standard criteria to check that, when using the
Parallel Tempering optimized Monte Carlo scheme, we have really
reached thermal equilibrium \cite{PTREV}: we have checked that our
sample dependent overlap probability distributions $P_J(q)$ are indeed
symmetric under $q\longrightarrow -q$, that all
copies of the system have visited all available
temperature values a number of times, and that the acceptance rate of the
temperature swap has been of order $0.5$.
372:
373: In the rest of this note we will work on {\em clustering} these
374: configurations and on using quantitative testing to extract the
375: implications of the hierarchical structure that we obtain.
376:
377: We first introduce a standard graphical way to get a qualitative
378: feeling about the set of data. We consider the proximity matrix $\cal
379: P$, where we have the set of data (in some order to be specified) on
380: the $x$ and on the $y$ axis, and where we plot with darker colors
381: points with higher overlap: the diagonal constitutes by definition the
382: darkest set of the matrix. In figure \ref{F-MATRIX-A} we start by
383: showing, on the left, the matrix $\cal P$ for a given disorder
384: realization at $N=512$ and $T=0.1\ T_c$ (a very low value of $T$, the
385: lowest we have analyzed: here the system is basically in its ground
386: state) where configurations have been ordered at random. A clearly
387: random pattern emerges.
388:
389: \begin{figure}
390: \centerline{\psfig{figure=F/fig1.ps,width=0.7\textwidth,angle=90}}
391: \caption{An example of the clustering procedure as applied to a very
392: low temperature set of configurations. In the left part of the figure
393: we show a proximity matrix $\cal P$ built over $M=512$ configurations
394: of $N=512$ spins at $T=0.1\ T_c$, ordered at random.
395: Darker colors correspond to smaller distances. On the right
part of the figure we draw the dendrogram that results from our
clustering, and the resulting $\cal P$.
The distance on the dendrogram is proportional to $\tau(\alpha)$
399: defined in equation (\protect\ref{E-TAU}).
400: The method recovers very well
401: the structure of two giant clusters related by the $Z_2$ symmetry.}
402: \label{F-MATRIX-A}
403: \end{figure}
404:
405: We apply the Ward algorithm to these configurations in order to obtain a
406: hierarchical tree (as we have discussed before) \footnote{For
407: clustering we have used the very flexible set of programs developed
408: by P. Kleiweg, available from {\tt http://
409: odur.let.rug.nl/$\tilde{\ }$kleiweg/clustering/clustering.html }}. The
410: hierarchical tree that contains the information about the clustering,
the so-called {\em dendrogram} \footnote{In a dendrogram longer lines
correspond to more distant clusters. In most of our drawings, when we are not
interested in analyzing this specific information, we use an
appropriate power law deformation of the scale to make the graph more
readable.}, is shown in the upper part of the right side
of figure \ref{F-MATRIX-A}. In the lower part of the right hand side
of figure \ref{F-MATRIX-A} we show the matrix obtained by ordering the
configurations \emph{as given by the dendrogram} on the $x$ and on the $y$
419: axis. Now the two reflected states appear very clearly (at such a low
420: $T$ value there are basically two $\delta$ functions at values $\pm
421: \overline{q}$, where $\overline{q}$ is close to one). We cannot
422: observe any further structure, since $T$ is too low (the ideal
423: temperature value for observing hints of ultrametric effects will turn
424: out to be, for our lattice sizes, of the order of $0.5\ T_c$). As we
425: increase the temperature we observe that well defined structures
426: emerge (see figure \ref{F-3T}, where we show results for a single
427: sample, with $N=512$, at $T=0.3\ T_c$, $T=0.5\ T_c$ and $T=2.0\ T_c$):
when we reach the critical temperature $T_c$ and go deeper into the
high temperature phase we obtain \emph{a homogeneous matrix}: here spins are
430: equally likely to be up or down, and as a consequence the overlap
431: between two configurations is zero on average.
432:
433: \begin{figure}
434: \centerline{\psfig{figure=F/fig2.ps,width=0.8\textwidth,angle=90}}
\caption{The dendrogram and the related $\cal P$ matrix obtained
436: from the clustering of $M=256$ configurations at three different
437: temperature values. On the left $T=0.3\ T_c$ (where $T$ is very low
438: and no significant structure but the $Z_2$ degeneracy can be
439: observed), in the center $T=0.5\ T_c$ (that is the best $T$ region for
440: observing the non-trivial state structure), and on the right $T=2.0\
441: T_c$, where there is no structure since we are deep in the high $T$
442: phase.}
443: \label{F-3T}
444: \end{figure}
445:
We stress that the information about the $Z_2$ symmetry is trivial
and well known, and does not give us further insight: still, it is
interesting that the clustering algorithm is able to reconstruct it.
We will discuss at length the fact that, on the other hand, the
presence of the symmetry is deeply annoying in that it makes it more
difficult to get quantitative information about the structure in one
of the two $Z_2$ sectors, hiding many features of the data, and making
interesting predictions impossible.
454:
We also use figure \ref{F-3T} to make a further point. The dendrograms,
which make it possible to visualize the hierarchical structure built by
the clustering, do not give much unambiguous information about the
underlying structure. The picture at $T=0.3\ T_c$ is not so
different, apart from some power rescaling of the lengths, from the one
at very high $T$ ($T=2.0\ T_c$) where we do not expect a non trivial
ultrametricity to appear. Clusters at high $T$ are, indeed, more
balanced, but one can only get some qualitative feeling about it.
463:
464: In the following we will work on trying to quantify the qualitative
465: statements about the possible presence of a (maybe hierarchical)
466: definite structure in the low $T$ phase.
467:
468:
469: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
470: \subsection{Quantitative Testing\label{SS-QT}}
471:
472: Before discussing our approach toward a quantitative analysis based
473: on hierarchical clustering techniques and aimed to check whether the
474: spin configurations (our original data set) are really organized
475: according to an ultrametric structure, we analyze the system by
476: applying a more standard statistical mechanical approach. Following
477: \cite{domany,camapa} we analyze the probability distribution of the
478: variable
479: \begin{displaymath}
480: k_{\mu\nu\rho}\equiv \frac{d_{\mu\nu}-d_{\mu\rho}} {d_{\nu\rho}} \ ,
481: \end{displaymath}
where we have ordered the three distances to satisfy the condition
$C\equiv\{d_{\mu\nu}\geq d_{\mu\rho}\geq d_{\nu\rho}\}$. Together with
the triangle inequality this implies
that $K\in [0,1]$. In an ultrametric space the two largest distances of
every triple coincide, so that we would get
$P(k=K|C)=\delta(K)$.
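
The measurement of $k_{\mu\nu\rho}$ over all triples can be sketched as
follows. This is a minimal illustration with a hypothetical function
name, not the code used for the analysis: sorting the three distances of
each triple implements the ordering condition $C$, and triples whose
smallest distance vanishes are skipped, a choice of this sketch that is
needed because the ratio is then undefined.

```python
from itertools import combinations


def ultrametricity_ratios(dist):
    """Return k = (d_max - d_mid) / d_min for every triple of distinct points.

    Triples whose smallest distance vanishes are skipped, since k is then
    undefined.
    """
    ratios = []
    for mu, nu, rho in combinations(range(len(dist)), 3):
        d_min, d_mid, d_max = sorted(
            (dist[mu][nu], dist[mu][rho], dist[nu][rho])
        )
        if d_min > 0:
            ratios.append((d_max - d_mid) / d_min)
    return ratios
```

A histogram of the returned values estimates $P(k=K|C)$. The number of
triples grows as $M^3$, so for large samples one would typically restrict
the loop to a random subset of triples.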
486:
On our finite $N$ lattices we assume the following dependence of $P$
on $K$:
489: \begin{displaymath}
490: P(k=K|C) \sim \exp\left\{-\frac{K^2}{2\sigma^2}\right\}\ \theta(K)\ ,
491: \end{displaymath}
492: where $\theta(\cdot)$ is the step function. We analyze the behavior
493: of the variance $\sigma^2$ with the size $N$. We show our results in
494: figure \ref{F-SIGMA}. In the upper plot we select $T=0.5\ T_c$ and we
plot $\sigma$ as a function of $N$. $\sigma$ decreases with $N$, but
very slowly (as we expected from the results of \cite{camapa}, where
even with a tuned up Monte Carlo procedure one finds that similar
analyses are very difficult): it is not even easy to get a reasonable
fit to a zero limit of $\sigma$ (although the very large statistical error
allows for it). In the inset of the upper part of the plot we show
501: $P(k=K|C)$ for one single sample. In the bottom plot we show how
502: $\sigma$ depends on $T$ on our largest lattice size, $N=512$. Nothing
503: dramatic happens when increasing $T$: again, only some qualitative low
504: key effect is taking place.
505:
506: \begin{figure}
507: \centerline{\psfig{figure=F/fig3.ps,width=0.7\textwidth,angle=270}}
\caption{In the upper part of the figure we plot the variance of
509: the distribution $P(k=K|C)$ versus the base two logarithm of the
510: number of spins $N$ at fixed temperature $T=0.5\ T_c$. In the inset
511: we plot $P(k=K|C)$ as a function of $K$ for a single sample of the
512: quenched disorder. In the lower part of the figure we plot $\sigma$
513: vs. $\frac{T}{T_c}$ for $N=512$.}
514: \label{F-SIGMA}
515: \end{figure}
516:
We now turn to the analysis of the results of our cluster
reconstruction. We have used our data (spin configurations for a given
lattice size and temperature, together with their mutual distances
obtained from their mutual overlap) to produce a hierarchical tree,
and we want to test if this tree is connected to intrinsic properties
of our data (as we have already clarified, an ultrametric tree can
always be superimposed even on random data). We will adapt standard
techniques \cite{jaidub} to judge the validity of the structure
we have found and of the statement that data are organized
according to an ultrametric structure.
527:
The general testing procedure has a simple structure: given a starting
proximity matrix $\cal P$, we end our clustering procedure with a
particular ordering of elements of $\cal P$, i.e. with a particular
permutation of $|P|$ data. This is what our clustering scheme achieves
(transforming the left part of figure \ref{F-MATRIX-A} into the right
bottom matrix). Now we have the problem of deciding if what we did was
534: sensible: we can rephrase this question by saying that we have to
535: choose between the {\em randomness hypothesis} ($H_0$: all
536: permutations of labels of $M$ are equally likely) and the {\em
537: alternative hypothesis} ($H_1$: the data have some structure that has
538: been at least partially reconstructed by the clustering). In order to
539: check that we:
540: \begin{enumerate}
541: \item define a variable $T$ that we expect to be
542: ``small'' under the null hypothesis $H_0$;
543: \item assign a {\em confidence level} $\alpha$ for $H_1$ and
544: define a threshold $t_\alpha$ by solving the equation
\begin{displaymath}
P(T\ge t_\alpha | H_0) = 1-\alpha\ ;
\end{displaymath}
548: \item measure from the data
549: the value of $T$, that we call $t^*$. If
550: \begin{enumerate}
551: \item $t^* \ge t_\alpha$ $\Rightarrow$ reject $\ H_0$ at level $\alpha$;
552: \item $t^* < t_\alpha$ $\Rightarrow$ accept $H_0$ at level $\alpha$.
553: \end{enumerate}
554: \end{enumerate}
555: $\alpha$ is a confidence level, i.e. it is connected to
556: the probability that by accepting $H_1$ as true we are not
557: making a mistake.
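
The three steps above can be sketched as a Monte Carlo test. This is a
minimal sketch under stated assumptions, with hypothetical function
names: the distribution of the statistic under $H_0$ is estimated by
recomputing it after random shuffles of the labels, and $t_\alpha$ is
read off as an empirical quantile of these values.

```python
import math
import random


def null_samples(statistic, labels, n_samples, rng):
    """Sample the statistic under H0 by shuffling the labels at random."""
    shuffled = list(labels)
    values = []
    for _ in range(n_samples):
        rng.shuffle(shuffled)
        values.append(statistic(shuffled))
    return values


def threshold(null_values, alpha):
    """Empirical t_alpha: a fraction of about 1 - alpha of the null values
    is larger than or equal to the returned value."""
    ordered = sorted(null_values)
    index = min(len(ordered) - 1, math.ceil(alpha * len(ordered)))
    return ordered[index]


def reject_h0(t_star, null_values, alpha):
    """True when the measured value t_star reaches the threshold t_alpha."""
    return t_star >= threshold(null_values, alpha)
```

Here {\tt statistic} stands for any function of the shuffled labels, and
the accuracy of $t_\alpha$ is limited by the number of shuffles.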
558:
559: The first tool that we introduce is based on \emph{Hubert's $\Gamma$
560: Statistics} \cite{jaidub,hubsch}, and it is useful to validate
561: clustering. This is done by checking the correlation of the data with
562: a structure we define {\em a priori}.
563:
564: We consider our measured distance matrix $d_{\mu\nu}$, and we
565: introduce the matrix $f_{\mu\nu}$ by
566: \begin{equation}
567: f_{\mu,\nu}=
568: \left\{
569: \begin{array}{cl}
570: 0 & \textrm{if $\mu,\nu\,{{\in}}$
571: same cluster}\\
572: 1 & \textrm{if not}
573: \end{array}
574: \right.
575: \label{E-HUBERT}
576: \end{equation}
We will study the correlations between $d_{\mu\nu}$ and
$f_{\mu\nu}$. Clearly we have also to specify the definition of {\em
being in the same cluster}. This introduces a parameter that allows us to
decide how deeply we want to test the clusterization features of the
data. We will introduce a threshold, that defines the refinement level
that we want to use to check our description.
583:
We then have to define the a priori structure that we will compare to
the data. Let us call $d_{\mbox{max}}$ the maximum distance (on the
hierarchical tree) between two configurations of our set: we say that
{\em two configurations belong to the same cluster if their distance
is smaller than a certain fraction of $d_{\mbox{max}}$, say than
${d_{\mbox{max}}}/{z}$}. We show in the right part of figure
\ref{F-HUBERT} how the number of clusters $N_c$ depends on $z$. At
very low $T$ we find a linear dependence of $N_c$ on $z$, while at
values of the order of $\frac12 T_c$, $N_c$ grows faster than linearly.
593: In figure \ref{F-HUBERT} we also show, for one sample of the quenched
594: disorder, the true distance matrix $d_{\mu,\nu}$ and four different
595: matrices $f_{\mu,\nu}$ obtained with an increasing value of $z$ (from
596: the upper left corner going rightward and then to the lower line and
597: rightward again), $z=$ $4$, $8$, $12$ and $16$. The difference among
598: the structures that we are testing in the different cases is obvious.
599: The careful reader will be able to recognize by eye that the three
600: valley structure implied by the threshold level $z=4$ can indeed be
601: found in the raw distance data of the leftmost matrix.
602:
603: \begin{figure}
604: \centerline{
605: \psfig{figure=F/fig4.ps,width=1.0\textwidth,angle=90}
606: }
\caption{On the left we plot the true distance matrix for a single
disorder sample at $T=0.5\ T_c$, and in the center four matrices
$f_{\mu,\nu}$ obtained for four different values of the threshold as
defined in equation (\protect\ref{E-HUBERT}). On the right we plot the
number of clusters $N_c$ versus $z$, i.e. how the number of
valleys depends upon the value of the threshold that we fix in order
to test the hypothesis. The dependence turns out to be linear for
small $T/T_c$, and exponential for $T\gtrsim T_c/2$.}
615: \label{F-HUBERT}
616: \end{figure}
617:
The main ingredient needed for analyzing Hubert's $\Gamma$ statistics
is the correlation function
\begin{equation}
  \Gamma=
  \frac 1{M^2}\sum_{\mu=1}^M\sum_{\nu=1}^M
  \frac{
  \left(d_{\mu,\nu}-m_{D}\right)\left(f_{\mu,\nu}-m_{F}\right)
  }{
  {s_{D}\,s_{F}}
  }\ ,
  \label{E-GAMMA}
\end{equation}
where (for $X=D,F$, $x=d,f$)
\begin{displaymath}
  m_X \equiv \frac 1{M^2}\sum_{\mu=1}^M\sum_{\nu=1}^M
  x_{\mu,\nu}\quad\quad,\quad\quad
  s^2_X \equiv \frac 1{M^2}\sum_{\mu=1}^M\sum_{\nu=1}^M
  x^2_{\mu,\nu}-m_X^2\ .
\end{displaymath}
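As a side remark (a short check of our own, added for illustration),
$\Gamma$ is the Pearson correlation coefficient of the entries of the
two matrices, so that $-1\le\Gamma\le 1$. Since $f_{\mu,\nu}$ only
takes the values $0$ and $1$ we have $f_{\mu,\nu}^2=f_{\mu,\nu}$, and
its moments depend only on the fraction $p$ of the $M^2$ entries that
are equal to $1$ (the symbol $p$ is introduced here only for this
remark):
\begin{displaymath}
  m_F = p\ ,\quad\quad
  s_F^2 = p-p^2 = p\,(1-p)\ .
\end{displaymath}
In particular $\Gamma$ is not defined when $p=0$ (all the
configurations are assigned to a single cluster), where $s_F$
vanishes.
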
Let us say that, when looking at the output of the clustering, we
observe a value of $\Gamma$ equal to $\Gamma^*$. In order to estimate
whether this value hints at a hierarchical structure that is intrinsic
to the data we have used a number of tests. The first test amounts to
little more than checking that our procedures are correct: we take as
$H_0$ the randomness hypothesis, i.e. we compare our ordered distance
matrix to a matrix where the configurations are in random order. The
observed value could turn out to be typical under $H_0$ only if our
programming was wrong. We compute a histogram $P(\Gamma|H_0)$, i.e. the
distribution of $\Gamma$ under the null hypothesis of randomness, by
evaluating
\begin{displaymath}
  \Gamma(\pi)=
  \frac 1{M^2}\sum_{\mu=1}^M\sum_{\nu=1}^M
  \frac{
  \left( d_{\mu,\nu} - m_D \right)
  \left( f_{\pi(\mu),\pi(\nu)}- m_F \right)
  }{s_D\,s_F}\ ,
\end{displaymath}
where the $\pi$ are random permutations of the $M$ configurations. A
clustering is not consistent with the hypothesis $H_0$ (in this case
the hypothesis that configurations have not been ordered) if it is
``unusual''. In order to quantify this statement, we introduce an
indicator $\Delta$ defined as
\begin{displaymath}
  \Delta\equiv
  \frac{\Gamma^*-\langle \Gamma\rangle}
  {\sqrt{\langle(\Gamma-\langle\Gamma\rangle)^2\rangle}}\ ,
\end{displaymath}
where $\Gamma^*$ is the value of $\Gamma$ that we have observed in our
sample and the averages are taken with respect to the conditional
probability distribution $P(\Gamma|H_0)$. As expected we always find
a very high value of $\Delta$ for all reasonable values of the
threshold $z$ (i.e., say, values of $z$ that produce from two to of
order one hundred valleys): $\Delta$ is of order $10^{2}$ and is only
weakly dependent on the temperature (even at $T=\infty$ this test
tells us that, yes, we had ordered the configurations, rejecting in
this way $H_0$ in a very clear cut way, since we are dealing with a
large matrix). As expected this procedure gives positive results both
on the original set of configurations and after applying the reversing
procedure described in section \ref{SS-REVERSE}.
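
In practice the averages over $P(\Gamma|H_0)$ that enter the indicator
$\Delta$ can be estimated by sampling. A minimal sketch, assuming that
one draws $n_\pi$ independent random permutations
$\pi_1,\ldots,\pi_{n_\pi}$ (the symbol $n_\pi$ and the estimators below
are illustrative notation, not a description of the actual
implementation), is
\begin{displaymath}
  \langle\Gamma\rangle\simeq\overline{\Gamma}\equiv
  \frac 1{n_\pi}\sum_{k=1}^{n_\pi}\Gamma(\pi_k)\ ,\quad\quad
  \langle(\Gamma-\langle\Gamma\rangle)^2\rangle\simeq
  \frac 1{n_\pi-1}\sum_{k=1}^{n_\pi}
  \left(\Gamma(\pi_k)-\overline{\Gamma}\right)^2\ .
\end{displaymath}
Since $\pi$ acts on both indices of $f$, it maps the set of all pairs
$(\mu,\nu)$ onto itself: $m_F$ and $s_F$ are the same for every
permutation, and only the cross term
$\sum_{\mu,\nu}d_{\mu,\nu}\,f_{\pi(\mu),\pi(\nu)}$ has to be recomputed
for each $\pi_k$.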
679:
The rest of the (more crucial) testing of Hubert's $\Gamma$
statistics has been done on the set of reversed configurations, where
the $Z_2$ symmetry has been eliminated (see section \ref{SS-REVERSE}).
We will discuss it later on, after introducing some other important
objects and methods.
685:
686: The second tool we use to establish whether the particular
687: hierarchical structure we find is the correct one is based on the
688: evaluation of the so called \emph{cophenetic correlation coefficient}
689: $\cal K$. It is defined as
690: $$
691: {\cal K}\equiv
692: \langle d\cdot d_C\rangle-\langle d\rangle \langle
693: d_C\rangle\ ,
694: $$
where the cophenetic distance $d_C(\mu,\nu)$ is measured on the
dendrogram (and because of that it is ultrametric by definition). For
example, in the case of Ward clustering, it is the quantity
defined in (\ref{E-WARD}). A high level of correlation between the
true distance and the cophenetic distance implies that the data have
an intrinsic ultrametric organization. On the contrary a low level of
correlation suggests that a true ultrametric structure cannot be
detected. $\cal K$ is a natural measure of the ultrametricity built
into our data set.
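
For reference, we recall that a distance is ultrametric when, for any
three configurations $\mu$, $\nu$ and $\lambda$, it satisfies the
strong triangle inequality
\begin{displaymath}
  d_C(\mu,\nu)\le\max\left\{d_C(\mu,\lambda),\,d_C(\lambda,\nu)\right\}\ ,
\end{displaymath}
so that every triangle is isosceles, with the two longest sides equal.
We also note that, as written, $\cal K$ is a covariance. The values of
$\cal K$ close to one that are quoted in the following are naturally
read in terms of the normalized coefficient
\begin{displaymath}
  \frac{\langle d\cdot d_C\rangle-\langle d\rangle\langle d_C\rangle}
  {s_{D}\,s_{D_C}}\ ,
\end{displaymath}
with $s_{D_C}$ defined for $d_C$ in the same way as $s_D$ is defined
for $d$: this normalization (and the symbol $s_{D_C}$) is an assumption
of this remark, made by analogy with equation (\ref{E-GAMMA}).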
704:
If we try to analyze our original configuration set without removing
the $Z_2$ symmetry (each configuration ${\cal C}$ has a corresponding
configuration $-{\cal C}$ which appears with the same probability)
we measure a high value of $\cal K$, always higher than $0.97$.
Interpreting this result as a confirmation of the detection of an
ultrametric structure would be wrong: the $Z_2$ symmetry implies a very
primitive form of hierarchical organization (states are grouped in two
well separated sectors of the phase space) and on finite, medium size
volumes, this is what we are measuring.
714:
715: \begin{figure}
716: \centerline{\psfig{figure=F/fig5.ps,width=0.8\textwidth,angle=270}}
717: \caption{Plot of the true distance $d(i_0,j)$ (solid lines with wiggles) and
718: of the ultrametric cophenetic distance $d_C(i_0,j)$ (solid straight lines)
719: versus $j$ for different values of $i_0$.}
720: \label{cfr}
721: \end{figure}
722:
723: One way to clarify this issue is to look at figure \ref{cfr}, where we
724: plot, for a given sample of the quenched disorder, at $N=512$ and low
725: temperature $T=0.3 T_c$, both the true distance $d(i_0,j)$ and the
726: cophenetic distance $d_C(i_0,j)$ as a function of $j$ for various
values of $i_0$. It is clear that the $Z_2$ symmetry makes the two
distances similar in a trivial way, by producing the same step in
both of them: this is the reason why ${\cal K}\lesssim 1$. The real
physical differences are in the wavy behavior of the true distance: it
is its difference from the constant behavior of the cophenetic distance
that has to be analyzed. This is what we will do in the next section.
733:
734: We will now apply a spin reversal procedure that allows us to obtain a
735: set of configurations that have, in the infinite volume limit, a
736: positive definite mutual overlap. This is a very useful procedure
737: \cite{MAMAZU} that makes our set of configurations equivalent to a set
738: of configurations obtained in an infinitesimal magnetic field (without
739: the drawback of having to keep under control the smallness of the
field). Only after doing that will we come back to the evaluation of
the cophenetic coefficient $\cal K$.
742:
743: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
744: \subsection{The Reversing Procedure and Our Main Results\label{SS-REVERSE}}
745:
746: \begin{figure}
747: \centerline{\psfig{figure=F/fig6.ps,width=0.6\textwidth,angle=270}}
748: \caption{The probability distribution $P_J(q)$
749: for different realizations of the quenched disorder ($T=0.4$),
750: before and after applying the reversing procedure.
751: Here we use $M=512$ configurations of a $N=512$ spin system.}
752: \label{pdq}
753: \end{figure}
754:
755: In the infinite volume limit the question of identifying in our set of
756: configurations two subsets, $|+\rangle$ and $|-\rangle$ is well
757: posed. After doing that we can flip all signs of the configurations in
758: $|-\rangle$, obtaining in this way a set of configurations with a
759: positive definite overlap.
760:
761: We use here the approach introduced in \cite{MAMAZU}. We take one
762: configuration as starting point, $\cal S$. We consider now a new
763: configuration, and if its overlap with $\cal S$ is negative we flip
764: it. For a third configuration we consider the average overlap with the
765: first two, and we flip it if this is negative. We do that for all
766: configurations. This procedure works quite well, and it can be
767: improved in a number of ways (for example we can repeat it by starting
768: from the new set and considering a different reference configuration
769: and a different order).
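
The sequential rule can be written compactly. As a sketch in a
notation of our own (the symbols $\tilde\sigma^\mu$, $\epsilon_\mu$ and
$Q_\mu$ are introduced here only for illustration), let
$q(\sigma,\tau)\equiv\frac1N\sum_{i=1}^N\sigma_i\tau_i$ be the overlap
of two configurations, and let the reference configuration be
$\tilde\sigma^1=\sigma^1$. For $\mu=2,\ldots,M$ we compute
\begin{displaymath}
  Q_\mu\equiv\frac 1{\mu-1}\sum_{\nu=1}^{\mu-1}
  q(\sigma^\mu,\tilde\sigma^\nu)\ ,\quad\quad
  \epsilon_\mu=\left\{
  \begin{array}{rl}
  +1 & \textrm{if $Q_\mu\ge 0$}\\
  -1 & \textrm{if $Q_\mu<0$}
  \end{array}
  \right.\ ,\quad\quad
  \tilde\sigma^\mu=\epsilon_\mu\,\sigma^\mu\ ,
\end{displaymath}
so that each processed configuration has a non negative average
overlap with the ones processed before it. Since the overlap is linear
in its arguments, $Q_\mu$ is the overlap of $\sigma^\mu$ with the site
by site average of the configurations already processed: keeping this
running average makes the cost of one sweep of order $N M$ rather than
$N M^2$.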
770:
771: In figure \ref{pdq} we show the $P_J(q)$ for several samples, before
772: and after the reversing procedure. It is clear that the procedure
773: works quite well. The main problems are for samples where different
774: valleys are quite similar (we are on finite lattices and there are
intrinsic ambiguities that disappear in the thermodynamic limit). A
good example of a troublesome sample is the second sample from the
top on the right, where the reconstructed $P_J(q)$ has, even after our
reversal procedure, a long tail at negative $q$ values. We have
verified (see also \cite{MAMAZU}) that when increasing the volume
these spurious effects become smaller.
781:
782: We have also found that a second effective approach to the separation
783: of the phase space is based on using the same clusterization procedure
784: we will eventually use for analyzing the hierarchical structure. We
785: first use clusterization (based for example on the Ward algorithm) to
786: identify the two $Z_2$ subsets. We then flip all spins of all
787: configurations of one of the two, and repeat the clusterization to
788: find a new (hopefully faithful) hierarchical structure. This second
789: approach gives results that are very similar to the ones of the first
790: approach \cite{MAMAZU} that we have discussed before: for example the
791: resulting $P_J(q)$ are basically indistinguishable.
792:
793: In the following we will use spin configurations {\em ``reversed''}
794: using this technique.
795:
796: \begin{figure}
797: \centerline{
798: \psfig{figure=F/fig7A.ps,width=0.5\textwidth}
799: \psfig{figure=F/fig7B.ps,width=0.5\textwidth}
800: }
\caption{Proximity matrix for two $N=512$ samples in the left and
right parts of the plot (at $T=0.2\ T_c$ on the left for each of the
two samples and at $T=0.6\ T_c$ on the right for each of the two
samples) ordered according to the output of the clustering procedure
(i.e. as from the dendrogram, at the bottom) and the corresponding
cophenetic matrix implied by the same dendrogram (at the top).}
807: \label{A16}
808: \end{figure}
809:
In figure \ref{A16} we show the proximity matrix for two $N=512$
samples (at $T=0.2\ T_c$ and at $T=0.6\ T_c$) ordered according to the
output of the clustering procedure (i.e. as from the dendrogram) and
the corresponding cophenetic matrix implied by the same dendrogram.
814:
815: \begin{figure}
816: \centerline{\psfig{figure=F/fig8.ps,width=0.8\textwidth,angle=0}}
817: \caption{ In figures \protect\ref{corr}.a, \protect\ref{corr}.b and
818: \protect\ref{corr}.c we plot $\cal K$ as a function of $\frac{T}{T_c}$
819: for $N=128$, $N=256$ and $N=512$. In figure \protect\ref{corr}.d we
plot $\langle\Gamma\rangle$ versus the assumed density of valleys,
i.e. the number of valleys divided by the number of configurations
$M$: a large difference from the high $T$ data implies a plausible
823: hypothesis. In figure \protect\ref{corr}.e we compare single and
824: complete link clustering: see the text for further details.}
825: \label{corr}
826: \end{figure}
827:
When the hierarchical, ultrametric structure is intrinsic to the data
set, the matrices in the bottom line of figure \ref{A16} become equal
to the ones in the top line. Now that the accidental $Z_2$
symmetry has been removed we are able to look at the real, relevant
physical effects. We have investigated the issue in a systematic
way. We average over $20$ different quenched realizations of the
disorder, and analyze the system for different lattice volumes as a
function of the temperature.
836:
In figures \ref{corr}.a, \ref{corr}.b and \ref{corr}.c we plot $\cal
K$ as a function of $\frac{T}{T_c}$ for $N=128$, $N=256$ and $N=512$.
The upper sets of points with smaller errors are from the analysis
done {\em before} the spin reversal ($Z$), the lower sets of points
with larger errors are from the analysis of the spin reversed
configurations ($R$). We have already discussed the fake detection of
ultrametricity induced by the $Z_2$ symmetry. We discuss now the data
obtained after removing the symmetry. In no case does clear evidence
for the existence of a true ultrametric structure emerge. $\cal K$ is
always small, and for $T<T_c$ it does not even increase clearly with
$N$ (finite size effects are very large and uncontrolled). It is
interesting that in the set of $Z$ data the phase transition is
detected quite clearly (but, as we have explained, what we observe is
not connected to a hierarchical structure, but only to the
usual breaking of the $Z_2$ symmetry). At high $T$ values, for $T>T_c$,
the $Z$ and the $R$ sets of data coincide: here there is one single
state.
854:
855: This analysis shows clearly that on medium size lattices it is
856: impossible to detect more than hints toward a hierarchical structure:
857: in our mean field model we know that ultrametricity will eventually
858: emerge, but very large lattices are needed for that.
859:
In figure \ref{corr}.d we try a further test to improve the level of
our quantitative understanding. We could phrase our goal by saying
that we are trying to understand how many valleys we can be sure are
present in the phase space (we repeat that since we are studying the
mean field Sherrington-Kirkpatrick theory in the Parisi broken phase
we know that asymptotically an infinite number of such valleys will
emerge). We go back to $\Gamma$ defined in equation (\ref{E-GAMMA}). At
different $T$ values we change the threshold value $z$ and monitor the
number of valleys we are building for a given $z$ value (this depends
on $T$: we have discussed this procedure when commenting on figure
\ref{F-HUBERT}). We measure $\langle\Gamma\rangle$ and we plot it
versus the average number of valleys per sample (all data are for
reversed configurations, except for one set of non-reversed data at
$T=0.5 T_c$ that we plot for the sake of comparison). We use the high
$T$ ($T=1.9 T_c$) curve as a reference curve, and we consider it as the
randomness threshold: if at a given temperature $T$ the value of
$\langle\Gamma\rangle$ is very different from the high $T$ value we
consider that as evidence for the existence of this number of valleys.
878:
Using the high $T$ limit as the reference line looks to us like a
sensible choice (we have already discussed that using unordered
matrix lines is basically just a check of the correctness of our
procedure). If, for example, we select a value of the $x$ variable
(number of clusters divided by $M$) $x=0.002$, that in the case of
$N=512$ assumes the presence of two valleys ({\em after} removal of
the $Z_2$ symmetry) we see that at low $T$ the data are quite
different from the high $T$ ones, suggesting that we are probably
already detecting this (quite low) level of organization. When we try
a threshold implying a larger number of valleys (already for example
for three or four valleys on our largest lattice, $N=512$) the data
are not far from the high $T$ ones, implying a failure in supporting
the hypothesis.
892:
893: We will discuss figure \ref{corr}.e in the next section.
894:
895: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
896: \subsection{Other Clustering Algorithms\label{SS-OTHER}}
897:
As we have discussed in some detail in section \ref{S-CLUSTER} the
cluster reconstruction algorithm is defined by selecting the rule used
to join two elements at different levels of the partitioning, and to
update the distance matrix after each step of refining the
partitioning level.
903:
904: In our analysis we have used the Ward scheme \cite{ward} (that updates
905: the distances as in equation \ref{E-WARD}): this is believed to be an
906: optimal choice when there is no information \emph{a priori} on the
907: data \cite{jaidub}.
908:
Basic clustering algorithms are the {\em single link} scheme and the
\emph{complete link} one. We will not go into much detail here (see
\cite{jaidub} for further information), but let us say that in the
single link scheme one just demands a weak connectivity to merge two
subsets, and joins them to form a new cluster as early as possible,
while in the complete link scheme the opposite happens, and subsets
are joined to form a new cluster ``as late as possible''. Both methods
have advantages and drawbacks. The crucial observation that we will
use now is that when a real hierarchical structure is present all
these methods end up giving the same result, and reconstructing the
same classification.
920:
In these two algorithms, if as before $\rho$ and $\sigma$ merge to
form the new cluster $\rho'$, we have that for all other clusters $\tau$:
923: \begin{eqnarray*}
924: d_{\tau,\rho'}=&
925: \min\{d_{\tau,\rho},d_{\tau,\sigma}\}&
926: \;\;\;\;\;\textrm{(single link)}\ ,\\
927: d_{\tau,\rho'}=&
928: \max\{d_{\tau,\rho},d_{\tau,\sigma}\}&
929: \;\;\;\textrm{(complete link)}\ .
930: \end{eqnarray*}
The reason for the names is in the graph theory interpretation of the
algorithms \cite{jaidub}. As we have already said it is not difficult
to show that if the true distance matrix is actually ultrametric the
optimal permutation with respect to these two algorithms is exactly
the same.
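
The argument can be sketched as follows (this is a short derivation of
our own, given for illustration; possible ties among equal distances
are ignored). Suppose that the current distance matrix is ultrametric,
i.e. that $d_{\alpha,\gamma}\le\max\{d_{\alpha,\beta},d_{\beta,\gamma}\}$
for all triples, and that $\rho$ and $\sigma$ are the closest pair, so
that $d_{\rho,\sigma}\le d_{\tau,\rho}$ and
$d_{\rho,\sigma}\le d_{\tau,\sigma}$ for every other cluster
$\tau$. Then
\begin{eqnarray*}
  d_{\tau,\rho}&\le&
  \max\{d_{\tau,\sigma},d_{\sigma,\rho}\}=d_{\tau,\sigma}\ ,\\
  d_{\tau,\sigma}&\le&
  \max\{d_{\tau,\rho},d_{\rho,\sigma}\}=d_{\tau,\rho}\ ,
\end{eqnarray*}
which gives $d_{\tau,\rho}=d_{\tau,\sigma}$: the $\min$ and the $\max$
of the two update rules coincide. The updated matrix is the old one
with the row and the column of $\sigma$ removed, hence it is again
ultrametric and the argument can be iterated. The two schemes perform
the same merges at the same heights.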
936:
In this framework we have introduced a last test of the structure of
our data: we check how different the outputs of the two algorithms
are, to try to understand if we can detect further hints of an emerging
ultrametric structure. We have analyzed 20 samples at several
temperature values, and we show in figure \ref{corr}.e the average
correlation between the two output distance matrices, that is
943: \begin{displaymath}
944: \omega
945: \equiv \overline{\langle d_{SL}\cdot d_{CL}\rangle} \ .
946: \end{displaymath}
The correlation is very high at low $T$, and decreases toward the high
$T$ value around $T\sim 0.8\ T_c$. Again, on medium size lattices
we can detect hints toward an emerging ultrametric structure but we
cannot in any way get a clear cut answer.
951:
952: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
953: \section{Clustering the Spins\label{S-SPINS}}
954:
955: \begin{figure}
956: \centerline{\psfig{figure=F/fig9.ps,width=0.8\textwidth}}
\caption{Clustering the spins: for a given sample of the quenched
disordered couplings we look at the spins of our configurations as a
set of $N=512$ elements (one per lattice site), each element
being an $M=512$ dimensional vector (all the values
taken by the spin in the given site on our $M$ independent
configurations). After clustering these data vectors we plot the
distance matrix $d_{ij}$ between spin $i$ and spin $j$ according to
the ordering found by the clustering. The plots correspond to $T=0.1\
T_c, T=0.2\ T_c, ...,0.9\ T_c$. At very low temperatures a large
($O(N)$) spin domain structure emerges. The structure disappears when
increasing the temperature.}
968: \label{spin}
969: \end{figure}
970:
An interesting question (discussed in detail in \cite{domany})
concerns a possible clustering of the {\em spins} of our system.
The issue is clearly very relevant in the finite dimensional systems
studied in \cite{domany}, where spatial structures can be very
important. Here, in mean field, there is no notion of distance, but
still spins can be aggregated in different groups that have different
degrees of correlation.
978:
We will look for the possible presence of some kind of structure (in
this case not hierarchical since there is no reason for this) now in
the space of the elementary spins rather than in configuration space.
In the analysis of configurations we were considering the $N\times M$
data matrix $\{\sigma_i^\mu\}$ as representing $M$ configurations,
where each data point was an $N$-dimensional vector. Now we change
our point of view; we regard each of the $N$ spins as a data point,
that is as a vector in an $M$-dimensional space. Since we expect
highly correlated spins to be in the same cluster, following
\cite{domany} we define the distance between spin $i$ and spin $j$ as
991: $$
992: d_{ij}=1-c_{ij}^2\ ,
993: $$
994: where
995: $$
996: c_{ij}\equiv\langle \sigma_i\sigma_j\rangle
997: \equiv\frac 1M\sum_{\mu=1}^M \sigma_i^\mu\sigma_j^\mu
998: $$
999: is the spin correlation matrix that we can evaluate using our
1000: spin configurations generated in a Monte Carlo run.
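
A few properties of this distance follow directly from the definitions
and are worth making explicit (a short check of our own, added for
illustration). Since $\sigma_i^\mu=\pm 1$ we have $c_{ii}=1$ and
$|c_{ij}|\le 1$, so that
\begin{displaymath}
  d_{ii}=0\ ,\quad\quad 0\le d_{ij}\le 1\ .
\end{displaymath}
Moreover $c_{ij}$ does not change when a whole configuration is
reversed, $\sigma^\mu\to-\sigma^\mu$, since both spins in the product
$\sigma_i^\mu\sigma_j^\mu$ change sign: $d_{ij}$ is the same on the
original and on the reversed set of configurations. Because $c_{ij}$
enters squared, perfectly anticorrelated spins ($c_{ij}=-1$) are at
zero distance exactly like perfectly correlated ones, while for two
statistically independent spins $c_{ij}$ is of order $M^{-1/2}$ and
$d_{ij}$ is close to one.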
1001:
1002: It is interesting to follow the evolution in temperature
1003: of the ordered spin matrix for a given sample: we show it in figure
\ref{spin}. At intermediate temperature values a large group of spins
is clearly very correlated: here $O(N)$ spins are grouped
together. This structure disappears at high $T$ values. It is
1007: remarkable how this picture is similar to figure 11.d of the second
1008: paper of reference \cite{domany}. This is a severe warning against
1009: misleading interpretations of the data analysis: here we are in mean
1010: field, and there are no spatial local domains.
1011:
1012:
1013: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1014: \section{Conclusions\label{S-CONCLUSIONS}}
1015:
The configuration space of an $N$-spin system contains $2^N$ points,
and it is very difficult to represent it in a way that captures the
main physical features~\footnote{Only for limited purposes a principal
component analysis (PCA) can be adapted to help in this task
\protect\cite{domany}.}. We have shown that cluster analysis not only
allows one to visualize in a physically meaningful way the structure of
the configuration space, but also allows for quantitative testing of a
priori hypotheses about the structure of the data set.
1024:
We have discussed the role of the $Z_2$ symmetry of the system, and
how its removal is necessary to study the relevant physical
phenomena. Our main point is that quantitative testing is mandatory to
make clustering techniques a useful tool. We have introduced some
of these techniques by designing tests that are useful in the
context of (disordered) statistical mechanics.
1031:
As a crucial benchmark we have analyzed the mean field theory in the
low $T$ replica broken phase, where we know that eventually, in the
infinite volume limit, a hierarchical structure of states emerges. We
are able to observe many hints toward the emergence of such a
structure, but on the lattice sizes where we are able to work these
indications cannot be considered as unambiguous. Detecting
ultrametricity is very difficult, and demands very large lattice sizes:
this turns out to be true in mean field, and we expect it to be
probably true also in finite dimensional models, where the very
existence of mean field like states is still to be checked. We believe
that the findings and the techniques that we have reported here will be
important to use in the finite dimensional context. Like many other
features (we have in mind for example temperature chaos \cite{CHAOS},
that is very difficult to detect numerically and emerges only at very
high orders in perturbation theory) ultrametricity emerges, already in
mean field, only on very large lattices.
1048:
We also believe it is important that in this ``quantitative'' approach
to clustering we have been able to introduce a natural way to consider
not only sample dependent but also disorder averaged quantities.
1052:
1053: A next step is to apply, by continuing the work of \cite{domany},
1054: these techniques to finite dimensional disordered systems (defined on
1055: very large lattices!) on the one side and to glassy systems on the
1056: other side: since here a crucial goal is to try to understand the
1057: details of the spatial, time dependent organization of the system,
1058: techniques like the ones introduced here could turn out to be very
1059: useful.
1060:
1061: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1062: \section*{Acknowledgments\label{S-ACK}}
1063:
1064: We acknowledge the precious contribution of Loredana Correale to a
1065: first phase of this work. We thank Eytan Domany and Peter Young for many
1066: useful conversations that have motivated us toward this problem.
1067:
1068: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1069: \begin{thebibliography}{99}
1070:
1071: \bibitem{domany}
1072: G. Hed, A. K. Hartmann, D. Stauffer and E. Domany,
1073: Phys. Rev. Lett. {\bf 86}, 3148 (2001);
1074: E. Domany, G. Hed, M. Palassini and
1075: A. P. Young, Phys. Rev. B {\bf 64}, 224406 (2001).
1076:
1077: \bibitem{books}
1078: M. M\'ezard, G. Parisi and M. A. Virasoro,
1079: \emph{Spin Glass Theory and Beyond}
1080: (World Scientific, Singapore 1987);
1081: K. Binder and A. P. Young,
1082: Rev. Mod. Phys. {\bf 58}, 801 (1986);
1083: K. H. Fischer and J. A. Hertz,
1084: \emph{Spin Glasses}
1085: (Cambridge University Press, Cambridge, UK 1993);
1086: \emph{Spin Glasses and Random Fields},
1087: edited by A. P. Young
1088: (World Scientific, Singapore 1998).
1089:
1090: \bibitem{ultra}
1091: See for example
1092: R. Rammal, G. Toulouse and M. A. Virasoro,
1093: Rev. Mod. Phys. {\bf 58}, 765 (1986),
1094: and references therein.
1095:
1096: \bibitem{review}
1097: E. Marinari, G. Parisi, F. Ricci-Tersenghi,
1098: J. J. Ruiz-Lorenzo and F. Zuliani,
1099: J. Stat. Phys. {\bf 98}, 973 (2000).
1100:
1101: \bibitem{camapa}
1102: A. Cacciuto, E. Marinari and G. Parisi,
1103: J. Phys. A {\bf 30}, L263 (1997).
1104:
1105: \bibitem{fraric}
1106: S. Franz and F. Ricci-Tersenghi,
1107: Phys. Rev. E {\bf 61}, 1121 (2000).
1108:
1109: \bibitem{jaidub}
1110: A. K. Jain and R. C. Dubes,
1111: {\em Algorithms for Clustering Data}
1112: (Prentice-Hall, Englewood Cliffs, USA 1988).
1113:
1114: \bibitem{rogufo}
1115: K. Rose, E. Gurewitz and G. Fox,
1116: Phys. Rev. Lett. {\bf 65}, 945 (1990).
1117:
1118: \bibitem{blwido}
1119: M. Blatt, S. Wiseman and E. Domany,
1120: Phys. Rev. Lett. {\bf 76}, 3251 (1996);
1121: S. Wiseman, M. Blatt and E. Domany,
1122: Phys. Rev. E {\bf 57}, 3767 (1997).
1123:
1124: \bibitem{stibia}
1125: S. Still and W. Bialek,
1126: preprint physics/0303011 (March 2003).
1127:
1128: \bibitem{dudhar} R. O. Duda and P. E. Hart,
1129: \emph{Pattern Classification and Scene Analysis}
1130: (John Wiley \& Sons, New York 1973).
1131:
1132: \bibitem{giamar}
1133: L. Giada and M. Marsili,
1134: Phys. Rev. E {\bf 63}, 061101 (2001);
1135: Physica A {\bf 315}, 57 (2002).
1136:
1137: \bibitem{ward}
1138: J. H. Ward, Jr.,
1139: Journal of the American Statistical Association {\bf 58}, 236 (1963).
1140:
1141: \bibitem{PT}
1142: M. C. Tesi, E. J. Janse van Rensburg, E. Orlandini and
S. G. Whittington,
J. Stat. Phys. {\bf 82}, 155 (1996);
K. Hukushima and K. Nemoto,
J. Phys. Soc. Japan {\bf 65}, 1604 (1996).
1147:
1148: \bibitem{PTREV}
1149: E. Marinari,
1150: {\em Optimized Monte Carlo Methods,}
1151: in
1152: {\em Advances in Computer Simulations,}
1153: edited by J. Kert\'esz and I. Kondor
1154: (Springer-Verlag, Berlin 1998), p.50.
1155:
1156: \bibitem{hubsch}
1157: L. J. Hubert and J. Schultz,
1158: British Journal of Mathematical and Statistical Psychology
1159: {\bf 29}, 190 (1976).
1160:
1161: \bibitem{MAMAZU}
1162: E. Marinari, O. Martin and F. Zuliani,
1163: Phys. Rev. B {\bf 64}, 184413 (2001).
1164:
1165: \bibitem{CHAOS}
1166: A. Billoire and E. Marinari,
1167: Europhys. Lett. {\bf 60}, 775 (2002);
1168: A. Crisanti and T. Rizzo,
1169: Phys. Rev. Lett. {\bf 90}, 137201 (2003).
1170:
1171: \end{thebibliography}
1172:
1173: \end{document}
1174: