cs0402030/paper.tex
1: \documentclass[runningheads]{llncs}
2: \usepackage{epsfig}
3: \usepackage{subfigure}
4: \usepackage{wrapfig}
5: \usepackage{url}
6: \usepackage{psfrag}
7: 
8: %\newcommand{\mysection}[1]{\vspace*{-0.75ex}\section{#1}\vspace*{-0.85ex}}
9: %\newcommand{\mysubsection}[1]{\vspace*{-0.5ex}\subsection{#1}\vspace{-0.5ex}}
10: 
11: \hyphenation{cross-over}
12: \hyphenation{re-com-bi-na-tion-based}
13: 
14: \begin{document}
15: 
16: \pagestyle{headings}
17: \mainmatter
18: 
19: \title{Computational Complexity and Simulation of Rare Events of Ising Spin Glasses}
20: \titlerunning{Computational Complexity of Spin Glasses}
21: 
22: \author{Martin Pelikan\inst{1}, Jiri Ocenasek\inst{2}, Simon Trebst\inst{2}, Matthias Troyer\inst{2}, \\and Fabien Alet\inst{2}}
23: \authorrunning{Martin Pelikan et al.}
24: 
25: \institute{Dept. of Math. and Computer Science, 320 CCB\\
26: University of Missouri at St. Louis\\
27: 8001 Natural Bridge Rd.,
28: St. Louis, MO 63121\\
29: \email{pelikan@illigal.ge.uiuc.edu}
30: \vspace*{1ex}
31: \and
32: Computational Laboratory (CoLab)\\
33: Swiss Federal Institute of Technology (ETH)\\
34: CH-8092 Z\"{u}rich, Switzerland\\
35: \email{jirio@inf.ethz.ch}\\
36: \email{\{trebst,troyer,falet\}@comp-phys.org}}
37: 
38: \maketitle
39: 
40: %===========================================================================================================
41: 
42: \begin{abstract}
43: \begin{sloppy}
44: We discuss the computational complexity of random 2D Ising spin
45: glasses, which represent an interesting class of constraint
46: satisfaction problems for black box optimization. Two extremal
47: cases are considered: (1) the $\pm J$ spin glass, and (2) the
48: Gaussian spin glass. We also study a smooth transition between
49: these two extremal cases. The computational complexity of all
50: studied spin glass systems is found to be dominated by rare events
51: of extremely hard spin glass samples. We show that the complexity of
52: all studied spin glass systems is closely related to the Fr\'echet
53: extremal value distribution. In a hybrid algorithm that combines
54: the hierarchical Bayesian optimization algorithm (hBOA) with a
55: deterministic bit-flip hill climber, the numbers of steps performed
56: by both the global searcher (hBOA) and the local searcher follow
57: Fr\'echet distributions. Nonetheless, unlike in methods
58: based purely on local search, the parameters of these
59: distributions confirm good scalability of hBOA with local search.
60: We further argue that standard performance measures for
61: optimization algorithms---such as the average number of
62: evaluations until convergence---can be misleading. Finally, our
63: results indicate that for highly multimodal constraint
64: satisfaction problems, such as Ising spin glasses,
65: recombination-based search can provide qualitatively better
66: results than mutation-based search.
67: \end{sloppy}
68: \end{abstract}
69: 
70: %===========================================================================================================
71: 
72: \section{Introduction}
73: 
74: The spin glass problem is a long-standing but still intensively
75: studied problem in physics~\cite{Mezard:86}. First, experimental
76: realizations of spin glass systems do exist and their properties,
77: in particular their dynamics, are still not well explained.
78: Second, spin glasses pose a challenging, unsolved problem in theoretical
79: physics since the nature of the spin glass state at low
80: temperatures is not understood. It is widely believed that this is
81: due to the intrinsic complexity of the rough energy landscape of
82: spin glasses.
83: 
84: In statistical physics, a common goal is to calculate a desired
85: quantity (e.g., magnetization) averaged over the distribution of
86: configurations of a spin glass system at a given temperature. The
87: probability of observing a specific spin configuration, $C$, of
88: the spin glass is governed by the Boltzmann distribution; that is,
89: it is inversely proportional to the exponential of the
90: ratio of its energy to the temperature: $p(C) \sim \exp(-E(C)/T)$.
91: Thus, as temperature decreases, the distribution of possible
92: configurations of the spin glass concentrates near the
93: configurations with minimum energy, which are also called ground
94: states. The ground-state properties capture most of the
95: low-temperature physics, and it is therefore of great interest to find
96: and study them.
97: 
98: From another perspective, spin glasses represent an interesting
99: class of problems for black-box optimization where the task is to
100: find ground states of a given spin glass sample, because the
101: energy landscape in most spin glasses exhibits features that make
102: it a challenging optimization benchmark. One of these features is
103: the large number of local optima, which often grows exponentially
104: with the number of decision variables (spins) in the problem.
105: Because of the large number of local optima, search that relies only on
106: local operators, such as mutation, is almost always intractable.
107: 
108: In this paper we present, analyze, and discuss a series of
109: experiments on 2D Ising spin glasses. Random spin glass instances
110: for a fixed lattice geometry (square lattice) are generated by
111: randomly sampling a fixed distribution of coupling constants. We
112: distinguish two basic classes of random 2D Ising spin glass
113: systems: (1) coupling constants are initialized
114: randomly to either $+1$ or $-1$, and (2) coupling constants are generated from a zero-mean Gaussian
115: distribution. A transition between these two cases is also
116: considered. We apply the hierarchical Bayesian optimization
117: algorithm (hBOA) with local search to all considered classes of
118: spin glasses, and provide a thorough statistical analysis of hBOA
119: performance on a large number of problem instances in each class.
120: The results are discussed in the context of state-of-the-art Monte
121: Carlo methods, such as the Wang-Landau algorithm \cite{Wang:01}
122: and the multicanonical method \cite{Berg:92}. Finally, we identify
123: important lessons from this work for genetic and evolutionary
124: computation.
125: 
126: Section~\ref{section-background} presents a short review of the hierarchical
127: Bayesian optimization algorithm and the extremal value distributions
128: used in the statistical analysis. In
129: Section~\ref{section-methodology} we define the 2D Ising spin
130: glass systems analyzed in this work, and introduce several
131: classes of random spin glass instances.
132: Section~\ref{section-experiments} presents experimental
133: methodology and results. Finally,
134: Section~\ref{section-conclusions} summarizes and concludes the
135: paper.
136: 
137: %===========================================================================================================
138: 
139: \section{Numerical methods and statistical analysis}
140: \label{section-background}
141: 
142: This section briefly discusses the hierarchical Bayesian
143: optimization algorithm (hBOA)~\cite{Pelikan:01*,Pelikan:03b} and
144: extremal value distributions, which will be used to analyze
145: experimental results.
146: 
147: \subsection{Hierarchical Bayesian optimization algorithm (hBOA)}
148: The hierarchical Bayesian optimization algorithm
149: (hBOA)~\cite{Pelikan:01*,Pelikan:03b} is one of the most advanced
150: genetic and evolutionary algorithms based primarily on selection
151: and recombination. hBOA evolves a population of candidate
152: solutions to a given problem. Using a {\em population} of
153: solutions as opposed to a single solution has several advantages;
154: for example, it enables simultaneous exploration of multiple
155: regions in the search space, it can help to alleviate the effects
156: of noise in evaluation, and it allows the use of statistical and
157: learning techniques to identify regularities in the black-box
158: optimization problem under consideration.
159: 
160: The first population of candidate solutions is usually generated
161: according to the uniform distribution over all candidate solutions. The population is updated for a number of iterations
162: using two basic operators: (1) selection, and (2) variation. The
163: selection operator selects better solutions at the expense of the
164: worse ones from the current population, yielding a population of
165: promising candidates. The variation operator starts by learning a
166: probabilistic model of the selected solutions that encodes
167: features of these promising solutions and the inherent
168: regularities. hBOA uses Bayesian networks with local
169: structures~\cite{Chickering:97} to model promising solutions. The
170: variation operator then proceeds by sampling the probabilistic
171: model to generate new solutions. The new solutions are
172: incorporated into the original population using the restricted
173: tournament replacement (RTR)~\cite{Harik:95a}, which ensures that
174: useful diversity in the population is maintained over long periods
175: of time. A more detailed description of hBOA can be found
176: in~\cite{Pelikan:thesis}.
177: 
178: To improve candidate solutions locally, hBOA applies a
179: deterministic bit-flip hill climber to each newly generated
180: candidate solution; the hill climber improves the solution by
181: single-bit flips until no further improvement is possible. Flips that
182: produce better solutions are given higher priority. It was previously
183: shown that local search can significantly reduce population sizes
184: for various optimization problems, including the spin glass
185: problem~\cite{Pelikan:03*}.
186: 
187: \subsection{Extremal value distributions}
188: \label{section-evd}
189: 
190: Several quantities related to the computational complexity studied
191: in this work are found to follow extremal value distributions. The
192: central limit theorem for extremal values states that the extremes
193: of large samples are distributed according to one of three
194: extremal value distributions, depending on whether the underlying
195: distribution is fat-tailed (tails decay polynomially), exponential (tails
196: decay exponentially), or thin-tailed (tails decay faster than
197: exponentially)~\cite{Fisher:28}. The cumulative distribution
198: function for any of these extremal value distributions can
199: be written as
200: \begin{equation}
201: H_{\xi;\mu;\beta}(x) = \exp\left(-{\left( 1 + \xi \frac{x-\mu}{\beta}\right)}^{\frac{1}{\xi}}\right),
202: \end{equation}
203: where $\mu$ is the location parameter, $\beta$ is the scaling
204: parameter, and $\xi$ is the shape parameter that indicates how
205: fast the tail decays. If $\xi<0$, $H_{\xi;\mu;\beta}(x)$ represents the Fr\'echet
206: distribution (polynomial decay), if $\xi=0$ it represents the
207: Gumbel distribution (exponential decay), and if $\xi>0$ it
208: represents the Weibull distribution (faster than exponential
209: decay). Distributions encountered in this work are Fr\'echet
210: distributions, where the shape parameter $\xi$ determines the
211: power law decay of the fat tails of the distribution
212: 
213: \begin{equation}
214: \frac{dH_{\xi;\mu;\beta}}{dx}
215: \stackrel{x\to\infty}{-\!\!\!\!-\!\!\!\!-\!\!\!\!-\!\!\!\!\longrightarrow}
216: x^{-(1-1/\xi)} \;.
217: \label{eq:Tail}
218: \end{equation}
219: 
220: From this asymptotic behavior one can see that the $m$-th moment of a fat-tailed
221: Fr\'echet distribution (with $\xi<0$) is well defined only if $|\xi| < 1/m$.
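
As a quick check of this condition (a sketch that uses only the tail behavior in Equation~(\ref{eq:Tail}); the cutoff $x_0>0$ is introduced here purely for illustration), note that for $\xi<0$ the tail exponent is $-(1-1/\xi) = -(1+1/|\xi|)$. The existence of the $m$-th moment is then governed by the integral
\[
\int_{x_0}^{\infty} x^{m}\, x^{-(1+1/|\xi|)}\, dx \;,
\]
which converges if and only if $m - 1 - 1/|\xi| < -1$, that is, $|\xi| < 1/m$. For example, the mean ($m=1$) requires $|\xi|<1$, and the variance ($m=2$) requires $|\xi|<1/2$.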
222: 
223: %===========================================================================================================
224: 
225: \section{The Ising spin glass}
226: \label{section-methodology}
227: 
228: A 2D spin glass system consists of a regular 2D grid containing
229: $N$ nodes which correspond to the spins. The edges in the grid
230: connect nearest neighbors. Additionally, edges between the first
231: and the last element in each dimension are added to introduce
232: periodic boundary conditions. 
233: %See Figure~\ref{figure-spin-glass}
234: For example, a 2D spin glass on a $3\times 3$ square lattice
235: consists of $N=9$ spins.
236: 
237: %\begin{figure}[t]
238: %\begin{center}
239: %\epsfig{file=spin-glass.eps,width=1.4in}
240: %\end{center}
241: %\caption{Topology of a 2D spin glass with $N=9$ spins on a
242: %$3\times 3$ grid (square lattice). Nodes represent spins, whereas
243: %edges connect pairs of spins related by coupling constants.}
244: %\label{figure-spin-glass}
245: %\end{figure}
246: 
247: Each edge is associated with a real-valued constant that
248: gives the strength of the spin-spin coupling. For the classical Ising
249: model each spin can be in one of two states: $+1$ or $-1$. Each
250: possible set of values for all spins is called a spin
251: configuration. Given a set of (random) coupling constants,
252: $J_{i,j}$, and a configuration of spins, $C$, the energy can be
253: computed as
254: \begin{equation}
255: E(C) = \sum_{\langle i,j\rangle} s_i J_{i,j} s_j \;,
256: \end{equation}
257: where $i,j \in\{0, 1, \ldots, N-1\}$ denote the spins (nodes), $s_i$ is the state of spin $i$, and
258: $\langle i,j\rangle$ denotes pairs of nearest neighbors on the underlying grid
259: (allowed edges). The random spin-spin coupling constants $J_{i,j}$ for a
260: particular spin glass instance are given as input.
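
To make the sum concrete (a sketch; the linear lattice size $L$ with $N=L^2$ is notation introduced here, and $L\geq 3$ is assumed so that the four neighbors of a spin are distinct): with periodic boundary conditions every spin has four nearest neighbors, so the sum runs over $2N$ edges. Since $|s_i s_j| = 1$, the energy of any configuration satisfies
\[
-\sum_{\langle i,j\rangle} |J_{i,j}| \;\leq\; E(C) \;\leq\; \sum_{\langle i,j\rangle} |J_{i,j}| \;,
\]
and the lower bound is reached only if every term can be made negative simultaneously, which is generally not possible for random couplings.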
261: 
262: In statistical physics, the usual task is to integrate a known
263: function over all possible configurations of spins,
264: %$\{\tilde{C}\}$
265:  where the configurations are distributed
266: according to the Boltzmann distribution. The probability of
267: encountering a configuration $C$ at temperature $T$ is given by
268: \begin{equation}
269: \label{eq-boltzmann-distribution}
270: p(C) = \frac{\exp\left({-E(C)/T}\right)}{\sum_{\tilde{C}} \exp\left({-E(\tilde{C})/T}\right)} \;.
271: \end{equation}
272: 
273: From the physics point of view, it is interesting to know the ground states (configurations
274: associated with the minimum possible energy). Finding extremal energies
275: then corresponds to sampling the Boltzmann distribution with temperature
276: approaching $0$ and thus the problem of finding ground states is simpler {\it
277:   a priori} than integration over a wide range of temperatures. However, most
278: of the conventional methods based on sampling the Boltzmann distribution~(\ref{eq-boltzmann-distribution}) fail to find the ground-state configurations because they often get trapped in local minima.
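
The concentration on ground states as $T\to 0$ can be read off directly from Equation~(\ref{eq-boltzmann-distribution}); the following is a short worked step, with $C_0$ denoting a ground state (notation introduced here). For any configuration $C$,
\[
\frac{p(C)}{p(C_0)} = \exp\left(-\frac{E(C)-E(C_0)}{T}\right) ,
\]
which tends to $0$ as $T\to 0$ whenever $E(C) > E(C_0)$ and equals $1$ whenever $E(C)=E(C_0)$. In the zero-temperature limit the Boltzmann distribution is therefore uniform over the ground states.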
279: 
280: The problem of finding ground states is a typical optimization problem, where the task is to find an optimal
281: configuration of spins that minimizes energy. Although
282: polynomial-time deterministic methods exist for both types of 2D
283: spin glasses~\cite{Galluccio:99,Galluccio:99a}, most algorithms
284: based on local search operators, including a (1+1) evolution
285: strategy, conventional Monte Carlo simulations, and Monte Carlo
286: simulations with Wang-Landau~\cite{Wang:01} or multicanonical
287: sampling~\cite{Berg:92},
288: scale exponentially and are thus impractical for solving this
289: class of problems. This slowdown originates from the
290: suppressed relaxation times in the Monte Carlo simulations in the
291: vicinity of the extremal energies because of the enormous number
292: of local optima in the energy landscape. Recombination-based
293: genetic algorithms succeed if recombination is performed in a way
294: that interacting spins are located close to each other in the
295: representation; $k$-point crossover with a rather small $k$ can
296: then be used so that the linkage between contiguous blocks of bits
297: is preserved (unlike with uniform crossover, for instance).
298: However, the behavior of such specialized representations and
299: variation operators cannot be generalized to similar slowly
300: equilibrating problems which exhibit different energy landscapes,
301: such as protein folding or polymer dynamics.
302: 
303: In order to obtain a quantitative understanding of the disorder in
304: a spin glass system introduced by the random spin-spin couplings,
305: one generally analyzes a large set of random spin glass instances
306: for a given distribution of the spin-spin couplings. For each spin
307: glass instance the optimization algorithm is applied, and the results
308: are analyzed statistically to obtain a measure of computational
309: complexity. Here we first consider two types of initial spin-spin
310: coupling distributions, the $\pm J$ spin glass and the Gaussian
311: spin glass.
312: 
313: \subsection{The $\pm J$ spin glass}
314: For the $\pm J$ Ising spin glass, each spin-spin coupling constant
315: is set randomly to either $+1$ or $-1$ with equal probability (see
316: lower right panel in Figure \ref{fig-transition-distribution}).
317: Energy minimization in this case can be transformed into a
318: constraint satisfaction problem, where the constraints relate
319: spins connected by a coupling constant. If $J_{i,j}>0$, then the
320: constraint requires spins $i$ and $j$ to be different, whereas if
321: $J_{i,j}<0$, then the constraint requires spins $i$ and $j$ to be
322: the same. Energy is minimized when the number of satisfied
323: constraints is maximized.
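
This correspondence can be made explicit; the following is a sketch in which $M$ (the number of edges) and $S(C)$ (the number of satisfied constraints in configuration $C$) are notation introduced here. For $J_{i,j}=\pm 1$, every satisfied constraint contributes $s_i J_{i,j} s_j = -1$ to the energy and every violated constraint contributes $+1$, so
\[
E(C) = \bigl(M - S(C)\bigr) - S(C) = M - 2\,S(C) \;.
\]
Minimizing $E(C)$ is thus equivalent to maximizing $S(C)$, and $E(C) = -M$ would require all constraints to be satisfied at once.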
324: 
325: \subsection{Gaussian spin glasses}
326: In the Gaussian spin glass, coupling constants are generated
327: according to a zero-mean Gaussian distribution with variance one
328: (see upper left panel in Figure
329: \ref{fig-transition-distribution}). For real-valued couplings,
330: energy minimization can be cast as a constraint satisfaction
331: problem with weighted constraints.
332: 
333: \subsection{Transition between $\pm J$ and Gaussian spin glasses}
334: To describe a smooth transition between the $\pm J$ and the
335: Gaussian spin glass we vary the distribution of spin-spin coupling
336: constants by defining a distribution as the sum of two Gaussian
337: distributions, described by means, $\pm \tilde{\mu}$, and
338: variance, $\tilde{\sigma}^2$, in such a way that the overall mean
339: becomes $\mu = 0$ and the overall variance $\sigma^2 = 1$. The
340: explicit form of the two Gaussians is thus given by
341: $\tilde{\sigma}^2=1-\tilde{\mu}^2$. The $\pm J$ spin glass
342: ($\tilde{\mu} = 1$) and the Gaussian spin glass ($\tilde{\mu} =
343: 0$) then describe the extremal cases of this new family of
344: distributions. The transition between the two extrema is then
345: described by varying $\tilde{\mu}$ between 0 and 1, which is
346: illustrated in Figure~\ref{fig-transition-distribution} for
347: $\tilde{\mu} = 0, 0.60, 0.80, 0.95, 0.99, 1$.
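
Written out, and assuming that the two Gaussians enter with equal weight $1/2$ (an assumption made here; the text only requires zero overall mean and unit overall variance), the density of a coupling constant $J$ reads
\[
p(J) = \frac{1}{2}\,\frac{1}{\sqrt{2\pi\tilde{\sigma}^2}}
\left[ \exp\left(-\frac{(J-\tilde{\mu})^2}{2\tilde{\sigma}^2}\right)
+ \exp\left(-\frac{(J+\tilde{\mu})^2}{2\tilde{\sigma}^2}\right)\right] .
\]
The mean vanishes by symmetry, and the second moment is $\tilde{\sigma}^2 + \tilde{\mu}^2$, so unit overall variance gives $\tilde{\sigma}^2 = 1-\tilde{\mu}^2$. For instance, $\tilde{\mu}=0.6$ corresponds to $\tilde{\sigma}=0.8$.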
348: 
349: \begin{figure}[t]
350: \begin{center}
351: \epsfig{file=couplings/couplings_mu_0.eps,width=1.1in}~~~
352: \epsfig{file=couplings/couplings_mu_60.eps,width=1.1in}~~~
353: \epsfig{file=couplings/couplings_mu_80.eps,width=1.1in}\\~\\
354: \epsfig{file=couplings/couplings_mu_95.eps,width=1.1in}~~~
355: \epsfig{file=couplings/couplings_mu_99.eps,width=1.1in}~~~
356: \epsfig{file=couplings/couplings_mu_100.eps,width=1.1in}
357: \end{center}
358: \vspace*{-3ex}
359: \caption{Distribution of coupling constants for the transition from the Gaussian (upper left) to the $\pm J$ spin glass (lower right).}
360: \label{fig-transition-distribution}
361: \end{figure}
362: 
363: %==========================================================================
364: 
365: \section{Numerical experiments}
366: \label{section-experiments}
367: 
368: In the following we describe the numerical experiments in more detail and present results for the spin glasses described above.
369: 
370: \subsection{Description of experiments}
371: 
372: For $\pm J$ and Gaussian 2D spin glasses, we used systems with an equal
373: number of spins in each dimension, ranging in size from
374: $n=8\times 8$ to $n=20\times 20$. For each system size, 1000
375: random samples were generated. hBOA with the deterministic local
376: searcher was then applied to find the ground state for each
377: sample. For the transition from $\pm J$ to Gaussian spin glasses,
378: we focused on a single system size, $n=10\times 10$.
379: 
380: For each spin glass sample, the population size in hBOA is set to
381: the minimum population size required to find the optimum in 10
382: independent runs. The minimum population size is determined using
383: bisection. The width of the final interval in bisection is at most $10\%$ of its
384: upper limit. Binary tournament selection without replacement is
385: used. The window size in RTR is set to the number of spins of the
386: system under consideration, but it is capped at
387: $5\%$ of the population size. The $5\%$ cap on the window size is
388: important to ensure fast convergence with even small populations.
389: The cap explains the difference between the results presented here
390: and the previous results, because populations are usually very
391: small for hBOA with local search on Ising spin
392: glasses~\cite{Pelikan:thesis}.
393: 
394: Performance of hBOA was measured by (1) $E_G$, the total number of
395: spin glass system configurations examined by hBOA (the number of
396: restarts of the local searcher), and (2) $E_L$, the total number
397: of steps of the local hill climber. Due to the lack of space, we
398: only analyze $E_G$. $E_L$ was greater than $E_G$ by a factor that
399: grows approximately as $\sqrt{n}$. Clearly, we can expect that
400: $E_G<E_L$. Nonetheless, it is computationally much less expensive
401: to perform a local step in the hill climber than to evaluate a new
402: spin glass configuration sampled by hBOA.
403: 
404: \subsection{Results for $\pm J$ and Gaussian couplings}
405: 
406: \begin{figure}[t]
407: \begin{center}
408: \epsfig{file=ising_frechet/norm8x8.eps,height=1.10in}~~~
409: \epsfig{file=ising_frechet/norm14x14.eps,height=1.10in}~~~
410: \epsfig{file=ising_frechet/norm20x20.eps,height=1.10in}\\~\\
411: \epsfig{file=gaussian_frechet/norm10x10.eps,height=1.10in}~~~
412: \epsfig{file=gaussian_frechet/norm14x14.eps,height=1.10in}~~~
413: \epsfig{file=gaussian_frechet/norm20x20.eps,height=1.10in}
414: \end{center}
415: \vspace*{-3ex}
416: \caption{Distribution of $E_G$ for $\pm J$ (top row) and Gaussian (bottom row) spin glass systems of
417: varying size. $E_G$ and the density function are normalized using
418: $\mu$ and $\beta$.}
419: \label{figure-Frechet}
420: \end{figure}
421: 
422: The first important observation is that the distributions of $E_G$
423: and $E_L$ for all problem sizes and distributions of coupling
424: constants follow Fr\'echet extremal value distributions. Applying
425: a maximum likelihood estimator, we can determine the parameters
426: $\mu$, $\beta$, and $\xi$ of these distributions defined in
427: Equation (1). Figure~\ref{figure-Frechet} shows the histograms and
428: the corresponding probability density function for $E_G$ for $\pm
429: J$ spin glasses of various sizes.
430: 
431: The location parameter $\mu$, which indicates the most likely value of
432: $E_G$, can be used to determine the scalability of hBOA. In
433: Figures~\ref{fig-location-and-shape}a and
434: \ref{fig-location-and-shape}b the location parameter for both $\pm
435: J$ and Gaussian spin glasses is shown versus the system size.
436: Double logarithmic plots indicate that the location parameter is
437: bounded above by a polynomial. For the $\pm J$ spin glass, the order of that
438: polynomial approaches $1.5$ as system size $n$ grows, whereas for
439: Gaussian couplings, the order of the polynomial seems to approach
440: $2.2$.
441: 
442: Figures~\ref{fig-location-and-shape}c and
443: \ref{fig-location-and-shape}d show the shape $\xi$ for both $\pm
444: J$ and Gaussian spin glasses with respect to the system size.
445: Since $|\xi|$ is always smaller than 1, we conclude that the mean is
446: well-defined for all cases. For the variance (2nd moment) we find
447: $|\xi|$ to be smaller than $1/2$ only for systems larger
448: than $n=10\times10$. Thus, for systems smaller than $n=10\times10$
449: the variance is not well-defined and the estimate of the mean has an infinite
450: error.
451: 
452: \begin{figure}[t]
453: \begin{center}
454: \epsfig{file=ising_frechet/location.eps,width=0.35\textwidth}~~~
455: \epsfig{file=gaussian_frechet/location.eps,width=0.35\textwidth}\\~\\
456: \epsfig{file=ising_frechet/shape.eps,width=0.35\textwidth}~~~
457: \epsfig{file=gaussian_frechet/shape.eps,width=0.35\textwidth}
458: \end{center}
459: \vspace*{-3ex}
460: \caption{Location $\mu$ and shape $\xi$ for $\pm J$ and Gaussian
461: spin glasses using maximum likelihood estimation. Standard error
462: of the estimations displayed with error bars.}
463: \label{fig-location-and-shape}
464: \end{figure}
465: 
466: \subsection{Results for the transition between $\pm J$ and Gaussian couplings}
467: 
468: For the transition between $\pm J$ and Gaussian couplings, $E_G$
469: and $E_L$ also follow Fr\'echet distributions.
470: Figure~\ref{figure-Frechet-transition} shows the distribution of
471: $E_G$ in the transition, including $\pm J$ and Gaussian cases.
472: Figure~\ref{fig-location-and-shape-transition} shows location and
473: shape parameters for the transition.
474: 
475: We can see that both location and shape parameters for the
476: transition between $\pm J$ and Gaussian couplings lie between the
477: corresponding parameters for the two extreme cases. That means
478: that considering the two extreme cases provides insight not only
479: into the cases themselves but can also be used to guide the estimation of
480: parameters for a large class of other distributions of couplings.
481: 
482: \begin{figure}[t]
483: \begin{center}
484: \epsfig{file=transition_frechet/norm0.eps,height = 1.1in}~~~
485: \epsfig{file=transition_frechet/norm60.eps,height = 1.1in}~~~
486: \epsfig{file=transition_frechet/norm80.eps,height = 1.1in}\\~\\
487: \epsfig{file=transition_frechet/norm95.eps,height = 1.1in}~~~
488: \epsfig{file=transition_frechet/norm99.eps,height = 1.1in}~~~
489: \epsfig{file=transition_frechet/norm100.eps,height = 1.1in}
490: \end{center}
491: \vspace*{-3ex}
492: \caption{Distribution of $E_G$ for the transition from $\pm J$ to Gaussian spin glasses for $n=10\times 10$.}
493: \label{figure-Frechet-transition}
494: \end{figure}
495: 
496: \begin{figure}[t]
497: \begin{center}
498: \subfigure[Location]{\epsfig{file=transition_frechet/location.eps,width=0.35\textwidth}}
499: \hspace*{0.5in}
500: \subfigure[Shape]{\epsfig{file=transition_frechet/shape.eps,width=0.35\textwidth}}
501: \end{center}
502: \vspace*{-3ex}
503: \caption{Location $\mu$ and shape $\xi$ for the transition between
504: $\pm J$ and Gaussian spin glasses. Standard error of the estimations displayed with error bars. X-axis denotes the distance $\tilde{\mu}$ of the means used to generate couplings.}
505: \label{fig-location-and-shape-transition}
506: \end{figure}
507: 
508: \section{Discussion}
509: 
510: In the following we discuss the experimental results, first in the
511: context of hBOA scalability theory and then in comparison with
512: flat-histogram Monte Carlo results~\cite{Dayal:04}. We close by
513: presenting some general conclusions for genetic and evolutionary
514: computation.
515: 
516: \subsection{Experimental results and hBOA theory}
517: 
518: An interesting question is whether the results obtained can be
519: explained using hBOA convergence theory designed for a rather
520: idealized situation, where the problem can be decomposed into
521: subproblems of bounded order over multiple levels of difficulty.
522: For random 2D Ising spin glasses, it can be shown that for a
523: complete single-level decomposition it would be necessary to
524: consider subproblems of order proportional to $\sqrt{n}$, as hypothesized by M\"{u}hlenbein~\cite{Muhlenbein:99a}, which
525: would lead to exponentially sized populations~\cite{Pelikan:02a}.
526: Despite this, the number of function evaluations grows as a
527: low-order polynomial of the number $n$ of spins as predicted by
528: hBOA scalability theory for decomposable problems of bounded
529: difficulty~\cite{Pelikan:02a}. Spin glasses with $\pm J$ couplings
530: correspond to uniform scaling, where the theory predicts
531: $O(n^{1.55})$ evaluations; indeed, the location parameter $\mu$
532: seems to approach a polynomial of order approx. $1.5$. Spin
533: glasses with Gaussian couplings exhibit a non-uniform scaling,
534: where exponential scaling can be taken as a bounding case. For
535: exponential scaling, the number of evaluations would be predicted
536: to grow as $O(n^2)$; here the location parameter seems to grow
537: slightly faster with a polynomial of order approx. $2.2$. However,
538: the order of this polynomial decreases with problem size.
539: 
540: \subsection{Comparison to flat-histogram Monte Carlo}
541: 
542: 
543: Monte Carlo (MC) methods are usually used to integrate a function
544: $f(x)$ weighted by some probability density over the input
545: parameter $x$. The common approach is to sample a series of values
546: of $x$ according to the specified probability distribution and
547: average the values of $f(x)$.
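
In formulas (a sketch; the sample count $K$, the samples $x_1,\ldots,x_K$, and the symbol $p$ for the density are notation introduced here), if the $x_k$ are drawn from the density $p(x)$, the integral is estimated by the sample average
\[
\int f(x)\,p(x)\,dx \;\approx\; \frac{1}{K}\sum_{k=1}^{K} f(x_k) \;.
\]
For spin glasses, $x$ is a spin configuration $C$ and $p$ is the Boltzmann distribution of Equation~(\ref{eq-boltzmann-distribution}).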
548: 
549: While conventional MC has been successfully used in numerous
550: applications, it sometimes produces inferior results for low
551: temperatures because the random walk through the space of all
552: possible configurations (values of $x$) of the system has
553: difficulties in overcoming energy barriers. One of the ways to
554: alleviate this difficulty is to modify the simulated
555: statistical-mechanical ensemble and use Wang-Landau
556: sampling~\cite{Wang:01} to sample each energy level with equal
557: probability, thereby producing a flat histogram. The Wang-Landau
558: algorithm thus represents a class of methods also known as
559: flat-histogram MC. This approach not only alleviates the problem
560: of energy barriers, but it also enables computation of the number
561: of configurations at different energy levels, which can in turn be
562: used to quickly compute thermal averages for any given temperature
563: without having to rerun the simulation.
564: 
565: For flat-histogram MC, the distribution of round-trip times in
566: energy, measured by the total number of applications of local
567: operators, was recently shown to follow a Fr\'echet distribution
568: \cite{Dayal:04}. However, the absolute value of the shape
569: parameter for flat-histogram MC was shown to approach $1$. As a
570: result, the mean of this distribution is not defined. Further, the
571: location parameter found for flat-histogram MC grows exponentially~\cite{Dayal:04}, although for this class of spin glasses it is
572: possible to analytically compute the entire energy spectrum in
573: polynomial time, $O(n^{3.5})$~\cite{Galluccio:99}.
574: 
575: \subsection{Important lessons for genetic and evolutionary computation}
576: 
577: The results presented in this paper indicate that it can be
578: misleading to estimate the mean convergence time by an average
579: over several independent samples (runs), because in some cases the
580: mean, variance, and other moments of the respective distribution
581: may become ill-defined. In this work, the location parameter
582: serves as a well-defined quantity to express computational
583: complexity of various optimization and simulation techniques,
584: including hBOA and flat-histogram MC. It can be expected that
585: similar distributions will be observed for other evolutionary
586: algorithms, as they reflect intrinsic properties of the spin glass
587: \cite{Dayal:04}.
588: 
589: 
590: Random 2D Ising spin glasses represent interesting classes of constraint
591: satisfaction problems with a large number of local optima. The results
592: presented in this work indicate that for such classes of problems,
593: recombination-based search can provide optimal solutions in low-order
594: polynomial time, whereas mutation-based methods scale exponentially. However,
595: local search is still beneficial for local improvement of solutions in
596: recombination-based evolutionary algorithms, because incorporating local
597: search decreases population sizing requirements. A similar observation was made
598: for MAXSAT~\cite{Pelikan:03*}.
599: 
600: %===========================================================================================================
601: 
602: \section{Conclusions}
603: \label{section-conclusions}
604: 
605: Random classes of Ising spin glass systems represent an
606: interesting class of constraint satisfaction problems for
607: black-box optimization. Similar to flat-histogram MC,
608: computational complexity of hBOA---expressed in the number of
609: solutions explored by both hBOA and the local hill climber until
610: the optimum---is found to show large sample-to-sample variations.
611: The obtained distribution of optimization steps follows a
612: fat-tailed Fr\'echet extremal value distribution. However, for
613: hBOA the shape parameter defining the decay of the tail is small
614: enough for the first two moments of the observed distributions to
615: exist for all but the smallest system sizes. The location parameter as
616: well as the mean of this distribution scale like a polynomial of
617: low order. The experiments show that similar behavior can be
618: observed for $\pm J$ and Gaussian spin glasses, as well as for the
619: transition between these two cases. For $\pm J$ spin glasses,
620: performance of hBOA agrees with scalability theory for hBOA on
621: uniformly scaled problems, whereas for Gaussian spin glasses,
622: performance of hBOA agrees with scalability theory for hBOA on
623: exponentially scaled problems.
624: 
625: There are some general conclusions for genetic and evolutionary
626: computation. First, measuring time complexity by the average
627: number of function evaluations until the optimum is found can
628: sometimes be misleading when rare events dominate the
629: sample-to-sample variations. Second, it was shown for this
630: specific problem that recombination-based search can efficiently
631: deal with exponentially many local optima and still find the
632: global optimum in low-order polynomial time.
633: 
634: \section*{Acknowledgments}
635: \vspace*{-1ex}
636: 
637: Pelikan was supported by the Research Award at the University of
638: Missouri at St. Louis and the Research Board at the University of
639: Missouri. Trebst and Alet acknowledge support from the Swiss National Science
640: Foundation. Most of the calculations were performed on the Asgard cluster at ETH Z\"{u}rich. The hBOA software, used by Pelikan, was developed by Martin Pelikan and David E. Goldberg at the University of Illinois at Urbana-Champaign. 2D spin glass instances with ground states were obtained from S. Sabhapandit and S. N. Coppersmith of the University of Wisconsin.
641: 
642: \vspace*{-2ex}
643: 
644: \begin{small}
645: \bibliographystyle{splncs}
646: \bibliography{mybib}
647: \end{small}
648: 
649: \end{document}
650: