21: %
22: % Journal NPA, Elsevier
23: % Typeset by VTeX Ltd., Vilnius, Lithuania
24: %
25: %Spelling_date
% Options: [rotating,secthm,seceqn,secfloat,nameyear,xxtheorem]
27: \documentclass{article}
28: \usepackage{bookstyle,bm,cite}
29: \usepackage{graphicx}
30: %\psdraft
31: %spell_from
32: %
33: % Local def's for bibliography
34: %
35: \let\bauthor\relax
36: \let\fnm\relax\let\snm\relax
37: \let\bseries\ignorespaces
38: \let\btitle\relax
39: \let\bvolumeno\textbf
40: \def\bdate#1{\unskip\ (#1)}
41: \def\bfirstpage#1{\unskip\ #1}
42: \def\bcomment#1{\unskip, #1}
43: %
44: \begin{document}
45: \title{Random Matrices, the Ulam Problem, Directed Polymers \& Growth Models, and Sequence Matching}
46: %\runtitle{}
47: \author{Satya N. Majumdar}
48: \address{
49: Laboratoire de Physique Th\'eorique et
50: Mod\`eles Statistiques (UMR 8626 du CNRS),
51: Universit\'e Paris-Sud, B\^at. 100, 91405 Orsay Cedex, France
52: }
53: \frontmatter
54: \maketitle
55: \mainmatter%
56: %s1 ###
57: \section{Preamble}
58:
59: In these lecture notes I will give a pedagogical introduction to some common aspects of $4$ different
60: problems: (i) random matrices (ii) the longest increasing subsequence problem (also known as
61: the Ulam problem) (iii) directed
62: polymers in random medium
63: and growth models in $(1+1)$ dimensions and (iv) a problem on the alignment of a pair
64: of random sequences.
Each of these problems is almost a sub-field in itself, and here I will discuss only some specific
aspects of each of them. These $4$ problems have been studied almost independently for the
past few decades, and only over the last few years has a common thread been found that
links all of them. In particular, all of them share one common limiting probability distribution
69: known as the Tracy-Widom distribution that describes the asymptotic probability distribution
70: of the largest eigenvalue of a random matrix. I will mention here, without mathematical
71: derivation, some of the
72: beautiful results discovered in the past few years. Then, I will consider two specific models
73: (a) a ballistic deposition growth model and (b) a model of sequence alignment known as the
74: Bernoulli matching model and discuss, in some detail, how one derives exactly the
Tracy-Widom law in these models. The emphasis of these lectures will be on how to
map one model to another. Some open problems will be discussed at the end.
77:
78: \section{Introduction}
79:
80: In these lectures I will discuss $4$ seemingly unrelated problems: (i) random matrices
81: (ii) the longest increasing subsequence (LIS) problem (also known as the Ulam
82: problem after its discoverer) (iii) directed polymers in random environment in $(1+1)$ dimensions
83: and related random growth models and (iv) the longest common subsequence (LCS)
84: problem arising in matching of a pair of random sequences (see Fig. \ref{fields}). These 4 problems
85: have been studied extensively, but almost independently, over the past few
86: decades.
87: For example, random matrices have
88: been extensively studied by
89: nuclear physicists, mathematicians and statisticians. The LIS problem
90: has been studied extensively by probabilists. The models of directed polymers in
91: random medium
92: and the related growth models have been a very popular subject among
93: statistical physicists. Similarly, the LCS problem has been very popular
among biologists and computer scientists. Only in the last $10$ years or so
has it become progressively evident that there are profound links between these
$4$ problems. All of them share one common probability distribution function,
which is called the Tracy-Widom distribution.
98: \begin{figure}[t]
99: %\fbox{\vtop to3cm{\vss\hsize=.7\hsize\centerline{fields.eps}\vss}}
100: \includegraphics[width=.7\hsize]{fields.eps}
101: \caption{All $4$ problems share the Tracy-Widom distribution.}
102: \label{fields}
103: \end{figure}
104:
105:
106: This distribution was first discovered in the context of random matrices
107: by Tracy and Widom~\cite{TW1}. They calculated exactly the probability distribution
108: of the {\em typical} fluctuations of the largest eigenvalue of a random matrix around
109: its mean. This distribution, suitably scaled, is known as the Tracy-Widom (TW)
110: distribution (see later for details). Later in 1999, in a landmark paper~\cite{BDJ},
111: Baik, Deift and Johansson (BDJ) showed that the same TW distribution
112: describes the scaled distributions of the length of the longest
increasing subsequence in the LIS problem. Immediately after, Johansson~\cite{J1}
and Baik and Rains~\cite{BR1} showed that the same distribution also appears
115: in a class of directed polymer problems. Around the same time, Pr\"ahofer
116: and Spohn showed~\cite{PS} that the TW distribution also appears in
117: a class of random growth models known as the polynuclear growth (PNG) models.
118: Following this, it was discovered that the TW distribution
119: also occurred
120: in several other growth models, such as the `oriented digital boiling' model~\cite{GTW},
121: a ballistic deposition model~\cite{BD}, in PNG type of growth models
122: with varying initial conditions and in various geometries~\cite{IS,F1} and
123: also in the single-step growth model arising from the totally asymmetric exclusion process~\cite{S1}.
124: Also, a somewhat direct connection between the stochastic growth models
125: and the random matrix models via the so called `determinantal point processes' was found
in a series of works by Spohn and collaborators~\cite{Spohn}, which I will not discuss here
127: (see Ref. \cite{Spohn} for a recent review).
128: Finally, the TW
129: distribution was also shown to appear in the LCS problem~\cite{MN}, which is also related
130: to these growth models.
Apart from these 4 problems that we will focus on here, the TW distribution
132: has also appeared in many other problems, e.g., in the mesoscopic
133: fluctuations of excitation gaps in a dirty metal grain or a semiconductor quantum dot induced
134: by a nearby superconductor~\cite{meso}.
135: The TW distribution also appears in problems related to finance~\cite{BBP}.
136:
The appearance of the TW distribution in so many different problems
is really interesting, suggesting an underlying universality
that links all these different systems. The purpose of my lectures is
to explore and elucidate the links between the 4 problems stated above.
141: The literature on this subject is huge. I will not try to provide
142: any detailed derivation of the mathematical results here. Instead,
143: I will state precisely the known results that we will need to use and put
144: more emphasis on how one maps one problem to the other. In particular,
145: I will discuss two problems in some detail and show how the TW distribution
146: appears in them. These two problems are: (i) a random growth model
147: in $(1+1)$ dimensions that we call the anisotropic ballistic deposition model
148: and (ii) a particular variant of the LCS problem known as the Bernoulli
matching (BM) model. In the former case, I will show how to map
the ballistic deposition model to the LIS problem and subsequently use the BDJ
results. In the second case, I will show that the BM model can be mapped to
a particular directed polymer model that was studied by Johansson.
The mappings are often geometric in nature, are nontrivial and serve
two purposes: (a) to elucidate how the TW distribution appears in
155: somewhat unrelated problems and (b) to derive exact analytical results in problems such
156: as the sequence matching models, where precise analytical results were
157: missing so far.
158:
159: The lecture notes are organized as follows. In Section 3, I will describe
160: some basic results of the random matrix theory and define the TW distribution
161: precisely. In Section 4, the LIS problem will be described along with
162: the main results of BDJ.
163: Section 5 contains a discussion of the directed polymer problems,
164: and in particular the main results of Johansson will be mentioned. In Section 5.1, I will describe
165: how one maps the anisotropic ballistic deposition model to the LIS problem.
166: Section 6 contains a discussion of the LCS problem. Finally, I will conclude
167: in Section 7 with a discussion and open problems.
168:
169: \section{Random Matrices: the Tracy-Widom distribution for the largest eigenvalue}
170:
171: Studies of the statistics of the eigenvalues of random matrices have a
172: long history going back to the seminal work of Wigner~\cite{Wigner}.
173: Since then, random matrices have found applications in multiple fields
174: including nuclear physics, quantum chaos, disordered systems, string
175: theory and number theory~\cite{Mehta}. Three classes of matrices with
176: Gaussian entries have played important roles~\cite{Mehta}: $(N\times
177: N)$ real symmetric (Gaussian Orthogonal Ensemble (GOE)), $(N\times N)$
178: complex Hermitian (Gaussian Unitary Ensemble (GUE)) and $(2N\times
179: 2N)$ self-dual Hermitian matrices (Gaussian Symplectic Ensemble
180: (GSE)). For example, in GOE, one considers an $(N\times
181: N)$ real symmetric matrix $X$ whose elements $x_{ij}$'s are drawn
182: independently from a Gaussian distribution: $P(x_{ii})= \frac{1}{\sqrt{2\pi}}\,\exp[-x_{ii}^2/2]$
183: and $P(x_{ij}) = \frac{1}{\sqrt{\pi}}\,\exp[-x_{ij}^2]$ for $i<j$. Thus the
184: joint distribution of all the $N(N+1)/2$ independent elements is just the product
of the individual distributions and can be written in a compact form as
186: $P[X]= A_N \exp[-{\rm tr}(X^2)/2]$, where $A_N$ is a normalization constant.
187: One can similarly write down the joint distribution for the other two ensembles~\cite{Mehta}.
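
To see where the compact form comes from, note that for a real symmetric matrix each
off-diagonal element appears twice in the trace:
\[
{\rm tr}(X^2)=\sum_{i,j} x_{ij}\,x_{ji} = \sum_{i=1}^N x_{ii}^2 + 2\sum_{i<j} x_{ij}^2 ,
\]
so that the product of the individual Gaussian weights is
\[
\prod_{i=1}^N e^{-x_{ii}^2/2}\,\prod_{i<j} e^{-x_{ij}^2} = \exp\left[-\frac{1}{2}\,{\rm tr}(X^2)\right],
\]
while collecting the individual prefactors gives $A_N=(2\pi)^{-N/2}\,\pi^{-N(N-1)/4}$.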
188:
189: One of the key results in the random matrix theory is due to Wigner who derived,
190: starting from the joint distribution of the matrix elements $P(X)$, a rather
191: compact expression for the
192: joint probability density function (PDF) of the eigenvalues of a random $(N\times
193: N)$ matrix from all ensembles~\cite{Wigner}
194: \begin{equation}
195: P(\lambda_1, \lambda_2,\dots, \lambda_N) = B_N \exp\left[-\frac{\beta}{2}\left(\sum_{i=1}^N\lambda_i^2
196: -\sum_{i\ne j}\ln(|\lambda_i-\lambda_j|)\right)\right],
197: \label{pdf}
198: \end{equation}
199: where $B_N$ normalizes the pdf and $\beta=1$, $2$ and $4$ correspond
200: respectively to the GOE, GUE and GSE. The joint law allows one to
201: interpret the eigenvalues as the positions of charged particles,
202: repelling each other via a $2$-d Coulomb potential (logarithmic);
203: they are confined on a $1$-d line and each is subject to an external harmonic
204: potential. The parameter $\beta$ that characterizes the type of
205: ensemble can be interpreted as the inverse temperature.
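
Written out, Eq. (\ref{pdf}) has the Boltzmann form $P\propto \exp[-\beta E]$ with the energy
\[
E(\lambda_1,\dots,\lambda_N) = \frac{1}{2}\sum_{i=1}^N \lambda_i^2 - \frac{1}{2}\sum_{i\ne j}\ln|\lambda_i-\lambda_j|
= \frac{1}{2}\sum_{i=1}^N \lambda_i^2 - \sum_{i<j}\ln|\lambda_i-\lambda_j| ,
\]
or, equivalently,
\[
P(\lambda_1,\dots,\lambda_N) = B_N \prod_{i<j}|\lambda_i-\lambda_j|^{\beta}\,
\exp\left[-\frac{\beta}{2}\sum_{i=1}^N \lambda_i^2\right].
\]
The first term in $E$ is the harmonic confinement and the second is the pairwise logarithmic repulsion.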
206:
207: Once the joint pdf is known explicitly, other statistical properties of a random matrix
208: can, in principle, be derived from this joint pdf. In practice, however
209: this is often a technically daunting task. For example, suppose we want to
210: compute the average density of states of the eigenvalues defined as
211: $\rho(\lambda,N)= \sum_{i=1}^N\langle
212: \delta(\lambda-\lambda_i)\rangle/N$, which counts the average number of
213: eigenvalues between $\lambda$ and $\lambda + d\lambda$ per unit length.
214: The angled bracket $\langle \rangle$ denotes an average over the joint pdf.
It then follows that $\rho(\lambda,N)$ is simply the marginal of the joint pdf,
i.e., we fix one of the eigenvalues (say the first one) at $\lambda$ and integrate the joint pdf
over the remaining $(N-1)$ variables:
218: \begin{equation}
219: \rho(\lambda,N)=\frac{1}{N} \sum_{i=1}^N\langle
220: \delta(\lambda-\lambda_i)\rangle =\int_{-\infty}^{\infty}\prod_{i=2}^N d\lambda_i \,
221: P(\lambda,\lambda_2,\dots, \lambda_N).
222: \label{marginal}
223: \end{equation}
224: Wigner was able to compute this marginal and this is one of the central results
225: in the random matrix theory, known as the celebrated Wigner semi-circular law. For large $N$
226: and for any $\beta$,
227: \begin{equation}
228: \rho (\lambda,N) = \sqrt{\frac{2}{N\pi^2}}\,{\left[1 -\frac{\lambda^2}{2N}\right]}^{1/2}.
229: \label{wig1}
230: \end{equation}
231: Thus, on an average, the $N$ eigenvalues lie within a
232: finite interval $\left[-\sqrt{2N}, \sqrt{2N}\right]$, often referred
233: to as the Wigner `sea'. Within this sea, the average density of states
234: has a semi-circular form (see Fig. \ref{figtw}) that vanishes at the
235: two edges $-\sqrt{2N}$ and $\sqrt{2N}$. Note that since there are $N$
236: eigenvalues distributed over the interval $\left[-\sqrt{2N}, \sqrt{2N}\right]$, the
237: average spacing between adjacent eigenvalues scales as $N^{-1/2}$.
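
As a quick consistency check, the density in Eq. (\ref{wig1}) is normalized to unity.
Substituting $\lambda=\sqrt{2N}\,\sin\phi$,
\[
\int_{-\sqrt{2N}}^{\sqrt{2N}} \rho(\lambda,N)\,d\lambda
= \sqrt{\frac{2}{N\pi^2}}\;\sqrt{2N}\int_{-\pi/2}^{\pi/2}\cos^2\phi\,d\phi
= \frac{2}{\pi}\cdot\frac{\pi}{2}=1 .
\]
The mean spacing follows from dividing the width of the support by the number of
eigenvalues, $2\sqrt{2N}/N = 2\sqrt{2}\,N^{-1/2}$.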
238: \begin{figure}
239: \includegraphics[width=.7\hsize]{tw.eps}
240: \caption{The dashed line shows the semi-circular form of the
241: average density of states. The largest eigenvalue is centered around its mean $\sqrt{2N}$
242: and fluctuates over a scale of width $N^{-1/6}$. The probability of fluctuations
243: on this scale is described by the Tracy-Widom distribution (shown schematically).}
244: \label{figtw}
245: \end{figure}
246:
247: From the semi-circular law, it is clear that the average of the maximum (or minimum) eigenvalue
248: is $\sqrt{2N}$ $\left(-\sqrt{2N}\right)$. However, for finite but large $N$, the maximum
249: eigenvalue fluctuates, around its mean $\sqrt{2N}$, from one sample to
250: another. A natural question is: what is the full probability distribution
251: of the largest eigenvalue $\lambda_{\rm max}$? Once again, this distribution
252: can, in principle, be computed from the joint pdf in Eq. (\ref{pdf}). To see
253: this, it is useful to consider the cumulative distribution of $\lambda_{\rm max}$.
254: Clearly, if $\lambda_{\rm max}\le t$, it necessarily means that all the eigenvalues
255: are less than or equal to $t$. Thus,
256: \begin{equation}
257: {\rm Prob}\left[\lambda_{\rm max}\le t, N\right]= \int_{-\infty}^t \prod_{i=1}^N d\lambda_i \,
258: P(\lambda_1,\lambda_2,\dots, \lambda_N),
259: \label{max1}
260: \end{equation}
261: where the joint pdf is given in Eq. (\ref{pdf}).
262: In practice, however, carrying out this multiple integration in closed form is very difficult.
263: Relatively recently, Tracy and Widom~\cite{TW1} were
264: able to find the limiting form of ${\rm Prob}\left[\lambda_{\rm
265: max}\le t,
266: N\right]$ for large $N$. They showed that the fluctuations of $\lambda_{\rm max}$
267: {\em typically} occur over a very narrow scale of
268: width $\sim N^{-1/6}$ around its mean $\sqrt{2N}$ at the upper edge of the Wigner sea.
269: It is useful to note that this scale $\sim N^{-1/6}$ of typical fluctuations
270: of the largest eigenvalue is much bigger than the average spacing $\sim N^{-1/2}$
271: between adjacent eigenvalues in the limit of large $N$.
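
As a concrete illustration of these scales (the value of $N$ is chosen purely for
illustration), take $N=10^{6}$:
\[
\sqrt{2N}\approx 1414.2, \qquad N^{-1/6}=10^{-1}, \qquad N^{-1/2}=10^{-3}.
\]
The typical fluctuations of $\lambda_{\rm max}$ are thus tiny compared to the edge position
itself, yet they extend over a window that is larger than the mean eigenvalue spacing
by a factor of order $N^{1/3}=10^{2}$.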
272:
273: More precisely, Tracy and Widom showed~\cite{TW1} that asymptotically for
274: large $N$, the scaling variable $\xi=\sqrt{2}\,N^{1/6}\, \left[\lambda_{\rm
275: max}-\sqrt{2N}\right]$ has a limiting $N$-independent probability
276: distribution, ${\rm Prob}[\xi\le x]= F_{\beta}(x)$ whose form depends
277: on the value of the parameter $\beta=1$, $2$ and $4$ characterizing
278: respectively the GOE, GUE and GSE. The function $F_{\beta}(x)$ is called
279: the Tracy-Widom (TW) distribution function. The function $F_{\beta}(x)$,
computed as a solution of a nonlinear Painlev\'e differential equation~\cite{TW1},
approaches $1$ as $x\to \infty$ and decays rapidly to zero as $x\to
282: -\infty$. For example, for $\beta=2$, $F_2(x)$ has the following
283: tails~\cite{TW1},
284: \begin{eqnarray}
285: F_2(x) &\to & 1- O\left(\exp[-4x^{3/2}/3]\right)\quad\, {\rm as}\,\,\, x\to \infty
286: \nonumber \\
287: &\to & \exp[-|x|^3/12] \quad\, {\rm as}\,\,\, x\to -\infty.
288: \label{asymp1}
289: \end{eqnarray}
290: The probability density function $f_{\beta}(x)=dF_{\beta}/dx$ thus has highly
291: asymmetric tails. A graph of these functions for $\beta=1$, $2$ and $4$
292: is shown in Fig. \ref{fig:tracy}.
293: A convenient way to express these typical fluctuations of $\lambda_{\rm max}$
294: around its mean $\sqrt{2N}$ is to write, for large $N$,
295: \begin{equation}
296: \lambda_{\max} = \sqrt{2N} + \frac{N^{-1/6}}{\sqrt{2}}\, \chi
297: \label{tw2}
298: \end{equation}
299: where the random variable $\chi$ has the limiting $N$-independent distribution,
300: ${\rm Prob}[\chi \le x] = F_{\beta}(x)$.
301: As mentioned in the introduction, amazingly this TW distribution function has since
302: emerged in a growing variety of seemingly unrelated problems, some of which I
303: will discuss in the next sections.
304: \begin{figure}
305: \includegraphics[width=.7\hsize]{tracy.eps}
306: \caption{The probability density function $f_{\beta}(x)$ plotted as a
function of $x$ for $\beta=1$, $2$ and $4$ (reproduced from Ref.~\cite{TW1}).}
308: \label{fig:tracy}
309: \end{figure}
310: \vspace{0.4cm}
311:
312: {\bf {Large Deviations of $\lambda_{\rm max}$:}} Before we end this section and proceed to the other
313: problems, it is worth making
314: the following remark. The Tracy-Widom distribution describes the probability of {\em typical and small}
315: fluctuations of $\lambda_{\rm max}$ over a very narrow region of width
316: $\sim O(N^{-1/6})$ around the mean $\langle \lambda_{\rm max}\rangle
317: \approx \sqrt{2N}$. A natural question is how to describe the
probability of {\em atypical and large} fluctuations of $\lambda_{\rm max}$ around its
319: mean, say over a wider region of width $\sim O(N^{1/2})$? For example,
320: what is the probability that all the eigenvalues of a random matrix
321: are negative (or equivalently all are positive)? This is the same as
322: the probability that $\lambda_{\rm max}\le 0$ (or equivalently
323: $\lambda_{\rm min}\ge 0$). Since $\langle \lambda_{\rm max}\rangle
324: \approx \sqrt{2N} $, this requires the computation of the probability
325: of an extremely rare event characterizing a large deviation of $\sim
326: -O(N^{1/2})$ to the left of the mean.
327: This question naturally arises in any physical system where one
328: is interested in the statistics of stationary points of a random landscape.
329: For example, in disordered systems such as spin glasses one is interested in
330: the stationary points (metastable states) of the free energy landscape.
331: On the other hand, in structural glasses or supercooled liquids, one is
332: interested in the stationary points of the potential energy landscape.
333: In order to have a local minimum of the
334: random landscape one needs to ensure that the eigenvalues of the
335: associated Hessian matrix are all positive~\cite{CGG,Fyodorov}.
336: A similar question recently came up
337: in the context of random landscape models of anthropic principle
338: based string theory~\cite{Susskind,AE} as well as in quantum
339: cosmology~\cite{MH}. Here one is interested in the statistical
340: properties of vacua associated with a random multifield potential,
341: e.g., how many minima are there in a random string landscape?
342: These large deviations are also important in characterizing the large sample
343: to sample fluctuations of the excitation gap in quantum dots
344: connected to a superconductor~\cite{meso}.
345:
346: The issue of large deviations of $\lambda_{\rm max}$ was addressed
347: in Ref. \cite{J1} for a special class of matrices drawn
348: from the Laguerre ensemble that corresponds to the eigenvalues of product
349: matrices of the form $W=X^{\dagger}X$ where $X$ itself is a Gaussian
350: matrix (real or complex). Adopting similar methods as in
351: Ref. \cite{J1}
352: one can prove that for Gaussian ensembles,
353: the probability of {\em large} fluctuations to the left of the mean $\sqrt{2N}$
354: behaves for large $N$ as,
355: \begin{equation}
356: {\rm Prob}\left[\lambda_{\rm max}\le t, N\right] \sim \exp\left[-\beta
357: N^2 \Phi_{-}\left( \frac{\sqrt{2N}-t}{\sqrt{N}} \right) \right]
358: \label{ldf1}
359: \end{equation}
360: where $t\sim O(N^{1/2})\le \sqrt{2N}$ is located deep inside the
361: Wigner sea and $\Phi_{-}(y)$ is a certain {\em left} large deviation function.
362: On the other hand, for {\em large} fluctuations to the right of the mean $\sqrt{2N}$,
363: \begin{equation}
364: 1-{\rm Prob}\left[\lambda_{\rm max}\le t, N\right] \sim \exp\left[-\beta
365: N \Phi_{+}\left( \frac{t-\sqrt{2N}}{\sqrt{N}} \right) \right]
366: \label{ldf2}
367: \end{equation}
368: for $t\sim O(N^{1/2})\ge \sqrt{2N}$ located outside the Wigner sea to its right
369: and $\Phi_{+}(y)$ is the {\em right} large deviation function.
The problem then is to evaluate the left and the right large deviation
functions $\Phi_{\mp}(y)$ explicitly.
While, for the Laguerre ensemble, an explicit
expression of $\Phi_{+}(y)$ was obtained in Ref. \cite{J1} and
that of $\Phi_{-}(y)$ recently in Ref. \cite{VMB}, similar expressions
for the Gaussian ensemble had been missing.
376:
377: Indeed, to calculate the probability
378: that all eigenvalues are negative (or positive) for Gaussian matrices, we need an explicit expression
of $\Phi_{-}(y)$ for the Gaussian ensemble. This is because the probability that all
380: eigenvalues are negative is precisely the probability that $\lambda_{\rm max}\le 0$,
381: and hence, from Eq. (\ref{ldf1})
382: \begin{equation}
383: {\rm Prob}\left[\lambda_{\rm max}\le 0, N\right]\sim \exp[-\beta N^2 \Phi_{-}(\sqrt{2})].
384: \label{exp1}
385: \end{equation}
386: The coefficient $\theta= \beta \Phi_{-}(\sqrt{2})$ of the $N^2$ term inside
387: the exponential term in Eq. (\ref{exp1}) is of interest in string theory,
388: and in Ref. \cite{AE}, the authors provided an approximate estimate (for $\beta=1$) of
389: $\theta \approx 1/4$, along with numerical simulations.
390: Recently, in collaboration with D.S. Dean,
391: we were able to compute exactly an explicit expression~\cite{DM} for
392: the full {\em left} large deviation function $\Phi_{-}(y)$.
393: I will not provide the derivation here, but the calculation of
{\em large} deviations turns out to be somewhat simpler~\cite{DM} than the calculation of the {\em small}
deviations \`a la TW. One simply has to minimize the effective free energy
396: of a Coulomb gas using the method of steepest descents and then analyze the
397: resulting saddle point equation (which is an integral equation)~\cite{DM}.
398: This technique is quite useful, as it can be applied to other problems
399: as well, such as the calculation of the average number of stationary points
for Gaussian random fields with $N$ components in the large $N$ limit~\cite{BrayDean,FSW}
401: and also the large deviation function associated with the largest eigenvalue
402: of other types of matrices, such as the Wishart matrices~\cite{VMB}.
403: In terms of the variable $z=y-\sqrt{2}$, the {\em left} large deviation
404: function has the following
405: explicit expression~\cite{DM}
406: \begin{eqnarray}
\Phi_{-}(y=z+\sqrt{2})& =& -\frac{1}{8}(3+2 \ln 2) + \frac{1}{216}\left[ 72z^2 -2z^4
+ (30z + 2z^3) \sqrt{6+z^2} \right.\nonumber \\
409: &+& \left. 27\left( 3 + \ln(1296) - 4 \ln\left(-z +
410: \sqrt{6 +z^2}\right) \right) \right].
411: \label{ldfl}
412: \end{eqnarray}
413: In particular, the constant $\theta$ is given exactly by
414: \begin{equation}
\theta = \beta\, \Phi_{-}(\sqrt{2})= \beta\, \frac{\ln 3}{4} = (0.274653\dots )\,\beta.
416: \label{theta}
417: \end{equation}
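
The explicit formula in Eq. (\ref{ldfl}) is easy to check numerically. The short Python
sketch below is only an illustration and not part of Ref. \cite{DM}; the function name
\texttt{phi\_minus} is hypothetical. It assumes that the term
$(30z+2z^3)\sqrt{6+z^2}$ enters the square bracket with a plus sign, which is the choice
for which $\Phi_{-}(0)=0$. It evaluates $\Phi_{-}(\sqrt{2})$, $\Phi_{-}(0)$ and the ratio of
$\Phi_{-}(y)$ to $y^3/(6\sqrt{2})$ at a small value of $y$.

```python
import math


def phi_minus(y: float) -> float:
    """Left large deviation function of Eq. (ldfl), written in terms of z = y - sqrt(2).

    Assumption: the term (30 z + 2 z^3) sqrt(6 + z^2) carries a plus sign.
    """
    z = y - math.sqrt(2.0)
    s = math.sqrt(6.0 + z * z)
    bracket = (
        72.0 * z**2
        - 2.0 * z**4
        + (30.0 * z + 2.0 * z**3) * s
        + 27.0 * (3.0 + math.log(1296.0) - 4.0 * math.log(-z + s))
    )
    return -(3.0 + 2.0 * math.log(2.0)) / 8.0 + bracket / 216.0


# theta / beta, to be compared with ln(3) / 4
print(phi_minus(math.sqrt(2.0)), math.log(3.0) / 4.0)

# the large deviation function should vanish at the edge of the Wigner sea
print(phi_minus(0.0))

# ratio to the cubic small-y form; should be close to 1 for small y
y = 0.01
print(phi_minus(y) / (y**3 / (6.0 * math.sqrt(2.0))))
```

Only the standard \texttt{math} module is used, so the sketch runs as is.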
418:
419: Another interesting point about the left large deviation function $\Phi_{-}(y)$ is the following.
420: It describes the probability of large $\sim O(\sqrt{N})$ fluctuations to the left of the mean, i.e.,
421: when $y=(\sqrt{2N}-\lambda_{\rm max})/\sqrt{N} \sim O(1)$. Now, if we take the $y\to 0$ limit,
422: then $\Phi_{-}(y)$ should describe the {\em small} fluctuations to the left of the mean $\sqrt{2N}$.
423: In other words, we expect to recover the left tail of the TW distribution by taking the $y\to 0$
424: limit in the left large deviation function. Indeed, as $y\to 0$, one finds from Eq. (\ref{ldfl}),
425: that $\Phi_{-}(y) \approx y^3/{6\sqrt{2}}$. Putting this expression back in Eq. (\ref{ldf1})
426: one gets
427: \begin{equation}
428: {\rm Prob}[\lambda_{\rm max}\le t, N]\approx \exp\left[-\frac{\beta}{24}\big|\sqrt{2}\,
429: N^{1/6}\,(t-\sqrt{2N})\big|^3\right]
430: \label{asymp3}
431: \end{equation}
432: Given that $\chi= \sqrt{2}\,
433: N^{1/6}\,\left(t-\sqrt{2N}\right)$ is the Tracy-Widom scaling variable, we find that the result
434: in Eq. (\ref{asymp3}) matches exactly with the left
435: tail of the Tracy-Widom distribution for all $\beta$.
436: For example, for $\beta=2$ one can easily verify this by comparing Eqs. (\ref{asymp3})
437: and (\ref{asymp1}).
438: This approach not only serves as a useful check that one has obtained the correct
439: large deviation function $\Phi_{-}(y)$, but also provides an alternative and simpler way
440: to derive the asymptotics of the left tail of the TW distribution.
441: A similar expression for the right large deviation function $\Phi_+(y)$ for the
442: Gaussian ensemble is still missing and its computation remains an open problem.
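
The substitution leading to Eq. (\ref{asymp3}) can be spelled out as follows. With
$y=(\sqrt{2N}-t)/\sqrt{N}$ and $\Phi_{-}(y)\approx y^3/(6\sqrt{2})$, the exponent in
Eq. (\ref{ldf1}) becomes
\[
\beta N^2\,\Phi_{-}(y) \approx \beta N^2\,\frac{(\sqrt{2N}-t)^3}{6\sqrt{2}\,N^{3/2}}
= \frac{\beta}{24}\;2\sqrt{2}\,N^{1/2}\,(\sqrt{2N}-t)^3
= \frac{\beta}{24}\,\Big|\sqrt{2}\,N^{1/6}\,(t-\sqrt{2N})\Big|^3 ,
\]
where the last step uses $(\sqrt{2}\,N^{1/6})^3=2\sqrt{2}\,N^{1/2}$. For $\beta=2$ the
exponent is $|x|^3/12$ with $x=\sqrt{2}\,N^{1/6}\,(t-\sqrt{2N})$, which is the second
line of Eq. (\ref{asymp1}).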
443:
444: Although the Tracy-Widom distribution was originally derived as the limiting distribution
445: of the largest eigenvalue of matrices whose elements are drawn from Gaussian distributions,
446: it is now believed that the same limiting distribution also holds for matrices drawn
447: from a larger class of ensembles, e.g., when the entries are independent
448: and identically distributed random variables drawn from an arbitrary distribution
449: with all moments finite~\cite{Sosh,BBP1}.
Recently, Biroli, Bouchaud and Potters~\cite{BBP} extended this result to
451: power-law ensembles, where each entry of a random matrix is drawn independently
452: from a power-law distribution~\cite{CB,Burda}.
453: They showed that
454: as long as the fourth moment of this power-law distribution is finite, the suitably
455: scaled $\lambda_{\rm max}$ is again TW distributed, but when the fourth moment is
456: infinite, $\lambda_{\rm max}$ has Fr\'echet fluctuations~\cite{BBP}. It would be interesting
457: to compute the probability of {\em large} deviations of $\lambda_{\rm max}$
458: for this power-law ensemble, as in the Gaussian case mentioned above. For example,
what is the probability that all the eigenvalues of such random matrices (drawn
460: from the power-law ensemble) are negative (or positive), i.e. $\lambda_{\rm max}\le 0$?
461: This is an open question.
462:
\section{The Longest Increasing Subsequence Problem (or the Ulam Problem)}

The longest increasing subsequence (LIS) problem was first stated by Ulam~\cite{Ulam} in 1961, hence
it is also called Ulam's problem. Since then, a lot of research, mostly by
467: probabilists, has been done on this problem (for a brief history of the problem, see the
468: introduction in Ref. \cite{BDJ}). The problem can be stated very simply as follows.
469: Consider a set of $N$ distinct integers $\{1,2,3,\dots, N\}$. Consider all
470: $N!$ possible permutations of
471: this sequence. For any given
472: permutation, let us find all possible increasing subsequences (terms of a
473: subsequence need not necessarily be consecutive elements) and from them find
474: out the longest one. For example, take $N=10$ and consider a particular
475: permutation $\{8, 2, 7, \underbar 1, \underbar 3, \underbar 4, 10, \underbar 6,
476: \underbar 9, 5\}$. From this sequence, one can form several increasing
477: subsequences such as $\{8,10\}$, $\{2,3,4,10\}$, $\{1,3,4,10\}$ etc. The
478: longest one of all such subsequences is either $\{1,3,4,6,9\}$ as shown by the
479: underscores or $\{2,3,4,6,9\}$. The length $l_N$ of the LIS
480: (in our example $l_N=5$) is a random
481: variable as it varies from one permutation to another. In the Ulam problem one
482: considers all the $N!$ permutations to be equally likely. Given this uniform
483: measure over the space of permutations, what is the statistics of the random
484: variable $l_N$?
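
The definition of $l_N$ can be made concrete with a few lines of Python. The sketch
below is an illustration only; the function name \texttt{lis\_length} is hypothetical,
and the simple quadratic-time dynamic programming is chosen for transparency rather
than speed. For each position it records the length of the longest increasing
subsequence ending there, and it is applied to the $N=10$ permutation of the example above.

```python
from typing import List, Sequence


def lis_length(seq: Sequence[int]) -> int:
    """Length of the longest strictly increasing subsequence (O(N^2) dynamic programming)."""
    # best[i] = length of the longest increasing subsequence that ends at seq[i]
    best: List[int] = []
    for i, x in enumerate(seq):
        longest_before = max((best[j] for j in range(i) if seq[j] < x), default=0)
        best.append(longest_before + 1)
    return max(best, default=0)


perm = [8, 2, 7, 1, 3, 4, 10, 6, 9, 5]
print(lis_length(perm))  # prints 5, e.g. the subsequence 1, 3, 4, 6, 9
```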
485:
Ulam found numerically that the average length $\langle
l_N\rangle$ behaves asymptotically as $\langle l_N\rangle\sim c \sqrt{N}$ for
large $N$. Later this result was established rigorously by Hammersley
489: \cite{Hammersley} and the constant $c=2$ was found by Vershik and Kerov
490: \cite{VK}. Recently, in a seminal paper, Baik, Deift and Johansson (BDJ)
491: \cite{BDJ} derived the full distribution of $l_N$ for large $N$. In particular,
492: they showed that asymptotically for large $N$
493: \begin{equation}
494: l_N \to 2\sqrt {N} + N^{1/6} \chi
495: \label{lis1}
496: \end{equation}
497: where the
498: random variable $\chi$ has a limiting $N$-independent distribution,
499: \begin{equation}
500: {\rm Prob}(\chi\leq x) = F_2(x)
501: \label{gue}
502: \end{equation}
503: where $F_2(x)$ is precisely
504: the TW distribution for the largest eigenvalue of a random matrix
505: drawn from the GUE ($\beta=2$), as defined in Section 3.
506: Note that the power of $N$ in the correction term in Eq. (\ref{lis1}) is ${+1/6}$
507: as opposed to the asymptotic law in Eq. (\ref{tw2}) where the power of $N$ in the correction term
508: is $-1/6$. This means that while for random matrices of size $(N\times N)$, the typical
509: fluctuation of $\lambda_{\rm max}$ around its mean value $\sqrt{2N}$ {\em decreases} with
$N$ as $N^{-1/6}$ as $N\to \infty$ (i.e., the distribution gets narrower and narrower
511: around the mean as $N$ increases), the opposite happens in the Ulam problem: the
512: typical fluctuation in $l_N$ around its mean $2\sqrt{N}$ {\em increases} as $N^{1/6}$
513: with increasing $N$, i.e., the distribution around the mean gets broader and broader
514: with increasing $N$.
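
The scaling in Eq. (\ref{lis1}) can be probed with a small Monte Carlo experiment. The
Python sketch below is an illustration under stated assumptions and not the method of
Ref. \cite{BDJ}: the function names, the values $N=1000$ and $200$ samples, and the random
seed are all arbitrary choices. It draws uniformly random permutations, computes $l_N$
with a standard $O(N\ln N)$ binary-search routine, and prints the sample mean in units
of $\sqrt{N}$ and the sample standard deviation in units of $N^{1/6}$.

```python
import bisect
import math
import random
from typing import List, Sequence


def lis_length_fast(seq: Sequence[int]) -> int:
    """LIS length in O(N log N).

    tails[k] holds the smallest possible last element of an increasing
    subsequence of length k + 1 among the elements read so far.
    """
    tails: List[int] = []
    for x in seq:
        k = bisect.bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)   # x extends the longest subsequence found so far
        else:
            tails[k] = x      # x gives a smaller tail for length k + 1
    return len(tails)


def sample_lis(n: int, samples: int, seed: int = 0) -> List[int]:
    """LIS lengths of `samples` uniformly random permutations of 1..n."""
    rng = random.Random(seed)
    lengths = []
    for _ in range(samples):
        perm = list(range(1, n + 1))
        rng.shuffle(perm)
        lengths.append(lis_length_fast(perm))
    return lengths


n, samples = 1000, 200
lengths = sample_lis(n, samples)
mean = sum(lengths) / samples
var = sum((l - mean) ** 2 for l in lengths) / samples

print(mean / math.sqrt(n))             # expected to lie somewhat below 2 at this N
print(math.sqrt(var) / n ** (1 / 6))   # expected to be of order one
```

At such moderate $N$ the mean is still visibly below $2\sqrt{N}$, since the correction
term $N^{1/6}\chi$ in Eq. (\ref{lis1}) is not negligible compared to the leading term.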
515:
516: BDJ also showed that
517: when the sequence length $N$ itself is a random variable drawn from a
518: Poisson distribution with mean $\langle N\rangle =\lambda$, the length of the LIS converges for
519: large $\lambda$ to
520: \begin{equation}
521: l_{\lambda}\to 2\sqrt{\lambda} + {\lambda}^{1/6} \chi,
522: \label{bdj1}
523: \end{equation}
524: where $\chi$ has the Tracy-Widom distribution $F_2(x)$. The fixed $N$ and the fixed
525: $\lambda$ ensembles are like the canonical and the grand canonical ensembles in
526: statistical mechanics. The
527: BDJ results led to an avalanche of subsequent mathematical works \cite{AD}.
528: \begin{figure}[t]
529: \includegraphics[width=.7\hsize]{psheap.eps}
530: \caption{The construction of piles according to the patience sorting game. The number
531: of piles corresponding to the sequence
532: $\{8,3,5,1,2,6,4,7\}$ is $4$, which is also the length of the LIS of this sequence.}
533: \label{psheap}
534: \end{figure}
535:
536: I will not provide here the derivation of the BDJ results, but I will assume this
537: result to be known and use it later for other problems. As we will see later, in
many problems such as in several growth models, the strategy is to map those models
539: into the LIS problem and subsequently use the BDJ results. In these mappings, typically
540: the height of a growing surface in the $(1+1)$ dimensional growth models gets mapped to
541: the length of the LIS, i.e., schematically, $H \to l_N$. Subsequently, using the BDJ
542: results for the distribution of $l_N$, one shows that the height in growth models
is distributed according to the Tracy-Widom law. I will show explicitly how this
544: strategy works for one specific ballistic deposition model in Section 5.1.
545: But to understand the mapping, we need to know one additional fact about the LIS, which I
546: discuss below.
547:
548: Suppose we are given a specific permutation of $N$ integers.
What is a simple algorithm to find the length of the LIS of this permutation?
The most famous algorithm goes by the name of the Robinson-Schensted-Knuth (RSK)
algorithm~\cite{RSK}, which makes a correspondence between the permutation
and Young tableaux, and has played a very important role in the development
of the LIS problem. I will not discuss it
here; the reader can find a readable account in Ref. \cite{AD}. Instead, I will
555: discuss another related algorithm known as the `patience-sorting' algorithm which
556: will be more useful for our purposes. This algorithm was developed first by Mallows~\cite{Mallows}
557: who showed its connection to the Young tableaux. I will discuss here the version that was
558: discussed recently by Aldous and Diaconis~\cite{AD}. This algorithm is best explained
in terms of an example. Let us take $N=8$ and consider a specific permutation,
560: say $\{8,3,5,1,2,6,4,7\}$. The `patience sorting' is a greedy algorithm
561: that will easily find the length of the LIS of this sequence. It is like
562: a simple card game of `patience'. This game
563: goes as follows: start forming piles with the numbers in the permuted sequence
564: starting with the first element which is $8$ in our example. So, the number 8
565: forms the base of the first pile (see Fig. \ref{psheap}). The next element, if less than 8, goes on
top of 8. If not, it forms the base of a new pile. One follows a greedy
algorithm: for any new element of the sequence, check the top numbers on
the existing piles, starting from the first pile, and if the new number is less
than the top number of an existing pile, it goes on top of the first such pile.
570: If the new number is larger than all the top numbers of the existing piles,
571: this new number forms the base of a new pile. Thus in our example, we form $4$
572: distinct piles: $[\{8,3,1\}, \{5,2\}, \{6,4\}, \{7\}]$. Thus the number of piles
573: is $4$. On the other hand, for this particular example, it is easy to check
574: that there are $3$ LIS's namely, $\{3,5,6,7\}$, $\{1,2,6,7\}$ and $\{1,2,4,7\}$, all of the same
575: length $l=4$. So, we see that the length of the LIS is $4$, same as the number of
576: piles in the patience sorting game. But this is not an accident. One can
577: easily prove~\cite{AD} that for any given permutation of $N$ integers, the length of the
578: LIS $l_N$ is exactly the same as the number of piles in the corresponding `patience sorting'
579: algorithm. We will see later that this fact does indeed play a crucial role in our mapping
580: of growth models to the LIS problem.
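
For concreteness, the piles for the example above can be traced card by card (piles are listed from left to right, each written from base to top):
\begin{center}
\begin{tabular}{cl}
card & piles after placing the card \\
\hline
8 & $\{8\}$ \\
3 & $\{8,3\}$ \\
5 & $\{8,3\},\ \{5\}$ \\
1 & $\{8,3,1\},\ \{5\}$ \\
2 & $\{8,3,1\},\ \{5,2\}$ \\
6 & $\{8,3,1\},\ \{5,2\},\ \{6\}$ \\
4 & $\{8,3,1\},\ \{5,2\},\ \{6,4\}$ \\
7 & $\{8,3,1\},\ \{5,2\},\ \{6,4\},\ \{7\}$
\end{tabular}
\end{center}
The trace also suggests why the number of piles equals $l_N$; what follows is only a sketch of the standard argument, and the full proof is in Ref.~\cite{AD}. The numbers within a pile appear in the sequence in decreasing order, so an increasing subsequence can use at most one number from each pile, and hence $l_N$ cannot exceed the number of piles. Conversely, whenever a number is placed on the $j$-th pile with $j>1$, the current top of the $(j-1)$-th pile is smaller and occurred earlier in the sequence. Following these links backwards from any number in the last pile yields an increasing subsequence with one number from every pile. In the example, starting from $7$ one obtains $7\to 4\to 2\to 1$, i.e., the LIS $\{1,2,4,7\}$.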
581:
582: \section{Directed Polymers and Growth Models}
583:
584: The problem of directed polymers in random medium has been an active area
585: of research in statistical physics for the past three decades.
586: Apart from the fact that it is a simple `toy' model of disordered systems,
587: the directed polymer problem has important links to a wide variety
588: of other problems in physics, such as interface fluctuations and pinning~\cite{HH},
589: growing interface models of the Kardar-Parisi-Zhang (KPZ)
variety~\cite{KPZ}, the randomly forced Burgers equation in fluid dynamics~\cite{FNS},
591: spin glasses~\cite{DS1,Mezard,FH},
592: and also to a single-particle quantum mechanics problem in a time-dependent random
593: potential~\cite{Kardar}. There are many interesting issues associated
594: with the directed polymer problem, such as the phase-transition at a finite
temperature in the $(d+1)$-dimensional directed polymer when $d>2$~\cite{IS1}, the nature
of the low temperature phase~\cite{Mezard,FH}, the nature of the transverse fluctuations~\cite{KZ,HH},
etc. The literature on the subject is huge (for a review see Ref. \cite{HZ}).
598:
599:
Here we will focus on a zero-temperature lattice version of the directed polymer
problem, illustrated in Fig. \ref{dp}.
602: Consider a square lattice with $O$ denoting the origin.
603: On each site with coordinates $(i,j)$ of this lattice, there is a random energy
604: $\epsilon_{i,j}$, drawn
605: independently
606: from site to site, but from the identical distribution $\rho(\epsilon)$. For simplicity, we
will consider that $\epsilon_{i,j}$'s are all negative, i.e., $\rho(\epsilon)$ has support
only over $\epsilon\in (-\infty,0]$. The energy variables $\epsilon_{i,j}$'s are quenched
609: random variables.
610: \begin{figure}[t]
611: \includegraphics[width=.7\hsize]{dp.eps}
612: \caption{Directed polymer in $(1+1)$ dimensions with random site energies.}
613: \label{dp}
614: \end{figure}
615:
616: We are interested here only in directed walks for simplicity.
617: Consider all possible directed walk configurations (a walk that can move only
618: north or eastward as shown in Fig. \ref{dp}) that start from the origin $O$ and end up
619: at a fixed point, say $P$ with co-ordinates $(x,y)$. An example of such a walk
620: is shown in Fig. \ref{dp}.
621: The total energy $E(W)$ of any given walk $W$ from $O$ to $P$ is just the sum of site energies along the path
622: $W$,
623: $E(W)= \sum_{i\in W} \epsilon_i$. Thus, for fixed $O$ and $P$ (the endpoints), the energy of a
624: path varies from one path to another (all having
625: the same endpoints $O$ and $P$). The path having the minimum energy (optimal path) among these will
626: correspond to the ground state configuration, i.e., the polymer will prefer to choose
627: this optimal path at zero temperature. Let $E_0(x,y)$ denote this minimum energy amongst
628: all directed paths that start at $O$ and finish at $P:(x,y)$. Now, this minimum energy
629: $E_0(x,y)$ is, of course, a random variable since it fluctuates from one configuration
630: of quenched disorder to another. One is interested in the statistics of $E_0(x,y)$ for
631: a given fixed $(x,y)$. For example, what is the probability distribution of $E_0(x,y)$
632: given that $\epsilon_{x,y}$'s are independent and identically distributed random variables each
633: drawn from $\rho(\epsilon)$?
634:
635: Mathematically, one can write an `evolution' equation or recursion relation for the variable $E_0(x,y)$.
636: Indeed, the path that ends up at say $(x,y)$, must have visited either the site $(x-1,y)$
637: or the site $(x,y-1)$ at the previous step. Then clearly,
638: \begin{equation}
639: E_0(x,y) = {\rm min}\left[E_0(x-1,y), E_0(x,y-1)\right] + \epsilon_{x,y}
640: \label{dpr1}
641: \end{equation}
642: where $\epsilon_{x,y}$ denotes the random energy associated with the site $(x,y)$.
643: Alternately, we can define $H(x,y)=-E_0(x,y)$ which are all positive variables that
644: satisfy the recursion relation
645: \begin{equation}
646: H(x,y) = {\rm max}\left[H(x-1,y), H(x,y-1)\right] + \xi_{x,y}
647: \label{dpr2}
648: \end{equation}
where $\xi_{x,y}=-\epsilon_{x,y}$ are positive random variables. The recursion
relation in Eq. (\ref{dpr2}) is non-linear, and hence it is difficult to find the
distribution of $H(x,y)$ even when the distribution of the $\xi_{x,y}$'s is known.
652: Note that, by interpreting $t=x+y$ as a time-like variable, and denoting
653: by $i$ the transverse coordinate at a fixed $t$, this recursion
654: relation can also be interpreted as a stochastic evolution equation,
655: \begin{equation}
656: H(i,t) = {\rm max}\left[H(i+1,t-1), H(i-1,t-1)\right] + \xi_{i,t}
657: \label{dpr3}
658: \end{equation}
659: where the site energy $\xi_{i,t}$ can now be interpreted as a stochastic noise.
In this interpretation, one can think of the directed polymer as a growth
model of a $(1+1)$-dimensional interface where $H(i,t)$ denotes the height of the interface
at the site $i$ of a one-dimensional lattice at time $t$. Only, in this version, the
663: length of one dimensional lattice or the substrate keeps increasing linearly with
664: time $t$. In this respect, it corresponds to a special version of a polynuclear
665: growth model where growth occurs on top of a single droplet whose linear size
666: keeps increasing uniformly with time.
667: There are, of course, several other variations of this simple directed
668: polymer model~\cite{HZ}. For example, one can consider a version
669: where the random energies are associated with bonds, rather than the sites.
670: Similarly, one can consider a finite temperature version of the model.
671: In the corresponding analogy to the interface model, at finite temperature, the free energy
672: (as opposed to the ground state energy) of the polymer corresponds to the
673: height variable of the interface. This is most easily seen in the continuum formulation
674: of the model by writing down the partition function as a path integral
675: and then showing directly that $H=\ln Z$ satisfies the KPZ equation~\cite{HHF}.
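
To see explicitly how Eq. (\ref{dpr3}) follows from Eq. (\ref{dpr2}), one may take, as one possible convention (the text above only specifies $t=x+y$), the transverse coordinate to be $i=x-y$. The two predecessor sites then map as
\[
(x-1,y)\ \longmapsto\ (i-1,\,t-1), \qquad (x,y-1)\ \longmapsto\ (i+1,\,t-1),
\]
which are precisely the two arguments of the maximum in Eq. (\ref{dpr3}). Moreover, iterating Eq. (\ref{dpr2}) down to the origin expresses the height as a maximum over directed paths $W$, consistent with the definition $H(x,y)=-E_0(x,y)$:
\[
H(x,y)=\max_{W:\,O\to (x,y)}\ \sum_{(x',y')\in W}\xi_{x',y'}.
\]
For instance, on a single plaquette there are only two directed paths from $O=(0,0)$ to $(1,1)$, and
\[
H(1,1)=\xi_{0,0}+{\rm max}\left[\xi_{1,0},\,\xi_{0,1}\right]+\xi_{1,1},
\]
assuming that the energy of the site $O$ is included in the sum and that sites outside the quadrant $x,y\ge 0$ are never visited.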
676:
677: A lot is known about the first and the second moment of $H(x,y)$ (or alternatively
678: for $H(i,t)$ in the height language)
679: and the associated universality properties~\cite{Mezard,FH,KMH}. For example, from simple
680: extensivity properties, one would expect that average ground state energy
681: of the path will increase linearly with the size (number of steps $t$) of the path.
682: In terms of height, this means $\langle H(i,t)\rangle \to v(i) t$ for large $t$
where $v(i)$ is the velocity of the interface at site $i$ of the one dimensional
684: lattice~\cite{KH}. Also, the standard deviation of height,
685: say of $H(x,x)$ (along the diagonal),
686: is known to grow universally, for large $x$ as $x^{1/3}$~\cite{HZ}. For the interface, this means
687: that the typical height fluctuation grows as $t^{1/3}$ for large $t$, a result
688: that is known from the KPZ problem in $1$-dimension (via a mapping to the noisy
689: Burgers equation).
However, much less was known about the full distribution
of $H(x,y)$ until recently.
692:
693: Johansson~\cite{J1} was able to derive the full asymptotic distribution of $H(x,y)$
694: evolving via Eq. (\ref{dpr2}) for a specific disorder distribution, where the noise
$\xi_{x,y}$'s in Eq. (\ref{dpr2}) are i.i.d.\ variables taking nonnegative integer
values according to the distribution: ${\rm Prob}(\xi_{x,y}=k)= (1-p)\, p^k$ for $k=0,1,2,\dots$,
where $0<p<1$ is a parameter.
Interestingly, exactly the same recursion relation as in Eq. (\ref{dpr2}),
with the same disorder distribution as in Johansson's model,
appeared independently around the same time in an anisotropic directed percolation
701: problem studied by Rajesh and Dhar~\cite{RD}, a problem to which we will come back
702: later when we discuss the sequence matching problem. The authors in Ref.~\cite{RD} were able
703: to compute exactly the first moment, but Johansson computed the full asymptotic
704: distribution. He showed that for large $x$ and $y$~\cite{J1}
705: \begin{eqnarray}
H(x,y) &\to& \frac{2\sqrt{pxy}+p(x+y)}{q} \nonumber \\
707: &+& \frac{(pxy)^{1/6}}{q}\,\left[(1+p)+\sqrt{\frac{p}{xy}}\,(x+y)\right]^{2/3}
708: \, \chi
709: \label{j1}
710: \end{eqnarray}
711: where $q=1-p$, $\chi$ is a random variable with the Tracy-Widom distribution, ${\rm Prob}(\chi\le x)=F_2(x)$
712: as in Eq. (\ref{gue}). If one sets $x=y=t/2$, then for the growing droplet interpretation, it would
713: mean that the height $H(i=0,t)$ has a mean that grows linearly with $t$ and a standard deviation
714: that grows as $t^{1/3}$ and when properly centered and scaled, the distribution of $H(0,t)$
715: tends to the GUE Tracy-Widom distribution. Around the same time, Pr\"ahofer and Spohn derived
a similar result for a class of PNG models~\cite{PS}. Moreover, they were able to show that not just
$F_2(x)$,
but other Tracy-Widom distributions such as $F_1(x)$ (corresponding to the GOE)
also arise in the PNG model when one starts from different initial conditions~\cite{PS}.
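
The diagonal case can be worked out explicitly from Eq. (\ref{j1}). Setting $x=y=t/2$ and using $q=1-p=(1-\sqrt{p})(1+\sqrt{p})$, the bracket reduces to $(1+p)+2\sqrt{p}=(1+\sqrt{p})^2$, and one finds
\[
H\left(\frac{t}{2},\frac{t}{2}\right)\ \to\ \frac{\sqrt{p}}{1-\sqrt{p}}\; t
\;+\; \frac{p^{1/6}\,(1+\sqrt{p})^{1/3}}{1-\sqrt{p}}\,\left(\frac{t}{2}\right)^{1/3}\chi ,
\]
from which both the linear growth of the mean and the $t^{1/3}$ scaling of the fluctuations can be read off directly.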
720:
\subsection{Exact Height Distribution in a Ballistic Deposition Model}
722:
723: In this subsection, we will show explicitly how one can derive the exact height distribution
724: in a specific $(1+1)$ dimensional growth model and show that it has a limiting Tracy-Widom
725: distribution. This example will illustrate explicitly how one maps a growth model
726: to the LIS problem~\cite{BD}. A similar mapping was used by Pr\"ahofer and Spohn
727: for the PNG model~\cite{PS}. But before we illustrate the mapping, it is useful
to remark (i) why one studies such growth models and (ii) what this mapping
and the subsequent calculation of the height distribution achieve.
730:
The answers to these two questions are as follows. We know that growth processes are
732: ubiquitous in nature. The past few decades have seen
extensive research on a wide variety of both discrete and continuous growth models
734: \cite{Meakin,KS,HZ}. A large class of these growth models in $(1+1)$ dimensions
735: such as the Eden model
736: \cite{Eden}, restricted solid on solid (RSOS) models \cite{RSOS}, directed
737: polymers as mentioned before~\cite{HZ}, polynuclear growth models (PNG) \cite{PNG} and ballistic
738: deposition models (BD)~\cite{BaD} are believed to belong to the same
739: universality class as that of the Kardar-Parisi-Zhang (KPZ) equation describing the
740: growth of interface fluctuations \cite{KPZ}. This universality is, however,
741: somewhat restricted in the sense that it refers only to the width or the second
742: moment of the height fluctuations characterized by two independent exponents
743: (the growth exponent $\beta$ and the dynamical exponent $z$) and the associated
744: scaling function. Moreover, even this restricted universality is established
mostly numerically. Only in very few special discrete models in $(1+1)$ dimensions can the
exponents $\beta=1/3$ and $z=3/2$ be computed exactly via the Bethe ansatz
technique \cite{Bethe}. A natural and important question is whether this
748: universality can be extended beyond the second moment of height fluctuations.
For example, is the full distribution of the height fluctuations (suitably
scaled) universal, i.e., is it the same for different growth models belonging to
751: the KPZ class? Moreover, the KPZ-type equations are usually attributed to
752: models with small gradients in the height profile and the question whether the
753: models with large gradients (such as the BD models) belong to the KPZ universality class is still
754: open. The connection between the discrete BD models and the continuum KPZ equation
has recently been elucidated \cite{KS1}.
756:
To test whether this more stringent form of universality (of the full
distribution, going beyond the second moment) holds,
one needs to calculate the full height distribution in different models which are known
to belong to the KPZ universality class as far as the second moment is concerned.
761: In fact, as mentioned earlier, Pr\"ahofer and Spohn were able to calculate the asymptotic height
762: distribution in a class of PNG models and showed that it has the Tracy-Widom distribution~\cite{PS}.
Similarly, we mentioned earlier that Johansson~\cite{J1} established rigorously that
764: the height distribution,
765: in a specific version of the directed polymer model, is of the Tracy-Widom form.
Subsequently, several other recent works~\cite{GTW}, including the one on the ballistic deposition
model~\cite{BD} that we will discuss below, showed that
all these $(1+1)$ dimensional growth models indeed share the same scaled height distribution
(Tracy-Widom), thus putting the universality on a much stronger footing, going beyond just the
second moment.
771:
772: We now focus on a specific ballistic deposition model. Ballistic deposition models typically
try to mimic columnar growth that occurs in many natural systems and have been studied
774: extensively in the past with a variety of microscopic rules~\cite{Krug2,BaD}, though an exact calculation
775: of the height distribution remained elusive in any of these microscopic models. In collaboration
776: with S. Nechaev, we found a particular ballistic deposition model which can be explicitly mapped
777: to the LIS problem and hence the full asymptotic height distribution can be computed
778: exactly~\cite{BD}.
779: In our $(1+1)$-D (here $D$ stands for `dimensional') BD model columnar growth occurs sequentially on a linear
780: substrate
781: consisting of $L$ columns with free boundary conditions. The time $t$ is
782: discrete and is increased by $1$ with every deposition event. We first consider
783: the flat initial condition, i.e., an empty substrate at $t=0$. Other initial
784: conditions will be treated later. At any stage of the growth, a column (say the
$k$-th column) is chosen at random with probability $p=\frac{1}{L}$ and a
``brick'' is deposited there which increases the height of this column by one
unit, $H_k\to H_k+1$. Once this ``brick'' is deposited, it screens all the sites
at the same level in all the columns to its right from future deposition, i.e.,
the heights at all the columns to the right of the $k$-th column must be
greater than or equal to $H_k+1$ at all subsequent times. For example,
791: in Fig. \ref{fig:1}, the first brick (denoted by 1) gets deposited at $t=1$ in
792: the 4-th column and it immediately screens all the sites to its right. Then the
793: second brick (denoted by 2) gets deposited at $t=2$ again in the same 4-th
794: column whose height now becomes 2 and thus the heights of all the columns to
795: the right of the 4-th column must be $\ge 2$ at all subsequent times and so on.
796: Formally such growth is implemented by the following update rule. If the $k$-th site
797: is chosen at time $t$ for deposition, then
798: \begin{equation}
799: H_k(t+1)={\rm max}\{H_k(t), H_{k-1}(t), \dots, H_1(t)\}+1.
800: \label{update1}
801: \end{equation}
802: The model is anisotropic and evidently even the average height profile $\langle
803: H_k(t) \rangle$ depends nontrivially on both the column number $k$ and time
804: $t$. Our goal is to compute the asymptotic height distribution $P_k(H,t)$ for
805: large $t$.
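
As a purely illustrative example of the update rule in Eq. (\ref{update1}) (the deposition sequence below is hypothetical and is not the one shown in Fig. \ref{fig:1}), take $L=4$ columns, an initially empty substrate, and suppose that the columns $k=2,4,1,3$ are chosen in four successive deposition events. Writing the profile as $(H_1,H_2,H_3,H_4)$, one gets
\[
(0,0,0,0)\ \stackrel{k=2}{\longrightarrow}\ (0,1,0,0)\ \stackrel{k=4}{\longrightarrow}\ (0,1,0,2)\ \stackrel{k=1}{\longrightarrow}\ (1,1,0,2)\ \stackrel{k=3}{\longrightarrow}\ (1,1,2,2).
\]
In the second step the brick lands at height ${\rm max}\{0,0,1,0\}+1=2$ because the fourth column is screened by the brick already present in the second column, and in the last step the third column jumps directly from $0$ to ${\rm max}\{0,1,1\}+1=2$. Only the chosen column is updated in each step, as Eq. (\ref{update1}) prescribes.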
806: \begin{figure}
807: %\centerline{\epsfig{file=bdm.eps,width=5cm}}
808: \includegraphics[width=.7\hsize]{bdm.eps}
809: \caption{Growth of a heap with asymmetric long-range interaction. The numbers
810: inside cells show the times at which the blocks are added to the heap.}
811: \label{fig:1}
812: \end{figure}
813:
814: It is easy to find the height distribution $P_1(H, t)$ of the first column,
815: since the height there does not depend on any other column. At any stage, the
816: height in the first column either increases by one unit with probability
817: $p=\frac{1}{L}$ (if this column is selected for deposit) or stays the same with
818: probability $1-p$. Thus $P_1(H,t)$ is simply the binomial distribution,
$P_1(H,t)={t\choose H}p^H(1-p)^{t-H}$ with $H\leq t$. The average height of the
820: first column thus increases as $\langle H_1(t)\rangle=pt$ for all $t$ and its
821: variance is given by $\sigma_1^2(t)= tp(1-p)$. While the first column is thus
822: trivial, the dynamics of heights in other columns is nontrivial due to the
823: right-handed infinite range interactions between the columns. For
824: convenience, we subsequently measure the height of any other column with respect to the
825: first one. Namely, by height $h_k(t)$ we mean the height difference between the
826: $(k+1)$-th column and the first one, $h_k(t)=H_{k+1}(t)-H_1(t)$, so that
827: $h_0(t)=0$ for all $t$.
828:
829: To make progress for columns $k>0$, we first consider a
830: (2+1)-D construction of the heap as shown in Fig. \ref{fig:2}, by adding an extra
831: dimension indicating the time $t$. In Fig. \ref{fig:2}, the $x$ axis denotes the
832: column number, the $y$ axis stands for the time $t$ and the $z$ axis is the
height $h$. In this figure, every time a new block is added, it ``wets'' all the
sites at the same level to its ``east'' (along the $x$ axis) and to its ``north''
(along the time axis). Here ``wetting'' means ``screening'' from
further deposition at those sites at the same level. This $(2+1)$-D system of
``terraces'' is in one-to-one correspondence with the $(1+1)$-D heap in
838: Fig. \ref{fig:1}. This construction is reminiscent of the 3D anisotropic
839: directed percolation (ADP) problem studied by Rajesh and Dhar \cite{RD}. Note however,
840: that unlike the ADP problem, in our case each row labelled by $t$ can contain
841: only one deposition event.
842: \begin{figure}
843: %\centerline{\epsfig{file=d3.eps,width=8cm}}
844: \includegraphics[width=.7\hsize]{d3.eps}
\caption{$(2+1)$ dimensional ``terraces'' corresponding to the growth of a heap
in Fig. \ref{fig:1}.}
847: \label{fig:2}
848: \end{figure}
849:
850: The next step is to consider the projection onto the 2D $(x,y)$-plane of the
851: level lines separating the adjacent terraces whose heights differ by $1$. In
852: this projection, some of the level lines may overlap partially on the plane.
To avoid the overlap and make the picture clearer, we make a shift
854: $(x,y)\to (x+h(x,y),y)$ and represent these shifted directed lines on the 2D
855: plane in Fig. \ref{fig:3}.
856: The black dots in Fig. \ref{fig:3} denote the points
857: where the deposition events took place and the integer next to a dot denotes
858: the time of this event. Note that each row in Fig. \ref{fig:3} contains a single
859: black dot, i.e., only one deposition per unit of time can occur. In
860: Fig. \ref{fig:3}, there are 8 such events whose deposition times form the
861: sequence $\{1,2,3,4,5,6,7,8\}$ of length $N=8$. Now let us read the deposition times of the
862: dots sequentially, but now column by column and vertically from top to bottom
863: in each column, starting from the leftmost one. Then this sequence reads
864: $\{8,3,5,1,2,6,4,7\}$ which is just a permutation of the original sequence
865: $\{1,2,3,4,5,6,7,8\}$. In the permuted sequence $\{8,3,5,1,2,6,4,7\}$ there are
866: $3$ LIS's: $\{3,5,6,7\}$, $\{1,2,6,7\}$ and $\{1,2,4,7\}$, all of the same
867: length $l_N=4$. As mentioned before (see Fig. \ref{psheap}), this is precisely
868: the number of piles in the patience sorting of the permutation
869: $\{8,3,5,1,2,6,4,7\}$.
870:
871: \begin{figure}
872: %\centerline{\epsfig{file=permu.eps,width=5cm}}
873: \includegraphics[width=.7\hsize]{permu.eps}
874: \caption{The directed lines are the level lines separating adjacent terraces
with height difference $1$ in Fig. \ref{fig:2}, projected onto the $(x,y)$ plane and
876: shifted by $(x,y)\to (x+h(x,y),y)$ to avoid partial overlap. The black dots
877: denote the deposition events. The numbers next to the dots denote the times of
878: those deposition events.}
879: \label{fig:3}
880: \end{figure}
881:
882: Let us note one immediate fact from Fig. \ref{fig:3}. The numbers
883: belonging to the different level lines in Fig. \ref{fig:3} are in one-to-one
884: correspondence with the piles $[\{8,3,1\}, \{5,2\}, \{6,4\},\{7\}]$ in
the Aldous--Diaconis patience sorting game. Hence, each pile can be identified with
a unique level line. Now, the height $h(x,t)$ at any given point $(x,t)$ in
887: Fig. \ref{fig:3} is equal to the number of level lines inside the rectangle
888: bounded by the corners: $[0,0], [x,0], [0,t], [x,t]$. Thus, we have
the correspondence: height $\equiv$ number of level lines $\equiv$ number of piles $\equiv$
890: length $l_n$ of the LIS. However, to compute $l_n$, we need to know the value of $n$ which
891: is precisely the number of black dots inside this rectangle.
892:
Once the problem is reduced to finding the number of black dots or deposition events, we
no longer need Fig. \ref{fig:3} (as the visual shift
$(x,y)\to (x+h(x,y),y)$ may be confusing) and can go back to Fig. \ref{fig:2}, where the
north-to-east corners play the same role as the black dots in Fig. \ref{fig:3}.
897: In Fig. \ref{fig:2}, to determine the height $h_k(t)$ of the $k$-th column at
898: time $t$, we need to know the number of deposition events inside the $2$D plane
899: rectangle $R_{k,t}$ bounded by the four corners $[0,0], [k,0], [0,t], [k,t]$.
900: Let us begin with the last column $k=L$. For $k=L$ the number of deposition
901: events $N$ in the rectangle $R_{L,t}$ is equal to the time $t$ because there is
only one deposition event per unit time. In our example $N=t=8$. For a general $k<L$
903: the number of deposition events $N$ inside the rectangle $R_{k,t}$ is a random
904: variable, since some of the rows inside the rectangle may not contain a
905: north-to-east corner or a deposition event. The probability distribution
906: $P_{k,t}(N)$ (for a given $[k,t]$) of this random variable can, however, be
907: easily found as follows. At each step of deposition, a column is chosen at
908: random from any of the $L$ columns. Thus, the probability that a north-to-east
909: corner will fall on the segment of line $[0,k]$ (where $k\leq L$) is equal to
910: $k/L$. The deposition events are completely independent of each other,
911: indicating the absence of correlations between different rows labelled by $t$ in
912: Fig. \ref{fig:2}. So, we are asking the question: given $t$ rows, what is the
913: probability that $N$ of them will contain a north-to-east corner? This is
914: simply given by the binomial distribution
915: \begin{equation}
916: P_{k,t}(N) = {t\choose N } \left({\frac {k}{L}} \right)^N
917: \left(1-{\frac {k}{L}}\right)^{t-N},
918: \label{binom1}
919: \end{equation}
920: where $N\leq t$. Now we are reduced to the following problem: given a sequence
921: of integers of length $N$ (where $N$ itself is random and is taken from the
distribution in Eq. (\ref{binom1})), what is the length of the LIS? Recall that
923: this length is precisely the height $h_k(t)$ of the $k$-th column at time $t$
924: in our model. In the thermodynamic limit $L\to \infty$ for $t\gg 1$ and any
925: fixed $k$ such that the quotient $\lambda=\frac{tk}{L}$ remains fixed but is
arbitrary, the distribution in Eq. (\ref{binom1}) becomes a Poisson distribution
927: $P(N)\to e^{-\lambda} \frac {\lambda^N}{N!}$, with the mean
928: $\lambda=\frac{tk}{L}$. We can then directly use the BDJ result in
Eq. (\ref{bdj1}) to predict our main result for the height in the BD model,
930: \begin{equation}
931: h_k(t) \to 2\sqrt{\frac{tk}{L}} + \left(\frac{tk}{L}\right)^{1/6} \chi,
932: \label{result1}
933: \end{equation}
934: for large $\lambda=tk/L$, where the random variable $\chi$ has the
935: Tracy-Widom distribution $F_2(\chi)$ as in Eq. (\ref{gue}).
936: Using the known exact value $\langle \chi\rangle
937: =-1.7711...$ from the Tracy-Widom distribution \cite{TW1}, we find exactly the
938: asymptotic average height profile in the BD model,
939: \begin{equation}
940: \langle h_k(t)\rangle \to 2\sqrt{\frac{tk}{L}}-
941: 1.7711...\left(\frac{tk}{L}\right)^{1/6}.
942: \label{avgh}
943: \end{equation}
944: The leading square root dependence of the profile on the column number $k$ has
945: been seen numerically. Eq. (\ref{avgh}) also predicts an
946: exact sub-leading term with $k^{1/6}$ dependence. Similarly, for the variance,
947: $\sigma_k^2(t)=\langle [h_k(t)-\langle h_k(t)\rangle]^2 \rangle$, we find
948: asymptotically: $\sigma_k^2(t)\to c_0\left(\frac{tk}{L}\right)^{1/3}$, where
949: $c_0=\langle [\chi-\langle \chi \rangle]^2\rangle=0.8132...$ \cite{TW1}.
950: Eliminating the $t$ dependence for large $t$ between the average and the
951: variance, we get, $\sigma_k^2(t)\approx a {\langle h_k(t)\rangle}^{2\beta}$
952: where the constant $a=c_0/2^{2/3}=0.51228\dots$ and $\beta=1/3$, thus
953: recovering the KPZ scaling exponent.
954: In addition to the BD model with infinite range right-handed
955: interaction reported here,
956: we have also analyzed the model (analytically within a mean field theory and numerically)
957: when the right-handed interaction is short ranged.
Somewhat surprisingly and pleasantly, we found that
959: the asymptotic average height profile is independent of the range of interaction.
960: A recent analysis of the short range BD model sheds light on this fact~\cite{KNV}.
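
Two steps in the derivation leading from Eq. (\ref{binom1}) to the scaling of the variance can be spelled out. First, writing $k/L=\lambda/t$ in Eq. (\ref{binom1}) and taking $t\to\infty$ at fixed $\lambda$ and $N$, the standard Poisson limit of the binomial distribution gives
\[
{t\choose N}\left(\frac{\lambda}{t}\right)^{N}\left(1-\frac{\lambda}{t}\right)^{t-N}
=\frac{\lambda^{N}}{N!}\;\frac{t(t-1)\cdots(t-N+1)}{t^{N}}\;\left(1-\frac{\lambda}{t}\right)^{t-N}
\ \longrightarrow\ \frac{\lambda^{N}}{N!}\,e^{-\lambda}.
\]
Second, keeping only the leading term of Eq. (\ref{avgh}), $\langle h_k(t)\rangle\approx 2\lambda^{1/2}$ with $\lambda=tk/L$, one has $\lambda\approx \langle h_k(t)\rangle^{2}/4$, so that
\[
\sigma_k^2(t)\approx c_0\,\lambda^{1/3}\approx c_0\left(\frac{\langle h_k(t)\rangle^{2}}{4}\right)^{1/3}
=\frac{c_0}{2^{2/3}}\,\langle h_k(t)\rangle^{2/3},
\]
which identifies $a=c_0/2^{2/3}$ and $2\beta=2/3$.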
961:
962: So far, we have demonstrated that for a flat initial condition, the height fluctuations in the
963: BD model follow the Tracy-Widom distribution $F_{\rm GUE}(x)$ which corresponds to
964: the distribution of the largest eigenvalue of a random matrix drawn from a Gaussian unitary ensemble.
965: In the context of the PNG model, Pr\"ahofer and Spohn \cite{PS} have shown that while the height
966: fluctuations of a single PNG droplet follow the distribution $F_{\rm GUE}(x)$, it is possible to
967: obtain other types of universal distributions as well. For example, the height fluctuations
968: in the PNG model growing over a flat substrate follow the
969: distribution $F_{\rm GOE}(x)$ where $F_{\rm GOE}(x)$ is the distribution of the largest
970: eigenvalue of a random matrix drawn from the Gaussian orthogonal ensemble. Besides,
971: in a PNG droplet with two external sources at its edges which nucleate with rates
972: $\rho_{+}$ and $\rho_{-}$, the height fluctuations have different distributions depending
973: on the values of $\rho_{+}$ and $\rho_{-}$. For $\rho_{+}<1$ and $\rho_{-}<1$, one gets back
974: the distribution $F_{\rm GUE}(x)$. If however $\rho_{+}=1$ and $\rho_{-}<1$ (or alternatively
975: $\rho_{-}=1$ and $\rho_{+}<1$), one gets the distribution $F_{\rm GOE}^2(x)$ which corresponds to
976: the distribution of the largest of the superimposed eigenvalues of two independent
977: GOE matrices. In the critical case $\rho_{+}=1$ and $\rho_{-}=1$, one gets a new
978: distribution $F_0(x)$ which does not have any random matrix analogy. For $\rho_{+}>1$
and $\rho_{-}>1$, one gets a Gaussian distribution. These results for the PNG model were obtained in
980: Ref. \cite{PS} using a powerful theorem of Baik and Rains \cite{BR1}.
981:
982: The question naturally arises as to whether these other distributions, apart from the $F_{\rm GUE}(x)$,
983: can also appear in the BD model considered in this paper. Indeed, they do. For example, if
984: one starts with a staircase initial condition $h_k(0)=k$ for the heights in the BD model,
985: one gets the distribution $F_{\rm GOE}^2(x)$ for the scaled variable $\chi$. This follows from the
fact that for the staircase initial condition, in Fig. \ref{fig:2} there will be a black dot (or a north-to-east
987: corner) at every value of $k$ on the $k$ axis at $t=0$. Thus the black dots appear on the $k$ axis
988: with unit density. This
989: corresponds to the case $\rho_{+}=1$
990: and $\rho_{-}=0$ of the general results of Baik and Rains which leads to a $F_{\rm GOE}^2(x)$
991: distribution. Of course, the density $\rho_{+}$ can be tuned between $0$ and $1$, by tuning
992: the average slope of the staircase. For a generic $0<\rho_{+}\leq 1$, one can also
993: vary $\rho_{-}$ by putting an external source at the first column.
994: Thus one can obtain, in principle, most of the distributions discussed in Ref. \cite{BR1} by varying
995: $\rho_{+}$ and $\rho_{-}$.
996: Note that the
997: case $\rho_{-}=1$ (external source which drops one particle at the first column at every time step) and
998: $\rho_{+}=0$ (flat substrate) is, however, trivial since the surface then remains flat
999: at all times and the height just increases by one unit at every time step. The distribution
1000: $F_{\rm GOE}(x)$ is, however, not naturally accessible within the rules of our model.
1001:
1002: \section{Sequence Matching Problem}
1003:
In this section, I will discuss a different problem, namely the alignment of two
random sequences, and will illustrate how the Tracy-Widom distribution appears in this
problem. This is based on joint work with S. Nechaev~\cite{MN}.
1007:
1008: Sequence alignment is one of the most useful quantitative methods used in
1009: evolutionary molecular biology\cite{W1,Gusfield,DEKM}. The goal of an alignment
1010: algorithm is to search for similarities in patterns in different sequences. A
1011: classic and much studied alignment problem is the so called `longest common
1012: subsequence' (LCS) problem. The input to this problem is a pair of sequences
1013: $\alpha=\{\alpha_1, \alpha_2,\dots, \alpha_i\}$ (of length $i$) and
1014: $\beta=\{\beta_1, \beta_2,\dots, \beta_j\}$ (of length $j$). For example, $\alpha$
1015: and $\beta$ can be two random sequences of the $4$ base pairs $A$, $C$, $G$, $T$ of
1016: a DNA molecule, e.g., $\alpha=\{A, C, G, C, T, A, C\}$ and $\beta=\{C, T, G, A,
1017: C\}$. A subsequence of $\alpha$ is an ordered sublist of $\alpha$ (entries of which
need not be consecutive in $\alpha$), e.g., $\{C, G, T, C\}$, but not $\{T, G, C\}$.
1019: A common subsequence of two sequences $\alpha$ and $\beta$ is a subsequence of both
1020: of them. For example, the subsequence $\{C, G, A, C\}$ is a common subsequence of
1021: both $\alpha$ and $\beta$. There can be many possible common subsequences of a pair
1022: of sequences. For example, another common subsequence of $\alpha$ and $\beta$ is
$\{A, C\}$. One simple way to construct different common subsequences (for two
fixed sequences $\alpha$ and $\beta$) is by drawing lines from a member
of $\alpha$ to a matching member of $\beta$ such that no two lines
cross. For example, the common subsequence $\{C, G, A, C\}$ is shown
1027: by solid lines in Fig. \ref{matching}. On the other hand the common subsequence
1028: $\{A,C\}$ is shown by the dashed lines in Fig. \ref{matching}.
1029: \begin{figure}
1030: \includegraphics[width=.7\hsize]{matching.eps}
1031: \caption{ Two fixed sequences $\alpha: \{A, C, G, C, T, A, C\}$
1032: and $\beta: \{C, T, G, A, C\}$. The solid lines show the common
1033: subsequence $\{C, G, A, C\}$ and the dashed lines denote another
1034: common subsequence $\{A,C\}$.}
1035: \label{matching}
1036: \end{figure}
1037: The aim of the LCS problem is to find the longest of such common
1038: subsequences between two fixed sequences $\alpha$ and $\beta$.
1039:
1040: This problem and its variants have been widely studied in
1041: biology\cite{NW,SW,WGA,AGMML}, computer science\cite{SK,AG,WF,Gusfield}, probability
1042: theory\cite{CS,Deken,Steele,DP,Alex,KLM} and more recently in statistical
1043: physics\cite{ZM,Hwa,Monvel}. A particularly important application of the LCS problem
1044: is to quantify the closeness between two DNA sequences. In evolutionary biology, the
1045: genes responsible for building specific proteins evolve with time and by finding the
1046: LCS of the same gene in different species, one can learn what has been conserved in
1047: time. Also, when a new DNA molecule is sequenced {\it in vitro}, it is important to
1048: know whether it is really new or it already exists. This is achieved quantitatively
1049: by measuring the LCS of the new molecule with another existing already in the
1050: database.
1051:
1052: For a pair of fixed sequences of length $i$ and $j$ respectively, the length
1053: $L_{i,j}$ of their LCS is just a number. However, in the stochastic version of the
LCS problem one compares two random sequences drawn from an alphabet of $c$ letters and hence the
1055: length $L_{i,j}$ is a random variable. A major challenge over the last three decades
1056: has been to determine the statistics of $L_{i,j}$\cite{CS,Deken,Steele,DP,Alex}. For
1057: equally long sequences ($i=j=n$), it has been proved that $\langle L_{n,n}\rangle
1058: \approx \gamma_c n$ for $n\gg 1$, where the averaging is performed over all
1059: realizations of the random sequences. The constant $\gamma_c$ is known as the
Chv\'atal-Sankoff constant which, to date, remains undetermined though there exist
several bounds\cite{Deken,DP,Alex}, a conjecture due to Steele\cite{Steele} that
1062: $\gamma_c=2/(1+\sqrt{c})$ and a recent proof\cite{KLM} that $\gamma_c\to 2/\sqrt{c}$
1063: as $c\to \infty$. Unfortunately, no exact results are available for the finite size
1064: corrections to the leading behavior of the average $\langle L_{n,n}\rangle$, for the
1065: variance, and also for the full probability distribution of $L_{n,n}$. Thus, despite
1066: tremendous analytical and numerical efforts, exact solution of the random LCS
1067: problem has, so far, remained elusive. Therefore it is important to find other
1068: variants of this LCS problem that may be analytically tractable.
1069:
Computationally, the easiest way to determine the length $L_{i,j}$ of the LCS of two
arbitrary sequences of lengths $i$ and $j$ (in polynomial time $O(ij)$) is to
use the recursive algorithm\cite{Gusfield,Monvel}
1073: \begin{equation}
L_{i,j} = \max\left[L_{i-1,j}, L_{i,j-1}, L_{i-1,j-1} + \eta_{i,j}\right],
1075: \label{recur1}
1076: \end{equation}
1077: subject to the initial conditions $L_{i,0}=L_{0,j}=L_{0,0}=0$. The variable
1078: $\eta_{i,j}$ is either 1 when the characters at the positions $i$ (in the sequence
1079: $\alpha$) and $j$ (in the sequence $\beta$) match each other, or 0 if they do not.
1080: Note that the variables $\eta_{i,j}$'s are not independent of each other. To see
1081: this consider the simple example -- matching of two strings $\alpha={\rm AB}$ and
1082: $\beta={\rm AA}$. One has by definition: $\eta_{1,1}=\eta_{1,2}=1$ and
1083: $\eta_{2,1}=0$. The knowledge of these three variables is sufficient to predict that
1084: the last two letters will not match, i.e., $\eta_{2,2}=0$. Thus, $\eta_{2,2}$ can
1085: not take its value independently of $\eta_{1,1},\,\eta_{1,2},\,\eta_{2,1}$. These
1086: residual correlations between the $\eta_{i,j}$ variables make the LCS problem rather
complicated. Note, however, that for two random sequences drawn from an alphabet of $c$ letters,
1088: these correlations between the $\eta_{i,j}$ variables vanish in the $c\to \infty$
1089: limit.
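
As an illustration, the recursion in Eq. (\ref{recur1}) can be sketched in a few lines of Python. This is only a minimal sketch, not a production alignment tool; the function name \texttt{lcs\_length} is a hypothetical choice made here, and the sequences are passed as ordinary strings.

```python
def lcs_length(alpha, beta):
    """Length of the longest common subsequence of alpha and beta,
    computed with the recursion of Eq. (recur1) in O(i*j) time."""
    i_max, j_max = len(alpha), len(beta)
    # L[i][j] stores L_{i,j}; row 0 and column 0 encode the initial
    # conditions L_{i,0} = L_{0,j} = L_{0,0} = 0.
    L = [[0] * (j_max + 1) for _ in range(i_max + 1)]
    for i in range(1, i_max + 1):
        for j in range(1, j_max + 1):
            # eta_{i,j} = 1 if the i-th letter of alpha matches the
            # j-th letter of beta, and 0 otherwise.
            eta = 1 if alpha[i - 1] == beta[j - 1] else 0
            L[i][j] = max(L[i - 1][j], L[i][j - 1], L[i - 1][j - 1] + eta)
    return L[i_max][j_max]


if __name__ == "__main__":
    # The two sequences of Fig. (matching): the LCS has length 4,
    # one realization being {C, G, A, C}.
    print(lcs_length("ACGCTAC", "CTGAC"))
```

For the pair $\alpha={\rm AB}$, $\beta={\rm AA}$ used above to illustrate the correlations, the same function returns $1$.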
1090:
A natural question is how important these correlations between the $\eta_{i,j}$ variables are, e.g.,
1092: do they affect the asymptotic statistics of $L_{i,j}$'s for large $i$ and $j$?
1093: Is the problem solvable if one ignores these correlations?
1094: These questions naturally lead to the Bernoulli matching (BM) model which is a simpler variant of
1095: the original LCS problem where one ignores the correlations between $\eta_{i,j}$'s for all
1096: $c$\cite{Monvel}.
1097: The length $L_{i,j}^{BM}$ of the BM model satisfies the same
1098: recursion relation in Eq. (\ref{recur1}) except that $\eta_{i,j}$'s are now
1099: independent and each drawn from the bimodal distribution: $p(\eta)=
1100: (1/c)\delta_{\eta,1}+ (1-1/c)\delta_{\eta,0}$.
1101: This approximation is expected to be exact only in the appropriately taken
1102: $c\to \infty$ limit. Nevertheless, for finite $c$, the results on the BM model can serve
1103: as a useful benchmark for the original LCS model to decide if indeed the correlations
1104: between $\eta_{i,j}$'s are important or not. Unfortunately, even in the absence of
correlations, the exact asymptotic distribution of $L_{i,j}^{BM}$ in the BM model has so far
1106: remained elusive, mainly due to the nonlinear nature of the recursion relation
1107: in Eq. (\ref{recur1}).
The purpose of this section is to present an exact asymptotic formula for the
1109: distribution of the length $L_{n,n}^{BM}$ in the BM model for all $c$.
1110: So far, only the leading asymptotic behavior of the
1111: average length in the BM model is known\cite{Monvel} using the `cavity'
1112: method of spin glass physics\cite{MPV},
1113: \begin{equation}
1114: \langle L_{n,n}^{BM}\rangle \approx \gamma_c^{BM} n
1115: \label{bm1}
1116: \end{equation}
where $\gamma_c^{BM}= 2/(1+\sqrt{c})$, the same as the conjectured value of the
Chv\'atal-Sankoff constant $\gamma_c$ for the original LCS model. However, other
properties such as the variance or the distribution of $L_{n,n}^{BM}$ remained
intractable even in the BM model.
1121: We have shown~\cite{MN}, as illustrated below, that for large $n$,
1122: \begin{equation}
1123: L_{n,n}^{BM}\to \gamma_c^{BM} n + f(c)\, n^{1/3}\, \chi
1124: \label{asymp11}
1125: \end{equation}
where $\chi$ is a random variable with an $n$-independent distribution, ${\rm Prob}
(\chi\le x)= F_{2}(x)$, which is precisely the Tracy-Widom distribution
in Eq. (\ref{gue}).
1129: Indeed, we were also able to compute the functional form of the scale factor $f(c)$ exactly for all
1130: $c$~\cite{MN},
1131: \begin{equation}
1132: f(c)=\frac{c^{1/6}(\sqrt{c}-1)^{1/3}}{\sqrt{c}+1}.
1133: \label{fc1}
1134: \end{equation}
1135: This allows us to calculate the average including the subleading finite size
1136: correction term and the variance of $L_{n,n}^{BM}$ for large $n$,
1137: \begin{eqnarray}
1138: \langle L_{n,n}^{BM}\rangle &\approx & \gamma_c^{BM} n + \left<\chi\right> f(c)
1139: n^{1/3} \nonumber \\
1140: {\rm Var}\, L_{n,n}^{BM} &\approx &
1141: \left(\langle\chi^2\rangle-{\langle\chi\rangle}^2\right)\, f^2(c)\, n^{2/3},
1142: \label{eq:expvar}
1143: \end{eqnarray}
1144: where one can use the known exact values\cite{TW1}, $\langle \chi\rangle=
1145: -1.7711\dots$ and $\langle \chi^2\rangle- {\langle \chi\rangle}^2= 0.8132\dots$.
1146: These exact results thus invalidate the previous attempt\cite{Monvel} to
1147: fit the subleading correction to the mean in the BM model with a
1148: $n^{1/2}/{\ln (n)}$ behavior and also to fit the scaled distribution
1149: with a Gaussian form.
1150: Note that the recursion relation in Eq.
1151: (\ref{recur1}) can also be viewed as a $(1+1)$ dimensional directed polymer
1152: problem\cite{Hwa,Monvel} and some asymptotic results (such as the $O(n^{2/3})$
1153: behavior of the variance of $L_{n,n}$ for large $n$) can be obtained using the
1154: arguments of universality\cite{Hwa}. However, this does not provide precise results
1155: for the full distribution along with the correct scale factors that are obtained here.
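
To make these predictions concrete, the following Python sketch simulates the BM model directly from the recursion in Eq. (\ref{recur1}) with independent $\eta_{i,j}$, and evaluates the asymptotic mean and variance of Eq. (\ref{eq:expvar}). It is a rough illustration under stated assumptions: all function names are hypothetical choices made here, the constants for $\chi$ are the truncated values quoted above, and at small $n$ one should expect only approximate agreement with the asymptotic formulas.

```python
import random

MEAN_CHI = -1.7711   # <chi>, truncated value quoted in the text
VAR_CHI = 0.8132     # <chi^2> - <chi>^2, truncated value quoted in the text


def gamma_bm(c):
    """Leading coefficient gamma_c^{BM} = 2 / (1 + sqrt(c))."""
    return 2.0 / (1.0 + c ** 0.5)


def scale_factor(c):
    """Scale factor f(c) of Eq. (fc1)."""
    root = c ** 0.5
    return c ** (1.0 / 6.0) * (root - 1.0) ** (1.0 / 3.0) / (root + 1.0)


def predicted_mean(n, c):
    """Asymptotic mean of L_{n,n}^{BM}, first line of Eq. (eq:expvar)."""
    return gamma_bm(c) * n + MEAN_CHI * scale_factor(c) * n ** (1.0 / 3.0)


def predicted_variance(n, c):
    """Asymptotic variance of L_{n,n}^{BM}, second line of Eq. (eq:expvar)."""
    return VAR_CHI * scale_factor(c) ** 2 * n ** (2.0 / 3.0)


def bm_length(n, c, rng):
    """One sample of L_{n,n}^{BM}: the recursion of Eq. (recur1) with
    i.i.d. eta = 1 with probability 1/c and eta = 0 otherwise."""
    p = 1.0 / c
    prev = [0] * (n + 1)              # row i-1 of the table L_{i,j}
    for _ in range(n):
        cur = [0] * (n + 1)           # row i, with L_{i,0} = 0
        for j in range(1, n + 1):
            eta = 1 if rng.random() < p else 0
            cur[j] = max(prev[j], cur[j - 1], prev[j - 1] + eta)
        prev = cur
    return prev[n]


if __name__ == "__main__":
    rng = random.Random(0)
    n, c, samples = 200, 4, 20
    lengths = [bm_length(n, c, rng) for _ in range(samples)]
    mean = sum(lengths) / samples
    print("simulated mean :", mean)
    print("predicted mean :", predicted_mean(n, c))
    print("predicted var  :", predicted_variance(n, c))
```

For $c=4$ the formulas give $\gamma_4^{BM}=2/3$ and $f(4)=2^{1/3}/3\approx 0.42$, so that for $n=1000$ the predicted mean is about $659.2$ and the predicted standard deviation about $3.8$.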
1156:
1157: It is useful to provide a synopsis of our method in deriving these results. First,
1158: we prove the results in the $c\to \infty$ limit, by using mappings to other models.
1159: To make progress for finite $c$, we first map the BM model exactly to a $3$-d
1160: anisotropic directed percolation (ADP) model first studied by Rajesh and
1161: Dhar\cite{RD}. This ADP model is also precisely the same as the directed
1162: polymer model studied by Johansson~\cite{J1}, as discussed in the previous section
1163: and for which the exact results are known as in Eq. (\ref{j1}).
1164: To extract the results for the BM model from those of Johansson's
1165: model, we use a simple symmetry argument which then allows us to derive our main
1166: results in Eqs. (\ref{asymp11})-(\ref{eq:expvar}) for all $c$. As a check, we recover
1167: the $c\to \infty$ limit result obtained independently by the first method.
1168:
1169: In the BM model, the length $L_{i,j}^{BM}$ can be interpreted as the height of a
1170: surface over the $2$ dimensional $(i,j)$ plane constructed via the recursion relation in Eq.
1171: (\ref{recur1}). A typical surface, shown in Fig. \ref{fig:bms1}\,(a), has terrace-like structures.
1172: \begin{figure}
1173: \includegraphics[width=.7\hsize]{bm_f1.eps}
1174: \caption{Examples of (a) BM surface
1175: $L_{i,j}^{BM}\equiv {\tilde h}(x,y)$ and (b) ADP surface $L_{i,j}^{ADP}\equiv
1176: h(x,y)$.}
1177: \label{fig:bms1}
1178: \end{figure}
1179:
1180: It is useful to consider the projection of the level lines separating the adjacent
1181: terraces whose heights differ by $1$ (see Fig.\ref{fig:bms2}) onto the $2$-D $(i,j)$ plane. Note
1182: that, by the rule Eq. (\ref{recur1}), these level lines never overlap each other,
1183: i.e., no two paths have any common edge. The statistical weight of such a projected
1184: $2$-D configuration is the product of weights associated with the vertices of the
1185: $2$-D plane. There are five types of possible vertices with nonzero weights as shown
1186: in Fig. \ref{fig:bms2}, where $p=1/c$ and $q=1-p$. Since the level lines never cross each other,
1187: the weight of the first vertex in Fig. \ref{fig:bms2} is $0$.
1188: %The height $L_{i,j}^{BM}$ at any point $(i,j)$ on this $2$-d plane is just the
1189: %number of level lines that one crosses in going from the origin to $(i,j)$.
1190: \begin{figure}
1191: \includegraphics[width=.7\hsize]{bm_f2.eps}
1192: \caption{Projected $2$-d level lines separating adjacent terraces of unit height
1193: difference in the BM surface in Fig.\ref{fig:bms1} (a). The adjacent table shows the weights of
1194: all vertices on the $2$-d plane.}
1195: \label{fig:bms2}
1196: \end{figure}
1197:
1198: Consider first the limit $c\to \infty$ (i.e., $p\to 0$). The weights of all allowed
1199: vertices are $1$, except the ones shown by black dots in Fig. \ref{fig:bms2}, whose associated
1200: weights are $p\to 0$. The number $N$ of these black dots inside a rectangle of area
1201: $A=ij$ can be easily estimated.
1202: For large $A$ and $p\to 0$, this number
1203: is clearly
1204: Poisson
1205: distributed with the mean ${\overline N}= pA$.
1206: The height $L_{i,j}^{BM}$ is just the number of level lines $\cal N$ inside this
1207: rectangle of area $A=ij$. One can easily estimate $\cal N$ by following
1208: precisely the method outlined in the previous subsection in the context of the ballistic deposition
1209: model. Following the same analysis as in the ballistic deposition model,
1210: it is easy to see that
1211: the number of level lines ${\cal N}$ inside the rectangle
1212: (for large $A$), appropriately scaled, has a limiting behavior, ${\cal N}\to
1213: 2\sqrt{\overline N} + {\overline N}^{1/6}\, \chi$, where $\chi$ is a random variable
1214: with the Tracy-Widom distribution. Using ${\overline N}=pA=ij/c$, one then obtains in
1215: the limit $p\to 0$,
1216: \begin{equation}
1217: L_{i,j}^{BM}= {\cal N} \to \frac{2}{\sqrt c}\sqrt{ij} +
1218: {\left( \frac{ij}{c}\right)}^{1/6}\, \chi.
1219: \label{p01}
1220: \end{equation}
1221: In particular, for large equal length sequences $i=j=n$, we get for $c\to \infty$
1222: \begin{equation}
1223: L_{n,n}^{BM}\to \frac{2}{\sqrt{c}}\, n + c^{-1/6} \, n^{1/3}\, \chi .
1224: \label{p02}
1225: \end{equation}
1226: For finite $c$, while the above mapping to the LIS problem still works, the
1227: corresponding permutations of the LIS problem are not generated with equal
1228: probability and hence one can no longer use the BDJ results.
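
The leading behavior ${\cal N}\approx 2\sqrt{\overline N}$ used in this limit can be checked numerically with a short Python sketch. It relies on two simplifying assumptions made here for illustration: the number of black dots is fixed to $N$ rather than Poisson distributed, and the dots are represented by a uniformly random permutation, so that the number of level lines is the length of its longest increasing subsequence. The function names are hypothetical.

```python
import bisect
import random


def lis_length(seq):
    """Length of the longest strictly increasing subsequence of seq,
    computed by patience sorting in O(N log N) time."""
    tails = []  # tails[k] = smallest last element of an increasing run of length k+1
    for value in seq:
        k = bisect.bisect_left(tails, value)
        if k == len(tails):
            tails.append(value)     # value extends the longest run found so far
        else:
            tails[k] = value        # value gives a smaller tail for length k+1
    return len(tails)


def random_lis(n_points, rng):
    """LIS length of a uniformly random permutation of n_points elements."""
    perm = list(range(n_points))
    rng.shuffle(perm)
    return lis_length(perm)


if __name__ == "__main__":
    rng = random.Random(0)
    n_points, samples = 10000, 50
    mean = sum(random_lis(n_points, rng) for _ in range(samples)) / samples
    # Leading-order estimate 2*sqrt(N); the N^{1/6} correction has a negative mean.
    print("simulated mean LIS:", mean)
    print("2*sqrt(N)         :", 2 * n_points ** 0.5)
```

Since $\langle\chi\rangle<0$, the simulated mean is expected to lie somewhat below $2\sqrt{N}$ at finite $N$.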
1229:
1230: For any finite $c$, we can however map the BM model to the ADP model studied by Rajesh and Dhar~\cite{RD}.
1231: In the ADP model on
1232: a simple cubic lattice the bonds are occupied with probabilities $p_x$, $p_y$, and
1233: $p_z$ along the $x$, $y$ and $z$ axes and are all directed towards increasing
1234: coordinates. Imagine a source of fluid at the origin which spreads along the
1235: occupied directed bonds. The sites that get wet by the fluid form a $3$-d cluster.
1236: In the ADP problem, the bond occupation probabilities are anisotropic, $p_x=p_y=1$
1237: (all bonds aligned along the $x$ and $y$ axes are occupied) and $p_z=p$. Hence, if
1238: the point $(x,y,z)$ gets wet by the fluid then all the points $(x',y', z)$ on the
1239: same plane with $x'\ge x$ and $y'\ge y$ also get wet. Such a wet cluster is compact
and can be characterized by its bounding surface height $h(x,y)$ as shown in
Fig. \ref{fig:bms1}\,(b). It is not difficult to see~\cite{RD} that the height $h(x,y)$ satisfies exactly
the same recursion relation of the directed polymer as in Eq. (\ref{dpr2})
1243: where $\xi_{x,y}$'s are i.i.d. random variables taking nonnegative integer values
1244: with ${\rm Prob}(\xi_{x,y}=k)= (1-p)\, p^k$ for $k=0,1,2,\dots$. Thus the ADP
1245: model of Rajesh and Dhar is precisely identical to the directed polymer model
studied by Johansson with exactly the same distribution of the noise $\xi_{x,y}$.
1247:
While the terrace-like structures of the ADP surface look similar to the BM surfaces
(compare Figs. \ref{fig:bms1}\,(a) and \ref{fig:bms1}\,(b)), there is an important difference between the
two. In
1251: the ADP model, the level lines separating two adjacent terraces can overlap with
1252: each other\cite{RD}, which does not happen in the BM model. However, by making the
1253: following change of coordinates in the ADP model\cite{RD}
1254: \begin{equation}
1255: \zeta= x+ h(x,y); \,\,\, \eta=y+ h(x,y)
1256: \label{ct1}
1257: \end{equation}
1258: one gets a configuration of the surface where the level lines no longer overlap.
1259: Moreover, it is not difficult to show that the projected $2$-D configuration of
1260: level lines of this shifted ADP surface has exactly the same statistical weight as
1261: the projected $2$-D configuration of the BM surface. Denoting the BM height by
1262: ${\tilde h}(x,y)= L_{x,y}^{BM}$, one then has the identity, ${\tilde h}(\zeta,
1263: \eta)= h(x,y)$, which holds for each configuration. Using Eq. (\ref{ct1}), one can
1264: rewrite this identity as
1265: \begin{equation}
1266: {\tilde h}(\zeta, \eta)= h\left( \zeta- {\tilde h}(\zeta, \eta),
1267: \eta- {\tilde h}(\zeta, \eta)\right).
1268: \label{conv1}
1269: \end{equation}
1270:
1271: Thus, for any given height function $h(x,y)$ of the ADP model, one can, in
1272: principle, obtain the corresponding height function ${\tilde h}(x,y)$ for all
1273: $(x,y)$ of the BM model by solving the nonlinear equation (\ref{conv1}). This is
1274: however very difficult in practice. Fortunately, one can make progress for large
1275: $(x,y)$ where one can replace the integer valued discrete heights by continuous
1276: functions $h(x,y)$ and ${\tilde h}(x,y)$. Using the notation $\partial_x\equiv
1277: \partial/{\partial x}$ it is easy to derive from Eq. (\ref{ct1}) the following pair
1278: of identities,
1279: \begin{equation}
1280: \partial_x h = \frac{\partial_{\zeta} {\tilde h}}{1-\partial_{\zeta}
1281: {\tilde h}-\partial_{\eta} {\tilde h}};
1282: \,\,\,
1283: \partial_y h = \frac{\partial_{\eta} {\tilde h}}{1-\partial_{\zeta}
1284: {\tilde h}-\partial_{\eta} {\tilde h}}.
1285: \label{der1}
1286: \end{equation}
1287: In a similar way, one can show that
1288: \begin{equation}
1289: \partial_{\zeta} {\tilde h} = \frac{\partial_x h}{1+\partial_x h+\partial_y h};\,\,\,
1290: \partial_{\eta} {\tilde h} = \frac{\partial_y h}{1+\partial_x h+\partial_y h}.
1291: \label{der2}
1292: \end{equation}
1293: We then observe that Eqs. (\ref{der1}) and (\ref{der2}) are invariant under the
1294: simultaneous transformations
1295: \begin{equation}
1296: \zeta\to -x ; \,\, \eta\to -y; \,\, \tilde h \to h \, .
1297: \label{invar1}
1298: \end{equation}
1299: Since the height is built up by integrating the derivatives, this leads to a simple
1300: result for large $\zeta$ and $\eta$,
1301: \begin{equation}
1302: {\tilde h}(\zeta, \eta) = h(-\zeta, -\eta).
1303: \label{res1}
1304: \end{equation}
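
For completeness, the chain-rule steps behind Eqs. (\ref{der1}) and (\ref{der2}) can be spelled out as follows; this is only a sketch, treating $h$ and ${\tilde h}$ as smooth functions. Differentiating the identity ${\tilde h}(\zeta,\eta)=h(x,y)$ with respect to $x$ at fixed $y$, and using $\partial_x \zeta = 1+\partial_x h$ and $\partial_x \eta=\partial_x h$ from Eq. (\ref{ct1}), one gets
\[
\partial_x h = \partial_{\zeta}{\tilde h}\,\left(1+\partial_x h\right) + \partial_{\eta}{\tilde h}\,\partial_x h
\quad\Longrightarrow\quad
\partial_x h\,\left(1-\partial_{\zeta}{\tilde h}-\partial_{\eta}{\tilde h}\right)=\partial_{\zeta}{\tilde h},
\]
which is the first identity in Eq. (\ref{der1}); the second follows by exchanging the roles of $x$ and $y$. Conversely, writing $x=\zeta-{\tilde h}$ and $y=\eta-{\tilde h}$ and differentiating with respect to $\zeta$ at fixed $\eta$,
\[
\partial_{\zeta}{\tilde h} = \partial_x h\,\left(1-\partial_{\zeta}{\tilde h}\right) - \partial_y h\,\partial_{\zeta}{\tilde h}
\quad\Longrightarrow\quad
\partial_{\zeta}{\tilde h}\,\left(1+\partial_x h+\partial_y h\right)=\partial_x h,
\]
which is the first identity in Eq. (\ref{der2}). One way to read the invariance in Eq. (\ref{invar1}) is as an exchange of the two sets of variables, $(\zeta,\eta,{\tilde h})\leftrightarrow(-x,-y,h)$: each derivative then changes sign, $\partial_{\zeta}{\tilde h}\to -\partial_x h$ and $\partial_x h\to -\partial_{\zeta}{\tilde h}$, and Eq. (\ref{der2}) turns into Eq. (\ref{der1}) and vice versa.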
1305:
1306: Thus, if we know exactly the functional form of the ADP surface $h(x,y)$, then the
1307: functional form of the BM surface ${\tilde h}(x,y)$ for large $x$ and $y$ is simply
1308: obtained by ${\tilde h}(x,y)=h(-x,-y)$. Changing $x\to -x$ and $y\to -y$ in
1309: Johansson's expression for the ADP surface in Eq. (\ref{j1}) we thus arrive at our
1310: main asymptotic result for the BM model
1311: \begin{eqnarray}
1312: L_{x,y}^{BM}&=& {\tilde h}(x,y) \to \frac{2\sqrt{pxy}-p(x+y)}{q}+ \nonumber \\
1313: &+&\frac{(pxy)^{1/6}}{q}\,\left[(1+p)-\sqrt{\frac{p}{xy}}\,(x+y)\right]^{2/3} \,
1314: \chi, \label{res2}
1315: \end{eqnarray}
1316: where $p=1/c$ and $q=1-1/c$. For equal length sequences $x=y=n$, Eq. (\ref{res2})
1317: then reduces to Eq. (\ref{asymp11}).
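
The reduction can be verified term by term. Setting $x=y=n$ in Eq. (\ref{res2}) and using $q=1-p=(1-\sqrt{p})(1+\sqrt{p})$, the leading term becomes
\[
\frac{2\sqrt{p}\,n-2p\,n}{q}=\frac{2\sqrt{p}\,(1-\sqrt{p})}{(1-\sqrt{p})(1+\sqrt{p})}\,n
=\frac{2\sqrt{p}}{1+\sqrt{p}}\,n=\frac{2}{1+\sqrt{c}}\,n=\gamma_c^{BM}\,n .
\]
In the fluctuating term, the square bracket reduces to $(1+p)-2\sqrt{p}=(1-\sqrt{p})^2$, so that
\[
\frac{(p\,n^2)^{1/6}}{q}\,(1-\sqrt{p})^{4/3}
=\frac{p^{1/6}\,(1-\sqrt{p})^{1/3}}{1+\sqrt{p}}\,n^{1/3}
=\frac{c^{1/6}\,(\sqrt{c}-1)^{1/3}}{\sqrt{c}+1}\,n^{1/3}=f(c)\,n^{1/3},
\]
in agreement with Eq. (\ref{fc1}). For $c\to\infty$ one has $\gamma_c^{BM}\to 2/\sqrt{c}$ and $f(c)\to c^{-1/6}$, which recovers Eq. (\ref{p02}).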
1318:
1319: To check the consistency of our asymptotic results, we further computed the
1320: difference between the left- and the right-hand sides of Eq. (\ref{conv1}),
1321: \begin{equation}
1322: \Delta h (\zeta, \eta)= {\tilde h}(\zeta, \eta)- h\left( \zeta- {\tilde h}(\zeta,
1323: \eta), \eta- {\tilde h}(\zeta, \eta)\right), \label{conv2}
1324: \end{equation}
1325: with the functions $h(x,y)$ and ${\tilde h}(x,y)$ given respectively by Eqs.
1326: (\ref{j1}) and (\ref{res2}). For large $\zeta=\eta$ one gets
1327: \begin{equation}
1328: \Delta h(\zeta,\zeta) \to \left[{p^{1/3}\chi^2}/{3 (1-\sqrt{p})^{4/3}}\right]\,
1329: {\zeta}^{-1/3} . \label{cons1}
1330: \end{equation}
1331: Thus the discrepancy falls off as a power law for large $\zeta$, indicating that
1332: indeed our solution is asymptotically exact. We have also performed numerical
1333: simulations of the BM model using the recursion relation in Eq. (\ref{recur1}) for
1334: $c=2,\,4,\,9,\,16,\,100$. Our preliminary results\cite{MN} for relatively small
1335: system sizes (up to $n=5000$) are consistent with our exact results in Eqs.
1336: (\ref{asymp11})-(\ref{eq:expvar}).
1337:
1338: Thus, the Tracy-Widom distribution also describes the asymptotic distribution of
1339: the optimal matching length in the BM model, for all $c$. Given that the correlations in the original LCS
1340: model
1341: become negligible in the $c\to \infty$ limit, it is likely that the
1342: BM asymptotics in Eq. (\ref{p02}) would also hold for the original LCS model
1343: in the $c\to \infty$ limit.
1344: An important open problem
1345: is to determine whether the Tracy-Widom distribution also appears in the
1346: LCS problem for finite $c$. The precise distribution obtained
1347: here (including exact prefactors) for all $c$ in the BM model will serve
1348: as a useful benchmark to which future simulations of the LCS problem can
1349: be compared.
1350:
1351: \section{Conclusion}
1352:
1353: In these lectures I have discussed $4$ a priori unrelated problems and tried to give a flavour
1354: of the recent developments that have found a deep connection between these problems.
1355: These connections have now established the fact that they all share one common limiting distribution,
1356: namely the Tracy-Widom distribution that describes the asymptotic distribution law of
1357: the largest eigenvalue of a random matrix. I have also discussed the probabilities of
1358: large deviations of the largest eigenvalue, in the range outside the validity of the
Tracy-Widom law. As examples, I have demonstrated in detail how two specific models,
a ballistic
deposition model and a sequence alignment problem,
can be mapped on to the longest increasing subsequence problem,
thereby proving the existence of the Tracy-Widom distribution in these
models.
1365:
1366: There have been many other interesting recent developments in this rather broad area encompassing
1367: different fields that I did not have the scope to discuss in these lectures.
1368: There are, of course, plenty of open questions that
1369: need to be addressed, some of which I mention below.
1370:
1371: {\em Finite size effects in growth models:} We have discussed how the Tracy-Widom distribution appears
1372: as the limiting scaled height distribution in several $(1+1)$ dimensional growth
1373: models that belong to the KPZ universality class of fluctuating interfaces. Indeed,
1374: for a fluctuating surface with height $H(x,t)$ growing over a substrate of infinite size
one now believes that at long times $t\gg 1$
1376: \begin{equation}
1377: H(x,t) = v t + b t^{1/3} \chi
1378: \label{con1}
1379: \end{equation}
1380: where $\chi$ is a time-independent random variable with the Tracy-Widom distribution.
1381: The prefactors $v$ (the velocity of the interface) and $b$ are model dependent,
1382: but the distribution of the scaled variable $\chi=(H-vt)/{bt^{1/3}}$ is universal
1383: for large $t$. The nonuniversal prefactors are often very hard to compute. We have
1384: shown two examples in these lectures where these prefactors can be computed exactly.
1385: Note, however, that the result in Eq. (\ref{con1}) holds only in an infinite system.
1386: In any real system with a finite
1387: substrate size $L$, the result in Eq. (\ref{con1}) will hold only in the growing
regime of the surface, i.e., when $1\ll t \ll L^z$, where $z$ is the dynamical
exponent characterizing the surface evolution. For example, for the KPZ
type of interfaces in $(1+1)$ dimensions, $z=3/2$. However, when $t\gg L^z$, the probability distribution
1391: of the height fluctuation
1392: $H-\langle H\rangle$ will become time-independent. For example, for $(1+1)$ dimensional KPZ surfaces
1393: with periodic boundary conditions, it is well known~\cite{HZ} that the stationary distribution of
1394: the height fluctuation is a simple Gaussian, ${\rm Prob}[H-\langle H\rangle=x]\propto \exp[-x^2/{a_0 L}]$
1395: where $a_0$ is a nonuniversal constant and the typical fluctuation scales with the system size as $L^{1/2}$.
An important open question is how the distribution of the height fluctuation crosses over
from the Tracy-Widom form to a simple Gaussian form as $t$ becomes bigger than the crossover time $L^z$.
1398: It would be nice to show this explicitly in any of the simple models discussed above.
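
As a heuristic consistency check (not a derivation of the crossover function), one can match the two regimes. In the growing regime the typical fluctuation in Eq. (\ref{con1}) is of order $b\,t^{1/3}$, while in the stationary regime it is of order $L^{1/2}$. Equating the two,
\[
b\, t^{1/3}\sim L^{1/2} \quad\Longrightarrow\quad t\sim L^{3/2}=L^{z},\qquad z=\frac{3}{2},
\]
which reproduces the crossover time quoted above, up to nonuniversal constants.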
1399:
1400:
1401: {\em A direct connection between the growth models and random matrices:} The existence of the Tracy-Widom
1402: distribution in many of the growth models discussed here, such as the polynuclear growth model
or the ballistic deposition model, relies on the mapping to the LIS problem
and subsequently using the BDJ results that connect the LIS problem to random matrices.
It is certainly desirable to find a direct mapping between the growth models and the
largest eigenvalue of a random matrix. Recent work by Spohn and collaborators~\cite{Spohn}
linking the top edge of a PNG growth model to Dyson's Brownian motion of the eigenvalues
1408: of a random matrix perhaps provides a clue to this missing link.
1409:
1410:
1411: {\em Largest Lyapunov exponent in population dynamics:} The Tracy-Widom distribution
1412: and the associated large-deviation function discussed in Section 3
1413: conceivably have important applications in several systems
1414: where the largest eigenvalue controls the spectral properties of the system. Some
examples were discussed in Section 3. Recently, it has been shown that the statistics
of the largest eigenvalue (the largest Lyapunov exponent) is also of importance
1417: in population growth of organisms in fluctuating environments~\cite{KL1}.
1418: It would be interesting to see if Tracy-Widom type distribution functions also
1419: appear in these biological problems.
1420:
1421:
1422: {\em Sequence matching, directed polymer and vertex models:} In the context of the sequence matching problem
1423: discussed in Section 6, we have demonstrated how the statistical weights of the surface generated
1424: in the Bernoulli matching
model of the sequence alignment are exactly identical to those of
1426: a $5$-vertex model on a square lattice (see Fig. \ref{fig:bms2}). This is a useful connection
1427: because there are many quantities in the $5$-vertex models that can be computed exactly by employing
1428: the Bethe ansatz techniques and subsequently one can use those results for the sequence
1429: alignment or equivalently for the directed polymer model. Recently, in collaboration
1430: with K. Mallick and S. Nechaev, we have made some progress in these directions~\cite{MMN}.
1431: A very interesting open issue is if one can derive the Tracy-Widom distribution
1432: by using the Bethe ansatz techniques.
1433:
1434: {\em Other issues related to the sequence matching problem:} There are also many other
1435: interesting open questions associated with
1436: the sequence matching problem.
1437: We have shown that the length of the longest matching is Tracy-Widom distributed
1438: only in the Bernoulli matching model which is a simpler version of the original LCS problem.
1439: In the BM model one has ignored certain correlations, as we discussed in detail. This approximation is
exact in the $c\to \infty$ limit, where $c$ is the number of letters in the alphabet, e.g.,
$c=4$ for DNA. Is this approximation good even for finite $c$? In
1442: other words,
1443: is the optimal matching length in the original LCS problem also Tracy-Widom distributed?
1444: It would also be
1445: interesting if one can make a systematic $1/c$ expansion of the LCS model, i.e., keeping
the correlations up to $O(1/c)$. Numerical simulations of the LCS problem~\cite{BMat} for binary sequences ($c=2$)
indeed indicate that the standard deviation of the optimal matching length scales as $n^{1/3}$, where
$n$ is the sequence size, as in the
BM model. The question is whether the scaled distribution is also Tracy-Widom or not.
For the original LCS problem, there is also a curious result due to Bonetto
and Matzinger~\cite{BMat} that claims that if the values of $c$ for the two sequences are not the same (for example,
1452: the first sequence may be drawn randomly from $3$ alphabets and the second may be a binary sequence),
1453: then the standard deviation of the optimal matching length scales as $n^{1/2}$ for large $n$, which
1454: is rather surprising!
1455: It would be interesting to study the statistics of optimal matches between more than two sequences.
1456: Finally, here we have just mentioned the matching of random sequences. It would be interesting
1457: and important
1458: to study the statistics of optimal matching lengths between non-random sequences, e.g.,
1459: when there are some correlations between the members of any given sequence.
1460:
1461:
1462: \vspace{0.2cm}
1463:
1464: {\bf Acknowledgements:} My own contribution to this field that is presented here was
1465: developed partly in collaboration
1466: with D.S. Dean and partly with S. Nechaev. It is a pleasure to thank them.
1467: I also thank O. Bohigas, K. Mallick and P. Vivo for collaborations on related topics.
1468: Besides, I acknowledge useful discussions with G. Biroli, J.-P. Bouchaud, A.J. Bray,
1469: A. Comtet, D. Dhar, S. Leibler, O.C. Martin, M. M\'ezard, R. Rajesh and C. Tracy. I also thank the
1470: organizers
1471: J.-P. Bouchaud and M. M\'ezard and all other participants of this summer school for physics, for fun,
1472: and for making the school a memorable one.
1473:
1474: %
1475: % ********** End of text entry *************
1476: %
1477: \begin{thebibliography}{99}
1478:
1479: \bibitem{TW1} C. Tracy and H. Widom, Comm. Math. Phys. {\bf 159}, 151 (1994);
1480: {\bf 177}, 727 (1996); For a review see {\em Proceedings of the International Congress of
1481: Mathematicians}, Beijing 2002, Vol. I, ed. LI Tatsien, Higher Education
1482: Press, Beijing 2002, pgs. 587-596.
1483:
1484: \bibitem{BDJ} J. Baik, P. Deift, and K. Johansson, J. Amer. Math. Soc. {\bf
1485: 12}, 1119 (1999).
1486:
1487: \bibitem{J1} K. Johansson, Comm. Math. Phys. {\bf 209}, 437 (2000).
1488:
1489: \bibitem{BR1} J. Baik and E.M. Rains, J. Stat. Phys. {\bf 100}, 523 (2000).
1490:
1491: \bibitem{PS} M. Pr\"ahofer and H. Spohn, Phys. Rev. Lett. {\bf 84}, 4882
1492: (2000); Physica A, {\bf 279}, 342 (2000).
1493:
1494: \bibitem{GTW} J. Gravner, C.A. Tracy, and H. Widom, J. Stat. Phys. {\bf 102}, 1085 (2001).
1495:
1496: \bibitem{BD} S.N. Majumdar and S. Nechaev, Phys. Rev. E {\bf 69}, 011103 (2004).
1497:
1498: \bibitem{IS} T. Imamura and T. Sasamoto,
1499: Nucl. Phys. {\bf B699}, 503 (2004); J. Stat. Phys. {\bf 115}, 749 (2004).
1500:
1501: \bibitem{F1} P.L. Ferrari, Commun. Math. Phys. {\bf 252}, 77 (2004).
1502:
\bibitem{S1} T. Sasamoto, J. Phys. A: Math. Gen. {\bf 38}, L549 (2005).
1504:
1505: \bibitem{Spohn} H. Spohn, Physica A, {\bf 369}, 71 (2006) and references therein.
1506:
1507: \bibitem{MN} S.N. Majumdar and S. Nechaev, Phys. Rev. E {\bf 72}, 020901(R) (2005).
1508:
1509: \bibitem{meso}
M.G. Vavilov, P.W. Brouwer, V. Ambegaokar, and C.W.J. Beenakker,
Phys. Rev. Lett. {\bf 86}, 874 (2001); A. Lamacraft and B.D. Simons, Phys. Rev. B {\bf 64}, 014514 (2001);
P.M. Ostrovsky, M.A. Skvortsov, and M.V. Feigel'man, Phys. Rev. Lett. {\bf 87}, 027002 (2001);
J.S. Meyer and B.D. Simons, Phys. Rev. B {\bf 64}, 134516 (2001);
1514: A. Silva and L.B. Ioffe, Phys. Rev. B {\bf 71}, 104502 (2005);
1515: A. Silva, Phys. Rev. B {\bf 72}, 224505 (2005).
1516:
1517: \bibitem{BBP}
G. Biroli, J.-P. Bouchaud, and M. Potters, cond-mat/0609070 and references therein.
1519:
1520:
1521: \bibitem{Wigner}{E.P. Wigner, Proc. Cambridge Philos. Soc. {\bf 47},
1522: 790 (1951).}
1523:
\bibitem{Mehta} M.L. Mehta, {\em Random Matrices}, 2nd Edition (Academic Press,
1991).
1526:
\bibitem{CGG} A. Cavagna, J.P. Garrahan, and I. Giardina, Phys. Rev. B {\bf 61}, 3960 (2000).
1528:
\bibitem{Fyodorov} Y.V. Fyodorov, Phys. Rev. Lett. {\bf 92}, 240601 (2004);
Acta Physica Polonica B {\bf 36}, 2699 (2005).
1531:
1532:
\bibitem{Susskind} L. Susskind, arXiv:hep-th/0302219; M.R. Douglas,
B. Shiffman, and S. Zelditch, Commun. Math. Phys. {\bf 252}, 325
(2004).
1536:
\bibitem{AE} A. Aazami and R. Easther, J. Cosmol. Astropart. Phys.
JCAP03(2006)013.
1539:
1540: \bibitem{MH} L. Mersini-Houghton, Class. Quant. Grav. {\bf 22}, 3481 (2005).
1541:
1542: \bibitem{VMB} P. Vivo, S.N. Majumdar, and O. Bohigas, in preparation.
1543:
1544: \bibitem{DM} D.S. Dean and S.N. Majumdar, Phys. Rev. Lett. {\bf 97}, 160201 (2006).
1545:
1546: \bibitem{BrayDean} A.J. Bray and D.S. Dean, cond-mat/0611023.
1547:
\bibitem{FSW} Y.V. Fyodorov, H.-J. Sommers, and I. Williams, cond-mat/0611585.
1549:
\bibitem{Sosh} A. Soshnikov, Commun. Math. Phys. {\bf 207}, 697 (1999).
1551:
\bibitem{BBP1} J. Baik, G. Ben Arous, and S. P\'ech\'e, Ann. Probab. {\bf 33}, 1643 (2005).
1553:
1554: \bibitem{CB} P. Cizeau and J.-P. Bouchaud, Phys. Rev. E {\bf 50}, 1810 (1994).
1555:
\bibitem{Burda} Z. Burda et al., cond-mat/0602087.
1557:
\bibitem{Ulam} S.M. Ulam, in {\em Modern Mathematics for the Engineer}, ed. by
E.F. Beckenbach (McGraw-Hill, New York, 1961), p. 261.
1560:
1561: \bibitem{Hammersley} J.M. Hammersley, {\em Proc. VI-th Berkeley Symp. on Math.
1562: Stat. and Probability}, (University of California, Berkeley, 1972), Vol. 1, p.
1563: 345.
1564:
1565: \bibitem{VK} A.M. Vershik and S.V. Kerov, Sov. Math. Dokl. {\bf 18}, 527
1566: (1977).
1567:
1568: \bibitem{AD} For a review, see D. Aldous and P. Diaconis, Bull. Amer. Math.
1569: Soc. {\bf 36}, 413 (1999).
1570:
1571: \bibitem{RSK} C. Schensted, Canad. J. Math. {\bf 13}, 179 (1961).
1572:
1573: \bibitem{Mallows} C.M. Mallows, Bull. Inst. Math. Appl., {\bf 9}, 216 (1973).
1574:
1575: \bibitem{HH} D.A. Huse and C.L. Henley, Phys. Rev. Lett. {\bf 54}, 2708 (1985).
1576:
1577: \bibitem{KPZ} M. Kardar, G. Parisi, and Y.C. Zhang, Phys. Rev. Lett. {\bf 56}, 889 (1986).
1578:
1579: \bibitem{FNS} D. Forster, D.R. Nelson, and M.J. Stephen, Phys. Rev. A {\bf 16}, 732 (1977).
1580:
1581: \bibitem{DS1} B. Derrida and H. Spohn, J. Stat. Phys. {\bf 51}, 817 (1988).
1582:
\bibitem{Mezard} M. M\'ezard, J. Phys. Fr. {\bf 51}, 1831 (1990).
1584:
1585: \bibitem{FH} D.S. Fisher and D.A. Huse, Phys. Rev. B {\bf 43}, 10728 (1991).
1586:
\bibitem{Kardar} M. Kardar, Nucl. Phys. {\bf B290}, 582 (1987).
1588:
\bibitem{IS1} J.Z. Imbrie and T. Spencer, J. Stat. Phys. {\bf 52}, 609 (1988); J. Cook
and B. Derrida, J. Stat. Phys. {\bf 57}, 89 (1989).
1591:
\bibitem{KZ} M. Kardar and Y.C. Zhang, Phys. Rev. Lett. {\bf 58}, 2087 (1987); M. Kardar, Phys. Rev.
Lett. {\bf 55}, 2923 (1985).
1594:
1595: \bibitem{HZ} T. Halpin-Healy and Y.C. Zhang, Phys. Rep. {\bf 254}, 215 (1995).
1596:
1597: \bibitem{HHF} D.A. Huse, C.L. Henley, and D.S. Fisher, Phys. Rev. Lett. {\bf 55}, 2924 (1985).
1598:
1599: \bibitem{KMH} J. Krug, P. Meakin, and T. Halpin-Healy, Phys. Rev. A {\bf 45}, 638 (1992).
1600:
1601: \bibitem{KH} J. Krug and T. Halpin-Healy, J. Phys. A {\bf 31}, 5939 (1998).
1602:
1603: \bibitem{RD} R. Rajesh and D. Dhar, Phys. Rev. Lett. {\bf 81}, 1646 (1998).
1604:
1605: \bibitem{Meakin} P. Meakin, {\em Fractals, Scaling, and Growth Far From
1606: Equilibrium} (Cambridge University Press, Cambridge, 1998).
1607:
1608: \bibitem{KS} J. Krug and H. Spohn, in {\em Solids Far From Equilibrium} (ed. by
1609: C. Godr\`eche) (Cambridge University Press, New York, 1991).
1610:
\bibitem{Eden} M. Eden, in {\em Proc. IV-th Berkeley Symp. on Math. Stat.
and Probability}, ed. by J. Neyman (University of California, Berkeley, 1961),
Vol. 4, p. 223.
1614:
1615: \bibitem{RSOS} J.M. Kim and J.M. Kosterlitz, Phys. Rev. Lett. {\bf 62}, 2289
1616: (1989).
1617:
\bibitem{PNG} F.C. Frank, J. Cryst. Growth {\bf 22}, 233 (1974); J. Krug and H.
Spohn, Europhys. Lett. {\bf 8}, 219 (1989); J. Kert\'esz and D.E. Wolf, Phys.
Rev. Lett. {\bf 62}, 2571 (1989).
1621:
1622: \bibitem{BaD} M.J. Vold, J. Colloid Sci. {\bf 14}, 168 (1959); P. Meakin, P.
1623: Ramanlal, L.M. Sander, and R.C. Ball, Phys. Rev. A {\bf 34}, 5091 (1986); J.
1624: Krug and H. Spohn, Phys. Rev. A {\bf 38}, 4271 (1988).
1625:
1626: \bibitem{Krug2} J. Krug and P. Meakin, Phys. Rev. A {\bf 40}, 2064 (1989); {\em ibid}, {\bf 43},
1627: 900 (1991).
1628:
1629: \bibitem{Bethe} D. Dhar, Phase Transitions, {\bf 9}, 51 (1987); L.-H. Gwa and
1630: H. Spohn, Phys. Rev. Lett. {\bf 68}, 725 (1992); D. Kim, Phys. Rev. E {\bf 52},
1631: 3512 (1995).
1632:
1633: \bibitem{KS1} E. Katzav and M. Schwartz, Phys. Rev. E {\bf 70}, 061608 (2004).
1634:
1635: \bibitem{KNV} E. Katzav, S. Nechaev, and O. Vasilyev, cond-mat/0611537.
1636:
1637: \bibitem{W1} M.S. Waterman, {\em Introduction to Computational Biology} (Chapman \& Hall,
1638: London, 1994).
1639:
1640: \bibitem{Gusfield} D. Gusfield, {\em Algorithms on Strings, Trees, and Sequences} (Cambridge
1641: University Press, Cambridge, 1997).
1642:
\bibitem{DEKM} R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, {\em Biological Sequence
Analysis} (Cambridge University Press, Cambridge, 1998).
1645:
1646: \bibitem{NW} S.B. Needleman and C.D. Wunsch, J. Mol. Biol. {\bf 48}, 443 (1970).
1647:
\bibitem{SW} T.F. Smith and M.S. Waterman, J. Mol. Biol. {\bf 147}, 195 (1981); Adv. Appl.
Math. {\bf 2}, 482 (1981).
1650:
1651: \bibitem{WGA} M.S. Waterman, L. Gordon, and R. Arratia, Proc. Natl. Acad. Sci. USA,
1652: {\bf 84}, 1239 (1987).
1653:
\bibitem{AGMML} S.F. Altschul et al., J. Mol. Biol. {\bf 215}, 403 (1990).
1655:
\bibitem{SK} D. Sankoff and J. Kruskal, {\em Time Warps, String Edits, and Macromolecules:
The Theory and Practice of Sequence Comparison} (Addison-Wesley, Reading, Massachusetts,
1983).
1659:
\bibitem{AG} A. Apostolico and C. Guerra, Algorithmica {\bf 2}, 315 (1987).
1661:
\bibitem{WF} R. Wagner and M. Fischer, J. Assoc. Comput. Mach. {\bf 21}, 168 (1974).
1663:
1664: \bibitem{CS} V. Chv\'atal and D. Sankoff, J. Appl. Probab. {\bf 12}, 306 (1975).
1665:
1666: \bibitem{Deken} J. Deken, Discrete Math. {\bf 26}, 17 (1979).
1667:
1668: \bibitem{Steele} J.M. Steele, SIAM J. Appl. Math. {\bf 42}, 731 (1982).
1669:
1670: \bibitem{DP} V. Dancik and M. Paterson, in STACS94, Lecture Notes in Computer Science, {\bf
1671: 775}, 306 (Springer, New York, 1994).
1672:
1673: \bibitem{Alex} K.S. Alexander, Ann. Appl. Probab. {\bf 4}, 1074 (1994).
1674:
1675: \bibitem{KLM} M. Kiwi, M. Loebl, and J. Matousek, math.CO/0308234.
1676:
1677: \bibitem{ZM} M. Zhang and T. Marr, J. Theor. Biol. {\bf 174}, 119 (1995).
1678:
\bibitem{Hwa} T. Hwa and M. L\"assig, Phys. Rev. Lett. {\bf 76}, 2591 (1996); R. Bundschuh
and T. Hwa, Discrete Appl. Math. {\bf 104}, 113 (2000).
1681:
1682: \bibitem{Monvel} J. Boutet de Monvel, European Phys. J. B {\bf 7}, 293 (1999); Phys. Rev. E
1683: {\bf 62}, 204 (2000).
1684:
1685: \bibitem{MPV} M. M\'ezard, G. Parisi, and M.A. Virasoro, eds., {\em Spin Glass Theory
1686: and Beyond} (World Scientific, Singapore, 1987).
1687:
1688: \bibitem{KL1} E. Kussell and S. Leibler, Science, {\bf 309}, 2075 (2005).
1689:
1690: \bibitem{MMN} S.N. Majumdar, K. Mallick, and S. Nechaev, in preparation.
1691:
1692: \bibitem{BMat} F. Bonetto and H. Matzinger, arXiv:math.CO/0410404.
1693:
1694:
1695:
1696:
1697:
1698:
1699: \end{thebibliography}
1700: %
1701: %spell_to
1702: \end{document}
1703: %
1704: