0701:cond-mat0701193/lh3.tex

1: %BeginFileInfo

2: %%Publisher=ELSEVIER

3: %%Project=NUPHA

4: %%Manuscript=NPA8550

5: %%Stage=308

6: %%TID=elvyraa

7: %%Pages=71

8: %%Format=latex006

9: %%Distribution=live4

10: %%Destination=DVI

11: %%DVI.Maker=vtex_tex_dvi

20: %EndFileInfo

21: %

22: % Journal NPA, Elsevier

23: % Typeset by VTeX Ltd., Vilnius, Lithuania

24: %

25: %Spelling_date

26: % Opcijos: [rotating,secthm,seceqn,secfloat,nameyear,xxtheorem]

27: \documentclass{article}

28: \usepackage{bookstyle,bm,cite}

29: \usepackage{graphicx}

30: %\psdraft

31: %spell_from

32: %

33: % Local def's for bibliography

34: %

35: \let\bauthor\relax

36: \let\fnm\relax\let\snm\relax

37: \let\bseries\ignorespaces

38: \let\btitle\relax

39: \let\bvolumeno\textbf

40: \def\bdate#1{\unskip\ (#1)}

41: \def\bfirstpage#1{\unskip\ #1}

42: \def\bcomment#1{\unskip, #1}

43: %

44: \begin{document}

45: \title{Random Matrices, the Ulam Problem, Directed Polymers \& Growth Models, and Sequence Matching}

46: %\runtitle{}

47: \author{Satya N. Majumdar}

48: \address{

49: Laboratoire de Physique Th\'eorique et

50: Mod\`eles Statistiques (UMR 8626 du CNRS),

51: Universit\'e Paris-Sud, B\^at. 100, 91405 Orsay Cedex, France

52: }

53: \frontmatter

54: \maketitle

55: \mainmatter%

56: %s1 ###

57: \section{Preamble}

58:

59: In these lecture notes I will give a pedagogical introduction to some common aspects of $4$ different

60: problems: (i) random matrices (ii) the longest increasing subsequence problem (also known as

61: the Ulam problem) (iii) directed

62: polymers in random medium

63: and growth models in $(1+1)$ dimensions and (iv) a problem on the alignment of a pair

64: of random sequences.

65: Each of these problems is almost entirely a sub-field by itself and here I will discuss only some specific

66: aspects of each of them. These $4$ problems have been studied almost independently for the

67: past few decades, but only over the last few years has a common thread been found to

68: link all of them. In particular all of them share one common limiting probability distribution

69: known as the Tracy-Widom distribution that describes the asymptotic probability distribution

70: of the largest eigenvalue of a random matrix. I will mention here, without mathematical

71: derivation, some of the

72: beautiful results discovered in the past few years. Then, I will consider two specific models

73: (a) a ballistic deposition growth model and (b) a model of sequence alignment known as the

74: Bernoulli matching model and discuss, in some detail, how one derives exactly the

75: Tracy-Widom law in these models. The emphasis of these lectures will be on how to

76: map one model to another. Some open problems will be discussed at the end.

77:

78: \section{Introduction}

79:

80: In these lectures I will discuss $4$ seemingly unrelated problems: (i) random matrices

81: (ii) the longest increasing subsequence (LIS) problem (also known as the Ulam

82: problem after its discoverer) (iii) directed polymers in random environment in $(1+1)$ dimensions

83: and related random growth models and (iv) the longest common subsequence (LCS)

84: problem arising in matching of a pair of random sequences (see Fig. \ref{fields}). These 4 problems

85: have been studied extensively, but almost independently, over the past few

86: decades.

87: For example, random matrices have

88: been extensively studied by

89: nuclear physicists, mathematicians and statisticians. The LIS problem

90: has been studied extensively by probabilists. The models of directed polymers in

91: random medium

92: and the related growth models have been a very popular subject among

93: statistical physicists. Similarly, the LCS problem has been very popular

94: among biologists and computer scientists. Only in the last $10$ years or so

95: has it become progressively evident that there are profound links between these

96: $4$ problems. All of them share one common probability distribution function

97: which is called the Tracy-Widom distribution.

98: \begin{figure}[t]

99: %\fbox{\vtop to3cm{\vss\hsize=.7\hsize\centerline{fields.eps}\vss}}

100: \includegraphics[width=.7\hsize]{fields.eps}

101: \caption{All $4$ problems share the Tracy-Widom distribution.}

102: \label{fields}

103: \end{figure}

104:

105:

106: This distribution was first discovered in the context of random matrices

107: by Tracy and Widom~\cite{TW1}. They calculated exactly the probability distribution

108: of the {\em typical} fluctuations of the largest eigenvalue of a random matrix around

109: its mean. This distribution, suitably scaled, is known as the Tracy-Widom (TW)

110: distribution (see later for details). Later in 1999, in a landmark paper~\cite{BDJ},

111: Baik, Deift and Johansson (BDJ) showed that the same TW distribution

112: describes the scaled distributions of the length of the longest

113: increasing subsequence in the LIS problem. Immediately after, Johansson~\cite{J1} and

114: Baik and Rains~\cite{BR1} showed that the same distribution also appears

115: in a class of directed polymer problems. Around the same time, Pr\"ahofer

116: and Spohn showed~\cite{PS} that the TW distribution also appears in

117: a class of random growth models known as the polynuclear growth (PNG) models.

118: Following this, it was discovered that the TW distribution

119: also occurred

120: in several other growth models, such as the `oriented digital boiling' model~\cite{GTW},

121: a ballistic deposition model~\cite{BD}, in PNG type of growth models

122: with varying initial conditions and in various geometries~\cite{IS,F1} and

123: also in the single-step growth model arising from the totally asymmetric exclusion process~\cite{S1}.

124: Also, a somewhat direct connection between the stochastic growth models

125: and the random matrix models via the so called `determinantal point processes' was found

126: in a series of works by Spohn and collaborators~\cite{Spohn}, which I will not discuss here

127: (see Ref. \cite{Spohn} for a recent review).

128: Finally, the TW

129: distribution was also shown to appear in the LCS problem~\cite{MN}, which is also related

130: to these growth models.

131: Apart from these 4 problems that we will focus on here, the TW distribution

132: has also appeared in many other problems, e.g., in the mesoscopic

133: fluctuations of excitation gaps in a dirty metal grain or a semiconductor quantum dot induced

134: by a nearby superconductor~\cite{meso}.

135: The TW distribution also appears in problems related to finance~\cite{BBP}.

136:

137: The appearance of the TW distribution in so many different problems

138: is really interesting, suggesting an underlying universality

139: that links all these different systems. The purpose of my lectures will

140: be to explore and elucidate the links between the 4 problems stated above.

141: The literature on this subject is huge. I will not try to provide

142: any detailed derivation of the mathematical results here. Instead,

143: I will state precisely the known results that we will need to use and put

144: more emphasis on how one maps one problem to the other. In particular,

145: I will discuss two problems in some detail and show how the TW distribution

146: appears in them. These two problems are: (i) a random growth model

147: in $(1+1)$ dimensions that we call the anisotropic ballistic deposition model

148: and (ii) a particular variant of the LCS problem known as the Bernoulli

149: matching (BM) model. In the former case, I will show how to map

150: the ballistic deposition model to the LIS problem and subsequently use the BDJ

151: results. In the second case, I will show that the BM model can be mapped to

152: a particular directed polymer model that was studied by Johansson.

153: The mappings are often geometric in nature, are nontrivial and serve

154: two purposes: (a) to elucidate how the TW distribution appears in

155: somewhat unrelated problems and (b) to derive exact analytical results in problems such

156: as the sequence matching models, where precise analytical results were

157: missing so far.

158:

159: The lecture notes are organized as follows. In Section 3, I will describe

160: some basic results of the random matrix theory and define the TW distribution

161: precisely. In Section 4, the LIS problem will be described along with

162: the main results of BDJ.

163: Section 5 contains a discussion of the directed polymer problems,

164: and in particular the main results of Johansson will be mentioned. In Section 5.1, I will describe

165: how one maps the anisotropic ballistic deposition model to the LIS problem.

166: Section 6 contains a discussion of the LCS problem. Finally, I will conclude

167: in Section 7 with a discussion and open problems.

168:

169: \section{Random Matrices: the Tracy-Widom distribution for the largest eigenvalue}

170:

171: Studies of the statistics of the eigenvalues of random matrices have a

172: long history going back to the seminal work of Wigner~\cite{Wigner}.

173: Since then, random matrices have found applications in multiple fields

174: including nuclear physics, quantum chaos, disordered systems, string

175: theory and number theory~\cite{Mehta}. Three classes of matrices with

176: Gaussian entries have played important roles~\cite{Mehta}: $(N\times

177: N)$ real symmetric (Gaussian Orthogonal Ensemble (GOE)), $(N\times N)$

178: complex Hermitian (Gaussian Unitary Ensemble (GUE)) and $(2N\times

179: 2N)$ self-dual Hermitian matrices (Gaussian Symplectic Ensemble

180: (GSE)). For example, in GOE, one considers an $(N\times

181: N)$ real symmetric matrix $X$ whose elements $x_{ij}$'s are drawn

182: independently from a Gaussian distribution: $P(x_{ii})= \frac{1}{\sqrt{2\pi}}\,\exp[-x_{ii}^2/2]$

183: and $P(x_{ij}) = \frac{1}{\sqrt{\pi}}\,\exp[-x_{ij}^2]$ for $i<j$. Thus the

184: joint distribution of all the $N(N+1)/2$ independent elements is just the product

185: of the individual distributions and can be written in a compact form as

186: $P[X]= A_N \exp[-{\rm tr}(X^2)/2]$, where $A_N$ is a normalization constant.

187: One can similarly write down the joint distribution for the other two ensembles~\cite{Mehta}.
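As a quick check of the compact form (a worked step added here for clarity, using only the definitions above), one can multiply the individual Gaussian weights and use the symmetry $x_{ij}=x_{ji}$:
\begin{eqnarray*}
\prod_{i} e^{-x_{ii}^2/2}\,\prod_{i<j} e^{-x_{ij}^2}
&=& \exp\left[-\frac{1}{2}\left(\sum_{i} x_{ii}^2 + 2\sum_{i<j} x_{ij}^2\right)\right] \\
&=& \exp\left[-\frac{1}{2}\sum_{i,j} x_{ij}\,x_{ji}\right]
= \exp\left[-\frac{1}{2}\,{\rm tr}(X^2)\right].
\end{eqnarray*}
Collecting the $N$ diagonal prefactors $1/\sqrt{2\pi}$ and the $N(N-1)/2$ off-diagonal prefactors $1/\sqrt{\pi}$ then gives $A_N = (2\pi)^{-N/2}\,\pi^{-N(N-1)/4}$ for the GOE with this choice of variances.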

188:

189: One of the key results in the random matrix theory is due to Wigner who derived,

190: starting from the joint distribution of the matrix elements $P[X]$, a rather

191: compact expression for the

192: joint probability density function (PDF) of the eigenvalues of a random $(N\times

193: N)$ matrix from all ensembles~\cite{Wigner}

194: \begin{equation}

195: P(\lambda_1, \lambda_2,\dots, \lambda_N) = B_N \exp\left[-\frac{\beta}{2}\left(\sum_{i=1}^N\lambda_i^2

196: -\sum_{i\ne j}\ln(|\lambda_i-\lambda_j|)\right)\right],

197: \label{pdf}

198: \end{equation}

199: where $B_N$ normalizes the pdf and $\beta=1$, $2$ and $4$ correspond

200: respectively to the GOE, GUE and GSE. The joint law allows one to

201: interpret the eigenvalues as the positions of charged particles,

202: repelling each other via a $2$-d Coulomb potential (logarithmic);

203: they are confined on a $1$-d line and each is subject to an external harmonic

204: potential. The parameter $\beta$ that characterizes the type of

205: ensemble can be interpreted as the inverse temperature.
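To make the Coulomb gas picture concrete, one may rewrite Eq. (\ref{pdf}) as a Boltzmann weight (the symbol $E$ below is introduced only for this illustration and is not used elsewhere in these notes):
\[
P(\lambda_1,\dots,\lambda_N) = B_N\, e^{-\beta E(\{\lambda_i\})}, \qquad
E(\{\lambda_i\}) = \frac{1}{2}\sum_{i=1}^N \lambda_i^2 - \frac{1}{2}\sum_{i\ne j}\ln|\lambda_i-\lambda_j| .
\]
For instance, for $N=2$ the double sum contains the pair twice, so that
\[
P(\lambda_1,\lambda_2) = B_2\, |\lambda_1-\lambda_2|^{\beta}\, \exp\left[-\frac{\beta}{2}\left(\lambda_1^2+\lambda_2^2\right)\right],
\]
which vanishes as $|\lambda_1-\lambda_2|^{\beta}$ when the two eigenvalues approach each other. This is the repulsion between the charges, and it is stronger for larger $\beta$.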

206:

207: Once the joint pdf is known explicitly, other statistical properties of a random matrix

208: can, in principle, be derived from this joint pdf. In practice, however

209: this is often a technically daunting task. For example, suppose we want to

210: compute the average density of states of the eigenvalues defined as

211: $\rho(\lambda,N)= \sum_{i=1}^N\langle

212: \delta(\lambda-\lambda_i)\rangle/N$, which counts the average number of

213: eigenvalues between $\lambda$ and $\lambda + d\lambda$ per unit length.

214: The angled bracket $\langle \rangle$ denotes an average over the joint pdf.

215: It then follows that $\rho(\lambda,N)$ is simply the marginal of the joint pdf,

216: i.e., we fix one of the eigenvalues (say the first one) at $\lambda$ and integrate the joint pdf

217: over the rest of the $(N-1)$ variables.

218: \begin{equation}

219: \rho(\lambda,N)=\frac{1}{N} \sum_{i=1}^N\langle

220: \delta(\lambda-\lambda_i)\rangle =\int_{-\infty}^{\infty}\prod_{i=2}^N d\lambda_i \,

221: P(\lambda,\lambda_2,\dots, \lambda_N).

222: \label{marginal}

223: \end{equation}

224: Wigner was able to compute this marginal and this is one of the central results

225: in the random matrix theory, known as the celebrated Wigner semi-circular law. For large $N$

226: and for any $\beta$,

227: \begin{equation}

228: \rho (\lambda,N) = \sqrt{\frac{2}{N\pi^2}}\,{\left[1 -\frac{\lambda^2}{2N}\right]}^{1/2}.

229: \label{wig1}

230: \end{equation}

231: Thus, on an average, the $N$ eigenvalues lie within a

232: finite interval $\left[-\sqrt{2N}, \sqrt{2N}\right]$, often referred

233: to as the Wigner `sea'. Within this sea, the average density of states

234: has a semi-circular form (see Fig. \ref{figtw}) that vanishes at the

235: two edges $-\sqrt{2N}$ and $\sqrt{2N}$. Note that since there are $N$

236: eigenvalues distributed over the interval $\left[-\sqrt{2N}, \sqrt{2N}\right]$, the

237: average spacing between adjacent eigenvalues scales as $N^{-1/2}$.
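As a consistency check on Eq. (\ref{wig1}) (a short worked step, not part of the original derivation), the substitution $\lambda=\sqrt{2N}\,u$ shows that the density is normalized to unity:
\[
\int_{-\sqrt{2N}}^{\sqrt{2N}} \rho(\lambda,N)\, d\lambda
= \sqrt{\frac{2}{N\pi^2}}\;\sqrt{2N}\int_{-1}^{1}\sqrt{1-u^2}\, du
= \frac{2}{\pi}\cdot\frac{\pi}{2} = 1 .
\]
Similarly, the estimate of the mean spacing follows from dividing the width of the support by the number of eigenvalues, $2\sqrt{2N}/N = 2\sqrt{2}\,N^{-1/2}$.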

238: \begin{figure}

239: \includegraphics[width=.7\hsize]{tw.eps}

240: \caption{The dashed line shows the semi-circular form of the

241: average density of states. The largest eigenvalue is centered around its mean $\sqrt{2N}$

242: and fluctuates over a scale of width $N^{-1/6}$. The probability of fluctuations

243: on this scale is described by the Tracy-Widom distribution (shown schematically).}

244: \label{figtw}

245: \end{figure}

246:

247: From the semi-circular law, it is clear that the average of the maximum (or minimum) eigenvalue

248: is $\sqrt{2N}$ $\left(-\sqrt{2N}\right)$. However, for finite but large $N$, the maximum

249: eigenvalue fluctuates, around its mean $\sqrt{2N}$, from one sample to

250: another. A natural question is: what is the full probability distribution

251: of the largest eigenvalue $\lambda_{\rm max}$? Once again, this distribution

252: can, in principle, be computed from the joint pdf in Eq. (\ref{pdf}). To see

253: this, it is useful to consider the cumulative distribution of $\lambda_{\rm max}$.

254: Clearly, if $\lambda_{\rm max}\le t$, it necessarily means that all the eigenvalues

255: are less than or equal to $t$. Thus,

256: \begin{equation}

257: {\rm Prob}\left[\lambda_{\rm max}\le t, N\right]= \int_{-\infty}^t \prod_{i=1}^N d\lambda_i \,

258: P(\lambda_1,\lambda_2,\dots, \lambda_N),

259: \label{max1}

260: \end{equation}

261: where the joint pdf is given in Eq. (\ref{pdf}).

262: In practice, however, carrying out this multiple integration in closed form is very difficult.

263: Relatively recently, Tracy and Widom~\cite{TW1} were

264: able to find the limiting form of ${\rm Prob}\left[\lambda_{\rm

265: max}\le t,

266: N\right]$ for large $N$. They showed that the fluctuations of $\lambda_{\rm max}$

267: {\em typically} occur over a very narrow scale of

268: width $\sim N^{-1/6}$ around its mean $\sqrt{2N}$ at the upper edge of the Wigner sea.

269: It is useful to note that this scale $\sim N^{-1/6}$ of typical fluctuations

270: of the largest eigenvalue is much bigger than the average spacing $\sim N^{-1/2}$

271: between adjacent eigenvalues in the limit of large $N$.
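To get a feeling for these scales, consider the illustrative value $N=10^6$ (chosen here only as an example). Then
\[
\sqrt{2N}\approx 1414.2, \qquad N^{-1/6}=10^{-1}, \qquad N^{-1/2}=10^{-3},
\]
so the typical fluctuations of $\lambda_{\rm max}$ are tiny compared to its mean, yet larger than the mean level spacing by a factor of order $N^{-1/6}/N^{-1/2}=N^{1/3}=100$.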

272:

273: More precisely, Tracy and Widom showed~\cite{TW1} that asymptotically for

274: large $N$, the scaling variable $\xi=\sqrt{2}\,N^{1/6}\, \left[\lambda_{\rm

275: max}-\sqrt{2N}\right]$ has a limiting $N$-independent probability

276: distribution, ${\rm Prob}[\xi\le x]= F_{\beta}(x)$ whose form depends

277: on the value of the parameter $\beta=1$, $2$ and $4$ characterizing

278: respectively the GOE, GUE and GSE. The function $F_{\beta}(x)$ is called

279: the Tracy-Widom (TW) distribution function. The function $F_{\beta}(x)$,

280: computed as a solution of a nonlinear Painlev\'e differential equation~\cite{TW1},

281: approaches $1$ as $x\to \infty$ and decays rapidly to zero as $x\to

282: -\infty$. For example, for $\beta=2$, $F_2(x)$ has the following

283: tails~\cite{TW1},

284: \begin{eqnarray}

285: F_2(x) &\to & 1- O\left(\exp[-4x^{3/2}/3]\right)\quad\, {\rm as}\,\,\, x\to \infty

286: \nonumber \\

287: &\to & \exp[-|x|^3/12] \quad\, {\rm as}\,\,\, x\to -\infty.

288: \label{asymp1}

289: \end{eqnarray}

290: The probability density function $f_{\beta}(x)=dF_{\beta}/dx$ thus has highly

291: asymmetric tails. A graph of these functions for $\beta=1$, $2$ and $4$

292: is shown in Fig. \ref{fig:tracy}.

293: A convenient way to express these typical fluctuations of $\lambda_{\rm max}$

294: around its mean $\sqrt{2N}$ is to write, for large $N$,

295: \begin{equation}

296: \lambda_{\max} = \sqrt{2N} + \frac{N^{-1/6}}{\sqrt{2}}\, \chi

297: \label{tw2}

298: \end{equation}

299: where the random variable $\chi$ has the limiting $N$-independent distribution,

300: ${\rm Prob}[\chi \le x] = F_{\beta}(x)$.

301: As mentioned in the introduction, amazingly this TW distribution function has since

302: emerged in a growing variety of seemingly unrelated problems, some of which I

303: will discuss in the next sections.

304: \begin{figure}

305: \includegraphics[width=.7\hsize]{tracy.eps}

306: \caption{The probability density function $f_{\beta}(x)$ plotted as a

307: function of $x$ for $\beta=1$, $2$ and $4$ (reproduced from Ref.~\cite{TW1}).}

308: \label{fig:tracy}

309: \end{figure}

310: \vspace{0.4cm}

311:

312: {\bf {Large Deviations of $\lambda_{\rm max}$:}} Before we end this section and proceed to the other

313: problems, it is worth making

314: the following remark. The Tracy-Widom distribution describes the probability of {\em typical and small}

315: fluctuations of $\lambda_{\rm max}$ over a very narrow region of width

316: $\sim O(N^{-1/6})$ around the mean $\langle \lambda_{\rm max}\rangle

317: \approx \sqrt{2N}$. A natural question is how to describe the

318: probability of {\em atypical and large} fluctuations of $\lambda_{\rm max}$ around its

319: mean, say over a wider region of width $\sim O(N^{1/2})$? For example,

320: what is the probability that all the eigenvalues of a random matrix

321: are negative (or equivalently all are positive)?  This is the same as

322: the probability that $\lambda_{\rm max}\le 0$ (or equivalently

323: $\lambda_{\rm min}\ge 0$). Since $\langle \lambda_{\rm max}\rangle

324: \approx \sqrt{2N} $, this requires the computation of the probability

325: of an extremely rare event characterizing a large deviation of $\sim

326: -O(N^{1/2})$ to the left of the mean.

327: This question naturally arises in any physical system where one

328: is interested in the statistics of stationary points of a random landscape.

329: For example, in disordered systems such as spin glasses one is interested in

330: the stationary points (metastable states) of the free energy landscape.

331: On the other hand, in structural glasses or supercooled liquids, one is

332: interested in the stationary points of the potential energy landscape.

333: In order to have a local minimum of the

334: random landscape one needs to ensure that the eigenvalues of the

335: associated Hessian matrix are all positive~\cite{CGG,Fyodorov}.

336: A similar question recently came up

337: in the context of random landscape models of anthropic principle

338: based string theory~\cite{Susskind,AE} as well as in quantum

339: cosmology~\cite{MH}.  Here one is interested in the statistical

340: properties of vacua associated with a random multifield potential,

341: e.g., how many minima are there in a random string landscape?

342: These large deviations are also important in characterizing the large sample

343: to sample fluctuations of the excitation gap in quantum dots

344: connected to a superconductor~\cite{meso}.

345:

346: The issue of large deviations of $\lambda_{\rm max}$ was addressed

347: in Ref. \cite{J1} for a special class of matrices drawn

348: from the Laguerre ensemble that corresponds to the eigenvalues of product

349: matrices of the form $W=X^{\dagger}X$ where $X$ itself is a Gaussian

350: matrix (real or complex). Adopting similar methods as in

351: Ref. \cite{J1}

352: one can prove that for Gaussian ensembles,

353: the probability of {\em large} fluctuations to the left of the mean $\sqrt{2N}$

354: behaves for large $N$ as,

355: \begin{equation}

356: {\rm Prob}\left[\lambda_{\rm max}\le t, N\right] \sim \exp\left[-\beta

357: N^2 \Phi_{-}\left( \frac{\sqrt{2N}-t}{\sqrt{N}} \right) \right]

358: \label{ldf1}

359: \end{equation}

360: where $t\sim O(N^{1/2})\le \sqrt{2N}$ is located deep inside the

361: Wigner sea and $\Phi_{-}(y)$ is a certain {\em left} large deviation function.

362: On the other hand, for {\em large} fluctuations to the right of the mean $\sqrt{2N}$,

363: \begin{equation}

364: 1-{\rm Prob}\left[\lambda_{\rm max}\le t, N\right] \sim \exp\left[-\beta

365: N \Phi_{+}\left( \frac{t-\sqrt{2N}}{\sqrt{N}} \right) \right]

366: \label{ldf2}

367: \end{equation}

368: for $t\sim O(N^{1/2})\ge \sqrt{2N}$ located outside the Wigner sea to its right

369: and $\Phi_{+}(y)$ is the {\em right} large deviation function.

370: The problem then is to evaluate the left and the right large deviation

371: functions $\Phi_{\mp}(y)$ explicitly.

372: While, for the Laguerre ensemble, an explicit

373: expression of $\Phi_{+}(y)$ was obtained in Ref. \cite{J1} and

374: that of $\Phi_{-}(y)$ recently in Ref. \cite{VMB}, similar expressions

375: for the Gaussian ensemble were missing so far.

376:

377: Indeed, to calculate the probability

378: that all eigenvalues are negative (or positive) for Gaussian matrices, we need an explicit expression

379: of $\Phi_{-}(y)$ for the Gaussian ensemble. This is because the probability that all

380: eigenvalues are negative is precisely the probability that $\lambda_{\rm max}\le 0$,

381: and hence, from Eq. (\ref{ldf1})

382: \begin{equation}

383: {\rm Prob}\left[\lambda_{\rm max}\le 0, N\right]\sim \exp[-\beta N^2 \Phi_{-}(\sqrt{2})].

384: \label{exp1}

385: \end{equation}

386: The coefficient $\theta= \beta \Phi_{-}(\sqrt{2})$ of the $N^2$ term inside

387: the exponential term in Eq. (\ref{exp1}) is of interest in string theory,

388: and in Ref. \cite{AE}, the authors provided an approximate estimate (for $\beta=1$) of

389: $\theta \approx 1/4$, along with numerical simulations.

390: Recently, in collaboration with D.S. Dean,

391: we were able to compute exactly an explicit expression~\cite{DM} for

392: the full {\em left} large deviation function $\Phi_{-}(y)$.

393: I will not provide the derivation here, but the calculation of

394: {\em large} deviations turns out to be somewhat simpler~\cite{DM} than the calculation of the {\em small}

395: deviations \`a la TW. One simply has to minimize the effective free energy

396: of a Coulomb gas using the method of steepest descents and then analyze the

397: resulting saddle point equation (which is an integral equation)~\cite{DM}.

398: This technique is quite useful, as it can be applied to other problems

399: as well, such as the calculation of the average number of stationary points

400: for Gaussian random fields with $N$ components in the large $N$ limit~\cite{BrayDean,FSW}

401: and also the large deviation function associated with the largest eigenvalue

402: of other types of matrices, such as the Wishart matrices~\cite{VMB}.

403: In terms of the variable $z=y-\sqrt{2}$, the {\em left} large deviation

404: function has the following

405: explicit expression~\cite{DM}

406: \begin{eqnarray}

407: \Phi_{-}(y=z+\sqrt{2})& =& -\frac{1}{8}(3+2 \ln 2) + \frac{1}{216}\left[ 72z^2 -2z^4

408: + (30z + 2z^3) \sqrt{6+z^2} \right.\nonumber \\

409: &+& \left. 27\left( 3 + \ln(1296) - 4 \ln\left(-z +

410: \sqrt{6 +z^2}\right) \right) \right].

411: \label{ldfl}

412: \end{eqnarray}

413: In particular, the constant $\theta$ is given exactly by

414: \begin{equation}

415: \theta = \beta\, \Phi_{-}(\sqrt{2})= \beta\, \frac{\ln 3}{4} = (0.274653\dots )\,\beta.

416: \label{theta}

417: \end{equation}
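The value in Eq. (\ref{theta}) can be checked directly from Eq. (\ref{ldfl}) (a short worked step added for the reader). Setting $y=\sqrt{2}$, i.e., $z=0$, all terms proportional to powers of $z$ drop out and only the constant and logarithmic terms survive:
\begin{eqnarray*}
\Phi_{-}(\sqrt{2}) &=& -\frac{1}{8}(3+2\ln 2) + \frac{27}{216}\left[3+\ln(1296)-4\ln\sqrt{6}\,\right] \\
&=& \frac{1}{8}\left[\ln(1296)-\ln(36)-\ln(4)\right] = \frac{1}{8}\ln 9 = \frac{\ln 3}{4},
\end{eqnarray*}
where we used $4\ln\sqrt{6}=\ln 36$ and $1296/(36\times 4)=9$.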

418:

419: Another interesting point about the left large deviation function $\Phi_{-}(y)$ is the following.

420: It describes the probability of large $\sim O(\sqrt{N})$ fluctuations to the left of the mean, i.e.,

421: when $y=(\sqrt{2N}-\lambda_{\rm max})/\sqrt{N} \sim O(1)$. Now, if we take the $y\to 0$ limit,

422: then $\Phi_{-}(y)$ should describe the {\em small} fluctuations to the left of the mean $\sqrt{2N}$.

423: In other words, we expect to recover the left tail of the TW distribution by taking the $y\to 0$

424: limit in the left large deviation function. Indeed, as $y\to 0$, one finds from Eq. (\ref{ldfl}),

425: that $\Phi_{-}(y) \approx y^3/{6\sqrt{2}}$. Putting this expression back in Eq. (\ref{ldf1})

426: one gets

427: \begin{equation}

428: {\rm Prob}[\lambda_{\rm max}\le t, N]\approx \exp\left[-\frac{\beta}{24}\big|\sqrt{2}\,

429: N^{1/6}\,(t-\sqrt{2N})\big|^3\right]

430: \label{asymp3}

431: \end{equation}

432: Given that $\chi= \sqrt{2}\,

433: N^{1/6}\,\left(t-\sqrt{2N}\right)$ is the Tracy-Widom scaling variable, we find that the result

434: in Eq. (\ref{asymp3}) matches exactly with the left

435: tail of the Tracy-Widom distribution for all $\beta$.

436: For example, for $\beta=2$ one can easily verify this by comparing Eqs. (\ref{asymp3})

437: and (\ref{asymp1}).
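The arithmetic behind this matching is short enough to spell out (a worked step added for clarity). With $y=(\sqrt{2N}-t)/\sqrt{N}$ and $\Phi_{-}(y)\approx y^3/(6\sqrt{2})$, the exponent in Eq. (\ref{ldf1}) becomes
\[
\beta N^2\,\frac{1}{6\sqrt{2}}\,\frac{(\sqrt{2N}-t)^3}{N^{3/2}}
= \frac{\beta}{6\sqrt{2}}\,N^{1/2}\,(\sqrt{2N}-t)^3
= \frac{\beta}{24}\left[\sqrt{2}\,N^{1/6}\,(\sqrt{2N}-t)\right]^3 ,
\]
where the last step uses $(\sqrt{2})^3/24 = 1/(6\sqrt{2})$. For $\beta=2$ the prefactor is $1/12$, in agreement with the second line of Eq. (\ref{asymp1}).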

438: This approach not only serves as a useful check that one has obtained the correct

439: large deviation function $\Phi_{-}(y)$, but also provides an alternative and simpler way

440: to derive the asymptotics of the left tail of the TW distribution.

441: A similar expression for the right large deviation function $\Phi_+(y)$ for the

442: Gaussian ensemble is still missing and its computation remains an open problem.

443:

444: Although the Tracy-Widom distribution was originally derived as the limiting distribution

445: of the largest eigenvalue of matrices whose elements are drawn from Gaussian distributions,

446: it is now believed that the same limiting distribution also holds for matrices drawn

447: from a larger class of ensembles, e.g., when the entries are independent

448: and identically distributed random variables drawn from an arbitrary distribution

449: with all moments finite~\cite{Sosh,BBP1}.

450: Recently, Biroli, Bouchaud and Potters~\cite{BBP} extended this result to

451: power-law ensembles, where each entry of a random matrix is drawn independently

452: from a power-law distribution~\cite{CB,Burda}.

453: They showed that

454: as long as the fourth moment of this power-law distribution is finite, the suitably

455: scaled $\lambda_{\rm max}$ is again TW distributed, but when the fourth moment is

456: infinite, $\lambda_{\rm max}$ has Fr\'echet fluctuations~\cite{BBP}. It would be interesting

457: to compute the probability of {\em large} deviations of $\lambda_{\rm max}$

458: for this power-law ensemble, as in the Gaussian case mentioned above. For example,

459: what is the probability that all the eigenvalues of such random matrices (drawn

460: from the power-law ensemble) are negative (or positive), i.e. $\lambda_{\rm max}\le 0$?

461: This is an open question.

462:

463: \section{The Longest Increasing Subsequence Problem (or the Ulam Problem)}

464:

465: The longest increasing subsequence (LIS) problem was first stated by Ulam~\cite{Ulam} in 1961, hence

466: it is also called Ulam's problem. Since then, a lot of research, mostly by

467: probabilists, has been done on this problem (for a brief history of the problem, see the

468: introduction in Ref. \cite{BDJ}). The problem can be stated very simply as follows.

469: Consider a set of $N$ distinct integers $\{1,2,3,\dots, N\}$. Consider all

470: $N!$ possible permutations of

471: this sequence. For any given

472: permutation, let us find all possible increasing subsequences (terms of a

473: subsequence need not necessarily be consecutive elements) and from them find

474: out the longest one. For example, take $N=10$ and consider a particular

475: permutation $\{8, 2, 7, \underbar 1, \underbar 3, \underbar 4, 10, \underbar 6,

476: \underbar 9, 5\}$. From this sequence, one can form several increasing

477: subsequences such as $\{8,10\}$, $\{2,3,4,10\}$, $\{1,3,4,10\}$ etc. The

478: longest one of all such subsequences is either $\{1,3,4,6,9\}$ as shown by the

479: underscores or $\{2,3,4,6,9\}$. The length $l_N$ of the LIS

480: (in our example $l_N=5$) is a random

481: variable as it varies from one permutation to another. In the Ulam problem one

482: considers all the $N!$ permutations to be equally likely. Given this uniform

483: measure over the space of permutations, what is the statistics of the random

484: variable $l_N$?
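For a given permutation, the length $l_N$ can be found by a simple recursion. This is a standard dynamic-programming sketch added here for illustration; it is not the algorithm used later in these notes, and the symbols $a_i$ and $\ell_i$ are introduced only for this example. Writing the permutation as $a_1,a_2,\dots,a_N$, let $\ell_i$ denote the length of the longest increasing subsequence that ends at $a_i$. Then
\[
\ell_i = 1 + \max_{j<i,\; a_j<a_i} \ell_j , \qquad l_N = \max_{1\le i\le N} \ell_i ,
\]
with the convention that the maximum over an empty set is $0$. For the permutation considered above one finds
\begin{center}
\begin{tabular}{c|cccccccccc}
$a_i$ & 8 & 2 & 7 & 1 & 3 & 4 & 10 & 6 & 9 & 5 \\
\hline
$\ell_i$ & 1 & 1 & 2 & 1 & 2 & 3 & 4 & 4 & 5 & 4
\end{tabular}
\end{center}
so that $l_N=5$, attained at $a_9=9$, in agreement with the subsequences $\{1,3,4,6,9\}$ and $\{2,3,4,6,9\}$ identified above.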

485:

486: Ulam found numerically that the average length $\langle

487: l_N\rangle$ behaves asymptotically as $\langle l_N\rangle\sim c \sqrt{N}$ for

488: large $N$. Later this result was established rigorously by Hammersley

489: \cite{Hammersley} and the constant $c=2$ was found by Vershik and Kerov

490: \cite{VK}. Recently, in a seminal paper, Baik, Deift and Johansson (BDJ)

491: \cite{BDJ} derived the full distribution of $l_N$ for large $N$. In particular,

492: they showed that asymptotically for large $N$

493: \begin{equation}

494: l_N \to 2\sqrt {N} + N^{1/6} \chi

495: \label{lis1}

496: \end{equation}

497: where the

498: random variable $\chi$ has a limiting $N$-independent distribution,

499: \begin{equation}

500: {\rm Prob}(\chi\leq x) = F_2(x)

501: \label{gue}

502: \end{equation}

503: where $F_2(x)$ is precisely

504: the TW distribution for the largest eigenvalue of a random matrix

505: drawn from the GUE ($\beta=2$), as defined in Section 3.

506: Note that the power of $N$ in the correction term in Eq. (\ref{lis1}) is ${+1/6}$

507: as opposed to the asymptotic law in Eq. (\ref{tw2}) where the power of $N$ in the correction term

508: is $-1/6$. This means that while for random matrices of size $(N\times N)$, the typical

509: fluctuation of $\lambda_{\rm max}$ around its mean value $\sqrt{2N}$ {\em decreases} with

510: $N$ as $N^{-1/6}$ as $N\to \infty$ (i.e., the distribution gets narrower and narrower

511: around the mean as $N$ increases), the opposite happens in the Ulam problem: the

512: typical fluctuation in $l_N$ around its mean $2\sqrt{N}$ {\em increases} as $N^{1/6}$

513: with increasing $N$, i.e., the distribution around the mean gets broader and broader

514: with increasing $N$.
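As a numerical illustration (with the value $N=10^6$ chosen here only as an example), Eq. (\ref{lis1}) gives
\[
l_N \approx 2000 + 10\,\chi ,
\]
so the absolute width of the distribution, $N^{1/6}=10$, grows with $N$, while the width relative to the mean, $N^{1/6}/(2\sqrt{N}) = N^{-1/3}/2 = 0.005$, still shrinks. In this sense the distribution of $l_N$ broadens in absolute terms but remains sharply concentrated around $2\sqrt{N}$ on the scale of the mean.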

515:

516: BDJ also showed that

517: when the sequence length $N$ itself is a random variable drawn from a

518: Poisson distribution with mean $\langle N\rangle =\lambda$, the length of the LIS converges for

519: large $\lambda$ to

520: \begin{equation}

521: l_{\lambda}\to 2\sqrt{\lambda} + {\lambda}^{1/6} \chi,

522: \label{bdj1}

523: \end{equation}

524: where $\chi$ has the Tracy-Widom distribution $F_2(x)$. The fixed $N$ and the fixed

525: $\lambda$ ensembles are like the canonical and the grand canonical ensembles in

526: statistical mechanics. The

527: BDJ results led to an avalanche of subsequent mathematical works \cite{AD}.

528: \begin{figure}[t]

529: \includegraphics[width=.7\hsize]{psheap.eps}

530: \caption{The construction of piles according to the patience sorting game. The number

531: of piles corresponding to the sequence

532: $\{8,3,5,1,2,6,4,7\}$ is $4$, which is also the length of the LIS of this sequence.}

533: \label{psheap}

534: \end{figure}

535:

536: I will not provide here the derivation of the BDJ results, but I will assume this

537: result to be known and use it later for other problems. As we will see later, in

538: many problems, such as several growth models, the strategy is to map those models

539: into the LIS problem and subsequently use the BDJ results. In these mappings, typically

540: the height of a growing surface in the $(1+1)$ dimensional growth models gets mapped to

541: the length of the LIS, i.e., schematically, $H \to l_N$. Subsequently, using the BDJ

542: results for the distribution of $l_N$, one shows that the height in growth models

543: is distributed according to the Tracy-Widom law. I will show explicitly how this

544: strategy works for one specific ballistic deposition model in Section 5.1.

545: But to understand the mapping, we need to know one additional fact about the LIS, which I

546: discuss below.

547:

548: Suppose we are given a specific permutation of $N$ integers.

549: What is a simple algorithm to find the length of the LIS of this permutation?

550: The most famous algorithm goes by the name of Robinson-Schensted-Knuth (RSK)

551: algorithm~\cite{RSK}, which makes a correspondence between the permutation

552: and a Young tableau, and has played a very important role in the development

553: of the LIS problem. I will not discuss it

554: here; the reader can find a readable account in Ref. \cite{AD}. Instead, I will

555: discuss another related algorithm known as the `patience-sorting' algorithm which

556: will be more useful for our purposes. This algorithm was developed first by Mallows~\cite{Mallows}

557: who showed its connection to the Young tableaux. I will discuss here the version that was

558: discussed recently by Aldous and Diaconis~\cite{AD}. This algorithm is best explained

559: in terms of an example. Let us take $N=8$ and consider a specific permutation,

560: say $\{8,3,5,1,2,6,4,7\}$. The `patience sorting' is a greedy algorithm

561: that will easily find the length of the LIS of this sequence. It is like

562: a simple card game of `patience'. This game

563: goes as follows: start forming piles with the numbers in the permuted sequence

564: starting with the first element which is $8$ in our example. So, the number 8

565: forms the base of the first pile (see Fig. \ref{psheap}). The next element, if less than 8, goes on

566: top of 8. If not, it forms the base of a new pile. One follows a greedy

567: algorithm: for any new element of the sequence, check all the top numbers on

568: the existing piles starting from the first pile and, if the new number is less

569: than the top number of an existing pile, it goes on top of the first such pile.

570: If the new number is larger than all the top numbers of the existing piles,

571: this new number forms the base of a new pile. Thus in our example, we form $4$

572: distinct piles: $[\{8,3,1\}, \{5,2\}, \{6,4\}, \{7\}]$. Thus the number of piles

573: is $4$. On the other hand, for this particular example, it is easy to check

574: that there are $3$ LIS's namely, $\{3,5,6,7\}$, $\{1,2,6,7\}$ and $\{1,2,4,7\}$, all of the same

575: length $l=4$. So, we see that the length of the LIS is $4$, same as the number of

576: piles in the patience sorting game. But this is not an accident. One can

577: easily prove~\cite{AD} that for any given permutation of $N$ integers, the length of the

578: LIS $l_N$ is exactly the same as the number of piles in the corresponding `patience sorting'

579: algorithm. We will see later that this fact does indeed play a crucial role in our mapping

580: of growth models to the LIS problem.

581:
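To make the rule concrete, here is a step-by-step trace of the example $\{8,3,5,1,2,6,4,7\}$, added purely as an illustration. Each new number is placed on the first pile (scanning from the left) whose top number exceeds it, and each row lists the piles after that placement, with the top of a pile written last.
\begin{center}
\begin{tabular}{cl}
new number & piles after placement \\
\hline
8 & $\{8\}$ \\
3 & $\{8,3\}$ \\
5 & $\{8,3\},\ \{5\}$ \\
1 & $\{8,3,1\},\ \{5\}$ \\
2 & $\{8,3,1\},\ \{5,2\}$ \\
6 & $\{8,3,1\},\ \{5,2\},\ \{6\}$ \\
4 & $\{8,3,1\},\ \{5,2\},\ \{6,4\}$ \\
7 & $\{8,3,1\},\ \{5,2\},\ \{6,4\},\ \{7\}$ \\
\end{tabular}
\end{center}
A new pile is opened only by the numbers $8$, $5$, $6$ and $7$, and at every stage the top numbers of the piles increase from left to right. Since the numbers within a pile decrease in the order in which they arrive, an increasing subsequence can take at most one number from each pile, so its length cannot exceed the number of piles. In this example, picking $3$, $5$, $6$, $7$ (one number from each pile) attains this bound.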

582: \section{Directed Polymers and Growth Models}

583:

584: The problem of directed polymers in random medium has been an active area

585: of research in statistical physics for the past three decades.

586: Apart from the fact that it is a simple `toy' model of disordered systems,

587: the directed polymer problem has important links to a wide variety

588: of other problems in physics, such as interface fluctuations and pinning~\cite{HH},

589: growing interface models of the Kardar-Parisi-Zhang (KPZ)

590: variety~\cite{KPZ}, the randomly forced Burgers equation in fluid dynamics~\cite{FNS},

591: spin glasses~\cite{DS1,Mezard,FH},

592: and also to a single-particle quantum mechanics problem in a time-dependent random

593: potential~\cite{Kardar}. There are many interesting issues associated

594: with the directed polymer problem, such as the phase-transition at a finite

595: temperature for $(d+1)$-dimensional directed polymers when $d>2$~\cite{IS1}, the nature

596: of the low temperature phase~\cite{Mezard,FH}, the nature of the transverse fluctuations~\cite{KZ,HH}

597: etc. The literature on the subject is huge (for a review see Ref. \cite{HZ}).

598:

599:

600: Here we will focus simply on a zero-temperature lattice version of the directed polymer

601: problem. This version can be stated as in Fig. \ref{dp}.

602: Consider a square lattice with $O$ denoting the origin.

603: On each site with coordinates $(i,j)$ of this lattice, there is a random energy

604: $\epsilon_{i,j}$, drawn

605: independently

606: from site to site, but from the identical distribution $\rho(\epsilon)$. For simplicity, we

607: will consider that $\epsilon_{i,j}$'s are all negative, i.e., $\rho(\epsilon)$ has support

608: only over $\epsilon\in (-\infty,0]$. The energy variables $\epsilon_{i,j}$ are quenched

609: random variables.

610: \begin{figure}[t]

611: \includegraphics[width=.7\hsize]{dp.eps}

612: \caption{Directed polymer in $(1+1)$ dimensions with random site energies.}

613: \label{dp}

614: \end{figure}

615:

616: We are interested here only in directed walks for simplicity.

617: Consider all possible directed walk configurations (a walk that can move only

618: north or eastward as shown in Fig. \ref{dp}) that start from the origin $O$ and end up

619: at a fixed point, say $P$ with co-ordinates $(x,y)$. An example of such a walk

620: is shown in Fig. \ref{dp}.

621: The total energy $E(W)$ of any given walk $W$ from $O$ to $P$ is just the sum of site energies along the path

622: $W$,

623: $E(W)= \sum_{i\in W} \epsilon_i$. Thus, for fixed $O$ and $P$ (the endpoints), the energy of a

624: path varies from one path to another (all having

625: the same endpoints $O$ and $P$). The path having the minimum energy (optimal path) among these will

626: correspond to the ground state configuration, i.e., the polymer will prefer to choose

627: this optimal path at zero temperature. Let $E_0(x,y)$ denote this minimum energy amongst

628: all directed paths that start at $O$ and finish at $P:(x,y)$. Now, this minimum energy

629: $E_0(x,y)$ is, of course, a random variable since it fluctuates from one configuration

630: of quenched disorder to another. One is interested in the statistics of $E_0(x,y)$ for

631: a given fixed $(x,y)$. For example, what is the probability distribution of $E_0(x,y)$

632: given that the $\epsilon_{i,j}$'s are independent and identically distributed random variables each

633: drawn from $\rho(\epsilon)$?

634:

635: Mathematically, one can write an `evolution' equation or recursion relation for the variable $E_0(x,y)$.

636: Indeed, the path that ends up at say $(x,y)$, must have visited either the site $(x-1,y)$

637: or the site $(x,y-1)$ at the previous step. Then clearly,

638: \begin{equation}

639: E_0(x,y) = {\rm min}\left[E_0(x-1,y), E_0(x,y-1)\right] + \epsilon_{x,y}

640: \label{dpr1}

641: \end{equation}

642: where $\epsilon_{x,y}$ denotes the random energy associated with the site $(x,y)$.

643: Alternately, we can define $H(x,y)=-E_0(x,y)$ which are all positive variables that

644: satisfy the recursion relation

645: \begin{equation}

646: H(x,y) = {\rm max}\left[H(x-1,y), H(x,y-1)\right] + \xi_{x,y}

647: \label{dpr2}

648: \end{equation}

649: where $\xi_{x,y}=-\epsilon_{x,y}$ are positive random variables. The recursion

650: relation in Eq. (\ref{dpr2}) is non-linear, and hence it is difficult to find the

651: distribution of $H(x,y)$, knowing the distribution of the $\xi_{x,y}$'s.

652: Note that, by interpreting $t=x+y$ as a time-like variable, and denoting

653: by $i$ the transverse coordinate at a fixed $t$, this recursion

654: relation can also be interpreted as a stochastic evolution equation,

655: \begin{equation}

656: H(i,t) = {\rm max}\left[H(i+1,t-1), H(i-1,t-1)\right] + \xi_{i,t}

657: \label{dpr3}

658: \end{equation}

659: where the site energy $\xi_{i,t}$ can now be interpreted as a stochastic noise.

660: In this interpretation, one can think of the directed polymer as a growth

661: model of a $(1+1)$ dimensional interface where $H(i,t)$ denotes the height of the interface

662: at the site $i$ of a one dimensional lattice at time $t$. Only, in this version, the

663: length of one dimensional lattice or the substrate keeps increasing linearly with

664: time $t$. In this respect, it corresponds to a special version of a polynuclear

665: growth model where growth occurs on top of a single droplet whose linear size

666: keeps increasing uniformly with time.

667: There are, of course, several other variations of this simple directed

668: polymer model~\cite{HZ}. For example, one can consider a version

669: where the random energies are associated with bonds, rather than the sites.

670: Similarly, one can consider a finite temperature version of the model.

671: In the corresponding analogy to the interface model, at finite temperature, the free energy

672: (as opposed to the ground state energy) of the polymer corresponds to the

673: height variable of the interface. This is most easily seen in the continuum formulation

674: of the model by writing down the partition function as a path integral

675: and then showing directly that $H=\ln Z$ satisfies the KPZ equation~\cite{HHF}.

676:
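As a small illustration of how the recursion in Eq. (\ref{dpr2}) works, consider the four sites with $0\le x,y\le 1$. The values of $\xi_{x,y}$ below are hypothetical and chosen by hand. I also assume that $H(0,0)=\xi_{0,0}$ and that, on the two axes, the single available predecessor is used in place of the maximum. With $\xi_{0,0}=1$, $\xi_{1,0}=3$, $\xi_{0,1}=2$ and $\xi_{1,1}=1$ one finds
\begin{eqnarray*}
H(0,0) &=& \xi_{0,0} = 1, \\
H(1,0) &=& H(0,0)+\xi_{1,0} = 1+3 = 4, \\
H(0,1) &=& H(0,0)+\xi_{0,1} = 1+2 = 3, \\
H(1,1) &=& {\rm max}\left[H(0,1), H(1,0)\right]+\xi_{1,1} = {\rm max}\left[3,4\right]+1 = 5 .
\end{eqnarray*}
The optimal path is therefore $(0,0)\to (1,0)\to (1,1)$, with ground state energy $E_0(1,1)=-H(1,1)=-5$, whereas the only other directed path, $(0,0)\to (0,1)\to (1,1)$, has energy $-(1+2+1)=-4$.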

677: A lot is known about the first and the second moment of $H(x,y)$ (or alternatively

678: of $H(i,t)$ in the height language)

679: and the associated universality properties~\cite{Mezard,FH,KMH}. For example, from simple

680: extensivity properties, one would expect that the average ground state energy

681: of the path will increase linearly with the size (number of steps $t$) of the path.

682: In terms of height, this means $\langle H(i,t)\rangle \to v(i) t$ for large $t$

683: where $v(i)$ is the velocity of the interface at site $i$ of the one dimensional

684: lattice~\cite{KH}. Also, the standard deviation of height,

685: say of $H(x,x)$ (along the diagonal),

686: is known to grow universally, for large $x$ as $x^{1/3}$~\cite{HZ}. For the interface, this means

687: that the typical height fluctuation grows as $t^{1/3}$ for large $t$, a result

688: that is known from the KPZ problem in $1$-dimension (via a mapping to the noisy

689: Burgers equation).

690: However, much less was known about the full distribution

691: of $H(x,y)$ until recently.

692:

693: Johansson~\cite{J1} was able to derive the full asymptotic distribution of $H(x,y)$

694: evolving via Eq. (\ref{dpr2}) for a specific disorder distribution, where the noise

695: $\xi_{x,y}$'s in Eq. (\ref{dpr2}) are i.i.d variables taking nonnegative integer

696: values according to the distribution: ${\rm Prob}(\xi_{x,y}=k)= (1-p)\, p^k$ for $k=0,1,2,\dots$,

697: where $0\le p<1$ is a fraction.

698: Interestingly, exactly the same recursion relation as in Eq. (\ref{dpr2}) and also

699: with the same disorder distribution as in Johansson's model

700: also appeared independently around the same time in an anisotropic directed percolation

701: problem studied by Rajesh and Dhar~\cite{RD}, a problem to which we will come back

702: later when we discuss the sequence matching problem. The authors in Ref.~\cite{RD} were able

703: to compute exactly the first moment, but Johansson computed the full asymptotic

704: distribution. He showed that for large $x$ and $y$~\cite{J1}

705: \begin{eqnarray}

706: H(x,y) &\to& \frac{2\sqrt{pxy}+p(x+y)}{q} \nonumber \\

707:        &+&   \frac{(pxy)^{1/6}}{q}\,\left[(1+p)+\sqrt{\frac{p}{xy}}\,(x+y)\right]^{2/3}

708:        \, \chi

709: \label{j1}

710: \end{eqnarray}

711: where $q=1-p$, $\chi$ is a random variable with the Tracy-Widom distribution, ${\rm Prob}(\chi\le x)=F_2(x)$

712: as in Eq. (\ref{gue}). If one sets $x=y=t/2$, then for the growing droplet interpretation, it would

713: mean that the height $H(i=0,t)$ has a mean that grows linearly with $t$ and a standard deviation

714: that grows as $t^{1/3}$ and when properly centered and scaled, the distribution of $H(0,t)$

715: tends to the GUE Tracy-Widom distribution. Around the same time, Pr\"ahofer and Spohn derived

716: a similar result for a class of PNG models~\cite{PS}. Moreover, they were able to show that not just the

717: $F_2(x)$,

718: but other Tracy-Widom distributions such as the $F_1(x)$ (corresponding to the GOE ensemble)

719: also arise in the PNG model when one starts from different initial conditions~\cite{PS}.

720:
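The diagonal case can be worked out explicitly from Eq. (\ref{j1}); the following is simple algebra on that formula, included as an illustration. Setting $x=y=t/2$ gives $\sqrt{pxy}=\sqrt{p}\,t/2$ and $(pxy)^{1/6}=p^{1/6}\,(t/2)^{1/3}$, so that, using $q=1-p=(1-\sqrt{p})(1+\sqrt{p})$,
\begin{eqnarray*}
\frac{2\sqrt{pxy}+p(x+y)}{q} &=& \frac{(\sqrt{p}+p)\,t}{1-p} = \frac{\sqrt{p}}{1-\sqrt{p}}\; t, \\
(1+p)+\sqrt{\frac{p}{xy}}\,(x+y) &=& 1+p+2\sqrt{p} = (1+\sqrt{p})^2 .
\end{eqnarray*}
Hence
\[
H(t/2,t/2) \to \frac{\sqrt{p}}{1-\sqrt{p}}\; t
+ \frac{p^{1/6}\,(1+\sqrt{p})^{1/3}}{1-\sqrt{p}}\, \left(\frac{t}{2}\right)^{1/3} \chi ,
\]
which makes the linear growth of the mean and the $t^{1/3}$ growth of the fluctuations explicit.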

721: \subsection{Exact Height Distribution in a Ballistic Deposition Model}

722:

723: In this subsection, we will show explicitly how one can derive the exact height distribution

724: in a specific $(1+1)$ dimensional growth model and show that it has a limiting Tracy-Widom

725: distribution. This example will illustrate explicitly how one maps a growth model

726: to the LIS problem~\cite{BD}. A similar mapping was used by Pr\"ahofer and Spohn

727: for the PNG model~\cite{PS}. But before we illustrate the mapping, it is useful

728: to remark on (i) why one studies such growth models and (ii) what this mapping

729: and the subsequent calculation of the height distribution achieve.

730:

731: The answers to these two questions are as follows. We know that growth processes are

732: ubiquitous in nature. The past few decades have seen

733: extensive research on a wide variety of both discrete and continuous growth models

734: \cite{Meakin,KS,HZ}. A large class of these growth models in $(1+1)$ dimensions

735: such as the Eden model

736: \cite{Eden}, restricted solid on solid (RSOS) models \cite{RSOS}, directed

737: polymers as mentioned before~\cite{HZ}, polynuclear growth models (PNG) \cite{PNG} and ballistic

738: deposition models (BD)~\cite{BaD} are believed to belong to the same

739: universality class as that of the Kardar-Parisi-Zhang (KPZ) equation describing the

740: growth of interface fluctuations \cite{KPZ}. This universality is, however,

741: somewhat restricted in the sense that it refers only to the width or the second

742: moment of the height fluctuations characterized by two independent exponents

743: (the growth exponent $\beta$ and the dynamical exponent $z$) and the associated

744: scaling function. Moreover, even this restricted universality is established

745: mostly numerically. Only in very few special discrete models in $(1+1)$ dimensions, the

746: exponents $\beta=1/3$ and $z=3/2$ can be computed exactly via the Bethe ansatz

747: technique \cite{Bethe}. A natural and important question is whether this

748: universality can be extended beyond the second moment of height fluctuations.

749: For example, is the full distribution of the height fluctuations (suitably

750: scaled) universal, i.e. is the same for different growth models belonging to

751: the KPZ class? Moreover, the KPZ-type equations are usually attributed to

752: models with small gradients in the height profile and the question whether the

753: models with large gradients (such as the BD models) belong to the KPZ universality class is still

754: open. The connection between the discrete BD models and the continuum KPZ equation

755: has recently been elucidated \cite{KS1}.

756:

757: To check whether this more stringent form of universality (going beyond the second moment, to the full

758: distribution) holds,

759: one needs to calculate the full height distribution in different models which are known

760: to belong to the KPZ universality class as far as only the second moment is concerned.

761: In fact, as mentioned earlier, Pr\"ahofer and Spohn were able to calculate the asymptotic height

762: distribution in a class of PNG models and showed that it has the Tracy-Widom distribution~\cite{PS}.

763: Similarly, we mentioned earlier that Johansson~\cite{J1} established rigorously that

764: the height distribution,

765: in a specific version of the directed polymer model, is of the Tracy-Widom form.

766: Subsequently, there have been several other works~\cite{GTW}, including one on the ballistic deposition

767: model~\cite{BD} that we will discuss below, which showed that indeed

768: all these $(1+1)$ dimensional growth models share the same common scaled height distribution

769: (Tracy-Widom), thus putting the universality on a much stronger footing going beyond just the

770: second moment.

771:

772: We now focus on a specific ballistic deposition model. Ballistic deposition models typically

773: try to mimic columnar growth that occurs in many natural systems and have been studied

774: extensively in the past with a variety of microscopic rules~\cite{Krug2,BaD}, though an exact calculation

775: of the height distribution remained elusive in any of these microscopic models. In collaboration

776: with S. Nechaev, we found a particular ballistic deposition model which can be explicitly mapped

777: to the LIS problem and hence the full asymptotic height distribution can be computed

778: exactly~\cite{BD}.

779: In our $(1+1)$-D (here $D$ stands for `dimensional') BD model columnar growth occurs sequentially on a linear

780: substrate

781: consisting of $L$ columns with free boundary conditions. The time $t$ is

782: discrete and is increased by $1$ with every deposition event. We first consider

783: the flat initial condition, i.e., an empty substrate at $t=0$. Other initial

784: conditions will be treated later. At any stage of the growth, a column (say the

785: $k$-th column) is chosen at random with probability $p=\frac{1}{L}$ and a

786: ``brick'' is deposited there which increases the height of this column by one

787: unit, $H_k\to H_k+1$. Once this ``brick'' is deposited, it screens all the sites

788: at the same level in all the columns to its right from future deposition, i.e.

789: the heights at all the columns to the right of the $k$-th column must be

790: greater than or equal to $H_k+1$ at all subsequent times. For example,

791: in Fig. \ref{fig:1}, the first brick (denoted by 1) gets deposited at $t=1$ in

792: the 4-th column and it immediately screens all the sites to its right. Then the

793: second brick (denoted by 2) gets deposited at $t=2$ again in the same 4-th

794: column whose height now becomes 2 and thus the heights of all the columns to

795: the right of the 4-th column must be $\ge 2$ at all subsequent times and so on.

796: Formally such growth is implemented by the following update rule. If the $k$-th site

797: is chosen at time $t$ for deposition, then

798: \begin{equation}

799: H_k(t+1)={\rm max}\{H_k(t), H_{k-1}(t), \dots, H_1(t)\}+1.

800: \label{update1}

801: \end{equation}

802: The model is anisotropic and evidently even the average height profile $\langle

803: H_k(t) \rangle$ depends nontrivially on both the column number $k$ and time

804: $t$. Our goal is to compute the asymptotic height distribution $P_k(H,t)$ for

805: large $t$.

806: \begin{figure}

807: %\centerline{\epsfig{file=bdm.eps,width=5cm}}

808: \includegraphics[width=.7\hsize]{bdm.eps}

809: \caption{Growth of a heap with asymmetric long-range interaction. The numbers

810: inside cells show the times at which the blocks are added to the heap.}

811: \label{fig:1}

812: \end{figure}

813:
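To see the update rule in Eq. (\ref{update1}) at work, consider a hypothetical deposition history (chosen here for illustration, not the one shown in Fig. \ref{fig:1}) on a substrate with $L=4$ columns, starting from the empty substrate $(H_1,H_2,H_3,H_4)=(0,0,0,0)$. Suppose that columns $k=2$, $k=1$ and $k=3$ are chosen at times $t=0$, $1$ and $2$ respectively. Then
\begin{eqnarray*}
H_2(1) &=& {\rm max}\{H_2(0), H_1(0)\}+1 = 1, \qquad (H_1,H_2,H_3,H_4)=(0,1,0,0), \\
H_1(2) &=& H_1(1)+1 = 1, \qquad (H_1,H_2,H_3,H_4)=(1,1,0,0), \\
H_3(3) &=& {\rm max}\{H_3(2), H_2(2), H_1(2)\}+1 = 2, \qquad (H_1,H_2,H_3,H_4)=(1,1,2,0) .
\end{eqnarray*}
The third brick lands at height $2$ even though column $3$ was empty, because level $1$ of that column was already screened by the bricks to its left. Column $4$ stays at height $0$ until it is chosen, at which point the same rule would give ${\rm max}\{0,2,1,1\}+1=3$.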

814: It is easy to find the height distribution $P_1(H, t)$ of the first column,

815: since the height there does not depend on any other column. At any stage, the

816: height in the first column either increases by one unit with probability

817: $p=\frac{1}{L}$ (if this column is selected for deposit) or stays the same with

818: probability $1-p$. Thus $P_1(H,t)$ is simply the binomial distribution,

819: $P_1(H,t)={t\choose H}p^H(1-p)^{t-H}$ with $H\leq t$. The average height of the

820: first column thus increases as $\langle H_1(t)\rangle=pt$ for all $t$ and its

821: variance is given by $\sigma_1^2(t)= tp(1-p)$. While the first column is thus

822: trivial, the dynamics of heights in other columns is nontrivial due to the

823: right-handed infinite range interactions between the columns. For

824: convenience, we subsequently measure the height of any other column with respect to the

825: first one. Namely, by height $h_k(t)$ we mean the height difference between the

826: $(k+1)$-th column and the first one, $h_k(t)=H_{k+1}(t)-H_1(t)$, so that

827: $h_0(t)=0$ for all $t$.

828:
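For completeness, the binomial form of $P_1(H,t)$ can be checked directly. The recursion written below is a short verification added here, and it follows from the deposition rule for the first column stated above. Since the first column grows by one unit with probability $p$ and stays put with probability $1-p$ at each step,
\[
P_1(H,t+1) = p\, P_1(H-1,t) + (1-p)\, P_1(H,t), \qquad P_1(H,0)=\delta_{H,0} .
\]
Substituting $P_1(H,t)={t\choose H}p^H(1-p)^{t-H}$ on the right hand side gives
\[
p^H (1-p)^{t+1-H}\left[{t\choose H-1}+{t\choose H}\right] = {t+1\choose H}\, p^H (1-p)^{t+1-H},
\]
by Pascal's rule, which is indeed the binomial distribution at time $t+1$.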

829: To make progress for columns $k>0$, we first consider a

830: (2+1)-D construction of the heap as shown in Fig. \ref{fig:2}, by adding an extra

831: dimension indicating the time $t$. In Fig. \ref{fig:2}, the $x$ axis denotes the

832: column number, the $y$ axis stands for the time $t$ and the $z$ axis is the

833: height $h$. In this figure, every time a new block is added, it ``wets'' all the

834: sites at the same level to its ``east'' (along the $x$ axis) and to its ``north''

835: (along the time axis). Here ``wetting'' means ``screening'' from

836: further deposition at those sites at the same level. This $(2+1)$-D system of

837: ``terraces'' is in one-to-one correspondence with the $(1+1)$-D heap in

838: Fig. \ref{fig:1}. This construction is reminiscent of the 3D anisotropic

839: directed percolation (ADP) problem studied by Rajesh and Dhar \cite{RD}. Note however,

840: that unlike the ADP problem, in our case each row labelled by $t$ can contain

841: only one deposition event.

842: \begin{figure}

843: %\centerline{\epsfig{file=d3.eps,width=8cm}}

844: \includegraphics[width=.7\hsize]{d3.eps}

845: \caption{$(2+1)$ dimensional ``terraces'' corresponding to the growth of a heap

846: in Fig. \ref{fig:1}.}

847: \label{fig:2}

848: \end{figure}

849:

850: The next step is to consider the projection onto the 2D $(x,y)$-plane of the

851: level lines separating  the adjacent terraces whose heights differ by $1$. In

852: this projection, some of the level lines may overlap partially on the plane.

853: To avoid the overlap for better visual purposes, we make a shift

854: $(x,y)\to (x+h(x,y),y)$ and represent these shifted directed lines on the 2D

855: plane in Fig. \ref{fig:3}.

856: The black dots in Fig. \ref{fig:3} denote the points

857: where the deposition events took place and the integer next to a dot denotes

858: the time of this event. Note that each row in Fig. \ref{fig:3} contains a single

859: black dot, i.e., only one deposition per unit of time can occur. In

860: Fig. \ref{fig:3}, there are 8 such events whose deposition times form the

861: sequence $\{1,2,3,4,5,6,7,8\}$ of length $N=8$. Now let us read the deposition times of the

862: dots sequentially, but now column by column and vertically from top to bottom

863: in each column, starting from the leftmost one. Then this sequence reads

864: $\{8,3,5,1,2,6,4,7\}$ which is just a permutation of the original sequence

865: $\{1,2,3,4,5,6,7,8\}$. In the permuted sequence $\{8,3,5,1,2,6,4,7\}$ there are

866: $3$ LIS's: $\{3,5,6,7\}$, $\{1,2,6,7\}$ and $\{1,2,4,7\}$, all of the same

867: length $l_N=4$. As mentioned before (see Fig. \ref{psheap}), this is precisely

868: the number of piles in the patience sorting of the permutation

869: $\{8,3,5,1,2,6,4,7\}$.

870:

871: \begin{figure}

872: %\centerline{\epsfig{file=permu.eps,width=5cm}}

873: \includegraphics[width=.7\hsize]{permu.eps}

874: \caption{The directed lines are the level lines separating adjacent terraces

875: with height difference $1$ in Fig. \ref{fig:2}, projected onto the $(x,y)$ plane and

876: shifted by $(x,y)\to (x+h(x,y),y)$ to avoid partial overlap. The black dots

877: denote the deposition events. The numbers next to the dots denote the times of

878: those deposition events.}

879: \label{fig:3}

880: \end{figure}

881:

882: Let us note one immediate fact from Fig. \ref{fig:3}. The numbers

883: belonging to the different level lines in Fig. \ref{fig:3} are in one-to-one

884: correspondence with the piles $[\{8,3,1\}, \{5,2\}, \{6,4\},\{7\}]$ in

885: the Aldous--Diaconis patience sorting game. Hence, each pile can be identified with

886: a unique level line. Now, the height $h(x,t)$ at any given point $(x,t)$ in

887: Fig. \ref{fig:3} is equal to the number of level lines inside the rectangle

888: bounded by the corners: $[0,0], [x,0], [0,t], [x,t]$. Thus, we have

889: the correspondence: height $\equiv$ number of level lines $\equiv$ number of piles $\equiv$

890: length $l_n$ of the LIS. However, to compute $l_n$, we need to know the value of $n$ which

891: is precisely the number of black dots inside this rectangle.

892:

893: Once the problem is reduced to finding the number of black dots or deposition events, we

894: no longer need Fig. \ref{fig:3} (which may be confusing because of the visual shift

895: $(x,y)\to (x+h(x,y),y)$) and can go back to Fig. \ref{fig:2}, where the

896: north-to-east corners play the same role as the black dots in Fig. \ref{fig:3}.

897: In Fig. \ref{fig:2}, to determine the height $h_k(t)$ of the $k$-th column at

898: time $t$, we need to know the number of deposition events inside the $2$D plane

899: rectangle $R_{k,t}$ bounded by the four corners $[0,0], [k,0], [0,t], [k,t]$.

900: Let us begin with the last column $k=L$. For $k=L$ the number of deposition

901: events $N$ in the rectangle $R_{L,t}$ is equal to the time $t$ because there is

902: only one deposition event per time. In our example $N=t=8$. For a general $k<L$

903: the number of deposition events $N$ inside the rectangle $R_{k,t}$ is a random

904: variable, since some of the rows inside the rectangle may not contain a

905: north-to-east corner or a deposition event. The probability distribution

906: $P_{k,t}(N)$ (for a given $[k,t]$) of this random variable can, however, be

907: easily found as follows. At each step of deposition, a column is chosen at

908: random from any of the $L$ columns. Thus, the probability that a north-to-east

909: corner will fall on the segment of line $[0,k]$ (where $k\leq L$) is equal to

910: $k/L$. The deposition events are completely independent of each other,

911: indicating the absence of correlations between different rows labelled by $t$ in

912: Fig. \ref{fig:2}. So, we are asking the question: given $t$ rows, what is the

913: probability that $N$ of them will contain a north-to-east corner? This is

914: simply given by the binomial distribution

915: \begin{equation}

916: P_{k,t}(N) = {t\choose N } \left({\frac {k}{L}} \right)^N

917: \left(1-{\frac {k}{L}}\right)^{t-N},

918: \label{binom1}

919: \end{equation}

920: where $N\leq t$. Now we are reduced to the following problem: given a sequence

921: of integers of length $N$ (where $N$ itself is random and is taken from the

922: distribution in Eq.(\ref{binom1})), what is the length of the LIS? Recall that

923: this length is precisely the height $h_k(t)$ of the $k$-th column at time $t$

924: in our model. In the thermodynamic limit $L\to \infty$ for $t\gg 1$ and any

925: fixed $k$ such that the quotient $\lambda=\frac{tk}{L}$ remains fixed but is

926: arbitrary, the distribution in Eq.(\ref{binom1}) becomes a Poisson distribution

927: $P(N)\to e^{-\lambda} \frac {\lambda^N}{N!}$, with the mean

928: $\lambda=\frac{tk}{L}$. We can then directly use the BDJ result in

929: Eq.(\ref{bdj1}) to predict our main result for the height in the BD model,

930: \begin{equation}

931: h_k(t) \to 2\sqrt{\frac{tk}{L}} + \left(\frac{tk}{L}\right)^{1/6} \chi,

932: \label{result1}

933: \end{equation}

934: for large $\lambda=tk/L$, where the random variable $\chi$ has the

935: Tracy-Widom distribution $F_2(\chi)$ as in Eq. (\ref{gue}).

936: Using the known exact value $\langle \chi\rangle

937: =-1.7711...$ from the Tracy-Widom distribution \cite{TW1}, we find exactly the

938: asymptotic average height profile in the BD model,

939: \begin{equation}

940: \langle h_k(t)\rangle \to 2\sqrt{\frac{tk}{L}}-

941: 1.7711...\left(\frac{tk}{L}\right)^{1/6}.

942: \label{avgh}

943: \end{equation}

944: The leading square root dependence of the profile on the column number $k$ has

945: been seen numerically. Eq. (\ref{avgh}) also predicts an

946: exact sub-leading term with $k^{1/6}$ dependence. Similarly, for the variance,

947: $\sigma_k^2(t)=\langle [h_k(t)-\langle h_k(t)\rangle]^2 \rangle$, we find

948: asymptotically: $\sigma_k^2(t)\to c_0\left(\frac{tk}{L}\right)^{1/3}$, where

949: $c_0=\langle [\chi-\langle \chi \rangle]^2\rangle=0.8132...$ \cite{TW1}.

950: Eliminating the $t$ dependence for large $t$ between the average and the

951: variance, we get, $\sigma_k^2(t)\approx a {\langle h_k(t)\rangle}^{2\beta}$

952: where the constant $a=c_0/2^{2/3}=0.51228\dots$ and $\beta=1/3$, thus

953: recovering the KPZ scaling exponent.

954: In addition to the BD model with infinite range right-handed

955: interaction reported here,

956: we have also analyzed the model (analytically within a mean field theory and numerically)

957: when the right-handed interaction is short ranged.

958: Somewhat surprisingly and pleasantly, we found that

959: the asymptotic average height profile is independent of the range of interaction.

960: A recent analysis of the short range BD model sheds light on this fact~\cite{KNV}.

961:
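The elimination of $t$ that leads to the relation between the variance and the mean can be spelled out as follows. This is a short derivation added for clarity, keeping only the leading term of Eq. (\ref{avgh}). Writing $\lambda=tk/L$, one has to leading order
\begin{eqnarray*}
\langle h_k(t)\rangle &\simeq& 2\,\lambda^{1/2} \quad \Longrightarrow \quad \lambda \simeq \frac{{\langle h_k(t)\rangle}^{2}}{4}, \\
\sigma_k^2(t) &\simeq& c_0\, \lambda^{1/3} \simeq \frac{c_0}{2^{2/3}}\, {\langle h_k(t)\rangle}^{2/3},
\end{eqnarray*}
so that $a=c_0/2^{2/3}$ and $2\beta=2/3$. As a rough numerical illustration with an arbitrarily chosen value $\lambda=10^4$, Eq. (\ref{avgh}) gives $\langle h_k(t)\rangle\approx 200-1.7711\times 4.64\approx 191.8$, while $\sigma_k^2(t)\approx 0.8132\times 21.54\approx 17.5$, i.e., a standard deviation of about $4.2$.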

962: So far, we have demonstrated that for a flat initial condition, the height fluctuations in the

963: BD model follow the Tracy-Widom distribution $F_{\rm GUE}(x)$ which corresponds to

964: the distribution of the largest eigenvalue of a random matrix drawn from a Gaussian unitary ensemble.

965: In the context of the PNG model, Pr\"ahofer and Spohn \cite{PS} have shown that while the height

966: fluctuations of a single PNG droplet follow the distribution $F_{\rm GUE}(x)$, it is possible to

967: obtain other types of universal distributions as well. For example, the height fluctuations

968: in the PNG model growing over a flat substrate follow the

969: distribution $F_{\rm GOE}(x)$ where $F_{\rm GOE}(x)$ is the distribution of the largest

970: eigenvalue of a random matrix drawn from the Gaussian orthogonal ensemble. Besides,

971: in a PNG droplet with two external sources at its edges which nucleate with rates

972: $\rho_{+}$ and $\rho_{-}$, the height fluctuations have different distributions depending

973: on the values of $\rho_{+}$ and $\rho_{-}$. For $\rho_{+}<1$ and $\rho_{-}<1$, one gets back

974: the distribution $F_{\rm GUE}(x)$. If however $\rho_{+}=1$ and $\rho_{-}<1$ (or alternatively

975: $\rho_{-}=1$ and $\rho_{+}<1$), one gets the distribution $F_{\rm GOE}^2(x)$ which corresponds to

976: the distribution of the largest of the superimposed eigenvalues of two independent

977: GOE matrices. In the critical case $\rho_{+}=1$ and $\rho_{-}=1$, one gets a new

978: distribution $F_0(x)$ which does not have any random matrix analogy. For $\rho_{+}>1$

979: and $\rho_{-}>1$, one gets a Gaussian distribution. These results for the PNG model were obtained in

980: Ref. \cite{PS} using a powerful theorem of Baik and Rains \cite{BR1}.

981:

982: The question naturally arises as to whether these other distributions, apart from the $F_{\rm GUE}(x)$,

983: can also appear in the BD model considered in this paper. Indeed, they do. For example, if

984: one starts with a staircase initial condition $h_k(0)=k$ for the heights in the BD model,

985: one gets the distribution $F_{\rm GOE}^2(x)$ for the scaled variable $\chi$. This follows from the

986: fact that for the staircase initial condition, in Fig. 2 there will be a black dot (or a north-to-east

987: corner) at every value of $k$ on the $k$ axis at $t=0$. Thus the black dots appear on the $k$ axis

988: with unit density. This

989: corresponds to the case $\rho_{+}=1$

990: and $\rho_{-}=0$ of the general results of Baik and Rains which leads to a $F_{\rm GOE}^2(x)$

991: distribution. Of course, the density $\rho_{+}$ can be tuned between $0$ and $1$, by tuning

992: the average slope of the staircase. For a generic $0<\rho_{+}\leq 1$, one can also

993: vary $\rho_{-}$ by putting an external source at the first column.

994: Thus one can obtain, in principle, most of the distributions discussed in Ref. \cite{BR1} by varying

995: $\rho_{+}$ and $\rho_{-}$.

996: Note that the

997: case $\rho_{-}=1$ (external source which drops one particle at the first column at every time step) and

998: $\rho_{+}=0$ (flat substrate) is, however, trivial since the surface then remains flat

999: at all times and the height just increases by one unit at every time step. The distribution

1000: $F_{\rm GOE}(x)$ is, however, not naturally accessible within the rules of our model.

1001:

1002: \section{Sequence Matching Problem}

1003:

1004: In this section, I will discuss a different problem namely that of the alignment of two

1005: random sequences and will illustrate how the Tracy-Widom distribution appears in this

1006: problem. This is based on a joint work with S. Nechaev~\cite{MN}.

1007:

1008: Sequence alignment is one of the most useful quantitative methods used in

1009: evolutionary molecular biology\cite{W1,Gusfield,DEKM}. The goal of an alignment

1010: algorithm is to search for similarities in patterns in different sequences. A

1011: classic and much studied alignment problem is the so called `longest common

1012: subsequence' (LCS) problem. The input to this problem is a pair of sequences

1013: $\alpha=\{\alpha_1, \alpha_2,\dots, \alpha_i\}$ (of length $i$) and

1014: $\beta=\{\beta_1, \beta_2,\dots, \beta_j\}$ (of length $j$). For example, $\alpha$

1015: and $\beta$ can be two random sequences of the $4$ base pairs $A$, $C$, $G$, $T$ of

1016: a DNA molecule, e.g., $\alpha=\{A, C, G, C, T, A, C\}$ and $\beta=\{C, T, G, A,

1017: C\}$. A subsequence of $\alpha$ is an ordered sublist of $\alpha$ (entries of which

1018: need not be consecutive in $\alpha$), e.g., $\{C, G, T, C\}$, but not $\{T, G, C\}$.

1019: A common subsequence of two sequences $\alpha$ and $\beta$ is a subsequence of both

1020: of them. For example, the subsequence $\{C, G, A, C\}$ is a common subsequence of

1021: both $\alpha$ and $\beta$. There can be many possible common subsequences of a pair

1022: of sequences. For example, another common subsequence of $\alpha$ and $\beta$ is

1023: $\{A, C\}$. One simple way to construct different common subsequences (for two

1024: fixed sequences $\alpha$ and $\beta$) is by drawing lines from one member

1025: of the set $\alpha$ to another member of the set $\beta$ such that the lines

1026: do not cross. For example, the common subsequence $\{C, G, A, C\}$ is shown

1027: by solid lines in Fig. \ref{matching}. On the other hand the common subsequence

1028: $\{A,C\}$ is shown by the dashed lines in Fig. \ref{matching}.

1029: \begin{figure}

1030: \includegraphics[width=.7\hsize]{matching.eps}

1031: \caption{ Two fixed sequences $\alpha: \{A, C, G, C, T, A, C\}$

1032: and $\beta: \{C, T, G, A, C\}$. The solid lines show the common

1033: subsequence $\{C, G, A, C\}$ and the dashed lines denote another

1034: common subsequence $\{A,C\}$.}

1035: \label{matching}

1036: \end{figure}

1037: The aim of the LCS problem is to find the longest of such common

1038: subsequences between two fixed sequences $\alpha$ and $\beta$.
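As a concrete illustration of the non-crossing rule (the index pairs below are worked out here from the two example sequences and are not read off the original figure), the common subsequence $\{C, G, A, C\}$ corresponds to joining the positions
\[
(2,1),\quad (3,3),\quad (6,4),\quad (7,5),
\]
where each pair $(a,b)$ joins $\alpha_a$ to $\beta_b$ with $\alpha_a=\beta_b$. Both the first and the second entries increase strictly along the list, which is precisely the condition that no two lines cross. Since $\beta$ itself is not a subsequence of $\alpha$ (no $G$ follows the only $T$ in $\alpha$), no common subsequence can have length $5$, so this subsequence of length $4$ is a longest one.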

1039:

1040: This problem and its variants have been widely studied in

1041: biology\cite{NW,SW,WGA,AGMML}, computer science\cite{SK,AG,WF,Gusfield}, probability

1042: theory\cite{CS,Deken,Steele,DP,Alex,KLM} and more recently in statistical

1043: physics\cite{ZM,Hwa,Monvel}. A particularly important application of the LCS problem

1044: is to quantify the closeness between two DNA sequences. In evolutionary biology, the

1045: genes responsible for building specific proteins evolve with time and by finding the

1046: LCS of the same gene in different species, one can learn what has been conserved in

1047: time. Also, when a new DNA molecule is sequenced {\it in vitro}, it is important to

1048: know whether it is really new or it already exists. This is achieved quantitatively

1049: by measuring the LCS of the new molecule with another existing already in the

1050: database.

1051:

1052: For a pair of fixed sequences of length $i$ and $j$ respectively, the length

1053: $L_{i,j}$ of their LCS is just a number. However, in the stochastic version of the

1054: LCS problem one compares two random sequences drawn from $c$ alphabets and hence the

1055: length $L_{i,j}$ is a random variable. A major challenge over the last three decades

1056: has been to determine the statistics of $L_{i,j}$\cite{CS,Deken,Steele,DP,Alex}. For

1057: equally long sequences ($i=j=n$), it has been proved that $\langle L_{n,n}\rangle

1058: \approx \gamma_c n$ for $n\gg 1$, where the averaging is performed over all

1059: realizations of the random sequences. The constant $\gamma_c$ is known as the

1060: Chv\'atal-Sankoff constant which, to date, remains undetermined though there exist

1061: several bounds\cite{Deken,DP,Alex}, a conjecture due to Steele\cite{Steele} that

1062: $\gamma_c=2/(1+\sqrt{c})$ and a recent proof\cite{KLM} that $\gamma_c\to 2/\sqrt{c}$

1063: as $c\to \infty$. Unfortunately, no exact results are available for the finite size

1064: corrections to the leading behavior of the average $\langle L_{n,n}\rangle$, for the

1065: variance, and also for the full probability distribution of $L_{n,n}$. Thus, despite

1066: tremendous analytical and numerical efforts, exact solution of the random LCS

1067: problem has, so far, remained elusive. Therefore it is important to find other

1068: variants of this LCS problem that may be analytically tractable.

1069:

1070: Computationally, the easiest way to determine the length $L_{i,j}$ of the LCS of two

1071: arbitrary sequences of lengths $i$ and $j$ (in polynomial time $\sim O(ij)$) is by

1072: using the recursive algorithm\cite{Gusfield,Monvel}

1073: \begin{equation}

1074: L_{i,j} = \max\left[L_{i-1,j}, L_{i,j-1}, L_{i-1,j-1} + \eta_{i,j}\right],

1075: \label{recur1}

1076: \end{equation}

1077: subject to the initial conditions $L_{i,0}=L_{0,j}=L_{0,0}=0$. The variable

1078: $\eta_{i,j}$ is either 1 when the characters at the positions $i$ (in the sequence

1079: $\alpha$) and $j$ (in the sequence $\beta$) match each other, or 0 if they do not.

1080: Note that the variables $\eta_{i,j}$'s are not independent of each other. To see

1081: this consider the simple example -- matching of two strings $\alpha={\rm AB}$ and

1082: $\beta={\rm AA}$. One has by definition: $\eta_{1,1}=\eta_{1,2}=1$ and

1083: $\eta_{2,1}=0$. The knowledge of these three variables is sufficient to predict that

1084: the last two letters will not match, i.e., $\eta_{2,2}=0$. Thus, $\eta_{2,2}$ can

1085: not take its value independently of $\eta_{1,1},\,\eta_{1,2},\,\eta_{2,1}$. These

1086: residual correlations between the $\eta_{i,j}$ variables make the LCS problem rather

1087: complicated. Note however that for two random sequences drawn from $c$ alphabets,

1088: these correlations between the $\eta_{i,j}$ variables vanish in the $c\to \infty$

1089: limit.
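To see the recursion at work, one can fill in the table of $L_{i,j}$ for the two example sequences $\alpha=\{A,C,G,C,T,A,C\}$ and $\beta=\{C,T,G,A,C\}$ used above. The table below was worked out by hand here purely as an illustration; rows are labelled by the letters $\alpha_i$ ($i=1,\dots,7$), columns by the letters $\beta_j$ ($j=1,\dots,5$), and the boundary row and column $L_{0,j}=L_{i,0}=0$ are omitted.
\[
\begin{array}{c|ccccc}
 & C & T & G & A & C \\ \hline
A & 0 & 0 & 0 & 1 & 1 \\
C & 1 & 1 & 1 & 1 & 2 \\
G & 1 & 1 & 2 & 2 & 2 \\
C & 1 & 1 & 2 & 2 & 3 \\
T & 1 & 2 & 2 & 2 & 3 \\
A & 1 & 2 & 2 & 3 & 3 \\
C & 1 & 2 & 2 & 3 & 4
\end{array}
\]
The last entry gives $L_{7,5}=4$, in agreement with the common subsequence $\{C,G,A,C\}$ found earlier. Each entry requires only its three neighbours to the left, above, and diagonally above-left, which is the origin of the $O(ij)$ running time.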

1090:

1091: A natural question is how important are these correlations between the $\eta_{i,j}$ variables, e.g.,

1092: do they affect the asymptotic statistics of $L_{i,j}$'s for large $i$ and $j$?

1093: Is the problem solvable if one ignores these correlations?

1094: These questions naturally lead to the Bernoulli matching (BM) model which is a simpler variant of

1095: the original LCS problem where one ignores the correlations between $\eta_{i,j}$'s for all

1096: $c$\cite{Monvel}.

1097: The length $L_{i,j}^{BM}$ of the BM model satisfies the same

1098: recursion relation in Eq. (\ref{recur1}) except that $\eta_{i,j}$'s are now

1099: independent and each drawn from the bimodal distribution: $p(\eta)=

1100: (1/c)\delta_{\eta,1}+ (1-1/c)\delta_{\eta,0}$.

1101: This approximation is expected to be exact only in the appropriately taken

1102: $c\to \infty$ limit. Nevertheless, for finite $c$, the results on the BM model can serve

1103: as a useful benchmark for the original LCS model to decide if indeed the correlations

1104: between $\eta_{i,j}$'s are important or not. Unfortunately, even in the absence of

1105: correlations, the exact asymptotic distribution of $L_{i,j}^{BM}$ in the BM model has so far

1106: remained elusive, mainly due to the nonlinear nature of the recursion relation

1107: in Eq. (\ref{recur1}).

1108: The purpose of this section is to present an exact asymptotic formula for the

1109: distribution of the length $L_{n,n}^{BM}$ in the BM model for all $c$.

1110: So far, only the leading asymptotic behavior of the

1111: average length in the BM model is known\cite{Monvel} using the `cavity'

1112: method of spin glass physics\cite{MPV},

1113: \begin{equation}

1114: \langle L_{n,n}^{BM}\rangle  \approx \gamma_c^{BM} n

1115: \label{bm1}

1116: \end{equation}

1117: where $\gamma_c^{BM}= 2/(1+\sqrt{c})$, same as the conjectured value of the

1118: Chv\'atal-Sankoff constant $\gamma_c$ for the original LCS model. However, other

1119: properties such as the variance or the distribution of $L_{n,n}^{BM}$ remained

1120: intractable even in the BM model.

1121: We have shown~\cite{MN}, as illustrated below, that for large $n$,

1122: \begin{equation}

1123: L_{n,n}^{BM}\to \gamma_c^{BM} n + f(c)\, n^{1/3}\, \chi

1124: \label{asymp11}

1125: \end{equation}

1126: where $\chi$ is a random variable with an $n$-independent distribution, ${\rm Prob}

1127: (\chi\le x)= F_{ 2}(x)$ which is precisely the Tracy-Widom distribution

1128: in Eq. (\ref{gue}).

1129: Indeed, we were also able to compute the functional form of the scale factor $f(c)$ exactly for all

1130: $c$~\cite{MN},

1131: \begin{equation}

1132: f(c)=\frac{c^{1/6}(\sqrt{c}-1)^{1/3}}{\sqrt{c}+1}.

1133: \label{fc1}

1134: \end{equation}

1135: This allows us to calculate the average including the subleading finite size

1136: correction term and the variance of $L_{n,n}^{BM}$ for large $n$,

1137: \begin{eqnarray}

1138: \langle L_{n,n}^{BM}\rangle &\approx & \gamma_c^{BM} n + \left<\chi\right> f(c)

1139: n^{1/3} \nonumber \\

1140: {\rm Var}\, L_{n,n}^{BM} &\approx &

1141: \left(\langle\chi^2\rangle-{\langle\chi\rangle}^2\right)\, f^2(c)\, n^{2/3},

1142: \label{eq:expvar}

1143: \end{eqnarray}

1144: where one can use the known exact values\cite{TW1}, $\langle \chi\rangle=

1145: -1.7711\dots$ and $\langle \chi^2\rangle- {\langle \chi\rangle}^2= 0.8132\dots$.

1146: These exact results thus invalidate the previous attempt\cite{Monvel} to

1147: fit the subleading correction to the mean in the BM model with a

1148: $n^{1/2}/{\ln (n)}$ behavior and also to fit the scaled distribution

1149: with a Gaussian form.

1150: Note that the recursion relation in Eq.

1151: (\ref{recur1}) can also be viewed as a $(1+1)$ dimensional directed polymer

1152: problem\cite{Hwa,Monvel} and some asymptotic results (such as the $O(n^{2/3})$

1153: behavior of the variance of $L_{n,n}$ for large $n$) can be obtained using the

1154: arguments of universality\cite{Hwa}. However, this does not provide precise results

1155: for the full distribution along with the correct scale factors that are obtained here.
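As a rough numerical illustration of Eqs. (\ref{fc1}) and (\ref{eq:expvar}) (the numbers below are evaluated here, using only the values of $\langle\chi\rangle$ and of the variance of $\chi$ quoted above), consider the case $c=4$ relevant for DNA. Then $\gamma_4^{BM}=2/(1+\sqrt{4})=2/3$ and
\[
f(4)=\frac{4^{1/6}(\sqrt{4}-1)^{1/3}}{\sqrt{4}+1}=\frac{2^{1/3}}{3}\approx 0.420,
\]
so that
\[
\langle L_{n,n}^{BM}\rangle \approx \frac{2}{3}\,n - 0.744\, n^{1/3}, \qquad
{\rm Var}\, L_{n,n}^{BM}\approx 0.143\, n^{2/3}.
\]
For $n=1000$, for instance, this gives a mean of about $659$ with a standard deviation of about $3.8$, so the finite size correction to the mean is roughly twice the typical fluctuation.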

1156:

1157: It is useful to provide a synopsis of our method in deriving these results. First,

1158: we prove the results in the $c\to \infty$ limit, by using mappings to other models.

1159: To make progress for finite $c$, we first map the BM model exactly to a $3$-d

1160: anisotropic directed percolation (ADP) model first studied by Rajesh and

1161: Dhar\cite{RD}. This ADP model is also precisely the same as the directed

1162: polymer model studied by Johansson~\cite{J1}, as discussed in the previous section

1163: and for which the exact results are known as in Eq. (\ref{j1}).

1164: To extract the results for the BM model from those of Johansson's

1165: model, we use a simple symmetry argument which then allows us to derive our main

1166: results in Eqs. (\ref{asymp11})-(\ref{eq:expvar}) for all $c$. As a check, we recover

1167: the $c\to \infty$ limit result obtained independently by the first method.
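This check can be made explicit (a short verification added here, assuming only Eqs. (\ref{asymp11}), (\ref{fc1}) and (\ref{p02})). Writing
\[
\gamma_c^{BM}=\frac{2}{\sqrt{c}}\,\frac{1}{1+c^{-1/2}}, \qquad
f(c)=c^{-1/6}\,\frac{(1-c^{-1/2})^{1/3}}{1+c^{-1/2}},
\]
one sees that $\gamma_c^{BM}\to 2/\sqrt{c}$ and $f(c)\to c^{-1/6}$ to leading order for large $c$, which are the coefficients appearing in Eq. (\ref{p02}).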

1168:

1169: In the BM model, the length $L_{i,j}^{BM}$ can be interpreted as the height of a

1170: surface over the $2$ dimensional $(i,j)$ plane constructed via the recursion relation in Eq.

1171: (\ref{recur1}). A typical surface, shown in Fig. \ref{fig:bms1}\,(a), has terrace-like structures.

1172: \begin{figure}

1173: \includegraphics[width=.7\hsize]{bm_f1.eps}

1174: \caption{Examples of (a) BM surface

1175: $L_{i,j}^{BM}\equiv {\tilde h}(x,y)$ and (b) ADP surface $L_{i,j}^{ADP}\equiv

1176: h(x,y)$.}

1177: \label{fig:bms1}

1178: \end{figure}

1179:

1180: It is useful to consider the projection of the level lines separating the adjacent

1181: terraces whose heights differ by $1$ (see Fig.\ref{fig:bms2}) onto the $2$-D $(i,j)$ plane. Note

1182: that, by the rule Eq. (\ref{recur1}), these level lines never overlap each other,

1183: i.e., no two paths have any common edge. The statistical weight of such a projected

1184: $2$-D configuration is the product of weights associated with the vertices of the

1185: $2$-D plane. There are five types of possible vertices with nonzero weights as shown

1186: in Fig. \ref{fig:bms2}, where $p=1/c$ and $q=1-p$. Since the level lines never cross each other,

1187: the weight of the first vertex in Fig. \ref{fig:bms2} is $0$.

1188: %The height $L_{i,j}^{BM}$ at any point $(i,j)$ on this $2$-d plane is just the

1189: %number of level lines that one crosses in going from the origin to $(i,j)$.

1190: \begin{figure}

1191: \includegraphics[width=.7\hsize]{bm_f2.eps}

1192: \caption{Projected $2$-d level lines separating adjacent terraces of unit height

1193: difference in the BM surface in Fig.\ref{fig:bms1} (a). The adjacent table shows the weights of

1194: all vertices on the $2$-d plane.}

1195: \label{fig:bms2}

1196: \end{figure}

1197:

1198: Consider first the limit $c\to \infty$ (i.e., $p\to 0$). The weights of all allowed

1199: vertices are $1$, except the ones shown by black dots in Fig. \ref{fig:bms2}, whose associated

1200: weights are $p\to 0$. The number $N$ of these black dots inside a rectangle of area

1201: $A=ij$ can be easily estimated.

1202: For large $A$ and $p\to 0$, this number

1203: is clearly

1204: Poisson

1205: distributed with the mean ${\overline N}= pA$.

1206: The height $L_{i,j}^{BM}$ is just the number of level lines $\cal N$ inside this

1207: rectangle of area $A=ij$. One can easily estimate $\cal N$ by following

1208: precisely the method outlined in the previous subsection in the context of the ballistic deposition

1209: model. Following the same analysis as in the ballistic deposition model,

1210: it is easy to see that

1211: the number of level lines ${\cal N}$ inside the rectangle

1212: (for large $A$), appropriately scaled, has a limiting behavior, ${\cal N}\to

1213: 2\sqrt{\overline N} + {\overline N}^{1/6}\, \chi$, where $\chi$ is a random variable

1214: with the Tracy-Widom distribution. Using ${\overline N}=pA=ij/c$, one then obtains in

1215: the limit $p\to 0$,

1216: \begin{equation}

1217: L_{i,j}^{BM}= {\cal N} \to \frac{2}{\sqrt c}\sqrt{ij} +

1218: {\left( \frac{ij}{c}\right)}^{1/6}\, \chi.

1219: \label{p01}

1220: \end{equation}

1221: In particular, for large equal length sequences $i=j=n$, we get for $c\to \infty$

1222: \begin{equation}

1223: L_{n,n}^{BM}\to \frac{2}{\sqrt{c}}\, n + c^{-1/6} \, n^{1/3}\, \chi .

1224: \label{p02}

1225: \end{equation}

1226: For finite $c$, while the above mapping to the LIS problem still works, the

1227: corresponding permutations of the LIS problem are not generated with equal

1228: probability and hence one can no longer use the BDJ results.

1229:

1230: For any finite $c$, we can however map the BM model to the ADP model studied by Rajesh and Dhar~\cite{RD}.

1231: In the ADP model on

1232: a simple cubic lattice the bonds are occupied with probabilities $p_x$, $p_y$, and

1233: $p_z$ along the $x$, $y$ and $z$ axes and are all directed towards increasing

1234: coordinates. Imagine a source of fluid at the origin which spreads along the

1235: occupied directed bonds. The sites that get wet by the fluid form a $3$-d cluster.

1236: In the ADP problem, the bond occupation probabilities are anisotropic, $p_x=p_y=1$

1237: (all bonds aligned along the $x$ and $y$ axes are occupied) and $p_z=p$. Hence, if

1238: the point $(x,y,z)$ gets wet by the fluid then all the points $(x',y', z)$ on the

1239: same plane with $x'\ge x$ and $y'\ge y$ also get wet. Such a wet cluster is compact

1240: and can be characterized by its bounding surface height $H(x,y)$ as shown in

1241: Fig. \ref{fig:bms1}\,(b). It is not difficult to see~\cite{RD} that the height $H(x,y)$ satisfies exactly

1242: the same recursion relation of the directed polymer as in Eq. (\ref{dpr2})

1243: where $\xi_{x,y}$'s are i.i.d. random variables taking nonnegative integer values

1244: with ${\rm Prob}(\xi_{x,y}=k)= (1-p)\, p^k$ for $k=0,1,2,\dots$. Thus the ADP

1245: model of Rajesh and Dhar is precisely identical to the directed polymer model

1246: studied by Johansson with exactly the same distribution of the noise $\xi(x,y)$.
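For orientation, the basic properties of this geometric distribution are easily worked out (a standard computation, included here only as an illustration):
\[
\sum_{k=0}^{\infty}(1-p)\,p^k=1, \qquad
\langle \xi_{x,y}\rangle=\sum_{k=0}^{\infty}k\,(1-p)\,p^k=\frac{p}{1-p}=\frac{1}{c-1},
\]
where the last equality uses the identification $p=1/c$ of the BM model. In particular, the noise becomes sparse, $\langle\xi_{x,y}\rangle\approx p$, in the limit $c\to\infty$.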

1247:

1248: While the terrace-like structures of the ADP surface look similar to the BM surfaces

1249: (compare Figs. \ref{fig:bms1}\,(a) and \ref{fig:bms1}\,(b)), there is an important difference between the

1250: two. In

1251: the ADP model, the level lines separating two adjacent terraces can overlap with

1252: each other\cite{RD}, which does not happen in the BM model. However, by making the

1253: following change of coordinates in the ADP model\cite{RD}

1254: \begin{equation}

1255: \zeta= x+ h(x,y); \,\,\, \eta=y+ h(x,y)

1256: \label{ct1}

1257: \end{equation}

1258: one gets a configuration of the surface where the level lines no longer overlap.

1259: Moreover, it is not difficult to show that the projected $2$-D configuration of

1260: level lines of this shifted ADP surface has exactly the same statistical weight as

1261: the projected $2$-D configuration of the BM surface. Denoting the BM height by

1262: ${\tilde h}(x,y)= L_{x,y}^{BM}$, one then has the identity, ${\tilde h}(\zeta,

1263: \eta)= h(x,y)$, which holds for each configuration. Using Eq. (\ref{ct1}), one can

1264: rewrite this identity as

1265: \begin{equation}

1266: {\tilde h}(\zeta, \eta)= h\left( \zeta- {\tilde h}(\zeta, \eta),

1267: \eta- {\tilde h}(\zeta, \eta)\right).

1268: \label{conv1}

1269: \end{equation}

1270:

1271: Thus, for any given height function $h(x,y)$ of the ADP model, one can, in

1272: principle, obtain the corresponding height function ${\tilde h}(x,y)$ for all

1273: $(x,y)$ of the BM model by solving the nonlinear equation (\ref{conv1}). This is

1274: however very difficult in practice. Fortunately, one can make progress for large

1275: $(x,y)$ where one can replace the integer valued discrete heights by continuous

1276: functions $h(x,y)$ and ${\tilde h}(x,y)$. Using the notation $\partial_x\equiv

1277: \partial/{\partial x}$ it is easy to derive from Eq. (\ref{ct1}) the following pair

1278: of identities,

1279: \begin{equation}

1280: \partial_x h = \frac{\partial_{\zeta} {\tilde h}}{1-\partial_{\zeta}

1281: {\tilde h}-\partial_{\eta} {\tilde h}};

1282: \,\,\,

1283: \partial_y h = \frac{\partial_{\eta} {\tilde h}}{1-\partial_{\zeta}

1284: {\tilde h}-\partial_{\eta} {\tilde h}}.

1285: \label{der1}

1286: \end{equation}

1287: In a similar way, one can show that

1288: \begin{equation}

1289: \partial_{\zeta} {\tilde h} = \frac{\partial_x h}{1+\partial_x h+\partial_y h};\,\,\,

1290: \partial_{\eta} {\tilde h} = \frac{\partial_y h}{1+\partial_x h+\partial_y h}.

1291: \label{der2}

1292: \end{equation}
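For completeness, here is a sketch of how these identities follow (the intermediate steps are filled in here and assume that $h$ and ${\tilde h}$ are smooth). Differentiating the identity ${\tilde h}(\zeta,\eta)=h(x,y)$ with respect to $x$ at fixed $y$, and using $\partial_x\zeta=1+\partial_x h$ and $\partial_x\eta=\partial_x h$ from Eq. (\ref{ct1}), one gets
\[
\partial_x h=\partial_{\zeta}{\tilde h}\,(1+\partial_x h)+\partial_{\eta}{\tilde h}\,\partial_x h ,
\]
which, solved for $\partial_x h$, is the first identity in Eq. (\ref{der1}); the second follows in the same way by differentiating with respect to $y$. Adding the two identities in Eq. (\ref{der1}) gives
\[
1+\partial_x h+\partial_y h=\frac{1}{1-\partial_{\zeta}{\tilde h}-\partial_{\eta}{\tilde h}},
\]
and substituting this back into Eq. (\ref{der1}) yields Eq. (\ref{der2}).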

1293: We then observe that Eqs. (\ref{der1}) and (\ref{der2}) are invariant under the

1294: simultaneous transformations

1295: \begin{equation}

1296: \zeta\to -x ; \,\, \eta\to -y; \,\, \tilde h \to h \, .

1297: \label{invar1}

1298: \end{equation}

1299: Since the height is built up by integrating the derivatives, this leads to a simple

1300: result for large $\zeta$ and $\eta$,

1301: \begin{equation}

1302: {\tilde h}(\zeta, \eta) = h(-\zeta, -\eta).

1303: \label{res1}

1304: \end{equation}

1305:

1306: Thus, if we know exactly the functional form of the ADP surface $h(x,y)$, then the

1307: functional form of the BM surface ${\tilde h}(x,y)$ for large $x$ and $y$ is simply

1308: obtained by ${\tilde h}(x,y)=h(-x,-y)$. Changing $x\to -x$ and $y\to -y$ in

1309: Johansson's expression for the ADP surface in Eq. (\ref{j1}) we thus arrive at our

1310: main asymptotic result for the BM model

1311: \begin{eqnarray}

1312: L_{x,y}^{BM}&=& {\tilde h}(x,y) \to \frac{2\sqrt{pxy}-p(x+y)}{q} \nonumber \\

1313: &+&\frac{(pxy)^{1/6}}{q}\,\left[(1+p)-\sqrt{\frac{p}{xy}}\,(x+y)\right]^{2/3} \,

1314: \chi, \label{res2}

1315: \end{eqnarray}

1316: where $p=1/c$ and $q=1-1/c$. For equal length sequences $x=y=n$, Eq. (\ref{res2})

1317: then reduces to Eq. (\ref{asymp11}).
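The reduction can be checked explicitly (the algebra below is filled in here as a verification). Setting $x=y=n$ and using $q=1-p=(1-\sqrt{p})(1+\sqrt{p})$, the leading term becomes
\[
\frac{2\sqrt{p}\,n-2p\,n}{q}=\frac{2\sqrt{p}\,(1-\sqrt{p})}{(1-\sqrt{p})(1+\sqrt{p})}\,n
=\frac{2}{1+\sqrt{c}}\,n=\gamma_c^{BM}\,n,
\]
while the bracket in the fluctuating term equals $(1+p)-2\sqrt{p}=(1-\sqrt{p})^2$, so that the prefactor of $\chi$ is
\[
\frac{p^{1/6}\,n^{1/3}\,(1-\sqrt{p})^{4/3}}{(1-\sqrt{p})(1+\sqrt{p})}
=\frac{p^{1/6}(1-\sqrt{p})^{1/3}}{1+\sqrt{p}}\,n^{1/3}
=\frac{c^{1/6}(\sqrt{c}-1)^{1/3}}{\sqrt{c}+1}\,n^{1/3}=f(c)\,n^{1/3},
\]
in agreement with Eqs. (\ref{asymp11}) and (\ref{fc1}).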

1318:

1319: To check the consistency of our asymptotic results, we further computed the

1320: difference between the left- and the right-hand sides of Eq. (\ref{conv1}),

1321: \begin{equation}

1322: \Delta h (\zeta, \eta)= {\tilde h}(\zeta, \eta)- h\left( \zeta- {\tilde h}(\zeta,

1323: \eta), \eta- {\tilde h}(\zeta, \eta)\right), \label{conv2}

1324: \end{equation}

1325: with the functions $h(x,y)$ and ${\tilde h}(x,y)$ given respectively by Eqs.

1326: (\ref{j1}) and (\ref{res2}). For large $\zeta=\eta$ one gets

1327: \begin{equation}

1328: \Delta h(\zeta,\zeta) \to \left[{p^{1/3}\chi^2}/{3 (1-\sqrt{p})^{4/3}}\right]\,

1329: {\zeta}^{-1/3} . \label{cons1}

1330: \end{equation}

1331: Thus the discrepancy falls off as a power law for large $\zeta$, indicating that

1332: indeed our solution is asymptotically exact. We have also performed numerical

1333: simulations of the BM model using the recursion relation in Eq. (\ref{recur1}) for

1334: $c=2,\,4,\,9,\,16,\,100$. Our preliminary results\cite{MN} for relatively small

1335: system sizes (up to $n=5000$) are consistent with our exact results in Eqs.

1336: (\ref{asymp11})-(\ref{eq:expvar}).

1337:

1338: Thus, the Tracy-Widom distribution also describes the asymptotic distribution of

1339: the optimal matching length in the BM model, for all $c$. Given that the correlations in the original LCS

1340: model

1341: become negligible in the $c\to \infty$ limit, it is likely that the

1342: BM asymptotics in Eq. (\ref{p02}) would also hold for the original LCS model

1343: in the $c\to \infty$ limit.

1344: An important open problem

1345: is to determine whether the Tracy-Widom distribution also appears in the

1346: LCS problem for finite $c$. The precise distribution obtained

1347: here (including exact prefactors) for all $c$ in the BM model will serve

1348: as a useful benchmark to which future simulations of the LCS problem can

1349: be compared.

1350:

1351: \section{Conclusion}

1352:

1353: In these lectures I have discussed $4$ a priori unrelated problems and tried to give a flavour

1354: of the recent developments that have found a deep connection between these problems.

1355: These connections have now established the fact that they all share one common limiting distribution,

1356: namely the Tracy-Widom distribution that describes the asymptotic distribution law of

1357: the largest eigenvalue of a random matrix. I have also discussed the probabilities of

1358: large deviations of the largest eigenvalue, in the range outside the validity of the

1359: Tracy-Widom law. As examples, I have demonstrated in detail, in two specific models,

1360: a ballistic

1361: deposition model and a sequence alignment problem,

1362: how they can be mapped on to the longest increasing subsequence problem

1363: and consequently proving the existence of the Tracy-Widom distribution in these

1364: models.

1365:

1366: There have been many other interesting recent developments in this rather broad area encompassing

1367: different fields that I did not have the scope to discuss in these lectures.

1368: There are, of course, plenty of open questions that

1369: need to be addressed, some of which I mention below.

1370:

1371: {\em Finite size effects in growth models:} We have discussed how the Tracy-Widom distribution appears

1372: as the limiting scaled height distribution in several $(1+1)$ dimensional growth

1373: models that belong to the KPZ universality class of fluctuating interfaces. Indeed,

1374: for a fluctuating surface with height $H(x,t)$ growing over a substrate of infinite size

1375: one now believes that at long times $t>>1$

1376: \begin{equation}

1377: H(x,t) = v t + b t^{1/3} \chi

1378: \label{con1}

1379: \end{equation}

1380: where $\chi$ is a time-independent random variable with the Tracy-Widom distribution.

1381: The prefactors $v$ (the velocity of the interface) and $b$ are model dependent,

1382: but the distribution of the scaled variable $\chi=(H-vt)/{bt^{1/3}}$ is universal

1383: for large $t$. The nonuniversal prefactors are often very hard to compute. We have

1384: shown two examples in these lectures where these prefactors can be computed exactly.

1385: Note, however, that the result in Eq. (\ref{con1}) holds only in an infinite system.

1386: In any real system with a finite

1387: substrate size $L$, the result in Eq. (\ref{con1}) will hold only in the growing

1388: regime of the surface, i.e., when $1\ll t \ll L^z$, where $z$ is the dynamical

1389: exponent characterizing the surface evolution. For example, for the KPZ

1390: type of interfaces in $(1+1)$ dimensions, $z=3/2$. However, when $t\gg L^z$, the probability distribution

1391: of the height fluctuation

1392: $H-\langle H\rangle$ will become time-independent. For example, for $(1+1)$ dimensional KPZ surfaces

1393: with periodic boundary conditions, it is well known~\cite{HZ} that the stationary distribution of

1394: the height fluctuation is a simple Gaussian, ${\rm Prob}[H-\langle H\rangle=x]\propto \exp[-x^2/{a_0 L}]$

1395: where $a_0$ is a nonuniversal constant and the typical fluctuation scales with the system size as $L^{1/2}$.

1396: An important open question is how the distribution of the height fluctuation crosses over

1397: from the Tracy-Widom form to a simple Gaussian form as $t$ becomes bigger than the crossover time $L^z$.

1398: It would be nice to show this explicitly in any of the simple models discussed above.
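The crossover time scale can be motivated by a simple matching argument (a heuristic estimate added here, ignoring all nonuniversal prefactors). In the growing regime the fluctuations grow as $t^{1/3}$ by Eq. (\ref{con1}), while in the stationary regime they saturate at $L^{1/2}$. Equating the two,
\[
t^{1/3}\sim L^{1/2} \quad\Longrightarrow\quad t\sim L^{3/2},
\]
which reproduces the crossover time $L^z$ with $z=3/2$.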

1399:

1400:

1401: {\em A direct connection between the growth models and random matrices:} The existence of the Tracy-Widom

1402: distribution in many of the growth models discussed here, such as the polynuclear growth model

1403: or the ballistic deposition model, relies on the mapping to the LIS problem

1404: and subsequently using the BDJ results that connect the LIS problem to random matrices.

1405: It is certainly desirable to find a direct mapping between the growth models and the

1406: largest eigenvalue of a random matrix. Recent work by Spohn and collaborators~\cite{Spohn}

1407: linking the top edge of a PNG growth model to Dyson's Brownian motion of the eigenvalues

1408: of a random matrix perhaps provides a clue to this missing link.

1409:

1410:

1411: {\em Largest Lyapunov exponent in population dynamics:} The Tracy-Widom distribution

1412: and the associated large-deviation function discussed in Section 3

1413: conceivably have important applications in several systems

1414: where the largest eigenvalue controls the spectral properties of the system. Some

1415: examples were discussed in Section 3. Recently, it has been shown that the statistics

1416: of the largest eigenvalue (the largest Lyapunov exponent) is also of importance

1417: in population growth of organisms in fluctuating environments~\cite{KL1}.

1418: It would be interesting to see if Tracy-Widom type distribution functions also

1419: appear in these biological problems.

1420:

1421:

1422: {\em Sequence matching, directed polymer and vertex models:} In the context of the sequence matching problem

1423: discussed in Section 6, we have demonstrated how the statistical weights of the surface generated

1424: in the Bernoulli matching

1425: model of the sequence alignment are exactly identical to those of

1426: a $5$-vertex model on a square lattice (see Fig. \ref{fig:bms2}). This is a useful connection

1427: because there are many quantities in the $5$-vertex models that can be computed exactly by employing

1428: the Bethe ansatz techniques and subsequently one can use those results for the sequence

1429: alignment or equivalently for the directed polymer model. Recently, in collaboration

1430: with K. Mallick and S. Nechaev, we have made some progress in these directions~\cite{MMN}.

1431: A very interesting open issue is if one can derive the Tracy-Widom distribution

1432: by using the Bethe ansatz techniques.

1433:

1434: {\em Other issues related to the sequence matching problem:} There are also many other

1435: interesting open questions associated with

1436: the sequence matching problem.

1437: We have shown that the length of the longest matching is Tracy-Widom distributed

1438: only in the Bernoulli matching model which is a simpler version of the original LCS problem.

1439: In the BM model one has ignored certain correlations, as we discussed in detail. This approximation is

1440: exact in the $c\to \infty$ limit, where $c$ is the number of different types of alphabets, e.g.

1441: for DNA, $c=4$. Is this approximation good even for finite $c$? In

1442: other words,

1443: is the optimal matching length in the original LCS problem also Tracy-Widom distributed?

1444: It would also be

1445: interesting if one can make a systematic $1/c$ expansion of the LCS model, i.e., keeping

1446: the correlations up to $O(1/c)$. Numerical simulations of the LCS problem~\cite{BMat} for binary sequences ($c=2$)

1447: indeed indicate that the standard deviation of the optimal matching length scales as $n^{1/3}$, where

1448: $n$ is the sequence size, as in the

1449: BM model. The question is whether the scaled distribution is also Tracy-Widom or not.

1450: For the original LCS problem, there is also a curious result due to Bonetto

1451: and Matzinger~\cite{BMat} that claims that if the values of $c$ for the two sequences are not the same (for example,

1452: the first sequence may be drawn randomly from $3$ alphabets and the second may be a binary sequence),

1453: then the standard deviation of the optimal matching length scales as $n^{1/2}$ for large $n$, which

1454: is rather surprising!

1455: It would be interesting to study the statistics of optimal matches between more than two sequences.

1456: Finally, here we have considered only the matching of random sequences. It would be interesting

1457: and important

1458: to study the statistics of optimal matching lengths between non-random sequences, e.g.,

1459: when there are some correlations between the members of any given sequence.
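
To state the open questions above more concretely, one may write a tentative scaling form for the optimal matching length $L_n$ of two random sequences of length $n$ drawn from the same alphabet of $c$ letters. The symbols $L_n$, $\gamma_c$, $\sigma_c$ and $\chi$ below are introduced only for this illustration, and the form itself is an assumption rather than an established result for the original LCS problem:
\[
L_n \simeq \gamma_c\, n + \sigma_c\, n^{1/3}\, \chi \qquad (n\to\infty),
\]
where $\gamma_c$ and $\sigma_c$ are constants depending on $c$ and $\chi$ is a random variable whose distribution does not depend on $n$. In the BM model a form of this type holds with $\chi$ Tracy-Widom distributed. For the original LCS problem with $c=2$, the numerical results quoted above support the exponent $1/3$, while the law of $\chi$ remains open. In the case of unequal alphabet sizes, the result of~\cite{BMat} would instead correspond to fluctuations of order $n^{1/2}$.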

1460:

1461:

1462: \vspace{0.2cm}

1463:

1464: {\bf Acknowledgements:} My own contribution to this field that is presented here was

1465: developed partly in collaboration

1466: with D.S. Dean and partly with S. Nechaev. It is a pleasure to thank them.

1467: I also thank O. Bohigas, K. Mallick and P. Vivo for collaborations on related topics.

1468: Besides, I acknowledge useful discussions with G. Biroli, J.-P. Bouchaud, A.J. Bray,

1469: A. Comtet, D. Dhar, S. Leibler, O.C. Martin, M. M\'ezard, R. Rajesh and C. Tracy. I also thank the

1470: organizers

1471: J.-P. Bouchaud and M. M\'ezard and all other participants of this summer school for physics, for fun,

1472: and for making the school a memorable one.

1473:

1474: %

1475: % ********** End of text entry *************

1476: %

1477: \begin{thebibliography}{99}

1478:

1479: \bibitem{TW1} C. Tracy and H. Widom, Comm. Math. Phys. {\bf 159}, 151 (1994);

1480: {\bf 177}, 727 (1996); For a review see {\em Proceedings of the International Congress of

1481: Mathematicians}, Beijing 2002, Vol. I, ed. LI Tatsien, Higher Education

1482: Press, Beijing 2002, pgs. 587-596.

1483:

1484: \bibitem{BDJ} J. Baik, P. Deift, and K. Johansson, J. Amer. Math. Soc. {\bf

1485: 12}, 1119 (1999).

1486:

1487: \bibitem{J1} K. Johansson, Comm. Math. Phys. {\bf 209}, 437 (2000).

1488:

1489: \bibitem{BR1} J. Baik and E.M. Rains, J. Stat. Phys. {\bf 100}, 523 (2000).

1490:

1491: \bibitem{PS} M. Pr\"ahofer and H. Spohn, Phys. Rev. Lett. {\bf 84}, 4882

1492: (2000); Physica A, {\bf 279}, 342 (2000).

1493:

1494: \bibitem{GTW} J. Gravner, C.A. Tracy, and H. Widom, J. Stat. Phys. {\bf 102}, 1085 (2001).

1495:

1496: \bibitem{BD} S.N. Majumdar and S. Nechaev, Phys. Rev. E {\bf 69}, 011103 (2004).

1497:

1498: \bibitem{IS} T. Imamura and T. Sasamoto,

1499: Nucl. Phys. {\bf B699}, 503 (2004); J. Stat. Phys. {\bf 115}, 749 (2004).

1500:

1501: \bibitem{F1} P.L. Ferrari, Commun. Math. Phys. {\bf 252}, 77 (2004).

1502:

1503: \bibitem{S1} T. Sasamoto, J. Phys. A: Math. Gen. {\bf 38}, L549 (2005).

1504:

1505: \bibitem{Spohn} H. Spohn, Physica A, {\bf 369}, 71 (2006) and references therein.

1506:

1507: \bibitem{MN} S.N. Majumdar and S. Nechaev, Phys. Rev. E {\bf 72}, 020901(R) (2005).

1508:

1509: \bibitem{meso}

1510: M.G. Vavilov, P.W. Brouwer, V. Ambegaokar, and C.W.J. Beenakker,

1511: Phys. Rev. Lett. {\bf 86}, 874 (2001); A. Lamacraft and B.D. Simons, Phys. Rev. B {\bf 64}, 014514 (2001);

1512: P.M. Ostrovsky, M.A. Skvortsov, and M.V. Feigel'man, Phys. Rev. Lett. {\bf 87}, 027002 (2001);

1513: J.S. Meyer and B.D. Simons, Phys. Rev. B {\bf 64}, 134516 (2001);

1514: A. Silva and L.B. Ioffe, Phys. Rev. B {\bf 71}, 104502 (2005);

1515: A. Silva, Phys. Rev. B {\bf 72}, 224505 (2005).

1516:

1517: \bibitem{BBP}

1518: G. Biroli, J-P. Bouchaud, and M. Potters, cond-mat/0609070 and references therein.

1519:

1520:

1521: \bibitem{Wigner} E.P. Wigner, Proc. Cambridge Philos. Soc. {\bf 47},

1522: 790 (1951).

1523:

1524: \bibitem{Mehta} M.L. Mehta, {\em Random Matrices}, 2nd Edition (Academic Press,

1525: 1991).

1526:

1527: \bibitem{CGG} A. Cavagna, J.P. Garrahan, and I. Giardina, Phys. Rev. B {\bf 61}, 3960 (2000).

1528:

1529: \bibitem{Fyodorov} Y.V. Fyodorov, Phys. Rev. Lett. {\bf 92}, 240601 (2004);

1530: Acta Physica Polonica B {\bf 36}, 2699 (2005).

1531:

1532:

1533: \bibitem{Susskind} L. Susskind, arXiv:hep-th/0302219; M.R. Douglas,

1534: B. Shiffman, and S. Zelditch, Commun. Math. Phys. {\bf 252}, 325

1535: (2004).

1536:

1537: \bibitem{AE} A. Aazami and R. Easther, J. Cosmol. Astropart. Phys.

1538: JCAP03 013 (2006).

1539:

1540: \bibitem{MH} L. Mersini-Houghton, Class. Quant. Grav. {\bf 22}, 3481 (2005).

1541:

1542: \bibitem{VMB} P. Vivo, S.N. Majumdar, and O. Bohigas, in preparation.

1543:

1544: \bibitem{DM} D.S. Dean and S.N. Majumdar, Phys. Rev. Lett. {\bf 97}, 160201 (2006).

1545:

1546: \bibitem{BrayDean} A.J. Bray and D.S. Dean, cond-mat/0611023.

1547:

1548: \bibitem{FSW} Y.V. Fyodorov, H-J. Sommers, and I. Williams, cond-mat/0611585.

1549:

1550: \bibitem{Sosh} A. Soshnikov, Commun. Math. Phys. {\bf 207}, 697 (1999).

1551:

1552: \bibitem{BBP1} J. Baik, G. Ben Arous, and S. P\'ech\'e, Ann. Probab. {\bf 33}, 1643 (2005).

1553:

1554: \bibitem{CB} P. Cizeau and J.-P. Bouchaud, Phys. Rev. E {\bf 50}, 1810 (1994).

1555:

1556: \bibitem{Burda} Z. Burda et al., cond-mat/0602087.

1557:

1558: \bibitem{Ulam} S.M. Ulam, {\em Modern Mathematics for the Engineers}, ed. by

1559: E.F. Beckenbach (McGraw-Hill, New York, 1961), p. 261.

1560:

1561: \bibitem{Hammersley} J.M. Hammersley, {\em Proc. VI-th Berkeley Symp. on Math.

1562: Stat. and Probability}, (University of California, Berkeley, 1972), Vol. 1, p.

1563: 345.

1564:

1565: \bibitem{VK} A.M. Vershik and S.V. Kerov, Sov. Math. Dokl. {\bf 18}, 527

1566: (1977).

1567:

1568: \bibitem{AD} For a review, see D. Aldous and P. Diaconis, Bull. Amer. Math.

1569: Soc. {\bf 36}, 413 (1999).

1570:

1571: \bibitem{RSK} C. Schensted, Canad. J. Math. {\bf 13}, 179 (1961).

1572:

1573: \bibitem{Mallows} C.M. Mallows, Bull. Inst. Math. Appl., {\bf 9}, 216 (1973).

1574:

1575: \bibitem{HH} D.A. Huse and C.L. Henley, Phys. Rev. Lett. {\bf 54}, 2708 (1985).

1576:

1577: \bibitem{KPZ} M. Kardar, G. Parisi, and Y.C. Zhang, Phys. Rev. Lett. {\bf 56}, 889 (1986).

1578:

1579: \bibitem{FNS} D. Forster, D.R. Nelson, and M.J. Stephen, Phys. Rev. A {\bf 16}, 732 (1977).

1580:

1581: \bibitem{DS1} B. Derrida and H. Spohn, J. Stat. Phys. {\bf 51}, 817 (1988).

1582:

1583: \bibitem{Mezard} M. M\'ezard, J. Phys. Fr. {\bf 51}, 1831 (1990).

1584:

1585: \bibitem{FH} D.S. Fisher and D.A. Huse, Phys. Rev. B {\bf 43}, 10728 (1991).

1586:

1587: \bibitem{Kardar} M. Kardar, Nucl. Phys. {\bf B290}, 582 (1987).

1588:

1589: \bibitem{IS1} J.Z. Imbrie and T. Spencer, J. Stat. Phys. {\bf 52}, 609 (1988); J. Cook

1590: and B. Derrida, J. Stat. Phys. {\bf 57}, 89 (1989).

1591:

1592: \bibitem{KZ} M. Kardar and Y.C. Zhang, Phys. Rev. Lett. {\bf 58}, 2087 (1987); M. Kardar, Phys. Rev.

1593: Lett. {\bf 55}, 2923 (1985).

1594:

1595: \bibitem{HZ} T. Halpin-Healy and Y.C. Zhang, Phys. Rep. {\bf 254}, 215 (1995).

1596:

1597: \bibitem{HHF} D.A. Huse, C.L. Henley, and D.S. Fisher, Phys. Rev. Lett. {\bf 55}, 2924 (1985).

1598:

1599: \bibitem{KMH} J. Krug, P. Meakin, and T. Halpin-Healy, Phys. Rev. A {\bf 45}, 638 (1992).

1600:

1601: \bibitem{KH} J. Krug and T. Halpin-Healy, J. Phys. A {\bf 31}, 5939 (1998).

1602:

1603: \bibitem{RD} R. Rajesh and D. Dhar, Phys. Rev. Lett. {\bf 81}, 1646 (1998).

1604:

1605: \bibitem{Meakin} P. Meakin, {\em Fractals, Scaling, and Growth Far From

1606: Equilibrium} (Cambridge University Press, Cambridge, 1998).

1607:

1608: \bibitem{KS} J. Krug and H. Spohn, in {\em Solids Far From Equilibrium} (ed. by

1609: C. Godr\`eche) (Cambridge University Press, New York, 1991).

1610:

1611: \bibitem{Eden} M. Eden, in {\em Proc. IV-th Berkeley Symp. on Math. Sciences

1612: and Probability}, ed. by J. Neyman (University of California, Berkeley, 1961),

1613: Vol. 4, p. 223.

1614:

1615: \bibitem{RSOS} J.M. Kim and J.M. Kosterlitz, Phys. Rev. Lett. {\bf 62}, 2289

1616: (1989).

1617:

1618: \bibitem{PNG} F.C. Frank, J. Cryst. Growth {\bf 22}, 233 (1974); J. Krug and H.

1619: Spohn, Europhys. Lett. {\bf 8}, 219 (1989); J. Kert\'esz and D.E. Wolf, Phys.

1620: Rev. Lett. {\bf 62}, 2571 (1989).

1621:

1622: \bibitem{BaD} M.J. Vold, J. Colloid Sci. {\bf 14}, 168 (1959); P. Meakin, P.

1623: Ramanlal, L.M. Sander, and R.C. Ball, Phys. Rev. A {\bf 34}, 5091 (1986); J.

1624: Krug and H. Spohn, Phys. Rev. A {\bf 38}, 4271 (1988).

1625:

1626: \bibitem{Krug2} J. Krug and P. Meakin, Phys. Rev. A {\bf 40}, 2064 (1989); {\em ibid}, {\bf 43},

1627: 900 (1991).

1628:

1629: \bibitem{Bethe} D. Dhar, Phase Transitions, {\bf 9}, 51 (1987); L.-H. Gwa and

1630: H. Spohn, Phys. Rev. Lett. {\bf 68}, 725 (1992); D. Kim, Phys. Rev. E {\bf 52},

1631: 3512 (1995).

1632:

1633: \bibitem{KS1} E. Katzav and M. Schwartz, Phys. Rev. E {\bf 70}, 061608 (2004).

1634:

1635: \bibitem{KNV} E. Katzav, S. Nechaev, and O. Vasilyev, cond-mat/0611537.

1636:

1637: \bibitem{W1} M.S. Waterman, {\em Introduction to Computational Biology} (Chapman \& Hall,

1638: London, 1994).

1639:

1640: \bibitem{Gusfield} D. Gusfield, {\em Algorithms on Strings, Trees, and Sequences} (Cambridge

1641: University Press, Cambridge, 1997).

1642:

1643: \bibitem{DEKM} R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, {\em Biological Sequence

1644: Analysis} (Cambridge University Press, Cambridge, 1998).

1645:

1646: \bibitem{NW} S.B. Needleman and C.D. Wunsch, J. Mol. Biol. {\bf 48}, 443 (1970).

1647:

1648: \bibitem{SW} T.F. Smith and M.S. Waterman, J. Mol. Biol. {\bf 147}, 195 (1981); Adv. Appl.

1649: Math. {\bf 2}, 482 (1981).

1650:

1651: \bibitem{WGA} M.S. Waterman, L. Gordon, and R. Arratia, Proc. Natl. Acad. Sci. USA,

1652: {\bf 84}, 1239 (1987).

1653:

1654: \bibitem{AGMML} S.F. Altschul et al., J. Mol. Biol. {\bf 215}, 403 (1990).

1655:

1656: \bibitem{SK} D. Sankoff and J. Kruskal, {\em Time Warps, String Edits, and Macromolecules:

1657: The theory and practice of sequence comparison} (Addison Wesley, Reading, Massachusetts,

1658: 1983).

1659:

1660: \bibitem{AG} A. Apostolico and C. Guerra, Algorithmica {\bf 2}, 315 (1987).

1661:

1662: \bibitem{WF} R. Wagner and M. Fischer, J. Assoc. Comput. Mach. {\bf 21}, 168 (1974).

1663:

1664: \bibitem{CS} V. Chv\'atal and D. Sankoff, J. Appl. Probab. {\bf 12}, 306 (1975).

1665:

1666: \bibitem{Deken} J. Deken, Discrete Math. {\bf 26}, 17 (1979).

1667:

1668: \bibitem{Steele} J.M. Steele, SIAM J. Appl. Math. {\bf 42}, 731 (1982).

1669:

1670: \bibitem{DP} V. Dancik and M. Paterson, in STACS94, Lecture Notes in Computer Science, {\bf

1671: 775}, 306 (Springer, New York, 1994).

1672:

1673: \bibitem{Alex} K.S. Alexander, Ann. Appl. Probab. {\bf 4}, 1074 (1994).

1674:

1675: \bibitem{KLM} M. Kiwi, M. Loebl, and J. Matousek, math.CO/0308234.

1676:

1677: \bibitem{ZM} M. Zhang and T. Marr, J. Theor. Biol. {\bf 174}, 119 (1995).

1678:

1679: \bibitem{Hwa} T. Hwa and M. L\"assig, Phys. Rev. Lett. {\bf 76}, 2591 (1996); R. Bundschuh

1680: and T. Hwa, Discrete Appl. Math. {\bf 104}, 113 (2000).

1681:

1682: \bibitem{Monvel} J. Boutet de Monvel, European Phys. J. B {\bf 7}, 293 (1999); Phys. Rev. E

1683: {\bf 62}, 204 (2000).

1684:

1685: \bibitem{MPV} M. M\'ezard, G. Parisi, and M.A. Virasoro, eds., {\em Spin Glass Theory

1686: and Beyond} (World Scientific, Singapore, 1987).

1687:

1688: \bibitem{KL1} E. Kussell and S. Leibler, Science, {\bf 309}, 2075 (2005).

1689:

1690: \bibitem{MMN} S.N. Majumdar, K. Mallick, and S. Nechaev, in preparation.

1691:

1692: \bibitem{BMat} F. Bonetto and H. Matzinger, arXiv:math.CO/0410404.

1693:

1694:

1695:

1696:

1697:

1698:

1699: \end{thebibliography}

1700: %

1701: %spell_to

1702: \end{document}

1703: %

1704: