1: \documentclass[aps,prl,twocolumn,showpacs,byrevtex]{revtex4}
2: %\documentclass[aps,prl,preprint,showpacs,byrevtex]{revtex4}
3: \usepackage{graphicx}
4: \usepackage{times}
5: \usepackage{amsmath}
6:
7:
8: \newcommand\ie{i.e.,~}
9: \newcommand\eg{e.g.~}
10: \newcommand\erfc{\mbox{erfc}}
11: \newcommand\etal{\emph{et al.}}
12:
13: \begin{document}
14:
15:
16: \title{Density of near-extreme events}
17:
18: \author{Sanjib Sabhapandit}
19:
20: \author{Satya N. Majumdar}
21:
22: \affiliation{Laboratoire de Physique Th\'eorique et Mod\`eles Statistiques
23: (UMR 8626 du CNRS), Universit\'e Paris-Sud, B\^atiment 100, 91405 Orsay
24: Cedex, France}
25:
26: \date{\today}
27:
28: \begin{abstract}
29: We provide a quantitative analysis of the phenomenon of crowding of
30: near-extreme events by computing exactly the density of states (DOS) near
31: the maximum of a set of independent and identically distributed random
32: variables. We show that the mean DOS converges to three different
33: limiting forms depending on whether the tail of the distribution of the
34: random variables decays slower than, faster than, or as a pure
35: exponential function. We argue that some of these results would remain
36: valid even for certain {\em correlated} cases and verify it for power-law
37: correlated stationary Gaussian sequences. Satisfactory agreement is
38: found between the near-maximum crowding in the summer temperature
39: reconstruction data of western Siberia and the theoretical prediction.
40: \end{abstract}
41:
42:
43:
44: \pacs{02.50.-r, 05.40.-a, 05.45.Tp}
45:
46: \maketitle
47:
Extreme value statistics (EVS)~\cite{EVT}, the statistics of the maximum
or the minimum value of a set of random observations, has seen a recent
resurgence of interest due to its applications in diverse fields
51: such as physics~\cite{physics}, engineering~\cite{engineering}, computer
52: science~\cite{KM}, finance~\cite{finance}, hydrology~\cite{hydrology}, and
53: atmospheric sciences~\cite{climate}. In particular, for independent and
54: identically distributed (i.i.d.) observations from a common probability
55: density function (PDF) $p(X)$, the EVS is governed by one of the three well
56: known limit laws~\cite{EVT}, namely, (a)~Fr\'echet, (b)~Gumbel, or
57: (c)~Weibull, depending on whether the tail of $p(X)$ is, (a)~power-law,
58: (b)~faster than any power-law but unbounded, or (c)~bounded, respectively.
59: Recently, these same limiting laws have also been observed in a seemingly
60: different problem concerning the level density of a Bose gas and integer
61: partition problem~\cite{comtet}.
62:
63:
64:
65:
While EVS is very important, an equally important issue concerns the
near-extreme events~\cite{near-maxima}, \ie \emph{how many events occur
with their values near the extreme}? In other words, is the global
maximum (or minimum) value very far from the others (\emph{is it lonely at
the top?}), or are there many other events whose values are close to the
maximum value? This issue of the crowding of near-extreme events arises in
many problems. For instance, in disordered systems, the low-temperature
properties are governed by the spectral density function of the excited
states near the ground state. In the study of weather and climate
extremes, an important question is: \emph{how often do extreme temperature
events such as heat waves and cold waves occur?} While for an insurance
company it is very important to safeguard itself against excessively large
claims, it is equally or maybe more important to guard itself against an
unexpectedly high number of them. In many optimization problems,
finding the exact optimal solution is extremely hard and the only practical
solutions available are the near-optimal ones~\cite{optimization}. In
these situations, prior knowledge about the crowding of the solutions
near the optimal one is highly desirable.
84:
85: In this Letter, we study quantitatively the phenomenon of the crowding of
86: events near the extreme value for i.i.d. random variables, and find rather
87: rich and often universal behavior. In general, the events that occur in
88: nature are correlated. However, when the correlations among them are not
89: very strong, then their EVS converges to that of the i.i.d. random
90: variables~\cite{berman}. This is why the limiting laws of EVS of the i.i.d.
random variables are very useful. Here we consider i.i.d. random variables
in the spirit of the random-energy model~\cite{derrida} for
disordered systems, which, despite the simplifying assumption that the energy
levels are i.i.d. random variables, has been successful in capturing many
qualitative features of complex spin-glass systems. Moreover, we provide
96: an example of a power-law correlated case, where the behavior of
97: near-extreme events converges to that of the i.i.d. random variables. In
98: addition, by comparing the near-maximum crowding in the reconstructed
99: summer temperature data of western Siberia against the prediction from the
100: i.i.d. random variables, we find satisfactory agreement.
101:
102:
We start with a sequence of $N$ i.i.d. random observations $\{X_1, X_2,
\ldots, X_N\}$, drawn from a common PDF $p(X)$. Let $X_{\max}$ be the maximum
of the sequence, \ie $X_{\max}=\max(X_1, X_2, \ldots, X_N)$. A natural
measure of the crowding of events near $X_{\max}$ is the density of states
(DOS) with respect to the maximum
108: \begin{equation}
109: \rho(r,N) =\frac{1}{N} \sum_{\{X_i\neq X_{\max}\}}^{N-1}
110: \delta\left[r-(X_{\max}-X_i) \right],
111: \label{DOS}
112: \end{equation}
where $r$ is measured from the maximum value, and we do not count
$X_{\max}$ itself, so that $\int_0^\infty \rho(r,N)\, dr =1 -1/N$. Clearly,
$\rho(r,N)$ fluctuates from one realization of the random sequence to
another, and one is interested in knowing whether its statistical
properties show any general limiting behavior, in the same sense as one
finds for the EVS. Note that, even though the random variables are
119: independent, the different terms in Eq.~(\ref{DOS}) become correlated
120: through their common maximum $X_{\max}$.
121:
122:
123: We find that the mean DOS $\overline{\rho(r,N)}$ displays rather rich
124: limiting behavior, as $N\rightarrow \infty$. If the tail of the parent
125: distribution $p(X)$ of the random variables decays slower than a pure
126: exponential function, the behavior of $\overline{\rho(r,N)}$ is governed by
the corresponding extreme value distribution. On the other hand, when the
tail of $p(X)$ decays faster than a pure exponential, it is related to the
129: parent distribution itself. In the borderline case when $p(X)$ has a pure
130: exponential tail, $\overline{\rho(r,N)}$ is entirely different.
131:
132:
133:
To find $\overline{\rho(r,N)}$, first consider Eq.~(\ref{DOS}) for a given
value of the maximum, $X_{\max}=x$. The remaining $N-1$
variables are then distributed independently according to the common conditional
PDF $p_{\text{cond}} (X,x)=p(X)/\int_{-\infty}^x p(y)\,dy$ for $X\le x$. Hence the
138: conditional mean DOS, from Eq.~(\ref{DOS}), is
139: $\overline{\rho_{\text{cond}}(r,N,x)} =[(N-1)/N]\,p_{\text{cond}}(x-r,x)$.
140: For a set of $N$ i.i.d. random variables, the PDF of their maximum value
141: $X_{\max}=x$ is
142: \begin{equation}
143: p_{\max}(x,N)= N p(x) \left[\int_{-\infty}^x p(y)\,dy\right]^{N-1}.
144: \label{p_max}
145: \end{equation}
146: Thus, $\overline{\rho(r,N)}=\int_{-\infty}^\infty
147: \overline{\rho_{\text{cond}}(r,N,x)}\, p_{\max}(x,N)\,dx$. Upon substituting
148: the expressions for $\overline{\rho_{\text{cond}}(r,N,x)}$ and
149: $p_{\max}(x,N)$, a little algebra shows that
150: \begin{equation}
151: \overline{\rho(r,N)}=\int_{-\infty}^\infty p(x-r)\, p_{\max}(x,N-1)\, dx.
152: \label{g.1}
153: \end{equation}
154: This is the key result, which is valid for all $N$. We next analyze its
155: limiting behavior for large $N$.
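
The algebra leading to Eq.~(\ref{g.1}) can be sketched as follows, where the
shorthand $F(x)\equiv\int_{-\infty}^x p(y)\,dy$ is introduced here only for
brevity and is not used elsewhere. Multiplying the conditional mean DOS by
Eq.~(\ref{p_max}) gives
\begin{align*}
\overline{\rho_{\text{cond}}(r,N,x)}\, p_{\max}(x,N)
&= \frac{N-1}{N}\, \frac{p(x-r)}{F(x)}\; N\, p(x)\, F(x)^{N-1} \\
&= p(x-r)\, (N-1)\, p(x)\, F(x)^{N-2} \\
&= p(x-r)\, p_{\max}(x,N-1),
\end{align*}
where the last step uses Eq.~(\ref{p_max}) with $N$ replaced by $N-1$. As a
consistency check, $\int_0^\infty p(x-r)\,dr=F(x)$, so that
\begin{equation*}
\int_0^\infty \overline{\rho(r,N)}\,dr
=(N-1)\int_{-\infty}^\infty p(x)\,F(x)^{N-1}\,dx = 1-\frac{1}{N},
\end{equation*}
in agreement with the normalization of Eq.~(\ref{DOS}).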
156:
157:
For i.i.d. random variables, it is known that $p_{\max}(x,N)$ has a limiting
distribution~\cite{EVT}:
160: \begin{equation}
161: b_N\, p_{\max}(x=a_N + b_N z,N)\xrightarrow{N\rightarrow \infty} f(z).
162: \label{limiting maximum distribution}
163: \end{equation}
164: The non-universal scale factors $a_N$ and $b_N$ depend explicitly on the
165: parent distribution $p(X)$ and $N$. However, the scaling function $f(z)$
166: is universal and belongs to (a)~Fr\'echet, (b)~Gumbel, or (c)~Weibull,
167: depending only on the tail of $p(X)$. For example, if $p(X) \sim
168: \exp(-X^\delta)$ for large $X$, then $a_N\sim (\ln N)^{1/\delta}$ and
169: $b_N\sim \delta^{-1} (\ln N)^{1/\delta-1}$ for large $N$, and the scaling
170: function is the universal Gumbel PDF $f(z)=\exp\left[-z-\exp(-z)\right]$.
171: Note that, as $N\rightarrow\infty$, for $\delta<1$, $b_N\rightarrow\infty$,
172: whereas $b_N\rightarrow 0$ for $\delta>1$. In fact, this large $N$
173: behavior of $b_N$ is not restricted to only this specific tail of $p(X)$,
174: but is more generic: for any slower than $\exp(-X)$ tail of $p(X)$, as $N$
175: increases $b_N$ also increases, whereas for any faster than $\exp(-X)$
176: tail, $b_N$ decreases as $N$ increases. This is indeed responsible for the
177: generic limiting behavior of $\overline{\rho(r,N)}$.
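
As an illustration of how these scale factors arise, consider the following
sketch, which assumes the explicit form $p(X)=\delta X^{\delta-1}
\exp(-X^\delta)$ for $X>0$ rather than only its tail. The cumulative
distribution is then $1-\exp(-x^\delta)$, and for large $N$
\begin{equation*}
\left[\int_0^x p(y)\,dy\right]^N = \left[1-e^{-x^\delta}\right]^N
\approx \exp\left(-N e^{-x^\delta}\right).
\end{equation*}
Writing $x=a_N+b_N z$ with $a_N=(\ln N)^{1/\delta}$ and expanding to first
order in $b_N z/a_N$ gives $x^\delta\approx \ln N +\delta\, a_N^{\delta-1}
b_N z$, which reduces to $\ln N+z$ for the choice $b_N=\delta^{-1} (\ln
N)^{1/\delta-1}$. The cumulative distribution of the maximum therefore tends
to $\exp[-\exp(-z)]$, whose derivative with respect to $z$ is the Gumbel PDF
$\exp[-z-\exp(-z)]$.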
178:
179: When $p(X)$ has a \emph{slower than exponential} tail, so that $b_N
180: \rightarrow \infty$ as $N\rightarrow\infty$, it is useful to make a change
181: of variable $x=a_N+b_N z$ in Eq.~(\ref{g.1}). Then one immediately realizes
182: that $p(b_N z +a_N-r)$ is highly localized, in the limit $N\rightarrow
183: \infty$, compared to $f(z)$, ---\ie $b_N p(b_N z +a_N-r) \rightarrow
184: \delta(z-[r-a_N]/b_N)$. Therefore, in the scaling region of order $b_N$,
185: around $r=a_N$
186: \begin{equation}
187: \overline{\rho(r,N)}\xrightarrow{N\rightarrow\infty}
188: \frac{1}{b_N} f\left(\frac{r-a_N}{b_N} \right).
189: \label{dos slower than exp}
190: \end{equation}
191:
192:
193: On the other hand, if the tail of $p(X)$ is \emph{faster than exponential},
194: so that $b_N \rightarrow 0$ as $N\rightarrow\infty$, the PDF of the maximum
195: becomes highly localized near $x=a_N$, ---\ie $p_{\max}(x,N)\rightarrow
196: \delta(x-a_N)$. Therefore, Eq.~(\ref{g.1}) yields
197: \begin{equation}
198: \overline{\rho(r,N)}\xrightarrow{N\rightarrow\infty} p(a_N-r).
199: \label{dos faster than exp}
200: \end{equation}
201:
202:
203:
204:
205: In EVS, the convergence towards the limiting distribution is usually very
206: slow~\cite{slow-convergence}. Therefore, it is instructive to check how
$\overline{\rho(r,N)}$ approaches the limiting form for large $N$. For this
purpose, we now consider explicit forms of $p(X)$ for which
$\overline{\rho(r,N)}$ can be computed to high accuracy for any given $N$
by numerically integrating Eq.~(\ref{g.1}), and for which explicit
expressions for $a_N$ and $b_N$ as functions of $N$ can be obtained. The mean number of
212: events close to the maximum, for a finite but large sample of size $N$, is
213: proportional to $\overline{\rho(0,N)}$. In certain cases, $r=0$ is part of
214: the scaling function and $\overline{\rho(0,N)}$ can be obtained from the
215: scaling form of $\overline{\rho(r,N)}$ by putting $r=0$. However, sometimes
216: $r=0$ is not part of the scaling regime and $\overline{\rho(0,N)}$ has to
217: be computed separately from Eq.~(\ref{g.1}). For simplicity, we consider
218: only positive random variables.
219:
220:
221:
222: %%%%%%%%%%%%%%
223: \begin{figure*}
224: \includegraphics[width=7in]{dosfigs.ps}
225: \caption{\label{dosfigs} (Color online). {\bf A}: $\overline{\rho(r,N)}$
for $N=10^2$ (blue), $10^3$ (red) and $10^4$ (green), for the power-law
227: distribution $p(X)=\alpha \exp(-X^{-\alpha}) X^{-(1+\alpha)}$, with
228: $\alpha=2$. The dashed (black) line plots the Fr\'echet distribution
229: $f_1(r/b_N)$. {\bf B}: $\overline{\rho(r,N)}$ for exponential decay
230: $p(X)=\delta X^{\delta-1} \exp(-X^\delta)$. (a) For $\delta=1/2$, with
231: $N=10^3$ (blue), $10^5$ (red) and $10^7$ (green). The dashed (black) line
232: plots the Gumbel distribution $f_2([r-a_N]/b_N)$. (b) For $\delta=2$,
233: with $N=10^3$ (blue), $10^6$ (red) and $10^9$ (green). The dashed
234: (black) line plots $p(a_N-r)$. {\bf C}: $\overline{\rho(r,N)}$ for
235: bounded distribution, $p(X)= \beta a^{-\beta} (a-X)^{\beta-1}$ for $X<a$
236: and $p(X)=0$ for $X\ge a$, where $a=10$. (a) For $\beta=3/2$, with
237: $N=10^2$ (blue), $10^3$ (red) and $10^4$ (green). (b) For $\beta=1/2$,
238: with $N=10$ (blue), $10^2$ (red) and $10^3$ (green). The dashed (black)
239: lines plot $p(a-r)$.}
240: \end{figure*}
241: %%%%%%%%%%%%%%
242:
243:
244: {\bf A}.~{\slshape\bfseries Power-law tail.}--- Consider $p(X)=\frac{\alpha
245: \exp(-X^{-\alpha})}{ X^{1+\alpha}}$,
246: %$p(X)=\alpha \exp(-X^{-\alpha}) X^{-(1+\alpha)}$,
where $\alpha>0$. In this case, $a_N=0$ and $b_N=N^{1/\alpha}$. Therefore,
the limiting form of $\overline{\rho(r,N)}$ is given by
Eq.~(\ref{dos slower than exp}), with $f(z)$ belonging to the Fr\'echet class:
250: \begin{equation}
251: f(z)\equiv f_1(z) =
252: \frac{\alpha\exp\left[-z^{-\alpha}\right]}{z^{1+\alpha}}, \quad z\ge 0.
253: \label{frechet}
254: \end{equation}
Figure~\ref{dosfigs}A compares this limiting form with the results obtained
by numerically integrating Eq.~(\ref{g.1}). Here,
$r=0$ lies outside the scaling regime. Thus, $\overline{\rho(0,N)}$ is
obtained directly from Eq.~(\ref{g.1}),
259: \begin{equation}
260: \overline{\rho(0,N)}\xrightarrow{N\rightarrow\infty} \frac{\alpha
261: \Gamma(2+1/\alpha)}{N^{1+1/\alpha}}.
262: \label{0-power}
263: \end{equation}
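
A sketch of how Eq.~(\ref{0-power}) follows from Eq.~(\ref{g.1}): for this
$p(X)$ the cumulative distribution is $\exp(-x^{-\alpha})$. With the
substitution $u=x^{-\alpha}$ (an auxiliary variable introduced only for this
check), one has $p(x)=\alpha\, u^{1+1/\alpha} e^{-u}$ and $p(x)\,dx=-e^{-u}
du$, so that
\begin{align*}
\overline{\rho(0,N)} &= (N-1)\int_0^\infty p(x)^2
\left[e^{-x^{-\alpha}}\right]^{N-2} dx \\
&= \alpha (N-1) \int_0^\infty u^{1+1/\alpha}\, e^{-N u}\, du \\
&= \frac{\alpha\,(N-1)\,\Gamma(2+1/\alpha)}{N^{2+1/\alpha}},
\end{align*}
which holds for any $N\ge 2$ and reduces to Eq.~(\ref{0-power}) for large $N$.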
264:
265:
266: {\bf B}.~{\slshape\bfseries Faster than power-law, but unbounded tail.}---
267: Consider $p(X)=\delta X^{\delta-1} \exp(-X^\delta)$, where $\delta>0$. In
this case $a_N=(\ln N)^{1/\delta}$ and $b_N= \delta^{-1}(\ln
N)^{1/\delta-1}$. For very large and very small $r$, the large-$N$ form of
the mean DOS is the same for all $\delta$, \ie $\overline{\rho(r,N)}\sim N
p(r)$ for $r\gg a_N$, and $\overline{\rho(r,N)}\approx p(a_N-r)$ for $r\ll
a_N$. Thus, at $r=0$
273: \begin{equation}
274: \overline{\rho(0,N)}\xrightarrow{N\rightarrow\infty} p(a_N)=
275: \frac{\delta}{N} (\ln N)^{1-1/\delta},
276: \label{0-exponential}
277: \end{equation}
278: for all $\delta$. However, the scaling behaviors of $\overline{\rho(r,N)}$
279: are very different for the three cases: $\delta<1$, $\delta=1$, and
280: $\delta>1$.
281:
{\bfseries Case I: $\delta<1$}. As $N\rightarrow\infty$,
$b_N\rightarrow\infty$. Therefore, in the scaling regime around $r=a_N$
(whose width grows with $N$, since $b_N$ increases),
the limiting $\overline{\rho(r,N)}$ is again given by
Eq.~(\ref{dos slower than exp}), but now $f(z)$ belongs to the Gumbel
class:
288: \begin{equation}
289: f(z)\equiv f_2(z)=\exp\left[-z-\exp(-z)\right].
290: \label{gumbel}
291: \end{equation}
Figure~\ref{dosfigs}B~(a) compares the limiting form with the results
obtained from Eq.~(\ref{g.1}) by numerical integration.
294:
{\bfseries Case II: $\delta=1$}. In this borderline case $b_N=1$, and
neither of the limiting forms, Eq.~(\ref{dos slower than exp}) or
Eq.~(\ref{dos faster than exp}), is reached in the large $N$ limit.
Instead, we find a completely different behavior:
$\overline{\rho(r,N)}\xrightarrow{N\rightarrow\infty} g(r-a_N)$, where the scaling function
300: \begin{equation}
301: g(z)=e^z \left[1-
302: \left(1+e^{-z}\right) e^{-e^{-z}} \right].
303: % \left(1+e^{-z}\right) \exp\left(-e^{-z}\right) \right].
304: \label{dos exp}
305: \end{equation}
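
The scaling function in Eq.~(\ref{dos exp}) can be checked with the following
sketch, which assumes $p(X)=e^{-X}$ for $X>0$, so that $a_N=\ln N$. Equation
(\ref{g.1}) with the substitution $t=e^{-x}$ gives, for any $N\ge 2$,
\begin{equation*}
\overline{\rho(r,N)} = (N-1)\, e^{r} \int_0^{e^{-r}} t\,(1-t)^{N-2}\, dt .
\end{equation*}
Setting $t=s/N$ and $z=r-\ln N$, and using $(1-s/N)^{N-2}\rightarrow e^{-s}$
for large $N$, one finds
\begin{equation*}
\overline{\rho(r,N)} \rightarrow e^{z}\int_0^{e^{-z}} s\, e^{-s}\, ds
= e^{z}\left[1-\left(1+e^{-z}\right) e^{-e^{-z}}\right].
\end{equation*}
The two tails agree with the general forms quoted above for this case: for
$z\rightarrow -\infty$, $g(z)\approx e^{z}=p(a_N-r)$, while for $z\rightarrow
\infty$, $g(z)\approx e^{-z}/2 = N p(r)/2$.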
306:
307:
{\bfseries Case III: $\delta >1$}. As $N\rightarrow\infty$, $b_N\rightarrow
0$. Thus, $\overline{\rho(r,N)}$ now converges to the other form, given by
Eq.~(\ref{dos faster than exp}), which is compared in
Fig.~\ref{dosfigs}B~(b) with the results obtained by numerically
integrating Eq.~(\ref{g.1}).
313:
314:
315:
316:
317:
318: {\bf C}.~{\slshape\bfseries Bounded tail.}--- Consider $p(X)= \beta
319: a^{-\beta} (a-X)^{\beta-1}$ for $0<X<a$, where $\beta >0$, and $p(X)=0$
otherwise. In this case, $a_N=a$ and $b_N=a N^{-1/\beta}$. Since
$b_N\rightarrow 0$, $\overline{\rho(r,N)}$ again converges to the form given by
Eq.~(\ref{dos faster than exp}). The comparison with Eq.~(\ref{g.1}) is
illustrated in Fig.~\ref{dosfigs}C. Again, the $N$ dependence of
$\overline{\rho(0,N)}$ for large $N$ does not follow from the limiting
$\overline{\rho(r,N)}$. It is obtained directly from Eq.~(\ref{g.1}),
326: \begin{equation}
327: \overline{\rho(0,N)}\xrightarrow{N\rightarrow\infty} \frac{(\beta/a)
328: \Gamma(2-1/\beta)}{N^{1-1/\beta}}, ~\text{for} ~\beta>1/2.
329: \label{0-bounded}
330: \end{equation}
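
A sketch of the calculation behind Eq.~(\ref{0-bounded}): here the cumulative
distribution is $1-(1-x/a)^\beta$. With the auxiliary variable
$u=(1-x/a)^\beta$, introduced only for this check, one has $p(x)=(\beta/a)\,
u^{1-1/\beta}$ and $p(x)\,dx=-du$, so that Eq.~(\ref{g.1}) at $r=0$ becomes
\begin{align*}
\overline{\rho(0,N)} &= (N-1)\,\frac{\beta}{a} \int_0^1 u^{1-1/\beta}\,
(1-u)^{N-2}\, du \\
&= (N-1)\,\frac{\beta}{a}\,
\frac{\Gamma(2-1/\beta)\,\Gamma(N-1)}{\Gamma(N+1-1/\beta)} .
\end{align*}
The integral converges at $u\rightarrow 0$ only for $\beta>1/2$. Using
$\Gamma(N-1)/\Gamma(N+1-1/\beta)\approx N^{-(2-1/\beta)}$ for large $N$ then
gives Eq.~(\ref{0-bounded}).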
331:
332:
333:
To summarize the explicit results: When the tail of $p(X)$ is either
power-law or bounded, the convergence of $\overline{\rho(r,N)}$ to the
respective limits given by Eqs.~(\ref{dos slower than exp}) and
(\ref{dos faster than exp}) is fast, as can be seen from Figs.~\ref{dosfigs}A and
\ref{dosfigs}C, respectively. However, in the intermediate situation, \ie
when $p(X)$ decays faster than a power law but is not bounded, the
convergence is slow, as can be seen from Figs.~\ref{dosfigs}B~(a) and
\ref{dosfigs}B~(b). In other words, the more $p(X)$ deviates from
$\exp(-X)$ in either direction (slower or faster), the more quickly
$\overline{\rho(r,N)}$ converges (with increasing $N$) to its limiting form. As $N$
increases, the mean number of events close to the maximum, which is
proportional to $\overline{\rho(0,N)}$, decreases faster for $p(X)$ with a
broader tail [cf. Eqs. (\ref{0-power}), (\ref{0-exponential}) and
(\ref{0-bounded})]. This is also evident from the small $r$ behavior of
$\overline{\rho(r,N)}$ in the scaling regime, \ie to the left of the peak
in Figs. \ref{dosfigs}A and \ref{dosfigs}B~(a): For $p(X)$ with a
power-law tail, $\overline{\rho(r,N)}$ has an essential singularity
$\exp(-N/r^\alpha)$ for small $r$ [cf. Eq.~(\ref{frechet})], and for a
stretched-exponential tail ({\bf B} with $\delta<1$), as $r$ decreases from
$a_N$ in the scaling regime, $\overline{\rho(r,N)}$ decreases
super-exponentially as $\exp(-\exp([a_N-r]/b_N))$ [cf. Eq.~(\ref{gumbel})]. In
contrast, for $p(X)$ with a faster than $\exp(-X)$ tail, there is
crowding near the maximum value ($r=0$) [Figs. \ref{dosfigs}B~(b) and
\ref{dosfigs}C].
358:
359:
360:
361: Another measure of the loneliness of the maximum is the gap between the
maximum and the next highest value. Let $Q(\epsilon|N)$ be the PDF of this
gap $\epsilon$. Clearly
364: \begin{equation}
365: Q(\epsilon|N)= N \int_{-\infty}^\infty p(z+\epsilon)\, p_{\max}(z,N-1)\, dz.
366: \end{equation}
367: In particular, when $p(X)\sim \exp(-X^\delta)$ for large $X$, we find the
368: limiting form
369: \begin{equation}
370: Q(\epsilon|N) \xrightarrow{N\rightarrow\infty}
371: \frac{1}{b_N} \exp(-\epsilon/b_N).
372: \end{equation}
Thus, the typical gap is of the order $b_N$, which increases (decreases) as
$N$ increases for $\delta<1$ ($\delta>1$), consistent with the results
obtained from the study of the mean DOS.
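
As a simple check of the gap distribution, consider the borderline case
$p(X)=e^{-X}$ for $X>0$ (an illustrative choice, for which $b_N=1$). The
expression for $Q(\epsilon|N)$ in terms of $p_{\max}(z,N-1)$, with the
substitution $t=e^{-z}$, gives
\begin{align*}
Q(\epsilon|N) &= e^{-\epsilon}\, N (N-1) \int_0^\infty e^{-2z}
\left(1-e^{-z}\right)^{N-2} dz \\
&= e^{-\epsilon}\, N (N-1) \int_0^1 t\,(1-t)^{N-2}\, dt = e^{-\epsilon},
\end{align*}
since the last integral equals $1/[N(N-1)]$. In this case the gap is thus
exponentially distributed with unit mean for every $N\ge 2$, in agreement
with the limiting form above with $b_N=1$.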
376:
377:
378:
379:
380:
381:
382: So far, we have considered the case of i.i.d. random variables. \emph{What
383: would happen if the random variables are correlated?} For short-ranged
384: correlation, one expects the results from i.i.d. random variables to hold.
385: However, for a stationary Gaussian sequence (SGS), this holds even for
386: long-range (\eg power-law) correlation. More precisely, for SGS a rigorous
387: theorem~\cite{berman} states: if the correlator $C(n)\equiv\overline{X_i
388: X_{i+n}}$ satisfies either $\lim_{n\rightarrow\infty} C(n) \ln n =0$ or
389: $\sum_{n=1}^\infty C^2(n) <\infty$, then the limiting distribution of the
maximum [cf. Eq.~(\ref{limiting maximum distribution})] is Gumbel [cf.
Eq.~(\ref{gumbel})], and $a_N$ and $b_N$ are the same as those in the case of
independent Gaussian random variables. Based on this theorem, one
393: therefore predicts that $\overline{\rho(r,N)}$ for large $N$, should be
394: independent of the correlation function $C(n)$ and hence would be the same
as that of Gaussian i.i.d. random variables. We have indeed verified this
prediction for SGSs with a power-law correlation
$C(n)=(1+n^2)^{-\gamma/2}$, generated by numerical simulation.
We compute $\overline{\rho(r,N)}$ from these sequences for three different
values of $N$ and, for each $N$, two different values of $\gamma$, and
compare with the one obtained by numerically integrating Eq.~(\ref{g.1})
for the same $N$ using $p(X)=\exp(-X^2/2)/\sqrt{2\pi}$; this is shown in
Fig.~\ref{gsp}. While for smaller $N$ [cf. Fig.~\ref{gsp}~(a)] they
differ, for larger $N$ [cf. Fig.~\ref{gsp}~(c)] the difference becomes
unnoticeable.
405:
406: %%%%%%%%%%%%%%%
407: \begin{figure}
408: \includegraphics[width=3.375in]{gsp.ps}
409: \caption{\label{gsp} (Color online). $\overline{\rho(r,N)}$ for stationary
410: Gaussian random sequence with a correlator $C(n)=(1+n^2)^{-\gamma/2}$,
411: where $\gamma=0.5$ (blue) and $\gamma=1$ (red) obtained from numerical
412: simulation, and for Gaussian i.i.d. random variables (black dashed)
413: obtained by numerical integration of Eq.~(\ref{g.1}). The three sets of
curves (a), (b) and (c) correspond to three different values of $N$.}
415: \end{figure}
416: %%%%%%%%%%%%%%
417:
418: %%%%%%%%%%%%%%
419: \begin{figure}[b]
420: \includegraphics[width=3.375in]{yamal.ps}
421: \caption{\label{yamal} (a) Yamal peninsula June-July mean temperature
422: anomaly ($\Delta T$) reconstruction series~\cite{yamal_data}. (b) The
histogram plots the distribution of $\Delta T$ for the data shown in (a). The
solid line represents $p(\Delta T)=\exp(-\Delta T^2/2)/\sqrt{2\pi}$. In
(c) and (d), the histograms plot the mean DOS relative to the maximum
(excluding the maximum), computed by dividing the data into blocks, with
each block consisting of $N$ years. Solid lines are calculated by
numerical integration of Eq.~(\ref{g.1}). The dashed lines represent
429: $p(a_N-r)$, where $a_N=(2\ln N)^{1/2} - (2\ln N)^{-1/2} (\ln\ln N +\ln
430: 4\pi)/2$.}
431: \end{figure}
432: %%%%%%%%%%%%%%
433:
\emph{How well do the mathematical results describe real data?} We check
this last, by comparing against the reconstructed
Yamal multimillennial summer temperature data of Hantemirov and
Shiyatov~\cite{yamal_data}. The reconstructed data set consists of yearly
mean summer temperature anomalies ($\Delta T$) of the Yamal Peninsula of
western Siberia, relative to the mean of the full reconstructed series for
4000 years (2000 BC to AD 1996), which is shown in Fig.~\ref{yamal}~(a).
We divide the full time series into blocks of $N$ years, and for each
block: (I) find the maximum value of $\Delta T$, and then (II) with respect
to this maximum, compute $\rho(r,N)$ using Eq.~(\ref{DOS}). Finally, we
find $\overline{\rho(r,N)}$ by averaging over all the blocks. The
histograms in Fig.~\ref{yamal}~(c) and (d) illustrate
$\overline{\rho(r,N)}$, computed by dividing the full series into
$40$ blocks with $100$ years of data in each block, and $4$ blocks with
$1000$ years of data in each block, respectively. To compare with our
results, we first compute the distribution of $\Delta T$ from the full time
series, which is illustrated in Fig.~\ref{yamal}~(b) by the histogram, along
with the solid line given by the Gaussian distribution. In
Fig.~\ref{yamal}~(c) and (d), the solid lines are computed from
Eq.~(\ref{g.1}) using the Gaussian distribution, by numerical
integration, with $N=100$ and $N=1000$, respectively. The dashed lines
correspond to the limiting form $p(a_N-r)$, obtained in
Eq.~(\ref{dos faster than exp}) for large $N$. The agreement between them
(dashed and solid lines) is satisfactory.
458:
459:
460:
461:
462:
463: % In summary, we have considered the DOS relative to the maximum of a set of
464: % $N$ i.i.d. random variables. We have shown that, the mean DOS converges
465: % different limits as $N\rightarrow\infty$, depending on whether the tail of
466: % the parent distribution of the random variables decay slower or faster than
467: % a pure exponential function. We have compared our results against Yamal
468: % summer temperature data, and found satisfactory agreement.
469:
470:
471:
472:
473:
474: We acknowledge the support of the Indo-French Centre for the Promotion of
475: Advanced Research under Project 3404-2.
476:
477:
478:
479: \begin{thebibliography}{10}
480:
\bibitem{EVT} R.A.~Fisher and L.H.C.~Tippett, Proc. Cambridge Philos. Soc.
482: {\bf 24}, 180 (1928); %
483: E.J.~Gumbel, \emph{Statistics of Extremes} (Columbia University Press, NY,
484: 1958); %
485: J.~Galambos, \emph{The Asymptotic Theory of Extreme Order Statistics}
486: (John Wiley \& Sons, NY, 1978).
487:
\bibitem{physics} J.-P.~Bouchaud and M.~M\'ezard, J. Phys. A {\bf 30}, 7997
489: (1997); %
490: D.S.~Dean and S.N.~Majumdar, Phys. Rev. E {\bf 64}, 046121 (2001); %
491: G.~Gy\"orgyi, P.C.W.~Holdsworth, B.~Portelli, and Z.~R\'acz, Phys. Rev. E
492: {\bf 68}, 056116 (2003); %
493: S.N.~Majumdar and P.L.~Krapivsky, %Phys. Rev. E {\bf 62}, 7735 (2000); %
494: Physica A {\bf 318}, 161 (2003); %
495: J.F.~Eichner, J.W.~Kantelhardt, A.~Bunde, and S.~Havlin, Phys. Rev. E {\bf
496: 73}, 016130 (2006); %
E.~Bertin and M.~Clusel, J. Phys. A {\bf 39}, 7607 (2006).
498:
499:
500:
501: \bibitem{engineering} A.N.~Norris, J. Mech. Materials Struct. {\bf 1}, 793
502: (2006); %
503: A.~Cazzani and M.~Rovati, Int. J. Solids Struct. {\bf 42}, 5057
504: (2005); %
M.~Hayes and A.~Shuvalov, J. Appl. Mech. {\bf 65}, 786 (1998).
506:
507:
508:
509: \bibitem{KM} S.N.~Majumdar and P.L.~Krapivsky, Phys. Rev. E {\bf 65},
510: 036127 (2002).
511:
512:
513: \bibitem{finance} P.~Embrechts, C.~Kl\"uppelberg, and T.~Mikosch,
514: \emph{Modelling Extremal Events for Insurance and Finance} (Springer,
515: Berlin, 1997).
516:
517: \bibitem{hydrology} R.W.~Katz, M.B.~Parlange, and P.~Naveau, Advances in
518: Water Resources {\bf 25}, 1287 (2002).
519:
520: \bibitem{climate} D.R.~Easterling \etal, Science {\bf 289}, 2068 (2000); %
521: S.~Redner and M.R.~Petersen, Phys. Rev. E {\bf 74}, 061114 (2006).
522:
523:
524: \bibitem{comtet} A.~Comtet, P.~Leboeuf, and S.N.~Majumdar, Phys. Rev. Lett.
525: {\bf 98}, 070404 (2007).
526:
527: \bibitem{near-maxima} A.G.~Pakes and F.W.~Steutel, Aust. J. Stat. {\bf 39},
528: 179 (1997); %
529: A.G.~Pakes and Y.~Li, Stat. Probab. Lett. {\bf 40}, 395 (1998).
530:
531:
532: \bibitem{optimization} D.S.~Dean, D.~Lancaster, and S.N.~Majumdar, Phys.
533: Rev. E {\bf 72}, 026125 (2005).
534:
535: \bibitem{berman} S.M.~Berman, Ann. Math. Stat. {\bf 35}, 502 (1964); %
536: J.~Pickands, Trans. Am. Math. Soc. {\bf 145}, 75 (1969).
537:
538:
539: \bibitem{derrida} B.~Derrida, Phys. Rev. Lett. {\bf 45}, 79 (1980).
540:
541: \bibitem{slow-convergence} G.~Gy\"orgyi, N.R.~Moloney, K.~Ozog\'any, and
Z.~R\'acz, Phys. Rev. E {\bf 75}, 021123 (2007).
543:
544:
545:
\bibitem{yamal_data} R.M.~Hantemirov and S.G.~Shiyatov, Holocene {\bf 12},
547: 717 (2002). Data obtained from IGBP PAGES/WDC for Paleoclimatology,
548: http://www.ncdc.noaa.gov/paleo/pubs/hantemirov2002/
549:
550:
551: \end{thebibliography}
552:
553: \end{document}
554: