0704:0704.2539/f2.tex

1: \documentclass[pre,epsfig,aps,twocolumn]{revtex4}

2:

3: \usepackage{epsfig}

4: \usepackage{fancyheadings}

5: \usepackage{amstext,amsmath,amssymb}

6:

7: \begin{document}

8:

9: \sloppy

10: \pagestyle{empty}

11:

12: \title{Reconstructing a Random Potential from its Random Walks}

13: \author{S. Cocco $^1$, R. Monasson $^2$}

14: \affiliation{$^1$

15: CNRS-Laboratoire de Physique Statistique de l'ENS, 24 rue Lhomond,

16: 75005 Paris, France\\

17: $^2$ CNRS-Laboratoire de Physique Th\'eorique de l'ENS, 24 rue Lhomond,

18: 75005 Paris, France}

19:

20: \begin{abstract}

21: The problem of how many trajectories of a random walker in a potential

22: are needed to reconstruct the values of this potential is

23: studied. We show that this problem can be solved by calculating

24: the probability of survival of an abstract random walker in a

25: partially absorbing potential. The approach is illustrated on the

26: discrete Sinai (random force) model with a drift.

27: We determine the values of the parameters (temperature, duration of

28: each trajectory, ...) that make reconstruction as fast as

29: possible.

30: \end{abstract}

31:

32: \maketitle

33:

34: {\em Introduction.}

35: Random walks (RW) in random media have been intensively studied in the

36: past decades as a paradigm for out-of-equilibrium dynamics, and have

37: led to the discovery and understanding of

38: important dynamical effects such as anomalous diffusion and ageing

39: \cite{discreteRF,revue}. Briefly speaking, the issue is to determine the

40: statistical properties of the walker from the ones of the energy

41: potential. Much less attention has been devoted to the inverse

42: problem: given one (or more) observed RW(s) can we guess the potential

43: values?

44: This question naturally arises in biophysics where the use of

45: AFM, optical and magnetic tweezers makes possible the

46: mechanical separation of single protein-protein complexes \cite{evans},

47: or the unfolding and refolding of single biomolecules\cite{fer04,Ess97,woo06}.

48: The observed dynamics of the rupture of chemical bonds,

49: or of the folding/unfolding of nucleic acids

50: or proteins, can be modeled as a RW affected by thermal

51: noise, moving in a quenched potential determined by

52: the composition of the chemical bonds, or the sequence of

53: amino or nucleic acids. Reconstructing the free

54: energy landscape of those processes is the object of current and

55: intense efforts  \cite{hye03,evans,mb,ritort,woo06}.

56:

57: In this letter we show how the inverse RW problem

58: can be practically solved within the Bayesian inference framework and

59: address the crucial question of the accuracy of reconstruction.

60: In practice information can be accumulated either by increasing the

61: duration of one RW, or observing more than one RW, or combining the two.

62: We discuss the optimal procedure minimizing the total number of

63: data to be acquired, and show how this minimal amount of data can be

64: calculated from

65: the probability of survival of an abstract  walker in a partially

66: absorbing potential. The approach is illustrated in detail on the celebrated

67: discrete random force (RF) model (Sinai model with non zero drift)

68: \cite{revue,discreteRF}.

69:

70: %{\em Related works.}

71: Inference is a key issue in information theory and

72: statistics \cite{bayes}, with applications in biology \cite{domany},

73: social science \cite{And57}, finance,  ... A central question is the

74: so-called hypothesis testing problem: which one of two candidate

75: distributions is

76: likely to have generated a set of measured data?  This question was

77: solved in the case of independent variables by Chernoff \cite{Che52},

78: and is the core issue of the asymptotic theory of inference

79: \cite{bayes}. Chernoff showed that the probability of guessing

80: the wrong distribution decreases exponentially with the size of the data

81: set \cite{Che52}. Large deviations techniques can be used to treat

82: the case of variables extracted from one recurrent

83: realization of a finite Markov chain \cite{Boz71,Dem98}; the present

84: work can be seen as an extension to many transient realizations of

85: an `infinite' chain.

86:

87:

88: {\em Random Force model.} For an illustration of the problem

89: consider the discrete, one dimensional RF model defined on the set

90: of sites $x=0,1,2,\ldots ,N$ \cite{discreteRF}. We start by choosing

91: randomly a set of dimensionless forces $f_{x}=\pm 1$ on each link

92: ($x,x+1$) with {\em a priori}

93: probability $P_0 =\prod_x \frac{1+ b\,f_x}2$ where $-1<b<1$ is

94: called tilt. This defines the values of the potential ${\bf V}$ on each site,

95: $V_x = - \sum_{y<x} f_y$ (by definition $V_0=0$).

96: An example of potential for $b=0.4$ is shown on Fig.~\ref{fig-pot}.

97:

98: After the quenched potential has been drawn

99: a random walker starts in $x=0$ at time $t=0$. The walker then

100: jumps from one site $x$ to one of its neighbors $x'=x\pm 1$ with

101: rate (probability per unit of time) $r_{\bf V}(x \to

102: x')=r_0\times e^{(V_x-V_{x'})/(2T)}$ to satisfy detailed balance

103: at temperature $T$; the attempt rate $r_0$ will be set to unity in the

104: following. Reflecting boundary conditions are imposed by

105: setting $V_{N+1}=V_{-1}=+\infty$. We register the sequence of

106: positions up to some time $t_f$: ${\bf X}=\{ x(t), 0\le

107: t\le t_f\}$. Figure~\ref{fig-pot} shows five RWs ${\bf X}_\rho$,

108: $\rho=1,\ldots,5$ , each starting in the origin $x(0)=0$ and of equal

109: duration $t_f$ for a temperature $T=1$. The value of the temperature

110: strongly affects the dynamics \cite{revue}, and its relevance for

111: the inverse problem will be discussed later.

112:

113: Our objective is to reconstruct the potential over a region of the

114: lattice e.g. the value of the forces on some specific links

115: from the observation of RWs.

116: Within the Bayesian inference framework this can be done by maximizing

117: the joint probability $P$ of the potential ${\bf V}$

118: and of the observed RWs ${\bf X}_1,\ldots, {\bf X}_R$

119:  over ${\bf V}$ \cite{bayes}.

120: $P$ is the product of the {\em a priori} probability of the potential,

121: $P_0$, times the likelihood of

122: the RWs given the potential, $L$. Since the RW is Markovian $L$ depends

123: only on the sets of total times $t_x$ spent on every site $x$,

124: and of the numbers of jumps $u (x\to x')$ from

125: $x$ to $x'$ over the set of RWs:

126: \begin{equation} \label{like}

127: L =  \prod_{x,x'} e^{-t_x \;r_{\bf V}(x\to x')}

128: \;r_{\bf V}(x\to x') ^{u(x\to x')}

129: \end{equation}

130: where the product runs over all sites $x$ and their neighbors $x'=x\pm 1$.

131: Expressing the rates in terms of the forces and maximizing the joint

132: probability $P$ we obtain  the most likely values for the forces:

133: $f_x=\mbox{sign}(h_x + \alpha)$ where $\alpha\equiv T\, \ln[

134:   (1+b)/(1-b)]$ is a global `field'

135: coming from the {\em a priori} distribution $P_0$ and $h_x$ a local

136: contribution due to the likelihood $L$,

137: \begin{equation} \label{field}

138: h_x = 2T\sinh \big(\frac 1{2T}\big) \; ( t_{x+1} -t_x)  + u (x\to x+1)

139:  - u (x+1\to x) \ .

140: \end{equation}
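As a short verification of (\ref{field}), added here as a reading aid and using only the definitions above: the rates on the link $(x,x+1)$ are $r_{\bf V}(x\to x+1)=e^{f_x/(2T)}$ and $r_{\bf V}(x+1\to x)=e^{-f_x/(2T)}$, and $e^{\pm f_x/(2T)}=\cosh\frac 1{2T}\pm f_x\sinh\frac 1{2T}$ for $f_x=\pm 1$. Collecting the terms of $\ln P=\ln (P_0\, L)$ that depend on $f_x$ gives
\begin{eqnarray*}
\ln P &=& f_x \Big[ \sinh\big(\frac 1{2T}\big)\,(t_{x+1}-t_x)
+ \frac 12 \ln\frac{1+b}{1-b} \\
&& \quad + \frac{u(x\to x+1)-u(x+1\to x)}{2T}\Big] + \ldots \\
&=& \frac{f_x}{2T}\,(h_x+\alpha)+\ldots
\end{eqnarray*}
where the dots stand for terms independent of $f_x$. The maximum over $f_x=\pm 1$ is therefore reached for $f_x=\mbox{sign}(h_x+\alpha)$. For instance, $b=0.4$ and $T=1$ give $\alpha=\ln (1.4/0.6)\simeq 0.85$.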

141: Figure~\ref{fig-pot} (left, bottom) shows predictions made from

142: $R=1$ to $R=5$  RWs for the first 200 sites.

143: The duration $t_f$ of the RW is chosen

144: to be much larger than the mean first passage time

145: in $x=200$, and much smaller than the equilibration time

146: $t_{eq}\sim e^{bN/T}$. In this range the quality of prediction is essentially

147: independent of $t_f$ as will be discussed in detail below.

148: As expected the number of erroneous forces

149: decreases with increasing $R$ though atypical events may

150: produce flaws in the prediction. The analysis of these atypical RWs,

151: and how they lead to errors is the keystone of what follows.

152:

153:

154: \begin{figure}

155: \begin{center}

156: \vskip .7cm

157: \psfig{figure=./fig1.eps,height=4.7cm,angle=0}

158: \vskip .7cm

159: \caption{Left, top:  Example of  potential ${\bf V}$ obtained in the RF model

160: with tilt $b=0.4$ (size $N=1000$,

161: sites $x>200$ not shown here).  Right: examples of RWs, numbered

162: from 1 to 5, in this potential at temperature $T=1$; plateaus are in

163: correspondence with the local minima of $V$. Here $\alpha \simeq 0.85$

164: (creep phase). Left, bottom: Predictions from the first $R$ RWs in the

165: right panel and (\ref{field});

166: impulses locate incorrectly predicted forces $f_x$ for $x\le

167: 200$. The number of erroneous forces decreases from 26 (for $R=1$) to

168: 0 ($R=5)$. Note the errors on sites $x_0\simeq

169: 100$ appearing when the fourth RW is taken into account; indeed this

170: atypical RW marks no pause in the local minimum in $x_0$. }

171: \label{fig-pot}

172: \end{center}

173: \end{figure}

174:

175: {\em Number of RWs necessary for a good reconstruction.}

176: Expression (\ref{like}) for the likelihood of the RWs

177: is true for any potential

178: ${\bf V}$ and can be geometrically interpreted as follows.

179: Given a set of RWs we extract a signal vector ${\bf S}$

180: whose components are:  the times $t_x$ spent on

181: site $x$, the numbers $u(x\to x')$ of transitions

182: from site $x$ to site $x'$. When $R$ is large we expect ${\bf

183: S}$ to be extensive with $R$ and define the intensive signal

184: ${\bf s}={\bf S}/R$.

185: Similarly, to each potential ${\bf V}$ we associate a vector

186: ${\bf v}$ with components: minus the outgoing rate {\em i.e.} $ -\sum

187: _{x' (\ne x)} r_{\bf V} ( x\to x')$ for each site $x$, the logarithm of

188: the rate $r_{\bf V}(x\to x')$ for each pair of neighbors.

189: Then $L=\exp(R \;{\bf s}\cdot {\bf v})$ from (\ref{like})

190: where $\cdot$ denotes the scalar product. Maximizing the joint

191: probability $P=P_0\times L$ over the potential becomes equivalent,

192: in the large $R$ limit, to finding ${\bf v}$ with the largest

193: scalar product with the signal ${\bf s}$

194: \footnote{The irrelevance of the {\em a priori} distribution

195: in the asymptotic case of large data set is well-known \cite{bayes}

196: and can be checked for the RF model: the local field (\ref{field})

197: is extensive in $R$, while the global field $\alpha$ remains

198: finite.}.

199: It is natural to partition the space of signals into `Voronoi cells':

200: $C_{\bf v}$ is the set of ${\bf s}$ having a larger scalar product with

201: ${\bf v}$ than with any other potential ${\bf v}'$.

202: Bayes rule tells us that the most likely potential given an observed

203: signal ${\bf s}$ is the one attached to the cell in which ${\bf s}$ lies.

204:

205:

206: Consider now RWs taking place in a given potential ${\bf V}$.

207: From the law of large numbers the signal ${\bf s}$ is equal, in

208: the infinite $R$ limit, to

209: ${\bf s}^*_{\bf v}=\{t_x^*,u^*(x\to x')=t^*_x \, r_{\bf V} (x\to x')\}$ where

210: $t^*_x$ is the average sojourn time on site $x$

211: over RWs of duration $t_f$. As ${\bf s}^*_{\bf v} \in C_{\bf

212:   v}$ \footnote{Let ${\bf v}'\ne {\bf v}$;

213: ${\bf s}^*_{\bf v}\cdot ({\bf v}-{\bf v}') = \sum _{x\ne x'}

214: u^* (x\to x') G( r_{{\bf V}'}(x\to x')/ r_{{\bf V}}(x\to x') )$ where

215: $G(z)=z-\ln z -1 >0$ for $z\ne 1$.

216: } reconstruction becomes flawless in the limit

217: of an infinite number of data  as expected.

218: For large albeit finite $R$, ${\bf s}$ typically deviates

219: from ${\bf  s}^*_{\bf v}$ by $O(R^{-\frac 12})$; finite deviations have

220: exponentially small--in--$R$ probabilities, $e^{-R \, \omega _{\bf V}

221:   ({\bf s})}$, controlled by a

222: rate function $\omega _{\bf V} ({\bf s})$ \cite{Dem98}.

223: The probability to predict an erroneous potential is

224: the probability that the stochastic signal ${\bf s}$ does not

225: belong to the cell $C_{{\bf v}}$. This probability of error thus decays

226: exponentially with $R$ over a typical number of RWs

227: \begin{equation} \label{min}

228: R_c({\bf V}) = \big[\ \displaystyle{\min _{{\bf s} \notin C_{\bf v}}}

229: \ \omega _{\bf  V} ({\bf s})\ \big]^{-1}  \ ,

230: \end{equation}

231: where the minimum is taken over signals outside the `true'

232: cell. It depends on the temperature, the duration of the RW, ...

233:

234: As the RWs are independently drawn $\omega _{\bf   V}$ is a convex

235: function of ${\bf s}$ \cite{Dem98}. The minimum in (\ref{min}) is thus

236: reached on the boundary between the true cell and

237: another, bad cell, say, $C_{\bar {\bf v}}$. The attached potential,

238: $\bar {\bf V}$, is the most `dangerous' one from the inference point of

239: view. RWs generated from ${\bf V}$ and $\bar {\bf V}$ can hardly be told

240: apart unless more than $R_c({\bf V})$ of them are observed.

241:

242: Assume $\bar {\bf V}$ is known. Then the boundary between $C_{\bf v}$

243: and $C_{\bar {\bf v}}$ is the set of signals ${\bf s}\perp {\bf v}

244: -\bar {\bf v}$. We deduce

245: \begin{equation}

246: \label{bound}

247: R_c({\bf V} ) = \big[\ \max _{\mu} \; \min _{{\bf s}}

248: \big( \omega _{\bf V} ({\bf s}) + \mu\, {\bf s}\cdot(

249:  {\bf \bar v}-{\bf  v}) \big)\ \big] ^{-1}\

250: \end{equation}

251: where the Lagrange multiplier $\mu\in [0;1]$ ensures that ${\bf s}$ is confined

252: to the boundary.

253: The Legendre transform of $\omega _{\bf V}$ appearing in (\ref{bound})

254: is intimately related to the evolution operator of an abstract

255: random  walk process, denoted by RW$(\mu)$

256: to distinguish from the original RW \cite{noi}. This RW$(\mu)$-er

257: moves with the rates

258: $r_{(1-\mu) {\bf V} + \mu \bar {\bf  V}} (x \to x')$ and may die

259: on every site $x$ with positive rate

260: \begin{eqnarray} \label{death}

261: &&d_{{\bf V},\bar {\bf V},\mu} (x)

262: =\sum _{x' (\ne x)}  \big[ (1-\mu)\, r _{\bf V} (x\to x') +

263: \mu \,  r_{\bf \bar V}(x\to x') \nonumber \\

264:  &&\hskip 3cm  -\ r_{(1-\mu) {\bf V} + \mu {\bf \bar V}}(x\to x')\big] \ .

265: \end{eqnarray}

266: Consider now the probability $\pi(\mu)$

267: that RW$(\mu)$-er, initially at the origin,

268: has survived up to time $t_f$ (the duration of the original RW).

269: Then

270: $R_c({\bf V}) =\displaystyle{\min _{\mu \in[0;1]} 1/| \ln \pi(\mu)|}$.

271:

272: {\em Optimal Working Point for the RF model.}

273: We apply the above theory to the discrete RF model, and want to predict

274: the value of the force $f_y$ on the link $(y,y+1)$ for some specific

275: $y$. The dangerous potential is ${\bf \bar V}$

276: obtained from ${\bf V}$ upon reversal of the force $f_y\to -f_y$.

277: We aim at calculating the probability $\pi(\mu)$ of survival of

278: RW$(\mu)$-er moving with rate $r(x\to x')=r_{\bf V} (x\to x')$

279: and dying on site $x$

280: with rate $d(x)=0$ except:  $r(y\to y+1)

281: =1/r(y+1\to y)=e^{(1-2\mu)f_y/(2T)}$,

282: $d(y)= D(f_y),  d(y+1)=D(-f_y)$ where

283: $D(f)\equiv (1-\mu) e^{f/(2T)}+\mu e^{-f/(2T)} - e^{

284: (1-2\mu) f/(2T)}$ from (\ref{death}). From the

285: previous section the number of RWs required for a reliable prediction

286: of $f_y$ is $R_c(y;{\bf V}) = \min _{\mu} 1/|\ln \pi(\mu)|$.
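The rates quoted above can be recovered from (\ref{death}) in a few lines; the following sketch relies only on the definitions of the text. Reversing $f_y$ modifies the rates on the link $(y,y+1)$ alone, so all other terms in (\ref{death}) cancel. On this link the interpolated potential $(1-\mu){\bf V}+\mu\bar{\bf V}$ carries the force $(1-\mu)f_y-\mu f_y=(1-2\mu)f_y$, hence
\[
d(y)=(1-\mu)\,e^{\frac{f_y}{2T}}+\mu\, e^{-\frac{f_y}{2T}} - e^{(1-2\mu)\frac{f_y}{2T}}=D(f_y)\ ,
\]
and $d(y+1)=D(-f_y)$ upon exchanging the roles of the two sites. Since the exponential is convex, $(1-\mu)\,e^{a}+\mu\, e^{-a}\ge e^{(1-2\mu)a}$ for $0\le\mu\le 1$, so that $D(\pm f_y)\ge 0$, with equality for $\mu=0$ or $1$ only.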

287:

288: Let $\pi _x(\mu,t)$ be the

289: probability that RW$(\mu)$, initially on site $x$,  is still

290: alive at time $t$. The time-evolution of $\pi_x$ is described by

291: \begin{equation} \label{diffeq}

292: \frac{\partial\pi _x }{\partial t} = \sum

293: _{x' (\ne x)}  r(x\to x') \big( \pi _{x'}- \pi_{x} \big)

294: -d (x)\, \pi_x \ ,

295: \end{equation}

296: with initial condition $\pi_x (\mu,0)=1$ for every site $x$ (by convention

297: $\pi_{-1}=\pi_{N+1}=0$). After Laplace transform over time, eqns

298: (\ref{diffeq}) are turned into recurrence equations for the ratios

299: $\pi_x/\pi_{x+1}$ and solved with great numerical accuracy. We obtain this way

300: the probability of survival, $\pi(\mu)=\pi_0(\mu,t_f)$, and optimize

301: over $\mu$.

302: Though $R_c$ depends on the potential ${\bf V}$ its general behavior

303: for tilt $b>0$ as a function of the duration $t_f$ is sketched in

304: Fig.~\ref{fig-rc}. Three regimes are observed:

305:

306: $\bullet$ for $t_f\ll \tau_y$ (mean first passage time in $y$)

307: RW$(\mu)$ has a low probability to visit $y$ and is almost surely alive,

308: hence $R_c$ is very large;

309:

310: $\bullet$ for $\tau _y\ll t_f\ll t_{eq}$ RW$(\mu)$ has visited

311: the region

312: surrounding $y$ and escaped from this region (transient regime), hence

313: its probability of survival remains constant, and so does

314: $R_c$;

315:

316: $\bullet$ for $t_f\gg t_{eq}$ RW$(\mu)$ visits again

317: and again the region surrounding $y$, hence

318: the probability of survival decreases exponentially

319: with the duration: $R_c \propto 1/t_f$.

320:

321: The total time $R_c\times t_f$ for a good reconstruction

322: is minimal when we choose $t_f \gtrsim \tau_y$. This

323: marginally transient regime corresponds to the plateau of

324: Fig.~\ref{fig-rc}: RWs are long enough to

325: visit site $y$ but short enough not to wander much away from $y$.

326: To calculate the corresponding value of $R_c$ we take

327: the limits, in order, $N\to\infty$, $t_f\to \infty$, and look for the

328: stationary solution of (\ref{diffeq}) with boundary condition

329: $\pi_{x\to\infty}=1$. The result for

330: the probability of survival is

331: \begin{equation} \label{pstarsinai}

332: \pi (\mu) = \frac{e^{-\frac{\mu}T}}{1-\mu+ \mu

333: \, e^{-\frac 1T}+\mu(1-\mu)\, t^* _{y+1}\,(e^{\frac{1}{4T}}-

334: e^{-\frac{3}{4T}})^2 } \ ,

335: \end{equation}

336: where the mean sojourn time on site $y+1$ in ${\bf V}$ is \cite{revue}

337: \begin{equation} \label{tx}

338: t ^*_{y+1}= \sum _{z\ge0} \exp \left[\frac 1{T} \left( \frac {

339: V_{y+z+2}+ V_{y+z+1}}2 - V_{y+1} \right)\right] \ .

340: \end{equation}
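Two elementary checks of (\ref{pstarsinai}) may be useful; they are given here as a reading aid, not as part of the derivation. First, $\pi(0)=1$ and $\pi(1)=e^{-1/T}/e^{-1/T}=1$: the abstract walker never dies for $\mu=0$ or $1$, in agreement with the vanishing of the death rates $D(\pm f_y)$ at these two values, so that the optimal $\mu$ lies strictly inside the interval $[0;1]$. Second, the last factor in the denominator can be rewritten as
\[
\big(e^{\frac 1{4T}}-e^{-\frac 3{4T}}\big)^2 = 4\, e^{-\frac 1{2T}}\sinh^2\big(\frac 1{2T}\big) \ ,
\]
which is positive: at fixed $0<\mu<1$ the survival probability $\pi(\mu)$ decreases when the sojourn time $t^*_{y+1}$ increases.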

341:

342:

343: \begin{figure}

344: \begin{center}

345: \vskip .7cm

346: \psfig{figure=./fig2.eps,height=3cm,angle=0}

347: \caption{Sketch of the number $R_c(y;{\bf V})$ of RWs

348: necessary for a good inference of the force $f_y$

349: as a function of the RW duration $t_f$. $\tau_y$ is the typical

350: first-passage time in $y$ from the origin, $t_{eq}$ the equilibration

351: time  (comparable to the first-passage time from the extremity

352: $N$ when $y\ll N$). Inset: rate of reconstruction (\ref{vel}) as

353: a function of temperature at fixed tilt.}

354: \label{fig-rc}

355: \end{center}

356: \end{figure}

357:

358:

359: {\em Distribution of $R_c$ over potentials.}

360: The number $R_c(y;{\bf V})$ of RWs necessary to predict the value of

361: $f_y$ depends on the potential ${\bf V}$ through the sojourn time

362: $t^*_{y+1}$ (\ref{tx}). By randomly drawing potentials (or varying site $y$)

363: we obtain the distribution of $R_c$ shown in Fig.~\ref{fig-histo}.

364: Main features are:

365:

366: $\bullet$ Small values of $R_c$ correspond to sites where the RW spends a long

367: time $t^*$ (traps)\footnote{RW$(\mu)$, due to conditioning to survival,

368: is likely to stay for $\sim 1/d(y) \ll t^*$ in the trap only.}:

369: $R_c \sim \frac 1{|\ln \pi|} \sim \frac 1{\ln t^*}$ from

370: (\ref{pstarsinai}). The power law tail of the distribution of

371: sojourn times, $P(t^* )\sim (t^*) ^{-(\alpha+1)}$ \cite{revue},

372: gives rise to an essential singularity at the origin

373: in the cumulative distribution, ${\cal Q} (R_c) \sim

374: e^{-\alpha/R_c}$. The potential is easy to predict over trapping

375: regions since RWer spends a long time there, and accumulates

376: information about the energy landscape.

377:

378: $\bullet$ Conversely the largest value of $R_c$, denoted by $R_c ^H$,

379: corresponds to the homogeneous potential $V_x^H=-x$ in which the

380: walker is never  trapped and is quickly driven to $+\infty$.

381: $R_c^H$ can be calculated from (\ref{pstarsinai}) by setting

382: $f_x=+1$ for all sites in (\ref{tx}). The singularity in

383: ${\cal Q}$ when $R_c\to R_c^H$ corresponds to quasi-homogeneous

384: potentials, where one force, say, on site $\ell$, is $-1$.

385: Such potentials have probabilities exponentially small in $\ell$, but

386: give values of $R_c$ on site $y=0$ exponentially close to $R_c^H$.

387: Overall, we

388: find $1-{\cal Q} (R_c ^H - \epsilon) \sim \epsilon ^\beta$ where

389: the exponent is $\beta=T \ln \frac 2{1+b}$.

390:

391: $\bullet$ In between ${\cal Q}$ shows marked steps at well defined and

392: $b$-independent values of $R_c$, which correspond to specific

393: local force patterns beyond site $y$.

394: An $\ell$-pattern is defined as a sequence of

395: forces on sites $y+1$ to $y+\ell+1$, followed by all $+$

396: forces; the corresponding  $R_c$ can be

397: exactly calculated from (\ref{pstarsinai},\ref{tx}), and is

398: shown  for 7 among the

399: 16 $\ell=4$-patterns in Fig.~\ref{fig-histo}.

400: The histogram of $R_c$ can be accurately approximated for any tilt $b>0$ based

401: on the above local pattern description. Given a length $\ell$ we enumerate all

402: the $2^\ell$ patterns, calculate the corresponding $R_c$, and weight them

403: with probability $(\frac {1+b}2) ^{\# f_x =+}\times (\frac {1-b}2)

404: ^{\# f_x =-}$. In practice we choose $\ell\sim 10/\ln [2/(1-b)]$,

405: to ensure that patterns with more than $\ell$ negative

406: forces have negligible

407: weights ($< e^{-10}$). The resulting histograms are in

408: excellent agreement with ${\cal Q}$ for intermediate

409: values of $R_c$ (dashed lines in  Fig.~\ref{fig-histo}).

410:

411: \begin{figure}

412: \begin{center}

413: \vskip .7cm

414: \psfig{figure=./fig3.eps,height=5cm,angle=0}

415: \caption{Cumulative probability distribution ${\cal Q}$ of

416: $R_c(y;{\bf V})$ at

417: temperature $T=1$ and for three tilt values $b$. Full lines are

418: numerical results from $10^6$ samples, and dashed lines are the

419: outcomes from the $\ell$-pattern approximation. Inset: $R_c$ vs.

420: $T$ for the 3-patterns  $+++$,

421: $-++$, $---$ (from top to bottom).}

422: \label{fig-histo}

423: \end{center}

424: \end{figure}

425:

426: {\em Tuning temperature for fast reconstruction.}

427: The dependence of $R_c$ upon temperature is shown for three patterns

428: in the Inset of Fig.~\ref{fig-histo}. We have $R_c \sim 4T$

429: as $T\to\infty$ independently of the pattern, and $R_c\sim 2T/(h+3)$

430: when $T\to 0$ where $h$ is the highest barrier to the right of $y$

431: in the potential defined by the pattern (Fig.~\ref{fig-histo}).

432: When the temperature exceeds the temperature $T_b$ such that

433: $\alpha=1$ the velocity

434: of the RWer is finite $\frac y{\tau _y} \sim v(T) >0$

435: \cite{revue}. The reconstruction rate (number of

436: correctly predicted forces per unit of time) is equal to the velocity

437: $v(T)$ divided by $R_c$,

438: \begin{equation} \label{vel}

439: \nu (T)= \frac{1-\cosh\frac 1T + b \sinh

440: \frac 1T}{\cosh \frac 1{2T} - b \sinh \frac 1{2T}} \times \int _0 ^{R_c^H}

441: dR_c \frac {{\cal Q}  '(R_c)}{R_c}

442: \end{equation}

443: after averaging over the quenched potential.

444: The dependence of $\nu$

445: upon temperature is sketched in the Inset of Fig.~\ref{fig-rc}; it is

446: maximal and equal to $\nu ^M$ for some temperature $T^M$ realizing a

447: trade-off between fast motion (large velocity) and accurate

448: reading-out (small $R_c$). Even in the small

449: $b$ limit the optimal reconstruction rate is finite, $\nu^M \sim b^2$,

450: by working at high temperature $T^M \sim \frac  1b$, while in the

451: absence of optimization procedure the number of predicted forces

452: scales only as the squared logarithm of the time \cite{math2}.
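The high-temperature law $R_c\sim 4T$ can be checked explicitly on the homogeneous potential $V^H_x=-x$; the lines below are a sketch based on (\ref{pstarsinai}) and (\ref{tx}) only. The sum (\ref{tx}) is then geometric,
\[
t^*_{y+1}=\sum_{z\ge 0} e^{-(z+\frac 12)/T} = \frac 1{2\sinh \frac 1{2T}} \ ,
\]
so that $t^*_{y+1}\,(e^{\frac 1{4T}}-e^{-\frac 3{4T}})^2=1-e^{-\frac 1T}$ and
\[
\pi^H(\mu)=\frac{e^{-\frac \mu T}}{1-\mu^2\,(1-e^{-\frac 1T})} \ .
\]
For $T\to\infty$ one finds $\ln \pi^H(\mu)=-\mu(1-\mu)/T+O(T^{-2})$, whose modulus is largest at $\mu=\frac 12$ where $|\ln\pi^H|\simeq 1/(4T)$, hence $R_c^H\simeq 4T$. In the same spirit, the numerator of the velocity factor in (\ref{vel}) equals $2\sinh\frac 1{2T}\,\big(b\cosh\frac 1{2T}-\sinh\frac 1{2T}\big)$; it vanishes for $b=\tanh\frac 1{2T}$, that is, $(1+b)/(1-b)=e^{1/T}$ or $\alpha=1$, consistent with the definition of $T_b$.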

453:

454: {\em Conclusion.} We have shown how the number of RWs

455: required for a good reconstruction of the potential

456: can be deduced from the probability of

457: survival of an absorbing RW process. This result is of practical

458: interest since the survival probability can be estimated through

459: numerical simulations e.g. in dimension $\ge 2$. Furthermore we have

460: determined, for the special case of the RF model, the optimal

461: `experimental' protocol for reconstruction (number of RWs,  duration,

462: temperature).

463:

464: Our formalism applies to continuously parametrized potentials

465: e.g. RF model with forces taking continuous

466: instead of binary values. The aim is now to predict the true potential

467: values up to some accuracy on each site; this in turn

468: determines an acceptable neighborhood around ${\bf s}^*_{\bf v}$ in the space

469: of signals. The rate function $\omega _{\bf v}$ is generically

470: parabolic around ${\bf s}^*_{\bf v}$, with a curvature matrix called Fisher

471: information matrix \cite{bayes}. Finding $R_c$ amounts to minimizing this

472: (positive) quadratic form on the boundary of the neighborhood, a task

473: which can be carried out efficiently \cite{garey}.

474: Our approach can be easily  extended to the

475: case of a finite delay between two measures of the positions, and

476: Chernoff's result is recovered in the finite $N$, infinite delay

477: limits \cite{Che52,mb}.

478:

479: {\em Acknowledgments.} We are grateful to D. Thirumalai for his

480: suggestion of illustrating our formalism on the RF model. This work

481: was partially funded by ANR under  contract 06-JCJC-051.

482:

483:

484:

485: \begin{thebibliography}{999999}

486:

487: \bibitem{discreteRF}

488: B. D. Hughes, {\em Random walks and random environments},

489: Oxford University Press (1996).

490:

491: \bibitem{revue}

492: J-P. Bouchaud, A. Georges, {\em Physics Reports} {\bf 195}, 127 (1990).

493:

494:

495: \bibitem{evans}

496: R. Merkel {\em et al.}

497: {\em Nature}   {\bf 397},50-53 (1999).

498:

499:

500: \bibitem{fer04}

501: J.M. Fernandez, H. Li, {\em Science} {\bf 303}, 1674 (2004).

502:

503: \bibitem{Ess97}

504: B. Essevaz-Roulet, U. Bockelmann,  F. Heslot,

505: {\em Proc. Natl. Acad. Sci. (USA) } {\bf 94}, 11935 (1997).

506:

507: \bibitem{woo06}

508: M.T. Woodside {\em et al.} {\em Science} {\bf 314}, 1001 (2006).

509:

510:

511: \bibitem{hye03}

512: C. Hyeon, D. Thirumalai,

513: {\em Proc. Natl. Acad. Sci. (USA)} {\bf 100}, 10249-53 (2003).

514:

515: \bibitem{mb}

516: V. Baldazzi {\em et al.}, {\em Phys. Rev. Lett.} {\bf 96}, 128102

517: (2006); {\em Phys. Rev. E} {\bf 75}, 011904 (2007).

518:

519: \bibitem{ritort}

520: M. Manosas, D. Collin, F. Ritort, {\em Phys. Rev. Lett.} {\bf 96},

521: 218301 (2006).

522:

523: \bibitem{bayes}

524: T.M. Cover, J.A. Thomas, {\em Elements of Information Theory}, Wiley (1991).

525:

526:

527: \bibitem{domany}

528: L. Ein-Dor, O. Zuk, E. Domany, {\em Proc. Nat. Acad. Sci. (USA)} {\bf

529:  103}, 5923-5928 (2006).

530:

531: \bibitem{And57}

532: T. W. Anderson, L. Goodman,

533: {\em The Annals of Mathematical Statistics} {\bf 28}, 89 (1957).

534:

535: \bibitem{Che52}

536: H. Chernoff, {\em Ann. Math. Statis.} {\bf 23}, 493 (1952).

537:

538: \bibitem{Dem98}

539: A. Dembo, O. Zeitouni, {\em Large Deviations Techniques and Applications},

540: Springer-Verlag (1998).

541:

542: \bibitem{Boz71}

543: L.B. Boza, {\em Ann. Math. Statis.} {\bf 42}, 1992 (1971).

544:

545: \bibitem{noi}

546: S. Cocco, R. Monasson, {\em in preparation}.

547:

548: \bibitem{garey}

549: M.R. Garey, D.S. Johnson, {\em Computers and Intractability:

550: A Guide to the Theory of NP-Completeness},

551: W.H. Freeman (1979).

552:

553: \bibitem{math1}

554: O. Adelman, N. Enriquez, {\em Israel J. Math.} {\bf 142}, 205-220

555: (2004).

556:

557: \bibitem{math2}

558: P. Andreoletti, {\em preprint arxiv:math.PR/0612208} (2006).

559:

560: \end{thebibliography}

561:

562:

563: \end{document}

564: