physics0502152/neun.tex
1: \documentclass[twocolumn,showpacs,preprintnumbers,amsmath,amssymb]{revtex4}
2: 
3: \bibliographystyle{/mnt/p0friedr/tdfrank/styles/apsrev}
4: \usepackage{epsfig}
5: 
6: \begin{document}
7: \title{An Iterative Procedure for the Estimation of Drift and
8:   Diffusion Coefficients of Langevin Processes} 
9: \author{D.~Kleinhans, R.~Friedrich}
10: \affiliation{Institute for Theoretical Physics, University of
11:   M\"unster, D-48149 M\"unster, Germany}
12: 
13: \author{A.~Nawroth, J.~Peinke}
14: \affiliation{ Institute for Physics, Carl-von-Ossietzky University Oldenburg, D-26111 Oldenburg, Germany} 
15: \date{\today}
16: 
17: 
18: \begin{abstract}
19: A general method is proposed which allows one 
20: to estimate drift and diffusion coefficients of a stochastic process
21: governed by a Langevin equation. It extends a previously devised
22: approach [R. Friedrich et al., 
23: Physics Letters {\bf A 271}, 217 (2000)], which 
24: requires sufficiently high sampling rates.
25: The analysis is based on an iterative procedure minimizing 
26: the Kullback-Leibler distance between measured and estimated 
27: two-time joint probability distributions of the process.
28: 
29: \end{abstract}
30: 
31: \pacs{87.23.Cc,02.50.Ey,05.40.Jc}
32: \maketitle
33: 
34: 
35: \section{Introduction}
36: Complex behavior in systems far from equilibrium can quite often 
37: be traced back to rather simple laws due to the existence
38: of processes of self-organization \cite{Haken1}. 
39: Since complex systems are composed
40: of a huge number of subsystems, however, the
41: microscopic degrees of freedom play an important role: 
42: they introduce temporal variations on a fast time scale, which quite
43: often can be treated as fluctuations. 
44: As a consequence, the evolution of a set of 
45: macroscopic order parameters ${\bf q}(t)$ is governed by nonlinear
46: Langevin equations \cite{Risken}, \cite{Gardiner}:
47: \begin{equation}\label{Lange}
48: \frac{d}{dt}q_{i} = D_i^1({\bf q})  + \sum_l g_{il}({\bf q}) F_l(t)\quad ,
49: \end{equation}
50: where ${\bf q}(t)$ denotes the $n$-dimensional state vector, ${\bf
51: D}^1({\bf q})$ is the drift vector and the matrix $g({\bf q})$
52: is related to the diffusion matrix according to 
53:  $\left(D^2({\bf q})\right)_{ij}
54: =\sum_k g_{ik}({\bf q}) g_{jk}({\bf q})$. ${\bf F}(t)$ are fluctuating forces
55: with Gaussian statistics that are
56: delta-correlated in time: $\langle F_l(t)\rangle=0$, 
57: $\langle F_l(t) F_k(t')\rangle=2\delta_{lk}\delta(t-t')$.
58: Here and in
59: the following we adopt It\^o's interpretation of stochastic 
60: integrals \cite{Risken}, \cite{Gardiner}. 
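
For orientation, a brief sketch of the corresponding Fokker-Planck description may be helpful. Restricting to one dimension ($n=1$, $D^2=g^2$) as an illustrative assumption, the It\^o-Langevin equation (\ref{Lange}) with the above normalization of the noise corresponds to the standard form \cite{Risken}
\begin{equation}
\frac{\partial}{\partial t} f(q,t) = -\frac{\partial}{\partial q}\left[D^1(q) f(q,t)\right]
+ \frac{\partial^2}{\partial q^2}\left[D^2(q) f(q,t)\right] \quad ,
\end{equation}
so that, with this convention, no factor $1/2$ appears in front of the diffusion term.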
61: 
62: Analyzing complex systems that can be described by
63: stochastic equations of the form (\ref{Lange}) therefore amounts to
64: assessing the underlying Langevin equations or the corresponding
65: Fokker-Planck equations from an inspection of experimentally
66: determined time series \cite{Haken2}. 
67: Recently, an operational method \cite{Siegert1}, \cite{Siegert2}
68: has been devised, which allows one to
69: estimate drift and diffusion coefficients of the 
70: stochastic processes from experimental data.
71: This method has been successfully applied to various problems in the
72: field of complex systems like the analysis of noisy electrical circuits
73: \cite{Siegert2}, stochastic dynamics of metal cutting 
74: \cite{Grad1}, systems with feedback delay \cite{Frank1},
75: meteorological processes like wind-driven Southern Ocean variability
76: \cite{Sura1}, traffic flow data \cite{Kriso} and physiological time series \cite{Kuusela04}. 
77: Furthermore, it has been applied
78: to problems like turbulent flows \cite{PRL}, \cite{JFM}, 
79: passive scalar advection \cite{Tutku},
80: financial time series \cite{PRLfinanz}, and the analysis of rough surfaces
81: \cite{Jafari}, \cite{Waechter}, which can be characterized as
82: stochastic processes with respect to a scale variable exhibiting
83: Markovian properties in scale.
84: 
85: The method is based on the evaluation of the small-time limits of
86: the first and second conditional moments, 
87: \begin{subequations}
88: \label{est}
89: \begin{eqnarray}
90: {\bf D}^1({\bf q}) &=& \lim_{\tau \rightarrow 0} \frac{1}{\tau}
91: \langle {\bf q}(t+\tau)-{\bf q}(t)|{\bf q}(t)={\bf q}\rangle \\ 
92: { D}^2_{ij}({\bf q}) &=& \lim_{\tau \rightarrow 0} \frac{1}{2\tau}
93:  \langle [{\bf q}(t+\tau)-{\bf q}(t)]_{i}\nonumber\\
94: &&[{\bf q}(t+\tau)-{\bf q}(t)]_{j}|{\bf q}(t)={\bf q}\rangle\quad .
95: \end{eqnarray}
96: \end{subequations}
97: From these expressions it becomes evident that the sampling rate in the
98: experiments has to be sufficiently high in order to allow for a
99: reliable evaluation of the limit $\tau \rightarrow 0$.
100: Therefore, in all
101: applications mentioned above 
102: the results have been checked in a self-consistent manner by a
103: recalculation of conditional pdf's from the estimated Fokker-Planck
104: equation. Possible problems in estimating drift and diffusion coefficients
105: related to low sampling frequencies have been addressed by Sura and Barsugli
106: \cite{Sura}, Ragwitz and Kantz \cite{Ragw}, \cite{Kantzrepl} and 
107: Friedrich et al. \cite{Kantzcom}.  
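
The effect of a finite sampling interval can be illustrated with a simple worked example, which is not taken from the applications cited above. Consider a one-dimensional Ornstein-Uhlenbeck process with the assumed coefficients $D^1(q)=-\gamma q$ and $D^2(q)=Q$, where $\gamma>0$ and $Q>0$ are hypothetical constants chosen for illustration. The conditional moments are known in closed form, so that evaluating the expressions (\ref{est}) at finite $\tau$ instead of taking the limit gives the finite-$\tau$ estimates (denoted here by a subscript $\tau$)
\begin{eqnarray}
D^1_\tau(q) &=& -q\,\frac{1-e^{-\gamma\tau}}{\tau}
= -\gamma q\left[1-\frac{\gamma\tau}{2}+O(\tau^2)\right] \quad , \nonumber\\
D^2_\tau(q) &=& \frac{Q}{2\gamma\tau}\left(1-e^{-2\gamma\tau}\right)
+ \frac{q^2}{2\tau}\left(1-e^{-\gamma\tau}\right)^2 \nonumber\\
&=& Q\left[1-\gamma\tau\right] + \frac{\gamma^2\tau}{2}\,q^2 + O(\tau^2) \quad .
\end{eqnarray}
Under these assumptions the finite-$\tau$ estimate underestimates the magnitude of the drift and produces a spurious $q^2$ contribution to the diffusion term, both of which vanish only for $\gamma\tau \rightarrow 0$.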
108: 
109: The aim of the present letter is to devise an extension of the above
110: method in order to overcome problems related to the time limit $\tau
111: \rightarrow 0$. These problems immediately show up for low
112: sampling rates.
113:  We also want to point out that for the case of stochastic forces
114:  ${\bf F}(t)$ with small but finite temporal correlations the process is not Markovian in the
115:  limit $\tau \to 0$. In this case, however, one should use the Stratonovich
116:  interpretation of stochastic processes \cite{Risken}.
117: 
118: % We also want to point out that for the case of stochastic forces
119: % $\bf{F}(\bf{q},t)$ with small but finite temporal correlations the
120: % limit $\tau \to 0$ cannot be used for approximating the process by a
121: % markovian one. In this case, however, one should use the Stratonovich
122: % interpretation of stochastic processes \cite{Risken}.
123: 
124: %   or for the case of stochastic forces ${\bf F}({\bf q},t)$
125: % with small but finite temporal correlations.
126: % Furthermore, uncorrelated noise
127: % sources, so-called measurement noise \cite{Siefert} additionally may limit the
128: % accuracy of the estimates (\ref{est}).  
129: 
130: \section{Description of the Method}
131: The starting point is a first estimate of the drift and
132: diffusion coefficients from the expressions (\ref{est}), evaluated for
133: the smallest value of $\tau$ for which they can be reliably determined. The second step
134: is an embedding of drift and diffusion coefficients into a family of
135: functions ${\bf D}^1({\bf q},\sigma)$, ${\bf D}^2({\bf q},\sigma)$
136: parameterized by a set of free parameters $\sigma$. The expressions
137: obtained in the first step 
138: already yield a crude estimate of the parameters $\sigma$. 
139: The third step consists in optimizing the free parameters
140: ${\sigma}$. 
141: 
142: Optimization of the free parameters can be performed 
143: in the following way. One determines the 
144: conditional probability distribution
145: \begin{equation}
146: p({\bf q},t|{\bf q}_0,t_0;{\bf\sigma})
147: \end{equation}
148: for the parameter set ${\sigma}$ either by a
149: simulation of the Langevin equations or by a numerical
150: solution of the corresponding Fokker-Planck equation. In each case,
151: one can determine the two-point pdf $f({\bf q},t;{\bf
152: q}_0,t_0;{\sigma})=p({\bf q},t|{\bf q}_0,t_0;{\sigma})f({\bf
153: q}_0,t_0)$. 
154: Note that this 
155: may be done for various finite values of $t-t_0$. The resulting two-time
156: pdf can now be compared with the experimental one. A suitable measure
157: for the distance is the Kullback-Leibler information \cite{Haken2} 
158:  defined according to
159: \begin{eqnarray}
160: \label{kullb_information}
161: K({\sigma},t,t_0) &=&\int d{\bf q} \int d{\bf q}_0
162: f_{exp}({\bf q},t;{\bf q}_0,t_0)
163: \nonumber \\
164: &\times &
165: \ln \frac{f_{exp}({\bf q},t;{\bf q}_0,t_0)}{f({\bf q},t;{\bf q}_0,t_0;{\sigma})}\qquad .
166: \end{eqnarray}
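In practice the pdf's are estimated on a grid, and the integral can be approximated by a sum. As a sketch, assume that state space is divided into bins labeled by $i$ (for ${\bf q}$) and $j$ (for ${\bf q}_0$), that the same grid is used for data and model, and that $P_{exp}(i,j)$ and $P(i,j;\sigma)$ denote the measured and the modeled probabilities of a pair of bins (notation introduced here for illustration only). Then
\begin{equation}
K(\sigma,t,t_0)\approx\sum_{i,j} P_{exp}(i,j)\,
\ln\frac{P_{exp}(i,j)}{P(i,j;\sigma)} \quad ,
\end{equation}
where the bin volumes cancel in the ratio, bins with $P_{exp}(i,j)=0$ do not contribute, and $P(i,j;\sigma)>0$ is required wherever $P_{exp}(i,j)>0$. Since $K\geq 0$, with equality only if both distributions coincide, the value of $K$ at its minimum also indicates how well the chosen class of processes reproduces the data.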
167: 
168: The minimum of the Kullback-Leibler information with respect to the parameters 
169: ${\sigma}$ yields estimates of drift and diffusion of 
170: a stochastic process. This process is the best approximation
171: with respect to this measure in
172: the class of stochastic processes characterized by the parameters
173: ${\sigma}$. The problem of identifying a stochastic process is then
174: equivalent to determining a minimum of the Kullback-Leibler information. In practice
175: the minimum can be determined by standard methods \cite{weinstein90}, 
176: for instance by gradient or genetic algorithms.
177: In the following we shall consider cases where it
178: is possible to obtain a parametrization of the stochastic processes by 
179: only a few parameters $\sigma$, such that the Kullback-Leibler measure
180: can be investigated by graphical means.
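
Schematically, the steps described in this section may be arranged as follows. This is only a sketch; the iteration index $k$ and the stopping criterion are assumptions made here for illustration and are not prescribed by the method.
\begin{enumerate}
\item Estimate the two-time pdf $f_{exp}({\bf q},t;{\bf q}_0,t_0)$ from the data for one or several lags $t-t_0$.
\item Obtain initial parameters $\sigma^{(0)}$ by fitting the family ${\bf D}^1({\bf q},\sigma)$, ${\bf D}^2({\bf q},\sigma)$ to the estimates (\ref{est}).
\item For the current $\sigma^{(k)}$, determine $p({\bf q},t|{\bf q}_0,t_0;\sigma^{(k)})$ by simulating the Langevin equation or by solving the Fokker-Planck equation, and form $f({\bf q},t;{\bf q}_0,t_0;\sigma^{(k)})$.
\item Evaluate $K(\sigma^{(k)},t,t_0)$ according to (\ref{kullb_information}).
\item Choose $\sigma^{(k+1)}$ with the selected optimization method and return to step 3 until $K$ no longer decreases appreciably.
\end{enumerate}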
181: 
182: \section{Examples}
183: For certain classes of stochastic processes
184: the above procedure can be simplified considerably because
185: only a few free parameters for the parametrization
186: of drift and diffusion terms have to be introduced. As a consequence, 
187: the minimization of the Kullback-Leibler information
188: is greatly facilitated.  
189: 
190: \subsection{One dimensional systems}
191: 
192: \begin{figure}
193: \begin{center}
194: \includegraphics[width=8.6cm]{mult01u2.dat.bw.eps}
195: \end{center}
196: \caption{Segment of the one-dimensional synthetic time series I. }
197: \label{mult01u2.dat}
198: \end{figure}
199: 
200: The case of one-dimensional systems allows for the following
201: treatment due to the fact that the 
202: stationary pdf, which is assumed to exist, can be determined
203: analytically:
204: \begin{equation}
205: f(q)=\frac{N}{D^{2}(q)} e^{\ \int\limits^q dq' \frac{D^{1}(q')}{D^{2}(q')}} \qquad .
206: \end{equation}
207: As a consequence, we have the relationship
208: \begin{equation}
209: \label{multnoise}
210: D^{1}(q)=D^{2}(q)\frac{d}{dq} \ln f(q)+\frac{d}{dq}D^{2}(q) \qquad . 
211: \end{equation}
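Relation (\ref{multnoise}) can be verified in one line. As a sketch, assuming boundary conditions for which the stationary probability current vanishes, the stationary Fokker-Planck equation reduces to
\begin{equation}
D^1(q) f(q) - \frac{d}{dq}\left[D^2(q) f(q)\right] = 0 \quad ,
\end{equation}
and dividing by $f(q)$ gives $D^1(q)=D^2(q)\,f'(q)/f(q)+dD^2(q)/dq$, which is (\ref{multnoise}).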
212: 
213: Since $f(q)$ can be determined from the time series, 
214: an estimate in terms of a parameterized ansatz 
215: for the diffusion term suffices. In fact, one may use the ansatz 
216: $D^2(q)=Q+ aq^2 +b q^4+\ldots$, which 
217: helps in lowering the number of parameters $\sigma$ to be estimated by
218: the above procedure of minimizing the Kullback-Leibler information.  
219: The drift then follows from (\ref{multnoise}). 
220: 
221: \begin{figure}
222: \begin{center}
223: \includegraphics[width=8.6cm]{mult02.out.map.bw.eps}
224: \end{center}
225: \caption{Kullback distance $K(Q,a)$ as function of the parameters $Q$ 
226: and $a$ for time series I. The lines are equidistant 
227: contour lines starting from $2.6\cdot 10^{-4}$ in the center. 
228: The distance between contour lines is $5\cdot 10^{-5}$. 
229: A clear minimum is located at $(Q,a)=(1,1)$.}
230: \label{mult02.out}
231: \end{figure}
232: 
233: Let us consider system I with drift and diffusion functions
234: \begin{eqnarray}
235: D^1(q)=q-q^3\quad\mbox{and}\quad D^2(q)=1+q^2
236: \end{eqnarray}
237: driven by a multiplicative noise term.
238: We use synthetic data obtained by numerical integration of the 
239: corresponding Langevin equation \cite{Risken},
240: \begin{equation}
241: q(t+\tilde{\tau})=q(t)+\tilde{\tau}D^1\left[q(t)\right]+\sqrt{\tilde{\tau}D^2\left[q(t)\right]}\,\Gamma(t)\quad.
242: \end{equation}
243: Here the $\Gamma(t)$ are independent Gaussian random numbers with zero mean and variance $2$, consistent with the normalization of ${\bf F}(t)$ given above. A time series containing $10^6$ points with sampling 
244: increment $10^{-2}$ was generated. The intrinsic increment $\tilde{\tau}$ used for the numerical integration 
245: of the Langevin equation was $10^{-5}$. 
246: A time segment of the data is presented in fig.~\ref{mult01u2.dat}. 
247: Since the stochastic process is stationary and ergodic,  
248: all statistical quantities can be retrieved from these data.
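
For system I the relations above can be checked explicitly. The following is a worked illustration under the assumption that the stationary pdf is known exactly rather than estimated from data. Since
\begin{equation}
\frac{D^1(q)}{D^2(q)}=\frac{q-q^3}{1+q^2}=-q+\frac{2q}{1+q^2} \quad ,
\end{equation}
the exponent of the stationary pdf is $-q^2/2+\ln(1+q^2)$, and the prefactor $1/D^2(q)$ cancels the logarithmic term, so that $f(q)=N e^{-q^2/2}$ with $N=1/\sqrt{2\pi}$. Inserting this pdf and the two-parameter form $D^2(q)=Q+aq^2$ into (\ref{multnoise}) yields
\begin{equation}
D^1(q)=-(Q+aq^2)\,q+2aq=(2a-Q)\,q-a\,q^3 \quad ,
\end{equation}
which reproduces $D^1(q)=q-q^3$ for $(Q,a)=(1,1)$.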
249: 
250: For the estimation of the pdf's from data, the state space has to be 
251: divided into bins. We used $100$ equidistant 
252: bins for the stationary pdf. An accurate way to calculate the 
253: integral yielding the Kullback-Leibler distance
254: without running out of memory, even for higher dimensional 
255: data, is to use an adequate local grid for the first argument (the
256: destination) of the conditional pdf's. The conditional pdf can then 
257: be retrieved locally from the data for any $({\bf q},{\bf q}_{0})$ 
258: with high accuracy. The local grid used in this example covered $20$
259: equidistant bins.
260: 
261: % ...select a different 
262: % amount of bins for the conditional pdf's. The conditional pdf then 
263: % locally can be retrieved from the data for any ${\bf q}$ 
264: % with high accuracy. 
265: 
266: During the iteration
267: procedure the two-point pdf's have to be calculated. 
268: We again use the numerical simulation of Langevin processes 
269: as a very efficient way to generate these pdf's.
270: 
271: Starting from the estimates (\ref{est}) the ansatz $D^2(Q,a,q)=Q+aq^2$ 
272: is reasonable. The drift immediately follows from (\ref{multnoise}) 
273: and, for each parameter set $(Q,a)$,
274: one obtains a stationary distribution that equals the experimental one. 
275: Due to this fact the evaluation of the conditional 
276: pdf $p(q,t+\tau|q_{0},t;Q,a)$ suffices to calculate the Kullback-Leibler
277: distance. A clear minimum of the distance is found at $(Q,a)=(1,1)$ 
278: corresponding to the original set of parameters. 
279: The Kullback distance close to this minimum 
280: in the two-dimensional parameter space is 
281: exhibited in fig.~\ref{mult02.out}.
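
The reduction to conditional pdf's used above follows directly from the definition (\ref{kullb_information}). As a short sketch for the stationary one-dimensional case, write $f_{exp}(q,t+\tau;q_0,t)=p_{exp}(q,t+\tau|q_0,t)f_{exp}(q_0)$, where $p_{exp}$ denotes the conditional pdf estimated from the data (a symbol introduced only for this illustration). If the model reproduces the stationary pdf, $f(q_0)=f_{exp}(q_0)$, the marginals cancel inside the logarithm:
\begin{eqnarray}
K(Q,a)&=&\int dq_0\, f_{exp}(q_0)\int dq\; p_{exp}(q,t+\tau|q_0,t) \nonumber\\
&\times& \ln\frac{p_{exp}(q,t+\tau|q_0,t)}{p(q,t+\tau|q_0,t;Q,a)} \quad .
\end{eqnarray}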
282: 
283: \subsection{Application to potential systems}
284: 
285: \begin{figure}
286: \begin{center}
287: \includegraphics[width=8.6cm]{feb002.dat.1d.eps}
288: \end{center}
289: \caption{Segment of the two-dimensional synthetic time series II.}
290: \label{feb002.dat.1d}
291: \end{figure}
292: 
293: \begin{figure}
294: \begin{center}
295: \includegraphics[width=8.6cm]{feb002.out.eps}
296: \end{center}
297: \caption{The Kullback distance $K(Q)$ as a function of the 
298: noise strength $Q$ (time series II). A minimum is clearly visible at the value
299: $Q=0.05$.}
300: \label{feb002.out}
301: \end{figure}
302: 
303: The procedure for one-dimensional systems can be immediately 
304: extended to higher dimensions if one restricts the analysis
305: to the class 
306: of so-called potential systems, for which the drift vector 
307: ${\bf D}^1({\bf q})=-\nabla V({\bf q})$ is obtained from a potential 
308: $V({\bf q})$ and $g_{ik}=\sqrt{Q}\delta_{ik}$.
309: The central point of our analysis is the following exact expression for
310: the stationary pdf 
311: 
312: \begin{equation}
313: f({\bf q})=N  e^{-V({\bf q}) /Q} 
314: \qquad .
315: \end{equation}
316: 
317: Since the stationary pdf can be estimated from experimental data 
318: one may parameterize the class of stochastic
319: processes by the single parameter $Q$. Thus the drift function can be taken
320: to be fixed except for the value of $Q$:
321: 
322: \begin{equation}
323: {\bf D}^{1}({\bf q})= Q {\bf \nabla} \ln f({\bf q}) \qquad . \label{add_final}
324: \end{equation}
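Equation (\ref{add_final}) follows from the stationary pdf in one step. With ${\bf D}^1({\bf q})=-\nabla V({\bf q})$ for a potential system, one has
\begin{eqnarray}
\nabla \ln f({\bf q}) &=& \nabla\left[\ln N - \frac{V({\bf q})}{Q}\right] \nonumber\\
&=& -\frac{1}{Q}\nabla V({\bf q}) = \frac{1}{Q}\,{\bf D}^1({\bf q}) \quad ,
\end{eqnarray}
so that the shape of the drift is fixed by the measured pdf and only its overall scale $Q$ remains to be determined.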
325:  
326: As an example we consider the two-dimensional system
327: \begin{equation}
328: {\bf D^{1}}({\bf q})=
329: \left(\begin{array}{c}\epsilon q_{1}-q_{1}\left[q_{1}^2+Bq_{2}^2\right]\\
330: \epsilon q_{2}-q_{2}\left[Bq_{1}^2+q_{2}^2\right]\end{array}\right)
331: \qquad .
332: \end{equation}
333: This dynamical system arises as order parameter equations for instabilities
334: in nonequilibrium systems and has applications
335: ranging from pattern formation in nonequilibrium systems to pattern
336: recognition \cite{Haken1}. It exhibits the features of 
337: multistability and selection. We considered the case
338: $\epsilon=0.25$ and $B=2$ (time series II). 
339: These parameters yield four stable fixed points of the dynamics 
340: on the axes at $|{\bf q}|=1/2$ and unstable fixed points at the 
341: origin and on the bisecting lines at $|{\bf q}|=\sqrt{6}/6$. 
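
These statements can be checked by a short calculation, given here as an illustrative sketch. The drift derives from the potential
\begin{equation}
V({\bf q})=-\frac{\epsilon}{2}\left(q_1^2+q_2^2\right)
+\frac{1}{4}\left(q_1^4+q_2^4\right)+\frac{B}{2}\,q_1^2q_2^2 \quad ,
\end{equation}
since $-\partial V/\partial q_1=\epsilon q_1-q_1[q_1^2+Bq_2^2]$ and analogously for $q_2$. On the axes ($q_2=0$) the fixed point condition $\epsilon q_1-q_1^3=0$ gives $q_1=\pm\sqrt{\epsilon}=\pm 1/2$, and on the bisecting lines ($q_1^2=q_2^2$) it gives $q_1^2=\epsilon/(1+B)=1/12$, i.e. $|{\bf q}|=\sqrt{2/12}=\sqrt{6}/6$. Linearizing the drift yields the eigenvalues $-2\epsilon$ and $\epsilon(1-B)$ at the fixed points on the axes, and $-2\epsilon$ and $2\epsilon(B-1)/(1+B)$ at those on the bisecting lines, so that for $B=2$ the former are stable and the latter are saddle points.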
342: 
343: Data sampled with time increment $10^{-1}$ 
344: were generated 
345: using a time step of 
346: $10^{-5}$ for the integration of the Langevin equations. 
347: The simulated time series II with $Q=0.05$ consists 
348: of $5\cdot 10^{6}$ data points. 
349: Figure~\ref{feb002.dat.1d} exhibits a segment of the generated data.
350: 
351: We analyzed the time series as outlined above. 
352: The state space in this case is divided into $100\times 100$ equidistant bins. 
353: Since the drift ${\bf D}^{1}({\bf q})$ can be evaluated 
354: from (\ref{add_final}) all parameters are fixed except for the 
355: noise strength $Q$.
356: 
357: The Kullback measure is then evaluated for various values of 
358: $Q$. The optimal value of $Q$ is
359: determined by the minimum of the 
360: Kullback distance. For the present case the minimum can easily be 
361: determined by graphical means.
362: 
363: \begin{figure}
364: \begin{center}
365: \includegraphics[width=8.6cm]{feb002.drift.eps}
366: \end{center}
367: \caption{Time series II: Drift vector field extracted from data
368: using the optimal value of $Q$. Unstable fixed points in the center and 
369: on the bisecting lines as well as 
370: the attractive fixed points are clearly visible.}
371: \label{feb002.drift}
372: \end{figure}
373: 
374: Fig.~\ref{feb002.out} shows the Kullback distance
375: $K(Q)$ as a function of the noise strength $Q$ for time series II.
376: The minimum is clearly visible at $Q=0.05$ and agrees with the value 
377: used for the simulation. With this parameter the drift 
378: vector field can be recalculated 
379: from the stationary distribution 
380: based on relation 
381:  (\ref{add_final}). 
382: The resulting drift vector field of dataset II is exhibited 
383: in fig.~\ref{feb002.drift}.
384: 
385: \section{Conclusion}
386: 
387: Summarizing, we have outlined an operational method for the estimation
388: of drift and diffusion terms from experimental time series of
389: stochastic Langevin processes. In contrast to previous approaches the 
390: present algorithm does not rely on estimating conditional moments in 
391: the limit of small time increments. Although this limit 
392: yields a first
393: approximation, an iterative refinement of the estimated stochastic
394: process is performed by minimization of the Kullback-Leibler distance between
395: estimated and measured two-time probability distributions. 
396: The proposed procedure thus addresses the problem of estimating drift and
397: diffusion terms of Langevin processes from time series sampled at finite rates. 
398: It involves the numerical solution of Langevin equations with
399: parameter dependent drift and diffusion terms, an evaluation of 
400: the Kullback-Leibler integral (which may be
401: determined by means of a Monte-Carlo method) and an 
402: optimization procedure, for which standard approaches
403: can be used. All involved steps are based on routine calculations.
404: Furthermore, restriction to certain classes of
405: stochastic processes like potential systems can drastically lower
406: the numerical effort of the procedure. Therefore, the proposed
407: algorithm can also be applied to systems with higher dimensional
408: state spaces. 
409: 
410: \begin{thebibliography}{}
411: \bibitem{Haken1} H. Haken, {\em Synergetics: Introduction and Advanced
412: 	Topics}, Springer Verlag Berlin Heidelberg New York (2004)
413: \bibitem{Risken} H. Risken, {\em The Fokker-Planck equation},
414: 	Springer-Verlag Berlin Heidelberg New-York Tokyo (1983)
415: \bibitem{Gardiner} C. W. Gardiner, {\em Handbook of Stochastic
416: 	Methods}, Springer-Verlag Berlin Heidelberg New-York Tokyo (1983)
417: \bibitem{Haken2} H. Haken, {\em Information and Self-Organization-
418: 	A macroscopic approach to complex systems},
419: 	Springer Verlag Berlin Heidelberg New York (2004)
420: \bibitem{Siegert1} S. Siegert, R. Friedrich, J. Peinke, 
421:   Phys. Lett. {\bf A 234},
422:       275-280 (1998)
423: \bibitem{Siegert2} R. Friedrich, S. Siegert, J. Peinke, St. L\"uck, 
424: 	M. Siefert, M. Lindemann, J. Raethjen, G. Deuschl, G. Pfister, 
425: 	Phys. Lett.
426: 	{\bf A 271}, 217 (2000)
427: \bibitem{Grad1} J. Gradisek, I. Grabec, S. Siegert, R. Friedrich,
428:   Mechanical Systems and Signal Processing {\bf 16}
429:         (5), 831 (2002)
430: \bibitem{Frank1} T. D. Frank, P. J. Beek, R. Friedrich, Phys. Lett. {\bf A
431: 	328}, 219 (2004); T. D. Frank, R. Friedrich, P. J. Beek, Stochastics and
432: 	Dynamics {\bf 9}, 44 (2004)
433: \bibitem{Sura1} P. Sura, S.T. Gille,  Journal of
434: 	Marine Research {\bf 61}, 313 (2003)
435: \bibitem{Sura2} P. Sura,  Journal of the Atmospheric Sciences
436: 	{\bf 60}, 654 (2003)
437: \bibitem{Kriso} S. Kriso, R. Friedrich, J. Peinke, P. Wagner, 
438:          Phys. Lett. {\bf A
439: 	299}, 287 (2002)
440: \bibitem{Kuusela04} T. Kuusela, Phys. Rev. {\bf E 69}, 031916 (2004)
441: \bibitem{PRL} R. Friedrich, J. Peinke,
442:         Phys. Rev. Lett. {\bf 78}, 863 (1997)
443: \bibitem{JFM} Ch. Renner, J. Peinke, R. Friedrich, 
444:          J. Fluid Mech. {\bf 433}, 383 (2001)
445: \bibitem{Tutku} M. Tutkun, L. Mydlarski,  New Journal
446: 	of Physics {\bf 6}, Art. No. 49 (2004)
447: \bibitem{PRLfinanz} R. Friedrich, J. Peinke, Ch. Renner, 
448:           Phys. Rev. Lett. {\bf 84}, 5224 (2000)
449: \bibitem{Jafari} G. R. Jafari, S. M. Fazeli, F. Ghasemi,
450: 	S. M. V. Allaei,
451: 	M. R. R. Tabar, A. I. Zad, G. Kavei, Phys. Rev. Lett. {\bf 91},
452: 	226101 (2003)
453: \bibitem{Waechter} M. W\"achter, F. Riess, H. Kantz, J. Peinke, 
454:         Europhys. Lett. {\bf 64}, 579 (2003)
455: \bibitem{Sura} P. Sura, J. Barsugli,  Phys. Lett. {\bf A 305},
456: 	304 (2002) 
457: \bibitem{Ragw} M. Ragwitz, H. Kantz,  Phys. Rev. Lett.
458: 	{\bf 87}, 254501 (2001) 
459: \bibitem{Kantzcom} R. Friedrich, Ch. Renner, M. Siefert, J. Peinke,
460: 	Phys. Rev. Lett. {\bf 89}, 149401 (2002)
461: \bibitem{Kantzrepl} M. Ragwitz, H. Kantz,
462: 	Phys. Rev. Lett. {\bf 89}, 149402 (2002)
463: \bibitem{Siefert} M. Siefert, A. Kittel, R. Friedrich, J. Peinke, 
464: 	 Europhys. Lett. {\bf 61} (4), 466 (2003)
465: \bibitem{weinstein90}
466: E. Weinstein, M. Feder, and A.~V. Oppenheim, IEEE Transactions on Acoustics,
467:   Speech and Signal Processing {\bf 38},  1652  (1990)
468: \end{thebibliography}
469: 
470: \end{document}
471: