math0603668/PS.tex
1: \documentclass[10pt]{article}
2: \usepackage[dvips]{graphicx}
3: \usepackage{amsmath,amsfonts,amssymb,latexsym,epsfig}
4: %\usepackage[notref,notcite]{showkeys}
5: \include{references}
6: \usepackage{mathrsfs}
7: \usepackage{verbatim}
8: \usepackage{latexsym}
9: \usepackage{amsthm}
10: \usepackage{amssymb}
11: \usepackage{graphics}
12: \usepackage{amsbsy}
13: %\usepackage{theorem}
14: %\usepackage{showlabels}
15: \usepackage{enumerate}
16: \usepackage{times}
17: %\newskip\structskipamount \structskipamount=1.5ex
18: %\newcommand{\structskip}{\par\ifdim\lastskip<\structskipamount
19: %  \removelastskip\penalty-100\vskip\structskipamount\fi}
20: 
21: \newtheorem{theorem}{Theorem}[section]
22: \newtheorem{lemma}[theorem]{Lemma}
23: \newtheorem{remark}[theorem]{Remark}
24: \newtheorem{corollary}[theorem]{Corollary}
25: \newtheorem{prop}[theorem]{Proposition}
26: \newtheorem{assumptions}[theorem]{Assumptions}
27: 
28: %\def\qed{\unskip\nobreak\hfil\penalty50\hskip2em\hbox{}\nobreak
29: %   \hfil\vrule width0.5em height 1.5ex depth0pt\kern2pt%
30: %   \parfillskip=0pt\finalhyphendemerits=0 \par}
31: %\newenvironment{proof}%
32: %  {\structskip\noindent\textbf{Proof.} \ignorespaces}%
33: %  {\qed\structskip}
34: 
35: \numberwithin{equation}{section}
36: %
37: \graphicspath{{figures/}}
38: %
39: \newcommand{\E}{{\mathbb E}}
40: \newcommand{\bbE}{{\mathbb E}}
41: \newcommand{\Ee}{{\mathbb E}^{\mu^\eps}}
42: \newcommand{\LL}{{\mathcal L}}
43: \newcommand{\KK}{{\mathcal K}}
44: \newcommand{\HH}{{\mathcal H}}
45: \newcommand{\T}{{\mathbb T}}
46: \newcommand{\R}{{\mathbb R  }}
47: \newcommand{\D}{{\mathcal D  }}
48: \newcommand{\RR}{{\mathcal R  }}
49: \newcommand{\pd}[2]{\frac{\partial #1}{\partial #2}}
50: \newcommand{\pdt}[1]{\frac{\partial #1}{\partial t}}
51: \newcommand{\pdtau}[1]{\frac{\partial #1}{\partial \tau}}
52: \newcommand{\pdd}[2]{\frac{\partial^2 #1}{\partial {#2}^2}}
53: \newcommand{\pddd}[3]{\frac{\partial^2 #1}{\partial {#2} \partial{#3}}}
54: \newcommand{\pdddd}[4]{\frac{\partial^2 #1}{\partial {#2} \partial{#3}
55: \partial{#4}}}
56: \newcommand{\brk}[1]{\left( #1 \right)}
57: \newcommand{\Brk}[1]{\left[ #1 \right]}
58: \newcommand{\px}{\partial_x}
59: \newcommand{\py}{\partial_y}
60: \newcommand{\bbT}{\mathbb{T}}
61: \newcommand{\bbR}{\mathbb{R}}
62: \newcommand{\cA}{\mathcal A}
63: \newcommand{\cL}{\mathcal L}
64: \newcommand{\cLo}{\cL^{OU}}
65: \newcommand{\rou}{\rho^{OU}}
66: \newcommand{\pit}{\hat{\pi}}
67: \newcommand{\piz}{\pi_0}
68: \newcommand{\eps}{\epsilon}
69: \newcommand{\xeps}{x^{\epsilon}}
70: \newcommand{\xepss}{x_s^{\epsilon}}
71: \newcommand{\yepss}{y_s^{\epsilon}}
72: \newcommand{\goup}{ e^{\frac{y^2}{2 D}}}
73: \newcommand{\goum}{ e^{-\frac{y^2}{2 D}}}
74: \newcommand{\la}{\langle}
75: \newcommand{\ra}{\rangle}
76: 
77: 
78: %
79: %   MAIN DOCUMENT
80: %
81: %
82: \begin{document}
83: %
84: %
85: %
86: \setlength{\baselineskip}{10pt}
87: \title{PARAMETER ESTIMATION FOR MULTISCALE DIFFUSIONS}
88: \author{G.A. Pavliotis\footnote{Corresponding author. 
89: E-mail address: g.pavliotis@maths.warwick.ac.uk.} \\
90:         Department of Mathematics\\
91:     Imperial College London \\
92:         London SW7 2AZ, UK \\
93:         and \\
94:         A.M. Stuart\footnote{E-mail address: stuart@maths.warwick.ac.uk.} \\
95:         Mathematics Institute \\
96:         Warwick University \\
97:         Coventry CV4 7AL, UK
98:                     }
99: \maketitle
100: 
101: \begin{abstract}
102: We study the problem of parameter estimation for time-series possessing two, widely
103: separated, characteristic time scales. The aim is to understand situations where it is
104: desirable to fit a homogenized singlescale model to such multiscale data. We
105: demonstrate, numerically and analytically, that if the data is sampled too finely then
106: the parameter fit will fail, in that the correct parameters in the homogenized model are
107: not identified. We also show, numerically and analytically, that if the data is
108: subsampled at an appropriate rate then it is possible to estimate the coefficients of the
109: homogenized model correctly.
110: \end{abstract}
111: 
112: \noindent {\bf Keywords:} Parameter estimation, multiscale diffusions, stochastic 
113: differential equations, homogenization, maximum likelihood, subsampling.
114: 
115: %
116: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
117: %
118: %                                      INTRODUCTION
119: %
120: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
121: %
122: \section{Introduction}
123: %
124: \label{sec:intro}
125: 
126: 
127: 
128: Parameter estimation for continuous time stochastic models is an increasingly important
129: part of the overall modelling strategy in a wide variety of applications. It is quite
130: often the case that the data to be fitted to a diffusion process has a multiscale
131: character. One example is the field of  molecular dynamics, where it is desirable to find
132: effective models for low dimensional phenomena (such as conformational dynamics, vacancy
133: diffusion and so forth) which are embedded within higher dimensional time-series. Another
134: example is the ocean--atmosphere sciences where it is desirable to find effective models
135: for large--scale structures, whilst representing the small--scales stochastically.  The
136: multiscale structure of the data in these problems renders the problem of parameter
137: estimation very subtle, and great care has to be taken in order to estimate the
138: coefficients correctly. The aim of the paper is to shed light on this estimation problem
139: through the study of a simple class of model problems, typical of those arising in
140: molecular dynamics.
141: 
142: In econometrics and finance, the problem of estimating parameters for continuous time
143: diffusion processes in the presence of small scale fluctuations (market microstructure
144: noise) has been considered by A\"{i}t--Sahalia and collaborators
145: \cite{AitMykZha05b,AitMykZha05a} and more recently in \cite{BaNiHaLuSh06}. In that work
146: the microscale is input as an independent white observational noise that is superimposed
147: on top of a singlescale diffusion process. We have a somewhat different framework: we
148: work in the context of coupled systems of diffusions exhibiting multiple scales. Our aim
149: is to fit a singlescale homogenized diffusion to data. Models similar to the ones
150: considered in this paper have been studied extensively in finance, see \cite{fouque00}
151: and the references therein. In that book there is discussion of parameter estimation for
152: multiscale diffusions, with emphasis on the estimation of the rate of mean reversion of
153: volatility from historical asset price data; see \cite[Ch. 4]{fouque00}.
154: 
155: Various numerical algorithms for
156: diffusions with multiple scales have been developed \cite{Vand03} and analyzed
157: \cite{ELV05}. Those papers are finely honed to optimize the fitting of the homogenized
158: diffusion in situations where the multiscale model is known explicitly. In contrast, in
159: this paper we introduce multiscale diffusions primarily as a device to generate
160: multiscale data; we do not assume that the multiscale model is available to us when doing
161: parameter estimation. This enables us to gain understanding of parameter estimation in
162: situations where the multiscale data is given to us from experiments, or comes from a
163: model where the scale--separation is not explicit. Two recent papers contain numerical
164: experiments relating to the extraction of averaged or homogenized diffusions from data
165: generated by a multiscale diffusion; see \cite{Cald06,CromVanEij06b}.
166: 
167: Despite differences from the framework used in
168: \cite{AitMykZha05b,AitMykZha05a,BaNiHaLuSh06} to study problems arising
169: in econometrics and finance, similarities with our work remain:
170: trying to fit the
171: models on the basis of data sampled at too high a frequency leads to incorrect parameter
172: inference; furthermore, there is an optimal subsampling rate for the data to obtain
173: correct inference.
174: 
175: There are two forms of multiscale diffusions which are of particular
176: interest in the context of parameter estimation. The first gives rise
177: to {\bf averaging} for SDEs, and the second to {\bf homogenization}
178: for SDEs. For averaging one has, for $\eps \ll 1$,
179: \begin{subequations}
180: \begin{eqnarray}
181:  d x^\eps(t) &=& f(x^\eps(t),y^\eps(t)) \, dt+\alpha(x^\eps(t),y^\eps(t)) \,  dU(t),  \\
182:  d y^\eps(t) &=& \frac{1}{\eps} g(x^\eps(t),y^\eps(t)) \, dt+\frac{1}{\sqrt\eps}
183:  \beta(x^\eps(t),y^\eps(t)) \, dV(t),
184: \end{eqnarray}
185: \label{e:averg}
186: \end{subequations}
187: with $U,V$ standard Brownian motions. Averaging $f$ and $\alpha \alpha^T$
188: over the invariant measure of the $y^\eps$ equation, with $x^\eps$ viewed as fixed,
189: gives an averaged SDE for $x$. The fast process $y$, with timescale $\eps$,
190: is eliminated. For homogenization one has
191: \begin{subequations}
192: \begin{eqnarray}
193: d x^\eps(t) &=&  \left( \frac{1}{\eps} f_0(x^\eps(t),y^\eps(t)) +
194: f_1(x^\eps(t),y^\eps(t)) \right) dt \nonumber \\
195: &+& \alpha(x^\eps(t),y^\eps(t)) \, dU( t),\\
196: d y^\eps(t) &=& \frac{1}{\eps^2} g(x^\eps(t),y^\eps(t)) \, dt +
197: \frac{1}{\eps} \beta(x^\eps(t),y^\eps(t)) \, dV(t),
198: \end{eqnarray}
199: \label{e:homog}
200: \end{subequations}
201: where it is assumed that $f_0$ averages to zero against the invariant measure
202: of the fast process $y^\eps$ with $x^\eps$ fixed.
203: Now $y^\eps$ has time-scale $\eps^2$ and is eliminated.
204: The fluctuations in $f_0$, suitably amplified by $\eps^{-1}$, induce ${\cal O}(1)$ effects
205: in the homogenized equation for $x^\eps$. In both cases \eqref{e:averg} and \eqref{e:homog} it is
206: possible to show \cite{lions} that the process $x^\eps(t)$ converges in law, as $\eps
207: \rightarrow 0$, to the solution of an effective SDE of the form
208: \begin{equation}\label{e:effect}
209: d x(t) = F(x(t)) dt + A(x(t)) d U(t).
210: \end{equation}
211: Explicit formulae can be derived for the effective coefficients $F(x)$ and $A(x)$ in the
212: above equation \cite{lions, PavlSt06b}. A natural question that arises then is how to fit
213: an SDE of the form \eqref{e:effect} to data generated by a multiscale stochastic equation
214: of the form \eqref{e:averg} or \eqref{e:homog}, under the assumption of scale separation,
215: i.e. when $\eps \ll 1$. This paper is a first attempt towards the study of this
216: interesting problem, for a specific class of SDEs of the form \eqref{e:homog}.
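
For orientation, the effective coefficients in the averaging case \eqref{e:averg} can be written down directly from the description above. The display below is only a sketch: the notation $\mu^x(dy)$ for the invariant measure of the fast process with $x$ frozen is introduced here for illustration and is not used elsewhere in the paper.
$$
F(x) = \int f(x,y) \, \mu^x(dy), \qquad A(x) A(x)^T = \int \alpha(x,y) \alpha(x,y)^T \, \mu^x(dy).
$$
In the homogenization case \eqref{e:homog} the effective coefficients involve, in addition, the solution of a Poisson equation for the generator of the fast process; see \cite{lions, PavlSt06b}.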
217: 
218: Our basic model will be the first order Langevin equation
219: %
220: \begin{equation}
221: d x^\eps(t) = - \nabla V \left(x^\eps(t), \frac{x^\eps(t)}{\eps}; \alpha \right)  dt +
222: \sqrt{2 \sigma } d \beta(t),
223: %
224: \label{e:main}
225: %
226: \end{equation}
227: %
228: where $\beta(t)$ denotes standard Brownian motion on $\R^d$ and $\sigma$ is a positive
229: constant. The two--scale potential $ V^\eps \left(x, y; \alpha \right)$ is assumed to
230: consist of a large--scale and a fluctuating part
231: %
232: \begin{equation}
233: %
234: V ( x, y ; \alpha) = \alpha V(x) + p(y).
235: %
236: \label{e:potential}
237: %
238: \end{equation}
239: %
240: As we show explicitly in \eqref{e:eqns_motion} this set-up puts us in
241: the framework of homogenization for SDEs.
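
Informally, the connection with \eqref{e:homog} can be seen by introducing the auxiliary variable $y^\eps(t) = x^\eps(t)/\eps$ (the display below is a sketch added for illustration). Since $\nabla$ in \eqref{e:main} acts on the composite function $x \mapsto V(x, x/\eps; \alpha)$, the drift is $-\alpha \nabla V(x^\eps) - \eps^{-1} \nabla p(y^\eps)$, and dividing the equation for $x^\eps$ by $\eps$ gives
$$
d y^\eps(t) = - \left( \frac{\alpha}{\eps} \nabla V(x^\eps(t)) + \frac{1}{\eps^2} \nabla p(y^\eps(t)) \right) dt + \frac{\sqrt{2 \sigma}}{\eps} \, d \beta(t).
$$
The fast variable thus evolves on the time--scale $\eps^2$, and the term $-\eps^{-1} \nabla p(y^\eps)$ in the equation for $x^\eps$ plays the role of $\eps^{-1} f_0$; note that here both equations are driven by the same Brownian motion.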
242: 
243: Under \eqref{e:potential}, the SDE \eqref{e:main} becomes
244: \begin{equation}
245: d x^\eps(t) = - \alpha \nabla V(x^\eps(t)) \, dt - \frac{1}{\eps}\nabla p \left(
246: \frac{x^\eps(t)} {\eps} \right) \, dt + \sqrt{2 \sigma} \, d \beta (t).
247: %
248: \label{e:xeps_V}
249: %
250: \end{equation}
251: If $p$ is periodic on $\bbT^d$ and sufficiently smooth, then it is well
252: known (see \cite{lions, pardoux} for example) that, as $\eps \rightarrow 0$,
253: the solution $x^\eps(t)$ of \eqref{e:main} converges in law to the
254: solution of the SDE
255: %
256: \begin{equation}
257:  d x(t) = -\alpha K \nabla V(x(t)) dt + \sqrt{2 \sigma K} d \beta (t),
258: \label{e:lim_sde}
259: \end{equation}
260: with
261: \begin{equation}
262: K = \int_{\T^d} \left( I + \nabla_y \phi(y) \right)  \left( I + \nabla_y \phi(y)
263: \right)^T \, \mu(dy) \label{e:coeffs}
264: \end{equation}
265: and
266: \begin{equation}
267: \mu(dy) = \rho(y) dy = \frac{1}{Z} e^{-p(y)/\sigma} \, dy, \quad Z = \int_{\T^d}
268: e^{-p(y)/\sigma} \, dy. \label{e:gibbs_torus}
269: \end{equation}
270: The field $\phi(y)$ is the solution of the Poisson equation
271: \begin{equation}
272: - \LL_0 \phi(y) = -\nabla_y p(y), \quad \LL_0 := - \nabla_y p(y) \cdot \nabla_y + \sigma
273: \Delta_y, \label{e:cell}
274: \end{equation}
275: with periodic boundary conditions. The function $\rho(y)$ spans the null-space of ${\cal
276: L}_0^*$, the $L^2$--adjoint of $\LL_0$. The effective diffusion tensor is positive
277: definite and the diffusivity is always depleted \cite{Oll94}. Physically this occurs
278: because the homogenized process must represent the cost of traversing the many small
279: energy barriers present in the original multiscale problem but which are not explicitly
280: captured in the homogenized potential.  In Figure \ref{fig:potential} we
281: plot the potential $V^\eps(x,x/\eps)$, as well as the average potential $V(x)$,
282: illustrating this phenomenon.  In fact, the effective diffusivity $\Sigma = \sigma K$
283: decays exponentially fast in $\sigma$ as $\sigma \rightarrow 0$.
284: See \cite{CampPiatn2002} and the references therein. Thus the original
285: and homogenized diffusivities are exponentially different at small
286: temperatures.
287: 
288: To illustrate these facts explicitly, consider the problem in one dimension, $d = 1$. In
289: this case the limiting equation takes the form
290: \begin{equation}
291:  d x(t) = - A  V'(x(t)) dt + \sqrt{2 \Sigma } d \beta (t).
292: \label{e:lim_sde_1d}
293: \end{equation}
294: The effective coefficients are
295: \begin{equation}
296: A = \frac{\alpha L^2}{Z \widehat{Z}} \quad \mbox{and} \quad \Sigma = \frac{\sigma L^2}{Z
297: \widehat{Z}}, \label{e:coeffs_1d}
298: \end{equation}
299: where
300: \begin{equation}
301: \widehat{Z} = \int_{0}^L e^{p(y)/\sigma} \, dy, \quad Z = \int_{0}^L e^{-p(y)/\sigma} \,
302: dy. \label{e:z_1d}
303: \end{equation}
304: \begin{figure}
305: \begin{center}
306: \includegraphics[width=3.0in, height = 3.0in]{potential3.eps}
307: \caption{$V^\eps(x, x/\eps) = \frac{1}{2} x^2 + \sin \left( \frac{x}{\eps} \right)$
308: with $ \eps = 0.1$ and averaged potential $V(x) = \frac{1}{2} x^2$.} \label{fig:potential}
309: \end{center}
310: \end{figure}
311: Notice that $L^2 \leq Z \widehat{Z}$ by the Cauchy--Schwarz inequality. This explicitly shows
312: that the homogenized equation in one dimension comprises motion in the average potential
313: $V(x)$, at a new slower time--scale contracted by $A/\alpha.$
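For completeness, the Cauchy--Schwarz step can be spelled out as follows (a short verification added for illustration):
$$
L = \int_0^L e^{\frac{p(y)}{2\sigma}} \, e^{-\frac{p(y)}{2\sigma}} \, dy \le \left( \int_0^L e^{p(y)/\sigma} \, dy \right)^{1/2} \left( \int_0^L e^{-p(y)/\sigma} \, dy \right)^{1/2} = \big( \widehat{Z} Z \big)^{1/2},
$$
with equality only when $p$ is constant. Hence $A \le \alpha$ and $\Sigma \le \sigma$ in \eqref{e:coeffs_1d}.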
314: 
315: 
316: 
317: The main results of the paper can be summarized as follows.
318: Assume that we are given a path $\{x^\eps(t)\}_{t \in [0,T]}$
319: of equation \eqref{e:xeps_V} and that we want to fit an SDE of the form
320: \eqref{e:lim_sde_1d} to the given data, estimating the parameters $A,\Sigma$
321: as $\widehat{A}, \widehat{\Sigma}$. Then the following is a loose
322: statement of our main results; these will be formulated precisely, and
323: proved, below.
324: 
325: %
326: \begin{theorem}
327: If we do not subsample, then the estimators $\widehat{A}$ and $\widehat{\Sigma}$ are
328: asymptotically biased -- they converge to $\alpha, \, \sigma$.
329: \end{theorem}
330: %
331: \begin{theorem}
332: If the sampling rate is between the two characteristic time scales of the SDE \eqref{e:main}
333: then the estimators $\widehat{A}$ and $\widehat{\Sigma}$ are
334: asymptotically unbiased -- they converge to $A, \, \Sigma$.
335: \end{theorem}
336: %
337: The rest of the paper is organized as follows. In section \ref{sec:estim} we present the
338: estimators that we will use. In section \ref{sec:numerics} we present various numerical
339: experiments illustrating the behaviour of these estimators.
340: In section \ref{sec:results} we state the main results of this paper,
341: explaining the numerical experiments from the previous section.
342: Section 5 contains some preliminary results that will be useful in the sequel.
343: Section 6 contains proofs of two central propositions concerning the behaviour
344: of the multiscale diffusion
345: when observed on time--scales long compared with the fast time--scales of the process, but
346: small compared with the slow time--scales of the process.  Section 7 is devoted to
347: the proofs of our theorems. Finally, section \ref{sec:conc} is devoted to some concluding
348: remarks.
349: 
350: In the sequel we use $\langle \cdot, \cdot \rangle$ to denote the standard inner--product
351: on $\bbR^d$ and $|\cdot|$ the induced Euclidean norm. Throughout the paper we make the
352: following standing assumptions on the drift vector fields:
353: %
354: \begin{assumptions}
355: \label{a:1}
356: The potentials $p$ and $V$ satisfy:
357: %
358: \begin{itemize}
359: %
360: \item $p(y) \in C^{\infty}_{per}(\bbT^d,\bbR)$;
361: %
362: \item $V(x) \in C^{\infty}(\R^d,\R);$
363: %
364: \item $|\nabla V(x_1)-\nabla V(x_2)| \le L |x_1-x_2| \quad \forall x_1,x_2 \in \R^d;$
365: %
366: \item $\exists a,b>0: \la -\nabla V(x),x \ra \le a-b|x|^2 \quad \forall x \in \R^d;$
367: %
368: \item $e^{-\frac{\alpha}{\sigma} V(x)} \in L^1(\R^d,\R^+)$.
369: %
370: \end{itemize}
371: %
372: \end{assumptions}
373: %
374: The third assumption will be used primarily to deduce that, by choice of
375: origin for $V$,
376: \begin{equation}
377: \label{e:linbnd}
378: |\nabla V(x)| \le L|x|.
379: \end{equation}
380: This assumption could be relaxed and replaced by a polynomial growth bound;
381: however this complicates the analysis without adding new insight.
382: Similarly it is not necessary, of course, that $V$ and $p$ are $C^{\infty}$.
383: The fourth condition, however, is essential:
384: it drives the ergodicity of the process which we use in a fundamental way in the analysis
385: of the drift parameter estimators; it would not, however, be fundamental for estimation
386: of diffusion coefficients alone. The fourth condition implies the fifth, which is simply
387: the requirement that the invariant measure is indeed a probability measure; we state the
388: two conditions separately for clarity of exposition.
389: %
390: %
391: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
392: %
393: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
394: %
395: \section{The Estimators}\label{sec:estim}
396: %
397: In this section we describe various estimators for the parameters
398: arising in equation \eqref{e:lim_sde}. We assume that we are given
399: a path $x=\{x(t)\}_{t \in [0,T]}$, or samples from such a path,
400: $x=\{x_n\}_{n=0}^N$, with $x_n=x(n\delta).$ For simplicity
401: we aim to fit the equation in the form
402: \begin{equation}
403:  d x (t) = -A \nabla V(x (t)) dt + \sqrt{2 \Sigma } d \beta (t),
404: \label{e:lim_sde_sim}
405: \end{equation}
406: where $A$ and $\Sigma$ are scalars. In one dimension this reduces to the form
407: \eqref{e:lim_sde_1d}. Note that in general this is only the correct form for the
408: homogenized equation in one dimension since, typically, the average potential has a
409: matrix as a pre--factor, as in \eqref{e:lim_sde}. However it suffices to exemplify the
410: main ideas in this work, and simplifies the presentation.
411: 
412: The standard way to estimate the diffusion coefficient
413: is via the quadratic variation of the path:
414: \begin{equation}
415: \widehat{\Sigma}_{N,\delta}(x) = \frac{1}{2 N \delta d} \sum_{n = 0}^{N-1}
416: |x_{n+1} -  x_n|^2.
417: \label{e:sigma_estim_1d}
418: \end{equation}
419: A key issue in this paper is to understand how to choose $\delta$
420: as a function of $\epsilon$ to ensure that data generated by \eqref{e:main}
421: can be effectively fit to obtain the correct homogenized diffusivity
422: in equations such as  \eqref{e:lim_sde_sim}.
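As a sanity check on the normalization in \eqref{e:sigma_estim_1d} (an illustrative calculation, under the simplifying assumption that the drift is switched off, $A = 0$), the increments of \eqref{e:lim_sde_sim} are $x_{n+1} - x_n = \sqrt{2 \Sigma} \, \big(\beta((n+1)\delta) - \beta(n \delta)\big)$, so that
$$
\E |x_{n+1} - x_n|^2 = 2 \Sigma \delta d \quad \mbox{and hence} \quad \E \, \widehat{\Sigma}_{N,\delta}(x) = \frac{1}{2 N \delta d} \, N \cdot 2 \Sigma \delta d = \Sigma.
$$
With a nonzero drift the same computation is expected to hold up to a correction that vanishes as $\delta \to 0$.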
423: 
424: The standard way to estimate drift coefficients is via the path-space likelihood of
425: \eqref{e:lim_sde_sim} with respect to a pure diffusion with no drift,
426: namely (see, for example, \cite{BasRao80,LipShir01a})
427: %
428: $$L(x) \propto  \exp\{-I(x)/2\Sigma\}$$
429: %
430: where
431: $$I(x)=\int_0^T\left\{ |A \nabla V(x (t))|^2dt+2A \la \nabla V(x (t)), d x (t) \ra
432: \right\}.$$
433: Maximizing the log-likelihood then gives
434: the estimate $\widehat{A}$ of $A$ given by
435: \begin{equation}
436: \widehat{A}(x) = - \frac{\int_0^T \la \nabla V(x(t)), d x(t)  \ra} {\int_0^T \big| \nabla
437: V(x(t)) \big|^2\, dt}. \label{e:a_est}
438: \end{equation}
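The maximization step can be made explicit (a brief sketch, using the form of $L(x)$ and $I(x)$ given above): up to an additive constant independent of $A$,
$$
\log L(x) = -\frac{1}{2 \Sigma} \left( A^2 \int_0^T |\nabla V(x(t))|^2 \, dt + 2 A \int_0^T \la \nabla V(x(t)), d x(t) \ra \right),
$$
which is a concave quadratic in $A$. Setting its derivative with respect to $A$ to zero gives $A \int_0^T |\nabla V(x(t))|^2 \, dt = - \int_0^T \la \nabla V(x(t)), d x(t) \ra$, which is \eqref{e:a_est}.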
439: If the data is given in discrete but finely spaced increments, as often happens
440: in practice, then this estimator can be approximated to yield
441: %
442: \begin{equation}
443: \widehat{A}_{N,\delta}(x) = - \frac{\sum_{n = 0}^{N-1} \la \nabla V(x_n),
444: \left(x_{n+1} - x_n \right)\ra}{\sum_{n=0}^{N-1} \left|\nabla V(x_n) \right|^2
445: \delta}.
446: %
447: \label{e:alpha_estim_1d}
448: %
449: \end{equation}
450: A key issue in this paper is to understand how to choose $\delta$ as a function of
451: $\epsilon$ to ensure that data generated by \eqref{e:main} can be effectively fit to
452: obtain the correct homogenized drift coefficients in equations such as
453: \eqref{e:lim_sde_sim}, via the estimator \eqref{e:alpha_estim_1d}.
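For example (an illustration, assuming $d=1$ and the quadratic potential $V(x) = \frac{1}{2} x^2$, so that $\nabla V(x) = x$), the estimator \eqref{e:alpha_estim_1d} reduces to
$$
\widehat{A}_{N,\delta}(x) = - \frac{\sum_{n=0}^{N-1} x_n (x_{n+1} - x_n)}{\delta \sum_{n=0}^{N-1} x_n^2},
$$
which is the least--squares slope obtained by regressing the increments $x_{n+1} - x_n$ on $-x_n \delta$.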
454: 
455: The gradient structure of the SDE \eqref{e:lim_sde_sim} can be used to obtain a second
456: estimator for the drift coefficients. This second estimator, which we now derive, is of
457: interest for two different reasons: firstly it may be useful in practice as it may lead
458: to smaller variance in estimators; secondly it highlights the fact that working out how
459: to sample the data to obtain the correct estimation of the diffusion coefficient alone
460: will lead to correct estimation of the drift parameters, at least for the
461: class of gradient--structure SDEs that we consider in this paper.
462: The second estimator requires
463: the input of an estimator $\widehat\Sigma$ for the diffusion coefficient and is
464: \begin{equation}
465: \tilde{A}(x)=\widehat{\Sigma}\frac{\frac{1}{T}\int_0^T \Delta V(x(t)) \, dt  }
466: {\frac{1}{T} \int_0^T |\nabla V(x(t))|^2 \, dt}. \label{eq:alpha2}
467: \end{equation}
468: Approximating to allow for the input of discrete--time data gives
469: %
470: \begin{equation}
471: \tilde{A}_{N,\delta}(x) =  \widehat{\Sigma}\frac{\sum_{n = 0}^{N-1} \Delta V(x_n) \delta}
472: {\sum_{n=0}^{N-1} \left|\nabla V(x_n) \right|^2 \delta}.
473: %
474: \label{e:alpha_estim_1d2}
475: \end{equation}
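As an illustration (assuming $d = 1$ and $V(x) = \frac{1}{2} x^2$, so that $\Delta V = 1$), the estimator \eqref{e:alpha_estim_1d2} becomes
$$
\tilde{A}_{N,\delta}(x) = \frac{\widehat{\Sigma}}{\frac{1}{N} \sum_{n=0}^{N-1} x_n^2}.
$$
This is consistent with the fact that the stationary variance of the Ornstein--Uhlenbeck process $dx = -A x \, dt + \sqrt{2 \Sigma} \, d\beta$ is $\Sigma/A$: the estimator divides the estimated diffusivity by the empirical second moment of the path.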
476: %
477: The following result shows that $\tilde{A}(x)$ is a natural approximation to $\widehat
478: A(x).$
479: %
480: \begin{prop}
481: Let $x=\{x(t)\}_{t \in [0,T]}$ satisfy \eqref{e:lim_sde_sim}. If $\widehat{\Sigma}=\Sigma$
482: then the estimator $\tilde A(x)$ is asymptotically equivalent to the maximum likelihood
483: estimator $\widehat{A}$:
484: $$
485: \lim_{T \rightarrow \infty} \tilde{A}(x) = \widehat{A}(x),\, a.s.
486: $$
487: \end{prop}
488: %
489: \proof We apply the It\^{o} formula to $V(x(t))$ for $x(t)$ solving \eqref{e:lim_sde_sim}
490: and use formula \eqref{e:a_est} to obtain
491: \begin{eqnarray*}
492: \widehat{A}(x) & = &
493: \frac{V(x(0)) - V(x(T)) + \Sigma \int_0^T \Delta V(x(t))
494: \, dt  }{\int_0^T |\nabla V(x(t))|^2 \, dt}
495: \\ & = & \frac{(V(x(0)) - V(x(T)))}{\int_0^T |\nabla V(x(t))|^2 \,
496: dt} + \frac{\frac{1}{T}\Sigma\int_0^T \Delta V(x(t)) \, dt  } {\frac{1}{T}
497: \int_0^T |\nabla V(x(t))|^2 \, dt}
498: \\ & = & \frac{\frac{1}{T} (V(x(0)) - V(x(T)))}{\frac{1}{T}\int_0^T |\nabla V(x(t))|^2 \,
499: dt} + \tilde{A}(x).
500: \end{eqnarray*}
501: Under the Assumptions \ref{a:1} it follows from \cite{Mao97} that
502: $$
503: \lim_{T \rightarrow \infty} \frac{\frac{1}{T} (V(x(0)) - V(x(T)))}{\frac{1}{T} \int_0^T
504: |\nabla V(x(t))|^2 \, dt} = 0,\, a.s.
505: $$
506: The result follows.
507: \qed
508: %
509: %
510: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
511: %
512: %                          NUMERICAL RESULTS
513: %
514: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
515: %
516: \section{Numerical Results}
517: \label{sec:numerics} In all cases we solve the multiscale SDE \eqref{e:main} using the
518: Euler--Maruyama scheme \cite{KlPl92} for a single realization of the noise, with a
519: time--step $\Delta t$ sufficiently small so that the error due to the discretization is
520: negligible; this requires that the time--step is small compared with $\eps^2,$ the
521: fastest scale in the problem. We also employ a sufficiently long time interval so that
522: the invariant measure is well sampled by the single path. Since the convergence to the invariant measure
523: is uniform in $\eps \to 0$, this is not prohibitive. We then use the data generated from
524: the multiscale process as input to the estimators for the homogenized diffusion
525: \eqref{e:lim_sde}. We present numerical results for three model problems: a one
526: dimensional monomial potential of even degree, a one dimensional bistable potential and a
527: two dimensional quadratic potential. In all three cases we perturb the large--scale part
528: of the potential $V$ by small--scale fast oscillations, usually in the form of a cosine
529: potential $p$.
530: 
531: We present two types of numerical results. Note that $\delta$, the time interval between
532: two consecutive observations, is the inverse sampling rate. In the first we use
533: $\delta=\Delta t$ as the time interval between two consecutive observations in the
534: estimators. In the second we subsample the data, using $\delta > \Delta t$ and study how
535: the estimated coefficients behave as a function of the subsampling. We use the data
536: generated from our simulation in the estimators \eqref{e:alpha_estim_1d} and
537: \eqref{e:alpha_estim_1d2} to estimate the drift coefficient and in \eqref{e:sigma_estim_1d}
538: to estimate the diffusion coefficient of \eqref{e:lim_sde_1d}. For the most part we work
539: in one dimension and fit a single drift and diffusion parameter so that \eqref{e:lim_sde}
540: becomes \eqref{e:lim_sde_1d}. When we work in more than one dimension, or
541: estimate more than just a single drift or diffusion parameter,
542: we use natural generalizations of the estimators defined in the previous
543: section.
544: 
545: Let us summarize the main conclusions that can be drawn from the numerical experiments;
546: recall that $\Delta t \ll \eps^2.$ First, if we choose $\delta = \Delta t$, that is, if
547: we do not subsample, then the resulting estimators do not generate the correct estimates
548: of the homogenized coefficients. If, on the other hand, we subsample with $\eps^2 \ll
549: \delta \ll \mathcal{O} (1),$ then  the estimators generate the values of the parameters
550: of the homogenized equation. Furthermore, there is an optimal sampling rate: there exists
551: a $\delta^*$ which minimizes the distance between the homogenized value of the parameter
552: and the value generated by the estimator. The optimal sampling rate depends sensitively
553: on $\sigma$. It is also of interest that, in higher dimensions, the optimal sampling rate
554: can be different for different parameters.
555: 
556: The above observations appear to hold independently of the detailed
557: form of the large--scale part of the potential $V$ (provided, of course,
558: that it satisfies appropriate convexity conditions).
559: In addition, the performance of the estimators seems to be the same
560: irrespective of the dimension of the problem.
561: 
562: Another interesting observation is that the second estimator for the drift coefficient
563: \eqref{e:alpha_estim_1d2} performs at least as well as the maximum
564: likelihood estimator \eqref{e:alpha_estim_1d}, and in some instances
565: outperforms it.
566: %
567: %
568: \subsection{Failure Without Subsampling}
569: \begin{figure}
570: \centerline{
571: \begin{tabular}{c@{\hspace{2pc}}c}
572: \includegraphics[width=2.7in, height = 2.7in]{a_estim_ou_vs_eps_a1_eps004_02_dt510_4T10_4.eps} &
573: \includegraphics[width=2.7in, height = 2.7in]{sigma_estim_ou_vs_eps_a1_eps004_02_dt510_4T10_4.eps} \\
574:  a.~~  $\widehat{A}$  & b.~~ $\widehat{\Sigma}$
575: \end{tabular}}
576: \begin{center}
577: \caption{ Estimation of the drift and diffusion coefficients vs $\eps$ for the potential \eqref{e:ou}.
578: Solid line: estimated coefficient. Dashed line: homogenized coefficient. Dotted line:
579: unhomogenized coefficient.}
580: %
581: \label{fig:vs_eps_no_subsam}
582: %
583: \end{center}
584: \end{figure}
585: %
586: \begin{figure}
587: \centerline{
588: \begin{tabular}{c@{\hspace{2pc}}c}
589: \includegraphics[width=2.7in, height = 2.7in]{a_estim_ou_vs_sig_a1_eps01_dt510_4T10_4.eps}
590:  & \includegraphics[width=2.7in, height = 2.7in]
591:  {sigma_estim_ou_vs_sig_a1_eps01_dt510_4T10_4.eps} \\
592:  a.~~  $\widehat{A}$  & b.~~ $\widehat{\Sigma}$
593: \end{tabular}}
594: \begin{center}
595: \caption{Estimation of the drift and diffusion coefficients vs $\sigma$ for the potential
596: \eqref{e:ou} with $\eps = 0.1$. Solid line: estimated coefficient. Dashed line:
597: homogenized coefficient. Dotted line: unhomogenized coefficient.}
598: \label{fig:vs_sigma_no_subsam}
599: \end{center}
600: \end{figure}
601: In this section we study the estimators
602: $\widehat{A}$ and $\widehat{\Sigma}$
603: when the data is given from the solution of equation \eqref{e:xeps_V}
604: with $\eps \ll 1$ and $\Delta t=\delta$ -- no subsampling is used.
605: We use the potential
606: \begin{equation}\label{e:ou}
607: V(x) = \frac{1}{2} x^2.
608: \end{equation}
609: The small--scale part of the potential is
610: \begin{equation}\label{e:cos}
611: p(y) =  \cos ( y ).
612: \end{equation}
613: In Figure \ref{fig:vs_eps_no_subsam} we plot the estimators $\widehat{A}$ and
614: $\widehat{\Sigma}$ for various values of $\eps$. For comparison we also plot the
615: homogenized coefficients $A$ and $\Sigma$ and the unhomogenized coefficients $\alpha$ and
616: $\sigma$. We observe that the estimators always give us the coefficients $\alpha$ and
617: $\sigma$ of the original SDE \eqref{e:xeps_V}. In particular, the performance of the
618: estimators does not improve as $\eps \rightarrow 0$. In Figure
619: \ref{fig:vs_sigma_no_subsam} we plot the estimators for various values of the diffusion
620: coefficient $\sigma$. We notice that the estimators give the values of the coefficients
621: $\alpha$ and $\sigma$, for all values of $\sigma$. Since the homogenized coefficients
622: decay to $0$ exponentially fast as $\sigma \rightarrow 0$, the results of Figure
623: \ref{fig:vs_sigma_no_subsam} indicate that the estimators are wrong by an exponentially
624: large factor when $\sigma \ll 1$.
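To give a sense of the size of the discrepancy (an illustrative calculation based on \eqref{e:coeffs_1d}, taking the period to be $L = 2\pi$ for $p(y) = \cos(y)$; the numerical values are rounded), note that
$$
Z = \widehat{Z} = \int_0^{2\pi} e^{\pm \cos(y)/\sigma} \, dy = 2 \pi I_0(1/\sigma), \quad \mbox{so that} \quad \frac{\Sigma}{\sigma} = \frac{A}{\alpha} = \frac{1}{I_0(1/\sigma)^2},
$$
where $I_0$ is the modified Bessel function of the first kind. For $\sigma = 0.5$ this ratio is approximately $0.19$, while for $\sigma = 0.1$ it is of order $10^{-7}$. Using $I_0(z) \sim e^{z}/\sqrt{2 \pi z}$ as $z \to \infty$ gives $\Sigma/\sigma \approx (2\pi/\sigma) \, e^{-2/\sigma}$ for small $\sigma$.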
625: 
626: These results indicate the need to subsample -- i.e. to choose $\delta$
627: appropriately as a function of $\epsilon$.
628: %
629: %
630: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
631: \subsection{Success With Subsampling}
632: Now, rather than using all the data that were generated from the solution of equation
633: \eqref{e:main} we use only a fraction of them. We choose $\delta$ in the estimators
634: \eqref{e:sigma_estim_1d}, \eqref{e:alpha_estim_1d} and \eqref{e:alpha_estim_1d2} as
635: follows:
636: $$
637: \Delta t_{sam}=\delta = 2^k \Delta t, \quad k=0, \, 1, \, 2, \dots,
638: $$
639: and we study the performance of the estimators as a function of the sampling rate. We
640: investigate this issue for three different model problems.
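
To fix ideas, the subsampled observations that enter the estimators can be written out as follows. This is only an illustrative sketch; the value of $\Delta t$ used in the arithmetic is an assumption made for the example. The retained data points are
$$
x_n = x^\eps(n \delta) = x^\eps(n \, 2^k \Delta t), \quad n = 0, \, 1, \dots, N,
$$
so that only every $2^k$-th point of the computed trajectory is used. If, for instance, $\Delta t = 10^{-3}$, then $k = 7, \, 8, \, 9$ correspond to
$$
\delta = 2^7 \Delta t = 0.128, \quad \delta = 2^8 \Delta t = 0.256, \quad \delta = 2^9 \Delta t = 0.512 .
$$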
641: \subsubsection{OU Processes in 1D}
642: \begin{figure}
643: \centerline{
644: \begin{tabular}{c@{\hspace{2pc}}c}
645: \includegraphics[width=2.7in, height = 2.7in]{alpha_estim_ou_sig05_a1_ep01_dt001T2.10+4.eps}
646: & \includegraphics[width=2.7in, height = 2.7in]
647: {sigma_estim_ou_sig05_a1_ep01_dt001T2.10+4.eps} \\
648: a.~~  $\widehat{A}$  & b.~~ $ \widehat{\Sigma}$
649: \end{tabular}}
650: \begin{center}
651: \caption{Estimation of the drift and diffusion coefficients vs $\Delta t_{sam}$
652: for the potential \eqref{e:ou} with $\eps = 0.1$.  Solid line: estimated coefficient.
653: Dashed line: homogenized coefficient. Dotted line: unhomogenized coefficient.}
654: \label{fig:ou_sig05}
655: \end{center}
656: \end{figure}
657: %
658: \begin{figure}[t]
659: \centerline{
660: \begin{tabular}{c@{\hspace{2pc}}c}
661: \includegraphics[width=2.7in, height = 2.7in]{sigma_estim_ou_sig07_a1_eps01_dt001T10+4.eps}
662: & \includegraphics[width=2.7in, height = 2.7in]{sigma_estim_ou_sig1_a1_eps01_dt001T10+4.eps}
663: \\  a.~~  $\sigma=0.7$  & b.~~ $\sigma = 1.0$
664: \end{tabular}}
665: \begin{center}
666: \caption{Estimation of the diffusion coefficient vs $\Delta t_{sam}$ for the potential
667: \eqref{e:ou} with $\eps = 0.1$, for two different values of $\sigma$.  Solid line: estimated
668: coefficient. Dashed line: homogenized coefficient. Dotted line: unhomogenized coefficient.}
669: \label{fig:ou_sig_07_1}
670: \end{center}
671: \end{figure}
672: %
673: \begin{figure}[t]
674: \centerline{
675: \begin{tabular}{c@{\hspace{2pc}}c}
676: \includegraphics[width=2.7in, height = 2.7in]
677: {a_estim_ou_vs_sigma_sampl_eps01_a1_ep004_02_dt0005T210+4.eps} &
678: \includegraphics[width=2.7in, height = 2.7in]
679: {sigma_estim_ou_vs_sigma_sampl_eps01_a1_ep004_02_dt0005T210+4.eps} \\
680:  a.~~  $\widehat{A}$  & b.~~ $\widehat{\Sigma}$
681: \end{tabular}}
682: \begin{center}
683: \caption{Estimation of the drift and diffusion coefficient vs $\sigma$ for the potential
684: \eqref{e:ou} with $\eps = 0.1, \, \alpha = 1.0$, for three different sampling rates.
685: Solid line: $\Delta t_{sam} = 0.128$. Dash--dotted line: $\Delta t_{sam} = 0.256$. Dotted
686: line: $\Delta t_{sam} = 0.512$. Dashed line: homogenized coefficient. }
687: \label{fig:ou_vs_sig_sam}
688: \end{center}
689: \end{figure}
690: We study the problem in one dimension with the large--scale part of the potential given
691: by \eqref{e:ou} and with the fluctuating part being the cosine potential \eqref{e:cos}.
692: The two estimators $\widehat{A}$ and $\tilde{A}$ for the drift coefficient produce almost
693: identical results and we only present results for the maximum likelihood estimator
694: $\widehat{A}$. In Figure \ref{fig:ou_sig05} we present the estimated values of the drift
695: and diffusion coefficients as a function of the inverse sampling rate $\delta = \Delta
696: t_{sam}$ when $\eps = 0.1, \, \alpha = 1.0, \, \sigma = 0.5$. We observe that, provided
697: that we subsample at an appropriate rate, we are able to estimate the parameters of the
698: homogenized equation correctly. Notice also that the estimators for the drift and the
699: diffusion coefficient show very similar dependence on the sampling rate. This is in
700: accordance with our theoretical results; see Theorem \ref{prop:drift_estim_2}.
701: 
702: In Figure \ref{fig:ou_sig_07_1} we plot $\widehat{\Sigma}$ as a function of the sampling
703: rate for two different values of $\sigma$. We observe that the estimator of the diffusion
704: coefficient is a decreasing function of the sampling rate, as expected. In addition to
705: this, there is a well defined optimal sampling rate, which depends sensitively on
706: $\sigma$. In particular the optimal $\delta$ is a decreasing function of $\sigma$. This
707: is to be expected, since when $\sigma \gg 1$ the process $x^\eps(t)$ loses its multiscale
708: character and becomes effectively a standard Brownian motion. Consequently, when $\sigma$
709: is sufficiently large, the optimal $\delta$ becomes $\Delta t$, the integration time
710: step. Notice furthermore that the slope of the $\widehat{\Sigma}-\delta$
711: curve depends on $\sigma$.
712: 
713: In Figure \ref{fig:ou_vs_sig_sam} we plot the estimators of the drift and diffusion
714: coefficients versus $\sigma$, for three different sampling rates. For comparison we also
715: plot the homogenized coefficients. We observe that all three sampling rates lead to
716: reasonably accurate estimates for $A$ and $\Sigma$, when $\sigma$ is not too small. On
717: the other hand, the estimators become less accurate as $\sigma \rightarrow 0$. This is
718: also to be expected: when $\sigma \ll 1$, the accurate simulation of \eqref{e:main}
719: requires a very small time step; moreover, the equation has to be solved over a very long
720: time interval in order for the invariant measure of the process to be well represented.
Hence, our hypothesis that the errors due to discretization and finite time of
integration are small is not valid. In addition, as $\sigma$ tends to $0$, the optimal sampling
rate increases, and becomes much larger than the coarsest sampling rate that we use in the
simulations.
725: 
726: In Figure \ref{fig:ou_vs_eps_sam} we plot the estimators versus $\eps$, for three
727: different values of the sampling rate. As expected, the deviation of the estimated values
728: of the drift and diffusion coefficients from the homogenized values is an increasing
729: function of $\epsilon$. On the other hand, the optimal sampling rate does not appear to
730: depend sensitively on $\eps$: it is always the same sampling rate that minimizes the
731: distance between the estimated coefficient and the homogenized one, for all values of
732: $\eps$.
733: %
734: \begin{figure}[t]
735: \centerline{
736: \begin{tabular}{c@{\hspace{2pc}}c}
737: \includegraphics[width=2.7in, height = 2.7in]
738: {a_estim_ou_vs_eps_sampl_sig05_a1_ep004_02_dt0005T210+4.eps} &
739: \includegraphics[width=2.7in, height = 2.7in]
740: {sigma_estim_ou_vs_eps_sampl_sig05_a1_ep004_02_dt0005T210+4.eps} \\
741:  a.~~  $\widehat{A}$  & b.~~ $\widehat{\Sigma}$
742: \end{tabular}}
743: \begin{center}
744: \caption{Estimation of the drift and diffusion coefficient vs $\eps$ for the potential
745: \eqref{e:ou} with $\alpha = 1.0, \, \sigma = 0.5$, for three different sampling rates.
746: Solid line: $\Delta t_{sam} = 0.128$. Dash--dotted line: $\Delta t_{sam} = 0.256$. Dotted
747: line: $\Delta t_{sam} = 0.512$. Dashed line: homogenized coefficient.}
748: \label{fig:ou_vs_eps_sam}
749: \end{center}
750: \end{figure}
751: %
752: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
753: %
754: \subsubsection{A Bistable Potential}
755: We consider equation \eqref{e:main} in one dimension with a mean potential of
756: the bistable form
757: \begin{equation}\label{e:pot_bistable}
758: V(x; \alpha, \beta) = - \frac{1}{2} \alpha x^2 + \frac{1}{4} \beta x^4.
759: \end{equation}
760: The fluctuating part of the potential is given by  \eqref{e:cos}. The homogenized
761: equation is
762: %
763: \begin{equation}\label{e:hom_bistable}
764: d X(t) = ( A X(t) - B X(t)^3 ) dt + \sqrt{2 \Sigma} d \beta(t),
765: \end{equation}
766: where the homogenized coefficients are given by
767: $$
768: A = \alpha K, \quad B = \beta K, \quad \Sigma = \sigma K,
769: \quad K = \frac{4 \pi^2}{Z \widehat{Z}},
770: $$
771: where $Z$ and $\widehat{Z}$ are given by \eqref{e:z_1d} with $L = 2 \pi$ and $p(y) = \cos(y)$.
772: We will estimate the diffusion coefficient using formula \eqref{e:sigma_estim_1d} with $d
773: = 1$. For the two parameters of the drift we use generalizations of the maximum
774: likelihood estimator $\widehat{A}$.
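For the cosine potential the constant $K$ can be written in closed form. The following computation is a sketch which assumes that \eqref{e:z_1d} defines $Z = \int_0^L e^{-p(y)/\sigma} \, dy$ and $\widehat{Z} = \int_0^L e^{p(y)/\sigma} \, dy$, in analogy with the two dimensional formulas used later in this section; $I_0$ denotes the modified Bessel function of the first kind of order zero, a symbol introduced only for this illustration. Since $\int_0^{2 \pi} e^{\pm z \cos(y)} \, dy = 2 \pi I_0(z)$, we obtain
\begin{equation*}
Z = \widehat{Z} = 2 \pi I_0 \left( \frac{1}{\sigma} \right), \quad
K = \frac{4 \pi^2}{Z \widehat{Z}} = \frac{1}{I_0(1/\sigma)^2}.
\end{equation*}
Using $I_0(z) \sim e^{z}/\sqrt{2 \pi z}$ as $z \rightarrow \infty$ this gives
\begin{equation*}
K \sim \frac{2 \pi}{\sigma} \, e^{-2/\sigma} \quad \mbox{as } \sigma \rightarrow 0,
\end{equation*}
which makes explicit the exponential decay of the homogenized coefficients for small $\sigma$.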
775: 
776: In Figures \ref{fig:bistable:A_B_05} and \ref{fig:bistable:A_B_07} we present the
777: estimators for the two drift coefficients versus the sampling rate, for two different
778: values of $\sigma$. We observe that the performance of the estimators is qualitatively
779: similar to the OU case. Notice also that the optimal sampling rate is
780: approximately the same for both coefficients.
781: 
782: In Figure \ref{fig:bistable:sigma} we plot the estimator for the diffusion coefficient
versus the sampling rate, for two different values of $\sigma$. The conclusions reached
from the numerical study of $\widehat{\Sigma}$ for the one dimensional OU process carry
over almost verbatim to this case.
786: 
787: \begin{figure}[t]
788: \centerline{
789: \begin{tabular}{c@{\hspace{2pc}}c}
790: \includegraphics[width=2.7in, height = 2.7in]{a_estim_bistable_sig05_a1_b2_ep01_dt001.eps} &
791: \includegraphics[width=2.7in, height = 2.7in]{beta_estim_bistable_sig05_a1_b2_ep01_dt001.eps} \\
792: a.~~  $\widehat{A}$ vs $\Delta t_{sam}$  & b.~~ $ \widehat{B}$ vs $\Delta t_{sam}$
793: \end{tabular}}
794: \begin{center}
795: \caption{ Estimation of the parameters of the bistable potential \eqref{e:pot_bistable}
796: as a function of the sampling rate for $\sigma = 0.5, \,\eps = 0.1$. Solid line:
797: estimated coefficient. Dashed line: homogenized coefficient. Dotted line: unhomogenized
798: coefficient.} \label{fig:bistable:A_B_05}
799: \end{center}
800: \end{figure}
801: %
802: \begin{figure}[t]
803: \centerline{
804: \begin{tabular}{c@{\hspace{2pc}}c}
805: \includegraphics[width=2.7in, height = 2.7in]{a_estim_bistable_sig07_a1_b2_ep01_dt001.eps} &
806: \includegraphics[width=2.7in, height = 2.7in]{beta_estim_bistable_sig07_a1_b2_ep01_dt001.eps} \\
807: a.~~  $\widehat{A}$ vs $\Delta t_{sam}$  & b.~~ $ \widehat{B}$ vs $\Delta t_{sam}$
808: \end{tabular}}
809: \begin{center}
810: \caption{Estimation of the parameters of the bistable potential \eqref{e:pot_bistable} as a
811: function of the sampling rate for $\sigma = 0.7, \,\eps = 0.1$. Solid line: estimated coefficient.
812: Dashed line: homogenized coefficient. Dotted line: unhomogenized coefficient.}
813: %
814: \label{fig:bistable:A_B_07}
815: %
816: \end{center}
817: \end{figure}
818: %
819: \begin{figure}[t]
820: \centerline{
821: \begin{tabular}{c@{\hspace{2pc}}c}
822: \includegraphics[width=2.7in, height = 2.7in]
823: {sigma_estim_bistable_sig05_a1_b2_ep01_dt001.eps} &
824: \includegraphics[width=2.7in, height = 2.7in]
825: {sigma_estim_bistable_sig07_a1_b2_ep01_dt001.eps} \\
826: a.~~  $\sigma = 0.5$  & b.~~ $ \sigma =0.7$
827: \end{tabular}}
828: \begin{center}
829: \caption{Estimation of the diffusion coefficient for the bistable potential
830: \eqref{e:pot_bistable} as a function of the sampling rate for $\alpha = 1.0,
831: \, \beta = 2.0, \,\eps = 0.1$. Solid line: estimated coefficient. Dashed line:
832: homogenized coefficient. Dotted line: unhomogenized coefficient.} \label{fig:bistable:sigma}
833: \end{center}
834: \end{figure}
835: %
836: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
837: %
838: %
839: \subsubsection{A Quadratic Potential in 2D}
We now consider \eqref{e:main} in two dimensions with a separable fast potential $p(y)$:
841: %
842: \begin{equation}\label{e:2dim}
d x^\eps(t) = - \nabla V(x^\eps(t), B) \, dt - \frac{1}{\epsilon}\nabla p_1 \left(
\frac{x^\eps_1(t)}{\eps} \right) \, dt -\frac{1}{\epsilon}\nabla p_2 \left(
\frac{x^\eps_2(t)}{\eps} \right)  \, dt + \sqrt{2 \sigma } \, d \beta (t),
846: \end{equation}
847: %
848: where $B$ is the set of the drift parameters that we wish to estimate. The homogenized
849: equation reads
850: \begin{equation}\label{e:homog_2d}
851: d X (t) = - K \nabla V(X (t), B) dt + \sqrt{2 \sigma K} \, d \beta (t),
852: \end{equation}
853: where
854: \begin{equation}\label{e:tensor_2d}
855: K = \left( \begin{array}{cc}
856: \frac{L^2}{Z_1 \widehat{Z}_1} & 0  \\
857: 0 & \frac{L^2}{Z_2 \widehat{Z}_2}
858: \end{array} \right)
859: \end{equation}
860: and
861: \begin{eqnarray*}
862: Z_i = \int_0^L e^{- \frac{p_i(y_i)}{\sigma}} \, dy_i, \quad
863:  \widehat{Z}_i = \int_0^L e^{\frac{p_i(y_i)}{\sigma}} \, dy_i, \; \; i=1,2.
864: \end{eqnarray*}
865: %
866: In the above $L$ denotes the period of $p(y)$.
867: 
868: We will consider the case of a general quadratic potential in two dimensions:
869: \begin{equation}\label{e:pot_2d}
870: V(x, B) = \frac{1}{2} x^T B x,
871: \end{equation}
872: with $B$ symmetric positive-definite. For the fluctuations we will use a simple
873: two--dimensional extension of the cosine potential
874: \eqref{e:cos}:
875: $$
876: p_1(y_1) = \cos(y_1), \; p_2(y_2) =  \frac{1}{2}\cos(y_2).
877: $$
878: Our goal is to estimate the diffusion tensor and the drift coefficients.
879: We will estimate the diffusion tensor through the quadratic variation:
880: \begin{equation}
881: \widehat{\Sigma}_{N,\delta}(x(t)) = \frac{1}{2 N \delta } \sum_{n = 0}^{N-1}
882: (x_{n+1} -  x_n ) \otimes (x_{n+1} -  x_n ),
883: \label{e:sigma_estim_dd}
884: \end{equation}
885: where $\otimes$ stands for the tensor product.
886: For simplicity we will assume that the
887: diffusion tensor in our model is diagonal. This is consistent with the
888: homogenized diffusion tensor, see eq.  \eqref{e:tensor_2d}.
889: We will use generalizations of the maximum likelihood estimator
890: $\widehat{A}$ in order to estimate the parameters of the quadratic potential.
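For concreteness, the estimator \eqref{e:sigma_estim_dd} can be written in components. The superscript notation $x^i_n$ for the $i$-th component of $x_n$ is introduced only for this illustration:
\begin{equation*}
\bigl( \widehat{\Sigma}_{N,\delta} \bigr)_{ij} = \frac{1}{2 N \delta } \sum_{n = 0}^{N-1}
\bigl( x^i_{n+1} -  x^i_n \bigr) \bigl( x^j_{n+1} -  x^j_n \bigr), \quad i, \, j = 1, \, 2.
\end{equation*}
As a check on the normalization, consider the simplest case $x(t) = \sqrt{2 \sigma} \, \beta(t)$, with no drift. The increments over a time interval of length $\delta$ are then independent Gaussian vectors with
\begin{equation*}
\E \Bigl[ \bigl( x^i_{n+1} -  x^i_n \bigr) \bigl( x^j_{n+1} -  x^j_n \bigr) \Bigr] =
\left\{ \begin{array}{ll} 2 \sigma \delta, & i = j, \\ 0, & i \neq j, \end{array} \right.
\end{equation*}
so that $\E \, \widehat{\Sigma}_{N,\delta} = \sigma I$ for every $N$ and $\delta$.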
891: 
892: \begin{figure}[t]
893: \centerline{
894: \begin{tabular}{c@{\hspace{2pc}}c}
895: \includegraphics[width=2.7in, height = 2.7in]{sigma11_estim_2d_sig05_a2_b2_c3_ep01_dt001.eps} &
896: \includegraphics[width=2.7in, height = 2.7in]{sigma22_estim_2d_sig05_a2_b2_c3_ep01_dt001.eps} \\
897: a.~~  $\widehat{\Sigma}_{11}$  & b.~~ $ \widehat{\Sigma}_{22} $
898: \end{tabular}}
899: \begin{center}
900: \caption{Estimation of the non--zero elements of the diffusion tensor for the 2d quadratic potential
901: \eqref{e:pot_2d} as a function of the sampling rate for $B_{11}= B_{12} = B_{21} = 2, \, B_{22} = 3,
902: \, \sigma = 0.5, \, \eps = 0.1$. Solid line: estimated coefficient.
903: Dashed line: homogenized coefficient. Dotted line: unhomogenized coefficient.}
904: \label{fig:2d_sigma}
905: \end{center}
906: \end{figure}
907: \begin{figure}
908: \centerline{
909: \begin{tabular}{c@{\hspace{2pc}}cc}
910: \includegraphics[width=2.7in, height = 2.7in]{alpha11_estim_2d_sig05_a2_b2_c3_ep01_dt001.eps} &
911: \includegraphics[width=2.7in, height = 2.7in]{alpha12_estim_2d_sig05_a2_b2_c3_ep01_dt001.eps} \\
912: a.~~  $\widehat{B}_{11}$  & b.~~ $ \widehat{B}_{12} $ \\
913: \includegraphics[width=2.7in, height = 2.7in]{alpha21_estim_2d_sig05_a2_b2_c3_ep01_dt001.eps} &
914: \includegraphics[width=2.7in, height = 2.7in]{alpha22_estim_2d_sig05_a2_b2_c3_ep01_dt001.eps} \\
c.~~  $\widehat{B}_{21}$  & d.~~ $ \widehat{B}_{22} $
916: \end{tabular}}
917: \begin{center}
918: \caption{Estimation of the parameters of the 2d quadratic potential
919: \eqref{e:pot_2d} as a function of the sampling rate for $\sigma = 0.5, \, \eps = 0.1$.
920: Solid line: estimated coefficient. Dashed line: homogenized coefficient.
921: Dotted line: unhomogenized coefficient. }
922: \label{fig:2d_alpha}
923: \end{center}
924: \end{figure}
925: In Figure \ref{fig:2d_sigma} we present the estimated values of the two non--zero
926: components of the diffusion tensor versus the sampling rate\footnote{The estimated value of
927: the off--diagonal elements is almost
928: $0$ for all values of the sampling rate, in accordance with the theoretical result
929: \eqref{e:tensor_2d}.}. The performance of the estimator for the diffusion tensor is,
930: qualitatively at least, similar to its performance in the one dimensional problems
931: considered in the previous two subsections. Notice, however, that the optimal sampling
932: rate is quite different for the two non--zero components of the diffusion tensor.
933: 
934: In Figure \ref{fig:2d_alpha} we present the estimated values of the four drift
935: coefficients. The results are in accordance with the one dimensional theory developed in
936: this paper, as well as with the numerical experiments shown in one dimension. We remark
937: that the estimators capture successfully the fact that the homogenized matrix $B$ is not
938: symmetric. Notice furthermore that, as for the diffusion matrix, the optimal sampling
939: rate is different for different components of the matrix $B$.
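The lack of symmetry can be made explicit. The following is a sketch under the assumptions that $L = 2 \pi$ and that the estimated drift matrix is to be compared with the product $K B$ appearing in \eqref{e:homog_2d}; $I_0$ denotes the modified Bessel function of the first kind of order zero, which is not used elsewhere in this paper. Since $\int_0^{2 \pi} e^{\pm z \cos(y)} \, dy = 2 \pi I_0(z)$, the diagonal entries of \eqref{e:tensor_2d} are
\begin{equation*}
K_{11} = \frac{1}{I_0(1/\sigma)^2}, \quad K_{22} = \frac{1}{I_0(1/(2 \sigma))^2},
\end{equation*}
and
\begin{equation*}
K B = \left( \begin{array}{cc}
K_{11} B_{11} & K_{11} B_{12}  \\
K_{22} B_{21} & K_{22} B_{22}
\end{array} \right),
\end{equation*}
which fails to be symmetric whenever $K_{11} \neq K_{22}$ and $B_{12} \neq 0$. For $\sigma = 0.5$ and $B_{11}= B_{12} = B_{21} = 2, \, B_{22} = 3$ one finds, to three decimal places, $K_{11} \approx 0.192$, $K_{22} \approx 0.624$ and
\begin{equation*}
K B \approx \left( \begin{array}{cc}
0.385 & 0.385  \\
1.248 & 1.872
\end{array} \right).
\end{equation*}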
940: 
941: Thus, in this simple two dimensional multiscale model, the optimal sampling
942: rate is different in different directions. This suggests
943: that extreme care has to be taken when estimating parameters for multidimensional,
944: multiscale stochastic processes.
945: %
946: %
947: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
948: %
949: %
950: \subsection{The Second Estimator for the Drift Coefficient}
In this section we compare the performance of the two estimators
for the drift coefficient, namely $\widehat{A}$ and $\tilde{A}$ given by
953: equations \eqref{e:alpha_estim_1d} and \eqref{e:alpha_estim_1d2} respectively. We estimate
954: the drift parameter of
955: \eqref{e:main} in one dimension for a quartic and a sixth--degree large--scale potential
956: $V(x)$:
957: \begin{equation}\label{e:pot_quartic}
958: V(x) = \frac{1}{4} \alpha x^4
959: \end{equation}
960: and
961: \begin{equation}\label{e:pot_six}
962: V(x) = \frac{1}{6} \alpha x^6.
963: \end{equation}
In both cases the small scale fluctuations are represented by the cosine potential
\eqref{e:cos}. In Figure \ref{fig:alpha2_four} we present the estimated values of the
966: drift coefficient as a function of the sampling rate for two different $\sigma$ for the
967: quartic potential \eqref{e:pot_quartic}. We also plot the effective and the unhomogenized
968: values of the drift coefficient. Similar results for the sixth--degree potential
969: \eqref{e:pot_six} are presented in Figure \ref{fig:alpha2_six}. In both cases we observe
970: that the alternative estimator $\tilde{A}$ performs better than $\widehat{A}$ in this
971: situation where the data is subsampled.
972: \begin{figure}[t]
973: \centerline{
974: \begin{tabular}{c@{\hspace{2pc}}c}
975: \includegraphics[width=2.7in, height = 2.7in]{alpha_estim_quartic_sig05_a1_eps01_dt001.eps} &
976: \includegraphics[width=2.7in, height = 2.7in]{alpha_estim_quartic_sig07_a1_eps01_dt001.eps} \\
977: a.~~  $\sigma = 0.5$   & b.~~ $ \sigma = 0.7$
978: \end{tabular}}
979: \begin{center}
980: \caption{ Estimation of the drift coefficients for the quartic potential
981: \eqref{e:pot_quartic} as a function of the sampling rate for $ \eps = 0.1$. Solid line:
$\widehat{A}$. Dash--dotted line: $\tilde{A}$. Dashed line: homogenized coefficient. Dotted
line: unhomogenized coefficient.}
984: %
985: \label{fig:alpha2_four}
986: %
987: \end{center}
988: \end{figure}
989: %
990: \begin{figure}[t]
991: \centerline{
992: \begin{tabular}{c@{\hspace{2pc}}c}
993: \includegraphics[width=2.7in, height = 2.7in]{alpha_estim_sixth_sig05_a1_eps01_dt001.eps} &
994: \includegraphics[width=2.7in, height = 2.7in]{alpha_estim_sixth_sig07_a1_eps01_dt001.eps} \\
995: a.~~  $\sigma = 0.5$   & b.~~ $ \sigma = 0.7$
996: \end{tabular}}
997: \begin{center}
998: \caption{ Estimation of the drift coefficients for the sixth--degree potential
999: \eqref{e:pot_six} as a function of the sampling rate for $\eps = 0.1$. Solid line:
1000: $\widehat{A}$. Dash-dotted line: $\tilde{A}$. Dashed line: homogenized coefficient.
Dotted line: unhomogenized coefficient.}
1002: %
1003: \label{fig:alpha2_six}
1004: %
1005: \end{center}
1006: \end{figure}
1007: %
1008: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1009: %
1010: %                          STATEMENT OF RESULTS
1011: %
1012: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1013: %
1014: \section{Statement of Main Results}
1015: \label{sec:results}
1016: 
In this section we present theorems which substantiate the numerical
observations in the preceding section.
1019: The first result shows that, without subsampling, the parameter estimators for the
1020: homogenized model will be asymptotically biased: they recover the parameters from the
1021: unhomogenized equations.
1022: \begin{theorem}
1023: \label{thm:est_ddim} Let $x^\eps(t)$ be the solution of \eqref{e:xeps_V} with $x^\eps
1024: (0)$ distributed according to the invariant measure of the process. Then the estimator
1025: \eqref{e:a_est} satisfies
1026: \begin{equation}\label{e:a_est_lim}
1027: \lim_{\eps \rightarrow 0}\lim_{T \rightarrow \infty} \widehat{A}(x^{\eps}) = \alpha
1028: \quad \mbox{a.s.}
1029: \end{equation}
1030: Fix $T = N \delta$ in \eqref{e:sigma_estim_1d}. Then for every $\eps > 0$ we have
1031: \begin{equation}\label{e:sigma_est_lim}
1032: \lim_{N \rightarrow \infty} \widehat{\Sigma}_{N, \delta}(x^{\eps}) = \sigma \quad
1033: \mbox{a.s.}
1034: \end{equation}
1035: \end{theorem}
1036: %
1037: Now consider the one dimensional problem
1038: %
1039: \begin{equation}
1040: d x^\eps (t) = - \alpha V'(x^\eps(t)) dt - \frac{1}{\eps} p' \left(
1041: \frac{x^\eps(t)}{\eps} \right) dt + \sqrt{2 \sigma} d \beta (t). \label{e:xeps_est}
1042: \end{equation}
1043: %
1044: 
1045: The next two results show that, with appropriate subsampling, the estimators
1046: recover the correct drift and diffusion coefficients for the homogenized
1047: model \eqref{e:lim_sde_1d} when taking data from the unhomogenized
1048: equation \eqref{e:xeps_est}.
1049: %
1050: \begin{theorem}\label{thm:par_est_alpha}
1051: Let $x^\eps(t)$ be the solution of \eqref{e:xeps_est} with $x^\eps (0)$ distributed
1052: according to the invariant measure of the process. Further, let
1053: $\delta = \eps^\alpha, \, \alpha \in (0 , 1 )$ and $N = \left[ \eps^{-\gamma} \right], \,
1054: \gamma > \alpha,$ where $[\cdot]$ denotes the integer part of a number. Then
1055: \begin{equation}
1056: \lim_{\eps \rightarrow 0} \widehat{A}_{N, \delta} (x^\eps) = A \quad \mbox{in law,}
1057: \label{e:alpha_lim}
1058: \end{equation}
1059: where $A$ is given by \eqref{e:coeffs_1d}.
1060: \end{theorem}
1061: %
1062: \begin{theorem}
1063: \label{thm:par_est_sigma} Let $x^\eps(t)$ be the solution of \eqref{e:xeps_est} with
1064: $x^\eps (0)$ distributed according to the invariant measure of the process. Fix $ T = N
1065: \delta$ with $\delta = \eps^\alpha$ and $\alpha \in (0 , 1)$. Then
1066: %
1067: \begin{equation}
1068: \lim_{\eps \rightarrow 0} \widehat{\Sigma}_{N, \delta} (x^\eps) = \Sigma \quad
1069: \mbox{in law,}
1070: %
1071: \label{e:sigma_lim}
1072: %
1073: \end{equation}
1074: where $\Sigma$ is given by \eqref{e:coeffs_1d}.
1075: \end{theorem}
1076: %
1077: \begin{remark}
1078: The two previous results require $\epsilon/\delta \to 0$ as $\epsilon \to 0.$ In view of
the fact that the fast time--scale is ${\cal O}(\epsilon^2)$ (see equation
\eqref{e:yeps_eqn}) we might expect that this could be relaxed to
1081: $\epsilon^2/\delta \to 0$
1082: as $\epsilon \to 0.$ However we have not been able to prove this.
1083: See Remark \ref{r:label} for further discussion of this point.
1084: \end{remark}
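\begin{remark}
The scalings in Theorems \ref{thm:par_est_alpha} and \ref{thm:par_est_sigma} can be checked by a short computation, recorded here only as an illustration. With $\delta = \eps^\alpha$ and $N = \left[ \eps^{-\gamma} \right]$ we have, as $\eps \rightarrow 0$,
\begin{equation*}
\frac{\eps}{\delta} = \eps^{1 - \alpha} \rightarrow 0 \; \Longleftrightarrow \; \alpha < 1, \quad
\frac{\eps^2}{\delta} = \eps^{2 - \alpha} \rightarrow 0 \; \Longleftrightarrow \; \alpha < 2, \quad
N \delta \approx \eps^{\alpha - \gamma} \rightarrow \infty \; \Longleftrightarrow \; \gamma > \alpha.
\end{equation*}
Thus the relaxed condition $\eps^2 / \delta \rightarrow 0$ would correspond to the larger range $\alpha \in (0,2)$, while the condition $\gamma > \alpha$ in Theorem \ref{thm:par_est_alpha} ensures that the observation time $N \delta$ tends to infinity.
\end{remark}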
1085: The final result concerns the second drift estimator and again concerns
1086: input of data from the unhomogenized equation \eqref{e:xeps_est} into the
parameter estimator for the homogenized equation \eqref{e:lim_sde_1d}.
1088: It requires an estimate of the
1089: diffusion coefficient, $\widehat{\Sigma}.$ If $\widehat{\Sigma} = \sigma$, then we
1090: estimate the drift coefficient incorrectly with $\tilde A(x^{\eps})$; on the other
1091: hand, if $\widehat{\Sigma} = \Sigma$, then the estimator $\tilde A(x^{\eps})$ gives
1092: the drift of the homogenized equation. (To see the last result recall that
1093: $A/\Sigma=\alpha/\sigma$, see \eqref{e:coeffs_1d}). Consequently, for
1094: multiscale gradient systems, it is sufficient only to subsample in a
1095: fashion which leads to the
1096: correct diffusion coefficient. This offers a clear computational advantage.
1097: %
1098: \begin{theorem}\label{prop:drift_estim_2}
1099: Let $x^\eps(t)$ be the solution of \eqref{e:xeps_est} with $x^\eps (0)$ distributed
1100: according to the invariant measure of the process. Assume that the diffusion coefficient
1101: has been estimated to be $\widehat{\Sigma}$. Then
1102: $$\lim_{\eps \to 0}\lim_{T \to \infty}\tilde A(x^\eps)
1103: =\frac{\widehat\Sigma}{\sigma}\alpha \quad \mbox{in law.}$$
1104: \end{theorem}
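\begin{remark}
The two cases discussed before Theorem \ref{prop:drift_estim_2} follow from its statement in one line. The sketch below assumes that the homogenized coefficients have the form $A = \alpha K$ and $\Sigma = \sigma K$ for some scalar $K$, as in the one dimensional examples of the previous section. Then the limit in the theorem is
\begin{equation*}
\frac{\widehat{\Sigma}}{\sigma} \, \alpha =
\left\{ \begin{array}{ll}
\alpha, & \mbox{if } \widehat{\Sigma} = \sigma, \\
\displaystyle \frac{\sigma K}{\sigma} \, \alpha = \alpha K = A, & \mbox{if } \widehat{\Sigma} = \Sigma.
\end{array} \right.
\end{equation*}
\end{remark}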
1105: %
1106: %
1107: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1108: %
1109: %                                 PRELIMINARY RESULTS
1110: %
1111: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1112: %
1113: \section{Preliminary Results}
1114: \label{sec:prelim}
1115: In this section we collect various results that will be used in the proof of our main
1116: theorems.
1117: We start by investigating some of the properties of the invariant measures of the
1118: unhomogenized and of the homogenized equation.
1119: We then introduce some tools useful in the study of homogenization for
1120: SDEs.
1121: %
1122: \begin{prop}
1123: %
1124: \label{prop:gibbs}
1125: The invariant measure of the homogenized equation \eqref{e:lim_sde}
1126: is the Gibbs measure
1127: \begin{equation}
1128: \mu(dx) = \rho(x) dx = \frac{1}{Z} e^{-\alpha V(x)/\sigma} \, dx,
1129: \quad Z = \int_{\R^d} e^{-\alpha V(x)/\sigma} \, dx.
1130: \label{e:gibbs}
1131: \end{equation}
1132: The Markov process $x(t)$ given by \eqref{e:lim_sde} is geometrically ergodic: there are
1133: $C,\, \lambda>0$ such that, for every measurable $f(x)$ satisfying
1134: $$
1135: |f(x)| \leq 1 + |x|^p,
1136: $$
for some integer $p > 0$, we have, for $\mu$-a.e. $x(0)$,
$$
\left| \E f(x(t)) - \int_{\R^d} f(x) \rho(x) \, dx  \right| \leq
C\bigl(1+|x(0)|^p \bigr)e^{- \lambda t},
1141: $$
1142: where $\E$ denotes expectation with respect to Wiener measure.
1143: \end{prop}
1144: 
1145: \proof Assumptions \ref{a:1}, together with the formulae for the effective drift and the
1146: effective diffusion coefficient, equation \eqref{e:coeffs}, imply
1147: that the solution $x(t)$ of the homogenized equation \eqref{e:lim_sde} has a unique
1148: invariant measure with smooth density.  The Gibbs measure \eqref{e:gibbs} satisfies
1149: $$\alpha \nabla V \rho+\sigma \nabla \rho=0$$
1150: and hence
1151: $$K\Bigl(\alpha \nabla V \rho+\sigma \nabla \rho\Bigr)=0.$$
1152: Because $K$ is constant we deduce that
1153: $$\alpha K\nabla V \rho+\nabla \cdot \bigl(\sigma K \rho\bigr)=0.$$
1154: Thus
1155: $$\nabla \cdot \Bigl(\alpha K\nabla V \rho+\nabla \cdot \bigl(\sigma K \rho\bigr)\Bigr)=0.$$
1156: This is the stationary Fokker-Planck equation for \eqref{e:lim_sde}
1157: showing that the Gibbs measure $\rho$ is indeed an invariant measure.
1158: For the geometric ergodicity we use \cite[Thm 5.3]{MattStuHigh02}.
1159: \qed
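\begin{remark}
As a simple check of \eqref{e:gibbs}, consider $d = 1$ and the quadratic potential \eqref{e:ou}, $V(x) = \frac{1}{2} x^2$ in the present normalization. Then
\begin{equation*}
\rho(x) = \sqrt{\frac{\alpha}{2 \pi \sigma}} \, e^{- \frac{\alpha x^2}{2 \sigma}},
\end{equation*}
a centered Gaussian density with variance $\sigma / \alpha$. On the other hand, assuming that the homogenized equation is the OU process $d X(t) = - A X(t) \, dt + \sqrt{2 \Sigma} \, d \beta(t)$ with $A = \alpha K$ and $\Sigma = \sigma K$, its stationary variance is
\begin{equation*}
\frac{\Sigma}{A} = \frac{\sigma K}{\alpha K} = \frac{\sigma}{\alpha},
\end{equation*}
independently of $K$, in agreement with \eqref{e:gibbs}.
\end{remark}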
1160: %
1161: \begin{prop}\label{lem:xeps_meas_ddim}
1162: The invariant measure of the unhomogenized equation \eqref{e:xeps_V} is the Gibbs measure
1163: %
1164: \begin{equation}
\mu^\eps(dx) = \rho^\eps(x) \, dx = \frac{1}{Z^\eps} e^{-\frac{\alpha}{ \sigma}V(x) -
\frac{1}{\sigma} p \left(\frac{x}{\eps} \right)} \, dx, \quad Z^\eps := \int_{\R^d}
e^{-\frac{\alpha}{ \sigma} V(x) - \frac{1}{\sigma} p \left(\frac{x}{\eps} \right)} \, dx.
1168: \label{e:xeps_inv_meas_ddim}
1169: \end{equation}
1170: %
1171: For every $\eps > 0$ the Markov process \eqref{e:xeps_V} is geometrically
1172: ergodic: there are $C,\lambda>0$ such that, for every measurable $f(x)$
1173: satisfying
1174: $$
1175: |f(x)| \leq 1 + |x|^p,
1176: $$
1177: for some integer $p>0$ we have, for $\mu^{\eps}-$a.e. $x^{\eps}(0)$,
1178: $$
\left| \E f(x^\eps(t)) - \int_{\R^d} f(x) \rho^\eps(x) \, dx  \right|
1180:  \leq C\bigl(1+|x^{\eps}(0)|^p\bigr)e^{- \lambda t},$$
1181: where $\E$ denotes expectation with respect to Wiener measure.
1182: 
1183: Furthermore, the measure $\mu^\eps$ converges weakly to the invariant measure of the
1184: homogenized dynamics $\mu$ given by \eqref{e:gibbs}.
1185: \end{prop}
1186: %
1187: \proof Assumptions \ref{a:1} imply that $x^\eps(t)$ is an ergodic Markov process. Direct
1188: calculation with the Fokker--Planck equation shows that the unique invariant measure of
1189: the process is the Gibbs measure
1190: %
1191: \begin{eqnarray*}
1192: \rho^\eps(x) \, dx & = & \frac{1}{Z^\eps} e^{- \frac{1}{\sigma}
1193:  V \left( x, \frac{x}{\eps}, \alpha \right)} \, dx
1194: \\ & = & \frac{1}{Z^\eps} e^{- \frac{\alpha}{\sigma}
1195:  V(x)- \frac{1}{\sigma} p\left(  \frac{x}{\eps} \right)} \, dx,
1196: \end{eqnarray*}
1197: with $Z^\eps$ given by \eqref{e:xeps_inv_meas_ddim}.
1198: For the geometric ergodicity we use \cite[Thm 5.3]{MattStuHigh02}.
1199: 
1200: Now let
1201: %
1202: $$
1203: u(x,y):= e^{- \frac{\alpha}{ \sigma} V(x) - \frac{1}{\sigma}p(y)}.
1204: %
1205: $$
1206: %
1207: Since $u(x,y) \in L^1(\R^d ; C_{per}(\T^d))$, by \cite[Lem. 9.1]{cioran} we have that
1208: %
1209: $$
1210: u \left(\cdot, \frac{\cdot}{\eps}  \right) \rightharpoonup \int_{\T^d}
1211: u(\cdot, y) \, dy, \quad \mbox{weakly in } L^1(\R^d).
1212: $$
1213: %
1214: In particular, since $1 \in L^{\infty}(\R^d)$,
1215: %
1216: $$
\lim_{\eps \rightarrow 0} Z^\eps =  \int_{\R^d} \int_{\T^d} e^{-
\frac{\alpha}{\sigma}V(x) - \frac{1}{\sigma} p(y)} \, dy \, dx.
1219: $$
1220: We combine the above two results to conclude that
1221: $$
1222: \rho^\eps(x) \rightharpoonup \frac{1}{Z} e^{-\frac{\alpha}{ \sigma} V(x)}, \quad
1223: \mbox{weakly in } L^1(\R^d),
1224: $$
1225: %
1226: where $Z$ is given by \eqref{e:gibbs}. The weak convergence of the densities in
1227: $L^1(\R^d)$ implies the weak convergence of the corresponding probability measures. \qed
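\begin{remark}
The last step of the above proof rests on a cancellation which we spell out here. The symbol $\bar{Z}_p$ is introduced only for this illustration:
\begin{equation*}
\bar{Z}_p := \int_{\T^d} e^{- \frac{1}{\sigma} p(y)} \, dy.
\end{equation*}
The two limits established in the proof read
\begin{equation*}
u \left(\cdot, \frac{\cdot}{\eps}  \right) \rightharpoonup \bar{Z}_p \, e^{- \frac{\alpha}{\sigma} V(\cdot)}, \quad
\lim_{\eps \rightarrow 0} Z^\eps = \bar{Z}_p \int_{\R^d} e^{- \frac{\alpha}{\sigma} V(x)} \, dx = \bar{Z}_p \, Z,
\end{equation*}
with $Z$ as in \eqref{e:gibbs}. Consequently
\begin{equation*}
\rho^\eps(x) = \frac{1}{Z^\eps} \, u \left(x, \frac{x}{\eps}  \right) \rightharpoonup
\frac{\bar{Z}_p \, e^{- \frac{\alpha}{\sigma} V(x)}}{\bar{Z}_p \, Z} = \frac{1}{Z} e^{- \frac{\alpha}{\sigma} V(x)},
\end{equation*}
so that the contribution of the fast potential $p$ drops out of the limiting density.
\end{remark}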
1228: 
1229: \begin{remark}
1230: The assumption of stationarity of the process $x^\eps(t)$ is not necessary for the proof of
1231: the above theorems and is only made for simplicity. Indeed, in the next section we prove that
1232: $x^\eps (t)$ is geometrically ergodic and consequently it converges to its invariant
1233: distribution exponentially fast for arbitrary initial conditions.
1234: Furthermore, the fact that the invariant measure of the process
1235: $x^\eps(t)$ converges weakly, as $\eps \rightarrow 0$,
1236: to the invariant measure of the homogenized process is important for
1237: us as many of our results will be deduced by taking expectations with respect to the
1238: invariant measure $\mu^{\eps}(dx)$ of the multiscale dynamics \eqref{e:xeps_V}. The weak
1239: convergence alluded to demonstrates that the measure $\mu^{\eps}$ behaves uniformly in
1240: $\eps \to 0.$
1241: \end{remark}
1242: 
1243: An immediate corollary of the above proposition is that $x^\eps(t)$ has bounded moments
1244: of all orders. We will use the notation $\bbE^{\mu^{\epsilon}}$ to denote expectation
1245: with respect to the stationary measure of \eqref{a:1} on path space, when initial data is
1246: distributed according to the Gibbs measure \eqref{e:xeps_inv_meas_ddim}.
1247: %
1248: \begin{corollary}\label{cor:moments}
Let $x^\eps(t)$ be the solution of \eqref{e:main} with the potential given by
\eqref{e:potential} and assume that Assumptions \ref{a:1} are satisfied. Assume
furthermore that $x^\eps(0)$ is distributed according to $\mu^\eps$. Then, for all $p \ge
1,$ there is a constant $C=C(p,T)$, uniform in $\epsilon \to 0$, such that
1253: %
1254: $$\bbE^{\mu^{\epsilon}}|x^\eps(t)|^p \le C \quad \forall \, t \in [0,T].$$
1255: %
1256: \end{corollary}
1257: 
1258: It is convenient for the subsequent analysis to introduce the auxiliary
1259: variable
1260: $$
1261: y^\eps (t) = \frac{x^\eps(t)}{\eps}.
1262: $$
1263: We can then write equation \eqref{e:xeps_V} in the form
1264: \begin{subequations}
1265: \begin{equation}
1266: d x^\eps(t) = - \alpha \nabla V(x^\eps(t)) \, dt -  \frac{1}{\eps} \nabla p \left( y^\eps
1267: (t) \right) \, dt + \sqrt{2 \sigma} \, d \beta (t),
1268: %
1269: \label{e:xeps_eqn}
1270: %
1271: \end{equation}
1272: \begin{equation}
1273: d y^{\eps}(t) = - \frac{1}{\eps} \alpha \nabla V(x^\eps(t)) \, dt -  \frac{1}{\eps^2}
1274: \nabla p \left( y^\eps (t)  \right) \, dt + \sqrt{\frac{2 \sigma}{\eps^2}} \, d \beta
1275: (t).
1276: %
1277: \label{e:yeps_eqn}
1278: %
1279: \end{equation}
1280: \label{e:eqns_motion}
1281: \end{subequations}
1282: %
1283: Notice that both processes $x^\eps (t)$ and $y^\eps (t)$ are driven by the same Brownian
1284: motion. Written in this fashion it is clear that we are in a situation
1285: where homogenization applies. The homogenized equation is found by
1286: eliminating $y^{\eps}(t)$ from the scale separated system
1287: for $\left\{ x^{\eps}(t), y^{\eps}(t) \right\}$. Note that
1288: ${\cal L}_0$ defined in \eqref{e:cell} is the generator of the process
1289: \begin{equation*}
1290: d y (t) = - \nabla p \left( y (t)  \right) \, dt + \sqrt{2 \sigma} \, d \beta (t),
1291: \end{equation*}
1292: on the unit torus, which governs the dynamics of $y_t^{\eps}$ to leading order in
1293: $\epsilon$. The generator of the joint process $\{x^\eps(t), \, y^\eps_t \}$ reads
1294: $$\LL^\eps=\frac{1}{\eps^2} \LL_0 + \frac{1}{\eps} \LL_1 + \LL_2,$$
1295: where
1296: \begin{align*}
1297: \LL_0&= - \nabla_y p(y)\cdot \nabla_y + \sigma \Delta_y,\\
1298: \LL_1&=  - \nabla_y p(y) \cdot \nabla_x - \alpha \nabla_x V(x)
1299: \cdot \nabla_y + 2 \sigma \nabla_x \cdot \nabla_y,\\
1300: \LL_2&=  - \alpha \nabla_x V(x) \cdot \nabla_x + \sigma \Delta_x.
1301: \end{align*}
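%
To see where this decomposition comes from, we sketch the calculation for a smooth test
function $f(x,y)$ (the symbol $f$ is introduced here only for illustration). The drift of
$x^\eps$ in \eqref{e:xeps_eqn} is $-\alpha \nabla V(x) - \frac{1}{\eps} \nabla p(y)$, the
drift of $y^\eps$ in \eqref{e:yeps_eqn} is $\frac{1}{\eps}$ times this, and both components
are driven by the same noise, with amplitudes $\sqrt{2 \sigma}$ and $\sqrt{2 \sigma}/\eps$.
Hence
\begin{align*}
\LL^\eps f = {} & \Bigl( - \alpha \nabla_x V(x) - \frac{1}{\eps} \nabla_y p(y) \Bigr) \cdot
\nabla_x f + \Bigl( - \frac{\alpha}{\eps} \nabla_x V(x) - \frac{1}{\eps^2} \nabla_y p(y)
\Bigr) \cdot \nabla_y f \\
& + \sigma \Delta_x f + \frac{2 \sigma}{\eps} \nabla_x \cdot \nabla_y f +
\frac{\sigma}{\eps^2} \Delta_y f,
\end{align*}
and collecting the terms of order $\eps^{-2}$, $\eps^{-1}$ and $1$ recovers $\LL_0$, $\LL_1$
and $\LL_2$, respectively. The mixed second order term in $\LL_1$ is present precisely
because the noise is common to both equations.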
1302: 
1303: The following result can be found in, e.g. \cite[Ch. 3]{lions}.
1304: \begin{lemma}
1305: \label{l:Poisson} Assume that $p(y) \in C^{\infty}_{per}(\T^d,\R)$ and that $H(y) \in
1306: C^{\infty}_{per}(\T^d,\R^d).$ Let $\mu(dy)$ be the Gibbs measure \eqref{e:gibbs_torus}
1307: and assume that $H(y)$ is centered with respect to $\mu(dy)$:
1308: \begin{equation}\label{e:centering}
1309: \int_{\T^d} H(y) \, \mu(dy) = 0.
1310: \end{equation}
1311:  Then the Poisson equation
1312: \begin{equation}\label{e:poisson}
1313: - \LL_0 \chi = H(y),
1314: \end{equation}
1315: has a unique mean-zero solution in $L^2_{per}(\T^d, \mu(dy) ; \R^d)$.
1316: This solution, together with all its derivatives, is bounded.
1317: \end{lemma}
1318: 
1319: We will need an estimate on integrals whose integrand is centered with respect to the
1320: invariant measure  $\mu(dy)$.
1321: %
1322: \begin{lemma}\label{lem:ito}
1323: Let $H(y) \in C^\infty_{per} \left(\T^d ; \R^d \right)$ satisfy condition
1324: \eqref{e:centering}.  Assume that  $x^\eps (0)$ is distributed according
1325: to \eqref{e:xeps_inv_meas_ddim}. Then the
1326: following estimate holds for any $p>1$ and $T >0$:
1327: \begin{equation*}
1328:  \E^{\mu^\eps} \left| \int_{0}^{T} H(y^\eps(s)) \, ds \right|^p \leq
1329: C \left(\eps^{2p} + \eps^pT^p+\eps^p T^{\frac{p}{2}}  \right).
1330: \end{equation*}
1331: \end{lemma}
1332: %
1333: \proof Consider the Poisson equation \eqref{e:poisson} with periodic boundary conditions.
1334: Since $H(y)$ satisfies \eqref{e:centering}, Lemma \ref{l:Poisson} applies and we have
1335: that $\chi(y)$ is smooth and bounded, together with all its derivatives. We now apply the
1336: It\^{o} formula to $\chi(y^\eps (t))$, where $y^\eps (t)$ is the solution of
1337: \eqref{e:yeps_eqn}, and use \eqref{e:poisson} to obtain
1338: %
1339: \begin{align*}
1340: \int_{0}^{T} H(y^\eps(s)) \, ds =& - \eps^2 \left( \chi(y^\eps(T))  -
1341: \chi(y^\eps(0))   \right)\\
1342: & + \eps \sqrt{2 \sigma} \int_{0}^{T} \langle \nabla_y \chi(y^\eps(s)), \, d \beta(s)
1343: \rangle -\alpha \eps \int_0^T \langle \nabla V(x^\eps(s)),
1344: \nabla \chi(y^\eps(s)) \rangle ds.
1345: \end{align*}
1346: %
1347: Now, using the boundedness of $\chi$, we have, for
1348: $$I(T) :=\E^{\mu^\eps} \left| \int_{0}^{T} H(y^\eps(s)) \, ds \right|^p,$$
1349: \begin{eqnarray*}
1350: I(T) & \leq & C \left( \eps^{2 p} + \eps^{p} \E^{\mu^\eps}\left|\int_0^T |\nabla
1351: V(x^{\eps}(s))| ds\right|^p+ \eps^{p} \E^{\mu^\eps} \left| \int_{0}^{T} \langle \nabla_y
1352: \chi(y^\eps(s)) , \, d \beta (s) \rangle \right|^p \right)
1353: %
1354: \\ & \leq &
1355: %
1356:  C \left( \eps^{2 p} + \eps^{p} T^{p-1} \int_0^T \E^{\mu^\eps} |x^{\eps}(s)|^p ds+
1357: \eps^{p} T^{\frac{p}{2} -1} \int_{0}^{T} \E^{\mu^\eps} \left| \nabla_y \chi(y^\eps(s))
1358: \right|^p \, ds \right)
1359: %
1360: \\ & \leq &
1361: %
1362: C \left( \eps^{2 p} + \eps^p T^p+\eps^{p} T^{\frac{p}{2}} \right),
1363: \end{eqnarray*}
1364: %
1365: from which the desired estimate follows. In deriving the above we used
1366: the estimate \cite[Eqn. 3.25, p. 163]{KSh91} on moments of
1367: stochastic integrals. \qed
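%
For completeness, we sketch the It\^{o} computation behind the identity used in the above
proof. Since $y^\eps(t)$ solves \eqref{e:yeps_eqn} and $\chi$ depends on $y$ only, we have
(componentwise in $\chi$)
\begin{equation*}
d \chi(y^\eps(t)) = \frac{1}{\eps^2} \LL_0 \chi(y^\eps(t)) \, dt - \frac{\alpha}{\eps}
\langle \nabla V(x^\eps(t)), \nabla_y \chi(y^\eps(t)) \rangle \, dt +
\frac{\sqrt{2 \sigma}}{\eps} \langle \nabla_y \chi(y^\eps(t)), \, d \beta(t) \rangle.
\end{equation*}
Substituting $\LL_0 \chi = - H$ from \eqref{e:poisson}, multiplying by $\eps^2$ and
integrating over $[0,T]$ gives the expression for $\int_0^T H(y^\eps(s)) \, ds$ used in the
proof. Its three terms (boundary term, stochastic integral, drift integral) are the source
of the contributions $\eps^{2p}$, $\eps^p T^{\frac{p}{2}}$ and $\eps^p T^p$, respectively,
in the estimate of Lemma \ref{lem:ito}.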
1368: 
1369: %The above result will be of particular use to us in the case $T=\delta \ll 1.$
1370: %The Markov property then implies that
1371: %\begin{equation}\label{e:ito_est}
1372: %\left( \E^{\mu^\eps} \left| \int_{n\delta}^{(n+1)\delta} H(y^\eps(s)) \, ds
1373: %\right|^p \right)^{1/p} \leq   C \left(\eps^{2p} + \eps^p\delta^{\frac{p}{2}}
1374: %\right)^{1/p}. %\end{equation}
1375: %
1376: For the rest of this section we will restrict ourselves to the one-dimensional case.
1377:  If we apply the It\^{o} formula to $\phi(y^\eps(s))$, where $\phi$ is the solution of the Poisson equation
1378: \eqref{e:cell}, then we obtain
1379: %
1380: \begin{eqnarray}
1381: x^\eps_{n+1} - x_n^\eps & = & - \alpha \int_{n \delta}^{(n+1)\delta} V'(x^\eps(s)) (1 +
1382: \partial_y \phi(y^\eps(s))) \, ds \nonumber
1383: \\ &&+ \sqrt{2 \sigma} \int_{n \delta}^{(n+1)\delta} (1 +
1384: \partial_y \phi(y^\eps(s))) \, d \beta (s)
1385: \nonumber\\ && - \eps \left( \phi(y^\eps((n+1)\delta)) - \phi(y^\eps (n\delta)) \right).
1386: \label{e:integr_parts}
1387: \end{eqnarray}
1388: The proof of Theorems \ref{thm:par_est_alpha} and \ref{thm:par_est_sigma} is based on a
1389: careful asymptotic analysis of the behavior of $x^\eps_{n+1} - x^\eps_n$ given by this
1390: formula when both $\eps$ and $\delta$ are small. Specifically we will use the following
1391: two propositions. They show how the effective homogenized behaviour is manifest in
1392: the time--$\delta$ Markov chain induced by sampling the path $x^{\eps}(t)$ from
1393: \eqref{e:xeps_V}.
1394: %
1395: \begin{prop}\label{prop:xndelta1}
1396: For $\eps, \, \delta >0 $ sufficiently small and $n \in \mathbb{N}$ there exists an i.i.d.
1397: sequence of random variables $\xi_n \sim \mathcal{N}(0,1)$ such that
1398: \begin{equation}
1399: \sqrt{2 \sigma} \int_{n \delta}^{(n+1) \delta}(1+\partial_y \phi(y^\eps(s))) \, d \beta(s)
1400: =\sqrt{2 \Sigma \, \delta} \, \xi_n+ R_1(\eps, \delta)
1401: \label{e:xn_loc1}
1402: \end{equation}
1403: in law. The remainder $R_1(\eps, \delta)$ satisfies, for every $\beta \in
1404: (0,\frac12)$ and $p>0$, the estimate
1405: \begin{equation}
1406: \left( \E^{\mu^\eps} \big| R_1(\eps,\delta) \big|^p \right)^{1/p} \leq  C \, \left(
1407: \eps^{2 \beta} + \eps^{\beta} \right),
1408: \label{e:R_est}
1409: \end{equation}
1410: where $C$ is independent of $\eps$ and $\delta$.
1411: \end{prop}
1412: \begin{remark}
1413: \label{r:label}
1414: Estimate \eqref{e:R_est} is almost certainly not optimal. Indeed, informal
1415: calculations lead us to expect the estimate
1416: $$
1417: \left( \E^{\mu^\eps} \big| R_1(\eps,\delta) \big|^p \right)^{1/p} \leq  C \, \left(
1418: \eps^{2 \beta} + \eps^{\beta} \delta^{\beta} + \eps^{\beta} \delta^{\frac{\beta}{2}} \right).
1419: $$
1420: However, we have not been able to prove this.
1421: \end{remark}
1422: \begin{prop}\label{prop:xndelta2}
1423: For $\eps, \, \delta >0 $ sufficiently small and $n \in \mathbb{N}$ we have that
1424: \begin{equation}
1425: \label{e:xn_loc2}
1426: \alpha \int_{n \delta}^{(n+1) \delta} V'(x^\eps (s)) (1 +
1427: \partial_y \phi(y^\eps (s))) \, ds = A \delta V'(x^\eps_n) +R_2(\eps, \delta)
1428: \end{equation}
1429: in law. The remainder $R_2(\eps, \delta)$ satisfies, for every $p>0$,
1430: the estimate
1431: %
1432: \begin{equation}
1433: \left( \E^{\mu^\eps} \big| R_2(\eps,\delta) \big|^p \right)^{1/p} \leq  C \left(
1434: \eps^2 +\delta^{ \frac12}\eps+ \delta^{3/2} \right), \label{e:R_est2}
1435: \end{equation}
1436: where $C$ is independent of $\eps$ and $\delta.$
1437: \end{prop}
1438: 
1439: 
1440: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1441: %
1442: %                   PROOF OF PROPOSITION 1.1
1443: %
1444: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1445: %
1446: \section{Proof of Propositions \ref{prop:xndelta1} and \ref{prop:xndelta2}}
1447: \label{sec:xndelta_proof}
1448: In this section we prove Propositions \ref{prop:xndelta1} and
1449: \ref{prop:xndelta2}. These are central to
1450: the proof of the two theorems concerning the behaviour of the estimators with
1451: subsampled data. We start with a rough estimate on $x^\eps_{n+1} - x_n^\eps$ that we will
1452: need for the proofs of the propositions.
1453: 
1454: \subsection{A Rough Estimate}
1455: %
1456: \begin{lemma}\label{lem:rough}
1457: Let Assumptions \ref{a:1} hold and assume that $x^\eps (t)$, the solution of
1458: \eqref{e:xeps_est}, is stationary. Then there exists a constant $C$, independent of
1459: $\delta$ and $\epsilon$, such that
1460: \begin{equation}\label{e:est_rough}
1461: \E^{\mu^\eps} \left| x^\eps(s) - x^\eps_{n \delta} \right|^p \leq C \left( \delta^p +
1462: \delta^{\frac{p}{2}} + \eps^p \right),
1463: \end{equation}
1464: for every $s \in (n \delta, (n +1 ) \delta]$ and every $p \geq 1$.
1465: \end{lemma}
1466: %
1467: \proof
1468: Using the same derivation that leads to \eqref{e:integr_parts},
1469: but with $(n+1) \delta$ replaced by $s$, we have:
1470: \begin{eqnarray}
1471: x^\eps(s) - x_n^\eps & = & - \alpha \int_{n \delta}^{s} V'(x^\eps(r)) (1 +
1472: \partial_y \phi(y^\eps(r))) \, dr + \sqrt{2 \sigma} \int_{n \delta}^{s} (1 +
1473: \partial_y \phi(y^\eps(r))) \, d \beta(r)
1474: \nonumber \\ &&- \eps \left( \phi(y^\eps (s)) - \phi(y^\eps(n\delta)) \right)
1475: \nonumber \\  & =:&
1476: I_{n,\delta}^1 + I_{n,\delta}^2 + I_{n,\delta}^3.
1477: \label{eq:itophi}
1478: \end{eqnarray}
1479: We need to estimate the terms in \eqref{eq:itophi}. We start with $I^3_{n, \delta}$. By
1480: Lemma \ref{l:Poisson} we have
1481: $$\|\phi(y)\|_{L^\infty} \leq C. $$
1482: Consequently
1483: $$
1484: \E^{\mu^\eps} |I_{n,\delta}^3|^p  \leq C \eps^p.
1485: $$
1486: To estimate $I_{n,\delta}^1$ we use again Lemma \ref{l:Poisson} to conclude that
1487: \begin{equation}\label{e:phi_est}
1488: \|1 + \partial_y \phi(y)\|_{L^\infty} \leq C.
1489: \end{equation}
1490: The above estimate, together with Assumptions \ref{a:1}, Corollary \ref{cor:moments} and
1491: the stationarity of the process $x^\eps (t),$ give
1492: \begin{eqnarray*}
1493: \E^{\mu^\eps} |I_{n,\delta}^1|^p & \leq & C \delta^{p-1} \int_{n \delta}^{(n+1) \delta}
1494: \E^{\mu^\eps} |V'(x^\eps(s))|^p \, ds
1495:  \\ & \leq & C \delta^{p-1} \int_{n \delta}^{(n+1)\delta} \E^{\mu^\eps}
1496: |x^\eps (s) |^{p} \, ds
1497: \\ & \leq & C \delta^{p}.
1498: \end{eqnarray*}
1499: Estimate \cite[Eqn. 3.25, p. 163]{KSh91} on moments of stochastic integrals,
1500: together with equation \eqref{e:phi_est}, enable us to conclude that
1501: %
1502: \begin{eqnarray*}
1503: \E^{\mu^\eps} |I_{n,\delta}^2|^p & \leq & C \delta^{\frac{p}{2}-1} \int_{n
1504: \delta}^{(n+1)\delta} \E^{\mu^\eps} |1 + \partial_y \phi(y^\eps (s))|^p \, ds
1505: %
1506: \\ & \leq & C \delta^{\frac{p}{2}}.
1507: %
1508: \end{eqnarray*}
1509: %
1510: We combine the above estimates to obtain \eqref{e:est_rough}. \qed
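%
The combination step in the above proof is the elementary convexity bound, which we record
here as a sketch: for real numbers $a_1, a_2, a_3$ and $p \geq 1$,
\begin{equation*}
|a_1 + a_2 + a_3|^p \leq 3^{p-1} \left( |a_1|^p + |a_2|^p + |a_3|^p \right).
\end{equation*}
Applied to \eqref{eq:itophi} it gives
\begin{equation*}
\E^{\mu^\eps} \left| x^\eps(s) - x^\eps_n \right|^p \leq 3^{p-1} \left( \E^{\mu^\eps}
|I^1_{n,\delta}|^p + \E^{\mu^\eps} |I^2_{n,\delta}|^p + \E^{\mu^\eps} |I^3_{n,\delta}|^p
\right) \leq C \left( \delta^p + \delta^{\frac{p}{2}} + \eps^p \right).
\end{equation*}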
1511: %
1512: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1513: %
1514: \subsection{Proof of Proposition \ref{prop:xndelta1}}
1515: 
1516: From \cite[Sec. 1.3]{Freid85} and \cite[Thm. 3.4.6]{KSh91} we know that the
1517: martingale
1518: $$M(t):=\sqrt{2\sigma}\int_0^t  \left( 1 + \partial_y \phi(y^\eps(s)) \right) \, d\beta(s)$$
1519: is equal in law to a time--changed Brownian motion,
1520: $$M(t)=\widehat{\beta}  \left(2 \sigma \int_0^t \left( 1 + \partial_y
1521:  \phi(y^\eps(s)) \right)^2 \, d s \right).
1522: $$
1523: Also the quadratic variation satisfies
1524: %
1525: $$\langle M \rangle_t =2 \sigma \int_0^t \left( 1 + \partial_y \phi(y^\eps(s))
1526: \right)^2 \, d s \approx 2\Sigma t.$$
1527: %
1528: Indeed
1529: \begin{eqnarray*}
1530: \E^{\mu^{\eps}} \langle M \rangle_t &=& 2 \sigma \bbE^{\mu^{\eps}}
1531: \int_0^t \left( 1 + \partial_y \phi(y^\eps(s))
1532: \right)^2 \, d s
1533: \\ & = & 2\Sigma t,
1534: \end{eqnarray*}
1535: where the last equality follows from equation \eqref{e:coeffs} for $d = 1$. Using these
1536: observations we write
1537: \begin{eqnarray*}
1538: J_n  & := & \sqrt{2 \sigma} \int_{n \delta}^{(n+1) \delta} \left( 1 + \partial_y
1539: \phi(y^\eps(s)) \right) \, d \beta(s) \\
1540: &=&\sqrt{2\sigma} \int_0^{(n+1)\delta}\left( 1 + \partial_y \phi(y^\eps(s)) \right) \, d
1541: \beta(s) -\sqrt{2\sigma} \int_0^{n\delta}\left( 1 + \partial_y
1542: \phi(y^\eps(s)) \right) \, d \beta(s) \\
1543: &=& \widehat{\beta}(2\Sigma(n+1)\delta)-\widehat{\beta}(2\Sigma n\delta)
1544: +r_{n+1}-r_n\\
1545: &=&\sqrt {2\Sigma \delta} \xi_n+r_{n+1}-r_n,
1546: \end{eqnarray*}
1547: where the $\xi_n$ are i.i.d.\ unit Gaussian random variables and
1548: $$r_n=\widehat{\beta}(\langle M\rangle_{n\delta})-\widehat{\beta}(2\Sigma n\delta).$$
1549: 
1550: To estimate this difference we follow the proof of \cite[Thm. 2.1]{HairPavl04}. We start
1551: by employing the H\"{o}lder continuity of Brownian motion, together with the H\"{o}lder
1552: inequality, to estimate:
1553: \begin{eqnarray*}
1554: \E^{\mu^\eps} \left|  \widehat{\beta} (\langle M \rangle_{n\delta}) - \widehat{\beta} (
1555: \E^{\mu^\eps} \langle M \rangle_{n\delta}) \right|^p   & \leq & \E^{\mu^\eps} \left|
1556: \mbox{H\"{o}l}_{\beta}(\widehat{\beta})  \left|  \langle M \rangle_{n\delta} -
1557: \E^{\mu^\eps} \langle M \rangle_{n\delta} \right|^{\beta} \right|^p
1558: %
1559: \\ & \leq &
1560: %
1561: \left( \E^{\mu^\eps} \left| \mbox{H\"{o}l}_{\beta}(\widehat{\beta})  \right|^{m} \right)^{\frac{p}{m}}
1562: \left( \E^{\mu^\eps} \left|\langle M \rangle_{n\delta} - \E^{\mu^\eps}
1563: \langle M \rangle_{n\delta}  \right|^{\beta  q} \right)^{\frac{p}{q}}
1564: %
1565: \\ & \leq &
1566: %
1567: C \left( \E^{\mu^\eps} \left| \int_0^{ n\delta} H(y^\eps (z)) \, dz \right|^{\beta  q}
1568: \right)^{\frac{p}{q}},
1569: %
1570: \end{eqnarray*}
1571: %
1572: with $\beta \in \left(0, \frac{1}{2} \right)$ and $\frac{p}{m} + \frac{p}{q} = 1$. We have used the notation
1573: %
1574: $$
1575: H(y) := 2 \sigma  \left( 1 + \partial_y \phi(y) \right)^2  - 2 \Sigma.
1576: $$
1577: %
1578: We have also used the fact that, for every $\beta \in \left(0, \frac{1}{2} \right)$ and
1579: every bounded time interval, the $\beta$--H\"{o}lder exponent of Brownian motion is
1580: uniformly bounded with probability one. We have that
1581: $$
1582: \int_{\T} H(y) \, \mu(dy) = 0,
1583: $$
1584: %
1585: where $\mu(dy)$ is defined in \eqref{e:gibbs_torus}. Since $n\delta \le T$,
1586: Lemma \ref{lem:ito} applies
1587: and we have that,  for $q$ sufficiently large and for $\eps$ sufficiently small,
1588: %
1589: \begin{eqnarray*}
1590: \E^{\mu^\eps} \left| J_n - \sqrt{2 \Sigma \delta}\xi_n \right|^p   & \leq & C \left( \eps^{2 q
1591: \beta} +   \eps^{q \beta }  \right)^{\frac{p}{q}} \\ & \leq & C
1592: \left( \eps^{2 p \beta} + \eps^{p \beta }  \right) .
1593: \end{eqnarray*}
1594: This completes the proof of the proposition.
1595: \begin{comment}
1596: \begin{eqnarray*}
1597: \sqrt{2 \sigma} \int_{n \delta}^{(n+1)\delta} (1 + \partial_y \phi(y^\eps(s))) \, d
1598: \beta(s) & = & \sqrt{2 \sigma} \int_0^{(n+1) \delta} (1 + \partial_y \phi(y^\eps(s))) \, d
1599: \beta(s) -   \sqrt{2 \sigma} \int_0^{n \delta} (1 + \partial_y \phi(y^\eps(s))) \, d
1600: \beta(s) \\ & = & \widehat{\beta}  \left(2 \sigma \int_0^{(n+1)\delta} \left( 1 +
1601: \partial_y \phi(y^\eps(s)) \right)^2 \, d s \right) - \widehat{\beta}
1602: \left(2 \sigma \int_0^{n \delta} \left( 1 + \partial_y
1603:  \phi(y^\eps(s)) \right)^2 \, d s \right) \\ & = & \widehat{\beta}(2 \Sigma (n+1) \delta) -
1604:  \widehat{\beta}(2 \Sigma n \delta) \\ &&+  \left[ \left(\widehat{\beta}  \left(2 \sigma
1605:  \int_0^{(n+1)\delta} \left( 1 + \partial_y \phi(y^\eps(s)) \right)^2 \, d s \right)
1606:  - \widehat{\beta}(2 \Sigma (n+1) \delta) \right)  \right. \\ && \left.+
1607:  \left(\widehat{\beta}  \left(2 \sigma
1608:  \int_0^{n \delta} \left( 1 + \partial_y \phi(y^\eps(s)) \right)^2 \, d s \right)
1609:  - \widehat{\beta}(2 \Sigma n \delta) \right)  \right] \\ &=&
1610:  \widehat{\beta}(2 \Sigma (n+1) \delta) - \widehat{\beta}(2 \Sigma n \delta) + r^\eps_{(n +1)
1611:  \delta} + r^\eps_{n \delta}.
1612: \end{eqnarray*}
1613: Notice that, in law,
1614: \begin{eqnarray*}
1615: \widehat{\beta} \left(2 \Sigma (n+1) \delta \right) -
1616: \widehat{\beta} \left(2 \Sigma n \delta \right)
1617:  & = & \sqrt{2 \Sigma \delta} (\widehat{\beta}(n+1) - \widehat{\beta}(n) ).
1618: \\ & = & \sqrt{2 \Sigma \delta} \, \xi_n.
1619: \end{eqnarray*}
1620: Now we need to estimate the remainder. The proof of the estimate follows the proof of
1621: \cite[Thm. 2.1]{HairPavl04}. We start by employing the H\"{o}lder continuity of Brownian
1622: motion, together with H\"{o}lder inequality to estimate:
1623: \begin{eqnarray*}
1624: \E \left| r^\eps_{(n+1) \delta} \right|^p  & = & \E \left| \widehat{\beta} \left(2 \sigma
1625: \int_0^{(n+1) \delta} \left( 1 + \partial_y \phi(y^\eps(s)) \right)^2 \,  d s \right) -
1626: \widehat{\beta} \left(2 \Sigma \delta  \right) \right|^p
1627: \\ & \leq &
1628: \E \left| \mbox{H\"{o}l}_{\beta}(\widehat{\beta})  \left(\int_0^{(n+1) \delta} \left( 2
1629: \sigma  \left( 1 + \partial_y \phi(y^\eps(s)) \right)^2  - 2 \Sigma    \right) \, ds
1630: \right)^{\beta} \right|^p
1631: %
1632: \\ & \leq &
1633: %
1634: \left( \E \left| \mbox{H\"{o}l}_{\beta}(\widehat{\beta})  \right|^{m }
1635: \right)^{\frac{p}{m}} \left( \E \left| \int_0^{(n+1) \delta} \left( 2 \sigma \left( 1 +
1636: \partial_y \phi(y^\eps(s)) \right)^2  - 2 \Sigma   \right) \, ds \right|^{\beta  q}
1637: \right)^{\frac{p}{q}}
1638: %
1639: \\ & \leq &
1640: %
1641: C \left( \E \left| \int_0^{(n+1) \delta} H(y^\eps(s)) \, ds, \right|^{\beta  q}
1642: \right)^{\frac{p}{q}},
1643: %
1644: \end{eqnarray*}
1645: %
1646: with $\beta \in \left(0, \frac{1}{2} \right)$. We have used the notation
1647: %
1648: $$
1649: H(y) = 2 \sigma  \left( 1 + \partial_y \phi(y^\eps(s)) \right)^2  - 2 \Sigma.
1650: $$
1651: %
1652: We have also used the fact that, for every $\beta \in \left(0, \frac{1}{2} \right)$ and
1653: every bounded time interval, the $\beta$--H\"{o}lder exponent of Brownian motion is
1654: uniformly bounded with probability one.
1655: 
1656: Notice now that
1657: %
1658: $$
1659: \int_{\T^d} H(y) \, \mu(dy) = 0,
1660: $$
1661: %
1662: where $\mu(dy)$ is defined in \eqref{e:gibbs_torus}. Hence,
1663: by Lemma \ref{lem:ito}, we have that,  for $q$ sufficiently large and for
1664: $\eps$ sufficiently small,
1665: %
1666: \begin{eqnarray*}
1667: \E \left| r^\eps_{(n+1) \delta} \right|^p   & \leq & C \left( \eps^{2 \beta} +   \eps^{
1668: \beta } ((n+1) \delta)^{\frac{\beta q}{2}} \right)^p \\ & \leq & C \eps^{p \beta}.
1669: \end{eqnarray*}
1670: %
1671: Similarly,
1672: %
1673: \begin{eqnarray*}
1674: \E \left| r^\eps_{n \delta} \right|^p  \leq C \eps^{p \beta}.
1675: \end{eqnarray*}
1676: %
1677: This completes the proof of the proposition.
1678: \end{comment}
1679:  \qed
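%
The identification of the Gaussian increment in the above proof can be spelled out as
follows; this is only a sketch, using standard properties of Brownian motion. The
increments of $\widehat{\beta}$ over the disjoint intervals $[2 \Sigma n \delta, 2 \Sigma
(n+1) \delta]$ are independent, Gaussian, with mean zero and variance $2 \Sigma \delta$.
Consequently
\begin{equation*}
\xi_n := \frac{\widehat{\beta}(2 \Sigma (n+1) \delta) - \widehat{\beta}(2 \Sigma n
\delta)}{\sqrt{2 \Sigma \delta}}
\end{equation*}
defines an i.i.d.\ sequence of $\mathcal{N}(0,1)$ random variables. The remainder is
$r_{n+1} - r_n$ and, for $p \geq 1$, the Minkowski inequality gives
\begin{equation*}
\left( \E^{\mu^\eps} |r_{n+1} - r_n|^p \right)^{1/p} \leq \left( \E^{\mu^\eps}
|r_{n+1}|^p \right)^{1/p} + \left( \E^{\mu^\eps} |r_n|^p \right)^{1/p},
\end{equation*}
so that it suffices to estimate each $r_n$ separately.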
1680: %
1681: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1682: %
1683: \subsection{Proof of Proposition \ref{prop:xndelta2}}
1684: %
1685: %
1686: We have
1687: \begin{eqnarray*}
1688: \E^{\mu^\eps} |R_2(\eps,\delta)|^p & = & \E^{\mu^\eps} \left| \int_{n \delta}^{(n +1)
1689: \delta} \alpha V'(x^\eps(s)) \left( 1 +
1690: \partial_y \phi(y^\eps(s)) \right) \, ds  -
1691:  \delta A V'(x^\eps_{n \delta}) \right|^p  \\ &=&
1692: \E^{\mu^\eps} \left| \int_{n \delta}^{(n +1) \delta} \alpha V'(x^\eps_{n\delta})\left( 1 +
1693: \partial_y \phi(y^\eps(s)) \right) \, ds - A \int_{n\delta}^{(n+1) \delta} V'(x^\eps_{n\delta}) \,
1694: ds \right.
1695: \\ &&
1696: +  \left. \alpha \int_{n\delta}^{(n+1) \delta} \Bigl(V'(x^\eps(s)) - V'(x^\eps_{n
1697: \delta})\Bigr)\Bigl(1+
1698: \partial_y \phi(y^\eps(s))\Bigr) \,ds \right|^p
1699: \\ & \leq & C \E^{\mu^\eps} \left| V'(x^\eps_{n\delta})
1700: \int_{n \delta}^{(n +1) \delta} \left( \alpha \left( 1 + \partial_y \phi(y^\eps(s))
1701: \right) - A \right) \, ds \right|^p
1702: \\ &&
1703: + \alpha ^p C \E^{\mu^\eps} \left| \int_{n\delta}^{(n+1) \delta} \Bigl( V'(x^\eps(s)) -
1704: V'(x^\eps_{n \delta}) \Bigr) \Bigl(1+\partial_y \phi(y^\eps(s))\Bigr) \,ds \right|^p \\ &
1705: =: & I^1_{\eps, \delta} + I^2_{\eps, \delta},
1706: \end{eqnarray*}
1707: %
1708: where the constant $C$ depends only on $p$. We use the H\"{o}lder inequality, Assumptions
1709: \ref{a:1}, Lemma \ref{lem:rough} and the uniform bound on $\partial_y \phi(y)$ to obtain,
1710: for $\eps, \, \delta$ sufficiently small,
1711: \begin{eqnarray*}
1712: I_{\epsilon,\delta}^2  & \leq & C \delta^{p - 1} \int_{n\delta}^{(n+1) \delta}
1713: \E^{\mu^\eps} \left| x^\eps(s) - x^\eps_{n \delta} \right|^{p} \, ds
1714: \\ & \leq & C \delta^{p-1} \int_{n\delta}^{(n+1) \delta}(\delta^{\frac{p}{2}} +
1715: \eps^{p}  ) \, ds \\ & \leq & C \left(\delta^{\frac{3p}{2}} + \delta^p \eps^{p}
1716:    \right).
1717: \end{eqnarray*}
1718: %
1719: Consequently
1720: \begin{equation}\label{e:est_xn1}
1721:  \left( I^2_{\eps, \delta} \right)^{1/p}  \leq   C(\delta^{3/2} +\delta
1722: \epsilon).
1723: \end{equation}
1724: Consider now the function
1725: %
1726: $$
1727: H(y):= \alpha \left( 1 + \partial_y \phi(y) \right) - A.
1728: $$
1729: From the definition of $A$ we get that
1730: %
1731: $$
1732: \int_{\T} \Bigl( \alpha \left( 1 + \partial_y \phi(y) \right) - A \Bigr)\, \mu(dy) = 0.
1733: $$
1734: %
1735: Hence, Lemma \ref{lem:ito} applies and we get
1736: %
1737: \begin{eqnarray*}
1738: \E^{\mu^\eps} \left| \int_{n \delta}^{(n +1) \delta} \left( \alpha \left( 1 + \partial_y
1739: \phi(y^\eps(s)) \right) - A \right) \, ds \right|^p
1740:  & \leq & C \left(\eps^{2 p} + \eps^p \delta^p + \eps^p \delta^{p/2} \right).
1741: \end{eqnarray*}
1742: %
1743: We combine the above estimate with \eqref{e:linbnd} and Corollary \ref{cor:moments} to obtain,
1744: \begin{equation}\label{e:est_xn2}
1745:  \left( I^1_{\eps, \delta} \right)^{1/p}  \leq   C \left(\eps^2 +  \eps
1746: \delta^{1/2 } \right),
1747: \end{equation}
1748: for $\eps, \, \delta$ sufficiently small. The proof of the proposition follows from
1749: estimates \eqref{e:est_xn1} and \eqref{e:est_xn2}. \qed
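%
The passage from $p$-th moments to the bounds \eqref{e:est_xn1} and \eqref{e:est_xn2} uses
the elementary inequality $(a + b)^{1/p} \leq a^{1/p} + b^{1/p}$, valid for $a, b \geq 0$
and $p \geq 1$. As a sketch, for $\delta \leq 1$,
\begin{align*}
\left( \delta^{\frac{3p}{2}} + \delta^p \eps^p \right)^{1/p} & \leq \delta^{3/2} + \delta
\eps, \\
\left( \eps^{2p} + \eps^p \delta^p + \eps^p \delta^{\frac{p}{2}} \right)^{1/p} & \leq
\eps^2 + \eps \delta + \eps \delta^{1/2} \leq \eps^2 + 2 \eps \delta^{1/2}.
\end{align*}
Since $\delta \eps \leq \delta^{1/2} \eps$ for $\delta \leq 1$, adding the two bounds gives
a quantity of the form $C ( \eps^2 + \delta^{1/2} \eps + \delta^{3/2} )$, which is the right
hand side of \eqref{e:R_est2}.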
1750: %
1751: %
1752: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1753: %
1754: %                        PROOF OF THM 1.2
1755: %
1756: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1757: \section{Proof of Main Theorems}
1758: 
1759: Here we combine the results from the preceding two sections to complete the proofs of the
1760: main theorems.
1761: 
1762: \subsection{Proof of Theorem \ref{thm:est_ddim}}
1763: \label{sec:ddim}
1764: We combine equations \eqref{e:a_est} and \eqref{e:xeps_V} to calculate
1765: %
1766: \begin{eqnarray*}
1767: \widehat{A}(x^{\eps}) & = &  \frac{\int_0^T - \langle \nabla V(x^\eps(t)) , d x^\eps (t)
1768: \rangle}{\int_0^T | \nabla V(x^\eps(t))|^2 \, dt}
1769: %
1770: \\& = &
1771: %
1772: \frac{\int_0^T  \left\langle - \nabla V( x^\eps(t)),  - \alpha \nabla V(x^{\eps}(t)) \,
1773: dt -  \frac{1}{\epsilon}\nabla p \left( \frac{x^\eps(t)}{\eps} \right) \, dt + \sqrt{ 2
1774: \sigma} \, d\beta(t) \right\rangle }{\int_0^T |\nabla V(x^\eps(t))|^2 \, dt}
1775: %
1776: \\& = &
1777: %
1778: \alpha +  \frac{\frac{1}{\epsilon}\int_0^T  \left\langle \nabla V(x^\eps(t)), \nabla
1779: p(\frac{x^\eps(t)} {\eps} ) \right\rangle \, dt}{\int_0^T |\nabla V(x^\eps(t))|^2 \, dt}
1780: - \sqrt{2 \sigma} \frac{\int_0^T  \left\langle \nabla V(x^{\eps}(t)) , d \beta(t)
1781: \right\rangle}  {\int_0^T | \nabla V(x^\eps(t))|^2 \, dt}
1782: \\
1783: & = :& \alpha + I_1(T, \eps) - I_2(T, \eps).
1784: \end{eqnarray*}
1785: %
1786: We will treat the terms $I_1(T, \eps)$ and $I_2(T,\eps)$ separately. We start with
1787: $I_2(T, \eps)$. Since the stochastic integral
1788: $$
1789: M_T :=\int_0^T  \left\langle \nabla V(x^{\eps}(t)) , d \beta(t) \right\rangle
1790: $$
1791: is a continuous martingale which is null at $0$, the strong law of large numbers for
1792: martingales \cite[p. 187]{yor} applies and we have that
1793: $$
1794: \lim_{T \rightarrow + \infty} \frac{M_T}{\langle M \rangle_T} = 0 \quad \mbox{a.s.}
1795: $$
1796: Consequently
1797: \begin{equation}
1798: \lim_{T \rightarrow + \infty}I_2(T,\eps) =  0 \quad \mbox{a.s.}
1799: \label{e:i2_lim_d}
1800: \end{equation}
1801: 
1802: Let us consider now the term $I_1(T, \eps)$. We use the ergodic theorem to deduce that
1803: %
1804: \begin{eqnarray*}
1805: \lim_{T \rightarrow \infty} I_1(T, \eps) & = & \lim_{T \rightarrow \infty} \frac{
1806: \frac{1}{\epsilon T} \int_0^T  \left\langle \nabla V(x^\eps(t)), \nabla p \left(
1807: \frac{x^\eps(t)}{\eps} \right) \right\rangle \, dt}{\frac{1}{T}\int_0^T |\nabla
1808: V(x^\eps(t))|^2 \, dt}
1809: \\ & = &
1810: \frac{\E^{\mu^\eps} \left(\left\langle \nabla V(x),  \frac{1}{\eps} \nabla p \left(
1811: \frac{x}{\eps} \right) \right\rangle \right) } { \E^{\mu^\eps} | \nabla V(x)|^2} \quad
1812: \mbox{a.s.}
1813: \end{eqnarray*}
1814: Now we use Proposition \ref{lem:xeps_meas_ddim} to compute
1815: \begin{eqnarray*}
1816:  \frac{\E^{\mu^\eps} \left(\left\langle \nabla V(x),  \frac{1}{\eps}
1817:  \nabla p \left( \frac{x}{\eps}
1818: \right) \right\rangle \right) } { \E^{\mu^\eps} | \nabla V(x)|^2} & = & \frac{\int_{\R^d}
1819: \left\langle \nabla V(x),  \frac{1}{\eps} \nabla p \left( \frac{x}{\eps} \right)
1820: \right\rangle \rho^\eps(x) \, dx }{  \E^{\mu^\eps} | \nabla V(x)|^2 }
1821: %
1822: \\ & = &
1823: %
1824: \frac{-\sigma \frac{1}{Z^\eps} \int_{\R^d} \left\langle \nabla V(x) e^{-\frac
1825: {\alpha}{\sigma} V(x)}, \nabla \left( e^{-\frac{1}{\sigma} p(x/ \eps)} \right)
1826: \right\rangle \, dx }{ \E^{\mu^\eps} | \nabla V(x)|^2 }
1827: \\ & = &
1828: \sigma \frac{ \E^{\mu^\eps} ( \Delta V(x) )}{ \E^{\mu^\eps} | \nabla
1829:  V(x)|^2} - \alpha.
1830: \end{eqnarray*}
1831: In deriving the last line we used an integration by parts. The weak convergence of
1832: $\mu^\eps$ to $\mu$ (second part of Proposition \ref{lem:xeps_meas_ddim}), formula
1833: \eqref{e:gibbs}, together with another integration by parts give
1834: \begin{eqnarray*}
1835: \lim_{\eps \rightarrow 0} \frac{ \E^{\mu^\eps} (\Delta V(x))}{\E^{\mu^\eps} (|\nabla
1836: V(x)|^2)} & = & \frac{\E^{\mu} (\Delta V(x))}{\E^{\mu} (|\nabla V(x)|^2)}
1837: \\ & = & \frac{\E^{\mu} (\Delta V(x))}{ -\frac{\sigma}{\alpha} \frac{1}{Z}
1838: \int_{\R^d} \langle \nabla V(x) , \nabla (e^{-\frac{\alpha}{\sigma} V(x)}) \rangle dx }
1839: \\ & = & \frac{\alpha}{\sigma}.
1840: \end{eqnarray*}
1841: We combine the above calculations to conclude that
1842: \begin{equation}
1843: \lim_{\eps \rightarrow 0} \lim_{T \rightarrow \infty} I_1(T, \eps) = 0 \quad \mbox{a.s.}
1844: \label{e:i1_lim_d}
1845: \end{equation}
1846: The proof of the convergence of the maximum likelihood estimator, eqn.
1847: \eqref{e:a_est_lim} now follows from equations \eqref{e:i1_lim_d} and \eqref{e:i2_lim_d}.
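%
For the reader's convenience we sketch the integration by parts used in the computation of
$\lim_{T \rightarrow \infty} I_1(T, \eps)$. We assume, consistently with that computation,
that $\rho^\eps(x) = \frac{1}{Z^\eps} e^{-\frac{\alpha}{\sigma} V(x) - \frac{1}{\sigma}
p(x/\eps)}$ and that the boundary terms at infinity vanish. Then
\begin{align*}
-\frac{\sigma}{Z^\eps} \int_{\R^d} \left\langle \nabla V(x) e^{-\frac{\alpha}{\sigma} V(x)},
\nabla \left( e^{-\frac{1}{\sigma} p(x/\eps)} \right) \right\rangle \, dx & =
\frac{\sigma}{Z^\eps} \int_{\R^d} \nabla \cdot \left( \nabla V(x) e^{-\frac{\alpha}{\sigma}
V(x)} \right) e^{-\frac{1}{\sigma} p(x/\eps)} \, dx \\
& = \frac{\sigma}{Z^\eps} \int_{\R^d} \left( \Delta V(x) - \frac{\alpha}{\sigma} |\nabla
V(x)|^2 \right) e^{-\frac{\alpha}{\sigma} V(x) - \frac{1}{\sigma} p(x/\eps)} \, dx \\
& = \sigma \E^{\mu^\eps} (\Delta V(x)) - \alpha \E^{\mu^\eps} |\nabla V(x)|^2.
\end{align*}
Dividing by $\E^{\mu^\eps} |\nabla V(x)|^2$ gives $\sigma \E^{\mu^\eps} (\Delta V(x)) /
\E^{\mu^\eps} |\nabla V(x)|^2 - \alpha$. The same calculation with $p \equiv 0$, that is
with the density proportional to $e^{-\frac{\alpha}{\sigma} V(x)}$, yields $\sigma \E^{\mu}
(\Delta V(x)) = \alpha \E^{\mu} |\nabla V(x)|^2$, which is the value $\frac{\alpha}{\sigma}$
of the limiting ratio.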
1848: 
1849: The proof of the convergence of the estimator for the diffusion coefficient, eqn.
1850: \eqref{e:sigma_est_lim}, follows from the definition of the quadratic variation, see e.g.
1851: \cite{BasRao80}. \qed
1852: \begin{remark}
1853: An immediate corollary of the proof of the above theorem is that
1854: $$
1855: \lim_{T \rightarrow \infty} \widehat{A}(x^{\epsilon}) = \sigma \frac{ \E^{\mu^\eps} (\Delta
1856: V(x))}{\E^{\mu^\eps} |\nabla V(x)|^2} \quad \mbox{a.s.}
1857: $$
1858: \end{remark}
1859: %
1860: %
1861: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1862: %
1863: %                   PROOF OF PROPOSITION 1.1
1864: %
1865: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1866: %
1867: \subsection{Proof of Theorem \ref{thm:par_est_alpha}}\label{sec:thm_alpha}
1868: We combine Proposition \ref{prop:xndelta2} and \eqref{e:integr_parts} to conclude that
1869: $$
1870: x^\eps_{n+1} - x^\eps_n = J_n - A\delta V'(x^\eps_n) + R(\eps, \delta),
1871: $$
1872: where $J_n$ is as defined in the proof of Proposition \ref{prop:xndelta1}
1873: and, for $\eps, \, \delta$ sufficiently small and $\alpha \in (0,1)$,
1874: \begin{equation}\label{e:R_est_1}
1875: \left( \E^{\mu^\eps} |R(\eps, \delta)|^p \right)^{1/p} \leq
1876: C \bigl(\delta^{3/2}+\epsilon \bigr).
1877: \end{equation}
1878: Notice that
1879: $$\bbE^{\mu^{\eps}} |J_n|^2={\cal O}(\delta).$$
1880: We combine this with formula \eqref{e:alpha_estim_1d} to obtain
1881: \begin{eqnarray}
1882: \widehat{A}_{N, \delta}(x^\eps) &=& A -  \frac{\sum_{n=0}^{N-1} V'(x^\eps_n)
1883: J_n}{\sum_{n=0}^{N-1} |V'(x^\eps_n)|^2 \delta} - \frac{\sum_{n=0}^{N-1} V'(x^\eps_n)
1884: R(\eps, \delta)}{\sum_{n=0}^{N-1} |V'(x^\eps_n)|^2 \delta} \nonumber \\ & =:& A - I_1 -
1885: I_2. \label{e:a_j1_j2}
1886: \end{eqnarray}
1887: %
1888: We need to control the terms $I_1$ and $I_2$. We start with $I_1$, which we rewrite in
1889: the form
1890: \begin{eqnarray*}
1891: I_1 & = &  \eps^{\frac{\gamma
1892: -\alpha}{2}}\frac{\frac{1}{\sqrt{(N\delta)}}\sum_{n=0}^{N-1} V'(x^\eps_n)
1893: J_n}{\frac{1}{N}\sum_{n=0}^{N-1} |V'(x^\eps_n)|^2}.
1894: \end{eqnarray*}
1895: The central limit theorem for (discrete) martingales implies that
1896: \begin{eqnarray*}
1897: \lim_{N \rightarrow + \infty} \frac{1}{\sqrt{(N\delta)}} \sum_{n=0}^{N-1} V'(x^\eps_n)
1898: J_n  & = & \frac{1}{\sqrt \delta}\mathcal{N} \left(0, \E^{\mu^{\eps}} \left(
1899: |V'(x^{\eps}(0))|^2|J_0|^2 \right) \right)
1900: \\ & = &
1901: \frac{1}{\sqrt \delta}\mathcal{N} \left(0, c \, \delta \right) = \sqrt{c} \, \mathcal{N}(0,1)
1902: \quad \mbox{in law},
1903: \end{eqnarray*}
1904: for some $c$ uniform in $\epsilon \to 0$. In the above we have used the fact that
1905: $\E^{\mu^\eps} |J_0|^2 = 2 \Sigma \delta$.
1906: 
1907: On the other hand, the ergodic theorem implies that
1908: \begin{equation}\label{e:denom}
1909: \lim_{N \rightarrow + \infty}\frac{1}{N}\sum_{n=0}^{N-1} |V'(x^\eps_n)|^2 = \E^{\mu^\eps}
1910: |V'(x)|^2, \quad \mbox{a.s.}
1911: \end{equation}
1912: Hence, by Slutsky's theorem, and remembering that $N = [\eps^{-\gamma}]$, we have that
1913: \begin{equation}\label{e:j1_lim}
1914: \lim_{\eps \rightarrow 0 } I_1 = 0 \quad \mbox{in law}.
1915: \end{equation}
1916: Consider now the term $I_2$. It can be written as
1917: $$
1918: I_2 = \frac{\eps^{\gamma - \alpha}\sum_{n=0}^{N-1}V'(x^\eps_n) R(\eps, \delta)
1919: }{\frac{1}{N}\sum_{n=0}^{N-1} |V'(x^\eps_n)|^2}.
1920: $$
1921: The ergodic theorem implies that the denominator in the above expression converges a.s.
1922: to a finite value. To study the numerator of the above expression we use estimate
1923: \eqref{e:R_est_1}, together with the H\"{o}lder inequality, to estimate
1924: \begin{eqnarray*}
1925: \E^{\mu^\eps} \left| \eps^{\gamma - \alpha} \sum_{n = 0}^{N-1} V'(x^\eps_n) R(\eps,
1926: \delta)\right| &\leq & \eps^{\gamma - \alpha} \sum_{n=0}^{N - 1} \left(
1927: \E^{\mu^\eps}|V'(x^\eps_n)|^q  \right)^{1/q} \left( \E^{\mu^\eps} |R(\eps, \delta)|^p
1928: \right)^{1/p}
1929: \\ & \leq &
1930: C \eps^{\gamma - \alpha} \sum_{n = 0}^{N - 1}
1931: \Bigl(\E^{\mu^\eps}  |R(\eps, \delta)|^p \Bigr)^{1/p}
1932: \\ & \leq &
1933: C\bigl(\epsilon^{\alpha/2}+\epsilon^{1-\alpha}\bigr).
1934: \end{eqnarray*}
1935: %
1936: In the above we have used Corollary \ref{cor:moments}, together with Assumptions
\ref{a:1}. The above calculation shows that the numerator of $I_2$ converges to $0$ in $L^1$,
1938: and hence in law. This, together with the a.s. convergence of the denominator
1939: and Slutsky's theorem gives
1940: \begin{equation}\label{e:j2_lim}
1941: \lim_{\eps \rightarrow 0 } I_2 = 0 \quad \mbox{in law}.
1942: \end{equation}
1943: Combining \eqref{e:a_j1_j2}, \eqref{e:j1_lim} and \eqref{e:j2_lim} completes the
1944: proof of the theorem.  \qed
1945: %
1946: %
1947: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1948: %
1949: %                        PROOF OF THM 1.1
1950: %
1951: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
1952: %
1953: \subsection{Proof of Theorem \ref{thm:par_est_sigma}}\label{sec:thm_sigma}
1954: %
1955: We combine Proposition \ref{prop:xndelta1} with \eqref{e:integr_parts} to write
1956: the difference $x_{n+1}^\eps - x_n^\eps$  in the form
1957: \begin{equation}
1958: x_{n+1}^\eps - x_n^\eps = \sqrt{2 \Sigma \, \delta} \, \xi_n + \widehat{R}(\delta, \eps)
1959: \label{e:xn_loc_1}
1960: \end{equation}
1961: in law, where, for $\eps, \, \delta$ sufficiently small,
1962: %
1963: \begin{equation}\label{e:R_est_3}
\left( \E^{\mu^\eps} |\widehat{R}(\delta, \eps)|^p \right)^{1/p}
1965: \leq C \left(\delta + \eps^{\beta}\right).
1966: \end{equation}
1967: We substitute \eqref{e:xn_loc_1} into the formula for the estimator
1968: \eqref{e:sigma_estim_1d} with $d = 1$ to obtain
1969: \begin{eqnarray*}
1970: \widehat{\Sigma}_{N, \delta}(x^{\epsilon}) &=& \Sigma \frac{1}{N} \sum_{n=0}^{N-1}
1971: \xi_n^2 + \frac{1}{2 N \delta} \sum_{n=0}^{N-1} \left( \widehat{R}(\delta, \eps)
1972: \right)^2 + \frac{1}{N \delta} \sum_{n=0}^{N-1} \sqrt{2\Sigma \delta}\xi_n
1973: \widehat{R}(\delta, \eps)
1974: \\ & =:&
1975: \Sigma \frac{1}{N}\sum_{n=0}^{N-1} \xi_n^2 + I_1 + I_2.
1976: \end{eqnarray*}
By the law of large numbers the first term tends almost surely to $\Sigma$ as $\epsilon
\to 0$ (which implies $N \to \infty$). Thus it suffices to show that the remaining terms
tend to zero in law. We do this by showing that they tend to zero in $L^1$.
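
The decomposition of $\widehat{\Sigma}_{N, \delta}(x^{\epsilon})$ into
$\Sigma \frac{1}{N}\sum_{n=0}^{N-1} \xi_n^2 + I_1 + I_2$ used above can be checked
directly. The following is a sketch which assumes that the estimator
\eqref{e:sigma_estim_1d} with $d=1$ has the form
$\frac{1}{2 N \delta} \sum_{n=0}^{N-1} (x^\eps_{n+1} - x^\eps_n)^2$; this form is
consistent with the three sums displayed above. Squaring \eqref{e:xn_loc_1} gives
\begin{equation*}
\left( x^\eps_{n+1} - x^\eps_n \right)^2 = 2 \Sigma \, \delta \, \xi_n^2
+ \left( \widehat{R}(\delta, \eps) \right)^2
+ 2 \sqrt{2 \Sigma \delta} \, \xi_n \widehat{R}(\delta, \eps),
\end{equation*}
and summing over $n$ and dividing by $2 N \delta$ yields, term by term, the sum
involving $\xi_n^2$, the term $I_1$ and the term $I_2$.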
1980: 
1981: Note that
1982: \begin{align*}
1983: \bbE^{\mu^{\eps}}|I_1| & \le C \sum_{n=0}^{N-1} \bbE^{\mu^\eps}(\widehat{R}(\delta,\eps))^2\\
&\le CN(\delta+\eps^{\beta})^2\\
1985: &\le C(\delta+\epsilon^{2\beta}\delta^{-1})\\
1986: &=C(\epsilon^{\alpha}+\epsilon^{2\beta-\alpha})\\
1987: &=o(1),
1988: \end{align*}
for $\alpha \in (0,1)$, since $\beta$ can be chosen arbitrarily close to $\frac12$.
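
To make the role of $\beta$ explicit we include a short worked check; the parameter
$\eta$ below is introduced only for illustration. The passage from
$CN(\delta+\eps^{\beta})^2$ to $C(\delta+\epsilon^{2\beta}\delta^{-1})$ uses the
elementary inequality $(a+b)^2 \le 2(a^2 + b^2)$ together with $N \le C \delta^{-1}$,
which we assume holds because $N \delta$ is kept fixed. Writing
$\beta = \frac12 - \eta$ with $\eta > 0$ small, and $\delta = \eps^{\alpha}$, we have
\begin{equation*}
\epsilon^{2\beta-\alpha} = \epsilon^{1 - 2\eta - \alpha},
\end{equation*}
so that both exponents, $\alpha$ and $1 - 2\eta - \alpha$, are positive provided
$0 < \alpha < 1 - 2 \eta$. For example, $\alpha = \frac12$ and $\eta = \frac18$ give
$\bbE^{\mu^{\eps}}|I_1| \le C(\epsilon^{1/2} + \epsilon^{1/4})$.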
1990: 
1991: Similarly
1992: \begin{align*}
1993: \bbE^{\mu^{\eps}}|I_2| & \le C \sum_{n=0}^{N-1} \delta^{\frac12}(\delta+\eps^{\beta})\\
1994: &\le C(\delta^{\frac12}+\epsilon^{\beta}\delta^{-\frac12})\\
1995: &=C(\epsilon^{\frac{\alpha}{2}}+\epsilon^{\beta-\frac{\alpha}{2}})\\
1996: &=o(1),
1997: \end{align*}
for $\alpha \in (0,1)$, since $\beta$ can be chosen arbitrarily close to $\frac12$.
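
The first inequality in the bound on $I_2$ can be unpacked as follows. This is a sketch
which assumes that the $\xi_n$ have unit second moment, $\bbE^{\mu^{\eps}} \xi_n^2 = 1$
(as suggested by the law of large numbers step above), and that $N \delta$ is kept
fixed. By the Cauchy--Schwarz inequality and \eqref{e:R_est_3} with $p = 2$,
\begin{equation*}
\bbE^{\mu^{\eps}} \left| \xi_n \widehat{R}(\delta, \eps) \right|
\le \left( \bbE^{\mu^{\eps}} \xi_n^2 \right)^{1/2}
\left( \bbE^{\mu^{\eps}} |\widehat{R}(\delta, \eps)|^2 \right)^{1/2}
\le C \left( \delta + \eps^{\beta} \right).
\end{equation*}
The prefactor $\frac{\sqrt{2 \Sigma \delta}}{N \delta}$ contributes a factor
$C \delta^{\frac12}$ to each summand, and summing $N \le C \delta^{-1}$ such terms gives
$C \delta^{-\frac12} (\delta + \eps^{\beta})$, which is the second line of the bound.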
1999: This completes the proof.
2000: \qed
2001: %
2002: %
2003: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2004: %
2005: %
2006: \subsection{Proof of Theorem \ref{prop:drift_estim_2}}
2007: %
2008: Taking the limit $T \to \infty$ in \eqref{eq:alpha2} gives
2009: %
2010: $$
2011: \lim_{T \to \infty}\tilde{A}(x^{\eps})=\widehat{\Sigma} \frac{\bbE^{\mu^{\eps}} (\Delta
2012: V(x))}{\bbE^{\mu^{\eps}} |\nabla V(x)|^2}.
2013: $$
2014: %
2015: Proposition \ref{lem:xeps_meas_ddim} implies that
2016: %
2017: $$\lim_{\eps \to 0}\widehat{\Sigma} \frac{\bbE^{\mu^{\eps}} (\Delta V(x))}
2018: {\bbE^{\mu^{\eps}} |\nabla V(x)|^2}= \widehat{\Sigma} \frac{\bbE^{\mu} ( \Delta V(x)
2019: )}{\bbE^{\mu} |\nabla V(x)|^2},$$
2020: %
2021: where $\E^\mu$ denotes expectation with respect to the invariant distribution $\rho(x)$
2022: of the homogenized process, given by formula \eqref{e:gibbs}. An
2023: integration by parts now gives that
2024: %
2025: $$
2026: \E^{\mu} |\nabla V(x)|^2 = \frac{\sigma}{\alpha} \E^{\mu} ( \Delta V(x) ).
2027: $$
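%
For the reader's convenience we sketch this integration by parts. We assume here that
the invariant density in \eqref{e:gibbs} has the Gibbs form
$\rho(x) = Z^{-1} e^{-\alpha V(x)/\sigma}$, with $Z$ the normalization constant, and that
boundary terms vanish (for instance because $V$ grows sufficiently fast at infinity).
Since $\nabla \rho = - \frac{\alpha}{\sigma} \nabla V \, \rho$, we obtain
\begin{eqnarray*}
\E^{\mu} ( \Delta V(x) ) & = & \int \Delta V(x) \, \rho(x) \, dx
= - \int \nabla V(x) \cdot \nabla \rho(x) \, dx
\\ & = &
\frac{\alpha}{\sigma} \int |\nabla V(x)|^2 \rho(x) \, dx
= \frac{\alpha}{\sigma} \E^{\mu} |\nabla V(x)|^2,
\end{eqnarray*}
which rearranges to the identity above.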
2028: %
2029: Thus, the final result of our considerations is that
2030: %
2031: $$
2032: \lim_{\eps \rightarrow 0} \lim_{T \rightarrow \infty} \tilde{A}(x^{\eps}) =
2033: \frac{\widehat{\Sigma}}{\sigma} \alpha.
2034: $$
2035: \qed
2036: %
2037: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2038: %
2039: %                          CONCLUSIONS
2040: %
2041: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2042: %
2043: \section{Conclusions and Future Work}
2044: \label{sec:conc}
2045: The problem of parameter estimation for continuous time multiscale
2046: diffusion processes is studied in this paper. Our goal is to accurately fit a
2047: homogenized equation from data which has a multiscale character.
2048: Our main conclusions are as follows:
2049: 
2050: \begin{itemize}
2051: 
2052: \item In order to estimate the drift and diffusion
2053: coefficients accurately it is necessary to subsample.
2054: 
\item There is an optimal subsampling rate, between the two
characteristic time scales of the multiscale data.
2057: 
2058: \item The optimal subsampling rate may differ for different
2059: parameters.
2060: 
2061: \item For gradient multiscale systems it is only necessary to estimate
2062: the diffusion coefficient correctly, if one uses
2063: the second estimator for the drift -- $\tilde{A}$, defined in equations \eqref{eq:alpha2} and
2064: \eqref{e:alpha_estim_1d2}.
2065: 
2066: 
2067: \end{itemize}
2068: 
2069: Both analysis and numerics are given to substantiate these claims.
2070: Many open questions remain; we list those which seem important to us.
2071: 
2072: \begin{itemize}
2073: 
2074: \item Rough heuristics indicate that any subsampling
2075: rate which is between the two characteristic time scales of the processes, namely
$\mathcal{O}(\eps^2)$ and $\mathcal{O}(1)$, should enable accurate estimation
of the drift and diffusion coefficients. However, our analysis works only
2078: in the case where the subsampling is between
2079: $\mathcal{O}(\eps)$ and $\mathcal{O}(1)$. Closing the gap between intuition
2080: and what can be proved would be valuable.
2081: 
2082: \item Analyze other parameter estimation problems for multiscale
2083: diffusions, not necessarily of gradient form. In particular study both
2084: averaging and homogenization set-ups, as outlined in the introductory
2085: section.
2086: 
2087: 
2088: \item In this paper we have generated simulated multiscale data
2089: by using a multiscale diffusion process. However this was done to
2090: provide a convenient analytical framework. In applications it
2091: is of interest to develop tools for characterizing the multiscale
structure of a given path -- to estimate characteristic time scales.
2093: Related work has been done in \cite{FPSS03}. Further study would be
2094: of interest.
2095: 
2096: 
2097: \item Determine precisely the range of subsamplings which will give
2098: accurate parameter estimates and optimize the subsampling rate for
2099: accuracy.
2100: 
2101: \item Optimize the algorithm by combining estimates based on shifts of
2102: the subsampled data -- so that information is not thrown away; this is
2103: done in the context of econometrics and finance in
2104: \cite{AitMykZha05b, AitMykZha05a}.
2105: 
2106: 
2107: \item Analyze questions analogous to those raised here for
2108: multidimensional multiscale processes.
2109: %
2110: \item Analyze questions analogous to those raised here
2111: for hypoelliptic multiscale diffusions; in particular the case where the homogenized
2112: equation is a fully elliptic first order Langevin equation which is derived from an
2113: overdamped second-order Langevin equation.
2114: %
2115: \item Study whether there is any advantage in using random subsampling rates.
2116: %
\item Study drift that depends nonlinearly on the parameters to be
2118: estimated:
2119: %
2120: $$
2121: d x^\eps (t) = - \nabla V(x^\eps (t), \eps ; \alpha) dt + \sqrt{2 \sigma} d \beta (t).
2122: $$
2123: %
\item Study parameter estimation for deterministic multiscale problems in which the fast
process is a strongly mixing chaotic deterministic process.
2126: %
2127: %
2128: \end{itemize}
2129: 
2130: {\it Acknowledgements} The authors are grateful to Ch. Sch\"{u}tte
2131: for useful discussions concerning molecular dynamics, leading us to
2132: formulate this problem. They also thank S. Olhede for useful discussions
2133: and comments.
2134: %
2135: \bibliography{../bibtex_files/mybib}
2136: \bibliographystyle{plain}
2137: \end{document}
2138: