
1: \documentclass[10pt]{article}

2: \usepackage[dvips]{graphicx}

3: \usepackage{amsmath,amsfonts,amssymb,latexsym,epsfig}

4: %\usepackage[notref,notcite]{showkeys}

5: \include{references}

6: \usepackage{mathrsfs}

7: \usepackage{verbatim}

8: \usepackage{latexsym}

9: \usepackage{amsthm}

10: \usepackage{amssymb}

11: \usepackage{graphics}

12: \usepackage{amsbsy}

13: %\usepackage{theorem}

14: %\usepackage{showlabels}

15: \usepackage{enumerate}

16: \usepackage{times}

17: %\newskip\structskipamount \structskipamount=1.5ex

18: %\newcommand{\structskip}{\par\ifdim\lastskip<\structskipamount

19: %  \removelastskip\penalty-100\vskip\structskipamount\fi}

20:

21: \newtheorem{theorem}{Theorem}[section]

22: \newtheorem{lemma}[theorem]{Lemma}

23: \newtheorem{remark}[theorem]{Remark}

24: \newtheorem{corollary}[theorem]{Corollary}

25: \newtheorem{prop}[theorem]{Proposition}

26: \newtheorem{assumptions}[theorem]{Assumptions}

27:

28: %\def\qed{\unskip\nobreak\hfil\penalty50\hskip2em\hbox{}\nobreak

29: %   \hfil\vrule width0.5em height 1.5ex depth0pt\kern2pt%

30: %   \parfillskip=0pt\finalhyphendemerits=0 \par}

31: %\newenvironment{proof}%

32: %  {\structskip\noindent\textbf{Proof.} \ignorespaces}%

33: %  {\qed\structskip}

34:

35: \numberwithin{equation}{section}

36: %

37: \graphicspath{{figures/}}

38: %

39: \newcommand{\E}{{\mathbb E}}

40: \newcommand{\bbE}{{\mathbb E}}

41: \newcommand{\Ee}{{\mathbb E}^{\mu^\eps}}

42: \newcommand{\LL}{{\mathcal L}}

43: \newcommand{\KK}{{\mathcal K}}

44: \newcommand{\HH}{{\mathcal H}}

45: \newcommand{\T}{{\mathbb T}}

46: \newcommand{\R}{{\mathbb R  }}

47: \newcommand{\D}{{\mathcal D  }}

48: \newcommand{\RR}{{\mathcal R  }}

49: \newcommand{\pd}[2]{\frac{\partial #1}{\partial #2}}

50: \newcommand{\pdt}[1]{\frac{\partial #1}{\partial t}}

51: \newcommand{\pdtau}[1]{\frac{\partial #1}{\partial \tau}}

52: \newcommand{\pdd}[2]{\frac{\partial^2 #1}{\partial {#2}^2}}

53: \newcommand{\pddd}[3]{\frac{\partial^2 #1}{\partial {#2} \partial{#3}}}

54: \newcommand{\pdddd}[4]{\frac{\partial^2 #1}{\partial {#2} \partial{#3}

55: \partial{#4}}}

56: \newcommand{\brk}[1]{\left( #1 \right)}

57: \newcommand{\Brk}[1]{\left[ #1 \right]}

58: \newcommand{\px}{\partial_x}

59: \newcommand{\py}{\partial_y}

60: \newcommand{\bbT}{\mathbb{T}}

61: \newcommand{\bbR}{\mathbb{R}}

62: \newcommand{\cA}{\mathcal A}

63: \newcommand{\cL}{\mathcal L}

64: \newcommand{\cLo}{\cL^{OU}}

65: \newcommand{\rou}{\rho^{OU}}

66: \newcommand{\pit}{\hat{\pi}}

67: \newcommand{\piz}{\pi_0}

68: \newcommand{\eps}{\epsilon}

69: \newcommand{\xeps}{x^{\epsilon}}

70: \newcommand{\xepss}{x_s^{\epsilon}}

71: \newcommand{\yepss}{y_s^{\epsilon}}

72: \newcommand{\goup}{ e^{\frac{y^2}{2 D}}}

73: \newcommand{\goum}{ e^{-\frac{y^2}{2 D}}}

74: \newcommand{\la}{\langle}

75: \newcommand{\ra}{\rangle}

76:

77:

78: %

79: %   MAIN DOCUMENT

80: %

81: %

82: \begin{document}

83: %

84: %

85: %

86: \setlength{\baselineskip}{10pt}

87: \title{PARAMETER ESTIMATION FOR MULTISCALE DIFFUSIONS}

88: \author{G.A. Pavliotis\footnote{Corresponding author.

89: E-mail address: g.pavliotis@maths.warwick.ac.uk.} \\

90:         Department of Mathematics\\

91:     Imperial College London \\

92:         London SW7 2AZ, UK \\

93:         and \\

94:         A.M. Stuart\footnote{E-mail address: stuart@maths.warwick.ac.uk.} \\

95:         Mathematics Institute \\

96:         Warwick University \\

97:         Coventry CV4 7AL, UK

98:                     }

99: \maketitle

100:

101: \begin{abstract}

102: We study the problem of parameter estimation for time-series possessing two, widely

103: separated, characteristic time scales. The aim is to understand situations where it is

104: desirable to fit a homogenized singlescale model to such multiscale data. We

105: demonstrate, numerically and analytically, that if the data is sampled too finely then

106: the parameter fit will fail, in that the correct parameters in the homogenized model are

107: not identified. We also show, numerically and analytically, that if the data is

108: subsampled at an appropriate rate then it is possible to estimate the coefficients of the

109: homogenized model correctly.

110: \end{abstract}

111:

112: \noindent {\bf Keywords:} Parameter estimation, multiscale diffusions, stochastic

113: differential equations, homogenization, maximum likelihood, subsampling.

114:

115: %

116: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

117: %

118: %                                      INTRODUCTION

119: %

120: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

121: %

122: \section{Introduction}

123: %

124: \label{sec:intro}

125:

126:

127:

128: Parameter estimation for continuous time stochastic models is an increasingly important

129: part of the overall modelling strategy in a wide variety of applications. It is quite

130: often the case that the data to be fitted to a diffusion process has a multiscale

131: character. One example is the field of  molecular dynamics, where it is desirable to find

132: effective models for low dimensional phenomena (such as conformational dynamics, vacancy

133: diffusion and so forth) which are embedded within higher dimensional time-series. Another

134: example is the ocean--atmosphere sciences where it is desirable to find effective models

135: for large--scale structures, whilst representing the small--scales stochastically.  The

136: multiscale structure of the data in these problems renders the problem of parameter

137: estimation very subtle, and great care has to be taken in order to estimate the

138: coefficients correctly. The aim of the paper is to shed light on this estimation problem

139: through the study of a simple class of model problems, typical of those arising in

140: molecular dynamics.

141:

142: In econometrics and finance, the problem of estimating parameters for continuous time

143: diffusion processes in the presence of small scale fluctuations (market microstructure

144: noise) has been considered by A\"{i}t--Sahalia and collaborators

145: \cite{AitMykZha05b,AitMykZha05a} and more recently in \cite{BaNiHaLuSh06}. In that work

146: the microscale is input as an independent white observational noise that is superimposed

147: on top of a singlescale diffusion process. We have a somewhat different framework: we

148: work in the context of coupled systems of diffusions exhibiting multiple scales. Our aim

149: is to fit a singlescale homogenized diffusion to data. Models similar to the ones

150: considered in this paper have been studied extensively in finance, see \cite{fouque00}

151: and the references therein. That book contains a discussion of parameter estimation for

152: multiscale diffusions, with emphasis on the estimation of the rate of mean reversion of

153: volatility from historical asset price data; see \cite[Ch. 4]{fouque00}.

154:

155: Various numerical algorithms for

156: diffusions with multiple scales have been developed \cite{Vand03} and analyzed

157: \cite{ELV05}. Those papers are finely honed to optimize the fitting of the homogenized

158: diffusion in situations where the multiscale model is known explicitly. In contrast, in

159: this paper we introduce multiscale diffusions primarily as a device to generate

160: multiscale data; we do not assume that the multiscale model is available to us when doing

161: parameter estimation. This enables us to gain understanding of parameter estimation in

162: situations where the multiscale data is given to us from experiments, or comes from a

163: model where the scale--separation is not explicit. Two recent papers contain numerical

164: experiments relating to the extraction of averaged or homogenized diffusions from data

165: generated by a multiscale diffusion; see \cite{Cald06,CromVanEij06b}.

166:

167: Despite differences from the framework used in

168: \cite{AitMykZha05b,AitMykZha05a,BaNiHaLuSh06} to study problems arising

169: in econometrics and finance, similarities with our work remain:

170: trying to fit the

171: models on the basis of data sampled at too high a frequency leads to incorrect parameter

172: inference; furthermore, there is an optimal subsampling rate for the data to obtain

173: correct inference.

174:

175: There are two forms of multiscale diffusions which are of particular

176: interest in the context of parameter estimation. The first gives rise

177: to {\bf averaging} for SDEs, and the second to {\bf homogenization}

178: for SDEs. For averaging one has, for $\eps \ll 1$,

179: \begin{subequations}

180: \begin{eqnarray}

181:  d x^\eps(t) &=& f(x^\eps(t),y^\eps(t)) \, dt+\alpha(x^\eps(t),y^\eps(t)) \,  dU(t),  \\

182:  d y^\eps(t) &=& \frac{1}{\eps} g(x^\eps(t),y^\eps(t)) \, dt+\frac{1}{\sqrt\eps}

183:  \beta(x^\eps(t),y^\eps(t)) \, dV(t),

184: \end{eqnarray}

185: \label{e:averg}

186: \end{subequations}

187: with $U,V$ standard Brownian motions. Averaging $f$ and $\alpha \alpha^T$

188: over the invariant measure of the $y^\eps$ equation, with $x^\eps$ viewed as fixed,

189: gives an averaged SDE for $x$. The fast process $y$, with timescale $\eps$,

190: is eliminated. For homogenization one has

191: \begin{subequations}

192: \begin{eqnarray}

193: d x^\eps(t) &=&  \left( \frac{1}{\eps} f_0(x^\eps(t),y^\eps(t)) +

194: f_1(x^\eps(t),y^\eps(t)) \right) dt \nonumber \\

195: &+& \alpha(x^\eps(t),y^\eps(t)) \, dU( t),\\

196: d y^\eps(t) &=& \frac{1}{\eps^2} g(x^\eps(t),y^\eps(t)) \, dt +

197: \frac{1}{\eps} \beta(x^\eps(t),y^\eps(t)) \, dV(t),

198: \end{eqnarray}

199: \label{e:homog}

200: \end{subequations}

201: where it is assumed that $f_0$ averages to zero against the invariant measure

202: of the fast process $y^\eps$ with $x^\eps$ fixed.

203: Now $y^\eps$ has time-scale $\eps^2$ and is eliminated.

204: The fluctuations in $f_0$, suitably amplified by $\eps^{-1}$, induce ${\cal O}(1)$ effects

205: in the homogenized equation for $x^\eps$. In both cases \eqref{e:averg} and \eqref{e:homog} it is

206: possible to show \cite{lions} that the process $x^\eps(t)$ converges in law, as $\eps

207: \rightarrow 0$, to the solution of an effective SDE of the form

208: \begin{equation}\label{e:effect}

209: d x(t) = F(x(t)) dt + A(x(t)) d U(t).

210: \end{equation}

211: Explicit formulae can be derived for the effective coefficients $F(x)$ and $A(x)$ in the

212: above equation \cite{lions, PavlSt06b}. A natural question that arises then is how to fit

213: an SDE of the form \eqref{e:effect} to data generated by a multiscale stochastic equation

214: of the form \eqref{e:averg} or \eqref{e:homog}, under the assumption of scale separation,

215: i.e. when $\eps \ll 1$. This paper is a first attempt towards the study of this

216: interesting problem, for a specific class of SDEs of the form \eqref{e:homog}.

217:

218: Our basic model will be the first order Langevin equation

219: %

220: \begin{equation}

221: d x^\eps(t) = - \nabla V \left(x^\eps(t), \frac{x^\eps(t)}{\eps}; \alpha \right)  dt +

222: \sqrt{2 \sigma } d \beta(t),

223: %

224: \label{e:main}

225: %

226: \end{equation}

227: %

228: where $\beta(t)$ denotes standard Brownian motion on $\R^d$ and $\sigma$ is a positive

229: constant. The two--scale potential $V \left(x, y; \alpha \right)$ is assumed to

230: consist of a large--scale and a fluctuating part

231: %

232: \begin{equation}

233: %

234: V ( x, y ; \alpha) = \alpha V(x) + p(y).

235: %

236: \label{e:potential}

237: %

238: \end{equation}

239: %

240: As we show explicitly in \eqref{e:eqns_motion} this set-up puts us in

241: the framework of homogenization for SDEs.

242:

243: Under \eqref{e:potential}, the SDE \eqref{e:main} becomes

244: \begin{equation}

245: d x^\eps(t) = - \alpha \nabla V(x^\eps(t)) \, dt - \frac{1}{\eps}\nabla p \left(

246: \frac{x^\eps(t)} {\eps} \right) \, dt + \sqrt{2 \sigma} \, d \beta (t).

247: %

248: \label{e:xeps_V}

249: %

250: \end{equation}

251: If $p$ is periodic on $\bbT^d$ and sufficiently smooth, then it is well

252: known (see \cite{lions, pardoux} for example) that, as $\eps \rightarrow 0$,

253: the solution $x^\eps(t)$ of \eqref{e:main} converges in law to the

254: solution of the SDE

255: %

256: \begin{equation}

257:  d x(t) = -\alpha K \nabla V(x(t)) dt + \sqrt{2 \sigma K} d \beta (t),

258: \label{e:lim_sde}

259: \end{equation}

260: with

261: \begin{equation}

262: K = \int_{\T^d} \left( I + \nabla_y \phi(y) \right)  \left( I + \nabla_y \phi(y)

263: \right)^T \, \mu(dy) \label{e:coeffs}

264: \end{equation}

265: and

266: \begin{equation}

267: \mu(dy) = \rho(y) dy = \frac{1}{Z} e^{-p(y)/\sigma} \, dy, \quad Z = \int_{\T^d}

268: e^{-p(y)/\sigma} \, dy. \label{e:gibbs_torus}

269: \end{equation}

270: The field $\phi(y)$ is the solution of the Poisson equation

271: \begin{equation}

272: - \LL_0 \phi(y) = -\nabla_y p(y), \quad \LL_0 := - \nabla_y p(y) \cdot \nabla_y + \sigma

273: \Delta_y, \label{e:cell}

274: \end{equation}

275: with periodic boundary conditions. The function $\rho(y)$ spans the null-space of ${\cal

276: L}_0^*$, the $L^2$--adjoint of $\LL_0$. The effective diffusion tensor is positive

277: definite and the diffusivity is always depleted \cite{Oll94}. Physically this occurs

278: because the homogenized process must represent the cost of traversing the many small

279: energy barriers present in the original multiscale problem but which are not explicitly

280: captured in the homogenized potential.  In Figure \ref{fig:potential} we

281: plot the potential $V^\eps(x,x/\eps)$, as well as the average potential $V(x)$,

282: illustrating this phenomenon.  In fact, the effective diffusivity $\Sigma = \sigma K$

283: decays exponentially fast in $\sigma$ as $\sigma \rightarrow 0$.

284: See \cite{CampPiatn2002} and the references therein. Thus the original

285: and homogenized diffusivities are exponentially different at small

286: temperatures.

287:

288: To illustrate these facts explicitly, consider the problem in one dimension, $d = 1$. In

289: this case the limiting equation takes the form

290: \begin{equation}

291:  d x(t) = - A  V'(x(t)) dt + \sqrt{2 \Sigma } d \beta (t).

292: \label{e:lim_sde_1d}

293: \end{equation}

294: The effective coefficients are

295: \begin{equation}

296: A = \frac{\alpha L^2}{Z \widehat{Z}} \quad \mbox{and} \quad \Sigma = \frac{\sigma L^2}{Z

297: \widehat{Z}}, \label{e:coeffs_1d}

298: \end{equation}

299: where

300: \begin{equation}

301: \widehat{Z} = \int_{0}^L e^{p(y)/\sigma} \, dy, \quad Z = \int_{0}^L e^{-p(y)/\sigma} \,

302: dy. \label{e:z_1d}

303: \end{equation}
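The formulae \eqref{e:coeffs_1d} can be recovered directly from the cell problem \eqref{e:cell}. The following is only a brief sketch for $d=1$; the auxiliary function $\psi$ and the constant $C$ are introduced solely for this calculation. Writing $\psi(y) = 1 + \phi'(y)$, equation \eqref{e:cell} becomes $\sigma \psi'(y) = p'(y) \psi(y)$, so that $\psi(y) = C e^{p(y)/\sigma}$. Periodicity of $\phi$ requires $\int_0^L \phi'(y) \, dy = 0$, that is $\int_0^L \psi(y) \, dy = L$, and hence $C = L/\widehat{Z}$. Substituting into \eqref{e:coeffs} gives
$$
K = \int_0^L \psi(y)^2 \, \frac{1}{Z} e^{-p(y)/\sigma} \, dy = \frac{C^2}{Z} \int_0^L e^{p(y)/\sigma} \, dy = \frac{L^2}{Z \widehat{Z}},
$$
so that $A = \alpha K$ and $\Sigma = \sigma K$, in agreement with \eqref{e:coeffs_1d}.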

304: \begin{figure}

305: \begin{center}

306: \includegraphics[width=3.0in, height = 3.0in]{potential3.eps}

307: \caption{$V^\eps(x, x/\eps) = \frac{1}{2} x^2 + \sin \left( \frac{x}{\eps} \right)$

308: with $ \eps = 0.1$ and averaged potential $V(x) = \frac{1}{2} x^2$.} \label{fig:potential}

309: \end{center}

310: \end{figure}

311: Notice that $L^2 \leq Z \widehat{Z}$ by the Cauchy--Schwarz inequality. This explicitly shows

312: that the homogenized equation in one dimension comprises motion in the average potential

313: $V(x)$, at a new slower time--scale contracted by $A/\alpha.$
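The Cauchy--Schwarz step can be written out as follows; this is a one-line check that uses only the definitions in \eqref{e:z_1d}:
$$
L = \int_0^L e^{-p(y)/(2\sigma)} \, e^{p(y)/(2\sigma)} \, dy \le \left( \int_0^L e^{-p(y)/\sigma} \, dy \right)^{1/2} \left( \int_0^L e^{p(y)/\sigma} \, dy \right)^{1/2} = \left( Z \widehat{Z} \right)^{1/2},
$$
with equality if and only if $p$ is constant. Hence $\Sigma \le \sigma$ and, for $\alpha > 0$, $A \le \alpha$, with strict inequality whenever the fluctuating part $p$ of the potential is non-trivial.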

314:

315:

316:

317: The main results of the paper can be summarized as follows.

318: Assume that we are given a path $\{x^\eps(t)\}_{t \in [0,T]}$

319: of equation \eqref{e:xeps_V} and that we want to fit an SDE of the form

320: \eqref{e:lim_sde_1d} to the given data, estimating the parameters $A,\Sigma$

321: as $\widehat{A}, \widehat{\Sigma}$. Then the following is a loose

322: statement of our main results; these will be formulated precisely, and

323: proved, below.

324:

325: %

326: \begin{theorem}

327: If we do not subsample, then the estimators $\widehat{A}$ and $\widehat{\Sigma}$ are

328: asymptotically biased -- they converge to $\alpha, \, \sigma$.

329: \end{theorem}

330: %

331: \begin{theorem}

332: If the sampling rate is between the two characteristic time scales of the SDE \eqref{e:main}

333: then the estimators $\widehat{A}$ and $\widehat{\Sigma}$ are

334: asymptotically unbiased -- they converge to $A, \, \Sigma$.

335: \end{theorem}

336: %

337: The rest of the paper is organized as follows. In section \ref{sec:estim} we present the

338: estimators that we will use. In section \ref{sec:numerics} we present various numerical

339: experiments illustrating the behaviour of these estimators.

340: In section \ref{sec:results} we state the main results of this paper,

341: explaining the numerical experiments from the previous section.

342: Section 5 contains some preliminary results that will be useful in the sequel.

343: Section 6 contains the proofs of two central propositions concerning the behaviour

344: of the multiscale diffusion

345: when observed on time--scales long compared with the fast time--scales of the process, but

346: small compared with the slow time--scales of the process.  Section 7 is devoted to

347: the proofs of our theorems. Finally, section \ref{sec:conc} is devoted to some concluding

348: remarks.

349:

350: In the sequel we use $\langle \cdot, \cdot \rangle$ to denote the standard inner--product

351: on $\bbR^d$ and $|\cdot|$ the induced Euclidean norm. Throughout the paper we make the

352: following standing assumptions on the drift vector fields:

353: %

354: \begin{assumptions}

355: \label{a:1}

356: The potentials $p$ and $V$ satisfy:

357: %

358: \begin{itemize}

359: %

360: \item $p(y) \in C^{\infty}_{per}(\bbT^d,\bbR)$;

361: %

362: \item $V(x) \in C^{\infty}(\R^d,\R);$

363: %

364: \item $|\nabla V(x_1)-\nabla V(x_2)| \le L |x_1-x_2| \quad \forall x_1,x_2 \in \R^d;$

365: %

366: \item $\exists a,b>0: \la -\nabla V(x),x \ra \le a-b|x|^2 \quad \forall x \in \R^d;$

367: %

368: \item $e^{-\frac{\alpha}{\sigma} V(x)} \in L^1(\R^d,\R^+)$.

369: %

370: \end{itemize}

371: %

372: \end{assumptions}

373: %

374: The third assumption will be used primarily to deduce that, by choice of

375: origin for $V$,

376: \begin{equation}

377: \label{e:linbnd}

378: |\nabla V(x)| \le L|x|.

379: \end{equation}

380: This assumption could be relaxed and replaced by a polynomial growth bound;

381: however this complicates the analysis without adding new insight.

382: Similarly it is not necessary, of course, that $V$ and $p$ are $C^{\infty}$.

383: The fourth condition, however, is essential:

384: it drives the ergodicity of the process which we use in a fundamental way in the analysis

385: of the drift parameter estimators; it would not, however, be fundamental for estimation

386: of diffusion coefficients alone. The fourth condition implies the fifth, which is simply

387: the requirement that the invariant measure is indeed a probability measure; we state the

388: two conditions separately for clarity of exposition.

389: %

390: %

391: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

392: %

393: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

394: %

395: \section{The Estimators}\label{sec:estim}

396: %

397: In this section we describe various estimators for the parameters

398: arising in equation \eqref{e:lim_sde}. We assume that we are given

399: a path $x=\{x(t)\}_{t \in [0,T]}$, or samples from such a path,

400: $x=\{x_n\}_{n=0}^N$, with $x_n=x(n\delta).$ For simplicity

401: we aim to fit the equation in the form

402: \begin{equation}

403:  d x (t) = -A \nabla V(x (t)) dt + \sqrt{2 \Sigma } d \beta (t),

404: \label{e:lim_sde_sim}

405: \end{equation}

406: where $A$ and $\Sigma$ are scalars. In one dimension this reduces to the form

407: \eqref{e:lim_sde_1d}. Note that in general this is only the correct form for the

408: homogenized equation in one dimension since, typically, the average potential has a

409: matrix as a pre--factor, as in \eqref{e:lim_sde}. However it suffices to exemplify the

410: main ideas in this work, and simplifies the presentation.

411:

412: The standard way to estimate the diffusion coefficient

413: is via the quadratic variation of the path:

414: \begin{equation}

415: \widehat{\Sigma}_{N,\delta}(x) = \frac{1}{2 N \delta d} \sum_{n = 0}^{N-1}

416: |x_{n+1} -  x_n|^2.

417: \label{e:sigma_estim_1d}

418: \end{equation}

419: A key issue in this paper is to understand how to choose $\delta$

420: as a function of $\epsilon$ to ensure that data generated by \eqref{e:main}

421: can be effectively fit to obtain the correct homogenized diffusivity

422: in equations such as  \eqref{e:lim_sde_sim}.
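To see why \eqref{e:sigma_estim_1d} is the natural estimator, consider as a simple illustration the drift-free case $x(t) = \sqrt{2 \Sigma} \, \beta(t)$ (an assumption made only for this calculation). The increments $x_{n+1} - x_n$ are then independent Gaussian vectors with mean zero and covariance $2 \Sigma \delta I$, so that
$$
\E |x_{n+1} - x_n|^2 = 2 \Sigma \delta d, \qquad \E \, \widehat{\Sigma}_{N,\delta}(x) = \frac{1}{2 N \delta d} \, N \cdot 2 \Sigma \delta d = \Sigma .
$$
For the SDE \eqref{e:lim_sde_sim} with $A \ne 0$ the drift is expected to contribute to the increments only at higher order in $\delta$, which is why the same estimator is used with $\delta$ small.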

423:

424: The standard way to estimate drift coefficients is via the path-space likelihood of

425: \eqref{e:lim_sde_sim} with respect to a pure diffusion with no drift,

426: namely (see, for example, \cite{BasRao80,LipShir01a})

427: %

428: $$L(x) \propto  \exp\{-I(x)/(4\Sigma)\}$$

429: %

430: where

431: $$I(x)=\int_0^T\left\{ |A \nabla V(x (t))|^2dt+2A \la \nabla V(x (t)), d x (t) \ra

432: \right\}.$$

433: Maximizing the log-likelihood then gives

434: the estimate $\widehat{A}$ of $A$ given by

435: \begin{equation}

436: \widehat{A}(x) = - \frac{\int_0^T \la \nabla V(x(t)), d x(t)  \ra} {\int_0^T \big| \nabla

437: V(x(t)) \big|^2\, dt}. \label{e:a_est}

438: \end{equation}
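The maximization leading to \eqref{e:a_est} is elementary, since $I(x)$ is quadratic in $A$. As a short check, write
$$
I(x) = A^2 \int_0^T |\nabla V(x(t))|^2 \, dt + 2 A \int_0^T \la \nabla V(x(t)), d x(t) \ra .
$$
Maximizing the likelihood is equivalent to minimizing $I(x)$ over $A$ for fixed $\Sigma > 0$, and setting the derivative with respect to $A$ to zero gives
$$
2 A \int_0^T |\nabla V(x(t))|^2 \, dt + 2 \int_0^T \la \nabla V(x(t)), d x(t) \ra = 0,
$$
which is \eqref{e:a_est}. Note that the resulting estimator does not depend on $\Sigma$.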

439: If the data is given in discrete but finely spaced increments, as often happens

440: in practice, then this estimator can be approximated to yield

441: %

442: \begin{equation}

443: \widehat{A}_{N,\delta}(x) = - \frac{\sum_{n = 0}^{N-1} \la \nabla V(x_n),

444: \left(x_{n+1} - x_n \right)\ra}{\sum_{n=0}^{N-1} \left|\nabla V(x_n) \right|^2

445: \delta}.

446: %

447: \label{e:alpha_estim_1d}

448: %

449: \end{equation}

450: A key issue in this paper is to understand how to choose $\delta$ as a function of

451: $\epsilon$ to ensure that data generated by \eqref{e:main} can be effectively fit to

452: obtain the correct homogenized drift coefficients in equations such as

453: \eqref{e:lim_sde_sim}, via the estimator \eqref{e:alpha_estim_1d}.

454:

455: The gradient structure of the SDE \eqref{e:lim_sde_sim} can be used to obtain a second

456: estimator for the drift coefficients. This second estimator, which we now derive, is of

457: interest for two different reasons: firstly it may be useful in practice as it may lead

458: to smaller variance in estimators; secondly it highlights the fact that working out how

459: to sample the data to obtain the correct estimation of the diffusion coefficient alone

460: will lead to correct estimation of the drift parameters, at least for the

461: class of gradient--structure SDEs that we consider in this paper.

462: The second estimator requires

463: the input of an estimator $\widehat\Sigma$ for the diffusion coefficient and is

464: \begin{equation}

465: \tilde{A}(x)=\widehat{\Sigma}\frac{\frac{1}{T}\int_0^T \Delta V(x(t)) \, dt  }

466: {\frac{1}{T} \int_0^T |\nabla V(x(t))|^2 \, dt}. \label{eq:alpha2}

467: \end{equation}

468: Approximating to allow for the input of discrete--time data gives

469: %

470: \begin{equation}

471: \tilde{A}_{N,\delta}(x) =  \widehat{\Sigma}\frac{\sum_{n = 0}^{N-1} \Delta V(x_n) \delta}

472: {\sum_{n=0}^{N-1} \left|\nabla V(x_n) \right|^2 \delta}.

473: %

474: \label{e:alpha_estim_1d2}

475: \end{equation}
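As an illustration, consider the quadratic potential $V(x) = \frac{1}{2} x^2$ in one dimension (chosen here only as an example). Then $\nabla V(x) = x$ and $\Delta V(x) = 1$, so that the two discrete estimators \eqref{e:alpha_estim_1d} and \eqref{e:alpha_estim_1d2} reduce to
$$
\widehat{A}_{N,\delta}(x) = - \frac{\sum_{n=0}^{N-1} x_n (x_{n+1} - x_n)}{\delta \sum_{n=0}^{N-1} x_n^2}, \qquad \tilde{A}_{N,\delta}(x) = \frac{\widehat{\Sigma}}{\frac{1}{N} \sum_{n=0}^{N-1} x_n^2} .
$$
The second expression is the ratio of the estimated diffusivity to the empirical second moment of the data. This is consistent with the fact that, for $A > 0$, the invariant measure of \eqref{e:lim_sde_1d} for this potential is Gaussian with mean zero and variance $\Sigma/A$.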

476: %

477: The following result shows that $\tilde{A}(x)$ is a natural approximation to $\widehat

478: A(x).$

479: %

480: \begin{prop}

481: Let $x=\{x(t)\}_{t \in [0,T]}$ satisfy \eqref{e:lim_sde_sim}. If $\widehat{\Sigma}=\Sigma$

482: then the estimator $\tilde A(x)$ is asymptotically equivalent to the maximum likelihood

483: estimator $\widehat{A}$:

484: $$

485: \lim_{T \rightarrow \infty} \tilde{A}(x) = \widehat{A}(x),\, a.s.

486: $$

487: \end{prop}

488: %

489: \proof We apply the It\^{o} formula to $V(x(t))$ for $x(t)$ solving \eqref{e:lim_sde_sim}

490: and use formula \eqref{e:a_est} to obtain

491: \begin{eqnarray*}

492: \widehat{A}(x) & = &

493: \frac{V(x(0)) - V(x(T)) + \Sigma \int_0^T \Delta V(x(t))

494: \, dt  }{\int_0^T |\nabla V(x(t))|^2 \, dt}

495: \\ & = & \frac{V(x(0)) - V(x(T))}{\int_0^T |\nabla V(x(t))|^2 \,

496: dt} + \frac{\frac{1}{T}\Sigma\int_0^T \Delta V(x(t)) \, dt  } {\frac{1}{T}

497: \int_0^T |\nabla V(x(t))|^2 \, dt}

498: \\ & = & \frac{\frac{1}{T} (V(x(0)) - V(x(T)))}{\frac{1}{T}\int_0^T |\nabla V(x(t))|^2 \,

499: dt} + \tilde{A}(x).

500: \end{eqnarray*}

501: Under Assumptions \ref{a:1} it follows from \cite{Mao97} that

502: $$

503: \lim_{T \rightarrow \infty} \frac{\frac{1}{T} (V(x(0)) - V(x(T)))}{\frac{1}{T} \int_0^T

504: |\nabla V(x(t))|^2 \, dt} = 0,\, a.s.

505: $$

506: The result follows.

507: \qed

508: %

509: %

510: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

511: %

512: %                          NUMERICAL RESULTS

513: %

514: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

515: %

516: \section{Numerical Results}

517: \label{sec:numerics} In all cases we solve the multiscale SDE \eqref{e:main} using the

518: Euler--Maruyama scheme \cite{KlPl92} for a single realization of the noise, with a

519: time--step $\Delta t$ sufficiently small so that the error due to the discretization is

520: negligible; this requires that the time--step is small compared with $\eps^2,$ the

521: fastest scale in the problem. We also employ a sufficiently long time interval so that

522: the invariant measure is well sampled by the single path. Since the convergence to the invariant measure

523: is uniform in $\eps \to 0$, this is not prohibitive. We then use the data generated from

524: the multiscale process as input to the estimators for the homogenized diffusion

525: \eqref{e:lim_sde}. We present numerical results for three model problems: a one

526: dimensional monomial potential of even degree, a one dimensional bistable potential and a

527: two dimensional quadratic potential. In all three cases we perturb the large--scale part

528: of the potential $V$ by small--scale fast oscillations, usually in the form of a cosine

529: potential $p$.

530:

531: We present two types of numerical results. Note that $\delta$, the time interval between

532: two consecutive observations, is the inverse sampling rate. In the first we use

533: $\delta=\Delta t$ as the time interval between two consecutive observations in the

534: estimators. In the second we subsample the data, using $\delta > \Delta t$ and study how

535: the estimated coefficients behave as a function of the subsampling. We use the data

536: generated from our simulation in the estimators \eqref{e:alpha_estim_1d} and

537: \eqref{e:alpha_estim_1d2} to estimate the drift coefficient and in \eqref{e:sigma_estim_1d}

538: to estimate the diffusion coefficient of \eqref{e:lim_sde_1d}. For the most part we work

539: in one dimension and fit a single drift and diffusion parameter so that \eqref{e:lim_sde}

540: becomes \eqref{e:lim_sde_1d}. When we work in more than one dimension, or

541: estimate more than just a single drift or diffusion parameter,

542: we use natural generalizations of the estimators defined in the previous

543: section.

544:

545: Let us summarize the main conclusions that can be drawn from the numerical experiments;

546: recall that $\Delta t \ll \eps^2.$ First, if we choose $\delta = \Delta t$, that is, if

547: we do not subsample, then the resulting estimators do not generate the correct estimates

548: of the homogenized coefficients. If, on the other hand, we subsample with $\eps^2 \ll

549: \delta \ll \mathcal{O} (1),$ then  the estimators generate the values of the parameters

550: of the homogenized equation. Furthermore, there is an optimal sampling rate: there exists

551: a $\delta^*$ which minimizes the distance between the homogenized value of the parameter

552: and the value generated by the estimator. The optimal sampling rate depends sensitively

553: on $\sigma$. It is also of interest that, in higher dimensions, the optimal sampling rate

554: can be different for different parameters.

555:

556: The above observations appear to hold independently of the detailed

557: form of the large--scale part of the potential $V$ (provided, of course,

558: that it satisfies appropriate convexity conditions).

559: In addition, the performance of the estimators seems to be the same

560: irrespective of the dimension of the problem.

561:

562: Another interesting observation is that the second estimator for the drift coefficient

563: \eqref{e:alpha_estim_1d2} performs at least as well as the maximum

564: likelihood estimator \eqref{e:alpha_estim_1d}, and in some instances

565: outperforms it.

566: %

567: %

568: \subsection{Failure Without Subsampling}

569: \begin{figure}

570: \centerline{

571: \begin{tabular}{c@{\hspace{2pc}}c}

572: \includegraphics[width=2.7in, height = 2.7in]{a_estim_ou_vs_eps_a1_eps004_02_dt510_4T10_4.eps} &

573: \includegraphics[width=2.7in, height = 2.7in]{sigma_estim_ou_vs_eps_a1_eps004_02_dt510_4T10_4.eps} \\

574:  a.~~  $\widehat{A}$  & b.~~ $\widehat{\Sigma}$

575: \end{tabular}}

576: \begin{center}

577: \caption{ Estimation of the drift and diffusion coefficients vs $\eps$ for the potential \eqref{e:ou}.

578: Solid line: estimated coefficient. Dashed line: homogenized coefficient. Dotted line:

579: unhomogenized coefficient.}

580: %

581: \label{fig:vs_eps_no_subsam}

582: %

583: \end{center}

584: \end{figure}

585: %

586: \begin{figure}

587: \centerline{

588: \begin{tabular}{c@{\hspace{2pc}}c}

589: \includegraphics[width=2.7in, height = 2.7in]{a_estim_ou_vs_sig_a1_eps01_dt510_4T10_4.eps}

590:  & \includegraphics[width=2.7in, height = 2.7in]

591:  {sigma_estim_ou_vs_sig_a1_eps01_dt510_4T10_4.eps} \\

592:  a.~~  $\widehat{A}$  & b.~~ $\widehat{\Sigma}$

593: \end{tabular}}

594: \begin{center}

595: \caption{Estimation of the drift and diffusion coefficients vs $\sigma$ for the potential

596: \eqref{e:ou} with $\eps = 0.1$. Solid line: estimated coefficient. Dashed line:

597: homogenized coefficient. Dotted line: unhomogenized coefficient.}

598: \label{fig:vs_sigma_no_subsam}

599: \end{center}

600: \end{figure}

601: In this section we study the estimators

602: $\widehat{A}$ and $\widehat{\Sigma}$

603: when the data is given from the solution of equation \eqref{e:xeps_V}

604: with $\eps \ll 1$ and $\Delta t=\delta$ -- no subsampling is used.

605: We use the potential

606: \begin{equation}\label{e:ou}

607: V(x) = \frac{1}{2} x^2.

608: \end{equation}

609: The small--scale part of the potential is

610: \begin{equation}\label{e:cos}

611: p(y) =  \cos ( y ).

612: \end{equation}

613: In Figure \ref{fig:vs_eps_no_subsam} we plot the estimators $\widehat{A}$ and

614: $\widehat{\Sigma}$ for various values of $\eps$. For comparison we also plot the

615: homogenized coefficients $A$ and $\Sigma$ and the unhomogenized coefficients $\alpha$ and

616: $\sigma$. We observe that the estimators always give us the coefficients $\alpha$ and

617: $\sigma$ of the original SDE \eqref{e:xeps_V}. In particular, the performance of the

618: estimators does not improve as $\eps \rightarrow 0$. In Figure

619: \ref{fig:vs_sigma_no_subsam} we plot the estimators for various values of the diffusion

620: coefficient $\sigma$. We notice that the estimators give the values of the coefficients

621: $\alpha$ and $\sigma$, for all values of $\sigma$. Since the homogenized coefficients

622: decay to $0$ exponentially fast as $\sigma \rightarrow 0$, the results of Figure

623: \ref{fig:vs_sigma_no_subsam} indicate that the estimators give exponentially wrong

624: results when $\sigma \ll 1$.
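For the cosine potential \eqref{e:cos} the homogenized coefficients can be written in closed form, which makes the exponential discrepancy explicit. Taking the period to be $L = 2\pi$ and writing $I_0$ for the modified Bessel function of the first kind of order zero (notation introduced here for illustration), the integrals in \eqref{e:z_1d} are
$$
Z = \widehat{Z} = \int_0^{2\pi} e^{\pm \cos(y)/\sigma} \, dy = 2 \pi I_0(1/\sigma),
$$
so that \eqref{e:coeffs_1d} gives
$$
A = \frac{\alpha}{I_0(1/\sigma)^2}, \qquad \Sigma = \frac{\sigma}{I_0(1/\sigma)^2} .
$$
Using $I_0(z) \sim e^{z}/\sqrt{2 \pi z}$ as $z \rightarrow \infty$, one finds $\Sigma \sim 2 \pi e^{-2/\sigma}$ as $\sigma \rightarrow 0$. For example, $\sigma = 1$ gives $\Sigma \approx 0.62$, whereas $\sigma = 0.5$ gives $\Sigma \approx 0.096$.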

625:

626: These results indicate the need to subsample -- i.e. to choose $\delta$

627: appropriately as a function of $\epsilon$.

628: %

629: %

630: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

631: \subsection{Success With Subsampling}

632: Now, rather than using all the data that were generated from the solution of equation

633: \eqref{e:main} we use only a fraction of them. We choose $\delta$ in the estimators

634: \eqref{e:sigma_estim_1d}, \eqref{e:alpha_estim_1d} and \eqref{e:alpha_estim_1d2} as

635: follows:

636: $$

637: \Delta t_{sam}=\delta = 2^k \Delta t, \quad k=0, \, 1, \, 2, \dots,

638: $$

639: and we study the performance of the estimators as a function of the sampling rate. We

640: investigate this issue for three different model problems.
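
For orientation, the following worked values illustrate the dyadic subsampling rule. The integration step $\Delta t = 10^{-3}$ used here is an assumption made purely for illustration and is not taken from the experiments. With this step,
$$
\delta = 2^k \Delta t: \qquad k = 7 \mapsto 0.128, \quad k = 8 \mapsto 0.256, \quad k = 9 \mapsto 0.512 .
$$
For a fixed time horizon $T$ the number of retained observations is $N = T/\delta = 2^{-k} \, T/\Delta t$, so each increment of $k$ halves the amount of data entering the estimators.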

641: \subsubsection{OU Processes in 1D}

642: \begin{figure}

643: \centerline{

644: \begin{tabular}{c@{\hspace{2pc}}c}

645: \includegraphics[width=2.7in, height = 2.7in]{alpha_estim_ou_sig05_a1_ep01_dt001T2.10+4.eps}

646: & \includegraphics[width=2.7in, height = 2.7in]

647: {sigma_estim_ou_sig05_a1_ep01_dt001T2.10+4.eps} \\

648: a.~~  $\widehat{A}$  & b.~~ $ \widehat{\Sigma}$

649: \end{tabular}}

650: \begin{center}

651: \caption{Estimation of the drift and diffusion coefficients vs $\Delta t_{sam}$

652: for the potential \eqref{e:ou} with $\eps = 0.1$.  Solid line: estimated coefficient.

653: Dashed line: homogenized coefficient. Dotted line: unhomogenized coefficient.}

654: \label{fig:ou_sig05}

655: \end{center}

656: \end{figure}

657: %

658: \begin{figure}[t]

659: \centerline{

660: \begin{tabular}{c@{\hspace{2pc}}c}

661: \includegraphics[width=2.7in, height = 2.7in]{sigma_estim_ou_sig07_a1_eps01_dt001T10+4.eps}

662: & \includegraphics[width=2.7in, height = 2.7in]{sigma_estim_ou_sig1_a1_eps01_dt001T10+4.eps}

663: \\  a.~~  $\sigma=0.7$  & b.~~ $\sigma = 1.0$

664: \end{tabular}}

665: \begin{center}

666: \caption{Estimation of the diffusion coefficient vs $\Delta t_{sam}$ for the potential

667: \eqref{e:ou} with $\eps = 0.1$, for two different values of $\sigma$.  Solid line: estimated

668: coefficient. Dashed line: homogenized coefficient. Dotted line: unhomogenized coefficient.}

669: \label{fig:ou_sig_07_1}

670: \end{center}

671: \end{figure}

672: %

673: \begin{figure}[t]

674: \centerline{

675: \begin{tabular}{c@{\hspace{2pc}}c}

676: \includegraphics[width=2.7in, height = 2.7in]

677: {a_estim_ou_vs_sigma_sampl_eps01_a1_ep004_02_dt0005T210+4.eps} &

678: \includegraphics[width=2.7in, height = 2.7in]

679: {sigma_estim_ou_vs_sigma_sampl_eps01_a1_ep004_02_dt0005T210+4.eps} \\

680:  a.~~  $\widehat{A}$  & b.~~ $\widehat{\Sigma}$

681: \end{tabular}}

682: \begin{center}

683: \caption{Estimation of the drift and diffusion coefficient vs $\sigma$ for the potential

684: \eqref{e:ou} with $\eps = 0.1, \, \alpha = 1.0$, for three different sampling rates.

685: Solid line: $\Delta t_{sam} = 0.128$. Dash--dotted line: $\Delta t_{sam} = 0.256$. Dotted

686: line: $\Delta t_{sam} = 0.512$. Dashed line: homogenized coefficient. }

687: \label{fig:ou_vs_sig_sam}

688: \end{center}

689: \end{figure}

690: We study the problem in one dimension with the large--scale part of the potential given

691: by \eqref{e:ou} and with the fluctuating part being the cosine potential \eqref{e:cos}.

692: The two estimators $\widehat{A}$ and $\tilde{A}$ for the drift coefficient produce almost

693: identical results and we only present results for the maximum likelihood estimator

694: $\widehat{A}$. In Figure \ref{fig:ou_sig05} we present the estimated values of the drift

695: and diffusion coefficients as a function of the inverse sampling rate $\delta = \Delta

696: t_{sam}$ when $\eps = 0.1, \, \alpha = 1.0, \, \sigma = 0.5$. We observe that, provided

697: that we subsample at an appropriate rate, we are able to estimate the parameters of the

698: homogenized equation correctly. Notice also that the estimators for the drift and the

699: diffusion coefficient show very similar dependence on the sampling rate. This is in

700: accordance with our theoretical results; see Theorem \ref{prop:drift_estim_2}.

701:

702: In Figure \ref{fig:ou_sig_07_1} we plot $\widehat{\Sigma}$ as a function of the sampling

703: rate for two different values of $\sigma$. We observe that the estimator of the diffusion

704: coefficient is a decreasing function of the sampling rate, as expected. In addition to

705: this, there is a well defined optimal sampling rate, which depends sensitively on

706: $\sigma$. In particular the optimal $\delta$ is a decreasing function of $\sigma$. This

707: is to be expected, since when $\sigma \gg 1$ the process $x^\eps(t)$ loses its multiscale

708: character and becomes effectively a standard Brownian motion. Consequently, when $\sigma$

709: is sufficiently large, the optimal $\delta$ becomes $\Delta t$, the integration time

710: step. Notice furthermore that the slope of the $\widehat{\Sigma}-\delta$

711: curve depends on $\sigma$.

712:

713: In Figure \ref{fig:ou_vs_sig_sam} we plot the estimators of the drift and diffusion

714: coefficients versus $\sigma$, for three different sampling rates. For comparison we also

715: plot the homogenized coefficients. We observe that all three sampling rates lead to

716: reasonably accurate estimates for $A$ and $\Sigma$, when $\sigma$ is not too small. On

717: the other hand, the estimators become less accurate as $\sigma \rightarrow 0$. This is

718: also to be expected: when $\sigma \ll 1$, the accurate simulation of \eqref{e:main}

719: requires a very small time step; moreover, the equation has to be solved over a very long

720: time interval in order for the invariant measure of the process to be well represented.

721: Hence, our hypothesis that the errors due to discretization and finite time of

722: integration are small, is not valid. In addition, as $\sigma$ tends to $0$, the optimal sampling

723: rate increases, and becomes much larger than the coarsest sampling rate that we use in the

724: simulations.

725:

726: In Figure \ref{fig:ou_vs_eps_sam} we plot the estimators versus $\eps$, for three

727: different values of the sampling rate. As expected, the deviation of the estimated values

728: of the drift and diffusion coefficients from the homogenized values is an increasing

729: function of $\epsilon$. On the other hand, the optimal sampling rate does not appear to

730: depend sensitively on $\eps$: it is always the same sampling rate that minimizes the

731: distance between the estimated coefficient and the homogenized one, for all values of

732: $\eps$.

733: %

734: \begin{figure}[t]

735: \centerline{

736: \begin{tabular}{c@{\hspace{2pc}}c}

737: \includegraphics[width=2.7in, height = 2.7in]

738: {a_estim_ou_vs_eps_sampl_sig05_a1_ep004_02_dt0005T210+4.eps} &

739: \includegraphics[width=2.7in, height = 2.7in]

740: {sigma_estim_ou_vs_eps_sampl_sig05_a1_ep004_02_dt0005T210+4.eps} \\

741:  a.~~  $\widehat{A}$  & b.~~ $\widehat{\Sigma}$

742: \end{tabular}}

743: \begin{center}

744: \caption{Estimation of the drift and diffusion coefficient vs $\eps$ for the potential

745: \eqref{e:ou} with $\alpha = 1.0, \, \sigma = 0.5$, for three different sampling rates.

746: Solid line: $\Delta t_{sam} = 0.128$. Dash--dotted line: $\Delta t_{sam} = 0.256$. Dotted

747: line: $\Delta t_{sam} = 0.512$. Dashed line: homogenized coefficient.}

748: \label{fig:ou_vs_eps_sam}

749: \end{center}

750: \end{figure}

751: %

752: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

753: %

754: \subsubsection{A Bistable Potential}

755: We consider equation \eqref{e:main} in one dimension with a mean potential of

756: the bistable form

757: \begin{equation}\label{e:pot_bistable}

758: V(x; \alpha, \beta) = - \frac{1}{2} \alpha x^2 + \frac{1}{4} \beta x^4.

759: \end{equation}

760: The fluctuating part of the potential is given by  \eqref{e:cos}. The homogenized

761: equation is

762: %

763: \begin{equation}\label{e:hom_bistable}

764: d X(t) = ( A X(t) - B X(t)^3 ) dt + \sqrt{2 \Sigma} d \beta(t),

765: \end{equation}

766: where the homogenized coefficients are given by

767: $$

768: A = \alpha K, \quad B = \beta K, \quad \Sigma = \sigma K,

769: \quad K = \frac{4 \pi^2}{Z \widehat{Z}},

770: $$

771: where $Z$ and $\widehat{Z}$ are given by \eqref{e:z_1d} with $L = 2 \pi$ and $p(y) = \cos(y)$.

772: We will estimate the diffusion coefficient using formula \eqref{e:sigma_estim_1d} with $d

773: = 1$. For the two parameters of the drift we use generalizations of the maximum

774: likelihood estimator $\widehat{A}$.
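
As an illustrative evaluation of $K$, suppose that \eqref{e:z_1d} has the form $Z = \int_0^L e^{-p(y)/\sigma} \, dy$, $\widehat{Z} = \int_0^L e^{p(y)/\sigma} \, dy$; this is an assumption here, chosen to match the integrals $Z_i$, $\widehat{Z}_i$ used for the two--dimensional problem. With $p(y) = \cos(y)$ and $L = 2\pi$ both integrals equal $2 \pi I_0(1/\sigma)$, where $I_0(z) = \frac{1}{2\pi} \int_0^{2\pi} e^{z \cos y} \, dy$ is the modified Bessel function of the first kind of order zero, so that
$$
K = \frac{4 \pi^2}{Z \widehat{Z}} = \frac{1}{I_0(1/\sigma)^2}.
$$
Using $I_0(2) \approx 2.2796$ and $I_0(1/0.7) \approx 1.5791$, this gives $K \approx 0.19$ for $\sigma = 0.5$ and $K \approx 0.40$ for $\sigma = 0.7$. Since $I_0(z) \sim e^{z}/\sqrt{2 \pi z}$ as $z \to \infty$, under the same assumption $K \approx (2\pi/\sigma) \, e^{-2/\sigma}$ when $\sigma \ll 1$, which is the exponential decay of the homogenized coefficients for small $\sigma$.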

775:

776: In Figures \ref{fig:bistable:A_B_05} and \ref{fig:bistable:A_B_07} we present the

777: estimators for the two drift coefficients versus the sampling rate, for two different

778: values of $\sigma$. We observe that the performance of the estimators is qualitatively

779: similar to the OU case. Notice also that the optimal sampling rate is

780: approximately the same for both coefficients.

781:

782: In Figure \ref{fig:bistable:sigma} we plot the estimator for the diffusion coefficient

783: versus the sampling rate, for two different values of $\sigma$. The conclusions reached

784: from the numerical study of $\widehat{\Sigma}$ for the one dimensional OU process carry

785: almost verbatim to this case.

786:

787: \begin{figure}[t]

788: \centerline{

789: \begin{tabular}{c@{\hspace{2pc}}c}

790: \includegraphics[width=2.7in, height = 2.7in]{a_estim_bistable_sig05_a1_b2_ep01_dt001.eps} &

791: \includegraphics[width=2.7in, height = 2.7in]{beta_estim_bistable_sig05_a1_b2_ep01_dt001.eps} \\

792: a.~~  $\widehat{A}$ vs $\Delta t_{sam}$  & b.~~ $ \widehat{B}$ vs $\Delta t_{sam}$

793: \end{tabular}}

794: \begin{center}

795: \caption{ Estimation of the parameters of the bistable potential \eqref{e:pot_bistable}

796: as a function of the sampling rate for $\sigma = 0.5, \,\eps = 0.1$. Solid line:

797: estimated coefficient. Dashed line: homogenized coefficient. Dotted line: unhomogenized

798: coefficient.} \label{fig:bistable:A_B_05}

799: \end{center}

800: \end{figure}

801: %

802: \begin{figure}[t]

803: \centerline{

804: \begin{tabular}{c@{\hspace{2pc}}c}

805: \includegraphics[width=2.7in, height = 2.7in]{a_estim_bistable_sig07_a1_b2_ep01_dt001.eps} &

806: \includegraphics[width=2.7in, height = 2.7in]{beta_estim_bistable_sig07_a1_b2_ep01_dt001.eps} \\

807: a.~~  $\widehat{A}$ vs $\Delta t_{sam}$  & b.~~ $ \widehat{B}$ vs $\Delta t_{sam}$

808: \end{tabular}}

809: \begin{center}

810: \caption{Estimation of the parameters of the bistable potential \eqref{e:pot_bistable} as a

811: function of the sampling rate for $\sigma = 0.7, \,\eps = 0.1$. Solid line: estimated coefficient.

812: Dashed line: homogenized coefficient. Dotted line: unhomogenized coefficient.}

813: %

814: \label{fig:bistable:A_B_07}

815: %

816: \end{center}

817: \end{figure}

818: %

819: \begin{figure}[t]

820: \centerline{

821: \begin{tabular}{c@{\hspace{2pc}}c}

822: \includegraphics[width=2.7in, height = 2.7in]

823: {sigma_estim_bistable_sig05_a1_b2_ep01_dt001.eps} &

824: \includegraphics[width=2.7in, height = 2.7in]

825: {sigma_estim_bistable_sig07_a1_b2_ep01_dt001.eps} \\

826: a.~~  $\sigma = 0.5$  & b.~~ $ \sigma =0.7$

827: \end{tabular}}

828: \begin{center}

829: \caption{Estimation of the diffusion coefficient for the bistable potential

830: \eqref{e:pot_bistable} as a function of the sampling rate for $\alpha = 1.0,

831: \, \beta = 2.0, \,\eps = 0.1$. Solid line: estimated coefficient. Dashed line:

832: homogenized coefficient. Dotted line: unhomogenized coefficient.} \label{fig:bistable:sigma}

833: \end{center}

834: \end{figure}

835: %

836: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

837: %

838: %

839: \subsubsection{A Quadratic Potential in 2D}

840: We now consider \eqref{e:main} in two dimensions with a separable fast potential $p(y)$:

841: %

842: \begin{equation}\label{e:2dim}

843: d x^\eps(t) = - \nabla V(x^\eps(t), B) \, dt - \frac{1}{\epsilon}\nabla p_1 \left(

844: \frac{x^\eps_1(t)}{\eps} \right) \, dt -\frac{1}{\epsilon}\nabla p_2 \left(

845: \frac{x^\eps_2(t)}{\eps} \right)  \, dt + \sqrt{2 \sigma } \, d \beta (t),

846: \end{equation}

847: %

848: where $B$ is the set of the drift parameters that we wish to estimate. The homogenized

849: equation reads

850: \begin{equation}\label{e:homog_2d}

851: d X (t) = - K \nabla V(X (t), B) dt + \sqrt{2 \sigma K} \, d \beta (t),

852: \end{equation}

853: where

854: \begin{equation}\label{e:tensor_2d}

855: K = \left( \begin{array}{cc}

856: \frac{L^2}{Z_1 \widehat{Z}_1} & 0  \\

857: 0 & \frac{L^2}{Z_2 \widehat{Z}_2}

858: \end{array} \right)

859: \end{equation}

860: and

861: \begin{eqnarray*}

862: Z_i = \int_0^L e^{- \frac{p_i(y_i)}{\sigma}} \, dy_i, \quad

863:  \widehat{Z}_i = \int_0^L e^{\frac{p_i(y_i)}{\sigma}} \, dy_i, \; \; i=1,2.

864: \end{eqnarray*}

865: %

866: In the above $L$ denotes the period of $p(y)$.

867:

868: We will consider the case of a general quadratic potential in two dimensions:

869: \begin{equation}\label{e:pot_2d}

870: V(x, B) = \frac{1}{2} x^T B x,

871: \end{equation}

872: with $B$ symmetric positive-definite. For the fluctuations we will use a simple

873: two--dimensional extension of the cosine potential

874: \eqref{e:cos}:

875: $$

876: p_1(y_1) = \cos(y_1), \; p_2(y_2) =  \frac{1}{2}\cos(y_2).

877: $$

878: Our goal is to estimate the diffusion tensor and the drift coefficients.

879: We will estimate the diffusion tensor through the quadratic variation:

880: \begin{equation}

881: \widehat{\Sigma}_{N,\delta}(x(t)) = \frac{1}{2 N \delta } \sum_{n = 0}^{N-1}

882: (x_{n+1} -  x_n ) \otimes (x_{n+1} -  x_n ),

883: \label{e:sigma_estim_dd}

884: \end{equation}

885: where $\otimes$ stands for the tensor product.

886: For simplicity we will assume that the

887: diffusion tensor in our model is diagonal. This is consistent with the

888: homogenized diffusion tensor, see eq.  \eqref{e:tensor_2d}.

889: We will use generalizations of the maximum likelihood estimator

890: $\widehat{A}$ in order to estimate the parameters of the quadratic potential.
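
Written componentwise, the estimator \eqref{e:sigma_estim_dd} can be sketched as follows; the superscripts $i,j$ labelling the coordinates of $x_n$ are notation introduced here for illustration only:
$$
\bigl( \widehat{\Sigma}_{N,\delta} \bigr)_{ij} = \frac{1}{2 N \delta} \sum_{n=0}^{N-1}
\bigl( x^i_{n+1} - x^i_n \bigr) \bigl( x^j_{n+1} - x^j_n \bigr), \quad i,j = 1,2.
$$
As a consistency check, if the increments were those of $\sqrt{2 \sigma} \, \beta(t)$ alone, then $\E \bigl[ (x_{n+1} - x_n) \otimes (x_{n+1} - x_n) \bigr] = 2 \sigma \delta \, I$, so the estimator would have expectation $\sigma I$: the off--diagonal entries vanish on average and the diagonal entries return $\sigma$.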

891:

892: \begin{figure}[t]

893: \centerline{

894: \begin{tabular}{c@{\hspace{2pc}}c}

895: \includegraphics[width=2.7in, height = 2.7in]{sigma11_estim_2d_sig05_a2_b2_c3_ep01_dt001.eps} &

896: \includegraphics[width=2.7in, height = 2.7in]{sigma22_estim_2d_sig05_a2_b2_c3_ep01_dt001.eps} \\

897: a.~~  $\widehat{\Sigma}_{11}$  & b.~~ $ \widehat{\Sigma}_{22} $

898: \end{tabular}}

899: \begin{center}

900: \caption{Estimation of the non--zero elements of the diffusion tensor for the 2d quadratic potential

901: \eqref{e:pot_2d} as a function of the sampling rate for $B_{11}= B_{12} = B_{21} = 2, \, B_{22} = 3,

902: \, \sigma = 0.5, \, \eps = 0.1$. Solid line: estimated coefficient.

903: Dashed line: homogenized coefficient. Dotted line: unhomogenized coefficient.}

904: \label{fig:2d_sigma}

905: \end{center}

906: \end{figure}

907: \begin{figure}

908: \centerline{

909: \begin{tabular}{c@{\hspace{2pc}}cc}

910: \includegraphics[width=2.7in, height = 2.7in]{alpha11_estim_2d_sig05_a2_b2_c3_ep01_dt001.eps} &

911: \includegraphics[width=2.7in, height = 2.7in]{alpha12_estim_2d_sig05_a2_b2_c3_ep01_dt001.eps} \\

912: a.~~  $\widehat{B}_{11}$  & b.~~ $ \widehat{B}_{12} $ \\

913: \includegraphics[width=2.7in, height = 2.7in]{alpha21_estim_2d_sig05_a2_b2_c3_ep01_dt001.eps} &

914: \includegraphics[width=2.7in, height = 2.7in]{alpha22_estim_2d_sig05_a2_b2_c3_ep01_dt001.eps} \\

915: c.~~  $\widehat{B}_{21}$  & d.~~ $ \widehat{B}_{22} $

916: \end{tabular}}

917: \begin{center}

918: \caption{Estimation of the parameters of the 2d quadratic potential

919: \eqref{e:pot_2d} as a function of the sampling rate for $\sigma = 0.5, \, \eps = 0.1$.

920: Solid line: estimated coefficient. Dashed line: homogenized coefficient.

921: Dotted line: unhomogenized coefficient. }

922: \label{fig:2d_alpha}

923: \end{center}

924: \end{figure}

925: In Figure \ref{fig:2d_sigma} we present the estimated values of the two non--zero

926: components of the diffusion tensor versus the sampling rate\footnote{The estimated value of

927: the off--diagonal elements is almost

928: $0$ for all values of the sampling rate, in accordance with the theoretical result

929: \eqref{e:tensor_2d}.}. The performance of the estimator for the diffusion tensor is,

930: qualitatively at least, similar to its performance in the one dimensional problems

931: considered in the previous two subsections. Notice, however, that the optimal sampling

932: rate is quite different for the two non--zero components of the diffusion tensor.

933:

934: In Figure \ref{fig:2d_alpha} we present the estimated values of the four drift

935: coefficients. The results are in accordance with the one dimensional theory developed in

936: this paper, as well as with the numerical experiments shown in one dimension. We remark

937: that the estimators capture successfully the fact that the homogenized matrix $B$ is not

938: symmetric. Notice furthermore that, as for the diffusion matrix, the optimal sampling

939: rate is different for different components of the matrix $B$.
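
To see why the homogenized drift matrix need not be symmetric, note that for the potential \eqref{e:pot_2d} the drift in \eqref{e:homog_2d} is $-K B X$, and the product of the diagonal matrix $K$ with the symmetric matrix $B$ is symmetric only when $K_{11} = K_{22}$ or $B_{12} = 0$. The following worked sketch assumes $L = 2 \pi$ and the parameter values quoted in the caption of Figure \ref{fig:2d_sigma}, namely $B_{11} = B_{12} = B_{21} = 2$, $B_{22} = 3$, $\sigma = 0.5$. With $I_0(z) = \frac{1}{2\pi} \int_0^{2\pi} e^{z \cos y} \, dy$ the modified Bessel function of order zero, the integrals defining $K$ give
$$
K_{11} = \frac{1}{I_0(1/\sigma)^2} = \frac{1}{I_0(2)^2} \approx 0.192, \qquad
K_{22} = \frac{1}{I_0(1/(2\sigma))^2} = \frac{1}{I_0(1)^2} \approx 0.624,
$$
and hence
$$
K B = \left( \begin{array}{cc}
2 K_{11} & 2 K_{11} \\
2 K_{22} & 3 K_{22}
\end{array} \right) \approx
\left( \begin{array}{cc}
0.385 & 0.385 \\
1.248 & 1.872
\end{array} \right),
$$
whose off--diagonal entries differ.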

940:

941: Thus, in this simple two dimensional multiscale model, the optimal sampling

942: rate is different in different directions. This suggests

943: that extreme care has to be taken when estimating parameters for multidimensional,

944: multiscale stochastic processes.

945: %

946: %

947: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

948: %

949: %

950: \subsection{The Second Estimator for the Drift Coefficient}

951: In this section we compare the performance of the two estimators

952: for the drift coefficient, namely $\widehat{A}$ and $\tilde{A}$ given by

953: equations \eqref{e:alpha_estim_1d} and \eqref{e:alpha_estim_1d2} respectively. We estimate

954: the drift parameter of

955: \eqref{e:main} in one dimension for a quartic and a sixth--degree large--scale potential

956: $V(x)$:

957: \begin{equation}\label{e:pot_quartic}

958: V(x) = \frac{1}{4} \alpha x^4

959: \end{equation}

960: and

961: \begin{equation}\label{e:pot_six}

962: V(x) = \frac{1}{6} \alpha x^6.

963: \end{equation}

964: In both cases the small scale fluctuations are represented by the cosine potential

965: \eqref{e:cos}. In Figure \ref{fig:alpha2_four} we present the estimated values of the

966: drift coefficient as a function of the sampling rate for two different $\sigma$ for the

967: quartic potential \eqref{e:pot_quartic}. We also plot the effective and the unhomogenized

968: values of the drift coefficient. Similar results for the sixth--degree potential

969: \eqref{e:pot_six} are presented in Figure \ref{fig:alpha2_six}. In both cases we observe

970: that the alternative estimator $\tilde{A}$ performs better than $\widehat{A}$ in this

971: situation where the data is subsampled.

972: \begin{figure}[t]

973: \centerline{

974: \begin{tabular}{c@{\hspace{2pc}}c}

975: \includegraphics[width=2.7in, height = 2.7in]{alpha_estim_quartic_sig05_a1_eps01_dt001.eps} &

976: \includegraphics[width=2.7in, height = 2.7in]{alpha_estim_quartic_sig07_a1_eps01_dt001.eps} \\

977: a.~~  $\sigma = 0.5$   & b.~~ $ \sigma = 0.7$

978: \end{tabular}}

979: \begin{center}

980: \caption{ Estimation of the drift coefficients for the quartic potential

981: \eqref{e:pot_quartic} as a function of the sampling rate for $ \eps = 0.1$. Solid line:

982: $\widehat{A}$. Dash--dotted line: $\tilde{A}$. Dashed line: homogenized coefficient. Dotted

983: line: unhomogenized coefficient.}

984: %

985: \label{fig:alpha2_four}

986: %

987: \end{center}

988: \end{figure}

989: %

990: \begin{figure}[t]

991: \centerline{

992: \begin{tabular}{c@{\hspace{2pc}}c}

993: \includegraphics[width=2.7in, height = 2.7in]{alpha_estim_sixth_sig05_a1_eps01_dt001.eps} &

994: \includegraphics[width=2.7in, height = 2.7in]{alpha_estim_sixth_sig07_a1_eps01_dt001.eps} \\

995: a.~~  $\sigma = 0.5$   & b.~~ $ \sigma = 0.7$

996: \end{tabular}}

997: \begin{center}

998: \caption{ Estimation of the drift coefficients for the sixth--degree potential

999: \eqref{e:pot_six} as a function of the sampling rate for $\eps = 0.1$. Solid line:

1000: $\widehat{A}$. Dash-dotted line: $\tilde{A}$. Dashed line: homogenized coefficient.

1001: Dotted line: unhomogenized coefficient.}

1002: %

1003: \label{fig:alpha2_six}

1004: %

1005: \end{center}

1006: \end{figure}

1007: %

1008: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1009: %

1010: %                          STATEMENT OF RESULTS

1011: %

1012: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1013: %

1014: \section{Statement of Main Results}

1015: \label{sec:results}

1016:

1017: In this section we present theorems which substantiate the numerical

1018: observations in the preceding section.

1019: The first result shows that, without subsampling, the parameter estimators for the

1020: homogenized model will be asymptotically biased: they recover the parameters from the

1021: unhomogenized equations.

1022: \begin{theorem}

1023: \label{thm:est_ddim} Let $x^\eps(t)$ be the solution of \eqref{e:xeps_V} with $x^\eps

1024: (0)$ distributed according to the invariant measure of the process. Then the estimator

1025: \eqref{e:a_est} satisfies

1026: \begin{equation}\label{e:a_est_lim}

1027: \lim_{\eps \rightarrow 0}\lim_{T \rightarrow \infty} \widehat{A}(x^{\eps}) = \alpha

1028: \quad \mbox{a.s.}

1029: \end{equation}

1030: Fix $T = N \delta$ in \eqref{e:sigma_estim_1d}. Then for every $\eps > 0$ we have

1031: \begin{equation}\label{e:sigma_est_lim}

1032: \lim_{N \rightarrow \infty} \widehat{\Sigma}_{N, \delta}(x^{\eps}) = \sigma \quad

1033: \mbox{a.s.}

1034: \end{equation}

1035: \end{theorem}

1036: %

1037: Now consider the one dimensional problem

1038: %

1039: \begin{equation}

1040: d x^\eps (t) = - \alpha V'(x^\eps(t)) dt - \frac{1}{\eps} p' \left(

1041: \frac{x^\eps(t)}{\eps} \right) dt + \sqrt{2 \sigma} d \beta (t). \label{e:xeps_est}

1042: \end{equation}

1043: %

1044:

1045: The next two results show that, with appropriate subsampling, the estimators

1046: recover the correct drift and diffusion coefficients for the homogenized

1047: model \eqref{e:lim_sde_1d} when taking data from the unhomogenized

1048: equation \eqref{e:xeps_est}.

1049: %

1050: \begin{theorem}\label{thm:par_est_alpha}

1051: Let $x^\eps(t)$ be the solution of \eqref{e:xeps_est} with $x^\eps (0)$ distributed

1052: according to the invariant measure of the process. Further, let

1053: $\delta = \eps^\alpha, \, \alpha \in (0 , 1 )$ and $N = \left[ \eps^{-\gamma} \right], \,

1054: \gamma > \alpha,$ where $[\cdot]$ denotes the integer part of a number. Then

1055: \begin{equation}

1056: \lim_{\eps \rightarrow 0} \widehat{A}_{N, \delta} (x^\eps) = A \quad \mbox{in law,}

1057: \label{e:alpha_lim}

1058: \end{equation}

1059: where $A$ is given by \eqref{e:coeffs_1d}.

1060: \end{theorem}
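
To illustrate the scaling in Theorem \ref{thm:par_est_alpha}, note that the observation window is $T = N \delta \approx \eps^{\alpha - \gamma}$, which diverges as $\eps \to 0$ precisely because $\gamma > \alpha$, while $\eps/\delta = \eps^{1 - \alpha} \to 0$ because $\alpha < 1$. For instance, with the illustrative choices $\eps = 10^{-2}$, $\alpha = 1/2$ and $\gamma = 3/2$ (not taken from the numerical experiments), one obtains
$$
\delta = \eps^{1/2} = 0.1, \qquad N = \left[ \eps^{-3/2} \right] = 1000, \qquad T = N \delta = 100, \qquad \frac{\eps}{\delta} = 0.1 .
$$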

1061: %

1062: \begin{theorem}

1063: \label{thm:par_est_sigma} Let $x^\eps(t)$ be the solution of \eqref{e:xeps_est} with

1064: $x^\eps (0)$ distributed according to the invariant measure of the process. Fix $ T = N

1065: \delta$ with $\delta = \eps^\alpha$ and $\alpha \in (0 , 1)$. Then

1066: %

1067: \begin{equation}

1068: \lim_{\eps \rightarrow 0} \widehat{\Sigma}_{N, \delta} (x^\eps) = \Sigma \quad

1069: \mbox{in law,}

1070: %

1071: \label{e:sigma_lim}

1072: %

1073: \end{equation}

1074: where $\Sigma$ is given by \eqref{e:coeffs_1d}.

1075: \end{theorem}

1076: %

1077: \begin{remark}

1078: The two previous results require $\epsilon/\delta \to 0$ as $\epsilon \to 0.$ In view of

1079: the fact that the fast time--scale is ${\cal O}(\epsilon^2)$ (see equation

1080: \eqref{e:yeps_eqn}) we might expect that this could be relaxed to

1081: $\epsilon^2/\delta \to 0$

1082: as $\epsilon \to 0.$ However we have not been able to prove this.

1083: See Remark \ref{r:label} for further discussion of this point.

1084: \end{remark}

1085: The final result concerns the second drift estimator, again with

1086: data from the unhomogenized equation \eqref{e:xeps_est} used as input to the

1087: parameter estimator for the homogenized equation \eqref{e:lim_sde_1d}.

1088: It requires an estimate of the

1089: diffusion coefficient, $\widehat{\Sigma}.$ If $\widehat{\Sigma} = \sigma$, then we

1090: estimate the drift coefficient incorrectly with $\tilde A(x^{\eps})$; on the other

1091: hand, if $\widehat{\Sigma} = \Sigma$, then the estimator $\tilde A(x^{\eps})$ gives

1092: the drift of the homogenized equation. (To see the last result recall that

1093: $A/\Sigma=\alpha/\sigma$, see \eqref{e:coeffs_1d}). Consequently, for

1094: multiscale gradient systems, it is sufficient only to subsample in a

1095: fashion which leads to the

1096: correct diffusion coefficient. This offers a clear computational advantage.

1097: %

1098: \begin{theorem}\label{prop:drift_estim_2}

1099: Let $x^\eps(t)$ be the solution of \eqref{e:xeps_est} with $x^\eps (0)$ distributed

1100: according to the invariant measure of the process. Assume that the diffusion coefficient

1101: has been estimated to be $\widehat{\Sigma}$. Then

1102: $$\lim_{\eps \to 0}\lim_{T \to \infty}\tilde A(x^\eps)

1103: =\frac{\widehat\Sigma}{\sigma}\alpha \quad \mbox{in law.}$$

1104: \end{theorem}
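
The two cases discussed before Theorem \ref{prop:drift_estim_2} follow by direct substitution into its limit, using only the identity $A/\Sigma = \alpha/\sigma$:
$$
\widehat{\Sigma} = \sigma \; \Longrightarrow \; \frac{\widehat{\Sigma}}{\sigma} \alpha = \alpha,
\qquad
\widehat{\Sigma} = \Sigma \; \Longrightarrow \; \frac{\widehat{\Sigma}}{\sigma} \alpha
= \Sigma \, \frac{\alpha}{\sigma} = \Sigma \, \frac{A}{\Sigma} = A .
$$
More generally, a relative error in $\widehat{\Sigma}$ as an estimate of $\Sigma$ is passed on unchanged to the limit of $\tilde A$ as an estimate of $A$.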

1105: %

1106: %

1107: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1108: %

1109: %                                 PRELIMINARY RESULTS

1110: %

1111: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1112: %

1113: \section{Preliminary Results}

1114: \label{sec:prelim}

1115: In this section we collect various results that will be used in the proof of our main

1116: theorems.

1117: We start by investigating some of the properties of the invariant measures of the

1118: unhomogenized and of the homogenized equation.

1119: We then introduce some tools useful in the study of homogenization for

1120: SDEs.

1121: %

1122: \begin{prop}

1123: %

1124: \label{prop:gibbs}

1125: The invariant measure of the homogenized equation \eqref{e:lim_sde}

1126: is the Gibbs measure

1127: \begin{equation}

1128: \mu(dx) = \rho(x) dx = \frac{1}{Z} e^{-\alpha V(x)/\sigma} \, dx,

1129: \quad Z = \int_{\R^d} e^{-\alpha V(x)/\sigma} \, dx.

1130: \label{e:gibbs}

1131: \end{equation}

1132: The Markov process $x(t)$ given by \eqref{e:lim_sde} is geometrically ergodic: there are

1133: $C,\, \lambda>0$ such that, for every measurable $f(x)$ satisfying

1134: $$

1135: |f(x)| \leq 1 + |x|^p,

1136: $$

1137: for some integer $p > 0$, we have, for $\mu-$a.e. $x(0)$,

1138: $$

1139: \left| \E f(x(t)) - \int_{\R^d} f(x) \rho(x) \, dx  \right| \leq

1140: C\bigl(1+|x(0)|^p \bigr)e^{- \lambda t},

1141: $$

1142: where $\E$ denotes expectation with respect to Wiener measure.

1143: \end{prop}

1144:

1145: \proof Assumptions \ref{a:1}, together with the formulae for the effective drift and the

1146: effective diffusion coefficient, equation \eqref{e:coeffs}, imply

1147: that the solution $x(t)$ of the homogenized equation \eqref{e:lim_sde} has a unique

1148: invariant measure with smooth density.  The Gibbs measure \eqref{e:gibbs} satisfies

1149: $$\alpha \nabla V \rho+\sigma \nabla \rho=0$$

1150: and hence

1151: $$K\Bigl(\alpha \nabla V \rho+\sigma \nabla \rho\Bigr)=0.$$

1152: Because $K$ is constant we deduce that

1153: $$\alpha K\nabla V \rho+\nabla \cdot \bigl(\sigma K \rho\bigr)=0.$$

1154: Thus

1155: $$\nabla \cdot \Bigl(\alpha K\nabla V \rho+\nabla \cdot \bigl(\sigma K \rho\bigr)\Bigr)=0.$$

1156: This is the stationary Fokker-Planck equation for \eqref{e:lim_sde}

1157: showing that the Gibbs measure $\rho$ is indeed an invariant measure.

1158: For the geometric ergodicity we use \cite[Thm 5.3]{MattStuHigh02}.

1159: \qed

1160: %

1161: \begin{prop}\label{lem:xeps_meas_ddim}

1162: The invariant measure of the unhomogenized equation \eqref{e:xeps_V} is the Gibbs measure

1163: %

1164: \begin{equation}

1165: \mu^\eps(dx) = \rho^\eps(x) \, dx = \frac{1}{Z^\eps} e^{-\frac{\alpha}{ \sigma}V(x) -

1166: \frac{1}{\sigma} p \left(\frac{x}{\eps} \right)} \, dx, \quad Z^\eps := \int_{\R^d}

1167: e^{-\frac{\alpha}{ \sigma} V(x) - \frac{1}{\sigma} p \left(\frac{x}{\eps} \right)} \, dx.

1168: \label{e:xeps_inv_meas_ddim}

1169: \end{equation}

1170: %

1171: For every $\eps > 0$ the Markov process \eqref{e:xeps_V} is geometrically

1172: ergodic: there are $C,\lambda>0$ such that, for every measurable $f(x)$

1173: satisfying

1174: $$

1175: |f(x)| \leq 1 + |x|^p,

1176: $$

1177: for some integer $p>0$ we have, for $\mu^{\eps}-$a.e. $x^{\eps}(0)$,

1178: $$

1179: \left| \E f(x^\eps(t)) - \int_{\R^d} f(x) \rho^\eps(x) \, dx  \right|

1180:  \leq C\bigl(1+|x^{\eps}(0)|^p\bigr)e^{- \lambda t},$$

1181: where $\E$ denotes expectation with respect to Wiener measure.

1182:

1183: Furthermore, the measure $\mu^\eps$ converges weakly to the invariant measure of the

1184: homogenized dynamics $\mu$ given by \eqref{e:gibbs}.

1185: \end{prop}

1186: %

1187: \proof Assumptions \ref{a:1} imply that $x^\eps(t)$ is an ergodic Markov process. Direct

1188: calculation with the Fokker--Planck equation shows that the unique invariant measure of

1189: the process is the Gibbs measure

1190: %

1191: \begin{eqnarray*}

1192: \rho^\eps(x) \, dx & = & \frac{1}{Z^\eps} e^{- \frac{1}{\sigma}

1193:  V \left( x, \frac{x}{\eps}, \alpha \right)} \, dx

1194: \\ & = & \frac{1}{Z^\eps} e^{- \frac{\alpha}{\sigma}

1195:  V(x)- \frac{1}{\sigma} p\left(  \frac{x}{\eps} \right)} \, dx,

1196: \end{eqnarray*}

1197: with $Z^\eps$ given by \eqref{e:xeps_inv_meas_ddim}.

1198: For the geometric ergodicity we use \cite[Thm 5.3]{MattStuHigh02}.

1199:

1200: Now let

1201: %

1202: $$

1203: u(x,y):= e^{- \frac{\alpha}{ \sigma} V(x) - \frac{1}{\sigma}p(y)}.

1204: %

1205: $$

1206: %

1207: Since $u(x,y) \in L^1(\R^d ; C_{per}(\T^d))$, by \cite[Lem. 9.1]{cioran} we have that

1208: %

1209: $$

1210: u \left(\cdot, \frac{\cdot}{\eps}  \right) \rightharpoonup \int_{\T^d}

1211: u(\cdot, y) \, dy, \quad \mbox{weakly in } L^1(\R^d).

1212: $$

1213: %

1214: In particular, since $1 \in L^{\infty}(\R^d)$,

1215: %

1216: $$

1217: \lim_{\eps \rightarrow 0} Z^\eps =  \int_{\R^d} \int_{\T^d} e^{-

1218: \frac{\alpha}{\sigma}V(x) - \frac{1}{\sigma} p(y)} \, dy \, dx.

1219: $$

1220: We combine the above two results to conclude that

1221: $$

1222: \rho^\eps(x) \rightharpoonup \frac{1}{Z} e^{-\frac{\alpha}{ \sigma} V(x)}, \quad

1223: \mbox{weakly in } L^1(\R^d),

1224: $$

1225: %

1226: where $Z$ is given by \eqref{e:gibbs}. The weak convergence of the densities in

1227: $L^1(\R^d)$ implies the weak convergence of the corresponding probability measures. \qed

1228:

1229: \begin{remark}

1230: The assumption of stationarity of the process $x^\eps(t)$ is not necessary for the proof of

1231: the above theorems and is only made for simplicity. Indeed, in the next section we prove that

1232: $x^\eps (t)$ is geometrically ergodic and consequently it converges to its invariant

1233: distribution exponentially fast for arbitrary initial conditions.

1234: Furthermore, the fact that the invariant measure of the process

1235: $x^\eps(t)$ converges weakly, as $\eps \rightarrow 0$,

1236: to the invariant measure of the homogenized process is important for

1237: us as many of our results will be deduced by taking expectations with respect to the

1238: invariant measure $\mu^{\eps}(dx)$ of the multiscale dynamics \eqref{e:xeps_V}. The weak

1239: convergence alluded to demonstrates that the measure $\mu^{\eps}$ behaves uniformly in

1240: $\eps \to 0.$

1241: \end{remark}

1242:

1243: An immediate corollary of the above proposition is that $x^\eps(t)$ has bounded moments

1244: of all orders. We will use the notation $\bbE^{\mu^{\epsilon}}$ to denote expectation

1245: with respect to the stationary measure of \eqref{a:1} on path space, when initial data is

1246: distributed according to the Gibbs measure \eqref{e:xeps_inv_meas_ddim}.

1247: %

1248: \begin{corollary}\label{cor:moments}

1249: Let $x^\eps(t)$ be the solution of \eqref{e:main} with the potential given by

1250: \eqref{e:potential} and assume that conditions \eqref{a:1} are satisfied. Assume

1251: furthermore that $x^\eps(0)$ is distributed according to $\mu^\eps$. Then, for all $p \ge

1252: 1,$ there is a constant $C=C(p,T)$ uniform in $\epsilon \to 0$, such that

1253: %

1254: $$\bbE^{\mu^{\epsilon}}|x^\eps(t)|^p \le C \quad \forall \, t \in [0,T].$$

1255: %

1256: \end{corollary}

1257:

1258: It is convenient for the subsequent analysis to introduce the auxiliary

1259: variable

1260: $$

1261: y^\eps (t) = \frac{x^\eps(t)}{\eps}.

1262: $$

1263: We can then write equation \eqref{e:xeps_V} in the form

1264: \begin{subequations}

1265: \begin{equation}

1266: d x^\eps(t) = - \alpha \nabla V(x^\eps(t)) \, dt -  \frac{1}{\eps} \nabla p \left( y^\eps

1267: (t) \right) \, dt + \sqrt{2 \sigma} \, d \beta (t),

1268: %

1269: \label{e:xeps_eqn}

1270: %

1271: \end{equation}

1272: \begin{equation}

1273: d y^{\eps}(t) = - \frac{1}{\eps} \alpha \nabla V(x^\eps(t)) \, dt -  \frac{1}{\eps^2}

1274: \nabla p \left( y^\eps (t)  \right) \, dt + \sqrt{\frac{2 \sigma}{\eps^2}} \, d \beta

1275: (t).

1276: %

1277: \label{e:yeps_eqn}

1278: %

1279: \end{equation}

1280: \label{e:eqns_motion}

1281: \end{subequations}

1282: %

1283: Notice that both processes $x^\eps (t)$ and $y^\eps (t)$ are driven by the same Brownian

1284: motion. Written in this fashion it is clear that we are in a situation

1285: where homogenization applies. The homogenized equation is found by

1286: eliminating $y^{\eps}(t)$ from the scale separated system

1287: for $\left\{ x^{\eps}(t), y^{\eps}(t) \right\}$. Note that

1288: ${\cal L}_0$ defined in \eqref{e:cell} is the generator of the process

1289: \begin{equation*}

1290: d y (t) = - \nabla p \left( y (t)  \right) \, dt + \sqrt{2 \sigma} \, d \beta (t),

1291: \end{equation*}

1292: on the unit torus, which governs the dynamics of $y^{\eps}(t)$ to leading order in

1293: $\epsilon$. The generator of the joint process $\{x^\eps(t), \, y^\eps(t) \}$ reads

1294: $$\LL^\eps=\frac{1}{\eps^2} \LL_0 + \frac{1}{\eps} \LL_1 + \LL_2,$$

1295: where

1296: \begin{align*}

1297: \LL_0&= - \nabla_y p(y)\cdot \nabla_y + \sigma \Delta_y,\\

1298: \LL_1&=  - \nabla_y p(y) \cdot \nabla_x - \alpha \nabla_x V(x)

1299: \cdot \nabla_y + 2 \sigma \nabla_x \cdot \nabla_y,\\

1300: \LL_2&=  - \alpha \nabla_x V(x) \cdot \nabla_x + \sigma \Delta_x.

1301: \end{align*}
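
The decomposition above can be checked directly; the following is only a sketch of the computation, based on \eqref{e:eqns_motion}. For a smooth test function $f(x,y)$ (a symbol introduced here purely for illustration), the It\^{o} formula for the pair $\{x^\eps(t), \, y^\eps(t)\}$, with both components driven by the same $\beta(t)$, gives
\begin{align*}
\LL^\eps f &= \Bigl( - \alpha \nabla_x V(x) - \frac{1}{\eps} \nabla_y p(y) \Bigr) \cdot \nabla_x f
+ \Bigl( - \frac{\alpha}{\eps} \nabla_x V(x) - \frac{1}{\eps^2} \nabla_y p(y) \Bigr) \cdot \nabla_y f \\
&\quad + \sigma \Delta_x f + \frac{2 \sigma}{\eps} \nabla_x \cdot \nabla_y f + \frac{\sigma}{\eps^2} \Delta_y f.
\end{align*}
Collecting the terms of order $\eps^{-2}$, $\eps^{-1}$ and $1$ yields $\LL_0$, $\LL_1$ and $\LL_2$, respectively. The mixed term $2 \sigma \nabla_x \cdot \nabla_y$ in $\LL_1$ is present precisely because $x^\eps(t)$ and $y^\eps(t)$ share the same noise.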

1302:

1303: The following result can be found in, e.g. \cite[Ch. 3]{lions}.

1304: \begin{lemma}

1305: \label{l:Poisson} Assume that $p(y) \in C^{\infty}_{per}(\T^d,\R)$ and that $H(y) \in

1306: C^{\infty}_{per}(\T^d,\R^d).$ Let $\mu(dy)$ be the Gibbs measure \eqref{e:gibbs_torus}

1307: and assume that $H(y)$ is centered with respect to $\mu(dy)$:

1308: \begin{equation}\label{e:centering}

1309: \int_{\T^d} H(y) \, \mu(dy) = 0.

1310: \end{equation}

1311:  Then the Poisson equation

1312: \begin{equation}\label{e:poisson}

1313: - \LL_0 \chi = H(y),

1314: \end{equation}

1315: has a unique mean-zero solution in $L^2_{per}(\T^d, \mu(dy) ; \R^d)$.

1316: This solution, together with all its derivatives, is bounded.

1317: \end{lemma}

1318:

1319: We will need an estimate on integrals whose integrand is centered with respect to the

1320: invariant measure  $\mu(dy)$.

1321: %

1322: \begin{lemma}\label{lem:ito}

1323: Let $H(y) \in C^\infty_{per} \left(\T^d ; \R^d \right)$ satisfy condition

1324: \eqref{e:centering}.  Assume that  $x^\eps (0)$ is distributed according

1325: to \eqref{e:xeps_inv_meas_ddim}. Then the

1326: following estimate holds for any $p>1$ and $T >0$:

1327: \begin{equation*}

1328:  \E^{\mu^\eps} \left| \int_{0}^{T} H(y^\eps(s)) \, ds \right|^p \leq

1329: C \left(\eps^{2p} + \eps^pT^p+\eps^p T^{\frac{p}{2}}  \right).

1330: \end{equation*}

1331: \end{lemma}

1332: %

1333: \proof Consider the Poisson equation \eqref{e:poisson} with periodic boundary conditions.

1334: Since $H(y)$ satisfies \eqref{e:centering}, Lemma \ref{l:Poisson} applies and we have

1335: that $\chi(y)$ is smooth and bounded, together with all its derivatives. We now apply the

1336: It\^{o} formula to $\chi(y^\eps (t))$, where $y^\eps (t)$ is the solution of

1337: \eqref{e:yeps_eqn}, and use \eqref{e:poisson} to obtain

1338: %

1339: \begin{align*}

1340: \int_{0}^{T} H(y^\eps(s)) \, ds =& - \eps^2 \left( \chi(y^\eps(T))  -

1341: \chi(y^\eps(0))   \right)\\

1342: & + \eps \sqrt{2 \sigma} \int_{0}^{T} \langle \nabla_y \chi(y^\eps(s)), \, d \beta(s)

1343: \rangle -\alpha \eps \int_0^T \langle \nabla V(x^\eps(s)),

1344: \nabla \chi(y^\eps(s)) \rangle ds.

1345: \end{align*}

1346: %

1347: Now, using the boundedness of $\chi$, we have, for

1348: $$I(T) :=\E^{\mu^\eps} \left| \int_{0}^{T} H(y^\eps(s)) \, ds \right|^p,$$

1349: \begin{eqnarray*}

1350: I(T) & \leq & C \left( \eps^{2 p} + \eps^{p} \E^{\mu^\eps}\left|\int_0^T |\nabla

1351: V(x^{\eps}(s))| ds\right|^p+ \eps^{p} \E^{\mu^\eps} \left| \int_{0}^{T} \langle \nabla_y

1352: \chi(y^\eps(s)) , \, d \beta (s) \rangle \right|^p \right)

1353: %

1354: \\ & \leq &

1355: %

1356:  C \left( \eps^{2 p} + \eps^{p} T^{p-1} \int_0^T \E^{\mu^\eps} |x^{\eps}(s)|^p \, ds+

1357: \eps^{p} T^{\frac{p}{2} -1} \int_{0}^{T} \E^{\mu^\eps} \left| \nabla_y \chi(y^\eps(s))

1358: \right|^p \, ds \right)

1359: %

1360: \\ & \leq &

1361: %

1362: C \left( \eps^{2 p} + \eps^p T^p+\eps^{p} T^{\frac{p}{2}} \right),

1363: \end{eqnarray*}

1364: %

1365: from which the desired estimate follows. In deriving the above we used

1366: the estimate \cite[Eqn. 3.25, p. 163]{KSh91} on moments of

1367: stochastic integrals. \qed
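
The It\^{o} computation behind the identity used in the above proof can be sketched as follows (applied componentwise to $\chi$). By \eqref{e:yeps_eqn},
$$
d \chi(y^\eps(t)) = \frac{1}{\eps^2} \LL_0 \chi(y^\eps(t)) \, dt - \frac{\alpha}{\eps} \langle \nabla V(x^\eps(t)), \nabla_y \chi(y^\eps(t)) \rangle \, dt + \frac{\sqrt{2 \sigma}}{\eps} \langle \nabla_y \chi(y^\eps(t)), \, d \beta(t) \rangle .
$$
Substituting $\LL_0 \chi = - H$ from \eqref{e:poisson}, multiplying by $\eps^2$ and integrating over $[0,T]$ gives
$$
\int_0^T H(y^\eps(s)) \, ds = - \eps^2 \bigl( \chi(y^\eps(T)) - \chi(y^\eps(0)) \bigr) - \alpha \eps \int_0^T \langle \nabla V(x^\eps(s)), \nabla_y \chi(y^\eps(s)) \rangle \, ds + \eps \sqrt{2 \sigma} \int_0^T \langle \nabla_y \chi(y^\eps(s)), \, d \beta(s) \rangle ,
$$
so that the three terms carry the powers $\eps^2$, $\eps$ and $\eps$ which appear in the estimate of Lemma \ref{lem:ito}.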

1368:

1369: %The above result will be of particular use to us in the case $T=\delta \ll 1.$

1370: %The Markov property then implies that

1371: %\begin{equation}\label{e:ito_est}

1372: %\left( \E^{\mu^\eps} \left| \int_{n\delta}^{(n+1)\delta} H(y^\eps(s)) \, ds

1373: %\right|^p \right)^{1/p} \leq   C \left(\eps^{2p} + \eps^p\delta^{\frac{p}{2}}

1374: %\right)^{1/p}. %\end{equation}

1375: %

1376: For the rest of this section we will restrict ourselves to the one dimensional case.

1377: If we apply the It\^{o} formula to $\phi(y^\eps(s))$, where $\phi$ is the solution of the Poisson equation

1378: \eqref{e:cell}, then we obtain

1379: %

1380: \begin{eqnarray}

1381: x^\eps_{n+1} - x_n^\eps & = & - \alpha \int_{n \delta}^{(n+1)\delta} V'(x^\eps(s)) (1 +

1382: \partial_y \phi(y^\eps(s))) \, ds

1383: \nonumber \\ &&+ \sqrt{2 \sigma} \int_{n \delta}^{(n+1)\delta} (1 +

1384: \partial_y \phi(y^\eps(s))) \, d \beta (s)

1385: \nonumber\\ && - \eps \left( \phi(y^\eps((n+1)\delta)) - \phi(y^\eps (n\delta)) \right).

1386: \label{e:integr_parts}

1387: \end{eqnarray}
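
A sketch of how \eqref{e:integr_parts} arises, under the assumption (not restated in this section) that the one dimensional cell problem \eqref{e:cell} reads $- \LL_0 \phi = - \partial_y p$. The It\^{o} formula applied to $\phi(y^\eps(t))$, together with \eqref{e:yeps_eqn}, gives
$$
\eps \, d \phi(y^\eps(t)) = \frac{1}{\eps} \partial_y p(y^\eps(t)) \, dt - \alpha V'(x^\eps(t)) \partial_y \phi(y^\eps(t)) \, dt + \sqrt{2 \sigma} \, \partial_y \phi(y^\eps(t)) \, d \beta(t).
$$
Solving for the singular term $\frac{1}{\eps} \partial_y p(y^\eps(t)) \, dt$ and substituting into \eqref{e:xeps_eqn} yields
$$
d x^\eps(t) = - \alpha V'(x^\eps(t)) \bigl( 1 + \partial_y \phi(y^\eps(t)) \bigr) \, dt + \sqrt{2 \sigma} \bigl( 1 + \partial_y \phi(y^\eps(t)) \bigr) \, d \beta(t) - \eps \, d \phi(y^\eps(t)),
$$
and integration over $[n \delta, (n+1) \delta]$ gives the three terms of \eqref{e:integr_parts}. Under a different sign convention for the cell problem the signs in front of $\partial_y \phi$ would change accordingly.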

1388: The proof of Theorems \ref{thm:par_est_alpha} and \ref{thm:par_est_sigma} is based on

1389: careful asymptotic analysis of the behavior of $x^\eps_{n+1} - x^\eps_n$ given by this

1390: formula when both $\eps$ and $\delta$ are small. Specifically we will use the following

1391: two propositions. They show how the effective homogenized behaviour is manifest in

1392: the time--$\delta$ Markov chain induced by sampling the path $x^{\eps}(t)$ from

1393: \eqref{e:xeps_V}.

1394: %

1395: \begin{prop}\label{prop:xndelta1}

1396: For $\eps, \, \delta >0 $ sufficiently small and $n \in \mathbb{N}$ there exists an i.i.d.

1397: sequence of random variables $\xi_n \sim \mathcal{N}(0,1)$ such that

1398: \begin{equation}

1399: \sqrt{2 \sigma} \int_{n \delta}^{(n+1) \delta}(1+\partial_y \phi(y^\eps(s))) \, d \beta(s)

1400: =\sqrt{2 \Sigma \, \delta} \, \xi_n+ R_1(\eps, \delta)

1401: \label{e:xn_loc1}

1402: \end{equation}

1403: in law. The remainder $R_1(\eps, \delta)$ satisfies, for every $\beta \in

1404: (0,\frac12)$ and $p>0$, the estimate

1405: \begin{equation}

1406: \left( \E^{\mu^\eps} \big| R_1(\eps,\delta) \big|^p \right)^{1/p} \leq  C \, \left(

1407: \eps^{2 \beta} + \eps^{\beta} \right),

1408: \label{e:R_est}

1409: \end{equation}

1410: where $C$ is independent of $\eps$ and $\delta$.

1411: \end{prop}

1412: \begin{remark}

1413: \label{r:label}

1414: Estimate \eqref{e:R_est} is almost certainly not optimal. Indeed, informal

1415: calculations lead us to expect the estimate

1416: $$

1417: \left( \E^{\mu^\eps} \big| R_1(\eps,\delta) \big|^p \right)^{1/p} \leq  C \, \left(

1418: \eps^{2 \beta} + \eps^{\beta} \delta^{\beta} + \eps^{\beta} \delta^{\frac{\beta}{2}} \right).

1419: $$

1420: However, we have not been able to prove this.

1421: \end{remark}

1422: \begin{prop}\label{prop:xndelta2}

1423: For $\eps, \, \delta >0 $ sufficiently small and $n \in \mathbb{N}$ we have that

1424: \begin{equation}

1425: \label{e:xn_loc2}

1426: \alpha \int_{n \delta}^{(n+1) \delta} V'(x^\eps (s)) (1 +

1427: \partial_y \phi(y^\eps (s))) \, ds = A \delta V'(x^\eps_n) +R_2(\eps, \delta)

1428: \end{equation}

1429: in law. The remainder $R_2(\eps, \delta)$ satisfies, for every $p>0$,

1430: the estimate

1431: %

1432: \begin{equation}

1433: \left( \E^{\mu^\eps} \big| R_2(\eps,\delta) \big|^p \right)^{1/p} \leq  C \left(

1434: \eps^2 +\delta^{ \frac12}\eps+ \delta^{3/2} \right), \label{e:R_est2}

1435: \end{equation}

1436: where $C$ is independent of $\eps$ and $\delta.$

1437: \end{prop}

1438:

1439:

1440: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1441: %

1442: %                   PROOF OF PROPOSITION 1.1

1443: %

1444: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1445: %

1446: \section{Proof of Propositions \ref{prop:xndelta1} and \ref{prop:xndelta2}}

1447: \label{sec:xndelta_proof}

1448: In this section we prove Propositions \ref{prop:xndelta1} and

1449: \ref{prop:xndelta2}. These are central to

1450: the proof of the two theorems concerning the behaviour of the estimators with

1451: subsampled data. We start with a rough estimate on $x^\eps_{n+1} - x_n^\eps$ that we will

1452: need for the proofs of the propositions.

1453:

1454: \subsection{A Rough Estimate}

1455: %

1456: \begin{lemma}\label{lem:rough}

1457: Let Assumptions \ref{a:1} hold and assume that $x^\eps (t)$, the solution of

1458: \eqref{e:xeps_est}, is stationary. Then there exists a constant $C$, independent of

1459: $\delta$ and $\epsilon$, such that

1460: \begin{equation}\label{e:est_rough}

1461: \E^{\mu^\eps} \left| x^\eps(s) - x^\eps_{n \delta} \right|^p \leq C \left( \delta^p +

1462: \delta^{\frac{p}{2}} + \eps^p \right),

1463: \end{equation}

1464: for every $s \in (n \delta, (n +1 ) \delta]$ and every $p \geq 1$.

1465: \end{lemma}

1466: %

1467: \proof

1468: Using the same derivation that leads to \eqref{e:integr_parts},

1469: but with $(n+1) \delta$ replaced by $s$, we have:

1470: \begin{eqnarray}

1471: x^\eps(s) - x_n^\eps & = & - \alpha \int_{n \delta}^{s} V'(x^\eps(r)) (1 +

1472: \partial_y \phi(y^\eps(r))) \, dr + \sqrt{2 \sigma} \int_{n \delta}^{s} (1 +

1473: \partial_y \phi(y^\eps(r))) \, d \beta(r)

1474: \nonumber \\ &&- \eps \left( \phi(y^\eps (s)) - \phi(y^\eps(n\delta)) \right)

1475: \nonumber \\  & =:&

1476: I_{n,\delta}^1 + I_{n,\delta}^2 + I_{n,\delta}^3.

1477: \label{eq:itophi}

1478: \end{eqnarray}

1479: We need to estimate the terms in \eqref{eq:itophi}. We start with $I^3_{n, \delta}$. By

1480: Lemma \ref{l:Poisson} we have

1481: $$\|\phi(y)\|_{L^\infty} \leq C. $$

1482: Consequently

1483: $$

1484: \E^{\mu^\eps} |I_{n,\delta}^3|^p  \leq C \eps^p.

1485: $$

1486: To estimate $I_{n,\delta}^1$ we use again Lemma \ref{l:Poisson} to conclude that

1487: \begin{equation}\label{e:phi_est}

1488: \|1 + \partial_y \phi(y)\|_{L^\infty} \leq C.

1489: \end{equation}

1490: The above estimate, together with Assumptions \ref{a:1}, Corollary \ref{cor:moments} and

1491: the stationarity of the process $x^\eps (t),$ give

1492: \begin{eqnarray*}

1493: \E^{\mu^\eps} |I_{n,\delta}^1|^p & \leq & C \delta^{p-1} \int_{n \delta}^{(n+1) \delta}

1494: \E^{\mu^\eps} |V'(x^\eps(s))|^p \, ds

1495:  \\ & \leq & C \delta^{p-1} \int_{n \delta}^{(n+1)\delta} \E^{\mu^\eps}

1496: |x^\eps (s) |^{p} \, ds

1497: \\ & \leq & C \delta^{p}.

1498: \end{eqnarray*}

1499: Estimate \cite[Eqn. 3.25, p. 163]{KSh91} on moments of stochastic integrals,

1500: together with equation \eqref{e:phi_est}, enable us to conclude that

1501: %

1502: \begin{eqnarray*}

1503: \E^{\mu^\eps} |I_{n,\delta}^2|^p & \leq & C \delta^{\frac{p}{2}-1} \int_{n

1504: \delta}^{(n+1)\delta} \E^{\mu^\eps} |1 + \partial_y \phi(y^\eps (s))|^p \, ds

1505: %

1506: \\ & \leq & C \delta^{\frac{p}{2}}.

1507: %

1508: \end{eqnarray*}

1509: %

1510: We combine the above estimates to obtain \eqref{e:est_rough}. \qed

1511: %

1512: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1513: %

1514: \subsection{Proof of Proposition \ref{prop:xndelta1}}

1515:

1516: From \cite[Sec. 1.3]{Freid85} and \cite[Thm. 3.4.6]{KSh91} we know that the

1517: martingale

1518: $$M(t):=\sqrt{2\sigma}\int_0^t  \left( 1 + \partial_y \phi(y^\eps(s)) \right) \, d \beta(s)$$

1519: is equal in law to a time--changed Brownian motion,

1520: $$M(t)=\widehat{\beta}  \left(2 \sigma \int_0^t \left( 1 + \partial_y

1521:  \phi(y^\eps(s)) \right)^2 \, d s \right).

1522: $$

1523: Also the quadratic variation satisfies

1524: %

1525: $$\langle M \rangle_t =2 \sigma \int_0^t \left( 1 + \partial_y \phi(y^\eps(s))

1526: \right)^2 \, d s \approx 2\Sigma t.$$

1527: %

1528: Indeed

1529: \begin{eqnarray*}

1530: \E^{\mu^{\eps}} \langle M \rangle_t &=& 2 \sigma \bbE^{\mu^{\eps}}

1531: \int_0^t \left( 1 + \partial_y \phi(y^\eps(s))

1532: \right)^2 \, d s

1533: \\ & = & 2\Sigma t,

1534: \end{eqnarray*}

1535: where the last equality follows from equation \eqref{e:coeffs} for $d = 1$. Using these

1536: observations we write

1537: \begin{eqnarray*}

1538: J_n  & := & \sqrt{2 \sigma} \int_{n \delta}^{(n+1) \delta} \left( 1 + \partial_y

1539: \phi(y^\eps(s)) \right) \, d \beta(s) \\

1540: &=&\sqrt{2\sigma} \int_0^{(n+1)\delta}\left( 1 + \partial_y \phi(y^\eps(s)) \right) \, d

1541: \beta(s) -\sqrt{2\sigma} \int_0^{n\delta}\left( 1 + \partial_y

1542: \phi(y^\eps(s)) \right) \, d \beta(s) \\

1543: &=& \widehat{\beta}(2\Sigma(n+1)\delta)-\widehat{\beta}(2\Sigma n\delta)

1544: +r_{n+1}-r_n\\

1545: &=&\sqrt {2\Sigma \delta} \xi_n+r_{n+1}-r_n,

1546: \end{eqnarray*}

1547: where the $\xi_n$ are i.i.d. standard Gaussian random variables and

1548: $$r_n=\widehat{\beta}(\langle M\rangle_{n\delta})-\widehat{\beta}(2\Sigma n\delta).$$
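
The last equality in the display for $J_n$ uses only the scaling and independence of Brownian increments; spelled out, as a short sketch,
$$
\widehat{\beta}(2 \Sigma (n+1) \delta) - \widehat{\beta}(2 \Sigma n \delta) \sim \mathcal{N}(0, 2 \Sigma \delta),
\qquad
\xi_n := \frac{\widehat{\beta}(2 \Sigma (n+1) \delta) - \widehat{\beta}(2 \Sigma n \delta)}{\sqrt{2 \Sigma \delta}} \sim \mathcal{N}(0,1).
$$
Since the intervals $[2 \Sigma n \delta, 2 \Sigma (n+1) \delta]$ overlap only at their endpoints for different $n$, the increments, and hence the $\xi_n$, are independent.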

1549:

1550: To estimate this difference we follow the proof of \cite[Thm. 2.1]{HairPavl04}. We start

1551: by employing the H\"{o}lder continuity of Brownian motion, together with the H\"{o}lder

1552: inequality, to estimate:

1553: \begin{eqnarray*}

1554: \E^{\mu^\eps} \left|  \widehat{\beta} (\langle M \rangle_{n\delta}) - \widehat{\beta} (

1555: \E^{\mu^\eps} \langle M \rangle_{n\delta}) \right|^p   & \leq & \E^{\mu^\eps} \left|

1556: \mbox{H\"{o}l}_{\beta}(\widehat{\beta})  \left(  \langle M \rangle_{n\delta} -

1557: \E^{\mu^\eps} \langle M \rangle_{n\delta} \right)^{\beta} \right|^p

1558: %

1559: \\ & \leq &

1560: %

1561: \E^{\mu^\eps} \left| \mbox{H\"{o}l}_{\beta}(\widehat{\beta})  \right|^{p}

1562: \left( \E^{\mu^\eps} \left|\langle M \rangle_{n\delta} - \E^{\mu^\eps}

1563: \langle M \rangle_{n\delta}  \right|^{\beta  q} \right)^{\frac{p}{q}}

1564: %

1565: \\ & \leq &

1566: %

1567: C \left( \E^{\mu^\eps} \left| \int_0^{ n\delta} H(y^\eps (z)) \, dz \right|^{\beta  q}

1568: \right)^{\frac{p}{q}},

1569: %

1570: \end{eqnarray*}

1571: %

1572: with $\beta \in \left(0, \frac{1}{2} \right)$. We have used the notation

1573: %

1574: $$

1575: H(y) := 2 \sigma  \left( 1 + \partial_y \phi(y) \right)^2  - 2 \Sigma.

1576: $$

1577: %

1578: We have also used the fact that, for every $\beta \in \left(0, \frac{1}{2} \right)$ and

1579: every bounded time interval, the $\beta$--H\"{o}lder exponent of Brownian motion is

1580: uniformly bounded with probability one. We have that

1581: $$

1582: \int_{\T} H(y) \, \mu(dy) = 0,

1583: $$

1584: %

1585: where $\mu(dy)$ is defined in \eqref{e:gibbs_torus}. Since $n\delta \le T$,

1586: Lemma \ref{lem:ito} applies

1587: and we have that,  for $q$ sufficiently large and for $\eps$ sufficiently small,

1588: %

1589: \begin{eqnarray*}

1590: \E^{\mu^\eps} \left| J_n - \sqrt{2 \Sigma \delta}\xi_n \right|^p   & \leq & C \left( \eps^{2 q

1591: \beta} +   \eps^{q \beta }  \right)^{\frac{p}{q}} \\ & \leq & C

1592: \left( \eps^{2 p \beta} + \eps^{p \beta }  \right) .

1593: \end{eqnarray*}

1594: This completes the proof of the proposition.

1595: \begin{comment}

1596: \begin{eqnarray*}

1597: \sqrt{2 \sigma} \int_{n \delta}^{(n+1)\delta} (1 + \partial_y \phi(y^\eps(s))) \, d

1598: \beta(s) & = & \sqrt{2 \sigma} \int_0^{(n+1) \delta} (1 + \partial_y \phi(y^\eps(s))) \, d

1599: \beta(s) -   \sqrt{2 \sigma} \int_0^{n \delta} (1 + \partial_y \phi(y^\eps(s))) \, d

1600: \beta(s) \\ & = & \widehat{\beta}  \left(2 \sigma \int_0^{(n+1)\delta} \left( 1 +

1601: \partial_y \phi(y^\eps(s)) \right)^2 \, d s \right) - \widehat{\beta}

1602: \left(2 \sigma \int_0^{n \delta} \left( 1 + \partial_y

1603:  \phi(y^\eps(s)) \right)^2 \, d s \right) \\ & = & \widehat{\beta}(2 \Sigma (n+1) \delta) -

1604:  \widehat{\beta}(2 \Sigma n \delta) \\ &&+  \left[ \left(\widehat{\beta}  \left(2 \sigma

1605:  \int_0^{(n+1)\delta} \left( 1 + \partial_y \phi(y^\eps(s)) \right)^2 \, d s \right)

1606:  - \widehat{\beta}(2 \Sigma (n+1) \delta) \right)  \right. \\ && \left.+

1607:  \left(\widehat{\beta}  \left(2 \sigma

1608:  \int_0^{n \delta} \left( 1 + \partial_y \phi(y^\eps(s)) \right)^2 \, d s \right)

1609:  - \widehat{\beta}(2 \Sigma n \delta) \right)  \right] \\ &=&

1610:  \widehat{\beta}(2 \Sigma (n+1) \delta) - \widehat{\beta}(2 \Sigma n \delta) + r^\eps_{(n +1)

1611:  \delta} + r^\eps_{n \delta}.

1612: \end{eqnarray*}

1613: Notice that, in law,

1614: \begin{eqnarray*}

1615: \widehat{\beta} \left(2 \Sigma (n+1) \delta \right) -

1616: \widehat{\beta} \left(2 \Sigma n \delta \right)

1617:  & = & \sqrt{2 \Sigma \delta} (\widehat{\beta}(n+1) - \widehat{\beta}(n) ).

1618: \\ & = & \sqrt{2 \Sigma \delta} \, \xi_n.

1619: \end{eqnarray*}

1620: Now we need to estimate the remainder. The proof of the estimate follows the proof of

1621: \cite[Thm. 2.1]{HairPavl04}. We start by employing the H\"{o}lder continuity of Brownian

1622: motion, together with H\"{o}lder inequality to estimate:

1623: \begin{eqnarray*}

1624: \E \left| r^\eps_{(n+1) \delta} \right|^p  & = & \E \left| \widehat{\beta} \left(2 \sigma

1625: \int_0^{(n+1) \delta} \left( 1 + \partial_y \phi(y^\eps(s)) \right)^2 \,  d s \right) -

1626: \widehat{\beta} \left(2 \Sigma \delta  \right) \right|^p

1627: \\ & \leq &

1628: \E \left| \mbox{H\"{o}l}_{\beta}(\widehat{\beta})  \left(\int_0^{(n+1) \delta} \left( 2

1629: \sigma  \left( 1 + \partial_y \phi(y^\eps(s)) \right)^2  - 2 \Sigma    \right) \, ds

1630: \right)^{\beta} \right|^p

1631: %

1632: \\ & \leq &

1633: %

1634: \left( \E \left| \mbox{H\"{o}l}_{\beta}(\widehat{\beta})  \right|^{m }

1635: \right)^{\frac{p}{m}} \left( \E \left| \int_0^{(n+1) \delta} \left( 2 \sigma \left( 1 +

1636: \partial_y \phi(y^\eps(s)) \right)^2  - 2 \Sigma   \right) \, ds \right|^{\beta  q}

1637: \right)^{\frac{p}{q}}

1638: %

1639: \\ & \leq &

1640: %

1641: C \left( \E \left| \int_0^{(n+1) \delta} H(y^\eps(s)) \, ds, \right|^{\beta  q}

1642: \right)^{\frac{p}{q}},

1643: %

1644: \end{eqnarray*}

1645: %

1646: with $\beta \in \left(0, \frac{1}{2} \right)$. We have used the notation

1647: %

1648: $$

1649: H(y) = 2 \sigma  \left( 1 + \partial_y \phi(y^\eps(s)) \right)^2  - 2 \Sigma.

1650: $$

1651: %

1652: We have also used the fact that, for every $\beta \in \left(0, \frac{1}{2} \right)$ and

1653: every bounded time interval, the $\beta$--H\"{o}lder exponent of Brownian motion is

1654: uniformly bounded with probability one.

1655:

1656: Notice now that

1657: %

1658: $$

1659: \int_{\T^d} H(y) \, \mu(dy) = 0,

1660: $$

1661: %

1662: where $\mu(dy)$ is defined in \eqref{e:gibbs_torus}. Hence,

1663: by Lemma \ref{lem:ito}, we have that,  for $q$ sufficiently large and for

1664: $\eps$ sufficiently small,

1665: %

1666: \begin{eqnarray*}

1667: \E \left| r^\eps_{(n+1) \delta} \right|^p   & \leq & C \left( \eps^{2 \beta} +   \eps^{

1668: \beta } ((n+1) \delta)^{\frac{\beta q}{2}} \right)^p \\ & \leq & C \eps^{p \beta}.

1669: \end{eqnarray*}

1670: %

1671: Similarly,

1672: %

1673: \begin{eqnarray*}

1674: \E \left| r^\eps_{n \delta} \right|^p  \leq C \eps^{p \beta}.

1675: \end{eqnarray*}

1676: %

1677: This completes the proof of the proposition.

1678: \end{comment}

1679:  \qed

1680: %

1681: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1682: %

1683: \subsection{Proof of Proposition \ref{prop:xndelta2}}

1684: %

1685: %

1686: We have

1687: \begin{eqnarray*}

1688: \E^{\mu^\eps} |R_2(\eps,\delta)|^p & = & \E^{\mu^\eps} \left| \int_{n \delta}^{(n +1)

1689: \delta} \alpha V'(x^\eps(s)) \left( 1 +

1690: \partial_y \phi(y^\eps(s)) \right) \, ds  -

1691:  \delta A V'(x^\eps_{n \delta}) \right|^p  \\ &=&

1692: \E^{\mu^\eps} \left| \int_{n \delta}^{(n +1) \delta} \alpha V'(x^\eps_{n\delta})\left( 1 +

1693: \partial_y \phi(y^\eps(s)) \right) \, ds - A \int_{n\delta}^{(n+1) \delta} V'(x^\eps_{n\delta}) \,

1694: ds \right.

1695: \\ &&

1696: +  \left. \alpha \int_{n\delta}^{(n+1) \delta} \Bigl(V'(x^\eps(s)) - V'(x^\eps_{n

1697: \delta})\Bigr)\Bigl(1+

1698: \partial_y \phi(y^\eps(s))\Bigr) \,ds \right|^p

1699: \\ & \leq & C \E^{\mu^\eps} \left| V'(x^\eps_{n\delta})

1700: \int_{n \delta}^{(n +1) \delta} \left( \alpha \left( 1 + \partial_y \phi(y^\eps(s))

1701: \right) - A \right) \, ds \right|^p

1702: \\ &&

1703: + \alpha ^p C \E^{\mu^\eps} \left| \int_{n\delta}^{(n+1) \delta} \Bigl( V'(x^\eps(s)) -

1704: V'(x^\eps_{n \delta}) \Bigr) \Bigl(1+\partial_y \phi(y^\eps(s))\Bigr) \,ds \right|^p \\ &

1705: =: & I^1_{\eps, \delta} + I^2_{\eps, \delta},

1706: \end{eqnarray*}

1707: %

1708: where the constant $C$ depends only on $p$. We use the H\"{o}lder inequality, Assumptions

1709: \ref{a:1}, Lemma \ref{lem:rough} and the uniform bound on $\partial_y \phi(y)$ to obtain,

1710: for $\eps, \, \delta$ sufficiently small,

1711: \begin{eqnarray*}

1712: I_{\epsilon,\delta}^2  & \leq & C \delta^{p - 1} \int_{n\delta}^{(n+1) \delta}

1713: \E^{\mu^\eps} \left| x^\eps(s) - x^\eps_{n \delta} \right|^{p} \, ds

1714: \\ & \leq & C \delta^{p-1} \int_{n\delta}^{(n+1) \delta}(\delta^{\frac{p}{2}} +

1715: \eps^{p}  ) \, ds \\ & \leq & C \left(\delta^{\frac{3p}{2}} + \delta^p \eps^{p}

1716:    \right).

1717: \end{eqnarray*}

1718: %

1719: Consequently

1720: \begin{equation}\label{e:est_xn1}

1721:  \left( I^2_{\eps, \delta} \right)^{1/p}  \leq   C(\delta^{3/2} +\delta

1722: \epsilon).

1723: \end{equation}

1724: Consider now the function

1725: %

1726: $$

1727: H(y):= \alpha \left( 1 + \partial_y \phi(y) \right) - A.

1728: $$

1729: From the definition of $A$ we get that

1730: %

1731: $$

1732: \int_{\bbT} \Bigl( \alpha \left( 1 + \partial_y \phi(y) \right) - A \Bigr)\, \mu(dy) = 0.

1733: $$

1734: %

1735: Hence, Lemma \ref{lem:ito} applies and we get

1736: %

1737: \begin{eqnarray*}

1738: \E^{\mu^\eps} \left| \int_{n \delta}^{(n +1) \delta} \left( \alpha \left( 1 + \partial_y

1739: \phi(y^\eps(s)) \right) - A \right) \, ds \right|^p

1740:  & \leq & C \left(\eps^{2 p} + \eps^p \delta^p + \eps^p \delta^{p/2} \right).

1741: \end{eqnarray*}

1742: %

1743: We combine the above estimate with \eqref{e:linbnd} and Corollary \ref{cor:moments} to obtain,

1744: \begin{equation}\label{e:est_xn2}

1745:  \left( I^1_{\eps, \delta} \right)^{1/p}  \leq   C \left(\eps^2 +  \eps

1746: \delta^{1/2 } \right),

1747: \end{equation}

1748: for $\eps, \, \delta$ sufficiently small. The proof of the proposition follows from

1749: estimates \eqref{e:est_xn1} and \eqref{e:est_xn2}. \qed
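
The way the two estimates combine can be sketched as follows, assuming $p \geq 1$ and $\delta \leq 1$. Since $\E^{\mu^\eps} |R_2(\eps,\delta)|^p \leq I^1_{\eps,\delta} + I^2_{\eps,\delta}$ and $(a+b)^{1/p} \leq a^{1/p} + b^{1/p}$ for $a, b \geq 0$, estimates \eqref{e:est_xn1} and \eqref{e:est_xn2} give
$$
\left( \E^{\mu^\eps} |R_2(\eps,\delta)|^p \right)^{1/p} \leq C \left( \eps^2 + \eps \delta^{1/2} + \delta^{3/2} + \delta \eps \right) \leq C \left( \eps^2 + \eps \delta^{1/2} + \delta^{3/2} \right),
$$
where the last step uses $\delta \eps \leq \delta^{1/2} \eps$ for $\delta \leq 1$. This is \eqref{e:R_est2}.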

1750: %

1751: %

1752: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1753: %

1754: %                        PROOF OF THM 1.2

1755: %

1756: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1757: \section{Proof of Main Theorems}

1758:

1759: Here we combine the results from the preceding two sections to complete the proofs of the

1760: main theorems.

1761:

1762: \subsection{Proof of Theorem \ref{thm:est_ddim}}

1763: \label{sec:ddim}

1764: We combine equations \eqref{e:a_est} and \eqref{e:xeps_V} to calculate

1765: %

1766: \begin{eqnarray*}

1767: \widehat{A}(x^{\eps}) & = &  \frac{\int_0^T - \langle \nabla V(x^\eps(t)) , d x^\eps (t)

1768: \rangle}{\int_0^T | \nabla V(x^\eps(t))|^2 \, dt}

1769: %

1770: \\& = &

1771: %

1772: \frac{\int_0^T  \left\langle - \nabla V( x^\eps(t)),  - \alpha \nabla V(x^{\eps}(t)) \,

1773: dt -  \frac{1}{\epsilon}\nabla p \left( \frac{x^\eps(t)}{\eps} \right) \, dt + \sqrt{ 2

1774: \sigma} \, d\beta(t) \right\rangle }{\int_0^T |\nabla V(x^\eps(t))|^2 \, dt}

1775: %

1776: \\& = &

1777: %

1778: \alpha +  \frac{\frac{1}{\epsilon}\int_0^T  \left\langle \nabla V(x^\eps(t)), \nabla

1779: p(\frac{x^\eps(t)} {\eps} ) \right\rangle \, dt}{\int_0^T |\nabla V(x^\eps(t))|^2 \, dt}

1780: - \sqrt{2 \sigma} \frac{\int_0^T  \left\langle \nabla V(x^{\eps}(t)) , d \beta(t)

1781: \right\rangle}  {\int_0^T | \nabla V(x^\eps(t))|^2 \, dt}

1782: \\

1783: & = :& \alpha + I_1(T, \eps) - I_2(T, \eps).

1784: \end{eqnarray*}

1785: %

1786: We will treat the terms $I_1(T, \eps)$ and $I_2(T,\eps)$ separately. We start with

1787: $I_2(T, \eps)$. Since the stochastic integral

1788: $$

1789: M_T :=\int_0^T  \left\langle \nabla V(x^{\eps}(t)) , d \beta(t) \right\rangle

1790: $$

1791: is a continuous martingale which is null at $0$, the strong law of large numbers for

1792: martingales \cite[p. 187]{yor} applies and we have that

1793: $$

1794: \lim_{T \rightarrow + \infty} \frac{M_T}{\langle M \rangle_T} = 0 \quad \mbox{a.s.}

1795: $$

1796: Consequently

1797: \begin{equation}

1798: \lim_{T \rightarrow + \infty}I_2(T,\eps) =  0 \quad \mbox{a.s.}

1799: \label{e:i2_lim_d}

1800: \end{equation}

1801:

1802: Let us consider now the term $I_1(T, \eps)$. We use the ergodic theorem to deduce that

1803: %

1804: \begin{eqnarray*}

1805: \lim_{T \rightarrow \infty} I_1(T, \eps) & = & \lim_{T \rightarrow \infty} \frac{

1806: \frac{1}{\epsilon T} \int_0^T  \left\langle \nabla V(x^\eps(t)), \nabla p \left(

1807: \frac{x^\eps(t)}{\eps} \right) \right\rangle \, dt}{\frac{1}{T}\int_0^T |\nabla

1808: V(x^\eps(t))|^2 \, dt}

1809: \\ & = &

1810: \frac{\E^{\mu^\eps} \left(\left\langle \nabla V(x),  \frac{1}{\eps} \nabla p \left(

1811: \frac{x}{\eps} \right) \right\rangle \right) } { \E^{\mu^\eps} | \nabla V(x)|^2} \quad

1812: \mbox{a.s.}

1813: \end{eqnarray*}

1814: Now we use Proposition \ref{lem:xeps_meas_ddim} to compute

1815: \begin{eqnarray*}

1816:  \frac{\E^{\mu^\eps} \left(\left\langle \nabla V(x),  \frac{1}{\eps}

1817:  \nabla p \left( \frac{x}{\eps}

1818: \right) \right\rangle \right) } { \E^{\mu^\eps} | \nabla V(x)|^2} & = & \frac{\int_{\R^d}

1819: \left\langle \nabla V(x),  \frac{1}{\eps} \nabla p \left( \frac{x}{\eps} \right)

1820: \right\rangle \rho^\eps(x) \, dx }{  \E^{\mu^\eps} | \nabla V(x)|^2 }

1821: %

1822: \\ & = &

1823: %

1824: \frac{-\sigma \frac{1}{Z^\eps} \int_{\R^d} \left\langle \nabla V(x) e^{-\frac

1825: {\alpha}{\sigma} V(x)}, \nabla \left( e^{-\frac{1}{\sigma} p(x/ \eps)} \right)

1826: \right\rangle \, dx }{ \E^{\mu^\eps} | \nabla V(x)|^2 }

1827: \\ & = &

1828: \sigma \frac{ \E^{\mu^\eps} ( \Delta V(x) )}{ \E^{\mu^\eps} | \nabla

1829:  V(x)|^2} - \alpha.

1830: \end{eqnarray*}

1831: In deriving the last line we used an integration by parts. The weak convergence of

1832: $\mu^\eps$ to $\mu$ (second part of Proposition \ref{lem:xeps_meas_ddim}), formula

1833: \eqref{e:gibbs}, together with another integration by parts give

1834: \begin{eqnarray*}

1835: \lim_{\eps \rightarrow 0} \frac{ \E^{\mu^\eps} (\Delta V(x))}{\E^{\mu^\eps} (|\nabla

1836: V(x)|^2)} & = & \frac{\E^{\mu} (\Delta V(x))}{\E^{\mu} (|\nabla V(x)|^2)}

1837: \\ & = & \frac{\E^{\mu} (\Delta V(x))}{ -\frac{\sigma}{\alpha} \frac{1}{Z}

1838: \int_{\R^d} \langle \nabla V(x) , \nabla (e^{-\frac{\alpha}{\sigma} V(x)}) \rangle \, dx }

1839: \\ & = & \frac{\alpha}{\sigma}.

1840: \end{eqnarray*}
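
The two integrations by parts used above can be written out explicitly. This is a sketch which assumes that $e^{-\frac{\alpha}{\sigma} V(x)}$ decays fast enough at infinity for the boundary terms to vanish. For the first one,
\begin{align*}
- \frac{\sigma}{Z^\eps} \int_{\R^d} \left\langle \nabla V(x) e^{-\frac{\alpha}{\sigma} V(x)}, \nabla \left( e^{-\frac{1}{\sigma} p(x/\eps)} \right) \right\rangle \, dx
&= \frac{\sigma}{Z^\eps} \int_{\R^d} \nabla \cdot \left( \nabla V(x) e^{-\frac{\alpha}{\sigma} V(x)} \right) e^{-\frac{1}{\sigma} p(x/\eps)} \, dx \\
&= \sigma \E^{\mu^\eps} ( \Delta V(x) ) - \alpha \E^{\mu^\eps} |\nabla V(x)|^2 ,
\end{align*}
and division by $\E^{\mu^\eps} |\nabla V(x)|^2$ gives the stated expression. For the second one,
$$
\E^{\mu} |\nabla V(x)|^2 = - \frac{\sigma}{\alpha} \frac{1}{Z} \int_{\R^d} \left\langle \nabla V(x), \nabla \left( e^{-\frac{\alpha}{\sigma} V(x)} \right) \right\rangle \, dx = \frac{\sigma}{\alpha} \E^{\mu} ( \Delta V(x) ),
$$
so that the ratio equals $\alpha / \sigma$ and the limit of $I_1(T,\eps)$ is $\sigma \cdot \frac{\alpha}{\sigma} - \alpha = 0$.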

1841: We combine the above calculations to conclude that

1842: \begin{equation}

1843: \lim_{\eps \rightarrow 0} \lim_{T \rightarrow \infty} I_1(T, \eps) = 0 \quad \mbox{a.s.}

1844: \label{e:i1_lim_d}

1845: \end{equation}

1846: The proof of the convergence of the maximum likelihood estimator, eqn.

1847: \eqref{e:a_est_lim} now follows from equations \eqref{e:i1_lim_d} and \eqref{e:i2_lim_d}.

1848:

1849: The proof of the convergence of the estimator for the diffusion coefficient, eqn.

1850: \eqref{e:sigma_est_lim}, follows from the definition of the quadratic variation, see e.g.

1851: \cite{BasRao80}. \qed

1852: \begin{remark}

1853: An immediate corollary of the proof of the above theorem is that

1854: $$

1855: \lim_{T \rightarrow \infty} \widehat{A}(x^{\epsilon}) = \sigma \frac{ \E^{\mu^\eps} (\Delta

1856: V(x))}{\E^{\mu^\eps} |\nabla V(x)|^2} \quad \mbox{a.s.}

1857: $$

1858: \end{remark}

1859: %

1860: %

1861: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1862: %

1863: %                   PROOF OF PROPOSITION 1.1

1864: %

1865: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1866: %

1867: \subsection{Proof of Theorem \ref{thm:par_est_alpha}}\label{sec:thm_alpha}

1868: We combine Proposition \ref{prop:xndelta2} and \eqref{e:integr_parts} to conclude that

1869: $$

1870: x^\eps_{n+1} - x^\eps_n = J_n - A\delta V'(x^\eps_n) + R(\eps, \delta),

1871: $$

1872: where $J_n$ is as defined in the proof of Proposition \ref{prop:xndelta1}

1873: and, for $\eps, \, \delta$ sufficiently small and $\alpha \in (0,1)$,

1874: \begin{equation}\label{e:R_est_1}

1875: \left( \E^{\mu^\eps} |R(\eps, \delta)|^p \right)^{1/p} \leq

1876: C \bigl(\delta^{3/2}+\epsilon \bigr).

1877: \end{equation}

1878: Notice that

1879: $$\bbE^{\mu^{\eps}} |J_n|^2={\cal O}(\delta).$$

1880: We combine this with formula \eqref{e:alpha_estim_1d} to obtain

1881: \begin{eqnarray}

1882: \widehat{A}_{N, \delta}(x^\eps) &=& A -  \frac{\sum_{n=0}^{N-1} V'(x^\eps_n)

1883: J_n}{\sum_{n=0}^{N-1} |V'(x^\eps_n)|^2 \delta} - \frac{\sum_{n=0}^{N-1} V'(x^\eps_n)

1884: R(\eps, \delta)}{\sum_{n=0}^{N-1} |V'(x^\eps_n)|^2 \delta} \nonumber \\ & =: & A - I_1 -

1885: I_2. \label{e:a_j1_j2}

1886: \end{eqnarray}

1887: %

1888: We need to control the terms $I_1$ and $I_2$. We start with $I_1$, which we rewrite in

1889: the form

1890: \begin{eqnarray*}

1891: I_1 & = &  \eps^{\frac{\gamma

1892: -\alpha}{2}}\frac{\frac{1}{\sqrt{(N\delta)}}\sum_{n=0}^{N-1} V'(x^\eps_n)

1893: J_n}{\frac{1}{N}\sum_{n=0}^{N-1} |V'(x^\eps_n)|^2}.

1894: \end{eqnarray*}
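
The prefactor can be checked by exponent bookkeeping. The sketch below assumes, as the exponents appearing in this proof suggest, that $\delta = \eps^{\alpha}$ and $N = \eps^{-\gamma}$, and it ignores the integer part in $N = [\eps^{-\gamma}]$. Then $N \delta = \eps^{\alpha - \gamma}$, so that
$$
\frac{1}{N \delta} = \eps^{\gamma - \alpha}, \qquad \frac{1}{\sqrt{N \delta}} = \eps^{\frac{\gamma - \alpha}{2}},
$$
and
$$
\frac{\sum_{n=0}^{N-1} V'(x^\eps_n) J_n}{\sum_{n=0}^{N-1} |V'(x^\eps_n)|^2 \delta}
= \frac{1}{\sqrt{N \delta}} \cdot \frac{\frac{1}{\sqrt{N \delta}} \sum_{n=0}^{N-1} V'(x^\eps_n) J_n}{\frac{1}{N} \sum_{n=0}^{N-1} |V'(x^\eps_n)|^2}.
$$
In particular the prefactor tends to zero as $\eps \rightarrow 0$ exactly when $\gamma > \alpha$.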

1895: The central limit theorem for (discrete) martingales implies that

1896: \begin{eqnarray*}

1897: \lim_{N \rightarrow + \infty} \frac{1}{\sqrt{(N\delta)}} \sum_{n=0}^{N-1} V'(x^\eps_n)

1898: J_n  & = & \frac{1}{\sqrt \delta}\mathcal{N} \left(0, \E^{\mu^{\eps}} \left(

1899: |V'(x^{\eps}(0))|^2|J_0|^2 \right) \right)

1900: \\ & = &

1901: \frac{1}{\sqrt \delta}\mathcal{N} \left(0, c \, \delta \right) = \sqrt{c} \, \mathcal{N}(0,1)

1902: \quad \mbox{in law},

1903: \end{eqnarray*}

1904: for some constant $c$ that is bounded uniformly in $\epsilon$ as $\epsilon \to 0$. In the above we have used the fact that

1905: $\E^{\mu^\eps} |J_0|^2 = 2 \Sigma \delta$.

1906:

1907: On the other hand, the ergodic theorem implies that

1908: \begin{equation}\label{e:denom}

1909: \lim_{N \rightarrow + \infty}\frac{1}{N}\sum_{n=0}^{N-1} |V'(x^\eps_n)|^2 = \E^{\mu^\eps}

1910: |V'(x)|^2, \quad \mbox{a.s.}

1911: \end{equation}

1912: Hence, by Slutsky's theorem, and remembering that $N = [\eps^{-\gamma}]$, so that the prefactor $\eps^{\frac{\gamma-\alpha}{2}}$ tends to zero whenever $\gamma > \alpha$, we have that

1913: \begin{equation}\label{e:j1_lim}

1914: \lim_{\eps \rightarrow 0 } I_1 = 0 \quad \mbox{in law}.

1915: \end{equation}

1916: Consider now the term $I_2$. It can be written as

1917: $$

1918: I_2 = \frac{\eps^{\gamma - \alpha}\sum_{n=0}^{N-1}V'(x^\eps_n) R(\eps, \delta)

1919: }{\frac{1}{N}\sum_{n=0}^{N-1} |V'(x^\eps_n)|^2}.

1920: $$

1921: The ergodic theorem implies that the denominator in the above expression converges a.s.

1922: to a finite value. To study the numerator of the above expression we use estimate

1923: \eqref{e:R_est_1}, together with H\"{o}lder's inequality, to estimate

1924: \begin{eqnarray*}

1925: \E^{\mu^\eps} \left| \eps^{\gamma - \alpha} \sum_{n = 0}^{N-1} V'(x^\eps_n) R(\eps,

1926: \delta)\right| &\leq & \eps^{\gamma - \alpha} \sum_{n=0}^{N - 1} \left(

1927: \E^{\mu^\eps}|V'(x^\eps_n)|^q  \right)^{1/q} \left( \E^{\mu^\eps} |R(\eps, \delta)|^p

1928: \right)^{1/p}

1929: \\ & \leq &

1930: C \eps^{\gamma - \alpha} \sum_{n = 0}^{N - 1}

1931: \Bigl(\E^{\mu^\eps}  |R(\eps, \delta)|^p \Bigr)^{1/p}

1932: \\ & \leq &

1933: C\bigl(\epsilon^{\alpha/2}+\epsilon^{1-\alpha}\bigr).

1934: \end{eqnarray*}

1935: %

1936: In the above we have used Corollary \ref{cor:moments}, together with Assumptions

1937: \ref{a:1}. The above calculation shows that the numerator of $I_2$ converges to $0$ in $L^1$,

1938: and hence in law. This, together with the a.s. convergence of the denominator

1939: and Slutsky's theorem gives

1940: \begin{equation}\label{e:j2_lim}

1941: \lim_{\eps \rightarrow 0 } I_2 = 0 \quad \mbox{in law}.

1942: \end{equation}
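As a sketch of the exponent bookkeeping behind the last inequality in the bound on the numerator of $I_2$, we insert estimate \eqref{e:R_est_1} and use $N \leq \eps^{-\gamma}$ together with $\delta = \eps^{\alpha}$:
$$
C \eps^{\gamma - \alpha} \sum_{n=0}^{N-1} \bigl(\delta^{3/2} + \eps \bigr)
= C \eps^{\gamma - \alpha} N \bigl(\delta^{3/2} + \eps \bigr)
\leq C \eps^{-\alpha} \bigl(\eps^{3\alpha/2} + \eps \bigr)
= C \bigl(\eps^{\alpha/2} + \eps^{1-\alpha} \bigr).
$$
Both exponents are positive exactly when $\alpha \in (0,1)$, which is why the numerator vanishes in $L^1$ in this range.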

1943: Combining \eqref{e:a_j1_j2}, \eqref{e:j1_lim} and \eqref{e:j2_lim} completes the

1944: proof of the theorem.  \qed

1945: %

1946: %

1947: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1948: %

1949: %                        PROOF OF THM 1.1

1950: %

1951: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

1952: %

1953: \subsection{Proof of Theorem \ref{thm:par_est_sigma}}\label{sec:thm_sigma}

1954: %

1955: We combine Proposition \ref{prop:xndelta1} with \eqref{e:integr_parts} to write

1956: the difference $x_{n+1}^\eps - x_n^\eps$  in the form

1957: \begin{equation}

1958: x_{n+1}^\eps - x_n^\eps = \sqrt{2 \Sigma \, \delta} \, \xi_n + \widehat{R}(\delta, \eps)

1959: \label{e:xn_loc_1}

1960: \end{equation}

1961: in law, where, for $\eps, \, \delta$ sufficiently small,

1962: %

1963: \begin{equation}\label{e:R_est_3}

1964: \left( \E^{\mu^\eps} |\widehat{R}(\delta, \eps)|^p \right)^{1/p}

1965: \leq C \left(\delta + \eps^{\beta}\right).

1966: \end{equation}

1967: We substitute \eqref{e:xn_loc_1} into the formula for the estimator

1968: \eqref{e:sigma_estim_1d} with $d = 1$ to obtain

1969: \begin{eqnarray*}

1970: \widehat{\Sigma}_{N, \delta}(x^{\epsilon}) &=& \Sigma \frac{1}{N} \sum_{n=0}^{N-1}

1971: \xi_n^2 + \frac{1}{2 N \delta} \sum_{n=0}^{N-1} \left( \widehat{R}(\delta, \eps)

1972: \right)^2 + \frac{1}{N \delta} \sum_{n=0}^{N-1} \sqrt{2\Sigma \delta}\xi_n

1973: \widehat{R}(\delta, \eps)

1974: \\ & =:&

1975: \Sigma \frac{1}{N}\sum_{n=0}^{N-1} \xi_n^2 + I_1 + I_2.

1976: \end{eqnarray*}

1977: By the law of large numbers the first term tends almost surely to $\Sigma$ as $\epsilon

1978: \to 0$ (which implies $N \to \infty$). Thus it suffices to show that the remaining terms

1979: tend to zero in law. We do this by showing that they tend to zero in $L^1$.
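The three terms in the decomposition arise from expanding a square. As a sketch, assume that the estimator \eqref{e:sigma_estim_1d} with $d=1$ has the form $\frac{1}{2N\delta}\sum_{n=0}^{N-1}(x^\eps_{n+1}-x^\eps_n)^2$; this form is inferred from the decomposition displayed above rather than quoted. Then \eqref{e:xn_loc_1} gives
$$
\bigl(x^\eps_{n+1} - x^\eps_n\bigr)^2 = 2 \Sigma \, \delta \, \xi_n^2
+ \bigl(\widehat{R}(\delta, \eps)\bigr)^2
+ 2 \sqrt{2 \Sigma \, \delta} \, \xi_n \, \widehat{R}(\delta, \eps),
$$
and summing over $n$ and dividing by $2N\delta$ produces the leading term, $I_1$ and $I_2$, in that order.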

1980:

1981: Note that

1982: \begin{align*}

1983: \bbE^{\mu^{\eps}}|I_1| & \le \frac{C}{N\delta} \sum_{n=0}^{N-1} \bbE^{\mu^\eps}(\widehat{R}(\delta,\eps))^2\\

1984: &\le \frac{C}{\delta}(\delta+\eps^{\beta})^2\\

1985: &\le C(\delta+\epsilon^{2\beta}\delta^{-1})\\

1986: &=C(\epsilon^{\alpha}+\epsilon^{2\beta-\alpha})\\

1987: &=o(1),

1988: \end{align*}

1989: for $\alpha \in (0,1)$, since $\beta$ can be chosen arbitrarily close to $\frac12$.

1990:

1991: Similarly

1992: \begin{align*}

1993: \bbE^{\mu^{\eps}}|I_2| & \le \frac{C}{N\delta} \sum_{n=0}^{N-1} \delta^{\frac12}(\delta+\eps^{\beta})\\

1994: &\le C(\delta^{\frac12}+\epsilon^{\beta}\delta^{-\frac12})\\

1995: &=C(\epsilon^{\frac{\alpha}{2}}+\epsilon^{\beta-\frac{\alpha}{2}})\\

1996: &=o(1),

1997: \end{align*}

1998: for $\alpha \in (0,1)$, since $\beta$ can be chosen arbitrarily close to $\frac12$.
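Two steps in the bounds on $I_1$ and $I_2$ can be made explicit. The following is a sketch, assuming that $\xi_n$ has unit variance (as the law of large numbers step above suggests) and using \eqref{e:R_est_3} with $p=2$. The cross term is controlled by the Cauchy--Schwarz inequality,
$$
\bbE^{\mu^{\eps}} \bigl| \xi_n \, \widehat{R}(\delta, \eps) \bigr|
\leq \bigl( \bbE^{\mu^{\eps}} \xi_n^2 \bigr)^{1/2}
\bigl( \bbE^{\mu^{\eps}} |\widehat{R}(\delta, \eps)|^2 \bigr)^{1/2}
\leq C \bigl(\delta + \eps^{\beta}\bigr),
$$
and, with $\delta = \eps^{\alpha}$, the two error exponents satisfy
$$
2\beta - \alpha > 0 \quad \mbox{and} \quad \beta - \frac{\alpha}{2} > 0
\quad \Longleftrightarrow \quad \beta > \frac{\alpha}{2}.
$$
For any fixed $\alpha \in (0,1)$ such a $\beta < \frac12$ exists, for instance any $\beta \in \bigl(\frac{\alpha}{2}, \frac12\bigr)$.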

1999: This completes the proof.

2000: \qed

2001: %

2002: %

2003: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

2004: %

2005: %

2006: \subsection{Proof of Theorem \ref{prop:drift_estim_2}}

2007: %

2008: Taking the limit $T \to \infty$ in \eqref{eq:alpha2} gives

2009: %

2010: $$

2011: \lim_{T \to \infty}\tilde{A}(x^{\eps})=\widehat{\Sigma} \frac{\bbE^{\mu^{\eps}} (\Delta

2012: V(x))}{\bbE^{\mu^{\eps}} |\nabla V(x)|^2}.

2013: $$

2014: %

2015: Proposition \ref{lem:xeps_meas_ddim} implies that

2016: %

2017: $$\lim_{\eps \to 0}\widehat{\Sigma} \frac{\bbE^{\mu^{\eps}} (\Delta V(x))}

2018: {\bbE^{\mu^{\eps}} |\nabla V(x)|^2}= \widehat{\Sigma} \frac{\bbE^{\mu} ( \Delta V(x)

2019: )}{\bbE^{\mu} |\nabla V(x)|^2},$$

2020: %

2021: where $\E^\mu$ denotes expectation with respect to the invariant distribution $\rho(x)$

2022: of the homogenized process, given by formula \eqref{e:gibbs}. An

2023: integration by parts now gives that

2024: %

2025: $$

2026: \E^{\mu} |\nabla V(x)|^2 = \frac{\sigma}{\alpha} \E^{\mu} ( \Delta V(x) ).

2027: $$
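The integration by parts can be sketched as follows. We assume here that the Gibbs density in \eqref{e:gibbs} has the form $\rho(x) = Z^{-1} e^{-\alpha V(x)/\sigma}$, where the normalization constant $Z$ is notation introduced only for this sketch, and that boundary terms vanish; this form is consistent with the identity above but is not quoted from \eqref{e:gibbs}. Then
$$
\E^{\mu} ( \Delta V(x) )
= \frac{1}{Z} \int \Delta V \, e^{-\alpha V/\sigma} \, dx
= - \frac{1}{Z} \int \nabla V \cdot \nabla \bigl( e^{-\alpha V/\sigma} \bigr) \, dx
= \frac{\alpha}{\sigma} \, \frac{1}{Z} \int |\nabla V|^2 e^{-\alpha V/\sigma} \, dx
= \frac{\alpha}{\sigma} \, \E^{\mu} |\nabla V(x)|^2 .
$$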

2028: %

2029: Thus, the final result of our considerations is that

2030: %

2031: $$

2032: \lim_{\eps \rightarrow 0} \lim_{T \rightarrow \infty} \tilde{A}(x^{\eps}) =

2033: \frac{\widehat{\Sigma}}{\sigma} \alpha.

2034: $$

2035: \qed

2036: %

2037: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

2038: %

2039: %                          CONCLUSIONS

2040: %

2041: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

2042: %

2043: \section{Conclusions and Future Work}

2044: \label{sec:conc}

2045: The problem of parameter estimation for continuous time multiscale

2046: diffusion processes is studied in this paper. Our goal is to accurately fit a

2047: homogenized equation from data which has a multiscale character.

2048: Our main conclusions are as follows:

2049:

2050: \begin{itemize}

2051:

2052: \item In order to estimate the drift and diffusion

2053: coefficients accurately it is necessary to subsample.

2054:

2055: \item There is an optimal subsampling rate, between the two

2056: characteristic time-scales of the multiscale data.

2057:

2058: \item The optimal subsampling rate may differ for different

2059: parameters.

2060:

2061: \item For gradient multiscale systems it is only necessary to estimate

2062: the diffusion coefficient correctly, if one uses

2063: the second estimator for the drift -- $\tilde{A}$, defined in equations \eqref{eq:alpha2} and

2064: \eqref{e:alpha_estim_1d2}.

2065:

2066:

2067: \end{itemize}

2068:

2069: Both analysis and numerics are given to substantiate these claims.

2070: Many open questions remain; we list those which seem important to us.

2071:

2072: \begin{itemize}

2073:

2074: \item Rough heuristics indicate that any subsampling

2075: rate which is between the two characteristic time scales of the processes, namely

2076: $\mathcal{O}(\eps^2)$ and $\mathcal{O}(1)$, should enable accurate  estimation

2077: of the drift and diffusion coefficients. However, our analysis works only

2078: in the case where the subsampling is between

2079: $\mathcal{O}(\eps)$ and $\mathcal{O}(1)$. Closing the gap between intuition

2080: and what can be proved would be valuable.

2081:

2082: \item Analyze other parameter estimation problems for multiscale

2083: diffusions, not necessarily of gradient form. In particular study both

2084: averaging and homogenization set-ups, as outlined in the introductory

2085: section.

2086:

2087:

2088: \item In this paper we have generated simulated multiscale data

2089: by using a multiscale diffusion process. However this was done to

2090: provide a convenient analytical framework. In applications it

2091: is of interest to develop tools for characterizing the multiscale

2092: structure of a given path -- to estimate characteristic time-scales.

2093: Related work has been done in \cite{FPSS03}. Further study would be

2094: of interest.

2095:

2096:

2097: \item Determine precisely the range of subsamplings which will give

2098: accurate parameter estimates and optimize the subsampling rate for

2099: accuracy.

2100:

2101: \item Optimize the algorithm by combining estimates based on shifts of

2102: the subsampled data -- so that information is not thrown away; this is

2103: done in the context of econometrics and finance in

2104: \cite{AitMykZha05b, AitMykZha05a}.

2105:

2106:

2107: \item Analyze questions analogous to those raised here for

2108: multidimensional multiscale processes.

2109: %

2110: \item Analyze questions analogous to those raised here

2111: for hypoelliptic multiscale diffusions; in particular the case where the homogenized

2112: equation is a fully elliptic first-order Langevin equation which is derived from an

2113: overdamped second-order Langevin equation.

2114: %

2115: \item Study whether there is any advantage in using random subsampling rates.

2116: %

2117: \item Study drift that depends non--linearly on the parameters to be

2118: estimated:

2119: %

2120: $$

2121: d x^\eps (t) = - \nabla V(x^\eps (t), \eps ; \alpha) dt + \sqrt{2 \sigma} d \beta (t).

2122: $$

2123: %

2124: \item Parameter estimation for deterministic multiscale problems where the fast

2125: process is a strongly mixing chaotic deterministic process.

2126: %

2127: %

2128: \end{itemize}

2129:

2130: {\it Acknowledgements} The authors are grateful to Ch. Sch\"{u}tte

2131: for useful discussions concerning molecular dynamics, leading us to

2132: formulate this problem. They also thank S. Olhede for useful discussions

2133: and comments.

2134: %

2135: \bibliography{../bibtex_files/mybib}

2136: \bibliographystyle{plain}

2137: \end{document}

2138: