math0610124/abel.tex
1: \documentclass[11pt]{article}
2: \usepackage{amsmath,amssymb,epsfig}
3: 
4: \newtheorem{conj}{Conjecture}
5: 
6: \begin{document}
7: 
8: \title{A Conjecture about Molecular Dynamics}
9: \author{P. F. Tupper}
10: \date{\today}
11: \maketitle
12: 
13: \begin{abstract}
14: An open problem in numerical analysis is to explain why molecular dynamics 
15: works.  The difficulty is that numerical trajectories are only 
16: accurate for very short times, whereas the simulations are performed over 
17: long time intervals.  It is believed that statistical information from 
18: these simulations is accurate, but no one has offered a rigorous proof of 
19: this.  In order to give mathematicians a clear goal in understanding this 
20: problem, we state a precise mathematical conjecture about molecular 
21: dynamics simulation of a particular system.  We believe that if the 
22: conjecture is proved, we will then understand why molecular 
23: dynamics works.
24: \end{abstract}
25: 
26: 
27: 
28: 
29: \section{Introduction}
30: 
31: Molecular dynamics is the computer simulation of a material at the
32: atomic level.   In principle the only inputs to a simulation are the
33: characteristics of a set of particles and a description of the forces
34: between them.  An initial condition is chosen and from these first
35: principles the evolution of the  system  in time is simulated using
36: Newton's laws and a simple numerical integrator \cite{frenkel,allen}.  
37: 
38: Molecular dynamics is a very prevalent computational practice, as a
39: glance at an issue of the Journal of Chemical Physics will show.   It
40: does have its limitations: the motion of only a relatively small
41: number of particles can be simulated over a short time interval. 
42: However, most of the mesoscopic models that have been suggested to
43: overcome these difficulties still rely on molecular dynamics as a form
44: of calibration.  It is likely that molecular dynamics will continue to
45: be important in the future. 
46: 
47: Given its scientific importance there is very little rigorous
48: justification of molecular dynamics simulation.  From the viewpoint of
49: numerical analysis it is surprising that it works at all.  The problem
50: is that individual trajectories computed by molecular dynamics
51: simulations are accurate for only small time intervals.  As we will
52: see in Section~\ref{sec:problem}, numerical trajectories diverge
53: rapidly from true trajectories given the step-lengths used in
54: practice.  No one disputes this fact, and no one is particularly
55: concerned with it either.  The reason is that practitioners are never
56: interested in particular trajectories to begin with.  They are
57: interested in ensembles of trajectories.  As long as the numerical
58: trajectories are representative of a particular ensemble of true
59: trajectories, researchers are content.  However, that this statistical
60: information is computed accurately has yet to be rigorously
61: demonstrated in representative cases. 
62: 
63: The goal of this article is to present a concise mathematical
64: conjecture that encapsulates this fundamental difficulty.  We present
65: a model system that is representative of systems  
66: commonly simulated in molecular dynamics.  We present the results of
67: numerical simulations of this system using the St\"ormer-Verlet
68: method, the work-horse of molecular dynamics.  In each simulation a
69: random initial condition is generated, an approximate trajectory for
70: the system is computed and the net displacement of one particle over
71: the duration of the simulation is recorded.  We show that even for
72: step-sizes that are far too large to accurately compute the position
73: of the particle, the distribution of the particle's displacement over
74: the many initial conditions appears to be accurate. 	
75: From the numerical data we conjecture a rate of convergence for this
76: particular statistical property.  We believe that if this conjectured
77: rate of convergence (or one like it) can be rigorously established,
78: even for this single system, then we will understand significantly
79: better why molecular dynamics works. 
80: 
81: 
82: The problem of explaining the accuracy of molecular dynamics
83: simulation is well-known both in the physical sciences (for example
84: \cite[p. 81]{frenkel}) and in the mathematics community
85: \cite{SigStu}. 
86: This latter reference is a survey of the relation between computation
87: and statistics for initial value problems in general.   
88: % In the whole
89: %area of numerical analysis and molecular dynamics there has been
90: %plenty of 
91: There has been plenty of 
92: excellent mathematical work that has done much to explain
93: various features of this type of simulation, but has not resolved the
94: issue we consider here.  See \cite{SkeTup,LeiRei,LeBris} for surveys.   
95: 
96: One body of work that has addressed the statistical accuracy of
97: under-resolved trajectories in a special case is by A.~Stuart and
98: co-workers.  In \cite{Stu1,Stu2} they 
99: have explored some linear test systems with provable statistical
100: properties in the limit of large numbers of particles.  They are able
101: to show that if the systems are simulated with appropriate methods the
102: statistical features of numerical trajectories are accurate in the
103: same limit even when the step-lengths are too large to resolve
104: trajectories. 
105: Though these results are interesting since they are the only ones of
106: their kind now known, for the highly nonlinear  problems of practical
107: molecular dynamics very different arguments will be required. 
108: 
109: One subproblem that has been attacked more successfully is that of the
110: computation of ergodic averages. These are averages of functions along
111: very long trajectories.  All that numerical trajectories have to do to
112: get these correct is sample the entire phase space evenly.  This is a
113: much weaker property than getting all statistical features correct.
114: The most striking work on this question is by S.~Reich \cite{Reich}
115: which establishes rapid convergence of ergodic averages for
116: Hamiltonian systems which are uniformly hyperbolic on sets of constant
117: energy.  Unfortunately, this property has never been established for
118: realistic systems, and is unlikely to hold for them
119: \cite{Mackay,Liverani}.  The work \cite{Tupper} established similar
120: results for systems with much weaker properties but requires radically
121: small time steps for convergence to occur. 
122: 
123: 
124: 
125: The contribution of this work is to precisely specify a simple problem
126:  which encapsulates all the essential difficulties of the more general
127:  problem.   In Section~\ref{sec:system} we present the system we will
128:  study.  Section~\ref{sec:problem} shows the results of some
129:  numerical experiments on this system.  There we state our conjecture
130:  based on the results.  In Section~\ref{sec:approaches} we will
131:  discuss two possible approaches to proving the conjecture. 
132:  Finally, in Section~\ref{sec:discussion} we will discuss prospects
133:  for the eventual resolution of the conjecture. 
134: 
135: \section{The System} \label{sec:system}
136: 
137: The system consists of $n=100$ point particles interacting on an 11.5
138: by 11.5 square periodic domain.  We let $q \in \mathbb{T}^{2n}$ and $p
139: \in \mathbb{R}^{2n}$ denote the positions and velocities of the
140: particles, with $q_i \in \mathbb{T}^2, p_i \in \mathbb{R}^2$ denoting
141: the position and velocity of particle $i$.   The motion of the system
142: is described by a system of Hamiltonian differential equations: 
143: \[
144: \frac{dq}{dt} = \frac{\partial H}{\partial p}, \ \ \ 
145: \frac{dp}{dt} = -\frac{\partial H}{\partial q},
146: \]
147: with Hamiltonian 
148: \[
149: H(q,p) = \frac{1}{2} \|p\|_2^2 + \sum_{i<j} V_{LJ} ( \| q_i-q_j \| ).
150: \]
151: Here $V_{LJ}$ denotes the famous Lennard-Jones potential.  In our
152: simulations we use a truncated version: 
153: \[
154: V_{LJ}(r) = \left\{  
155: \begin{array} {ll}
156: 4 \left( \frac{1}{r^{12}} - \frac{1}{r^6} \right) , & \mbox{if } r
157: \leq r_{\mbox{\tiny{cutoff}}}, 
158: \\
159: 0,& \mbox{otherwise.}
160: \end{array} \right.
161: \]
162: Figure~\ref{fig:movie} shows the positions of the particles on the
163: periodic domain for one state of the system.  Though the particles are
164: only points, in the figure each is represented by a circle of radius
165: 1/2. 
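
As a quick check on the scales involved, the following short calculation
locates the zero and the minimum of the pair potential.  It is a sketch
that ignores the truncation, which only matters near
$r = r_{\mbox{\tiny{cutoff}}}$:
\[
V_{LJ}(1) = 0, \qquad
V_{LJ}'(r) = -\frac{24}{r^{7}} \left( \frac{2}{r^{6}} - 1 \right),
\]
\[
V_{LJ}'(r) = 0 \ \Longleftrightarrow \ r = 2^{1/6} \approx 1.12, \qquad
V_{LJ}(2^{1/6}) = 4 \left( \frac{1}{4} - \frac{1}{2} \right) = -1.
\]
So two circles of radius 1/2 touch exactly at the separation where the
pair potential changes sign, and the well has depth one in our units.
Writing $r = \| q_i - q_j \|$, the force on particle $i$ due to particle
$j$ is, for $r < r_{\mbox{\tiny{cutoff}}}$,
\[
-\frac{\partial}{\partial q_i} V_{LJ}(r) =
24 \left( \frac{2}{r^{14}} - \frac{1}{r^{8}} \right) (q_i - q_j).
\]
Here we assume that $q_i - q_j$ is taken to be the shortest displacement
on the periodic domain; this minimum-image convention is our assumption,
since the text does not say how periodic images are handled.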
166: 
167: \begin{figure}
168: \epsfig{file=onestill.eps,width=4in}
169: \caption{\label{fig:movie} The positions of the particles for a
170:   representative state of the system.}   
171: \end{figure}
172: 
173:  We take our initial conditions $q^0,p^0$ to be randomly distributed
174:  according to the probability density function 
175: \begin{equation} \label{eq:distrib} 
176:  Z^{-1} e^{-H(q,p)/k\mathcal{T}},
177: \end{equation}
178: where $Z$ is chosen so that the function integrates to one.
179: This is known as the canonical distribution (or ensemble) for the
180: system at temperature $\mathcal{T}$.   
181: There is a simple physical interpretation of this distribution:  if
182: the system is  weakly connected to another very large system at
183: temperature $\mathcal{T}$, this is the distribution we will find the
184: original system in after a long period of time.   
185: In our units  $k = 1$, 
186: and we choose $\mathcal{T}=1$.
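
Because $H$ splits into a kinetic and a potential part, the density
(\ref{eq:distrib}) factorizes.  Writing $V(q)$ for the potential-energy
sum in $H$ (this shorthand is ours),
\[
Z^{-1} e^{-H(q,p)/k\mathcal{T}} =
Z^{-1} \, e^{-\|p\|_2^2/2k\mathcal{T}} \, e^{-V(q)/k\mathcal{T}}.
\]
Under the canonical distribution the momenta are therefore independent
of the positions, and each of the $2n$ momentum components is Gaussian
with mean zero and variance $k\mathcal{T}$.  With $k = \mathcal{T} = 1$
the components are standard normal, so the expected value of
$\|p_i\|_2^2$ is $2k\mathcal{T} = 2$ for each particle.  Only the
position part, with density proportional to $e^{-V(q)/k\mathcal{T}}$,
calls for a nontrivial sampling method.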
187: 
188: There are many ways of sampling from the canonical distribution at a
189: given temperature.  For our experiments we generated initial
190: conditions using Langevin dynamics.  See  \cite{cances} for an
191: explanation of this technique and a comparison with other methods.  If
192: done correctly, the precise method of sampling from the canonical
193: distribution will have no bearing on the results of the experiments we
194: will present subsequently. 
195: 
196: The numerical method we use for integrating our system is the
197: St\"ormer-Verlet scheme.   
198: Given an initial $q^0, p^0$ and a $\Delta t>0$ it generates a sequence
199: of states $q^n, p^n, n\geq0$ such that $(q^n,p^n) \approx (q(n\Delta
200: t),p(n\Delta t))$.  With $V$ denoting the potential-energy term of $H$, the version of the algorithm we use is  
201: \begin{eqnarray*}
202: q^{n+1/2} & = & q^n + p^n \Delta t/2, \\
203: p^{n+1} & = & p^n - \Delta t \nabla V(q^{n+1/2}), \\
204: q^{n+1} & = & q^{n+1/2} + p^{n+1} \Delta t/2.
205: \end{eqnarray*}
206: This is a second-order explicit method.  It is symplectic, and as a
207: consequence conserves phase space volume \cite{hairer}.   
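
One way to see the volume-preservation property directly is sketched
below; it is meant as an illustration, not as the argument of
\cite{hairer}.  One step $(q,p) \mapsto \Psi_{\Delta t}(q,p)$ of the
scheme is a drift, a kick, and a second drift, that is, a composition
of two elementary maps (the names $\phi^{D}_h$, $\phi^{K}_h$ and
$\Psi_{\Delta t}$ are ours):
\[
\phi^{D}_{h}(q,p) = (q + h p, \ p), \qquad
\phi^{K}_{h}(q,p) = (q, \ p - h \nabla V(q)),
\]
\[
\Psi_{\Delta t} = \phi^{D}_{\Delta t/2} \circ \phi^{K}_{\Delta t} \circ
\phi^{D}_{\Delta t/2}.
\]
Wherever $V$ is twice differentiable (away from collisions and from the
cutoff radius), the Jacobians of the two maps are block triangular,
\[
D\phi^{D}_h = \left( \begin{array}{cc} I & hI \\ 0 & I \end{array}
\right), \qquad
D\phi^{K}_h = \left( \begin{array}{cc} I & 0 \\ -h \nabla^2 V(q) & I
\end{array} \right),
\]
so each has determinant one and the composition preserves phase space
volume.  In fact $\phi^{D}_h$ is the exact time-$h$ flow of the
Hamiltonian $\frac{1}{2}\|p\|_2^2$ and $\phi^{K}_h$ is the exact
time-$h$ flow of the Hamiltonian $V(q)$, so each map is symplectic and
hence so is $\Psi_{\Delta t}$.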
208: 
209: Finally we have to decide upon our step-length $\Delta t$.  If $\Delta
210: t$ is too large the energy of the computed solution will increase
211: rapidly and explode.  In practice, it is observed that for small
212: enough step lengths energy remains within a narrow band of the true energy
213: for very long time intervals. (There is extensive theoretical
214: justification for this phenomenon, see Section~\ref{subsec:bea}).
215: Practitioners tend to pick a $\Delta t$ as large as possible while
216: still maintaining this long-term stability on their time interval of
217: interest. 
218: For the system and initial conditions we describe here $\Delta t=
219: 0.01$ yields good approximate energy conservation  on the time
220: interval $[0,100]$.  For our numerical experiments  we will let
221: $\Delta t$ take this value and smaller.  (The recommended value in
222: \cite{frenkel}, a standard reference, for this type of system is
223: $\Delta t = 0.005$.) 
224: 
225: 
226: \section{The Problem} \label{sec:problem}
227: 
228: We will first examine how well trajectories are computed with $\Delta
229: t = 0.01$. 
230: Figure \ref{fig:convergtraj} shows  the computed $x$-position of one
231: particle versus time for the same initial conditions and for a range
232: of step-lengths.  If the trajectory computed by St\"ormer-Verlet is
233: accurate over the time interval $[0,5]$, we expect that reducing the
234: time step by a factor of a thousand would not yield a significantly
235: different curve.   However, we see that the two curves for $\Delta
236: t=0.01$ and $\Delta t=0.00001$ very quickly diverge.  They are
237: distinguishable to the eye almost immediately and completely diverge
238: around $1.2$ time units. 
239: 
240: Reducing the step length to $\Delta t=0.001$ gives a curve that agrees
241: with the $\Delta t=0.00001$ line longer, but still diverges around
242: $2.5$ time units.  Similarly, even with $\Delta t=0.0001$ the trajectory
243: is not accurate over the whole interval depicted.  
244: 
245: \begin{figure} 
246: \epsfig{file=convergtraj.eps,width=4in}
247: \caption{\label{fig:convergtraj} Computed $x$-position of one particle
248:   versus time for fixed initial conditions for a  range of $\Delta
249:   t$.}   
250: \end{figure}
251: 
252: From these numerical results, we might conjecture that reducing the
253: step-length by a constant factor 
254: only extends the duration for which the simulation is accurate by a
255: constant amount of time. 
256: This is consistent with theoretical results about the convergence of
257: numerical methods for ordinary differential equations. 
258: What is surprising in this case is that the
259: time-scale on which the trajectories are valid appears to be minuscule
260: compared to the time-scale on which computations are actually
261: performed. 
262: It seems that the trajectories we compute here with stepsize even as
263: small as $\Delta t=0.00001$ are not accurate over the whole interval
264: $[0,5]$ let alone over considerably longer intervals.
265: %This is what we expect to happen from the typical analysis of
266: %convergence for ODEs.  What is surprising in this case is that the
267: %time-scale on which the trajectories are valid appears to be miniscule
268: %compared to the time-scale on which computation are actually
269: %performed. 
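
A back-of-envelope version of this argument can be sketched as follows.
Assume (this is our modelling assumption, not a measured fact) that the
trajectory error grows like the classical bound, $e(T) \approx C e^{LT}
\Delta t^2$, for some constants $C, L > 0$.  Then the error reaches a
fixed visible tolerance $\varepsilon$ at the time
\[
T^*(\Delta t) = \frac{1}{L} \log \frac{\varepsilon}{C \Delta t^2},
\qquad \mbox{so that} \qquad
T^*(\Delta t/\rho) - T^*(\Delta t) = \frac{2}{L} \log \rho
\]
for any reduction factor $\rho > 1$.  Using the divergence times read
off Figure~\ref{fig:convergtraj} above, roughly $1.2$ for $\Delta t =
0.01$ and $2.5$ for $\Delta t = 0.001$, a factor $\rho = 10$ buys about
$1.3$ time units, which corresponds to
\[
L \approx \frac{2 \log 10}{1.3} \approx 3.5 .
\]
This is only a rough estimate: the times are judged by eye and the
comparison is against the $\Delta t = 0.00001$ curve rather than the
exact solution.  Taken at face value, though, it suggests that
resolving a single trajectory out to $T = 100$ would require on the
order of $75$ further factors of ten in $\Delta t$.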
270: 
271: 
272: Fortunately we almost never care about what one particular trajectory
273: is doing in molecular dynamics. 
274: We only care about statistical features of the trajectories when
275: initial conditions are selected according to some probability
276: distribution.  Here we will consider the example of self-diffusion.
277: Self-diffusion is the diffusion of one particular particle through a
278: bath of identical particles.  We can imagine somehow marking one
279: particle at time zero and watching its motion through the system.
280: This single-particle trajectory will depend on the positions and
281: velocities of all the particles (including itself) at time zero.
282: Since these are random, the trajectory of the single particle is
283: random. 
284: 
285: One way to measure self-diffusion is to look at the 
286:  distribution of the $x$-coordinate  of the tracer particle relative
287:  to its initial position.  To estimate this, we generate many random
288:  initial conditions, perform the simulation using the St\"ormer-Verlet method,
289:  and record the net displacement of the particle in the given
290:  direction. 
291: Figure~\ref{fig:hists} shows the histograms of these displacements at
292:  time $T=10$ for three different step-lengths. 
293: 
294: \begin{figure}
295: \epsfig{file=hist1.eps,width=1.6in}
296: \epsfig{file=hist2.eps,width=1.6in}
297: \epsfig{file=hist3.eps,width=1.6in}
298: \caption{ \label{fig:hists} Displacement in $x$ direction of 1
299:   particle at $T=10$ for three different step-lengths.} 
300: \end{figure}
301: 
302: In contrast to the case where we examined single trajectories, here
303: the histograms are virtually identical for the different step-lengths.
304: This suggests that any information we glean from the first histogram
305: will be accurate. 
306: 
307: To check this more carefully, we compute the variance of the total
308: displacement at various times $T$ for varying step-lengths.  Let $R(T)
309:  = \| q_1(T) -  q_1(0) \|$ denote the total displacement of the
310: particle after time $T$.  This is a random quantity through its
311: dependence on the state of the system at $t=0$.  Let $R_{\Delta t}(T)$
312: denote this same displacement as simulated with the St\"ormer-Verlet
313: method.  This also is a random quantity.  Now define $\langle
314: R^2_{\Delta t} (T)\rangle$ to be the expected value of $R^2_{\Delta
315:   t}(T)$ when the initial conditions are chosen according to the
316: canonical distribution.  Let us see how this last quantity depends on
317: $\Delta t$.  We do this by generating many initial conditions from the
318: canonical ensemble and then simulating the system for 100 time units,
319: keeping track of the total displacement of the tracer particle. 
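
In practice such an expectation is estimated by a sample mean.  As a
sketch, let $M$ denote the number of independent initial conditions and
$R_{\Delta t,m}(T)$ the displacement computed from the $m$th one; the
symbols $M$, $\overline{R^2}_M$ and $s_M$ are our notation, and the text
does not state the sample size.  Then
\[
\langle R^2_{\Delta t}(T) \rangle \approx \overline{R^2}_{M} =
\frac{1}{M} \sum_{m=1}^{M} R^2_{\Delta t,m}(T),
\]
with a standard error of approximately $s_M/\sqrt{M}$, where
\[
s_M^2 = \frac{1}{M-1} \sum_{m=1}^{M} \left( R^2_{\Delta t,m}(T) -
\overline{R^2}_{M} \right)^2 .
\]
The sampling error decays like $M^{-1/2}$ whatever the step-length, so
differences between step-lengths that are smaller than this error
cannot be resolved by the experiment.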
320: 
321: \begin{figure}
322: \epsfig{file=bwdiffs.eps,width=4in}
323: \caption{ \label{fig:selfdiffuse}  Expected squared total displacement in the $x$ direction   of a single particle as a function of time for three different step-lengths. }
324: \end{figure}
325: 
326: Figure~\ref{fig:selfdiffuse} shows $\langle R^2_{\Delta t}(T) \rangle$ versus $T$ for three choices of step length.  The inset shows a subset of the data with error bars.  Up to the sampling error there is no difference between the curves.  As far as we can tell from this plot, the answers for $\Delta t=0.01$ are accurate.  The time-scale is much larger than the short interval we found the trajectory to be accurate over.
327: Lest we give the impression that $\langle R^2_{\Delta t}(T) \rangle$ depends linearly on $T$, Figure~\ref{fig:zoom} shows the same results for a smaller time interval.
328: 
329: \begin{figure} 
330: \epsfig{file=zoom.eps,width=4in}
331: \caption{ \label{fig:zoom} Same as Figure~\ref{fig:selfdiffuse} but on
332:   a smaller time interval.}
333: \end{figure}
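
The departure from linearity at short times has a simple heuristic
explanation.  The following is standard statistical-physics reasoning,
offered as a sketch and not established rigorously for this system.
For short times the tracer particle moves almost freely, $q_1(T) -
q_1(0) \approx p_1(0) T$, so with $R$ the norm of the two-dimensional
displacement as defined in the text,
\[
\langle R^2(T) \rangle \approx \langle \| p_1(0) \|^2 \rangle T^2 =
2 k \mathcal{T} \, T^2 = 2 T^2 ,
\]
since under (\ref{eq:distrib}) each momentum component has variance
$k\mathcal{T} = 1$.  For long times, if the motion is diffusive with a
self-diffusion coefficient $D_s$ (an assumption; the symbol is ours),
the Einstein relation in two dimensions gives
\[
\langle R^2(T) \rangle \approx 4 D_s T .
\]
If only the $x$-component of the displacement is recorded, the
corresponding expressions are $T^2$ and $2 D_s T$.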
334: 
335: We conjecture that the reason $\langle R^2_{\Delta t}(T) \rangle$ does not appear to depend on $\Delta t$ is that even for these large values of $\Delta t$ it closely matches $\langle R^2(T) \rangle$.
336: It is not at all clear what the rate of convergence of $\langle R^2_{\Delta t}(T) \rangle$ to $\langle R^2(T) \rangle$ is, or how it depends on $T$.  However, we make the following conjecture:
337: 
338: \begin{conj} \label{conj}
339: For the system described in Section~\ref{sec:system} with the initial
340: distribution given by (\ref{eq:distrib}) and the St\"ormer-Verlet
341: integrator with time step $\Delta t$ 
342: \[
343: \left| \langle R^2_{\Delta t}(T) \rangle - \langle R^2(T) \rangle
344: \right| \leq C \Delta t^2,
345: \]
346: for all $T \in [0,A e^{B/\Delta t}]$, for some constants $A, B, C$.
347: \end{conj}
348: We will explain the reasons for hypothesizing this particular dependence in the next section.  Here we will briefly note what dependence the classical theory of convergence for numerical ODEs gives:
349: \[
350: \left| \langle R^2_{\Delta t}(T) \rangle - \langle R^2(T) \rangle \right| \leq C e^{LT} \Delta t^2
351: \]
352: for $T \in [0,E \log(F/\Delta t)]$ for sufficiently small $\Delta t$ for some $C,L,E,F >0$.  (See \cite[p. 239]{stuhum}, for example.)
353: So we need to explain why the error remains so small even for long
354: simulations.
355: %$T$ and also why it
356: %stays small for such long simulations. 
357: 
358: \section{Two Approaches} \label{sec:approaches}
359: 
360: We will
361: discuss two possible approaches to proving Conjecture~\ref{conj}:
362: backward error analysis and shadowing.
363: 
364: 
365: 
366: \subsection{Backward Error Analysis} \label{subsec:bea}
367: 
368: Typically a $p$th order numerical method applied to a system of ODEs
369: computes a trajectory that is $\mathcal{O}(\Delta t^p)$ close to the
370: exact trajectory on a finite interval. 
371: Backward error analysis is a way of showing that the numerical
372: trajectory is an $\mathcal{O}(\exp(-1/\Delta t))$ approximation to the
373: exact trajectory of a perturbed system.  This result can be used in
374: turn to prove results about the stability of the numerical trajectory.
375: See \cite{bennetin} for an early reference and \cite[Ch. IX.]{hairer}
376: for a recent comprehensive treatment of the subject. 
377: 
378: If we apply a symplectic integrator to a Hamiltonian system it turns
379: out that the modified system is also Hamiltonian.  The Hamiltonian
380: function $\widetilde{H}$ for the new system can be written as
381: $\widetilde{H} = H + \mathcal{O}(\Delta t^2)$.  
382: There are two consequences for us.  Firstly, the numerical method
383: agrees very closely with the exact solutions of the modified
384: Hamiltonian on short time intervals.  If we denote the solution to the
385: modified system with the same initial conditions by
386: $(\tilde{q},\tilde{p})$ then 
387: \begin{equation} \label{eqn:smaller}
388: | \tilde{q}(n \Delta t) - q^n | \leq C e^{-D/\Delta t}
389: \end{equation}
390: for $n \Delta t \in [0,B/\Delta t]$,
391: for some appropriate constants \cite{skeel}.  (This alone is not
392: useful for analysing molecular dynamics since $T$ and $\Delta t$ are
393: both large.) 
394: Secondly, the modified Hamiltonian $\widetilde{H}$ is conserved
395: extremely well by the numerical method for long time intervals: 
396: \[
397: \left| \widetilde{H}( q^0,p^0 ) - \widetilde{H}(  q^n,p^n  )  \right|
398: \leq C e^{-D/\Delta t},
399: \]
400: for $n \Delta t \in [0,A e^{B/\Delta t}]$.  Putting this together with
401: $\widetilde{H} = H + \mathcal{O}(\Delta t^2)$ gives 
402: \[
403: \left| H( q^0,p^0 ) - H(  q^n,p^n  )  \right|
404: \leq E \Delta t^2,
405: \]
406: for $n \Delta t \in [0,A e^{B/\Delta t}]$.
407: We chose the bound in Conjecture~\ref{conj} in analogy with this last result.
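
The step that combines the two estimates into the energy bound is a
triangle inequality.  As a sketch, suppose $| \widetilde{H} - H | \leq M
\Delta t^2$ at the states visited by the numerical trajectory, where $M$
is our name for the constant hidden in the $\mathcal{O}(\Delta t^2)$
term.  Then
\begin{eqnarray*}
| H(q^0,p^0) - H(q^n,p^n) | & \leq &
| H(q^0,p^0) - \widetilde{H}(q^0,p^0) |
+ | \widetilde{H}(q^0,p^0) - \widetilde{H}(q^n,p^n) | \\
& & + | \widetilde{H}(q^n,p^n) - H(q^n,p^n) | \\
& \leq & 2 M \Delta t^2 + C e^{-D/\Delta t}.
\end{eqnarray*}
Since $e^{-D/\Delta t} \leq \Delta t^2$ for all sufficiently small
$\Delta t$, one may take $E = 2M + C$.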
408: 
409: Suppose we wanted to bound the error between $\langle R^2_{\Delta
410:   t}(T) \rangle$ and $\langle R^2(T) \rangle$ using these estimates.    
411: The fact that the initial conditions are random adds an extra level of
412:   complication to the problem.  We have been using $\langle \cdot
413:   \rangle$ to denote the average with respect to the canonical
414:   distribution for the Hamiltonian $H$.  The perturbed Hamiltonian
415:   $\widetilde{H}$ has a different canonical distribution.  We denote
416:   averages with respect to it by $\langle \cdot \rangle'$.  We let
417:   $\widetilde{R}$ denote the net displacement of the tracer particle
418:   under the new flow given by $\widetilde{H}$. 
419: 
420:  We might try bounding the error in the following way:
421: \begin{eqnarray*}
422: | \langle R^2_{\Delta t}(T) \rangle - \langle R^2(T) \rangle| & \leq &
423: | \langle R^2_{\Delta t}(T) \rangle - \langle \widetilde{R}^2(T) \rangle | \\
424: & & +|\langle \widetilde{R}^2(T) \rangle -  \langle \widetilde{R}^2(T)
425:  \rangle' | \\ 
426: & & +|\langle \widetilde{R}^2(T) \rangle' - \langle R^2(T) \rangle|
427: \end{eqnarray*}
428: We discuss each of the three terms in turn.
429: 
430: The first term is due to the numerical trajectory not agreeing with
431: the exact trajectory of the modified system with Hamiltonian
432: $\widetilde{H}$.  According to (\ref{eqn:smaller}) we can bound this
433: term by $C \exp(-D/\Delta t)$ for a duration of $B/\Delta t$.  The
434: studies in \cite{skeel} suggest that this is a tight estimate for
435: typical molecular dynamics simulations. 
436: 
437: The second term is the difference in the expectation of
438: $\widetilde{R}^2(T)$ due to a perturbation in the measure.  Since the
439: two measures are proportional to $\exp(-H/k \mathcal{T})$ and $\exp(
440: -\widetilde{H}/k \mathcal{T})$ respectively, and $H - \widetilde{H} =
441: \mathcal{O}(\Delta t^2)$, we expect this term to be on the order of
442: $\mathcal{O}(\Delta t^2)$ for all $T$.  This probably can be
443: rigorously controlled without much difficulty.
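
A sketch of why one expects this, under the simplifying assumption
(ours) that $| \widetilde{H} - H | \leq M \Delta t^2$ uniformly on phase
space: write $\rho$ and $\tilde{\rho}$ for the two canonical densities,
$Z$ and $\widetilde{Z}$ for their normalizing constants, and $a = M
\Delta t^2 / k \mathcal{T}$.  Then
\[
e^{-a} \leq \frac{\widetilde{Z}}{Z} \leq e^{a}, \qquad
e^{-2a} \leq \frac{\tilde{\rho}}{\rho} \leq e^{2a},
\]
so that for any function $f$ on phase space with $\langle |f| \rangle <
\infty$,
\[
| \langle f \rangle' - \langle f \rangle | =
\left| \left\langle f \left( \frac{\tilde{\rho}}{\rho} - 1 \right)
\right\rangle \right|
\leq \left( e^{2a} - 1 \right) \langle |f| \rangle
= \mathcal{O}(\Delta t^2) \, \langle |f| \rangle .
\]
Taking $f = \widetilde{R}^2(T)$ controls the second term relative to the
size of $\langle \widetilde{R}^2(T) \rangle$.  A uniform bound on
$\widetilde{H} - H$ is not obviously available for the Lennard-Jones
potential, which is singular at $r = 0$, so this should be read as an
indication of the mechanism and not as a proof.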
444: 
445: The third term is just the difference in $\langle R^2(T) \rangle$
446: between the original system and the perturbed system.  This is likely
447: to be extremely difficult to bound.  However, showing that it is small
448: is not a question about computation but about statistical physics.
449: For now let us assume that 
450: %We will leave
451: %the estimation of this term to mathematical physicists, and assume
452: %that 
453: it is $\mathcal{O}(\Delta t^2)$ for all $T$.
454: 
455: Already we can see that this approach will not get us the result that we
456: want, even assuming we can bound the third term.
457: The best estimate we have so far is that
458:   the error is bounded by $\mathcal{O}(\Delta t^2)$ for $T \in
459:   [0,B/\Delta t]$.  The bound would hold on an interval much shorter
460:   than what is needed.
461:   It appears that backward error analysis alone cannot explain the
462:   observed convergence. 
463: 
464: 
465: \subsection{Shadowing} \label{subsec:shadow}
466: 
467: The idea of shadowing is complementary to that of backward error
468: analysis.  Whereas backward error analysis shows that the numerical
469: trajectory is close to the exact trajectory of a different Hamiltonian
470: system with the same initial condition, shadowing attempts to show
471: that the numerical trajectory is close to an exact trajectory  of the
472: same Hamiltonian system with a different initial condition.  See
473: \cite{Hayes} for a nice review of shadowing for Hamiltonian systems. 
474:   
475: In our situation, if shadowing were possible, something like the
476: following would hold. 
477: Suppose we compute a numerical trajectory starting from $(q^0,p^0)$
478: with time step $\Delta t$, which we denote by $(q^n,p^n), n\geq 0$.  
479:  If shadowing is possible then there is an exact trajectory
480:  $(\tilde{q}(t),\tilde{p}(t))$ of the same Hamiltonian system starting
481:  at some other initial condition $(\tilde{q}(0),\tilde{p}(0))$ such
482:  that 
483: \[
484:  (q^n,p^n) \approx (\tilde{q}(n \Delta t),\tilde{p}(n \Delta t))
485: \]
486: for $n \Delta t$ in some large range of times.
487: Assuming that it is possible to shadow every numerical trajectory in
488: this way, let us denote the map on the phase space that takes the
489: numerical initial condition to the initial condition of the shadow
490: trajectory by  
491: \[
492: S_{\Delta t} (q^0,p^0) = (\tilde{q}(0),\tilde{p}(0)).
493: \]
494: 
495: 
496: The idea of shadowing is used very effectively by S.~Reich in
497: \cite{Reich}.  For a Hamiltonian system for which shadowing holds he
498: demonstrates that long-time averages will be computed accurately by
499: almost all numerical trajectories.  That is,
500: \begin{equation}  \label{eqn:ergod}
501: \lim_{T \rightarrow \infty} \frac{1}{T} \int_0^T g(q(t),p(t)) dt
502: \approx \lim_{N \rightarrow \infty} \frac{1}{N} \sum_{n=0}^N
503: g(q^n,p^n),
504: \end{equation}
505: for almost all initial conditions $(q^0,p^0)=(q(0),p(0))$, for
506: reasonable functions $g$.  Since the
507: quantity on the left does not depend on $(q(0),p(0))$ in the systems
508: considered in \cite{Reich} (except for sets
509: of measure zero), it is sufficient that
510: such a map $S_{\Delta t}$ exists to get the result.
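
The role that shadowing plays in an argument of this kind can be
sketched as follows; this is an illustration only and is not the proof
in \cite{Reich}.  Suppose the numerical trajectory stays within a
distance $\delta$ of the shadow trajectory for all $n$, and that $g$ is
Lipschitz with constant $L_g$ (both $\delta$ and $L_g$ are our
notation).  Then
\[
\left| \frac{1}{N} \sum_{n=0}^N g(q^n,p^n) -
\frac{1}{N} \sum_{n=0}^N g(\tilde{q}(n \Delta t),\tilde{p}(n \Delta t))
\right| \leq \frac{N+1}{N} L_g \delta .
\]
The second sum is a sampled average along an exact trajectory of the
original system.  Relating it to the time integral on the left of
(\ref{eqn:ergod}) requires a further argument, but once that is done the
starting point of the shadow trajectory is irrelevant, because the
long-time average is the same for almost every initial condition.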
511: 
512: 
513: %This is easier than our case since
514:  
515: 
516: In our case we are interested in more general statistical features of
517: trajectories than long-time averages.
518:  For example, the variance of
519: the displacement of a single particle in a finite time interval cannot
520: be put into the form of a long-time average such as in
521: (\ref{eqn:ergod}).  This puts more stringent
522: requirements on $S_{\Delta t}$. 
523: To show that statistics are captured correctly 
524: we cannot consider just single trajectories; we have to make sure
525: the entire ensemble's statistics are reproduced correctly.
526: If the shadowing map $S_{\Delta t}$ systematically picked initial
527: conditions for which the tracer particle tended to move to the left,
528: for example, then the computed statistics could be quite inaccurate.
529: See \cite{Hayes} for a discussion of this issue in the context of
530: astrophysics. 
531: What is necessary for this shadowing to work is for $S_{\Delta t}$ to
532: leave the canonical ensemble invariant: 
533: \begin{equation} \label{eqn:preserve}
534: \langle G(q,p) \rangle = \langle G(S_{\Delta t}(q,p)) \rangle
535: \end{equation}
536: for some suitably broad class of functions $G$ on phase space.
537: This is an even  more stringent requirement than just that shadowing
538: is possible at all, and it may be quite unlikely to hold for our
539: system. 
540: 
541: Fortunately we can weaken some other requirements demanded of shadowing 
542: considerably for our problem.  We do not need the trajectory of the
543: whole system to be close; we only need the trajectory of a single
544: particle to be close.  Suppose that our tracer particle's numerical
545: trajectory is denoted by $(q^n_1,p^n_1)$ for $n\geq 0$.  We say that
546: \emph{weak shadowing} holds if we can select $\tilde{q}(0)$,
547: $\tilde{p}(0)$ such that  
548: \[
549:  (q^n_1,p^n_1) \approx (\tilde{q}_1(n \Delta t),\tilde{p}_1(n\Delta t)) 
550: \]
551: for $n \Delta t$ in some long range of times.
552: 
553: 
554: To see how this fits in with the conjecture suppose that we have both
555: (\ref{eqn:preserve}) and  
556: \begin{equation}  \label{eqn:hardthing} 
557: \| (q^n_1,p^n_1) - (\tilde{q}_1(T),\tilde{p}_1(T)) \| \leq C \Delta t^2
558: \end{equation}
559: for $T= n \Delta t \in [0, A e^{B/ \Delta t}]$.
560: This means (assuming we can obtain reasonable bounds on
561: $R_{\Delta t}^2(T)$ and $R^2(T)$) that  
562: \begin{eqnarray*}
563: | \langle R^2_{\Delta t} (T) \rangle - \langle R^2(T) \rangle |  
564: & \leq  & 
565: K | \langle \| q_1^n \| \rangle - \langle \| q_1(T) \| \rangle |  \\
566: & \leq &  K | \langle \| q_1^n \| \rangle - \langle \| \tilde{q}_1(T) \| \rangle | +
567:   K  |  \langle \| \tilde{q}_1(T) \| \rangle      - \langle \| q_1(T) \| \rangle |  \\
568: & \leq &  K \langle \| (q^n_1,p^n_1) - (\tilde{q}_1(T),\tilde{p}_1(T)) \| \rangle \\
569: & & +   K | \langle G(S_{\Delta t}(q^0,p^0)) \rangle - \langle G(q^0,p^0) \rangle |,
570: \end{eqnarray*}
571: for $T \in [0,A e^{B/\Delta t}]$.  Here we have let $G$ be the composition of the time $T$ flow map of the Hamiltonian system with the 2-norm.  Now the first term above is bounded by $K C \Delta t^2$ by (\ref{eqn:hardthing}) and the second term is $0$ by (\ref{eqn:preserve}),
572: thus establishing the conjecture.
573: Simultaneously proving (\ref{eqn:preserve}) and (\ref{eqn:hardthing}) for some shadowing map $S_{\Delta t}$ may not be easy, but it may be much easier than proving the usual stronger shadowing result.
574: 
575: 
576: \section{Discussion} \label{sec:discussion}
577: 
578: Despite the ideas presented in the previous section, the conjecture we
579:   have presented is probably not open to attack by existing
580:   techniques.  The problem is that there is no rigorous mathematical
581:   theory of how statistical regularities emerge from the dynamics of
582:   generic high-dimensional Hamiltonian systems.  Consequently, there
583:   is no theory of how perturbations in the Hamiltonian dynamics lead
584:   to perturbations in the statistics. 
585:   A numerical analyst has three choices when faced with this situation:
586: \begin{enumerate}
587: \item
588: {\bf Take Up Mathematical Physics.}  
589: If we are to make progress on the conjecture these entirely
590: non-numerical problems need to be tackled first.  Mathematical
591: physicists are interested in proving things like ergodicity and decay
592: of correlations for Hamiltonian systems such as presented here, and it
593: is conceivable that eventually there will be a robust body of theory that
594: we can apply to our problem.  So one possibility is to work
595: on developing such a theory.   This likely will not have much to do
596: with computation.
597: 
598: 
599: 
600: \item
601: {\bf Relax Standards of Rigour.}
602: Theoretical physicists, as opposed to mathematical physicists, have accepted that much reliable information can be obtained through calculations that cannot be rigorously justified.  Typically theoretical physicists study systems about which nothing interesting can be proved; to do otherwise would be far too restrictive.  There is no reason why this informal yet highly fruitful style of reasoning should be restricted to systems themselves and not numerical discretizations of systems.  A combination of non-rigorous arguments and careful numerical experiments could do a lot to clarify how the St\"ormer-Verlet method is able to compute statistics so accurately for our system.
603: 
604: \item
605: {\bf Abandon the Whole Pursuit.}  For many, the purpose of numerical analysis is to provide reliable, efficient algorithms.  If one is pursuing a theoretical question, it is hoped that it will lead to better algorithms eventually.  Sadly, even a complete resolution of the conjecture we have presented is unlikely to have much effect on computational practice.   Many people have tried for years to devise an integrator that is more efficient than the St\"ormer-Verlet method for computing statistically accurate trajectories in molecular dynamics.  They have only been successful for Hamiltonian systems with special structure.  (The prime example is the class of multiple time stepping methods; see \cite[Ch. VIII.4]{hairer}.)
606: In fact, we state another conjecture which is not formulated rigorously.
607: \begin{conj}  \label{conj:solver}
608: No integration scheme can improve by more than a factor of two the efficiency with which St\"ormer-Verlet computes statistically accurate trajectories for systems like that in Section~\ref{sec:system}.
609: \end{conj}
610: Here even a clear mathematical formulation would be a challenge.
611: Obviously if we already know a lot about a system we can contrive an
612: algorithm which will give correct statistics for a tracer particle,
613: but this does not count.  The conjecture is intended to capture the
614: idea that St\"ormer-Verlet is a very general purpose method; we do not
615: need to know anything about a system to apply it.\end{enumerate} 
616: 
617: At the Abel Symposium participants seemed to prefer the first of the
618: three options: try to prove what one can about the system and its
619: discretization. 
620: 
621: 
622: {\bf Acknowledgements.} The author was supported by an NSERC Discovery
623: Grant.  He would like to thank Nilima Nigam, Bob Skeel, and Wayne
624: Hayes for helpful comments.
625: 
626: \begin{thebibliography}{00}
627: 
628: \bibitem{allen} M. P. Allen and D. J. Tildesley,  {\it Computer
629: Simulation of Liquids},  Oxford University Press, Oxford, 1989.
630: 
631: \bibitem{bennetin}
632: G. Benettin and A. Giorgilli, \emph{On the Hamiltonian interpolation of near to the identity symplectic mappings with application to symplectic integration algorithms}, J.\ Stat.\ Phys.\ {\bf 74} (1994), 1117--1143.
633: 
634: \bibitem{Stu1}
635: B. Cano, A. M. Stuart, E. S\"uli, J. O. Warren, \emph{Stiff oscillatory systems, delta jumps and white noise}, Found. Comput. Math. 1 (2001), no. 1, 69--99.
636: 
637: \bibitem{cances}
638: E. Canc\`es, F. Legoll, and G. Stoltz. \emph{Theoretical and numerical
639:   comparison of some sampling methods for molecular dynamics.}  To
640: appear, Math. Mod. Num. Anal.
641: 
642: \bibitem{skeel}
643: R. D. Engle, R. D. Skeel, M. Drees, \emph{Monitoring energy drift with shadow Hamiltonians.} J. Comput. Phys. 206 (2005), no. 2, 432--452. 
644: 
645: \bibitem{frenkel}
646: D. Frenkel and B. Smit. \emph{Understanding Molecular Simulation: From Algorithms to Applications, 2nd edition}. Academic Press, London, 2002.  
647: 
648: \bibitem{hairer}
649: E. Hairer, C. Lubich, and G. Wanner.  \emph{Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations}.  Springer Series in Computational Mathematics, Berlin, 2002.
650: 
651: \bibitem{Hayes} W. Hayes and K. Jackson. \emph{A Survey of Shadowing Methods for Numerical Solutions of Ordinary Differential Equations}.  Applied Numerical Mathematics 53 (2005), no. 1--2, 299--321.
652: 
653: \bibitem{Mackay}
654: T. J. Hunt and R. S. MacKay, \emph{Anosov parameter values for the triple linkage and a physical system with a uniformly chaotic attractor.}  Nonlinearity 16 (2003), no. 4, 1499--1510. 
655: 
656: 
657: \bibitem{Liverani} C.~Liverani, \emph{Interacting Particles}, in: D. Sz\'asz (Ed.), {\it Hard Ball Systems and the Lorentz Gas}, Springer, Berlin, 2000.
658: 
659: 
660: \bibitem{Reich}
661: S. Reich. \emph{Backward error analysis for numerical integrators.} SIAM J. Numer. Anal. 36 (1999), no. 5, 1549--1570.
662: 
663: \bibitem{SigStu}
664: H. Sigurgeirsson and A. M. Stuart, \emph{Statistics from computations.} Foundations of computational mathematics (Oxford, 1999), 323--344, London Math. Soc. Lecture Note Ser., 284, Cambridge Univ. Press, Cambridge, 2001. 
665: 
666: \bibitem{SkeTup}  R. D. Skeel and P. F. Tupper, editors. \emph{Mathematical Issues in Molecular Dynamics.}  Banff International Research Station Reports.  2005.
667: 
668: \bibitem{LeiRei} B. Leimkuhler and S. Reich, \emph{Simulating Hamiltonian dynamics.}  Cambridge Monographs on Applied and Computational Mathematics, 14. Cambridge University Press, Cambridge, 2004.
669: 
670: \bibitem{LeBris}
671: C. LeBris, \emph{Computational chemistry from the perspective of numerical analysis.} Acta Numer. 14 (2005), 363--444.
672: 
673: \bibitem{stuhum} A.~M.~Stuart and A.~R.~Humphries, {\it Dynamical Systems and Numerical Analysis}, Cambridge University Press, Cambridge, 1996.
674: 
675: 
676: \bibitem{Stu2} A. M. Stuart and J. O. Warren, \emph{Analysis and Experiments for a Computational Model of a Heat Bath}, J. Stat. Phys. {\bf 97} (1999), 687--723.
677: 
678: \bibitem{Tupper} P. F. Tupper, \emph{Ergodicity and the numerical simulation of Hamiltonian systems.} SIAM J. Appl. Dyn. Syst. 4 (2005), no. 3, 563--587.
679: 
680: 
681: \end{thebibliography}
682: 
683: \end{document}
684: