1: %\documentclass[12pt]{iopart}
2: \documentclass{modifiedstl}
3:
4: \usepackage{graphicx}
5: \usepackage{modifiedams}
6: \usepackage{cite}
7:
8: \begin{document}
9:
10: \title[Quantum projection filter]{Quantum projection filter for a highly
11: nonlinear model in cavity QED}
12:
13: \author{Ramon van Handel\footnote[3]{E-mail
14: address: ramon@its.caltech.edu} and Hideo Mabuchi\footnote[2]{E-mail
15: address: hmabuchi@its.caltech.edu}}
16:
17: \address{Physical Measurement and Control 266-33, California Institute of
18: Technology, Pasadena, CA 91125, USA}
19:
20: \begin{abstract}
21: Both in classical and quantum stochastic control theory a major role is
22: played by the filtering equation, which recursively updates the
23: information state of the system under observation. Unfortunately, the
24: theory is plagued by infinite-dimensionality of the information state
25: which severely limits its practical applicability, except in a few select
cases (e.g.\ the linear Gaussian case). One solution proposed in
27: classical filtering theory is that of the projection filter. In this
28: scheme, the filter is constrained to evolve in a finite-dimensional family
29: of densities through orthogonal projection on the tangent space with
30: respect to the Fisher metric. Here we apply this approach to the simple
31: but highly nonlinear quantum model of optical phase bistability of a
strongly coupled two-level atom in an optical cavity. We observe
33: near-optimal performance of the quantum projection filter, demonstrating
34: the utility of such an approach.
35: \end{abstract}
36:
37: %Uncomment for PACS numbers title message
38: \pacs{00.00, 20.00, 42.10}
39:
40: % Uncomment for Submitted to journal title message
41: %\submitto{\JPA}
42:
43: % Comment out if separate title page not required
44: %\maketitle
45:
46: \section{Introduction}
47:
48: Over the past decade it has become increasingly clear that feedback
49: control of quantum systems is essentially a problem of stochastic control
50: theory with partial observations
51: \cite{s:belavkin2,s:doherty2,s:VanHandel2005}. In this context, the system
52: and observations are generally modeled as a pair of It\^o (quantum)
53: stochastic differential equations. It is then the goal of the control
54: engineer to find a feedback control policy, depending on the system state
55: only through the past history of the observations, that achieves a
56: particular control objective.
57:
58: In the case of linear system dynamics and observations and Gaussian
59: initial conditions, the so-called optimal control problem can be solved
60: exactly both classically \cite{s:bensoussan} and quantum-mechanically
61: \cite{s:belavkin2,s:doherty1} provided that a quadratic performance
62: criterion is chosen. This means that the control objective is specified
63: as an optimization problem, where a certain cost function (the performance
64: criterion) of the system evolution and the control signal is to be
65: minimized. The resulting Linear-Quadratic-Gaussian (LQG) control is
66: widely used in many technological applications. An important feature of
67: LQG theory is its {\it separation structure:} the optimal controller
68: splits up into a part that updates the optimal estimate of the system
69: state given the observations (the Kalman filter), and an optimal feedback
70: law which is only a function of the state estimate.
71:
72: It was originally suggested by Mortensen \cite{s:mortensen} that the
73: separation structure of LQG control carries over even to the nonlinear
74: case. The problem now separates into the nonlinear filtering problem of
75: finding the optimal estimate of the system statistics given the
76: observations and the optimal control problem of finding a feedback law,
77: based only on the filtered estimate, that minimizes some performance
78: criterion. The estimate propagated by the filter is often referred to as
79: the {\it information state} \cite{s:james2} as it contains all information
about the system possessed by the observer. Unfortunately, nonlinear
81: stochastic control is plagued by two rather severe problems. First, the
82: information state is generally infinite-dimensional even for very simple
83: nonlinear systems \cite{s:hazewinkelmarcus}. Second, even in the
84: finite-dimensional case the nonlinear optimal control problem is generally
85: intractable. The latter can sometimes be alleviated by posing a less
86: stringent control objective \cite{s:VanHandel2005}. Nonetheless nonlinear
87: stochastic control remains an extremely challenging topic, both in the
88: classical and quantum mechanical case.
89:
90: This paper is concerned with the first problem, that of
91: infinite-dimensionality of the nonlinear information state. There is no
92: universal solution to this problem. The most common (though rather {\it
93: ad hoc}) approach used by engineers is known as the extended Kalman filter
94: \cite{s:jacobs}. In this scheme, the system dynamics is linearized around
95: the current expected system state, and a Kalman filter based on the linear
96: approximation is used to propagate the estimate. However, aside from the
97: fact that the method only performs well for nearly linear systems, it is
98: not clear how it can be applied to quantum models\footnote{
99: If the system dynamics can be meaningfully expressed in terms
100: of conjugate pairs of observables, one could imagine locally
101: linearizing the system Langevin equations to obtain a quantum
102: extended Kalman filter. To our knowledge this has not yet been
	attempted. However, it is not clear how to do this in e.g.\
	atomic systems, where the internal degrees of freedom do not
	obey the canonical commutation relations (CCR).
106: }.
107:
108: A much more flexible approximation for nonlinear filtering equations was
109: proposed by Brigo, Hanzon and LeGland \cite{s:brigo1,s:brigo2,s:brigo3},
110: based on the differential geometric methods of information geometry
111: \cite{s:amari}. In this scheme we fix a finite-dimensional family of
112: densities that are assumed to be good approximations to the information
113: state. Using geometric methods the filter is then constrained to evolve
114: in this family. The finite-dimensional approximate filter obtained in
115: this way is known as a projection filter, and often performs extremely
116: well when the approximating family is chosen wisely. Moreover, as this
117: approximate filter is based on the optimal nonlinear filter, instead of on
118: the trajectories of the system state in phase space, it is readily
119: extended to the quantum case. Though by no means a universal solution to
120: the filtering problem, we believe that the flexibility and performance of
121: this method likely make it widely applicable in the realistic (real-time)
122: implementation of quantum filtering theory.
123:
124: In this paper we apply the projection filtering method to a simple, but
highly nonlinear quantum system: a strongly driven, strongly coupled
126: two-level atom in a resonant single-mode optical cavity
127: \cite{s:alsing,s:mabuchi}. The output field of such an experiment exhibits
128: a randomly switching phase, caused by the atomic spontaneous emission. The
129: formalism developed by Brigo {\it et al.}\ can be applied directly to this
130: system if the information state (the conditional density of the atom and
131: cavity mode) is represented as a $Q$-function \cite{s:mandel}.
132: Remarkably, our projection filter shows strong connections to the
133: classical problem of filtering a random jump process in additive white
134: noise \cite{s:wonham,s:vellekoop}.
135:
136: Rather than using a quasiprobability representation, a fully quantum
137: theory of projection filtering is expressed in terms of finite-dimensional
138: families of density operators and quantum information geometry
139: \cite{s:amari}. We will present the general theory in a future
140: publication. Nonetheless there is no theoretical objection to the
141: approach taken in this paper. In fact, we observe numerically that the
142: projection filter for our model has near-optimal performance,
143: demonstrating the utility of this approach.
144:
145: This paper is organized as follows. In section \ref{sec:proj} we
introduce the projection filter and the necessary elements of information
147: geometry. Next, in section \ref{sec:phys}, we introduce the physical
148: model that we will be using as an example and obtain the associated
149: filtering equation. In section \ref{sec:proj2} we obtain the projection
150: filter for our model. Finally, in section \ref{sec:num}, we present and
151: discuss the results of numerical simulations.
152:
153:
154: \section{Information geometry and the projection filter}
155: \label{sec:proj}
156:
157: \subsection{The basic principle of the projection filter}
158:
159: \begin{figure}
160: \includegraphics[width=\textwidth]{PFilter.eps}
161: \caption{
162: \label{f:pfilter} Cartoon drawing of the projection filter. (a) The
163: infinite-dimensional space of all densities is represented by $M$,
164: while $S$ is a finite-dimensional submanifold. The filter, defined
165: by a (stochastic) differential equation in $M$, flows along a (random)
166: vector field on $M$. The flow is such that the density remains close to
167: $S$, but $S$ is not invariant. (b) To each point $\theta\in S$ the flow
168: associates a (random) tangent vector $X[\theta]$ which has components in
169: both $T_\theta S$ and its complement. The projection filter is generated
170: by the vector field in $TS$ that, for each $\theta$, is the orthogonal
171: projection of $X[\theta]$ onto $T_\theta S$.}
172: \end{figure}
173:
174: The basic idea behind the projection filter is illustrated in Figure
175: \ref{f:pfilter}. First, we assume that the information state can be
176: represented as a probability density, i.e.\ an integrable nonnegative
177: function on some underlying phase space. Though this is not always the
178: case even in classical probability, this is generally a good assumption
179: for any ``reasonable'' model. In this paper we will use a well-known
180: quantum quasiprobability distribution, the $Q$-function \cite{s:mandel},
181: for this purpose. The set of all possible densities forms an
182: infinite-dimensional function space which we will denote by $M$.
183:
184: We also suppose that the information state is well approximated by
densities in some finite-dimensional subset $S$ of $M$. We will assume
186: that $S$ can be given the structure of a differential manifold, but we do
187: not require it to be a linear space. As such we must be careful in what
188: follows in distinguishing between points in $S$, points in the tangent
189: bundle $TS$, etc., as in any differential geometric situation.
190:
191: In general, $S$ will not be an invariant set of the filter; if we start
192: with a density in $S$, the filter will cause the density to evolve into a
193: neighborhood of $S$ in $M$. The idea behind the projection filter is
194: simply to constrain the optimal filter to remain in $S$. As $S$ is
195: finite-dimensional, we can then express the projection filter as a
196: differential equation in a finite set of local coordinates on $S$.
197:
198: The optimal filter that propagates the information state of the system is
199: given by a stochastic differential equation (SDE) in $M$, and we are
200: seeking to express the projection filter as an SDE in $S$. The precise
201: meaning of this statement is a somewhat important point which we will
202: return to at the end of this section; for now, we can imagine the filter
203: to be an ordinary differential equation that is driven by the
204: observations, as follows:
205: \begin{equation}
206: \label{eq:vectorf}
207: \frac{dp_t}{dt}=X[p_t;Y_t]
208: \end{equation}
209: Here $p_t\in M$ is the information state and $Y_t$ is the observation made
210: at time $t$. $X$, then, is an observation-dependent vector field on $M$.
211:
212: To constrain the filter to evolve in $S$ we must only retain the dynamics
213: of (\ref{eq:vectorf}) that is parallel to $S$; dynamics perpendicular to
214: $S$ will move the density into an undesired region. Mathematically, this
215: idea is simply implemented if we realize that at each point $\theta\in S$,
216: $X[\theta;Y]$ will have components both in the tangent space $T_\theta S$
217: and in its complement $T_\theta S^\perp$. We can now constrain the vector
218: field by orthogonally projecting $X[\theta;Y]$ onto $T_\theta S$ for every
219: $\theta\in S$. The resulting approximate filter, in which only the
220: dynamics that leaves $S$ invariant is retained, is the projection filter
221: \cite{s:brigo1,s:brigo2,s:brigo3}.
222:
223: Before we can flesh out the details of this scheme we must deal with the
224: fact that the filter is not given by a differential equation as in
225: (\ref{eq:vectorf}), but by an SDE of the form
226: \begin{equation}
227: \label{eq:vectors}
228: dp_t=A[p_t]\,dt+B[p_t]\,dY_t
229: \end{equation}
230: We would like to think of $A+B\dot Y_t$ as a ``stochastic vector field''
231: so that we can directly apply the scheme discussed above. The theory of
232: stochastic differential equations on manifolds \cite{s:bismut,s:rogersw}
233: tells us that we can in fact do this, as long as we interpret
234: (\ref{eq:vectors}) as a {\it Stratonovich} SDE
235: \begin{equation}
236: \label{eq:stratflt}
237: dp_t=A[p_t]\,dt+B[p_t]\circ dY_t
238: \end{equation}
239: This is not surprising as, for example, It\^o's rule is incompatible with
240: the requirement that the Lie derivative along a vector field is a
241: derivation \cite{s:marsden} (in other words, a differential geometric
242: transformation rule can only contain first derivatives, and the only
stochastic integral with this property is the Stratonovich integral).
244: Note that usually filtering equations are given in the It\^o form; hence
245: we must transform to the Stratonovich form before we can derive the
246: projection filter.
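
For a finite-dimensional illustration of this conversion (a sketch only,
under the assumption that the driving process has quadratic variation
$dY_t^2=dt$, as for a Wiener process), consider an It\^o equation
$dx_t^i=a^i(x_t)\,dt+b^i(x_t)\,dY_t$ in $\mathbb{R}^n$. The equivalent
Stratonovich equation is
\begin{equation}
	dx_t^i=\left[a^i(x_t)-\frac{1}{2}\sum_{j=1}^n
		b^j(x_t)\frac{\partial b^i(x_t)}{\partial x^j}\right]dt
	+b^i(x_t)\circ dY_t
\end{equation}
so that the two forms differ only by a correction to the drift, which
vanishes when $b$ does not depend on $x$.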
247:
248: \subsection{Information geometry}
249:
250: In order to perform the key step in the above procedure, the orthogonal
251: projection, we need an inner product in the tangent space $T_\theta S$.
252: A differential manifold is not naturally endowed with an inner product
253: structure, however, and hence the projection filter is not yet well
254: defined. We need to add to the manifold a Riemannian structure
255: \cite{s:lafontaine}. In statistics there is a natural way to do this, and
256: the resulting theory is known as information geometry \cite{s:amari}.
257:
258: There are different ways of introducing this structure, but perhaps the
259: easiest treatment is obtained by considering instead of the densities $M$
260: the space of {\it square roots} of densities $M^{1/2}$. The fact that any
261: density is integrable guarantees that the square root of any density is
262: square integrable; hence $M^{1/2}$ is a subspace of $L^2$, the space of
263: square integrable functions, and any vector field on $M^{1/2}$ takes
264: values in $L^2$.
265:
266: Similarly, we consider the manifold $S^{1/2}$, which we will explicitly
267: parametrize as
268: \begin{equation}
269: S^{1/2}=\{\sqrt{p(\cdot,\theta)},~~\theta\in\Theta\subset\mathbb{R}^m\}
270: \end{equation}
271: That is, $S^{1/2}$ is a finite-dimensional manifold of square roots of
272: densities, parametrized by the local coordinates\footnote{
273: By writing this, we are assuming that the entire manifold can be
274: covered by a single coordinate chart. Without this assumption
275: the description would be more complicated, as then we couldn't
276: describe the projection filter using a simple ``extrinsic'' SDE
277: in $\mathbb{R}^m$.
278: Often we can make our manifold obey this property simply by
279: removing a few points; we will see an example of this later.
280: } $\theta\in\Theta$. As $S^{1/2}\subset M^{1/2}$, for any
281: $\theta\in\Theta$ the tangent space $T_\theta S^{1/2}$ is the linear
282: subspace of $L^2$ given by
283: \begin{equation}
284: \label{eq:tthetas}
285: T_\theta S^{1/2}=\mbox{Span}\left[
286: \frac{\partial\sqrt{p(\cdot,\theta)}}{\partial\theta^1},
287: \cdots,
288: \frac{\partial\sqrt{p(\cdot,\theta)}}{\partial\theta^m}
289: \right]\subset L^2
290: \end{equation}
291: The reason for working with square roots of densities is that this gives
292: a natural inner product in the tangent space, which is simply the standard
293: $L^2$-inner product. In particular, we can calculate the associated
294: metric tensor in the basis of (\ref{eq:tthetas}):
295: \begin{equation}
296: \label{eq:fisher}
297: \left\langle
298: \frac{\partial\sqrt{p(\cdot,\theta)}}{\partial\theta^i},
299: \frac{\partial\sqrt{p(\cdot,\theta)}}{\partial\theta^j}
300: \right\rangle=
301: \int
302: \frac{\partial\sqrt{p(x,\theta)}}{\partial\theta^i}
303: \frac{\partial\sqrt{p(x,\theta)}}{\partial\theta^j}dx=
304: \frac{1}{4}g_{ij}(\theta)
305: \end{equation}
306: Up to a factor of $1/4$, this is the well-known Fisher information matrix
307: $g_{ij}(\theta)$.
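
As a simple check of (\ref{eq:fisher}) (an illustrative example that is
not used in the remainder of the paper), take the one-parameter family of
Gaussian densities with mean $\theta$ and fixed variance $s^2$,
$p(x,\theta)=(2\pi s^2)^{-1/2}\exp[-(x-\theta)^2/2s^2]$. Then
\begin{eqnarray}
	\frac{\partial\sqrt{p(x,\theta)}}{\partial\theta}=
	\frac{x-\theta}{2s^2}\sqrt{p(x,\theta)},
	\nonumber\\
	\int\left(
	\frac{\partial\sqrt{p(x,\theta)}}{\partial\theta}\right)^2dx=
	\frac{1}{4s^4}\int (x-\theta)^2p(x,\theta)\,dx=\frac{1}{4s^2}
\end{eqnarray}
so that $g(\theta)=1/s^2$, the familiar Fisher information for the mean of
a Gaussian.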
308:
309: We are now in the position to define what we mean by orthogonal projection
310: of a vector field on $M$ onto $TS$. At each $\theta$, the orthogonal
311: projection is
312: \begin{equation}
313: \label{eq:orthog}
314: \Pi_\theta X[\theta]=\sum_{i=1}^m\sum_{j=1}^m
315: 4g^{ij}(\theta)
316: \left\langle X[\theta],
317: \frac{\partial\sqrt{p(\cdot,\theta)}}{\partial\theta^j}
318: \right\rangle
319: \frac{\partial\sqrt{p(\cdot,\theta)}}{\partial\theta^i},
320: \end{equation}
321: where we have used the inverse Fisher information matrix $g^{ij}(\theta)$
322: to account for the fact that the basis of (\ref{eq:tthetas}) is not
323: orthogonal. This is the main result that is needed to obtain projection
324: filters.
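
One can verify directly that (\ref{eq:orthog}) acts as the identity on
$T_\theta S^{1/2}$. Applying $\Pi_\theta$ to a basis vector of
(\ref{eq:tthetas}) and using (\ref{eq:fisher}) gives
\begin{equation}
	\Pi_\theta\frac{\partial\sqrt{p(\cdot,\theta)}}{\partial\theta^k}=
	\sum_{i=1}^m\sum_{j=1}^m 4g^{ij}(\theta)\,
	\frac{1}{4}g_{jk}(\theta)\,
	\frac{\partial\sqrt{p(\cdot,\theta)}}{\partial\theta^i}=
	\frac{\partial\sqrt{p(\cdot,\theta)}}{\partial\theta^k}
\end{equation}
where we have used $\sum_jg^{ij}(\theta)g_{jk}(\theta)=\delta^i_k$.
Similarly, $\Pi_\theta X[\theta]=0$ whenever $X[\theta]$ is orthogonal to
every basis vector in (\ref{eq:tthetas}).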
325:
326: \subsection{Orthogonal projection of a Stratonovich filter}
327:
328: Let us now discuss how to perform orthogonal projection onto a
329: finite-dimensional manifold $S$ for the very general form
330: (\ref{eq:stratflt}) of a filtering equation. We begin by converting the
331: equation to the square root form; this gives
332: \begin{equation}
333: d\sqrt{p_t}=\frac{1}{2\sqrt{p_t}}A[p_t]\,dt
334: +\frac{1}{2\sqrt{p_t}}B[p_t]\circ dY_t
335: \end{equation}
336: We now constrain the filter to evolve on $S^{1/2}$ through orthogonal
337: projection:
338: \begin{equation}
339: \label{eq:pfiltfll}
340: d\sqrt{p(\cdot,\theta_t)}=
341: \Pi_{\theta_t}
342: \frac{
343: A[p(\cdot,\theta_t)]
344: }{2\sqrt{p(\cdot,\theta_t)}}\,dt
345: +\Pi_{\theta_t}
346: \frac{
347: B[p(\cdot,\theta_t)]
348: }{2\sqrt{p(\cdot,\theta_t)}}\circ dY_t
349: \end{equation}
350: This is just a finite-dimensional SDE for the parameters $\theta_t$. To
351: convert the expression explicitly into this form, note that by the
352: Stratonovich transformation rule
353: \begin{equation}
354: d\sqrt{p(\cdot,\theta_t)}=
355: \sum_i\frac{\partial\sqrt{p(\cdot,\theta_t)}}{\partial\theta_t^i}
356: \circ d\theta_t^i
357: \end{equation}
358: Comparing with (\ref{eq:orthog}) and (\ref{eq:pfiltfll}), we find that
359: \begin{equation}
360: \label{eq:projf}
361: d\theta_t^i=
362: \left\langle
363: \frac{
364: A[p(\cdot,\theta_t)]
365: }{p(\cdot,\theta_t)},\Lambda_t^i(\cdot,\theta_t)
366: \right\rangle dt + \left\langle
367: \frac{
368: B[p(\cdot,\theta_t)]
369: }{p(\cdot,\theta_t)},\Lambda_t^i(\cdot,\theta_t)
370: \right\rangle
371: \circ dY_t
372: \end{equation}
373: where
374: \begin{equation}
375: \label{eq:projf2}
376: \Lambda_t^i(\cdot,\theta_t)=
377: \sum_{j=1}^m
378: g^{ij}(\theta_t)\frac{\partial p(\cdot,\theta_t)}{\partial\theta^j_t}
379: \end{equation}
380: Equations (\ref{eq:projf}), (\ref{eq:projf2}) and (\ref{eq:fisher}) can be
381: used to directly calculate the projection filter for a wide range of
382: models.
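
The step from (\ref{eq:orthog}) and (\ref{eq:pfiltfll}) to
(\ref{eq:projf}) can be spelled out as follows. Since
$\partial\sqrt{p}/\partial\theta^j=(2\sqrt{p})^{-1}\,\partial
p/\partial\theta^j$, the coefficient of
$\partial\sqrt{p}/\partial\theta^i$ in the projected drift is
\begin{eqnarray}
	\sum_{j=1}^m 4g^{ij}(\theta)\left\langle
	\frac{A[p(\cdot,\theta)]}{2\sqrt{p(\cdot,\theta)}},
	\frac{1}{2\sqrt{p(\cdot,\theta)}}
	\frac{\partial p(\cdot,\theta)}{\partial\theta^j}
	\right\rangle
	\nonumber\\
	\qquad =\sum_{j=1}^m g^{ij}(\theta)\int
	\frac{A[p(\cdot,\theta)](x)}{p(x,\theta)}
	\frac{\partial p(x,\theta)}{\partial\theta^j}\,dx
\end{eqnarray}
which is the first inner product in (\ref{eq:projf}); the term containing
$B$ is handled in the same way.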
383:
384: \section{The physical model and the quantum filter}
385: \label{sec:phys}
386:
387: \subsection{The Jaynes-Cummings model in the strong driving limit}
388: \label{sec:jc1}
389:
390: \begin{figure}
391: \includegraphics[width=\textwidth]{Jaynes.eps}
392: \caption{
393: \label{f:jaynes} Schematic of the experimental setup that corresponds to
394: our model. A strongly coupled two-level atom in a resonant, single mode
395: cavity is strongly driven by a resonant driving laser, and spontaneously
396: emits in all directions. One of the cavity mirrors is leaky, and the
397: atomic dynamics is observed through homodyne detection of the
398: electromagnetic field in this forward mode.}
399: \end{figure}
400:
401: We consider the following physical system, shown in Figure \ref{f:jaynes}.
402: A two-level atom is strongly coupled to the mode of a single-mode cavity.
403: The cavity mode and the atomic frequency are resonant. The atom is
404: strongly driven on resonance by a laser, and spontaneously emits in all
405: directions. A forward mode of the electromagnetic field outside the
406: cavity, initially in the vacuum state, scatters off one of the cavity
407: mirrors. By making this mirror slightly leaky, we extract information
408: from the system into the external field. Homodyne detection of the
409: forward mode then yields information about the atom and cavity.
410:
411: The goal of this section is to model this physical system as a pair of
412: It\^o quantum stochastic differential equations, one of which describes
413: the atom-cavity evolution and one describing the homodyne observations.
414: To this end, we begin by writing down the full Hamiltonian for the system:
415: \begin{equation}
416: H=H_0+H_d+H_{\rm JC}+H_f+H_s
417: \end{equation}
418: Here $H_0$ is the free Hamiltonian
419: \begin{equation}
420: H_0=
421: \hbar\omega_0a^\dag a+
422: \frac{\hbar\omega_0}{2}\sigma_z
423: +\int_0^\infty d\omega\,\hbar\omega (b_f^\dag(\omega)b_f(\omega)
424: +b_s^\dag(\omega)b_s(\omega)),
425: \end{equation}
426: $H_d$ is the drive Hamiltonian
427: \begin{equation}
428: H_d=i\hbar(\mathcal{E}/2)
429: (e^{i\omega_0t}\sigma-e^{-i\omega_0t}\sigma^\dag),
430: \end{equation}
431: $H_{\rm JC}$ is the well-known Jaynes-Cummings Hamiltonian
432: \begin{equation}
433: H_{\rm JC}=i\hbar g(a^\dag\sigma-a\sigma^\dag),
434: \end{equation}
435: and $H_f$ and $H_s$ are the dipole couplings to the forward and
436: spontaneous emission field modes outside the cavity (see e.g.\
437: \cite{s:gardinercollett})
438: \begin{eqnarray}
439: H_f=\hbar\int_0^\infty d\omega\,
440: \kappa_f(\omega)[ab_f^\dag(\omega)+a^\dag b_f(\omega)] \\
441: H_s=\hbar\int_0^\infty d\omega\,
442: \kappa_s(\omega)[\sigma b_s^\dag(\omega)+\sigma^\dag b_s(\omega)]
443: \end{eqnarray}
444: Here $\sigma=|g\rangle\langle e|$ is the atomic lowering
445: operator, $\sigma_z=[\sigma^\dag,\sigma]$, $a$ is the cavity mode lowering
446: operator (we will also use $x=a^\dag+a$ and $y=i(a^\dag-a)$), and
447: $b_f(\omega)$ and $b_s(\omega)$ are the annihilators of the forward and
448: spontaneous emission modes, respectively. The resonant frequency of the
449: atom, drive and cavity mode is denoted by $\omega_0$, $\mathcal{E}$ is the
450: drive strength, $g$ is the atom-cavity coupling strength, and
451: $\kappa_f(\omega)$ and $\kappa_s(\omega)$ determine the
452: frequency-dependent coupling to the external field modes.
453:
454: We will assume that $\omega_0\gg\mathcal{E}\gg g>\kappa_f,\kappa_s$.
455: Following \cite{s:mabuchi}, let us switch to the interaction picture with
456: respect to $H_0+H_d$. We obtain the interaction Hamiltonian
457: \begin{eqnarray}
458: \frac{H_{\rm I}}{\hbar}=
459: ig[a^\dag\sigma(t)-a\sigma^\dag(t)]
460: +\int_0^\infty d\omega\,
461: \kappa_s(\omega)[\sigma(t)b_s^\dag(\omega)
462: e^{i(\omega-\omega_0)t}+\mbox{h.c.}]
463: \nonumber\\
464: \label{eq:interaction}
465: \phantom{\frac{H_{\rm I}}{\hbar}=ig[a^\dag\sigma(t)-a\sigma^\dag(t)]}
466: +\int_0^\infty d\omega\,
467: \kappa_f(\omega)[ab_f^\dag(\omega)e^{i(\omega-\omega_0)t}
468: +\mbox{h.c.}]
469: \end{eqnarray}
470: where we have defined
471: \begin{eqnarray}
472: |\pm\rangle = 2^{-1/2}(|g\rangle\mp i|e\rangle), ~~~~
473: \mu=|-\rangle\langle+|, ~~~~ \mu_z=[\mu^\dag,\mu], \\
474: \sigma(t)=(-i/2)(\mu e^{-i\mathcal{E}t}+\mu_z-\mu^\dag e^{i\mathcal{E}t})
475: \end{eqnarray}
476: There are two time scales in the Hamiltonian (\ref{eq:interaction}), which
477: we will consider separately. The first term evolves on the slow time
478: scale of the atomic evolution. As $\mathcal{E}$ is very large compared to
479: the atomic time scale, we make the rotating wave approximation by dropping
480: the rapidly oscillating terms.
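Explicitly, substituting $\sigma(t)$ into the first term of
(\ref{eq:interaction}) and using $\mu_z^\dag=\mu_z$, a short calculation
(included here for convenience) gives
\begin{equation}
	ig[a^\dag\sigma(t)-a\sigma^\dag(t)]=
	\frac{g}{2}\,\mu_zx
	-\frac{ig}{2}\,y\,(\mu e^{-i\mathcal{E}t}-\mu^\dag e^{i\mathcal{E}t})
\end{equation}
The second term oscillates at frequency $\mathcal{E}\gg g$ and is dropped
in the rotating wave approximation, leaving only $(g/2)\mu_zx$.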
481:
482: The remaining terms in (\ref{eq:interaction}) correspond to the fast time
483: scale of interaction with the external electromagnetic field. We cannot
484: use the rotating wave approximation for these terms, as the external
485: fields are broadband and thus have modes that respond on the fast time
486: scale. Instead, we make the weak coupling (Markov) approximation for
487: these terms; this results in the following white noise Hamiltonian:
488: \begin{eqnarray}
489: \label{eq:whiten}
490: \tilde H_{\rm I}=
491: \hbar(g/2)\mu_zx
492: +\hbar\sqrt{2\kappa}\,[a\dot B_f^\dag(t)+\mbox{h.c.}]
493: +i\hbar[
494: \sqrt{\gamma_{+}/2}\,\mu\dot B_{s,+}^\dag(t)
495: \nonumber\\
496: \phantom{\tilde H_{\rm I}=\hbar(g/2)\mu_zx+}
497: +\sqrt{\gamma_{z}/2}\,\mu_z\dot B_{s,z}^\dag(t)+
498: \sqrt{\gamma_{-}/2}\,\mu^\dag\dot B_{s,-}^\dag(t)-\mbox{h.c.}]
499: \end{eqnarray}
Here $\dot B_f^\dag$, $\dot B_{s,+}^\dag$, $\dot B_{s,z}^\dag$ and
$\dot B_{s,-}^\dag$ are independent quantum white noises corresponding,
respectively, to the forward channel and the three spontaneous emission
channels at $\omega=\omega_0+\mathcal{E}$, $\omega_0$, and
$\omega_0-\mathcal{E}$ (the upper, middle and lower peaks of the Mollow
triplet).
506:
507: We refer to \cite{s:gardinercollett} for a discussion of the white noise
508: approximation. Care must be taken to assign an independent white noise to
509: each frequency component of $\sigma(t)$ (e.g.\ section III.E of
510: \cite{s:gardinercollett}); each frequency probes a different subset of the
511: modes $b(\omega)$ and hence ``sees'' a different noise. In the weak
512: coupling limit these noises are in fact white and independent. For a more
513: rigorous approach to the white noise limit see \cite{s:accardi,s:gough}.
514: Using the latter approach we can explicitly calculate
515: $\kappa=\pi\kappa_f(\omega_0)^2$, $\gamma_z=\pi\kappa_s(\omega_0)^2$, and
$\gamma_\pm=\pi\kappa_s(\omega_0\pm\mathcal{E})^2$. For simplicity, we
will assume that $\gamma_z\approx\gamma_+\approx\gamma_-$ and denote the
common value by $\gamma$.
518:
519: The white noise Hamiltonian (\ref{eq:whiten}) by itself is not well
520: defined. However, we can give rigorous meaning to the equation
521: \begin{equation}
522: \frac{dU_t}{dt}=-\frac{i}{\hbar}\tilde H_{\rm I}U_t
523: \end{equation}
524: if we interpret it as a {\it Stratonovich} quantum stochastic differential
525: equation \cite{s:gardinercollett,s:gough}. After conversion to the It\^o
526: form, this equation reads
527: \begin{eqnarray}
528: dU_t=[
529: \sqrt{\gamma/2}\,(\mu\,dB_{s,+}^\dag(t)
530: +\mu_z\,dB_{s,z}^\dag(t)
531: +\mu^\dag\,dB_{s,-}^\dag(t)-\mbox{h.c.})
532: \nonumber\\
533: \phantom{dU_t=[\sqrt{\gamma/2}}
534: -i\sqrt{2\kappa}\,(a\,dB_f^\dag(t)+\mbox{h.c.})
535: \nonumber\\
536: \label{eq:sys}
537: \phantom{dU_t=[\sqrt{\gamma/2}}
538: -\kappa a^\dag a\,dt
539: -(\gamma/2)\,dt
540: -i(g/2)\mu_z x\,dt
541: ]\,U_t
542: \end{eqnarray}
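The $dt$ terms in (\ref{eq:sys}) other than the Hamiltonian term
$-i(g/2)\mu_zx\,dt$ are It\^o correction terms. As a sketch of how they
arise, assume vacuum noises with $dB\,dB^\dag=dt$ and write each noise
term in the standard form $L\,dB^\dag-L^\dag\,dB$, for which the
correction is $-\frac{1}{2}L^\dag L\,dt$. With $L_f=-i\sqrt{2\kappa}\,a$,
$L_+=\sqrt{\gamma/2}\,\mu$, $L_z=\sqrt{\gamma/2}\,\mu_z$ and
$L_-=\sqrt{\gamma/2}\,\mu^\dag$, this gives
\begin{equation}
	-\frac{1}{2}\left[2\kappa\,a^\dag a+\frac{\gamma}{2}\,
	(\mu^\dag\mu+\mu_z^2+\mu\mu^\dag)\right]dt=
	-\kappa\,a^\dag a\,dt-\frac{\gamma}{2}\,dt
\end{equation}
where we have used $\mu^\dag\mu+\mu\mu^\dag=\mu_z^2=1$ on the two-level
atom.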
543:
544: Let us now turn to the homodyne observation of the field. The homodyne
545: detector measures a quadrature of the forward channel after it has
546: scattered off the cavity. We will choose the quadrature
547: $B_f(t)+{B_f}^\dag(t)$; the observation process is then
$Y(t)=U_t^\dag(B_f(t)+{B_f}^\dag(t))U_t$ (i.e.\ the photocurrent is
$I(t)=dY(t)/dt$). Using the quantum It\^o rules
550: \cite{s:gardinercollett,s:hudpar}, we easily find the differential form of
551: this expression:
552: \begin{equation}
553: \label{eq:rawobs}
554: dY(t) = \sqrt{2\kappa}\,U_t^\dag yU_t\,dt+
555: dB_f(t)+dB_f^\dag(t)
556: \end{equation}
557: We can slightly extend our observation model to account for technical
558: noise, detector inefficiency, etc. To model such effects, we add to
559: (\ref{eq:rawobs}) an independent corrupting noise $\propto dC(t)+dC^\dag(t)=
dV(t)$. It is customary in the quantum optics literature to normalize
$Y(t)$ so that $dY(t)^2=dt$. In terms of the detection efficiency
$\eta\in (0,1]$, the normalized observation process is
563: \begin{equation}
564: \label{eq:obs}
565: dY(t) = \sqrt{2\kappa\eta}\,U_t^\dag yU_t\,dt
566: +\sqrt{\eta}\,[dB_f(t)+dB_f^\dag(t)]
567: +\sqrt{1-\eta}\,dV(t)
568: \end{equation}
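The normalization can be checked directly, assuming vacuum It\^o rules for
the two independent noises, so that
$[dB_f(t)+dB_f^\dag(t)]^2=dV(t)^2=dt$ and the cross terms vanish:
\begin{equation}
	dY(t)^2=\eta\,[dB_f(t)+dB_f^\dag(t)]^2+(1-\eta)\,dV(t)^2
	=\eta\,dt+(1-\eta)\,dt=dt
\end{equation}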
569: We will take the It\^o equations (\ref{eq:sys}), (\ref{eq:obs}) as
570: our model for the system-observation pair.
571:
572: \subsection{The quantum filter}
573: \label{sec:jc2}
574:
575: Now that we have a model for the system and the observation process, we
576: can calculate the optimal filter. The derivation of the filtering
577: equation is beyond the scope of this paper; for various approaches, see
578: \cite{s:VanHandel2005,s:belavkin,s:belavkz1,s:belavkz2,s:bouten}. We will
579: attempt, however, through a simple finite-dimensional analogy, to explain
580: our interpretation of the filtering equation, as it is not entirely the
581: same as the interpretation that is often found in the physics literature
582: (e.g.\ \cite{s:wisemanf}).
583:
584: The optimal filter propagates the information state, which determines our
585: best estimate of every system observable given the observations we have
586: made. In our model, every system observable can be represented as a
587: self-adjoint operator $X$ that lives on the atom-cavity Hilbert space; as
588: we are working in the Heisenberg picture, this observable at time $t$ is
589: given by $j_t(X)=U_t^\dag XU_t$. We must now define what we mean by an
590: estimate of an observable.
591:
592: The idea behind the concept of estimation is that we have made some
593: observation, and given the outcome of this observation we wish to make a
594: guess as to the outcome of a different observable that we haven't
595: measured. That is, the estimate of an observable $X$ given an observation
596: of $Y$ is some function $f(Y)$ whose outcome represents our best guess of
597: $X$. To find the {\it best} estimate we must specify some cost function
598: $\mathcal{C}[f]$ to optimize; the function that minimizes $\mathcal{C}$ is
599: then by definition the optimal estimate.
600:
601: The most commonly used estimator is one that minimizes the mean-square
602: error
603: \begin{equation}
604: \label{eq:meansq}
605: \mathcal{C}[f]=\langle(X-f(Y))^2\rangle
606: \end{equation}
607: The observable $f(Y)$ that minimizes this cost is called the {\it
608: conditional expectation} $\mathcal{E}(X|Y)$ of $X$ given $Y$.
609: We will use the conditional expectation as our information state
610: throughout this paper. However, note that if we had chosen a different
611: cost we would obtain a different information state and filter. There is
612: nothing inherently superior about the choice (\ref{eq:meansq}); in
613: fact, it is sometimes advantageous to choose a different estimator with
614: e.g.\ improved robustness properties \cite{s:james2,s:james3}.
615:
616: To understand how the conditional expectation relates to familiar notions
617: from quantum theory, we will demonstrate the procedure using a pair of
618: finite-dimensional observables \cite{s:maassen}. Let $X$ and $Y$ be two
619: $n$-dimensional observables, $n<\infty$, and let $Y$ have $m$ distinct
620: eigenvalues $y_i$. Then $Y$ can be decomposed as
621: \begin{equation}
622: Y=\sum_{i=1}^m y_iP_i
623: \end{equation}
624: where $P_i$ is the projection operator onto the eigenspace corresponding
625: to $y_i$. Clearly any function of $Y$ is a linear combination of $P_i$,
626: and vice versa. Hence we identify the set of all observables that are
627: functions of $Y$ with the span of $\{P_i\}$. The conditional expectation
628: $\mathcal{E}(X|Y)$ is then the element of this set that minimizes the cost
629: (\ref{eq:meansq}).
630:
631: To find this element, we use the following trick. The expression $\langle
632: X^\dag Y\rangle$ defines an inner product on the set of $n\times n$
633: complex matrices\footnote{
634: We assume for simplicity that the expectation map
635: $\langle\cdot\rangle$ is faithful \cite{s:maassen}. If this is
636: not the case, then the conditional expectation is not unique.
637: However, all versions of $\mathcal{E}(\cdot|\cdot)$ are
638: equivalent in the sense that the difference between two versions
639: takes nonzero values with zero probability.
640: }. Using this inner product, we orthogonally project $X$ onto the linear
641: space spanned by $\{P_i\}$. This gives
642: \begin{equation}
643: \label{eq:qcondex}
644: P_YX=\sum_{i=1}^m\frac{\langle P_iX\rangle}{\langle P_i\rangle}P_i
645: \end{equation}
646: It is a well known fact that the orthogonal projection of some vector
647: $v$ onto a linear subspace $W$ with respect to any inner product $(a,b)$
648: gives the element $w\in W$ that minimizes the quantity $((v-w),(v-w))$
\cite{s:naylor}. In our case, this means that $P_YX$ minimizes $\langle
(X-f^*(Y))(X-f(Y))\rangle$ over all complex functions $f$. However, $f(Y)$
is an observable only if $f$ is real, in which case we see that $P_YX$ is
precisely the conditional expectation $\mathcal{E}(X|Y)$. Note that the
orthogonal projection $P_YX$ will always be self-adjoint if $X$ and $Y$
commute.
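
The coefficients in (\ref{eq:qcondex}) can be checked in a few lines; the
following is a short sketch under the faithfulness assumption made above.
Since $P_i^\dag=P_i$ and $P_iP_j=\delta_{ij}P_i$, the projections are
mutually orthogonal in the inner product $\langle A^\dag B\rangle$, i.e.\
$\langle P_i^\dag P_j\rangle=\delta_{ij}\langle P_i\rangle$ with
$\langle P_i\rangle>0$. The usual formula for projection onto an
orthogonal basis then gives
\[
	P_YX=\sum_{i=1}^m
	\frac{\langle P_i^\dag X\rangle}{\langle P_i^\dag P_i\rangle}P_i
	=\sum_{i=1}^m\frac{\langle P_iX\rangle}{\langle P_i\rangle}P_i,
	\qquad
	\langle P_j^\dag(X-P_YX)\rangle=
	\langle P_jX\rangle-
	\frac{\langle P_jX\rangle}{\langle P_j\rangle}\langle P_j\rangle=0
\]
so that the residual $X-P_YX$ is orthogonal to every $P_j$, as required
of an orthogonal projection.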
654:
655: Remarkably, when $X$ and $Y$ commute, the expression (\ref{eq:qcondex}) is
656: equivalent to the traditional projection postulate. To see this, note
657: that if we observe $Y=y_i$ then $P_YX$ takes the value
658: $\langle P_iX\rangle/\langle P_i\rangle=
659: \langle P_iXP_i\rangle/\langle P_i\rangle$, which is exactly the
660: expectation of $X$ with respect to the initial state projected onto the
661: eigenspace of $y_i$. The situation is somewhat ambiguous for noncommuting
662: $X$ and $Y$, and we will simply refrain from defining the conditional
663: expectation $\mathcal{E}(X|Y)$ when $[X,Y]\ne 0$.
664:
665: The quantum filter determines the best estimate of every system observable
666: given the observations; i.e., it propagates
667: $\pi_t(X)=\mathcal{E}(j_t(X)|Y(s\le t))$, where here
668: $\mathcal{E}(\cdot|\cdot)$ is a proper infinite-dimensional generalization
669: of (\ref{eq:qcondex}). A crucial point is that $j_t(X)$ and $Y(s)$ can in
670: fact be shown to commute for all $s\le t$; this is called the {\it
671: nondemolition property} by Belavkin \cite{s:belavkin}. Thus we see that,
672: even though we can evidently interpret the quantum filter in terms of the
673: projection postulate, we do not need to postulate anything beyond the
674: standard formalism of observables and expectations in quantum mechanics.
675: Instead, we see that the filter follows naturally from a statistical
676: inference procedure wherein we find the least-squares estimate for every
677: system observable given the observations. This point of view is
678: very natural in a control-theoretic context.
679:
680: We now give the quantum filter for our model (\ref{eq:sys}), (\ref{eq:obs});
681: we refer to \cite{s:VanHandel2005,s:belavkin,s:belavkz1,s:belavkz2,s:bouten}
682: for various approaches to deriving this equation. The result is
683: \begin{eqnarray}
684: d\pi_t(X)=
685: \pi_t(
686: (\gamma/2)\{
687: \overline{\mathcal D}[\mu]+
688: \overline{\mathcal D}[\mu_z]+
689: \overline{\mathcal D}[\mu^\dag]
690: \}X
691: )\,dt
692: \nonumber\\
693: \phantom{ d\pi_t(X)=\pi}
694: +\pi_t(
695: 2\kappa\,\overline{\mathcal D}[a]X
696: )\,dt
697: +\pi_t(
698: i(g/2)[\mu_zx,X]
699: )\,dt
700: \nonumber\\
701: \phantom{ d\pi_t(X)=\pi}
702: + \sqrt{2\kappa\eta}\,[
703: i\pi_t(a^\dag X-Xa)
704: -\pi_t(y)\pi_t(X)]\times
705: \nonumber\\
706: \label{eq:qfilt}
707: \phantom{ d\pi_t(X)=\pi + \sqrt{2\kappa\eta}\,[i\pi_t(a^\dag X-X}
708: (dY(t)-\sqrt{2\kappa\eta}\,\pi_t(y)\,dt)
709: \end{eqnarray}
710: where $\overline{\mathcal D}[c]X=c^\dag Xc-(c^\dag cX+Xc^\dag c)/2$.
711: The process $dW(t)=dY(t)-\sqrt{2\kappa\eta}\,\pi_t(y)\,dt$ is known as the
712: {\it innovations process}; it describes how ``surprised'' we are by the
713: measurement, as it is the difference between the observation $dY(t)$ and
714: our best estimate of what we should observe. It can be shown that, as
715: long as the observation process $Y(t)$ has the statistics determined by
(\ref{eq:sys}) and (\ref{eq:obs}), the innovations process $W(t)$ is a
717: Wiener process. In some sense this reflects the optimality of the filter,
718: as it means that the innovation is unbiased.
719:
720: Usually the quantum filter (\ref{eq:qfilt}) is written in its density
721: form. To do this, we define a random density operator $\rho_t$ such
722: that\footnote{
723: We mean this in the sense of random variables; that is,
724: ${\rm Tr}[X\rho_t]$ is a classical random variable with the same
725: statistics as the observable $\pi_t(X)$. We have already implied
726: such a correspondence by interpreting $Y(t)$ as a classical
727: stochastic process. In general, we can always express a set of
728: observables as classical random variables as long as they commute
729: \cite{s:maassen}.
730: } $\pi_t(X)={\rm Tr}[X\rho_t]$. We then find
731: \begin{eqnarray}
732: d\rho_t=-i(g/2)[\mu_zx,\rho_t]\,dt
733: +2\kappa\mathcal{D}[a]\rho_t\,dt
734: \nonumber\\
735: \phantom{d\rho_t=-i}
736: +(\gamma/2)\{\mathcal{D}[\mu]
737: +\mathcal{D}[\mu_z]+\mathcal{D}[\mu^\dag]\}\rho_t\,dt
738: \label{eq:dfilt}\\
739: \nonumber
740: \phantom{d\rho_t=-i}
741: +\sqrt{2\kappa\eta}\,[
742: i\rho_t a^\dag-ia\rho_t-{\rm Tr}[\rho_t y]\rho_t
743: ](dY(t)-\sqrt{2\kappa\eta}\,{\rm Tr}[\rho_t y]\,dt)
744: \end{eqnarray}
745: where $\mathcal{D}[c]\rho=c\rho c^\dag-(c^\dag c\rho+\rho c^\dag c)/2$.
746: This description is more economical than the raw filter (\ref{eq:qfilt}),
747: and appears frequently in the physics literature. We have to be careful,
748: however, to interpret $\rho_t$ as the information state of an observer
749: with access to $Y(t)$, and {\it not} as the physical state of the system.
750: This point will be important for the interpretation of our results.
751:
752: We conclude this section with one more filter, the so-called unnormalized
753: filter, which is given by the expression
754: \begin{eqnarray}
755: d\tilde\rho_t=-i(g/2)[\mu_zx,\tilde\rho_t]\,dt
756: +2\kappa\mathcal{D}[a]\tilde\rho_t\,dt
757: \nonumber\\
758: \phantom{d\rho_t=-i}
759: +(\gamma/2)\{\mathcal{D}[\mu]
760: +\mathcal{D}[\mu_z]+\mathcal{D}[\mu^\dag]\}
761: \tilde\rho_t\,dt
762: \label{eq:ufilt}\\
763: \nonumber
764: \phantom{d\rho_t=-i}
765: +i\sqrt{2\kappa\eta}\,[\tilde\rho_t a^\dag-a\tilde\rho_t]\,dY(t)
766: \end{eqnarray}
767: The information state $\tilde\rho_t$ propagated by this filter is not
768: normalized, ${\rm Tr}[\tilde\rho_t]\ne 1$. However, it is simply related
769: to the normalized information state by $\rho_t=\tilde\rho_t/{\rm
770: Tr}[\tilde\rho_t]$. The chief advantage of (\ref{eq:ufilt}) is that it is
771: a linear equation, whereas (\ref{eq:dfilt}) is nonlinear in $\rho_t$.
772: This makes (\ref{eq:ufilt}) somewhat easier to manipulate.
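
The relation between (\ref{eq:ufilt}) and (\ref{eq:dfilt}) can be sketched
using the It\^o rules. The sketch assumes the convention
$y=i(a^\dag-a)$ and the It\^o rule $dY(t)^2=dt$, and uses the shorthand
(introduced only for this purpose) $\mathcal{K}\rho=i(\rho a^\dag-a\rho)$,
with $\mathcal{L}$ denoting the collected $dt$ terms of (\ref{eq:ufilt}).
Since ${\rm Tr}[\mathcal{L}\rho]=0$ and
${\rm Tr}[\mathcal{K}\rho]={\rm Tr}[\rho y]$, the normalization evolves as
\[
	d\,{\rm Tr}[\tilde\rho_t]=
	\sqrt{2\kappa\eta}\,{\rm Tr}[\tilde\rho_t y]\,dY(t)
\]
Applying the It\^o quotient rule to
$\rho_t=\tilde\rho_t/{\rm Tr}[\tilde\rho_t]$, and writing
$m_t={\rm Tr}[\rho_t y]$, one obtains
\[
	d\rho_t=\mathcal{L}\rho_t\,dt
	+\sqrt{2\kappa\eta}\,(\mathcal{K}\rho_t-m_t\rho_t)\,dY(t)
	-2\kappa\eta\,m_t(\mathcal{K}\rho_t-m_t\rho_t)\,dt
\]
which, after collecting the last two terms, has the form of
(\ref{eq:dfilt}).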
773:
774: \subsection{The $Q$-filter}
775:
776: In \cite{s:mabuchi}, it was noticed that density operators of the form
777: \begin{equation}
778: \label{eq:qansatz}
779: \rho=\sum_{a=\pm}|a\rangle\langle a|\otimes
780: \int dy\, P^a(y)|iy/2\rangle\langle iy/2|
781: \end{equation}
782: [$|iy/2\rangle$ are coherent states of the cavity mode] form an invariant
783: set of the filtering equation (\ref{eq:dfilt}). Thus, as long as the
784: initial density is within this set, we can represent the filtering
785: equations in terms of the pair of real (Glauber-Sudarshan) functions
786: $P^\pm(y)$ on a line. Substituting (\ref{eq:qansatz}) into
787: (\ref{eq:ufilt}) yields the unnormalized $P$-filter
788: \begin{eqnarray}
789: dP^\pm_t(y)=
790: \frac{\partial}{\partial y}[(\pm g+\kappa y)P^\pm_t(y)]\,dt
791: \nonumber\\
792: \label{eq:pfiltt}
793: \phantom{dP^\pm_t(y)=(\pm g+}
794: +\frac{\gamma}{2}[P^\mp_t(y)-P^\pm_t(y)]\,dt
795: +\sqrt{2\kappa\eta}\,yP^\pm_t(y)\,dY(t)
796: \end{eqnarray}
797: For our purposes, it is more convenient to work with unnormalized
798: $Q$-functions
799: \begin{equation}
800: \label{eq:qfunction}
801: Q^\pm(y)=\langle\pm,iy/2|\rho|\pm,iy/2\rangle=
802: \int dy'\, P^\pm(y')\, e^{-(y-y')^2/4}
803: \end{equation}
804: as these are always guaranteed to be well-behaved densities \cite{s:mandel}.
805: We obtain
806: \begin{eqnarray}
807: dQ^\pm_t(y)=
808: \frac{\gamma}{2}[Q^\mp_t(y)-Q^\pm_t(y)]\,dt
809: +\frac{\partial}{\partial y}[(\pm g+\kappa y)Q^\pm_t(y)]\,dt
810: \nonumber\\
811: \label{eq:qfiltt}
812: \phantom{dQ^\pm_t(y)=Q}
813: +2\kappa\frac{\partial^2}{\partial y^2}Q^\pm_t(y)\,dt
814: +\sqrt{2\kappa\eta}\,\left[
815: y+2\frac{\partial}{\partial y}
816: \right]Q^\pm_t(y)\,dY(t)
817: \end{eqnarray}
818: The simplicity of this expression motivates our choice of this system for
819: demonstrating the quantum projection filter.
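
The form of the observation term in (\ref{eq:qfiltt}) can be traced to a
simple property of the Gaussian kernel in (\ref{eq:qfunction}); the
following is a sketch of that one step only. Differentiating under the
integral sign gives
\[
	\frac{\partial}{\partial y}Q^\pm(y)=
	-\frac{1}{2}\int dy'\,(y-y')\,P^\pm(y')\,e^{-(y-y')^2/4}
	\quad\Longrightarrow\quad
	\int dy'\,y'P^\pm(y')\,e^{-(y-y')^2/4}=
	\left[y+2\frac{\partial}{\partial y}\right]Q^\pm(y)
\]
so that multiplication by $y$ in (\ref{eq:pfiltt}) turns into the operator
$y+2\,\partial/\partial y$ acting on $Q^\pm$.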
820:
821: Rather than using the $Q$-function for the projection filter, we could
822: work directly with the filter (\ref{eq:dfilt}) in density form and apply
823: methods of quantum information geometry \cite{s:amari}. However, note
824: that any metric on a manifold of densities induces a metric on the
825: corresponding manifold of density operators (e.g.\ \cite{s:topology}).
826: Thus even the $Q$-function projection filter is a true quantum projection
827: filter, as long as we project onto a family of $Q$-functions that
828: correspond to valid quantum states.
829:
830: \subsection{Observing the spontaneous emission}
831:
832: Until now we have only observed the forward channel; however, at least in
833: principle, we could also observe independently the three spontaneous
834: emission channels $B_{s,z}$, $B_{s,\pm}$. We would like to identify a
835: spontaneous emission event with the detection of a photon in one of these
836: side channels. As such, in this section we discuss the situation wherein
837: direct photodetection is performed in each of the spontaneous emission
838: channels, in addition to the homodyne detection of the forward channel.
839:
840: The analysis in this case is very similar to the one performed in sections
841: \ref{sec:jc1}--\ref{sec:jc2}. The system model is still given by
842: (\ref{eq:sys}). Now, in addition to (\ref{eq:obs}), we need to introduce
843: three observation processes $N_{z,+,-}$ corresponding to photodetection
844: (with perfect efficiency) in the three spontaneous emission channels.
845: The details of this setup and the associated filtering equations are well
846: known and we will not repeat them here (see e.g.\ \cite{s:bouten}). The
847: full (normalized) filtering equation is given by
848: \begin{eqnarray}
849: d\rho_t=-i(g/2)[\mu_zx,\rho_t]\,dt
850: +2\kappa\mathcal{D}[a]\rho_t\,dt
851: \nonumber\\
852: \phantom{d\rho_t=-i}
853: +(\gamma/2)\{\mathcal{D}[\mu]
854: +\mathcal{D}[\mu_z]+\mathcal{D}[\mu^\dag]\}\rho_t\,dt
855: \nonumber\\
856: \phantom{d\rho_t=-i}
857: +\sqrt{2\kappa\eta}\,[
858: i\rho_t a^\dag-ia\rho_t-{\rm Tr}[\rho_t y]\rho_t
859: ](dY(t)-\sqrt{2\kappa\eta}\,{\rm Tr}[\rho_t y]\,dt)
860: \nonumber\\
861: \phantom{d\rho_t=-i}
862: +\mathcal{G}[\mu]\rho_t\,(dN_+(t)-(\gamma/2)\,{\rm Tr}[\mu^\dag\mu\rho_t]\,dt)
863: \nonumber\\
864: \phantom{d\rho_t=-i}
865: +\mathcal{G}[\mu_z]\rho_t\,(dN_z(t)-(\gamma/2)\,dt)
866: \nonumber\\
867: \label{eq:fullfilt}
868: \phantom{d\rho_t=-i}
869: +\mathcal{G}[\mu^\dag]\rho_t\,(dN_-(t)-(\gamma/2)\,{\rm Tr}[\mu\mu^\dag\rho_t]\,dt)
870: \end{eqnarray}
where $\mathcal{G}[c]\rho=c\rho c^\dag/{\rm Tr}[c\rho c^\dag]-\rho$. It
can be shown that the processes $N_{+,z,-}(t)$ are counting processes
with independent jumps and rates
$(\gamma/2)\,{\rm Tr}[\mu^\dag\mu\rho_t]$, $(\gamma/2)$ and
$(\gamma/2)\,{\rm Tr}[\mu\mu^\dag\rho_t]$, respectively.
876:
877: We now have two different filters, equations (\ref{eq:dfilt}) and
878: (\ref{eq:fullfilt}), for the same physical system (\ref{eq:sys}). To see
879: how they relate, recall that all the filter is propagating is an
880: information state. The information state in (\ref{eq:dfilt}) represents
881: the best estimate of an observer who only has access to the homodyne
882: measurement in the forward channel. The information state in
883: (\ref{eq:fullfilt}), however, represents the best estimate of a different
884: observer who has access to both the homodyne observation and to direct
885: photodetection of the spontaneous emission channels. Neither information
886: state represents the physical state of the system; the latter is given by
887: (\ref{eq:sys}).
888:
889: In practice, the frequency-resolved monitoring of spontaneously emitted
890: photons is not (yet) experimentally feasible. Hence we would never use
891: the filter (\ref{eq:fullfilt}) in an actual experimental situation. On
892: the other hand, we are able to generate photocurrents $Y(t)$,
893: $N_{+,z,-}(t)$ with the correct statistics in a computer simulation. It
894: is then interesting to compare the estimate of an observer who has access
895: to all photocurrents to the estimate of a realistic observer who only has
896: access to the forward channel. In particular, this gives insight into the
897: question asked in \cite{s:mabuchi}, `in what sense should we be able to
898: associate observed phase-switching events [in the forward channel] with
899: ``actual'' atomic decays?'
900:
901: The main reason for introducing (\ref{eq:fullfilt}) is that it gives us a
902: convenient way to perform computer simulations of the photocurrent $Y(t)$.
903: We wish to generate sample paths of $Y(t)$, with the correct statistics,
904: in order to compare the performance of the optimal filter (\ref{eq:dfilt})
905: with the projection filter that we will derive shortly. Ideally we would
906: directly simulate the system evolution (\ref{eq:sys}); this problem is
907: essentially intractable, however. Fortunately, we have already expressed
908: the statistics of the photocurrents $Y(t)$, $N_{+,z,-}(t)$ completely in
909: terms of the information state. Hence we can equivalently simulate the
910: photocurrents by simulating (\ref{eq:fullfilt}) according to these rules
($dW(t)$ is a Wiener process, $N_+(t)$ has rate $(\gamma/2)\,{\rm
Tr}[\mu^\dag\mu\rho_t]$, etc.).
913:
914: Of course, we can also perform such simulations with (\ref{eq:dfilt}).
915: However, the advantage of (\ref{eq:fullfilt}) is that, if we choose
916: $\eta=1$, the pure states are an invariant set of this filter. We can
917: thus rewrite the equation as a stochastic Schr{\"o}dinger equation, in
918: which we only have to propagate a vector instead of an operator. This is
919: a much more efficient numerical procedure, and is frequently used in
920: quantum optics \cite{s:gardinerparkins}. For our system, the stochastic
921: Schr{\"o}dinger equation corresponding to (\ref{eq:fullfilt}) is given by
922: \begin{eqnarray}
923: d|\psi_t\rangle=[
924: (-i(g/2)\mu_z x
925: -i\kappa\langle y\rangle_t a
926: -\kappa a^\dag a
927: -(\kappa/4)\langle y\rangle_t^2)\,dt
928: \nonumber\\
929: \phantom{d|\psi_t\rangle=[(-i}
930: -\sqrt{2\kappa}\,(ia+\langle y\rangle_t/2)\,dW(t)
931: +(\mu/\langle\mu^\dag\mu\rangle_t^{1/2}-1)\,dN_+(t)
932: \nonumber\\
933: \phantom{d|\psi_t\rangle=[(-i}
934: \label{eq:sse}
935: +(\mu_z-1)\,dN_z(t)
936: +(\mu^\dag/\langle\mu\mu^\dag\rangle_t^{1/2}-1)\,dN_-(t)
937: ]\,|\psi_t\rangle
938: \end{eqnarray}
939: where $\langle c\rangle_t=\langle\psi_t|c|\psi_t\rangle$, and
940: $\rho_t=|\psi_t\rangle\langle\psi_t|$. We numerically solve this equation
941: in a truncated Fock basis for the cavity mode. The homodyne photocurrent
942: (\ref{eq:obs}) is calculated from the innovation using
943: $dY(t)=\sqrt{2\kappa\eta}\,\langle y\rangle_t\,dt+\sqrt{\eta}\,dW_t+\sqrt{1-\eta}\,dV_t$.
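
For concreteness, one possible first-order discretization of these
sampling rules with time step $\Delta t$ is sketched below. It assumes
that at most one jump occurs in each channel per step and that
$\Delta V_k$ is a Gaussian increment independent of $\Delta W_k$; the
index $k$ and the step $\Delta t$ are notation introduced for this sketch.
\[
	\Delta W_k,\,\Delta V_k\sim\mathcal{N}(0,\Delta t),\qquad
	{\rm Pr}[\Delta N_{z,k}=1]\approx(\gamma/2)\,\Delta t
\]
\[
	{\rm Pr}[\Delta N_{+,k}=1]\approx
	(\gamma/2)\,\langle\mu^\dag\mu\rangle_{t_k}\Delta t,\qquad
	{\rm Pr}[\Delta N_{-,k}=1]\approx
	(\gamma/2)\,\langle\mu\mu^\dag\rangle_{t_k}\Delta t
\]
\[
	\Delta Y_k=\sqrt{2\kappa\eta}\,\langle y\rangle_{t_k}\Delta t
	+\sqrt{\eta}\,\Delta W_k+\sqrt{1-\eta}\,\Delta V_k
\]
The increments $\Delta W_k$ and $\Delta N_{+,z,-,k}$ drive one step of
(\ref{eq:sse}), and $\Delta Y_k$ is recorded as the simulated homodyne
photocurrent.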
944:
945:
946: \section{The quantum projection filter}
947: \label{sec:proj2}
948:
949: \subsection{The finite-dimensional family}
950:
951: Before we can obtain a projection filter for (\ref{eq:qfiltt}), we must
952: fix the finite-dimensional family of densities to project onto. Note that
953: each density is actually the pair of $Q$-functions $Q^\pm(y)$, unlike
954: in section \ref{sec:proj} where each density was a single function.
955: However, we can easily put the problem into this form by making $\pm$
956: an argument of the function, i.e.\ $Q^\pm(y)=Q(y,\pm)$. The square roots
957: of $Q$-functions form a perfectly reasonable $L^2$ space
958: ($L^2=L^2(\mathbb{R})\oplus L^2(\mathbb{R})$) with the inner product
959: \begin{equation}
960: \langle Q_1^{1/2},Q_2^{1/2}\rangle=
961: \sum_{a=\pm}\int_{-\infty}^\infty dy\,
962: Q_1^{1/2}(y,a)Q_2^{1/2}(y,a)
963: \end{equation}
964: In the following we will use the notations $Q^\pm(y)$ and $Q(y,\pm)$
965: interchangeably.
966:
967: Numerical simulations of (\ref{eq:qfiltt}) show that at any time, both
968: $Q^+(y)$ and $Q^-(y)$ are unimodal, roughly bell-shaped densities with an
969: approximately constant width. This suggests that we can attempt to
970: approximate the information state by unnormalized density operators of the
971: form
972: \begin{equation}
973: \rho=
974: \nu^+|+\rangle\langle+|\otimes
975: |i\mu^+/2\rangle\langle i\mu^+/2|+
976: \nu^-|-\rangle\langle-|\otimes
977: |i\mu^-/2\rangle\langle i\mu^-/2|
978: \end{equation}
979: This corresponds to the bi-Gaussian family of unnormalized $Q$-functions
980: \begin{equation}
981: q(y,\pm)=\frac{\nu^\pm}{2\sqrt{\pi}}
982: \exp\left[
983: -\frac{(y-\mu^\pm)^2}{4}
984: \right],~\mu^\pm\in\mathbb{R},~\nu^\pm\ge 0
985: \end{equation}
986: We collect the parameters into a vector
987: $\theta=(\mu^+,\nu^+,\mu^-,\nu^-)$, where
988: $\theta\in\Theta=\{\mu^\pm\in\mathbb{R},~\nu^\pm\ge 0\}$. Then the family
989: of square roots of densities
990: \begin{equation}
991: S^{1/2}=\{\sqrt{q(y,\pm;\theta)},~\theta\in\Theta\}
992: \end{equation}
993: is a finite-dimensional manifold in $L^2$ with the tangent space
994: \begin{equation}
995: T_\theta S^{1/2}=\mbox{Span}\left\{
996: \frac{\partial\sqrt{q(y,\pm;\theta)}}{\partial\theta^i}:
997: i=1\ldots 4\right\}
998: \end{equation}
999: and Fisher metric
1000: \begin{equation}
1001: g_{ij}(\theta)=
1002: 4\left\langle
1003: \frac{\partial\sqrt{q(y,\pm;\theta)}}{\partial\theta^i},
1004: \frac{\partial\sqrt{q(y,\pm;\theta)}}{\partial\theta^j}
1005: \right\rangle
1006: \end{equation}
1007: Calculating the latter explicitly, we obtain the diagonal matrix
1008: \begin{equation}
1009: g(\theta)=\mbox{diag}\left\{
1010: \frac{\nu^+}{2},~\frac{1}{\nu^+},~
1011: \frac{\nu^-}{2},~\frac{1}{\nu^-}
1012: \right\}
1013: \end{equation}
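As an illustration of this calculation, consider the $(\mu^+,\nu^+)$
block. Using
$\partial\sqrt{q}/\partial\mu^+=\frac{1}{4}(y-\mu^+)\sqrt{q}$ and
$\partial\sqrt{q}/\partial\nu^+=\sqrt{q}/(2\nu^+)$ on the $+$ branch,
together with the Gaussian integrals $\int q(y,+)\,dy=\nu^+$ and
$\int(y-\mu^+)^2q(y,+)\,dy=2\nu^+$, one finds
\[
	g_{\mu^+\mu^+}=\frac{4}{16}\int(y-\mu^+)^2\,q(y,+)\,dy
	=\frac{\nu^+}{2},\qquad
	g_{\nu^+\nu^+}=\frac{4}{4(\nu^+)^2}\int q(y,+)\,dy
	=\frac{1}{\nu^+}
\]
\[
	g_{\mu^+\nu^+}=\frac{4}{8\nu^+}\int(y-\mu^+)\,q(y,+)\,dy=0
\]
The mixed $+/-$ entries vanish because the derivatives with respect to
$(\mu^+,\nu^+)$ are supported on the $+$ branch only, and those with
respect to $(\mu^-,\nu^-)$ on the $-$ branch only.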
1014:
1015: \subsection{The projection filter}
1016:
1017: We will perform projection of the unnormalized filtering equation
1018: (\ref{eq:qfiltt}), as in \cite{s:vellekoop}. We begin by converting the
1019: equation into the Stratonovich form:
1020: \begin{eqnarray}
1021: dQ^\pm_t(y)=
1022: \frac{\gamma}{2}[Q^\mp_t(y)-Q^\pm_t(y)]\,dt
1023: +\frac{\partial}{\partial y}[(\pm g+\kappa(1-4\eta)y)Q^\pm_t(y)]\,dt
1024: \nonumber\\
1025: \phantom{dQ^\pm_t(y)=Q}
1026: +2\kappa(1-2\eta)\frac{\partial^2}{\partial y^2}Q^\pm_t(y)\,dt
1027: +\kappa\eta(2-y^2)Q^\pm_t(y)\,dt
1028: \nonumber\\
1029: \label{eq:qfilttst}
1030: \phantom{dQ^\pm_t(y)=Q+2\kappa(1-2\eta)\frac{\partial}{\partial y}}
1031: +\sqrt{2\kappa\eta}\,\left[
1032: y+2\frac{\partial}{\partial y}
1033: \right]Q^\pm_t(y)\circ dY(t)
1034: \end{eqnarray}
1035: We can now use (\ref{eq:projf}) and (\ref{eq:projf2}) to find dynamical
1036: equations for the projection filter. After tedious but straightforward
1037: calculations, we obtain
1038: \begin{eqnarray}
1039: \label{eq:projfs1}
1040: d\nu^+_t = \left[\frac{\gamma}{2}(\nu^-_t - \nu^+_t)
1041: -\kappa\eta\,(\mu^+_t)^2\nu^+_t
1042: \right]dt
1043: +\sqrt{2\kappa\eta}\,\mu^+_t\nu^+_t\circ dY(t) \\
1044: \label{eq:projfs2}
1045: d\nu^-_t = \left[\frac{\gamma}{2}(\nu^+_t - \nu^-_t)
1046: -\kappa\eta\,(\mu^-_t)^2\nu^-_t
1047: \right]dt
1048: +\sqrt{2\kappa\eta}\,\mu^-_t\nu^-_t\circ dY(t) \\
1049: \frac{d\mu^+_t}{dt} = -g-\kappa\mu^+_t+
1050: \frac{\gamma}{2}\frac{\nu^-_t}{\nu^+_t}
1051: (\mu^-_t-\mu^+_t) \\
1052: \frac{d\mu^-_t}{dt} = +g-\kappa\mu^-_t+
1053: \frac{\gamma}{2}\frac{\nu^+_t}{\nu^-_t}
1054: (\mu^+_t-\mu^-_t)
1055: \end{eqnarray}
1056: Conversion to the It\^o form changes (\ref{eq:projfs1}) and
1057: (\ref{eq:projfs2}) to
1058: \begin{eqnarray}
1059: d\nu^+_t = \frac{\gamma}{2}(\nu^-_t - \nu^+_t)\,dt
1060: +\sqrt{2\kappa\eta}\,\mu^+_t\nu^+_t\, dY(t) \\
1061: d\nu^-_t = \frac{\gamma}{2}(\nu^+_t - \nu^-_t)\,dt
1062: +\sqrt{2\kappa\eta}\,\mu^-_t\nu^-_t\, dY(t)
1063: \end{eqnarray}
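The cancellation of the $(\mu^\pm_t)^2$ terms can be seen from the
standard Stratonovich-to-It\^o rule, sketched here under the It\^o rule
$dY(t)^2=dt$. For an equation of the form
$d\nu=a\,dt+b(\nu)\circ dY(t)$ the It\^o drift is
$a+\frac{1}{2}b\,\partial b/\partial\nu$. Here
$b=\sqrt{2\kappa\eta}\,\mu^\pm_t\nu^\pm_t$, and $\mu^\pm_t$ obeys an
ordinary differential equation and therefore contributes no correction, so
\[
	\frac{1}{2}\,b\,\frac{\partial b}{\partial\nu^\pm}=
	\frac{1}{2}\,(\sqrt{2\kappa\eta}\,\mu^\pm_t\nu^\pm_t)
	(\sqrt{2\kappa\eta}\,\mu^\pm_t)=
	\kappa\eta\,(\mu^\pm_t)^2\nu^\pm_t
\]
which exactly cancels the term $-\kappa\eta\,(\mu^\pm_t)^2\nu^\pm_t$ in
(\ref{eq:projfs1}) and (\ref{eq:projfs2}).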
1064: Finally, we rewrite the equations in terms of the {\it normalized}
1065: parameters $\mu^\pm$ and $\tilde\nu_t^+=\nu_t^+/(\nu_t^++\nu_t^-)$. This
1066: gives
1067: \begin{eqnarray}
1068: d\tilde\nu^+_t = -\gamma(\tilde\nu^+_t-1/2)\,dt+
1069: \sqrt{2\kappa\eta}\,\tilde\nu^+_t(1-\tilde\nu^+_t)
1070: (\mu^+_t-\mu^-_t)\times
1071: \nonumber\\
1072: \label{eq:pf1}
1073: \phantom{d\tilde\nu^+_t = -\gamma(\tilde\nu^+_t-1/2)}
1074: \{dY(t)-\sqrt{2\kappa\eta}\,[\mu^+_t\tilde\nu^+_t
1075: +\mu^-_t(1-\tilde\nu^+_t)]\,dt\} \\
1076: \label{eq:pf2}
1077: \frac{d\mu^+_t}{dt} = -g-\kappa\mu^+_t+
1078: \frac{\gamma}{2}\frac{1-\tilde\nu^+_t}{\tilde\nu^+_t}
1079: (\mu^-_t-\mu^+_t) \\
1080: \label{eq:pf3}
1081: \frac{d\mu^-_t}{dt} = +g-\kappa\mu^-_t+
1082: \frac{\gamma}{2}\frac{\tilde\nu^+_t}{1-\tilde\nu^+_t}
1083: (\mu^+_t-\mu^-_t)
1084: \end{eqnarray}
1085: Equations (\ref{eq:pf1})--(\ref{eq:pf3}) form the projection filter
1086: for our model on the family $S^{1/2}$.
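
The normalization step leading to (\ref{eq:pf1}) can be sketched as
follows, using the It\^o form of the equations for $\nu^\pm_t$ and the
It\^o rule $dY(t)^2=dt$. The shorthand $s_t=\nu^+_t+\nu^-_t$,
$c=\sqrt{2\kappa\eta}$ and
$m_t=\mu^+_t\tilde\nu^+_t+\mu^-_t(1-\tilde\nu^+_t)$ is introduced only for
this sketch. Summing the two equations gives $ds_t=c\,s_tm_t\,dY(t)$, and
the It\^o quotient rule applied to $\tilde\nu^+_t=\nu^+_t/s_t$ yields
\[
	d\tilde\nu^+_t=
	\frac{\gamma}{2}(1-2\tilde\nu^+_t)\,dt
	+c\,\tilde\nu^+_t(\mu^+_t-m_t)\,dY(t)
	-c^2\,\tilde\nu^+_tm_t(\mu^+_t-m_t)\,dt
\]
Since $\mu^+_t-m_t=(1-\tilde\nu^+_t)(\mu^+_t-\mu^-_t)$, the last two
terms combine into
$c\,\tilde\nu^+_t(1-\tilde\nu^+_t)(\mu^+_t-\mu^-_t)\{dY(t)-c\,m_t\,dt\}$,
which is the innovations form appearing in (\ref{eq:pf1}).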
1087:
1088: Note that equations (\ref{eq:pf2}) and (\ref{eq:pf3}) are singular at
1089: $\tilde\nu^+_t=0$ or $1$. We can trace this back to the fact that we have
1090: cheated a little in the definition of our family of densities.
1091: When $\nu^+=0$ (or $\nu^-=0$), the map $\theta\mapsto q(y,\pm;\theta)$ is
1092: not invertible, as in this case any choice of $\mu^+$ (or $\mu^-$) leads
1093: to the same density. As we have essentially inverted this map to obtain
1094: the equations (\ref{eq:pf1})--(\ref{eq:pf3}) for the parameters, we can
1095: hardly expect these to be well-defined when this map is not invertible.
1096:
1097: Fortunately the points $\tilde\nu^+_t=0$ and $1$ are never reached if
1098: we start the filter with $0<\tilde\nu^+<1$. Hence we can make the filter
1099: well-defined everywhere simply by removing the offending points
$\nu^+=0$ and $\nu^-=0$ from $S^{1/2}$. The map $\theta\mapsto
q(y,\pm;\theta)$ is then invertible everywhere (in other words, the
manifold is then covered by a single chart). Even if we want to consider
1103: starting the filter on $\tilde\nu^+_t=0$ or $1$ at $t=0$ this is not a
1104: problem; the filter dynamics will cause $\tilde\nu^+$ to evolve off the
1105: singular point, so that the filter is well defined after an arbitrarily
1106: small time step \cite{s:vellekoop}.
1107:
1108: \subsection{Connection with the Wonham filter}
1109:
1110: There is a remarkable connection between the projection filter obtained in
1111: the previous section and the theory of jump process filtering. This
1112: theory goes back to the beautiful classic paper by Wonham \cite{s:wonham},
1113: in which the following problem is solved.
1114:
1115: Denote by $x(t)$ a stationary Markovian jump process which switches
1116: between two states $a_-$ and $a_+$ with a rate $\gamma/2$; i.e., $x(t)$ is
1117: a random telegraph signal \cite{s:gardinerhb}. Now suppose we do not have
access to a complete observation of $x(t)$, but only to the corrupted
1119: observation $y(t)$ defined by
1120: \begin{equation}
1121: \label{eq:wonhamobs}
1122: dy(t)=\sqrt{2\kappa\eta}\,x(t)\,dt+dw(t)
1123: \end{equation}
where $w(t)$ is a Wiener process. We can now ask, what is our best guess
1125: of the probability $p_+(t)$ that $x(t)=a_+$, given the observations
1126: $y(s\le t)$? The answer is given in closed form by (a special case of)
1127: the Wonham filter:
1128: \begin{eqnarray}
1129: dp_+(t)=-\gamma[p_+(t)-1/2]\,dt+
1130: \sqrt{2\kappa\eta}\,p_+(t)[1-p_+(t)](a_+-a_-)\times
1131: \nonumber\\
1132: \phantom{dp_+(t)=-\gamma[p_+(t}
1133: \{dy(t)-\sqrt{2\kappa\eta}\,[a_+p_+(t)+a_-(1-p_+(t))]\,dt\}
1134: \end{eqnarray}
1135: But this is exactly (\ref{eq:pf1}) with $\mu^+_t$ and $\mu^-_t$ replaced
1136: by the constants $a_+$ and $a_-$.
1137:
1138: Though intuitively appealing, this is in many ways a remarkable result.
1139: There appear to be no inherent jumps in either the optimal filter
1140: (\ref{eq:dfilt}), or the system-observation pair (\ref{eq:sys}),
1141: (\ref{eq:obs}) from which it was obtained. It is true that we can choose
1142: to observe a jump process in the spontaneous emission channels, as in
1143: (\ref{eq:fullfilt}), but we could have equally chosen to perform homodyne
1144: or heterodyne detection which do not lead to jump process observations.
1145: Nonetheless (\ref{eq:pf1}) emerges naturally from our model, and an
1146: expression of the same form can even be obtained directly from
1147: (\ref{eq:dfilt}) \cite{s:mabuchi}. Evidently there is a deep connection
1148: between our system and the theory of jump processes.
1149:
1150: As a classical filter, we can interpret the projection filter
1151: (\ref{eq:pf1})--(\ref{eq:pf3}) as an {\it adaptive} Wonham filter, where
1152: the equations (\ref{eq:pf2})--(\ref{eq:pf3}) continually adapt the
1153: parameters $a_+$ and $a_-$ in the Wonham filter (\ref{eq:pf1}). A similar
1154: structure was observed in \cite{s:vellekoop}, where the classical problem
1155: of changepoint detection (the detection of a single random jump in white
1156: noise) was treated using the projection filtering approach.
1157:
1158:
1159: \section{Numerical results}
1160: \label{sec:num}
1161:
1162: \begin{figure}
1163: \includegraphics[width=\textwidth]{Compare.eps}
1164: \caption{
1165: \label{f:compare} Comparison between the optimal and projection filters.
1166: A typical observation is shown with $\eta=1$, $g=120$, $\kappa=40$,
1167: $\gamma=20$, and the integration was performed over $25\,000$ time steps.
Panels (a) and (c) in the top row were calculated using the optimal filter
(\ref{eq:qfiltt}); panels (b) and (d) in the bottom row were calculated for
the same observation process using the projection filter
(\ref{eq:pf1})--(\ref{eq:pf3}).}
1172: \end{figure}
1173:
1174: In this section we present the results of numerical simulations of the
1175: various filters. Sample paths of the observation process were generated
1176: by numerically solving (\ref{eq:sse}) with a truncated cavity basis of
1177: $25$ Fock states and the (appropriately truncated) initial state
$|\psi_0\rangle=|-\rangle\otimes|0\rangle$. The observations thus
generated were then filtered using the optimal filter in $Q$-function
1180: form (\ref{eq:qfiltt}), and using the projection filter
1181: (\ref{eq:pf1})--(\ref{eq:pf3}).
1182:
1183: The optimal filter was implemented using a simple finite-difference scheme
1184: \cite{s:nr} on a grid of $128$ equidistant points in the interval
1185: $y\in[-18,18]$, with the appropriately truncated initial condition
1186: corresponding to $|\psi_0\rangle$. Finally, the projection filter was
1187: started with the initial condition $\tilde\nu^+=\mu^+=\mu^-=0$, where care
1188: was taken not to propagate $\mu^+$ until after the first time step. In
1189: all simulations, stochastic integration was performed using the stochastic
1190: Euler method \cite{s:kloeden}.
1191:
1192: In Figure \ref{f:compare} a typical filtered sample path is shown. The
1193: top row was obtained from the optimal filter, while the bottom row was
obtained using the projection filter. The values inferred for both the
1195: conditional probability of finding the atom in the $|+\rangle$ state (left
1196: column) and the conditional expectation of the $y$-quadrature (right
1197: column) are nearly identical for the two filters. Evidently the
1198: projection filter is an extremely good approximation to the optimal
1199: filtering equation.
1200:
1201: \begin{figure}
1202: \includegraphics[width=\textwidth]{Typical.eps}
1203: \caption{
1204: \label{f:typical} A single run of an experiment is simulated. The dashed
1205: line (red) corresponds to the optimal estimate of an observer who has
1206: access to direct photodetection of the spontaneously emitted photons, as
1207: well as homodyne detection of the forward channel. The solid line (green)
1208: is the optimal estimate of a different observer, who only has access to
1209: the homodyne photocurrent, for the same run of the experiment. The dotted
1210: line (blue) is the projection filter estimate based only on the homodyne
1211: photocurrent. All parameters are the same as in Figure \ref{f:compare}.}
1212: \end{figure}
1213:
1214: \begin{figure}
1215: \includegraphics[width=\textwidth]{Atypical.eps}
1216: \caption{
1217: \label{f:atypical} Two more sample paths of the filtered estimate
1218: of atomic state, demonstrating missed jumps (top) and false jumps
1219: (bottom). All parameters are the same as in Figure \ref{f:compare}.}
1220: \end{figure}
1221:
1222: Next, in Figure \ref{f:typical}, we compare the information state of the
1223: optimal and projection filters to the information state that is based on
1224: the additional observation of spontaneously emitted photons. The latter
1225: filter demonstrates the behavior reported in \cite{s:alsing}. Whenever a
1226: photon is observed in one of the side peaks of the Mollow triplet, the
1227: observer infers that the atom has made a jump. The estimated phase of the
1228: cavity field then exponentially decays to a steady-state level of $\langle
1229: y\rangle=\pm g/\kappa$.
1230:
1231: If we do not measure the spontaneous emission, our best guess of the
1232: atomic state still behaves in a jump-like way. However, we see that there
1233: is a little delay between the time that the observer of spontaneous
1234: emission thinks the atom has jumped, and the time that the homodyne
1235: observer comes to the same conclusion. If we identify atomic decay with
1236: the spontaneous emission of a photon, we can now give a fairly
satisfactory answer to the question posed in \cite{s:mabuchi}: `in what
sense should we be able to associate observed phase-switching events with
``actual'' atomic decays?' It appears that an observed phase switch
1240: signals that, had we been making such an observation, we would likely have
1241: seen a spontaneously emitted photon a little while earlier.
1242:
1243: The detection delay is a rather generic property of the type of filtering
1244: problems we are considering \cite{s:vellekoop,s:shiryaev}. Any time we
1245: see a large fluctuation in the observed process, the filter has to decide
1246: whether this is a large fluctuation of the noise, or a large fluctuation
1247: of the observed system. As is pointed out by Shiryaev \cite{s:shiryaev}
1248: in the context of changepoint detection, the filter rides a delicate
1249: balance between minimizing the delay time and minimizing the probability
1250: of ``false alarms''. Decreasing the number of false alarms (by
1251: choosing a different filtering cost) would unavoidably increase the delay
time, and vice versa. In our system, detection errors take the form of
missed jumps and false jumps; both do occur, as can be seen in Figure
\ref{f:atypical}.
1255:
1256: \begin{figure}
1257: \includegraphics[width=\textwidth]{Better.eps}
1258: \caption{
1259: A typical filtered sample path with $\eta=1$, $g=600$, $\kappa=200$,
1260: $\gamma=20$. The integration was performed over $100\,000$ time steps.
1261: The various line types are as in the previous figures.
1262: \label{f:better}
1263: }
1264: \end{figure}
1265:
1266: If we wish to improve the overall quality of detection, we have no
1267: alternative but to increase the signal-to-noise ratio of the observation.
1268: In the case of our system, we can do this by increasing $g$ and $\kappa$
1269: while keeping their ratio fixed (this can be justified by analogy with
1270: (\ref{eq:wonhamobs}): the signal $x$ has fixed magnitude $x=\pm g/\kappa$,
1271: while the signal-to-noise ratio scales as $g/\sqrt{\kappa}$). A simulation
1272: with greatly increased signal-to-noise is shown in Figure \ref{f:better}.
1273: In this very strong coupling and damping regime, it appears that not much
1274: more information can be extracted from observation of the spontaneous
1275: emission than we could have already inferred from the homodyne
1276: photocurrent.
1277:
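As a rough worked illustration of this scaling, suppose that both
parameters are multiplied by a common factor $\lambda>0$, so that
$g=\lambda g_0$ and $\kappa=\lambda\kappa_0$; the reference values $g_0$,
$\kappa_0$ and the factor $\lambda$ are notation assumed for this remark
only. Taking the scalings quoted above at face value,
\[
  \frac{g}{\kappa} = \frac{g_0}{\kappa_0}, \qquad
  \frac{g}{\sqrt{\kappa}} = \sqrt{\lambda}\,\frac{g_0}{\sqrt{\kappa_0}},
\]
so the signal magnitude is unchanged while the signal-to-noise ratio grows
as $\sqrt{\lambda}$. For the parameters of Figure \ref{f:better}, $g=600$
and $\kappa=200$, this gives $g/\kappa=3$ and $g/\sqrt{\kappa}\approx 42$
(in the units of the caption, and up to a constant prefactor that is not
tracked here).
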
1278: \section{Conclusion}
1279:
1280: In this paper, we have suggested that the method of projection filtering
1281: can be very fruitful when applied to quantum filtering theory. Using a
1282: simple model of a strongly coupled two-level atom in a cavity we
1283: numerically demonstrated near-optimal performance of the projection
1284: filter, as is evident from Figures \ref{f:compare}--\ref{f:better}. We
1285: have also shown a connection between this model from cavity QED and the
1286: classical Wonham filter; the projection filter can be interpreted as an
1287: adaptive Wonham filter, applied to a quantum model. In future work we
1288: will develop a ``true'' quantum formalism for projection filtering,
1289: using methods from quantum information geometry.
1290:
1291: The reduction of infinite- or high-dimensional filters to a tractable set
1292: of equations is essential if we wish to perform estimation in real time,
1293: for example in a feedback control loop. In a control-theoretic context,
1294: converting a large, complex system into a set of simple equations is known
1295: as model reduction. Ideally, however, such a procedure should yield some
1296: bounds on the error of approximation; in our case we have observed
1297: numerically that the approximation error is very small, but we have no
1298: rigorous bounds to back up this statement. In classical control theory of
1299: linear systems, balanced truncation \cite{s:dullerud} provides
1300: a very general method for model reduction with guaranteed error bounds.
1301: How to do this effectively for nonlinear systems is still an open problem,
1302: however, both in classical and in quantum theory.
1303:
1304:
1305: \section*{Acknowledgment}
1306:
1307: The authors thank Mike Armen for useful discussions.
1308: This work was supported by the ARO (DAAD19-03-1-0073).
1309:
1310: \section*{References}
1311:
1312: \bibliographystyle{unsrt}
1313: \bibliography{QPF}
1314:
1315: \end{document}
1316:
1317: