1: \documentclass[12pt]{article}
2:
3: \usepackage{amsmath}
4: \usepackage{amssymb}
5: \usepackage{epsfig}
6:
7: \newcommand{\set}[1]{{\mathbb{#1}}}
8: \newcommand{\one}{\mbox{\tt 1}\hspace{-0.057 in}\mbox{\tt l}}
9: \newcommand{\Tr}{\mbox{\rm Tr}}
10: \newcommand{\tr}{\mbox{\rm Tr}}
11:
12: \begin{document}
13:
14: \title{Bayesian updating of a probability distribution encoded
15: on a quantum register}
16:
17: \author{
18: Andrei N. Soklakov and R\"udiger Schack\\
19: \\
20: {\it Department of Mathematics, Royal Holloway,
21: University of London,}\\
22: {\it Egham, Surrey TW20 0EX, United Kingdom}}
23:
24: %\date{\today}
25: \date{15 November 2005}
26: \maketitle
27:
28: \begin{abstract}
29: We investigate the problem of Bayesian updating of a probability
30: distribution encoded in the quantum state of $n$ qubits. The updating
31: procedure takes the form of a quantum algorithm that prepares the quantum
32: register in the state representing the posterior distribution. Depending on
33: how the prior distribution is given, we describe two implementations, one
34: probabilistic and one deterministic, of such an algorithm in the
35: standard model of a quantum computer.
36: \end{abstract}
37:
38: \section{Introduction}
39:
40: Bayes's rule provides a simple and fundamental mechanism for updating a
41: probability distribution in the light of new data~\cite{Bernardo1994}.
42: The rule takes its simplest
43: form for a finite sample space, $\set{H}$, where the elements $h\in\set{H}$
44: can be identified with the atomic events, or {\em hypotheses}. Let $P_{\rm
45: prior}(h)=P(h)$ be the prior probability distribution, and assume some piece
46: of data, $d$, is observed. If $P(d|h)$ is the conditional probability of $d$,
47: given $h$, Bayesian updating consists of replacing the prior with the
posterior distribution, $P_{\rm posterior}(h)=P(h|d)$, where
\begin{equation} \label{ConditionalPD}
P(h|d)=\frac{P(d|h)P(h)}{\sum_{h'\in\set{H}}P(d|h')P(h')}\;.
51: \end{equation}
52:
53: To simplify the notation, we assume from now on that the set of hypotheses is
54: of the form $\set{H}=\{0,\dots,2^n-1\}$ for some positive integer $n$.
55: For $h\in\set{H}$, let $|h\rangle$ denote the computational basis states of
56: a register of $n$ qubits. The state
57: \begin{equation} \label{PriorState}
58: |\Psi_{\rm prior}\rangle
59: =\sum_{h\in\set{H}}\sqrt{P(h)}\,|h\rangle
60: \end{equation}
61: provides an encoding of the prior on the quantum register. Even though the size
62: of the sample space grows exponentially with the number of qubits, $n$, there
63: exists an interesting class of priors for which $|\Psi_{\rm prior}\rangle$ can
64: be prepared efficiently, in the sense that the required computational resources
65: grow only polynomially with $n$ \cite{Grover-0208,Soklakov2005b}.
66:
67: To formulate the problem of Bayesian updating for a prior encoded on a quantum
68: register, we make the assumption that we have a classical algorithm that
69: computes, as a function of $h$, the conditional probability $P(d|h)$ for the
70: observed data $d$. Given this classical algorithm, the goal of Bayesian
71: updating is then to prepare the register in the state
72: \begin{equation} \label{PosteriorState}
73: |\Psi_{\rm posterior}\rangle=\sum_{h\in\set{H}}\sqrt{P(h|d)}\,|h\rangle\;,
74: \end{equation}
75: with $P(h|d)$ given by Eq.~(\ref{ConditionalPD}).
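
As a minimal illustration, with numbers chosen here purely for concreteness,
consider a single qubit, $n=1$, with the uniform prior $P(0)=P(1)=1/2$ and
conditional probabilities $P(d|0)=1/4$, $P(d|1)=3/4$. Bayes's rule then gives
\[
\sum_{h}P(d|h)P(h)=\frac{1}{8}+\frac{3}{8}=\frac{1}{2}\,,\qquad
P(0|d)=\frac{1}{4}\,,\qquad P(1|d)=\frac{3}{4}\,,
\]
so that the prior state $(|0\rangle+|1\rangle)/\sqrt{2}$ has to be transformed
into $|\Psi_{\rm posterior}\rangle=\frac{1}{2}\,|0\rangle+\frac{\sqrt{3}}{2}\,|1\rangle$.
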
76: If the prior is given to us in the form of a single copy of the state
77: $|\Psi_{\rm prior}\rangle$, our problem is equivalent to finding a quantum
78: operation, $M_d$, that maps any prior
79: state of the form~(\ref{PriorState}) into the
80: corresponding posterior state of the form~(\ref{PosteriorState}),
81: \begin{equation}
82: M_d|\Psi_{\rm prior}\rangle=|\Psi_{\rm posterior}\rangle\;.
83: \end{equation}
84: It is easy to see that $M_d$ cannot in general be a trace-preserving map.
85: For example, consider the two prior states
86: \begin{equation} \label{ExamplePriors}
87: |\Psi_{\rm prior}^1\rangle=\frac{1}{\sqrt{2}}(|1\rangle+|2\rangle)\,,\ \ \
88: |\Psi_{\rm prior}^2\rangle=\frac{1}{\sqrt{2}}(|2\rangle+|3\rangle)\,,
89: \end{equation}
90: corresponding to two different prior probability distributions,
91: and assume that the conditional probability distribution is given by
92: \begin{equation}
93: P(d|h)=\left\{\begin{array}{ll}
94: 0 & {\rm if\ }h= 2\,,\cr
95: c\neq 0 & {\rm otherwise}\,,
96: \end{array}
97: \right.
98: \end{equation}
99: where $c$ is a constant determined by normalization.
100: Although the prior states (\ref{ExamplePriors})
101: are nonorthogonal, we obtain mutually orthogonal
102: posterior states
103: \begin{equation}
104: |\Psi_{\rm posterior}^1\rangle=M_d|\Psi_{\rm prior}^1\rangle=|1\rangle\;,\;\;\;
105: |\Psi_{\rm posterior}^2\rangle=M_d|\Psi_{\rm prior}^2\rangle=|3\rangle\;,
106: \end{equation}
which implies that $M_d$ is trace-decreasing, since a trace-preserving quantum
operation cannot reduce the overlap between two states. Bayesian updating of a
single copy of $|\Psi_{\rm prior}\rangle$ is therefore generally probabilistic.
Section~\ref{sec:ProbabilisticAlgorithms} of this paper discusses probabilistic
Bayesian updating.
110:
111: A deterministic updating scheme is possible, however, if the prior is given in
112: the form of a unitary quantum circuit that maps a standard state, assumed for
simplicity to be the computational basis state $|0\rangle$, to $|\Psi_{\rm
prior}\rangle$. Deterministic updating is the topic of
Section~\ref{sec:DeterministicAlgorithms}.
115:
116:
117: \section{Probabilistic algorithms}
118: \label{sec:ProbabilisticAlgorithms}
119:
As we have shown above, there is in general no trace-preserving
quantum operation that transforms every prior state
into the corresponding posterior state. To
realize probabilistic Bayesian updating, we proceed as follows.
124: Define
125: \begin{equation} \label{definition:E_0}
126: E_1=C\sum_{h\in\set{S}_{\rm pr}}\sqrt{P(d|h)}\,|h\rangle\langle h|\,,
127: \end{equation}
128: where $C$ is a constant and $\set{S}_{\rm pr}$ is
129: a set containing the support of the
130: prior probability distribution. We see that
131: \begin{equation}
132: E_1|\Psi_{\rm prior}\rangle\propto|\Psi_{\rm posterior}\rangle\,.
133: \end{equation}
134: For sufficiently small $|C|$, see Eq.~(\ref{BoundOnC}) below,
135: one can view $E_1$ as an
element of a trace-preserving quantum operation
137: ${\cal E}$ defined, for arbitrary $\rho$, by
138: \begin{equation}
139: {\cal E}(\rho)=\sum_{k=0}^1E_k\rho E_k^\dag=\sum_{k=0}^1 p_k\rho(k)\,,
140: \end{equation}
141: where
142: \begin{equation}
143: p_k=\Tr(E_k\rho E_k^\dag)\ \ \ \ {\rm and}\ \ \ \
144: \rho(k)=E_k\rho E_k^\dag/p_k\,.
145: \end{equation}
146: This decomposition shows that the operation ${\cal E}$
147: can be realized as a measurement with outcomes $k=0,1$, where
148: each outcome $k$ happens with probability $p_k$ and the
149: corresponding conditional density matrix is $\rho(k)$.
150: Substituting $\rho=|\Psi_{\rm prior}\rangle\langle\Psi_{\rm prior}|$
151: we see that the measurement outcome $1$ corresponds
152: to successful Bayesian updating. This
153: happens with probability
154: \begin{equation} \label{p0}
155: p_1=\langle\Psi_{\rm prior}|E_1^\dag E_1|\Psi_{\rm prior}\rangle
156: =C^2\sum_{h}P(h)P(d|h)=C^2P(d)\,.
157: \end{equation}
158: In order to obtain a bound on $C$, we note that
159: \begin{equation} \label{E1_squared}
160: E_0^\dag E_0=\one-E_1^\dag E_1=\one-C^2\sum_{h\in\set{S}_{\rm pr}}P(d|h)\,|h\rangle\langle h|\,.
161: \end{equation}
162: Using the positivity of $E_0^\dag E_0$, we find
163: \begin{equation}
164: C^2\leq\left(\sum_{h\in\set{S}_{\rm pr}} P(d|h)\,|\langle v|h\rangle|^2\right)^{-1}
165: \end{equation}
for any normalized vector $|v\rangle$.
167:
168: Now let $h^*$ be such that $P(d|h^*)=\max_{h\in\set{S}_{\rm pr}} P(d|h)$.
169: Since the above
170: condition is valid for any $|v\rangle$, one can choose
171: $|v\rangle=|h^*\rangle$ and obtain
172: \begin{equation} \label{BoundOnC}
173: C^2\leq 1/\max_{h\in\set{S}_{\rm pr}}P(d|h)\,.
174: \end{equation}
175: Together with Eq.~(\ref{p0}) this gives an upper bound
176: on the success probability of Bayesian updating
177: \begin{equation} \label{SuccessProbabilityBound}
178: p_1\leq\frac{P(d)}{\max_{h\in\set{S}_{\rm pr}}P(d|h)}\,.
179: \end{equation}
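
To see that the bound on $C$ cannot be relaxed, consider an illustrative
single-qubit case (values assumed here for concreteness) with
$P(0)=P(1)=1/2$, $P(d|0)=1/4$ and $P(d|1)=3/4$, so that $P(d)=1/2$ and the
bound on the success probability is $2/3$. Choosing $C^2=4/3$ gives
\[
E_1=\frac{1}{\sqrt{3}}\,|0\rangle\langle 0|+|1\rangle\langle 1|\,,\qquad
E_0^\dag E_0=\one-E_1^\dag E_1=\frac{2}{3}\,|0\rangle\langle 0|\,,
\]
which is positive but has a zero eigenvalue, so that any larger value of $C^2$
would violate positivity. In this case
\[
E_1|\Psi_{\rm prior}\rangle
=\frac{1}{\sqrt{6}}\,|0\rangle+\frac{1}{\sqrt{2}}\,|1\rangle
=\sqrt{\frac{2}{3}}\,\Big(\frac{1}{2}\,|0\rangle+\frac{\sqrt{3}}{2}\,|1\rangle\Big)\,,
\]
i.e., outcome $1$ occurs with probability $2/3$ and leaves the register in the
posterior state for these values.
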
180: In the next subsection we describe an explicit algorithm that achieves this bound.
181:
182: \subsection{Explicit algorithm}\label{subsec:ExampleAlgorithm}
183:
184: The operation ${\cal E}$ can be realized as a modification of a procedure
185: proposed by Rudolph~\cite{RudolphPrivate} as follows. First we prepare the
186: product of the prior state and an auxiliary qubit state, $|\Psi_{\rm
187: prior}\rangle|0\rangle$. Then, using the classical algorithm for computing
188: $P(d|h)$, one can construct a quantum circuit $U_d$ that performs a
189: conditional rotation of an auxiliary qubit so that
190: \begin{equation} \label{U_d}
191: U_d |\Psi_{\rm prior}\rangle|0\rangle
192: =\sum_{h}\sqrt{P(h)}|h\rangle\Big(A_1(h)|0\rangle+B_1(h)|1\rangle\Big)\,,
193: \end{equation}
194: where
195: \begin{equation} \label{A_1}
196: A_1(h)=c_1\sqrt{P(d|h)},\ \ \ B_1^2=1-A_1^2=1-c_1^2P(d|h)\,,
197: \end{equation}
198: and $c_1$ is a constant. Then measuring the auxiliary qubit
199: we obtain the desired state $|\Psi_{\rm posterior}\rangle|0\rangle$
200: with probability
201: \begin{equation}
202: p_1=c_1^2\sum_h P(h)P(d|h)=c_1^2P(d)\,.
203: \end{equation}
Inspecting Eqs.~(\ref{U_d}) and (\ref{A_1}), we see that we can set
$c_1^2=1/\max_{h\in\set{S}_{\rm pr}}P(d|h)$.
With this
setting, $p_1$ achieves the theoretical bound on the
success probability, Eq.~(\ref{SuccessProbabilityBound}).
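
As a sketch of how the conditional rotation acts, take the illustrative values
(assumed here, not part of the algorithm) $P(0)=P(1)=1/2$, $P(d|0)=1/4$,
$P(d|1)=3/4$, and $c_1^2=4/3$. Then $A_1(0)=1/\sqrt{3}$, $B_1(0)=\sqrt{2/3}$,
$A_1(1)=1$, $B_1(1)=0$, and
\[
U_d|\Psi_{\rm prior}\rangle|0\rangle
=\frac{1}{\sqrt{2}}\,|0\rangle\Big(\frac{1}{\sqrt{3}}\,|0\rangle
+\sqrt{\frac{2}{3}}\,|1\rangle\Big)
+\frac{1}{\sqrt{2}}\,|1\rangle|0\rangle\,.
\]
The auxiliary qubit is found in the state $|0\rangle$ with probability
$\frac{1}{2}\cdot\frac{1}{3}+\frac{1}{2}=\frac{2}{3}$, and the register is then
left in the state $\frac{1}{2}\,|0\rangle+\frac{\sqrt{3}}{2}\,|1\rangle$, which
is the posterior state for these values.
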
209:
In the above algorithm, the maximal success probability
can be achieved only if
the value of $\max_{h\in\set{S}_{\rm pr}}P(d|h)$
is known. Lack of this knowledge
does not prevent
us from using the algorithm, however, since we can always
use the trivial setting $c_1^2=1$. The price to pay is
a smaller success probability.
218:
219: An intermediate situation occurs if a nontrivial upper bound on
220: $P(d|h)$ is known, i.e., a constant
221: $M$ such that $\max_{h\in\set{S}_{\rm pr}}P(d|h)<M<1$.
222: One can then set $c_1^2=1/M$, which improves the success probability compared
223: to the trivial setting.
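
To compare the three settings numerically, assume for illustration that
$P(d)=1/2$ and $\max_{h\in\set{S}_{\rm pr}}P(d|h)=3/4$. Using
$p_1=c_1^2P(d)$, one finds
\[
c_1^2=1:\ \ p_1=\frac{1}{2}\,,\qquad
c_1^2=\frac{1}{M}=\frac{5}{4}:\ \ p_1=\frac{5}{8}\,,\qquad
c_1^2=\frac{4}{3}:\ \ p_1=\frac{2}{3}\,,
\]
where the middle case corresponds to the assumed bound $M=4/5$ and the last
case to exact knowledge of the maximum.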
224:
225: \subsection{Iterative algorithm} \label{subsec:ExpensivePriors}
226:
227: Let $M_1$ be an
228: upper bound on $\max_{h\in\set{S}_{\rm pr}}P(d|h)$.
229: Imagine that at the beginning we do not have
230: enough information about $P(d|h)$ and $P(h)$
231: to calculate a nontrivial value for $M_1$.
232: In other words, we have to assume that $M_1=1$.
233: Imagine also that we expect to acquire a
234: better bound $M_2<M_1$ in the future.
235: We will now address the following question: Can we run the
236: probabilistic algorithm of Sec.~\ref{subsec:ExampleAlgorithm}
237: first with the trivial bound $M_1=1$, and later with the improved bound
238: $M_2$, without reducing the overall success probability
239: that can be achieved by running the algorithm once with the bound $M_2$?
240: We will find that this is indeed the case.
241: This result remains true for a sequence of bounds, $M_k<M_{k-1}<\dots<M_1$.
242: Below we describe an iterative version of the above algorithm that
243: makes use of better bounds as they become available.
244:
245: Consider the measurement part of the algorithm of
246: Sec.~\ref{subsec:ExampleAlgorithm}.
247: If the measurement fails, which happens with probability $1-p_1$, we
248: end up with the state
249: \begin{equation} \label{psi_1}
250: |\psi_1\rangle= \Big( N_1\sum_h\sqrt{P(h)}B_1(h)|h\rangle\Big) |1\rangle\,,
251: \ \ \ \ N_1^{-2}=1-c_1^2P(d)\,,
252: \end{equation}
253: where we might have set $c_1^2=1/M_1$ to maximize $p_1$.
254: Since we know the
255: exact form of $|\psi_1\rangle$ we may attempt to achieve
256: our original goal by
257: performing a transformation
258: \begin{equation} \label{second_attempt}
259: |\psi_1\rangle\longrightarrow N_1\sum_h\sqrt{P(h)}B_1(h)|h\rangle
260: \Big(A_2(h)|0\rangle+\frac{B_2(h)}{B_1(h)}|1\rangle\Big)\,,
261: \end{equation}
262: where we set
263: \begin{equation}
264: A_2(h)=c_2\frac{\sqrt{P(d|h)}}{B_1(h)}\,,\ \ \ \ B_2^2=(1-A_2^2)B_1^2
265: =B_1^2-c_2^2P(d|h)\,,
266: \end{equation}
and $c_2$ is a constant. It is important to note
that this procedure should not be attempted
when $c_1^2$ was set to $1/M_1$ and $M_1$ is
still the best available bound. This is because,
in the worst case, there will be at least one hypothesis
$h^*$ which is present in the sum in Eq.~(\ref{second_attempt})
with $B_1(h^*)=0$ and $A_2(h^*)>1$. It follows that
the above procedure should only be applied if
a better bound $M_2<M_1$ has become available (or when $c_1^2<1/M_1$).
In this case,
277: measurement of the auxiliary qubit
278: yields the desired state $|\Psi_{\rm posterior}\rangle|0\rangle$
279: with probability
280: \begin{equation}
281: p_2=N_1^2c_2^2\sum_hP(h)P(d|h)=\frac{c_2^2P(d)}{1-c_1^2P(d)}\,.
282: \end{equation}
283: Alternatively, with probability $1-p_2$, we may end up with the state
284: \begin{equation}
285: |\psi_2\rangle=\Big( N_2\sum_h\sqrt{P(h)}B_2(h)|h\rangle\Big) |1\rangle\,.
286: \end{equation}
287: This state is similar in structure to the state $|\psi_1\rangle$
288: so we may try to recover in the same way
289: by performing the transformation
290: \begin{equation}
291: |\psi_2\rangle\longrightarrow N_2\sum_h\sqrt{P(h)}B_2(h)|h\rangle
292: \Big(A_3(h)|0\rangle+\frac{B_3(h)}{B_2(h)}|1\rangle\Big)\,,
293: \end{equation}
followed by a measurement of the auxiliary qubit, in complete analogy
with our earlier analysis. By continuing this procedure we obtain
296: the sequence of success probabilities $p_1,p_2,\dots$
297: together with the coefficients $\{A_k^2\}$ and $\{B_k^2\}$.
298: We have
299: \begin{equation} \label{AkBk}
300: A_k(h)=c_k\frac{\sqrt{P(d|h)}}{B_{k-1}(h)}\,,\ \ \ \ B_k^2=B_{k-1}^2-c_k^2P(d|h)\,,
301: \end{equation}
302: and
303: \begin{equation} \label{p_k1}
304: p_k=\frac{c_k^2P(d)}{\langle B_{k-2}^2\rangle-c_{k-1}^2P(d)}\,,
305: \end{equation}
306: where $B_{-1}^2=B_0^2=1$, $c_0^2=0$ and
307: \begin{equation} \label{Baverage}
308: \langle B_k^2\rangle=\sum_h P(h) B_k^2(h)\,.
309: \end{equation}
The constants $\{c_k\}$ are the only free parameters in this
algorithm. As we have seen in the case $k=1$,
they cannot be chosen
arbitrarily, and the optimal choice for them depends on the
sequence $\{M_k\}$.
315: From Eq.(\ref{AkBk}) we obtain
316: \begin{equation} \label{BfromCs}
317: B_k^2=1-P(d|h)\sum_{s=1}^k c_s^2\geq 0\,,
318: \end{equation}
319: and therefore
320: \begin{equation}
321: \sum_{s=1}^kc_s^2\leq\frac{1}{P(d|h)}\,.
322: \end{equation}
323: This condition must be satisfied for all $h$
324: in the support of the prior
325: and so we have
326: \begin{equation} \label{BoundOnCs}
327: \sum_{s=1}^kc_s^2\leq\frac{1}{\max_{h\in \set{S}_{\rm pr}} P(d|h)}\,.
328: \end{equation}
329: From Eqs.~(\ref{Baverage}) and (\ref{BfromCs})
330: we compute
331: \begin{equation}
332: \langle B_{k-2}^2\rangle=1-P(d)\sum_{s=1}^{k-2}c_s^2\,.
333: \end{equation}
334: Together with Eq.~(\ref{p_k1}), this implies
335: \begin{equation}
336: p_k=\frac{P(d)c_k^2}{1-P(d)\sum_{s=1}^{k-1}c_s^2}\,.
337: \end{equation}
338: The probability that the algorithm is not
339: successful after the $n$th stage is
340: given by
341: \begin{equation}
342: P_{\rm fail}^n=\prod_{k=1}^n(1-p_k)=1-P(d)\sum_{s=1}^nc_s^2\,,
343: \end{equation}
344: which gives the corresponding success probability
345: \begin{equation}
346: P_{\rm succ}^n=1-P_{\rm fail}^n=P(d)\sum_{s=1}^nc_s^2
347: \leq P(d)/\max_{h\in\set{S}_{\rm pr}}P(d|h)\,,
348: \end{equation}
349: where we used the inequality~(\ref{BoundOnCs}).
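
The product in the expression for $P_{\rm fail}^n$ telescopes. Writing
$\sigma_k=\sum_{s=1}^kc_s^2$ with $\sigma_0=0$ (a shorthand introduced only
for this remark), the expression for $p_k$ gives
\[
1-p_k=\frac{1-P(d)\,\sigma_k}{1-P(d)\,\sigma_{k-1}}\,,\qquad
\prod_{k=1}^n(1-p_k)=\frac{1-P(d)\,\sigma_n}{1-P(d)\,\sigma_0}
=1-P(d)\,\sigma_n\,.
\]
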
We see that the theoretical bound for
the overall success probability of
transforming one copy of the prior state
$|\Psi_{\rm prior}\rangle$ into one copy of the posterior
state
$|\Psi_{\rm posterior}\rangle$ is achieved
as long as at some stage $n$ of the algorithm
we have
358: \begin{equation} \label{BoundOnCs2}
359: \sum_{s=1}^nc_s^2=\frac{1}{\max_{h\in \set{S}_{\rm pr}} P(d|h)}\,.
360: \end{equation}
361: Given the sequence of upper bounds $M_1>M_2>\dots>M_k$, and
362: assuming that the information in the first $k-1$ of them
363: was already used without success, the optimal value $c_k^2$ for the next
364: iteration of the algorithm, which takes into account the bound $M_k$, can be
365: calculated as
366: \begin{equation}
367: c_k^2=\frac{1}{M_k}-\sum_{s=1}^{k-1}c_s^2=\frac{1}{M_k}-\frac{1}{M_{k-1}}\,.
368: \end{equation}
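
As a worked example under assumed values, let $P(d)=1/2$ and
$\max_{h\in\set{S}_{\rm pr}}P(d|h)=3/4$, and suppose the bounds become
available in the order $M_1=1$, $M_2=3/4$. Then $c_1^2=1$ and
$c_2^2=4/3-1=1/3$, so that
\[
p_1=\frac{1}{2}\,,\qquad
p_2=\frac{c_2^2P(d)}{1-c_1^2P(d)}=\frac{1/6}{1/2}=\frac{1}{3}\,,\qquad
P_{\rm succ}^{2}=1-(1-p_1)(1-p_2)=1-\frac{1}{2}\cdot\frac{2}{3}=\frac{2}{3}\,.
\]
This equals $P(d)/\max_{h\in\set{S}_{\rm pr}}P(d|h)$, the value obtained by
running the algorithm once with $c_1^2=1/M_2$.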
369:
370:
371: \section{Deterministic updating}
372: \label{sec:DeterministicAlgorithms}
373:
374: In this section we will assume that the prior is given in the form of a
375: unitary quantum circuit, $U$, that maps the computational basis state
376: $|0\rangle$, to the prior state. Apart from the constraint
377: $U|0\rangle=|\Psi_{\rm prior}\rangle$, $U$ is arbitrary. We first give an
378: algorithm for the special case of hypothesis elimination and then show how to
379: extend it to two-valued and more general models.
380:
381: \subsection{Hypothesis elimination}
382:
383: Imagine the situation where each piece of data $d$ partitions the set of
384: hypotheses $\set{H}$ into two subsets: $\set{H}_d$
385: containing all hypotheses that are consistent with $d$, and
386: $\set{H}\,\backslash\,\set{H}_d$ containing all hypotheses that are rejected
387: by the data $d$. This leads to a special case of Bayesian
388: updating
389: where $P(d|h)$ takes only two different values~\cite{Soklakov-0412},
390: \begin{equation}
391: P(d|h)=\left\{\begin{array}{ll}
392: 1/|\set{H}_d| & {\rm if\ }h\in\set{H}_d\,,\cr
393: 0 & {\rm otherwise}\,,
394: \end{array}
395: \right.
396: \end{equation}
397: where $|\set{H}_d|$ is the number of hypotheses that are consistent with the
398: data $d$. The posterior state~(\ref{PosteriorState}) takes the simple form
399: \begin{equation}
400: |\Psi_{\rm posterior}\rangle
401: =N\sum_{h\in\set{H}_d}\sqrt{P(h)}|h\rangle\,,
402: \end{equation}
403: where $N$ is the normalization factor.
404:
405: Using the given classical algorithm for computing $P(d|h)$, we define a
406: quantum oracle, $O_d$, as
407: \begin{equation}
408: O_d|h\rangle=\left\{\begin{array}{ll}
409: -|h\rangle & {\rm if\ }h\in\set{H}_d\,,\cr
410: |h\rangle & {\rm otherwise}\,.
411: \end{array}
412: \right.
413: \end{equation}
414: Furthermore, let $\Pi$ be a conditional phase shift defined by
415: \begin{equation}
416: \Pi|h\rangle=\left\{\begin{array}{ll}
417: -|h\rangle & {\rm if\ }h\neq0\,,\cr
418: |h\rangle & {\rm if\ }h =0\,.
419: \end{array}
420: \right.
421: \end{equation}
422: These operations are combined with $U$ to form an operation, ${\cal A}$,
423: defined by \cite{Brassard}
424: \begin{equation}
{\cal A} = U\,\Pi\,U^{-1}O_d \;.
426: \end{equation}
427: The circuit for ${\cal A}$ is the basic block of the quantum algorithm to
428: prepare $|\Psi_{\rm posterior}\rangle$.
429:
430: It will be convenient to rewrite the prior state~(\ref{PriorState})
431: in the form
432: \begin{equation} \label{Prior}
433: |\Psi_{\rm prior}\rangle=\sin\frac{\vartheta}{2}\;|\alpha\rangle
434: +\cos\frac{\vartheta}{2}\;|\beta\rangle\,,
435: \end{equation}
436: where
437: \begin{equation}
438: |\alpha\rangle
439: = S_{\set{H}_d}^{-1/2}
440: \sum_{h\in\set{H}_d} \sqrt{P(h)}\,|h\rangle\,,\ \ \ \ \ \
441: S_{\set{H}_d}=\sum_{h\in\set{H}_d}P(h)\,, \label{SHd}
442: \end{equation}
443: \begin{equation}
444: |\beta\rangle=S_{\set{H}\,\backslash\set{H}_d}^{-1/2}
445: \sum_{h\in\set{H}\, \backslash \set{H}_d} \sqrt{P(h)}\,|h\rangle\,,\ \ \ \ \ \
446: S_{\set{H}\,\backslash\set{H}_d}=\sum_{h\in\set{H}\,\backslash\set{H}_d}P(h)\,,
447: \end{equation}
448: and
449: \begin{equation} \label{SinVartheta}
450: \sin\frac{\vartheta}{2}=\sqrt{S_{\set{H}_d}}\;.
451: \end{equation}
452: The last equation shows that knowing the total
453: prior probability of the hypotheses that are
454: consistent with the data $d$ is equivalent
455: to knowing the value of $\vartheta$.
456:
457: It can now be shown that repeated application of the circuit
458: ${\cal A}$ takes $|\Psi_{\rm prior}\rangle$
459: through the sequence of states
460: \begin{equation}
461: {\cal A}^{k}|\Psi_{\rm prior}\rangle
462: = \sin\left(\frac{2k+1}{2}\vartheta\right)\,|\alpha\rangle
463: +\cos\left(\frac{2k+1}{2}\vartheta\right)\,|\beta\rangle\,.
464: \end{equation}
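
A sketch of the argument for a single step is as follows. The oracle $O_d$
flips the sign of the $|\alpha\rangle$ component, and, since
$\Pi=2|0\rangle\langle 0|-\one$, conjugating $\Pi$ with the circuit that
prepares the prior yields the reflection
$R=2|\Psi_{\rm prior}\rangle\langle\Psi_{\rm prior}|-\one$ (the symbol $R$ is
introduced here for convenience). For a state
$|\phi\rangle=\sin\varphi\,|\alpha\rangle+\cos\varphi\,|\beta\rangle$ one has
$\langle\Psi_{\rm prior}|O_d|\phi\rangle=\cos(\varphi+\vartheta/2)$, and hence
\[
R\,O_d|\phi\rangle
=\Big(2\cos\big(\varphi+\tfrac{\vartheta}{2}\big)\sin\tfrac{\vartheta}{2}+\sin\varphi\Big)|\alpha\rangle
+\Big(2\cos\big(\varphi+\tfrac{\vartheta}{2}\big)\cos\tfrac{\vartheta}{2}-\cos\varphi\Big)|\beta\rangle
=\sin(\varphi+\vartheta)\,|\alpha\rangle+\cos(\varphi+\vartheta)\,|\beta\rangle\,.
\]
Each application therefore advances the angle by $\vartheta$, starting from
$\varphi=\vartheta/2$.
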
The number of applications of ${\cal A}$,
$T$, that
achieves the required transformation,
468: \begin{equation} \label{TBayesian}
469: {\cal A}^{T}|\Psi_{\rm prior}\rangle
470: =|\alpha\rangle=|\Psi_{\rm posterior}\rangle\,,
471: \end{equation}
472: is therefore
473: \begin{equation}
474: T=(\pi/\vartheta-1)/2\,.
475: \end{equation}
476: If $T$ is not an integer, there are two possibilities. Either one uses the
477: closest integer approximation to $T$ and includes the effect of the noninteger
478: part in the fidelity analysis (see below), or one follows $\lfloor T\rfloor$
479: applications of ${\cal A}$ with one application of a modified version of
480: ${\cal A}$ where phases are shifted by less than $e^{i\pi}$ in both $O_d$ and
481: $\Pi$ \cite{Tim}.
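
For instance, if the hypotheses consistent with $d$ carry the total prior
probability $S_{\set{H}_d}=1/4$ (a value assumed here for illustration), then
$\sin(\vartheta/2)=1/2$, $\vartheta=\pi/3$ and $T=(3-1)/2=1$, so that a single
application of ${\cal A}$ produces
$\sin(\pi/2)\,|\alpha\rangle+\cos(\pi/2)\,|\beta\rangle=|\alpha\rangle$
exactly. For $S_{\set{H}_d}=1/10$, in contrast,
\[
\vartheta=2\arcsin\sqrt{1/10}\approx 0.6435\,,\qquad
T\approx 1.94\,,
\]
and rounding to two applications gives the overlap
$\sin(5\vartheta/2)\approx 0.9993$ with the posterior state.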
482:
483: In order to compute the number of iterations, $T$, the value of $\vartheta$
484: must be known. To obtain $\vartheta$, a version of the standard phase
485: estimation algorithm \cite{Nielsen2000b} can be used as illustrated in
486: Figure \ref{figure1}.
487:
\begin{figure}[htbp]
489: \begin{center}
490: \epsfig{file=QIntegration.eps,width=12cm}
491: \end{center}
492: \caption{This is the standard phase-estimation circuit applied to the
493: hypothesis-elimination operator ${\cal A}$. A measurement of the upper
494: $t$-qubit register returns the value of $\vartheta$ with an accuracy of $m$
495: bits and a probability of success of at least $1-\epsilon$, where $m$ and
496: $\epsilon$ are related to each other and to $t$ via the condition
$t=m+\lceil \log_2(2+1/(2\epsilon)) \rceil$. The gates labeled $H^{\otimes t}$
498: and $FT$ are the $t$-qubit Hadamard and quantum Fourier transforms,
499: respectively. }
500: \label{figure1}
501: \end{figure}
502:
503: To calculate the effect of an error in
504: the value of $\vartheta$ on the fidelity of the Bayesian
505: transformation~(\ref{TBayesian}), we assume that there is an upper bound on
506: the absolute error,
507: \begin{equation}
508: \Delta\vartheta\geq |\vartheta-\tilde{\vartheta}|\;,
509: \end{equation}
510: where $\tilde{\vartheta}$ denotes the approximate value. With the definition
511: $\tilde{T}=(\pi/\tilde{\vartheta}-1)/2$, the fidelity is
512: \begin{equation}
513: F=|\langle\Psi_{\rm posterior}|{\cal A}^{\tilde{T}}|\Psi_{\rm prior}\rangle|
514: =\sin\Big(\frac{2\tilde{T}+1}{2}\vartheta\Big)\,.
515: \end{equation}
516: Substituting $\vartheta=\tilde{\vartheta}\pm\Delta\vartheta$
517: and using the relation $(2\tilde{T}+1)\tilde{\vartheta}=\pi$
518: we obtain
519: \begin{equation} \label{FidelityBound}
520: F=\cos\Big(\frac{2\tilde{T}+1}{2}\Delta\vartheta\Big)
521: =\cos\frac{\pi\Delta\vartheta}{2\tilde{\vartheta}}
522: \geq 1-\Big(\frac{\pi\Delta\vartheta}{2\tilde{\vartheta}}\Big)^2\,.
523: \end{equation}
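
As a numerical illustration with assumed values, let
$\tilde{\vartheta}=\pi/3$ and suppose the actual error equals
$\Delta\vartheta=0.01$. Then
\[
\frac{\pi\Delta\vartheta}{2\tilde{\vartheta}}=\frac{3}{2}\,\Delta\vartheta=0.015\,,
\qquad
F=\cos(0.015)\approx 0.99989\,,
\]
which is consistent with the lower bound $1-0.015^2=0.999775$.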
524:
525:
526: \subsection{Two-valued models} \label{sec:suppression}
527:
528: A straightforward generalization of hypothesis elimination is provided by
529: a two-valued conditional probability of the form
530: \begin{equation} \label{SingleStepModel}
531: P(d|h)=\left\{\begin{array}{ll}
532: a_1 & {\rm if\ }h\in\set{H}_d\,,\cr
533: a_2 & {\rm otherwise}\,,
534: \end{array}
535: \right.
536: \end{equation}
537: where $a_1>a_2$ are constants, and $\set{H}_d$ is the set of
538: hypotheses favored by the data $d$. The {\em suppression coefficient\/}
539: $r=a_1/a_2$ measures how much hypotheses in $\set{H}_d$ are favored by the
data. As before, the prior state can be written in the
form of Eq.~(\ref{Prior}),
542: \begin{equation}
543: |\Psi_{\rm prior}\rangle=\sin\frac{\vartheta}{2}\;|\alpha\rangle
544: +\cos\frac{\vartheta}{2}\;|\beta\rangle\,,
545: \end{equation}
546: and for the posterior state we calculate
547: \begin{equation}
548: |\Psi_{\rm posterior}\rangle=\sqrt{a_1}\,\sin\frac{\vartheta}{2}\;|\alpha\rangle
549: +\sqrt{a_2}\,\cos\frac{\vartheta}{2}\;|\beta\rangle\,.
550: \end{equation}
551: Normalization of the posterior state implies that
552: \begin{equation}
553: a_2=\frac{1}{r\sin^2(\vartheta/2)+\cos^2(\vartheta/2)}\,.
554: \end{equation}
555: Defining $\vartheta'$ so that
556: \begin{equation}
557: \cos\frac{\vartheta'}{2}= \sqrt{a_2}\,\cos\frac{\vartheta}{2}
558: =\frac{\cos(\vartheta/2)}{\sqrt{r\sin^2(\vartheta/2)+\cos^2(\vartheta/2)}}\,,
559: \end{equation}
560: the number of iterations $T$ necessary to transform $|\Psi_{\rm prior}\rangle$
561: into $|\Psi_{\rm posterior}\rangle={\cal A}^T|\Psi_{\rm prior}\rangle$ can then
562: be calculated as
563: \begin{equation}
564: T(\vartheta,r)=(\vartheta'/\vartheta-1)/2\,.
565: \end{equation}
566: It follows that knowledge of $\vartheta$ and the suppression coefficient $r$
567: is sufficient for a deterministic implementation of Bayesian updating with the
568: conditional distribution~(\ref{SingleStepModel}). As before, the
569: value of $\vartheta$ can be obtained using the algorithm of
Figure~\ref{figure1}, and the same fidelity bound~(\ref{FidelityBound}) can be
571: used.
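
As an example with assumed values, let the hypotheses in $\set{H}_d$ carry
the total prior probability $\sin^2(\vartheta/2)=1/4$, so that
$\vartheta=\pi/3$, and let $r=3$. Then
\[
r\sin^2\frac{\vartheta}{2}+\cos^2\frac{\vartheta}{2}=\frac{3}{2}\,,\qquad
\cos\frac{\vartheta'}{2}=\sqrt{\frac{2}{3}}\cdot\frac{\sqrt{3}}{2}=\frac{1}{\sqrt{2}}\,,\qquad
\vartheta'=\frac{\pi}{2}\,,\qquad
T=\frac{1}{2}\Big(\frac{3}{2}-1\Big)=\frac{1}{4}\,.
\]
The posterior assigns the probability $\sin^2(\vartheta'/2)=1/2$ to
$\set{H}_d$, in agreement with a direct application of Bayes's rule,
$3\cdot\frac{1}{4}/(3\cdot\frac{1}{4}+\frac{3}{4})=\frac{1}{2}$. Since $T$ is
not an integer in this example, the rotation cannot be completed with whole
applications of ${\cal A}$ alone, and an iteration with reduced phase shifts
would be needed.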
572:
573: \subsection{Bayesian updating: general models}
574:
575: In this section we show how to generalize the above algorithm
576: to the case of Bayesian updating with a general model,
577: i.e., a general conditional distribution $P(d|h)$.
578: The main idea is to
579: represent $P(d|h)$ as a product of two-valued models
580: with known suppression coefficients. Bayesian updating
581: with $P(d|h)$ can then be viewed as a sequence of Bayesian
582: updatings for the two-valued models.
583:
584: Let $C_k(h)$ be the coefficients in the binary expansion
585: of $\log_2P(d|h)$,
586: \begin{equation}
587: \log_2 P(d|h)=\sum_{k=1}^\infty C_k(h)\,2^{-k}\,.
588: \end{equation}
589: This allows us to express $P(d|h)$ as a product,
590: \begin{equation}
591: P(d|h)=\prod_{k=1}^{\infty}\sqrt[2^k]{2^{C_k(h)}}\,.
592: \end{equation}
593: Let $\set{H}_{d_k}$ be the set of hypotheses $\{h\}$ for which $C_k(h)=1$.
594: The $k$th term in this product is either $\sqrt[2^k]{2}$ or $1$ depending on
595: whether $h$ is in $\set{H}_{d_k}$ or not. Bayesian updating with the
596: conditional probability $P(d|h)$ can therefore be viewed as a sequence of
597: stages corresponding to the acquisition of data from the sequence
598: $d_1,d_2,\dots$. At each stage, an updating step for a two-valued model as
599: described in the previous section is carried out.
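
As an illustration, assume conditional probabilities that, after
multiplication by an $h$-independent constant (which leaves
Eq.~(\ref{ConditionalPD}) unchanged), satisfy $0\leq\log_2P(d|h)<1$; this
rescaling is an assumption made here for the purpose of the example. Consider
three hypotheses with rescaled values $2^{3/4}$, $2^{1/2}$ and $1$ for
$h=0,1,2$, respectively. The binary expansions $3/4=0.11_2$ and $1/2=0.10_2$
give
\[
C_1(0)=C_2(0)=1\,,\qquad C_1(1)=1\,,\ C_2(1)=0\,,\qquad C_k(2)=0\,,
\]
with all other coefficients vanishing. Hence $\set{H}_{d_1}=\{0,1\}$ and
$\set{H}_{d_2}=\{0\}$, and the updating consists of two two-valued stages with
suppression coefficients $\sqrt[2]{2}$ and $\sqrt[4]{2}$. Their product
reproduces $2^{1/2}\,2^{1/4}=2^{3/4}$ for $h=0$, $2^{1/2}$ for $h=1$, and $1$
for $h=2$.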
600:
601: \section*{Acknowledgments}
602:
603: We would like to thank Terry Rudolph for helpful
604: discussions.
605: This work was supported in part by the European Union IST-FET project EDIQIP.
606:
607: \begin{thebibliography}{1}
608:
609: \bibitem{Bernardo1994}
610: J.~M. Bernardo and A.~F.~M. Smith, {\em Bayesian Theory} (Wiley, Chichester,
611: 1994).
612:
613: \bibitem{Grover-0208}
614: L. Grover and T. Rudolph, e-print quant-ph/0208112.
615:
616: \bibitem{Soklakov2005b} A.~N. Soklakov and R. Schack, e-print
617: quant-ph/0408045, to be published in Phys.\ Rev.\ A.
618:
619: \bibitem{RudolphPrivate}
620: T. Rudolph, private communication.
621:
622: \bibitem{Soklakov-0412}
623: A.~N. Soklakov and R. Schack, e-print quant-ph/0412025.
624:
625: \bibitem{Brassard}
626: G. Brassard, P. H{\o}yer, M. Mosca and A. Tapp, e-print quant-ph/0005055.
627:
628: \bibitem{Tim}
629: T. Mannveille, A.~N. Soklakov and R. Schack, in preparation.
630:
631: \bibitem{Nielsen2000b}
632: M.~A. Nielsen and I.~L. Chuang, {\em Quantum Computation and Quantum
633: Information} (Cambridge University Press, Cambridge, 2000).
634:
635: \end{thebibliography}
636:
637:
638: \end{document}
639: