\documentclass[12pt]{article}

\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{epsfig}

\newcommand{\set}[1]{{\mathbb{#1}}}
\newcommand{\one}{\mbox{\tt 1}\hspace{-0.057 in}\mbox{\tt l}}
\newcommand{\Tr}{\mbox{\rm Tr}}
\newcommand{\tr}{\mbox{\rm Tr}}

\begin{document}

\title{Bayesian updating of a probability distribution encoded
  on a quantum register}

\author{
Andrei N. Soklakov and R\"udiger Schack\\
\\
{\it Department of Mathematics, Royal Holloway,
       University of London,}\\
{\it Egham, Surrey TW20 0EX, United Kingdom}}

%\date{\today}
\date{15 November 2005}
\maketitle

\begin{abstract}
  We investigate the problem of Bayesian updating of a probability
  distribution encoded in the quantum state of $n$ qubits.  The updating
  procedure takes the form of a quantum algorithm that prepares the quantum
  register in the state representing the posterior distribution. Depending on
  how the prior distribution is given, we describe two implementations, one
  probabilistic and one deterministic, of such an algorithm in the
  standard model of a quantum computer.
\end{abstract}

\section{Introduction}

Bayes's rule provides a simple and fundamental mechanism for updating a
probability distribution in the light of new data~\cite{Bernardo1994}.
The rule takes its simplest
form for a finite sample space, $\set{H}$, where the elements $h\in\set{H}$
can be identified with the atomic events, or {\em hypotheses}. Let $P_{\rm
  prior}(h)=P(h)$ be the prior probability distribution, and assume some piece
of data, $d$, is observed. If $P(d|h)$ is the conditional probability of $d$,
given $h$, Bayesian updating consists of replacing the prior with the
posterior distribution, $P_{\rm posterior}(h)=P(h|d)$, where
\begin{equation} \label{ConditionalPD}
P(h|d)=\frac{P(d|h)P(h)}{\sum_{h'\in\set{H}}P(d|h')P(h')}\;.
\end{equation}

To simplify the notation, we assume from now on that the set of hypotheses is
of the form $\set{H}=\{0,\dots,2^n-1\}$ for some positive integer $n$.
For $h\in\set{H}$, let $|h\rangle$ denote the computational basis states of
a register of $n$ qubits. The state
\begin{equation}  \label{PriorState}
|\Psi_{\rm prior}\rangle
=\sum_{h\in\set{H}}\sqrt{P(h)}\,|h\rangle
\end{equation}
provides an encoding of the prior on the quantum register. Even though the size
of the sample space grows exponentially with the number of qubits, $n$, there
exists an interesting class of priors for which $|\Psi_{\rm prior}\rangle$ can
be prepared efficiently, in the sense that the required computational resources
grow only polynomially with $n$ \cite{Grover-0208,Soklakov2005b}.

To formulate the problem of Bayesian updating for a prior encoded on a quantum
register, we make the assumption that we have a classical algorithm that
computes, as a function of $h$, the conditional probability $P(d|h)$ for the
observed data $d$.  Given this classical algorithm, the goal of Bayesian
updating is then to prepare the register in the state
\begin{equation} \label{PosteriorState}
|\Psi_{\rm posterior}\rangle=\sum_{h\in\set{H}}\sqrt{P(h|d)}\,|h\rangle\;,
\end{equation}
with $P(h|d)$ given by Eq.~(\ref{ConditionalPD}).
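As a minimal illustration of this encoding (the numbers are assumptions chosen
here for convenience, not taken from any application), let $n=1$ with
$P(0)=3/4$ and $P(1)=1/4$, and suppose the observed data have conditional
probabilities $P(d|0)=1/6$ and $P(d|1)=1/2$. Then
$P(d)=\sum_hP(d|h)P(h)=1/8+1/8=1/4$, so that $P(0|d)=P(1|d)=1/2$, and the
prior and posterior states are
\begin{equation*}
|\Psi_{\rm prior}\rangle=\frac{\sqrt{3}}{2}\,|0\rangle+\frac{1}{2}\,|1\rangle\,,\ \ \ \
|\Psi_{\rm posterior}\rangle=\frac{1}{\sqrt{2}}\big(|0\rangle+|1\rangle\big)\,.
\end{equation*}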

If the prior is given to us in the form of a single copy of the state
$|\Psi_{\rm prior}\rangle$, our problem is equivalent to finding a quantum
operation, $M_d$, that maps any prior
state of the form~(\ref{PriorState}) into the
corresponding posterior state of the form~(\ref{PosteriorState}),
\begin{equation}
M_d|\Psi_{\rm prior}\rangle=|\Psi_{\rm posterior}\rangle\;.
\end{equation}
It is easy to see that $M_d$ cannot in general be a trace-preserving map.
For example, consider the two prior states
\begin{equation} \label{ExamplePriors}
|\Psi_{\rm prior}^1\rangle=\frac{1}{\sqrt{2}}(|1\rangle+|2\rangle)\,,\ \ \
|\Psi_{\rm prior}^2\rangle=\frac{1}{\sqrt{2}}(|2\rangle+|3\rangle)\,,
\end{equation}
corresponding to two different prior probability distributions,
and assume that the conditional probability distribution is given by
\begin{equation}
P(d|h)=\left\{\begin{array}{ll}
               0 & {\rm if\ }h= 2\,,\cr
               c\neq 0 & {\rm otherwise}\,,
                  \end{array}
                   \right.
\end{equation}
where $c$ is a constant determined by normalization.
Although the prior states (\ref{ExamplePriors})
are nonorthogonal, we obtain mutually orthogonal
posterior states
\begin{equation}
|\Psi_{\rm posterior}^1\rangle=M_d|\Psi_{\rm prior}^1\rangle=|1\rangle\;,\;\;\;
|\Psi_{\rm posterior}^2\rangle=M_d|\Psi_{\rm prior}^2\rangle=|3\rangle\;,
\end{equation}
which implies that $M_d$ is trace-decreasing. Bayesian updating of a single
copy of $|\Psi_{\rm prior}\rangle$ is therefore generally probabilistic.
Section~\ref{sec:ProbabilisticAlgorithms} of this paper discusses
probabilistic Bayesian updating.
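To spell out the step behind the example~(\ref{ExamplePriors}), the relevant
overlaps before and after updating are
\begin{equation*}
|\langle\Psi_{\rm prior}^1|\Psi_{\rm prior}^2\rangle|=\frac{1}{2}\,,\ \ \ \
|\langle\Psi_{\rm posterior}^1|\Psi_{\rm posterior}^2\rangle|
=|\langle 1|3\rangle|=0\,.
\end{equation*}
A trace-preserving quantum operation cannot decrease the fidelity between two
input states, so no such operation can take both priors to their posteriors
with certainty. This is the sense in which the updating map has to be
trace-decreasing.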

A deterministic updating scheme is possible, however, if the prior is given in
the form of a unitary quantum circuit that maps a standard state, assumed for
simplicity to be the computational basis state $|0\rangle$, to $|\Psi_{\rm
  prior}\rangle$. Deterministic updating is the topic of
Section~\ref{sec:DeterministicAlgorithms}.

\section{Probabilistic algorithms}
\label{sec:ProbabilisticAlgorithms}

As we have shown above, there is in general no trace-preserving
quantum operation that can transform all prior states
into the corresponding posterior states. To
realize probabilistic Bayesian updating, we proceed as follows.
Define
\begin{equation} \label{definition:E_0}
E_1=C\sum_{h\in\set{S}_{\rm pr}}\sqrt{P(d|h)}\,|h\rangle\langle h|\,,
\end{equation}
where $C$ is a real constant and $\set{S}_{\rm pr}$ is
a set containing the support of the
prior probability distribution. We see that
\begin{equation}
E_1|\Psi_{\rm prior}\rangle\propto|\Psi_{\rm posterior}\rangle\,.
\end{equation}
For sufficiently small $|C|$, see Eq.~(\ref{BoundOnC}) below,
one can view $E_1$ as an
element of a trace-preserving quantum operation
${\cal E}$ defined, for arbitrary $\rho$, by
\begin{equation}
{\cal E}(\rho)=\sum_{k=0}^1E_k\rho E_k^\dag=\sum_{k=0}^1 p_k\rho(k)\,,
\end{equation}
where
\begin{equation}
p_k=\Tr(E_k\rho E_k^\dag)\ \ \ \ {\rm and}\ \ \ \
\rho(k)=E_k\rho E_k^\dag/p_k\,.
\end{equation}
This decomposition shows that the operation ${\cal E}$
can be realized as a measurement with outcomes $k=0,1$, where
each outcome $k$ happens with probability $p_k$ and the
corresponding conditional density matrix is $\rho(k)$.
Substituting $\rho=|\Psi_{\rm prior}\rangle\langle\Psi_{\rm prior}|$
we see that the measurement outcome $1$ corresponds
to successful Bayesian updating. This
happens with probability
\begin{equation} \label{p0}
p_1=\langle\Psi_{\rm prior}|E_1^\dag E_1|\Psi_{\rm prior}\rangle
    =C^2\sum_{h}P(h)P(d|h)=C^2P(d)\,.
\end{equation}
In order to obtain a bound on $C$, we note that
\begin{equation} \label{E1_squared}
E_0^\dag E_0=\one-E_1^\dag E_1=\one-C^2\sum_{h\in\set{S}_{\rm pr}}P(d|h)\,|h\rangle\langle h|\,.
\end{equation}
Using the positivity of $E_0^\dag E_0$, we find
\begin{equation}
C^2\leq\left(\sum_{h\in\set{S}_{\rm pr}} P(d|h)\,|\langle v|h\rangle|^2\right)^{-1}
\end{equation}
for any normalized vector $|v\rangle$.

Now let $h^*$ be such that $P(d|h^*)=\max_{h\in\set{S}_{\rm pr}} P(d|h)$.
Since the above
condition is valid for any normalized $|v\rangle$, one can choose
$|v\rangle=|h^*\rangle$ and obtain
\begin{equation} \label{BoundOnC}
C^2\leq 1/\max_{h\in\set{S}_{\rm pr}}P(d|h)\,.
\end{equation}
Together with Eq.~(\ref{p0}) this gives an upper bound
on the success probability of Bayesian updating,
\begin{equation} \label{SuccessProbabilityBound}
p_1\leq\frac{P(d)}{\max_{h\in\set{S}_{\rm pr}}P(d|h)}\,.
\end{equation}
In the next subsection we describe an explicit algorithm that achieves this bound.

\subsection{Explicit algorithm}\label{subsec:ExampleAlgorithm}

The operation ${\cal E}$ can be realized as a modification of a procedure
proposed by Rudolph~\cite{RudolphPrivate} as follows. First we prepare the
product of the prior state and an auxiliary qubit state, $|\Psi_{\rm
  prior}\rangle|0\rangle$.  Then, using the classical algorithm for computing
$P(d|h)$, one can construct a quantum circuit $U_d$ that performs a
conditional rotation of the auxiliary qubit so that
\begin{equation} \label{U_d}
U_d |\Psi_{\rm prior}\rangle|0\rangle
=\sum_{h}\sqrt{P(h)}|h\rangle\Big(A_1(h)|0\rangle+B_1(h)|1\rangle\Big)\,,
\end{equation}
where
\begin{equation} \label{A_1}
A_1(h)=c_1\sqrt{P(d|h)},\ \ \ B_1^2=1-A_1^2=1-c_1^2P(d|h)\,,
\end{equation}
and $c_1$ is a constant. Measuring the auxiliary qubit then
yields the desired state $|\Psi_{\rm posterior}\rangle|0\rangle$
with probability
\begin{equation}
p_1=c_1^2\sum_h P(h)P(d|h)=c_1^2P(d)\,.
\end{equation}
Equations~(\ref{U_d}) and (\ref{A_1}) allow us to set
$c_1^2=1/\max_{h\in\set{S}_{\rm pr}}P(d|h)$, which is the largest value
for which $B_1^2(h)\geq0$ for all $h\in\set{S}_{\rm pr}$.
With this
setting, $p_1$ achieves the theoretical bound on the
success probability, Eq.~(\ref{SuccessProbabilityBound}).

In the above algorithm, the maximal
success probability can be guaranteed only if
the value of $\max_{h\in\set{S}_{\rm pr}}P(d|h)$
is known. Lack of this knowledge does not prevent
us from using the algorithm, however, since we can always
use the trivial setting $c_1^2=1$. The price to pay is
a smaller success probability.

An intermediate situation occurs if a nontrivial upper bound on
$P(d|h)$ is known, i.e., a constant
$M$ such that $\max_{h\in\set{S}_{\rm pr}}P(d|h)\leq M<1$.
One can then set $c_1^2=1/M$, which improves the success probability compared
to the trivial setting.
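The three settings can be compared in a small worked example. The numbers are
assumptions made here purely for illustration: one qubit, $P(0)=3/4$,
$P(1)=1/4$, $P(d|0)=1/6$ and $P(d|1)=1/2$, so that $P(d)=1/4$ and
$\max_hP(d|h)=1/2$. The optimal choice is then $c_1^2=2$, which gives
$A_1(0)=1/\sqrt{3}$, $B_1(0)=\sqrt{2/3}$, $A_1(1)=1$, $B_1(1)=0$, and
\begin{equation*}
U_d|\Psi_{\rm prior}\rangle|0\rangle
=\frac{1}{2}\big(|0\rangle+|1\rangle\big)|0\rangle
 +\frac{1}{\sqrt{2}}\,|0\rangle|1\rangle\,.
\end{equation*}
The auxiliary qubit is found in $|0\rangle$ with probability
$p_1=c_1^2P(d)=1/2$, leaving the register in
$(|0\rangle+|1\rangle)/\sqrt{2}$, which is the posterior state for these
numbers. For comparison, the trivial setting $c_1^2=1$ would give $p_1=1/4$,
and an intermediate bound $M=3/4$, i.e.\ $c_1^2=4/3$, would give $p_1=1/3$.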

\subsection{Iterative algorithm} \label{subsec:ExpensivePriors}

Let $M_1$ be an
upper bound on $\max_{h\in\set{S}_{\rm pr}}P(d|h)$.
Imagine that at the beginning we do not have
enough information about $P(d|h)$ and $P(h)$
to calculate a nontrivial value for $M_1$.
In other words, we have to assume that $M_1=1$.
Imagine also that we expect to acquire a
better bound $M_2<M_1$ in the future.
We will now address the following question: Can we run the
probabilistic algorithm of Sec.~\ref{subsec:ExampleAlgorithm}
first with the trivial bound $M_1=1$, and later with the improved bound
$M_2$, without reducing the overall success probability
that can be achieved by running the algorithm once with the bound $M_2$?
We will find that this is indeed the case.
This result remains true for a sequence of bounds, $M_k<M_{k-1}<\dots<M_1$.
Below we describe an iterative version of the above algorithm that
makes use of better bounds as they become available.

Consider the measurement part of the algorithm of
Sec.~\ref{subsec:ExampleAlgorithm}.
If the measurement fails, which happens with probability  $1-p_1$, we
end up with the state
\begin{equation} \label{psi_1}
|\psi_1\rangle= \Big( N_1\sum_h\sqrt{P(h)}B_1(h)|h\rangle\Big) |1\rangle\,,
\ \ \ \ N_1^{-2}=1-c_1^2P(d)\,,
\end{equation}
where we might have set $c_1^2=1/M_1$ to maximize $p_1$.
Since we know the
exact form of $|\psi_1\rangle$ we may attempt to achieve
our original goal by
performing a transformation
\begin{equation} \label{second_attempt}
|\psi_1\rangle\longrightarrow N_1\sum_h\sqrt{P(h)}B_1(h)|h\rangle
                              \Big(A_2(h)|0\rangle+\frac{B_2(h)}{B_1(h)}|1\rangle\Big)\,,
\end{equation}
where we set
\begin{equation}
A_2(h)=c_2\frac{\sqrt{P(d|h)}}{B_1(h)}\,,\ \ \ \ B_2^2=(1-A_2^2)B_1^2
                                                     =B_1^2-c_2^2P(d|h)\,,
\end{equation}
and $c_2$ is a constant. It is important to note
that this procedure should not be attempted
when $c_1^2$ was set to $1/M_1$ and $M_1$ is
still the best available bound. This is because
in the worst case there will be at least one hypothesis
$h^*$ which is present in the sum in Eq.~(\ref{second_attempt})
with $B_1(h^*)=0$ and $A_2(h^*)>1$. It follows that
the above procedure should only be applied if
a better bound $M_2<M_1$ has become available (or when $c_1^2<1/M_1$).
In this case,
measurement of the auxiliary qubit
yields the desired state $|\Psi_{\rm posterior}\rangle|0\rangle$
with probability
\begin{equation}
p_2=N_1^2c_2^2\sum_hP(h)P(d|h)=\frac{c_2^2P(d)}{1-c_1^2P(d)}\,.
\end{equation}
Alternatively, with probability $1-p_2$, we may end up with the state
\begin{equation}
|\psi_2\rangle=\Big( N_2\sum_h\sqrt{P(h)}B_2(h)|h\rangle\Big) |1\rangle\,.
\end{equation}
This state is similar in structure to the state $|\psi_1\rangle$,
so we may try to recover in the same way
by performing the transformation
\begin{equation}
|\psi_2\rangle\longrightarrow N_2\sum_h\sqrt{P(h)}B_2(h)|h\rangle
                              \Big(A_3(h)|0\rangle+\frac{B_3(h)}{B_2(h)}|1\rangle\Big)\,,
\end{equation}
followed by a measurement of the auxiliary qubit in complete analogy
with our earlier analysis. By continuing this procedure we obtain
the sequence of success probabilities $p_1,p_2,\dots$
together with the coefficients $\{A_k^2\}$ and $\{B_k^2\}$.
We have
\begin{equation} \label{AkBk}
A_k(h)=c_k\frac{\sqrt{P(d|h)}}{B_{k-1}(h)}\,,\ \ \ \ B_k^2=B_{k-1}^2-c_k^2P(d|h)\,,
\end{equation}
and
\begin{equation} \label{p_k1}
p_k=\frac{c_k^2P(d)}{\langle B_{k-2}^2\rangle-c_{k-1}^2P(d)}\,,
\end{equation}
where $B_{-1}^2=B_0^2=1$, $c_0^2=0$ and
\begin{equation} \label{Baverage}
\langle B_k^2\rangle=\sum_h P(h) B_k^2(h)\,.
\end{equation}
The constants $\{c_k\}$ are the only free parameters in this
algorithm. As we have seen in the case $k=1$, however,
they cannot be chosen
arbitrarily, and the optimal choice for them depends on the
sequence $\{M_k\}$.
From Eq.~(\ref{AkBk}) we obtain
\begin{equation} \label{BfromCs}
B_k^2=1-P(d|h)\sum_{s=1}^k c_s^2\geq 0\,,
\end{equation}
and therefore
\begin{equation}
\sum_{s=1}^kc_s^2\leq\frac{1}{P(d|h)}\,.
\end{equation}
This condition must be satisfied for all $h$
in the support of the prior
and so we have
\begin{equation} \label{BoundOnCs}
\sum_{s=1}^kc_s^2\leq\frac{1}{\max_{h\in \set{S}_{\rm pr}} P(d|h)}\,.
\end{equation}
From Eqs.~(\ref{Baverage}) and (\ref{BfromCs})
we compute
\begin{equation}
\langle B_{k-2}^2\rangle=1-P(d)\sum_{s=1}^{k-2}c_s^2\,.
\end{equation}
Together with Eq.~(\ref{p_k1}), this implies
\begin{equation}
p_k=\frac{P(d)c_k^2}{1-P(d)\sum_{s=1}^{k-1}c_s^2}\,.
\end{equation}
The probability that the algorithm is not
successful after the $n$th stage is
given by
\begin{equation}
P_{\rm fail}^n=\prod_{k=1}^n(1-p_k)=1-P(d)\sum_{s=1}^nc_s^2\,,
\end{equation}
which gives the corresponding success probability
\begin{equation}
P_{\rm succ}^n=1-P_{\rm fail}^n=P(d)\sum_{s=1}^nc_s^2
\leq P(d)/\max_{h\in\set{S}_{\rm pr}}P(d|h)\,,
\end{equation}
where we used the inequality~(\ref{BoundOnCs}).
We see that the theoretical bound for
the overall success probability of
transforming one copy of the prior state
$|\Psi_{\rm prior}\rangle$ into one copy of the posterior
state
$|\Psi_{\rm posterior}\rangle$ is achieved
as long as at some stage $n$ of the algorithm
we have
\begin{equation} \label{BoundOnCs2}
\sum_{s=1}^nc_s^2=\frac{1}{\max_{h\in \set{S}_{\rm pr}} P(d|h)}\,.
\end{equation}
Given the sequence of upper bounds $M_1>M_2>\dots>M_k$, and
assuming that the information in the first $k-1$ of them
was already used without success, the optimal value $c_k^2$ for the next
iteration of the algorithm, which takes into account the bound $M_k$, can be
calculated as
\begin{equation}
c_k^2=\frac{1}{M_k}-\sum_{s=1}^{k-1}c_s^2=\frac{1}{M_k}-\frac{1}{M_{k-1}}\,,
\end{equation}
where for $k=1$ the subtracted term is absent, so that $c_1^2=1/M_1$.
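A two-stage run illustrates these formulas. The numbers are assumptions made
here for illustration only: one qubit with $P(0)=3/4$, $P(1)=1/4$,
$P(d|0)=1/6$, $P(d|1)=1/2$, hence $P(d)=1/4$, together with the bounds $M_1=1$
and $M_2=1/2$. The prescription above gives $c_1^2=1/M_1=1$ and
$c_2^2=1/M_2-1/M_1=1$, so that
\begin{equation*}
p_1=c_1^2P(d)=\frac{1}{4}\,,\ \ \ \
p_2=\frac{c_2^2P(d)}{1-c_1^2P(d)}=\frac{1}{3}\,,\ \ \ \
P_{\rm succ}^2=1-(1-p_1)(1-p_2)=\frac{1}{2}\,.
\end{equation*}
This equals $P(d)(c_1^2+c_2^2)=P(d)/\max_hP(d|h)$, the value reached by a
single run with $c_1^2=2$. The second stage is well defined because
$B_1^2(0)=5/6$ and $B_1^2(1)=1/2$ give $A_2^2(0)=1/5$ and $A_2^2(1)=1$, neither
of which exceeds $1$.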

\section{Deterministic updating}
\label{sec:DeterministicAlgorithms}

In this section we will assume that the prior is given in the form of a
unitary quantum circuit, $U$, that maps the computational basis state
$|0\rangle$ to the prior state. Apart from the constraint
$U|0\rangle=|\Psi_{\rm prior}\rangle$, $U$ is arbitrary. We first give an
algorithm for the special case of hypothesis elimination and then show how to
extend it to two-valued and more general models.

\subsection{Hypothesis elimination}

Imagine the situation where each piece of data $d$ partitions the set of
hypotheses $\set{H}$ into two subsets: $\set{H}_d$
containing all hypotheses that are consistent with $d$, and
$\set{H}\,\backslash\,\set{H}_d$ containing all hypotheses that are rejected
by the data $d$. This leads to a special case of Bayesian
updating
where $P(d|h)$ takes only two different values~\cite{Soklakov-0412},
\begin{equation}
P(d|h)=\left\{\begin{array}{ll}
               1/|\set{H}_d| & {\rm if\ }h\in\set{H}_d\,,\cr
                           0 & {\rm otherwise}\,,
                  \end{array}
                   \right.
\end{equation}
where $|\set{H}_d|$ is the number of hypotheses that are consistent with the
data $d$. The posterior state~(\ref{PosteriorState}) takes the simple form
\begin{equation}
|\Psi_{\rm posterior}\rangle
=N\sum_{h\in\set{H}_d}\sqrt{P(h)}|h\rangle\,,
\end{equation}
where $N$ is the normalization factor.

Using the given classical algorithm for computing $P(d|h)$, we define a
quantum oracle, $O_d$, as
\begin{equation}
O_d|h\rangle=\left\{\begin{array}{ll}
                     -|h\rangle & {\rm if\ }h\in\set{H}_d\,,\cr
                      |h\rangle & {\rm otherwise}\,.
                  \end{array}
                   \right.
\end{equation}
Furthermore, let $\Pi$ be a conditional phase shift defined by
\begin{equation}
\Pi|h\rangle=\left\{\begin{array}{ll}
                     -|h\rangle & {\rm if\ }h\neq0\,,\cr
                      |h\rangle & {\rm if\ }h   =0\,.
                  \end{array}
                   \right.
\end{equation}
These operations are combined with $U$ to form an operation, ${\cal A}$,
  defined by \cite{Brassard}
\begin{equation}
{\cal A} = U\,\Pi\,U^{-1}O_d \;,
\end{equation}
where $U\Pi U^{-1}=2|\Psi_{\rm prior}\rangle\langle\Psi_{\rm prior}|-\one$
is the reflection about the prior state, since $U|0\rangle=|\Psi_{\rm
  prior}\rangle$ and $\Pi=2|0\rangle\langle0|-\one$.
The circuit for ${\cal A}$ is the basic block of the quantum algorithm to
prepare $|\Psi_{\rm posterior}\rangle$.

It will be convenient to rewrite the prior state~(\ref{PriorState})
in the form
\begin{equation} \label{Prior}
|\Psi_{\rm prior}\rangle=\sin\frac{\vartheta}{2}\;|\alpha\rangle
             +\cos\frac{\vartheta}{2}\;|\beta\rangle\,,
\end{equation}
where
\begin{equation}
|\alpha\rangle
= S_{\set{H}_d}^{-1/2}
   \sum_{h\in\set{H}_d} \sqrt{P(h)}\,|h\rangle\,,\ \ \ \ \ \
   S_{\set{H}_d}=\sum_{h\in\set{H}_d}P(h)\,, \label{SHd}
\end{equation}
\begin{equation}
|\beta\rangle=S_{\set{H}\,\backslash\set{H}_d}^{-1/2}
   \sum_{h\in\set{H}\, \backslash \set{H}_d} \sqrt{P(h)}\,|h\rangle\,,\ \ \ \ \ \
   S_{\set{H}\,\backslash\set{H}_d}=\sum_{h\in\set{H}\,\backslash\set{H}_d}P(h)\,,
\end{equation}
and
\begin{equation} \label{SinVartheta}
\sin\frac{\vartheta}{2}=\sqrt{S_{\set{H}_d}}\;.
\end{equation}
The last equation shows that knowing the total
prior probability of the hypotheses that are
consistent with the data $d$ is equivalent
to knowing the value of $\vartheta$.

It can now be shown that repeated application of the circuit
${\cal A}$ takes $|\Psi_{\rm prior}\rangle$
through the sequence of states
\begin{equation}
{\cal A}^{k}|\Psi_{\rm prior}\rangle
= \sin\left(\frac{2k+1}{2}\vartheta\right)\,|\alpha\rangle
             +\cos\left(\frac{2k+1}{2}\vartheta\right)\,|\beta\rangle\,.
\end{equation}
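A short sketch of why this holds, under the assumption that ${\cal A}$ acts as
the reflection $2|\Psi_{\rm prior}\rangle\langle\Psi_{\rm prior}|-\one$ about
the prior state applied after the oracle $O_d$: restricted to the
two-dimensional subspace with ordered basis $(|\alpha\rangle,|\beta\rangle)$,
the three operators are represented by
\begin{equation*}
O_d=\begin{pmatrix}-1&0\\ 0&1\end{pmatrix},\ \ \
2|\Psi_{\rm prior}\rangle\langle\Psi_{\rm prior}|-\one
=\begin{pmatrix}-\cos\vartheta&\sin\vartheta\\ \sin\vartheta&\cos\vartheta\end{pmatrix},\ \ \
{\cal A}=\begin{pmatrix}\cos\vartheta&\sin\vartheta\\ -\sin\vartheta&\cos\vartheta\end{pmatrix}.
\end{equation*}
Applying ${\cal A}$ to $\sin\varphi\,|\alpha\rangle+\cos\varphi\,|\beta\rangle$
therefore gives
$\sin(\varphi+\vartheta)\,|\alpha\rangle+\cos(\varphi+\vartheta)\,|\beta\rangle$.
Each application advances the angle by $\vartheta$, and the prior state
corresponds to the starting value $\varphi=\vartheta/2$.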

The number of applications of ${\cal A}$, $T$, that
achieves the required transformation,
\begin{equation} \label{TBayesian}
{\cal A}^{T}|\Psi_{\rm prior}\rangle
=|\alpha\rangle=|\Psi_{\rm posterior}\rangle\,,
\end{equation}
is therefore
\begin{equation}
T=(\pi/\vartheta-1)/2\,.
\end{equation}
If $T$ is not an integer, there are two possibilities. Either one uses the
closest integer approximation to $T$ and includes the effect of the noninteger
part in the fidelity analysis (see below), or one follows $\lfloor T\rfloor$
applications of ${\cal A}$ with one application of a modified version of
${\cal A}$ in which the phase shifts in both $O_d$ and
$\Pi$ are smaller than $\pi$ \cite{Tim}.
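Two simple cases, chosen here only for illustration, show both situations.
For a uniform prior on $n=2$ qubits with a single hypothesis consistent with
the data, $S_{\set{H}_d}=1/4$, so that
\begin{equation*}
\sin\frac{\vartheta}{2}=\frac{1}{2}\,,\ \ \ \ \vartheta=\frac{\pi}{3}\,,\ \ \ \
T=\frac{1}{2}\Big(\frac{\pi}{\vartheta}-1\Big)=1\,,
\end{equation*}
and a single application of ${\cal A}$ prepares the posterior state exactly.
If instead $S_{\set{H}_d}=1/2$, then $\vartheta=\pi/2$ and $T=1/2$. Rounding
to $T=0$ or $T=1$ leaves the overlap with $|\alpha\rangle$ at
$\sin(\pi/4)=\sin(3\pi/4)=1/\sqrt{2}$, so in this case only the modified final
step can reach the posterior state.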

In order to compute the number of iterations, $T$, the value of $\vartheta$
must be known. To obtain $\vartheta$, a version of the standard phase
estimation algorithm \cite{Nielsen2000b} can be used as illustrated in
Figure~\ref{figure1}.

\begin{figure}[ht]
\begin{center}
\epsfig{file=QIntegration.eps,width=12cm}
\end{center}
\caption{The standard phase-estimation circuit applied to the
  hypothesis-elimination operator ${\cal A}$. A measurement of the upper
  $t$-qubit register returns the value of $\vartheta$ with an accuracy of $m$
  bits and a probability of success of at least $1-\epsilon$, where $m$ and
  $\epsilon$ are related to each other and to $t$ via the condition
  $t=m+\lceil \log_2(2+1/(2\epsilon)) \rceil$. The gates labeled $H^{\otimes t}$
  and $FT$ are the $t$-qubit Hadamard and quantum Fourier transforms,
  respectively.  }
\label{figure1}
\end{figure}

To calculate the effect of an error in
the value of $\vartheta$  on the fidelity of the  Bayesian
transformation~(\ref{TBayesian}), we assume that there is an upper bound on
the absolute error,
\begin{equation}
\Delta\vartheta\geq |\vartheta-\tilde{\vartheta}|\;,
\end{equation}
where $\tilde{\vartheta}$ denotes the approximate value. With the definition
$\tilde{T}=(\pi/\tilde{\vartheta}-1)/2$, the fidelity is
\begin{equation}
F=|\langle\Psi_{\rm posterior}|{\cal A}^{\tilde{T}}|\Psi_{\rm prior}\rangle|
=\sin\Big(\frac{2\tilde{T}+1}{2}\vartheta\Big)\,.
\end{equation}
In the worst case, $\vartheta=\tilde{\vartheta}\pm\Delta\vartheta$.
Using the relation $(2\tilde{T}+1)\tilde{\vartheta}=\pi$
we therefore obtain
\begin{equation} \label{FidelityBound}
F\geq\cos\Big(\frac{2\tilde{T}+1}{2}\Delta\vartheta\Big)
 =\cos\frac{\pi\Delta\vartheta}{2\tilde{\vartheta}}
 \geq 1-\Big(\frac{\pi\Delta\vartheta}{2\tilde{\vartheta}}\Big)^2\,.
\end{equation}
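To give a feeling for the numbers, take the illustrative values
$\tilde{\vartheta}=\pi/3$ and $\Delta\vartheta=10^{-2}$, which are assumptions
made here and not results of the analysis above. Then
\begin{equation*}
\frac{\pi\Delta\vartheta}{2\tilde{\vartheta}}=\frac{3}{2}\,\Delta\vartheta=0.015\,,
\ \ \ \ F\geq1-(0.015)^2=1-2.25\times10^{-4}\,.
\end{equation*}
Read in the other direction, the bound~(\ref{FidelityBound}) guarantees
$F\geq1-\delta$ for a target infidelity $\delta$ whenever
$\Delta\vartheta\leq(2\tilde{\vartheta}/\pi)\sqrt{\delta}$, so the required
accuracy of the phase estimate scales with $\tilde{\vartheta}$ itself.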

\subsection{Two-valued models} \label{sec:suppression}

A straightforward generalization of hypothesis elimination is provided by
a two-valued conditional probability of the form
\begin{equation} \label{SingleStepModel}
P(d|h)=\left\{\begin{array}{ll}
               a_1 & {\rm if\ }h\in\set{H}_d\,,\cr
               a_2 & {\rm otherwise}\,,
                  \end{array}
                   \right.
\end{equation}
where $a_1>a_2$ are constants, and $\set{H}_d$ is the set of
hypotheses favored by the data $d$. The {\em suppression coefficient\/}
$r=a_1/a_2$ measures how much hypotheses in $\set{H}_d$ are favored by the
data. As before, the prior state can be written in the
form of Eq.~(\ref{Prior}),
\begin{equation}
|\Psi_{\rm prior}\rangle=\sin\frac{\vartheta}{2}\;|\alpha\rangle
             +\cos\frac{\vartheta}{2}\;|\beta\rangle\,,
\end{equation}
and for the posterior state we calculate
\begin{equation}
|\Psi_{\rm posterior}\rangle=\sqrt{a_1}\,\sin\frac{\vartheta}{2}\;|\alpha\rangle
             +\sqrt{a_2}\,\cos\frac{\vartheta}{2}\;|\beta\rangle\,,
\end{equation}
where the common normalization factor $1/P(d)$ has been absorbed into $a_1$
and $a_2$; only their ratio $r$ affects the posterior.
Normalization of the posterior state then implies that
\begin{equation}
a_2=\frac{1}{r\sin^2(\vartheta/2)+\cos^2(\vartheta/2)}\,.
\end{equation}
Defining $\vartheta'$ so that
\begin{equation}
\cos\frac{\vartheta'}{2}= \sqrt{a_2}\,\cos\frac{\vartheta}{2}
=\frac{\cos(\vartheta/2)}{\sqrt{r\sin^2(\vartheta/2)+\cos^2(\vartheta/2)}}\,,
\end{equation}
the number of iterations $T$ necessary to transform $|\Psi_{\rm prior}\rangle$
into $|\Psi_{\rm posterior}\rangle={\cal A}^T|\Psi_{\rm prior}\rangle$ can then
be calculated as
\begin{equation}
T(\vartheta,r)=(\vartheta'/\vartheta-1)/2\,.
\end{equation}
It follows that knowledge of $\vartheta$ and the suppression coefficient $r$
is sufficient for a deterministic implementation of Bayesian updating with the
conditional distribution~(\ref{SingleStepModel}). As before, the
value of $\vartheta$ can be obtained using the algorithm of
Figure~\ref{figure1}, and the same fidelity bound (\ref{FidelityBound}) can be
used.
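Two limits provide a consistency check on $T(\vartheta,r)$. For $r=1$ the data
do not discriminate between hypotheses, $\vartheta'=\vartheta$, and $T=0$. For
$r\to\infty$ one finds $\cos(\vartheta'/2)\to0$, i.e.\ $\vartheta'\to\pi$, and
$T$ approaches the hypothesis-elimination value $(\pi/\vartheta-1)/2$. As an
intermediate case, with values assumed here only for illustration, take
$\vartheta=\pi/6$ and $r=\cot^2(\pi/12)$:
\begin{equation*}
\sin^2\frac{\vartheta}{2}=\frac{2-\sqrt{3}}{4}\approx0.067\,,\ \ \ \
r=7+4\sqrt{3}\approx13.9\,,\ \ \ \
\cos\frac{\vartheta'}{2}=\frac{1}{\sqrt{2}}\,.
\end{equation*}
Hence $\vartheta'=\pi/2=3\vartheta$ and $T=1$: a single application of
${\cal A}$ raises the total probability of the favored hypotheses from about
$0.067$ to $\sin^2(\vartheta'/2)=1/2$.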

\subsection{Bayesian updating: general models}

In this section we show how to generalize the above algorithm
to the case of Bayesian updating with a general model,
i.e., a general conditional distribution $P(d|h)$.
The main idea  is to
represent $P(d|h)$ as a product of two-valued models
with known suppression coefficients. Bayesian updating
with $P(d|h)$ can then be viewed as a sequence of Bayesian
updatings for the two-valued models.

Because the posterior~(\ref{ConditionalPD}) is unchanged when $P(d|h)$ is
multiplied by a constant independent of $h$, we assume here that $P(d|h)$ has
been rescaled so that $0\leq\log_2P(d|h)<1$ for all $h$ in the support of the
prior. A larger range of values can be accommodated by extending the sum below
to terms with $k\leq0$.
Let $C_k(h)\in\{0,1\}$ be the coefficients in the binary expansion
of $\log_2P(d|h)$,
\begin{equation}
\log_2 P(d|h)=\sum_{k=1}^\infty C_k(h)\,2^{-k}\,.
\end{equation}
This allows us to express $P(d|h)$ as a product,
\begin{equation}
P(d|h)=\prod_{k=1}^{\infty}\sqrt[2^k]{2^{C_k(h)}}\,.
\end{equation}
Let $\set{H}_{d_k}$ be the set of hypotheses $\{h\}$ for which $C_k(h)=1$.
The $k$th term in this product is either $\sqrt[2^k]{2}$ or $1$ depending on
whether $h$ is in $\set{H}_{d_k}$ or not.  Bayesian updating with the
conditional probability $P(d|h)$ can therefore be viewed as a sequence of
stages corresponding to the acquisition of data from the sequence
$d_1,d_2,\dots$. At each stage, an updating step for a two-valued model as
described in the previous section is carried out.
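As an illustration of the decomposition, consider a hypothetical hypothesis
$h$ whose conditional probability, rescaled by an $h$-independent constant so
that its logarithm lies between $0$ and $1$, satisfies
$\log_2P(d|h)=5/8=(0.101)_2$. Then $C_1(h)=C_3(h)=1$ and $C_2(h)=0$, so that
\begin{equation*}
P(d|h)=\sqrt[2]{2}\,\cdot1\cdot\sqrt[8]{2}=2^{5/8}\,,
\end{equation*}
and $h$ belongs to $\set{H}_{d_1}$ and $\set{H}_{d_3}$ but not to
$\set{H}_{d_2}$. The two-valued model at stage $k$ has suppression coefficient
$r_k=2^{2^{-k}}$, e.g.\ $r_1=\sqrt{2}\approx1.414$ and
$r_3=2^{1/8}\approx1.09$. Since $r_k\to1$ rapidly, truncating the sequence
after $K$ stages changes each $P(d|h)$ by a factor of less than $2^{2^{-K}}$.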

\section*{Acknowledgments}

We would like to thank Terry Rudolph for helpful
discussions.
This work was supported in part by the European Union IST-FET project EDIQIP.

\begin{thebibliography}{1}

\bibitem{Bernardo1994}
J.~M. Bernardo and A.~F.~M. Smith, {\em Bayesian Theory} (Wiley, Chichester,
  1994).

\bibitem{Grover-0208}
L. Grover and T. Rudolph, e-print quant-ph/0208112.

\bibitem{Soklakov2005b} A.~N. Soklakov and R. Schack, e-print
  quant-ph/0408045, to be published in Phys.\ Rev.\ A.

\bibitem{RudolphPrivate}
T. Rudolph, private communication.

\bibitem{Soklakov-0412}
A.~N. Soklakov and R. Schack, e-print quant-ph/0412025.

\bibitem{Brassard}
G. Brassard, P. H{\o}yer, M. Mosca and A. Tapp, e-print quant-ph/0005055.

\bibitem{Tim}
T. Mannveille, A.~N. Soklakov and R. Schack, in preparation.

\bibitem{Nielsen2000b}
M.~A. Nielsen and I.~L. Chuang, {\em Quantum Computation and Quantum
  Information} (Cambridge University Press, Cambridge, 2000).

\end{thebibliography}


\end{document}