
1: %%  $Id: analysis.tex,v 1.64 2009/06/03 22:37:53 acsearle Exp $

2:

3:

4: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

5:

6: \section{Analysis}

7: \label{SECII}\label{sec:analysis}

8:

9:

10: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

11:

12: \subsection{Single-sample observation}

13: \label{sec:singleSample}

14:

15: We begin by investigating perhaps the simplest Bayesian coherent data

16: analysis: detecting a signal from a known sky position in a single

17: strain sample from each of $N$ gravitational wave observatories.  This

18: example will show many of the basic features of the Bayesian analysis,

19: and highlight some of the differences between the Bayesian approach

20: and previous statistics.  In the following section we will generalize

21: to a multi-sample search for a signal arriving at an unknown time from

22: an unknown sky position.

23:

24: Consider a single strain sample from each of $N$ detectors, each

25: measurement taken at the moment corresponding to the passage of a

26: postulated plane gravitational wave from some known location on

27: the sky, $(\theta,\phi)$.  The measurements are then given by \cite{GuTi:89}

28: \begin{equation}

29: \mathbf{x}=\mathbf{F}\,\mathbf{h}+\mathbf{e} \, , \label{eqn:ssmodel}

30: \end{equation}

31: where $\mathbf{x}$ is the vector of measurements $[x_1,\ldots,x_N]^T$, the

32: matrix $\mathbf{F}=[[F_1^+,F_1^\times],\ldots,[F_N^+,F_N^\times]]$

33: contains the antenna responses of the observatories to the postulated

34: gravitational wave strain vector $\mathbf{h}=[h_+,h_\times]^T$, and

35: $\mathbf{e}$ is the noise in each sample.  $\mathbf{F}$ is a known

36: function of the source sky direction $(\theta,\phi)$, and the decomposition into $+$ and $\times$ polarizations requires us to choose an arbitrary polarization basis angle $\psi$ for each source sky direction.

37:

38: We wish to distinguish between two hypotheses: $H_0$,

39: that the data contains only noise, and $H_1$, that the

40: data contains a gravitational wave signal.  The Bayesian odds ratio \cite{jaynes, gregory}

41: allows us to compare the plausibility of the hypotheses:

42: \begin{equation}

43: \frac{p(H_1|\mathbf{x},I)}

44: {p(H_0|\mathbf{x},I)}=

45: \frac{p(H_1|I)}

46: {p(H_0|I)}

47: \frac{p(\mathbf{x}|H_1,I)}

48: {p(\mathbf{x}|H_0,I)}

49: \label{Bayes_Ratio} \, ,

50: \end{equation}

51: where $I$ is a set of unstated but shared assumptions (such as the

52: detector locations, orientations and noise power spectra). If the posterior plausibility ratio is greater than one,

53: $H_1$ is more plausible than $H_0$ and we

54: classify the observation as a detection.  If the posterior

55: plausibility ratio is less than one, $H_1$ is less

56: plausible than $H_0$ and we classify the observation as a

57: non-detection.

58:

59: The $p(H|I)$ terms (``plausibility of $H$ assuming $I$'') are the

60: \emph{prior} plausibilities we assign to each hypothesis $H$ on the

61: basis of our knowledge $I$ prior to considering the measurement; for

62: example, our expectation that detectable gravitational waves are rare

63: requires that $p(H_1|I)\ll p(H_0|I)$.

64:

65: The $p(\mathbf{x}|H,I)$ terms (``plausibility of $\mathbf{x}$ assuming

66: $H$ and $I$'') are the probabilities assigned by a hypothesis to the

67: occurrence of a particular observation $\mathbf{x}$.  These are

68: sometimes called likelihood functions; they quantify how strongly each

69: hypothesis predicted the measurement that was actually made.

70:

71: The $p(H|\textbf{x},I)$ terms are the \emph{posterior} plausibilities

72: we assign to the hypotheses in light of the observation.

73: %.  The

74: %difference between the prior and posterior plausibility ratios caused

75: %by the observation is the ratio of the plausibilities those hypotheses

76: %assigned to that observation being made;

77: %The hypothesis that made the

78: %better prediction becomes more plausible.

79: The hypothesis that assigned more probability to the observation becomes more plausible.

80:

81: For notational simplicity we will drop the $I$ in our formulae; the unstated assumptions are implicit.

82:

83: If we make the idealized assumption that the noise in each detector is

84: independent and normally distributed \cite{jaynes, gregory} with zero mean and unit standard

85: deviation, we can then write the following expression for the

86: likelihood $p(\mathbf{x}|H_0)$

87: \begin{eqnarray}

88: p(\mathbf{x}|H_0)&=&\prod_{i=1}^N p(x_i|H_0)\nonumber\\

89: &=&\prod_{i=1}^N\frac{1}{\sqrt{2\pi}}\exp(-\frac{1}{2}x_i^2)\nonumber\\

90: &=&(2\pi)^{-\frac{N}{2}}\exp(-\frac{1}{2}\mathbf{x}^T\mathbf{x})\label{singleNoise} \, ,

91: \label{noise_only}

92: \end{eqnarray}

93: where $^T$ denotes matrix transposition.  For real detectors, the

94: measurements can be \emph{whitened} so that this assumption holds, which modifies the effective antenna responses

95: $\mathbf{F}$.

96:

97: If we assume that there is a gravitational wave $\mathbf{h}$ present, then

98: after subtracting away the response $\mathbf{F}\,\mathbf{h}$ the data will

99: be distributed as noise and the likelihood

100: $p(\mathbf{x}|\mathbf{h},H_1)$ becomes

101: %

102: %\begin{widetext}

103: %

104: \begin{eqnarray}

105: p(\mathbf{x}|\mathbf{h},H_1)

106: &=&(2\pi)^{-\frac{N}{2}}\exp(-\frac{1}{2}(\mathbf{x}-\mathbf{F}\,\mathbf{h})^T

107: (\mathbf{x}-\mathbf{F}\,\mathbf{h})) \label{noiseSignal} \, .

108: \label{noise_signal}

109: \end{eqnarray}

110:

111: Unfortunately, we do not know the signal strain vector $\mathbf{h}$

112: {\em a priori}.  To compute the likelihood $p(\mathbf{x}|H_1)$ of the

113: more general hypothesis $H_1$, we need to marginalize

114: away these {\it nuisance parameters}

115: \begin{eqnarray}

116: p(\mathbf{x}|H_1)

117: &=&\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} p(\mathbf{h}|H_1)

118: p(\mathbf{x}|\mathbf{h},H_1) \, \mathrm{d}{h_+} \, \mathrm{d}{h_\times} \, .

119: \label{marginal}

120: \end{eqnarray}

121: The hypothesis resulting from the marginalization integral is an

122: average of the hypotheses for particular signals $\mathbf{h}$,

123: weighted by the prior probability $p(\mathbf{h}|H_1)$ we assign

124: to those signals occurring.  A convenient choice of prior is to use a normal

125: distribution for each polarization, with a standard deviation $\sigma$

126: indicative of the amplitude scale of gravitational waves we hope to

127: detect. Under these assumptions the prior is

128: \begin{eqnarray}\label{wave_distribution}

129: p(\mathbf{h}|H_1)

130: & = &

131:         \frac{1}{2\pi\sigma^2}\exp(-\frac{1}{2\sigma^2}\mathbf{h}^T\mathbf{h}) \, .

132: \end{eqnarray}

133: This allows us to perform the marginalization integral analytically

134: \begin{eqnarray}

135: p(\mathbf{x}|H_1)

136: & = &

137:         (2\pi)^{-\frac{N}{2}-1}\sigma^{-2} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty}

138:        \exp(-\frac{1}{2}((\mathbf{x}-\mathbf{F}\,\mathbf{h})^T

139:         (\mathbf{x}-\mathbf{F}\,\mathbf{h})

140:         \nonumber \\

141: &   &   \mbox{}

142:         +\sigma^{-2}\mathbf{h}^T\mathbf{h})) \, \mathrm{d}{h_+} \, \mathrm{d}{h_\times}

143:         \nonumber \\

144: & = &

145:         (2\pi)^{-\frac{N}{2}}

146:         |\mathbf{I-K_\mathrm{ss}}|^{\frac{1}{2}} \exp(-\frac{1}{2}\,\mathbf{x}^T

147:         (\mathbf{I-K_\mathrm{ss}})\mathbf{x}) \, ,

148:         \label{eq:simpleP}

149: \end{eqnarray}

150: %

151: %\end{widetext}

152: %

153: where

154: \begin{eqnarray}

155: \mathbf{K_\mathrm{ss}}

156: &\equiv&

157:         \mathbf{F}

158:         (\mathbf{F}^T\mathbf{F}+\sigma^{-2}\mathbf{I})^{-1}

159:         \mathbf{F}^T\label{eq:simpleC}.

160: \end{eqnarray}

161: The result is a multivariate normal distribution with covariance

162: matrix $(\mathbf{I-K_\mathrm{ss}})^{-1}$, which quantifies the correlations among the

163: detectors due to the presence of a gravitational wave signal.
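
As a sketch of the intermediate step leading to (\ref{eq:simpleP}), write $\mathbf{Q}\equiv\mathbf{F}^T\mathbf{F}+\sigma^{-2}\mathbf{I}$ and $\mathbf{\hat{h}}\equiv\mathbf{Q}^{-1}\mathbf{F}^T\mathbf{x}$; the symbol $\mathbf{Q}$ is introduced here only for illustration.  Completing the square in $\mathbf{h}$ gives
\begin{eqnarray}
(\mathbf{x}-\mathbf{F}\,\mathbf{h})^T(\mathbf{x}-\mathbf{F}\,\mathbf{h})+\sigma^{-2}\mathbf{h}^T\mathbf{h}
&=&
(\mathbf{h}-\mathbf{\hat{h}})^T\mathbf{Q}\,(\mathbf{h}-\mathbf{\hat{h}})
+\mathbf{x}^T(\mathbf{I-K_\mathrm{ss}})\mathbf{x} \, , \nonumber
\end{eqnarray}
so the two-dimensional Gaussian integral over $\mathbf{h}$ contributes a factor $2\pi|\mathbf{Q}|^{-\frac12}$.  The matrix determinant lemma gives $|\mathbf{I-K_\mathrm{ss}}|=|\mathbf{Q}-\mathbf{F}^T\mathbf{F}|/|\mathbf{Q}|=\sigma^{-4}/|\mathbf{Q}|$, so the remaining prefactor $\sigma^{-2}|\mathbf{Q}|^{-\frac12}$ equals $|\mathbf{I-K_\mathrm{ss}}|^{\frac12}$.  Equivalently, the Woodbury identity gives
\begin{eqnarray}
(\mathbf{I-K_\mathrm{ss}})^{-1} &=& \mathbf{I}+\sigma^2\mathbf{F}\mathbf{F}^T \, , \nonumber
\end{eqnarray}
which is the covariance of $\mathbf{x}=\mathbf{F}\,\mathbf{h}+\mathbf{e}$ when $\mathbf{h}$ and $\mathbf{e}$ are independent with covariances $\sigma^2\mathbf{I}$ and $\mathbf{I}$ respectively.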

164:

165: With both hypotheses defined, we can form the \emph{likelihood ratio}

166: \begin{eqnarray}

167: \Lambda

168: & = &

169:         \frac{p(\mathbf{x}|H_1)}

170:         {p(\mathbf{x}|H_0)}

171:         \nonumber\\

172: %& = &

173: %        |\mathbf{I-K_\mathrm{ss}}|^\frac12 \exp(

174: %            \frac{1}{2}\,\mathbf{x}^T \mathbf{K_\mathrm{ss}} \mathbf{x})

175: %        \nonumber\\

176: & = &

177:         |\mathbf{I-K_\mathrm{ss}}|^\frac12 \exp ( \frac{1}{2}\,\mathbf{x}^T

178:         \mathbf{F}(\mathbf{F}^T\mathbf{F}+\sigma^{-2}\mathbf{I})^{-1}

179:         \mathbf{F}^T\mathbf{x}) \, . \,

180: \label{eqn:ssLambda}

181: \label{likelihood_final}

182: \end{eqnarray}

183: Multiplying the likelihood ratio by the prior plausibility ratio

184: $p(H_1)/p(H_0)$ completes the calculation of the Bayesian odds ratio

185: (\ref{Bayes_Ratio}).
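
Taking logarithms makes the decision rule explicit.  As an illustrative rearrangement of (\ref{Bayes_Ratio}) and (\ref{eqn:ssLambda}), the odds ratio exceeds one, and the observation is classified as a detection, when
\begin{eqnarray}
\frac{1}{2}\,\mathbf{x}^T\mathbf{K_\mathrm{ss}}\mathbf{x}
&>&
\ln\frac{p(H_0)}{p(H_1)}-\frac{1}{2}\ln|\mathbf{I-K_\mathrm{ss}}| \, , \nonumber
\end{eqnarray}
so the prior odds and the determinant factor together set a threshold on the quadratic form $\mathbf{x}^T\mathbf{K_\mathrm{ss}}\mathbf{x}$.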

186:

187: %The part of the likelihood ratio in the exponential can be directly

188: %compared to existing non-Bayesian statistics.  In particular, i

189: In the limit

190: $\sigma\rightarrow\infty$ we find that the odds ratio contains the

191: least-squares estimate of the strain

192: \begin{eqnarray}

193: \mathbf{\hat{h}}&=&(\mathbf{F}^T\mathbf{F})^{-1}\mathbf{F}^T\mathbf{x} \, .

194:  \end{eqnarray}

195: The odds ratio may then be rewritten in terms of a matched filter for the

196: response to the estimated strain, $\mathbf{x}^T\mathbf{F}\,\mathbf{\hat{h}}$.

197: For finite values of $\sigma$, the odds ratio contains the \emph{Tikhonov regularized}

198: estimate of the strain \cite{Ra:06}

199: \begin{eqnarray}

200:   \mathbf{\hat{h}} = (\mathbf{F}^T\mathbf{F}+\sigma^{-2}\mathbf{I})^{-1}\mathbf{F}^T\mathbf{x} \, ,

201: \end{eqnarray}

202: and can still be rewritten as a matched filter for this estimate.
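
To make the matched-filter form explicit (a short worked step using only the definitions above), note that with the regularized estimate $\mathbf{\hat{h}}$ the quadratic form in the exponent of (\ref{eqn:ssLambda}) is
\begin{eqnarray}
\mathbf{x}^T\mathbf{K_\mathrm{ss}}\mathbf{x} &=& \mathbf{x}^T\mathbf{F}\,\mathbf{\hat{h}} \, , \nonumber
\end{eqnarray}
so that $\Lambda=|\mathbf{I-K_\mathrm{ss}}|^{\frac12}\exp(\frac12\,\mathbf{x}^T\mathbf{F}\,\mathbf{\hat{h}})$: the data are correlated against the detector response $\mathbf{F}\,\mathbf{\hat{h}}$ predicted by the estimated strain.  In the limit $\sigma\rightarrow\infty$, and assuming $\mathbf{F}^T\mathbf{F}$ is invertible, $\mathbf{K_\mathrm{ss}}$ tends to the projector $\mathbf{F}(\mathbf{F}^T\mathbf{F})^{-1}\mathbf{F}^T$ and the same identity holds with the least-squares estimate.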

203: %We discuss the relationship of the Bayesian to Tikhonov and other

204: %statistics in more detail in Section~\ref{sec:comparison}.

205:

206: It is also worth noting the presence in (\ref{eqn:ssLambda}) of the determinant factor $|\mathbf{I-K_\mathrm{ss}}|^{\frac12}$.

207: It is independent of the data and depends only on the antenna pattern and the signal model.  In particular, it tells us how strongly

208: to weight likelihoods computed for different possible sky positions

209: of the signal.  This {\em Occam factor} penalizes sky positions

210: of high sensitivity relative to sky positions of lower sensitivity for which

211: the exponential part of the likelihood is similar.  When the data contains strong evidence for a signal,

212: the effect is typically small compared to the exponential,

213: but it can be important for weak signals and for parameter estimation.
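
As an illustration of how this factor depends on sensitivity, let $\lambda_1$ and $\lambda_2$ denote the eigenvalues of the $2\times2$ matrix $\mathbf{F}^T\mathbf{F}$ (notation introduced here for this sketch only).  The matrix determinant lemma then gives
\begin{eqnarray}
|\mathbf{I-K_\mathrm{ss}}|^{\frac12}
&=&
|\mathbf{I}+\sigma^2\mathbf{F}^T\mathbf{F}|^{-\frac12}
=
[(1+\sigma^2\lambda_1)(1+\sigma^2\lambda_2)]^{-\frac12} \, , \nonumber
\end{eqnarray}
which is at most one and decreases as the network response to either polarization grows.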

214:

215:

216: %% %% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

217:

218: \subsection{General Bayesian model}

219:

220: We now generalize the analysis of the previous section to the case of

221: burst signals of extended duration and unknown source sky direction $(\theta, \phi)$ and arrival

222: time $\tau$ with respect to the centre of the Earth.

223:

224: Each of the $N$ gravitational wave detectors in a global network produces a

225: time-series of $M$ observations with sampling frequency

226: $f_\textrm{s}$, which we pack into a single vector

227: \begin{equation}

228: \fl

229: \mathbf{x}=[x_{1,1},x_{1,2},\ldots,x_{1,M},x_{2,1},x_{2,2},\ldots,x_{2,M},\ldots,x_{N,1},x_{N,2},\ldots,x_{N,M}]^T \ .

230: \end{equation}

231: %We want to classify the observation as a gravitational wave detection

232: %or not.  Bayesian inference does not allow us to \emph{reject} a

233: %hypothesis in isolation, so we must propose (at least) two hypotheses

234: %and compute which is more plausible.  We will consider a signal

235: %hypothesis $H_1$ and a noise hypothesis

236: %$H_0$.  The ability of the observation to distinguish

237: %between these two hypotheses is contingent upon the observation being

238: %differently distributed for each hypothesis

239: %\begin{eqnarray}

240: %p(\mathbf{x}|H_1)&\neq&p(\mathbf{x}|H_0)

241: %\ .

242: %\end{eqnarray}

243: %To compute these plausibility distributions, we must explicitly form a

244: %model of the experiment.  For $H_1$, we will use the

245: %model

246: Our signal model is a generalization of (\ref{eqn:ssmodel}),

247: \begin{eqnarray}

248: \mathbf{x}&=&\mathbf{F}(\tau,\theta,\phi)\cdot\mathbf{h}+\mathbf{e} \, ,\label{eq:linearmodel}

249: \end{eqnarray}

250: where

251: \begin{eqnarray}

252: \mathbf{h}&=&[h_{+,1},h_{+,2},\ldots,h_{+,L},h_{\times,1},\ldots,h_{\times,L}]^T

253: \end{eqnarray}

254: is a time-series of $2 L$ samples describing the band-limited strain

255: waveform (with the two polarizations packed into a single vector),

256: %$(\tau,\theta,\phi)$ are respectively the time of arrival and

257: %source sky direction of the gravitational wave,

258: $\mathbf{e}$ is a random variable representing the

259: instrumental noise, and $\mathbf{F}(\tau,\theta,\phi)$ is an $NM\times

260: 2L$ response matrix describing the response of each observatory to an

261: incoming gravitational wave,

262: \begin{eqnarray}

263: \fl \mathbf{F}(\tau,\theta,\phi)&=&

264: \left[

265: \begin{array}{cc}

266: F^+_1(\theta,\phi)\mathbf{T}(\tau+\Delta\tau_1(\theta,\phi)) & F^\times_1(\theta,\phi)\mathbf{T}(\tau+\Delta\tau_1(\theta,\phi)) \\

267: F^+_2(\theta,\phi)\mathbf{T}(\tau+\Delta\tau_2(\theta,\phi)) & F^\times_2(\theta,\phi)\mathbf{T}(\tau+\Delta\tau_2(\theta,\phi)) \\

268: \vdots & \vdots \\

269: F^+_N(\theta,\phi)\mathbf{T}(\tau+\Delta\tau_N(\theta,\phi)) & F^\times_N(\theta,\phi)\mathbf{T}(\tau+\Delta\tau_N(\theta,\phi))

270: \end{array}

271: \right] \, .

272: \end{eqnarray}

273: Each $M\times L$ block of the response matrix is responsible for

274: scaling and time shifting one of the waveform polarizations for one

275: detector, so each block is the product of the directional

276: sensitivity of each detector to each polarization, $F^+_i(\theta,\phi)$

277: or $F^\times_i(\theta,\phi)$, and a time delay matrix $T_{j,k}(t)$

278: \footnote{

279: From the assumption that the signal is band-limited, it follows that the

280: time delay matrix may be written as $T_{j,k}(t)=\textrm{sinc}(\pi(j-k-f_\textrm{s}t))$; for $L = M$ and zero time delays, it is equal to the identity matrix; for $L = M$ and time delays corresponding

281: to integer numbers of time samples, it is a \emph{shift matrix}.

282: },

283:  for the source sky direction

284: dependent arrival times $\tau+\Delta\tau_i(\theta,\phi)$ at each

285: detector.
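
As a small worked example (with illustrative values not taken from the text), take $L=M=3$ and a delay of exactly one sample, $f_\textrm{s}t=1$.  The sinc function vanishes at every nonzero integer multiple of $\pi$, so $T_{j,k}=1$ when $j-k=1$ and zero otherwise:
\begin{eqnarray}
\mathbf{T} = \left[
\begin{array}{ccc}
0 & 0 & 0 \\
1 & 0 & 0 \\
0 & 1 & 0
\end{array}
\right] , \qquad
\mathbf{T}\,[h_1,h_2,h_3]^T = [0,h_1,h_2]^T \, , \nonumber
\end{eqnarray}
i.e.\ the waveform is delayed by one sample and its last sample leaves the observation window.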

286:

287: %We can restate the equality in (\ref{eq:linearmodel}) as a

288: %Dirac delta-function plausibility distribution

289: %\begin{eqnarray}

290: %p(\mathbf{x}|\mathbf{e},\mathbf{h},\tau,\theta,\phi,H_1)

291: %= \delta(\mathbf{x}-\mathbf{F}(\tau,\theta,\phi)\cdot\mathbf{h}-\mathbf{e}) \ .

292: %\end{eqnarray}

293: %We can then use the marginalization theorem to compute the likelihood

294: %of the data given the hypothesis that a burst is present \cite{jaynes}

295: %\begin{eqnarray}

296: %p(\mathbf{x}|H_1)

297: %  &=&  \int_{V_{\mathbf{e},\mathbf{h},\tau,\theta,\phi}} \!\!\!\!\!\!\!\!\!\!\!\!

298: %          p(\mathbf{x}|\mathbf{e},\mathbf{h},\tau,\theta,\phi,H_1) \,

299: %          p(\mathbf{e},\mathbf{h},\tau,\theta,\phi|H_1) \,

300: %          \mathrm{d}\mathbf{e}\ldots\mathrm{d}\phi \ ,

301: %\end{eqnarray}

302: %where $V_{\mathbf{e},\mathbf{h},\tau,\theta,\phi}$ is the space of all our parameter values.

303:

304: %Similarly we can state the noise model hypothesis as a Dirac

305: %delta-function plausibility distribution, implying the expression

306: %\begin{eqnarray}

307: %p(\mathbf{x}|\mathbf{e},H_0)&=&\delta(\mathbf{x}-\mathbf{e})\\

308: %p(\mathbf{x}|H_0)&=&

309: %\int_{V_{\mathbf{e}}}

310: % p(\mathbf{x}|\mathbf{e},H_0) \, p(\mathbf{e}|H_0) \mathrm{d}\mathbf{e} \, ,

311: %\end{eqnarray}

312: %where ${V_{\mathbf{e}}}$ is the space of the noise.

313:

314: %By using the above expressions we can now construct the \emph{Bayes factor}

315: %\begin{eqnarray}

316: %\fl \frac{p(\mathbf{x}|H_1)}{p(\mathbf{x}|H_0)}

317: %  &=&  \frac{

318: %        \int_{V_{\mathbf{e},\mathbf{h},\tau,\theta,\phi}} \!\!\!\!

319: %        p(\mathbf{x}|\mathbf{e},\mathbf{h},\tau,\theta,\phi,H_1) \,

320: %        p(\mathbf{e},\mathbf{h},\tau,\theta,\phi|H_1) \,

321: %        \mathrm{d}\mathbf{e}\ldots\mathrm{d}\phi

322: %        }{

323: %        \int_{{V_{\mathbf{e}}}}

324: %        p(\mathbf{x}|\mathbf{e},H_0) \,

325: %        p(\mathbf{e}|H_0) \, \mathrm{d}\mathbf{e}

326: %        } \, .

327: %\end{eqnarray}

328: %%and the  \emph{posterior plausibility ratio} is equal to

329: %%\begin{eqnarray}

330: %%\frac{p(H_1|\mathbf{x})}{p(H_0|\mathbf{x})}

331: %%&=&

332: %%    \frac{p(\mathbf{x}|H_1)}{p(\mathbf{x}|H_0)}

333: %%    \frac{p(H_1)}{p(H_0)} \, .

334: %%\end{eqnarray}

335:

336: \subsection{Noise model}

337:

338: %The noise distribution is unaffected by the signal parameters,

339: %\begin{eqnarray}

340: %p(\mathbf{e}|\mathbf{h},\tau,\theta,\phi,H_1)

341: %=

342: %p(\mathbf{e}|H_0)

343: %= p(\mathbf{e}) \, .

344: %\end{eqnarray}

345: %It then follows that

346: %\begin{eqnarray}

347: %p(\mathbf{e},\mathbf{h},\tau,\theta,\phi|H_1)

348: %&=&

349: %p(\mathbf{e}) \, p(\mathbf{h},\tau,\theta,\phi|H_1) \, .\label{eq:signalprior}

350: %\end{eqnarray}

351: The noise that affects gravitational wave detectors is typically

352: modeled as stationary, colored Gaussian noise that is independent of the signal parameters.  This can be represented with a

353: \emph{multivariate normal distribution}, which can be compactly written as

354: \begin{eqnarray}

355: \mathcal{N}(\mathbf{\mu},\mathbf{\Sigma},\mathbf{x})&=&\frac{1}{(2\pi)^{d/2}\sqrt{|\mathbf{\Sigma}|}}\exp(-\frac{1}{2}(\mathbf{x}-\mathbf{\mu})^T\mathbf{\Sigma}^{-1}(\mathbf{x}-\mathbf{\mu})) \, ,

356: \end{eqnarray}

357: where $d$ is the dimension of $\mathbf{x}$.  The vector $\mathbf{\mu}$ is the mean of the distribution, and the

358: positive-definite \emph{covariance matrix} $\mathbf{\Sigma}$

359: describes the ellipsoidal shape of the constant-density contours of the distribution in terms of the

360: pairwise covariances of the samples,

361: \begin{eqnarray}

362: \mathbf{\Sigma}_{i,j} = \langle(e_i-\mu_i)(e_j-\mu_j)\rangle \, .

363: \end{eqnarray}

364: Using this notation, the noise likelihood is %we can define the noise distribution to be

365: \begin{eqnarray}

366: p(\mathbf{x}|H_0) &=& \mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{x})

367: %p(\mathbf{e})&=&\mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{e})

368: \end{eqnarray}

369: for some $MN\times MN$ positive definite matrix $\mathbf{\Sigma}$.

370: Under the additional assumption of stationarity over some timescale,

371: these covariances can be estimated from previous observations.

372:

373: In the case of Gaussian stationary colored noise, each detector is individually

374: represented by a Toeplitz covariance matrix $\mathbf{\Sigma}^{(i)}$.  For uncorrelated noise,

375: the covariance matrix for the whole network is $\mathbf{\Sigma} =

376: \textrm{diag}(\mathbf{\Sigma}^{(1)},

377: \mathbf{\Sigma}^{(2)},\ldots,\mathbf{\Sigma}^{(N)})$.  In the simple

378: case in which all the noises are white, have equal standard deviation and are uncorrelated, we have $\mathbf{\Sigma} =

379: \textrm{diag}(\mathbf{I}, \mathbf{I},\ldots,\mathbf{I})=\mathbf{I}$.
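
For illustration, suppose the zero-mean noise in detector $i$ has autocovariance $R_i(n)=\langle e_{i,j}\,e_{i,j+n}\rangle$ at a lag of $n$ samples (the symbol $R_i$ is introduced here and is not part of the notation above).  Stationarity means that $\mathbf{\Sigma}^{(i)}_{j,k}=R_i(|j-k|)$, so for $M=3$
\begin{eqnarray}
\mathbf{\Sigma}^{(i)} = \left[
\begin{array}{ccc}
R_i(0) & R_i(1) & R_i(2) \\
R_i(1) & R_i(0) & R_i(1) \\
R_i(2) & R_i(1) & R_i(0)
\end{array}
\right] \, , \nonumber
\end{eqnarray}
which is constant along each diagonal.  White noise of unit variance corresponds to $R_i(0)=1$ and $R_i(n)=0$ for $n\neq0$.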

380:

381: %We can now derive the expression for the noise likelihood

382: %\begin{eqnarray}

383: %p(\mathbf{x}|H_0)

384: %&=&

385: %\int_{{V_{\mathbf{e}}}}

386: %p(\mathbf{x}|\mathbf{e},H_0) \,

387: %p(\mathbf{e}|H_0) \,

388: %\mathrm{d}\mathbf{e}\nonumber

389: %\\

390: %&=&

391: %\int_{{V_{\mathbf{e}}}}

392: %\delta(\mathbf{x}-\mathbf{e}) \,

393: %\mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{e}) \,

394: %\mathrm{d}\mathbf{e}\nonumber

395: %\\

396: %&=&

397: %\mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{x}) \, ,

398: %\end{eqnarray}

399: %and substitute the noise model expression into

400: The generalization of (\ref{noise_signal}) and (\ref{marginal}) for the signal likelihood is

401: \begin{eqnarray}

402: %\fl p(\mathbf{x}|H_1)

403: %&=&

404: %\int_{V_{\mathbf{e},\mathbf{h},\tau,\theta,\phi}} \!\!\!\!\!\!\!\!\!\!\!\!

405: %p(\mathbf{x}|\mathbf{e},\mathbf{h},\tau,\theta,\phi,H_1) \,

406: %p(\mathbf{e},\mathbf{h},\tau,\theta,\phi|H_1) \,

407: %\mathrm{d}\mathbf{e}\ldots\mathrm{d}\phi\nonumber

408: %\\

409: %&=&

410: %\int_{V_{\mathbf{e},\mathbf{h},\tau,\theta,\phi}} \!\!\!\!\!\!\!\!\!\!\!\!

411: %\delta(\mathbf{x}-\mathbf{F}(\tau,\theta,\phi)\cdot\mathbf{h}-\mathbf{e}) \,

412: %\mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{e}) \,

413: %p(\mathbf{h},\tau,\theta,\phi|H_1) \,

414: %\mathrm{d}\mathbf{e}\ldots\mathrm{d}\phi\nonumber

415: %\\

416: p(\mathbf{x}|H_1)

417: &=&

418: \int_{V_{\mathbf{h},\tau,\theta,\phi}} \!\!\!\!\!\!\!\!\!

419: \mathcal{N}(\mathbf{F}(\tau,\theta,\phi)\cdot\mathbf{h},\mathbf{\Sigma},\mathbf{x}) \,

420: p(\mathbf{h},\tau,\theta,\phi|H_1) \,

421: \mathrm{d}\mathbf{h}\ldots\mathrm{d}\phi \ ,\label{eq:partialmarginalization}

422: \end{eqnarray}

423: where ${V_{\mathbf{h},\tau,\theta,\phi}}$ is the space of all signal parameters

424: and $p(\mathbf{h},\tau,\theta,\phi|H_1)$ is the prior for these parameters.

425: Without loss of generality we may separate this signal prior into a

426: prior on source sky direction and arrival time, and a prior on the waveform

427: \emph{conditional on} the source sky direction and the arrival time, i.e.

428: \begin{eqnarray}

429: p(\mathbf{h},\tau,\theta,\phi|H_1)

430:  = p(\tau,\theta,\phi|H_1) \, p(\mathbf{h}|\tau,\theta,\phi,H_1) \, ,

431: \end{eqnarray}

432: giving

433: \begin{eqnarray}

434: \fl p(\mathbf{x}|H_1)

435: &=&

436: \int_{V_{\mathbf{h},\tau,\theta,\phi}} \!\!\!\!\!\!\!\!\!

437: \mathcal{N}(\mathbf{F}(\tau,\theta,\phi)\cdot\mathbf{h},\mathbf{\Sigma},\mathbf{x}) \,

438: p(\tau,\theta,\phi|H_1) \, p(\mathbf{h}|\tau,\theta,\phi,H_1) \,

439: \mathrm{d}\mathbf{h}\ldots\mathrm{d}\phi \ .\label{eq:partialmarginalization2}

440: \end{eqnarray}

441:

442: \subsection{Wideband signal model}

443: \label{sec:wideband}

444:

445: In analogy with the single sample case, we can choose a multivariate normal distribution prior for the waveform amplitudes and

446: render the integral soluble in closed form.

447: %If we choose a multivariate normal distribution for the strain

448: %waveform samples,

449: The marginalization integral over $\mathbf{h}$ in (\ref{eq:partialmarginalization2}) can then be analytically performed, giving

450: \begin{eqnarray}

451: \frac{p(\mathbf{x}|\tau,\theta,\phi,H_1)}

452: {p(\mathbf{x}|H_0)}

453: &=&

454: \frac{

455: \int_{\mathbb{R}^{2L}}

456: \mathcal{N}(\mathbf{F}(\tau,\theta,\phi)\cdot\mathbf{h},\mathbf{\Sigma},\mathbf{x}) \,

457: p(\mathbf{h}|\tau,\theta,\phi,H_1) \,

458: \mathrm{d}\mathbf{h}}

459: {

460: \mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{x})

461: } \label{eq:quick}

462: \end{eqnarray}

463: (see (\ref{eqn:explicit}) below).  Numerical integration over a more

464: manageable three dimensions is then sufficient to compute the Bayes factor,

465: \begin{eqnarray}

466: \frac{

467: p(\mathbf{x}|H_1)

468: }{

469: p(\mathbf{x}|H_0)

470: }

471: &=&

472: \int\int\int p(\tau,\theta,\phi|H_1) \,

473: \frac{p(\mathbf{x}|\tau,\theta,\phi,H_1)}

474: {p(\mathbf{x}|H_0)} \,

475: \mathrm{d}\tau \, \mathrm{d}\theta \, \mathrm{d}\phi \, .

476: \end{eqnarray}

477: This signal model is computationally tractable.  It represents signals that can be described by an invertible $2 L\times 2 L$ correlation matrix,

478: including the important `least informative' case of independent, normally distributed samples of $\mathbf{h}$.
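
One way to evaluate the three-dimensional integral, sketched here under the assumption of a regular grid of $n_\tau\times n_\theta\times n_\phi$ points with spacings $\Delta\tau$, $\Delta\theta$, $\Delta\phi$ (the grid and its indices are illustrative and not prescribed by the analysis), is the Riemann sum
\begin{eqnarray}
\fl \frac{p(\mathbf{x}|H_1)}{p(\mathbf{x}|H_0)}
&\approx&
\sum_{a=1}^{n_\tau}\sum_{b=1}^{n_\theta}\sum_{c=1}^{n_\phi}
p(\tau_a,\theta_b,\phi_c|H_1) \,
\frac{p(\mathbf{x}|\tau_a,\theta_b,\phi_c,H_1)}{p(\mathbf{x}|H_0)} \,
\Delta\tau \, \Delta\theta \, \Delta\phi \, , \nonumber
\end{eqnarray}
in which each summand requires one evaluation of the closed-form ratio (\ref{eq:quick}).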

479:

480: %Conclusions; this is the 'least informative' computationally tractable situation,

481: %but we are not wholly ignorant of the physics of plausible sources and we may want to include

482: %some of this knowledge in the priors; the next section shows how to do so.

483:

484: \subsection{Informative signal models}

485: %A normal distribution prior for $\mathbf{h}$ states that the signal is

486: %a superposition of basis waveforms with normally distributed amplitudes,

487: %and has $2L$ degrees of freedom.  This is a model for

488: %bursts with a fairly tightly constrained total energy, which is not

489: %a very good representation of our expectations about gravitational wave

490: %bursts.  At the expense of some additional layers of indirection, we

491: %can preserve much of the computational advantages of the normal distribution

492: %prior while modeling a much more realistic source population.

493:

494: %This models a burst with a specific energy, but as general a waveform as possible.

495: %However, we may want to search for less general waveforms--perhaps

496: %restricted to certain bandwidths, durations or even interpolated

497: %families of waveforms from numerical relativity--and for a distribution of energies.

498:

499: The wideband signal model excludes some important cases, such as a known waveform, an almost-known waveform

500: (such as one drawn from a family of numerical simulations), or even just a signal restricted to some frequency band.  These signals

501: are superpositions of a (relatively) small number $G < 2 L$ of basis waveforms that may themselves be characterized

502: by a finite number of parameters, which we denote $\mathbf{\rho}$.

503: %Let a vector $\mathbf{\rho}$ contain a small (or zero) number of

504: %parameters of the signal model that are not marginalizable

505: %analytically.

506: Like $\tau$, $\theta$, and $\phi$, these parameters must be integrated numerically,

507: which may be time-consuming.

508: Their prior distribution will be denoted by

509: $p(\mathbf{\rho}|\tau,\theta,\phi,H_1)$.

510:

511: To describe the signal as a superposition of basis waveforms \cite{Heng:09},

512: define a set of amplitude parameters $\mathbf{a}$ mapped into strain

513: $\mathbf{h}$ via a $2L\times G$ matrix

514: $\mathbf{W}(\mathbf{\rho},\tau,\theta,\phi)$ whose columns

515: $\mathbf{w}_i(\mathbf{\rho},\tau,\theta,\phi)$ are the basis waveforms, so that

516: \begin{eqnarray}

517: \mathbf{h}&=&\mathbf{W}(\mathbf{\rho},\tau,\theta,\phi)\cdot\mathbf{a} \ .

518: \end{eqnarray}

519: We assume that the amplitude parameters $\mathbf{a}$ follow a multivariate normal distribution with

520: covariance matrix $\mathbf{A}(\mathbf{\rho},\tau,\theta,\phi)$, so that

521: \begin{eqnarray}

522: p(\mathbf{a}|\mathbf{\rho},\tau,\theta,\phi,H_1)&=&\mathcal{N}(\mathbf{0},\mathbf{A}(\mathbf{\rho},\tau,\theta,\phi),\mathbf{a}) \, .

523: \end{eqnarray}

524: The resulting distribution for the waveform strain is

525: \begin{eqnarray}

526: \fl p(\mathbf{h}|\tau,\theta,\phi,H_1)

527: &=&

528: \int_{V_{\rho}} \int_{\mathbb{R}^{G}}

529: p(\mathbf{h}|\mathbf{a},\mathbf{\rho},\tau,\theta,\phi,H_1) \,

530: p(\mathbf{a},\mathbf{\rho}|\tau,\theta,\phi,H_1)

531: \, \mathrm{d} \mathbf{a} \, \mathrm{d} \mathbf{\rho} \, \nonumber \\

532:  &=&

533: \int_{V_{\rho}} \int_{\mathbb{R}^{G}}

534: \delta(\mathbf{h}-\mathbf{W}\cdot\mathbf{a}) \,

535: \mathcal{N}(\mathbf{0},\mathbf{A},\mathbf{a}) \,

536: p(\mathbf{\rho}|\tau,\theta,\phi,H_1)

537: \, \mathrm{d}\mathbf{a} \, \mathrm{d}\mathbf{\rho} \, ,

538: \end{eqnarray}

539: where for clarity we have begun to omit the dependence of matrices on their

540: parameters.  As $G < 2L$ ({\em i.e.}, we have fewer basis waveforms than

541: samples in the signal time-series) the integral over $\mathbf{a}$ cannot

542: be directly represented as a multivariate normal distribution.
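
To see why (a short illustrative step), note that for fixed $\mathbf{\rho}$, $\tau$, $\theta$ and $\phi$ this model gives
\begin{eqnarray}
\langle\mathbf{h}\,\mathbf{h}^T\rangle
&=&
\mathbf{W}\,\langle\mathbf{a}\,\mathbf{a}^T\rangle\,\mathbf{W}^T
= \mathbf{W}\mathbf{A}\mathbf{W}^T \, , \nonumber
\end{eqnarray}
a $2L\times2L$ matrix of rank at most $G<2L$.  It is therefore singular, and $\mathbf{h}$ is confined to the subspace spanned by the basis waveforms, so no normal density on $\mathbb{R}^{2L}$ exists; the delta function above expresses this constraint.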

543:

544: This signal model proposes that gravitational wave signals have

545: waveforms that are the sum of $G$ basis waveforms with amplitudes that

546: are normally distributed (and potentially correlated).  The basis

547: waveforms and their amplitude distributions may vary with source sky direction,

548: arrival time, and any other parameters we care to include in

549: $\mathbf{\rho}$.  The model is capable of representing a variety of

550: sources including the important special cases of known `template'

551: waveforms, and band-limited bursts.  We will consider some

552: concrete examples in \S\ref{sec:signalexamples}; perhaps the most

553: important parameter to include in $\mathbf{\rho}$ is a scale parameter $\sigma$, which permits us to look

554: for signals of different total energies.

555:

556: We can substitute this expression into the numerator of

557: (\ref{eq:quick}) to form an integral over multivariate normal distributions

558: whose solution is given in \cite{jaynes}:

559: \begin{eqnarray}

560: \fl p(\mathbf{x}|\tau,\theta,\phi,H_1)

561: &=&

562: \int_{V_{\rho}}\int_{\mathbb{R}^{G+2L}} \!\!\!\!

563: \mathcal{N}(\mathbf{F}\cdot\mathbf{h},\mathbf{\Sigma},\mathbf{x}) \,

564: \delta(\mathbf{h}

565: -\mathbf{W}\cdot\mathbf{a}) \,

566: \mathcal{N}(\mathbf{0},\mathbf{A},\mathbf{a}) \,

567: \nonumber \\

568: & & \mbox{} \times

569: p(\mathbf{\rho}|\tau,\theta,\phi,H_1) \,

570: \mathrm{d}\mathbf{a} \, \mathrm{d}\mathbf{\rho} \, \mathrm{d}\mathbf{h} \nonumber \\

571: &=&

572: \int_{V_{\rho}}

573: \mathcal{N}(\mathbf{0},(\mathbf{\Sigma}^{-1}-\mathbf{K})^{-1},\mathbf{x}) \,

574: p(\mathbf{\rho}|\tau,\theta,\phi,H_1) \,

575: \mathrm{d}\mathbf{\rho} \, ,

576: \end{eqnarray}

577: where the matrix

578: \begin{eqnarray}

579: \fl \mathbf{K}(\mathbf{\rho},\tau,\theta,\phi)&=&

580: (\mathbf{\Sigma}^{-1}\mathbf{F}\mathbf{W})

581: (

582: (\mathbf{F}\mathbf{W})^T

583: \mathbf{\Sigma}^{-1}

584: \mathbf{F}\mathbf{W}

585: +

586: \mathbf{A}^{-1}

587: )^{-1}

588: (\mathbf{\Sigma}^{-1}\mathbf{F}\mathbf{W})^T

589: \end{eqnarray}

590: will be the kernel of our numerical implementation.  Note that this is a generalization of

591: equation (\ref{eq:simpleC}) obtained in the single-sample case.  Since

592: \begin{eqnarray}\label{eqn:note}

593: \fl \frac{

594:         p(\mathbf{x}|\rho,\tau,\theta,\phi,H_1)

595: }{

596:         p(\mathbf{x}|H_0)

597: }

598: & = &

599: \frac{

600:         \mathcal{N}(\mathbf{0},(\mathbf{\Sigma}^{-1}-\mathbf{K})^{-1},\mathbf{x})

601: }{

602:         \mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{x})

603: }

604:  =

605: \sqrt{|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|}

606: \exp(\frac{1}{2}\mathbf{x}^T\mathbf{K}\mathbf{x}) \, ,

607: \end{eqnarray}

608: we have

609: \begin{eqnarray}

610: \fl \frac{p(\mathbf{x}|\tau,\theta,\phi,H_1)}{

611:         p(\mathbf{x}|H_0)}

612: & = &

613:         \int_{V_{\rho}}

614:         p(\mathbf{\rho}|\tau,\theta,\phi,H_1)

615:         \sqrt{|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|}

616:         \exp(\frac{1}{2}\mathbf{x}^T\mathbf{K}\mathbf{x})

617:         \, \mathrm{d}\mathbf{\rho} \, , \label{eqn:explicit}

618: \end{eqnarray}

619: and the Bayes factor becomes

620: \begin{eqnarray}

621: \fl \frac{p(\mathbf{x}|H_1)}{p(\mathbf{x}|H_0)}

622: & = &

623:         \int_{V_{\rho, \tau, \theta, \phi}} \!\!\!\!\!\!

624:         p(\mathbf{\rho},\tau,\theta,\phi|H_1)

625:         \sqrt{|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|}

626:         \exp(\frac{1}{2}\mathbf{x}^T\mathbf{K}\mathbf{x})

627:         \, \mathrm{d}\mathbf{\rho} \, \mathrm{d}\tau

628:         \, \mathrm{d}\theta \, \mathrm{d}\phi \, .\label{eq:fastbayesfactor}

629: \end{eqnarray}

630: In other words, we have reduced the task of computing the Bayes factor

631: to an integral over arrival time, source sky direction, and any additional signal

632: model parameters $\mathbf{\rho}$.
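
As a consistency check on these expressions (a sketch using standard matrix identities, with $\mathbf{U}\equiv\mathbf{F}\mathbf{W}$ introduced here as shorthand), for fixed $\mathbf{\rho}$, $\tau$, $\theta$ and $\phi$ the data $\mathbf{x}=\mathbf{U}\mathbf{a}+\mathbf{e}$ have covariance $\mathbf{\Sigma}+\mathbf{U}\mathbf{A}\mathbf{U}^T$, and the Woodbury identity and the matrix determinant lemma give
\begin{eqnarray}
\fl (\mathbf{\Sigma}+\mathbf{U}\mathbf{A}\mathbf{U}^T)^{-1}
&=&
\mathbf{\Sigma}^{-1}-\mathbf{\Sigma}^{-1}\mathbf{U}
(\mathbf{U}^T\mathbf{\Sigma}^{-1}\mathbf{U}+\mathbf{A}^{-1})^{-1}
\mathbf{U}^T\mathbf{\Sigma}^{-1}
= \mathbf{\Sigma}^{-1}-\mathbf{K} \, , \nonumber \\
\fl |\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|
&=&
|\mathbf{I}+\mathbf{A}\mathbf{U}^T\mathbf{\Sigma}^{-1}\mathbf{U}|^{-1} \, . \nonumber
\end{eqnarray}
The second line involves only a $G\times G$ determinant rather than an $NM\times NM$ one, which may be convenient numerically.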

633:

634: %

635: %\end{widetext}

636: %

637:

638: \subsection{Example signal models\label{sec:signalexamples}}

639:

640: A simple signal model is the wideband signal model discussed briefly in Section~\ref{sec:wideband}.  This describes a burst with a white spectrum,

641: characteristic strain amplitude $\sigma$ (at the Earth), and duration

642: $f_\textrm{s}^{-1}L$, for which

643: \begin{eqnarray}

644: G&=&2L \label{wnb1}\\

645: \mathbf{A}&=&\sigma^2\mathbf{I}\label{eq:sigma} \label{wnb2} \\

646: \mathbf{W}&=&\mathbf{I} \, . \label{wnb3}

647: \end{eqnarray}

648:

649: If we assert that such bursts are equally likely to come from any

650: source sky direction and arrive at any time in the observation window of

651: $f_\textrm{s}^{-1}M$ seconds, then the priors are

652: \begin{eqnarray}

653: p(\theta|H_1)&=&\frac{1}{2}\sin(\theta)\\

654: p(\phi|H_1)&=&(2\pi)^{-1}\\

655: p(\tau|H_1)&=&f_\textrm{s}M^{-1} \, .

656: \end{eqnarray}
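As a quick consistency check, each of these priors integrates to unity.  This sketch assumes that $\theta\in[0,\pi]$ is the polar angle, that $\phi\in[0,2\pi)$, and that $\tau$ ranges over the window $[0,f_\textrm{s}^{-1}M]$; these ranges are assumptions, since they are not stated explicitly above.
\begin{eqnarray}
\int_0^{\pi}\frac{1}{2}\sin(\theta)\,\mathrm{d}\theta &=& \frac{1}{2}\left[-\cos(\theta)\right]_0^{\pi} = 1 \, ,\\
\int_0^{2\pi}(2\pi)^{-1}\,\mathrm{d}\phi &=& 1 \, ,\\
\int_0^{f_\textrm{s}^{-1}M} f_\textrm{s}M^{-1}\,\mathrm{d}\tau &=& 1 \, .
\end{eqnarray}
The product $p(\theta|H_1)\,p(\phi|H_1)=(4\pi)^{-1}\sin(\theta)$ is the density, in the coordinates $(\theta,\phi)$, of a distribution that is uniform per unit solid angle.
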

If we assert that the source population is distributed uniformly in flat space up to some horizon $r_\mathrm{max}$,
%(sources that are extragalactic but not cosmological),
we have a prior on the distance $r$ to the source of
$p(r|H_1)\propto r^2$.  We want to turn this into a prior on the characteristic amplitude $\sigma$, an example of a signal model
parameter that we must marginalize over numerically
($\mathbf{\rho}=[\sigma]$).  Since the gravitational wave energy falls off as the inverse square of the distance to the source, $\sigma^2\propto r^{-2}$, we then deduce that:

663: \begin{eqnarray}

664: p(\sigma|H_1)&=&p(r|H_1)\left|\frac{\mathrm{d}r}{\mathrm{d}\sigma}\right|\\

665: &=&\frac{3\sigma_\mathrm{min}^3}{\sigma^4} \, ,\label{eq:sigma4prior}

666: \end{eqnarray}

667: where $\sigma_\mathrm{min} \propto r_\mathrm{max}^{-1}$ is a lower bound on the amplitude of (or upper bound on the distance of)

668: the gravitational wave.  This bound is obviously

669: somewhat arbitrary, but is a consequence of the way we distinguish

670: between detection and non-detection.  For a uniformly spatially

671: distributed population of bursts there are of course many weak signals

672: within the data, and the noise hypothesis is ``never'' true.  In reality

673: we are interested only in gravitational waves of at least a certain

size.  If $\sigma_\mathrm{min}$ is much smaller than the noise floor
in all detectors, the likelihood we adopted for the noise hypothesis is an
excellent approximation to the likelihood of data containing such weak
signals, and the classification of observations is insensitive to
the particular choice of $\sigma_\mathrm{min}$ below the noise floor.
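
The change of variables leading to (\ref{eq:sigma4prior}) can be written out explicitly.  As a sketch, assume the proportionality $\sigma\propto r^{-1}$ is normalized so that $\sigma=\sigma_\mathrm{min}$ at $r=r_\mathrm{max}$, i.e. $r=r_\mathrm{max}\sigma_\mathrm{min}\sigma^{-1}$ (this normalization is an assumption, chosen to be consistent with the definition of $\sigma_\mathrm{min}$).  Then
\begin{eqnarray}
p(r|H_1) &=& \frac{3r^2}{r_\mathrm{max}^3} \, , \qquad 0\le r\le r_\mathrm{max} \, ,\\
\left|\frac{\mathrm{d}r}{\mathrm{d}\sigma}\right| &=& \frac{r_\mathrm{max}\sigma_\mathrm{min}}{\sigma^2} \, ,\\
p(\sigma|H_1) &=& \frac{3}{r_\mathrm{max}^3}\,\frac{r_\mathrm{max}^2\sigma_\mathrm{min}^2}{\sigma^2}\,\frac{r_\mathrm{max}\sigma_\mathrm{min}}{\sigma^2}
 = \frac{3\sigma_\mathrm{min}^3}{\sigma^4} \, , \qquad \sigma\ge\sigma_\mathrm{min} \, ,
\end{eqnarray}
and the result is correctly normalized, since $\int_{\sigma_\mathrm{min}}^{\infty}3\sigma_\mathrm{min}^3\sigma^{-4}\,\mathrm{d}\sigma=1$.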

679:

680: This distribution of $\sigma$ is preserved if we consider a source population

681: with a distribution of different intrinsic luminosities, so long as they

682: are uniformly distributed in space out to their respective

683: $r_\mathrm{max}$ determined by the choice of $\sigma_\mathrm{min}$.

684:

685: This is an example of a relatively \emph{uninformative} signal model.

686: It is capable of detecting signals of any waveform (of appropriate

687: duration).  However, it incurs a large {\it Occam penalty} for its

688: generality, and cannot be as sensitive as a more \emph{informed}

689: search.

690:

691: The other extreme situation is where a source's waveform is completely

692: known, but its other parameters (amplitude, source sky position, polarization angle) are not.  Consider a source that

693: produces a linearly polarized strain $\mathbf{w}$.  If the source's

694: orientation, inclination and amplitude are unknown, we can

695: parameterize the system with two amplitudes $\mathbf{a}$ mapping the

696: strain into the observatory network's polarization basis

697: \begin{eqnarray}

698: \mathbf{W}&=&\left[

699: \begin{array}{cc}

700: \mathbf{w} & \mathbf{0}\\

701: \mathbf{0} & \mathbf{w}

702: \end{array}\right].

703: \end{eqnarray}

704: This is the Bayesian equivalent of the matched filter.

705: %

706: The template $\mathbf{w}$ appears twice because any specific signal

707: typically will not be aligned with the polarization basis used to describe

708: $h_+$ and $h_\times$ in the detectors, but rather will be rotated by

709: some {\em polarization angle} $\psi$ with respect to that basis.

710: %% The

711: %% amplitudes $\mathbf{a}$ will then scale as $\cos2\psi$ and $\sin2\psi$.

712: %%

713: More generally, any signal model that is independent of the observatory

714: network's polarization basis must have $\mathbf{A}$ and $\mathbf{W}$

715: composed of two identical sub-matrices on the diagonal like this, so that

716: $\mathbf{h}_+$ and $\mathbf{h}_\times$ have the same statistical distribution.  For

717: example, if the source is not linearly polarized, but has strain described

718: by $\mathbf{w}_+$ and $\mathbf{w}_\times$, then

719: \begin{eqnarray}

720: \mathbf{W}&=&\left[

721: \begin{array}{cccc}

722: \mathbf{w}_+ & \mathbf{w}_\times & \mathbf{0} & \mathbf{0}\\

723: \mathbf{0} & \mathbf{0} & \mathbf{w}_+ & \mathbf{w}_\times

724: \end{array}\right].

725: \end{eqnarray}

726: %

727: %A concrete example is the signal from the inspiral of a compact binary,

728: %such as two neutron stars.  In the absence of spin or eccentricity, the

729: %waveform is characterised by 9 parameters: the distance $D$, the

730: %two masses $m_1$ and $m_2$, the inclination angle $\iota$, the

731: %polarization angle $\psi$, a phase $\Phi_0$, the coalescence time $\tau$,

732: %and the sky position angles $\theta,\phi$.  The basis waveforms

733: %$\mathbf{w}_+$, $\mathbf{w}_\times$ are the ``cosine'' and ``sine''

734: %inspiral templates \cite{a-matched-filter-paper}; their shape is

735: %determined by the masses $m_1$, $m_2$, which must be marginalized

736: %over numerically.  The overall phase angle $\Phi_0$ can be marginalized

737: %analytically.  The remaining parameters $\iota$, $\psi$, and $D$ affect

738: %only the relative amplitude with which $\mathbf{w}_+$ and $\mathbf{w}_\times$

739: %couple into the detector data streams; i.e., they contribute only to the

740: %amplitudes $\mathbf{a}$:

741: %\begin{equation}

742: %\mathbf{a} = \frac{1}{D}\left[\begin{array}{c}

743: %  \frac12\cos2\psi(1+\cos^2\iota) \\

744: %  \sin2\psi\cos\iota \\

745: %  -\frac12\sin2\psi(1+\cos^2\iota) \\

746: %  \cos2\psi\cos\iota

747: %  \end{array}

748: %\right]

749: %\end{equation}

750: %The physical parameters $\iota$, $\psi$, and $D$ are not Gaussian distributed.

751: %Assuming the binaries are isotropically distributed through space ($p(D)\propto D^2$),

752: %one finds that the individual $a(i)$ can be reasonably approximated with

753: %Gaussians.  However, scatter plots of the pairs $(a(1),a(4))$ and $(a(2),a(3))$

754: %show a hyperbolic-like distribution which will not be well approximated by a multivariate

755: %Gaussian.  It needs more study to see if we can make a useful example out of this case.

756:

A more general case arises when we have several numerically derived
predictions $\mathbf{w}_i$ for a waveform.  The
resulting search looks for a linear combination of these
waveforms,

761: \begin{eqnarray}

762: \mathbf{W}&=&\left[

763: \begin{array}{cccccc}

764: \mathbf{w}_1 & \mathbf{w}_2 & \cdots & \mathbf{0} & \mathbf{0} & \cdots \\

765: \mathbf{0} & \mathbf{0} & \cdots & \mathbf{w}_1 & \mathbf{w}_2 & \cdots

766: \end{array}

767: \right] \, .

768: \end{eqnarray}

769: %% This model can even encompass interpolations \emph{between} waveforms

770: %% if the set includes $\textrm{diag}(\mathbf{b}_j)\mathbf{w}_i$ for $\mathbf{b}_j$

771: %% some interpolation basis (such as the polynomial basis $b_{j,k} = (k/L)^j$).

772: %% This must be done with care, however, as interpolations we would consider

773: %% unreasonable (such as $h_{+,j} = a_ij^2L^{-2}w_{i,j}$) will also be considered.

774: %% The signal model we are considering is a multivariate normal distribution

775: %% encompassing the surface that properly normalized interpolations lie on, and

776: %% also encompassing some improperly normalized interpolations.  Not all signal

777: %% models can be represented as multivariate normal distributions; properly

778: %% normalized interpolations are one example.

779:

780: %Multivariate normal distributions have another useful property.  The

781: %covariance matrix can be computed for any arbitrary signal model, even

782: %though it may not well describe that signal model.  The multivariate

783: %normal distribution corresponding to that covariance matrix is the

784: %\emph{least informative} distribution in that it makes the weakest

785: %assumption about the signal of any model with that covariance.

786: %Thus, for every signal model, there is a corresponding multivariate

787: %normal distribution that will not overstate what we know. It is therefore

788: %a conservative choice to replace an arbitrary signal model with the corresponding

789: %multivariate normal distribution.  This is precisely the

790: %approach we follow in the Monte Carlo demonstration in

791: %\S\ref{sec:simulations}, where we use a multivariate normal

792: %distribution model to detect binary black-hole merger waveforms.

793:

794: \subsection{Comparison with previously proposed methods}

795: \label{sec:comparison}

796:

797: In this section we will expand on the arguments sketched in a previous

798: paper \cite{SeSuTiWo:08}.

799:

800: %%%%

801: %

802: % We introduce this section with the Bayesian search for known theta, phi

803: % and then compare that against the all-sky frequentist search, inviting

804: % trouble.  Better to note the differences between marginalization and

805: % maximization and then press on comparing the statistics 'within' the

806: % different proceedures

807: %

808: %%%%

809:

810: Several previously proposed hypothesis tests, such as the

811: G\"{u}rsel-Tinto (i.e. standard likelihood), the constraint likelihoods,

812: and the Tikhonov-regularized likelihood, can be written in the form

813: \begin{eqnarray}\label{eq:prev}

814:         \max_{\rho,\tau,\theta,\phi}\mathbf{x}^T\mathbf{J}(\rho,\tau,\theta,\phi)\mathbf{x}&>&\lambda \, ,\label{eq:fht}

815: \end{eqnarray}

816: where $\mathbf{J}$ is an $MN\times MN$ matrix and $\lambda$ is a

817: \emph{threshold}.  These tests proceed in two steps.  First, parameters are

818: \emph{estimated} by maximizing the likelihood function with respect to the parameters.

Second, the value of the likelihood function at its maximum is compared to a threshold $\lambda$, chosen so that under the noise hypothesis it is exceeded only at some acceptable \emph{false alarm rate}.

820:

821: The corresponding Bayesian expression, from (\ref{eq:fastbayesfactor}),

822: integrates over source sky direction, arrival time and any other parameters

and determines whether the Bayes factor is large enough to overcome the prior plausibility ratio:

824: \begin{eqnarray}

825: \fl \int_{V_{\rho, \tau, \theta, \phi}} \!\!\!\!\!\!

826:         p(\mathbf{\rho},\tau,\theta,\phi|H_1)

827:         \sqrt{|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|}

828:         \exp(\frac{1}{2}\mathbf{x}^T\mathbf{K}\mathbf{x})

829:         \, \mathrm{d}\mathbf{\rho} \, \mathrm{d}\tau

830:         \, \mathrm{d}\theta \, \mathrm{d}\phi

831:         &>&

832: \frac{

833: p(H_0)

834: }{

835: p(H_1)

836: }\,. \label{eq:bht}

837: \end{eqnarray}

838:

839: There are some obvious similarities between (\ref{eq:fht}) and (\ref{eq:bht}),

840: in particular the quadratic forms central to each.  However, direct mathematical equivalence cannot be established in

841: general because of the difference between maximization and marginalization.

842:

We can establish equivalence for the related problem of parameter estimation, where we have the maximum likelihood parameter estimate

844: \begin{eqnarray}

845: \{\rho,\tau,\theta,\phi\}&=&\arg\max(\mathbf{x}^T\mathbf{J}\mathbf{x})

846: \end{eqnarray}

847: and the Bayesian most plausible parameters, one of several ways the posterior plausibility distribution for the parameters can be turned into a point estimate

848: \begin{eqnarray}

849: \fl \{\rho,\tau,\theta,\phi\}&=&\arg\max ( p(\mathbf{\rho},\tau,\theta,\phi|H_1)

850:         \sqrt{|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|}

851:         \exp(\frac{1}{2}\mathbf{x}^T\mathbf{K}\mathbf{x}) )\\

852:         &=&\arg\max(\mathbf{x}^T\mathbf{K}\mathbf{x} + 2\ln p(\mathbf{\rho},\tau,\theta,\phi|H_1) + \ln |\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|) \, .

853: \end{eqnarray}

In the cases where we can find a Bayesian signal model that produces $\mathbf{K}=\mathbf{J}$, the two estimates coincide only if we also use the prior

855: \begin{eqnarray}

856: p(\mathbf{\rho},\tau,\theta,\phi|H_1)&\propto&|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|^{-\frac{1}{2}}.

857: \end{eqnarray}

858: This prior states that gravitational wave bursts are \emph{intrinsically} more likely to occur at the sky positions

859: that the network is more sensitive to.  We interpret this as an implicit bias present in any statistic of the form of (\ref{eq:fht})\footnote{It is important

860: to note that this particular objection applies only to all-sky searches;

861: it is a consequence of the maximization over $(\theta,\phi)$.  These statistics

862: are also used in directed searches (for example, in the direction of a gamma-ray burst) where

863: $(\theta, \phi)$ is known and fixed, and the problem does not arise (the missing normalization

864: term is one of several absorbed by tuning the threshold).}.
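
The prior quoted above follows from comparing the two objective functions.  As a sketch: with $\mathbf{K}=\mathbf{J}$ the quadratic forms cancel, and the Bayesian objective differs from the maximum likelihood objective only by the data-independent term $2\ln p(\mathbf{\rho},\tau,\theta,\phi|H_1) + \ln |\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|$.  The two maximizations are guaranteed to select the same parameters for every $\mathbf{x}$ when this term equals some constant $c$ (a symbol introduced here only for illustration) throughout the parameter space,
\begin{eqnarray}
2\ln p(\mathbf{\rho},\tau,\theta,\phi|H_1) + \ln |\mathbf{I}-\mathbf{\Sigma}\mathbf{K}| &=& c \, ,
\end{eqnarray}
that is, when
\begin{eqnarray}
p(\mathbf{\rho},\tau,\theta,\phi|H_1) &=& e^{c/2}\,|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|^{-\frac{1}{2}} \, ,
\end{eqnarray}
which is the proportionality stated above.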

865:

866: %As $\mathbf{F}$ and therefore $\mathbf{K}$ is a function of direction

867: %$(\theta,\phi)$, we conclude that the only

868: %way for a Bayesian analysis to produce the same parameter estimates as

869: %a statistic of the form in (\ref{eq:fht}) is to propose that gravitational

870: %wave sources are anisotropically distributed across the sky as some function

871: %of the network's sensitivity.  This is an unphysical proposition; insofar as

872: %we find it incredible, we should expect a Bayesian analysis with a uniform prior

873: %to perform better on the real signal population

874: %

875: %We interpret this result as indicating

876: %that any statistics of the form in (\ref{eq:fht}) contains at least this implicit unphysical assumption.

877: %In the next subsections, we will individually explore the several specific statistics.

878:

879: In order to compare previously proposed statistics to the Bayesian

880: method, we place some restrictions on the configurations

881: considered.  We will assume co-located (but differently oriented)

882: detectors to eliminate the need to time-shift data, and we will use

883: stationary signals and observation times that coincide with the time

884: the signal is present.  These restrictions eliminate the differences

885: in the way previously proposed statistics and the Bayesian method

886: handle arrival time and signal duration.  For simplicity, we will

887: further assume that the detectors are affected by white Gaussian noise.

The conclusions drawn will apply equally to versions of
these statistics for colored noise or for bases other than the
time domain (such as the frequency or wavelet domains).

891:

892: \subsubsection{Tikhonov regularized statistic}

893:

894: The Tikhonov regularized statistic proposed in \cite{Ra:06} for white

895: noise interferometers is

896: \begin{eqnarray}

897: \mathbf{x}^T\mathbf{F}(\mathbf{F}^T\mathbf{F}

898: +\alpha^2\mathbf{I})^{-1}\mathbf{F}^T\mathbf{x}\, .

899: \end{eqnarray}

900: The Bayesian kernel $\mathbf{K}$ reduces to this for

901: \begin{eqnarray}

902: \mathbf{\Sigma}&=&\mathbf{I}\\

903: \mathbf{W}&=&\mathbf{I}\\

904: \mathbf{A}&=&\alpha^{-2}\mathbf{I} \, .

905: \end{eqnarray}

906: This is a signal of

907: characteristic amplitude $\sigma = \alpha^{-1}$.  The Tikhonov

908: regularizer $\alpha$ therefore places a delta function prior on the characteristic amplitude of the signal $p(\sigma|H_1)=\delta(\sigma-\alpha^{-1})$.
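
For this signal model the normalization factor $\sqrt{|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|}$ in (\ref{eq:fastbayesfactor}) can be evaluated in closed form.  The following is a sketch under the white-noise assumptions above; it uses Sylvester's determinant identity, $|\mathbf{I}-\mathbf{X}\mathbf{Y}|=|\mathbf{I}-\mathbf{Y}\mathbf{X}|$, where $\mathbf{X}$ and $\mathbf{Y}$ are placeholder matrices of compatible dimensions and each identity matrix takes the appropriate size.
\begin{eqnarray}
|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|
&=& |\mathbf{I}-\mathbf{F}(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}\mathbf{F}^T| \nonumber\\
&=& |\mathbf{I}-\mathbf{F}^T\mathbf{F}(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}| \nonumber\\
&=& |\alpha^2(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}| \nonumber\\
&=& |\mathbf{I}+\sigma^2\mathbf{F}^T\mathbf{F}|^{-1} \, .
\end{eqnarray}
The factor therefore decreases as the network response $\mathbf{F}^T\mathbf{F}$ grows.  A prior proportional to $|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|^{-\frac{1}{2}}$ is then proportional to $|\mathbf{I}+\sigma^2\mathbf{F}^T\mathbf{F}|^{\frac{1}{2}}$, which is larger for sky directions where the network is more sensitive.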

909: %,

910: %corresponding to a potentially quite restrictive $\chi^2$ distribution with $2L$ degrees of freedom for the signal energy.

911: %This physical interpretation, that the regularizer dictates the size of the signal expected, was not made in \cite{Ra:06}.

912: %The prior plausibility of the signal hypothesis varies with source sky direction

913: %\begin{eqnarray}

914: %\frac{p(H_1|\theta,\phi)}{p(H_0)}

915: %&=&

916: %\frac{\exp(-\frac{1}{2}\lambda)}{\sqrt{|\mathbf{I}-\mathbf{F}(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}\mathbf{F}^T|}} \nonumber \\

917: %&=&\frac{\exp(-\frac{1}{2}\lambda)}{\sqrt{|\mathbf{I}-\mathbf{F}^T\mathbf{F}(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}|}}\mathrm{\ (via\ Sylvester's\ theorem)}\nonumber\\

918: %&=&\frac{\exp(-\frac{1}{2}\lambda)}{\sqrt{|\mathbf{I}-(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I}-\alpha^2\mathbf{I})(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}|}}\nonumber\\

919: %&=&\frac{\exp(-\frac{1}{2}\lambda)}{\sqrt{|\alpha^2(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}|}}\nonumber\\

920: %&=&\frac{\exp(-\frac{1}{2}\lambda)}{\sqrt{|(\alpha^{-2}\mathbf{F}^T\mathbf{F}+\mathbf{I})^{-1}|}}\nonumber\\

921: %&=&\sqrt{e^{-\lambda}|\mathbf{I}+\sigma^{2}\mathbf{F}^T\mathbf{F}|} \, ,

922: %\end{eqnarray}

923: %implying that gravitational wave events to come from some

924: %source sky directions more frequently and from others less frequently;

925: %in particular, this is in proportion to the network's

926: %sensitivity to that source sky direction.

927:

928: The Tikhonov statistic behaves like a Bayesian statistic that

929: postulates all bursts have energies in a narrow range. % and are anisotropically distributed across the sky.

930: %  Since these

931: %priors do not reflect our knowledge of the universe, we should expect

932: %a better performance from an analysis accounting for our prior

933: %knowledge.

934:

935: \subsubsection{G\"{u}rsel-Tinto statistic}

936:

937: The G\"{u}rsel-Tinto or standard likelihood statistic

938: \cite{GuTi:89,FlHu:98b,AnBrCrFl:01} is

939: \begin{eqnarray}

940: \mathbf{x}^T\mathbf{F}(\mathbf{F}^T\mathbf{F})^{-1}\mathbf{F}^T\mathbf{x}\,.

941: \end{eqnarray}

For large $\sigma$ (that is, small $\alpha$), the kernel of the Tikhonov statistic tends to

943: \begin{eqnarray}

944: \mathbf{K}

945:   &\approx&  \mathbf{F}(\mathbf{F}^T\mathbf{F})^{-1}\mathbf{F}^T \, .

946:   % \\

947: %\frac{p(H_1|\theta,\phi)}{p(H_0)}

948: %  &\approx& \sigma^{2M}\sqrt{e^{-\lambda}|\mathbf{F}^T\mathbf{F}|} \, .

949: \end{eqnarray}

950: This implies that the G\"{u}rsel-Tinto statistic is the limit of a series of Bayesian statistics for increasing signal amplitudes.
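
The limit can be made explicit by expanding the Tikhonov kernel in powers of $\sigma^{-2}=\alpha^2$.  As a sketch, assuming $\mathbf{F}^T\mathbf{F}$ is invertible,
\begin{eqnarray}
\mathbf{K} &=& \mathbf{F}(\mathbf{F}^T\mathbf{F}+\sigma^{-2}\mathbf{I})^{-1}\mathbf{F}^T \nonumber\\
&=& \mathbf{F}(\mathbf{F}^T\mathbf{F})^{-1}\mathbf{F}^T - \sigma^{-2}\,\mathbf{F}(\mathbf{F}^T\mathbf{F})^{-2}\mathbf{F}^T + O(\sigma^{-4}) \, .
\end{eqnarray}
The correction is negligible only when $\sigma^2$ times the smallest eigenvalue of $\mathbf{F}^T\mathbf{F}$ is much larger than unity, so the approximation degrades for sky directions where the network is nearly insensitive to one polarization.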

951:

952: %This implies that there is no Bayesian test equivalent to the

953: %G\"{u}rsel-Tinto statistic, but that the G\"{u}rsel-Tinto statistic is

954: %rather the limit of a series of Bayesian tests for gravitational waves of increasingly

955: %large energies.

956: %increasingly frequent, and increasingly directionally biased

957: %populations of gravitational wave signals.

958:

959: %Alternatively, we could say that the G\"{u}rsel-Tinto statistic

960: %follows from our Bayesian formulation if we adopt an \emph{improper}

961: %(unnormalizable) strain prior $p(\mathbf{h}|H_1)=1$,

962: %which assigns equal plausibility to every possible waveform. This is what is

963: %meant by G\"{u}rsel-Tinto's independence of waveform.  In practice we

964: %expect that smaller signals occur more frequently than larger

965: %signals, and this has real consequences in the analysis.

966: %

967: %Consider a common failure mode of G\"{u}rsel-Tinto: misidentifying the

968: %source sky direction of a gravitational wave.  A moderately sized signal will

969: %come from a source sky direction of typical sensitivity and produce a moderate

970: %response.  G\"{u}rsel-Tinto will correctly declare the true source sky direction

971: %of the injection to be plausible.  However, there are directions on

972: %the sky where the global network becomes insensitive to one

973: %polarization, and near those source sky directions $\mathbf{F}^T\mathbf{F}$ is a

974: %near-singular matrix whose inverse varies rapidly, causing the

975: %G\"{u}rsel-Tinto statistic itself to vary over a wide range.

976: %Often one of these near-pathological source sky directions will be deemed more

977: %plausible than the true source sky direction.  These pathological source sky directions

978: %always correspond to very low sensitivity to at least one

979: %polarization, so to explain the moderately-sized response at least one

980: %polarization of the postulated signal has to be very large.  The

981: %improper G\"{u}rsel-Tinto prior says we believe very large

982: %gravitational waves to be just as plausible as moderately sized ones,

983: %so the statistic has no grounds to discount the pathological

984: %source sky direction, and returns the wrong source sky direction and an obviously wrong

985: %unphysically large reconstructed waveform.  By contrast, a Bayesian

986: %method's prior can tell it that very large signals are very unlikely,

987: %so it does not make the same error.

988: %

989: %This shortcoming also degrades the sensitivity of the G\"{u}rsel-Tinto

990: %statistic.  Noisy observations can be consistent with a very large

991: %gravitational wave from a near-pathological source sky direction and these false

992: %alarms force us to use a high threshold $\lambda$ that limits the

993: %efficiency of the method for more physically reasonable signals.

994: %Again, the Bayesian method does not suffer from this problem because

995: %it knows large signals are rare and it will not postulate them on the

996: %basis of weak evidence.

997: %

998: %It is important to note that G\"{u}rsel-Tinto can detect

999: %realistically-sized signals.

1000: %%; they form only an infinitesimal fraction

1001: %%of the postulated signal population, but this is canceled by the

1002: %%infinite prior plausibility that a signal is present, to give them a

1003: %%net finite plausibility.

1004: %Like the Tikhonov statistic,

1005: %G\"{u}rsel-Tinto works as a detection statistic; its problem is only

1006: %one of efficiency.

1007:

1008: \subsubsection{Soft constraint likelihood}

1009:

1010: The soft constraint statistic \cite{KlMoRaMi:05,KlMoRaMi:06} for white

1011: noise interferometers is

1012: \begin{eqnarray}\label{eqn:SC}

1013: k^2(\theta,\phi)\,\mathbf{x}^T\mathbf{FF}^T\mathbf{x} \, ,

1014: \end{eqnarray}

1015: for some function $k(\theta,\phi)$.  Specifically, (\ref{eqn:SC})

1016: gives the soft constraint likelihood for the choice

1017: $k^2=(\mathbf{F}^{+T}\mathbf{F}^+)^{-1}$, where the antenna response is

1018: computed in the dominant polarization frame \cite{KlMoRaMi:05}.

1019:

1020: Consider the signal model defined by

1021: \begin{eqnarray}

1022: \mathbf{\Sigma}&=&\mathbf{I}\\

1023: \mathbf{W}&=&\mathbf{I}\\

1024: \mathbf{A}&=&\sigma^2k^2(\theta,\phi)\mathbf{I} \, .

1025: \end{eqnarray}

1026: This is a population of signals whose characteristic amplitude

1027: $\sigma k(\theta,\phi)$ varies as some known function of source sky direction,

1028: slightly generalizing the situation of the Tikhonov statistic.  For small $\sigma$,

1029: \begin{eqnarray}

1030: \mathbf{K}&\approx&\sigma^2k^2(\theta,\phi)\mathbf{F}\mathbf{F}^T \, ,

1031: \end{eqnarray}

1032: so we can see that the soft constraint is the limit of a series of Bayesian statistics for decreasing signal amplitudes.
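
The small-$\sigma$ limit follows from the kernel of the Tikhonov signal model with $\alpha^{-2}$ replaced by $\sigma^2k^2(\theta,\phi)$.  As a sketch, writing $k$ for $k(\theta,\phi)$,
\begin{eqnarray}
\mathbf{K} &=& \mathbf{F}(\mathbf{F}^T\mathbf{F}+\sigma^{-2}k^{-2}\mathbf{I})^{-1}\mathbf{F}^T \nonumber\\
&=& \sigma^2k^2\,\mathbf{F}(\mathbf{I}+\sigma^2k^2\mathbf{F}^T\mathbf{F})^{-1}\mathbf{F}^T \nonumber\\
&=& \sigma^2k^2\,\mathbf{F}\mathbf{F}^T - \sigma^4k^4\,\mathbf{F}\mathbf{F}^T\mathbf{F}\mathbf{F}^T + O(\sigma^6) \, ,
\end{eqnarray}
so that $\mathbf{x}^T\mathbf{K}\mathbf{x}\approx\sigma^2k^2\,\mathbf{x}^T\mathbf{FF}^T\mathbf{x}$, which is the soft constraint statistic (\ref{eqn:SC}) multiplied by $\sigma^2$.  The expansion assumes that $\sigma^2k^2$ times the largest eigenvalue of $\mathbf{F}^T\mathbf{F}$ is much smaller than unity.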

1033:

1034: %

1035: %(the opposite extreme to the G\"{u}rsel-Tinto statistic),

1036: %with the added twist that the infinitesimal expected amplitude of the signal varies with direction.

1037:

1038: %and the Bayesian test becomes

1039: %

1040: %\begin{eqnarray}

1041: %\lefteqn{k^2(\theta,\phi) \, \mathbf{x}^T\mathbf{FF}^T\mathbf{x}} \nonumber \\

1042: %&>&

1043: %-\frac{2}{\sigma^2}\ln\left(\sqrt{|\mathbf{I}-\sigma^2k^2\mathbf{F}\mathbf{F}^T|}\frac{

1044: %p(H_1|\theta,\phi)

1045: %}{

1046: %p(H_0)

1047: %}\right) \, . \quad

1048: %\end{eqnarray}

1049: %This implies that to mimic the soft constraint we must choose the prior

1050: %\begin{eqnarray}

1051: %\frac{p(H_1|\theta,\phi)}{p(H_0)}

1052: %&\approx&1+\frac{\sigma^2}{2}(k^2(\theta,\phi)\tr(\mathbf{FF}^T)-\lambda)+{} \nonumber \\

1053: %& & {} + O(\sigma^4) \,

1054: %\end{eqnarray}

1055: %where we have used the expansion of the determinant of a near-identity matrix

1056: %\begin{eqnarray}

1057: %|\mathbf{I}+\epsilon\mathbf{X}| = 1 + \epsilon\tr(\mathbf{X}) + O(\epsilon^2).

1058: %\end{eqnarray}

1059: %The prior is approximately unity, varying only infinitesimally with

1060: %source sky direction and the threshold $\lambda$.  Even though the prior's

1061: %dependence on source sky direction is weak, the statistic's dependence on the

1062: %data is equally weak. Within these assumptions equation (\ref{eqn:note}) becomes

1063: %\begin{eqnarray}

1064: %\frac{p(\mathbf{x}|\theta,\phi,H_1)}{p(\mathbf{x}|H_0)}

1065: %&\approx& 1+\frac{1}{2}\sigma^2k^2(\theta,\phi)(\mathbf{x}^T\mathbf{FF}^T\mathbf{x}-{}

1066: %\nonumber \\

1067: %& & {}-\tr(\mathbf{FF}^T))+O(\sigma)^4 \, .

1068: %\end{eqnarray}

1069: %Since the expected signals are weak, and the evidence for them will

1070: %also be weak, any information in the prior still strongly affects the result.

1071:

1072: %\begin{widetext}

1073: %%

1074: %\begin{eqnarray}

1075: %k^2(\theta,\phi) \, \mathbf{x}^T\mathbf{FF}^T\mathbf{x}

1076: %&>&

1077: %-\frac{2}{\sigma^2}\ln\left(\sqrt{|\mathbf{I}-\sigma^2k^2(\theta,\phi)\mathbf{F}\mathbf{F}^T|}\frac{

1078: %p(H_1|\theta,\phi)

1079: %}{

1080: %p(H_0)

1081: %}\right) \, .

1082: %\end{eqnarray}

1083: %This implies that to mimic the soft constraint we must choose the prior

1084: %\begin{eqnarray}

1085: %\frac{p(H_1|\theta,\phi)}{p(H_0)}

1086: %&\approx&1+\frac{\sigma^2}{2}(k^2(\theta,\phi)\tr(\mathbf{FF}^T)-\lambda)+O(\sigma^4) \, .

1087: %\end{eqnarray}

1088: %The prior is approximately unity, varying only infinitesimally with

1089: %direction and the threshold $\lambda$.  Even though the prior's

1090: %dependence on direction is weak, the statistic's dependence on the

1091: %data is equally weak. Within these assumptions equation (\ref{eqn:note}) becomes

1092: %\begin{eqnarray}

1093: %\frac{p(\mathbf{x}|\theta,\phi,H_1)}{p(\mathbf{x}|H_0)}

1094: %&\approx& 1+\frac{1}{2}\sigma^2k^2(\theta,\phi)(\mathbf{x}^T\mathbf{FF}^T\mathbf{x}-

1095: %\tr(\mathbf{FF}^T))+O(\sigma)^4 \, .

1096: %\end{eqnarray}

1097: %Since the expected signals are weak, and the evidence for them will

1098: %also be weak, the weak prior still strongly affects the result.

1099: %%

1100: %\end{widetext}

1101:

1102:

1103: %The soft constraint is therefore the limit of a series of Bayesian

1104: %tests for gravitational wave bursts that are common and whose

1105: %amplitudes and rates are a function of direction on the sky. If we

1106: %choose $k(\theta,\phi)=1$ we can eliminate the directional amplitude

1107: %bias; if we choose $k^{-2}(\theta,\phi)=\tr(\mathbf{FF}^T)$ we can

1108: %eliminate the directional dependence of event rate.  Other choices, such as that made

1109: %in \cite{KlMoRaMi:05}, remove neither bias.

1110:

1111: \subsubsection{Hard constraint likelihood}

1112:

1113: Let us restrict the soft-constraint signal model to a population of

1114: \emph{linearly polarized} signals with a known polarization angle

1115: $\psi(\theta,\phi)$ for each source sky direction

1116: \begin{eqnarray}

1117: \mathbf{\Sigma}&=&\mathbf{I}\\

1118: \mathbf{W}&=&\left[

1119: \begin{array}{c}

1120: \cos 2\psi(\theta,\phi)\mathbf{I}\\

1121: \sin 2\psi(\theta,\phi)\mathbf{I}

1122: \end{array}

1123: \right]\\

1124: \mathbf{A}&=&\sigma^2 k^2(\theta,\phi)\mathbf{I} \, .

1125: \end{eqnarray}

1126: Then %with the prior

1127: %\begin{eqnarray}

1128: %\frac{p(H_1|\theta,\phi)}{p(H_0)}

1129: %& = &

1130: %         e^{-\frac{\lambda\sigma^2}{2}}\sqrt{|\mathbf{I}+\sigma^2k^2(\mathbf{FW})^T\mathbf{FW}|}

1131: %         \qquad

1132: %\end{eqnarray}

for $\sigma\rightarrow 0$ the Bayesian statistic reduces, up to an overall factor of $\sigma^2$, to

1134: % the left hand side of

1135: \begin{eqnarray}

1136: k^2(\theta,\phi) \, \mathbf{x}^T\mathbf{FW}(\mathbf{FW})^T\mathbf{x}

1137: %  >  \lambda

1138: \, .

1139: \end{eqnarray}

For the particular choice of $\psi(\theta,\phi)$ being the rotation
angle between the detector polarization basis and the dominant
polarization frame, and $k^2=((\mathbf{FW})^T\mathbf{FW})^{-1}$ (which is
equal to $(\mathbf{F}^{+T}\mathbf{F}^+)^{-1}$ in the dominant
polarization frame \cite{KlMoRaMi:05}), this yields the hard
constraint statistic of \cite{KlMoRaMi:05}.

1146:

Apart from the explicit assumption that all signals are
linearly polarized with a known polarization angle, the hard constraint
has the same properties as the soft constraint.

1150: %  The normalization

1151: %chosen in \cite{KlMoRaMi:05} eliminates the directional rate bias but

1152: %introduces a directional amplitude bias.

1153:

1154: \subsection{Interpretation}

1155:

1156: We have shown that several previously proposed statistics are special cases

1157: or limiting cases of Bayesian statistics for particular choices of prior.

1158: %These comparisons offer new interpretations of the methods:

1159: %the \emph{ad hoc} Tikhonov regularizer is physically interpreted as

1160: %the inverse of an expected signal amplitude; the constraint methods presume

1161: %signals much smaller than the noise level while the G\:{u}rsel-Tinto method presumes

1162: %signals much larger than the noise level; and all the methods considered propose that

1163: %gravitational wave bursts occur intrinsically more frequently in directions that the

1164: %network is more sensitive to.

1165: %

1166: The `priors' implicit in these non-Bayesian methods are not representative of our

1167: expectations about the source population, so we can reasonably expect

1168: improved performance from a detection statistic with priors

1169: better reflecting our state of knowledge.  The Bayesian analysis allows us to begin with our physical understanding

1170: of the problem, described in terms of prior expectations about the

1171: gravitational wave signal population, and derive the detection

statistic for these conditions.  The effects of priors are lessened when a strong gravitational wave
signal is present: all these statistics, Bayesian and non-Bayesian, are effective at
detecting strong gravitational waves, and significant differences occur only
for marginal signals.  In the next section, we will quantitatively compare the performance
of the methods mentioned above with that of the Bayesian statistic we propose.

1177:

1178:

1179:

1180: %The Tikhonov, G\"{u}rsel-Tinto, and constraint methods are limits of

1181: %Bayesian statistics distinguished only by different choices of prior.

1182: %Yet, as Bayesian priors are \emph{subjective}, on what basis can we

1183: %critique them?

1184: %

1185: %Though subjective, priors make definite statements about our

1186: %\emph{expectations} of the world; one popular paradigm relates priors

1187: %to bets we would be willing to make about the outcome of an

1188: %experiment.  Few scientists would be willing to bet that the first

1189: %detected gravitational wave burst would have strain far above (G\"{u}rsel-Tinto) or below (constraint methods) the

1190: %instrumental noise, or that events will conveniently occur where our network is most sensitive.

1191: %Insofar as we find these prior plausibility

1192: %distributions incredible, we should expect a Bayesian method with a

1193: %more credible prior to be a more \emph{efficient} detection statistic.

1194: %Quite literally, the Bayesian method does not have to waste precious

1195: %(signal) energy overcoming strong prejudices.

1196: %The same logic applies

1197: %to attempts to reconstruct the parameters of a detected signal, such

1198: %as the sky position of the source; Fig.~\ref{fig:skies} shows an

1199: %example.

1200:

%One may also consider what the Bayesian analysis says about how
%the non-Bayesian statistics can be improved.

1203: %For example, in our comparisons

1204: %we have assumed that the threshold $\lambda$ used in the frequentist

1205: %analysis is the same for all sky positions.  The Bayesian analysis shows

1206: %that this implicitly imposes physically unreasonable priors on the

1207: %gravitational-wave signal.

1208: %In a non-Bayesian all-sky analysis we are perfectly

1209: %free to add a new term to the likelihood that varies with sky position.

1210: %The Bayesian analysis shows

1211: %how this term can be varied across the sky to correspond to physical

1212: %priors;

1213: %; for example, by selecting $\lambda(\theta,\phi)$ in equation

1214: %(\ref{eqn:FreqPrior}) to make the right-hand side constant.  Analogous

1216: %reasoning from the non-Bayesian point of view would be to examine how

1217: %$\lambda(\theta,\phi)$

1218: %the new term should be chosen to balance (in some sense) the false

1219: %alarm rate versus detection probability across the sky.

1220: %The Bayesian

1221: %formulation gives one concrete suggestion.

1222:

1223: %In summary, we have demonstrated that previously proposed methods implicitly

1224: %make unreasonable choices of prior, and consequently must be suboptimal for

1225: %reasonable choices of prior.  We have not yet quantified how much

1226: %worse their performance is.  One way to answer this question is to

1227: %perform a Monte-Carlo simulation, testing the ability of each

1228: %statistic to detect thousands of simulated gravitational wave signals.

1229:

1230: %\begin{figure}[htb]

1231: %\resizebox{\columnwidth}{!}{\includegraphics{skies}}

1232: %\caption{\label{fig:skies}

1233: %  Four statistics as a function of $(\theta,\phi)$ for identical white

1234: %  noise interferometers with the locations and orientations of LHO,

1235: %  LLO and Virgo sampled at 1024 Hz and a 1/16s white noise signal with

1236: %  amplitude SNR of 5.  White is most plausible; black is least

1237: %  plausible; a circle indicates the true direction; a square indicates

1238: %  the most plausible direction.  From top to bottom: $\ln$ Bayesian

1239: %  odds ratio for $\sigma=5$; Tikhonov for $\alpha=0.2$;

1240: %  G\"{u}rsel-Tinto; soft constraint with $k(\theta,\phi)=1$.}

1241: %\end{figure}

1242:

1243: