1: %% $Id: analysis.tex,v 1.64 2009/06/03 22:37:53 acsearle Exp $
2:
3:
4: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
5:
6: \section{Analysis}
7: \label{SECII}\label{sec:analysis}
8:
9:
10: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
11:
12: \subsection{Single-sample observation}
13: \label{sec:singleSample}
14:
15: We begin by investigating perhaps the simplest Bayesian coherent data
16: analysis: detecting a signal from a known sky position in a single
17: strain sample from each of $N$ gravitational wave observatories. This
18: example will show many of the basic features of the Bayesian analysis,
19: and highlight some of the differences between the Bayesian approach
20: and previous statistics. In the following section we will generalize
21: to a multi-sample search for a signal arriving at an unknown time from
22: an unknown sky position.
23:
24: Consider a single strain sample from each of $N$ detectors, each
25: measurement taken at the moment corresponding to the passage of a
26: postulated plane gravitational wave from some known location on
the sky, $(\theta, \phi)$. The measurements are then given by \cite{GuTi:89}
28: \begin{equation}
29: \mathbf{x}=\mathbf{F}\,\mathbf{h}+\mathbf{e} \, , \label{eqn:ssmodel}
30: \end{equation}
31: where $\mathbf{x}$ is the vector of measurements $[x_1,\ldots,x_N]^T$, the
32: matrix $\mathbf{F}=[[F_1^+,F_1^\times],\ldots,[F_N^+,F_N^\times]]$
33: contains the antenna responses of the observatories to the postulated
34: gravitational wave strain vector $\mathbf{h}=[h_+,h_\times]^T$, and
35: $\mathbf{e}$ is the noise in each sample. $\mathbf{F}$ is a known
36: function of the source sky direction $(\theta,\phi)$, and the decomposition into $+$ and $\times$ polarizations requires us to choose an arbitrary polarization basis angle $\psi$ for each source sky direction.
37:
38: We wish to distinguish between two hypotheses: $H_0$,
39: that the data contains only noise, and $H_1$, that the
40: data contains a gravitational wave signal. The Bayesian odds ratio \cite{jaynes, gregory}
41: allows us to compare the plausibility of the hypotheses:
42: \begin{equation}
43: \frac{p(H_1|\mathbf{x},I)}
44: {p(H_0|\mathbf{x},I)}=
45: \frac{p(H_1|I)}
46: {p(H_0|I)}
47: \frac{p(\mathbf{x}|H_1,I)}
48: {p(\mathbf{x}|H_0,I)}
49: \label{Bayes_Ratio} \, ,
50: \end{equation}
51: where $I$ is a set of unstated but shared assumptions (such as the
52: detector locations, orientations and noise power spectra). If the posterior plausibility ratio is greater than one,
53: $H_1$ is more plausible than $H_0$ and we
54: classify the observation as a detection. If the posterior
55: plausibility ratio is less than one, $H_1$ is less
56: plausible than $H_0$ and we classify the observation as a
57: non-detection.
58:
The $p(H|I)$ terms (``plausibility of $H$ assuming $I$'') are the
60: \emph{prior} plausibilities we assign to each hypothesis $H$ on the
61: basis of our knowledge $I$ prior to considering the measurement; for
62: example, our expectation that detectable gravitational waves are rare
63: requires that $p(H_1|I)\ll p(H_0|I)$.
64:
The $p(\mathbf{x}|H,I)$ terms (``plausibility of $\mathbf{x}$ assuming
$H$ and $I$'') are the probabilities assigned by a hypothesis to the
occurrence of a particular observation $\mathbf{x}$. These are
sometimes called likelihood functions; they quantify how probable
a given measurement is under each hypothesis.
70:
71: The $p(H|\textbf{x},I)$ terms are the \emph{posterior} plausibilities
72: we assign to the hypotheses in light of the observation.
73: %. The
74: %difference between the prior and posterior plausibility ratios caused
75: %by the observation is the ratio of the plausibilities those hypotheses
76: %assigned to that observation being made;
77: %The hypothesis that made the
78: %better prediction becomes more plausible.
79: The hypothesis that assigned more probability to the observation becomes more plausible.
80:
81: For notational simplicity we will drop the $I$ in our formulae; the unstated assumptions are implicit.
82:
83: If we make the idealized assumption that the noise in each detector is
84: independent and normally distributed \cite{jaynes, gregory} with zero mean and unit standard
85: deviation, we can then write the following expression for the
86: likelihood $p(\mathbf{x}|H_0)$
87: \begin{eqnarray}
88: p(\mathbf{x}|H_0)&=&\prod_{i=1}^N p(x_i|H_0)\nonumber\\
89: &=&\prod_{i=1}^N\frac{1}{\sqrt{2\pi}}\exp(-\frac{1}{2}x_i^2)\nonumber\\
90: &=&(2\pi)^{-\frac{N}{2}}\exp(-\frac{1}{2}\mathbf{x}^T\mathbf{x})\label{singleNoise} \, ,
91: \label{noise_only}
92: \end{eqnarray}
93: where $^T$ denotes matrix transposition. For real detectors, the
94: measurements can be \emph{whitened}, which modifies the effective beam pattern functions
95: $\mathbf{F}$.
96:
97: If we assume that there is a gravitational wave $\mathbf{h}$ present, then
98: after subtracting away the response $\mathbf{F}\,\mathbf{h}$ the data will
99: be distributed as noise and the likelihood
100: $p(\mathbf{x}|\mathbf{h},H_1)$ becomes
101: %
102: %\begin{widetext}
103: %
104: \begin{eqnarray}
105: p(\mathbf{x}|\mathbf{h},H_1)
106: &=&(2\pi)^{-\frac{N}{2}}\exp(-\frac{1}{2}(\mathbf{x}-\mathbf{F}\,\mathbf{h})^T
107: (\mathbf{x}-\mathbf{F}\,\mathbf{h})) \label{noiseSignal} \, .
108: \label{noise_signal}
109: \end{eqnarray}
110:
Unfortunately, we do not know the signal strain vector $\mathbf{h}$
{\em a priori}. To compute the likelihood $p(\mathbf{x}|H_1)$ of the
more general hypothesis $H_1$ we need to marginalize
away these {\it nuisance parameters}:
115: \begin{eqnarray}
116: p(\mathbf{x}|H_1)
117: &=&\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} p(\mathbf{h}|H_1)
118: p(\mathbf{x}|\mathbf{h},H_1) \, \mathrm{d}{h_+} \, \mathrm{d}{h_\times} \, .
119: \label{marginal}
120: \end{eqnarray}
The likelihood resulting from the marginalization integral is an
average of the likelihoods for particular signals $\mathbf{h}$,
weighted by the prior probability $p(\mathbf{h}|H_1)$ we assign
to those signals occurring. A convenient choice of prior is a normal
distribution for each polarization, with a standard deviation $\sigma$
indicative of the amplitude scale of gravitational waves we hope to
detect. Under these assumptions the prior is
128: \begin{eqnarray}\label{wave_distribution}
129: p(\mathbf{h}|H_1)
130: & = &
131: \frac{1}{2\pi\sigma^2}\exp(-\frac{1}{2\sigma^2}\mathbf{h}^T\mathbf{h}) \, .
132: \end{eqnarray}
133: This allows us to perform the marginalization integral analytically
134: \begin{eqnarray}
135: p(\mathbf{x}|H_1)
136: & = &
137: (2\pi)^{-\frac{N}{2}-1}\sigma^{-2} \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty}
138: \exp(-\frac{1}{2}((\mathbf{x}-\mathbf{F}\,\mathbf{h})^T
139: (\mathbf{x}-\mathbf{F}\,\mathbf{h})
140: \nonumber \\
141: & & \mbox{}
142: +\sigma^{-2}\mathbf{h}^T\mathbf{h})) \, \mathrm{d}{h_+} \, \mathrm{d}{h_\times}
143: \nonumber \\
144: & = &
145: (2\pi)^{-\frac{N}{2}}
146: |\mathbf{I-K_\mathrm{ss}}|^{\frac{1}{2}} \exp(-\frac{1}{2}\,\mathbf{x}^T
147: (\mathbf{I-K_\mathrm{ss}})\mathbf{x}) \, ,
148: \label{eq:simpleP}
149: \end{eqnarray}
150: %
151: %\end{widetext}
152: %
153: where
154: \begin{eqnarray}
155: \mathbf{K_\mathrm{ss}}
156: &\equiv&
157: \mathbf{F}
158: (\mathbf{F}^T\mathbf{F}+\sigma^{-2}\mathbf{I})^{-1}
159: \mathbf{F}^T\label{eq:simpleC}.
160: \end{eqnarray}
161: The result is a multivariate normal distribution with covariance
162: matrix $(\mathbf{I-K_\mathrm{ss}})^{-1}$, which quantifies the correlations among the
163: detectors due to the presence of a gravitational wave signal.
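
As a cross-check on (\ref{eq:simpleP}), the covariance can be written out explicitly. The following is a sketch that relies on the Woodbury matrix identity and Sylvester's determinant theorem, standard results that are not invoked in the derivation above:
\begin{eqnarray*}
(\mathbf{I-K_\mathrm{ss}})^{-1} &=& \mathbf{I}+\sigma^{2}\,\mathbf{F}\,\mathbf{F}^T \, , \\
|\mathbf{I-K_\mathrm{ss}}| &=& |\mathbf{I}+\sigma^{2}\,\mathbf{F}^T\mathbf{F}|^{-1} \, .
\end{eqnarray*}
The first line is the covariance expected for the sum $\mathbf{F}\,\mathbf{h}+\mathbf{e}$ of two independent zero-mean normal vectors with covariances $\sigma^{2}\mathbf{F}\,\mathbf{F}^T$ and $\mathbf{I}$. The second line reduces the $N\times N$ determinant to a $2\times 2$ one.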
164:
165: With both hypotheses defined, we can form the \emph{likelihood ratio}
166: \begin{eqnarray}
167: \Lambda
168: & = &
169: \frac{p(\mathbf{x}|H_1)}
170: {p(\mathbf{x}|H_0)}
171: \nonumber\\
172: %& = &
173: % |\mathbf{I-K_\mathrm{ss}}|^\frac12 \exp(
174: % \frac{1}{2}\,\mathbf{x}^T \mathbf{K_\mathrm{ss}} \mathbf{x})
175: % \nonumber\\
176: & = &
177: |\mathbf{I-K_\mathrm{ss}}|^\frac12 \exp ( \frac{1}{2}\,\mathbf{x}^T
178: \mathbf{F}(\mathbf{F}^T\mathbf{F}+\sigma^{-2}\mathbf{I})^{-1}
179: \mathbf{F}^T\mathbf{x}) \, . \,
180: \label{eqn:ssLambda}
181: \label{likelihood_final}
182: \end{eqnarray}
183: Multiplying the likelihood ratio by the prior plausibility ratio
184: $p(H_1)/p(H_0)$ completes the calculation of the Bayesian odds ratio
185: (\ref{Bayes_Ratio}).
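
To illustrate the decision rule, the detection condition can be restated as a threshold on the likelihood ratio. The prior odds of $10^{-6}$ used below are a purely hypothetical value chosen for illustration, not one adopted in this analysis:
\[
\frac{p(H_1|\mathbf{x})}{p(H_0|\mathbf{x})} > 1
\quad\Longleftrightarrow\quad
\Lambda > \frac{p(H_0)}{p(H_1)} \, ,
\]
so that prior odds $p(H_1)/p(H_0)=10^{-6}$ would require $\Lambda>10^{6}$ before the observation is classified as a detection.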
186:
187: %The part of the likelihood ratio in the exponential can be directly
188: %compared to existing non-Bayesian statistics. In particular, i
189: In the limit
190: $\sigma\rightarrow\infty$ we find that the odds ratio contains the
191: least-squares estimate of the strain
192: \begin{eqnarray}
193: \mathbf{\hat{h}}&=&(\mathbf{F}^T\mathbf{F})^{-1}\mathbf{F}^T\mathbf{x} \, .
194: \end{eqnarray}
195: The odds ratio may then be rewritten in terms of a matched filter for the
196: response to the estimated strain, $\mathbf{x}^T\mathbf{F}\,\mathbf{\hat{h}}$.
197: For finite values of $\sigma$, the odds ratio contains the \emph{Tikhonov regularized}
198: estimate of the strain \cite{Ra:06}
199: \begin{eqnarray}
200: \mathbf{\hat{h}} = (\mathbf{F}^T\mathbf{F}+\sigma^{-2}\mathbf{I})^{-1}\mathbf{F}^T\mathbf{x} \, ,
201: \end{eqnarray}
202: and can still be rewritten as a matched filter for this estimate.
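
Written out, substituting the regularized estimate into the exponent of (\ref{eqn:ssLambda}) gives the form below. This is a direct substitution, shown only to make the matched-filter structure visible:
\[
\Lambda = |\mathbf{I-K_\mathrm{ss}}|^{\frac{1}{2}}
\exp\left(\frac{1}{2}\,\mathbf{x}^T\mathbf{F}\,\mathbf{\hat{h}}\right) \, ,
\]
where $\mathbf{F}\,\mathbf{\hat{h}}$ is the predicted detector response to the estimated strain.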
203: %We discuss the relationship of the Bayesian to Tikhonov and other
204: %statistics in more detail in Section~\ref{sec:comparison}.
205:
It is also worth noting the presence in (\ref{eqn:ssLambda}) of the determinant factor $|\mathbf{I-K_\mathrm{ss}}|^{\frac{1}{2}}$.
It is independent of the data and depends only on the antenna pattern and the signal model. In particular, it tells us how strongly
to weight likelihoods computed for different possible sky positions
of the signal. This {\em Occam factor} penalizes sky positions
of high sensitivity relative to sky positions of lower sensitivity that
give a similar exponential part of the likelihood. The effect is typically small compared to the
exponential when the data contain good evidence for a signal,
but can be important for weak signals and for parameter estimation.
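
A minimal worked case may make the role of the determinant factor more concrete. Assume a single detector ($N=1$) with measurement $x_1$ and antenna responses $F^+$ and $F^\times$, and write $\|\mathbf{F}\|^2\equiv(F^+)^2+(F^\times)^2$ (a shorthand introduced here for illustration only). Applying the Sherman--Morrison formula to (\ref{eq:simpleC}) then gives
\begin{eqnarray*}
\mathbf{K_\mathrm{ss}} &=& \frac{\sigma^2\|\mathbf{F}\|^2}{1+\sigma^2\|\mathbf{F}\|^2} \, , \\
\Lambda &=& (1+\sigma^2\|\mathbf{F}\|^2)^{-\frac{1}{2}}
\exp\left(\frac{1}{2}\,\frac{\sigma^2\|\mathbf{F}\|^2}{1+\sigma^2\|\mathbf{F}\|^2}\,x_1^2\right) \, .
\end{eqnarray*}
The prefactor $(1+\sigma^2\|\mathbf{F}\|^2)^{-1/2}$ decreases as the sensitivity $\|\mathbf{F}\|^2$ grows, which is the penalty described above.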
214:
215:
216: %% %% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
217:
218: \subsection{General Bayesian model}
219:
220: We now generalize the analysis of the previous section to the case of
221: burst signals of extended duration and unknown source sky direction $(\theta, \phi)$ and arrival
222: time $\tau$ with respect to the centre of the Earth.
223:
Each of the $N$ gravitational wave detectors in a global network produces a
time-series of $M$ observations with sampling frequency
$f_\textrm{s}$, which we pack into a single vector
227: \begin{equation}
228: \fl
229: \mathbf{x}=[x_{1,1},x_{1,2},\ldots,x_{1,M},x_{2,1},x_{2,2},\ldots,x_{2,M},\ldots,x_{N,1},x_{N,2},\ldots,x_{N,M}]^T \ .
230: \end{equation}
231: %We want to classify the observation as a gravitational wave detection
232: %or not. Bayesian inference does not allow us to \emph{reject} a
233: %hypothesis in isolation, so we must propose (at least) two hypotheses
234: %and compute which is more plausible. We will consider a signal
235: %hypothesis $H_1$ and a noise hypothesis
236: %$H_0$. The ability of the observation to distinguish
237: %between these two hypotheses is contingent upon the observation being
238: %differently distributed for each hypothesis
239: %\begin{eqnarray}
240: %p(\mathbf{x}|H_1)&\neq&p(\mathbf{x}|H_0)
241: %\ .
242: %\end{eqnarray}
243: %To compute these plausibility distributions, we must explicitly form a
244: %model of the experiment. For $H_1$, we will use the
245: %model
246: Our signal model is a generalization of (\ref{eqn:ssmodel}),
247: \begin{eqnarray}
248: \mathbf{x}&=&\mathbf{F}(\tau,\theta,\phi)\cdot\mathbf{h}+\mathbf{e} \, ,\label{eq:linearmodel}
249: \end{eqnarray}
250: where
251: \begin{eqnarray}
252: \mathbf{h}&=&[h_{+,1},h_{+,2},\ldots,h_{+,L},h_{\times,1},\ldots,h_{\times,L}]^T
253: \end{eqnarray}
254: is a time-series of $2 L$ samples describing the band-limited strain
255: waveform (with the two polarizations packed into a single vector),
256: %$(\tau,\theta,\phi)$ are respectively the time of arrival and
257: %source sky direction of the gravitational wave,
258: $\mathbf{e}$ is a random variable representing the
instrumental noise, and $\mathbf{F}(\tau,\theta,\phi)$ is an $NM\times
2L$ response matrix describing the response of each observatory to an
261: incoming gravitational wave,
262: \begin{eqnarray}
263: \fl \mathbf{F}(\tau,\theta,\phi)&=&
264: \left[
265: \begin{array}{cc}
266: F^+_1(\theta,\phi)\mathbf{T}(\tau+\Delta\tau_1(\theta,\phi)) & F^\times_1(\theta,\phi)\mathbf{T}(\tau+\Delta\tau_1(\theta,\phi)) \\
267: F^+_2(\theta,\phi)\mathbf{T}(\tau+\Delta\tau_2(\theta,\phi)) & F^\times_2(\theta,\phi)\mathbf{T}(\tau+\Delta\tau_2(\theta,\phi)) \\
268: \vdots & \vdots \\
269: F^+_N(\theta,\phi)\mathbf{T}(\tau+\Delta\tau_N(\theta,\phi)) & F^\times_N(\theta,\phi)\mathbf{T}(\tau+\Delta\tau_N(\theta,\phi))
270: \end{array}
271: \right] \, .
272: \end{eqnarray}
Each $M\times L$ block of the response matrix is responsible for
scaling and time shifting one of the waveform polarizations for one
detector, so each block is the product of the directional
sensitivity of that detector to that polarization, $F^+_i(\theta,\phi)$
or $F^\times_i(\theta,\phi)$, and a time delay matrix $\mathbf{T}$ with elements $T_{j,k}(t)$
278: \footnote{
279: From the assumption that the signal is band-limited, it follows that the
280: time delay matrix may be written as $T_{j,k}(t)=\textrm{sinc}(\pi(j-k-f_\textrm{s}t))$; for $L = M$ and zero time delays, it is equal to the identity matrix; for $L = M$ and time delays corresponding
281: to integer numbers of time samples, it is a \emph{shift matrix}.
282: },
283: for the source sky direction
284: dependent arrival times $\tau+\Delta\tau_i(\theta,\phi)$ at each
285: detector.
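
As a small illustration of the shift-matrix case mentioned in the footnote, assume $L=M=3$ and a delay of exactly one sample, $f_\textrm{s}t=1$ (values chosen here only for illustration), with $\textrm{sinc}(x)=\sin(x)/x$. Then $T_{j,k}=\textrm{sinc}(\pi(j-k-1))$ vanishes unless $j=k+1$, and for a single polarization with samples $[h_1,h_2,h_3]^T$ we find
\[
\mathbf{T}=
\left[
\begin{array}{ccc}
0 & 0 & 0 \\
1 & 0 & 0 \\
0 & 1 & 0
\end{array}
\right] ,
\qquad
\mathbf{T}\,[h_1,h_2,h_3]^T=[0,h_1,h_2]^T \, .
\]
The waveform is delayed by one sample and its last sample is shifted out of the observation window.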
286:
287: %We can restate the equality in (\ref{eq:linearmodel}) as a
288: %Dirac delta-function plausibility distribution
289: %\begin{eqnarray}
290: %p(\mathbf{x}|\mathbf{e},\mathbf{h},\tau,\theta,\phi,H_1)
291: %= \delta(\mathbf{x}-\mathbf{F}(\tau,\theta,\phi)\cdot\mathbf{h}-\mathbf{e}) \ .
292: %\end{eqnarray}
293: %We can then use the marginalization theorem to compute the likelihood
294: %of the data given the hypothesis that a burst is present \cite{jaynes}
295: %\begin{eqnarray}
296: %p(\mathbf{x}|H_1)
297: % &=& \int_{V_{\mathbf{e},\mathbf{h},\tau,\theta,\phi}} \!\!\!\!\!\!\!\!\!\!\!\!
298: % p(\mathbf{x}|\mathbf{e},\mathbf{h},\tau,\theta,\phi,H_1) \,
299: % p(\mathbf{e},\mathbf{h},\tau,\theta,\phi|H_1) \,
300: % \mathrm{d}\mathbf{e}\ldots\mathrm{d}\phi \ ,
301: %\end{eqnarray}
302: %where $V_{\mathbf{e},\mathbf{h},\tau,\theta,\phi}$ is the space of all our parameter values.
303:
304: %Similarly we can state the noise model hypothesis as a Dirac
305: %delta-function plausibility distribution, implying the expression
306: %\begin{eqnarray}
307: %p(\mathbf{x}|\mathbf{e},H_0)&=&\delta(\mathbf{x}-\mathbf{e})\\
308: %p(\mathbf{x}|H_0)&=&
309: %\int_{V_{\mathbf{e}}}
310: % p(\mathbf{x}|\mathbf{e},H_0) \, p(\mathbf{e}|H_0) \mathrm{d}\mathbf{e} \, ,
311: %\end{eqnarray}
312: %where ${V_{\mathbf{e}}}$ is the space of the noise.
313:
314: %By using the above expressions we can now construct the \emph{Bayes factor}
315: %\begin{eqnarray}
316: %\fl \frac{p(\mathbf{x}|H_1)}{p(\mathbf{x}|H_0)}
317: % &=& \frac{
318: % \int_{V_{\mathbf{e},\mathbf{h},\tau,\theta,\phi}} \!\!\!\!
319: % p(\mathbf{x}|\mathbf{e},\mathbf{h},\tau,\theta,\phi,H_1) \,
320: % p(\mathbf{e},\mathbf{h},\tau,\theta,\phi|H_1) \,
321: % \mathrm{d}\mathbf{e}\ldots\mathrm{d}\phi
322: % }{
323: % \int_{{V_{\mathbf{e}}}}
324: % p(\mathbf{x}|\mathbf{e},H_0) \,
325: % p(\mathbf{e}|H_0) \, \mathrm{d}\mathbf{e}
326: % } \, .
327: %\end{eqnarray}
328: %%and the \emph{posterior plausibility ratio} is equal to
329: %%\begin{eqnarray}
330: %%\frac{p(H_1|\mathbf{x})}{p(H_0|\mathbf{x})}
331: %%&=&
332: %% \frac{p(\mathbf{x}|H_1)}{p(\mathbf{x}|H_0)}
333: %% \frac{p(H_1)}{p(H_0)} \, .
334: %%\end{eqnarray}
335:
336: \subsection{Noise model}
337:
338: %The noise distribution is unaffected by the signal parameters,
339: %\begin{eqnarray}
340: %p(\mathbf{e}|\mathbf{h},\tau,\theta,\phi,H_1)
341: %=
342: %p(\mathbf{e}|H_0)
343: %= p(\mathbf{e}) \, .
344: %\end{eqnarray}
345: %It then follows that
346: %\begin{eqnarray}
347: %p(\mathbf{e},\mathbf{h},\tau,\theta,\phi|H_1)
348: %&=&
349: %p(\mathbf{e}) \, p(\mathbf{h},\tau,\theta,\phi|H_1) \, .\label{eq:signalprior}
350: %\end{eqnarray}
The noise that affects gravitational wave detectors is typically
modeled as stationary, colored Gaussian noise that is independent of the signal parameters. This can be represented with a
\emph{multivariate normal distribution}, which can be compactly written as
354: \begin{eqnarray}
\mathcal{N}(\mathbf{\mu},\mathbf{\Sigma},\mathbf{x})&=&\frac{1}{(2\pi)^{n/2}\sqrt{|\mathbf{\Sigma}|}}\exp(-\frac{1}{2}(\mathbf{x}-\mathbf{\mu})^T\mathbf{\Sigma}^{-1}(\mathbf{x}-\mathbf{\mu})) \, ,
\end{eqnarray}
where $n$ is the dimension of $\mathbf{x}$ ($n=NM$ for the network data vector).
The vector $\mathbf{\mu}$ is the mean of the distribution, and the
positive-definite \emph{covariance matrix} $\mathbf{\Sigma}$
describes the ellipsoidal shape of the constant-density contours of the distribution in terms of the
pairwise covariances of the samples,
361: \begin{eqnarray}
\mathbf{\Sigma}_{i,j} = \langle(e_i-\mu_i)(e_j-\mu_j)\rangle \, .
363: \end{eqnarray}
364: Using this notation, the noise likelihood is %we can define the noise distribution to be
365: \begin{eqnarray}
p(\mathbf{x}|H_0) &=& \mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{x})
367: %p(\mathbf{e})&=&\mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{e})
368: \end{eqnarray}
369: for some $MN\times MN$ positive definite matrix $\mathbf{\Sigma}$.
370: Under the additional assumption of stationarity over some timescale,
371: these covariances can be estimated from previous observations.
372:
373: In the case of Gaussian stationary colored noise, each detector is individually
374: represented by a Toeplitz covariance matrix $\mathbf{\Sigma}^{(i)}$. For uncorrelated noise,
375: the covariance matrix for the whole network is $\mathbf{\Sigma} =
376: \textrm{diag}(\mathbf{\Sigma}^{(1)},
377: \mathbf{\Sigma}^{(2)},\ldots,\mathbf{\Sigma}^{(N)})$. In the simple
378: case in which all the noises are white, have equal standard deviation and are uncorrelated, we have $\mathbf{\Sigma} =
379: \textrm{diag}(\mathbf{I}, \mathbf{I},\ldots,\mathbf{I})=\mathbf{I}$.
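
The white-noise case is less restrictive than it may appear. As a sketch, assume a symmetric inverse square root $\mathbf{\Sigma}^{-1/2}$ of the positive-definite covariance matrix, and define the whitened quantities $\mathbf{\tilde{x}}=\mathbf{\Sigma}^{-1/2}\mathbf{x}$ and $\mathbf{\tilde{F}}=\mathbf{\Sigma}^{-1/2}\mathbf{F}$ (notation introduced here for illustration). The quadratic form in the exponent of the likelihood then becomes
\[
(\mathbf{x}-\mathbf{F}\,\mathbf{h})^T\mathbf{\Sigma}^{-1}(\mathbf{x}-\mathbf{F}\,\mathbf{h})
=
(\mathbf{\tilde{x}}-\mathbf{\tilde{F}}\,\mathbf{h})^T(\mathbf{\tilde{x}}-\mathbf{\tilde{F}}\,\mathbf{h}) \, ,
\]
so a colored-noise problem has the same form as the unit-variance white-noise problem with a modified response matrix.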
380:
381: %We can now derive the expression for the noise likelihood
382: %\begin{eqnarray}
383: %p(\mathbf{x}|H_0)
384: %&=&
385: %\int_{{V_{\mathbf{e}}}}
386: %p(\mathbf{x}|\mathbf{e},H_0) \,
387: %p(\mathbf{e}|H_0) \,
388: %\mathrm{d}\mathbf{e}\nonumber
389: %\\
390: %&=&
391: %\int_{{V_{\mathbf{e}}}}
392: %\delta(\mathbf{x}-\mathbf{e}) \,
393: %\mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{e}) \,
394: %\mathrm{d}\mathbf{e}\nonumber
395: %\\
396: %&=&
397: %\mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{x}) \, ,
398: %\end{eqnarray}
399: %and substitute the noise model expression into
400: The generalization of (\ref{noise_signal}) and (\ref{marginal}) for the signal likelihood is
401: \begin{eqnarray}
402: %\fl p(\mathbf{x}|H_1)
403: %&=&
404: %\int_{V_{\mathbf{e},\mathbf{h},\tau,\theta,\phi}} \!\!\!\!\!\!\!\!\!\!\!\!
405: %p(\mathbf{x}|\mathbf{e},\mathbf{h},\tau,\theta,\phi,H_1) \,
406: %p(\mathbf{e},\mathbf{h},\tau,\theta,\phi|H_1) \,
407: %\mathrm{d}\mathbf{e}\ldots\mathrm{d}\phi\nonumber
408: %\\
409: %&=&
410: %\int_{V_{\mathbf{e},\mathbf{h},\tau,\theta,\phi}} \!\!\!\!\!\!\!\!\!\!\!\!
411: %\delta(\mathbf{x}-\mathbf{F}(\tau,\theta,\phi)\cdot\mathbf{h}-\mathbf{e}) \,
412: %\mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{e}) \,
413: %p(\mathbf{h},\tau,\theta,\phi|H_1) \,
414: %\mathrm{d}\mathbf{e}\ldots\mathrm{d}\phi\nonumber
415: %\\
416: p(\mathbf{x}|H_1)
417: &=&
418: \int_{V_{\mathbf{h},\tau,\theta,\phi}} \!\!\!\!\!\!\!\!\!
419: \mathcal{N}(\mathbf{F}(\tau,\theta,\phi)\cdot\mathbf{h},\mathbf{\Sigma},\mathbf{x}) \,
420: p(\mathbf{h},\tau,\theta,\phi|H_1) \,
421: \mathrm{d}\mathbf{h}\ldots\mathrm{d}\phi \ ,\label{eq:partialmarginalization}
422: \end{eqnarray}
423: where ${V_{\mathbf{h},\tau,\theta,\phi}}$ is the space of all signal parameters
424: and $p(\mathbf{h},\tau,\theta,\phi|H_1)$ is the prior for these parameters.
425: Without loss of generality we may separate this signal prior into a
426: prior on source sky direction and arrival time, and a prior on the waveform
427: \emph{conditional on} the source sky direction and the arrival time, i.e.
428: \begin{eqnarray}
429: p(\mathbf{h},\tau,\theta,\phi|H_1)
430: = p(\tau,\theta,\phi|H_1) \, p(\mathbf{h}|\tau,\theta,\phi,H_1) \, ,
431: \end{eqnarray}
432: giving
433: \begin{eqnarray}
434: \fl p(\mathbf{x}|H_1)
435: &=&
436: \int_{V_{\mathbf{h},\tau,\theta,\phi}} \!\!\!\!\!\!\!\!\!
437: \mathcal{N}(\mathbf{F}(\tau,\theta,\phi)\cdot\mathbf{h},\mathbf{\Sigma},\mathbf{x}) \,
438: p(\tau,\theta,\phi|H_1) \, p(\mathbf{h}|\tau,\theta,\phi,H_1) \,
439: \mathrm{d}\mathbf{h}\ldots\mathrm{d}\phi \ .\label{eq:partialmarginalization2}
440: \end{eqnarray}
441:
442: \subsection{Wideband signal model}
443: \label{sec:wideband}
444:
445: In analogy with the single sample case, we can choose a multivariate normal distribution prior for the waveform amplitudes and
446: render the integral soluble in closed form.
447: %If we choose a multivariate normal distribution for the strain
448: %waveform samples,
449: The marginalization integral over $\mathbf{h}$ in (\ref{eq:partialmarginalization2}) can then be analytically performed, giving
450: \begin{eqnarray}
451: \frac{p(\mathbf{x}|\tau,\theta,\phi,H_1)}
452: {p(\mathbf{x}|H_0)}
453: &=&
454: \frac{
455: \int_{\mathbb{R}^{2L}}
456: \mathcal{N}(\mathbf{F}(\tau,\theta,\phi)\cdot\mathbf{h},\mathbf{\Sigma},\mathbf{x}) \,
457: p(\mathbf{h}|\tau,\theta,\phi,H_1) \,
458: \mathrm{d}\mathbf{h}}
459: {
460: \mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{x})
461: } \label{eq:quick}
462: \end{eqnarray}
463: (see (\ref{eqn:explicit}) below). Numerical integration over a more
464: manageable three dimensions is then sufficient to compute the Bayes factor,
465: \begin{eqnarray}
466: \frac{
467: p(\mathbf{x}|H_1)
468: }{
469: p(\mathbf{x}|H_0)
470: }
471: &=&
472: \int\int\int p(\tau,\theta,\phi|H_1) \,
473: \frac{p(\mathbf{x}|\tau,\theta,\phi,H_1)}
474: {p(\mathbf{x}|H_0)} \,
475: \mathrm{d}\tau \, \mathrm{d}\theta \, \mathrm{d}\phi \, .
476: \end{eqnarray}
This signal model is computationally tractable. It represents signals that can be described by an invertible $2 L\times 2 L$ correlation matrix,
including the important `least informative' case of independent, normally distributed samples of $\mathbf{h}$.
479:
480: %Conclusions; this is the 'least informative' computationally tractable situation,
481: %but we are not wholly ignorant of the physics of plausible sources and we may want to include
482: %some of this knowledge in the priors; the next section shows how to do so.
483:
484: \subsection{Informative signal models}
485: %A normal distribution prior for $\mathbf{h}$ states that the signal is
486: %a superposition of basis waveforms with normally distributed amplitudes,
487: %and has $2L$ degrees of freedom. This is a model for
488: %bursts with a fairly tightly constrained total energy, which is not
489: %a very good representation of our expectations about gravitational wave
490: %bursts. At the expense of some additional layers of indirection, we
491: %can preserve much of the computational advantages of the normal distribution
492: %prior while modeling a much more realistic source population.
493:
494: %This models a burst with a specific energy, but as general a waveform as possible.
495: %However, we may want to search for less general waveforms--perhaps
496: %restricted to certain bandwidths, durations or even interpolated
497: %families of waveforms from numerical relativity--and for a distribution of energies.
498:
The wideband signal model excludes some important cases, such as a known waveform, an almost-known waveform
(such as one drawn from a family of numerical simulations), or even just a signal restricted to some frequency band. These signals
are superpositions of a (relatively) small number $G < 2 L$ of basis waveforms, which may themselves be characterized
by a finite number of parameters, which we denote $\rho$.
503: %Let a vector $\mathbf{\rho}$ contain a small (or zero) number of
504: %parameters of the signal model that are not marginalizable
505: %analytically.
Like $\tau$, $\theta$, and $\phi$, these parameters must be integrated numerically,
which may be time-consuming.
508: Their prior distribution will be denoted by
509: $p(\mathbf{\rho}|\tau,\theta,\phi,H_1)$.
510:
511: To describe the signal as a superposition of basis waveforms \cite{Heng:09},
512: define a set of amplitude parameters $\mathbf{a}$ mapped into strain
513: $\mathbf{h}$ via a $2L\times G$ matrix
514: $\mathbf{W}(\rho,\tau,\theta,\phi)$ whose columns
515: $\mathbf{w}_i(\rho,\tau,\theta,\phi)$ are the basis waveforms, so that
516: \begin{eqnarray}
517: \mathbf{h}&=&\mathbf{W}(\mathbf{\rho},\tau,\theta,\phi)\cdot\mathbf{a} \ .
518: \end{eqnarray}
519: We assume that the amplitude parameters $\mathbf{a}$ are multivariate normal distributed with a
520: covariance matrix $\mathbf{A}(\mathbf{\rho},\tau,\theta,\phi)$, so that
521: \begin{eqnarray}
522: p(\mathbf{a}|\mathbf{\rho},\tau,\theta,\phi,H_1)&=&\mathcal{N}(\mathbf{0},\mathbf{A}(\mathbf{\rho},\tau,\theta,\phi),\mathbf{a}) \, .
523: \end{eqnarray}
524: The resulting distribution for the waveform strain is
525: \begin{eqnarray}
526: \fl p(\mathbf{h}|\tau,\theta,\phi,H_1)
527: &=&
528: \int_{V_{\rho}} \int_{\mathbb{R}^{G}}
529: p(\mathbf{h}|\mathbf{a},\mathbf{\rho},\tau,\theta,\phi,H_1) \,
530: p(\mathbf{a},\mathbf{\rho}|\tau,\theta,\phi,H_1)
531: \, \mathrm{d} \mathbf{a} \, \mathrm{d} \mathbf{\rho} \, \nonumber \\
532: &=&
533: \int_{V_{\rho}} \int_{\mathbb{R}^{G}}
534: \delta(\mathbf{h}-\mathbf{W}\cdot\mathbf{a}) \,
535: \mathcal{N}(\mathbf{0},\mathbf{A},\mathbf{a}) \,
536: p(\mathbf{\rho}|\tau,\theta,\phi,H_1)
537: \, \mathrm{d}\mathbf{a} \, \mathrm{d}\mathbf{\rho} \, ,
538: \end{eqnarray}
539: where for clarity we have begun to omit the dependence of matrices on their
540: parameters. As $G < 2L$ ({\em i.e.}, we have fewer basis waveforms than
541: samples in the signal time-series) the integral over $\mathbf{a}$ cannot
542: be directly represented as a multivariate normal distribution.
543:
544: This signal model proposes that gravitational wave signals have
545: waveforms that are the sum of $G$ basis waveforms with amplitudes that
546: are normally distributed (and potentially correlated). The basis
547: waveforms and their amplitude distributions may vary with source sky direction,
548: arrival time, and any other parameters we care to include in
549: $\mathbf{\rho}$. The model is capable of representing a variety of
550: sources including the important special cases of known `template'
551: waveforms, and band-limited bursts. We will consider some
552: concrete examples in \S\ref{sec:signalexamples}; perhaps the most
553: important is a scale parameter $\sigma$, that permits us to look
554: for signals of different total energies.
555:
556: We can substitute the expression back into part of
557: (\ref{eq:quick}) to form a multivariate normal distribution
558: partial integral whose solution is given in \cite{jaynes}:
559: \begin{eqnarray}
560: \fl p(\mathbf{x}|\tau,\theta,\phi,H_1)
561: &=&
562: \int_{V_{\rho}}\int_{\mathbb{R}^{G+2L}} \!\!\!\!
563: \mathcal{N}(\mathbf{F}\cdot\mathbf{h},\mathbf{\Sigma},\mathbf{x}) \,
564: \delta(\mathbf{h}
565: -\mathbf{W}\cdot\mathbf{a}) \,
566: \mathcal{N}(\mathbf{0},\mathbf{A},\mathbf{a}) \,
567: \nonumber \\
568: & & \mbox{} \times
569: p(\mathbf{\rho}|\tau,\theta,\phi,H_1) \,
570: \mathrm{d}\mathbf{a} \, \mathrm{d}\mathbf{\rho} \, \mathrm{d}\mathbf{h} \nonumber \\
571: &=&
572: \int_{V_{\rho}}
573: \mathcal{N}(\mathbf{0},(\mathbf{\Sigma}^{-1}-\mathbf{K})^{-1},\mathbf{x}) \,
574: p(\mathbf{\rho}|\tau,\theta,\phi,H_1) \,
575: \mathrm{d}\mathbf{\rho} \, ,
576: \end{eqnarray}
577: where the matrix
578: \begin{eqnarray}
579: \fl \mathbf{K}(\mathbf{\rho},\tau,\theta,\phi)&=&
580: (\mathbf{\Sigma}^{-1}\mathbf{F}\mathbf{W})
581: (
582: (\mathbf{F}\mathbf{W})^T
583: \mathbf{\Sigma}^{-1}
584: \mathbf{F}\mathbf{W}
585: +
586: \mathbf{A}^{-1}
587: )^{-1}
588: (\mathbf{\Sigma}^{-1}\mathbf{F}\mathbf{W})^T
589: \end{eqnarray}
590: will be the kernel of our numerical implementation. Note that this is a generalization of
591: equation (\ref{eq:simpleC}) obtained in the single-sample case. Since
592: \begin{eqnarray}\label{eqn:note}
593: \fl \frac{
594: p(\mathbf{x}|\rho,\tau,\theta,\phi,H_1)
595: }{
596: p(\mathbf{x}|H_0)
597: }
598: & = &
599: \frac{
600: \mathcal{N}(\mathbf{0},(\mathbf{\Sigma}^{-1}-\mathbf{K})^{-1},\mathbf{x})
601: }{
602: \mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{x})
603: }
604: =
605: \sqrt{|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|}
606: \exp(\frac{1}{2}\mathbf{x}^T\mathbf{K}\mathbf{x}) \, ,
607: \end{eqnarray}
608: we have
609: \begin{eqnarray}
610: \fl \frac{p(\mathbf{x}|\tau,\theta,\phi,H_1)}{
611: p(\mathbf{x}|H_0)}
612: & = &
613: \int_{V_{\rho}}
614: p(\mathbf{\rho}|\tau,\theta,\phi,H_1)
615: \sqrt{|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|}
616: \exp(\frac{1}{2}\mathbf{x}^T\mathbf{K}\mathbf{x})
\, \mathrm{d}\mathbf{\rho} \, , \label{eqn:explicit}
618: \end{eqnarray}
619: and the Bayes factor becomes
620: \begin{eqnarray}
621: \fl \frac{p(\mathbf{x}|H_1)}{p(\mathbf{x}|H_0)}
622: & = &
623: \int_{V_{\rho, \tau, \theta, \phi}} \!\!\!\!\!\!
624: p(\mathbf{\rho},\tau,\theta,\phi|H_1)
625: \sqrt{|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|}
626: \exp(\frac{1}{2}\mathbf{x}^T\mathbf{K}\mathbf{x})
627: \, \mathrm{d}\mathbf{\rho} \, \mathrm{d}\tau
628: \, \mathrm{d}\theta \, \mathrm{d}\phi \, .\label{eq:fastbayesfactor}
629: \end{eqnarray}
630: In other words we have reduced the task of computing the Bayes factor
631: to an integral over arrival time, source sky direction, and any additional signal
632: model parameters $\mathbf{\rho}$.
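
The determinant and exponent in (\ref{eqn:note}) follow from the definition of $\mathcal{N}$ in two short steps, spelled out here for convenience. The factors of $2\pi$ cancel because numerator and denominator have the same dimension:
\begin{eqnarray*}
\frac{
\mathcal{N}(\mathbf{0},(\mathbf{\Sigma}^{-1}-\mathbf{K})^{-1},\mathbf{x})
}{
\mathcal{N}(\mathbf{0},\mathbf{\Sigma},\mathbf{x})
}
&=&
\sqrt{|\mathbf{\Sigma}|\,|\mathbf{\Sigma}^{-1}-\mathbf{K}|}\,
\exp\left(-\frac{1}{2}\mathbf{x}^T(\mathbf{\Sigma}^{-1}-\mathbf{K})\mathbf{x}
+\frac{1}{2}\mathbf{x}^T\mathbf{\Sigma}^{-1}\mathbf{x}\right) \\
&=&
\sqrt{|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|}\,
\exp\left(\frac{1}{2}\mathbf{x}^T\mathbf{K}\mathbf{x}\right) \, ,
\end{eqnarray*}
where the last step uses $|\mathbf{\Sigma}|\,|\mathbf{\Sigma}^{-1}-\mathbf{K}|=|\mathbf{\Sigma}(\mathbf{\Sigma}^{-1}-\mathbf{K})|=|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|$.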
633:
634: %
635: %\end{widetext}
636: %
637:
638: \subsection{Example signal models\label{sec:signalexamples}}
639:
A simple signal model is the wideband signal model discussed briefly in Section~\ref{sec:wideband}. This is a burst whose spectrum is white, with
characteristic strain amplitude $\sigma$ (at the Earth) and duration
$f_\textrm{s}^{-1}L$:
643: \begin{eqnarray}
644: G&=&2L \label{wnb1}\\
645: \mathbf{A}&=&\sigma^2\mathbf{I}\label{eq:sigma} \label{wnb2} \\
646: \mathbf{W}&=&\mathbf{I} \, . \label{wnb3}
647: \end{eqnarray}
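
As a consistency check, substituting (\ref{wnb2}) and (\ref{wnb3}) into the definition of $\mathbf{K}$, under the additional assumption of unit-variance white noise ($\mathbf{\Sigma}=\mathbf{I}$), gives
\[
\mathbf{K}
=
\mathbf{F}
(\mathbf{F}^T\mathbf{F}+\sigma^{-2}\mathbf{I})^{-1}
\mathbf{F}^T \, ,
\]
which has the same form as the single-sample kernel (\ref{eq:simpleC}).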
648:
649: If we assert that such bursts are equally likely to come from any
650: source sky direction and arrive at any time in the observation window of
651: $f_\textrm{s}^{-1}M$ seconds, then the priors are
652: \begin{eqnarray}
653: p(\theta|H_1)&=&\frac{1}{2}\sin(\theta)\\
654: p(\phi|H_1)&=&(2\pi)^{-1}\\
655: p(\tau|H_1)&=&f_\textrm{s}M^{-1} \, .
656: \end{eqnarray}
657: If we assert that the source population is distributed uniformly in flat space up to some horizon $r_\mathrm{max}$
658: %(sources that are extragalactic but not cosmological),
659: we have a prior on the distance $r$ to the source
660: $p(r|H_1)\propto r^2$. We want to turn this into a prior on the characteristic amplitude $\sigma$, an example of a signal model
661: parameter we must numerically marginalize over
662: ($\mathbf{\rho}=[\sigma]$). Since the gravitational wave energy decays with the square of the distance to the source, $\sigma^2\propto r^{-2}$, we then deduce that:
663: \begin{eqnarray}
664: p(\sigma|H_1)&=&p(r|H_1)\left|\frac{\mathrm{d}r}{\mathrm{d}\sigma}\right|\\
665: &=&\frac{3\sigma_\mathrm{min}^3}{\sigma^4} \, ,\label{eq:sigma4prior}
666: \end{eqnarray}
where $\sigma_\mathrm{min} \propto r_\mathrm{max}^{-1}$ is a lower bound on the amplitude of the gravitational wave (equivalently, an upper bound on the distance to
its source). This bound is obviously
669: somewhat arbitrary, but is a consequence of the way we distinguish
670: between detection and non-detection. For a uniformly spatially
671: distributed population of bursts there are of course many weak signals
672: within the data, and the noise hypothesis is ``never'' true. In reality
673: we are interested only in gravitational waves of at least a certain
674: size. If $\sigma_\mathrm{min}$ is much smaller than the noise floor
675: in all detectors, the expression for the noise hypothesis is an
676: excellent approximation to the expressions of the likelihood we
677: adopted. The classification of observations is insensitive to
678: different choices of $\sigma_\mathrm{min}$ below the noise floor.
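
As a check on (\ref{eq:sigma4prior}), the intermediate steps can be written out. This is only a sketch: it assumes that the proportionality $\sigma\propto r^{-1}$ is fixed by identifying $\sigma_\mathrm{min}$ with the amplitude received from a source at the horizon, so that $\sigma r=\sigma_\mathrm{min}r_\mathrm{max}$, and the normalized form of $p(r|H_1)$ on $0\le r\le r_\mathrm{max}$ is likewise our assumption.
\begin{eqnarray*}
p(r|H_1)&=&\frac{3r^2}{r_\mathrm{max}^3} \, , \qquad
\left|\frac{\mathrm{d}r}{\mathrm{d}\sigma}\right|=\frac{\sigma_\mathrm{min}r_\mathrm{max}}{\sigma^2} \, ,\\
p(\sigma|H_1)&=&\frac{3}{r_\mathrm{max}^3}
\left(\frac{\sigma_\mathrm{min}r_\mathrm{max}}{\sigma}\right)^2
\frac{\sigma_\mathrm{min}r_\mathrm{max}}{\sigma^2}
=\frac{3\sigma_\mathrm{min}^3}{\sigma^4} \, , \qquad \sigma\ge\sigma_\mathrm{min} \, ,\\
\int_{\sigma_\mathrm{min}}^{\infty}\frac{3\sigma_\mathrm{min}^3}{\sigma^4}\,\mathrm{d}\sigma&=&1 \, .
\end{eqnarray*}
Under these assumptions the prior is properly normalized and vanishes for $\sigma<\sigma_\mathrm{min}$.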
679:
680: This distribution of $\sigma$ is preserved if we consider a source population
681: with a distribution of different intrinsic luminosities, so long as they
682: are uniformly distributed in space out to their respective
683: $r_\mathrm{max}$ determined by the choice of $\sigma_\mathrm{min}$.
684:
685: This is an example of a relatively \emph{uninformative} signal model.
686: It is capable of detecting signals of any waveform (of appropriate
duration). However, it incurs a large \emph{Occam penalty} for its
688: generality, and cannot be as sensitive as a more \emph{informed}
689: search.
690:
At the other extreme, a source's waveform is completely
known, but its other parameters (amplitude, source sky position, polarization angle) are not. Consider a source that
produces a linearly polarized strain $\mathbf{w}$. If the source's
orientation, inclination and amplitude are unknown, we can
parameterize the system with two amplitudes $\mathbf{a}$ that map the
strain into the observatory network's polarization basis:
697: \begin{eqnarray}
698: \mathbf{W}&=&\left[
699: \begin{array}{cc}
700: \mathbf{w} & \mathbf{0}\\
701: \mathbf{0} & \mathbf{w}
702: \end{array}\right].
703: \end{eqnarray}
704: This is the Bayesian equivalent of the matched filter.
705: %
706: The template $\mathbf{w}$ appears twice because any specific signal
707: typically will not be aligned with the polarization basis used to describe
708: $h_+$ and $h_\times$ in the detectors, but rather will be rotated by
some \emph{polarization angle} $\psi$ with respect to that basis.
710: %% The
711: %% amplitudes $\mathbf{a}$ will then scale as $\cos2\psi$ and $\sin2\psi$.
712: %%
713: More generally, any signal model that is independent of the observatory
714: network's polarization basis must have $\mathbf{A}$ and $\mathbf{W}$
715: composed of two identical sub-matrices on the diagonal like this, so that
716: $\mathbf{h}_+$ and $\mathbf{h}_\times$ have the same statistical distribution. For
717: example, if the source is not linearly polarized, but has strain described
718: by $\mathbf{w}_+$ and $\mathbf{w}_\times$, then
719: \begin{eqnarray}
720: \mathbf{W}&=&\left[
721: \begin{array}{cccc}
722: \mathbf{w}_+ & \mathbf{w}_\times & \mathbf{0} & \mathbf{0}\\
723: \mathbf{0} & \mathbf{0} & \mathbf{w}_+ & \mathbf{w}_\times
724: \end{array}\right].
725: \end{eqnarray}
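
Written out component by component, the two template models above take a simple form. The following is a sketch that assumes the strain is generated from the amplitudes as $\mathbf{h}=\mathbf{W}\mathbf{a}$, with the two polarizations stacked as $\mathbf{h}=[\mathbf{h}_+^T,\mathbf{h}_\times^T]^T$; the stacking order and the component labels $a_1,\ldots,a_4$ are our assumptions for illustration. For the linearly polarized template,
\begin{eqnarray*}
\mathbf{h}_+&=&a_1\mathbf{w} \, , \qquad \mathbf{h}_\times=a_2\mathbf{w} \, ,
\end{eqnarray*}
while for the template with two polarizations,
\begin{eqnarray*}
\mathbf{h}_+&=&a_1\mathbf{w}_++a_2\mathbf{w}_\times \, , \qquad
\mathbf{h}_\times=a_3\mathbf{w}_++a_4\mathbf{w}_\times \, .
\end{eqnarray*}
In the linearly polarized case one would expect $a_1$ and $a_2$ to scale as $\cos2\psi$ and $\sin2\psi$ respectively, so that marginalizing over $\mathbf{a}$ accounts for the unknown polarization angle as well as the unknown amplitude.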
726: %
727: %A concrete example is the signal from the inspiral of a compact binary,
728: %such as two neutron stars. In the absence of spin or eccentricity, the
729: %waveform is characterised by 9 parameters: the distance $D$, the
730: %two masses $m_1$ and $m_2$, the inclination angle $\iota$, the
731: %polarization angle $\psi$, a phase $\Phi_0$, the coalescence time $\tau$,
732: %and the sky position angles $\theta,\phi$. The basis waveforms
733: %$\mathbf{w}_+$, $\mathbf{w}_\times$ are the ``cosine'' and ``sine''
734: %inspiral templates \cite{a-matched-filter-paper}; their shape is
735: %determined by the masses $m_1$, $m_2$, which must be marginalized
736: %over numerically. The overall phase angle $\Phi_0$ can be marginalized
737: %analytically. The remaining parameters $\iota$, $\psi$, and $D$ affect
738: %only the relative amplitude with which $\mathbf{w}_+$ and $\mathbf{w}_\times$
739: %couple into the detector data streams; i.e., they contribute only to the
740: %amplitudes $\mathbf{a}$:
741: %\begin{equation}
742: %\mathbf{a} = \frac{1}{D}\left[\begin{array}{c}
743: % \frac12\cos2\psi(1+\cos^2\iota) \\
744: % \sin2\psi\cos\iota \\
745: % -\frac12\sin2\psi(1+\cos^2\iota) \\
746: % \cos2\psi\cos\iota
747: % \end{array}
748: %\right]
749: %\end{equation}
750: %The physical parameters $\iota$, $\psi$, and $D$ are not Gaussian distributed.
751: %Assuming the binaries are isotropically distributed through space ($p(D)\propto D^2$),
752: %one finds that the individual $a(i)$ can be reasonably approximated with
753: %Gaussians. However, scatter plots of the pairs $(a(1),a(4))$ and $(a(2),a(3))$
754: %show a hyperbolic-like distribution which will not be well approximated by a multivariate
755: %Gaussian. It needs more study to see if we can make a useful example out of this case.
756:
A more general case arises when we have several numerically derived
predictions $\mathbf{w}_i$ for a waveform. The
resulting search looks for a linear combination of these different
waveforms,
761: \begin{eqnarray}
762: \mathbf{W}&=&\left[
763: \begin{array}{cccccc}
764: \mathbf{w}_1 & \mathbf{w}_2 & \cdots & \mathbf{0} & \mathbf{0} & \cdots \\
765: \mathbf{0} & \mathbf{0} & \cdots & \mathbf{w}_1 & \mathbf{w}_2 & \cdots
766: \end{array}
767: \right] \, .
768: \end{eqnarray}
769: %% This model can even encompass interpolations \emph{between} waveforms
770: %% if the set includes $\textrm{diag}(\mathbf{b}_j)\mathbf{w}_i$ for $\mathbf{b}_j$
771: %% some interpolation basis (such as the polynomial basis $b_{j,k} = (k/L)^j$).
772: %% This must be done with care, however, as interpolations we would consider
773: %% unreasonable (such as $h_{+,j} = a_ij^2L^{-2}w_{i,j}$) will also be considered.
774: %% The signal model we are considering is a multivariate normal distribution
775: %% encompassing the surface that properly normalized interpolations lie on, and
776: %% also encompassing some improperly normalized interpolations. Not all signal
777: %% models can be represented as multivariate normal distributions; properly
778: %% normalized interpolations are one example.
779:
780: %Multivariate normal distributions have another useful property. The
781: %covariance matrix can be computed for any arbitrary signal model, even
782: %though it may not well describe that signal model. The multivariate
783: %normal distribution corresponding to that covariance matrix is the
784: %\emph{least informative} distribution in that it makes the weakest
785: %assumption about the signal of any model with that covariance.
786: %Thus, for every signal model, there is a corresponding multivariate
787: %normal distribution that will not overstate what we know. It is therefore
788: %a conservative choice to replace an arbitrary signal model with the corresponding
789: %multivariate normal distribution. This is precisely the
790: %approach we follow in the Monte Carlo demonstration in
791: %\S\ref{sec:simulations}, where we use a multivariate normal
792: %distribution model to detect binary black-hole merger waveforms.
793:
794: \subsection{Comparison with previously proposed methods}
795: \label{sec:comparison}
796:
797: In this section we will expand on the arguments sketched in a previous
798: paper \cite{SeSuTiWo:08}.
799:
800: %%%%
801: %
802: % We introduce this section with the Bayesian search for known theta, phi
803: % and then compare that against the all-sky frequentist search, inviting
804: % trouble. Better to note the differences between marginalization and
805: % maximization and then press on comparing the statistics 'within' the
806: % different proceedures
807: %
808: %%%%
809:
Several previously proposed hypothesis tests, such as the
G\"{u}rsel-Tinto (i.e.\ standard likelihood) statistic, the constraint likelihoods,
and the Tikhonov-regularized likelihood, can be written in the form
813: \begin{eqnarray}\label{eq:prev}
814: \max_{\rho,\tau,\theta,\phi}\mathbf{x}^T\mathbf{J}(\rho,\tau,\theta,\phi)\mathbf{x}&>&\lambda \, ,\label{eq:fht}
815: \end{eqnarray}
816: where $\mathbf{J}$ is an $MN\times MN$ matrix and $\lambda$ is a
817: \emph{threshold}. These tests proceed in two steps. First, parameters are
818: \emph{estimated} by maximizing the likelihood function with respect to the parameters.
Second, the value of the likelihood function at its maximum is compared to a threshold $\lambda$, which is chosen so that noise alone exceeds it only at some acceptable \emph{false alarm rate}.
820:
821: The corresponding Bayesian expression, from (\ref{eq:fastbayesfactor}),
822: integrates over source sky direction, arrival time and any other parameters
823: and determines if the Bayes factor is large enough to overcome the prior plausibility ratio
824: \begin{eqnarray}
825: \fl \int_{V_{\rho, \tau, \theta, \phi}} \!\!\!\!\!\!
826: p(\mathbf{\rho},\tau,\theta,\phi|H_1)
827: \sqrt{|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|}
828: \exp(\frac{1}{2}\mathbf{x}^T\mathbf{K}\mathbf{x})
829: \, \mathrm{d}\mathbf{\rho} \, \mathrm{d}\tau
830: \, \mathrm{d}\theta \, \mathrm{d}\phi
831: &>&
832: \frac{
833: p(H_0)
834: }{
835: p(H_1)
836: }\,. \label{eq:bht}
837: \end{eqnarray}
838:
839: There are some obvious similarities between (\ref{eq:fht}) and (\ref{eq:bht}),
840: in particular the quadratic forms central to each. However, direct mathematical equivalence cannot be established in
841: general because of the difference between maximization and marginalization.
842:
We can establish equivalence for the related problem of parameter estimation, where we have the maximum likelihood parameter estimate
844: \begin{eqnarray}
845: \{\rho,\tau,\theta,\phi\}&=&\arg\max(\mathbf{x}^T\mathbf{J}\mathbf{x})
846: \end{eqnarray}
and the Bayesian most plausible parameters, which are one of several ways in which the posterior plausibility distribution for the parameters can be turned into a point estimate,
848: \begin{eqnarray}
849: \fl \{\rho,\tau,\theta,\phi\}&=&\arg\max ( p(\mathbf{\rho},\tau,\theta,\phi|H_1)
850: \sqrt{|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|}
851: \exp(\frac{1}{2}\mathbf{x}^T\mathbf{K}\mathbf{x}) )\\
852: &=&\arg\max(\mathbf{x}^T\mathbf{K}\mathbf{x} + 2\ln p(\mathbf{\rho},\tau,\theta,\phi|H_1) + \ln |\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|) \, .
853: \end{eqnarray}
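In the cases where we can find a Bayesian signal model that produces $\mathbf{K}=\mathbf{J}$, the two estimates coincide for all data only if the remaining data-independent terms sum to a constant, so we must also use a prior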
855: \begin{eqnarray}
856: p(\mathbf{\rho},\tau,\theta,\phi|H_1)&\propto&|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|^{-\frac{1}{2}}.
857: \end{eqnarray}
858: This prior states that gravitational wave bursts are \emph{intrinsically} more likely to occur at the sky positions
859: that the network is more sensitive to. We interpret this as an implicit bias present in any statistic of the form of (\ref{eq:fht})\footnote{It is important
860: to note that this particular objection applies only to all-sky searches;
861: it is a consequence of the maximization over $(\theta,\phi)$. These statistics
862: are also used in directed searches (for example, in the direction of a gamma-ray burst) where
863: $(\theta, \phi)$ is known and fixed, and the problem does not arise (the missing normalization
864: term is one of several absorbed by tuning the threshold).}.
865:
866: %As $\mathbf{F}$ and therefore $\mathbf{K}$ is a function of direction
867: %$(\theta,\phi)$, we conclude that the only
868: %way for a Bayesian analysis to produce the same parameter estimates as
869: %a statistic of the form in (\ref{eq:fht}) is to propose that gravitational
870: %wave sources are anisotropically distributed across the sky as some function
871: %of the network's sensitivity. This is an unphysical proposition; insofar as
872: %we find it incredible, we should expect a Bayesian analysis with a uniform prior
873: %to perform better on the real signal population
874: %
875: %We interpret this result as indicating
876: %that any statistics of the form in (\ref{eq:fht}) contains at least this implicit unphysical assumption.
877: %In the next subsections, we will individually explore the several specific statistics.
878:
879: In order to compare previously proposed statistics to the Bayesian
880: method, we place some restrictions on the configurations
881: considered. We will assume co-located (but differently oriented)
882: detectors to eliminate the need to time-shift data, and we will use
883: stationary signals and observation times that coincide with the time
884: the signal is present. These restrictions eliminate the differences
885: in the way previously proposed statistics and the Bayesian method
886: handle arrival time and signal duration. For simplicity, we will
887: further assume that the detectors are affected by white Gaussian noise.
The conclusions drawn will apply equally to versions of
these statistics for colored noise or for bases other than the
time domain (such as the frequency or wavelet domains).
891:
892: \subsubsection{Tikhonov regularized statistic}
893:
894: The Tikhonov regularized statistic proposed in \cite{Ra:06} for white
895: noise interferometers is
896: \begin{eqnarray}
897: \mathbf{x}^T\mathbf{F}(\mathbf{F}^T\mathbf{F}
898: +\alpha^2\mathbf{I})^{-1}\mathbf{F}^T\mathbf{x}\, .
899: \end{eqnarray}
900: The Bayesian kernel $\mathbf{K}$ reduces to this for
901: \begin{eqnarray}
902: \mathbf{\Sigma}&=&\mathbf{I}\\
903: \mathbf{W}&=&\mathbf{I}\\
904: \mathbf{A}&=&\alpha^{-2}\mathbf{I} \, .
905: \end{eqnarray}
906: This is a signal of
907: characteristic amplitude $\sigma = \alpha^{-1}$. The Tikhonov
908: regularizer $\alpha$ therefore places a delta function prior on the characteristic amplitude of the signal $p(\sigma|H_1)=\delta(\sigma-\alpha^{-1})$.
909: %,
910: %corresponding to a potentially quite restrictive $\chi^2$ distribution with $2L$ degrees of freedom for the signal energy.
911: %This physical interpretation, that the regularizer dictates the size of the signal expected, was not made in \cite{Ra:06}.
912: %The prior plausibility of the signal hypothesis varies with source sky direction
913: %\begin{eqnarray}
914: %\frac{p(H_1|\theta,\phi)}{p(H_0)}
915: %&=&
916: %\frac{\exp(-\frac{1}{2}\lambda)}{\sqrt{|\mathbf{I}-\mathbf{F}(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}\mathbf{F}^T|}} \nonumber \\
917: %&=&\frac{\exp(-\frac{1}{2}\lambda)}{\sqrt{|\mathbf{I}-\mathbf{F}^T\mathbf{F}(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}|}}\mathrm{\ (via\ Sylvester's\ theorem)}\nonumber\\
918: %&=&\frac{\exp(-\frac{1}{2}\lambda)}{\sqrt{|\mathbf{I}-(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I}-\alpha^2\mathbf{I})(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}|}}\nonumber\\
919: %&=&\frac{\exp(-\frac{1}{2}\lambda)}{\sqrt{|\alpha^2(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}|}}\nonumber\\
920: %&=&\frac{\exp(-\frac{1}{2}\lambda)}{\sqrt{|(\alpha^{-2}\mathbf{F}^T\mathbf{F}+\mathbf{I})^{-1}|}}\nonumber\\
921: %&=&\sqrt{e^{-\lambda}|\mathbf{I}+\sigma^{2}\mathbf{F}^T\mathbf{F}|} \, ,
922: %\end{eqnarray}
923: %implying that gravitational wave events to come from some
924: %source sky directions more frequently and from others less frequently;
925: %in particular, this is in proportion to the network's
926: %sensitivity to that source sky direction.
927:
928: The Tikhonov statistic behaves like a Bayesian statistic that
929: postulates all bursts have energies in a narrow range. % and are anisotropically distributed across the sky.
930: % Since these
931: %priors do not reflect our knowledge of the universe, we should expect
932: %a better performance from an analysis accounting for our prior
933: %knowledge.
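
The Tikhonov kernel also gives a concrete instance of the direction-dependent prior $p\propto|\mathbf{I}-\mathbf{\Sigma}\mathbf{K}|^{-\frac{1}{2}}$ identified in Section~\ref{sec:comparison}. As a sketch, with $\mathbf{\Sigma}=\mathbf{I}$, $\sigma=\alpha^{-1}$, and Sylvester's determinant theorem used in the second step (the identity matrices on either side of that step have different dimensions),
\begin{eqnarray*}
|\mathbf{I}-\mathbf{K}|
&=&|\mathbf{I}-\mathbf{F}(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}\mathbf{F}^T|\\
&=&|\mathbf{I}-\mathbf{F}^T\mathbf{F}(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}|\\
&=&|\alpha^2(\mathbf{F}^T\mathbf{F}+\alpha^2\mathbf{I})^{-1}|\\
&=&|\mathbf{I}+\sigma^2\mathbf{F}^T\mathbf{F}|^{-1} \, .
\end{eqnarray*}
The prior needed to reproduce the maximum likelihood parameter estimates is then proportional to $|\mathbf{I}+\sigma^2\mathbf{F}^T\mathbf{F}|^{\frac{1}{2}}$, which is larger for sky directions where the antenna response $\mathbf{F}^T\mathbf{F}$ is larger.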
934:
935: \subsubsection{G\"{u}rsel-Tinto statistic}
936:
937: The G\"{u}rsel-Tinto or standard likelihood statistic
938: \cite{GuTi:89,FlHu:98b,AnBrCrFl:01} is
939: \begin{eqnarray}
940: \mathbf{x}^T\mathbf{F}(\mathbf{F}^T\mathbf{F})^{-1}\mathbf{F}^T\mathbf{x}\,.
941: \end{eqnarray}
For large $\sigma$ (that is, small $\alpha=\sigma^{-1}$), the kernel of the Tikhonov statistic approaches
943: \begin{eqnarray}
944: \mathbf{K}
945: &\approx& \mathbf{F}(\mathbf{F}^T\mathbf{F})^{-1}\mathbf{F}^T \, .
946: % \\
947: %\frac{p(H_1|\theta,\phi)}{p(H_0)}
948: % &\approx& \sigma^{2M}\sqrt{e^{-\lambda}|\mathbf{F}^T\mathbf{F}|} \, .
949: \end{eqnarray}
This implies that the G\"{u}rsel-Tinto statistic is the limit of a sequence of Bayesian statistics for increasing signal amplitudes.
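
The limit can be made explicit with a short expansion. This is a sketch under the assumptions that $\mathbf{F}^T\mathbf{F}$ is invertible and that $\sigma^2$ times its smallest eigenvalue is much larger than unity. Writing $\alpha=\sigma^{-1}$ in the Tikhonov kernel,
\begin{eqnarray*}
\mathbf{K}&=&\mathbf{F}(\mathbf{F}^T\mathbf{F}+\sigma^{-2}\mathbf{I})^{-1}\mathbf{F}^T\\
&=&\mathbf{F}(\mathbf{F}^T\mathbf{F})^{-1}\mathbf{F}^T
-\sigma^{-2}\mathbf{F}(\mathbf{F}^T\mathbf{F})^{-2}\mathbf{F}^T+O(\sigma^{-4}) \, .
\end{eqnarray*}
The leading term is the G\"{u}rsel-Tinto projection. The correction term grows where $\mathbf{F}^T\mathbf{F}$ is close to singular, so the approximation is expected to be poorest for sky directions in which the network is nearly insensitive to one polarization.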
951:
952: %This implies that there is no Bayesian test equivalent to the
953: %G\"{u}rsel-Tinto statistic, but that the G\"{u}rsel-Tinto statistic is
954: %rather the limit of a series of Bayesian tests for gravitational waves of increasingly
955: %large energies.
956: %increasingly frequent, and increasingly directionally biased
957: %populations of gravitational wave signals.
958:
959: %Alternatively, we could say that the G\"{u}rsel-Tinto statistic
960: %follows from our Bayesian formulation if we adopt an \emph{improper}
961: %(unnormalizable) strain prior $p(\mathbf{h}|H_1)=1$,
962: %which assigns equal plausibility to every possible waveform. This is what is
963: %meant by G\"{u}rsel-Tinto's independence of waveform. In practice we
964: %expect that smaller signals occur more frequently than larger
965: %signals, and this has real consequences in the analysis.
966: %
967: %Consider a common failure mode of G\"{u}rsel-Tinto: misidentifying the
968: %source sky direction of a gravitational wave. A moderately sized signal will
969: %come from a source sky direction of typical sensitivity and produce a moderate
970: %response. G\"{u}rsel-Tinto will correctly declare the true source sky direction
971: %of the injection to be plausible. However, there are directions on
972: %the sky where the global network becomes insensitive to one
973: %polarization, and near those source sky directions $\mathbf{F}^T\mathbf{F}$ is a
974: %near-singular matrix whose inverse varies rapidly, causing the
975: %G\"{u}rsel-Tinto statistic itself to vary over a wide range.
976: %Often one of these near-pathological source sky directions will be deemed more
977: %plausible than the true source sky direction. These pathological source sky directions
978: %always correspond to very low sensitivity to at least one
979: %polarization, so to explain the moderately-sized response at least one
980: %polarization of the postulated signal has to be very large. The
981: %improper G\"{u}rsel-Tinto prior says we believe very large
982: %gravitational waves to be just as plausible as moderately sized ones,
983: %so the statistic has no grounds to discount the pathological
984: %source sky direction, and returns the wrong source sky direction and an obviously wrong
985: %unphysically large reconstructed waveform. By contrast, a Bayesian
986: %method's prior can tell it that very large signals are very unlikely,
987: %so it does not make the same error.
988: %
989: %This shortcoming also degrades the sensitivity of the G\"{u}rsel-Tinto
990: %statistic. Noisy observations can be consistent with a very large
991: %gravitational wave from a near-pathological source sky direction and these false
992: %alarms force us to use a high threshold $\lambda$ that limits the
993: %efficiency of the method for more physically reasonable signals.
994: %Again, the Bayesian method does not suffer from this problem because
995: %it knows large signals are rare and it will not postulate them on the
996: %basis of weak evidence.
997: %
998: %It is important to note that G\"{u}rsel-Tinto can detect
999: %realistically-sized signals.
1000: %%; they form only an infinitesimal fraction
1001: %%of the postulated signal population, but this is canceled by the
1002: %%infinite prior plausibility that a signal is present, to give them a
1003: %%net finite plausibility.
1004: %Like the Tikhonov statistic,
1005: %G\"{u}rsel-Tinto works as a detection statistic; its problem is only
1006: %one of efficiency.
1007:
1008: \subsubsection{Soft constraint likelihood}
1009:
1010: The soft constraint statistic \cite{KlMoRaMi:05,KlMoRaMi:06} for white
1011: noise interferometers is
1012: \begin{eqnarray}\label{eqn:SC}
1013: k^2(\theta,\phi)\,\mathbf{x}^T\mathbf{FF}^T\mathbf{x} \, ,
1014: \end{eqnarray}
1015: for some function $k(\theta,\phi)$. Specifically, (\ref{eqn:SC})
1016: gives the soft constraint likelihood for the choice
1017: $k^2=(\mathbf{F}^{+T}\mathbf{F}^+)^{-1}$, where the antenna response is
1018: computed in the dominant polarization frame \cite{KlMoRaMi:05}.
1019:
1020: Consider the signal model defined by
1021: \begin{eqnarray}
1022: \mathbf{\Sigma}&=&\mathbf{I}\\
1023: \mathbf{W}&=&\mathbf{I}\\
1024: \mathbf{A}&=&\sigma^2k^2(\theta,\phi)\mathbf{I} \, .
1025: \end{eqnarray}
1026: This is a population of signals whose characteristic amplitude
1027: $\sigma k(\theta,\phi)$ varies as some known function of source sky direction,
1028: slightly generalizing the situation of the Tikhonov statistic. For small $\sigma$,
1029: \begin{eqnarray}
1030: \mathbf{K}&\approx&\sigma^2k^2(\theta,\phi)\mathbf{F}\mathbf{F}^T \, ,
1031: \end{eqnarray}
so we can see that the soft constraint is the limit of a sequence of Bayesian statistics for decreasing signal amplitudes.
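
The small-amplitude limit can be written out in the same way as for the Tikhonov kernel. As a sketch, we substitute this model into the kernel that produced the Tikhonov statistic, replacing $\alpha^{-2}$ by $\sigma^2k^2(\theta,\phi)$ (this substitution is our reading of the construction above), and assume that $\sigma^2k^2$ times the largest eigenvalue of $\mathbf{F}^T\mathbf{F}$ is much smaller than unity:
\begin{eqnarray*}
\mathbf{K}&=&\mathbf{F}(\mathbf{F}^T\mathbf{F}+\sigma^{-2}k^{-2}\mathbf{I})^{-1}\mathbf{F}^T\\
&=&\sigma^2k^2\,\mathbf{F}(\mathbf{I}+\sigma^2k^2\mathbf{F}^T\mathbf{F})^{-1}\mathbf{F}^T\\
&=&\sigma^2k^2\,\mathbf{F}\mathbf{F}^T
-\sigma^4k^4\,\mathbf{F}\mathbf{F}^T\mathbf{F}\mathbf{F}^T+O(\sigma^6) \, .
\end{eqnarray*}
The factor $\sigma^2$ in the leading term is independent of the data and of sky direction, so it can be absorbed into the threshold, leaving the quadratic form $k^2(\theta,\phi)\,\mathbf{x}^T\mathbf{FF}^T\mathbf{x}$ of (\ref{eqn:SC}).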
1033:
1034: %
1035: %(the opposite extreme to the G\"{u}rsel-Tinto statistic),
1036: %with the added twist that the infinitesimal expected amplitude of the signal varies with direction.
1037:
1038: %and the Bayesian test becomes
1039: %
1040: %\begin{eqnarray}
1041: %\lefteqn{k^2(\theta,\phi) \, \mathbf{x}^T\mathbf{FF}^T\mathbf{x}} \nonumber \\
1042: %&>&
1043: %-\frac{2}{\sigma^2}\ln\left(\sqrt{|\mathbf{I}-\sigma^2k^2\mathbf{F}\mathbf{F}^T|}\frac{
1044: %p(H_1|\theta,\phi)
1045: %}{
1046: %p(H_0)
1047: %}\right) \, . \quad
1048: %\end{eqnarray}
1049: %This implies that to mimic the soft constraint we must choose the prior
1050: %\begin{eqnarray}
1051: %\frac{p(H_1|\theta,\phi)}{p(H_0)}
1052: %&\approx&1+\frac{\sigma^2}{2}(k^2(\theta,\phi)\tr(\mathbf{FF}^T)-\lambda)+{} \nonumber \\
1053: %& & {} + O(\sigma^4) \,
1054: %\end{eqnarray}
1055: %where we have used the expansion of the determinant of a near-identity matrix
1056: %\begin{eqnarray}
1057: %|\mathbf{I}+\epsilon\mathbf{X}| = 1 + \epsilon\tr(\mathbf{X}) + O(\epsilon^2).
1058: %\end{eqnarray}
1059: %The prior is approximately unity, varying only infinitesimally with
1060: %source sky direction and the threshold $\lambda$. Even though the prior's
1061: %dependence on source sky direction is weak, the statistic's dependence on the
1062: %data is equally weak. Within these assumptions equation (\ref{eqn:note}) becomes
1063: %\begin{eqnarray}
1064: %\frac{p(\mathbf{x}|\theta,\phi,H_1)}{p(\mathbf{x}|H_0)}
1065: %&\approx& 1+\frac{1}{2}\sigma^2k^2(\theta,\phi)(\mathbf{x}^T\mathbf{FF}^T\mathbf{x}-{}
1066: %\nonumber \\
1067: %& & {}-\tr(\mathbf{FF}^T))+O(\sigma)^4 \, .
1068: %\end{eqnarray}
1069: %Since the expected signals are weak, and the evidence for them will
1070: %also be weak, any information in the prior still strongly affects the result.
1071:
1072: %\begin{widetext}
1073: %%
1074: %\begin{eqnarray}
1075: %k^2(\theta,\phi) \, \mathbf{x}^T\mathbf{FF}^T\mathbf{x}
1076: %&>&
1077: %-\frac{2}{\sigma^2}\ln\left(\sqrt{|\mathbf{I}-\sigma^2k^2(\theta,\phi)\mathbf{F}\mathbf{F}^T|}\frac{
1078: %p(H_1|\theta,\phi)
1079: %}{
1080: %p(H_0)
1081: %}\right) \, .
1082: %\end{eqnarray}
1083: %This implies that to mimic the soft constraint we must choose the prior
1084: %\begin{eqnarray}
1085: %\frac{p(H_1|\theta,\phi)}{p(H_0)}
1086: %&\approx&1+\frac{\sigma^2}{2}(k^2(\theta,\phi)\tr(\mathbf{FF}^T)-\lambda)+O(\sigma^4) \, .
1087: %\end{eqnarray}
1088: %The prior is approximately unity, varying only infinitesimally with
1089: %direction and the threshold $\lambda$. Even though the prior's
1090: %dependence on direction is weak, the statistic's dependence on the
1091: %data is equally weak. Within these assumptions equation (\ref{eqn:note}) becomes
1092: %\begin{eqnarray}
1093: %\frac{p(\mathbf{x}|\theta,\phi,H_1)}{p(\mathbf{x}|H_0)}
1094: %&\approx& 1+\frac{1}{2}\sigma^2k^2(\theta,\phi)(\mathbf{x}^T\mathbf{FF}^T\mathbf{x}-
1095: %\tr(\mathbf{FF}^T))+O(\sigma)^4 \, .
1096: %\end{eqnarray}
1097: %Since the expected signals are weak, and the evidence for them will
1098: %also be weak, the weak prior still strongly affects the result.
1099: %%
1100: %\end{widetext}
1101:
1102:
1103: %The soft constraint is therefore the limit of a series of Bayesian
1104: %tests for gravitational wave bursts that are common and whose
1105: %amplitudes and rates are a function of direction on the sky. If we
1106: %choose $k(\theta,\phi)=1$ we can eliminate the directional amplitude
1107: %bias; if we choose $k^{-2}(\theta,\phi)=\tr(\mathbf{FF}^T)$ we can
1108: %eliminate the directional dependence of event rate. Other choices, such as that made
1109: %in \cite{KlMoRaMi:05}, remove neither bias.
1110:
1111: \subsubsection{Hard constraint likelihood}
1112:
1113: Let us restrict the soft-constraint signal model to a population of
1114: \emph{linearly polarized} signals with a known polarization angle
1115: $\psi(\theta,\phi)$ for each source sky direction
1116: \begin{eqnarray}
1117: \mathbf{\Sigma}&=&\mathbf{I}\\
1118: \mathbf{W}&=&\left[
1119: \begin{array}{c}
1120: \cos 2\psi(\theta,\phi)\mathbf{I}\\
1121: \sin 2\psi(\theta,\phi)\mathbf{I}
1122: \end{array}
1123: \right]\\
1124: \mathbf{A}&=&\sigma^2 k^2(\theta,\phi)\mathbf{I} \, .
1125: \end{eqnarray}
1126: Then %with the prior
1127: %\begin{eqnarray}
1128: %\frac{p(H_1|\theta,\phi)}{p(H_0)}
1129: %& = &
1130: % e^{-\frac{\lambda\sigma^2}{2}}\sqrt{|\mathbf{I}+\sigma^2k^2(\mathbf{FW})^T\mathbf{FW}|}
1131: % \qquad
1132: %\end{eqnarray}
1133: for $\sigma\rightarrow 0$ the Bayesian statistic limits to
1134: % the left hand side of
1135: \begin{eqnarray}
1136: k^2(\theta,\phi) \, \mathbf{x}^T\mathbf{FW}(\mathbf{FW})^T\mathbf{x}
1137: % > \lambda
1138: \, .
1139: \end{eqnarray}
For the particular choice of $\psi(\theta,\phi)$ being the rotation
angle between the detector polarization basis and the dominant
polarization frame, and $k^2=[(\mathbf{FW})^T\mathbf{FW}]^{-1}$ (which is
equal to $(\mathbf{F}^{+T}\mathbf{F}^+)^{-1}$ in the dominant
polarization frame \cite{KlMoRaMi:05}), this yields the hard
constraint statistic of \cite{KlMoRaMi:05}.
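
To make the structure of this statistic concrete, consider the single-sample case of Section~\ref{sec:singleSample}, in which each identity block of $\mathbf{W}$ is $1\times1$. This is an illustrative sketch only; $\mathbf{F}^+$ and $\mathbf{F}^\times$ denote the two columns of $\mathbf{F}$, and the symbol $\mathbf{f}$ is notation introduced here.
\begin{eqnarray*}
\mathbf{f}\equiv\mathbf{FW}&=&\cos2\psi\,\mathbf{F}^++\sin2\psi\,\mathbf{F}^\times \, ,\\
k^2\,\mathbf{x}^T\mathbf{FW}(\mathbf{FW})^T\mathbf{x}&=&k^2\,(\mathbf{f}^T\mathbf{x})^2 \, .
\end{eqnarray*}
The statistic is thus the squared projection of the data onto the network response $\mathbf{f}$ to a linearly polarized wave with polarization angle $\psi$, scaled by $k^2$. If $k^2$ takes the value $(\mathbf{F}^{+T}\mathbf{F}^+)^{-1}$ quoted for the dominant polarization frame, where on our reading $\mathbf{f}$ coincides with the plus response $\mathbf{F}^+$, this becomes $(\mathbf{F}^{+T}\mathbf{x})^2/(\mathbf{F}^{+T}\mathbf{F}^+)$, the squared projection of the data onto the normalized dominant-polarization response.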
1146:
Apart from the explicit assumption that all signals are
linearly polarized with a known polarization angle, the hard constraint
has the same properties as the soft constraint.
1150: % The normalization
1151: %chosen in \cite{KlMoRaMi:05} eliminates the directional rate bias but
1152: %introduces a directional amplitude bias.
1153:
1154: \subsection{Interpretation}
1155:
1156: We have shown that several previously proposed statistics are special cases
1157: or limiting cases of Bayesian statistics for particular choices of prior.
1158: %These comparisons offer new interpretations of the methods:
1159: %the \emph{ad hoc} Tikhonov regularizer is physically interpreted as
1160: %the inverse of an expected signal amplitude; the constraint methods presume
1161: %signals much smaller than the noise level while the G\:{u}rsel-Tinto method presumes
1162: %signals much larger than the noise level; and all the methods considered propose that
1163: %gravitational wave bursts occur intrinsically more frequently in directions that the
1164: %network is more sensitive to.
1165: %
1166: The `priors' implicit in these non-Bayesian methods are not representative of our
1167: expectations about the source population, so we can reasonably expect
1168: improved performance from a detection statistic with priors
1169: better reflecting our state of knowledge. The Bayesian analysis allows us to begin with our physical understanding
1170: of the problem, described in terms of prior expectations about the
1171: gravitational wave signal population, and derive the detection
statistic for these conditions. The effect of the priors is lessened when a strong gravitational wave
signal is present: all these statistics, Bayesian and non-Bayesian, are effective at
detecting strong gravitational waves, and significant differences occur only
for marginal signals. In the next section, we will quantitatively compare the relative performance
1176: of the methods mentioned above and the Bayesian statistic we propose.
1177:
1178:
1179:
1180: %The Tikhonov, G\"{u}rsel-Tinto, and constraint methods are limits of
1181: %Bayesian statistics distinguished only by different choices of prior.
1182: %Yet, as Bayesian priors are \emph{subjective}, on what basis can we
1183: %critique them?
1184: %
1185: %Though subjective, priors make definite statements about our
1186: %\emph{expectations} of the world; one popular paradigm relates priors
1187: %to bets we would be willing to make about the outcome of an
1188: %experiment. Few scientists would be willing to bet that the first
1189: %detected gravitational wave burst would have strain far above (G\"{u}rsel-Tinto) or below (constraint methods) the
1190: %instrumental noise, or that events will conveniently occur where our network is most sensitive.
1191: %Insofar as we find these prior plausibility
1192: %distributions incredible, we should expect a Bayesian method with a
1193: %more credible prior to be a more \emph{efficient} detection statistic.
1194: %Quite literally, the Bayesian method does not have to waste precious
1195: %(signal) energy overcoming strong prejudices.
1196: %The same logic applies
1197: %to attempts to reconstruct the parameters of a detected signal, such
1198: %as the sky position of the source; Fig.~\ref{fig:skies} shows an
1199: %example.
1200:
1201: %Once may also consider what the Bayesian analysis says about how
1202: %he non-Bayesian statistics can be improved.
1203: %For example, in our comparisons
1204: %we have assumed that the threshold $\lambda$ used in the frequentist
1205: %analysis is the same for all sky positions. The Bayesian analysis shows
1206: %that this implicitly imposes physically unreasonable priors on the
1207: %gravitational-wave signal.
1208: %In a non-Bayesian all-sky analysis we are perfectly
1209: %free to add a new term to the likelihood that varies with sky position.
1210: %The Bayesian analysis shows
1211: %how the this term can be varied across the sky to correspond to physical
1212: %priors;
1213: %; for example, by selecting $\lambda(\theta,\phi)$ in equation
1214: %(\ref{eqn:FreqPrior}) to make the right-hand side constant. Analogous
1215: %analogous
1216: %reasoning from the non-Bayesian point of view would be to examine how
1217: %$\lambda(\theta,\phi)$
1218: %the new term should be chosen to balance (in some sense) the false
1219: %alarm rate versus detection probability across the sky.
1220: %The Bayesian
1221: %formulation gives one concrete suggestion.
1222:
1223: %In summary, we have demonstrated that previously proposed methods implicitly
1224: %make unreasonable choices of prior, and consequently must be suboptimal for
1225: %reasonable choices of prior. We have not yet quantified how much
1226: %worse their performance is. One way to answer this question is to
1227: %perform a Monte-Carlo simulation, testing the ability of each
1228: %statistic to detect thousands of simulated gravitational wave signals.
1229:
1230: %\begin{figure}[htb]
1231: %\resizebox{\columnwidth}{!}{\includegraphics{skies}}
1232: %\caption{\label{fig:skies}
1233: % Four statistics as a function of $(\theta,\phi)$ for identical white
1234: % noise interferometers with the locations and orientations of LHO,
1235: % LLO and Virgo sampled at 1024 Hz and a 1/16s white noise signal with
1236: % amplitude SNR of 5. White is most plausible; black is least
1237: % plausible; a circle indicates the true direction; a square indicates
1238: % the most plausible direction. From top to bottom: $\ln$ Bayesian
1239: % odds ratio for $\sigma=5$; Tikhonov for $\alpha=0.2$;
1240: % G\"{u}rsel-Tinto; soft constraint with $k(\theta,\phi)=1$.}
1241: %\end{figure}
1242:
1243: