astro-ph0606154/ms.tex
1: 
2: \documentclass[12pt,preprint]{aastex}
3: 
4: \usepackage{emulateapj5,apjfonts}
5: \usepackage{graphics}
6: \usepackage{amssymb}
7: \usepackage{onecolfloat}
8: 
9: \lefthead{Heitmann, Higdon, Nakhleh, Habib}
10: \righthead{Cosmic Calibration}
11: 
12: 
13: \begin{document}
14: 
15: \def\head{
16: \vbox to 0pt{\vss \hbox to 0pt{\hskip 440pt\rm LA-UR-06-2320\hss}
17:   \vskip 25pt}
18: 
19: 
20: \title{Cosmic Calibration}
21: \author{Katrin Heitmann\altaffilmark{1},
22:         David Higdon\altaffilmark{2},          
23:         Charles Nakhleh\altaffilmark{3}, and
24:         Salman Habib\altaffilmark{4}}
25: \affil{$^1$ ISR-1, MS D466, Los Alamos National Laboratory, Los
26: Alamos, NM 87545, heitmann@lanl.gov} 
27: \affil{$^2$ D-1, MS F600, Los
28: Alamos National Laboratory, Los Alamos, NM 87545, dhigdon@lanl.gov} 
29: \affil{$^3$ X-2, MS T087, Los Alamos National Laboratory, Los
30: Alamos, NM 87545, cnakhleh@lanl.gov} 
31: \affil{$^4$ T-8, MS B285, Los Alamos National Laboratory, Los
32: Alamos, NM 87545, habib@lanl.gov} 
33: 
34: \begin{abstract}
35: 
The complexity and accuracy of current and future ``precision
cosmology'' observational campaigns have made it essential to develop
an efficient technique for directly combining simulation and
observational datasets to determine cosmological and model parameters,
a procedure we term {\em calibration}. Once a satisfactory calibration
41: of the underlying cosmological model is achieved, independent
42: predictions for new observations become possible. For this procedure
43: to be effective, robust characterization of the uncertainty in the
44: calibration process is highly desirable. In this {\em Letter}, we
45: describe a statistical methodology which can achieve both of these
46: goals. An application example based around dark matter structure
47: formation simulations and a synthetic mass power spectrum dataset is
48: used to demonstrate the approach.
49: 
50: \end{abstract}
51: 
52: \keywords{Cosmology: cosmological parameters --- cosmology: theory}}
53: 
54: \twocolumn[\head]
55: 
56: \section{Introduction}
57: 
58: It is widely recognized that, beginning in the last decade, a
59: transition to an era of ``precision cosmology'' is well underway.
60: Ongoing and upcoming surveys such as the Wilkinson Microwave
61: Anisotropy Probe (WMAP, Sper\-gel et al. 2006), the Sloan Digital Sky
62: Survey (SDSS, Adelman-McCarthy et al. 2006), Planck, the Dark Energy Survey
63: (DES), the Joint Dark Energy Mission (JDEM), the Large Synoptic Survey
64: Telescope (LSST), and Pan-STARRS constitute superb sources of
65: cosmological statistics. These sources include (galaxy, cluster, and
66: mass) power spectra and cluster mass functions, from which roughly 25
67: cosmological parameters have to be constrained (see, e.g., Spergel et
al. 2006, Tegmark et al. 2004, Abazajian et al. 2005). The promised
69: accuracy from future observations is remarkable, as some parameters
70: can be measured at the 1\% level or better, posing a major challenge
71: to cosmological theory. Predictions and analysis methods must at least
72: match -- and preferably substantially exceed -- the observational
73: accuracy. For many observables, this can only be achieved by
74: simulations incorporating physical effects beyond the reach of
75: analytic modeling.
76: 
77: Cosmological simulations already play a key role in the design and
78: interpretation of observations. Controlling systematics is a necessary
79: first step, followed by combining simulations with observations to
80: extract cosmological and model parameters. This cannot be accomplished
by brute force. For example, sampling each parameter at only ten
values in a twenty-dimensional parameter space would require
$10^{20}$ large-scale simulations, which is impossible now and will
remain so for the foreseeable future. Even the variation of only a
85: subset of the parameters over a sufficient range is infeasible. The
86: need to develop and employ reliable statistical methods to determine
87: and constrain parameters robustly is therefore manifest.
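
As a rough illustration of this scaling (a back-of-the-envelope count,
not a statement about any particular campaign), a full grid with $m$
values per parameter in $d$ dimensions requires
\[
N_{\rm grid} = m^{d}, \qquad N_{\rm grid} = 10^{20}\ \ {\rm for}\ \ m=10,\ d=20 .
\]
Even when restricted to the five parameters of the example discussed
below, a ten-point grid would need $10^{5}$ runs, compared with the
128 runs of the design adopted there.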
88: 
89: In this {\em Letter} we describe a statistical framework to determine
90: cosmological and model parameters and associated uncertainties from
91: simulations and observational data (for an overview of the basic ideas
92: see, e.g., Kennedy \& O'Hagan 2001 and Goldstein \& Rougier 2004). The
93: framework integrates a set of interlocking procedures: (i) simulation
94: design -- the determination of the parameter settings at which to
95: carry out the simulations; (ii) emulation -- given simulation output
96: at the input parameter settings, how to estimate the output at new,
97: untried settings; (iii) uncertainty and sensitivity analysis --
98: determining the variations in simulation output due to uncertainty or
99: changes in the input parameters; (iv) calibration -- combining
100: observations (with known errors) and simulations to estimate parameter
101: values consistent with the observations, including the associated
102: uncertainty; (v) prediction -- using the calibrated simulator to
103: predict new cosmological results with a set of uncertainty bounds.
104: 
105: For concreteness, we discuss the framework methodology in terms of a
106: simple example application: Estimation of five parameters from dark
107: matter structure formation simulations and a synthetic set of ``WMAP +
108: SDSS'' measurements of the matter power spectrum. A detailed
109: description will be provided elsewhere~(S.~Habib et al. in
110: preparation).
111: 
112: \section{The Statistical Framework}
113: 
114: We employ a Bayesian framework to update prior probability
115: distributions on cosmological parameters given observational
116: data. Denoting these parameters collectively by $\theta$ and the
117: observed power spectrum data by a vector $y_{\rm obs}$, we model the
118: data as:
119: \begin{equation}
120: y_{\rm obs} = \eta(\theta) +\epsilon,
121: \end{equation}
122: where $\eta(\theta)$ denotes the simulation output at input setting
123: $\theta$, and $\epsilon \sim N(0,\Sigma_y)$
124: where $\Sigma_y$ describes the error structure of the
125: observations and any potential systematic differences
126: between the simulated and observed data.
127: 
128: Standard Bayesian estimation (Jeffreys 1961) proceeds using the
129: likelihood  
130: \begin{equation}
131: L(y_{\rm obs}|\theta) \propto |\Sigma_y|^{-1/2}
132: {\exp}\{-\frac{1}{2}[y_{\rm obs} - \eta(\theta)]^T \Sigma_y^{-1} [y_{\rm
133: obs} - \eta(\theta)]\},
134: \end{equation} 
135: and a prior $\pi(\theta)$ to form the posterior distribution on
136: $\theta$: 
137: \begin{equation}
138: \pi(\theta|y_{\rm obs}) \propto L(y_{\rm obs}|\theta)\pi(\theta).
139: \end{equation}
140: Because the resulting posterior distributions are not in any easily
141: recognized closed form, they must be explored numerically, usually
142: using Markov Chain Monte Carlo (MCMC) techniques (Besag et al. 1995).
143: This procedure requires running the (potentially very expensive)
144: simulation codes many thousands of times as the $\theta$-space is
explored. However, only a limited number (perhaps $\sim 100$ to $\sim 1000$) of
146: runs may be feasible.  Therefore, efficiently combining Bayesian
147: methods with simulations requires a representation of the code output
148: (emulator) that can be sampled many thousands of times during the
149: course of the MCMC in lieu of running the actual code. When queried at
150: an input setting where a code run is available, the emulator should
151: reproduce the output of the code.  At other input settings, the
152: emulator effectively interpolates nearby code runs while including
153: uncertainty due to the lack of complete knowledge of the code output.
154: The selection of input settings in the simulation design must be
155: sufficiently dense that the emulator can accurately mimic the code
156: output, and also be sufficiently sparse that the simulation campaign
157: is computationally tractable.
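
To make the role of the emulator explicit, Eqs.~(2) and (3) can be
combined into the log-posterior that an MCMC sampler evaluates at each
step. The sketch below assumes, purely for illustration, a random-walk
Metropolis update with a symmetric proposal; the sampler actually used
is not specified here. The symbol $\hat\eta$ is introduced to denote
the emulator mean standing in for $\eta$, and the emulator uncertainty
that the full analysis carries is omitted for brevity.
\[
\ln\pi(\theta|y_{\rm obs}) = -\frac{1}{2}\ln|\Sigma_y|
 - \frac{1}{2}\, r^T \Sigma_y^{-1} r + \ln\pi(\theta) + {\rm const},
\]
\[
r = y_{\rm obs}-\hat\eta(\theta), \qquad
P_{\rm acc}(\theta\to\theta') =
\min\left\{1,\frac{\pi(\theta'|y_{\rm obs})}{\pi(\theta|y_{\rm obs})}\right\}.
\]
For a diagonal $\Sigma_y$ with entries $\sigma_i^2$, the quadratic form
reduces to the familiar $\chi^2=\sum_i r_i^2/\sigma_i^2$. Each proposed
$\theta'$ then costs one emulator evaluation rather than one
simulation.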
158: 
159: Systematic design of simulation procedures is reviewed in Santner et
160: al. (2003). We use orthogonal array-based Latin hypercube
161: sampling~\cite{tang} to fix 128 input settings over the five
162: parameters. This approach takes an orthogonal array design -- which
163: ensures that all lower-dimensional projections have desirable
164: space-filling properties -- and modifies it so that it is also a Latin
165: hypercube, the most efficient stratified sampling strategy.
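
For orientation, a plain Latin hypercube (without the orthogonal-array
constraint) for $n_s$ runs over $d$ parameters with ranges $[a_j,b_j]$
can be written as follows. This is a generic textbook construction
given as a sketch; the symbols $\pi_j$ and $u_{ij}$ are introduced here
for illustration only.
\[
\theta_{ij} = a_j + (b_j-a_j)\,\frac{\pi_j(i) - u_{ij}}{n_s},
\qquad i=1,\ldots,n_s,\quad j=1,\ldots,d,
\]
where each $\pi_j$ is an independent random permutation of
$\{1,\ldots,n_s\}$ and $u_{ij}$ is uniform on $[0,1)$. Each of the
$n_s$ equal-width bins of every parameter then contains exactly one
design point. The orthogonal-array variant restricts the permutations
so that low-dimensional projections are also evenly covered.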
166: 
167: The code output for the $i$th input setting is a power spectrum,
168: $y^{(i)}(k)=\eta(\theta_i)$, viewed as a column vector over the $n_k$
169: points in $k$ space.  Each of the resulting $n_s = 128$ output spectra
170: is loaded into a single $n_k \times n_s$ matrix: $y_{\rm sims} =
171: [y^{(1)}|y^{(2)}|\ldots|y^{(n_s)}].$ This matrix is then subjected to
172: a singular value decomposition (SVD) to find an efficient empirical
173: orthogonal representation of the simulation outputs: $[y_{\rm
174: sims}]_{ij} = [USV^T]_{ij} = \sum_{p=1}^{n_s} \lambda_p [\alpha_p]_i
175: w_p(\theta_j)$ where the $\alpha_p$'s are $n_k \times 1$ orthogonal
176: basis vectors in the simulation output space (columns of $U$), the
$\lambda_p$'s are the singular values of the simulation matrix, and
178: each principal component (PC) weight $w_p(\theta)$ is a $1 \times n_s$
179: row vector in the parameter space (columns of $V$).  Usually the first
180: few singular values dominate the remainder, allowing us to keep only a
181: few of the principal components in the analysis. In what follows, we
182: have kept five PC's.
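
Written out, keeping the leading $p_\eta$ components (a symbol
introduced here; $p_\eta=5$ in this analysis) amounts to the
rank-$p_\eta$ approximation
\[
y_{\rm sims} \approx \sum_{p=1}^{p_\eta} \lambda_p\, \alpha_p\, w_p^T ,
\qquad
f(p_\eta) = \frac{\sum_{p=1}^{p_\eta}\lambda_p^2}{\sum_{p=1}^{n_s}\lambda_p^2},
\]
where $w_p$ here denotes the $p$th column of $V$ and $f$ is the
fraction of the total sum of squares of $y_{\rm sims}$ captured by the
retained components. This is the standard truncated-SVD identity,
offered as a sketch; the value of $f$ for the present simulations is
not quoted here. The emulator then only has to model the $p_\eta$
scalar functions $w_p(\theta)$ instead of the full spectrum.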
183: 
184: The SVD gives the PC weights at the design input settings $(\theta_1,
185: \theta_2,\cdots,\theta_{n_s})$.  However, in the course of the MCMC,
186: we need the PC weights at intermediate input settings. We construct
187: the emulator by putting a spatial Gaussian Process (GP) model on each
188: PC weight (Sacks et al. 1989, MacKay 1998), a nonlinear interpolation
189: scheme that works directly on the space of functions. This allows the
190: emulator to smoothly interpolate the predicted code output between the
191: design settings, giving an efficient probabilistic representation of
192: the prediction uncertainty. The spatial parameters controlling the GP
193: on each component weight are estimated in the course of the MCMC,
194: thereby constructing the emulator as needed during the calibration
195: analysis. Details of the procedures used in our code are being
196: reported elsewhere in the literature.  For recent examples of these
197: techniques used in practice, see Higdon et al. (2004).
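
As a concrete but hedged illustration of GP interpolation for a single
PC weight, suppose a zero-mean GP with a squared-exponential
covariance. This form is an assumption chosen for illustration; the
covariance actually used is described in the references above. With
variance $\sigma_w^2$ and inverse length scales $\beta_k$ (both
hypothetical symbols),
\[
c(\theta,\theta') = \sigma_w^2
\exp\left[-\sum_{k=1}^{5}\beta_k(\theta_k-\theta'_k)^2\right].
\]
Given the weights $w=(w(\theta_1),\ldots,w(\theta_{n_s}))^T$ at the
design points, with $C_{ij}=c(\theta_i,\theta_j)$ and
$c_*=(c(\theta_*,\theta_1),\ldots,c(\theta_*,\theta_{n_s}))^T$, the
predictive mean and variance at a new setting $\theta_*$ are
\[
\hat w(\theta_*) = c_*^T C^{-1} w, \qquad
{\rm var}[w(\theta_*)] = \sigma_w^2 - c_*^T C^{-1} c_* .
\]
At a design point $\theta_*=\theta_i$, $c_*$ is the $i$th column of
$C$, so $\hat w(\theta_*)=w(\theta_i)$ with zero variance: the emulator
reproduces the code output there, and the predictive variance grows
away from the design points.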
198: 
199: \section{Parameter Estimation and the Nonlinear Matter Power Spectrum}
200: 
201: In order to give an explicit demonstration of the approach, we first
202: generate a synthetic observational dataset from simulations. The key
203: advantage of doing this is that the underlying set of cosmological
parameters is known, allowing a direct test of the statistical
205: procedure. To generate the ``observations'' we begin with a smooth
206: power spectrum computed by running ten realizations of the same
207: cosmology with the parallel particle mesh (PM) code MC$^2$
(see Heitmann et al. 2005 for code information and comparison results)
209: and averaging over the results. The initial conditions are set using
210: CMBFAST~\cite{cmbfast}. We restrict our study to the linear and
211: quasi-linear regime relevant to large-scale structure surveys; the
212: force resolution of the PM-code accurately resolves the scales of
213: interest.
214: 
215: 
216: \begin{figure}
217: \includegraphics[totalheight=75mm,angle=270,clip=]{f1.ps}
218: \caption{Subset of the 128 simulated power spectra and the
219: synthetic dataset. The black line is the spectrum from which
220: the synthetic data were derived.}  
221: \label{plotone}
222: \end{figure}
223: 
224: 
225: We form a single power spectrum by attaching the linear $P_L(k),\;
226: 0.001 h$Mpc$^{-1}\leq k \leq 0.1h$Mpc$^{-1}$ (growth specified by
227: linear theory), to the power spectrum from simulations $P_N(k)$ at $k
228: = 0.1h$Mpc$^{-1}$. Next, 28 points from this combined power spectrum
229: are picked, spaced roughly in the same bins as in a real dataset. The
230: error bars are set by values typical for cosmic microwave background
231: (CMB) experiments such as WMAP~(Sper\-gel et al. 2006) in the low-$k$
232: range, transitioning to values typical of surveys such as
SDSS~(Adelman-McCarthy et al. 2006) at higher $k$. Finally, each point
is displaced from the base power spectrum by a Gaussian random deviate
whose standard deviation equals its 1-$\sigma$ error bar, as shown in
Figure~\ref{plotone}. Note that
236: for this test demonstration we are assuming that galaxy bias has
237: already been incorporated in the measurement. In a more realistic
238: situation, the bias would be included as part of the modeling
239: process. Note also that the choice of a homogeneous observational
240: dataset here is merely for convenience. For a heterogeneous dataset
241: such as CMB $C_l$'s combined with $P(k)$ measured from the galaxy
distribution, $y_{\rm obs}$ [cf. Eq.~(1)] would also contain the CMB
243: results and the simulations underlying $\eta(\theta)$ would include
244: runs of (say) CMBFAST.
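
In symbols, the construction of the synthetic data can be sketched as
follows, under the assumption that the scatter is drawn independently
for each point; $P_{\rm base}$, $\sigma_i$, and $z_i$ are labels
introduced here.
\[
y_{{\rm obs},i} = P_{\rm base}(k_i) + \sigma_i z_i, \qquad
z_i \sim N(0,1), \qquad i=1,\ldots,28,
\]
where $\sigma_i$ is the error bar assigned to the $i$th bin. Under the
same assumption, and if only measurement scatter is included, the
error covariance in Eq.~(1) is diagonal,
$\Sigma_y={\rm diag}(\sigma_1^2,\ldots,\sigma_{28}^2)$.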
245: 
246: We consider the following five cosmological parameters:
247: $\theta=(n,h,\sigma_8,\Omega_{\rm CDM},\Omega_{\rm b})$. We assume a
248: flat $\Lambda$CDM universe with $\theta=(0.99,0.71,0.84,0.27,0.044)$
249: to make the synthetic observations (black line in
250: Figure~\ref{plotone}).  To determine the simulation design we must fix
251: the range that the input parameters should be varied over.  To this
252: end, we assume independent, flat priors over the ranges: $0.8\le n \le
253: 1.4$, $0.5\le h \le 1.1$, $0.6 \le \sigma_8 \le 1.6$,
254: $0.05\le\Omega_{\rm CDM}\le 0.6$, and $0.02\le \Omega_{\rm b}\le
255: 0.12$.  The simulation design prescribes a set of 128 input settings.
256: This number of simulations, as we show below, yields an emulator with
257: performance at the few percent level, sufficient for our present
258: purposes.
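
Written out, the prior is uniform over a five-dimensional box. With
$[a_j,b_j]$ denoting the range of the $j$th parameter (notation
introduced here),
\[
\pi(\theta) = \prod_{j=1}^{5}\frac{1}{b_j-a_j}
 = \frac{1}{0.6\times 0.6\times 1.0\times 0.55\times 0.10} \approx 50.5
\]
inside the box and zero outside. Rescaling to the unit cube,
$x_j=(\theta_j-a_j)/(b_j-a_j)$, places the parameters of the synthetic
observations at $x\approx(0.32,0.35,0.24,0.40,0.24)$, so the ``true''
cosmology lies well inside the design region rather than at its edge.
The rescaling is a common convenience for design and emulation and is
assumed here only for illustration.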
259: 
260: Each run is carried out with 128$^3$ particles on a 512$^3$ grid for a
261: 450$h^{-1}$Mpc box, guaranteeing sufficient force resolution for the
262: scales of interest. To limit systematic biases, different seeds are
263: used to generate the initial Gaussian random field for each
264: simulation. Cosmic variance is minimized by matching the numerical to
265: the linear power spectrum in the linear regime near the $\Lambda$CDM
266: power spectrum peak. We show a subset of the 128 power spectra in
267: Figure~\ref{plotone}. The emulator is now built as described above --
268: note that the emulator is called only as needed by the MCMC analysis
269: in the calibration process.
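
The scales involved can be checked with simple arithmetic using
standard definitions; which of these limits matters in practice
depends on details of the code not given here. For a box of side
$L=450\,h^{-1}$Mpc,
\[
k_{\rm f}=\frac{2\pi}{L}\approx 0.014\,h{\rm Mpc}^{-1},\qquad
\Delta_{\rm grid}=\frac{L}{512}\approx 0.88\,h^{-1}{\rm Mpc},
\]
\[
k_{\rm Ny}^{\rm part}=\frac{128\,\pi}{L}\approx 0.89\,h{\rm Mpc}^{-1},\qquad
k_{\rm Ny}^{\rm grid}=\frac{512\,\pi}{L}\approx 3.6\,h{\rm Mpc}^{-1},
\]
where $k_{\rm f}$ is the fundamental mode of the box and $k_{\rm Ny}$
denotes the Nyquist wavenumbers of the particle lattice and of the
force grid. Modes below $k_{\rm f}$ are absent from the box, so the
low-$k$ end of the spectrum must in any case come from linear theory.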
270: 
271: \section{Results}
272: 
273: \begin{figure}[t]
274: \includegraphics[width=90mm]{f2.eps}
275: \caption{Posterior density for the parameter vector $\theta$. 
276: The diagonal gives estimates of the univariate marginal pdfs for each 
component; blue: results from the entire synthetic dataset; green:
results using only the linear regime ($k < 0.1 h{\rm Mpc}^{-1}$). 
279: Off-diagonal
280: images show estimates of the bivariate marginal pdfs: upper triangle
281: for the entire dataset, lower triangle for the linear regime.  The
282: lines give  estimates of the 90\% highest posterior density
283: region. Again, blue is from using the entire dataset; green from using
284: only the linear regime. 
285: The dots show the actual parameter values used to generate
286: the synthetic observations.}
287: \label{plottwo}
288: \end{figure}
289: 
290: 
291: The posterior distribution of the five cosmological parameters is
292: depicted in Figure~\ref{plottwo}. The diagonal displays the
293: univariate, marginal pdfs for each of the parameters, while the
294: off-diagonal plots show estimated 2-d marginal densities, along with
295: 90\% probability contours.  For comparison, Table~\ref{constraints}
296: gives the mean value of the parameters along with the estimated
297: uncertainty, as well as the ``true'' value for each parameter.  These
298: posterior estimates are obtained under two separate formulations --
299: one which uses all of the synthetic observation data, and one which
300: uses only the observations in the {\em linear} regime for which
301: $k<0.1h$Mpc$^{-1}$.  The green pdfs and contours in
302: Figure~\ref{plottwo} show the posterior results including information
303: only from the linear regime, whereas the blue pdfs and the contours
304: result from an analysis of the full nonlinear power spectrum.  Because
305: of the limited observational dynamic range, using only the linear
306: regime results in systematic shifts from the ``true'' answers, albeit
307: within the quoted uncertainties. Overall, we find $\Omega_{\rm CDM}$
308: and $\sigma_8$ to be very well determined. The full nonlinear analysis
309: over the entire $k$-range significantly improves the accuracy for
310: $\sigma_8$ and $\Omega_{\rm CDM}$ as well as the precision of the
311: constraint for $\sigma_8$ (see Table~\ref{constraints}). While the
312: synthetic dataset provides information about the remaining three
313: parameters, $n$, $h$, and $\Omega_{\rm b}$, they are not as well
constrained, as is to be expected from an analysis restricted to the matter
315: power spectrum only. Note that the linear analysis underestimates the
316: uncertainty in $n$.
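
The entries of Table~\ref{constraints} make these statements
quantitative. The arithmetic below uses only the tabulated means and
90\% intervals.
\[
\sigma_8:\quad |0.882-0.84| = 0.042\ ({\rm full}), \qquad
|0.962-0.84| = 0.122\ ({\rm linear}),
\]
\[
\Omega_{\rm CDM}:\quad |0.287-0.27| = 0.017\ ({\rm full}), \qquad
|0.343-0.27| = 0.073\ ({\rm linear}).
\]
The full width of the 90\% interval for $\sigma_8$ shrinks from
$0.121+0.108=0.229$ (linear) to $0.082+0.077=0.159$ (full), whereas
for $n$ it grows from $0.218+0.132=0.350$ to $0.276+0.171=0.447$.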
317: 
318: 
319: \begin{figure}
320: \includegraphics[width=45mm,angle=270]{f3.ps}
321: \caption{Evaluation of the emulator fit.
322: Left: Three simulations (black dots) and the corresponding
323: response surface fits (green lines) obtained after holding out
324: the simulation to be predicted and training the response surface on
325: the remaining 127 simulations. Right: Residual (simulation $\log P - $
326: response surface) from holdout predictions (i.e. the simulation being
327: predicted is not used to estimate the response surface).  The central
328: gray region contains the middle 50\% of the residuals; the wider light
329: gray region, the middle 90\%.}      
330: \label{plotthree}
331: \end{figure}
332: 
333: 
334: \begin{table}
335: \begin{center}
336: \caption{\label{constraints}Parameter Constraints}
337: \begin{tabular}{lccc}
338: \tableline\tableline
339: Param.             &  Mean$^{\rm nonlin}$ &   Mean$^{\rm lin}$ &  True Value\\
340: \tableline
$n$                &  $0.991^{+0.276}_{-0.171}$  &  $0.940^{+0.218}_{-0.132}$  &  0.99 \\
$h$                &  $0.786^{+0.282}_{-0.259}$  &  $0.765^{+0.287}_{-0.232}$  &  0.71 \\
343: $\sigma_8$         &  $0.882^{+0.082}_{-0.077}$  &  $0.962^{+0.121}_{-0.108}$   &  0.84 \\
344: $\Omega_{\rm CDM}$ &  $0.287^{+0.138}_{-0.133}$  &  $0.343^{+0.156}_{-0.130}$  &  0.27 \\
345: $\Omega_{\rm b}$   &  $0.057^{+0.052}_{-0.034}$  &  $0.054^{+0.052}_{-0.031}$  &  0.044 \\
346: \tableline\tableline
347: \vspace{-1.5cm}
348: \tablecomments{Mean value for the full and linear ($k<0.1h$Mpc$^{-1}$)
349: datasets with\\ their 90\% intervals, and the true value for the five
350: parameters under investigation.} 
351: \end{tabular}
352: \end{center}
353: \end{table}
354: 
355: \begin{figure*}[t]
356: \includegraphics[width=180mm]{f4.eps}
357: \caption{Sensitivity of the computed power spectrum $\log P$ to
358: changes in input parameters.  Here, the response surface is used to
359: compute the change in $\log P$ as each parameter, in turn, is varied
360: from its lower bound to its upper bound while the other parameters
361: are held at their midpoints.}    
362: \label{plotfour}
363: \end{figure*}
364: 
365: The posterior distribution describes the uncertainty regarding the
366: parameter vector $\theta$ as well as statistical variance and
367: correlation parameters that control the response surface model.  Once
368: these posterior samples have been produced, it is straightforward to
369: generate posterior realizations of the emulator to assess its adequacy
370: in modeling the simulated output.  The accuracy of the emulator was
371: estimated by excluding individual simulation runs and building a new
372: emulator based on the remaining 127 power spectra. The emulator
373: predictions can now be compared against the actual simulation output
374: of the excluded run. Three examples of applying this procedure are
shown in the left plot in Figure~\ref{plotthree}. The emulator is
accurate at the level of a few percent, which is adequate for the
present analysis. The right
378: panel in Figure~\ref{plotthree} summarizes the residuals for all 128
379: simulations -- the central gray band delineates the middle 50\% of the
380: residuals; the light gray band delineates the middle 90\%.  Gaussian
381: process models offer a number of advantages over other methods for
382: modeling simulation output: they do not require runs over a grid of
383: input settings; they allow for interpolation of the simulation output;
384: they can accommodate fairly general interactions between input
parameters; and they typically outperform other modeling approaches.  For
example, the GP model gives substantially better predictions than
a quadratic response surface model, a generalized additive
model (GAM), or a multivariate adaptive regression splines (MARS) model
389: (Hastie et al. 2001).
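
For reference, the holdout residual shown in Figure~\ref{plotthree}
can be written as follows, where $\hat P_{-i}$ (notation introduced
here) denotes the emulator prediction built without the $i$th run.
\[
r_i(k) = \log P_i(k) - \log \hat P_{-i}(k), \qquad
\frac{P_i - \hat P_{-i}}{\hat P_{-i}} = b^{\,r_i}-1,
\]
with $b$ the base of the logarithm, which is not stated here. For
small residuals the fractional error is approximately $r_i\ln b$: a
residual of $0.01$ corresponds to about $1\%$ if the logarithm is
natural and about $2.3\%$ if it is base ten.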
390: 
391: The fitted emulator can be used to explore the sensitivity of the
392: simulation output to changes in the cosmological
393: parameters. Figure~\ref{plotfour} shows how the log of the power
394: spectrum changes as one parameter is varied, the others being fixed at
395: their prior midpoints. Both $\sigma_8$ and $\Omega_{\rm CDM}$ have a
396: large impact when varied over their prior ranges.  Hence it is not
397: surprising that the posterior distribution for these two parameters
398: are the most constrained by the observed data.  Figure~\ref{plotfour}
399: also suggests that while most parameters affect the power spectrum in
400: the linear regime ($k < 0.1h{\rm Mpc}^{-1}$), only $\sigma_8$ affects
401: the power spectrum in the nonlinear regime ($k > 0.1h{\rm
402: Mpc}^{-1}$). Thus, while additional data in the nonlinear regime is
403: likely to help constrain $\sigma_8$, it will not greatly reduce
404: uncertainty in the other four parameters.
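
The quantity plotted in Figure~\ref{plotfour} can be summarized, in
notation introduced here, as
\begin{eqnarray*}
\Delta_j(k) &=& \log P(k;\,\theta_j=b_j,\ \theta_{l\neq j}=m_l) \\
            & & -\ \log P(k;\,\theta_j=a_j,\ \theta_{l\neq j}=m_l),
\end{eqnarray*}
where $[a_j,b_j]$ is the prior range of the $j$th parameter and
$m_l=(a_l+b_l)/2$ is the midpoint, i.e. $m=(1.1,0.8,1.1,0.325,0.07)$
for $(n,h,\sigma_8,\Omega_{\rm CDM},\Omega_{\rm b})$. As a rough
cross-check that relies only on linear theory, where
$P\propto\sigma_8^2$ at fixed shape, sweeping $\sigma_8$ from 0.6 to
1.6 gives
\[
\Delta_{\sigma_8} = 2\log\frac{1.6}{0.6} \approx 0.85\ (\log_{10})
\quad {\rm or}\quad 1.96\ (\ln),
\]
independent of $k$. This estimate does not apply in the nonlinear
regime, where the emulator is needed.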
405: 
406: \section{Conclusion and Outlook} 
407: 
408: We have introduced a new, very powerful method for determining
409: cosmological and model parameters from simulations and
410: observations. The key idea is to extract maximum utility from a
411: necessarily finite set of expensive simulations. The implementation of
412: this idea includes several valuable features: (i) a design to
413: optimally sample the simulation parameter space; (ii) an accurate
414: emulator capable of generating the required outputs in between the
415: sampled simulation points; (iii) an uncertainty and sensitivity
416: analysis; (iv) the parameter constraints themselves, with associated
417: uncertainty bounds. 
418: 
419: In order to demonstrate the basic approach, we used a set of 128 dark
420: matter structure formation simulations and a homogeneous synthetic
421: ``observational'' dataset to determine five cosmological
422: parameters. The next step is to use the framework for analyses of real
data, especially of combined datasets such as the CMB and large-scale
424: structure observations.
425: 
426: There are many ways to enhance the method and improve its
427: performance. One is the melding of information from codes with
428: different degrees of resolution and input physics, such as in the
429: extraction of information about the mass distribution from the
430: Lyman-$\alpha$ forest. Here, complex hydrodynamics simulations are
431: certainly desirable, but much faster approximate methods such as
432: hydro-particle mesh (HPM) are available. Thus, a first analysis based
433: on HPM can be performed, narrowing the parameter range of interest
434: sufficiently to make hydro runs feasible. Interesting offshoots of the
435: methodology include the exploitation of certain intermediate
436: results. For instance, a large set of N-body simulations can be
437: performed with several input parameters such as the equation of state
438: for dark energy. An emulator can then be constructed from these and
439: publicly released. This emulator can then be conveniently used instead
440: of real simulations for planning observations and data analysis.
441:  
442: 
443: \acknowledgements 
444: We thank Brian Williams for creating the simulation designs and Kevork
445: Abazajian, Lam Hui, and Adam Lidz for useful discussions and
encouragement. We especially acknowledge the supercomputing time
447: awarded to us under the LANL Institutional Computing Initiative. This
448: research is supported by the DOE under contract W-7405-ENG-36.
449:  
450: \begin{thebibliography}{99}
451: 
452: 
453: \bibitem[Abazajian et al. 2005]{abazajian} Abazajian,~K. et al. 
\ 2005, ApJ, 625, 613
455: 
456: \bibitem[Adelman-McCarthy et al. 2006]{sdss} Adelman-McCarthy,~J.K. et al. 
457: \ 2006, ApJS, 162, 38
458: 
459: \bibitem[Besag et al. 1995]{Besag} Besag,~J., Green,~P., Higdon,~D.A., 
\& Mengersen,~K. \ 1995, Stat. Sci., 10, 3
461: 
\bibitem[Goldstein \& Rougier 2004]{GoldRou} Goldstein,~M. \&
Rougier,~J. \ 2004, SIAM J. Sci. Comput.,
  26, 467
465:  
466: \bibitem[Heitmann et al. 2005]{HRWH} Heitmann,~K., Ricker,~P.M.,
467: Warren,~M.S., \& Habib,~S. \ 2005, ApJS, 160, 28
468: 
469: \bibitem[Hastie et al. 2001] {Hastie} Hastie,~T., Tibshirani,~R., \&
470: Friedman,~J. \ 2001, {\em The Elements of Statistical Learning: Data
471: Mining, Inference, and Prediction} (Springer)
472: 
473: \bibitem[Higdon et al. 2004]{Higdon}  Higdon,~D.A. et al. \ 2004, 
SIAM J. Sci. Comput., 26, 448
475: 
476: \bibitem[Jeffreys 1961]{HJ} Jeffreys,~H. \ 1961,
477: {\em Theory of Probability} (Oxford)
478: 
479: \bibitem[Kennedy \& O'Hagan 2001]{KOH} Kennedy,~M.C. \&
O'Hagan,~A. \ 2001, J. Royal Stat. Soc. B, 63, Part 3, 425
481: 
\bibitem[MacKay 1998]{Mack} MacKay,~D.J.C. \ 1998, in {\em Neural
483: Networks and Machine Learning}, ed. Bishop,~C.M., NATO ASI Series
484: (Kluwer)
485: 
486: \bibitem[Sacks et al. 1989]{SWMW} Sacks,~J., Welch,~W.J.,
Mitchell,~T.J., \& Wynn,~H.P. \ 1989, Stat. Sci., 4, 409
488: 
\bibitem[Santner et al. 2003]{SWN} Santner,~T.J., Williams,~B.J., \&
Notz,~W. \ 2003, {\em The Design and Analysis of Computer Experiments} 
491: (New York: Springer)
492:  
493: \bibitem[Seljak \& Zaldarriaga 1996]{cmbfast}
Seljak,~U. \& Zaldarriaga,~M. \ 1996, ApJ, 469, 437
495: 
496: \bibitem[Spergel et al. 2006]{spergel} Spergel,~D.N. et al.
497: astro-ph/0603449, ApJ, submitted
498: 
\bibitem[Tang 1993]{tang} Tang,~B. \ 1993, J.~Am.~Stat.~Assn., 88,
500: 1392
501: 
502: \bibitem[Tegmark et al. 2004]{tegmark} Tegmark,~M. et al. \ 2004,
503: Phys. Rev. D, 69, 103501
504: 
505: 
506: \end{thebibliography}
507: 
508: \end{document}
509:  
510: