0712.0194/ms.tex
1: %\documentclass[10pt,preprint]{aastex}
2: \documentclass{emulateapj}
3: 
4: \usepackage{amsmath}
5: 
6: \begin{document}
7: 
8: \title{Computing High Accuracy Power Spectra with Pico}
9: 
10: \author{William A.~Fendt\altaffilmark{1} and
11:         Benjamin D.~Wandelt\altaffilmark{1,2,3}}
12: 
13: \altaffiltext{1}{Department of Physics, UIUC, 1110 W Green Street,
14:              Urbana, IL 61801; fendt@uiuc.edu}
15: \altaffiltext{2}{Department of Astronomy, UIUC, 1002 W Green
16:              Street, Urbana, IL 61801; bwandelt@uiuc.edu}
17: \altaffiltext{3}{Center for Advanced Studies, UIUC, 912 W Illinois
18:              Street, Urbana, IL 61801}
19: 
20: %%===========================================================================
21: 
22: \begin{abstract}
23: 
24: This paper presents the second release of Pico (Parameters for the Impatient 
25: COsmologist). 
26: Pico is a general purpose machine learning code which we have applied
27: to computing the CMB power spectra and the WMAP likelihood.
28: For this release, 
29: we have made improvements to the algorithm as well as the data 
30: sets used to train Pico,
31: leading to a significant improvement in accuracy.
32: For the $9$ parameter nonflat case presented here Pico can on average compute the 
33: TT, TE and EE spectra to better than $1\%$ of cosmic standard deviation
34: for nearly all $\ell$ values over a large region of parameter space.
35: Performing a cosmological parameter analysis of current CMB and large scale
36: structure data, we show that these power spectra give very accurate $1$ and $2$ 
37: dimensional parameter posteriors.
38: We have extended Pico to allow computation of the tensor power 
39: spectrum and the matter transfer function.
40: Pico runs about $1500$ times faster than CAMB at the default accuracy and
41: about $250{,}000$ times faster at high accuracy.
42: Training Pico can be done using massively parallel computing resources, 
43: including distributed computing projects such as Cosmology@Home.
44: On the homepage for Pico, located at
45: \verb+http://cosmos.astro.uiuc.edu/pico+,
46: we provide new sets of regression coefficients and make the training
47: code available for public use.
48: \end{abstract}
49: 
50: \keywords{cosmic microwave background --- cosmology: observations ---
51:           methods: numerical}
52: 
53: %%===========================================================================
54: 
55: \section{Introduction}\label{intro}
56: Given the quantity of data available from current experiments such as WMAP and SDSS as well 
57: as the prospects for the next generation Planck and DES experiments on the horizon,
58: there is a growing need for cosmologists to develop
59: tools that can accurately interpret the flood of data these experiments will gather.
60: A key component in all such analyses is the exploration of the posterior
61: density of the cosmological parameters given the available data.  This allows 
62: us to constrain and test theoretical models of the Universe.
63: A major computational hurdle in this procedure is the ability to quickly and accurately
64: compute the power spectrum of CMB fluctuations and the matter transfer function.  
65: This is accomplished using codes such as CMBFast \citep{Seljak:1996is} and 
66: CAMB \citep{Lewis:1999bs} 
67: that evolve the Boltzmann equation for the various constituents of the Universe.  
68: Using the default accuracy settings in CAMB the calculation of a single power spectrum
69: takes on the order of a minute. At higher settings, as may be required by upcoming
70: CMB experiments, the computational time can jump to several hours 
71: on a modern desktop. Furthermore, computation of constraints based on the data 
72: requires evaluating the power spectrum at $\mathcal{O}\left(10^5 - 10^6\right)$
73: models.
74: Decreasing the time required to calculate the CMB power spectrum, while maintaining 
75: sub-cosmic variance accuracy, will play an important part in
76: turning raw data into quantitative information about the history and structure of the 
77: Universe.
78: 
79: Previous codes aimed at speeding up power spectrum computations
80: such as DASH \citep{Kaplinghat:2002mh}  and CMBwarp 
81: \citep{Jimenez:2004ct} have attempted to reproduce the
82: computation of CMBFast or CAMB. 
83: More recently, motivated to develop a code that was both faster than DASH and
84: more accurate than CMBwarp, we released 
85: a machine learning code called Pico \citep{Fendt:2006uh}. It uses a 
86: training set of power spectra from CAMB to fit several multivariate 
87: polynomials as a function of the input parameters.  Along with accurately 
88: fitting the power spectra, we also found that
89: Pico was able to directly fit the WMAP likelihood 
90: \citep{Spergel:2006hy,Hinshaw:2006ia,Page:2006hz}.
91: This was done previously by CMBFit \citep{Sandvik:2003ii}
92: which used a similar idea of fitting the
93: likelihood with a polynomial in the cosmological parameters.
94: By replacing CAMB and the WMAP likelihood code with Pico we demonstrated that it 
95: can quickly explore the parameter space and give nearly identical posteriors.
96: Since the first release of Pico, Auld \textit{et al.} have applied a neural 
97: network code called CosmoNet to flat \citep{Auld:2006pm} and nonflat 
98: \citep{Auld:2007qz} models. 
99: Also Habib \textit{et al.} have demonstrated that a Gaussian process 
100: model introduced by \citet{Heitmann:2006hr} can 
101: predict the CMB power spectrum for flat models
102: based on a very small training set \citep{Habib:2007ca}.
103: Here we discuss some improvements to increase the accuracy of Pico 
104: and extend its application to more general cosmological models.
105: Further, we describe a training method that avoids serial runs of CAMB
106: and is designed to leverage access to massively parallel and even 
107: distributed computing resources.  For example, we demonstrate that
108: Pico can be trained using the thousands of geographically distinct hosts
109: that contribute to Cosmology@Home.\footnote{http://www.cosmologyathome.org}
110: 
111: This paper is organized as follows. Section \ref{sec:algorithm} summarizes 
112: some of the improvements to the Pico algorithm and how training sets are generated.
113: In section \ref{sec:results} we demonstrate the performance of Pico in 
114: computing the power spectrum, matter transfer function and WMAP likelihood
115: for a $9$ parameter nonflat model.  We show that Pico can be used to 
116: quickly explore the parameter posterior for this model while
117: accurately reproducing the $1$ and $2$ dimensional marginalized distributions.
118: Lastly we summarize and conclude in section \ref{sec:conclusion}.
119: 
120: 
121: \section{The Algorithm} \label{sec:algorithm}
122: \subsection{Overview}
123: Given a training set of cosmological parameters $\mathbf{x}$ and CMB power spectra
124: $\mathbf{y}$, Pico models the function $\mathbf{y}=f\left(\mathbf{x}\right)$ in
125: $3$ steps. First it compresses the power spectra using a 
126: Karhunen-Lo\`{e}ve \citep{Karhunen:1946,Loeve:1955,Tegmark:1994ed} technique. 
127: For $\ell_{\mathrm{max}}=3000$, the temperature and polarization spectra 
128: due to scalar and tensor perturbations can be compressed to $\sim 180$ total numbers. 
129: Next the algorithm clusters the input
130: parameters into non-overlapping regions. This is done using a $k$-means 
131: algorithm \citep{MacQueen:1967,Kirby:2001} or by choosing hyperplanes
132: to manually partition the space. 
133: Lastly, Pico models the function by fitting a least-squares polynomial
134: to the compressed power spectra over each cluster.  
135: Details of the algorithm can be found in the appendix of \citet{Fendt:2006uh}.
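
As a schematic summary of these three steps, in notation that is ours and not taken from \citet{Fendt:2006uh}, let $\bar{\mathbf{y}}$ denote the mean training spectrum and let the columns of $\mathbf{U}$ be the retained Karhunen-Lo\`{e}ve modes. Assuming an orthonormal basis and a polynomial of total degree $p$ on each cluster $k$, the model can be sketched as
\begin{align*}
   \mathbf{c} &= \mathbf{U}^{T}\left(\mathbf{y}-\bar{\mathbf{y}}\right), \\
   c_i\left(\mathbf{x}\right) &\approx \sum_{|\alpha|\le p} a^{(k)}_{i\alpha}\,\mathbf{x}^{\alpha},
      \qquad \mathbf{x} \in \text{cluster } k, \\
   \mathbf{y}\left(\mathbf{x}\right) &\approx \bar{\mathbf{y}} + \mathbf{U}\,\mathbf{c}\left(\mathbf{x}\right),
\end{align*}
where $\alpha$ is a multi-index, $\mathbf{x}^{\alpha}=\prod_j x_j^{\alpha_j}$, and the coefficients $a^{(k)}_{i\alpha}$ minimize the sum of squared residuals over the training points in cluster $k$. Any rescaling applied to the spectra or parameters before compression is omitted from this sketch.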
136: 
137: \subsection{Improved Fitting}
138: The first change to the algorithm is that it now uses the LAPACK libraries
139: \footnote{http://www.netlib.org/lapack/} to perform matrix decomposition. 
140: LAPACK makes the algorithm more stable, allowing the use of higher order 
141: polynomials. This gives Pico better fits at low $\ell$ ($<200$) and high 
142: $\ell$ ($>1500$).  This also makes clustering
143: significantly less important but still useful for improving the low 
144: $\ell$ accuracy as well as improving the fit to the WMAP likelihood directly.
145: In particular, we have found that partitioning the training set along values
146: of constant $\Omega_{\mathrm{k}}$ improves the low $\ell$ accuracy 
147: in the temperature power spectrum and matter transfer function, 
148: while partitioning in $\tau$, the 
149: reionization depth, gives a large improvement in computing the 
150: polarization power spectrum at low $\ell$.
151: 
152: 
153: \subsection{Decreasing Numerical Noise}\label{subsec:noise}
154: For nonflat models, fitting algorithms 
155: are hindered by the fact that at the default accuracy the power spectra 
156: computed by CAMB are numerically noisy. 
157: This is demonstrated in Figure \ref{fig:cl_noise}, where
158: we have plotted the power spectrum 
159: at the default and high accuracy levels
160: for various $\ell$-values as a function of a 
161: single parameter ($\Omega_{\mathrm{b}}h^2$) for a nonflat model.
162: Since the power spectrum is not a numerically smooth function of the
163: cosmological parameters, Pico is limited in its ability to fit the
164: true spectra.
165: Also note that the higher accuracy power spectrum does not
166: always smooth over the lower accuracy case. 
167: To reduce this noise and limit the effect of interpolation errors
168: we have generated training sets with CAMB using the accuracy parameters 
169: set as: {\tt Accuracy=3}, {\tt lAccuracy=3} and {\tt lSamp=3}.
170: In Figure \ref{fig:camb_acc} we have plotted the error between
171: the default and high accuracy power spectra from CAMB for $25$ 
172: models around the peak of the WMAP likelihood. The left and right plots
173: show the percent error in the TT and EE spectra. Also shown is the error
174: between the spectra computed by Pico and the high accuracy CAMB runs.
175: Here Pico was trained on the $9$ parameter model discussed in
176: section \ref{sec:results}.
177: The figures demonstrate that when trained on high accuracy data Pico
178: computes the power spectrum around the peak of the likelihood to better 
179: precision than CAMB at its default accuracy settings.
180: 
181: \begin{figure}
182: \begin{center}
183:    \epsscale{1.15} \plotone{f1_color.eps}
184:    %\epsscale{1.15} \plotone{f1.eps}
185:    %\epsscale{1.15} \plottwo{f1_color.eps}{f1.eps}
186:    \caption{The plot shows the value of the temperature spectrum as a function of 
187:             $\Omega_{\mathrm{b}}h^2$ at various $\ell$-values for
188:             a nonflat cosmology. The red ($+$) points correspond 
189:             to the default CAMB accuracies ($1$,$1$,$1$) and the
190:             green ($\Box$) points correspond to higher accuracy
191:             settings ($3$,$3$,$1$).
192:             While at low $\ell$ the power spectrum is smooth, the
193:             default accuracy becomes numerically noisy at higher $\ell$.
194:             This is one reason adding $\Omega_{\mathrm{k}}$ as a free
195:             parameter increases the difficulty in fitting the power
196:             spectrum. Also plotted, as a blue line, is the power
197:             spectrum computed by Pico trained on the $9$ parameter
198:             nonflat model discussed in section \ref{sec:results}.
199:             \label{fig:cl_noise}}
200: \end{center}
201: \end{figure}
202: 
203: \begin{figure}
204: \begin{center}
205:    \epsscale{1.15} \plotone{f2_color.eps} 
206:    %\epsscale{1.15} \plotone{f2.eps} 
207:    %\epsscale{1.15} \plottwo{f2_color.eps}{f2.eps}
208:    \caption{The plots show the percent error between the TT (left) and
209:             EE (right) power spectra computed by CAMB at the default 
210:             accuracy compared to those computed at high accuracy. 
211:             Also shown is the
212:             percent error between the power spectra computed by Pico
213:             and the high accuracy CAMB spectra.
214:             This test was done on $25$ models all located within $25$
215:             log-likelihoods of the WMAP peak.
216:             \label{fig:camb_acc}}
217: \end{center}
218: \end{figure}
219: 
220: Numerical noise in the power spectrum also
221: leads to noise in the WMAP likelihood as shown in Figure 
222: \ref{fig:lnlike_noise}.  
223: Again, this increases the difficulty in fitting the likelihood
224: with Pico.
225: Just as for the power spectrum, this is remedied by running CAMB at
226: higher accuracy.
227: While the level of noise introduced in the likelihood may not be
228: of significant concern when analyzing current CMB data sets, it may 
229: represent an important hurdle to overcome for the next generation of 
230: experiments.
231: Also evident from Figure \ref{fig:lnlike_noise} is that Pico provides
232: a smooth approximation to the noisy likelihood. This is an important property
233: for algorithms that require differentiating the likelihood such as
234: Hamiltonian Monte Carlo \citep{Hajian:2006mt}.
235: 
236: \begin{figure}
237: \begin{center}
238:    \epsscale{1.15} \plotone{f3_color.eps}
239:    %\epsscale{1.15} \plotone{f3.eps}
240:    %\epsscale{1.15} \plottwo{f3_color.eps}{f3.eps}
241:    \caption{Value of $-\ln L_{\mathrm{WMAP}}$ as a function of
242:             $\Omega_{\mathrm{b}}h^2$ for a nonflat cosmology.
243:             Note that this is near the peak of the likelihood 
244:             in the full space. The red ($+$)
245:             points correspond to the default CAMB accuracies ($1$,$1$,$1$)
246:             and the green ($\Box$) points correspond to higher 
247:             accuracy settings ($3$,$3$,$1$). The blue line is the 
248:             value computed by Pico trained over the $9$ parameter nonflat
249:             models discussed in section \ref{sec:results}.
250:             Note that using the default accuracy in CAMB gives a numerically 
251:             noisy function, which can lead to variations of $1$ or more
252:             log-likelihoods, but Pico gives a smooth function through the high
253:             accuracy values.
254:             \label{fig:lnlike_noise}}
255: \end{center}
256: \end{figure}
257: 
258: 
259: \subsection{Generating the Training Set}\label{subsec:train}
260: As in the first release of Pico, we generate the training set of
261: power spectra and matter transfer function by sampling uniformly 
262: from a large box.
263: After training Pico to compute the power spectra, evaluation of the
264: WMAP likelihood, which requires a few seconds, becomes the new bottleneck
265: in parameter estimation. Another significant speed up can be obtained by 
266: using Pico to directly fit the likelihood function.  
267: However, for the $9$ parameter case we examine in the next section this 
268: is a difficult problem.
269: Training Pico over a box in parameter space 
270: includes regions that are many thousands of log-likelihoods from the peak
271: and gives a very sparse sampling of the high likelihood region.
272: In practice we are not interested in these areas of parameter space.
273: Instead we aim to compute the likelihood very accurately around the peak
274: of the distribution. This requires generating a training set for Pico that
275: includes only the high likelihood region.
276: 
277: A natural method of accomplishing this is to use the 
278: Metropolis-Hastings algorithm to find points in the high likelihood region. 
279: This can be done efficiently by running CosmoMC \citep{Lewis:2002ah} and
280: using Pico to compute the power spectra. 
281: To ensure we cover a sufficient volume the chains are run 
282: using only the WMAP data and at a higher temperature, meaning the log-likelihood
283: is scaled by a constant factor allowing the chains to explore a larger volume.
284: This step is dominated
285: by the time it takes to run the WMAP likelihood code.
286: Lastly we run the samples through CAMB and the WMAP code
287: to get the true likelihood which will be used to retrain Pico. 
288: This step is also quick as it can easily be run in parallel.
289: It is useful to note that this procedure never requires running CAMB 
290: in serial making it an ideal application for distributed computing
291: projects such as Cosmology@Home.
292: The training set can be further refined by pruning out data at low
293: likelihood.
294: In the following Pico is trained to compute the likelihood on
295: points within $25$ log-likelihoods of the WMAP peak.
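
The effect of the higher temperature can be written explicitly. In the following sketch the symbol $T$ for the temperature is our notation, flat priors and a symmetric proposal distribution are assumed, and the value of $T$ used for the chains is not specified here. The heated chains sample from
\begin{equation*}
   P_T\left(\mathbf{x}\right) \propto L_{\mathrm{WMAP}}\left(\mathbf{x}\right)^{1/T},
   \qquad
   -\ln P_T\left(\mathbf{x}\right) = -\frac{1}{T}\ln L_{\mathrm{WMAP}}\left(\mathbf{x}\right) + \mathrm{const},
\end{equation*}
so that a proposed move from $\mathbf{x}$ to $\mathbf{x}'$ is accepted with probability
\begin{equation*}
   \min\left[1, \left(\frac{L_{\mathrm{WMAP}}\left(\mathbf{x}'\right)}
                            {L_{\mathrm{WMAP}}\left(\mathbf{x}\right)}\right)^{1/T}\right].
\end{equation*}
For $T>1$ a point lying $\Delta$ log-likelihoods below the peak is penalized as if it were only $\Delta/T$ below it, which is why the heated chains cover a larger volume.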
296: 
297: \subsection{Polynomial Hierarchy} \label{subsec:hierarchy}
298: The process of generating the training set outlined in section \ref{subsec:train}
299: has the added benefit of giving us a set of power spectra constrained around the
300: peak of the likelihood.  We would like to make use of these points by adding them
301: to the power spectra training set. However adding a large weighting of points to
302: a small region of the box will have a negative effect on the accuracy of the 
303: algorithm outside this region. Instead we have implemented the ability to use a 
304: hierarchy of polynomials with Pico by separately training over the uniformly
305: sampled points in the full box and over only the points in the constrained region.
306: If Pico is given a set of input cosmological parameters within this region it
307: computes the power spectra based on a polynomial fit to this constrained region.
308: For points outside this region, Pico defaults to using the polynomials fit over
309: the full box.
310: While we have found that using only a single set of polynomials trained on the box
311: is sufficient for analysis of current experimental data, this
312: will be a useful feature in the future when data from higher resolution 
313: experiments become available.
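
Schematically, writing $\mathcal{R}$ for the constrained region and $f_{\mathcal{R}}$, $f_{\mathrm{box}}$ for the two sets of polynomial fits (labels introduced here only for illustration), the hierarchy amounts to the piecewise definition
\begin{equation*}
   f\left(\mathbf{x}\right) =
   \begin{cases}
      f_{\mathcal{R}}\left(\mathbf{x}\right), & \mathbf{x} \in \mathcal{R}, \\
      f_{\mathrm{box}}\left(\mathbf{x}\right), & \text{otherwise}.
   \end{cases}
\end{equation*}
How membership in $\mathcal{R}$ is tested for a given input point is not detailed in this section, so the condition $\mathbf{x}\in\mathcal{R}$ should be read as a placeholder for that test.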
314: 
315: 
316: \section{Results} \label{sec:results}
317: Here we demonstrate the performance of Pico for nonflat cosmologies with 
318: the dark energy equation of state, $w_{\mathrm{DE}}$, allowed to vary 
319: (but still constant for a given model).
320: In this space Pico fits the power spectrum and likelihood as a function of
321: \begin{equation*}
322:    \left( \Omega_{\mathrm{b}}h^2, \Omega_{\mathrm{cdm}}h^2, \Omega_{\mathrm{k}},
323:           \theta, \tau, n_{\mathrm{s}}, \ln 10^{10} A_{\mathrm{s}}, 
324:           r, w_{\mathrm{DE}}
325:    \right).
326: \end{equation*}
327: The following sections study the accuracy of Pico in computing the power spectra,
328: matter transfer function and WMAP likelihood, as well as its application to parameter
329: estimation based on this $9$ parameter model.
330: 
331: \subsection{Power Spectra and Matter Transfer Function}
332: In order to demonstrate Pico's accuracy and robustness we will test the algorithm
333: for two cases. The first case implements the hierarchy method discussed in
334: section \ref{subsec:hierarchy}. Here the training set is divided into two pieces.
335: The first contains $\sim18000$ samples generated uniformly from the box defined in 
336: Table \ref{tbl:param_bounds}, and the second set consists of the $15000$ points
337: constrained to $25$ log-likelihoods from the peak of the WMAP likelihood. 
338: For this case the test set consists of $\sim2000$ points taken from the latter 
339: training set.  These points were removed from the training set and not
340: used to train Pico.
341: 
342: The models in the training set were run through CAMB at accuracy settings
343: ($3$,$3$,$3$) to compute the true power spectra and transfer functions.  
344: As the $\ell$ and $k$ sampling used by CAMB is model dependent it is necessary 
345: to spline the power spectra and transfer function so that each is computed
346: at the same $\ell$ or $k$ value. The $\ell$-values were chosen to be those 
347: used by CAMB for flat models with \texttt{lSamp}$=3$. For the transfer 
348: function we used a uniform sampling in $\ln k$.
349: Pico was trained using $6^{\mathrm{th}}$ order polynomials,
350: requiring less than $30$ minutes on a $2.4$\,GHz desktop.
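
To give a sense of the size of this fit, recall that the number of monomials of total degree at most $p$ in $n$ variables is $\binom{n+p}{p}$. If a full $6^{\mathrm{th}}$ order polynomial in all $9$ parameters were used, which is an assumption made here for illustration since the terms actually retained are not listed, each fitted quantity would require
\begin{equation*}
   \binom{9+6}{6} = \frac{15!}{6!\,9!} = 5005
\end{equation*}
coefficients. This is smaller than the number of points in either training set, so in the absence of clustering the least-squares problem would remain overdetermined.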
351: 
352: Pico's performance on this test set is shown in Figure \ref{fig:openw}.
353: The top $2$ rows show the TT, TE and EE power spectra with the second
354: row focusing on low $\ell$. Results for the BB spectra and matter transfer
355: function are shown in the third row.
356: The two lines in each plot represent the mean error and the error bar that bounds
357: $99\%$ of the test set. For the power spectra the error is plotted in units of
358: the cosmic standard deviation and for the matter transfer function the $y$-axis 
359: shows percent error.
360: From the figure we see that over the volume of parameter space important for
361: CMB parameter estimation Pico can compute the power spectra for $99\%$ of 
362: models in the test set to better than $4\%$ of cosmic standard deviation,
363: with the mean error around $0.5\%$, over most $\ell$-values. 
364: Even the worst fits to the EE power spectra, which occur just after the 
365: reionization bump, are only about $25\%$ of the cosmic standard deviation.
366: We also note that many of the models that are fit poorly have very low power 
367: in this region so no experiment should be sensitive to these errors.
368: For the transfer function the $99\%$ error bar
369: is around $0.25\%$, and the mean at $0.02\%$, except at very low $k$. 
370: This should be sufficient for analysis of data from the next generation 
371: of large scale structure experiments.
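
For reference, the cosmic standard deviation used as the unit of error is not written out above. Assuming the standard full-sky Gaussian expressions, it reads
\begin{align*}
   \sigma\left(C_\ell^{XX}\right) &= \sqrt{\frac{2}{2\ell+1}}\; C_\ell^{XX},
      \qquad X \in \{T, E, B\}, \\
   \sigma\left(C_\ell^{TE}\right) &= \sqrt{\frac{C_\ell^{TT} C_\ell^{EE} + \left(C_\ell^{TE}\right)^2}{2\ell+1}},
\end{align*}
and the plotted quantity would then be $\left|C_\ell^{\mathrm{Pico}} - C_\ell^{\mathrm{CAMB}}\right| / \sigma\left(C_\ell\right)$. As a worked example under this assumption, at $\ell=1000$ the TT expression gives $\sigma/C_\ell = \sqrt{2/2001} \approx 0.0316$, so an error of $1\%$ of the cosmic standard deviation corresponds to a fractional error of about $3\times10^{-4}$ in $C_\ell$.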
372: 
373: \begin{table}[ht]
374: \begin{center}
375: \begin{tabular}{|ccccc|}
376:    \hline
377:    $0.018$ & $<$ & $\Omega_\mathrm{b} h^2$   & $<$ & $0.034$ \\
378:    $0.06$  & $<$ & $\Omega_\mathrm{cdm} h^2$ & $<$ & $0.2$   \\
379:    $-0.3$  & $<$ & $\Omega_\mathrm{k}$       & $<$ & $0.3$   \\
380:    $1.02$  & $<$ & $100\,\theta$             & $<$ & $1.08$  \\
381:    $0.01$  & $<$ & $\tau$                    & $<$ & $0.55$  \\
382:    $0.85$  & $<$ & $n_{\mathrm{s}}$          & $<$ & $1.25$  \\
383:    $2.75$  & $<$ & $\ln \left(10^{10} A_{\mathrm{s}} \right) $ & $<$ & $4.0$ \\
384:    $0$     & $<$ & $r$                       & $<$ & $2$     \\
385:    $-1.5$  & $<$ & $w_{\mathrm{DE}}$         & $<$ & $-0.3$  \\
386:    \hline
387: \end{tabular}
388: \end{center}
389: \caption{Parameter bounds defining the box the training set was sampled 
390:          from for the example in section \ref{sec:results}. This encompasses a volume
391:          of at least $3\sigma$ in each parameter around the WMAP maximum likelihood.
392:          Note that we also impose the prior that the corresponding Hubble constant for each 
393:          parameter point lie in the interval $\left[30,100\right]$\,km\,s$^{-1}$\,Mpc$^{-1}$, which excludes some 
394:          regions inside the box. 
395:          \label{tbl:param_bounds}}
396: \end{table}
397: 
398: \begin{figure}
399: \begin{center}
400:    \epsscale{1.15} \plotone{f4_color.eps}
401:    %\epsscale{1.15} \plotone{f4.eps}
402:    %\epsscale{1.15} \plottwo{f4_color.eps}{f4.eps}
403:    \caption{The above plots compare the performance of Pico with CAMB at
404:             high accuracy settings for 9 parameter nonflat models with 
405:             $w_{\mathrm{DE}}\ne -1$. Pico was trained using the hierarchy
406:             method described in section \ref{subsec:hierarchy} and the
407:             test set consists of $2000$ points within $25$ log-likelihoods
408:             of the WMAP peak.
409:             The top two rows show the error 
410:             compared with CAMB in units of cosmic 
411:             standard deviation for the TT, TE and EE power spectra at
412:             high $\ell$ (top) and low $\ell$ (center).
413:             The bottom row shows the error in the BB spectra in units of 
414:             the cosmic standard deviation and the percent error in the 
415:             matter transfer function.
416:             The two lines on each plot denote the mean error and the error
417:             bar that bounds $99\%$ of the test set.
418:             We note that much of the error at low $\ell$ in the EE spectra
419:             is due to the $1\%$ of models with extremely low power over this range.
420:             These spectra are too small to detect even with Planck.
421:             \label{fig:openw}}
422: \end{center}
423: \end{figure}
424: 
425: For the second test case the hierarchy method is not used and Pico is only
426: trained on a uniform sample of points from the box in 
427: Table \ref{tbl:param_bounds}. For this case the test set consists of 
428: a uniform sample of $\sim2000$ points from the same box. 
429: Pico's performance on this test set is shown in Figure \ref{fig:openw-box}.
430: We include this case only to allow comparison with other codes.
431: When Pico is used to explore the parameter posterior based on CMB constraints, 
432: which is its main purpose, chains will rarely propose points outside the 
433: constrained volume used in the hierarchy method.  Therefore 
434: Figure \ref{fig:openw} provides a better indicator of the types of error
435: incurred by using Pico to compute the power spectra.
436: The regression files on the Pico website implement the hierarchy method.
437: 
438: \begin{figure}
439: \begin{center}
440:    \epsscale{1.15} \plotone{f5_color.eps}
441:    %\epsscale{1.15} \plotone{f5.eps}
442:    %\epsscale{1.15} \plottwo{f5_color.eps}{f5.eps}
443:    \caption{The plots are the same as those in Figure \ref{fig:openw}
444:             except here Pico was trained and tested over models
445:             sampled uniformly from the box defined by 
446:             Table \ref{tbl:param_bounds}. Even over this larger
447:             region Pico can compute the power spectrum in $99\%$
448:             of the test cases to better
449:             than $5\%$ of cosmic standard deviation over most $\ell$
450:             and is never worse than $0.7$ cosmic standard deviation.
451:             \label{fig:openw-box}}
452: \end{center}
453: \end{figure}
454: 
455: 
456: \subsection{WMAP Likelihood}
457: Next we test the computation of the WMAP likelihood from Pico for two cases.  The
458: first uses Pico to compute the power spectrum and then the WMAP code to compute 
459: the likelihood and the second uses Pico to directly compute the likelihood.
460: The training set for the likelihood computation consists of $\sim15000$ points generated
461: using the method described in section \ref{subsec:train}.
462: Another $\sim2000$ points, generated using the same method, were used as a test set.
463: The absolute error in the likelihood is shown in Figure \ref{fig:lnlike}. 
464: The plot on the left shows the results of using Pico to compute the power spectrum
465: while the plot on the right shows the results of directly computing the likelihood.
466: For the case of directly evaluating the likelihood, Pico can compute about 
467: $90\%$ of the test set to better than $0.25$ log-likelihoods. When only using Pico to
468: compute the power spectrum the results are within $0.25$ log-likelihoods for 
469: better than $99.5\%$ of the models.
470: The training set and test set were computed using 
471: high accuracy CAMB runs and version v2p2p2 of the WMAP likelihood 
472: code.\footnote{http://lambda.gsfc.nasa.gov}
473: 
474: \begin{figure}
475: \begin{center}
476:    \epsscale{1.15} \plotone{f6_color.eps}
477:    %\epsscale{1.15} \plotone{f6.eps}
478:    %\epsscale{1.15} \plottwo{f6_color.eps}{f6.eps}
479:    \caption{The plots show the absolute error when evaluating the
480:             WMAP likelihood with Pico.  In the left plot Pico was
481:             used to compute the power spectra, which were fed into 
482:             the WMAP code. The right plot shows the absolute error
483:             when Pico directly evaluates the likelihood. In both 
484:             cases the likelihood is compared to the value from the 
485:             WMAP code using high accuracy CAMB power spectra.
486:             \label{fig:lnlike}}
487: \end{center}
488: \end{figure}
489: 
490: 
491: \subsection{Parameter Estimation}
492: To test the application of Pico to this $9$ parameter model we ran Markov chains
493: using CosmoMC with the 
494: WMAP \citep{Hinshaw:2006ia,Page:2006hz}, 
495: ACBAR \citep{Kuo:2002ua}, 
496: CBI \citep{Readhead:2004gy},
497: Boomerang \citep{Montroy:2005yx,Jones:2005yb,Piacentini:2005yq}, 
498: $2$df \citep{Cole:2005sx}, 
499: SDSS \citep{AdelmanMcCarthy:2005se},
500: and SNLS \citep{Astier:2005qq} 
501: data sets.
502: Chains were run for $3$ cases. The first uses CAMB and the WMAP likelihood
503: code (CAMB$+$WMAP), the second uses Pico to compute the power spectra and
504: transfer function but still uses the official likelihood codes (PICO$+$WMAP)
505: and the third case uses Pico to compute the power spectra, transfer function
506: and the WMAP likelihood (PICO). In the third case we did not use Pico to fit
507: the $2$df, SDSS or SNLS likelihood codes. The $1$-dimensional, marginalized
508: posteriors for each of the $9$ parameters are shown along the diagonal 
509: in Figure \ref{fig:openw-post}. The plots in the lower (upper) triangle
510: in the figure compare the $2$ dimensional posteriors between the 
511: PICO$+$WMAP (PICO) case and the CAMB$+$WMAP case. In all of the plots the
512: CAMB$+$WMAP results are shown in red, the PICO$+$WMAP results in green and 
513: the PICO results in blue.
514: The lines in the 2D plots denote the $68\%$ and $99\%$ contours.
515: Using $6$ chains run in parallel, the PICO$+$WMAP chains ran about $60$ times 
516: faster than without Pico, requiring about $4$ hours of wall clock time.
517: Using Pico to compute the WMAP likelihood gave another factor of $2.5$ decrease
518: in CPU time giving a total speed up of $\sim150$ over chains run without Pico.
519: In all cases the chains finished with a Gelman-Rubin statistic less than $1.01$.
520: 
521: \begin{figure*}[p]
522: \begin{center}
523:    \epsscale{1.15} \plotone{f7_color.eps}
524:    \caption{The cosmological parameter posteriors using CAMB and Pico for $9$ parameter
525:             nonflat models with $w_{\mathrm{DE}}\ne -1$ based on the WMAP, ACBAR, 
526:             CBI, Boomerang, SDSS, $2$df and SNLS data sets. The red lines are
527:             the result of using CAMB and the WMAP likelihood code. The green
528:             lines use Pico to compute the power spectrum but still use the
529:             WMAP likelihood. The blue lines are the result from using Pico to  
530:             compute the power spectrum and WMAP likelihood. The plots in the
531:             lower triangle show the $68\%$ and $99\%$ contours for the chains
532:             run using CAMB and PICO with the WMAP likelihood code. The upper 
533:             triangle compares the CAMB chains to those using Pico to compute 
534:             the power spectra and WMAP likelihood.
535:             \label{fig:openw-post}}
536: \end{center}
537: \end{figure*}
538: 
539: \section{Conclusion} \label{sec:conclusion}
540: This paper describes a major new release of Pico, a fast and accurate code for computing
541: the CMB power spectrum, matter transfer function and the WMAP likelihood. 
542: We noted the presence of numerical noise
543: in CAMB at standard accuracy for nonflat models and its effect on the WMAP likelihood.
544: To solve this problem we have generated training sets running CAMB at high accuracy 
545: settings.  
546: We have also presented a method of generating a training set that finds
547: the high likelihood region of parameter space without ever running CAMB in serial.  
548: This is especially useful for training Pico to fit the WMAP likelihood in
549: high-dimensional parameter spaces.
550: Furthermore, Pico can be trained separately on the power spectra in this smaller region of 
551: parameter space, allowing even more accurate results around the peak of the likelihood
552: while still maintaining the ability to compute the power spectra over a large box in
553: parameter space.
554: The combination of these improvements, along with modifications to the Pico algorithm,
555: has increased its accuracy in computing the power spectrum and likelihood. 
556: We have also extended Pico to compute the power spectrum due to tensor perturbations
557: as well as the matter transfer function.
558: On the Pico homepage,\footnote{http://cosmos.astro.uiuc.edu/pico}
559: we provide the new version of Pico and new sets of regression coefficients. 
560: We have also released the training code for Pico, allowing users to apply
561: the algorithm to new classes of models and parameter sets.
562: We expect that the accuracy and speed achieved by Pico will be useful for current
563: and future CMB and large scale structure observations. Furthermore, we hope that
564: the concept, embodied by Pico, of exploiting massively parallel computing 
565: resources to solve inherently serial numerical problems will find applications
566: beyond the immediate domain of cosmological parameter estimation.
567: 
568: 
569: \acknowledgements
570: This work was partially funded by NSF grants AST 05-07676 and AST 07-08849,
571: by NASA contract JPL1236748, by the National Computational Science Alliance
572: under AST300029N, by the University of Illinois, by the Computational Science
573: and Engineering Department at the University of Illinois and by a Friedrich Wilhelm 
574: Bessel research prize from the Alexander von Humboldt Foundation.
575: We utilized the TeraGrid \citep{Catlett}
576: Itanium 2 clusters at NCSA and at Argonne National Laboratory,
577: as well as the Turing cluster
578: in the Computational Science and Engineering Department at the 
579: University of Illinois at Urbana-Champaign.
580: 
581: We thank the Max Planck Institute for Astrophysics for its hospitality while part
582: of this work was done.
583: We also thank the users of the Cosmology@Home\footnote{http://www.cosmologyathome.org} 
584: project whose donated CPU hours 
585: helped make this work possible.\footnote{http://www.cosmologyathome.org/top\_users.php}
586: In particular we would like to thank Scott Kruger, the administrator of Cosmology@Home,
587: as well as the users laurenu2, PoorBoy, 
588: $\left[\right.$B$\hat{\;\;}$S$\left.\right]$ralfi65, Mitchell, and Mike The Great 
589: as representatives of all Cosmology@Home participants.  Lastly, we would like to thank Nikita Sorokin
590: for his work in designing the Pico homepage.
591: 
592: Funding for the Sloan Digital Sky Survey (SDSS) has been provided by the Alfred P. Sloan
593: Foundation, the Participating Institutions, the National Aeronautics and Space Administration,
594: the National Science Foundation, the U.S. Department of Energy, the Japanese Monbukagakusho, and
595: the Max Planck Society. The SDSS Web site is \verb+http://www.sdss.org/+.
596: The SDSS is managed by the Astrophysical Research Consortium (ARC) for the Participating
597: Institutions. The Participating Institutions are The University of Chicago, Fermilab, the
598: Institute for Advanced Study, the Japan Participation Group, The Johns Hopkins University, the
599: Korean Scientist Group, Los Alamos National Laboratory, the Max-Planck-Institute for Astronomy
600: (MPIA), the Max-Planck-Institute for Astrophysics (MPA), New Mexico State University, University
601: of Pittsburgh, University of Portsmouth, Princeton University, the United States Naval
602: Observatory, and the University of Washington.
603: 
604: \pagebreak[4]
605: 
606: \begin{thebibliography}{99}
607: 
608: %\cite{AdelmanMcCarthy:2005se}
609: \bibitem[Adelman-McCarthy et al.(2006)]{AdelmanMcCarthy:2005se}
610:   J.~K.~Adelman-McCarthy {\it et al.}  [SDSS Collaboration],
611:   %``The Fourth Data Release of the Sloan Digital Sky Survey,''
612:   Astrophys.\ J.\ Suppl.\  {\bf 162}, 38 (2006)
613:   %[arXiv:astro-ph/0507711].
614:   %%CITATION = APJSA,162,38;%%
615: 
616: 
617: %\cite{Astier:2005qq}
618: \bibitem[Astier et al.(2006)]{Astier:2005qq}
619:   P.~Astier {\it et al.}  [The SNLS Collaboration],
620:   %``The Supernova Legacy Survey: Measurement of Omega_M, Omega_Lambda and w
621:   %from the First Year Data Set,''
622:   Astron.\ Astrophys.\  {\bf 447}, 31 (2006)
623:   %[arXiv:astro-ph/0510447].
624:   %%CITATION = AAEJA,447,31;%%
625: 
626: %\cite{Auld:2006pm}
627: \bibitem[Auld et al.(2006)]{Auld:2006pm}
628:   T.~Auld, M.~Bridges, M.~P.~Hobson and S.~F.~Gull,
629:   %``Fast cosmological parameter estimation using neural networks,''
630:   Mon.\ Not.\ Roy.\ Astron.\ Soc.\ Lett.\  {\bf 376}, L11 (2007)
631:   %[arXiv:astro-ph/0608174].
632:   %%CITATION = 00482,376,L11;%%
633: 
634: %\cite{Auld:2007qz}
635: \bibitem[Auld et al.(2007)]{Auld:2007qz}
636:   T.~Auld, M.~Bridges and M.~P.~Hobson,
637:   %``{\sc CosmoNet}: fast cosmological parameter estimation in non-flat models
638:   %using neural networks,''
639:   arXiv:astro-ph/0703445.
640:   %%CITATION = ASTRO-PH/0703445;%%
641: 
642: \bibitem[Catlett et al.(2007)]{Catlett}
643:   C.~Catlett {\it et al.},
644:   %``TeraGrid: Analysis of Organization, System Architecture, and Middleware 
645:   %  Enabling New Types of Applications,'' 
646:   HPC and Grids in Action, Ed. Lucio Grandinetti, 
647:   IOS Press `Advances in Parallel Computing' series, Amsterdam (2007)
648: 
649: %\cite{Cole:2005sx}
650: \bibitem[Cole et al.(2005)]{Cole:2005sx}
651:   S.~Cole {\it et al.}  [The 2dFGRS Collaboration],
652:   %``The 2dF Galaxy Redshift Survey: Power-spectrum analysis of the final
653:   %dataset and cosmological implications,''
654:   Mon.\ Not.\ Roy.\ Astron.\ Soc.\  {\bf 362}, 505 (2005)
655:   %[arXiv:astro-ph/0501174].
656:   %%CITATION = MNRAA,362,505;%%
657: 
658: %\cite{Fendt:2006uh}
659: \bibitem[Fendt \& Wandelt(2007)]{Fendt:2006uh}
660:   W.~A.~Fendt and B.~D.~Wandelt,
661:   %``Pico: Parameters for the Impatient Cosmologist,''
662:   Astrophys.\ J.\  {\bf 654}, 2 (2007)
663:   %[arXiv:astro-ph/0606709].
664:   %%CITATION = ASJOA,654,2;%%
665: 
666: %\cite{Habib:2007ca}
667: \bibitem[Habib et al.(2007)]{Habib:2007ca}
668:   S.~Habib, K.~Heitmann, D.~Higdon, C.~Nakhleh and B.~Williams,
669:   %``Cosmic calibration: Constraints from the matter power spectrum and the
670:   %cosmic microwave background,''
671:   arXiv:astro-ph/0702348.
672:   %%CITATION = ASTRO-PH/0702348;%%
673: 
674: %\cite{Hajian:2006mt}
675: \bibitem[Hajian(2007)]{Hajian:2006mt}
676:   A.~Hajian,
677:   %``Efficient Cosmological Parameter Estimation with Hamiltonian Monte Carlo,''
678:   Phys.\ Rev.\  D {\bf 75}, 083525 (2007)
679:   %[arXiv:astro-ph/0608679].
680:   %%CITATION = PHRVA,D75,083525;%%
681: 
682: %\cite{Heitmann:2006hr}
683: \bibitem[Heitmann et al.(2006)]{Heitmann:2006hr}
684:   K.~Heitmann, D.~Higdon, C.~Nakhleh and S.~Habib,
685:   %``Cosmic Calibration,''
686:   Astrophys.\ J.\  {\bf 646}, L1 (2006)
687:   %[arXiv:astro-ph/0606154].
688:   %%CITATION = ASJOA,646,L1;%%
689: 
690: %\cite{Hinshaw:2006ia}
691: \bibitem[Hinshaw et al.(2007)]{Hinshaw:2006ia}
692:   G.~Hinshaw {\it et al.}  [WMAP Collaboration],
693:   %``Three-year Wilkinson Microwave Anisotropy Probe (WMAP) observations:
694:   %Temperature analysis,''
695:   Astrophys.\ J.\ Suppl.\  {\bf 170}, 288 (2007)
696:   %[arXiv:astro-ph/0603451].
697:   %%CITATION = APJSA,170,288;%%
698: 
699: %\cite{Jimenez:2004ct}
700: \bibitem[Jimenez et al.(2004)]{Jimenez:2004ct}
701:   R.~Jimenez, L.~Verde, H.~Peiris and A.~Kosowsky,
702:   %``Fast Cosmological Parameter Estimation from Microwave Background
703:   %Temperature and Polarization Power Spectra,''
704:   Phys.\ Rev.\  D {\bf 70}, 023005 (2004)
705:   %[arXiv:astro-ph/0404237].
706:   %%CITATION = PHRVA,D70,023005;%%
707: 
708: %\cite{Kaplinghat:2002mh}
709: \bibitem[Kaplinghat et al.(2002)]{Kaplinghat:2002mh}
710:   M.~Kaplinghat, L.~Knox and C.~Skordis,
711:   %``Rapid Calculation of Theoretical CMB Angular Power Spectra,''
712:   Astrophys.\ J.\  {\bf 578}, 665 (2002)
713:   %[arXiv:astro-ph/0203413].
714:   %%CITATION = ASJOA,578,665;%% 
715: 
716: \bibitem[Karhunen(1946)]{Karhunen:1946}
717: Karhunen, K.\ 1946, Ann.\ Acad.\ Sci.\ Fennicae, 37
718: 
719: \bibitem[Kirby(2001)]{Kirby:2001}
720: Kirby, M.\ 2001, Geometric Data Analysis: An Empirical Approach to Dimensionality Reduction
721: and the Study of Patterns (New York: John Wiley \& Sons)
722: 
723: %\cite{Kuo:2002ua}
724: \bibitem[Kuo et al.(2004)]{Kuo:2002ua}
725:   C.~L.~Kuo {\it et al.}  [ACBAR Collaboration],
726:   %``High Resolution Observations of the CMB Power Spectrum with ACBAR,''
727:   Astrophys.\ J.\  {\bf 600}, 32 (2004)
728:   %[arXiv:astro-ph/0212289].
729:   %%CITATION = ASJOA,600,32;%%
730: 
731: %\cite{Jones:2005yb}
732: \bibitem[Jones et al.(2006)]{Jones:2005yb}
733:   W.~C.~Jones {\it et al.},
734:   %``A Measurement of the Angular Power Spectrum of the CMB Temperature
735:   %Anisotropy from the 2003 Flight of Boomerang,''
736:   Astrophys.\ J.\  {\bf 647}, 823 (2006)
737:   %[arXiv:astro-ph/0507494].
738:   %%CITATION = ASJOA,647,823;%%
739: 
740: %\cite{Lewis:1999bs}
741: \bibitem[Lewis et al.(2000)]{Lewis:1999bs}
742:   A.~Lewis, A.~Challinor and A.~Lasenby,
743:   %``Efficient Computation of CMB anisotropies in closed FRW models,''
744:   Astrophys.\ J.\  {\bf 538}, 473 (2000)
745:   %[arXiv:astro-ph/9911177].
746:   %%CITATION = ASJOA,538,473;%%
747: 
748: %\cite{Lewis:2002ah}
749: \bibitem[Lewis \& Bridle(2002)]{Lewis:2002ah}
750:   A.~Lewis and S.~Bridle,
751:   %``Cosmological parameters from CMB and other data: a Monte-Carlo approach,''
752:   Phys.\ Rev.\  D {\bf 66}, 103511 (2002)
753:   %[arXiv:astro-ph/0205436].
754:   %%CITATION = PHRVA,D66,103511;%%
755: 
756: \bibitem[Lo\`{e}ve(1955)]{Loeve:1955}
757: Lo\`{e}ve, M.\ 1955, Probability Theory (Princeton: Van Nostrand)
758: 
759: \bibitem[MacQueen(1967)]{MacQueen:1967}
760: MacQueen, J.\ 1967, Proc. 5th Berkeley Symp. on Mathematical Statistics and 
761: Probability, 1, 281
762: 
763: %\cite{Montroy:2005yx}
764: \bibitem[Montroy et al.(2006)]{Montroy:2005yx}
765:   T.~E.~Montroy {\it et al.},
766:   %``A Measurement of the CMB  Spectrum from the 2003 Flight of BOOMERANG,''
767:   Astrophys.\ J.\  {\bf 647}, 813 (2006)
768:   %[arXiv:astro-ph/0507514].
769:   %%CITATION = ASJOA,647,813;%%
770: 
771: %\cite{Page:2006hz}
772: \bibitem[Page et al.(2007)]{Page:2006hz}
773:   L.~Page {\it et al.}  [WMAP Collaboration],
774:   %``Three year Wilkinson Microwave Anisotropy Probe (WMAP) observations:
775:   %Polarization analysis,''
776:   Astrophys.\ J.\ Suppl.\  {\bf 170}, 335 (2007)
777:   %[arXiv:astro-ph/0603450].
778:   %%CITATION = APJSA,170,335;%%
779: 
780: %\cite{Piacentini:2005yq}
781: \bibitem[Piacentini et al.(2006)]{Piacentini:2005yq}
782:   F.~Piacentini {\it et al.},
783:   %``A measurement of the polarization-temperature angular cross power spectrum
784:   %of the Cosmic Microwave Background from the 2003 flight of BOOMERANG,''
785:   Astrophys.\ J.\  {\bf 647}, 833 (2006)
786:   %[arXiv:astro-ph/0507507].
787:   %%CITATION = ASJOA,647,833;%%
788: 
789: %\cite{Readhead:2004gy}
790: \bibitem[Readhead et al.(2004)]{Readhead:2004gy}
791:   A.~C.~S.~Readhead {\it et al.},
792:   %``Extended Mosaic Observations with the Cosmic Background Imager,''
793:   Astrophys.\ J.\  {\bf 609}, 498 (2004)
794:   %[arXiv:astro-ph/0402359].
795:   %%CITATION = ASJOA,609,498;%%
796: 
797: %\cite{Sandvik:2003ii}
798: \bibitem[Sandvik et al.(2004)]{Sandvik:2003ii}
799:   H.~B.~Sandvik, M.~Tegmark, X.~M.~Wang and M.~Zaldarriaga,
800:   %``CMBfit: Rapid WMAP likelihood calculations with normal parameters,''
801:   Phys.\ Rev.\  D {\bf 69}, 063005 (2004)
802:   %[arXiv:astro-ph/0311544].
803:   %%CITATION = PHRVA,D69,063005;%%
804: 
805: %\cite{Seljak:1996is}
806: \bibitem[Seljak \& Zaldarriaga(1996)]{Seljak:1996is}
807:   U.~Seljak and M.~Zaldarriaga,
808:   %``A Line of Sight Approach to Cosmic Microwave Background Anisotropies,''
809:   Astrophys.\ J.\  {\bf 469}, 437 (1996)
810:   %[arXiv:astro-ph/9603033].
811:   %%CITATION = ASJOA,469,437;%%
812: 
813: %\cite{Spergel:2006hy}
814: \bibitem[Spergel et al.(2007)]{Spergel:2006hy}
815:   D.~N.~Spergel {\it et al.}  [WMAP Collaboration],
816:   %``Wilkinson Microwave Anisotropy Probe (WMAP) three year results:
817:   %Implications for cosmology,''
818:   Astrophys.\ J.\ Suppl.\  {\bf 170}, 377 (2007)
819:   %[arXiv:astro-ph/0603449].
820:   %%CITATION = APJSA,170,377;%%
821: 
822: %\cite{Tegmark:1994ed}
823: \bibitem[Tegmark \& Bunn(1995)]{Tegmark:1994ed}
824:   M.~Tegmark and E.~F.~Bunn,
825:   %``How should we analyze microwave sky maps?,''
826:   Astrophys.\ J.\  {\bf 455}, 1 (1995)
827:   %[arXiv:astro-ph/9412005].
828:   %%CITATION = ASJOA,455,1;%%
829:    
830: \end{thebibliography}
831: 
832: 
833: 
834: 
835: \end{document}
836: