0712:0712.0194/ms.tex

1: %\documentclass[10pt,preprint]{aastex}

2: \documentclass{emulateapj}

3:

4: \usepackage{amsmath}

5:

6: \begin{document}

7:

8: \title{Computing High Accuracy Power Spectra with Pico}

9:

10: \author{William A.~Fendt\altaffilmark{1} and

11:         Benjamin D.~Wandelt\altaffilmark{1,2,3}}

12:

13: \altaffiltext{1}{Department of Physics, UIUC, 1110 W Green Street,

14:              Urbana, IL 61801; fendt@uiuc.edu}

15: \altaffiltext{2}{Department of Astronomy, UIUC, 1002 W Green

16:              Street, Urbana, IL 61801; bwandelt@uiuc.edu}

17: \altaffiltext{3}{Center for Advanced Studies, UIUC, 912 W Illinois

18:              Street, Urbana, IL 61801}

19:

20: %%===========================================================================

21:

22: \begin{abstract}

23:

24: This paper presents the second release of Pico (Parameters for the Impatient

25: COsmologist).

26: Pico is a general purpose machine learning code which we have applied

27: to computing the CMB power spectra and the WMAP likelihood.

28: For this release,

29: we have made improvements to the algorithm as well as the data

30: sets used to train Pico,

31: leading to a significant improvement in accuracy.

32: For the $9$ parameter nonflat case presented here Pico can on average compute the

33: TT, TE and EE spectra to better than $1\%$ of cosmic standard deviation

34: for nearly all $\ell$ values over a large region of parameter space.

35: Performing a cosmological parameter analysis of current CMB and large scale

36: structure data, we show that these power spectra give very accurate $1$ and $2$

37: dimensional parameter posteriors.

38: We have extended Pico to allow computation of the tensor power

39: spectrum and the matter transfer function.

40: Pico runs about $1500$ times faster than CAMB at the default accuracy and

41: about $250{,}000$ times faster at high accuracy.

42: Training Pico can be done using massively parallel computing resources,

43: including distributed computing projects such as Cosmology@Home.

44: On the homepage for Pico, located at

45: \verb+http://cosmos.astro.uiuc.edu/pico+,

46: we provide new sets of regression coefficients and make the training

47: code available for public use.

48: \end{abstract}

49:

50: \keywords{cosmic microwave background --- cosmology: observations ---

51:           methods: numerical}

52:

53: %%===========================================================================

54:

55: \section{Introduction}\label{intro}

56: Given the quantity of data available from current experiments such as WMAP and SDSS as well

57: as the prospects for the next generation Planck and DES experiments on the horizon,

58: there is growing need for cosmologists to develop

59: tools that can accurately interpret the flood of data these experiments will gather.

60: A key component in all such analysis is the exploration of the posterior

61: density of the cosmological parameters given the available data.  This allows

62: us to constrain and test theoretical models of the Universe.

63: A major computational hurdle in this procedure is the ability to quickly and accurately

64: compute the power spectrum of CMB fluctuations and the matter transfer function.

65: This is accomplished using codes such as CMBFast \citep{Seljak:1996is} and

66: CAMB \citep{Lewis:1999bs}

67: that evolve the Boltzmann equation for the various constituents of the Universe.

68: Using the default accuracy settings in CAMB the calculation of a single power spectrum

69: takes on the order of a minute. At higher settings, as may be required by upcoming

70: CMB experiments, the computational time can jump to several hours

71: on a modern desktop. Furthermore, computation of constraints based on the data

72: requires evaluating the power spectrum at $\mathcal{O}\left(10^5 - 10^6\right)$

73: models.

74: Decreasing the time required to calculate the CMB power spectrum, while maintaining

75: sub-cosmic variance accuracy, will play an important part in

76: turning raw data into quantitative information about the history and structure of the

77: Universe.
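As a rough illustration of the scale involved, assume one minute per spectrum (the default-accuracy figure quoted above) and a single serial chain. Both are simplifying assumptions made here for the arithmetic, not measured timings:
\begin{equation*}
   10^{5}\times 60\,\mathrm{s} = 6\times10^{6}\,\mathrm{s} \approx 69\ \mathrm{days},
   \qquad
   10^{6}\times 60\,\mathrm{s} = 6\times10^{7}\,\mathrm{s} \approx 694\ \mathrm{days}.
\end{equation*}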

78:

79: Previous codes aimed at speeding up power spectrum computations

80: such as DASH \citep{Kaplinghat:2002mh}  and CMBwarp

81: \citep{Jimenez:2004ct} have attempted to reproduce the

82: computation of CMBFast or CAMB.

83: More recently, motivated to develop a code that was both faster than DASH and

84: more accurate than CMBwarp, we released

85: a machine learning code called Pico \citep{Fendt:2006uh}. It uses a

86: training set of power spectra from CAMB to fit several multivariate

87: polynomials as a function of the input parameters.  Along with accurately

88: fitting the power spectra, we also found that

89: Pico was able to directly fit the WMAP likelihood

90: \citep{Spergel:2006hy,Hinshaw:2006ia,Page:2006hz}.

91: This was done previously by CMBFit \citep{Sandvik:2003ii}

92: which used a similar idea of fitting the

93: likelihood with a polynomial in the cosmological parameters.

94: By replacing CAMB and the WMAP likelihood code with Pico we demonstrated that it

95: can quickly explore the parameter space and give nearly identical posteriors.

96: Since the first release of Pico, Auld \textit{et al.} have applied a neural

97: network code called CosmoNet to flat \citep{Auld:2006pm} and nonflat

98: \citep{Auld:2007qz} models.

99: Also Habib \textit{et al.} have demonstrated that a Gaussian process

100: model introduced by \citet{Heitmann:2006hr} can

101: predict the CMB power spectrum for flat models

102: based on a very small training set \citep{Habib:2007ca}.

103: Here we discuss some improvements to increase the accuracy of Pico

104: and extend its application to more general cosmological models.

105: Further, we describe a training method that avoids serial runs of CAMB

106: and is designed to leverage access to massively parallel and even

107: distributed computing resources.  For example, we demonstrate that

108: Pico can be trained using the thousands of geographically distinct hosts

109: that contribute to Cosmology@Home.\footnote{http://www.cosmologyathome.org}

110:

111: This paper is organized as follows. Section \ref{sec:algorithm} summarizes

112: some of the improvements to the Pico algorithm and how training sets are generated.

113: In section \ref{sec:results} we demonstrate the performance of Pico in

114: computing the power spectrum, matter transfer function and WMAP likelihood

115: for a $9$ parameter nonflat model.  We show that Pico can be used to

116: quickly explore the parameter posterior for this model while

117: accurately reproducing the $1$ and $2$ dimensional marginalized distributions.

118: Lastly we summarize and conclude in section \ref{sec:conclusion}.

119:

120:

121: \section{The Algorithm} \label{sec:algorithm}

122: \subsection{Overview}

123: Given a training set of cosmological parameters $\mathbf{x}$ and CMB power spectra

124: $\mathbf{y}$, Pico models the function $\mathbf{y}=f\left(\mathbf{x}\right)$ in

125: $3$ steps. First it compresses the power spectra using a

126: Karhunen-Lo\`{e}ve \citep{Karhunen:1946,Loeve:1955,Tegmark:1994ed} technique.

127: For $\ell_{\mathrm{max}}=3000$, the temperature and polarization spectra

128: due to scalar and tensor perturbations can be compressed to $\sim 180$ total numbers.

129: Next the algorithm clusters the input

130: parameters into non-overlapping regions. This is done using a $k$-means

131: algorithm \citep{MacQueen:1967,Kirby:2001} or by choosing hyperplanes

132: to manually partition the space.

133: Lastly, Pico models the function by fitting a least-squares polynomial

134: to the compressed power spectra over each cluster.

135: Details of the algorithm can be found in the appendix of \citet{Fendt:2006uh}.
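Schematically, and in notation introduced here only for illustration (the precise construction is the one given in the appendix cited above), the three steps can be summarized as
\begin{align*}
   \mathbf{y} &\approx \bar{\mathbf{y}} + \sum_{i=1}^{m} a_i\,\mathbf{v}_i, \\
   a_i\left(\mathbf{x}\right) &\approx p_i^{(c)}\left(\mathbf{x}\right)
      \quad \text{for } \mathbf{x} \text{ in cluster } c, \\
   p_i^{(c)} &= \underset{p\,\in\,\mathcal{P}_d}{\arg\min}
      \sum_{n \in c} \left[ a_i^{(n)} - p\left(\mathbf{x}^{(n)}\right) \right]^2 .
\end{align*}
Here $\bar{\mathbf{y}}$ denotes the mean training spectrum, $\mathbf{v}_i$ the leading Karhunen-Lo\`{e}ve modes, $m\sim180$ the number of retained coefficients, and $\mathcal{P}_d$ the set of polynomials of degree at most $d$ in the cosmological parameters. Whether the mean is subtracted before compression is an assumption of this sketch.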

136:

137: \subsection{Improved Fitting}

138: The first change to the algorithm is that it now uses the LAPACK libraries

139: \footnote{http://www.netlib.org/lapack/} to perform matrix decomposition.

140: LAPACK makes the algorithm more stable, allowing the use of higher order

141: polynomials. This gives Pico better fits at low $\ell$ ($<200$) and high

142: $\ell$ ($>1500$).  This also makes clustering

143: significantly less important but still useful for improving the low

144: $\ell$ accuracy as well as improving the fit to the WMAP likelihood directly.

145: In particular, we have found that partitioning the training set along values

146: of constant $\Omega_{\mathrm{k}}$ improves the low $\ell$ accuracy

147: in the temperature power spectrum and matter transfer function,

148: while partitioning in $\tau$, the

149: reionization depth, gives a large improvement in computing the

150: polarization power spectrum at low $\ell$.

151:

152:

153: \subsection{Decreasing Numerical Noise}\label{subsec:noise}

154: For nonflat models, fitting algorithms

155: are hindered by the fact that at the default accuracy the power spectra

156: computed by CAMB are numerically noisy.

157: This is demonstrated in Figure \ref{fig:cl_noise}, where

158: we have plotted the power spectrum

159: at the default and high accuracy levels

160: for various $\ell$-values as a function of a

161: single parameter ($\Omega_{\mathrm{b}}h^2$) for a nonflat model.

162: Since the power spectrum is not a numerically smooth function of the

163: cosmological parameters, Pico is limited in its ability to fit the

164: true spectra.

165: Also note that the higher accuracy power spectrum does not

166: always smooth over the lower accuracy case.

167: To reduce this noise and limit the effect of interpolation errors

168: we have generated training sets with CAMB using the accuracy parameters

169: set as: {\tt Accuracy=3}, {\tt lAccuracy=3} and {\tt lSamp=3}.

170: In Figure \ref{fig:camb_acc} we have plotted the error between

171: the default and high accuracy power spectra from CAMB for $25$

172: models around the peak of the WMAP likelihood. The left and right plots

173: show the percent error in the TT and EE spectra. Also shown is the error

174: between the spectra computed by Pico and the high accuracy CAMB runs.

175: Here Pico was trained on the $9$ parameter model discussed in

176: section \ref{sec:results}.

177: The figures demonstrate that when trained on high accuracy data Pico

178: computes the power spectrum around the peak of the likelihood to better

179: precision than CAMB at its default accuracy settings.

180:

181: \begin{figure}

182: \begin{center}

183:    \epsscale{1.15} \plotone{f1_color.eps}

184:    %\epsscale{1.15} \plotone{f1.eps}

185:    %\epsscale{1.15} \plottwo{f1_color.eps}{f1.eps}

186:    \caption{The plot shows the value of the temperature spectrum as a function of

187:             $\Omega_{\mathrm{b}}h^2$ at various $\ell$-values for

188:             a nonflat cosmology. The red ($+$) points correspond to

189:             the default CAMB accuracies ($1$,$1$,$1$) and the

190:             green ($\Box$) points correspond to higher accuracy

191:             settings ($3$,$3$,$1$).

192:             While at low $\ell$ the power spectrum is smooth, the

193:             default accuracy becomes numerically noisy at higher $\ell$.

194:             This is one reason adding $\Omega_{\mathrm{k}}$ as a free

195:             parameter increases the difficulty in fitting the power

196:             spectrum. Also plotted, as a blue line, is the power

197:             spectrum computed by Pico trained on the $9$ parameter

198:             nonflat model discussed in section \ref{sec:results}.

199:             \label{fig:cl_noise}}

200: \end{center}

201: \end{figure}

202:

203: \begin{figure}

204: \begin{center}

205:    \epsscale{1.15} \plotone{f2_color.eps}

206:    %\epsscale{1.15} \plotone{f2.eps}

207:    %\epsscale{1.15} \plottwo{f2_color.eps}{f2.eps}

208:    \caption{The plots show the percent error between the TT (left) and

209:             EE (right) power spectrum computed by CAMB at the default

210:             accuracy compared to those computed at high accuracy.

211:             Also shown is the

212:             percent error between the power spectra computed by Pico

213:             and the high accuracy CAMB spectra.

214:             This test was done on $25$ models all located within $25$

215:             log-likelihoods of the WMAP peak.

216:             \label{fig:camb_acc}}

217: \end{center}

218: \end{figure}

219:

220: Numerical noise in the power spectrum also

221: leads to noise in the WMAP likelihood as shown in Figure

222: \ref{fig:lnlike_noise}.

223: Again, this increases the difficulty in fitting the likelihood

224: with Pico.

225: As with the power spectrum, this is remedied by running CAMB at

226: higher accuracy.

227: While the level of noise introduced in the likelihood may not be

228: of significant concern when analyzing current CMB data sets, it may

229: represent an important hurdle to overcome for the next generation of

230: experiments.

231: Also evident from Figure \ref{fig:lnlike_noise} is that Pico provides

232: a smooth approximation to the noisy likelihood. This is an important property

233: for algorithms that require differentiating the likelihood such as

234: Hamiltonian Monte Carlo \citep{Hajian:2006mt}.

235:

236: \begin{figure}

237: \begin{center}

238:    \epsscale{1.15} \plotone{f3_color.eps}

239:    %\epsscale{1.15} \plotone{f3.eps}

240:    %\epsscale{1.15} \plottwo{f3_color.eps}{f3.eps}

241:    \caption{Value of $-\ln L_{\mathrm{WMAP}}$ as a function of

242:             $\Omega_{\mathrm{b}}h^2$ for a nonflat cosmology.

243:             Note that this is near the peak of the likelihood

244:             in the full space. The red ($+$)

245:             points correspond to the default CAMB accuracies ($1$,$1$,$1$)

246:             and the green ($\Box$) points correspond to higher

247:             accuracy settings ($3$,$3$,$1$). The blue line is the

248:             value computed by Pico trained over the $9$ parameter nonflat

249:             models discussed in section \ref{sec:results}.

250:             Note that using the default accuracy in CAMB gives a numerically

251:             noisy function, which can lead to variations of $1$ or more

252:             log-likelihoods, but Pico gives a smooth function through the high

253:             accuracy values.

254:             \label{fig:lnlike_noise}}

255: \end{center}

256: \end{figure}

257:

258:

259: \subsection{Generating the Training Set}\label{subsec:train}

260: As in the first release of Pico, we generate the training set of

261: power spectra and matter transfer function by sampling uniformly

262: from a large box.

263: After training Pico to compute the power spectra, evaluation of the

264: WMAP likelihood, which requires a few seconds, becomes the new bottleneck

265: in parameter estimation. Another significant speed up can be obtained by

266: using Pico to directly fit the likelihood function.

267: However, for the $9$ parameter case we examine in the next section this

268: is a difficult problem.

269: Training Pico over a box in parameter space

270: includes regions that are many thousands of log-likelihoods from the peak

271: and gives a very sparse sampling of the high likelihood region.

272: In practice we are not interested in these areas of parameter space.

273: Instead we aim to compute the likelihood very accurately around the peak

274: of the distribution. This requires generating a training set for Pico that

275: includes only the high likelihood region.

276:

277: A natural method of accomplishing this is to use the

278: Metropolis-Hastings algorithm to find points in the high likelihood region.

279: This can be done efficiently by running CosmoMC \citep{Lewis:2002ah} and

280: using Pico to compute the power spectra.

281: To ensure we cover a sufficient volume the chains are run

282: using only the WMAP data and at a higher temperature, meaning the log-likelihood

283: is scaled by a constant factor allowing the chains to explore a larger volume.

284: This step is dominated

285: by the time it takes to run the WMAP likelihood code.

286: Lastly we run the samples through CAMB and the WMAP code

287: to get the true likelihood which will be used to retrain Pico.

288: This step is also quick as it can easily be run in parallel.

289: It is useful to note that this procedure never requires running CAMB

290: in serial making it an ideal application for distributed computing

291: projects such as Cosmology@Home.

292: The training set can be further refined by pruning out data at low

293: likelihood.

294: In the following, Pico is trained to compute the likelihood on

295: points within $25$ log-likelihoods of the WMAP peak.
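The tempering and pruning steps described above can be sketched in a few lines of Python. This is an illustrative toy and not the CosmoMC implementation: \texttt{log\_like} is a hypothetical Gaussian stand-in for the WMAP log-likelihood, and the temperature, step size and chain length are arbitrary choices made for the example.

```python
import math
import random


def log_like(x):
    """Hypothetical stand-in for the WMAP log-likelihood: a unit Gaussian."""
    return -0.5 * sum(xi * xi for xi in x)


def tempered_metropolis(log_like, start, n_steps, temperature, step, seed=0):
    """Random-walk Metropolis sampler targeting L(x) ** (1 / temperature)."""
    rng = random.Random(seed)
    x = list(start)
    lnl = log_like(x)
    samples = []
    for _ in range(n_steps):
        proposal = [xi + rng.gauss(0.0, step) for xi in x]
        lnl_new = log_like(proposal)
        # Dividing the log-likelihood difference by T > 1 flattens the target,
        # so the chain explores a larger volume around the peak.
        if math.log(1.0 - rng.random()) < (lnl_new - lnl) / temperature:
            x, lnl = proposal, lnl_new
        samples.append((tuple(x), lnl))
    return samples


def prune(samples, max_delta):
    """Keep points whose log-likelihood lies within max_delta of the best one."""
    peak = max(lnl for _, lnl in samples)
    return [(x, lnl) for x, lnl in samples if peak - lnl <= max_delta]


# Nine parameters, as in the model considered in this paper.
chain = tempered_metropolis(log_like, start=[0.0] * 9, n_steps=5000,
                            temperature=3.0, step=0.5)
training_points = prune(chain, max_delta=25.0)
```

In the procedure described in the text, the stored log-likelihoods would then be replaced by values recomputed with CAMB and the WMAP code, a step that can be run in parallel because each point is independent.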

296:

297: \subsection{Polynomial Hierarchy} \label{subsec:hierarchy}

298: The process of generating the training set outlined in section \ref{subsec:train}

299: has the added benefit of giving us a set of power spectra constrained around the

300: peak of the likelihood.  We would like to make use of these points by adding them

301: to the power spectra training set. However adding a large weighting of points to

302: a small region of the box will have a negative effect on the accuracy of the

303: algorithm outside this region. Instead we have implemented the ability to use a

304: hierarchy of polynomials with Pico by separately training over the uniformly

305: sampled points in the full box and over only the points in the constrained region.

306: If Pico is given a set of input cosmological parameters within this region it

307: computes the power spectra based on a polynomial fit to this constrained region.

308: For points outside this region, Pico defaults to using the polynomials fit over

309: the full box.

310: While we have found that using only a single set of polynomials trained on the box

311: is sufficient for analysis of current experimental data, this

312: will be a useful feature in the future when data from higher resolution

313: experiments become available.
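The dispatch logic of such a hierarchy can be sketched as follows. The sketch assumes that each level is delimited by an axis-aligned box, which is a simplification introduced here (the text does not specify how membership in the constrained region is tested), and the class name, the inner bounds and the placeholder models are all hypothetical. Only the outer bounds on $\Omega_{\mathrm{b}}h^2$ and $\Omega_{\mathrm{cdm}}h^2$ are taken from Table \ref{tbl:param_bounds}.

```python
def in_box(x, lower, upper):
    """True if every coordinate of x lies within [lower, upper]."""
    return all(lo <= xi <= hi for xi, lo, hi in zip(x, lower, upper))


class PolynomialHierarchy:
    """Try the most specialised fit first, then fall back to broader ones."""

    def __init__(self):
        self.levels = []  # (lower, upper, model), most specific first

    def add_level(self, lower, upper, model):
        self.levels.append((lower, upper, model))

    def __call__(self, x):
        for lower, upper, model in self.levels:
            if in_box(x, lower, upper):
                return model(x)
        raise ValueError("point lies outside every training region")


hierarchy = PolynomialHierarchy()
# Hypothetical constrained region in (Omega_b h^2, Omega_cdm h^2).
hierarchy.add_level([0.020, 0.09], [0.024, 0.13], lambda x: "constrained fit")
# Full box for the same two parameters.
hierarchy.add_level([0.018, 0.06], [0.034, 0.20], lambda x: "full-box fit")

near_peak = hierarchy([0.022, 0.11])
far_from_peak = hierarchy([0.030, 0.18])
```

Ordering the levels from most to least specific keeps the fallback behaviour explicit: a point is handled by the first region that contains it.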

314:

315:

316: \section{Results} \label{sec:results}

317: Here we demonstrate the performance of Pico for nonflat cosmologies with

318: the dark energy equation of state, $w_{\mathrm{DE}}$, allowed to vary

319: (but still constant for a given model).

320: In this space Pico fits the power spectrum and likelihood as a function of

321: \begin{equation*}

322:    \left( \Omega_{\mathrm{b}}h^2, \Omega_{\mathrm{cdm}}h^2, \Omega_{\mathrm{k}},

323:           \theta, \tau, n_{\mathrm{s}}, \ln 10^{10} A_{\mathrm{s}},

324:           r, w_{\mathrm{DE}}

325:    \right).

326: \end{equation*}

327: The following sections study the accuracy of Pico in computing the power spectra,

328: matter transfer function and WMAP likelihood, as well as its application to parameter

329: estimation based on this $9$ parameter model.

330:

331: \subsection{Power Spectra and Matter Transfer Function}

332: In order to demonstrate Pico's accuracy and robustness we will test the algorithm

333: for two cases. The first case implements the hierarchy method discussed in

334: section \ref{subsec:hierarchy}. Here the training set is divided into two pieces.

335: The first contains $\sim18000$ samples generated uniformly from the box defined in

336: Table \ref{tbl:param_bounds}, and the second set consists of the $15000$ points

337: constrained to $25$ log-likelihoods from the peak of the WMAP likelihood.

338: For this case the test set consists of $\sim2000$ points taken from the latter

339: training set.  These points were removed from the training set and not

340: used to train Pico.

341:

342: The models in the training set were run through CAMB at accuracy settings

343: ($3$,$3$,$3$) to compute the true power spectra and transfer functions.

344: As the $\ell$ and $k$ sampling used by CAMB is model dependent it is necessary

345: to spline the power spectra and transfer function so that each is computed

346: at the same $\ell$ or $k$ value. The $\ell$-values were chosen to be those

347: used by CAMB for flat models with \texttt{lSamp}$=3$. For the transfer

348: function we used a uniform sampling in $\ln k$.

349: Pico was trained using $6^{\mathrm{th}}$ order polynomials,

350: requiring less than $30$ minutes on a $2.4$\,GHz desktop.
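For orientation, if each fit includes every monomial up to total degree $d$ in $n$ parameters (an assumption made here, since the text does not list which terms are retained), the number of coefficients per fitted quantity is
\begin{equation*}
   N_{\mathrm{coeff}} = \binom{n+d}{d}, \qquad \binom{9+6}{6} = 5005 .
\end{equation*}
This is smaller than the training-set sizes quoted above, so the least-squares problem stays overdetermined as long as the set is not partitioned too finely.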

351:

352: Pico's performance on this test set is shown in Figure \ref{fig:openw}.

353: The top $2$ rows show the TT, TE and EE power spectra with the second

354: row focusing on low $\ell$. Results for the BB spectra and matter transfer

355: function are shown in the third row.

356: The two lines in each plot represent the mean error and the error bar that bounds

357: $99\%$ of the test set. For the power spectra the error is plotted in units of

358: the cosmic standard deviation and for the matter transfer function the $y$-axis

359: shows percent error.

360: From the figure we see that over the volume of parameter space important for

361: CMB parameter estimation Pico can compute the power spectra for $99\%$ of

362: models in the test set to better than $4\%$ of cosmic standard deviation,

363: with the mean error around $0.5\%$, over most $\ell$-values.

364: Even the worst fits to the EE power spectra, which occur just after the

365: reionization bump, are only about $25\%$ of the cosmic standard deviation.

366: We also note that many of the models that are fit poorly have very low power

367: in this region so no experiment should be sensitive to these errors.

368: For the transfer function the $99\%$ error bar

369: is around $0.25\%$, and the mean at $0.02\%$, except at very low $k$.

370: This should be sufficient for analysis of data from the next generation

371: of large scale structure experiments.
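For reference, a common convention for the cosmic standard deviation, assumed here because the normalization is not spelled out in the text, is the full-sky, noise-free expression
\begin{equation*}
   \sigma_\ell^{XX} = \sqrt{\frac{2}{2\ell+1}}\; C_\ell^{XX},
   \qquad
   \sigma_\ell^{TE} = \sqrt{\frac{C_\ell^{TT} C_\ell^{EE} + \left(C_\ell^{TE}\right)^2}{2\ell+1}},
\end{equation*}
with $X \in \{T, E, B\}$. Under this convention the plotted error would be $\left| C_\ell^{\mathrm{Pico}} - C_\ell^{\mathrm{CAMB}} \right| / \sigma_\ell$.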

372:

373: \begin{table}[ht]

374: \begin{center}

375: \begin{tabular}{|ccccc|}

376:    \hline

377:    $0.018$ & $<$ & $\Omega_\mathrm{b} h^2$   & $<$ & $0.034$ \\

378:    $0.06$  & $<$ & $\Omega_\mathrm{cdm} h^2$ & $<$ & $0.2$   \\

379:    $-0.3$  & $<$ & $\Omega_\mathrm{k}$       & $<$ & $0.3$   \\

380:    $1.02$  & $<$ & $100\,\theta$             & $<$ & $1.08$  \\

381:    $0.01$  & $<$ & $\tau$                    & $<$ & $0.55$  \\

382:    $0.85$  & $<$ & $n_{\mathrm{s}}$          & $<$ & $1.25$  \\

383:    $2.75$  & $<$ & $\ln \left(10^{10} A_{\mathrm{s}} \right) $ & $<$ & $4.0$ \\

384:    $0$     & $<$ & $r$                       & $<$ & $2$     \\

385:    $-1.5$  & $<$ & $w_{\mathrm{DE}}$         & $<$ & $-0.3$  \\

386:    \hline

387: \end{tabular}

388: \end{center}

389: \caption{Parameter bounds defining the box the training set was sampled

390:          from for the example in section \ref{sec:results}. This encompasses a volume

391:          of at least $3\sigma$ in each parameter around the WMAP maximum likelihood.

392:          Note that we also impose the prior that the corresponding Hubble constant for each

393:          parameter point lie in the interval $\left[30,100\right]$ which excludes some

394:          regions inside the box.

395:          \label{tbl:param_bounds}}

396: \end{table}

397:

398: \begin{figure}

399: \begin{center}

400:    \epsscale{1.15} \plotone{f4_color.eps}

401:    %\epsscale{1.15} \plotone{f4.eps}

402:    %\epsscale{1.15} \plottwo{f4_color.eps}{f4.eps}

403:    \caption{The above plots compare the performance of Pico with CAMB at

404:             high accuracy settings for 9 parameter nonflat models with

405:             $w_{\mathrm{DE}}\ne -1$. Pico was trained using the hierarchy

406:             method described in section \ref{subsec:hierarchy} and the

407:             test set consists of $2000$ points within $25$ log-likelihoods

408:             of the WMAP peak.

409:             The top two rows show the error

410:             compared with CAMB in units of cosmic

411:             standard deviation for the TT, TE and EE power spectra at

412:             high $\ell$ (top) and low $\ell$ (center).

413:             The bottom row shows the error in the BB spectra in units of

414:             the cosmic standard deviation and the percent error in the

415:             matter transfer function.

416:             The two lines on each plot denote the mean error and the error

417:             bar that bounds $99\%$ of the test set.

418:             We note that much of the error at low $\ell$ in the EE spectra

419:             is due to the $1\%$ of models with extremely low power over this range.

420:             These spectra are too small to detect even with Planck.

421:             \label{fig:openw}}

422: \end{center}

423: \end{figure}

424:

425: For the second test case the hierarchy method is not used and Pico is only

426: trained on a uniform sample of points from the box in

427: Table \ref{tbl:param_bounds}. For this case the test set consists of

428: a uniform sample of $\sim2000$ points from the same box.

429: Pico's performance on this test set is shown in Figure \ref{fig:openw-box}.

430: We include this case only to allow comparison with other codes.

431: When Pico is used to explore the parameter posterior based on CMB constraints,

432: which is its main purpose, chains will rarely propose points outside the

433: constrained volume used in the hierarchy method.  Therefore

434: Figure \ref{fig:openw} provides a better indicator of the types of error

435: incurred by using Pico to compute the power spectra.

436: The regression files on the Pico website implement the hierarchy method.

437:

438: \begin{figure}

439: \begin{center}

440:    \epsscale{1.15} \plotone{f5_color.eps}

441:    %\epsscale{1.15} \plotone{f5.eps}

442:    %\epsscale{1.15} \plottwo{f5_color.eps}{f5.eps}

443:    \caption{The plots are the same as those in Figure \ref{fig:openw}

444:             except here Pico was trained and tested over models

445:             sampled uniformly from the box defined by

446:             Table \ref{tbl:param_bounds}. Even over this larger

447:             region Pico can compute the power spectrum in $99\%$

448:             of the test cases to better

449:             than $5\%$ of cosmic standard deviation over most $\ell$

450:             and is never worse than $0.7$ cosmic standard deviation.

451:             \label{fig:openw-box}}

452: \end{center}

453: \end{figure}

454:

455:

456: \subsection{WMAP Likelihood}

457: Next we test the computation of the WMAP likelihood from Pico for two cases.  The

458: first uses Pico to compute the power spectrum and then the WMAP code to compute

459: the likelihood and the second uses Pico to directly compute the likelihood.

460: The training set for the likelihood computation consists of $\sim15000$ points generated

461: using the method described in section \ref{subsec:train}.

462: Another $\sim2000$ points, generated using the same method, were used as a test set.

463: The absolute error in the likelihood is shown in Figure \ref{fig:lnlike}.

464: The plot on the left shows the results of using Pico to compute the power spectrum

465: while the plot on the right shows the results of directly computing the likelihood.

466: For the case of directly evaluating the likelihood, Pico can compute about

467: $90\%$ of the test set better than $0.25$ log-likelihoods. When only using Pico to

468: compute the power spectrum the results are within $0.25$ log-likelihoods for

469: better than $99.5\%$ of the models.

470: The training set and test set were computed using

471: high accuracy CAMB runs and version v2p2p2 of the WMAP likelihood

472: code.\footnote{http://lambda.gsfc.nasa.gov}

473:

474: \begin{figure}

475: \begin{center}

476:    \epsscale{1.15} \plotone{f6_color.eps}

477:    %\epsscale{1.15} \plotone{f6.eps}

478:    %\epsscale{1.15} \plottwo{f6_color.eps}{f6.eps}

479:    \caption{The plots show the absolute error when evaluating the

480:             WMAP likelihood with Pico.  In the left plot Pico was

481:             used to compute the power spectra, which were fed into

482:             the WMAP code. The right plot shows the absolute error

483:             when Pico directly evaluates the likelihood. In both

484:             cases the likelihood is compared to the value of the

485:             WMAP code using high accuracy CAMB power spectra.

486:             \label{fig:lnlike}}

487: \end{center}

488: \end{figure}

489:

490:

491: \subsection{Parameter Estimation}

492: To test the application of Pico to this $9$ parameter model we ran Markov chains

493: using CosmoMC with the

494: WMAP \citep{Hinshaw:2006ia,Page:2006hz},

495: ACBAR \citep{Kuo:2002ua},

496: CBI \citep{Readhead:2004gy},

497: Boomerang \citep{Montroy:2005yx,Jones:2005yb,Piacentini:2005yq},

498: $2$df \citep{Cole:2005sx},

499: SDSS \citep{AdelmanMcCarthy:2005se},

500: and SNLS \citep{Astier:2005qq}

501: data sets.

502: Chains were run for $3$ cases. The first uses CAMB and the WMAP likelihood

503: code (CAMB$+$WMAP), the second uses Pico to compute the power spectra and

504: transfer function but still uses the official likelihood codes (PICO$+$WMAP)

505: and the third case uses Pico to compute the power spectra, transfer function

506: and the WMAP likelihood (PICO). In the third case we did not use Pico to fit

507: the $2$df, SDSS or SNLS likelihood codes. The $1$-dimensional, marginalized

508: posteriors for each of the $9$ parameters are shown along the diagonal

509: in Figure \ref{fig:openw-post}. The plots in the lower (upper) triangle

510: in the figure compare the $2$ dimensional posteriors between the

511: PICO$+$WMAP (PICO) case and the CAMB$+$WMAP case. In all of the plots the

512: CAMB$+$WMAP results are shown in red, the PICO$+$WMAP results in green and

513: the PICO results in blue.

514: The lines in the 2D plots denote the $68\%$ and $99\%$ contours.

515: Using $6$ chains run in parallel, the PICO$+$WMAP chains ran about $60$ times

516: faster than without Pico, requiring about $4$ hours of wall clock time.

517: Using Pico to compute the WMAP likelihood gave another factor of $2.5$ decrease

518: in CPU time giving a total speed up of $\sim150$ over chains run without Pico.

519: In all cases the chains finished with a Gelman-Rubin statistic less than $1.01$.
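For readers unfamiliar with this convergence diagnostic, the following Python sketch computes the textbook single-parameter Gelman-Rubin statistic from several chains. It is a generic illustration on synthetic Gaussian draws and may differ in detail from the diagnostic that CosmoMC reports. The function name and the toy chains are hypothetical.

```python
import random
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor for a single parameter.

    chains: list of equal-length lists of samples, one list per chain.
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: mean of the within-chain variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: n times the variance of the chain means.
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(1)
# Six chains drawn from the same distribution, mimicking converged chains.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(4000)] for _ in range(6)]
# Shift one chain to mimic a run that has not converged.
unmixed = mixed[:5] + [[v + 1.0 for v in mixed[5]]]

r_mixed = gelman_rubin(mixed)
r_unmixed = gelman_rubin(unmixed)
```

Chains sampling the same distribution give a value close to $1$, while a chain that has settled in a different region inflates the between-chain variance and pushes the statistic well above the $1.01$ threshold.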

520:

521: \begin{figure*}[p]

522: \begin{center}

523:    \epsscale{1.15} \plotone{f7_color.eps}

524:    \caption{The cosmological parameter posteriors using CAMB and Pico for $9$ parameter

525:             nonflat models with $w_{\mathrm{DE}}\ne -1$ based on the WMAP, ACBAR,

526:             CBI, Boomerang, SDSS, $2$df and SNLS data sets. The red lines are

527:             the result of using CAMB and the WMAP likelihood code. The green

528:             lines use Pico to compute the power spectrum but still use the

529:             WMAP likelihood. The blue lines are the result from using Pico to

530:             compute the power spectrum and WMAP likelihood. The plots in the

531:             lower triangle show the $68\%$ and $99\%$ contours for the chains

532:             run using CAMB and PICO with the WMAP likelihood code. The upper

533:             triangle compares the CAMB chains to those using Pico to compute

534:             the power spectra and WMAP likelihood.

535:             \label{fig:openw-post}}

536: \end{center}

537: \end{figure*}

538:

539: \section{Conclusion} \label{sec:conclusion}

540: This paper describes a major new release of Pico, a fast and accurate code for computing

541: the CMB power spectrum, matter transfer function and the WMAP likelihood.

542: We noted the presence of numerical noise

543: in CAMB at standard accuracy for nonflat models and its effect on the WMAP likelihood.

544: To solve this problem we have generated training sets running CAMB at high accuracy

545: settings.

546: We have also presented a method of generating a training set that finds

547: the high likelihood region of parameter space without ever running CAMB in serial.

548: This is especially useful for training Pico to fit the WMAP likelihood in large

549: dimensional spaces.

550: Furthermore, Pico can be trained separately on the power spectra in this smaller region of

551: parameter space allowing even more accurate results around the peak of the likelihood

552: while still maintaining the ability to compute the power spectra over a large box in

553: parameter space.

554: The combination of these improvements, along with modifications to the Pico algorithm,

555: has increased its accuracy in computing the power spectrum and likelihood.

556: We have also extended Pico to compute the power spectrum due to tensor perturbations

557: as well as the matter power spectrum.

558: On the Pico homepage,\footnote{http://cosmos.astro.uiuc.edu/pico}

559: we provide the new version of Pico and new sets of regression coefficients.

560: We have also released the training code for Pico, allowing users to apply

561: the algorithm to new classes of models and parameter sets.

562: We expect that the accuracy and speed achieved by Pico will be useful for current

563: and future CMB and large scale structure observations. Furthermore, we hope that

564: the concept, embodied by Pico, of exploiting massively parallel computing

565: resources to solve inherently serial numerical problems will find applications

566: beyond the immediate domain of cosmological parameter estimation.

567:

568:

569: \acknowledgements

570: This work was partially funded by NSF grants AST 05-07676 and AST 07-08849,

571: by NASA contract JPL1236748, by the National Computational Science Alliance

572: under AST300029N, by the University of Illinois, by the Computational Science

573: and Engineering Department at the University of Illinois and by a Friedrich Wilhelm

574: Bessel research prize from the Alexander von Humboldt Foundation.

575: We utilized the TeraGrid \citep{Catlett}

576: Itanium 2 clusters at NCSA and at Argonne National Laboratory,

577: as well as the Turing cluster

578: in the Computational Science and Engineering Department at the

579: University of Illinois at Urbana-Champaign.

580:

581: We thank the Max Planck Institute for Astrophysics for its hospitality while part

582: of this work was done.

583: We also thank the users of the Cosmology@Home\footnote{http://www.cosmologyathome.org}

584: project whose donated CPU hours

585: helped make this work possible.\footnote{http://www.cosmologyathome.org/top\_users.php}

586: In particular we would like to thank Scott Kruger, the administrator of Cosmology@Home,

587: as well as the users laurenu2, PoorBoy,

588: $\left[\right.$B$\hat{\;\;}$S$\left.\right]$ralfi65, Mitchell, and Mike The Great

589: as representatives of all Cosmology@Home participants.  Lastly we would like to thank Nikita Sorokin

590: for his work in designing the Pico homepage.

591:

592: Funding for the Sloan Digital Sky Survey (SDSS) has been provided by the Alfred P. Sloan

593: Foundation, the Participating Institutions, the National Aeronautics and Space Administration,

594: the National Science Foundation, the U.S. Department of Energy, the Japanese Monbukagakusho, and

595: the Max Planck Society. The SDSS Web site is \verb+http://www.sdss.org/+.

596: The SDSS is managed by the Astrophysical Research Consortium (ARC) for the Participating

597: Institutions. The Participating Institutions are The University of Chicago, Fermilab, the

598: Institute for Advanced Study, the Japan Participation Group, The Johns Hopkins University, the

599: Korean Scientist Group, Los Alamos National Laboratory, the Max-Planck-Institute for Astronomy

600: (MPIA), the Max-Planck-Institute for Astrophysics (MPA), New Mexico State University, University

601: of Pittsburgh, University of Portsmouth, Princeton University, the United States Naval

602: Observatory, and the University of Washington.

603:

604: \pagebreak[4]

605:

606: \begin{thebibliography}{99}

607:

608: %\cite{AdelmanMcCarthy:2005se}

609: \bibitem[Adelman-McCarthy et al.(2006)]{AdelmanMcCarthy:2005se}

610:   J.~K.~Adelman-McCarthy {\it et al.}  [SDSS Collaboration],

611:   %``The Fourth Data Release of the Sloan Digital Sky Survey,''

612:   Astrophys.\ J.\ Suppl.\  {\bf 162}, 38 (2006)

613:   %[arXiv:astro-ph/0507711].

614:   %%CITATION = APJSA,162,38;%%

615:

616:

617: %\cite{Astier:2005qq}

618: \bibitem[Astier et al.(2005)]{Astier:2005qq}

619:   P.~Astier {\it et al.}  [The SNLS Collaboration],

620:   %``The Supernova Legacy Survey: Measurement of Omega_M, Omega_Lambda and w

621:   %from the First Year Data Set,''

622:   Astron.\ Astrophys.\  {\bf 447}, 31 (2006)

623:   %[arXiv:astro-ph/0510447].

624:   %%CITATION = AAEJA,447,31;%%

625:

626: %\cite{Auld:2006pm}

627: \bibitem[Auld et al.(2006)]{Auld:2006pm}

628:   T.~Auld, M.~Bridges, M.~P.~Hobson and S.~F.~Gull,

629:   %``Fast cosmological parameter estimation using neural networks,''

630:   Mon.\ Not.\ Roy.\ Astron.\ Soc.\ Lett.\  {\bf 376}, L11 (2007)

631:   %[arXiv:astro-ph/0608174].

632:   %%CITATION = 00482,376,L11;%%

633:

634: %\cite{Auld:2007qz}

635: \bibitem[Auld et al.(2007)]{Auld:2007qz}

636:   T.~Auld, M.~Bridges and M.~P.~Hobson,

637:   %``{\sc CosmoNet}: fast cosmological parameter estimation in non-flat models

638:   %using neural networks,''

639:   arXiv:astro-ph/0703445.

640:   %%CITATION = ASTRO-PH/0703445;%%

641:

642: \bibitem[Catlett et al.(2007)]{Catlett}

643:   C.~Catlett {\it et al.},

644:   %``TeraGrid: Analysis of Organization, System Architecture, and Middleware

645:   %  Enabling New Types of Applications,''

646:   HPC and Grids in Action, Ed. Lucio Grandinetti,

647:   IOS Press `Advances in Parallel Computing' series, Amsterdam (2007)

648:

649: %\cite{Cole:2005sx}

650: \bibitem[Cole et al.(2005)]{Cole:2005sx}

651:   S.~Cole {\it et al.}  [The 2dFGRS Collaboration],

652:   %``The 2dF Galaxy Redshift Survey: Power-spectrum analysis of the final

653:   %dataset and cosmological implications,''

654:   Mon.\ Not.\ Roy.\ Astron.\ Soc.\  {\bf 362}, 505 (2005)

655:   %[arXiv:astro-ph/0501174].

656:   %%CITATION = MNRAA,362,505;%%

657:

658: %\cite{Fendt:2006uh}

659: \bibitem[Fendt \& Wandelt(2007)]{Fendt:2006uh}

660:   W.~A.~Fendt and B.~D.~Wandelt,

661:   %``Pico: Parameters for the Impatient Cosmologist,''

662:   Astrophys.\ J.\  {\bf 654}, 2 (2007)

663:   %[arXiv:astro-ph/0606709].

664:   %%CITATION = ASJOA,654,2;%%

665:

666: %\cite{Habib:2007ca}

667: \bibitem[Habib et al.(2007)]{Habib:2007ca}

668:   S.~Habib, K.~Heitmann, D.~Higdon, C.~Nakhleh and B.~Williams,

669:   %``Cosmic calibration: Constraints from the matter power spectrum and the

670:   %cosmic microwave background,''

671:   arXiv:astro-ph/0702348.

672:   %%CITATION = ASTRO-PH/0702348;%%

673:

674: %\cite{Hajian:2006mt}

675: \bibitem[Hajian(2007)]{Hajian:2006mt}

676:   A.~Hajian,

677:   %``Efficient Cosmological Parameter Estimation with Hamiltonian Monte Carlo,''

678:   Phys.\ Rev.\  D {\bf 75}, 083525 (2007)

679:   %[arXiv:astro-ph/0608679].

680:   %%CITATION = PHRVA,D75,083525;%%

681:

682: %\cite{Heitmann:2006hr}

683: \bibitem[Heitmann et al.(2006)]{Heitmann:2006hr}

684:   K.~Heitmann, D.~Higdon, C.~Nakhleh and S.~Habib,

685:   %``Cosmic Calibration,''

686:   Astrophys.\ J.\  {\bf 646}, L1 (2006)

687:   %[arXiv:astro-ph/0606154].

688:   %%CITATION = ASJOA,646,L1;%%

689:

690: %\cite{Hinshaw:2006ia}

691: \bibitem[Hinshaw et al.(2007)]{Hinshaw:2006ia}

692:   G.~Hinshaw {\it et al.}  [WMAP Collaboration],

693:   %``Three-year Wilkinson Microwave Anisotropy Probe (WMAP) observations:

694:   %Temperature analysis,''

695:   Astrophys.\ J.\ Suppl.\  {\bf 170}, 288 (2007)

696:   %[arXiv:astro-ph/0603451].

697:   %%CITATION = APJSA,170,288;%%

698:

699: %\cite{Jimenez:2004ct}

700: \bibitem[Jimenez et al.(2004)]{Jimenez:2004ct}

701:   R.~Jimenez, L.~Verde, H.~Peiris and A.~Kosowsky,

702:   %``Fast Cosmological Parameter Estimation from Microwave Background

703:   %Temperature and Polarization Power Spectra,''

704:   Phys.\ Rev.\  D {\bf 70}, 023005 (2004)

705:   %[arXiv:astro-ph/0404237].

706:   %%CITATION = PHRVA,D70,023005;%%

707:

708: %\cite{Kaplinghat:2002mh}

709: \bibitem[Kaplinghat et al.(2002)]{Kaplinghat:2002mh}

710:   M.~Kaplinghat, L.~Knox and C.~Skordis,

711:   %``Rapid Calculation of Theoretical CMB Angular Power Spectra,''

712:   Astrophys.\ J.\  {\bf 578}, 665 (2002)

713:   %[arXiv:astro-ph/0203413].

714:   %%CITATION = ASJOA,578,665;%%

715:

716: \bibitem[Karhunen(1946)]{Karhunen:1946}

717: Karhunen, K.\ 1946, Ann.\ Acad.\ Sci.\ Fennicae, 37

718:

719: \bibitem[Kirby(2001)]{Kirby:2001}

720: Kirby, M.\ 2001, Geometric Data Analysis: An Empirical Approach to Dimensionality Reduction

721: and the Study of Patterns (New York: John Wiley \& Sons)

722:

723: %\cite{Kuo:2002ua}

724: \bibitem[Kuo et al.(2004)]{Kuo:2002ua}

725:   C.~L.~Kuo {\it et al.}  [ACBAR collaboration],

726:   %``High Resolution Observations of the CMB Power Spectrum with ACBAR,''

727:   Astrophys.\ J.\  {\bf 600}, 32 (2004)

728:   %[arXiv:astro-ph/0212289].

729:   %%CITATION = ASJOA,600,32;%%

730:

731: %\cite{Jones:2005yb}

732: \bibitem[Jones et al.(2006)]{Jones:2005yb}

733:   W.~C.~Jones {\it et al.},

734:   %``A Measurement of the Angular Power Spectrum of the CMB Temperature

735:   %Anisotropy from the 2003 Flight of Boomerang,''

736:   Astrophys.\ J.\  {\bf 647}, 823 (2006)

737:   %[arXiv:astro-ph/0507494].

738:   %%CITATION = ASJOA,647,823;%%

739:

740: %\cite{Lewis:1999bs}

741: \bibitem[Lewis et al.(2000)]{Lewis:1999bs}

742:   A.~Lewis, A.~Challinor and A.~Lasenby,

743:   %``Efficient Computation of CMB anisotropies in closed FRW models,''

744:   Astrophys.\ J.\  {\bf 538}, 473 (2000)

745:   %[arXiv:astro-ph/9911177].

746:   %%CITATION = ASJOA,538,473;%%

747:

748: %\cite{Lewis:2002ah}

749: \bibitem[Lewis \& Bridle(2002)]{Lewis:2002ah}

750:   A.~Lewis and S.~Bridle,

751:   %``Cosmological parameters from CMB and other data: a Monte-Carlo approach,''

752:   Phys.\ Rev.\  D {\bf 66}, 103511 (2002)

753:   %[arXiv:astro-ph/0205436].

754:   %%CITATION = PHRVA,D66,103511;%%

755:

756: \bibitem[Lo\`{e}ve(1955)]{Loeve:1955}

757: Lo\`{e}ve, M.\ 1955, Probability Theory (Princeton: Van Nostrand)

758:

759: \bibitem[MacQueen(1967)]{MacQueen:1967}

760: MacQueen, J.\ 1967, Proc. 5th Berkeley Symp. on Mathematical Statistics and

761: Probability, 1, 281

762:

763: %\cite{Montroy:2005yx}

764: \bibitem[Montroy et al.(2006)]{Montroy:2005yx}

765:   T.~E.~Montroy {\it et al.},

766:   %``A Measurement of the CMB  Spectrum from the 2003 Flight of BOOMERANG,''

767:   Astrophys.\ J.\  {\bf 647}, 813 (2006)

768:   %[arXiv:astro-ph/0507514].

769:   %%CITATION = ASJOA,647,813;%%

770:

771: %\cite{Page:2006hz}

772: \bibitem[Page et al.(2007)]{Page:2006hz}

773:   L.~Page {\it et al.}  [WMAP Collaboration],

774:   %``Three year Wilkinson Microwave Anisotropy Probe (WMAP) observations:

775:   %Polarization analysis,''

776:   Astrophys.\ J.\ Suppl.\  {\bf 170}, 335 (2007)

777:   %[arXiv:astro-ph/0603450].

778:   %%CITATION = APJSA,170,335;%%

779:

780: %\cite{Piacentini:2005yq}

781: \bibitem[Piacentini et al.(2006)]{Piacentini:2005yq}

782:   F.~Piacentini {\it et al.},

783:   %``A measurement of the polarization-temperature angular cross power spectrum

784:   %of the Cosmic Microwave Background from the 2003 flight of BOOMERANG,''

785:   Astrophys.\ J.\  {\bf 647}, 833 (2006)

786:   %[arXiv:astro-ph/0507507].

787:   %%CITATION = ASJOA,647,833;%%

788:

789: %\cite{Readhead:2004gy}

790: \bibitem[Readhead et al.(2004)]{Readhead:2004gy}

791:   A.~C.~S.~Readhead {\it et al.},

792:   %``Extended Mosaic Observations with the Cosmic Background Imager,''

793:   Astrophys.\ J.\  {\bf 609}, 498 (2004)

794:   %[arXiv:astro-ph/0402359].

795:   %%CITATION = ASJOA,609,498;%%

796:

797: %\cite{Sandvik:2003ii}

798: \bibitem[Sandvik et al.(2004)]{Sandvik:2003ii}

799:   H.~B.~Sandvik, M.~Tegmark, X.~M.~Wang and M.~Zaldarriaga,

800:   %``CMBfit: Rapid WMAP likelihood calculations with normal parameters,''

801:   Phys.\ Rev.\  D {\bf 69}, 063005 (2004)

802:   %[arXiv:astro-ph/0311544].

803:   %%CITATION = PHRVA,D69,063005;%%

804:

805: %\cite{Seljak:1996is}

806: \bibitem[Seljak \& Zaldarriaga(1996)]{Seljak:1996is}

807:   U.~Seljak and M.~Zaldarriaga,

808:   %``A Line of Sight Approach to Cosmic Microwave Background Anisotropies,''

809:   Astrophys.\ J.\  {\bf 469}, 437 (1996)

810:   %[arXiv:astro-ph/9603033].

811:   %%CITATION = ASJOA,469,437;%%

812:

813: %\cite{Spergel:2006hy}

814: \bibitem[Spergel et al.(2007)]{Spergel:2006hy}

815:   D.~N.~Spergel {\it et al.}  [WMAP Collaboration],

816:   %``Wilkinson Microwave Anisotropy Probe (WMAP) three year results:

817:   %Implications for cosmology,''

818:   Astrophys.\ J.\ Suppl.\  {\bf 170}, 377 (2007)

819:   %[arXiv:astro-ph/0603449].

820:   %%CITATION = APJSA,170,377;%%

821:

822: %\cite{Tegmark:1994ed}

823: \bibitem[Tegmark \& Bunn(1995)]{Tegmark:1994ed}

824:   M.~Tegmark and E.~F.~Bunn,

825:   %``How should we analyze microwave sky maps?,''

826:   Astrophys.\ J.\  {\bf 455}, 1 (1995)

827:   %[arXiv:astro-ph/9412005].

828:   %%CITATION = ASJOA,455,1;%%

829:

830: \end{thebibliography}

831:

832:

833:

834:

835: \end{document}

836: