1: %Posted to archive: 08--6--22

2:

3: \documentclass[11pt]{article}

4:

5: \pdfoutput=1

6:

7: \usepackage{graphicx}

8: \usepackage[small]{caption}

9:

10: \newcommand\tf{t_{\!f}}

11: \newcommand\tp{t_p}

12:

13: \def\cmc#1{{\tt #1}}

14: \def\urldot{.\discretionary{}{}{}}

15: \def\urlslash{/\discretionary{}{}{}}

16:

17: \setlength{\textwidth}{6.5in}

18: \setlength{\oddsidemargin}{0.0in}

19: \setlength{\evensidemargin}{0.0in}

20: \setlength{\textheight}{9.0in}

21: \setlength{\topmargin}{0.0in}

22: \setlength{\headheight}{0.0in}

23: \setlength{\headsep}{0.0in}

24:

25: \begin{document}

26:

27: \title{{\bf Predicting future duration from present age:\\

28: Revisiting a critical assessment of Gott's rule}}

29:

30: \author{{\Large Carlton M.~Caves}

31: \\

32: \\

33: Department of Physics and Astronomy, MSC07--4220, University of New Mexico,\\

34: Albuquerque, New Mexico 87131-0001, USA\\

35: \\

36: and\\

37: \\

38: Department of Physics, University of Queensland,\\

39: Brisbane, Queensland 4072, Australia

40: \\

41: \\

42: E-mail: caves@info.phys.unm.edu

43: }

44:

45: \date{2008 June~21}

46:

47: \maketitle

48:

49: \begin{abstract}

50: Gott has promulgated a rule for making probabilistic predictions of

51: the future duration of a phenomenon based on the phenomenon's present

52: age [{\sl Nature\/} {\bf 363}, 315 (1993)].  I show that the two

53: usual methods for deriving Gott's rule are flawed.  Nothing licenses

54: indiscriminate use of Gott's rule as a predictor of future duration.

55: It should only be used when the phenomenon in question has no

56: identifiable time scales.

57: \end{abstract}

58:

59: \baselineskip=14.2pt

60:

61: \section{Introduction}

62:

63: In an article$^1$ published in {\sl Nature\/} in 1993 and in

64: subsequent publications$^{\hbox{\scriptsize2--5}}$ and the concluding

65: chapter of a book,$^6$ J.~Richard Gott~III has promulgated a formula

66: for making probabilistic predictions of the future duration of a

67: phenomenon based on the phenomenon's present age.  When you observe

68: that a phenomenon has lasted a time $\tp$, Gott instructs you to

69: predict that the phenomenon will last an additional time $\tf\ge

70: Y\tp$ with probability

71: \begin{equation}

72: G(\tf\ge Y\tp)={1\over1+Y}\;.

73: \label{eq:Grule}

74: \end{equation}

75: For example, Gott's rule predicts that a phenomenon has a probability

76: of $1/2$ to survive an additional time at least as long as its

77: past ($Y=1$) and, by the same token, a probability of $1/2$ to end

78: before reaching twice its present age.

79:

80: In applying his rule to a host of phenomena, Gott usually couches his

81: predictions in terms of a particular 95\% confidence interval, 95\%

82: confidence being his standard for a scientific prediction.  According

83: to his rule, the probability that a phenomenon's future duration,

84: $\tf$, will be between $1/39$ and $39$ times its present age, $\tp$,

85: is $G(\tf\ge\tp/39)-G(\tf\ge39\tp)=39/40-1/40=0.95$. The flip side of

86: this prediction is that the phenomenon has a 2.5\% chance to end

87: within an additional 1/39 of its present age and a 2.5\% chance of lasting

88: longer than 39 times its present age.
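
The factor of 39 is fixed by the 95\% criterion.  As a brief check (the
symbol $c$ for the confidence level is introduced here only for
illustration), a symmetric interval that leaves probability $(1-c)/2$ in
each tail requires $G(\tf\ge Y\tp)=(1-c)/2$ at the upper end, so that
\[
{1\over1+Y}={1-c\over2}
\quad\Longrightarrow\quad
Y={1+c\over1-c}\;.
\]
For $c=0.95$ this gives $Y=1.95/0.05=39$; the lower end, $\tf=\tp/Y$,
follows from $G(\tf\ge\tp/Y)=Y/(1+Y)=(1+c)/2$.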

89:

90: Gott bases his formula on a temporal version of the Copernican

91: principle: when you observe the phenomenon's present age, your

92: observation does not occur at a special time.  Here I show,

93: distilling the essence of a previous critical analysis$^7$ of Gott's

94: work, that although the Copernican principle does lead directly to a

95: version of Gott's rule, this version is essentially meaningless and,

96: in particular, does not authorize his predictions for future

97: longevity based on present age.

98:

99: In published papers and his book, Gott is on record as applying his

100: rule to

101: %

102: himself,$^{2,6}$

103: Christ\-ianity,$^6$

104: the former Soviet Union,$^{1,6}$

105: the Third Reich,$^6$

106: the United States,$^6$

107: Canada,$^4$

108: world leaders,$^{2,4,6}$

109: Stonehenge,$^4$

110: the Seven Wonders of the Ancient World,$^6$

111: the Pantheon,$^6$

112: the Great Wall of China,$^6$

113: {\sl Nature},$^1$

114: the {\sl Wall Street Journal},$^6$

115: {\sl The New York Times},$^6$

116: the Berlin Wall,$^{\hbox{\scriptsize1--4,6}}$

117: the Astronomical Society of the Pacific,$^2$

118: the 44 Broadway and off-Broadway plays open

119: and running on 27~May 1993,$^{\hbox{\scriptsize2--4,6}}$

120: the Thatcher-Major Conservative government in the UK,$^{\hbox{\scriptsize2--4,6}}$

121: Manhattan (New York City),$^6$

122: the New York Stock Exchange,$^6$

123: Oxford University,$^6$

124: the internet,$^6$

125: Microsoft,$^6$

126: General Motors,$^6$

127: the human spaceflight program,$^{\hbox{\scriptsize1--6}}$

128: and

129: {\it homo sapiens}.$^{\hbox{\scriptsize1--6}}$

130: %

131: In all these cases---even the New York plays---Gott uses his rule to

132: make probabilistic predictions for the survival of individual

133: phenomena whose present age is known. For example, given {\sl

134: Nature\/}'s 123 years of publishing in 1993, Gott predicted that {\sl

135: Nature\/} had a 95\% chance to continue publishing for a period

136: between 3.15 years (already exceeded) and 4,800 years.$^1$  Most

137: notably, Gott has used the 200,000-year present age of {\it homo

138: sapiens} to predict that we have a 95\% chance to go extinct sometime

139: between 5,100 years and 7.8 million years from

140: now.$^{\hbox{\scriptsize1--6}}$  Although Gott issues occasional

141: cautionary statements about the applicability of his rule,$^{2,6}$

142: the list of phenomena to which he has applied the rule indicates that

143: these cautions don't cramp his style much.

144:

145: Gott's predictions have received attention in the popular media,

146: including a favorable piece by Timothy Ferris in {\sl The New

147: Yorker},$^8$ which highlighted Gott's predictions for human survival

148: (and which motivated me to write my original paper$^7$), and a recent

149: article by John Tierney in {\sl The New York Times},$^9$ which

150: focused on the implications for space colonization.  My late 1999

151: posting of the paper$^7$ that eventually appeared in {\sl

152: Contemporary Physics\/} prompted a sympathetic article in {\sl The

153: New York Times\/} by James Glanz,$^{10}$ then a science writer at

154: {\sl The Times}.

155:

156: Glanz, as a long-suffering fan of the Chicago White Sox, was

157: particularly interested in Gott's prediction, issued in 1996, for the

158: Sox's World Series prospects.  Gott opined that the Sox, having not

159: won a World Series title since 1917, would, with 95\% confidence, win

160: a Series sometime between 1999 and 5077.  In his 2000 article, Glanz

161: noted that the Sox hadn't yet succeeded, but he was clearly dismayed

162: by the long wait evoked by the mere mention of 5077.  Happily for him

163: and other Sox fans, they did win the Series in 2005.  In 1996 Gott

164: would have predicted a World Series title in 2005 or before with

165: probability~0.10, considerably less than the probability,

166: $1-(29/30)^9=0.26$, that comes from assuming that the Sox had the

167: same chance each year as the other 29 major-league ball clubs.
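
The figure of 0.10 can be reconstructed from Eq.~(\ref{eq:Grule}),
assuming, as the dates above suggest, a present age of
$\tp=1996-1917=79$~years and a title within $2005-1996=9$ further years,
i.e., $Y=9/79$:
\[
1-G(\tf\ge Y\tp)=1-{1\over1+9/79}={9\over88}\simeq0.10\;.
\]
The same present age reproduces the upper end of the quoted 95\%
interval, since $1996+39\times79=5077$.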

168:

169: Gott has given two main derivations of his rule: the argument from

170: the Copernican principle, which he calls the delta-$t$

171: argument,$^{\hbox{\scriptsize1--6}}$ and a Bayesian

172: analysis,$^{2,6,11}$ which he adopted from criticism due to

173: Buch.$^{12}$  Both of these derivations are flawed.$^7$  Here I begin

174: in Sec.~\ref{sec:deltat} with an analysis of the delta-$t$ argument,

175: because it led Gott to his rule and because he consistently portrays

176: it as the chief justification for his predictions.  I show that the

177: delta-$t$ argument does not lead to any prediction of future duration

178: based on present age.  I then turn in Sec.~\ref{sec:Bayesian} to the

179: usual Bayesian derivation of Gott's rule, which has greater appeal to

180: most other contributors to the literature on the subject.  I

181: demonstrate that this derivation is simply wrong and sketch the

182: correct Bayesian analysis.  A concluding Sec.~\ref{sec:conclusion}

183: considers the assumptions that are required to get Gott's

184: predictions.

185:

186: It should be emphasized at the outset that we will not conclude that

187: Gott's rule is ``wrong,'' but rather that its two primary derivations

188: are wrong.  In science flawed justifications are as bad as---perhaps

189: worse than---being obviously wrong, because they are more pernicious.

190: They can mislead you into using methods that don't apply in your

191: situation and can get you into trouble when you export those methods

192: to other contexts.  Determining the assumptions that underlie

193: whatever you are doing in science is essential, so that you know when

194: to abandon what you are doing in favor of something else.  The

195: purpose of this article is thus to debunk the two primary derivations

196: of Gott's rule and to identify the assumptions that underlie Gott's

197: rule in its predictive form, so that you will know what you are doing

198: should you choose to use it.

199:

200: The discussion in this paper is couched mainly in terms of a simple

201: graphical representation, which is equivalent to the more formal,

202: Bayesian analysis given in Ref.~7.  Section~\ref{sec:deltat} on the

203: delta-$t$ argument is phrased almost entirely in terms of the

204: graphical representation.  The results of the Bayesian analysis in

205: Sec.~\ref{sec:Bayesian} can be understood by referring to the

206: graphical representation, but the Bayesian equations are included for

207: those who prefer to see the details.

208:

209: The evidence from papers$^{\hbox{\scriptsize13--17}}$ that cite

210: Ref.~7 is that its argument and conclusions have not been appreciated

211: and understood. The goal of the present paper, with its graphical

212: mode of presentation, is to rectify that situation.

213:

214: \section{Copernican ensembles and the delta-$t$ argument}

215: \label{sec:deltat}

216:

217: The delta-$t$ argument is short and sweet.  It starts from the

218: premise that if your observation does not occur at a special

219: time---that is the temporal Copernican principle---then it is equally

220: likely to occur at any time within the total duration $T=\tp+\tf$.

221: This means that the probability that the present age $\tp$ is less

222: than or equal to $XT$, where $X$ is between 0 and 1 inclusive, is

223: $G(\tp\le XT)=X$. This being the same as the probability that the

224: future duration $\tf$ is not smaller than $(1-X)T$, i.e., not smaller

225: than $(X^{-1}-1)\tp$, one obtains Gott's rule~(\ref{eq:Grule}) by

226: letting $Y=X^{-1}-1$.
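
Spelled out, the chain of equivalences behind this substitution is
\[
\tp\le XT=X(\tp+\tf)
\quad\Longleftrightarrow\quad
(1-X)\tp\le X\tf
\quad\Longleftrightarrow\quad
\tf\ge(X^{-1}-1)\tp=Y\tp\;,
\]
so that $X=1/(1+Y)$; for instance, $X=1/2$ corresponds to $Y=1$ and
$X=1/40$ to $Y=39$.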

227:

228: The alluring simplicity of the delta-$t$ argument means that we need

229: an equally simple way of investigating its validity and

230: interpretation.  In any probabilistic analysis, you start with a

231: prior probability density, in this case a distribution $w(T)$, which

232: gives the probability $w(T)\,dT$ that the phenomenon's total duration

233: lies in the interval between $T$ and $T+dT$.  This prior probability

234: density is based on whatever information or data you have about the

235: phenomenon before observing it.  To formulate the problem in terms of

236: the temporal Copernican principle, the description in terms of

237: duration $T$ must be supplemented by introducing an additional

238: temporal variable.  In doing so, it is convenient to set the

239: arbitrary zero of time at the present, i.e., at the time you observe

240: the phenomenon, and to let $t_0$ denote the time when the phenomenon

241: starts.  With these choices, the present age is $\tp=-t_0$, and the

242: future duration is $\tf=T+t_0$. The phenomenon is now characterized

243: by two variables, either $t_0$ and $T$ or $\tp$ and $\tf$.

244:

245: The temporal Copernican principle---that you are not at a special

246: time relative to the phe\-no\-menon---is implemented by saying that

247: all starting times are equally likely, independent of duration~$T$.

248: More precisely, one requires that the joint probability density be

249: invariant under time translations,$^{18}$ which yields the unique

250: probability density

251: \begin{equation}

252: p(\tp,\tf)=p(t_0,T)=\gamma w(T)\;,

253: \end{equation}

254: where $\gamma$ is a constant, describing a uniform distribution for

255: the starting time.  That this distribution cannot be normalized turns

256: out not to be a problem, but it can be dealt with at this stage, if

257: desired, by cutting off the distribution at very large negative and

258: positive values of~$t_0$.

259:

260: It is instructive to think about the joint probability density in

261: terms of an ensemble made up of many instances of the same

262: phenomenon.  We can picture this ensemble, which I call an {\it

263: unrestricted Copernican ensemble}, as a population distributed in a

264: plane whose horizontal axis is labeled by $\tp$ and whose vertical

265: axis is labeled by $\tf$.  The population density is proportional to

266: the probability density $p(\tp,\tf)=\gamma w(\tp+\tf)$.  The

267: Copernican plane is depicted in Fig.~\ref{fig1}.

268:

269: \begin{figure}[t]

270: \begin{center}

271: \includegraphics[height=12cm]{fig1}

272: \end{center}

273: \vspace{-24pt}

274: \caption{The $\tp$-$\tf$ plane on which the {\it unrestricted

275: Copernican ensemble\/} resides.

276: \label{fig1}}

277: \end{figure}

278:

279: The duration $T=\tp+\tf$ labels an axis that points symmetrically

280: into the first quadrant.  The constraint of nonnegative durations

281: means that the ensemble occupies the upper right half-plane, which

282: splits naturally into three regions:

283: \begin{enumerate}

284: \item{The upper left wedge ($\tp=-t_0<0$), in which the phenomenon

285: has not yet begun.}

286: \item{The lower right wedge ($\tf<0$), in which the phenomenon is over.}

287: \item{The first quadrant ($\tp\ge0$, $\tf\ge0$), in which the phenomenon is

288: in progress.}

289: \end{enumerate}

290: There are two instructive ways of dividing the unrestricted

291: Copernican ensemble into subensembles.  First, for each starting time

292: $t_0=-\tp$, the population along the associated vertical line is the

293: subensemble of durations for a phenomenon that starts at $t_0$. The

294: translational symmetry of the Copernican ensemble means that all

295: these {\it unrestricted vertical subensembles\/} describe the same

296: distribution of durations, given by the prior density $w(T)$. Second,

297: population is distributed uniformly along the diagonal lines of

298: constant duration $T$, each of which can be called an {\it

299: unrestricted diagonal subensemble}.

300:

301: \begin{figure}[t]

302: \begin{center}

303: \includegraphics[height=10cm]{fig2}

304: \end{center}

305: \vspace{-24pt} \caption{First quadrant, on which resides the {\it

306: (truncated) Copernican ensemble}, which applies to a phenomenon in

307: progress. The truncated Copernican ensemble is obtained by lopping

308: off Regions~1 and~2 of the unrestricted Copernican ensemble.  The

309: Copernican ensemble is an idealized sample of phenomena with

310: uniformly random starting times $t_0$ (or present ages $\tp=-t_0$)

311: and with duration distributed along each vertical subensemble

312: according to the prior density~$w(T)$.  Since the starting time is

313: uniformly random, population is distributed uniformly along each

314: diagonal subensemble.  Gott's delta-$t$ argument is that the fraction

315: of population with $\tf\ge Y\tp$ within each diagonal subensemble is,

316: by the elementary geometry illustrated in the figure, $1/(1+Y)$.

317: This fraction being the same for each diagonal subensemble, it also

318: applies to the entire Copernican ensemble, giving Gott's

319: rule~(\protect\ref{eq:Grule}). The content of Gott's rule is the

320: trivial statement that a fraction $X$ of the members in the

321: Copernican ensemble have an age less than a fraction $X$ of their

322: eventual duration.  This trivial statement does not authorize any

323: prediction of future duration based on present age because the

324: present age is unknown.  Once the present age is known, predictions

325: of future duration are made within the vertical subensemble

326: corresponding to the observed age and thus are governed by the prior

327: density $w(T)$, but with those durations ruled out by the observed

328: age discarded.\label{fig2}}

329: \end{figure}

330:

331: To discuss your observation requires taking into account that you are

332: only interested in the situation, denoted by $I$, where you find the

333: phenomenon to be in progress.  Imposing this condition requires you

334: to lop off the regions of the unrestricted Copernican ensemble that

335: correspond to the phenomenon not having begun or having finished

336: [Regions~1 and~2 of Fig.~\ref{fig1}]. This leaves the {\it

337: (truncated) Copernican ensemble\/} depicted in Fig.~\ref{fig2}, which

338: occupies the first quadrant of the $\tp$-$\tf$ plane.  The

339: probability density for the truncated Copernican ensemble is given by

340: \begin{equation}

341: p(\tp,\tf|I)={w(\tp+\tf)\over\overline T}\;,\quad\mbox{$\tp\ge0$, $\tf\ge0$,}

342: \label{eq:joint}

343: \end{equation}

344: where $\overline T$ is a normalization constant equal to the mean

345: value of the total duration with respect to~$w(T)$.  Truncating the

346: unrestricted Copernican ensemble also truncates the unrestricted

347: diagonal and vertical subensembles.  In the following, the

348: designation ``truncated'' is often omitted; an undesignated ensemble

349: is always the truncated one.
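
The normalization can be verified directly.  Changing variables from
$(\tp,\tf)$ to $(\tp,T)$ with $T=\tp+\tf$ (unit Jacobian), the integral
of $w(\tp+\tf)$ over the first quadrant is
\[
\int_0^\infty d\tp\int_0^\infty d\tf\,w(\tp+\tf)
=\int_0^\infty dT\,w(T)\int_0^T d\tp
=\int_0^\infty dT\,T\,w(T)=\overline T\;,
\]
which presumes that $w(T)$ is normalized and has a finite mean.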

350:

351: A {\it diagonal subensemble\/} lives on a diagonal line of constant

352: $T$.  Along the diagonal line, population is distributed uniformly,

353: and the total population is weighted by $Tw(T)$.  The Copernican

354: principle is the statement that the population within each diagonal

355: subensemble is distributed uniformly, with no bias toward the past or

356: the future, as is expressed by the fact that the joint

357: density~(\ref{eq:joint}) depends only on $T$.  A {\it vertical

358: subensemble\/} lives on a line of constant $t_p$; it has the same

359: population density as the corresponding unrestricted vertical

360: ensemble, except that durations that correspond to the phenomenon's

361: having already finished, $T<\tp$ ($\tf<0$), are not part of the

362: ensemble and have no population.  The Copernican ensemble is an

363: idealization of a sample of phenomena in progress, with random

364: starting times and durations distributed according to the prior

365: density $w(T)$.

366:

367: Now suppose you ask for the probability that $\tf\ge Y\tp$ for a

368: phenomenon selected from the truncated Copernican ensemble.  Within

369: each diagonal subensemble this probability is given by the fraction

370: of the length of the diagonal line that lies above the line

371: $\tf=Y\tp$ shown in Fig.~\ref{fig2}.  This fraction, from elementary

372: geometry, is $1/(1+Y)$, which is the essence of the delta-$t$ argument.  Since

373: this fraction is the same for all the diagonal subensembles, it gives

374: the probability that $\tf\ge Y\tp$ within the entire truncated

375: Copernican ensemble.  The result is Gott's rule~(\ref{eq:Grule}),

376: written here as

377: \begin{equation}

378: P(\tf\ge Y\tp|I)={1\over1+Y}\;.

379: \end{equation}

380: Notice that this probability is independent of the prior density

381: $w(T)$; it is wholly determined by the time-translation symmetry of

382: the Copernican ensemble.  The rule is particularly easy to understand

383: for the case $Y=1$: half the members of the Copernican ensemble lie

384: above (below) the line $\tf=\tp$ and thus have a future duration that

385: is greater than (less than) their present age.  We conclude that

386: Gott's rule {\it is\/} a universal expression of the Copernican

387: principle for a phenomenon drawn from the entire Copernican ensemble,

388: i.e., for a phenomenon known to be in progress, but whose present age

389: is unknown.
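
The same result follows by integrating the joint
density~(\ref{eq:joint}) over the region $\tf\ge Y\tp$.  In the
variables $(\tp,T)$, the condition $\tf\ge Y\tp$ reads
$\tp\le T/(1+Y)$, so
\[
P(\tf\ge Y\tp|I)
=\int_0^\infty dT\,{w(T)\over\overline T}\int_0^{T/(1+Y)}d\tp
={1\over1+Y}\int_0^\infty dT\,{T\,w(T)\over\overline T}
={1\over1+Y}\;,
\]
whatever the form of $w(T)$.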

390:

391: Gott's rule as a universal expression of the Copernican principle has

392: precisely the content that a fraction $X$ of the members in the

393: Copernican ensemble have an age less than a fraction $X$ of their

394: eventual duration.  This trivial conclusion is what the Copernican

395: principle tells you: you know a phenomenon is in progress, but you

396: know neither when it started nor when it will end, so you judge

397: yourself equally likely to be at any point in the phenomenon's life.

398: This trivial conclusion is of very little interest, because the

399: present age being unknown, the rule has no predictive power. What

400: attracts attention to Gott's work is that he repeatedly uses his rule

401: in a different way, to make probabilistic predictions of the future

402: longevity of particular phenomena whose present age is known.

403:

404: \begin{figure}

405: \begin{center}

406: \includegraphics[height=14cm]{fig3}

407: \end{center}

408: \vspace{-18pt}

409: %

410: \caption{(a)~Unrestricted vertical subensemble, in which population

411: is distributed according to the prior density~$w(T)$.

412: (b)~Unrestricted Copernican ensemble, created from many copies of the

413: unrestricted vertical ensemble (fifteen copies, including the

414: vertical axis, are shown), each corresponding to a different starting

415: time $t_0=-\tp$. Gott's Copernican principle is the statement that

416: all starting times are equally likely. (c)~(Truncated) Copernican

417: ensemble, which describes phenomena in progress.  It is created by

418: removing from the unrestricted Copernican ensemble the regions that

419: correspond to phenomena not yet begun and already completed

420: (Regions~1 and 2 of Fig.~\ref{fig1}). In particular, each vertical

421: subensemble is truncated by removing the part with $T<\tp$ ($\tf<0$).

422: Population is distributed uniformly along the diagonal subensembles

423: of constant total duration~$T$, one of which is shown. (d)~Vertical

424: subensemble chosen by an observation of present age $\tp$.

425: Predictions within this vertical subensemble are governed by a

426: renormalized prior density, $w(T)/\Pi(\tp)$, with durations ruled out

427: by the observed age omitted.  Steps~(b) and~(c) of this process can

428: be short-circuited by going directly from~(a) to~(d).  Imagining many

429: copies of the unrestricted vertical ensemble, as is done in

430: implementing the temporal Copernican principle and thus constructing

431: the Copernican ensemble, or even having an approximation to the

432: Copernican ensemble available cannot increase your power to predict

433: the future duration of a phenomenon with a particular present age.

434: This is particularly clear in the special case of a phenomenon whose

435: total duration $T$ is known in advance, so that only one diagonal

436: subensemble is populated, say, the one shown in~(c).  At the stage of

437: the truncated Copernican ensemble in~(c), the present age and future

438: duration are strictly correlated, but randomly distributed within the

439: interval $[0,T]$, thus giving Gott's delta-$t$ argument.  Once you

440: observe the present age, however, the future duration is known and is

441: certainly not governed by Gott's rule.

442: %

443: \label{fig3}}

444: \end{figure}

445:

446: We thus need to determine what you can say when you discover the

447: present age.  Your probabilistic predictions are then determined by

448: the distribution of population within the vertical subensemble whose

449: members have the observed present age.  It is clear from

450: Fig.~\ref{fig2} that the probability density for future duration

451: within this subensemble---this is the conditional probability density

452: for $\tf$ given $\tp$---is proportional to $w(\tp+\tf)$.  Properly

453: normalized, this conditional density becomes

454: \begin{equation}

455: p(\tf|\tp,I)=w(\tp+\tf)/\Pi(\tp)\;,\quad\mbox{$\tf\ge0$,}

456: \label{eq:cond}

457: \end{equation}

458: where the normalization constant,

459: \begin{equation}

460: \Pi(\tp)=\int_0^\infty d\tf\,w(\tp+\tf)=\int_{\tp}^\infty dT\,w(T)\;,

461: \end{equation}

462: is the survival probability, i.e., the probability for the phenomenon

463: to survive at least a time $\tp$.  The conditional probability

464: density~(\ref{eq:cond}) gives the probabilities you should use for

465: making predictions of future duration based on present age.  It has a

466: very simple interpretation: once you determine the present age, you

467: rule out total durations shorter than the observed age, and you use

468: the prior density, suitably renormalized, for total durations longer

469: than the observed age.  This is what you would have done had you not

470: bothered to introduce the Copernican ensemble, but rather worked

471: directly within an unrestricted vertical ensemble.$^7$

472:

473: The process of constructing an unrestricted Copernican ensemble,

474: truncating to take account that the phenomenon is in progress, and

475: observing the present age is depicted in Fig.~\ref{fig3}.

476:

477: One way to construct the vertical subensemble for present age $\tp$

478: is to select, from each diagonal subensemble with $T\ge\tp$, the

479: subpopulation that has age $\tp$.  That population is distributed

480: uniformly within the rest of each diagonal subensemble is irrelevant

481: to the statistics of a phenomenon drawn from a vertical subensemble.

482: This is why the Copernican principle has no bearing on predictions of

483: future duration based on present age.  Indeed, once you discover the

484: present age, the probability that $\tf\ge Y\tp$ is

485: \begin{equation}

486: P(\tf\ge Y\tp|\tp,I)

487: =\int_{Y\tp}^\infty d\tf\,p(\tf|\tp,I)

488: ={\Pi\Bigl((1+Y)\tp\Bigr)\over \Pi(\tp)}\;.

489: \label{eq:rightrule}

490: \end{equation}

491: This is the predictive form of the desired probability, predictive

492: because it is conditioned on the present age.  It is determined

493: completely by the prior density and coincides with Gott's

494: rule~(\ref{eq:Grule}) only for a special choice of prior density,

495: which is identified in Sec.~\ref{sec:Bayesian} and discussed further

496: in Sec.~\ref{sec:conclusion}.  We conclude that Gott's rule should

497: not be used indiscriminately to make probabilistic predictions of

498: future duration based on present age.
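
As an illustration of how Eq.~(\ref{eq:rightrule}) depends on the
prior, consider an exponential prior with a time scale $\tau$ (a
hypothetical choice made here purely for illustration):
$w(T)=\tau^{-1}e^{-T/\tau}$.  Then $\Pi(\tp)=e^{-\tp/\tau}$, and
\[
P(\tf\ge Y\tp|\tp,I)
={e^{-(1+Y)\tp/\tau}\over e^{-\tp/\tau}}
=e^{-Y\tp/\tau}\;,
\]
i.e., $P(\tf\ge t|\tp,I)=e^{-t/\tau}$ for any $t\ge0$.  For this prior
the predicted future duration does not depend on the present age at
all, in contrast to Gott's rule~(\ref{eq:Grule}), in which the
predicted future duration scales with $\tp$.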

499:

500: All your prior information about a phenomenon's total duration is

501: incorporated in the prior density $w(T)$.  Often you can improve your

502: predictions of future longevity by studying a phenomenon as it

503: progresses, gathering information about its particular history.  In

504: the absence of gathering additional information, however, all

505: predictions about future longevity must arise from the prior density.

506: That Gott's rule, as it comes from the delta-$t$ argument, is

507: independent of the prior density is a dead give-away that it has no

508: predictive power.  Since any prior density can be embedded in a

509: Copernican ensemble, it is clear that the Copernican principle does

510: not restrict the prior density in any way and thus is irrelevant to

511: predicting future longevity.

512:

513: \section{Bayesian analysis of Gott's rule}

514: \label{sec:Bayesian}

515:

516: Gott has endorsed$^{11}$ a Bayesian derivation of his rule, which was

517: introduced by Buch$^{12}$ in the only technical comment {\it

518: Nature\/} has published on Gott's original article.  The input to

519: Buch's analysis is the prior density $w(T)$ and the assertion that

520: given the duration~$T$, present age $\tp$ is uniformly distributed

521: within the interval $[0,T]$:

522: \begin{equation}

523: q(\tp|T)=\cases{

524: 1/T\;,&$0\le\tp\le T$,\cr

525: 0\;,&$\tp>T$.}

526: \label{eq:qtpT}

527: \end{equation}

528: A simple application of Bayes's rule gives

529: \begin{equation}

530: q(T|\tp)={q(\tp|T)w(T)\over q(\tp)}=

531: \cases{

532: w(T)/Tq(\tp)\;,&$T\ge\tp$,\cr

533: 0\;,&$T<\tp$,

534: }

535: \label{eq:qTtp}

536: \end{equation}

537: where

538: \begin{equation}

539: q(\tp)=\int_{\tp}^\infty dT\,{w(T)\over T}

540: \end{equation}

541: is the unconditioned probability density for present age $\tp$.  The

542: conditional probability that $\tf\ge Y\tp$, given $\tp$, takes the

543: form

544: \begin{equation}

545: Q(\tf\ge Y\tp|\tp)=

546: \int_{(1+Y)\tp}^\infty dT\,q(T|\tp)=

547: {q\Bigl((1+Y)\tp\Bigr)\over q(\tp)}\;.

548: \end{equation}

549: If you use the (unnormalizable) prior density $w(T)=1/T$, this result

550: reduces to Gott's rule, in a predictive form:

551: \begin{equation}

552: Q(\tf\ge Y\tp|\tp)={1\over1+Y}\;.

553: \end{equation}

554: The prior $w(T)=1/T$, called the {\it Jeffreys prior},$^{18}$ has the

555: unique status of being the only distribution on the interval

556: $(0,\infty)$ that is invariant under scale changes.  Thus this

557: Bayesian derivation concludes with the appealing result that Gott's

558: rule, as a genuinely predictive rule for future duration given

559: present age, follows from assuming a prior that has no built-in time

560: scales.
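
The reduction to Gott's rule is a one-line calculation, spelled out
here for convenience.  With $w(T)=1/T$,
\[
q(\tp)=\int_{\tp}^\infty{dT\over T^2}={1\over\tp}
\quad\Longrightarrow\quad
{q\bigl((1+Y)\tp\bigr)\over q(\tp)}={\tp\over(1+Y)\tp}={1\over1+Y}\;.
\]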

561:

562: The only problem with this neat conclusion is that this Bayesian

563: derivation is dead wrong.  This is evident from the

564: posterior~(\ref{eq:qTtp}), which is not just the original prior with

565: excluded durations given zero probability, as in the process of

566: lopping off the already completed phenomena from the unrestricted

567: ensembles to get the truncated ensembles.  The analysis gets right

568: that the posterior probability is zero for durations $T<\tp$ that are

569: ruled out by the observation of present age $\tp$, but it doesn't use

570: a renormalized version of the prior density for the durations that

571: are still allowed, i.e., for $T\ge\tp$.  This must be wrong because

572: your prior density $w(T)$ already contains your entire judgment about

573: the future duration of the phenomenon should it survive to age $\tp$.

574: In the absence of getting additional information, there is nothing to

575: justify changing your judgment about future duration when you learn

576: that the phenomenon has indeed survived to age~$\tp$.

577:

578: The question then is where this apparently innocuous Bayesian

579: analysis goes wrong.  It is not hard to determine that.  The error

580: lies in using the uniform conditional probability density $q(\tp|T)$

581: of Eq.~(\ref{eq:qtpT}) in conjunction with the prior density $w(T)$.

582: Within the unrestricted Copernican ensemble, where it is correct to

583: use $w(T)$, learning the duration $T$ tells you nothing about the

584: present age, as is evident from considering the unrestricted diagonal

585: subensemble in Fig.~\ref{fig1}. This is confirmed by a trivial

586: application of Bayes's rule to the uncorrelated variables $\tp$

587: and~$T$: $p(\tp|T)=p(t_0,T)/w(T)=\gamma$.  It is simply not

588: consistent with the unrestricted Copernican ensemble to use the

589: uniform conditional probability density~(\ref{eq:qtpT}).

590:

591: The natural thing then is to try the truncated Copernican ensemble of

592: Fig.~\ref{fig2}, which applies once you know the phenomenon is in

593: progress.  Then it is correct to use a uniform conditional density

594: for $\tp$, i.e.,

595: \begin{equation}

596: p(\tp|T,I)=\cases{

597: 1/T\;,&$0\le\tp\le T$,\cr

0\;,&$\tp>T$}\;,

599: \end{equation}

600: as is evident from considering the truncated diagonal subensemble in

601: Fig.~\ref{fig2}, but it is not correct to use the prior

602: density~$w(T)$. Once you know the phenomenon is in progress, you must

603: weight $w(T)$ by a factor of $T$, which comes from the ``lengths'' of

604: the truncated diagonal subensembles being proportional to $T$.

605: Formally, one has

606: \begin{equation}

607: p(T|I)=\int d\tp\,d\tf\,p(\tp,\tf|I)\delta(T-\tp-\tf)={Tw(T)\over\overline T}\;.

608: \label{eq:pTI}

609: \end{equation}

610: The factor of $T$ here is not optional.  It is {\it required\/} once

611: you have decided to describe the phenomenon in terms of two temporal

612: variables and to impose the time-translation symmetry of the

613: Copernican principle on the joint probability density.  To put it

614: more succinctly, it is required once you decide to use an ensemble of

615: phenomena with random starting times.
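
As a sketch of where the factor of $T$ comes from, assume that the
truncated Copernican density has the form
$p(\tp,\tf|I)=w(\tp+\tf)/\overline T$, with
$\overline T=\int_0^\infty dT\,T\,w(T)$.  This form is an assumption
made here for illustration; it is chosen because it reproduces the
marginal densities quoted in this section.  Doing the $\tf$ integral
against the delta function restricts $\tp$ to the interval
$0\le\tp\le T$ and leaves
\[
p(T|I)={1\over\overline T}\int_0^T d\tp\,w(T)={T\,w(T)\over\overline T}\;,
\]
so the factor of $T$ is simply the range of present ages available to
a phenomenon of duration~$T$.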

616:

617: Once one realizes that the factor of $T$ is present in $p(T|I)$,

618: the Bayesian inference of Eq.~(\ref{eq:qTtp}) is replaced by

619: \begin{equation}

620: p(T|\tp,I)={p(\tp|T,I)p(T|I)\over p(\tp|I)}=

621: \cases{

622: w(T)/\Pi(\tp)\;,&$T\ge\tp$,\cr

623: 0\;,&$T<\tp$,

624: }

625: \end{equation}

626: since the probability density of $\tp$ is given by

627: \begin{equation}

628: p(\tp|I)=\int_0^\infty d\tf\,p(\tp,\tf|I)={\Pi(\tp)\over\overline T}\;.

629: \end{equation}

630: This correct Bayesian analysis is thus in accord with the obvious

631: inference of truncating the unrestricted vertical ensemble to get the

632: conditional probability density for $T$, given $\tp$.
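
Written out, and assuming that $\Pi(t)=\int_t^\infty dT\,w(T)$ denotes
the prior probability of surviving to age~$t$ (the meaning consistent
with its use here), the factors of $T$ and $\overline T$ cancel for
$T\ge\tp$:
\[
p(T|\tp,I)
={(1/T)\,\bigl[\,T\,w(T)/\overline T\,\bigr]\over\Pi(\tp)/\overline T}
={w(T)\over\Pi(\tp)}\;.
\]
The posterior is thus just the prior density renormalized over the
durations that are still allowed.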

633:

634: Because of the additional factor of $T$ in this correct analysis, the

635: (unnormalizable) prior density that gives a predictive version of

Gott's rule turns out to be $w(T)=1/T^2$.  This prior density plays a
special role in this problem because it gives the unique joint
distribution $p(\tp,\tf|I)$ on the first quadrant of the $\tp$-$\tf$
plane that is (i)~constant on lines of constant $T$ and (ii)~invariant
under simultaneous scale changes of $\tp$ and $\tf$.  Formally, with
this prior, we can write

641: [see Eq.~(\ref{eq:rightrule})]

642: \begin{equation}

643: P(\tf\ge Y\tp|\tp,I)={1\over1+Y}\;,

644: \end{equation}

645: since $\Pi(\tp)=1/\tp$.  Thus Gott's rule, in a predictive form,

646: emerges from a prior $w(T)=1/T^2$ that has no time scales into the

647: past or future; alternatively, one can say that this predictive form

648: of Gott's rule arises when the probability density for $T$ within the

649: truncated Copernican ensemble, i.e., $p(T|I)$ of Eq.~(\ref{eq:pTI}),

650: is the Jeffreys prior.
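
As a check on this formula, take $w(T)=1/T^2$ and assume that
$\Pi(t)=\int_t^\infty dT\,w(T)$ is the prior probability of surviving
to age~$t$.  Since surviving a further $Y\tp$ means having a total
duration $T\ge(1+Y)\tp$, one finds
\[
\Pi(\tp)=\int_{\tp}^\infty{dT\over T^2}={1\over\tp}\;,
\qquad
P(\tf\ge Y\tp|\tp,I)={\Pi\bigl((1+Y)\tp\bigr)\over\Pi(\tp)}={1\over1+Y}\;.
\]
Setting $Y=1$ gives probability $1/2$ of surviving beyond twice the
present age, and $Y=39$ gives $1/40=2.5\%$.  Because $\overline T$
diverges for this prior, these manipulations should be regarded as
formal.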

651:

652: \section{Conclusion}

653: \label{sec:conclusion}

654:

655: The best way to test belief in probabilistic predictions is to offer

656: a bet based on those predictions.  For that purpose, I sent an e-mail

657: on 1999~October~21 and again on 1999~December~2 to my department's

658: most comprehensive e-mail alias, which included faculty, staff, and

659: graduate students, requesting information on pet dogs.  The responses

660: were compiled and checked for accuracy on 1999~December~6; a

661: notarized list of the 24~dogs, including each dog's name, date of

662: birth, breed, and caretaker, was deposited in my departmental

663: personnel file on 1999~December~21.  In accordance with his practice

664: for other phenomena, Gott would have made a prediction for each dog's

665: future prospects based on its age.  In particular, he would have

666: predicted that each dog would survive beyond twice its age with

667: probability $1/2$.

668:

669: For the youngest and oldest dogs on the list, Gott's predictions

670: offered favorable opportunities for betting.  I chose to focus on the

671: oldest dogs, and for each of the six dogs above ten years old on the

672: list, I offered$^7$ to bet Gott \$1,000\,US that the dog would not

673: survive to twice its age on 1999~December~3.  To sweeten the pot, I

674: offered Gott 2:1 odds in his favor.  Gott refused the bets on the

675: grounds that ``I don't do bets.''$^{19}$  If he had believed his own

676: predictions, his expected gain would have been \$3,000\,US, and the

677: probability that he would have been a net loser on the six bets was

678: 7/64=0.11.  I contacted the caretakers during May and June of 2008

679: and verified that all six dogs have died.  Thus, as I fully expected,

680: I would have won all the bets and been \$6,000\,US richer. Even with

681: the current reduced state of the US dollar, that would have been

682: enough to buy a very nice piece of Australian aboriginal art.
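
The arithmetic behind these figures is short.  Assume, as Gott's rule
would have it, that each dog survives to twice its age with
probability $1/2$, and assume further that the six outcomes are
independent.  On each bet Gott then wins \$2,000 or loses \$1,000 with
equal probability, so his expected gain is
\[
6\times\Bigl[{1\over2}(\$2{,}000)-{1\over2}(\$1{,}000)\Bigr]=\$3{,}000\;.
\]
If $k$ of the six dogs survive, his net gain is
$\$2{,}000\,k-\$1{,}000\,(6-k)$, which is negative only for $k=0$ or
$k=1$, so
\[
P(\hbox{net loss})={{6\choose0}+{6\choose1}\over2^6}={7\over64}\simeq0.11\;.
\]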

683:

684: More revealing than Gott's blanket refusal to bet was his excuse that

685: his rule only applies to a random dog chosen from my sample,$^{19}$

686: which is another way of saying that his rule applies to a sample of

687: dogs drawn from the truncated Copernican ensemble, i.e., a sample

688: selected without regard to present age.  In discussions of the 44 New

689: York plays$^{\hbox{\scriptsize2--4,6}}$ and of his own

690: longevity,$^{2,6}$ Gott has also suggested that a fair test of the

691: Copernican hypothesis should involve a large sample selected without

692: regard to present age.  As we have seen, Gott is quite right on this

693: score: his rule {\it does\/} apply to a phenomenon whose present age

694: is unknown.  If this were all Gott claimed, however, no one would pay

695: attention, because the universal form of his rule, applicable when

696: the present age is unknown, has no predictive power.  What grabs

697: attention is that in case after case, Gott uses his rule to make

698: predictions of the future longevity of individual phenomena whose

699: present age is known.  In the language of this paper, Gott makes

700: predictions for the vertical subensembles, but only wants to bet on

701: the entire Copernican ensemble.

702:

703: It is obvious that in a large sample of dogs selected without regard

704: to age, roughly half the dogs, within the inevitable statistical

705: fluctuations, will be in the first half of their lives, with the rest

706: in the second half.  This is the trivial content of Gott's Copernican

707: principle.  It is equally obvious that having a sample in which half

708: the dogs are in the first half of their lives does not imply that any

709: particular dog in the sample has a probability of $1/2$ to survive

710: beyond twice its present age.  Yet the elementary error of making

711: this implication underlies all of Gott's predictions.
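
A simple illustration makes the point.  Suppose, purely for the sake
of the example, that every dog lives to exactly the same age $T_0$.
A dog of present age $\tp$ then has future duration $\tf=T_0-\tp$, so
\[
P(\tf\ge\tp|\tp)=\cases{
1\;,&$\tp\le T_0/2$,\cr
0\;,&$\tp>T_0/2$,}
\]
which is never $1/2$ for any individual dog, yet averaging over
present ages distributed uniformly between 0 and $T_0$ gives
exactly~$1/2$.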

712:

713: We have seen, at the end of Sec.~\ref{sec:Bayesian}, that there is a

714: particular (unnormalizable) prior density, $w(T)=1/T^2$, which does

715: give Gott's rule in a predictive form.$^7$   Although the prior

716: density~$1/T^2$ does not appear in any of Gott's publications, it has

717: a special status in that it is the unique prior density that makes

718: the Copernican probability $p(\tp,\tf|I)$ invariant under

719: simultaneous rescaling of the past and the future.  Use of this prior

720: is the only license for Gott's predictions.  When you can't identify

721: any time scales, Gott's rule is your best bet for making predictions

722: of future duration based on present age.

723:

724: For most phenomena, including many that Gott discusses, especially

725: those involving human institutions and creations, it is easy to

726: identify important time scales.$^7$  Although it is often difficult

727: to incorporate these time scales into a prior probability, it is

728: always a good idea to try.  This having been said, it is usually the

729: case that formulating prior information precisely is of less value

730: than observing a phenomenon as it progresses, since readily available

731: current information is more cogent than prior information for

732: predicting the future.

733:

734: Although there is little love lost between White Sox fans and fans of

735: the Chicago Cubs, I like to think that {\sl New York Times\/} writer

736: Jim Glanz, having experienced a Sox World Series win in his lifetime,

737: sympathizes with the plight of Cubs fans, who haven't seen a World

738: Series title since 1908.  Gott would predict, with 95\% confidence,

739: that they won't win a Series in the next three years, but will win

740: one before 5868.  Perhaps more to the point, he would predict with

741: probability $1/2$ that they won't bring home a title in the next

742: 99~years.  We are immediately skeptical of Gott's prediction.  For

743: example, giving each of the 30 clubs an equal chance each year sets

744: the probability of a 99-year drought at $(29/30)^{99}=0.035$.  It's

745: not that this is the ``right'' way to calculate the probability, but

746: it does show that a reasonable assumption gives quite a different

747: answer from Gott's rule.
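
For the record, both numbers are easy to reproduce, taking the present
age of the drought to be $\tp=99$~years and writing Gott's rule in the
form $P(\tf\ge Y\tp)=1/(1+Y)$:
\[
P(\tf\ge99~{\rm yr})={1\over1+1}={1\over2}\;,
\qquad
\Bigl({29\over30}\Bigr)^{99}=e^{99\ln(29/30)}\simeq e^{-3.36}\simeq0.035\;.
\]
The equal-chance estimate is thus smaller than Gott's by a factor of
about~14.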

748:

749: The reason Gott's prediction for the Cubs is so unreasonable is that

750: there are readily identifiable time scales---the length of a typical

751: player's career, the turnover in owners and management, etc.---that

752: are well short of 99~years and suggest that the Cubs might get their

act together much sooner.  Indeed, as of 2008~June~21, they have the best

754: record in North American baseball and are leading the National League

755: Central division.  Still, Cubs fans know to keep some pessimism in

756: reserve.

757:

758: Suppose a fan at a Cubs game at Wrigley Field in Chicago got up and

759: announced to great fanfare that half the people at the game were in

760: the first half of their life.  Everyone would yawn (except perhaps

761: the technically sophisticated, who might wonder about whether the

762: attendees are a representative sample of all ages, although a ball

763: game is probably not a bad sample in this regard).

764:

765: Suppose, however, that the fan marched up to parents holding a

766: one-month-old infant and proclaimed, ``Gott says your baby has a

767: 2.5\% chance of dying before tomorrow's game,'' or informed the

768: 60-year-old next to him, ``Gott says you have a 50\% chance of living

769: to 120.''  Both these predictions would garner attention, as

770: applications of Gott's rule often do.  The parents would probably

771: call security and ask that the fan be removed.  The 60-year-old might

772: reply, with the ingrained pessimism of Cubs fans, ``God only knows,

773: but maybe if I lived to 120, I could see the Cubs win a Series.'' His

774: seatmate would pour cold water on that: ``Don't get your hopes up.

775: Gott gives, and Gott takes away.  You might live to 120, but Gott

776: says there's only a 38\% chance the Cubs will win the Series by

777: then.  There's only a 50\% chance they'll win before you're 160.''
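
Each number in this exchange is Gott's rule in the form
$P(\tf<Y\tp)=Y/(1+Y)$, applied to a present age.  Taking the baby's
age to be about 30~days and the drought's to be 99~years (both values
are assumptions made here for the arithmetic), one has
\[
{1/39\over1+1/39}={1\over40}=2.5\%
\quad\hbox{for}\quad\tf<{30~{\rm days}\over39}\simeq0.8~{\rm day}\;,
\qquad
{60/99\over1+60/99}={60\over159}\simeq38\%\;.
\]
The median case, $\tf=\tp=99$~years, would put the 60-year-old at
age~159.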

778:

779: Gott's rule makes absurd predictions for human longevity and other

780: human activities because there are readily identifiable time scales,

781: the most obvious of which is the average human life span, that render

782: application of his rule entirely inappropriate.  If he continues to

783: believe his rule makes nontrivial, universal predictions for the

784: future duration of individual phenomena, it's time he took some bets.

785:

786: \vspace{9pt}

787: \hspace{1truein}\hrulefill\hspace{1truein}

788: \vspace{9pt}

789:

790: $^1$\,J.~R. Gott~III, ``Implications of the Copernican principle for

791: our future prospects,'' {\sl Nature\/} {\bf 363}, 315 (1993).

792:

793: \vspace{4pt}

794: $^2$\,J.~R. Gott~III, ``Our future in the Universe,'' in

795: {\sl Clusters, Lensing, and the Future of the Universe}, Astronomical

796: Society of the Pacific Conference Series, Vol.~88, edited by

797: V.~Trimble and A.~Reisenegger (Astronomical Society of the Pacific,

798: San Francisco, 1996), p.~140.

799:

800: \vspace{4pt}

801: $^3$\,J.~R. Gott~III, ``A grim reckoning,'' {\sl New Scientist\/}

802: {\bf 156}\,(No.~2108), 36 (1997 November~15).

803:

804: \vspace{4pt}

805: $^4$\,J.~R. Gott~III, ``The Copernican principle and human survivability,''

806: in {\sl Human Survivability in the 21st Century}, Transactions of the

Royal Society of Canada, Series~VI, Vol.~IX, edited by D.~M.~Hayne

808: (University of Toronto Press, Toronto, 1999), p.~131.

809:

810: \vspace{4pt}

811: $^5$\,J.~R. Gott~III, ``Colonies in space; Will we plant colonies

beyond the Earth before it is too late?'' {\sl New Scientist\/}

813: {\bf 195}\,(No.~2620), 51 (2007 September~8).

814:

815: \vspace{4pt}

816: $^6$\,J.~R. Gott~III, {\sl Time Travel in Einstein's Universe\/}

817: (Houghton Mifflin, Boston, 2001), Chap.~5.

818:

819: \vspace{4pt}

820: $^7$\,C.~M. Caves, ``Predicting future duration from present age:

821: A critical assessment,'' {\sl Contemporary Physics\/} {\bf 41}, 143

822: (2000).

823:

824: \vspace{4pt}

825: $^8$\,T.~Ferris, ``How to predict everything: Has the physicist

826: J.~Richard Gott really found a way?'' {\sl The New Yorker\/}

827: {\bf 75}\,(18), 35 (1999 July~12).

828:

829: \vspace{4pt}

$^9$\,J.~Tierney, ``A survival imperative for space colonization,''

831: {\sl The New York Times\/} (2007 July~17).

832:

833: \vspace{4pt}

$^{10}$\,J.~Glanz, ``Point, counterpoint and the duration of everything,''

835: {\sl The New York Times\/} (2000 February~8).

836:

837: \vspace{4pt}

838: $^{11}$\,J.~R. Gott~III, ``Future prospects discussed---Gott replies,''

839: {\sl Nature\/} {\bf 368}, 108 (1994).

840:

841: \vspace{4pt}

842: $^{12}$\,P.~Buch, ``Future prospects discussed,'' {\sl Nature\/}

843: {\bf 368}, 107 (1994).

844:

845: \vspace{4pt}

$^{13}$\,A.~Ledford, P.~Marriott, and M.~Crowder, ``Lifetime prediction from

847: only present age: Fact or fiction?'' {\sl Physics Letters~A\/} {\bf 280},

848: 309 (2001).

849:

850: \vspace{4pt}

851: $^{14}$\,K.~D. Olum, ``The doomsday argument and the number of possible

852: observers,'' {\sl Philosophical Quarterly\/} {\bf 52}, 164 (2002).

853:

854: \vspace{4pt}

855: $^{15}$\,E.~Sober, ``An empirical critique of two versions of the doomsday

856: argument---Gott's line and Leslie's wedge,'' {\sl Synthese\/} {\bf 135},

857: 415 (2003).

858:

859: \vspace{4pt}

860: $^{16}$\,L.~Bass, ``How to predict everything: Nostradamus in the role of

Copernicus,'' {\sl Reports on Mathematical Physics\/} {\bf 57}, 13 (2006).

862:

863: \vspace{4pt}

864: $^{17}$\,B.~Monton and B.~Kierland, ``How to predict future duration from

865: present age,'' {\sl Philosophical Quarterly\/} {\bf 56}, 16 (2006).

866:

867: \vspace{4pt}

868: $^{18}$\,E.~T.~Jaynes, {\sl Probability Theory: The Logic of

869: Science}, edited by G.~L. Bretthorst (Cambridge University Press,

870: Cambridge, England, 2003), Chap.~12.

871:

872: \vspace{4pt}

873: $^{19}$\,``Life, longevity, and a \$6\,000 bet,'' {\sl

874: Physics World} (2000 February~11), {\tt http://physicsworld\urldot

875: com\urlslash cws\urlslash article\urlslash news\urlslash 2890}.

876: Quotes from Gott in this short article are based on a long and

877: informative document entitled ``Random observations and the

878: Copernican Principle,'' which Gott posted to PhysicsWeb early in

879: 2000~February.  As of this writing, I have been unable to find this

880: long document anywhere on the web, but I have a copy that can be made

881: available on request.

882:

883: \vspace{9pt}

884: \hspace{1truein}\hrulefill\hspace{1truein}

885: \vspace{9pt}

886:

887: {\it Author's note:} This paper was originally submitted to {\sl

888: Nature\/} on 2000 April~3 and was summarily rejected on the grounds

889: that {\sl Nature\/} had already published sufficient technical

890: comment$^{12}$ on Gott's original paper.  Then I forgot about it,

891: though it's not clear to me now why I didn't post it to the preprint

892: archive.  That was probably just pique, to which I was more subject

893: then than now.  It's just as well, because the current version is, I

894: think, considerably improved.  Three circumstances prompted me to

895: revive the paper: (i)~my 2007--08 sabbatical at the University of

896: Queensland, which has given me the gift of time; (ii)~John Tierney's

897: recent {\sl New York Times\/} article,$^9$ which showed that Gott's

898: predictions still have the power to fascinate; and (iii)~a

899: conversation with B.~J. Brewer of the University of Sydney, which

900: indicated that there would be interest in a simpler explanation of my

901: {\sl Contemporary Physics\/} article.$^7$  In preparing the current

902: version, I expanded the discussion of the delta-$t$ argument with the

903: aim of making it as simple and airtight as possible, incorporated a

904: discussion of Bayesian derivations of Gott's rule, updated the

905: references, and gathered information about the six dogs' ultimate

906: fates.

907:

908: \end{document}

909: