0509:cs0509025/pnt.tex

1: \documentclass{article}

2: %[a4paper]

3:

4: \usepackage{amsmath,amssymb,mylogic,isabelle,isabellesym}

5:

6: % \isabellestyle{sl}

7:

8: \title{A formally verified proof of the \\prime number

9:   theorem\footnote{To appear in \emph{ACM Transcations on

10:       Computational Logic.} Work by the first author has been

11:     supported by NSF grant DMS-0401042.}}

12:

13: \author{Jeremy Avigad, Kevin Donnelly, David Gray, and Paul Raff}

14:

15: \date{January 19, 2006}

16:

17: \begin{document}

18:

19: \maketitle

20:

21: \begin{abstract}

22:   The prime number theorem, established by Hadamard and de la Vall\'ee

23:   Poussin independently in 1896, asserts that the density of primes in

24:   the positive integers is asymptotic to $1 / \ln x$. Whereas their

25:   proofs made serious use of the methods of complex analysis,

26:   elementary proofs were provided by Selberg and Erd\"os in 1948.  We

27:   describe a formally verified version of Selberg's proof, obtained

28:   using the Isabelle proof assistant.

29: \end{abstract}

30:

31: \section{Introduction}

32:

33: For each positive integer $x$, let $\pi(x)$ denote the number of

34: primes less than or equal to $x$. The prime number theorem asserts

35: that the density of primes $\pi(x)/x$ in the positive integers is

36: asymptotic to $1 / \ln x$, i.e.~that $\lim_{x \to \infty} \pi(x) \ln x

37: / x = 1$. This was conjectured by Gauss and Legendre around the turn

38: of the nineteenth century, and posed a challenge to the mathematical

39: community for almost a hundred years, until Hadamard and de la

40: Vall\'ee Poussin proved it independently in 1896.

41:

42: On September 6, 2004, the first author of this article verified the

43: following statement, using the Isabelle proof assistant:

44: \begin{center}

45: \begin{isabellebody}

46: \isamarkupfalse%

47: {\isacharparenleft}{\isasymlambda}x{\isachardot}\ pi\ x\ {\isacharasterisk}\ ln\ {\isacharparenleft}real\ x{\isacharparenright}\ {\isacharslash}\ {\isacharparenleft}real\ x{\isacharparenright}{\isacharparenright}\ {\isacharminus}{\isacharminus}{\isacharminus}{\isacharminus}{\isachargreater}\ {\isadigit{1}}

48: \end{isabellebody}

49: \end{center}

50: %% \begin{center}

51: %% \begin{isabellebody}

52: %% \isamarkupfalse%

53: %% \isacommand{theorem}\ PrimeNumberTheorem{\isacharcolon}\ {\isachardoublequote}{\isacharparenleft}{\isasymlambda}x{\isachardot}\ pi\ x\ {\isacharasterisk}\ ln\ {\isacharparenleft}real\ x{\isacharparenright}\ {\isacharslash}\ {\isacharparenleft}real\ x{\isacharparenright}{\isacharparenright}\ {\isacharminus}{\isacharminus}{\isacharminus}{\isacharminus}{\isachargreater}\ {\isadigit{1}}{\isachardoublequote}

54: %% \end{isabellebody}

55: %% \end{center}

56: % \begin{quote}

57: % \tt{(\%x.\ pi x * ln (real x) / (real x)) ----> 1}

58: % \end{quote}

59: The system thereby confirmed that the prime number theorem is a

60: consequence of the axioms of higher-order logic, together with an axiom

61: asserting the existence of an infinite set.

62:

63: One reason the formalization is interesting is simply that it is a

64: landmark, showing that today's proof assistants have achieved a level

65: of usability that makes it possible to formalize substantial theorems

66: of mathematics. Similar achievements in the past year include George

67: Gonthier's verification of the four color theorem using Coq, and

68: Thomas Hales's verification of the Jordan curve theorem using

69: HOL-light (see the introduction to Wiedijk \cite{wiedijk:05}). As

70: contemporary mathematical proofs become increasingly complex, the need

71: for formal verification becomes pressing. Formal verification can also

72: help guarantee correctness when, as is becoming increasingly common,

73: proofs rely on computations that are too long to check by hand.

74: Hales's ambitious Flyspeck project \cite{hales:unp}, which aims for a

75: fully verified form of his proof of the Kepler conjecture, is a

76: response to both of these concerns. Here, we will provide some

77: information as to the time and effort that went into our

78: formalization, which should help gauge the feasibility of such

79: verification efforts.

80:

81: More interesting, of course, are the lessons that can be learned.

82: This, however, puts us on less certain terrain. Our efforts certainly

83: provide some indications as to how to improve libraries and systems

84: for verifying mathematics, but the data still need to be processed

85: and better understood. Here, therefore, we simply offer some initial

86: thoughts and observations.

87:

88: The outline of this paper is as follows. In

89: Section~\ref{background:section}, we provide some background on the

90: prime number theorem and the Isabelle proof assistant. In

91: Section~\ref{overview:section}, we provide an overview of Selberg's

92: proof, our formalization, and the effort involved. In

93: Section~\ref{reflection:section}, we discuss some interesting aspects

94: of the formalization: the use of asymptotic reasoning; calculations

95: with real numbers; casts between natural numbers, integers, and real

96: numbers; combinatorial reasoning in number theory; and the use of

97: elementary methods. Finally, in Section~\ref{conclusions:section} we

98: offer some brief conclusions.

99:

100: Our formalization of the prime number theorem was a collaborative

101: effort on the part of Avigad, Donnelly, Gray, and Raff, building, of

102: course, on the efforts of the entire Isabelle development team. This

103: article was, however, written by Avigad, so opinions and speculation

104: contained herein should be attributed to him.

105:

106: %===================================================================

107:

108: \section{Background}

109: \label{background:section}

110:

111: \subsection{The prime number theorem}

112:

113: The statement of the prime number theorem was conjectured by both

114: Gauss and Legendre, on the basis of computation, around the turn of

115: the nineteenth century. In a pair of papers published in 1851 and

116: 1852, Chebyshev made significant advances towards proving it. Note

117: that we can write

118: \[

119: \pi(x) = \sum_{p \leq x} 1,

120: \]

121: where $p$ ranges over the prime numbers. Contrary to our notation

122: above, $x$ is usually treated as a real variable, making $\pi$ a step

123: function on the reals. Chebyshev defined, in addition, the functions

124: \[

125: \theta(x) = \sum_{p \leq x} \ln p

126: \]

127: and

128: \[

129: \psi(x) = \sum_{p^a \leq x} \ln p = \sum_{n \leq x} \Lambda(n),

130: \]

131: where

132: \[

133: \Lambda(n) =

134: \left\{ \begin{array}{ll}

135:   \ln p & \mbox{if $n = p^a$, for some $a \geq 1$} \\

136:   0 & \mbox{otherwise.}

137: \end{array}\right.

138: \]

139: The functions $\theta$ and $\psi$ are more sensitive to the presence

140: of primes than $\pi$, and have nicer analytic properties. Chebyshev

141: showed that the prime number theorem is equivalent to the assertion

142: $\lim_{x \to \infty} \theta(x) / x = 1$, as well as to the assertion

143: $\lim_{x \to \infty} \psi(x) / x = 1$. He also provided bounds

144: \[

145: B < \pi(x) \ln x / x < 6 B / 5

146: \]

147: for sufficiently large $x$, where

148: \[

149: B = \ln 2 / 2 + \ln 3 / 3 + \ln 5 / 5 - \ln 30 / 30 > 0.92

150: \]

151: and $6 B / 5 < 1.11$. So, as $x$ approaches infinity, $\pi(x) \ln

152: x / x$, at worst, oscillates between these two values.

153:

154: In a landmark work of 1859, Riemann introduced the complex-valued

155: function $\zeta$ into the study of number theory. It was not until

156: 1894, however, that von Mangoldt provided an expression for $\psi$

157: that reduced the prime number theorem, essentially, to showing that

158: $\zeta$ has no roots with real part equal to 1. This last step was

159: achieved by Hadamard and de la Vall\'ee Poussin, independently, in

160: 1896. The resulting proofs make strong use of the theory of complex

161: functions.  In 1921, Hardy expressed strong doubts as to whether a

162: proof of the theorem was possible which did not depend, fundamentally,

163: on these ideas. In 1948, however, Selberg and Erd\"os found elementary

164: proofs based on a ``symmetry formula'' due to Selberg. (The nature of

165: the interactions between Selberg and Erd\"os at the time and the

166: influence of ideas is a subtle one, and was the source of tensions

167: between the two for years to come.) Since the libraries we had to work

168: with had only a minimal theory of the complex numbers and a limited

169: real analysis library, we chose to formalize the Selberg proof.

170:

171: There are a number of good introductions to analytic number theory

172: (for example, \cite{apostol:76,jameson:04}). Edwards's \emph{Riemann's

173:   zeta function} \cite{edwards:01} is an excellent source of both

174: historical and mathematical information. A number of textbooks present

175: Selberg's proof, including those by Nathanson \cite{nathanson:00},

176: Shapiro \cite{shapiro:83}, and Hardy and Wright

177: \cite{hardy:wright:79}. We followed Shapiro's excellent presentation

178: quite closely, though we made good use of Nathanson's book as well.

179:

180: We also had help from another source. Cornaros and Dimitracopoulos

181: \cite{cornaros:dimitracopoulos:94} have shown that the prime number

182: theorem is provable in a weak fragment of arithmetic, by showing how

183: to formalize Selberg's proof (based on Shapiro's presentation) in that

184: fragment.\footnote{For issues relating to the formalization of

185:   mathematics, and number theory in particular, in weak theories of

186:   arithmetic, see Avigad \cite{avigad:03b}.}  Their concerns were

187: different from ours: by relying on a formalization of higher-order

188: logic, we were allowing ourselves a logically stronger theory; on the

189: other hand, Cornaros and Dimitracopoulos were concerned solely with

190: axiomatic provability and not ease of formalization. Their work was,

191: however, quite helpful in stripping the proof down to its bare

192: essentials.  Also, since our libraries did not have a good theory of

193: integration, we had to take some care to avoid the mild uses of

194: analysis in the textbook presentations. Cornaros and Dimitracopoulos's

195: work was again often helpful in that respect.

196:

197: \subsection{Isabelle}

198:

199: Isabelle \cite{isabelle} is a generic proof assistant developed under

200: the direction of Larry Paulson at Cambridge University and Tobias

201: Nipkow at TU Munich. The HOL instantiation \cite{nipkow:et:al:02}

202: provides a formal framework that is a conservative extension of

203: Church's simple type theory with an infinite type (from which the

204: natural numbers are constructed), extensionality, and the axiom of

205: choice. Specifically, HOL extends ordinary type theory with set types,

206: and a schema for polymorphic axiomatic type classes designed by

207: Nipkow and implemented by Marcus Wenzel \cite{wenzel:97}. It also

208: includes a definite description operator (``THE''), and an indefinite

209: description operator (``SOME'').\footnote{The extension by set types

210:   is mild, since they are easily interpretable in terms of predicate

211:   types $\sigma \to \fn{bool}$. Similarly, the definite description

212:   operator can be eliminated, at least in principle, using Russell's

213:   well-known interpretation. It is the indefinite description

214:   operator, essentially a version of Hilbert's epsilon operator, that

215:   gives rise to the axiom of choice.  Though we occasionally used the

216:   indefinite description operator for convenience, these uses could

217:   easily be replaced by the definition description operator, and it is

218:   likely that uses of the axiom of choice can be dispensed with in the

219:   libraries as well. In any event, it is a folklore result that

220:   G\"odel's methods transfer to higher-order logic to show that the

221:   axiom of choice is a conservative extension for a fragment the

222:   includes the prime number theorem.}

223:

224: Isabelle offers good automated support, including a term simplifier,

225: an automated reasoner (which combines tableau search with rewriting),

226: and decision procedures for linear and Presburger arithmetic. It is an

227: LCF-style theorem prover, which is to say, correctness is guaranteed

228: by the use of a small number of constructors, in an underlying typed

229: programming language, to build proofs. Using the Proof General

230: interface \cite{proof:general}, one can construct proofs interactively

231: by repeatedly applying ``tactics'' that reduce a current subgoal to

232: simpler ones. But Isabelle also allows one to take advantage of a

233: higher-level proof language, called Isar, implemented by Wenzel

234: \cite{wenzel:02}. These two styles of interaction can, furthermore, be

235: combined within a proof. We found Isar to be extremely helpful in

236: structuring complex proofs, whereas we typically resorted to

237: tactic-application for filling in low-level inferences. Occasionally,

238: we also made mild use of Isabelle's support for locales

239: \cite{ballarin:05}. For more information on Isabelle, one should

240: consult the tutorial \cite{nipkow:et:al:02} and other online

241: documentation \cite{isabelle}.

242:

243: Our formalization made use of the basic HOL library, as well as those

244: parts of the HOL-Complex library, developed primarily by Jacques

245: Fleuriot, that deal with the real numbers. Some of our earlier

246: definitions, lemmas, and theorems made their way into the 2004 release

247: of Isabelle, in which the formalization described here took

248: place. Some additional theorems in our basic libraries will be part of

249: the 2005 release.

250:

251: %===================================================================

252:

253: \section{Overview}

254: \label{overview:section}

255:

256: \subsection{The Selberg proof}

257: \label{selberg:subsection}

258:

259: The prime number theorem describes the asymptotic behavior of a

260: function from the natural numbers to the reals. Analytic number theory

261: works by extending the domain of such functions to the real numbers,

262: and then providing a toolbox for reasoning about such functions. One

263: is typically concerned with rough characterizations of a function's

264: rate of growth; thus $f = O(g)$ expresses the fact that

265: for some constant $C$, $| f(x) | \leq C | g(x) |$ for every $x$.

266: (Sometimes, when writing $f = O(g)$, one really means that the

267: inequality holds except for some initial values of $x$, where $g$ is

268: $0$ or one of the functions is undefined; or that the inequality holds

269: when $x$ is large enough.)

270:

271: For example, all of the following identities can be obtained using

272: elementary calculus:

273: \begin{equation*}

274: \begin{split}

275: \ln (1 + 1 / n) & = 1 / n + O(1/n^2) \\

276: \sum_{n \leq x} 1 / n  & = \ln x + O(1) \\

277: \sum_{n \leq x} \ln n & = x \ln x - x + O(\ln x)\\

278: \sum_{n \leq x} \ln n / n & = \ln^2 x / 2 + O(1)

279: \end{split}

280: \end{equation*}

281: In all of these, $n$ ranges over positive integers. The last three

282: inequalities hold whether one takes $x$ to be an integer or a real

283: number greater than or equal to $1$. The second identity reflects the

284: fact that the integral of $1 / x$ is $\ln x$, and the third reflects

285: the fact that the integral of $\ln x$ is $x \ln x - x$. A list of

286: identities like these form one part of the requisite background to the

287: Selberg proof.

288:

289: Some of Chebyshev's results form another part. Rate-of-growth

290: comparisons between $\theta$, $\psi$, and $\pi$ sufficient to show the

291: equivalence of the various statements of the prime number theorem can

292: be obtained by fairly direct calculations. Obtaining any of the upper

293: bounds equivalent to $\psi(x) = O(x)$ requires more work. A nice way

294: of doing this, using binomial coefficients, can be found in

295: Nathanson \cite{nathanson:00}.

296:

297: Number theory depends crucially on having different ways of counting

298: things, and rudimentary combinatorial methods form a third

299: prerequisite to the Selberg proof. For example, consider the set of

300: (positive) divisors $d$ of a positive natural number $n$. Since the

301: function $d \mapsto n / d$ is a permutation of that set, we have the

302: following identity:

303: \[

304: \sum_{d | n} f(d) = \sum_{d | n} f(n / d).

305: \]

306: For a more complicated example, suppose $n$ is a positive integer, and

307: consider the set of pairs $d, d'$ of positive integers such that $d d'

308: \leq n$. There are two ways to enumerate these pairs: for each

309: value of $d$ between $1$ and $n$, we can enumerate all the values $d'$

310: such that $d' \leq n / d$; or for each product $c$ less than $n$, we

311: can enumerate all pairs $d, c / d$ whose product is $c$. Thus we have

312: \begin{equation}

313: \begin{split}

314: \sum_{d \leq n} \sum_{d' \leq n / d} f(d, d') & = \sum_{d d' \leq n}

315: f(d, d') \\

316: & = \sum_{c \leq n} \sum_{d | c} f(d, c/d).

317: \end{split}

318: \end{equation}

319: A similar argument yields

320: \begin{equation}

321: \label{counting:eq}

322: \begin{split}

323: \sum_{d | n} \sum_{d' | (n / d)} f(d, d') & = \sum_{d d' | n}

324: f(d, d') \\

325: & = \sum_{c | n} \sum_{d | c} f(d, c/d).

326: \end{split}

327: \end{equation}

328: Yet another important combinatorial identity is given by the partial

329: summation formula, which, in one formulation, is as follows: if $a

330: \leq b$, $F(n)

331: = \sum_{i = 1}^n f(i)$, and $G$ is any function, then

332: \begin{multline*}

333: \sum_{n = a}^b f(n + 1) G(n + 1) = F(b + 1) G(b + 1) - F(a) G(a + 1) -

334: \\

335: \sum_{n = a}^{b-1} F(n + 1) (G(n + 2) - G(n + 1)).

336: \end{multline*}

337: This can be viewed as a discrete analogue of integration by parts, and

338: can be verified by induction.

339:

340: An important use of (\ref{counting:eq}) occurs in the proof of the

341: M\"obius inversion formula, which we now describe. A positive natural

342: number $n$ is said to be \emph{square free} if no prime in its

343: factorization occurs with multiplicity greater than 1; in other words,

344: $n = p_1 p_2 \cdots p_s$ where the $p_i$'s are distinct primes (and

345: $s$ may be $0$). Euler's function $\mu$ is defined by

346: \[

347: \mu(n) = \left\{

348:   \begin{array}{ll}

349:   (-1)^s & \mbox{if $n$ is squarefree and $s$ is as above} \\

350:   0 & \mbox{otherwise.} \\

351:   \end{array}

352: \right.

353: \]

354: A remarkably useful fact regarding $\mu$ is that for $n > 0$,

355: \begin{eqnarray}

356: \label{mu:eq}

357: \sum_{d | n} \mu(d) = \left\{

358:   \begin{array}{ll}

359:   1 & \mbox{if $n = 1$} \\

360:   0 & \mbox{otherwise.}

361:   \end{array}

362: \right.

363: \end{eqnarray}

364: To see this, define the \emph{radical} of a number $n$, denoted

365: $\fn{rad}(n)$, to be the greatest squarefree number dividing $n$. It

366: is not hard to see that if $n$ has prime factorization $p_1^{j_1}

367: p_2^{j_2} \cdots p_s^{j_s}$, then $\fn{rad}(n)$ is given by $p_1 p_2

368: \cdots p_s$. Then $\sum_{d | n} \mu(d) = \sum_{d | \fn{rad}(n)}

369: \mu(d)$, since divisors of $n$ that are not divisors of $\fn{rad}(n)$

370: are not squarefree and hence contribute $0$ to the sum. If $n = 1$,

371: equation (\ref{mu:eq}) is clear. Otherwise, write $\fn{rad}(n) = p_1

372: p_2 \cdots p_s$, write

373: \[

374: \sum_{d|\fn{rad}(n)} \mu(d) = \sum_{d | \fn{rad}(n), p_1 | d} \mu(d) +

375: \sum_{d | \fn{rad}(n), p_1 \nmid d} \mu(d),

376: \]

377: and note that each term in the first sum is canceled by a

378: corresponding one in the second.

379:

380: Now, suppose $g$ is any function from $\N$ to $\mathbb{R}$, and define $f$

381: by $f(n) = \sum_{d |n} g(d)$. The M\"obius inversion formula provides

382: a way of ``inverting'' the definition to obtain an expression for $g$

383: in terms of $f$.  Using (\ref{counting:eq}) for the third equality

384: below and (\ref{mu:eq}) for the last, we have, somewhat miraculously,

385: \begin{eqnarray*}

386: \sum_{d | n} \mu(d) f(n / d) & = &

387:   \sum_{d | n} \mu(d) \sum_{d' | (n / d)} g((n / d) / d') \\

388: & = & \sum_{d | n} \sum_{d' | (n / d)} \mu(d) g((n / d) / d') \\

389: & = & \sum_{c | n} \sum_{d | c} \mu(d) g(n / c) \\

390: & = & \sum_{c | n} g(n / c) \sum_{d | c} \mu(d) \\

391: & = & g(n),

392: \end{eqnarray*}

393: since the inner sum on the second-to-last line is $0$ except when $c$

394: is equal to $1$.

395:

396: All the pieces just described come together to yield additional

397: identities involving sums, $\ln$, and $\mu$, as well as Mertens's

398: theorem:

399: \[

400: \sum_{n \leq x} \Lambda(n) / n = \ln x + O(1).

401: \]

402: These, in turn, are used to derive Selberg's elegant ``symmetry

403: formula,'' which is the central component in the proof. One

404: formulation of the symmetry formula is as follows:

405: \[

406: \sum_{n \leq x} \Lambda (n) \ln n + \sum_{n \leq x} \sum_{d | n}

407: \Lambda(d) \Lambda(n / d) = 2 x \ln x + O(x).

408: \]

409: There are, however, many variants of this identity, involving

410: $\Lambda$, $\psi$, and $\theta$. These crop up in profusion because

411: one can always unpack definitions of the various functions, apply the

412: types of combinatorial manipulations described above, and use

413: identities and approximations to simplify expressions.

414:

415: What makes the Selberg symmetry formula so powerful is that there are

416: two terms in the sum on the left, each sensitive to the presence of

417: primes in different ways. The formula above implies there have to be

418: some primes --- to make left-hand side nonzero --- but there can't be

419: too many. Selberg's proof involves cleverly balancing the two terms

420: off each other, to show that in the long run, the density of the

421: primes has the appropriate asymptotic behavior.

422:

423: Specifically, let $R(x) = \psi(x) - x$ denote the ``error term,'' and

424: note that by Chebyshev's equivalences the prime number theorem amounts

425: to the assertion $\lim_{x \to \infty} R(x) / x = 0$. With some

426: delicate calculation, one can use the symmetry formula to obtain a

427: bound on $|R(x)|$:

428: \begin{equation}

429: \label{r:eq}

430: |R(x)| \ln^2 x \leq 2 \sum_{n \leq x} | R(x / n)| \ln n + O(x \ln x).

431: \end{equation}

432: Now, suppose we have a bound $| R(x) | \leq a x$ for sufficiently large

433: $x$. Substituting this into the right side of (\ref{r:eq}) and using

434: an approximation for $\sum_{n \leq x} \ln n / n$ we get

435: \[

436: |R(x)| \leq a x + O(x / \ln x),

437: \]

438: which is not an improvement on the bound $|R(x)| \leq a x$ with which

439: we began. Selberg's method involves showing that in fact there are

440: always enough sufficiently large intervals on which one can obtain a

441: stronger bound on $R(x)$, so that for some positive constant $k$,

442: assuming we have a bound $|R(x)| \leq a x$ that valid for $x \geq

443: c_1$, we can obtain a $c_2$ and a better bound $|R(x)| \leq (a - k

444: a^3)$, valid for $x \geq c_2$. The constant $k$ depends on $a$, but

445: the same constant also works for any $a' < a$.

446:

447: By Chebyshev's theorem, we know that there is a constant $a_1$ such

448: that $|R(x)| \leq a_1 x$ for every $x$. Choosing $k$ appropriate for

449: $a_1$ and then setting $a_{n+1} = a_n - k a_n^3$, we have that for

450: every $n$, there is a $c$ large enough so that $|R(x)|/x \leq a_n$ for

451: every $x \geq c$. But it is not hard to verify that the sequence $a_1,

452: a_2, \ldots$ approaches $0$, which implies that $R(x) / x$ approaches

453: $0$ as $x$ approaches infinity, as required.

454:

455: \subsection{Our formalization}

456: \label{formalization:subsection}

457:

458: All told, our number theory session, including the proof of the prime

459: number theorem and supporting libraries, constitutes 673 pages of

460: proof scripts, or roughly 30,000 lines.  This count includes about 65

461: pages of elementary number theory that we had at the outset, developed

462: by Larry Paulson and others; also about 50 pages devoted to a proof of

463: the law of quadratic reciprocity and properties of Euler's $\ph$

464: function, neither of which are used in the proof of the prime number

465: theorem.  The page count does not include the basic HOL library, or

466: properties of the real numbers that we obtained from the HOL-Complex

467: library.

468:

469: The overview provided in the last section should provide a general

470: sense of the components that are needed for the formalization. To

471: start with, one needs good supporting libraries:

472: \begin{itemize}

473: \item a theory of the natural numbers and integers, including

474:   properties of primes and divisibility, and the fundamental theorem

475:   of arithmetic

476: \item a library for reasoning about finite sets, sums, and products

477: \item a library for the real numbers, including properties of $\ln$

478: \end{itemize}

479: The basic Isabelle libraries provided a good starting point, though we

480: had to augment these considerably as we went along. More specific

481: supporting libraries include:

482: \begin{itemize}

483: \item properties of the $\mu$ function, combinatorial identities, and

484:   the M\"obius inversion formula

485: \item a library for asymptotic ``big O'' calculations

486: \item a number of basic identities involving sums and $\ln$

487: \item Chebyshev's theorems

488: \end{itemize}

489: Finally, the specific components of the Selberg proof are:

490: \begin{itemize}

491: \item the Selberg symmetry formula

492: \item the inequality involving $R(n)$

493: \item a long calculation to show $R(n)$ approaches $0$

494: \end{itemize}

495:

496: This general outline is clearly discernible in the list of theory

497: files, which can be viewed online \cite{avigad:isabelle}. Keep in

498: mind that the files described here have not been modified since the

499: original proof was completed, and many of the proofs were written

500: while various participants in the project were still learning how to

501: use Isabelle.  Since then, some of the basic libraries have been

502: revised and incorporated into Isabelle, but Avigad intends to revise

503: the number theory libraries substantially before cleaning up the rest

504: of the proof.

505:

506: Once the basic libraries are in place, our formal proof follows

507: Shapiro's presentation quite closely, though for some parts we

508: followed Nathanson instead. A detailed description of

509: our proof would amount to little more than a step-by-step narrative

510: of (one of the various paths through) Selberg's proof, with page

511: correspondences in texts we followed. For example, one of our

512: formulations of the M\"obius inversion is as follows:

513:

514: \medskip

515:

516: \begin{isabellebody}

517: \isamarkupfalse%

518: \isacommand{lemma}\ mu{\isacharunderscore}inversion{\isacharunderscore}nat{\isadigit{1}}a{\isacharcolon}\ {\isachardoublequote}ALL\ n{\isachardot}\ {\isacharparenleft}{\isadigit{0}}\ {\isacharless}\ n\ {\isasymlongrightarrow}\isanewline

519: \ \ f\ n\ {\isacharequal}\ {\isacharparenleft}{\isasymSum}\ d\ {\isacharbar}\ d\ dvd\ n{\isachardot}\ g{\isacharparenleft}n\ div\ d{\isacharparenright}{\isacharparenright}{\isacharparenright}\ {\isasymLongrightarrow}\ {\isadigit{0}}\ {\isacharless}\ {\isacharparenleft}n{\isacharcolon}{\isacharcolon}nat{\isacharparenright}\ {\isasymLongrightarrow}\isanewline

520: \ \ g\ n\ {\isacharequal}\ {\isacharparenleft}{\isasymSum}\ d\ {\isacharbar}\ d\ dvd\ n{\isachardot}\ of{\isacharunderscore}int{\isacharparenleft}mu{\isacharparenleft}int{\isacharparenleft}d{\isacharparenright}{\isacharparenright}{\isacharparenright}\ {\isacharasterisk}\ f\ {\isacharparenleft}n\ div\ d{\isacharparenright}{\isacharparenright}{\isachardoublequote}

521: \end{isabellebody}

522:

523: \medskip

524:

525: \noindent This appears on page 64 of Shapiro's book, and on page 218

526: of Nathanson's book. We formalized a version of the fourth identity

527: listed in Section~\ref{formalization:subsection} as follows:

528:

529: \medskip

530:

531: \begin{isabellebody}

532: \isamarkupfalse%

533: \isacommand{lemma}\ identity{\isacharunderscore}four{\isacharunderscore}real{\isacharunderscore}b{\isacharcolon}\ {\isachardoublequote}{\isacharparenleft}{\isasymlambda}x{\isachardot}\ {\isasymSum}i{\isacharequal}{\isadigit{1}}{\isachardot}{\isachardot}natfloor{\isacharparenleft}abs\ x{\isacharparenright}{\isachardot}\isanewline

534: \ \ \ \ ln\ {\isacharparenleft}real\ i{\isacharparenright}\ {\isacharslash}\ {\isacharparenleft}real\ i{\isacharparenright}{\isacharparenright}\ {\isacharequal}o\ \isanewline

535: \ \ \ \ \ \ {\isacharparenleft}{\isasymlambda}x{\isachardot}\ ln{\isacharparenleft}abs\ x\ {\isacharplus}\ {\isadigit{1}}{\isacharparenright}{\isacharcircum}{\isadigit{2}}\ {\isacharslash}\ {\isadigit{2}}{\isacharparenright}\ {\isacharplus}o\ O{\isacharparenleft}{\isasymlambda}x{\isachardot}\ {\isadigit{1}}{\isacharparenright}{\isachardoublequote}

536: \end{isabellebody}

537:

538: \medskip

539:

540: \noindent In fact, stronger assertions can be found on page 93 of Shapiro's

541: book, and on page 209 of Nathanson's book. Here is one of our

542: formulations of the Selberg symmetry principle:

543:

544: \medskip

545:

546: \begin{isabellebody}

547: \isamarkupfalse%

548: \isacommand{lemma}\ Selberg{\isadigit{3}}{\isacharcolon}\ {\isachardoublequote}{\isacharparenleft}{\isasymlambda}x{\isachardot}\ {\isasymSum}n\ {\isacharequal}\ {\isadigit{1}}{\isachardot}{\isachardot}natfloor\ {\isacharparenleft}abs\ x{\isacharparenright}\ {\isacharplus}\ {\isadigit{1}}{\isachardot}\isanewline

549: \ \ \ \ Lambda\ n\ {\isacharasterisk}\ ln\ {\isacharparenleft}real\

550: n{\isacharparenright}{\isacharparenright}\ {\isacharplus}\ {\isacharparenleft}{\isasymlambda}x{\isachardot}\ {\isasymSum}n{\isacharequal}{\isadigit{1}}{\isachardot}{\isachardot}natfloor\ {\isacharparenleft}abs\ x{\isacharparenright}\ {\isacharplus}\ {\isadigit{1}}{\isachardot}\ \isanewline

551: \ \ \ \ \ {\isacharparenleft}{\isasymSum}\ u\ {\isacharbar}\ u\ dvd\ n{\isachardot}\ Lambda\ u\ {\isacharasterisk}\ Lambda\ {\isacharparenleft}n\ div\ u{\isacharparenright}{\isacharparenright}{\isacharparenright}\isanewline

552: \ \ \ \ \ \ {\isacharequal}o\ {\isacharparenleft}{\isasymlambda}x{\isachardot}\ {\isadigit{2}}\ {\isacharasterisk}\ {\isacharparenleft}abs\ x\ {\isacharplus}\ {\isadigit{1}}{\isacharparenright}\ {\isacharasterisk}\ ln\ {\isacharparenleft}abs\ x\ {\isacharplus}\ {\isadigit{1}}{\isacharparenright}{\isacharparenright}\ {\isacharplus}o\ O{\isacharparenleft}{\isasymlambda}x{\isachardot}\ abs\ x\ {\isacharplus}\ {\isadigit{1}}{\isacharparenright}{\isachardoublequote}

553: \end{isabellebody}

554:

555: \medskip

556:

557: \noindent This is given on page 419 of Shapiro's book, and on page 293 of

558: Nathanson's book. The error estimate given in the previous section,

559: taken from 431 of Shapiro's book, takes the following form:

560:

561: \medskip

562:

563: \begin{isabellebody}

564: \isamarkupfalse%

565: \isacommand{lemma}\ error{\isadigit{7}}{\isacharcolon}\ {\isachardoublequote}{\isacharparenleft}{\isasymlambda}x{\isachardot}\ abs\ {\isacharparenleft}R\ {\isacharparenleft}abs\ x\ {\isacharplus}\ {\isadigit{1}}{\isacharparenright}{\isacharparenright}\ {\isacharasterisk}\ ln\ {\isacharparenleft}abs\ x\ {\isacharplus}\ {\isadigit{1}}{\isacharparenright}\ {\isacharcircum}\ {\isadigit{2}}{\isacharparenright}\ {\isacharless}o\isanewline

566: \ \ {\isacharparenleft}{\isasymlambda}x{\isachardot}\ {\isadigit{2}}\ {\isacharasterisk}\ {\isacharparenleft}{\isasymSum}\ n\ {\isacharequal}\ {\isadigit{1}}{\isachardot}{\isachardot}natfloor\ {\isacharparenleft}abs\ x{\isacharparenright}\ {\isacharplus}\ {\isadigit{1}}{\isachardot}\isanewline

567: \ \ \ \ \ \ abs\ {\isacharparenleft}R\ {\isacharparenleft}{\isacharparenleft}abs\ x\ {\isacharplus}\ {\isadigit{1}}{\isacharparenright}\ {\isacharslash}\ real\ n{\isacharparenright}{\isacharparenright}\ {\isacharasterisk}\ ln\ {\isacharparenleft}real\ n{\isacharparenright}{\isacharparenright}{\isacharparenright}\ {\isacharequal}o\ \isanewline

568: \ \ \ \ O{\isacharparenleft}{\isasymlambda}x{\isachardot}\ {\isacharparenleft}abs\ x\ {\isacharplus}\ {\isadigit{1}}{\isacharparenright}\ {\isacharasterisk}\ {\isacharparenleft}{\isadigit{1}}\ {\isacharplus}\ ln\ {\isacharparenleft}abs\ x\ {\isacharplus}\ {\isadigit{1}}{\isacharparenright}{\isacharparenright}{\isacharparenright}{\isachardoublequote}

569: \end{isabellebody}

570:

571: \medskip

572:

573: \noindent We will have more to say, below, about handling of

574: asymptotic notation, the type casts, and the various occurrences of

575: $\fn{abs}$ and $+1$ that make the formal presentation differ from

576: ordinary mathematical notation. But aside from calling attention to

577: differences like these, a more detailed outline would not be very

578: interesting.

579:

580: There are additional reasons that it does not pay to describe the

581: formal proofs in great detail. For one thing, they are not

582: particularly nice: our efforts were designed to get us to the prime

583: number theorem as quickly as possible, and so the proofs could use a

584: good deal of cleaning and polishing. Second, and more to the point, we

585: know that our formalization is not optimal. It hardly makes sense for

586: us to describe exactly how we went about proving the M\"obius

587: inversion formula, for example, until we are convinced that we have

588: done it right; that is, until we are convinced the we have made the

589: supporting libraries as generally useful as possible, and configured

590: the automated tools in such a way to make the formalization as smooth

591: as possible. We therefore intend to invest more time in improving the

592: various parts of the formalization and report on these when it is

593: clear what we have learned from the efforts.

594:

595: In the meanwhile, we will devote the rest of this report to conveying

596: two types of information. First, to help gauge the usability of the

597: current technology, we will try to provide a sense of the amount of

598: time required to seeing the project through to its completion. Second,

599: we will provide some initial reflections on the project, and on the

600: strengths and weaknesses of contemporary proof assistants. In

601: particular, we will discuss what we take to be some of the novel

602: aspects of the formalization, and indicate where we believe better

603: automated support would have been especially helpful.

604:

605: \subsection{The effort involved}

606:

607: As we have noted in the introduction, one of the most interesting

608: features of our formalization of the prime number theorem is simply

609: its existence, which shows that current technology makes it possible

610: to treat a proof of this complexity. The question naturally arises as

611: to how long the formalization took.

612:

613: This is a question that it hard to answer with any precision.  Avigad

614: first decided to undertake the project in March of 2003, having

615: learned how to use Isabelle and proved Gauss's law of quadratic

616: reciprocity with Gray and Adam Kramer the preceding summer and fall.

617: But this was a side project for everyone involved, and time associated

618: it includes time spent learning to use Isabelle, time spent learning

619: the requisite number theory, and so on. Gray developed a substantial

620: part of the number theory library, including basic facts about primes

621: and multiplicity, the $\mu$ function, and the identity

622: (\ref{counting:eq}), working a few hours per week in the summer of

623: 2003, before his thesis work in ethics took over. Donnelly and Avigad

624: developed the library to support big O calculations

625: \cite{avigad:donnelly:04} while Donnelly worked half-time during the

626: summer of 2003, just after he completed his junior year at Carnegie

627: Mellon. During that summer, and working part time the following year,

628: Donnelly also derived some of the basic identities involving $\ln$.

629: Raff started working on the project in the 2003-2004 academic year,

630: but most of his contributions came working roughly half-time in the

631: summer of 2004, just after he obtained his undergraduate degree.

632: During that time, he proved Chebyshev's theorem to the effect that

633: $\psi(x) = O(x)$, and also did most of the work needed to prove the

634: equivalence of statements of the prime number theorem in terms of the

635: functions $\pi$, $\theta$, and $\psi$. Though Avigad's involvement was

636: more constant, he rarely put in more than a few hours per week before

637: the summer of 2004, and set the project aside for long stretches of

638: time. The bulk of his proof scripts were written during the summer of

639: 2004, when he worked roughly half-time on the project from the middle

640: of June to the end of August.

641:

642: Some specific benchmarks may be more informative. Proving most of the

643: inversion theorems we needed, starting from (\ref{counting:eq}) and

644: the relevant properties of $\mu$, took Avigad about a day. (For a

645: ``day'' read eight hours of dedicated formalization.  Though he could

646: put in work-days like that for small stretches, in some of the

647: estimates below, the work was spread out over longer periods of time.)

648: Proving the first version of the Selberg symmetry formula using the

649: requisite identities took another day. Along the way, he was often

650: sidetracked by the need to prove elementary facts about things like

651: primes and divisibility, or the floor function on the real numbers.

652: This process stabilized, however, and towards the end he found that he

653: could formalize about a page of Shapiro's text per day. Thus, the

654: derivation of the error estimate described above, taken from pages

655: 428--431 in Shapiro's book, took about three-and-a-half days to

656: formalize; and the remainder of the proof, corresponding to 432--437

657: in Shapiro's book, took about five days.

658:

659: In many cases, the increase in length is dramatic: the

660: three-and-a-half pages of text associated with the proof of the error

661: estimate translate to about 1,600 lines, or 37 pages, of proof

662: scripts, and the five pages of text associated with the final part of

663: the proof translate to about 4,000 lines, or 89 pages, of proof

664: scripts. These ratios are abnormally high, however, for reasons

665: discussed in Section~\ref{calculation:subsection}. The five-line

666: derivation of the M\"obius inversion formula in

667: Section~\ref{selberg:subsection} translates to about 40 lines, and the

668: proof of the form of the Selberg symmetry formula discussed there,

669: carried out in about two-and-a-half pages in Shapiro's book, takes up

670: about 600 lines, or 13 pages. These ratios are more typical.

671:

672: We suspect that over the coming years both the time it takes to carry

673: out such formalizations, as well as the lengths of the formal proof

674: scripts, will drop significantly. Much of the effort involved in the

675: project was spent on the following:

676: \begin{itemize}

677: \item Defining fundamental concepts and gathering basic libraries of

678:   easy facts.

679: \item Proving trivial lemmas and spelling out ``straightforward''

680:   inferences.

681: \item Finding the right lemmas and theorems to apply.

682: \item Entering long formulas and expressions correctly, and adapting

683:   ordinary mathematical notation to the formal notation in Isabelle.

684: \end{itemize}

685: Gradually, all these requirements will be ameliorated, as better

686: libraries, automated tools, and interfaces are developed. On a

687: personal note, we are entirely convinced that, although there is a

688: long road ahead, formal verification of mathematics will inevitably

689: become commonplace. Getting to that point will require both

690: theoretical and practical ingenuity, but we do not see any conceptual

691: hurdles.\footnote{For further speculation along these lines, see the

692:   preliminary notes \cite{avigad:unp:pnt}.}

693:

694: %===================================================================

695:

696: \section{Thoughts on the formalization}

697: \label{reflection:section}

698:

699: In this section, we will discuss features of the formalization that we

700: feel are worthy of discussion, either because they represent novel and

701: successful solutions to general problems, or (more commonly) because

702: they indicate aspects of formal mathematical verification where better

703: support is possible.

704:

705: \subsection{Asymptotics}

706:

707: One of our earliest tasks in the formalization was to develop a

708: library to support the requisite calculations with big O expressions.

709: To that end, we gave the expression $f = O(g)$ the strict reading $\ex

710: C \fa x (|f(x)| \leq C |g(x)|)$, and followed the common practice of

711: taking $O(g)$ to be the set of all functions with the requisite rate

712: of growth, i.e.~

713: \[

714: O(g) = \{ f \st \ex C \fa x (|f(x)| \leq C |g(x)|) \}.

715: \]

716: We then read the ``equality'' in $f = O(g)$ as the element-of

717: relation, $\in$.

718:

719: Note that these expressions make sense for any function type for which

720: the codomain is an ordered ring. Isabelle's axiomatic type classes

721: made it possible to develop the library fully generally. We were able

722: to lift operations like addition and multiplication to such types,

723: defining $f + g$ to denote the pointwise sum, $\lambda x. (f(x) +

724: g(x))$. Similarly, given a set $B$ of elements of a type that supports

725: addition, we defined

726: \[

727: a +_o B = \{ c \st \ex {b \in B} (c = a + b) \}.

728: \]

729: We also defined $a =_o B$ to be alternative input syntax for $a \in

730: B$.  This gave expressions like $f =_o g +_o O(h)$ the intended

731: meaning. In mathematical texts, convention dictates that in an

732: expression like $x^2 + 3 x = x^2 + O(x)$, the terms are to be

733: interpreted as functions of $x$; in Isabelle we had to use lambda

734: notation to make this explicit. Thus, the expression above would be

735: entered

736: \begin{center}

737: {\isastyle {\isacharparenleft}{\isasymlambda}x{\isachardot}\

738:   x{\isacharcircum}{\isadigit{2}}\ {\isacharplus}\ {\isadigit{3}}\

739:   {\isacharasterisk}\ x{\isacharparenright}\ {\isacharequal}o\

740:   {\isacharparenleft}{\isasymlambda}x{\isachardot}\

741:   x{\isacharcircum}{\isadigit{2}}{\isacharparenright}\ {\isacharplus}o\

742:   O{\isacharparenleft}{\isasymlambda}x{\isachardot}\

743:   x{\isacharparenright}}

744: \end{center}

745: This should help the reader make sense of the formalizations presented

746: in Section~\ref{formalization:subsection}.

747:

748: An early version of our big O library was presented at IJCAR

749: \cite{avigad:donnelly:04}. That version is nonetheless fairly close to

750: the version used in the proof of the prime number theorem described

751: here, as well as a version that is scheduled for the 2005 release of

752: Isabelle.\footnote{Improvements in the more recent versions include

753:   better and more general theorems involving summations, theorems to

754:   handle composition of big O equations, and support for reasoning

755:   about asymptotic inequalities. Also, in the most recent version, we

756:   have dispensed with expressions of the form $O(S)$, where $S$ is a

757:   set of functions. It seems that uses of these are easily eliminable,

758:   and having $O$ notation for both functions and sets of functions led

759:   to annoying type ambiguities.}

760:

761: There is one feature of our library that seems to be less than

762: optimal, and resulted in a good deal of tedium. With our definition, a

763: statement like $\lam {x.} x + 1 = O(\lam {x.} x^2)$ is false when the

764: variables range over the natural numbers, since $x^2$ is equal to $0$

765: when $x$ is $0$. Often one wants to restrict one's attention to

766: strictly positive natural numbers, or nonnegative real numbers. There

767: are four ways one can do this:

768: \begin{itemize}

769: \item Define new types for the strictly positive natural numbers, or

770:   nonnegative real numbers, and state the identities for those types.

771: \item Formalize the notion ``$f = O(g)$ on $S$.''

772: \item Formalize the notion ``$f = O(g)$ eventually.''

773: \item Replace $x$ by $x + 1$ in the first case, and by $|x|$ in the

774:   second case, to make the identities correct. For example, ``$f(|x|)

775:   = O(|x|^3)$'' expresses that $f(x) = O(x^3)$ on the nonnegative

776:   reals. Various similar tinkerings are effective; for example, the

777:   relationship intended in the example above is probably best

778:   expressed as $\lam {x.} x + 1 = O(\lam{x.} x^2 + 1)$.

779: \end{itemize}

780: These various options are discussed in the IJCAR paper

781: \cite{avigad:donnelly:04}, and all come at a cost. For example, the

782: first requires annoying casts, say, between positive natural numbers,

783: and natural numbers. The second requires carrying around a set $S$ in

784: every formula, and both the second and third require additional work

785: when composing expressions or reasoning about sums (roughly, one has

786: to make sure that the range of a function lies in the domain where an

787: asymptotic estimate is valid).

788:

789: In our formalization, we chose the fourth route, which explains the

790: numerous occurrences of $+1$ and $\fn{abs}$ in the statements in

791: Section~\ref{formalization:subsection}. This often made some of the

792: more complex calculations painfully tedious, forcing us, for example,

793: the following ``helper'' lemma in Selberg:

794:

795: \medskip

796:

797: \begin{isabellebody}

798: \isamarkupfalse%

799: \isacommand{lemma}\ aux{\isacharcolon}\ {\isachardoublequote}{\isadigit{1}}\ {\isacharless}{\isacharequal}\ z\ {\isasymLongrightarrow}\ natfloor{\isacharparenleft}abs{\isacharparenleft}z\ {\isacharminus}\ {\isadigit{1}}{\isacharparenright}{\isacharparenright}\ {\isacharplus}\ {\isadigit{1}}\ {\isacharequal}\ natfloor\ z{\isachardoublequote}

800: \end{isabellebody}

801:

802: \medskip

803:

804: \noindent On the general principle that formalization goes most

805: smoothly when the formalization is as close as possible to the

806: informal text, it is probably worth extending the library in the ways

807: described above. We do not have a good sense, however, as to how much

808: this would have simplified our task.

809:

810: Donnelly and Avigad have designed a decision procedure for entailments

811: between linear big O equations, and have obtained a prototype

812: implementation (though we have not incorporated it into the Isabelle

813: framework). This would eliminate the need for helper lemmas like the

814: following:

815:

816: \medskip

817:

818: \begin{isabellebody}

819: \isamarkupfalse%

820: \isacommand{lemma}\ aux{\isadigit{5}}{\isacharcolon}\ {\isachardoublequote}f\ {\isacharplus}\ g\ {\isacharequal}o\ h\ {\isacharplus}o\ O{\isacharparenleft}k{\isacharcolon}{\isacharcolon}{\isacharprime}a{\isacharequal}{\isachargreater}{\isacharparenleft}{\isacharprime}b{\isacharcolon}{\isacharcolon}ordered{\isacharunderscore}ring{\isacharparenright}{\isacharparenright}\ {\isasymLongrightarrow}\isanewline

821: \ \ g\ {\isacharplus}\ l\ {\isacharequal}o\ h\ {\isacharplus}o\ O{\isacharparenleft}k{\isacharparenright}\ {\isasymLongrightarrow}\ f\ {\isacharequal}o\ l\ {\isacharplus}o\ O{\isacharparenleft}k{\isacharparenright}{\isachardoublequote}

822: \end{isabellebody}

823:

824: \medskip

825:

826: \noindent We believe calculations going beyond the linear fragment

827: would also benefit from a better handling of monotonicity, just as is

828: needed to support ordinary calculations with inequalities,

829: as described in the next section.

830:

831: \subsection{Calculations with real numbers}

832: \label{calculation:subsection}

833:

834: One salient feature of the Selberg proof is the amount of calculation

835: involved. The dramatic increase in the length of the formalization of

836: the final part of the proof (5 pages in Shapiro, compared to 89 or so

837: in the formal version) is directly attributable to the need to spell

838: out calculations involving field operations, logarithms and

839: exponentiation, the greatest and least integer functions (``ceiling''

840: and ``floor''), and so on. The textbook calculations themselves were

841: complex; but then each textbook inference had to be expanded, by hand,

842: to what was often a long sequence of entirely straightforward

843: inferences.

844:

845: Of course, Isabelle does provide some automated support. For example,

846: the simplifier employs a form of ordered rewriting for operations,

847: like addition and multiplication, that are associative and

848: commutative. This puts terms involving these operations into canonical

849: normal forms, thereby making it easy to verify equality of terms that

850: differ up to such rewriting. More complex equalities can similarly be

851: obtained by simplifying with appropriate rewrite rules, such as

852: various forms of distributivity in a ring or identities for logarithms

853: and exponents.

854:

855: Much of the work in the final stages of the proof, however, involved

856: verifying \emph{inequalities} between expressions. Isabelle's linear

857: arithmetic package is complete for reasoning about inequalities

858: between linear expressions in the integers and reals, i.e.~validities

859: that depend only on the linear fragment of these theories. But,

860: many of the calculations went just beyond that,

861: at which point we were stuck manipulating expressions by hand and

862: applying low-level inferences.

863:

864: As a simple example, part of one of the long proofs in

865: PrimeNumberTheorem required verifying that

866: \[

867: (1 + \frac{\varepsilon}{3(C + 3)}) \cdot n < K x

868: \]

869: using the following hypotheses:

870: \[

871: \begin{split}

872: & n \leq (K / 2) x \\

873: & 0 < C \\

874: & 0 < \varepsilon < 1

875: \end{split}

876: \]

877: The conclusion is easily obtained by noting that $1 +

878: \frac{\varepsilon}{3(C + 3)}$ is strictly less than $2$, and so the

879: product with $n$ is strictly less than $2 (K / 2) x = K x$.

880: But spelling out the details requires, for one thing, invoking the

881: relevant monotonicity rules for addition, multiplication, and

882: division. The last two, in turn, require verifying that the relevant

883: terms are positive. Furthermore, getting the calculation to go through

884: can require explicitly specifying terms like $2 (K / 2) x$ (which can

885: be simplified to $K x$), or, in other contexts, using rules like

886: associativity or commutativity to manipulate terms into the the forms

887: required by the rules.

888:

889: The file PrimeNumberTheorem consists of a litany of such calculations.

890: This required us to have names like ``mult-left-mono''

891: ``add-pos-nonneg,'' ``order-le-less-trans,'' ``exp-less-cancel-iff,''

892: ``pos-divide-le-eq'' at our fingertips, or to search for them when

893: they were needed.  Furthermore, sign calculations had a way of coming

894: back to haunt us.  For example, verifying an inequality like $1 / (1 +

895: st) < 1 / (1 + su)$ might require showing that the denominators are

896: positive, which, in turns, might require verifying that $s$, $t$, and

897: $u$ are nonnegative; but then showing $s t > s u$ may again require

898: verifying that $s$ is positive. Since $s$ can be carried along in a

899: chain of inequalities, such queries for sign information can keep

900: coming back. Isar made it easy to break out such facts, name them, and

901: reuse them as needed. But since we were usually working in a context

902: where obtaining the sign information was entirely straightforward,

903: these concerns always felt like an annoying distraction from the

904: interesting and truly difficult parts of the calculations.

905:

906: In short, inferences like the ones we have just described are commonly

907: treated as ``obvious'' in ordinary mathematical texts, and it would be

908: nice if mechanized proof assistants could recognize them as such.

909: Decision procedures that are stronger than linear arithmetic are

910: available; for example, a proof-producing decision procedure for

911: real-closed fields has recently been implemented in HOL-light

912: \cite{mclaughlin:harrison:05}. But for calculations like the one

913: above, computing sequences of partial derivatives, as decision

914: procedures for the real closed fields are required to do, is arguably

915: unnecessary and inefficient. Furthermore, decision procedures for real

916: closed fields cannot be extended, say, to handle exponentiation and

917: logarithms; and adding a generic monotone function, or trigonometric

918: functions, or the floor function, renders the full theory undecidable.

919:

920: Thus, in contexts similar to ours, we expect that principled heuristic

921: procedures will be most effective. Roughly, one simply needs to chain

922: backwards through the obvious rules in a sensible way. There are

923: stumbling blocks, however. For one thing, excessive case splits can

924: lead to exponential blowup; e.g.~one can show $s t > 0$ by

925: showing that $s$ and $t$ are either both strictly positive or strictly

926: negative. Other inferences are similarly nondeterministic: one can

927: show $r + s + t > 0$ by showing that two of the terms are nonnegative

928: and the third is strictly positive, and one can show $r + s < t + u +

929: v + w$, say, by showing $r < u$, $s \leq t + v$, and $0 \leq w$.

930:

931: As far as case splits are concerned, we suspect that they are rarely

932: needed to establish ``obvious'' facts; for example, in

933: straightforward calculations, the necessary sign information is

934: typically available. As far as the second

935: sort of nondeterminism is concerned, notice that the procedures for

936: linear arithmetic are effective in drawing the requisite conclusions

937: from available hypotheses; this is a reflection that of the fact that

938: the theory of the real numbers with addition (and, say, multiplication

939: by rational constants) is decidable.

940:

941: The analogous theory of the reals with multiplication is also

942: decidable. To see this, observe that the structure consisting of the

943: strictly positive real numbers with multiplication is isomorphic to

944: the structure of the real numbers with addition, and so the usual

945: procedures for linear arithmetic carry over. More generally, by

946: introducing case splits on the signs of the basic terms, one can

947: reduce the multiplicative fragment of the reals to the previous case.

948:

949: In short, when the signs of the relevant terms are known, there are

950: straightforward and effective methods of deriving inequalities in the

951: additive and multiplicative fragments. This suggests that what is

952: really needed is a principled method of amalgamating such ``local''

953: procedures, together with, say, procedures that make use of

954: monotonicity and sign properties of logarithms and exponentiation. The

955: well-known Nelson-Oppen procedure provides a method of amalgamating

956: decision procedures for disjoint theories that share only the equality

957: symbol in their common language; but these methods fail for theories

958: that share an inequality symbol when one adds, say, rational constants

959: to the language, which is necessary to render such combinations

960: nontrivial. We believe that there are principled ways, however, of

961: extending the Nelson-Oppen framework to obtain useful heuristic

962: procedures. This possibility is explored in Avigad and Friedman

963: \cite{avigad:friedman:unp}.

964:

965: \subsection{Casting between domains}

966: \label{casting:subsection}

967:

968: In our formalization, we found that the most natural way to establish

969: basic properties of the functions $\theta$, $\psi$, and $\pi$, as well

970: as Chebyshev's theorems, was to treat them as functions from the

971: natural numbers to the reals, rather them as functions from the reals

972: to the reals. Either way, however, it is clear that the relevant

973: proofs have to use the embedding of the natural numbers into the reals

974: in an essential way. Since the $\mu$ function takes positive and

975: negative values, we were also forced to deal with integers as soon

976: as $\mu$ came into play. In short, our proof of the prime number

977: theorem inevitably involved combining reasoning about the natural

978: numbers, integers, and real numbers effectively; and this, in turn,

979: involved frequent casting between the various domains.

980:

981: We tended to address such needs as they arose, in an ad-hoc way.  For

982: example, the version of the fundamental theorem of arithmetic that we

983: inherited from prior Isabelle distributions asserts that every

984: positive natural number can be written uniquely as the product of an

985: increasing list of primes. Developing properties of the radical

986: function required being able to express the unique factorization

987: theorem in the more natural form that every positive number is the

988: product of the primes that divide it, raised to the appropriate

989: multiplicity; i.e.~the fact that for every $n > 0$,

990: \[

991: n = \prod_{p | n} p^{\fn{mult}_p(n)},

992: \]

993: where $\fn{mult}_p(n)$ denotes the multiplicity of $p$ in $n$. We also

994: needed, at our disposal, things like the fact that $n$ divides $m$ if

995: and only if for every prime number $p$, the multiplicity of $p$ in $n$

996: is less than or equal to the multiplicity of $p$ in $m$. Thus, early

997: on, we faced the dual tasks of translating the unique factorization

998: theorem from a statement about positive natural numbers to positive

999: integers, and developing a good theory of multiplicity in that

1000: setting. Later, when proving Chebyshev's theorems, we found that we

1001: needed to recast some of the facts about multiplicity to statements

1002: about natural numbers.

1003:

1004: We faced similar headaches when we began serious calculations

1005: involving natural numbers and the reals. In particular, as we

1006: proceeded we were forced to develop a substantial theory of the floor

1007: and ceiling functions, including a theory of their behavior vis-a-vis

1008: the various field operations. In calculations, expressions sometimes

1009: involved objects of all three types, and we often had to explicitly

1010: transport operations in or out of casts in order to apply a relevant

1011: lemma.

1012:

1013: When one extends a domain like the natural numbers to the integers, or

1014: the integers to the real numbers, some operations are simply extended.

1015: For example, properties of addition and multiplication of natural

1016: numbers carry all the way through to the reals. On the other hand, one

1017: has new operations, like subtraction on the integers and division in

1018: the real numbers, that are mirrored imperfectly in the smaller

1019: domains. For example, subtraction on the integers extends truncated

1020: subtraction $x \tsub y$ on the natural numbers only when $x \geq y$,

1021: and division in the reals extends the function $x \mathop{\fn{div}} y$

1022: on the integers or natural numbers only when $y$ divides $x$. Finally,

1023: there are facts that depend on the choice of a left inverse to the

1024: embedding: for example, if $n$ is an integer, $x$ is a real number,

1025: $\fn{real}$ is the embedding of the integers into the reals, and

1026: $\lfloor \cdot \rfloor$ denotes the floor function from the reals to

1027: the integers, we have

1028: \[

1029: (n \leq \lfloor x \rfloor) \equiv (\fn{real}(n) \leq x).

1030: \]

1031: This is an example of what mathematicians call a

1032: Galois correspondence, and category theorists call an adjunction,

1033: between the integers and the real numbers with the ordering relation.

1034:

1035: Our formalization of the prime number theorem involved a good deal of

1036: manipulation of expressions, by hand, using the three types of facts

1037: just described. Many of these inferences should be handled

1038: automatically. After all, such issues are transparent in mathematical

1039: texts; we carry out the necessary inferences smoothly and

1040: unconsciously whenever we read an ordinary proof. The guiding

1041: principle should be that anything that is transparent to us can be

1042: made transparent to a mechanized proof assistant: we simply need to

1043: reflect on \emph{why} we are effectively able to combine domains in

1044: ordinary mathematical reasoning, and codify that knowledge

1045: appropriately.

1046:

1047: \subsection{Combinatorial reasoning with sums}

1048:

1049: As described in Section~\ref{formalization:subsection}, formalizing

1050: the prime number theorem involved a good deal of combinatorial

1051: reasoning with sums and products. Thus, we had to develop some basic

1052: theorems to support such reasoning, many of which have since been

1053: moved into Isabelle's HOL library. These include, for example,

1054:

1055: \medskip

1056:

1057: \begin{isabellebody}

1058: \isamarkupfalse%

1059: \isacommand{lemma}\ setsum{\isacharunderscore}cartesian{\isacharunderscore}product{\isacharcolon}\ \isanewline

1060: \ \ \ {\isachardoublequote}{\isacharparenleft}{\isasymSum}x{\isasymin}A{\isachardot}\ {\isacharparenleft}{\isasymSum}y{\isasymin}B{\isachardot}\ f\ x\ y{\isacharparenright}{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}{\isasymSum}z{\isasymin}A\ {\isacharless}{\isacharasterisk}{\isachargreater}\ B{\isachardot}\ f\ {\isacharparenleft}fst\ z{\isacharparenright}\ {\isacharparenleft}snd\ z{\isacharparenright}{\isacharparenright}{\isachardoublequote}

1061: \end{isabellebody}

1062:

1063: \medskip

1064:

1065: \noindent which allows one to view a double summation as a sum over a

1066: cartesian product. A more interesting example is

1067:

1068: \medskip

1069:

1070: \begin{isabellebody}

1071: \isamarkupfalse%

1072: \isacommand{lemma}\ setsum{\isacharunderscore}reindex{\isacharcolon}\isanewline

1073: \ \ \ \ \ {\isachardoublequote}inj{\isacharunderscore}on\ f\ B\ {\isasymLongrightarrow}\ {\isacharparenleft}{\isasymSum}x{\isasymin}f{\isacharbackquote}B{\isachardot}\ h\ x{\isacharparenright}\ {\isacharequal}\ {\isacharparenleft}{\isasymSum}x{\isasymin}B{\isachardot}\ {\isacharparenleft}h\ {\isasymcirc}\ f{\isacharparenright}{\isacharparenleft}x{\isacharparenright}{\isacharparenright}{\isachardoublequote}

1074: \end{isabellebody}

1075:

1076: \medskip

1077:

1078: \noindent which expresses that if $f$ is an injective function on a

1079: set $B$, then summing $h$ over the image of $B$ under $f$ is

1080: the same as summing $h \circ f$ over $B$. In particular, if $f$ is a

1081: bijection from $B$ to $A$, the second identity implies that summing

1082: $h$ over $A$ is the same as summing $h \circ f$ over $B$. This type of

1083: ``reindexing'' is often so transparent in mathematical arguments that

1084: when we first came across an instance where we needed it (long ago,

1085: when proving quadratic reciprocity), it took some thought to identify

1086: the relevant principle. It is needed, for example, to show

1087: \[

1088: \sum_{d | n} h(n) = \sum_{d | n} h(n / d),

1089: \]

1090: using the fact that $f(d) = n / d$ is a bijection from the set of

1091: divisors of $n$ to itself; or, for example, to show

1092: \[

1093: \sum_{d d' = c} h(d, d') = \sum_{d | c} h(d, c/d),

1094: \]

1095: using the fact that $f(d) = \la d, c /d \ra$ is a bijection from the

1096: set of divisors of $c$ to $\{ \la d, d' \ra \st d d' = c \}$. The

1097: reindexing lemma is a discrete analogue of integration by

1098: substitution, so it is likely that methods developed to

1099: support such inferences will be more generally useful.

1100:

1101: In Isabelle, if $\sigma$ is any type, then $\fn{\sigma} \; set$ denotes

1102: the type of all subsets of $\sigma$. The predicate ``finite'' is

1103: defined inductively for these subset types.  Isabelle's summation

1104: operator takes a subset $A$ of $\sigma$ and a function $f$ from

1105: $\sigma$ to any type with an appropriate notion of addition, and

1106: returns $\sum_{x \in A} f(x)$. This summation operator really only

1107: makes sense when $A$ is a finite subset, so many identities have to be

1108: restricted accordingly. (An alternative would be to define a type of

1109: finite subsets of $\sigma$, with appropriate closure operations; but

1110: then work would be required to translate properties of arbitrary

1111: subsets to properties of finite subsets, or to mediate relationships

1112: between finite subsets and arbitrary subsets.) This has the net effect

1113: that applying an identity involving a sum or product often requires

1114: one to verify that the relevant sets are finite. This difficulty is

1115: ameliorated by defining $\sum_{x \in A} f(x)$ to be $0$ when $A$ is

1116: infinite, since it then turns out that a number of identities hold in

1117: the unrestricted form. But this fix is not universal, and so

1118: finiteness issues tend to pop up repeatedly when one carries out a

1119: long calculation.

1120:

1121: In short, at present, carrying out combinatorial calculations often

1122: requires a number of straightforward verifications involving

1123: reindexing and finiteness. Once again, these are inferences that are

1124: nearly transparent in ordinary mathematical texts, and so, by our

1125: general principle, we should expect mechanized proof assistants to

1126: take care of them. As before, there are stumbling blocks; for example,

1127: when reindexing is needed, the appropriate injection $f$ has to be

1128: pulled from the air. We expect, however, that in the types of

1129: inferences that are commonly viewed as obvious, there are natural

1130: candidates for $f$. So this is yet another domain where reflection and

1131: empirical work should allow us to make proof assistants more usable.

1132:

1133: \subsection{Devising elementary proofs}

1134:

1135: Anyone who has undertaken serious work in formal mathematical

1136: verification has faced the task of adapting an ordinary mathematical

1137: proof so that it can be carried out using the libraries and resources

1138: available. When a proof uses mathematical ``machinery'' that is

1139: unavailable, one is faced with the choice of expanding the background

1140: libraries to the point where one can take the original proof at face

1141: value, or finding workarounds, say, by replacing the original

1142: arguments with ones that are more elementary. The need to rewrite

1143: proofs in such a way can be frustrating, but the task can also be

1144: oddly enjoyable: it poses interesting puzzles, and enables one to

1145: better understand the relationship of the advanced mathematical

1146: methods to the elementary substitutes. As more powerful mathematical

1147: libraries are developed, the need for elementary workarounds will

1148: gradually fade, and with it, alas, one good reason for investing time

1149: in such exercises.

1150:

1151: Our decision to use Selberg's proof rather than a complex-analytic one

1152: is an instance of this phenomenon. To this day, we do not have a sense

1153: of how long it would have taken to build up a complex-analysis library

1154: sufficient to formalize one of the more common proofs of the prime

1155: number theorem, nor how much easier a formal verification of the prime

1156: number theorem would have been in the presence of such a library.

1157:

1158: But similar issues arose even with respect to the mild uses of

1159: analysis required by the Selberg proof. Isabelle's real library gave

1160: us a good theory of limits, series, derivatives, and the basic

1161: transcendental functions, but it had almost no theory of integration

1162: to speak of. Rather than develop such a theory, we found that we were

1163: able to work around the mild uses of integration needed in the Selberg

1164: proof.\footnote{Since the project began, Sebastian Skalberg managed to

1165:   import the more extensive analysis library from the HOL theorem

1166:   prover to Isabelle. By the time that happened though, we had already

1167:   worked around most of the applications of analysis needed for the

1168:   proof.} Often, we also had to search for quick patches to other gaps

1169: in the underlying library. For the reader's edification and

1170: entertainment, we describe a few such workarounds here.

1171:

1172: Recall that one of the fundamental identities we needed asserts

1173: \[

1174: \ln(1 + 1/n) = 1/n + O(1/n^2).

1175: \]

1176: This follows from the fact that $\ln(1 + x)$ is well approximated by

1177: $x$ when $x$ is small, which, in turn, can be seen from the

1178: Maclaurin series for $\ln(1 + x)$, or even the fact that the

1179: derivative of $\ln(1 + x)$ is equal to $1$ at $0$. But these were

1180: among the few elementary properties of transcendental functions that

1181: were missing from the real library. How could we work around this?

1182:

1183: To be more specific: Fleuriot's real library defined $e^x$ by the

1184: power series $e^x = \sum_{n = 0}^{\infty} x^n / n!$, and showed that

1185: $e^x$ is strictly increasing, $e^0 = 1$, $e^{x + y} = e^x e^y$ for

1186: every $x$ and $y$, and the range of $e^x$ is exactly the set of

1187: positive reals. The library then defines $\ln$ to be a left inverse to

1188: $e^x$. The puzzle was to use these facts to show that $| \ln(1 + x) -

1189: x| \leq x^2$ when $x$ is positive and small enough.

1190:

1191: Here is the solution we hit upon. First, note that when $x \geq 0$,

1192: $e^x \geq 1 + x$, and so, $x \geq \ln(1 + x)$. Replacing $x$ by $x^2$,

1193: we also have

1194: \begin{equation}

1195: \label{e:eq:one}

1196: e^{x^2} \geq 1 + x^2.

1197: \end{equation}

1198: On the other hand, the definition of $e^x$ can be used to show

1199: \begin{equation}

1200: \label{e:eq:two}

1201: e^x \leq 1 + x + x^2

1202: \end{equation}

1203: when $0 \leq x \leq 1/2$. From (\ref{e:eq:one}) and (\ref{e:eq:two})

1204: we have

1205: \[

1206: \begin{split}

1207: e^{x - x^2} & = e^x / e^{x^2} \\

1208: & \leq (1 + x + x^2) / (1 + x^2) \\

1209: & \leq 1 + x,

1210: \end{split}

1211: \]

1212: where the last inequality is easily obtained by multiplying

1213: through. Taking logarithms of both sides, we have

1214: \[

1215: x - x^2 \leq \ln(1 + x) \leq x

1216: \]

1217: when $0 \leq x \leq 1/2$, as required. In fact, a similar calculation

1218: yields bounds on $\ln(1 + x)$ when $x$ is negative and close to

1219: $0$. This can be used to show that the derivative of $\ln x$ is $1 /

1220: x$; the details are left to the reader.

1221:

1222: For another example, consider the problem of showing that $\sum_{n =

1223:   1}^\infty 1 / n^2$ converges. This follows immediately from the

1224: integral test: $\sum_{n = 1}^\infty 1 / n^2 \leq \int_1^\infty 1 / x^2

1225: = 1$. How can it be obtained otherwise? Answer: simply write

1226: \begin{eqnarray*}

1227: \sum_{n = 1}^M 1 / n^2 & \leq & 1 + \sum_{n = 2}^M 1 / n (n - 1) \\

1228: & = & 1 + \sum_{n = 2}^M (1 / (n -1 ) - 1 / n) \\

1229: & = & 1 + 1 - 1/M \\

1230: & \leq & 2,

1231: \end{eqnarray*}

1232: where the second equality relies on the fact that the preceding

1233: expression involves a telescoping sum. Having to stop frequently to

1234: work out puzzles like these helped us appreciate the immense power of

1235: the Newton-Leibniz calculus, which provides uniform and mechanical

1236: methods for solving such problems. The reader may wish to consider

1237: what can be done to show that the sum $\sum_{n = 1}^\infty 1 / x^a$ is

1238: convergent for general values of $a > 1$, or even for the special case

1239: $a = 3 / 2$. Fortunately, we did not need these facts.

1240:

1241: Now consider the identity

1242: \[

1243: \sum_{n \leq x} 1 / n = \ln x +

1244: O(1).

1245: \]

1246: To obtain this, note that when $x$ is positive integer we can write

1247: $\ln x$ as a telescoping sum,

1248: \[

1249: \begin{split}

1250: \ln x & = \sum_{n \leq x - 1} (\ln (n + 1) - \ln n) \\

1251: & = \sum_{n \leq x - 1} \ln(1 + 1 / n) \\

1252: & = \sum_{n \leq x - 1} 1 / n + O(\sum_{n \leq x} 1 / n^2) \\

1253: & = \sum_{n \leq x} 1 / n + O(1).

1254: \end{split}

1255: \]

1256: We learned this trick from Cornaros and Dimitracopoulos

1257: \cite{cornaros:dimitracopoulos:94}. In fact, a slight refinement of

1258: the argument shows

1259: \[

1260: \sum_{n \leq x} 1 /n = \ln x + C + O(1/x)

1261: \]

1262: for some constant, $C$. This constant is commonly known as Euler's

1263: constant, denoted by $\gamma$.

1264:

1265: One last puzzle: how can one show that $\ln x / x^a$ approaches $0$,

1266: for any $a > 0$? Here is our solution. First, note that we have $\ln x

1267: \leq \ln(1 + x) \leq x$ for every positive $x$. Thus we have

1268: \[

1269: a \ln x = \ln x^a \leq x^a,

1270: \]

1271: for every positive $x$ and $a$. Replacing $a$ by $a / 2$ and dividing

1272: both sides by $a x^a / 2$, we obtain $\ln x / x^a \leq 2 /

1273: (ax^{a/2})$. It is then easy to show that the right-hand-side

1274: approaches $0$ as $x$ approaches infinity.

1275:

1276: \section{Conclusions}

1277: \label{conclusions:section}

1278:

1279: Our efforts show that formal verification of significant mathematical

1280: theorems is possible, although more work is needed before the practice

1281: is likely to become widespread. In an ideal situation, it would be

1282: possible to enter mathematical text almost exactly as it appears in a

1283: careful and precise informal presentation, and interactive proof

1284: systems would be able to verify inferences at that level. Our

1285: formalization of the prime number theorem provides a case study that

1286: clarifies some of the ways in which the current technology

1287: falls short of the ideal.

1288:

1289: The formal statements of theorems in

1290: Section~\ref{formalization:subsection} are notably less attractice

1291: than their informal counterparts in Section~\ref{selberg:subsection}.

1292: The difference is not merely cosmetic; notation is an integral part of

1293: mathematics, and it is unreasonable to expect the mathematical

1294: community to make notational sacrafices for mechanical convenience.

1295: Integrating formal verification into mathematical practice will

1296: therefore require us to take ordinary mathematical notation extremely

1297: seriously.

1298:

1299: The biggest obstacle at present is the gap between those inferences

1300: that ordinary mathematicians recognize as obvious, and those that can

1301: be verified automatically by conventional proof assistants. We have

1302: suggested one strategy for improvement, namely, to reflect on the

1303: capacities that enable us, in specific domains, to verify textbook

1304: inferences, and then to formalize that understanding. In particular,

1305: it seems that fairly straightforward support for reasoning about

1306: inequalities between real numbers and casts between integers and real

1307: numbers would have simplified our task substantially.

1308:

1309: Progress in formal verification will require a broad but focused

1310: philosophical reflection on ordinary mathematical practice, together

1311: with robust formal characterizations of that practice and sound

1312: engineering. As such, the field represents an auspicious combination

1313: of theory and practice.

1314:

1315: \bigskip

1316:

1317: \noindent {\bf Acknowledgements.} We are especially grateful to Tobias

1318: Nipkow and Larry Paulson for continued support on this project. We are

1319: also grateful to them, and to Freek Wiedijk and two anonymous

1320: referees, for comments on this paper.

1321:

1322: \bibliographystyle{plain}

1323: \bibliography{proofthry,numbertheory,automated}

1324:

1325:

1326: \end{document}

1327:

1328: %%% Local Variables:

1329: %%% mode: latex

1330: %%% TeX-master: t

1331: %%% End:

1332: