0805:0805.0759/ms.tex

1: \documentclass[12pt,preprint]{aastex}

2: % notes:

3: % ------

4: % - minimal capitalization in section headings.

5: % - ``uncertainty'' not ``error'' except when it really is an error.

6: % - units get macros and go inside math environments

7:

8: \usepackage{natbib}

9: \usepackage{ifthen}

10: \usepackage{amssymb,amsmath}

11: % \usepackage{subfig}

12: \usepackage{subfigure}

13: \newcounter{address}

14: \newcommand{\latin}[1]{\textit{#1}}

15: \newcommand{\ie}{\latin{ie}}

16: \newcommand{\eg}{\latin{eg}}

17: \newcommand{\cf}{\latin{cf}}

18: \newcommand{\etc}{\latin{etc}}

19: \newcommand{\etal}{\latin{et~al}}

20: \newlength{\threewidth}

21: \setlength{\threewidth}{0.333\textwidth}

22: \newlength{\threewidthshort}

23: \setlength{\threewidthshort}{0.300\textwidth}

24: \newlength{\twowidth}

25: \setlength{\twowidth}{0.499\textwidth}

26: \newlength{\twowidthshort}

27: \setlength{\twowidthshort}{0.45\textwidth}

28: \newlength{\twothreewidth}

29: \setlength{\twothreewidth}{0.666\textwidth}

30: \newlength{\onewidth}

31: \setlength{\onewidth}{1.0\textwidth}

32: \newcommand{\Tycho}{Tycho-2}

33: \newcommand{\USNOB}{USNO-B Catalog}

34: \newcommand{\an}{\textsl{Astrometry.net}}

35: \newcommand{\numtests}{47}

36: \newcommand{\numcleantests}{27}

37: \newcommand{\numnoisytests}{20}

38: \newcommand{\smallspace}{\hspace{5 mm}}

39: \newcommand{\norm}[1]{\left|\left|#1\right|\right|}

40: \newcommand{\linespace}{\medskip \\}

41: \newcommand{\bd}{\textsl{Blind~Date}}

42: \newcommand{\dchiSq}{\frac{\mathrm{d}\chi^2}{\mathrm{d}t}}

43: \newcommand{\ddchiSq}{\frac{\mathrm{d}^2\chi^2}{\mathrm{d}t^2}}

44: \newcommand{\diag}{\mathrm{diag}}

45: \newcommand{\cutoff}{8}

46:

47: \newcommand{\examplecaptionA}{Mar-1914 / Nov-1917}

48: \newcommand{\examplecaptionB}{Mar-1947 / Nov-1945}

49: \newcommand{\examplecaptionC}{Feb-1949 / May-1951}

50: \newcommand{\examplecaptionD}{Feb-1914 / Jun-1910}

51: \newcommand{\examplecaptionE}{Mar-1914 / Nov-1916}

52: \newcommand{\examplecaptionF}{Jan-1975 / Jan-1800}

53:

54: \newcommand{\imagex}[1]{x_{#1}}

55: \newcommand{\imagey}[1]{y_{#1}}

56: \newcommand{\nextimagex}[1]{{\imagex{#1}}'}

57: \newcommand{\nextimagey}[1]{{\imagey{#1}}'}

58: \newcommand{\imagesigmax}[1]{\sigma_{x{#1}}}

59: \newcommand{\imagesigmay}[1]{\sigma_{y{#1}}}

60:

61: \newcommand{\catalogx}[1]{u_{#1}}

62: \newcommand{\catalogy}[1]{v_{#1}}

63: \newcommand{\catalogsigmax}[1]{ \sigma_{u{#1}} }

64: \newcommand{\catalogsigmay}[1]{ \sigma_{v{#1}} }

65:

66: \newcommand{\catalogra}[1]{\ra_{#1}}

67: \newcommand{\catalogdec}[1]{\dec_{#1}}

68: \newcommand{\catalogsigmara}[1]{\sigma_{\ra{#1}}}

69: \newcommand{\catalogsigmadec}[1]{\sigma_{\dec {#1}}}

70:

71: \newcommand{\catalogmura}[1]{\hat{\mu}_{\ra {#1}}}

72: \newcommand{\catalogmudec}[1]{\mu_{\dec {#1}}}

73: \newcommand{\catalogsigmamura}[1]{\hat{\sigma}_{\mu \ra {#1}}}

74: \newcommand{\catalogsigmamudec}[1]{\sigma_{\mu \dec {#1}}}

75:

76: \newcommand{\catalogmuraRAW}[1]{\mu_{\ra {#1}}}

77: \newcommand{\catalogsigmamuraRAW}[1]{\sigma_{\mu \ra {#1}}}

78: \newcommand{\tearlysym}{t_{\mathrm{early}}}

79: \newcommand{\tlatesym}{t_{\mathrm{late}}}

80: \newcommand{\tearly}{1955.0}

81: \newcommand{\tlate}{1990.0}

82: \newcommand{\epoch}{2000.0}

83:

84: \newcommand{\pairdE}{d}

85: \newcommand{\paird}[1]{\pairdE_{#1}}

86: \newcommand{\nextpaird}[1]{{\paird{#1}}'}

87: \newcommand{\pairsigmaE}{\bar{\sigma}}

88: \newcommand{\pairsigma}[1]{\pairsigmaE_{#1}}

89: \newcommand{\pairweight}[1]{w_{#1}}

90:

91: \newcommand{\imageresolution}{\theta_\mathrm{pix}}

92: \newcommand{\xytord}{WCS_{xy \rightarrow \mathrm{RD}}}

93: \newcommand{\rdtoxy}{WCS_{\mathrm{RD} \rightarrow xy}}

94: \newcommand{\ra}{\mathrm{RA}}

95: \newcommand{\dec}{\mathrm{Dec}}

96: \newcommand{\unitf}[1]{\mathrm{#1}}

97: \renewcommand{\arcsec}{\unitf{arcsec}}

98: \newcommand{\pix}{\unitf{pix}}

99: \newcommand{\yr}{\unitf{yr}}

100:

101: \begin{document}

102: \title{

103:   \textsl{Blind Date:}\

104:   Using proper motions to determine the ages of historical images

105: }

106:

107: \author{

108:   Jonathan~T.~Barron\altaffilmark{\ref{Toronto},\ref{NYUCS}},

109:   David~W.~Hogg\altaffilmark{\ref{NYUCCPP},\ref{email}},

110:   Dustin~Lang\altaffilmark{\ref{Toronto}},

111:   Sam~Roweis\altaffilmark{\ref{Toronto},\ref{Google}}

112: }

113:

114: \setcounter{address}{1}

115: \altaffiltext{\theaddress}{\stepcounter{address}\label{Toronto}

116: Department of Computer Science, University of Toronto, 6 King's

117: College Road, Toronto, Ontario, M5S~3G4 Canada}

118: \altaffiltext{\theaddress}{\stepcounter{address}\label{NYUCS}

119: Department of Computer Science, The Courant Institute of Mathematical

120: Sciences, New York University, 715 Broadway, New York, NY 10003}

121: \altaffiltext{\theaddress}{\stepcounter{address}\label{NYUCCPP} Center

122: for Cosmology and Particle Physics, Department of Physics, New York

123: University, 4 Washington Place, New York, NY 10003}

124: \altaffiltext{\theaddress}{\stepcounter{address}\label{email} To whom

125: correspondence should be addressed: \texttt{david.hogg@nyu.edu}}

126: \altaffiltext{\theaddress}{\stepcounter{address}\label{Google} Google Inc,

127: Mountain View, CA}

128:

129: \begin{abstract}

130: Astrometric calibration is based on patterns of cataloged stars and

131: therefore effectively assumes a particular epoch, which can be

132: substantially incorrect for historical images. With the known proper

133: motions of stars we can ``run back the clock'' to an approximation of

134: the night sky in any given year, and in principle the year that best

135: fits stellar patterns in any given image is an estimate of the year in

136: which that image was taken. In this paper we use 47 scanned

137: photographic images of M44 spanning years 1910--1975 to demonstrate

138: this technique. We use only the pixel information in the images; we

139: use no prior information or meta-data about image pointing, scale,

140: orientation, or date. \bd\ returns date meta-data for the input

141: images.  It also improves the astrometric calibration of the image

142: because the final astrometric calibration is performed at the

143: appropriate epoch. The accuracy and reliability of \bd\ are functions

144: of image size, pointing, angular resolution, and depth; performance is

145: related to the sum of proper-motion signal-to-noise ratios for catalog

146: stars measured in the input image.  All of the science-quality images

147: and $85$~percent of the low-quality images in our sample of

148: photographic plate images of M44 have their dates reliably determined

149: to within a decade, many to within months.

150: \end{abstract}

151:

152: \keywords{

153:     astrometry ---

154:     catalogs ---

155:     methods:~statistical ---

156:     stars:~kinematics ---

157:     techniques:~image~processing ---

158:     time

159: }

160:

161: \section{Introduction}

162:

163: Astronomy needs well-calibrated data to make precise measurements, but also

164: wants to make use of large data sources that are poorly calibrated. Unreliable

165: data sets such as historical archives, amateurs' collections, and engineering

166: data contain important information, especially in the time domain. Astronomy

167: needs methods by which data of unknown provenance, quality, and calibration

168: can be vetted, calibrated, and made reliably useable by the community.

169:

170: There is an enormous amount of information about proper motions,

171: stellar and AGN variability, transients, and Solar System bodies in

172: historical plate archives and the collections of good amateur

173: astronomers. The Harvard College Observatory Astronomical Plate

174: Stacks\footnote{http://tdc-www.harvard.edu/plates/} alone contain

175: enough photographic exposures to cover the entire sky 500 times over,

176: and span many decades with good coverage and imaging depth. However,

177: in many cases, it is challenging to use those images

178: quantitatively. Often the details of observing date, telescope

179: pointing, bandpass, and exposure time are lost because the logs have

180: been lost, because the information was written incorrectly or

181: illegibly, or because it is difficult or expensive to associate each

182: image with the appropriate record in the log.

183:

184: The astronomical world is moving towards the development of a Virtual

185: Observatory, in which a heterogeneous set of data providers

186: communicate with researchers and the public through open data-sharing

187: protocols\footnote{http://www.ivoa.net/}.  These protocols can be

188: easily spoofed---intentionally and unintentionally---and permit the

189: dissemination of badly calibrated, erroneous, or untrustworthy

190: data\footnote{http://www.ivoa.net/Documents/Notes/IVOArch/IVOArch-20040615.html}. Indeed,

191: the lack of a ``trust'' model may compromise the VO's goals of making

192: it easy for astronomers to discover and use wide varieties of input

193: data; without trust, all the time that is saved in searching and using

194: has to be spent in verification and tracking of provenance.

195:

196: Amateur astronomers and educators at college/planetarium observatories are, in

197: many cases, well equipped and can take science-quality observations,

198: especially for time-domain science. In addition, most of these astronomers are

199: interested in contributing to research astronomy. However, it is challenging

200: at present for these potential data providers to provide their data to the

201: community in a form that is trivially useable by the research community. These

202: observers need automated systems to calibrate their data and hardware and to

203: package the data and meta-data in standards-compliant forms. Even professional

204: observatories often produce data with incorrect or standards-violating

205: meta-data because of telescope faults, software bugs, or the youth of most

206: standards and conventions for digital data formats.

207:

208: We have begun a large project (\an) to vet, restore, determine, and package in

209: standards-compliant form the calibration meta-data for astronomical images for

210: which such information is lost, damaged, or unreliable \citep{lang08a}. Our

211: system can astrometrically calibrate an image (that is, determine the world

212: coordinate system) using the information in the image pixels alone. It starts

213: by identifying asterisms that determine astrometric calibration. Once the

214: astrometry is correct, the sources in the image can be identified with

215: catalogs and other calibration meta-data can be inferred through quantitative

216: image analysis. In the process of testing and running this calibration system,

217: we have indeed confirmed that many historical and amateur images---and even

218: some scientific images from modern professional facilities---have missing or

219: incorrect astrometric meta-data; automated calibration, as a vetting step at

220: the very least, is essential for all data sources.

221:

222: For the same reason that the time domain is interesting, it can also be used

223: to calibrate the \emph{date} at which an image was taken. Stars are moving and

224: varying, so the particular configuration and relative brightnesses of the

225: stars in an image provide, in principle, a measure of the time at which the

226: image was taken. Here we show that stellar motions cataloged at the present

227: day can be used to age-date plates from a plate archive to within a few years,

228: even for very old plates. This new capability takes the \an\ project a small

229: step closer to being a comprehensive image meta-data vetting and automated

230: calibration system.

231:

232: \section{Input data}

233:

234: The larger goal of our project is to add calibration meta-data to data of

235: unknown provenance. For this reason, the system begins truly ``blind'' in the

236: sense that we ignore all meta-data associated with each input image, and start

237: with only the image pixels themselves. We calibrate the images using the \an\

238: astrometric system; this calibration provides output that is taken as input

239: data for the \bd\ analysis.

240:

241: For each automatically detected star $i$ in the image, there is a centroid

242: $(\imagex{i}, \imagey{i})$ in the input image measured in pixels. For each

243: source, this centroid is the location of the maximum of a second-order

244: polynomial (generalized parabolic) surface fit to the area immediately

245: surrounding the center of the image star. The fit also provides uncertainties

246: $(\imagesigmax{i}, \imagesigmay{i})$ in these centroids. For unsaturated stars

247: these are taken (arbitrarily) to be one pixel, and for saturated stars (which

248: are common in the digitized plate data) these are taken to be one-third of the

249: radius of the saturated region. These are over-estimates, since the stars

250: are detected at good signal-to-noise, but this is

251: conservative; furthermore, in the case of saturated stars, it is possible for

252: the saturated ``disk'' to be non-concentric with the true centroid.

253:

254: The system also provides a rough world coordinate system (WCS) for the

255: image; that is, first guesses at two functions: $\xytord(x,y)$, which

256: transforms points from the image plane in pixels into celestial

257: coordinates in angular units, and $\rdtoxy(\ra,\dec)$, the inverse. A

258: derived quantity from these functions is the pixel scale

259: $\imageresolution$, measured in angle per pixel, which we will use

260: below.  Strictly $\imageresolution$ is a function of position in the

261: image, but in typical science images it does not vary substantially.

262:

263: The WCS effectively identifies the sources from the \USNOB\ \citep{monet03a}

264: that are in or likely to be in the image. For each catalog ``star'' $j$ (the

265: \USNOB\ contains both stars and compact galaxies) inside the image, the

266: catalog contains a J2000 celestial position $(\catalogra{j}, \catalogdec{j})$

267: on the celestial sphere (extrapolated to epoch $\epoch$) measured in angular

268: units, an uncertainty $(\catalogsigmara{j}, \catalogsigmadec{j})$ in that

269: position, a proper motion $(\catalogmuraRAW{j}, \catalogmudec{j})$ measured in

270: angle per time, and an uncertainty

271: $(\catalogsigmamuraRAW{j},\catalogsigmamudec{j})$ in that proper motion.

272:

273: The \USNOB\ uncertainties required some processing and adjustment. For the

274: sake of clarity and simplicity, we make as few assumptions as possible in

275: transforming the uncertainties. Our approach here is not intended to be

276: definitive. Many entries in the catalog have values of zero for the

277: uncertainty of position or proper motion. In the case of zero-valued

278: uncertainty in position, we assume that the uncertainty is below the precision

279: of $0.002\,\arcsec$ at which the catalog was reported. We therefore set all

280: zero-valued position uncertainties to one-half of the precision

281: ($0.001\,\arcsec$). For entries with zero-valued uncertainty in proper motion,

282: there is more we need to consider; a nonzero-valued proper motion paired with

283: a zero-valued uncertainty indicates that the uncertainty is below the

284: precision of the catalog, and therefore should be set to half of the

285: precision. A zero-valued proper motion paired with a zero-valued uncertainty

286: indicates that the proper motion of the entry could not be measured accurately

287: (Dave Monet, private communication). We therefore set the uncertainty in the

288: proper motion for such entries to three times the median value of nonzero

289: proper motion uncertainties. This captures the idea that, generally speaking,

290: we are significantly more uncertain about the proper motion of such entries

291: than we are about most other entries. A more principled approach could certainly be

292: attempted but we found that this worked well enough for our purposes.

293:

294: Much work has already been done by the \an\ team to identify spurious sources

295: in the \USNOB\ that appear to have been created by diffraction spikes and

296: reflection halos \citep{barron08a}. For the purposes of this project, we

297: ignore the entries in the catalog which have been flagged as spurious. Testing

298: has shown that ignoring these sources generally improves the accuracy of our

299: results.

300:

301: Using a technique that will be described in a future paper from the

302: \an\ team, we are able to estimate the bandpass of each image being

303: processed, in that we determine which bandpass of the \USNOB\ most

304: closely predicts the brightness ordering of the stars in the

305: image. Though this technique is still in an experimental stage, its

306: results are not very controversial; all of Harvard's images of M44

307: appear to best match the blue bands ($O$ and $J$ emulsions) of the

308: \USNOB. This finding is reinforced through experimentation with

309: manually setting each image's band: On average, \bd\ performs better

310: on these images using the $J$ emulsion of the Catalog

311: than on any other band.

312:

313: Assuming that we have estimated the bandpass correctly, the $N$ stars in the

314: image should correspond---roughly---to the $N$ brightest catalog stars that

315: lie within the area of the image. For that reason, we use only the $N$

316: brightest catalog stars in what follows.

317:

318: \section{Method}

319:

320: We use the \USNOB\ positions and proper motions to ``wind'' the $N$ catalog

321: stars backwards and forward through time along the celestial sphere. Once the

322: catalog has been adjusted, we can use the image WCS to project the catalog

323: entries onto the image plane, making a synthetic catalog for that image at

324: that time, in image coordinates. We then attempt to fit the image stars to

325: that synthetic image of the moved catalog stars. We choose a freedom with

326: which the image star positions are allowed to warp to fit the catalog star

327: positions, and a scalar objective function that is minimized when the

328: positions are ``best'' warped. We warp the input image to the catalog

329: ``wound'' to different times, and use the best-fit values of the objective

330: function to determine the year at which the image was taken.

331:

332: \subsection{Winding back the catalog}

333:

334: We estimate the celestial coordinates of catalog star $j$ at the arbitrary

335: date $t$, and then project them onto the image plane

336: \begin{eqnarray}\displaystyle

337: 	\left( \catalogx{j}, \catalogy{j} \right) & = &

338: 	\rdtoxy \left(

339: 	\catalogra{j} - \catalogmura{j} [t-(\epoch\,\yr)]

340: 	,

341: 	\catalogdec{j} - \catalogmudec{j} [t-(\epoch\,\yr)]

342: 	\right)

343: 	\quad ,

344: \end{eqnarray}

345: where we have adjusted the $\catalogmuraRAW{j}$ proper motion vector

346: components and their associated uncertainties into coordinate derivatives

347: $\catalogmura{j}$ by

348: \begin{eqnarray}\displaystyle

349: \catalogmura{j} & = & \mathrm{cos}(\catalogdec{j})\,\catalogmuraRAW{j}

350: \nonumber \\

351: \catalogsigmamura{j} & = & \mathrm{cos}(\catalogdec{j})\,\catalogsigmamuraRAW{j}

352: \quad .

353: \end{eqnarray}

354: We estimate the uncertainty of the location of catalog star $j$ at year $t$ on

355: the image plane with

356: \begin{eqnarray}\displaystyle

357: 	\catalogsigmax{j}

358: 	& = &

359: 	\frac{1}{\imageresolution}\,\sqrt{\catalogsigmara{j}^{2} + ( \max\left( | t - \tearlysym |, | t - \tlatesym | \right)\,\catalogsigmamura{j} )^{2}}

360: 	\nonumber \\

361: 	\catalogsigmay{j}

362: 	& = &

363: 	\frac{1}{\imageresolution} \sqrt{\catalogsigmadec{j}^{2} + ( \max\left( | t - \tearlysym |, | t - \tlatesym | \right)\,\catalogsigmamudec{j} )^{2}}

364: \label{eq:uncertainty}

365: \end{eqnarray}

366: where $\tearlysym$ and $\tlatesym$ are the dates at which the \USNOB\ source

367: imagery was taken, which in this patch of the sky are \tearly\ and \tlate,

368: respectively. All positions in the Catalog, however, were extrapolated to the

369: year $\epoch$ (epoch and equinox). This means that though we are given each

370: catalog star's location at the year $\epoch$, we know that each star's

371: measured location is, in general, more accurate between the two epochs in

372: which the images were taken, and less accurate at years further from that

373: range. This means that when ``winding'' locations through time, we look at the

374: difference between $t$ and the year $\epoch$; when ``winding'' uncertainties

375: through time, we look at the maximum distance between $t$ and both

376: $\tearlysym$ and $\tlatesym$. This is equivalent to using

377: $(\tearlysym + \tlatesym)/2$ as our reference year, and assuming a non-zero

378: uncertainty on the measurement of each star's proper motion at that reference

379: year.

380:
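As an illustrative check of equation~(\ref{eq:uncertainty}), using hypothetical numbers that are not taken from the catalog, consider a trial date of $t = 1920.0$ with $\tearlysym = \tearly$ and $\tlatesym = \tlate$. The maximum in the equation can be rewritten as a distance from the midpoint of the two survey dates plus half of their separation,
\begin{equation}
\max\left( | t - \tearlysym |, | t - \tlatesym | \right)
= \left| t - \frac{\tearlysym + \tlatesym}{2} \right| + \frac{\tlatesym - \tearlysym}{2}
= 52.5\,\yr + 17.5\,\yr = 70\,\yr \quad ,
\end{equation}
which is the sense in which the midpoint acts as the reference year. If we assume $\catalogsigmara{j} = 0.1\,\arcsec$, $\catalogsigmamura{j} = 0.004\,\arcsec\,\yr^{-1}$, and $\imageresolution = 1\,\arcsec\,\pix^{-1}$ (all three chosen only for illustration), then
\begin{equation}
\catalogsigmax{j} = \frac{1}{1\,\arcsec\,\pix^{-1}}\,
\sqrt{(0.1\,\arcsec)^2 + (70\,\yr \times 0.004\,\arcsec\,\yr^{-1})^2}
\approx 0.30\,\pix \quad ,
\end{equation}
so for a trial date this early the proper-motion term ($0.28\,\arcsec$) dominates the catalog position term ($0.1\,\arcsec$).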

381: Note that in our notation for $(\catalogx{j}, \catalogy{j})$ and

382: $(\catalogsigmax{j}, \catalogsigmay{j})$, we do not reference $t$. This is

383: because once the catalog has been wound through time and projected onto the

384: image plane, we consider time to be fixed. Note that in Sections

385: \ref{sec:objectiveSection} and \ref{sec:fittingSection}, time will remain

386: fixed, and therefore $t$ is not mentioned in any of the notation except for in

387: $\chi^2(t)$.

388:

389: \subsection{Objective function}

390: \label{sec:objectiveSection}

391:

392: Determination of the image coordinate system and date involves finding

393: parameters---astrometric parameters and the date---that optimize an objective

394: function. The choice of this objective is therefore the fundamental scientific

395: choice in the project.

396:

397: We seek an objective function that has the following properties, listed in

398: rough order of priority: The function must decrease as image-coordinate

399: distances between catalog and image stars decrease. The function must be

400: insensitive to anomalous outliers, and more sensitive to well-measured stars

401: than to poorly-measured stars. The function must be some approximation to a

402: likelihood or have some equivalent justification so that changes in the

403: function with respect to parameters can be interpreted in terms of

404: uncertainties in those parameters. The function ought to be differentiable and

405: second-differentiable with respect to all fit parameters (in particular time

406: and astrometric calibration). The function should be easily optimized. We have

407: identified an objective function that has all of these properties; it is so

408: similar to the least-squares function that we call it a ``modified

409: chi-squared'' and denote it ``$\chi^2$''.

410:

411: For all $i = 1:N$ and $j = 1:N$ we compute $\paird{ij}$, the Euclidean

412: distance between image star $i$ and catalog star $j$ in the image plane:

413: \begin{equation} \paird{ij} = \sqrt{ \left( \catalogx{j} - \imagex{i}

414: \right)^2 + \left( \catalogy{j} - \imagey{i} \right)^2 } \end{equation}

415:

416: We also estimate the uncertainty $\pairsigma{ij}$ of each pair's distance

417: measurement, using the previously defined values for

418: $(\imagesigmax{i}, \imagesigmay{i})$ and

419: $(\catalogsigmax{j}, \catalogsigmay{j})$. We calculate

420: the combined uncertainty for the pair in $x$ and $y$ at time $t$, and then

421: simply take the mean as an estimation of the uncertainty of that pair:

422:

423: \begin{equation}

424: \pairsigma{ij} =

425: \frac{1}{2}\,\left(\sqrt{ \imagesigmax{i}^2 + \catalogsigmax{j}^2}

426:   + \sqrt{ \imagesigmay{i}^2 + \catalogsigmay{j}^2}\right)

427: \end{equation}

428: We define a weighting function for each pair that returns a value between 0

429: and 1 based on the ratio of the distance of a pair to the uncertainty of that

430: pair:

431: \begin{equation}

432: W(\pairdE, \pairsigmaE) = \frac{1}{1+\left(\frac{\pairdE}{\pairsigmaE}\right)^2}

433: \end{equation}

434: (see also Figure~\ref{fig:weightCurve}).

435: This function has the property that outliers are down-weighted in a smooth

436: manner; it causes the influence of an image--catalog pair to smoothly drop to

437: zero at large displacement. This permits us to avoid the

438: discontinuous optimization problem of sigma clipping, which is the standard

439: ``robust estimation'' technique in common use in astronomy applications. The

440: weighting function has a number of properties which make it well-suited to our

441: purpose:

442: \begin{eqnarray}\displaystyle

443:   W(0, \pairsigmaE) & = & 1

444:   \label{eqn:wnearzero} \\

445:   \left.\frac{\mathrm{d}W}{\mathrm{d}\pairdE}(\pairdE, \pairsigmaE)\right|_{\pairdE=0} & = & 0

446:   \label{eqn:dwnearzero} \\

447:   \lim_{(\pairdE/\pairsigmaE)\rightarrow\infty} W(\pairdE, \pairsigmaE)\left(\frac{\pairdE}{\pairsigmaE}\right)^2 & = & 1

448:   \label{eqn:wbig}

449: \end{eqnarray}

450:

451: \begin{figure}[t]

452:

453: 	\centering

454: 	\resizebox{\twowidthshort}{!}{\includegraphics{f1a.eps}}

455: 	\resizebox{\twowidthshort}{!}{\includegraphics{f1b.eps}}

456:

457: 	\caption{On the left, the weighting function versus a pair's ratio of

458: 	distance to uncertainty. On the right, the corresponding weighted

459: 	contribution of a pair to the modified chi-squared. The dotted line

460: 	indicates the point at which we assume that the weighted contribution

461: 	stops changing.

462: 	\label{fig:weightCurve}}

463: \end{figure}

464:

465: We use this weighting function to assign a weight $\pairweight{ij}$ to all

466: pairs as follows:

467: \begin{equation}

468: \pairweight{ij} = W \left(\paird{ij}, \pairsigma{ij} \right)

469: \end{equation}

470: Our final objective function is:

471: \begin{equation}

472: \displaystyle \chi^2(t) =

473: 	\sum_{ij} \pairweight{ij}\,\left(\frac{\paird{ij}}{\pairsigma{ij}}\right)^2

474: \end{equation}

475: where the sum is over all possible image--catalog pairs.

476:

477: For ``good'' (small-separation) image--catalog pairs, the weight function is

478: near unity (equation \ref{eqn:wnearzero}) and has near-zero derivatives

479: (equation \ref{eqn:dwnearzero}), so small changes in separation do not enter

480: strongly into derivatives of the objective function. For ``bad''

481: (large-separation) image--catalog pairs, the pair's contribution to

482: $\chi^2(t)$ is nearly constant (equation \ref{eqn:wbig}). This makes

483: optimization and interpretation of our objective function very like

484: optimization and interpretation of a chi-squared fitting system. This weighted

485: chi-squared objective function could also be interpreted as a Geman--McClure

486: error function. In this framework, optimizing the objective function is

487: equivalent to robust M-estimation \citep{hampel86}.

488:
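To make these limits concrete, write $r = \pairdE/\pairsigmaE$ for the separation of a pair in units of its uncertainty (the symbols $r$ and $c$ are introduced here only for illustration). The contribution of one pair to $\chi^2(t)$ is then
\begin{equation}
c(r) = W(\pairdE, \pairsigmaE)\,r^2 = \frac{r^2}{1 + r^2} = 1 - \frac{1}{1 + r^2}
\quad , \qquad
\frac{\mathrm{d}c}{\mathrm{d}r} = \frac{2\,r}{(1 + r^2)^2} \quad .
\end{equation}
For $r \ll 1$ this is approximately $r^2$, the ordinary chi-squared contribution; at $r = 1$ it is $1/2$, at $r = 3$ it is $9/10$, and for $r \gg 1$ it approaches $1$ while its derivative falls off as $2/r^3$, so distant pairs exert almost no pull on the fit.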

489: In constructing the modified chi-squared, we use---in principle---all

490: image--catalog pairs, irrespective of their separation in image coordinates.

491: The fact that the contribution of a pair to the objective function quickly

492: converges to $1$ as the pair becomes highly separated allows us---in

493: practice---to ignore all highly separated pairs. We therefore choose to

494: approximate the contributions of all pairs where

495: $\pairdE > \cutoff\,\pairsigmaE$ as

496: $W(\cutoff\,\pairsigmaE, \pairsigmaE) \left( \frac{\cutoff\,\pairsigmaE}{\pairsigmaE} \right)^2 $,

497: which is $64/65$. This dramatically speeds up our computation.

498:
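The size of this approximation can be bounded directly. As a short check (with $r = \pairdE/\pairsigmaE$ used here only as shorthand), the exact contribution $r^2/(1+r^2)$ of any pair with $r > \cutoff$ lies between $64/65$ and $1$, so
\begin{equation}
0 < \frac{r^2}{1 + r^2} - \frac{64}{65} < 1 - \frac{64}{65} = \frac{1}{65} \approx 0.015 \quad ,
\end{equation}
that is, each truncated pair is under-counted by less than about $0.015$ in $\chi^2$. Because the substituted value is a constant, a pair contributes nothing to the derivatives of the objective function for as long as it remains beyond the cutoff.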

499: \subsection{Fitting the image}

500: \label{sec:fittingSection}

501:

502: At this point, we have our image stars on the image plane, our catalog stars

503: (wound to the time of interest) projected onto the image plane, and an

504: objective function that we wish to minimize. We need to find the

505: transformation that we can apply to the image that minimizes the objective

506: function, which we take to be the transformation that best ``fits'' the image

507: to the catalog.

508:

509: Testing has suggested that the initial location of the image returned by the

510: \an\ solver is close enough to the optimal location that locally minimizing

511: the objective function is sufficient for finding the global minimum, and that

512: we generally do not run the risk of falling into a false local minimum.

513: Therefore, we only present our method for locally minimizing the objective

514: function through iteratively reweighted least-squares (IRLS). We experimented

515: with techniques such as RANSAC to fit images to the catalog in the face of

516: extreme noise, but no technique was more effective and robust than our IRLS

517: method.

518:

519: Let us first construct a solution to a simplified version of this problem, in

520: which we know which correspondences are true: We assume that image point $i$

521: corresponds to catalog point $i$ for all $i \in \{ 1, 2, ..., N \} $. Assuming

522: that we are interested in solving for an affine transformation (first-order

523: linear transformation plus shift), this means that we need to find the

524: transformation matrix that best satisfies the following equations:

525: \begin{equation}

526: 	\underset{i \in \{ 1, 2, ..., N \} }{\forall} \,

527: 	\left[\begin{array}{ccc}

528: 	m_x & m_y & t_x \\

529: 	n_x & n_y & t_y

530: 	\end{array} \right]

531: 	\left[\begin{array}{c}

532: 	\imagex{i} \\

533: 	\imagey{i} \\

534: 	1

535: 	\end{array} \right]

536: 	=

537: 	\left[\begin{array}{c}

538: 	\catalogx{i} \\

539: 	\catalogy{i}

540: 	\end{array} \right]

541: \end{equation}

542:

543: This can be generalized straightforwardly to higher-order transformations.

544:

545: The transformation that best satisfies these equations can be found using a

546: standard least-squares solver. We can then use this transformation to warp all

547: of the image points onto the catalog points (and vice-versa), thus solving our

548: simplified problem.

549:
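Written out by components, as a sketch of what the least-squares solver is given in this simplified problem, each assumed correspondence contributes two scalar equations that are linear in the six unknowns:
\begin{eqnarray}\displaystyle
\catalogx{i} & = & m_x\,\imagex{i} + m_y\,\imagey{i} + t_x
\nonumber \\
\catalogy{i} & = & n_x\,\imagex{i} + n_y\,\imagey{i} + t_y
\quad .
\end{eqnarray}
With $N$ correspondences there are $2N$ equations, so three non-collinear stars determine the affine transformation exactly and any $N > 3$ makes the system overdetermined, which is why a least-squares solution is used.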

550: Of course, since we do not know which image stars correspond to which catalog

551: stars, we must include equations for all image--catalog pairs. We are not

552: interested in the solution to this problem, as it would describe a

553: transformation from \emph{every} image star to \emph{every} catalog star. To

554: specify a transformation that satisfies \emph{likely} image--catalog

555: correspondences, we must use our weighting function to make soft assignments

556: regarding correspondences. We therefore use the following equations:

557: \begin{equation}

558: \underset{i \in \{1, 2, ..., N \} }{\forall} \,\,\,

559: \underset{j \in \{1, 2, ..., N \} }{\forall} \,

560: \left[\begin{array}{cc}

561: \frac{\sqrt{\pairweight{ij}}}{\pairsigma{ij}} & 0 \\

562: 0 & \frac{\sqrt{\pairweight{ij}}}{\pairsigma{ij}}

563: \end{array} \right]

564: 	\left[\begin{array}{ccc}

565: 	m_x & m_y & t_x \\

566: 	n_x & n_y & t_y

567: 	\end{array} \right]

568: 	\left[\begin{array}{c}

569: 	\imagex{i} \\

570: 	\imagey{i} \\

571: 	1

572: 	\end{array} \right]

573: 	=

574: 	\left[\begin{array}{cc}

575: 	\frac{\sqrt{\pairweight{ij}}}{\pairsigma{ij}} & 0 \\

576: 	0 & \frac{\sqrt{\pairweight{ij}}}{\pairsigma{ij}}

577: 	\end{array} \right]

578: 	\left[\begin{array}{c}

579: 	\catalogx{j} \\

580: 	\catalogy{j}

581: 	\end{array} \right]

582: \end{equation}

583:

584: We begin our solution by constructing a linear system which contains all of

585: the previously described (unweighted) equations.

586: \begin{equation}

587: 	\left[\begin{array}{cccccc}

588: 	\imagex{1} & \imagey{1} & 1 & 0 & 0 & 0\\

589: 	0 & 0 & 0 & \imagex{1} & \imagey{1} & 1\\

590: 	\imagex{1} & \imagey{1}  & 1 & 0 & 0 & 0\\

591: 	0 & 0 & 0 & \imagex{1} & \imagey{1} & 1\\

592: 	 & & \ldots \\

593: 	\imagex{N} & \imagey{N} & 1 & 0 & 0 & 0\\

594: 	0 & 0 & 0 & \imagex{N} & \imagey{N} & 1\\

595: 	\imagex{N} & \imagey{N} & 1 & 0 & 0 & 0\\

596: 	0 & 0 & 0 & \imagex{N} & \imagey{N} & 1

597: 	\end{array} \right]

598: 	\left[\begin{array}{c}

599: 	m_x \\

600: 	m_y \\

601: 	t_x \\

602: 	n_x \\

603: 	n_y \\

604: 	t_y

605: 	\end{array} \right]

606: 	=

607: 	\left[\begin{array}{c}

608: 	\catalogx{1} \\

609: 	\catalogy{1} \\

610: 	\catalogx{2} \\

611: 	\catalogy{2} \\

612: 	\ldots \\

613: 	\catalogx{N-1} \\

614: 	\catalogy{N-1} \\

615: 	\catalogx{N} \\

616: 	\catalogy{N}

617: 	\end{array} \right]

618: \end{equation}

619:

620: We can write this matrix equation as:

621: \begin{equation}

622:  \mathbf{Ax} = \mathbf{b}

623: \end{equation}

624:

625: To introduce the weight values described in the equations, we construct our

626: weight matrix $ \mathbf{W} $, as follows:

627: \begin{equation}

628: \mathbf{W} = \diag

629: \left(\begin{array}{ccccccccc}

630: \frac{\sqrt{\pairweight{1,1}}}{\pairsigma{1,1}},

631: \frac{\sqrt{\pairweight{1,1}}}{\pairsigma{1,1}},

632: \frac{\sqrt{\pairweight{1,2}}}{\pairsigma{1,2}},

633: \frac{\sqrt{\pairweight{1,2}}}{\pairsigma{1,2}},

634: \ldots,

635: \frac{\sqrt{\pairweight{N,N}}}{\pairsigma{N,N}},

636: \frac{\sqrt{\pairweight{N,N}}}{\pairsigma{N,N}}

637: \end{array} \right)

638: \end{equation}

639:

640: Our final matrix equation can then be written as:

641: \begin{equation}

642:  \mathbf{WAx} = \mathbf{Wb}

643: \end{equation}

644: We then find $\mathbf{x}$ such that the squared residuals,

645: $\mathbf{ (WAx-Wb)^T (WAx-Wb) }$,

646: are minimized. By construction, $\mathbf{x}$ describes the transformation that

647: best satisfies all of the equations we previously constructed, and can be

648: found using a standard least-squares solver.
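
As a sketch of what such a solver computes (assuming, beyond what is stated
above, that $\mathbf{WA}$ has full column rank), setting the gradient of the
squared residuals to zero gives the weighted normal equations. Because
$\mathbf{W}$ is diagonal, $\mathbf{W}^T\mathbf{W} = \mathbf{W}^2$, so
\[
\mathbf{A}^T \mathbf{W}^2 \mathbf{A} \, \mathbf{x}
= \mathbf{A}^T \mathbf{W}^2 \mathbf{b}
\quad \Longrightarrow \quad
\mathbf{x} = \left( \mathbf{A}^T \mathbf{W}^2 \mathbf{A} \right)^{-1}
\mathbf{A}^T \mathbf{W}^2 \mathbf{b} .
\]
The diagonal entries of $\mathbf{W}^2$ are
$\pairweight{ij} / {\pairsigma{ij}}^2$, so a pair with negligible weight has
negligible influence on $\mathbf{x}$. The closed form is shown only for
illustration; QR- or SVD-based solvers are generally better conditioned than
forming the inverse explicitly.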

649:

650: Using the transformation defined by $\mathbf{x}$ we can calculate the

651: coordinates of our warped image points.

652: \begin{eqnarray}

653: 	\displaystyle

654: 	{\imagex{i}}' & = & m_x \imagex{i} + m_y \imagey{i} + t_x \nonumber \\

655: 	{\imagey{i}}' & = & n_x \imagex{i} + n_y \imagey{i} + t_y

656: \end{eqnarray}

657:

658: Unlike the simplified version of this problem, the warp found after one

659: iteration is not our final solution. As we warp the image, the distances and

660: weights between the image and catalog points change, and so our objective

661: function is no longer minimized. The solution is to repeatedly recalculate our

662: $\paird{ij}$ and $\pairweight{ij}$ values and our weighted least squares

663: transformation. We perform this iteratively reweighted least-squares operation

664: until the resulting transformations stop changing, which by construction is

665: also when our objective function stops changing. The solution which we

666: converge upon is returned as the best fit of the image onto the catalog.

667:

668: The IRLS method minimizes the objective function, as it was constructed such

669: that the objective function is equal to the sum of the squares of the weighted

670: residuals of the matrix equation.

671: \begin{eqnarray}\displaystyle

672: \mathbf{ (WAx-Wb)^T (WAx-Wb) } & = & \mathbf{ (W(Ax-b))^T (W(Ax-b)) } \nonumber \\

673: & = & \sum_{ij}

674: \left(\frac{\sqrt{\pairweight{ij}}}{\pairsigma{ij} }

675: \left( \catalogx{j} - \nextimagex{i} \right) \right)^2 +

676: \left(\frac{\sqrt{\pairweight{ij}}}{\pairsigma{ij} }

677: \left( \catalogy{j} - \nextimagey{i} \right) \right)^2 \nonumber \\

678: & = & \sum_{ij} \left(\frac{\sqrt{\pairweight{ij}}}{\pairsigma{ij}} \nextpaird{ij} \right)^2 \nonumber \\

679: & = & \sum_{ij} \pairweight{ij} \left(\frac{{\nextpaird{ij}}}{\pairsigma{ij}}\right)^2 \nonumber \\

680: & \approx & \chi^2(t)

681: \end{eqnarray}

682:

683: We say that the residuals only \emph{approximate} the objective function because

684: we use the weights of the previous transformed image ($\pairweight{ij}$) and

685: the distances of the next transformed image ($\nextpaird{ij}$). Though

686: this may seem strange, once the IRLS has converged on a solution and the

687: objective function has stopped changing, the distances in one iteration are

688: effectively equal to the distances in the following iteration.

689:

690: For numerical stability we always use the initial coordinates when

691: constructing our transformation, and only return the final warped image

692: coordinates. The intermediate warped image coordinates are only used for

693: calculating distances and weights. This means that our final image coordinates

694: have only been subjected to one transformation, as opposed to many small

695: transformations.
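
The iteration described above can be summarized schematically as follows,
where the superscript $(k)$ is our own shorthand for the iteration index and
is introduced only for this illustration:
\[
\mathbf{x}^{(k+1)} = \underset{\mathbf{x}}{\arg\min} \,
\norm{ \mathbf{W}^{(k)} \left( \mathbf{A} \mathbf{x} - \mathbf{b} \right) }^2 .
\]
Here $\mathbf{W}^{(k)}$ is rebuilt from the distances and weights measured
after warping the initial image points by $\mathbf{x}^{(k)}$, while
$\mathbf{A}$ is always built from the initial image coordinates. Each
$\mathbf{x}^{(k+1)}$ is therefore a single transformation of the original
points rather than a composition of many small ones.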

696:

697: See Figure~\ref{fig:fitExamples} for some examples of fit and unfit images

698: against correct and incorrect catalog dates.

699:

700: \subsection{Estimating the date}

701: \label{sec:searchSection}

702:

703: Now that we have defined a method for fitting an image to the catalog, and a

704: metric by which we can assess the degree to which an image can be fit to the

705: catalog, we can use a number of techniques to estimate the year in which the

706: image can best be fit to the catalog. We take that year to be the year in

707: which the image was created.

708:

709: This problem can be phrased as such: an image has some unknown chi-squared

710: curve $\chi^2(t)$, of which we wish to find the year $t_0$, the theoretical

711: optimal year such that: \begin{equation} t_0 = \underset{t}{\arg\min} \,

712: \chi^2(t) \end{equation}

713:

714: In the algorithm we will describe, finding $t_0$ is not computationally

715: feasible, so we must settle on finding a reasonable approximation of the

716: minimum $t^*$, such that given an accuracy threshold $\delta_\chi$ we are

717: confident that:

718: \begin{equation}

719: \norm{\chi^2(t_0)-\chi^2(t^*)} \leq \delta_\chi

720: \end{equation}

721:

722: Once we have found $t^*$ and therefore $\chi^2(t^*)$, we also wish to find the

723: extents of our uncertainty region, $(t^*_-, t^*_+)$, such that:

724: \begin{equation}

725: 	\begin{array}{c}

726: 		t^*_- < t^* < t^*_+ \linespace

727: 		\chi^2(t^*_-) = \chi^2(t^*)+1 = \chi^2(t^*_+) \linespace

728: 		\underset{t \in [t^*_-,t^*_+]}{\forall} {\, \chi^2(t) \leq \chi^2(t^*)+1 }

729: 	\end{array}

730: \end{equation}

731:

732: Ideally, we want to find all of these values as accurately as possible while

733: sampling the $\chi^2$ curve as few times as possible. There are a number of

734: methods by which we can accomplish this task, each with different tradeoffs

735: concerning efficiency and assumptions about the shape of the curve.

736:

737: The simplest method for estimating the origin date is through brute force. We

738: sample our $\chi^2$ curve at regular intervals, and take the year in which our

739: objective function scored the lowest as $t^*$. We then linearly interpolate

740: along the curve to find the extents of the uncertainty region. This method is

741: terribly inefficient and assumes nothing about the shape of the curve, so we

742: only use it as an approximate ground truth to which we will compare our final

743: algorithm.
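
As a sketch of the interpolation step (the symbols $t_a$ and $t_b$ are
hypothetical names for two adjacent samples and are not used elsewhere),
suppose that $\chi^2(t_a)$ and $\chi^2(t_b)$ lie on opposite sides of the
level $\chi^2(t^*)+1$. The linearly interpolated crossing is then
\[
t \approx t_a + \left( t_b - t_a \right)
\frac{\chi^2(t^*) + 1 - \chi^2(t_a)}{\chi^2(t_b) - \chi^2(t_a)} ,
\]
applied once on each side of $t^*$ to obtain the two extents of the
uncertainty region. The accuracy of this estimate is limited by the sampling
interval and by the curvature of $\chi^2$ between the two samples.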

744:

745: Our search algorithm begins with first sampling our $\chi^2$ curve at a very

746: broad, regular interval. Though only one initial sample is required, for the

747: figures shown in this paper we sample the curve at $1900$, $1950$, and $2000$.

748: We take the sampled year with the lowest $\chi^2$ score to be $t_n$, and we

749: then iteratively refine $t_n$ into $t_{n+1}$ until we believe that we have

750: found $t^*$.

751:

752: Our objective function was constructed such that we could efficiently

753: calculate $\dchiSq(t)$ and $\ddchiSq(t)$. The equations for these analytical

754: derivatives are elaborate, so we do not present them here. Since $\chi^2(t^*)$

755: is a minimum in the $\chi^2$ curve, we can assume that $\dchiSq(t^*)=0$. We

756: can therefore use Newton's method to find $t_{n+1}$:

757: \begin{equation}

758: 	t_{n+1} = t_n - \frac{\dchiSq(t_n)}{\ddchiSq(t_n)}

759: \end{equation}

760:

761: We iteratively evaluate $t_{n+1}$ until

762: $\norm{\chi^2(t_{n+1})-\chi^2(t_n)} \leq \delta_\chi$

763: holds for two consecutive iterations, at which point we take $t_{n+1}$ as

764: $t^*$. Since Newton's method generally converges quadratically, this is an

765: extremely fast process.\footnote{Of course, since $\chi^2(t)$ is not globally

766: quadratic, Newton's method is not guaranteed to converge from any

767: starting point. To improve numerical conditioning and ensure that

768: the search remains well-behaved in the face of somewhat

769: irregular $\chi^2$ curves, we require that $\ddchiSq(t_n)\ge\epsilon$,

770: where $1+\epsilon$ is the smallest representable number $>1$ on our machine,

771: and we require that $| t_{n+1} - t_n | < 25$ years.

772: Additionally, if Newton's method appears to be diverging or failing to

773: converge, we find $t_{n+1}$ using a binary-search approach in which we sample

774: the midpoint of the area in which the minimum appears to lie. This collection

775: of restrictions effectively amounts to intelligent gradient descent, which we

776: switch to when Newton's method cannot be performed. This fallback system is

777: rarely required, but does sometimes prevent oscillation and search failure.}
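
To illustrate the update under an idealization of our own, suppose that the
curve is exactly quadratic, $\chi^2(t) = c + a \, (t - t_0)^2$ with $a > 0$
(the constants $a$ and $c$ are introduced here for illustration only). Then
$\dchiSq(t_n) = 2 a \, (t_n - t_0)$ and $\ddchiSq(t_n) = 2 a$, so
\[
t_{n+1} = t_n - \frac{2 a \, (t_n - t_0)}{2 a} = t_0 ,
\]
and a single Newton step reaches the minimum from any starting year. Real
$\chi^2$ curves are at best approximately quadratic near $t^*$, so several
steps are needed, and the safeguards described in the footnote matter wherever
the second derivative is small or negative.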

778:

779: Once we have found $t^*$, we can locate $t^*_-$ and $t^*_+$. We use our

780: modified Newton's method with these new goals:

781: \begin{eqnarray}\displaystyle

782: 	\chi^2(t^*_+) = \chi^2(t^*)+1 \nonumber \\

783: 	\chi^2(t^*_-) = \chi^2(t^*)+1

784: \end{eqnarray}

785: This uncertainty region would be the true one-sigma uncertainty region in the

786: limit that the modified chi-squared were the standard linear-fitting

787: chi-squared. Because of the weighting function, in practice this criterion

788: over-estimates the one-sigma uncertainty.

789:

790: We can perform root-finding on these equations using the following formula for

791: iteration:

792: \begin{equation}

793: 	t_{n+1} = t_n - \frac{\chi^2(t_n) - \left( \chi^2(t^*)+1 \right)}{\dchiSq(t_n)}

794: \end{equation}
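
As an illustration, again assuming an exactly quadratic curve,
$\chi^2(t) = \chi^2(t^*) + (t - t^*)^2 / s^2$, where $s$ is a hypothetical
width parameter, the two targets are $t^*_\pm = t^* \pm s$. Writing
$u_n = t_n - t^*$, the iteration becomes
\[
u_{n+1} = u_n - \frac{u_n^2 / s^2 - 1}{2 u_n / s^2}
= \frac{u_n^2 + s^2}{2 u_n}
= \frac{1}{2} \left( u_n + \frac{s^2}{u_n} \right) ,
\]
which is the classical Babylonian iteration for $\sqrt{s^2}$. In this
idealized case it converges to $+s$ from any $u_0 > 0$ and to $-s$ from any
$u_0 < 0$, so the side of $t^*$ on which a search starts determines which
bound it finds. The step is undefined at $u_n = 0$, where the first derivative
vanishes.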

795:

796: We begin our two new searches with a sensible initial estimate of the bounds

797: of the uncertainty region, based on our present knowledge of the $\chi^2$

798: curve. These searches operate under all of the constraints under which the

799: previously detailed search operated, and also converge when

800: $\norm{\chi^2(t_{n+1})-\chi^2(t_{n})} \leq \delta_\chi$ holds for two

801: consecutive iterations.

802:

803: Additionally, we can speed up the total search process by using the

804: transformed image points from the previous iteration to find the new

805: transformation for the next iteration. This heuristic means that as the search

806: converges on a final result, the amount of time required to query new years is

807: dramatically reduced. Also, it becomes easier to visualize the search

808: algorithm as a single bidirectional fitting process, in which we repeatedly

809: fit the image to the catalog and the catalog to the image until both fittings

810: have converged. Just as in the previous section, all transformations are

811: constructed using the initial positions of the points, so our final

812: transformation after searching the $\chi^2$ curve is still very numerically

813: stable. Figure~\ref{fig:searchMethodCompare} shows a comparison of this search

814: algorithm against a brute-force ``ground truth''.

815: Figure~\ref{fig:chiSqResults} shows the modified $\chi^2$ curves for each

816: image as they were estimated by this search algorithm.

817:

818: The output of this process is a polynomial description of the image

819: astrometric WCS, a best-fit year value $t^*$, and an uncertainty region around

820: that value.

821:

822: \subsection{Implementation notes}

823:

824: \bd's two primary performance bottlenecks are calculating the distances

825: between image and catalog points and solving the weighted least-squares

826: problems. In both cases, we are able to use the properties of our weighting

827: function to construct approximations that closely match

828: the true solution.

829:

830: Our algorithm theoretically requires us to repeatedly calculate the distances

831: of all image--catalog pairs. However, due to the properties of the weighting

832: function as described in Section \ref{sec:objectiveSection}, we do not need to

833: know the distances of significantly separated pairs. Because the

834: transformation applied at each iteration tends to be very small, we can

835: generally assume that significantly separated pairs in one iteration will also

836: be significantly separated in the next iteration. This allows us to do one

837: initial calculation of all image--catalog distances, but in later iterations

838: only calculate the distances of image--catalog pairs that will likely cause a

839: change in the value of the objective function. This is a rough heuristic, so

840: we safeguard ourselves by manually recalculating all image--catalog distances

841: every 10 iterations, as well as whenever the IRLS begins to converge. This

842: dramatically speeds up our algorithm, while producing nearly identical results

843: to the na\"{\i}ve brute-force approach.

844:

845: We use the aforementioned properties of our weight function to determine if an

846: image--catalog pair should be considered in the weighted least-squares

847: calculation. A highly separated pair always contributes a nearly-constant

848: value to the residuals, and therefore can be safely ignored. This gives us a

849: slight performance boost.

850:

851: We require an additional threshold for the difference between the objective

852: function from one iteration to the next, which determines when our IRLS

853: operation has converged. We use the very conservative value of $10^{-4}$ as

854: the maximum amount that the $\chi^2$ score of the final IRLS iteration is

855: allowed to change from those of the previous two IRLS iterations.

856:

857: \section{Results}

858:

859: Harvard's interface for accessing its scanned plates of M44 made it difficult

860: to obtain more than $3000$ by $3000$ pixel subsets of the images, though the

861: entire plates are significantly larger. The interface for downloading the

862: images did not provide an obvious mechanism for selecting the same $3000$ by

863: $3000$ pixel subsets of each image, which means that such selection was done

864: by hand, and is therefore not very accurate. Many of the plates suffer from

865: the many sources of noise typical of historical imagery: Some are

866: multiple-exposures, badly out of focus, badly saturated, or cracked, and some

867: contain handwritten labels, digital scanning artifacts, and bad trailing. For

868: the sake of fairly assessing \bd's performance, we split the images into two

869: sets: \numcleantests\ ``science-quality'' images and \numnoisytests\

870: ``low-quality'' images (see Figure~\ref{fig:imageExamples} for examples). For our

871: convenience, we used the JPEG versions of the images, which probably

872: introduces some minor noise in the form of compression artifacts. Harvard

873: graciously provided ground-truth dates for each image, which presumably were

874: taken from logs or from writing on the plates. We take these dates to be true.

875: Though the ground-truth dates range from 1910 to 1975, the dates are not

876: uniformly distributed. See the distribution of images along the y-axis of

877: Figure~\ref{fig:performance} for a demonstration of this clustering.

878:

879: The tests were performed using the modified Newton's method, with affine

880: distortions and an accuracy threshold $\delta_\chi$ of $10^{-4}$. Tests

881: were done on a 2007 Macbook with a 2GHz Intel Core 2 Duo processor, and 2GB of

882: RAM. Median runtime for estimating each image's date was $\sim 2.6$ seconds

883: per image, after source extraction and \an's initial calibration. Results are

884: shown in Table~\ref{table:performance}.

885:

886: \begin{table}[!h]

887: 	\begin{center}

888: 	  \begin{tabular}{| l || c | c | c |}

889: 	    \hline

890: 		  & mean year error (bias) & median absolute error & fraction within uncertainty \\ \hline \hline

891: 		science-quality & $ 1.68 $ & $ 1.29 $ & $ 27/27 $ \\ \hline

892: 		low-quality & $ -5.28 $ & $ 4.00 $ & $ 17/20 $ \\ \hline

893: 		\end{tabular}

894: 	\end{center}

895: 	\caption{Accuracy of estimated dates for the two subsets of data.

896: 	See Figures \ref{fig:performance} and \ref{fig:errorDist} for a

897: 	more detailed visualization of the results.

898: 	}

899: 	\label{table:performance}

900: \end{table}

901:

902: Though the results shown were generated by fitting an affine (linear)

903: transformation in image coordinates, we experimented with increasing the order

904: of the polynomial warp being fitted. Results were very similar to those

905: obtained using affine transformations, although the median absolute error

906: for science-quality images dropped to $1.01$ years for second-order

907: transformations, and to $0.99$ years for third-order transformations. Median

908: absolute error for low-quality images also decreased slightly as the order was

909: increased, as did the bias for science-quality images.

910:

911: Additionally, we tested \bd\ on the five \USNOB\ source images of M44 that we

912: were able to retrieve from the US Naval Observatory Precision Measuring

913: Machine Data Archive, and on the Sloan Digital Sky Survey \citep{york00} image

914: of M44. The dates of the \USNOB\ source imagery were estimated accurately (all

915: within six years of the true dates, and all within the uncertainties), which

916: is as good as we would expect performance to be given the relatively low

917: resolution ($3.2\,\arcsec\,\pix^{-1}$) of the source imagery that we were able

918: to obtain. The SDSS image was estimated to have been taken in late 2004, and

919: was actually taken in 2006.

920:

921: We assembled an alternate test-bed of ten amateur images of M44 that we were

922: able to find online. The websites on which we found most of our imagery did

923: not explicitly note the date at which the image was taken, so we were forced

924: to use the ``date'' tag in each image's EXIF meta-data as the ground truth.

925: For many of the images, \bd\ provided accurate dates and uncertainties that

926: are consistent with our previous findings: all estimated dates lay within

927: our uncertainty bounds, accuracy generally depended on the resolution and

928: quality of each image, and most estimated dates (for all sufficiently

929: high-resolution images) were within a few years of the true dates. Our

930: ground-truth dates are, unfortunately, very unreliable, as the EXIF data may

931: simply reflect the date at which an image was digitized or modified, rather

932: than the date at which it was imaged. Though this means that we are not able

933: to truly vet \bd's performance for these amateur images, this issue also

934: highlights the utility of this system: the dates of origin of these images are

935: effectively lost, but can be re-estimated.

936:

937: \section{Discussion}

938:

939: We have shown that our \bd\ system can successfully attach time meta-data to

940: historical imaging data. The system runs in seconds on standard inexpensive

941: consumer computer equipment; it does not require large investments of time or

942: money to vet or create time meta-data for large collections of astronomical

943: imaging.

944:

945: The performance of \bd\ will depend on the properties of the input image, and

946: on the properties of the catalog information known about the region of the sky

947: that is being imaged. We can phrase this as two questions: ``What is the

948: information content in an image?'', and ``What is the information content in a

949: catalog star?''

950:

951: In an attempt to empirically assess the information content of an image, we

952: ran a simple experiment in which we varied the resolution of an input image

953: and the number of stars contained in an input image (by downsampling and

954: cropping the image, respectively). The results are shown in

955: Figure~\ref{fig:performanceVs}, where we see that performance depends heavily

956: on an image containing a large number of stars imaged at high resolution. We

957: also explored the effects of different kinds of imaging defects. Our objective

958: function is designed to be robust to false sources, and as such, \bd\ performs

959: very well on images with multiple exposures. Our experiments suggest that

960: saturation, large PSF due to poor focus or trailing, and short exposure time

961: (low sensitivity) most negatively affect \bd's performance. See

962: Figure~\ref{fig:imageExamples} for examples of our accuracy in the face of

963: different kinds of imaging defects. Trailing and saturation can effectively be

964: thought of as decreasing the resolution of our input image (by decreasing our

965: ability to accurately centroid stars), and shallow imaging is effectively

966: equivalent to dropping dim stars out of the image; these are the two trends

967: demonstrated in Figure~\ref{fig:performanceVs}. Once again, \bd's accuracy

968: appears to depend on an image containing many well-imaged (high resolution)

969: stars.

970:

971: The information in a single catalog star (provided that it has been

972: detected in the input image, in the limit that our procedure is

973: equivalent to least-squares fitting) is proportional to that star's

974: contribution to the second derivative of $\chi^2$ with respect to

975: date. This contribution is approximately the square of the magnitude

976: of the star's proper motion, divided by the square of the uncertainty,

977: that is, the square of the signal-to-noise at which the proper motion

978: is detected (where the ``noise'' in this case is the combined

979: uncertainty from the catalog and the image as in

980: equation~\ref{eq:uncertainty}).  We expect \bd's performance on an image to

981: scale roughly with the sum of the squares of the detected catalog

982: stars' proper motion signal-to-noise ratios.  Imagery unlike the

983: imagery analyzed here ought to obtain date calibration with

984: uncertainty that goes down as the sum of the detected catalog stars'

985: proper motion signal-to-noise ratios goes up.
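
A rough sketch of this scaling, under simplifying assumptions of our own:
suppose that the correspondences are known, that the transformation is held
fixed, that star $k$ moves linearly with proper motion of magnitude $\mu_k$,
and that its combined positional uncertainty is $\sigma_k$ (both symbols are
introduced here for illustration only). If the image was taken at $t_0$, the
noise-free residual of star $k$ against the catalog at date $t$ is
$\mu_k \, |t - t_0|$, so that, neglecting cross terms with the measurement
noise,
\[
\chi^2(t) \approx \chi^2(t_0) + \sum_k \frac{\mu_k^2}{\sigma_k^2} (t - t_0)^2 ,
\qquad
\ddchiSq \approx 2 \sum_k \frac{\mu_k^2}{\sigma_k^2} .
\]
Setting $\chi^2(t_0 \pm \Delta t) = \chi^2(t_0) + 1$ then gives
\[
\Delta t \approx \left( \sum_k \frac{\mu_k^2}{\sigma_k^2} \right)^{-1/2} .
\]
In this idealization the date uncertainty falls as the inverse square root of
the summed squared signal-to-noise. In an actual fit, any motion common to all
stars can be absorbed by the fitted transformation, so presumably only the
relative motions contribute to the sum.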

986:

987: Increasing the polynomial order of the transformation on the image plane

988: produces slightly more accurate date estimates, presumably because the input

989: images do have distortions that are represented reasonably by these functions.

990: We are reluctant to advocate unnecessarily large polynomial orders,

991: as---theoretically---the more freedom we give the transformation model, the

992: more irregular our resulting $\chi^2$ curves may become. That being said, we

993: have not seen any evidence that reinforces such a concern. In principle, even

994: more accurate results could be obtained without increasing the number of

995: degrees of freedom in the fit by employing a physical camera model that

996: represents known distortions in the particular camera used to take the data.

997:

998: For each image in our dataset, we performed an experiment to determine the

999: range of initial dates for which our search algorithm is robust. We discovered

1000: each image had a range of at least $300$ years (and on average, $620$ years)

1001: roughly centered around the true year from which the search could be

1002: initialized without the final result being affected. If we ignore our

1003: precaution of using gradient descent when the second derivative of the

1004: $\chi^2$ curve is non-positive, this range is significantly smaller (on

1005: average, about $55$ years). This finding highlights the importance of the

1006: modifications we make to Newton's method in constructing our search algorithm,

1007: and also suggests that the coarse grid of queries with which we initialize our

1008: search is unnecessary---a single initial query in the correct century would

1009: have been more than sufficient.

1010:

1011: \bd\ largely ignores one very important source of information, namely the

1012: brightnesses of the image and catalog stars. This data is used in our

1013: band-pass estimation step, but is then largely ignored. In future versions of

1014: \an\ we plan to utilize this data in a number of ways. We eventually hope to

1015: simultaneously estimate all parameters of a given image, including the image's

1016: WCS, its date of origin, and its band-pass. Estimating all of these

1017: simultaneously should mean that brightness information is implicitly used in

1018: our estimation of the location and date, and should improve our results

1019: accordingly. This would also solve our current conundrum regarding this

1020: process, which is that image--catalog correspondences are required for

1021: band-pass estimation, while band-pass estimation is required for finding

1022: image--catalog correspondences.

1023:

1024: Analysis of the brightness of image stars may be able to play a profound role

1025: in date estimation if we consider the subset with periodic variability. Given

1026: an image containing $k$ stars with different periods, and given sufficient

1027: information concerning the periodic variations in their brightnesses, we

1028: should be able to constrain the date of origin of the image to one of a set of

1029: time intervals in which those $k$ stars are at whatever particular point in

1030: their periods (to within photometric precision). Given the set of intervals

1031: constrained by the periodic variations, we can use the range of dates

1032: determined by \bd\ (that is, from the proper motions of the stars) to select a

1033: potentially very narrow time interval in which the image must have originated.

1034: In principle, it may even be possible to determine the date of origin of an

1035: image solely through periodic brightness, though that would require very good

1036: measurements of the periods of the catalog stars and of the brightnesses of

1037: the image stars.

1038:

1039: \bd's value, on the most superficial level, is clear: this system could be

1040: used to recover lost meta-data (at low precision) for historical and amateur

1041: data that have been archived poorly or not at all. Now that large scanning

1042: projects are underway at photographic archives and the web is providing new

1043: opportunities for file sharing among amateurs and professionals, we need

1044: systems that automatically vet and provide meta-data for data of unknown

1045: provenance.

1046:

1047: Regardless of whether or not imagery already contains reliable date meta-data,

1048: the techniques described in \bd\ may have deep-seated implications for the

1049: calibration of all imagery not taken at the year $\epoch$. The fact that the

1050: date can be well estimated from input images demonstrates both that the images

1051: contain important information about stellar motions, \emph{and} that

1052: astrometric calibration is hampered when calibration is performed with a

1053: catalog projected to an epoch far from the date of the image. A system that is

1054: time sensitive, such as \bd, will plausibly provide the best astrometric

1055: calibration possible for arbitrary imaging.

1056:

1057: What may be \bd's most important consequence is an inversion of the system,

1058: in which we attempt to use imagery to re-estimate the proper motions of

1059: catalog stars. The most straightforward approach to this would be to ``cheat''

1060: and use the ground-truth dates of all input imagery. One could repeatedly:

1061: calibrate each image using the catalog wound to that image's date-of-origin,

1062: re-estimate the proper motions of the catalog stars, and re-wind the catalog

1063: using those new proper motions. This could be thought of as performing

1064: expectation-maximization on the proper motions of the catalog. Of course, an

1065: ideal system would be robust to some (or all) input imagery not having

1066: ground-truth ages. We could estimate the date-of-origin of all unlabeled

1067: imagery, and use these estimates (and their uncertainties) in our

1068: expectation-maximization. Labeled and unlabeled data could be treated

1069: equivalently, except that labeled data would have much less uncertainty

1070: associated with it. This system would then become a two-way street, in which

1071: we do not just reposition images relative to the sky, but also reposition the

1072: sky relative to the images, and dynamically develop a consensus between the

1073: two. The future \an\ ``catalog'' would not have to be a static entity, but

1074: would instead be a consensus of all available imagery---using the \USNOB\ as a

1075: static ``prior.'' \bd\ takes us one step closer to this grand long-term hope

1076: for \an\ becoming an always-changing database of everything we know about the

1077: sky, by allowing \emph{time} to become one more dimension of our data.

1078:

1079:

1080: \acknowledgments We would like to acknowledge generous assistance from Mike

1081: Blanton, Rob Fergus, Yann LeCun, Brett Mensh, Keir Mierle, and Dave Monet. We

1082: thank the USNO-B and DASCH teams for providing the data used for this study.

1083: This project made use of the NASA Astrophysics Data System, the US Naval

1084: Observatory Precision Measuring Machine Data Archive, and data and code from

1085: the \an\ project.

1086:

1087: \begin{thebibliography}{70}

1088:

1089: \bibitem[Barron \etal(2008)]{barron08a}

1090: Barron,~J.~T., Stumm,~C., Hogg,~D.~W., Lang,~D., \& Roweis,~S.,

1091: 2008, \aj, 135, 414

1092:

1093: \bibitem[Hampel \etal(1986)]{hampel86}

1094: Hampel,~F.~R., Ronchetti,~E.~M., Rousseeuw,~P.~J., \& Stahel,~W.~A.,

1095: 1986, \textit{Robust Statistics:\ The Approach Based on Influence Functions,}

1096: Wiley, New York

1097:

1098: \bibitem[Lang \etal(2008)]{lang08a}

1099: Lang,~D., Hogg,~D.~W., Mierle,~K., Blanton,~M., \& Roweis,~S.,

1100: 2007, Science, submitted

1101:

1102: \bibitem[Monet \etal(2003)]{monet03a}

1103: Monet,~D.~G., \etal,

1104: 2003, \aj, 125, 984

1105:

1106: \bibitem[York \etal(2000)]{york00}

1107: York,~D., \etal, 2000, \aj, 120, 1579

1108:

1109: \end{thebibliography}

1110:

1111: \clearpage

1112:

1113: \begin{figure}

1114: \centering

1115: \subfigure[Catalog at year 1914, initial image.]{

1116: 	\label{fig:fitExample1}

1117: 	\resizebox{\twowidthshort}{!}{\includegraphics{f2a.eps}}

1118: }

1119: \subfigure[Catalog at year 1914, fitted image.]{

1120: 	\label{fig:fitExample2}

1121: 	\resizebox{\twowidthshort}{!}{\includegraphics{f2b.eps}}

1122: }

1123: \subfigure[Catalog at year 2000, initial image.]{

1124: 	\label{fig:fitExample3}

1125: 	\resizebox{\twowidthshort}{!}{\includegraphics{f2c.eps}}

1126: }

1127: \subfigure[Catalog at year 2000, fitted image.]{

1128: 	\label{fig:fitExample4}

1129: 	\resizebox{\twowidthshort}{!}{\includegraphics{f2d.eps}}

1130: }

1131: \caption{

1132: 	The extracted sources (size is proportional to brightness) from the 900 by

1133: 900 pixel sub-image shown in Figure~\ref{fig:imageExamples}. These plots

1134: illustrate the error in the initial calibration returned by the \an\

1135: solver, as well as the difference in fitting a historical image to \USNOB\ at

1136: the year 2000 and to the Catalog at 1914, the year which we correctly estimate

1137: to be the image's year of origin.

1138: \label{fig:fitExamples}}

1139: \end{figure}

1140:

1141: \begin{figure}

1142:

1143: 	\resizebox{\onewidth}{!}{\includegraphics{f3a.eps}}

1144:

1145: 	\caption{

1146: Evaluation of the accuracy of our search algorithm. The top curve is from our modified

1147: Newton's method with $\delta_\chi=10^{-4}$, and the bottom curve is from brute-force search,

1148: with an interval of $0.1$ years. The uncertainty regions are identical to

1149: within $10^{-3}$ years, and the estimated dates are within $0.05$ years of

1150: each other. Brute force took 1001 iterations to achieve this accuracy, while

1151: Newton's method took only 23. For clarity's sake, the curves were vertically

1152: separated, and the individual points that were sampled to construct the

1153: brute-force curve are not displayed. \label{fig:searchMethodCompare}}

1154: \end{figure}

1155:

1156: \begin{figure}

1157: \centering

1158: \subfigure[Estimated $\chi^2$ curves of the science-quality images.]{

1159: 	\label{fig:chiSqResults1}

1160: 	\resizebox{\twowidthshort}{!}{\includegraphics{f4a.eps}}

1161: }

1162: \subfigure[Estimated $\chi^2$ curves of the low-quality images.]{

1163: 	\label{fig:chiSqResults2}

1164: 	\resizebox{\twowidthshort}{!}{\includegraphics{f4b.eps}}

1165: }

1166:

1167: \caption{

1168: The estimated $\chi^2$ curve of each image, as generated by our modified

1169: Newton's method, for both datasets. The vertical axes of these plots do not

1170: show the constant contributions of image--catalog pairs whose separations are

1171: always too large to be directly calculated, as including those would make

1172: the curves excessively vertically separated, rendering these plots

1173: incomprehensible.

1174: \label{fig:chiSqResults}}

1175: \end{figure}

1176:

1177:

1178: \begin{figure}

1179: \centering

1180: \subfigure[\examplecaptionA]{

1181: 	\label{fig:imageExample1}

1182: \resizebox{\threewidthshort}{!}{\includegraphics{f5a.eps}}

1183: }

1184: \subfigure[\examplecaptionB]{

1185: 	\label{fig:imageExample2}

1186: 	\resizebox{\threewidthshort}{!}{\includegraphics{f5b.eps}}

1187: }

1188: \subfigure[\examplecaptionC]{

1189: 	\label{fig:imageExample3}

1190: 	\resizebox{\threewidthshort}{!}{\includegraphics{f5c.eps}}

1191: }

1192: \subfigure[\examplecaptionD]{

1193: 	\label{fig:imageExample4}

1194: 	\resizebox{\threewidthshort}{!}{\includegraphics{f5d.eps}}

1195: }

1196: \subfigure[\examplecaptionE]{

1197: 	\label{fig:imageExample5}

1198: 	\resizebox{\threewidthshort}{!}{\includegraphics{f5e.eps}}

1199: }

1200: \subfigure[\examplecaptionF]{

1201: 	\label{fig:imageExample6}

1202: 	\resizebox{\threewidthshort}{!}{\includegraphics{f5f.eps}}

1203: }

1204:

1205: 	\caption{

1206: A series of $900$ by $900$ pixel subsets of our $3000$ by $3000$ pixel images.

1207: The top three images are from our set of science-quality images, and the

1208: bottom three images are from our set of low-quality images. The captions are

1209: of the form ``true date / estimated date''. All but

1210: Figure~\ref{fig:imageExample6} have estimated dates that lie within the

1211: uncertainty region.

1212: 	\label{fig:imageExamples}}

1213: \end{figure}

1214:

1215:

1216: \begin{figure}

1217: \centering

1218: \subfigure[Performance on the science-quality images.]{

1219: \label{fig:performance1}

1220: \resizebox{\twowidthshort}{!}{\includegraphics{f6a.eps}}

1221: }

1222: \subfigure[Performance on the low-quality images.]{

1223: \label{fig:performance2}

1224: \resizebox{\twowidthshort}{!}{\includegraphics{f6b.eps}}

1225: }

1226:

1227: 	\caption{

1228: Date-estimation performance for both datasets. The x-axis is

1229: an arbitrary index denoting the test image (images were sorted by true year),

1230: and the y-axis shows each image's true year, estimated year, and uncertainty

1231: region. The last image in Figure~\ref{fig:performance2} is not shown, as

1232: it is estimated as originating before $1900$.

1233: 	\label{fig:performance}}

1234: \end{figure}

1235:

1236: \begin{figure}

1237: \centering

1238: \subfigure[Error in date estimation.]{

1239: 	\label{fig:errorDist1}

1240: 	\resizebox{\twowidthshort}{!}{\includegraphics{f7a.eps}}

1241: }

1242: \subfigure[Error relative to uncertainty.]{

1243: 	\label{fig:errorDist2}

1244: 	\resizebox{\twowidthshort}{!}{\includegraphics{f7b.eps}}

1245: }

1246:

1247: 	\caption{

1248: Histograms of the errors in date estimation, for both datasets.

1249: Figure~\ref{fig:errorDist1} shows the difference between our estimated dates

1250: and the true dates. Figure~\ref{fig:errorDist2} shows those errors relative to

1251: the widths of the uncertainties; effectively, the differences between the

1252: $\chi^2$ scores of the true years and the $\chi^2$ scores of the estimated

1253: years. A relative error of less than $1$ indicates that the true year lies

1254: within the uncertainty. The two outliers in Figure~\ref{fig:errorDist1} are

1255: actually significantly worse than they appear: $+65$ and $-175$ years.

1256: 	\label{fig:errorDist}}

1257: \end{figure}

1258:

1259: \begin{figure}

1260:

1261: 	\centering

1262: 	\subfigure[Performance versus resolution.]{

1263: 		\label{fig:performanceVs1}

1264: 		\resizebox{\twowidthshort}{!}{\includegraphics{f8a.eps}}

1265: 	}

1266: 	\subfigure[Performance versus number of image stars.]{

1267: 		\label{fig:performanceVs2}

1268: 		\resizebox{\twowidthshort}{!}{\includegraphics{f8b.eps}}

1269: 	}

1270:

1271: 	\caption{

1272: Plots showing performance relative to the resolution of the input image, and to

1273: the number of stars the input image contains. Figure~\ref{fig:performanceVs1}

1274: was produced by repeatedly downsampling the input image, while

1275: Figure~\ref{fig:performanceVs2} was produced by repeatedly cropping out the

1276: borders of the input image. Note that in downsampling the image, some smaller

1277: stars stop being detectable by our source-extraction algorithm, so the number

1278: of stars in the image decreases with the resolution of the image.

1279: 	\label{fig:performanceVs}}

1280: \end{figure}

1281:

1282: \end{document}

1283: