
1: %Begun 30 Dec 1999

2: \documentstyle[12pt,aasms4]{article}

3: \begin{document}

4: \def\arcsec{\ifmmode^{\prime\prime}\;\else$^{\prime\prime}\;$\fi}

5: \def\arcmin{\ifmmode^{\prime}\;\else$^{\prime}\;$\fi}

6: \title{The Number of Publications Used as a Metric of the NOAO WIYN Queue

7: Experiment}

8:

9:

10: \author{Philip Massey, Mary Guerrieri,

11: and Richard R. Joyce}

12: \affil{Kitt Peak National Observatory, National Optical Astronomy

13: Observatory\altaffilmark{1}

14: \\ P.O. Box 26732, Tucson, AZ 85726-6732}

15:

16:

17:

18:

19: \altaffiltext{1}{Operated by AURA

20: under cooperative agreement with the

21: National Science Foundation.}

22: \begin{abstract}

23:

24: We use the number of papers published in 1998 and 1999 to test the

25: hypothesis that the queue observing mode at WIYN leads to a

26: significantly higher scientific throughput than classical mode

27: observing.  We use the papers published from the 4-m, and papers

28: published from the non-queue WIYN time as controls, requiring only that

29: the data be obtained after 1996 August 1, at which time the WIYN queue

30: was in its third full semester of operation, and the WIYN instruments

31: functional and stable.  The number of papers published from the queue

32: data is actually 1.5 times smaller (on a per night basis) than from the

33: 4-m, and roughly comparable to (but lower than) the number published

34: from non-queue WIYN time.  Thus neither comparison offers any {\it

35: support} for the hypothesis that queue leads to a higher scientific

36: throughput.  The number of papers is relatively small, but the

37: statistics are sufficiently robust to {\it reject} the possibility that

38: queue observing at WIYN leads to a factor of 1.5 enhancement in

39: publication rate with a 99.3\%  confidence in comparison to the 4-m,

40: and with an 89.9\% confidence in comparison with non-queue WIYN time.

41: We consider several explanations, and urge that other observatories

42: planning to employ the queue mode include some controls to provide

43: an objective evaluation of its success. \end{abstract}

44:

45: \keywords{PACS codes 95.45 95.55; instrumentation: miscellaneous---methods:

46: miscellaneous---sociology of astronomy}

47:

48: \section{Introduction}

49:

50: The 3.5-m Wisconsin-Indiana-Yale-NOAO (WIYN) telescope was dedicated on

51: 1994 October 15, and shared-risk observing began in 1995 March.  NOAO's share

52: of the time is 40\%, and nearly all of this has been carried out in ``queue" mode, where the observations

53: from highly ranked proposals are placed in a queue and executed during

54: nights assigned to the queue program.  The observations are carried out by

55: highly experienced professionals, who are extremely familiar with

56: the instrumentation, without the direct assistance of the proposing astronomer.

57: A small fraction of the NOAO time

58: is scheduled out in ``classical mode", with the

59: observers present at the telescope.  The time allocated to the

60: university consortium members (roughly 60\%)

61: is all carried out in classical mode.

62:

63:

64: The goal

65: of the NOAO WIYN queue experiment was

66: eloquently described by Silva \& De Young (1996) as an empirical test

67: of ``the hypothesis that in the face of a high over-subscription rate, the

68: science throughput of WIYN can be maximized by executing

69: the most highly ranked science programs first, completing datasets in a timely

70: manner, allowing a larger range of program lengths, and matching the

71: observing program to the observing conditions on an observation-by-observation

72: basis."

73:

74: The WIYN queue has often been described as an ``experiment" at least in part

75: because other observatories are considering scheduling some or all of their time

76: in this mode, and NOAO staff have felt that what we can learn from the WIYN

77: queue will be useful to others.

78:  In an era that sees both the proliferation of very large

79: ($\ge$8 m) telescopes and ever-tightening financial resources, observatories

80: are scrambling to understand how to maximize their scientific return.

81:

82: Queue observing offers a variety of theoretical advantages, as nicely

83: summarized by Mountain (1996) and

84: Boroson et al.\ (1998). For very highly ranked programs that require

85: rare conditions, queue observing may be the only practical way to acquire such data.  Queue observing naturally allows synoptic observations, and such

86: scheduling easily accommodates target-of-opportunity requests, such as optical

87: follow-ups of gamma-ray bursts or supernovae. Furthermore, as instrumentation

88: becomes more complex, queue observing carried out by dedicated observers

89: may result in more efficient use of telescope time than if the observations

90: were carried out by visitors who use the equipment only occasionally.

91: This contention is partially supported by evidence that

92: observers collect less data on the first night of an observing run than

93: on subsequent nights (Bohannan 1998).

94:

95: However, there are obvious down-sides to the queue mode.  The astronomer is not

96: present at the telescope, and therefore cannot make real-time decisions

97: concerning the data.  Serendipity is eliminated, as are the risky programs

98: many of us have snuck in during gaps in our main observing program, and which

99: have sometimes led to the more interesting results.

100: Some of us suspect, rightly or wrongly,

101: that we could better carry out our own observations.

102: And, there is not the same strong sense of

103: ``data ownership" that comes

104: with having carried out the observations ourselves:  the memory of a night

105: may provide details that are relevant to the interpretation of the reduced

106: data, as well as providing an emotional impetus for seeing the project

107: through to its completion.

108:

109: There is also a non-negligible expense of

110: running a queue, which is offset to some degree

111: by the smaller support

112: required for visiting astronomers.

113:

114: Boroson (1996) has described a simulation program that can be used to

115: test how successfully programs are completed in a queue mode

116: vs.\ a classical mode, using Monte Carlo sampling of characteristic

117: observing conditions (weather, seeing) for the site. Boroson

118: et al.\ (1998) used this simulation program to compare queue mode and classical

119: scheduling

120: for two actual semesters (1997) of WIYN programs, concluding that

121: queue scheduling at WIYN has led to a significant gain in efficiency

122: and scientific effectiveness.

123:

124:

125: Now that the queue experiment has run for several years, we thought it would be worth examining the gain using some real-world measure.

126: As emphasized by Boroson et al.\ (1998), much of the argument about observing

127: modes can be emotional.  We seek some metric that we can use to {\it test}

128: the {\it hypothesis} enunciated above that the queue observing mode leads

129: to significant improvement in the {\it science throughput}.  One such

130: simple metric is the number of refereed papers published.  This may not be

131: as meaningful in its long-term impact on astronomy as, say,

132: the number of important new discoveries,

133: but at least it has the advantage of being quantifiable, and, if the

134: experimental and control samples are well matched, equitable and fair.

135:

136:

137: We choose to compare the number of papers produced by the WIYN queue to the

138: following two controls,

139: each with its advantages and disadvantages:

140: \begin{enumerate}

141: \item The number of papers produced by observations made over the same time

142: period with the Mayall 4-m telescope.

143: \item The number of papers produced by observations made over the same time

144: period by non-queue use of WIYN; i.e., primarily the time used by the

145: consortium universities.

146:

147: \end{enumerate}

148:

149: The first comparison has the primary advantage that both the 4-m and WIYN

150: proposals have undergone similar scrutiny by

151: the same time allocation committees (TACs),

152: which often consider such factors as the past track-record of the proposers

153: as well as the

154: scientific excellence of the proposals.  Thus proposers to the 4-m and

155: WIYN will feel similar pressures to publish in a timely manner, and the

156: feasibility of the proposals has been carefully evaluated.  Users of the

157: university time may choose to undertake

158: longer-term projects, leading ultimately

159: to more important results, but not possessing the same rapid turnaround

160: from observing to publication.

161: We offer the second comparison as there may be differences in the actual

162: on-sky performance of the two telescopes that would affect the results:

163: the 4-m is a mature telescope, possibly with fewer teething problems than

164: the newer WIYN.

165:

166: If the queue leads to significantly higher

167: scientific throughput, then we expect that the number of papers published using

168: data obtained via the queue should be significantly

169: greater than those produced by the

170: control samples, after normalization on the basis of the number of scheduled

171: nights.

172:

173: \section{The Data Set}

174:

175: All of the 1998 and 1999 issues of the main US astronomy journals

176: were examined

177: for papers which used 4-m and/or WIYN observations.  The complete list of 135

178: papers is given in Table~A1 of the Appendix.

179:

180: In order to make a fair

181: comparison, we restricted ourselves to only those papers for which the data

182: were obtained in semester ``1996B" or later (i.e., after 1996 August 1).

183: This was the third full semester of WIYN queue time, and the first semester in

184: which both the imager and fiber positioner were fully functional.

185: (A non-linearity problem with the S2KB imager chip was discovered and

186: fixed during the 1996A semester, and a mechanical problem which compromised the

187: positioning accuracy of the Hydra fiber positioner was fixed in 1996 March.)

188:

189: We list in Table~1 the number of papers published during 1998 and 1999.

190: Six papers used both 4-m {\it and} WIYN data; we chose to count each of

191: these papers separately for both telescopes, depending upon the date

192: on which the data were obtained for the telescope under consideration; i.e.,

193: if the data for WIYN were obtained in 1996B or later, but the 4-m data were

194: obtained prior to 1996, it would count as a WIYN publication but not as a

195: 4-m paper. There were six papers in our list in which the data

196: collected were such a minor component of the

197: paper that we chose not to count the paper at all; only one of these used

198: the WIYN queue, and in that case the data had been published previously by

199: the original proposers.

200:

201: \section{Results}

202:

203:

204: \subsection{Comparison of the WIYN and the 4-m}

205:

206: In order to make a valid comparison, we must first take into account that not

207: as much time is scheduled for the WIYN queue as for the 4-m.  We expect the

208: ratio of queue to 4-m nights to be about 40\%, as NOAO receives

209: 40\% of the time on WIYN, and almost

210: all of this goes to the queue.  However, the 4-m is shut down during July

211: and August, while WIYN continues to operate; on the other hand, there

212: are more engineering nights scheduled at WIYN. One could use the

213: total number of clear hours spent observing as

214: the normalization, but these data are hard to extract reliably.  Instead, we

215: took the final observing schedules for semesters 1996B, 1997A, 1997B, 1998A, and

216: 1998B, and simply counted the number of nights assigned to the WIYN queue,

217: and to science operations at the 4-m. (For the latter, we included half-night

218: instrument ``checkout" nights, as much of this time is typically

219: returned to the observers

220: scheduled on the second half; full-night ``check" nights and engineering

221: nights were excluded.  We excluded all engineering nights scheduled at WIYN,

222: although occasionally queue observations are obtained during such time.)

223: The numbers of nights so scheduled for the WIYN

224: queue and for the 4-m are 260 and 656 respectively; i.e.,

225: the number of nights scheduled to the WIYN queue turned out to be 39.6\% of the

226: nights scheduled at the 4-m.

227:

228: If the hypothesis described above is

229: correct, we would expect the number of publications based upon WIYN queue data

230: to be significantly greater than 40\% of those produced by the 4-m.

231: Instead, we find in Table~1 that there

232: were only 9 papers produced by WIYN queue data as opposed to 34 papers

233: produced by the 4-m; i.e., 26\%.  Thus there are actually

234: 1.5 times fewer papers published (on a per night basis) based on

235: queue WIYN data relative to those based on 4-m data. This comparison does not

236: support the hypothesis of greater science throughput by the WIYN queue.

237:
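The per-night comparison can be written out explicitly. As a sketch using only the counts quoted above (9 queue papers from 260 nights, and 34 4-m papers from 656 nights), and writing $R$ for the number of papers per scheduled night (a notation introduced here purely for illustration):
$$R_{\rm queue} = \frac{9}{260} \approx 0.035, \qquad R_{\rm 4m} = \frac{34}{656} \approx 0.052, \qquad \frac{R_{\rm 4m}}{R_{\rm queue}} \approx 1.5 .$$
Equivalently, the ratio of papers, $9/34 \approx 0.26$, divided by the ratio of nights, $260/656 \approx 0.396$, is about 0.67, the inverse of the factor of 1.5.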

238: Can we rule out the hypothesis given the small number statistics?  If we assume

239: the simplest model that a 1$\sigma$ uncertainty in the number of publications

240: $N$ is simply $\sqrt{N}$, then the 1$\sigma$ error on the 0.26 ratio of

241: WIYN to 4-m publications is 0.13.  What does it mean for there to be a

242: ``significant" enhancement in the scientific throughput?  Boroson et al.\ (1998)

243: discuss how their simulation predicts this will depend upon program type,

244: TAC grade, and so on, and that overall about 2.5 times as many programs will be completed by queue observing as with classical observing.  We take here

245: a more conservative approach: certainly a 50\% increase (a factor of 1.5)

246: would be cause for celebration. Were this enhancement present, we would expect

247: there to be 1.5 $\times$ 39.6\% = 59.4\% as many WIYN queue papers as 4-m

248: papers.  We observe 0.26$\pm$0.13.

249: We thus can reject such an increase at a +2.5$\sigma$ level; i.e., with a 99.3\%

250: confidence.\footnote{The rejection probability corresponding to +2.5$\sigma$

251: was found by  $$1.0-0.5 \times (1.0-A_G(\mid x-\mu\mid /\sigma)),$$ where

252: $A_G$ is the

253: integral probability of the normal distribution with a mean of $\mu$ and

254: a standard deviation of $\sigma$; see, for example, Fig.~C-2

255: in Bevington (1969).}

256:
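The arithmetic behind the +2.5$\sigma$ figure can be sketched as follows. The quoted uncertainty of 0.13 is reproduced if one assumes that the fractional $\sqrt{N}$ uncertainties of the two counts are added linearly; this is an illustrative assumption, and adding them in quadrature would instead give about 0.10 and a correspondingly stronger rejection. With $r = 9/34 \approx 0.26$ and uncertainty $\sigma_r$ (symbols introduced here for illustration):
$$\frac{\sigma_r}{r} \approx \frac{1}{\sqrt{9}} + \frac{1}{\sqrt{34}} \approx 0.33 + 0.17 = 0.50, \qquad \sigma_r \approx 0.26 \times 0.50 \approx 0.13,$$
$$\frac{0.594 - 0.26}{0.13} \approx 2.6 ,$$
which is roughly the +2.5$\sigma$ level quoted above.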

257: \subsection{Comparison of Queue vs.\ Non-Queue Time at WIYN}

258:

259: Of the 731 nights scheduled for science at WIYN during 1996B through 1998B,

260: we find that 260 nights were scheduled for queue observations (35.6\%),

261: 27 nights were scheduled for NOAO classical observations (3.7\%), and 444 as

262: university time (60.7\%).  If queue observing produced

263: a significantly higher scientific throughput, we would expect significantly

264: more than 36\% of the papers produced by WIYN data to be based on data obtained

265: with the queue.  Instead, of the 28 total WIYN

266: papers in our sample, 9

267: (32\%) were produced from queue data. This is

268: essentially

269: the same as the fraction of time on WIYN used by the queue (36\%), and therefore does

270: not suggest that queue provides a significant advantage.

271:
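For reference, the fractions of scheduled nights and the corresponding per-night publication rates can be written out from the numbers above. This is a sketch that assumes each of the 28 WIYN papers is attributed to exactly one scheduling mode, so that $28 - 9 = 19$ papers come from the $27 + 444 = 471$ non-queue nights; $R$ again denotes papers per scheduled night, a notation used here only for illustration:
$$\frac{260}{731} \approx 35.6\%, \qquad \frac{27}{731} \approx 3.7\%, \qquad \frac{444}{731} \approx 60.7\%,$$
$$R_{\rm queue} = \frac{9}{260} \approx 0.035, \qquad R_{\rm non\hbox{-}queue} = \frac{19}{471} \approx 0.040, \qquad \frac{R_{\rm non\hbox{-}queue}}{R_{\rm queue}} \approx 1.2 .$$
Under that assumption the queue rate is roughly comparable to, but somewhat lower than, the non-queue rate.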

272: While the data fail to offer any support for the hypothesis, at what level

273: can we reject the claim, given our limited statistics? Using the same argument

274: as above that we would hope for a factor of 1.5 enhancement over the non-queue

275: publication rate, we can ask at what level can we exclude the queue publications

276: amounting to 1.5$\times$ 35.6\% = 53.4\% of the total.  The uncertainty

277: in our 0.32 ratio is 0.17.  Thus we can exclude a 50\% enhancement at

278: the +1.3$\sigma$ level; i.e., with an 89.8\% confidence.

279:
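The same arithmetic as in the 4-m comparison can be sketched here. It again assumes that the fractional $\sqrt{N}$ uncertainties are added linearly, which reproduces the quoted 0.17, and it treats the two counts as independent even though the 9 queue papers are a subset of the 28 WIYN papers. With $r = 9/28 \approx 0.32$ and uncertainty $\sigma_r$ (symbols introduced for illustration):
$$\frac{\sigma_r}{r} \approx \frac{1}{\sqrt{9}} + \frac{1}{\sqrt{28}} \approx 0.33 + 0.19 = 0.52, \qquad \sigma_r \approx 0.32 \times 0.52 \approx 0.17,$$
$$\frac{0.534 - 0.32}{0.17} \approx 1.3 .$$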

280: Nevertheless, it is clear that queue observing does

281: fare better in this comparison than it did in comparison to the 4-m control,

282: although still failing to produce a higher number of publications.

283: Several explanations

284: come to mind.

285: One possibility is that the 4-m simply operates more efficiently

286: than WIYN (at least in the time period when most of the data were acquired),

287: and that

288: it was thus easier to obtain usable data at the 4-m.

289: It is possible that review of queue proposals by an

290: outside TAC leads to a higher publication rate than time used by the

291: universities, who have a preallocated amount of time, which is divided up

292: internally.  (As suggested earlier, the university time may be spent on

293: longer-term programs than the NOAO portion.)  Finally, the 4-m supports a wider

294: complement of instrumentation (such as infrared imaging and spectroscopy) than WIYN,

295: which plausibly provides greater coverage of astronomical disciplines and

296: thus involvement in a wider variety of publications.

297:

298: Although the numbers are small, the very high publication rate

299: for NOAO time that is scheduled {\it classically} at WIYN suggests that it

300: may be the TAC process rather than the telescope or instrumentation

301: which explains why the queue

302: does better in this comparison than it does in comparison to the 4-m:

303: 14\% of the WIYN papers were produced by the small

304: (3.7\%) fraction of time allocated to non-university classical observing.

305: The classically scheduled NOAO time

306: undergoes the same rigorous review as the queue

307: proposals, and thus is under the same pressure to publish rapidly.

308:
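One way to express this contrast, as a rough sketch using only the percentages quoted above, is the ratio $f$ of the share of papers to the share of scheduled nights (a quantity defined here for illustration, where $f = 1$ would mean papers in proportion to time):
$$f_{\rm NOAO\ classical} \approx \frac{14\%}{3.7\%} \approx 3.8, \qquad f_{\rm queue} \approx \frac{32\%}{35.6\%} \approx 0.9 .$$
Since 14\% of 28 papers corresponds to only about four papers, the first value is subject to large small-number uncertainties.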

309: \section{Discussion}

310:

311: Arguably, the WIYN queue has been as well run as it is possible for

312: any queue to be.  A survey carried out of astronomers who had proposed for

313: queue time suggests that people were very satisfied with the quality of the

314: data they received (Boroson et al.\ 1998); some might expect maintaining

315: data quality to be the hardest part of a queue.

316: Yet the evidence so far fails to support the suggestion that queue

317: observing leads to a higher scientific throughput, at least as measured by the

318: number of publications.  Why does this differ from the dramatic predictions of simulations that suggest that a much higher percentage of programs should be

319: completed by the queue mode?

320:

321: We have read through the papers based upon the WIYN queue data and have several observations of our own to offer.  First, let us consider the advantage that

322: queue offers in providing easy ``target of opportunity" (TOO) observations.

323: Of the full set of 11 papers (ignoring the 1996B cutoff), four rely on the

324: TOO advantage of queue for optical followup of

325: gamma-ray bursts (Galama et al.\ 1998) or supernovae

326: (Jha et al.\ 1999; Perlmutter et al.\ 1999; and Riess et al.\ 1998).

327: Although WIYN played a role in these important studies, our examination of

328: these papers suggests that it was a relatively minor role, with the

329: majority of the data coming from elsewhere.   For instance,

330: there are considerably

331: more

332: data from the CTIO 4-m (which is classically scheduled) than from WIYN

333: in the Riess et al.\ (1998) study.

334: Inspection of these papers suggests that

335: there is no lack of ways for large groups to acquire such data.

336: The number of authors on these four papers ranges from

337: 17 to 42, with the large number of participants being a reflection of

338: the degree (and method?) of telescope access.

339: Thus TOO use of WIYN may not be more significant simply

340: because there

341: are other ways of obtaining such data.

342:

343: One of the other purported advantages for queue observing is the ability to

344: take

345: advantage of particularly good

346: conditions, and indeed some programs may not be

347: completed any other way.  However, this advantage is larger the greater the

348: range of conditions.

349: For instance, if the frequency histogram of delivered image quality (DIQ)

350: is very sharply peaked,

351: then queue offers less of an advantage, as all programs will obtain something

352: like the median seeing.  At WIYN the median DIQ (at {\it R})

353: is

354: 0.8 arcseconds, and 0.6 arcsecond or better images are achieved 18\% of the

355: time (Green 1999).

356: Of the 11 queue papers listed in Table~A1,

357: Armandroff, Jacoby, \& Davies (1999) is one of the clearest examples of taking

358: advantage of the queue to obtain the best DIQ.

359: The study utilized sub-arcsecond conditions (0.8 arcsec at {\it B},

360: 0.6 arcsec at {\it V}, and 0.7 arcsec at {\it I}) for deep imaging of a newly

361: discovered dwarf member of the Local Group, Andromeda~VI, after confirming its

362: nature using imaging at the 4-m.  Nevertheless, these DIQ

363: values are not all that

364: different from the median values.

365:

366: However, it may be that the sociological issues raised in the introduction dominate.  The

367: use of queue may reduce the sense of ``data ownership," and given

368: situations of ``data saturation," we are more likely to publish the

369: data more rapidly if we have acquired them ourselves.

370: The use of ``queue mode" on {\it HST} has been perceived as being highly

371: successful,

372: although a meaningful control sample is hard to find for comparison; however,

373: one important difference comes to mind, namely that observing time (to US

374: proposers) usually comes with grants, providing a financial incentive to produce

375: results rapidly, coupled with a 1-year proprietary

376: period for unique data.  An additional consideration is that {\it HST} supplies

377: the user with fully reduced data, unlike WIYN, which provides basic calibration

378: data and requested standard observations, but which does not attempt a

379: ``pipe-line" reduction. However, our own experience with {\it HST} data is

380: that customized reductions are often needed in order to provide the data most

381: meaningful for a particular application.

382:

383: Finally, it may be that we simply have not been sufficiently patient.  As is evident from the 4-m publications, only one-third of the 4-m papers in

384: the past two years relied purely on ``new" data (i.e., all data obtained in

385: the past 3.5 years).  While our control samples explicitly took this into account,

386: we are nevertheless comparing numbers that are on

387: the tails of the distribution of how quickly data find their way into the

388: literature.  This may be particularly true if the datasets from the WIYN

389: queue were to be larger than those in the control samples, or if they

390: take longer to reduce.

391: Current plans call for discontinuing the WIYN queue at the

392: end of semester 2000A, but continuing to provide some synoptic and target-of-opportunity service observing beyond that.  It will be interesting to

393: re-examine the literature five years from now

394: using data obtained in 1996B-2000A

395: as the selection criterion.

396:

397: We note that the quantity we would most like to measure is ``quality", but

398: this is of course harder to do in an objective manner.  Citation rates might

399: provide one means, but not enough time has passed for these to be meaningful.

400: Counting the number of papers is some measure of the ``output" of a telescope,

401: but it is not necessarily the best; it does have the advantage of being

402: objective and reproducible, qualities usually assumed to be desirable

403: in any experiment.

404:

405:

406: Nevertheless, our results suggest that it

407: may benefit observatories to evaluate

408: their queue programs using some external measure, such as the number of

409: publications, if suitable controls can be defined.

410:

411: \acknowledgments

412: Helmut Abt,

413: Dave De Young, David Sawyer,

414: Dave Silva, and Sidney C. Wolff were kind enough to

415: provide thoughtful comments

416: on the manuscript.  We also benefited from conversations with Taft Armandroff,

417: Bruce Bohannan, and

418: Abi Saha on the issues of queue observing.

419:

420: \section{Appendix}

421:

422: In Table~A1 we present the list of papers published in the

423: {\it Astronomical Journal}, the {\it Astrophysical Journal} (Parts 1 and 2),

424: and the {\it Publications of the Astronomical Society of the Pacific}

425: during 1998 and 1999 that used data from the 4-m and/or WIYN.  We list

426: the dates of the first data obtained (from the relevant telescope).  Often

427: this information was directly obtained from the paper, but in many cases we

428: had to contact the authors, or inspect the observing schedule or list of

429: queue programs to determine the actual date or semester.

430:

431:

432:

433: %\input{tabs.tex}

434:

435: \begin{references}

436: \reference {} Armandroff, T. E., Jacoby, G. H., \& Davies, J. E. 1999, AJ, 118,

437: 1220

438:

439: \reference {} Bevington, P. R. 1969, Data Reduction and Error Analysis for

440: the Physical Sciences (New York: McGraw-Hill)

441:

442: \reference {} Bohannan, B. 1998, SPIE, 3349, 30

443:

444: \reference {} Boroson, T. 1996, in New Observing Modes for the

445: Next Century, ed.\ T. Boroson, J. Davies, \& I. Robson (San Francisco: ASP), 13

446:

447: \reference {} Boroson, T., Harmer, D. L., Saha, A., Smith, P. S., Willmarth, D. W., \& Silva, D. R. 1998, SPIE, 3349, 41

448:

449: \reference {} Galama, T. J. et al.\ (16 additional authors) 1998, ApJ, 497, L13

450:

451: \reference {} Green, R. 1999, NOAO Newsletter, 60, 38

452:

453: \reference {} Jha, S. et al.\ (41 additional authors) 1999, ApJS, 125, 73

454:

455: \reference {} Mountain, M. 1996, in New Observing Modes for the

456: Next Century, ed.\ T. Boroson, J. Davies, \& I. Robson (San Francisco: ASP), 235

457:

458: \reference {} Perlmutter, S. et al.\ (31 additional authors) 1999, ApJ, 517, 565

459:

460: \reference {} Riess, A. G. et al.\ (19 additional authors) 1998, AJ, 116, 1009

461:

462: \reference {} Silva, D., \& De Young, D. 1996, NOAO Newsletter, 45, 36

463:

464: \end{references}

465:

466:

467: \end{document}

468:

469: