1: \section{Detector Operation and Performance}
2: \label{sec:ops}
3:
4: As a result of several years of routine data-taking, extensive operational
5: experience has been obtained with the MINOS detector systems. Representative
6: observations are presented here. Section~\ref{sec:ops-rel} summarizes
7: detector performance and reliability information.
16: Section~\ref{sec:ops-quality} describes the systems and procedures used
17: to ensure data quality and to monitor detector system performance in real time.
18: Finally, Sec.~\ref{sec:ops-off-line} gives an overview of the offline software
19: used for detector performance measurements and data analysis.
20:
21: Since the NuMI beam and near detector are located at Fermilab while
22: the far detector is \unit[735]{km} north in Soudan, MN,
23: the coordination of experimental operations is non-trivial.
24: This challenge has been addressed by making as much of the experiment as
25: possible controllable remotely over computer networks.
26: Physicist shift workers are present \unit[24]{hours/day}, \unit[7]{days/week}
27: in the main MINOS control room on the 12th floor of Wilson Hall at Fermilab,
28: where both near and far detectors are monitored and controlled.
29: In addition, the NuMI beam is monitored by the MINOS shift workers but
30: is controlled from the Fermilab accelerator control room. On weekdays
31: between 7:30 and 17:30 US Central Time, four to five full-time
32: technicians are present in the MINOS cavern of the Soudan Underground
33: Laboratory, monitoring and controlling the far detector, and, as needed,
34: repairing the detector subsystems. Both sites have
35: technical support on call for after-hours intervention when necessary.
36: Close coordination among the three control rooms has provided
37: high detector live times for periods when the beam is in operation (see
38: Sec.~\ref{sec:ops-rel-fd}).
39:
40: \subsection{Detector reliability and live-time fractions}
41: \label{sec:ops-rel}
42:
43: \subsubsection{Near detector}
44: \label{sec:ops-rel-nd}
45:
46: The near detector was commissioned in January~2005. Since
47: then the detector has been kept in an operational state
48: except during extended periods when the beam was not in operation.
49: The near detector data taking is usually organized into an approximately
50: \unit[24]{hour} run sequence, consisting of \unit[210]{seconds} of
51: calibration runs followed by a \unit[24]{hour} physics run.
52: Excluding periods when the beam has been off, the fraction of
53: time during which the near detector has been in physics data taking mode
54: has averaged above 98.5\%. Typically more than 99.95\% of the
55: detector channels are operational. The small fraction of lost beam time
56: is due to the daily calibration runs and infrequent detector maintenance.
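
As a rough cross-check of these numbers (a sketch that assumes exactly one
calibration sequence per run sequence and takes the quoted durations at face
value), the calibration runs alone account for a live-time loss of
\[
  \frac{210~\mathrm{s}}{210~\mathrm{s} + 86\,400~\mathrm{s}} \approx 0.24\%,
\]
so that, for an average physics live-time above 98.5\%, the remaining loss
attributable to maintenance and other interruptions is below about
$1.5\% - 0.24\% \approx 1.3\%$.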
57:
58: Most downtime with beam on is due to maintenance, usually for the
59: replacement of failed front-end electronics cards. A mean number of
60: \unit[3.5]{cards} (out of 9,360~total) failed per week before a mass
61: replacement of unreliable on-board fuses was done in the summer of 2007.
62: After that, the failure rate dropped to less than one card per week. The
63: typical intervention to replace a few front-end cards requires
64: approximately one hour of downtime, including the calibration of the
65: replacement channels.
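
For scale, and assuming that failures are independent and spread uniformly
over the cards, the quoted mean corresponds to a per-card failure probability of
\[
  \frac{3.5~\mathrm{cards/week}}{9360~\mathrm{cards}}
  \approx 3.7\times10^{-4}~\mathrm{per\ week},
\]
which, if extrapolated at a constant rate over 52~weeks, would amount to
roughly $3.5\times52 = 182$ cards, or about 1.9\% of the total, per year.
After the fuse replacement the corresponding bound is below
$1/9360 \approx 1.1\times10^{-4}$ per card per week.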
66:
67: \subsubsection{Far detector}
68: \label{sec:ops-rel-fd}
69:
70: The far detector installation was completed in July~2003 and the
71: detector has been recording cosmic ray and atmospheric-neutrino data
72: since then. By the time the beam arrived in the spring of 2005,
73: reliable detector operation had become routine.
74: As with near detector data taking, a
75: \unit[24]{hour} run sequence for physics data and calibrations is
76: the operational mode at the far detector. The overall live-time
77: fraction in 2005 was 96.7\% and has since risen to the upper 90\% range. Typically all channels in the detector are
78: functional, with isolated failures being fixed during beam downtime,
79: usually within hours to days of the appearance of the problem.
80:
81: After the neutrino beam turned on in March~2005, the important metric for
82: evaluating experimental performance was the fraction of
83: protons-on-target (``POT'') delivered while the far detector was taking good data. The MINOS experiment's
84: sensitivity is driven by the statistics of neutrinos observed at the far
85: detector, making it crucial to keep the far detector operating as much as
86: possible. Since the end of beam commissioning in March~2005,
87: the detector has run very smoothly, taking physics data for $>$98.7\% of all
88: delivered NuMI protons for the first year's beam
89: operations~\cite{Adamson:2007gu} and better than 99\% thereafter.
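
The metric can be made concrete with a short sketch. The Python fragment below
is purely illustrative and is not part of the MINOS software: the
\texttt{Spill} record, its fields and the numerical values are hypothetical.
It weights each spill by its delivered POT, so that the result differs from a
simple fraction of spills or of clock time whenever spill intensities vary.

\begin{verbatim}
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spill:
    """One beam spill (hypothetical record layout, for illustration only)."""
    pot: float            # protons on target delivered in this spill
    detector_good: bool   # True if the far detector was taking good data


def pot_weighted_live_fraction(spills):
    """Return (POT delivered with good data) / (all POT delivered)."""
    total = sum(s.pot for s in spills)
    if total == 0:
        raise ValueError("no protons delivered")
    good = sum(s.pot for s in spills if s.detector_good)
    return good / total


# Invented spill intensities; the third spill is lost.
spills = [
    Spill(2.0e13, True),
    Spill(2.0e13, True),
    Spill(1.0e13, False),
    Spill(3.0e13, True),
    Spill(2.0e13, True),
]
fraction = pot_weighted_live_fraction(spills)
spill_fraction = sum(s.detector_good for s in spills) / len(spills)
```
\end{verbatim}

In this toy example four of the five spills are recorded (80\%), but the
POT-weighted fraction is 90\% because the lost spill was a low-intensity one.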
90:
91: \subsection{Data quality and real-time monitoring}
92: \label{sec:ops-quality}
93:
94: A combination of real-time
95: monitoring and offline or post-processing monitoring is
96: performed on a daily basis by physicists on shift to ensure data quality.
97: The systems developed
98: for these tasks keep track of similar parameters for both near and far
99: detectors.
100:
101: The MINOS Online Monitoring (OM) system is designed to provide real-time
102: monitoring of data quality in the near and far detectors. It is based on
103: the system used for the CDF experiment's Run~II operations~\cite{wagner:2001is}
104: and consists of three main processes.
105: i)~The {\it Producer} process receives raw data records from the DAQ via the
106: MINOS Data Dispatcher, which are then processed to fill monitoring
107: histograms. ii)~The {\it Server} process receives monitoring
108: histograms from the Producer, handles connections from external GUI
109: processes, and serves histogram data to these processes on request.
110: iii)~The {\it GUI} process allows browsing and plotting of any of these
111: monitoring histograms.
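
The division of labour among the three processes can be pictured with a
minimal single-process sketch in Python. It is illustrative only and does not
reproduce the OM code or the CDF system on which it is based: the class and
method names, the dictionary layout of a raw record and the
\texttt{channel\_occupancy} histogram are all hypothetical, and the network
connections between processes are replaced by direct method calls.

\begin{verbatim}
```python
from collections import Counter


class Server:
    """Holds the latest histograms and hands out copies on request."""

    def __init__(self):
        self._histograms = {}

    def publish(self, name, histogram):
        # Called by the Producer whenever a histogram is updated.
        self._histograms[name] = dict(histogram)

    def names(self):
        return sorted(self._histograms)

    def request(self, name):
        # Called by a GUI client; a copy keeps the stored data untouched.
        return dict(self._histograms[name])


class Producer:
    """Turns raw records into monitoring histograms."""

    def __init__(self, server):
        self._server = server
        self._occupancy = Counter()

    def process(self, raw_record):
        for channel in raw_record["hit_channels"]:
            self._occupancy[channel] += 1
        self._server.publish("channel_occupancy", self._occupancy)


server = Server()
producer = Producer(server)
for record in ({"hit_channels": [3, 7]}, {"hit_channels": [7]}):
    producer.process(record)

# A GUI client would browse the available names and plot what it receives.
available = server.names()
gui_view = server.request("channel_occupancy")
```
\end{verbatim}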
112:
113: The monitor histograms are grouped into sections, e.g., those relating to
114: digitized hits from the detector (channel occupancies, ADC
115: distributions, etc.), singles rates, and distributions relating to
116: electronics calibration and light injection data. A representative
117: subset of these histograms is checked once every six hours by
118: the shift crews at the detector sites and any problems are entered into
119: the MINOS electronic logbook via a checklist template. All
120: monitoring histograms are archived to tape for future reference.
121: %A stand-alone OMhistory browser has been developed to examine data from
122: %a collection of such files to observe long-term trends in the data.
123:
124: The raw event data are moved to storage at Fermilab and copied over to the
125: Farm Batch System for offline processing. From there the reconstruction is
126: completed with a stable software release. Offline reconstruction is
127: performed on data taken the previous day and used for the offline
128: monitoring and subsequent data quality checks. The data are divided into
129: separate streams in-time and out-of-time with the beam spills to
130: facilitate monitoring as well as analysis.
131:
132: The Offline Monitoring system serves two main purposes. It allows
133: monitoring of the detector systems using reconstructed event data
134: quantities such as event rates per POT, demultiplexing and scintillator
135: strip efficiencies. Additionally, the system provides the ability to
136: verify that the offline
137: production is proceeding normally so that unexpected changes can be
138: tracked down quickly. The Offline Monitoring system has a
139: histogram-making process that runs once per day, reading in all the reconstructed
140: data from the near and far detectors processed during the previous day and
141: producing a set of histograms for monitoring. It also runs the
142: {\it OMhistory} package, a process for viewing how these histograms
143: change over time.
144: % From a
145: %selected set of histograms, a daily checklist is filled by the shift
146: %physicists as a part of the data taking procedure.
147:
148: Other tasks performed during shifts include completing a checklist of the
149: DCS systems described in Sec.~\ref{sec:elec-dcs}, monitoring
150: quasi-real-time event displays for both near and far detectors, and
151: monitoring the NuMI beam performance~\cite{Kopp:2006nq}.
152: These checks are performed during each
153: shift to ensure that problems are noticed promptly and flagged for
154: repair.
155:
156: \subsubsection{Near detector}
157: \label{sec:ops-quality-nd}
158:
159: In order to detect anomalies and trends in both the performance and data
160: quality of the MINOS near detector, several quantities are verified
161: weekly, including the total uncalibrated digitized response of the detector
162: activity per POT during the $\unit[\sim10]{\mu s}$ spill as a function
163: of time (Fig.~\ref{fig:ops-quality-nd}a), the number of reconstructed
164: events per POT as a function of time (Fig.~\ref{fig:ops-quality-nd}b),
165: and the reconstructed event time
166: (Fig.~\ref{fig:ops-quality-nd}c). Instabilities in these quantities
167: may indicate a detector and/or reconstruction problem. The data used for
168: these quantities come from the in-time spill stream, taking advantage of
169: the large flux of beam neutrino events at the near site.
170:
171: \begin{figure*}[htpb]
172: \centering
173: \includegraphics[width=\textwidth,keepaspectratio=true,bb=0 0 740 235]{nim_near_monitor_plots-5.eps}
174: \caption{Distributions examined
175: in near detector data quality monitoring include: (a) the average
176: spill pulse height, (b) the average number of
177: reconstructed events per \unit[$1\times10^{12}$]{POT} in a
178: \unit[13]{day} period, and (c) reconstructed event times in the
179: $\unit[20]{\mu s}$ gate.}
180: \label{fig:ops-quality-nd}
181: \end{figure*}
182:
183: \subsubsection{Far detector}
184: \label{sec:ops-quality-fd}
185:
186: The far detector data are checked weekly for anomalies in
187: the reconstruction and data quality by comparing distributions of several
188: reconstructed quantities to a baseline data set.
189: Examples of distributions monitored are the number of planes crossed
190: by muons in the detector (Fig.~\ref{fig:ops-quality-fd}a), the
191: incoming directions of the tracks and showers
192: (Fig.~\ref{fig:ops-quality-fd}b), and track entry
193: locations (Fig.~\ref{fig:ops-quality-fd}c).
194: Other quantities that help ensure the detector calibration remains stable
195: are the reconstructed velocity for cosmic ray muons
196: for timing calibration and pulse heights of tracks and showers for
197: energy calibration. Cosmic ray muons are most useful for these
198: checks as they are the most abundant data source in the far detector.
199: %Similar checks are also made for major changes in the reconstruction
200: %software. In such cases, distributions made from the previous release
201: %are compared to those made with the current software version.
202:
203: \begin{figure*}[htpb]
204: \centering
205: \includegraphics[width=\textwidth,keepaspectratio=true]{nim_far_monitor_plots.eps}
206: \caption{Distributions examined during
207: far detector data quality monitoring include: (a) the number
208: of planes crossed by cosmic ray muon tracks, (b) the incoming
209: direction of the cosmic ray muons with respect to the beam direction
210: and (c) the entry location for cosmic ray muons along the
211: length of the detector.}
212: \label{fig:ops-quality-fd}
213: \end{figure*}
214:
215: \subsection{Offline software overview}
216: \label{sec:ops-off-line}
217:
218: Although MINOS comprises three detectors (the near, far and calibration
219: detectors) at different depths and
220: latitudes and with different sizes, physical configurations, beam
221: characteristics and electronic readout schemes, the simplicity of the
222: active detector technology has allowed a single framework
223: %\footnote{MINOS offline software: \url{http://www-numi.fnal.gov/off-line_software/srt_public_context/WebDocs/WebDocs.html}}
224: of offline analysis software to be constructed for all detectors. The
225: object-oriented characteristics of the C++ language~\cite{Stroustrup97} have
226: enabled the modularity required for this task. MINOS software is
227: made available to collaborators using the Concurrent
228: Versions System (CVS)\footnote{\url{http://www.nongnu.org/cvs/}}
229: embedded in the SLAC-Fermilab Software Release Tools (SRT) code
230: management
231: system\footnote{\url{http://www.fnal.gov/docs/products/srt/}}.
240: The system uses software
241: libraries from the CERN ROOT project~\cite{Brun:1997pa}, including
242: ROOT tools for I/O, graphical display, analysis, geometric detector
243: representation, database access and networking.
244:
245: Raw data from different data acquisition processes at the MINOS
246: detectors are written to disk as separate ROOT TTree ``streams.'' These
247: include physics event data, pulser calibration data, beam monitoring
248: data and detector control data. This information immediately becomes available
249: for monitoring, calibration and event display processes
250: through an online data ``dispatcher'' service. This utility can access the
251: online ROOT files while they are still open for writing by the MINOS DAQ
252: systems. Subsequent offline processing produces additional TTree
253: streams for event reconstruction results and analysis ntuples.
254:
257: The need to correlate these streams of MINOS data with each other has
258: motivated a key element in the MINOS software strategy called {\it VldContext}
259: (``Validity Context''). VldContext is a C++ class that encapsulates
260: information needed to locate a data record in time and space. Separate
261: streams of data from different sources can be synchronized by comparing
262: their VldContext objects. When MINOS offline software opens files
263: containing these streams, it indexes each stream according to the
264: VldContext of each record. The indexing information can then be used
265: %with ROOT's random access I/O capability
266: to put VldContext-matched
267: records into computer memory simultaneously. The GPS timestamps
268: attached to raw data records enable this matching for far and near
269: detector data in the same offline job. These features are illustrated
270: in Fig.~\ref{fig:validity}.
271:
272: All MINOS record types derive from a common
273: record base class with a header that derives from a common header base
274: class. The minimum data content of the record header is the VldContext,
275: used to associate records on input. The small record header is stored
276: on a separate ROOT TTree Branch from the much larger data blocks. A
277: MINOS stream is an ordered sequence of records stored in a ROOT TTree
278: containing objects of a single record type extending over one or more
279: sequential files. On input, records stored in different streams are
280: associated with each other by VldContext and not by Tree index. The
281: default mode is that records of a common VldContext form an input record
282: set. Alternative input sequencing modes by VldContext are also
283: supported.
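
The association of records by VldContext rather than by position in the
stream can be sketched as follows. This is a minimal Python illustration, not
the MINOS C++ implementation: the fields chosen for \texttt{VldContext} (a
detector label and a timestamp), the \texttt{Record} layout and the function
names are assumptions made for the example, and matching is simplified to
exact equality of contexts.

\begin{verbatim}
```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VldContext:
    """Locates a record in space and time (illustrative fields only)."""
    detector: str
    timestamp_ns: int


@dataclass
class Record:
    context: VldContext   # the small header content
    payload: object       # the much larger data block


def index_stream(records):
    """Map each VldContext to the records of one stream that carry it."""
    index = defaultdict(list)
    for rec in records:
        index[rec.context].append(rec)
    return index


def record_sets(streams):
    """Yield (context, {stream name: records}) in time order.

    Each set gathers the records, from any stream, sharing one VldContext,
    regardless of where they sit within their own stream.
    """
    indexes = {name: index_stream(recs) for name, recs in streams.items()}
    contexts = set()
    for idx in indexes.values():
        contexts.update(idx)
    for ctx in sorted(contexts, key=lambda c: (c.timestamp_ns, c.detector)):
        yield ctx, {name: idx[ctx] for name, idx in indexes.items()
                    if ctx in idx}


a = VldContext("far", 1_000)
b = VldContext("far", 2_000)
streams = {
    "raw": [Record(a, "raw-0"), Record(b, "raw-1")],
    "reco": [Record(b, "reco-1"), Record(a, "reco-0")],  # different order
    "beam": [Record(a, "beam-0")],                       # no record for b
}
sets = list(record_sets(streams))
```
\end{verbatim}

Here the first record set contains the raw, reconstructed and beam records
for context \texttt{a}, even though the reconstructed record is stored second
in its stream, and the second set simply omits the stream that has no record
for context \texttt{b}.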
284:
285: \begin{figure*}[htpb]
286: \centering
287: \begin{minipage}[b]{0.45\textwidth}
288: \includegraphics[width=\textwidth,bb=155 530 375 760]{NIMVldCFigA.eps}
289: \end{minipage}
290: \begin{minipage}[b]{0.4\textwidth}
291: % Made B a bit smaller since it's taller, trying to make the fonts match
292: \includegraphics[width=\textwidth,bb=180 230 385 495]{NIMVldCFigB.eps}
293: \end{minipage}
294: \caption{
295: MINOS data structure. The schematic on the left shows a MINOS record type
296: derivation and header structure. The schematic on the right shows
297: a MINOS stream as an ordered
298: sequence of records stored in a ROOT TTree. On input, records
299: stored in different streams are associated with each other by
300: VldContext and not by Tree index.}
301: \label{fig:validity}
302: \end{figure*}
303:
304: The MINOS offline database contains calibration and survey data,
305: including component locations and connection maps from the construction
306: phase of the detector. These relational tables are keyed with a notion
307: of ``Validity Range'' or scope of VldContext values to which a
308: database record applies. For physics data of a particular VldContext,
309: the offline database interface enables retrieval of matching database
310: records whose Validity Ranges encompass the VldContext of the physics
311: data in question.
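
A Validity Range lookup of this kind can be sketched in a few lines of
Python. The sketch is illustrative and does not reflect the actual database
schema or interface: the table rows, the field names and the choice of a
half-open time interval (start included, end excluded) are assumptions.

\begin{verbatim}
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VldContext:
    detector: str
    timestamp: int        # seconds, arbitrary origin (illustrative)


@dataclass(frozen=True)
class VldRange:
    detector: str
    start: int            # inclusive
    end: int              # exclusive (an assumption of this sketch)

    def encompasses(self, ctx):
        return (ctx.detector == self.detector
                and self.start <= ctx.timestamp < self.end)


@dataclass(frozen=True)
class Row:
    vld_range: VldRange
    value: float          # e.g. a calibration constant (invented numbers)


def matching_rows(table, ctx):
    """Return the rows whose Validity Range encompasses the given context."""
    return [row for row in table if row.vld_range.encompasses(ctx)]


table = [
    Row(VldRange("far", 0, 100), 1.00),
    Row(VldRange("far", 100, 200), 1.05),
    Row(VldRange("near", 0, 200), 0.98),
]
hits = matching_rows(table, VldContext("far", 150))
misses = matching_rows(table, VldContext("far", 250))
```
\end{verbatim}

A query for far detector data at time 150 returns only the second row, while
a context outside every range returns no rows.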
312:
313: %The offline software accesses the offline database through a generic
314: %ODBC\footnote{\url{http://www.unixodbc.org/}} interface, which allows
315: %data to be saved to and retrieved from any compliant database product.
316: %Currently, the central database warehouse is served by
317: %Oracle\footnote{\url{http://www.oracle.com/database/index.html}} and is
318: %around \unit[100]{GB} in size.
319: %% and \url{http://www.easysoft.com/developer/interfaces/odbc/index.html}}.
320: %Local distributed databases are in
321: %MySQL\footnote{\url{http://www.mysql.com/}}
322: %% and \url{http://www.mysql.com/products/connector/odbc/}}
323: %installations and can be substantially smaller depending on the local
324: %needs. Data in the MySQL servers are automatically synchronized with
325: %the Oracle warehouse through a multiple- master replication scheme.
326:
327: %New paragraph from George:
328: The offline software accesses the offline database through
329: a low-level ROOT~\cite{Brun:1997pa} interface, which allows data to be
330: saved to and retrieved from compliant database products.
331: The central database warehouse is served by
332: MySQL\footnote{\url{http://www.mysql.com/}}
333: and is about \unit[100]{GB} in size. Local distributed databases
334: are in MySQL installations and can be substantially smaller
335: depending on the local needs. Data in the distributed MySQL
336: servers are automatically synchronized with the MySQL
337: warehouse through a multiple-master replication scheme.
338:
339: %%% Local Variables:
340: %%% mode: latex
341: %%% TeX-master: "minos-nim"
342: %%% End:
343: