cs0104008/intro.tex
1: 
2: \section{Introduction}
3: \label{sec:intro}
4: 
5: Large High Energy Physics (HEP) detectors typically have many hundreds of
6: thousands of readout channels and record very large data
7: samples. The task of storing and managing these data is a challenge
8: which requires sophisticated data management techniques. Initially the data
9: provided by the on-line data acquisition system of the detector
10: must be recorded at typical rates of several megabytes per second.
11: Subsequently fast access to the entirety of the data must be provided
12: for reconstruction and analysis.
13: 
14: Various techniques have been employed by different experiments to meet
15: these requirements. Typically the data are stored in sequential format on
16: magnetic tapes inside a robotic tape storage and access system containing
17: thousands of tape cartridges.  The tapes are mounted on a tape 
18: drive automatically without human intervention, both for reading and writing
19: the data. Currently, typical tape robots provide storage space for up to
20: several hundred terabytes of data. The data accumulated
21: by the ZEUS experiment at the HERA $ep$ collider~\cite{ZEUS,HERA}
22: over eight years of operation amount to approximately 35
23: terabytes. In addition, approximately 40 terabytes of simulation data
24: have been accumulated.
25: 
26: Tape storage systems of the type described above work efficiently when a
27: large fraction of the data to be retrieved is stored on a single tape. This
28: is typically the case for targeted simulation data, but is generally not
29: the case for real event samples, where the subset of events required for a
30: particular analysis may be very sparse.
31: When only a small fraction of the events are required and these are spread
32: out over a large number of tapes, these systems become inefficient. The
33: inefficiency originates
34: both from access to the tapes, typically limited by mechanical
35: constraints
36: in the tape robotics systems, and from access to data on individual
37: tapes, limited by the sequential nature of the data format.  The
38: sequential format requires large amounts of data to be read from the
39: tape into the memory of an analysing computer system in order to
40: extract the desired information.
41: 
42: Various approaches have been used to address this problem. A standard
43: solution involves splitting the data at an early stage into many data
44: samples, often overlapping, according to the foreseen needs of
45: different physics analyses. The split samples are then stored on
46: magnetic tapes or on disks. In either case the data can be
47: analysed efficiently if a high proportion of the events stored in a
48: given sample are required for a particular analysis. However, this method
49: has two disadvantages. Firstly,
50: the samples selected for different physics analyses will generally overlap,
51: so that together they require more total storage
52: space than the original sample. Secondly, the criteria used to split
53: the data must be defined at an early stage when the understanding
54: of the data may still be rudimentary. As a result, the splitting may
55: have to be repeated several times as the understanding of the data
56: advances.
57: 
58: The limitations of this method can be avoided if the data are stored
59: using a database management system with appropriate indexing and query
60: facilities. However, conventional database management systems such as the
61: relational database ORACLE~\cite{ORACLE} have not yet been able to cope
62: with the typical data recording and analysis requirements of large HEP
63: experiments. In particular, in these systems the time needed to
64: retrieve a single event from the global event
65: sample may exceed the computing time needed to analyse the event by
66: orders of magnitude.
67: 
68: In this paper we describe a system which overcomes the limitations
69: described above. The system is built on top of a standard datastore
70: consisting of sequential datafiles stored on magnetic disks or tapes,
71: and uses a commercial object-oriented database management system to
72: provide the missing index and query facilities.  The system was
73: designed and implemented for the ZEUS experiment but it could be
74: adapted for use at other large high energy physics experiments
75: in operation or under construction.
76: 
77: 
78: 
79: 
80: