hep-ex0612063/analysis_tools.tex
\section{\label{sec:theta_tools}$\Theta^+$ analysis tools}
\subsection{\label{sec:proton-strategy} The proton identification strategy}
The $\Theta^+$ signal is expected to appear as a narrow peak in the invariant
mass distribution of $\ko$--proton pairs. The $\ko$ are identified by their
$\vo$-like signature (see Sec.~\ref{sec:k0_id}). To separate protons
from $\pi^+$, for each positively charged track we build likelihoods
under the proton and $\pi^+$ hypotheses using the information from the DC,
TRD, and ECAL (see Sec.~3.2), and we take their ratios
${\cal L}_{DC}$, ${\cal L}_{TRD}$, ${\cal L}_{ECAL}$:

\begin{equation}
\label{eq:likelihoods}
\begin{aligned}
&{\cal L}_{DC}(p,L),                 && L - \mbox{ track length} \\
&{\cal L}_{TRD}(p,\epsilon_{TRD}),   && \epsilon_{TRD} - \mbox{ energy release in TRD} \\
&{\cal L}_{ECAL}(p,\epsilon_{ECAL}), && \epsilon_{ECAL} - \mbox{ energy release in ECAL} \\
&\phantom{X}                         && p - \mbox{ track momentum. }
\end{aligned}
\end{equation}

We optimize the cuts on the proton identification likelihood ratios
by maximizing the sensitivity to the expected $\Theta^+$ signal. These
``optimal'' cuts are not necessarily those which maximize the purity of
the proton sample.

The best approach for tuning the proton identification cuts would be to
maximize the sensitivity using a detailed Monte Carlo for $\Theta^+$
production. However, given the poor knowledge of the properties of this
particle, no MC generator describing the production of exotic baryons
is available. We therefore create ``fake'' $\Theta^+$ states
in the NOMAD event generator by using pairs of protons and $\ko$ with an
invariant mass close to the $\Theta^+$ mass. In this approach, however,
the momentum distribution of the ``fake'' $\Theta^+$ states
is determined by the momentum distribution of protons and $\ko$ from the
primary vertex. This can result in wrong ``optimal'' cuts if the true
momentum distribution of $\Theta^+$ particles is very different. We try to
avoid this problem by subdividing the original MC sample into several
narrow bins of $x_F$ and optimizing the cuts for {\em each} $x_F$ interval
independently. The $x_F$ variable is defined as the ratio of the longitudinal
projection of the $\Theta^+$ momentum on the hadronic jet momentum to the
hadronic jet energy in the hadronic center-of-mass frame. The variable
$x_F$ lies in the range $(-1,1)$, with negative (positive) values often called
the {\em target (current)} fragmentation region.

The procedure for tuning the proton identification cuts is then as follows:
\begin{itemize}
\item We build ``fake'' $\Theta^+$ states by taking $\ko$--proton pairs
with $1510 < M < 1550$ MeV$/c^2$. Assuming no $\Theta^+$ polarization,
we expect a flat distribution of $\cos\theta^*$, where $\theta^*$ is the angle between
the proton momentum in the $\Theta^+$ rest frame and the $\Theta^+$ momentum
in the laboratory. We therefore reweight the $\cos\theta^*$ distribution of the
``fake'' states to make it flat. This is our MC ``signal''.
\item Any other combination of a $\ko$ and a positive track not identified
as a proton, but with an assigned proton mass, is taken as the MC background
if its invariant mass $M$ falls in the same mass interval.
\item We split the ``fake'' $\Theta^+$ states into several intervals of
positive track momentum. In each interval we vary the cuts on
${\cal L}_{DC}$, ${\cal L}_{TRD}$, ${\cal L}_{ECAL}$
simultaneously and find those cuts which maximize the
$\mathrm{signal}/\sqrt{\mathrm{background}}$ ratio.
\end{itemize}
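
The scan over the three likelihood-ratio cuts can be illustrated with a short Python sketch for a single momentum interval. It is not the analysis code: the function name \texttt{best\_cuts}, the threshold grid, and the convention that a track passes when every ratio exceeds its threshold are assumptions made for illustration only.

```python
import itertools
import math


def best_cuts(signal, background, grid):
    """Scan thresholds on (L_DC, L_TRD, L_ECAL) for one momentum interval.

    signal, background: lists of (l_dc, l_trd, l_ecal) tuples, one per pair.
    grid: thresholds tried for each ratio (hypothetical values).
    Assumption: a track passes when every ratio exceeds its threshold.
    Returns (cuts, figure_of_merit) with the largest signal/sqrt(background).
    """
    def n_pass(sample, cuts):
        # Count candidates whose three ratios all exceed the thresholds.
        return sum(all(v > c for v, c in zip(cand, cuts)) for cand in sample)

    best_cut, best_fom = None, -1.0
    for cuts in itertools.product(grid, repeat=3):
        s = n_pass(signal, cuts)
        b = n_pass(background, cuts)
        if b == 0:
            continue  # signal/sqrt(background) is undefined here
        fom = s / math.sqrt(b)
        if fom > best_fom:
            best_cut, best_fom = cuts, fom
    return best_cut, best_fom
```

In such a sketch the function would be called once per momentum interval (and per $x_F$ bin), each time with the MC ``signal'' and background candidates falling in that interval.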

We check this procedure on a sample of $\lamdecay$ events.
Fig.~\ref{fig:lambda-xF-optimum} displays the invariant mass distributions
of proton--$\pi^-$ pairs with $-0.6 < x_F < -0.3$ in both MC and data, without proton identification
and with the proton identification optimized for the observation of $\lamdecay$.
With this ``optimal'' proton identification the significance of the
$\lamdecay$ signal increases in both the MC and data samples.

\begin{table}[htb]
\begin{center}
\begin{tabular}{||c|c|c|c|c||}
\hline\hline
                      & \multicolumn{2}{|c|}{$N({p\ko})$}    & \multicolumn{2}{|c||}{purity (in \%)}\\
\cline{2-5}
                      & all  &  ``signal'' & all &  ``signal'' \\
\hline
no ID          & 53463   & 1856                     & 23      & 16.4\\
\hline
``optimal'' ID & 40561   & 1090                     & 27.8    & 22.1\\
\hline\hline
\end{tabular}
\end{center}
\caption{\label{tab:events_purity} Numbers of $p\ko$ pairs and purity of the proton sample in the data for two subsets of events: without proton identification and with ``optimal'' proton identification. The numbers are given for all entries and for the ``signal'' region, $1510<M<1550$ MeV$/c^2$.}
\end{table}
Table~\ref{tab:events_purity} compares the numbers of $p\ko$ pairs and the purity of the proton sample in the data without and with the ``optimal'' proton identification, both for all entries and for the ``signal'' region.

\subsection{\label{sec:massresolution} The $p\ko$ mass resolution}

The expected mass resolution of the $p\ko$ pair is estimated as follows.

% \begin{figure}[htb]
%  \begin{center}
%  \epsfig{file=EPS/invmass_resolution.eps,width=\linewidth}
%  \end{center}
% \caption {\it The expected invariant mass resolution of proton+$\ko$ pair as a function of the invariant mass
% for three different approaches: ``A'', ``B MC'' and ``B DATA'' (see text for details).}
% \label{fig:resolutionmass2}
% \end{figure}

\begin{itemize}
\item For MC events we calculate the invariant masses of the generated and
reconstructed $p K^0_S$ pairs, and we fit the distribution of the difference
between the two values with a Gaussian whose width is taken as the mass
resolution (method ``A'').
\item Using the measured momenta of the proton ($\vec{p}_1$) and of the
$\ko$ ($\vec{p}_2$), the angle $\theta$ between $\vec{p}_1$ and $\vec{p}_2$,
and the associated errors $\sigma(p_1)$ and $\sigma(p_2)$ on the momentum magnitudes, we find (neglecting the error on $\cos\theta$):
\begin{equation}
\begin{aligned}
M^2_{inv} \, \sigma^2(M_{inv}) = & \left(\frac{E_2}{E_1} p_1 - p_2 \cos\theta\right)^2 \sigma^2(p_1) + \\
                                 & \left(\frac{E_1}{E_2} p_2 - p_1 \cos\theta\right)^2 \sigma^2(p_2),
\end{aligned}
\end{equation}
where $E_1$ and $E_2$ are the energies of the proton and of the $\ko$.
This method, ``B'', can be applied to both MC and data events.
\end{itemize}
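
As an illustration of method ``B'', the error propagation can be written as a short Python function. This is a sketch rather than the code used in the analysis: the function name \texttt{mass\_and\_resolution} and its argument names are hypothetical, the inputs are assumed to be in GeV, and the default masses are the proton and $K^0$ masses.

```python
import math


def mass_and_resolution(p1, p2, cos_theta, sigma_p1, sigma_p2,
                        m1=0.93827, m2=0.49761):
    """Return (M_inv, sigma(M_inv)) in GeV for a two-particle system.

    p1, p2: momentum magnitudes; sigma_p1, sigma_p2: their errors.
    m1, m2: particle masses (defaults: proton and K0, in GeV).
    The error on cos_theta is neglected, as in the text.
    """
    e1 = math.hypot(p1, m1)  # E = sqrt(p^2 + m^2)
    e2 = math.hypot(p2, m2)
    m_inv = math.sqrt(m1 ** 2 + m2 ** 2
                      + 2.0 * (e1 * e2 - p1 * p2 * cos_theta))
    d1 = (e2 / e1) * p1 - p2 * cos_theta  # equals M_inv * dM_inv/dp1
    d2 = (e1 / e2) * p2 - p1 * cos_theta  # equals M_inv * dM_inv/dp2
    variance = (d1 * sigma_p1) ** 2 + (d2 * sigma_p2) ** 2
    return m_inv, math.sqrt(variance) / m_inv
```

The two bracketed terms follow from differentiating $M^2_{inv} = m_1^2 + m_2^2 + 2(E_1 E_2 - p_1 p_2 \cos\theta)$ with respect to $p_1$ and $p_2$.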

Fig.~\ref{fig:resolutionmass2} displays the expected mass resolution
of $p\ko$ pairs as a function of their reconstructed invariant mass,
as obtained using method ``A'' (MC only) or method ``B'' (both MC and data).
The results agree well with each other and predict a resolution of about
8.8 MeV$/c^2$ at the $\Theta^+$ mass (1530 MeV$/c^2$).

\subsection{\label{sec:stat_analysis} The statistical analysis}
The signal significance in the data is estimated as follows:
\begin{enumerate}
\item A possible difference in the proton $\cos\theta^*$ distribution between
signal and background is exploited to improve the signal
sensitivity. We take all $\ko$--proton pairs with $1510 < M < 1550$ MeV$/c^2$,
and we split them into 10 intervals with similar statistics: five mass
intervals with $\cos\theta^*$ in the interval $[-1, -0.5)$, and another
five mass intervals with $\cos\theta^*$ in the interval $[-0.5, 1]$.
The total mass interval ($1510 < M < 1550$ MeV$/c^2$) covers well the
expected $\Theta^+$ mass. The mass bin width, 10 MeV$/c^2$, is comparable
to the expected invariant mass resolution of $\ko$--proton pairs.
\item We compute two likelihoods:
\begin{equation}
\label{eq:likelihoods1}
\begin{aligned}
\ln L_B     &= \sum_{i=1}^{10} \left(-b_i + n_i \ln b_i\right), \\
\ln L_{B+S} &= \sum_{i=1}^{10} \left(-b_i - s_i + n_i \ln \left(b_i + s_i\right)\right),
\end{aligned}
\end{equation}
where $b_i$, $s_i$, and $n_i$ are the numbers of predicted background events,
predicted signal events, and observed data events in the $i$-th bin, respectively.
\item We compute the statistical significance of the signal as
\begin{equation}
\label{eq:likelihoods2}
S_L = \sqrt{2 \left(\ln L_{B+S} - \ln L_B\right)}.
\end{equation}
\item We find the resonance mass position $M$, the Breit-Wigner width
$\Gamma$, and the number of signal events $N_s$ which maximize $S_L$.
\end{enumerate}
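
The two likelihoods and the significance $S_L$ can be evaluated with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, the predicted $b_i$ are assumed to be strictly positive, and returning zero when the signal hypothesis does not improve the likelihood is a choice made here, not one stated in the text.

```python
import math


def log_likelihood(n, mu):
    """Poisson log-likelihood, summed over bins, without the ln(n_i!) term."""
    return sum(-m + k * math.log(m) for k, m in zip(n, mu))


def significance(n, b, s):
    """S_L = sqrt(2 (ln L_{B+S} - ln L_B)) for binned counts.

    n: observed events per bin; b, s: predicted background and signal per bin.
    Assumption: returns 0.0 if adding the signal does not raise the likelihood.
    """
    ln_l_b = log_likelihood(n, b)
    ln_l_bs = log_likelihood(n, [bi + si for bi, si in zip(b, s)])
    diff = ln_l_bs - ln_l_b
    return math.sqrt(2.0 * diff) if diff > 0.0 else 0.0
```

In the last step of the procedure, a function of this kind would be maximized with respect to $M$, $\Gamma$, and $N_s$, which together determine the predicted $s_i$.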
For the background we use the procedure described in Sec.~\ref{sec:background}.
The signal is modeled by a Breit-Wigner smeared by a Gaussian resolution
with $\sigma=8.8$ MeV$/c^2$. This algorithm was checked on several generated
distributions containing a Breit-Wigner signal of width $\Gamma$, smeared
by a Gaussian resolution of width $\sigma$ and superimposed on a fluctuating
background. We considered three cases, $\sigma\ll\Gamma$, $\sigma=\Gamma$, and
$\sigma\gg\Gamma$, and found that in all cases the procedure of maximizing
$S_L$ correctly determined the number of signal events and $\Gamma$
(with $\Gamma$ around zero for the case $\sigma\gg\Gamma$).
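
A signal shape of this kind can be sketched numerically in Python. The example assumes a non-relativistic Breit-Wigner normalized to unit area and a simple rectangle-rule convolution truncated at $\pm 6\sigma$; the text does not specify the Breit-Wigner form or the numerical method, and the function names are hypothetical.

```python
import math


def breit_wigner(m, m0, gamma):
    """Non-relativistic Breit-Wigner with unit area (assumed form)."""
    return (gamma / (2.0 * math.pi)) / ((m - m0) ** 2 + gamma ** 2 / 4.0)


def smeared_signal(m, m0, gamma, sigma, n_steps=2001, n_sigma=6.0):
    """Breit-Wigner convolved with a Gaussian resolution of width sigma.

    The convolution integral is approximated by a rectangle-rule sum over
    a symmetric grid covering +-n_sigma * sigma around zero.
    """
    half = n_sigma * sigma
    step = 2.0 * half / (n_steps - 1)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for k in range(n_steps):
        x = -half + k * step
        gauss = norm * math.exp(-0.5 * (x / sigma) ** 2)
        total += breit_wigner(m - x, m0, gamma) * gauss * step
    return total
```

Integrating such a shape over each mass bin and scaling by $N_s$ would give the predicted signal content of that bin.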