hep-ph0602101/prl.tex
1: \documentclass[twocolumn,twoside,prl,floatfix,letterpaper]{revtex4}
2: 
3: \newif\ifpdf \ifx\pdfoutput\undefined \pdffalse \else \pdftrue \fi
4: \ifpdf 
5:   \usepackage[pdftex]{graphicx} 
6: \else 
7:   \usepackage{graphicx} 
8: %  \renewcommand\topmargin{0in}
9: \fi
10: \usepackage{amsmath}
11: \usepackage{amssymb}
12: \usepackage{fancyhdr}
13: \usepackage{journals}
14: \usepackage{color}
15: \usepackage{rotating}
16: \usepackage[colorlinks,hyperindex]{hyperref}
17: 
18: \bibliographystyle{apsrev}
19: 
20: \def \Bard {{\sc Bard}}
21: \def \Vista {{\sc Vista}}
22: \def \Sleuth {{\sc Sleuth}}
23: \def \Quaero {{\sc Quaero}}
24: \def \TurboSim {{\sc TurboSim}}
25: \def \Pythia {{\sc Pythia}}
26: \def \Herwig {{\sc Herwig}}
27: \def \MadGraph {{\sc MadGraph}}
28: \def \MadEvent {{\sc MadEvent}}
29: \def \Stntuple {{\sc Stntuple}}
30: \def \Experiment {CDF}
31: \def \CdfSim {{\sc CdfSim}}
32: \def \DZero {{D\O}}
33: \def \SM {{\ensuremath{\text{SM}}}}
34: \def \D {{\ensuremath{\cal D}}}
35: \def \H {{\ensuremath{\cal H}}}
36: \def \L {{\ensuremath{\cal L}}}
37: \def\ltapprox{\,\raisebox{-0.6ex}{$\stackrel{<}{\sim}$}\,}
38: \def\gtapprox{\,\raisebox{-0.6ex}{$\stackrel{>}{\sim}$}\,}
39: \def \pmiss {{\,/\!\!\!p}}
40: \def \met {{\,/\!\!\!\!E_{T}}}
41: \def \SumPt {{\ensuremath{\sum{p_T}}}}
42: \def \figuresize {3.0in}
43: \def \scriptP {\ensuremath{{\cal P}}}
44: \def \twiddleScriptP {\ensuremath{\tilde{\cal P}}}
45: \def \detEta {{\ensuremath{\eta_{\text{det}}}}}
46: \newcommand {\poo}[2]{{\ensuremath{p(#1\!\rightarrow\!#2)}}}
47: \newcommand {\abs}[1]{\left| #1 \right|}
48: 
49: \begin{document}
50: 
51: \title{\Bard: Interpreting New Frontier Energy Collider Physics}
52: \author{Bruce Knuteson}
53: \homepage{http://mit.fnal.gov/~knuteson/}
54: \email{knuteson@mit.edu}
55: \affiliation{MIT}
56: \author{Stephen Mrenna}
57: \homepage{http://home.fnal.gov/~mrenna/}
58: \email{mrenna@fnal.gov}
59: \affiliation{FNAL}
60: 
61: %\date{\today}
62: 
63: \begin{abstract}
64:  No systematic procedure currently exists for inferring the underlying
65: physics from discrepancies observed in high energy collider data. We
66: present \Bard, an algorithm designed to facilitate the process of model
67: construction at the energy frontier. Top-down scans of model parameter
68: space are discarded in favor of bottom-up diagrammatic explanations of
69: particular discrepancies, an explanation space that can be exhaustively
70: searched and conveniently tested with existing analysis tools.
71: \end{abstract}
72: 
73: \maketitle
74: 
75: %===============================================================
76: 
77: \begin{figure}
78: \includegraphics[width=2.5in,angle=0]{eebbSleuthExcess}
79: \caption{A cartoon illustration of \Bard's starting point: an excess (circled in red) in data (individual events shown as tick marks on the horizontal axis) over Standard Model prediction (shown as a continuous distribution) in a particular exclusive final state ($e^+e^-b\bar{b}$) on the tail of the total summed scalar transverse momentum of all objects in the event ($\sum{p_T}$).\label{fig:eebbSleuthCartoon}}
80: \end{figure}
81: 
82: \begin{figure}
83: \begin{tabular}{cc}
84: \includegraphics[width=1.6in]{eebb_mystery} &
85: \includegraphics[width=1.6in]{eebb_12} \\
86: \end{tabular}
87: \caption{Chalkboard drawing of (a, left) the incoming and outgoing legs of the Feynman diagram responsible for producing an observed signal in the final state $e^+e^-b\bar{b}$ at the Tevatron, and (b, right) a Feynman diagram possibly responsible for producing this signal.\label{fig:eebbChalkboard}}
88: \end{figure}
89: 
90:  In contemporary high energy physics experiments, it is not uncommon to observe discrepancies between data and Standard Model predictions.  Most of these discrepancies have been explained away over time. To convincingly demonstrate that an observed effect is evidence of physics beyond the Standard Model, it is necessary to show that it is (1) not a likely statistical fluctuation, (2) not introduced by an imperfect understanding of the experimental apparatus, (3) not due to an inadequacy of the implementation of the Standard Model prediction, and (4) interpretable in terms of a sensible underlying theory.  Those who object that (4) is unnecessary fail to appreciate that most hypothesis development in science occurs before, rather than after, publication.  This last criterion is essential, and a successful interpretation will likely point the way to other discrepancies that must exist if the interpretation is correct.
91: 
92: In the search for new electroweak-scale physics at the frontier energy colliders, a model-independent search strategy
93: (\Vista~\cite{ACAT2003:Knuteson:2004nj,Moriond2005:Knuteson:2005ev}
94: or \Sleuth~\cite{ACAT2003:Knuteson:2004nj,Moriond2005:Knuteson:2005ev,
95: SleuthPRL:Abbott:2001ke,SleuthPRD1:Abbott:2000fb,SleuthPRD2:Abbott:2000gx})
96: rigorously addresses whether a statistical fluctuation explains the
97:  observation. Rejecting the hypothesis that the observed effect arises
98: from
99:  a feature of the detector or an inadequacy of the detector simulation
100: is best
101:  handled by requiring consistency among all collected data; this is the
102: purpose of \Vista.
103: %
104:  Our ability to calculate QCD at hadron colliders has improved
105: dramatically over the past decade, with much recent progress in describing multi-jet
106: final states.  Using these tools and demanding consistency among
107: many different observables addresses the third
108: criterion.
109: %
110:  Addressing the fourth requires a practical method for
111: systematically generating new hypotheses to yield sensible interpretations of discrepancies. 
112: %
113: %Notable progress has recently been made in the calculation of the Standard Model: \MadEvent~\cite{MadEvent:Maltoni:2002qb} and other generators are able to provide the Standard Model prediction exactly at tree level for arbitrary final states of low multiplicity, and other efforts are pushing systematic calculations to one loop.
114: 
115: 
116: Event generators containing implementations of physics beyond the Standard Model are able to calculate model predictions within particular scenarios.  Interpreting a specific discrepancy requires working in the inverse direction, from observed phenomenon to the underlying model.  The typical top-down approach of scanning model parameter spaces to find regions compatible with discrepancies is computationally intractable for parameter spaces with dimensionality larger than about five.  %~\footnote{Inserting rough numbers, five parameter points per dimension raised to five dimensions times ten minutes to test each parameter point is one month.}  
117: We are aware of no satisfactory systematic prescription for interpreting possible discrepancies observed at the Tevatron or Large Hadron Collider in terms of the new underlying physics. This Letter introduces \Bard, a bottom-up algorithm whose function is to weave a story to explain observation.
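
As a rough illustration of the cost of a top-down scan (round numbers assumed for the estimate, not measured), consider a grid of $n$ points per dimension in $d$ dimensions, with a time $t$ needed to test each parameter point.  The total time is
\begin{equation}
T = n^{d}\,t = 5^{5} \times 10~\text{min} = 31\,250~\text{min} \approx 22~\text{days}
\end{equation}
for $n=5$, $d=5$, and $t=10$~minutes, or roughly one month, and each additional dimension multiplies $T$ by a further factor of $n$.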
118: 
119:  Working in an effective field theoretic framework, we write ${\cal
120: L}_{\cal H} = {\cal L}_{\text{SM}} + {\cal L}_{\text{new}}$, where
121: ${\cal H}$ denotes a new hypothesis, the sum of Standard Model
122: Lagrangian terms ${\cal L}_{\text{SM}}$ and new terms ${\cal
123: L}_{\text{new}}$ entailing additional Feynman diagrams. Our goal is to
124: determine what new term(s) ${\cal L}_{\text{new}}$ best describe a
125: particular observed discrepancy in the data.
126: The ability to generate new predictions automatically is
127: facilitated by progress in the calculation of the Standard Model: 
128: \MadEvent~\cite{MadEvent:Maltoni:2002qb} and other tools are able to provide the 
129: Standard Model prediction exactly at tree level for arbitrary final 
130: states of low multiplicity, and other efforts are pushing systematic 
131: calculations to one loop.  
132: 
133: 
134: The result of \Vista\ or \Sleuth\ is a discrepancy observed in a
135: particular final state, perhaps on the tail of the distribution of the total
136: summed scalar transverse momentum in the event, as pictured in cartoon
137: form in Fig.~\ref{fig:eebbSleuthCartoon}. In determining the Feynman
138: diagram(s) potentially responsible for producing the observed effect,
139: the nature of the incoming particles determines the incoming legs in the
140: graphs of interest, and the particular final state in which the
141: discrepancy is observed determines the outgoing legs. This is shown as a
142: chalkboard drawing in Fig.~\ref{fig:eebbChalkboard}(a). The game is to
143: provide the middle part of the graph, such as shown in
144: Fig.~\ref{fig:eebbChalkboard}(b).
145: 
146: \Bard\ begins by exhaustively listing reasonable possibilities, involving all operators with mass dimension four or less, and introducing generic new particles that have spin 0, 1/2, or 1; carry electric charge in multiples of 1/3; and transform as singlets, triplets, or octets under SU(3)$_{\text{color}}$.
147: 
148: \Bard\ uses \MadGraph~\cite{MadGraph:Stelzer:1994ta} to systematically generate all diagrams entailed by these new terms, an example of which is shown in Fig.~\ref{fig:eebbChalkboard}(b).  No attention is paid at this stage to whether the particles and interactions introduced fit naturally into a fashionable theoretical framework.  The resulting diagrams are partitioned into stories, collections of diagrams in which the existence of any single diagram in the story implies the existence of the others.  Depending on the final state, \Bard\ will generate between a few and a few thousand stories as potential explanations for the observed discrepancy.
149: 
150:  Each story introduces several new parameters. These parameters are the
151: masses and widths of the introduced particles, and the couplings at each
152: vertex. This parameter space is sufficiently small that it can be
153: scanned, provided a fast yet sensitive analysis algorithm exists to test
154: each of these stories as an explanation for the observed effect.
155: \Quaero~\cite{QuaeroPRL:Abazov:2001ny,chep2003Quaero:Knuteson:2003dn}
156: was designed for this purpose.
157: 
158: \Bard\ passes the new Lagrangian terms ${\cal L}_{\text{new}}$ to \Quaero, which has been prepared with the interesting subset of the data highlighted by \Vista\ or \Sleuth.  \Quaero\ uses \MadEvent\ to integrate the squared amplitude over the available phase space and to generate representative events, and uses \Pythia~\cite{Pythia:Sjostrand:2000wi} for the showering and fragmentation of these events.  \TurboSim\ is used as a fast replacement for the experiment's full detector simulation.  \Quaero\ performs the analysis, numerically integrating over systematic errors, and returns as output $\log_{10}{\cal L}$, where ${\cal L} = p({\cal D}|{\cal H})/p({\cal D}|\text{SM})$ is a likelihood ratio (not to be confused with the Lagrangian ${\cal L}_{\cal H}$): the probability of observing the data ${\cal D}$ assuming the hypothesis ${\cal H}$, divided by the probability of observing the data ${\cal D}$ assuming the Standard Model alone.  The region in the parameter space of the story that maximizes $\log_{10}{\cal L}$ is determined, together with an error estimate on the parameter values.  Repeating this process in parallel for each story enables the stories to be ordered by decreasing goodness of fit to the data.
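
As a sketch of how a best-fit point and its error estimate can be defined (this is the standard asymptotic convention, assumed here for illustration and not taken from the \Quaero\ implementation), let $\theta$ denote the parameters of a story.  The best-fit point $\hat{\theta}$ and an approximate one standard deviation region are then given by
\begin{align}
\hat{\theta} &= \arg\max_{\theta}\, \log_{10}{\cal L}(\theta), \\
\log_{10}{\cal L}(\theta) &\geq \log_{10}{\cal L}(\hat{\theta}) - \frac{1}{2\ln 10},
\end{align}
where $1/(2\ln 10) \approx 0.22$ corresponds to a decrease of $1/2$ in $\ln{\cal L}$, and the second condition applies to a single parameter in the Gaussian limit.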
159: 
160:  The testing discussed so far occurs only on that subset of data in
161: which the discrepancy is observed. Once the list of stories has been
162: ordered, those at the top of the list can be tested further. In the
163: example provided in Fig.~\ref{fig:eebbChalkboard}, a story involving a
164: $Z$ boson as an intermediate state decaying to $e^+e^-$ must also produce
165: effects in $\mu^+\mu^-b\bar{b}$ and $\tau^+\tau^-b\bar{b}$.  A story involving the pair production of charge $4/3$ leptoquarks coupling the first lepton generation with the third quark generation might (by crossing) have other observable consequences at LEP or HERA, depending on the leptoquark mass.
166: The broader consequences of the most compelling stories can then be worked
167: out against all frontier energy collider data using \Quaero.
168: 
169:  Simplifications to the procedure described above decrease the
170: computational cost of the algorithm. Vectors and scalars enter in
171: similar ways into the stories considered; either spin 0 or spin 1
172: particles can be discarded. Electric and color charge and fermion number
173: conservation may be assumed at each vertex. Vertices with four external
174: legs can be ignored. When generating the list of diagrams, it is
175: convenient to exclude those diagrams containing propagators that are not
176: new particles, the top quark, or a gauge boson, on the grounds that
177: a diagram involving a light internal propagator would likely first appear
178: as a discrepancy in another final state through the subdiagram obtained by cutting through the light internal propagator. The widths of the particles can
179: be taken to be small compared to experimental resolution. Since the
180: couplings of diagrams in each story enter only as the square of their
181: product, the parameters associated with each story are one mass for each
182: new particle added, and one overall coupling; this parameter space is
183: most efficiently explored by scanning in the subparameter space of
184: masses, and for each choice of particle masses exploiting the known
185: shape of $\log_{10}{\cal L}$ as a function of the overall coupling to
186: find the maximum. Final states with missing energy require a loop over
187: neutrinos and heavy new particles lacking strong and electromagnetic
188: interactions.  Interference between Standard Model and new diagrams can be ignored.  Stories involving only one new particle may first be considered, and stories involving two or three new particles considered secondarily.  Assumptions such as these explicitly limit the story space in the interest of speed.
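
The ``known shape'' of $\log_{10}{\cal L}$ in the overall coupling can be illustrated in a simplified setting that we assume here only for illustration: a binned Poisson likelihood without systematic errors, with interference ignored.  Let $d_i$ be the number of events observed in bin $i$, $b_i > 0$ the Standard Model expectation, and $g^2 s_i$ the expected contribution of the story, where $g$ is the overall coupling (the product of the vertex couplings) and $s_i \geq 0$ depends only on the masses.  Then
\begin{equation}
\ln{\cal L}(g^2) = \sum_i \left[ d_i \ln\!\left(1 + \frac{g^2 s_i}{b_i}\right) - g^2 s_i \right],
\end{equation}
with
\begin{equation}
\frac{\partial^2 \ln{\cal L}}{\partial (g^2)^2} = -\sum_i \frac{d_i\, s_i^2}{(b_i + g^2 s_i)^2} \leq 0 ,
\end{equation}
so that, for fixed masses, $\ln{\cal L}$ is concave in $g^2$ and has at most one local maximum on $g^2 \geq 0$, which a one-dimensional search locates quickly.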
189: 
190: %Additional assumptions in the interest of speed will no doubt suggest themselves for particular discrepancies that will be seen.
191: %The limitations to our simplified approach are: (1) miss spin-dependent effects, (2) ...
192: % flexible enough to be used in a guided way
193: 
194: Starting bottom-up from a specific observed discrepancy, \Bard\ is able to perform a more targeted search than approaches that scan model parameter spaces.  \Bard\ will allow an experiment to publish an observed discrepancy together with an extensive list of possible interpretations, with this list ordered according to how well each story fits the data, and with best-fit parameter values for each story.  Multiple discrepancies are naturally handled sequentially by \Bard.  A systematic approach will likely be required in sorting out scenarios involving a complex spectrum of new resonances, such as supersymmetry, with \Bard\ regularly suggesting possible explanations of the data that might otherwise be overlooked for years.  As an unanticipated advantage, \Bard\ is also able to determine whether an observed discrepancy has any possible underlying interpretation at all, and assists in understanding which of our assumptions must be violated for an underlying interpretation to exist.
195: 
196: The new theory ${\cal L}_{\cal H}$ is at this point the Standard Model Lagrangian ${\cal L}_{\text{SM}}$ patched with additional terms ${\cal L}_{\text{new}}$ to explain particular effects.  It will likely not be practical to divine a deeper structure until several such additional terms have been added to explain several discrepancies.  Once several such new terms have been added, deriving the deeper structure is largely a matter of identifying similar terms in ${\cal L}_{\cal H}$, and writing the Lagrangian more compactly.  If the $W$ and $Z$ bosons, the top quark, and the Higgs boson were not already known, one could imagine deducing the Standard Model from LEP, Tevatron, and future LHC data in this manner.
197: 
198:  We expect the systematic, bottom-up approach encapsulated in the \Bard\ algorithm and described in this Letter to be useful for interpreting impending discoveries at the Tevatron and Large Hadron Collider.  In the problem domain of interpreting new electroweak scale physics from the current generation of frontier energy colliders, the details of the algorithm are sufficiently worked out for us to be reasonably confident of its success.  More generally, the spirit of automatic model construction described here has application to other interpretations of data that take the form of an effective Lagrangian.  In these problem domains the details of a workable algorithm may or may not turn out to be as trivial as we have found them to be at the electroweak scale.  More generally still, the systematization of model construction may eventually play a useful role in other subfields of science.
199: 
200: 
201: \acknowledgments
202: 
203: Tim Stelzer (UIUC) provided \MadGraph\ and \MadEvent, two crucial ingredients in the approach advocated in this Letter.  Conversations with Michael Niczyporuk (MIT) led to the congealing of the ideas described here.  Khaldoun Makhoul and Georgios Choudalakis (MIT) assisted in \Bard's implementation.  Financial support for this effort comes in part from a Department of Defense Graduate Science and Engineering Fellowship at the University of California at Berkeley; NSF International Research Fellowship INT-0107322 at CERN; a Fermi/McCormick Fellowship at the University of Chicago; and DoE grant DE-FC02-94ER40818 at the Laboratory for Nuclear Science at MIT.
204: 
205: \bibliography{prl}
206: 
207: %===============================================================
208: \end{document}
209: 
210: 
211: