\appendix A. Statistical error analysis

In the physics analysis of the runs $A_1$--$A_3$, $B_1$--$B_4$ and
$D_1$--$D_5$, we kept track of the statistical errors using
the jackknife method. In particular,
any correlations among the errors of different observables
were always properly taken into account.
Here we summarize our conventions and briefly explain
the basic procedures that we used.

\subsection A.1 Jackknife samples

Let $A_{r}$, $r=1,\ldots,R$, be a set of
primary stochastic observables and
$a_{r,1},\ldots,a_{r,N}$ a sequence of $N$ measured values of
these. In lattice QCD the most common primary observables are the Wilson
loops and sums of products of quark propagators.
The jackknife method assumes that
the measured values are unbiased and statistically independent.
We shall thus take it for granted that the residual autocorrelations
are negligible in the cases of interest (see sect.~4).

The averages $\abar_r$ of the observables $A_r$
and the associated statistical error covariance $C_{rs}$
are given by
\equation{
   \abar_r={1\over N}\sum_{i=1}^Na_{r,i},
   \enum
   \next{2ex}
   C_{rs}={1\over N(N-1)}\sum_{i=1}^N
   \left(a_{r,i}-\abar_r\right)
   \left(a_{s,i}-\abar_s\right).
   \enum
}
If we introduce
the jackknife samples
\equation{
   a^J_{r,i}=\abar_r+c_N\left(\abar_r-a_{r,i}\right),
   \qquad
   c_N=\left(N(N-1)\right)^{-1/2},
   \enum
}
an equivalent expression for the error matrix is
\equation{
   C_{rs}=\sum_{i=1}^N
   \left(a^J_{r,i}-\abar_r\right)
   \left(a^J_{s,i}-\abar_s\right).
   \enum
}
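The equivalence of eqs.~(A.2) and (A.4) can be checked in one line,
using only the definitions above.
Since $a^J_{r,i}-\abar_r=c_N\left(\abar_r-a_{r,i}\right)$, each term
in the sum (A.4) is $c_N^2$ times the corresponding term in eq.~(A.2),
$$
   \sum_{i=1}^N
   \left(a^J_{r,i}-\abar_r\right)
   \left(a^J_{s,i}-\abar_s\right)
   =c_N^2\sum_{i=1}^N
   \left(a_{r,i}-\abar_r\right)
   \left(a_{s,i}-\abar_s\right),
   \qquad
   c_N^2={1\over N(N-1)}.
$$
The average of the jackknife samples $a^J_{r,i}$ over $i$ is, moreover,
equal to $\abar_r$, because the deviations $\abar_r-a_{r,i}$ sum to zero.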
Note that our definition of the jackknife
samples slightly departs from
the standard conventions, where $c_N=1/(N-1)$.
The modification is numerically insignificant in practice, but
leads to some simplifications
when data from different simulations
are to be combined (see subsect.~A.3).
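To make the difference explicit, one may note that the standard samples
are the averages over all measurements except the $i$th one,
$$
   {1\over N-1}\sum_{j\neq i}a_{r,j}
   =\abar_r+{1\over N-1}\left(\abar_r-a_{r,i}\right),
$$
and that the covariance is then obtained from the sum of products
of their deviations from $\abar_r$ only after multiplication by $(N-1)/N$.
With the value of $c_N$ chosen here, the deviations are instead
rescaled by $\sqrt{(N-1)/N}$, so that this $N$-dependent prefactor
is absorbed in the samples themselves.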


\subsection A.2 Error propagation

Apart from estimating the primary observables, one may be
interested in evaluating various functions $f(A_1,\ldots,A_R)$
of them,
which may involve fit procedures and
other complicated operations.
The standard stochastic estimate of such an observable is
\equation{
   \fbar=f(\abar_1,\ldots,\abar_R)
   \enum
}
and the associated series of jackknife estimates is defined by
\equation{
   f^J_i=f(a^J_{1,i},\ldots,a^J_{R,i}),
   \qquad i=1,\ldots,N.
   \enum
}
A little algebra then shows that the expression
\equation{
  \sigma^2=
  \sum_{i=1}^N
  \left(f^J_i-\fbar\right)^2
  \enum
}
provides an estimate of the statistical variance of $\fbar$, which
coincides with the usual error propagation formula (the one
that involves the
gradient of $f$) up to terms of order $1/N^2$. Similarly, the
error covariance of $f$ and any other function $g$ is obtained by
summing $(f^J_i-\fbar)(g^J_i-\gbar)$
over the jackknife samples.
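The algebra leading to the error propagation formula can be sketched
as follows, assuming that $f$ is smooth in a neighbourhood of the
averages (the partial derivatives $\partial f/\partial A_r$ are
a notation introduced here and are evaluated at
$\abar_1,\ldots,\abar_R$).
The deviations $a^J_{r,i}-\abar_r$ are of order $c_N$, and
an expansion in powers of them yields
$$
   f^J_i-\fbar=\sum_{r=1}^R{\partial f\over\partial A_r}
   \left(a^J_{r,i}-\abar_r\right)+{\rm O}(c_N^2).
$$
When this expression is inserted in eq.~(A.7), one obtains
$$
   \sigma^2=\sum_{r,s=1}^R
   {\partial f\over\partial A_r}
   {\partial f\over\partial A_s}C_{rs}
   +{\rm O}(Nc_N^3).
$$
The first term is the error propagation formula, with $C_{rs}$ as given
by eq.~(A.4), and the remainder is smaller by a further factor of
order $1/N$.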

In practice the error formula (A.7) proves to be very convenient.
If an observable is a function of previously calculated
observables, for example, one can take advantage of the
fact that the composition of functions is associative, i.e.~the
jackknife series $f^J_i$ is simply obtained
by inserting the jackknife series of the arguments,
independently of whether these are primary or not.
The data analysis can thus proceed in steps, starting from the
primary observables and progressing to more and more complicated
observables.
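As a simple illustration (the observables $f_1$, $f_2$ and $h$ are
hypothetical and serve only as an example), suppose that the estimates
$\bar f_1$, $\bar f_2$ and the jackknife series
$f^J_{1,i}$, $f^J_{2,i}$ of two observables have already been computed
and that the ratio $h=f_1/f_2$ is of interest.
Its estimate and its jackknife series are then given by
$$
   \bar h=\bar f_1/\bar f_2,
   \qquad
   h^J_i=f^J_{1,i}/f^J_{2,i},
   \qquad i=1,\ldots,N,
$$
without any reference to the primary observables, and the variance
of $\bar h$ is obtained by inserting this series in eq.~(A.7).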


\subsection A.3 Combining data from different runs

Simulations of lattice QCD at different
sea-quark masses, lattice spacings, etc., can be assumed to
be statistically independent. The statistical variance of any observable
that depends on data from several simulations is therefore the sum of the
associated partial variances.
This rule can easily be accommodated in the jackknife analysis
by embedding the jackknife series of the observables in extended
series that include all simulations on which the
observable depends.

The method is best explained by considering two
simulations, where $N_1$ measurements
of some observables $A_r$ are made in the first
and $N_2$ measurements of some other observables $B_s$
in the second. The associated jackknife
series $a^J_{r,1},\ldots,a^J_{r,N_1}$ and $b^J_{s,1},\ldots,b^J_{s,N_2}$
are then computed as before,
starting from the primary observables in each simulation.
Next they are embedded in extended series
\equation{
   a^J_{r,1},\ldots,a^J_{r,N_1},
   \underbrace{\abar_r,\ldots,\abar_r}_{N_2\;{\rm elements}}
   \qquad\hbox{and}\qquad
   \underbrace{\bbar_s,\ldots,\bbar_s}_{N_1\;{\rm elements}},
   b^J_{s,1},\ldots,b^J_{s,N_2}
   \enum
}
of length $N_1+N_2$
such that the first $N_1$ elements are occupied by the jackknife
series from the first simulation and the last $N_2$ elements by those from the
second simulation.

With this assignment, and if the extended
series are treated as ordinary jackknife series,
the correct error covariance matrix of the full set
$A_1,\ldots,A_R,B_1,\ldots,B_S$
of observables is obtained.
Moreover, we may define the jackknife series of
any observable $f(A_1,\ldots,A_R,B_1,\ldots,B_S)$ in the standard
manner and compute its variance using eq.~(A.7).
The embedding trick thus allows the statistical errors
to be pro\-pa\-gated as if there were a single simulation.
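That the embedding reproduces the rule stated at the beginning of this
subsection can be checked explicitly. Let
$\hat a_{r,i}$ and $\hat b_{s,i}$, $i=1,\ldots,N_1+N_2$,
denote the elements of the two extended series (a notation
introduced here only for this check).
The deviations $\hat a_{r,i}-\abar_r$ vanish for $i>N_1$ and
the deviations $\hat b_{s,i}-\bbar_s$ for $i\le N_1$, so that
$$
   \sum_{i=1}^{N_1+N_2}
   \left(\hat a_{r,i}-\abar_r\right)
   \left(\hat b_{s,i}-\bbar_s\right)=0,
$$
i.e.~the observables from different simulations are uncorrelated.
The sums of products of the deviations of the $A$'s alone, or of the
$B$'s alone, reduce to the sums (A.4) in the first and the second
simulation respectively. To leading order in the deviations,
the variance (A.7) of any function of both sets of observables
is therefore the sum of a term that involves only the error covariance
of the $A$'s and a term that involves only the one of the $B$'s.
Since the factors $c_{N_1}$ and $c_{N_2}$ are already included in the
jackknife samples, no simulation-dependent weights need to be inserted
in the sums over the extended series.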
152: