1: % Flare prediction paper, written February 2004. 
2: % Revised following referee's report, March 2004.
3: 
4: \documentclass[12pt,preprint,a4]{aastex}  
5: 
6: \slugcomment{To appear in the Astrophysical Journal}
7: 
8: \begin{document}
9: \title{A Bayesian Approach to Solar Flare Prediction}
10: \author{M. S. Wheatland}
11: \affil{School of Physics, University of Sydney, NSW 2006, Australia}
12: \email{m.wheatland@physics.usyd.edu.au}
13: 
14: \begin{abstract}
15: A number of methods of flare prediction rely on classification of
16: physical characteristics of an active region, in particular optical
17: classification of sunspots, and historical rates of flaring for a 
18: given classification. However, these methods largely ignore the number 
19: of flares the active region has already produced, in particular the 
20: number of small events. The past history of occurrence of 
21: flares (of all sizes) is an important indicator of future flare 
22: production. We present a Bayesian approach to flare prediction, 
23: which uses the flaring record of an active region together with 
24: phenomenological rules of flare statistics to refine an initial 
25: prediction for the occurrence of a big flare during a subsequent
26: period of time. The initial prediction is assumed to come from one 
27: of the extant methods of flare prediction. The theory of the method 
28: is outlined, and simulations are presented to show how the refinement 
29: step of the method works in practice.  
30: \end{abstract}
31: 
32: \keywords{Sun: activity --- Sun: flares --- Sun: X-rays --- 
33:   methods: statistical}
34: 
35: \section{Introduction}
36: 
37: Solar flares influence local `space weather,' and as a result there is 
38: a demand for accurate flare prediction. Unfortunately no reliable 
39: deterministic method of predicting a flare is known, and existing methods 
40: are probabilistic in nature. 
41: 
42: A number of methods discussed in the literature are based on a commonly
43: used white-light classification of sunspots, and the correlation 
44: between classification and flare occurrence. The McIntosh classification
45: (McIntosh 1990) categorizes a group of sunspots into one of 60 classes,
46: based on three parameters. Historical flare rates for each of the 
47: classifications were used by McIntosh (1990) as the basis of an 
48: `expert system' for flare prediction. The system, called Theophrastus
49: (the associated code is called THEO), also incorporates additional 
50: information including dynamical properties 
51: of spot growth, rotation and shear, magnetic topology inferred from 
52: sunspot structure, magnetic classification, and previous flare activity. 
53: The method is apparently somewhat subjective, involving rules of thumb 
54: incorporated by a human expert. A second approach using the McIntosh 
55: classification was presented by Bornmann and Shaw (1994). In this case 
56: multiple linear regression was used to determine the effective contribution 
57: of each of the McIntosh parameters to the rate of flaring, based on historical 
58: records of flaring. Codes based on the methods of McIntosh (1990) and 
59: Bornmann and Shaw (1994) are used by the Ionospheric Prediction 
60: Service (IPS) of Australia to issue flare predictions at their
61: Learmonth and Culgoora observatories.\footnote{See http://www.ips.gov.au.} 
62: Recently Gallagher, Moon and Wang (2002) implemented a system
63: using historical averages of flare numbers for McIntosh classifications
64: to predict a rate for an active region, and then converted this to
65: a probability of flaring in a day using the assumption of Poisson 
66: statistics. This prediction is given as part of the Big Bear Solar
67: Observatory Active Region Monitor (ARM).\footnote{See 
68: http://beauty.nascom.nasa.gov/arm/latest/.} Finally the US National
69: Oceanic and Atmospheric Administration (NOAA) issues flare 
70: probability forecasts for active regions
71: which include input from THEO.\footnote{See 
72: http://www.sec.noaa.gov/ftpdir/latest/daypre.txt.} 
73: 
74: A shortcoming of methods relying on correlations of flaring with 
75: active region classification based on historical records is that they
76: ignore the important information of how many flares the active region 
77: of interest has already produced. The system of McIntosh (1990) 
78: incorporates information about previous activity, but it is unclear 
79: how objectively this is done, and the information is limited to 
80: the number of large flares already produced by the given active region. 
81: In the flare prediction 
82: literature, the tendency of a region which has produced large flares in 
83: the past to produce large flares in the future is called persistence, 
84: which is recognised as one of the most reliable predictors for large 
85: flare occurrence in 24-hour forecasts (e.g.\ Neidig, Weiborg, \& 
86: Seagraves 1989). In this paper we argue that the history of occurrence of 
87: all flares (large and small) observed in a given active region is an 
88: important indicator as to how the region will flare in the future, and 
89: should be used in any prediction. A related criticism of methods based
90: on classification and historical records is that a given classification
91: may embrace active regions with a variety of flaring rates. If an
92: active region has a flaring rate differing from the average historical
93: rate for its class then the predictions will be in error. 
94: 
95: Studies of solar flare statistics provide simple phenomenological 
96: rules describing flare occurrence. It is well known that flares follow 
97: a power-law size distribution, where by size we mean e.g.\ peak flux 
98: in soft X-rays. More 
99: formally the flare frequency-size distribution $N(S)$ (i.e.\ the number 
100: of events per unit size $S$ and per unit time) may be written
101: \begin{equation}\label{eq:pldist}
102: N(S)=AS^{-\gamma}
103: \end{equation}
104: where $A$ and $\gamma$ are constants. The exact power-law
105: index $\gamma$ depends on the choice of the quantity $S$, but typically 
106: it is found to be in the range 1.5 to 2 (e.g.\ Crosby, Aschwanden,
107: \& Dennis 1992). The power law index $\gamma$ appears to be the same
108: in different active regions~\cite{whe00}, although there is some 
109: evidence that it varies with the solar cycle~\cite{bai93}. A second
110: simple rule concerns the way flares occur in time. Studies of the 
111: rate of occurrence of soft X-ray flares in individual active regions 
112: suggest that events occur as a Poisson process in time (e.g.\ Moon et 
113: al.\ 2001), although many active regions exhibit changes in the 
114: mean rate of events (Wheatland 2001).
115: 
116: In this paper we show how the observed record of flaring in an active
117: region may be used together with the phenomenological rules of 
118: flare statistics to objectively refine an initial flare prediction. 
119: The initial prediction may be based on the McIntosh classification, or 
120: may come from any other prediction method which does not consider the 
121: flare data. The new method
122: is envisaged to work as follows. When an active region appears at the 
123: east limb of the Sun, the best guess as to its future flare productivity 
124: comes from one of the conventional prediction methods. However, as the 
125: active region produces flares, the observed flare statistics are used to 
126: adjust the prediction for future flaring. After many flares have been
127: observed, the prediction for future flaring may be dominated by the 
128: contribution from the observed data. This process --- refining a 
129: probability estimate based on new data --- is naturally performed using 
130: Bayes's theorem (e.g.\ Sivia 1996; Jaynes 2003).   
131: 
132: The layout of the paper is as follows. In \S\,2 a simple approach
133: to flare prediction using only the past record of flaring from an active
134: region [previously presented in Wheatland (2001)] is reiterated. 
135: In \S\,3 the new method of prediction, 
136: combining existing methods and information from observed flare statistics,
137: is described.
138: In \S\,4 simulations are presented showing how the method uses the
139: observed flaring record, and in \S\,5 the results are discussed.
140: 
141: \section{Wheatland (2001)}
142: 
143: Wheatland (2001) presented a method for flare prediction using
144: only observed flare statistics and the assumptions that flares obey
145: Poisson statistics in time, and power-law statistics in size,
146: elaborating on a suggestion by Moon et al.\ (2001). 
147: The approach is briefly reiterated here, since it is part of the new
148: method. 
149: 
150: First assume that there is a threshold size $S_1$ above which
151: all events occurring in an active region are observed, so that the
152: distribution~(\ref{eq:pldist}) applies for events above that size. 
153: The total rate of events larger than $S_1$ is then
154: \begin{equation}
155: \lambda_1=\int_{S_1}^{\infty}N(S)dS=A(\gamma -1)^{-1}
156:   S_1^{-\gamma+1},
157: \end{equation} 
158: assuming $\gamma>1$. Hence the frequency-size distribution may be 
159: rewritten
160: \begin{equation}\label{eq:fdist}
161: N(S)=\lambda_1(\gamma-1)S_1^{\gamma-1}S^{-\gamma}.
162: \end{equation}
163: Suppose the probability of a big event in a given period $\Delta T$ is 
164: required, where by big we mean an event at least as large as 
165: $S_2$. According to the distribution~(\ref{eq:fdist})
166: the rate of events larger than $S_2$ is
167: \begin{equation}\label{eq:rate_big} 
168: \lambda_2=\lambda_1
169:   \left( \frac{S_1}{S_2}\right)^{\gamma-1}. 
170: \end{equation}
171: 
172: Applying the Poisson model of flare occurrence, the probability of at 
173: least one big event during a period $\Delta T$ is given by Poisson 
174: statistics as
175: \begin{equation}\label{eq:prob_big}
176: \epsilon =1-\exp(-\lambda_2 \Delta T).
177: \end{equation}
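As a worked illustration (simple arithmetic added here, using the parameter 
values adopted for the second half of the first simulation in \S\,4, namely 
$\lambda_1=5$ per day, $S_1=1$, $S_2=100$, $\gamma=1.8$ and $\Delta T=1$ day), 
Equations~(\ref{eq:rate_big}) and~(\ref{eq:prob_big}) give
\begin{displaymath}
\lambda_2=5\times 100^{-0.8}\approx 0.126~\mbox{per day},
\qquad
\epsilon =1-\exp (-0.126)\approx 0.12,
\end{displaymath}
so that although small events occur several times per day, a big event 
is expected on only roughly one day in eight.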
178: 
179: Equations~(\ref{eq:rate_big}) and~(\ref{eq:prob_big}) provide the
180: required estimate. The quantities $S_1$, $S_2$ and $\Delta T$ are chosen, 
181: and then the parameters $\lambda_1$ and $\gamma$ (if the precise value
182: of $\gamma$ is assumed unknown) need to be 
183: estimated from the past history of flaring of the active region.
184: Wheatland (2001) assumed that $\gamma$ is the same for all active 
185: regions, and hence known (see Wheatland 2000), 
186: and estimated $\lambda_1$ using the
187: Bayesian procedure of Scargle (1998).
188: 
189: The rationale behind the method of Wheatland (2001) is that the 
190: flare frequency-size distribution is steep so there are very many small 
191: events, which allows $\lambda_1$ to be estimated relatively accurately 
192: from the observed history of flaring in an active region. Hence the
193: estimate of $\epsilon$ should be relatively accurate. To make this
194: point quantitative, note that from Equations~(\ref{eq:rate_big}) 
195: and~(\ref{eq:prob_big}) the uncertainty in the estimate of the
196: probability $\epsilon $ is given approximately by
197: \begin{equation}
198: \frac{\sigma_{\epsilon}}{\epsilon}
199:   =\frac{\lambda_1 \Delta T (S_1/S_2)^{\gamma-1}}
200:   {\exp[\lambda_1 \Delta T (S_1/S_2)^{\gamma-1}] -1}
201:   \frac{\sigma_1}{\lambda_1},
202: \end{equation}
203: where $\sigma_1$ is the uncertainty in $\lambda_1$, and where we have 
204: ignored any uncertainty in $\gamma$. Assuming $S_2\gg S_1$ leads to 
205: $\sigma_{\epsilon}/\epsilon \approx \sigma_1/\lambda_1$.
206: If the rate $\lambda_1$ is determined from 
207: $M$ observed events, then for Poisson statistics we expect
208: $\sigma_1/\lambda_1=M^{-1/2}$, and hence 
209: \begin{equation}\label{eq:unc}
210: \frac{\sigma_{\epsilon}}{\epsilon}\approx M^{-1/2}.
211: \end{equation}
212: Equation~(\ref{eq:unc}) provides a crude estimate of the accuracy of the
213: method. To achieve a 10\% accuracy in the estimate requires of order
214: 100 observed events.
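The prefactor in the expression for $\sigma_{\epsilon}/\epsilon$ follows 
from linear error propagation, which is sketched here for completeness. 
Writing $x=\lambda_2\Delta T=\lambda_1\Delta T(S_1/S_2)^{\gamma-1}$ 
(a shorthand used only in this remark), so that $\epsilon=1-e^{-x}$, and 
holding $\gamma$ fixed, we have
\begin{displaymath}
\sigma_{\epsilon}\approx 
  \left|\frac{\partial \epsilon}{\partial \lambda_1}\right|\sigma_1
  =e^{-x}\frac{x}{\lambda_1}\sigma_1,
\qquad \mbox{and hence} \qquad
\frac{\sigma_{\epsilon}}{\epsilon}\approx
  \frac{xe^{-x}}{1-e^{-x}}\frac{\sigma_1}{\lambda_1}
  =\frac{x}{e^{x}-1}\frac{\sigma_1}{\lambda_1}.
\end{displaymath}
The factor $x/(e^{x}-1)$ tends to unity as $x\rightarrow 0$ and is less 
than unity for $x>0$, so within this approximation $M^{-1/2}$ is an upper 
bound on the fractional uncertainty. As an illustrative example with 
assumed values, $x=0.126$ gives $x/(e^{x}-1)\approx 0.94$, and 
$M=25$, 100 and 400 events correspond to fractional uncertainties of 
at most about 20\%, 10\% and 5\% respectively.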
215: 
216: \section{New method}
217: 
218: \subsection{Approach}
219: 
220: The Wheatland (2001) method shows how to use the flaring record
221: for an active region to make a flare prediction, but it ignores the 
222: other information which is normally the basis of prediction. It is 
223: sensible to combine all of the available information, and in this 
224: section we consider how to do this.
225: 
226: We assume that a sequence of events with sizes $s_1,s_2,...,s_M$ 
227: (all larger than $S_1$) are observed to occur at times 
228: $t_1< t_2< ...< t_M$ respectively in an active region. 
229: These events occur within an observing interval which starts at 
230: time $t_{\rm sta}$ and ends at time $t_{\rm end}$. We also have
231: additional information, which we label $I$, including our 
232: knowledge of the phenomenological rules of flare statistics, and
233: e.g.\ the McIntosh classification of the active region. 
234: The problem is then to estimate $\epsilon$, the probability of a big
235: event, based on the data and the additional information $I$. 
236: By `estimating $\epsilon$' we strictly mean that we want to calculate 
237: a probability distribution for the quantity $\epsilon$, based on the 
238: available information. The peak of this distribution
239: is our most likely value for the probability of occurrence of a big
240: flare, and the width of the distribution is a measure of the
241: uncertainty of that value. To do this we proceed as follows. 
242: First we estimate (calculate probability distributions for) 
243: $\lambda_1$ and $\gamma$ based on the available information, and then 
244: we use these distributions to estimate $\lambda_2$. Then we use the 
245: resulting distribution for $\lambda_2$ together with the relationship~(\ref{eq:prob_big}) to 
246: estimate the desired quantity $\epsilon$. We now consider each of 
247: these steps in turn.
248: 
249: \subsection{Estimating $\gamma$}
250: 
251: First we consider the calculation of 
252: $P_{\gamma}(\gamma )$, the probability distribution for 
253: the power-law index 
254: $\gamma$.\footnote{In the following, probability distributions are given
255: labels such as $P_{\gamma}(\gamma)$ when the actual functional form 
256: of the distribution is needed. When this is not the case 
257: the generic label ${\rm prob}(...)$ is used to denote a 
258: distribution.} 
259: As mentioned in the Introduction, 
260: Wheatland (2000) found that the index $\gamma$ 
261: is independent of active region for a set of hard X-ray events, 
262: although the statistics underlying 
263: the study were somewhat poor. If $\gamma$ is the same in all active 
264: regions then the 
265: observations $s_1,s_2,...,s_M$ can be replaced by a larger set of 
266: events over many active regions. We return to this point in \S\,3.4, 
267: but for now admit the possibility that $\gamma$ is different in different 
268: active regions, and consider its estimation based on data for the given 
269: active region alone. 
270: 
271: Bai (1993) has shown how to estimate a power-law index for a set of 
272: data, using `maximum likelihood'. Following Bai, the likelihood 
273: function, that is the probability of the observed data 
274: $D=\{s_1,s_2,...,s_M\}$ given the model, is (assuming $\gamma>1$)
275: \begin{equation}\label{eq:gam_like}
276: {\rm prob}(D | \gamma, I )
277:   \propto \prod_{i=1}^{M}(\gamma-1)(s_i/S_1)^{-\gamma},
278: \end{equation}
279: where $I$ stands for all additional information, including knowledge of 
280: the phenomenological rule~(\ref{eq:pldist}). We note that this
281: expression requires $\gamma >1$, which follows from the requirement that
282: the probability distribution for size $S$ is normalized over all $S$ larger
283: than $S_1$. It is not necessary to introduce an upper cutoff for $S$ in 
284: the present treatment (provided $\gamma >1$), although an upper cutoff 
285: is necessary to ensure that the mean flare size is finite, if 
286: $\gamma<2$. We will return to this point in \S\,5.
287: 
288: Bayes's theorem may be used to convert the likelihood into the 
289: probability of the model given the data, which is what we are 
290: interested in: 
291: \begin{equation}\label{eq:p_gam_bayes}
292: {\rm prob}(\gamma | D,I)
293: \propto 
294:   {\rm prob}(D | \gamma,I)\times {\rm prob}(\gamma | I ),
295: \end{equation}
296: where ${\rm prob}(\gamma | I )$ is the `prior distribution' for
297: $\gamma$, i.e.\ the distribution we would assign to $\gamma$ in
298: the absence of the data (e.g.\ Sivia 1996). A choice needs to 
299: be made for this distribution, and a common choice is to assume 
300: a constant value within minimum and maximum values $\gamma_1$ and
301: $\gamma_2$ respectively:
302: \begin{equation}
303: {\rm prob} (\gamma | I) = \left\{
304:   \begin{array}{ll}
305:   (\gamma_2-\gamma_1)^{-1} & \mbox{if $\gamma_1\leq \gamma \leq
306: \gamma_2$}
307:   \\
308:   0 & \mbox{else,}
309: \end{array}
310: \right.
311: \end{equation}
312: which is referred to as a `uniform prior'.
313: We note that for a uniform prior the most likely value of 
314: $\gamma$ is the maximum of the likelihood function:
315: \begin{equation}\label{eq:gam_ML}
316: \gamma^{\ast}=\frac{M}{\sum_{i=1}^{M}\ln (s_i/S_1)}+1,
317: \end{equation}
318: which is the maximum likelihood estimate of $\gamma$ found by Bai. 
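For completeness, the estimate~(\ref{eq:gam_ML}) may be obtained as 
follows (a standard calculation, sketched here rather than quoted from Bai). 
The logarithm of the likelihood~(\ref{eq:gam_like}) is, up to an additive 
constant,
\begin{displaymath}
\ln {\rm prob}(D | \gamma, I)
  =M\ln (\gamma-1)-\gamma\sum_{i=1}^{M}\ln (s_i/S_1),
\end{displaymath}
and setting the derivative with respect to $\gamma$ to zero gives
\begin{displaymath}
\frac{M}{\gamma^{\ast}-1}=\sum_{i=1}^{M}\ln (s_i/S_1),
\end{displaymath}
which rearranges to Equation~(\ref{eq:gam_ML}). The second derivative, 
$-M/(\gamma-1)^2$, is negative, so the stationary point is a maximum. 
The estimate is the most likely value only if it falls within the range 
$\gamma_1\leq\gamma\leq\gamma_2$ of the prior.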
319: 
320: We can identify ${\rm prob} (\gamma | D,I)$ with 
321: $P_{\gamma}(\gamma)$, and then Equations~(\ref{eq:gam_like}) 
322: and~(\ref{eq:p_gam_bayes}) 
323: give the required `posterior distribution' for $\gamma$:
324: \begin{equation}\label{eq:prob_gam}
325: P_{\gamma}(\gamma)= C \frac{(\gamma-1)^{M}}{\pi^{\gamma}}\Gamma (\gamma),
326: \end{equation}
327: where
328: \begin{equation}
329: \pi=\prod_{i=1}^M\frac{s_i}{S_1},
330: \end{equation}
331: and where we have relabelled the prior distribution $\Gamma (\gamma)$.
332: The normalizing factor $C$ is determined by the requirement
333: $\int_{1}^{\infty}P_{\gamma}(\gamma)d\gamma=1$.\footnote{In the 
334: following all normalizing factors are labelled $C$, although they 
335: refer to different values. It is understood that in each case the 
336: value $C$ is to be determined by integration.} For a uniform prior
337: the integral may be performed, leading to
338: \begin{equation}
339: C=\frac{(\gamma_2-\gamma_1) \pi (\ln \pi )^{M+1}/M!}
340:   {P[M+1,(\gamma_2-1)\ln\pi ]
341:   - P[M+1, (\gamma_1-1)\ln \pi ]},
342: \end{equation}
343: where $P (a,x)$ denotes the incomplete Gamma function, normalized so that $P(a,\infty )=1$~\cite{abr&ste64}.
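The integral may be sketched as follows. With the uniform prior, 
normalization requires 
$C(\gamma_2-\gamma_1)^{-1}\int_{\gamma_1}^{\gamma_2}
(\gamma-1)^M\pi^{-\gamma}d\gamma=1$, where we assume $\gamma_1\geq 1$. 
Writing $\pi^{-\gamma}=\pi^{-1}e^{-(\gamma-1)\ln\pi}$ and substituting 
$u=(\gamma-1)\ln\pi$ (which assumes $\ln\pi>0$, i.e.\ at least one event 
with $s_i>S_1$) gives
\begin{displaymath}
\int_{\gamma_1}^{\gamma_2}(\gamma-1)^M\pi^{-\gamma}d\gamma
  =\frac{1}{\pi (\ln\pi )^{M+1}}
  \int_{(\gamma_1-1)\ln\pi}^{(\gamma_2-1)\ln\pi}u^Me^{-u}du.
\end{displaymath}
The remaining integral is $M!$ times 
$P[M+1,(\gamma_2-1)\ln\pi ]-P[M+1,(\gamma_1-1)\ln\pi ]$, since 
$P(a,x)$ is normalized by $(a-1)!$ for integer $a$, and solving for $C$ 
gives the expression above.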
344: 
345: Before proceeding we present a rough estimate of the uncertainty in 
346: the most likely value of $\gamma$ based on the distribution 
347: $P_{\gamma}(\gamma)$ with a uniform prior. 
348: Assuming Gaussian behavior in the vicinity of
349: the peak, the width of the distribution~(\ref{eq:prob_gam}) is 
350: $\sigma_{\gamma}\approx [L^{\prime\prime}(\gamma^{\ast})]^{-1/2}$, where 
351: $L(\gamma)=-\ln P_{\gamma}(\gamma)$, and where $\gamma^{\ast}$ is the
352: location of the peak of the distribution (Sivia 1996). This leads to
353: $\sigma_{\gamma}\approx M^{1/2}/\ln\pi$, and using 
354: Equation~(\ref{eq:gam_ML}) gives
355: \begin{equation}\label{eq:sig_gam}
356: \sigma_{\gamma}\approx (\gamma^{\ast}-1)M^{-1/2}.
357: \end{equation}
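The intermediate step is as follows. For a uniform prior, 
Equation~(\ref{eq:prob_gam}) gives 
$L(\gamma)=-M\ln (\gamma-1)+\gamma\ln\pi$ plus a constant, so that
\begin{displaymath}
L^{\prime\prime}(\gamma)=\frac{M}{(\gamma-1)^2},
\end{displaymath}
and evaluating this at $\gamma^{\ast}-1=M/\ln\pi$ gives 
$L^{\prime\prime}(\gamma^{\ast})=(\ln\pi )^2/M$. This assumes that the 
peak lies inside the range of the prior. As an illustrative example with 
assumed values, $\gamma^{\ast}=1.8$ and $M=57$ give 
$\sigma_{\gamma}\approx 0.8/\sqrt{57}\approx 0.11$, so that several tens 
of events determine $\gamma$ to within roughly 0.1.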
358: 
359: 
360: \subsection{Estimating $\lambda_1$}
361: 
362: Next we consider the calculation of $P_1(\lambda_1)$, the distribution 
363: of the rate $\lambda_1$ of flares larger than $S_1$.
364: This is a more difficult problem because the rate of flaring in an active 
365: region may vary with time~(see e.g.\ Wheatland 2001). However,
366: observations suggest that a piecewise-constant Poisson process
367: provides a good model for the way flares occur in time in 
368: individual active regions. 
369: 
370: We assume that a period of time of duration $T^{\prime}\leq T$ immediately 
371: prior to $t_{\rm end}$ is identified (i.e.\ from $t=t_{\rm end}-T^{\prime}$
372: to $t=t_{\rm end}$) during which time flare occurrence is consistent
373: with a constant-rate Poisson process, where $T=t_{\rm end}-t_{\rm sta}$ is the duration of the observing interval. 
374: 
375: One approach to identifying the necessary period of time has been 
376: presented by Scargle (1998), who showed how to select a piecewise-constant 
377: Poisson model to describe an observed sequence of events. When applied 
378: to a sequence of events at times $t_1< t_2< ... < t_M$ the Scargle method 
379: gives a sequence of times $t_{B0}< t_{B1}<...<t_{BK}$ 
380: at which the rate is determined to change 
381: (where $t_{B0}=t_{\rm sta}$ and $t_{BK}=t_{\rm end}$ are the start and
382: end of the observing period), and a corresponding sequence 
383: $\lambda_{B1},\lambda_{B2},...,\lambda_{BK}$ of rates. The sequence 
384: of times and rates is called a set of  `Bayesian blocks'. In this
385: case we identify $T^{\prime}$ with $t_{BK}-t_{B(K-1)}$.
386: We note that the original Bayesian blocks procedure [which was used 
387: e.g.\ by Wheatland (2001)] does not necessarily select the best 
388: piecewise-constant model. Recently Scargle has found a computationally 
389: feasible way to determine the optimal decomposition (Scargle, private 
390: communication, 2003). We begin by assuming this method (or another 
391: method) has been applied to the data, to determine the required period 
392: $T^{\prime}$ prior to the end of observations.
393: 
394: A probability distribution for the rate $\lambda_1$ may then be
395: determined as follows. We assume that $M^{\prime}\leq M$ events are observed 
396: during the selected period $T^{\prime}$. The probability of the observed 
397: data $D^{\prime}$ (strictly this comprises not just the number of events 
398: but also their times) given a Poisson model with rate $\lambda_1$ is 
399: \begin{equation}\label{eq:pdkmk}
400: {\rm prob} (D^{\prime}|\lambda_1,I)\propto \lambda_1^{M^{\prime}}
401:   e^{-\lambda_1T^{\prime}},
402: \end{equation}
403: where we retain only the dependence on $\lambda_1$ on the
404: right hand side of this equation, and where we formally recognise any 
405: additional information by the dependence on $I$.
406: Bayes's theorem may be used to turn this likelihood into a probability 
407: of the model given the data, and the additional information: 
408: \begin{equation}\label{eq:pmkdk}
409: {\rm prob}(\lambda_1|D^{\prime},I)\propto 
410:   {\rm prob}(D^{\prime}|\lambda_1,I)\times {\rm prob} (\lambda_1 | I),
411: \end{equation}
412: where ${\rm prob}(\lambda_1 | I)$ is the prior distribution for the rate. 
413: 
414: The prior distribution ${\rm prob} (\lambda_1 | I)$ represents the 
415: estimate of the rate of flaring for the active region in the absence 
416: of any data. This distribution allows the incorporation of any additional
417: information we have about the expected rate of flaring, not including 
418: the actual data. To make this concrete, we will consider the case that
419: the additional information is the McIntosh classification of the sunspots
420: associated with the active region, although we stress that any other
421: additional information can also be incorporated.
422: When the additional information is the McIntosh classification, 
423: a suitable prior distribution can be 
424: constructed from historical records of the observed rates of events 
425: above size $S_1$ for every active region of the same class. 
426: This is a generalization of the analysis underlying present flare 
427: prediction methods based on McIntosh classification, which considers 
428: only the mean flaring rate extracted from historical data. Hence we 
429: propose the construction of distributions of flaring rate for each 
430: McIntosh classification. We assume these are available, and label the 
431: appropriate distribution 
432: $\Lambda_{\rm MC} (\lambda_1)$, where MC denotes McIntosh 
433: classification. Equation~(\ref{eq:pmkdk}) then becomes
434: \begin{equation}\label{eq:prob_lam1}
435: P_1(\lambda_1)=C\lambda_1^{M^{\prime}} e^{-\lambda_1T^{\prime}}
436:   \Lambda_{\rm MC} (\lambda_1),
437: \end{equation}
438: where we have identified ${\rm prob}(\lambda_1|D^{\prime},I)$ with
439: $P_1(\lambda_1)$, and where $C$ is the normalization factor. This
440: is the required posterior distribution for $\lambda_1$.
441: 
442: It should be noted that the distribution~(\ref{eq:prob_lam1}) explicitly
443: uses only a subset of all flares observed in an active region, 
444: i.e.\ the $M^{\prime}\leq M$ flares observed during the interval 
445: $T^{\prime}\leq T$. Previous 
446: data contribute only to the determination of the interval $T^{\prime}$. The
447: motivation is that when the rate changes, the old rate is no
448: longer relevant for future prediction. For many active regions the
449: observed rate appears to be constant during a transit of the disk, or
450: at least no rate change is detectable (e.g.\ Wheatland 2001), in which
451: case all observed flares contribute explicitly to the inference.
452: 
453: Before proceeding we note two simple results for 
454: Equation~(\ref{eq:prob_lam1}) with a uniform prior.
455: First, it is easy to see that with a uniform prior the maximum of this 
456: distribution occurs at $M^{\prime}/T^{\prime}$.
457: Second we note the well known result that for large $\lambda_1T^{\prime}$ 
458: and neglecting the prior, Equation~(\ref{eq:prob_lam1}) 
459: approximates a Gaussian with a width 
460: \begin{equation}\label{eq:sig_lam}
461: \sigma_1\approx \frac{(M^{\prime})^{1/2}}{T^{\prime}},
462: \end{equation} 
463: which is consistent with the arguments at the end of \S\,2. 
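Both results follow from a short calculation, sketched here. With a 
uniform prior, $L(\lambda_1)=-\ln P_1(\lambda_1)
=-M^{\prime}\ln\lambda_1+\lambda_1T^{\prime}$ plus a constant, so that
\begin{displaymath}
L^{\prime}(\lambda_1)=-\frac{M^{\prime}}{\lambda_1}+T^{\prime},
\qquad
L^{\prime\prime}(\lambda_1)=\frac{M^{\prime}}{\lambda_1^2}.
\end{displaymath}
The first derivative vanishes at 
$\lambda_1^{\ast}=M^{\prime}/T^{\prime}$, where 
$L^{\prime\prime}(\lambda_1^{\ast})=(T^{\prime})^2/M^{\prime}$, and the 
Gaussian approximation 
$\sigma_1\approx [L^{\prime\prime}(\lambda_1^{\ast})]^{-1/2}$ reproduces 
Equation~(\ref{eq:sig_lam}). If the uniform prior is idealized as 
$\Lambda_{\rm MC}=1$ for all $\lambda_1\geq 0$, the normalization factor 
in Equation~(\ref{eq:prob_lam1}) is 
$C=(T^{\prime})^{M^{\prime}+1}/M^{\prime}!$. 
For example, the assumed values $M^{\prime}=25$ events in 
$T^{\prime}=5$ days give $\lambda_1^{\ast}=5$ per day and 
$\sigma_1\approx 1$ per day.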
464: 
465: \subsection{Estimating $\epsilon$}
466: 
467: The probability distribution $P_2(\lambda_2)$ for the rate $\lambda_2$
468: of flares larger than $S_2$ may be constructed from the distributions 
469: $P_1(\lambda_1)$ and $P_{\gamma}(\gamma)$ using 
470: Equation~(\ref{eq:rate_big}). Specifically we have 
471: $\lambda_2=\lambda_1(S_1/S_2)^{\gamma-1}$, 
472: and hence
473: \begin{equation}
474: P_2(\lambda_2)= 
475:   \int_1^{\infty}d\gamma \int_0^{\infty} d\lambda_1 P_1(\lambda_1)
476:   P_{\gamma}(\gamma)\delta
477:   \left[ \lambda_2-\lambda_1(S_1/S_2)^{\gamma-1}\right],
478: \end{equation} 
479: and performing the integral over $\lambda_1$ leads to
480: \begin{equation}\label{eq:P2}
481: P_2(\lambda_2) = 
482:   \int_1^{\infty} 
483:   d\gamma P_{\gamma}(\gamma)\left(\frac{S_2}{S_1}\right)^{\gamma-1} 
484:   P_1\left[\lambda_2 \left(\frac{S_2}{S_1}\right)^{\gamma-1} \right].
485: \end{equation} 
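The integral over $\lambda_1$ uses the scaling property of the delta 
function. Writing $a=(S_1/S_2)^{\gamma-1}>0$ (a shorthand introduced only 
for this remark), we have
\begin{displaymath}
\delta (\lambda_2-a\lambda_1)
  =\frac{1}{a}\,\delta\left(\lambda_1-\frac{\lambda_2}{a}\right),
\end{displaymath}
so that $\int_0^{\infty}d\lambda_1P_1(\lambda_1)
\delta (\lambda_2-a\lambda_1)=a^{-1}P_1(\lambda_2/a)$ for $\lambda_2>0$. 
Multiplied by $P_{\gamma}(\gamma)$, this is the integrand of 
Equation~(\ref{eq:P2}), with $a^{-1}=(S_2/S_1)^{\gamma-1}$.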
486: 
487: 
488: The quantity we are interested in is $\epsilon$, the probability of
489: an event bigger than $S_2$ occurring in an interval $\Delta T$. 
490: The probability distribution $P_{\epsilon}(\epsilon)$ for this
491: quantity may be constructed from the distribution for $\lambda_2$ by 
492: a change of variable. 
493: Specifically, from Equation~(\ref{eq:prob_big}) we have 
494: $\lambda_2=-\ln (1-\epsilon)/\Delta T$, and hence  
495: \begin{eqnarray}\label{eq:prob_pbig}
496: P_{\epsilon}(\epsilon)&=&P_2\left[\lambda_2(\epsilon )\right]
497: \left|\frac{d\lambda_2}{d\epsilon }\right| \nonumber \\
498: &=& P_2\left[-\frac{\ln (1-\epsilon )}{\Delta T} \right]
499:   \frac{1}{\Delta T (1-\epsilon ) }.
500: \end{eqnarray}
501: Using Equations~(\ref{eq:prob_gam}), (\ref{eq:prob_lam1}), and
502: (\ref{eq:P2}) in~(\ref{eq:prob_pbig}) leads to
503: \begin{equation}\label{eq:pbig_general}
504: P_{\epsilon}(\epsilon )=
505:   \int_1^{\infty} d\gamma \, f(\epsilon,\gamma),
506: \end{equation}
507: where
508: \begin{eqnarray}\label{eq:fjoint}
509: f(\epsilon,\gamma)&=&C\left[-\ln (1-\epsilon )\right]^{M^{\prime}} 
510:   (\gamma-1)^M\Gamma (\gamma ) 
511:   \left[\frac{(S_2/S_1)^{M^{\prime}+1}}{\pi}\right]^{\gamma}
512:   \nonumber \\ 
513:   &\times& (1-\epsilon )^{\left(T^{\prime}/\Delta T\right)
514:     \left(S_2/S_1\right)^{\gamma-1}-1}
515:   \Lambda_{\rm MC} \left[-\frac{\ln (1-\epsilon )}{\Delta T} 
516:   \left(\frac{S_2}{S_1}\right)^{\gamma-1} \right]
517: \end{eqnarray}
518: is the joint probability 
519: distribution for $\epsilon$ and $\gamma$. The normalization factor
520: $C$ is obtained by requiring that 
521: $\int_{0}^{1}P_{\epsilon}(\epsilon)d\epsilon=1$. We note that
522: $P_{\gamma}(\gamma)$ and $P_{\epsilon}(\epsilon)$ may be considered
523: to be marginal distributions of $f(\epsilon,\gamma)$ (i.e.\ they are
524: obtained by integration over $\epsilon$ and $\gamma$ respectively). 
525: However, Equation~(\ref{eq:prob_gam}) gives the distribution for
526: $\gamma$ directly.
527: 
528: As noted in \S\,3.2, observations suggest that $\gamma$ is the same
529: in all active regions, in which case the index can be determined very 
530: accurately from events over many active regions using 
531: Equation~(\ref{eq:gam_ML}). If the estimate is $\gamma^{\ast}$, 
532: then we can consider the prior distribution for $\gamma$ to be
533: $\Gamma (\gamma) = \delta (\gamma-\gamma^{\ast})$, and 
534: Equation~(\ref{eq:pbig_general}) simplifies to 
535: \begin{equation}\label{eq:pbig_simp}
536: P_{\epsilon}(\epsilon ) =
537:   C\left[-\ln (1-\epsilon ) \right]^{M^{\prime}} 
538:   (1-\epsilon )^{\left( T^{\prime}/\Delta T\right)
539:   \left(S_2/S_1\right)^{\gamma^{\ast}-1}-1} 
540:   \Lambda_{\rm MC} \left[-\frac{\ln (1-\epsilon )}{\Delta T} 
541:   \left(\frac{S_2}{S_1}\right)^{\gamma^{\ast}-1} \right].
542: \end{equation}
543: 
544: Equations~(\ref{eq:pbig_general}), (\ref{eq:fjoint})
545: and~(\ref{eq:pbig_simp}) are the required expressions for the posterior 
546: probability distribution for $\epsilon$.
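As a check on Equation~(\ref{eq:pbig_simp}), consider the special case of 
a uniform prior for $\lambda_1$ extending over all $\lambda_1\geq 0$ (an 
idealization adopted here for illustration). Writing 
$k=(T^{\prime}/\Delta T)(S_2/S_1)^{\gamma^{\ast}-1}$, a shorthand 
introduced only for this remark, the distribution becomes 
$P_{\epsilon}(\epsilon)=C[-\ln (1-\epsilon )]^{M^{\prime}}
(1-\epsilon )^{k-1}$. The substitution $u=-\ln (1-\epsilon )$ gives
\begin{displaymath}
\int_0^1[-\ln (1-\epsilon )]^{M^{\prime}}(1-\epsilon )^{k-1}d\epsilon
  =\int_0^{\infty}u^{M^{\prime}}e^{-ku}du
  =\frac{M^{\prime}!}{k^{M^{\prime}+1}},
\end{displaymath}
so that $C=k^{M^{\prime}+1}/M^{\prime}!$. Setting the derivative of 
$\ln P_{\epsilon}(\epsilon)$ to zero locates the peak, for $k>1$, at
\begin{displaymath}
\epsilon^{\ast}=1-\exp\left(-\frac{M^{\prime}}{k-1}\right),
\end{displaymath}
which for $k\gg 1$ is close to the value obtained by inserting 
$\lambda_1=M^{\prime}/T^{\prime}$ into Equations~(\ref{eq:rate_big}) 
and~(\ref{eq:prob_big}). For example, with the assumed values 
$M^{\prime}=25$, $T^{\prime}=5$ days, $\Delta T=1$ day, $S_2/S_1=100$ 
and $\gamma^{\ast}=1.8$, we have $k\approx 199$ and 
$\epsilon^{\ast}\approx 0.12$.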
547: 
548: \section{Simulations}
549: 
550: We present two simulations demonstrating the application of the
551: method to synthetic data. These simulations omit the inclusion of 
552: other information via the prior 
553: $\Lambda_{\rm MC} (\lambda_1)$, so they illustrate only how the
554: method performs using the observed data. 
555: 
556: First we consider the case that $\gamma$ is assumed to be known.
557: Ten days of flaring were simulated by producing a sequence of
558: event times as a Poisson process in time with a rate $\lambda_1=0.5$
559: per day for the first five days, and with a rate $\lambda_1=5.0$
560: per day for the second five days. Each event was assigned a size 
561: according to a power law distribution with an index $\gamma=1.8$, 
562: above the threshold size $S_1=1$ (in arbitrary units). Figure~1
563: illustrates a typical simulation. The first (upper) panel shows the 
564: size of each event versus the time at which the event occurred.
565: In this case there were 31 events. The simulation applies the method
566: to the problem of predicting the probability of a big event occurring 
567: during the next day ($\Delta T=1$ day) at the end of the ten days. 
568: The size of a big event was taken to be 
569: $S_2=100$. The original Bayesian blocks procedure~(Scargle 1998) was 
570: applied to the event time series to determine a decomposition into a 
571: sequence of piecewise-constant intervals and rates. The second panel 
572: of Figure~1 shows the result of this process: 
573: the solid lines indicate the rate as a function of time 
574: inferred by the Bayesian blocks procedure, and the dotted lines indicate 
575: the true rate versus time. The Bayesian blocks procedure correctly
576: identifies a two-rate model as the most likely model, and identifies 
577: the approximate time of the change in rate. The third panel shows the
578: probability distribution $P_{\epsilon}(\epsilon)$ obtained from 
579: Equation~(\ref{eq:pbig_simp}) with a uniform prior for $\lambda_1$, 
580: and with $M^{\prime}$ and $T^{\prime}$ equal to the number of events in the 
581: second Bayesian block and the duration of the second Bayesian block 
582: respectively. The dotted vertical line in this panel is the true value
583: of $\epsilon$. 
584: We see that, even for a relatively small number of events, the method is 
585: able to provide a good estimate of the probability of a big event. The
586: width of the inferred distribution for $\epsilon$ is consistent with
587: Equation~(\ref{eq:unc}).
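
The inputs to this example can be sketched explicitly. The recipe below
is one standard way of generating such a sequence and is an assumption
made for illustration, not necessarily the procedure used for Figure~1.
With $u$ and $v$ independent uniform deviates on $(0,1)$ (symbols
introduced here), waiting times between events at a constant rate
$\lambda_1$, and event sizes above $S_1$, may be drawn as
\begin{displaymath}
\tau = -\frac{\ln u}{\lambda_1}, \qquad
S = S_1\, v^{-1/(\gamma-1)} ,
\end{displaymath}
with each five-day interval generated separately at its own rate.
For the rates used here the expected number of events is
$0.5\times 5+5.0\times 5=27.5$, with a Poisson standard deviation of
$\sqrt{27.5}\approx 5.2$, so a realization with 31 events is typical.
Assuming that Equation~(\ref{eq:rate_big}) has the form
$\lambda_2=\lambda_1(S_1/S_2)^{\gamma-1}$, and that big events are
Poisson distributed in time, the true value of $\epsilon$ follows from
the rate during the second five days:
\begin{displaymath}
\lambda_2 = 5.0\times 100^{-0.8}\approx 0.126~{\rm day}^{-1}, \qquad
\epsilon = 1-\exp(-\lambda_2\Delta T)\approx 0.12 .
\end{displaymath}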
588: 
589: \begin{figure}
590: \epsscale{0.7}
591: \plotone{f1.eps}
592: \caption[f1.eps]{Simulation of 10 days of flaring and application of 
593: the prediction method, assuming $\gamma$ is known.}
594: \end{figure}
595: 
Second, we consider the more difficult case of simultaneously
597: estimating $\gamma$ and $\lambda_1$. Ten days of flaring were again 
598: simulated, with a rate $\lambda_1=1$ per day for the first five days, 
599: and a rate $\lambda_1=10$ per day for the second five days. Larger 
600: rates were chosen to provide more events for the inference, but the 
601: other parameters were kept the same as in the first simulation. 
602: Figure~2 illustrates the results of a typical simulation. The 
603: first (upper) panel shows the time history of events --- in this case
604: 57 events occurred. The second panel shows the result of a Bayesian
605: blocks decomposition of the data (solid lines) together with the
606: true rate versus time (dotted lines). Once again the Bayesian blocks
607: procedure correctly identifies a two-rate model as the most likely
608: model, and identifies the approximate time of the change in rate.
The third panel shows the result of using Equation~(\ref{eq:prob_gam})
610: --- with a uniform prior with $\gamma_1=1.25$ and $\gamma_2=2.25$ ---
611: to construct the distribution for $\gamma$. The dotted vertical line in this 
612: panel shows the true value of $\gamma$.
613: The fourth panel of Figure~2 shows the distribution for $\epsilon$
614: constructed using Equation~(\ref{eq:pbig_general}), with 
615: $M=57$, with $M^{\prime}$ and $T^{\prime}$ obtained from the second 
616: Bayesian block, and with uniform prior distributions for $\gamma$ and 
617: $\lambda_1$. 
The dotted vertical line indicates the true value. From this simulation 
we see that a reasonable estimate for $\epsilon$ is obtained even for a 
relatively small number of events. 
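
As a rough check on the numbers in this example, the expected event
count and the true value of $\epsilon$ can be worked out from the
simulation parameters. This sketch assumes that
Equation~(\ref{eq:rate_big}) has the form
$\lambda_2=\lambda_1(S_1/S_2)^{\gamma-1}$ and that big events are
Poisson distributed in time. The expected number of events over the
ten days is $1\times 5+10\times 5=55$, with a Poisson standard
deviation of $\sqrt{55}\approx 7.4$, so a realization with 57 events
is typical. Of these, only about $55\times 100^{-0.8}\approx 1.4$ are
expected to exceed $S_2=100$. Using the rate during the second five
days,
\begin{displaymath}
\lambda_2 = 10\times 100^{-0.8}\approx 0.251~{\rm day}^{-1}, \qquad
\epsilon = 1-\exp(-\lambda_2\Delta T)\approx 0.22 .
\end{displaymath}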
621: 
622: \begin{figure}
623: \epsscale{0.7}
624: \plotone{f2.eps}
625: \caption[f2.eps]{Simulation of 10 days of flaring and application of the
626: prediction method, assuming $\gamma$ is unknown.}
627: \end{figure}
628: 
The distribution for $\epsilon$ shown in the fourth (lower) panel of
Figure~2 is quite broad.
631: A basic reason is that $\epsilon$ depends sensitively on $\gamma$ 
632: because of its appearance as an exponent in
633: Equation~(\ref{eq:rate_big}), and $\gamma$ has a range of possible
634: values, as shown in the third panel of Figure~2.
635: This effect may be seen by considering
636: $f(\epsilon,\gamma)$ [defined by Equation~(\ref{eq:fjoint})],
637: which is the joint distribution of $\epsilon$ and $\gamma$. Figure~3
638: shows a contour plot of $f(\epsilon,\gamma)$ for the simulation depicted
639: in Figure~2. The dotted vertical and horizontal lines are the true values
640: of $\epsilon$ and $\gamma$ respectively.
641: The dashed curve is defined by 
642: $\epsilon=1-\exp[-(M^{\prime}/T^{\prime})(S_1/S_2)^{\gamma-1}\Delta T ]$, 
643: and the contours of $f(\epsilon,\gamma)$ are observed to be stretched
644: out along this curve. The practical implication of this figure is that
645: accurate estimation of $\epsilon$ depends on accurate estimation
646: of $\gamma$. In practice $\gamma$ is known a priori quite accurately,
647: but in this simulation we have assumed that $\gamma$ is initially unknown
648: (within the range 1.25 to 2.25), to illustrate the process of
649: inference. 
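
The sensitivity can be quantified with a short worked example. This is
an illustrative sketch which takes Equation~(\ref{eq:rate_big}) to have
the form $\lambda_2=\lambda_1(S_1/S_2)^{\gamma-1}$, and which uses the
true rate $\lambda_1=10$ per day in place of $M^{\prime}/T^{\prime}$.
Then
\begin{displaymath}
\frac{\partial \ln\lambda_2}{\partial\gamma}
= -\ln\left(\frac{S_2}{S_1}\right) = -\ln 100 \approx -4.6,
\end{displaymath}
so that a change of $0.1$ in $\gamma$ changes $\lambda_2$ by a factor
$100^{0.1}\approx 1.6$. For $\Delta T=1$ day the corresponding values
of $\epsilon=1-\exp(-\lambda_2\Delta T)$ are
\begin{displaymath}
\epsilon(\gamma=1.7)\approx 0.33, \qquad
\epsilon(\gamma=1.8)\approx 0.22, \qquad
\epsilon(\gamma=1.9)\approx 0.15 ,
\end{displaymath}
i.e.\ an uncertainty of $\pm 0.1$ in $\gamma$ alone changes $\epsilon$
by roughly a factor of two across this range.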
650:  
651: \begin{figure}
652: \epsscale{0.7}
653: \plotone{f3.eps}
654: \caption[f3.eps]{Contour map of the joint probability of $\epsilon$ and
655: $\gamma$, for the simulation in Fig.~2.}
656: \end{figure}
657: 
658: \section{Discussion}
659: 
660: Existing methods of solar flare prediction do not make complete
661: use of an important source of information: the time history of flares
662: already observed in the active region of interest, in particular 
663: frequently occurring small events.
664: A new method for flare prediction is presented which exploits the 
665: observed history of flaring from an active region to improve an initial
prediction, which may come e.g.\ from one of the existing methods. 
To make the example concrete we may think of the initial prediction as
coming from the McIntosh sunspot classification, which is a common 
669: basis for prediction. This background information provides an initial 
670: estimate for the expected flaring rate through a prior distribution 
671: $\Lambda_{\rm MC}(\lambda_1)$, which represents the probability that
672: the flaring rate above a (small) size $S_1$ is $\lambda_1$, given
673: historical rates of occurrence of flares for the given McIntosh 
674: class. Bayes's theorem is then used to estimate the probability 
675: $\epsilon$ of observing a large flare (above size $S_2$) in a given 
676: period of time, based on this prior information and on the sequence of 
677: flares already produced by the active region, and assuming simple
678: phenomenological rules describing the occurrence of flares. 
679: In this paper the basic theory behind the inference of $\epsilon$ 
680: based on observed data is presented. The inclusion of background
681: information [i.e.\ the construction of the priors 
682: $\Lambda_{\rm MC}(\lambda_1)$] is yet to be done.
683: 
684: The method relies on event sizes following the phenomenological 
685: law~(\ref{eq:pldist}). Some studies of very small extreme 
686: ultraviolet events (`nanoflares') suggest that their thermal energies 
687: follow a steeper distribution than energies of large events 
688: (e.g.\ Krucker and Benz 1998; Parnell and Jupp 2000), although this 
689: remains controversial (e.g.\ Aschwanden and Parnell 2002). 
690: From the point of view of the prediction method presented here,
691: the uncertainty over the low-size end of the distribution is irrelevant 
692: provided events significantly larger than nanoflares are used.
693: In any case the observed distributions from many active 
694: regions may be examined as a check on Equation~(\ref{eq:pldist}).
695: A related point is that the distribution~(\ref{eq:pldist}) requires 
696: a cutoff at large sizes on energetics grounds, and neglect of this
697: cutoff will lead to the number of large flares being overestimated. 
698: A cutoff will be incorporated before the method is applied to real data.
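
The size of this effect can be illustrated with a sharp upper cutoff at
a size $S_c$. Both the sharp form of the cutoff and the symbol $S_c$
are assumptions made here for illustration; the form to be adopted for
real data is not specified in this paper. For a distribution
proportional to $S^{-\gamma}$ between $S_1$ and $S_c$, the fraction of
events above $S_2$ is
\begin{displaymath}
\frac{S_2^{1-\gamma}-S_c^{1-\gamma}}{S_1^{1-\gamma}-S_c^{1-\gamma}},
\end{displaymath}
which reduces to $(S_1/S_2)^{\gamma-1}$ as $S_c\rightarrow\infty$.
For example, with $\gamma=1.8$, $S_1=1$, $S_2=100$ and $S_c=1000$ the
fraction is approximately $0.021$, compared with $0.025$ without a
cutoff, a reduction of roughly 15\%.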
699: 
700: The choice of the quantity $S$ has not been addressed, although a good
701: choice is likely to be important to the method. Most flare forecasting
702: deals with soft X-ray events, in particular prediction of GOES
(Geostationary Operational Environmental Satellite) M and X class 
events (events with peak fluxes greater than $10^{-5}$~W~m$^{-2}$
and $10^{-4}$~W~m$^{-2}$ respectively in the 1--8~\AA\ band observed by
706: the satellites). A practical motivation for this is that flare
707: soft X-ray emission causes disturbances of the ionosphere which affect 
708: shortwave radio communication, and there is a need to predict these
709: occurrences. A disadvantage of using GOES events is that they are not 
ideal for flare statistics, e.g.\ because of problems with event selection 
due to the large background in soft X-rays (see Wheatland 2001). 
712: 
713: A number of other issues also need to be considered before the method is
714: implemented with real data. A point neglected so far is that active regions 
715: evolve, so that predictions based on the traditional methods also 
716: change with time. For example, an active region evolves through McIntosh 
717: classifications (e.g.\ Bornmann, Kalmbach, Kulhanek, and Casale 1990). 
718: Changes in background information such as this should be incorporated 
719: through changes in the prior, and this question will be considered in more 
720: detail in future work. A related point concerns the construction of the 
721: prior distributions for rate. It is likely that the McIntosh classification 
722: will be used, although other possibilities will be considered. The
723: problem is then to determine the probability of a given McIntosh class 
724: having a given rate, based on observed flaring sequences in the
725: historical record for active regions of that class.
726: The details of this calculation will be addressed in future work. 
727:  
728: Finally, as with all methods of forecasting, it is essential to 
729: test the reliability of the method. It is straightforward to compare, 
730: after the fact, the number of predicted and the number of observed 
731: events for a large sample of active regions. The method presented here 
732: will be implemented and tested in this way, and the results compared 
733: with existing methods of prediction.
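
One simple form of this comparison is sketched below. The notation and
the assumption of statistically independent active regions are
introduced here for illustration only. If region $i$ (of $N$) is
assigned a probability $\epsilon_i$ of producing at least one big event
in the interval $\Delta T$, the predicted number of regions producing
a big event, and its variance, are
\begin{displaymath}
\langle n\rangle=\sum_{i=1}^{N}\epsilon_i, \qquad
\sigma_n^2=\sum_{i=1}^{N}\epsilon_i(1-\epsilon_i),
\end{displaymath}
and the observed number of such regions may be compared with
$\langle n\rangle$ in units of $\sigma_n$.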
734: 
735: \section*{Acknowledgements}
736: 
737: M.S.W. acknowledges the support of an Australian Research Council
738: QEII Fellowship, and thanks Richard Thompson and Garth Patterson
739: of the Ionospheric Prediction Service for useful discussions. The
740: comments of an anonymous referee have also helped to improve the 
741: paper.
742:  
743: \begin{thebibliography}{}
744: \small
745: %
746: \bibitem[Abramowitz and Stegun 1964]{abr&ste64} 
747:   Abramowitz, M., \& Stegun, I.A. 1964, Handbook of Mathematical 
748:   Functions, National Bureau of Standards, Applied Mathematics Series 
749:   volume 55.
750: %
751: \bibitem[Aschwanden and Parnell 2002]{asc&par02} 
752:   Aschwanden, M.J., \& Parnell, C.E. 2002, \apj ~572, 1048.
753: %
754: \bibitem[Bai 1993]{bai93} Bai, T. 1993, \apj ~404, 805.
755: %
756: \bibitem[Bornmann, Kalmbach, Kulhanek, \& Casale 1990]{bor&90}
  Bornmann, P.L., Kalmbach, D., Kulhanek, D., \& Casale, A. 
758:   1990, in 
759:   Solar-Terrestrial Predictions: Proceedings of a Workshop in 
760:   Leura, Australia, October 16-20, 1989, Volume 1, eds.\ R.J. Thompson,
761:   D.G. Cole, P.J. Wilkinson, M.A. Shea, D. Smart, \& G. Heckman,
762:   (NOAA Environmental Research Laboratories: Boulder, Colorado), 
763:   301.
764: %
765: \bibitem[Bornmann and Shaw 1994]{bor&sha94} 
766:   Bornmann, P.L., \&  Shaw, D. 1994, Sol.\ Phys.\ 150, 127.
767: %
768: \bibitem[Crosby, Aschwanden and Dennis 1993]{cro&93} 
769:   Crosby, N.B., Aschwanden, M.J., \& Dennis, B.R. 1993,
770:   Sol.\ Phys.\ 143, 275.
771: %
772: \bibitem[Gallagher, Moon, and Wang 2002]{gal&02}
773:   Gallagher, P.T., Moon, Y.-J., \& Wang, H. 2002, Sol.\ Phys.\ 
774:   209, 171.
775: % 
776: \bibitem[Jaynes 2003]{jay03}
777:   Jaynes, E.T. 2003, Probability Theory: The Logic of Science (Cambridge
778:   University Press: Cambridge). 
779: %
780: \bibitem[Krucker and Benz 1998]{kru&ben98}
781:   Krucker, S., \& Benz, A.O. 1998, \apj ~501, L213.
782: %
783: \bibitem[McIntosh 1990]{mci90}
784:   McIntosh, P.S. 1990, Sol.\ Phys.\ 125, 251.
785: %
786: \bibitem[Moon et al.\ 2001]{moo&01}
787:   Moon, Y.-J., Choe, G.S., Yun, H.S., \& Park, Y.D. 2001, 
788:   \jgr ~106, 29951.
789: % 
790: \bibitem[Neidig, Wiborg and Seagraves 1990]{nei&90}
  Neidig, D.F., Wiborg, P.H., \& Seagraves, P.H. 1990, in 
792:   Solar-Terrestrial Predictions: Proceedings of a Workshop in 
793:   Leura, Australia, October 16-20, 1989, Volume 1, eds.\ R.J. Thompson,
794:   D.G. Cole, P.J. Wilkinson, M.A. Shea, D. Smart, \& G. Heckman,
795:   (NOAA Environmental Research Laboratories: Boulder, Colorado), 
796:   541.
797: %
798: \bibitem[Parnell and Jupp 2000]{par&jup00}
799:   Parnell, C.E., \& Jupp, P.E. 2000, \apj ~529, 554.
800: %
801: \bibitem[Scargle 1998]{sca98} Scargle, J.D. 1998, \apj ~504, 405.
802: %
803: \bibitem[Sivia 1996]{siv96} Sivia, D.S. 1996, Data Analysis: A 
804:   Bayesian Tutorial, (Clarendon Press: Oxford).
805: %
806: \bibitem[Wheatland 2000]{whe00} Wheatland, M.S. 2000, \apj ~532, 1209. 
807: %
808: \bibitem[Wheatland 2001]{whe01} Wheatland, M.S. 2001, Sol.\ Phys.\
809:   203, 87. 
810: %
811: \end{thebibliography}
812: 
813: \end{document}
814: 
815: