1: % Flare prediction paper, written February 2004.
2: % Revised following referee's report, March 2004.
3:
4: \documentclass[12pt,preprint,a4]{aastex}
5:
6: \slugcomment{To appear in the Astrophysical Journal}
7:
8: \begin{document}
9: \title{A Bayesian Approach to Solar Flare Prediction}
10: \author{M. S. Wheatland}
11: \affil{School of Physics, University of Sydney, NSW 2006, Australia}
12: \email{m.wheatland@physics.usyd.edu.au}
13:
14: \begin{abstract}
15: A number of methods of flare prediction rely on classification of
16: physical characteristics of an active region, in particular optical
17: classification of sunspots, and historical rates of flaring for a
given classification. However, these methods largely ignore the number
of flares the active region has already produced, in particular the
number of small events. The past history of occurrence of
flares (of all sizes) is an important indicator of future flare
production. We present a Bayesian approach to flare prediction,
23: which uses the flaring record of an active region together with
24: phenomenological rules of flare statistics to refine an initial
25: prediction for the occurrence of a big flare during a subsequent
26: period of time. The initial prediction is assumed to come from one
27: of the extant methods of flare prediction. The theory of the method
28: is outlined, and simulations are presented to show how the refinement
29: step of the method works in practice.
30: \end{abstract}
31:
32: \keywords{Sun: activity --- Sun: flares --- Sun: X-rays ---
33: methods: statistical}
34:
35: \section{Introduction}
36:
37: Solar flares influence local `space weather,' and as a result there is
38: a demand for accurate flare prediction. Unfortunately no reliable
39: deterministic method of predicting a flare is known, and existing methods
40: are probabilistic in nature.
41:
42: A number of methods discussed in the literature are based on a commonly
43: used white-light classification of sunspots, and the correlation
44: between classification and flare occurrence. The McIntosh classification
45: (McIntosh 1990) categorizes a group of sunspots into one of 60 classes,
46: based on three parameters. Historical flare rates for each of the
47: classifications were used by McIntosh (1990) as the basis of an
48: `expert system' for flare prediction. The system, called Theophrastus
49: (the associated code is called THEO), also incorporates additional
50: information including dynamical properties
51: of spot growth, rotation and shear, magnetic topology inferred from
52: sunspot structure, magnetic classification, and previous flare activity.
53: The method is apparently somewhat subjective, involving rules of thumb
54: incorporated by a human expert. A second approach using the McIntosh
55: classification was presented by Bornmann and Shaw (1994). In this case
56: multiple linear regression was used to determine the effective contribution
57: of each of the McIntosh parameters to the rate of flaring, based on historical
58: records of flaring. Codes based on the methods of McIntosh (1990) and
59: Bornmann and Shaw (1994) are used by the Ionospheric Prediction
60: Service (IPS) of Australia to issue flare predictions at their
61: Learmonth and Culgoora observatories.\footnote{See http://www.ips.gov.au.}
62: Recently Gallagher, Moon and Wang (2002) implemented a system
63: using historical averages of flare numbers for McIntosh classifications
64: to predict a rate for an active region, and then converted this to
65: a probability of flaring in a day using the assumption of Poisson
66: statistics. This prediction is given as part of the Big Bear Solar
67: Observatory Active Region Monitor (ARM).\footnote{See
68: http://beauty.nascom.nasa.gov/arm/latest/.} Finally the US National
69: Oceanic and Atmospheric Administration (NOAA) issues flare
70: probability forecasts for active regions
71: which include input from THEO.\footnote{See
72: http://www.sec.noaa.gov/ftpdir/latest/daypre.txt.}
73:
74: A shortcoming of methods relying on correlations of flaring with
75: active region classification based on historical records is that they
76: ignore the important information of how many flares the active region
77: of interest has already produced. The system of McIntosh (1990)
78: incorporates information about previous activity, but it is unclear
79: how objectively this is done, and the information is limited to
80: the number of large flares already produced by the given active region.
81: In the flare prediction
82: literature, the tendency of a region which has produced large flares in
83: the past to produce large flares in the future is called persistence,
84: which is recognised as one of the most reliable predictors for large
85: flare occurrence in 24-hour forecasts (e.g.\ Neidig, Weiborg, \&
86: Seagraves 1989). In this paper we argue that the history of occurrence of
87: all flares (large and small) observed in a given active region is an
88: important indicator as to how the region will flare in the future, and
89: should be used in any prediction. A related criticism of methods based
90: on classification and historical records is that a given classification
91: may embrace active regions with a variety of flaring rates. If an
92: active region has a flaring rate differing from the average historical
93: rate for its class then the predictions will be in error.
94:
95: Studies of solar flare statistics provide simple phenomenological
96: rules describing flare occurrence. It is well known that flares follow
97: a power-law size distribution, where by size we mean e.g.\ peak flux
in soft X-rays. More
99: formally the flare frequency-size distribution $N(S)$ (i.e.\ the number
100: of events per unit size $S$ and per unit time) may be written
101: \begin{equation}\label{eq:pldist}
102: N(S)=AS^{-\gamma}
103: \end{equation}
104: where $A$ and $\gamma$ are constants. The exact power-law
105: index $\gamma$ depends on the choice of the quantity $S$, but typically
106: it is found to be in the range 1.5 to 2 (e.g.\ Crosby, Aschwanden,
\& Dennis 1992). The power-law index $\gamma$ appears to be the same
in different active regions (Wheatland 2000), although there is some
evidence that it varies with the solar cycle (Bai 1993). A second
110: simple rule concerns the way flares occur in time. Studies of the
111: rate of occurrence of soft X-ray flares in individual active regions
112: suggest that events occur as a Poisson process in time (e.g.\ Moon et
113: al.\ 2001), although many active regions exhibit changes in the
114: mean rate of events (Wheatland 2001).
115:
116: In this paper we show how the observed record of flaring in an active
117: region may be used together with the phenomenological rules of
118: flare statistics to objectively refine an initial flare prediction.
119: The initial prediction may be based on the McIntosh classification, or
120: may come from any other prediction method which does not consider the
121: flare data. The new method
122: is envisaged to work as follows. When an active region appears at the
123: east limb of the Sun, the best guess as to its future flare productivity
124: comes from one of the conventional prediction methods. However, as the
125: active region produces flares, the observed flare statistics are used to
126: adjust the prediction for future flaring. After many flares have been
127: observed, the prediction for future flaring may be dominated by the
128: contribution from the observed data. This process --- refining a
129: probability estimate based on new data --- is naturally performed using
130: Bayes's theorem (e.g.\ Sivia 1996; Jaynes 2003).
131:
132: The layout of the paper is as follows. In \S\,2 a simple approach
133: to flare prediction using only the past record of flaring from an active
134: region [previously presented in Wheatland (2001)] is reiterated.
135: In \S\,3 the new method of prediction,
136: combining existing methods and information from observed flare statistics,
137: is described.
138: In \S\,4 simulations are presented showing how the method uses the
139: observed flaring record, and in \S\,5 the results are discussed.
140:
141: \section{Wheatland (2001)}
142:
143: Wheatland (2001) presented a method for flare prediction using
144: only observed flare statistics and the assumptions that flares obey
145: Poisson statistics in time, and power-law statistics in size,
146: elaborating on a suggestion by Moon et al.\ (2001).
147: The approach is briefly reiterated here, since it is part of the new
148: method.
149:
150: First assume that there is a threshold size $S_1$ above which
151: all events occurring in an active region are observed, so that the
152: distribution~(\ref{eq:pldist}) applies for events above that size.
153: The total rate of events larger than $S_1$ is then
154: \begin{equation}
155: \lambda_1=\int_{S_1}^{\infty}N(S)dS=A(\gamma -1)^{-1}
156: S_1^{-\gamma+1},
157: \end{equation}
158: assuming $\gamma>1$. Hence the frequency-size distribution may be
159: rewritten
160: \begin{equation}\label{eq:fdist}
161: N(S)=\lambda_1(\gamma-1)S_1^{\gamma-1}S^{-\gamma}.
162: \end{equation}
163: Suppose the probability of a big event in a given period $\Delta T$ is
164: required, where by big we mean an event at least as large as
165: $S_2$. According to the distribution~(\ref{eq:fdist})
166: the rate of events larger than $S_2$ is
167: \begin{equation}\label{eq:rate_big}
168: \lambda_2=\lambda_1
169: \left( \frac{S_1}{S_2}\right)^{\gamma-1}.
170: \end{equation}
171:
172: Applying the Poisson model of flare occurrence, the probability of at
173: least one big event during a period $\Delta T$ is given by Poisson
174: statistics as
175: \begin{equation}\label{eq:prob_big}
176: \epsilon =1-\exp(-\lambda_2 \Delta T).
177: \end{equation}
178:
179: Equations~(\ref{eq:rate_big}) and~(\ref{eq:prob_big}) provide the
180: required estimate. The quantities $S_1$, $S_2$ and $\Delta T$ are chosen,
181: and then the parameters $\lambda_1$ and $\gamma$ (if the precise value
182: of $\gamma$ is assumed unknown) need to be
183: estimated from the past history of flaring of the active region.
184: Wheatland (2001) assumed that $\gamma$ is the same for all active
185: regions, and hence known (see Wheatland 2000),
186: and estimated $\lambda_1$ using the
187: Bayesian procedure of Scargle (1998).
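
As an illustrative calculation (the numbers are chosen for the example
only, and match the parameters adopted for the simulations in \S\,4),
suppose $\lambda_1=5\,{\rm day}^{-1}$, $S_2/S_1=100$, $\gamma=1.8$, and
$\Delta T=1\,{\rm day}$. Equations~(\ref{eq:rate_big})
and~(\ref{eq:prob_big}) then give
\begin{eqnarray*}
\lambda_2 &=& 5\times 100^{-0.8}\,{\rm day}^{-1}
\approx 0.126\,{\rm day}^{-1},\\
\epsilon &=& 1-\exp(-0.126)\approx 0.118,
\end{eqnarray*}
i.e.\ roughly a 12\% chance of at least one big event in the
following day.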
188:
189: The rationale behind the method of Wheatland (2001) is that the
190: flare frequency-size distribution is steep so there are very many small
191: events, which allows $\lambda_1$ to be estimated relatively accurately
192: from the observed history of flaring in an active region. Hence the
193: estimate of $\epsilon$ should be relatively accurate. To make this
194: point quantitative, note that from Equations~(\ref{eq:rate_big})
195: and~(\ref{eq:prob_big}) the uncertainty in the estimate of the
196: probability $\epsilon $ is given approximately by
197: \begin{equation}
\frac{\sigma_{\epsilon}}{\epsilon}
199: =\frac{\lambda_1 \Delta T (S_1/S_2)^{\gamma-1}}
200: {\exp[\lambda_1 \Delta T (S_1/S_2)^{\gamma-1}] -1}
201: \frac{\sigma_1}{\lambda_1},
202: \end{equation}
203: where $\sigma_1$ is the uncertainty in $\lambda_1$, and where we have
204: ignored any uncertainty in $\gamma$. Assuming $S_2\gg S_1$ leads to
205: $\sigma_{\epsilon}/\epsilon \approx \sigma_1/\lambda_1$.
206: If the rate $\lambda_1$ is determined from
207: $M$ observed events, then for Poisson statistics we expect
208: $\sigma_1/\lambda_1=M^{-1/2}$, and hence
209: \begin{equation}\label{eq:unc}
210: \frac{\sigma_{\epsilon}}{\epsilon}\approx M^{-1/2}.
211: \end{equation}
212: Equation~(\ref{eq:unc}) provides a crude estimate of the accuracy of the
213: method. To achieve a 10\% accuracy in the estimate requires of order
214: 100 observed events.
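
The approximate expression for $\sigma_{\epsilon}/\epsilon$ given above
follows from standard first-order error propagation, sketched here
for completeness. Writing $x=\lambda_1 \Delta T (S_1/S_2)^{\gamma-1}$,
so that $\epsilon=1-e^{-x}$, and holding $\gamma$ fixed, we have
\begin{eqnarray*}
\sigma_{\epsilon}&\approx&
\left|\frac{\partial \epsilon}{\partial \lambda_1}\right|\sigma_1
=e^{-x}\,\frac{x}{\lambda_1}\,\sigma_1 ,\\
\frac{\sigma_{\epsilon}}{\epsilon}&\approx&
\frac{x\,e^{-x}}{1-e^{-x}}\,\frac{\sigma_1}{\lambda_1}
=\frac{x}{e^{x}-1}\,\frac{\sigma_1}{\lambda_1}.
\end{eqnarray*}
The prefactor $x/(e^{x}-1)$ tends to unity as $x\rightarrow 0$. As a
purely illustrative case, $x\approx 0.126$ gives a prefactor of about
$0.94$, so that $M=25$ observed events would correspond to
$\sigma_{\epsilon}/\epsilon\approx 0.94\times 25^{-1/2}\approx 0.19$.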
215:
216: \section{New method}
217:
218: \subsection{Approach}
219:
220: The Wheatland (2001) method shows how to use the flaring record
221: for an active region to make a flare prediction, but it ignores the
222: other information which is normally the basis of prediction. It is
223: sensible to combine all of the available information, and in this
224: section we consider how to do this.
225:
226: We assume that a sequence of events with sizes $s_1,s_2,...,s_M$
227: (all larger than $S_1$) are observed to occur at times
228: $t_1< t_2< ...< t_M$ respectively in an active region.
229: These events occur within an observing interval which starts at
230: time $t_{\rm sta}$ and ends at time $t_{\rm end}$. We also have
231: additional information, which we label $I$, including our
232: knowledge of the phenomenological rules of flare statistics, and
233: e.g.\ the McIntosh classification of the active region.
234: The problem is then to estimate $\epsilon$, the probability of a big
235: event, based on the data and the additional information $I$.
236: By `estimating $\epsilon$' we strictly mean that we want to calculate
237: a probability distribution for the quantity $\epsilon$, based on the
238: available information. The peak of this distribution
239: is our most likely value for the probability of occurrence of a big
240: flare, and the width of the distribution is a measure of the
241: uncertainty of that value. To do this we proceed as follows.
242: First we estimate (calculate probability distributions for)
243: $\lambda_1$ and $\gamma$ based on the available information, and then
244: we use these distributions to estimate $\lambda_2$. Then we use this
245: distribution together with the relationship~(\ref{eq:prob_big}) to
246: estimate the desired quantity $\epsilon$. We now consider each of
247: these steps in turn.
248:
249: \subsection{Estimating $\gamma$}
250:
251: First we consider the calculation of
252: $P_{\gamma}(\gamma )$, the probability distribution for
253: the power-law index
$\gamma$.\footnote{In the following, probability distributions are given
255: labels such as $P_{\gamma}(\gamma)$ when the actual functional form
256: of the distribution is needed. When this is not the case
257: the generic label ${\rm prob}(...)$ is used to denote a
258: distribution.}
259: As mentioned in the Introduction,
260: Wheatland (2000) found that the index $\gamma$
261: is independent of active region for a set of hard X-ray events,
262: although the statistics underlying
263: the study were somewhat poor. If $\gamma$ is the same in all active
264: regions then the
265: observations $s_1,s_2,...,s_M$ can be replaced by a larger set of
266: events over many active regions. We return to this point in \S\,3.4,
267: but for now admit the possibility that $\gamma$ is different in different
268: active regions, and consider its estimation based on data for the given
269: active region alone.
270:
271: Bai (1993) has shown how to estimate a power-law index for a set of
272: data, using `maximum likelihood'. Following Bai, the likelihood
273: function, that is the probability of the observed data
274: $D=\{s_1,s_2,...,s_M\}$ given the model, is (assuming $\gamma>1$)
275: \begin{equation}\label{eq:gam_like}
276: {\rm prob}(D | \gamma, I )
277: \propto \prod_{i=1}^{M}(\gamma-1)(s_i/S_1)^{-\gamma},
278: \end{equation}
279: where $I$ stands for all additional information, including knowledge of
280: the phenomenological rule~(\ref{eq:pldist}). We note that this
281: expression requires $\gamma >1$, which follows from the requirement that
282: the probability distribution for size $S$ is normalized over all $S$ larger
283: than $S_1$. It is not necessary to introduce an upper cutoff for $S$ in
284: the present treatment (provided $\gamma >1$), although an upper cutoff
285: is necessary to ensure that the mean flare size is finite, if
286: $\gamma<2$. We will return to this point in \S\,5.
287:
288: Bayes's theorem may be used to convert the likelihood into the
289: probability of the model given the data, which is what we are
290: interested in:
\begin{equation}\label{eq:p_gam_bayes}
{\rm prob}(\gamma | D,I)
\propto
{\rm prob}(D | \gamma,I)\times {\rm prob}(\gamma | I ),
\end{equation}
where ${\rm prob}(\gamma | I )$ is the `prior distribution' for
$\gamma$, i.e.\ the distribution we would assign to $\gamma$ in
the absence of the data (e.g.\ Sivia 1996). A choice needs to
be made for this distribution, and a common choice is to assume
a constant value within minimum and maximum values $\gamma_1$ and
$\gamma_2$ respectively:
\begin{equation}
{\rm prob} (\gamma | I) = \left\{
\begin{array}{ll}
(\gamma_2-\gamma_1)^{-1} & \mbox{if $\gamma_1\leq \gamma \leq
\gamma_2$}
\\
0 & \mbox{else,}
\end{array}
\right.
\end{equation}
312: which is referred to as a `uniform prior'.
313: We note that for a uniform prior the most likely value of
314: $\gamma$ is the maximum of the likelihood function:
315: \begin{equation}\label{eq:gam_ML}
316: \gamma^{\ast}=\frac{M}{\sum_{i=1}^{M}\ln (s_i/S_1)}+1,
317: \end{equation}
318: which is the maximum likelihood estimate of $\gamma$ found by Bai.
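
For completeness, the maximization leading to
Equation~(\ref{eq:gam_ML}) can be sketched as follows. Taking the
logarithm of the likelihood~(\ref{eq:gam_like}) gives, up to an
additive constant,
\begin{displaymath}
\ln {\rm prob}(D|\gamma,I)=M\ln (\gamma-1)
-\gamma\sum_{i=1}^{M}\ln (s_i/S_1),
\end{displaymath}
and setting the derivative with respect to $\gamma$ to zero yields
\begin{displaymath}
\frac{M}{\gamma^{\ast}-1}=\sum_{i=1}^{M}\ln (s_i/S_1),
\end{displaymath}
which rearranges to Equation~(\ref{eq:gam_ML}). The second derivative,
$-M/(\gamma-1)^2$, is negative for all $\gamma>1$, so the stationary
point is a maximum.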
319:
320: We can identify ${\rm prob} (\gamma | D,I)$ with
321: $P_{\gamma}(\gamma)$, and then Equations~(\ref{eq:gam_like})
322: and~(\ref{eq:p_gam_bayes})
323: give the required `posterior distribution' for $\gamma$:
324: \begin{equation}\label{eq:prob_gam}
325: P_{\gamma}(\gamma)= C \frac{(\gamma-1)^{M}}{\pi^{\gamma}}\Gamma (\gamma),
326: \end{equation}
327: where
328: \begin{equation}
329: \pi=\prod_{i=1}^M\frac{s_i}{S_1},
330: \end{equation}
331: and where we have relabelled the prior distribution $\Gamma (\gamma)$.
332: The normalizing factor $C$ is determined by the requirement
333: $\int_{1}^{\infty}P_{\gamma}(\gamma)d\gamma=1$.\footnote{In the
334: following all normalizing factors are labelled $C$, although they
335: refer to different values. It is understood that in each case the
336: value $C$ is to be determined by integration.} For a uniform prior
337: the integral may be performed, leading to
338: \begin{equation}
339: C=\frac{(\gamma_2-\gamma_1) \pi (\ln \pi )^{M+1}/M!}
340: {P[M+1,(\gamma_2-1)\ln\pi ]
341: - P[M+1, (\gamma_1-1)\ln \pi ]},
342: \end{equation}
where $P (a,x)$ denotes the incomplete Gamma function (Abramowitz \& Stegun 1964).
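
The integration can be sketched as follows, assuming that $P(a,x)$ is
the normalized form of the incomplete Gamma function, so that
$P(M+1,x)=(M!)^{-1}\int_0^x t^{M}e^{-t}dt$, and assuming
$\gamma_1\geq 1$. For the uniform prior the normalization condition is
$C(\gamma_2-\gamma_1)^{-1}\int_{\gamma_1}^{\gamma_2}
(\gamma-1)^M\pi^{-\gamma}d\gamma=1$. The substitution
$t=(\gamma-1)\ln\pi$ gives $\pi^{-\gamma}=\pi^{-1}e^{-t}$ and
$d\gamma=dt/\ln\pi$, so that
\begin{eqnarray*}
\int_{\gamma_1}^{\gamma_2}(\gamma-1)^M\pi^{-\gamma}d\gamma
&=&\frac{1}{\pi (\ln\pi)^{M+1}}
\int_{(\gamma_1-1)\ln\pi}^{(\gamma_2-1)\ln\pi}t^{M}e^{-t}dt \\
&=&\frac{M!}{\pi (\ln\pi)^{M+1}}
\left\{P[M+1,(\gamma_2-1)\ln\pi ]
-P[M+1,(\gamma_1-1)\ln\pi ]\right\},
\end{eqnarray*}
and solving the normalization condition for $C$ reproduces the
expression above. Note that $\ln\pi>0$ because every observed event
has $s_i>S_1$.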
344:
345: Before proceeding we present a rough estimate of the uncertainty in
346: the most likely value of $\gamma$ based on the distribution
347: $P_{\gamma}(\gamma)$ with a uniform prior.
348: Assuming Gaussian behavior in the vicinity of
349: the peak, the width of the distribution~(\ref{eq:prob_gam}) is
350: $\sigma_{\gamma}\approx [L^{\prime\prime}(\gamma^{\ast})]^{-1/2}$, where
351: $L(\gamma)=-\ln P_{\gamma}(\gamma)$, and where $\gamma^{\ast}$ is the
352: location of the peak of the distribution (Sivia 1996). This leads to
353: $\sigma_{\gamma}\approx M^{1/2}/\ln\pi$, and using
354: Equation~(\ref{eq:gam_ML}) gives
355: \begin{equation}\label{eq:sig_gam}
356: \sigma_{\gamma}\approx (\gamma^{\ast}-1)M^{-1/2}.
357: \end{equation}
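
Explicitly, for a uniform prior
$L(\gamma)=-M\ln(\gamma-1)+\gamma\ln\pi+{\rm const}$, so that
$L^{\prime\prime}(\gamma)=M/(\gamma-1)^2$, while
$\gamma^{\ast}-1=M/\ln\pi$ from Equation~(\ref{eq:gam_ML}). As an
illustrative example (values assumed for the example only), $M=57$
events and $\gamma^{\ast}=1.8$ give
\begin{displaymath}
\sigma_{\gamma}\approx 0.8\times 57^{-1/2}\approx 0.11 .
\end{displaymath}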
358:
359:
360: \subsection{Estimating $\lambda_1$}
361:
362: Next we consider the calculation of $P_1(\lambda_1)$, the distribution
363: of the rate $\lambda_1$ of flares larger than $S_1$.
364: This is a more difficult problem because the rate of flaring in an active
365: region may vary with time~(see e.g.\ Wheatland 2001). However,
366: observations suggest that a piecewise-constant Poisson process
367: provides a good model for the way flares occur in time in
368: individual active regions.
369:
We assume that a period of time of duration $T^{\prime}\leq T$ (where
$T=t_{\rm end}-t_{\rm sta}$ is the duration of the observing interval)
immediately prior to $t_{\rm end}$ is identified (i.e.\ from
$t=t_{\rm end}-T^{\prime}$ to $t=t_{\rm end}$), during which flare
occurrence is consistent with a constant-rate Poisson process.
374:
375: One approach to identifying the necessary period of time has been
376: presented by Scargle (1998), who showed how to select a piecewise-constant
377: Poisson model to describe an observed sequence of events. When applied
378: to a sequence of events at times $t_1< t_2< ... < t_M$ the Scargle method
gives a sequence of times $t_{B0}< t_{B1}<...<t_{BK}$
380: at which the rate is determined to change
381: (where $t_{B0}=t_{\rm sta}$ and $t_{BK}=t_{\rm end}$ are the start and
382: end of the observing period), and a corresponding sequence
383: $\lambda_{B1},\lambda_{B2},...,\lambda_{BK}$ of rates. The sequence
384: of times and rates is called a set of `Bayesian blocks'. In this
385: case we identify $T^{\prime}$ with $t_{BK}-t_{B(K-1)}$.
386: We note that the original Bayesian blocks procedure [which was used
387: e.g.\ by Wheatland (2001)] does not necessarily select the best
388: piecewise-constant model. Recently Scargle has found a computationally
389: feasible way to determine the optimal decomposition (Scargle, private
390: communication, 2003). We begin by assuming this method (or another
391: method) has been applied to the data, to determine the required period
392: $T^{\prime}$ prior to the end of observations.
393:
A probability distribution for the rate $\lambda_1$ may then be
395: determined as follows. We assume that $M^{\prime}\leq M$ events are observed
396: during the selected period $T^{\prime}$. The probability of the observed
397: data $D^{\prime}$ (strictly this comprises not just the number of events
398: but also their times) given a Poisson model with rate $\lambda_1$ is
399: \begin{equation}\label{eq:pdkmk}
400: {\rm prob} (D^{\prime}|\lambda_1,I)\propto \lambda_1^{M^{\prime}}
401: e^{-\lambda_1T^{\prime}},
402: \end{equation}
403: where we retain only the dependence on $\lambda_1$ on the
404: right hand side of this equation, and where we formally recognise any
405: additional information by the dependence on $I$.
406: Bayes's theorem may be used to turn this likelihood into a probability
407: of the model given the data, and the additional information:
\begin{equation}\label{eq:pmkdk}
{\rm prob}(\lambda_1|D^{\prime},I)\propto
{\rm prob}(D^{\prime}|\lambda_1,I)\times {\rm prob} (\lambda_1 | I),
\end{equation}
where ${\rm prob}(\lambda_1 | I)$ is the prior distribution for the rate.

The prior distribution ${\rm prob} (\lambda_1 | I)$ represents the
415: estimate of the rate of flaring for the active region in the absence
416: of any data. This distribution allows the incorporation of any additional
417: information we have about the expected rate of flaring, not including
418: the actual data. To make this concrete, we will consider the case that
419: the additional information is the McIntosh classification of the sunspots
420: associated with the active region, although we stress that any other
421: additional information can also be incorporated.
422: When the additional information is the McIntosh classification,
423: a suitable prior distribution can be
424: constructed from historical records of the observed rates of events
425: above size $S_1$ for every active region of the same class.
426: This is a generalization of the analysis underlying present flare
427: prediction methods based on McIntosh classification, which considers
428: only the mean flaring rate extracted from historical data. Hence we
429: propose the construction of distributions of flaring rate for each
430: McIntosh classification. We assume these are available, and label the
431: appropriate distribution
432: $\Lambda_{\rm MC} (\lambda_1)$, where MC denotes McIntosh
433: classification. Equation~(\ref{eq:pmkdk}) then becomes
434: \begin{equation}\label{eq:prob_lam1}
435: P_1(\lambda_1)=C\lambda_1^{M^{\prime}} e^{-\lambda_1T^{\prime}}
436: \Lambda_{\rm MC} (\lambda_1),
437: \end{equation}
438: where we have identified ${\rm prob}(\lambda_1|D^{\prime},I)$ with
$P_1(\lambda_1)$, and where $C$ is the normalization factor. This
440: is the required posterior distribution for $\lambda_1$.
441:
442: It should be noted that the distribution~(\ref{eq:prob_lam1}) explicitly
443: uses only a subset of all flares observed in an active region,
444: i.e.\ the $M^{\prime}\leq M$ flares observed during the interval
445: $T^{\prime}\leq T$. Previous
446: data contribute only to the determination of the interval $T^{\prime}$. The
447: motivation is that when the rate changes, the old rate is no
448: longer relevant for future prediction. For many active regions the
449: observed rate appears to be constant during a transit of the disk, or
450: at least no rate change is detectable (e.g.\ Wheatland 2001), in which
451: case all observed flares contribute explicitly to the inference.
452:
453: Before proceeding we note two simple results for
454: Equation~(\ref{eq:prob_lam1}) with a uniform prior.
455: First, it is easy to see that with a uniform prior the maximum of this
456: distribution occurs at $M^{\prime}/T^{\prime}$.
Second, we note the well known result that for large $\lambda_1T^{\prime}$
458: and neglecting the prior, Equation~(\ref{eq:prob_lam1})
459: approximates a Gaussian with a width
460: \begin{equation}\label{eq:sig_lam}
461: \sigma_1\approx \frac{(M^{\prime})^{1/2}}{T^{\prime}},
462: \end{equation}
463: which is consistent with the arguments at the end of \S\,2.
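
Both results follow from a short calculation, sketched here assuming
a uniform prior on $\lambda_1\geq 0$ whose constant value is absorbed
into $C$. With
$L(\lambda_1)=-\ln P_1(\lambda_1)
=-M^{\prime}\ln\lambda_1+\lambda_1T^{\prime}+{\rm const}$, the
condition $dL/d\lambda_1=0$ and the second derivative at the peak are
\begin{displaymath}
-\frac{M^{\prime}}{\lambda_1^{\ast}}+T^{\prime}=0
\quad\Rightarrow\quad
\lambda_1^{\ast}=\frac{M^{\prime}}{T^{\prime}},
\qquad
\left.\frac{d^2L}{d\lambda_1^2}\right|_{\lambda_1^{\ast}}
=\frac{M^{\prime}}{(\lambda_1^{\ast})^2}
=\frac{(T^{\prime})^2}{M^{\prime}},
\end{displaymath}
so that $\sigma_1\approx (M^{\prime})^{1/2}/T^{\prime}$ and
$\sigma_1/\lambda_1^{\ast}\approx (M^{\prime})^{-1/2}$. In this case
the normalization is also available in closed form,
$C=(T^{\prime})^{M^{\prime}+1}/M^{\prime}!$, since
$\int_0^{\infty}\lambda_1^{M^{\prime}}e^{-\lambda_1T^{\prime}}d\lambda_1
=M^{\prime}!/(T^{\prime})^{M^{\prime}+1}$.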
464:
465: \subsection{Estimating $\epsilon$}
466:
467: The probability distribution $P_2(\lambda_2)$ for the rate $\lambda_2$
468: of flares larger than $S_2$ may be constructed from the distributions
469: $P_1(\lambda_1)$ and $P_{\gamma}(\gamma)$ using
470: Equation~(\ref{eq:rate_big}). Specifically we have
471: $\lambda_2=\lambda_1(S_1/S_2)^{\gamma-1}$,
472: and hence
473: \begin{equation}
474: P_2(\lambda_2)=
475: \int_1^{\infty}d\gamma \int_0^{\infty} d\lambda_1 P_1(\lambda_1)
476: P_{\gamma}(\gamma)\delta
477: \left[ \lambda_2-\lambda_1(S_1/S_2)^{\gamma-1}\right],
478: \end{equation}
479: and performing the integral over $\lambda_1$ leads to
480: \begin{equation}\label{eq:P2}
481: P_2(\lambda_2) =
482: \int_1^{\infty}
483: d\gamma P_{\gamma}(\gamma)\left(\frac{S_2}{S_1}\right)^{\gamma-1}
484: P_1\left[\lambda_2 \left(\frac{S_2}{S_1}\right)^{\gamma-1} \right].
485: \end{equation}
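
The integration over $\lambda_1$ uses the scaling property of the
delta function, sketched here for completeness. For a constant $r>0$,
$\delta(\lambda_2-r\lambda_1)=r^{-1}\delta(\lambda_1-\lambda_2/r)$, so
that with $r=(S_1/S_2)^{\gamma-1}$ and $\lambda_2>0$
\begin{displaymath}
\int_0^{\infty}d\lambda_1 P_1(\lambda_1)
\delta\left(\lambda_2-r\lambda_1\right)
=\frac{1}{r}P_1\left(\frac{\lambda_2}{r}\right).
\end{displaymath}
Multiplied by $P_{\gamma}(\gamma)$, this is the integrand of
Equation~(\ref{eq:P2}), since $1/r=(S_2/S_1)^{\gamma-1}$.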
486:
487:
488: The quantity we are interested in is $\epsilon$, the probability of
489: an event bigger than $S_2$ occurring in an interval $\Delta T$.
490: The probability distribution $P_{\epsilon}(\epsilon)$ for this
quantity may be constructed from the distribution for $\lambda_2$ by
492: a change of variable.
493: Specifically, from Equation~(\ref{eq:prob_big}) we have
494: $\lambda_2=-\ln (1-\epsilon)/\Delta T$, and hence
495: \begin{eqnarray}\label{eq:prob_pbig}
496: P_{\epsilon}(\epsilon)&=&P_2\left[\lambda_2(\epsilon )\right]
497: \left|\frac{d\lambda_2}{d\epsilon }\right| \nonumber \\
498: &=& P_2\left[-\frac{\ln (1-\epsilon )}{\Delta T} \right]
499: \frac{1}{\Delta T (1-\epsilon ) }.
500: \end{eqnarray}
501: Using Equations~(\ref{eq:prob_gam}), (\ref{eq:prob_lam1}), and
502: (\ref{eq:P2}) in~(\ref{eq:prob_pbig}) leads to
503: \begin{equation}\label{eq:pbig_general}
504: P_{\epsilon}(\epsilon )=
505: \int_1^{\infty} d\gamma \, f(\epsilon,\gamma),
506: \end{equation}
507: where
508: \begin{eqnarray}\label{eq:fjoint}
509: f(\epsilon,\gamma)&=&C\left[-\ln (1-\epsilon )\right]^{M^{\prime}}
510: (\gamma-1)^M\Gamma (\gamma )
511: \left[\frac{(S_2/S_1)^{M^{\prime}+1}}{\pi}\right]^{\gamma}
512: \nonumber \\
513: &\times& (1-\epsilon )^{\left(T^{\prime}/\Delta T\right)
514: \left(S_2/S_1\right)^{\gamma-1}-1}
515: \Lambda_{\rm MC} \left[-\frac{\ln (1-\epsilon )}{\Delta T}
516: \left(\frac{S_2}{S_1}\right)^{\gamma-1} \right]
517: \end{eqnarray}
518: is the joint probability
519: distribution for $\epsilon$ and $\gamma$. The normalization factor
520: $C$ is obtained by requiring that
521: $\int_{0}^{1}P_{\epsilon}(\epsilon)d\epsilon=1$. We note that
522: $P_{\gamma}(\gamma)$ and $P_{\epsilon}(\epsilon)$ may be considered
523: to be marginal distributions of $f(\epsilon,\gamma)$ (i.e.\ they are
524: obtained by integration over $\epsilon$ and $\gamma$ respectively).
525: However, Equation~(\ref{eq:prob_gam}) gives the distribution for
526: $\gamma$ directly.
527:
528: As noted in \S\,3.2, observations suggest that $\gamma$ is the same
529: in all active regions, in which case the index can be determined very
530: accurately from events over many active regions using
531: Equation~(\ref{eq:gam_ML}). If the estimate is $\gamma^{\ast}$,
532: then we can consider the prior distribution for $\gamma$ to be
533: $\Gamma (\gamma) = \delta (\gamma-\gamma^{\ast})$, and
534: Equation~(\ref{eq:pbig_general}) simplifies to
535: \begin{equation}\label{eq:pbig_simp}
536: P_{\epsilon}(\epsilon ) =
537: C\left[-\ln (1-\epsilon ) \right]^{M^{\prime}}
538: (1-\epsilon )^{\left( T^{\prime}/\Delta T\right)
539: \left(S_2/S_1\right)^{\gamma^{\ast}-1}-1}
540: \Lambda_{\rm MC} \left[-\frac{\ln (1-\epsilon )}{\Delta T}
541: \left(\frac{S_2}{S_1}\right)^{\gamma^{\ast}-1} \right].
542: \end{equation}
543:
544: Equations~(\ref{eq:pbig_general}), (\ref{eq:fjoint})
545: and~(\ref{eq:pbig_simp}) are the required expressions for the posterior
546: probability distribution for $\epsilon$.
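
As a simple check on Equation~(\ref{eq:pbig_simp}), consider a uniform
prior $\Lambda_{\rm MC}$ (an assumption made here only to obtain a
closed form). Writing $y=-\ln(1-\epsilon)$ and
$\kappa=(T^{\prime}/\Delta T)(S_2/S_1)^{\gamma^{\ast}-1}$, we have
$\ln P_{\epsilon}=M^{\prime}\ln y-(\kappa-1)y+{\rm const}$. Since $y$
increases monotonically with $\epsilon$, the maximum occurs where
$M^{\prime}/y=\kappa-1$ (assuming $\kappa>1$), i.e.\ at
\begin{displaymath}
\epsilon^{\ast}=1-\exp\left(-\frac{M^{\prime}}{\kappa-1}\right).
\end{displaymath}
For example, with the illustrative values $M^{\prime}=25$,
$T^{\prime}=5\,{\rm days}$, $\Delta T=1\,{\rm day}$, $S_2/S_1=100$ and
$\gamma^{\ast}=1.8$, we find $\kappa\approx 199$ and
$\epsilon^{\ast}\approx 0.119$, close to the value $0.118$ obtained by
substituting $\lambda_1=M^{\prime}/T^{\prime}$ directly into
Equations~(\ref{eq:rate_big}) and~(\ref{eq:prob_big}).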
547:
548: \section{Simulations}
549:
550: We present two simulations demonstrating the application of the
551: method to synthetic data. These simulations omit the inclusion of
552: other information via the prior
553: $\Lambda_{\rm MC} (\lambda_1)$, so they illustrate only how the
554: method performs using the observed data.
555:
556: First we consider the case that $\gamma$ is assumed to be known.
557: Ten days of flaring were simulated by producing a sequence of
558: event times as a Poisson process in time with a rate $\lambda_1=0.5$
559: per day for the first five days, and with a rate $\lambda_1=5.0$
560: per day for the second five days. Each event was assigned a size
561: according to a power law distribution with an index $\gamma=1.8$,
562: above the threshold size $S_1=1$ (in arbitrary units). Figure~1
563: illustrates a typical simulation. The first (upper) panel shows the
564: size of each event versus the time at which the event occurred.
565: In this case there were 31 events. The simulation applies the method
566: to the problem of predicting the probability of a big event occurring
567: during the next day ($\Delta T=1$ day) at the end of the ten days.
568: The size of a big event was taken to be
569: $S_2=100$. The original Bayesian blocks procedure~(Scargle 1998) was
570: applied to the event time series to determine a decomposition into a
571: sequence of piecewise-constant intervals and rates. The second panel
572: of Figure~1 shows the result of this process:
573: the solid lines indicate the rate as a function of time
574: inferred by the Bayesian blocks procedure, and the dotted lines indicate
575: the true rate versus time. The Bayesian blocks procedure correctly
576: identifies a two-rate model as the most likely model, and identifies
577: the approximate time of the change in rate. The third panel shows the
578: probability distribution $P_{\epsilon}(\epsilon)$ obtained from
579: Equation~(\ref{eq:pbig_simp}) with a uniform prior for $\lambda_1$,
580: and with $M^{\prime}$ and $T^{\prime}$ equal to the number of events in the
581: second Bayesian block and the duration of the second Bayesian block
582: respectively. The dotted vertical line in this panel is the true value
583: of $\epsilon$.
584: We see that, even for a relatively small number of events, the method is
585: able to provide a good estimate of the probability of a big event. The
586: width of the inferred distribution for $\epsilon$ is consistent with
587: Equation~(\ref{eq:unc}).
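
A minimal sketch of one way such synthetic data can be generated,
assuming inverse-transform sampling (other procedures would serve
equally well), is as follows. If $u$ and $v$ are independent random
numbers drawn uniformly from $(0,1)$, then waiting times between
events for a Poisson process with constant rate $\lambda_1$, and sizes
from a power law with index $\gamma$ above $S_1$, may be obtained from
\begin{displaymath}
\Delta t=-\frac{\ln u}{\lambda_1},
\qquad
s=S_1\,v^{-1/(\gamma-1)} ,
\end{displaymath}
with the event times generated separately within each interval of
constant rate. The second expression follows because the fraction of
events larger than $s$ is $(s/S_1)^{-(\gamma-1)}$ [cf.\
Equation~(\ref{eq:rate_big})]. For the parameters of the first
simulation the expected number of events is
$0.5\times 5+5.0\times 5=27.5$, with a Poisson standard deviation of
about $5.2$, so a realization with 31 events is typical.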
588:
589: \begin{figure}
590: \epsscale{0.7}
591: \plotone{f1.eps}
592: \caption[f1.eps]{Simulation of 10 days of flaring and application of
593: the prediction method, assuming $\gamma$ is known.}
594: \end{figure}
595:
596: Second we consider the more difficult case of simultaneously
597: estimating $\gamma$ and $\lambda_1$. Ten days of flaring were again
598: simulated, with a rate $\lambda_1=1$ per day for the first five days,
599: and a rate $\lambda_1=10$ per day for the second five days. Larger
600: rates were chosen to provide more events for the inference, but the
601: other parameters were kept the same as in the first simulation.
602: Figure~2 illustrates the results of a typical simulation. The
603: first (upper) panel shows the time history of events --- in this case
604: 57 events occurred. The second panel shows the result of a Bayesian
605: blocks decomposition of the data (solid lines) together with the
606: true rate versus time (dotted lines). Once again the Bayesian blocks
607: procedure correctly identifies a two-rate model as the most likely
608: model, and identifies the approximate time of the change in rate.
609: The third panel shows the result of using Equation~(\ref{eq:prob_gam})
610: --- with a uniform prior with $\gamma_1=1.25$ and $\gamma_2=2.25$ ---
611: to construct the distribution for $\gamma$. The dotted vertical line in this
612: panel shows the true value of $\gamma$.
613: The fourth panel of Figure~2 shows the distribution for $\epsilon$
614: constructed using Equation~(\ref{eq:pbig_general}), with
615: $M=57$, with $M^{\prime}$ and $T^{\prime}$ obtained from the second
616: Bayesian block, and with uniform prior distributions for $\gamma$ and
617: $\lambda_1$.
618: The dotted vertical line indicates the true value. From this simulation
619: we see that a reasonable estimate for $\epsilon$ is obtained for a
620: relatively small number of events.
621:
622: \begin{figure}
623: \epsscale{0.7}
624: \plotone{f2.eps}
625: \caption[f2.eps]{Simulation of 10 days of flaring and application of the
626: prediction method, assuming $\gamma$ is unknown.}
627: \end{figure}
628:
629: The distribution for $\epsilon$ shown in the lower (fourth) panel of
630: Figure~2 is quite broad.
631: A basic reason is that $\epsilon$ depends sensitively on $\gamma$
632: because of its appearance as an exponent in
633: Equation~(\ref{eq:rate_big}), and $\gamma$ has a range of possible
634: values, as shown in the third panel of Figure~2.
635: This effect may be seen by considering
636: $f(\epsilon,\gamma)$ [defined by Equation~(\ref{eq:fjoint})],
637: which is the joint distribution of $\epsilon$ and $\gamma$. Figure~3
638: shows a contour plot of $f(\epsilon,\gamma)$ for the simulation depicted
639: in Figure~2. The dotted vertical and horizontal lines are the true values
640: of $\epsilon$ and $\gamma$ respectively.
641: The dashed curve is defined by
642: $\epsilon=1-\exp[-(M^{\prime}/T^{\prime})(S_1/S_2)^{\gamma-1}\Delta T ]$,
643: and the contours of $f(\epsilon,\gamma)$ are observed to be stretched
644: out along this curve. The practical implication of this figure is that
645: accurate estimation of $\epsilon$ depends on accurate estimation
646: of $\gamma$. In practice $\gamma$ is known a priori quite accurately,
647: but in this simulation we have assumed that $\gamma$ is initially unknown
648: (within the range 1.25 to 2.25), to illustrate the process of
649: inference.
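
To illustrate the sensitivity numerically, consider the dashed curve
evaluated with values that are assumptions made here for the arithmetic
only (they are not claimed to be the simulation parameters):
$M^{\prime}/T^{\prime}=10$ per day, $S_1=1$, $S_2=100$, and
$\Delta T=1$ day. Then
\begin{displaymath}
\epsilon = 1-\exp\left[-10\times 100^{-(\gamma-1)}\right]
\approx \left\{ \begin{array}{ll}
0.96, & \gamma=1.25 \\
0.63, & \gamma=1.5 \\
0.27, & \gamma=1.75
\end{array}\right.
\end{displaymath}
so that a change of $\pm 0.25$ in $\gamma$ moves $\epsilon$ across most
of its range. Differentiating the expression for the curve gives
\begin{displaymath}
\frac{\partial\epsilon}{\partial\gamma}
= -(1-\epsilon)\,\frac{M^{\prime}}{T^{\prime}}
\left(\frac{S_1}{S_2}\right)^{\gamma-1}\Delta T\,
\ln\left(\frac{S_2}{S_1}\right),
\end{displaymath}
which indicates that, at a fixed rate of big events, the sensitivity
scales with the logarithm of the size ratio $S_2/S_1$.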
650:
651: \begin{figure}
652: \epsscale{0.7}
653: \plotone{f3.eps}
654: \caption[f3.eps]{Contour map of the joint probability of $\epsilon$ and
655: $\gamma$, for the simulation in Fig.~2.}
656: \end{figure}
657:
658: \section{Discussion}
659:
660: Existing methods of solar flare prediction do not make complete
661: use of an important source of information: the time history of flares
662: already observed in the active region of interest, in particular
663: frequently occurring small events.
664: A new method for flare prediction is presented which exploits the
665: observed history of flaring from an active region to improve an initial
666: prediction, which may come e.g.\ from one of the existing methods.
667: To make the example concrete we may think of the initial prediction
668: as coming from the McIntosh sunspot classification, which is a common
669: basis for prediction. This background information provides an initial
670: estimate for the expected flaring rate through a prior distribution
671: $\Lambda_{\rm MC}(\lambda_1)$, which represents the probability that
672: the flaring rate above a (small) size $S_1$ is $\lambda_1$, given
673: historical rates of occurrence of flares for the given McIntosh
674: class. Bayes's theorem is then used to estimate the probability
675: $\epsilon$ of observing a large flare (above size $S_2$) in a given
676: period of time, based on this prior information and on the sequence of
677: flares already produced by the active region, and assuming simple
678: phenomenological rules describing the occurrence of flares.
679: In this paper the basic theory behind the inference of $\epsilon$
680: based on observed data is presented. The inclusion of background
681: information [i.e.\ the construction of the priors
682: $\Lambda_{\rm MC}(\lambda_1)$] is yet to be done.
683:
684: The method relies on event sizes following the phenomenological
685: law~(\ref{eq:pldist}). Some studies of very small extreme
686: ultraviolet events (`nanoflares') suggest that their thermal energies
687: follow a steeper distribution than energies of large events
688: (e.g.\ Krucker and Benz 1998; Parnell and Jupp 2000), although this
689: remains controversial (e.g.\ Aschwanden and Parnell 2002).
690: From the point of view of the prediction method presented here,
691: the uncertainty over the low-size end of the distribution is irrelevant
692: provided events significantly larger than nanoflares are used.
693: In any case the observed distributions from many active
694: regions may be examined as a check on Equation~(\ref{eq:pldist}).
695: A related point is that the distribution~(\ref{eq:pldist}) requires
696: a cutoff at large sizes on energetics grounds, and neglect of this
697: cutoff will lead to the number of large flares being overestimated.
698: A cutoff will be incorporated before the method is applied to real data.
699:
700: The choice of the quantity $S$ has not been addressed, although a good
701: choice is likely to be important to the method. Most flare forecasting
702: deals with soft X-ray events, in particular prediction of GOES
703: (Geostationary Operational Environmental Satellite) M and X class
704: events (events with peak fluxes greater than $10^{-5}$~W/m$^2$
705: and $10^{-4}$~W/m$^2$ respectively in the 1--8~\AA\ band observed by
706: the satellites). A practical motivation for this is that flare
707: soft X-ray emission causes disturbances of the ionosphere which affect
708: shortwave radio communication, and there is a need to predict these
709: occurrences. A disadvantage of using GOES events is that they are not
710: ideal for flare statistics, e.g.\ because of problems with event selection
711: due to the large background in soft X-rays (see Wheatland 2001).
712:
713: A number of other issues also need to be considered before the method is
714: implemented with real data. A point neglected so far is that active regions
715: evolve, so that predictions based on the traditional methods also
716: change with time. For example, an active region evolves through McIntosh
717: classifications (e.g.\ Bornmann, Kalmbach, Kulhanek, and Casale 1990).
718: Changes in background information such as this should be incorporated
719: through changes in the prior, and this question will be considered in more
720: detail in future work. A related point concerns the construction of the
721: prior distributions for rate. It is likely that the McIntosh classification
722: will be used, although other possibilities will be considered. The
723: problem is then to determine the probability of a given McIntosh class
724: having a given rate, based on observed flaring sequences in the
725: historical record for active regions of that class.
726: The details of this calculation will be addressed in future work.
727:
728: Finally, as with all methods of forecasting, it is essential to
729: test the reliability of the method. It is straightforward to compare,
730: after the fact, the number of predicted and the number of observed
731: events for a large sample of active regions. The method presented here
732: will be implemented and tested in this way, and the results compared
733: with existing methods of prediction.
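
One simple form of this comparison is sketched here. It assumes that
the $N$ forecasts are statistically independent and that each is
summarized by a point estimate; the notation $\epsilon_i$, $N$, and $n$
is introduced for illustration. If forecast $i$ assigns probability
$\epsilon_i$ to at least one big event occurring in its prediction
interval, then the number $n$ of intervals containing a big event has
expectation and variance
\begin{displaymath}
\langle n \rangle = \sum_{i=1}^{N}\epsilon_i, \qquad
\sigma_n^2=\sum_{i=1}^{N}\epsilon_i(1-\epsilon_i),
\end{displaymath}
so that an observed $n$ differing from $\langle n\rangle$ by many
$\sigma_n$ would indicate unreliable probabilities. For example,
$N=100$ forecasts each with $\epsilon_i=0.1$ give
$\langle n\rangle=10$ and $\sigma_n=3$.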
734:
735: \section*{Acknowledgements}
736:
737: M.S.W. acknowledges the support of an Australian Research Council
738: QEII Fellowship, and thanks Richard Thompson and Garth Patterson
739: of the Ionospheric Prediction Service for useful discussions. The
740: comments of an anonymous referee have also helped to improve the
741: paper.
742:
743: \begin{thebibliography}{}
744: \small
745: %
746: \bibitem[Abramowitz and Stegun 1964]{abr&ste64}
747: Abramowitz, M., \& Stegun, I.A. 1964, Handbook of Mathematical
748: Functions, National Bureau of Standards, Applied Mathematics Series
749: volume 55.
750: %
751: \bibitem[Aschwanden and Parnell 2002]{asc&par02}
752: Aschwanden, M.J., \& Parnell, C.E. 2002, \apj ~572, 1048.
753: %
754: \bibitem[Bai 1993]{bai93} Bai, T. 1993, \apj ~404, 805.
755: %
756: \bibitem[Bornmann, Kalmbach, Kulhanek, \& Casale 1990]{bor&90}
757: Bornmann, P.L., Kalmbach, D., Kulhanek, D., \& Casale, A.
758: 1990, in
759: Solar-Terrestrial Predictions: Proceedings of a Workshop in
760: Leura, Australia, October 16-20, 1989, Volume 1, eds.\ R.J. Thompson,
761: D.G. Cole, P.J. Wilkinson, M.A. Shea, D. Smart, \& G. Heckman,
762: (NOAA Environmental Research Laboratories: Boulder, Colorado),
763: 301.
764: %
765: \bibitem[Bornmann and Shaw 1994]{bor&sha94}
766: Bornmann, P.L., \& Shaw, D. 1994, Sol.\ Phys.\ 150, 127.
767: %
768: \bibitem[Crosby, Aschwanden and Dennis 1993]{cro&93}
769: Crosby, N.B., Aschwanden, M.J., \& Dennis, B.R. 1993,
770: Sol.\ Phys.\ 143, 275.
771: %
772: \bibitem[Gallagher, Moon, and Wang 2002]{gal&02}
773: Gallagher, P.T., Moon, Y.-J., \& Wang, H. 2002, Sol.\ Phys.\
774: 209, 171.
775: %
776: \bibitem[Jaynes 2003]{jay03}
777: Jaynes, E.T. 2003, Probability Theory: The Logic of Science (Cambridge
778: University Press: Cambridge).
779: %
780: \bibitem[Krucker and Benz 1998]{kru&ben98}
781: Krucker, S., \& Benz, A.O. 1998, \apj ~501, L213.
782: %
783: \bibitem[McIntosh 1990]{mci90}
784: McIntosh, P.S. 1990, Sol.\ Phys.\ 125, 251.
785: %
786: \bibitem[Moon et al.\ 2001]{moo&01}
787: Moon, Y.-J., Choe, G.S., Yun, H.S., \& Park, Y.D. 2001,
788: \jgr ~106, 29951.
789: %
790: \bibitem[Neidig, Wiborg and Seagraves 1990]{nei&90}
791: Neidig, D.F., Wiborg, P.H., \& Seagraves, P.H. 1990, in
792: Solar-Terrestrial Predictions: Proceedings of a Workshop in
793: Leura, Australia, October 16-20, 1989, Volume 1, eds.\ R.J. Thompson,
794: D.G. Cole, P.J. Wilkinson, M.A. Shea, D. Smart, \& G. Heckman,
795: (NOAA Environmental Research Laboratories: Boulder, Colorado),
796: 541.
797: %
798: \bibitem[Parnell and Jupp 2000]{par&jup00}
799: Parnell, C.E., \& Jupp, P.E. 2000, \apj ~529, 554.
800: %
801: \bibitem[Scargle 1998]{sca98} Scargle, J.D. 1998, \apj ~504, 405.
802: %
803: \bibitem[Sivia 1996]{siv96} Sivia, D.S. 1996, Data Analysis: A
804: Bayesian Tutorial, (Clarendon Press: Oxford).
805: %
806: \bibitem[Wheatland 2000]{whe00} Wheatland, M.S. 2000, \apj ~532, 1209.
807: %
808: \bibitem[Wheatland 2001]{whe01} Wheatland, M.S. 2001, Sol.\ Phys.\
809: 203, 87.
810: %
811: \end{thebibliography}
812:
813: \end{document}
814:
815: