
1: % Flare prediction paper, written February 2004.

2: % Revised following referee's report, March 2004.

3:

4: \documentclass[12pt,preprint,a4]{aastex}

5:

6: \slugcomment{To appear in the Astrophysical Journal}

7:

8: \begin{document}

9: \title{A Bayesian Approach to Solar Flare Prediction}

10: \author{M. S. Wheatland}

11: \affil{School of Physics, University of Sydney, NSW 2006, Australia}

12: \email{m.wheatland@physics.usyd.edu.au}

13:

14: \begin{abstract}

15: A number of methods of flare prediction rely on classification of

16: physical characteristics of an active region, in particular optical

17: classification of sunspots, and historical rates of flaring for a

18: given classification. However, these methods largely ignore the number

19: of flares the active region has already produced, in particular the

20: number of small events. The past history of occurrence of

21: flares (of all sizes) is an important indicator of future flare

22: production. We present a Bayesian approach to flare prediction,

23: which uses the flaring record of an active region together with

24: phenomenological rules of flare statistics to refine an initial

25: prediction for the occurrence of a big flare during a subsequent

26: period of time. The initial prediction is assumed to come from one

27: of the extant methods of flare prediction. The theory of the method

28: is outlined, and simulations are presented to show how the refinement

29: step of the method works in practice.

30: \end{abstract}

31:

32: \keywords{Sun: activity --- Sun: flares --- Sun: X-rays ---

33:   methods: statistical}

34:

35: \section{Introduction}

36:

37: Solar flares influence local `space weather,' and as a result there is

38: a demand for accurate flare prediction. Unfortunately no reliable

39: deterministic method of predicting a flare is known, and existing methods

40: are probabilistic in nature.

41:

42: A number of methods discussed in the literature are based on a commonly

43: used white-light classification of sunspots, and the correlation

44: between classification and flare occurrence. The McIntosh classification

45: (McIntosh 1990) categorizes a group of sunspots into one of 60 classes,

46: based on three parameters. Historical flare rates for each of the

47: classifications were used by McIntosh (1990) as the basis of an

48: `expert system' for flare prediction. The system, called Theophrastus

49: (the associated code is called THEO), also incorporates additional

50: information including dynamical properties

51: of spot growth, rotation and shear, magnetic topology inferred from

52: sunspot structure, magnetic classification, and previous flare activity.

53: The method is apparently somewhat subjective, involving rules of thumb

54: incorporated by a human expert. A second approach using the McIntosh

55: classification was presented by Bornmann and Shaw (1994). In this case

56: multiple linear regression was used to determine the effective contribution

57: of each of the McIntosh parameters to the rate of flaring, based on historical

58: records of flaring. Codes based on the methods of McIntosh (1990) and

59: Bornmann and Shaw (1994) are used by the Ionospheric Prediction

60: Service (IPS) of Australia to issue flare predictions at their

61: Learmonth and Culgoora observatories.\footnote{See http://www.ips.gov.au.}

62: Recently Gallagher, Moon and Wang (2002) implemented a system

63: using historical averages of flare numbers for McIntosh classifications

64: to predict a rate for an active region, and then converted this to

65: a probability of flaring in a day using the assumption of Poisson

66: statistics. This prediction is given as part of the Big Bear Solar

67: Observatory Active Region Monitor (ARM).\footnote{See

68: http://beauty.nascom.nasa.gov/arm/latest/.} Finally the US National

69: Oceanic and Atmospheric Administration (NOAA) issues flare

70: probability forecasts for active regions

71: which include input from THEO.\footnote{See

72: http://www.sec.noaa.gov/ftpdir/latest/daypre.txt.}

73:

74: A shortcoming of methods relying on correlations of flaring with

75: active region classification based on historical records is that they

76: ignore the important information of how many flares the active region

77: of interest has already produced. The system of McIntosh (1990)

78: incorporates information about previous activity, but it is unclear

79: how objectively this is done, and the information is limited to

80: the number of large flares already produced by the given active region.

81: In the flare prediction

82: literature, the tendency of a region which has produced large flares in

83: the past to produce large flares in the future is called persistence,

84: which is recognised as one of the most reliable predictors for large

85: flare occurrence in 24-hour forecasts (e.g.\ Neidig, Weiborg, \&

86: Seagraves 1989). In this paper we argue that the history of occurrence of

87: all flares (large and small) observed in a given active region is an

88: important indicator as to how the region will flare in the future, and

89: should be used in any prediction. A related criticism of methods based

90: on classification and historical records is that a given classification

91: may embrace active regions with a variety of flaring rates. If an

92: active region has a flaring rate differing from the average historical

93: rate for its class then the predictions will be in error.

94:

95: Studies of solar flare statistics provide simple phenomenological

96: rules describing flare occurrence. It is well known that flares follow

97: a power-law size distribution, where by size we mean e.g.\ peak flux

98: in soft X-ray. More

99: formally the flare frequency-size distribution $N(S)$ (i.e.\ the number

100: of events per unit size $S$ and per unit time) may be written

101: \begin{equation}\label{eq:pldist}

102: N(S)=AS^{-\gamma}

103: \end{equation}

104: where $A$ and $\gamma$ are constants. The exact power-law

105: index $\gamma$ depends on the choice of the quantity $S$, but typically

106: it is found to be in the range 1.5 to 2 (e.g.\ Crosby, Aschwanden,

107: \& Dennis 1992). The power law index $\gamma$ appears to be the same

108: in different active regions~\cite{whe00}, although there is some

109: evidence that it varies with the solar cycle~\cite{bai93}. A second

110: simple rule concerns the way flares occur in time. Studies of the

111: rate of occurrence of soft X-ray flares in individual active regions

112: suggest that events occur as a Poisson process in time (e.g.\ Moon et

113: al.\ 2001), although many active regions exhibit changes in the

114: mean rate of events (Wheatland 2001).

115:

116: In this paper we show how the observed record of flaring in an active

117: region may be used together with the phenomenological rules of

118: flare statistics to objectively refine an initial flare prediction.

119: The initial prediction may be based on the McIntosh classification, or

120: may come from any other prediction method which does not consider the

121: flare data. The new method

122: is envisaged to work as follows. When an active region appears at the

123: east limb of the Sun, the best guess as to its future flare productivity

124: comes from one of the conventional prediction methods. However, as the

125: active region produces flares, the observed flare statistics are used to

126: adjust the prediction for future flaring. After many flares have been

127: observed, the prediction for future flaring may be dominated by the

128: contribution from the observed data. This process --- refining a

129: probability estimate based on new data --- is naturally performed using

130: Bayes's theorem (e.g.\ Sivia 1996; Jaynes 2003).

131:

132: The layout of the paper is as follows. In \S\,2 a simple approach

133: to flare prediction using only the past record of flaring from an active

134: region [previously presented in Wheatland (2001)] is reiterated.

135: In \S\,3 the new method of prediction,

136: combining existing methods and information from observed flare statistics,

137: is described.

138: In \S\,4 simulations are presented showing how the method uses the

139: observed flaring record, and in \S\,5 the results are discussed.

140:

141: \section{Wheatland (2001)}

142:

143: Wheatland (2001) presented a method for flare prediction using

144: only observed flare statistics and the assumptions that flares obey

145: Poisson statistics in time, and power-law statistics in size,

146: elaborating on a suggestion by Moon et al.\ (2001).

147: The approach is briefly reiterated here, since it is part of the new

148: method.

149:

150: First assume that there is a threshold size $S_1$ above which

151: all events occurring in an active region are observed, so that the

152: distribution~(\ref{eq:pldist}) applies for events above that size.

153: The total rate of events larger than $S_1$ is then

154: \begin{equation}

155: \lambda_1=\int_{S_1}^{\infty}N(S)dS=A(\gamma -1)^{-1}

156:   S_1^{-\gamma+1},

157: \end{equation}

158: assuming $\gamma>1$. Hence the frequency-size distribution may be

159: rewritten

160: \begin{equation}\label{eq:fdist}

161: N(S)=\lambda_1(\gamma-1)S_1^{\gamma-1}S^{-\gamma}.

162: \end{equation}

163: Suppose the probability of a big event in a given period $\Delta T$ is

164: required, where by big we mean an event at least as large as

165: $S_2$. According to the distribution~(\ref{eq:fdist})

166: the rate of events larger than $S_2$ is

167: \begin{equation}\label{eq:rate_big}

168: \lambda_2=\lambda_1

169:   \left( \frac{S_1}{S_2}\right)^{\gamma-1}.

170: \end{equation}

171:

172: Applying the Poisson model of flare occurrence, the probability of at

173: least one big event during a period $\Delta T$ is given by Poisson

174: statistics as

175: \begin{equation}\label{eq:prob_big}

176: \epsilon =1-\exp(-\lambda_2 \Delta T).

177: \end{equation}

178:

179: Equations~(\ref{eq:rate_big}) and~(\ref{eq:prob_big}) provide the

180: required estimate. The quantities $S_1$, $S_2$ and $\Delta T$ are chosen,

181: and then the parameters $\lambda_1$ and $\gamma$ (if the precise value

182: of $\gamma$ is assumed unknown) need to be

183: estimated from the past history of flaring of the active region.

184: Wheatland (2001) assumed that $\gamma$ is the same for all active

185: regions, and hence known (see Wheatland 2000),

186: and estimated $\lambda_1$ using the

187: Bayesian procedure of Scargle (1998).
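
As a rough numerical illustration of Equations~(\ref{eq:rate_big})
and~(\ref{eq:prob_big}), consider $\lambda_1=5$ per day, $S_1=1$,
$S_2=100$, $\gamma=1.8$ and $\Delta T=1$ day. These values match those
adopted for the second five days of the first simulation in \S\,4, and
are used here purely as an example, with $\lambda_1$ and $\gamma$
treated as exactly known:
\begin{eqnarray*}
\lambda_2 &=& 5\times \left(\frac{1}{100}\right)^{0.8}
  \approx 0.126~\mbox{per day}, \\
\epsilon &=& 1-\exp (-0.126)\approx 0.12.
\end{eqnarray*}
Under these assumptions there is roughly a 12\% chance of at least one
big event during the following day.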

188:

189: The rationale behind the method of Wheatland (2001) is that the

190: flare frequency-size distribution is steep so there are very many small

191: events, which allows $\lambda_1$ to be estimated relatively accurately

192: from the observed history of flaring in an active region. Hence the

193: estimate of $\epsilon$ should be relatively accurate. To make this

194: point quantitative, note that from Equations~(\ref{eq:rate_big})

195: and~(\ref{eq:prob_big}) the uncertainty in the estimate of the

196: probability $\epsilon $ is given approximately by

197: \begin{equation}

198: \frac{\sigma_{\epsilon}}{\epsilon}

199:   =\frac{\lambda_1 \Delta T (S_1/S_2)^{\gamma-1}}

200:   {\exp[\lambda_1 \Delta T (S_1/S_2)^{\gamma-1}] -1}

201:   \frac{\sigma_1}{\lambda_1},

202: \end{equation}

203: where $\sigma_1$ is the uncertainty in $\lambda_1$, and where we have

204: ignored any uncertainty in $\gamma$. Assuming $S_2\gg S_1$ leads to

205: $\sigma_{\epsilon}/\epsilon \approx \sigma_1/\lambda_1$.

206: If the rate $\lambda_1$ is determined from

207: $M$ observed events, then for Poisson statistics we expect

208: $\sigma_1/\lambda_1=M^{-1/2}$, and hence

209: \begin{equation}\label{eq:unc}

210: \frac{\sigma_{\epsilon}}{\epsilon}\approx M^{-1/2}.

211: \end{equation}

212: Equation~(\ref{eq:unc}) provides a crude estimate of the accuracy of the

213: method. To achieve a 10\% accuracy in the estimate requires of order

214: 100 observed events.
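
To indicate how the approximation leading to
Equation~(\ref{eq:unc}) behaves, write
$x=\lambda_1 \Delta T (S_1/S_2)^{\gamma-1}=\lambda_2\Delta T$
(a shorthand introduced only for this remark). The factor
multiplying $\sigma_1/\lambda_1$ may then be expanded for small $x$:
\[
\frac{x}{e^{x}-1}=1-\frac{x}{2}+\frac{x^2}{12}-\cdots .
\]
As an illustration under assumed values ($\lambda_1=5$ per day,
$\Delta T=1$ day, $S_2/S_1=100$, $\gamma=1.8$, so that
$x\approx 0.126$, and a hypothetical $M=25$ observed events), the
factor is about $0.94$, and
\[
\frac{\sigma_{\epsilon}}{\epsilon}\approx 0.94\times 25^{-1/2}
  \approx 0.19 ,
\]
compared with $M^{-1/2}=0.20$ from Equation~(\ref{eq:unc}).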

215:

216: \section{New method}

217:

218: \subsection{Approach}

219:

220: The Wheatland (2001) method shows how to use the flaring record

221: for an active region to make a flare prediction, but it ignores the

222: other information which is normally the basis of prediction. It is

223: sensible to combine all of the available information, and in this

224: section we consider how to do this.

225:

226: We assume that a sequence of events with sizes $s_1,s_2,...,s_M$

227: (all larger than $S_1$) are observed to occur at times

228: $t_1< t_2< ...< t_M$ respectively in an active region.

229: These events occur within an observing interval which starts at

230: time $t_{\rm sta}$ and ends at time $t_{\rm end}$. We also have

231: additional information, which we label $I$, including our

232: knowledge of the phenomenological rules of flare statistics, and

233: e.g.\ the McIntosh classification of the active region.

234: The problem is then to estimate $\epsilon$, the probability of a big

235: event, based on the data and the additional information $I$.

236: By `estimating $\epsilon$' we strictly mean that we want to calculate

237: a probability distribution for the quantity $\epsilon$, based on the

238: available information. The peak of this distribution

239: is our most likely value for the probability of occurrence of a big

240: flare, and the width of the distribution is a measure of the

241: uncertainty of that value. To do this we proceed as follows.

242: First we estimate (calculate probability distributions for)

243: $\lambda_1$ and $\gamma$ based on the available information, and then

244: we use these distributions to estimate $\lambda_2$. Then we use this

245: distribution together with the relationship~(\ref{eq:prob_big}) to

246: estimate the desired quantity $\epsilon$. We now consider each of

247: these steps in turn.

248:

249: \subsection{Estimating $\gamma$}

250:

251: First we consider the calculation of

252: $P_{\gamma}(\gamma )$, the probability distribution for

253: the power-law index

254: $\gamma$.\footnote{In the following probability distributions are given

255: labels such as $P_{\gamma}(\gamma)$ when the actual functional form

256: of the distribution is needed. When this is not the case

257: the generic label ${\rm prob}(...)$ is used to denote a

258: distribution.}

259: As mentioned in the Introduction,

260: Wheatland (2000) found that the index $\gamma$

261: is independent of active region for a set of hard X-ray events,

262: although the statistics underlying

263: the study were somewhat poor. If $\gamma$ is the same in all active

264: regions then the

265: observations $s_1,s_2,...,s_M$ can be replaced by a larger set of

266: events over many active regions. We return to this point in \S\,3.4,

267: but for now admit the possibility that $\gamma$ is different in different

268: active regions, and consider its estimation based on data for the given

269: active region alone.

270:

271: Bai (1993) has shown how to estimate a power-law index for a set of

272: data, using `maximum likelihood'. Following Bai, the likelihood

273: function, that is the probability of the observed data

274: $D=\{s_1,s_2,...,s_M\}$ given the model, is (assuming $\gamma>1$)

275: \begin{equation}\label{eq:gam_like}

276: {\rm prob}(D | \gamma, I )

277:   \propto \prod_{i=1}^{M}(\gamma-1)(s_i/S_1)^{-\gamma},

278: \end{equation}

279: where $I$ stands for all additional information, including knowledge of

280: the phenomenological rule~(\ref{eq:pldist}). We note that this

281: expression requires $\gamma >1$, which follows from the requirement that

282: the probability distribution for size $S$ is normalized over all $S$ larger

283: than $S_1$. It is not necessary to introduce an upper cutoff for $S$ in

284: the present treatment (provided $\gamma >1$), although an upper cutoff

285: is necessary to ensure that the mean flare size is finite, if

286: $\gamma<2$. We will return to this point in \S\,5.

287:

288: Bayes's theorem may be used to convert the likelihood into the

289: probability of the model given the data, which is what we are

290: interested in:

291: \begin{equation}\label{eq:p_gam_bayes}

292: {\rm prob}(\gamma | D,I)

293: \propto

294:   {\rm prob}(D | \gamma,I)\times {\rm prob}(\gamma | I ),

295: \end{equation}

296: where ${\rm prob}(\gamma | I )$ is the `prior distribution' for

297: $\gamma$, i.e.\ the distribution we would assign to $\gamma$ in

298: the absence of the data (e.g.\ Sivia 1996). A choice needs to

299: be made for this distribution, and a common choice is to assume

300: a constant value within minimum and maximum values $\gamma_1$ and

301: $\gamma_2$ respectively:

302: \begin{equation}

303: {\rm prob} (\gamma | I) = \left\{

304:   \begin{array}{ll}

305:   (\gamma_2-\gamma_1)^{-1} & \mbox{if $\gamma_1\leq \gamma \leq

306: \gamma_2$}

307:   \\

308:   0 & \mbox{else,}

309: \end{array}

310: \right.

311: \end{equation}

312: which is referred to as a `uniform prior'.

313: We note that for a uniform prior the most likely value of

314: $\gamma$ is the maximum of the likelihood function:

315: \begin{equation}\label{eq:gam_ML}

316: \gamma^{\ast}=\frac{M}{\sum_{i=1}^{M}\ln (s_i/S_1)}+1,

317: \end{equation}

318: which is the maximum likelihood estimate of $\gamma$ found by Bai.
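
The step from Equation~(\ref{eq:gam_like}) to
Equation~(\ref{eq:gam_ML}) may be sketched as follows, assuming a
uniform prior with the maximum lying inside the interval
$\gamma_1<\gamma<\gamma_2$. Taking the logarithm of the likelihood
gives
\[
\ln {\rm prob}(D | \gamma, I)
  = M\ln (\gamma-1)-\gamma \sum_{i=1}^{M}\ln (s_i/S_1)
  +\mbox{const},
\]
and setting the derivative with respect to $\gamma$ equal to zero
gives
\[
\frac{M}{\gamma^{\ast}-1}-\sum_{i=1}^{M}\ln (s_i/S_1)=0,
\]
which rearranges to Equation~(\ref{eq:gam_ML}). The second derivative,
$-M/(\gamma-1)^2$, is negative, so the stationary point is a maximum.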

319:

320: We can identify ${\rm prob} (\gamma | D,I)$ with

321: $P_{\gamma}(\gamma)$, and then Equations~(\ref{eq:gam_like})

322: and~(\ref{eq:p_gam_bayes})

323: give the required `posterior distribution' for $\gamma$:

324: \begin{equation}\label{eq:prob_gam}

325: P_{\gamma}(\gamma)= C \frac{(\gamma-1)^{M}}{\pi^{\gamma}}\Gamma (\gamma),

326: \end{equation}

327: where

328: \begin{equation}

329: \pi=\prod_{i=1}^M\frac{s_i}{S_1},

330: \end{equation}

331: and where we have relabelled the prior distribution $\Gamma (\gamma)$.

332: The normalizing factor $C$ is determined by the requirement

333: $\int_{1}^{\infty}P_{\gamma}(\gamma)d\gamma=1$.\footnote{In the

334: following all normalizing factors are labelled $C$, although they

335: refer to different values. It is understood that in each case the

336: value $C$ is to be determined by integration.} For a uniform prior

337: the integral may be performed, leading to

338: \begin{equation}

339: C=\frac{(\gamma_2-\gamma_1) \pi (\ln \pi )^{M+1}/M!}

340:   {P[M+1,(\gamma_2-1)\ln\pi ]

341:   - P[M+1, (\gamma_1-1)\ln \pi ]},

342: \end{equation}

343: where $P (a,x)$ denotes the incomplete Gamma function~\cite{abr&ste64}.

344:

345: Before proceeding we present a rough estimate of the uncertainty in

346: the most likely value of $\gamma$ based on the distribution

347: $P_{\gamma}(\gamma)$ with a uniform prior.

348: Assuming Gaussian behavior in the vicinity of

349: the peak, the width of the distribution~(\ref{eq:prob_gam}) is

350: $\sigma_{\gamma}\approx [L^{\prime\prime}(\gamma^{\ast})]^{-1/2}$, where

351: $L(\gamma)=-\ln P_{\gamma}(\gamma)$, and where $\gamma^{\ast}$ is the

352: location of the peak of the distribution (Sivia 1996). This leads to

353: $\sigma_{\gamma}\approx M^{1/2}/\ln\pi$, and using

354: Equation~(\ref{eq:gam_ML}) gives

355: \begin{equation}\label{eq:sig_gam}

356: \sigma_{\gamma}\approx (\gamma^{\ast}-1)M^{-1/2}.

357: \end{equation}
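
The intermediate steps may be sketched as follows, again assuming a
uniform prior with the peak inside the allowed range of $\gamma$.
From Equation~(\ref{eq:prob_gam}),
\[
L(\gamma)=-M\ln (\gamma-1)+\gamma \ln \pi +\mbox{const},
\qquad
L^{\prime\prime}(\gamma)=\frac{M}{(\gamma-1)^2},
\]
and evaluating at $\gamma^{\ast}-1=M/\ln \pi$ gives
$L^{\prime\prime}(\gamma^{\ast})=(\ln \pi)^2/M$. As a purely
illustrative case, taking $M=57$ events and supposing the estimate
happens to be $\gamma^{\ast}=1.8$ (both values are assumptions made
for this example),
\[
\sigma_{\gamma}\approx \frac{0.8}{\sqrt{57}}\approx 0.11 .
\]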

358:

359:

360: \subsection{Estimating $\lambda_1$}

361:

362: Next we consider the calculation of $P_1(\lambda_1)$, the distribution

363: of the rate $\lambda_1$ of flares larger than $S_1$.

364: This is a more difficult problem because the rate of flaring in an active

365: region may vary with time~(see e.g.\ Wheatland 2001). However,

366: observations suggest that a piecewise-constant Poisson process

367: provides a good model for the way flares occur in time in

368: individual active regions.

369:

370: We assume that a period of time of duration $T^{\prime}\leq T$ (where $T=t_{\rm end}-t_{\rm sta}$ is the length of the observing interval) immediately

371: prior to $t_{\rm end}$ is identified (i.e.\ from $t=t_{\rm end}-T^{\prime}$

372: to $t=t_{\rm end}$) during which time flare occurrence is consistent

373: with a constant-rate Poisson process.

374:

375: One approach to identifying the necessary period of time has been

376: presented by Scargle (1998), who showed how to select a piecewise-constant

377: Poisson model to describe an observed sequence of events. When applied

378: to a sequence of events at times $t_1< t_2< ... < t_M$ the Scargle method

379: gives a sequence of times $t_{B0}< t_{B1}<...<t_{BK}$

380: at which the rate is determined to change

381: (where $t_{B0}=t_{\rm sta}$ and $t_{BK}=t_{\rm end}$ are the start and

382: end of the observing period), and a corresponding sequence

383: $\lambda_{B1},\lambda_{B2},...,\lambda_{BK}$ of rates. The sequence

384: of times and rates is called a set of  `Bayesian blocks'. In this

385: case we identify $T^{\prime}$ with $t_{BK}-t_{B(K-1)}$.

386: We note that the original Bayesian blocks procedure [which was used

387: e.g.\ by Wheatland (2001)] does not necessarily select the best

388: piecewise-constant model. Recently Scargle has found a computationally

389: feasible way to determine the optimal decomposition (Scargle, private

390: communication, 2003). We begin by assuming this method (or another

391: method) has been applied to the data, to determine the required period

392: $T^{\prime}$ prior to the end of observations.

393:

394: A probability distribution for the rate $\lambda_1$ is then

395: determined as follows. We assume that $M^{\prime}\leq M$ events are observed

396: during the selected period $T^{\prime}$. The probability of the observed

397: data $D^{\prime}$ (strictly this comprises not just the number of events

398: but also their times) given a Poisson model with rate $\lambda_1$ is

399: \begin{equation}\label{eq:pdkmk}

400: {\rm prob} (D^{\prime}|\lambda_1,I)\propto \lambda_1^{M^{\prime}}

401:   e^{-\lambda_1T^{\prime}},

402: \end{equation}

403: where we retain only the dependence on $\lambda_1$ on the

404: right hand side of this equation, and where we formally recognise any

405: additional information by the dependence on $I$.

406: Bayes's theorem may be used to turn this likelihood into a probability

407: of the model given the data, and the additional information:

408: \begin{equation}\label{eq:pmkdk}

409: {\rm prob}(\lambda_1|D^{\prime},I)\propto

410:   {\rm prob}(D^{\prime}|\lambda_1,I)\times {\rm prob} (\lambda_1 | I),

411: \end{equation}

412: where ${\rm prob}(\lambda_1 | I)$ is the prior distribution for the rate.

413:

414: The prior distribution ${\rm prob} (\lambda_1 | I)$ represents the

415: estimate of the rate of flaring for the active region in the absence

416: of any data. This distribution allows the incorporation of any additional

417: information we have about the expected rate of flaring, not including

418: the actual data. To make this concrete, we will consider the case that

419: the additional information is the McIntosh classification of the sunspots

420: associated with the active region, although we stress that any other

421: additional information can also be incorporated.

422: When the additional information is the McIntosh classification,

423: a suitable prior distribution can be

424: constructed from historical records of the observed rates of events

425: above size $S_1$ for every active region of the same class.

426: This is a generalization of the analysis underlying present flare

427: prediction methods based on McIntosh classification, which considers

428: only the mean flaring rate extracted from historical data. Hence we

429: propose the construction of distributions of flaring rate for each

430: McIntosh classification. We assume these are available, and label the

431: appropriate distribution

432: $\Lambda_{\rm MC} (\lambda_1)$, where MC denotes McIntosh

433: classification. Equation~(\ref{eq:pmkdk}) then becomes

434: \begin{equation}\label{eq:prob_lam1}

435: P_1(\lambda_1)=C\lambda_1^{M^{\prime}} e^{-\lambda_1T^{\prime}}

436:   \Lambda_{\rm MC} (\lambda_1),

437: \end{equation}

438: where we have identified ${\rm prob}(\lambda_1|D^{\prime},I)$ with

439: $P_1(\lambda_1)$, and where $C$ is the normalization factor. This

440: is the required posterior distribution for $\lambda_1$.

441:

442: It should be noted that the distribution~(\ref{eq:prob_lam1}) explicitly

443: uses only a subset of all flares observed in an active region,

444: i.e.\ the $M^{\prime}\leq M$ flares observed during the interval

445: $T^{\prime}\leq T$. Previous

446: data contribute only to the determination of the interval $T^{\prime}$. The

447: motivation is that when the rate changes, the old rate is no

448: longer relevant for future prediction. For many active regions the

449: observed rate appears to be constant during a transit of the disk, or

450: at least no rate change is detectable (e.g.\ Wheatland 2001), in which

451: case all observed flares contribute explicitly to the inference.

452:

453: Before proceeding we note two simple results for

454: Equation~(\ref{eq:prob_lam1}) with a uniform prior.

455: First, it is easy to see that with a uniform prior the maximum of this

456: distribution occurs at $\lambda_1=M^{\prime}/T^{\prime}$.

457: Second we note the well known result that for large $\lambda_1T^{\prime}$

458: and neglecting the prior, Equation~(\ref{eq:prob_lam1})

459: approximates a Gaussian with a width

460: \begin{equation}\label{eq:sig_lam}

461: \sigma_1\approx \frac{(M^{\prime})^{1/2}}{T^{\prime}},

462: \end{equation}

463: which is consistent with the arguments at the end of \S\,2.
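
Both results follow from the logarithm of
Equation~(\ref{eq:prob_lam1}). As a sketch, for a constant prior,
\[
\ln P_1(\lambda_1)=M^{\prime}\ln \lambda_1-\lambda_1 T^{\prime}
  +\mbox{const},
\]
so that
\[
\frac{d\ln P_1}{d\lambda_1}
  =\frac{M^{\prime}}{\lambda_1}-T^{\prime}=0
\quad \Rightarrow \quad
\lambda_1^{\ast}=\frac{M^{\prime}}{T^{\prime}},
\qquad
\frac{d^2\ln P_1}{d\lambda_1^2}
  =-\frac{M^{\prime}}{\lambda_1^2}.
\]
Evaluating the second derivative at $\lambda_1^{\ast}$ and using the
Gaussian approximation near the peak gives
$\sigma_1\approx \lambda_1^{\ast}(M^{\prime})^{-1/2}
=(M^{\prime})^{1/2}/T^{\prime}$, i.e.\
$\sigma_1/\lambda_1^{\ast}\approx (M^{\prime})^{-1/2}$. Here
$\lambda_1^{\ast}$ is a label introduced only for this remark.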

464:

465: \subsection{Estimating $\epsilon$}

466:

467: The probability distribution $P_2(\lambda_2)$ for the rate $\lambda_2$

468: of flares larger than $S_2$ may be constructed from the distributions

469: $P_1(\lambda_1)$ and $P_{\gamma}(\gamma)$ using

470: Equation~(\ref{eq:rate_big}). Specifically we have

471: $\lambda_2=\lambda_1(S_1/S_2)^{\gamma-1}$,

472: and hence

473: \begin{equation}

474: P_2(\lambda_2)=

475:   \int_1^{\infty}d\gamma \int_0^{\infty} d\lambda_1 P_1(\lambda_1)

476:   P_{\gamma}(\gamma)\delta

477:   \left[ \lambda_2-\lambda_1(S_1/S_2)^{\gamma-1}\right],

478: \end{equation}

479: and performing the integral over $\lambda_1$ leads to

480: \begin{equation}\label{eq:P2}

481: P_2(\lambda_2) =

482:   \int_1^{\infty}

483:   d\gamma P_{\gamma}(\gamma)\left(\frac{S_2}{S_1}\right)^{\gamma-1}

484:   P_1\left[\lambda_2 \left(\frac{S_2}{S_1}\right)^{\gamma-1} \right].

485: \end{equation}
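
The integral over $\lambda_1$ relies on the scaling property of the
delta function. Writing $a=(S_1/S_2)^{\gamma-1}$ (a shorthand
introduced only for this remark), we have
\[
\delta \left(\lambda_2-a\lambda_1\right)
  =\frac{1}{a}\,\delta \left(\lambda_1-\frac{\lambda_2}{a}\right),
\]
so that
\[
\int_0^{\infty}d\lambda_1 P_1(\lambda_1)
  \delta \left(\lambda_2-a\lambda_1\right)
  =\frac{1}{a}P_1\left(\frac{\lambda_2}{a}\right),
\]
and substituting $1/a=(S_2/S_1)^{\gamma-1}$ recovers the integrand of
Equation~(\ref{eq:P2}).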

486:

487:

488: The quantity we are interested in is $\epsilon$, the probability of

489: an event bigger than $S_2$ occurring in an interval $\Delta T$.

490: The probability distribution $P_{\epsilon}(\epsilon)$ for this

491: quantity may be constructed from the distribution for $\lambda_2$ by

492: a change of variable.

493: Specifically, from Equation~(\ref{eq:prob_big}) we have

494: $\lambda_2=-\ln (1-\epsilon)/\Delta T$, and hence

495: \begin{eqnarray}\label{eq:prob_pbig}

496: P_{\epsilon}(\epsilon)&=&P_2\left[\lambda_2(\epsilon )\right]

497: \left|\frac{d\lambda_2}{d\epsilon }\right| \nonumber \\

498: &=& P_2\left[-\frac{\ln (1-\epsilon )}{\Delta T} \right]

499:   \frac{1}{\Delta T (1-\epsilon ) }.

500: \end{eqnarray}

501: Using Equations~(\ref{eq:prob_gam}), (\ref{eq:prob_lam1}), and

502: (\ref{eq:P2}) in~(\ref{eq:prob_pbig}) leads to

503: \begin{equation}\label{eq:pbig_general}

504: P_{\epsilon}(\epsilon )=

505:   \int_1^{\infty} d\gamma \, f(\epsilon,\gamma),

506: \end{equation}

507: where

508: \begin{eqnarray}\label{eq:fjoint}

509: f(\epsilon,\gamma)&=&C\left[-\ln (1-\epsilon )\right]^{M^{\prime}}

510:   (\gamma-1)^M\Gamma (\gamma )

511:   \left[\frac{(S_2/S_1)^{M^{\prime}+1}}{\pi}\right]^{\gamma}

512:   \nonumber \\

513:   &\times& (1-\epsilon )^{\left(T^{\prime}/\Delta T\right)

514:     \left(S_2/S_1\right)^{\gamma-1}-1}

515:   \Lambda_{\rm MC} \left[-\frac{\ln (1-\epsilon )}{\Delta T}

516:   \left(\frac{S_2}{S_1}\right)^{\gamma-1} \right]

517: \end{eqnarray}

518: is the joint probability

519: distribution for $\epsilon$ and $\gamma$. The normalization factor

520: $C$ is obtained by requiring that

521: $\int_{0}^{1}P_{\epsilon}(\epsilon)d\epsilon=1$. We note that

522: $P_{\gamma}(\gamma)$ and $P_{\epsilon}(\epsilon)$ may be considered

523: to be marginal distributions of $f(\epsilon,\gamma)$ (i.e.\ they are

524: obtained by integration over $\epsilon$ and $\gamma$ respectively).

525: However, Equation~(\ref{eq:prob_gam}) gives the distribution for

526: $\gamma$ directly.

527:

528: As noted in \S\,3.2, observations suggest that $\gamma$ is the same

529: in all active regions, in which case the index can be determined very

530: accurately from events over many active regions using

531: Equation~(\ref{eq:gam_ML}). If the estimate is $\gamma^{\ast}$,

532: then we can consider the prior distribution for $\gamma$ to be

533: $\Gamma (\gamma) = \delta (\gamma-\gamma^{\ast})$, and

534: Equation~(\ref{eq:pbig_general}) simplifies to

535: \begin{equation}\label{eq:pbig_simp}

536: P_{\epsilon}(\epsilon ) =

537:   C\left[-\ln (1-\epsilon ) \right]^{M^{\prime}}

538:   (1-\epsilon )^{\left( T^{\prime}/\Delta T\right)

539:   \left(S_2/S_1\right)^{\gamma^{\ast}-1}-1}

540:   \Lambda_{\rm MC} \left[-\frac{\ln (1-\epsilon )}{\Delta T}

541:   \left(\frac{S_2}{S_1}\right)^{\gamma^{\ast}-1} \right].

542: \end{equation}

543:

544: Equations~(\ref{eq:pbig_general}), (\ref{eq:fjoint})

545: and~(\ref{eq:pbig_simp}) are the required expressions for the posterior

546: probability distribution for $\epsilon$.
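
For orientation, the location of the peak of
Equation~(\ref{eq:pbig_simp}) can be found in closed form in the
special case that $\Lambda_{\rm MC}$ is constant. The following is a
sketch under that assumption, using the shorthand
$u=-\ln (1-\epsilon)$ and
$k=(T^{\prime}/\Delta T)(S_2/S_1)^{\gamma^{\ast}-1}$ (both introduced
only for this remark). Then
\[
\ln P_{\epsilon}(\epsilon)=M^{\prime}\ln u-(k-1)u+\mbox{const},
\]
and since $u$ increases monotonically with $\epsilon$, the maximum
occurs at $u=M^{\prime}/(k-1)$ (for $k>1$), i.e.\ at
\[
\epsilon^{\ast}=1-\exp \left(-\frac{M^{\prime}}{k-1}\right).
\]
As purely hypothetical numbers, taking $M^{\prime}=25$,
$T^{\prime}=5$ days, $\Delta T=1$ day, $S_2/S_1=100$ and
$\gamma^{\ast}=1.8$ gives $k\approx 199$ and
$\epsilon^{\ast}\approx 1-\exp (-25/198)\approx 0.12$. For
$k\gg 1$ this reduces to the estimate obtained from
Equations~(\ref{eq:rate_big}) and~(\ref{eq:prob_big}) with
$\lambda_1=M^{\prime}/T^{\prime}$.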

547:

548: \section{Simulations}

549:

550: We present two simulations demonstrating the application of the

551: method to synthetic data. These simulations omit the inclusion of

552: other information via the prior

553: $\Lambda_{\rm MC} (\lambda_1)$, so they illustrate only how the

554: method performs using the observed data.

555:

556: First we consider the case that $\gamma$ is assumed to be known.

557: Ten days of flaring were simulated by producing a sequence of

558: event times as a Poisson process in time with a rate $\lambda_1=0.5$

559: per day for the first five days, and with a rate $\lambda_1=5.0$

560: per day for the second five days. Each event was assigned a size

561: according to a power law distribution with an index $\gamma=1.8$,

562: above the threshold size $S_1=1$ (in arbitrary units). Figure~1

563: illustrates a typical simulation. The first (upper) panel shows the

564: size of each event versus the time at which the event occurred.

565: In this case there were 31 events. The simulation applies the method

566: to the problem of predicting the probability of a big event occurring

567: during the next day ($\Delta T=1$ day) at the end of the ten days.

568: The size of a big event was taken to be

569: $S_2=100$. The original Bayesian blocks procedure~(Scargle 1998) was

570: applied to the event time series to determine a decomposition into a

571: sequence of piecewise-constant intervals and rates. The second panel

572: of Figure~1 shows the result of this process:

573: the solid lines indicate the rate as a function of time

574: inferred by the Bayesian blocks procedure, and the dotted lines indicate

575: the true rate versus time. The Bayesian blocks procedure correctly

576: identifies a two-rate model as the most likely model, and identifies

577: the approximate time of the change in rate. The third panel shows the

578: probability distribution $P_{\epsilon}(\epsilon)$ obtained from

579: Equation~(\ref{eq:pbig_simp}) with a uniform prior for $\lambda_1$,

580: and with $M^{\prime}$ and $T^{\prime}$ equal to the number of events in the

581: second Bayesian block and the duration of the second Bayesian block

582: respectively. The dotted vertical line in this panel is the true value

583: of $\epsilon$.

584: We see that, even for a relatively small number of events, the method is

585: able to provide a good estimate of the probability of a big event. The

586: width of the inferred distribution for $\epsilon$ is consistent with

587: Equation~(\ref{eq:unc}).
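
As a rough cross-check on this simulation (a sketch that is not part of the original calculation), the expected number of events and the true value of $\epsilon$ follow from the stated parameters. It is assumed here that the rate of big events is $\lambda_1(S_1/S_2)^{\gamma-1}$, the form that appears in the expression for the dashed curve in Figure~3, and that $\epsilon=1-\exp(-\lambda_2\Delta T)$, where $\lambda_2$ is used as a label for the big-event rate and $\langle M\rangle$ for the expected number of events:
\begin{eqnarray*}
\langle M\rangle &=& 0.5\times 5+5.0\times 5=27.5, \\
\lambda_2 &=& 5.0\times (1/100)^{0.8}\approx 0.126~{\rm per~day}, \\
\epsilon &=& 1-\exp(-0.126\times 1)\approx 0.118 .
\end{eqnarray*}
The 31 events in this realization lie within one Poisson standard deviation ($\sqrt{27.5}\approx 5.2$) of the expected number, and under these assumptions the true value of $\epsilon$ is about 0.12.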

588:

589: \begin{figure}

590: \epsscale{0.7}

591: \plotone{f1.eps}

592: \caption[f1.eps]{Simulation of 10 days of flaring and application of

593: the prediction method, assuming $\gamma$ is known.}

594: \end{figure}

595:

596: Second we consider the more difficult case of simultaneously

597: estimating $\gamma$ and $\lambda_1$. Ten days of flaring were again

598: simulated, with a rate $\lambda_1=1$ per day for the first five days,

599: and a rate $\lambda_1=10$ per day for the second five days. Larger

600: rates were chosen to provide more events for the inference, but the

601: other parameters were kept the same as in the first simulation.

602: Figure~2 illustrates the results of a typical simulation. The

603: first (upper) panel shows the time history of events --- in this case

604: 57 events occurred. The second panel shows the result of a Bayesian

605: blocks decomposition of the data (solid lines) together with the

606: true rate versus time (dotted lines). Once again the Bayesian blocks

607: procedure correctly identifies a two-rate model as the most likely

608: model, and identifies the approximate time of the change in rate.

609: The third panel shows the result of using Equation~(\ref{eq:prob_gam})

610: --- with a uniform prior with $\gamma_1=1.25$ and $\gamma_2=2.25$ ---

611: to construct the distribution for $\gamma$. The dotted vertical line in this

612: panel shows the true value of $\gamma$.

613: The fourth panel of Figure~2 shows the distribution for $\epsilon$

614: constructed using Equation~(\ref{eq:pbig_general}), with

615: $M=57$, with $M^{\prime}$ and $T^{\prime}$ obtained from the second

616: Bayesian block, and with uniform prior distributions for $\gamma$ and

617: $\lambda_1$.

618: The dotted vertical line indicates the true value. From this simulation

619: we see that a reasonable estimate for $\epsilon $ is obtained for a

620: relatively small number of events.
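
For reference, the true value of $\epsilon$ in this second simulation can be estimated from the stated parameters. This is an illustrative calculation which assumes the big-event rate is $\lambda_1(S_1/S_2)^{\gamma-1}$, as in the expression for the dashed curve in Figure~3, with $\lambda_2$ used as a label for that rate and $\langle M\rangle$ for the expected number of events:
\begin{eqnarray*}
\langle M\rangle &=& 1\times 5+10\times 5=55, \\
\lambda_2 &=& 10\times (1/100)^{0.8}\approx 0.251~{\rm per~day}, \\
\epsilon &=& 1-\exp(-0.251\times 1)\approx 0.222 .
\end{eqnarray*}
The 57 events observed are consistent with the expected number, given a Poisson standard deviation of $\sqrt{55}\approx 7.4$.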

621:

622: \begin{figure}

623: \epsscale{0.7}

624: \plotone{f2.eps}

625: \caption[f2.eps]{Simulation of 10 days of flaring and application of the

626: prediction method, assuming $\gamma$ is unknown.}

627: \end{figure}

628:

629: The distribution for $\epsilon$ shown in the fourth (lower) panel of

630: Figure~2 is quite broad.

631: A basic reason is that $\epsilon$ depends sensitively on $\gamma$

632: because of its appearance as an exponent in

633: Equation~(\ref{eq:rate_big}), and $\gamma$ has a range of possible

634: values, as shown in the third panel of Figure~2.

635: This effect may be seen by considering

636: $f(\epsilon,\gamma)$ [defined by Equation~(\ref{eq:fjoint})],

637: which is the joint distribution of $\epsilon$ and $\gamma$. Figure~3

638: shows a contour plot of $f(\epsilon,\gamma)$ for the simulation depicted

639: in Figure~2. The dotted vertical and horizontal lines are the true values

640: of $\epsilon$ and $\gamma$ respectively.

641: The dashed curve is defined by

642: $\epsilon=1-\exp[-(M^{\prime}/T^{\prime})(S_1/S_2)^{\gamma-1}\Delta T ]$,

643: and the contours of $f(\epsilon,\gamma)$ are observed to be stretched

644: out along this curve. The practical implication of this figure is that

645: accurate estimation of $\epsilon$ depends on accurate estimation

646: of $\gamma$. In practice $\gamma$ is known a priori quite accurately,

647: but in this simulation we have assumed that $\gamma$ is initially unknown

648: (within the range 1.25 to 2.25), to illustrate the process of

649: inference.
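
To give a sense of scale for this sensitivity, the dashed curve may be evaluated at the ends of the prior range for $\gamma$ and at the true value. This is an illustrative calculation only: it uses the true second-block rate of 10 per day in place of $M^{\prime}/T^{\prime}$, which is not quoted above, together with $S_1/S_2=10^{-2}$ and $\Delta T=1$ day:
\begin{eqnarray*}
\gamma=1.25: && \epsilon=1-\exp(-10\times 10^{-0.5})\approx 0.96, \\
\gamma=1.80: && \epsilon=1-\exp(-10\times 10^{-1.6})\approx 0.22, \\
\gamma=2.25: && \epsilon=1-\exp(-10\times 10^{-2.5})\approx 0.031 .
\end{eqnarray*}
Equivalently, since $(S_1/S_2)^{\gamma-1}=10^{-2(\gamma-1)}$ for these sizes, a shift of 0.1 in $\gamma$ changes the big-event rate by a factor of $10^{0.2}\approx 1.6$.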

650:

651: \begin{figure}

652: \epsscale{0.7}

653: \plotone{f3.eps}

654: \caption[f3.eps]{Contour map of the joint probability of $\epsilon$ and

655: $\gamma$, for the simulation in Fig.~2.}

656: \end{figure}

657:

658: \section{Discussion}

659:

660: Existing methods of solar flare prediction do not make complete

661: use of an important source of information: the time history of flares

662: already observed in the active region of interest, in particular

663: frequently occurring small events.

664: A new method for flare prediction is presented which exploits the

665: observed history of flaring from an active region to improve an initial

666: prediction, which may come, e.g., from one of the existing methods.

667: To make the example concrete we may think of the initial prediction

668: coming from the McIntosh sunspot classification, which is a common

669: basis for prediction. This background information provides an initial

670: estimate for the expected flaring rate through a prior distribution

671: $\Lambda_{\rm MC}(\lambda_1)$, which represents the probability that

672: the flaring rate above a (small) size $S_1$ is $\lambda_1$, given

673: historical rates of occurrence of flares for the given McIntosh

674: class. Bayes's theorem is then used to estimate the probability

675: $\epsilon$ of observing a large flare (above size $S_2$) in a given

676: period of time, based on this prior information and on the sequence of

677: flares already produced by the active region, and assuming simple

678: phenomenological rules describing the occurrence of flares.

679: In this paper the basic theory behind the inference of $\epsilon$

680: based on observed data is presented. The inclusion of background

681: information [i.e.\ the construction of the priors

682: $\Lambda_{\rm MC}(\lambda_1)$] is yet to be done.

683:

684: The method relies on event sizes following the phenomenological

685: law~(\ref{eq:pldist}). Some studies of very small extreme

686: ultraviolet events (`nanoflares') suggest that their thermal energies

687: follow a steeper distribution than energies of large events

688: (e.g.\ Krucker and Benz 1998; Parnell and Jupp 2000), although this

689: remains controversial (e.g.\ Aschwanden and Parnell 2002).

690: From the point of view of the prediction method presented here,

691: the uncertainty over the low-size end of the distribution is irrelevant

692: provided events significantly larger than nanoflares are used.

693: In any case the observed distributions from many active

694: regions may be examined as a check on Equation~(\ref{eq:pldist}).

695: A related point is that the distribution~(\ref{eq:pldist}) requires

696: a cutoff at large sizes on energetics grounds, and neglect of this

697: cutoff will lead to the number of large flares being overestimated.

698: A cutoff will be incorporated before the method is applied to real data.

699:

700: The choice of the quantity $S$ has not been addressed, although a good

701: choice is likely to be important to the method. Most flare forecasting

702: deals with soft X-ray events, in particular prediction of GOES

703: (Geostationary Operational Environmental Satellite) M and X class

704: events (events with peak fluxes greater than $10^{-5}$~W/m$^2$

705: and $10^{-4}$~W/m$^2$ respectively in the 1--8~\AA\ band observed by

706: the satellites). A practical motivation for this is that flare

707: soft X-ray emission causes disturbances of the ionosphere which affect

708: shortwave radio communication, and there is a need to predict these

709: occurrences. A disadvantage of using GOES events is that they are not

710: ideal for flare statistics, e.g.\ because of problems with event selection

711: due to the large soft X-ray background (see Wheatland 2001).

712:

713: A number of other issues also need to be considered before the method is

714: implemented with real data. A point neglected so far is that active regions

715: evolve, so that predictions based on the traditional methods also

716: change with time. For example, an active region evolves through McIntosh

717: classifications (e.g.\ Bornmann, Kalmbach, Kulhanek, and Casale 1990).

718: Changes in background information such as this should be incorporated

719: through changes in the prior, and this question will be considered in more

720: detail in future work. A related point concerns the construction of the

721: prior distributions for rate. It is likely that the McIntosh classification

722: will be used, although other possibilities will be considered. The

723: problem is then to determine the probability of a given McIntosh class

724: having a given rate, based on observed flaring sequences in the

725: historical record for active regions of that class.

726: The details of this calculation will be addressed in future work.

727:

728: Finally, as with all methods of forecasting, it is essential to

729: test the reliability of the method. It is straightforward to compare,

730: after the fact, the number of predicted and the number of observed

731: events for a large sample of active regions. The method presented here

732: will be implemented and tested in this way, and the results compared

733: with existing methods of prediction.

734:

735: \section*{Acknowledgements}

736:

737: M.S.W. acknowledges the support of an Australian Research Council

738: QEII Fellowship, and thanks Richard Thompson and Garth Patterson

739: of the Ionospheric Prediction Service for useful discussions. The

740: comments of an anonymous referee have also helped to improve the

741: paper.

742:

743: \begin{thebibliography}{}

744: \small

745: %

746: \bibitem[Abramowitz and Stegun 1964]{abr&ste64}

747:   Abramowitz, M., \& Stegun, I.A. 1964, Handbook of Mathematical

748:   Functions, National Bureau of Standards, Applied Mathematics Series

749:   volume 55.

750: %

751: \bibitem[Aschwanden and Parnell 2002]{asc&par02}

752:   Aschwanden, M.J., \& Parnell, C.E. 2002, \apj ~572, 1048.

753: %

754: \bibitem[Bai 1993]{bai93} Bai, T. 1993, \apj ~404, 805.

755: %

756: \bibitem[Bornmann, Kalmbach, Kulhanek, \& Casale 1990]{bor&90}

757:   Bornmann, P.L., Kalmbach, D., Kulhanek, D., \& Casale, A.

758:   1990, in

759:   Solar-Terrestrial Predictions: Proceedings of a Workshop in

760:   Leura, Australia, October 16-20, 1989, Volume 1, eds.\ R.J. Thompson,

761:   D.G. Cole, P.J. Wilkinson, M.A. Shea, D. Smart, \& G. Heckman,

762:   (NOAA Environmental Research Laboratories: Boulder, Colorado),

763:   301.

764: %

765: \bibitem[Bornmann and Shaw 1994]{bor&sha94}

766:   Bornmann, P.L., \&  Shaw, D. 1994, Sol.\ Phys.\ 150, 127.

767: %

768: \bibitem[Crosby, Aschwanden and Dennis 1993]{cro&93}

769:   Crosby, N.B., Aschwanden, M.J., \& Dennis, B.R. 1993,

770:   Sol.\ Phys.\ 143, 275.

771: %

772: \bibitem[Gallagher, Moon, and Wang 2002]{gal&02}

773:   Gallagher, P.T., Moon, Y.-J., \& Wang, H. 2002, Sol.\ Phys.\

774:   209, 171.

775: %

776: \bibitem[Jaynes 2003]{jay03}

777:   Jaynes, E.T. 2003, Probability Theory: The Logic of Science (Cambridge

778:   University Press: Cambridge).

779: %

780: \bibitem[Krucker and Benz 1998]{kru&ben98}

781:   Krucker, S., \& Benz, A.O. 1998, \apj ~501, L213.

782: %

783: \bibitem[McIntosh 1990]{mci90}

784:   McIntosh, P.S. 1990, Sol.\ Phys.\ 125, 251.

785: %

786: \bibitem[Moon et al.\ 2001]{moo&01}

787:   Moon, Y.-J., Choe, G.S., Yun, H.S., \& Park, Y.D. 2001,

788:   \jgr ~106, 29951.

789: %

790: \bibitem[Neidig, Wiborg and Seagraves 1990]{nei&90}

791:   Neidig, D.F., Wiborg, P.H., \& Seagraves, P.H. 1990, in

792:   Solar-Terrestrial Predictions: Proceedings of a Workshop in

793:   Leura, Australia, October 16-20, 1989, Volume 1, eds.\ R.J. Thompson,

794:   D.G. Cole, P.J. Wilkinson, M.A. Shea, D. Smart, \& G. Heckman,

795:   (NOAA Environmental Research Laboratories: Boulder, Colorado),

796:   541.

797: %

798: \bibitem[Parnell and Jupp 2000]{par&jup00}

799:   Parnell, C.E., \& Jupp, P.E. 2000, \apj ~529, 554.

800: %

801: \bibitem[Scargle 1998]{sca98} Scargle, J.D. 1998, \apj ~504, 405.

802: %

803: \bibitem[Sivia 1996]{siv96} Sivia, D.S. 1996, Data Analysis: A

804:   Bayesian Tutorial, (Clarendon Press: Oxford).

805: %

806: \bibitem[Wheatland 2000]{whe00} Wheatland, M.S. 2000, \apj ~532, 1209.

807: %

808: \bibitem[Wheatland 2001]{whe01} Wheatland, M.S. 2001, Sol.\ Phys.\

809:   203, 87.

810: %

811: \end{thebibliography}

812:

813: \end{document}

814:

815: