0805.0968/ms.tex
1: \documentclass[12pt,preprint]{aastex}
2: 
3: \shorttitle{Statistical Modeling of Solar Flare Activity} \shortauthors{Stanislavsky et al.}
4: 
5: \begin{document}
6: 
7: \title{Statistical Modeling of Solar Flare Activity from \\
8: Empirical Time Series of Soft X-ray Solar Emission}
9: 
\author{A. A. Stanislavsky}
11: \affil{Institute of Radio Astronomy, National Academy of Sciences
12: of Ukraine,\\ 4 Chervonopraporna St., Kharkov 61002, Ukraine}
13: \email{alexstan@ri.kharkov.ua}
14: 
15: \author{K. Burnecki, M. Magdziarz, A. Weron}
16: \affil{Hugo Steinhaus Center, Institute of Mathematics and\\
17: Computer Science, Wroc{\l}aw University of Technology,\\ Wyb.
18: Wyspia\'{n}skiego 27, 50-370 Wroc{\l}aw, Poland}
19: \and
20: 
21: \author{K. Weron}
22: \affil{Institute of Physics, Wroc{\l}aw University of Technology,\\
23: Wyb. Wyspia\'{n}skiego 27, 50-370 Wroc{\l}aw, Poland}
24: 
25: 
26: \begin{abstract}
27: 
We analyze a time series of soft X-ray emission observed by GOES during 1974--2007. We
show that in the periods of high solar activity (1977--1981, 1988--1992, 1999--2003) the
energy statistics of soft X-ray solar flares of classes M and C are well described by a
FARIMA time series with Pareto innovations. The model captures two effects: long-range
dependence (long-term memory) and heavy-tailed distributions. The corresponding parameters are
statistically stable within these periods. However, when the solar activity tends to its
minimum, they change substantially. We discuss possible causes of this evolution and
suggest a statistical model for predicting the flare energy statistics.
36: 
37: 
38: \end{abstract}
39: \keywords{Sun: activity --- Sun: flares --- Sun: X-rays, gamma rays --- methods: data
40: analysis --- methods: statistical}
41: 
42: 
43: \section{Introduction}
44: \label{Introduction}
45: 
46: 
47: 
Individual solar cycles differ in form, amplitude and
length. At present, accurate solar data are available only
for the three most recent cycles. Understanding the
long-term solar variability and predicting solar activity is
a pressing problem in solar physics. It is very important to
predict the time and strength of solar disturbances because they
can pose serious threats to man-made spacecraft, can
disrupt electronic communication channels and can even induce huge
electrical currents in power grids (\citeauthor{Clark06}
\citeyear{Clark06}). It suffices to recall the serious problems
experienced by GOES, Deutsche Telekom, Telstar 401, etc. Satellite
operators would prefer to avoid such unpleasant surprises, and
mission planners have to take the space weather forecast into
account. Not only have NASA satellites malfunctioned
because of such disturbances, but the global positioning system has also been
impaired. The airline industry incurred costs as planes were
re-routed to lower altitudes, burning more fuel because of
atmospheric drag.
66: 
As the geological records show, the Earth's climate has always
been changing. The reasons for such changes, however, have long
been debated and are still not well
understood. In addition to natural climate changes, the risk of
human influence on climate is also taken seriously. Any factor
that alters the radiation received from the Sun or lost to space
will affect climate. For example, \citeauthor{Mann98} (\citeyear{Mann98})
detected a significant correlation between solar
irradiance and reconstructed Northern Hemisphere temperature. The
statistics indicate that during the ``Maunder Minimum'' of solar
activity the climate was especially cold, whereas the increase
of solar irradiance from the early nineteenth century
to the mid-twentieth century coincides with
a general warming. A purely radiative explanation, however, would imply either
unrealistically large variations in total solar irradiance or a
higher climate sensitivity to radiative forcing than is normally
accepted. Therefore, other mechanisms have to be invoked. The most
promising candidate is a change in cloud formation, because clouds
have a very strong impact on the radiation balance and because
only little energy is needed to change the cloud formation
process. According to satellite records taken from 1979 to 1992,
the Earth was 3\% cloudier during solar minima than at solar maxima
(\citeauthor{Svensmark97} \citeyear{Svensmark97}). One of the ways
to influence cloud formation might be through the cosmic ray flux,
which is strongly modulated by the varying solar activity.
\citeauthor{Scafetta03} (\citeyear{Scafetta03}) argue that the Earth's
short-term temperature anomalies inherit a L\'evy-walk memory
component from the intermittency of solar flares.
95: 
The aim of this paper is to present a statistical model for
predicting soft X-ray solar burst activity in the period of solar
cycle maxima. The paper is organized as follows. The random
features of solar activity are outlined in Section \ref{activity}.
The data set is described in Section \ref{flares}. In Section
\ref{predict} we present the essence of our statistical
investigation of soft X-ray (SXR) flares. Interrelations
among statistical flare parameters, such as the long-range dependence
index and the Hurst exponent for the X-ray flux, and their evolution during
solar cycles are analyzed. In the period of strong solar activity
the index of self-similarity is nearly constant (Section
\ref{results}). This feature can be used for predicting the power
of soft X-ray emission for the 24th solar cycle near its activity maximum
(2010--2014). The corresponding model is constructed in Section
\ref{sectionempev}. Finally, a summary and discussion of the
main results, together with the conclusions, are given in Section~\ref{conclusions}.
112: 
113: \section{Randomness in Solar Activity}\label{activity}
114: 
The solar 11-year cycle is driven by the Sun's magnetic field, which
is produced by a hydromagnetic dynamo process
underneath the solar surface and is cyclic in nature. This
fundamental theoretical idea was established by
\citeauthor{Parker55} (\citeyear{Parker55}). However, only within
the last few years have theoretical models of the solar dynamo
become sophisticated enough to explain various aspects of
solar activity. Recently,
\citeauthor{Gilman06} (\citeyear{Gilman06}) made the first
attempt to use a theoretical dynamo model to predict the
strength of the upcoming cycle 24. They concluded that this cycle
will be the strongest in 50 years. Later, however,
\citeauthor{Choudhuri07} (\citeyear{Choudhuri07}) pointed out that
some assumptions in the Dikpati-Gilman model are
unjustified. In contrast, their model, based on the earlier work of
\citeauthor{Nandy02} (\citeyear{Nandy02}), predicts that cycle 24 will be
weaker than cycle 23. The key problem here is the following.
The dominant processes, such as magnetic field
advection and toroidal field generation by differential rotation, are fairly
regular during the rising phase of a cycle from a minimum to a
maximum, and hence a good knowledge of the magnetic configuration during a
minimum would enable a good theoretical model to predict the next
maximum reliably. However, the dominant process in the declining
phase of a cycle is the poloidal field generation by the
Babcock-Leighton mechanism, which involves randomness (the primary
cause of solar cycle fluctuations) and cannot be predicted in
advance by any deterministic model. Indeed, although active
regions appear in a latitude belt at a certain phase of the solar
cycle, where exactly within this belt they appear
seems random. Since the poloidal field generated from an active
region depends on its tilt, the scatter in the tilts introduces
randomness into the poloidal field generation process.
147: 
Another feature of solar activity is the ``magnetic
persistence'' between the surface polar fields and the spot-producing
toroidal fields generated by differential rotation shearing
(\citeauthor{Dikpati06} \citeyear{Dikpati06}). This means that the
Sun retains a memory of its magnetic field for a long time (about
20 years or so). The approach to solar cycle prediction is similar to that
employed in global atmospheric dynamics over the last ten years.
Such models predict changes in certain global characteristics of a
cycle, without attempting to reproduce details that occur on
smaller spatial scales and shorter time scales. The interrelation
between global characteristics and small-scale processes is an
open problem; meanwhile, some small-scale effects are
included in parametric form.
161: 
162: \section{X-ray Flare Observations}\label{flares}
163: 
164: Solar activity is a many-sided phenomenon. It includes flares, prominence eruptions,
165: coronal mass ejections, solar energetic particles, various radio bursts, high-speed solar
166: wind streaming from coronal holes, etc. Solar flares are the most energetic and violent
167: events occurring in the solar atmosphere. The energy release in a flare ranges from
168: 10$^{26}$ to 3$\times$10$^{32}$ ergs. Magnetic reconnection is considered to play a
169: central role in any flare energy release.
170: 
Observations of solar flare phenomena in X-rays became possible in
the 1960s with the availability of space-borne instrumentation.
Since 1974 broad-band soft X-ray emission of the Sun has been
measured almost continuously by the meteorological satellites
operated by NOAA, such as the Synchronous Meteorological Satellite
(SMS) and the Geostationary Operational Environmental Satellite
(GOES). The first GOES was launched by NASA in 1975, and the GOES
series extends to the currently operational GOES 11 and GOES 12.
From 1974 to 1986 the soft X-ray records were obtained by at least
one GOES-type satellite; starting with 1983, data from two and
even three co-operating GOES are generally available. The X-ray
182: sensor, part of the space environment monitor system aboard GOES,
183: consists of two ion chamber detectors, which provide whole-sun
X-ray fluxes in the 0.05--0.4 and 0.1--0.8 nm wavelength bands.
185: Solar soft X-ray flares are classified according to their peak
186: burst intensity measured in the 0.1-0.8 nm wavelength band by
187: GOES. The letters (A, B, C, M, X) denote the order of magnitude of
188: the peak flux on a logarithmic scale, and the number following the
189: letter gives the multiplicative factor, i.e., A$n=n\times
190: 10^{-8}$, B$n=n\times 10^{-7}$, C$n=n\times 10^{-6}$, M$n=n\times
10^{-5}$ and X$n=n\times 10^{-4}$ W/m$^2$. In general, $n$ is
given as a decimal number with one decimal place (prior to 1980, $n$ is
listed as an integer). No background subtraction is applied to the
data. The data are publicly available from the NOAA Space
Environment Center site
(http://www.ngdc.noaa.gov/stp/SOLAR/ftpsolarflares.html).
197: 
In the meantime, a wealth of data has been accumulated, which makes it worthwhile
to re-investigate the temporal and spatial features of soft X-ray (SXR) flares on an
extensive statistical basis. \citeauthor{Li98} (\citeyear{Li98}) studied the distribution
of the X-ray flares (M$\geq$1) from 1987 to 1992 with respect to heliographic longitude and
showed that the flares were not uniformly distributed in longitude. The temporal
analysis of X-ray flare statistics mostly concerns the waiting-time distribution (see, for
example, \citeauthor{Boffetta99} \citeyear{Boffetta99}; \citeauthor{Moon01}
\citeyear{Moon01}; \citeauthor{Lepreti01} \citeyear{Lepreti01};
\citeauthor{Weatland02} \citeyear{Weatland02}; \citeauthor{Veronig02} \citeyear{Veronig02}
and so on). In the present analysis we make use of SXR flares observed by GOES during
1976-2006. We restrict our consideration to the time evolution of the energy statistics of soft X-ray
solar flares.
210: 
211: \section{Predictive Tool Description}\label{predict}
212: 
Our analysis is based on the properties of fractional autoregressive integrated moving average (FARIMA) processes
(\citeauthor{Beran94} \citeyear{Beran94}), which are widely used in modeling
various complex physical systems. The FARIMA($p$,$d$,$q$) process is defined
as the solution of the equation $\Phi(B)\Delta^dX(n)=\Theta(B)\epsilon_n$, $n\in{\bf Z}$, where
$B$ is the shift operator, $BX(n)=X(n-1)$, and $\Delta=1-B$ is the difference operator, i.e.
$\Delta X(n)= X(n)-X(n-1)$. Here $\Phi$ and $\Theta$ are polynomials of degree $p$ and $q$,
respectively, $d$ takes fractional values, either positive or negative, and the ``innovations''
$\epsilon_n$ are independent and identically distributed (i.i.d.)
random variables. The polynomials $\Phi$ and $\Theta$ correspond to the autoregressive (AR)
and moving average (MA) parts, respectively. The linear representation of FARIMA processes
takes the form
224: \begin{equation}
225: X(n)=\sum^\infty_{j=0}c_{n-j}\epsilon_j\,,\label{eq1}
226: \end{equation}
for details see \citeauthor{Beran94} (\citeyear{Beran94}). The innovations may be
Gaussian, non-Gaussian with finite variance, or they may have infinite variance.
For infinite-variance innovations $\epsilon$, one may consider, for example, symmetric
and skewed stable distributions, as well as Pareto distributions. Both families are characterized
by the parameter $\alpha$, and their tails $P(\epsilon > x)$ satisfy
232: \begin{equation}
233: P(\epsilon>x)=1-F(x)\sim x^{-\alpha},\qquad {\rm as}\qquad x\to\infty\label{eq2},
234: \end{equation}
where $F(x)$ denotes the corresponding distribution function and $\sim$ denotes that
the ratio of the left-hand side to the right-hand side tends to a positive constant as $x\to\infty$.
Note that L\'evy-stable distributions with such power-law tails have $0<\alpha<2$, whereas for the Pareto
distribution $\alpha$ may be any positive number. The resulting process $X(n)$ is
long-range dependent and L\'evy-stable if the innovations are L\'evy-stable, and it lies asymptotically
in the domain of attraction of a L\'evy-stable distribution if the innovations are Pareto with $\alpha<2$
(see \citeauthor{Samorod94} \citeyear{Samorod94}). Moreover, such FARIMA processes are asymptotically
self-similar with index $H=d+1/\alpha$.
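
As an illustration (a sketch in the notation of this section, not a description of the fitting procedure used later), consider the simplest case FARIMA(0,$d$,0), where $\Phi=\Theta=1$. Writing $c_k$ for the weight at lag $k=n-j$ in expression (\ref{eq1}), the formal binomial expansion of $\Delta^{-d}=(1-B)^{-d}$ gives
\[
c_0=1,\qquad c_k=c_{k-1}\,\frac{k-1+d}{k}=\frac{\Gamma(k+d)}{\Gamma(k+1)\,\Gamma(d)},\qquad k\geq 1,
\]
so that $c_k$ decays like $k^{d-1}/\Gamma(d)$ for large $k$. For $d=0.19$ (the value adopted in Section \ref{sectionempev}) one finds $c_1=0.19$, $c_2\approx 0.113$ and $c_3\approx 0.083$; this slow hyperbolic decay of the weights is what produces the long-term memory for $d>0$.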
243: 
244: The L\'evy-stable distribution, named after
245: the French mathematician Paul L\'evy who investigated the behavior of sums of independent random variables, 
246: is most conveniently described by its characteristic function $\phi(\theta)$ -- the inverse Fourier transform 
247: of the probability density function. The most popular form of the characteristic function of a 
248: L\'evy-stable random variable is given by the expression
249: \begin{equation}
250: \log\phi(\theta) = \cases{ 
251:   -\sigma^\alpha|\theta|^\alpha\{1-i\beta\,{\rm sign}(\theta)\tan(\pi\alpha/2)\}+i\mu \theta,\qquad \alpha\neq 1,\cr
252:   -\sigma\,|\theta|\,\{1+2i\beta\,{\rm sign}(\theta)\log|\theta|/\pi\}+i\mu \theta,\quad\qquad \alpha = 1,}
253: \end{equation}
254: where $0<\alpha\leq 2$, $-1\leq\beta\leq 1$, $\sigma>0$ and $\mu \in {\bf R}$ are parameters of this distribution (\citeauthor{Samorod94} \citeyear{Samorod94}). The Pareto distribution, 
255: introduced by the Italian economist Vilfredo Pareto, is a power law probability density that we represent 
256: in the form of $f(x)=\alpha\lambda^\alpha(\lambda+x)^{-\alpha-1}$, where $\lambda$ and $\alpha$ are positive constants
257: (\citeauthor{Burn05} \citeyear{Burn05}).
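
For orientation, integrating this density gives the Pareto tail in closed form (a short worked step added for the reader; the uniform variable $U$ below is introduced only for illustration):
\[
P(\epsilon>x)=\int_x^\infty \alpha\lambda^\alpha(\lambda+y)^{-\alpha-1}\,dy=\left(\frac{\lambda}{\lambda+x}\right)^{\alpha},\qquad x\geq 0,
\]
which behaves like $\lambda^\alpha x^{-\alpha}$ as $x\to\infty$, in agreement with the power law (\ref{eq2}) up to the constant factor $\lambda^\alpha$. Inverting the tail shows that if $U$ is uniformly distributed on $(0,1)$, then $\lambda\,(U^{-1/\alpha}-1)$ has this Pareto distribution, which gives a simple way to simulate such innovations. The mean is finite only for $\alpha>1$ and then equals $\lambda/(\alpha-1)$.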
258: 
The power-law behavior of the tails implies that the variance is infinite if $\alpha<2$. The
tail index (exponent) $\alpha$ controls the rate of decay of the
tail of the distribution function $F$. Modeling with FARIMA time series with infinite
variance allows one to take heavy tails into account. Through a suitable choice of the coefficients
$c_{n-j}$ one can also add long-term memory effects. FARIMA processes form a useful family
of models because they offer a lot of flexibility in modeling long-range and short-range
dependence through the choice of the memory parameter $d$ and of appropriate autoregressive and moving
average coefficients, which together determine the weights in expression (\ref{eq1}).
267: 
The problem of estimating the exponent in heavy-tailed data has a
long history in statistics because of its practical importance.
The presence of heavy tails in data was first noted in the work
of \citeauthor{Zipf32} (\citeyear{Zipf32}) on word
frequencies in languages. Later,
\citeauthor{Mandelbrot60} (\citeyear{Mandelbrot60}) noted their
presence in financial data. Since the early 1970s heavy-tailed
behavior has been observed in many other scientific fields (see, for
example, the reviews of \citeauthor{Adler98} \citeyear{Adler98};
\citeauthor{Park00} \citeyear{Park00}). However, the availability
of huge amounts of varied data poses a set of new challenges for
the problem of estimating the tail index. The point is that the
data can be contaminated by extraneous oscillations, various
noises with finite variance and so on. This makes the analysis of
heavy-tailed data more complicated (\citeauthor{Janicki94} \citeyear{Janicki94};
\citeauthor{Lynch05} \citeyear{Lynch05}). The time series of soft
X-ray solar emission belongs to such problematic data
(\citeauthor{Baiesi06} \citeyear{Baiesi06}). Therefore, for
reliability we estimate the tail index with several
statistical methods.
One of them is based on the asymptotic {\it max self-similarity}
property of heavy-tailed maxima
(\citeauthor{Stoev06} \citeyear{Stoev06}). In this method the maximum
values of the data are calculated over blocks of size $m$; these maxima scale at
the rate $m^{1/\alpha}$. By examining a sequence of growing block
sizes $m=2^j$, $1\leq j\leq\log_2 N$, $j\in{\bf N}$, where $N$ is the length of the series, and
estimating the mean of the logarithms of the block maxima, one
obtains an estimate of the tail index $\alpha$. The other
estimator that we use assumes a L\'evy-stable law and
applies the \citeauthor{McCulloch86} (\citeyear{McCulloch86}) quantile fit.
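
In formulas, the max self-similarity procedure can be sketched as follows (the symbols $D(j,k)$, $Y(j)$, $N_j$ and $C$ are shorthand introduced here for illustration, and the details of the regression, such as the range of scales $j$ used and any weighting, are left open). For a sample $X(1),\dots,X(N)$ of positive data and each scale $j$, form the $N_j=\lfloor N/2^j\rfloor$ block maxima and average their logarithms,
\[
D(j,k)=\max_{1\leq i\leq 2^j}X\bigl(2^j(k-1)+i\bigr),\qquad
Y(j)=\frac{1}{N_j}\sum_{k=1}^{N_j}\log_2 D(j,k).
\]
Since block maxima of heavy-tailed data grow like $m^{1/\alpha}=2^{j/\alpha}$, one expects $Y(j)\approx j/\alpha+C$ for large $j$, so the slope of a straight line fitted to $Y(j)$ versus $j$ estimates $1/\alpha$.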
298: 
299: \section{Cycling of Self-similarity}\label{results}
300: 
Using the max self-similarity estimator and considering our
data in one-year intervals, we have analyzed how the solar cycle
influences the tail index. Figure \ref{maxspec} shows a clear
correlation between solar activity and the index. When the solar
activity is around its maxima, the tail index is larger than one,
whereas in minima it tends to fall below one. The index
value in the period of high solar activity almost coincides with
the result of \citeauthor{Weron05} (\citeyear{Weron05}). Their
data included only X-ray solar bursts of classes C and M, and the
value of $\alpha$ was estimated to be $1.2674$. The present analysis
extends the tail index analysis to several cycles and confirms
that the tendency observed earlier persists over at least
the three most recent solar cycles.
314: 
The McCulloch quantile method (\citeauthor{McCulloch86} \citeyear{McCulloch86}) gives similar results:
the tail parameter $\alpha$
depends on the level of solar activity during the three solar cycles. Of course, this estimator is
specific in character: it is designed for random variables with a stable
distribution. Nevertheless, the distribution considered here also has heavy tails. Our first aim is to find
the distribution that best fits the experimental series.
321: 
322: \section{Solar flares}
323: \label{sectionempev}
324: 
Now we restrict our attention to the time intervals in which the solar activity is
strong (in particular, 1978-1981, 1988-1992 and 1999-2003). We use X-ray flare data from
the GOES satellites, which contain information about the time of appearance and the energy of solar
flares (from http://www.ngdc.noaa.gov/stp/SOLAR/ftpsolar-flares.html). As an example, we consider the
energy carried by the X-rays emitted during flares on the solar surface from 2000
January 1 to 2002 December 31. We aggregated the energy values on a daily basis. The time
series is presented in Fig.~\ref{data}.
332: 
The first estimation procedure for the self-similarity parameter
$H$ (i.e. the Hurst exponent) is the so-called finite impulse response transformation
(FIRT). The FIRT estimator involves an array of
finite impulse response coefficients. The
estimator $H_{FIRT}$ is obtained by performing a log-linear
regression on these coefficients and measuring the slope
(\citeauthor{SP} \citeyear{SP}). It is important to note that the
estimator $H_{FIRT}$ is unbiased for all $\alpha$ in the range $(0,2)$.
341: 
An alternative method of testing the scaling and correlation
properties of a time series is the variance of residuals
(VR) method (\citeauthor{peng} \citeyear{peng}). First, the series
is divided into blocks of size $m$. Then, within each block, the
partial sums of the series are calculated, a least-squares line
is fitted to the partial sums, and the sample
variance of the residuals is computed. This variance of residuals is
computed for each block, and the median (or average) is taken
over the blocks. Since the variance of residuals
is proportional to $m^{2H}$, a log-log plot of this quantity versus $m$ should follow a straight
line with a slope of $2H$.
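
Written out for one block (a sketch; the symbols $Y_k$, $\hat a_k$, $\hat b_k$ and $V_k$ are introduced here only for illustration), let $Y_k(t)=\sum_{i=1}^{t}X\bigl((k-1)m+i\bigr)$, $t=1,\dots,m$, be the partial sums in the $k$-th block and $\hat a_k+\hat b_k t$ the least-squares line fitted to them. The quantity computed in each block is
\[
V_k(m)=\frac{1}{m}\sum_{t=1}^{m}\bigl(Y_k(t)-\hat a_k-\hat b_k t\bigr)^2 ,
\]
and the scaling relation ${\rm median}_k\,V_k(m)\propto m^{2H}$ means that $H$ is estimated as one half of the slope of $\log {\rm median}_k V_k(m)$ against $\log m$.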
353: 
The R/S method is one of the oldest and best known methods for estimating the Hurst parameter (\citeauthor{Hurst51} \citeyear{Hurst51}). Hurst found that drought in the Nile Valley is not a random phenomenon, but rather that the region is inclined to become progressively more arid after a succession of long droughts. It is widely known that for Gaussian (finite-variance) time series the method returns $H$ (\citeauthor{ManWal69} \citeyear{ManWal69}). The popularity of the method has also been a source of misunderstandings and errors, because in general, for power-law distributed time series (i.e. with infinite variance), the method yields $d+1/2$. In particular, for Pareto or L\'evy-stable distributions the output is $H-1/\alpha+1/2$ (\citeauthor{TaqTev96} \citeyear{TaqTev96}). The method is based on the R/S (rescaled adjusted range) statistic. The series is divided into blocks, and the statistic is calculated within each block. Finally, the arithmetic means of the values of the statistic over the blocks are calculated, and a least-squares line is fitted to the logarithm of the mean versus the logarithm of the block length. The slope should be equal to $d+1/2$.
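
The relations quoted above can be combined in one line. Equating the two expressions for the R/S output, $d+1/2=H-1/\alpha+1/2$, gives
\[
H=d+\frac{1}{\alpha}.
\]
As a purely numerical illustration, for $d=0.19$ and $\alpha=1.25$ (the values adopted later in this section) one gets $H=0.19+0.8=0.99$, while the R/S slope would be $d+1/2=0.69$; the difference between the two is exactly $1/\alpha-1/2=0.3$.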
355: 
For the finite variance cases, the interpretation of the FIRT and VR estimators is very
similar to that of the Hurst exponent: if only short-range correlations (or no correlations at
all) exist in the studied series, then $H_{FIRT}=H_{VR}=1/2$; if there are long-range correlations,
then $H_{FIRT}=H_{VR}\neq 1/2$. Moreover, if the estimator $H_{FIRT}=H_{VR}$ is greater
than $1/2$, the time series is persistent, and if $H_{FIRT}=H_{VR}<1/2$, the time
series is anti-persistent.

Note that both estimators provide information on the memory and not on the distribution of the
process increments.
365: 
The analysis of the data shows that the tails of the underlying
distribution conform to a power law. Hence, we model the data by a FARIMA process with Pareto innovations. As power-law distributions with $\alpha<2$ belong to the domain of attraction of a stable law (see, e.g., \citeauthor{Janicki94} \citeyear{Janicki94}), the resulting distribution of the FARIMA process should be close to a stable one. We applied
the McCulloch quantile fit to obtain the parameters of the
distribution (\citeauthor{McCulloch86} \citeyear{McCulloch86}). The
value of $\alpha$ was estimated to be $1.213$, see Figure \ref{fig_alpha}. One may check that for simulated FARIMA time series with Pareto innovations the value of $\alpha$ is usually underestimated, see e.g. Figure \ref{fig_alpha}. Therefore, we assume that the innovations in our model follow the Pareto law with $\alpha = 1.25$.
371: 
372: According to \citeauthor{Weron05} (\citeyear{Weron05}), in order
373: to recover both the self-similarity exponent $H$ and the memory
374: parameter $d$ (hence, the distribution parameter $\alpha$) we can
375: use the following BMW$^2$ computer test. The surrogate data are
376: obtained here by a random shuffling of the original data
377: positions.
378: 
379: \begin{itemize}
\item If the process is FARIMA with Gaussian noise, then the values of the estimator should
change to $1/2$ for the surrogate data, regardless of the initial values.
\item If the process is FARIMA with $\alpha$-stable or Pareto noise for $\alpha<2$, then the values of the estimator should change to $1/\alpha$ for the surrogate data, regardless of the initial values.
383: \end{itemize}
384: 
The above formalism can be easily applied to determine the basic
features of an empirical data series. We now employ it to
study the empirical time series of the
energy of solar flares (Fig. \ref{data}). The obtained values
of the parameters are listed in Table~\ref{tab1}. From
the results for the surrogate data, the corresponding estimates
of the parameter $1/\alpha$ are
$1/\alpha_{FIRT}=H_{FIRT}=0.8452$ and
$1/\alpha_{VR}=H_{VR}=0.7722$. We observe that these estimates are
close to the value assumed in our model: $1/\alpha_{MC}=0.8$. Moreover, we choose $d = 0.19$ as the highest admissible value of $d$, which is close to the one obtained via the R/S method for the original data, see Table \ref{tab1}.
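
Inverting the surrogate-data estimates makes the comparison explicit (simple arithmetic on the numbers quoted above):
\[
\alpha_{FIRT}=\frac{1}{0.8452}\approx 1.183,\qquad \alpha_{VR}=\frac{1}{0.7722}\approx 1.295,
\]
so the adopted value $\alpha=1.25$ lies between the two. If, as we assume here on the basis of the standard theory of FARIMA processes with infinite-variance innovations and $1<\alpha<2$, the admissible memory parameters satisfy $d<1-1/\alpha$, then for $\alpha=1.25$ this bound equals $0.2$, which is consistent with taking $d=0.19$.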
395: 
One may notice that the estimates of $H$ obtained via the FIRT and VR methods, see Table \ref{tab1}, are greater than theoretically admissible in the FARIMA model, i.e. they exceed one. As stated in \citeauthor{Burn08} (\citeyear{Burn08}), this can be justified by performing simulations of FARIMA processes and estimating the parameter $H$ on the simulated time series via different methods. It appears that a reasonable percentage of the simulated values of $H$ is higher than the data estimates, so one cannot reject the hypothesis that the underlying model is the FARIMA(0,$d$,0) process. Nevertheless, we now look for an enhanced model which would better describe the behavior of the different estimators obtained for the original and shuffled data, and would improve the fit in terms of the prediction error. Thus, we propose a slight generalization of the model incorporating a short-range dependence component, namely the FARIMA(2,$d$,0) model. We estimated the AR(2) coefficients $a_1$ (linear term of $\Phi$) and $a_2$ (quadratic term) via a mean-square error (MSE) minimization scheme taking into account three statistics: FIRT, VR and R/S, see Figure \ref{fig_mse}. The estimated values are $a_1 = 0.02$ and $a_2 = 0.03$. The FARIMA processes were generated
according to the algorithm presented by \citeauthor{Stoev04} (\citeyear{Stoev04}).
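
For concreteness, and assuming the sign convention $\Phi(B)=1-a_1B-a_2B^2$ (this convention is an assumption made here for illustration; with the opposite convention the coefficients change sign), the FARIMA(2,$d$,0) model reads
\[
(1-a_1B-a_2B^2)\,\Delta^dX(n)=\epsilon_n .
\]
With $a_1=0.02$ and $a_2=0.03$ the roots of $1-0.02z-0.03z^2=0$ are $z\approx 5.45$ and $z\approx -6.12$. Both lie outside the unit circle, so the autoregressive part is causal, and because the coefficients are small it only mildly modifies the FARIMA(0,$d$,0) weights at short lags.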
398: 
We calculate the $H_{FIRT}$, $H_{VR}$ and $d_{RS}$ estimators for the
simulated FARIMA(2,$d$,0) processes (top
panels in Figs. \ref{fig_box_est} and \ref{fig_box_rs}) and the corresponding shuffled data (bottom panels in Figs. \ref{fig_box_est} and \ref{fig_box_rs}). We generate $1000$ trajectories of
length $2^{10}$, which is close to the length of the original solar data, i.e. 1089, and
present the results in the form of so-called box plots. The
box plot has lines at the lower quartile, median, and upper quartile
values. The whiskers are lines extending from each end of the box
to show the extent of the rest of the data. Outliers are data with
values beyond the ends of the whiskers. If there are no data
outside the whiskers, a dot is placed at the bottom whisker. The results
agree with the analysis of the original solar data included in
Table~\ref{tab1}. Thus, our FARIMA simulations reproduce
well the structure of the original solar flare data.
412: 
We conclude that a proper model can be based on the FARIMA(2,$d$,0) process with Pareto innovations and
the parameters $a_1 = 0.02$, $a_2 = 0.03$, $d=0.19$, and $\alpha = 1.25$, which has the
long-range dependence property since $d>1-2/\alpha$ (\citeauthor{Burn08} \citeyear{Burn08}). The Pareto distribution is also convenient for the present analysis
because it describes positive random variables, whereas the L\'evy-stable distribution with $1<\alpha<2$ takes
both positive and negative values, while a time series of X-ray flare energy is positive by definition.
418: 
419: We also calculated the 1-day-ahead prediction for the FARIMA(2,$d$,0) time series (for the prediction discussion in the infinite variance FARIMA case see \citeauthor{Kokoszka95}\citeyear{Kokoszka95}). The results are depicted in Figure \ref{fig_pred}.
420: 
421: 
422: \section{Concluding Remarks and Discussion}
423: \label{conclusions}
424: 
425: In this paper we demonstrate how self-similar models driven by
426: L\'evy stable noise can be useful for modeling X-ray solar data.
427: To be more precise we have suggested the FARIMA(2,$d$,0) model with
428: $\alpha$-stable noise for predicting solar flare appearance in the
429: period of a strong solar activity. 
430: 
431: The procedure is illustrated in Section \ref{sectionempev} for the
432: captured energy transmitted by X-rays emitted during blasts on a solar surface from 2000
433: January 1 to 2002 December 31. Comparing the values of the different estimators for the
434: original data series and for the surrogate data we estimated the components of the
435: self-similarity index corresponding to the memory of the time series ($d$) and to the
436: tail properties of the time series values distribution ($\alpha$), see Table \ref{tab1}.
This allows one, in principle, to build a proper physical model for analyzing the
solar activity.
439: 
The analysis of soft X-ray emission observations shows that this
series is quite complicated in nature. We have seen a strong
dependence of the statistics on the solar cycle in the data. However, if
one takes one-year intervals, this influence of the cycle becomes less
strong. This allows one to construct a statistical model for
predicting the soft X-ray solar activity in the next solar
cycle. The soft X-ray solar time series contains both long-range
dependence and heavy-tailed effects. The heavy tails create a
random number of strong bursts on a background, and the long-range dependence
forms the persistence among them. The most convenient
model for their joint description is a FARIMA time series. In view
of the solar cycle, the model permits one to predict a time series of
soft X-ray solar flares when the solar activity is again
near its maximum in 2010--2014. While we do not
claim that this model provides the only possible explanation, it
does provide a rigorous statistical picture of the expected observations of
X-ray solar flares.
457: 
458: 
459: \acknowledgements
460: 
A.A.S. is grateful to the Institute of Physics and the Hugo
Steinhaus Center for Stochastic Methods for their kind hospitality
during his visit to Wroc{\l}aw University of Technology.
464: The GOES X-ray light curve was made available courtesy of the NOAA
465: Space Environment Center, Boulder, CO.
466: 
467: 
468: \begin{thebibliography}{}
469: 
\bibitem[Adler et al.(1998)]{Adler98}
Adler, R., Feldman, R., \& Taqqu, M.S. 1998, A Practical Guide to Heavy Tails: Statistical
Techniques and Applications (Boston: Birkh\"auser)
473: 
474: \bibitem[Baiesi et al. (2006)]{Baiesi06}
475: Baiesi, M., Paczuski, M., \& Stella, A.L. 2006, \prl, 96, 051103
476: 
477: 
478: \bibitem[Beran (1994)]{Beran94}
479: Beran, J. 1994, Statistics for Long-Memory Processes (New York: Chapman \& Hall) 
480: 
481: \bibitem[Boffetta et al.(1999)]{Boffetta99}
482: Boffetta, G., Carbone, V., Giuliani, P., Veltri, P., \& Vulpiani, A. 1999, \prl,
483:  83, 4662
484: 
485: \bibitem[Burnecki et al.(2008)]{Burn08}
Burnecki, K., Klafter, J., Magdziarz, M., \& Weron, A. 2008, Physica A, 387, 1077
487: 
488: \bibitem[Burnecki et al.(2005)]{Burn05}
Burnecki, K., Misiorek, A., \& Weron, R. 2005, in Statistical Tools for Finance and Insurance,
ed. P. \v{C}\'{\i}\v{z}ek, W. H\"ardle, \& R. Weron (Berlin: Springer), 289
491: 
492: 
493: \bibitem[Choudhuri et al. (2007)]{Choudhuri07}
494: Choudhuri, A.R., Chatterjee, P., \& Jiang, J. 2007, \prl, 98, 131103
495: 
496: \bibitem[Clark(2006)]{Clark06}
497: Clark, S. 2006, \nat, 441, 402
498: 
\bibitem[Dikpati \& Gilman (2006)]{Gilman06}
Dikpati, M., \& Gilman, P.A. 2006, \apj, 649, 498
501: 
502: \bibitem[Dikpati et al. (2006)]{Dikpati06}
503: Dikpati, M., de Toma, G., \& Gilman, P.A. 2006, \grl, 33, L05102
504: 
505: \bibitem[Hurst (1951)]{Hurst51}
506: Hurst, H.E. 1951, Transactions, American Society of Civil Engineers, 116, 770
507: 
508: \bibitem[Kokoszka (1995)]{Kokoszka95}
509: Kokoszka, P.S. 1995, Probab. Math. Statist., 16, 83
510: 
511: 
512: \bibitem[Janicki \& Weron (1994)]{Janicki94}
Janicki, A., \& Weron, A. 1994, Simulation and Chaotic Behavior of $\alpha$-Stable
Stochastic Processes (New York: Dekker)
515: 
516: \bibitem[Lepreti et al. (2001)]{Lepreti01}
Lepreti, F., Carbone, V., \& Veltri, P. 2001, \apjl, 555, L133
518: 
519: \bibitem[Li et al.(1998)]{Li98}
520: Li, K.-J., Schmieder, B., \& Li, Q.-Sh. 1998, \apjs, 131, 99
521: 
522: \bibitem[Lynch et al.(2005)]{Lynch05}
Lynch, V.E., Carreras, B.A., Sanchez, R., LaBombard, B., van Milligen, B.Ph., \& Newman,
D.E. 2005, Phys. Plasmas, 12, 052304
525: 
526: \bibitem[Mandelbrot (1960)]{Mandelbrot60}
527: Mandelbrot, B.B. 1960, Int. Econ. Rev., 1, 79
528: 
529: \bibitem[Mandelbrot \& Wallis (1969)]{ManWal69}
530: Mandelbrot, B.B.  \&  Wallis, J. R. 1969, Water Resources Research, 5, 228
531: 
532: \bibitem[Mann et al. (1998)]{Mann98}
533: Mann, M.E., Bradley, R.S., \& Hughes, M.K. 1998, \nat, 392, 779
534: 
535: \bibitem[McCulloch (1986)]{McCulloch86}
536: McCulloch, J.H. 1986, Comm. Statist. Simulation Comput., 15, 1109
537: 
538: \bibitem[Moon et al.(2001)]{Moon01}
539: Moon, Y.-J., Choe, G.S., Yun, H.S., \& Park, Y.D. 2001, J. Geophys. Res., 106, A12, 29951
540: 
541: \bibitem[Nandy \& Choudhuri (2002)]{Nandy02}
542: Nandy, D., \& Choudhuri, A. R. 2002, Science, 296, 1671
543: 
544: \bibitem[Park \& Willinger (2000)]{Park00}
545: Park, K., \& Willinger, W. 2000, Self-Similar Network Traffic and Performance Evaluation
546: (New York: J. Wiley \& Sons, Inc)
547: 
548: \bibitem[Parker (1955)]{Parker55}
549: Parker, E.N. 1955, \apj, 122, 293
550: 
551: \bibitem[Peng et al.(1994)]{peng}
Peng, C.-K., Buldyrev, S. V., Havlin, S., Simons, M., Stanley, H. E., \& Goldberger, A.
L. 1994, \pre, 49, 1685
554: 
555: \bibitem[Samorodnitsky \& Taqqu (1994)]{Samorod94}
Samorodnitsky, G., \& Taqqu, M. S. 1994, Stable Non-Gaussian Random Processes (New York: Chapman \& Hall)
557: 
558: \bibitem[Scafetta \& West (2003)]{Scafetta03} Scafetta, N., \& West, B.J. 2003, \prl, 90, 248701
559: 
\bibitem[Stoev et al. (2002)]{SP} Stoev, S., Pipiras, V., \& Taqqu, M.S.
2002, Signal Processing, 82, 1873
562: 
563: \bibitem[Stoev \& Taqqu (2004)]{Stoev04}
564: Stoev, S., \& Taqqu, M. 2004, Fractals,  12, 1, 95
565: 
566: \bibitem[Stoev \& Michailidis (2006)]{Stoev06}
567: Stoev, S., \& Michailidis, G. 2006, Technical Report, 447, Department of Statistics,
568: University of Michigan, http://www.stat.lsa.umich.edu/$\sim$sstoev/max-spectrum-dep.pdf
569: 
\bibitem[Svensmark \& Friis-Christensen (1997)]{Svensmark97} Svensmark, H., \&
Friis-Christensen, E. 1997, J. Atmos. Solar-Terr. Phys., 59, 1225
572: 
573: \bibitem[Taqqu \& Teverovsky (1998)]{TaqTev96}
Taqqu, M.S., \& Teverovsky, V. 1998, in A Practical Guide to Heavy Tails: Statistical Techniques and
  Applications, ed. R. Adler, R. Feldman, \& M. S. Taqqu (Boston: Birkh\"auser), 177
576: 
577: \bibitem[Veronig et al.(2002)]{Veronig02}
Veronig, A., Temmer, M., Hanslmeier, A., Otruba, W., \& Messerotti, M. 2002, \aap, 382,
1078
580: 
581: \bibitem[Weron (2001)]{RW01} Weron, R. 2001, Int. J. Mod. Phys. C, 12, 209
582: 
583: 
584: \bibitem[Weron (2002)]{RWer02}
585: Weron, R. 2002, Physica A, 312, 285
586: 
587: \bibitem[Weron et al.(2005)]{Weron05}
588: Weron, A., Burnecki, K., Mercik, Sz., \& Weron, K. 2005, \pre, 71, 016113
589: 
590: \bibitem[Wheatland (2002)]{Weatland02}
591: Wheatland, M.S. 2002, \solphys, 208, 33
592: 
593: \bibitem[Zipf (1932)]{Zipf32}
Zipf, G. 1932, Selected Studies of the Principle of Relative Frequency in Language (Cambridge, MA: Harvard
University Press)
596: 
597: 
598: 
599: 
600: 
601: \end{thebibliography}
602: 
603: \clearpage
604: 
605: \begin{figure}
606: \begin{center}
607: \includegraphics[width=11.cm]{f1.eps}
\caption{Evolution of the tail index $\alpha$ over the solar cycles of 1974--2006
(top panel), obtained via the max self-similarity method. The bottom panel shows the Wolf numbers (characterizing solar activity) for
the same period.} \label{maxspec}
611: \end{center}
612: \end{figure}
613: 
614: \clearpage
615: 
616: \begin{figure}
617: \begin{center}
618: \includegraphics[width=15.cm]{f2.eps}
619: \end{center}
620: \caption{Energy-time series of solar flares from 2000 January 01 to 2002 December
621: 31.} \label{data}
622: \end{figure}
623: 
624: \clearpage
625: 
626: \begin{figure}
627: \begin{center}
628: \includegraphics[width=11.cm]{f3.eps}
629: \end{center}
\caption{Values of the calculated $\alpha$ estimators for the simulated FARIMA time series. The solid line represents the value of the estimator for the analyzed data.}\label{fig_alpha}
631: \end{figure}
632: 
633: \clearpage
634: 
635: \begin{figure}
636: \begin{center}
637: \includegraphics[width=11.cm]{f4.eps}\\
638: \end{center}
\caption{Mean squared error on the basis of the calculated $H$ and $d$ estimators for the simulated FARIMA time series with respect to the estimators obtained for the analyzed data, for different $a_1$ and $a_2$.}\label{fig_mse}
640: \end{figure}
641: 
642: \clearpage
643: 
644: \begin{figure}
645: \begin{center}
646: \includegraphics[width=8.cm]{f5a.eps}\includegraphics[width=8.cm]{f5c.eps}\\
647: \includegraphics[width=8.cm]{f5b.eps}\includegraphics[width=8.cm]{f5d.eps}
648: \end{center}
\caption{Values of the FIRT and VR estimators for the simulated time series (top
panels) and the surrogate data (bottom panels) for the generated FARIMA processes. The solid line represents the value of the estimator for the analyzed data.}\label{fig_box_est}
651: \end{figure}
652: 
653: \clearpage
654: 
655: \begin{figure}
656: \begin{center}
657: \includegraphics[width=8.cm]{f6a.eps}\\
658: \includegraphics[width=8.cm]{f6b.eps}
659: \end{center}
\caption{Values of the RS estimator $H_{RS}$ for the simulated time series (top panel)
and the surrogate data (bottom panel) for the generated FARIMA processes. The solid line represents the value of the estimator for the analyzed data.}\label{fig_box_rs}
662: \end{figure}
663: 
664: 
665: \clearpage
666: 
667: \begin{figure}
668: \begin{center}
669: \includegraphics[width=11.cm]{f7.eps}
670: \end{center}
\caption{Solar flare data and prediction in the FARIMA(2,$d$,0) model.}\label{fig_pred}
672: \end{figure}
673: 
674: \clearpage
675: 
676: \begin{table}
677: \begin{center}
678: \caption{Values of the FIRT, VR and RS estimators for the original time series and the
679: shuffled (surrogate) solar flare data.}\label{tab1}
680: \begin{tabular}{cccc}
681: \\
\tableline\tableline
Data set & $H_{FIRT}$ & $H_{VR}$ & $d_{RS}$ \\
\tableline
\multicolumn{4}{c}{Original time series} \\
\tableline
Solar flares & $1.1424$ & $1.0665$ & $0.2408$ \\
\tableline
\multicolumn{4}{c}{Surrogate data} \\
\tableline
Solar flares & $0.8452$ & $0.7722$ & $0.0507$ \\
\tableline
686: \end{tabular}
687: \end{center}
688: \end{table}
689: 
690: \end{document}
691: