physics0402026/srp.tex
1: \documentclass{article}
2: 
3: \usepackage{graphicx}
4: \usepackage{psfig}
5: \usepackage{epsfig}
6: \usepackage[round]{natbib}
7: 
8: \setlength{\hoffset}{-1in}\setlength{\oddsidemargin}{2.5cm}
9: \setlength{\textwidth}{16cm} \setlength{\voffset}{-1in}
10: %\setlength{\topmargin}{1cm} \setlength{\textheight}{11cm}
11: \setlength{\topmargin}{1cm} \setlength{\textheight}{25cm}
12: \setlength{\unitlength}{1cm} \setlength{\parindent}{0cm}
13: 
14: \bibliographystyle{plainnat}
15: 
16: \title{Improving probabilistic weather forecasts using seasonally varying calibration parameters}
17: 
18: \author{Stephen Jewson\footnote{\emph{Correspondence address}: RMS, 10 Eastcheap,
19: London, EC3M 1AJ, UK. Email: \texttt{x@stephenjewson.com}}\\
20: RMS, London, United Kingdom}
21: \begin{document}
22: 
23: \newcommand{\bx}[1]{\fbox{\begin{minipage}{15.8cm}#1\end{minipage}}}
24: 
25: \maketitle
26: 
28: \begin{abstract}
29: We show that probabilistic weather forecasts of site specific temperatures can
30: be dramatically improved by using seasonally varying rather than constant calibration parameters.
31: \end{abstract}
32: 
33: \section{Introduction}
34: 
35: Different users of weather forecasts are interested in different things. 
36: One particular group of users, including weather derivatives traders, 
37: is most interested in probabilistic forecasts at specific locations. 
38: The production of such forecasts on the 0-10 day timescale
39: is what we will consider in this article.
40: We will derive our forecasts from the output
41: of numerical weather prediction models, as is usual. 
42: This output contains a lot of information about the future weather
43: but needs processing in order to be directly relevant to individual sites, which are
44: not represented in the models. 
45: This processing step, usually known as \emph{calibration} or \emph{downscaling}, 
46: can usefully be cast as a 
47: classical (i{.}e{.} non-Bayesian) statistical problem. 
48: In this framework a mixture of climatology and the output 
49: from the model are used as predictors and
50: the observations that we wish to predict are the predictand. 
51: 
52: 
53: Forecast calibration can be performed for any weather variable,
54: but we will focus on temperature.
55: For temperature anomalies\footnote{from which the mean
56: and seasonal cycle have been removed}
57: a good starting point for the calibration model is a standard linear regression taking the ensemble mean
58: as the single predictor, and the distribution of possible future temperatures as the predictand.
59: Linear regression has been used as a calibration model for at least 30 years, although it has only recently
60: been fully appreciated that it gives a good probabilistic forecast, rather than just a good forecast of
61: the expected temperature.
62: The challenge is now to build models
63: that perform better than linear regression, and that is the subject of this article. We have
64: described the testing of a number of models versus linear regression in previous articles, 
65: and have generally found it hard to beat by more than minute amounts.
66: For instance, we have tried to improve forecast skill by using the ensemble spread as a predictor of uncertainty
67: (\citet{jewsonbz03a} and~\citet{jewson03g}),
68: but found only a small benefit. We have also tried to improve the forecast by relaxing the assumption of
69: normality and replacing the normal distribution with a kernel density, 
70: but in that case found almost no benefit at all~\citep{jewson03h}.
71: 
72: In another study we investigated whether the benefit of using an ensemble versus a single forecast arises
73: more from the information content of the ensemble mean or the ensemble spread~\citep{jewson03i}. 
74: The answer is very clear: the ensemble mean
75: is vastly more useful. This suggests that the best way to beat linear regression might be to improve
76: the forecast of the mean, rather than the forecast of the spread, and that is the approach we
77: follow below. 
78: 
79: How could we improve the forecast of the mean? 
80: There is a long list of methods we might consider, including:
81: \begin{itemize}
82:   \item using non-linear models such as neural nets
83:   \item using predictors from other locations
84:   \item using lagged predictors
85:   \item using multiple models
86:   \item using seasonally varying parameters
87: \end{itemize}
88: 
89: These are all probably worthy of investigation. 
90: However, in this paper we will only address the last of these:
91: can we beat constant parameter linear regression by allowing the parameters to vary seasonally?
92: On the one hand, from a meteorological point of view, this seems a very reasonable approach since
93: in the climate one usually finds that everything varies seasonally. On the other hand,
94: as devotees of parsimony, we balk at this approach. 
95: In a seasonal parameter model each of the parameters in the regression
96: becomes (at least) 3 parameters.
97: Thus a 3 parameter model becomes a 9 parameter model. For such a model to be better, the seasonality
98: in the mapping from model forecast to observed temperature had better be rather strong.
99: 
100: The idea of using seasonally varying parameters is somewhat similar to a method currently 
101: used by some National Meteorological Services for
102: the calibration of single forecasts which builds the calibration models using only recent training data 
103: (e{.}g{.} data for the previous 90 days). This method automatically captures some aspects of seasonality because it allows the
104: calibration parameters to vary through the year. However, we believe that explicitly modelling seasonality
105: has some benefits, and will be the method of choice in the long run. This is because:
106: 
107: \begin{itemize}
108:   \item Fitting calibration parameters from the previous 90 days suffers from the problem that the parameters
109:   are always slightly behind relative to the present point in the season. They may be the best parameters for
110:   45 days ago, but will not be the best parameters for today.
111:   \item Fitting calibration parameters from the previous 90 days only allows us to use a small amount of 
112:   past forecast data for estimating the parameters. As the amount of available past forecast data increases
113:   it makes sense to try and use all of this data. This is 
114:   especially important if we are to make use of the subtle signals contained in the varying spread of
115:   ensemble forecasts.
116: \end{itemize}
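
As a rough illustration of the first point, suppose (purely as an idealisation, and not as a result
of this study) that the best value of a calibration parameter follows a single annual harmonic,
$p(t)=A\sin(\omega t)$ with $\omega=2\pi/365$ per day, and that fitting on a trailing window of
$W$ days behaves like a running mean over that window. The symbols $p$, $A$, $\omega$ and $W$ are
introduced here only for this example. Then
\begin{equation}
  \frac{1}{W}\int_{t-W}^{t} A\sin(\omega s)\,ds
  = A\,\frac{\sin(\omega W/2)}{\omega W/2}\,\sin\left(\omega\left(t-\frac{W}{2}\right)\right)
\end{equation}
For $W=90$ days the fitted value lags the best value by $W/2=45$ days (a phase lag of about $44^\circ$)
and the amplitude of its seasonal cycle is reduced by a factor of about $0.90$.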
117:  
118: In section~\ref{data} we discuss the data we use for our study. In section~\ref{models} we describe the models
119: and how we will compare them. In section~\ref{results} we present the results, and in section~\ref{conc} we present 
120: our conclusions and discuss areas for future work.
121: 
122: \section{Data}
123: \label{data}
124: 
125: We will base our analyses on one year of ensemble forecast data for the weather
126: station at London's Heathrow airport, WMO number 03772. The forecasts are predictions
127: of the daily average temperature, and the target days of the forecasts
128: run from 1st January 2002 to 31st December 2002. The forecast was produced
129: from the ECMWF model~\citep{molteniet96} and downscaled to the airport location using a simple
130: interpolation routine prior to our analysis. There are 51 members in the ensemble.
131: We will compare these forecasts to the quality controlled climate
132: values of daily average temperature for the same location as reported by the UKMO.
133: 
134: There is no guarantee that the forecast system was held constant throughout this period,
135: and as a result there is no guarantee that the forecasts are in any sense stationary,
136: quite apart from issues of seasonality. This is clearly far from ideal with respect to 
137: our attempts to build statistical interpretation models on past forecast data but is,
138: however, unavoidable: this is the data we have to work with.
139: 
140: Throughout this paper all equations and all values are in terms of double anomalies
141: (i{.}e{.} values from which both the seasonal mean and
142: the seasonal standard deviation have been removed). 
143: Removing the seasonal standard deviation
144: removes most of the seasonality in the forecast error statistics, and partly justifies the use of
145: non-seasonal parameters in the statistical models for temperature that we propose.
146: 
147: \section{Models}
148: \label{models}
149: 
150: As mentioned in the introduction, we will take a \emph{classical statistical} approach to the problem of creating
151: probabilistic temperature forecasts. 
152: This means that we will postulate models which predict the distribution
153: of temperature directly in terms of a number of predictors. Such models are simple to design, 
154: simple to understand, simple to fit, simple to test and easy to use for making forecasts. 
155: They also allow us to incorporate any number of predictors, 
156: including climatology, in an optimum (likelihood maximising) way.
157: 
158: The standard method for estimating the parameters of classical statistical models is to
159: find those parameters which maximise the probability of the observations given the model. 
160: This quantity, when considered as a function of the model parameters, is known
161: as the likelihood. A standard (and intuitively very reasonable) way to compare such models against
162: each other is to compare the maximum likelihoods they achieve. 
163: The forecast that gives the higher likelihood (or log-likelihood)
164: is the better forecast (see~\citet{jewson03d} and~\citet{jewson03f} for more details on this). 
165: This method for comparing probabilistic forecasts can
166: be used on both continuous and discrete forecasts, and in-sample or out-of-sample. In-sample testing has two
167: caveats: firstly that it can only be applied to parsimonious parametric models 
168: (because non-parametric models and models with large numbers of parameters are overfitted)
169: and secondly that the probability used in the
170: comparison has to be adjusted to penalise models with more parameters. A common way to make this adjustment is to
171: use the Akaike information criterion (AIC), and the AIC score is what we will use below.
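
In its standard form, which we assume is the form intended here, the AIC score of a model with $k$
fitted parameters and maximised log-likelihood $\ell_{\max}$ is
\begin{equation}
  \mbox{AIC} = -2\ell_{\max}+2k
\end{equation}
so that each additional parameter must raise the maximised log-likelihood by more than one unit
in order to lower the score. The symbols $k$ and $\ell_{\max}$ are introduced here for illustration.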
172: 
173: We now present the models we compare in this study.
174: 
175: Our first model is simply linear regression between temperature on day $i$ ($T_i$) and the ensemble
176: mean on day $i$ ($m_i$), which we write as:
177: 
178: \begin{equation}\label{regression}
179:   T_i \sim N(\alpha_0+\beta_0 m_i, \gamma_0)
180: \end{equation}
181: 
182: This model corrects biases using $\alpha_0$, optimally ``damps'' the variability of the ensemble mean 
183: and merges optimally
184: with climatology using $\beta_0$, and predicts flow-independent uncertainty using $\gamma_0$.
185: The bias and the uncertainty produced by this model vary seasonally because of the deseasonalisation
186: and reseasonalisation steps.
187: All our subsequent models will be judged against this model.
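
Written out for $n$ forecast-observation pairs, the log-likelihood of this model can be sketched as
follows. We assume here that the second argument of $N$ denotes the standard deviation (if it denotes
the variance, $\gamma_0$ below is replaced by $\sqrt{\gamma_0}$) and that the forecast errors are
independent in time. The symbols $\ell$ and $n$ are introduced for illustration only.
\begin{equation}
  \ell(\alpha_0,\beta_0,\gamma_0) = -\sum_{i=1}^{n}\left[\log\gamma_0+\frac{1}{2}\log 2\pi
  +\frac{(T_i-\alpha_0-\beta_0 m_i)^2}{2\gamma_0^2}\right]
\end{equation}
The fitted parameters are those that maximise $\ell$.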
188: 
189: Our second model generalises this model so that the parameters themselves vary seasonally:
190: 
191: \begin{equation}\label{sregression}
192:   T_i \sim N(\alpha_i+\beta_i m_i, \gamma_i)
193: \end{equation}
194: where
195: \begin{eqnarray}
196:   \alpha_i&=&\alpha_0+\alpha_s \sin \theta_i+\alpha_c \cos \theta_i \\\nonumber
197:   \beta_i &=&\beta_0+\beta_s  \sin \theta_i+\beta_c  \cos \theta_i \\\nonumber
198:   \gamma_i&=&\gamma_0+\gamma_s \sin \theta_i+\gamma_c \cos \theta_i
199: \end{eqnarray}
200: 
201: where $\theta_i$ is the time of year. 
202: We have represented seasonality in the simplest way possible by using just one harmonic in
203: order to keep the number of parameters as low as possible.
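
To illustrate what one harmonic can represent, take $\theta_i=2\pi d_i/365$, where $d_i$ is the day of
the year (one possible definition, assumed here only for the example). Each seasonal parameter is then
a single sinusoid about its annual mean. For the bias, for instance,
\begin{equation}
  \alpha_s\sin\theta_i+\alpha_c\cos\theta_i = A\sin(\theta_i+\phi),
  \qquad A\cos\phi=\alpha_s, \qquad A\sin\phi=\alpha_c
\end{equation}
so that $A=\sqrt{\alpha_s^2+\alpha_c^2}$ is the amplitude and $\phi$ the phase of the seasonal cycle
($A$ and $\phi$ are notation introduced here). The same decomposition applied to $\gamma_i$ shows that
the predicted uncertainty stays positive throughout the year only if $\sqrt{\gamma_s^2+\gamma_c^2}<\gamma_0$.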
204: 
205: We now consider three models that are intermediate between the constant-parameter linear 
206: regression (equation~\ref{regression}) and the seasonal-parameter linear regression (equation~\ref{sregression}).
207: 
208: The first only has seasonal bias correction:
209: \begin{equation}
210:   T_i \sim N(\alpha_i+\beta_0 m_i, \gamma_0)
211: \end{equation}
212: 
213: The second has seasonal damping:
214: \begin{equation}
215:   T_i \sim N(\alpha_0+\beta_i m_i, \gamma_0)
216: \end{equation}
217: 
218: and the third has seasonal innovations:
219: \begin{equation}
220:   T_i \sim N(\alpha_0+\beta_0 m_i, \gamma_i)
221: \end{equation}
222: 
223: For comparison we will also consider the spread regression model of~\citet{jewsonbz03a}:
224: \begin{equation}
225:   T_i \sim N(\alpha_0+\beta_0 m_i, \gamma_0+\delta_0 s_i)
226: \end{equation}
227: 
228: (where $s_i$ is the ensemble spread on day $i$)
229: and a completely seasonal version of the spread regression model:
230: 
231: \begin{equation}
232:   T_i \sim N(\alpha_i+\beta_i m_i, \gamma_i+\delta_i s_i)
233: \end{equation}
234: 
235: where
236: 
237: \begin{eqnarray}
238:   \alpha_i&=&\alpha_0+\alpha_s \sin \theta_i+\alpha_c \cos \theta_i \\\nonumber
239:   \beta_i &=&\beta_0+\beta_s  \sin \theta_i+\beta_c  \cos \theta_i \\\nonumber
240:   \gamma_i&=&\gamma_0+\gamma_s \sin \theta_i+\gamma_c \cos \theta_i \\\nonumber
241:   \delta_i&=&\delta_0+\delta_s \sin \theta_i+\delta_c \cos \theta_i
242: \end{eqnarray}
243: 
244: This last model has the greatest number of parameters of all the models we consider (12) and would
245: be expected to perform the best because it includes all the other models.
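
For reference, the number of fitted parameters $k$ in each model, which follows directly from the
definitions above, is:
\begin{center}
\begin{tabular}{lc}
Model & $k$ \\
\hline
Linear regression & 3 \\
Seasonal bias & 5 \\
Seasonal damping & 5 \\
Seasonal innovations & 5 \\
Spread regression & 4 \\
Seasonal regression & 9 \\
Seasonal spread regression & 12 \\
\end{tabular}
\end{center}
Since the 12 parameter model contains every other model as a special case, its maximised likelihood
is at least as large as that of any of the others. Once a penalty for the number of parameters is
applied, however, it is not guaranteed to achieve the best score.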
246: 
247: We fit the parameters of all the models by maximising the likelihood using a standard quasi-Newton method
248: with finite-difference gradient.
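
In outline, and without claiming that this matches the details of the routine actually used, a
quasi-Newton scheme for maximising a log-likelihood $\ell$ updates the parameter vector $p$ as
\begin{equation}
  p^{(n+1)} = p^{(n)}+\lambda_n H_n \nabla\ell(p^{(n)})
\end{equation}
where $H_n$ is a running approximation to the inverse of the negative Hessian of $\ell$ and $\lambda_n$
is a step length. Each component of the gradient is approximated by a finite difference such as
\begin{equation}
  \frac{\partial\ell}{\partial p_j} \approx \frac{\ell(p+h e_j)-\ell(p)}{h}
\end{equation}
for a small step $h$ and unit vector $e_j$. All of the symbols in these two expressions are
introduced here for illustration.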
249: 
250: \section{Results}
251: \label{results}
252: 
253: Figure~\ref{f:f1} shows the AIC scores for the constant parameter and seasonal parameter linear
254: regression models (upper and lower solid lines respectively). By definition, lower values of the AIC
255: score are better, and a value of zero would be a perfect forecast. We see a \emph{vast} improvement
256: in the skill of the probabilistic forecast as a result of using seasonal parameters, especially at
257: short lead times. The dotted line shows the AIC score for the seasonal spread regression model.
258: There is a further improvement at most lags from making the spread calibration parameters seasonal in addition, 
259: but it is small. At lag 9 the fitting of the parameters of the spread regression failed: the algorithm
260: was unable to find a convincing maximum for the likelihood. This is presumably because the signal is too
261: weak given the number of parameters we are trying to fit and the amount of data being used. 
262: 
263: Where does this vast improvement come from? Is it the seasonality in the 
264: bias correction, the damping, the innovations, or all three in combination? 
265: We address this question using the intermediate models.
266: Figure~\ref{f:f2} shows the same results as figure~\ref{f:f1} in the solid lines in each panel,
267: but also shows the AIC scores for 4 other models as dotted lines (the top left panel is the seasonal
268: bias model, top right is the seasonal damping model, lower left is the seasonal noise model and lower right is the
269: non-seasonal spread regression model). We immediately see that, of these models, 
270: the seasonal bias model gives the greatest
271: benefit over straight linear regression. Figure~\ref{f:f3} shows the same data, but now relative
272: to the AIC score of linear regression (more negative values are better). 
273: We see again that the seasonal bias correction gives a large
274: benefit, the seasonal damping gives no real benefit, the seasonal noise gives a small benefit
275: at short leads, and that the spread regression gives a small benefit at all leads. 
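
Taking the AIC score in its standard form, $-2\ell_{\max}+2k$ for maximised log-likelihood $\ell_{\max}$
and $k$ fitted parameters (an assumption about the exact definition used), the relative score of a
model $M$ can be written as
\begin{equation}
  \Delta\mbox{AIC}_M = \mbox{AIC}_M-\mbox{AIC}_{LR}
  = -2\left(\ell_{\max,M}-\ell_{\max,LR}\right)+2(k_M-3)
\end{equation}
where the subscript $LR$ refers to constant parameter linear regression, which has 3 parameters.
For the seasonal bias model $k_M=5$, so the relative score is negative only if the seasonal terms
raise the maximised log-likelihood by more than 2.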
276: 
277: Figure~\ref{f:f10} investigates the extent to which there is \emph{synergy} between the different
278: parameterisations. In the upper panel we compare the benefit from making all of $\alpha, \beta$ and $\gamma$
279: seasonal at once (dotted line) against the sum of the benefits of making them seasonal separately (solid line). 
280: There is definitely
281: some synergy: the total is greater than the sum of the parts. In the lower panel we consider the benefit
282: of using spread regression in a non-seasonal and seasonal model. In our previous research it
283: has been rather disappointing that using the ensemble spread brings so little benefit to probabilistic
284: forecasting, and we had hoped that maybe the benefit would be greater in the context of seasonal 
285: parameters for the mean. However, this is not the case. The benefit is more or less the same and is still very small.
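
One way to make the comparison in the upper panel explicit (the notation is introduced here for
illustration) is to write $\Delta_X$ for the AIC score of model $X$ minus that of linear regression,
with the subscript indicating which parameters are seasonal, and to define
\begin{equation}
  S = \Delta_{\alpha\beta\gamma}-\left(\Delta_\alpha+\Delta_\beta+\Delta_\gamma\right)
\end{equation}
Since more negative values are better, $S<0$ corresponds to synergy, $S=0$ to benefits that simply
add, and $S>0$ to benefits that partly duplicate one another.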
286: 
287: Constant parameter linear transformations such as the basic linear regression model cannot improve the
288: linear correlation of the ensemble mean with the observations. However, seasonal parameter linear
289: transformations can. Figure~\ref{f:f9} shows linear correlations before (solid line) and after (dotted line) the 
290: seasonal transformation. We see a definite improvement in linear correlation at all lead times.
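
The first statement follows because linear correlation is unchanged by a constant affine
transformation: for constant $\alpha_0$ and $\beta_0>0$,
\begin{equation}
  \mbox{corr}(\alpha_0+\beta_0 m_i, T_i)
  = \frac{\beta_0\,\mbox{cov}(m_i,T_i)}{\beta_0\,\sigma_m\sigma_T}
  = \mbox{corr}(m_i,T_i)
\end{equation}
where $\sigma_m$ and $\sigma_T$ (notation introduced here) are the standard deviations of the ensemble
mean and of the observations. When $\alpha_i$ and $\beta_i$ vary with the time of year,
$\alpha_i+\beta_i m_i$ is no longer an affine function of $m_i$ with fixed coefficients, and this
cancellation no longer applies.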
291: 
292: Figure~\ref{f:f4} shows the 9 parameters for the seasonal regression model. The top row shows the
293: alpha parameters, the second row the beta parameters and the third row the gamma parameters. 
294: We see that the variability in alpha is dominated by the cosine term at all but the longest leads,
295: the variability in beta is small relative to the average level of beta, and that gamma is more or
296: less constant throughout the year (but not with lead, of course).
297: 
298: %We have attempted to put confidence intervals on these parameter values, using the standard method of
299: %inverting the Fisher information matrix. However, this only worked for the three of the models:
300: %the linear regression model, the seasonal bias model and the spread regression model. For the other models
301: %(which include seasonal damping and noise parameters) the Fisher information matrix was not invertible.
302: %We suspect that this implies that these parameters are very poorly estimated. The uncertainty on the
303: %alpha and gamma parameters in the seasonal bias model is shown in figure~\ref{f:f5}, as two dotted
304: %lines either side of the parameter estimate. The parameters are so well estimated that these
305: %lines are almost indistinguishable for the parameter estimate itself.
306: 
307: The final figure, figure~\ref{f:f6}, shows the seasonal variability of alpha predicted by the seasonal bias model
308: for the first 9 lead times. We see that there is significant seasonal variability in the alpha predicted by the
309: model, which is consistent with the large effect that this model had on the AIC score. We see that
310: at short leads the smallest values of $\alpha$ are in spring and the largest in autumn, while at longer leads the 
311: opposite is true. 
312: 
313: \section{Summary}
314: \label{conc}
315: 
316: We have made \emph{another} attempt at improving the probabilistic forecasts of temperature that can be
317: made from ensemble forecasts. As before our starting point and basis for comparison is a linear regression model. 
318: Our previous attempts, which have looked at the benefit from using the ensemble spread 
319: (\citet{jewsonbz03a} and~\citet{jewson03g})
320: and the benefit from using the distribution of
321: the individual ensemble members~\citep{jewson03h}, have not shown much improvement over linear regression.
322: This time we have tried allowing the parameters of the regression model to vary seasonally. We find a \emph{dramatic}
323: improvement in the skill of the forecasts, much larger than the improvement from our previous attempts.
324: When we break down which terms are driving the improvement in skill we find that adding seasonality in the
325: bias is the most important factor. However, seasonality in all three terms is important and there is actually
326: synergy between the terms such that the benefit from making all three regression parameters seasonal is greater than the
327: sum of the benefits of making each one seasonal separately (by the measure we use for skill).
328: Furthermore, our seasonal regression model also improves the linear correlation between forecast and observations.
329: 
330: The clear implication of this is that one should always 
331: use the seasonal parameter regression model in preference to the
332: constant parameter linear regression model, 
333: and that the seasonal parameter regression model should become the new baseline for comparison with other
334: methods and models.
335: 
336: There are, as ever, a number of avenues for future work that are suggested by this study.
337: Most obviously, allowing the bias to vary seasonally seems to be so important that 
338: one could try using more harmonics. It may well be that adding extra parameters in the modelling
339: of the bias is more useful than adding extra parameters elsewhere. We do not, however,
340: feel it would be justified with only the limited amount of data used in this study, and this is why we have not
341: considered higher harmonics here.
342: 
343: At a technical level, our fitting algorithm could be improved if we avoided the assumption that the 
344: forecast errors are uncorrelated in time (they are, in fact, weakly positively correlated). 
345: This may affect the results somewhat, but we doubt it would
346: affect them qualitatively. 
347: 
348: One possible criticism of our study might be that, by using up to 12 parameters to model only 
349: 365 (weakly correlated) observation pairs, we are flirting with overfitting. We wouldn't disagree.
350: We have compensated for this by using AIC rather than straight log-likelihood as our measure of skill 
351: and so it should be the case that our results would transfer to out of sample likelihood comparisons.
352: Nevertheless, if longer data sets of stationary past forecasts ever become available
353: then it would be very interesting to repeat this analysis:
354: the parameters of the models we have presented will become much better estimated, and the results
355: that much better justified.
356: 
357: Our highest priority is now to repeat this analysis on wind and precipitation forecasts, which 
358: present similar but different challenges to the modelling of temperature because they are not close to 
359: normally distributed.
360: 
361: \section{Acknowledgements}
362: 
363: Many thanks to Ken Mylne and Caroline Woolcock for providing the forecast data used in this study, and for
364: helpful discussions.
365: 
366: \section{Legal statement}
367: 
368: The lead author was employed by RMS at the time that this article was written.
369: 
370: However, neither the research behind this article nor the writing of this
371: article were in the course of his employment,
372: (where 'in the course of his employment' is within the meaning of the Copyright, Designs and Patents Act 1988, Section 11),
373: nor were they in the course of his normal duties, or in the course of
374: duties falling outside his normal duties but specifically assigned to him
375: (where 'in the course of his normal duties' and 'in the course of duties
376: falling outside his normal duties' are within the meanings of the Patents Act 1977, Section 39).
377: Furthermore the article does not contain any proprietary information or
378: trade secrets of RMS.
379: As a result, the lead author is the owner of all the intellectual
380: property rights (including, but not limited to, copyright, moral rights,
381: design rights and rights to inventions) associated with and arising from
382: this article. The lead author reserves all these rights.
383: No-one may reproduce, store or transmit, in any form or by any
384: means, any part of this article without the author's prior written permission.
385: The moral rights of the lead author have been asserted.
386: 
387: \bibliography{jewson}
388: 
389: \newpage
390: \begin{figure}[!htb]
391:   \begin{center}
392:     \includegraphics{fig1.ps}
393:   \end{center}
394:   \caption{
395: The AIC scores for probabilistic forecasts made using linear regression (top solid line) and
396: linear regression with seasonal parameters (lower solid line). The dotted line
397: shows spread regression with seasonal parameters. Low scores are better.
398:           }
399:   \label{f:f1}
400: \end{figure}
401: 
402: \newpage
403: \begin{figure}[!htb]
404:   \begin{center}
405:     \includegraphics{fig2.ps}
406:   \end{center}
407:   \caption{
408: The two solid lines from figure~\ref{f:f1}, along with AIC scores for four more models (dotted lines):
409: a) seasonal bias, b) seasonal damping, c) seasonal noise and d) non-seasonal spread regression.
410:           }
411:   \label{f:f2}
412: \end{figure}
413: 
414: \newpage
415: \begin{figure}[!htb]
416:   \begin{center}
417:     \includegraphics{fig3.ps}
418:   \end{center}
419:   \caption{
420: As figure~\ref{f:f2}, but all values shown relative to the AIC score for linear regression.
421:           }
422:   \label{f:f3}
423: \end{figure}
424: 
425: \newpage
426: \begin{figure}[!htb]
427:   \begin{center}
428:     \includegraphics{fig10.ps}
429:   \end{center}
430:   \caption{
431: The synergy among the seasonal regression parameters, and between seasonality and spread regression.
432:           }
433:   \label{f:f10}
434: \end{figure}
435: 
436: \newpage
437: \begin{figure}[!htb]
438:   \begin{center}
439:     \includegraphics{fig9.ps}
440:   \end{center}
441:   \caption{
442: The linear correlation before (solid line) and after (dotted line) calibration with the seasonal regression model.
443:           }
444:   \label{f:f9}
445: \end{figure}
446: 
447: \newpage
448: \begin{figure}[!htb]
449:   \begin{center}
450:     \includegraphics{fig4.ps}
451:   \end{center}
452:   \caption{
453: The 9 parameters of the seasonal regression model versus lead time.
454: The top row shows the alphas, the second row the betas and the third row
455: the gammas.
456:            }
457:   \label{f:f4}
458: \end{figure}
459: 
460: %\newpage
461: %\begin{figure}[!htb]
462: %  \begin{center}
463: %    \includegraphics{figs/fig5.ps}
464: %  \end{center}
465: %  \caption{
466: %The first four parameters (the alphas and the gamma) for the seasonal bias model.
467: %          }
468: %  \label{f:f5}
469: %\end{figure}
470: 
471: \newpage
472: \begin{figure}[!htb]
473:   \begin{center}
474:     \includegraphics{fig6.ps}
475:   \end{center}
476:   \caption{
477: The seasonal variation of alpha predicted by the seasonal bias model, for leads
478: 0 to 8.
479:           }
480:   \label{f:f6}
481: \end{figure}
482: 
483: 
484: \end{document}
485: