IAS/IGSSE Doctoral Symposium on Statistical Space-time Modelling for Wind Power Forecasts
Tuesday, June 14, 2011
IGSSE Center
Exzellenzzentrum of Technische Universität München
Campus Garching
Boltzmannstr. 17
85748 Garching, Event Room (Ground Floor)
Organizers:
- Richard Davis
- Vincenzo Ferrazzano
- Claudia Klüppelberg
- Christina Steinkohl
Program
Tuesday, June 14, 2011
9:00 - 9:45 Tilmann Gneiting (Heidelberg University) - Making and evaluating point forecasts.
Typically, point forecasting methods are compared and assessed by means of an error measure or scoring function, such as the absolute error or the squared error. The individual scores are then averaged over forecast cases to yield a summary measure of the predictive performance, such as the mean absolute error or the (root) mean squared error. I demonstrate that this common practice can lead to grossly misguided inferences, unless the scoring function and the forecasting task are carefully matched. Effective point forecasting requires that the scoring function be specified ex ante, or that the forecaster receive a directive in the form of a statistical functional, such as the mean or a quantile of the predictive distribution. If the scoring function is specified ex ante, the forecaster can issue the optimal point forecast, namely, the Bayes rule; this will be illustrated in the context of wind (energy) forecasts. If the forecaster receives a directive in the form of a functional, it is critical that the scoring function be consistent for it, in the sense that the expected score is minimized when following the directive. A functional is elicitable if there exists a scoring function that is strictly consistent for it. Expectations, ratios of expectations and quantiles are elicitable. For example, a scoring function is consistent for the mean functional if and only if it is a Bregman function. It is consistent for a quantile if and only if it is generalized piecewise linear. Similar characterizations apply to ratios of expectations and to expectiles. Weighted scoring functions are consistent for functionals that adapt to the weighting in peculiar ways. Not all functionals are elicitable; for instance, conditional value-at-risk is not, despite its popularity in quantitative finance.
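For concreteness, two standard members of the classes just described are the Bregman scoring functions, which are consistent for the mean, and the pinball loss, which is strictly consistent for the α-quantile:

```latex
% Bregman scoring function generated by a convex function \phi; the choice
% \phi(t) = t^2 recovers the squared error (x - y)^2.
\[
S(x, y) = \phi(y) - \phi(x) - \phi'(x)\,(y - x)
\]
% Pinball (generalized piecewise linear) loss at level \alpha, strictly
% consistent for the \alpha-quantile of the predictive distribution.
\[
S_\alpha(x, y) = \big(\mathbf{1}\{y < x\} - \alpha\big)\,(x - y)
\]
```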
9:45 - 10:30 Ensemble postprocessing for wind speed and wind direction.
A major human desire is to make forecasts for an uncertain future. Consequently, forecasts ought to be probabilistic in nature, taking the form of probability distributions over future quantities or events. At this time, the meteorological community is taking massive steps in a reorientation towards probabilistic weather forecasting. This is typically done using a numerical weather prediction (NWP) model, perturbing the inputs to the model (initial conditions and physics parameters) in various ways, and running the model for each perturbed set of inputs. The result is then viewed as an ensemble of forecasts, taken to be a sample from the joint probability distribution of future weather quantities of interest. However, NWP ensembles are typically biased and uncalibrated, and thus there is a pronounced need for statistical postprocessing, with Bayesian model averaging (BMA) and heteroscedastic regression (HR) being state-of-the-art methods for doing this. I will demonstrate how BMA and HR are applied to postprocess ensemble forecasts of wind speed and wind direction at individual sites, such as a wind energy center, thereby providing both probabilistic and point forecasts. Many challenges remain, both theoretical and practical, particularly in the postprocessing of spatio-temporal weather field forecasts, where copula methods are in critical demand.
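As a minimal sketch of how a fitted BMA mixture is evaluated for wind speed, assuming gamma mixture components (a common choice for wind speed) and assuming the member weights and bias-correction coefficients have already been estimated, e.g. by EM over a training period; all numbers are illustrative:

```python
# Evaluate a BMA-style gamma mixture density for wind speed.
import numpy as np
from scipy import stats

def bma_density(y, member_forecasts, weights, a, b, c):
    """Mixture density at wind speed y (m/s).

    Each ensemble member contributes a gamma component whose mean is the
    bias-corrected forecast a + b * f and whose variance is c * f."""
    density = 0.0
    for w, f in zip(weights, member_forecasts):
        mean, var = a + b * f, c * f
        shape, scale = mean**2 / var, var / mean
        density += w * stats.gamma.pdf(y, shape, scale=scale)
    return density

forecasts = np.array([6.2, 7.1, 5.8, 6.9, 7.4])  # 5-member ensemble (m/s)
weights = np.full(5, 0.2)                        # equal weights for the sketch
grid = np.linspace(0.1, 20.0, 200)
pdf = np.array([bma_density(y, forecasts, weights, a=0.5, b=0.9, c=1.2)
                for y in grid])                  # predictive density on a grid
```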
10:30 - 11:00 Coffee Break
11:00 - 11:45 Marc Genton (Texas A&M University) - Power System Economic Dispatch with Spatio-temporal Wind Forecasts
In this project, spatio-temporal wind forecasts are incorporated into power system economic dispatch models. Compared to most existing power system dispatch models, the proposed formulation takes into account both spatial and temporal wind power correlations. This in turn leads to an overall more cost-effective scheduling of system-wide wind generation portfolios. The potential economic benefits are manifested in the system-wide generation cost savings, as well as the ancillary service cost savings. We illustrate in a modified IEEE 24-bus system that the overall generation cost can be reduced by 12.7% by using spatio-temporal wind forecasts compared with using only a persistence forecast model. The talk is based on joint work with Le Xie, Yingzhong Gu, and Xinxin Zhu.
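To fix ideas, a toy single-bus, two-generator dispatch reduced to a deterministic linear program; this is only a sketch of the kind of optimization involved, not the talk's formulation, which schedules against spatio-temporal wind forecasts on a modified IEEE 24-bus system:

```python
# Toy deterministic economic dispatch: one bus, two thermal units, and a
# wind forecast treated as firm. Names and numbers are assumptions.
import numpy as np
from scipy.optimize import linprog

demand = 500.0                      # system demand (MW)
wind_forecast = 120.0               # forecast wind generation (MW)
cost = np.array([20.0, 35.0])       # marginal costs ($/MWh)
p_max = np.array([300.0, 250.0])    # generator capacity limits (MW)

# minimize cost^T p  subject to  p_1 + p_2 = demand - wind,  0 <= p <= p_max
res = linprog(
    c=cost,
    A_eq=np.ones((1, 2)),
    b_eq=[demand - wind_forecast],
    bounds=list(zip(np.zeros(2), p_max)),
)
print(res.x, res.fun)               # dispatch (MW) and total cost ($/h)
```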
11:45 - 12:30 Comparing Spatial Predictions
Under a general loss function, we develop a hypothesis test to determine whether, on average across the entire spatial domain of interest, there is a significant difference between the spatial predictions produced by two competing models. The null hypothesis is that of no difference, and a spatial loss differential is created based on the observed data, the two sets of predictions, and the loss function chosen by the researcher. The test assumes only isotropy and short-range spatial dependence of the loss differential but allows it to be non-Gaussian, non-zero-mean, and spatially correlated. Constant and non-constant spatial trends in the loss differential are treated in two separate cases. Monte Carlo simulations illustrate the size and power properties of this test, and an example based on daily average wind speeds in Oklahoma is used for illustration. The talk is based on joint work with Amanda Hering.
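A minimal sketch of the loss-differential construction under squared-error loss; note that the actual test estimates the variance of the mean differential from the spatial covariance structure rather than assuming independent observations, as this naive version does:

```python
# Naive loss-differential comparison; all data simulated for illustration.
import numpy as np

def loss_differential(obs, pred1, pred2):
    """D(s) = L(obs(s), pred1(s)) - L(obs(s), pred2(s)) at each location s."""
    return (obs - pred1) ** 2 - (obs - pred2) ** 2

rng = np.random.default_rng(0)
obs = rng.normal(size=200)                      # observations at 200 sites
pred1 = obs + rng.normal(scale=0.5, size=200)   # predictions from model 1
pred2 = obs + rng.normal(scale=0.6, size=200)   # predictions from model 2

d = loss_differential(obs, pred1, pred2)
# z-statistic for H0: zero mean differential; the spatial test replaces the
# denominator with a variance estimate that accounts for spatial correlation
z = d.mean() / (d.std(ddof=1) / np.sqrt(d.size))
print(z)
```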
12:30 - 13:15 Yanyuan Ma (Texas A&M University) - A Semiparametric Approach to Dimension Reduction.
We provide a novel approach to dimension reduction problems that is completely different from the existing literature. We cast the dimension reduction problem in a semiparametric estimation framework and derive estimating equations. Viewing the problem from this new angle allows us to derive a rich class of estimators and to obtain the classical dimension reduction techniques as special cases within this class. The semiparametric approach also reveals that the common assumptions of linearity and/or constant variance on the covariates can be removed at the cost of performing additional nonparametric regression. The semiparametric estimators without these common assumptions are illustrated through simulation studies and a data example.
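One classical technique typically recovered as a special case in such frameworks is sliced inverse regression (SIR); the following compact sketch illustrates SIR itself, not the estimating-equation machinery of the talk:

```python
# Compact sliced inverse regression (SIR); illustrative only.
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    # whiten X with the inverse square root of its sample covariance
    evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ W
    # slice observations by the order of y and average Z within each slice
    slices = np.array_split(np.argsort(y), n_slices)
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # leading eigenvectors of M, mapped back to the original X scale
    _, vecs = np.linalg.eigh(M)
    return W @ vecs[:, -n_dirs:]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=500)
print(sir_directions(X, y))   # approximately proportional to (1, 0.5, 0, 0)
```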
13:15 - 14:15 Lunch Break
14:15 - 14:55 Richard Davis (Columbia University) - A Class of Stochastic Volatility Models for Environmental and Computer Experiment Applications
As spatial data in the environmental and geophysical sciences have become more prevalent, spatial models have seen rapid development. Applications of spatial data can be found in diverse areas including meteorology, ecology, environmental health, environmental science, agriculture, disease modeling, and complex computer experiments. Stationary and even isotropic Gaussian processes (GP) often provide the starting point for modeling spatial data. While the stationary GP model possesses a number of attractive features, such as an easily computable likelihood function, simple formulae for computing predictors at all points in the domain, and a straightforward method for measuring uncertainty of predictors throughout the domain, it can often perform poorly, especially for data that exhibit spatial inhomogeneities. We adapt stochastic volatility modeling to this context, resulting in a stochastic heteroscedastic process (SHP), which is unconditionally stationary and non-Gaussian. The sample paths of this process offer more modeling flexibility than those produced by a traditional GP, and can better reflect prediction uncertainty. GP prediction error variances depend only on the locations of inputs, while SHP can reflect local inhomogeneities in a response surface through prediction error variances that depend on both input locations and output responses. We use maximum likelihood for inference, which is complicated by the high dimensionality of the latent process. Accordingly, we develop an importance sampling method for likelihood computation and use a low-rank kriging approximation to reconstruct the latent process. This procedure is illustrated with simulated and real computer experiment data. (This is joint work with Jay Breidt, Wenying Huang and Ke Wang.)
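A minimal simulation sketch of an SHP-type sample path on a one-dimensional grid, with a latent Gaussian volatility process modulating a base Gaussian process; the kernel forms and parameters are illustrative assumptions, not the specification used in the talk:

```python
# SHP-type sample path: a base Gaussian process scaled pointwise by the
# exponential of a latent Gaussian volatility process.
import numpy as np

rng = np.random.default_rng(1)
s = np.linspace(0.0, 10.0, 300)     # one-dimensional input grid

def gp_sample(s, length_scale):
    """Draw one sample path of a zero-mean GP with squared-exponential kernel."""
    d = np.abs(s[:, None] - s[None, :])
    K = np.exp(-((d / length_scale) ** 2))
    L = np.linalg.cholesky(K + 1e-8 * np.eye(len(s)))  # jitter for stability
    return L @ rng.normal(size=len(s))

alpha = gp_sample(s, length_scale=2.0)   # latent (slowly varying) volatility
eps = gp_sample(s, length_scale=0.5)     # base Gaussian process
y = np.exp(0.5 * alpha) * eps            # heteroscedastic, non-Gaussian path
```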
14:55 - 15:35 Christina Steinkohl (TU München) - Max-stable random fields for extremes of processes observed in space and time
Max-stable processes have proved to be very useful for the statistical modelling of spatial extremes. Several representations of max-stable random fields have been proposed in the literature. For statistical inference it is often assumed that there is no temporal dependence, i.e. that the observations at spatial locations are independent in time. We use two representations of stationary max-stable spatial random fields and extend the concepts to the space-time domain. In a first approach, we extend the storm profile model of Smith [1990] to a space-time setting and calculate the resulting bivariate distribution function, which is needed for pairwise likelihood estimation. The idea of constructing max-stable random fields as limits of normalized and scaled pointwise maxima of Gaussian random fields was first introduced by Kabluchko, Schlather and de Haan [2009], who construct max-stable random fields associated with variograms. We use a similar approach based on a well-known result by Hüsler and Reiss [1989] and apply specific spatio-temporal covariance models, satisfying weak regularity assumptions, for the underlying Gaussian random field.
The tail dependence coefficient is an important measure of extremal dependence. We show how the spatio-temporal covariance function underlying the Gaussian random field can be interpreted in terms of the tail dependence coefficient. Within this context, we examine different concepts for constructing spatio-temporal covariance models and analyse several specific examples, including Gneiting's class of nonseparable stationary covariance functions. This is joint work with Richard A. Davis (Columbia University) and Claudia Klüppelberg (TU München).
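For reference, in the Hüsler-Reiss setting the bivariate distribution entering the pairwise likelihood takes the following form for unit Fréchet margins, where the parameter λ is determined by the underlying spatio-temporal dependence structure:

```latex
% Bivariate Hüsler-Reiss distribution with unit Fréchet margins; \Phi is
% the standard normal distribution function.
\[
P\big(\eta(s_1,t_1) \le x,\; \eta(s_2,t_2) \le y\big)
  = \exp\!\Big(
      -\tfrac{1}{x}\,\Phi\Big(\lambda + \tfrac{1}{2\lambda}\log\tfrac{y}{x}\Big)
      -\tfrac{1}{y}\,\Phi\Big(\lambda + \tfrac{1}{2\lambda}\log\tfrac{x}{y}\Big)
    \Big)
\]
% The corresponding upper tail dependence coefficient is
\[
\chi = 2\big(1 - \Phi(\lambda)\big).
\]
```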
15:35 - 16:00 Coffee Break
16:00 - 16:40 Peter Brockwell (Colorado State University & Columbia University) - High frequency sampling of a continuous-time ARMA process
Continuous-time autoregressive moving average (CARMA) processes have recently been used widely in the modeling of non-uniformly spaced data and as a tool for dealing with high-frequency data of the form $Y_{n\Delta}$, $n=0,1,2,\ldots$, where $\Delta$ is small and positive. Such data occur in many fields of application, particularly in finance and the study of turbulence. This talk is concerned with the characteristics of the process $(Y_{n\Delta})_{n\in\mathbb{Z}}$ when $\Delta$ is small and the underlying continuous-time process $(Y_t)_{t\in\mathbb{R}}$ is a specified CARMA process.
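For reference, the standard state-space formulation of a CARMA(p, q) process driven by a Lévy process L:

```latex
% State-space representation of a CARMA(p, q) process:
\[
Y_t = \mathbf{b}^\top X_t, \qquad
\mathrm{d}X_t = A\,X_t\,\mathrm{d}t + \mathbf{e}_p\,\mathrm{d}L_t,
\]
% where A is the p \times p companion matrix of the autoregressive
% polynomial, \mathbf{e}_p = (0, \ldots, 0, 1)^\top, and \mathbf{b}
% collects the moving-average coefficients (b_q = 1, b_j = 0 for j > q).
```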
16:40 - 17:20 Vincenzo Ferrazzano (TU München) - Some statistical aspects of fully developed turbulence modeling.
Many real-world turbulent flows, e.g. boundary-layer atmospheric turbulence, are characterized by a high Reynolds number (fully developed turbulence). In this talk we will outline stylized facts exhibited by the velocity of a fully developed turbulent flow when observed at high frequency in the longitudinal direction at a fixed location. Some of these traits, such as non-Gaussian increments, intermittent behavior, and power-law scaling of the structure functions, are universal, i.e. shared by every fully developed turbulent flow. These features are therefore considered essential fingerprints of turbulence, and they should be reproduced by any appropriate stochastic model.
Our long-term goal is to model turbulence with a continuous-time moving average process driven by a non-Gaussian intermittency process, estimating the components in a non-parametric way.
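The power-law scaling referred to above is conventionally expressed through the structure functions of the velocity increments:

```latex
% Structure function of order p for the longitudinal velocity v at
% spatial lag r:
\[
S_p(r) = \mathbb{E}\big[(v(x + r) - v(x))^p\big] \propto r^{\zeta_p}
\quad \text{in the inertial range.}
\]
% Kolmogorov's 1941 theory predicts \zeta_p = p/3; the observed
% nonlinearity of p \mapsto \zeta_p is a signature of intermittency.
```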
17:20 - 18:00 Claudia Klüppelberg (TU München) - Statistics for Turbulence Data with High Reynolds Numbers
We base the investigation of mean-flow turbulence data on a stochastic intermittency model, which integrates a memory function against a stochastic process with uncorrelated and weakly stationary increments and thereby also allows for the modelling of intermittency effects. We aim at statistical inference for very high frequency data such as the Brookhaven data (which will be presented in the talk by Vincenzo Ferrazzano). In this talk we discuss the non-parametric estimation of the memory function via the spectrum of the process. The estimated velocity spectrum confirms Kolmogorov's 5/3-law for the turbulence spectrum in the inertial range. After filtering out the driving process, we are able to investigate its distributional properties and dependence structure.
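For reference, the spectral form of the 5/3-law in the inertial range:

```latex
% Kolmogorov's 5/3-law for the energy spectrum in the inertial range:
\[
E(k) = C\,\varepsilon^{2/3} k^{-5/3},
\]
% where k is the wavenumber, \varepsilon the mean rate of energy
% dissipation, and C a universal constant.
```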
18:00 - 18:30 Discussions
from 18:30 Get Together and Dinner