Users of my new online forecasting book have asked about having a facility for personal highlighting of selected sections, as students often do with print books. We have plans to make this a built-in part of the platform, but for now it is possible to do it using a simple browser extension. This approach allows any website to be highlighted, so it is even more useful than if the facility were available only on OTexts.org.
This is another situation where Fourier terms are useful for handling the seasonality. Not only is the seasonal period rather long, it is non-integer (averaging 365.25/7 ≈ 52.18). So ARIMA and ETS models do not tend to give good results, even with a period of 52 as an approximation.
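Here is a rough sketch of that approach using the forecast package. The weekly series y, the search over K = 1 to 10 Fourier pairs, and the two-year forecast horizon are all illustrative assumptions rather than a fixed recipe:

```r
library(forecast)

# y is assumed to be a weekly series, e.g. ts(data, frequency = 365.25/7).
# Fit a non-seasonal ARIMA model with Fourier terms as regressors,
# choosing the number of Fourier pairs K by AICc.
bestfit <- list(aicc = Inf)
for (K in 1:10) {
  fit <- auto.arima(y, xreg = fourier(y, K = K), seasonal = FALSE)
  if (fit$aicc < bestfit$aicc) {
    bestfit <- fit
    bestK <- K
  }
}

# Forecast two years ahead using the same Fourier terms.
fc <- forecast(bestfit, xreg = fourier(y, K = bestK, h = 104))
plot(fc)
```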
Following my post on fitting models to long time series, I thought I’d tackle the opposite problem, which is more common in business environments.
I often get asked how few data points can be used to fit a time series model. As with almost all sample size questions, there is no easy answer. It depends on the number of model parameters to be estimated and the amount of randomness in the data. The sample size required increases with the number of parameters to be estimated, and the amount of noise in the data. (more…)
I received this email today:
I recall you made this very insightful remark somewhere that, fitting a standard arima model with too much data, ie. a very long time series, is a bad idea.
Can you elaborate why?
I can see the issue with noise, which compounds the ML estimation as the series gets too long. But is there anything else?
I’m not sure where I made a comment about this, but it is true that ARIMA models don’t work well for very long time series. The same can be said about almost any other model too. The problem is that real data do not come from the models we use. When the number of observations is not large (say up to about 200) the models often work well as an approximation to whatever process generated the data. But eventually you will have enough data that the difference between the true process and the model starts to become more obvious. An additional problem is that the optimization of the parameters becomes more time consuming because of the number of observations involved.
What to do about these issues depends on the purpose of the model. A more flexible nonparametric model could be used, but this still assumes that the model structure will work over the whole period of the data. A better approach is usually to allow the model itself to change over time. For example, by using time-varying parameters in a parametric model, or by using a time-based kernel in a nonparametric model. If you are only interested in forecasting the next few observations, it is equivalent and simpler to throw away the earliest observations and only fit a model to the most recent observations.
How many observations to retain, or how fast to allow the time-varying parameters to vary, can be tricky decisions.
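As a minimal sketch of the simplest option, suppose y is a long univariate ts object and we (arbitrarily) decide to keep only the most recent 200 observations before fitting:

```r
library(forecast)

# Keep only the most recent 200 observations (an arbitrary cut-off here),
# then fit and forecast as usual.
n_keep <- 200
y_recent <- window(y, start = time(y)[length(y) - n_keep + 1])
fit <- auto.arima(y_recent)
fc <- forecast(fit, h = 12)
```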
Earlier this week I had coffee with Ben Fulcher who told me about his online collection comprising about 30,000 time series, mostly medical series such as ECG measurements, along with meteorological series, birdsong, etc. There are some finance series, but little other data from a business or economic context, although he does include my Time Series Data Library. In addition, he provides Matlab code to compute a large number of characteristics. Anyone wanting to test time series algorithms on a large collection of data should take a look.
Unfortunately there is no R code, and no R interface for downloading the data.
Many functions in the forecast package for R will allow a Box-Cox transformation. The models are fitted to the transformed data and the forecasts and prediction intervals are back-transformed. This preserves the coverage of the prediction intervals, and the back-transformed point forecast can be considered the median of the forecast densities (assuming the forecast densities on the transformed scale are symmetric). For many purposes, this is acceptable, but occasionally the mean forecast is required. For example, with hierarchical forecasting the forecasts need to be aggregated, and medians do not aggregate but means do.
It is easy enough to derive the mean forecast using a Taylor series expansion. Suppose $f$ represents the back-transformation function, $\mu$ is the mean on the transformed scale and $\sigma^2$ is the variance on the transformed scale. Then using the first three terms of a Taylor expansion around $\mu$, the mean on the original scale is given by
$$f(\mu) + \frac{\sigma^2}{2}f''(\mu).$$
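For instance, with a log transformation (Box-Cox with $\lambda=0$) we have $f(x)=e^x$ and $f''(x)=e^x$, so the mean forecast is approximately $e^{\mu}(1+\sigma^2/2)$. A rough sketch in R, where the series y, the ETS model and the 12-step horizon are placeholder choices:

```r
library(forecast)

# Sketch only: fit a model to log-transformed data (lambda = 0), then
# adjust the back-transformed point forecasts from medians to means.
fit <- ets(y, lambda = 0)
fc <- forecast(fit, h = 12, level = 95)

# Recover the forecast variance on the transformed (log) scale from the
# 95% prediction interval, then apply mean = exp(mu) * (1 + sigma^2 / 2).
sigma2 <- ((log(fc$upper[, 1]) - log(fc$lower[, 1])) / (2 * qnorm(0.975)))^2
fc_mean <- fc$mean * (1 + sigma2 / 2)
```

More recent versions of the forecast package can also perform this adjustment for you via the biasadj argument.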
Last week we had the pleasure of a visit from Professor Stephen Pollock (University of Leicester), who is best known in academic circles for his work on time series filtering (see his papers, and his excellent book). But he has another career as a member of the UK House of Lords (under the name Viscount Hanworth; he is a hereditary peer).
It made me wonder how many other politicians have PhDs (or equivalent) in statistics, or at least in mathematics. I realise that many mathematicians before the 20th century were involved in politics in one way or another, especially in France. Also, the notion of a PhD is a relatively recent invention. But if we restrict attention to 1950 onwards, there must be quite a few politicians with doctorates in the mathematical sciences. (more…)
Sometimes it is useful to “backcast” a time series — that is, forecast in reverse time. Although there are no in-built R functions to do this, it is very easy to implement. Suppose
x is our time series and we want to backcast for h periods. Here is some code that should work for most univariate time series. The example is non-seasonal, but the code will also work with seasonal data. (more…)
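The general idea is to reverse the series in time, forecast the reversed series, and then reverse the forecasts back. A rough sketch along these lines (not necessarily the code the post goes on to give; the horizon of 20 periods and the use of auto.arima are illustrative choices):

```r
library(forecast)

# Backcast h periods before the start of x by reversing time.
h <- 20
f <- frequency(x)

# Reverse the series in time and fit a model to it.
revx <- ts(rev(x), frequency = f)
fit <- auto.arima(revx)
fc <- forecast(fit, h = h)

# Reverse the point forecasts so they precede the start of x.
backcasts <- ts(rev(fc$mean), end = tsp(x)[1] - 1 / f, frequency = f)
```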