Forecasting annual totals from monthly data

This question was posed on crossvalidated.com:

I have a monthly time series (for 2009-2012 non-stationary, with seasonality). I can use ARIMA (or ETS) to obtain point and interval forecasts for each month of 2013, but I am interested in forecasting the total for the whole year, including prediction intervals. Is there an easy way in R to obtain interval forecasts for the total for 2013?

I’ve come across this problem before in my consulting work, although I don’t think I’ve ever published my solution. So here it is.

If x is your monthly time series, then you can construct annual totals as follows.

    library(forecast)
    y <- filter(x, rep(1,12), sides=1) # Total of last 12 months
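As a quick check of what the filter produces, here is a minimal sketch in base R. The series x below is simulated purely for illustration (it is not the data from the question), and `stats::filter` is written out in full only to avoid clashes with other packages that define their own `filter()`.

```r
# Simulated monthly series for 2009-2012; the values are made up for illustration.
set.seed(1)
x <- ts(cumsum(rnorm(48)) + 10 * sin(2 * pi * (1:48) / 12),
        start = c(2009, 1), frequency = 12)

# Rolling 12-month totals, computed as in the post.
y <- stats::filter(x, rep(1, 12), sides = 1)

# The first 11 values are NA because a full year of data is not yet available.
stopifnot(all(is.na(y[1:11])), !anyNA(y[12:48]))

# Every later value is the sum of the latest 12 observations of x.
stopifnot(isTRUE(all.equal(as.numeric(y[12]), sum(x[1:12]))))
stopifnot(isTRUE(all.equal(as.numeric(y[48]), sum(x[37:48]))))

# Consecutive totals differ by x[t] - x[t-12], so first differences of y
# match lag-12 differences of x wherever both are defined (t = 13, ..., 48).
dy <- as.numeric(diff(y, lag = 1))[12:47]
dx <- as.numeric(diff(x, lag = 12))
stopifnot(isTRUE(all.equal(dy, dx)))
```

So the value of y for December of any year is that calendar year's total, and the other months give totals over the 12 months ending in that month.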

To get the forecasts of the annual totals:

    fit <- auto.arima(y)
    forecast(fit, h=12)

The last forecast is for the total of the next year.
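To pull out just that final value with its interval, something like the following sketch could be used. It assumes the forecast package is installed and uses the built-in `USAccDeaths` series as a stand-in for x (it is not the data from the question); the 95% level and the variable names are also just illustrative choices.

```r
library(forecast)

x <- USAccDeaths                               # stand-in monthly series (assumption)
y <- stats::filter(x, rep(1, 12), sides = 1)   # rolling 12-month totals

fit <- auto.arima(y)
fc  <- forecast(fit, h = 12, level = 95)

# The 12th step ahead is the total over the 12 months following the data,
# i.e. the next calendar year when the data end in December.
h <- 12
point <- as.numeric(fc$mean[h])
lower <- as.numeric(fc$lower[h, 1])   # single level requested, so one column
upper <- as.numeric(fc$upper[h, 1])

c(point = point, lower = lower, upper = upper)
```

The earlier forecasts (steps 1 to 11) are totals over windows that mix observed and future months, so only the last one corresponds to a full forecast year.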

Note that diff(y,lag=1) is the same as diff(x,lag=12). So, provided $D\ge 1$, if an ARIMA(p,d,q)(P,D,Q)$_{12}$ model is appropriate for the x series, then an ARIMA(p,d+1,q)(P,D-1,Q)$_{12}$ model is appropriate for the y series. However, auto.arima may not choose the equivalent models because the filtering and differencing will lead to different numbers of observations. To take advantage of this result, and use all the available data as efficiently as possible, the following code is better, assuming $d=D=1$ is appropriate for x:

    fit <- auto.arima(x, d=1, D=1)
    fit$arma[c(6,7)] <- c(2,0)   # elements 6 and 7 of arma hold d and D
    fit <- Arima(y, model=fit)
    forecast(fit, h=12)

• Richard Warnung

Thanks for posting this solution. If we replace x by a data set from the fpp package then this could be fancy, right? If I want to use this as an exercise, would you recommend some other data set?

• Yes, you could do this with the fancy data from the fpp package. However, you would need to model the logarithms, not the raw data.

• Anthony

Professor Hyndman,

How can we plot MAPE or MASE vs different lead times?

Thanks

Anthony

• plot(1:h, mape) for example. You will need to compute the mape vector yourself.

• Anthony

Thanks. I will find the values of MAPE for different h using accuracy() and construct a vector.

Regards,

Anthony

• Fraj

Professor Hyndman,

First of all, I must thank you very much for your valuable work.

I have 3000 monthly time series (sales revenue by agent), each with 30 observations, and each series has different behaviour. I am looping through all of them using ets to get a 6-month forecast, but sometimes the forecast is just linear while the series is very random!! The questions are:

1. How to choose between ets and auto.arima in a loop? Can I loop and compare BIC and/or AIC and choose the fit with the lowest BIC/AIC?
2. Why is the forecast sometimes just linear?

• 1. You can’t compare AIC/BIC between ets and auto.arima. See http://robjhyndman.com/hyndsight/aic/ for details.

2. The point forecasts are means of future distributions. It is common for the means to be linear or flat. See http://robjhyndman.com/hyndsight/flat-forecasts/

• Fraj

Thanks a lot for your prompt reply, really appreciated.

The ts are very different: some of them have only trend, or only seasonality, and some have trend & seasonality, or it can even be a random walk…. Any suggestion/advice on how to approach this in a batch process?

Thanks

• I would choose either ets or auto.arima and stick with that for all series. I’ve had better results with ets when applied to large numbers of time series of sales data.

• Fraj

Thanks again, I’ll give it a try with either function. If it does not work well, can I share 10 series (csv, matrix 30x10) with you for testing?

Appreciated, thanks

• Fraj Lazreg

Prof. Hyndman,

I have a very high level in both ts forecasting and programming. I am trying to make good use of your R forecasting package using a couple of thousand ts (with/without trend and/or with/without seasonality). I have used dummy data to test a batch script with ets and auto.arima but, as explained above, some forecasts are just linear (averages of future distributions, as you mentioned). My questions are:

1. Why are ets and auto.arima giving linear forecasts while HW(add) and HW(mul) are not?
2. How to avoid getting linear forecasts?

I would appreciate it if you could have a look at this ts:

[99,98,68,78,88,78,98,98,80,79,86,83,94,93,91,99,91,92,71,100,65,85,82,99,100,95,98,75,75,67]

    fitets=ets(t)
    fitarima=auto.arima(t)
    fithwa=HoltWinters(t,seasonal="a")
    fithwm=HoltWinters(t,seasonal="m")
    fets=forecast(fitets,h=6)
    farima=forecast(fitarima,h=6)
    fhwa=forecast.HoltWinters(fithwa,h=6)
    fhwm=forecast.HoltWinters(fithwm,h=6)
    mts=matrix(NA,nrow=36,ncol=4)
    mts[,1]=c(t,fets$mean)
    mts[,2]=c(t,farima$mean)
    mts[,3]=c(t,fhwa$mean)
    mts[,4]=c(t,fhwm$mean)
    plot(mts)

best regards & thanks a lot

• Cyrille

Hello Professor,

thank you for all the very useful information you post on this blog and the fantastic forecast package.

Concerning this trick for forecasting annual volumes, I’ve been using it extensively for years to forecast 3 months / 6 months / 12 months volumes, considering that the decrease in variability coming from using cumulated series would certainly improve the accuracy of my forecasts. It is also an easy way to communicate when trying to change the Sales & operation planning process in a firm.

Recently I realized that it did not drive so much improvement in accuracy, and often the accuracy is actually lower than the one resulting from aggregating up the forecasts from monthly forecasts. I don’t see where it can come from and I would like to know if you had any thoughts on this? I would certainly share examples if the topic is of interest for you and your readers.

Have a nice day.

Regards,

Cyrille

• If you are only interested in point forecasts, then just aggregating the monthly forecasts should be fine. There is no reason why the cumulative totals will lead to better forecast accuracy. The point of my suggestion in this post is to provide a way of getting prediction intervals for the cumulative totals.

I am actually working on a new proposal where forecasts of various aggregates (e.g., 3 months, 6 months, 12 months) are required, and I have found a way to improve accuracy by considering all the aggregates required in combination. I’ll post about the idea when the paper is finished.

• Cyrille

Hello Professor,

thank you for your quick reply. I guess my opinion was biased by empirical results where this was useful even for point forecasts.

As you might imagine, I’ll be very interested in your solution for aggregates and quite sure I’ll be testing it in real-life situations.

Many thanks,

Regards,

Cyrille

• Mandy Oud

Hi Professor Hyndman, I have a question about the difference between the running year forecast and the sum of the monthly point forecasts. My annual forecast is more optimistic and higher than the monthly forecasts summed up. Probably the year forecast takes account of the whole time series while the monthly forecast gives higher weights to recent data. Now can I assume that the prediction based on monthly point forecasts is more accurate than the year forecast?

• Maybe. It sounds like you could use the thief package to reconcile forecasts at different levels of aggregation. See http://robjhyndman.com/hyndsight/thief/

• Mandy Oud

Perfect! Just what I needed, thank you!