Forecasting annual totals from monthly data

This question was posed on crossvalidated.com:

I have a monthly time series (for 2009–2012 non-stationary, with seasonality). I can use ARIMA (or ETS) to obtain point and interval forecasts for each month of 2013, but I am interested in forecasting the total for the whole year, including prediction intervals. Is there an easy way in R to obtain interval forecasts for the total for 2013?

I’ve come across this problem before in my consulting work, although I don’t think I’ve ever published my solution. So here it is.

If x is your monthly time series, then you can construct annual totals as follows.

y <- filter(x,rep(1,12), sides=1) # Total of last 12 months
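
As a quick check of what this trailing filter returns, here is a minimal sketch on a made-up series (the simulated x is an assumption for illustration only). With sides=1 the filter uses only the current and past values, so the first 11 entries of y are missing and each later entry is the sum of the 12 most recent months. stats::filter is called explicitly because other packages, such as dplyr, also export a function named filter.

```r
set.seed(1)
# Hypothetical monthly series covering 2009-2012 (48 observations)
x <- ts(round(100 + cumsum(rnorm(48))), start = c(2009, 1), frequency = 12)

# Trailing 12-month totals
y <- stats::filter(x, rep(1, 12), sides = 1)

sum(is.na(y))             # 11: no complete 12-month window yet
c(y[12], sum(x[1:12]))    # both entries are the 2009 total
c(y[48], sum(x[37:48]))   # both entries are the 2012 total
```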

To get the forecasts of the annual totals:

library(forecast)
fit <- auto.arima(y)
forecast(fit,h=12)

The last forecast is for the total of the next year.
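
A minimal sketch of pulling out just the annual total and its prediction interval, assuming the series ends in December so that the 12th forecast covers a full calendar year. AirPassengers (a dataset that ships with R) stands in for x, and the names fc and annual are illustrative only. The leading missing values of y are dropped with na.omit() before fitting.

```r
library(forecast)

x <- AirPassengers                                     # stand-in monthly series, ends Dec 1960
y <- na.omit(stats::filter(x, rep(1, 12), sides = 1))  # trailing 12-month totals

fit <- auto.arima(y)
fc  <- forecast(fit, h = 12, level = 95)

# The 12th forecast of y is the total for the 12 months after the data ends
annual <- c(point = fc$mean[12],
            lower = as.numeric(fc$lower)[12],
            upper = as.numeric(fc$upper)[12])
annual
```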

Note that diff(y,lag=1) is the same as diff(x,lag=12). So, provided D≥1, if an ARIMA(p,d,q)(P,D,Q)[12] is appropriate for the x series, then an ARIMA(p,d+1,q)(P,D-1,Q)[12] is appropriate for the y series. However, auto.arima may not choose the equivalent models because the filtering and differencing will lead to different numbers of observations. To take advantage of this result, and use all the available data as efficiently as possible, the following code is better, assuming d=D=1 is appropriate for x:

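fit <- auto.arima(x,d=1,D=1)
fit$arma[c(6,7)] <- c(2,0)  # arma[6] is d and arma[7] is D, so this sets d=2, D=0
fit <- Arima(y,model=fit)   # apply the fitted coefficients to the annual totals
forecast(fit,h=12)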
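
The differencing identity behind this can be checked numerically. The sketch below uses a simulated series (an assumption for illustration). Because each entry of y adds the newest month and drops the month from 12 periods earlier, one first difference of y equals one seasonal difference of x, which is why d=1, D=1 for x corresponds to d=2, D=0 for y.

```r
set.seed(42)
x <- ts(cumsum(rnorm(60)), frequency = 12)     # hypothetical monthly series
y <- stats::filter(x, rep(1, 12), sides = 1)   # trailing 12-month totals

d_y <- as.numeric(diff(y, lag = 1))    # first 11 values are NA
d_x <- as.numeric(diff(x, lag = 12))   # 12 values shorter than x

# After dropping the leading NAs the two series agree
all.equal(d_y[!is.na(d_y)], d_x)
```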

  • Richard Warnung

    Thanks for posting this solution. If we replace x by a data set from the fpp package then this could be fancy, right? If I want to use this as an exercise, would you recommend some other data set?

    • Rob J Hyndman

      Yes, you could do this with the fancy data from the fpp package. However, you would need to model the logarithms, not the raw data.

  • Anthony

    Professor Hyndman,
    How can we plot MAPE or MASE vs different lead times?

    • Rob J Hyndman

      plot(1:h, mape) for example. You will need to compute the mape vector yourself.

      • Anthony

        Thanks. I will find the values of MAPE for different h using accuracy() and construct a vector.
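
A minimal sketch of one way to build such a mape vector, using a small rolling-origin loop rather than accuracy(). The dataset (AirPassengers), the choice of forecast origins, and the use of ets are all assumptions for illustration; any forecasting function could be substituted.

```r
library(forecast)

y <- AirPassengers                       # stand-in series; replace with your own
h <- 12                                  # longest lead time to evaluate
n <- length(y)
origins <- seq(n - 36, n - h, by = 6)    # forecast origins (an arbitrary choice)

# Absolute percentage errors: one row per origin, one column per lead time
ape <- matrix(NA_real_, nrow = length(origins), ncol = h)
for (i in seq_along(origins)) {
  train  <- ts(y[1:origins[i]], start = start(y), frequency = frequency(y))
  actual <- y[origins[i] + 1:h]
  fc     <- forecast(ets(train), h = h)
  ape[i, ] <- 100 * abs((actual - as.numeric(fc$mean)) / actual)
}

mape <- colMeans(ape)                    # one MAPE value per lead time
plot(1:h, mape, type = "b", xlab = "Lead time (months)", ylab = "MAPE (%)")
```

With only a handful of origins the curve will be noisy; more origins give a smoother picture at the cost of more model fits.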

  • Fraj

    Professor Hyndman,

    First of all, I must thank you very much for your valuable work.

    I have 3000 monthly time series (sales revenue by agent), each with 30 observations, and each series behaves differently.

    I am looping through all of them using ets to get a 6-month forecast, but sometimes the forecast is just linear while the series is very random!

    The question is:
    1. How do I choose between ets and auto.arima in a loop? Can I loop and compare BIC and/or AIC and choose the fit with the lowest BIC/AIC?
    2. Why is the forecast sometimes just linear?

    • Rob J Hyndman

      1. You can’t compare AIC/BIC between ets and auto.arima. See http://robjhyndman.com/hyndsight/aic/ for details.

      2. The point forecasts are means of future distributions. It is common for the means to be linear or flat. See http://robjhyndman.com/hyndsight/flat-forecasts/

      • Fraj

        Thanks a lot for your prompt reply, really appreciated.
        The ts are very different: some of them have only trend or only seasonality, some have trend & seasonality, and some may even be a random walk.
        Any suggestion/advice on how to approach this in a batch process?


        • Rob J Hyndman

          I would choose either ets or auto.arima and stick with that for all series. I’ve had better results with ets when applied to large numbers of time series of sales data.
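
A minimal sketch of such a batch run, applying ets to every column of a matrix of series. The simulated sales matrix (10 series of 30 monthly observations) and the names sales and point_fc are hypothetical stand-ins for the real data.

```r
library(forecast)

# Hypothetical data: 10 monthly series with 30 observations each
set.seed(123)
sales <- ts(matrix(rpois(30 * 10, lambda = 50), nrow = 30, ncol = 10),
            start = c(2011, 1), frequency = 12)

h <- 6                                   # forecast horizon in months
point_fc <- matrix(NA_real_, nrow = h, ncol = ncol(sales))

for (j in seq_len(ncol(sales))) {
  fit <- ets(sales[, j])                 # same method for every series
  point_fc[, j] <- as.numeric(forecast(fit, h = h)$mean)
}

dim(point_fc)                            # h rows, one column per series
```

For series with no detectable trend or seasonality, the selected model will often produce flat point forecasts, which is the behaviour discussed above.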

          • Fraj

            Thanks again, I’ll give it a try with either function.
            If it does not work well, can I share 10 series (csv, 30×10 matrix) with you for testing?

            Appreciated, thanks

          • Fraj Lazreg

            Prof. Hyndman,

            I have a very high level in both ts forecasting and programming. I am trying to make good use of your R forecasting package using a couple of thousand ts (with/without trend and/or with/without seasonality).

            I have used dummy data to test a batch script with ets and auto.arima but, as explained above, some forecasts are just linear (the average of future distributions, as you mentioned).

            My questions are:

            1. Why are ets and auto.arima giving linear forecasts while HW(add) and HW(mul) are not?

            2. How can I avoid getting a linear forecast?

            I would appreciate it if you could have a look at this ts:


            mts[,1]=c(t,fetsmean) mts[,2]=c(t,farimamean)
            mts[,3]=c(t,fhwamean) mts[,4]=c(t,fhwmmean)

            best regards & thanks a lot

          • Rob J Hyndman
  • Cyrille

    Hello Professor,

    thank you for all the very useful information you post on this blog and the fantastic forecast package.

    Concerning this trick for forecasting annual volumes, I’ve been using it extensively for years to forecast 3 months / 6 months / 12 months volumes, considering that the decrease in variability coming from using cumulated series would certainly improve the accuracy of my forecasts. It is also an easy way to communicate when trying to change the Sales & operation planning process in a firm.

    Recently I realized that it did not drive so much improvement in accuracy, and often the accuracy is actually lower than the one resulting from aggregating up the forecasts from monthly forecasts. I don’t see where it can come from and I would like to know if you had any thoughts on this? I would certainly share examples if the topic is of interest for you and your readers.

    Have a nice day.



    • Rob J Hyndman

      If you are only interested in point forecasts, then just aggregating the monthly forecasts should be fine. There is no reason why the cumulative totals will lead to better forecast accuracy. The point of my suggestion in this post is to provide a way of getting prediction intervals for the cumulative totals.

      I am actually working on a new proposal where forecasts of various aggregates (e.g., 3 months, 6 months, 12 months) are required, and I have found a way to improve accuracy by considering all the aggregates required in combination. I’ll post about the idea when the paper is finished.
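
For the point-forecast case mentioned in this reply, aggregating the monthly forecasts amounts to a simple sum. A minimal sketch, assuming AirPassengers as a stand-in series and auto.arima as the monthly model (the names fc_monthly and annual_point are illustrative):

```r
library(forecast)

x <- AirPassengers                      # stand-in for the monthly series
fc_monthly <- forecast(auto.arima(x), h = 12)

# Point forecast of next year's total: sum of the 12 monthly point forecasts
annual_point <- sum(fc_monthly$mean)
annual_point

# The monthly interval limits should not be summed in the same way:
# forecast errors in different months are generally correlated, and
# quantiles of a sum are not the sum of the quantiles.
```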

      • Cyrille

        Hello Professor,

        thank you for your quick reply. I guess my opinion was biased by empirical results where this was useful even for Point forecasts.

        As you might imagine, I’ll be very interested in your solution for aggregates and quite sure I’ll be testing it in real-life situations.

        Many thanks,