Out-of-sample one-step forecasts

It is common to fit a model using training data, and then to evaluate its performance on a test data set. When the data are time series, it is useful to compute one-step forecasts on the test data. For some reason, this is much more commonly done by people trained in machine learning than by people trained in statistics.

If you are using the forecast package in R, it is easily done with ETS and ARIMA models. For example:

library(forecast)
fit <- ets(trainingdata)
fit2 <- ets(testdata, model=fit)
onestep <- fitted(fit2)

Note that the second call to ets does not involve the model being re-estimated. Instead, the model obtained in the first call is applied to the test data in the second call. This works because fitted values are one-step forecasts in a time series model.

The same process works for ARIMA models when ets is replaced by Arima or auto.arima. Note that it does not work with the arima function from the stats package. One of the reasons I wrote Arima (in the forecast package) is to allow this sort of thing to be done.
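As a rough illustration of the ARIMA case, here is a minimal sketch. The simulated AR(1) series and the 200/100 split are assumptions made up for the example; only Arima(), fitted() and coef() are real functions.

```r
library(forecast)

# Hypothetical data: a simulated AR(1) series, split into training and test sets
set.seed(123)
x <- ts(as.numeric(arima.sim(list(ar = 0.7), n = 300)))
trainingdata <- window(x, end = 200)
testdata <- window(x, start = 201)

# Estimate the model on the training data only
fit <- Arima(trainingdata, order = c(1, 0, 0))

# Apply the estimated model to the test data without re-estimating the coefficients
fit2 <- Arima(testdata, model = fit)

# Fitted values of the refit are one-step forecasts on the test data
onestep <- fitted(fit2)

# The coefficients of fit2 should be the ones estimated in fit
coef(fit)
coef(fit2)
```

Comparing coef(fit) with coef(fit2) is a quick way to check that the second call really did reuse the first model.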




  • Ricardo Bessa

    Very useful function. One suggestion for a future version of the ‘forecast’ package is to include a function to test multi-step-ahead prediction with ARIMA and ETS. For instance, fit the model on a training dataset and then conduct multi-step-ahead predictions iteratively on a test dataset.

  • Luis Juan

    Forecast is a very useful package. We use it a lot for classes and research. I want to compare ARIMA models with exponential smoothing on a data set (I’m using Arima and auto.arima). I have fitted a very complex ARIMA model (mod_A) to hourly data covering n days. Now I would like to obtain the forecast errors for the next 20 days using the model. I want to keep the model fixed (mod_A) and obtain the 24 forecasts for day n+1. Then, with the same model and the time series updated up to day n+1, I want to forecast the 24 hours corresponding to day n+2, and so on. How can I do this with Arima?

    • http://robjhyndman.com Rob J Hyndman

      You will need to use a loop with the following commands within it.

      fit <- Arima(x, model=mod_A)
      fcast <- forecast(fit, h=24)
      e <- y - fcast$mean

      where x is the data up to the forecast origin, and y is the data for the next 24 hours. The first line applies model mod_A without re-estimating it.
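      For anyone wanting the whole loop spelled out, here is a minimal sketch of one way to arrange it. The simulated hourly series, the AR(2) stand-in for mod_A, and the names series, n_train, days and errors are all assumptions made for illustration.

      ```r
      library(forecast)

      # Hypothetical hourly data: 30 days, with the first 20 days used to estimate mod_A
      set.seed(1)
      series <- ts(as.numeric(arima.sim(list(ar = c(0.6, 0.2)), n = 720)), frequency = 24)
      n_train <- 20 * 24
      h <- 24
      days <- 10

      # Stand-in for the fixed model; in practice this is whatever model was already fitted
      mod_A <- Arima(ts(series[1:n_train], frequency = 24), order = c(2, 0, 0))

      # One row of 24 forecast errors per forecast origin
      errors <- matrix(NA_real_, nrow = days, ncol = h)
      for (d in seq_len(days)) {
        origin <- n_train + (d - 1) * h
        x <- ts(series[1:origin], frequency = 24)   # data up to the forecast origin
        y <- series[(origin + 1):(origin + h)]      # the next 24 hours
        fit <- Arima(x, model = mod_A)              # apply mod_A without re-estimating it
        fcast <- forecast(fit, h = h)
        errors[d, ] <- y - as.numeric(fcast$mean)
      }
      ```

      Each pass moves the forecast origin forward by one day, so x grows by 24 observations while the coefficients stay those of mod_A.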

      • Luis Juan

        Thank you for such a quick and effective response. It is great, thanks.

        • http://www.facebook.com/SatyamTwanabasu Satyam Twanabasu

          Rob Sir and Luis,
          I beg your pardon for asking this (it may seem obvious to you), but I need to ask for my own understanding. I am dealing with a situation similar to the one Luis had. In my case, I have to forecast the next 48 hours and keep the forecast moving forward while keeping the original model unchanged. In this context, I don’t understand what the data ‘x’ in Rob’s reply refers to. How does it relate to the trainingdata and testdata shown in the blog post? Can you please elaborate? Luis, how did you deal with it?
          Thanks in advance.

          • http://robjhyndman.com Rob J Hyndman

            x is the data up to the forecast origin.

  • Antonio

    Good post. Rob, I’m trying to do something similar to Luis Juan (forecast n+1 and so on) but using bats and Arima. Arima works pretty well, but using the same commands with bats in place of Arima gives a lot of NaNs after approx. 23 of the out-of-sample data. Is there a different treatment for bats? If not, have you experienced anything similar? I’m running out of ideas here. Thanks

    • Antonio

      I’m sorry, I meant to say in-sample data. I have 8760 observations plus 672 out-of-sample. I get the NaNs after approximately observation number 6717.

      • Antonio

        In fact, the values that I get before the NaNs are huge and make no sense at all, but the fitted.values extracted from the model_A bats fit are fine. So the error comes only when I do the step bats(x+y,model=mod_A), where y are the new observations.

        • http://robjhyndman.com Rob J Hyndman

          There is no model argument for bats(). So there is currently no way of getting forecasts on new data without re-estimating the model. Something for a future version.

          • Antonio

            Thanks indeed and congrats for your blog.

  • Pattana Lee

    Does this trick work with arfima() in the {forecast} package as well? Thank you.

    • http://robjhyndman.com/ Rob J Hyndman

      No, but I’ll add it to the list of feature requests.

      • Pattana Lee

        Thank you very much.

  • sana

    Kindly let me know about the accuracy measure. Is there any way to get MASE for the test dataset when I have one-step forecasts?

    • http://robjhyndman.com/ Rob J Hyndman

      Use the accuracy() command.

      • sana

        Thanks, I found the way.
        One more thing: I am working with temperature data, which has negative values and of course no absolute zero. I want to compare Arima, ets and splinef using one-step forecasts. Is it reasonable to compare them with MAPE? Is MASE a good measure for comparing accuracy in this case?

        • http://robjhyndman.com/ Rob J Hyndman

          MAPE makes no sense with temperatures.
          Look at the code for accuracy to see precisely what it is doing.
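          To make the comparison concrete, here is a minimal sketch that computes a MASE by hand for one-step forecasts on a test set. The simulated series stands in for temperature data, and scaling by the in-sample naive (non-seasonal) errors of the training data is an assumption of this sketch; accuracy() may scale differently, for example for seasonal data.

          ```r
          library(forecast)

          # Hypothetical "temperature-like" series that crosses zero
          set.seed(42)
          temps <- ts(as.numeric(arima.sim(list(ar = 0.8), n = 300)))
          trainingdata <- window(temps, end = 200)
          testdata <- window(temps, start = 201)

          fit <- Arima(trainingdata, order = c(1, 0, 0))
          fit2 <- Arima(testdata, model = fit)   # same model applied to the test data

          # One-step forecast errors on the test data
          e <- as.numeric(testdata) - as.numeric(fitted(fit2))

          # MASE: test-set MAE scaled by the training-set MAE of the naive forecast
          scale <- mean(abs(diff(as.numeric(trainingdata))))
          mase <- mean(abs(e)) / scale

          # MAPE divides by the observations, so values near zero inflate it arbitrarily
          mape <- mean(abs(100 * e / as.numeric(testdata)))

          # Accuracy measures reported by the package for the refitted model
          accuracy(fit2)
          ```

          Because MASE is scaled by errors measured in the same units as the data, it does not depend on where zero sits on the temperature scale, whereas MAPE does.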

          • sana

            Thanks a lot.