Out-of-sample one-step forecasts

It is common to fit a model using training data, and then to evaluate its performance on a test data set. When the data are time series, it is useful to compute one-step forecasts on the test data. For some reason, this is much more commonly done by people trained in machine learning than by those trained in statistics.

If you are using the forecast package in R, it is easily done with ETS and ARIMA models. For example:

fit <- ets(trainingdata)
fit2 <- ets(testdata, model=fit)
onestep <- fitted(fit2)

Note that the second call to ets does not involve the model being re-estimated. Instead, the model obtained in the first call is applied to the test data in the second call. This works because fitted values are one-step forecasts in a time series model.

The same process works for ARIMA models when ets is replaced by Arima or auto.arima. Note that it does not work with the arima function from the stats package. One of the reasons I wrote Arima (in the forecast package) is to allow this sort of thing to be done.
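A minimal end-to-end sketch of the process described above, for both ETS and ARIMA. It assumes the built-in AirPassengers series, and the split date is an arbitrary choice made for illustration only. Note that the second call sees only the test series, so the earliest fitted values in the test period are computed without the training history.

```r
library(forecast)

# Illustrative split of a built-in monthly series (the split point is an assumption)
trainingdata <- window(AirPassengers, end = c(1957, 12))
testdata     <- window(AirPassengers, start = c(1958, 1))

# ETS: estimate on the training data, then apply the same model to the test data
fit_ets     <- ets(trainingdata)
fit_ets2    <- ets(testdata, model = fit_ets)
onestep_ets <- fitted(fit_ets2)

# ARIMA: select the model with auto.arima(), then apply it with Arima()
fit_arima     <- auto.arima(trainingdata)
fit_arima2    <- Arima(testdata, model = fit_arima)
onestep_arima <- fitted(fit_arima2)

# Compare one-step accuracy on the test set
rbind(
  ETS   = accuracy(onestep_ets, testdata)[1, ],
  ARIMA = accuracy(onestep_arima, testdata)[1, ]
)
```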

  • Ricardo Bessa

    Very useful function. One suggestion for a future version of the ‘forecast’ package is to include a function to test multi-step-ahead prediction with ARIMA and ETS. For instance, fit the model on a training dataset and then produce multi-step-ahead predictions iteratively on a test dataset.

  • Luis Juan

    Forecast is a very useful package. We use it a lot for classes and research. I want to compare ARIMA models with exponential smoothing on a data set (I’m using Arima and auto.arima). I have fitted a very complex ARIMA model (mod_A) to hourly data covering n days. Now I would like to obtain the forecast errors for the next 20 days using the model. I want to keep the model fixed (mod_A) and obtain the 24 forecasts for day n+1. Then, with the same model and the time series updated up to day n+1, I want to forecast the 24 hours corresponding to day n+2, and so on. How can I do this with Arima?

    • You will need to use a loop with the following commands within it.

      fit <- Arima(x, model=mod_A)
      fcast <- forecast(fit, h=24)
      e <- y - fcast$mean

      where x is the data up to the forecast origin, and y is the data for the next 24 hours. The first line applies model mod_A without re-estimating it.
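The three commands above can be wrapped in a loop along the following lines. This is only a sketch under stated assumptions: the hourly series is simulated, the ARIMA(2,0,1) order is an arbitrary stand-in for mod_A, and the names y_all, n_train, origins and errors are hypothetical. At each forecast origin, x is everything observed so far (the training data plus the part of the test data already seen), and y is the next 24 hours.

```r
library(forecast)

set.seed(1)
# Simulated hourly series: 30 days with a daily cycle (an assumption for illustration)
n_obs <- 24 * 30
y_all <- as.numeric(10 + sin(2 * pi * (1:n_obs) / 24) +
                    arima.sim(list(ar = 0.7), n = n_obs, sd = 0.3))

n_train <- 24 * 20   # first 20 days used to estimate the model
h       <- 24        # forecast one day ahead at each origin

# Stand-in for mod_A: estimated once, never re-estimated inside the loop
mod_A <- Arima(ts(y_all[1:n_train], frequency = 24), order = c(2, 0, 1))

# Forecast origins: end of day 20, end of day 21, ...
origins <- seq(n_train, n_obs - h, by = h)
errors  <- matrix(NA_real_, nrow = length(origins), ncol = h)

for (i in seq_along(origins)) {
  x <- ts(y_all[1:origins[i]], frequency = 24)     # data up to the forecast origin
  y <- y_all[(origins[i] + 1):(origins[i] + h)]    # the next 24 hours
  fit   <- Arima(x, model = mod_A)                 # applies mod_A without re-estimating
  fcast <- forecast(fit, h = h)
  errors[i, ] <- y - as.numeric(fcast$mean)
}

# Root mean squared error by forecast horizon (1 to 24 hours ahead)
sqrt(colMeans(errors^2))
```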

      • Luis Juan

        Thank you for such a quick and effective response. It is great, thanks.

        • Rob and Luis,
          I apologize if this seems obvious, but I need to ask for my own understanding. I am dealing with a situation similar to Luis’s. In my case, I have to forecast the next 48 hours and keep the forecasts moving forward, keeping the original model unchanged. In this context, I don’t understand what the data ‘x’ (in Rob’s reply) refers to. How does it relate to the trainingdata and testdata shown in the blog post? Could you please elaborate? Luis, how did you deal with it?
          Thanks in advance.

      • Robin Westerberg

        This example was really helpful for us. We still have a small problem with our test. Our data are univariate once the dates are excluded (monthly exchange rates). We would like to loop over an auto.arima model that changes through time. The commands you gave here work fine for our case, but we are having trouble putting them into a loop. Should we use the same loop as you showed in http://robjhyndman.com/hyndsight/batch-forecasting/ , and how should the two be combined in that case?

        This is our version of the fcast function:

        fit <- auto.arima(EUR1$DEUR3M[1:84])
        fcast <- forecast(fit, h=3)
        e <- EUR1$DEUR3M[85:87] - fcast$mean

        Thank you a lot for the help in advance!

  • Antonio

    Good post. Rob, I’m trying to do something similar to Luis Juan (forecast n+1 and so on) but using bats and Arima. Arima works pretty well, but using the same commands with bats in place of Arima gives a lot of NaNs after approximately 2/3 of the out-of-sample data. Is there a different treatment for bats? If not, have you experienced anything similar? I’m running out of ideas here. Thanks

    • Antonio

      I’m sorry, I meant to say in-sample data. I have 8760 observations plus 672 out of sample. I get the NaNs after approximately observation number 6717.

      • Antonio

        In fact, the values that I get before the NaNs are huge and make no sense at all, but the fitted values extracted from the bats model mod_A are fine, so the error occurs only when I do the step bats(x+y, model=mod_A), where y are the new observations.

        • There is no model argument for bats(). So there is currently no way of getting forecasts on new data without re-estimating the model. Something for a future version.

          • Antonio

            Thanks indeed and congrats for your blog.

  • Pattana Lee

    Does this trick work with arfima() in the {forecast} package as well? Thank you.

    • No, but I’ll add it to the list of feature requests.

      • Pattana Lee

        Thank you very much.

  • sana

    Kindly let me know about the accuracy measures. Is there any way to get MASE for the test data set when I have one-step forecasts?

    • Use the accuracy() command.

      • sana

        Thanks. I found the way.
        One more thing: I am working with temperature data, which has negative values and of course no absolute zero. I want to compare Arima, ets and splinef using one-step forecasts. Is it reasonable to compare them with MAPE? Is MASE a good measure for comparing accuracy in this case?

        • MAPE makes no sense with temperatures.
          Look at the code for accuracy to see precisely what it is doing.

          • sana

            Thanks a lot.
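For the MASE question in this thread, one option is to compute it by hand from the one-step forecasts. The sketch below assumes the built-in nottem temperature series and an arbitrary split date. It scales the test-set mean absolute error by the in-sample mean absolute error of the seasonal naive method on the training data, which is the usual definition for seasonal series. As far as I can tell, accuracy() reports MASE only when it has access to the training data (for example, when given a forecast object), so a plain vector of one-step forecasts needs this manual step.

```r
library(forecast)

# Monthly temperatures at Nottingham; the split point is an assumption
trainingdata <- window(nottem, end = c(1936, 12))
testdata     <- window(nottem, start = c(1937, 1))

# One-step forecasts on the test data from a model estimated on the training data
fit     <- auto.arima(trainingdata)
onestep <- fitted(Arima(testdata, model = fit))

# Scaling factor: in-sample MAE of the seasonal naive method on the training data
m         <- frequency(trainingdata)
mae_naive <- mean(abs(diff(trainingdata, lag = m)))

# MASE of the one-step forecasts on the test set
mase <- mean(abs(testdata - onestep)) / mae_naive
mase
```

Unlike MAPE, this measure does not divide by the observations themselves, so it stays well defined for temperatures at or below zero.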

  • Yang-hui Chang

    Very useful function. I want to know more about the differences between out-of-sample and in-sample. If my sample has 500 observations, is the out-of-sample estimate obtained as below?

    arma1 <- Arima(window(y, end=400))
    arima11 <- Arima(window(y, start=500), model=arma1)

    Or is it something else?

    • You want to start the second window at 401.

      • Yang-hui Chang

        Thanks for your help! I also want to know how to decide the SR and LR matrices in an SVAR or VECM. I am learning.
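A small sketch of the 400/100 split discussed in this thread, with the second window starting at 401. The series is simulated and the AR(1) order is an arbitrary choice for illustration.

```r
library(forecast)

set.seed(123)
# Simulated series of 500 observations, indexed 1, 2, ..., 500
y <- ts(arima.sim(list(ar = 0.6), n = 500))

# Estimate on observations 1 to 400
arma1 <- Arima(window(y, end = 400), order = c(1, 0, 0))

# Apply the estimated model to observations 401 to 500 without re-estimation
arima11 <- Arima(window(y, start = 401), model = arma1)
onestep <- fitted(arima11)

length(onestep)      # one fitted value per test observation
start(onestep)       # the fitted values keep the time index of the test window
```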

  • plat

    Is there a way to do out-of-sample one-step forecasts with an existing stlm model? I’m not sure how that would work, given that you’d have to seasonally decompose the new timeseries data before forecasting.

    • plat

      I have come up with the following strategy, any insight on if it’s correct?

      I use stlm on the training series and get the resulting STL object and ETS model. I use that model as input to the one-step ahead call to ets with the test training series, then I add to the mean the seasonal+trend components from the last season of the STL object. Basically the same as what STLM does internally.

      • OK, except you will need to seasonally adjust the test data too. You could probably use the last year of the seasonal component of the STL object to do that. Also, you only add back the seasonal component, not the trend component.

        • plat

          OK, yes. The trend addition was a mistake. Thanks for the input, Rob, I appreciate it. I also forgot to invert the Box-Cox transformation in case stlm applied one.
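The strategy discussed in this thread could be sketched roughly as follows. This is a hand-rolled approximation under several assumptions, not what stlm does internally: it calls stl() and ets() directly, uses the built-in AirPassengers series with an arbitrary split, uses a periodic seasonal window, and applies no Box-Cox transformation. The names dec, seas_test and test_sa are hypothetical.

```r
library(forecast)

trainingdata <- window(AirPassengers, end = c(1957, 12))
testdata     <- window(AirPassengers, start = c(1958, 1))
m <- frequency(trainingdata)

# STL decomposition of the training data, then a non-seasonal ETS model
# on the seasonally adjusted training series
dec    <- stl(trainingdata, s.window = "periodic")
fit_sa <- ets(seasadj(dec), model = "ZZN")

# Seasonal component for the test period: repeat the last year of the training
# seasonal component (valid because the test data start right after the training data)
last_year <- as.numeric(tail(dec$time.series[, "seasonal"], m))
seas_test <- ts(rep(last_year, length.out = length(testdata)),
                start = start(testdata), frequency = m)

# Seasonally adjust the test data, apply the ETS model without re-estimating
# its smoothing parameters, then add back only the seasonal component
test_sa <- testdata - seas_test
fit_sa2 <- ets(test_sa, model = fit_sa)
onestep <- fitted(fit_sa2) + seas_test
```

If a Box-Cox transformation were used, the decomposition and the ETS model would operate on the transformed scale, and onestep would need to be back-transformed at the end.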

  • John

    Hi Rob, thanks for the post. I would like to ask you about a modification for HAR-RV equation which looks like this:

    har0 = lm(GK~volatd+volatw+volatm+lagret0+lagsign)

    it is based on realised volatility and I need to do a forecast in order to compare similar models in Diebold-Mariano test (dm.test{forecast}). Do you have any idea how to forecast it and save its residuals in order to use it in the dm.test?

    Thanks, John

    • Andres

      Did you manage to solve this? What if you have a quantile regression? How would the forecast code go? Thanks a lot, Rob

  • Dan

    I’ve tried to use this procedure on my data. I used the following code:
    fit2 <- auto.arima(Test,model=fit)
    And it gave me the following error message:
    Error in auto.arima(Test, model = fit) : unused argument (model = fit)
    I'll be grateful for some help,

    • Use `Arima` for refitting. Not auto.arima.

      • Dan

        Thank you for the quick reply!
        I would like though to use the auto.arima for forecasting since it optimizes my model. I’ve tried fitting using auto.arima and then using the same number of AR, MA, etc it gave with the Arima function. But I got different coefficients. Am I missing something?

        • Dan

          I realized I have to add include.constant=TRUE to the Arima function, and then it gives the same model as auto.arima.

          • OK. That sounds like a bug. I’ll look into it.

          • I cannot replicate this problem. Please provide a minimal reproducible example.

  • Monique

    Dear Professor Hyndman

    Very useful package!

    I have a question regarding out of sample forecasting. I have daily time series from 2008-2015. I set 2008-2013 as my training data and 2014-2015 as test data. I want to compare two models based on their out of sample fit, by comparing their forecasts for 2014-2015 to the actual values in the test data. I am using the following models:

    #fit model on traindata
    fit<-auto.arima(traindata, xreg=dummies_train)

    #forecast testdata based on original model
    fit_test<-Arima(testdata, model=fit, xreg=dummies_test)

    Q1: Is it correct that this forecast ‘fit_test’ does not re-estimate the model, but does use information up until yesterday to forecast? So it is a 1-point ahead forecast?

    I also tried to forecast the testdata by using the original model, and not the new data.
    predict_fit <- predict(fit, newxreg=dummies_test, n.ahead=800)
    outsample_accuracy <- accuracy(predict_fit$pred, testdata)

    Q2: Is it correct that this is a forecast that uses only data from the train period, and then forecasts the whole test period by using the model that is based on the train period?

    I tried to do the same with a tbats model.
    fit <- tbats(traindata)
    fcast<-forecast(fit, h=800)
    outsample_accuracy <- accuracy(fitted(fcast),testdata)

    Is it correct that this is the same kind of forecast as I tried with the Arima model, which uses only information from the train period, and then forecasts the whole test period? So not 1-point ahead.

    Q3: Is there already a way to also do 1-point ahead forecasts with the tbats model? Also, is it possible for any of those models to forecast for example 5 days ahead? So using information up until 5 days ago for every forecasted observation?

    Thank you very much in advance.

    • Please ask detailed questions on crossvalidated.com.

      • Monique

        Ok thanks, I will. One short question about your forecast package. Is there already a way to do 1-point ahead forecasts with the tbats model, without re-estimating the model? In the same way as it is possible with arima models:

        fit<-auto.arima(traindata, xreg=dummies_train)
        fit_test<-Arima(testdata, model=fit, xreg=dummies_test)
        Thank you very much!
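To make the distinction raised in this thread concrete, the sketch below contrasts the two kinds of forecasts for an ARIMA model with regressors. Everything here is an assumption for illustration: the data are simulated, and a single 0/1 dummy stands in for dummies_train and dummies_test. The first approach gives one-step forecasts that use the test observations up to the previous period; the second forecasts the whole test period from the end of the training data.

```r
library(forecast)

set.seed(42)
n     <- 300
dummy <- rbinom(n, 1, 0.3)   # simulated 0/1 regressor
y     <- 5 + 2 * dummy + as.numeric(arima.sim(list(ar = 0.5), n = n))

traindata <- ts(y[1:200])
testdata  <- ts(y[201:300], start = 201)
dummies_train <- matrix(dummy[1:200],   ncol = 1, dimnames = list(NULL, "dummy"))
dummies_test  <- matrix(dummy[201:300], ncol = 1, dimnames = list(NULL, "dummy"))

# Model selected and estimated on the training data only
fit <- auto.arima(traindata, xreg = dummies_train)

# (1) One-step forecasts: same coefficients, applied to the test data
fit_test <- Arima(testdata, model = fit, xreg = dummies_test)
onestep  <- fitted(fit_test)

# (2) Multi-step forecasts: uses the training data only, horizons 1 to 100
multistep <- forecast(fit, xreg = dummies_test)$mean

rbind(
  onestep   = accuracy(onestep, testdata)[1, c("RMSE", "MAE")],
  multistep = accuracy(multistep, testdata)[1, c("RMSE", "MAE")]
)
```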

  • Eva Shah

    I am trying to build a very simple moving average stock price prediction model. If I am using the following code based on the forecast package to predict the closing price of a stock for the next 5 days, am I using this code correctly or do I need to do one-step forecasting?

    nn.fit=nnetar(stock.ts) # stock.ts, is the time series – closing price of stock from 2013-01-01 till the most recent closing price

    • If n=5, that will work. But nnetar() does not give moving averages. There isn’t actually a function for forecasting using moving averages in the forecast package, largely because it is rarely a good choice as a forecasting algorithm.

  • For stock prices, it will be hard to beat naive().

    • Muhammad Aamir

      Thank you very much, Rob, for such a nice forecast package. Sorry to ask a basic question. I am currently using the forecast package for one-step ARIMA forecasting using the approach described in the blog post above, but I received an error with the following code. Thanks in advance.
      Air1<-auto.arima(data[1:451]) # Training Data set
      fit1 <- Arima(data[452:551], model=Air1) # Testing data set
      onestep<- fitted(fit1) # Fitted Values
      Error in NextMethod(.Generic) : cannot assign 'tsp' to zero-length vector

      • I can’t replicate this. Please provide a reproducible example.

  • nj

    Hi Professor,

    Why do the one-step-ahead forecasts given by the fitted values from auto.arima seem to be off by one time unit compared with those calculated manually using the equation?

    I have my data, code, plot and results here:


    Thank you,

    • Your AR1 coefficient should be multiplied by the previous observation.

      • nj

        Thanks Professor, I am supposed to forecast next value by applying ARIMA equation on previous observation, and then, when the actual next value is generated by sensor, the residual is found by = actual next value – forecasted next value. The next value comes one at a time. I have used for loop for the same but results are not as expected.
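A small check of the point made in this thread, assuming a simulated series and a plain AR(1) model with a mean. The fitted value at time t is the mean plus the AR(1) coefficient times the deviation of the previous observation from the mean, so a manual calculation that uses the current observation instead will look shifted by one time unit. Note that the coefficient R labels "intercept" is the series mean, not the constant of the regression form.

```r
library(forecast)

set.seed(1)
y   <- ts(20 + arima.sim(list(ar = 0.8), n = 200))
fit <- Arima(y, order = c(1, 0, 0))

phi <- unname(coef(fit)["ar1"])
mu  <- unname(coef(fit)["intercept"])   # the estimated mean of the series

# Manual one-step forecasts of y[2], ..., y[n], each using the PREVIOUS observation
n      <- length(y)
manual <- mu + phi * (as.numeric(y)[-n] - mu)

# Compare with the fitted values, skipping the first one (it has no predecessor)
all.equal(as.numeric(fitted(fit))[-1], manual)

# When observations arrive one at a time, the residual for the new value is
# the new value minus the forecast made from the previous value
new_value <- 21.3                                 # hypothetical sensor reading
forecast_next <- mu + phi * (as.numeric(y)[n] - mu)
residual_next <- new_value - forecast_next
```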

  • Arsa Nikzad

    Dear prof. Hyndman,
    Thank you for all of your posts. I had a basic question regarding how to select the best model based on their performance on test data.
    If we fit two models on training data, should we compare the performance of those two exact models with the test data or we should compare the performance of the forecasts derived from those two models with the test data?
    Thank you.

  • Marcelo Klötzle

    Hi Rob,

    I am having a problem making in-sample and out-of-sample forecasts in R using the forecast package.

    Suppose I have 200 observations and pick 195 to estimate my model (training data) and the next 5 to test the accuracy of my model.

    I fit the following ARIMA model to the 195 observations, where carg is the wind power generation and chuv is the covariate (rain).

    fit <- Arima(carg, xreg=chuv, order=c(1,0,2))

    Now I input rain for the next 5 days

    chuvNext5 <- c(1,4,3,2,5)

    fcast <- forecast(fit, h=5, xreg=chuvNext5 )


    Now I have the forecasted values for carg for the next 5 days and the real observed values.

    Is there a way of testing the accuracy of my forecast? I know there is a function called accuracy(), but I do not know how to apply it in R using my example.

    Any help is welcome.

    • Daniel K

      (Not Hyndman here, but maybe I can help)


      I believe you just have to write:

      accuracy(fcast$mean, x)

      where x is a vector containing the last five observed values of the windpower generation (carg).

      Moreover, I don’t know if it is an idiosyncrasy of your model, but I believe you should not include chuvNext5 when forecasting. By doing so, you are assuming you would have known the rain for the next five days, which is not fair for pseudo-out-of-sample testing procedures.
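A runnable sketch of the accuracy check suggested in this thread. The data are simulated purely for illustration, and carg_all and chuv_all are hypothetical names for the full 200-observation series. Passing the whole forecast object to accuracy() reports both training-set and test-set measures. The sketch uses the observed rain for the last five days, which is subject to the caveat above about knowing future rain.

```r
library(forecast)

set.seed(7)
n        <- 200
chuv_all <- rgamma(n, shape = 2)   # simulated rain
carg_all <- 50 - 3 * chuv_all + as.numeric(arima.sim(list(ar = 0.6), n = n))

# First 195 observations for estimation, last 5 held out
carg      <- ts(carg_all[1:195])
chuv      <- cbind(chuv = chuv_all[1:195])
x         <- ts(carg_all[196:200], start = 196)   # observed values in the test period
chuvNext5 <- cbind(chuv = chuv_all[196:200])

fit   <- Arima(carg, xreg = chuv, order = c(1, 0, 2))
fcast <- forecast(fit, h = 5, xreg = chuvNext5)

# Rows: "Training set" and "Test set"
acc <- accuracy(fcast, x)
acc
```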

  • Raditya

    Hi Rob,

    I have a problem:

    I’m using:
    fit <- auto.arima(trainingdata)
    fit2 <- auto.arima(testingdata, model = fit)
    But, it appears: " Error in auto.arima(testingdata, model = fit) : No suitable ARIMA model found"
    Could you please help me solve this problem. Thank you

  • Daniel

    Professor Rob,

    I was studying out-of-sample one-step forecasts, but then I realized I had a doubt with in-sample forecasts. I arrive at the same fitted (one-step in-sample forecasts) when I (i) fit the model for the whole sample and then use fitted() and; (ii) when I split the sample into two, use the coefficients of the model estimated with the whole sample to forecast the “training set” adding one obs. each time in a loop. This is exactly what I expected.

    However, when I try a third “test” I arrive at an odd result (at least to me…). Now, I use the coefficients estimated with the whole sample to fit only the “test set”. As I used the same coefficients in the three cases, I expected to find the same fitted results here too, but that’s not the case… Let me show you the script.

    Thanks in advance!


    #Model fitted with the whole sample
    model <- auto.arima(gas)

    #Definitions for forecasting
    h <- 1 #prediction horizon
    start <- start(gas) #Series start
    train_end <- c(1990,12) #End of the training period
    series_end <- end(gas) #End of the series
    n <- length(gas-window(gas, end=train_end)) - h + 1 #Quantity of months tested
    a <- (c(1995,12)-series_end+c(0,1))[2] #Months till the end of the year
    #Creating the training dates for the loop
    test_date <- matrix(NA, nrow=12*(end(gas)[1]-train_end[1])+1, ncol=2)
    test_date[1,1:2] <- c(1990,12)
    test_date[2:nrow(test_date),1] <- rep((train_end[1]+1):end(gas)[1],each=12)
    test_date[2:nrow(test_date),2] <- rep(1:12, end(gas)[1]-train_end[1])

    # Use the fitted values of the estimated model from the whole sample
    model1 <- Arima(gas, model=model)

    fc1 <- window(fitted(model1), start=train_end+c(1,-11))

    # Use the estimated model in a loop, adding one observation each period

    #matrix that will store the forecasts
    fc2 <- ts(matrix(0, nrow=n, ncol=h), start=train_end+c(1,-11), end=series_end, freq=12)

    #Loop to calculate the forecasts
    for(i in 1:(nrow(test_date)-a))
    fc2[i] <- forecast(Arima(window(gas, end=test_date[i,], freq=12), model=model), h=h)$mean

    # Use the fitted values of the estimated model only in the testing period (from c(1995,12) on)
    model3 <- Arima(window(gas, start=train_end+c(1,-11)), model=model)

    fc3 <- fitted(model3)

    View(cbind(fc1, fc2,fc3))

    • Daniel K

      I figure it is because of the MA terms: the fitted values depend on how the unobserved past errors are initialised, and that initialisation differs depending on where the series starts… Right?

      • Yes, I think that explains it. If you set max.q=0 in the call to `auto.arima`, you will get the same values in each column.

        • Daniel K

          Thanks, professor Hyndman!
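The explanation in this last thread can be illustrated with a small experiment, sketched here on simulated data with hypothetical names (compare, fc_full, fc_test). For each model, the fitted values over the test period are computed twice with the same coefficients: once from the full series and once from the test series alone. With a pure AR(1) model the two should agree after the first test observation, while with an MA(1) model the difference caused by the starting point should fade out gradually.

```r
library(forecast)

set.seed(2024)
y_ar <- ts(arima.sim(list(ar = 0.7), n = 300))
y_ma <- ts(arima.sim(list(ma = 0.7), n = 300))

# Absolute difference between test-period fitted values obtained from the full
# series and from the test series alone, using identical coefficients
compare <- function(y, order, split = 200) {
  model   <- Arima(y, order = order)
  fc_full <- window(fitted(Arima(y, model = model)), start = split + 1)
  fc_test <- fitted(Arima(window(y, start = split + 1), model = model))
  abs(as.numeric(fc_full) - as.numeric(fc_test))
}

d_ar <- compare(y_ar, c(1, 0, 0))
d_ma <- compare(y_ma, c(0, 0, 1))

# First few test observations: differences for each model
head(round(cbind(AR1 = d_ar, MA1 = d_ma), 4), 8)
```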