Constants and ARIMA models in R

This post is from my new book Forecasting: principles and practice, available freely online at OTexts.com/fpp/.


A non-seasonal ARIMA model can be written as

\begin{equation*} (1-\phi_1B - \cdots - \phi_p B^p)(1-B)^d y_t = c + (1 + \theta_1 B + \cdots + \theta_q B^q)e_t \tag{1} \end{equation*}

or equivalently as

\begin{equation*} (1-\phi_1B - \cdots - \phi_p B^p)(1-B)^d (y_t - \mu t^d/d!) = (1 + \theta_1 B + \cdots + \theta_q B^q)e_t, \tag{2} \end{equation*}

where B is the backshift operator, c = \mu(1-\phi_1 - \cdots - \phi_p) and \mu is the mean of (1-B)^d y_t. R uses the parametrization of equation (2).

Thus, the inclusion of a constant in a non-stationary ARIMA model is equivalent to inducing a polynomial trend of order d in the forecast function. (If the constant is omitted, the forecast function includes a polynomial trend of order d-1.) When d=0, we have the special case that \mu is the mean of y_t.
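
As a worked special case of the equivalence between (1) and (2), take p=1, d=1 and q=0 (chosen here purely for illustration). Since (1-B)(\mu t) = \mu, equation (2) expands as

\[
\begin{aligned}
(1-\phi_1 B)(1-B)(y_t - \mu t) &= e_t \\
(1-\phi_1 B)\bigl[(y_t - y_{t-1}) - \mu\bigr] &= e_t \\
y_t - y_{t-1} &= \mu(1-\phi_1) + \phi_1 (y_{t-1} - y_{t-2}) + e_t,
\end{aligned}
\]

which is equation (1) with c = \mu(1-\phi_1). So the constant c in the difference-equation form is not the same number as \mu, the mean of the differenced series, unless \phi_1 = 0.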

Including constants in ARIMA models using R

arima()

By default, the arima() command in R sets c=\mu=0 when d>0 and provides an estimate of \mu when d=0. The parameter \mu is called the “intercept” in the R output. It will be close to the sample mean of the time series, but usually not identical to it, as the sample mean is not the maximum likelihood estimate when p+q>0.

The arima() command has an argument include.mean which only has an effect when d=0 and is TRUE by default. Setting include.mean=FALSE will force \mu=0.
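
The behaviour described above can be checked with a small sketch in base R. The simulated series and the object names (z, x, fit_mean and so on) are made up for illustration; only arima() and its include.mean argument come from the text.

```r
# Simulated data (illustrative only): z is a zero-mean AR(1), x has mean 10.
set.seed(1)
z <- arima.sim(model = list(ar = 0.5), n = 200)
x <- 10 + z

# d = 0: the mean is estimated and labelled "intercept".
fit_mean <- arima(x, order = c(1, 0, 0))
names(coef(fit_mean))      # "ar1" "intercept"

# include.mean = FALSE forces mu = 0; sensible here because z has zero mean.
fit_nomean <- arima(z, order = c(1, 0, 0), include.mean = FALSE)
names(coef(fit_nomean))    # "ar1"

# d = 1: include.mean has no effect, so no constant is estimated.
fit_diff <- arima(x, order = c(1, 1, 0), include.mean = TRUE)
names(coef(fit_diff))      # "ar1"

# The intercept is close to, but usually not equal to, the sample mean.
c(intercept = coef(fit_mean)[["intercept"]], sample_mean = mean(x))
```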

Arima()

The Arima() command from the forecast package provides more flexibility on the inclusion of a constant. It has an argument include.mean which has identical functionality to the corresponding argument for arima(). It also has an argument include.drift which allows \mu\ne0 when d=1. For d>1, no constant is allowed as a quadratic or higher order trend is particularly dangerous when forecasting. The parameter \mu is called the “drift” in the R output when d=1.

There is also an argument include.constant which, if TRUE, will set include.mean=TRUE if d=0 and include.drift=TRUE when d=1. If include.constant=FALSE, both include.mean and include.drift will be set to FALSE. If include.constant is used, any values supplied for include.mean and include.drift are ignored.

When d=0 and include.drift=TRUE, the fitted model from Arima() is

    \[(1-\phi_1B - \cdots - \phi_p B^p) (y_t - a - bt) = (1 + \theta_1 B + \cdots + \theta_q B^q)e_t.\]

In this case, the R output will label a as the “intercept” and b as the “drift” coefficient.
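
A minimal sketch of both uses of include.drift, assuming the forecast package is installed. The two simulated series and all object names are hypothetical and chosen only to make the labels in the output easy to see.

```r
library(forecast)

set.seed(42)

# Illustrative data: a random walk with drift 0.5.
y_rw <- ts(cumsum(0.5 + rnorm(200)))

# d = 1 with include.drift = TRUE: mu is reported as "drift".
fit_d1 <- Arima(y_rw, order = c(0, 1, 0), include.drift = TRUE)
coef(fit_d1)
mean(diff(y_rw))   # for this simple model the drift estimate is close to this

# Illustrative data: linear trend 2 + 0.3 t with stationary AR(1) errors.
y_trend <- ts(2 + 0.3 * (1:200) + arima.sim(model = list(ar = 0.5), n = 200))

# d = 0 with include.drift = TRUE: a is reported as "intercept", b as "drift".
fit_d0 <- Arima(y_trend, order = c(1, 0, 0), include.drift = TRUE)
coef(fit_d0)
```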

auto.arima()

The auto.arima() function automates the inclusion of a constant. By default, for d=0 or d=1, a constant will be included if it improves the AIC value; for d>1 the constant is always omitted. If allowdrift=FALSE is specified, then the constant is only allowed when d=0.
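
A short sketch of the allowdrift switch, again assuming the forecast package is installed. The series is simulated for illustration, and which model auto.arima() selects will depend on the data and the package version, so no particular model is claimed here.

```r
library(forecast)

# Illustrative data: a random walk with drift 0.5.
set.seed(42)
y <- ts(cumsum(0.5 + rnorm(200)))

# Default behaviour: a constant is kept only if it improves the
# information criterion used for model selection.
fit_auto <- auto.arima(y)
fit_auto

# allowdrift = FALSE: no drift term can appear in the selected model.
fit_nodrift <- auto.arima(y, allowdrift = FALSE)
"drift" %in% names(coef(fit_nodrift))   # FALSE
```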

Eventual forecast functions

The eventual forecast function (EFF) is the limit of \hat{y}_{t+h|t} as a function of the forecast horizon h as h\rightarrow\infty.

The constant c has an important effect on the long-term forecasts obtained from these models.

  • If c=0 and d=0, the EFF will go to zero.
  • If c=0 and d=1, the EFF will go to a non-zero constant determined by the last few observations.
  • If c=0 and d=2, the EFF will follow a straight line with intercept and slope determined by the last few observations.
  • If c\ne0 and d=0, the EFF will go to the mean of the data.
  • If c\ne0 and d=1, the EFF will follow a straight line with slope equal to the mean of the differenced data.
  • If c\ne0 and d=2, the EFF will follow a quadratic trend.
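
Three of the cases listed above can be checked numerically with base R alone. This is a sketch on simulated data with made-up object names. For the drift case it assumes a time index passed through xreg as a stand-in for the drift term, which matches the parametrization of equation (2) when d=1.

```r
set.seed(123)
n <- 200   # length of the simulated series
h <- 100   # forecast horizon

# c != 0, d = 0: long-horizon forecasts converge to the estimated mean.
x <- 10 + arima.sim(model = list(ar = 0.5), n = n)
fit0 <- arima(x, order = c(1, 0, 0))
f0 <- predict(fit0, n.ahead = h)$pred
c(last_forecast = f0[h], mean = coef(fit0)[["intercept"]])

# c = 0, d = 1: forecasts sit at a constant set by the most recent data.
y <- cumsum(0.5 + rnorm(n))
fit1 <- arima(y, order = c(0, 1, 0))
f1 <- predict(fit1, n.ahead = h)$pred
range(f1 - y[n])   # effectively zero: each forecast equals the last observation

# c != 0, d = 1: with a time index as regressor, forecasts follow a
# straight line whose slope is the estimated drift.
t_in  <- seq_len(n)
t_out <- n + seq_len(h)
fit1d <- arima(y, order = c(0, 1, 0), xreg = t_in)
f1d <- predict(fit1d, n.ahead = h, newxreg = t_out)$pred
range(diff(f1d) - coef(fit1d)[[1]])   # effectively zero
```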

Seasonal ARIMA models

If a seasonal model is used, all of the above will hold with d replaced by d+D where D is the order of seasonal differencing and d is the order of non-seasonal differencing.




  • syazreen

    Hi Prof,
    The auto.arima function is great; the best model can be fitted automatically by having the smallest error.
    However, my model requires a random walk with drift.
    How can I use arima() to include a drift term?
    arima(x,c(0,1,0)) does not give the model I want, which is a random walk with drift, where the drift term is the mean.

    • http://robjhyndman.com/ Rob J Hyndman

      auto.arima() returns the fitted model.

      If you need to fit it again, use
      Arima(x, order=c(0,1,0), include.drift=TRUE)

      • syazreen

        Thanks so much!
        It works exactly as I want.

  • Brian

    Is it possible the behavior of Arima has changed?

    install.packages("fma")
    library(fma)

    mod2.

    Thanks!

    • http://robjhyndman.com/ Rob J Hyndman

      Well spotted. The drift term is actually redundant and non-identifiable; note the size of the standard error on the drift coefficient. So it is not actually fitting a cubic drift. I’ll fix the function so it doesn’t return the parameter.

  • sunsetter

    Can auto.arima() be set to never allow a non-zero mean, even when d=0?

    • http://robjhyndman.com/ Rob J Hyndman

      No. You would have to do your own modification of it if you wanted to do that.

  • Débora Spenassato

    Hi Prof. Hyndman,
    My model using the function auto.arima() is ARIMA(1,1,0) with drift (AR = 0.208 and drift = 2.531). I’m having difficulty forming the equation. Would it be deltaYt = 2.531 + 0.208 deltaYt-1 + et?

    Thanks,