Constants and ARIMA models in R

This post is from my new book Forecasting: principles and practice, available freely online at OTexts.com/fpp/.


A non-seasonal ARIMA model can be written as
\begin{equation}\label{eq:c}
(1-\phi_1B - \cdots - \phi_p B^p)(1-B)^d y_t = c + (1 + \theta_1 B + \cdots + \theta_q B^q)e_t
\end{equation}
or equivalently as
\begin{equation}\label{eq:mu}
(1-\phi_1B - \cdots - \phi_p B^p)(1-B)^d (y_t - \mu t^d/d!) = (1 + \theta_1 B + \cdots + \theta_q B^q)e_t,
\end{equation}
where $B$ is the backshift operator, $c = \mu(1-\phi_1 - \cdots - \phi_p )$ and $\mu$ is the mean of $(1-B)^d y_t$. R uses the parametrization of the second equation.
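
As a small worked illustration of how the two forms relate (the numbers are made up for the example, not taken from any data set), take $p=1$, $d=1$, $q=0$ with $\phi_1 = 0.5$ and $\mu = 2$. Then $c = \mu(1-\phi_1) = 1$ and $\mu t^d/d! = 2t$, so the two equations above become:

```latex
\begin{align*}
(1 - 0.5B)(1-B)\, y_t &= 1 + e_t, \\
(1 - 0.5B)(1-B)\,(y_t - 2t) &= e_t.
\end{align*}
```

To check the equivalence, note that $(1-B)(2t) = 2$ and $(1-0.5B)\,2 = 1$, so moving the trend term to the right-hand side of the second line recovers the first.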

Thus, the inclusion of a constant in a non-stationary ARIMA model is equivalent to inducing a polynomial trend of order $d$ in the forecast function. (If the constant is omitted, the forecast function includes a polynomial trend of order $d-1$.) When $d=0$, we have the special case that $\mu$ is the mean of $y_t$.
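
The simplest case makes this concrete. The following sketch assumes $p=q=0$ and $d=1$ (a random walk, so $c=\mu$), and iterates the model forward with future errors set to their mean of zero:

```latex
\begin{align*}
y_t &= y_{t-1} + c + e_t, \\
\hat{y}_{t+h|t} &= y_t + c\,h .
\end{align*}
```

With $c\ne0$ the forecast function is a straight line in $h$ (order $d=1$); with $c=0$ it is the constant $y_t$ (order $d-1=0$).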

Including constants in ARIMA models using R

arima()

By default, the arima() command in R sets $c=\mu=0$ when $d>0$ and provides an estimate of $\mu$ when $d=0$. The parameter $\mu$ is called the “intercept” in the R output. It will be close to the sample mean of the time series, but usually not identical to it as the sample mean is not the maximum likelihood estimate when $p+q>0$.

The arima() command has an argument include.mean which only has an effect when $d=0$ and is TRUE by default. Setting include.mean=FALSE will force $\mu=0$.
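
A minimal sketch of this behaviour, using only base R. The series `y` is simulated purely for illustration (an AR(1) process shifted to have mean 1), and the object names are hypothetical:

```r
# Base R only: arima() and arima.sim() are in the stats package.
set.seed(123)
y <- arima.sim(model = list(ar = 0.6), n = 200) + 1  # illustrative data

fit_mean   <- arima(y, order = c(1, 0, 0))                        # d = 0: mu is estimated
fit_nomean <- arima(y, order = c(1, 0, 0), include.mean = FALSE)  # d = 0: mu forced to 0
fit_diff   <- arima(y, order = c(1, 1, 0))                        # d = 1: no constant at all

names(coef(fit_mean))    # "ar1" "intercept"
names(coef(fit_nomean))  # "ar1"
names(coef(fit_diff))    # "ar1"

# The intercept is close to, but usually not identical to, the sample mean.
c(intercept = coef(fit_mean)[["intercept"]], sample_mean = mean(y))
```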

Arima()

The Arima() command from the forecast package provides more flexibility on the inclusion of a constant. It has an argument include.mean which has identical functionality to the corresponding argument for arima(). It also has an argument include.drift which allows $\mu\ne0$ when $d=1$. For $d>1$, no constant is allowed as a quadratic or higher order trend is particularly dangerous when forecasting. The parameter $\mu$ is called the “drift” in the R output when $d=1$.

There is also an argument include.constant which, if TRUE, will set include.mean=TRUE if $d=0$ and include.drift=TRUE when $d=1$. If include.constant=FALSE, both include.mean and include.drift will be set to FALSE. If include.constant is specified, any values supplied for include.mean and include.drift are ignored.
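
The following sketch shows these arguments on a simulated random walk with drift. The data and object names are illustrative assumptions; only Arima() and its include.drift and include.constant arguments come from the text above.

```r
library(forecast)  # provides Arima()

set.seed(42)
# Simulated random walk with drift 0.5 (illustrative data only)
y <- ts(cumsum(rnorm(200, mean = 0.5)))

fit_drift   <- Arima(y, order = c(0, 1, 0), include.drift = TRUE)
fit_const   <- Arima(y, order = c(0, 1, 0), include.constant = TRUE)
fit_nodrift <- Arima(y, order = c(0, 1, 0))  # default: no constant when d = 1

names(coef(fit_drift))     # "drift"
coef(fit_const)            # same model as fit_drift, since d = 1
length(coef(fit_nodrift))  # 0: no estimated coefficients

# The drift estimate is essentially the mean of the differenced series.
c(drift = coef(fit_drift)[["drift"]], mean_diff = mean(diff(y)))
```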

When $d=0$ and include.drift=TRUE, the fitted model from Arima() is
$$
(1-\phi_1B - \cdots - \phi_p B^p) (y_t - a - bt) = (1 + \theta_1 B + \cdots + \theta_q B^q)e_t.
$$
In this case, the R output will label $a$ as the “intercept” and $b$ as the “drift” coefficient.
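
As a quick check of the labelling, here is a sketch that fits this model to a simulated series with a linear trend and AR(1) errors. The values $a=10$ and $b=0.3$ are arbitrary choices for the illustration.

```r
library(forecast)  # provides Arima()

set.seed(1)
n <- 200
# Linear trend a + b*t with a = 10, b = 0.3, plus AR(1) errors (illustrative only)
y <- ts(10 + 0.3 * seq_len(n) + arima.sim(model = list(ar = 0.5), n = n))

fit <- Arima(y, order = c(1, 0, 0), include.drift = TRUE)

names(coef(fit))  # "ar1" "intercept" "drift"
coef(fit)         # intercept estimates a, drift estimates b
```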

auto.arima()

The auto.arima() function automates the inclusion of a constant. By default, for $d=0$ or $d=1$, a constant will be included if it improves the AIC value; for $d>1$ the constant is always omitted. If allowdrift=FALSE is specified, then the constant is only allowed when $d=0$.
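
A short sketch of the allowdrift argument, again on simulated data (an assumption made for illustration). Which orders auto.arima() selects depends on the data, so the only property relied on here is that no drift coefficient can appear when allowdrift=FALSE.

```r
library(forecast)  # provides auto.arima()

set.seed(7)
# Simulated random walk with drift 0.5 (illustrative data only)
y <- ts(cumsum(rnorm(200, mean = 0.5)))

fit_auto    <- auto.arima(y)                      # constant included if it helps
fit_nodrift <- auto.arima(y, allowdrift = FALSE)  # constant only allowed when d = 0

names(coef(fit_auto))                  # may or may not contain "drift"
"drift" %in% names(coef(fit_nodrift))  # FALSE
```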

Eventual forecast functions

The eventual forecast function (EFF) is the limit of $\hat{y}_{t+h|t}$ as a function of the forecast horizon $h$ as $h\rightarrow\infty$.

The constant $c$ has an important effect on the long-term forecasts obtained from these models.

  • If $c=0$ and $d=0$, the EFF will go to zero.
  • If $c=0$ and $d=1$, the EFF will go to a non-zero constant determined by the last few observations.
  • If $c=0$ and $d=2$, the EFF will follow a straight line with intercept and slope determined by the last few observations.
  • If $c\ne0$ and $d=0$, the EFF will go to the mean of the data.
  • If $c\ne0$ and $d=1$, the EFF will follow a straight line with slope equal to the mean of the differenced data.
  • If $c\ne0$ and $d=2$, the EFF will follow a quadratic trend.
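
Three of the cases listed above can be checked numerically. This is a sketch on simulated series (all data and object names are illustrative assumptions); the first two cases use base R's arima() and predict(), and the drift case uses Arima() and forecast() from the forecast package.

```r
library(forecast)  # only needed for the drift case (Arima(), forecast())

set.seed(2024)

# c != 0, d = 0: long-horizon forecasts approach the estimated mean.
x    <- arima.sim(model = list(ar = 0.6), n = 300) + 5
fit0 <- arima(x, order = c(1, 0, 0))
eff0 <- predict(fit0, n.ahead = 200)$pred
c(forecast_h200 = eff0[200], intercept = coef(fit0)[["intercept"]])

# c = 0, d = 1: forecasts stay at a constant set by the last observation.
y    <- ts(cumsum(rnorm(300)))
fit1 <- arima(y, order = c(0, 1, 0))
eff1 <- predict(fit1, n.ahead = 5)$pred
c(last_obs = y[300], forecast_h5 = eff1[5])

# c != 0, d = 1: forecasts increase by the drift estimate at each step.
z    <- ts(cumsum(rnorm(300, mean = 0.5)))
fit2 <- Arima(z, order = c(0, 1, 0), include.drift = TRUE)
eff2 <- forecast(fit2, h = 10)$mean
c(step = diff(eff2)[1], drift = coef(fit2)[["drift"]])
```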

Seasonal ARIMA models

If a seasonal model is used, all of the above will hold with $d$ replaced by $d+D$ where $D$ is the order of seasonal differencing and $d$ is the order of non-seasonal differencing.


  • syazreen

    Hi Prof,
    The auto.arima function is great; the best model is fitted automatically by choosing the smallest error.
    However, my model needs to be a random walk with drift.
    How can I use arima() to include a drift term?
    arima(x,c(0,1,0)) does not give the model I want, that is, a random walk where the drift term is the mean.

    • auto.arima() returns the fitted model.

      If you need to fit it again, use
      Arima(x, order=c(0,1,0), include.drift=TRUE)

      • syazreen

        Thanks so much!
        It works exactly as I want.

  • Brian

    Is it possible the behavior of Arima has changed?

    install.packages("fma")
    library(fma)

    mod2.

    Thanks!

    • Well spotted. The drift term is actually redundant and non-identifiable — note the size of the standard error on the drift coefficient. So it is not actually fitting a cubic drift. I’ll fix the function so it doesn’t return the parameter.

  • sunsetter

    Can auto.arima() be set to never allow non-zero mean, even when d=0?

    • No. You would have to do your own modification of it if you wanted to do that.

  • Débora Spenassato

    Hi prof. Hyndman,
    my model using the function auto.arima() is ARIMA(1,1,0) with drift (AR = 0.208 and drift = 2.531). I’m having difficulty writing out the equation. Would it be deltaYt = 2.531 + 0.208 deltaYt-1 + et?

    Thanks,

  • Shraddha Panda

    Hello Professor, could you please have a look at this question here and share your inputs? I am trying to fit auto.arima for longitudinal data by grouping different regions. http://stackoverflow.com/questions/25036986/auto-arima-using-xreg-and-forecasting-several-ts-together

  • Joshua Makubu

    Hi Prof
    The feedback I get when I try to model with drift is that Arima() is not a function in R. Please educate me.

    • load the forecast package first.

      • Joshua Makubu

        Prof, good evening. I am using R version 3.1.2 but still can't get the function auto.arima or the function arima after installing the forecast package.
        Is there anything I am not doing right?

        • Mathijs

          You’ve likely not loaded the package, load the package by entering library(forecast). After that, loading the help files by entering ?auto.arima (use R Studio!) and the examples in Prof. Hyndman’s book will help you get there.

  • JJJJ

    Does the same apply when i include dummies in my sarima model? My model seems to get a trend when i include dummies (and differencing).

  • JJJJJ

    Does the same apply when i include dummy variables? My model seems to get a trend when i include dummy variables (together with my differencing).

    • The dummy variables should be differenced before use. Otherwise they will induce a trend.

      • Alex Mustermann

        thanks!

  • Alex Mustermann

    Does the same apply when i include dummy variables? I seem to get a trend when i include dummies together with my differencing.

  • Ashwini Srinivas

    Hi Rob,
    I have used auto.arima and the best model is (1,1,1). There is no mention of a zero or non-zero mean, and my forecast follows a straight line. Could you please throw some light on this?

    • It sounds like you have a drift term. In this case, the slope is the constant.

      • Ashwini Srinivas

        Thanks a lot. I am really sorry to ask this, but what exactly is the drift term? I am unable to find much on the net regarding this.

  • Erdal

    hello Mr. Hyndman,

    can auto.arima be set to never allow zero mean for case d=0?
    I want it to give only models with an intercept. Adding allowmean=FALSE gives only zero-mean models, but allowmean=TRUE gives both zero and nonzero mean models. “allowmean: If TRUE, models with a non-zero mean are considered.” This expression does not work.
    How can I run the function with only nonzero mean models, or fix the intercept for all models run by the function?

    • No. But you can specify your own model using Arima() with include.mean=TRUE

      • Erdal

        Thank you Mr. Hyndman.
        It resulted in models with intercept. But all models are ARIMA(0,0,0). Is it possible?
        Data is in form of first difference.
        code is:
        Arima(ts(y, freq=143), include.mean=TRUE, xreg=as.data.frame.list(v19[i,][ !v19[i,] %in% Remove]))

      • Erdal

        Hi again Mr. Hyndman.

        I want to get the models with intercept and automatic ARIMA(p,d,q) values.
        This is only possible with auto.arima function.
        But allowmean=T does not work.
        Is it a problem or a bug?
        It is said in Package ‘forecast’ (April 14, 2016) that when allowmean=T models with a non-zero mean are considered.

        Can it be fixed?

        This is very important for me as I am writing my master’s thesis. I can send you the details of the code I use in the study.
        Thank you very much.

        • There is no bug. allowmean allows a mean as the help file explains. It doesn’t force a non-zero mean. If you want to force a specific model, use Arima(). Don’t send me code — I do not provide a help service to the whole world.

          • Erdal

            Thank you Mr. Hyndman.
            I thought that it will also be helpful for the other researchers to run the auto.arima() with only nonzero mean models/force a non-zero mean.

            Allowmean is a recent addition to the function (Version 6.0 (9 May 2015)).
            It works very well for only zero mean models. It also gives results with both zero and non zero models.

            It can be helpful to set it for only non-zero mean models (force non-zero mean).

            I do not know whether this is possible or not/logical or not?

            Thanks a lot.

            (I also wrote it in http://stackoverflow.com/questions/37422790/how-to-run-auto-arima-for-only-nonzero-mean-models
            but I couldn’t get an answer.)

  • Anne

    Hi Mr. Hyndman,
    I defined a model using your nice auto.arima function and I include dummy variables for day of the week in xreg. When I use only 6 out of 7 week-dummy variables (to have a full rank matrix) the model works well. However, I would like to include all 7 days of the week as dummy variables. Hence, I need to exclude the constant/intercept in the model if I understand correctly. When I use Arima with include.constant=F and using several different orders of p, d and q I get the error: Error in optim(init[mask], armaCSS, method = optim.method, hessian = FALSE, :non-finite value supplied by optim. What could be causing this error? I would like to understand why the model does work with only 6 out of 7 week-dummies and does not work when I include all 7 week-dummies and exclude the intercept.

    • You would need to provide a minimal reproducible example. The following code works.

      z <- ts(rnorm(200), freq=7)
      xreg <- seasonaldummy(z)
      xreg <- cbind(xreg, 1-rowSums(xreg))
      Arima(z, xreg=xreg, order=c(1,0,1), include.constant=FALSE)

  • Shanu Agrawal

    I am running ARIMA model with external regressors, but getting error: “NO Suitable ARIMA model found”.
    What might be the reasons?