The ARIMAX model muddle

There is often confusion about how to include covariates in ARIMA models, and the presentation of the subject in various textbooks and in R help files has not helped the confusion. So I thought I’d give my take on the issue. To keep it simple, I will only describe non-seasonal ARIMA models although the ideas are easily extended to include seasonal terms. I will include only one covariate in the models although it is easy to extend the results to multiple covariates. And, to start with, I will assume the data are stationary, so we only consider ARMA models.

Let the time series be denoted by $y_1,\dots,y_n$. First, we will define an ARMA$(p,q)$ model with no covariates:
y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} - \theta_1 z_{t-1} - \cdots - \theta_q z_{t-q} + z_t,
where $z_t$ is a white noise process (i.e., zero mean and iid).

ARMAX models

An ARMAX model simply adds in the covariate on the right hand side:
y_t = \beta x_t + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} - \theta_1 z_{t-1} - \cdots - \theta_q z_{t-q} + z_t
where $x_t$ is a covariate at time $t$ and $\beta$ is its coefficient. While this looks straightforward, one disadvantage is that the covariate coefficient is hard to interpret. The value of $\beta$ is not the effect on $y_t$ when $x_t$ is increased by one (as it is in regression). The presence of lagged values of the response variable on the right hand side of the equation means that $\beta$ can only be interpreted conditional on previous values of the response variable, which is hardly intuitive.

If we write the model using backshift operators, the ARMAX model is given by
\phi(B)y_t = \beta x_t + \theta(B)z_t
y_t = \frac{\beta}{\phi(B)}x_t + \frac{\theta(B)}{\phi(B)}z_t,
where $\phi(B)=1-\phi_1B-\cdots-\phi_pB^p$ and $\theta(B)=1-\theta_1B-\cdots-\theta_qB^q$. Notice how the AR coefficients get mixed up with both the covariates and the error term.
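By way of illustration (a sketch, not a recommendation), an ARMAX model with a single AR lag and no MA terms can be fitted directly in R with dynlm(), assuming y and x are ts objects:

library(dynlm)
fit <- dynlm(y ~ L(y, 1) + x)

Here L(y, 1) is the first lag of y, and the coefficient on x is the hard-to-interpret $\beta$ discussed above.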

Regression with ARMA errors

For this reason, I prefer to use regression models with ARMA errors, defined as follows.
y_t &= \beta x_t + n_t\\
n_t &= \phi_1 n_{t-1} + \cdots + \phi_p n_{t-p} - \theta_1 z_{t-1} - \cdots - \theta_q z_{t-q} + z_t
In this case, the regression coefficient has its usual interpretation. There is not much to choose between the models in terms of forecasting ability, but the additional ease of interpretation in the second one makes it attractive.

Using backshift operators, this model can be written as
y_t = \beta x_t + \frac{\theta(B)}{\phi(B)}z_t.
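In R, this model can be fitted with the Arima() function from the forecast package. A minimal sketch, where y and x are placeholder series:

library(forecast)
fit <- Arima(y, xreg=x, order=c(1,0,1))

This fits a regression with ARMA(1,1) errors; the output includes the regression coefficient $\beta$ alongside the ARMA parameters, all estimated simultaneously.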

Transfer function models

Both of these models can be considered as special cases of transfer function models, popularized by Box and Jenkins:
y_t = \frac{\beta(B)}{v(B)} x_t + \frac{\theta(B)}{\phi(B)}z_t.
This allows for lagged effects of covariates (via the $\beta(B)$ operator) and for decaying effects of covariates (via the $v(B)$ operator).

Sometimes these are called “dynamic regression models”, although different books use that term for different models.

The method for selecting the orders of a transfer function model that is described in Box and Jenkins is cumbersome and difficult, but continues to be described in textbooks. A much better procedure is given in Pankratz (1991), and repeated in my 1998 forecasting textbook.

Non-stationary data

For ARIMA errors, we simply replace $\phi(B)$ with $\nabla^d\phi(B)$ where $\nabla=(1-B)$ denotes the differencing operator. Notice that this is equivalent to differencing both $y_t$ and $x_t$ before fitting the model with ARMA errors. In fact, it is necessary to difference all variables first as estimation of a model with non-stationary errors is not consistent and can lead to “spurious regression”.
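A sketch of doing the differencing by hand before fitting, again with placeholder series y and x:

library(forecast)
fit <- Arima(diff(y), xreg=diff(x), order=c(1,0,1))

so that all variables entering the estimation are stationary.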

R functions

The arima() function in R (and Arima() and auto.arima() from the forecast package) fits a regression with ARIMA errors. Note that R reverses the signs of the moving average coefficients compared to the standard parameterization given above.
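The sign reversal is easy to check with a simulated series, since arima.sim() and arima() share R’s $1+\theta_1B+\cdots$ convention for the MA polynomial:

set.seed(123)
z <- arima.sim(model=list(ma=0.7), n=1000)
arima(z, order=c(0,0,1))

The fitted ma1 coefficient will be close to +0.7, whereas under the parameterization used above it would be reported as −0.7.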

The arimax() function from the TSA package fits the transfer function model (but not the ARIMAX model). This is a new package and I have not yet used it, but it is nice to finally be able to fit transfer function models in R. Sometime I plan to write a function to allow automated order selection for transfer functions, as I have done with auto.arima() for regression with ARMA errors (part of the forecast package).
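For reference, a hedged sketch of an arimax() call (which I have not verified against the package), where xtransf supplies the input series and transfer gives the orders of its transfer function:

library(TSA)
fit <- arimax(y, order=c(1,0,1), xtransf=x, transfer=list(c(1,0)))

Here transfer=list(c(1,0)) requests a first-order denominator (a decaying effect) and a zero-order numerator.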

Comments

  • Hi Rob,

    I just started out in time series and R, and recently I have tried to fit a univariate time series. The basic process was that I converted my data to a ts object and fit a model using auto.arima to a weekly dataset. A plot of my data shows my data are most likely non-stationary, but based on the post above, and if I understand this correctly, I need to difference my dataset first? When looking at the help, it looks like the default option is to feed auto.arima a non-stationary process. I am new to time series and may be biting off more than I can chew in some cases, but I just want to make sure I learn as much as I can. Thanks in advance.

    • No, you do not need to difference the data first. auto.arima() will determine the order of differencing required using a unit-root test.

      • Dietmar

        Hi Rob,

        how can I reconcile the statement in the article “The ARIMAX model muddle”

        “Consequently, you should do your own differencing when required.”

        with the statement

        “No, you do not need to difference the data first. auto.arima() will
        determine the order of differencing required using a unit-root test.”

        Thanks a lot!!


        • Brock was trying to fit a univariate ARIMA model with no regressors. In that case, auto.arima() handles the differencing without a problem. In the article, I was talking about regression with ARIMA errors. In that situation (where there is at least one regressor), differencing should be done by hand if required, because the underlying `arima()` code (not my code) does not do it correctly.

          • liaoming zhou

            May I ask a follow-up?
            For example: I fit an ARIMA(1,1,1)(0,1,1)[7] model with xreg=Xr. How do I difference Xr by hand?

          • This issue has been fixed. There is no need to do any differencing by hand. Just use xreg=xr and everything should work correctly.

  • Perfect thanks Rob!

  • saugat gurung

    dear Rob,
    do i need to do log transformation (if data requires) to use auto.arima() function. as the function does the differencing itself. what if my data requires log transformation,where should i apply auto.arima in the original data or the log transformed data.

    • In v3 of the forecast package, you can do the log transformation inside the auto.arima() function:

      fit <- auto.arima(x,lambda=0)

      The lambda indicates a Box-Cox transformation and lambda=0 corresponds to a logarithm. The forecast function will recognize that a log has been applied and back-transform the forecasts appropriately.
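      A sketch of the full round trip:

      fit <- auto.arima(x, lambda=0)
      fc <- forecast(fit, h=12)

      The point forecasts in fc are on the original scale of x.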

      • saugat gurung

        thank you very much, i find it very helpful.

  • Hervé

    dear Rob, I have some problem with the use of ARIMA with covariate in R. For example how can I estimate in R an ARIMA(2,01) included a covariate with a time lag of 20s for example. Thanks

    • Let x be the lagged covariate. Then
      fit <- Arima(y, xreg=x, order=c(2,0,1))
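      To forecast from this model you must supply the future values of the covariate, e.g.,

      fc <- forecast(fit, xreg=xfuture)

      where xfuture is a vector of assumed future values of x.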

  • Wolf3616

    I find this article very helpful! thank you!

  • Wolf3616

    Dear Rob,

    I generated a synthetic series using an ARX model, using the arima.sim() function and adding a constant to it (i.e. Yt = C + arima.sim()) to give an expectation/mean of the ‘observation’ Yt of ~1000. Later I ‘estimated’ the parameters using the arima() function on the generated series. The estimates of the polynomials and mean turn out close to the ‘true’ parameters, with small s.e. values. Based on your article, I believe that arima() fits a regression with ARIMA errors. Is my assumption true?

    My other question is: I used random variables generated from a ‘GEV distribution’ for the ‘innovations’ (in the arima.sim() function) to generate the ‘observation’ Yt, in the hope that a similar distribution type (i.e. GEV) would be generated for Yt. As far as I have simulated, the ‘observations’ tend to be skewed and non-normal. Am I wrong to predict that?

    • 1. Yes, arima() fits a regression with ARIMA errors.
      2. GEV errors do not lead to a GEV marginal distribution. I don’t know what error distribution would lead to a GEV marginal distribution.

  • Wolf3616

    Thank you for your answer! Really appreciate it!

  • Mohamed

    Dear Rob,

    I have two variables, one response variable and the other a covariate. I would like to use arimax in R, and I’m confused whether I should put the covariate in xreg or in xtransf. And if I would like to estimate the relation between them at different lags, how can I apply this in R? Thanks for the help in advance.

  • Shea Parkes

    Might as well dig up an old thread. This is very solid information, I’m currently exploring transfer functions and their use for interventions.

    I’m unable to reproduce the differencing error in the base arima() function of R-stats now (when using an xreg). Is this still a problem or was it eventually fixed?

    If it was fixed, you might want to toss a note into the post since google pageranks this post rather high for “ARIMAX” key words.

    • The bug still exists. It doesn’t return an error. The problem is the method of estimation used is inconsistent. Differencing should be done first so all variables are stationary before estimation.

      • Alex

        I’m wondering if the Shea comment is about the problem with regression including arima errors (and the bug in the arima function).

        I’m using auto.arima to fit a model with one regressor; both dependent and independent variables are non-stationary, and I don’t get the same coefficient from the two command lines above:

        the first model returns an ARIMA(2,1,0)(1,0,0);
        the second model returns an ARMA(1,0,1)(1,0,0);
        the coefficients of the regressor are very different, even if I force an ARMA(2,0,0)(1,0,0) in the second model.

        So I just want to know if the bug in the arima function is still present, or if I have a problem in my code (or my understanding of the model).

        From a model selection point of view, if the bug still exists, I suppose the right methodology is to fit the first model, apply the differencing of the ARIMA process to the data, and refit the model. Am I right?

        • Please send me a minimalist example by email and I’ll investigate.

  • Vinayraj

    Hi Mr Hyndman, I have used the auto.arima function in the forecast package with causal variables and found that for non-stationary data, the parameter coefficient estimates are not correct. As per your suggestion, I differenced the time series before feeding it to auto.arima; however, how do I use the above ARIMA model and the differenced series when generating the forecast?

    • I don’t know what you mean by “not correct”, as auto.arima does fit the model correctly. It is arima() that is incorrect.

  • Marcel

    Dear Rob,

    thank you for writing this article! I was wondering if you have come across a package/method for fitting ARIMAX models for ensemble traces, e.g.

    Over a period of 30 days I record 3 measures A,B,C every minute between 12am to 3pm. I assume that A(t) = alpha_1 A(t-1) + … + alpha_n A(t-n) + beta_1 B(t-1) + … + gamma_1 C(t-1) + … + Z(t), while B and C do not depend on A. I want to estimate the alphas, betas and gammas such that they minimise the sum of the prediction errors over all days.

    Generally I have found quite a few tools/libraries for general ARIMAX fitting, but there seems to be little information about how to do this when I have multiple non-consecutive time series of the same process.

  • Canadian Tom

    Hi Rob,

    Thanks for this. This was one of the clearest explanations I could find on the web. If you don’t mind, I have a question to do with non-stationary data. It was my understanding that it’s OK to calibrate an OLS model with non-stationary time-series data, as long as the residuals are a stationary time series. This would imply that they ‘co-integrate’. So if I can calibrate an ARMAX model with a couple AR lags that produces a residuals series with no autocorrelation, in general, that’s OK… right? Or, as you said, do I need to difference?


  • liana

    Hi Rob,

    I’m trying to fit an arima model with one exogenous variable using auto.arima with the xreg option. I’m having some doubts because neither of my series is stationary and they are both seasonal. Do I have to apply a log and difference my series first? How do I treat the seasonality in the original time series and in the exogenous series?

  • Herimanitra

    Dear Sir,

    Your website is a valuable resource.

    I’m curious to know when are you going to talk about VAR, ECM, VECM and more sophisticated models?

    • I don’t have an agenda. I write about things that interest me. Currently, I’m not thinking about these multivariate time series models.

      • Herimanitra

        I see you talked a little about the VAR model in your book:

        I was expecting to see better results in the VAR forecasting (as it is considered superior to other models).

        So my questions are:

        1) are these variables stationary?

        2) if yes, why do the forecasts seem flat, or should one include a seasonal pattern to give better results, like in this: (by adding method=”naive”)?

        Thanks in advance,

        • Yes, the variables look stationary. There is not much signal here, so the forecasts are relatively flat. No method is going to work very well when there is not much signal. The advantage of the VAR is that it is using all the available data relatively efficiently.

  • Dietmar

    Hi Rob,

    in a regression with ARIMA errors I ran into the following questions. What is the relationship between the value of the my.xreg parameter and the prediction error, e.g., MAE, which is calculated relative to the random walk model? Can the value of my.xreg be interpreted as a measure of the importance of the covariate?

    I was – perhaps naively – assuming that there is a linear relationship between my.xreg and forecast error/MAE. In the sense that the lower the MAE the greater/smaller the value of my.xreg parameter. But this does not seem to be the case. Do you have any suggestions on this? Thanks!



  • Spinoza

    Prof. Hyndman,

    In one of my projects I am trying to implement a transfer function in R, and I am using the arima class within the TSA package for the transfer function.

    In time series transfer function models there is a decay parameter in the formula (let’s call it b). In the TSA package that decay parameter is not mentioned. When I used other software before (such as SAS), I used to determine b after analyzing the ‘prewhitened series’. But in the TSA package in R, is there no need to specify the decay parameter once you analyze the CCF?

    If not, how am I going to know when the decay starts?

    I understand the CCF is used after prewhitening to determine how to filter the outputs, but where does b come into the picture?



    • I am not the author of the TSA package. Please ask on

  • Kavitha R

    very helpful

  • Kavitha R

    Thanks Dr Rob. But I have seen references where they use regressions for time series data (harmonic regression for adjusting seasonality [I am not sure whether we can include independent variables or not], Poisson regression, etc., mostly without any adjustment to accommodate the serial dependency of time series data). Is that correct? If it is correct, what are the important things that have to be taken care of when we use regression-based methods for time series data? Are the results comparable with ARMAX models?

    • Regression with time series data but no accommodation for serial correlation is ok if there is no serial correlation. But if serial correlation is present, the estimated coefficients are still unbiased although the standard errors, etc., are wrong.

  • Angie


    I would like to ask for your help: how can I simulate different forms of transfer functions in R?


  • Jim

    Hi Rob – Is there a preference when modeling time series data (with predictors) between using linear regression (possibly with lagged dependent and independent variables on the RHS) and using robust standard errors like Newey West for serial correlation VERSUS using the time series regression with arima errors you show above? Specifically the forecast accuracy and the inference on the coefficients. Thanks!

    • It depends on the purpose of the model. If you want to do forecasting, then you need to use a time series approach to use the serial correlation in modifying the forecasts. If you want to understand the effect of each regressor, and are therefore primarily interested in the regression coefficients, then the serial correlation is just a nuisance and using the Newey-West approach is simpler.

      • Jim

        Thanks for the response that helps my understanding. Are the standard errors from the time series regression with arima errors for the predictors X, adjusted in a way similar to treating the serial correlation as a nuisance – i.e does time series regression with arima errors produce accurate standard errors and is thus the best of both worlds (forecast and inference)?

        • Yes, the standard errors from the regression+ARIMA model are accurate.

          • Geoffrey

            Hi Rob,

            Along these same lines – can you recommend an R tutorial for time series regression, where the primary interest is exogenous variable coefficients (with properly ARIMA corrected standard errors)? Everything I’ve found so far is focused on forecasting. Thank you.

          • Try Cryer and Chan’s book, chapter 11.

  • Aanshi

    Hi. Would you please explain how to calculate z_t (the white noise process)?

    I am trying to calculate baseline sales using an ARIMAX model.
    Details for the same:
    Dependent variable = Sales
    Independent variable (xreg) = Promotion flags and seasonality index

    Would B be the xreg coefficient (be it for the promo flag or seasonality index),
    and Xt the xreg (promo flag or seasonality index) itself?

    • Try asking about specific problems on

      An estimate of the white noise process is obtained via the residuals.

  • Broke

    Hi Rob,
    Would you please explain how to calculate Zt values for the data points?

    • Estimates of z_t are given by the residuals, which are normally available with the residuals() function.
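      For example, with fit returned by Arima() or auto.arima():

      zhat <- residuals(fit)
      Acf(zhat)

      The residuals should look like white noise if the model is adequate.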

      • Broke

        I used the residuals() function on the model. The output is almost equivalent to subtracting the fitted/predicted series from the actual series. And these are the Zt values. Correct me if I am wrong.

        • Broke

          Hi Rob,
          Awaiting your reply

  • Kris

    What is the actual output shown from arima() in R?

    Say I estimate an AR(1) model with some exogenous variables, x:

    fit <- arima(y, order = c(1, 0, 0), method = "ML", xreg = x)

    Then the output shown in R will be:

    Series: y
    ARIMA(1,0,0) with non-zero mean

             ar1  intercept     x_1     x_2  x_3
          0.8440    -1.0413  0.0009  0.0014    –
    s.e.  0.0051     0.3055  0.0003  0.0010

    Will the estimates on x “come” from the first regression, y_t = B x_t + n_t, while the estimate on ar1 comes from the second regression, n_t = phi_1 n_{t-1} + z_t?

    In other words, will R run the two regressions, saving the relevant output from both and concatenating the output of both?

    I mainly ask for interpretation reasons: can I trust that the beta and phi estimates are marginal changes?

    • The parameters are estimated simultaneously, not in two stages.

      • Kris

        Hi Rob,

        Thank you for the quick reply.

        I don’t know if my question is stupid, but I still don’t quite understand what output I am getting from arima().
        My main issue is that when I try to estimate the same model, y ~ lag(y,-1) + x, using dynlm() for example, I don’t get the same results. In my mind I should get the same results from both estimation methods.

        Or would I need to correct the estimates from dynlm by dividing them by (1-phi)?

        • They are completely different models as I’ve explained in the post. arima() fits a regression with ARMA errors, but it fits it in one step, not in two stages as you have described. dynlm fits an ARMAX model.

  • anup parmar

    Hi Rob,

    I am unable to run this code:
    fcast <- forecast(fit2,xreg=rep(mean(usconsumption[,2]),8), h=8)
    Below is the error when I try to run the above code:

    Error in :
    argument "newdata" is missing, with no default
    Kindly please help me out.

  • anup parmar

    Hi Rob,
    When I try to run this code:
    fit <- Arima(y, xreg=x, order=c(1,1,0))

    no output comes, and it shows me the error below:

    Error in NCOL(x) : object 'y' not found
    Kindly please help me out.

    • Make sure x and y exist.

    • anup parmar

      Thank you Rob!

      One more question: how do I calculate forecast vs actual performance
      measurement in R?

  • anup parmar

    Hi Rob,
    How do I calculate forecast vs actual performance measurement in R? And how do I draw an overlay graph in R?

  • Patricia


    I am using regression with ARMA errors (the Arima function from the forecast package in R) to reconstruct some water discharge data from 8 stations. I was wondering if there is a function in R that does a regression with ARMA errors in a vectorial way? Each of my stations is, in turn, response variable and covariate, and I was thinking that maybe if I work with all the stations together as a vector I would get an improvement in my forecast? I found the VAR function (vars package), but it works with uncorrelated errors…
    Did you come across this problem? Can you give me some insights or advice?

    Thank you in advance for your time!

  • Fred

    Thanks for such a good article. Can you point me to a few good references about these models? (I am looking for more application-based references, more suitable for engineers.)
    Also, I am not very familiar with statistical terms: what is a covariate and how can we estimate it?

  • Maria


    I would like to study the impact of the advertising of a product on its sales (weekly data for 5 years).

    As the final aim is to forecast what the impact on sales of a change in the advertising presence in the media would be, I was considering including other variables as well (such as competitors’ prices, macroeconomic variables, …).

    As I know that advertising has a long-lasting effect and that, for example, at time t I would only be able to know prices at t-1, I need to include lags.

    Would ARIMAX model be appropriate?

    Shall I just try to use a regression with a trend+seasonality+lagged variables?

    Thank you.

  • ene100

    I see you’ve requested that people ask questions at Cross Validated rather than here. I’ve just made new account for my question, which has to do with transfer function estimation. If you have time, could you take a look? Thanks.

  • Hector

    Hi Rob J Hyndman,

    I am working with time series in which there can exist a nonlinear relationship with other exogenous variables, and an ARMA model can be fitted to the residuals. I am interested in using the function arima() for this task because the parameters are estimated simultaneously, not in two stages. I have not found a way to fit such a nonlinear model and, from the residuals, fit an ARMA model; could you please guide me with this? (My question has already been published on Cross Validated.)



  • Jesús Juan

    Hi Rob,

    We are using a reg ARIMA model to forecast electricity demand. In the ARIMA part, we have a seasonal component (weekly) and one regular and one seasonal difference.

    The most complex part is the regression model, with 200 variables. We make predictions and we would like to consider the two sources of error: (1) error of the estimate of the regression betas and (2) the noise of the dynamic part (ARIMA).

    Do you have a recommendation on the matter? Can you recommend a bibliography?

    And finally, we are evaluating the use of MATLAB (2013–2014); MATLAB has an econometrics toolbox to estimate these models. For us, working in MATLAB has great advantages in this particular case. Do you have any information on how good this toolbox is for estimating ARIMA and regARIMA models?

  • Anindya Sankar Dey

    HI Rob,

    I was just wondering what will happen if we estimate the regressors first, even if they are not stationary, and then model the error as an ARIMA model and forecast. It often gives good forecast values. What issues might we face?

    • The regression coefficients are not consistent and so the forecasts could be bad.

      • Anindya Sankar Dey

        Hi Rob,

        Thanks for your reply.

        In this case, the independent variables are all dummy variables, and we are not really interested in interpreting the regression coefficients.

        Anindya Sankar Dey

      • Anindya Sankar Dey

        In my case the regressors are all dummy variables; they are like event occurrences in time. Can’t it work in this case?


  • hamedhm

    Thank you for the nice explanation. I am wondering if there is any real dataset on which the transfer function model works well. If so, can you provide a dataset so we can see the model in action? Thanks.

  • Hector

    Hi Rob,

    Recently, I have been working with ARMAX models, but I realised that my modeling problem must in fact be regarded as a multivariate model. Thus, the model to be fitted is either a VARMAX(p=1,q=1,s=0) or a VMAX(q=1,s=0). Do you know a package in R that deals with these two models? I found that the MTS package only fits VARX models.


  • Hector

    Dear Rob,
    Firstly, thanks for sharing all your knowledge in this blog. I see that this site studies “Regression with ARMA errors” in the univariate scenario. Do you know a package/function that deals directly with the multivariate scenario, that is, multivariate regression with VARMA errors?


  • Erin Graham

    Dear Rob,
    I just want to clarify something about auto.arima. I am fitting a regression with Arima errors. If I use auto.arima with the xreg option to select a model, it selects a model that is not differenced (2,0,2). However, if I perform unit root tests on the regressors and dependent variable, most of them are not stationary and require 1 order of differencing. I am wondering why auto.arima doesn’t select a model with differencing, and is it safe to go with the (2,0,2) model knowing that some of the regressors are not stationary?
    Thank you so much for your help.

    • When you have an xreg argument, auto.arima tests the stationarity of the residuals of y ~ x. It would be safer to test the stationarity of y and each x separately, but more difficult to implement and to deal with conflicts. So if you think you need to difference, over-ride the automatic choice by setting d=1 in the call to auto.arima.
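      For example:

      fit <- auto.arima(y, xreg=x, d=1)

      forces one order of differencing regardless of the unit-root test result.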

      • Erin Graham

        Thank you for the reply. If you specify d=1 in auto.arima, are your model coefficients for the regressors specified for the original data, or for the differenced data? In other words, can you interpret the regressor coefficients as you would for a regular regression, or are the coefficients for the model in differences?

  • Brian Miner

    Rob, your contribution (R and otherwise) is extremely appreciated. Great learning for us out here! There is one question about arimax that has troubled me for some time and I can’t find an answer (well, I find conflicting ones; in fact I posted it on Cross Validated without a solid answer) and thought I’d come to the top. The question is: if one is doing linear regression on time series data (mainly for the purpose of inference) and needs to difference the data (first, second, seasonally, etc.) due to a unit root, and also perhaps needs ARMA errors (so uses Arima with xreg), does the interpretation of the regression coefficients change from the undifferenced OLS version? Or do you still interpret them as a change in Y from a 1-unit change in X, just as if you never differenced the dependent and independent variables? Does the degree of differencing matter?

    In case interested:
    Thank you!!

    • It must be *estimated* in differences, but the interpretation can apply to both differences and levels. To see why, think of the undifferenced equation, and apply differencing to all terms on both sides. The coefficients remain unchanged (apart from the intercept which is lost).
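      Explicitly: if $y_t = \beta x_t + n_t$, then differencing both sides gives
      \nabla y_t = \beta \nabla x_t + \nabla n_t,
      so $\beta$ is the same whether the model is written in levels or in differences; only the intercept is lost.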

      • Brian Miner

        Thank you! So if you have a unit root and have to difference (even seasonally and not just a basic first difference) the regression can still answer the same inferential question as if it was never differenced? That is good news!

  • Skeet Michael Singleton

    Dear Dr. Hyndman,

    You wrote, “There is not much to choose between the models in terms of forecasting ability, but the additional ease of interpretation in the second one makes it attractive.” Has someone done a comprehensive test of the difference between the two methods in terms of forecasting? Could you please provide a reference of the trade-off?

    Thank you,


    • I’ve not seen any empirical comparisons, but it is fairly obvious that they will give almost the same results, so there is not much point doing it.

  • imrul hasan

    Hello Sir,

    I have seen one of your answers here ( ) about fitting a single ARIMA model to multiple time series. I wanted to know: is there any way to do the same for an ARIMAX model? That is, I want to fit an ARIMAX model for multiple time series. I will be waiting for your reply. Thanks in advance.

  • Sana

    Hi Rob,

    I wanted to know how are Transfer Function Models built from ARIMAX models?

  • Arsa Nikzad

    Hi Rob,
    I wonder what the maximum practical number of regressors is that can be used with auto.arima.
    Do you recommend PCA or PLS for dimension reduction of the regressors?
    Thank you.

  • germ

    Hi Prof Rob,

    I have a problem with auto.arima() on the insurance example from your textbook on OTexts. It is based on the dynamic regression model, so I feel it is suitable to ask here. I hope you can give some insights on the auto.arima() problem here:

  • This post is great for its clarity and I have found myself reading it many times over some years. Have you formed any more thoughts or written elsewhere on TSA::arimax or other implementations of transfer function models in R since then?

    • The arfima package does transfer function modelling, and includes prediction as far as I know. But I haven’t used it myself.

  • germ

    Hi Prof Rob,

    Posting this question here because it seems the most appropriate. Been seeking for help in cross validate but no one seems to have the answer.

    With regards to dynamic regression with lagged predictors (refer to your textbook 9.1), with the example of Quotes and Adverts.

    fit1 <- auto.arima(insurance[4:40,1], xreg=Advert[4:40,1], d=0)
    fit2 <- auto.arima(insurance[4:40,1], xreg=Advert[4:40,1:2], d=0)
    fit3 <- auto.arima(insurance[4:40,1], xreg=Advert[4:40,1:3], d=0)
    fit4 <- auto.arima(insurance[4:40,1], xreg=Advert[4:40,1:4], d=0)

    Notice you used d=0. But if I were to remove it, the auto.arima function would do a first differencing based on the KPSS test. However, I have checked both Quotes and the predictors using kpss.test() and they are all stationary, which explains why you used d=0.

    So why is auto.arima() doing differencing when it is not required? Is there an explanation?


    • auto.arima actually applies the KPSS test to the residuals from the regression of y ~ x. In this case, the residuals appear non-stationary, and so auto.arima wants to set d=1. I chose to set d=0 to keep the model simpler for the textbook, and because the non-stationarity was not so severe. The ACF1 is 0.64.

      • germ

        thank you Prof Rob for the prompt reply.

        You emphasized that the y as well as the covariate must be stationary before passing them into a dynamic regression model. As for the residuals, must we also ensure they are stationary? And how do you judge that the residuals’ non-stationarity is not severe?

        • The problem is that checking stationarity of x only makes sense when x is random, and there is no way for auto.arima to tell whether x is fixed or random. The user needs to take some responsibility here. An automated procedure cannot determine the context of the data. Checking the residuals of y~x is a bare minimum test — if either y or x is non-stationary, and they are not cointegrated, then the residuals will be non-stationary.

          Statistical modelling requires careful thinking, and an awareness of the robustness of the results to the model choices made. You can’t just blindly follow recipes or automated procedures.

          • germ

            I assume that when you say x is random or fixed, you mean whether x is a time series?

            In this context, the insurance variable Quotes as well as the lagged predictors Advert are both stationary, which makes it puzzling why the residuals of y ~ x are non-stationary. Your textbook emphasizes that the response as well as the predictors must always be stationary, which is something my prof emphasized back in university. I cannot seem to find any information regarding residual non-stationarity.
            Only autocorrelation checks are done on residuals.