Forecasting with long seasonal periods

I am often asked how to fit an ARIMA or ETS model with data having a long seasonal period such as 365 for daily data or 48 for half-hourly data. Generally, seasonal versions of ARIMA and ETS models are designed for shorter periods such as 12 for monthly data or 4 for quarterly data.

The ets() function in the forecast package restricts seasonality to be a maximum period of 24 to allow hourly data but not data with a larger seasonal frequency. The problem is that there are m-1 parameters to be estimated for the initial seasonal states where m is the seasonal period. So for large m, the estimation becomes almost impossible.

The arima() function will allow a seasonal period up to m=350 but in practice will usually run out of memory whenever the seasonal period is more than about 200. I haven’t dug into the code to find out why this is the case — theoretically it would be possible to have any length of seasonality as the number of parameters to be estimated does not depend on the seasonal order. However, seasonal differencing of very high order does not make a lot of sense — for daily data it involves comparing what happened today with what happened exactly a year ago and there is no constraint that the seasonal pattern is smooth.

For such data I prefer a Fourier series approach where the seasonal pattern is modelled using Fourier terms with short-term time series dynamics allowed in the error. For example, consider the following model:

    \[y_t = a + \sum_{k=1}^K \left[ \alpha_k\sin(2\pi kt/m) + \beta_k\cos(2\pi kt/m)\right] + N_t,\]

where N_t is an ARIMA process. The value of K can be chosen by minimizing the AIC.

This is easily fitted using R. Here is an example with m=200, K=4 and N_t an ARIMA(2,0,1) process:

n <- 2000
m <- 200
y <- ts(rnorm(n) + (1:n)%%100/30, f=m)
fit <- Arima(y, order=c(2,0,1), xreg=fourier(y, K=4))
plot(forecast(fit, h=2*m, xreg=fourierf(y, K=4, h=2*m)))

The advantages of this approach are:

  • it allows any length seasonality;
  • for data with more than one seasonal period, you can include Fourier terms of different frequencies;
  • the seasonal pattern is smooth for small values of K (but more wiggly seasonality can be handled by increasing K);
  • the short-term dynamics are easily handled with a simple ARMA error.

The only real disadvantage (compared to a seasonal ARIMA model) that I can think of is that the seasonality is assumed to be fixed — the pattern is not allowed to change over time. But in practice, seasonality is usually remarkably constant so this is not a big disadvantage except for very long time series.

The order of N_t can also be chosen automatically:

fit <- auto.arima(y, seasonal=FALSE, xreg=fourier(y, K=4))

Note that the ARIMA model for N_t should be non-seasonal.

A state space approach to the problem is possible using TBATS models, as described in this paper on complex seasonality. The tbats() function which implements TBATS models will automatically select the Fourier order as well as the other aspects of the model. One advantage of the TBATS model is the seasonality is allowed to change slowly over time.

Related Posts:

  • Matteo Bertini

    The 350 limit in R is in arima.c: the array rbar can be very big.

    SEXP getQ0(SEXP sPhi, SEXP sTheta)
    /* NB: nrbar could overflow */
    int r = max(p, q + 1);
    size_t np = r * (r + 1) / 2, nrbar = np * (np - 1) / 2;
    /* This is the limit using an int index. We could use
    size_t and get more on a 64-bit system,
    but there seems no practical need.*/
    if(r > 350) error(_("maximum supported lag is 350"));
    rbar = (double *) R_alloc(nrbar, sizeof(double));

    The problem with m=350 is that in some domains, like traffic forecasting, is quite common to have 4 samples per hour and a weekly seasonality, resulting in m=672.
    I think the matrix is related to the Kalman filter, but I’ve found no way to switch to a lighter method.

  • Erhan

    Dear Mr. Hyndman,
    I am working on a thesis right now, to use ARIMA-GARCH modeling approach to model a data with multiple seasonal pattern ( daily and weekly ). After a search on internet, I downloaded your forecast package. However, I could not find out how to model multiple seasonality using R with your package ? auto.arima( etc. ) does not find multiple seasonality and I guess I cound not use fourier() function effectively. Can you please lead me a way to model multiple seasonality in R.
    Best Regards,

    • Rob J Hyndman

      Try the bats() and tbats() functions.

      • Guo

        I am still confusing about the multiple seasonal period model. How should I do if I want to fit a model ARIMA(2,0,2)*(1,0,0)24*(1,0,1)168? Many thanks!

        • Rob J Hyndman

          R does not have the facilities to fit that model.

          • Anthony

            Dr. Hyndman,
            What other software have that facility.

          • Rob J Hyndman

            I don’t know

          • Rahul

            Try WEKA 3.7.11

  • Pingback: What kind of model could I use to predict annual server loads?()

  • Ahmet Gurel

    Hello Mr. Hyndman,

    Thank you for all your work and blog posts. I read your paper (on complex seasonality) but I am confused at a simple point. What is the exact difference between an ARIMA(p,d,q) with season S with and Arima(p,d,q,P,D,Q,S). I believe the latter components are essentially have the same interpretation with p,d,q with lag S. Is this correct? And also, I was wondering if there is a particular reason why you don’t consider these models in the paper as additional parameters of tbats or bats?

    Thank you very much,


    • Rob J Hyndman

      1. Read for seasonal ARIMA models.

      2. I have no software for fitting multiple period seasonal ARIMA models.

      • Ahmet Gurel

        Thank you very much, that was helpful.

  • Ahmet Gurel

    Hello again Dr.
    Hyndman ,

    I have again a quick question for you. While reading your book, I was stuck at a very simple point.

    Lets say I have data composed of 10 observations and I want to predict the next 5 observations. (arimaForecast(data, h=5)). Then I take the the first prediction (at time 10+1) and add it to data. Then I predict the next 4 observations ((arimaForecast(append(data,forecast11) , h=4)).

    Shouldn’t these forecasts be the same since I did replace yhat as the algorithms do? Actually this question is not arima specific, but for all the forecasting algortihms in Forecast package.

    Thank you very much


    • Rob J Hyndman

      For a linear model (such as ARIMA), the point forecasts should be the same (provided the ARIMA model does not change), although the forecast variances will change.

  • Arzu

    Dr. Hyderman,

    Is it normal that for a double seasonal ts with frequency 144 (1000 observations)-if I specify seasonal periods 144 and 1008, tbats and bats functions do not converge in reasonable time limits (i tried to run them for 3 hrs). Is it because of auto.arima function? Even if I did specify parameters such as max.p etc. I still could not get an answer. Do you have any comments regarding the cause?

    • Rob J Hyndman

      The functions are slow with high seasonal periods and a long time series. I’m trying to figure out how to make them faster.

      • Arzu

        Thank you for clarification.

  • Anindya Sankar Dey

    Dr. Hyndman,

    In your above code, what I’m not understanding is,when you are forecasting, why in the fourier function your are passing (n+1):(2*m)? Can you kindly help?

    • Rob J Hyndman

      They are the time periods I want to forecast, except there is an error in your question. The code as given in my post is n+1:(2*m).

      • Anindya Sankar Dey


  • Giorgio

    Dear Rob,

    in case of an ARIMAX model, would you advise applying the fourier deseasonalization also to the X part, considering also that each X covariate can generally have its own seasonality?

    Thanks for the clarification, best regards

    • Rob J Hyndman

      Maybe. If the covariates are highly seasonal, then seasonally adjusting them will prevent collinearity with the fourier terms. But it changes the interpretation of the model. If you have highly seasonal covariates, you may not need to user Fourier terms at all, as the seasonality in the response variable may be entirely driven by the seasonality in the covariates.

  • Pingback: Handling Long Seasonality in a Time Series | Jeffrey Chin Tang Wong()

  • Marco Powers

    Hi Professor Hyndman-

    Question — why did you select K=4 in your above code? Is there an automatic process in R used to select K?

    • Rob J Hyndman

      It was just an example. In practice, you can select K using the AIC.

      • olivier des

        Hello Rob,

        So K depends on the minimum AIC, but is there any formula or procedure to calculate it directly?

        Thank you.

        • Rob J Hyndman

          No. Just compute the AIC for a few models and select the model with the smallest AIC value.

  • Herimanitra

    Dear Prof Hyndman,

    I don’t know if this is the right post to post this question. I have already tried some googling approach and saw your StackOverflow but I don’t find solution(s) yet.
    Actually, I’m experiencing problem to set up a daily time series for the purpose of prediction with “forecast” package:

    #where train is a dataframe, “date” is the name of the time series period with numeric format, for example 19940101=1994/01/01 , etc.
    It seems to work (because no error raises during set up) but auto.arima() can’t detect seasonnality after…
    using zoo package doesn’t solve the problem as forecast() doesn’t work with it.

    Could you please provide the correct setting I need to do?

    • Herimanitra

      In case where it is impossible (to set up daily time series and forecast with the “forecast” package),
      I was thinking of extracting monthly values (as, argued, in your 1st paragraph) from the original train and then use the “forecast” package. After obtaining predictions, I was thinking of temporal disagregation (Chow-Lin, DENTON) to recover the prediction for each day between predicted months

      • Rob J Hyndman

        Again, actually reading the article you are commenting on would be a very good place to start. You can fit a model using Fourier series. An alternative is tbats().

    • Rob J Hyndman

      auto.arima will not fit a seasonal model with period greater than 350. Please read the article above.

      • Herimanitra

        OK, thanks, so if really understand the article, if I had 3years with 365days (for example, the 2 first year) and 366days (for the last year), I can specify 02 Fourier Series like this?
        fourier (…) {

        X[,2*i-1] <- sin(2*pi*i*t/period1)
        X[,2*i] <- cos(2*pi*i*t/period1)
        X[,2*i+1] <- sin(2*pi*i*t/period2)
        X[,2*i+2] <- cos(2*pi*i*t/period2)

        Am I correct in this specification?

        • Rob J Hyndman

          No. The period of the data should not change from year to year. Presumably you mean that the third year is a leap year. In that case, set the period to be 365.25 years.

          • Herimanitra

            Yes, leap year,

          • Herimanitra

            I’m also curious to know how to restrict output of the forecast() function to the specified horizon’s length in order to speed up computation, it’s really slow at a certain level (~2000)
            I’ve tried something like this but it didn’t work:


          • Herimanitra

            I figure out now that one needs to do: forecast(mymodel,h=100,method=”naive”)$mean
            to obtain mean forecast in the desired horizons but the main problem I have is that with HoltWinters, it’s really slow compared to loess decomposition.

  • Herimanitra

    I’ve just seen someone setup ts() with daily frequency and use forecast() after:

    What do you think about that?

    Here is the link:

    • Rob J Hyndman

      If you do that with auto.arima, it will return a non-seasonal model. If you try fitting a seasonal model with Arima or arima, it will return an error.

  • Sonbol

    Hello Dr. Hyndman,

    I have a time series measuring solar radiance every minute. So I am assuming that I have a seasonality with length of 1440 (24*60). I am trying to use Fourier as xreg to consider the seasonality. I’ve considered a model like:

    fit <-Arima(training, order=c(3,1,3), xreg=fourier(training,3)).
    My aim is not forecasting. I am going to evaluate the model on a testing set. So I have something like

    fit1 <- Arima(checking, model=fit, xreg=(?))
    My problem is that I do not know how to choose xreg on the testing data. Is it right to consider xreg=fourier(checking,3)?

  • Shubham

    What does ‘n’ mean in the code? Is it the number of data points in the time series?

    • Rob J Hyndman


  • Azza Ali

    If I integrat time series with judgment what is right Equation ?

  • Pingback: Arima function doesn't consider seasonal components | Question and Answer()

  • Gamage

    Dear Prof Hyndman,

    I’m doing my final year undergraduate research on predicting tourist arrivals.
    Monthly data is available from 1973 to 2013. But I supposed to use data from 2000 to 2013 in my analysis.
    But the civil war of the country, which was lasting more than 30 years, was over in 2009. Hence after 2008 the behavior of tourist arrivals is rather different. (time series plot of monthly tourist arrivals from 1973-2013is attached)

    Hence if I use data from 2009-2013 I have 60 monthly data points.
    I surfed net & studied some articles for the minimum sample size required in time series analysis. In those articles they were saying that minimum of 50-60 data points is required.

    Sir I would like to know whether it is capable to do time series analysis with these 60 monthly observations

    Thank you.

  • A.D

    Dear Prof. Hyndman,

    I have two large time series data. One is separated by seconds intervals and the other by minutes. The length of each time series is 180 days. I’m using R (3.1.1) for forecasting the data. I’d like to know the value of the “frequency” argument in the ts() function in R, for each data set.

    Since most of the examples and cases I’ve seen so far are for months or days at the most, it is quite confusing for me when dealing with equally separated seconds or minutes. According to my understanding, the “frequency” argument is the number of observations per season. So what is the “season” in the case of seconds/minutes? My guess is that since there are 86,400 seconds and 1440 minutes a day, these should be the values for the “freq” argument. Is that correct?

    I have posted a more constructed version of this question on cross validation:

    unfortunately, I haven’t received any answer to this question.
    I would very much appreciate you help!!

  • Akhil Dua

    Hello Sir,

    I am working with daily data on flight booking by fare class.

    Data sample :

    A B C D Departure date

    10 20 30 5 15th Nov 2012

    5 50 2 6 16th Nov 2012
    . . . . .
    . . . . .
    5 50 2 6 16th Nov 2014
    Here A, B, C and D are fare classes
    I tried fitting arima model to all fare classes using auto.arima model for forecasting demand for the next day. I know that there is seasonal effect and weekend effect but auto.arima doesn’t pick up these effect.
    Next I tried running auto.arima with dummies for month and weekend but still the model doesn’t fit well with data and the fitted values from the model are very different from the actual values.
    Can you please suggest me which function of the package shall I use for modeling this kind of seasonality

  • Pingback: Fitting models to short time series | Hyndsight()

  • Pingback: Reddit Button Forecast (Post-Mortem*: 5/25) | Bryan S. Weber()

  • Indrek Sepp

    Dear Prof. Hyndman,

    Is there a danger that using fourier seasonal regressor(s) on time-series with non-seasonal trends and/or level shifts, that have a greater magnitude than the seasonal component of the TS, would result in the fourier series fitting these artifacts, and projecting them into the forecast?

  • 黄耀鹏

    Thank you very much for sharing! I’ve just met such kind of problem when using the arima model with a season priod of 288.

  • Daumantas

    In the equation for y_t, shouldn’t alpha and beta have subscripts k?

    • Rob J Hyndman

      Yes. Now fixed.

  • Anton Lebedevich

    it seems that there is a text missing between paragraphs:

    “For such data I prefer a Fourier series approach where the seasonal
    pattern is modelled using Fourier terms with short-term time series
    dynamics allowed in the error. For example, consider the following

    It is much harder to do something like this for ETS models. One of our students has been working on this and our paper on complex seasonality describes the procedure. We plan to add this functionality to the forecast package soon, but the code is not quite ready yet.”

    after words “consider the following model:”

    • Rob J Hyndman

      Now fixed. I edited this post recently and left a stray few sentences in.

  • João Victor Zuanazzi Leme

    Thanks a lot for this method!

    Did you officially publish anything on this?
    I was having a hard time trying to fit a proper SARIMA model on autocorrelated dailly observations with annual seasonal period and needed the residuals to fit a distribution (i.e., needed independent observations).
    If you don’t mind, I would like to use it in my term paper and would like to know how would you like your reference to be.

    Best regards, J.V.

    • Rob J Hyndman

      Not exactly. It’s a special case of dynamic regression models discussed in Section 9.1 of my forecasting textbook (

      • João Victor Z. Leme

        I’m analyzing meteorological data and I’m interested in modelling bivariate distributions using Copula models. The problem is that these kind of variables are obviously autocorrelated (for example, daily measures of max. temperature or daily measures of precipitation). It’s not ok to fit a distribution to a variable of this kind because the assumption required for fitting a distribution is that the observations are independent within the sample, right?

        A professor suggested working with the residuals from an ARIMA or a SARIMA model, but I don’t seem to understand what to do with the residuals after I fit them a distribution (generally non-Normal). Or a bivariate distribution: the residuals from different variables are clearly correlated, so I can (for ex.) simulate random bivariate observations using the Copula approach.
        The correlations between the original variables and the correlation between their models residuals are often close (and their scatterplots look similar).

        The question is: would I be able to generate observations from the original variables using only their simulated residuals? (taking in account that I know the order of their respective ARIMA or SARIMA models).

        Am I using the right method, to start with?

        Thanks a lot!

        • João Victor Z. Leme

          Thinking of now, it surely would be possible to forecast an observation using the ARIMA model plus the generated residual (sorry for the silly question, didn’t think it straight).

          But does it make sense to model the dependence between the residuals of different variables when these variables are serially correlated?

          I’m just an undergradute student, not really an expert in time series.

      • João Victor Z. Leme

        Modelling the dependence of the residuals worked really well for simulated data (adding the generated residuals to the means of SARIMA forecasts). It maintained the dependence structure from the original variables (bivariate case).

  • Ethan

    Prof. Hyndman,
    how does the forecast() function estimate the forecast$mean values in this case? Is it like an extrapolation from the ARIMA model with added seasonality given by the Fourier series?

    Regards, Ethan

    • Rob J Hyndman