Forecasting with daily data

I’ve had several emails recently asking how to forecast daily data in R. Unless the time series is very long, the easiest approach is to simply set the frequency attribute to 7.

y <- ts(x, frequency=7)

Then any of the usual time series forecasting methods should produce reasonable forecasts. For example

fit <- ets(y)
fc <- forecast(fit)

When the time series is long enough to take in more than a year, then it may be necessary to allow for annual seasonality as well as weekly seasonality. In that case, a multiple seasonal model such as TBATS is required.

y <- msts(x, seasonal.periods=c(7,365.25))
fit <- tbats(y)
fc <- forecast(fit)

This should capture the weekly pattern as well as the longer annual pattern. The period 365.25 is the average length of a year allowing for leap years. In some countries, alternative or additional year lengths may be necessary. For example, with the Turkish electricity data analysed in De Livera et al. (JASA 2011), we used three seasonal periods: 7, 354.37 and 365.25. The period 354.37 is the average length of a year in the Islamic calendar.
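As a quick check on where these two year lengths come from, here is a small arithmetic sketch. It assumes a four-year leap cycle for 365.25, and for the Islamic year it assumes the tabular calendar's 30-year cycle (11 leap years of 355 days) and a mean lunar month of about 29.530588 days. The variable names are only illustrative.

```r
# Average year length with one leap year in every four
julian_year <- (3 * 365 + 366) / 4

# Tabular Islamic calendar: 30-year cycle, 19 years of 354 days, 11 of 355
islamic_year <- (19 * 354 + 11 * 355) / 30

# Alternative view: twelve mean lunar months
lunar_year <- 12 * 29.530588

round(c(julian_year, islamic_year, lunar_year), 2)
```

Both ways of computing the Islamic year round to 354.37, which is the value used as a seasonal period above.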

Capturing seasonality associated with moving events such as Easter or the Chinese New Year is more difficult. Even with monthly data, this can be tricky as the festivals can fall in either March or April (for Easter) or in January or February (for the Chinese New Year). The usual seasonal models don’t allow for this, and even the complex seasonality discussed in my JASA paper assumes that the seasonal patterns occur at the same time in each year. The best way to deal with moving holiday effects is to use dummy variables. However, neither ETS nor TBATS models allow for covariates. A state space model of the same form as TBATS but with multiple sources of error and covariates could be used, but I don’t have any R code to do that.

Instead, I would use a regression model with ARIMA errors, where the regression terms include any dummy holiday effects as well as the longer annual seasonality. Unless there are many decades of data, it is usually reasonable to assume that the annual seasonal shape is unchanged from year to year, and so Fourier terms can be used to model the annual seasonality. Suppose we use K=5 Fourier terms to model annual seasonality, and that the holiday dummy variables are in the vector holiday with 100 future values in holidayf. Then the following code will fit an appropriate model.

y <- ts(x, frequency=7)
z <- fourier(ts(x, frequency=365.25), K=5)
zf <- fourierf(ts(x, frequency=365.25), K=5, h=100)
fit <- auto.arima(y, xreg=cbind(z,holiday), seasonal=FALSE)
fc <- forecast(fit, xreg=cbind(zf,holidayf), h=100)
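The vectors holiday and holidayf are assumed to exist in the code above. One possible way to build them, sketched here with base R only, is to compare the dates of the series against a list of holiday dates. The start date, the series length and the particular holidays listed are assumptions made for illustration.

```r
# Dates covered by the historical series (assumed start date and length)
dates <- seq(as.Date("2012-01-01"), by = "day", length.out = 730)

# Dates for the 100 days to be forecast
datesf <- seq(max(dates) + 1, by = "day", length.out = 100)

# Hypothetical holiday list; moving events such as Easter are entered
# year by year because they do not fall on a fixed date
holidays <- as.Date(c("2012-01-01", "2012-04-08", "2012-12-25",
                      "2013-01-01", "2013-03-31", "2013-12-25",
                      "2014-01-01", "2014-04-20"))

# 0/1 dummy variables aligned with the observed and future dates
holiday  <- as.numeric(dates %in% holidays)
holidayf <- as.numeric(datesf %in% holidays)
```

If different holidays are expected to have different effects, a separate dummy column per holiday could be built the same way and combined with cbind().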

The order K can be chosen by minimizing the AIC of the fitted model.
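A minimal sketch of that selection step follows. To keep it self-contained it does not use the forecast package: the Fourier terms come from a hypothetical helper fourier_terms(), the data are simulated, and a fixed AR(1) error model from stats::arima() stands in for auto.arima(). All of these are assumptions for illustration only.

```r
# Hypothetical helper: sine/cosine pairs for period m and harmonics 1..K,
# evaluated at time indices t
fourier_terms <- function(t, m, K) {
  out <- do.call(cbind, lapply(seq_len(K), function(k) {
    cbind(sin(2 * pi * k * t / m), cos(2 * pi * k * t / m))
  }))
  colnames(out) <- paste0(rep(c("S", "C"), K), rep(seq_len(K), each = 2))
  out
}

set.seed(1)
n <- 3 * 365
t <- seq_len(n)

# Simulated daily series: two annual harmonics plus AR(1) noise
x <- 50 + 10 * sin(2 * pi * t / 365.25) + 4 * cos(4 * pi * t / 365.25) +
  as.numeric(arima.sim(list(ar = 0.5), n = n))

# Fit a regression with AR(1) errors for K = 1..5 and record each AIC
aic <- sapply(1:5, function(K) {
  arima(x, order = c(1, 0, 0), xreg = fourier_terms(t, 365.25, K))$aic
})

best_K <- which.min(aic)
```

With real data the same loop could wrap the auto.arima() call shown earlier, comparing the AIC (or AICc) stored in each fitted model.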

  • Kris Ewican

    This is very useful Rob. Thank you very much!

  • mike

    Thanks for the post. A couple of questions: 1) When using regression with ARMA errors and Fourier explanatory variables, do you go back and remove the individual Fourier terms that are insignificant? 2) Can you use xreg= with TBATS?

    • Rob J Hyndman

      1. No. I use the AIC to determine K, and leave all the terms in. Significance is not as important as being useful for prediction, and they are not the same thing.

      2. No.

  • Richard Warnung

    Very nice summary, very useful, thanks for all your efforts and best wishes from Vienna.

  • Leo

    Professor Hyndman,
    What will be the seasonality of hourly data that’s available for weekdays (5 days of week) or daily data that’s available for five days of the week?

    • Leo

      I meant seasonal period which I guess should be 5 for weekdays data.

  • harvey chaparro

    How do I create the holiday and holidayf vectors?

  • Anthony

    Do you think ets or tbats are useful for analyzing daily stock market data?

    Moreover, I am using tbats but R is running the bats command. At least that’s what it shows when I enter the datatbats (my tbats object) at the R prompt.

    • Rob J Hyndman

      No. These functions are for data that have trends and seasonality. Daily stock market data typically have neither.

      When you call tbats() with a non seasonal time series, it will return a non-seasonal BATS model as that is equivalent to a non-seasonal TBATS model.

      • Anthony

        Stock market data does have a trend. Perhaps simple Holt’s method can produce something good.

        • Rob J Hyndman

          Wrong. Stock market data has some apparent local trends that are simply the local effects of random walk like behaviour. Holt’s method will give useless forecasts for daily stock data.

  • A.M.jaber

    I would like to ask how I can extract and forecast the trend using Empirical Mode Decomposition with R code for daily stock market data.

  • Abhishek

    Respected Sir,

    I am a student of computer science and I am currently working on my project. You are the expert in this field and I have seen on your blog that you help everyone, which is very kind of you.

    I am working on forecasting energy consumption in R. I have data for the previous 30 years and I want to forecast the next 5 years. Can you suggest the best way to do it?

    Thank you.

  • Ville

    Hi, one can determine the parameters alpha, gamma etc or give upper and lower bounds for them when using ETS. Can you fix the values or give boundaries to the parameters when using BATS/TBATS? Thanks

    • Rob J Hyndman

      Yes. Read the help file for ets().
      No. BATS/TBATS models are currently only available using a completely automated procedure. We may introduce more manual model specification in a later version.

      • Ville

        Thank you for your reply. I think that more manual model specifications in BATS/TBATS would be a great improvement and I’m looking forward to seeing it in the future!

  • KM

    Mr. Hyndman, let me first thank you immensely for teaching me so much about forecasting. I have taken up a few courses and worked at two leading firms in the past but the amount I’ve learnt from your blog posts is far more.

    I am currently trying to learn more about forecasting using R to aid in my masters course. I got time series data (for 5 years) from a third-party FMCG data vendor, which has weekly and monthly seasonality that I could see by using the decompose() function.

    I am trying to forecast using the code below:

    mydata1 <-read.csv("E:/file.csv",header=TRUE);
    mydata <- msts(mydata1, seasonal.periods=c(7,365.25,354.37,365));
    fit <- tbats(mydata,, use.trend=NULL, use.damped.trend=NULL,seasonal.periods=c(7,365.25,365,354.37), use.arma.errors=TRUE, use.parallel=TRUE,bc.lower=0, bc.upper=1);
    fcast <- forecast(fit,h=433);
    fcast.df <- data.frame(fcast)
    where I have used weekly, Gregorian, Hindu and Hijri calendars to set seasonal periods.
    The correlation between the forecasted data and observed data is ~0.82 which seems low to me. The major reason could be that I can see peaks on a few particular dates like Jan1 and Dec25 year on year which is not forecasted by tbats.
    Is there a way to include this into the code? What is the significance of manually setting the box-cox limits?

    • Rob J Hyndman

      The tbats model does not allow for covariates, so specific effects such as Christmas and New Year cannot be handled by dummy variables. However, you could use the Fourier-ARIMA approach mentioned above, and add the covariate as I’ve explained. The Box-Cox parameter is normally restricted to (0,1). The arguments allow other ranges which are occasionally useful.

      • ninnawei

        How exactly should the holiday vector be used? Could you give an example?

      • KM

        Thank you Mr. Hyndman for the explanation. The data has a weekly and monthly seasonality and requires me to use TBATS as you said here:
        Is there a way to decompose using tbats.components() to get trend, seasonal components and irregular components?
        I was not able to grasp the significance of “level” and “slope” that comes out of tbats.components().
        Are the decompose() or stl() functions usable in its place? Would these take the same parameters and give a proper decomposition?

      • KM

        What I meant to ask was how do I compare the trend (coming out of decompose()) and the level coming out of tbats.components(). I have taken them as the same, and divided the original series by a multiplication of log(trend*slope*irr) to get seasonality.


        • Rob J Hyndman

          decompose() and tbats() use different models, so they are not strictly comparable. But if your tbats model has no Box-Cox transformation, then the trend from decompose is roughly equal to the level from tbats. Both functions already produce seasonal components for you.

          • fei Li

            Hi Professor Hyndman, What if there is Box-Cox transformation, how could we get the trend value from tbats model?

            And is the slope the random error item?

            Thank you for your time!

          • Rob J Hyndman

            You will need to back-transform the trend to get it on the original scale. No, the slope is not the random error term. Please read the documentation to understand the model.
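For readers wondering what the back-transformation looks like, here is a minimal sketch of the Box-Cox transformation and its inverse written in base R. The function names box_cox() and inv_box_cox() are hypothetical, and the value of lambda is assumed to be taken from the fitted model (I believe a tbats fit stores it in its lambda element, but check the documentation).

```r
# Box-Cox transformation for positive data
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Inverse transformation: returns values on the original scale
inv_box_cox <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

lambda <- 0.3                      # assumed value for illustration
y <- c(10, 200, 5000)              # values on the original scale
w <- box_cox(y, lambda)            # transformed scale, where the model lives
y_back <- inv_box_cox(w, lambda)   # back on the original scale
```

Applying inv_box_cox() to the level component would put it on the scale of the data, subject to the usual caveat that a back-transformed component is not an exact additive piece of the original series.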

  • Marco De Nadai

    @robjhyndman:disqus sorry to disturb you, but I need to forecast gas consumption that has daily, weekly (weekdays-weekend) and yearly seasonality. Does it make sense to apply the STL decomposition by LOESS three times? (

    • Rob J Hyndman

      Ask your questions on

  • Mary Rose

    Hi Rob,
    I am forecasting daily data and fitting it to a tbats model:

    y <- msts(x, seasonal.periods=c(7,365.25))
    fit <- tbats(y)

    I know that the function Arima() is used when wanting to update an ARIMA model whenever new data is available. Is there a function in R that will do the same for a tbats model (i.e. update the tbats model for new incoming data)?
    Mary Rose

    • Rob J Hyndman

      Not yet. It’s on my to do list.

      • Mary Rose

        Sounds good.

        I ended up doing the following:

        y <- ts(x, frequency=7)
        z <- fourier(ts(x, frequency=365.25), K=5)
        fit <- auto.arima(y, xreg=z, seasonal=TRUE)

        # new_x is a vector of newly observed daily data with length h
        new_y <- ts(new_x, frequency = 7)
        new_z <- fourierf(ts(new_x, frequency=365.25), K=5, h)

        update <- Arima(x=c(y,new_y), xreg = rbind(z,new_z), model = fit, seasonal=TRUE)

        Would this be a good way to update a daily model when wanting to include both week and annual frequencies?


        Mary Rose

        • Rob J Hyndman

          Yes, that should work very well.

  • Christopher

    I have a data set that is daily data that has a strong weekly pattern (M-F is high traffic, weekends are low traffic).

    When I do:

    gtbats <- tbats(us); fc2 <- forecast(gtbats, h=28) ; plot(fc2)

    The historical data in the plot is missing and the y-axis is incorrect. The shape of the forecast itself appears fine.

    When I use the msts function, everything looks better:

    y <- msts(us, seasonal.periods=c(7))
    fit <- tbats(y); fc <- forecast(fit,h=28); plot(fc)

    Just a note that at first, the plot without msts() appears wildly incorrect, but the forecast itself appears sound shape-wise.

    How do I interpret the x-axis? It doesn't appear to correspond to days…

    The data spans several years. There are no visually obvious yearly patterns. But when I add seasonal.periods=c(7,29.6,365.25), the prediction is more nuanced. I am thinking I could test for the monthly and yearly seasonality by using less "training" data and seeing how well it predicts the actual data. Is there an easier way to determine the underlying seasonality in a time series? (I know I have more reading to do…)

    PS Thank you so very much for your blogs and your other writing and research.

    • Rob J Hyndman

      Use either ts or msts objects, as explained in the help file. Don’t use xts objects.

      The x-axis is in weeks.

      The model will tell you whether there is any annual seasonality.
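To see why the axis is in weeks, note that a ts object with frequency=7 advances its time index by 1/7 per observation, so each whole number marks the start of a new week. A tiny base R illustration (the series itself is just a placeholder):

```r
# Three weeks of placeholder daily observations
y <- ts(1:21, frequency = 7)

# Time index of days 1, 8 and 15: the first day of weeks 1, 2 and 3
week_starts <- time(y)[c(1, 8, 15)]
week_starts
```

The same applies to an msts object whose first seasonal period is 7.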

      • Gregory R. Duchon

        On a related note, if you had weekday data only, would you lower the frequency to 5 as opposed to having weekend values of 0, and would the second seasonal period then become something like 365-(52*2) = 261? (I am looking at support requests that only occur on weekdays as a project for my MS Business Analytics program.)

        • Rob J Hyndman

          Yes, I would use frequency=5. Setting weekends to 0 will create problems in finding appropriate models.
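A small sketch of that idea in base R: drop the weekend rows and declare a period of 5. The dates and values are made up for illustration, and the annual period shown is simply 365.25 scaled by 5/7, which is an assumption that ignores public holidays.

```r
# Four weeks of daily dates starting on a Monday (assumed)
dates <- seq(as.Date("2015-01-05"), by = "day", length.out = 28)

# Placeholder daily counts
x <- seq_along(dates)

# wday is 0 for Sunday, 1-5 for Monday to Friday, 6 for Saturday
is_weekday <- as.POSIXlt(dates)$wday %in% 1:5

# Keep weekdays only and use a weekly period of 5
y <- ts(x[is_weekday], frequency = 5)

# Approximate number of weekdays in a year, usable as an annual period
annual_period <- 365.25 * 5 / 7
```

Here annual_period is about 260.9, close to the 261 suggested in the question.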

          • Gregory R. Duchon

            Thank you professor!

  • randomdude

    Hi Rob,
    is there a way to use the tbats method and extract the remainder of the decomposition like you did it in your “turkey electricity demand” analysis? I asked on CrossValidated some time ago but nobody could help me with that so far:

    Thanks in advance for your time and advice. I have learned so much through your blog and your papers!

  • Alassane

    Hi Sir,

    Thank you for all your interesting explanations. I have been forecasting daily positive time series using tbats. But in some cases I get negative or very large values like 1.25868e+14, whereas my real values are between 0 and 50000. I tried to include lambda=0 in my models but it doesn’t work either. I would like to know if there is a solution to avoid these problems?

    Thank you.


    • Rob J Hyndman

      Can you please provide a reproducible example of problems like this. I am always trying to improve the software, and edge cases that cause problems are helpful in identifying areas for improvement. You can submit bug reports at

      • Alassane

        I submitted the reports on the site. It’s about forecasting daily turnover in 87 countries using 4 years of historical data (from 2011 to 2015).

        Thank you

  • max

    Hi Sir Rob, and thank you for the post. I have a daily time series of the number of buyers on a web site each day. My goal is to forecast the number of buyers (visitors) for future days.

    How can I do this in R, and how can I tell whether my time series is stationary or not? Any suggestions of R code are welcome.

    My data contain 2 columns: date and number of buyers.
    This is my R code and the output. Is it right or not?

    > df=read.csv("byers.csv",sep=";", stringsAsFactors = FALSE,header=TRUE)
    > head(df)
            date byers
    1 01/01/2014  3114
    2 02/01/2014  5954
    3 03/01/2014  5342
    4 04/01/2014  4929
    5 05/01/2014  5633
    6 06/01/2014  5890
    > head(df)
    [1] 3114 5954 5342 4929 5633 5890
    > dftime=ts(df,start=c(2014,01),frequency=365)
    > HWmodel=HoltWinters(dftime,beta=FALSE,gamma=FALSE)
    > HWmodel
    Holt-Winters exponential smoothing without trend and without seasonal component.

    HoltWinters(x = dftime, beta = FALSE, gamma = FALSE)

    Smoothing parameters:
    alpha: 0.8619079
    beta : FALSE
    gamma: FALSE

    a 6738.195

    > future=predict(HWmodel,n.ahead=10,lebel=0.95)
    > future
    Time Series:
    Start = c(2015, 1)
    End = c(2015, 10)
    Frequency = 365
    [1,] 6738.195
    [2,] 6738.195
    [3,] 6738.195
    [4,] 6738.195
    [5,] 6738.195
    [6,] 6738.195
    [7,] 6738.195
    [8,] 6738.195
    [9,] 6738.195
    [10,] 6738.195

    As you can see, I always get the same predicted value for all 10 days. Why? And how can I do this type of forecasting using R?

    Thank you very much for your help Sir.
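A note on the identical forecasts: with beta=FALSE and gamma=FALSE, HoltWinters() fits simple exponential smoothing, which has only a level and no trend or seasonal term, so every future value equals the last estimated level. The sketch below reproduces this with simulated data (the numbers are made up, not the commenter's data); following the post, a weekly pattern would need something like frequency=7 and a seasonal model.

```r
set.seed(42)

# Simulated year of daily counts with a weekly pattern plus noise
weekly <- rep(c(300, 400, 350, 200, 250, -700, -800), 52)
x <- 5000 + weekly + rnorm(length(weekly), sd = 100)

# Simple exponential smoothing: no trend (beta) and no seasonality (gamma)
ses <- HoltWinters(ts(x, frequency = 365), beta = FALSE, gamma = FALSE)

# All ten forecasts equal the final level, stored as coefficient "a"
fc <- predict(ses, n.ahead = 10)
final_level <- ses$coefficients[["a"]]
range(fc) - final_level
```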

  • max

    Hi, and thanks Professor for the post. I have daily data and I would like to do some forecasting. Which model can I use? There are many models: ets(), arima(), auto.arima(), HoltWinters()…
    Thanks in advance.