# Forecasting with daily data

I’ve had several emails recently asking how to forecast daily data in R. Unless the time series is very long, the easiest approach is to simply set the frequency attribute to 7.

    y <- ts(x, frequency=7)

Then any of the usual time series forecasting methods should produce reasonable forecasts. For example:

    library(forecast)
    fit <- ets(y)
    fc <- forecast(fit)
    plot(fc)

When the time series is long enough to take in more than a year, then it may be necessary to allow for annual seasonality as well as weekly seasonality. In that case, a multiple seasonal model such as TBATS is required.

    y <- msts(x, seasonal.periods=c(7,365.25))
    fit <- tbats(y)
    fc <- forecast(fit)
    plot(fc)

This should capture the weekly pattern as well as the longer annual pattern. The period 365.25 is the average length of a year allowing for leap years. In some countries, alternative or additional year lengths may be necessary. For example, with the Turkish electricity data analysed in De Livera et al (JASA 2011), we used three seasonal periods: 7, 354.37 and 365.25. The period 354.37 is the average length of a year in the Islamic calendar.
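As a quick check on where these two year lengths come from, the arithmetic can be reproduced in base R. The synodic month length of about 29.53 days is a figure supplied here for illustration; it is not taken from the paper.

```r
# Average calendar year: three common years and one leap year.
gregorian_year <- (3 * 365 + 366) / 4

# Assumption: a lunar (Islamic) year is 12 synodic months of about 29.530588 days.
synodic_month <- 29.530588
islamic_year <- 12 * synodic_month

# Both values rounded to two decimals, as they would be passed to msts().
round(c(gregorian = gregorian_year, islamic = islamic_year), 2)
```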

Capturing seasonality associated with moving events such as Easter or the Chinese New Year is more difficult. Even with monthly data, this can be tricky as the festivals can fall in either March or April (for Easter) or in January or February (for the Chinese New Year). The usual seasonal models don’t allow for this, and even the complex seasonality discussed in my JASA paper assumes that the seasonal patterns occur at the same time in each year. The best way to deal with moving holiday effects is to use dummy variables. However, neither ETS nor TBATS models allow for covariates. A state space model of the same form as TBATS but with multiple sources of error and covariates could be used, but I don’t have any R code to do that.

Instead, I would use a regression model with ARIMA errors, where the regression terms include any dummy holiday effects as well as the longer annual seasonality. Unless there are many decades of data, it is usually reasonable to assume that the annual seasonal shape is unchanged from year to year, and so Fourier terms can be used to model the annual seasonality. Suppose we use $K=5$ Fourier terms to model annual seasonality, and that the holiday dummy variables are in the vector holiday with 100 future values in holidayf. Then the following code will fit an appropriate model.

    y <- ts(x, frequency=7)
    z <- fourier(ts(x, frequency=365.25), K=5)
    zf <- fourierf(ts(x, frequency=365.25), K=5, h=100)
    fit <- auto.arima(y, xreg=cbind(z,holiday), seasonal=FALSE)
    fc <- forecast(fit, xreg=cbind(zf,holidayf), h=100)

The order $K$ can be chosen by minimizing the AIC of the fitted model.
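The selection of $K$ can be sketched as a simple loop. The version below is a base-R illustration only: it builds the sine and cosine terms by hand with a hypothetical helper `fourier_terms()`, uses simulated data, and fixes the error model at AR(1) with `arima()`. With the forecast package, one would presumably generate the terms with `fourier()` and let `auto.arima()` choose the error model for each $K$.

```r
# Sine/cosine pairs for harmonics 1..K of a given period, evaluated at times `day`.
fourier_terms <- function(day, period, K) {
  X <- do.call(cbind, lapply(seq_len(K), function(k) {
    cbind(sin(2 * pi * k * day / period), cos(2 * pi * k * day / period))
  }))
  colnames(X) <- as.vector(rbind(paste0("S", seq_len(K)), paste0("C", seq_len(K))))
  X
}

# Simulated daily series (made-up data): two annual harmonics plus AR(1) noise.
set.seed(1)
n <- 3 * 365
day <- seq_len(n)
noise <- as.numeric(arima.sim(list(ar = 0.5), n = n))
x <- 10 + 3 * sin(2 * pi * day / 365.25) + 1.5 * cos(4 * pi * day / 365.25) + noise

# Fit the same error model for each candidate K and record the AIC.
candidates <- 1:4
aic <- sapply(candidates, function(K) {
  fit <- arima(x, order = c(1, 0, 0), xreg = fourier_terms(day, 365.25, K))
  fit$aic
})

# The chosen order is the one with the smallest AIC.
best_K <- candidates[which.min(aic)]
best_K
```

All Fourier terms up to the chosen order stay in the model; individual terms are not dropped.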

• Kris Ewican

This is very useful Rob. Thank you very much!

• mike

Thanks for the post. Couple of questions: 1) When using regression with ARMA errors and Fourier explanatory variables, do you go back and remove the individual Fourier terms that are insignificant? 2) Can you use xreg= with TBATS?

• 1. No. I use the AIC to determine K, and leave all the terms in. Significance is not as important as being useful for prediction, and they are not the same thing.

2. No.

• Richard Warnung

Very nice summary, very useful, thanks for all your efforts and best wishes from Vienna.

• Leo

Professor Hyndman,
What will be the seasonality of hourly data that’s available for weekdays (5 days of week) or daily data that’s available for five days of the week?
regards
Leo

• Leo

I meant seasonal period which I guess should be 5 for weekdays data.

• harvey chaparro

how to create the holiday and holidayf vectors?

• Anthony

Do you think ets or tbats are useful for analyzing daily stock market data?

Moreover, I am using tbats but R is running the bats command. At least that’s what it shows when I enter the datatbats (my tbats object) at the R prompt.

• No. These functions are for data that have trends and seasonality. Daily stock market data typically have neither.

When you call tbats() with a non-seasonal time series, it will return a non-seasonal BATS model as that is equivalent to a non-seasonal TBATS model.

• Anthony

Stock market data does have a trend. Perhaps simple Holt’s method can produce something good.

• Wrong. Stock market data has some apparent local trends that are simply the local effects of random walk like behaviour. Holt’s method will give useless forecasts for daily stock data.

• A.M.jaber

Hi
I would like to ask how I can extract and forecast the trend using Empirical Mode Decomposition with R code for daily stock market data.

• Abhishek

Respected Sir,

I am a student of computer science and currently I am working on my project. You are the expert in this field and I have seen on your blog that you help everyone, which is so kind of you.

I am working on forecasting energy consumption in R. I have the data for the previous 30 years and I want to forecast the next 5 years. Can you suggest the best way to do it?

Thank you.

• Pranav Bhargava

Did you get the solution to this?

• Ville

Hi, one can determine the parameters alpha, gamma etc or give upper and lower bounds for them when using ETS. Can you fix the values or give boundaries to the parameters when using BATS/TBATS? Thanks

• Yes. Read the help file for ets().
No. BATS/TBATS models are currently only available using a completely automated procedure. We may introduce more manual model specification in a later version.

• Ville

Thank you for your reply. I think that more manual model specifications in BATS/TBATS would be a great improvement and I’m looking forward to seeing it in the future!

• KM

Mr. Hyndman, let me first thank you immensely for teaching me so much about forecasting. I have taken up a few courses and worked at two leading firms in the past but the amount I’ve learnt from your blog posts is far more.

I am currently trying to learn more about forecasting using R to aid in my masters course. I got time series data (for 5 years) from a third-party FMCG data vendor, which has weekly and monthly seasonality that I could see by using the decompose() function.

I am trying to forecast using the code below:

mydata1 <-read.csv("E:/file.csv",header=TRUE);
mydata <- msts(mydata1, seasonal.periods=c(7,365.25,354.37,365));
fit <- tbats(mydata,use.box.cox=NULL, use.trend=NULL, use.damped.trend=NULL,seasonal.periods=c(7,365.25,365,354.37), use.arma.errors=TRUE, use.parallel=TRUE,bc.lower=0, bc.upper=1);
fcast <- forecast(fit,h=433);
plot(fcast);
fcast.df <- data.frame(fcast)
WriteXLS("fcast.df","E:/fcast.xlsx");
where I have used weekly, Gregorian, Hindu and Hijri calendars to set seasonal periods.
The correlation between the forecasted data and observed data is ~0.82 which seems low to me. The major reason could be that I can see peaks on a few particular dates like Jan1 and Dec25 year on year which is not forecasted by tbats.
Is there a way to include this into the code? What is the significance of manually setting the box-cox limits?
Regards,
KM

• The tbats model does not allow for covariates, so specific effects such as Christmas and New Year cannot be handled by dummy variables. However, you could use the Fourier-ARIMA approach mentioned above, and add the covariate as I’ve explained. The Box-Cox parameter is normally restricted to (0,1). The arguments allow other ranges which are occasionally useful.

• ninnawei

how to use the holiday vector specific? could you take an example?

• KM

Thank you Mr. Hyndman for the explanation. The data has a weekly and monthly seasonality and requires me to use TBATS as you said here: http://www.r-bloggers.com/forecasting-weekly-data/
Is there a way to decompose using tbats.components() to get trend, seasonal components and irregular components?
I was not able to grasp the significance of “level” and “slope” that comes out of tbats.components().
Are the decompose() or stl() functions usable in its place? Would these take the same parameters and give a proper decomposition?
Regards,
KM

• KM

What I meant to ask was how do I compare trend (coming out of decompose()) and level coming out of tbats.components(). I have taken them as the same, Divided the original series with a multiplication of log(trend*slope*irr) to get seasonality.

Regards,
KM

• decompose() and tbats() use different models, so they are not strictly comparable. But if your tbats model has no Box-Cox transformation, then the trend from decompose is roughly equal to the level from tbats. Both functions already produce seasonal components for you.

• fei Li

Hi Professor Hyndman, What if there is Box-Cox transformation, how could we get the trend value from tbats model?

And is the slope the random error item?

Thank you for your time!

• You will need to back-transform the trend to get it on the original scale. No, the slope is not the random error term. Please read the documentation to understand the model.
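A minimal sketch of the back-transformation, assuming the standard Box-Cox form, is given below in base R. The helper names `box_cox()` and `inv_box_cox()`, the value of `lambda` and the level values are all made up for illustration; a fitted model reports the transformation parameter it actually used, and the forecast package provides its own `BoxCox()` and `InvBoxCox()` functions for the same purpose.

```r
# Box-Cox transform and its inverse; lambda = 0 corresponds to the log transform.
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}
inv_box_cox <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

lambda <- 0.3                        # hypothetical value, not from any fitted model
level_original <- c(120, 135, 150)   # made-up level values on the original scale

# A component estimated on the transformed scale ...
level_transformed <- box_cox(level_original, lambda)

# ... is returned to the original scale with the inverse transform.
back <- inv_box_cox(level_transformed, lambda)
back
```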

• Manpreet Singh

I have joined the party 2 yrs late. Please explain how to do the back-transformation.

• @robjhyndman sorry to disturb you, but I need to forecast gas consumption with daily, weekly (weekdays-weekend) and yearly seasonality. Does it make sense to apply the STL decomposition by LOESS three times? (http://datascience.stackexchange.com/questions/957/multiple-seasonality-with-arima)

• Ask your questions on crossvalidated.com.

• Mary Rose

Hi Rob,
I am forecasting daily data and fitting it to a tbats model:

y <- msts(x, seasonal.periods=c(7,365.25))
fit <- tbats(y)

I know that the function Arima() is used when wanting to update an ARIMA model whenever new data is available. Is there a function in R that will do the same for a tbats model (i.e. update the tbats model for new incoming data)?
Thanks,
Mary Rose

• Not yet. It’s on my to do list.

• Mary Rose

Sounds good.

I ended up doing the following:

y <- ts(x, frequency=7)
z <- fourier(ts(x, frequency=365.25), K=5)
fit <- auto.arima(y, xreg=z, seasonal=TRUE)

# new_x is a vector of newly observed daily data with length h
new_y <- ts(new_x, frequency = 7)
new_z <- fourierf(ts(new_x, frequency=365.25), K=5, h)

update <- Arima(x=c(y,new_y), xreg = c(z,new_z), model = fit, seasonal=TRUE)

Would this be a good way to update a daily model when wanting to include both week and annual frequencies?

Thanks,

Mary Rose

• Yes, that should work very well.

• Christopher

I have a data set that is daily data that has a strong weekly pattern (M-F is high traffic, weekends are low traffic).

When I do:

us=xts(usDf[,'daily'],order.by=usDf[,'date'],freq=7);
gtbats <- tbats(us); fc2 <- forecast(gtbats, h=28) ; plot(fc2)

The historical data in the plot is missing and the y-axis is incorrect. The shape of the forecast itself appears fine.

When I use the msts function, everything looks better:

y <- msts(us, seasonal.periods=c(7))
fit <- tbats(y); fc <- forecast(fit,h=28); plot(fc)

Just a note that at first, the plot without msts() appears wildly incorrect, but the forecast itself appears sound shape-wise.

How do I interpret the x-axis? It doesn't appear to correspond to days…

The data spans several years. There are no visually obvious yearly patterns. But when I add seasonal.periods=c(7,29.6,365.25), the prediction is more nuanced. I am thinking I could test for the monthly and yearly seasonality by using less "training" data and seeing how the predictions match the actual data. Is there an easier way to determine the underlying seasonality in a time series? (I know I have more reading to do…)

PS Thank you so very much for your blogs and your other writing and research.

• Use either ts or msts objects, as explained in the help file. Don’t use xts objects.

The x-axis is in weeks.

The model will tell you whether there is any annual seasonality.

• Gregory R. Duchon

On a related note, if you had weekday data only would you lower the frequency to 5 as opposed to have weekend values with 0 and would the second seas then become something like 365-(52*2) =261? (I am looking at support requests that only occur on weekdays as project for my MS Business Analytics program.)

• Yes, I would use frequency=5. Setting weekends to 0 will create problems in finding appropriate models.

• Gregory R. Duchon

Thank you professor!

• randomdude

Hi Rob,
is there a way to use the tbats method and extract the remainder of the decomposition like you did it in your “turkey electricity demand” analysis? I asked on CrossValidated some time ago but nobody could help me with that so far: http://stats.stackexchange.com/questions/163371/work-with-results-of-tbats-decomposition

Thanks for your time and advice in advance, i learned so much through your blog and your papers!

• Alassane

Hi Sir,

Thank you for all your interesting explanations. I forecast daily positive time series using tbats. But in some cases I get negative values or very large values like 1.25868e+14 when my real values are between 0 and 50000. I tried to include lamda=0 in my models but it doesn’t work any more. I would like to know if there is a way to avoid these problems?

Thank you.

Alassane

• Can you please provide a reproducible example of problems like this. I am always trying to improve the software, and edge cases that cause problems are helpful in identifying areas for improvement. You can submit bug reports at https://github.com/robjhyndman/forecast/issues

• Alassane

I have submitted the report on the site. It’s about forecasting daily turnover in 87 countries using 4 years of historical data (from 2011 to 2015).

Thank you

• max

Hi Rob, and thank you for the post. I have a daily time series of the number of buyers on a website each day. My goal is to forecast the number of buyers (visitors) for future days.

How can I do this in R, and how can I tell whether my time series is stationary or not? Any suggestions for R code are welcome.

My data contain 2 columns: date and number of buyers.
This is my R code and the output. Is it right or not?

> df=read.csv("byers.csv",sep=";", stringsAsFactors = FALSE,header=TRUE)
> head(df)
date byers
1 01/01/2014 3114
2 02/01/2014 5954
3 03/01/2014 5342
4 04/01/2014 4929
5 05/01/2014 5633
6 06/01/2014 5890
> head(df)
[1] 3114 5954 5342 4929 5633 5890
> dftime=ts(df,start=c(2014,01),frequency=365)
> HWmodel=HoltWinters(dftime,beta=FALSE,gamma=FALSE)
> HWmodel
Holt-Winters exponential smoothing without trend and without seasonal component.

Call:
HoltWinters(x = dftime, beta = FALSE, gamma = FALSE)

Smoothing parameters:
alpha: 0.8619079
beta : FALSE
gamma: FALSE

Coefficients:
[,1]
a 6738.195
> future=predict(HWmodel,n.ahead=10,lebel=0.95)
> future
Time Series:
Start = c(2015, 1)
End = c(2015, 10)
Frequency = 365
fit
[1,] 6738.195
[2,] 6738.195
[3,] 6738.195
[4,] 6738.195
[5,] 6738.195
[6,] 6738.195
[7,] 6738.195
[8,] 6738.195
[9,] 6738.195
[10,] 6738.195

As you can see, I always get the same predicted value for all 10 days. Why? And how can I do this type of forecasting in R?

Thank you very much for your help, Sir.
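The identical forecasts are what simple exponential smoothing produces: with `beta=FALSE` and `gamma=FALSE` the model has only a level, so every future value equals the last smoothed level. The base-R sketch below reproduces this on simulated data; the series `visits` is made up for illustration.

```r
set.seed(42)
# Hypothetical daily counts with no trend or seasonality.
visits <- ts(5000 + rnorm(365, sd = 500))

# Simple exponential smoothing: no trend (beta) and no seasonal (gamma) component.
ses <- HoltWinters(visits, beta = FALSE, gamma = FALSE)
future <- predict(ses, n.ahead = 10)

# Every forecast equals the final smoothed level stored in the coefficient "a".
as.numeric(future)
ses$coefficients
```

A model with a seasonal component, such as the weekly `ets()` or Fourier-ARIMA approaches described in the post, is needed to get forecasts that vary by day.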

• max

Hi, and thanks, Professor, for the post. I have daily data and I would like to do some forecasting. Which model can I use? There are many: ets(), arima(), auto.arima(), holtwinters()…
Thanks in advance.

• Learner

I have daily demand data from past 3 years. I want to forecast for next 365 days. Month of the year and day of the week has impact on demand. How do i include month in msts? since january has 31 days, february has 28 days and April has 30 days. not to forget leap year! If day of the week also has trend, would you suggest doing below for the month or is there any other way month can be specified?
msts(x, seasonal.periods=c(7,365.25,30))

• Do you really have monthly seasonality? That would be very unusual, but it does sometimes happen. Much more common is both weekly and annual seasonality. Monthly seasonality would arise if you had end-of-month effects due to accounting practices, but I can’t think of anything else that would cause them. I think you would need to consider what is causing the monthly seasonality and try to allow for it explicitly. It is not strictly seasonal as the periodicity is not regular.

• Oli Paul

Real time bidding. Marketing agencies use pacing algorithms that try to average their spend across the month but always end up having to spend more at the end of the month – they need to spend the budget to justify it.

• larry77

Very interesting post, but can you provide a numerical example? I am not sure I understand the nature of holiday vector. Is it a vector of dates? Or it has 0/1 depending on whether a certain day is a holiday or not?

• It is a dummy variable. That means it contains 0s and 1s indicating which days are holidays.

• Arun Gunalan

Hi Professor, so the holidays will be set to “0”, am I correct?

• No. 1s for holidays, 0s everywhere else.
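One way to build such dummy variables from calendar dates is sketched below in base R. The names `holiday` and `holidayf` follow the post; the start date, series length and list of holiday dates are hypothetical and would need to be replaced with the real ones.

```r
n <- 730   # number of observed days (assumed)
h <- 100   # forecast horizon, as in the post

# Dates covered by the observed series and by the forecast period.
dates <- seq(as.Date("2014-01-01"), by = "day", length.out = n)
future_dates <- seq(dates[n] + 1, by = "day", length.out = h)

# Hypothetical list of holiday dates spanning both periods.
holiday_dates <- as.Date(c("2014-01-01", "2014-12-25",
                           "2015-01-01", "2015-12-25",
                           "2016-01-01"))

# 1 on holidays, 0 everywhere else.
holiday  <- as.numeric(dates %in% holiday_dates)
holidayf <- as.numeric(future_dates %in% holiday_dates)

table(holiday)
table(holidayf)
```

The two vectors must have lengths equal to the series length and the forecast horizon respectively, so that they line up row by row with the Fourier terms when combined with `cbind()`.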

• Raed

Very informative, Rob. Thank you so much.

• yuqin

Hello professor, what if there is a monthly period as well?

• Read the comments to this post.

• Arun Gunalan

Hi Rob, I have a retail sales time series covering 2 years, where sales are higher on weekends. Will the model you explained above fit it?

• Maybe.

• Dear Rob, regarding the frequency parameter in ts, will it be 7 or 365.25. As by default it is for annual ( that is 1 for annual) and 12 for monthly, logically would it be 365.25?

• Thank you very much Rob.

• Jul

Awesome post, thanks Rob!

I was asked to summarize in a couple of sentences what the tbats formula does.
Would you agree with the following summary?

1. Apply a Box-Cox transformation if Y is not normally distributed*
2. Model the seasonality with Fourier series **
3. Model the remaining auto-correlation, once the effect of seasonality has been removed, with an ARMA process **

* That’s not a guarantee that Y will be normally distributed after the transformation.
** The parameters K for the Fourier series and (p,q) for the ARMA process are chosen by minimizing the AIC of the fitted model.

Thanks!

• Not quite.

1. Y is not required to be normal, and usually won’t be due to the trend and seasonality. Instead, the residuals are assumed normal with constant variance, and the BC transformation will normally allow that to occur.

2. The seasonality is modelled with Fourier-like terms that can change coefficients over time. Also, a local linear time trend is included, as it is in an ETS model.

3. Yes

All terms are selected by minimizing the AIC, not just the ARMA orders.

• Jul

thanks Rob!

• Saneesh George

Thank you very much Professor for this post.

Can you please explain how do we forecast a future value based on the current value. For eg: what would be the total number of bookings on next Saturday if there is already x numbers at this moment.

• Steev

hello Sir. how can I have the p-value of coefficients. It would be interesting to justify that the holidays effects are significant.

• You have the coefficients, and the standard errors. It is not too hard to compute the p-values yourself.
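The calculation can be sketched as follows, assuming a normal approximation for each coefficient (a Wald-type z-test). The example uses base R's `arima()` on simulated data with a made-up holiday effect; `coef()` and `vcov()` give the estimates and their covariance matrix.

```r
set.seed(123)
n <- 400
holiday <- as.numeric(seq_len(n) %% 50 == 0)   # hypothetical holidays every 50th day
y <- 20 + 5 * holiday + as.numeric(arima.sim(list(ar = 0.4), n = n))

# Regression with AR(1) errors and a named holiday regressor.
fit <- arima(y, order = c(1, 0, 0), xreg = cbind(holiday = holiday))

est <- coef(fit)               # coefficient estimates
se  <- sqrt(diag(vcov(fit)))   # standard errors
z   <- est / se                # z statistics
p_value <- 2 * pnorm(-abs(z))  # two-sided p-values under normality

round(cbind(estimate = est, std_error = se, z = z, p_value = p_value), 4)
```

As noted earlier in the comments, a significant coefficient is not the same thing as a term that is useful for prediction, so these p-values are a description of the fit rather than a model selection tool.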

• Batuhan

Hi Rob,
Can you tell your data source for daily Turkish electricity demand? I want to make a forecasting analysis, but I have daily Electricity production data starting from 2015.
Thank you very much for your help in advance.

• My PhD student (Alysha De Livera) got it from somewhere, but I don’t have any records of the source.

• Mihai Stancu

Mr. Hyndman, I recently started a small forecasting case study on the energy demand of a distribution grid.
I used auto.arima and xreg with temperature as a driver on 4 years of historical data.
I got good results, but I want to include more drivers, such as holidays (given by future variables), at the same time. I switched to the example you provided for holidays using dummy values and it works, but I want to use temperature at the same time too and I don’t know how. Using cbind on xreg with 2 variables gives me an error about it not being a matrix 🙁

In a few words, what i used was:
x <- auto.arima(ts(mydata,frequency = 365.25),d=1,D=1,xreg=Temperature)
Fc <- forecast(x,xreg = temperature of next 30 days)

The outcome that i need is a forecast of 1 month, daily data using as driver the temperature and known holidays.
In the future I would like to understand how gradient boosting works, but for now I would like to stick to simpler cases…

Regards, MS
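Combining several drivers usually comes down to passing one numeric matrix with a column per regressor, for both the fitting period and the forecast period. The sketch below uses base R's `arima()` and `predict()` on simulated data; all variable names and values are made up, and one possible cause of a "not a matrix" error, assumed here, is passing a data frame rather than a numeric matrix.

```r
set.seed(7)
n <- 200   # observed days (assumed)
h <- 30    # days to forecast

# Hypothetical drivers, known for both the observed and the future period.
temperature <- 15 + 10 * sin(2 * pi * seq_len(n + h) / 365.25) + rnorm(n + h)
holiday <- as.numeric(seq_len(n + h) %% 40 == 0)

# Simulated demand that depends on both drivers plus AR(1) noise.
demand <- 100 - 2 * temperature[1:n] - 8 * holiday[1:n] +
  as.numeric(arima.sim(list(ar = 0.5), n = n))

# One numeric matrix with a named column per driver.
xreg_past <- cbind(temperature = temperature[1:n],
                   holiday = holiday[1:n])
xreg_future <- cbind(temperature = temperature[(n + 1):(n + h)],
                     holiday = holiday[(n + 1):(n + h)])

fit <- arima(demand, order = c(1, 0, 0), xreg = xreg_past)
fc <- predict(fit, n.ahead = h, newxreg = xreg_future)
head(fc$pred)
```

If the regressors start out in a data frame, `as.matrix()` on the numeric columns gives the same structure.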

• Dipti

Hi Rob,

The fourierf function has been deprecated in the forecast package, so my code errors out. Also, could you explain why we did a Fourier transformation of x? Usually I don’t see that done before ARIMA. I would really appreciate your reply. I have learned everything about forecasting from your posts and packages and look forward to your response.

• fourierf does not cause errors. It only issues a warning. It also tells you how to fix it.

We use Fourier series as a parsimonious way of modelling seasonality. Look up “harmonic regression” on google scholar.

• Dipti

I tried the above for the years 2013 through 2016, predicting for Dec 2016. My data is heavily annually periodic (retail) and the stl plot shows all the seasonal and trend components beautifully. Still, the forecast is stuck at the mean. I can’t figure out why. Please help!

• Dipti

I am able to use the above post for generating inventory forecast in millions of dollars a day. So, this post proved to be extremely useful and important for our operations. However I have a question:
y is a timeseries with weekly frequency.
z is a fourier series with daily frequency.
How did you calculate the model fit using y as input and z as the regression term? Don’t they have different periodicity? What would happen if I generated y with daily frequency 365.25?
Would really appreciate your response.

• They have different frequency attributes, but the same length. So the fit works fine. If y had frequency of 365.25, the ARIMA model would fail because it can’t handle seasonality that high.
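This can be checked directly: the frequency attribute is only metadata attached to the same underlying observations. A small base-R sketch, with made-up data, is:

```r
set.seed(1)
x <- rnorm(1000)                        # placeholder daily observations

y <- ts(x, frequency = 7)               # series passed to the ARIMA model
x_annual <- ts(x, frequency = 365.25)   # same data, used only to build Fourier terms

# Different frequency attributes ...
c(frequency(y), frequency(x_annual))

# ... but the same number of observations, one row of regressors per day.
c(length(y), length(x_annual))

# The underlying values are identical.
identical(as.numeric(y), as.numeric(x_annual))
```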

• Pablo Beltramone

Hi Rob, Thanks for the post. I have a few questions for you:

I have daily data, but only for working days, covering 3 years. The data are counts of customers who enter a bank. I have weekly, monthly and yearly seasonality.
So I can’t figure out what the correct frequency is. I’m trying to fit an ARIMA model with Fourier regressors in order to model the seasonality.
I get confused when I need to choose the frequency for those seasonalities. For instance, with working days I have 5 days a week, 20 days per month and 240 days per year, ideally.
Using periodograms I saw that the largest peaks occur at the periods 21, 130, 5 and 261. So, rounding, this is about 20, 130, 5 and 260.

So when I fit the ARIMA model I choose to model the seasonality at period 5 and then model the remaining seasonalities with Fourier terms. After fitting, I check the residuals with the periodogram, the ACF and the PACF, and I still see cycles and seasonal behaviour at the same periods.
My questions are:
If I fit the seasonalities correctly, I should expect the residuals not to show seasonal behaviour. Is that correct?
I’m dealing with missing data. How should I deal with it? Should I impute the values or remove them?

Thank you very much.

Pablo

Here is my code so far:

tscli = ts(cli,frequency = 5)
tsclimean = imputeTS::na.mean(tscli) #impute the mean
y=tsclimean
p = TSA::periodogram(y,ylab='Variable tsclimean Periodogram');abline(h=0)
maxSpec=order(p$spec,decreasing = T)[1:5] #picks on 37 6 160 3 1 p$spec[maxSpec]
fre = maxSpec/length(y)
per = round(1/fre) # periods 21,130,5,261,783
per

#remove periods 5 and 783
per=per[!(per %in% c(5,783))]

#holidays
feri =c(1 ,5 ,44 ,45 ,59 ,66 ,77 ,78 ,87 ,88 ,107 ,123 ,135 ,136 ,157 ,164 ,172 ,200 ,204 ,211 ,222 ,226 ,227 ,234 ,244 ,257 ,258 ,261,
262 ,294 ,295 ,319 ,320 ,325 ,327 ,328 ,348 ,355 ,364 ,365 ,366 ,397 ,424 ,461 ,464 ,483 ,498 ,504 ,505 ,518 ,523 ,549 ,550 ,582 ,583 ,597,
602 ,626 ,643 ,644 ,657 ,658 ,684 ,723 ,724 ,738 ,759 ,767 ,768)
#dataset with dummy vars
d = 1*(seq(cli)==feri[1])
for(i in 2:length(feri)){ d = cbind(d,1*(seq(cli)==feri[i]))}
d=data.frame(d)
names(d)[1] = "V1"

z1 =fourier(ts(cli,frequency = per[1]),1)
z2 =fourier(ts(cli,frequency = per[2]),1)
z3 =fourier(ts(cli,frequency = per[3]),1)
z1h=fourier(ts(cli,frequency = per[1]),1,h=100)
z2h=fourier(ts(cli,frequency = per[2]),1,h=100)
z3h=fourier(ts(cli,frequency = per[3]),1,h=100)
xregre=cbind(z1,z2,z3,d)
xregrenew=data.frame(cbind(z1h,z2h,z3h,matrix(0,nrow = 100,ncol = ncol(d))))
names(xregrenew)=names(xregre)

bf=forecast::Arima(tsclimean,order = c(1,1,1),seasonal = list(order=c(1,0,1),period = 5),xreg = xregre)
par(mfrow=c(2,1))
acf(bf$residuals ,lag.max = 50)
pacf(bf$residuals,lag.max = 50)
par(mfrow=c(1,1))

• Arima and auto.arima will handle missing values. No need to impute them.

Yes, residuals should look like white noise. So no seasonalities.

With weekday data, your seasonalities will be 5 and 365.25*5/7. You *might* have a monthly seasonality as well if you have end-of-month accounting procedures, or some other activity on fixed days of the month. They are harder to account for as there are not 4 weeks in a month, and the months are not of equal length. Consequently, you might find it better to use dummy variables for the days in which such monthly activities occur.
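The weekday periods can be worked out as below. This is a base-R sketch: the calendar year 2015 is used only as an example, and `format(dates, "%u")` is assumed to return the ISO weekday number (1 for Monday through 7 for Sunday).

```r
# Seasonal periods for weekday-only data.
weekly_period <- 5
annual_period <- 365.25 * 5 / 7   # about 260.89 weekdays per year
annual_period

# Dropping weekends from a run of calendar dates (example year only).
dates <- seq(as.Date("2015-01-01"), as.Date("2015-12-31"), by = "day")
is_weekday <- as.integer(format(dates, "%u")) <= 5
weekday_dates <- dates[is_weekday]

# Number of weekdays in this particular year, close to the average above.
length(weekday_dates)
```

These two values would then be passed as the seasonal periods, for example `c(5, 365.25*5/7)` in `msts()`, with the weekend observations removed rather than set to zero.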

• Pranav Bhargava

Hello professor,
I have data for every 15 min interval for each day for almost 2 years. How can I use it to predict for future days?
The data is from June 2012 to July 2014.
please help.

• This is not a help site. Ask on crossvalidated.com

• Ryann

Hi Rob,

For ts() and msts(), I am wondering what is the unique season pattern assumed?

For example, if we have seasonal.periods(7,365.25), does it assume every 1 day in 7 days is a unique season that repeats 7 days later. Comparing Monday week 1 and Monday week 2. Alternatively for the 365.25, every 1 day in the 365.25 days is a unique season that doesn’t repeat until 365.25 days later? Comparing Jan 1 2015 and Jan 1 2016?

Or does seasonal.periods(7,365.25) work as every 7 days is a unique season that doesn’t repeat until 365.25 days later, comparing week 1 2015 and week 2 2016?

I hope I have explained my confusion clearly. Thanks in advance for your help and for being a maven.

• Nothing is assumed. The seasonal periods are simply numerical attributes that are stored. How they are used depends on the model that is fitted later.

• Nassim

Hi Professor,

Why do you set seasonal=FALSE on auto.arima ? I get ARIMA(1,1,2)(1,0,1)[7] with AICC 11714.2 vs ARIMA(5,1,3) with AICC 11836.61

Is there any underlying theoretical reason to avoid using seasonal ARIMA in this case?

Also, I have noticed that I have a lower AICC by using as covariate an array with store opening status (1 opened, 0 closed) than an array with holidays for forecasting daily sales.

• The Fourier terms are meant to handle the seasonality, so I generally do not include seasonal ARIMA components as well. It is possible that you need more Fourier terms if there is still seasonality that the ARIMA model is picking up. Alternatively, it may be that the seasonality is changing over time, and the seasonal ARIMA is required to handle the evolution of the seasonality, while the Fourier terms handle the average seasonal shape. There is no theoretical reason why you can’t have both, it just makes it more confusing.

• Nassim

From the code I understand that the Fourier terms handle the annual seasonality while seasonal ARIMA handle the weekly seasonality.

• Yes, you’re right. I had forgotten that.

• Yash Kothari

Hello Professor,

I’m currently working on a project to forecast the levels of carbon monoxide (CO) in air. I have daily data on CO levels for the past 3 years. Can you help me and tell me how I should go about it?

Thanking you,
Yash Kothari
