A blog by Rob J Hyndman 

Twitter Gplus RSS

forecast package v4.0

Published on 3 December 2012

A few days ago I released ver­sion 4.0 of the fore­cast pack­age for R. There were quite a few changes and new fea­tures, so I thought it deserved a new ver­sion num­ber. I keep a list of changes in the Changelog for the pack­age, but I doubt that many peo­ple look at it. So for the record, here are the most impor­tant changes to the fore­cast pack­age made since v3.0 was released.

Expo­nen­tial smoothing

The ets() func­tion han­dles expo­nen­tial smooth­ing in the ETS frame­work, but for teach­ing pur­poses it is some­times use­ful to dis­cuss a more tra­di­tional imple­men­ta­tion that does not involve opti­miz­ing the ini­tial val­ues. The HoltWinters() func­tion in the stats pack­age attempts to do this, although the ini­tial­iza­tion used for the Holt and Holt-​​Winters meth­ods is non-​​standard, mak­ing it unsuit­able for teaching.

Con­se­quently, I’ve now mod­i­fied the ses(), holt() and hw() func­tions to allow both tra­di­tional heuris­tic ini­tial­iza­tion and ETS-​​style opti­miza­tion of ini­tial states. These func­tions also now allow damped trend and mul­ti­plica­tive trend.

Mul­ti­ple sea­son­al­ity and BATS and TBATS models

Time series with mul­ti­ple sea­sonal pat­terns (e.g., hourly data that con­tains a daily pat­tern, weekly pat­tern and an annual pat­tern) now have their own model class msts. This is use­ful when plot­ting the data, and in using the dshw() dou­ble sea­sonal Holt-​​Winters func­tion for example.

The BATS and TBATS mod­els (fit­ted using bats() and tbats()) also han­dle mul­ti­ple sea­son­al­i­ties as well as com­plex sea­son­al­i­ties (such as non-​​integer sea­son­al­ity, non-​​nested sea­son­al­ity and large period sea­son­al­ity). The mod­els are in the exponential-​​smoothing frame­work and were pro­posed by De Liv­era, Hyn­d­man and Sny­der (JASA 2011).

BATS and TBATS mod­els allow a form of time series decom­po­si­tion, and so the fit­ted objects can now plot­ted in the same way as other decom­po­si­tion objects (e.g., stl objects).

These mod­els are quite com­pu­ta­tion­ally inten­sive, and so the func­tions will use par­al­lel pro­cess­ing when it is available.

auto.arima()

This is prob­a­bly the most pop­u­lar func­tion in the pack­age, and I am always look­ing at ways to speed it up and improve the results.

  • Default model selec­tion has been switched from AIC to AICc (the bias-​​corrected form of AIC) which may affect model selec­tion for very short time series.
  • The max­i­mum orders are now restricted to less than 13 of the length of the data.
  • Par­al­lel pro­cess­ing is used where pos­si­ble when stepwise=FALSE.

Accu­racy calculations

The accuracy func­tion now works with both time series fore­casts and cross-​​sectional forecasts.

The MASE for sea­sonal time series is now defined using sea­sonal naïve fore­casts for the in-​​sample scal­ing. The MASE for cross-​​sectional fore­casts is defined using the mean fore­casts for in-​​sample scal­ing. This is con­sis­tent with my new book.

Neural net­work AR models

A new exper­i­men­tal func­tion nnetar has been intro­duced to fit a neural net­work with lagged val­ues of the time series as inputs. It is a feed-​​forward net­work with one hid­den layer, spec­i­fied using the nota­tion NNAR(p,k) to indi­cate there are p lagged inputs and k nodes in the hid­den layer. For exam­ple, a NNAR(9,5) model is a neural net­work with the last nine obser­va­tions (y_{t-1},y_{t-2},\dots,y_{t-9}) used as inputs to fore­cast the out­put y_t, and with five neu­rons in the hid­den layer. A NNAR(p,0) model is equiv­a­lent to an ARIMA(p,0,0) model but with­out the restric­tions on the para­me­ters to ensure stationarity.

With sea­sonal data, it is use­ful to also add the last observed val­ues from the same sea­son as inputs. For exam­ple, an NNAR(3,1,2)_{12} model has inputs y_{t-1}, y_{t-2}, y_{t-3} and y_{t-12}, and two neu­rons in the hid­den layer. More gen­er­ally, an NNAR(p,P,k)_m model has inputs (y_{t-1},y_{t-2},\dots,y_{t-p},y_{t-m},y_{t-2m},y_{t-Pm}) and k neu­rons in the hid­den layer. A NNAR(p,P,0)_m model is equiv­a­lent to an ARIMA(p,0,0)(P,0,0)_m model but with­out the restric­tions on the para­me­ters to ensure stationarity.

The nnetar() func­tion fits an NNAR(p,P,k)_m model. If the val­ues of p and P are not spec­i­fied, they are auto­mat­i­cally selected. For non-​​seasonal time series, the default is the opti­mal num­ber of lags (accord­ing to the AIC) for a lin­ear AR(p) model. For sea­sonal time series, the default val­ues are P=1 and p is cho­sen from the opti­mal lin­ear model fit­ted to the sea­son­ally adjusted data. If k is not spec­i­fied, it is set to k=(p+P+1)/2 (rounded to the near­est integer).

By default, 25 net­works with ran­dom start­ing val­ues are trained and their pre­dic­tions averaged.

Like many of the func­tions in the fore­cast pack­age, the default argu­ments usu­ally give rea­son­ably good fore­casts with­out the user need­ing to know much. For example:

fit <- nnetar(lynx)
plot(forecast(fit,h=20))

Because ran­dom start­ing val­ues are used, the fore­casts will be slightly dif­fer­ent each time the func­tion is called.

It is pos­si­bly to com­pute pre­dic­tion inter­vals from NNAR mod­els via sim­u­la­tion, but I have not yet imple­mented this. The func­tion will only give point fore­casts for now.

New default col­ors for fore­cast plots

The old yel­low and orange col­ors have been replaced by shades of blue-​​grey which are a lit­tle eas­ier on the eye. The darker the color, the higher the den­sity. Here is an exam­ple with 80% and 95% pre­dic­tion intervals.

library(fpp)
fit <- Arima(euretail, order=c(0,1,3), seasonal=c(0,1,1))
plot(forecast(fit,h=12))

Future devel­op­ment

Ear­lier this year I started using the pack­age for an under­grad­u­ate fore­cast­ing class which gave it a good work-​​out, and I’m also using it as the basis of my new book with George Athana­sopou­los. As a result, a lot of new func­tion­al­ity has been added, and lots of bugs have been dis­cov­ered and fixed.

The process is con­tin­u­ing and I have lots of ideas I want to imple­ment, and no doubt more bugs will be dis­cov­ered. The devel­op­ment of the fore­cast pack­age is now main­tained on github where you can see what is in devel­op­ment and report any new prob­lems. Please let me know if you find any­thing you think is an error, or if you have a suggestion.


Related Posts:


 
22 Comments  comments 
  • fer­nando

    It would be nice to have an auto­matic cross-​​validation func­tion, given the fit­ted model and the val­i­da­tion para­me­ters (training-​​set size, max Hori­zon, etc). Any­way, great job in the new version!

  • Samik

    Hi Rob, thanks for the noti­fi­ca­tion. Indeed, the ‘What’s new’ sec­tion rarely gets read. Anways, I was won­der­ing if you have an updated ver­sion of the orig­i­nal paper which detailed the auto.arima() algo­rithm, that includes the improve­ments and changes?

  • http://www.logisticadescomplicada.com/ Lean­dro Cal­le­gari Coelho

    Rob, con­grats for the devel­op­ment. I like this pack­age a lot. By the way, when­ever a new ver­sion is avail­able, the first thing I look for is the changelog/what’s new section!

  • Shea Parkes

    Thanks for all your hard work Rob! I appre­ci­ate the sub­tler default col­ors, but you’ll need to go back and update your online book to stop men­tion­ing yellow/​orange now: http://​otexts​.com/​f​p​p​/1/4/

  • Jerome

    Thank you Rob for the func­tion dshw. When I get the point of fore­cast, how could I get the fore­cast inter­val (low 0.80 and upper 0.80)? Thank you

    • http://robjhyndman.com Rob J Hyndman

      I have not imple­mented pre­dic­tion inter­vals for dshw(). Use bats() to get PI for an equiv­a­lent model.

  • BMiner

    Hey Rob, You men­tion pre­dic­tion inter­vals with ANN via sim­u­la­tion. Do you have any sources to reference?

    • http://robjhyndman.com Rob J Hyndman

      No, but a neural net­work is essen­tially a non­lin­ear autore­gres­sion and sim­u­la­tion is the most com­mon way to get PI for other non­lin­ear autore­gres­sion models.

      • Bminer

        Do you think boot­strap­ping would be a viable method (using the boot pack­age and tsboot)? I am think­ing of boot­strap­ping the series and then using each of the n mod­els cre­ated to pre­dict a new case — and thus get­ting a pre­dic­tion distribution.

        • http://robjhyndman.com Rob J Hyndman

          No. The boot­strapped series won’t allow pre­dic­tion con­di­tional on the last few val­ues of the observed series.

  • Joseph John­son

    Dr. Hyn­d­man,

    Does the ets func­tion in fore­cast pack­age v4.0 con­strain the para­me­ters to be;
    0<α<1, 0<β<α, 0<γ<1-α ?
    Would you rec­om­mend this con­straint for all mod­els?
    Does the pack­age go even fur­ther to con­straint the cer­tain mod­els as men­tioned in Chap­ter 10 of your expo­nen­tial smooth­ing book?

    Thank you for the pack­age. It has been very help­ful to me!

    • http://robjhyndman.com Rob J Hyndman

      Try read­ing the help file. It is all explained. The con­straints you men­tion are imple­mented using bounds=“usual”. The con­straints in ch10 of our book are imple­mented using bounds=“admissible”. The default is bounds=“both” mean­ing both sets of con­straints are applied.

    • Joseph John­son

      Thank you.

  • Francesco Gian­fer­rari Pini

    Hi! How can I extract the time series for the sea­sonal component(s) on a TBATS fore­cast? I can see them in the plot!

    • http://robjhyndman.com Rob J Hyndman

      Look at plot.tbats() and you will see how they are constructed.

  • Jerome

    Bon­jour Rob! I am run­ning a bats func­tion and I get this error:

    Error in checkForRemoteErrors(val) : 6 nodes pro­duced errors; first error: argu­ment is of length zero
    I have no clue to solve it. Thank you in advance for your tips.
    Jerome

    • http://robjhyndman.com Rob J Hyndman

      Try using debug() to track where the error is occurring.

  • SimonMN

    Hi Prof Hyndman,

    Is there any way to add exoge­nous inputs to the nnetar func­tion in the same way you can add exoge­nous vari­ables to the Arima or auto.arima functions?

    • http://robjhyndman.com/ Rob J Hyndman

      No, but it isn’t hard to do it using the avN­Net func­tion from the caret pack­age. Basi­cally nnetar() is fore­cast­ing using an aver­age of 20 neural nets where the inputs are lagged val­ues of the time series. You would just need to add the xreg vari­ables as inputs as well.

  • Chris­t­ian Miller

    I don’t have any­thing sub­stan­tive to add, but I want to give a huge thanks to you Rob. This has been a help­ful and use­ful pack­age in learn­ing about fore­cast­ing and imple­ment­ing ideas. And you con­tinue to add func­tion­al­ity faster than I can keep up: it’s a great prob­lem to have.

    Greatly appre­ci­ated (and the book too!)