Comparing HoltWinters() and ets()

I received this email today:

I have a question about the ets() function in R, which I am trying to use for Holt-Winters exponential smoothing.
My problem is that I am getting very different estimates of the alpha, beta and gamma parameters using ets() compared to HoltWinters(), and I can’t figure out why.

This is a common question, so I thought the answer might be of sufficient interest to post here.

There are several issues involved.

  1. HoltWinters() and ets() optimize different criteria. HoltWinters() uses heuristic values for the initial states and then estimates the smoothing parameters by minimizing the MSE. ets() estimates both the initial states and the smoothing parameters by maximizing the likelihood function (which is equivalent to minimizing the MSE only for the linear additive models).
  2. The two functions use different optimization routines and different starting values. That wouldn’t matter if the surfaces being optimized were smooth, but they are not. Because the MSE and likelihood surfaces are both fairly bumpy, it is easy to end up at a local optimum rather than the global one. The only way to avoid this problem is to use a much slower computational method such as particle swarm optimization (PSO).
  3. ets() searches over a restricted parameter space to ensure the resulting model is forecastable. HoltWinters() ignores this issue (it was written before the problem was even discovered). See this paper for details (equivalently chapter 10 of my exponential smoothing book).
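
To see the differences described above on a concrete series, here is a minimal sketch in R. It assumes the forecast package is installed; the USAccDeaths series (shipped with R) is an arbitrary choice for illustration and is not the data from the email. Both calls fit an additive Holt-Winters model, so any gap between the estimates comes from the initialization, the criterion, the optimizer and the parameter restrictions.

```r
library(forecast)  # provides ets(); HoltWinters() is in base R's stats package

y <- USAccDeaths   # monthly example series shipped with R (an assumption for illustration)

# Additive Holt-Winters: heuristic initial states, smoothing parameters chosen by minimizing SSE
fit_hw <- HoltWinters(y, seasonal = "additive")

# ETS(A,A,A) without damping: initial states and smoothing parameters estimated by maximum likelihood
fit_ets <- ets(y, model = "AAA", damped = FALSE)

# Collect the smoothing parameters reported by each function
hw_par <- c(alpha = unname(fit_hw$alpha),
            beta  = unname(fit_hw$beta),
            gamma = unname(fit_hw$gamma))
ets_par <- fit_ets$par[c("alpha", "beta", "gamma")]

# Note: as far as I can tell from the documentation, the two functions do not
# define beta and gamma on the same scale, so read this table as indicative only.
print(round(rbind(HoltWinters = hw_par, ets = ets_par), 4))
```

Running the same comparison on your own series is usually the quickest way to check whether the two fits really disagree in substance or only in how the parameters are reported.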

I have experimented with many different choices of starting values for the initial states and smoothing parameters, and what is implemented in ets() seems about as good as is possible without using a much slower optimization routine. Where there is a difference between ets() and HoltWinters(), the results from ets() are usually more reliable.

A related question on estimation of ARIMA models was discussed at http://robjhyndman.com/hyndsight/estimation/.

  • Leo

    Thanks for the comparison, Dr. Hyndman. I have another issue related to Holt-Winters. With hourly data and weekly seasonality, frequency=168. I guess ets() fails to handle this, while HoltWinters() works.

  • Leo

    Dr. Hyndman,
    According to your book, it’s not possible to use ets(AAM) for a Holt-Winters model with additive trend and multiplicative seasonality. How about using ets(MAM) if we are interested in point forecasts only?

    • Rob J Hyndman

      ETS(M,A,M) is fine for point forecasts and prediction intervals. ETS(A,A,M) is numerically unstable with infinite prediction intervals.

  • James

    What would be the best way to objectively compare the performance of the HoltWinters and ets functions in R? It seems HoltWinters returns a value containing SSE (sum of the squared errors) whereas ets returns a value containing loglik as a measure of accuracy… which makes it difficult to compare the two (apples and oranges).

    I know in your book you recommend using ets over HoltWinters but HoltWinters seems to be generating much more credible forecasts for some sample data that I have (just looking at a plot) and I wanted to verify this using some objective measure of the fitting algorithms.

    • James

      Argh… as usual, I should look with my eyes and not with my mouth. I see the forecast package contains an accuracy() method that does exactly what I wanted…

      Sorry for the bother, and thank you so much for the fantastic work on both the R packages and your textbooks.

      • Rob J Hyndman

        It is, of course, possible that HoltWinters will give better forecasts for a specific time series. But on average, ets will be better as it optimizes the initial states, it provides a larger model class, and it allows model selection via AIC.
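
For readers who want to try the accuracy() comparison mentioned in this thread, a minimal sketch follows. It assumes the forecast package is installed; the USAccDeaths series and the train/test split are arbitrary choices for illustration, not James's data. Holding out the last year puts both methods on the same scale, which sidesteps the SSE versus log-likelihood mismatch.

```r
library(forecast)  # provides ets(), forecast() and accuracy()

y <- USAccDeaths                          # monthly example series, 1973 to 1978
train <- window(y, end = c(1977, 12))     # fit on the first five years
test  <- window(y, start = c(1978, 1))    # hold out the final year

# Forecast the hold-out period with each method
fc_hw  <- forecast(HoltWinters(train, seasonal = "additive"), h = length(test))
fc_ets <- forecast(ets(train), h = length(test))  # ets() selects the model form itself

# accuracy() returns training-set and test-set error measures for each forecast
acc_hw  <- accuracy(fc_hw, test)
acc_ets <- accuracy(fc_ets, test)

# Compare out-of-sample errors on a common scale
cmp <- rbind(HoltWinters = acc_hw["Test set", c("RMSE", "MAE", "MAPE")],
             ets         = acc_ets["Test set", c("RMSE", "MAE", "MAPE")])
print(round(cmp, 2))
```

A single hold-out year is a small sample, so a result like this says something about one series only and not about which method is better on average.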