Hierarchical forecasting with hts v4.0

A new version of my hts package for R is now on CRAN. It was completely rewritten from scratch; not a single line of code survived. There are some minor syntax changes, but the biggest changes are in speed and scope. This version is many times faster than the previous version and can handle hundreds of thousands of time series without complaining.

The speed-up is due to some new research I am doing with Alan Lee (University of Auckland). Usually, we would write the paper first, and then release a package implementing the ideas. But this time, the package was produced first and the papers will follow in the next few weeks.

For those unfamiliar with the techniques and concepts involved in hierarchical forecasting, there is an introduction in Section 9.4 of my forecasting textbook. The basic idea is that you need to forecast a large number of time series under the constraint that the forecasts of some series have to add up to the forecasts of other series. For example, the forecasts of sales in each state need to add up to the forecast of sales across the country. Our approach is to forecast all series independently, ignoring the constraints, and then adjust all the forecasts so the constraints are satisfied. We have a neat result that gives the optimal (least squares) reconciliation.
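
As a rough illustration of the reconciliation step, here is a minimal base R sketch for a toy hierarchy with one total and two states. The numbers are invented, the names `S`, `yhat` and `ytilde` are only for this example, and the direct solve is for exposition; it is not meant to show how the package handles large hierarchies.

```r
# Toy hierarchy: Total = StateA + StateB.
# The summing matrix S maps the bottom-level series to every series.
S <- rbind(Total  = c(1, 1),
           StateA = c(1, 0),
           StateB = c(0, 1))

# Independent ("base") forecasts for one horizon. These are made-up numbers
# that deliberately do not add up (60 + 45 = 105, not 100).
yhat <- c(Total = 100, StateA = 60, StateB = 45)

# Least squares reconciliation: regress the base forecasts on S, then map
# the fitted bottom-level values back up through S.
beta   <- solve(crossprod(S), crossprod(S, yhat))   # (S'S)^{-1} S' yhat
ytilde <- drop(S %*% beta)

print(round(ytilde, 2))
```

The reconciled values come out at roughly 101.67, 58.33 and 43.33, so the two state forecasts now sum to the total, and each base forecast has been moved by the same amount (5/3) to get there.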

Users of previous versions of the package should read the vignette (pp. 10–11), which explains some syntax and function changes.

Regular readers of this blog will have noticed a couple of references to my new research assistant, Earo Wang. She has been working with me on R package development over the Australian summer. The new version of the hts package was one of the main projects she has worked on, and the many improvements are largely due to her considerable talents!

  • Rob Steele

    My 18-year-old daughter just asked what I was reading. I said I was reading about hierarchical forecasting. She asked “What’s that?” I said “predicting the future”. She said “Oh, Forecasting!” I asked what she thought I had said. “Fork Casting, like for fishing or something.” I can see throwing forks instead of darts to predict the future.

    Do you think it makes sense to use a bunch of different but related time series to predict a single thing and then combine the predictions into a meta-prediction? That’s what I thought you meant by hierarchical at first. Thanks!

    • http://robjhyndman.com/ Rob J Hyndman

      I think you need to define what it is you are predicting. Perhaps the average of the collected time series. Then it might be better to forecast them individually and average the results, rather than forecast the average directly, especially if the dynamics are somewhat different.

      • Rob Steele

        Let’s say I’m trying to predict noon temperatures in Tempe. I have that time series and several related ones, say 9:00 AM temperatures in Tempe, 3:00 PM temperatures in Flagstaff, and midnight wind speed and direction in Houston. What I propose to do is build a separate model mapping each potential predictor to my target of noon temperatures in Tempe. Some models would be better than others and their predictions would be highly correlated. Then I would combine those predictions as input to a meta-model to predict, again, noon temperatures in Tempe.

        Does this even make sense or is it obviously stupid? One pitfall is overfitting with so many degrees of freedom. Thanks!

        • http://robjhyndman.com/ Rob J Hyndman

          Rather than combine many single variable models, you are probably better off using a model with multiple predictors. So model noon temperatures in Tempe as a function of all the other temperature/location series. Provided all the predictors are from past times, it is OK. Weather patterns tend to move, so the temperatures at past times in other locations are often good predictors. However, the forecasts produced by meteorologists will be much better as they use atmospheric models with more data.
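
A minimal base R sketch of the multiple-predictor idea follows. All of the data are simulated (no real weather series are used), the variable names are hypothetical, and a plain linear regression stands in for whatever model would actually suit the problem.

```r
set.seed(123)
n   <- 300
day <- seq_len(n)

# Simulated (not real) weather series sharing a slow cycle.
tempe_9am     <- 25 + 8 * sin(2 * pi * day / 100) + rnorm(n)
flagstaff_3pm <- 12 + 6 * sin(2 * pi * day / 100) + rnorm(n)
houston_wind  <- 10 + rnorm(n)

# Previous-day values: shift a series down by one day.
lag1 <- function(x) c(NA, x[-length(x)])

weather <- data.frame(
  tempe_9am      = tempe_9am,             # same day, observed before noon
  flagstaff_prev = lag1(flagstaff_3pm),   # yesterday at 3:00 PM
  houston_prev   = lag1(houston_wind)     # yesterday at midnight
)

# Simulated target: noon temperature driven by two of the three predictors.
weather$noon <- 3 + 0.9 * weather$tempe_9am +
  0.2 * weather$flagstaff_prev + rnorm(n, sd = 0.5)

# Every predictor is known before noon, so nothing from the future leaks in.
train <- weather[2:250, ]      # row 1 is dropped because of the lag
test  <- weather[251:n, ]

fit  <- lm(noon ~ tempe_9am + flagstaff_prev + houston_prev, data = train)
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$noon - pred)^2))

print(round(coef(fit), 2))
print(rmse)
```

Fitting one model with all predictors lets the regression sort out which series carry information (here the simulated wind series does not), instead of combining several single-variable models afterwards.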

  • Thiago G. Martins

    Hi Rob,

    Nice post and useful intuitive explanation of hierarchical forecasting. Do you have an application where, instead of “geographical aggregation”, you have some kind of time aggregation? For example, if I want to predict daily data in a way that is consistent with monthly predictions?

    • http://robjhyndman.com/ Rob J Hyndman

      I’m currently working on a paper about hierarchical temporal aggregation. Hopefully I will have something to say about it in a month or two. There is a special session on hierarchical forecasting at the International Symposium on Forecasting in Rotterdam in June. At least one of the papers in the session will be on this topic.

      • http://tgmstat.wordpress.com Thiago G. Martins

        Great, looking forward to reading it.

  • Stephan Kolassa

    I’ve been a big fan of your optimal approach since you first presented it in Santander in 2006. I particularly like the way it naturally deals with multidimensional hierarchies, e.g., hierarchically forecasting across both location *and* product hierarchies. For other approaches, one has to do all kinds of database-fu; here, we only have to build the summation matrix. Or, rather than simply summing forecasts, we can sum them weighted by sales price by putting prices into the summation matrix instead of zeros and ones. Beautiful!

    However, when we run into bigger hierarchies (lots of products, perhaps with many levels), there comes a point where the OLS system involved gets painful. Of course the matrices involved are sparse, and I have looked a bit into specialized approaches, but many of those fail, since our summation matrix is usually of full rank (after all, we are usually also interested in the reconciled bottom level forecasts, which means that the summation matrix will contain a unit matrix across all columns). Have you or Earo ever looked into this issue? Does the new hts() function perhaps even include an optimization along these lines? (I haven’t looked into its source code; should I?)

    • http://robjhyndman.com/ Rob J Hyndman

      Hi Stephan. That is precisely what this post is about. Even with a hundred thousand series, the OLS will work extremely quickly due to some new research we have been doing. Big hierarchies are no longer a problem. To keep everything positive, use positive=TRUE in the forecast function. That won’t quite guarantee positivity after reconciliation. It only imposes positivity of the base forecasts before reconciliation. I will think about ways we can overcome this issue. Maybe constrained LS will help, but the problem will be speed with big hierarchies.
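
A small sketch of the positive=TRUE usage mentioned above, assuming the hts package is installed. The data are simulated, the object names are hypothetical, and argument names other than positive follow the package documentation as I understand it and may differ between versions.

```r
if (requireNamespace("hts", quietly = TRUE)) {
  library(hts)   # also attaches the forecast package
  set.seed(1)

  # Simulated, strictly positive bottom-level series: four series in two groups.
  bts <- ts(matrix(exp(rnorm(30 * 4, mean = 3, sd = 0.2)), ncol = 4,
                   dimnames = list(NULL, c("AA", "AB", "BA", "BB"))),
            start = 1985)

  # Two nodes below the total, each with two bottom-level children.
  y <- hts(bts, nodes = list(2, c(2, 2)))

  # positive = TRUE constrains the base forecasts only; as noted above, the
  # reconciled forecasts are not guaranteed to stay positive.
  fc <- forecast(y, h = 5, method = "comb", fmethod = "ets", positive = TRUE)

  print(allts(fc))             # forecasts for every series, top level first
  print(min(allts(fc)) > 0)    # worth checking after reconciliation
}
```

Checking the minimum of the reconciled forecasts afterwards is a cheap way to see whether the caveat about positivity after reconciliation matters for a given data set.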

  • Caroline C.

    Hello Rob,

    First, thank you very much for the hts package and all your explanations of it. The package is not only extremely useful but also really educational.

    Currently, I am dealing with some grouped time series that I have trouble reconciling after forecasting. Similar to the problem Rob Steele described, I have a grouped time series where the top series is split in two different ways, but without any data for splits by both of the ways at the same time. It would be like your demographic forecasting example if you had the mortality counts in Australia disaggregated by gender and also by state, but did not have the disaggregation by gender and state jointly. Is there a way to optimally reconcile the forecasts in each split such that the top-level series has an equality constraint (e.g. sum of mortality counts by gender = sum of mortality counts by state)?

    Another somewhat related problem I have right now is forecasting time series of ratios where the ratios use the same variable as the numerator but with different denominators, e.g. ratios X/Y and X/Z. The situation is a bit strange in that forecasting X/Y, X/Z, Y, or Z works well but forecasting X alone is awful. Is there a way to optimally reconcile forecasts of the ratios with forecasts of the

    Thank you very much for any advice or suggestions.


    • http://robjhyndman.com/ Rob J Hyndman

      Yes, that is possible but you would need to set up the S matrix and regression yourself as the package does not handle that situation. It is not very difficult provided you do not have too many time series.

      I haven’t thought about the ratio problem at all. You could try taking logs and then it becomes additive.
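
One way to set up the S matrix and regression by hand for the two-way split is sketched below in base R. The forecasts are invented, the series names are hypothetical, and treating the unobserved gender-by-state cells as the bottom level is an assumption of this sketch, not something the package does for you.

```r
# Unobserved bottom level: gender x state cells (M.S1, M.S2, F.S1, F.S2).
# Observed series: the total, the two genders and the two states.
S <- rbind(Total  = c(1, 1, 1, 1),
           Male   = c(1, 1, 0, 0),
           Female = c(0, 0, 1, 1),
           State1 = c(1, 0, 1, 0),
           State2 = c(0, 1, 0, 1))

# Made-up base forecasts for one horizon; neither split sums to the total.
yhat <- c(Total = 100, Male = 55, Female = 48, State1 = 60, State2 = 43)

# S is rank deficient here (the individual cells are not identified), but
# the fitted values of the regression are still unique. lm.fit() handles
# the rank deficiency through its pivoted QR decomposition.
fit    <- lm.fit(S, yhat)
ytilde <- fit$fitted.values

print(round(ytilde, 2))
```

With these numbers the reconciled forecasts are 101.5 for the total, 54.25 and 47.25 for the genders, and 59.25 and 42.25 for the states, so both splits now agree with the total. For more than one forecast horizon, yhat can be a matrix with one column per horizon.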

  • Caro Ana

    Good morning,
    I am working on prediction intervals for reconciled time series with the hts package. Is there a way to obtain the prediction intervals, knowing what they are for the base time series, or at least an upper and lower bound? My base time series forecasts are based on averaged models, and I don’t yet know how to obtain the confidence intervals for them.
    Thank you very much for your help. And for everything in this blog.

  • Adam Russell

    What is the best way to include covariates in a hierarchical time series?


    • http://robjhyndman.com/ Rob J Hyndman

      If you have different covariates for every time series, you will need to do the forecasting yourself and then combine them using combinef(). If you want to use the same covariates for every time series, simply use fmethod="arima" and include the xreg argument when you call forecast.gts.
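
Both routes can be sketched as follows, assuming the hts package is installed. The data and covariates are simulated, the object names are hypothetical, and the newxreg argument and get_nodes() helper are taken from more recent package documentation, so they are assumptions here (older versions may expect y$nodes instead).

```r
if (requireNamespace("hts", quietly = TRUE)) {
  library(hts)   # also attaches the forecast package
  set.seed(2014)
  n <- 40
  h <- 4

  # One simulated covariate, observed for n periods plus h future periods.
  x <- matrix(rnorm(n + h), ncol = 1, dimnames = list(NULL, "x"))
  bts <- sapply(1:4, function(i) 50 + 3 * x[1:n, 1] + rnorm(n))
  colnames(bts) <- c("AA", "AB", "BA", "BB")
  y <- hts(ts(bts), nodes = list(2, c(2, 2)))

  # Route 1: the same covariate for every series.
  fc_same <- forecast(y, h = h, fmethod = "arima",
                      xreg    = x[1:n, , drop = FALSE],
                      newxreg = x[n + 1:h, , drop = FALSE])

  # Route 2: a different covariate for each series. Forecast every series
  # in the hierarchy yourself, then reconcile with combinef().
  ally <- aggts(y)                        # all series, top level first
  X <- matrix(rnorm((n + h) * ncol(ally)), ncol = ncol(ally))
  allf <- matrix(NA_real_, nrow = h, ncol = ncol(ally))
  for (i in seq_len(ncol(ally))) {
    fit <- auto.arima(ally[, i], xreg = X[1:n, i, drop = FALSE])
    allf[, i] <- forecast(fit, xreg = X[n + 1:h, i, drop = FALSE])$mean
  }
  allf <- ts(allf, start = n + 1)

  # Columns of allf must be in the same order as aggts(y).
  fc_own <- combinef(allf, get_nodes(y), keep = "gts")
  print(allts(fc_own))
}
```

In the second route the only thing combinef() needs is the matrix of base forecasts and the hierarchy structure, so any forecasting method can be used for the individual series.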

      • Adam Russell

        Great! Thanks for your response. Appreciate it.