Hierarchical forecasting with hts v4.0

A new ver­sion of my hts pack­age for R is now on CRAN. It was com­pletely re-​​written from scratch. Not a sin­gle line of code sur­vived. There are some minor syn­tax changes, but the biggest change is speed and scope. This ver­sion is many times faster than the pre­vi­ous ver­sion and can han­dle hun­dreds of thou­sands of time series with­out complaining.

The speed-​​up is due to some new research I am doing with Alan Lee (Uni­ver­sity of Auck­land). Usu­ally, we would write the paper first, and then release a pack­age imple­ment­ing the ideas. But this time, the pack­age was pro­duced first and the papers will fol­low in the next few weeks.

For those unfa­mil­iar with the tech­niques and con­cepts involved with hier­ar­chi­cal fore­cast­ing, there is an intro­duc­tion in Sec­tion 9.4 of my fore­cast­ing text­book. The basic idea is that you need to fore­cast a large num­ber of time series under the con­straint that the fore­casts of some series have to add up to equal the fore­casts of other series. For exam­ple, the fore­casts of sales in each state need to equal to the fore­casts of sales across the coun­try. Our approach is to fore­cast all series inde­pen­dently, ignor­ing the con­straints. Then adjust all the fore­casts so the con­straints are sat­is­fied. We have a neat result that gives the opti­mal (least squares) reconciliation.

Users of pre­vi­ous ver­sions of the pack­age should read the vignette (pp.10–11) which explains some syn­tax and func­tion changes.

Reg­u­lar read­ers of this blog will have noticed a cou­ple of ref­er­ences to my new research assis­tant, Earo Wang. She has been work­ing with me on R pack­age devel­op­ment over the Aus­tralian sum­mer. The new ver­sion of the hts pack­age was one of the main projects she has worked on, and the many improve­ments are largely due to her con­sid­er­able tal­ents!


Related Posts:


  • Rob Steele

    18 year old daugh­ter just asked what I was read­ing. I said I was read­ing about hier­ar­chi­cal fore­cast­ing. She asked “What’s that?” I said “pre­dict­ing the future”. She said “Oh, Fore­cast­ing!” I asked what she thought I had said. “Fork Cast­ing, like for fish­ing or some­thing.” I can see throw­ing forks instead of darts to pre­dict the future.

    Do you think it makes sense to use a bunch of dif­fer­ent but related time series to pre­dict a sin­gle thing and then com­bine the pre­dic­tions into a meta-​​prediction? That’s what I thought you meant by hier­ar­chi­cal at first. Thanks!

    • http://robjhyndman.com/ Rob J Hyndman

      I think you need to define what it is you are pre­dict­ing. Per­haps the aver­age of the col­lected time series. Then it might be bet­ter to fore­cast them indi­vid­u­ally and aver­age the results, rather than fore­cast the aver­age directly, espe­cially if the dynam­ics are some­what different.

      • Rob Steele

        Let’s say I’m try­ing to pre­dict noon tem­per­a­tures in Tempe. I have that time series and sev­eral related ones, say 9:00 AM tem­per­a­tures in Tempe, 3:00 PM tem­per­a­tures in Flagstaff, and mid­night wind speed and direc­tion in Hous­ton. What I pro­pose to do is build a sep­a­rate model map­ping each poten­tial pre­dic­tor to my tar­get of noon tem­per­a­tures in Tempe. Some mod­els would be bet­ter than oth­ers and their pre­dic­tions would be highly cor­re­lated. Then I would com­bine those pre­dic­tions as input to a meta-​​model to pre­dict, again, noon tem­per­a­tures in Tempe.

        Does this even make sense or is it obvi­ously stu­pid? One pit­fall is over­fit­ting with so many degrees of free­dom. Thanks!

        • http://robjhyndman.com/ Rob J Hyndman

          Rather than com­bine many sin­gle vari­able mod­els, you are prob­a­bly bet­ter off using a model with mul­ti­ple pre­dic­tors. So model noon tem­per­a­tures in Tempe as a func­tion of all the other temperature/​location series. Pro­vided all the pre­dic­tors are from past times it is ok. Weather pat­terns tend to move, so the tem­per­a­tures at past times in other loca­tions are often good pre­dic­tors. How­ever, the fore­casts pro­duced by mete­o­rol­o­gists will be much bet­ter as they use atmos­pheric mod­els with more data.

  • Thi­ago G. Martins

    Hi Rob,

    Nice post and use­ful intu­itive expla­na­tion of hier­ar­chi­cal fore­cast. Have you an appli­ca­tion where instead of “geo­graph­i­cal aggre­ga­tion” you have some kind of time aggre­ga­tion? Like if I want to pre­dict daily data but in a way that becomes con­sis­tent with monthly predictions?

    • http://robjhyndman.com/ Rob J Hyndman

      I’m cur­rently work­ing on a paper about hier­ar­chi­cal tem­po­ral aggre­ga­tion. Hope­fully I will have some­thing to say about it in a month or two. There is a spe­cial ses­sion on hier­ar­chi­cal fore­cast­ing at the Inter­na­tional Sym­po­sium on Fore­cast­ing in Rot­ter­dam in June. At least one of the papers in the ses­sion will be on this topic.

      • http://tgmstat.wordpress.com Thi­ago G. Martins

        Great, look­ing for­ward to read it.

  • Stephan Kolassa

    I’ve been a big fan of your opti­mal approach since you first pre­sented it in San­tander in 2006. I par­tic­u­lar like the way it nat­u­rally deals with mul­ti­di­men­sional hier­ar­chies, e.g., hier­ar­chi­cally fore­cast­ing across both loca­tion *and* prod­uct hier­ar­chies. For other approaches, one has to do all kinds of database-​​fu — here, we only have to build the sum­ma­tion matrix. Or we can not sim­ply sum fore­casts, but sum them by sales price by putting prices into the sum­ma­tion matrix instead of zeros and ones. Beautiful!

    How­ever, when we run into big­ger hier­ar­chies (lots of prod­ucts, per­haps with many lev­els), there comes a point where the OLS sys­tem involved gets painful. Of course the matri­ces involved are sparse, and I have looked a bit into spe­cial­ized approaches, but many of those fail, since our sum­ma­tion matrix is usu­ally of full rank (after all, we are usu­ally also inter­ested in the rec­on­ciled bot­tom level fore­casts — which means that the sum­ma­tion matrix will con­tain a unit matrix across all columns). Have you or Earo ever looked into this issue? Does the new hts() func­tion per­haps even include an opti­miza­tion along these lines (I haven’t looked into its source code — should I?)?

    • http://robjhyndman.com/ Rob J Hyndman

      Hi Stephan. That is pre­cisely what this post is about. Even with a hun­dred thou­sand series, the OLS will work extremely quickly due to some new research we have been doing. Big hier­ar­chies are no longer a prob­lem. To keep every­thing pos­i­tive, use positive=TRUE in the fore­cast func­tion. That won’t quite guar­an­tee pos­i­tiv­ity after rec­on­cil­i­a­tion. It only imposes pos­i­tiv­ity of the base fore­casts before rec­on­cil­i­a­tion. I will think about ways we can over­come this issue. Maybe con­strained LS will help, but the prob­lem will be speed with big hierarchies.

  • Car­o­line C.

    Hello Rob,

    First, thank you very much for the hts pack­age and all your
    expla­na­tions of it. The pack­age is not only extremely use­ful but
    also really educational.

    Cur­rently, I am deal­ing with some grouped time series that I have
    trou­ble rec­on­cil­ing after fore­cast­ing. Sim­i­lar to the prob­lem Rob
    Steele described, I have a grouped time series where the top series
    is split in two dif­fer­ent ways, but with­out any data for splits by
    both of the ways at the same time. It would be like in your
    demo­graphic fore­cast­ing exam­ple if you had the mor­tal­ity counts
    in Aus­tralia dis­ag­gre­gated by gen­der and also by state, but don’t
    have the dis­ag­gre­ga­tion by gen­der and by state. Is there a way to
    opti­mally rec­on­cile the fore­casts in each split such that the
    top-​​level series has an equal­ity con­straint (e.g. sum of mor­tal­ity
    counts by gen­der = sum of mor­tal­ity count by state)?

    Another some­what related prob­lem I have right now is fore­cast­ing
    time series of ratios where the ratios use the same vari­able as the
    numer­a­tor but with dif­fer­ent denom­i­na­tors, e.g. ratios X/​Y and X/​Z.
    The sit­u­a­tion is a bit strange in that fore­cast­ing X/​Y, X/​Z, Y, or Z
    works well but fore­cast­ing X alone is awful. Is there a way to
    opti­mally rec­on­cile fore­casts of the ratios with fore­casts of the
    denominators?

    Thank you very much for any advice or suggestions.

    Caroline

    • http://robjhyndman.com/ Rob J Hyndman

      Yes, that is pos­si­ble but you would need to set up the S matrix and regres­sion your­self as the pack­age does not han­dle that sit­u­a­tion. It is not very dif­fi­cult pro­vided you do not have too many time series.

      I haven’t thought about the ratio prob­lem at all. You could try tak­ing logs and then it becomes additive.

  • Caro Ana

    Good morn­ing,
    I am work­ing on Pre­dic­tion Inter­vals for rec­on­ciled time series with the hts pack­age. Is there a way to obtain the Pre­dic­tion Inter­vals know­ing what they are for the base time series or what could be an upper and lower bound ? My base time series fore­casts are based on aver­aged mod­els, and I don’t know yet how to obtain the con­fi­dent inter­vals for them though.
    Thank you very much for your help. And for every­thing in this blog.

  • Adam Rus­sell

    What is the best way to include covari­ates into a hier­ar­chi­cal time series?

    Thanks!

    • http://robjhyndman.com/ Rob J Hyndman

      If you have dif­fer­ent covari­ates for every time series, you will need to do the fore­cast­ing your­self and then com­bine them using com­binef(). If you want to use the same covari­ates for every time series, sim­ply use fmethod=“arima” and include the xreg argu­ment when you call forecast.gts.

      • Adam Rus­sell

        Great! Thanks for your response. Appre­ci­ate it.

  • Ste­fano

    Prof. Hyn­d­man,
    first of all thank you very much for shar­ing the results of your work. I find hier­ar­chi­cal time series fore­cast­ing very inter­est­ing, but I am new to R and I would be really grate­ful if you could give me some sug­ges­tions. I want to fore­cast con­sump­tion of sev­eral dis­ag­gre­gated prod­ucts with a hier­ar­chi­cal struc­ture and I would like to use your opti­mal com­bi­na­tion approach. Let’s say I am not sat­is­fied with my fore­cast out­put for some series at a higher level in the hyer­ar­chy and I have an idea about how that “aggre­gated” mar­ket is going. Is there a way to use con­stant adjust­ments to improve the com­plete hyer­ar­chy fore­cast? Thank you very much. Best regards from Italy.

    • http://robjhyndman.com/ Rob J Hyndman

      You can make any adjust­ments you like to the base fore­casts. The com­bi­na­tion approach will make the small­est pos­si­ble adjust­ments to ensure the fore­casts are rec­on­ciled, so it is very unlikely that the rec­on­ciled fore­casts will be bad if you are happy with the base forecasts.

      • Ste­fano

        Dear Prof. Hyn­d­man,
        I really appre­ci­ate your help. I am slowly leran­ing how to use R and I am find­ing really inter­est­ing your hts pack­age. I would be really grate­ful if you could answer a cou­ple of ques­tions:
        1. I am try­ing to use the combinef(f) func­tion, fol­low­ing the help guide, to learne how to esti­mate equa­tions with dif­fer­ent covari­ates and then using the opti­mal com­bi­na­tion approach. But once I try the accuracy,gts(), for an in-​​sample fore­cast eval­u­a­tion, I receive an error mes­sage say­ing the argu­ment is not a matrix. Do you have an idea about my mis­take?
        2. Would you please sug­gest the eas­i­est way to intro­duce con­stant adjust­ments to only some base forecasts?

        Thank you very much for your help. Best Regards. Stefano

        • http://robjhyndman.com/ Rob J Hyndman

          1. Please pro­vide a min­i­mal repro­ducible exam­ple.
          2. The com­binef() func­tion allows you to use what­ever base fore­casts you like. Just add what­ever adjust­ments you like to the forecasts.

          • Ste­fano

            Thank you for your kind reply.
            1. This is what I am try­ing to do. y is my hier­ar­chi­cal time series (1995−2013). I would like to test the fore­cast per­for­mance for the period 2011–2013.

            ally <- aggts(y)

            allf <- matrix(NA, nrow = 3, ncol = ncol(ally))

            allf <- ts(allf, start = 2011)

            datast <- window(ally, start = 1995, end = 2010)

            test <- window(y, start = 2011)

            for(i in 1:ncol(datast))
            allf[,i]<- forecast(auto.arima(datast[,i],trace=TRUE,test=“adf”), h = 3, PI=FALSE)mean allf<-ts(allf,start=2011)  fcast <- combinef(allf, ynodes, weights = NULL, keep = “gts”)

            accuracy.gts(fcast,test)

            With the last line, I get the error mes­sage: Error in t.default(y$bts) : argu­ment is not a matrix

            2. When you say “add what­ever adjust­ment you like to the fore­casts” you mean chang­ing val­ues in the “allf” matrix before using combinef()?

            Thank you very much.
            Stefano

          • http://robjhyndman.com/ Rob J Hyndman

            1. This is a bug. Now fixed on github.
            2. Yes.

          • Ste­fano

            Prof. Hyn­d­man,
            I installed the fixed ver­sion from github but unfor­tu­nately now I receive the error mes­sage
            Error in accuracy.gts(foreali, test) : could not find func­tion “is.gts“
            Thank you very much for your help

          • http://robjhyndman.com/ Rob J Hyndman

            I do not get that error using your code above. Please pro­vide a min­i­mal repro­ducible exam­ple if you think there is a bug.

          • Ste­fano

            I am prob­a­bly mak­ing some mis­takes when installing the pack­age from github. I down­loaded the zip.file hts-master.zip from github and then installed it in R. Then I run: library(hts) and the code above. Am I doing some­thing wrong? Thank you really much, and please excuse me for my poor knowl­edge of R.

          • http://robjhyndman.com/ Rob J Hyndman

            Instruc­tions on how to install from github are given on the github page at https://​github​.com/​r​o​b​j​h​y​n​d​m​a​n/hts. Please fol­low them.

          • Ste­fano

            It was def­i­nitely my mis­take with the instal­la­tion pro­ce­dure. Now every­thing works per­fectly. I am really grate­ful for your help. Thank you very much Prof. Hyndman.

  • Manuel Rivas

    Hi Rob — huge fan of your work. I was won­der­ing if there was any way to obtain fan plots or pre­dic­tion inter­vals for the Hier­ar­chi­cal Time Series fore­casts. Every time I run fore­cast with an hts object I get the fol­low­ing error mes­sage: unused argu­ment (fan = TRUE). Thanks!

    • http://robjhyndman.com/ Rob J Hyndman

      Not yet, but we are work­ing on it.

  • Ste­fano

    Dear Prof. Hyn­d­man, I am hav­ing some trou­bles with the fol­low­ing code:

    y_durev<-hts(dati_durev,nodes_durev)

    ally_​durev
    <- aggts(y_durev)

    allf_​durev
    <- matrix(NA, nrow = 4, ncol = ncol(ally_durev))

    datast_​durev
    <- window(ally_durev, start = 1996, end = 2013)

    reddispst<-window(reddisp,start=1996,
    end=2013)

    reddispprev<-window(reddisp,start=2014)

    ricchrst<-window(ricchr,start=1996,
    end=2013)

    ricchrprev<-window(ricchr,start=2014)
    for (i in c(1,2,6,7))

    allf_durev[,i]<- forecast(auto.arima((datast_durev[,i]),
    xreg=(reddispst), d=1), xreg=(reddispprev), h = 4)mean  for (i in c(4,8))    allf_durev[,i]<- forecast(auto.arima((datast_durev[,i]), xreg=(ricchrst), d=1), xreg=(ricchrprev), h = 4)mean

    for (i in c(3,5))
    allf_durev[,i]<- forecast(ets((datast_durev[,i])),h=4)mean  allf_durev<-ts(allf_durev,start=2014)  prev_durev <- combinef(allf_durev, y_durevnodes_​durev, weights=NULL, keep=”gts”)

    at the end I get an error mes­sage like

    Error in combinef(allf_durev, y_durev$nodes_durev,
    weights = NULL, keep = “gts”) : Argu­ment fcasts requires all the forecasts.

    I am not able to under­stand what I am doing wrong, I would be really grate­ful if you could help me

    Thank­ing you,
    Kind regards
    Stefano