Hierarchical forecasting with hts v4.0

A new version of my hts package for R is now on CRAN. It was completely re-written from scratch. Not a single line of code survived. There are some minor syntax changes, but the biggest change is speed and scope. This version is many times faster than the previous version and can handle hundreds of thousands of time series without complaining.

The speed-up is due to some new research I am doing with Alan Lee (University of Auckland). Usually, we would write the paper first, and then release a package implementing the ideas. But this time, the package was produced first and the papers will follow in the next few weeks.

For those unfamiliar with the techniques and concepts involved with hierarchical forecasting, there is an introduction in Section 9.4 of my forecasting textbook. The basic idea is that you need to forecast a large number of time series under the constraint that the forecasts of some series have to add up to equal the forecasts of other series. For example, the forecasts of sales in each state need to equal to the forecasts of sales across the country. Our approach is to forecast all series independently, ignoring the constraints. Then adjust all the forecasts so the constraints are satisfied. We have a neat result that gives the optimal (least squares) reconciliation.

Users of previous versions of the package should read the vignette (pp.10-11) which explains some syntax and function changes.

Regular readers of this blog will have noticed a couple of references to my new research assistant, Earo Wang. She has been working with me on R package development over the Australian summer. The new version of the hts package was one of the main projects she has worked on, and the many improvements are largely due to her considerable talents!

Related Posts:

  • Rob Steele

    18 year old daughter just asked what I was reading. I said I was reading about hierarchical forecasting. She asked “What’s that?” I said “predicting the future”. She said “Oh, Forecasting!” I asked what she thought I had said. “Fork Casting, like for fishing or something.” I can see throwing forks instead of darts to predict the future.

    Do you think it makes sense to use a bunch of different but related time series to predict a single thing and then combine the predictions into a meta-prediction? That’s what I thought you meant by hierarchical at first. Thanks!

    • I think you need to define what it is you are predicting. Perhaps the average of the collected time series. Then it might be better to forecast them individually and average the results, rather than forecast the average directly, especially if the dynamics are somewhat different.

      • Rob Steele

        Let’s say I’m trying to predict noon temperatures in Tempe. I have that time series and several related ones, say 9:00 AM temperatures in Tempe, 3:00 PM temperatures in Flagstaff, and midnight wind speed and direction in Houston. What I propose to do is build a separate model mapping each potential predictor to my target of noon temperatures in Tempe. Some models would be better than others and their predictions would be highly correlated. Then I would combine those predictions as input to a meta-model to predict, again, noon temperatures in Tempe.

        Does this even make sense or is it obviously stupid? One pitfall is overfitting with so many degrees of freedom. Thanks!

        • Rather than combine many single variable models, you are probably better off using a model with multiple predictors. So model noon temperatures in Tempe as a function of all the other temperature/location series. Provided all the predictors are from past times it is ok. Weather patterns tend to move, so the temperatures at past times in other locations are often good predictors. However, the forecasts produced by meteorologists will be much better as they use atmospheric models with more data.

  • Thiago G. Martins

    Hi Rob,

    Nice post and useful intuitive explanation of hierarchical forecast. Have you an application where instead of “geographical aggregation” you have some kind of time aggregation? Like if I want to predict daily data but in a way that becomes consistent with monthly predictions?

    • I’m currently working on a paper about hierarchical temporal aggregation. Hopefully I will have something to say about it in a month or two. There is a special session on hierarchical forecasting at the International Symposium on Forecasting in Rotterdam in June. At least one of the papers in the session will be on this topic.

  • Stephan Kolassa

    I’ve been a big fan of your optimal approach since you first presented it in Santander in 2006. I particular like the way it naturally deals with multidimensional hierarchies, e.g., hierarchically forecasting across both location *and* product hierarchies. For other approaches, one has to do all kinds of database-fu – here, we only have to build the summation matrix. Or we can not simply sum forecasts, but sum them by sales price by putting prices into the summation matrix instead of zeros and ones. Beautiful!

    However, when we run into bigger hierarchies (lots of products, perhaps with many levels), there comes a point where the OLS system involved gets painful. Of course the matrices involved are sparse, and I have looked a bit into specialized approaches, but many of those fail, since our summation matrix is usually of full rank (after all, we are usually also interested in the reconciled bottom level forecasts – which means that the summation matrix will contain a unit matrix across all columns). Have you or Earo ever looked into this issue? Does the new hts() function perhaps even include an optimization along these lines (I haven’t looked into its source code – should I?)?

    • Hi Stephan. That is precisely what this post is about. Even with a hundred thousand series, the OLS will work extremely quickly due to some new research we have been doing. Big hierarchies are no longer a problem. To keep everything positive, use positive=TRUE in the forecast function. That won’t quite guarantee positivity after reconciliation. It only imposes positivity of the base forecasts before reconciliation. I will think about ways we can overcome this issue. Maybe constrained LS will help, but the problem will be speed with big hierarchies.

  • Caroline C.

    Hello Rob,

    First, thank you very much for the hts package and all your
    explanations of it. The package is not only extremely useful but
    also really educational.

    Currently, I am dealing with some grouped time series that I have
    trouble reconciling after forecasting. Similar to the problem Rob
    Steele described, I have a grouped time series where the top series
    is split in two different ways, but without any data for splits by
    both of the ways at the same time. It would be like in your
    demographic forecasting example if you had the mortality counts
    in Australia disaggregated by gender and also by state, but don’t
    have the disaggregation by gender and by state. Is there a way to
    optimally reconcile the forecasts in each split such that the
    top-level series has an equality constraint (e.g. sum of mortality
    counts by gender = sum of mortality count by state)?

    Another somewhat related problem I have right now is forecasting
    time series of ratios where the ratios use the same variable as the
    numerator but with different denominators, e.g. ratios X/Y and X/Z.
    The situation is a bit strange in that forecasting X/Y, X/Z, Y, or Z
    works well but forecasting X alone is awful. Is there a way to
    optimally reconcile forecasts of the ratios with forecasts of the

    Thank you very much for any advice or suggestions.


    • Yes, that is possible but you would need to set up the S matrix and regression yourself as the package does not handle that situation. It is not very difficult provided you do not have too many time series.

      I haven’t thought about the ratio problem at all. You could try taking logs and then it becomes additive.

  • Caro Ana

    Good morning,
    I am working on Prediction Intervals for reconciled time series with the hts package. Is there a way to obtain the Prediction Intervals knowing what they are for the base time series or what could be an upper and lower bound ? My base time series forecasts are based on averaged models, and I don’t know yet how to obtain the confident intervals for them though.
    Thank you very much for your help. And for everything in this blog.

  • Adam Russell

    What is the best way to include covariates into a hierarchical time series?


    • If you have different covariates for every time series, you will need to do the forecasting yourself and then combine them using combinef(). If you want to use the same covariates for every time series, simply use fmethod=”arima” and include the xreg argument when you call forecast.gts.

      • Adam Russell

        Great! Thanks for your response. Appreciate it.

  • Stefano

    Prof. Hyndman,
    first of all thank you very much for sharing the results of your work. I find hierarchical time series forecasting very interesting, but I am new to R and I would be really grateful if you could give me some suggestions. I want to forecast consumption of several disaggregated products with a hierarchical structure and I would like to use your optimal combination approach. Let’s say I am not satisfied with my forecast output for some series at a higher level in the hyerarchy and I have an idea about how that “aggregated” market is going. Is there a way to use constant adjustments to improve the complete hyerarchy forecast? Thank you very much. Best regards from Italy.

    • You can make any adjustments you like to the base forecasts. The combination approach will make the smallest possible adjustments to ensure the forecasts are reconciled, so it is very unlikely that the reconciled forecasts will be bad if you are happy with the base forecasts.

      • Stefano

        Dear Prof. Hyndman,
        I really appreciate your help. I am slowly leraning how to use R and I am finding really interesting your hts package. I would be really grateful if you could answer a couple of questions:
        1. I am trying to use the combinef(f) function, following the help guide, to learne how to estimate equations with different covariates and then using the optimal combination approach. But once I try the accuracy,gts(), for an in-sample forecast evaluation, I receive an error message saying the argument is not a matrix. Do you have an idea about my mistake?
        2. Would you please suggest the easiest way to introduce constant adjustments to only some base forecasts?

        Thank you very much for your help. Best Regards. Stefano

        • 1. Please provide a minimal reproducible example.
          2. The combinef() function allows you to use whatever base forecasts you like. Just add whatever adjustments you like to the forecasts.

          • Stefano

            Thank you for your kind reply.
            1. This is what I am trying to do. y is my hierarchical time series (1995-2013). I would like to test the forecast performance for the period 2011-2013.

            ally <- aggts(y)

            allf <- matrix(NA, nrow = 3, ncol = ncol(ally))

            allf <- ts(allf, start = 2011)

            datast <- window(ally, start = 1995, end = 2010)

            test <- window(y, start = 2011)

            for(i in 1:ncol(datast))
            allf[,i]<- forecast(auto.arima(datast[,i],trace=TRUE,test="adf"), h = 3, PI=FALSE)$mean

            fcast <- combinef(allf, y$nodes, weights = NULL, keep = "gts")


            With the last line, I get the error message: Error in t.default(y$bts) : argument is not a matrix

            2. When you say "add whatever adjustment you like to the forecasts" you mean changing values in the "allf" matrix before using combinef()?

            Thank you very much.

          • 1. This is a bug. Now fixed on github.
            2. Yes.

          • Stefano

            Prof. Hyndman,
            I installed the fixed version from github but unfortunately now I receive the error message
            Error in accuracy.gts(foreali, test) : could not find function “is.gts”
            Thank you very much for your help

          • I do not get that error using your code above. Please provide a minimal reproducible example if you think there is a bug.

          • Stefano

            I am probably making some mistakes when installing the package from github. I downloaded the zip.file hts-master.zip from github and then installed it in R. Then I run: library(hts) and the code above. Am I doing something wrong? Thank you really much, and please excuse me for my poor knowledge of R.

          • Instructions on how to install from github are given on the github page at https://github.com/robjhyndman/hts. Please follow them.

          • Stefano

            It was definitely my mistake with the installation procedure. Now everything works perfectly. I am really grateful for your help. Thank you very much Prof. Hyndman.

  • Manuel Rivas

    Hi Rob – huge fan of your work. I was wondering if there was any way to obtain fan plots or prediction intervals for the Hierarchical Time Series forecasts. Every time I run forecast with an hts object I get the following error message: unused argument (fan = TRUE). Thanks!

  • Stefano

    Dear Prof. Hyndman, I am having some troubles with the following code:


    <- aggts(y_durev)

    <- matrix(NA, nrow = 4, ncol = ncol(ally_durev))

    <- window(ally_durev, start = 1996, end = 2013)




    for (i in c(1,2,6,7))

    allf_durev[,i]<- forecast(auto.arima((datast_durev[,i]),
    xreg=(reddispst), d=1), xreg=(reddispprev), h = 4)$mean

    for (i in c(4,8))

    allf_durev[,i]<- forecast(auto.arima((datast_durev[,i]),
    xreg=(ricchrst), d=1), xreg=(ricchrprev), h = 4)$mean

    for (i in c(3,5))
    allf_durev[,i]<- forecast(ets((datast_durev[,i])),h=4)$mean


    <- combinef(allf_durev, y_durev$nodes_durev, weights=NULL, keep=”gts”)

    at the end I get an error message like

    Error in combinef(allf_durev, y_durev$nodes_durev,
    weights = NULL, keep = "gts") : Argument fcasts requires all the forecasts.

    I am not able to understand what I am doing wrong, I would be really grateful if you could help me

    Thanking you,
    Kind regards

  • Max Ghenis

    This approach sounds great, particularly for reducing the noise from smaller forecasts by effectively using information from other series. I found that slides 22-23 at https://forecasters.org/wp-content/uploads/gravity_forms/7-2a51b93047891f1ec3608bdbd77ca58d/2014/07/Athanasopoulos_George_ISF2014.pdf evaluate the effectiveness of the approach from a competition-like standpoint, which indicate scaling and averaging outperform other methods like bottom-up, but it’s a bit sparse without the voiceover. Are materials available which go more into the forecast accuracy benefits from using hts methods?

  • greddy121

    When running gts forecasts on 2 groups, I kept running into “Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory).”

    After some debugging it seems this happens because of a few groups that have no values. So I had to remove all columns with zero variance. Is that expected? Is there a better way to handle those groups/columns that end up having no values?

    • 1. The new version on github now has a QR option if LU fails. That might help. Otherwise I could look at it if you could provide a minimal reproducible example.

      2. Yes, this happens. Setting lambda=0 only guarantees that the base forecasts are positive, but after reconciliation some forecasts might become negative. We are working on a solution to that problem and it will be in the package in the next couple of months.

      • Peter Stephensen

        Hi Rob. As I understand it, setting lambda=0 implies sending log(Ybase) to the forecast method instead of Ybase. In the standard implementation of your (very cool) optimal method, it makes sure that sum(Ybase) = YHigherLevel. Does lambda=0 implies sum(log(Ybase)) = log(YHigherLevel)? If so: is there a way to get sum(Ybase) = YHigherLevel instead?

        • The log transformation is undone before reconciliation. So sum(Ybase) = YHigherLevel even when the lambda argument is used.

          • Peter Stephensen

            Thanks! Do you know when the new package that you mention under point 2 above will be available? Looking forward to use it.

          • It will be part of the hts package, not a new package. It should be included, at least on github, in the next month or so.