# Hierarchical forecasting with hts v4.0

A new version of my hts package for R is now on CRAN. It was completely re-written from scratch. Not a single line of code survived. There are some minor syntax changes, but the biggest change is speed and scope. This version is many times faster than the previous version and can handle hundreds of thousands of time series without complaining.

The speed-up is due to some new research I am doing with Alan Lee (University of Auckland). Usually, we would write the paper first, and then release a package implementing the ideas. But this time, the package was produced first and the papers will follow in the next few weeks.

For those unfamiliar with the techniques and concepts involved with hierarchical forecasting, there is an introduction in Section 9.4 of my forecasting textbook. The basic idea is that you need to forecast a large number of time series under the constraint that the forecasts of some series have to add up to equal the forecasts of other series. For example, the forecasts of sales in each state need to equal to the forecasts of sales across the country. Our approach is to forecast all series independently, ignoring the constraints. Then adjust all the forecasts so the constraints are satisfied. We have a neat result that gives the optimal (least squares) reconciliation.

Users of previous versions of the package should read the vignette (pp.10-11) which explains some syntax and function changes.

Regular readers of this blog will have noticed a couple of references to my new research assistant, Earo Wang. She has been working with me on R package development over the Australian summer. The new version of the hts package was one of the main projects she has worked on, and the many improvements are largely due to her considerable talents!

### Related Posts:

• Rob Steele

18 year old daughter just asked what I was reading. I said I was reading about hierarchical forecasting. She asked “What’s that?” I said “predicting the future”. She said “Oh, Forecasting!” I asked what she thought I had said. “Fork Casting, like for fishing or something.” I can see throwing forks instead of darts to predict the future.

Do you think it makes sense to use a bunch of different but related time series to predict a single thing and then combine the predictions into a meta-prediction? That’s what I thought you meant by hierarchical at first. Thanks!

• I think you need to define what it is you are predicting. Perhaps the average of the collected time series. Then it might be better to forecast them individually and average the results, rather than forecast the average directly, especially if the dynamics are somewhat different.

• Rob Steele

Let’s say I’m trying to predict noon temperatures in Tempe. I have that time series and several related ones, say 9:00 AM temperatures in Tempe, 3:00 PM temperatures in Flagstaff, and midnight wind speed and direction in Houston. What I propose to do is build a separate model mapping each potential predictor to my target of noon temperatures in Tempe. Some models would be better than others and their predictions would be highly correlated. Then I would combine those predictions as input to a meta-model to predict, again, noon temperatures in Tempe.

Does this even make sense or is it obviously stupid? One pitfall is overfitting with so many degrees of freedom. Thanks!

• Rather than combine many single variable models, you are probably better off using a model with multiple predictors. So model noon temperatures in Tempe as a function of all the other temperature/location series. Provided all the predictors are from past times it is ok. Weather patterns tend to move, so the temperatures at past times in other locations are often good predictors. However, the forecasts produced by meteorologists will be much better as they use atmospheric models with more data.

• Thiago G. Martins

Hi Rob,

Nice post and useful intuitive explanation of hierarchical forecast. Have you an application where instead of “geographical aggregation” you have some kind of time aggregation? Like if I want to predict daily data but in a way that becomes consistent with monthly predictions?

• I’m currently working on a paper about hierarchical temporal aggregation. Hopefully I will have something to say about it in a month or two. There is a special session on hierarchical forecasting at the International Symposium on Forecasting in Rotterdam in June. At least one of the papers in the session will be on this topic.

• Great, looking forward to read it.

• Stephan Kolassa

I’ve been a big fan of your optimal approach since you first presented it in Santander in 2006. I particular like the way it naturally deals with multidimensional hierarchies, e.g., hierarchically forecasting across both location *and* product hierarchies. For other approaches, one has to do all kinds of database-fu – here, we only have to build the summation matrix. Or we can not simply sum forecasts, but sum them by sales price by putting prices into the summation matrix instead of zeros and ones. Beautiful!

However, when we run into bigger hierarchies (lots of products, perhaps with many levels), there comes a point where the OLS system involved gets painful. Of course the matrices involved are sparse, and I have looked a bit into specialized approaches, but many of those fail, since our summation matrix is usually of full rank (after all, we are usually also interested in the reconciled bottom level forecasts – which means that the summation matrix will contain a unit matrix across all columns). Have you or Earo ever looked into this issue? Does the new hts() function perhaps even include an optimization along these lines (I haven’t looked into its source code – should I?)?

• Hi Stephan. That is precisely what this post is about. Even with a hundred thousand series, the OLS will work extremely quickly due to some new research we have been doing. Big hierarchies are no longer a problem. To keep everything positive, use positive=TRUE in the forecast function. That won’t quite guarantee positivity after reconciliation. It only imposes positivity of the base forecasts before reconciliation. I will think about ways we can overcome this issue. Maybe constrained LS will help, but the problem will be speed with big hierarchies.

• Caroline C.

Hello Rob,

First, thank you very much for the hts package and all your
explanations of it. The package is not only extremely useful but
also really educational.

Currently, I am dealing with some grouped time series that I have
trouble reconciling after forecasting. Similar to the problem Rob
Steele described, I have a grouped time series where the top series
is split in two different ways, but without any data for splits by
both of the ways at the same time. It would be like in your
demographic forecasting example if you had the mortality counts
in Australia disaggregated by gender and also by state, but don’t
have the disaggregation by gender and by state. Is there a way to
optimally reconcile the forecasts in each split such that the
top-level series has an equality constraint (e.g. sum of mortality
counts by gender = sum of mortality count by state)?

Another somewhat related problem I have right now is forecasting
time series of ratios where the ratios use the same variable as the
numerator but with different denominators, e.g. ratios X/Y and X/Z.
The situation is a bit strange in that forecasting X/Y, X/Z, Y, or Z
works well but forecasting X alone is awful. Is there a way to
optimally reconcile forecasts of the ratios with forecasts of the
denominators?

Thank you very much for any advice or suggestions.

Caroline

• Yes, that is possible but you would need to set up the S matrix and regression yourself as the package does not handle that situation. It is not very difficult provided you do not have too many time series.

I haven’t thought about the ratio problem at all. You could try taking logs and then it becomes additive.

• Caro Ana

Good morning,
I am working on Prediction Intervals for reconciled time series with the hts package. Is there a way to obtain the Prediction Intervals knowing what they are for the base time series or what could be an upper and lower bound ? My base time series forecasts are based on averaged models, and I don’t know yet how to obtain the confident intervals for them though.
Thank you very much for your help. And for everything in this blog.

What is the best way to include covariates into a hierarchical time series?

Thanks!

• If you have different covariates for every time series, you will need to do the forecasting yourself and then combine them using combinef(). If you want to use the same covariates for every time series, simply use fmethod=”arima” and include the xreg argument when you call forecast.gts.

Great! Thanks for your response. Appreciate it.

• Stefano

Prof. Hyndman,
first of all thank you very much for sharing the results of your work. I find hierarchical time series forecasting very interesting, but I am new to R and I would be really grateful if you could give me some suggestions. I want to forecast consumption of several disaggregated products with a hierarchical structure and I would like to use your optimal combination approach. Let’s say I am not satisfied with my forecast output for some series at a higher level in the hyerarchy and I have an idea about how that “aggregated” market is going. Is there a way to use constant adjustments to improve the complete hyerarchy forecast? Thank you very much. Best regards from Italy.

• You can make any adjustments you like to the base forecasts. The combination approach will make the smallest possible adjustments to ensure the forecasts are reconciled, so it is very unlikely that the reconciled forecasts will be bad if you are happy with the base forecasts.

• Stefano

Dear Prof. Hyndman,
I really appreciate your help. I am slowly leraning how to use R and I am finding really interesting your hts package. I would be really grateful if you could answer a couple of questions:
1. I am trying to use the combinef(f) function, following the help guide, to learne how to estimate equations with different covariates and then using the optimal combination approach. But once I try the accuracy,gts(), for an in-sample forecast evaluation, I receive an error message saying the argument is not a matrix. Do you have an idea about my mistake?
2. Would you please suggest the easiest way to introduce constant adjustments to only some base forecasts?

Thank you very much for your help. Best Regards. Stefano

• 1. Please provide a minimal reproducible example.
2. The combinef() function allows you to use whatever base forecasts you like. Just add whatever adjustments you like to the forecasts.

• Stefano

1. This is what I am trying to do. y is my hierarchical time series (1995-2013). I would like to test the forecast performance for the period 2011-2013.

ally <- aggts(y)

allf <- matrix(NA, nrow = 3, ncol = ncol(ally))

allf <- ts(allf, start = 2011)

datast <- window(ally, start = 1995, end = 2010)

test <- window(y, start = 2011)

for(i in 1:ncol(datast))
allf[,i]<- forecast(auto.arima(datast[,i],trace=TRUE,test="adf"), h = 3, PI=FALSE)$mean allf<-ts(allf,start=2011) fcast <- combinef(allf, y$nodes, weights = NULL, keep = "gts")

accuracy.gts(fcast,test)

With the last line, I get the error message: Error in t.default(y$bts) : argument is not a matrix 2. When you say "add whatever adjustment you like to the forecasts" you mean changing values in the "allf" matrix before using combinef()? Thank you very much. Stefano • 1. This is a bug. Now fixed on github. 2. Yes. • Stefano Prof. Hyndman, I installed the fixed version from github but unfortunately now I receive the error message Error in accuracy.gts(foreali, test) : could not find function “is.gts” Thank you very much for your help • I do not get that error using your code above. Please provide a minimal reproducible example if you think there is a bug. • Stefano I am probably making some mistakes when installing the package from github. I downloaded the zip.file hts-master.zip from github and then installed it in R. Then I run: library(hts) and the code above. Am I doing something wrong? Thank you really much, and please excuse me for my poor knowledge of R. • Instructions on how to install from github are given on the github page at https://github.com/robjhyndman/hts. Please follow them. • Stefano It was definitely my mistake with the installation procedure. Now everything works perfectly. I am really grateful for your help. Thank you very much Prof. Hyndman. • Manuel Rivas Hi Rob – huge fan of your work. I was wondering if there was any way to obtain fan plots or prediction intervals for the Hierarchical Time Series forecasts. Every time I run forecast with an hts object I get the following error message: unused argument (fan = TRUE). Thanks! • Not yet, but we are working on it. • Stefano Dear Prof. Hyndman, I am having some troubles with the following code: y_durev<-hts(dati_durev,nodes_durev) ally_durev <- aggts(y_durev) allf_durev <- matrix(NA, nrow = 4, ncol = ncol(ally_durev)) datast_durev <- window(ally_durev, start = 1996, end = 2013) reddispst<-window(reddisp,start=1996, end=2013) reddispprev<-window(reddisp,start=2014) ricchrst<-window(ricchr,start=1996, end=2013) ricchrprev<-window(ricchr,start=2014) for (i in c(1,2,6,7)) allf_durev[,i]<- forecast(auto.arima((datast_durev[,i]), xreg=(reddispst), d=1), xreg=(reddispprev), h = 4)$mean

for (i in c(4,8))

allf_durev[,i]<- forecast(auto.arima((datast_durev[,i]),
xreg=(ricchrst), d=1), xreg=(ricchrprev), h = 4)$mean for (i in c(3,5)) allf_durev[,i]<- forecast(ets((datast_durev[,i])),h=4)$mean

allf_durev<-ts(allf_durev,start=2014)

prev_durev
<- combinef(allf_durev, y_durev$nodes_durev, weights=NULL, keep=”gts”) at the end I get an error message like Error in combinef(allf_durev, y_durev$nodes_durev,
weights = NULL, keep = "gts") : Argument fcasts requires all the forecasts.

I am not able to understand what I am doing wrong, I would be really grateful if you could help me

Thanking you,
Kind regards
Stefano

• Max Ghenis

This approach sounds great, particularly for reducing the noise from smaller forecasts by effectively using information from other series. I found that slides 22-23 at https://forecasters.org/wp-content/uploads/gravity_forms/7-2a51b93047891f1ec3608bdbd77ca58d/2014/07/Athanasopoulos_George_ISF2014.pdf evaluate the effectiveness of the approach from a competition-like standpoint, which indicate scaling and averaging outperform other methods like bottom-up, but it’s a bit sparse without the voiceover. Are materials available which go more into the forecast accuracy benefits from using hts methods?

• greddy121

When running gts forecasts on 2 groups, I kept running into “Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory).”

After some debugging it seems this happens because of a few groups that have no values. So I had to remove all columns with zero variance. Is that expected? Is there a better way to handle those groups/columns that end up having no values?

• 1. The new version on github now has a QR option if LU fails. That might help. Otherwise I could look at it if you could provide a minimal reproducible example.

2. Yes, this happens. Setting lambda=0 only guarantees that the base forecasts are positive, but after reconciliation some forecasts might become negative. We are working on a solution to that problem and it will be in the package in the next couple of months.

• Peter Stephensen

Hi Rob. As I understand it, setting lambda=0 implies sending log(Ybase) to the forecast method instead of Ybase. In the standard implementation of your (very cool) optimal method, it makes sure that sum(Ybase) = YHigherLevel. Does lambda=0 implies sum(log(Ybase)) = log(YHigherLevel)? If so: is there a way to get sum(Ybase) = YHigherLevel instead?

• The log transformation is undone before reconciliation. So sum(Ybase) = YHigherLevel even when the lambda argument is used.

• Peter Stephensen

Thanks! Do you know when the new package that you mention under point 2 above will be available? Looking forward to use it.

• It will be part of the hts package, not a new package. It should be included, at least on github, in the next month or so.

• Frank

Hi Professor Hyndman. I am new in forecast. It is very helpful to have a resource like this. I am currently using your fantastic HTS package in R to do forecast. However, there are some questions confused me.

I have a couple of sales data of different SKUs start from 01/01/2013. Some of the series are discontinued after several time lags.(Discontinued in 01/01/2014, there are no data later than then). Some are new item which don’t have data before (Stars at 01/01/2015, there are no data before then). Others are fine.
I try to fill all missed data of these items with zero. However, when I use hts function to do forecast. The results are not very favorable. The fitted value of the missed data are not 0 and the forecast for new introduced item are going very low to almost to 0 for a period (I assume that the model recognize the time series as a seasonal mode since I fill all the missed value with 0).

Could you help me point out what is my problem? Did the hts function recognize the missed value as a seasonal factor?

• Frank

I just come out an idea that I could use bottom-up reconciliation method. But it will be very helpful (since there are some regression relationships between each items) if I could achieve this with Optimal forecast combination method.

• You are wanting to do reconciliation under some constraints (that some forecasts are zero). The hts package does not handle that situation. Maybe it will in the future. You can set it up as a quadratic programming problem.

• OHN

Really basic question, Prof, from a complete novice. Do you have a step-by-step publication that shows how to go from the hierarchical diagram to the R code? We could even use the example from the famous Chapter 9 of the FPandP, on the Australian tourism hierarchy. How do SYD, Other, MEL, Other… at the bottom layer map to the “list” as nodes = list(4, c(2,2,2,2))?

• Have you tried reading the help file? We provide several examples there.

• OHN

That’s what I have not found. Where can I find the help file? I know it’s not in FPandP.
Thanks.

• In R, type help(hts::hts)

• OHN

Did, thanks, Prof. Also scoured links and references, still not clear. Until I ran into this post by someone else from 2 years ago, so I knew that I am not alone:

“Hi Prof thanks for the slides, my biggest problem is constructing the multivariate time series(Y_t) or matrix containing the bottom level series. also how would you construct it if you have real data unlike assigning rnorm numbers to it? ”

Yes, constructing the bts is a stumbling block thus far.

Thanks for patience.

• Claudio Serra

Dear Prof. Hyndman
First, thank you very much for give us your knowledge. Second, I have a question

library(hts)
y <- hts(vn, nodes=list(4,c(2,2,2,2)))
allf <- forecast(y, h=8)
plot(allf)

But, error message appears

Error in hts(vn, nodes = list(4, c(2, 2, 2, 2))) :
argument(s) not (s) (nodes = list(4, c(2, 2, 2, 2)))

Please, what's the problem?? Thank you.

• I cannot reproduce that error.

• Claudio Serra

Excuse me, the error was simple: My version of R was old, as your package hts. Installed the package 5.0, then everything worked fine. Thank you soo much.

Ps: I’m from brazil.

• Claudio Serra

Dear Dr. Hyndman,

Where I get an example of the Top Down series with hts package?

• Top down reconciliation is a method, not a time series example. You can apply top down reconciliation to any hierarchical time series.