A blog by Rob J Hyndman 


I am not an econometrician

Published on 21 July 2014

I am a statistician, but I have worked in a department of predominantly econometricians for the past 17 years. It is a little like an Australian visiting the United States. Initially, it seems that we talk the same language, do the same sorts of things, and have a very similar culture. But the longer you stay there, the more you realise there are differences that run deep and affect the way you see the world.

Last week at my research group meeting, I spoke about some of the differences I have noticed. Coincidentally, Andrew Gelman blogged about the same issue a day later.


Variations on rolling forecasts

Published on 16 July 2014

Rolling forecasts are commonly used to compare time series models. Here are a few of the ways they can be computed using R. I will use ARIMA models as a vehicle of illustration, but the code can easily be adapted to other univariate time series models.
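As a rough illustration of the idea, here is a minimal sketch of one-step rolling forecasts. It is written in Python rather than R so that it needs no packages, and it swaps in a least-squares AR(1) model as a stand-in for an ARIMA model. The function names (`fit_ar1`, `rolling_one_step`, `mae`) and the simulated series are assumptions made for this example only. The sketch contrasts two common variations: re-estimating the model at every forecast origin, and estimating it once and only updating the data it conditions on.

```python
import random


def fit_ar1(y):
    """Least-squares fit of y[t] = c + phi * y[t-1] + e[t]; returns (c, phi)."""
    x, z = y[:-1], y[1:]
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    phi = sxz / sxx if sxx > 0 else 0.0
    return mz - phi * mx, phi


def rolling_one_step(y, n_train, refit=True):
    """One-step forecast errors from a rolling forecast origin.

    The origin moves forward one observation at a time. With refit=True the
    model is re-estimated on all data up to each origin (expanding window).
    With refit=False it is estimated once on the first n_train observations
    and only the most recent observation is updated.
    """
    c, phi = fit_ar1(y[:n_train])
    errors = []
    for origin in range(n_train, len(y)):
        if refit:
            c, phi = fit_ar1(y[:origin])
        forecast = c + phi * y[origin - 1]
        errors.append(y[origin] - forecast)
    return errors


def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)


# Simulated AR(1) series, purely for illustration.
random.seed(1)
y = [0.0]
for _ in range(199):
    y.append(0.7 * y[-1] + random.gauss(0, 1))

errors_refit = rolling_one_step(y, n_train=100, refit=True)
errors_fixed = rolling_one_step(y, n_train=100, refit=False)
print("MAE, re-estimated at each origin:", round(mae(errors_refit), 3))
print("MAE, estimated once:", round(mae(errors_fixed), 3))
```

Any other univariate model could be substituted by replacing `fit_ar1` and the forecast line; the rolling loop itself would not change.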


SAS/IIF grants

Published on 15 July 2014

Every year, the International Institute of Forecasters, in conjunction with SAS, offers some small grants to help promote research in forecasting. There are two $5000 grants per year for research on forecasting methodology and applications. This year, applications close on 30 September 2014. More details are given here.

Information about past SAS-IIF awards is given on the IIF website. It is interesting to see the range of topics covered. Here are the winning projects from the last two years:

  • Jeffrey Stonebraker: “Probabilistic Forecasting of the Global Demand for the Treatment of Hemophilia B.”
  • Yongchen (Herbert) Zhao: “Robust Real-Time Automated Forecast Combination in SAS: Development of a SAS Procedure and a Comprehensive Evaluation of Recently Developed Combination Methods.”
  • Zoe Theocharis, Nigel Harvey, Leonard Smith: “Improving judgmental input to hurricane forecasts in the insurance and reinsurance sector.”
  • Elena-Ivona Dumitrescu, Janine Christine Balter, Peter Reinhard Hansen: “Forecasting Exchange Rate Volatility: Multivariate Realized GARCH Framework.”
  • Yorghos Tripodis: “Forecasting the Cognitive Status in an Aging Population.”

Varian on big data

Published on 16 June 2014

Last week my research group discussed Hal Varian’s interesting new paper, “Big data: new tricks for econometrics”, Journal of Economic Perspectives, 28(2): 3–28.

It’s a nice introduction to trees, bagging and forests, plus a very brief entrée to the LASSO and the elastic net, and to spike-and-slab regression. Not enough to be able to use them, but OK if you’ve no idea what they are.


Specifying complicated groups of time series in hts

Published on 15 June 2014

With the latest version of the hts package for R, it is now possible to specify rather complicated grouping structures relatively easily.

All aggregation structures can be represented as hierarchies or as cross-products of hierarchies. For example, a hierarchical time series may be based on geography: country, state, region, store. Often there is also a separate product hierarchy: product groups, product types, packet size. Forecasts of all the different types of aggregation are required; e.g., product type A within region X. The aggregation structure is a cross-product of the two hierarchies.

This framework includes even apparently non-hierarchical data: consider the simple case of a time series of deaths split by sex and state. We can consider sex and state as two very simple hierarchies with only one level each. Then we wish to forecast the aggregates of all combinations of the two hierarchies.

Any number of separate hierarchies can be combined in this way. Non-hierarchical factors such as sex can be treated as single-level hierarchies.
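To make the cross-product idea concrete, here is a minimal sketch of the deaths-by-sex-and-state case. It is plain Python, not the hts package, and it only handles single-level factors. The function `aggregate_grouped`, the use of `None` to mean "summed over this factor", and the numbers are all assumptions made for this illustration.

```python
from itertools import product


def aggregate_grouped(bottom, n_factors):
    """Sum bottom-level series over every combination of grouping factors.

    `bottom` maps a tuple of factor levels, e.g. ("female", "VIC"), to a list
    of observations. Each aggregate is keyed by a tuple in which None means
    "summed over this factor", so (None, None) is the grand total.
    """
    out = {}
    for key, series in bottom.items():
        # Each mask says which factors to keep and which to sum over.
        for mask in product([True, False], repeat=n_factors):
            agg_key = tuple(k if keep else None for k, keep in zip(key, mask))
            if agg_key not in out:
                out[agg_key] = [0] * len(series)
            out[agg_key] = [a + b for a, b in zip(out[agg_key], series)]
    return out


# Hypothetical deaths by sex and state over three periods.
deaths = {
    ("female", "NSW"): [10, 12, 11],
    ("female", "VIC"): [8, 9, 7],
    ("male", "NSW"): [13, 11, 14],
    ("male", "VIC"): [9, 10, 8],
}
all_series = aggregate_grouped(deaths, n_factors=2)
print(len(all_series))               # total + 2 sexes + 2 states + 4 bottom-level series
print(all_series[(None, None)])      # grand total
print(all_series[("female", None)])  # females, all states
print(all_series[(None, "VIC")])     # VIC, both sexes
```

With two sexes and two states there are nine series in total: the grand total, two by sex, two by state, and the four bottom-level series. Forecasts are wanted for all of them.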


European talks. June-July 2014

Published on 14 June 2014

For the next month I am travelling in Europe and will be giving the following talks.

17 June. Challenges in forecasting peak electricity demand. Energy Forum, Sierre, Valais/Wallis, Switzerland.

20 June. Common functional principal component models for mortality forecasting. International Workshop on Functional and Operatorial Statistics. Stresa, Italy.

24–25 June. Functional time series with applications in demography. Humboldt University, Berlin.

1 July. Fast computation of reconciled forecasts in hierarchical and grouped time series. International Symposium on Forecasting, Rotterdam, Netherlands.


Creating a handout from beamer slides

Published on 11 June 2014

I’m about to head off on a speaking tour to Europe (more on that in another post) and one of my hosts has asked for my PowerPoint slides so they can print them. They have made two false assumptions: (1) that I use PowerPoint; (2) that my slides are static so they can be printed.

Instead, I produced a cut-down version of my beamer slides, leaving out some of the animations and other features that will not print easily. Then I produced a PDF file with several slides per page.


Data science market places

Published on 26 May 2014

Some new websites are being established offering “market places” for data science. Two I’ve come across recently are Experfy and SnapAnalytx.


Structural breaks

Published on 23 May 2014

I’m tired of reading about tests for structural breaks and here’s why.

A structural break occurs when we see a sudden change in a time series or a relationship between two time series. Econometricians love papers on structural breaks, and apparently believe in them. Personally, I tend to take a different view of the world. I think a more realistic view is that most things change slowly over time, and only occasionally with sudden discontinuous change.


To explain or predict?

Published on 19 May 2014

Last week, my research group discussed Galit Shmueli’s paper “To explain or to predict?”, Statistical Science, 25(3), 289–310. (See her website for further materials.) This is a paper everyone doing statistics and econometrics should read, as it helps to clarify a distinction that is often blurred. In the discussion, the following issues were covered, amongst other things.

  1. The AIC is better suited to model selection for prediction as it is asymptotically equivalent to leave-one-out cross-validation in regression, or one-step cross-validation in time series. On the other hand, it might be argued that the BIC is better suited to model selection for explanation, as it is consistent.
  2. P-values are associated with explanation, not prediction. It makes little sense to use p-values to determine the variables in a model that is being used for prediction. (There are problems in using p-values for variable selection in any context, but that is a different issue.)
  3. Multicollinearity has a very different impact when your goal is prediction than when your goal is estimation. When predicting, multicollinearity is not really a problem provided the values of your predictors lie within the hyper-region of the predictors used when estimating the model.
  4. An ARIMA model has no explanatory use, but is great at short-term prediction.
  5. How to handle missing values in regression is different in a predictive context compared to an explanatory context. For example, when building an explanatory model, we could just use all the data for which we have complete observations (assuming there is no systematic nature to the missingness). But when predicting, you need to be able to predict using whatever data you have. So you might have to build several models, with different numbers of predictors, to allow for different variables being missing.
  6. Many statistics and econometrics textbooks fail to observe these distinctions. In fact, a lot of statisticians and econometricians are trained only in the explanation paradigm, with prediction an afterthought. That is unfortunate as most applied work these days requires predictive modelling, rather than explanatory modelling.
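The first point can be explored numerically. The sketch below is an illustration under stated assumptions, not a proof of the asymptotic result. It computes AIC, BIC and the leave-one-out cross-validation (LOOCV) mean squared error for two Gaussian regression models, one with an intercept only and one with an intercept and slope. The AIC and BIC are computed up to an additive constant as n·log(RSS/n) plus the penalty. The function name `fit_stats` and the simulated data are assumptions made for this example.

```python
import math
import random


def fit_stats(x, y, use_slope):
    """Return (AIC, BIC, LOOCV mean squared error) for a Gaussian linear model.

    use_slope=False fits y = a + e; use_slope=True fits y = a + b*x + e.
    AIC and BIC omit additive constants that are the same for both models.
    """
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx if use_slope else 0.0
    a = my - b * mx
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    # Leverages (diagonal of the hat matrix) for each model.
    lev = [1 / n + ((xi - mx) ** 2 / sxx if use_slope else 0.0) for xi in x]
    rss = sum(e ** 2 for e in resid)
    k = (2 if use_slope else 1) + 1  # coefficients plus the error variance
    aic = n * math.log(rss / n) + 2 * k
    bic = n * math.log(rss / n) + k * math.log(n)
    # Leave-one-out residuals via the identity e_i / (1 - h_i); no refitting needed.
    loocv = sum((e / (1 - h)) ** 2 for e, h in zip(resid, lev)) / n
    return aic, bic, loocv


# Simulated data with a genuine slope, purely for illustration.
random.seed(42)
x = [i / 10 for i in range(50)]
y = [1.0 + 2.0 * xi + random.gauss(0, 1) for xi in x]

for label, use_slope in [("mean only", False), ("with slope", True)]:
    aic, bic, cv = fit_stats(x, y, use_slope)
    print(f"{label:10s}  AIC={aic:7.2f}  BIC={bic:7.2f}  LOOCV MSE={cv:6.3f}")
```

In this example the signal is strong, so all three criteria should prefer the model with a slope. The interesting comparisons arise with weaker signals and more candidate models, where AIC and LOOCV tend to rank models similarly while BIC penalises extra parameters more heavily.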


