biblatex for statisticians

I am now using biblatex for all my bibliographic work as it seems to have developed enough to be stable and reliable. The big advantage of biblatex is that it is easy to format the bibliography to conform to specific journal or publisher styles. It is also possible to have structured bibliographies (e.g., divided into sections: books, papers, R packages, etc.)

Varian on big data

Last week my research group discussed Hal Varian's interesting new paper on "Big data: new tricks for econometrics", Journal of Economic Perspectives, 28(2): 3–28.

It's a nice introduction to trees, bagging and forests, plus a very brief entrée to the LASSO and the elastic net, and to spike-and-slab regression. Not enough to be able to use them, but ok if you've no idea what they are.

To explain or predict?

Last week, my research group discussed Galit Shmueli's paper "To explain or to predict?", Statistical Science, 25(3), 289–310. (See her website for further materials.) This is a paper everyone doing statistics and econometrics should read as it helps to clarify a distinction that is often blurred. In the discussion, the following issues were covered amongst other things.

  1. The AIC is better suited to model selection for prediction as it is asymptotically equivalent to leave-one-out cross-validation in regression, or one-step cross-validation in time series. On the other hand, it might be argued that the BIC is better suited to model selection for explanation, as it is consistent. (See the first sketch after this list.)
  2. P-values are associated with explanation, not prediction. It makes little sense to use p-values to determine the variables in a model that is being used for prediction. (There are problems in using p-values for variable selection in any context, but that is a different issue.)
  3. Multicollinearity has a very different impact when your goal is prediction than when it is estimation. When predicting, multicollinearity is not really a problem provided the values of your predictors lie within the hyper-region of the predictors used when estimating the model. (See the second sketch after this list.)
  4. An ARIMA model has no explanatory use, but is great at short-term prediction.
  5. How to handle missing values in regression is different in a predictive context compared to an explanatory context. For example, when building an explanatory model, we could just use all the data for which we have complete observations (assuming there is no systematic nature to the missingness). But when predicting, you need to be able to predict using whatever data you have. So you might have to build several models, with different numbers of predictors, to allow for different variables being missing. (See the third sketch after this list.)
  6. Many statistics and econometrics textbooks fail to observe these distinctions. In fact, a lot of statisticians and econometricians are trained only in the explanation paradigm, with prediction an afterthought. That is unfortunate as most applied work these days requires predictive modelling, rather than explanatory modelling.
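To illustrate point 1, here is a minimal R sketch, assuming some simulated data of my own (none of this comes from Shmueli's paper), comparing AIC with exact leave-one-out cross-validation for two nested linear regressions.

    # A minimal sketch on simulated data: compare AIC with exact leave-one-out
    # CV for two nested linear regression models.
    set.seed(123)
    n  <- 100
    x1 <- rnorm(n)
    x2 <- rnorm(n)
    y  <- 1 + 2 * x1 + rnorm(n)   # x2 is irrelevant to y

    # For a linear model, leave-one-out CV has a closed form using the hat values
    loo_cv <- function(fit) mean((residuals(fit) / (1 - hatvalues(fit)))^2)

    fit1 <- lm(y ~ x1)
    fit2 <- lm(y ~ x1 + x2)

    # Compare the two criteria across the two models
    AIC(fit1); AIC(fit2)
    loo_cv(fit1); loo_cv(fit2)
    # Both criteria should usually rank the smaller (true) model first here,
    # reflecting the asymptotic equivalence of AIC and leave-one-out CV.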
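For point 3, another small sketch on simulated data: the two predictors are almost perfectly collinear, so the coefficient estimates are poorly determined, yet a prediction made inside the region covered by the training data remains well behaved.

    # A minimal sketch on simulated data: nearly collinear predictors inflate
    # coefficient standard errors, but predictions within the observed
    # predictor region are still fine.
    set.seed(1)
    n  <- 200
    x1 <- rnorm(n)
    x2 <- x1 + rnorm(n, sd = 0.01)          # almost identical to x1
    y  <- 1 + x1 + x2 + rnorm(n)
    fit <- lm(y ~ x1 + x2)

    summary(fit)$coefficients               # large standard errors on x1 and x2
    new <- data.frame(x1 = 0.5, x2 = 0.5)   # a point inside the training region
    predict(fit, newdata = new, interval = "prediction")  # tight interval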
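For point 5, a sketch using made-up data and hypothetical variable names: fit a full model and a reduced fallback model, and switch between them according to which predictors are available at prediction time.

    # A minimal sketch with made-up data: keep a reduced fallback model so a
    # prediction can still be produced when x2 is missing at prediction time.
    set.seed(42)
    train <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
    train$y <- 1 + train$x1 + 0.5 * train$x2 + rnorm(50)

    fit_full    <- lm(y ~ x1 + x2, data = train)   # both predictors available
    fit_reduced <- lm(y ~ x1,      data = train)   # fallback when x2 is missing

    predict_any <- function(newdata) {
      if (anyNA(newdata$x2)) predict(fit_reduced, newdata) else predict(fit_full, newdata)
    }

    predict_any(data.frame(x1 = 0.3, x2 = -0.1))   # uses the full model
    predict_any(data.frame(x1 = 0.3, x2 = NA))     # falls back to the reduced model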


Great papers to read

My research group meets every two weeks. It is always fun to talk about general research issues and new tools and tips we have discovered. We also use some of the time to discuss a paper that I choose for them. Today we discussed Breiman's classic (2001) two cultures paper, something every statistician should read, including the discussion.

I select papers that I want every member of my research team to be familiar with. Usually they are classics in forecasting, or they are recent survey papers.

In the last couple of months we have also read the following papers:

Past, present, and future of statistical science

This is the title of a wonderful new book that has just been released, courtesy of the Committee of Presidents of Statistical Societies.

It can be freely downloaded from the COPSS website, or a hard copy can be purchased on Amazon (for only a little over 10c per page, which is not bad compared to other statistics books).

The book consists of 52 chapters spanning 622 pages. The full table of contents below shows its scope and the list of authors (a veritable who's who in statistics).

Errors on percentage errors

The MAPE (mean absolute percentage error) is a popular measure for forecast accuracy and is defined as

    \[\text{MAPE} = 100\text{mean}(|y_t - \hat{y}_t|/|y_t|)\]

where $y_t$ denotes an observation and $\hat{y}_t$ denotes its forecast, and the mean is taken over $t$.
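As a quick illustration, here is a small R snippet computing the MAPE from this definition; the numbers are invented for the example.

    # Compute the MAPE as defined above, using made-up observations and forecasts
    y    <- c(102, 98, 110, 95)           # observations y_t
    yhat <- c(100, 100, 100, 100)         # forecasts
    100 * mean(abs(y - yhat) / abs(y))    # MAPE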

Armstrong (1985, p.348) was the first (to my knowledge) to point out the asymmetry of the MAPE, saying that "it has a bias favoring estimates that are below the actual values".

My forecasting book now on Amazon

For all those people asking me how to obtain a print version of my book "Forecasting: principles and practice" with George Athanasopoulos, you now can.

[FPP book cover]

Order on Amazon.com

Order on Amazon.co.uk

Order on Amazon.fr

The online book will continue to be freely available. The print version of the book is intended to help fund the development of the OTexts platform.

The price is US$45, £27 or €35.

Compare that to $195 for my previous forecasting textbook, $150 for Fildes and Ord, or $182 for Gonzalez-Rivera. No matter how good the books are, the prices are absurdly high.

OTexts is intended to be a different kind of publisher: all our books are online and free, and those in print will be reasonably priced.

The online version will continue to be updated regularly. The print version is a snapshot of the online version today. We will release a new print edition occasionally, no more than annually and only when the online version has changed enough to warrant a new print edition.

We are planning an offline electronic version as well. I'll announce it here when it is ready.

Top papers in the International Journal of Forecasting

Every year or so, Elsevier asks me to nominate five International Journal of Forecasting papers from the last two years to highlight in their marketing materials as "Editor's Choice". I try to select papers across a broad range of subjects, and I take into account citations and downloads as well as my own impression of the paper. That tends to bias my selection a little towards older papers as they have had more time to accumulate citations. Here are the papers I chose this morning (in the order they appeared):

  1. Diebold and Yilmaz (2012) Better to give than to receive: Predictive directional measurement of volatility spillovers. IJF 28(1), 57–66.
  2. Loterman, Brown, Martens, Mues, and Baesens (2012) Benchmarking regression algorithms for loss given default modeling. IJF 28(1), 161–170.
  3. Soyer and Hogarth (2012) The illusion of predictability: How regression statistics mislead experts. IJF 28(3), 695–711.
  4. Friedman (2012) Fast sparse regression and classification. IJF 28(3), 722–738.
  5. Davydenko and Fildes (2013) Measuring forecasting accuracy: The case of judgmental adjustments to SKU-level demand forecasts. IJF 29(3), 510–522.

Last time I did this, three of the five papers I chose went on to win awards. (I don't pick the award winners; that's a matter for the whole editorial board.) On the other hand, I didn't pick the paper that got the top award for the period 2010–2011. So perhaps my selection is not such a good guide.

Automatic time series forecasting in Granada

In two weeks I am presenting a workshop at the University of Granada (Spain) on Automatic Time Series Forecasting.

Unlike most of my talks, this is not intended to be primarily about my own research. Rather, it is to provide a state-of-the-art overview of the topic (at a level suitable for Masters students in Computer Science). I thought I'd provide some historical perspective on the development of automatic time series forecasting, plus give some comments on current best practices.

Free books on statistical learning

Hastie, Tibshirani and Friedman's Elements of Statistical Learning first appeared in 2001 and is already a classic. It is my go-to book when I need a quick refresher on a machine learning algorithm. I like it because it is written using the language and perspective of statistics, and provides a very useful entry point into the literature of machine learning, which has its own terminology for statistical concepts. A free downloadable pdf version is available on the website.

Recently, a simpler related book appeared, entitled Introduction to Statistical Learning with applications in R by James, Witten, Hastie and Tibshirani. It "is aimed for upper level undergraduate students, masters students and Ph.D. students in the non-mathematical sciences". This would be a great textbook for our new 3rd year subject on Business Analytics. The R code is a welcome addition in showing how to implement the methods. Again, a free downloadable pdf version is available on the website.

There is also a new, free book on Statistical foundations of machine learning by Bontempi and Ben Taieb, available on the OTexts platform. This is more of a handbook, and is written by two authors coming from a machine learning background. R code is also provided. Being an OTexts book, it is continually updated and revised, and is freely available to anyone with a browser.

Thanks to the authors for being willing to make these books freely available.