A blog by Rob J Hyndman 


Errors on percentage errors

Published on 16 April 2014

The MAPE (mean absolute percentage error) is a popular measure for forecast accuracy and is defined as

    \[\text{MAPE} = 100\text{mean}(|y_t - \hat{y}_t|/|y_t|)\]

where y_t denotes an observation and \hat{y}_t denotes its forecast, and the mean is taken over t.

Armstrong (1985, p.348) was the first (to my knowledge) to point out the asymmetry of the MAPE, saying that “it has a bias favoring estimates that are below the actual values”. (more…)
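
One way to see where the asymmetry comes from (a minimal sketch with made-up numbers, not taken from the rest of the post): when the data and forecasts are positive, an under-forecast's absolute percentage error can never exceed 100%, whereas an over-forecast's can grow without bound.

    import numpy as np

    def mape(y, yhat):
        """MAPE in percent: 100 * mean(|y_t - yhat_t| / |y_t|)."""
        y = np.asarray(y, dtype=float)
        yhat = np.asarray(yhat, dtype=float)
        return 100 * np.mean(np.abs(y - yhat) / np.abs(y))

    actual = np.array([100.0])

    # Under-forecasts: the percentage error is capped at 100% for positive data
    print(mape(actual, [50.0]))    # 50.0
    print(mape(actual, [0.0]))     # 100.0 (the worst an under-forecast can do)

    # Over-forecasts: the percentage error is unbounded
    print(mape(actual, [150.0]))   # 50.0
    print(mape(actual, [300.0]))   # 200.0

A method tuned to minimise the MAPE is therefore pulled towards forecasts that sit below the actual values.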

 

Generating tables in LaTeX

Published on 15 April 2014

Typing tables in LaTeX can get messy, but there are some good tools to simplify the process. One I discovered this week is tablesgenerator.com, a web-based tool for generating LaTeX tables. It also allows the table to be saved in other formats, including HTML and Markdown. The interface is simple, but it does most things. For complicated tables, some additional formatting may be necessary. (more…)
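
For reference, the output such a tool produces is just an ordinary tabular environment. A minimal hand-written sketch (not actual tablesgenerator.com output, and with made-up numbers) looks something like this:

    \begin{table}[htbp]
      \centering
      \caption{An illustrative results table.}
      \begin{tabular}{lrr}
        \hline
        Method & MAPE & MAE \\
        \hline
        Naive & 12.3 & 4.5 \\
        ETS   &  8.1 & 3.2 \\
        ARIMA &  7.9 & 3.1 \\
        \hline
      \end{tabular}
    \end{table}

Anything fancier, such as multi-column cells or booktabs rules, usually still needs a little hand editing of the generated code.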

 

My forecasting book now on Amazon

Published on 9 April 2014

For all those people asking me how to obtain a print version of my book “Forecasting: principles and practice” with George Athanasopoulos, you now can.

FPP cover

Order on Amazon.com

Order on Amazon.co.uk

Order on Amazon.fr

The online book will continue to be freely available. The print version of the book is intended to help fund the development of the OTexts platform.

The price is US$45, £27 or €35.

Compare that to $195 for my previous forecasting textbook, $150 for Fildes and Ord, or $182 for Gonzalez-Rivera. No matter how good the books are, the prices are absurdly high.

OTexts is intended to be a different kind of publisher — all our books are online and free, and those in print will be reasonably priced.

The online version will continue to be updated regularly. The print version is a snapshot of the online version today. We will release a new print edition occasionally, no more than annually and only when the online version has changed enough to warrant a new print edition.

We are planning an offline electronic version as well. I’ll announce it here when it is ready.

 

Job at Center for Open Science

Published on 8 April 2014

This looks like an interesting job.

Dear Dr. Hyndman,

I write from the Center for Open Science, a non-profit organization based in Charlottesville, Virginia in the United States, which is dedicated to improving the alignment between scientific values and scientific practices. We are dedicated to open source and open science.

We are reaching out to you to find out if you know anyone who might be interested in our Statistical and Methodological Consultant position.

The position is a unique opportunity to consult on reproducible best practices in data analysis and research design; the consultant will make short visits to provide lectures and training at universities, laboratories, conferences, and through virtual mediums. An especially unique part of the job involves collaborating with the White House’s Office of Science and Technology Policy on matters relating to reproducibility.

If you know someone with substantial training and experience in scientific research, quantitative methods, reproducible research practices, and some programming experience (at least R, ideally Python or Julia), might you please pass this along to them?

Anyone may find out more about the job or apply via our website:

http://centerforopenscience.org/jobs/#stats

The position is full-time and located at our office in beautiful Charlottesville, VA.

Thanks in advance for your time and help.

 

Interpreting noise

Published on 6 April 2014

When watching the TV news, or reading newspaper commentary, I am frequently amazed at the attempts people make to interpret random noise.

For example, the latest tiny fluctuation in the share price of a major company is attributed to the CEO being ill. When the exchange rate goes up, the TV finance commentator confidently announces that it is a reaction to Chinese building contracts. No one ever says “The unemployment rate has dropped by 0.1% for no apparent reason.”

What is going on here is that the commentators are assuming we live in a noise-free world. They imagine that everything is explicable; you just have to find the explanation. However, the world is noisy — real data are subject to random fluctuations, and are often also measured inaccurately. So to interpret every little fluctuation is silly and misleading. (more…)

 

Getting a LaTeX system set up

Published on 4 April 2014

Today I was teaching the honours students in econometrics and economics about LaTeX. Here are some brief instructions on how to set up a LaTeX system on different operating systems. (more…)

 

Cover of my forecasting textbook

Published on 18 March 2014

We now have a cover for the print version of my forecasting book with George Athanasopoulos.

FPP cover

It should be on Amazon in a couple of weeks. The book is also freely available online.

This is a variation of the most popular one in the poll conducted a month or two ago.

The cover was produced by Scarlett Rugers, whom I can happily recommend to anyone wanting a book cover designed.

 

Fast computation of cross-validation in linear models

Published on 17 March 2014

The leave-one-out cross-validation statistic is given by

    \[\text{CV} = \frac{1}{N} \sum_{i=1}^N e_{[i]}^2,\]

where e_{[i]} = y_i - \hat{y}_{[i]}, ~y_1,\dots,y_N are the observations, and \hat{y}_{[i]} is the predicted value obtained when the model is estimated with the ith case deleted. This is also sometimes known as the PRESS (Prediction Residual Sum of Squares) statistic.

It turns out that for linear models, we do not actually have to estimate the model N times, once for each omitted case. Instead, CV can be computed after estimating the model once on the complete data set. (more…)
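
The standard identity behind this is that, for a linear model estimated by least squares, e_{[i]} = e_i/(1-h_i), where e_i is the ordinary residual and h_i is the ith diagonal of the hat matrix. Here is a minimal sketch in Python on simulated data (not taken from the post), comparing the shortcut with the brute-force computation:

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulated regression data, purely for illustration
    N = 50
    X = np.column_stack([np.ones(N), rng.normal(size=(N, 3))])
    y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(size=N)

    # Fit once on the complete data set
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta                                 # ordinary residuals
    h = np.diag(X @ np.linalg.solve(X.T @ X, X.T))   # leverages (hat matrix diagonal)

    # Shortcut: e_[i] = e_i / (1 - h_i)
    cv_fast = np.mean((e / (1 - h)) ** 2)

    # Brute force: refit N times, leaving out one observation each time
    cv_slow = 0.0
    for i in range(N):
        keep = np.arange(N) != i
        b_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        cv_slow += (y[i] - X[i] @ b_i) ** 2
    cv_slow /= N

    print(cv_fast, cv_slow)   # the two agree up to floating-point error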

 

Probabilistic forecasting by Gneiting and Katzfuss (2014)

Published on 14 March 2014

The IJF is introducing occasional review papers on areas of forecasting. We did a whole issue in 2006 reviewing 25 years of research since the International Institute of Forecasters was established. Since then, there has been a lot of new work in application areas such as call center forecasting and electricity price forecasting. In addition, there are areas we did not cover in 2006, including new product forecasting and forecasting in finance. There have also been methodological and theoretical developments over the last eight years. Consequently, I’ve started inviting eminent researchers to write survey papers for the journal.

One obvious choice was Tilmann Gneiting, who has produced a large body of excellent work on probabilistic forecasting in the last few years. The theory of forecasting was badly in need of development, and Tilmann and his coauthors have made several great contributions in this area. However, when I asked him to write a review he explained that another journal had got in before me, and that the review was already written. It appeared in the very first volume of the new journal Annual Review of Statistics and Its Application: Gneiting and Katzfuss (2014), Probabilistic Forecasting, pp. 125–151.

Having now read it, I’m both grateful for this more accessible introduction to the area, and disappointed that it didn’t end up in the International Journal of Forecasting. I forecast that it will be highly cited (although I won’t calculate a forecast distribution or compute a scoring function for that).

Also, good luck to the new journal; it looks like it will be very useful, and is sure to have a high impact factor given it publishes review articles.

 

Testing for trend in ARIMA models

Published on 13 March 2014

Today’s email brought this one:

I was wondering if I could get your opinion on a particular problem that I have run into during the reviewing process of an article.

Basically, I have an analysis where I am looking at a couple of time-series and I wanted to know if, over time, there was an upward trend in the series. Inspection of the raw data suggests there is, but we want some statistical evidence for this.

To achieve this I ran some ARIMA (0,1,1) models including a drift/trend term to see if the mean of the series did indeed shift upwards with time and found that it did. However, we have run into an issue with a reviewer who argues that differencing removes trends and may not be a suitable way to detect trends. Therefore, the fact that we found a trend despite differencing suggests that differencing was not successful. I know there are a few papers and textbooks that use ARIMA (0,1,1) models as ‘random walks with drift’-type models, so I cited them as examples of this procedure in action, but they remained unconvinced.

Instead it was suggested that I look for trends in the raw undifferenced time-series, as these would be more reliable since no trends had been removed. At the moment I am hesitant to do this as I was sort of taught that even pure random walks could give you significant trends. Moreover, given that the raw time-series is not stationary, I was worried that an ARIMA (0,0,1) model, as it would then be, might not actually be appropriate.

There’s nothing like running into ignorant reviewers who want you to do things that make no sense. (more…)
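
On the specific worry that even pure random walks can produce apparently significant trends, a quick simulation makes the point (a sketch on simulated data, not part of the original post): regressing driftless random walks on time yields a "significant" slope far more often than the nominal 5%.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    n_sims, n_obs = 1000, 100
    t = np.arange(n_obs)
    n_sig = 0
    for _ in range(n_sims):
        rw = np.cumsum(rng.normal(size=n_obs))   # pure random walk, no drift
        if stats.linregress(t, rw).pvalue < 0.05:
            n_sig += 1

    print(n_sig / n_sims)   # typically well above 0.05

That is exactly the concern raised in the email about looking for trends in the raw undifferenced series.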

 