Mathematical annotations on R plots

I’ve always struggled with using plotmath via the expression function in R for adding mathematical notation to axes or legends. For some reason, the most obvious way to write something never seems to work for me, and I end up using trial and error in a loop with far too many iterations.

So I am very happy to see the new latex2exp package available, which translates LaTeX expressions into a form suitable for R graphs. This is going to save me time and frustration!
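To make the contrast concrete, here is a minimal sketch of the same axis label written both ways. The label and the plotted data are made up for illustration. TeX() is the latex2exp function that converts a LaTeX string into a plotmath expression, and the second plot only runs if that package is installed.

```r
# Made-up data, purely for illustration
x <- seq(0, 1, length.out = 50)
y <- x^2

# plotmath: the label is an R expression, not a character string
label_plotmath <- expression(alpha^2 + beta[1])
plot(x, y, type = "l", xlab = label_plotmath)

# latex2exp: write the label as LaTeX (backslashes are doubled inside an R string)
if (requireNamespace("latex2exp", quietly = TRUE)) {
  label_tex <- latex2exp::TeX("$\\alpha^2 + \\beta_1$")
  plot(x, y, type = "l", xlab = label_tex)
}
```

Both calls hand plot() an expression object, so the LaTeX version can be used anywhere a plotmath expression is accepted.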

Murphy diagrams in R

At the recent International Symposium on Forecasting, held in Riverside, California, Tillman Gneiting gave a great talk on “Evaluating forecasts: why proper scoring rules and consistent scoring functions matter”. It will be the subject of an IJF invited paper in due course.

One of the things he talked about was the “Murphy diagram” for comparing forecasts, as proposed in Ehm et al (2015). Here’s how it works for comparing mean forecasts.
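The following is a rough sketch of the idea rather than the code from the talk or the paper. It assumes my reading of Ehm et al. is right: for a mean forecast x, an observation y and a threshold θ, the elementary score is |y − θ| when θ lies between x and y (that is, min(x, y) ≤ θ < max(x, y)) and zero otherwise, up to a constant scaling. The Murphy diagram plots the average of this score against θ for each forecast. The data, the two forecasts and all function names below are hypothetical.

```r
# Elementary score for mean forecasts (assumed form, up to a constant factor)
elementary_score <- function(theta, x, y) {
  inside <- pmin(x, y) <= theta & theta < pmax(x, y)
  ifelse(inside, abs(y - theta), 0)
}

# Simulated data: observations are a known signal plus noise
set.seed(1)
n <- 200
signal <- rnorm(n)
y <- signal + rnorm(n)
forecast_a <- signal     # hypothetical forecast that uses the signal
forecast_b <- rep(0, n)  # hypothetical forecast that ignores it

# Average elementary score over a grid of thresholds
thetas <- seq(-3, 3, length.out = 101)
murphy_curve <- function(x) {
  sapply(thetas, function(th) mean(elementary_score(th, x, y)))
}
score_a <- murphy_curve(forecast_a)
score_b <- murphy_curve(forecast_b)

# The Murphy diagram: one curve per forecast, lower is better
matplot(thetas, cbind(score_a, score_b), type = "l", lty = 1,
        col = c("blue", "red"),
        xlab = expression(theta), ylab = "Mean elementary score")
legend("topright", legend = c("Forecast A", "Forecast B"),
       col = c("blue", "red"), lty = 1)
```

As I understand the paper, a forecast whose curve lies below the other at every threshold is preferred under every consistent scoring function for the mean, which is what makes the diagram useful.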

Statistical modelling and analysis of big data

I’m currently attending the one-day workshop on this topic at QUT in Brisbane. This morning I spoke on “Visualizing and forecasting big time series data”. My slides are here.

The talks are being streamed.


Big data is now endemic in business, industry, government, environmental management, medical science, social research and so on. One of the commensurate challenges is how to effectively model and analyse these data.

This workshop will bring together national and international experts in statistical modelling and analysis of big data, to share their experiences, approaches and opinions about future directions in this field.

Di Cook is moving to Monash

I’m delighted that Professor Dianne Cook will be joining Monash University in July 2015 as a Professor of Business Analytics. Di is an Australian who has worked in the US for the past 25 years, mostly at Iowa State University. She is moving back to Australia and joining the Department of Econometrics and Business Statistics in the Monash Business School, as part of our initiative in Business Analytics.

Di is a world leader in data visualization, and is well-known for her work on interactive graphics. She is also the academic supervisor of several leading data scientists including Hadley Wickham and Yihui Xie, both of whom work for RStudio.

Di has a great deal of energy and enthusiasm for computational statistics and data visualization, and will play a key role in developing and teaching our new subjects in business analytics.

The Monash Business School is already exceptionally strong in econometrics (ranked 7th in the world on RePEc), and forecasting (ranked 11th on RePEc), and we have recently expanded into actuarial science. With Di joining the department, we will be extending our expertise in the area of data visualization as well.



Visualization of probabilistic forecasts

This week my research group discussed Adrian Raftery’s recent paper on “Use and Communication of Probabilistic Forecasts” which provides a fascinating but brief survey of some of his work on modelling and communicating uncertain futures. Coincidentally, today I was also sent a copy of David Spiegelhalter’s paper on “Visualizing Uncertainty About the Future”. Both are well worth reading.

It made me think about my own efforts to communicate future uncertainty through graphics. Of course, for time series forecasts I normally show prediction intervals. I prefer to use more than one interval at a time because it helps convey a little more information. The default in the forecast package for R is to show both an 80% and a 95% interval.
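To give a feel for how the two interval widths relate, here is a small worked sketch. It assumes normally distributed forecast errors, which is an assumption of this example and not a statement about how the forecast package computes its intervals. The point forecast and forecast standard deviation are hypothetical numbers.

```r
# Hypothetical one-step forecast and forecast standard deviation
point_forecast <- 100
sigma_h <- 8

# Two-sided coverage levels, in percent
levels <- c(80, 95)

# Normal quantile multipliers: about 1.28 for 80% and 1.96 for 95%
z <- qnorm(0.5 + levels / 200)

intervals <- data.frame(
  level = levels,
  lower = point_forecast - z * sigma_h,
  upper = point_forecast + z * sigma_h
)
print(intervals)
```

Under this assumption the 95% interval is roughly one and a half times as wide as the 80% interval, which is why showing both conveys more about the shape of the uncertainty than either alone.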

A new candidate for worst figure

Today I read a paper that had been submitted to the IJF which included the following figure

[Figure omitted]

along with several similar plots. I haven’t seen anything this bad for a long time. In fact, I think I would find it very difficult to reproduce using R, or even Excel (which is particularly adept at bad graphics).

A few years ago I produced “Twenty rules for good graphics”. I think I need to add a couple of additional rules:

  • Represent time changes using lines.
  • Never use fill patterns such as cross-hatching.

(My original rule #20 said Avoid pie charts.)

It would have been relatively simple to show these data as six lines on a plot of GDP against time. That would have made it obvious that the European GDP was shrinking, the GDP of Asia/Oceania was increasing, while other regions of the world were fairly stable. At least I think that is what is happening, but it is very hard to tell from such graphical obfuscation.

Visit of Di Cook

Next week, Professor Di Cook from Iowa State University is visiting my research group at Monash University. Di is a world leader in data visualization, and is especially well-known for her work on interactive graphics and the XGobi and GGobi software. See her book with Deb Swayne for details.

For those wanting to hear her speak, read on.

Reflections on UseR! 2013

This week I’ve been at the R Users conference in Albacete, Spain. These conferences are a little unusual in that they are not really about research, unlike most conferences I attend. They provide a place for people to discuss and exchange ideas on how R can be used.

Here are some thoughts and highlights of the conference, in no particular order.