There is a one-day workshop on this topic on 23 February 2015 at QUT in Brisbane. I will be speaking on “Visualizing and forecasting big time series data”.
Big data is now endemic in business, industry, government, environmental management, medical science, social research and so on. One of the commensurate challenges is how to effectively model and analyse these data.
This workshop will bring together national and international experts in statistical modelling and analysis of big data, to share their experiences, approaches and opinions about future directions in this field.
The workshop programme will commence at 8.30am and close at 5pm. Registration is free; however, numbers are strictly limited, so please ensure you register when you receive your invitation via email. Morning and afternoon tea will be provided; participants will need to purchase their own lunch.
Further details will be made available in early January.
This week my research group discussed Adrian Raftery’s recent paper on “Use and Communication of Probabilistic Forecasts”, which provides a fascinating but brief survey of some of his work on modelling and communicating uncertain futures. Coincidentally, today I was also sent a copy of David Spiegelhalter’s paper on “Visualizing Uncertainty About the Future”. Both are well worth reading.
It made me think about my own efforts to communicate future uncertainty through graphics. Of course, for time series forecasts I normally show prediction intervals. I prefer to use more than one interval at a time because it helps convey a little more information. The default in the forecast package for R is to show both an 80% and a 95% interval.
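As a rough sketch of what showing two intervals at once involves, the following computes 80% and 95% prediction intervals for a naive (last value) forecast. This is not the forecast package's own code, and it is written in Python rather than R. It assumes normally distributed one-step errors with zero mean, so the interval half-width at horizon h is z × σ × √h. The function name `naive_intervals` and the toy series are made up for illustration.

```python
from statistics import NormalDist


def naive_intervals(y, h, levels=(80, 95)):
    """Naive forecasts for horizons 1..h with normal prediction intervals.

    Assumption: one-step changes are i.i.d. normal with zero mean, so the
    forecast standard deviation at horizon k is sigma * sqrt(k).
    """
    diffs = [b - a for a, b in zip(y, y[1:])]
    # Standard deviation of one-step changes, taking their mean as zero.
    sigma = (sum(d * d for d in diffs) / len(diffs)) ** 0.5
    point = y[-1]  # the naive forecast is the last observed value
    rows = []
    for step in range(1, h + 1):
        row = {"h": step, "point": point}
        for level in levels:
            # Two-sided interval: 80% uses the 0.90 quantile, 95% the 0.975 one.
            z = NormalDist().inv_cdf(0.5 + level / 200)
            half = z * sigma * step ** 0.5
            row[level] = (point - half, point + half)
        rows.append(row)
    return rows


# Toy series, invented purely for illustration.
toy = [10, 12, 11, 13, 12, 14]
for row in naive_intervals(toy, h=3):
    lo80, hi80 = row[80]
    lo95, hi95 = row[95]
    print(f"h={row['h']}: 80% [{lo80:.2f}, {hi80:.2f}]  95% [{lo95:.2f}, {hi95:.2f}]")
```

The 80% interval always sits inside the 95% interval, and both widen with the horizon. That nesting is the extra information a single interval cannot show.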
Today I read a paper submitted to the IJF that included the following figure, along with several similar plots. I haven’t seen anything this bad for a long time. In fact, I think I would find it very difficult to reproduce using R, or even Excel (which is particularly adept at bad graphics).
A few years ago I produced “Twenty rules for good graphics”. I think I need to add a couple more rules:
- Represent time changes using lines.
- Never use fill patterns such as cross-hatching.
(My original rule #20 said “Avoid pie charts”.)
It would have been relatively simple to show these data as six lines on a plot of GDP against time. That would have made it obvious that the European GDP was shrinking, the GDP of Asia/Oceania was increasing, while other regions of the world were fairly stable. At least I think that is what is happening, but it is very hard to tell from such graphical obfuscation.
Next week, Professor Di Cook from Iowa State University is visiting my research group at Monash University. Di is a world leader in data visualization, and is especially well known for her work on interactive graphics and the XGobi and GGobi software. See her book with Deb Swayne for details.
For those wanting to hear her speak, read on.
This week I’ve been at the R Users conference in Albacete, Spain. These conferences are a little unusual in that they are not really about research, unlike most conferences I attend. They provide a place for people to discuss and exchange ideas on how R can be used.
Here are some thoughts and highlights of the conference, in no particular order.
When I want to insert figures generated in R into a LaTeX document, it looks better if I first remove the white space around the figure. Unfortunately, R does not make this easy, as the graphs are generated to look good on a screen, not in a document.
There are two things that can be done to fix this problem.
Today I was writing a report which included 20 figures, with the names
demandplot20.pdf, and all with similar captions. Clearly a loop was required. After all, LaTeX is a programming language, so we should be able to take advantage of its capabilities.
The Australian Young Statisticians Conference (Feb 2013) is organizing a communication competition. They invite all early-career statisticians (studying, or within 5 years of graduation) to produce a short (3–5 minute) video for the ABS YSC2013 Video Competition, or a static infographic for the ABS YSC2013 Infographic Competition.
Both competitions have a 1st prize of $500 and a 2nd prize of $250.
Entries close 16th November, and winners will be notified by mid-December.
Details available at: ysc2013.com/program/competitions/
I’m a speaker at the conference, so hopefully I will get to see some of the great entries!
For those who have not read the seminal works of Tufte and Cleveland, please hang your heads in shame. To salvage some sense of self-worth, you can then head over to Solomon Messing’s blog where he is starting a series on data visualization based on the principles developed by Tufte and Cleveland (with R examples).
The classics are also worth reading, and remain relevant despite the 20 or 30 years that have elapsed since they appeared.