Version 7 of the forecast package was released on CRAN about a month ago, but I'm only just getting around to posting about the new features.
The most visible feature was the introduction of ggplot2 graphics. I first wrote the forecast package before ggplot2 existed, and so only base graphics were available. But I figured it was time to modernize and use the nice features available from ggplot2. The following examples illustrate the main new graphical functionality.
For illustration purposes, I'm using the male and female monthly deaths from lung diseases in the UK.
Continue reading →
I often see figures with two sets of prediction intervals plotted on the same graph using different line types to distinguish them. The results are almost always unreadable. A better way to do this is to use semi-transparent shaded regions. Here is an example showing two sets of forecasts for the Nile River flow.
f1 = forecast(auto.arima(Nile, lambda=0), h=20, level=95)
f2 = forecast(ets(Nile), h=20, level=95)
plot(f1, shadecol=rgb(0,0,1,.4), flwd=1,
main="Forecasts of Nile River flow",
xlab="Year", ylab="Billions of cubic metres")
The blue region shows 95% prediction intervals for the ARIMA forecasts, while the red region shows 95% prediction intervals for the ETS forecasts. Where they overlap, the colors blend to make purple. In this case, the point forecasts are quite close, but the prediction intervals are relatively different.
I’ve always struggled with using
plotmath via the
expression function in R for adding mathematical notation to axes or legends. For some reason, the most obvious way to write something never seems to work for me and I end up using trial and error in a loop with far too many iterations.
So I am very happy to see the new latex2exp package available which translates LaTeX expressions into a form suitable for R graphs. This is going to save me time and frustration! Continue reading →
At the recent International Symposium on Forecasting, held in Riverside, California, Tillman Gneiting gave a great talk on “Evaluating forecasts: why proper scoring rules and consistent scoring functions matter”. It will be the subject of an IJF invited paper in due course.
One of the things he talked about was the “Murphy diagram” for comparing forecasts, as proposed in Ehm et al (2015). Here’s how it works for comparing mean forecasts. Continue reading →
I’m currently attending the one day workshop on this topic at QUT in Brisbane. This morning I spoke on “Visualizing and forecasting big time series data”. My slides are here.
The talks are being streamed.
Big data is now endemic in business, industry, government, environmental management, medical science, social research and so on. One of the commensurate challenges is how to effectively model and analyse these data.
This workshop will bring together national and international experts in statistical modelling and analysis of big data, to share their experiences, approaches and opinions about future directions in this field.
I’m currently visiting Taiwan and I’m giving two seminars while I’m here — one at the National Tsing Hua University in Hsinchu, and the other at Academia Sinica in Taipei. Details are below for those who might be nearby. Continue reading →
I’m delighted that Professor Dianne Cook will be joining Monash University in July 2015 as a Professor of Business Analytics. Di is an Australian who has worked in the US for the past 25 years, mostly at Iowa State University. She is moving back to Australia and joining the Department of Econometrics and Business Statistics in the Monash Business School, as part of our initiative in Business Analytics.
Di is a world leader in data visualization, and is well-known for her work on interactive graphics. She is also the academic supervisor of several leading data scientists including Hadley Wickham and Yihui Xie, both of whom work for RStudio.
Di has a great deal of energy and enthusiasm for computational statistics and data visualization, and will play a key role in developing and teaching our new subjects in business analytics.
The Monash Business School is already exceptionally strong in econometrics (ranked 7th in the world on RePEc), and forecasting (ranked 11th on RePEc), and we have recently expanded into actuarial science. With Di joining the department, we will be extending our expertise in the area of data visualization as well.
This week my research group discussed Adrian Raftery’s recent paper on “Use and Communication of Probabilistic Forecasts” which provides a fascinating but brief survey of some of his work on modelling and communicating uncertain futures. Coincidentally, today I was also sent a copy of David Spiegelhalter’s paper on “Visualizing Uncertainty About the Future”. Both are well-worth reading.
It made me think about my own efforts to communicate future uncertainty through graphics. Of course, for time series forecasts I normally show prediction intervals. I prefer to use more than one interval at a time because it helps convey a little more information. The default in the forecast package for R is to show both an 80% and a 95% interval like this: Continue reading →
Today I read a paper that had been submitted to the IJF which included the following figure
along with several similar plots. (Click for a larger version.) I haven’t seen anything this bad for a long time. In fact, I think I would find it very difficult to reproduce using R, or even Excel (which is particularly adept at bad graphics).
A few years ago I produced “Twenty rules for good graphics”. I think I need to add a couple of additional rules:
- Represent time changes using lines.
- Never use fill patterns such as cross-hatching.
(My original rule #20 said Avoid pie charts.)
It would have been relatively simple to show these data as six lines on a plot of GDP against time. That would have made it obvious that the European GDP was shrinking, the GDP of Asia/Oceania was increasing, while other regions of the world were fairly stable. At least I think that is what is happening, but it is very hard to tell from such graphical obfuscation.
Next week, Professor Di Cook from Iowa State University is visiting my research group at Monash University. Di is a world leader in data visualization, and is especially well-known for her work on interactive graphics and the XGobi and GGobi software. See her book with Deb Swayne for details.
For those wanting to hear her speak, read on. Continue reading →