Visit of Di Cook


13 August 2014

data science
Monash University
research team

Next week, Professor Di Cook from Iowa State University is visiting my research group at Monash University. Di is a world leader in data visualization and is especially well known for her work on interactive graphics and the XGobi and GGobi software. See her book with Deb Swayne for details.

If you would like to hear her speak, read on.

Research seminar

She will be giving a seminar at 2pm on Monday 18 August at the Monash Clayton campus (Rm E457, Menzies Building 11).

Title: Not Drowning, Waving

Abstract: In this technological age we are drowning in data. Good data visualisation helps us to swim, digest the data, and learn about our world. The statistics community creates visualisation systems within the context of data analysis, so the graphics are designed to support and enrich the statistical processes of data exploration, modeling, and inference. As a result, statistical data visualisation has some unique features which differentiate it from visualisations made in other fields. Statisticians are always concerned with variability in observations and error in measurements, both of which cause uncertainty about conclusions drawn from data. Dealing with this uncertainty is at the heart of classical statistics, and statisticians have developed a huge body of inferential methods that help to quantify uncertainty. Statistical data graphics cover a spectrum of methods including elegant static data visualisations and highly interactive and dynamic graphics used for exploratory data analysis.

In this talk, we will explain how graphical methods were used to study the tech boom and bust of the late 1990s, food quality control, atmospheric CO2 levels and temperature changes, stimulus fund spending, university ranking, PISA education data, labor market wages, stock market trends, and soybean breeding for agribusiness. (Ok, only a selection of these will be used.) We will explain how interactive statistical graphics systems are constructed that enable data exploration, how this differs from computer graphics, and how incorporating inferential techniques enables us to determine if what we see is real. We will also describe new developments that allow (almost) everyone to become developers. Good data visualisation equips statisticians to balance skepticism with discovery, and helps business analysts swim with their data.

Meetup talk on data visualization

On Thursday evening, Di will speak at “Visualize That”, an event organized by the Melbourne Data Science Meetup group (beginning at 6pm).

Title: Is Nick Kyrgios the Next Number 1?

Abstract: A 19-year-old tennis player from Canberra, Nick Kyrgios, shocked the tennis world on July 1, 2014 by beating world number 1, Rafael Nadal, in the fourth round of Wimbledon. Nick had already beaten the 13th seed, Richard Gasquet, to get this far. McEnroe, commenting on the match, exclaimed:

“We’ve been waiting for this for a while. We keep saying, ‘Who’s the next guy?’, and I think we found that guy right now.”

Nadal’s language in his post-match interview, by contrast, indicates that he thinks Nick has a long way to go:

“He has things, positive things, to be a good player. But everything is a little bit easier when you are arriving.”

We can take a look at Nick’s stats in comparison to the other players in the tournament, and to the best players, by scraping data from the Wimbledon web site, using the R package XML, and making plots using the R package ggplot2. The statistics available include aces, double faults, % first serves in, % first and second serves won, fastest serve speed, average first and second serve speed, net points won, break points won, receiving points won, winners and unforced errors, for each match.

Research group meeting

Finally, my research group will have a private meeting with Di on data visualization issues. As preparation for our discussions, the group have been asked to watch a Google Tech Talk by Hadley Wickham on interactive graphics. Hadley is very well known in the R world as the author of ggplot2, plyr, and a huge range of other extremely useful packages; he was previously a PhD student of Di Cook.