I gave a seminar at Stanford today. Slides are below. It was definitely the most intimidating audience I’ve faced, with Jerome Friedman, Trevor Hastie, Brad Efron, Persi Diaconis, Susan Holmes, David Donoho and John Chambers all present (and probably other famous names I’ve missed).
Jane Frazier spoke at our research team meeting today on “Reproducibility in computational research”. We had a very stimulating and lively discussion about the issues involved. One interesting idea was that reproducibility is on a scale, and we can all aim to move further along the scale towards making our own research more reproducible. For example:
- Can you reproduce your results tomorrow on the same computer with the same software installed?
- Could someone else on a different computer reproduce your results with the same software installed?
- Could you reproduce your results in three years’ time, after some of your software environment has changed?
Think about what changes you need to make to move one step further along the reproducibility continuum, and do it.
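One small concrete step along that continuum, sketched in R (this is my illustration, not something from Jane’s talk; `renv` is named only as one possible tool): record the software environment alongside your results, so that a future reader can see what has changed.

```r
# Capture the R version and package versions used to produce
# today's results, and save them next to the results themselves.
writeLines(capture.output(sessionInfo()), "session-info.txt")

# For a stronger guarantee, a lockfile-based tool such as renv can
# snapshot and later restore exact package versions:
# renv::snapshot()   # record the current package versions
# renv::restore()    # reinstall the recorded versions later
```

Even the plain-text `session-info.txt` moves you one step along the scale: it turns “the same software installed” from a memory into a checkable record.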
Jane’s slides and handout are below.
I will be speaking at the Chinese R conference in Nanchang, to be held on 24–25 October, on “Forecasting Big Time Series Data using R”.
Details (for those who can read Chinese) are at china-r.org.
I’m back in California for the next couple of weeks, and will be giving the following talk at Stanford and at UC Davis.
Optimal forecast reconciliation for big time series data
Time series can often be naturally disaggregated in a hierarchical or grouped structure. For example, a manufacturing company can disaggregate total demand for their products by country of sale, retail outlet, product type, package size, and so on. As a result, there can be millions of individual time series to forecast at the most disaggregated level, plus additional series to forecast at higher levels of aggregation.
A common constraint is that the disaggregated forecasts need to add up to the forecasts of the aggregated data. This is known as forecast reconciliation. I will show that the optimal reconciliation method involves fitting an ill-conditioned linear regression model where the design matrix has one column for each of the series at the most disaggregated level. For problems involving huge numbers of series, the model is impossible to estimate using standard regression algorithms. I will also discuss some fast algorithms that make the model practicable to apply in business contexts.
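As a sketch of the idea behind the abstract (using standard hierarchical-forecasting notation that the abstract itself does not spell out): if $\boldsymbol{S}$ is the summing matrix whose columns correspond to the most disaggregated series, and $\hat{\boldsymbol{y}}_h$ is the vector of independently produced (“base”) forecasts at horizon $h$, the least-squares reconciled forecasts are

```latex
% S: summing matrix, one column per most-disaggregated series
% \hat{y}_h: vector of base forecasts at horizon h
% \tilde{y}_h: reconciled forecasts, which add up by construction
\tilde{\boldsymbol{y}}_h
  = \boldsymbol{S}\,(\boldsymbol{S}'\boldsymbol{S})^{-1}\boldsymbol{S}'\,\hat{\boldsymbol{y}}_h
```

The matrix $\boldsymbol{S}$ is the design matrix mentioned above; when there are millions of bottom-level series, computing $(\boldsymbol{S}'\boldsymbol{S})^{-1}$ naively is what makes the problem ill-conditioned and expensive, hence the need for fast algorithms.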
June 19–22, 2016
Santander, Spain – Palace of La Magdalena
The International Symposium on Forecasting (ISF) is the premier forecasting conference, attracting the world’s leading forecasting researchers, practitioners, and students. Through a combination of keynote speaker presentations, academic sessions, workshops, and social programs, the ISF provides many excellent opportunities for networking, learning, and fun.
Greg Allenby, The Ohio State University, USA
Todd Clark, Federal Reserve Bank of Cleveland, USA
José Duato, Polytechnic University of Valencia, Spain
Robert Fildes, Lancaster University, United Kingdom
Edward Leamer, UCLA Anderson, USA
Henrik Madsen, Technical University of Denmark
Adrian Raftery, University of Washington, USA
Invited Session Proposals: January 31, 2016
Abstract Submissions: March 16, 2016
Early Registration Ends: May 15, 2016
More information at www.forecasters.org/isf
This is a very different book from my usual areas of forecasting and statistics. It is a personal memoir describing my journey of deconversion from Christianity.
Until a few years ago, I was regularly speaking at church conferences internationally, and my books are still used in Bible classes and Sunday Schools around the world. I even helped establish an innovative new church, which became a model for similar churches in other countries. Eventually I came to the view that I was mistaken, and that there was little or no evidence that the Bible was inspired or that God exists. In this book, I reflect on how I was fooled, and why I changed my mind.
The last issue of the International Journal of Forecasting for 2015 has been released. This one contains the usual mix of topics, plus a special section on “Forecasting in telecommunications and ICT”, including a nice review article by Nigel Meade and Towhidul Islam. Enjoy!
I get asked to review journal papers almost every day, and I have to say no to almost all of them. I know it is hard to find reviewers, but many of these requests indicate very lazy editors. So to all the editors out there looking for reviewers, here is some advice.
- Never ask someone who is an editor for another journal. I am handling about 500 submissions per year for the International Journal of Forecasting, and about 10 per year for the Journal of Statistical Software. There is very little time left to review for other journals. You are much better off identifying someone early in their career, within 10 years of finishing their PhD. They have more time, fewer requests, and are often looking to build an academic reputation.
- Look at the key papers cited in the submission, especially the recent ones, and then check the web sites of their authors. Find someone who is currently working in the area. For multi-authored papers, figure out which author was the PhD student, which was the professor, and so on. If there was a post-doc involved, ask them.
- If that fails, do a Google Scholar search for an author who has written on the same topic recently. That is, in the last 2–3 years, not 10 years ago.
- If possible, ask someone who has recently authored a paper in your journal. They owe you one.
- Ask someone you know rather than a stranger; they are much more likely to say yes. If you don’t know many people, you shouldn’t be an editor.
I’ve always struggled with using plotmath via the expression function in R for adding mathematical notation to axes or legends. For some reason, the most obvious way to write something never seems to work for me, and I end up using trial and error in a loop with far too many iterations.
So I am very happy to see the new latex2exp package, which translates LaTeX expressions into a form suitable for R graphs. This is going to save me time and frustration!
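To show the difference, here is a minimal sketch (the regression-style label is just an illustration; `TeX()` is latex2exp’s translation function):

```r
# Base-R plotmath has its own syntax you must learn: hat() for hats,
# [ ] for subscripts, and == for a displayed equals sign.
plotmath_label <- expression(hat(y)[t] == alpha + beta * x[t])

# latex2exp lets you write the same label in familiar LaTeX instead;
# TeX() translates the string into a plotmath expression.
# (Guarded, in case the package is not installed.)
if (requireNamespace("latex2exp", quietly = TRUE)) {
  latex_label <- latex2exp::TeX("$\\hat{y}_t = \\alpha + \\beta x_t$")
}

# Either form can be passed to xlab, ylab, main or legend():
plot(cumsum(rnorm(50)), type = "l", ylab = plotmath_label)
```

The appeal is that the LaTeX string reads the way you would write the maths anywhere else, instead of requiring a round of plotmath trial and error.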
I am teaching part of a short-course on Data Science for Managers from 10–12 October in Melbourne.
The impact of Data Science on modern business is second only to the introduction of computers. And yet, for many businesses the barrier to entry remains too high due to a lack of know-how, organisational inertia, difficulties in hiring the right people, an apparent need for upfront commitment, and more.
This course is designed to address these barriers, giving you the knowledge and skills needed to establish and manage Data Science functions within your organisation. It takes the anxiety out of the Big Data revolution, and demonstrates how data-driven decision-making can be integrated into your organisation to harness existing advantages and create new opportunities.
Assuming minimal prior knowledge, this course covers the key aspects at a fundamental level: data wrangling, modelling and analysis; predictive, descriptive and prescriptive analytics; data management and curation; standards for data storage and analysis; the use of structured, semi-structured and unstructured data, as well as open public data; and the data-analytic value chain.
More details available at it.monash.edu/data-science.
Early-bird bookings close in a few days.