I am now using biblatex for all my bibliographic work as it seems to have developed enough to be stable and reliable. The big advantage of biblatex is that it is easy to format the bibliography to conform to specific journal or publisher styles. It is also possible to have structured bibliographies (e.g., divided into sections: books, papers, R packages, etc.).
Last week my research group discussed Hal Varian’s interesting new paper on “Big data: new tricks for econometrics”, Journal of Economic Perspectives, 28(2): 3–28.
It’s a nice introduction to trees, bagging and forests, plus a very brief entrée to the LASSO and the elastic net, and to spike-and-slab regression. Not enough to be able to use them, but ok if you’ve no idea what they are.
Last week, my research group discussed Galit Shmueli’s paper “To explain or to predict?”, Statistical Science, 25(3), 289–310. (See her website for further materials.) This is a paper everyone doing statistics and econometrics should read as it helps to clarify a distinction that is often blurred. In the discussion, the following issues were covered amongst other things.
- The AIC is better suited to model selection for prediction as it is asymptotically equivalent to leave-one-out cross-validation in regression, or one-step cross-validation in time series. On the other hand, it might be argued that the BIC is better suited to model selection for explanation, as it is consistent.
- P-values are associated with explanation, not prediction. It makes little sense to use p-values to determine the variables in a model that is being used for prediction. (There are problems in using p-values for variable selection in any context, but that is a different issue.)
- Multicollinearity has a very different impact when your goal is prediction than when it is estimation. When predicting, multicollinearity is not really a problem provided the values of your predictors lie within the hyper-region of the predictors used when estimating the model.
- An ARIMA model has no explanatory use, but is great at short-term prediction.
- How to handle missing values in regression is different in a predictive context compared to an explanatory context. For example, when building an explanatory model, we could just use all the data for which we have complete observations (assuming there is no systematic nature to the missingness). But when predicting, you need to be able to predict using whatever data you have. So you might have to build several models, with different numbers of predictors, to allow for different variables being missing.
- Many statistics and econometrics textbooks fail to observe these distinctions. In fact, a lot of statisticians and econometricians are trained only in the explanation paradigm, with prediction an afterthought. That is unfortunate as most applied work these days requires predictive modelling, rather than explanatory modelling.
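The first point in the list above (the asymptotic equivalence of the AIC and leave-one-out cross-validation in regression) can be illustrated numerically. The following is a minimal Python sketch, not from the paper or the discussion: the data are simulated, and the hat-matrix shortcut is used so the leave-one-out errors come from a single model fit.

```python
import numpy as np

def fit_ols(X, y):
    # least-squares coefficients via numpy's lstsq
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def aic(X, y):
    # Gaussian AIC up to an additive constant: n*log(RSS/n) + 2k
    n, k = X.shape
    resid = y - X @ fit_ols(X, y)
    rss = resid @ resid
    return n * np.log(rss / n) + 2 * k

def loo_cv(X, y):
    # leave-one-out CV mean squared error, using the hat-matrix
    # identity e_i / (1 - h_ii) so the model is fitted only once
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    resid = y - H @ y
    return np.mean((resid / (1 - np.diag(H))) ** 2)

# simulated example: the third column is irrelevant to y
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=50)

# compare the full model with the model that drops the irrelevant column;
# AIC and LOO-CV will typically (though not always) rank them the same way
for cols in ([0, 1, 2], [0, 1]):
    print(cols, aic(X[:, cols], y), loo_cv(X[:, cols], y))
```

The equivalence is asymptotic, so in small samples the two criteria can disagree on particular datasets; the point is that they are estimating the same thing.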
My research group meets every two weeks. It is always fun to talk about general research issues and new tools and tips we have discovered. We also use some of the time to discuss a paper that I choose for them. Today we discussed Breiman’s classic (2001) two cultures paper — something every statistician should read, including the discussion.
I select papers that I want every member of my research team to be familiar with. Usually they are classics in forecasting, or they are recent survey papers.
In the last couple of months we have also read the following papers:
- Timmermann (2008) Elusive return predictability
- Diebold (2013) Comparing predictive accuracy, twenty years later: A personal perspective on the use and abuse of Diebold-Mariano tests
- Gneiting and Katzfuss (2014) Probabilistic forecasting
- Makridakis and Hibon (1979) Accuracy of forecasting: an empirical investigation
This is the title of a wonderful new book that has just been released, courtesy of the Committee of Presidents of Statistical Societies.
The book consists of 52 chapters spanning 622 pages. The full table of contents below shows its scope and the list of authors (a veritable who’s who in statistics).
The MAPE (mean absolute percentage error) is a popular measure for forecast accuracy and is defined as
$$\text{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{y_t-\hat{y}_t}{y_t}\right|,$$
where $y_t$ denotes an observation and $\hat{y}_t$ denotes its forecast, and the mean is taken over $t=1,\dots,n$.
Armstrong (1985, p.348) was the first (to my knowledge) to point out the asymmetry of the MAPE saying that “it has a bias favoring estimates that are below the actual values”.
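The asymmetry is easy to see numerically: the same absolute error attracts a heavier percentage penalty when the forecast sits above the actual than when the roles are reversed. A small Python sketch (my own illustration, not Armstrong’s):

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((y - f) / y)
                     for y, f in zip(actual, forecast)) / len(actual)

# An absolute error of 50 in both cases, but the percentage
# penalties differ because the denominator is the actual value:
print(mape([100], [150]))  # forecast 50 too high -> 50.0
print(mape([150], [100]))  # forecast 50 too low  -> approx. 33.3
```

Because over-forecasts are penalised more heavily than under-forecasts of the same size, minimising the MAPE favours forecasts that are biased low.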
For all those people asking me how to obtain a print version of my book “Forecasting: principles and practice” with George Athanasopoulos, you now can.
The online book will continue to be freely available. The print version of the book is intended to help fund the development of the OTexts platform.
The price is US$45, £27 or €35.
OTexts is intended to be a different kind of publisher: all our books are online and free, and those in print will be reasonably priced.
The online version will continue to be updated regularly. The print version is a snapshot of the online version today. We will release a new print edition occasionally, no more than annually and only when the online version has changed enough to warrant a new print edition.
We are planning an offline electronic version as well. I’ll announce it here when it is ready.
Every year or so, Elsevier asks me to nominate five International Journal of Forecasting papers from the last two years to highlight in their marketing materials as “Editor’s Choice”. I try to select papers across a broad range of subjects, and I take into account citations and downloads as well as my own impression of the paper. That tends to bias my selection a little towards older papers as they have had more time to accumulate citations. Here are the papers I chose this morning (in the order they appeared):
- Diebold and Yilmaz (2012) Better to give than to receive: Predictive directional measurement of volatility spillovers. IJF 28(1), 57–66.
- Loterman, Brown, Martens, Mues, and Baesens (2012) Benchmarking regression algorithms for loss given default modeling. IJF 28(1), 161–170.
- Soyer and Hogarth (2012) The illusion of predictability: How regression statistics mislead experts. IJF 28(3), 695–711.
- Friedman (2012) Fast sparse regression and classification. IJF 28(3), 722–738.
- Davydenko and Fildes (2013) Measuring forecasting accuracy: The case of judgmental adjustments to SKU-level demand forecasts. IJF 29(3), 510–522.
Last time I did this, three of the five papers I chose went on to win awards. (I don’t pick the award winners — that’s a matter for the whole editorial board.) On the other hand, I didn’t pick the paper that got the top award for the period 2010–2011. So perhaps my selection is not such a good guide.
In two weeks I am presenting a workshop at the University of Granada (Spain) on Automatic Time Series Forecasting.
Unlike most of my talks, this is not intended to be primarily about my own research. Rather it is to provide a state-of-the-art overview of the topic (at a level suitable for Masters students in Computer Science). I thought I’d provide some historical perspective on the development of automatic time series forecasting, plus give some comments on the current best practices.
Hastie, Tibshirani and Friedman’s Elements of Statistical Learning first appeared in 2001 and is already a classic. It is my go-to book when I need a quick refresher on a machine learning algorithm. I like it because it is written using the language and perspective of statistics, and provides a very useful entry point into the literature of machine learning which has its own terminology for statistical concepts. A free downloadable pdf version is available on the website.
Recently, a simpler related book appeared entitled Introduction to Statistical Learning with applications in R by James, Witten, Hastie and Tibshirani. It “is aimed for upper level undergraduate students, masters students and Ph.D. students in the non-mathematical sciences”. This would be a great textbook for our new 3rd year subject on Business Analytics. The R code is a welcome addition in showing how to implement the methods. Again, a free downloadable pdf version is available on the website.
There is also a new, free book on Statistical foundations of machine learning by Bontempi and Ben Taieb available on the OTexts platform. This is more of a handbook and is written by two authors coming from a machine learning background. R code is also provided. Being an OTexts book, it is continually updated and revised, and is freely available to anyone with a browser.
Thanks to the authors for being willing to make these books freely available.