When watching the TV news, or reading newspaper commentary, I am frequently amazed at the attempts people make to interpret random noise. For example, the latest tiny fluctuation in the share price of a major company is attributed to the CEO being ill. When the exchange rate goes up, the TV finance commentator confidently announces that it is a reaction to Chinese building contracts. No one ever says “The unemployment rate has dropped by 0.1% for no apparent reason.” What is going on here is that the commentators are assuming we live in a noise-free world. They imagine that everything is explicable; you just have to find the explanation. However, the world is noisy: real data are subject to random fluctuations, and are often also measured inaccurately. So to interpret every little fluctuation is silly and misleading.
Posts Tagged ‘teaching’:
The leave-one-out cross-validation statistic is given by $\text{CV} = \frac{1}{N}\sum_{i=1}^{N} e_{[i]}^2$, where $e_{[i]} = y_i - \hat{y}_{[i]}$, $y_1,\dots,y_N$ are the observations, and $\hat{y}_{[i]}$ is the predicted value obtained when the model is estimated with the $i$th case deleted. This is also sometimes known as the PRESS (Prediction Residual Sum of Squares) statistic. It turns out that for linear models, we do not actually have to estimate the model $N$ times, once for each omitted case. Instead, CV can be computed after estimating the model once on the complete data set.
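As a rough illustration of the single-fit shortcut, the sketch below assumes the standard leverage identity for least squares: the leave-one-out error for case i equals e_i / (1 - h_i), where e_i is the ordinary residual from the full fit and h_i is the ith diagonal element of the hat matrix. The excerpt does not state this formula, so treat it as an assumption. The sketch also assumes CV is the mean of the squared leave-one-out errors. It is written in plain Python for a one-predictor regression rather than in R, and the function names and toy data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def loocv_brute_force(x, y):
    """Refit the model N times, leaving out one case each time."""
    n = len(x)
    total = 0.0
    for i in range(n):
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        a, b = fit_line(x_rest, y_rest)
        total += (y[i] - (a + b * x[i])) ** 2
    return total / n


def loocv_fast(x, y):
    """Fit once, then rescale each residual by 1 / (1 - h_i)."""
    n = len(x)
    a, b = fit_line(x, y)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    total = 0.0
    for xi, yi in zip(x, y):
        # Leverage of this case in a simple (one-predictor) regression.
        h = 1.0 / n + (xi - x_bar) ** 2 / sxx
        e = yi - (a + b * xi)  # ordinary residual from the full fit
        total += (e / (1.0 - h)) ** 2
    return total / n


# Hypothetical toy data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
cv_slow = loocv_brute_force(x, y)
cv_fast = loocv_fast(x, y)
```

The two functions should agree up to floating-point rounding, but the second fits the model only once. With several predictors the same idea applies, with h_i taken from the diagonal of the full hat matrix.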
Users of my new online forecasting book have asked about having a facility for personal highlighting of selected sections, as students often do with print books. We have plans to make this a built-in part of the platform, but for now it is possible to do it using a simple browser extension. This approach allows any website to be highlighted, so is even more useful than if we only had the facility on OTexts.org. There are several possible tools available. One of the simplest tools that allows both highlighting and annotations is Diigo.
Earo Wang recently interviewed me for the Chinese website Capital of Statistics. The English transcript of the interview is on Earo’s personal website. This is the third interview I’ve done in the last 18 months. The others were for Data Mining Research (republished in Amstat News) and DecisionStats.
Hastie, Tibshirani and Friedman’s Elements of Statistical Learning first appeared in 2001 and is already a classic. It is my go-to book when I need a quick refresher on a machine learning algorithm. I like it because it is written using the language and perspective of statistics, and provides a very useful entry point into the literature of machine learning, which has its own terminology for statistical concepts. A free downloadable pdf version is available on the website. Recently, a simpler related book appeared entitled Introduction to Statistical Learning with applications in R by James, Witten, Hastie and Tibshirani. It “is aimed for upper level undergraduate students, masters students and Ph.D. students in the non-mathematical sciences”. This would be a great textbook for our new 3rd year subject on Business Analytics. The R code is a welcome addition in showing how to implement the methods. Again, a free downloadable pdf version is available on the website. There is also a new, free book on Statistical foundations of machine learning by Bontempi and Ben Taieb available on the OTexts platform. This is more of a handbook and is written by two authors coming from a machine learning background. R code is also provided. Being an OTexts book, it is continually updated and revised, and is freely available.
Last year I taught an online course on forecasting using R. The slides and exercise sheets are now available at www.otexts.org/fpp/resources/
I’ve been getting emails asking questions about my upcoming course on Forecasting using R. Here are some answers.
The publishing platform I set up for my forecasting book has now been extended to cover more books and greater functionality. Check it out at www.otexts.org.
The following video has been produced to advertise my upcoming course on Forecasting with R, run in partnership with Revolution Analytics.
I am teaming up with Revolution Analytics to teach an online course on forecasting with R. Topics to be covered include seasonality and trends, exponential smoothing, ARIMA modelling, dynamic regression and state space models, as well as forecast accuracy methods and forecast evaluation techniques such as cross-validation. I will talk about some of my consulting experiences, and explain the tools in the forecast package for R. The course will run from 21 October to 4 December, for two hours each week. Participants can network and interact with other practitioners through an online community.