All Hyndsight posts by date

The cricketdata package

Four functions

The cricketdata package has been around for a few years on GitHub, and it has been on CRAN since February 2022. There are only four functions:

fetch_cricinfo(): Fetch team data on international cricket matches provided by ESPNCricinfo.
fetch_player_data(): Fetch individual player data on international cricket matches provided by ESPNCricinfo.
find_player_id(): Search for the player ID on ESPNCricinfo.
fetch_cricsheet(): Fetch ball-by-ball, match and player data from Cricsheet.

Jacquie Tran wrote the first version of the fetch_cricsheet() function, and the vignette that demonstrates it.


Monash time series forecasting repository

The Monash time series forecasting repository is a comprehensive collection of time series data made available in a convenient form to encourage empirical forecast evaluations. The repository includes the data from many forecasting competitions including the M1, M3, M4, NN5, tourism, and KDD Cup 2018, as well as many other data sets from diverse applications. The associated paper discusses the various data sets and their characteristics. Where a time series collection contains data with different observation frequencies, the series are split into different data sets so that all series within each data set have the same frequency.


Simulating from TBATS models

I’ve had several requests for an R function to simulate future values from a TBATS model. We will eventually include TBATS in the fable package, and the facilities will be added there. But in the meantime, if you are using the forecast package and want to simulate from a fitted TBATS model, here is how to do it.

Simulating via one-step forecasts

Doing it efficiently would require a more complicated approach, but this is super easy if you are willing to sacrifice some speed.


Job advertisements

Employers often contact me asking how to find a good statistician, econometrician or forecaster for their organization. Students also ask me how to go about finding a job when they finish their degree. This post is for both groups, hopefully making it easier for them to pair up appropriately. General online job sites such as Seek or Careerjet are OK, but job-seekers can struggle to find the relevant openings because job titles are so varied.


Detecting time series outliers

The tsoutliers() function in the forecast package for R is useful for identifying anomalies in a time series. However, it is not properly documented anywhere. This post is intended to fill that gap. The function began as an answer on CrossValidated and was later added to the forecast package because I thought it might be useful to other people. It has since been updated and made more reliable.


Forecasting: Principles and Practice

Useful extensions for online books

I’ve had two recent questions from readers of my online textbook (with George Athanasopoulos) which could be solved using Google Chrome extensions.

“Hi, I’m an MSc student and am shortly starting my project/dissertation on time series data. I’ve started reading Version 3 of your book and improving my R skills but am wondering if there’s any way I can read V3 that will allow annotation? Thanks”

For personal annotation of websites, the Hypothesis extension is very useful.


What is forecasting?

Time series cross-validation using fable

Time series cross-validation is handled in the fable package using the stretch_tsibble() function to generate the data folds. In this post I will give two examples of how to use it, one without covariates and one with covariates.

Quarterly Australian beer production

Here is a simple example using quarterly Australian beer production from 1956 Q1 to 2010 Q2. First we create a data object containing many training sets starting with 3 years (12 observations), and adding one quarter at a time until all data are included.
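The structure of these training sets can be sketched without any forecasting package. What follows is a plain-Python illustration of the expanding-window idea only, not the fable or tsibble API: the function names stretching_folds() and quarter_labels() and the parameters init and step are hypothetical, chosen to mirror the description above, and the series holds quarter labels rather than the actual beer production values.

```python
def quarter_labels(start_year, n):
    """Generate n consecutive quarter labels starting at Q1 of start_year."""
    return [f"{start_year + i // 4} Q{i % 4 + 1}" for i in range(n)]


def stretching_folds(series, init=12, step=1):
    """Return expanding-window training sets.

    The first fold holds the first `init` observations; each later fold
    adds `step` more observations, until all data are included.
    """
    folds = []
    end = init
    while end <= len(series):
        folds.append(series[:end])
        end += step
    return folds


# 1956 Q1 to 2010 Q2 is 54 full years plus two quarters: 218 observations.
quarters = quarter_labels(1956, 218)
folds = stretching_folds(quarters, init=12, step=1)

print(len(folds))     # number of training sets
print(folds[0][-1])   # last quarter in the first training set
print(folds[-1][-1])  # last quarter in the final training set
```

Under these assumptions the first training set ends at 1958 Q4 and the last one at 2010 Q2, giving 207 training sets in total. A model would then be fitted to each training set and evaluated on the observations that follow it.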


Forecasting podcasts

I’ve been interviewed for several podcasts over the last year or so. It’s always fun to talk about my work, and I hope there are enough differences between them to make it interesting for listeners. Here is a full list of them.

(Updated: 17 Nov 2021)

| Date | Podcast | Episode |
| --- | --- | --- |
| 17 November 2021 | The Random Sample | Software as a first class research output |
| 24 May 2021 | Data Skeptic | Forecasting principles and practice |
| 12 April 2021 | Seriously Social | Forecasting the future: the science of prediction |
| 6 February 2021 | Forecasting Impact | Rob Hyndman |
| 19 July 2020 | The Curious Quant | Forecasting COVID, time series, and why causality doesn't matter as much as you think |
| 27 May 2020 | The Random Sample | Forecasting the future & the future of forecasting |
| 9 October 2019 | Thought Capital | Forecasts are always wrong (but we need them anyway) |