Hyndsight
Thoughts on research, forecasting, statistics, and other distractions.

Topics covered
acems
anomalies
beamer
computing
conferences
consulting
data
data-science
demography
econometrics
energy
epidemiology
evidence
forecasting
fpp
genealogy
grants
graphics
hts
humour
ijf
isf2017
jobs
journals
jss
kaggle
latex
mathematics
maxima
monash-university
obituary
organization
otexts
phd
podcast
poetry
politics
prizes
productivity
progress
publishing
r
refereeing
references
religion
reproducible-research
research
research-team
ropensci
seasonality
seminars
stackexchange
statistics
supervision
tables
teaching
technology
tidyverts
time-series
trend
video
welfare
writing
ysc2013
All Hyndsight posts by date
The cricketdata package
Four functions
The cricketdata package has been around for a few years on GitHub, and it has been on CRAN since February 2022. There are only four functions:
fetch_cricinfo(): Fetch team data on international cricket matches provided by ESPNCricinfo.
fetch_player_data(): Fetch individual player data on international cricket matches provided by ESPNCricinfo.
find_player_id(): Search for the player ID on ESPNCricinfo.
fetch_cricsheet(): Fetch ball-by-ball, match and player data from Cricsheet.
Jacquie Tran wrote the first version of the fetch_cricsheet() function, and the vignette that demonstrates it.

Monash time series forecasting repository
The Monash time series forecasting repository is a comprehensive collection of time series data made available in a convenient form to encourage empirical forecast evaluations. The repository includes the data from many forecasting competitions including the M1, M3, M4, NN5, tourism, and KDD cup 2018, as well as many other data sets from diverse applications. The associated paper discusses the various data sets and their characteristics. Where a time series collection contains data with different observation frequencies, they are split into different data sets so that the series within each data set have the same frequency.

Simulating from TBATS models
I’ve had several requests for an R function to simulate future values from a TBATS model. We will eventually include TBATS in the fable package, and the facilities will be added there. But in the meantime, if you are using the forecast package and want to simulate from a fitted TBATS model, here is how to do it.
Simulating via one-step forecasts
Doing it efficiently would require a more complicated approach, but this is super easy if you are willing to sacrifice some speed.

Job advertisements
Employers often contact me asking how to find a good statistician, econometrician or forecaster for their organization. Students also ask me how to go about finding a job when they finish their degree. This post is for both groups, hopefully making it easier for them to pair up appropriately. General online job sites such as seek or careerjet are ok, but job-seekers can find it hard to find the relevant openings because job titles are so varied.

Detecting time series outliers
The tsoutliers() function in the forecast package for R is useful for identifying anomalies in a time series. However, it is not properly documented anywhere. This post is intended to fill that gap. The function began as an answer on CrossValidated and was later added to the forecast package because I thought it might be useful to other people. It has since been updated and made more reliable.

Forecasting: Principles and Practice
Useful extensions for online books
I’ve had two recent questions from readers of my online textbook (with George Athanasopoulos) which could be solved using Google Chrome extensions.
"Hi, I’m an MSc student and am shortly starting my project/dissertation on time series data. I’ve started reading Version 3 of your book and improving my R skills but am wondering if there’s any way I can read V3 that will allow annotation? Thanks"
For personal annotation of websites, the Hypothesis extension is very useful.

What is forecasting?
Time series cross-validation using fable
Time series cross-validation is handled in the fable package using the stretch_tsibble() function to generate the data folds. In this post I will give two examples of how to use it, one without covariates and one with covariates.
Quarterly Australian beer production
Here is a simple example using quarterly Australian beer production from 1956 Q1 to 2010 Q2. First we create a data object containing many training sets starting with 3 years (12 observations), and adding one quarter at a time until all data are included.

Forecasting podcasts
I’ve been interviewed for several podcasts over the last year or so. It’s always fun to talk about my work, and I hope there are enough differences between them to make it interesting for listeners. Here is a full list of them.
(Updated: 17 Nov 2021)
Date | Podcast | Episode |
---|---|---|
17 November 2021 | The Random Sample | Software as a first class research output |
24 May 2021 | Data Skeptic | Forecasting principles and practice |
12 April 2021 | Seriously Social | Forecasting the future: the science of prediction |
6 February 2021 | Forecasting Impact | Rob Hyndman |
19 July 2020 | The Curious Quant | Forecasting COVID, time series, and why causality doesn't matter as much as you think |
27 May 2020 | The Random Sample | Forecasting the future & the future of forecasting |
9 October 2019 | Thought Capital | Forecasts are always wrong (but we need them anyway) |