I spoke to our new crop of honours students this morning. Here are my slides, example files and links.
Invitations to write for bogus journals and speak at bogus conferences keep rolling in. Here is one I received today.
Dear Dr. Rob J. Hyndman,
It is our great pleasure to welcome you to join in Part 2: Knowledge Economy Symposium of GCKE-2017, which will be held in Qingdao, China during September 19-21, 2017. And we cordially invite you to propose a Speech on your recent research of Corrigendum to: “Hierarchical forecasts for Australian domestic tourism” [International Journal of Forecasting 25 (2009) 146–166]… . Your prompt reply with a speech proposal (Speech Title and abstract preferred along with full CV) will be highly appreciated!
That particularly exciting piece of research is freely available here. I’m trying to imagine how I could expand it into a full talk.
Last October I gave a 3-day masterclass on “Forecasting with R” in Eindhoven, Netherlands. There is a follow-up event planned for Tuesday 18 April 2017. It is particularly designed for people who attended the 3-day class, but if anyone else wants to attend they would be welcome.
Last year we had WOMBAT (Workshop Organized by the Monash Business Analytics Team) at the zoo, and MeDaScIn (Melbourne Data Science Initiative) in the city.
This year we are combining forces to hold WOMBAT MeDaScIn 2017.
There will be four days of tutorials (Monday 29 May to Thursday 1 June), and the main conference on Friday 2 June. We have an impressive range of local and international presenters including Yihui Xie (author of Rmarkdown, Knitr, Bookdown, Blogdown and more), Di Cook (data visualization guru), Stephanie Kovalchik (Data Scientist at Tennis Australia), Amy Shi-Nash (Head of Data Science at Commonwealth Bank of Australia), Graham Williams (Director of Data Science at Microsoft) and many more. I’ll be doing a workshop on “Time series in R, forecasting and visualisation” with Earo Wang.
Full details can be found on the web site.
- Friday only: $330 (student $70, academic $200)
- Thursday reception and Friday: $400
- Tutorials only (pick four, one from the two choices offered each day, and combine with a friend or two if desired): $1300
- Everything: $1600
There are only 30 places in each workshop, and there is a limited number of student and academic conference tickets.
Every two years we award a prize for the best paper published in the International Journal of Forecasting. It is now time to identify the best paper published in the IJF during 2014 and 2015. There is always a delay of about 18 months after the publication period to allow time for reflection, citations, etc. The prize is US$1000 plus an engraved plaque. I will present the prize at the ISF in Cairns in late June.
Nominations are invited from any reader of the IJF. Each person may nominate up to three papers, but you cannot nominate a paper that you have coauthored yourself. Papers coauthored by one of the six editors (Hyndman, Kapetanios, McCracken, Önkal, Ruiz, or van Dijk) are not eligible for the prize. All nominated papers are to be accompanied by a short statement (up to 200 words) from the nominator, explaining why the paper deserves an award.
You can see all the papers published in the period 2014–2015 on Google Scholar. You can also download a spreadsheet of the relevant papers with citations as counted by Scopus. Scopus does not cover every published journal, so the citation counts are underestimates, but they give some general guide as to which papers have attracted the attention of researchers. Google Scholar includes far more citations, including working papers, but there may be some double counting.
Of course, a good paper does not always get noticed, so don’t let the citation count sway you too much in nominating what you consider to be the best IJF paper from this period.
Nominations should be sent by email to me by 30 April 2017.
In what is now a roughly annual event, the forecast package has been updated on CRAN with a new version, this time 8.0.
A few of the more important new features are described below.
A common task when building forecasting models is to check that the residuals satisfy some assumptions (that they are uncorrelated, normally distributed, etc.). The new checkresiduals function makes this very easy: it produces a time plot, an ACF plot and a histogram with a superimposed normal curve, and carries out a Ljung-Box test on the residuals with an appropriate number of lags and degrees of freedom.
fit <- auto.arima(WWWusage)
checkresiduals(fit)
##
## 	Ljung-Box test
##
## data:  residuals
## Q* = 7.8338, df = 8, p-value = 0.4499
##
## Model df: 2.   Total lags used: 10
This should work for all the modelling functions in the package, as well as some of the time series modelling functions in the stats package.
Different types of residuals
Usually, residuals are computed as the difference between the observations and the corresponding one-step forecasts. But for some models, residuals are computed differently: for example, in a multiplicative ETS model or a model with a Box-Cox transformation. So the residuals() function now has a type argument to deal with this situation.
"Innovation residuals" correspond to the white noise process that drives the evolution of the time series model. "Response residuals" are the differences between the observations and the fitted values (as with GLMs). For homoscedastic models, the innovation residuals and the one-step response residuals are identical. "Regression residuals" are also available for regression models with ARIMA errors; they are equal to the original data minus the effect of the regression variables. If there are no regression variables, these residuals are identical to the original series (possibly adjusted to have zero mean).
library(ggplot2)
fit <- ets(woolyrnq)
res <- cbind(Residuals = residuals(fit),
             Response.residuals = residuals(fit, type = 'response'))
autoplot(res, facets = TRUE)
Some new graphs
The geom_histogram() function in the ggplot2 package is nice, but it does not have a good default binwidth. So I added the gghistogram function, which provides a quick histogram with good defaults. You can also overlay a normal density curve or a kernel density estimate.
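A minimal sketch, using the built-in lynx data set:

```r
library(forecast)
library(ggplot2)

# Histogram with a sensible default binwidth,
# plus an overlaid normal density curve
gghistogram(lynx, add.normal = TRUE)

# Or overlay a kernel density estimate instead
gghistogram(lynx, add.kde = TRUE)
```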
The ggseasonplot function is useful for studying seasonal patterns and how they change over time. It now has a polar argument to create graphs like this.
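For example, comparing the conventional and polar versions on the AirPassengers data:

```r
library(forecast)
library(ggplot2)

# Conventional season plot: one line per year
ggseasonplot(AirPassengers, year.labels = TRUE)

# Polar coordinates: each year traces a loop around the circle
ggseasonplot(AirPassengers, polar = TRUE)
```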
I often want to add a time series line to an existing plot. Base graphics has lines(), which works well when a time series is passed as an argument. So I added autolayer, which is similar (but more general). It is an S3 method like autoplot, and adds a layer to an existing ggplot object. autolayer will eventually form part of the next release of ggplot2, but for now it is available in the forecast package. There are methods provided for ts and forecast objects.
WWWusage %>% ets %>% forecast(h=20) -> fc
autoplot(WWWusage, series="Data") +
  autolayer(fc, series="Forecast") +
  autolayer(fitted(fc), series="Fitted")
The tsCV and CVar functions have been added. These were discussed in a previous post.
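A quick sketch of CVar, which applies k-fold cross-validation to a time series model (an autoregressive neural network by default):

```r
library(forecast)

# 5-fold cross-validation of an nnetar model on the lynx data;
# the print method reports cross-validated accuracy measures
modelcv <- CVar(lynx, k = 5)
print(modelcv)
```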
The baggedETS function has been added, which implements the procedure discussed in Bergmeir et al (2016) for bagging ETS forecasts.
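A minimal usage sketch (note that fitting can be slow, since an ETS model is estimated on each bootstrapped series):

```r
library(forecast)

# Fit ETS models to bootstrapped versions of the series,
# then average the resulting forecasts
fit <- baggedETS(WWWusage)
fc <- forecast(fit, h = 10)
autoplot(fc)
```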
head and tail of time series
I’ve long found it annoying that head and tail do not work on multiple time series. So I added some methods to the package so that they now work.
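For example, on a multivariate time series built from the built-in mdeaths and fdeaths series:

```r
library(forecast)

# A multiple time series (mts) object
z <- cbind(mdeaths, fdeaths)

head(z, 4)   # first four rows, still a time series
tail(z, 4)   # last four rows
```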
Imports and Dependencies
The pipe operator from the magrittr package is now imported, so you don’t need to load the magrittr package to use it.
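So, for instance, this works with only the forecast package attached:

```r
library(forecast)

# %>% is re-exported by forecast, so magrittr need not be attached
WWWusage %>% ets() %>% forecast(h = 10) %>% autoplot()
```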
There are now no packages that are loaded with forecast – everything required is imported. This makes the start-up much cleaner (no more annoying messages from all those packages being loaded). Instead, some random tips are occasionally printed when you load the forecast package.
There is quite a bit more — see the Changelog for a list.
We are still looking for a few more invited sessions for the International Symposium on Forecasting, to be held in Cairns, Australia, 25-28 June 2017.
We know Australia is a long way to come for many forecasters, so we are making it easy for you to bring your families along to the International Symposium on Forecasting and have a vacation at the same time.
AusMacroData is a new website that encourages and facilitates the use of quantitative, publicly available Australian macroeconomic data. The Australian Macro Database hosted at ausmacrodata.org provides a user-friendly front end for searching among over 40,000 economic variables and is loosely based on similar international sites such as the Federal Reserve Economic Database (FRED).