Connect with local employers

I keep telling stu­dents that there are lots of jobs in data sci­ence (includ­ing sta­tis­tics), and they often tell me they can’t find them adver­tised. As usual, you do have to do some net­work­ing, and one of the best ways of doing it is via a Data Sci­ence Meetup. Many cities now have them includ­ing Mel­bourne, Syd­ney, Lon­don, etc. It is the per­fect oppor­tu­nity to meet with local employ­ers, many of which are hir­ing due to the huge expan­sion in the use of data analy­sis in busi­ness (aka busi­ness analytics).

At the end of each Mel­bourne meetup, some employ­ers have been adver­tis­ing their cur­rent ana­lytic job open­ings to the audience.

Now the local organizers are going to extend the opportunity to allow job-searchers to give a 90-second pitch to employers. Details are provided on the message board.

IIF Sponsored Workshops

The International Institute of Forecasters sponsors workshops every year, each of which focuses on a specific theme. The purpose of these workshops is to facilitate small, informal meetings where experts in a particular field of forecasting can discuss forecasting problems, research, and solutions. Over the years, our workshops have covered topics including Predicting Rare Events, ICT Forecasting, and, most recently, Singular Spectrum Analysis. Often these workshops are associated with a special issue of the International Journal of Forecasting.

If you are already host­ing a work­shop on a fore­cast­ing topic and need sup­port from the IIF, or if you are inter­ested in organ­is­ing and host­ing a new work­shop, please con­tact George Athana­sopou­los.

A list of past workshops and the workshop guidelines are provided on the IIF website.

TBATS with regressors

I’ve received a few emails about includ­ing regres­sion vari­ables (i.e., covari­ates) in TBATS mod­els. As TBATS mod­els are related to ETS mod­els, tbats() is unlikely to ever include covari­ates as explained here. It won’t actu­ally com­plain if you include an xreg argu­ment, but it will ignore it.

When I want to include covari­ates in a time series model, I tend to use auto.arima() with covari­ates included via the xreg argu­ment. If the time series has mul­ti­ple sea­sonal peri­ods, I use Fourier terms as addi­tional covari­ates. See my post on fore­cast­ing daily data for some dis­cus­sion of this model. Note that fourier() and fourierf() now han­dle msts objects, so it is very sim­ple to do this.

For exam­ple, if holiday con­tains some dummy vari­ables asso­ci­ated with pub­lic hol­i­days and holidayf con­tains the cor­re­spond­ing vari­ables for the first 100 fore­cast peri­ods, then the fol­low­ing code can be used:

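library(forecast)
# Daily series with weekly and annual seasonal periods
y <- msts(x, seasonal.periods=c(7,365.25))
# Fourier terms: K must not exceed period/2, so at most 3 pairs for the weekly period
z <- fourier(y, K=c(2,5))
zf <- fourierf(y, K=c(2,5), h=100)
# Non-seasonal ARIMA errors; the seasonality is carried by the Fourier covariates
fit <- auto.arima(y, xreg=cbind(z,holiday), seasonal=FALSE)
fc <- forecast(fit, xreg=cbind(zf,holidayf), h=100)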

The main dis­ad­van­tage of the ARIMA approach is that the sea­son­al­ity is forced to be peri­odic, whereas a TBATS model allows for dynamic seasonality.
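
To make the Fourier covariates less of a black box, here is a minimal base-R sketch of what such terms look like for a daily series with periods 7 and 365.25. The helper name fourier_terms, the series length and the column names are all hypothetical and are not part of the forecast package; fourier() builds a comparable matrix, though its column naming and ordering may differ.

```r
# Hypothetical helper (not from the forecast package): build sine/cosine
# covariates for time index t, one seasonal period, and K harmonic pairs.
fourier_terms <- function(t, period, K) {
  stopifnot(2 * K <= period)  # more than period/2 harmonics would be redundant
  out <- matrix(NA_real_, nrow = length(t), ncol = 2 * K)
  names_out <- character(2 * K)
  for (k in seq_len(K)) {
    out[, 2 * k - 1] <- sin(2 * pi * k * t / period)
    out[, 2 * k]     <- cos(2 * pi * k * t / period)
    names_out[c(2 * k - 1, 2 * k)] <- paste0(c("S", "C"), k, "-", period)
  }
  colnames(out) <- names_out
  out
}

n <- 730  # assumed: two years of daily observations
h <- 100  # forecast horizon

# Covariates for the observed period and for the next h days
z  <- cbind(fourier_terms(1:n, 7, 2), fourier_terms(1:n, 365.25, 5))
zf <- cbind(fourier_terms(n + 1:h, 7, 2), fourier_terms(n + 1:h, 365.25, 5))

dim(z)
dim(zf)
```

Because these terms are deterministic functions of time, their future values are known exactly, which is why they can be passed as xreg when forecasting. It is also why the seasonal pattern they describe cannot change over time.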

FPP now available as a downloadable e-​​book

My forecasting textbook with George Athanasopoulos is already available online (for free), and in print via Amazon (for under $40). Now we have made it available as a downloadable e-book via Google Books (for $15.55). The Google Books version is identical to the print version on Amazon (apart from a few typos that have been fixed).

To use the e-​​book ver­sion on an iPad or Android tablet, you need to have the Google Books app installed [iPad, Android]. You could also put it on an iPhone or Android phone, but I wouldn’t rec­om­mend it as the text will be too small to read.

You can down­load a free sam­ple (up to the end of Chap­ter 2) if you want to check how it will look on your device.

The sales of the print and e-book versions are used to fund the running of the OTexts website where all OTexts books are freely available.

The online ver­sion is con­tin­u­ously updated — any errors dis­cov­ered are fixed imme­di­ately. The print and e-​​book ver­sions will be updated approx­i­mately annu­ally to bring them into line with the online version.

Tim Harford on forecasting

A few weeks ago I had a Skype chat with Tim Har­ford, the “Under­cover Econ­o­mist” for Britain’s Finan­cial Times. He was work­ing on an arti­cle for the FT on fore­cast­ing, and wanted my per­spec­tive as an aca­d­e­mic fore­caster. I mostly talked about what makes some things more pre­dictable than oth­ers, as dis­cussed in this blog post. In the end, his arti­cle headed in a dif­fer­ent direc­tion, so I don’t get quoted, but it is still a good read!

He also put out this YouTube sum­mary, for those who don’t like to read:

Generating quantile forecasts in R

From today’s email:

I have just finished reading a copy of ‘Forecasting: Principles and Practice’ and I have found the book really interesting. I have particularly enjoyed the case studies and focus on practical applications.

After fin­ish­ing the book I have joined a fore­cast­ing com­pe­ti­tion to put what I’ve learnt to the test. I do have a cou­ple of queries about the fore­cast­ing out­puts required. The out­put required is a quan­tile fore­cast, is this the same as pre­dic­tion inter­vals? Is there any R func­tion to pro­duce quan­tiles from 0 to 99?

If you were able to point me in the right direc­tion regard­ing the above it would be greatly appreciated.

Many Thanks,

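On the question in the email above, a hedged sketch of how the two ideas connect: the end points of a central prediction interval are themselves quantiles of the forecast distribution, so an 80% interval runs from the 10% quantile to the 90% quantile. Assuming a normal forecast distribution with a known point forecast and forecast standard deviation (the values below are made up for illustration), percentiles 1 to 99 can be computed in base R as follows.

```r
# Illustrative values only: a point forecast and its forecast standard deviation
mu    <- 100
sigma <- 12

# Percentiles 1% to 99% of an assumed normal forecast distribution
p <- seq(0.01, 0.99, by = 0.01)
q <- qnorm(p, mean = mu, sd = sigma)
names(q) <- paste0(round(100 * p), "%")

# An 80% prediction interval is bounded by the 10% and 90% quantiles
interval80 <- mu + c(-1, 1) * qnorm(0.9) * sigma
all.equal(unname(q[c("10%", "90%")]), interval80)
```

The 0% and 100% quantiles of a normal distribution are infinite, which is why only 1 to 99 are computed here. If the forecast distribution is not normal, the same link between interval end points and quantiles holds, but the quantiles have to come from that distribution (or from simulation) rather than from qnorm().
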
Resources for the FPP book

The FPP resources page has recently been updated with several new additions, including:

  • R code for all exam­ples in the book. This was already avail­able within each chap­ter, but the exam­ples have been col­lected into one file per chap­ter to save copy­ing and past­ing the var­i­ous code fragments.
  • Slides from a course on Pre­dic­tive Ana­lyt­ics from the Uni­ver­sity of Sydney.
  • Slides from a course on Eco­nomic Fore­cast­ing from the Uni­ver­sity of Hawaii.

If anyone using the book has other material that could be made available, please send it to me. For example, recorded lectures, slides, additional examples, assignments, exam questions, solutions, etc.

A new candidate for worst figure

Today I read a paper that had been sub­mit­ted to the IJF which included the fol­low­ing figure

[Figure: worstgraphic]

along with several similar plots. I haven’t seen anything this bad for a long time. In fact, I think I would find it very difficult to reproduce using R, or even Excel (which is particularly adept at bad graphics).

A few years ago I pro­duced “Twenty rules for good graph­ics”. I think I need to add a cou­ple of addi­tional rules:

  • Rep­re­sent time changes using lines.
  • Never use fill pat­terns such as cross-​​hatching.

(My orig­i­nal rule #20 said Avoid pie charts.)

It would have been rel­a­tively sim­ple to show these data as six lines on a plot of GDP against time. That would have made it obvi­ous that the Euro­pean GDP was shrink­ing, the GDP of Asia/​Oceania was increas­ing, while other regions of the world were fairly sta­ble. At least I think that is what is hap­pen­ing, but it is very hard to tell from such graph­i­cal obfuscation.

Forecasting with R in WA

On 23–25 Sep­tem­ber, I will be run­ning a 3-​​day work­shop in Perth on “Fore­cast­ing: prin­ci­ples and prac­tice” mostly based on my book of the same name.

Work­shop par­tic­i­pants will be assumed to be famil­iar with basic sta­tis­ti­cal tools such as mul­ti­ple regres­sion, but no knowl­edge of time series or fore­cast­ing will be assumed. Some prior expe­ri­ence in R is highly desirable.

Venue: The Uni­ver­sity Club, Uni­ver­sity of West­ern Aus­tralia, Ned­lands WA.

Day 1:
Fore­cast­ing tools, sea­son­al­ity and trends, expo­nen­tial smoothing.
Day 2:
State space mod­els, sta­tion­ar­ity, trans­for­ma­tions, dif­fer­enc­ing, ARIMA models.
Day 3:
Time series cross-​​validation, dynamic regres­sion, hier­ar­chi­cal fore­cast­ing, non­lin­ear models.

The course will involve a mix­ture of lec­tures and prac­ti­cal ses­sions using R. Each par­tic­i­pant must bring their own lap­top with R installed, along with the fpp pack­age and its dependencies.

For costs and enrol­ment details, go to
http://www.cas.maths.uwa.edu.au/courses/forecasting.

biblatex for statisticians

I am now using biblatex for all my bibliographic work as it seems to have developed enough to be stable and reliable. The big advantage of biblatex is that it is easy to format the bibliography to conform to specific journal or publisher styles. It is also possible to have structured bibliographies (e.g., divided into sections: books, papers, R packages, etc.).