The MAPE (mean absolute percentage error) is a popular measure for forecast accuracy and is defined as $\text{MAPE} = \text{mean}\left(\left|100(y_t - \hat{y}_t)/y_t\right|\right)$, where $y_t$ denotes an observation and $\hat{y}_t$ denotes its forecast, and the mean is taken over $t$. Armstrong (1985, p. 348) was the first (to my knowledge) to point out the asymmetry of the MAPE, saying that “it has a bias favoring estimates that are below the actual values”.
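To see the asymmetry concretely, here is a small illustrative sketch (in Python, with made-up numbers, purely for illustration): the same absolute error gives a larger percentage error when the forecast is above the actual value than when it is below it, because the actual value sits in the denominator.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

# The same absolute error (100 units) in opposite directions:
print(mape([100], [200]))  # over-forecast:  100.0
print(mape([200], [100]))  # under-forecast:  50.0
```

Since over-forecasts are penalized more heavily than under-forecasts of the same size, minimizing the MAPE pushes forecasts below the actual values, which is exactly Armstrong's point.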
For all those people asking me how to obtain a print version of my book “Forecasting: principles and practice” with George Athanasopoulos, you now can. Order on Amazon.com, Amazon.co.uk or Amazon.fr. The online book will continue to be freely available. The print version of the book is intended to help fund the development of the OTexts platform. For comparison, the price is US$195 for my previous forecasting textbook, and US$182 for Gonzalez-Rivera's. No matter how good the books are, those prices are absurdly high. OTexts is intended to be a different kind of publisher — all our books are online and free, and those in print will be reasonably priced. The online version will continue to be updated regularly. The print version is a snapshot of the online version today. We will release a new print edition occasionally, no more than annually, and only when the online version has changed enough to warrant a new print edition. We are also planning an offline electronic version. I’ll announce it here when it is ready.
We now have a cover for the print version of my forecasting book with George Athanasopoulos. It should be on Amazon in a couple of weeks. The book is also freely available online. This cover is a variation of the most popular one in the poll conducted a month or two ago. It was produced by Scarlett Rugers, whom I can happily recommend to anyone wanting a book cover designed.
The leave-one-out cross-validation statistic is given by $\text{CV} = \frac{1}{N}\sum_{i=1}^{N} [y_i - \hat{y}_{(i)}]^2$, where $y_1,\dots,y_N$ are the observations, and $\hat{y}_{(i)}$ is the predicted value obtained when the model is estimated with the $i$th case deleted. This is also sometimes known as the PRESS (Prediction Residual Sum of Squares) statistic. It turns out that, for linear models, we do not actually have to estimate the model $N$ times, once for each omitted case. Instead, CV can be computed after estimating the model once on the complete data set.
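The shortcut rests on the standard identity that, for a linear model, the leave-one-out residual equals the ordinary residual divided by $1-h_i$, where $h_i$ is the leverage of the $i$th observation. The following sketch verifies this for a simple linear regression (plain Python with made-up data, rather than R; the leverage formula $h_i = 1/n + (x_i-\bar{x})^2/\sum_j (x_j-\bar{x})^2$ is the standard one for a straight-line fit):

```python
def fit(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    return ybar - b * xbar, b

xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.3]
n = len(xs)

# Explicit leave-one-out: refit the model n times.
loo = 0.0
for i in range(n):
    a, b = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
    loo += (ys[i] - (a + b * xs[i])) ** 2
loo /= n

# Shortcut: one fit on the full data, plus the leverages h_i.
a, b = fit(xs, ys)
xbar = sum(xs) / n
sxx = sum((x - xbar) ** 2 for x in xs)
cv = sum(((y - (a + b * x)) / (1 - (1 / n + (x - xbar) ** 2 / sxx))) ** 2
         for x, y in zip(xs, ys)) / n

print(abs(loo - cv) < 1e-9)  # the two computations agree
```

The agreement is exact (up to floating point), which is why statistical software can report CV/PRESS from a single fit.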
The IJF is introducing occasional review papers on areas of forecasting. We did a whole issue in 2006 reviewing 25 years of research since the International Institute of Forecasters was established. Since then, there has been a lot of new work in application areas such as call center forecasting and electricity price forecasting. In addition, there are areas we did not cover in 2006, including new product forecasting and forecasting in finance. There have also been methodological and theoretical developments over the last eight years. Consequently, I’ve started inviting eminent researchers to write survey papers for the journal. One obvious choice was Tilmann Gneiting, who has produced a large body of excellent work on probabilistic forecasting in the last few years. The theory of forecasting was badly in need of development, and Tilmann and his coauthors have made several great contributions in this area. However, when I asked him to write a review, he explained that another journal had got in before me, and that the review was already written. It appeared in the very first volume of the new journal Annual Review of Statistics and Its Application: Gneiting and Katzfuss (2014), “Probabilistic Forecasting”, pp. 125–151. Having now read it, I’m both grateful for this more accessible
Today’s email brought this one: I was wondering if I could get your opinion on a particular problem that I have run into during the reviewing process of an article. Basically, I have an analysis where I am looking at a couple of time series and I wanted to know if, over time, there was an upward trend in the series. Inspection of the raw data suggests there is, but we want some statistical evidence for this. To achieve this I ran some ARIMA(0,1,1) models including a drift/trend term to see if the mean of the series did indeed shift upwards with time, and found that it did. However, we have run into an issue with a reviewer who argues that differencing removes trends and may not be a suitable way to detect trends. Therefore, the fact that we found a trend despite differencing suggests that the differencing was not successful. I know there are a few papers and textbooks that use ARIMA(0,1,1) models as ‘random walk with drift’-type models, so I cited them as examples of this procedure in action, but they remained unconvinced. Instead it was suggested that I look for trends in the raw undifferenced time series, as these would be more reliable as no trends had been removed. At the moment I am hesitant to do this
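The reviewer's worry rests on a misunderstanding: differencing does not erase a trend, it converts it into a nonzero mean of the differenced series, and the drift term estimates exactly that mean. A simplified sketch (plain Python, simulated data, and a pure random walk with drift rather than a full ARIMA(0,1,1), purely to illustrate the point):

```python
import random

random.seed(42)

# Simulate a random walk with drift: y_t = y_{t-1} + drift + noise.
drift = 0.5
y = [0.0]
for _ in range(500):
    y.append(y[-1] + drift + random.gauss(0, 1))

# After first differencing, the trend survives as the mean of the
# differences, which is what the drift term in an ARIMA model estimates.
diffs = [b - a for a, b in zip(y, y[1:])]
est = sum(diffs) / len(diffs)
print(round(est, 2))  # should be close to the true drift of 0.5
```

So finding a significant drift after differencing is evidence *for* a trend in the original series, not a sign that the differencing failed.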
An email I received today: I have a small problem. I have a time series called x: — If I use the default values of auto.arima(x), the best model is an ARIMA(1,0,0). — However, I tried the functions ndiffs(x, test="adf") and ndiffs(x, test="kpss"), as the KPSS test seems to be the default, and the number of differences is 0 for the KPSS test (consistent with the results of auto.arima()) but 2 for the ADF test. I then tried auto.arima(x, test="adf") and now I have another model, an ARIMA(1,2,1). I am unsure which order of integration I should use, as the tests give fairly different results. Is there a test that prevails?
I received this email yesterday: I have been using your ‘forecast’ package for more than a year now. I was on R version 2.15 until last week, but I am having issues with the lubridate package, hence decided to update to R 3.0.1. In our organization, even getting an open source application requires us to go through a whole lot of approval processes. I asked for R 3.0.1; before I got approval for 3.0.1, a new version of R (R 3.0.2) came out. Unfortunately for me, the forecast package was built in R 3.0.2. Is there any version of the forecast package that works in an older version of R (3.0.1)? I just don’t want to go through this entire approval war again within the organization. Please help if you have any work-around for this. This is unfortunately very common. Many corporate IT environments lock down computers to such an extent that it cripples the use of modern software like R, which is continuously updated. It also affects universities (which should know better), and I am constantly trying to invent work-arounds to the constraints that Monash IT services place on staff and student computers. Here are a few thoughts that might help.
This is a short piece I wrote for the next issue of the Oracle newsletter produced by the International Institute of Forecasters.
This is another situation where Fourier terms are useful for handling the seasonality. Not only is the seasonal period rather long, it is non-integer (averaging 365.25/7 ≈ 52.18). So ARIMA and ETS models do not tend to give good results, even with a period of 52 as an approximation.
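Fourier terms handle this because they are just pairs of sine and cosine regressors at harmonics of the seasonal period, and the period enters as a real number, so 52.18 is no harder than 52. A minimal sketch (plain Python rather than the forecast package; the function name and arguments here are my own, for illustration only):

```python
import math

def fourier_terms(n, period, K):
    """K sine/cosine pairs evaluated at t = 1..n, usable as regressors
    in a dynamic regression model. `period` may be non-integer."""
    cols = []
    for k in range(1, K + 1):
        cols.append([math.sin(2 * math.pi * k * t / period) for t in range(1, n + 1)])
        cols.append([math.cos(2 * math.pi * k * t / period) for t in range(1, n + 1)])
    return cols

# Two years of weekly data, annual period of 365.25/7 weeks, 3 harmonics:
X = fourier_terms(n=104, period=365.25 / 7, K=3)
print(len(X))  # 6 regressor columns (3 sine/cosine pairs)
```

The number of harmonics K controls how smooth or wiggly the fitted seasonal pattern can be, and can be chosen by minimizing the AICc.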