The difference between prediction intervals and confidence intervals

Prediction intervals and confidence intervals are not the same thing. Unfortunately the terms are often confused, and I frequently find myself correcting the error in students’ papers and in articles I am reviewing or editing.

A prediction interval is an interval associated with a random variable yet to be observed, with a specified probability of the random variable lying within the interval. For example, I might give an 80% interval for the forecast of GDP in 2014. The actual GDP in 2014 should lie within the interval with probability 0.8. Prediction intervals can arise in Bayesian or frequentist statistics.

A confidence interval is an interval associated with a parameter and is a frequentist concept. The parameter is assumed to be non-random but unknown, and the confidence interval is computed from data. Because the data are random, the interval is random. A 95% confidence interval will contain the true parameter with probability 0.95. That is, with a large number of repeated samples, 95% of the intervals would contain the true parameter.
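
The repeated-sampling interpretation can be checked with a small simulation. The following is only a sketch under stated assumptions: the data are normal with a known standard deviation, the parameter of interest is the mean, and the function name and all numerical settings are made up for illustration.

```python
import random
from statistics import NormalDist, fmean

def ci_coverage(mu=2.0, sigma=1.0, n=30, level=0.95, reps=4000, seed=1):
    """Share of repeated-sample confidence intervals for the mean that contain mu.

    Assumes normal data with known sigma, so the interval is xbar +/- z * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # The parameter mu is fixed; the interval moves from sample to sample.
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps

coverage = ci_coverage()
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should be close to the nominal level, and it is the interval, not the parameter, that varies across replications.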

A Bayesian confidence interval, also known as a “credible interval”, is an interval associated with the posterior distribution of the parameter. In the Bayesian perspective, parameters are treated as random variables, and so have probability distributions. Thus a Bayesian confidence interval is like a prediction interval, but associated with a parameter rather than an observation.
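
To make the Bayesian versions concrete, here is a minimal sketch for the simplest conjugate case: normal observations with a known standard deviation and a normal prior on the mean. The data, the prior settings and the function names are all invented for illustration. The credible interval uses the posterior standard deviation of the mean, while the Bayesian prediction interval for a new observation uses the posterior predictive standard deviation, which adds the observation variance to the posterior variance.

```python
from statistics import NormalDist, fmean

def normal_posterior(data, sigma, prior_mean, prior_sd):
    """Posterior mean and sd of mu when y ~ N(mu, sigma^2) and mu ~ N(prior_mean, prior_sd^2)."""
    n = len(data)
    precision = 1 / prior_sd**2 + n / sigma**2
    mean = (prior_mean / prior_sd**2 + n * fmean(data) / sigma**2) / precision
    return mean, precision ** -0.5

def central_interval(mean, sd, level):
    """Central interval of a normal distribution with the given mean and sd."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * sd, mean + z * sd

data = [1.2, 0.4, 0.9, 1.6, 0.7, 1.1, 0.3, 1.4]  # made-up observations
sigma = 0.5                                      # observation sd, assumed known

post_mean, post_sd = normal_posterior(data, sigma, prior_mean=0.0, prior_sd=10.0)
pred_sd = (post_sd**2 + sigma**2) ** 0.5         # posterior predictive sd

credible = central_interval(post_mean, post_sd, 0.95)    # interval for the parameter
predictive = central_interval(post_mean, pred_sd, 0.95)  # interval for a new observation

print("95% credible interval for the mean:", credible)
print("95% prediction interval for a new value:", predictive)
```

Both intervals are centred on the posterior mean, but the prediction interval is wider because it also allows for the variability of the new observation.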

I think the distinction between prediction and confidence intervals is worth preserving because sometimes you want to use both. For example, consider the regression
$$y_i = \alpha + \beta x_i + e_i,$$
where $y_i$ is the change in GDP from quarter $i-1$ to quarter $i$, $x_i$ is the change in the unemployment rate from quarter $i-1$ to quarter $i$, and $e_i\sim\text{N}(0,\sigma^2)$. (This regression model is known as Okun’s law in macroeconomics.) In this case, both confidence intervals and prediction intervals are interesting. You might be interested in the confidence interval associated with the mean value of $y$ when $x=0$; that is, the mean growth in GDP when the unemployment rate does not change. You might also be interested in the prediction interval for $y$ when $x=0$; that is, the likely range of future values of GDP growth when the unemployment rate does not change.
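
For this regression the two intervals can be written down explicitly. The formulas below are the standard textbook results for simple linear regression with normal errors, not something specific to this example. The notation $\hat\alpha$, $\hat\beta$ (least squares estimates), $\hat\sigma$ (residual standard deviation on $n-2$ degrees of freedom), $\bar{x}$ (sample mean of the $x_i$) and $x_0$ (the value of $x$ of interest) is introduced here for illustration. A 95% confidence interval for the mean of $y$ at $x=x_0$ is

$$\hat\alpha + \hat\beta x_0 \pm t_{n-2,\,0.975}\,\hat\sigma\sqrt{\frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum_{i=1}^n (x_i-\bar{x})^2}},$$

while a 95% prediction interval for a new observation of $y$ at $x=x_0$ is

$$\hat\alpha + \hat\beta x_0 \pm t_{n-2,\,0.975}\,\hat\sigma\sqrt{1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum_{i=1}^n (x_i-\bar{x})^2}}.$$

The extra 1 under the square root is the variance of the new error term $e$, in units of $\sigma^2$. It does not shrink as $n$ grows, which is why the prediction interval is always the wider of the two.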

The distinction is mostly retained in the statistics literature. However, in econometrics it is common to use “confidence intervals” for both types of interval (e.g., Granger & Newbold, 1986). I once asked Clive Granger why he confused the two concepts, and he dismissed my objection as fussing about trivialities. I disagreed with him then, and I still do.

I have seen someone compute a confidence interval for the mean, and use it as if it were a prediction interval for a future observation. The trouble is, confidence intervals for the mean are much narrower than prediction intervals, and so this gave him an exaggerated and false sense of the accuracy of his forecasts. Instead of the interval containing the future observation with probability 0.95, it contained it with probability of only about 0.2.
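
A small simulation can illustrate how badly a confidence interval for the mean performs when it is used as a prediction interval. This is only a sketch with made-up settings (a normal population with known standard deviation and samples of size 60, none of which are taken from the case described above), and the function name is hypothetical.

```python
import random
from statistics import NormalDist, fmean

def share_of_future_values_covered(n=60, mu=0.5, sigma=1.0, level=0.95,
                                   reps=4000, seed=7):
    """Share of future observations falling inside (a) the CI for the mean and (b) the PI.

    Assumes normal data with known sigma, so both intervals are centred on the
    sample mean and use normal quantiles.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    ci_half = z * sigma / n ** 0.5            # uncertainty about the mean only
    pi_half = z * sigma * (1 + 1 / n) ** 0.5  # adds the variability of a new value
    in_ci = in_pi = 0
    for _ in range(reps):
        xbar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        future = rng.gauss(mu, sigma)         # the observation being forecast
        in_ci += abs(future - xbar) <= ci_half
        in_pi += abs(future - xbar) <= pi_half
    return in_ci / reps, in_pi / reps

ci_share, pi_share = share_of_future_values_covered()
print(f"Future values inside the 95% CI for the mean: {ci_share:.2f}")
print(f"Future values inside the 95% PI:              {pi_share:.2f}")
```

With these settings the confidence interval should capture only a small minority of the future observations, while the prediction interval should capture close to 95% of them.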

So I ask statisticians to please preserve this distinction. And I ask econometricians to stop being so sloppy about terminology. Unfortunately, I can’t continue my debate with Clive Granger. I rather hoped he would come to accept my point of view.

  • molecule61

    It’s not correct to say that model parameters are considered to be random in the Bayesian perspective – they are considered to be unknown. The probability distribution for the parameter is a measure of your uncertainty about its fixed value.

    • Yes, but it is correct to say they are “treated as random variables”.

    • Svein Olav Nyberg

      Technically true (except that unknown =/= uncertain), but then again we have the entire concept of “random” to grapple with. Does it mean physical indeterminism? Or does it mean that the value is unknown to us but has a certain probability distribution? If the latter case is the meaning of “random”, which is what I think it should be, then you can indeed speak of randomness in this case.

      • Diego Doe

        But what if the distribution is the Dirac delta? Is this a random and certain variable at the same time? 🙂

  • Eran

    Is it the case that there is one-to-one mapping between PI and CI?
    (For example, PI = CI+std*1, when symmetry is assumed)
    if so, might be an additional reason for the confusion.

    • Maybe. But there is a one-to-one mapping between variance and standard deviation too, but nobody confuses them.

    • Svein Olav Nyberg

      In the Bayesian case, you use Pythagoras’ theorem to find the predictive sigma from the posterior sigma and the measurement sigma. THEN you find your PI based on that new sigma and your % margin. You do something similar in the frequentist case.

  • Hrodebert

    The statement “A 95% confidence interval will contain the true parameter with probability 0.95.” might be misunderstood, because the true parameter either falls into a given interval or it does not. But if the interval looks like a - T < param < a + T, where T is a statistic, you are absolutely right.

    • If you read the next sentence, I don’t think it can be misunderstood.

  • Read the following sentence. The CI is random because it is based on the data. The probability coverage occurs with repeated sampling.

    • Svein Olav Nyberg

      The two sentences actually do not mean the same. You say «A 95% confidence interval will contain the true parameter with probability 0.95. That is, with a large number of repeated samples, 95% of the intervals would contain the true parameter.»

      The «that is» is not correct, since the first sentence states the Bayesian point of view, whereas the second explains the frequentist point of view.

  • zbicyclist

    I agree this is an important distinction. I agree that students have a lot of trouble remembering which is which.

    I think the terminology is to blame. Aren’t they both confidence intervals, just confidence about different things? So we might call them “Individual Prediction confidence interval” and “General Prediction confidence interval”, although I’m not terribly happy with that exact phrasing.

    • d0ubs

      I totally agree with you: they are both CIs, but for different things. One is for the mean of the dependent variable and the other is for the dependent variable itself.

      Also, the article is a bit confusing in implying that one difference between the two concepts is that a prediction interval is used for future value(s). That is rather misleading: you can just as well compute an interval for the future mean value of the dependent variable as you can compute an interval for the value of the dependent variable conditional on an observed value of the independent variable (for instance, at the sample mean of the independent variable).

  • mark

    I have a question that doesn’t really correspond to the topic above. I estimated ARIMA coefficients using the auto.arima() function on 250 observations and produced forecasts. Now I have added new observations, and I want to use this particular model and its coefficients to forecast from the 251st observation. What should I do?

    • Use the model argument in forecast.Arima().

      • marek

        OK, thank you, but then won’t I keep only the structure of the ARIMA model (the number of parameters), while the coefficient values change?

        • No. As I have already said, it applies the model to new data *without changing the coefficients*.

      • mark

        Now I understand, sorry. I found the documentation of the forecast package. Thank you!

  • Johnno

    Thanks for this! I have a practical question that’s related to this. I have some timeseries data that I’m using to create a multiplicative HW forecast. And I want to create a PI around the 12 month look-ahead forecast. So I was thinking about going into my time series, and for a period in it, creating some 12 month lookahead forecasts and using the empirical distribution of the error between them and the actuals to generate a PI.

    As it relates to the PI/CI discussion above, I was reading about making bootstrap CIs, but since what I want is a PI, maybe that approach doesn’t work. Or does it?

    Secondly, just generally, is there an approach that uses the empirical distribution of forecast errors to construct PIs?

    Johnno K.

    • Yes, you can do that. But you generally won’t have enough data to get a good estimate. Usually, a better approach is to use the modelling framework for HW. If you are using R, use the ets() function in the forecast package with model="MAM".

  • Pingback: Forecasting Continued: Using Simulation to Create Prediction Intervals Around Holt-Winters | Analytics Made Skeezy

  • Ken

    It is odd that this is something that is not usually covered in a first year stats course, but rather in second year for linear regression. Covering it in first year for means would help in clarifying the difference between standard deviation and standard error, and then make it easier to cover for regression.

  • Rajib Sarkar

    Thank you, Prof. Hyndman! This is the first time I have understood the distinction between PI and CI clearly. Many thanks, indeed!!

  • Wei1

    Thanks! So in the above example, “the mean growth in GDP when the unemployment rate does not change”, does the mean growth refer to the total GDP distribution?
    And how is the prediction interval computed in the forecast function of the forecast package? Do you use the residuals’ variance to estimate the variance of the forecast data?

    • No. The mean growth in GDP means the average quarterly change in GDP.

      Prediction intervals depend on the model. The forecast function computes them using the theoretical variance of the forecast distribution. For a one-step time series forecast, that is equal to the residual variance. But for other steps, and for regression models, the forecast variance is not the same as the residual variance.

      • Wei1

        But the residual variance is used to estimate the variance of the forecast distribution. If my data has frequency=7 days and I want to forecast for example 15th day’s data, should I only consider the variance of the 1st,8th and 14th data? Thanks!

        • No. It estimates the one-step forecast variance for time series. For multi-step or cross-sectional forecasts, the residual variance is NOT equal to the forecast variance, as I’ve already explained. Your second question does not make sense to me. The forecast variance does not depend directly on the variance of any particular days.

      • Wei1

        And, do you assume the distribution is normal when computing the prediction interval in forecast function? Thanks!

        • Yes, usually. But some functions have a bootstrap argument, and then no distributional assumption is made.

  • Brian

    This post would have been much better if you fleshed out the distinction in an example.

  • hk

    Although a “prediction interval is an interval associated with a random variable yet to be observed”, when you cross-validate prediction intervals on some data, the future values in the test data are already available. So in this case the two notions should, or at least can, be compared. Could you please elaborate on this? For example, how is it possible to show that a prediction method provides better (not necessarily narrower) confidence intervals? For instance, showing that ets() gives better prediction intervals than meanf().

    • hk

      What about “Empirical Prediction Interval”s?

  • Laura Poole

    When using the forecast package to perform ARIMA analysis, can you change the CI?
    I want intervals at levels other than 80% and 95% but cannot figure out how to change them.

  • A credible interval is a Bayesian version of a confidence interval. Your first interval is a credible interval. I don’t know what you mean by “parameters’ posterior predictive distributions”. Presumably if you are referring to the distribution of a parameter, it is a credible interval. A prediction interval refers to the distribution of an unobserved data value.

    • Stats222

      Thank you for the clarification.
      I was referring to this distribution :

      • OK. So you don’t mean a “parameter’s posterior predictive distribution”. It is the posterior predictive distribution of a new data point. A corresponding credible interval would be the Bayesian analogue of a prediction interval.

  • SAN

    Do you discuss “The difference between prediction intervals and confidence intervals” in any journal or book? I want to cite it in my manuscript.

  • SAN

    I have a set of simulation data. The prediction interval is calculated from the simulation data and is then used to predict the real data from an experiment. Is this correct?

  • SAN

    Why does the prediction interval need to add 1 to the confidence interval?

  • Rizwan

    How are confidence intervals related to the confidence band (in a nonlinear regression problem)? I understand that the term confidence interval is reserved for the parameters involved in a regression problem, and that the confidence band encloses the area that one is confident contains the best-fit curve. Suppose there are two nonlinear parameters with confidence intervals a1 <= a <= a2 and b1 <= b <= b2, obtained through an asymptotic analysis (related to the variance-covariance matrix), and y = f(x;a,b) is the fitted function, so that y1 = f(x;a1,b1) and y2 = f(x;a2,b2). If these lower and upper limits are used in the best-fit function and plotted, does this plot relate to the confidence band? Would the strip generated using the lower and upper limits of the confidence intervals equal the confidence band? I guess the answer is no, but I am not sure why. Could you please explain and highlight the differences?

  • Rizwan

    Could you please explain the difference between confidence intervals and the confidence bound? Can a confidence bound of the best fit curve be obtained from the lower and upper limits of the confidence intervals of parameters ? Does it make sense to compute the confidence intervals using an asymptotic technique by computing the variance-covariance matrix and then using their lower and upper limits to trace the function and call the region as confidence band?

  • konstantinweixelbaum

    Could you maybe give an example how to calculate the Bayesian prediction interval? Maybe with an easy set of Data? Looking through the internet I couldn’t really find a definition or example. Thanks!

  • Nicolas

    really, statistics should get rid of that “i describe in words the operations I do” .
    Formulas. Modern math. Not being stuck in 16th century kind of approach.
    Descartes is the new black.

    The distinction IS a triviality ONCE the corresponding equation is written down.
    Before that, it is just another bloody case of bad science badly explained.

  • Thank you, Prof. Hyndman. This clarifies many things now. I have a further question. In economics, economic estimates are usually calculated from combinations of predictions. Take economies of scope, for instance. Once the cost function is estimated, economies of scope are estimated by the proportion of cost savings from joint production relative to fully integrated costs. The above costs are conditional expectations, given that the cost function coefficients are equal to some fixed constants. Could you tell me whether the interval for this estimate (economies of scope) is a confidence interval or a prediction interval? Thanks for your time.

    • You can compute either depending on whether you want to allow for uncertainty in the estimate only, or whether you also want to allow for the observational uncertainty in future.

  • 王蒙

    How can I compute the prediction interval using the Holt-Winters model? I am writing code to implement the Holt-Winters model, but I cannot produce the prediction interval.

    • See my 2008 Springer book, chapter 6.

      • 王蒙

        The parameters alpha, beta and gamma are estimated from the training data set. But training sets of different sizes give different parameters (for example, I can use the previous two months of data to train the model, or the previous three months, and the two resulting models are not the same). How should I choose the size of the training data set?
        To handle this problem, I guessed that the parameter sequence might converge as the size of the data set grows. I ran an experiment to test this hypothesis; however, the sequence did not converge.

  • kat

    Thanks for the thorough article regarding PI and CI. I have a question related to both terms. Currently, I am trying to model future stock prices. I apply two approaches:

    – modeling a 50% quantile (median) of future value (dependent on other variables) and then construct 90% intervals with quantiles of normal distribution
    – modeling a 5% and 95% quantile of future value which gives an 90% interval
    If I understood correctly, first approach will give 90% CI of median (being estimated parameter), while second approach will give 90% PI (as modeled is the future stock price value itself). I’d be grateful for confirmation, whether my conclusion is correct

    • It depends what the parameters of the normal distribution are. Just assuming a normal distribution does not make it a CI.

      • kat

        Normal distribution of mean zero and variance estimated from errors’ variance in the quantile regression model.

        • If you measure the variance from the errors, you are going to get a prediction interval. A confidence interval measures uncertainty about a parameter, in this case the median. So the variance has to be about parameter uncertainty, not observational uncertainty.

  • Pingback: Python:how do you create a linear regression forecast on time series data in python – IT Sprite

  • Pingback: Difference between prediction intervals and confidence intervals – Tingting Zhao's research blog

  • Pingback: Model variance for ARIMA models | Hyndsight

  • Rakesh Tripathi

    I found the procedure for calculating prediction intervals for the Holt-Winters method (Chatfield et al 1990/91, Hyndman 2003). But I could not find how to calculate confidence intervals for the parameters involved in these methods. Please provide some pointers. Thanks

    • They are estimated using MLE so are approximately (asymptotically) normally distributed. You just need the standard errors. But why do you want to compute confidence intervals for parameters of ETS models? The parameters are usually of no interest at all when interpreting data.

      • Rakesh Tripathi

        I want to baseline my time series, which comes from network latency. I am planning to flag as an ‘anomaly’ anything outside the one-step-ahead prediction interval. However, I do not want to detect lots of anomalies, so only something more than twice the PI will be declared an anomaly (the requirement is not to detect anomalies very accurately, but to avoid detecting too many).

        So you are saying that parameter CIs are not interesting. Is it because the dependent and independent variables are the same variable, and hence asking how good the predictor is does not make sense? I also read the following: “Prediction intervals must account for both the uncertainty in knowing the value of the population mean, plus data scatter. So a prediction interval is always wider than a confidence interval.”

        • Choose the probability of false anomalies that you are prepared to handle, and set the level of your prediction interval accordingly.

          Parameter CIs are not interesting because the parameters are not what you care about here. You care about the future observations.

          • Rakesh Tripathi

            I did not understand how the probability of false anomalies is related to the level of the PI. Please suggest some reading.
            I want 2% false anomalies. What should the level of the prediction interval be?

  • Pingback: Model Variance For ARIMA Models | A Bunch Of Data