The difference between prediction intervals and confidence intervals

Prediction intervals and confidence intervals are not the same thing. Unfortunately, the terms are often confused, and I am frequently correcting the error in students’ papers and in articles I am reviewing or editing.

A prediction interval is an interval associated with a random variable yet to be observed, with a specified probability of the random variable lying within the interval. For example, I might give an 80% interval for the forecast of GDP in 2014. The actual GDP in 2014 should lie within the interval with probability 0.8. Prediction intervals can arise in Bayesian or frequentist statistics.

A confidence interval is an interval associated with a parameter and is a frequentist concept. The parameter is assumed to be non-random but unknown, and the confidence interval is computed from data. Because the data are random, the interval is random. A 95% confidence interval will contain the true parameter with probability 0.95. That is, with a large number of repeated samples, 95% of the intervals would contain the true parameter.
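
The repeated-sampling statement can be checked with a small simulation. The sketch below is only an illustration under assumptions that are not part of the post: normally distributed data, a known standard deviation (so that a simple z-interval is exact), and hypothetical values for the true mean, the sample size and the number of replications.

```python
import random
from statistics import NormalDist, fmean


def ci_coverage(mu=2.0, sigma=1.0, n=30, reps=20000, level=0.95, seed=1):
    """Fraction of z-intervals for the mean that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level=0.95
    half_width = z * sigma / n ** 0.5          # sigma is assumed known here
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        # The interval is random (it depends on the sample); mu is fixed.
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps


coverage = ci_coverage()
print(f"Proportion of intervals containing the true mean: {coverage:.3f}")
```

The printed proportion should be close to 0.95, which is the sense in which the interval, not the parameter, carries the probability.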

A Bayesian confidence interval, also known as a “credible interval”, is an interval associated with the posterior distribution of the parameter. In the Bayesian perspective, parameters are treated as random variables, and so have probability distributions. Thus a Bayesian confidence interval is like a prediction interval, but associated with a parameter rather than an observation.

I think the distinction between prediction and confidence intervals is worth preserving because sometimes you want to use both. For example, consider the regression

    \[ y_i = \alpha + \beta x_i + e_i \]

where \(y_i\) is the change in GDP from quarter \(i-1\) to quarter \(i\), \(x_i\) is the change in the unemployment rate from quarter \(i-1\) to quarter \(i\), and \(e_i\sim\text{N}(0,\sigma^2)\). (This regression model is known as Okun’s law in macroeconomics.) In this case, both confidence intervals and prediction intervals are interesting. You might be interested in the confidence interval associated with the mean value of \(y\) when \(x=0\); that is, the mean growth in GDP when the unemployment rate does not change. You might also be interested in the prediction interval for \(y\) when \(x=0\); that is, the likely range of future values of GDP growth when the unemployment rate does not change.
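
For reference, the two intervals have the following textbook forms for this model when it is fitted by least squares to \(n\) observations. The notation is introduced here for illustration and is not in the original text: \(\hat{\alpha}\) and \(\hat{\beta}\) are the least-squares estimates, \(\hat{\sigma}^2\) is the residual sum of squares divided by \(n-2\), \(\bar{x}\) is the sample mean of the \(x_i\), \(x_0\) is the value of interest (here \(x_0=0\)), and \(t_{n-2,\,0.975}\) is the 97.5% quantile of a t distribution with \(n-2\) degrees of freedom. The 95% confidence interval for the mean of \(y\) at \(x_0\) is

    \[ \hat{\alpha} + \hat{\beta} x_0 \pm t_{n-2,\,0.975}\,\hat{\sigma} \sqrt{\frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}} \]

and the 95% prediction interval for a new observation of \(y\) at \(x_0\) is

    \[ \hat{\alpha} + \hat{\beta} x_0 \pm t_{n-2,\,0.975}\,\hat{\sigma} \sqrt{1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{\sum_{i=1}^{n}(x_i-\bar{x})^2}} \]

The extra 1 under the square root is the contribution of the new error term. As \(n\) grows, the confidence interval shrinks towards zero width, while the prediction interval shrinks only towards roughly \(\pm 1.96\sigma\).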

The distinction is mostly retained in the statistics literature. However, in econometrics it is common to use “confidence intervals” for both types of interval (e.g., Granger & Newbold, 1986). I once asked Clive Granger why he confused the two concepts, and he dismissed my objection as fussing about trivialities. I disagreed with him then, and I still do.

I have seen someone compute a confidence interval for the mean, and use it as if it was a prediction interval for a future observation. The trouble is, confidence intervals for the mean are much narrower than prediction intervals, and so this gave him an exaggerated and false sense of the accuracy of his forecasts. Instead of the interval containing 95% of the probability space for the future observation, it contained only about 20%.
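
To see how a figure of this size can arise, consider the simplest possible case. It is an assumption made here for illustration, not necessarily the setting of the anecdote: n independent normal observations with known standard deviation, a 95% confidence interval for the mean, and that interval then read as a prediction interval for one new observation. The sample sizes in the loop are hypothetical.

```python
from statistics import NormalDist


def coverage_of_mean_ci_as_pi(n, level=0.95):
    """Probability that one new observation falls inside the CI for the mean.

    Assumes n i.i.d. normal observations with known standard deviation.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(0.5 + level / 2)
    # The CI half-width is z*sigma/sqrt(n). The error (new observation minus
    # sample mean) has standard deviation sigma*sqrt(1 + 1/n).
    # Their ratio simplifies to z/sqrt(n + 1), so sigma cancels out.
    ratio = z / (n + 1) ** 0.5
    return 2 * std_normal.cdf(ratio) - 1


for n in (5, 20, 59, 200):
    print(n, round(coverage_of_mean_ci_as_pi(n), 3))
```

Under these assumptions, about 59 observations are enough for the nominal 95% interval to cover a new observation only about 20% of the time, and the coverage keeps falling as n grows.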

So I ask statisticians to please preserve this distinction. And I ask econometricians to stop being so sloppy about terminology. Unfortunately, I can’t continue my debate with Clive Granger. I rather hoped he would come to accept my point of view.


  • molecule61

    It’s not correct to say that model parameters are considered to be random in the Bayesian perspective; they are considered to be unknown. The probability distribution for the parameter is a measure of your uncertainty about its fixed value.

    • http://robjhyndman.com Rob J Hyndman

      Yes, but it is correct to say they are “treated as random variables”.

  • Eran

    Is it the case that there is a one-to-one mapping between PI and CI?
    (For example, PI = CI+std*1, when symmetry is assumed.)
    If so, that might be an additional reason for the confusion.

    • http://robjhyndman.com Rob J Hyndman

      Maybe. But there is a one-to-one mapping between variance and standard deviation too, and nobody confuses them.

  • Hrodebert

    The statement “A 95% confidence interval will contain the true parameter with probability 0.95.” might be misunderstood, because the true parameter either falls into a given interval or it does not. But if the interval looks like a - T < param < a + T, where T is a statistic, you are absolutely right.

    • http://robjhyndman.com Rob J Hyndman

      If you read the next sentence, I don’t think it can be misunderstood.

  • http://robjhyndman.com Rob J Hyndman

    Read the following sentence. The CI is random because it is based on the data. The probability coverage occurs with repeated sampling.

  • zbicyclist

    I agree this is an important distinction. I agree that students have a lot of trouble remembering which is which.

    I think the terminology is to blame. Aren’t they both confidence intervals, just confidence about different things? So we might call them “Individual Prediction confidence interval” and “General Prediction confidence interval”, although I’m not terribly happy with that exact phrasing.

    • d0ubs

      I totally agree with you, they are both CIs but for different things. One is for the mean of the dependent variable and the other is for the dependent variable itself.

      Also, the article is a bit confusing in implying that one difference between the two concepts is that a prediction interval is used for future value(s). That is somewhat misleading: you can just as well compute an interval for the future mean value of the dependent variable, and you can compute an interval for the value of the dependent variable conditional on an observed value of the independent variable (for instance, at the sample mean of the independent variable).

  • mark

    Hello,
    I have a question that doesn’t really correspond to the topic above. I estimated ARIMA coefficients using the auto.arima() function on 250 observations and produced forecasts. Now I have added new observations, and I want to use this particular model and its coefficients to forecast from the 251st observation. What should I do?

    • http://robjhyndman.com Rob J Hyndman

      Use the model argument in forecast.Arima().

      • marek

        OK, thank you, but then won’t I keep only the structure of the ARIMA model (the number of parameters), while the coefficient values change?

        • http://robjhyndman.com Rob J Hyndman

          No. As I have already said, it applies the model to new data *without changing the coefficients*.

      • mark

        Now I understand, sorry. I found the documentation of the forecast package. Thank you!

  • Johnno

    Thanks for this! I have a practical question that’s related to this. I have some time series data that I’m using to create a multiplicative HW forecast, and I want to create a PI around the 12-month look-ahead forecast. So I was thinking about going into my time series and, for a period in it, creating some 12-month look-ahead forecasts and using the empirical distribution of the errors between them and the actuals to generate a PI.

    As it relates to the PI/CI discussion above, I was reading about making bootstrap CIs, but since what I want is a PI, maybe that approach doesn’t work. Or does it?

    Secondly, just generally, is there an approach that uses the empirical distribution of forecast errors to construct PIs?

    Thanks,
    Johnno K.

    • http://robjhyndman.com Rob J Hyndman

      Yes, you can do that. But you generally won’t have enough data to get a good estimate. Usually, a better approach is to use the modelling framework for HW. If you are using R, use the ets() function in the forecast package with model="MAM".


  • Ken

    It is odd that this is something that is not usually covered in a first-year stats course, but rather in second year for linear regression. Covering it in first year for means would help in clarifying the difference between standard deviation and standard error, and then make it easier to cover for regression.

  • Rajib Sarkar

    Thank you, Prof. Hyndman! This is the first time I have understood the distinction between PI and CI clearly. Many thanks, indeed!!

  • Wei1

    Thanks! So in the above example, “the mean growth in GDP when the unemployment rate does not change”, the mean growth here means the total GDP distribution, right?
    And how is the prediction interval computed in the forecast function of the forecast package? Do you use the residuals’ variance to estimate the variance of the forecast data?
    Thanks!

    • http://robjhyndman.com/ Rob J Hyndman

      No. The mean growth in GDP means the average quarterly change in GDP.

      Prediction intervals depend on the model. The forecast function computes them using the theoretical variance of the forecast distribution. For a one-step time series forecast, that is equal to the residual variance. But for other steps, and for regression models, the forecast variance is not the same as the residual variance.

      • Wei1

        But the residual variance is used to estimate the variance of the forecast distribution. If my data has frequency=7 days and I want to forecast, for example, the 15th day’s data, should I only consider the variance of the 1st, 8th and 14th data points? Thanks!

        • http://robjhyndman.com/ Rob J Hyndman

          No. The residual variance estimates the one-step forecast variance for time series. For multi-step or cross-sectional forecasts, the residual variance is NOT equal to the forecast variance, as I’ve already explained. Your second question does not make sense to me. The forecast variance does not depend directly on the variance of any particular days.

      • Wei1

        And do you assume the distribution is normal when computing the prediction interval in the forecast function? Thanks!

        • http://robjhyndman.com/ Rob J Hyndman

          Yes, usually. But some functions have a bootstrap argument, and then no distributional assumption is made.

  • Brian

    This post would have been much better if you had fleshed out the distinction in an example.

  • hk

    Although a “prediction interval is an interval associated with a random variable yet to be observed”, when you cross-validate prediction intervals on some data, the future values in the test data are already available. So, in this case, the two notions should/can be compared. Could you please elaborate on this? For example, how is it possible to show that a prediction method provides better (not necessarily narrower) confidence intervals? For instance, showing that ets() gives better prediction intervals compared to meanf().

    • hk

      What about “Empirical Prediction Interval”s?

  • Laura Poole

    So, when using the forecast package to perform ARIMA analysis,
    can you change the CI?
    I want confidence intervals other than 80% and 95% but cannot figure out how to change them.
    Thanks,
    Laura

    • http://robjhyndman.com/ Rob J Hyndman

      Use the argument level.

  • http://robjhyndman.com/ Rob J Hyndman

    A credible interval is a Bayesian version of a confidence interval. Your first interval is a credible interval. I don’t know what you mean by “parameters’ posterior predictive distributions”. Presumably if you are referring to the distribution of a parameter, it is a credible interval. A prediction interval refers to the distribution of an unobserved data value.