Probit out-of-sample pseudo R2
Hi all,
I've been estimating a probit model to forecast the probability of a recession with financial variables, as in the paper by Estrella and Mishkin (1998) (see attached file). For example, to estimate the 6-month-ahead probability of a recession in a (pseudo) out-of-sample test using the term spread, I have the following code:
@NBERCycles(peaks=peaks,troughs=troughs,up=ups,down=downs)
do endperiod = 1994:12, 2008:6
DDV(DIST=PROBIT, noprint) downs 1960:1 endperiod
# constant spread{6}
prj(dist=probit,cdf=prb)
end do
How can I calculate the out-of-sample pseudo-R2 statistic used in the paper?
Thank you very much in advance,
Franziska
- Attachments
- Estrella and Mishkin 1998.pdf (639.78 KiB)
Re: Probit out-of-sample pseudo R2
They don't clearly define how that's calculated, particularly for the rolling sample, so this is my best guess of what they mean. Note that your PRJ instructions compute the probabilities in-sample, not out-of-sample. You need to override the range on PRJ in order to do the predictions (here NH steps ahead).
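For reference, the final DISPLAY line in the code below evaluates the following expression, written out here as a formula. The symbols are notation introduced for this note, not taken from the paper: \log L_u is the sum of the out-of-sample log likelihood contributions from the model with the spread (SUMLOGL), \log L_c is the same sum for the constant-only model (SUMLOGLC), and n is the number of forecasts (%NOBS). Whether this matches the paper's out-of-sample definition exactly remains a best guess.

```latex
\[
  \text{pseudo-}R^2 \;=\; 1 - \left( \frac{\log L_u}{\log L_c} \right)^{-\frac{2}{n}\,\log L_c}
\]
```

Since \log L_c is negative, the exponent is positive, so the statistic is positive when \log L_u is closer to zero than \log L_c and negative when the spread model predicts worse than the constant alone.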
@NBERCycles(peaks=peaks,troughs=troughs,up=ups,down=downs)
clear(zeros) logl loglc
compute nh=6
do endperiod = 1994:12, 2008:6
ddv(dist=probit, noprint) downs 1960:1 endperiod
# constant spread{6}
prj(dist=probit,cdf=prb) * endperiod+nh endperiod+nh
ddv(dist=probit, noprint) downs 1960:1 endperiod
# constant
prj(distrib=probit,cdf=prbc) * endperiod+nh endperiod+nh
compute logl(endperiod)=log(%if(downs(endperiod+nh),prb(endperiod+nh),1-prb(endperiod+nh)))
compute loglc(endperiod)=log(%if(downs(endperiod+nh),prbc(endperiod+nh),1-prbc(endperiod+nh)))
end do
sstats 1994:12 2008:6 logl>>sumlogl loglc>>sumloglc
disp "Pseudo-R^2" 1-(sumlogl/sumloglc)^((-2.0/%nobs)*sumloglc)
Re: Probit out-of-sample pseudo R2
Hi Tom,
Thank you very much for your help. I have replicated the Estrella and Mishkin (1998) tests for the spread, both in-sample and out-of-sample. My in-sample results are pretty close; the small differences may be due to rounding or a different data source. The out-of-sample pseudo-R2s, however, are off. I'm a little confused about the use of lagged explanatory variables in the out-of-sample test: the paper does not mention any lags for the spread (just spread_t), but I get negative R2s for all forecast horizons. I attached the data files; my code is:
@NBERCycles(peaks=peaks,troughs=troughs,up=ups,down=downs)
set spread = GS10 - TBill
*IN-SAMPLE**************
dis "In-Sample Estimation"
do i=1,8
DDV(DIST=PROBIT, robusterrors, noprint) downs 1959:1 1995:1
# constant spread{i}
prj(dist=probit,cdf=cdf)
set %s("fitprb"+i) = cdf
dis i "quarters ahead pseudo-R^2:" %Rsquared "t-stat:" %TSTATS(2)
end do
*OUT-OF-SAMPLE**********
clear(zeros) logl loglc
dis "Out-of-sample Estimation"
do i=1,8
do endperiod = 1970:4, 1995:1
DDV(DIST=PROBIT, noprint) downs 1959:1 endperiod
# constant spread{1}
prj(dist=probit,cdf=prb) * endperiod+i endperiod+i
ddv(dist=probit, noprint) downs 1959:1 endperiod
# constant
prj(distrib=probit,cdf=prbc) * endperiod+i endperiod+i
compute logl(endperiod)=log(%if(downs(endperiod+i),prb(endperiod+i),1-prb(endperiod+i)))
compute loglc(endperiod)=log(%if(downs(endperiod+i),prbc(endperiod+i),1-prbc(endperiod+i)))
end do endperiod
sstats 1970:4 1995:1 logl>>sumlogl loglc>>sumloglc
disp i "quarters ahead pseudo-R^2:" 1-(sumlogl/sumloglc)^((-2.0/%nobs)*sumloglc)
end do i
Thanks again,
Franziska
- Attachments
- TBillsQuarterly.RAT (3.25 KiB)
- 10yTreasuryQuarterly.RAT (2.5 KiB)
Re: Probit out-of-sample pseudo R2
Could you attach your whole program please? You cut out the data instructions. Apply the Code button to your pasted program - it makes the post easier to read.
On the question about lags: that's just a timing convention. They model y*(t+h) given x(t) rather than y*(t) given x(t-h).
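The equivalence can be written out explicitly. The following is a sketch in notation assumed for this note (\alpha, \beta and \varepsilon are not symbols from the thread): the two forms are the same latent-variable equation with the time index shifted by h, which is why a regressor written as spread{h} corresponds to the paper's unlagged spread_t.

```latex
\[
  y^{*}_{t+h} = \alpha + \beta\, x_{t} + \varepsilon_{t+h}
  \quad\Longleftrightarrow\quad
  y^{*}_{t} = \alpha + \beta\, x_{t-h} + \varepsilon_{t}
\]
```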
Re: Probit out-of-sample pseudo R2
This is the program file
- Attachments
- Estrella and Mishkin 1998 Probit.RPF (1.41 KiB)
Re: Probit out-of-sample pseudo R2
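You have a {1} rather than {i} on the regressor line of the first DDV in the out-of-sample loop: "# constant spread{1}" should be "# constant spread{i}". With that corrected, the loop is: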
do i=1,8
do endperiod = 1970:4, 1995:1
DDV(DIST=PROBIT, noprint) downs 1959:1 endperiod
# constant spread{i}
prj(dist=probit,cdf=prb) * endperiod+i endperiod+i
ddv(dist=probit, noprint) downs 1959:1 endperiod
# constant
prj(distrib=probit,cdf=prbc) * endperiod+i endperiod+i
compute logl(endperiod)=log(%if(downs(endperiod+i),prb(endperiod+i),1-prb(endperiod+i)))
compute loglc(endperiod)=log(%if(downs(endperiod+i),prbc(endperiod+i),1-prbc(endperiod+i)))
end do endperiod
sstats 1970:4 1995:1 logl>>sumlogl loglc>>sumloglc
disp i "quarters ahead pseudo-R^2:" 1-(sumlogl/sumloglc)^((-2.0/%nobs)*sumloglc)
end do i
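The negative pseudo-R2 values reported earlier in the thread are what this statistic produces whenever the model with the regressor predicts worse out-of-sample than the constant-only model. As a rough cross-check outside RATS, here is a minimal Python sketch of the same formula used in the DISPLAY line. The function name and all of the numbers are hypothetical, made up purely for illustration; they are not from the paper or from the attached data.

```python
import math


def estrella_pseudo_r2(outcomes, probs, probs_const):
    """Pseudo-R^2 as computed in the thread: 1 - (logL/logLc)^(-(2/n)*logLc).

    outcomes    : 0/1 indicators (1 = recession) for the forecast dates
    probs       : predicted recession probabilities from the full model
    probs_const : predicted probabilities from the constant-only model
    """
    n = len(outcomes)
    # Log likelihood contribution: log(p) if the event occurred, else log(1-p)
    logl = sum(math.log(p if y else 1.0 - p) for y, p in zip(outcomes, probs))
    loglc = sum(math.log(p if y else 1.0 - p) for y, p in zip(outcomes, probs_const))
    return 1.0 - (logl / loglc) ** ((-2.0 / n) * loglc)


# Hypothetical illustration data (not from the paper or the attachments)
outcomes = [0, 0, 1, 1, 0, 0, 0, 1]
const_probs = [0.3] * 8
good_probs = [0.1, 0.1, 0.8, 0.7, 0.2, 0.1, 0.2, 0.6]   # tracks the outcomes
bad_probs = [0.8, 0.7, 0.1, 0.2, 0.6, 0.7, 0.8, 0.1]    # mismatched signal

r2_good = estrella_pseudo_r2(outcomes, good_probs, const_probs)
r2_bad = estrella_pseudo_r2(outcomes, bad_probs, const_probs)
r2_same = estrella_pseudo_r2(outcomes, const_probs, const_probs)

print("informative forecasts:", r2_good)
print("mismatched forecasts: ", r2_bad)
print("constant-only vs itself:", r2_same)
```

Forecasts that line up with the outcomes give a positive value, forecasts that are systematically mismatched (as happens when the wrong lag is paired with the forecast horizon) give a negative one, and a model that predicts exactly like the constant-only benchmark gives zero.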