Bai-Perron Procedure

**Ken-Cogger** » Thu Dec 20, 2012 11:54 pm

Bai-Perron fits multiple structural changes to a multiple regression.
In the simple linear regression case with two linear pieces, the regimes are x<=c and x>c, where the break c is restricted to be one of the observed x values.
The intersection of the two fitted lines will almost never fall at one of the x values, so the procedure cannot estimate
a continuous piecewise linear function of the form y=B1+B2*x+B3*(x-H)*(x>=H), in which the hinge H may take any value, not just an observed x.

This additional continuity requirement, of course, cannot be handled by the Bai-Perron procedure @baiperron so ably programmed
by Tom Doan. In all simulated cases I have examined, the following seems to work:
(1) Run @baiperron on the data, obtaining an estimate of the two lines.
(2) Calculate the implied values of B1, B2, B3, and H from these two lines.
(3) Starting with these initial values, run NLLS (I find METHOD=GENETIC preferable).
(4) Using the final H from NLLS, compute x1=(x-H)*(x>=H) and run linreg y; # constant x x1
to get the usual t- and F-tests. These are only approximate, of course, because H was found by a search rather than fixed in advance.
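
Step (2) can be written out explicitly. The following is only a sketch of the algebra, assuming the two fitted lines are labelled y=a1+b1*x (lower regime) and y=a2+b2*x (upper regime); the symbols a1, b1, a2, b2 are notation introduced here for illustration and correspond to the order in which the code in this post reads %beta(1) through %beta(4).

```latex
% Notation a_1, b_1, a_2, b_2 is assumed here for illustration only.
\begin{align*}
\text{fitted lines:} \quad & y = a_1 + b_1 x \ \ (\text{lower regime}), \qquad y = a_2 + b_2 x \ \ (\text{upper regime}) \\
\text{hinge model, } x < H: \quad & y = B_1 + B_2 x \\
\text{hinge model, } x \ge H: \quad & y = (B_1 - B_3 H) + (B_2 + B_3)\,x \\
\text{matching the lower line and both slopes:} \quad & B_1 = a_1, \qquad B_2 = b_1, \qquad B_3 = b_2 - b_1 \\
\text{intersection of the two lines:} \quad & a_1 + b_1 H = a_2 + b_2 H
  \;\Longrightarrow\; H = \frac{a_2 - a_1}{b_1 - b_2}
\end{align*}
```

As a check against the generating model used in the simulation (B1=1, B2=2, B3=-3, H=23.3): the upper line has intercept 1-(-3)(23.3)=70.9 and slope -1, so H=(70.9-1)/(2-(-1))=23.3.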

Am I off base on this, or does it seem reasonable?

The following RATS code demonstrates the idea.


*Bai-Perron Simulation
set x 1 100 =t
set y 1 100 =1+2*x-3*(x-23.3)*(x>=23.3)+%ran(1)
*Note that all following results will depend on the particular generation of random N(0,1) values
*added to the basic model y = B1+B2*x+B3*(x-H)*(x>=H)
*with B1 = 1, B2 = 2, B3 = -3, and H = 23.3
*This model, unlike Bai-Perron, allows the hinge point H to differ from the recorded x values.
scatter(style=dots,header="Piecewise Linear Simulation",subheader="y=1+2x-3(x-23.3)(x>=23.3)+N(0,1)",vlabel='y',hlabel='x',hticks=20) 1
# x y
*Run Bai-Perron analysis with one break point (two linear pieces)
source baiperron.src
@baiperron(maxbreaks=1,iters=100) y
# constant x
display 'RSS =' %rss
*Bai-Perron identifies two regions for the piecewise linear regression, separated by x <= H and x > H,
*where H is restricted to be one of the values of x. The RSS is reflective of this restriction.
*The intersection of the two Bai-Perron lines will almost certainly not be at one of the values of x.
*These five parameters (the two linear pieces + the dividing point of the regions) translate into
*the four-parameter model y = B1+B2*x+B3*(x-H)*(x>=H); Bai-Perron itself requires the break to be one of the x values.
*Compute and display the implied B and H parameters.
compute B1 =%beta(1)
compute B2 =%beta(2)
compute B3 =%beta(4)-%beta(2)
compute H =(%beta(3)-%beta(1))/(%beta(2)-%beta(4))
display 'B1 =' B1
display 'B2 =' B2
display 'B3 =' B3
display 'H =' H
*These parameter estimates will usually be close to those of the generating model.
*However, the computed value of H from Bai-Perron will almost never be a specific value of x.
*We call such an estimate inadmissible. Refinement of the Bai-Perron estimates is needed.
*We have found nonlinear least squares (NLLS) to be a useful refinement of the estimates.
*METHOD=GAUSS usually produces a poor refinement, due to its reliance on derivatives which do not exist for the generating model.
*METHOD=SIMPLEX works fairly well, but the results are generally not as good as
*METHOD=GENETIC so that is our preferred choice to clean up the estimates.
*Note that B1,B2,B3,H have been previously calculated from Bai-Perron; these are used as initial estimates in NLLS.
nonlin B1 B2 B3 H
FRML YHAT =B1+B2*x+B3*(x-H)*(x>=H)
NLLS(FRML=YHAT,ITERS=1000,METHOD=GENETIC) y
*Display the GENETIC results for the coefficients
display 'B1 =' %beta(1)
display 'B2 =' %beta(2)
display 'B3 =' %beta(3)
display 'H =' %beta(4)
display 'RSS =' %rss
*For this nonlinear estimation, RSS will often be larger than reported by @baiperron.
*This is due to the additional continuity requirement of the fitted model.
*For reporting of approximate (due to the above search process) standard errors, t-tests, and F-test,
*run an ordinary linear regression after computing an additional explanatory variable x1 from the
*estimated hinge H.
set x1 1 100 =(x-%beta(4))*(x>=%beta(4))
linreg y
# constant x x1
display 'B1 =' %beta(1)
display 'B2 =' %beta(2)
display 'B3 =' %beta(3)
*The linreg display should show the same values as from NLLS for B1,B2,B3 for the fixed H.
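*Optional cross-check of the hinge estimate: a sketch only, not part of the procedure above.
*For any fixed H the model is linear in B1,B2,B3, so H can also be profiled over a grid,
*keeping the H with the smallest RSS. The names htry, hbest, rssmin and the grid of
*2.0 to 99.0 in steps of 0.1 are illustrative assumptions, not anything required by RATS.
compute rssmin =1.0e+30
compute hbest =0.0
do i=0,970
   compute htry =2.0+0.1*i
   set x1 1 100 =(x-htry)*(x>=htry)
   linreg(noprint) y
   # constant x x1
   if %rss<rssmin
      compute rssmin=%rss,hbest=htry
end do i
display 'Grid H =' hbest
display 'Grid RSS =' rssmin
*If the NLLS search has converged, its H and RSS should be close to these values, up to the grid resolution.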

**TomDoan** » Wed Jan 02, 2013 9:33 pm

First, from your description, it sounds like you should be using @MultipleBreaks rather than @BaiPerron. @BaiPerron is for breaks based upon time sequence, while @MultipleBreaks is for breaks based upon the values of another series. However, neither of those can be used for regressions with join points, and I don't know whether either would even give you a good guess value.
