Examples / BOXCOX.RPF
BOXCOX.RPF is an example of estimation of a model with a Box-Cox transformation on the dependent variable.
Box-Cox models (Box and Cox (1964)) can take several forms. The only form that requires MAXIMIZE rather than NLLS is one in which the dependent variable is subject to a Box-Cox transformation with an unknown parameter, because the log likelihood then has to take into account the Jacobian of the transformation. A common type of model makes the relationship between y and x linear when both variables are subject to the same Box-Cox transformation; this includes, as special cases, a linear relationship and a log-linear relationship:
(1) \(y_t^{\left( \lambda \right)} = b_0 + b_1 x_t^{\left( \lambda \right)} + u_t \)
In RATS, you can compute the Box-Cox class of transformations (typically denoted as \(y^{\left( \lambda \right)} \)) using the function %BOXCOX(x,lambda).
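The linear and log-linear special cases come from the behavior of the transformation (defined in (2) below) at \(\lambda = 1\) and in the limit as \(\lambda \rightarrow 0\): since \(y^{\left( 1 \right)} = y - 1\), the case \(\lambda = 1\) is a linear relationship with a shifted intercept, while \(\mathop {\lim }\limits_{\lambda \to 0} \frac{{y^\lambda - 1}}{\lambda } = \log y\), so \(\lambda \rightarrow 0\) gives the log-linear form.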
The example generates data from a square root on square root relationship:
all 200
set x = %rangamma(5.0)
set sqry = .5 + 3*sqrt(x) + %ran(4.0)
set y = sqry^2
(%RANGAMMA is used in generating x because we need x to have positive values).
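Note that \(\sqrt y \) and \(\sqrt x \) are translated, scaled versions of the \(\lambda = 1/2\) Box-Cox transforms, since \(y^{\left( {1/2} \right)} = 2\left( {\sqrt y - 1} \right)\). Substituting into the generating equation,
\(y^{\left( {1/2} \right)} = 2\sqrt y - 2 = 6\sqrt x - 1 + 2e = 5 + 3x^{\left( {1/2} \right)} + 2e\)
where \(e\) is the %RAN(4.0) noise term, so (1) holds exactly at \(\lambda = 1/2\), with coefficients that are transforms of the .5, 3 and 4 used above.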
With
(2) \(y^{\left( \lambda \right)} = \frac{{(y^\lambda - 1)}}{\lambda }\)
we have
(3) \(\frac{{dy^{\left( \lambda \right)} }}{{dy}} = y^{\lambda - 1} \)
so we need to adjust the log likelihood for the model in (1) by adding (for each entry)
\(\log y^{\lambda - 1} = \left( {\lambda - 1} \right)\log y\)
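Putting the pieces together, the log likelihood element for entry \(t\) is
\(\left( {\lambda - 1} \right)\log y_t - \frac{1}{2}\log \left( {2\pi \sigma ^2 } \right) - \frac{{\left( {y_t^{\left( \lambda \right)} - b_0 - b_1 x_t^{\left( \lambda \right)} } \right)^2 }}{{2\sigma ^2 }}\)
which is what the BOXCOX formula below computes entry by entry, using %LOGDENSITY(sigma^2,u) for the log of the Normal density with variance \(\sigma ^2\) evaluated at \(u\).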
The four free parameters are \(\lambda\), the two linear parameters in (1), and the variance of \(u\), assuming the \(u_t\) are Normally distributed (we will actually estimate the standard deviation of \(u\)). Using a separate FRML for the right side of (1), the log likelihood can be computed using the BOXCOX FRML defined by:
nonlin b0 b1 lambda sigma
frml rhsfrml = b0+b1*%boxcox(x,lambda)
frml boxcox = (lambda-1)*log(y) + %logdensity(sigma^2,%boxcox(y,lambda)-rhsfrml(t))
For guess values, the example uses the linear regression of y on x. The guess value for the intercept has to be adjusted, since the Box-Cox transformation subtracts 1 in each term, which doesn't affect the slope but does affect the intercept (the algebra is worked out after the code):
linreg y
# constant x
compute sigma=sqrt(%seesq),b0=%beta(1)+%beta(2)-1
compute b1=%beta(2),lambda=1.0
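To see the intercept adjustment: at \(\lambda = 1\), the transformation is just \(z^{\left( 1 \right)} = z - 1\), so (1) reads \(y - 1 = b_0 + b_1 \left( {x - 1} \right) + u\), or \(y = \left( {1 + b_0 - b_1 } \right) + b_1 x + u\). Writing the LINREG output as \(y = a_0 + a_1 x\), we have \(b_1 = a_1\) and \(b_0 = a_0 + a_1 - 1\), which is the %BETA(1)+%BETA(2)-1 used above.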
The Box-Cox model is estimated using MAXIMIZE:
maximize(method=bfgs) boxcox
Because the example uses randomly generated data, the results will differ each time it is run, but this is an example of the output:
MAXIMIZE - Estimation by BFGS
Convergence in 23 Iterations. Final criterion was 0.0000020 <= 0.0000100
Usable Observations 200
Function Value -1005.9653
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. B0 3.9672328268 0.9399214875 4.22081 0.00002434
2. B1 2.1293789081 0.4359881354 4.88403 0.00000104
3. LAMBDA 0.3806290766 0.0461064950 8.25543 0.00000000
4. SIGMA 4.3753157531 0.7346527557 5.95562 0.00000000
An alternative approach in RATS is to use NLLS with the JACOBIAN option, which provides the adjustment for the transformation. Note that JACOBIAN takes the Jacobian itself (3), not its log. (The variance gets concentrated out in doing this, that is, \(\sigma ^2\) is replaced by its maximum likelihood value given the other parameters.) The FRML input to NLLS (called BOXCOXX here) is a definition of u from (1):
nonlin b0 b1 lambda
linreg y
# constant x
compute b0=%beta(1)+%beta(2)-1,b1=%beta(2),lambda=1.0
frml boxcoxx = %boxcox(y,lambda)-b0-b1*%boxcox(x,lambda)
nlls(frml=boxcoxx,jacobian=y^(lambda-1))
Nonlinear Least Squares - Estimation by Gauss-Newton
Convergence in 28 Iterations. Final criterion was 0.0000034 <= 0.0000100
Usable Observations 200
Degrees of Freedom 197
Standard Error of Estimate 4.4086467885
Sum of Squared Residuals 3828.9248017
Log Likelihood -1005.9653
Durbin-Watson Statistic 1.8744
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. B0 3.9673127680 0.1342121976 29.56000 0.00000000
2. B1 2.1294229014 0.0637617701 33.39655 0.00000000
3. LAMBDA 0.3806381995 0.0100254241 37.96729 0.00000000
Finally, it's possible to estimate this type of model using the fact that, given \(\lambda\), the model is a linear regression, so it can be estimated using a search over \(\lambda\) alone. There is probably no real advantage to this over direct log likelihood maximization given the speed of modern computers. Note that the log likelihood that comes out of the LINREG on the transformed data needs to be adjusted for the Jacobian term.
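Given \(\lambda\), the LINREG on the transformed data provides the maximum likelihood values of \(b_0\) and \(b_1\), and the maximum likelihood variance is \(\hat \sigma ^2 \left( \lambda \right) = RSS\left( \lambda \right)/T\). Substituting those back in, the function being maximized over \(\lambda\) is the concentrated log likelihood
\(\left( {\lambda - 1} \right)\sum\limits_t {\log y_t } - \frac{T}{2}\left( {\log 2\pi + 1 + \log \hat \sigma ^2 \left( \lambda \right)} \right)\)
This is what the LOGLIKELY formula in the code below computes: SSTATS pre-computes \(\sum\limits_t {\log y_t }\), and the %LOGCONCDENSITY term corresponds to the remaining (concentrated Normal density) terms. Its maximized value matches the function value reported by MAXIMIZE.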
sstats / log(y)>>sumlogy
nonlin lambda
compute lambda=1.0
declare real loglikely
find(print) maximum loglikely
set xbc = %boxcox(x,lambda)
set ybc = %boxcox(y,lambda)
linreg(noprint) ybc
# constant xbc
compute sigmasq=%rss/%nobs
compute loglikely=(lambda-1)*sumlogy+%logconcdensity(sigmasq,%nobs)
end find
The output from this is:
FIND Optimization - Estimation by Simplex
Convergence in 20 Iterations. Final criterion was 0.0000061 <= 0.0000100
Function Value -1005.9653
Variable Coeff
**********************************************
1. LAMBDA 0.3806396484
You would have to re-do the LINREG to get the other coefficients.
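For instance, a minimal sketch reusing the series and instructions already defined above (after FIND, LAMBDA holds the estimated value):
*
* Re-do the regression at the estimated lambda to recover b0, b1 and sigma.
* Dividing the sum of squared residuals by %NOBS gives the maximum
* likelihood estimate of the variance, as in the MAXIMIZE setup.
*
set xbc = %boxcox(x,lambda)
set ybc = %boxcox(y,lambda)
linreg ybc
# constant xbc
compute sigma = sqrt(%rss/%nobs)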
Full Program
all 200
*
* Create data with a square root transformation.
*
set x = %rangamma(5.0)
set sqry = .5 + 3*sqrt(x) + %ran(4.0)
set y = sqry^2
*
* Using MAXIMIZE.
* The free coefficients b0, b1 and sigma will be transforms of the .5, 3 and 4
* used in generating the data since sqrt(y) and sqrt(x) are translated, scaled
* versions of the Box-Cox transformed data.
*
nonlin b0 b1 lambda sigma
frml rhsfrml = b0+b1*%boxcox(x,lambda)
frml boxcox = (lambda-1)*log(y) + %logdensity(sigma^2,%boxcox(y,lambda)-rhsfrml(t))
*
* This estimates a simple linear regression (special case when lambda=1), and sets the
* initial conditions based upon it.
*
linreg y
# constant x
compute sigma=sqrt(%seesq),b0=%beta(1)+%beta(2)-1
compute b1=%beta(2),lambda=1.0
maximize(method=bfgs) boxcox
*
* Using NLLS with JACOBIAN option
*
nonlin b0 b1 lambda
linreg y
# constant x
compute b0=%beta(1)+%beta(2)-1,b1=%beta(2),lambda=1.0
frml boxcoxx = %boxcox(y,lambda)-b0-b1*%boxcox(x,lambda)
nlls(frml=boxcoxx,jacobian=y^(lambda-1))
*
* Using FIND
*
sstats / log(y)>>sumlogy
nonlin lambda
compute lambda=1.0
declare real loglikely
find(print) maximum loglikely
set xbc = %boxcox(x,lambda)
set ybc = %boxcox(y,lambda)
linreg(noprint) ybc
# constant xbc
compute sigmasq=%rss/%nobs
compute loglikely=(lambda-1)*sumlogy+%logconcdensity(sigmasq,%nobs)
end find
Copyright © 2024 Thomas A. Doan