RATS 11.1

HETEROTEST.RPF is an example of tests for heteroscedasticity. It's adapted from Wooldridge (2009), Example 8.4, pages 273-274.

 

The model is a hedonic price index for homes using lot size, square footage and number of bedrooms as the explanatory variables. Because this is done (for illustration) as a linear model, there's a good chance that we would see heteroscedasticity, as the variance in actual dollars for larger, more expensive homes is likely to be higher than for less expensive homes.

 

Because we estimate the base regression many times, we define an EQUATION so we can just use the EQUATION option on each LINREG instruction.

 

equation linff price
# constant lotsize sqrft bdrms
 

The first test is the Breusch-Pagan test, which can be done by regressing the squared residuals from the base regression on the regressors. This will have three degrees of freedom for the three (non-constant) explanatory variables. The following does both the F form (using EXCLUDE) and the LM form (using CDF with %TRSQUARED):

 

linreg(equation=linff)

set usq = %resids^2
linreg usq
# constant lotsize sqrft bdrms
exclude(title="Breusch-Pagan Test for Heteroscedasticity")
# lotsize sqrft bdrms
cdf(title="Breusch-Pagan Test, LM Form") chisqr %trsquared 3
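
As a cross-check on the LM form, the statistic can be rebuilt by hand as the number of observations times the centered R^2 of the auxiliary regression. This is a minimal sketch, assuming the data have been read and LINFF defined as above; LMSTAT and the DISPLAY labels are illustrative names that are not part of HETEROTEST.RPF.

```rats
* Sketch only: LMSTAT is an illustrative name, not part of the example program.
linreg(equation=linff,noprint)
set usq = %resids^2
linreg(noprint) usq
# constant lotsize sqrft bdrms
* N times the centered R^2 should reproduce %TRSQUARED
compute lmstat = %nobs*%rsquared
display "N*R^2 =" lmstat "%TRSQUARED =" %trsquared
* Tail probability for a chi-squared with 3 degrees of freedom
display "Significance level =" %chisqr(lmstat,3)
```

With the values in the output below, 88 x 0.1601407 is approximately 14.09, which agrees with the reported Chi-Squared(3) statistic.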
 

The Breusch-Pagan LM test can also be done using the procedure @RegWhiteTest with the option TYPE=BP. You need to do this right after the regression that you want to test. So you can replace the last block of code with simply:

 

linreg(equation=linff)
@RegWhiteTest(type=bp)

 

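The results would have us reject homoscedasticity rather strongly: the LM statistic is 14.09 with a significance level of 0.0028, and the F form is 5.34 with a significance level of 0.0020.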

 

The Harvey test regresses the log of the squared residuals on the log of (in this case) lot size. The LM version of this is:

 

linreg(equation=linff)
set logusq   = log(%resids^2)
set llotsize = log(lotsize)
linreg logusq
# constant llotsize
cdf(title="Harvey Test") chisqr %trsquared 1
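
The Harvey test also has an F form, parallel to the one shown for the Breusch-Pagan test. This is a minimal sketch, assuming the same data and the LINFF equation; the F form and its title string are our addition for illustration and are not part of HETEROTEST.RPF.

```rats
* Sketch only: the F form is not part of the example program.
linreg(equation=linff,noprint)
set logusq   = log(%resids^2)
set llotsize = log(lotsize)
linreg(noprint) logusq
# constant llotsize
* With a single excluded regressor, this F is the same as the regression F
exclude(title="Harvey Test, F Form")
# llotsize
```

Since LLOTSIZE is the only non-constant regressor, this should report the same statistic as the regression F for the LOGUSQ regression, which is F(1,86) = 1.8922 with a significance level of 0.1725 in the output below.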

 

The test statistic is 1.89 with a significance level of 0.169, so lot size on its own doesn't seem to be very good at explaining the variance.

 

The next block of code does the White test manually, by creating the squares and cross products of the regressors and doing a Breusch-Pagan test using those and the original regressors as the potential explanatory variables for the variance. We don't recommend this; use the @RegWhiteTest procedure instead.

 

linreg(equation=linff)
set usq        = %resids^2
set lotsq      = lotsize^2
set sqrftsq    = sqrft^2
set bdrmssq    = bdrms^2
set lotxsqrft  = lotsize*sqrft
set lotxbdrms  = lotsize*bdrms
set sqftxbdrms = sqrft*bdrms
*
linreg usq
# constant lotsize sqrft bdrms $
   lotsq sqrftsq bdrmssq lotxsqrft lotxbdrms sqftxbdrms
cdf(title="White Heteroscedasticity Test") chisqr $
  %trsquared %nobs-%ndf-1
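
The degrees of freedom expression %nobs-%ndf-1 deserves a word: %nobs-%ndf is the number of coefficients in the auxiliary regression, and subtracting one removes the constant, which is not being tested. This is a minimal sketch that spells the calculation out, assuming the same data and the LINFF equation; NCOEFF and WHITEDF are illustrative names that are not part of HETEROTEST.RPF.

```rats
* Sketch only: NCOEFF and WHITEDF are illustrative names.
linreg(equation=linff,noprint)
set usq        = %resids^2
set lotsq      = lotsize^2
set sqrftsq    = sqrft^2
set bdrmssq    = bdrms^2
set lotxsqrft  = lotsize*sqrft
set lotxbdrms  = lotsize*bdrms
set sqftxbdrms = sqrft*bdrms
linreg(noprint) usq
# constant lotsize sqrft bdrms $
   lotsq sqrftsq bdrmssq lotxsqrft lotxbdrms sqftxbdrms
* Coefficients in the auxiliary regression, then drop the constant
compute ncoeff  = %nobs-%ndf
compute whitedf = ncoeff-1
display "Coefficients =" ncoeff "Test degrees of freedom =" whitedf
display "Significance level =" %chisqr(%trsquared,whitedf)
```

With 88 observations and 78 degrees of freedom in the auxiliary regression, this gives 10 coefficients and 9 degrees of freedom for the test, matching the Chi-Squared(9) in the output below.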

 

This is the equivalent of that last code block:

 

linreg(equation=linff)
@RegWhiteTest

 

This gives an even stronger rejection than the original Breusch-Pagan test: the statistic is 33.73 on 9 degrees of freedom, with a significance level of about 0.0001.

 

Finally, this does a Goldfeld-Quandt test (on lot size). ORDER with the RANKS option keeps the LOTSIZE series intact, but creates a series named LOTRANKS which has the ranks (from 1 to 88) of the corresponding LOTSIZE values. The test compares the residual variance for the first 36 observations by rank (ranks 1 to 36) with that for the last 36 (ranks 53 to 88), leaving out the middle 16 to try to improve the power. As with the Harvey test, the results indicate that lot size is not a significant factor in explaining the variance: the F(32,32) statistic is 1.54, with a significance level of 0.115.

 

order(ranks=lotranks) lotsize
linreg(equation=linff,smpl=lotranks<=36)
compute rss1=%rss,ndf1=%ndf
linreg(equation=linff,smpl=lotranks>=53)
compute rss2=%rss,ndf2=%ndf
cdf(title="Goldfeld-Quandt Test") ftest $
 (rss2/ndf2)/(rss1/ndf1) ndf2 ndf1
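
The statistic passed to CDF is the ratio of the two subsample residual variance estimates, with the high-rank subsample in the numerator. This is a minimal sketch that computes it explicitly, assuming the same data and the LINFF equation; GQSTAT is an illustrative name that is not part of HETEROTEST.RPF.

```rats
* Sketch only: GQSTAT is an illustrative name.
order(ranks=lotranks) lotsize
linreg(equation=linff,smpl=lotranks<=36,noprint)
compute rss1=%rss,ndf1=%ndf
linreg(equation=linff,smpl=lotranks>=53,noprint)
compute rss2=%rss,ndf2=%ndf
* Ratio of the residual variance estimates, high ranks over low ranks
compute gqstat = (rss2/ndf2)/(rss1/ndf1)
display "GQ F-statistic =" gqstat
display "Significance level =" %ftest(gqstat,ndf2,ndf1)
```

With the sums of squared residuals in the output below (142403.41 and 92678.34, each on 32 degrees of freedom), the ratio is approximately 1.54.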


Full Program

 

open data hprice1.raw
data(format=free,org=columns) 1 88 price assess bdrms lotsize sqrft $
  colonial lprice lassess llotsize lsqrft
*
* Because we keep re-estimating this (possibly over different samples),
* we define an EQUATION.
*
equation linff price
# constant lotsize sqrft bdrms
*
linreg(equation=linff)
*
set usq = %resids^2
linreg usq
# constant lotsize sqrft bdrms
exclude(title="Breusch-Pagan Test for Heteroscedasticity")
# lotsize sqrft bdrms
cdf(title="Breusch-Pagan Test, LM Form") chisqr %trsquared 3
*
* The LM test can also be done using the procedure @RegWhiteTest with
* the option type=bp. You need to do this right after the regression
* that you want to test.
*
linreg(equation=linff)
@RegWhiteTest(type=bp)
*
* Harvey test
*
linreg(equation=linff)
set logusq   = log(%resids^2)
set llotsize = log(lotsize)
linreg logusq
# constant llotsize
cdf(title="Harvey Test") chisqr %trsquared 1
*
* White's test (the hard way, not recommended)
*
linreg(equation=linff)
set usq        = %resids^2
set lotsq      = lotsize^2
set sqrftsq    = sqrft^2
set bdrmssq    = bdrms^2
set lotxsqrft  = lotsize*sqrft
set lotxbdrms  = lotsize*bdrms
set sqftxbdrms = sqrft*bdrms
*
linreg usq
# constant lotsize sqrft bdrms $
   lotsq sqrftsq bdrmssq lotxsqrft lotxbdrms sqftxbdrms
cdf(title="White Heteroscedasticity Test") chisqr $
  %trsquared %nobs-%ndf-1
*
* Using @RegWhiteTest (recommended)
*
linreg(equation=linff)
@RegWhiteTest
*
* Goldfeld-Quandt test (on lot size)
*
order(ranks=lotranks) lotsize
linreg(equation=linff,smpl=lotranks<=36)
compute rss1=%rss,ndf1=%ndf
linreg(equation=linff,smpl=lotranks>=53)
compute rss2=%rss,ndf2=%ndf
cdf(title="Goldfeld-Quandt Test") ftest $
 (rss2/ndf2)/(rss1/ndf1) ndf2 ndf1
 

Output

 

Linear Regression - Estimation by Least Squares
Dependent Variable PRICE
Usable Observations                        88
Degrees of Freedom                         84
Centered R^2                        0.6723622
R-Bar^2                             0.6606609
Uncentered R^2                      0.9646239
Mean of Dependent Variable       293.54603409
Std Error of Dependent Variable  102.71344517
Standard Error of Estimate        59.83347988
Sum of Squared Residuals         300723.80646
Regression F(3,84)                    57.4602
Significance Level of F             0.0000000
Log Likelihood                      -482.8775
Durbin-Watson Statistic                2.1098

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  Constant                     -21.77030860  29.47504196     -0.73860  0.46220778
2.  LOTSIZE                        0.00206771   0.00064213      3.22010  0.00182293
3.  SQRFT                          0.12277819   0.01323741      9.27509  0.00000000
4.  BDRMS                         13.85252186   9.01014545      1.53744  0.12794506

Linear Regression - Estimation by Least Squares
Dependent Variable USQ
Usable Observations                        88
Degrees of Freedom                         84
Centered R^2                        0.1601407
R-Bar^2                             0.1301458
Uncentered R^2                      0.3197842
Mean of Dependent Variable       3417.3159824
Std Error of Dependent Variable  7094.3837812
Standard Error of Estimate       6616.6462785
Sum of Squared Residuals         3677520669.9
Regression F(3,84)                     5.3389
Significance Level of F             0.0020477
Log Likelihood                      -896.9860
Durbin-Watson Statistic                2.3511

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  Constant                     -5522.794789  3259.478257     -1.69438  0.09389782
2.  LOTSIZE                          0.201521     0.071009      2.83796  0.00569096
3.  SQRFT                            1.691037     1.463850      1.15520  0.25128478
4.  BDRMS                         1041.760223   996.381047      1.04554  0.29877103

Breusch-Pagan Test for Heteroscedasticity

Null Hypothesis : The Following Coefficients Are Zero
LOTSIZE
SQRFT
BDRMS
F(3,84)=      5.33892 with Significance Level 0.00204774

Breusch-Pagan Test, LM Form
Chi-Squared(3)=     14.092386 with Significance Level 0.00278206

 

Linear Regression - Estimation by Least Squares
Dependent Variable PRICE
Usable Observations                        88
Degrees of Freedom                         84
Centered R^2                        0.6723622
R-Bar^2                             0.6606609
Uncentered R^2                      0.9646239
Mean of Dependent Variable       293.54603409
Std Error of Dependent Variable  102.71344517
Standard Error of Estimate        59.83347988
Sum of Squared Residuals         300723.80646
Regression F(3,84)                    57.4602
Significance Level of F             0.0000000
Log Likelihood                      -482.8775
Durbin-Watson Statistic                2.1098

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  Constant                     -21.77030860  29.47504196     -0.73860  0.46220778
2.  LOTSIZE                        0.00206771   0.00064213      3.22010  0.00182293
3.  SQRFT                          0.12277819   0.01323741      9.27509  0.00000000
4.  BDRMS                         13.85252186   9.01014545      1.53744  0.12794506

Breusch-Pagan Heteroscedasticity Test
Chi-Squared(3)=     14.092386 with Significance Level 0.00278206

 

Linear Regression - Estimation by Least Squares
Dependent Variable PRICE
Usable Observations                        88
Degrees of Freedom                         84
Centered R^2                        0.6723622
R-Bar^2                             0.6606609
Uncentered R^2                      0.9646239
Mean of Dependent Variable       293.54603409
Std Error of Dependent Variable  102.71344517
Standard Error of Estimate        59.83347988
Sum of Squared Residuals         300723.80646
Regression F(3,84)                    57.4602
Significance Level of F             0.0000000
Log Likelihood                      -482.8775
Durbin-Watson Statistic                2.1098

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  Constant                     -21.77030860  29.47504196     -0.73860  0.46220778
2.  LOTSIZE                        0.00206771   0.00064213      3.22010  0.00182293
3.  SQRFT                          0.12277819   0.01323741      9.27509  0.00000000
4.  BDRMS                         13.85252186   9.01014545      1.53744  0.12794506

Linear Regression - Estimation by Least Squares
Dependent Variable LOGUSQ
Usable Observations                        88
Degrees of Freedom                         86
Centered R^2                        0.0215285
R-Bar^2                             0.0101509
Uncentered R^2                      0.8979108
Mean of Dependent Variable       6.6147892576
Std Error of Dependent Variable  2.2706012440
Standard Error of Estimate       2.2590475263
Sum of Squared Residuals         438.88343242
Regression F(1,86)                     1.8922
Significance Level of F             0.1725280
Log Likelihood                      -195.5701
Durbin-Watson Statistic                2.4417

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  Constant                     1.1617367352 3.9715291766      0.29252  0.77059652
2.  LLOTSIZE                     0.6123513270 0.4451628292      1.37557  0.17252802

Harvey Test
Chi-Squared(1)=      1.894506 with Significance Level 0.16869458

 

Linear Regression - Estimation by Least Squares
Dependent Variable PRICE
Usable Observations                        88
Degrees of Freedom                         84
Centered R^2                        0.6723622
R-Bar^2                             0.6606609
Uncentered R^2                      0.9646239
Mean of Dependent Variable       293.54603409
Std Error of Dependent Variable  102.71344517
Standard Error of Estimate        59.83347988
Sum of Squared Residuals         300723.80646
Regression F(3,84)                    57.4602
Significance Level of F             0.0000000
Log Likelihood                      -482.8775
Durbin-Watson Statistic                2.1098

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  Constant                     -21.77030860  29.47504196     -0.73860  0.46220778
2.  LOTSIZE                        0.00206771   0.00064213      3.22010  0.00182293
3.  SQRFT                          0.12277819   0.01323741      9.27509  0.00000000
4.  BDRMS                         13.85252186   9.01014545      1.53744  0.12794506

Linear Regression - Estimation by Least Squares
Dependent Variable USQ
Usable Observations                        88
Degrees of Freedom                         78
Centered R^2                        0.3833143
R-Bar^2                             0.3121582
Uncentered R^2                      0.5005361
Mean of Dependent Variable       3417.3159824
Std Error of Dependent Variable  7094.3837812
Standard Error of Estimate       5883.8141474
Sum of Squared Residuals         2700302975.9
Regression F(9,78)                     5.3870
Significance Level of F             0.0000101
Log Likelihood                      -883.3955
Durbin-Watson Statistic                2.0527

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  Constant                     15626.243719 11369.411858      1.37441  0.17325028
2.  LOTSIZE                         -1.859507     0.637097     -2.91872  0.00459119
3.  SQRFT                           -2.673918     8.662183     -0.30869  0.75838135
4.  BDRMS                        -1982.841114  5438.482750     -0.36459  0.71640071
5.  LOTSQ                           -0.000000     0.000005     -0.10750  0.91467008
6.  SQRFTSQ                          0.000352     0.001840      0.19148  0.84864369
7.  BDRMSSQ                        289.754063   758.830273      0.38184  0.70361609
8.  LOTXSQRFT                        0.000457     0.000277      1.64967  0.10303162
9.  LOTXBDRMS                        0.314647     0.252094      1.24813  0.21571537
10. SQFTXBDRMS                      -1.020860     1.667154     -0.61234  0.54209582

White Heteroscedasticity Test
Chi-Squared(9)=     33.731657 with Significance Level 0.00009953

 

Linear Regression - Estimation by Least Squares
Dependent Variable PRICE
Usable Observations                        88
Degrees of Freedom                         84
Centered R^2                        0.6723622
R-Bar^2                             0.6606609
Uncentered R^2                      0.9646239
Mean of Dependent Variable       293.54603409
Std Error of Dependent Variable  102.71344517
Standard Error of Estimate        59.83347988
Sum of Squared Residuals         300723.80646
Regression F(3,84)                    57.4602
Significance Level of F             0.0000000
Log Likelihood                      -482.8775
Durbin-Watson Statistic                2.1098

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  Constant                     -21.77030860  29.47504196     -0.73860  0.46220778
2.  LOTSIZE                        0.00206771   0.00064213      3.22010  0.00182293
3.  SQRFT                          0.12277819   0.01323741      9.27509  0.00000000
4.  BDRMS                         13.85252186   9.01014545      1.53744  0.12794506

White Heteroscedasticity Test
Chi-Squared(9)=     33.731657 with Significance Level 0.00009953

 

Linear Regression - Estimation by Least Squares
Dependent Variable PRICE
Usable Observations                        36
Degrees of Freedom                         32
Skipped/Missing (from 88)                  52
Centered R^2                        0.2049898
R-Bar^2                             0.1304576
Uncentered R^2                      0.9602660
Mean of Dependent Variable       248.09791667
Std Error of Dependent Variable   57.71234839
Standard Error of Estimate        53.81633555
Sum of Squared Residuals         92678.335090
Regression F(3,32)                     2.7504
Significance Level of F             0.0588086
Log Likelihood                      -192.4425
Durbin-Watson Statistic                2.0486

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  Constant                     44.496857844 87.408803284      0.50907  0.61419837
2.  LOTSIZE                       0.011696813  0.008643992      1.35317  0.18548346
3.  SQRFT                         0.077404477  0.028108424      2.75378  0.00963109
4.  BDRMS                         1.532607749 13.921042823      0.11009  0.91302327

Linear Regression - Estimation by Least Squares
Dependent Variable PRICE
Usable Observations                        36
Degrees of Freedom                         32
Skipped/Missing (from 88)                  52
Centered R^2                        0.7377785
R-Bar^2                             0.7131952
Uncentered R^2                      0.9717906
Mean of Dependent Variable       353.75069444
Std Error of Dependent Variable  124.56383912
Standard Error of Estimate        66.70911791
Sum of Squared Residuals         142403.40518
Regression F(3,32)                    30.0114
Significance Level of F             0.0000000
Log Likelihood                      -200.1740
Durbin-Watson Statistic                2.3022

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  Constant                     -31.62793264  44.92983273     -0.70394  0.48656105
2.  LOTSIZE                        0.00144117   0.00078054      1.84639  0.07410319
3.  SQRFT                          0.12342990   0.02128261      5.79956  0.00000194
4.  BDRMS                         21.89272051  15.14338108      1.44570  0.15798597

Goldfeld-Quandt Test
F(32,32)=      1.53653 with Significance Level 0.11489700

 


Copyright © 2026 Thomas A. Doan