NPREG.RPF
NPREG.RPF estimates a linear regression by weighted least squares, with the spread series created from a non-parametric fit of the squared residuals on the population variable (POP). It is adapted from Pagan and Ullah (1999), pp. 248-249, and demonstrates the use of the NPREG instruction.
The code segment below first estimates the model by ordinary least squares, then by conventional weighted least squares using the square of population as the scedastic function. NPREG then produces a non-parametric fit of the squared residuals on population, saving the fitted values as VPOPNP. In other words, rather than imposing a specific functional form in population (here, the square) for the residual variance, the final weighted least squares step uses a non-parametric estimate. Because the data series POP itself is used as the grid on NPREG (GRID=INPUT), VPOPNP aligns observation by observation with the data, so it can be passed directly to LINREG through the SPREAD option.
linreg exptrav / resids
# constant income
set popsq = pop^2
linreg(spread=popsq) exptrav
# constant income
set ressqr = resids^2
npreg(grid=input,type=gaussian) ressqr pop / pop vpopnp
linreg(spread=vpopnp,$
title="Semiparametric Weighted Least Squares") exptrav
# constant income
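The steps in the segment above can be written out as equations. This is only a sketch under stated assumptions: it assumes NPREG computes a kernel regression of the usual Nadaraya-Watson form with a Gaussian kernel and some bandwidth h (the example sets no bandwidth, so whatever NPREG chooses by default applies), and that LINREG treats the SPREAD series as proportional to the residual variance. The symbols y, x, u, K, h and v are notation introduced here, not names from the program.

```latex
% Assumption: Nadaraya-Watson kernel regression with bandwidth h;
% y_i = EXPTRAV, x_i = (1, INCOME_i)', pop_i = POP.
\begin{align*}
\hat{u}_i &= y_i - x_i'\hat{\beta}_{\mathrm{OLS}}
  && \text{(RESIDS; RESSQR is } \hat{u}_i^2\text{)} \\
\hat{\sigma}^2(p) &= \frac{\sum_{j=1}^{n} K\!\left(\frac{p - \mathrm{pop}_j}{h}\right)\hat{u}_j^2}
                          {\sum_{j=1}^{n} K\!\left(\frac{p - \mathrm{pop}_j}{h}\right)},
  \qquad K(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2} \\
v_i &= \hat{\sigma}^2(\mathrm{pop}_i)
  && \text{(VPOPNP, evaluated at the data points)} \\
\hat{\beta}_{\mathrm{WLS}} &= \arg\min_{\beta}\sum_{i=1}^{n}\frac{(y_i - x_i'\beta)^2}{v_i}
  = \Bigl(\sum_{i=1}^{n}\frac{x_i x_i'}{v_i}\Bigr)^{-1}\sum_{i=1}^{n}\frac{x_i y_i}{v_i}
\end{align*}
```

Because the v_i enter only as relative weights, multiplying the spread series by a positive constant leaves the coefficient estimates unchanged. What matters is the shape of the estimated variance function across POP.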
Full Program
open data travel.csv
data(format=prn,org=columns) 1 51 pop income exptrav
*
linreg exptrav / resids
# constant income
*
* Assumed scedastic function is pop^2
*
set popsq = pop^2
linreg(spread=popsq) exptrav
# constant income
*
* Nonparametric estimator
*
set ressqr = resids^2
npreg(grid=input,type=gaussian) ressqr pop / pop vpopnp
linreg(spread=vpopnp,title="Semiparametric Weighted Least Squares") exptrav
# constant income
*
* Parametric estimates of alternative scedastic functions
*
linreg ressqr
# popsq
prj vpopsq
linreg ressqr
# constant pop popsq
prj vpopquad
*
* The data set is already sorted by pop, so style=lines will work
*
scatter(style=lines,footer="Alternative Empirical Scedastic Functions",$
key=upleft,klabels=||"Quadratic Only","Full Polynomial","Non-Parametric"||) 3
# pop vpopsq
# pop vpopquad
# pop vpopnp
*
* This reveals that a big problem is one truly massive outlier
*
scatter(style=lines,footer="Non-Parametric Variance with Residuals",$
overlay=dots,ovsame) 2
# pop vpopnp
# pop ressqr
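For comparison with the non-parametric fit, the two parametric scedastic functions estimated by the LINREG and PRJ pairs in the program can be written as follows. The symbols gamma and e are notation introduced here for illustration, and the coefficient values are rounded from the regression output reported for this example.

```latex
% Notation (gamma, e) is illustrative; estimates are rounded from the
% two RESSQR regressions in the output.
\begin{align*}
\text{VPOPSQ:}\quad \hat{u}_i^2 &= \gamma\,\mathrm{pop}_i^2 + e_i,
  & \hat{\gamma} &\approx 0.0251 \\
\text{VPOPQUAD:}\quad \hat{u}_i^2 &= \gamma_0 + \gamma_1\,\mathrm{pop}_i + \gamma_2\,\mathrm{pop}_i^2 + e_i,
  & (\hat{\gamma}_0,\hat{\gamma}_1,\hat{\gamma}_2) &\approx (-1.378,\ 1.372,\ -0.0412)
\end{align*}
```

The first form has no intercept, so it matches the pop^2 spread assumed in the conventional weighted least squares step up to a scale factor. The second lets the data choose a full quadratic in POP.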
Output
Linear Regression - Estimation by Least Squares
Dependent Variable EXPTRAV
Usable Observations 51
Degrees of Freedom 49
Centered R^2 0.8836169
R-Bar^2 0.8812418
Uncentered R^2 0.9322972
Mean of Dependent Variable 4.3736470588
Std Error of Dependent Variable 5.2091982985
Standard Error of Estimate 1.7951583044
Sum of Squared Residuals 157.90707356
Regression F(1,49) 372.0234
Significance Level of F 0.0000000
Log Likelihood -101.1855
Durbin-Watson Statistic 2.6695
    Variable                     Coeff     Std Error        T-Stat        Signif
************************************************************************************
1.  Constant              0.2664897296  0.3294409293       0.80892    0.42247443
2.  INCOME                0.0675410389  0.0035017294      19.28791    0.00000000
Linear Regression - Estimation by Weighted Least Squares
Dependent Variable EXPTRAV
Usable Observations 51
Degrees of Freedom 49
Centered R^2 0.1577653
R-Bar^2 0.1405768
Uncentered R^2 0.6485201
Mean of Dependent Variable 1.1108773510
Std Error of Dependent Variable 0.9494764958
Standard Error of Estimate 0.8802129355
Sum of Squared Residuals 37.963965780
Log Likelihood -118.4384
Durbin-Watson Statistic 2.6422
    Variable                     Coeff     Std Error        T-Stat        Signif
************************************************************************************
1.  Constant              0.5539567129  0.2248496450       2.46368    0.01730999
2.  INCOME                0.0643900657  0.0138997026       4.63248    0.00002694
Linear Regression - Estimation by Semiparametric Weighted Least Squares
Dependent Variable EXPTRAV
Usable Observations 51
Degrees of Freedom 49
Centered R^2 0.8226464
R-Bar^2 0.8190269
Uncentered R^2 0.9305163
Mean of Dependent Variable 2.8608502722
Std Error of Dependent Variable 2.3189251803
Standard Error of Estimate 0.9864923217
Sum of Squared Residuals 47.685187933
Log Likelihood -84.3678
Durbin-Watson Statistic 2.6541
    Variable                     Coeff     Std Error        T-Stat        Signif
************************************************************************************
1.  Constant              0.3010292386  0.2327686082       1.29326    0.20198673
2.  INCOME                0.0652469911  0.0038782338      16.82389    0.00000000
Linear Regression - Estimation by Least Squares
Dependent Variable RESSQR
Usable Observations 51
Degrees of Freedom 50
Centered R^2 -0.0124249
R-Bar^2 -0.0124249
Uncentered R^2 0.0935548
Mean of Dependent Variable 3.0962171286
Std Error of Dependent Variable 9.1451526339
Standard Error of Estimate 9.2017910219
Sum of Squared Residuals 4233.6479005
Log Likelihood -185.0502
Durbin-Watson Statistic 1.5806
    Variable                     Coeff     Std Error        T-Stat        Signif
************************************************************************************
1.  POPSQ                 0.0250831946  0.0110416914       2.27168    0.02744221
Linear Regression - Estimation by Least Squares
Dependent Variable RESSQR
Usable Observations 51
Degrees of Freedom 48
Centered R^2 0.1189024
R-Bar^2 0.0821900
Uncentered R^2 0.2111349
Mean of Dependent Variable 3.0962171286
Std Error of Dependent Variable 9.1451526339
Standard Error of Estimate 8.7612755998
Sum of Squared Residuals 3684.4776065
Regression F(2,48) 3.2388
Significance Level of F 0.0479265
Log Likelihood -181.5074
Durbin-Watson Statistic 1.7840
    Variable                     Coeff     Std Error        T-Stat        Signif
************************************************************************************
1.  Constant              -1.377906335   2.240700176      -0.61494    0.54149582
2.  POP                    1.372387724   0.671473387       2.04385    0.04647697
3.  POPSQ                 -0.041242004   0.030856975      -1.33655    0.18766950
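Graphs
[Graph: Alternative Empirical Scedastic Functions (Quadratic Only, Full Polynomial and Non-Parametric fits plotted against POP)]
[Graph: Non-Parametric Variance with Residuals (VPOPNP as a line with the squared residuals overlaid as dots, plotted against POP)]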
Copyright © 2025 Thomas A. Doan