Examples / TOBIT.RPF
TOBIT.RPF is an example of the estimation of tobit and similar models. It is based upon Verbeek (2008), example 7.4.3.
It estimates the budget shares of alcohol (SHARE1) and tobacco (SHARE2) using both a standard Tobit (censored at zero) and a two-step estimator with a separate first-step probit.
This uses LDV to estimate the standard Tobit models:
ldv(lower=0.0,censor=lower) share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx
ldv(lower=0.0,censor=lower) share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx
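For reference, the standard Tobit model with censoring below at zero can be written as follows. This is the usual textbook form, given here as a sketch rather than a statement of how LDV is implemented; the symbols x_i, \beta, \varepsilon_i, \Phi and \phi are notation introduced for this illustration, and \sigma corresponds to the SIGMA row in the LDV output.

```latex
% Latent-variable form: the share is observed only when the latent value is positive
y_i^* = x_i'\beta + \varepsilon_i, \qquad \varepsilon_i \sim N(0,\sigma^2), \qquad y_i = \max(0,\, y_i^*)

% Log likelihood: censored observations contribute a probability,
% uncensored observations contribute a normal density
\log L(\beta,\sigma) =
  \sum_{i:\, y_i = 0} \log \Phi\!\left(-\frac{x_i'\beta}{\sigma}\right)
  + \sum_{i:\, y_i > 0} \left[ \log \phi\!\left(\frac{y_i - x_i'\beta}{\sigma}\right) - \log \sigma \right]
```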
These run OLS regressions on the positive observations only. Because the sample is selected on the value of the dependent variable, these estimates will be biased:
linreg(smpl=share1>0) share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx
linreg(smpl=share2>0) share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx
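The bias from dropping the zero observations can be illustrated with a small simulation. The sketch below is written in Python rather than RATS and uses made-up parameters (one regressor, true slope 1.0, standard normal errors); it does not use the tobacco data, and the function names are hypothetical helpers for this illustration only.

```python
import random

def ols_slope_intercept(xs, ys):
    """Return (intercept, slope) from a one-regressor least-squares fit."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def simulate(n=20000, beta0=0.0, beta1=1.0, sigma=1.0, seed=12345):
    """Draw a latent linear model and censor the outcome below at zero."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    latent = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    ys = [max(0.0, v) for v in latent]
    return xs, ys

xs, ys = simulate()

# Keep only the positive observations, as SMPL=share>0 does in the example
pos = [(x, y) for x, y in zip(xs, ys) if y > 0]
_, slope_pos = ols_slope_intercept([p[0] for p in pos], [p[1] for p in pos])

print(f"true slope 1.0, OLS slope on positive observations {slope_pos:.3f}")
```

In this setup the slope estimated from the positive observations falls well below the true value of 1.0, since low values of the regressor survive the selection only when paired with large positive errors.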
These now estimate the Tobit II (two-stage) models, which require first-step probits for non-zero consumption. While not strictly necessary, we remap the share values into 0–1 dummies (DDV would work fine with just the zero/non-zero coding). After each probit, PRJ with the MILLS option generates the series LAMBDA, which is added to the corresponding LINREG as a bias-correction term.
set choice1 = share1>0
set choice2 = share2>0
ddv(noprint) choice1
# constant age nadults nkids nkids2 lnx $
agelnx nadlnx bluecol whitecol
prj(mills=lambda)
linreg(smpl=share1>0,title="Tobit II") share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx lambda
ddv(noprint) choice2
# constant age nadults nkids nkids2 lnx $
agelnx nadlnx bluecol whitecol
prj(mills=lambda)
linreg(smpl=share2>0,title="Tobit II") share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx lambda
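In the standard two-step approach, the bias-correction regressor is the inverse Mills ratio, lambda(z) = phi(z)/Phi(z), evaluated at the fitted probit index. The Python sketch below shows that function only; it is an illustration of the formula, not a description of how PRJ computes it internally, and the helper names are hypothetical.

```python
import math

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function Phi(z), via erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def inverse_mills(z):
    """Inverse Mills ratio lambda(z) = phi(z) / Phi(z)."""
    cdf = norm_cdf(z)
    if cdf == 0.0:
        # Phi(z) underflows for extremely negative z; lambda(z) approaches -z there
        return -z
    return norm_pdf(z) / cdf

# The ratio is large when selection is unlikely and shrinks toward zero
# when selection is almost certain
for z in (-2.0, 0.0, 2.0):
    print(z, round(inverse_mills(z), 6))
```

Adding this term to the regression on the selected sample accounts for the non-zero conditional mean of the error, so a coefficient on LAMBDA near zero suggests little selection bias in the plain OLS estimates.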
Full Program
open data tobacco.asc
data(format=free,org=columns) 1 2724 bluecol whitecol flanders $
walloon nkids nkids2 nadults lnx share2 $
share1 nadlnx agelnx age d1 d2 w1 w2 lnx2 age2
*
* Tobit I models
*
ldv(lower=0.0,censor=lower) share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx
ldv(lower=0.0,censor=lower) share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx
*
* OLS regressions on positive observations only
*
linreg(smpl=share1>0) share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx
linreg(smpl=share2>0) share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx
*
* Tobit II models, which require first step probits for non-zero
* consumption. While not strictly necessary, we remap the share values
* into 0-1 dummies. (DDV would work fine with just the zero-non-zero
* coding).
*
set choice1 = share1>0
set choice2 = share2>0
ddv(noprint) choice1
# constant age nadults nkids nkids2 lnx $
agelnx nadlnx bluecol whitecol
prj(mills=lambda)
linreg(smpl=share1>0,title="Tobit II") share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx lambda
*
ddv(noprint) choice2
# constant age nadults nkids nkids2 lnx $
agelnx nadlnx bluecol whitecol
prj(mills=lambda)
linreg(smpl=share2>0,title="Tobit II") share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx lambda
Output
ML-Censored Below - Estimation by Newton-Raphson
Convergence in 4 Iterations. Final criterion was 0.0000000 <= 0.0000100
Dependent Variable SHARE1
Usable Observations 2724
Degrees of Freedom 2716
Log Likelihood 4755.3709
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. Constant -0.159151575 0.043777979 -3.63543 0.00027752
2. AGE 0.013493534 0.010882374 1.23994 0.21499614
3. NADULTS 0.029189530 0.016946803 1.72242 0.08499328
4. NKIDS -0.002640842 0.000604858 -4.36605 0.00001265
5. NKIDS2 -0.003878905 0.002383502 -1.62740 0.10365270
6. LNX 0.012667747 0.003215595 3.93947 0.00008166
7. AGELNX -0.000809257 0.000800551 -1.01087 0.31207636
8. NADLNX -0.002248384 0.001223245 -1.83805 0.06605518
************************************************************************************
9. SIGMA 0.024419881 0.000374460 65.21367 0.00000000
ML-Censored Below - Estimation by Newton-Raphson
Convergence in 5 Iterations. Final criterion was 0.0000016 <= 0.0000100
Dependent Variable SHARE2
Usable Observations 2724
Degrees of Freedom 2716
Log Likelihood 758.7003
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. Constant 0.589976622 0.093426841 6.31485 0.00000000
2. AGE -0.125852311 0.024178251 -5.20519 0.00000019
3. NADULTS 0.015370892 0.038047509 0.40399 0.68621854
4. NKIDS 0.004269695 0.001324657 3.22325 0.00126747
5. NKIDS2 -0.009971885 0.005471315 -1.82258 0.06836770
6. LNX -0.044431138 0.006889338 -6.44926 0.00000000
7. AGELNX 0.008822082 0.001783193 4.94735 0.00000075
8. NADLNX -0.000600806 0.002750125 -0.21846 0.82706689
************************************************************************************
9. SIGMA 0.047995080 0.001183158 40.56523 0.00000000
Linear Regression - Estimation by Least Squares
Dependent Variable SHARE1
Usable Observations 2258
Degrees of Freedom 2250
Skipped/Missing (from 2724) 466
Centered R^2 0.0509899
R-Bar^2 0.0480374
Uncentered R^2 0.5135232
Mean of Dependent Variable 0.0215072981
Std Error of Dependent Variable 0.0220618317
Standard Error of Estimate 0.0215254140
Sum of Squared Residuals 1.0425227566
Regression F(7,2250) 17.2702
Significance Level of F 0.0000000
Log Likelihood 5467.4243
Durbin-Watson Statistic 1.9178
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. Constant 0.052714568 0.043851337 1.20212 0.22944354
2. AGE 0.007797104 0.010981578 0.71002 0.47776735
3. NADULTS -0.013115039 0.016321653 -0.80354 0.42174976
4. NKIDS -0.002030671 0.000574337 -3.53568 0.00041494
5. NKIDS2 -0.002427396 0.002284617 -1.06250 0.28812467
6. LNX -0.002319931 0.003211082 -0.72248 0.47007677
7. AGELNX -0.000410507 0.000804935 -0.50999 0.61010979
8. NADLNX 0.000829210 0.001176810 0.70463 0.48111646
Linear Regression - Estimation by Least Squares
Dependent Variable SHARE2
Usable Observations 1036
Degrees of Freedom 1028
Skipped/Missing (from 2724) 1688
Centered R^2 0.1539975
R-Bar^2 0.1482368
Uncentered R^2 0.5866594
Mean of Dependent Variable 0.0321908141
Std Error of Dependent Variable 0.0314790344
Standard Error of Estimate 0.0290523210
Sum of Squared Residuals 0.8676704034
Regression F(7,1028) 26.7324
Significance Level of F 0.0000000
Log Likelihood 2200.0438
Durbin-Watson Statistic 2.0383
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. Constant 0.489659067 0.074059493 6.61170 0.00000000
2. AGE -0.031466154 0.020563187 -1.53022 0.12627048
3. NADULTS -0.013026320 0.032414853 -0.40186 0.68786860
4. NKIDS 0.001284717 0.001054086 1.21880 0.22320083
5. NKIDS2 -0.003436886 0.004556050 -0.75436 0.45080792
6. LNX -0.033576628 0.005467243 -6.14142 0.00000000
7. AGELNX 0.002209715 0.001515954 1.45764 0.14524528
8. NADLNX 0.001112516 0.002345014 0.47442 0.63530279
Linear Regression - Estimation by Tobit II
Dependent Variable SHARE1
Usable Observations 2258
Degrees of Freedom 2249
Skipped/Missing (from 2724) 466
Centered R^2 0.0509899
R-Bar^2 0.0476142
Uncentered R^2 0.5135233
Mean of Dependent Variable 0.0215072981
Std Error of Dependent Variable 0.0220618317
Standard Error of Estimate 0.0215301983
Sum of Squared Residuals 1.0425226858
Regression F(8,2249) 15.1047
Significance Level of F 0.0000000
Log Likelihood 5467.4244
Durbin-Watson Statistic 1.9178
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. Constant 0.054269233 0.133258542 0.40725 0.68386484
2. AGE 0.007709525 0.013072734 0.58974 0.55542364
3. NADULTS -0.013344930 0.024753654 -0.53911 0.58986466
4. NKIDS -0.002024425 0.000765253 -2.64543 0.00821549
5. NKIDS2 -0.002412687 0.002576666 -0.93636 0.34918830
6. LNX -0.002428894 0.009386018 -0.25878 0.79583030
7. AGELNX -0.000404420 0.000943883 -0.42846 0.66835408
8. NADLNX 0.000846170 0.001808316 0.46793 0.63987788
9. LAMBDA -0.000204616 0.016561500 -0.01235 0.99014353
Linear Regression - Estimation by Tobit II
Dependent Variable SHARE2
Usable Observations 1036
Degrees of Freedom 1027
Skipped/Missing (from 2724) 1688
Centered R^2 0.1541960
R-Bar^2 0.1476075
Uncentered R^2 0.5867564
Mean of Dependent Variable 0.0321908141
Std Error of Dependent Variable 0.0314790344
Standard Error of Estimate 0.0290630517
Sum of Squared Residuals 0.8674668222
Regression F(8,1027) 23.4037
Significance Level of F 0.0000000
Log Likelihood 2200.1653
Durbin-Watson Statistic 2.0368
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. Constant 0.451581525 0.107259090 4.21019 0.00002775
2. AGE -0.017299170 0.035438374 -0.48815 0.62554924
3. NADULTS -0.017437770 0.033648811 -0.51823 0.60441054
4. NKIDS 0.000764339 0.001495139 0.51122 0.60930951
5. NKIDS2 -0.002075534 0.005334998 -0.38904 0.69732624
6. LNX -0.030109460 0.008932479 -3.37078 0.00077745
7. AGELNX 0.001224297 0.002515693 0.48666 0.62660028
8. NADLNX 0.001364974 0.002401581 0.56836 0.56991146
9. LAMBDA -0.009017865 0.018368604 -0.49094 0.62357444
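Two quantities are easy to read off the output above: the extent of censoring and the t-statistics on LAMBDA. The Python sketch below recomputes them from the printed numbers. It assumes that the Skipped/Missing counts are exactly the zero-share observations, which seems reasonable since the LDV estimates use all 2724 observations; the variable names are introduced here for illustration.

```python
# Counts copied from the output above
total_obs = 2724
zero_share1 = 466    # Skipped/Missing in the SHARE1 regressions
zero_share2 = 1688   # Skipped/Missing in the SHARE2 regressions

# Fraction of households with a zero share (assumes no missing values)
frac_zero_share1 = zero_share1 / total_obs
frac_zero_share2 = zero_share2 / total_obs

# LAMBDA rows of the two Tobit II regressions: (coefficient, standard error)
lambda_share1 = (-0.000204616, 0.016561500)
lambda_share2 = (-0.009017865, 0.018368604)

# A t-statistic is the coefficient divided by its standard error
t_share1 = lambda_share1[0] / lambda_share1[1]
t_share2 = lambda_share2[0] / lambda_share2[1]

print(f"zero shares: alcohol {frac_zero_share1:.1%}, tobacco {frac_zero_share2:.1%}")
print(f"t-stat on LAMBDA: share1 {t_share1:.5f}, share2 {t_share2:.5f}")
```

Roughly 17% of the alcohol shares and 62% of the tobacco shares are zero, and the LAMBDA coefficient is insignificant in both second-step regressions, which is why the Tobit II coefficients are close to the OLS estimates on the positive observations.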
Copyright © 2025 Thomas A. Doan