TOBIT.RPF

TOBIT.RPF is an example of estimating Tobit and similar models. It is based upon Verbeek (2008), example 7.4.3.

It models the budget shares of alcohol (SHARE1) and tobacco (SHARE2) using both a standard Tobit (censored at zero) and a two-step (Tobit II) estimator with a separate first-step probit.

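This uses LDV to estimate the standard Tobit models. LOWER=0.0 sets the censoring point and CENSOR=LOWER indicates that the data are censored from below: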

ldv(lower=0.0,censor=lower) share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx
ldv(lower=0.0,censor=lower) share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx
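
As a rough illustration of what a Tobit fit censored below at zero maximizes, the following Python sketch evaluates the censored-normal log likelihood for given fitted values. It is not RATS code and not the LDV implementation; the names `tobit_loglik`, `xb` and `sigma` are hypothetical and introduced only for this example.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(y, xb, sigma, lower=0.0):
    """Log likelihood of a Tobit model censored below at `lower`.

    y     : observed dependent variable values
    xb    : fitted index values x'beta (assumed already computed)
    sigma : standard deviation of the latent normal error
    """
    total = 0.0
    for yi, xbi in zip(y, xb):
        if yi <= lower:
            # Censored observation: probability that the latent value is <= lower
            total += math.log(norm_cdf((lower - xbi) / sigma))
        else:
            # Uncensored observation: normal density of the observed value
            total += math.log(norm_pdf((yi - xbi) / sigma) / sigma)
    return total
```

Because the shares are small numbers and the estimated SIGMA is well below one, the density terms can exceed one, which is why a Tobit log likelihood on data like these can be positive.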

These run OLS regressions on the positive observations only. Because the sample is selected on the value of the dependent variable, the resulting coefficient estimates are biased:

linreg(smpl=share1>0) share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx
linreg(smpl=share2>0) share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx
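
The source of the bias can be made explicit. As a sketch under the standard Tobit assumptions (a latent variable with a normal error; the symbols below are notation introduced here, not RATS names), the mean of the dependent variable over the positive observations is:

```latex
\begin{aligned}
y_i^{*} &= x_i'\beta + \varepsilon_i, \qquad
\varepsilon_i \sim N(0,\sigma^2), \qquad
y_i = \max(0,\, y_i^{*}) \\
E[\,y_i \mid y_i > 0,\, x_i\,] &= x_i'\beta
  + \sigma\,\lambda\!\left(\frac{x_i'\beta}{\sigma}\right), \qquad
\lambda(z) = \frac{\phi(z)}{\Phi(z)}
\end{aligned}
```

The second term is positive and varies with the regressors, so a regression on the regressors alone over the positive sample omits a variable that is correlated with the included ones.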

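These estimate the Tobit II models in two steps. The first step is a probit (DDV) for non-zero consumption; note that the probits include BLUECOL and WHITECOL, which do not appear in the share equations. While not strictly necessary, we remap the share values into 0–1 dummies (DDV would work fine with just the zero/non-zero coding). PRJ with the MILLS option then computes the inverse Mills ratio from each probit, and that series (LAMBDA) is added to the second-step LINREG as a bias-correction term.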

set choice1 = share1>0
set choice2 = share2>0

ddv(noprint) choice1
# constant age nadults nkids nkids2 lnx $
agelnx nadlnx bluecol whitecol
prj(mills=lambda)
linreg(smpl=share1>0,title="Tobit II") share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx lambda

ddv(noprint) choice2
# constant age nadults nkids nkids2 lnx $
agelnx nadlnx bluecol whitecol
prj(mills=lambda)
linreg(smpl=share2>0,title="Tobit II") share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx lambda
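
To show what the bias-correction regressor represents, here is a small Python sketch (not RATS code). It assumes the usual Heckman definition of the inverse Mills ratio, the standard normal density divided by the distribution function, evaluated at the probit index; that PRJ's MILLS option uses exactly this form is an assumption here. The function names are hypothetical. The simulation checks that the ratio equals the mean of a standard normal error among the selected observations.

```python
import math
import random

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(z):
    """phi(z)/Phi(z): mean of a standard normal error e given selection e > -z."""
    return norm_pdf(z) / norm_cdf(z)

def simulated_selected_mean(z, draws=200_000, seed=12345):
    """Monte Carlo estimate of E[e | e > -z] for a standard normal e."""
    rng = random.Random(seed)
    kept = [e for e in (rng.gauss(0.0, 1.0) for _ in range(draws)) if e > -z]
    return sum(kept) / len(kept)

if __name__ == "__main__":
    z = 0.5  # an illustrative probit index value
    print(inverse_mills(z), simulated_selected_mean(z))
```

In the second step, the coefficient on this regressor picks up the correlation between the selection and share equations, so an insignificant LAMBDA coefficient gives little evidence of selection bias in the OLS results.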

Full Program

open data tobacco.asc
data(format=free,org=columns) 1 2724 bluecol whitecol flanders $
walloon nkids nkids2 nadults lnx share2 $
share1 nadlnx agelnx age d1 d2 w1 w2 lnx2 age2
*
* Tobit I models
*
ldv(lower=0.0,censor=lower) share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx
ldv(lower=0.0,censor=lower) share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx
*
* OLS regressions on positive observations only
*
linreg(smpl=share1>0) share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx
linreg(smpl=share2>0) share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx
*
* Tobit II models, which require first step probits for non-zero
* consumption. While not strictly necessary, we remap the share values
* into 0-1 dummies. (DDV would work fine with just the zero-non-zero
* coding).
*
set choice1 = share1>0
set choice2 = share2>0

ddv(noprint) choice1
# constant age nadults nkids nkids2 lnx $
agelnx nadlnx bluecol whitecol
prj(mills=lambda)
linreg(smpl=share1>0,title="Tobit II") share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx lambda
*
ddv(noprint) choice2
# constant age nadults nkids nkids2 lnx $
agelnx nadlnx bluecol whitecol
prj(mills=lambda)
linreg(smpl=share2>0,title="Tobit II") share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx lambda

Output

ML-Censored Below - Estimation by Newton-Raphson
Convergence in 4 Iterations. Final criterion was 0.0000000 <= 0.0000100
Dependent Variable SHARE1
Usable Observations                      2724
Degrees of Freedom                       2716
Log Likelihood                      4755.3709

    Variable                            Coeff    Std Error       T-Stat      Signif
************************************************************************************
1.  Constant                     -0.159151575  0.043777979     -3.63543  0.00027752
2.  AGE                           0.013493534  0.010882374      1.23994  0.21499614
3.  NADULTS                       0.029189530  0.016946803      1.72242  0.08499328
4.  NKIDS                        -0.002640842  0.000604858     -4.36605  0.00001265
5.  NKIDS2                       -0.003878905  0.002383502     -1.62740  0.10365270
6.  LNX                           0.012667747  0.003215595      3.93947  0.00008166
7.  AGELNX                       -0.000809257  0.000800551     -1.01087  0.31207636
8.  NADLNX                       -0.002248384  0.001223245     -1.83805  0.06605518
************************************************************************************
9.  SIGMA                         0.024419881  0.000374460     65.21367  0.00000000

ML-Censored Below - Estimation by Newton-Raphson
Convergence in 5 Iterations. Final criterion was 0.0000016 <= 0.0000100
Dependent Variable SHARE2
Usable Observations                      2724
Degrees of Freedom                       2716
Log Likelihood                       758.7003

    Variable                            Coeff    Std Error       T-Stat      Signif
************************************************************************************
1.  Constant                      0.589976622  0.093426841      6.31485  0.00000000
2.  AGE                          -0.125852311  0.024178251     -5.20519  0.00000019
3.  NADULTS                       0.015370892  0.038047509      0.40399  0.68621854
4.  NKIDS                         0.004269695  0.001324657      3.22325  0.00126747
5.  NKIDS2                       -0.009971885  0.005471315     -1.82258  0.06836770
6.  LNX                          -0.044431138  0.006889338     -6.44926  0.00000000
7.  AGELNX                        0.008822082  0.001783193      4.94735  0.00000075
8.  NADLNX                       -0.000600806  0.002750125     -0.21846  0.82706689
************************************************************************************
9.  SIGMA                         0.047995080  0.001183158     40.56523  0.00000000

Linear Regression - Estimation by Least Squares
Dependent Variable SHARE1
Usable Observations                      2258
Degrees of Freedom                       2250
Skipped/Missing (from 2724)               466
Centered R^2                        0.0509899
R-Bar^2                             0.0480374
Uncentered R^2                      0.5135232
Mean of Dependent Variable       0.0215072981
Std Error of Dependent Variable  0.0220618317
Standard Error of Estimate       0.0215254140
Sum of Squared Residuals         1.0425227566
Regression F(7,2250)                  17.2702
Significance Level of F             0.0000000
Log Likelihood                      5467.4243
Durbin-Watson Statistic                1.9178

    Variable                            Coeff    Std Error       T-Stat      Signif
************************************************************************************
1.  Constant                      0.052714568  0.043851337      1.20212  0.22944354
2.  AGE                           0.007797104  0.010981578      0.71002  0.47776735
3.  NADULTS                      -0.013115039  0.016321653     -0.80354  0.42174976
4.  NKIDS                        -0.002030671  0.000574337     -3.53568  0.00041494
5.  NKIDS2                       -0.002427396  0.002284617     -1.06250  0.28812467
6.  LNX                          -0.002319931  0.003211082     -0.72248  0.47007677
7.  AGELNX                       -0.000410507  0.000804935     -0.50999  0.61010979
8.  NADLNX                        0.000829210  0.001176810      0.70463  0.48111646

Linear Regression - Estimation by Least Squares
Dependent Variable SHARE2
Usable Observations                      1036
Degrees of Freedom                       1028
Skipped/Missing (from 2724)              1688
Centered R^2                        0.1539975
R-Bar^2                             0.1482368
Uncentered R^2                      0.5866594
Mean of Dependent Variable       0.0321908141
Std Error of Dependent Variable  0.0314790344
Standard Error of Estimate       0.0290523210
Sum of Squared Residuals         0.8676704034
Regression F(7,1028)                  26.7324
Significance Level of F             0.0000000
Log Likelihood                      2200.0438
Durbin-Watson Statistic                2.0383

    Variable                            Coeff    Std Error       T-Stat      Signif
************************************************************************************
1.  Constant                      0.489659067  0.074059493      6.61170  0.00000000
2.  AGE                          -0.031466154  0.020563187     -1.53022  0.12627048
3.  NADULTS                      -0.013026320  0.032414853     -0.40186  0.68786860
4.  NKIDS                         0.001284717  0.001054086      1.21880  0.22320083
5.  NKIDS2                       -0.003436886  0.004556050     -0.75436  0.45080792
6.  LNX                          -0.033576628  0.005467243     -6.14142  0.00000000
7.  AGELNX                        0.002209715  0.001515954      1.45764  0.14524528
8.  NADLNX                        0.001112516  0.002345014      0.47442  0.63530279

Linear Regression - Estimation by Tobit II
Dependent Variable SHARE1
Usable Observations                      2258
Degrees of Freedom                       2249
Skipped/Missing (from 2724)               466
Centered R^2                        0.0509899
R-Bar^2                             0.0476142
Uncentered R^2                      0.5135233
Mean of Dependent Variable       0.0215072981
Std Error of Dependent Variable  0.0220618317
Standard Error of Estimate       0.0215301983
Sum of Squared Residuals         1.0425226858
Regression F(8,2249)                  15.1047
Significance Level of F             0.0000000
Log Likelihood                      5467.4244
Durbin-Watson Statistic                1.9178

    Variable                            Coeff    Std Error       T-Stat      Signif
************************************************************************************
1.  Constant                      0.054269233  0.133258542      0.40725  0.68386484
2.  AGE                           0.007709525  0.013072734      0.58974  0.55542364
3.  NADULTS                      -0.013344930  0.024753654     -0.53911  0.58986466
4.  NKIDS                        -0.002024425  0.000765253     -2.64543  0.00821549
5.  NKIDS2                       -0.002412687  0.002576666     -0.93636  0.34918830
6.  LNX                          -0.002428894  0.009386018     -0.25878  0.79583030
7.  AGELNX                       -0.000404420  0.000943883     -0.42846  0.66835408
8.  NADLNX                        0.000846170  0.001808316      0.46793  0.63987788
9.  LAMBDA                       -0.000204616  0.016561500     -0.01235  0.99014353

Linear Regression - Estimation by Tobit II
Dependent Variable SHARE2
Usable Observations                      1036
Degrees of Freedom                       1027
Skipped/Missing (from 2724)              1688
Centered R^2                        0.1541960
R-Bar^2                             0.1476075
Uncentered R^2                      0.5867564
Mean of Dependent Variable       0.0321908141
Std Error of Dependent Variable  0.0314790344
Standard Error of Estimate       0.0290630517
Sum of Squared Residuals         0.8674668222
Regression F(8,1027)                  23.4037
Significance Level of F             0.0000000
Log Likelihood                      2200.1653
Durbin-Watson Statistic                2.0368

    Variable                            Coeff    Std Error       T-Stat      Signif
************************************************************************************
1.  Constant                      0.451581525  0.107259090      4.21019  0.00002775
2.  AGE                          -0.017299170  0.035438374     -0.48815  0.62554924
3.  NADULTS                      -0.017437770  0.033648811     -0.51823  0.60441054
4.  NKIDS                         0.000764339  0.001495139      0.51122  0.60930951
5.  NKIDS2                       -0.002075534  0.005334998     -0.38904  0.69732624
6.  LNX                          -0.030109460  0.008932479     -3.37078  0.00077745
7.  AGELNX                        0.001224297  0.002515693      0.48666  0.62660028
8.  NADLNX                        0.001364974  0.002401581      0.56836  0.56991146
9.  LAMBDA                       -0.009017865  0.018368604     -0.49094  0.62357444