RATS 10.1

TOBIT.RPF is an example of estimating Tobit and related models. It is based upon Verbeek (2008), example 7.4.3.

 

It estimates models for the consumer shares of alcohol (SHARE1) and tobacco (SHARE2) using both the standard Tobit (censored at zero) and a two-step estimator with a separate first-step probit.

 

This uses LDV to estimate the standard Tobit models:

 

ldv(lower=0.0,censor=lower) share1

# constant age nadults nkids nkids2 lnx agelnx nadlnx

ldv(lower=0.0,censor=lower) share2

# constant age nadults nkids nkids2 lnx agelnx nadlnx
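For reference, the textbook form of the standard (type I) Tobit model with censoring at zero can be written as below. This is a sketch in generic notation, not taken from the example itself: the symbols y*, x, beta and sigma are assumptions introduced here, with sigma corresponding to the SIGMA row in the LDV output.

```latex
% Latent-variable form of the standard Tobit model, censored below at zero
y_i^{*} = x_i'\beta + \varepsilon_i, \qquad
\varepsilon_i \sim N(0,\sigma^{2}), \qquad
y_i = \max(0,\, y_i^{*})

% Log likelihood: censored observations contribute a probability,
% uncensored observations contribute a normal density
\log L(\beta,\sigma) =
  \sum_{y_i = 0} \log \Phi\!\left(-\frac{x_i'\beta}{\sigma}\right)
  + \sum_{y_i > 0} \left[ \log \phi\!\left(\frac{y_i - x_i'\beta}{\sigma}\right) - \log \sigma \right]
```

Here \phi and \Phi are the standard normal density and distribution function.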

 

These do OLS regressions on the positive observations only. Because the sample is selected on the value of the dependent variable, these estimates will be biased:

 

linreg(smpl=share1>0) share1

# constant age nadults nkids nkids2 lnx agelnx nadlnx

linreg(smpl=share2>0) share2

# constant age nadults nkids nkids2 lnx agelnx nadlnx
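The source of the bias can be seen from the mean of the dependent variable conditional on being positive. Under the standard Tobit assumptions (latent y* = x'beta + epsilon with normal errors of variance sigma squared; this notation is an assumption introduced here for illustration), that mean is:

```latex
% Mean of y given that it is positive, under the standard Tobit model
E\left[\, y_i \mid y_i > 0 \,\right]
  = x_i'\beta + \sigma\, \lambda\!\left(\frac{x_i'\beta}{\sigma}\right),
\qquad
\lambda(z) = \frac{\phi(z)}{\Phi(z)}
```

The second term varies with the regressors, so a regression on the positive observations that omits it does not recover beta.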

 

These now do the Tobit II (two-step) models, which require first-step probits for non-zero consumption. While not strictly necessary, we remap the share values into 0–1 dummies (DDV would work fine with just the zero/non-zero coding). After each probit, PRJ with the MILLS option computes the inverse Mills ratio into LAMBDA, which is then added to the LINREG as the bias-correction term.

 

set choice1 = share1>0

set choice2 = share2>0

 

ddv(noprint) choice1

# constant age nadults nkids nkids2 lnx $

    agelnx nadlnx bluecol whitecol

prj(mills=lambda)

linreg(smpl=share1>0,title="Tobit II") share1

# constant age nadults nkids nkids2 lnx agelnx nadlnx lambda

 

ddv(noprint) choice2

# constant age nadults nkids nkids2 lnx $

    agelnx nadlnx bluecol whitecol

prj(mills=lambda)

linreg(smpl=share2>0,title="Tobit II") share2

# constant age nadults nkids nkids2 lnx agelnx nadlnx lambda
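To make the correction term concrete: the inverse Mills ratio at a probit index value z is the standard normal density divided by the standard normal distribution function, both evaluated at z. The following is a minimal Python sketch of that formula only. It is not the RATS implementation, and the function names (norm_pdf, norm_cdf, inverse_mills) are hypothetical names chosen for this illustration.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def inverse_mills(index):
    """Inverse Mills ratio phi(index) / Phi(index) for a probit index value.

    Note: for extremely negative index values Phi underflows to zero in
    floating point; this simple sketch does not guard against that.
    """
    return norm_pdf(index) / norm_cdf(index)


# The ratio is large when the probit index is low (participation is unlikely)
# and shrinks toward zero as the index rises.
for idx in (-1.0, 0.0, 1.0):
    print(f"index={idx:+.1f}  lambda={inverse_mills(idx):.4f}")
```

In the two-step regressions above, the coefficient on LAMBDA multiplies this term; a coefficient near zero means the correction changes the fitted values very little.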

 

Full Program

open data tobacco.asc
data(format=free,org=columns) 1 2724 bluecol whitecol flanders  $
  walloon nkids nkids2 nadults lnx share2 $
  share1 nadlnx agelnx age d1 d2 w1 w2 lnx2 age2
*
* Tobit I models
*
ldv(lower=0.0,censor=lower) share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx
ldv(lower=0.0,censor=lower) share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx
*
* OLS regressions on positive observations only
*
linreg(smpl=share1>0) share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx
linreg(smpl=share2>0) share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx
*
* Tobit II models, which require first step probits for non-zero
* consumption. While not strictly necessary, we remap the share values
* into 0-1 dummies. (DDV would work fine with just the zero-non-zero
* coding).
*
set choice1 = share1>0
set choice2 = share2>0

ddv(noprint) choice1
# constant age nadults nkids nkids2 lnx $
    agelnx nadlnx bluecol whitecol
prj(mills=lambda)
linreg(smpl=share1>0,title="Tobit II") share1
# constant age nadults nkids nkids2 lnx agelnx nadlnx lambda
*
ddv(noprint) choice2
# constant age nadults nkids nkids2 lnx $
    agelnx nadlnx bluecol whitecol
prj(mills=lambda)
linreg(smpl=share2>0,title="Tobit II") share2
# constant age nadults nkids nkids2 lnx agelnx nadlnx lambda

Output

 

ML-Censored Below - Estimation by Newton-Raphson

Convergence in     4 Iterations. Final criterion was  0.0000000 <=  0.0000100

Dependent Variable SHARE1

Usable Observations                      2724

Degrees of Freedom                       2716

Log Likelihood                      4755.3709

 

    Variable                        Coeff      Std Error      T-Stat      Signif

************************************************************************************

1.  Constant                     -0.159151575  0.043777979     -3.63543  0.00027752

2.  AGE                           0.013493534  0.010882374      1.23994  0.21499614

3.  NADULTS                       0.029189530  0.016946803      1.72242  0.08499328

4.  NKIDS                        -0.002640842  0.000604858     -4.36605  0.00001265

5.  NKIDS2                       -0.003878905  0.002383502     -1.62740  0.10365270

6.  LNX                           0.012667747  0.003215595      3.93947  0.00008166

7.  AGELNX                       -0.000809257  0.000800551     -1.01087  0.31207636

8.  NADLNX                       -0.002248384  0.001223245     -1.83805  0.06605518

************************************************************************************

9.  SIGMA                         0.024419881  0.000374460     65.21367  0.00000000


 

ML-Censored Below - Estimation by Newton-Raphson

Convergence in     5 Iterations. Final criterion was  0.0000016 <=  0.0000100

Dependent Variable SHARE2

Usable Observations                      2724

Degrees of Freedom                       2716

Log Likelihood                       758.7003

 

    Variable                        Coeff      Std Error      T-Stat      Signif

************************************************************************************

1.  Constant                      0.589976622  0.093426841      6.31485  0.00000000

2.  AGE                          -0.125852311  0.024178251     -5.20519  0.00000019

3.  NADULTS                       0.015370892  0.038047509      0.40399  0.68621854

4.  NKIDS                         0.004269695  0.001324657      3.22325  0.00126747

5.  NKIDS2                       -0.009971885  0.005471315     -1.82258  0.06836770

6.  LNX                          -0.044431138  0.006889338     -6.44926  0.00000000

7.  AGELNX                        0.008822082  0.001783193      4.94735  0.00000075

8.  NADLNX                       -0.000600806  0.002750125     -0.21846  0.82706689

************************************************************************************

9.  SIGMA                         0.047995080  0.001183158     40.56523  0.00000000

 

Linear Regression - Estimation by Least Squares

Dependent Variable SHARE1

Usable Observations                      2258

Degrees of Freedom                       2250

Skipped/Missing (from 2724)               466

Centered R^2                        0.0509899

R-Bar^2                             0.0480374

Uncentered R^2                      0.5135232

Mean of Dependent Variable       0.0215072981

Std Error of Dependent Variable  0.0220618317

Standard Error of Estimate       0.0215254140

Sum of Squared Residuals         1.0425227566

Regression F(7,2250)                  17.2702

Significance Level of F             0.0000000

Log Likelihood                      5467.4243

Durbin-Watson Statistic                1.9178

 

    Variable                        Coeff      Std Error      T-Stat      Signif

************************************************************************************

1.  Constant                      0.052714568  0.043851337      1.20212  0.22944354

2.  AGE                           0.007797104  0.010981578      0.71002  0.47776735

3.  NADULTS                      -0.013115039  0.016321653     -0.80354  0.42174976

4.  NKIDS                        -0.002030671  0.000574337     -3.53568  0.00041494

5.  NKIDS2                       -0.002427396  0.002284617     -1.06250  0.28812467

6.  LNX                          -0.002319931  0.003211082     -0.72248  0.47007677

7.  AGELNX                       -0.000410507  0.000804935     -0.50999  0.61010979

8.  NADLNX                        0.000829210  0.001176810      0.70463  0.48111646


 

Linear Regression - Estimation by Least Squares

Dependent Variable SHARE2

Usable Observations                      1036

Degrees of Freedom                       1028

Skipped/Missing (from 2724)              1688

Centered R^2                        0.1539975

R-Bar^2                             0.1482368

Uncentered R^2                      0.5866594

Mean of Dependent Variable       0.0321908141

Std Error of Dependent Variable  0.0314790344

Standard Error of Estimate       0.0290523210

Sum of Squared Residuals         0.8676704034

Regression F(7,1028)                  26.7324

Significance Level of F             0.0000000

Log Likelihood                      2200.0438

Durbin-Watson Statistic                2.0383

 

    Variable                        Coeff      Std Error      T-Stat      Signif

************************************************************************************

1.  Constant                      0.489659067  0.074059493      6.61170  0.00000000

2.  AGE                          -0.031466154  0.020563187     -1.53022  0.12627048

3.  NADULTS                      -0.013026320  0.032414853     -0.40186  0.68786860

4.  NKIDS                         0.001284717  0.001054086      1.21880  0.22320083

5.  NKIDS2                       -0.003436886  0.004556050     -0.75436  0.45080792

6.  LNX                          -0.033576628  0.005467243     -6.14142  0.00000000

7.  AGELNX                        0.002209715  0.001515954      1.45764  0.14524528

8.  NADLNX                        0.001112516  0.002345014      0.47442  0.63530279

 

Linear Regression - Estimation by Tobit II

Dependent Variable SHARE1

Usable Observations                      2258

Degrees of Freedom                       2249

Skipped/Missing (from 2724)               466

Centered R^2                        0.0509899

R-Bar^2                             0.0476142

Uncentered R^2                      0.5135233

Mean of Dependent Variable       0.0215072981

Std Error of Dependent Variable  0.0220618317

Standard Error of Estimate       0.0215301983

Sum of Squared Residuals         1.0425226858

Regression F(8,2249)                  15.1047

Significance Level of F             0.0000000

Log Likelihood                      5467.4244

Durbin-Watson Statistic                1.9178

 

    Variable                        Coeff      Std Error      T-Stat      Signif

************************************************************************************

1.  Constant                      0.054269233  0.133258542      0.40725  0.68386484

2.  AGE                           0.007709525  0.013072734      0.58974  0.55542364

3.  NADULTS                      -0.013344930  0.024753654     -0.53911  0.58986466

4.  NKIDS                        -0.002024425  0.000765253     -2.64543  0.00821549

5.  NKIDS2                       -0.002412687  0.002576666     -0.93636  0.34918830

6.  LNX                          -0.002428894  0.009386018     -0.25878  0.79583030

7.  AGELNX                       -0.000404420  0.000943883     -0.42846  0.66835408

8.  NADLNX                        0.000846170  0.001808316      0.46793  0.63987788

9.  LAMBDA                       -0.000204616  0.016561500     -0.01235  0.99014353


 

Linear Regression - Estimation by Tobit II

Dependent Variable SHARE2

Usable Observations                      1036

Degrees of Freedom                       1027

Skipped/Missing (from 2724)              1688

Centered R^2                        0.1541960

R-Bar^2                             0.1476075

Uncentered R^2                      0.5867564

Mean of Dependent Variable       0.0321908141

Std Error of Dependent Variable  0.0314790344

Standard Error of Estimate       0.0290630517

Sum of Squared Residuals         0.8674668222

Regression F(8,1027)                  23.4037

Significance Level of F             0.0000000

Log Likelihood                      2200.1653

Durbin-Watson Statistic                2.0368

 

    Variable                        Coeff      Std Error      T-Stat      Signif

************************************************************************************

1.  Constant                      0.451581525  0.107259090      4.21019  0.00002775

2.  AGE                          -0.017299170  0.035438374     -0.48815  0.62554924

3.  NADULTS                      -0.017437770  0.033648811     -0.51823  0.60441054

4.  NKIDS                         0.000764339  0.001495139      0.51122  0.60930951

5.  NKIDS2                       -0.002075534  0.005334998     -0.38904  0.69732624

6.  LNX                          -0.030109460  0.008932479     -3.37078  0.00077745

7.  AGELNX                        0.001224297  0.002515693      0.48666  0.62660028

8.  NADLNX                        0.001364974  0.002401581      0.56836  0.56991146

9.  LAMBDA                       -0.009017865  0.018368604     -0.49094  0.62357444


 


Copyright © 2025 Thomas A. Doan