RATS 10.1

NEURAL.RPF fits a neural network to a binary choice model using the data set from the PROBIT.RPF example. Like a probit model (also estimated here), the neural net attempts to explain the YESVM data given the characteristics of the individuals. Aside from a different functional form, the neural net model also differs by using the sum of squared errors rather than the likelihood as a criterion function.
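In symbols, if $f(x_t;\theta)$ denotes the network's fitted value and $\Phi$ the standard normal CDF, the network is trained to minimize

$$\sum_t \bigl(y_t - f(x_t;\theta)\bigr)^2$$

whereas the probit chooses $\beta$ to maximize the log likelihood

$$\sum_t \Bigl[\,y_t \log \Phi(x_t'\beta) + (1-y_t)\log\bigl(1-\Phi(x_t'\beta)\bigr)\Bigr]$$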

 

The example first estimates a linear probability model (LPM) and computes its "fitted" values. Because the LPM doesn't constrain the fitted values to the [0,1] range, some of them may be (and here are) outside that interval.

 

linreg yesvm

# constant public1_2 public3_4 public5 private years teacher $

   loginc logproptax

prj lpmfitted
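As a quick check, you can count the out-of-range fitted values with SSTATS. This snippet is our addition, not part of NEURAL.RPF; the name LPMOUT is hypothetical:

* Count the LPM fitted values that fall outside [0,1]
sstats / (lpmfitted<0.0 .or. lpmfitted>1.0)>>lpmout
display "LPM fitted values outside [0,1]:" lpmout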

 

This estimates the probit model, then uses PRJ to compute the fitted probabilities.

 

ddv(dist=probit) yesvm

# constant public1_2 public3_4 public5 private years teacher $

   loginc logproptax

prj(distr=probit,cdf=prfitted)

 

This does the neural network. We use two hidden nodes and one direct connection. (Two hidden nodes alone can't cover the space of values well enough.) Note that the CONSTANT isn't included in the explanatory variables, since it's automatically included.

 

nnlearn(hidden=2,direct=1,iters=10000,save=nnmeth)

# public1_2 public3_4 public5 private years teacher $

   loginc logproptax

# yesvm
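Schematically, a network with a logistic hidden layer plus direct input-to-output connections fits a function of the general form below. This is the textbook parameterization, not necessarily RATS' exact internal scaling of the inputs and weights:

$$\hat y_t = \varphi\Bigl(a_0 + \sum_j d_j x_{jt} + \sum_{k=1}^{2} c_k\,\varphi\bigl(b_{0k} + \sum_j b_{jk} x_{jt}\bigr)\Bigr), \qquad \varphi(z)=\frac{1}{1+e^{-z}}$$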

 

This computes the fitted (forecast) values from the network, saving them into the series TESTVM.

 

nntest / nnmeth

# public1_2 public3_4 public5 private years teacher $

   loginc logproptax

# testvm

 

This uses SSTATS to compute the number of correct predictions for the various models and displays them in a REPORT. SMPL=YESVM==0 restricts the sample to the observations where the observed value is 0; if the fitted value from a model is less than .5, the model is correctly predicting 0. Similarly, if we restrict to the sample where YESVM==1, we are looking for a fitted value greater than .5.

 

sstats(smpl=yesvm==0) / 1>>nos testvm<.5>>nnnos $

   lpmfitted<.5>>lpmnos prfitted<.5>>prbnos

sstats(smpl=yesvm==1) / 1>>yes testvm>.5>>nnyes $

  lpmfitted>.5>>lpmyes prfitted>.5>>prbyes

*

report(action=define,$

   hlabels=||"Vote","Actual","Neural Net","LPM","Probit"||)

report(atcol=1) "No" nos nnnos lpmnos prbnos

report(atcol=1) "Yes" yes nnyes lpmyes prbyes

report(action=show)
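If you prefer hit rates to raw counts, a short follow-up computation (again our addition; the *RATE names are hypothetical) is:

* Overall fraction of correct predictions for each model
compute nnrate  = (nnnos+nnyes)/(nos+yes)
compute lpmrate = (lpmnos+lpmyes)/(nos+yes)
compute prbrate = (prbnos+prbyes)/(nos+yes)
display "Fraction correct (NN, LPM, Probit):" nnrate lpmrate prbrate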

Full Program

 

open data probit.dat

data(org=obs) 1 95 public1_2 public3_4 public5 private $

   years teacher loginc logproptax yesvm

*

* Linear probability model. Compute the "fitted" values. Because the LPM

* doesn't constrain the fitted values to the [0,1] range, some of them

* may be (and are) outside that.

*

linreg(title="Linear Probability Model") yesvm

# constant public1_2 public3_4 public5 private years teacher $

   loginc logproptax

prj lpmfitted

*

* Probit model. Compute the fitted probabilities.

*

ddv(dist=probit) yesvm

# constant public1_2 public3_4 public5 private years teacher $

   loginc logproptax

prj(distr=probit,cdf=prfitted)

*

* Neural network. We use two hidden nodes and one direct. (Two hidden

* nodes alone can't cover the space of values well enough). Note that the

* CONSTANT isn't included in the explanatory variables, since it's

* automatically included.

*

nnlearn(hidden=2,direct=1,iters=10000,save=nnmeth)

# public1_2 public3_4 public5 private years teacher $

   loginc logproptax

# yesvm

*

* Compute the forecast values from the network.

*

nntest / nnmeth

# public1_2 public3_4 public5 private years teacher $

   loginc logproptax

# testvm

*

* Compute the number of correct predictions for the various models

*

sstats(smpl=yesvm==0) / 1>>nos testvm<.5>>nnnos $

   lpmfitted<.5>>lpmnos prfitted<.5>>prbnos

sstats(smpl=yesvm==1) / 1>>yes testvm>.5>>nnyes $

  lpmfitted>.5>>lpmyes prfitted>.5>>prbyes

*

report(action=define,title="Number of Correct Predictions",$

   hlabels=||"Vote","Actual","Neural Net","LPM","Probit"||)

report(atcol=1) "No" nos nnnos lpmnos prbnos

report(atcol=1) "Yes" yes nnyes lpmyes prbyes

report(action=format,picture="*.")

report(action=show)

 

Output

Note that neural networks rarely "converge" in the conventional sense; they just train to a certain level and stop. Here NNLEARN runs up against the ITERS=10000 limit, which is why the output reports that convergence was not achieved.


 

Linear Regression - Estimation by Linear Probability Model

Dependent Variable YESVM

Usable Observations                        95

Degrees of Freedom                         86

Centered R^2                        0.2121226

R-Bar^2                             0.1388317

Uncentered R^2                      0.7097294

Mean of Dependent Variable       0.6315789474

Std Error of Dependent Variable  0.4849354328

Standard Error of Estimate       0.4500159746

Sum of Squared Residuals         17.416236459

Regression F(8,86)                     2.8943

Significance Level of F             0.0066400

Log Likelihood                       -54.2166

Durbin-Watson Statistic                1.9076

 

    Variable                        Coeff      Std Error      T-Stat      Signif

************************************************************************************

1.  Constant                     -0.601793045  1.443449504     -0.41691  0.67778079

2.  PUBLIC1_2                     0.069730975  0.143338772      0.48648  0.62786665

3.  PUBLIC3_4                     0.221482770  0.157009294      1.41063  0.16195934

4.  PUBLIC5                       0.116358585  0.256623843      0.45342  0.65138781

5.  PRIVATE                      -0.200486440  0.163067040     -1.22947  0.22224784

6.  YEARS                         0.000532853  0.005316808      0.10022  0.92040257

7.  TEACHER                       0.282277689  0.154075166      1.83208  0.07040135

8.  LOGINC                        0.456984047  0.136871435      3.33878  0.00124538

9.  LOGPROPTAX                   -0.496719240  0.178449012     -2.78354  0.00661003

 

 

Binary Probit - Estimation by Newton-Raphson

Convergence in     6 Iterations. Final criterion was  0.0000081 <=  0.0000100

 

Dependent Variable YESVM

Usable Observations                        95

Degrees of Freedom                         86

Log Likelihood                       -49.8523

Average Likelihood                  0.5916966

Pseudo-R^2                          0.2577210

Log Likelihood(Base)                 -62.5204

LR Test of Coefficients(8)            25.3363

Significance Level of LR            0.0013631

 

    Variable                        Coeff      Std Error      T-Stat      Signif

************************************************************************************

1.  Constant                     -3.541348266  4.682990221     -0.75622  0.44952020

2.  PUBLIC1_2                     0.274909168  0.437072132      0.62898  0.52936284

3.  PUBLIC3_4                     0.774834354  0.489462462      1.58303  0.11341435

4.  PUBLIC5                       0.322397549  0.795262289      0.40540  0.68518516

5.  PRIVATE                      -0.599412191  0.478639801     -1.25232  0.21045177

6.  YEARS                         0.002428853  0.017017002      0.14273  0.88650269

7.  TEACHER                       1.680510627  0.972635994      1.72779  0.08402590

8.  LOGINC                        1.738724221  0.519018741      3.35002  0.00080805

9.  LOGPROPTAX                   -1.993735567  0.720354248     -2.76772  0.00564507

 

 

Neural Network

Convergence not achieved in 10000 epochs

Mean Squared Error = 1.099801e-01, RSquared =   0.527347

 

 

Vote Actual Neural Net LPM Probit

No       35         26  17     16

Yes      60         57  51     53

 

 


 

