RATS 10.1

GIBBS.RPF demonstrates Gibbs sampling for a multiple regression with an independent prior. It is adapted from Koop (2003), Section 4.2.7, pp. 73-77. It demonstrates the use of several functions for doing draws for linear regressions (particularly %RANMVPOSTCMOM), and the @MCMCPOSTPROC procedure for post-processing MCMC draws.

 

The model analyzed is a linear regression, estimating a hedonic regression for housing prices. The explanatory variables are lot size, number of bedrooms, number of bathrooms and number of storeys. The prior used is fairly loose. It's usually easier to think about a prior in terms of a mean and a standard deviation for each coefficient, so the standard deviations need to be converted into a prior precision by squaring them to form the (diagonal) covariance matrix and then inverting that matrix. A diagonal prior covariance is fairly typical: we're likely to have an idea about how large each coefficient might be, but very little to say about the correlations among them.
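Written out, with h denoting the residual precision (the reciprocal of the error variance, which the sampler draws along with the coefficients), the model is

\text{price}_t = \beta_1 + \beta_2\,\text{lot\_siz}_t + \beta_3\,\text{bed}_t + \beta_4\,\text{bath}_t + \beta_5\,\text{stor}_t + \varepsilon_t,\qquad \varepsilon_t \sim N(0,\,h^{-1})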

 

This is the regression of interest:

 

linreg price

# constant lot_siz bed bath stor

 

This is the prior for the variance. Setting it generally requires at least some familiarity with the scale of the data, though with over 500 observations, a prior with just 5 degrees of freedom will be dominated by the data.

 

compute s2prior = 5000.0^2

compute nuprior = 5.0
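In the scaled chi-squared (equivalently, Gamma) parameterization used by the posterior draw in the loop below, these two numbers amount to the prior

\bar{\nu}\,\bar{s}^2\,h \sim \chi^2(\bar{\nu}),\qquad \bar{s}^2 = 5000^2,\ \bar{\nu} = 5

which puts the prior mean of the precision h at 1/5000^2, that is, an error standard deviation on the order of 5000, backed by only 5 degrees of freedom.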

 

This is the mean and variance for the prior on the regression coefficients themselves. What are actually specified are the prior standard deviations, which are squared to produce the variances; the resulting covariance matrix is then inverted to give the precision.

 

compute [vector] bprior =||0.0,10.0,5000.0,10000.0,10000.0||

compute [symm]   vprior = $

   %diag(||10000.0^2,5.0^2,2500.0^2,5000.0^2,5000.0^2||)

compute [symm]   hprior = inv(vprior)
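In matrix form, the prior covariance and precision are

\bar{V} = \mathrm{diag}(10000^2,\ 5^2,\ 2500^2,\ 5000^2,\ 5000^2),\qquad \bar{H} = \bar{V}^{-1}

and since \bar{V} is diagonal, inverting it just replaces each diagonal entry with its reciprocal: each prior precision is one over the squared prior standard deviation.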

 

Compute the cross product matrix from the regression. This will include the dependent variable as the final row. The cross product matrix has all the sufficient statistics for the likelihood (except the number of observations, which is %NOBS).

 

cmom(lastreg)
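With X denoting the matrix of regressors and y the dependent variable, the matrix that CMOM computes (and that the loop below references as %CMOM) is

[X\ \ y]'\,[X\ \ y] = \begin{bmatrix} X'X & X'y \\ y'X & y'y \end{bmatrix}

which, together with %NOBS, is everything the sampler needs from the data.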

 

The two obvious places to start the Gibbs sampler are the OLS estimates and the prior. We’ll use OLS.

 

compute bdraw  = %beta

compute s2draw = %seesq
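If you preferred to start from the prior instead (the other obvious choice mentioned above), the corresponding initialization, not used here, would simply be:

compute bdraw  = bprior

compute s2draw = s2prior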

 

compute nburn  = 1000

compute ndraws = 10000

 

This sets up and initializes the two locations where we are storing information generated by the draws. BGIBBS saves the coefficient vectors, while HGIBBS saves the draws for the precision (reciprocal of the variance).

 

dec series[vect] bgibbs

dec series       hgibbs

gset bgibbs 1 ndraws = %zeros(%nreg,1)

set  hgibbs 1 ndraws = 0.0

 

This is a relatively simple draw loop. We first draw the residual precision given the betas, then the betas given the new draw of the residual precision. Once we are past the burn-in, we save the BDRAW vector and the HDRAW value into entry DRAW of the two series.
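In equations, the two full conditionals being drawn are the standard ones for the independent Normal-Gamma model (the second is what %RANMVPOSTCMOM constructs from the X'X and X'y blocks of %CMOM):

\left(\bar{\nu}\bar{s}^2 + (y - X\beta)'(y - X\beta)\right)h \;\big|\; \beta, y \;\sim\; \chi^2(\bar{\nu} + T)

\beta \;\big|\; h, y \;\sim\; N(\tilde{b}, \tilde{V}),\qquad \tilde{V} = (\bar{H} + h\,X'X)^{-1},\quad \tilde{b} = \tilde{V}(\bar{H}\bar{b} + h\,X'y)

where T is the number of observations (%NOBS), \bar{b} and \bar{H} are the prior mean and precision of the coefficients, and \bar{\nu} and \bar{s}^2 are the prior degrees of freedom and scale for the variance.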

   

do draw=-nburn,ndraws

   compute rssplus = nuprior*s2prior + %rsscmom(%cmom,bdraw)

   compute hdraw   = %ranchisqr(nuprior+%nobs)/rssplus

  

   compute bdraw = %ranmvpostcmom(%cmom,hdraw,hprior,bprior)

   if draw<=0

      next

   compute bgibbs(draw) = bdraw

   compute hgibbs(draw) = hdraw

end do draw

 

Now we do the post-processing, computing the means, standard errors, numerical standard errors (estimates of how accurately the simulated means approximate the true posterior means) and CD convergence measures:

 

@mcmcpostproc(ndraws=ndraws,mean=bmean,$

    stderrs=bstderrs,cd=bcd,nse=bnse) bgibbs
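For reference, a sketch of the usual definitions (the exact subsample fractions and spectral estimator are implementation details of @MCMCPOSTPROC): with S retained draws, the numerical standard error of an estimated posterior mean is

\mathrm{NSE} = \sqrt{\hat{S}(0)/S}

where \hat{S}(0) estimates the spectral density of the draw sequence at frequency zero (so it accounts for the autocorrelation in the Gibbs draws), and Geweke's CD compares the means of an early subsample A and a late subsample B of the chain,

\mathrm{CD} = \frac{\bar{\theta}_A - \bar{\theta}_B}{\sqrt{\mathrm{NSE}_A^2 + \mathrm{NSE}_B^2}}

which is asymptotically standard Normal if the chain has converged.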

 

This sets up a REPORT with the output from @MCMCPOSTPROC. %EQNREGLABELS(0)(I) is the regressor label for coefficient I in the last estimated regression (equation 0). The report is formatted to use three digits to the right of the decimal on the mean, standard error and numerical standard error (all of which depend upon the scales of the coefficients) and two digits on the CD measure (which doesn't, since it has an asymptotic standard Normal distribution).

 

report(action=define)

report(atrow=1,atcol=1,align=center) "Variable" "Coeff" $

    "Std Error" "NSE" "CD"

do i=1,%nreg

   report(row=new,atcol=1) %eqnreglabels(0)(i) bmean(i) $

     bstderrs(i) bnse(i) bcd(i)

end do i

report(action=format,atcol=2,tocol=4,picture="*.###")

report(action=format,atcol=5,picture="*.##")

report(action=show)

 

Full Program


 

open data hprice.prn
data(format=prn,org=columns) 1 546 price lot_siz bed bath stor drive $
 recroom bment gas aircond gar desireloc
*
* This is the equation we're analyzing
*
linreg price
# constant lot_siz bed bath stor
*
* Prior for variance.
*
compute s2prior=5000.0^2
compute nuprior=5.0
*
* Prior mean and variance. Convert the variance to precision by
* inverting.
*
compute [vector] bprior=||0.0,10.0,5000.0,10000.0,10000.0||
compute [symm]   vprior=$
   %diag(||10000.0^2,5.0^2,2500.0^2,5000.0^2,5000.0^2||)
compute [symm]   hprior =inv(vprior)
*
* Compute the cross product matrix from the regression. This will
* include the dependent variable as the final row. The cross product
* matrix has all the sufficient statistics for the likelihood (except
* the number of observations, which is %nobs).
*
cmom(lastreg)
*
* The two obvious places to start the Gibbs sampler are the OLS
* estimates and the prior. We'll use OLS.
*
compute bdraw =%beta
compute s2draw=%seesq
*
* We'll use the following for all MCMC estimators. This ends up taking
* nburn+1 burn-ins.
*
compute nburn =1000
compute ndraws=10000
*
dec series[vect] bgibbs
dec series       hgibbs
gset bgibbs 1 ndraws = %zeros(%nreg,1)
set  hgibbs 1 ndraws = 0.0
*
do draw=-nburn,ndraws
   *
   * Draw residual precision conditional on previous beta
   *
   compute rssplus=nuprior*s2prior+%rsscmom(%cmom,bdraw)
   compute hdraw  =%ranchisqr(nuprior+%nobs)/rssplus
   *
   * Draw betas given hdraw
   *
   compute bdraw  =%ranmvpostcmom(%cmom,hdraw,hprior,bprior)
   if draw<=0
      next
   *
   * Do the bookkeeping here.
   *
   compute bgibbs(draw)=bdraw
   compute hgibbs(draw)=hdraw
end do draw
*
@mcmcpostproc(ndraws=ndraws,mean=bmean,$
    stderrs=bstderrs,cd=bcd,nse=bnse) bgibbs
report(action=define)
report(atrow=1,atcol=1,align=center) "Variable" "Coeff" $
    "Std Error" "NSE" "CD"
do i=1,%nreg
   report(row=new,atcol=1) %eqnreglabels(0)(i) bmean(i) $
     bstderrs(i) bnse(i) bcd(i)
end do i
report(action=format,atcol=2,tocol=4,picture="*.###")
report(action=format,atcol=5,picture="*.##")
report(action=show)

Output

 

Linear Regression - Estimation by Least Squares

Dependent Variable PRICE

Usable Observations                       546

Degrees of Freedom                        541

Centered R^2                        0.5355473

R-Bar^2                             0.5321133

Uncentered R^2                      0.9382388

Mean of Dependent Variable       68121.597070

Std Error of Dependent Variable  26702.670926

Standard Error of Estimate       18265.227116

Sum of Squared Residuals         1.80488e+011

Regression F(4,541)                  155.9529

Significance Level of F             0.0000000

Log Likelihood                     -6129.9928

Durbin-Watson Statistic                1.4829

 

    Variable                        Coeff      Std Error      T-Stat      Signif

************************************************************************************

1.  Constant                      -4009.54998   3603.10903     -1.11280  0.26628732

2.  LOT_SIZ                           5.42917      0.36925     14.70325  0.00000000

3.  BED                            2824.61379   1214.80763      2.32515  0.02043276

4.  BATH                          17105.17446   1734.43413      9.86211  0.00000000

5.  STOR                           7634.89700   1007.97449      7.57449  0.00000000


The Gibbs sampling output in the table below depends upon random numbers and so will be similar, but not exactly the same, each time the program is re-run.

 

Variable   Coeff   Std Error  NSE    CD

Constant -4075.370  3236.783 34.341 -1.16

LOT_SIZ      5.449     0.362  0.004  1.52

BED       3215.961  1058.250 10.613  0.86

BATH     16096.096  1612.908 15.978 -0.06

STOR      7703.893   971.488  9.882 -1.31

 

 


Copyright © 2025 Thomas A. Doan