LDV Instruction

LDV(options) depvar start end residuals

# list of explanatory variables (in Regression Format)

Estimates limited dependent variable models, that is, regressions in which the dependent variable is censored, truncated, or observed only as an interval.

Wizard

The Statistics>Limited/Discrete Dependent Variable Wizard provides dialog-driven access to most of the features of LDV.

Parameters

depvar	Dependent variable. RATS requires numeric coding for this.
start, end	Estimation range. If you have not set a SMPL, this defaults to the maximum common range of all the variables involved.
residuals	(Optional) series for the residuals

Options

Standard Regression Options

Standard Non-Linear Estimation Options

Robust Error Options

TRUNCATE=[NEITHER]/LOWER/UPPER/BOTH

CENSOR=[NEITHER]/LOWER/UPPER/BOTH

INTERVAL/[NOINTERVAL]

These options choose the type of model to estimate. Use TRUNCATE when an observation appears in the data set only if its dependent variable is within range. Use CENSOR when observations which hit a limit remain in the data set, with the dependent variable recorded at that limit. Use INTERVAL when the data consist only of upper and lower limits, that is, all you can observe are bounds above and below the true value. With INTERVAL, you still need to supply the dependent variable, but it is used only to determine which observations to use.

UPPER=SERIES of upper limits

LOWER=SERIES of lower limits

Use these options to supply series containing the upper and/or lower bound values as required by your choice of TRUNCATE, CENSOR, or INTERVAL options (note that INTERVAL requires that you supply both UPPER and LOWER series). Use missing value codes for any entries that are to be treated as unlimited.
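As an illustration of per-observation limits, here is a minimal sketch of a top-coded regression. The series names EARNINGS, YEAR, TOPCODE, EDUC and EXPER, and the cap of 50000, are hypothetical and chosen only for this example; the options themselves are the ones described above.

```
* Hypothetical data: earnings were top-coded at 50000 before 1990 only.
* %NA in TOPCODE marks entries with no upper limit.
set topcode = %if(year<1990,50000.0,%na)
ldv(censor=upper,upper=topcode) earnings
# constant educ exper
```

Entries where TOPCODE is missing are treated as uncensored, so a single LDV call can mix capped and uncapped observations.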

SIGMA=input value for the standard deviation of the regression [none - estimated]

Use this to fix the standard deviation of the regression equation at a known value. If you omit it, the standard deviation is estimated along with the coefficients.

GRESIDS=SERIES of generalized residuals [unused]

Use this option if you want to save the generalized residuals to a series.

EQUATION=Equation to estimate [unused]

INITIAL=vector of initial guesses [unused]

WEIGHT=series of weights for the data points

Use this option if you want to provide different weights for each observation.

Description

See Censored and Truncated Samples for technical information on these models. All the models are based upon the standard linear regression model with i.i.d. Normal errors:

(1) \({y_i} = {{\bf{X}}_i}\beta + {u_i}\,\,;\,\,{u_i} \sim N(0,{\sigma ^2})\,\,\text{i.i.d.}\)

They differ in when, and which, values of the dependent variable can be observed. Note that while in theory a data set could be truncated at one end and censored at the other, LDV isn’t designed for that combination.

Truncated and censored models tend to be fairly easy to set up. The INTERVAL estimator is a bit trickier. It’s used when all that is observed for an individual is a pair of values which bracket the true dependent variable. If you have finite upper and lower bounds for every observation, you’re unlikely to gain much from LDV over a linear regression which uses the interval midpoints as the dependent variable. INTERVAL is most useful when some of the observations are unlimited on one end.

With the interval estimator, the “dependent variable” is really provided by two series, one indicated with the UPPER option and one with the LOWER option. The depvar series itself is used only to determine which observations are valid. LDV can’t decide that from the upper and lower series alone, since a missing value in either of them signals no limit in that direction.
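A minimal sketch of the interval setup, assuming a hypothetical survey in which income is reported only in three brackets (below 20, 20 to 50, above 50), coded 1, 2 and 3 in a series INCCAT. The series names INCCAT, LO, HI, EDUC and AGE and the bracket values are illustrative assumptions, not part of the original examples.

```
* Bracket bounds; %NA means "no limit in that direction"
* (names and cut-offs are hypothetical)
set lo = %if(inccat==1,%na,%if(inccat==2,20.0,50.0))
set hi = %if(inccat==3,%na,%if(inccat==2,50.0,20.0))
* INCCAT is listed as the dependent variable only so that LDV
* can tell which entries are usable
ldv(interval,lower=lo,upper=hi) inccat
# constant educ age
```

The bottom and top brackets are open-ended, which is the situation where INTERVAL is expected to help most relative to a midpoint regression.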

All models are estimated by Newton-Raphson on the model reparameterized as described in Olsen (1978), that is with \(\{ \gamma ,h\} \equiv \{ \beta /\sigma ,1/\sigma \} \). This assumes that \(\sigma\) is being estimated: if you want to input a specific value, use the SIGMA option. With this parameterization, for instance, the log likelihood for an observation for the interval model is

(2) \(\log L = \begin{cases} \log \left( 1 - \Phi ({L_i}h - {{\bf{X}}_i}\gamma ) \right) & \text{if unbounded above} \\ \log \Phi ({U_i}h - {{\bf{X}}_i}\gamma ) & \text{if unbounded below} \\ \log \left( \Phi ({U_i}h - {{\bf{X}}_i}\gamma ) - \Phi ({L_i}h - {{\bf{X}}_i}\gamma ) \right) & \text{otherwise} \end{cases}\)
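To connect (2) back to the natural parameters, note that with \(\gamma = \beta /\sigma\) and \(h = 1/\sigma\), each argument of \(\Phi\) is a standardized bound. The following worked step is a restatement added for illustration:

```latex
U_i h - \mathbf{X}_i \gamma = \frac{U_i - \mathbf{X}_i \beta}{\sigma},
\qquad
L_i h - \mathbf{X}_i \gamma = \frac{L_i - \mathbf{X}_i \beta}{\sigma}
```

Under model (1), the "otherwise" branch of (2) is therefore \(\log P({L_i} < {y_i} < {U_i})\), and the two one-sided branches are \(\log P({y_i} > {L_i})\) and \(\log P({y_i} < {U_i})\).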

The covariance matrix for the natural parameterization is estimated by taking minus the inverse Hessian from the reparameterized model and using the “delta method” (linearization) to recast it in the original terms.
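The delta-method step can be written out explicitly. This is a sketch which assumes that \(V\) denotes minus the inverse Hessian with respect to \((\gamma ,h)\) and that \(J\) is the Jacobian of the map \((\gamma ,h) \to (\beta ,\sigma ) = (\gamma /h,1/h)\). The symbols \(V\), \(J\) and \(k\) (the number of regressors) are introduced here for illustration and are not part of the original description.

```latex
J = \frac{\partial(\beta,\sigma)}{\partial(\gamma,h)}
  = \begin{pmatrix} \frac{1}{h} I_k & -\frac{1}{h^2}\,\gamma \\ 0' & -\frac{1}{h^2} \end{pmatrix}
  = \begin{pmatrix} \sigma I_k & -\sigma\beta \\ 0' & -\sigma^2 \end{pmatrix},
\qquad
\widehat{\mathrm{Var}}(\hat\beta,\hat\sigma) \approx J\,V\,J'
```

The second form of \(J\) follows from \(\gamma /{h^2} = (\gamma /h)(1/h) = \beta \sigma\) and \(1/{h^2} = {\sigma ^2}\), with everything evaluated at the estimates.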

Variables Defined

%BETA	coefficient vector (VECTOR)
%XX	covariance matrix of coefficients (SYMMETRIC)
%STDERRS	VECTOR of coefficient standard errors
%TSTATS	VECTOR of t-statistics of the coefficients
%NOBS	number of observations (INTEGER)
%NREG	number of regressors (INTEGER)
%NFREE	number of free parameters (INTEGER)
%LOGL	log likelihood value (REAL)
%CVCRIT	final convergence criterion (REAL). This will be equal to zero if the sub-iteration limit was reached on the last iteration.
%ITERS	iterations completed (INTEGER)
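As one way to use these variables, the sketch below saves %LOGL from two nested censored regressions and forms a likelihood ratio statistic. It reuses the series from the first example in this section; the names LOGLU and LOGLR are hypothetical, and the choice of which regressors to drop is purely illustrative.

```
* Unrestricted model
ldv(censor=lower,lower=0.0) hours
# nwifeinc educ exper expersq age kidslt6 kidsge6 constant
compute loglu = %logl
* Restricted model: drops KIDSLT6 and KIDSGE6
ldv(censor=lower,lower=0.0) hours
# nwifeinc educ exper expersq age constant
compute loglr = %logl
* LR statistic, chi-squared with 2 degrees of freedom
cdf(title="LR Test of KIDSLT6 and KIDSGE6") chisqr 2.0*(loglu-loglr) 2
```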

Hypothesis Testing

You can apply the hypothesis testing instructions (EXCLUDE, TEST, RESTRICT and MRESTRICT) to estimates from LDV, and you can use the Statistics>Regression Tests Wizard to set those up. These instructions compute Wald tests based upon the quadratic approximation to the likelihood function. Note that you cannot use the CREATE or REPLACE options on RESTRICT and MRESTRICT.
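A short sketch of such tests, run immediately after a censored regression on the series used in the first example of this section. The hypothesized value of 80.0 for the EDUC coefficient is an arbitrary number chosen for illustration.

```
ldv(censor=lower,lower=0.0) hours
# nwifeinc educ exper expersq age kidslt6 kidsge6 constant
* Joint Wald test that the two child-count coefficients are zero
exclude
# kidslt6 kidsge6
* Wald test that coefficient 2 (EDUC) equals 80.0 (illustrative value)
test
# 2
# 80.0
```

Both instructions work from the coefficient vector and covariance matrix left behind by LDV, so they must follow it directly, before any other estimation instruction.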

Examples

ldv(censor=lower,lower=0.0) hours

# nwifeinc educ exper expersq age kidslt6 kidsge6 constant

This is a classic “tobit” model, censored below at zero.

ldv(truncate=lower,lower=0.0,smpl=affair) y

# constant z2 z3 z5 z7 z8

This is truncated below at zero, with the sample restricted to those with a non-zero value for AFFAIR.

Sample Output

This is the output from the first example. Note that there are relatively few summary statistics, since the model isn't estimated by least squares.

ML-Censored Below - Estimation by Newton-Raphson
Convergence in 4 Iterations. Final criterion was 0.0000011 <= 0.0000100
Dependent Variable HOURS
Usable Observations                  753
Degrees of Freedom                   745
Log Likelihood                -3819.0946

    Variable                         Coeff    Std Error       T-Stat      Signif
************************************************************************************
1.  NWIFEINC                     -8.814243     4.459099     -1.97669  0.04807705
2.  EDUC                         80.645605    21.583236      3.73649  0.00018660
3.  EXPER                       131.564299    17.279391      7.61394  0.00000000
4.  EXPERSQ                      -1.864158     0.537662     -3.46716  0.00052600
5.  AGE                         -54.405012     7.418502     -7.33369  0.00000000
6.  KIDSLT6                    -894.021740   111.878011     -7.99104  0.00000000
7.  KIDSGE6                     -16.217997    38.641390     -0.41971  0.67470074
8.  Constant                    965.305298   446.436130      2.16225  0.03059912
************************************************************************************
9.  SIGMA                      1122.021668    41.579099     26.98523  0.00000000