Statistics and Algorithms / Cross Section Data Methods /

Hazard Models

Hazard models are a generalization of simple models of “time to failure.” They are used extensively in biometrics to analyze survival times. In econometrics, they are used in modelling transitions from one state to another. Examples are unemployed to employed, working to retired, unmarried to married. A good reference on the subject is Lancaster (1990).

The basic assumption is that the probability of leaving the initial state to the final state over a very short period of time can be modeled as

\begin{equation} P\left( {{\text{exiting in }}\left[ {t{\rm{,}}{\kern 1pt} {\kern 1pt} {\kern 1pt} t{\rm{ + }}d{\kern 1pt} t} \right]{\text{ given no exit before }}t} \right){\rm{ = }}\theta \left( {t{\rm{,}}{\kern 1pt} {\kern 1pt} X_i {\kern 1pt} \left( t \right)\,} \right){\kern 1pt} {\kern 1pt} dt \end{equation}

where $X_i \left( t \right)$ are the characteristics of individual $i$ at time $t$. In the simplest case, $\theta$ is a constant, and the exit times follow an exponential distribution with mean ${1 \mathord{\left/{\vphantom {1 \theta }} \right.} \theta }$. Obviously, the more interesting cases depend upon the individual or length of time.

$\theta$ is known as the hazard function, perhaps not the best choice of terms for economics examples, but in biometrics these models are most often used to analyze deaths and other less than pleasant occurrences.

Estimation

In full generality, it is almost impossible to estimate the parameters governing the hazard function. If the hazard function varies over time, either because it directly depends upon time, or (even more difficult) the characteristics of individuals change over time, the above would require integrating $\theta$ over $t$ to get the probability of exiting at $T$ or earlier.

A common simplification is to assume that the dependence upon $t$ comes only through a function which is independent of the individual. This is called a proportional hazard model. For estimation purposes, it is usually assumed that $\theta \left( {t{\rm{,}}{\kern 1pt} {\kern 1pt} X_i {\kern 1pt} \left( t \right)\,} \right)$ is

\begin{equation} \theta _0 (t)\exp (X_i \beta ) \end{equation}

If the baseline hazard function $\theta _0 \left( t \right)$ is simple enough, it’s possible to estimate the parameters directly using MAXIMIZE. The density function for the exit time is

\begin{equation} \theta _0 \left( t \right)\exp \left( {X_i \beta } \right)\exp \left( { - \exp \left( {X_i \beta } \right)\int_0^t {\theta _0 \left( s \right)ds} } \right) \end{equation}

If, for instance, $\theta_0$ is a constant, the FRML to compute the log of this would be

frml logl = z=zfrml,$

log(theta)+z-exp(z)*theta*exittime

where the sub formula ZRFML computes $X_i \beta $ and EXITTIME is the data variable which gives the actual exit time.

In many cases, however, it is easier to explain how exit times might vary across individuals than to explain the actual lengths of time to exit. Without a precise form for the baseline hazard, full maximum likelihood can’t be used. However, there is an alternative method of attack called partial likelihood. The idea here is that, if we look at the time an individual exits, we have a valid model of the relative probabilities of exit for all individuals who have still not exited. For each possible setting of $\beta$, we can form, for each individual, the probability that she is the one who exits at her exit time rather than someone else. It turns out this has all the desirable properties of a likelihood.

The one tricky part about the partial likelihood is that the probability calculation for an individual requires a sum across a subsample, a subsample which changes with the individual. In the example, we handle this by defining for each individual a “Risk Set” vector which has 1’s only for those data points where the exit time is at least as large as it is for the individual in question. We also need to include an START option on the MAXIMIZE which computes the hazard functions for all the data points. Throughout this, NOBS is the number of observations. It’s a good idea to use a variable for this and not code the number in directly.

This is from Greene (2012), example 19.8. It first does maximum likelihood assuming a two-parameter Weibull distribution for $\theta _0 \left( t \right)$, then the proportional hazards estimator.

nonlin b0 b1 p

frml zfrml = b0+b1*prod

frml weibduration = (z=zfrml),(w=-exp(z)*dur^p),$

z+w+log(p)+(p-1)*log(dur)

compute p=1.0

maximize(method=bfgs) weibduration

RISKSET is a vector of vectors. Each individual has a vector and each vector has a slot for each individual. For individual $i$, slot $j$ will be 1 if $j$’s exit time is greater than or equal to $i$’s.

compute nobs=62

dec vect[vect] riskset(nobs)

dec vect hazards(nobs)

do i=1,nobs

dim riskset(i)(nobs)

ewise riskset(i)(j)=dur(j)>=dur(i)

end do i

The function HazardCalc gets computed at the start of each function evaluation to recalculate the relative hazard rates. These are put into the vector HAZARDS. The probability that an individual is the one to exit at her exit time is the ratio of her relative hazard to the sum of those hazards across the risk set for her exit time.

nonlin b1

compute b1=0.0

frml hazard = exp(b1*prod)

function HazardCalc

ewise hazards(i)=hazard(i)

end

frml logl = log(hazards(t))-log(%dot(riskset(t),hazards))

maximize(method=bfgs,start=HazardCalc()) logl