DLMCYCLE.RPF

DLMCYCLE.RPF demonstrates estimation of a state-space model with more than one observable. It is a very small example of the type of common cycle analysis from Stock and Watson(1991). Though it's a simple model, it demonstrates many of the requirements for handling a more involved dynamic factor model.

The model takes the form

\begin{equation} \begin{array}{l} C_t = \phi _c C_{t - 1} + w_{ct} \\ y_{it} = \mu _i + \gamma _i C_t + \chi _{it} \\ \chi _{it} = \phi _i \chi _{i,t - 1} + w_{it} \\ \end{array} \end{equation}

where $C$ is the (unobservable) common cycle, the $y$ are observable macroeconomic series (in growth rates) and the $\xi$ are the measurement errors in the observation equations. Because the measurement errors are assumed to be serially correlated (with AR(1) processes), we need to add those to the state equation. With two observables, the state-space model takes the form

\begin{equation} \begin{array}{l} {\bf{X}}_t \equiv \left[ {\begin{array}{*{20}c} {C_t } \\ {\chi _{1t} } \\ {\chi _{2t} } \\ \end{array}} \right] = \left[ {\begin{array}{*{20}c} {\phi _c } & 0 & 0 \\ 0 & {\phi _1 } & 0 \\ 0 & 0 & {\phi _2 } \\ \end{array}} \right]\left[ {\begin{array}{*{20}c} {C_{t - 1} } \\ {\chi _{1,t - 1} } \\ {\chi _{2,t - 1} } \\ \end{array}} \right] + \left[ {\begin{array}{*{20}c} {w_{ct} } \\ {w_{1t} } \\ {w_{2t} } \\ \end{array}} \right] \\ y_t \equiv \left[ {\begin{array}{*{20}c} {y_{1t} } \\ {y_{2t} } \\ \end{array}} \right] = \left[ {\begin{array}{*{20}c} {\mu _1 } \\ {\mu _2 } \\ \end{array}} \right] + \left[ {\begin{array}{*{20}c} {\gamma _1 } & 1 & 0 \\ {\gamma _2 } & 0 & 1 \\ \end{array}} \right]{\bf{X}}_t \\ \end{array} \end{equation}

One issue with any model such as this is that the scale of $C$ isn’t identified—if you multiply $C$ by $\lambda$ and divide all the $\gamma$ by $\lambda$, the model is unchanged. There are two ways to handle this: either fix the scale of $C$ (most easily by pegging the variance of $w_{ct}$ to 1), or fix one of the two loadings, such as pegging $\gamma_1$ at 1. The former is safer, since it’s possible (though unlikely) that you could be pegging a loading which is actually zero theoretically.

This uses growth rates of income and consumption as the two observables:

cal(q) 1947
open data coninc.rat
data(format=rats) 1947:1 1989:3 gyd82 gc82
*
* Convert to growth rates
*
set ygr = 100.0*log(gyd82/gyd82{1})
set cgr = 100.0*log(gc82/gc82{1})

This sets up the free parameters, which include the autoregressive coefficients, the mean shifts in the observation equations, the loadings on the cycle (called G), and the three variances. (The shocks are assumed to be independent). As discussed above, the variance of the cycle shock is fixed at 1. We include it in the parameter set with the pegged value as it makes the state-space representation easier to understand.

dec rect g(1,2)

nonlin phic phi1 phi2 mu1 mu2 g sigc=1.0 sig11 sig22

The matrices in the state-space representation are all time-invariant (other than the data), but several depend upon the free parameters. The most convenient way to handle these is to create a “startup” function, which gets executed at the start of each function evaluation to copy the current parameter settings into matrices:

function %%DLMStart

compute a=%diag(||phic,phi1,phi2||)

compute sw=%diag(||sigc,sig11,sig22||)

compute c=g~~%identity(2)

end %%DLMStart

Coming up with good guess values for state-space models can be difficult. The most important ones to get at least within an order of magnitude or so are the variances—the others are usually more robust to poor guess values. With a complicated model, it’s often a good idea to start with a more restricted submodel and “bootstrap” the estimates of the simpler model to start the complex model. What’s included in the example may be more than is needed in practice, but gives an idea of how you can construct guess values from more straightforward calculations. First is an estimate of an AR(1) on the sum of the two series to get a guess for the PHIC.

set combined = ygr+cgr

linreg combined

# constant combined{1}

compute phic=%beta(2)

The residual variance in the cycle equation is supposed to be one. So get a preliminary estimate of the cycle by taking the fitted values and scaling them down by the observed standard error:

prj testc

set testc = testc/sqrt(%seesq)

Estimate AR(1) regressions on the two observables and pull out guess values.

ar1 ygr

# constant testc

compute mu1=%beta(1),g(1,1)=%beta(2),phi1=%rho,sig11=%seesq

ar1 cgr

# constant testc

compute mu2=%beta(1),g(1,2)=%beta(2),phi2=%rho,sig22=%seesq

Estimate the parameters by maximum likelihood. Given the estimated model, use Kalman smoothing to estimate the business cycle. The way this works, at each function evaluation the %DLMSTART() function is executed. That resets the SW, A and C matrices to the values needed for the current set of parameters. With the multiple observables, you need to use the in-line matrix ||...|| notation to feed in both values.

dlm(start=%%DLMStart(),type=smooth,y=||ygr,cgr||,sw=sw,c=c,a=a,$

mu=||mu1,mu2||,presample=ergodic,pmethod=simplex,piters=10,$

method=bfgs) 1947:2 1989:3 xstates

The cycle series is the first component of the state vector. A positive value means above “average” growth.

set cycle 1947:2 1989:3 = %scalar(xstates(t))

graph(footer="Estimated State of Business Cycle")

# cycle

Note that the model doesn't make the cycle process sufficiently "stiff". Something with more persistence than an AR(1) would probably give results that are more in line with our notion of a business cycle.

Full Program

cal(q) 1947
open data coninc.rat
data(format=rats) 1947:1 1989:3 gyd82 gc82
*
* Convert to growth rates
*
set ygr = 100.0*log(gyd82/gyd82{1})
set cgr = 100.0*log(gc82/gc82{1})
*
dec rect g(1,2)
nonlin phic phi1 phi2 mu1 mu2 g sigc=1.0 sig11 sig22
*
* The DLM instruction's "A" matrix (H in the text) will be diagonal
* with the phi's down the diagonal. The SW matrix will similarly be
* diagonal, with the three sigma components. The "C" matrix (A in
* the text) is a vector of the g parameters in the first row
* (remember, it's input in transposed form) and an identity matrix
* below that.
*
* These are all time-invariant, but depend upon the free
* parameters, so we set up a function to use with the START option
* on DLM; that will copy the current settings into fixed matrices
* for use on the instruction.
*
function %%DLMStart
compute a=%diag(||phic,phi1,phi2||)
compute sw=%diag(||sigc,sig11,sig22||)
compute c=g~~%identity(2)
end %%DLMStart
*
* Estimate an AR(1) on the sum of the two series. Use that to get a
* guess for the PHIC.
*
set combined = ygr+cgr
linreg combined
# constant combined{1}
compute phic=%beta(2)
*
* Get a preliminary estimate of the cycle by taking the fitted
* values and rescaling them to have a unit residual variance.
*
prj testc
set testc = testc/sqrt(%seesq)
*
* Estimate AR(1) regressions on the two observables and pull out
* guess values.
*
ar1 ygr
# constant testc
compute mu1=%beta(1),g(1,1)=%beta(2),phi1=%rho,sig11=%seesq
ar1 cgr
# constant testc
compute mu2=%beta(1),g(1,2)=%beta(2),phi2=%rho,sig22=%seesq
*
* Estimate the parameters by maximum likelihood. Given the estimated
* model, use Kalman smoothing to estimate the business cycle.
*
dlm(start=%%DLMStart(),type=smooth,y=||ygr,cgr||,sw=sw,c=c,a=a,$
mu=||mu1,mu2||,presample=ergodic,pmethod=simplex,piters=10,$
method=bfgs) 1947:2 1989:3 xstates
*
* The cycle series is the first component of the state vector. Note
* that the model doesn't make the cycle process sufficiently
* "stiff". Something with more persistence than an AR(1) would
* probably give results that are more in line with our notion of a
* business cycle.
*

set cycle 1947:2 1989:3 = %scalar(xstates(t))

graph(footer="Estimated State of Business Cycle")

# cycle