DLM( options )   start end statevectors statevariances

DLM applies one of three methods to dynamic linear models: state-space/signal extraction by Kalman filtering, state-space/signal extraction by Kalman smoothing, and optimal control of a quadratic cost functional. You can use DLM to solve for unknown states or control variables, or to estimate unknown parameters in the transition matrices.


start, end

The range of entries over which the state-space model is computed.


statevectors

A SERIES of VECTORS into which the output state vectors are placed. If you call this STATES, for instance, STATES(2018:1) is the complete estimated state vector for 2018:1. To get component k of this, use STATES(2018:1)(k).


statevariances

A SERIES of SYMMETRIC arrays into which the variance matrices of the states are saved.


Most of the options define information needed in the model. For state-space/signal extraction, the model consists of two equations:

(1) \({{\bf{X}}_t} = {{\bf{A}}_t}{{\bf{X}}_{t - 1}} + {{\bf{Z}}_t} + {{\bf{F}}_t}{{\bf{W}}_t}\), and

(2) \({{\bf{Y}}_t} = {\mu _t} + {{\bf{C}}_t}^\prime {{\bf{X}}_t} + {{\bf{V}}_t}\)

\({{\bf{X}}_t}\) is an unobserved vector of states. \({{\bf{Y}}_t}\) is observable, and gives information about \({{\bf{X}}_t}\) through the measurement equation (2). \({{\bf{W}}_t}\) and \({{\bf{V}}_t}\) are shocks to the transition process and noise in the measurement equation, respectively. \({{\bf{F}}_t}\) are the loadings from the transition shocks to the states; often the identity matrix. The \({{\bf{Z}}_t}\) term in (1) allows for exogenous shifts in the state equation, and \({\mu_t}\) in (2) allows for exogenous shifts in the measurement equation.

In these descriptions, N is the size of \({\bf{X}}\), M is the size of \({\bf{Y}}\), L is the size of \({\bf{W}}\).
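As a minimal sketch of what equations (1) and (2) imply for filtering, the following Python code runs the textbook Kalman recursions for the scalar case N = M = L = 1 with \({{\bf{F}}_t} = 1\). It illustrates the arithmetic only: it is not RATS code, and it is not necessarily how DLM organizes the computation. The function name `kalman_filter` and its argument names are made up for this illustration.

```python
def kalman_filter(y, a, c, z, mu, sw, sv, x0, sx0):
    """Scalar Kalman filter for X_t = a*X_{t-1} + z + W_t, Y_t = mu + c*X_t + V_t.

    x0 and sx0 are the mean and variance of X_0 (a finite prior).
    Returns the filtered means and variances of X_t given Y_1..Y_t.
    """
    x, p = x0, sx0
    states, variances = [], []
    for yt in y:
        # Prediction step: mean and variance of X_t given data through t-1.
        x = a * x + z
        p = a * p * a + sw
        # Update step: combine the prediction with the new observation.
        v = yt - (mu + c * x)   # one-step prediction error
        f = c * p * c + sv      # variance of the prediction error
        k = p * c / f           # Kalman gain
        x = x + k * v
        p = p - k * c * p
        states.append(x)
        variances.append(p)
    return states, variances


# Local level model: a = c = 1, no shift terms.
states, variances = kalman_filter([1.0, 2.0], a=1.0, c=1.0, z=0.0, mu=0.0,
                                  sw=1.0, sv=1.0, x0=0.0, sx0=1.0)
```

In the matrix case the same two steps apply, with the products replaced by matrix products and the division by `f` replaced by a matrix inverse.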

DLM Setup Options

A=RECTANGULAR or FRML[RECT] [identity matrix]

This gives the A matrices. The expression should evaluate to an \(N \times N\) matrix.

C=RECTANGULAR or FRML[RECT] [zero matrix]

This gives the C matrices—it should evaluate to an \(N \times M\) matrix.

Y=VECTOR or FRML[VECT] [no measurement equation]

This gives the Y vectors; it should evaluate to an \(M\) VECTOR.


This allows the inclusion of exogenous shifts in the state equation—it should evaluate to an \(N\) VECTOR.


This allows the inclusion of a shift term in the observable equation; it should evaluate to an \(M\) VECTOR.


This gives the covariance matrices of the \({\bf{W}}\)’s; it should evaluate to an \(L \times L\) SYMMETRIC array (which is \(N \times N\) when there is one shock per state).


This gives the covariance matrices of the \({\bf{V}}\)’s—it should evaluate to an \(M \times M\) SYMMETRIC.

F=RECTANGULAR or FRML[RECT] [identity matrix]

This gives the loadings from the \({\bf{W}}\)'s to the \({\bf{X}}\)'s. It should evaluate to an \(N \times L\) matrix.

DISCOUNT=discount value [not used]

Multiplies \(\Sigma (t|t - 1)\) by the discount value. This is an alternative to using SW: instead of the change in variance being additive, it is multiplicative.


This determines which technique is used. FILTER is the Kalman filter, SMOOTH is the Kalman smoother, and CONTROL solves the control problem (see "Options for Optimal Control"). SIMULATE does a random (Normal) simulation of the DLM, drawing randomly from the presample distribution, the state disturbances and the measurement equation disturbances. CSIMULATE does a conditional simulation, drawing the states from their distribution conditional on the observed \({\bf{Y}}\)’s.
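The unconditional simulation described for SIMULATE can be sketched in Python for a scalar model with \({{\bf{F}}_t} = 1\) and no shift terms. This is an illustration of the three sources of randomness named above, not a description of how DLM draws its random numbers; `simulate_dlm` and its arguments are hypothetical names. Conditional simulation (CSIMULATE) is more involved and is not shown.

```python
import random


def simulate_dlm(n, a, c, sw, sv, x0, sx0, seed=None):
    """Simulate n periods of X_t = a*X_{t-1} + W_t, Y_t = c*X_t + V_t."""
    rng = random.Random(seed)
    # Draw the presample state from its Normal distribution.
    x = rng.gauss(x0, sx0 ** 0.5)
    xs, ys = [], []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, sw ** 0.5)   # state disturbance W_t
        y = c * x + rng.gauss(0.0, sv ** 0.5)   # measurement disturbance V_t
        xs.append(x)
        ys.append(y)
    return xs, ys


# One simulated path of five periods; a fixed seed makes it reproducible.
xs, ys = simulate_dlm(5, a=0.9, c=1.0, sw=1.0, sv=0.5, x0=0.0, sx0=1.0, seed=42)
```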

SAVE=DLM to save inputs [not used]

GET=DLM with saved model to be used [not used]

SAVE saves the information about the inputs to the model (basic input options A, C, Y, F, SW, SV, MU, Z and less standard ones SX0, X0, Q, R, SH0 and DISCOUNT). You can then use GET on a later DLM instruction to re-use those inputs. Note that this only applies to the inputs, not the outputs, or to TYPE options or estimation controls.

SMPL=Standard SMPL Option [not used]

This series has value 0 in entries which have no data (or which you want to skip for other reasons) and non-zero values in the other entries. See the note on "Missing Values".



G=matrix (RxN, R<N) reducing model to stationarity [not used]

SH0=variance matrix of diffuse part of the prior [identity]

X0=VECTOR or FRML[VECTOR] [zero vector]


These options can be used to initialize the pre-sample states of the Kalman filter. See the User’s Guide for the technical details.

PRESAMPLE=DIFFUSE (the older EXACT option is a synonym) gives a diffuse prior to the entire pre-sample state vector. PRESAMPLE=ERGODIC computes a (possibly mixed) stationary-diffuse prior, as described in Doan (2010) and is generally the recommended option. G and SH0 are older options for this: the G option can provide a matrix which maps the states to a set of stationary states; by default, the entire state vector is treated as non-stationary. Alternatively, you can use the SH0 option to set directly a (proportional) matrix for the variance of the diffuse part of the prior.  

If PRESAMPLE=X0 (the default), the initial (finite) state mean and covariance matrix are supplied by the X0 and SX0 options. X0 supplies the initial state mean; it should evaluate to an \(N\) VECTOR. SX0 gives the covariance of \({{\rm{X}}_{\rm{0}}}\); it should evaluate to an \(N \times N\) SYMMETRIC array. If PRESAMPLE=X1, X0 and SX0 are still used, but they provide \({{\rm{X}}_{{\rm{1|0}}}}\) and \({\Sigma _{{\rm{1|0}}}}\), not \({{\rm{X}}_{{\rm{0|0}}}}\) and \({\Sigma _{{\rm{0|0}}}}\).
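The two conventions differ by one prediction step of the standard Kalman recursion applied to equation (1). Writing \({\Sigma _W}\) for the covariance matrix of \({{\bf{W}}_1}\) (a symbol introduced here for illustration; it corresponds to the SW option), a prior stated for period 0 implies the following for period 1:

\({{\bf{X}}_{1|0}} = {{\bf{A}}_1}{{\bf{X}}_{0|0}} + {{\bf{Z}}_1}\)

\({\Sigma _{1|0}} = {{\bf{A}}_1}{\Sigma _{0|0}}{{\bf{A}}_1}^\prime + {{\bf{F}}_1}{\Sigma _W}{{\bf{F}}_1}^\prime \)

PRESAMPLE=X1 is therefore the natural choice when the mean and variance you have in hand already describe the first sample period, so that no further prediction step should be applied to them.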



VARIANCE=KNOWN assumes all variances are known (or being estimated). VARIANCE=CONCENTRATED assumes all variances are known up to a single unknown scale factor (usually the variance of the measurement equation) which is to be concentrated out. VARIANCE=CHISQUARED assumes all variances are known up to an unknown scale factor (again, usually the variance of the measurement equation) which has an informative (inverse) chi-squared prior; see the PDF and PSCALE options below.

When you use VARIANCE=CONCENTRATED or VARIANCE=CHISQUARED, you will usually peg one of the variances to 1.0, such as with SV=1.0.
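The idea behind concentrating out a scale factor can be sketched in Python for a local level model (A = C = F = 1) with a finite prior. The filter is run with the measurement variance pegged at 1 and the other variances expressed relative to it; the scale estimate then comes from the standardized prediction errors. This is the standard textbook argument under those assumptions. DLM's own handling (for instance the observation count used with diffuse priors) may differ, and all function names here are hypothetical.

```python
import math


def innovations(y, sw, sv, x0, p0):
    """Prediction errors v_t and their variances f_t for the local level model."""
    x, p = x0, p0
    v, f = [], []
    for yt in y:
        p = p + sw              # prediction step
        ft = p + sv
        vt = yt - x
        k = p / ft              # Kalman gain
        x = x + k * vt
        p = (1.0 - k) * p
        v.append(vt)
        f.append(ft)
    return v, f


def loglik(v, f):
    """Gaussian log likelihood built from the prediction errors."""
    return sum(-0.5 * (math.log(2.0 * math.pi * ft) + vt * vt / ft)
               for vt, ft in zip(v, f))


def concentrated(y, sw_rel, x0, p0_rel):
    """Scale estimate and concentrated log likelihood with SV pegged at 1."""
    v, f = innovations(y, sw_rel, 1.0, x0, p0_rel)
    n = len(y)
    sigma2 = sum(vt * vt / ft for vt, ft in zip(v, f)) / n
    ll = (-0.5 * n * (math.log(2.0 * math.pi) + math.log(sigma2) + 1.0)
          - 0.5 * sum(math.log(ft) for ft in f))
    return sigma2, ll


y = [1.0, 1.5, 0.7, 2.1, 1.9]
sigma2, ll = concentrated(y, sw_rel=0.5, x0=0.0, p0_rel=10.0)
```

Multiplying every variance by `sigma2` and evaluating the ordinary log likelihood gives the same value as `ll`, which is why only the relative variances need to be searched over.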

PDF=prior degrees of freedom [not used]

PSCALE=prior scale factor [not used]

VDISCOUNT=discount factor for degrees of freedom [1.0]

These are all used only with VARIANCE=CHISQUARED. PDF and PSCALE are the prior degrees of freedom and prior scale factor for the scaling variance, respectively. VDISCOUNT is a suggestion of West and Harrison (1997), which downweights the past information about the variance if the value is less than the default of 1.0.

STARTUP=expression evaluated at period "start" [not used]

ONLYIF=expression tested before calculating start option [not used]

You can use the START option to provide an expression which is computed once per function evaluation, before any of the regular formulas are computed. This allows you to do any time-consuming calculations that depend upon the parameters, but not upon time. It can be an expression of any type. ONLYIF calculates the expression provided; if it evaluates to a zero value, the function evaluation doesn't continue, and the function is assigned the missing value. ONLYIF is examined before doing the START option, unlike REJECT (below), which is done after the START.

REJECT=expression with "rejection" zone for parameters

If the expression evaluates to a non-zero (“true”) value, the function is immediately assigned the missing value. If you have a START option, REJECT is examined after it, so it can look at the results for anything computed as part of the START option.

LIMIT=number of periods before "limit" calculations are used [all observations]

This can be used to speed calculations if the system matrices are time-invariant and there are no missing values. This is particularly helpful when the number of observable variables is high.
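One reason a shortcut of this kind can work is that, in a time-invariant model without missing values, the state variance and the Kalman gain typically converge to fixed values after some number of periods. The sketch below shows that convergence for a local level model (A = C = F = 1). It illustrates the general property only and is an assumption about the motivation for LIMIT, not a description of what DLM computes; `predicted_variances` is a hypothetical name.

```python
import math


def predicted_variances(sw, sv, p0, n):
    """Predicted state variances P(t|t-1) for the local level model."""
    p, out = p0, []
    for _ in range(n):
        p_pred = p + sw                                   # prediction step
        out.append(p_pred)
        p = p_pred - p_pred * p_pred / (p_pred + sv)      # filtered variance
    return out


path = predicted_variances(sw=1.0, sv=1.0, p0=10.0, n=25)

# The limit is the positive root of P^2 - sw*P - sw*sv = 0 (here sw = sv = 1).
steady = (1.0 + math.sqrt(1.0 + 4.0)) / 2.0
```

Once the variance has settled, the gain no longer changes, so the per-period matrix calculations do not need to be repeated.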

FREE=number of (fixed) states to be freely estimated [0]

Use this for models where part of the state vector is a set of unknown regression coefficients which are fixed over the sample, and which you want to be, in effect, estimated freely. This will adjust the likelihood to be the likelihood conditional on those parameters, rather than the unconditional one. (The states themselves are unaffected by this). You must arrange your state vector so that these will be at the end. Adding these to the state vector and using FREE is generally much more efficient than putting them in the parameter set and estimating them that way.

Options for Extracting Information from DLM



TITLE="title for output" ["DLM"]

These are the same as for other regressions.

YHAT=SERIES[VECTORS] of predicted or simulated values of Y [not used]

Series of VECTORS of one-step predicted values of \({\bf{Y}}\) (with TYPE=FILTER or TYPE=SMOOTH), or simulated values of \({\bf{Y}}\) (with TYPE=SIMULATE).

VHAT=SERIES[VECT] for prediction errors [not used]

SVHAT=SERIES[SYMM] for prediction error cov. matrices [not used]

VHAT returns a SERIES[VECT] of the one-step prediction errors for Kalman filtering, the smoothed prediction errors for Kalman smoothing, and the simulated disturbances for the measurement equation for the simulations. SVHAT returns a SERIES[SYMM] of the one-step prediction error variance for Kalman filtering, and the smoothed prediction variance for Kalman smoothing.

WHAT=SERIES[VECT] of state disturbances [not used]

SWHAT=SERIES[SYMM] of state disturbance variances [not used]

With TYPE=SMOOTH, these give the expected values and variances of the shocks to the states given the full data set. Neither is defined for TYPE=FILTER. With TYPE=SIMULATE or TYPE=CSIMULATE, the WHAT series will be the simulated disturbances; SWHAT isn’t defined for the simulations.

LIKELIHOOD=series of cumulated log likelihoods

This saves the (cumulated) log likelihoods into a SERIES.

GAIN=SERIES[RECT] of Kalman gain matrices [not used]

Saves the series of Kalman gain matrices.

SIGHISTORY=series of estimated scaling variances [not used]

DFHISTORY=series of degrees of freedom [not used]

If you use VARIANCE=CHISQUARED, SIGHISTORY can be used to get the series of estimated scaling variances, while DFHISTORY returns a series of the sequential degrees of freedom.

Estimation Options

The remaining options apply if you are estimating free parameters of the model.

Standard Non-Linear Estimation Options


ITERATIONS=iteration limit [100]

SUBITERATIONS=subiteration limit [30]

CVCRIT=convergence limit [.00001]


METHOD selects the estimation method used by DLM. If you choose any of these other than SOLVE, it is assumed that there are free parameters to be estimated, which need to be defined ahead of time with NONLIN.

BFGS and GAUSS (Gauss-Newton) are the only two which can compute standard errors; the others can get point estimates only and are more often used as "PMETHODS" to improve guess values. See Optimization Methods.

ITERATIONS sets the maximum number of iterations, SUBITERS sets the maximum number of subiterations, CVCRIT the convergence criterion. TRACE prints the intermediate results. For METHOD=SIMPLEX, an “iteration” is actually defined as K vertex changes, where K is the number of free parameters. This makes the number of calculations per “iteration” similar to the other methods.


PITERS=number of PMETHOD iterations to perform [none]

Use PMETHOD and PITERS if you want to use a preliminary estimation method to refine your initial parameter values before switching to one of the other estimation methods. RATS will automatically switch to the METHOD choice after completing the "preliminary" iterations requested using PMETHOD and PITERS.

PARMSET=PARMSET to use [default internal]

This option selects the parameter set to be estimated. RATS maintains a single unnamed parameter set which is the one used for estimation if you don’t provide a named set.

HESSIAN=initial guess for inverse Hessian (METHOD=BFGS only)

You can use this with METHOD=BFGS. Without it, DLM will start with a diagonal matrix whose elements are the reciprocals of the (numerically computed) second derivatives of the function.

CONDITION=number of early sample data points to skip [0]

If you have a non-stationary model, it usually takes several data points for the variance of the state vector to become “finite” (assuming a diffuse prior). You can use the CONDITION option to indicate how many early data points should be left out of the calculation of the criterion function. The alternative to CONDITION is to use PRESAMPLE=ERGODIC or PRESAMPLE=DIFFUSE to deal explicitly with the diffuse initial conditions. If the transition model has free parameters which might change the number of "unit roots", you may need to use CONDITION to make the log likelihood function continuous with respect to those parameters.

Options for Adjusting the State or Covariance Matrix

FPRE=FUNCTION(VECTOR *) called before the prediction step [not used]

FMID=FUNCTION(VECTOR *) called after the prediction step but before the filter update step [not used]

FPOST=FUNCTION(VECTOR *,SYMMETRIC *) called after the filter update step [not used]

FSPOST=FUNCTION(VECTOR *,SYMMETRIC *) called after a smoothing update [not used]

These can be used to intervene in the filtering or smoothing process, either to do some added calculations (linearization, for instance) or to actually change the state vector and/or the state covariance matrix. If you use FPRE, it will be called before doing the prediction step, so the state will have the values from the end of the previous period. FMID is called after the state vector has been updated in the prediction step. FPOST is called after this period's Kalman update calculation, so it will have the t|t information. FSPOST is called after the smoothing calculation for period t, and thus will have t|T. In all cases, the reserved variable T will be equal to the entry currently being calculated. For instance, the following will adjust the state vector (here, just a single value) to be equal to 1 if the filtered value exceeds 1.

function WithStdSigma xt vt

type vector *xt

type symm   *vt


if xt(1)>1.0

   compute xt(1)=1.0

end


  presample=diffuse,type=filter) / xstates_std vstates_std
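To make the timing of the FPOST hook concrete, here is a Python analogue for a local level model (A = C = F = 1). The hook is applied after each update step, so the adjusted state is what the next prediction step starts from. This is a sketch of the idea only: the RATS function modifies its arguments in place, whereas this Python version returns the adjusted values, and the names `filter_with_hook` and `cap_at_one` are hypothetical.

```python
def filter_with_hook(y, sw, sv, x0, p0, fpost=None):
    """Local level Kalman filter with an optional post-update hook."""
    x, p = x0, p0
    out = []
    for yt in y:
        p = p + sw                  # prediction step
        k = p / (p + sv)            # Kalman gain
        x = x + k * (yt - x)        # update step: x now holds the t|t mean
        p = (1.0 - k) * p
        if fpost is not None:
            x, p = fpost(x, p)      # intervene after the update
        out.append(x)
    return out


def cap_at_one(x, p):
    """Replace a filtered state above 1 with 1; leave the variance unchanged."""
    return (1.0 if x > 1.0 else x), p


capped = filter_with_hook([5.0, 5.0], sw=1.0, sv=1.0, x0=0.0, p0=1.0, fpost=cap_at_one)
plain = filter_with_hook([5.0, 5.0], sw=1.0, sv=1.0, x0=0.0, p0=1.0)
```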

Options for Optimal Control

For optimal control, the model takes a slightly different form, as there are now control variables \({{\bf{U}}_t}\). \(P\) is the size of \({\bf{U}}\).

(3) \({{\bf{X}}_t} = {{\bf{A}}_t}{{\bf{X}}_{t{\rm{ - }}1}}{\rm{ + }}{{\bf{B}}_t}{{\bf{U}}_t}{\rm{ + }}{{\bf{F}}_t}{{\bf{W}}_t}\)

(4) \({{\bf{Y}}_t} = {\mu _t} + {\bf{C}'_t}{{\bf{X}}_t} + {{\bf{V}}_t}\), with the objective function being

(5) \(E\left( {{\bf{X}'_0}{{\bf{Q}}_0}{{\bf{X}}_0} + \sum\limits_{t = 1}^T {\left\{ {{\bf{X}'_t}{{\bf{Q}}_t}{{\bf{X}}_t} + {\bf{U}'_t}{{\bf{R}}_t}{{\bf{U}}_t}} \right\}} } \right)\)

B=RECTANGULAR or FRML[RECT] [NxN identity matrix]

This gives the \({\bf{B}}\) matrices. The expression should evaluate to an \(N \times P\) matrix.


This gives the \({\bf{Q}}\) matrices—it should evaluate to an \(N \times N\) SYMMETRIC.


This gives the \({\bf{R}}\) matrices—it should evaluate to a \(P \times P\) SYMMETRIC.
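For intuition about the control problem in (3) and (5), the scalar case with constant coefficients can be solved by a backward Riccati recursion. The sketch below assumes the state is observed and that Q and R do not vary over time; the additive shocks then change only the level of the expected cost, not the feedback rule. It is a textbook derivation written for this illustration, not a description of DLM's algorithm, and `lq_feedback` is a hypothetical name.

```python
def lq_feedback(a, b, q, r, horizon):
    """Backward recursion for X_t = a*X_{t-1} + b*U_t + W_t.

    Minimizes E[q*X_0^2 + sum over t=1..T of (q*X_t^2 + r*U_t^2)].
    Returns g and s, where U_t = g[t] * X_{t-1} for t = 1..T (g[0] is unused)
    and s[t] * X_t^2 is the cost-to-go from period t, apart from a constant.
    """
    s = [0.0] * (horizon + 1)
    g = [0.0] * (horizon + 1)
    s[horizon] = q                  # only the state cost remains at the end
    for t in range(horizon, 0, -1):
        denom = r + b * b * s[t]
        g[t] = -s[t] * a * b / denom
        s[t - 1] = q + a * a * s[t] * r / denom
    return g, s


g, s = lq_feedback(a=1.0, b=1.0, q=1.0, r=1.0, horizon=1)
```

With a one-period horizon and unit coefficients, the rule is to offset half of the previous state, which balances the cost of the control against the cost of the resulting state.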

Missing Values

A missing data point can be represented by either a \({{\rm{C}}_t}\) or \({{\rm{Y}}_t}\) which has a missing value. You can also use a SMPL option to skip data points. Note that the missing time periods are still considered part of the overall sample. The Kalman filter and smoother will estimate the state vector at those time periods and optimal control will generate a control value for them.

Note that if \({\bf{Y}}\) has more than one component, some can be missing, while others aren’t. DLM will adjust the calculation in those cases to use the information available.
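The usual way a Kalman filter handles a missing observation is to do the prediction step and skip the update step, so the state estimate is carried forward while its variance grows. The Python sketch below shows this for a local level model (A = C = F = 1), with `None` standing in for a missing value. It illustrates the standard treatment under those assumptions, not DLM's code, and it does not cover the case where only some components of a vector \({\bf{Y}}\) are missing. The name `filter_with_missing` is hypothetical.

```python
def filter_with_missing(y, sw, sv, x0, p0):
    """Local level Kalman filter; entries of y equal to None are missing."""
    x, p = x0, p0
    states, variances = [], []
    for yt in y:
        p = p + sw                      # prediction step always runs
        if yt is not None:
            k = p / (p + sv)            # update only when data are present
            x = x + k * (yt - x)
            p = (1.0 - k) * p
        states.append(x)                # a state estimate exists for every period
        variances.append(p)
    return states, variances


states, variances = filter_with_missing([1.0, None, 2.0], sw=1.0, sv=1.0, x0=0.0, p0=1.0)
```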

Declaring the Options

Most of the time, the options which construct your model will either take the default value or they will be constant over the sample. If this is the case, you can put them in as a matrix or as a constant value (for a 1x1 matrix). If, however, the matrix is time-varying (at minimum, \({\bf{Y}}\) usually is), or it depends upon a non-linear parameter which you are estimating, you will either have to define it as a FRML of the correct type, or put its definition directly into the DLM option.


For example, the ARMA(1,1) model

(6) \({y_t} - \mu  = \phi ({y_{t - 1}} - \mu ) + {\varepsilon _t} + \theta {\varepsilon _{t - 1}}\)

can be defined by the following matrices

(7) \({{\bf{X}}_t} = \left[ {\begin{array}{*{20}{c}}{{y_t} - \mu }  \\ {{\varepsilon _t}}  \\ \end{array}} \right],{\bf{A}} = \left[ {\begin{array}{*{20}{c}} \phi  & \theta   \\ 0 & 0  \\ \end{array}} \right],{\bf{C'}} = \left[ {\begin{array}{*{20}{c}} 1 & 0  \\ \end{array}} \right],{\bf{F}} = \left[ {\begin{array}{*{20}{c}} 1  \\ 1  \\ \end{array}} \right],{{\bf{W}}_t} = [{\varepsilon _t}]\)

If the variance of \({\varepsilon _t}\) is set to one, we can use VARIANCE=CONCENTRATE in estimating this model. Assuming that \(\left| \phi  \right| < 1\), this is a stationary model, which can be initialized using PRESAMPLE=ERGODIC. This sets up and estimates the model. It also estimates the same model using BOXJENK with the MAXL option. This state-space model is exactly what BOXJENK with MAXL does internally.

nonlin mu phi theta

dec frml[rect] af

frml af = ||phi,theta|0,0||

sstats(mean) / y>>mu

compute phi=theta=.10


  a=af,y=y,mu=mu,c=%unitv(2,1),f=||1.0|1.0||,sw=1.0) 1960:1 2009:4

boxjenk(ar=1,ma=1,constant,maxl) y
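As a quick numerical check of the representation in (7), the Python sketch below generates the same ARMA(1,1) path twice from one set of shocks: once from the difference equation (6) and once from the state recursion with the A, C and F matrices given above. Both start from zero initial values, which is an assumption made for the comparison. The function names are hypothetical.

```python
import random


def arma_direct(phi, theta, eps):
    """d_t = phi*d_{t-1} + eps_t + theta*eps_{t-1}, where d_t = y_t - mu."""
    d, e_prev, out = 0.0, 0.0, []
    for e in eps:
        d = phi * d + e + theta * e_prev
        e_prev = e
        out.append(d)
    return out


def arma_state_space(phi, theta, eps):
    """The same process via X_t = A X_{t-1} + F eps_t, reading off C'X_t."""
    A = [[phi, theta], [0.0, 0.0]]
    F = [1.0, 1.0]
    x = [0.0, 0.0]                      # X_0 = (y_0 - mu, eps_0) = (0, 0)
    out = []
    for e in eps:
        x = [A[0][0] * x[0] + A[0][1] * x[1] + F[0] * e,
             A[1][0] * x[0] + A[1][1] * x[1] + F[1] * e]
        out.append(x[0])                # C = (1, 0)' picks the first state
    return out


rng = random.Random(0)
eps = [rng.gauss(0.0, 1.0) for _ in range(50)]
direct = arma_direct(0.7, 0.3, eps)
via_states = arma_state_space(0.7, 0.3, eps)
```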

As a second example, consider the AR(p) model

(8) \({y_t} = {\varphi _1}{y_{t - 1}} + {\varphi _2}{y_{t - 2}} + ... + {\varphi _p}{y_{t - p}} + {\varepsilon _t}\)

If p is known and fairly small, it’s probably easiest to just code the \({\bf{A}}\) matrix directly. We show here how to do it for a general p. The transition equation matrices are

(9) \({{\bf{X}}_t} = \left[ {\begin{array}{*{20}{c}}{{y_t}}  \\ {{y_{t - 1}}}  \\ \vdots   \\ {{y_{t - p + 1}}}  \\ \end{array}} \right],{\bf{A}} = \left[ {\begin{array}{*{20}{c}} {{\varphi _1}} & {{\varphi _2}} &  \cdots  & {{\varphi _{p - 1}}} & {{\varphi _p}}  \\ 1 & 0 &  \cdots  & 0 & 0  \\ 0 & 1 & 0 & 0 & 0  \\ 0 & 0 &  \ddots  &  \ddots  &  \vdots   \\ 0 & 0 &  \cdots  & 1 & 0  \\ \end{array}} \right],{\bf{F}} = \left[ {\begin{array}{*{20}{c}} 1  \\ 0  \\ \vdots   \\ 0  \\ \end{array}} \right],{{\bf{W}}_t} = \left[ {{\varepsilon _t}} \right]\)

If the coefficients are known and aren’t being estimated, A can be set up with an EWISE instruction with an %IF to handle the first row. Assume that the coefficients are in a vector named PHI.

dec rect a(p,p)

ewise a(i,j)=%if(i==1,phi(j),(i==j+1))

However, if the coefficients are to be estimated, the “ewise” has to be done within the defining formula. The easiest way to do that is to create a FUNCTION which returns the matrix. Because the \({\bf{A}}\) matrix doesn’t depend on time (just on the parameters), it’s a good candidate for being computed using the START option. In the DLM instruction below, this calls AFUNC with the current values of the parameters, puts the result into the matrix A, which is then used by the A option.

compute p=5

dec vect phi(p)

nonlin phi

function afunc phi

type vector phi

type rect afunc

local integer i j

dim afunc(%size(phi),%size(phi))

ewise afunc(i,j)=%if(i==1,phi(j),i==j+1)

end afunc



 f=%unitv(p,1),sw=1.0) 1960:1 2009:4
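The element-by-element rule used in the EWISE instruction can be mirrored in Python to see the matrix it produces. The only difference is that Python indices start at zero, so the first row is `i == 0`; the subdiagonal condition `i == j + 1` is unchanged. This is an illustration of the construction in (9), not RATS code.

```python
def afunc(phi):
    """Companion matrix for an AR(p): coefficients in row one, ones below the diagonal."""
    p = len(phi)
    return [[phi[j] if i == 0 else (1.0 if i == j + 1 else 0.0)
             for j in range(p)]
            for i in range(p)]


a = afunc([0.5, 0.3, 0.1])
# a is [[0.5, 0.3, 0.1],
#       [1.0, 0.0, 0.0],
#       [0.0, 1.0, 0.0]]
```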

Variables Defined


The log likelihood for TYPE=FILTER or TYPE=SMOOTH, or the value of the objective function for TYPE=CONTROL (REAL)


If VARIANCE=CONCENTRATED, the (maximum likelihood) estimate of the variance scale (REAL)

The following are defined only when estimating coefficients:


coefficient vector (VECTOR)


estimated covariance matrix (SYMMETRIC)


vector containing the t-stats for the coefficients (VECTOR)


Vector of coefficient standard errors (VECTOR)


number of observations (INTEGER)


number of regressors (INTEGER)


number of free parameters in the parameter set, plus the variance if it's concentrated out (INTEGER)