Bernanke, Boivin & Eliasz, QJE 2005
This is a replication of Bernanke, Boivin & Eliasz (2005), which introduced Factor-Augmented VAR (FAVAR) analysis. A FAVAR uses a combination of unobserved and observed "factors" to model the joint dynamics of a large set of series.
One thing to note right up front is that the emphasis in the paper is on the effects of the one observable factor: the Federal Funds rate. The three (in this case) unobservable factors are supposed to soak up much of the other common dynamics. In the calculation of impulse responses, the FF rate is ordered after the three factors in a Cholesky factorization, so this is really designed to find a lower bound on the ability of the FF rate to explain the other variables.
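For reference, the model can be written (in the paper's notation, with F(t) the latent factors, Y(t) the Federal Funds rate and X(t) the large panel of observables; with NLAGLAM=0 only current values of the factors enter the measurement equation):

```latex
X_t = \Lambda^f F_t + \Lambda^y Y_t + e_t
\qquad
\begin{bmatrix} F_t \\ Y_t \end{bmatrix}
  = \Phi(L)\begin{bmatrix} F_{t-1} \\ Y_{t-1} \end{bmatrix} + v_t
```

The impulse responses come from a Cholesky factorization of the covariance matrix of v(t), with Y(t) ordered last.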
The paper uses a very large and diverse data set of monthly series. Because the data set has so many variables, there is a section that defines a set of "key" variables, which are the ones for which the program generates output. The segment below creates a list of the key series (by their variable names) together with their descriptive titles.
compute keyvars=19
dec rect[int] keylooks(keyvars,2)
dec vect[strings] keylabels(keyvars)
do i=1,keyvars
enter keylooks(i,1) keylooks(i,2) keylabels(i)
end do i
# ip 5 "IP"
# punew 5 "CPI"
# fygm3 1 "3m TREASURY BILLS"
# fygt5 1 "5y TREASURY BONDS"
# fmfba 5 "MONETARY BASE"
# fm2 5 "M2"
# exrjan 5 "EXCHANGE RATE YEN"
# pmcp 1 "COMMODITY PRICE INDEX"
# ipxmca 1 "CAPACITY UTIL RATE"
# gmcq 5 "PERSONAL CONSUMPTION"
# gmcdq 5 "DURABLE CONS"
# gmcnq 5 "NONDURABLE CONS"
# lhur 1 "UNEMPLOYMENT"
# lhem 5 "EMPLOYMENT"
# lehm 5 "AVG HOURLY EARNINGS"
# hsfr 4 "HOUSING STARTS"
# mocmq 5 "NEW ORDERS"
# fsdxp 1 "DIVIDENDS"
# hhsntn 1 "CONSUMER EXPECTATIONS"
The observable variables are divided into three groups: the "slow-moving" variables, the "fast-moving" variables and the factor variables. Slow-moving variables are those that are assumed not to react contemporaneously to the factor variables, while fast-moving variables are those that are allowed to react immediately. The split between slow- and fast-moving is used to help identify the unobservable factors. The three groups are created using EQUATIONS.
equation sloweqn *
# ip lhur punew ipp ipf ipc ipcd ipcn ipe ipi ipm ipmd ipmnd ipmfg $
ipd ipn ipmin iput ipxmca pmi pmp gmpyq gmyxpq lhel lhelx lhem lhnag $
lhu680 lhu5 lhu14 lhu15 lhu26 lpnag lp lpgd lpmi lpcc lpem lped $
lpen lpsp lptu lpt lpfr lps lpgov lphrm lpmosa pmemp gmcq gmcdq gmcnq $
gmcsq gmcanq pwfsa pwfcsa pwimsa pwcmsa psm99q pu83 pu84 pu85 puc pucd $
pus puxf puxhs puxm lehcc lehm
equation fasteqn *
# hsfr hsne hsmw hssou hswst hsbr hmob $
pmnv pmno pmdel mocmq msondq fsncom fspcom fspin fspcap fsput fsdxp $
fspxe exrsw exrjan exruk exrcan fygm3 fygm6 fygt1 fygt5 fygt10 fyaaac $
fybaac sfygm3 sfygm6 sfygt1 sfygt5 sfygt10 sfyaaac sfybaac fm1 fm2 fm3 $
fm2dq fmfba fmrra fmrnba fclnq fclbmc ccinrv pmcp hhsntn
equation yeqn *
# fyff
These are the control variables for the model. NSTEPS is the number of steps over which the impulse responses are computed. NLAGS is the number of lags in the VAR itself. The authors use 13 here, but note that the results aren't much different from those with fewer lags; we use 7 to keep the computation time down. For the Gibbs sampling estimation, the "hot spot" is the simulation of the state-space model to generate the factors: the greater the number of lags, the larger the state-space model and the slower the calculation. NLAGLAM is the number of lags in the factor loadings, so 0 (which was used in the paper) means only the current values. That rather severely restricts the model, and the choice here has very little effect on the computation time, so in practice we would recommend a larger value. NF is the number of latent (unobservable) factors. A larger number here also increases the size of the state-space model, so you should keep it within reason. NBURN and NKEEP control the number of draws in the Gibbs sampler. The authors did substantially more than this, but we use a smaller number to keep the run time down; 100000 draws would take overnight to run on a typical computer.
compute nsteps =60 ;* Number of steps in IRF
compute nlags =7 ;* Number of lags in VAR
compute nlaglam =0 ;* Number of lags in factor loadings (no bigger than nlags-1)
compute nf =3 ;* Number of latent factors
compute nburn =1000 ;* Number of burn-in draws
compute nkeep =5000 ;* Number of keeper draws
BBEGIBBS.RPF
This program does the Gibbs sampling estimation of the model. The sampler is broken into the following blocks:
1. Draw the lag coefficients and the covariance matrix for the VAR, treating the factors as "data". This is done with a "Minnesota"-type prior, but with a zero mean for all coefficients. If a draw has a dominant root larger than .999, it's rejected and a new draw is made (about 1% of draws on this data set are rejected). This prior seems to be critical for getting good behavior out of the sampler.
2. Draw the factor loadings for the slow- and fast-moving series. The factors are identified by forcing the contemporaneous loadings of the first three "slow-moving" variables to be 1 on the corresponding latent factor and 0 on all other factors. In the paper, the authors use (weakly) informative priors on the coefficients, basically a shrinkage prior towards zero; this program doesn't. In the absence of the priors, the space spanned by the latent factors would be the same regardless of which three variables you list first on the "slow-moving" list. The shrinkage prior changes that because of the (forced) unit loadings on the lead variables: some of the listed variables are quite similar to each other, so a variable that behaves much like a lead variable would be expected to have similar loadings, and the prior would work against that. And (unlike the prior on the VAR) the prior on the loadings doesn't seem to be necessary in practice. The measurement equation variances are drawn at the same time.
3. Draw the factors given the VAR coefficients (which determine the state-space representation) and the loadings (which determine the measurement equations). This is done using DLM with conditional simulation. As mentioned above, this is by far the most time-consuming part of the calculation. The number of observables (the "X" variables) is the main reason for that, though the size of the state-space representation (number of factors × number of lags) also plays a part.
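The rejection step in block 1 can be sketched outside RATS as follows. This is an illustrative Python fragment, not the program's code; `draw_fn` is a hypothetical stand-in for whatever routine produces a posterior draw of the lag coefficients:

```python
import numpy as np

def dominant_root(B, k, p):
    """B: (k, k*p) array of stacked VAR lag coefficient matrices [B1 B2 ... Bp].
    Returns the modulus of the largest eigenvalue of the companion matrix."""
    comp = np.zeros((k * p, k * p))
    comp[:k, :] = B                        # top block row: the lag matrices
    comp[k:, :-k] = np.eye(k * (p - 1))    # shift identity below
    return np.abs(np.linalg.eigvals(comp)).max()

def draw_stable_coeffs(draw_fn, k, p, cutoff=0.999):
    """Redraw until the dominant root is no bigger than the cutoff."""
    while True:
        B = draw_fn()                      # one posterior draw of the coefficients
        if dominant_root(B, k, p) <= cutoff:
            return B
```

With the number of factors and lags used here, only about 1% of draws fail the check, so the rejection loop costs little.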
Once the sampler moves out of the burn-in draws, the impulse responses are computed. This is done using matrix calculations, because creating a linear model (so that IMPULSE and ERRORS could be used) would be just as much work.
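The matrix calculation can be sketched as follows. This is an illustrative Python fragment under simplified assumptions (NLAGLAM=0 loadings; the FF rate ordered last, so its shock is the last column), not the RATS code:

```python
import numpy as np

def favar_irf(B, sigma, Lam, p, nsteps):
    """Orthogonalized responses of the X variables, by matrix calculation.
    B: (k, k*p) VAR lag coefficients for the k = nf+1 factors (latent + FF rate)
    sigma: (k, k) innovation covariance; Lam: (nx, k) contemporaneous loadings.
    Returns (nsteps, nx, k): response of each X variable to each shock."""
    k = B.shape[0]
    chol = np.linalg.cholesky(sigma)       # FF rate last => its shock is column k-1
    psi = np.zeros((nsteps, k, k))         # MA coefficients of the factor VAR
    psi[0] = np.eye(k)
    for h in range(1, nsteps):             # standard MA recursion
        for j in range(1, min(h, p) + 1):
            psi[h] += B[:, (j - 1) * k : j * k] @ psi[h - j]
    resp = psi @ chol                      # orthogonalized factor responses
    return np.einsum('xk,hkj->hxj', Lam, resp)   # map through loadings to X
```

A larger NLAGLAM would add lagged factor responses to the final mapping, which is why it changes the dynamics of the X variables so much.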
The program produces three main pieces of output: a table of the decomposition of variance for the "key" variables, the responses of the key variables to a shock in the Federal Funds rate, and a graph of the factors. The last of these has been requested by several users, though the paper really doesn't do anything with the factors themselves, and the specific factors depend heavily on which three variables are listed first: unlike principal components, there isn't anything special about "factor 1".
The specific results are broadly similar to those reported in the paper. They are, of course, subject to simulation error. However, the paper used 13 lags rather than 7 and used a prior on the loadings while this program does not, both of which would be expected to affect the results in at least a minor way. (Their impulse responses are considerably more volatile, which may be due to the extra lags.) Increasing the number of lags in the loadings (the NLAGLAM variable) as recommended above produces a substantially different result: with NLAGLAM=0, the three factors are (basically by construction) noisy versions of the first three variables in the "slow" list, so the dynamics of the "X" variables are severely restricted.
The decomposition of variance also lists the percentages due to the three factors (the first four columns should add up to 100 subject to rounding error) while the paper only shows its estimate for the percentage due to "Y1" (the one-and-only observable factor). Again, the "definitions" of the factors will depend upon which variables are listed first in the "slow" group.
Variable Factor 1 Factor 2 Factor 3 Y1 Model R^2
IP 75.55 8.47 10.91 5.08 54.75
CPI 20.53 3.35 74.43 1.69 89.42
3m TREASURY BILLS 36.13 6.12 42.08 15.68 94.74
5y TREASURY BONDS 28.37 7.01 49.93 14.69 94.64
MONETARY BASE 16.40 35.69 42.63 5.28 7.07
M2 26.32 35.42 30.34 7.92 2.21
EXCHANGE RATE YEN 32.96 33.11 17.63 16.31 3.45
COMMODITY PRICE INDEX 42.74 3.10 49.88 4.29 56.44
CAPACITY UTIL RATE 51.36 14.97 24.68 8.99 59.15
PERSONAL CONSUMPTION 40.30 5.35 50.86 3.49 8.63
DURABLE CONS 46.64 10.27 37.42 5.67 4.25
NONDURABLE CONS 36.22 6.15 54.09 3.54 5.60
UNEMPLOYMENT 23.99 31.67 34.99 9.36 61.35
EMPLOYMENT 77.25 8.76 8.93 5.06 21.14
AVG HOURLY EARNINGS 37.90 4.43 55.33 2.35 16.01
HOUSING STARTS 73.38 7.77 12.33 6.52 36.05
NEW ORDERS 63.00 13.29 17.04 6.67 16.22
DIVIDENDS 24.52 6.32 62.94 6.22 47.05
CONSUMER EXPECTATIONS 28.88 5.45 62.09 3.58 61.19
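The percentage columns in the table above come from cumulating squared orthogonalized responses, shock by shock. A hedged Python sketch of that calculation (the function and argument names are illustrative, not the program's):

```python
import numpy as np

def fevd(resp, h):
    """Decomposition of forecast error variance at horizon h.
    resp: (nsteps, nx, k) orthogonalized responses of the X variables.
    Returns (nx, k) percentages; each row sums to 100 up to rounding."""
    acc = (resp[:h] ** 2).sum(axis=0)                  # cumulated squared responses
    return 100.0 * acc / acc.sum(axis=1, keepdims=True)
```

Applied to the four shocks here (three latent factors plus the FF rate), the rows are the first four columns of the table; the Model R^2 column is computed separately from the measurement equation.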
Copyright © 2025 Thomas A. Doan