FRML Aggregator
A FRML is a specialized data type which describes a function of the entry number T. Usually this function includes references to current and lagged data series. For non-linear estimation, it will also use the parameters that are to be estimated.
You can define vectors and arrays of FRMLs, although you must take some special care in defining the elements of these in a loop. FRMLs can also be passed as procedure parameters.
FRMLs used in non-linear estimation need to be created using the instruction FRML. FRMLs can also be created by LINREG and some other instructions from the results of regressions, but these have the estimated coefficients coded into them and thus are of no use for further estimation.
FRMLs usually produce a real value, but you can also create formulas which produce matrices—these are used, for instance, by the instructions CVMODEL and DLM. You need to do a DECLARE instruction as the first step in creating such a formula. For instance, to make A a formula which returns a RECTANGULAR, do
declare frml[rect] a
The Instruction FRML
The FRML instruction is used to define the formula or formulas that you want to estimate. FRML will usually take the form:
FRML( options ) formulaname depvar = function(T)
The depvar (dependent variable) is often unnecessary. Some simple examples:
nonlin k a b
frml logistic = 1.0/(k+a*b^t)
is a formula for a logistic trend. The parameters K, A and B will be estimated later.
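As a rough sketch of how such a formula might then be estimated, the lines below pass LOGISTIC to NLLS (non-linear least squares). The series Y and the guess values are assumptions made up for illustration; they are not part of the example above.
* Y is a hypothetical dependent series; the guess values are illustrative only
compute k=1.0,a=1.0,b=0.9
nlls(frml=logistic) y
Because LOGISTIC was defined without a depvar, the dependent variable is named on the NLLS instruction instead.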
nonlin b0 b1 b2 gamma sigmasq
frml olsresid = pcexp-b0-b1*pcaid-b2*pcinc
frml varfunc = sigmasq*(pop^gamma)
translates the following into FRMLs: \(PCEX{P_t} - {\beta _0} - {\beta _1}PCAI{D_t} - {\beta _2}PCIN{C_t}\) and \({\sigma ^2}POP_t^\gamma \) with \({\beta _0}\), \({\beta _1}\), \({\beta _2}\), \({\gamma}\) and \({\sigma ^2}\) representing the unknown parameters.
nonlin i0 i1 i2 i3 i4
frml investnl invest = $
i0+i1*invest{1}+i2*ydiff{1}+i3*gnp+i4*rate{4}
translates an investment equation which depends upon current GNP and lags of INVEST, YDIFF and RATE. You could also do this (more flexibly) with:
frml(regressors,vector=invparms) investnl invest
# constant invest{1} ydiff{1} gnp rate{4}
nonlin invparms
You can create a FRML to compute a function which depends upon its own value at the previous data point. Such recursively defined functions are not uncommon in time series work. For instance, in GARCH models, this period’s variance depends upon last period’s. In models with moving average terms, this period’s residual depends upon last period’s. We will demonstrate how to handle this for a geometric distributed lag. (This is for purposes of illustration. You can estimate this more easily with BOXJENK.)
\({y_t} = {\beta _0} + {\beta _1}\left\{ {\sum\limits_{s = 0}^\infty {{\lambda ^s}{X_{t - s}}} } \right\}\)
We can generate the part in braces (call it \({Z_t}\)) recursively by
\({Z_t} = {X_t} + \lambda {\kern 1pt} {\kern 1pt} {Z_{t - 1}}\)
The parameter THETA represents the unobservable
\({Z_0} = \sum\limits_{s = 0}^\infty {{\lambda ^s}{X_{ - s}}} \)
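One way to see why a single extra parameter is enough is to unroll the recursion. Writing \(\theta \) for THETA and numbering the first sample entry as 1 (a convention assumed here purely for illustration), repeated substitution starting from \({Z_0} = \theta \) gives
\({Z_1} = {X_1} + \lambda \theta ,\quad {Z_2} = {X_2} + \lambda {X_1} + {\lambda ^2}\theta ,\quad {Z_t} = \sum\limits_{s = 0}^{t - 1} {{\lambda ^s}{X_{t - s}}} + {\lambda ^t}\theta \)
so everything that is not observed is absorbed into the \({\lambda ^t}\theta \) term. When \(\left| \lambda \right| < 1\), that term shrinks as t grows.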
At each T, the formula below sets XLAGS equal to the new value of Z, and then uses this value to compute the next entry. %IF is used to distinguish the first observation from the others, as it needs to use THETA for the pre-sample value of Z. (The START option on many instructions provides a more flexible way to handle the different treatment of the first observation).
declare real xlags
nonlin beta0 beta1 theta lambda
frml geomdlag = (xlags = x + lambda*%if(t==1949:1,theta,xlags)),$
beta0 + beta1 * xlags
Notice that the formula includes two separate calculations: the first computes XLAGS, and the second uses this to create the actual value for the formula. This ability to split the calculation into manageable parts is a great help in writing formulas which are easy to read and write. Just follow each preliminary calculation with a comma. The final value produced by the formula (the expression after the last comma) is the one that is used.
The handling of the initial conditions is often the trickiest part of both setting up and estimating a recursive function. The example above shows one way to do this: estimate it as a free parameter. Another technique is to set it to a “typical” value. For instance, if X were a series with zero mean, zero would not be an unreasonable choice. The simplest way to set this up is to make XLAGS a data series, rather than a single real value.
set xlags = 0.0
nonlin beta0 beta1 lambda
frml geomdlag = (xlags=x+lambda*xlags{1}),beta0+beta1*xlags
When the formula needs an initial lagged value of XLAGS, it pulls in the zero.
Many of the models which you will be estimating will have two or more distinct parts. For instance, in a maximum likelihood estimation of a single equation, there is usually a model for the mean and a model for the variance. While both parts are needed for the complete model, there is no direct interaction between the two. You might very well want to alter one without changing the other.
This can be handled within RATS by defining separate FRMLs for each part, and then combining them into a final FRML. Redefining one of the components has no effect upon the other. For instance,
nonlin b0 b1 b2 gamma sigmasq
frml olsresid = pcexp-b0-b1*pcaid-b2*pcinc
frml varfunc = sigmasq*(pop^gamma)
frml likely = %logdensity(varfunc(t),olsresid(t))
defines the log likelihood function for a model with Normally distributed errors whose variance is proportional to a power of the series POP. The variance model and regression residuals model are represented by separate formulas. The LIKELY formula doesn’t need to know anything about the two formulas which it references.
There is one minor difficulty with the way the model above was coded: the single NONLIN instruction declares the parameters for both parts. This is where PARMSETS can come in handy. If we rewrite this as
nonlin(parmset=olsparms) b0 b1 b2
frml olsresid = pcexp-b0-b1*pcaid-b2*pcinc
nonlin(parmset=varparms) gamma sigmasq
frml varfunc = sigmasq*(pop^gamma)
frml likely = %logdensity(varfunc(t),olsresid(t))
then the PARMSET for the complete model is OLSPARMS+VARPARMS.
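A minimal sketch of how the combined model might then be estimated with MAXIMIZE, using the summed PARMSET. The guess values below are assumptions, chosen only so that the variance starts out positive; they are not part of the original example.
* Illustrative guess values (assumed): constant variance of 1.0
compute sigmasq=1.0,gamma=0.0
maximize(parmset=olsparms+varparms) likely
Keeping the two PARMSETs separate means that trying a different variance model only requires redefining VARPARMS and VARFUNC.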
Creating a FRML from a Regression
In the last example, the OLSRESID formula was fairly typical of the “mean” model in many cases: it’s linear in the parameters, with no constraints. This could be estimated by LINREG if it weren’t for the non-standard model of the variance.
This is a relatively small model, but it is possible to have a model for the mean which has many more explanatory variables than this one. Coding these up as formulas can be a bit tedious. Fortunately, FRML provides several ways to simplify this process. We demonstrated the REGRESSORS option earlier—it takes the right side of the formula from a list of regressors. For example, we can set up this model with the following:
frml(regressors,parmset=olsparms,vector=b) olsmodel
# constant pcaid pcinc
nonlin(parmset=varparms) gamma sigmasq
frml varfunc = sigmasq*(pop^gamma)
frml likely = %logdensity(varfunc(t),pcexp-olsmodel(t))
The first FRML instruction does the following:
1. Creates OLSMODEL as the formula B(1)+B(2)*PCAID+B(3)*PCINC.
2. Puts the 3-element vector B into the PARMSET named OLSPARMS.
The final function has to be altered slightly because OLSMODEL gives the explained part of the model, not the residual. Notice that, with this way of setting up the model, you can change the mean model by just changing the list of explanatory variables on the supplementary card.
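For instance, a sketch of the same setup with one more explanatory variable might look like the lines below. The lagged regressor PCINC{1} is a hypothetical addition for illustration only; nothing but the supplementary card changes.
* PCINC{1} is an assumed extra regressor, not part of the original model
frml(regressors,parmset=olsparms,vector=b) olsmodel
# constant pcaid pcinc pcinc{1}
B would then have four elements, while VARFUNC and LIKELY are used exactly as before.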
You can also create formulas following a LINREG instruction by using FRML with the LASTREG option, or by using the EQUATION option to convert an estimated equation. For instance, the following sets up the same model, but uses the least squares estimates for the guess values for the parameters in the mean model, and the (constant) least squares variance as the guess for the variance model.
linreg pcexp
# constant pcaid pcinc
frml(lastreg,parmset=olsparms,vector=b) olsmodel
nonlin(parmset=varparms) gamma sigmasq
compute sigmasq=%sigmasq,gamma=0.0
frml varfunc = sigmasq*(pop^gamma)
frml likely = %logdensity(varfunc(t),pcexp-olsmodel(t))
You can create VECTORs or other arrays of FRMLs. This can be very handy when you have a large number of FRMLs with a similar form. You have to be careful, however, if the FRMLs are defined in a loop. Wherever you use the loop index in the formula definition, you must prefix it with the & symbol.
dec vector b(n)
dec vect[frml] blackf(n)
nonlin gamma b
do i=1,n
frml blackf(i) s(i) = (1-b(&i))*gamma+b(&i)*market
end do i
The &i’s are needed in the formula because (1-B(i))*GAMMA+B(i)*MARKET is a perfectly good formula, which would be calculated using the value of i at the time the formula is used, not at the time it was defined. The &i returns the value of i as the formula is defined, so, for example, the first formula is defined as:
S(1) = (1-B(1))*GAMMA+B(1)*MARKET
rather than
S(i) = (1-B(i))*GAMMA+B(i)*MARKET
Copyright © 2024 Thomas A. Doan