
significance tests of c vectors

Posted: Wed Oct 26, 2011 6:19 am
by chiade
Hi Tom,

I have been using the time-varying coefficients regression, taken from Lütkepohl's "New Introduction to Multiple Time Series Analysis", pp. 637-640, which you posted. I have a few queries; my setup is below. I have read the manual 6-7 times.

These are my queries:
1) How do I retrieve the initial and final time-varying coefficients of my equation, i.e. c=%eqnxvector(ceqn,t), from DLM, and their t-tests? The t-tests were only given for the LINREG and subsequently for the time-varying variances lsigsqw & lsigsqv. Where can I view the initial/final A and F values?

2) What exactly is %eqnxvector(ceqn,t)? The manual says these are the x(t) vectors. Are they the coefficients I derive from the LINREG instruction? Do the A, F and C in DLM() serve as initial values?

3) Where do I insert the series for HH and BSB in DLM, since these are regressors? Do I need to use MU for the constant, or has the code below taken care of it?

Many thanks for your clarification.

Code: Select all

linreg(define=ceqn,noprint) wti 1997:10:31 2011:6:24
# constant hh bsb

dec vect[series] b(3) lower(3) upper(3)
*
dec vect lsigsqw(3)
*
compute lsigsqw=%log(||20.,60.,60.||)
compute lsigsqv=log(30)
dlm(c=%eqnxvector(ceqn,t),sw=%diag(%exp(lsigsqw)),sv=exp(lsigsqv),exact,y=wti,$
  method=solve,type=filter,vhat=vhat,yhat=yhat) 1997:10:31 2011:6:24 xstates vstates
do i=1,3
   set b(i) 1997:11:28 2011:6:24 = xstates(t)(i)
   set lower(i) 1997:11:28 2011:6:24 = b(i)-2.0*sqrt(vstates(t)(i,i))
   set upper(i) 1997:11:28 2011:6:24 = b(i)+2.0*sqrt(vstates(t)(i,i))
end do i
*
set v 1997:11:28 2011:6:24 = %scalar(vhat)
set y 1997:11:28 2011:6:24 = %scalar(yhat)

Re: significance tests of c vectors

Posted: Wed Oct 26, 2011 12:03 pm
by TomDoan
chiade wrote:Hi Tom,

I have been using the time-varying coefficients regression, taken from Lütkepohl's "New Introduction to Multiple Time Series Analysis", pp. 637-640, which you posted. I have a few queries; my setup is below. I have read the manual 6-7 times.

These are my queries:
1) How do I retrieve the initial and final time-varying coefficients of my equation, i.e. c=%eqnxvector(ceqn,t), from DLM, and their t-tests? The t-tests were only given for the LINREG and subsequently for the time-varying variances lsigsqw & lsigsqv. Where can I view the initial/final A and F values?
In your program, you've already fetched the coefficients as the XSTATES in your DO I loop. The "t-statistics" will be XSTATES(t)(i)/sqrt(VSTATES(t)(i,i)).
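For instance, extending your own DO loop (a minimal sketch; TS is just an illustrative name for the series of t-statistics):

Code: Select all

dec vect[series] ts(3)
do i=1,3
   * t-statistic for coefficient i at each entry: filtered state over its standard error
   set ts(i) 1997:11:28 2011:6:24 = xstates(t)(i)/sqrt(vstates(t)(i,i))
end do i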
chiade wrote: 2) What exactly is %eqnxvector(ceqn,t)? The manual says these are the x(t) vectors. Are they the coefficients I derive from the LINREG instruction? Do the A, F and C in DLM() serve as initial values?
The model is y(t)=X(t)B(t)+v(t), B(t)=B(t-1)+w(t). The states in the DLM are the coefficients (B(t)). The regressors X(t) are the loadings; they aren't estimated, they're data.
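To spell out how that maps into the DLM options (as I read your setup, with A and F left at their defaults):

y(t) = C(t)'B(t) + v(t),    Var(v(t)) = SV
B(t) = A B(t-1) + F w(t),   Var(w(t)) = SW

With C=%eqnxvector(ceqn,t) (the regressor vector x(t)) and A=F=I, the DLM states are exactly the random-walk coefficients B(t); A, F and C are fixed inputs or data, not estimated quantities.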
chiade wrote: 3) Where do I insert the series for HH and BSB in DLM, since these are regressors? Do I need to use MU for the constant, or has the code below taken care of it?
You already did that when you defined your regression equation.
chiade wrote: Many thanks for your clarification.

Code: Select all

linreg(define=ceqn,noprint) wti 1997:10:31 2011:6:24
# constant hh bsb

dec vect[series] b(3) lower(3) upper(3)
*
dec vect lsigsqw(3)
*
compute lsigsqw=%log(||20.,60.,60.||)
compute lsigsqv=log(30)
dlm(c=%eqnxvector(ceqn,t),sw=%diag(%exp(lsigsqw)),sv=exp(lsigsqv),exact,y=wti,$
  method=solve,type=filter,vhat=vhat,yhat=yhat) 1997:10:31 2011:6:24 xstates vstates
do i=1,3
   set b(i) 1997:11:28 2011:6:24 = xstates(t)(i)
   set lower(i) 1997:11:28 2011:6:24 = b(i)-2.0*sqrt(vstates(t)(i,i))
   set upper(i) 1997:11:28 2011:6:24 = b(i)+2.0*sqrt(vstates(t)(i,i))
end do i
*
set v 1997:11:28 2011:6:24 = %scalar(vhat)
set y 1997:11:28 2011:6:24 = %scalar(yhat)

Re: significance tests of c vectors

Posted: Thu Oct 27, 2011 7:21 am
by chiade
Hi Tom,

I can't get a single t-statistic for the time-varying coefficients like the ones you get in the linear regression table. I need to find out the significance of these coefficients. I tried using SSTATS, but I don't know whether the resulting value is a t-stat.

I would like to ask how YHAT was derived. I tried multiplying the final C coefficients by the respective regressors and adding VHAT, but the derived Y is different from YHAT.

Also, I have tried the seasonal and local DLM below. I wonder whether I have done it the right way, as I added the 3x3 identity matrix into A and F, or is it better to use NONLIN with unknown variables? However, it didn't manage to converge, so I presume it was not a good model. Are there any tools to diagnose whether my model is better with or without the seasonal and local components? As my data is weekly, I would have a 52x52 matrix. Is that too big a size and not recommended?

I have added the A and F into my earlier setup, although I don't know whether they are relevant. However, I encountered the error message "## DLM5. Probable Model Error. Diffuse prior was not reduced to zero rank". I wonder what happened?

This is my new setup. Thanks a lot for your help.

Code: Select all

*
linreg(define=ceqn,noprint) wti 1997:10:31 2011:6:24
# constant hh bsb

dec vect[series] b(3) lower(3) upper(3) ts(3)  vs(3)
@LocalDLM(type=level,a=al,c=cl,f=fl)
@SeasonalDLM(type=fourier,a=as,c=cs,f=fs)
dec frml[rect] cf
frml cf = %eqnxvector(ceqn,t)~~cl~~cs
*
*

compute a=%identity(3)~\al~\as,f=%identity(3)~\fl~\fs,c=%eqnxvector(ceqn,t)~~cl~~cs
*
dec vect lsigsqw(3)
*
compute lsigsqw=%log(||20.,60.,30.,30.,30.,30.,30.,30.,30.,30.||)
compute lsigsqv=log(30)
dlm(a=a,f=f,c=cf,sw=%diag(%exp(lsigsqw)),sv=exp(lsigsqv),exact,y=wti,$
  method=solve,type=filter,vhat=vhat,yhat=yhat) 1997:10:31 2011:6:24 xstates vstates
do i=1,3
   set b(i) 1997:11:28 2011:6:24 = xstates(t)(i)
   set lower(i) 1997:11:28 2011:6:24 = b(i)-2.0*sqrt(vstates(t)(i,i))
   set upper(i) 1997:11:28 2011:6:24 = b(i)+2.0*sqrt(vstates(t)(i,i))
   set ts(i) 1997:11:28 2011:6:24 = xstates(t)(i)/sqrt(vstates(t)(i,i))
   set vs(i) 1997:11:28 2011:6:24 = sqrt(vstates(t)(i,i))
end do i
*
set v 1997:11:28 2011:6:24 = %scalar(vhat)
set y 1997:11:28 2011:6:24 = %scalar(yhat)
sstats 1997:11:28 2011:6:24 ts(1)>>mu1
sstats 1997:11:28 2011:6:24 ts(2)>>mu2
sstats 1997:11:28 2011:6:24 ts(3)>>mu3

Re: significance tests of c vectors

Posted: Thu Oct 27, 2011 10:46 am
by TomDoan
chiade wrote:Hi Tom,

I can't get a single t-statistic for the time-varying coefficients like the ones you get in the linear regression table. I need to find out the significance of these coefficients. I tried using SSTATS, but I don't know whether the resulting value is a t-stat.
I'm not sure what you mean by a "single" t-statistic. You have time-varying parameters, hence you have a separate t-statistic at each entry. If you're trying to aggregate the t-statistics, I'm not sure what that would accomplish.
chiade wrote: I would like to ask how YHAT was derived. I tried multiplying the final C coefficients by the respective regressors and adding VHAT, but the derived Y is different from YHAT.
YHATs are the one-step predictions, which, in this case, would be the regressors at t dotted with the coefficients at t-1.
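You can check that yourself (a sketch; since A is implicitly the identity, the state predicted for t is the filtered state at t-1, and YCHECK is just an illustrative name):

Code: Select all

* one-step prediction: regressors at t dotted with the filtered coefficients at t-1
set ycheck 1997:11:28 2011:6:24 = %dot(%eqnxvector(ceqn,t),xstates(t-1))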
chiade wrote: Also, I have tried the seasonal and local DLM below. I wonder whether I have done it the right way, as I added the 3x3 identity matrix into A and F, or is it better to use NONLIN with unknown variables? However, it didn't manage to converge, so I presume it was not a good model. Are there any tools to diagnose whether my model is better with or without the seasonal and local components? As my data is weekly, I would have a 52x52 matrix. Is that too big a size and not recommended?
Yes, a 52 week seasonal requires a very large matrix. I'm not sure I understand what your model looks like once you add the local trend and seasonal.

Re: significance tests of c vectors

Posted: Fri Oct 28, 2011 8:31 am
by chiade
Hi Tom,

Thanks for your clarification. I have tallied the YHAT figures. For the significance tests, I am actually referring to something generated with EViews, as seen below. How can I generate the root MSE and p-values?

Code: Select all

Sspace: SSBSB				
Method: Maximum likelihood (Marquardt)				
Date: 07/30/11   Time: 16:48				
Sample: 11/21/1997 6/24/2011				
Included observations: 710				
User prior mean: SVEC0				
User prior variance: SVAR0				
Convergence achieved after 10 iterations				
WARNING: Singular covariance - coefficients are not unique				
				
	Coefficient	Std. Error	z-Statistic	Prob.  
				
C(1)	-262.6266	NA	NA	NA
C(2)	-8.783051	NA	NA	NA
C(3)	-9.935267	NA	NA	NA
				
	Final State	Root MSE	z-Statistic	Prob.  
				
SV1	3.124044	9.52E-05	32815.14	0.0000
SV2	0.036858	0.081558	0.451921	0.6513
SV3	0.261526	0.026543	9.852964	0.0000
				
Log likelihood	-17879381	     Akaike info criterion		50364.46
Parameters	3	     Schwarz criterion		50364.48
Diffuse priors	0	     Hannan-Quinn criter.		50364.47

Re: significance tests of c vectors

Posted: Fri Oct 28, 2011 9:25 am
by TomDoan
That's XSTATES, SQRT(VSTATES) and XSTATES/SQRT(VSTATES) (element by element), all evaluated at the final data point only. Why that would be included in standard output is beyond me. We have examples from four textbooks on the subject of SSMs, and not a single author has, to my knowledge, ever mentioned the z-score on the final state as being something of interest.
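If you want to reproduce that final-state table anyway, something along these lines would do it from your filtered output (a sketch; TEND is just an illustrative name):

Code: Select all

compute tend=2011:6:24
do i=1,3
   * final state, its root MSE, and the z-statistic
   display xstates(tend)(i) sqrt(vstates(tend)(i,i)) $
      xstates(tend)(i)/sqrt(vstates(tend)(i,i))
end do i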

Re: significance tests of c vectors

Posted: Mon Oct 31, 2011 8:49 am
by chiade
Hi Tom,

I still have some queries to clarify. I have set up the following with HH as the dependent variable and BSBC and WTIC as the regressors. I tried variances from 0.0000001 to 1,000, and below is the output.

Code: Select all

linreg(define=ceqn,noprint) hh 1997:10:31 2011:6:24
# constant bsbc wtic

dec vect[series] bd(3) lower(3) upper(3) tsd(3)  vsd(3)
dec vect sigsqw(3)
*
compute sigsqw=(||20.,20.,20.||)
compute sigsqv=20.
dlm(c=%eqnxvector(ceqn,t),sw=%diag(sigsqw),sv=sigsqv,PRESAMPLE=ERGODIC,y=wti,$
  method=solve,type=filter,vhat=vhat,yhat=yhat) 1997:10:31 2011:6:24 xstates vstates
do i=1,3
   set bd(i) 1997:11:28 2011:6:24 = xstates(t)(i)
   set lower(i) 1997:11:28 2011:6:24 = bd(i)-2.0*sqrt(vstates(t)(i,i))
   set upper(i) 1997:11:28 2011:6:24 = bd(i)+2.0*sqrt(vstates(t)(i,i))
   set tsd(i) 1997:11:28 2011:6:24 = xstates(t)(i)/sqrt(vstates(t)(i,i))
   set vsd(i) 1997:11:28 2011:6:24 = (vstates(t)(i,i))
end do i
*
set vd 1997:11:28 2011:6:24 = %scalar(vhat)
set yd 1997:11:28 2011:6:24 = %scalar(yhat)

nonlin sigsqv sigsqw
dlm(c=%eqnxvector(ceqn,t),sw=%diag(sigsqw),sv=sigsqv,PRESAMPLE=ERGODIC,y=wti,$
  method=bfgs,iters=500,type=filter,vhat=vhat,yhat=yhat) 1997:10:31 2011:6:24 xstates vstates
*
do i=1,3
   set bd(i) 1997:11:28 2011:6:24 = xstates(t)(i)
   set lower(i) 1997:11:28 2011:6:24 = bd(i)-2.0*sqrt(vstates(t)(i,i))
   set upper(i) 1997:11:28 2011:6:24 = bd(i)+2.0*sqrt(vstates(t)(i,i))
end do i

DLM - Estimation by BFGS
NO CONVERGENCE IN 25 ITERATIONS
LAST CRITERION WAS  0.0000000
SUBITERATIONS LIMIT EXCEEDED.
ESTIMATION POSSIBLY HAS STALLED OR MACHINE ROUNDOFF IS MAKING FURTHER PROGRESS DIFFICULT
TRY HIGHER SUBITERATIONS LIMIT, TIGHTER CVCRIT, DIFFERENT SETTING FOR EXACTLINE OR ALPHA ON NLPAR
RESTARTING ESTIMATION FROM LAST ESTIMATES OR DIFFERENT INITIAL GUESSES MIGHT ALSO WORK
Weekly Data From 1997:10:31 To 2011:06:24
Usable Observations                       713
Rank of Observables                       710
Log Likelihood                      -545.5534

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  SIGSQV                             0.3896  2.7520e-003    141.58607  0.00000000
2.  SIGSQW(1)                          0.1071  8.4587e-004    126.57770  0.00000000
3.  SIGSQW(2)                    -1.7802e-003  5.9034e-006   -301.55859  0.00000000
4.  SIGSQW(3)                     8.5825e-004  7.3743e-006    116.38481  0.00000000


DLM - Estimation by BFGS
NO CONVERGENCE IN 500 ITERATIONS
LAST CRITERION WAS     NA
Weekly Data From 1997:10:31 To 2011:06:24
Usable Observations                       713
Rank of Observables                         0
Log Likelihood                        NA

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  SIGSQV                            NA          0.000000      0.00000  0.00000000
2.  SIGSQW(1)                         NA          0.000000      0.00000  0.00000000
3.  SIGSQW(2)                         NA          0.000000      0.00000  0.00000000
4.  SIGSQW(3)                         NA          0.000000      0.00000  0.00000000
1) Does this mean that HH cannot be explained by WTIC/BSBC? When I switched the positions, i.e. WTIC becoming the dependent variable, there was quick convergence. Whenever WTIC is involved as a regressor, there is no convergence. Does this mean HH/BSBC are exogenous variables?

This is the output with WTIC as dependent variable.

Code: Select all

Convergence in    23 Iterations. Final criterion was  0.0000069 <=  0.0000100
Weekly Data From 1997:10:31 To 2011:06:24
Usable Observations                       713
Rank of Observables                       710
Log Likelihood                     -1624.7511

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  SIGSQV                       -0.200010722  0.119048097     -1.68008  0.09294111
2.  SIGSQW(1)                     0.470404223  0.315520958      1.49088  0.13599272
3.  SIGSQW(2)                     0.027438327  0.015725289      1.74485  0.08101036
4.  SIGSQW(3)                     0.002532101  0.000264004      9.59113  0.00000000

Since SIGSQV is negative, I logged it and got the following.
DLM - Estimation by BFGS   (log 60,60,60 - .000001)
Convergence in    11 Iterations. Final criterion was  0.0000040 <=  0.0000100
Weekly Data From 1997:10:31 To 2011:06:24
Usable Observations                       713
Rank of Observables                       710
Log Likelihood                     -1708.1141

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  LSIGSQV                      -13.81476379   5.47587350     -2.52284  0.01164107
2.  LSIGSQW(1)                     0.40328655   0.30063744      1.34144  0.17977821
3.  LSIGSQW(2)                    -1.43889055   0.12558052    -11.45791  0.00000000
4.  LSIGSQW(3)                   -61.99362233   0.42639808   -145.38907  0.00000000
2) Since LSIGSQW is insignificant, can I infer that SIGSQW (without logging) is also insignificant, without running the regression in level form? If not, how can I find the significance of SIGSQW by converting back from logs to levels, if I opt not to run the regression in level form?

Back to the first regression.....

Code: Select all

linreg(define=ceqn,noprint) hh 1997:10:31 2011:6:24
# constant bsbc wtic

dec vect[series] bd(3) lower(3) upper(3) tsd(3)  vsd(3)
dec vect sigsqw(3)
*
compute sigsqw=(||20.,20.,20.||)
compute sigsqv=20.
dlm(c=%eqnxvector(ceqn,t),sw=%diag(sigsqw),sv=sigsqv,PRESAMPLE=ERGODIC,y=wti,$
  method=solve,type=filter,vhat=vhat,yhat=yhat) 1997:10:31 2011:6:24 xstates vstates
do i=1,3
   set bd(i) 1997:11:28 2011:6:24 = xstates(t)(i)
3) Since the first few observations (my data is weekly) are normally volatile, to get my XSTATES from 1997:11:28-2011:6:24 I ran the LINREG starting one month earlier, i.e. from 1997:10:31, as well as the DLM instruction, retrieving the XSTATES for my required period. Is this the right approach, or what is your recommendation as to how many additional front points should be added?

4) What is the difference between the Kalman filter and Bai-Perron? The t-stats I derived for the XSTATES show sharp spikes that happen to mostly coincide with the breaks indicated by Bai-Perron, missed by a few lags.

5) I noticed that the state-space model doesn't include a trend. Is a trend necessary, as in a regression, i.e.
linreg wtic
# constant trend hh bsbc
or is the purpose of the Kalman filter to reflect a changing time trend?

6) Is it possible and recommended to insert structural breaks into the Kalman filter, since the purpose of the KF is to reflect the breaks?

7) You mentioned in your previous posts that even if I pegged
compute sigsqw=(||0.,20.,20.||)
and used the start=sigsqw(2) instruction,
the coefficients will keep changing if TYPE=FILTER. So how do I fix the variances which are insignificant, i.e. LSIGSQW(1), or will they be fixed automatically if I use those instructions?

Code: Select all

   Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  LSIGSQV                      -13.81476379   5.47587350     -2.52284  0.01164107
2.  LSIGSQW(1)                     0.40328655   0.30063744      1.34144  0.17977821
3.  LSIGSQW(2)                    -1.43889055   0.12558052    -11.45791  0.00000000
4.  LSIGSQW(3)                   -61.99362233   0.42639808   -145.38907  0.00000000
Many thanks once again for all your kind advice. RATS is really difficult software, but it can do more in-depth analysis compared to EViews, which is more user-friendly.

Rgds,
Des

Re: significance tests of c vectors

Posted: Mon Oct 31, 2011 10:54 am
by TomDoan
chiade wrote:Hi Tom,

I still have some queries to clarify. I have set up the following with HH as the dependent variable and BSBC and WTIC as the regressors. I tried variances from 0.0000001 to 1,000, and below is the output.

Code: Select all

linreg(define=ceqn,noprint) hh 1997:10:31 2011:6:24
# constant bsbc wtic

dec vect[series] bd(3) lower(3) upper(3) tsd(3)  vsd(3)
dec vect sigsqw(3)
*
compute sigsqw=(||20.,20.,20.||)
compute sigsqv=20.
dlm(c=%eqnxvector(ceqn,t),sw=%diag(sigsqw),sv=sigsqv,PRESAMPLE=ERGODIC,y=wti,$
  method=solve,type=filter,vhat=vhat,yhat=yhat) 1997:10:31 2011:6:24 xstates vstates
do i=1,3
   set bd(i) 1997:11:28 2011:6:24 = xstates(t)(i)
   set lower(i) 1997:11:28 2011:6:24 = bd(i)-2.0*sqrt(vstates(t)(i,i))
   set upper(i) 1997:11:28 2011:6:24 = bd(i)+2.0*sqrt(vstates(t)(i,i))
   set tsd(i) 1997:11:28 2011:6:24 = xstates(t)(i)/sqrt(vstates(t)(i,i))
   set vsd(i) 1997:11:28 2011:6:24 = (vstates(t)(i,i))
end do i
*
set vd 1997:11:28 2011:6:24 = %scalar(vhat)
set yd 1997:11:28 2011:6:24 = %scalar(yhat)

nonlin sigsqv sigsqw
dlm(c=%eqnxvector(ceqn,t),sw=%diag(sigsqw),sv=sigsqv,PRESAMPLE=ERGODIC,y=wti,$
  method=bfgs,iters=500,type=filter,vhat=vhat,yhat=yhat) 1997:10:31 2011:6:24 xstates vstates
*
do i=1,3
   set bd(i) 1997:11:28 2011:6:24 = xstates(t)(i)
   set lower(i) 1997:11:28 2011:6:24 = bd(i)-2.0*sqrt(vstates(t)(i,i))
   set upper(i) 1997:11:28 2011:6:24 = bd(i)+2.0*sqrt(vstates(t)(i,i))
end do i

DLM - Estimation by BFGS
NO CONVERGENCE IN 25 ITERATIONS
LAST CRITERION WAS  0.0000000
SUBITERATIONS LIMIT EXCEEDED.
ESTIMATION POSSIBLY HAS STALLED OR MACHINE ROUNDOFF IS MAKING FURTHER PROGRESS DIFFICULT
TRY HIGHER SUBITERATIONS LIMIT, TIGHTER CVCRIT, DIFFERENT SETTING FOR EXACTLINE OR ALPHA ON NLPAR
RESTARTING ESTIMATION FROM LAST ESTIMATES OR DIFFERENT INITIAL GUESSES MIGHT ALSO WORK
Weekly Data From 1997:10:31 To 2011:06:24
Usable Observations                       713
Rank of Observables                       710
Log Likelihood                      -545.5534

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  SIGSQV                             0.3896  2.7520e-003    141.58607  0.00000000
2.  SIGSQW(1)                          0.1071  8.4587e-004    126.57770  0.00000000
3.  SIGSQW(2)                    -1.7802e-003  5.9034e-006   -301.55859  0.00000000
4.  SIGSQW(3)                     8.5825e-004  7.3743e-006    116.38481  0.00000000


DLM - Estimation by BFGS
NO CONVERGENCE IN 500 ITERATIONS
LAST CRITERION WAS     NA
Weekly Data From 1997:10:31 To 2011:06:24
Usable Observations                       713
Rank of Observables                         0
Log Likelihood                        NA

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  SIGSQV                            NA          0.000000      0.00000  0.00000000
2.  SIGSQW(1)                         NA          0.000000      0.00000  0.00000000
3.  SIGSQW(2)                         NA          0.000000      0.00000  0.00000000
4.  SIGSQW(3)                         NA          0.000000      0.00000  0.00000000
1) Does this mean that HH cannot be explained by WTIC/BSBC? When I switched the positions, i.e. WTIC becoming the dependent variable, there was quick convergence. Whenever WTIC is involved as a regressor, there is no convergence. Does this mean HH/BSBC are exogenous variables?
It means that you probably didn't re-initialize the parameters between the two DLM instructions, and the parameter values carried through from the first didn't work at all in the second.
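Something like this between the two (a sketch re-using your own guess values) takes care of that:

Code: Select all

* reset the guess values so the BFGS run doesn't start from a failed parameter set
compute sigsqw=(||20.,20.,20.||)
compute sigsqv=20.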
chiade wrote:This is the output with WTIC as dependent variable.

Code: Select all

Convergence in    23 Iterations. Final criterion was  0.0000069 <=  0.0000100
Weekly Data From 1997:10:31 To 2011:06:24
Usable Observations                       713
Rank of Observables                       710
Log Likelihood                     -1624.7511

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  SIGSQV                       -0.200010722  0.119048097     -1.68008  0.09294111
2.  SIGSQW(1)                     0.470404223  0.315520958      1.49088  0.13599272
3.  SIGSQW(2)                     0.027438327  0.015725289      1.74485  0.08101036
4.  SIGSQW(3)                     0.002532101  0.000264004      9.59113  0.00000000

Since SIGSQV is negative, I logged it and got the following.
DLM - Estimation by BFGS   (log 60,60,60 - .000001)
Convergence in    11 Iterations. Final criterion was  0.0000040 <=  0.0000100
Weekly Data From 1997:10:31 To 2011:06:24
Usable Observations                       713
Rank of Observables                       710
Log Likelihood                     -1708.1141

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  LSIGSQV                      -13.81476379   5.47587350     -2.52284  0.01164107
2.  LSIGSQW(1)                     0.40328655   0.30063744      1.34144  0.17977821
3.  LSIGSQW(2)                    -1.43889055   0.12558052    -11.45791  0.00000000
4.  LSIGSQW(3)                   -61.99362233   0.42639808   -145.38907  0.00000000
2) Since LSIGSQW is insignificant, can I infer that SIGSQW (without logging) is also insignificant, without running the regression in level form? If not, how can I find the significance of SIGSQW by converting back from logs to levels, if I opt not to run the regression in level form?
A TVP model with a strongly negative variance on the V when the variances are unconstrained is not good. It's one thing to have negative values on the variances of the drifts---that happens frequently. A negative equation variance would generally indicate that the slow drift of the TVP model isn't adequate.
chiade wrote: Back to the first regression.....

Code: Select all

linreg(define=ceqn,noprint) hh 1997:10:31 2011:6:24
# constant bsbc wtic

dec vect[series] bd(3) lower(3) upper(3) tsd(3)  vsd(3)
dec vect sigsqw(3)
*
compute sigsqw=(||20.,20.,20.||)
compute sigsqv=20.
dlm(c=%eqnxvector(ceqn,t),sw=%diag(sigsqw),sv=sigsqv,PRESAMPLE=ERGODIC,y=wti,$
  method=solve,type=filter,vhat=vhat,yhat=yhat) 1997:10:31 2011:6:24 xstates vstates
do i=1,3
   set bd(i) 1997:11:28 2011:6:24 = xstates(t)(i)
3) Since the first few observations (my data is weekly) are normally volatile, to get my XSTATES from 1997:11:28-2011:6:24 I ran the LINREG starting one month earlier, i.e. from 1997:10:31, as well as the DLM instruction, retrieving the XSTATES for my required period. Is this the right approach, or what is your recommendation as to how many additional front points should be added?
What you do with the LINREG doesn't matter; the state-space model is estimated independently of it. The first few observations (typically up to the number of coefficients) in a TVP model usually have very wild coefficient estimates. Typically, you omit those in graphing the evolution of the coefficients.
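For example, with the BD/LOWER/UPPER series you built (a sketch; the 1998:3:27 start is just an illustrative cutoff a few months into the sample):

Code: Select all

* plot the drifting BSBC coefficient with its 2-standard-error band,
* skipping the wild start-up estimates
graph(header="Time-varying coefficient on BSBC") 3
# bd(2) 1998:3:27 2011:6:24
# lower(2) 1998:3:27 2011:6:24
# upper(2) 1998:3:27 2011:6:24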
chiade wrote: 4) What is the difference between the Kalman filter and Bai-Perron? The t-stats I derived for the XSTATES show sharp spikes that happen to mostly coincide with the breaks indicated by Bai-Perron, missed by a few lags.
Bai-Perron allows complete breaks in the coefficient vectors at a few points, while the TVP model has coefficients drifting from time period to time period.
chiade wrote: 5) I noticed that the state-space model doesn't include a trend. Is a trend necessary, as in a regression, i.e.
linreg wtic
# constant trend hh bsbc
or is the purpose of the Kalman filter to reflect a changing time trend?
The TVP model *can* capture a trend by drifting the intercept upwards, but that's not what it's designed to do. If there's a secular trend then a TVP model on a regression without a trend will not be correctly specified.
chiade wrote: 6) Is it possible and recommended to insert structural breaks into the Kalman filter, since the purpose of the KF is to reflect the breaks?
The Kalman filter is not a model. It's an algorithm that is used to compute the likelihood for the state-space model. The model itself is a TVP model, which is misspecified if the model has sharp breaks.
chiade wrote: 7) You mentioned in your previous posts that even if I pegged
compute sigsqw=(||0.,20.,20.||)
and used the start=sigsqw(2) instruction,
the coefficients will keep changing if TYPE=FILTER. So how do I fix the variances which are insignificant, i.e. LSIGSQW(1), or will they be fixed automatically if I use those instructions?

Code: Select all

   Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  LSIGSQV                      -13.81476379   5.47587350     -2.52284  0.01164107
2.  LSIGSQW(1)                     0.40328655   0.30063744      1.34144  0.17977821
3.  LSIGSQW(2)                    -1.43889055   0.12558052    -11.45791  0.00000000
4.  LSIGSQW(3)                   -61.99362233   0.42639808   -145.38907  0.00000000
If you model the variances in logs, you can peg a variance to zero by making the log variance a large negative number like -60. (The LSIGSQW(3) is, in effect, doing that.) That won't change the fact that in Kalman filtering through the data, the non-drifting coefficients will change as you get more data.
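For example (a sketch of the NONLIN constraint; -60 in logs is effectively a zero variance):

Code: Select all

* estimate the other variances while pegging the first drift variance at (effectively) zero
nonlin lsigsqv lsigsqw lsigsqw(1)=-60.0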

Re: significance tests of c vectors

Posted: Tue Nov 01, 2011 7:53 am
by chiade
Thanks for all the clarifications. Is there any way I can compare (e.g. by R-squared) a state-space model with another linear regression model or a cointegration model using CATS, where the R-squared is given?

You mentioned that the TVP adjustment is slowly evolving. Does that mean the adjustment is slow compared to that in OLS or an ECT? Where can I get a reference?

I have been trying different initial values, including -60 and 0, in the sigsqw=(||||). An error message arose saying that sigsqw(1) is not initialized. Does that mean it must be > 0? Based on your experience, what is your recommendation for the initial values? I have tried so many values, ranging from 0.0000001 to 1,000.00.

Code: Select all

dec vect[series] bd(3) lower(3) upper(3) tsd(3)  vsd(3)
dec vect sigsqw(3)
*
compute sigsqw=(||0.,20.,20.||)
compute sigsqv=20.
dlm(c=%eqnxvector(ceqn,t),sw=%diag(sigsqw),sv=sigsqv,PRESAMPLE=ERGODIC,y=wti,$
  method=solve,type=filter,vhat=vhat,yhat=yhat) 1997:10:31 2011:6:24 xstates vstates

linreg(define=ceqn,noprint) wti 1997:10:31 2011:6:24
# constant trend bsb hh
compute lsigsqw=%log(||.1,.1,.1,.1||)
compute lsigsqv=log(.1)
Convergence in    38 Iterations. Final criterion was  0.0000000 <=  0.0000100
Weekly Data From 1997:10:31 To 2011:06:24
Usable Observations                       713
Rank of Observables                       705
Log Likelihood                     -1619.6571

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  LSIGSQV                       -27.8433955  407.8946178     -0.06826  0.94557767
2.  LSIGSQW(1)                    -18.6444107  253.1383462     -0.07365  0.94128646
3.  LSIGSQW(2)                    -28.8438101    3.5970642     -8.01871  0.00000000
4.  LSIGSQW(3)                     -5.9748770    0.0864015    -69.15245  0.00000000
5.  LSIGSQW(4)                     -3.5048605    0.4765077     -7.35531  0.00000000


DLM - Estimation by BFGS
linreg(define=ceqn,noprint) wti 1997:10:31 2011:6:24
# constant trend bsb hh
compute lsigsqw=%log(||.01,.01,.01,.01||)
compute lsigsqv=log(.01)
Convergence in    61 Iterations. Final criterion was  0.0000013 <=  0.0000100
Weekly Data From 1997:10:31 To 2011:06:24
Usable Observations                       713
Rank of Observables                       705
Log Likelihood                     -1592.9389
Do you recommend basing initialization on convergence rates rather than the log likelihood? It seemed at odds that the log likelihood for the latter is better than the former's, even though the former converged faster.

Re: significance tests of c vectors

Posted: Tue Nov 01, 2011 10:00 am
by TomDoan
chiade wrote: Thanks for all the clarifications. Is there any way I can compare (e.g. by R-squared) a state-space model with another linear regression model or a cointegration model using CATS, where the R-squared is given?
The TVP model generalizes OLS on the same set of variables--you get (sequential) OLS by making all the "W" variances zero. So you can use AIC or BIC to compare the models.
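Concretely (a sketch; @REGCRITS after each estimation reports log-likelihood-based criteria, so the two are comparable):

Code: Select all

* information criteria for the TVP model (run right after the DLM estimation)
@regcrits
* information criteria for fixed-coefficient OLS on the same variables
linreg wti 1997:10:31 2011:6:24
# constant hh bsb
@regcrits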
chiade wrote: You mentioned that the TVP adjustment is slowly evolving. Does that mean the adjustment is slow compared to that in OLS or an ECT? Where can I get a reference?
Probably the best description of this is in Chapter 3 of West and Harrison (1997), Bayesian Forecasting and Dynamic Models, 2nd ed., Springer.
chiade wrote: I have been trying different initial values, including -60 and 0, in the sigsqw=(||||). An error message arose saying that sigsqw(1) is not initialized. Does that mean it must be > 0? Based on your experience, what is your recommendation for the initial values? I have tried so many values, ranging from 0.0000001 to 1,000.00.

Code: Select all

dec vect[series] bd(3) lower(3) upper(3) tsd(3)  vsd(3)
dec vect sigsqw(3)
*
compute sigsqw=(||0.,20.,20.||)
compute sigsqv=20.
dlm(c=%eqnxvector(ceqn,t),sw=%diag(sigsqw),sv=sigsqv,PRESAMPLE=ERGODIC,y=wti,$
  method=solve,type=filter,vhat=vhat,yhat=yhat) 1997:10:31 2011:6:24 xstates vstates

linreg(define=ceqn,noprint) wti 1997:10:31 2011:6:24
# constant trend bsb hh
compute lsigsqw=%log(||.1,.1,.1,.1||)
compute lsigsqv=log(.1)
Convergence in    38 Iterations. Final criterion was  0.0000000 <=  0.0000100
Weekly Data From 1997:10:31 To 2011:06:24
Usable Observations                       713
Rank of Observables                       705
Log Likelihood                     -1619.6571

    Variable                        Coeff      Std Error      T-Stat      Signif
************************************************************************************
1.  LSIGSQV                       -27.8433955  407.8946178     -0.06826  0.94557767
2.  LSIGSQW(1)                    -18.6444107  253.1383462     -0.07365  0.94128646
3.  LSIGSQW(2)                    -28.8438101    3.5970642     -8.01871  0.00000000
4.  LSIGSQW(3)                     -5.9748770    0.0864015    -69.15245  0.00000000
5.  LSIGSQW(4)                     -3.5048605    0.4765077     -7.35531  0.00000000


DLM - Estimation by BFGS
linreg(define=ceqn,noprint) wti 1997:10:31 2011:6:24
# constant trend bsb hh
compute lsigsqw=%log(||.01,.01,.01,.01||)
compute lsigsqv=log(.01)
Convergence in    61 Iterations. Final criterion was  0.0000013 <=  0.0000100
Weekly Data From 1997:10:31 To 2011:06:24
Usable Observations                       713
Rank of Observables                       705
Log Likelihood                     -1592.9389
Do you recommend basing initialization on convergence rates rather than the log likelihood? It seemed at odds that the log likelihood for the latter is better than the former's, even though the former converged faster.
The speed of convergence has nothing to do with the fit. It's controlled by how good the guess values are and how "quadratic" the log likelihood surface is in the particular parameter space. As long as it's converged, don't worry about it in evaluating models.

Re: significance tests of c vectors

Posted: Wed Nov 02, 2011 7:07 am
by chiade
Hi,

As you mentioned earlier, ignoring structural breaks in a TVP model is bad if they are sharp breaks. I used Bai-Perron to get the break points, then fed these break points into the TVP model and discovered that the variance of the first or second break is time-varying with significance. I am using TYPE=FILTER in DLM. Is it advisable to incorporate these breaks, i.e. d = 1 {t>=T0}? The break dummies are also time-varying with a better AIC/SBC, or is it better to keep them fixed with zero variances?

I used e.g. nonlin lsigsqv lsigsqw lsigsqw(4)=0.
BTW, how do I get the AIC/SBC after running a normal linear regression, i.e.
linreg wti
# trend bsb hh

Best of rgds,
Des

Re: significance tests of c vectors

Posted: Fri Nov 04, 2011 11:12 am
by TomDoan
chiade wrote:Hi,

As you mentioned earlier, ignoring structural breaks in a TVP model is bad if they are sharp breaks. I used Bai-Perron to get the break points, then fed these break points into the TVP model and discovered that the variance of the first or second break is time-varying with significance. I am using TYPE=FILTER in DLM. Is it advisable to incorporate these breaks, i.e. d = 1 {t>=T0}? The break dummies are also time-varying with a better AIC/SBC, or is it better to keep them fixed with zero variances?

I used e.g. nonlin lsigsqv lsigsqw lsigsqw(4)=0.
I'm not sure I understand what the point is of using Bai-Perron to identify breaks and then somehow feeding them into a TVP model.
chiade wrote: BTW, how do I get the AIC/SBC after running a normal linear regression, i.e.
linreg wti
# trend bsb hh
@REGCRITS works after LINREG, MAXIMIZE, GARCH, or DLM. It uses the log-likelihood-based versions of the criteria, so they are comparable across different estimation methods.

Re: significance tests of c vectors

Posted: Mon Nov 07, 2011 5:18 am
by chiade
Hi Tom,

Your comments have been invaluable and made me think hard.

I got the t-stats for one of the time-varying coefficients, which are less than 1.0 as seen from the graph, using the formula XSTATES(t)(i)/sqrt(VSTATES(t)(i,i)). However, the t-stat for its variance is significant. Why is this so? The other coefficients mostly have t-stats > 2.0, with the respective variances also significant. Is it all right to ignore the t-stats and look only at the significance of the variances to conclude whether a variable is necessary in the model?

@REGCRITS is very useful. How do I generate the AIC/BIC from CATS for the individual equations, to compare with the DLM? I stored the residuals into RATS and tried using @regcorrs, but it showed the results from the previous equation.

Also, how do I generate impulse responses when I have used Engle-Granger two-step cointegration to formulate the VECM? I am formulating a VECM using filtered errors from the DLM.

Thanks again for your advice.