TARMODELS.RPF

TARMODELS.RPF is an example of estimation and analysis of a STAR (Smooth Transition Autoregression) model. It is based on example 3 from Terasvirta(1994). The Canadian lynx population is a standard data set in the non-linear time series literature. The population fluctuates because a high population causes depletion of their food source (lynx are a small wild cat), causing a rapid die-off due to starvation. When the population is low, it can only be rebuilt through reproduction. Thus the up cycles and down cycles are driven by completely different mechanisms, suggesting that a threshold effect is likely.

The @ARAutoLags procedure does a quick, efficient examination of a range of AR models for stationary data. Because this is for a standard AR with no threshold effects, it might not give a good estimate of the lag length in the presence of a threshold (it would probably overshoot, so you might need to correct later by pruning unneeded lags). Here it picks 11 as the minimum AIC.

@yulelags(max=20) x

The delay for the STAR is chosen using applications of the @STARTest procedure which does an LM test on various functions of the lagged dependent variable. Standard practice is to pick the delay by choosing the one giving the most significant test statistic. The tests are done on the demeaned data using 11 lags and up to a 9 period delay. D=3 is by far the strongest test (most significant rejection of linearity).

diff(center) x / xc

do d=1,9

@StarTest(p=11,d=d) xc

end do d

This initially fixes the C and GAMMA at the mean and two times the reciprocal of the standard error, respectively. The logistic exponent is rescaled by 1/s as shown in the article, so GAMMA is, with that parameterization, put at 2. This gives a fairly sharp split between larger and smaller values, so the two branches should have decidedly different coefficients if there is evidence of STAR. With C and GAMMA fixed, NLLS is actually just OLS.

stats x

compute scalef=1.0/sqrt(%variance)

nonlin(parmset=starparms) gamma c

frml glstar = %logistic(scalef*gamma*(x{3}-c),1.0)

compute c=%mean, gamma=2.0

These are the two branches. Putting these in as linear equations makes specification more flexible—you can easily change the lag lists and the rest goes through as is.

equation standard x

# constant x{1 to 11}

equation transit x

# constant x{1 to 11}

These convert the linear equations into FRMLs with PHI1 and PHI2 as the coefficient vectors. Put them together with the transition function to make the STAR formula.

frml(equation=standard,vector=phi1) phi1f

frml(equation=transit ,vector=phi2) phi2f

frml star x = g=glstar,phi1f+g*phi2f

nonlin(parmset=regparms) phi1 phi2

nonlin(parmset=starparms) gamma c

Estimate the model with the STARPARMS left out.

nlls(parmset=regparms,frml=star) x

Based upon the initial results, the standard equation is trimmed to just X{1} and TRANSIT to X{2 3 4 10 11}.

equation standard x

# x{1}

equation transit x

# x{2 3 4 10 11}

frml(equation=standard,vector=phi1) phi1f

frml(equation=transit ,vector=phi2) phi2f

The new specification is re-estimated with just the regression parameters (output), then including the STAR parameters (output).

nlls(parmset=regparms,frml=star) x

nlls(parmset=regparms+starparms,frml=star) x

This next analyzes the roots of the polynomials (output) using @LagPolyRoots. The UPPER polynomial (for the part above the threshold) is obtained by adding the STANDARD and TRANSIT polynomials. Note that neither one of these will likely describe the cycles of the threshold process, as they just show the roots of the each polynomial if it were governing the entire process. However, the whole point of the threshold is that each subprocess drives the series towards the threshold to shift the process to the other branch. (Note that each subprocess is actually somewhat non-stationary, but it will turn out that the overall process is stationary).

compute %eqnsetcoeffs(standard,phi1)

compute %eqnsetcoeffs(transit,phi2)

compute upper=standard+transit

@LagPolyRoots(title="Above Threshold Polynomial") %eqnlagpoly(upper,x)

@LagPolyRoots(title="Below Threshold Polynomial") %eqnlagpoly(standard,x)

To forecast the non-linear equation, we need to create a MODEL with the one formula. This uses SFORECAST for the results for all forecasting calculations, so as we do different types of forecasts, we will need to copy the information out of that each time.

group starmodel star>>sforecast

Compute eventual forecasting function starting in 1925:1.

forecast(model=starmodel,steps=40,from=1925:1)

set eforecast 1925:1 1964:1 = sforecast

Computes forecasts for 40 steps beginning in 1925 by taking the mean of simulated values over 10000 replications.

set meanf 1925:1 1964:1 = 0.0

compute nreps=10000

do reps=1,nreps

simulate(model=starmodel,from=1925:1,to=1964:1)

set meanf 1925:1 1964:1 = meanf+sforecast

end do reps

set meanf 1925:1 1964:1 = meanf/nreps

Graphs the eventual forecast, the mean forecast and the last few years of actual data. While the process has a fairly regular cycle of roughly eight years, shocks will shift the timing enough that even 16 years out, the probability is fairly high that it will be in a somewhat different phase than it is today. The mean forecasts thus eventually converge to the series mean.

graph(footer="Forecasts of Lynx Data with LSTAR Model",$

key=loright,klabels=$

||"Eventual Forecast","Mean Forecast","Actual"||) 3

# eforecast

# meanf

# x 1920:1 1934:1

Full Program

cal(a) 1821
open data lynx.dat
data(org=cols) 1821:1 1934:1 lynx
set x = log(lynx)/log(10)
*
* The @ARAutoLags procedure does a quick, efficient examination of a range
* of AR models for stationary data. Because this is for a standard AR,
* with no threshold effects, it might not give a good estimate of the
* lag length in the presence of a threshold.
*
@arautolags(max=20) x
*
* Standard practice is to pick the delay by choosing the one which gives
* the most significant test statistic. The tests are done on the
* demeaned data using 11 lags and up to a 9 period delay.
*
diff(center) x / xc
do d=1,9
   @StarTest(p=11,d=d) xc
end do d
*
* This initially fixes the c and gamma at the mean and two times the
* reciprocal of the standard error, respectively. (The logistic exponent
* is rescaled by 1/sigma as shown in the article), so gamma is, with
* that parameterization, put at 2. This gives a fairly sharp split
* between larger and smaller values, so two branches should have
* decidedly different coefficients if there is evidence of STAR. With c
* and gamma fixed, NLLS is actually just OLS.
*
stats x
*
* In practice, you would use the following. The author used a rounded value.
*
compute scalef=1.0/sqrt(%variance)
nonlin(parmset=starparms) gamma c
frml glstar = %logistic(scalef*gamma*(x{3}-c),1.0)
compute c=%mean,gamma=2.0
*
* These are the two branches. Putting these in as linear equations makes
* specification more flexible---you can easily change the lag lists and
* the rest goes through as is.
*
equation standard x
# constant x{1 to 11}
equation transit x
# constant x{1 to 11}
*
* Convert the linear equations into FRML's with phi1 and phi2 as the
* coefficient vectors. Put them together with the transition function to
* make the STAR formula.
*
frml(equation=standard,vector=phi1) phi1f
frml(equation=transit ,vector=phi2) phi2f
frml star x = g=glstar,phi1f+g*phi2f
*
nonlin(parmset=regparms) phi1 phi2
nonlin(parmset=starparms) gamma c
*
* Estimate the model with the STARPARMS left out.
*
compute phi1=%const(0.0),phi2=%const(0.0)
nlls(parmset=regparms,frml=star) x
*
* Based upon the initial results, the standard equation is trimmed to
* just x{1} and transit to x{2 3 4 10 11}.
*
equation standard x
# x{1}
equation transit x
# x{2 3 4 10 11}
frml(equation=standard,vector=phi1) phi1f
frml(equation=transit ,vector=phi2) phi2f
*
* The new specification is re-estimated with just the regression
* parameters, then including the STAR parameters.
*
* Because the PHI1 and PHI2 get re-dimensioned when we use them for
* the new equations, the previous values are lost (and should be,
* since the role of each element has likely changed).
*
compute phi1=%const(0.0),phi2=%const(0.0)
nlls(parmset=regparms,frml=star) x
nlls(parmset=regparms+starparms,frml=star) x
*
* Analysis of roots of the polynomials
*
compute %eqnsetcoeffs(standard,phi1)
compute %eqnsetcoeffs(transit,phi2)
compute upper=standard+transit
*
* The equation <<upper>> is created by adding the right hand sides of
* the standard and transit polynomials.
*
@LagPolyRoots(title="Above Threshold Polynomial") %eqnlagpoly(upper,x)
@LagPolyRoots(title="Below Threshold Polynomial") %eqnlagpoly(standard,x)
*
* To forecast the non-linear equation, we need to create a MODEL with
* the one formula. This uses SFORECAST for the results for all
* forecasting calculations, so we need to pull information out that we
* need as we do forecasts.
*
group starmodel star>>sforecast
*
* Compute eventual forecasting function starting in 1925:1.
*
forecast(model=starmodel,steps=40,from=1925:1)
set eforecast 1925:1 1964:1 = sforecast
*
* Computes forecasts for 40 steps beginning in 1925 by taking
* the mean of simulated values over 10000 replications.
*
set meanf 1925:1 1964:1 = 0.0
compute nreps=10000
do reps=1,nreps
   simulate(model=starmodel,from=1925:1,to=1964:1)
   set meanf 1925:1 1964:1 = meanf+sforecast
end do reps
set meanf 1925:1 1964:1 = meanf/nreps
*
* Graphs the eventual forecast, the mean forecast and the last few years
* of actual data. While the process has a fairly regular cycle of
* roughly eight years, shocks will shift the timing enough that even 16
* years out, the probability is fairly high that it will be in a
* somewhat difference phase than it is today. The mean forecasts thus
* eventually converge to the series mean.
*
graph(footer="Forecasts of Lynx Data with LSTAR Model",$
key=loright,klabels=||"Eventual Forecast","Mean Forecast","Actual"||) 3
# eforecast
# meanf
# x 1920:1 1934:1

Output

Minimum AIC Lags for X = 11

Test for STAR in series XC

AR length 11

Delay 1

Test F-stat Signif

Linearity 0.6000230 0.9422

H01 0.7482027 0.6894

H02 0.3578074 0.9677

H03 0.7942389 0.6449

H12 0.5199714 0.9561

Test for STAR in series XC

AR length 11

Delay 2

Test F-stat Signif

Linearity 1.6378355 0.0497

H01 1.6909968 0.0904

H02 1.0677802 0.3996

H03 1.7896804 0.0768

H12 1.3872683 0.1528

Test for STAR in series XC

AR length 11

Delay 3

Test F-stat Signif

Linearity 3.2698979 0.0000

H01 2.8129117 0.0038

H02 2.3600164 0.0154

H03 2.6299182 0.0085

H12 2.8494745 0.0005

Test for STAR in series XC

AR length 11

Delay 4

Test F-stat Signif

Linearity 1.7361651 0.0327

H01 2.0885541 0.0305

H02 2.1629503 0.0266

H03 0.7822202 0.6562

H12 2.2927380 0.0048

Test for STAR in series XC

AR length 11

Delay 5

Test F-stat Signif

Linearity 0.7985486 0.7546

H01 0.6202687 0.8064

H02 1.0299721 0.4303

H03 0.7964754 0.6427

H12 0.8263985 0.6838

Test for STAR in series XC

AR length 11

Delay 6

Test F-stat Signif

Linearity 0.7689718 0.7904

H01 0.4673629 0.9180

H02 0.9030428 0.5421

H03 0.9529686 0.4983

H12 0.6820875 0.8416

Test for STAR in series XC

AR length 11

Delay 7

Test F-stat Signif

Linearity 1.2684351 0.2111

H01 0.8725809 0.5700

H02 0.8339016 0.6071

H03 1.8816904 0.0608

H12 0.8432770 0.6635

Test for STAR in series XC

AR length 11

Delay 8

Test F-stat Signif

Linearity 2.0161557 0.0096

H01 1.9272337 0.0477

H02 2.6315051 0.0072

H03 1.0319591 0.4311

H12 2.4955395 0.0021

Test for STAR in series XC

AR length 11

Delay 9

Test F-stat Signif

Linearity 1.3656524 0.1478

H01 1.6031711 0.1137

H02 1.1468952 0.3401

H03 1.2178682 0.2962

H12 1.3912237 0.1507

Statistics on Series X

Annual Data From 1821:01 To 1934:01

Observations 114

Sample Mean 2.903664 Variance 0.311820

Standard Error 0.558409 SE of Sample Mean 0.052300

t-Statistic (Mean=0) 55.519635 Signif Level (Mean=0) 0.000000

Skewness -0.366835 Signif Level (Sk=0) 0.114579

Kurtosis (excess) -0.712265 Signif Level (Ku=0) 0.132317

Jarque-Bera 4.966567 Signif Level (JB=0) 0.083469

Nonlinear Least Squares - Estimation by Gauss-Newton

Convergence in 2 Iterations. Final criterion was 0.0000000 <= 0.0000100

Dependent Variable X

Annual Data From 1821:01 To 1934:01

Usable Observations 103

Degrees of Freedom 79

Skipped/Missing (from 114) 11

Centered R^2 0.9257581

R-Bar^2 0.9041434

Uncentered R^2 0.9972979

Mean of Dependent Variable 2.8792489604

Std Error of Dependent Variable 0.5623054270

Standard Error of Estimate 0.1740937871

Sum of Squared Residuals 2.3943830910

Regression F(23,79) 42.8300

Significance Level of F 0.0000000

Log Likelihood 47.5719

Durbin-Watson Statistic 1.9907

Variable Coeff Std Error T-Stat Signif

************************************************************************************

1. PHI1(1) 0.025848022 0.571011089 0.04527 0.96400863

2. PHI1(2) 0.668548522 0.188857089 3.53997 0.00067421

3. PHI1(3) 0.461125960 0.253163707 1.82145 0.07232338

4. PHI1(4) 0.184581904 0.275494992 0.67000 0.50481095

5. PHI1(5) -0.146845860 0.246691273 -0.59526 0.55336995

6. PHI1(6) -0.134058075 0.271670047 -0.49346 0.62305724

7. PHI1(7) 0.235631261 0.303784102 0.77565 0.44026931

8. PHI1(8) -0.378882763 0.317690902 -1.19261 0.23659023

9. PHI1(9) 0.419678375 0.290162662 1.44636 0.15203351

10. PHI1(10) 0.119632018 0.277481971 0.43113 0.66754483

11. PHI1(11) -0.273928141 0.277206746 -0.98817 0.32608503

12. PHI1(12) 0.023564897 0.203698002 0.11569 0.90819539

13. PHI2(1) -1.326538749 1.022810359 -1.29695 0.19842194

14. PHI2(2) 0.837059737 0.279862106 2.99097 0.00370775

15. PHI2(3) -1.958728913 0.425255810 -4.60600 0.00001549

16. PHI2(4) 1.550658982 0.467651084 3.31585 0.00138189

17. PHI2(5) -0.490453856 0.481571508 -1.01844 0.31157564

18. PHI2(6) 0.670541627 0.487627945 1.37511 0.17298492

19. PHI2(7) -0.691039847 0.499937491 -1.38225 0.17078984

20. PHI2(8) 0.671961428 0.494547981 1.35874 0.17809637

21. PHI2(9) -0.762655514 0.444937542 -1.71407 0.09043647

22. PHI2(10) -0.029660324 0.411261158 -0.07212 0.94268836

23. PHI2(11) 0.840380789 0.411635874 2.04156 0.04453302

24. PHI2(12) -0.565566352 0.297407611 -1.90165 0.06086207

Nonlinear Least Squares - Estimation by Gauss-Newton

Convergence in 2 Iterations. Final criterion was 0.0000000 <= 0.0000100

Dependent Variable X

Annual Data From 1821:01 To 1934:01

Usable Observations 103

Degrees of Freedom 97

Skipped/Missing (from 114) 11

Centered R^2 0.9046906

R-Bar^2 0.8997777

Uncentered R^2 0.9965312

Mean of Dependent Variable 2.8792489604

Std Error of Dependent Variable 0.5623054270

Standard Error of Estimate 0.1780141068

Sum of Squared Residuals 3.0738351553

Log Likelihood 34.7072

Durbin-Watson Statistic 1.8866

Variable Coeff Std Error T-Stat Signif

************************************************************************************

1. PHI1(1) 1.145884054 0.012609219 90.87669 0.00000000

2. PHI2(1) -1.037808666 0.152655886 -6.79835 0.00000000

3. PHI2(2) 1.259477172 0.260895937 4.82751 0.00000515

4. PHI2(3) -0.478089521 0.159566498 -2.99618 0.00347076

5. PHI2(4) 0.465555041 0.085563753 5.44103 0.00000040

6. PHI2(5) -0.484199596 0.106478002 -4.54741 0.00001570

Nonlinear Least Squares - Estimation by Gauss-Newton

Convergence in 7 Iterations. Final criterion was 0.0000027 <= 0.0000100

Dependent Variable X

Annual Data From 1821:01 To 1934:01

Usable Observations 103

Degrees of Freedom 95

Skipped/Missing (from 114) 11

Centered R^2 0.9071181

R-Bar^2 0.9002742

Uncentered R^2 0.9966195

Mean of Dependent Variable 2.8792489604

Std Error of Dependent Variable 0.5623054270

Standard Error of Estimate 0.1775726200

Sum of Squared Residuals 2.9955433592

Log Likelihood 36.0359

Durbin-Watson Statistic 1.9825

Variable Coeff Std Error T-Stat Signif

************************************************************************************

1. PHI1(1) 1.163062313 0.025474575 45.65581 0.00000000

2. PHI2(1) -0.920600414 0.161069778 -5.71554 0.00000013

3. PHI2(2) 1.057903296 0.266360439 3.97170 0.00013882

4. PHI2(3) -0.388772553 0.149415742 -2.60195 0.01075313

5. PHI2(4) 0.473615100 0.075976680 6.23369 0.00000001

6. PHI2(5) -0.485992795 0.097163992 -5.00178 0.00000260

7. GAMMA 2.099670641 0.614235570 3.41835 0.00092942

8. C 2.752266708 0.093469416 29.44564 0.00000000

Above Threshold Polynomial

Real Imag Modulus Period

-0.246 -1.010 1.039

-0.246 1.010 1.039 3.471

0.230 -0.933 0.961

0.230 0.933 0.961 4.726

0.944 -0.095 0.948

0.944 0.095 0.948 62.535

0.727 -0.559 0.917

0.727 0.559 0.917 9.590

-0.654 0.585 0.878 2.605

-0.654 -0.585 0.878

-0.836 -0.000 0.836

Below Threshold Polynomial

Real Imag Modulus Period

1.163 0.000 1.163

Graph