Examples / TARMODELS.RPF |
TARMODELS.RPF is an example of estimation and analysis of a STAR (Smooth Transition Autoregression) model. It is based on example 3 from Terasvirta(1994). The Canadian lynx population is a standard data set in the non-linear time series literature. The population fluctuates because a high population causes depletion of their food source (lynx are a small wild cat), causing a rapid die-off due to starvation. When the population is low, it can only be rebuilt through reproduction. Thus the up cycles and down cycles are driven by completely different mechanisms, suggesting that a threshold effect is likely.
The @ARAutoLags procedure does a quick, efficient examination of a range of AR models for stationary data. Because this is for a standard AR with no threshold effects, it might not give a good estimate of the lag length in the presence of a threshold (it would probably overshoot, so you might need to correct later by pruning unneeded lags). Here it picks 11 as the minimum AIC.
@yulelags(max=20) x
The delay for the STAR is chosen using applications of the @STARTest procedure which does an LM test on various functions of the lagged dependent variable. Standard practice is to pick the delay by choosing the one giving the most significant test statistic. The tests are done on the demeaned data using 11 lags and up to a 9 period delay. D=3 is by far the strongest test (most significant rejection of linearity).
diff(center) x / xc
do d=1,9
@StarTest(p=11,d=d) xc
end do d
This initially fixes the C and GAMMA at the mean and two times the reciprocal of the standard error, respectively. The logistic exponent is rescaled by 1/s as shown in the article, so GAMMA is, with that parameterization, put at 2. This gives a fairly sharp split between larger and smaller values, so the two branches should have decidedly different coefficients if there is evidence of STAR. With C and GAMMA fixed, NLLS is actually just OLS.
stats x
compute scalef=1.0/sqrt(%variance)
nonlin(parmset=starparms) gamma c
frml glstar = %logistic(scalef*gamma*(x{3}-c),1.0)
compute c=%mean, gamma=2.0
These are the two branches. Putting these in as linear equations makes specification more flexible—you can easily change the lag lists and the rest goes through as is.
equation standard x
# constant x{1 to 11}
equation transit x
# constant x{1 to 11}
These convert the linear equations into FRMLs with PHI1 and PHI2 as the coefficient vectors. Put them together with the transition function to make the STAR formula.
frml(equation=standard,vector=phi1) phi1f
frml(equation=transit ,vector=phi2) phi2f
frml star x = g=glstar,phi1f+g*phi2f
nonlin(parmset=regparms) phi1 phi2
nonlin(parmset=starparms) gamma c
Estimate the model with the STARPARMS left out.
nlls(parmset=regparms,frml=star) x
Based upon the initial results, the standard equation is trimmed to just X{1} and TRANSIT to X{2 3 4 10 11}.
equation standard x
# x{1}
equation transit x
# x{2 3 4 10 11}
frml(equation=standard,vector=phi1) phi1f
frml(equation=transit ,vector=phi2) phi2f
The new specification is re-estimated with just the regression parameters (output), then including the STAR parameters (output).
nlls(parmset=regparms,frml=star) x
nlls(parmset=regparms+starparms,frml=star) x
This next analyzes the roots of the polynomials (output) using @LagPolyRoots. The UPPER polynomial (for the part above the threshold) is obtained by adding the STANDARD and TRANSIT polynomials. Note that neither one of these will likely describe the cycles of the threshold process, as they just show the roots of the each polynomial if it were governing the entire process. However, the whole point of the threshold is that each subprocess drives the series towards the threshold to shift the process to the other branch. (Note that each subprocess is actually somewhat non-stationary, but it will turn out that the overall process is stationary).
compute %eqnsetcoeffs(standard,phi1)
compute %eqnsetcoeffs(transit,phi2)
compute upper=standard+transit
@LagPolyRoots(title="Above Threshold Polynomial") %eqnlagpoly(upper,x)
@LagPolyRoots(title="Below Threshold Polynomial") %eqnlagpoly(standard,x)
To forecast the non-linear equation, we need to create a MODEL with the one formula. This uses SFORECAST for the results for all forecasting calculations, so as we do different types of forecasts, we will need to copy the information out of that each time.
group starmodel star>>sforecast
Compute eventual forecasting function starting in 1925:1.
forecast(model=starmodel,steps=40,from=1925:1)
set eforecast 1925:1 1964:1 = sforecast
Computes forecasts for 40 steps beginning in 1925 by taking the mean of simulated values over 10000 replications.
set meanf 1925:1 1964:1 = 0.0
compute nreps=10000
do reps=1,nreps
simulate(model=starmodel,from=1925:1,to=1964:1)
set meanf 1925:1 1964:1 = meanf+sforecast
end do reps
set meanf 1925:1 1964:1 = meanf/nreps
Graphs the eventual forecast, the mean forecast and the last few years of actual data. While the process has a fairly regular cycle of roughly eight years, shocks will shift the timing enough that even 16 years out, the probability is fairly high that it will be in a somewhat different phase than it is today. The mean forecasts thus eventually converge to the series mean.
graph(footer="Forecasts of Lynx Data with LSTAR Model",$
key=loright,klabels=$
||"Eventual Forecast","Mean Forecast","Actual"||) 3
# eforecast
# meanf
# x 1920:1 1934:1
Full Program
cal(a) 1821
open data lynx.dat
data(org=cols) 1821:1 1934:1 lynx
set x = log(lynx)/log(10)
*
* The @ARAutoLags procedure does a quick, efficient examination of a range
* of AR models for stationary data. Because this is for a standard AR,
* with no threshold effects, it might not give a good estimate of the
* lag length in the presence of a threshold.
*
@arautolags(max=20) x
*
* Standard practice is to pick the delay by choosing the one which gives
* the most significant test statistic. The tests are done on the
* demeaned data using 11 lags and up to a 9 period delay.
*
diff(center) x / xc
do d=1,9
@StarTest(p=11,d=d) xc
end do d
*
* This initially fixes the c and gamma at the mean and two times the
* reciprocal of the standard error, respectively. (The logistic exponent
* is rescaled by 1/sigma as shown in the article), so gamma is, with
* that parameterization, put at 2. This gives a fairly sharp split
* between larger and smaller values, so two branches should have
* decidedly different coefficients if there is evidence of STAR. With c
* and gamma fixed, NLLS is actually just OLS.
*
stats x
*
* In practice, you would use the following. The author used a rounded value.
*
compute scalef=1.0/sqrt(%variance)
nonlin(parmset=starparms) gamma c
frml glstar = %logistic(scalef*gamma*(x{3}-c),1.0)
compute c=%mean,gamma=2.0
*
* These are the two branches. Putting these in as linear equations makes
* specification more flexible---you can easily change the lag lists and
* the rest goes through as is.
*
equation standard x
# constant x{1 to 11}
equation transit x
# constant x{1 to 11}
*
* Convert the linear equations into FRML's with phi1 and phi2 as the
* coefficient vectors. Put them together with the transition function to
* make the STAR formula.
*
frml(equation=standard,vector=phi1) phi1f
frml(equation=transit ,vector=phi2) phi2f
frml star x = g=glstar,phi1f+g*phi2f
*
nonlin(parmset=regparms) phi1 phi2
nonlin(parmset=starparms) gamma c
*
* Estimate the model with the STARPARMS left out.
*
compute phi1=%const(0.0),phi2=%const(0.0)
nlls(parmset=regparms,frml=star) x
*
* Based upon the initial results, the standard equation is trimmed to
* just x{1} and transit to x{2 3 4 10 11}.
*
equation standard x
# x{1}
equation transit x
# x{2 3 4 10 11}
frml(equation=standard,vector=phi1) phi1f
frml(equation=transit ,vector=phi2) phi2f
*
* The new specification is re-estimated with just the regression
* parameters, then including the STAR parameters.
*
* Because the PHI1 and PHI2 get re-dimensioned when we use them for
* the new equations, the previous values are lost (and should be,
* since the role of each element has likely changed).
*
compute phi1=%const(0.0),phi2=%const(0.0)
nlls(parmset=regparms,frml=star) x
nlls(parmset=regparms+starparms,frml=star) x
*
* Analysis of roots of the polynomials
*
compute %eqnsetcoeffs(standard,phi1)
compute %eqnsetcoeffs(transit,phi2)
compute upper=standard+transit
*
* The equation <<upper>> is created by adding the right hand sides of
* the standard and transit polynomials.
*
@LagPolyRoots(title="Above Threshold Polynomial") %eqnlagpoly(upper,x)
@LagPolyRoots(title="Below Threshold Polynomial") %eqnlagpoly(standard,x)
*
* To forecast the non-linear equation, we need to create a MODEL with
* the one formula. This uses SFORECAST for the results for all
* forecasting calculations, so we need to pull information out that we
* need as we do forecasts.
*
group starmodel star>>sforecast
*
* Compute eventual forecasting function starting in 1925:1.
*
forecast(model=starmodel,steps=40,from=1925:1)
set eforecast 1925:1 1964:1 = sforecast
*
* Computes forecasts for 40 steps beginning in 1925 by taking
* the mean of simulated values over 10000 replications.
*
set meanf 1925:1 1964:1 = 0.0
compute nreps=10000
do reps=1,nreps
simulate(model=starmodel,from=1925:1,to=1964:1)
set meanf 1925:1 1964:1 = meanf+sforecast
end do reps
set meanf 1925:1 1964:1 = meanf/nreps
*
* Graphs the eventual forecast, the mean forecast and the last few years
* of actual data. While the process has a fairly regular cycle of
* roughly eight years, shocks will shift the timing enough that even 16
* years out, the probability is fairly high that it will be in a
* somewhat difference phase than it is today. The mean forecasts thus
* eventually converge to the series mean.
*
graph(footer="Forecasts of Lynx Data with LSTAR Model",$
key=loright,klabels=||"Eventual Forecast","Mean Forecast","Actual"||) 3
# eforecast
# meanf
# x 1920:1 1934:1
Output
Test for STAR in series XC
AR length 11
Delay 1
Test F-stat Signif
Linearity 0.6000230 0.9422
H01 0.7482027 0.6894
H02 0.3578074 0.9677
H03 0.7942389 0.6449
H12 0.5199714 0.9561
Test for STAR in series XC
AR length 11
Delay 2
Test F-stat Signif
Linearity 1.6378355 0.0497
H01 1.6909968 0.0904
H02 1.0677802 0.3996
H03 1.7896804 0.0768
H12 1.3872683 0.1528
Test for STAR in series XC
AR length 11
Delay 3
Test F-stat Signif
Linearity 3.2698979 0.0000
H01 2.8129117 0.0038
H02 2.3600164 0.0154
H03 2.6299182 0.0085
H12 2.8494745 0.0005
Test for STAR in series XC
AR length 11
Delay 4
Test F-stat Signif
Linearity 1.7361651 0.0327
H01 2.0885541 0.0305
H02 2.1629503 0.0266
H03 0.7822202 0.6562
H12 2.2927380 0.0048
Test for STAR in series XC
AR length 11
Delay 5
Test F-stat Signif
Linearity 0.7985486 0.7546
H01 0.6202687 0.8064
H02 1.0299721 0.4303
H03 0.7964754 0.6427
H12 0.8263985 0.6838
Test for STAR in series XC
AR length 11
Delay 6
Test F-stat Signif
Linearity 0.7689718 0.7904
H01 0.4673629 0.9180
H02 0.9030428 0.5421
H03 0.9529686 0.4983
H12 0.6820875 0.8416
Test for STAR in series XC
AR length 11
Delay 7
Test F-stat Signif
Linearity 1.2684351 0.2111
H01 0.8725809 0.5700
H02 0.8339016 0.6071
H03 1.8816904 0.0608
H12 0.8432770 0.6635
Test for STAR in series XC
AR length 11
Delay 8
Test F-stat Signif
Linearity 2.0161557 0.0096
H01 1.9272337 0.0477
H02 2.6315051 0.0072
H03 1.0319591 0.4311
H12 2.4955395 0.0021
Test for STAR in series XC
AR length 11
Delay 9
Test F-stat Signif
Linearity 1.3656524 0.1478
H01 1.6031711 0.1137
H02 1.1468952 0.3401
H03 1.2178682 0.2962
H12 1.3912237 0.1507
Statistics on Series X
Annual Data From 1821:01 To 1934:01
Observations 114
Sample Mean 2.903664 Variance 0.311820
Standard Error 0.558409 SE of Sample Mean 0.052300
t-Statistic (Mean=0) 55.519635 Signif Level (Mean=0) 0.000000
Skewness -0.366835 Signif Level (Sk=0) 0.114579
Kurtosis (excess) -0.712265 Signif Level (Ku=0) 0.132317
Jarque-Bera 4.966567 Signif Level (JB=0) 0.083469
Nonlinear Least Squares - Estimation by Gauss-Newton
Convergence in 2 Iterations. Final criterion was 0.0000000 <= 0.0000100
Dependent Variable X
Annual Data From 1821:01 To 1934:01
Usable Observations 103
Degrees of Freedom 79
Skipped/Missing (from 114) 11
Centered R^2 0.9257581
R-Bar^2 0.9041434
Uncentered R^2 0.9972979
Mean of Dependent Variable 2.8792489604
Std Error of Dependent Variable 0.5623054270
Standard Error of Estimate 0.1740937871
Sum of Squared Residuals 2.3943830910
Regression F(23,79) 42.8300
Significance Level of F 0.0000000
Log Likelihood 47.5719
Durbin-Watson Statistic 1.9907
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. PHI1(1) 0.025848022 0.571011089 0.04527 0.96400863
2. PHI1(2) 0.668548522 0.188857089 3.53997 0.00067421
3. PHI1(3) 0.461125960 0.253163707 1.82145 0.07232338
4. PHI1(4) 0.184581904 0.275494992 0.67000 0.50481095
5. PHI1(5) -0.146845860 0.246691273 -0.59526 0.55336995
6. PHI1(6) -0.134058075 0.271670047 -0.49346 0.62305724
7. PHI1(7) 0.235631261 0.303784102 0.77565 0.44026931
8. PHI1(8) -0.378882763 0.317690902 -1.19261 0.23659023
9. PHI1(9) 0.419678375 0.290162662 1.44636 0.15203351
10. PHI1(10) 0.119632018 0.277481971 0.43113 0.66754483
11. PHI1(11) -0.273928141 0.277206746 -0.98817 0.32608503
12. PHI1(12) 0.023564897 0.203698002 0.11569 0.90819539
13. PHI2(1) -1.326538749 1.022810359 -1.29695 0.19842194
14. PHI2(2) 0.837059737 0.279862106 2.99097 0.00370775
15. PHI2(3) -1.958728913 0.425255810 -4.60600 0.00001549
16. PHI2(4) 1.550658982 0.467651084 3.31585 0.00138189
17. PHI2(5) -0.490453856 0.481571508 -1.01844 0.31157564
18. PHI2(6) 0.670541627 0.487627945 1.37511 0.17298492
19. PHI2(7) -0.691039847 0.499937491 -1.38225 0.17078984
20. PHI2(8) 0.671961428 0.494547981 1.35874 0.17809637
21. PHI2(9) -0.762655514 0.444937542 -1.71407 0.09043647
22. PHI2(10) -0.029660324 0.411261158 -0.07212 0.94268836
23. PHI2(11) 0.840380789 0.411635874 2.04156 0.04453302
24. PHI2(12) -0.565566352 0.297407611 -1.90165 0.06086207
Nonlinear Least Squares - Estimation by Gauss-Newton
Convergence in 2 Iterations. Final criterion was 0.0000000 <= 0.0000100
Dependent Variable X
Annual Data From 1821:01 To 1934:01
Usable Observations 103
Degrees of Freedom 97
Skipped/Missing (from 114) 11
Centered R^2 0.9046906
R-Bar^2 0.8997777
Uncentered R^2 0.9965312
Mean of Dependent Variable 2.8792489604
Std Error of Dependent Variable 0.5623054270
Standard Error of Estimate 0.1780141068
Sum of Squared Residuals 3.0738351553
Log Likelihood 34.7072
Durbin-Watson Statistic 1.8866
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. PHI1(1) 1.145884054 0.012609219 90.87669 0.00000000
2. PHI2(1) -1.037808666 0.152655886 -6.79835 0.00000000
3. PHI2(2) 1.259477172 0.260895937 4.82751 0.00000515
4. PHI2(3) -0.478089521 0.159566498 -2.99618 0.00347076
5. PHI2(4) 0.465555041 0.085563753 5.44103 0.00000040
6. PHI2(5) -0.484199596 0.106478002 -4.54741 0.00001570
Nonlinear Least Squares - Estimation by Gauss-Newton
Convergence in 7 Iterations. Final criterion was 0.0000027 <= 0.0000100
Dependent Variable X
Annual Data From 1821:01 To 1934:01
Usable Observations 103
Degrees of Freedom 95
Skipped/Missing (from 114) 11
Centered R^2 0.9071181
R-Bar^2 0.9002742
Uncentered R^2 0.9966195
Mean of Dependent Variable 2.8792489604
Std Error of Dependent Variable 0.5623054270
Standard Error of Estimate 0.1775726200
Sum of Squared Residuals 2.9955433592
Log Likelihood 36.0359
Durbin-Watson Statistic 1.9825
Variable Coeff Std Error T-Stat Signif
************************************************************************************
1. PHI1(1) 1.163062313 0.025474575 45.65581 0.00000000
2. PHI2(1) -0.920600414 0.161069778 -5.71554 0.00000013
3. PHI2(2) 1.057903296 0.266360439 3.97170 0.00013882
4. PHI2(3) -0.388772553 0.149415742 -2.60195 0.01075313
5. PHI2(4) 0.473615100 0.075976680 6.23369 0.00000001
6. PHI2(5) -0.485992795 0.097163992 -5.00178 0.00000260
7. GAMMA 2.099670641 0.614235570 3.41835 0.00092942
8. C 2.752266708 0.093469416 29.44564 0.00000000
Real Imag Modulus Period
-0.246 -1.010 1.039
-0.246 1.010 1.039 3.471
0.230 -0.933 0.961
0.230 0.933 0.961 4.726
0.944 -0.095 0.948
0.944 0.095 0.948 62.535
0.727 -0.559 0.917
0.727 0.559 0.917 9.590
-0.654 0.585 0.878 2.605
-0.654 -0.585 0.878
-0.836 -0.000 0.836
Below Threshold Polynomial
Real Imag Modulus Period
1.163 0.000 1.163
Graph
Copyright © 2025 Thomas A. Doan