GMAUTOFIT

Questions and discussions on Time Series Analysis
ac_1
Posts: 421
Joined: Thu Apr 15, 2010 6:30 am

Hi Tom,

Shouldn't GMAUTOFIT.SRC (https://estima.com/webhelp/topics/gmaut ... edure.html) take outliers into account?

Using AUTOBOX.RPF as an example, I get a very different residual autocorrelation structure and histogram depending on whether BOXJENK is run without outlier handling or with OUTLIERS=STANDARD.

Code:

*===============================
*
* AC_AUTOBOX.RPF
*
cal(m) 1992:1
open data x12test.xls
data(format=xls,org=columns) 1992:1 2008:7 u11bvs


*===============================
* transformations

diff u11bvs / du11bvs

set ldata = log(u11bvs)
diff ldata / dldata

prin /


*===============================
spgraph(hea='u11bvs\\Levels & 1st-Difference',vfi=2,hfi=1)
graph(header="", key=upleft) 1
# u11bvs
graph(header="", key=upleft) 1
# du11bvs
spgraph(done)


spgraph(hea='ldata\\Levels & 1st-Difference',vfi=2,hfi=1)
graph(header="", key=upleft) 1
# ldata
graph(header="", key=upleft) 1
# dldata
spgraph(done)


*===============================
* automated choice for the DIFFS, SDIFFS and CONSTANT options on a BOXJENK instruction

@bjdiff(diffs=2,SDIFFS=1) ldata


*===============================
* computing and graphing the autocorrelations

* Correlogram of ldata
@bjident(diffs=%%autod,SDIFFS=%%autods) ldata


*===============================
@gmautofit(diffs=%%autod,sdiffs=%%autods,const=%%autoconst,report) ldata


dec vect[labels] methodlabl(7)
compute methodlabl(1)="const"
compute methodlabl(2)="ar"
compute methodlabl(3)="diffs"
compute methodlabl(4)="ma"
compute methodlabl(5)="sar"
compute methodlabl(6)="sdiffs"
compute methodlabl(7)="sma"
display @@^10 methodlabl

declare vector[integer] parameters(7)
compute parameters(1)=%%autoconst
compute parameters(2)=%%autop
compute parameters(3)=%%autod
compute parameters(4)=%%autoq
compute parameters(5)=%%autops
compute parameters(6)=%%autods
compute parameters(7)=%%autoqs
display @@^10 parameters


*===============================
boxjenk(diffs=%%autod,sdiffs=%%autods,const=%%autoconst,$
  ar=%%autop,sar=%%autops,ma=%%autoq,sma=%%autoqs,define=eq_D) ldata / resids

@regcrits

set fit / = ldata - resids

prin / ldata fit resids


* Actual vs fitted and residuals ARIMA-GM-BIC
spgraph(hea='ldata',vfi=1,hfi=1)
@REGACTFIT
spgraph(done)


* Residual Plots & Tests
spgraph(hfields=2,vfields=1)
   @regcorrs(header="acf plot for residuals",dfc=%narma,number=25,qstats,report,method=YULE) resids
   @histogram(header="histogram for residuals with overlay of the normal density",distrib=normal,MAXGRID=60) resids
spgraph(done)


*===============================
boxjenk(diffs=%%autod,sdiffs=%%autods,const=%%autoconst,$
  ar=%%autop,sar=%%autops,ma=%%autoq,sma=%%autoqs,define=eq_D,$
  outliers=standard) ldata / resids

@regcrits

set fit / = ldata - resids

prin / ldata fit resids


* Actual vs fitted and residuals ARIMA-GM-BIC
spgraph(hea='ldata',vfi=1,hfi=1)
@REGACTFIT
spgraph(done)


* Residual Plots & Tests
spgraph(hfields=2,vfields=1)
   @regcorrs(header="acf plot for residuals",dfc=%narma,number=25,qstats,report,method=YULE) resids
   @histogram(header="histogram for residuals with overlay of the normal density",distrib=normal,MAXGRID=60) resids
spgraph(done)

When I plot u11bvs and ldata, there appear to be no outliers visually.

Yet the two BOXJENK fits differ. The first model,

Code:

boxjenk(diffs=%%autod,sdiffs=%%autods,const=%%autoconst,$
  ar=%%autop,sar=%%autops,ma=%%autoq,sma=%%autoqs,define=eq_D) ldata / resids
* there is an outlier
* 1994:01 0.140773209247

gives a flat residual ACF but non-normal residuals,


while the second,

Code:

boxjenk(diffs=%%autod,sdiffs=%%autods,const=%%autoconst,$
  ar=%%autop,sar=%%autops,ma=%%autoq,sma=%%autoqs,define=eq_D,$
  outliers=standard) ldata / resids
* no outlier
* 1994:01 0.096017402284

gives a residual ACF that is NOT flat, but normal residuals.


So isn't GMAUTOFIT identifying the wrong model, given that outliers are not being taken into account?

In other words, the logic of the modelling process isn't correct.

In general, for any series, the main aim is a flat residual ACF structure, with normal residuals a secondary goal (as bootstrapping can handle non-normality for a simple AR model). Right?

Importantly, at what STAGE in the modelling procedure should I take outliers into account (say, if I use manual pulse dummies rather than OUTLIERS in BOXJENK)?

thanks,
Amarjit
TomDoan
Posts: 7732
Joined: Wed Nov 01, 2006 4:36 pm

Re: GMAUTOFIT

The short answer is no. And if you look at the output, the estimation with OUTLIERS doesn't find any outliers. The difference between the two estimators is that a (non-empty) OUTLIERS option forces the use of maximum likelihood, while the other estimation uses Gauss-Newton. The two aren't comparable: aside from using different objective functions, MAXL is able to use extra data points.
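
One way to put the two fits on the same footing is to request maximum likelihood for the no-outlier model as well. The sketch below assumes that the MAXL option of BOXJENK is the estimator the OUTLIERS run switches to. The series name resids_ml is illustrative and does not appear in the original program; everything else reuses the options from the posted code.

```rats
* Sketch only: estimate the no-outlier model by maximum likelihood,
* so both runs use the same objective function and the same data points.
* resids_ml is a hypothetical series name chosen for this example.
boxjenk(diffs=%%autod,sdiffs=%%autods,const=%%autoconst,$
  ar=%%autop,sar=%%autops,ma=%%autoq,sma=%%autoqs,maxl) ldata / resids_ml
*
* Same residual diagnostics as in the original program
@regcorrs(header="acf plot for ML residuals, no outliers",$
  dfc=%narma,number=25,qstats,report) resids_ml
```

If the residual ACF from this run looks like the one from the OUTLIERS=STANDARD run, the difference in the original comparison comes from the estimator rather than from any outlier adjustment.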

While ideally you want residuals to pass a test for whiteness, there is no particularly strong reason to be concerned about non-normality.
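
For reference, the usual whiteness check on ARIMA residuals is a portmanteau Q test. The formula below is the standard Ljung-Box form. It is an assumption that this is the statistic reported by the QSTATS option in the @regcorrs calls above. Here T is the number of residuals, m the number of autocorrelations tested (NUMBER=25 in the posted code), and c the degrees-of-freedom correction (DFC=%narma).

```latex
Q(m) = T\,(T+2)\sum_{k=1}^{m}\frac{\hat{\rho}_k^{\,2}}{T-k},
\qquad
Q(m)\ \overset{a}{\sim}\ \chi^2_{\,m-c}
\quad\text{under the null of white-noise residuals,}
```

where \hat{\rho}_k is the lag-k sample autocorrelation of the residuals. The statistic is built from autocorrelations alone, so it tests for serial dependence and does not test the shape of the residual distribution.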
ac_1
Posts: 421
Joined: Thu Apr 15, 2010 6:30 am

Re: GMAUTOFIT

TomDoan wrote: Fri Apr 18, 2025 7:58 am The short answer is no. And if you look at the output, the estimation with OUTLIERS doesn't find any outliers. The difference between the two estimators is that a (non-empty) OUTLIERS option forces the use of maximum likelihood, while the other estimation uses Gauss-Newton. The two aren't comparable: aside from using different objective functions, MAXL is able to use extra data points.
Yes, MAXL in BOXJENK can use extra data points.
TomDoan wrote: Fri Apr 18, 2025 7:58 am While ideally you want residuals to pass a test for whiteness, there is no particularly strong reason to be concerned about non-normality.
Why?


Also, in general, at what STAGE in the modelling procedure should I take outliers into account? In this example, plots of u11bvs and ldata show no visible outliers.
TomDoan
Posts: 7732
Joined: Wed Nov 01, 2006 4:36 pm

Re: GMAUTOFIT

ac_1 wrote: Fri Apr 18, 2025 9:49 am
TomDoan wrote: Fri Apr 18, 2025 7:58 am While ideally you want residuals to pass a test for whiteness, there is no particularly strong reason to be concerned about non-normality.
Why?
The better question is why would you expect them to BE normally distributed? There's over 40 years of work showing that the assumption of Gaussianity isn't necessary for much of anything.
ac_1 wrote: Fri Apr 18, 2025 9:49 am Also, in general, at what STAGE in the modelling procedure should I take outliers into account? In this example, plots of u11bvs and ldata show no visible outliers.
The Census Bureau procedure selects a basic time series model, uses it to identify outliers (if any), strips the outlier effects from the data, analyzes the cleaned-up data, then adds the outliers back in at the end.
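
A rough analogue of that sequence, using the instructions already in this thread, might look like the sketch below. It is not the Census Bureau implementation. It assumes that BOXJENK accepts the REGRESSORS option with a supplementary card, which is how regARIMA models are normally set up in RATS. The dummy date 1994:1 is simply the observation flagged in the original post, and pulse9401, resids_out and resids_ao are hypothetical names.

```rats
* Step 1: select a base model with no outlier handling
@gmautofit(diffs=%%autod,sdiffs=%%autods,const=%%autoconst,report) ldata
*
* Step 2: let BOXJENK scan for outliers in the selected model
boxjenk(diffs=%%autod,sdiffs=%%autods,const=%%autoconst,$
  ar=%%autop,sar=%%autops,ma=%%autoq,sma=%%autoqs,$
  outliers=standard) ldata / resids_out
*
* Step 3 (manual alternative): add a pulse dummy for a suspect date and
* estimate by maximum likelihood so the fit is comparable with step 2.
* Assumption: REGRESSORS reads the regressor list from the card below.
set pulse9401 = t==1994:1
boxjenk(diffs=%%autod,sdiffs=%%autods,const=%%autoconst,$
  ar=%%autop,sar=%%autops,ma=%%autoq,sma=%%autoqs,$
  maxl,regressors) ldata / resids_ao
# pulse9401
```

The point of the ordering is that the ARIMA orders are chosen first and outliers are assessed relative to that model, rather than by inspecting a plot of the raw series.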