Outliers

For questions and discussion related to reading in and working with data.
sachin
Posts: 7
Joined: Fri May 08, 2009 11:53 am

Outliers

I have monthly time-series data for two variables: room rate and hotel occupancy. I need to find the outliers in these series. I would appreciate help with code for detecting outliers.
TomDoan
Posts: 7814
Joined: Wed Nov 01, 2006 4:36 pm

Re: Outliers

In what model? For a linear regression, many methods have been proposed for detecting outliers. The Baltagi textbook example baltp193.rpf computes five of them. Most of these are refinements of the simpler |e|/sigma criterion, computing a different sigma for each data point rather than just using the regression sigma. For an ARIMA model, the BOXJENK instruction has automatic outlier detection, which is implemented by testing the effect of adding various dummies to the model.
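
To make the |e|/sigma idea concrete, here is a minimal sketch in Python (not RATS code, and not the contents of baltp193.rpf). It fits a simple regression to made-up data, computes |e|/sigma using the single regression sigma, and then computes one common refinement, the internally studentized residual, which rescales each residual by sqrt(1 - h_i), where h_i is that point's leverage. The data, the threshold of 2.0, and all variable names are assumptions chosen for illustration only.

```python
from math import sqrt
from statistics import mean

# Made-up illustrative data: y = 10 + 2x with small wiggles,
# plus an artificial shock of +8 at observation index 6.
x = [float(i) for i in range(12)]
y = [10.0 + 2.0 * xi + (0.1 if i % 2 == 0 else -0.1) for i, xi in enumerate(x)]
y[6] += 8.0

# Ordinary least squares for y = a + b*x.
n = len(x)
xbar, ybar = mean(x), mean(y)
sxx = sum((xi - xbar) ** 2 for xi in x)
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
a = ybar - b * xbar
resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Regression sigma: one value shared by every observation.
sigma = sqrt(sum(e * e for e in resid) / (n - 2))

# Simple criterion: |e| / sigma.
simple = [abs(e) / sigma for e in resid]

# Refinement: scale each residual by its own standard error,
# sigma * sqrt(1 - h_i), with h_i the leverage in a simple regression.
leverage = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]
studentized = [abs(e) / (sigma * sqrt(1.0 - h)) for e, h in zip(resid, leverage)]

THRESHOLD = 2.0  # assumed cutoff, not a recommendation
flagged_simple = [i for i, s in enumerate(simple) if s > THRESHOLD]
flagged_studentized = [i for i, s in enumerate(studentized) if s > THRESHOLD]

print("flagged by |e|/sigma:", flagged_simple)
print("flagged by studentized residual:", flagged_studentized)
```

With this made-up data, both criteria flag only the shocked observation. On real data the two can disagree, particularly for high-leverage points at the ends of the X range.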
sachin
Posts: 7
Joined: Fri May 08, 2009 11:53 am

Re: Outliers

My idea is to find the outliers (events) before choosing the model. In the monthly data, I just want to identify which months are outliers. I do not want to eliminate them, just to see which months appear as outliers.
TomDoan
Posts: 7814
Joined: Wed Nov 01, 2006 4:36 pm

Re: Outliers

However, an outlier is specific to a model. If you have Y,X pairs

Y X
0 0
1 1
2 2
50 50

looking at the Y values in isolation (in effect thinking of a "model" in which the Y's are i.i.d. N(mu,sigma^2)), the 4th observation appears to be an outlier. In the model Y=a+bX, it isn't; it's right on the regression line along with everyone else. Even if you assume i.i.d. data, what would be seen as an outlier in Normally distributed data might be perfectly reasonable for a fat-tailed distribution like a Cauchy.
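
The four-point example above can be checked numerically. The sketch below is in Python, not RATS, and the screening rule for the "Y in isolation" case is an assumption: it uses a median/MAD score with a cutoff of 3.5, because with only four observations an ordinary mean and standard deviation z-score cannot get large enough to flag anything. The function names are hypothetical.

```python
from statistics import mean, median

# The Y,X pairs from the post.
y = [0.0, 1.0, 2.0, 50.0]
x = [0.0, 1.0, 2.0, 50.0]


def robust_scores(values):
    """Median/MAD scores, 0.6745 * |v - median| / MAD (assumed screening rule)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; this score is undefined for the data")
    return [0.6745 * abs(v - med) / mad for v in values]


def ols_residuals(xs, ys):
    """Residuals from the least-squares fit of y = a + b*x."""
    xbar, ybar = mean(xs), mean(ys)
    sxx = sum((xi - xbar) ** 2 for xi in xs)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(xs, ys))
    b = sxy / sxx
    a = ybar - b * xbar
    return [yi - (a + b * xi) for xi, yi in zip(xs, ys)]


# "Model" 1: Y values looked at in isolation.
flagged_iid = [i for i, s in enumerate(robust_scores(y)) if s > 3.5]

# Model 2: Y = a + bX. Every point sits on the fitted line.
resid = ols_residuals(x, y)
flagged_reg = [i for i, e in enumerate(resid) if abs(e) > 1e-9]

print("flagged looking at Y alone:", flagged_iid)
print("flagged in Y = a + bX:", flagged_reg)
```

Looking at Y alone, the 4th observation (index 3) is flagged. Under Y=a+bX the residuals are all zero and nothing is flagged, which is the point of the example: the same data point is or is not an outlier depending on the model.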