How are fractiles computed?

davidkelley
Posts: 1
Joined: Thu Apr 28, 2016 5:13 pm


How exactly are fractiles calculated? I'm trying to replicate elsewhere some data preparation work done in RATS, but I can't reproduce its computation of percentiles. I had generally thought that a percentile had to be an observation in the dataset, but it seems as though that isn't the case. For example, the following code says that the 25th percentile of a time series running from 1 to 6 is 2.25.

allocate 6
set test = 1
do j=2,6
     set(scratch) test j j = test(j-1) + 1
end do j
print / test
statistics(fractiles) test
Thank you for your help.
TomDoan
Posts: 7814
Joined: Wed Nov 01, 2006 4:36 pm

Re: How are fractiles computed?

First off, you could just use

set test 1 6 = t
stats(fractiles) test

in place of what you did.

There are actually several defensible ways to compute quantiles when there isn't an "obvious" value. However, it is certainly not true that a quantile has to be an actual data point. The median of {1,2,3,4,5,6} is typically considered to be 3.5. Theoretically any number in (3,4) could be considered a median, since 50% of the observations are above it and 50% below, but the average of the two in the middle seems to be the most obvious choice.

To better demonstrate the calculation in general, consider something like {1,4,9,16,25,36}. To compute the 25%-ile, RATS computes the position of the 25%-ile as .25*(6-1)+1=2.25. That means the value is between observations 2 and 3, .25 of the distance from observation 2 to observation 3, so 4+.25*(9-4)=5.25. In your example the same position of 2.25 falls between the values 2 and 3, which gives 2+.25*(3-2)=2.25.

There are alternative formulas which (in effect) act as if -inf and +inf are added to the data set. Those give the same result for the median, and the two methods converge as n-->inf, but they can come up with very different results for small data sets, particularly for quantiles near either 0 or 1.
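For anyone replicating this outside RATS, the position rule described above can be sketched in a few lines of Python. This is only an illustration of the p*(n-1)+1 interpolation as described in this post, not RATS source code; the function name `fractile` and its argument names are made up for the example.

```python
import math


def fractile(data, p):
    """Quantile by linear interpolation at 1-based position p*(n-1)+1.

    Hypothetical helper illustrating the rule described in the post;
    `p` is a fraction between 0 and 1 (0.25 for the 25%-ile).
    """
    if not data:
        raise ValueError("data must be non-empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")

    xs = sorted(data)               # order statistics
    n = len(xs)
    pos = p * (n - 1) + 1           # 1-based (possibly fractional) position
    lower = math.floor(pos)         # observation just at or below the position
    frac = pos - lower              # fraction of the way to the next observation

    if lower >= n:                  # p == 1: the position is the last observation
        return float(xs[-1])
    # Interpolate between observation `lower` and observation `lower + 1`
    # (xs is 0-based, hence the index shift).
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])


print(fractile([1, 4, 9, 16, 25, 36], 0.25))  # the squares example
print(fractile([1, 2, 3, 4, 5, 6], 0.25))     # the series from the question
print(fractile([1, 2, 3, 4, 5, 6], 0.50))     # the median
```

With the two data sets from this thread, the sketch gives 5.25 for the squares and 2.25 for the series 1 to 6, matching the hand calculation. The same position rule is, as far as I know, what NumPy's `percentile` uses under its default "linear" method, so that may be the easiest way to match these numbers in Python.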