How are fractiles computed?

**davidkelley** » Fri Apr 29, 2016 8:21 am

How exactly are fractiles calculated? I'm trying to replicate some data preparation work from RATS in other software, but I can't reproduce the computation of percentiles. I've generally thought that a percentile had to be an observation in the dataset, but it seems as though that isn't the case. For example, the following code says that the 25th percentile of a series containing the values 1 through 6 is 2.25.

allocate 6
set test = 1
do j=2,6
     set(scratch) test j j = test(j-1) + 1
end do j
print / test
statistics(fractiles) test

Thank you for your help.

**TomDoan** » Fri Apr 29, 2016 10:27 am

First off, you could just use

set test 1 6 = t
stats(fractiles) test

in place of what you did.

There are actually several defensible ways to compute quantiles when there isn't an "obvious" value. However, it is certainly not true that a quantile has to be an actual data point. The median of {1,2,3,4,5,6} is typically considered to be 3.5. Theoretically any number in (3,4) could be considered a median, since 50% of the observations are above it and 50% below, but the average of the two in the middle seems to be the most obvious choice.

To better demonstrate the calculation in general, consider something like {1,4,9,16,25,36}. To compute the 25%-ile, RATS computes the position of the 25%-ile as .25*(6-1)+1=2.25, where 6 is the number of observations. That means the value is between observations 2 and 3, .25 of the distance from observation 2 to observation 3, so 4+.25*(9-4)=5.25. (For your 1 through 6 series, the same position gives 2+.25*(3-2)=2.25, which is the value you saw.)

There are alternative formulas which (in effect) act as if -inf and +inf are added to the data set. Those give the same result for the median, and the two methods converge as n-->inf, but they can come up with very different results for small data sets, particularly for quantiles near either 0 or 1.
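For anyone replicating this outside RATS, the position-and-interpolate rule described above can be sketched in Python roughly as follows. This is only an illustration of the arithmetic in this thread, not RATS source code. The function names `interpolate_at`, `fractile_inclusive` and `fractile_exclusive` are made up for the example. The second function is an assumption about what the "alternative formulas" look like: it uses the position p*(n+1), which is what you get by padding the sorted data with one extra point at each end, and clamps the result to the observed range.

```python
def interpolate_at(xs, pos):
    """Value at a 1-based, possibly fractional position in sorted list xs."""
    n = len(xs)
    lower = int(pos)        # observation just at or below the position
    frac = pos - lower      # fraction of the way to the next observation
    if lower >= n:          # position is the last observation
        return xs[-1]
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])


def fractile_inclusive(data, p):
    """Quantile using position p*(n-1)+1, the rule described in this thread."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    xs = sorted(data)
    return interpolate_at(xs, p * (len(xs) - 1) + 1)


def fractile_exclusive(data, p):
    """ASSUMED alternative: position p*(n+1), clamped to the range [1, n]."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    xs = sorted(data)
    n = len(xs)
    pos = min(max(p * (n + 1), 1.0), float(n))
    return interpolate_at(xs, pos)


squares = [1, 4, 9, 16, 25, 36]
print(fractile_inclusive(squares, 0.25))        # 5.25, as in the worked example
print(fractile_inclusive(range(1, 7), 0.25))    # 2.25, the value in the question
print(fractile_exclusive(squares, 0.25))        # 3.25, noticeably different
print(fractile_inclusive(squares, 0.5), fractile_exclusive(squares, 0.5))  # 12.5 12.5
```

With only six observations the two rules agree on the median but give 5.25 versus 3.25 for the 25%-ile, which is the small-sample disagreement mentioned above. When comparing against other software, it is worth checking which positioning rule that package uses by default before assuming the data differ.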

The RATS Software Forum
