SSTATS( options ) start end (pairs of) expression>>result

SSTATS computes the sum, product, mean, maximum, minimum, or a quantile of a range of values from a series or formula. You supply an expression providing the values on which the computation is based and a target variable (or array element) for the result. A single SSTATS can handle several expression>>result pairs ("queries"), but the type of calculation (sum, product, etc.) is the same for all of them. If you need, for instance, both a MAX calculation and a MIN calculation, use separate SSTATS instructions.

STATISTICS and TABLE provide other ways to get statistical information, but SSTATS works directly on an expression, so you don't need to create a separate series for analysis. Note that SSTATS displays no output directly; it's up to you to use or display the values in the result variables that it generates.

Parameters

start, end
range of entries to use. This defaults to the standard workspace, but you will generally need to set it explicitly to get the proper range. Unlike many RATS instructions, SSTATS does not automatically drop entries for which the expression returns a missing value; use the SMPL option to exclude them.

expression
a variable or formula

>>result
the (REAL) variable into which the computed result will be stored

Options (both)

SMPL=standard SMPL option [not used]

You can supply a series or a formula that can be evaluated across entry numbers. Entries for which the series or formula is zero or “false” will be omitted from the calculations.
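As a minimal sketch of the SMPL option, the following filters out a missing value before summing. The series X, the simulated data, and the result names RAWSUM and CLEANSUM are hypothetical; %VALID is assumed here to return 1 for a non-missing value and 0 otherwise.

```rats
* Sketch only: X is a hypothetical series with a missing value at entry 10
all 100
set x 1 100 = %ran(1.0)
compute x(10) = %na
* Without SMPL, the missing entry is not excluded automatically
sstats 1 100 x>>rawsum
* SMPL=%VALID(X) keeps only the entries where X is not missing
sstats(smpl=%valid(x)) 1 100 x>>cleansum
disp "Unfiltered sum" rawsum "Filtered sum" cleansum "Observations used" %nobs
```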

MEAN

PRODUCT

MAXIMUM

MINIMUM

FRAC=desired fractile (quantile) [not used]

Use one of these (mutually exclusive) options to select the statistic you want to compute. Use FRAC=.50 for the median, and similarly for any other percentile. If you don’t use any of the options, SSTATS computes the sum.
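A short sketch of the FRAC option follows. The series X and the result names XMEDIAN and X95 are hypothetical. Because the statistic is the same for every query on a single SSTATS, each quantile gets its own instruction.

```rats
* Sketch only: X is a hypothetical simulated series
all 100
set x 1 100 = %ran(1.0)
* FRAC=.50 requests the median
sstats(frac=.50) 1 100 x>>xmedian
* A different quantile needs a separate SSTATS
sstats(frac=.95) 1 100 x>>x95
disp "Median" xmedian "95th percentile" x95
```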

WEIGHT=standard WEIGHT option [not used]

Use this option if you want to provide different weights for each observation.

STARTUP=FRML evaluated at period "start"

You can use the STARTUP option to provide an expression which is computed once per function evaluation, before the regular formula is computed. This allows you to do any time-consuming calculations that don’t depend upon time. It can be an expression of any type.

[PANEL]/NOPANEL

When working with panel data, NOPANEL disables the special panel data treatment of expressions which cross individual boundaries.

Variables Defined

%NOBS
Number of observations (INTEGER)

%MAXENT
Entry number of maximum value (if MAXIMUM) (INTEGER)

%MINENT
Entry number of minimum value (if MINIMUM) (INTEGER)
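The sketch below shows one way these variables might be used. The series X and the result names XMAX and XMIN are hypothetical. MAXIMUM and MINIMUM are mutually exclusive, so each needs its own SSTATS, and the entry number is displayed right after the instruction that sets it.

```rats
* Sketch only: X is a hypothetical simulated series
all 100
set x 1 100 = %ran(1.0)
* Maximum value and the entry at which it occurs
sstats(max) 1 100 x>>xmax
disp "Maximum" xmax "at entry" %maxent
* Minimum value and the entry at which it occurs
sstats(min) 1 100 x>>xmin
disp "Minimum" xmin "at entry" %minent
```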

Examples

This is a common use for SSTATS: comparing a series of statistics with a critical value and computing the fraction that exceed it. STATS>%CDSTAT is 1 if the value of STATS(T) is bigger than %CDSTAT and 0 otherwise, so its mean over the sample is the fraction of entries for which that is true. The resulting value (called PVALUE) is an empirical significance level.

sstats(mean) 1 ndraws (stats>%cdstat)>>pvalue

disp "Bootstrapped p-value" pvalue

In the next example, the first instruction finds the maximum value of the ICOUNT series, putting the result into PT. The second takes the sum (the default statistic) of the indicator ICOUNT>0, which puts the number of entries where ICOUNT is positive into NT.

sstats(max) starti endi icount>>pt

sstats starti endi (icount>0)>>nt

This puts the sum of squares of the first lag of XSERIES into the variable SUMYSQ. The range starts at STARTL+1, so the first lagged value used is XSERIES(STARTL).

sstats startl+1 endl xseries{1}^2>>sumysq

This sums the squares of the DY series into SDY and the squares of the UT series into SQY.

sstats start end dy^2>>sdy ut^2>>sqy

sstat(smpl=yesvm==0) / 1>>nos testvm<.5>>nnnos prfitted<.5>>prbnos

sstat(smpl=yesvm==1) / 1>>yes testvm>.5>>nnyes prfitted>.5>>prbyes

These two instructions use / to take the default range. The first acts only on the entries where YESVM is 0. NOS will be the count of those observations (the sum of the value 1), NNNOS is the number of observations where TESTVM is less than .5 (the sum of the logical expression TESTVM<.5), and PRBNOS is the number where PRFITTED is less than .5. The second acts only on the entries where YESVM is 1, computing an analogous set of count variables.