The following are the formulas used for the different information criteria. The first formula column applies to anything which generates a log likelihood (log L is the sample value of that), while the second can be used for models estimated by least squares. The two give identical orderings of models under the assumption of a Normal likelihood.

k is the number of estimated parameters (or regressors) and T is the number of observations. It's also possible (and in many ways desirable) to "standardize" these by dividing the expression by T, which gives the statistic a more manageable scale while not changing the relative ordering of models. That's what the @REGCRITS procedure does.
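As a quick illustration that dividing by T preserves the ranking, here is a small Python sketch using the least squares form of the AIC. The model names, parameter counts, and residual variances are hypothetical values chosen for the example.

```python
import math

def aic(T, k, sigma2):
    # Least-squares form of the AIC: T*log(sigma^2) + k*2
    return T * math.log(sigma2) + 2 * k

T = 100
# Hypothetical competing models: name -> (k, residual variance)
models = {"A": (3, 0.90), "B": (5, 0.85)}

raw = {m: aic(T, k, s2) for m, (k, s2) in models.items()}
scaled = {m: v / T for m, v in raw.items()}  # standardized, as @REGCRITS does

# The preferred (minimizing) model is the same under either scale
assert min(raw, key=raw.get) == min(scaled, key=scaled.get)
```

Dividing every model's criterion by the same positive constant T is a monotone transformation, so the minimizer cannot change.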

You're looking for the model that minimizes the chosen criterion. Note that you should always use the same sample range for all models considered; if you don't, you can bias your decision in favor of one model or another based mainly upon the number of data points used.

Note also that you need to be careful that the values of "k" are comparable. For instance, the IC formulas for least squares models usually don't count the variance as part of k (the variance is concentrated out in least squares), which is fine if you're comparing least squares models with each other. However, if you're comparing a least squares model with (for instance) a GARCH model, the variance needs to be included in the parameter count for the regression, since the GARCH model explicitly includes parameters to describe the variance. The RATS variable %NFREE (which is used by @REGCRITS) includes the variance (or covariances for multivariate models) concentrated out in the estimation, so @REGCRITS will give you comparable parameter counts if you're using different general model types.
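To make the point about parameter counts concrete, here is a hypothetical Python sketch. The sample size, coefficient count, and residual variance are made up for illustration; the point is only how the count k changes when the variance must be included.

```python
import math

def aic_ls(T, k, sigma2):
    # Least-squares AIC. When comparing against a model (like GARCH) that
    # explicitly estimates variance parameters, k here should include the
    # regression's variance as well, not just the regressors.
    return T * math.log(sigma2) + 2 * k

T = 500
sigma2 = 1.10
# Hypothetical regression with 3 coefficients; the variance is concentrated
# out. Counting only the regressors gives k = 3; a count comparable to a
# GARCH model's (what %NFREE reports in RATS) adds 1 for the variance.
k_regressors = 3
k_comparable = 3 + 1

print(aic_ls(T, k_regressors, sigma2))  # not comparable to a GARCH AIC
print(aic_ls(T, k_comparable, sigma2))  # comparable parameter count
```

Each extra parameter adds exactly 2 to the AIC, so the two counts differ by a constant penalty; what matters is that every model in the comparison is counted on the same basis.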

| Criterion | Log Likelihood | Least Squares |
| --- | --- | --- |
| AIC (Akaike Information Criterion) | \( - 2\log L + k \times 2\) | \(T\log \,{{\hat \sigma }^2} + k \times 2\) |
| SBC (Schwarz Bayesian Criterion, or Bayesian Information Criterion) | \( - 2\log L + k \times \log T\) | \(T\log \,{{\hat \sigma }^2} + k \times \log T\) |
| HQ (Hannan-Quinn) | \( - 2\log L + k \times 2\log \left( {\log T} \right)\) | \(T\log \,{{\hat \sigma }^2} + k \times 2\log \left( {\log T} \right)\) |
| FPE (log) (Final Prediction Error) | \( - 2\log L + T\log \left( {\frac{{T + k}}{{T - k}}} \right)\) | \(T\log \,{{\hat \sigma }^2} + T\log \left( {\frac{{T + k}}{{T - k}}} \right)\) |
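The least squares column of the table can be sketched in Python as a single function; the sample size, parameter count, and residual variance below are hypothetical inputs for illustration.

```python
import math

def info_criteria(T, k, sigma2):
    """Least-squares forms of the four criteria (smaller is better)."""
    base = T * math.log(sigma2)  # T*log(sigma-hat^2), common to all four
    return {
        "AIC": base + 2 * k,
        "SBC": base + k * math.log(T),
        "HQ":  base + 2 * k * math.log(math.log(T)),
        "FPE": base + T * math.log((T + k) / (T - k)),
    }

# Hypothetical comparison: two models fit on the same sample of T = 200
print(info_criteria(200, 4, 1.25))
print(info_criteria(200, 8, 1.20))
```

For moderate T, the penalty per parameter is smallest for the AIC and largest for the SBC (log T > 2 log log T > 2 once T is large enough), which is why the SBC tends to pick smaller models than the AIC on the same data.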