Computer Arithmetic
The standard representation for "real" numbers in statistical computations for roughly the past 40 years has been "double precision", which uses 64 bits to represent the value. The older "single precision" uses 32 bits. Before the introduction of specialized "floating point processors" (which are now incorporated into standard chip sets), both single and double precision floating point calculations had to be done using blocks of shorter integers, similar to the way that children are taught to multiply two four-digit numbers. If you compare the amount of work required to multiply two two-digit numbers by hand with the amount required for two four-digit numbers, you can see that single precision calculations were quite a bit faster, and were sometimes chosen for that reason. However, particularly when used with time series data, they were often inadequate: even a relatively short univariate autoregression on (say) log GDP wouldn't be computed accurately at single precision.
There are three principal problems produced by the way computers do arithmetic: overflow, underflow and loss of precision. The current standard for double precision numbers is called IEEE 754. This uses 1 sign bit, 11 exponent bits and 53 "significand" bits, with a leading 1 bit assumed, so only 52 actually need to be stored. The value is represented in scientific notation as (in binary, though we'll use standard notation)
(1) \((sign) \times 1.xxxxxxxxxxxxxxx \times {10^{power}}\)
The range of the power in (1) is (roughly) from -308 to +308. An overflow is a calculation which makes the power larger than +308, so there's no valid double precision representation. When this happens, RATS and most other software will treat the result as "infinity". (There are special codings set aside in the standard to represent infinity, NaN ("not a number"), "denormals" and other special values.) An underflow is a calculation which makes the power less than -308. Here the standard behavior is to call the result zero.
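As a quick illustration (the numbers here are made up purely for the demonstration), the following lines multiply two very large and two very small values. Each factor is a legitimate double precision number, but the first product overflows and the second underflows:

* the inputs are fine; the products are not
compute big   = 1.0e+200
compute small = 1.0e-200
* overflow: treated as infinity
display big*big
* underflow: result is zero
display small*small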
Most statistical calculations are able to steer clear of over- and underflow, typically by using logs rather than multiplying—the main cause of these conditions is multiplying many large or small numbers together, so adding logs avoids that. However, there are situations (particularly in Bayesian methods) where the actual probabilities or density functions are needed, and you may have to exercise some care in doing calculations in those cases.
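For example, a sample likelihood is a product of hundreds of individual density values, and such a product can underflow even when every factor is a perfectly ordinary number. The following sketch (with a made-up density value of .001 repeated 500 times) shows the direct product collapsing to zero while the sum of logs stays a manageable number:

compute product = 1.0
compute sumlog  = 0.0
do i=1,500
   compute product = product*.001
   compute sumlog  = sumlog+log(.001)
end do i
* the direct product has underflowed to zero
display product
* the sum of logs is about -3454, no problem at all
display sumlog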
A more serious problem is loss of precision. What's the value of
1.00000000000000001 − 1.0
If you're a computer doing double-precision arithmetic, it's 0, because double precision carries only about 16 significant decimal digits, not enough to give different representations to the two values on either side of the minus sign. When you subtract two large and almost equal numbers, you may lose almost all the precision in the result. Take the following situation: this is (to 10 digits) the cross product matrix of two lags of GDP from a data set. If we want to do a linear regression with those two lags as the regressors, we would need to invert this matrix.
14872.33134 14782.32952
14782.32952 14693.68392
You can compute the determinant of this using all ten digits, and then again with the values rounded to just five, with
?14872.33134*14693.68392-14782.32952^2
?14872.*14694.-14782.^2
The results are very different:
12069.82561
21644.00000
Even though the input values in the second case were accurate to five digits, the end result isn't even correct to a single digit. And this is just a 2 × 2 case.
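You can check the subtraction at the start of this discussion the same way. With only about 16 significant digits available, the entire difference is rounded away and the result is simply zero:

?1.00000000000000001-1.0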
Linear regressions in RATS and most other software are set up to avoid these types of problems where possible. For instance, specialized inversion routines are used. And while, in theory, the sum of squared residuals can be computed as
\({\bf{y}}'{\bf{y}} - {\bf{y}}'{\bf{X}}\hat \beta \)
that's "subtracting two big numbers to create a small one" that can cause precision problems. Instead, RATS computes the individual residuals and uses those to compute the sum of squares. It's slower, but more exact.
You should never assume that a floating point calculation will be sufficiently accurate to allow an exact comparison, either directly (calculation A being exactly equal to a specific value B) or by exclusion (calculation A being strictly greater than or strictly less than B, so that the exact value B is ruled out but everything on one side of it is allowed). If it is important to test for such a condition (and you should ask yourself how important it is before doing it), use a "fuzzy" comparison which will be met by values very close to B. Instead of A==1, something like A>=.9999999.and.A<=1.0000001 will trap minor deviations due to numerical imprecision, while almost certainly not misclassifying an error due to statistical randomness. An alternative is the %NOPREC(x) function, which lets the computer manage the level of roundoff error: instead of A==1, use %NOPREC(A-1), which returns "true" if A-1 meets a (somewhat machine-specific) definition of having no precision. You can also use the %ROUND(x,n) function, which rounds to a given number of decimals: in the comparison above, %ROUND(A,7)==1 will be true for any value of A that, when rounded to 7 decimals, would be 1.0.
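Here's a short sketch of those three approaches. Suppose A is produced by a calculation which should give exactly 1.0 but, because of roundoff, typically comes out a hair off of it (the expression used for A below is just a stand-in for a longer calculation):

* (0.1+0.2)/0.3 should be 1.0, but generally isn't, quite
compute a = (0.1+0.2)/0.3
* direct comparison: will usually come up false
display a==1.0
* fuzzy band around 1.0
display a>=.9999999.and.a<=1.0000001
* let the machine decide what counts as roundoff
display %noprec(a-1.0)
* round to 7 decimals before comparing
display %round(a,7)==1.0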
It's important to note that direct tests for zero are particularly troublesome. A calculation can be near zero because that's its natural value, or it can be near zero because it represents roundoff error. For instance, if you were to compute 1/(US GDP in $), you would get a value like 4.e-14: a tiny value which in most other contexts would be considered "machine zero", but which here is just the natural scale of that particular calculation. The mistake is in using a denominator of (US GDP in $) rather than (US GDP in 10^9 or 10^12 $), which would move the natural scale of the calculation many orders of magnitude away from zero. Properly sizing the scales of your variables can go a long way towards improving your ability to analyze your data.
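A brief sketch of the point, using a made-up GDP figure of 25 trillion dollars:

* the same quantity, measured in dollars and in trillions of dollars
compute gdpdollars   = 25.0e+12
compute gdptrillions = 25.0
* about 4.e-14: looks like "machine zero", but is just the natural scale
display 1.0/gdpdollars
* the same information, at a scale nowhere near zero
display 1.0/gdptrillions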