DENSITY Instruction
DENSITY( options ) series start end grid density
DENSITY estimates the density function for a series of data. This can be done using kernel methods or by binning and counting for a histogram.
Wizard
The Statistics—(Kernel) Density Wizard provides a dialog-driven interface to the DENSITY instruction.
Parameters
series
series for which you want to compute the density function
start, end
range to use. Defaults to the defined range of series.
grid
(input or output) series of points at which the density is estimated
density
(output) series for the estimated density corresponding to the grid points. The grid and density series will be defined from entry 1 until the number of points in the grid. How the grid is set depends upon the GRID and MAXGRID options.
Options
TYPE=[EPANECHNIKOV]/TRIANGULAR/GAUSSIAN/LOGISTIC/FLAT/PARZEN/HISTOGRAM
COUNTS/[NOCOUNTS]
Determines the type of density function that will be estimated. If you use HISTOGRAM, the counts for each "bin" are normally divided by the number of data points times the width of the bin to return a density estimate. Use COUNTS if you just want the counts for each bin.
BANDWIDTH=kernel bandwidth
The default for BANDWIDTH is:
\(0.79\,N^{-1/5}\,IQR\)
where IQR is the interquartile range of the series and N is the number of data points. This has some optimality properties, but in practice it often seems to be too narrow. You can use the SMOOTHING option to increase (or decrease) it without needing to know its value, which depends upon the data.
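The default bandwidth rule can be checked by hand. The following is a minimal Python sketch of the formula above, not RATS code and not RATS's internal implementation. The function name `default_bandwidth` is hypothetical, the quartile interpolation rule is an assumption (RATS may compute the quartiles differently), and it assumes that SMOOTHING simply multiplies the default bandwidth.

```python
import statistics

def default_bandwidth(data, smoothing=1.0):
    """Rule-of-thumb bandwidth 0.79 * N^(-1/5) * IQR, scaled by `smoothing`."""
    n = len(data)
    # statistics.quantiles(n=4) returns the three quartile cut points.
    # The "inclusive" interpolation rule is an assumption made for this sketch.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: the smoothing factor multiplies the bandwidth directly.
    return smoothing * 0.79 * n ** (-1 / 5) * iqr

sample = [float(i) for i in range(1, 102)]        # 1, 2, ..., 101 (IQR = 50)
h = default_bandwidth(sample)                     # default bandwidth
h_smooth = default_bandwidth(sample, smoothing=1.5)  # 50% wider
```

Under these assumptions, a smoothing factor of 1.5 gives a bandwidth exactly 1.5 times the default, whatever the data.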
GRID=[AUTOMATIC]/INPUT
If AUTOMATIC, the grid series runs in equal steps from the 1%-ile to the 99%-ile of the input series. If INPUT, you fill in the grid series with whatever values you want prior to using the DENSITY instruction.
MAXGRID=number of grid points [100 except for TYPE=HISTOGRAM]
If GRID=AUTOMATIC (the default), MAXGRID gives the number of equally spaced points at which the density is estimated.
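As an illustration of what an automatic grid looks like, here is a small Python sketch (not RATS code) that builds equally spaced points between the 1st and 99th percentiles. The function name `automatic_grid` is hypothetical, and the percentile interpolation rule is an assumption, so the end points may differ slightly from the ones RATS computes.

```python
def automatic_grid(data, maxgrid=100):
    """Equally spaced grid from the 1st to the 99th percentile (maxgrid >= 2)."""
    xs = sorted(data)

    def percentile(p):
        # Linear interpolation between order statistics (assumed rule).
        pos = p / 100 * (len(xs) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(xs) - 1)
        return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

    low, high = percentile(1), percentile(99)
    step = (high - low) / (maxgrid - 1)
    return [low + i * step for i in range(maxgrid)]

# Data 0, 1, ..., 1000: the grid runs from about 10 to about 990.
grid = automatic_grid([float(i) for i in range(0, 1001)], maxgrid=100)
```

Because the grid stops at the 1st and 99th percentiles, the extreme tails of the data still contribute to the estimate but are not themselves grid points.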
WEIGHT=series of weights for the data points [not used]
Use this option if you want to provide different weights for each observation.
DERIVATIVE=(output) series of estimated derivatives [not used]
This saves the estimated derivatives of the density function into a series. This matches up, point for point, with the grid and density series. It requires TYPE=GAUSSIAN, as the other kernel types aren’t differentiable.
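One way to see what the derivative series represents is to differentiate the kernel estimator from the Description section with respect to u. For the Gaussian kernel, K'(v) = -v K(v), so each term picks up a factor of -v and an extra 1/h. The Python sketch below (not RATS code) works this out for unit weights. The function names are hypothetical, and it is an assumption that RATS uses this analytic form.

```python
import math

def gaussian_kernel(v):
    return math.exp(-v * v / 2) / math.sqrt(2 * math.pi)

def density_and_derivative(data, grid, h):
    """Gaussian kernel density and its analytic derivative at each grid point."""
    n = len(data)
    dens, deriv = [], []
    for u in grid:
        vs = [(u - x) / h for x in data]
        dens.append(sum(gaussian_kernel(v) for v in vs) / (n * h))
        # K'(v) = -v K(v); the chain rule contributes another factor of 1/h.
        deriv.append(sum(-v * gaussian_kernel(v) for v in vs) / (n * h * h))
    return dens, deriv

data = [-1.0, 0.0, 0.5, 2.0]
dens, deriv = density_and_derivative(data, [0.25], h=0.8)
```

A quick sanity check on any such calculation is to compare it with a finite difference of the density at two nearby grid points.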
SMOOTHING=smoothing scale factor [1]
You can supply a real value (bigger than 0) to adjust the amount of smoothing. Use a value bigger than 1 for more smoothing than the default, or a value less than 1 for less smoothing.
SMPL=standard SMPL option [unused]
[PRINT]/NOPRINT
If PRINT, DENSITY produces a table of grid values and the estimated density at each point.
Description
For types other than HISTOGRAM, DENSITY estimates the density function for a series of data x, by computing at each point u in the grid:
\(\hat f\left( u \right) = \frac{\sum\limits_{t = 1}^T w_t K\left( \frac{u - x_t}{h} \right)}{h \sum\limits_{t = 1}^T w_t}\)
where K is the kernel function, h is the bandwidth, and the \(w_t\) are the weights, which, by default, are 1 for all t. The kernel types take the following forms:
EPANECHNIKOV
\(K\left( v \right) = 0.75\left( 1 - v^2 \right)\) if \(\left| v \right| \le 1\), 0 otherwise
TRIANGULAR
\(K\left( v \right) = 1 - \left| v \right|\) if \(\left| v \right| \le 1\), 0 otherwise
GAUSSIAN
\(K\left( v \right) = \frac{1}{\sqrt{2\pi}} \exp\left( \frac{-v^2}{2} \right)\)
LOGISTIC
\(K\left( v \right) = \frac{e^v}{\left( 1 + e^v \right)^2}\)
FLAT
\(K\left( v \right) = 0.5\) if \(\left| v \right| \le 1\), 0 otherwise
PARZEN
\(K\left( v \right) = 4/3 - 8v^2 + 8\left| v \right|^3\) if \(\left| v \right| \le 0.5\), \(8\left( 1 - \left| v \right| \right)^3/3\) if \(0.5 \le \left| v \right| \le 1\), 0 otherwise
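The estimator and the kernel table can be transcribed almost directly into code. The following Python sketch (not RATS code, and not a description of how RATS implements DENSITY) evaluates the weighted kernel sum at each grid point. The names `KERNELS`, `parzen` and `kernel_density` are hypothetical. The PARZEN branch uses the standard form 8(1 - |v|)^3/3 on 0.5 <= |v| <= 1, which makes the kernel continuous and integrate to 1.

```python
import math

def parzen(v):
    a = abs(v)
    if a <= 0.5:
        return 4 / 3 - 8 * a * a + 8 * a ** 3
    if a <= 1:
        return 8 * (1 - a) ** 3 / 3
    return 0.0

KERNELS = {
    "EPANECHNIKOV": lambda v: 0.75 * (1 - v * v) if abs(v) <= 1 else 0.0,
    "TRIANGULAR": lambda v: 1 - abs(v) if abs(v) <= 1 else 0.0,
    "GAUSSIAN": lambda v: math.exp(-v * v / 2) / math.sqrt(2 * math.pi),
    # e^v / (1 + e^v)^2 is symmetric in v; using -|v| avoids overflow.
    "LOGISTIC": lambda v: math.exp(-abs(v)) / (1 + math.exp(-abs(v))) ** 2,
    "FLAT": lambda v: 0.5 if abs(v) <= 1 else 0.0,
    "PARZEN": parzen,
}

def kernel_density(data, grid, h, kernel="EPANECHNIKOV", weights=None):
    """Weighted kernel density estimate at each point u of `grid`."""
    k = KERNELS[kernel]
    w = weights if weights is not None else [1.0] * len(data)
    total = sum(w)
    return [
        sum(wt * k((u - x) / h) for x, wt in zip(data, w)) / (h * total)
        for u in grid
    ]

# A single observation at 0 with h = 2: the estimate at u = 0 is K(0) / 2.
est = kernel_density([0.0], [0.0, 1.0], h=2.0, kernel="GAUSSIAN")
```

Since the weights appear in both the numerator and the denominator, rescaling all of them by the same constant leaves the estimate unchanged.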
As you increase the bandwidth, you will get a smoother estimated density function, but you will be less able to detect sharp features. A smaller bandwidth leads to a more ragged estimated density function, but sharp features, such as a truncation at one end, will be more apparent.
For TYPE=HISTOGRAM, the grid becomes a series of “bins” centered at each grid point. DENSITY counts the number of data points which fall in each bin. If you use COUNTS, these raw counts will be the values returned. Otherwise, the counts are divided by the number of data points times the bin width to produce an estimate of the density.
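The binning calculation can be sketched as follows in Python (not RATS code). This is a simplified illustration that assumes an equally spaced grid, so that the bin width equals the grid step, and half-open bins [center - width/2, center + width/2). The function name `histogram_density` is hypothetical, and how RATS treats unequal spacing or points exactly on a bin edge is not covered here.

```python
import math

def histogram_density(data, grid, counts=False):
    """Counts or density for bins centered on an equally spaced grid."""
    width = grid[1] - grid[0]          # assumes equal spacing
    left_edge = grid[0] - width / 2
    tallies = [0] * len(grid)
    for x in data:
        idx = math.floor((x - left_edge) / width)
        if 0 <= idx < len(grid):       # points outside all bins are ignored
            tallies[idx] += 1
    if counts:
        return tallies
    n = len(data)
    # Density: count divided by (number of data points * bin width).
    return [c / (n * width) for c in tallies]

grid = [5.0, 15.0, 25.0]               # bins 0-10, 10-20, 20-30
data = [1.0, 2.0, 12.0, 24.0, 26.0, 29.0]
raw = histogram_density(data, grid, counts=True)   # [2, 1, 3]
dens = histogram_density(data, grid)
```

With the density scaling, the bar heights times the bin width sum to 1 when every data point falls inside some bin, which is what makes the result comparable to a kernel estimate.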
Variables Defined
%EBW
the computed bandwidth (REAL)
Examples
This computes and graphs density functions for three sets of statistics generated by a simulation process. Each uses an automatic grid with the default 100 grid points.
density(smoothing=1.5) inter 1 ndraws ginter finter
density(smoothing=1.5) coeff1 1 ndraws gcoeff1 fcoeff1
density(smoothing=1.5) sums 1 ndraws gsums fsums
scatter(style=lines,window="Posterior for Intercept")
# ginter finter
scatter(style=lines,window="Posterior for Lag 0")
# gcoeff1 fcoeff1
scatter(style=lines,window="Posterior for Sum")
# gsums fsums
This computes a heavily smoothed density function across an input grid created using @GRIDSERIES.
@gridseries(from=-100,to=+100,n=400) xsr
density(smoothing=3.0,grid=input) sacratios 1 nboot xsr fsr
scatter(style=lines,header="Frequency Distribution for the Cecchetti Model")
# xsr fsr
This creates two density functions: the first uses an automatically generated grid, and the second reuses that grid (as GRID=INPUT) so the two estimates are evaluated at the same points.
density(grid=automatic,maxgrid=100,smoothing=1.5) b2draws / bx fxn
scatter(style=line,vmin=0.0,$
footer="Figure 2.2 Distribution of b2 in the Monte Carlo experiment")
# bx fxn
density(grid=input,maxgrid=100,smoothing=1.5) b2drawsln / bx fxln
This computes a histogram for an income series using an input set of interval midpoints which give wider intervals for the higher incomes.
data(unit=input) 1 12 gridpts
5 15 25 35 45 55 65 85 105 135 165 235
density(type=histogram,grid=input,print) income / gridpts idensity
Copyright © 2025 Thomas A. Doan