Non-Parametric Regression and Density Estimation
DENSITY does kernel density estimation. It can estimate the density function and, for the differentiable kernels, the derivative of the density.
NPREG does non-parametric regression of one series on another. For the kernel-based (Nadaraya-Watson) regressions, its design is quite similar to DENSITY. There’s also an option for LOWESS (LOcally WEighted Scatterplot Smoother), which isn’t used as much in econometrics.
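As a rough sketch of a basic kernel regression: the series names Y, X, XGRID and YFIT below are hypothetical, and the parameter order (dependent series, explanatory series, range, grid, fitted values) is assumed to mirror the layout of DENSITY, so check it against the NPREG instruction description before relying on it.

```rats
* Assumed layout: npreg y x start end grid fit
* Y, X, XGRID and YFIT are hypothetical series names
npreg y x / xgrid yfit
* Graph the fitted regression function against the grid
scatter(style=lines,footer="Nonparametric Fit of Y on X")
# xgrid yfit
```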
Both NPREG and DENSITY have a GRID=INPUT/AUTOMATIC option. When you’re using them just to get a graph of the shape, GRID=AUTOMATIC usually works fine. Use GRID=INPUT when you need either to control the grid more closely or to evaluate at existing data values so the output can be used as part of a more involved calculation. The “grid” doesn’t really have to be a grid: there’s no computational reason for it to be organized in any particular way.
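A minimal sketch of GRID=INPUT, evaluating the density at the observed data points rather than on a generated grid. INTER and NDRAWS are the series and count from the SHILLERGIBBS.RPF fragment in the Examples section; FDATA is a hypothetical name for the output series.

```rats
* With GRID=INPUT, the "grid" series is supplied rather than generated.
* Here the data series INTER itself is used as the grid, so FDATA
* (hypothetical name) holds the estimated density at each observed value.
density(grid=input) inter 1 ndraws inter fdata
```

Because the estimates line up entry by entry with the data, FDATA can then be used directly in later calculations.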
Both instructions have Wizards: Statistics—Nonparametric Regression and Statistics—(Kernel) Density Estimation, which can be rather handy since the instructions themselves are somewhat complicated. Note in particular that each wizard includes a "Create Graph" checkbox, which adds a SCATTER instruction to display the density or the x-y regression.
The graph instructions will generally have to be edited somewhat to add a header or footer, but this can still be a time-saver if you want the graphical output.
Bandwidths
The default bandwidth on either instruction is:
\begin{equation} (0.79\,IQR)\,N^{-1/5} \end{equation}
where \(IQR\) is the interquartile range (the difference between the 75th and 25th percentiles) and \(N\) is the number of observations.
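As a worked illustration with hypothetical values (not taken from any example file): if \(IQR = 2\) and \(N = 1024\), then \(N^{-1/5} = 1/4\), so the default bandwidth is
\begin{equation} (0.79 \times 2) \times 1024^{-1/5} = 1.58 \times 0.25 = 0.395 \end{equation}
With the same \(IQR\) and \(N = 32768\), \(N^{-1/5} = 1/8\) and the bandwidth is \(0.1975\). The bandwidth therefore shrinks quite slowly with the sample size: here, 32 times as many observations only cuts it in half.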
This has certain optimality properties in larger samples (see the discussion in Pagan and Ullah, 1999), but you might find it to be too small in many applications. You can use the SMOOTHING option to adjust this up or down. The bandwidth is multiplied by the value given on that option, so SMOOTHING=1.0 (the default) gives the default bandwidth, while SMOOTHING=2.0 doubles the default. The bandwidth used can be obtained after the instruction from the variable %EBW.
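A short sketch of adjusting and then checking the bandwidth, reusing the series names from the SHILLERGIBBS.RPF fragment in the Examples section (INTER, GINTER, FINTER and the count NDRAWS are assumed to be defined already):

```rats
* Double the default bandwidth for a smoother estimate
density(smoothing=2.0) inter 1 ndraws ginter finter
* %EBW holds the bandwidth that was actually used
display "Bandwidth used =" %ebw
```

Running the instruction once with the default and once with a different SMOOTHING value, and comparing the graphs, is a simple way to judge whether the default is too small for your data.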
Examples
By far the most common use of DENSITY is graphing densities from simulations. These are usually quite straightforward, since you can let DENSITY set up the grid, as is done in this fragment from SHILLERGIBBS.RPF, where GINTER is the grid and FINTER is the corresponding estimated density.
density(smoothing=1.5) inter 1 ndraws ginter finter
scatter(style=lines,footer="Posterior for Intercept")
# ginter finter
Example file GARCHSEMIPARAM.RPF estimates a GARCH model using a non-parametric estimate for the density function of the standardized residuals (rather than using a conventional Normal or t).
The example file NPREG.RPF is taken from Pagan and Ullah (1999), pp. 248-249. It does a correction for heteroscedasticity using a non-parametric function of population for the variance rather than a simple power function.
Example program ADAPTIVE.RPF is also taken from Pagan and Ullah (1999), pp. 250-251. It does an adaptive kernel estimator, which adjusts the covariance matrix of a linear regression for an empirically estimated residual density function. This requires both the density function and the derivative of the density.
Copyright © 2025 Thomas A. Doan