RATS 10.1


Simplex algorithm (general optimization)


The simplex algorithm (METHOD/PMETHOD=SIMPLEX) is a derivative-free optimization method which can be applied assuming nothing more than continuity of the objective function. In order to operate with so little restriction on the behavior of the function, the simplex algorithm is more a collection of "rules" that seem to work in practice than a formal algorithm.

 

Instead of trying to "climb" the hill directly, it crawls upward primarily by determining which directions aren't correct. In a K-dimensional space, it uses a set of K+1 vertices; in 2-space, for instance, these are the vertices of a triangle. On each pass through the algorithm, a replacement is sought for the worst of the vertices. An obvious guess is that the function will improve if we move through the face opposite the worst vertex. The test is whether that new point is better than the one we're trying to replace, not whether it's better than all the other points. If we have an improvement over the worst point, that old one is removed from the simplex collection and the new one added. If the new point is worse, then it seems likely that the optimum may already be surrounded, so a test point is chosen in the interior. This process of replacing the worst of the vertices continues until the entire simplex has shrunk down enough that the difference between the best and worst vertices satisfies the convergence criterion.
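The worst-vertex replacement rule described above can be sketched as follows. This is an illustrative minimization sketch, not the RATS implementation: full simplex codes (such as the Nelder–Mead variant) add expansion and shrink steps that are omitted here for brevity.

```python
def simplex_step(f, simplex):
    """One worst-vertex replacement: reflect the worst vertex through the
    opposite face; if that fails, try an interior (contracted) point.
    A sketch only -- the actual RATS rules are more elaborate."""
    simplex.sort(key=f)              # order vertices best (lowest f) to worst
    worst = simplex[-1]
    n = len(worst)
    # Centroid of the face opposite the worst vertex.
    centroid = [sum(v[i] for v in simplex[:-1]) / (len(simplex) - 1)
                for i in range(n)]
    # Reflect the worst point through that face.
    reflected = [2.0 * centroid[i] - worst[i] for i in range(n)]
    if f(reflected) < f(worst):
        simplex[-1] = reflected      # better than the worst point: accept
    else:
        # The optimum may be surrounded: test a point in the interior.
        contracted = [0.5 * (centroid[i] + worst[i]) for i in range(n)]
        if f(contracted) < f(worst):
            simplex[-1] = contracted
    return simplex
```

For a well-behaved function such as f(x,y) = x² + y², repeating this step drags the triangle of vertices toward the minimum at the origin, shrinking it as it goes.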

 

Unlike more direct "climbing" methods, this has to drag all K+1 vertices up the hill, rather than a single point, so it's less efficient for well-behaved problems. However, even with functions that are twice-continuously differentiable, the simplex algorithm can be useful as a preliminary optimization method, as it can work even if the likelihood surface has the "wrong" curvature at the guess values (that is, it's not concave). In effect, the preliminary simplex iterations help to refine the guess values. To get preliminary simplex iterations, use the combination of the options PMETHOD=SIMPLEX and PITERS=number of preliminary "iterations". What counts as an "iteration" here is 2K simplex moves, which makes the number of function evaluations roughly comparable to one actual iteration of the other methods.

 

We often see users overdo the number of preliminary iterations. Usually 5 to 10 is enough, and rarely will you need more than 20. PMETHOD=SIMPLEX,PITERS=200 does quite a bit of calculation and won't really get you much farther along than the same with PITERS=20.

 

Note, however, that the simplex algorithm assumes that you are starting on the right "hill", so if you have a surface with multiple modes, it is likely to find only the mode that is uphill from the guess values. Some of the other derivative-free methods can deal with even more poorly-behaved functions, though at a cost in computational time.

Applicability

RATS uses the simplex method (exclusively) to estimate the parameters in ESMOOTH. While ESMOOTH generates models which can be estimated by non-linear least squares, the Gauss–Newton algorithm that applies to such problems does not deal well with the unstable regions in the parameter space.

 

You can choose to use the simplex method for estimation with FIND (where it's the default method) and with BOXJENK, CVMODEL, DLM, GARCH, MAXIMIZE, NLLS and NLSYSTEM (for all of which it's an option, typically used as a PMETHOD to refine the guess values).

 

Control Parameters

The simplex method has a tendency to get stuck in situations where the vertices are too nearly collinear to provide information about the shape of the surface. To prevent this, RATS perturbs the vertices every 30 function evaluations; how often this is done is controlled by the JIGGLE option on NLPAR. For more information, see, for instance, Press et al. (2007).
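The idea behind the perturbation can be sketched as follows. The exact scheme RATS uses is not documented here; this sketch simply jitters every vertex except the best one by a small amount proportional to each coordinate's magnitude, knocking near-collinear vertices back into general position so the simplex again spans the full space.

```python
import random

def jiggle(simplex, scale=1e-3, rng=random.Random(0)):
    """Perturb all vertices except the first (assumed best).  Illustrative
    only -- the actual RATS perturbation rule may differ."""
    best, rest = simplex[0], simplex[1:]
    jiggled = [[x + scale * (abs(x) + 1.0) * rng.uniform(-1.0, 1.0)
                for x in v]
               for v in rest]
    return [best] + jiggled
```

Applied to a degenerate simplex such as [(0,0), (1,1), (2,2)], whose vertices all lie on one line, the result is three vertices that once again form a proper (if very thin) triangle.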

 


Copyright © 2025 Thomas A. Doan