Harmonic Analysis and Signal Processing Seminar

Efficiently recovering second-order models in derivative-free optimization

Katya Scheinberg

Columbia University and CIMS

Wednesday, February 17, 2010, 2:00pm, WWH 1314


Derivative-free optimization is a field of nonlinear programming that deals with optimization methods which do not require explicit computation of derivative information. In many engineering applications the objective function values are computed by a computer simulation that does not provide derivatives of the function. It is typical for these function evaluations to be comparatively costly and prone to noise. In such cases a polynomial interpolation model is often used to provide a local approximation of the objective function based on a few carefully selected sample points. To build a full quadratic interpolation (or regression) model one needs a set of sample points whose cardinality is quadratic in the dimension of the parameter space. If the dimension of the parameter space is large, it may be too costly to apply optimization on the whole space. Typically, statistical methods are used before the optimization stage to select a smaller-dimensional subspace where optimization is likely to be most beneficial; thus an a priori analysis selects the most important features for optimization. Alternatively, incomplete quadratic models are built in which the Frobenius norm of the Hessian (or of its change) is regularized to keep the models "well behaved". This has been a successful approach in practice, yet such models only guarantee linear approximation. On the other hand, when full quadratic models are built, each coefficient of the model (each element of the Hessian) is considered equally important, which means that all pairwise interactions between variables are accounted for. It is well known, however, that in many practical instances the pairwise interaction between variables is negligible for many pairs. Hence the quadratic model approximating the true function is likely to be sparse.
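To make the point-count requirement concrete, here is a minimal numpy sketch (an illustration, not the speaker's code) of full quadratic interpolation: in dimension n the model has (n+1)(n+2)/2 coefficients, so that many well-poised sample points are needed to determine it uniquely.

```python
import numpy as np

def quadratic_basis(x):
    """Monomial basis of a full quadratic model in R^n:
    the constant 1, each x_i, and each product x_i*x_j with i <= j."""
    n = len(x)
    terms = [1.0] + list(x)
    for i in range(n):
        for j in range(i, n):
            terms.append(x[i] * x[j])
    return np.array(terms)

def fit_quadratic_model(points, values):
    """Interpolate a quadratic model through the samples.
    Requires p = (n+1)(n+2)/2 well-poised points in dimension n,
    giving a square (and then nonsingular) linear system."""
    M = np.array([quadratic_basis(x) for x in points])
    return np.linalg.solve(M, values)

# Example: in dimension n = 2 the model has 6 coefficients,
# so 6 sample points are needed even for this tiny problem.
def f(x):
    return 1.0 + 2.0 * x[0] + 3.0 * x[0] * x[1]

rng = np.random.default_rng(0)
pts = rng.standard_normal((6, 2))   # generic points are well poised
vals = np.array([f(x) for x in pts])
coeffs = fit_quadratic_model(pts, vals)
```

Already at n = 100 this would require 5151 sample points, which illustrates why full quadratic models become impractical when each function evaluation is a costly simulation.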
From the sparse solution recovery theory developed recently in the field of compressed sensing (CS) we can expect that a sparse quadratic model can be recovered using far fewer sample points than the square of the dimension. This recovery is achieved by minimizing the $l_1$-norm of the vector of model coefficients. We will discuss how we hope to combine theoretical results from CS and DFO to obtain convergence theory for methods based on $l_1$-regularized quadratic models. We will also show preliminary computational results.
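To illustrate the $l_1$-minimization idea, here is a small numpy sketch (hypothetical, not the speaker's implementation): a lasso problem solved by iterative soft-thresholding (ISTA) stands in for exact $l_1$ minimization, recovering a sparse coefficient vector from fewer interpolation conditions than unknowns. The random matrix M plays the role of quadratic basis evaluations at the sample points.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal map of the l1-norm: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def recover_sparse_model(M, f, lam=1e-3, iters=20000):
    """Approximate the minimum-l1 solution of M c = f via the lasso
    0.5*||M c - f||^2 + lam*||c||_1, solved by ISTA."""
    step = 1.0 / np.linalg.norm(M, 2) ** 2  # 1/L, L = Lipschitz const. of the gradient
    c = np.zeros(M.shape[1])
    for _ in range(iters):
        c = soft_threshold(c - step * (M.T @ (M @ c - f)), step * lam)
    return c

# A full quadratic model in R^6 has 28 coefficients; suppose only 3 are
# nonzero (most pairwise interactions negligible). We use just 15 sample
# conditions -- far fewer than the 28 a full interpolation would need.
rng = np.random.default_rng(1)
M = rng.standard_normal((15, 28))
c_true = np.zeros(28)
c_true[[2, 7, 20]] = [1.5, -2.0, 0.7]
f = M @ c_true
c_hat = recover_sparse_model(M, f)
```

With a small regularization weight the lasso solution nearly interpolates the data while keeping the $l_1$-norm, and hence the number of active coefficients, small; this is the mechanism by which a sparse Hessian can be identified from an underdetermined sample set.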