

2 An introduction to nonlinear optimization problem structure

As we have already mentioned, structure is an integral and significant aspect of large-scale problems. Structure is often equated with sparsity; indeed the two are closely linked when the problem is linear. However, sparsity is not the most important phenomenon associated with a nonlinear function; that role is played by invariant subspaces. The invariant subspace of a function $f(x)$ is the set of all vectors $w$ for which $f(x + w) = f(x)$ for all possible vectors $x$. This phenomenon encompasses function sparsity. For instance, the function

\begin{displaymath}f(x_1 , x_2 ,\cdots,x_{1000}) = x^{2}_{500}\end{displaymath}

has a gradient and Hessian matrix each with a single nonzero, has an invariant subspace of dimension 999, and is, by almost any criterion, sparse. However, the function

\begin{displaymath}f(x_1 , x_2 ,\cdots,x_{1000}) = (x_1 + \cdots + x_{1000})^2\end{displaymath}

has a completely dense Hessian matrix but still has an invariant subspace of dimension 999, namely the set of all vectors orthogonal to the vector of ones. The importance of invariant subspaces is that no nonlinear information about a function is needed in the directions lying within its invariant subspace. We are particularly interested in functions whose invariant subspaces are large as a percentage of the overall number of variables, since this allows derivative information to be stored and calculated efficiently. The penalty is, of course, the need to provide information about the subspace to an optimization procedure.
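
To make the second example concrete, here is a small numerical sketch (ours, not part of the report): perturbing $x$ by any vector whose components sum to zero, i.e. any vector orthogonal to the vector of ones, leaves the function value unchanged.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n = 1000

def f(x):
    # f(x_1, ..., x_1000) = (x_1 + ... + x_1000)^2
    return np.sum(x) ** 2

x = rng.standard_normal(n)

# Any w whose components sum to zero is orthogonal to the vector of ones
# and therefore lies in the invariant subspace of f.
w = rng.standard_normal(n)
w -= w.mean()

print(np.isclose(f(x + w), f(x)))   # True: moving along w leaves f unchanged
\end{verbatim}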

A particular objective function $F(x)$ is unlikely to have a large invariant subspace itself. However, many reasonably behaved functions may be expressed as a sum of element functions, each of which does have a large invariant subspace. This is certainly true if the function is sufficiently differentiable and has a sparse Hessian matrix [11]. Thus, rather than storing a function as itself, it pays to store it as the sum of its elements. The elemental representation of a particular function is by no means unique and there may be specific reasons for selecting a particular representation. Specifying Hessian sparsity is also supported in the present proposal, but we believe that it is more efficient and also much easier to specify the invariant subspaces directly.
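
As an illustration (our example, not one taken from the report), consider the function

\begin{displaymath}F(x_1 ,\cdots, x_n) = \sum_{j=1}^{n-1} (x_{j+1} - x_j^2)^2 .\end{displaymath}

Each element function $f_j(x_j, x_{j+1}) = (x_{j+1} - x_j^2)^2$ involves only two of the $n$ variables and so has an invariant subspace of dimension $n-2$, even though $F$ itself has no nontrivial invariant subspace. Storing the elements rather than $F$ means that second derivative information need only be held for $n-1$ two-by-two matrices instead of the assembled (here tridiagonal) Hessian.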

LANCELOT considers the problem of minimizing or maximizing an objective function of the form

\begin{displaymath}
F( x )
= \sum_{i \in I_O} w_i g_i \left (
\sum_{j \in J_i} w_{i,j} f_j (\bar{x}_j) + a_i^T x - b_i \right )
+ {\scriptstyle \frac{1}{2}}\sum_{j=1}^n \sum_{k=1}^n h_{j,k} x_j x_k ,
\end{displaymath} (2.1)

where $x = (x_1, x_2,\cdots, x_n)$, within the ``box'' region
\begin{displaymath}
l_i \leq x_i \leq u_i, \;\;\;\; 1 \leq i \leq n
\end{displaymath} (2.2)

(where either bound on each variable may be infinite), and where the variables are required to satisfy the extra conditions
\begin{displaymath}
w_i g_i \left(
\sum_{j \in J_{i}} w_{i,j}f_j (\bar{x}_j) + a_i^T x - b_i \right) = 0
\; (i \in I_E)
\end{displaymath} (2.3)

and
\begin{displaymath}
0 \left \{ \begin{array}{ll}
\leq \\ \geq
\end{array} \right \}
w_i g_i \left(
\sum_{j \in J_{i}} w_{i,j} f_j (\bar{x}_j) + a_i^T x - b_i \right)
\left \{ \begin{array}{ll}
\leq \\ \geq
\end{array} \right\}
r_i,
\; (i \in I_I)
\end{displaymath} (2.4)

for some index sets $I_O, I_E$ and $I_I$ and (possibly infinite) values $r_i$. The univariate functions $g_i$ are known as group functions. The argument

\begin{displaymath}\sum_{j \in J_i} w_{i,j} f_j(\bar{x}_j) + a_i^T x - b_i \end{displaymath}

is known as the $i$-th group. The functions $f_j, j=1,\cdots,
n_e$, are called nonlinear element functions. They are functions of the problem variables $\bar{x}_j$, where the $\bar{x}_j$ are either small subsets of $x$ or such that $f_j$ has a large invariant subspace for some other reason. The constants $w_i$ and $w_{i,j}$ are known as group and element weights, respectively, while the function $a_i^T x - b_i$ is known as the linear element for the $i$-th group. The additional term ${\scriptstyle \frac{1}{2}}\sum_{j=1}^n \sum_{k=1}^n h_{j,k} x_j x_k$ in the objective function is the quadratic objective group; the leading ${\scriptstyle \frac{1}{2}}$ is there by convention.
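
For concreteness, consider the toy problem (our example, not one from the report) of minimizing

\begin{displaymath}F(x_1 , x_2 , x_3) = (x_1 x_2 + x_3 - 1)^2 .\end{displaymath}

In the notation of (2.1), there is a single group with group function $g_1(\alpha) = \alpha^2$, one nonlinear element $f_1(\bar{x}_1) = x_1 x_2$ with $\bar{x}_1 = (x_1, x_2)$, linear element $a_1^T x - b_1 = x_3 - 1$, weights $w_1 = w_{1,1} = 1$ and no quadratic objective group.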

It is more common to call the group functions in (2.3) equality constraint functions, those in (2.4) inequality constraint functions and the sum of those in (2.1) the objective function.

When stating a structured nonlinear optimization problem of the form (2.1)-(2.4), we need to specify the group functions, linear and nonlinear elements and the way that they all fit together.
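
The short sketch below (ours; it does not reflect LANCELOT's or the SIF decoder's actual data structures) illustrates one way the pieces of (2.1) fit together, using the toy problem above: each group records its group function and weight, its list of weighted nonlinear elements together with the variables they see, and its linear element.

\begin{verbatim}
import numpy as np

def evaluate(x, groups):
    # Each group holds a group function g, a group weight w, a list of
    # nonlinear elements (element weight, element function, variable indices)
    # and a linear element (a, b).
    total = 0.0
    for grp in groups:
        arg = grp["a"] @ x - grp["b"]              # linear element  a_i^T x - b_i
        for w_ij, f_j, idx in grp["elements"]:     # nonlinear elements  w_ij f_j(xbar_j)
            arg += w_ij * f_j(x[idx])
        total += grp["w"] * grp["g"](arg)          # group weight times group function
    return total

# Toy problem from the example above: F(x) = (x_1 x_2 + x_3 - 1)^2.
groups = [{
    "g": lambda t: t ** 2,                         # group function g_1
    "w": 1.0,                                      # group weight w_1
    "elements": [(1.0,                             # element weight w_{1,1}
                  lambda xb: xb[0] * xb[1],        # element function f_1
                  np.array([0, 1]))],              # f_1 sees only x_1 and x_2
    "a": np.array([0.0, 0.0, 1.0]),                # linear element x_3 - 1
    "b": 1.0,
}]

print(evaluate(np.array([2.0, 3.0, 4.0]), groups))   # (2*3 + 4 - 1)**2 = 81.0
\end{verbatim}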


