1 Introduction

The mathematical modelling of many real-world applications involves the minimization or maximization of a function of unknown parameters or variables. Frequently these parameters have known bounds; sometimes there are more general relationships between the parameters. When the number of variables is modest, say up to ten, the input of such a problem to an optimization procedure is usually fairly straightforward. Unfortunately many application areas now require the solution of optimization problems with thousands of variables; in this case merely the input of the problem data is extremely time-consuming and prone to error. Moreover, the mathematical programming community is only now designing algorithms for solving problems of this scale.

The format described in this report
was motivated directly by the difficulties the authors were
experiencing entering test examples to the *LANCELOT* large-scale
nonlinear optimization package. It soon became apparent that if
others were to be encouraged to carry out similar tests and even
enticed to use our software, the process of specifying problems had to
be considerably simplified. Thus we were inevitably drawn to provide a
preliminary version of what is described here:
a standard input format (SIF) for nonlinear programming
problems, together with an appropriate translator from the input file
to the form required by the authors' minimization software. While
understandably reflecting our views and experience, the present
proposal is intended to be broadly applicable.

During the subsequent (and successive) stages of development of these preliminary ideas, various important considerations were discussed. These strongly influenced the present proposal.

- There are many reasons for proposing a standard input format. The
most obvious one is the increased consistency in coding
nonlinear programming problems, and the resulting improvement in code
reliability. As every problem is treated in a similar and standardized
way, it is more difficult to overlook certain aspects of the problem
definition. The provision of a SIF file for a given problem also
allows some elementary (and very often helpful) automatic error and
consistency checking.
- A further advantage of having a standard input format is the long
awaited possibility of having a portable testbed of meaningful
problems. Moreover, such a testbed that can be expected to grow. The
authors soon experienced the daunting difficulties associated with
specifying large scale problems -- not only the difficulty of
writing down the specification correctly but also the actual coding
(and frequent re-coding) of a particular problem which often results
in non-trivial differences between the initial and final data. These
differences could be a major obstacle to valid comparisons between
competing optimization codes. By contrast, having a SIF file allows
simple and unambiguous data transfer via diskette, tape or electronic
mail. The success of the NETLIB and Harwell/Boeing problem collections
for linear programming and sparse linear algebra [9,6]
is a good recommendation for such
flexibility. The formality required by the SIF approach may
admittedly appear formidable for very simple problems, but is soon
repaid when dealing with more complex ones.
- Of course, the SIF format should cover a large part of the practical
optimization problems that users may want to specify. Explicit
provision should be made not only for unconstrained problems, but also
for constraints of different types and complexity: simple bounds on
the variables,
linear and/or nonlinear equations and inequalities
should be handled without trouble. Special structure
of the problem at
hand is also a mandatory part of an SIF file. For example, the
structure of least-squares
problems must be described in an
exploitable form. Sparsity
of relevant matrices and partial separability
of involved nonlinear functions must be included in the
standard problem description when they are known. Finally, the
special case of systems of nonlinear equations should also be covered.
- The existence and worldwide success of the MPS standard input format
for linear programming must be considered as a
*de facto*basis for any attempt to define an SIF for nonlinear problems. The number of problems already available in this format is large, and many nonlinear problems arise as a refinement of existing linear ones whose linear part and sparsity structure are expected to be described in the MPS format. It therefore seems reasonable to require that an SIF for nonlinear programming problems should conform to the MPS format. We were thus led to choose a standard that corresponds to MPS, augmented with additional constructs and structures, thus allowing nonlinearity, and the general features that we wished, to be described properly. - The requirement of compatibility with the MPS format has a number of
consequences, not all of which are pleasant. The first one is that the
new SIF must be based on fixed format
for the SIF file. Indeed, blanks
are significant characters in MPS,
when they appear in the right data fields,
and cannot be used as general separators for free format
input in any compatible system. The second one is the
*a priori*existence of a ``style'' for keywords and overall layout of the problem description, a format which is not always ideally suited to nonlinear problems. Our present proposal accepts these limitations. - The SIF should not be dependent on a specific operating system and/or
manufacturer. In this respect, it must avoid relying on tools that may
be excellent but are too specific (yacc and lex, for example). This of
course does not prevent any implementation of an SIF interpreter using
whatever facilities are locally available.
- In principle, the SIF should not be dependent on a particular high level programming language. However, as the intention is that SIF files may be converted into executable programs, restrictions on the symbolic names allowed by different programming languages may influence the choice of names within the problem description itself. For instance, in Fortran, symbolic variables may only contain up to six characters from a restricted set. We have chosen to base the present proposal, where necessary, on Fortran as this appears to be the most restrictive of the more popular high level languages. This dependence has been isolated as much as possible.

The authors are very well aware of the shortcomings of the SIF approach when compared to more elaborate modelling languages (see, for example, GAMS [1], AMPL [7], and OMP [4]. These probably remain the best way to allow easy and error free input of large problems. However, we contend that there is at present no language in the public domain which satisfactorily handles the nonlinear aspects of mathematical programming problems. While the advent of a tool of this nature is very much hoped for, it nevertheless seems necessary to provide something like the SIF now. This (we hope, intermediate) step is indeed crucial for the development and comparison of algorithms for solving large scale nonlinear problems, without which a more elaborate tool would be meaningless anyway. The SIF for nonlinear problems may also be considered as a first attempt to specify the minimal structures that should be present in a true modelling language for such problems. It is also of interest to develop a relatively simple input format, given that researchers developing new optimization methods may have to implement their own code for translating the SIF file into a form suitable for their algorithms. At this level, some compromise between completeness and simplicity seems necessary. Finally, the existence and availability of modelling languages for linear programming for a number of years has not yet made the MPS format irrelevant.

Hence, the reader should be aware that what sometimes appear as unnecessarily restrictive ``features'' of the proposed standard are often direct consequences of the considerations outlined above.

In the next section, we explain how we propose to exploit the structure in problems of the form (2.1)-(2.4) . We do this both in general and for a number of examples. Details of the way such structure may be expressed in a standard data input format follow in Section 3. The input of nonlinear information for element and group functions is covered in Section 4 and Section 5 respectively. The formats proposed in Sections 3-5 are quite rigid. A more flexible, free-form, input is considered in Section 6. The relationship to existing work is presented in Section 7 and conclusions drawn in Section 7.