LSMR - Sparse Least Squares LSMR Solver¶
#include <spral_lsmr.h> /* or <spral.h> for all packages */
Purpose¶
This package uses the LSMR iterative method to solve sparse linear equations and sparse least-squares problems of the form:
where the \(m \times n\) matrix \(A\) may be square or rectangular, and may have any rank. The scalar \(\lambda\) is a damping parameter. If \(\lambda > 0\), the solution is regularized in the sense that a unique soluton always exists, and \(\|x\|_2\) is always bounded.
Preconditioning may be used to try to reduce the number of iterations. A suitable choice for the preconditioner depends on the user’s knowledge of \(A\). For a user-chosen \(n \times n\) nonsingular matrix \(P\), LSMR solves
The user must then recover the final solution \(x\) by computing \(x=Py\). \(P\) will be a good preconditioner if \(AP\) is significantly better conditioned than \(A\).
Reverse communication is used for preconditioning operations \(Pz\) and \(P^Tz\) and matrix-vector products of the form \(Av\) and \(A^Tu\).
The method used is based on the Golub-Kahan bidiagonalization process. It is algebraically equivalent to applying MINRES to the normal equation \((A^TA+\lambda^2I)x=A^Tb\) (or \(((AP)^T(AP)+\lambda^2I)y=(AP)^Tb\)), but has better numerical properties, especially if \(A\) is ill-conditioned.
Notation¶
In the rest of this documentation, we use the following notation:
and
Subroutines¶
-
void
spral_lsmr_default_options
(struct spral_lsmr_options *options)¶ Initialises options to default values.
Parameters: - options – data structure to be initialised.
-
int
spral_lsmr_solve
(int *action, int m, int n, double u[m], double v[n], double y[n], void **keep, struct spral_lsmr_options const *options, struct spral_lsmr_inform *inform, double *damp)¶ Solve the least squares problem. Uses reverse communication. Upon return the user must perform a task specified by the action parameter and recall the routine. Possible values of action and associated tasks are:
action Task to be performed 0 None. Computation has terminated. Check inform.flag for reason. 1 The user must compute
\[v = v + P^TA^Tu\]without altering \(u\).
2 The user must compute
\[u = u + APv\]without altering \(v\).
3 The user may test for convergence.
To continue, recall without changing any of the arguments.
Parameters: - action – reverse communication task (see table above). Must be 0 on first call.
- m – number of rows in \(A\).
- n – number of columns in \(A\).
- u – the vector \(u\). Must contain \(b\) on first call.
- v – the vector \(v\).
- y – the current solution \(y\) of the preconditioned problem (note \(x=Py\)).
- keep – private internal data for LSMR.
- options – controls execution of algorithm.
- inform – information about execution of algorithm. Must not be changed by the user.
- damp – Damping parameter \(\lambda\).
Returns: Value of
spral_lsmr_inform.flag
.
-
int
lsmr_free
(void **keep)¶ Free memory allocated in keep. If a series of problems is being solved sequentially, the same keep may be used without calling
spral_lsmr_free()
between each solution.Parameters: - keep – private data to be freed.
Returns: 0 on success, or Fortran stat parameter on failed deallocation.
Derived types¶
-
struct
spral_lsmr_options
¶ Specify options used by LSMR algorithm.
-
int
print_freq_head
¶ Frequency of printing heading information (that is, how many lines are printed before the heading information is reprinted). Default is 20.
-
int
print_freq_itn
¶ Frequency of printing status. There is printing on each of the first print_freq_itn iterations and then printing every print_freq_itn iterations. Default is 10.
-
int
unit_diagnostics
¶ Fortran unit for diagnostic printing. Printing is suppressed if negative. Default is 6.
-
int
unit_error
¶ Fortran unit for printing error messages. Printing is suppressed if negative. Default is 6.
-
double
atol
¶ Relative error in \(A\). i.e. if \(A\) is accurate to about 6 digits, set atol to 1.0e-6. Only used if
spral_lsmr_options.ctest
=3. Default issqrt(DBL_EPSILON)
.
-
double
btol
¶ Relative error in \(b\). i.e. if \(b\) is accurate to about 6 digits, set btol to 1.0e-6. Only used if
spral_lsmr_options.ctest
=3. Default issqrt(DBL_EPSILON)
.
-
double
conlim
¶ Upper limit on \(cond(\bar{A})\), apparent condition number of \(\bar{A}\). Only used if
spral_lsmr_options.ctest
=3. Default is1/(10*sqrt(DBL_EPSILON))
:
-
int
ctest
¶ Convergence test to use. Options are:
1 User to test convergence (action=3). Without computation of norms in inform. 2 User to test convergence (action=3). With computation of norms in inform. 3 (default) Convergence test of Fond and Saunders (in preconditioner norm). Default is 3.
-
int
itnlim
¶ Maximum number of iterations. If negative, the limit used is \(4n\). Default is -1.
-
int
itn_test
¶ Number of iterations between user convergence tests. If negative, use \(\min(n,10)\). Default is -1.
-
int
localSize
¶ Number of historical vectors to use for reorthogonalization. Default is 0.
-
int
-
struct
spral_lsmr_inform
¶ Information about progress of algorithm.
-
double
condAP
¶ Estimate of \(cond(\bar{A})\). A very high value of condAP may again indicate an error in the products with \(A\), \(A^T\), \(P\), or \(P^T\). A negative value indicates that no estimate is currently available. This component is not used if
spral_lsmr_options.ctest
=1.
-
int
flag
¶ Exit status of algorithm. See table below.
-
int
itn
¶ Number of iterations performed
-
double
normAP
¶ Estimate of Frobenius norm of \(\bar{A}\). If \(\lambda\) is small and the columns of \(AP\) have all been scaled to have length 1.0, normAP should increase to roughly \(\sqrt{n}\). A radically different value for normAP may indicate an error in the user-supplied products with \(A\), \(A^T\), \(P\), or \(P^T\). A negative value indicates that no estimate is currently available. This component is not used if
spral_lsmr_options.ctest
=1.
-
double
normAP
Estimate of \(\|\bar{A}^T\bar{r}\|_2\), (normal equations residual). This should be small in all cases. Note that normAPr will often be smaller than the true value. A negative value indicates that no estimate is currently available. This component is not used if
spral_lsmr_options.ctest
=1.
-
double
normr
¶ Estimate of \(\|\bar{r}\|_2\). This will be small if \(Ax = b\) has a solution. A negative value indicates that no estimate is currently available. This component is not used if
spral_lsmr_options.ctest
=1.
-
double
normy
¶ Estimate of \(\|y\|_2\). A negative value indicates that no estimate is currently available. This component is not used if
spral_lsmr_options.ctest
=1.
-
int
stat
¶ The Fortran stat parameter in the event of a failed allocation.
inform.flag Interpretation 0 \(x = 0.0\) is the exact solution. No iterations were performed. 1 The equations \(Ax = b\) are probably compatible. \(\|Ax - b\|_2\) is sufficiently small, given the values of spral_lsmr_options.atol
andspral_lsmr_options.btol
. (spral_lsmr_options.ctest
=3 only).2 If damp is not present or is zero then the system \(Ax = b\) is probably not compatible. A least-squares solution has been obtained that is sufficiently accurate, given the value of spral_lsmr_options.atol
. Otherwise, damped least-squares solution has been obtained that is sufficiently accurate, given the value ofspral_lsmr_options.atol
. (spral_lsmr_options.ctest
=3 only).3 An estimate of cond(\(\bar{A}\)) has exceeded spral_lsmr_options.conlim
. The system \(Ax = b\) appears to be ill-conditioned, or there could be an error in the products with \(A\), \(A^T\), \(P\), or \(P^T\). (spral_lsmr_options.ctest
=3 only).4 \(\|APy - b \|_2\) is small enough for this machine. ( spral_lsmr_options.ctest
=3 only).5 The least-squares solution is good enough for this machine. ( spral_lsmr_options.ctest
=3 only).6 The estimate spral_lsmr_inform.condAP
appears to be too large for this machine. (spral_lsmr_options.ctest
=3 only).7 The iteration limit spral_lsmr_options.itnlim
has been reached.8 An array allocation failed. 9 An array deallocation failed. 10 Either m<0 or n<0. -
double
Example¶
The following code illustrates the use of LMSR
/* examples/C/lsmr.c - Example code for SPRAL_LSMR package */
#include "spral.h"
#include <stdlib.h>
#include <stdio.h>
void matrix_mult(int m, int n, int const *ptr, int const *row,
double const *val, double const *v, double *u);
void matrix_mult_trans(int m, int n, int const *ptr, int const *row,
double const *val, double const *u, double *v);
int main(void) {
/* Derived types */
void *keep;
struct spral_lsmr_options options;
struct spral_lsmr_inform inform;
/* Initialize derived types */
keep = NULL;
spral_lsmr_default_options(&options);
options.unit_diagnostics = -1; /* Switch off diagnostic printing */
/* Data for matrix:
* ( 1.0 -1.0 )
* ( 2.0 )
* ( 2.0 2.0 )
* ( 5.0 3.0 -2.0 )
* ( 6.0 ) */
const int m = 5, n = 3;
int ptr[] = { 0, 3, 5, 9 };
int row[] = { 0, 2, 3, 1, 3, 0, 2, 3, 4 };
double val[] = { 1.0, 2.0, 5.0, 2.0, 3.0, -1.0, 2.0, -2.0, 6.0 };
/* Data for rhs b */
double b[] = { 1.0, 1.0, 1.0, 1.0, 1.0 };
/* prepare for LSMR calls (using no preconditioning) */
int action = 0;
double u[m], v[n], x[n];
for(int i=0; i<m; ++i) u[i] = b[i];
bool done = false;
while(!done) {
spral_lsmr_solve(&action, m, n, u, v, x, &keep, &options, &inform, NULL);
switch(action) {
case 0: /* we are done */
printf("Exit LSMR with inform.flag = %d and inform.itn = %d\n",
inform.flag, inform.itn);
printf("LS solution is:\n");
for(int i=0; i<n; ++i) printf(" %10.2f", x[i]);
printf("\n");
done = true;
break;
case 1: /* Compute v = v + A'*u without altering u */
matrix_mult_trans(m, n, ptr, row, val, u, v);
break;
case 2: /* Compute u = u + A*v without altering v */
matrix_mult(m, n, ptr, row, val, v, u);
break;
}
}
spral_lsmr_free(&keep);
return 0; // Success
}
/* Takes b and computes u = u + A * v (A in CSC format) */
void matrix_mult(int m, int n, int const *ptr, int const *row,
double const *val, double const *v, double *u) {
for(int j=0; j<n; ++j) {
for(int k=ptr[j]; k<ptr[j+1]; ++k) {
int i = row[k];
u[i] += val[k]*v[j];
}
}
}
/* Takes b and computes v = v + A^T * u (A in CSC format) */
void matrix_mult_trans(int m, int n, int const *ptr, int const *row,
double const *val, double const *u, double *v) {
for(int j=0; j<n; ++j) {
for(int k=ptr[j]; k<ptr[j+1]; ++k) {
int i = row[k];
v[j] += val[k]*u[i];
}
}
}
This returns the following output:
Exit LSMR with inform.flag = 2 and inform.itn = 3
LS solution is:
0.18 0.26 0.17
Method¶
Algorithm¶
The method used is based on the Golub-Kahan bidiagonalization process. It is algebraically equivalent to applying MINRES to the normal equation \((A^TA+\lambda^2I)x=A^Tb\) (or \(((AP)^T(AP)+\lambda^2I)y=(AP)^Tb\), \(Py = x\), if preconditioning is used), but has better numerical properties, especially if \(A\) is ill-conditioned. Full details may be found in [1].
Scaling¶
LSMR uses an iterative method to approximate the solution. The number of iterations required to reach a certain accuracy depends strongly on the scaling of the problem. Poor scaling of the rows or columns of \(A\) should therefore be avoided where possible. For example, in problem 1 the solution is unaltered by row-scaling. If a row of \(A\) is very small or large compared to the other rows of \(A\), the corresponding row of \(( A\; b )\) should be scaled up or down.
In problems 1 and 2, the solution \(x\) is easily recovered following column-scaling. Unless better information is known, the nonzero columns of \(A\) should be scaled so that they all have the same Euclidean norm (e.g., 1.0). In problem 3, there is no freedom to re-scale if damp is nonzero. However, the value of damp should be assigned only after attention has been paid to the scaling of \(A\).
The parameter damp is intended to help regularize ill-conditioned
systems, by preventing the true solution from being very large. Another
aid to regularization is provided by the spral_lsmr_inform.condAP
,
which may be used to terminate iterations before the computed solution becomes
very large.
Initial estimate¶
Note that \(x\) (or \(y\) for the preconditioned problem) is not an input parameter. If some initial estimate \(x_0\) of \(x\) is known and if \(\lambda = 0\), one could proceed as follows:
- Compute a residual vector \(r_0 = b - Ax_0\).
- Use LSMR to solve the system \(A \delta x = r_0\).
- Add the correction \(\delta x\) to obtain a final solution \(x = x_0 + \delta x\).
This can be generalized for the preconditioned case. The guess \(x_0\) has
to be available before and after the calls to spral_lsmr_solve()
.
To judge the benefits, suppose spral_lsmr_solve()
takes \(k_1\)
iterations to solve \(Ax = b\) and \(k_2\) iterations to solve
\(A \delta x = r_0\). If \(x_0\) is “good”, \(\|r_0\|_2\) will be
smaller than \(\|b\|_2\). If the same stopping tolerances
spral_lsmr_options.atol
and spral_lsmr_options.btol
are
used for each system, \(k_1\) and \(k_2\) will
be similar, but the final solution \(x = x_0 + \delta x\) should be more
accurate. The only way to reduce the total work is to use a larger stopping
tolerance for the second system. If some value
spral_lsmr_options.btol
is suitable for
\(Ax=b\), the larger value
\(\mathrm{spral\_lsmr\_options.btol}*\|b\|_2 / \|r_0\|_2\) should be
suitable for \(A \delta x = r_0\).
References¶
[1] | D.C.-L. Fong and M.A. Saunders (2011). LSMR: An iterative algorithm for sparse least-squares problems. SIAM J. Sci. Comput. 33:5, 2950-2971. [DOI: 10.1137/10079687X] |