****************************************************** :f:mod:`spral_lsmr` - Sparse Least Squares LSMR Solver ****************************************************** .. f:module:: spral_lsmr :synopsis: Sparse Least Squares LSMR Solver ======= Purpose ======= This package uses the LSMR iterative method to solve sparse linear equations and sparse least-squares problems of the form: .. math:: \begin{array}{ll} \mbox{1. Nonsymmetric equations:} & \mbox{minimize } \|x\|_2 \mbox{ subject to }Ax = b, \\ \mbox{2. Linear least squares:} & \mbox{minimize } \|Ax - b\|_2^2,\\ \mbox{3. Regularized least squares:} & \mbox{minimize } \|Ax - b\|_2^2 + \lambda^2\|x\|_2^2, \end{array} where the :math:`m \times n` matrix :math:`A` may be square or rectangular, and may have any rank. The scalar :math:`\lambda` is a damping parameter. If :math:`\lambda > 0`, the solution is regularized in the sense that a unique soluton always exists, and :math:`\|x\|_2` is always bounded. Preconditioning may be used to try to reduce the number of iterations. A suitable choice for the preconditioner depends on the user's knowledge of :math:`A`. For a user-chosen :math:`n \times n` nonsingular matrix :math:`P`, LSMR solves .. math:: \begin{array}{ll} \mbox{1. Nonsymmetric equations:} & \mbox{minimize } \|Py\|_2 \mbox{ subject to }APy = b, \\ \mbox{2. Linear least squares:} & \mbox{minimize } \|APy - b\|_2^2 ,\\ \mbox{3. Regularized least squares:} & \mbox{minimize } \|APy - b\|_2^2 + \lambda^2\|Py\|_2^2 , \end{array} The user must then recover the final solution :math:`x` by computing :math:`x=Py`. :math:`P` will be a good preconditioner if :math:`AP` is significantly better conditioned than :math:`A`. Reverse communication is used for preconditioning operations :math:`Pz` and :math:`P^Tz` and matrix-vector products of the form :math:`Av` and :math:`A^Tu`. The method used is based on the Golub-Kahan bidiagonalization process. It is algebraically equivalent to applying MINRES to the normal equation :math:`(A^TA+\lambda^2I)x=A^Tb` (or :math:`((AP)^T(AP)+\lambda^2I)y=(AP)^Tb`), but has better numerical properties, especially if :math:`A` is ill-conditioned. Notation -------- In the rest of this documentation, we use the following notation: .. math:: \bar{A} = \left( \begin{array}{c} A \\ \lambda I \end{array} \right)P, \hspace{1cm} \bar{b} = \left( \begin{array}{c} b \\ 0 \end{array} \right), and .. math:: r = b - APy , \hspace{1cm} \bar{r} = \bar{b} - \bar{A}y, \hspace{1cm} x = Py. Version history --------------- 2016-05-10 Version 1.0.0. Initial release. [For detailed history, see ChangeLog] =========== Subroutines =========== .. f:subroutine:: lsmr_solve(action, m, n, u, v, y, keep, options, inform[,damp]) Solve the least squares problem. Uses reverse communication. Upon return the user must perform a task specified by the `action` parameter and recall the routine. Possible values of action and associated tasks are: +----------+----------------------------------------------------------------+ | `action` | Task to be performed | +==========+================================================================+ | 0 | None. Computation has terminated. Check inform%flag for reason.| +----------+----------------------------------------------------------------+ | 1 | The user must compute | | | | | | .. math:: v = v + P^TA^Tu | | | | | | without altering :math:`u`. | +----------+----------------------------------------------------------------+ | 2 | The user must compute | | | | | | .. math:: u = u + APv | | | | | | without altering :math:`v`. | +----------+----------------------------------------------------------------+ | 3 | The user may test for convergence. | | | | | | To continue, recall without changing any of the arguments. | +----------+----------------------------------------------------------------+ :p integer action [inout]: reverse communication task (see table above). Must be 0 on first call. :p integer m [in]: number of rows in :math:`A`. :p integer n [in]: number of columns in :math:`A`. :p real u (m) [inout]: the vector :math:`u`. Must contain :math:`b` on first call. :p real v (n) [inout]: the vector :math:`v`. :p real y (n) [inout]: the current solution :math:`y` of the preconditioned problem (note :math:`x=Py`). :p lsmr_keep keep [inout]: private internal data for LSMR. :p lsmr_options options [in]: controls execution of algorithm. :p lsmr_inform inform [inout]: information about execution of algorithm. Must not be changed by the user. :o real damp [in]: Damping parameter :math:`\lambda`. .. f:subroutine:: lsmr_free(keep,stat) Free memory allocated in `keep` (unnecessary if `keep` is going out of scope). :p lsmr_keep [inout]: private data to be freed. :p integer stat [out]: return 0 on success, or Fortran stat parameter on failed deallocation. .. note:: If a series of problems is being solved sequentially, the same keep may be used without calling :f:subr:`lsmr_free()` between each solution. ============= Derived types ============= .. f:type:: lsmr_options Specify options used by LSMR algorithm. :f integer print_freq_head [default=20]: frequency of printing heading information (that is, how many lines are printed before the heading information is reprinted). :f integer print_freq_itn [default=10]: frequency of printing status. There is printing on each of the first `print_freq_itn` iterations and then printing every `print_freq_itn` iterations. :f integer unit_diagnostics [default=6]: Fortran unit for diagnostic printing. Printing is suppressed if negative. :f integer unit_error [default=6]: Fortran unit for printing error messages. Printing is suppressed if negative. :f real atol [default=sqrt(epsilon(1.0d0))]: Relative error in :math:`A`. i.e. if :math:`A` is accurate to about 6 digits, set atol to 1.0d-6. Only used if `options%ctest=3`. :f real btol [default=sqrt(epsilon(1.0d0))]: Relative error in :math:`b`. i.e. if :math:`b` is accurate to about 6 digits, set btol to 1.0d-6. Only used if `options%ctest=3`. :f real conlim [default=1/(10*sqrt(epsilon(1.0d0))]: Upper limit on :math:`cond(\bar{A})`, apparent condition number of :math:`\bar{A}`. Only used if `options%ctest=3`. :f integer ctest [default=3]: Convergence test to use, one of: +-------------+----------------------------------------------------------+ | 1 | User to test convergence (`action=3`). | | | *Without* computation of norms in inform. | +-------------+----------------------------------------------------------+ | 2 | User to test convergence (`action=3`). | | | *With* computation of norms in inform. | +-------------+----------------------------------------------------------+ | 3 (default) | Convergence test of Fond and Saunders (in preconditioner | | | norm). | +-------------+----------------------------------------------------------+ :f integer itnlim [default=-1]: Maximum number of iterations. If negative, the limit used is :math:`4n`. :f integer itn_test [default=-1]: Number of iterations between user convergence tests. If negative, use :math:`min(n,10)`. :f integer localSize [default=0]: Number of historical vectors to use for reorthogonalization. .. f:type:: lsmr_inform Information about progress of algorithm. :f real condAP: Estimate of :math:`cond(\bar{A})`. A very high value of condAP may again indicate an error in the products with :math:`A`, :math:`A^T`, :math:`P`, or :math:`P^T`. A negative value indicates that no estimate is currently available. This component is not used if `options%ctest=1`. :f integer flag: exit status of algorithm. See table below. :f integer itn: number of iterations performed :f real normAP: Estimate of Frobenius norm of :math:`\bar{A}`. If :math:`\lambda` is small and the columns of :math:`AP` have all been scaled to have length 1.0, normAP should increase to roughly :math:`\sqrt{n}`. A radically different value for normAP may indicate an error in the user-supplied products with :math:`A`, :math:`A^T`, :math:`P`, or :math:`P^T`. A negative value indicates that no estimate is currently available. This component is not used if `options%ctest=1`. :f real normAP: Estimate of :math:`\|\bar{A}^T\bar{r}\|_2`, (normal equations residual). This should be small in all cases. Note that normAPr will often be smaller than the true value. A negative value indicates that no estimate is currently available. This component is not used if `options%ctest=1`. :f real normr: Estimate of :math:`\|\bar{r}\|_2`. This will be small if :math:`Ax = b` has a solution. A negative value indicates that no estimate is currently available. This component is not used if `options%ctest=1`. :f real normy: Estimate of :math:`\|y\|_2`. A negative value indicates that no estimate is currently available. This component is not used if `options%ctest=1`. :f integer stat: The Fortran stat parameter in the event of a failed allocation. +-------------+-------------------------------------------------------------+ | inform%flag | Interpretation | +=============+=============================================================+ | 0 | :math:`x = 0.0` is the exact solution. | | | No iterations were performed. | +-------------+-------------------------------------------------------------+ | 1 | The equations :math:`Ax = b` are probably compatible. | | | :math:`\|Ax - b\|_2` is sufficiently small, given the values| | | of `options%atol` and `options%btol`. | | | (`options%ctest=3` only). | +-------------+-------------------------------------------------------------+ | 2 | If damp is not present or is zero then the system | | | :math:`Ax = b` is probably not compatible. A least-squares | | | solution has been obtained that is sufficiently accurate, | | | given the value of `options%atol`. Otherwise, damped | | | least-squares solution has been obtained that is | | | sufficiently accurate, given the value of `options%atol`. | | | (`options%ctest=3` only). | +-------------+-------------------------------------------------------------+ | 3 | An estimate of cond(:math:`\bar{A}`) has exceeded | | | `options%conlim`. The system :math:`Ax = b` appears to be | | | ill-conditioned, or there could be an error in the products | | | with :math:`A`, :math:`A^T`, :math:`P`, or :math:`P^T`. | | | (`options%ctest=3` only). | +-------------+-------------------------------------------------------------+ | 4 | :math:`\|APy - b \|_2` is small enough for this machine. | | | (`options%ctest=3` only). | +-------------+-------------------------------------------------------------+ | 5 | The least-squares solution is good enough for this machine. | | | (options%ctest=3 only). | +-------------+-------------------------------------------------------------+ | 6 | The estimate `inform%condAP` appears to be too large for | | | this machine. (`options%ctest=3` only). | +-------------+-------------------------------------------------------------+ | 7 | The iteration limit `options%itnlim` has been reached. | +-------------+-------------------------------------------------------------+ | 8 | An array allocation failed. | +-------------+-------------------------------------------------------------+ | 9 | An array deallocation failed. | +-------------+-------------------------------------------------------------+ | 10 | Either `m<0` or `n<0`. | +-------------+-------------------------------------------------------------+ ======= Example ======= The following code illustrates the use of LMSR .. literalinclude:: ../../examples/Fortran/lsmr.f90 :language: Fortran This returns the following output: :: Exit LSMR with inform%flag = 2 and inform%itn = 3 LS solution is: 0.18 0.26 0.17 ====== Method ====== Algorithm --------- The method used is based on the Golub-Kahan bidiagonalization process. It is algebraically equivalent to applying MINRES to the normal equation :math:`(A^TA+\lambda^2I)x=A^Tb` (or :math:`((AP)^T(AP)+\lambda^2I)y=(AP)^Tb`, :math:`Py = x`, if preconditioning is used), but has better numerical properties, especially if :math:`A` is ill-conditioned. Full details may be found in [1]_. Scaling ------- LSMR uses an iterative method to approximate the solution. The number of iterations required to reach a certain accuracy depends strongly on the scaling of the problem. Poor scaling of the rows or columns of :math:`A` should therefore be avoided where possible. For example, in problem 1 the solution is unaltered by row-scaling. If a row of :math:`A` is very small or large compared to the other rows of :math:`A`, the corresponding row of :math:`( A\; b )` should be scaled up or down. In problems 1 and 2, the solution :math:`x` is easily recovered following column-scaling. Unless better information is known, the nonzero columns of :math:`A` should be scaled so that they all have the same Euclidean norm (e.g., 1.0). In problem 3, there is no freedom to re-scale if `damp` is nonzero. However, the value of `damp` should be assigned only after attention has been paid to the scaling of :math:`A`. The parameter `damp` is intended to help regularize ill-conditioned systems, by preventing the true solution from being very large. Another aid to regularization is provided by the `inform%condAP`, which may be used to terminate iterations before the computed solution becomes very large. Initial estimate ---------------- Note that :math:`x` (or :math:`y` for the preconditioned problem) is not an input parameter. If some initial estimate :math:`x_0` of :math:`x` is known and if :math:`\lambda = 0`, one could proceed as follows: 1. Compute a residual vector :math:`r_0 = b - Ax_0`. 2. Use LSMR to solve the system :math:`A \delta x = r_0`. 3. Add the correction :math:`\delta x` to obtain a final solution :math:`x = x_0 + \delta x`. This can be generalized for the preconditioned case. The guess :math:`x_0` has to be available before and after the calls to :f:subr:`lsmr_solve()`. To judge the benefits, suppose :f:subr:`lsmr_solve()` takes :math:`k_1` iterations to solve :math:`Ax = b` and :math:`k_2` iterations to solve :math:`A \delta x = r_0`. If :math:`x_0` is "good", :math:`\|r_0\|_2` will be smaller than :math:`\|b\|_2`. If the same stopping tolerances `options%atol` and `options%btol` are used for each system, :math:`k_1` and :math:`k_2` will be similar, but the final solution :math:`x = x_0 + \delta x` should be more accurate. The only way to reduce the total work is to use a larger stopping tolerance for the second system. If some value `options%btol` is suitable for :math:`Ax=b`, the larger value :math:`\mathrm{options\%btol}*\|b\|_2 / \|r_0\|_2` should be suitable for :math:`A \delta x = r_0`. References ---------- .. [1] D.C.-L. Fong and M.A. Saunders (2011). *LSMR: An iterative algorithm for sparse least-squares problems*. SIAM J. Sci. Comput. 33:5, 2950-2971. [`DOI: 10.1137/10079687X `_]