SPRAL

About
Licence
1 Installation
- 1.1 Quick Start
- 1.2 Licence
- 1.3 Third-party libraries
  - 1.3.1 METIS
  - 1.3.2 BLAS
- 1.4 Compilers and compiler options
- 1.5 Other options to configure
- 1.6 Support
2 RANDOM: Pseudo-random Number Generator
- 2.1 Installation
- 2.2 Usage overview
  - 2.2.1 Calling sequences
  - 2.2.2 Seed initialization
- 2.3 Random Generation Subroutines
  - 2.3.1 random_real()
  - 2.3.2 random_integer()
  - 2.3.3 random_logical()
- 2.4 Method
  - 2.4.1 Pseudo-random number generation
  - 2.4.2 random_real()
  - 2.4.3 random_int()
  - 2.4.4 random_logical()
- 2.5 Example
3 RANDOM_MATRIX: Pseudo-random Matrix Generator
- 3.1 Installation
- 3.2 Usage overview
  - 3.2.1 Calling sequences
  - 3.2.2 Seed initialization
  - 3.2.3 Matrix types
- 3.3 Subroutines
  - 3.3.1 random_matrix_generate()
- 3.4 Return codes
- 3.5 Method
- 3.6 Example
4 SCALING: Sparse Matrix Scalings
- 4.1 Installation
- 4.2 Usage overview
  - 4.2.1 Calling sequences
  - 4.2.2 Data formats
- 4.3 Auction Algorithm
  - 4.3.1 spral_scaling_auction_default_options()
  - 4.3.2 spral_scaling_auction_sym()
  - 4.3.3 spral_scaling_auction_unsym()
  - 4.3.4 struct spral_scaling_auction_options
  - 4.3.5 struct spral_scaling_auction_inform
  - 4.3.6 Error Flags
  - 4.3.7 Algorithm description
  - 4.3.8 Example of spral_scaling_auction_sym()
- 4.4 Norm-equilibration algorithm
  - 4.4.1 spral_scaling_equilib_default_options()
  - 4.4.2 spral_scaling_equilib_sym()
  - 4.4.3 spral_scaling_equilib_unsym()
  - 4.4.4 struct spral_scaling_equilib_options
  - 4.4.5 struct spral_scaling_equilib_inform
  - 4.4.6 Error Flags
  - 4.4.7 Algorithm description
  - 4.4.8 Example of spral_scaling_equilib_sym()
- 4.5 Hungarian algorithm
  - 4.5.1 spral_scaling_hungarian_default_options()
  - 4.5.2 spral_scaling_hungarian_sym()
  - 4.5.3 spral_scaling_hungarian_unsym()
  - 4.5.4 struct spral_scaling_hungarian_options
  - 4.5.5 struct spral_scaling_hungarian_inform
  - 4.5.6 Error Flags
  - 4.5.7 Algorithm description
  - 4.5.8 Example usage of spral_scaling_hungarian_unsym()
5 SSIDS: Sparse Symmetric Indeﬁnite Direct Solver
- 5.1 Installation
- 5.2 Usage overview
  - 5.2.1 Calling sequences
  - 5.2.2 Derived types
  - 5.2.3 Achieving bit-compatibility
  - 5.2.4 Data formats
- 5.3 Basic Subroutines
  - 5.3.1 spral_ssids_default_options()
  - 5.3.2 spral_ssids_analyse() and spral_ssids_analyse_coord()
  - 5.3.3 spral_ssids_factor()
  - 5.3.4 spral_ssids_solve()
  - 5.3.5 spral_ssids_free(), spral_ssids_free_akeep(), and spral_ssids_free_fkeep()
- 5.4 Advanced subroutines
  - 5.4.1 spral_ssids_enquire_posdef()
  - 5.4.2 spral_ssids_enquire_indef()
  - 5.4.3 spral_ssids_alter()
- 5.5 Derived types
  - 5.5.1 struct spral_ssids_options
  - 5.5.2 struct spral_ssids_inform
- 5.6 Error ﬂags
- 5.7 Method
- 5.8 Example
6 SSMFE: Sparse Symmetric Matrix-Free Eigensolver
- 6.1 Installation
- 6.2 Usage overview
  - 6.2.1 Calling sequences
  - 6.2.2 Derived types
- 6.3 Argument lists
- 6.4 Derived types
  - 6.4.1 struct spral_ssmfe_options
  - 6.4.2 struct spral_ssmfe_inform
- 6.5 Error ﬂags
- 6.6 Method
- 6.7 Examples
  - 6.7.1 Preconditioning example
  - 6.7.2 Shift-and-invert example
  - 6.7.3 Hermitian example
7 SSMFE_CORE: Sparse Symmetric Matrix-Free Eigensolver (Core Algorithm)
- 7.1 Installation
- 7.2 Usage overview
  - 7.2.1 Calling sequences
  - 7.2.2 Derived types
- 7.3 Argument lists
  - 7.3.1 spral_ssmfe_core_default_options()
  - 7.3.2 spral_ssmfe_type() and spral_ssmfe_largest_type()
  - 7.3.3 spral_ssmfe_core_free()
- 7.4 Derived types
  - 7.4.1 struct spral_ssmfe_core_options
  - 7.4.2 struct spral_ssmfe_inform
- 7.5 Error ﬂags
- 7.6 Method
  - 7.6.1 The algorithm
  - 7.6.2 Error estimation
- 7.7 Examples
  - 7.7.1 Preconditioning example
8 SSMFE_EXPERT: Sparse Symmetric Matrix-Free Eigensolver (Expert Interface)
- 8.1 Installation
- 8.2 Usage overview
  - 8.2.1 Calling sequences
  - 8.2.2 Derived types
- 8.3 Argument lists
- 8.4 Derived types
  - 8.4.1 struct spral_ssmfe_options
  - 8.4.2 struct spral_ssmfe_inform
- 8.5 Error ﬂags
- 8.6 Method
  - 8.6.1 The algorithm
  - 8.6.2 Stopping criteria
  - 8.6.3 Improving eigenvector accuracy
  - 8.6.4 The use of shifted matrix factorization
  - 8.6.5 Error estimation
- 8.7 Examples
  - 8.7.1 Preconditioning example

SSMFE: Sparse Symmetric Matrix-Free Eigensolver

SPRAL_SSMFEv1.0.0

Sparse Symmetric Matrix-Free Eigensolver

C User Guide

This package computes extreme (leftmost and/or rightmost) eigenpairs

{λ_{i}, x_{i}}

of the following eigenvalue problems:

the standard eigenvalue problem $\begin{array}{rcl} A x = λ x, & (6.1) \end{array}$
the generalized eigenvalue problem $\begin{array}{rcl} A x = λ B x, & (6.2) \end{array}$
the buckling problem $\begin{array}{rcl} B x = λ A x, & (6.3) \end{array}$

where $A$ and $B$ are real symmetric (or Hermitian) matrices and $B$ is positive deﬁnite.

Evgueni Ovtchinnikov (STFC Rutherford Appleton Laboratory)

Major version history

2015-04-20 Version 1.0.0: Initial release

6.1 Installation

Please see the SPRAL install documentation. In particular note that:

A BLAS library is required.
A LAPACK library is required.

6.2 Usage overview

The eigensolver subroutines behind SPRAL_SSMFE implement a block iterative algorithm. The block nature of this algorithm allows the user to beneﬁt from highly optimized linear algebra subroutines and from the ubiquitous multicore architecture of modern computers. It also makes this algorithm more reliable than Krylov-based algorithms employed e.g. by ARPACK in the presence of clustered eigenvalues. However, convergence of the iterations may be slow if the density of the spectrum is high.

Thus, good performance (in terms of speed) is contingent on the following two factors: (i) the number of desired eigenpairs must be substantial (e.g. not less than the number of CPU cores), and (ii) the employment of a convergence acceleration technique. The acceleration techniques that can be used are shift-and-invert and preconditioning. The former requires the direct solution of linear systems with the matrix $A$ or its linear combination with $B$ , for which a sparse symmetric indeﬁnite solver (such as HSL_MA97 or SPRAL_SSIDS) can be employed. The latter applies to the case of positive deﬁnite $A$ and requires a matrix or an operator¹ $T$ , called a preconditioner, such that the vector $v = T f$ is an approximation to the solution $u$ of the system $A u = f$ (see a simple example in Section 6.7.1). This technique is more sophisticated and is likely to be of interest only to experienced users.

Additional options are oﬀered by the packages SPRAL_SSMFE_EXPERT and SPRAL_SSMFE_CORE, upon which SPRAL_SSMFE is built and which are recommended for experienced users. Further information on the algorithm used by SPRAL_SSMFE can be found in the speciﬁcation document for SPRAL_SSMFE_CORE and in Technical Report RAL-TR-2010-19.

6.2.1 Calling sequences

Access to the package requires inclusion of either spral.h (for the entire SPRALlibrary) or spral_ssmfe.h (for just the SSMFE routines), i.e.

#include "spral.h"

The following procedures are available to the user:

spral_ssmfe_default_options() initializes the options structure to default vales
spral_ssmfe_standard_type() computes leftmost eigenpairs of (6.1), optionally using preconditioning
spral_ssmfe_standard_shift_type() computes eigenpairs of (6.1) near a given shift using the shift-and-invert technique
spral_ssmfe_generalized_type() computes leftmost eigenpairs of (6.2), optionally using preconditioning
spral_ssmfe_generalized_shift_type() computes eigenpairs of (6.2) near a given shift using the shift-and-invert technique
spral_ssmfe_buckling_type() computes eigenpairs of (6.3) near a given shift using the shift-and-invert technique
spral_ssmfe_free_type() should be called after all other calls are complete. It frees memory references by keep and inform.

In the above, type should be replaced by either double or double_complex, and this must match for all subsequent calls with the same data structures.

The main solver procedures must be called repeatedly using a reverse communication interface. The procedure spral_ssmfe_free() should be called once after the ﬁnal call to a solver procedure to deallocate all arrays that have been allocated by the solver procedure.

6.2.2 Derived types

For each problem, the user must employ the derived types deﬁned by the module to declare scalars of the types struct spral_ssmfe_rcid (real version) or struct spral_ssmfe_rciz (complex version), struct spral_ssmfe_options and struct spral_ssmfe_inform. The user must also declare a void * pointer keep for the package’s private data structure. The options data structure must be initialized using spral_ssmfe_default_options(), and the pointer keep must be initialized to NULL. The following pseudocode illustrates this.

      #include "spral.h"
      ...
      struct spral_ssmfe_rcid rcid;
      struct spral_ssmfe_options options;
      struct spral_ssmfe_inform  inform;
      void *keep = NULL;
      ...
      spral_ssmfe_default_options(&options);

The components of ssmfe_options and ssmfe_inform are explained in Sections 6.4.1 and 6.4.2. The components of ssmfe_keepd and ssmfe_keepz are used to pass private data between calls. The components of ssmfe_rcid and ssmfe_rciz that are used by SPRAL_SSMFE for the reverse communication are job, nx, ny, all of type int, and x and y, which are two-dimensional arrays of type double (ssmfe_keepd) or double complex (ssmfe_keepz).

6.3 Argument lists

6.3.1 spral_ssmfe_default_options()

To initialize a variable of type struct spral_ssmfe_options the following routine is provided.

void spral_ssmfe_default_options(struct spral_ssmfe_options *options);

*options: is the instance to be initialized.

6.3.2 spral_ssmfe_standard_type(), spral_ssmfe_standard_shift_type(),
spral_ssmfe_generalized_type(), spral_ssmfe_generalized_shift_type(), and spral_ssmfe_buckling_type(),

To compute the leftmost eigenpairs of (6.1), optionally using preconditioning, the following routine must be called repeatedly:

   void spral_ssmfe_standard_double        (struct spral_ssmfe_rcid *rci, int left, int mep,
      double *lambda, int n, double         *x, int ldx, void **keep,
      const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform);
   void spral_ssmfe_standard_double_complex(struct spral_ssmfe_rciz *rci, int left, int mep,
      double *lambda, int n, double complex *x, int ldx, void **keep,
      const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform);

To compute the eigenvalues of (6.1) in the vicinity of a given value sigma and the corresponding eigenvectors using the shift-and-invert technique, the following routine must be called repeatedly:

   void spral_ssmfe_standard_shift_double        (struct spral_ssmfe_rcid *rci, double sigma,
      int left, int right, int mep, double *lambda, int n, double         *x, int ldx,
      void **keep, const struct spral_ssmfe_options *options,
      struct spral_ssmfe_inform *inform);
   void spral_ssmfe_standard_shift_double_complex(struct spral_ssmfe_rciz *rci, double sigma,
      int left, int right, int mep, double *lambda, int n, double complex *x, int ldx,
      void **keep, const struct spral_ssmfe_options *options,
      struct spral_ssmfe_inform *inform);

To compute the leftmost eigenpairs of (6.2), optionally using preconditioning, the following routine must be called repeatedly:

   void spral_ssmfe_generalized_double        (struct spral_ssmfe_rcid *rci, int left,
      int mep, double *lambda, int n, double         *x, int ldx, void **keep,
      const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform);
   void spral_ssmfe_generalized_double_complex(struct spral_ssmfe_rciz *rci, int left,
      int mep, double *lambda, int n, double complex *x, int ldx, void **keep,
      const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform);

To compute the eigenvalues of (6.2) in the vicinity of a given value sigma and the corresponding eigenvectors using the shift-and-invert technique, the following routine must be called repeatedly:

   void spral_ssmfe_generalized_shift_double        (struct spral_ssmfe_rcid *rci,
      double sigma, int left, int right, int mep, double *lambda, int n,
      double         *x, int ldx, void **keep, const struct spral_ssmfe_options *options,
      struct spral_ssmfe_inform *inform);
   void spral_ssmfe_generalized_shift_double_complex(struct spral_ssmfe_rciz *rci,
      double sigma, int left, int right, int mep, double *lambda, int n,
      double complex *x, int ldx, void **keep, const struct spral_ssmfe_options *options,
      struct spral_ssmfe_inform *inform);

To compute the eigenvalues of (6.3) in the vicinity of a given value sigma and the corresponding eigenvectors the following routine must be called repeatedly:

   void spral_ssmfe_buckling_double        (struct spral_ssmfe_rcid *rci, double sigma,
      int left, int right, int mep, double *lambda, int n, double         *x, int ldx,
      void **keep, const struct spral_ssmfe_options *options,
      struct spral_ssmfe_inform *inform);
   void spral_ssmfe_buckling_double_complex(struct spral_ssmfe_rciz *rci, double sigma,
      int left, int right, int mep, double *lambda, int n, double complex *x, int ldx,
      void **keep, const struct spral_ssmfe_options *options,
      struct spral_ssmfe_inform *inform);

rci

is used for the reverse communication interface. Before the ﬁrst call, rci.job must be set to 0. No other values may be assigned to rci.job by the user. After each call, the value of rci.job must be inspected by the user’s code and the appropriate action taken (see below for details). The following values of rci.job are common to all solver procedures and require the same action:

-3: : fatal error return, the computation must be terminated;
-2: : non-fatal error return, the computation may be restarted, see Section 6.5 for the guidance;
-1: : the computation is complete and successful.
1: : the user must multiply the n $\times$ rci.nx matrix rci.x[] (stored in column major format) by $A$ and place the result into rci.y[] (in column major format).
2: : (spral_ssmfe_standard_type() and spral_ssmfe_generalized_type() only) the user must apply the preconditioner $T$ to the n $\times$ rci.nx matrix rci.x[] (stored in column major format) and place the result into rci.y[] (in column major format).
3: : (spral_ssmfe_generalized_type(), spral_ssmfe_generalized_shift_type() and
spral_ssmfe_buckling_type() only) the user must multiply the n $\times$ rci.nx matrix rci.x[] (stored in column major format) by $B$ and place the result into rci.y[] (in column major format).
9: : (spral_ssmfe_standard_shift_type(), spral_ssmfe_generalized_shift_type() and
spral_ssmfe_buckling_type() only) the solution of the shifted system with the right-hand side n $\times$ rci.nx matrix rci.x[] (stored in column major format) must be placed into rci.y[] (in column major format). For problem (6.1), the shifted matrix is $A - σ I$ , where $I$ is $n \times n$ identity. For problem (6.2), the shifted matrix is $A - σ B$ . For problem (6.3), the shifted matrix is $B - σ A$ .

Restriction: rci.job = 0 is the only value that can be assigned by the user.

sigma

holds the shift, a value around which the desired eigenvalues are situated.

left

holds the number of desired leftmost eigenpairs. Restriction:

0 <

left + right

\leq

min(mep, n/2), where right is zero for spral_ssmfe_standard_type() and spral_ssmfe_generalized_type().

right

holds the number of desired eigenvalues to the right of sigma. Restriction:

0 <

left + right

\leq

min(mep, n/2).

mep

holds the size of the array lambda. See Section 6.6 for guidance on setting mep. Restriction: mep is not less than the number of desired eigenpairs (cf. left and right).

lambda[mep]

is used to store the computed eigenvalues. After a successful completion of the computation it contains eigenvalues in ascending order. This array must not be changed by the user.

n

holds the problem size. Restriction: n

\geq 1

x[mep][ldX]

is used to store the computed eigenvectors. The order of eigenvectors in x[][] is the same as the order of eigenvalues in lambda[]. This array may only be changed by the user before the ﬁrst call to an eigensolver procedure (see the description of options.user_x in Section 6.4.1).

ldx

holds the leading dimension of x[][]. Restriction: ldx

\geq

*keep

must be initialized to NULL before the ﬁrst call. It holds private data and must not be modiﬁed by the user.

*options

speciﬁes the algorithmic options used by the routines, as explained in Section 6.4.1. It must not be changed by the user between calls.

*inform

is used to return information about the execution of the routine, as explained in Section 6.4.2. It must not be changed by the user.

6.3.3 spral_ssmfe_free_type()

At the end of the computation, the memory allocated by the solver procedures should be released by making a call to the following subroutine:

void spral_ssmfe_free_double (void **keep, struct spral_ssmfe_inform *inform);
void spral_ssmfe_free_double_complex(void **keep, struct spral_ssmfe_inform *inform);

*keep: must be unchanged since the last call to a solver subroutine.
*inform: must be unchanged since the last call to a solver subroutine.

6.4 Derived types

6.4.1 struct spral_ssmfe_options

The derived data type ssmfe_options has the following components.

Convergence control options

double abs_tol_lambda: holds an absolute tolerance used when testing the estimated eigenvalue error, see Section 6.6. The default value is 0. Negative values are treated as the default.
double abs_tol_residual: holds an absolute tolerance used when testing the residual, see Section 6.6. The default value is 0. Negative values are treated as the default.
int max_iterations: holds the maximum number of iterations to be performed. The default value is 100.
double rel_tol_lambda: holds a relative tolerance used when testing the estimated eigenvalue error, see Section 6.6. The default value is 0. Negative values are treated as the default.
double rel_tol_residual: holds a relative tolerance used when testing the residual, see Section 6.6. If both abs_tol_residual and rel_tol_residual are set to 0, then the residual norms are not taken into consideration by the convergence test, see Section 6.6. The default value is 0. Negative values are treated as the default.
double tol_x: holds a tolerance used when testing the estimated eigenvector error, see Section 6.6. If tol_x is set to zero, the eigenvector error is not estimated. If a negative value is assigned, the tolerance is set to sqrt(epsilon(lambda)). The default value is -1.0.

Printing options

int print_level

determines the amount of printing. Possible values are:

$< 0$	:	no printing;
$0$	:	error and warning messages only;
$1$	:	the type (standard or generalized) and the size of the problem, the number of eigenpairs
		requested, the error tolerances and the size of the subspace are printed before the iterations start;
$2$	:	as $1$ but, for each eigenpair tested for convergence (see Section 6.6), the iteration number, the index of
		the eigenpair, the eigenvalue, whether it has converged, the residual norm, and the error estimates
		are printed;
$> 2$	:	as $1$ but with all eigenvalues, whether converged, residual norms and eigenvalue/eigenvector
		error estimates printed on each iteration.

The default value is 0. Note that for eigenpairs that are far from convergence, ‘rough’ error estimates are printed (the estimates that are actually used by the stopping criteria, see Section 6.6, only become available on the last few iterations).

int unit_error

holds the Fortran unit number for error messages. Printing is suppressed if unit_error < 0. The default value is 6.

int unit_diagnostic

holds the Fortran unit number for messages monitoring the convergence. Printing is suppressed if unit_diagnostics < 0. The default value is 6.

int unit_warning

holds the Fortran unit number for warning messages. Printing is suppressed if unit_warning < 0. The default value is 6.

Advanced options

double left_gap: is only used when left is non-zero, and speciﬁes the minimal acceptable distance between the last computed left eigenvalue and the rest of the spectrum. For spral_ssmfe_standard_type() and spral_ssmfe_generalized_type(), the last computed left eigenvalue is the rightmost of the computed ones, and for the other procedures it is the leftmost. If set to a negative value $δ$ , the minimal distance is taken as $| δ |$ times the average distance between the computed eigenvalues. Note that for this option to have any eﬀect, the value of mep must be larger than left + right: see Section 6.6 for further explanation. The default value is 0.
int max_left: holds the number of eigenvalues to the left from $σ$ , or a negative value, if this number is not known (cf. Section 6.6). The default is max_left = -1.
int max_right: holds the number of eigenvalues to the right from $σ$ , or a negative value, if this number is not known. (cf. Section 6.6). The default is max_right = -1.
double right_gap: is only used by spral_ssmfe_standard_shift_type(), spral_ssmfe_generalized_shift_type() and spral_ssmfe_buckling_type() with a non-zero right, and has the same meaning as options.left_gap but for the rightmost computed eigenvalue. The default value is 0.
int user_x: speciﬁes whether an inital guess for eigenvectors is supplied. If user_x > 0 then the ﬁrst user_x columns of x[][] will be used as initial guesses for eigenvectors. Hence, if the user has good approximations to some of the required eigenvectors, the computation time may be reduced by putting these approximations into the ﬁrst user_x columns of x[][]. The default value is 0, i.e. the columns of x[][] are overwritten by the solver. Restriction: if user_x > 0 then the ﬁrst user_x columns in x[][] must be linearly independent.

6.4.2 struct spral_ssmfe_inform

The derived data type ssmfe_inform is used to hold information from the execution of the solver procedures. The components are:

int flag: is used as an error ﬂag. If a call is successful, flag has value 0. A nonzero value of flag indicates an error or a warning (see Section 6.5).
int iteration: holds the number of iterations.
int left: holds the number of converged eigenvalues on the left, i.e. the total number of converged eigenpairs for spral_ssmfe_standard_type() and spral_ssmfe_generalized_type(), and the number of the converged eigenvalues left of sigma for spral_ssmfe_standard_shift_type(), spral_ssmfe_generalized_shift_type() and spral_ssmfe_buckling_type().
double next_left: holds the non-converged eigenvalue next to the last converged on the left (cf. options.left_gap).
double next_right: is used by spral_ssmfe_standard_shift_type(), spral_ssmfe_generalized_shift_type() and spral_ssmfe_buckling_type() only, and holds the non-converged eigenvalue next to the last converged on the right (cf. options.right_gap).
int non_converged: holds the number of non-converged eigenpairs (see Section 6.5).
int right: is used by spral_ssmfe_standard_shift_type(), spral_ssmfe_generalized_shift_type() and spral_ssmfe_buckling_type() only, and holds the number of converged eigenvalues right of sigma.
int stat: holds the Fortran allocation status in the event of an error (see Section 6.5).

6.5 Error ﬂags

A successful return from a solver procedure is indicated by inform.flag $=$ 0. A negative value indicates an error, a positive value indicates a warning; inform.data provides further information about some errors and warnings.

Possible negative values of inform.flag are as follows:

: -1 rci.job is out-of-range.
: -9 n is out-of-range.
: -10 ldx is out-of-range.
: -11 left is out-of-range.
: -12 right is out-of-range.
: -13 mep is less than the number of desired eigenpairs.
: -100 Not enough memory; inform.stat contains the value of the Fortran stat parameter.
: -200 $B$ is not positive deﬁnite or user_x > 0 and linear dependent initial guesses were supplied.

Possible positive values are:

: 1 The iterations have been terminated because no further improvement in accuracy is possible (this may happen if $B$ or the preconditioner is not positive deﬁnite, or if the components of the residual vectors are so small that the round-oﬀ errors make them essentially random). The value of inform.non_converged is set to the number of non-converged eigenpairs.
: 2 The maximum number of iterations max_iterations has been exceeded. The value of inform.non_converged is set to the number of non-converged eigenpairs.
: 3 The solver had run out of storage space for the converged eigenpairs before the gap in the spectrum required by options.left_gap and/or options.right_gap was reached. The value of inform.non_converged is set to the number of non-converged eigenpairs.

If the computation is terminated with the error code 2 or 3, it can be resumed with larger values of max_iterations and/or mep. In this case the user should set options.user_X to info.left $+$ info.right and restart the reverse communication loop. An alternative option is to use one of the advanced solver procedures from SPRAL_SSMFE_EXPERT or SPRAL_SSMFE_CORE that delegate the storage of computed eigenpairs and the termination of the computation to the user.

6.6 Method

SPRAL_SSMFE_CORE, upon which SPRAL_SSMFE is built, implements a block iterative algorithm based on the Jacobi-conjugate preconditioned gradients (JCPG) method [2,3]. This algorithm simultaneously computes $m < n$ approximate eigenpairs, where the block size $m$ exceeds the number $n_{e}$ of desired eigenpairs for the sake of better convergence, namely, $m = n_{e} + min (10, 0.1 n_{e})$ .

An approximate eigenpair ${x, λ}$ is considered to have converged if the following three conditions are all satisﬁed:

if options.abs_tol_lambda and options.rel_tol_lambda are not both equal to zero, then the estimated error in the approximate eigenvalue must be less than or equal to
max(options.abs_tol_lambda, $δ$ *options.rel_tol_lambda),
where $δ$ is the estimated average distance between eigenvalues.
if options.tol_x is not zero, then the estimated sine of the angle between the approximate eigenvector and the invariant subspace corresponding to the eigenvalue approximated by $λ$ must be less than or equal to options.tol_x.
if options.abs_tol_residual and options.rel_tol_residual are not both equal to zero, then the Euclidean norm of the residual, $∥ A x - λ B x ∥_{2}$ , must be less than or equal to
max(options.abs_tol_residual, options.rel_tol_residual* $∥ λ B x ∥_{2}$ ).

The extra eigenpairs are not checked for convergence, as their role is purely auxiliary.

If the gap between the last computed eigenvalue and the rest of the spectrum is small, then the accuracy of the corresponding eigenvector may be very low. To prevent this from happening, the user should set the eigenpairs storage size mep to a value that is larger than the number of desired eigenpairs, and set the options options.left_gap and options.right_gap to non-zero values $δ_{l}$ and $δ_{r}$ . These values determine the size of the minimal acceptable gaps between the computed eigenvalues and the rest of the spectrum, $δ_{l}$ referring to either leftmost eigenvalues (for spral_ssmfe_standard_type() and spral_ssmfe_generalized_type() only) or those to the left of the shift sigma, and $δ_{r}$ to those to the right of the shift sigma. Positive values of $δ_{l}$ and $δ_{r}$ set the gap explicitely, and negative values require the gap to be not less than their absolute value times the average distance between the computed eigenvalues. A recommended value of $δ_{l}$ and $δ_{r}$ is $- 0.1$ . The value of mep has little eﬀect on the speed of computation, hence it might be set to any reasonably large value. The larger the value of mep, the larger the size of an eigenvalue cluster for which accurate eigenvectors can be computed, notably: to safeguard against clusters of size up to $k$ , it is suﬃcient to set mep to the number of desired eigenpairs plus $k - 1$ .

When using the solver procedures that employ the shift-and-invert technique, it is very important to ensure that the numbers of desired eigenvalues each side of the shift do not exceed the actual numbers of these eigenvalues, as the eigenpairs ‘approximating’ non-existing eigenpairs of the problem will not converge. It is therefore strongly recommended that the user employs a linear system solver that performs the LDLT factorization of the shifted system, e.g. HSL_MA97 or SPRAL_SSIDS. The LDLT factorization of the matrix $A - σ B$ consists in ﬁnding a lower triangular matrix $L$ , a block-diagonal matrix $D$ with $1 \times 1$ and $2 \times 2$ blocks on the main diagonal and a permutation matrix $P$ such that $P^{T} (A - σ B) P = L D L^{T}$ . By inertia theorem, the number of eigenvalues to the left and right from the shift $σ$ is equal to the number of negative and positive eigenvalues of $D$ , which allows quick computation of the eigenvalue numbers each side of the shift.

References

[1] E. E. Ovtchinnikov and J. Reid. A preconditioned block conjugate gradient algorithm for computing extreme eigenpairs of symmetric and Hermitian problems. Technical Report RAL-TR-2010-19, 2010.

[2] E. E. Ovtchinnikov, Jacobi correction equation, line search and conjugate gradients in Hermitian eigenvalue computation I: Computing an extreme eigenvalue, SIAM J. Numer. Anal., 46:2567–2592, 2008.

[3] E. E. Ovtchinnikov, Jacobi correction equation, line search and conjugate gradients in Hermitian eigenvalue computation II: Computing several extreme eigenvalues, SIAM J. Numer. Anal., 46:2593–2619, 2008.

6.7 Examples

6.7.1 Preconditioning example

The following code computes the 5 leftmost eigenpairs of the matrix $A$ of order 100 that approximates the two-dimensional Laplacian operator on a 20-by-20 grid. One forward and one backward Gauss-Seidel update are used for preconditioning, which halves the number of iterations compared with solving the same problem without preconditioning. The header laplace2d.h (examples/C/ssmfe/laplace2d.h) supplies a subroutine apply_laplacian() that multiplies a block of vectors by $A$ , and a subroutine apply_gauss_seidel_step() that computes $y = T x$ for a block of vectors $x$ by applying one forward and one backward update of the Gauss-Seidel method to the system $A y = x$ .

/* examples/C/ssmfe/precond_ssmfe.c */
/* Laplacian on a square grid (using SPRAL_SSMFE routines) */
#include "spral.h"
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <cblas.h>

/* Header that implements Laplacian and preconditioners */
#include "laplace2d.h"

void main(void) {
   const int m   = 20;     /* grid points along each side */
   const int n   = m*m;    /* problem size */
   const int nep = 5;      /* eigenpairs wanted */

   double lambda[2*nep];                  /* eigenvalues */
   double X[2*nep][n];                    /* eigenvectors */
   struct spral_ssmfe_rcid rci;           /* reverse communication data */
   struct spral_ssmfe_options options;    /* options */
   void *keep;                            /* private data */
   struct spral_ssmfe_inform inform;      /* information */

   /* Initialize options to default values */
   spral_ssmfe_default_options(&options);
   /* gap between the last converged eigenvalue and the rest of the spectrum
    * must be at least 0.1 times average gap between computed eigenvalues */
   options.left_gap = -0.1;

   rci.job = 0; keep = NULL;
   while(true) { /* reverse communication loop */
      spral_ssmfe_standard_double(&rci, nep, 2*nep, lambda, n, &X[0][0], n,
         &keep, &options, &inform);
      switch ( rci.job ) {
      case 1:
         apply_laplacian(m, m, rci.nx, rci.x, rci.y);
         break;
      case 2:
         apply_gauss_seidel_step(m, m, rci.nx, rci.x, rci.y);
         break;
      default:
         goto finished;
      }
   }
finished:
   printf("%d eigenpairs converged in %d iterations\n", inform.left, inform.iteration);
   for(int i=0; i<inform.left; i++)
      printf(" lambda[%1d] = %13.7e\n", i, lambda[i]);
   spral_ssmfe_free_double(&keep, &inform);
}

This code produces the following output:

6 eigenpairs converged in 19 iterations
lambda[0] = 4.4676695e-02
lambda[1] = 1.1119274e-01
lambda[2] = 1.1119274e-01
lambda[3] = 1.7770878e-01
lambda[4] = 2.2040061e-01
lambda[5] = 2.2040061e-01

Note that the code computed one extra eigenpair because of the insuﬃcient gap between the 5th and 6th eigenvalues.

6.7.2 Shift-and-invert example

The following code computes the eigenpairs of the matrix of order 64 that approximates the two-dimensional Laplacian operator on 8-by-8 grid with eigenvalues near the shift sigma $= 1.0$ . For the shifted solve, LAPACK subroutines DSYTRS and DSYTRF are used, which perform the LDLT-factorization and the solution of the factorized system respectively. The matrix of the discretized Laplacian is computed by the subroutine set_2d_laplacian_matrix() from the laplace2d.h header (examples/C/ssmfe/laplace2d.h). The header ldltf.h (examples/C/ssmfe/ldltf.h) supplies the function num_neg_D() that counts the number of negative eigenvalues of the D-factor.

/* examples/C/ssmfe/shift_invert.c */
/* Laplacian on a rectangular grid by shift-invert via LDLT factorization */
#include "spral.h"
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <cblas.h>

/* Headers that implements Laplacian and preconditioners and LDLT support */
#include "laplace2d.h"
#include "ldltf.h"

void main(void) {
   const int nx = 8;          /* grid points along x */
   const int ny = 8;          /* grid points along y */
   const int n = nx*ny;       /* problem size */
   const double sigma = 1.0;  /* shift */

   int ipiv[n];               /* LDLT pivot index */
   double lambda[n];          /* eigenvalues */
   double X[n][n];            /* eigenvectors */
   double A[n][n];            /* matrix */
   double LDLT[n][n];         /* factors */
   double work[n][n];         /* work array for dsytrf */
   struct spral_ssmfe_options options;    /* eigensolver options */
   struct spral_ssmfe_inform inform;      /* information */
   struct spral_ssmfe_rcid rci;           /* reverse communication data */
   void *keep;                            /* private data */

   /* Initialize options to default values */
   spral_ssmfe_default_options(&options);

   /* Set up then perform LDLT factorization of the shifted matrix */
   set_laplacian_matrix(nx, ny, n, A);
   for(int j=0; j<n; j++)
   for(int i=0; i<n; i++)
      LDLT[j][i] = (i==j) ? A[j][i] - sigma
                          : A[j][i];
   cwrap_dsytrf(’L’, n, &LDLT[0][0], n, ipiv, &work[0][0], n*n);

   /* Main loop */
   int left = num_neg_D(n, n, LDLT, ipiv);   /* all evalues to left of sigma */
   int right = 5;                            /* 5 evalues to right of sigma */
   rci.job = 0; keep = NULL;
   while(true) {
      spral_ssmfe_standard_shift_double(&rci, sigma, left, right, n, lambda,
            n, &X[0][0], n, &keep, &options, &inform);
      switch( rci.job ) {
      case 1:
         cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans, n, rci.nx, n,
            1.0, &A[0][0], n, rci.x, n, 0.0, rci.y, n);
         break;
      case 2:
         // No preconditioning
         break;
      case 9:
         cblas_dcopy(n*rci.nx, rci.x, 1, rci.y, 1);
         cwrap_dsytrs(’L’, n, rci.nx, &LDLT[0][0], n, ipiv, rci.y, n);
         break;
      default:
         goto finished;
      }
   }
finished:
   printf("Eigenvalues near %e (took %d iterations)\n", sigma, inform.iteration);
   for(int i=0; i<inform.left+inform.right; i++)
      printf(" lambda[%1d] = %13.7e\n", i, lambda[i]);
   spral_ssmfe_free_double(&keep, &inform);
}

This code produces the following output:

Eigenvalues near 1.000000e+00 (took 5 iterations)
lambda[0] = 2.4122952e-01
lambda[1] = 5.8852587e-01
lambda[2] = 5.8852587e-01
lambda[3] = 9.3582223e-01
lambda[4] = 1.1206148e+00
lambda[5] = 1.1206148e+00
lambda[6] = 1.4679111e+00
lambda[7] = 1.4679111e+00
lambda[8] = 1.7733184e+00

6.7.3 Hermitian example

The following code computes the 5 leftmost eigenpairs of the diﬀerential operator $i \frac{d}{d x}$ acting in the space of periodic functions discretized by central diﬀerences on a uniform mesh of 80 steps.

/* examples/C/ssmfe/hermitian.c - Example code for SPRAL_SSMFE package */
/* Hermitian operator example */
#include "spral.h"
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <cblas.h>

/* central differences for i d/dx */
void apply_idx(int n, int m, const double complex *x_ptr, double complex *y_ptr) {
   const double complex (*x)[n] = (const double complex (*)[n]) x_ptr;
   double complex (*y)[n] = (double complex (*)[n]) y_ptr;
   for(int j=0; j<m; j++) {
      for(int i=0; i<n; i++) {
         int il = (i==0)   ? n-1 : i-1;
         int ir = (i==n-1) ? 0   : i+1;
         y[j][i] = _Complex_I * (x[j][ir] - x[j][il]);
      }
   }
}

/* main routine */
void main(void) {
   const int n   = 80;                 /* problem size */
   const int nep = 5;                  /* eigenpairs wanted */

   double lambda[nep];                 /* eigenvalues */
   double complex X[nep][n];           /* eigenvectors */
   struct spral_ssmfe_rciz rci;        /* reverse communication data */
   struct spral_ssmfe_options options; /* options */
   void *keep;                         /* private data */
   struct spral_ssmfe_inform inform;   /* information */

   /* Initialize options to default values */
   spral_ssmfe_default_options(&options);

   rci.job = 0; keep = NULL;
   while(true) { /* reverse communication loop */
      spral_ssmfe_standard_double_complex(&rci, nep, nep, lambda, n,
         &X[0][0], n, &keep, &options, &inform);
      switch ( rci.job ) {
      case 1:
         apply_idx(n, rci.nx, rci.x, rci.y);
         break;
      case 2:
         // No preconditioning
         break;
      default:
         goto finished;
      }
   }
finished:
   printf("%d eigenpairs converged in %d iterations\n", inform.left, inform.iteration);
   for(int i=0; i<inform.left; i++)
      printf(" lambda[%1d] = %13.7e\n", i, lambda[i]);
   spral_ssmfe_free_double_complex(&keep, &inform);
}

This code produces the following output:

5 eigenpairs converged in 25 iterations
lambda[0] = -2.0000000e+00
lambda[1] = -1.9938347e+00
lambda[2] = -1.9938347e+00
lambda[3] = -1.9753767e+00
lambda[4] = -1.9753767e+00

Contents

SPRAL

SSMFE: Sparse Symmetric Matrix-Free Eigensolver

SPRAL_SSMFEv1.0.0

Sparse Symmetric Matrix-Free Eigensolver

Major version history

6.1 Installation

6.2 Usage overview

6.2.1 Calling sequences

6.2.2 Derived types

6.3 Argument lists

6.3.1 spral_ssmfe_default_options()

6.3.2 spral_ssmfe_standard_type(), spral_ssmfe_standard_shift_type(), spral_ssmfe_generalized_type(), spral_ssmfe_generalized_shift_type(), and spral_ssmfe_buckling_type(),

6.3.3 spral_ssmfe_free_type()

6.4 Derived types

6.4.1 struct spral_ssmfe_options

6.4.2 struct spral_ssmfe_inform

6.5 Error ﬂags

6.6 Method

References

6.7 Examples

6.7.1 Preconditioning example

6.7.2 Shift-and-invert example

6.7.3 Hermitian example

6.3.2 spral_ssmfe_standard_type(), spral_ssmfe_standard_shift_type(),
spral_ssmfe_generalized_type(), spral_ssmfe_generalized_shift_type(), and spral_ssmfe_buckling_type(),