spral_ssmfe_expert - Sparse Symmetric Matrix-Free Eigensolver (Expert interface)

Purpose

This package computes extreme (leftmost and/or rightmost) eigenpairs \(\{\lambda_i, x_i\}\) of the following eigenvalue problems:

  • the standard eigenvalue problem

    \[A x = \lambda x,\]
  • the generalized eigenvalue problem

    \[A x = \lambda B x,\]
  • the buckling problem

    \[B x = \lambda A x,\]

where \(A\) and \(B\) are real symmetric (or Hermitian) matrices and \(B\) is positive definite.

The module spral_ssmfe provides a more user-friendly wrapper around this code. Conversely, spral_ssmfe_core provides a lower level implementation of the core solver, which this package provides a wrapper for.

Major version history

2014-11-20 Version 1.0.0

Initial release

[for detail please see ChangeLog]

Usage overview

The eigensolver subroutines behind spral_ssmfe_expert implement a block iterative algorithm. The block nature of this algorithm allows the user to benefit from highly optimized linear algebra subroutines and from the ubiquitous multicore architecture of modern computers. It also makes this algorithm more reliable than Krylov-based algorithms employed e.g. by ARPACK in the presence of clustered eigenvalues. However, convergence of the iterations may be slow if the density of the spectrum is high.

Thus, good performance (in terms of speed) is contingent on the following two factors:

  1. the number of desired eigenpairs must be substantial (e.g. not fewer than the number of CPU cores), and

  2. the employment of a convergence acceleration technique.

The acceleration techniques that can be used are shift-and-invert and preconditioning.

The former requires the direct solution of linear systems with the matrix \(A\) or its linear combination with \(B\), for which a sparse symmetric indefinite solver (such as HSL_MA97 or SPRAL_SSIDS) can be employed.

The latter applies to the case of positive definite \(A\) and requires a matrix or an operator \(T\), called a preconditioner, such that the vector \(v = T f\) is an approximation to the solution \(u\) of the system \(A u = f\) (see a simple example in Section [ssmfe:example:precond]). Note: This technique is only recommended for experienced users.

In this expert interface, the user must handle storage of all vectors, facilitating advanced memory handling techniques required for parallel, hybrid and/or out-of-core execution. If there is no requirement to store these vectors, consider using the simplified interface of spral_ssmfe instead.

Subroutines

To use the solver procedures, the user must maintain a workspace of (kw+1) blocks each containing m vectors of size n. For notational convienience we refer to this workspace as a Fortran array W(n,m,0:kw), but the user is free to store it as they wish. Note the block dimension is indexed from zero, not from one. The following table provides minimum values of kw for each setup:

minAprod=T

minAprod=F

Problem

minBprod=T

minBprod=F

minBprod=T

minBprod=F

standard

7

5

3

3

standard_shift

7

5

N/A

N/A

generalized

7

5

5

3

generalized_shift

7

7

N/A

N/A

buckling

7

7

N/A

N/A

Further, the user must also store the converged eigenvectors \(X\), and (for generalised problems) their \(B\)-images \(BX\) using separate storage, e.g. X(n,mep), BX(n,mep). In addition to being output, the routine may need to reorthagonalise against these from time to time.

subroutine  ssmfe_standard(rci, left, mep, lambda, m, rr, ind, keep, options, inform)

Computes the left-most eigenpairs of the standard eigenvalue problem

\[Ax = \lambda x\]

Optionally uses preconditioning.

Uses reverse-communication. Upon return the user must perform a task specified by the rci parameter and recall the routine. Possible values of rci and associated tasks are:

rci%job

Task to be performed

-3

None. Fatal error, see inform%flag.

-2

Failed to converge, see inform%flag.

-1

None. Computation complete.

1

Calculate \(\bar{V} = AU\).

2

Apply preconditioner \(\bar{V} = TU\). (Copy if T=I).

5

Copy converged eigenvectors \(X\) to user storage:

  • If rci%i>0: W(:,rci%jx:rci%jx+rci%nx-1,rci%kx).

  • Else: W(:,rci%jx-rci%nx+1:rci%jx,rci%kx).

11

If rci%i.eq.0, copy \(\bar{V} = U\).

Otherwise, reorder columns of block rci%kx such that column ind(j) becomes the new column j for j=1, …, rci%nx

Note: if rci%kx.eq.rci%ky, only reorder once.

12

Compute the dot products

\[r_{ii} = U_i \cdot \bar{V}_i\]

13

Perform the scalings

\[U_i = U_i/\sqrt{(U_i\cdot \bar{V}_i)}\]

and

\[\bar{V}_i = \bar{V}_i/\sqrt{(U_i\cdot \bar{V}_i)}\]

for each column \(U_i\) and \(\bar{V}_i\) of \(U\) and \(\bar{V}\).

Note: if rci%kx.eq.rci%ky, only scale once.

14

Perform the updates

\[\bar{V}_i = \bar{V}_i + r_{ii} U_i\]

for each column \(\bar{V}_i\) of \(\bar{V}\)

15

Perform the update

\[R = \alpha U^T V + \beta R\]

16

Perform the update

\[V = \alpha U R + \beta V\]

17

Perform the update

\[U = \alpha U R\]

Note: \(V\) may be used as a workspace

21

Orthogonalize columns of \(V\) to all vectors \(X\) by solving

\[(X^TX) Q = X^T \bar{V}\]

for \(Q\) and updating

\[U = U - XQ\]

22

Orthogonalize columns of \(U\) to all vectors \(X\) by solving

\[(X^TX) Q = X^T U\]

for \(Q\) and updating

\[U = U - XQ\]

999

Restart:

If rci%k>0: Restart suggested with block size m >= rci%nx + rci%i + rci%j, adjusting workspace size to match. Set rci%i=0 and rci%j=0 and recall the routine. If a restart is not desirable, routine may be recalled with no change to parameters.

If rci%k=0: Restart required with the same block size.

In both cases, the first block W(:,:,0) should retain vectors rci%jx:rci%jx+rci%nx-1, filling remaining vectors randomly such that the entire set of columns is linearly independent from each other and also from the converged eigenvectors.

The matrices are defined as follows:

  • \(U\) = W(:, rci%jx:rci%jx+rci%nx-1, rci%kx)

  • \(V\) = W(:, rci%jy:rci%jy+rci%ny-1, rci%ky)

  • \(\bar{V}\) = W(:, rci%jy:rci%jy+rci%nx-1, rci%ky)

  • \(R\) = rr(rci%i:rci%i+rci%nx-1, rci%j:rci%j+rci%ny-1, rci%k)

and \(\alpha\) and \(\beta\) are given by rci%alpha and rci%beta respectively. We use the notation \(r_{ii}\) to refer to the \(i\)-th diagonal element of \(R\), being rr(rci%i+i-1,rci%j+i-1,rci%k).

Parameters:
  • rci [ssmfe_rcid ,inout] :: Reverse communication type. rci%job must be set to 0 before the first call. (Type ssmfe_rciz in complex version).

  • left [integer ,in] :: Number of left eigenpairs to find.

  • mep [integer ,in] :: Number of working eigenpairs. See [method] section for guidance on selecting a good value. Must be at least left.

  • lambda (mep) [real ,inout] :: Current eigenvalue estimates in ascending order.

  • m [integer ,in] :: Block size of workspace W. Must be at least 2.

  • rr (2*m,2*m,3) [real ,inout] :: reverse communication workspace. (Type complex in complex version).

  • ind (m) [integer ,inout] :: reverse communication workspace.

  • keep [ssmfe_expert_keep ,inout] :: Internal workspace used by routine.

  • options [ssmfe_options ,in] :: specifies algorithm options to be used.

  • inform [ssmfe_inform ,inout] :: returns information about the exection of the routine.

subroutine  ssmfe_standard_shift(rci, sigma, left, right, mep, lambda, m, rr, ind, keep, options, inform)

Computes eigenpairs of the standard eigenvalue problem

\[Ax = \lambda x\]

in the vicinity of a given value \(sigma\).

Uses reverse-communication. Upon return the user must perform a task specified by the rci parameter and recall the routine. Possible values of rci and associated tasks are:

rci%job

Task to be performed

-3

None. Fatal error, see inform%flag.

-2

Failed to converge, see inform%flag.

-1

None. Computation complete.

2

Apply preconditioner \(\bar{V} = TU\). (Copy if T=I).

5

Copy converged eigenvectors \(X\) to user storage:

  • If rci%i>0: W(:,rci%jx:rci%jx+rci%nx-1,rci%kx).

  • Else: W(:,rci%jx-rci%nx+1:rci%jx,rci%kx).

9

Compute \(V = (A-\sigma I)^{-1} U\)

11

If rci%i.eq.0, copy \(\bar{V} = U\).

Otherwise, reorder columns of block rci%kx such that column ind(j) becomes the new column j for j=1, …, rci%nx

Note: if rci%kx.eq.rci%ky, only reorder once.

12

Compute the dot products

\[r_{ii} = U_i \cdot \bar{V}_i\]

13

Perform the scalings

\[U_i = U_i/\sqrt{(U_i\cdot \bar{V}_i)}\]

and

\[\bar{V}_i = \bar{V}_i/\sqrt{(U_i\cdot \bar{V}_i)}\]

for each column \(U_i\) and \(\bar{V}_i\) of \(U\) and \(\bar{V}\).

Note: if rci%kx.eq.rci%ky, only scale once.

14

Perform the updates

\[\bar{V}_i = \bar{V}_i + r_{ii} U_i\]

for each column \(\bar{V}_i\) of \(\bar{V}\)

15

Perform the update

\[R = \alpha U^T V + \beta R\]

16

Perform the update

\[V = \alpha U R + \beta V\]

17

Perform the update

\[U = \alpha U R\]

Note: \(V\) may be used as a workspace

21

Orthogonalize columns of \(V\) to all vectors \(X\) by solving

\[(X^TX) Q = X^T \bar{V}\]

for \(Q\) and updating

\[U = U - XQ\]

22

Orthogonalize columns of \(U\) to all vectors \(X\) by solving

\[(X^TX) Q = X^T U\]

for \(Q\) and updating

\[U = U - XQ\]

999

Restart:

If rci%k>0: Restart suggested with block size m >= rci%nx + rci%i + rci%j, adjusting workspace size to match. Set rci%i=0 and rci%j=0 and recall the routine. If a restart is not desirable, routine may be recalled with no change to parameters.

If rci%k=0: Restart required with the same block size.

In both cases, the first block W(:,:,0) should retain vectors rci%jx:rci%jx+rci%nx-1, filling remaining vectors randomly such that the entire set of columns is linearly independent from each other and also from the converged eigenvectors.

The matrices are defined as follows:

  • \(U\) = W(:, rci%jx:rci%jx+rci%nx-1, rci%kx)

  • \(V\) = W(:, rci%jy:rci%jy+rci%ny-1, rci%ky)

  • \(\bar{V}\) = W(:, rci%jy:rci%jy+rci%nx-1, rci%ky)

  • \(R\) = rr(rci%i:rci%i+rci%nx-1, rci%j:rci%j+rci%ny-1, rci%k)

and \(\alpha\) and \(\beta\) are given by rci%alpha and rci%beta respectively. We use the notation \(r_{ii}\) to refer to the \(i\)-th diagonal element of \(R\), being rr(rci%i+i-1,rci%j+i-1,rci%k).

Parameters:
  • rci [ssmfe_rcid ,inout] :: Reverse communication type. rci%job must be set to 0 before the first call. (Type ssmfe_rciz in complex version).

  • sigma [real ,in] :: Shift value \(sigma\).

  • left [integer ,in] :: Number of left eigenpairs to find.

  • right [integer ,in] :: Number of right eigenpairs to find.

  • mep [integer ,in] :: Number of working eigenpairs. See [method] section for guidance on selecting a good value. Must be at least left+right.

  • lambda (mep) [real ,inout] :: Current eigenvalue estimates in ascending order.

  • m [integer ,in] :: Block size of workspace W. Must be at least 2.

  • rr (2*m,2*m,3) [real ,inout] :: reverse communication workspace. (Type complex in complex version).

  • ind (m) [integer ,inout] :: reverse communication workspace.

  • keep [ssmfe_expert_keep ,inout] :: Internal workspace used by routine.

  • options [ssmfe_options ,in] :: specifies algorithm options to be used.

  • inform [ssmfe_inform ,inout] :: returns information about the exection of the routine.

subroutine  ssmfe_generalized(rci, left, mep, lambda, m, rr, ind, keep, options, inform)

Computes the left-most eigenpairs of the generalized eigenvalue problem

\[Ax = \lambda B x\]

Optionally uses preconditioning.

Uses reverse-communication. Upon return the user must perform a task specified by the rci parameter and recall the routine. Possible values of rci and associated tasks are:

rci%job

Task to be performed

-3

None. Fatal error, see inform%flag.

-2

Failed to converge, see inform%flag.

-1

None. Computation complete.

1

Calculate \(\bar{V} = AU\).

2

Apply preconditioner \(\bar{V} = TU\). (Copy if T=I).

3

Compute \(\bar{V} = BU\)

5

Copy converged eigenvectors \(X\) to user storage:

  • If rci%i>0: W(:,rci%jx:rci%jx+rci%nx-1,rci%kx).

  • Else: W(:,rci%jx-rci%nx+1:rci%jx,rci%kx).

Optionally save their \(B\)-images:

  • If rci%i>0: W(:,rci%jx:rci%jx+rci%nx-1,rci%ky).

  • Else: W(:,rci%jx-rci%nx+1:rci%jx,rci%ky).

11

If rci%i.eq.0, copy \(\bar{V} = U\).

Otherwise, reorder columns of block rci%kx such that column ind(j) becomes the new column j for j=1, …, rci%nx

Note: if rci%kx.eq.rci%ky, only reorder once.

12

Compute the dot products

\[r_{ii} = U_i \cdot \bar{V}_i\]

13

Perform the scalings

\[U_i = U_i/\sqrt{(U_i\cdot \bar{V}_i)}\]

and

\[\bar{V}_i = \bar{V}_i/\sqrt{(U_i\cdot \bar{V}_i)}\]

for each column \(U_i\) and \(\bar{V}_i\) of \(U\) and \(\bar{V}\).

Note: if rci%kx.eq.rci%ky, only scale once.

14

Perform the updates

\[\bar{V}_i = \bar{V}_i + r_{ii} U_i\]

for each column \(\bar{V}_i\) of \(\bar{V}\)

15

Perform the update

\[R = \alpha U^T V + \beta R\]

16

Perform the update

\[V = \alpha U R + \beta V\]

17

Perform the update

\[U = \alpha U R\]

Note: \(V\) may be used as a workspace

21

\(B\)-orthogonalize columns of \(V\) to all vectors \(X\) by solving

\[(X^TBX) Q = X^T \bar{V}\]

for \(Q\) and updating

\[\begin{split}U & = & U - XQ \\ \bar{V} & = & \bar{V} - BXQ\end{split}\]

The update of \(\bar{V}\) may be replaced by

\[\bar{V} = BU\]

22

\(B\)-orthogonalize columns of \(U\) to all vectors \(X\) by solving

\[(X^TBX) Q = X^T U\]

for \(Q\) and updating

\[U = U - BXQ\]

999

Restart:

If rci%k>0: Restart suggested with block size m >= rci%nx + rci%i + rci%j, adjusting workspace size to match. Set rci%i=0 and rci%j=0 and recall the routine. If a restart is not desirable, routine may be recalled with no change to parameters.

If rci%k=0: Restart required with the same block size.

In both cases, the first block W(:,:,0) should retain vectors rci%jx:rci%jx+rci%nx-1, filling remaining vectors randomly such that the entire set of columns is linearly independent from each other and also from the converged eigenvectors.

The matrices are defined as follows:

  • \(U\) = W(:, rci%jx:rci%jx+rci%nx-1, rci%kx)

  • \(V\) = W(:, rci%jy:rci%jy+rci%ny-1, rci%ky)

  • \(\bar{V}\) = W(:, rci%jy:rci%jy+rci%nx-1, rci%ky)

  • \(R\) = rr(rci%i:rci%i+rci%nx-1, rci%j:rci%j+rci%ny-1, rci%k)

and \(\alpha\) and \(\beta\) are given by rci%alpha and rci%beta respectively. We use the notation \(r_{ii}\) to refer to the \(i\)-th diagonal element of \(R\), being rr(rci%i+i-1,rci%j+i-1,rci%k).

Parameters:
  • rci [ssmfe_rcid ,inout] :: Reverse communication type. rci%job must be set to 0 before the first call. (Type ssmfe_rciz in complex version).

  • left [integer ,in] :: Number of left eigenpairs to find.

  • mep [integer ,in] :: Number of working eigenpairs. See [method] section for guidance on selecting a good value. Must be at least left.

  • lambda (mep) [real ,inout] :: Current eigenvalue estimates in ascending order.

  • m [integer ,in] :: Block size of workspace W. Must be at least 2.

  • rr (2*m,2*m,3) [real ,inout] :: reverse communication workspace. (Type complex in complex version).

  • ind (m) [integer ,inout] :: reverse communication workspace.

  • keep [ssmfe_expert_keep ,inout] :: Internal workspace used by routine.

  • options [ssmfe_options ,in] :: specifies algorithm options to be used.

  • inform [ssmfe_inform ,inout] :: returns information about the exection of the routine.

subroutine  ssmfe_generalized_shift(rci, sigma, left, right, mep, lambda, m, rr, ind, keep, options, inform)

Computes eigenpairs of the generalized eigenvalue problem

\[Ax = \lambda B x\]

in the vicinity of a given value \(\sigma\).

Uses reverse-communication. Upon return the user must perform a task specified by the rci parameter and recall the routine. Possible values of rci and associated tasks are:

rci%job

Task to be performed

-3

None. Fatal error, see inform%flag.

-2

Failed to converge, see inform%flag.

-1

None. Computation complete.

3

Compute \(\bar{V} = BU\)

5

Copy converged eigenvectors \(X\) to user storage:

  • If rci%i>0: W(:,rci%jx:rci%jx+rci%nx-1,rci%kx).

  • Else: W(:,rci%jx-rci%nx+1:rci%jx,rci%kx).

Optionally save their \(B\)-images:

  • If rci%i>0: W(:,rci%jx:rci%jx+rci%nx-1,rci%ky).

  • Else: W(:,rci%jx-rci%nx+1:rci%jx,rci%ky).

9

Compute \(V = (A-\sigma B)^{-1} U\)

11

If rci%i.eq.0, copy \(\bar{V} = U\).

Otherwise, reorder columns of block rci%kx such that column ind(j) becomes the new column j for j=1, …, rci%nx

Note: if rci%kx.eq.rci%ky, only reorder once.

12

Compute the dot products

\[r_{ii} = U_i \cdot \bar{V}_i\]

13

Perform the scalings

\[U_i = U_i/\sqrt{(U_i\cdot \bar{V}_i)}\]

and

\[\bar{V}_i = \bar{V}_i/\sqrt{(U_i\cdot \bar{V}_i)}\]

for each column \(U_i\) and \(\bar{V}_i\) of \(U\) and \(\bar{V}\).

Note: if rci%kx.eq.rci%ky, only scale once.

14

Perform the updates

\[\bar{V}_i = \bar{V}_i + r_{ii} U_i\]

for each column \(\bar{V}_i\) of \(\bar{V}\)

15

Perform the update

\[R = \alpha U^T V + \beta R\]

16

Perform the update

\[V = \alpha U R + \beta V\]

17

Perform the update

\[U = \alpha U R\]

Note: \(V\) may be used as a workspace

21

\(B\)-orthogonalize columns of \(V\) to all vectors \(X\) by solving

\[(X^TBX) Q = X^T \bar{V}\]

for \(Q\) and updating

\[\begin{split}U & = & U - XQ \\ \bar{V} & = & \bar{V} - BXQ\end{split}\]

The update of \(\bar{V}\) may be replaced by

\[\bar{V} = BU\]

22

\(B\)-orthogonalize columns of \(U\) to all vectors \(X\) by solving

\[(X^TBX) Q = X^T U\]

for \(Q\) and updating

\[U = U - BXQ\]

999

Restart:

If rci%k>0: Restart suggested with block size m >= rci%nx + rci%i + rci%j, adjusting workspace size to match. Set rci%i=0 and rci%j=0 and recall the routine. If a restart is not desirable, routine may be recalled with no change to parameters.

If rci%k=0: Restart required with the same block size.

In both cases, the first block W(:,:,0) should retain vectors rci%jx:rci%jx+rci%nx-1, filling remaining vectors randomly such that the entire set of columns is linearly independent from each other and also from the converged eigenvectors.

The matrices are defined as follows:

  • \(U\) = W(:, rci%jx:rci%jx+rci%nx-1, rci%kx)

  • \(V\) = W(:, rci%jy:rci%jy+rci%ny-1, rci%ky)

  • \(\bar{V}\) = W(:, rci%jy:rci%jy+rci%nx-1, rci%ky)

  • \(R\) = rr(rci%i:rci%i+rci%nx-1, rci%j:rci%j+rci%ny-1, rci%k)

and \(\alpha\) and \(\beta\) are given by rci%alpha and rci%beta respectively. We use the notation \(r_{ii}\) to refer to the \(i\)-th diagonal element of \(R\), being rr(rci%i+i-1,rci%j+i-1,rci%k).

Parameters:
  • rci [ssmfe_rcid ,inout] :: Reverse communication type. rci%job must be set to 0 before the first call. (Type ssmfe_rciz in complex version).

  • sigma [real ,in] :: Shift value \(sigma\).

  • left [integer ,in] :: Number of left eigenpairs to find.

  • right [integer ,in] :: Number of right eigenpairs to find.

  • mep [integer ,in] :: Number of working eigenpairs. See [method] section for guidance on selecting a good value. Must be at least left+right.

  • lambda (mep) [real ,inout] :: Current eigenvalue estimates in ascending order.

  • m [integer ,in] :: Block size of workspace W. Must be at least 2.

  • rr (2*m,2*m,3) [real ,inout] :: reverse communication workspace. (Type complex in complex version).

  • ind (m) [integer ,inout] :: reverse communication workspace.

  • keep [ssmfe_expert_keep ,inout] :: Internal workspace used by routine.

  • options [ssmfe_options ,in] :: specifies algorithm options to be used.

  • inform [ssmfe_inform ,inout] :: returns information about the exection of the routine.

subroutine  ssmfe_buckling(rci, sigma, left, right, mep, lambda, m, rr, ind, keep, options, inform)

Computes the eigenpairs of the buckling problem

\[Bx = \lambda A x\]

in the vicinity of a given value \(\sigma\).

Uses reverse-communication. Upon return the user must perform a task specified by the rci parameter and recall the routine. Possible values of rci and associated tasks are:

rci%job

Task to be performed

-3

None. Fatal error, see inform%flag.

-2

Failed to converge, see inform%flag.

-1

None. Computation complete.

3

Compute \(\bar{V} = BU\)

5

Copy converged eigenvectors \(X\) to user storage:

  • If rci%i>0: W(:,rci%jx:rci%jx+rci%nx-1,rci%kx).

  • Else: W(:,rci%jx-rci%nx+1:rci%jx,rci%kx).

Optionally save their \(B\)-images:

  • If rci%i>0: W(:,rci%jx:rci%jx+rci%nx-1,rci%ky).

  • Else: W(:,rci%jx-rci%nx+1:rci%jx,rci%ky).

9

Compute \(V = (B-\sigma A)^{-1} U\)

11

If rci%i.eq.0, copy \(\bar{V} = U\).

Otherwise, reorder columns of block rci%kx such that column ind(j) becomes the new column j for j=1, …, rci%nx

Note: if rci%kx.eq.rci%ky, only reorder once.

12

Compute the dot products

\[r_{ii} = U_i \cdot \bar{V}_i\]

13

Perform the scalings

\[U_i = U_i/\sqrt{(U_i\cdot \bar{V}_i)}\]

and

\[\bar{V}_i = \bar{V}_i/\sqrt{(U_i\cdot \bar{V}_i)}\]

for each column \(U_i\) and \(\bar{V}_i\) of \(U\) and \(\bar{V}\).

Note: if rci%kx.eq.rci%ky, only scale once.

14

Perform the updates

\[\bar{V}_i = \bar{V}_i + r_{ii} U_i\]

for each column \(\bar{V}_i\) of \(\bar{V}\)

15

Perform the update

\[R = \alpha U^T V + \beta R\]

16

Perform the update

\[V = \alpha U R + \beta V\]

17

Perform the update

\[U = \alpha U R\]

Note: \(V\) may be used as a workspace

21

\(B\)-orthogonalize columns of \(V\) to all vectors \(X\) by solving

\[(X^TBX) Q = X^T \bar{V}\]

for \(Q\) and updating

\[\begin{split}U & = & U - XQ \\ \bar{V} & = & \bar{V} - BXQ\end{split}\]

The update of \(\bar{V}\) may be replaced by

\[\bar{V} = BU\]

22

\(B\)-orthogonalize columns of \(U\) to all vectors \(X\) by solving

\[(X^TBX) Q = X^T U\]

for \(Q\) and updating

\[U = U - BXQ\]

999

Restart:

If rci%k>0: Restart suggested with block size m >= rci%nx + rci%i + rci%j, adjusting workspace size to match. Set rci%i=0 and rci%j=0 and recall the routine. If a restart is not desirable, routine may be recalled with no change to parameters.

If rci%k=0: Restart required with the same block size.

In both cases, the first block W(:,:,0) should retain vectors rci%jx:rci%jx+rci%nx-1, filling remaining vectors randomly such that the entire set of columns is linearly independent from each other and also from the converged eigenvectors.

The matrices are defined as follows:

  • \(U\) = W(:, rci%jx:rci%jx+rci%nx-1, rci%kx)

  • \(V\) = W(:, rci%jy:rci%jy+rci%ny-1, rci%ky)

  • \(\bar{V}\) = W(:, rci%jy:rci%jy+rci%nx-1, rci%ky)

  • \(R\) = rr(rci%i:rci%i+rci%nx-1, rci%j:rci%j+rci%ny-1, rci%k)

and \(\alpha\) and \(\beta\) are given by rci%alpha and rci%beta respectively. We use the notation \(r_{ii}\) to refer to the \(i\)-th diagonal element of \(R\), being rr(rci%i+i-1,rci%j+i-1,rci%k).

Parameters:
  • rci [ssmfe_rcid ,inout] :: Reverse communication type. rci%job must be set to 0 before the first call. (Type ssmfe_rciz in complex version).

  • sigma [real ,in] :: Shift value \(sigma\).

  • left [integer ,in] :: Number of left eigenpairs to find.

  • right [integer ,in] :: Number of right eigenpairs to find.

  • mep [integer ,in] :: Number of working eigenpairs. See [method] section for guidance on selecting a good value. Must be at least left+right.

  • lambda (mep) [real ,inout] :: Current eigenvalue estimates in ascending order.

  • m [integer ,in] :: Block size of workspace W. Must be at least 2.

  • rr (2*m,2*m,3) [real ,inout] :: reverse communication workspace. (Type complex in complex version).

  • ind (m) [integer ,inout] :: reverse communication workspace.

  • keep [ssmfe_expert_keep ,inout] :: Internal workspace used by routine.

  • options [ssmfe_options ,in] :: specifies algorithm options to be used.

  • inform [ssmfe_inform ,inout] :: returns information about the exection of the routine.

subroutine  ssmfe_free(keep, inform)

Free memory allocated in keep and inform. Unnecessary if both are going out of scope.

Parameters:
  • keep [ssmfe_expert_keep ,inout] :: Workspace to be freed.

  • inform [ssmfe_inform ,inout] :: Information type to be freed.

Derived types

type  ssmfe_rcid

Real-valued reverse communication interface (RCI) type.

Type fields:
  • % job [integer ] :: Reverse-communication task to perform.

  • % jx [integer ] :: First column of \(U\) in block.

  • % kx [integer ] :: Block to which \(U\) belongs.

  • % nx [integer ] :: Number of columns in \(U\) and \(\bar{V}\), and number of rows in \(R\).

  • % jy [integer ] :: First column of \(V\) in block.

  • % ky [integer ] :: Block to which \(V\) belongs.

  • % ny [integer ] :: Number of columns in \(V\) and \(R\).

  • % i [integer ] :: First row of \(R\) in rr(:,:,:).

  • % j [integer ] :: First column of \(R\) in rr(:,:,:).

  • % k [integer ] :: Block of \(R\) in rr(:,:,:).

  • % alpha [real ] :: Coefficient for matrix multiplication.

  • % beta [real ] :: Coefficient for matrix multiplication.

type  ssmfe_rciz

Real-valued reverse communication interface (RCI) type.

Type fields:
  • % job [integer ] :: Reverse-communication task to perform.

  • % jx [integer ] :: First column of \(U\) in block.

  • % kx [integer ] :: Block to which \(U\) belongs.

  • % nx [integer ] :: Number of columns in \(U\) and \(\bar{V}\), and number of rows in \(R\).

  • % jy [integer ] :: First column of \(V\) in block.

  • % ky [integer ] :: Block to which \(V\) belongs.

  • % ny [integer ] :: Number of columns in \(V\) and \(R\).

  • % i [integer ] :: First row of \(R\) in rr(:,:,:).

  • % j [integer ] :: First column of \(R\) in rr(:,:,:).

  • % k [integer ] :: Block of \(R\) in rr(:,:,:).

  • % alpha [complex ] :: Coefficient for matrix multiplication.

  • % beta [complex ] :: Coefficient for matrix multiplication.

type  ssmfe_options

Options that control the algorithm.

Type fields:
  • % abs_tol_lambda [real ,default=0.0] :: absolute tolerance for estimated eigenvalue convergence test, see Section [ssmfe:method]. Negative values are treated as the default.

  • % abs_tol_residual [real ,default=0.0] :: absolute tolerance for residual convergence test, see Section [ssmfe:method]. Negative values are treated as the default.

  • % max_iterations [integer ,default=100] :: maximum number of iterations.

  • % rel_tol_lambda [real ,default=0.0] :: relative tolerance for estimated eigenvalue error convergence test, see Section [ssmfe:method]. Negative values are treated as the default.

  • % rel_tol_residual [real ,default=0.0] :: relative tolerance for residual convergence test, see Section [ssmfe:method]. If both abs_tol_residual and rel_tol_residual are 0.0, then the residual norms are not taken into consideration by the convergence test, see Section [ssmfe:method]. Negative values are treated as the default.

  • % tol_x [real ,default=-1.0] :: tolerance for estimated eigenvector error convergence test, see Section [ssmfe:method]. If tol_x is set to 0.0, the eigenvector error is not estimated. If a negative value is assigned, the tolerance is set to sqrt(epsilon(lambda)).

  • % print_level [integer ,default=0] ::

    amount of printing. Possible values are:

    <0

    no printing

    0

    error and warning messages only

    1

    the type (standard or generalized) and the size of the problem, the number of eigenpairs requested, the error tolerances and the size of the subspace are printed before the iterations start

    2

    as above but, for each eigenpair tested for convergence, the iteration number, the index of the eigenpair, the eigenvalue, whether it has converged, the residual norm, and the error estimates are also printed

    >2

    as 1 but with all eigenvalues, whether converged, residual norms and eigenvalue/eigenvector error estimates printed on each iteration.

    Note that for eigenpairs that are far from convergence, ‘rough’ error estimates are printed (the estimates that are actually used by the stopping criteria, see Section [ssmfe:method], only become available on the last few iterations).

  • % unit_error [integer ,default=6] :: unit number for error messages. Printing suppressed if negative.

  • % unit_diagnostic [integer ,default=6] :: unit number for diagnostic messages. Printing suppressed if negative.

  • % unit_warning [integer ,default=6] :: unit number for warning messages. Printing suppressed if negative.

  • % err_est [integer ,default=2] ::

    error estimation scheme, one of:

    1

    Residual error bounds: modified Davis-Kahan estimate for eigenvector error and Lehmann bounds for eigenvale error (see method section).

    2 (default)

    Convergence curve-based estimate.

  • % extra_left [integer ,default=0] :: number of extra approximate eigenvectors corresponding to leftmost eigenvalues used to enhance convergence.

  • % extra_right [integer ,default=0] :: number of extra approximate eigenvectors corresponding to rightmost eigenvalues used to enhance convergence.

  • % left_gap [real ,default=0.0] :: minimal acceptable distance between last computed left eigenvalue and rest of spectrum. For ssmfe_standard() and ssmfe_generalized() the last computed left eigenvalue is the rightmost of those computed. For other routines it is the leftmost. If set to a negative value \(\delta\), the minimal distance is taken as \(|\delta|\) times the average distance between the computed eigenvalues. Note that for this option to have any effect, the value of mep must be larger than left+right. See Section [ssmfe:method] for further explanation.

  • % max_left [integer ,default=-1] :: number of eigenvalues to left of \(\sigma\), or a negative value if not known.

  • % max_right [integer ,default=-1] :: number of eigenvalues to right of \(\sigma\), or a negative value if not known.

  • % minAprod [logical ,default=.true.] :: If true, minimize number of multiplications with \(A\) by requiring 2 additional blocks of memory for the workspace W(:,:,:). Must be true for calls to ssmfe_standard_shift(), ssmfe_generalized_shift(), and ssmfe_buckling().

  • % minBprod [logical ,default=.true.] :: If true, minimize number of multiplications with \(B\) by requiring 2 additional blocks of memory for the workspace W(:,:,:).

  • % right_gap [real ,default=0.0] :: as left_gap, but for right eigenvalues.

  • % user_x [integer ,default=0] :: number of eigenvectors for which an initial guess is supplied in x(:,:) on the first call. Such eigenvectors must be lineraly independent.

type  ssmfe_inform

Information on progress of the algorithm.

Type fields:
  • % converged (mep) [integer ,allocatable] ::

    Convergence status.

    • If converged(j)>0, the eigenpair (lambda(j), X(j)) converged on iteration converged(j).

    • If converged(j)=0, the eigenpair (lambda(j), X(j)) is still converging.

    • If converged(j)<0, the eigenpair (lambda(j), X(j)) stagnated at iteration converged(j).

  • % err_lambda (mep) [real ,allocatable] :: estimated eigenvalue errors for converged and stagnated eigenvalues.

  • % err_x (mep) [real ,allocatable] :: estimated eigenvector errors for converged and stagnated eigenvectors.

  • % flag [integer ] :: return status of algorithm. See table below.

  • % iteration [integer ] :: number of iterations.

  • % left [integer ] :: number of converged left eigenvalues.

  • % next_left [real ] :: upon completion, next left eigenvalue in spectrum (see options%left_gap).

  • % next_right [real ] :: upon completion, next right eigenvalue in spectrum (see options%right_gap).

  • % residual_norms (mep) [real ,allocatable] :: Euclidean norms of residuals for (lambda(:), X(:)) on return with rci%job=5.

  • % non_converged [integer ] :: number of non-converged eigenpairs.

  • % right [integer ] :: number of converged right eigenvalues.

  • % stat [integer ] :: allocation status in event of failure

inform%flag

-1

rci%job is out-of-range.

-2

m is out-of-range.

-3

options%err_est is out-of-range.

-4

options%minAprod is incompatible with selected routine.

-5

options%extra_left or options%extra_right is out-of-range.

-6

options%min_gap is out-of-range.

-11

left is out-of-range.

-12

right is out-of-range.

-13

mep is less than the number of desired eigenpairs.

-100

Not enough memory; inform%stat contains the value of the Fortran stat parameter.

-200

\(B\) is not positive definite or user_x>0 and linearly dependent initial guesses were supplied.

+1

The iterations have been terminated because no further improvement in accuracy is possible (this may happen if \(B\) or the preconditioner is not positive definite, or if the components of the residual vectors are so small that the round-off errors make them essentially random). The value of inform%non_converged is set to the number of non-converged eigenpairs.

+2

The maximum number of iterations max_iterations has been exceeded. The value of inform%non_converged is set to the number of non-converged eigenpairs.

+3

The solver had run out of storage space for the converged eigenpairs before the gap in the spectrum required by options%left_gap and/or options%right_gap was reached. The value of inform%non_converged is set to the number of non-converged eigenpairs.

Examples

Preconditioning example

The following code computes the 5 leftmost eigenpairs of the matrix \(A\) of order 100 that approximates the two-dimensional Laplacian operator on a 20-by-20 grid. One forward and one backward Gauss-Seidel update are used for preconditioning, which halves the number of iterations compared with solving the same problem without preconditioning. The module laplace2d (examples/Fortran/ssmfe/laplace2d.f90) supplies the subroutine apply_laplacian() that multiplies a block of vectors by \(A\), and the subroutine apply_gauss_seidel_step() that computes \(y = T x\) for a block of vectors \(x\) by applying one forward and one backward update of the Gauss-Seidel method to the system \(A y = x\).

! examples/Fortran/ssmfe/precond_expert.f90
! Laplacian on a square grid (using SPRAL_SSMFE_EXPERT routines)
program ssmfe_expert_precond_example
  use spral_random
  use spral_ssmfe_expert
  use laplace2d ! implement Lapalacian and preconditioners
  implicit none

  integer, parameter :: wp = kind(0d0) ! Working precision is double

  integer, parameter :: ngrid = 20           ! grid points along each side
  integer, parameter :: n     = ngrid*ngrid  ! problem size
  integer, parameter :: nep   = 5            ! eigenpairs wanted
  integer, parameter :: m     = 3            ! dimn of the iterated subspace

  type(random_state) :: state                ! PRNG state

  real(wp), external :: dnrm2, ddot          ! BLAS functions

  integer :: ncon                 ! number of converged eigenpairs
  integer :: i, j, k
  integer :: ind(m)               ! permutation index
  real(wp) :: s
  real(wp) :: lambda(n)           ! eigenvalues
  real(wp) :: X(n, n)             ! eigenvectors
  real(wp) :: rr(2*m, 2*m, 3)     ! work array
  real(wp) :: W(n, m, 0:7)        ! work array
  real(wp) :: U(n, m)             ! work array
  type(ssmfe_rcid       ) :: rci      ! reverse communication data
  type(ssmfe_options    ) :: options  ! options
  type(ssmfe_expert_keep) :: keep     ! private data
  type(ssmfe_inform     ) :: inform   ! information

  ! the gap between the last converged eigenvalue and the rest of the spectrum
  ! must be at least 0.1 times average gap between computed eigenvalues
  options%left_gap = -0.1
  ! block size is small, convergence may be slow, allow more iterations
  options%max_iterations = 200

  ! Initialize W to lin indep vectors by randomizing
  do i=1,n
    do j=1,m
      W(i,j,0) = random_real(state, positive=.true.)
    end do
  end do

  ncon = 0
  rci%job = 0
  do ! reverse communication loop
!    call ssmfe_standard &
    call ssmfe_generalized &
      ( rci, nep, n, lambda, m, rr, ind, keep, options, inform )
    select case ( rci%job )
    case ( 1 )
      call apply_laplacian &
        ( ngrid, ngrid, rci%nx, W(1 : n, rci%jx : rci%jx + rci%nx - 1, rci%kx),&
          W(1 : n, rci%jy : rci%jy + rci%nx - 1, rci%ky ) )
    case ( 2 )
      call apply_gauss_seidel_step &
        ( ngrid, ngrid, rci%nx, W(1 : n, rci%jx : rci%jx + rci%nx - 1, rci%kx),&
          W(1 : n, rci%jy : rci%jy + rci%nx - 1, rci%ky ) )
    case ( 3 )
      call dcopy( n*rci%nx, W(1, rci%jx, rci%kx), 1, W(1, rci%jy, rci%ky), 1 )
    case ( 5 )
      if ( rci%i < 0 ) cycle
      do k = 0, rci%nx - 1
        j = ncon + k + 1
        call dcopy( n, W(1, rci%jx + k, 0), 1, X(1, j), 1 )
      end do
      ncon = ncon + rci%nx
    case ( 11 )
      if ( rci%i == 0 ) then
        if ( rci%kx /= rci%ky .or. rci%jx > rci%jy ) then
          call dcopy &
            ( n * rci%nx, W(1, rci%jx, rci%kx), 1, W(1, rci%jy, rci%ky), 1 )
        else if ( rci%jx < rci%jy ) then
          do j = rci%nx - 1, 0, -1
            call dcopy &
              ( n, W(1, rci%jx + j, rci%kx), 1, W(1, rci%jy + j, rci%ky), 1 )
          end do
        end if
      else
        do i = 1, n
          do j = 1, rci%nx
            U(i, j) = W(i, ind(j), rci%kx)
          end do
          do j = 1, rci%nx
            W(i, j, rci%kx) = U(i, j)
          end do
          if ( rci%ky /= rci%kx ) then
            do j = 1, rci%nx
              U(i, j) = W(i, ind(j), rci%ky)
            end do
            do j = 1, rci%nx
              W(i, j, rci%ky) = U(i, j)
            end do
          end if
        end do
      end if
    case ( 12 )
      do i = 0, rci%nx - 1
        rr(rci%i + i, rci%j + i, rci%k) = &
          ddot(n, W(1, rci%jx + i, rci%kx), 1, W(1, rci%jy + i, rci%ky), 1)
      end do
    case ( 13 )
      do i = 0, rci%nx - 1
        if ( rci%kx == rci%ky ) then
          s = dnrm2(n, W(1, rci%jx + i, rci%kx), 1)
          if ( s > 0 ) &
            call dscal( n, 1/s, W(1, rci%jx + i, rci%kx), 1 )
        else
          s = sqrt(abs(ddot &
            (n, W(1, rci%jx + i, rci%kx), 1, W(1, rci%jy + i, rci%ky), 1)))
          if ( s > 0 ) then
            call dscal( n, 1/s, W(1, rci%jx + i, rci%kx), 1 )
            call dscal( n, 1/s, W(1, rci%jy + i, rci%ky), 1 )
          else
            call dcopy( n, 0.0D0, 0, W(1, rci%jy + i, rci%ky), 1 )
          end if
        end if
      end do
    case ( 14 )
      do i = 0, rci%nx - 1
        s = -rr(rci%i + i, rci%j + i, rci%k)
        call daxpy&
          ( n, s, W(1, rci%jx + i, rci%kx), 1, W(1, rci%jy + i, rci%ky), 1 )
      end do
    case ( 15 )
      if ( rci%nx > 0 .and. rci%ny > 0 ) &
        call dgemm &
          ( 'T', 'N', rci%nx, rci%ny, n, &
            rci%alpha, W(1, rci%jx, rci%kx), n, W(1, rci%jy, rci%ky), n, &
            rci%beta, rr(rci%i, rci%j, rci%k), 2*m )
    case ( 16, 17 )
      if ( rci%ny < 1 ) cycle
      if ( rci%nx < 1 ) then
        if ( rci%job == 17 ) cycle
        if ( rci%beta == 1.0D0 ) cycle
        do j = rci%jy, rci%jy + rci%ny - 1
          call dscal( n, rci%beta, W(1, j, rci%ky), 1 )
        end do
        cycle
      end if
      if ( rci%job == 17 ) then
        call dgemm &
          ( 'N', 'N', n, rci%ny, rci%nx, &
            1.0D0, W(1, rci%jx, rci%kx), n, rr(rci%i, rci%j, rci%k), 2*m, &
            0.0D0, W(1, rci%jy, rci%ky), n )
        call dcopy &
          ( n * rci%ny, W(1, rci%jy, rci%ky), 1, W(1, rci%jx, rci%kx), 1 )
      else
        call dgemm &
          ( 'N', 'N', n, rci%ny, rci%nx, &
            rci%alpha, W(1, rci%jx, rci%kx), n, rr(rci%i, rci%j, rci%k), 2*m, &
            rci%beta, W(1, rci%jy, rci%ky), n )
      end if
    case ( 21, 22 )
      if ( ncon > 0 ) then
        call dgemm &
          ( 'T', 'N', ncon, rci%nx, n, &
            1.0D0, X, n, W(1, rci%jy, rci%ky), n, 0.0D0, U, n )
        call dgemm &
          ( 'N', 'N', n, rci%nx, ncon, &
            -1.0D0, X, n, U, n, 1.0D0, W(1, rci%jx, rci%kx), n )
        call dgemm &
          ( 'N', 'N', n, rci%nx, ncon, &
            -1.0D0, X, n, U, n, 1.0D0, W(1, rci%jy, rci%ky), n )
      end if
    case ( 999 )
      if ( rci%k == 0 ) then
        if ( rci%jx > 1 ) then
          do j = 1, rci%jx-1
            do i = 1, n
              W(i, j, 0) = random_real(state, positive=.true.)
            end do
          end do
        endif
      end if
    case ( :-1 )
      exit
    end select
  end do
  print '(i3, 1x, a, i4, 1x, a)', &
    inform%left, 'eigenpairs converged in', inform%iteration, 'iterations'
  print '(1x, a, i2, a, es13.7)', &
    ('lambda(', i, ') = ', lambda(i), i = 1, inform%left)
  call ssmfe_free( keep, inform )
end program ssmfe_expert_precond_example

This code produces the following output:

 6 eigenpairs converged in 129 iterations
lambda( 1) = 4.4676695E-02
lambda( 2) = 1.1119274E-01
lambda( 3) = 1.1119274E-01
lambda( 4) = 1.7770878E-01
lambda( 5) = 2.2040061E-01
lambda( 6) = 2.2040061E-01

Note that the code computed one extra eigenpair because of the insufficient gap between the 5th and 6th eigenvalues.

Method

The algorithm

The solver procedures of spral_ssmfe_expert are interfaces to solver procedures of spral_ssmfe_core, which implement a block iterative algorithm based on the Jacobi-conjugate preconditioned gradients method [2], [3]. Further information on the algorithm used by spral_ssmfe_expert can be found in the specification document for spral_ssmfe_core and in [1].

Stopping criteria

An approximate eigenpair \(\{x,\lambda\}\) is considered to have converged if the following three conditions are all satisfied:

  1. if options%abs_tol_lambda and options%rel_tol_lambda are not both equal to zero, then the estimated error in the approximate eigenvalue must be less than or equal to \(\max(\mathrm{options\%abs\_tol\_lambda}, \delta*\mathrm{options\%rel\_tol\_lambda})\), where \(\delta\) is the estimated average distance between eigenvalues.

  2. if options%tol_x is not zero, then the estimated sine of the angle between the approximate eigenvector and the invariant subspace corresponding to the eigenvalue approximated by \(\lambda\) must be less than or equal to options%tol_x.

  3. if options%abs_tol_residual and options%rel_tol_residual are not both equal to zero, then the Euclidean norm of the residual, \(\|A x - \lambda B x\|_2\), must be less than or equal to \(\max(\mathrm{options\%abs\_tol\_residual}, \mathrm{options\%rel\_tol\_residual}*\|\lambda B x\|_2)\).

The extra eigenpairs are not checked for convergence, as their role is purely auxiliary.

Improving eigenvector accuracy

If the gap between the last computed eigenvalue and the rest of the spectrum is small, then the accuracy of the corresponding eigenvector may be very low. To prevent this from happening, the user should set the eigenpairs storage size mep to a value that is larger than the number of desired eigenpairs, and set the options options%left_gap and options%right_gap to non-zero values \(\delta_l\) and \(\delta_r\). These values determine the size of the minimal acceptable gaps between the computed eigenvalues and the rest of the spectrum, \(\delta_l\) referring to either leftmost eigenvalues (for ssmfe_standard() and ssmfe_generalized() only) or those to the left of the shift sigma, and \(\delta_r\) to those to the right of the shift sigma. Positive values of \(\delta_l\) and \(\delta_r\) set the gap explicitly, and negative values require the gap to be not less than their absolute value times the average distance between the computed eigenvalues. A recommended value of \(\delta_l\) and \(\delta_r\) is \(-0.1\). The value of mep has little effect on the speed of computation, hence it might be set to any reasonably large value. The larger the value of mep, the larger the size of an eigenvalue cluster for which accurate eigenvectors can be computed, notably: to safeguard against clusters of size up to \(k\), it is sufficient to set mep to the number of desired eigenpairs plus \(k - 1\).

The use of shifted matrix factorization

When using the solver procedures that employ the shift-and-invert technique, it is very important to ensure that the numbers of desired eigenvalues each side of the shift do not exceed the actual numbers of these eigenvalues, as the eigenpairs ‘approximating’ non-existing eigenpairs of the problem will not converge. It is therefore strongly recommended that the user employs a linear system solver that performs the \(LDL^T\) factorization of the shifted system, e.g. HSL_MA97 or SPRAL_SSIDS. The \(LDL^T\) factorization of the matrix \(A - \sigma B\) consists in finding a lower triangular matrix \(L\), a block-diagonal matrix \(D\) with \(1\times 1\) and \(2\times 2\) blocks on the diagonal and a permutation matrix \(P\) such that \(P^T(A - \sigma B)P = L D L^T\). By the inertia theorem, the number of eigenvalues to the left and right from the shift \(\sigma\) is equal to the number of negative and positive eigenvalues of \(D\), which allows quick computation of the eigenvalue numbers each side of the shift.

Error estimation

Standard problem

If options%err_est=1, the error estimates for the eigenvalues are based on the eigenvalues of a matrix of the form

\[\begin{aligned} \label{L.mx} \hat A = \tilde\Lambda_k - S_k^T S_k,\end{aligned}\]

where \(\tilde\Lambda_k\) is a diagonal matrix with the \(k-1\) leftmost Ritz values \(\tilde\lambda_j\) on the diagonal, and the columns of \(S_k\) are the respective residual vectors \(r_j = A \tilde x_j - \tilde\lambda_j \tilde x_j\) divided by \(\sqrt{\lambda_k - \tilde\lambda_j}\). If \(k\) is such that \(\tilde\lambda_{k-1} < \lambda_k\), then the eigenvalues of \(\hat A\) are the left-hand side bounds for eigenvalues \(\lambda_i\), and thus the difference \(\tilde\lambda_j - \hat\lambda_j\) estimates the eigenvalue error \(\tilde\lambda_j - \lambda_j\). The unknown \(\lambda_k\) is replaced by \(\tilde\lambda_k\), and select the maximal \(k \le m\) for which the distance between \(\tilde\lambda_{k-1}\) and \(\tilde\lambda_k\) exceeds the sum of the absolute error tolerance for eigenvalues and the Frobenius norm of the matrix formed by the residuals \(r_j, j = 1, \ldots, k-1\). If \(\tilde\lambda_j - \hat\lambda_j\) is close to the machine accuracy, it may be too polluted by round-off errors to rely upon. In such case, we use instead

\[\begin{aligned} \tilde\lambda_j - \lambda_j \le \delta_j \approx \frac{\|r_j\|^2}{\tilde\lambda_k - \lambda_j}.\end{aligned}\]

The eigenvector errors are estimated based on the Davis-Kahan inequality:

\[\begin{aligned} \min_{x \in \mathcal{X}_{k-1}} \sin\{\tilde x_j; x\} \le \frac{\|r_j\|}{\lambda_k - \tilde\lambda_j} \approx \frac{\|r_j\|}{\tilde\lambda_k - \tilde\lambda_j},\end{aligned}\]

where \(\mathcal{X}_{k-1}\) is the invariant subspace corresponding to \(k-1\) leftmost eigenvalues.

If options%err_est=2 the errors are estimated based on the eigenvalue decrements history, which produces an estimate for the average eigenvalue error reduction per iteration, which in turn yields error estimates for both eigenvalues and eigenvectors. Unlike the residual estimates mentioned in this section, such ‘kinematic’ error estimates are not guaranteed to be upper bounds for the actual errors. However, the numerical tests have demonstrated that kinematic error estimates are significantly more accurate, i.e. closer to the actual error, than the residual-based estimates. Furthermore, they straightforwardly apply to the generalized case as well.

Generalized problem

In the case of the generalized eigenvalue problem solved by iterations with preconditioning, all of the residual norms in the previous section must be replaced with \(\|\cdot\|_{B^{-1}}\)-norm of the residual \(r_j = A \tilde x_j - \tilde\lambda_j B \tilde x_j\) (\(\|r_j\|_{B^{-1}}^2 = r_j^* B^{-1} r_j\)) or its upper estimate, e.g. \(\beta_1^{-1/2}\|\cdot\|\), where \(\beta_1\) is the smallest eigenvalue of \(B\). Hence, if \(\beta_1\) is known, then the error tolerances for eigenvalues and eigenvectors must be multiplied by \(\beta_1\) and \(\sqrt{\beta_1}\) respectively. If no estimate for \(\|\cdot\|_{B^{-1}}\)-norm is available, then the use of non-zero residual tolerances and options%err_est=1 is not recommended. In the case of problems solved by iterations with shift-and-invert and the problem , the residuals are computed as \(r_j = T B \tilde x_j - \tilde \lambda_j \tilde x_j\) where \(T = (A - \sigma B)^{-1}\) for and \(T = (B - \sigma A)^{-1}\) for , and \(B\)-norms of \(r_j\) are used, so that Lehmann matrix becomes \(\hat A = \tilde\Lambda_k - S_k^T B\ S_k\). 0 Note that the residual estimates may considerably overestimate the actual error of direct iterations because of the use of the Euclidean norm of the residual, which is too strong a norm for it when \(A\) is the discretization of a differential operator.

References