A Compact Heart Iteration for Large Eigenvalues Problems ()

Achiya Dax^{}

Hydrological Service of Israel, Jerusalem, Israel.

**DOI: **10.4236/alamt.2022.121002
PDF
HTML XML
231
Downloads
924
Views
Citations

Hydrological Service of Israel, Jerusalem, Israel.

In this paper, we present a compact version of the Heart iteration. One that requires less matrix-vector products per iteration and attains faster convergence. The Heart iteration is a new type of Restarted Krylov methods for calculating peripheral eigenvalues of symmetric matrices. The new framework avoids the Lanczos tridiagonalization process and the use of implicit restarts. This simplifies the restarting mechanism and allows the introduction of several modifications. Convergence is assured by a monotonicity property that pushes the computed Ritz values toward their limits. Numerical experiments illustrate the usefulness of the proposed approach.

Keywords

Large Sparse Matrices, Restarted Krylov Methods, Exterior Eigenvalues, Symmetric Matrices, Monotonicity, Starting Vectors

Share and Cite:

Dax, A. (2022) A Compact Heart Iteration for Large Eigenvalues Problems. *Advances in Linear Algebra & Matrix Theory*, **12**, 24-38. doi: 10.4236/alamt.2022.121002.

1. Introduction

The Heart iteration is a new type of Restarted Krylov methods. Given a symmetric matrix
$G\in {\mathbb{R}}^{n\times n}$, the method is aimed at calculating a cluster of *k* exterior eigenvalues of *G*. As other Krylov methods it is best suited for handling large sparse matrices in which a matrix-vector product needs only 0(*n*) flops. Another underlying assumption is that the number of computed eigenvalues, *k*, is much smaller than *n*. The use of restarted Krylov methods for solving such problems was considered by several authors, e.g., [1] - [22]. Most of these methods are based on a Lanczos tridiagonalization algorithm in which the starting vector is determined by an implicit restart process. The Heart iteration is not using these tools. It is based on Gram-Schmidt orthogonalization and a simple intuitive choice of the starting vector. This results in a simple iteration that allows several modifications. Convergence is assured by a monotonicity property that pushes the computed eigenvalues toward their limits.

The main idea behind the new method is clarified by inspecting its basic iteration. Below we concentrate on the largest eigenvalues, but the algorithm can compute any cluster of *k* exterior eigenvalues. Let the eigenvalues of *G* be sorted to satisfy

${\lambda}_{1}\ge {\lambda}_{2}\ge \cdots \ge {\lambda}_{n}\mathrm{.}$ (1.1)

Then the term “exterior eigenvalues” refers to the *k* largest eigenvalues, the *k* smallest eigenvalues, or any set of *k* eigenvalues that is combined from a number of the largest eigenvalues plus a number of the smallest ones. Other names for such eigenvalues are “peripheral eigenvalues” and “extreme eigenvalues”.

Note that although the above definitions refer to clusters of eigenvalues, the algorithm is carried out by computing the corresponding *k* eigenvectors of *G*. The subspace that is spanned by these eigenvectors is called the target space.

The basic Heart iteration

The *q*th iteration,
$q=0,1,2,\cdots $, is composed of the following five steps. The first step starts with a matrix
${V}_{q}\in {\mathbb{R}}^{n\times k}$ that contains “old” information on the target space, a matrix
${Y}_{q}\in {\mathbb{R}}^{n\times \mathcal{l}}$ that contains “new” information, and a matrix
${X}_{q}=\left[{V}_{q}\mathrm{,}{Y}_{q}\right]\in {\mathbb{R}}^{n\times \left(k+\mathcal{l}\right)}$ that includes all the known information. The matrix
${X}_{q}$ has
$p=k+\mathcal{l}$ orthonormal columns. That is

${X}_{q}^{\text{T}}{X}_{q}=I\in {\mathbb{R}}^{p\times p}\mathrm{.}$

(Typical values for
$\mathcal{l}$ lie between *k* to 2*k*.)

Step 1: Eigenvalues extraction. Given the Rayleigh quotient matrix

${S}_{q}={X}_{q}^{\text{T}}G{X}_{q}\mathrm{,}$

compute the *k* largest eigenvalues of
${S}_{q}$. The corresponding *k* eigenvectors of
${S}_{q}$ are assembled in a matrix

${U}_{q}\in {\mathbb{R}}^{p\times k}\mathrm{,}\text{\hspace{1em}}{U}_{q}^{\text{T}}{U}_{q}=I\in {\mathbb{R}}^{k\times k}\mathrm{,}$

which is used to compute the related matrix of Ritz vectors,

${V}_{q+1}={X}_{q}{U}_{q}\mathrm{.}$

Note that both ${X}_{q}$ and ${U}_{q}$ have orthonormal columns and ${V}_{q+1}$ inherits this property.

Step 2: Collect new information. Compute a Krylov matrix ${B}_{q}\in {\mathbb{R}}^{n\times \mathcal{l}}$ that contains new information on the target space.

Step 3: Orthogonalize the columns of ${B}_{q}$ against the columns of ${V}_{q+1}$. There are several ways to achieve this task. In exact arithmetic, the resulting matrix, ${Z}_{q}$, satisfies the Gram-Schmidt formula

${Z}_{q}={B}_{q}-{V}_{q+1}\left({V}_{q+1}^{\text{T}}{B}_{q}\right)\mathrm{.}$

Step 4: Build an orthonormal basis of Range ( ${Z}_{q}$ ). Compute a matrix,

${Y}_{q+1}\in {\mathbb{R}}^{n\times \mathcal{l}}\mathrm{,}\text{\hspace{1em}}{Y}_{q+1}^{\text{T}}{Y}_{q+1}=I\in {\mathbb{R}}^{\mathcal{l}\times \mathcal{l}}\mathrm{,}$

whose columns form an orthonormal basis of Range ( ${Z}_{q}$ ). This can be done by a QR factorization of ${Z}_{q}$. (If rank ( ${Z}_{q}$ ) is smaller than $\mathcal{l}$, then $\mathcal{l}$ is temporarily reduced to be rank ( ${Z}_{q}$ ).)

Step 5: Define ${X}_{q+1}$ by the rule

${X}_{q+1}=\left[{V}_{q+1}\mathrm{,}{Y}_{q+1}\right]\mathrm{,}$

which ensures that

${X}_{q+1}^{\text{T}}{X}_{q+1}=I\in {\mathbb{R}}^{p\times p}\mathrm{.}$

Then compute the new Rayleigh quotient matrix

${S}_{q+1}={X}_{q+1}^{\text{T}}G{X}_{q+1}\mathrm{.}$

This matrix will be used at the beginning of the next iteration.

At this point we are not concerned with efficiency issues, and the above description is mainly aimed at clarifying the purpose of each step. (A more effective scheme is proposed in Section 4.) The name “Heart iteration” comes from the similarity to the heart’s systole-diastole cardiac cycle: Step 1 achieves subspace contraction (eigenvalues extraction), while in Steps 2 - 5 the subspace expands (collecting new information).

The plan of the paper is as follows. The monotonicity property that motivates the new method is established in the next section. Then, in Section 3, we describe a simple Krylov subspace process that constructs ${B}_{q}$. The aim of this paper is to present an efficient implementation of the Heart iteration. The new iteration combines Steps 2 - 5 into one “compact” step. This results in a simple effective algorithm that uses less matrix-vector products per iteration. The details of the new iteration are given in Section 4. The paper ends with numerical experiments that illustrate the usefulness of the proposed method.

2. The Monotonicity Property

In this section we establish a useful property of the proposed method. The proof can be found in former presentations of the Heart iteration, e.g., [5] [6] [7] [8]. Yet, in order to make this paper self-contained, we provide the proof. The main argument is based on the following well-known interlacing theorems, e.g., [11] [15] [23].

Theorem 1 (Cauchy interlace theorem) *Let*
$G\in {\mathbb{R}}^{n\times n}$ *be* *a* *symmetric* *matrix* *with* *eigenvalues*

${\lambda}_{1}\ge {\lambda}_{2}\ge \cdots \ge {\lambda}_{n}\mathrm{.}$ (2.1)

Let the symmetric matrix
$H\in {\mathbb{R}}^{k\times k}$ be obtained from *G* by deleting
$n-k$ rows and the corresponding
$n-k$ columns. Let

${\eta}_{1}\ge {\eta}_{2}\ge \cdots \ge {\eta}_{k}$ (2.2)

denote the eigenvalues of *H*. Then

${\lambda}_{j}\ge {\eta}_{j}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.05em}}\text{for}\text{\hspace{0.17em}}j=\mathrm{1,}\cdots \mathrm{,}k\mathrm{,}$ (2.3)

and

${\eta}_{k+1-i}\ge {\lambda}_{n+1-i}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.05em}}\text{for}\text{\hspace{0.17em}}i=\mathrm{1,}\cdots \mathrm{,}k\mathrm{.}$ (2.4)

In particular, for $k=n-1$ we have the interlacing relations

${\lambda}_{1}\ge {\eta}_{1}\ge {\lambda}_{2}\ge {\eta}_{2}\ge {\lambda}_{3}\ge \cdots \ge {\lambda}_{n-1}\ge {\eta}_{n-1}\ge {\lambda}_{n}\mathrm{.}$ (2.5)

Corollary 2 (Poincaré separation theorem) *Let* *the* *matrix*
$V\in {\mathbb{R}}^{n\times k}$ *have* *k* *orthonormal* *columns.* *That* *is*
${V}^{\text{T}}V=I\in {\mathbb{R}}^{k\times k}$ *.* *Let* *the* *matrix*
$H={V}^{\text{T}}GV$ *have* *the* *eigenvalues* (2.2)*.* *Then* *the* *eigenvalues* *of* *H* *and* *G* *satisfy* (2.3) *and* (2.4)*.*

The last observation enables us to prove the following monotonicity property.

Theorem 3. *Consider* *the* *qth* *iteration* *of* *the* *new* *method*,
$q=\mathrm{1,2,3,}\cdots $ *.* *Assume* *that* *the* *eigenvalues* *of* *G* *satisfy* (2.1) *and* *let* *the* *eigenvalues* *of* *the* *matrix*

${S}_{q}={X}_{q}^{\text{T}}G{X}_{q}={\left[{V}_{q}\mathrm{,}{Y}_{q}\right]}^{\text{T}}G\left[{V}_{q}\mathrm{,}{Y}_{q}\right]$

*be denoted as *

${\lambda}_{1}^{\left(q\right)}\ge {\lambda}_{2}^{\left(q\right)}\ge \cdots \ge {\lambda}_{k}^{\left(q\right)}\ge \cdots \ge {\lambda}_{p}^{\left(q\right)}\mathrm{.}$

*Then the inequalities *

${\lambda}_{j}\ge {\lambda}_{j}^{\left(q\right)}\ge {\lambda}_{j}^{\left(q-1\right)}$ (2.6)

*hold for*
$j=1,\cdots ,k$ *and*
$q=\mathrm{1,2,3,}\cdots $.

Proof: The Ritz values which are computed at Step 1 are

${\lambda}_{1}^{\left(q\right)}\ge {\lambda}_{2}^{\left(q\right)}\cdots \ge {\lambda}_{k}^{\left(q\right)}\mathrm{,}$

and these values are the largest eigenvalues of the matrix

${S}_{q}={X}_{q}^{\text{T}}G{X}_{q}\text{\hspace{0.05em}}\mathrm{.}$

Similarly,

${\lambda}_{1}^{\left(q-1\right)}\ge {\lambda}_{2}^{\left(q-1\right)}\ge \cdots \ge {\lambda}_{k}^{\left(q-1\right)}\mathrm{,}$

are eigenvalues of the matrix

${V}_{q}^{\text{T}}G{V}_{q}\mathrm{.}$

Therefore, since the columns of
${V}_{q}$ are the first *k* columns of
${X}_{q}$,

${\lambda}_{j}^{\left(q\right)}\ge {\lambda}_{j}^{\left(q-1\right)}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{for}\text{\hspace{0.17em}}j=\mathrm{1,}\cdots \mathrm{,}k\mathrm{,}$

while a further use of Corollary 2 gives

${\lambda}_{j}\ge {\lambda}_{j}^{\left(q\right)}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{for}\text{\hspace{0.17em}}j=1,\cdots ,k.$

Hence by combining these relations we obtain (2.6). $\square $

The treatment of other exterior clusters is done in a similar way. Assume for example that the algorithm is aimed at computing the *k* smallest eigenvalues of *G*,

$\left\{{\lambda}_{n+1-k}\mathrm{,}\cdots \mathrm{,}{\lambda}_{n-1}\mathrm{,}{\lambda}_{n}\right\}\mathrm{.}$

Then similar arguments show that

${\lambda}_{p+1-i}^{\left(q-1\right)}\ge {\lambda}_{p+1-i}^{\left(q\right)}\ge {\lambda}_{n+1-i}$ (2.7)

for $i=\mathrm{1,}\cdots \mathrm{,}k$, and $q=\mathrm{1,2,3,}\cdots $.

The proof of Theorem 3 emphasizes the importance of the orthonormality relations, and provides the motivation behind the basic iteration. Moreover, since orthonormality ensures monotonicity, it is not essential to construct ${B}_{q}$ by applying the Lanczos algorithm. This consequence is used in the next sections.

3. The Basic Krylov Matrix

The basic Krylov information matrix has the form

${B}_{q}=\left[{b}_{1}\mathrm{,}{b}_{2}\mathrm{,}\cdots \mathrm{,}{b}_{\mathcal{l}}\right]\in {\mathbb{R}}^{n\times \mathcal{l}}\mathrm{,}$ (3.1)

where the sequence ${b}_{1}\mathrm{,}{b}_{2}\mathrm{,}\cdots $, is initialized by the starting vector ${b}_{0}$. The ability of a Krylov subspace to approximate a dominant subspace is characterized by the Kaniel-Paige-Saad bounds (See, for example, [10]: pp. 552-554; [15]: pp. 242-247; [16]: pp. 147-151; [18]: pp. 272-274), and the references therein. One consequence of these bounds regards the angle between ${b}_{1}$ and the dominant subspace: The smaller the angle, the better approximation we get. This suggests that ${b}_{0}$ should be defined as the sum of the current Ritz vectors. That is,

${b}_{0}={V}_{q+1}e$ (3.2)

where $e={\left(\mathrm{1,1,}\cdots \mathrm{,1}\right)}^{\text{T}}\in {\mathbb{R}}^{k}$ is a vector of ones. (If some of the Ritz vectors have already converged then it is possible to remove these vectors from the sum.) Note that there is no point in setting ${b}_{1}={V}_{q+1}e$, since in the next step ${B}_{q}$ is orthogonalized against ${V}_{q+1}$.

The other columns of ${B}_{q}$ are obtained by a Krylov process that resembles Lanczos’ algorithm but uses direct orthogonalization. Let $r\in {\mathbb{R}}^{n}$ be a given vector and let $q\in {\mathbb{R}}^{n}$ be a unit length vector. That is ${\Vert q\Vert}_{2}=1$ where ${\Vert \text{\hspace{0.05em}}\cdot \text{\hspace{0.05em}}\Vert}_{2}$ denotes the Euclidean vector norm. Then the statement “orthogonalize $r$ against $q$ ” is carried out by replacing $r$ with $r-\left({r}^{\text{T}}q\right)q$. Similarly, the statement “normalize $r$ ” is carried out by replacing $r$ with $r/{\Vert r\Vert}_{2}$. With these conventions at hand the construction of the vectors ${b}_{0}\mathrm{,}{b}_{1}\mathrm{,}\cdots \mathrm{,}{b}_{\mathcal{l}}$, is carried out as follows.

The preparations part

1) Compute the starting vector:

${b}_{0}={V}_{q+1}e/{\Vert {V}_{q+1}e\Vert}_{2}$ (3.3)

2) Compute ${b}_{1}$ : Set ${b}_{1}=G{b}_{0}$.

Orthogonalize ${b}_{1}$ against ${b}_{0}$.

Normalize ${b}_{1}$.

3) Compute ${b}_{2}$ : Set ${b}_{2}=G{b}_{1}$.

Orthogonalize ${b}_{2}$ against ${b}_{0}$.

Orthogonalize ${b}_{2}$ against ${b}_{1}$.

Normalize ${b}_{2}$.

The iterative part

For $j=3,\cdots ,\mathcal{l}$, compute ${b}_{j}$ as follows:

Set ${b}_{j}=G{b}_{j-1}$.

Orthogonalize ${b}_{j}$ against ${b}_{j-2}$.

Orthogonalize ${b}_{j}$ against ${b}_{j-1}$.

Normalize ${b}_{j}$.

The direct orthogonalization that we use differs from Lanczos’ algorithm and, therefore, fails to achieve a reduction of *G* into a tridiagonal form (The difference lies in the term that connects
${b}_{j}$ with
${b}_{j-2}$ ). It is also important to note that although
${b}_{0}$ is defined in an “explicit” way, there is a major difference between our method and former explicitly restarted Krylov methods. That is, in Steps 3 and 4 the Krylov matrix
${B}_{q}$ is orthogonalized against
${V}_{q+1}$ and the resulting matrix,
${Z}_{q}$, is used to construct an orthonormal extension of
${V}_{q+1}$. This important ingredient is missing in the former explicit methods.

4. A Compact Version of the Heart Iteration

One feature that characterizes the basic Heart iteration is a direct computation of the Rayleigh quotient matrix

${S}_{q}={X}_{q}^{\text{T}}G{X}_{q}\mathrm{.}$ (4.1)

In this section we describe a compact version of the Heart iteration that avoids this computation. Instead ${S}_{q}$ is computed “on the fly”, as a by-product of the expanding process. The main idea is that the Krylov process in Step 2 and the orthogonalization in Steps 3 - 4 can be combined into one process. Moreover, observe that the columns of ${S}_{q}$ have the form

${X}_{q}^{\text{T}}\left(G{x}_{j}\right)\mathrm{,}\text{\hspace{0.17em}}\text{\hspace{0.17em}}j=\mathrm{1,}\cdots \mathrm{,}p\mathrm{,}$ (4.2)

where
${x}_{j}$ denotes the *j*th column of
${X}_{q}$. Hence the vector
$G{x}_{j}$ can be used both to construct
${S}_{q}$ and to expand the Krylov subspace.

The contraction part remains unchanged. As before, it ends by computing a Ritz vectors matrix

$V\in {\mathbb{R}}^{n\times k}\mathrm{,}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.05em}}{V}^{\text{T}}V=I\in {\mathbb{R}}^{k\times k}\mathrm{,}$ (4.3)

that satisfies

${V}^{\text{T}}GV=D\mathrm{,}$ (4.4)

where *D* is a
$k\times k$ diagonal matrix whose diagonal entries are the computed Ritz values (The iteration subscripts are removed to ease the description).

The expansion process starts with the matrices

$X=V$

and

$S=D$.

Then the Krylov sequence begins with the vector

$z=G\left(Ve\right)\mathrm{,}$

and performs
$\mathcal{l}$ steps. The *j*th step,
$j=k+\mathrm{1,}\cdots ,k+\mathcal{l}$, begins with the matrix

$X=\left[{x}_{1}\mathrm{,}\cdots \mathrm{,}{x}_{j-1}\right]\in {\mathbb{R}}^{n\times \left(j-1\right)}$ (4.5)

and ends with the matrix

$X=\left[{x}_{1}\mathrm{,}\cdots \mathrm{,}{x}_{j-1}\mathrm{,}{x}_{j}\right]\in {\mathbb{R}}^{n\times j}\mathrm{.}$ (4.6)

The vector
$z=G{x}_{j}$ serves two purposes: To start the computation of
${x}_{j+1}$, and to extend the Rayleigh quotient matrix. The vector
${x}_{j+1}$ is obtained by orthogonalizing
$z$ against the columns of *X*. When using Gram-Schmidt orthogonalization this is achieved by replacing
$z$ with the vector
$z-X\left({X}^{\text{T}}z\right)$. Below we denote this operation as

$z=z-X\left({X}^{\text{T}}z\right)$. (4.7)

In practice one use of (4.7) is not sufficient to maintain orthogonality, so we need to repeat this operation.

The building of the Rayleigh quotient matrix

${S}_{q+1}={X}_{q+1}^{\text{T}}G{X}_{q+1}$ (4.8)

is based on the following observations. At the beginning of the *j*th step,
$j=k+\mathrm{1,}\cdots \mathrm{,}k+\mathcal{l}$, the matrix *X* has the form (4.5) and the matrix

$S={X}^{\text{T}}GX$ (4.9)

is a
$\left(j-1\right)\times \left(j-1\right)$ principal submatrix of
${S}_{q+1}$. At the end of the *j*th step *X* has the form (4.6) and the matrix (4.9) is a
$j\times j$ principal submatrix of
${S}_{q+1}$. The “new” entries of *S* are
${s}_{ij}={s}_{ji},\text{\hspace{0.17em}}i=1,\cdots ,j$, and these entries are obtained from the vector

$r={\left({r}_{1},\cdots ,{r}_{j}\right)}^{\text{T}}={X}^{\text{T}}\left(G{x}_{j}\right)={X}^{\text{T}}z$. (4.10)

A further saving is gained by noting that
$r$ can be used in the orthogonalization of
$z$ against the columns of *X*. The expansion process ends with a matrix *X* that has
$k+\mathcal{l}$ orthonormal columns, and the related Rayleigh quotient matrix

$S={X}^{\text{T}}GX\mathrm{.}$ (4.11)

These matrices are the input for the next iteration (As before, we omit indices to ease the notation).

The compact Heart iteration

Part I: Contraction

Compute the *k* largest eigenvalues of *S* and the corresponding eigenvectors. The eigenvalues construct a diagonal matrix,
$D\in {\mathbb{R}}^{k\times k}$. The eigenvectors are assembled into the matrix

$U\in {\mathbb{R}}^{\left(k+\mathcal{l}\right)\times k}\mathrm{,}\text{\hspace{0.17em}}\text{\hspace{0.17em}}{U}^{\text{T}}U=I\mathrm{,}$ (4.12)

which is used to compute the related matrix of Ritz vectors

$V=XU\in {\mathbb{R}}^{n\times k}\mathrm{.}$ (4.13)

Since both *X* and *U* have orthonormal columns the matrix *V* inherits this property.

Part II: Expansion

First set

$X=V,$ (4.14)

$S=D,$ (4.15)

$z=G\left(Ve\right)\mathrm{,}$ (4.16)

and

$r={X}^{\text{T}}z$. (4.17)

Then for $j=k+1,\cdots ,k+\mathcal{l}$, do as follows.

Gram-Schmidt orthogonalization: Set $z=z-Xr$.

Gram-Schmidt reorthogonalization: Set $z=z-X\left({X}^{\text{T}}z\right)$.

Normalize $z$.

Expand *X* to be
$\left[X\mathrm{,}z\right]$.

Set $z=Gz$.

Expand *S* by computing the vector

$r={\left({r}_{1},\cdots ,{r}_{j}\right)}^{\text{T}}={X}^{\text{T}}z$

and setting the new entries of *S* to be

${s}_{ij}={s}_{ji}={r}_{i}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{for}\text{\hspace{0.17em}}\text{\hspace{0.05em}}i=1,\cdots ,j.$

The main feature that characterizes this scheme is its simplicity. Furthermore, now each iteration needs only $\mathcal{l}+1$ matrix-vector products.

5. The Initial Orthonormal Matrix

To start the Heart iteration we need to supply an “initial” orthonormal matrix, ${X}_{0}\in {\mathbb{R}}^{n\times p}$, and the corresponding Rayleigh quotient matrix ${S}_{0}={X}_{0}^{\text{T}}G{X}_{0}$. In our experiments this was done in the following way. Define $p=k+\mathcal{l}$ and let the $n\times p$ matrix

${B}_{0}=\left[{b}_{1}\mathrm{,}{b}_{2}\mathrm{,}\cdots \mathrm{,}{b}_{p}\right]$ (5.1)

be generated as in Section 3, using the starting vector

${b}_{0}=e/{\Vert e\Vert}_{2}$ (5.2)

where $e={\left(\mathrm{1,1,}\cdots \mathrm{,1}\right)}^{\text{T}}\in {\mathbb{R}}^{n}$. Then ${X}_{0}$ is obtained by computing an orthonormal basis of Range ( ${B}_{0}$ ).

6. Numerical Experiments

In this section we describe some experiments with the proposed methods. The basic Heart iteration was used with
$\mathcal{l}=k+40$ (Recall that *k* denotes the number of desired eigenvalues). The compact Heart iteration was tested with two values of
$\mathcal{l}$. The first one is
$\mathcal{l}=k+40$, as in the basic iteration. The second value of
$\mathcal{l}$ is

$\mathcal{l}={\left[k\right]}_{40}^{100}$ (6.1a)

This notation means that
$\mathcal{l}$ is obtained from *k* in the following way:

If $k\le 40$ then $\mathcal{l}=40$ ; (6.1b)

if $40\le k\le 100$ then $\mathcal{l}=k$ ; (6.1c)

if $100\le k$ then $\mathcal{l}=100$. (6.1d)

Note that $k+40\ge {\left[k\right]}_{40}^{100}$, hence the second choice increases the number of iterations, but reduces the computational effort per iteration. The first experiments concentrate on the number of iterations (number of restarts) that are needed by each method. For this purpose we have used diagonal test matrices have the form

$D=\text{diag}\left\{{\lambda}_{1}\mathrm{,}{\lambda}_{2}\mathrm{,}\cdots \mathrm{,}{\lambda}_{n}\right\}\in {\mathbb{R}}^{n\times n}$ (6.2)

where

${\lambda}_{1}\ge {\lambda}_{2}\ge \cdots \ge {\lambda}_{n}\ge 0.$ (6.3)

(Since we are interested in iterations there is no loss of generality in experimenting with diagonal matrices, e.g., ( [9]: page 367) The diagonal matrices that we have used are displayed in Table 1. The eigenvalues of the “Normal distribution” matrix were generated with MATLAB’s command “randn(*n*, 1)”.

Table 1. Types of test matrices, $n=200000$.

All the experiments were carried out with *n* = 200,000, and are aimed at computing the *k* largest eigenvalues. The iterative process was terminated as soon as it satisfies the stopping condition.

$\left({\displaystyle \underset{j=1}{\overset{k}{\sum}}}\left|{\lambda}_{j}-{\lambda}_{j}^{\left(q\right)}\right|\right)/\left(k\left|{\lambda}_{1}\right|\right)\le {10}^{-14},$ (6.4)

where, as before,

${\lambda}_{1}^{\left(q\right)}\ge \cdots \ge {\lambda}_{k}^{(\; q\; )}$

denote the computed Ritz values at the *q*th iteration.

Table 2. Computing *k* dominant eigenvalues with the basic Heart iteration,
$\mathcal{l}=k+40$.

The figures in Tables 2-4 provide the number of iterations that are needed to satisfy (6.4). Thus, for example, from Table 2 we see that only 5 iterations of basic Heart are needed to compute the largest
$k=200$ eigenvalues of the Equispaced test matrix. A comparison of Table 2 with Table 3 shows that the compact version often requires a smaller number of iterations. One reason for this gain lies in the order of the orthogonalizations. In the basic scheme rank (
${Y}_{q+1}$ ) can be smaller than *l*, while in the compact scheme rank (
${Y}_{q+1}$ ) is guaranteed to be *l*. This implies that the compact scheme is able to collect a larger amount of new information.

Table 3. Computing *k* dominant eigenvalues with the compact Heart iteration,
$\mathcal{l}=k+40$.

Table 4. Computing *k* dominant eigenvalues with the compact Heart iteration,
$\mathcal{l}={\left[k\right]}_{40}^{100}$.

The second part of our experiments provides timing results that compare the Heart iterations and MATLAB’s “eigs” function. For this purpose we have used “PH” matrices that have the following form:

$G=\left({\displaystyle \underset{i=1}{\overset{p}{\prod}}}\text{\hspace{0.05em}}\text{\hspace{0.05em}}{H}_{i}\right)D{\left({\displaystyle \underset{i=1}{\overset{p}{\prod}}}\text{\hspace{0.05em}}\text{\hspace{0.05em}}{H}_{i}\right)}^{\text{T}}\in {\mathbb{R}}^{n\times n},$ (6.5)

where
$D\in {\mathbb{R}}^{n\times n}$ is a diagonal matrix and
${H}_{i}\mathrm{,}i=\mathrm{1,}\cdots \mathrm{,}p$, are
$n\times n$ sparse random Householder matrices. The matrix *D* has the form (6.2)-(6.3) with eigenvalues that achieve “slow geometric decay”. That is,

${\lambda}_{j}={\left(0.999\right)}^{j-1}\text{\hspace{0.17em}}\text{\hspace{0.05em}}\text{\hspace{0.05em}}\text{for}\text{\hspace{0.17em}}j=1,\cdots ,n.$ (6.6)

The Householder matrices, ${H}_{i},i=1,\cdots ,p$, have the form

${H}_{i}=I-2{h}_{i}{h}_{i}^{\text{T}}/{h}_{i}^{\text{T}}{h}_{i}\mathrm{,}$ (6.7)

where
${h}_{i}\in {\mathbb{R}}^{n}$ is a sparse random vector. To generate this vector we have used MATLAB’s command *h *= sprand(*n*, 1, density) with the density 1000/*n*. This yields a sparse *n*-vector that has about 1000 nonzero entries at random locations. Consequently, for small values of *p* the resulting PH matrix (6.5) is a large sparse symmetric matrix whose nonzero entries lie at random locations. The number of nonzero,
$\nu $, increases with *p*, see Table 6. Thus, for example, the 3*H* matrix has 9,071,462 nonzero entries, while the 7*H* matrix has 47,417,512 nonzeroes.

The results in Table 5 and Table 6 provide both the computation times (in seconds) and the number of iterations that are needed to compute the *k* largest eigenvalues. The eigenvalues of a PH matrix are given by (6.6). Thus, as before, the iterative process terminates as soon as (6.4) is satisfied.

Table 5. Timing results (in seconds) and number of iterations for the 3*H* matrix,
$n=200000$,
$\nu =9071462$.

Table 6. Timing results (in seconds) and number of iterations for computing $k=50$ dominant eigenpairs of PH matrices.

Let us turn now to conduct a brief operations count for the compact Heart iteration. The Gram-Schmidt orthogonalizations require about

$n\left[{\left(k+l\right)}^{2}-{k}^{2}\right]=n\left({l}^{2}+2kl\right)=nl\left(l+2k\right)$

multiplications per iteration, while the matrix-vector products require
$\left(l+1\right)\nu $ multiplications per iteration. The size of the Rayleigh quotient matrix is
$k+l$, and the spectral decomposition of this matrix requires a moderate multiple of
${\left(k+l\right)}^{3}$ multiplications. Thus, when
$k+l$ is negligible with respect to *n*, the spectral decomposition of the Rayleigh-quotient matrix requires considerably less efforts than the orthogonalizations. In our experiments the spectral decomposition was carried out with MATLAB’s “eig” function. In this case, for
$k=100$ and
$l=k+40=140$, the spectral decomposition of the related 240 × 240 matrix required, on average, about 0.01 seconds. Similarly, for
$k=500$ and
$l=k+40=540$, the spectral decomposition of the related 1040 × 1040 matrix required, on average, 0.1 seconds. That is, the time spent on the spectral decomposition is negligible with respect to the overall computation time. Recall that the Lanczos process provides a tridiagonal
$\left(k+l\right)\times \left(k+l\right)$ Rayleigh quotient matrix whose spectral decomposition is faster than that of a full matrix of the same size. Yet the last observation suggests that this is not a real gain. The Ritz vectors matrix
${V}_{q+1}={X}_{q}{U}_{q}$ is obtained by one matrix-matrix product. Thus, although this product achieves
$n\left(k+l\right)k$ multiplications, it needs considerably less time than the time required for orthogonalizations. These considerations show that most of the computation time is spent during the expansion process, on orthogonalizations and matrix-vector products.

The experiments in Table 5 and Table 6 illustrate how these tasks affect the computation times. Table 5 concentrates on the 3*H* matrix and runs the algorithms for increasing values of *k*. Thus, for example, we see that for
$k=200$ “eigs” required 187.9 seconds, while compact Heart with
$l={\left[200\right]}_{40}^{100}=100$ terminated after 8 iterations and 162.9 seconds. In 3*H* matrices the number of nonzero is moderate; hence for large *k* most of the computation time is spent on orthogonalizations. Table 6 concentrates on
$k=50$ and runs the algorithm with increasing numbers of nonzero. Since *k* is fixed, the time spent on orthogonalizations is fixed, and the increase in time is mainly due to the cost of matrix-vector products.

The timing results demonstrate the ability of the compact Heart algorithm to compete with MATLAB’s “eigs” program. We see that compact Heart is not much slower than “eigs”, and in some cases, it is faster. When judging the timing results it is important to note that compact Heart was programmed (in MATLAB) exactly as described in Section 4. Yet, as in other Krylov subspace methods, the basic version can be improved in several ways. Such modifications may include, for example, more effective orthogonalization schemes, locking, and improved rules for the choice of *l*. Hence from this point of view, the new method is quite promising.

7. Concluding Remarks

Perhaps the main feature that characterizes the new iteration is its simplicity. The discarding of Lanczos’ tridiagonalization and implicit restarts results in a simple iteration that retains a fast rate of convergence. As we have seen, in many cases it requires a remarkably small number of iterations.

The compact Heart iteration is an elegant version of the basic Heart iteration. It uses an effective orthogonalization scheme that avoids the direct computation of the Rayleigh quotient matrix. The experiments that we have done are quite encouraging.

The Heart iteration is a useful tool for calculating low-rank approximations of large matrices. The cross-product approach that was proposed in [8] uses the basic Heart iteration and can be improved by applying the new compact version.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

[1] |
Baglama, J., Calvetti, D. and Reichel, L. (2003) IRBL: An Implicitly Restarted Block Lanczos Method for Large-Scale Hermitian Eigen Problems. SIAM Journal on Scientific Computing, 24, 1650-1677. https://doi.org/10.1137/S1064827501397949 |

[2] |
Bai, A., Demmel, J., Dongarra, J., Ruhe, A. and van der Vorst, H. (1999) Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide. Society for Industrial and Applied Mathematics, Philadelphia. https://doi.org/10.1137/1.9780898719581 |

[3] | Calvetti, D., Reichel, L. and Sorenson, D.C. (1994) An Implicitly Restarted Lanczos Method for Large Symmetric Eigenvalue Problems. Electronic Transactions on Numerical Analysis, 2, 1-21. |

[4] |
Dax, A. (2017) The Numerical Rank of Krylov Matrices. Linear Algebra and Its Applications, 528, 185-205. https://doi.org/10.1016/j.laa.2016.07.022 |

[5] |
Dax, A. (2017) A New Type of Restarted Krylov Methods. Advances in Linear Algebra & Matrix Theory, 7, 18-28. https://doi.org/10.4236/alamt.2017.71003 |

[6] |
Dax, A. (2019) A Restarted Krylov Method with Inexact Inversions. Numerical Linear Algebra with Applications, 26, Article No. e2213. https://doi.org/10.1002/nla.2213 |

[7] |
Dax, A. (2019) Computing the Smallest Singular Triplets of a Large Matrix. Results in Applied Mathematic, 3, Article ID: 100006. https://doi.org/10.1016/j.rinam.2019.100006 |

[8] |
Dax, A. (2019) A Cross-Product Approach for Low-Rank Approximations of Large Matrices. Journal of Computational and Applied Mathematics, 369, Article ID: 112576. https://doi.org/10.1016/j.cam.2019.112576 |

[9] |
Demmel, J.W. (1997) Applied Numerical Linear Algebra. Society for Industrial and Applied Mathematics, Philadelphia. https://doi.org/10.1137/1.9781611971446 |

[10] | Golub, G.H. and Van Loan, C.F. (2013) Matrix Computations. 4th Edition, Johns Hopkins University Press, Baltimore. |

[11] |
Horn, R.A. and Johnson, C.R. (1985) Matrix Analysis. Cambridge University Press, Cambridge. https://doi.org/10.1017/CBO9780511810817 |

[12] | Larsen, R.M. (2001) Combining Implicit Restarts and Partial Reorthogonalization in Lanczos Bidiagonalization. Technical Report, Stanford University. |

[13] |
Li, R., Xi, Y., Vecharynski, E., Yang, C. and Saad, Y. (2016) A Thick-Restart Lanczos Algorithm with Polynomial Filtering for Hermitian Eigenvalue Problems. SIAM Journal on Scientific Computing, 38, A2512-A2534. https://doi.org/10.1137/15M1054493 |

[14] |
Morgan, R.B. (1996) On Restarting the Arnoldi Method for Large Non-Symmetric Eigenvalues Problems. Mathematics of Computation, 65, 1213-1230. https://doi.org/10.1090/S0025-5718-96-00745-4 |

[15] | Parlett, B.N. (1980) The Symmetric Eigenvalue Problem. Prentice-Hall, Englewood Cliffs. |

[16] |
Saad, Y. (2011) Numerical Methods for Large Eigenvalue Problems. Revised Edition, Society for Industrial and Applied Mathematics, Philadelphia. https://doi.org/10.1137/1.9781611970739 |

[17] |
Sorensen, D.C. (1992) Implicit Application of Polynomial Filters in a k-Step Arnoldi Method. SIAM Journal on Matrix Analysis and Applications, 13, 357-385. https://doi.org/10.1137/0613025 |

[18] |
Stewart, G.W. (2001) Matrix Algorithms, Vol. II: Eigensystems. SIAM, Philadelphia. https://doi.org/10.1137/1.9780898718058 |

[19] | Trefethen, L.N. and Bau III, D. (1997) Numerical Linear Algebra. Society for Industrial and Applied Mathematics, Philadelphia. |

[20] |
Watkins, D.S. (2007) The Matrix Eigenvalue Problem: GR and Krylov Subspace Methods. Society for Industrial and Applied Mathematics, Philadelphia. https://doi.org/10.1137/1.9780898717808 |

[21] |
Wu, K. and Simon, H. (2000) Thick-Restarted Lanczos Method for Large Symmetric Eigenvalue Problems. SIAM Journal on Matrix Analysis and Applications, 22, 602-616. https://doi.org/10.1137/S0895479898334605 |

[22] |
Yamazaki, I., Bai, Z., Simon, H., Wang, L. and Wu, K. (2010) Adaptive Projection Subspace Dimension for the Thick-Restart Lanczos Method. ACM Transactions on Mathematical Software, 47, Article No. 27. https://doi.org/10.1145/1824801.1824805 |

[23] | Zhang, F. (1999) Matrix Theory: Basic Results and Techniques. Springer-Verlag, New York. |

Journals Menu

Contact us

+1 323-425-8868 | |

customer@scirp.org | |

+86 18163351462(WhatsApp) | |

1655362766 | |

Paper Publishing WeChat |

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.