1 Arnoldi/Lanczos Iterations
Krylov Subspace Methods
•Given A and b, the nth Krylov subspace is K_{n} = span{b, Ab, A^{2}b, ..., A^{n-1}b}

              linear systems       eigenvalue problems
Hermitian     CG                   Lanczos
Nonhermitian  GMRES, BiCG, etc.    Arnoldi
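As a concrete illustration (a minimal NumPy sketch, not from the slides; the names krylov_matrix, A, and b are illustrative), the matrix whose columns span K_{n}(A, b) can be built by repeated matrix-vector products:

```python
import numpy as np

def krylov_matrix(A, b, n):
    """Return the m-by-n matrix [b, Ab, A^2 b, ..., A^(n-1) b]."""
    K = np.empty((len(b), n))
    K[:, 0] = b
    for j in range(1, n):
        K[:, j] = A @ K[:, j - 1]   # multiply the previous column by A
    return K

# Example: the columns of K span the Krylov subspace K_n(A, b)
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
b = rng.standard_normal(6)
print(np.linalg.matrix_rank(krylov_matrix(A, b, 4)))   # typically 4
```

In practice these columns become nearly linearly dependent very quickly, which is why the orthonormal bases produced by the Arnoldi/Lanczos iterations below are used instead.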
Arnoldi Algorithm
•Let Q_{n} be the m x n matrix with the first n columns of Q,
and H̃_{n} be the (n + 1) x n upper-left section of H
•Start by picking a random q_{1} and then determine q_{2}, q_{3}, ... (together with the entries of H̃_{n}) one column at a time
•The nth column of AQ_{n} = Q_{n+1}H̃_{n} can be written as
Aq_{n} = h_{1n}q_{1} + ... + h_{nn}q_{n} + h_{n+1,n}q_{n+1}
•Algorithm: Arnoldi Iteration
given random nonzero b, let q_{1} = b/||b||
for n = 1, 2, 3, ...
    v = Aq_{n}
    for j = 1 to n
        h_{jn} = q_{j}^{*}v
        v = v - h_{jn}q_{j}
    h_{n+1,n} = ||v||
    q_{n+1} = v/h_{n+1,n}
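The loop above translates almost line for line into NumPy for real matrices. A hedged sketch (the function name arnoldi and its return layout are illustrative, not from the slides):

```python
import numpy as np

def arnoldi(A, b, n):
    """Arnoldi iteration: build orthonormal columns Q (m x (n+1)) and the
    (n+1) x n upper Hessenberg H with A @ Q[:, :n] == Q @ H (no breakdown assumed)."""
    m = len(b)
    Q = np.zeros((m, n + 1))
    H = np.zeros((n + 1, n))
    Q[:, 0] = b / np.linalg.norm(b)
    for k in range(n):
        v = A @ Q[:, k]
        for j in range(k + 1):            # orthogonalize v against q_1, ..., q_{k+1}
            H[j, k] = Q[:, j] @ v
            v = v - H[j, k] * Q[:, j]     # modified Gram-Schmidt update
        H[k + 1, k] = np.linalg.norm(v)
        Q[:, k + 1] = v / H[k + 1, k]     # assumes h_{n+1,n} != 0
    return Q, H
```

The n x n block H[:n, :n] is the H_{n} whose eigenvalues (the Ritz values) are discussed below.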
Lanczos Iteration for Symmetric Matrices
•For symmetric A, H̃_{n} and H_{n} are tridiagonal,
denoted by T̃_{n} and T_{n}, respectively.
AQ_{n} = Q_{n+1}T̃_{n} can be written as the three-term recurrence
Aq_{n} = β_{n-1}q_{n-1} + α_{n}q_{n} + β_{n}q_{n+1}
•where α_{n} are the diagonal entries and
β_{n} are the subdiagonal entries of T_{n}
•Algorithm: Lanczos Iteration
given random b, let β_{0} = 0, q_{0} = 0, q_{1} = b/||b||
for n = 1, 2, 3, ...
    v = Aq_{n}
    α_{n} = q_{n}^{T}v
    v = v - β_{n-1}q_{n-1} - α_{n}q_{n}
    β_{n} = ||v||
    q_{n+1} = v/β_{n}
Question: What are the meanings of
α_{n} and β_{n} in the three-term recurrence?
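A corresponding NumPy sketch for real symmetric A (the function name lanczos and the return layout are illustrative assumptions):

```python
import numpy as np

def lanczos(A, b, n):
    """Lanczos iteration for symmetric A: returns orthonormal Q (m x (n+1)) and the
    diagonals alpha and subdiagonals beta of the (n+1) x n tridiagonal matrix."""
    m = len(b)
    Q = np.zeros((m, n + 1))
    alpha = np.zeros(n)
    beta = np.zeros(n)
    Q[:, 0] = b / np.linalg.norm(b)
    q_prev, beta_prev = np.zeros(m), 0.0
    for k in range(n):
        v = A @ Q[:, k]
        alpha[k] = Q[:, k] @ v
        v = v - beta_prev * q_prev - alpha[k] * Q[:, k]   # three-term recurrence
        beta[k] = np.linalg.norm(v)
        q_prev, beta_prev = Q[:, k], beta[k]
        Q[:, k + 1] = v / beta[k]                         # assumes beta_n != 0
    return Q, alpha, beta
```

Only q_{n} and q_{n-1} enter the recurrence for q_{n+1}, which is what makes constant-storage variants (and CG below) possible.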
Properties of Arnoldi and Lanczos Iterations
•Eigenvalues of H_{n} (or T_{n} in Lanczos iterations) are called Ritz values.
•When n = m, the Ritz values are the eigenvalues of A.
•Even for n << m, Ritz values are often accurate approximations to
eigenvalues of A!
•For symmetric matrices with evenly spaced eigenvalues, Ritz values
tend to converge to the extreme eigenvalues first.
•With rounding errors, the Lanczos iteration can suffer from loss of
orthogonality, which in turn can lead to spurious “ghost” eigenvalues.
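A small numerical check of this convergence-to-extremes behavior (a sketch reusing the lanczos routine above; the matrix size, spectrum, and seed are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 200, 20
# symmetric test matrix with known, evenly spaced eigenvalues in [1, 2]
eigs = np.linspace(1.0, 2.0, m)
V, _ = np.linalg.qr(rng.standard_normal((m, m)))
A = V @ np.diag(eigs) @ V.T
b = rng.standard_normal(m)

Q, alpha, beta = lanczos(A, b, n)
T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
ritz = np.linalg.eigvalsh(T)        # Ritz values = eigenvalues of T_n
print("extreme Ritz values:", ritz[0], ritz[-1])
print("extreme eigenvalues:", eigs[0], eigs[-1])
```

The outermost Ritz values typically approach λ_min and λ_max noticeably faster than the interior eigenvalues are approximated.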
Arnoldi and Polynomial Approximation
•For any x ∈ K_{n}, x = c_{0}b + c_{1}Ab + ... + c_{n-1}A^{n-1}b
•It can be interpreted as a polynomial in A times b, x = p(A)b, where
p(z) = c_{0} + c_{1}z + ... + c_{n-1}z^{n-1}
•Krylov subspace iterations are often analyzed in terms of matrix
polynomials.
•Let P^{n} be the set of polynomials of degree n with c_{n} = 1.
•Optimality of Arnoldi/Lanczos approximation: It finds p* ∈ P^{n}
s.t. ||p*(A)b|| = min over p ∈ P^{n} of ||p(A)b||
•p* is the optimal polynomial among all p ∈ P^{n}
in that it minimizes ||p(A)b||,
which is equivalent to minimizing the distance between A^{n}b and its
projection onto K_{n}
•p* is the characteristic polynomial of H_{n}
⇒ Ritz values are the roots of an optimal polynomial
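To see the optimality statement numerically (a sketch reusing the arnoldi routine above; the helper monic_poly_apply and the test sizes are illustrative assumptions):

```python
import numpy as np

def monic_poly_apply(c, A, b):
    """Evaluate p(A) b for monic p(z) = z^n + c[n-1] z^(n-1) + ... + c[0] (Horner's rule)."""
    x = b.copy()
    for ck in c[::-1]:
        x = A @ x + ck * b
    return x

rng = np.random.default_rng(2)
m, n = 100, 8
A = rng.standard_normal((m, m))
b = rng.standard_normal(m)

Q, H = arnoldi(A, b, n)
charpoly = np.poly(H[:n, :n])     # characteristic polynomial of H_n, leading coefficient 1
c_star = charpoly[:0:-1]          # low-order coefficients c[0], ..., c[n-1]
print("||p*(A)b||     :", np.linalg.norm(monic_poly_apply(c_star, A, b)))
print("perturbed monic:", np.linalg.norm(monic_poly_apply(c_star + rng.standard_normal(n), A, b)))
```

Any other monic polynomial of degree n should give a norm at least as large, matching the statement that the Ritz values are the roots of an optimal polynomial.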
2 Conjugate Gradient Method
Krylov Subspace Algorithms
•Create a sequence of Krylov subspaces K_{n} for Ax = b
and find an approximate (hopefully optimal) solution x_{n} in K_{n}
•Only matrix-vector products are involved
•For SPD matrices, the most famous algorithm is Conjugate Gradient (CG)
•The method was discovered by Hestenes and Stiefel in 1952
•Finds the best solution in the A-norm ||x||_{A} = (x^{T}Ax)^{1/2}, i.e., minimizes the A-norm of the error over K_{n}
•Only requires storing 4 vectors (instead of n vectors) due to a three-term
recurrence
Motivation of Conjugate Gradients
•If A is m x m SPD, then the quadratic function φ(x) = (1/2)x^{T}Ax - x^{T}b
has a unique minimum
•The negative gradient of this function is the residual vector -∇φ(x) = b - Ax = r,
so the minimum is attained precisely when Ax = b
•Optimization methods have the form x_{n} = x_{n-1} + α_{n}p_{n},
where p_{n} is the search direction and α_{n} is the step length chosen to minimize φ(x_{n})
•The line search parameter can be determined analytically as
α_{n} = p_{n}^{T}r_{n-1}/(p_{n}^{T}Ap_{n})
•In CG, p_{n} is chosen to be A-conjugate (or A-orthogonal) to the previous
search directions, i.e., p_{n}^{T}Ap_{j} = 0 for j < n
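A quick NumPy check (an illustrative sketch; the function phi and the SPD test matrix are assumptions) that the analytic α indeed minimizes φ along a search direction:

```python
import numpy as np

def phi(A, b, x):
    """Quadratic form phi(x) = 0.5 x^T A x - x^T b."""
    return 0.5 * x @ A @ x - x @ b

rng = np.random.default_rng(3)
m = 50
M = rng.standard_normal((m, m))
A = M @ M.T + m * np.eye(m)         # SPD test matrix
b = rng.standard_normal(m)

x = np.zeros(m)
r = b - A @ x                       # residual = negative gradient of phi at x
p = rng.standard_normal(m)          # some search direction
alpha = (p @ r) / (p @ A @ p)       # analytic line-search parameter

# phi at the analytic step is no larger than at nearby step lengths
print(phi(A, b, x + alpha * p) <= min(phi(A, b, x + (alpha - 0.1) * p),
                                      phi(A, b, x + (alpha + 0.1) * p)))   # True
```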
Conjugate Gradient Method
Algorithm: Conjugate Gradient Method
x_{0} = 0, r_{0} = b, p_{0} = r_{0}
for n = 1, 2, 3, ...
    α_{n} = (r_{n-1}^{T}r_{n-1})/(p_{n-1}^{T}Ap_{n-1})      step length
    x_{n} = x_{n-1} + α_{n}p_{n-1}                          approximate solution
    r_{n} = r_{n-1} - α_{n}Ap_{n-1}                         residual
    β_{n} = (r_{n}^{T}r_{n})/(r_{n-1}^{T}r_{n-1})           improvement this step
    p_{n} = r_{n} + β_{n}p_{n-1}                            search direction
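A direct transcription of the algorithm into NumPy (a sketch; the function name conjugate_gradient, the tolerance, and the iteration cap are illustrative choices):

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Conjugate Gradient for SPD A, storing only a handful of vectors."""
    m = len(b)
    max_iter = max_iter or m
    x = np.zeros(m)
    r = b.copy()                     # r_0 = b - A x_0 with x_0 = 0
    p = r.copy()                     # p_0 = r_0
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p                   # the single matrix-vector product per iteration
        alpha = rs_old / (p @ Ap)    # step length
        x = x + alpha * p            # approximate solution
        r = r - alpha * Ap           # residual
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:    # stop once the residual is small
            break
        beta = rs_new / rs_old       # improvement this step
        p = r + beta * p             # next A-conjugate search direction
        rs_old = rs_new
    return x
```

With the SPD matrix A and vector b from the previous sketch, np.allclose(A @ conjugate_gradient(A, b), b) should hold.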
•Only one matrix-vector product Ap_{n-1} per iteration
•Apart from the matrix-vector product, the operation count per iteration is
O(m)
•If A is sparse with a constant number of nonzeros per row, only O(m)
operations per iteration are needed overall