Newsgroups: sci.math.num-analysis,comp.soft-sys.matlab
Subject: Re: sqrtm() of a matrix.
From: moler@mathworks.com (Cleve Moler)
Date: 14 Mar 1997 12:20:28 -0500

Abed M. Hammoud, D.Sc. <abed@oregano.wustl.edu> started this discussion with:

> I am trying to decide on the best algorithm to use to calculate the
> square root of a matrix. My first shot at the problem, I tried the
> following, for a positive semidefinite matrix (A):
>
> 1- [E D] = eig(A); sqrtm(A) = E * sqrt(D) * E', where D is a diagonal
>    matrix. sqrt(D) is formed by taking the square root of the diagonal
>    entries in D.
>
> Another equivalent way is to use:
>
> 2- [U S V] = svd(A); sqrtm(A) = U * sqrt(S) * V'
>
> 3- Then I looked at the book by Golub and Van Loan; they suggest, under
>    matrix square root, the following method: use the Cholesky
>    decomposition to write A as A = G G', then use the SVD of G, i.e.,
>    [u s v] = svd(G). The square root of A is then sqrtm(A) = u * s * u'.
>
> So my question is: which is the most accurate and best way of doing
> the sqrtm operation? ...

I responded with:

> If your matrix is symmetric and positive definite, then your method 1,
> based on eigenvalues and eigenvectors, is the best choice. Your method 2
> is the same as method 1 for symmetric, positive definite A because V = U
> in this case. But use of "svd", instead of "eig", does not take advantage
> of symmetry and is more work. I've never seen your method 3, can't
> find it in Golub and Van Loan, and don't believe it will work.
> ...

Several other people joined the discussion, including my longtime friend
and colleague, G. W. (Pete) Stewart <stewart@cs.umd.edu>, who wrote:

> For positive semidefinite matrices, all three methods work in
> principle, the first being the most efficient (however, you have
> to set to zero any negative eigenvalues that rounding error might
> produce).
>
> It is worth noting that many applications can be recast to use
> a Cholesky decomposition instead of the square root.
> Cleve was referring to indefinite and nonsymmetric matrices.
> Even MATLAB (at home, at the office, and even on the beach) can
> have problems. But I did find something funny. When I run
> the script
>
>    small = 1.e-5;
>    for n = 2:5
>       a = small*diag(randn(n,1)) + diag(ones(n-1,1),1);
>       b = sqrtm(a);
>       c = a^.5;
>       disp([norm(b*b-a), norm(c*c-a)])
>    end
>
> I get the following output (error messages deleted):
>
>    2.2204e-16   5.5511e-17
>    1.5172e-02   2.2555e-10
>    1.4740e+04   4.3826e-06
>    1.2471e+06   8.3380e+01
>
> The exponential function, which I believe is based on the
> eigendecomposition, gives the accuracy one deserves. But sqrtm blows
> it.
>
> Pete Stewart

Here are some additional thoughts on the overall problem, and then a
response to Pete's "funny" results.

First of all, what is a matrix square root? If A is any square matrix,
then MATLAB's notion of a square root of A is any matrix S which satisfies

   S*S = S^2 = A

There are no transposes involved in this definition. But, as has been
pointed out, there are other matrices which act like a square root.
Any matrix X which satisfies

   X'*X = A,  or  X*X' = A

might also be thought of as a square root. If such an X exists, and A is
real, then A must be symmetric and, in fact, positive definite. Such
matrices have a Cholesky decomposition. The classic Cholesky
decomposition is lower triangular (see Householder, The Theory of
Matrices in Numerical Analysis):

   L*L' = A

But it was Pete's idea, with LINPACK, to use an upper triangular
decomposition so that the Cholesky algorithm would be column oriented:

   R'*R = A

MATLAB inherited this convention; our Cholesky decomposition is upper
triangular.

So we have at least three different candidates for square root -- S, L
and R. If A is diagonal, with positive diagonal entries, we can take
the positive scalar square roots of the diagonal entries. The resulting
diagonal matrix could be called S, L or R -- all three are the same in
this case.
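The distinction among S, L and R is easy to check numerically. Here is a
sketch in Python with NumPy (the 2-by-2 positive definite matrix is a
made-up example; note that NumPy's Cholesky routine returns the lower
triangular factor, so the MATLAB-style upper factor R is its transpose):

```python
import numpy as np

# A hypothetical small symmetric positive definite matrix.
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# Symmetric square root via the eigendecomposition: S*S = A.
w, V = np.linalg.eigh(A)
S = V @ np.diag(np.sqrt(w)) @ V.T

# Cholesky factors: NumPy gives the lower triangular L with L*L' = A;
# the LINPACK/MATLAB convention is the upper triangular R = L' with R'*R = A.
L = np.linalg.cholesky(A)
R = L.T

assert np.allclose(S @ S, A)      # S is a true square root
assert np.allclose(L @ L.T, A)    # lower Cholesky convention
assert np.allclose(R.T @ R, A)    # upper Cholesky convention
```

S is symmetric while L and R are triangular, so all three satisfy their
own defining equation but only S satisfies S*S = A.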
But we could also put minus signs in front of any of the diagonal
entries and obtain other matrices which are square roots of A.

If A is symmetric and positive definite, but not diagonal, then S, L and
R are not the same. It is possible, and reasonable, to also make S
symmetric and positive definite. But L and R certainly aren't
symmetric -- they are triangular. Here's an example.

   >> A = hilb(4)
   A =
       1.0000    0.5000    0.3333    0.2500
       0.5000    0.3333    0.2500    0.2000
       0.3333    0.2500    0.2000    0.1667
       0.2500    0.2000    0.1667    0.1429

   >> S = sqrtm(A)
   S =
       0.9115    0.3390    0.1927    0.1310
       0.3390    0.3529    0.2475    0.1807
       0.1927    0.2475    0.2411    0.2086
       0.1310    0.1807    0.2086    0.2226

   >> R = chol(A)
   R =
       1.0000    0.5000    0.3333    0.2500
            0    0.2887    0.2887    0.2598
            0         0    0.0745    0.1118
            0         0         0    0.0189

   >> L = R'
   L =
       1.0000         0         0         0
       0.5000    0.2887         0         0
       0.3333    0.2887    0.0745         0
       0.2500    0.2598    0.1118    0.0189

Each of S, L and R satisfies its defining equation to within roundoff
error.

For symmetric, positive definite A, MATLAB computes sqrtm(A) with the
first algorithm that Hammoud mentions:

   [V,D] = eig(A);
   S = V*diag(sqrt(diag(D)))*V';

Hammoud's second algorithm is a more expensive way of computing the same
thing. I was unsure about Hammoud's third algorithm for three reasons:
I looked in the second, not the third, edition of Golub and Van Loan;
there is a confusion between the lower and upper Cholesky
decompositions; and the algorithm doesn't generalize to nonsymmetric or
indefinite matrices.

Now let's look at Pete's "funny" results. His matrices have some random
entries, but one of them might be

   A =
      -1.441e-05   1.0000e+00            0            0
               0   5.7115e-06   1.0000e+00            0
               0            0  -3.9989e-06   1.0000e+00
               0            0            0   6.9000e-06

This matrix is not symmetric, certainly not positive definite, and is
dangerously close to

   A =
       0     1     0     0
       0     0     1     0
       0     0     0     1
       0     0     0     0

This is an instance of sqrtm's worst nightmare. Its square root does
not exist. Complex numbers and infinities don't help. There simply is
no matrix S for which S*S is equal to A.
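The nonexistence is elementary to verify in the 2-by-2 analogue,
N = [0 1; 0 0]. Any S with S*S = N would have eigenvalues whose squares
are the eigenvalues of N, all zero, so S would be nilpotent; and by
Cayley-Hamilton a 2-by-2 nilpotent matrix satisfies S^2 = 0, which is
not N. A quick numerical illustration of that Cayley-Hamilton step, in
Python with NumPy (the random sampling is just for demonstration):

```python
import numpy as np

# Cayley-Hamilton for 2x2: S^2 - trace(S)*S + det(S)*I = 0.
# A nilpotent S has trace 0 and det 0, so S^2 = 0 exactly --
# hence no 2x2 matrix squares to N = [[0, 1], [0, 0]].
rng = np.random.default_rng(0)
for _ in range(5):
    a = rng.standard_normal()
    b = 1.0 + abs(rng.standard_normal())   # any nonzero b will do
    c = -a * a / b                          # forces det = -a^2 - b*c = 0
    S = np.array([[a, b], [c, -a]])         # trace 0, det 0 => nilpotent
    assert np.allclose(S @ S, np.zeros((2, 2)))
```

The same eigenvalue argument rules out a square root for the 4-by-4
shift matrix above.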
As Pete's parameter "small" gets even smaller, his matrices approach
ones for which sqrtm must fail. The results don't exist. Even when
small = 1.e-5, sqrtm knows it's in trouble and prints out warning
messages, which Pete has omitted.

The sqrtm function is based on a clever observation known as Parlett's
algorithm. It does involve eigenvalues. In contrast, the exponential
function, expm, does not use eigenvalues or Parlett's algorithm. You
can see the details with

   type sqrtm
   type expm1

It is true that sqrtm could be improved by using a block form of
Parlett's algorithm. But I'm not sure if that would make Pete happy.
It's impossible to compute things which don't exist. It's difficult to
compute things which almost don't exist.

 -- Cleve Moler
    moler@mathworks.com
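For readers following along today, Pete's experiment translates directly
to Python with SciPy, whose sqrtm is understood to use a blocked Schur
method in the spirit of the block Parlett improvement mentioned above
(the seed and the residual magnitudes are illustrative, not the 1997
numbers):

```python
import numpy as np
from scipy.linalg import sqrtm, fractional_matrix_power

# Pete's script, rephrased: nearly nilpotent upper bidiagonal matrices
# with tiny random diagonals and ones on the superdiagonal.
small = 1e-5
rng = np.random.default_rng(1)
for n in range(2, 6):
    A = small * np.diag(rng.standard_normal(n)) + np.diag(np.ones(n - 1), 1)
    B = sqrtm(A)                            # Schur-based matrix square root
    C = fractional_matrix_power(A, 0.5)     # the a^.5 route
    print(n, np.linalg.norm(B @ B - A), np.linalg.norm(C @ C - A))
```

As small shrinks, these matrices approach the shift matrix whose square
root does not exist, so no method can keep the residuals small; the
comparison only shows how gracefully each one degrades.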
