* This is the main program for a simple test of * PDSQRT.f parameter (n=5) real A(n,n), S(n,n),w(8*n), tiny integer iw(5*n), i, j tiny = 1.0E-6 do j=1, n do i=j, n A(i,j) = 1.0/(2.0 + (i-j)**2) enddo enddo call PDSQRT(n,A,S,w,iw, tiny) do i=1, n print*, (S(i,j),j=1,i) enddo end **************************************************************** * This is the fortran 77 program for calculating * the square root of a symmetric positive definite matrix. * It requires a few routines in LAPACK (and BLAS). It * works on a SUN workstation. LAPACK is already included * in SUN's performance library. It can be compiled with * f77 -xlic_lib=sunperf pdsqrt.f * in Solaris 7 and 8. * Here A is the input matrix, only the lower triangular * part is really needed. * S is the square root matrix. The lower triangular part * is calculated. * w is a real vector for working space. * iw is an integer vector for working space. * tiny is a real error control parameter. ***************************************************************** subroutine PDSQRT(n,A,S,w,iw,tiny) integer n, iw(5*n) real A(n,n), S(n,n), w(8*n),tiny call PDSQRW(n,A,S,w,w(n+1),w(2*n+1),w(3*n+1),w(4*n+1),iw,tiny) return end ***************************************************************** subroutine PDSQRW(n,A,S,d,e,tau,w,work,iw,tiny) implicit none integer n real A(n,n),S(n,n),d(n),e(n),tau(n),w(n),work(4*n),tiny integer iw(5*n) real lams, laml, mu, rtmu, coef * real time(2), ttime, dtime integer lwork, m, info, i, j, mfound, nsplit lwork = 4*n * SSYTRD reduces a real symmetric matrix A to real symmetric * tridiagonal form T by an orthogonal similarity transformation: * Q**T * A * Q = T, where Q is is represented as a product of * elementary reflectors. * * It takes: 4/3 n^3 operations. call SSYTRD( 'L', n, A, n, d, e, tau, work, lwork, info ) * SSTEBZ computes the eigenvalues of a symmetric tridiagonal * matrix T. * * It takes O(n) operations. call SSTEBZ( 'I', 'B', n, 0.0, 0.0, 1, 1, 0.0, d, e, $ mfound, nsplit, w, iw, iw(n+1), work, iw(2*n+1), $ info ) lams = w(1) call SSTEBZ( 'I', 'B', n, 0.0, 0.0, n, n, 0.0, d, e, $ mfound, nsplit, w, iw, iw(n+1), work, iw(2*n+1), $ info ) laml = w(1) call XING(laml, lams, tiny, mu, m) print*, 'm=', m rtmu = sqrt(mu) * We translate the tridiagonal matrix T to X. coef = 1.0/mu do i=1, n-1 d(i) = coef*d(i) - 1.0 e(i) = coef*e(i) enddo d(n) = coef*d(n)-1.0 * It takes: O(m n^2) call TRISRT(n,d,e,m,rtmu,S,work,work(n+1),work(2*n+1)) * Given symmetric matrix S, we calculate QSQ^t * Q is given as in the SSYTRD, stored in elementary reflectors. call APPLIQ(n,A,tau,S, work, work(n+1)) do j=1, n S(j,j) = S(j,j) + rtmu enddo return end **************************************************** * This is the IMPROVED subroutine for calculating m and mu with a given * laml, lams, tiny. The theory is given in section 3 of the paper: * * A Pad\'e approximation Method for Square Roots of * Symmetric Positive Definite Matrices. * * Input: laml > lams > 0, the extreme eigenvalues of * a symmetric positive definite matrix * tiny, the relative desired accuracy. *************************************************** subroutine XING(laml, lams, tiny, mu, m) implicit none real laml, lams, tiny, mu integer m real kappa,rtkp,a,b,ab,tau,t,err,f,fp,mstar,s integer it kappa = laml/lams rtkp = sqrt(kappa) a = 2.0/tiny -1.0 b = 1.0 + 2.0/(rtkp*tiny) ab = a*b tau = ( rtkp+1.0)/( rtkp-1.0) t = sqrt( (rtkp-1.0)*log(a)*log(b) ) t = 2.0/t err = 1.0 it = 0 1 if ( abs(err/t) .GT. 1.0E-5) then it = it + 1 f = (ab)**t - tau*(a**t+b**t)+1.0 fp = log(ab)*(ab)**t - tau*(log(a)*a**t+log(b)*b**t) err = f/fp t = t - err goto 1 endif mstar = (1.0-t)/(2.0*t) m = mstar if ( mstar .GT. m) m = m+1 c Now we improve our mu value for the fixed integer m s = (a**t-1.0)/(a**t+1.0) a = sqrt(rtkp) b = 1.0/a s = s*a err = 1.0 it = 0 2 if ( abs(err/s) .GT. 1.0E-5) then it = it + 1 c print*, it f=rtkp*(((s+b)/(s-b))**(2*m+1)-1.0)-((a+s)/(a-s))**(2*m+1)-1.0 fp = ((s+b)/(s-b))**(2*m) / (s-b)**2 + $ ((a+s)/(a-s))**(2*m) / (s-a)**2 fp = -2*(2*m+1)*a*fp err = f/fp s = s - err goto 2 endif mu = s*s*sqrt(laml*lams) c c print*, 'Number of iteration = ', it return end ******************************************************* * This is the program to calculate * S = SUM a_j sqrt(mu) ( I + b_j X)^(-1) X ******************************************************* subroutine TRISRT(n,d,e,m,rtmu,S,dx,ex,x) implicit none integer n, m real d(n), e(n-1), S(n,n), dx(n), ex(n-1), x(n) integer i,j,k,info real theta, ak, bk, coef, pi, rtmu pi = 4.0*atan(1.0) do j=1, n do i=j, n S(i,j) = 0.0 enddo enddo do k=1, m theta = k*pi/(2.0*m+1.0) ak = 2.0/(2.0*m+1.0)* ( sin(theta))**2 bk = ( cos(theta) )**2 coef = ak*rtmu do i=1, n-1 dx(i) = 1.0 + bk*d(i) ex(i) = bk*e(i) enddo dx(n) = 1.0 + bk*d(n) * SPTTRF computes the factorization of a real symmetric positive * definite tridiagonal matrix A. call SPTTRF(n,dx,ex,info) do j=1, n do i=1, n x(i) = 0.0 enddo x(j) = d(j)*coef if ( j .GT. 1) x(j-1) = coef*e(j-1) if ( j .LT. n) x(j+1) = e(j)*coef * SPTTRS solves a system of linear equations A * X = B with a * symmetric positive definite tridiagonal matrix A using the * factorization A = L*D*L**T or A = U**T*D*U computed by SPTTRF. * (The two forms are equivalent if A is real.) * call SPTTRS(n, neq, dx, ex, x, n, info) call TRISPE(n, dx, ex, x, j) * We will replace this by our own program that takes * advantage of the zeros in x, and the symmetric of * solution matrix. do i=j, n S(i,j) = S(i,j) + x(i) enddo enddo enddo return end *********************************************************** subroutine TRISPE(n, d, e, x, j) implicit none integer n, j real d(n), e(n-1), x(n) integer i if ( j .GT. 1) x(j) = x(j)-e(j-1)*x(j-1) if ( j .LT. n) x(j+1) = x(j+1) - e(j)*x(j) do i = j+2, n x(i) = -e(i-1)*x(i-1) enddo x(n) = x(n)/d(n) do i=n-1, j, -1 x(i) = x(i)/d(i) - e(i)*x(i+1) enddo return end ****************************************************************** * This program requires 2n^3 flops. * SSYTRD reduces a real symmetric matrix A to real symmetric * tridiagonal form T by an orthogonal similarity transformation: * Q**T * A * Q = T, where Q is is represented as a product of * elementary reflectors. * * It takes: 4/3 n^3 operations. * * If UPLO = 'L', the matrix Q is represented as a product of elementary * reflectors * * Q = H(1) H(2) . . . H(n-1). * * Each H(i) has the form * * H(i) = I - tau * v * v' * * where tau is a real scalar, and v is a real vector with * v(1:i) = 0 and v(i+1) = 1; v(i+2:n) is stored on exit in A(i+2:n,i), * and tau in TAU(i). * * The contents of A on exit are illustrated by the following examples * with n = 5: * * if UPLO = 'U': if UPLO = 'L': * * ( d e v2 v3 v4 ) ( d ) * ( d e v3 v4 ) ( e d ) * ( d e v4 ) ( v1 e d ) * ( d e ) ( v1 v2 e d ) * ( d ) ( v1 v2 v3 e d ) * * where d and e denote diagonal and off-diagonal elements of T, and vi * denotes an element of the vector defining H(i). subroutine APPLIQ(n,A,tau,S,v,w) implicit none integer n real A(n,n), tau(n-1), S(n,n), v(n), w(n) integer i, j, k real vtw do k= n-2, 1, -1 * print*, k * Redefine the vector v, as sqrt(tau)*v, H = I - v*v' v(k+1) = sqrt(tau(k)) do i=k+2, n v(i) = A(i,k)*v(k+1) enddo * Calculate the lower left corner, dimension is: (n-k) x k * Total operations: 4 (n-k) k do j=1, k w(j) = 0.0 do i=k+1, n w(j) = w(j) + v(i)*S(i,j) enddo enddo do j=1, k do i=k+1, n S(i,j) = S(i,j) - v(i)*w(j) enddo enddo * Calculate the lower right trailing block, only the lower * triangular part, domension (n-k) x (n-k) * Total operations: 4 (n-k)^2 * Operations: 2 (n-k)^2 do i=k+1, n w(i) = 0.0 do j=k+1, i w(i) = w(i) + S(i,j)*v(j) enddo do j=i+1, n w(i) = w(i) + S(j,i)*v(j) enddo enddo vtw = 0.0 do i=k+1, n vtw = vtw + v(i)*w(i) enddo vtw = 0.5*vtw do i=k+1,n w(i) = w(i) - vtw*v(i) enddo * Operations: 4/2 (n-k)^2 = 2(n-k)^2 do j= k+1, n do i=j, n S(i,j) = S(i,j) - w(i)*v(j)-v(i)*w(j) enddo enddo enddo return end **************************************************