![]() ![]() | ![]() |
Mise à jour de lapack vers la version 3.3.0.
1: SUBROUTINE DGEJSV( JOBA, JOBU, JOBV, JOBR, JOBT, JOBP, 2: & M, N, A, LDA, SVA, U, LDU, V, LDV, 3: & WORK, LWORK, IWORK, INFO ) 4: * 5: * -- LAPACK routine (version 3.3.0) -- 6: * 7: * -- Contributed by Zlatko Drmac of the University of Zagreb and -- 8: * -- Kresimir Veselic of the Fernuniversitaet Hagen -- 9: * November 2010 10: * 11: * -- LAPACK is a software package provided by Univ. of Tennessee, -- 12: * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..-- 13: * 14: * This routine is also part of SIGMA (version 1.23, October 23. 2008.) 15: * SIGMA is a library of algorithms for highly accurate algorithms for 16: * computation of SVD, PSVD, QSVD, (H,K)-SVD, and for solution of the 17: * eigenvalue problems Hx = lambda M x, H M x = lambda x with H, M > 0. 18: * 19: * .. Scalar Arguments .. 20: IMPLICIT NONE 21: INTEGER INFO, LDA, LDU, LDV, LWORK, M, N 22: * .. 23: * .. Array Arguments .. 24: DOUBLE PRECISION A( LDA, * ), SVA( N ), U( LDU, * ), V( LDV, * ), 25: & WORK( LWORK ) 26: INTEGER IWORK( * ) 27: CHARACTER*1 JOBA, JOBP, JOBR, JOBT, JOBU, JOBV 28: * .. 29: * 30: * Purpose 31: * ======= 32: * 33: * DGEJSV computes the singular value decomposition (SVD) of a real M-by-N 34: * matrix [A], where M >= N. The SVD of [A] is written as 35: * 36: * [A] = [U] * [SIGMA] * [V]^t, 37: * 38: * where [SIGMA] is an N-by-N (M-by-N) matrix which is zero except for its N 39: * diagonal elements, [U] is an M-by-N (or M-by-M) orthonormal matrix, and 40: * [V] is an N-by-N orthogonal matrix. The diagonal elements of [SIGMA] are 41: * the singular values of [A]. The columns of [U] and [V] are the left and 42: * the right singular vectors of [A], respectively. The matrices [U] and [V] 43: * are computed and stored in the arrays U and V, respectively. The diagonal 44: * of [SIGMA] is computed and stored in the array SVA. 45: * 46: * Arguments 47: * ========= 48: * 49: * JOBA (input) CHARACTER*1 50: * Specifies the level of accuracy: 51: * = 'C': This option works well (high relative accuracy) if A = B * D, 52: * with well-conditioned B and arbitrary diagonal matrix D. 53: * The accuracy cannot be spoiled by COLUMN scaling. The 54: * accuracy of the computed output depends on the condition of 55: * B, and the procedure aims at the best theoretical accuracy. 56: * The relative error max_{i=1:N}|d sigma_i| / sigma_i is 57: * bounded by f(M,N)*epsilon* cond(B), independent of D. 58: * The input matrix is preprocessed with the QRF with column 59: * pivoting. This initial preprocessing and preconditioning by 60: * a rank revealing QR factorization is common for all values of 61: * JOBA. Additional actions are specified as follows: 62: * = 'E': Computation as with 'C' with an additional estimate of the 63: * condition number of B. It provides a realistic error bound. 64: * = 'F': If A = D1 * C * D2 with ill-conditioned diagonal scalings 65: * D1, D2, and well-conditioned matrix C, this option gives 66: * higher accuracy than the 'C' option. If the structure of the 67: * input matrix is not known, and relative accuracy is 68: * desirable, then this option is advisable. The input matrix A 69: * is preprocessed with QR factorization with FULL (row and 70: * column) pivoting. 71: * = 'G' Computation as with 'F' with an additional estimate of the 72: * condition number of B, where A=D*B. If A has heavily weighted 73: * rows, then using this condition number gives too pessimistic 74: * error bound. 75: * = 'A': Small singular values are the noise and the matrix is treated 76: * as numerically rank defficient. The error in the computed 77: * singular values is bounded by f(m,n)*epsilon*||A||. 78: * The computed SVD A = U * S * V^t restores A up to 79: * f(m,n)*epsilon*||A||. 80: * This gives the procedure the licence to discard (set to zero) 81: * all singular values below N*epsilon*||A||. 82: * = 'R': Similar as in 'A'. Rank revealing property of the initial 83: * QR factorization is used do reveal (using triangular factor) 84: * a gap sigma_{r+1} < epsilon * sigma_r in which case the 85: * numerical RANK is declared to be r. The SVD is computed with 86: * absolute error bounds, but more accurately than with 'A'. 87: * 88: * JOBU (input) CHARACTER*1 89: * Specifies whether to compute the columns of U: 90: * = 'U': N columns of U are returned in the array U. 91: * = 'F': full set of M left sing. vectors is returned in the array U. 92: * = 'W': U may be used as workspace of length M*N. See the description 93: * of U. 94: * = 'N': U is not computed. 95: * 96: * JOBV (input) CHARACTER*1 97: * Specifies whether to compute the matrix V: 98: * = 'V': N columns of V are returned in the array V; Jacobi rotations 99: * are not explicitly accumulated. 100: * = 'J': N columns of V are returned in the array V, but they are 101: * computed as the product of Jacobi rotations. This option is 102: * allowed only if JOBU .NE. 'N', i.e. in computing the full SVD. 103: * = 'W': V may be used as workspace of length N*N. See the description 104: * of V. 105: * = 'N': V is not computed. 106: * 107: * JOBR (input) CHARACTER*1 108: * Specifies the RANGE for the singular values. Issues the licence to 109: * set to zero small positive singular values if they are outside 110: * specified range. If A .NE. 0 is scaled so that the largest singular 111: * value of c*A is around DSQRT(BIG), BIG=SLAMCH('O'), then JOBR issues 112: * the licence to kill columns of A whose norm in c*A is less than 113: * DSQRT(SFMIN) (for JOBR.EQ.'R'), or less than SMALL=SFMIN/EPSLN, 114: * where SFMIN=SLAMCH('S'), EPSLN=SLAMCH('E'). 115: * = 'N': Do not kill small columns of c*A. This option assumes that 116: * BLAS and QR factorizations and triangular solvers are 117: * implemented to work in that range. If the condition of A 118: * is greater than BIG, use DGESVJ. 119: * = 'R': RESTRICTED range for sigma(c*A) is [DSQRT(SFMIN), DSQRT(BIG)] 120: * (roughly, as described above). This option is recommended. 121: * ~~~~~~~~~~~~~~~~~~~~~~~~~~~ 122: * For computing the singular values in the FULL range [SFMIN,BIG] 123: * use DGESVJ. 124: * 125: * JOBT (input) CHARACTER*1 126: * If the matrix is square then the procedure may determine to use 127: * transposed A if A^t seems to be better with respect to convergence. 128: * If the matrix is not square, JOBT is ignored. This is subject to 129: * changes in the future. 130: * The decision is based on two values of entropy over the adjoint 131: * orbit of A^t * A. See the descriptions of WORK(6) and WORK(7). 132: * = 'T': transpose if entropy test indicates possibly faster 133: * convergence of Jacobi process if A^t is taken as input. If A is 134: * replaced with A^t, then the row pivoting is included automatically. 135: * = 'N': do not speculate. 136: * This option can be used to compute only the singular values, or the 137: * full SVD (U, SIGMA and V). For only one set of singular vectors 138: * (U or V), the caller should provide both U and V, as one of the 139: * matrices is used as workspace if the matrix A is transposed. 140: * The implementer can easily remove this constraint and make the 141: * code more complicated. See the descriptions of U and V. 142: * 143: * JOBP (input) CHARACTER*1 144: * Issues the licence to introduce structured perturbations to drown 145: * denormalized numbers. This licence should be active if the 146: * denormals are poorly implemented, causing slow computation, 147: * especially in cases of fast convergence (!). For details see [1,2]. 148: * For the sake of simplicity, this perturbations are included only 149: * when the full SVD or only the singular values are requested. The 150: * implementer/user can easily add the perturbation for the cases of 151: * computing one set of singular vectors. 152: * = 'P': introduce perturbation 153: * = 'N': do not perturb 154: * 155: * M (input) INTEGER 156: * The number of rows of the input matrix A. M >= 0. 157: * 158: * N (input) INTEGER 159: * The number of columns of the input matrix A. M >= N >= 0. 160: * 161: * A (input/workspace) DOUBLE PRECISION array, dimension (LDA,N) 162: * On entry, the M-by-N matrix A. 163: * 164: * LDA (input) INTEGER 165: * The leading dimension of the array A. LDA >= max(1,M). 166: * 167: * SVA (workspace/output) DOUBLE PRECISION array, dimension (N) 168: * On exit, 169: * - For WORK(1)/WORK(2) = ONE: The singular values of A. During the 170: * computation SVA contains Euclidean column norms of the 171: * iterated matrices in the array A. 172: * - For WORK(1) .NE. WORK(2): The singular values of A are 173: * (WORK(1)/WORK(2)) * SVA(1:N). This factored form is used if 174: * sigma_max(A) overflows or if small singular values have been 175: * saved from underflow by scaling the input matrix A. 176: * - If JOBR='R' then some of the singular values may be returned 177: * as exact zeros obtained by "set to zero" because they are 178: * below the numerical rank threshold or are denormalized numbers. 179: * 180: * U (workspace/output) DOUBLE PRECISION array, dimension ( LDU, N ) 181: * If JOBU = 'U', then U contains on exit the M-by-N matrix of 182: * the left singular vectors. 183: * If JOBU = 'F', then U contains on exit the M-by-M matrix of 184: * the left singular vectors, including an ONB 185: * of the orthogonal complement of the Range(A). 186: * If JOBU = 'W' .AND. (JOBV.EQ.'V' .AND. JOBT.EQ.'T' .AND. M.EQ.N), 187: * then U is used as workspace if the procedure 188: * replaces A with A^t. In that case, [V] is computed 189: * in U as left singular vectors of A^t and then 190: * copied back to the V array. This 'W' option is just 191: * a reminder to the caller that in this case U is 192: * reserved as workspace of length N*N. 193: * If JOBU = 'N' U is not referenced. 194: * 195: * LDU (input) INTEGER 196: * The leading dimension of the array U, LDU >= 1. 197: * IF JOBU = 'U' or 'F' or 'W', then LDU >= M. 198: * 199: * V (workspace/output) DOUBLE PRECISION array, dimension ( LDV, N ) 200: * If JOBV = 'V', 'J' then V contains on exit the N-by-N matrix of 201: * the right singular vectors; 202: * If JOBV = 'W', AND (JOBU.EQ.'U' AND JOBT.EQ.'T' AND M.EQ.N), 203: * then V is used as workspace if the pprocedure 204: * replaces A with A^t. In that case, [U] is computed 205: * in V as right singular vectors of A^t and then 206: * copied back to the U array. This 'W' option is just 207: * a reminder to the caller that in this case V is 208: * reserved as workspace of length N*N. 209: * If JOBV = 'N' V is not referenced. 210: * 211: * LDV (input) INTEGER 212: * The leading dimension of the array V, LDV >= 1. 213: * If JOBV = 'V' or 'J' or 'W', then LDV >= N. 214: * 215: * WORK (workspace/output) DOUBLE PRECISION array, dimension at least LWORK. 216: * On exit, 217: * WORK(1) = SCALE = WORK(2) / WORK(1) is the scaling factor such 218: * that SCALE*SVA(1:N) are the computed singular values 219: * of A. (See the description of SVA().) 220: * WORK(2) = See the description of WORK(1). 221: * WORK(3) = SCONDA is an estimate for the condition number of 222: * column equilibrated A. (If JOBA .EQ. 'E' or 'G') 223: * SCONDA is an estimate of DSQRT(||(R^t * R)^(-1)||_1). 224: * It is computed using DPOCON. It holds 225: * N^(-1/4) * SCONDA <= ||R^(-1)||_2 <= N^(1/4) * SCONDA 226: * where R is the triangular factor from the QRF of A. 227: * However, if R is truncated and the numerical rank is 228: * determined to be strictly smaller than N, SCONDA is 229: * returned as -1, thus indicating that the smallest 230: * singular values might be lost. 231: * 232: * If full SVD is needed, the following two condition numbers are 233: * useful for the analysis of the algorithm. They are provied for 234: * a developer/implementer who is familiar with the details of 235: * the method. 236: * 237: * WORK(4) = an estimate of the scaled condition number of the 238: * triangular factor in the first QR factorization. 239: * WORK(5) = an estimate of the scaled condition number of the 240: * triangular factor in the second QR factorization. 241: * The following two parameters are computed if JOBT .EQ. 'T'. 242: * They are provided for a developer/implementer who is familiar 243: * with the details of the method. 244: * 245: * WORK(6) = the entropy of A^t*A :: this is the Shannon entropy 246: * of diag(A^t*A) / Trace(A^t*A) taken as point in the 247: * probability simplex. 248: * WORK(7) = the entropy of A*A^t. 249: * 250: * LWORK (input) INTEGER 251: * Length of WORK to confirm proper allocation of work space. 252: * LWORK depends on the job: 253: * 254: * If only SIGMA is needed ( JOBU.EQ.'N', JOBV.EQ.'N' ) and 255: * -> .. no scaled condition estimate required ( JOBE.EQ.'N'): 256: * LWORK >= max(2*M+N,4*N+1,7). This is the minimal requirement. 257: * For optimal performance (blocked code) the optimal value 258: * is LWORK >= max(2*M+N,3*N+(N+1)*NB,7). Here NB is the optimal 259: * block size for xGEQP3/xGEQRF. 260: * -> .. an estimate of the scaled condition number of A is 261: * required (JOBA='E', 'G'). In this case, LWORK is the maximum 262: * of the above and N*N+4*N, i.e. LWORK >= max(2*M+N,N*N+4N,7). 263: * 264: * If SIGMA and the right singular vectors are needed (JOBV.EQ.'V'), 265: * -> the minimal requirement is LWORK >= max(2*N+M,7). 266: * -> For optimal performance, LWORK >= max(2*N+M,2*N+N*NB,7), 267: * where NB is the optimal block size. 268: * 269: * If SIGMA and the left singular vectors are needed 270: * -> the minimal requirement is LWORK >= max(2*N+M,7). 271: * -> For optimal performance, LWORK >= max(2*N+M,2*N+N*NB,7), 272: * where NB is the optimal block size. 273: * 274: * If full SVD is needed ( JOBU.EQ.'U' or 'F', JOBV.EQ.'V' ) and 275: * -> .. the singular vectors are computed without explicit 276: * accumulation of the Jacobi rotations, LWORK >= 6*N+2*N*N 277: * -> .. in the iterative part, the Jacobi rotations are 278: * explicitly accumulated (option, see the description of JOBV), 279: * then the minimal requirement is LWORK >= max(M+3*N+N*N,7). 280: * For better performance, if NB is the optimal block size, 281: * LWORK >= max(3*N+N*N+M,3*N+N*N+N*NB,7). 282: * 283: * IWORK (workspace/output) INTEGER array, dimension M+3*N. 284: * On exit, 285: * IWORK(1) = the numerical rank determined after the initial 286: * QR factorization with pivoting. See the descriptions 287: * of JOBA and JOBR. 288: * IWORK(2) = the number of the computed nonzero singular values 289: * IWORK(3) = if nonzero, a warning message: 290: * If IWORK(3).EQ.1 then some of the column norms of A 291: * were denormalized floats. The requested high accuracy 292: * is not warranted by the data. 293: * 294: * INFO (output) INTEGER 295: * < 0 : if INFO = -i, then the i-th argument had an illegal value. 296: * = 0 : successfull exit; 297: * > 0 : DGEJSV did not converge in the maximal allowed number 298: * of sweeps. The computed values may be inaccurate. 299: * 300: * Further Details 301: * =============== 302: * 303: * DGEJSV implements a preconditioned Jacobi SVD algorithm. It uses SGEQP3, 304: * SGEQRF, and SGELQF as preprocessors and preconditioners. Optionally, an 305: * additional row pivoting can be used as a preprocessor, which in some 306: * cases results in much higher accuracy. An example is matrix A with the 307: * structure A = D1 * C * D2, where D1, D2 are arbitrarily ill-conditioned 308: * diagonal matrices and C is well-conditioned matrix. In that case, complete 309: * pivoting in the first QR factorizations provides accuracy dependent on the 310: * condition number of C, and independent of D1, D2. Such higher accuracy is 311: * not completely understood theoretically, but it works well in practice. 312: * Further, if A can be written as A = B*D, with well-conditioned B and some 313: * diagonal D, then the high accuracy is guaranteed, both theoretically and 314: * in software, independent of D. For more details see [1], [2]. 315: * The computational range for the singular values can be the full range 316: * ( UNDERFLOW,OVERFLOW ), provided that the machine arithmetic and the BLAS 317: * & LAPACK routines called by DGEJSV are implemented to work in that range. 318: * If that is not the case, then the restriction for safe computation with 319: * the singular values in the range of normalized IEEE numbers is that the 320: * spectral condition number kappa(A)=sigma_max(A)/sigma_min(A) does not 321: * overflow. This code (DGEJSV) is best used in this restricted range, 322: * meaning that singular values of magnitude below ||A||_2 / SLAMCH('O') are 323: * returned as zeros. See JOBR for details on this. 324: * Further, this implementation is somewhat slower than the one described 325: * in [1,2] due to replacement of some non-LAPACK components, and because 326: * the choice of some tuning parameters in the iterative part (DGESVJ) is 327: * left to the implementer on a particular machine. 328: * The rank revealing QR factorization (in this code: SGEQP3) should be 329: * implemented as in [3]. We have a new version of SGEQP3 under development 330: * that is more robust than the current one in LAPACK, with a cleaner cut in 331: * rank defficient cases. It will be available in the SIGMA library [4]. 332: * If M is much larger than N, it is obvious that the inital QRF with 333: * column pivoting can be preprocessed by the QRF without pivoting. That 334: * well known trick is not used in DGEJSV because in some cases heavy row 335: * weighting can be treated with complete pivoting. The overhead in cases 336: * M much larger than N is then only due to pivoting, but the benefits in 337: * terms of accuracy have prevailed. The implementer/user can incorporate 338: * this extra QRF step easily. The implementer can also improve data movement 339: * (matrix transpose, matrix copy, matrix transposed copy) - this 340: * implementation of DGEJSV uses only the simplest, naive data movement. 341: * 342: * Contributors 343: * 344: * Zlatko Drmac (Zagreb, Croatia) and Kresimir Veselic (Hagen, Germany) 345: * 346: * References 347: * 348: * [1] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I. 349: * SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342. 350: * LAPACK Working note 169. 351: * [2] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II. 352: * SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362. 353: * LAPACK Working note 170. 354: * [3] Z. Drmac and Z. Bujanovic: On the failure of rank-revealing QR 355: * factorization software - a case study. 356: * ACM Trans. Math. Softw. Vol. 35, No 2 (2008), pp. 1-28. 357: * LAPACK Working note 176. 358: * [4] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV, 359: * QSVD, (H,K)-SVD computations. 360: * Department of Mathematics, University of Zagreb, 2008. 361: * 362: * Bugs, examples and comments 363: * 364: * Please report all bugs and send interesting examples and/or comments to 365: * drmac@math.hr. Thank you. 366: * 367: * ========================================================================== 368: * 369: * .. Local Parameters .. 370: DOUBLE PRECISION ZERO, ONE 371: PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 ) 372: * .. 373: * .. Local Scalars .. 374: DOUBLE PRECISION AAPP, AAQQ, AATMAX, AATMIN, BIG, BIG1, COND_OK, 375: & CONDR1, CONDR2, ENTRA, ENTRAT, EPSLN, MAXPRJ, SCALEM, 376: & SCONDA, SFMIN, SMALL, TEMP1, USCAL1, USCAL2, XSC 377: INTEGER IERR, N1, NR, NUMRANK, p, q, WARNING 378: LOGICAL ALMORT, DEFR, ERREST, GOSCAL, JRACC, KILL, LSVEC, 379: & L2ABER, L2KILL, L2PERT, L2RANK, L2TRAN, 380: & NOSCAL, ROWPIV, RSVEC, TRANSP 381: * .. 382: * .. Intrinsic Functions .. 383: INTRINSIC DABS, DLOG, DMAX1, DMIN1, DBLE, 384: & MAX0, MIN0, IDNINT, DSIGN, DSQRT 385: * .. 386: * .. External Functions .. 387: DOUBLE PRECISION DLAMCH, DNRM2 388: INTEGER IDAMAX 389: LOGICAL LSAME 390: EXTERNAL IDAMAX, LSAME, DLAMCH, DNRM2 391: * .. 392: * .. External Subroutines .. 393: EXTERNAL DCOPY, DGELQF, DGEQP3, DGEQRF, DLACPY, DLASCL, 394: & DLASET, DLASSQ, DLASWP, DORGQR, DORMLQ, 395: & DORMQR, DPOCON, DSCAL, DSWAP, DTRSM, XERBLA 396: * 397: EXTERNAL DGESVJ 398: * .. 399: * 400: * Test the input arguments 401: * 402: LSVEC = LSAME( JOBU, 'U' ) .OR. LSAME( JOBU, 'F' ) 403: JRACC = LSAME( JOBV, 'J' ) 404: RSVEC = LSAME( JOBV, 'V' ) .OR. JRACC 405: ROWPIV = LSAME( JOBA, 'F' ) .OR. LSAME( JOBA, 'G' ) 406: L2RANK = LSAME( JOBA, 'R' ) 407: L2ABER = LSAME( JOBA, 'A' ) 408: ERREST = LSAME( JOBA, 'E' ) .OR. LSAME( JOBA, 'G' ) 409: L2TRAN = LSAME( JOBT, 'T' ) 410: L2KILL = LSAME( JOBR, 'R' ) 411: DEFR = LSAME( JOBR, 'N' ) 412: L2PERT = LSAME( JOBP, 'P' ) 413: * 414: IF ( .NOT.(ROWPIV .OR. L2RANK .OR. L2ABER .OR. 415: & ERREST .OR. LSAME( JOBA, 'C' ) )) THEN 416: INFO = - 1 417: ELSE IF ( .NOT.( LSVEC .OR. LSAME( JOBU, 'N' ) .OR. 418: & LSAME( JOBU, 'W' )) ) THEN 419: INFO = - 2 420: ELSE IF ( .NOT.( RSVEC .OR. LSAME( JOBV, 'N' ) .OR. 421: & LSAME( JOBV, 'W' )) .OR. ( JRACC .AND. (.NOT.LSVEC) ) ) THEN 422: INFO = - 3 423: ELSE IF ( .NOT. ( L2KILL .OR. DEFR ) ) THEN 424: INFO = - 4 425: ELSE IF ( .NOT. ( L2TRAN .OR. LSAME( JOBT, 'N' ) ) ) THEN 426: INFO = - 5 427: ELSE IF ( .NOT. ( L2PERT .OR. LSAME( JOBP, 'N' ) ) ) THEN 428: INFO = - 6 429: ELSE IF ( M .LT. 0 ) THEN 430: INFO = - 7 431: ELSE IF ( ( N .LT. 0 ) .OR. ( N .GT. M ) ) THEN 432: INFO = - 8 433: ELSE IF ( LDA .LT. M ) THEN 434: INFO = - 10 435: ELSE IF ( LSVEC .AND. ( LDU .LT. M ) ) THEN 436: INFO = - 13 437: ELSE IF ( RSVEC .AND. ( LDV .LT. N ) ) THEN 438: INFO = - 14 439: ELSE IF ( (.NOT.(LSVEC .OR. RSVEC .OR. ERREST).AND. 440: & (LWORK .LT. MAX0(7,4*N+1,2*M+N))) .OR. 441: & (.NOT.(LSVEC .OR. LSVEC) .AND. ERREST .AND. 442: & (LWORK .LT. MAX0(7,4*N+N*N,2*M+N))) .OR. 443: & (LSVEC .AND. (.NOT.RSVEC) .AND. (LWORK .LT. MAX0(7,2*N+M))) .OR. 444: & (RSVEC .AND. (.NOT.LSVEC) .AND. (LWORK .LT. MAX0(7,2*N+M))) .OR. 445: & (LSVEC .AND. RSVEC .AND. .NOT.JRACC .AND. (LWORK.LT.6*N+2*N*N)) 446: & .OR. (LSVEC.AND.RSVEC.AND.JRACC.AND.LWORK.LT.MAX0(7,M+3*N+N*N))) 447: & THEN 448: INFO = - 17 449: ELSE 450: * #:) 451: INFO = 0 452: END IF 453: * 454: IF ( INFO .NE. 0 ) THEN 455: * #:( 456: CALL XERBLA( 'DGEJSV', - INFO ) 457: END IF 458: * 459: * Quick return for void matrix (Y3K safe) 460: * #:) 461: IF ( ( M .EQ. 0 ) .OR. ( N .EQ. 0 ) ) RETURN 462: * 463: * Determine whether the matrix U should be M x N or M x M 464: * 465: IF ( LSVEC ) THEN 466: N1 = N 467: IF ( LSAME( JOBU, 'F' ) ) N1 = M 468: END IF 469: * 470: * Set numerical parameters 471: * 472: *! NOTE: Make sure DLAMCH() does not fail on the target architecture. 473: * 474: 475: EPSLN = DLAMCH('Epsilon') 476: SFMIN = DLAMCH('SafeMinimum') 477: SMALL = SFMIN / EPSLN 478: BIG = DLAMCH('O') 479: * BIG = ONE / SFMIN 480: * 481: * Initialize SVA(1:N) = diag( ||A e_i||_2 )_1^N 482: * 483: *(!) If necessary, scale SVA() to protect the largest norm from 484: * overflow. It is possible that this scaling pushes the smallest 485: * column norm left from the underflow threshold (extreme case). 486: * 487: SCALEM = ONE / DSQRT(DBLE(M)*DBLE(N)) 488: NOSCAL = .TRUE. 489: GOSCAL = .TRUE. 490: DO 1874 p = 1, N 491: AAPP = ZERO 492: AAQQ = ONE 493: CALL DLASSQ( M, A(1,p), 1, AAPP, AAQQ ) 494: IF ( AAPP .GT. BIG ) THEN 495: INFO = - 9 496: CALL XERBLA( 'DGEJSV', -INFO ) 497: RETURN 498: END IF 499: AAQQ = DSQRT(AAQQ) 500: IF ( ( AAPP .LT. (BIG / AAQQ) ) .AND. NOSCAL ) THEN 501: SVA(p) = AAPP * AAQQ 502: ELSE 503: NOSCAL = .FALSE. 504: SVA(p) = AAPP * ( AAQQ * SCALEM ) 505: IF ( GOSCAL ) THEN 506: GOSCAL = .FALSE. 507: CALL DSCAL( p-1, SCALEM, SVA, 1 ) 508: END IF 509: END IF 510: 1874 CONTINUE 511: * 512: IF ( NOSCAL ) SCALEM = ONE 513: * 514: AAPP = ZERO 515: AAQQ = BIG 516: DO 4781 p = 1, N 517: AAPP = DMAX1( AAPP, SVA(p) ) 518: IF ( SVA(p) .NE. ZERO ) AAQQ = DMIN1( AAQQ, SVA(p) ) 519: 4781 CONTINUE 520: * 521: * Quick return for zero M x N matrix 522: * #:) 523: IF ( AAPP .EQ. ZERO ) THEN 524: IF ( LSVEC ) CALL DLASET( 'G', M, N1, ZERO, ONE, U, LDU ) 525: IF ( RSVEC ) CALL DLASET( 'G', N, N, ZERO, ONE, V, LDV ) 526: WORK(1) = ONE 527: WORK(2) = ONE 528: IF ( ERREST ) WORK(3) = ONE 529: IF ( LSVEC .AND. RSVEC ) THEN 530: WORK(4) = ONE 531: WORK(5) = ONE 532: END IF 533: IF ( L2TRAN ) THEN 534: WORK(6) = ZERO 535: WORK(7) = ZERO 536: END IF 537: IWORK(1) = 0 538: IWORK(2) = 0 539: RETURN 540: END IF 541: * 542: * Issue warning if denormalized column norms detected. Override the 543: * high relative accuracy request. Issue licence to kill columns 544: * (set them to zero) whose norm is less than sigma_max / BIG (roughly). 545: * #:( 546: WARNING = 0 547: IF ( AAQQ .LE. SFMIN ) THEN 548: L2RANK = .TRUE. 549: L2KILL = .TRUE. 550: WARNING = 1 551: END IF 552: * 553: * Quick return for one-column matrix 554: * #:) 555: IF ( N .EQ. 1 ) THEN 556: * 557: IF ( LSVEC ) THEN 558: CALL DLASCL( 'G',0,0,SVA(1),SCALEM, M,1,A(1,1),LDA,IERR ) 559: CALL DLACPY( 'A', M, 1, A, LDA, U, LDU ) 560: * computing all M left singular vectors of the M x 1 matrix 561: IF ( N1 .NE. N ) THEN 562: CALL DGEQRF( M, N, U,LDU, WORK, WORK(N+1),LWORK-N,IERR ) 563: CALL DORGQR( M,N1,1, U,LDU,WORK,WORK(N+1),LWORK-N,IERR ) 564: CALL DCOPY( M, A(1,1), 1, U(1,1), 1 ) 565: END IF 566: END IF 567: IF ( RSVEC ) THEN 568: V(1,1) = ONE 569: END IF 570: IF ( SVA(1) .LT. (BIG*SCALEM) ) THEN 571: SVA(1) = SVA(1) / SCALEM 572: SCALEM = ONE 573: END IF 574: WORK(1) = ONE / SCALEM 575: WORK(2) = ONE 576: IF ( SVA(1) .NE. ZERO ) THEN 577: IWORK(1) = 1 578: IF ( ( SVA(1) / SCALEM) .GE. SFMIN ) THEN 579: IWORK(2) = 1 580: ELSE 581: IWORK(2) = 0 582: END IF 583: ELSE 584: IWORK(1) = 0 585: IWORK(2) = 0 586: END IF 587: IF ( ERREST ) WORK(3) = ONE 588: IF ( LSVEC .AND. RSVEC ) THEN 589: WORK(4) = ONE 590: WORK(5) = ONE 591: END IF 592: IF ( L2TRAN ) THEN 593: WORK(6) = ZERO 594: WORK(7) = ZERO 595: END IF 596: RETURN 597: * 598: END IF 599: * 600: TRANSP = .FALSE. 601: L2TRAN = L2TRAN .AND. ( M .EQ. N ) 602: * 603: AATMAX = -ONE 604: AATMIN = BIG 605: IF ( ROWPIV .OR. L2TRAN ) THEN 606: * 607: * Compute the row norms, needed to determine row pivoting sequence 608: * (in the case of heavily row weighted A, row pivoting is strongly 609: * advised) and to collect information needed to compare the 610: * structures of A * A^t and A^t * A (in the case L2TRAN.EQ..TRUE.). 611: * 612: IF ( L2TRAN ) THEN 613: DO 1950 p = 1, M 614: XSC = ZERO 615: TEMP1 = ONE 616: CALL DLASSQ( N, A(p,1), LDA, XSC, TEMP1 ) 617: * DLASSQ gets both the ell_2 and the ell_infinity norm 618: * in one pass through the vector 619: WORK(M+N+p) = XSC * SCALEM 620: WORK(N+p) = XSC * (SCALEM*DSQRT(TEMP1)) 621: AATMAX = DMAX1( AATMAX, WORK(N+p) ) 622: IF (WORK(N+p) .NE. ZERO) AATMIN = DMIN1(AATMIN,WORK(N+p)) 623: 1950 CONTINUE 624: ELSE 625: DO 1904 p = 1, M 626: WORK(M+N+p) = SCALEM*DABS( A(p,IDAMAX(N,A(p,1),LDA)) ) 627: AATMAX = DMAX1( AATMAX, WORK(M+N+p) ) 628: AATMIN = DMIN1( AATMIN, WORK(M+N+p) ) 629: 1904 CONTINUE 630: END IF 631: * 632: END IF 633: * 634: * For square matrix A try to determine whether A^t would be better 635: * input for the preconditioned Jacobi SVD, with faster convergence. 636: * The decision is based on an O(N) function of the vector of column 637: * and row norms of A, based on the Shannon entropy. This should give 638: * the right choice in most cases when the difference actually matters. 639: * It may fail and pick the slower converging side. 640: * 641: ENTRA = ZERO 642: ENTRAT = ZERO 643: IF ( L2TRAN ) THEN 644: * 645: XSC = ZERO 646: TEMP1 = ONE 647: CALL DLASSQ( N, SVA, 1, XSC, TEMP1 ) 648: TEMP1 = ONE / TEMP1 649: * 650: ENTRA = ZERO 651: DO 1113 p = 1, N 652: BIG1 = ( ( SVA(p) / XSC )**2 ) * TEMP1 653: IF ( BIG1 .NE. ZERO ) ENTRA = ENTRA + BIG1 * DLOG(BIG1) 654: 1113 CONTINUE 655: ENTRA = - ENTRA / DLOG(DBLE(N)) 656: * 657: * Now, SVA().^2/Trace(A^t * A) is a point in the probability simplex. 658: * It is derived from the diagonal of A^t * A. Do the same with the 659: * diagonal of A * A^t, compute the entropy of the corresponding 660: * probability distribution. Note that A * A^t and A^t * A have the 661: * same trace. 662: * 663: ENTRAT = ZERO 664: DO 1114 p = N+1, N+M 665: BIG1 = ( ( WORK(p) / XSC )**2 ) * TEMP1 666: IF ( BIG1 .NE. ZERO ) ENTRAT = ENTRAT + BIG1 * DLOG(BIG1) 667: 1114 CONTINUE 668: ENTRAT = - ENTRAT / DLOG(DBLE(M)) 669: * 670: * Analyze the entropies and decide A or A^t. Smaller entropy 671: * usually means better input for the algorithm. 672: * 673: TRANSP = ( ENTRAT .LT. ENTRA ) 674: * 675: * If A^t is better than A, transpose A. 676: * 677: IF ( TRANSP ) THEN 678: * In an optimal implementation, this trivial transpose 679: * should be replaced with faster transpose. 680: DO 1115 p = 1, N - 1 681: DO 1116 q = p + 1, N 682: TEMP1 = A(q,p) 683: A(q,p) = A(p,q) 684: A(p,q) = TEMP1 685: 1116 CONTINUE 686: 1115 CONTINUE 687: DO 1117 p = 1, N 688: WORK(M+N+p) = SVA(p) 689: SVA(p) = WORK(N+p) 690: 1117 CONTINUE 691: TEMP1 = AAPP 692: AAPP = AATMAX 693: AATMAX = TEMP1 694: TEMP1 = AAQQ 695: AAQQ = AATMIN 696: AATMIN = TEMP1 697: KILL = LSVEC 698: LSVEC = RSVEC 699: RSVEC = KILL 700: IF ( LSVEC ) N1 = N 701: * 702: ROWPIV = .TRUE. 703: END IF 704: * 705: END IF 706: * END IF L2TRAN 707: * 708: * Scale the matrix so that its maximal singular value remains less 709: * than DSQRT(BIG) -- the matrix is scaled so that its maximal column 710: * has Euclidean norm equal to DSQRT(BIG/N). The only reason to keep 711: * DSQRT(BIG) instead of BIG is the fact that DGEJSV uses LAPACK and 712: * BLAS routines that, in some implementations, are not capable of 713: * working in the full interval [SFMIN,BIG] and that they may provoke 714: * overflows in the intermediate results. If the singular values spread 715: * from SFMIN to BIG, then DGESVJ will compute them. So, in that case, 716: * one should use DGESVJ instead of DGEJSV. 717: * 718: BIG1 = DSQRT( BIG ) 719: TEMP1 = DSQRT( BIG / DBLE(N) ) 720: * 721: CALL DLASCL( 'G', 0, 0, AAPP, TEMP1, N, 1, SVA, N, IERR ) 722: IF ( AAQQ .GT. (AAPP * SFMIN) ) THEN 723: AAQQ = ( AAQQ / AAPP ) * TEMP1 724: ELSE 725: AAQQ = ( AAQQ * TEMP1 ) / AAPP 726: END IF 727: TEMP1 = TEMP1 * SCALEM 728: CALL DLASCL( 'G', 0, 0, AAPP, TEMP1, M, N, A, LDA, IERR ) 729: * 730: * To undo scaling at the end of this procedure, multiply the 731: * computed singular values with USCAL2 / USCAL1. 732: * 733: USCAL1 = TEMP1 734: USCAL2 = AAPP 735: * 736: IF ( L2KILL ) THEN 737: * L2KILL enforces computation of nonzero singular values in 738: * the restricted range of condition number of the initial A, 739: * sigma_max(A) / sigma_min(A) approx. DSQRT(BIG)/DSQRT(SFMIN). 740: XSC = DSQRT( SFMIN ) 741: ELSE 742: XSC = SMALL 743: * 744: * Now, if the condition number of A is too big, 745: * sigma_max(A) / sigma_min(A) .GT. DSQRT(BIG/N) * EPSLN / SFMIN, 746: * as a precaution measure, the full SVD is computed using DGESVJ 747: * with accumulated Jacobi rotations. This provides numerically 748: * more robust computation, at the cost of slightly increased run 749: * time. Depending on the concrete implementation of BLAS and LAPACK 750: * (i.e. how they behave in presence of extreme ill-conditioning) the 751: * implementor may decide to remove this switch. 752: IF ( ( AAQQ.LT.DSQRT(SFMIN) ) .AND. LSVEC .AND. RSVEC ) THEN 753: JRACC = .TRUE. 754: END IF 755: * 756: END IF 757: IF ( AAQQ .LT. XSC ) THEN 758: DO 700 p = 1, N 759: IF ( SVA(p) .LT. XSC ) THEN 760: CALL DLASET( 'A', M, 1, ZERO, ZERO, A(1,p), LDA ) 761: SVA(p) = ZERO 762: END IF 763: 700 CONTINUE 764: END IF 765: * 766: * Preconditioning using QR factorization with pivoting 767: * 768: IF ( ROWPIV ) THEN 769: * Optional row permutation (Bjoerck row pivoting): 770: * A result by Cox and Higham shows that the Bjoerck's 771: * row pivoting combined with standard column pivoting 772: * has similar effect as Powell-Reid complete pivoting. 773: * The ell-infinity norms of A are made nonincreasing. 774: DO 1952 p = 1, M - 1 775: q = IDAMAX( M-p+1, WORK(M+N+p), 1 ) + p - 1 776: IWORK(2*N+p) = q 777: IF ( p .NE. q ) THEN 778: TEMP1 = WORK(M+N+p) 779: WORK(M+N+p) = WORK(M+N+q) 780: WORK(M+N+q) = TEMP1 781: END IF 782: 1952 CONTINUE 783: CALL DLASWP( N, A, LDA, 1, M-1, IWORK(2*N+1), 1 ) 784: END IF 785: * 786: * End of the preparation phase (scaling, optional sorting and 787: * transposing, optional flushing of small columns). 788: * 789: * Preconditioning 790: * 791: * If the full SVD is needed, the right singular vectors are computed 792: * from a matrix equation, and for that we need theoretical analysis 793: * of the Businger-Golub pivoting. So we use DGEQP3 as the first RR QRF. 794: * In all other cases the first RR QRF can be chosen by other criteria 795: * (eg speed by replacing global with restricted window pivoting, such 796: * as in SGEQPX from TOMS # 782). Good results will be obtained using 797: * SGEQPX with properly (!) chosen numerical parameters. 798: * Any improvement of DGEQP3 improves overal performance of DGEJSV. 799: * 800: * A * P1 = Q1 * [ R1^t 0]^t: 801: DO 1963 p = 1, N 802: * .. all columns are free columns 803: IWORK(p) = 0 804: 1963 CONTINUE 805: CALL DGEQP3( M,N,A,LDA, IWORK,WORK, WORK(N+1),LWORK-N, IERR ) 806: * 807: * The upper triangular matrix R1 from the first QRF is inspected for 808: * rank deficiency and possibilities for deflation, or possible 809: * ill-conditioning. Depending on the user specified flag L2RANK, 810: * the procedure explores possibilities to reduce the numerical 811: * rank by inspecting the computed upper triangular factor. If 812: * L2RANK or L2ABER are up, then DGEJSV will compute the SVD of 813: * A + dA, where ||dA|| <= f(M,N)*EPSLN. 814: * 815: NR = 1 816: IF ( L2ABER ) THEN 817: * Standard absolute error bound suffices. All sigma_i with 818: * sigma_i < N*EPSLN*||A|| are flushed to zero. This is an 819: * agressive enforcement of lower numerical rank by introducing a 820: * backward error of the order of N*EPSLN*||A||. 821: TEMP1 = DSQRT(DBLE(N))*EPSLN 822: DO 3001 p = 2, N 823: IF ( DABS(A(p,p)) .GE. (TEMP1*DABS(A(1,1))) ) THEN 824: NR = NR + 1 825: ELSE 826: GO TO 3002 827: END IF 828: 3001 CONTINUE 829: 3002 CONTINUE 830: ELSE IF ( L2RANK ) THEN 831: * .. similarly as above, only slightly more gentle (less agressive). 832: * Sudden drop on the diagonal of R1 is used as the criterion for 833: * close-to-rank-defficient. 834: TEMP1 = DSQRT(SFMIN) 835: DO 3401 p = 2, N 836: IF ( ( DABS(A(p,p)) .LT. (EPSLN*DABS(A(p-1,p-1))) ) .OR. 837: & ( DABS(A(p,p)) .LT. SMALL ) .OR. 838: & ( L2KILL .AND. (DABS(A(p,p)) .LT. TEMP1) ) ) GO TO 3402 839: NR = NR + 1 840: 3401 CONTINUE 841: 3402 CONTINUE 842: * 843: ELSE 844: * The goal is high relative accuracy. However, if the matrix 845: * has high scaled condition number the relative accuracy is in 846: * general not feasible. Later on, a condition number estimator 847: * will be deployed to estimate the scaled condition number. 848: * Here we just remove the underflowed part of the triangular 849: * factor. This prevents the situation in which the code is 850: * working hard to get the accuracy not warranted by the data. 851: TEMP1 = DSQRT(SFMIN) 852: DO 3301 p = 2, N 853: IF ( ( DABS(A(p,p)) .LT. SMALL ) .OR. 854: & ( L2KILL .AND. (DABS(A(p,p)) .LT. TEMP1) ) ) GO TO 3302 855: NR = NR + 1 856: 3301 CONTINUE 857: 3302 CONTINUE 858: * 859: END IF 860: * 861: ALMORT = .FALSE. 862: IF ( NR .EQ. N ) THEN 863: MAXPRJ = ONE 864: DO 3051 p = 2, N 865: TEMP1 = DABS(A(p,p)) / SVA(IWORK(p)) 866: MAXPRJ = DMIN1( MAXPRJ, TEMP1 ) 867: 3051 CONTINUE 868: IF ( MAXPRJ**2 .GE. ONE - DBLE(N)*EPSLN ) ALMORT = .TRUE. 869: END IF 870: * 871: * 872: SCONDA = - ONE 873: CONDR1 = - ONE 874: CONDR2 = - ONE 875: * 876: IF ( ERREST ) THEN 877: IF ( N .EQ. NR ) THEN 878: IF ( RSVEC ) THEN 879: * .. V is available as workspace 880: CALL DLACPY( 'U', N, N, A, LDA, V, LDV ) 881: DO 3053 p = 1, N 882: TEMP1 = SVA(IWORK(p)) 883: CALL DSCAL( p, ONE/TEMP1, V(1,p), 1 ) 884: 3053 CONTINUE 885: CALL DPOCON( 'U', N, V, LDV, ONE, TEMP1, 886: & WORK(N+1), IWORK(2*N+M+1), IERR ) 887: ELSE IF ( LSVEC ) THEN 888: * .. U is available as workspace 889: CALL DLACPY( 'U', N, N, A, LDA, U, LDU ) 890: DO 3054 p = 1, N 891: TEMP1 = SVA(IWORK(p)) 892: CALL DSCAL( p, ONE/TEMP1, U(1,p), 1 ) 893: 3054 CONTINUE 894: CALL DPOCON( 'U', N, U, LDU, ONE, TEMP1, 895: & WORK(N+1), IWORK(2*N+M+1), IERR ) 896: ELSE 897: CALL DLACPY( 'U', N, N, A, LDA, WORK(N+1), N ) 898: DO 3052 p = 1, N 899: TEMP1 = SVA(IWORK(p)) 900: CALL DSCAL( p, ONE/TEMP1, WORK(N+(p-1)*N+1), 1 ) 901: 3052 CONTINUE 902: * .. the columns of R are scaled to have unit Euclidean lengths. 903: CALL DPOCON( 'U', N, WORK(N+1), N, ONE, TEMP1, 904: & WORK(N+N*N+1), IWORK(2*N+M+1), IERR ) 905: END IF 906: SCONDA = ONE / DSQRT(TEMP1) 907: * SCONDA is an estimate of DSQRT(||(R^t * R)^(-1)||_1). 908: * N^(-1/4) * SCONDA <= ||R^(-1)||_2 <= N^(1/4) * SCONDA 909: ELSE 910: SCONDA = - ONE 911: END IF 912: END IF 913: * 914: L2PERT = L2PERT .AND. ( DABS( A(1,1)/A(NR,NR) ) .GT. DSQRT(BIG1) ) 915: * If there is no violent scaling, artificial perturbation is not needed. 916: * 917: * Phase 3: 918: * 919: 920: IF ( .NOT. ( RSVEC .OR. LSVEC ) ) THEN 921: * 922: * Singular Values only 923: * 924: * .. transpose A(1:NR,1:N) 925: DO 1946 p = 1, MIN0( N-1, NR ) 926: CALL DCOPY( N-p, A(p,p+1), LDA, A(p+1,p), 1 ) 927: 1946 CONTINUE 928: * 929: * The following two DO-loops introduce small relative perturbation 930: * into the strict upper triangle of the lower triangular matrix. 931: * Small entries below the main diagonal are also changed. 932: * This modification is useful if the computing environment does not 933: * provide/allow FLUSH TO ZERO underflow, for it prevents many 934: * annoying denormalized numbers in case of strongly scaled matrices. 935: * The perturbation is structured so that it does not introduce any 936: * new perturbation of the singular values, and it does not destroy 937: * the job done by the preconditioner. 938: * The licence for this perturbation is in the variable L2PERT, which 939: * should be .FALSE. if FLUSH TO ZERO underflow is active. 940: * 941: IF ( .NOT. ALMORT ) THEN 942: * 943: IF ( L2PERT ) THEN 944: * XSC = DSQRT(SMALL) 945: XSC = EPSLN / DBLE(N) 946: DO 4947 q = 1, NR 947: TEMP1 = XSC*DABS(A(q,q)) 948: DO 4949 p = 1, N 949: IF ( ( (p.GT.q) .AND. (DABS(A(p,q)).LE.TEMP1) ) 950: & .OR. ( p .LT. q ) ) 951: & A(p,q) = DSIGN( TEMP1, A(p,q) ) 952: 4949 CONTINUE 953: 4947 CONTINUE 954: ELSE 955: CALL DLASET( 'U', NR-1,NR-1, ZERO,ZERO, A(1,2),LDA ) 956: END IF 957: * 958: * .. second preconditioning using the QR factorization 959: * 960: CALL DGEQRF( N,NR, A,LDA, WORK, WORK(N+1),LWORK-N, IERR ) 961: * 962: * .. and transpose upper to lower triangular 963: DO 1948 p = 1, NR - 1 964: CALL DCOPY( NR-p, A(p,p+1), LDA, A(p+1,p), 1 ) 965: 1948 CONTINUE 966: * 967: END IF 968: * 969: * Row-cyclic Jacobi SVD algorithm with column pivoting 970: * 971: * .. again some perturbation (a "background noise") is added 972: * to drown denormals 973: IF ( L2PERT ) THEN 974: * XSC = DSQRT(SMALL) 975: XSC = EPSLN / DBLE(N) 976: DO 1947 q = 1, NR 977: TEMP1 = XSC*DABS(A(q,q)) 978: DO 1949 p = 1, NR 979: IF ( ( (p.GT.q) .AND. (DABS(A(p,q)).LE.TEMP1) ) 980: & .OR. ( p .LT. q ) ) 981: & A(p,q) = DSIGN( TEMP1, A(p,q) ) 982: 1949 CONTINUE 983: 1947 CONTINUE 984: ELSE 985: CALL DLASET( 'U', NR-1, NR-1, ZERO, ZERO, A(1,2), LDA ) 986: END IF 987: * 988: * .. and one-sided Jacobi rotations are started on a lower 989: * triangular matrix (plus perturbation which is ignored in 990: * the part which destroys triangular form (confusing?!)) 991: * 992: CALL DGESVJ( 'L', 'NoU', 'NoV', NR, NR, A, LDA, SVA, 993: & N, V, LDV, WORK, LWORK, INFO ) 994: * 995: SCALEM = WORK(1) 996: NUMRANK = IDNINT(WORK(2)) 997: * 998: * 999: ELSE IF ( RSVEC .AND. ( .NOT. LSVEC ) ) THEN 1000: * 1001: * -> Singular Values and Right Singular Vectors <- 1002: * 1003: IF ( ALMORT ) THEN 1004: * 1005: * .. in this case NR equals N 1006: DO 1998 p = 1, NR 1007: CALL DCOPY( N-p+1, A(p,p), LDA, V(p,p), 1 ) 1008: 1998 CONTINUE 1009: CALL DLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV ) 1010: * 1011: CALL DGESVJ( 'L','U','N', N, NR, V,LDV, SVA, NR, A,LDA, 1012: & WORK, LWORK, INFO ) 1013: SCALEM = WORK(1) 1014: NUMRANK = IDNINT(WORK(2)) 1015: 1016: ELSE 1017: * 1018: * .. two more QR factorizations ( one QRF is not enough, two require 1019: * accumulated product of Jacobi rotations, three are perfect ) 1020: * 1021: CALL DLASET( 'Lower', NR-1, NR-1, ZERO, ZERO, A(2,1), LDA ) 1022: CALL DGELQF( NR, N, A, LDA, WORK, WORK(N+1), LWORK-N, IERR) 1023: CALL DLACPY( 'Lower', NR, NR, A, LDA, V, LDV ) 1024: CALL DLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV ) 1025: CALL DGEQRF( NR, NR, V, LDV, WORK(N+1), WORK(2*N+1), 1026: & LWORK-2*N, IERR ) 1027: DO 8998 p = 1, NR 1028: CALL DCOPY( NR-p+1, V(p,p), LDV, V(p,p), 1 ) 1029: 8998 CONTINUE 1030: CALL DLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV ) 1031: * 1032: CALL DGESVJ( 'Lower', 'U','N', NR, NR, V,LDV, SVA, NR, U, 1033: & LDU, WORK(N+1), LWORK, INFO ) 1034: SCALEM = WORK(N+1) 1035: NUMRANK = IDNINT(WORK(N+2)) 1036: IF ( NR .LT. N ) THEN 1037: CALL DLASET( 'A',N-NR, NR, ZERO,ZERO, V(NR+1,1), LDV ) 1038: CALL DLASET( 'A',NR, N-NR, ZERO,ZERO, V(1,NR+1), LDV ) 1039: CALL DLASET( 'A',N-NR,N-NR,ZERO,ONE, V(NR+1,NR+1), LDV ) 1040: END IF 1041: * 1042: CALL DORMLQ( 'Left', 'Transpose', N, N, NR, A, LDA, WORK, 1043: & V, LDV, WORK(N+1), LWORK-N, IERR ) 1044: * 1045: END IF 1046: * 1047: DO 8991 p = 1, N 1048: CALL DCOPY( N, V(p,1), LDV, A(IWORK(p),1), LDA ) 1049: 8991 CONTINUE 1050: CALL DLACPY( 'All', N, N, A, LDA, V, LDV ) 1051: * 1052: IF ( TRANSP ) THEN 1053: CALL DLACPY( 'All', N, N, V, LDV, U, LDU ) 1054: END IF 1055: * 1056: ELSE IF ( LSVEC .AND. ( .NOT. RSVEC ) ) THEN 1057: * 1058: * .. Singular Values and Left Singular Vectors .. 1059: * 1060: * .. second preconditioning step to avoid need to accumulate 1061: * Jacobi rotations in the Jacobi iterations. 1062: DO 1965 p = 1, NR 1063: CALL DCOPY( N-p+1, A(p,p), LDA, U(p,p), 1 ) 1064: 1965 CONTINUE 1065: CALL DLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, U(1,2), LDU ) 1066: * 1067: CALL DGEQRF( N, NR, U, LDU, WORK(N+1), WORK(2*N+1), 1068: & LWORK-2*N, IERR ) 1069: * 1070: DO 1967 p = 1, NR - 1 1071: CALL DCOPY( NR-p, U(p,p+1), LDU, U(p+1,p), 1 ) 1072: 1967 CONTINUE 1073: CALL DLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, U(1,2), LDU ) 1074: * 1075: CALL DGESVJ( 'Lower', 'U', 'N', NR,NR, U, LDU, SVA, NR, A, 1076: & LDA, WORK(N+1), LWORK-N, INFO ) 1077: SCALEM = WORK(N+1) 1078: NUMRANK = IDNINT(WORK(N+2)) 1079: * 1080: IF ( NR .LT. M ) THEN 1081: CALL DLASET( 'A', M-NR, NR,ZERO, ZERO, U(NR+1,1), LDU ) 1082: IF ( NR .LT. N1 ) THEN 1083: CALL DLASET( 'A',NR, N1-NR, ZERO, ZERO, U(1,NR+1), LDU ) 1084: CALL DLASET( 'A',M-NR,N1-NR,ZERO,ONE,U(NR+1,NR+1), LDU ) 1085: END IF 1086: END IF 1087: * 1088: CALL DORMQR( 'Left', 'No Tr', M, N1, N, A, LDA, WORK, U, 1089: & LDU, WORK(N+1), LWORK-N, IERR ) 1090: * 1091: IF ( ROWPIV ) 1092: & CALL DLASWP( N1, U, LDU, 1, M-1, IWORK(2*N+1), -1 ) 1093: * 1094: DO 1974 p = 1, N1 1095: XSC = ONE / DNRM2( M, U(1,p), 1 ) 1096: CALL DSCAL( M, XSC, U(1,p), 1 ) 1097: 1974 CONTINUE 1098: * 1099: IF ( TRANSP ) THEN 1100: CALL DLACPY( 'All', N, N, U, LDU, V, LDV ) 1101: END IF 1102: * 1103: ELSE 1104: * 1105: * .. Full SVD .. 1106: * 1107: IF ( .NOT. JRACC ) THEN 1108: * 1109: IF ( .NOT. ALMORT ) THEN 1110: * 1111: * Second Preconditioning Step (QRF [with pivoting]) 1112: * Note that the composition of TRANSPOSE, QRF and TRANSPOSE is 1113: * equivalent to an LQF CALL. Since in many libraries the QRF 1114: * seems to be better optimized than the LQF, we do explicit 1115: * transpose and use the QRF. This is subject to changes in an 1116: * optimized implementation of DGEJSV. 1117: * 1118: DO 1968 p = 1, NR 1119: CALL DCOPY( N-p+1, A(p,p), LDA, V(p,p), 1 ) 1120: 1968 CONTINUE 1121: * 1122: * .. the following two loops perturb small entries to avoid 1123: * denormals in the second QR factorization, where they are 1124: * as good as zeros. This is done to avoid painfully slow 1125: * computation with denormals. The relative size of the perturbation 1126: * is a parameter that can be changed by the implementer. 1127: * This perturbation device will be obsolete on machines with 1128: * properly implemented arithmetic. 1129: * To switch it off, set L2PERT=.FALSE. To remove it from the 1130: * code, remove the action under L2PERT=.TRUE., leave the ELSE part. 1131: * The following two loops should be blocked and fused with the 1132: * transposed copy above. 1133: * 1134: IF ( L2PERT ) THEN 1135: XSC = DSQRT(SMALL) 1136: DO 2969 q = 1, NR 1137: TEMP1 = XSC*DABS( V(q,q) ) 1138: DO 2968 p = 1, N 1139: IF ( ( p .GT. q ) .AND. ( DABS(V(p,q)) .LE. TEMP1 ) 1140: & .OR. ( p .LT. q ) ) 1141: & V(p,q) = DSIGN( TEMP1, V(p,q) ) 1142: IF ( p. LT. q ) V(p,q) = - V(p,q) 1143: 2968 CONTINUE 1144: 2969 CONTINUE 1145: ELSE 1146: CALL DLASET( 'U', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV ) 1147: END IF 1148: * 1149: * Estimate the row scaled condition number of R1 1150: * (If R1 is rectangular, N > NR, then the condition number 1151: * of the leading NR x NR submatrix is estimated.) 1152: * 1153: CALL DLACPY( 'L', NR, NR, V, LDV, WORK(2*N+1), NR ) 1154: DO 3950 p = 1, NR 1155: TEMP1 = DNRM2(NR-p+1,WORK(2*N+(p-1)*NR+p),1) 1156: CALL DSCAL(NR-p+1,ONE/TEMP1,WORK(2*N+(p-1)*NR+p),1) 1157: 3950 CONTINUE 1158: CALL DPOCON('Lower',NR,WORK(2*N+1),NR,ONE,TEMP1, 1159: & WORK(2*N+NR*NR+1),IWORK(M+2*N+1),IERR) 1160: CONDR1 = ONE / DSQRT(TEMP1) 1161: * .. here need a second oppinion on the condition number 1162: * .. then assume worst case scenario 1163: * R1 is OK for inverse <=> CONDR1 .LT. DBLE(N) 1164: * more conservative <=> CONDR1 .LT. DSQRT(DBLE(N)) 1165: * 1166: COND_OK = DSQRT(DBLE(NR)) 1167: *[TP] COND_OK is a tuning parameter. 1168: 1169: IF ( CONDR1 .LT. COND_OK ) THEN 1170: * .. the second QRF without pivoting. Note: in an optimized 1171: * implementation, this QRF should be implemented as the QRF 1172: * of a lower triangular matrix. 1173: * R1^t = Q2 * R2 1174: CALL DGEQRF( N, NR, V, LDV, WORK(N+1), WORK(2*N+1), 1175: & LWORK-2*N, IERR ) 1176: * 1177: IF ( L2PERT ) THEN 1178: XSC = DSQRT(SMALL)/EPSLN 1179: DO 3959 p = 2, NR 1180: DO 3958 q = 1, p - 1 1181: TEMP1 = XSC * DMIN1(DABS(V(p,p)),DABS(V(q,q))) 1182: IF ( DABS(V(q,p)) .LE. TEMP1 ) 1183: & V(q,p) = DSIGN( TEMP1, V(q,p) ) 1184: 3958 CONTINUE 1185: 3959 CONTINUE 1186: END IF 1187: * 1188: IF ( NR .NE. N ) 1189: * .. save ... 1190: & CALL DLACPY( 'A', N, NR, V, LDV, WORK(2*N+1), N ) 1191: * 1192: * .. this transposed copy should be better than naive 1193: DO 1969 p = 1, NR - 1 1194: CALL DCOPY( NR-p, V(p,p+1), LDV, V(p+1,p), 1 ) 1195: 1969 CONTINUE 1196: * 1197: CONDR2 = CONDR1 1198: * 1199: ELSE 1200: * 1201: * .. ill-conditioned case: second QRF with pivoting 1202: * Note that windowed pivoting would be equaly good 1203: * numerically, and more run-time efficient. So, in 1204: * an optimal implementation, the next call to DGEQP3 1205: * should be replaced with eg. CALL SGEQPX (ACM TOMS #782) 1206: * with properly (carefully) chosen parameters. 1207: * 1208: * R1^t * P2 = Q2 * R2 1209: DO 3003 p = 1, NR 1210: IWORK(N+p) = 0 1211: 3003 CONTINUE 1212: CALL DGEQP3( N, NR, V, LDV, IWORK(N+1), WORK(N+1), 1213: & WORK(2*N+1), LWORK-2*N, IERR ) 1214: ** CALL DGEQRF( N, NR, V, LDV, WORK(N+1), WORK(2*N+1), 1215: ** & LWORK-2*N, IERR ) 1216: IF ( L2PERT ) THEN 1217: XSC = DSQRT(SMALL) 1218: DO 3969 p = 2, NR 1219: DO 3968 q = 1, p - 1 1220: TEMP1 = XSC * DMIN1(DABS(V(p,p)),DABS(V(q,q))) 1221: IF ( DABS(V(q,p)) .LE. TEMP1 ) 1222: & V(q,p) = DSIGN( TEMP1, V(q,p) ) 1223: 3968 CONTINUE 1224: 3969 CONTINUE 1225: END IF 1226: * 1227: CALL DLACPY( 'A', N, NR, V, LDV, WORK(2*N+1), N ) 1228: * 1229: IF ( L2PERT ) THEN 1230: XSC = DSQRT(SMALL) 1231: DO 8970 p = 2, NR 1232: DO 8971 q = 1, p - 1 1233: TEMP1 = XSC * DMIN1(DABS(V(p,p)),DABS(V(q,q))) 1234: V(p,q) = - DSIGN( TEMP1, V(q,p) ) 1235: 8971 CONTINUE 1236: 8970 CONTINUE 1237: ELSE 1238: CALL DLASET( 'L',NR-1,NR-1,ZERO,ZERO,V(2,1),LDV ) 1239: END IF 1240: * Now, compute R2 = L3 * Q3, the LQ factorization. 1241: CALL DGELQF( NR, NR, V, LDV, WORK(2*N+N*NR+1), 1242: & WORK(2*N+N*NR+NR+1), LWORK-2*N-N*NR-NR, IERR ) 1243: * .. and estimate the condition number 1244: CALL DLACPY( 'L',NR,NR,V,LDV,WORK(2*N+N*NR+NR+1),NR ) 1245: DO 4950 p = 1, NR 1246: TEMP1 = DNRM2( p, WORK(2*N+N*NR+NR+p), NR ) 1247: CALL DSCAL( p, ONE/TEMP1, WORK(2*N+N*NR+NR+p), NR ) 1248: 4950 CONTINUE 1249: CALL DPOCON( 'L',NR,WORK(2*N+N*NR+NR+1),NR,ONE,TEMP1, 1250: & WORK(2*N+N*NR+NR+NR*NR+1),IWORK(M+2*N+1),IERR ) 1251: CONDR2 = ONE / DSQRT(TEMP1) 1252: * 1253: IF ( CONDR2 .GE. COND_OK ) THEN 1254: * .. save the Householder vectors used for Q3 1255: * (this overwrittes the copy of R2, as it will not be 1256: * needed in this branch, but it does not overwritte the 1257: * Huseholder vectors of Q2.). 1258: CALL DLACPY( 'U', NR, NR, V, LDV, WORK(2*N+1), N ) 1259: * .. and the rest of the information on Q3 is in 1260: * WORK(2*N+N*NR+1:2*N+N*NR+N) 1261: END IF 1262: * 1263: END IF 1264: * 1265: IF ( L2PERT ) THEN 1266: XSC = DSQRT(SMALL) 1267: DO 4968 q = 2, NR 1268: TEMP1 = XSC * V(q,q) 1269: DO 4969 p = 1, q - 1 1270: * V(p,q) = - DSIGN( TEMP1, V(q,p) ) 1271: V(p,q) = - DSIGN( TEMP1, V(p,q) ) 1272: 4969 CONTINUE 1273: 4968 CONTINUE 1274: ELSE 1275: CALL DLASET( 'U', NR-1,NR-1, ZERO,ZERO, V(1,2), LDV ) 1276: END IF 1277: * 1278: * Second preconditioning finished; continue with Jacobi SVD 1279: * The input matrix is lower trinagular. 1280: * 1281: * Recover the right singular vectors as solution of a well 1282: * conditioned triangular matrix equation. 1283: * 1284: IF ( CONDR1 .LT. COND_OK ) THEN 1285: * 1286: CALL DGESVJ( 'L','U','N',NR,NR,V,LDV,SVA,NR,U, 1287: & LDU,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,INFO ) 1288: SCALEM = WORK(2*N+N*NR+NR+1) 1289: NUMRANK = IDNINT(WORK(2*N+N*NR+NR+2)) 1290: DO 3970 p = 1, NR 1291: CALL DCOPY( NR, V(1,p), 1, U(1,p), 1 ) 1292: CALL DSCAL( NR, SVA(p), V(1,p), 1 ) 1293: 3970 CONTINUE 1294: 1295: * .. pick the right matrix equation and solve it 1296: * 1297: IF ( NR. EQ. N ) THEN 1298: * :)) .. best case, R1 is inverted. The solution of this matrix 1299: * equation is Q2*V2 = the product of the Jacobi rotations 1300: * used in DGESVJ, premultiplied with the orthogonal matrix 1301: * from the second QR factorization. 1302: CALL DTRSM( 'L','U','N','N', NR,NR,ONE, A,LDA, V,LDV ) 1303: ELSE 1304: * .. R1 is well conditioned, but non-square. Transpose(R2) 1305: * is inverted to get the product of the Jacobi rotations 1306: * used in DGESVJ. The Q-factor from the second QR 1307: * factorization is then built in explicitly. 1308: CALL DTRSM('L','U','T','N',NR,NR,ONE,WORK(2*N+1), 1309: & N,V,LDV) 1310: IF ( NR .LT. N ) THEN 1311: CALL DLASET('A',N-NR,NR,ZERO,ZERO,V(NR+1,1),LDV) 1312: CALL DLASET('A',NR,N-NR,ZERO,ZERO,V(1,NR+1),LDV) 1313: CALL DLASET('A',N-NR,N-NR,ZERO,ONE,V(NR+1,NR+1),LDV) 1314: END IF 1315: CALL DORMQR('L','N',N,N,NR,WORK(2*N+1),N,WORK(N+1), 1316: & V,LDV,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR) 1317: END IF 1318: * 1319: ELSE IF ( CONDR2 .LT. COND_OK ) THEN 1320: * 1321: * :) .. the input matrix A is very likely a relative of 1322: * the Kahan matrix :) 1323: * The matrix R2 is inverted. The solution of the matrix equation 1324: * is Q3^T*V3 = the product of the Jacobi rotations (appplied to 1325: * the lower triangular L3 from the LQ factorization of 1326: * R2=L3*Q3), pre-multiplied with the transposed Q3. 1327: CALL DGESVJ( 'L', 'U', 'N', NR, NR, V, LDV, SVA, NR, U, 1328: & LDU, WORK(2*N+N*NR+NR+1), LWORK-2*N-N*NR-NR, INFO ) 1329: SCALEM = WORK(2*N+N*NR+NR+1) 1330: NUMRANK = IDNINT(WORK(2*N+N*NR+NR+2)) 1331: DO 3870 p = 1, NR 1332: CALL DCOPY( NR, V(1,p), 1, U(1,p), 1 ) 1333: CALL DSCAL( NR, SVA(p), U(1,p), 1 ) 1334: 3870 CONTINUE 1335: CALL DTRSM('L','U','N','N',NR,NR,ONE,WORK(2*N+1),N,U,LDU) 1336: * .. apply the permutation from the second QR factorization 1337: DO 873 q = 1, NR 1338: DO 872 p = 1, NR 1339: WORK(2*N+N*NR+NR+IWORK(N+p)) = U(p,q) 1340: 872 CONTINUE 1341: DO 874 p = 1, NR 1342: U(p,q) = WORK(2*N+N*NR+NR+p) 1343: 874 CONTINUE 1344: 873 CONTINUE 1345: IF ( NR .LT. N ) THEN 1346: CALL DLASET( 'A',N-NR,NR,ZERO,ZERO,V(NR+1,1),LDV ) 1347: CALL DLASET( 'A',NR,N-NR,ZERO,ZERO,V(1,NR+1),LDV ) 1348: CALL DLASET( 'A',N-NR,N-NR,ZERO,ONE,V(NR+1,NR+1),LDV ) 1349: END IF 1350: CALL DORMQR( 'L','N',N,N,NR,WORK(2*N+1),N,WORK(N+1), 1351: & V,LDV,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR ) 1352: ELSE 1353: * Last line of defense. 1354: * #:( This is a rather pathological case: no scaled condition 1355: * improvement after two pivoted QR factorizations. Other 1356: * possibility is that the rank revealing QR factorization 1357: * or the condition estimator has failed, or the COND_OK 1358: * is set very close to ONE (which is unnecessary). Normally, 1359: * this branch should never be executed, but in rare cases of 1360: * failure of the RRQR or condition estimator, the last line of 1361: * defense ensures that DGEJSV completes the task. 1362: * Compute the full SVD of L3 using DGESVJ with explicit 1363: * accumulation of Jacobi rotations. 1364: CALL DGESVJ( 'L', 'U', 'V', NR, NR, V, LDV, SVA, NR, U, 1365: & LDU, WORK(2*N+N*NR+NR+1), LWORK-2*N-N*NR-NR, INFO ) 1366: SCALEM = WORK(2*N+N*NR+NR+1) 1367: NUMRANK = IDNINT(WORK(2*N+N*NR+NR+2)) 1368: IF ( NR .LT. N ) THEN 1369: CALL DLASET( 'A',N-NR,NR,ZERO,ZERO,V(NR+1,1),LDV ) 1370: CALL DLASET( 'A',NR,N-NR,ZERO,ZERO,V(1,NR+1),LDV ) 1371: CALL DLASET( 'A',N-NR,N-NR,ZERO,ONE,V(NR+1,NR+1),LDV ) 1372: END IF 1373: CALL DORMQR( 'L','N',N,N,NR,WORK(2*N+1),N,WORK(N+1), 1374: & V,LDV,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR ) 1375: * 1376: CALL DORMLQ( 'L', 'T', NR, NR, NR, WORK(2*N+1), N, 1377: & WORK(2*N+N*NR+1), U, LDU, WORK(2*N+N*NR+NR+1), 1378: & LWORK-2*N-N*NR-NR, IERR ) 1379: DO 773 q = 1, NR 1380: DO 772 p = 1, NR 1381: WORK(2*N+N*NR+NR+IWORK(N+p)) = U(p,q) 1382: 772 CONTINUE 1383: DO 774 p = 1, NR 1384: U(p,q) = WORK(2*N+N*NR+NR+p) 1385: 774 CONTINUE 1386: 773 CONTINUE 1387: * 1388: END IF 1389: * 1390: * Permute the rows of V using the (column) permutation from the 1391: * first QRF. Also, scale the columns to make them unit in 1392: * Euclidean norm. This applies to all cases. 1393: * 1394: TEMP1 = DSQRT(DBLE(N)) * EPSLN 1395: DO 1972 q = 1, N 1396: DO 972 p = 1, N 1397: WORK(2*N+N*NR+NR+IWORK(p)) = V(p,q) 1398: 972 CONTINUE 1399: DO 973 p = 1, N 1400: V(p,q) = WORK(2*N+N*NR+NR+p) 1401: 973 CONTINUE 1402: XSC = ONE / DNRM2( N, V(1,q), 1 ) 1403: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) ) 1404: & CALL DSCAL( N, XSC, V(1,q), 1 ) 1405: 1972 CONTINUE 1406: * At this moment, V contains the right singular vectors of A. 1407: * Next, assemble the left singular vector matrix U (M x N). 1408: IF ( NR .LT. M ) THEN 1409: CALL DLASET( 'A', M-NR, NR, ZERO, ZERO, U(NR+1,1), LDU ) 1410: IF ( NR .LT. N1 ) THEN 1411: CALL DLASET('A',NR,N1-NR,ZERO,ZERO,U(1,NR+1),LDU) 1412: CALL DLASET('A',M-NR,N1-NR,ZERO,ONE,U(NR+1,NR+1),LDU) 1413: END IF 1414: END IF 1415: * 1416: * The Q matrix from the first QRF is built into the left singular 1417: * matrix U. This applies to all cases. 1418: * 1419: CALL DORMQR( 'Left', 'No_Tr', M, N1, N, A, LDA, WORK, U, 1420: & LDU, WORK(N+1), LWORK-N, IERR ) 1421: 1422: * The columns of U are normalized. The cost is O(M*N) flops. 1423: TEMP1 = DSQRT(DBLE(M)) * EPSLN 1424: DO 1973 p = 1, NR 1425: XSC = ONE / DNRM2( M, U(1,p), 1 ) 1426: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) ) 1427: & CALL DSCAL( M, XSC, U(1,p), 1 ) 1428: 1973 CONTINUE 1429: * 1430: * If the initial QRF is computed with row pivoting, the left 1431: * singular vectors must be adjusted. 1432: * 1433: IF ( ROWPIV ) 1434: & CALL DLASWP( N1, U, LDU, 1, M-1, IWORK(2*N+1), -1 ) 1435: * 1436: ELSE 1437: * 1438: * .. the initial matrix A has almost orthogonal columns and 1439: * the second QRF is not needed 1440: * 1441: CALL DLACPY( 'Upper', N, N, A, LDA, WORK(N+1), N ) 1442: IF ( L2PERT ) THEN 1443: XSC = DSQRT(SMALL) 1444: DO 5970 p = 2, N 1445: TEMP1 = XSC * WORK( N + (p-1)*N + p ) 1446: DO 5971 q = 1, p - 1 1447: WORK(N+(q-1)*N+p)=-DSIGN(TEMP1,WORK(N+(p-1)*N+q)) 1448: 5971 CONTINUE 1449: 5970 CONTINUE 1450: ELSE 1451: CALL DLASET( 'Lower',N-1,N-1,ZERO,ZERO,WORK(N+2),N ) 1452: END IF 1453: * 1454: CALL DGESVJ( 'Upper', 'U', 'N', N, N, WORK(N+1), N, SVA, 1455: & N, U, LDU, WORK(N+N*N+1), LWORK-N-N*N, INFO ) 1456: * 1457: SCALEM = WORK(N+N*N+1) 1458: NUMRANK = IDNINT(WORK(N+N*N+2)) 1459: DO 6970 p = 1, N 1460: CALL DCOPY( N, WORK(N+(p-1)*N+1), 1, U(1,p), 1 ) 1461: CALL DSCAL( N, SVA(p), WORK(N+(p-1)*N+1), 1 ) 1462: 6970 CONTINUE 1463: * 1464: CALL DTRSM( 'Left', 'Upper', 'NoTrans', 'No UD', N, N, 1465: & ONE, A, LDA, WORK(N+1), N ) 1466: DO 6972 p = 1, N 1467: CALL DCOPY( N, WORK(N+p), N, V(IWORK(p),1), LDV ) 1468: 6972 CONTINUE 1469: TEMP1 = DSQRT(DBLE(N))*EPSLN 1470: DO 6971 p = 1, N 1471: XSC = ONE / DNRM2( N, V(1,p), 1 ) 1472: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) ) 1473: & CALL DSCAL( N, XSC, V(1,p), 1 ) 1474: 6971 CONTINUE 1475: * 1476: * Assemble the left singular vector matrix U (M x N). 1477: * 1478: IF ( N .LT. M ) THEN 1479: CALL DLASET( 'A', M-N, N, ZERO, ZERO, U(N+1,1), LDU ) 1480: IF ( N .LT. N1 ) THEN 1481: CALL DLASET( 'A',N, N1-N, ZERO, ZERO, U(1,N+1),LDU ) 1482: CALL DLASET( 'A',M-N,N1-N, ZERO, ONE,U(N+1,N+1),LDU ) 1483: END IF 1484: END IF 1485: CALL DORMQR( 'Left', 'No Tr', M, N1, N, A, LDA, WORK, U, 1486: & LDU, WORK(N+1), LWORK-N, IERR ) 1487: TEMP1 = DSQRT(DBLE(M))*EPSLN 1488: DO 6973 p = 1, N1 1489: XSC = ONE / DNRM2( M, U(1,p), 1 ) 1490: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) ) 1491: & CALL DSCAL( M, XSC, U(1,p), 1 ) 1492: 6973 CONTINUE 1493: * 1494: IF ( ROWPIV ) 1495: & CALL DLASWP( N1, U, LDU, 1, M-1, IWORK(2*N+1), -1 ) 1496: * 1497: END IF 1498: * 1499: * end of the >> almost orthogonal case << in the full SVD 1500: * 1501: ELSE 1502: * 1503: * This branch deploys a preconditioned Jacobi SVD with explicitly 1504: * accumulated rotations. It is included as optional, mainly for 1505: * experimental purposes. It does perfom well, and can also be used. 1506: * In this implementation, this branch will be automatically activated 1507: * if the condition number sigma_max(A) / sigma_min(A) is predicted 1508: * to be greater than the overflow threshold. This is because the 1509: * a posteriori computation of the singular vectors assumes robust 1510: * implementation of BLAS and some LAPACK procedures, capable of working 1511: * in presence of extreme values. Since that is not always the case, ... 1512: * 1513: DO 7968 p = 1, NR 1514: CALL DCOPY( N-p+1, A(p,p), LDA, V(p,p), 1 ) 1515: 7968 CONTINUE 1516: * 1517: IF ( L2PERT ) THEN 1518: XSC = DSQRT(SMALL/EPSLN) 1519: DO 5969 q = 1, NR 1520: TEMP1 = XSC*DABS( V(q,q) ) 1521: DO 5968 p = 1, N 1522: IF ( ( p .GT. q ) .AND. ( DABS(V(p,q)) .LE. TEMP1 ) 1523: & .OR. ( p .LT. q ) ) 1524: & V(p,q) = DSIGN( TEMP1, V(p,q) ) 1525: IF ( p. LT. q ) V(p,q) = - V(p,q) 1526: 5968 CONTINUE 1527: 5969 CONTINUE 1528: ELSE 1529: CALL DLASET( 'U', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV ) 1530: END IF 1531: 1532: CALL DGEQRF( N, NR, V, LDV, WORK(N+1), WORK(2*N+1), 1533: & LWORK-2*N, IERR ) 1534: CALL DLACPY( 'L', N, NR, V, LDV, WORK(2*N+1), N ) 1535: * 1536: DO 7969 p = 1, NR 1537: CALL DCOPY( NR-p+1, V(p,p), LDV, U(p,p), 1 ) 1538: 7969 CONTINUE 1539: 1540: IF ( L2PERT ) THEN 1541: XSC = DSQRT(SMALL/EPSLN) 1542: DO 9970 q = 2, NR 1543: DO 9971 p = 1, q - 1 1544: TEMP1 = XSC * DMIN1(DABS(U(p,p)),DABS(U(q,q))) 1545: U(p,q) = - DSIGN( TEMP1, U(q,p) ) 1546: 9971 CONTINUE 1547: 9970 CONTINUE 1548: ELSE 1549: CALL DLASET('U', NR-1, NR-1, ZERO, ZERO, U(1,2), LDU ) 1550: END IF 1551: 1552: CALL DGESVJ( 'G', 'U', 'V', NR, NR, U, LDU, SVA, 1553: & N, V, LDV, WORK(2*N+N*NR+1), LWORK-2*N-N*NR, INFO ) 1554: SCALEM = WORK(2*N+N*NR+1) 1555: NUMRANK = IDNINT(WORK(2*N+N*NR+2)) 1556: 1557: IF ( NR .LT. N ) THEN 1558: CALL DLASET( 'A',N-NR,NR,ZERO,ZERO,V(NR+1,1),LDV ) 1559: CALL DLASET( 'A',NR,N-NR,ZERO,ZERO,V(1,NR+1),LDV ) 1560: CALL DLASET( 'A',N-NR,N-NR,ZERO,ONE,V(NR+1,NR+1),LDV ) 1561: END IF 1562: 1563: CALL DORMQR( 'L','N',N,N,NR,WORK(2*N+1),N,WORK(N+1), 1564: & V,LDV,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR ) 1565: * 1566: * Permute the rows of V using the (column) permutation from the 1567: * first QRF. Also, scale the columns to make them unit in 1568: * Euclidean norm. This applies to all cases. 1569: * 1570: TEMP1 = DSQRT(DBLE(N)) * EPSLN 1571: DO 7972 q = 1, N 1572: DO 8972 p = 1, N 1573: WORK(2*N+N*NR+NR+IWORK(p)) = V(p,q) 1574: 8972 CONTINUE 1575: DO 8973 p = 1, N 1576: V(p,q) = WORK(2*N+N*NR+NR+p) 1577: 8973 CONTINUE 1578: XSC = ONE / DNRM2( N, V(1,q), 1 ) 1579: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) ) 1580: & CALL DSCAL( N, XSC, V(1,q), 1 ) 1581: 7972 CONTINUE 1582: * 1583: * At this moment, V contains the right singular vectors of A. 1584: * Next, assemble the left singular vector matrix U (M x N). 1585: * 1586: IF ( NR .LT. M ) THEN 1587: CALL DLASET( 'A', M-NR, NR, ZERO, ZERO, U(NR+1,1), LDU ) 1588: IF ( NR .LT. N1 ) THEN 1589: CALL DLASET( 'A',NR, N1-NR, ZERO, ZERO, U(1,NR+1),LDU ) 1590: CALL DLASET( 'A',M-NR,N1-NR, ZERO, ONE,U(NR+1,NR+1),LDU ) 1591: END IF 1592: END IF 1593: * 1594: CALL DORMQR( 'Left', 'No Tr', M, N1, N, A, LDA, WORK, U, 1595: & LDU, WORK(N+1), LWORK-N, IERR ) 1596: * 1597: IF ( ROWPIV ) 1598: & CALL DLASWP( N1, U, LDU, 1, M-1, IWORK(2*N+1), -1 ) 1599: * 1600: * 1601: END IF 1602: IF ( TRANSP ) THEN 1603: * .. swap U and V because the procedure worked on A^t 1604: DO 6974 p = 1, N 1605: CALL DSWAP( N, U(1,p), 1, V(1,p), 1 ) 1606: 6974 CONTINUE 1607: END IF 1608: * 1609: END IF 1610: * end of the full SVD 1611: * 1612: * Undo scaling, if necessary (and possible) 1613: * 1614: IF ( USCAL2 .LE. (BIG/SVA(1))*USCAL1 ) THEN 1615: CALL DLASCL( 'G', 0, 0, USCAL1, USCAL2, NR, 1, SVA, N, IERR ) 1616: USCAL1 = ONE 1617: USCAL2 = ONE 1618: END IF 1619: * 1620: IF ( NR .LT. N ) THEN 1621: DO 3004 p = NR+1, N 1622: SVA(p) = ZERO 1623: 3004 CONTINUE 1624: END IF 1625: * 1626: WORK(1) = USCAL2 * SCALEM 1627: WORK(2) = USCAL1 1628: IF ( ERREST ) WORK(3) = SCONDA 1629: IF ( LSVEC .AND. RSVEC ) THEN 1630: WORK(4) = CONDR1 1631: WORK(5) = CONDR2 1632: END IF 1633: IF ( L2TRAN ) THEN 1634: WORK(6) = ENTRA 1635: WORK(7) = ENTRAT 1636: END IF 1637: * 1638: IWORK(1) = NR 1639: IWORK(2) = NUMRANK 1640: IWORK(3) = WARNING 1641: * 1642: RETURN 1643: * .. 1644: * .. END OF DGEJSV 1645: * .. 1646: END 1647: *