Annotation of rpl/lapack/lapack/dgejsv.f, revision 1.2
1.1 bertrand 1: SUBROUTINE DGEJSV( JOBA, JOBU, JOBV, JOBR, JOBT, JOBP,
2: & M, N, A, LDA, SVA, U, LDU, V, LDV,
3: & WORK, LWORK, IWORK, INFO )
4: *
5: * -- LAPACK routine (version 3.2.2) --
6: *
7: * -- Contributed by Zlatko Drmac of the University of Zagreb and --
8: * -- Kresimir Veselic of the Fernuniversitaet Hagen --
9: * -- June 2010 --
10: *
11: * -- LAPACK is a software package provided by Univ. of Tennessee, --
12: * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
13: *
14: * This routine is also part of SIGMA (version 1.23, October 23. 2008.)
15: * SIGMA is a library of algorithms for highly accurate algorithms for
16: * computation of SVD, PSVD, QSVD, (H,K)-SVD, and for solution of the
17: * eigenvalue problems Hx = lambda M x, H M x = lambda x with H, M > 0.
18: *
19: * .. Scalar Arguments ..
20: IMPLICIT NONE
21: INTEGER INFO, LDA, LDU, LDV, LWORK, M, N
22: * ..
23: * .. Array Arguments ..
24: DOUBLE PRECISION A( LDA, * ), SVA( N ), U( LDU, * ), V( LDV, * ),
25: & WORK( LWORK )
26: INTEGER IWORK( * )
27: CHARACTER*1 JOBA, JOBP, JOBR, JOBT, JOBU, JOBV
28: * ..
29: *
30: * Purpose
31: * =======
32: *
33: * DGEJSV computes the singular value decomposition (SVD) of a real M-by-N
34: * matrix [A], where M >= N. The SVD of [A] is written as
35: *
36: * [A] = [U] * [SIGMA] * [V]^t,
37: *
38: * where [SIGMA] is an N-by-N (M-by-N) matrix which is zero except for its N
39: * diagonal elements, [U] is an M-by-N (or M-by-M) orthonormal matrix, and
40: * [V] is an N-by-N orthogonal matrix. The diagonal elements of [SIGMA] are
41: * the singular values of [A]. The columns of [U] and [V] are the left and
42: * the right singular vectors of [A], respectively. The matrices [U] and [V]
43: * are computed and stored in the arrays U and V, respectively. The diagonal
44: * of [SIGMA] is computed and stored in the array SVA.
45: *
46: * Arguments
47: * =========
48: *
49: * JOBA (input) CHARACTER*1
50: * Specifies the level of accuracy:
51: * = 'C': This option works well (high relative accuracy) if A = B * D,
52: * with well-conditioned B and arbitrary diagonal matrix D.
53: * The accuracy cannot be spoiled by COLUMN scaling. The
54: * accuracy of the computed output depends on the condition of
55: * B, and the procedure aims at the best theoretical accuracy.
56: * The relative error max_{i=1:N}|d sigma_i| / sigma_i is
57: * bounded by f(M,N)*epsilon* cond(B), independent of D.
58: * The input matrix is preprocessed with the QRF with column
59: * pivoting. This initial preprocessing and preconditioning by
60: * a rank revealing QR factorization is common for all values of
61: * JOBA. Additional actions are specified as follows:
62: * = 'E': Computation as with 'C' with an additional estimate of the
63: * condition number of B. It provides a realistic error bound.
64: * = 'F': If A = D1 * C * D2 with ill-conditioned diagonal scalings
65: * D1, D2, and well-conditioned matrix C, this option gives
66: * higher accuracy than the 'C' option. If the structure of the
67: * input matrix is not known, and relative accuracy is
68: * desirable, then this option is advisable. The input matrix A
69: * is preprocessed with QR factorization with FULL (row and
70: * column) pivoting.
71: * = 'G' Computation as with 'F' with an additional estimate of the
72: * condition number of B, where A=D*B. If A has heavily weighted
73: * rows, then using this condition number gives too pessimistic
74: * error bound.
75: * = 'A': Small singular values are the noise and the matrix is treated
76: * as numerically rank defficient. The error in the computed
77: * singular values is bounded by f(m,n)*epsilon*||A||.
78: * The computed SVD A = U * S * V^t restores A up to
79: * f(m,n)*epsilon*||A||.
80: * This gives the procedure the licence to discard (set to zero)
81: * all singular values below N*epsilon*||A||.
82: * = 'R': Similar as in 'A'. Rank revealing property of the initial
83: * QR factorization is used do reveal (using triangular factor)
84: * a gap sigma_{r+1} < epsilon * sigma_r in which case the
85: * numerical RANK is declared to be r. The SVD is computed with
86: * absolute error bounds, but more accurately than with 'A'.
87: *
88: * JOBU (input) CHARACTER*1
89: * Specifies whether to compute the columns of U:
90: * = 'U': N columns of U are returned in the array U.
91: * = 'F': full set of M left sing. vectors is returned in the array U.
92: * = 'W': U may be used as workspace of length M*N. See the description
93: * of U.
94: * = 'N': U is not computed.
95: *
96: * JOBV (input) CHARACTER*1
97: * Specifies whether to compute the matrix V:
98: * = 'V': N columns of V are returned in the array V; Jacobi rotations
99: * are not explicitly accumulated.
100: * = 'J': N columns of V are returned in the array V, but they are
101: * computed as the product of Jacobi rotations. This option is
102: * allowed only if JOBU .NE. 'N', i.e. in computing the full SVD.
103: * = 'W': V may be used as workspace of length N*N. See the description
104: * of V.
105: * = 'N': V is not computed.
106: *
107: * JOBR (input) CHARACTER*1
108: * Specifies the RANGE for the singular values. Issues the licence to
109: * set to zero small positive singular values if they are outside
110: * specified range. If A .NE. 0 is scaled so that the largest singular
111: * value of c*A is around DSQRT(BIG), BIG=SLAMCH('O'), then JOBR issues
112: * the licence to kill columns of A whose norm in c*A is less than
113: * DSQRT(SFMIN) (for JOBR.EQ.'R'), or less than SMALL=SFMIN/EPSLN,
114: * where SFMIN=SLAMCH('S'), EPSLN=SLAMCH('E').
115: * = 'N': Do not kill small columns of c*A. This option assumes that
116: * BLAS and QR factorizations and triangular solvers are
117: * implemented to work in that range. If the condition of A
118: * is greater than BIG, use DGESVJ.
119: * = 'R': RESTRICTED range for sigma(c*A) is [DSQRT(SFMIN), DSQRT(BIG)]
120: * (roughly, as described above). This option is recommended.
121: * ~~~~~~~~~~~~~~~~~~~~~~~~~~~
122: * For computing the singular values in the FULL range [SFMIN,BIG]
123: * use DGESVJ.
124: *
125: * JOBT (input) CHARACTER*1
126: * If the matrix is square then the procedure may determine to use
127: * transposed A if A^t seems to be better with respect to convergence.
128: * If the matrix is not square, JOBT is ignored. This is subject to
129: * changes in the future.
130: * The decision is based on two values of entropy over the adjoint
131: * orbit of A^t * A. See the descriptions of WORK(6) and WORK(7).
132: * = 'T': transpose if entropy test indicates possibly faster
133: * convergence of Jacobi process if A^t is taken as input. If A is
134: * replaced with A^t, then the row pivoting is included automatically.
135: * = 'N': do not speculate.
136: * This option can be used to compute only the singular values, or the
137: * full SVD (U, SIGMA and V). For only one set of singular vectors
138: * (U or V), the caller should provide both U and V, as one of the
139: * matrices is used as workspace if the matrix A is transposed.
140: * The implementer can easily remove this constraint and make the
141: * code more complicated. See the descriptions of U and V.
142: *
143: * JOBP (input) CHARACTER*1
144: * Issues the licence to introduce structured perturbations to drown
145: * denormalized numbers. This licence should be active if the
146: * denormals are poorly implemented, causing slow computation,
147: * especially in cases of fast convergence (!). For details see [1,2].
148: * For the sake of simplicity, this perturbations are included only
149: * when the full SVD or only the singular values are requested. The
150: * implementer/user can easily add the perturbation for the cases of
151: * computing one set of singular vectors.
152: * = 'P': introduce perturbation
153: * = 'N': do not perturb
154: *
155: * M (input) INTEGER
156: * The number of rows of the input matrix A. M >= 0.
157: *
158: * N (input) INTEGER
159: * The number of columns of the input matrix A. M >= N >= 0.
160: *
161: * A (input/workspace) DOUBLE PRECISION array, dimension (LDA,N)
162: * On entry, the M-by-N matrix A.
163: *
164: * LDA (input) INTEGER
165: * The leading dimension of the array A. LDA >= max(1,M).
166: *
167: * SVA (workspace/output) DOUBLE PRECISION array, dimension (N)
168: * On exit,
169: * - For WORK(1)/WORK(2) = ONE: The singular values of A. During the
170: * computation SVA contains Euclidean column norms of the
171: * iterated matrices in the array A.
172: * - For WORK(1) .NE. WORK(2): The singular values of A are
173: * (WORK(1)/WORK(2)) * SVA(1:N). This factored form is used if
174: * sigma_max(A) overflows or if small singular values have been
175: * saved from underflow by scaling the input matrix A.
176: * - If JOBR='R' then some of the singular values may be returned
177: * as exact zeros obtained by "set to zero" because they are
178: * below the numerical rank threshold or are denormalized numbers.
179: *
180: * U (workspace/output) DOUBLE PRECISION array, dimension ( LDU, N )
181: * If JOBU = 'U', then U contains on exit the M-by-N matrix of
182: * the left singular vectors.
183: * If JOBU = 'F', then U contains on exit the M-by-M matrix of
184: * the left singular vectors, including an ONB
185: * of the orthogonal complement of the Range(A).
186: * If JOBU = 'W' .AND. (JOBV.EQ.'V' .AND. JOBT.EQ.'T' .AND. M.EQ.N),
187: * then U is used as workspace if the procedure
188: * replaces A with A^t. In that case, [V] is computed
189: * in U as left singular vectors of A^t and then
190: * copied back to the V array. This 'W' option is just
191: * a reminder to the caller that in this case U is
192: * reserved as workspace of length N*N.
193: * If JOBU = 'N' U is not referenced.
194: *
195: * LDU (input) INTEGER
196: * The leading dimension of the array U, LDU >= 1.
197: * IF JOBU = 'U' or 'F' or 'W', then LDU >= M.
198: *
199: * V (workspace/output) DOUBLE PRECISION array, dimension ( LDV, N )
200: * If JOBV = 'V', 'J' then V contains on exit the N-by-N matrix of
201: * the right singular vectors;
202: * If JOBV = 'W', AND (JOBU.EQ.'U' AND JOBT.EQ.'T' AND M.EQ.N),
203: * then V is used as workspace if the pprocedure
204: * replaces A with A^t. In that case, [U] is computed
205: * in V as right singular vectors of A^t and then
206: * copied back to the U array. This 'W' option is just
207: * a reminder to the caller that in this case V is
208: * reserved as workspace of length N*N.
209: * If JOBV = 'N' V is not referenced.
210: *
211: * LDV (input) INTEGER
212: * The leading dimension of the array V, LDV >= 1.
213: * If JOBV = 'V' or 'J' or 'W', then LDV >= N.
214: *
215: * WORK (workspace/output) DOUBLE PRECISION array, dimension at least LWORK.
216: * On exit,
217: * WORK(1) = SCALE = WORK(2) / WORK(1) is the scaling factor such
218: * that SCALE*SVA(1:N) are the computed singular values
219: * of A. (See the description of SVA().)
220: * WORK(2) = See the description of WORK(1).
221: * WORK(3) = SCONDA is an estimate for the condition number of
222: * column equilibrated A. (If JOBA .EQ. 'E' or 'G')
223: * SCONDA is an estimate of DSQRT(||(R^t * R)^(-1)||_1).
224: * It is computed using DPOCON. It holds
225: * N^(-1/4) * SCONDA <= ||R^(-1)||_2 <= N^(1/4) * SCONDA
226: * where R is the triangular factor from the QRF of A.
227: * However, if R is truncated and the numerical rank is
228: * determined to be strictly smaller than N, SCONDA is
229: * returned as -1, thus indicating that the smallest
230: * singular values might be lost.
231: *
232: * If full SVD is needed, the following two condition numbers are
233: * useful for the analysis of the algorithm. They are provied for
234: * a developer/implementer who is familiar with the details of
235: * the method.
236: *
237: * WORK(4) = an estimate of the scaled condition number of the
238: * triangular factor in the first QR factorization.
239: * WORK(5) = an estimate of the scaled condition number of the
240: * triangular factor in the second QR factorization.
241: * The following two parameters are computed if JOBT .EQ. 'T'.
242: * They are provided for a developer/implementer who is familiar
243: * with the details of the method.
244: *
245: * WORK(6) = the entropy of A^t*A :: this is the Shannon entropy
246: * of diag(A^t*A) / Trace(A^t*A) taken as point in the
247: * probability simplex.
248: * WORK(7) = the entropy of A*A^t.
249: *
250: * LWORK (input) INTEGER
251: * Length of WORK to confirm proper allocation of work space.
252: * LWORK depends on the job:
253: *
254: * If only SIGMA is needed ( JOBU.EQ.'N', JOBV.EQ.'N' ) and
255: * -> .. no scaled condition estimate required ( JOBE.EQ.'N'):
256: * LWORK >= max(2*M+N,4*N+1,7). This is the minimal requirement.
257: * For optimal performance (blocked code) the optimal value
258: * is LWORK >= max(2*M+N,3*N+(N+1)*NB,7). Here NB is the optimal
259: * block size for xGEQP3/xGEQRF.
260: * -> .. an estimate of the scaled condition number of A is
261: * required (JOBA='E', 'G'). In this case, LWORK is the maximum
262: * of the above and N*N+4*N, i.e. LWORK >= max(2*M+N,N*N+4N,7).
263: *
264: * If SIGMA and the right singular vectors are needed (JOBV.EQ.'V'),
265: * -> the minimal requirement is LWORK >= max(2*N+M,7).
266: * -> For optimal performance, LWORK >= max(2*N+M,2*N+N*NB,7),
267: * where NB is the optimal block size.
268: *
269: * If SIGMA and the left singular vectors are needed
270: * -> the minimal requirement is LWORK >= max(2*N+M,7).
271: * -> For optimal performance, LWORK >= max(2*N+M,2*N+N*NB,7),
272: * where NB is the optimal block size.
273: *
274: * If full SVD is needed ( JOBU.EQ.'U' or 'F', JOBV.EQ.'V' ) and
275: * -> .. the singular vectors are computed without explicit
276: * accumulation of the Jacobi rotations, LWORK >= 6*N+2*N*N
277: * -> .. in the iterative part, the Jacobi rotations are
278: * explicitly accumulated (option, see the description of JOBV),
279: * then the minimal requirement is LWORK >= max(M+3*N+N*N,7).
280: * For better performance, if NB is the optimal block size,
281: * LWORK >= max(3*N+N*N+M,3*N+N*N+N*NB,7).
282: *
283: * IWORK (workspace/output) INTEGER array, dimension M+3*N.
284: * On exit,
285: * IWORK(1) = the numerical rank determined after the initial
286: * QR factorization with pivoting. See the descriptions
287: * of JOBA and JOBR.
288: * IWORK(2) = the number of the computed nonzero singular values
289: * IWORK(3) = if nonzero, a warning message:
290: * If IWORK(3).EQ.1 then some of the column norms of A
291: * were denormalized floats. The requested high accuracy
292: * is not warranted by the data.
293: *
294: * INFO (output) INTEGER
295: * < 0 : if INFO = -i, then the i-th argument had an illegal value.
296: * = 0 : successfull exit;
297: * > 0 : DGEJSV did not converge in the maximal allowed number
298: * of sweeps. The computed values may be inaccurate.
299: *
300: * Further Details
301: * ===============
302: *
303: * DGEJSV implements a preconditioned Jacobi SVD algorithm. It uses SGEQP3,
304: * SGEQRF, and SGELQF as preprocessors and preconditioners. Optionally, an
305: * additional row pivoting can be used as a preprocessor, which in some
306: * cases results in much higher accuracy. An example is matrix A with the
307: * structure A = D1 * C * D2, where D1, D2 are arbitrarily ill-conditioned
308: * diagonal matrices and C is well-conditioned matrix. In that case, complete
309: * pivoting in the first QR factorizations provides accuracy dependent on the
310: * condition number of C, and independent of D1, D2. Such higher accuracy is
311: * not completely understood theoretically, but it works well in practice.
312: * Further, if A can be written as A = B*D, with well-conditioned B and some
313: * diagonal D, then the high accuracy is guaranteed, both theoretically and
314: * in software, independent of D. For more details see [1], [2].
315: * The computational range for the singular values can be the full range
316: * ( UNDERFLOW,OVERFLOW ), provided that the machine arithmetic and the BLAS
317: * & LAPACK routines called by DGEJSV are implemented to work in that range.
318: * If that is not the case, then the restriction for safe computation with
319: * the singular values in the range of normalized IEEE numbers is that the
320: * spectral condition number kappa(A)=sigma_max(A)/sigma_min(A) does not
321: * overflow. This code (DGEJSV) is best used in this restricted range,
322: * meaning that singular values of magnitude below ||A||_2 / SLAMCH('O') are
323: * returned as zeros. See JOBR for details on this.
324: * Further, this implementation is somewhat slower than the one described
325: * in [1,2] due to replacement of some non-LAPACK components, and because
326: * the choice of some tuning parameters in the iterative part (DGESVJ) is
327: * left to the implementer on a particular machine.
328: * The rank revealing QR factorization (in this code: SGEQP3) should be
329: * implemented as in [3]. We have a new version of SGEQP3 under development
330: * that is more robust than the current one in LAPACK, with a cleaner cut in
331: * rank defficient cases. It will be available in the SIGMA library [4].
332: * If M is much larger than N, it is obvious that the inital QRF with
333: * column pivoting can be preprocessed by the QRF without pivoting. That
334: * well known trick is not used in DGEJSV because in some cases heavy row
335: * weighting can be treated with complete pivoting. The overhead in cases
336: * M much larger than N is then only due to pivoting, but the benefits in
337: * terms of accuracy have prevailed. The implementer/user can incorporate
338: * this extra QRF step easily. The implementer can also improve data movement
339: * (matrix transpose, matrix copy, matrix transposed copy) - this
340: * implementation of DGEJSV uses only the simplest, naive data movement.
341: *
342: * Contributors
343: *
344: * Zlatko Drmac (Zagreb, Croatia) and Kresimir Veselic (Hagen, Germany)
345: *
346: * References
347: *
348: * [1] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I.
349: * SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342.
350: * LAPACK Working note 169.
351: * [2] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II.
352: * SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362.
353: * LAPACK Working note 170.
354: * [3] Z. Drmac and Z. Bujanovic: On the failure of rank-revealing QR
355: * factorization software - a case study.
356: * ACM Trans. Math. Softw. Vol. 35, No 2 (2008), pp. 1-28.
357: * LAPACK Working note 176.
358: * [4] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV,
359: * QSVD, (H,K)-SVD computations.
360: * Department of Mathematics, University of Zagreb, 2008.
361: *
362: * Bugs, examples and comments
363: *
364: * Please report all bugs and send interesting examples and/or comments to
365: * drmac@math.hr. Thank you.
366: *
367: * ==========================================================================
368: *
369: * .. Local Parameters ..
370: DOUBLE PRECISION ZERO, ONE
371: PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 )
372: * ..
373: * .. Local Scalars ..
374: DOUBLE PRECISION AAPP, AAQQ, AATMAX, AATMIN, BIG, BIG1, COND_OK,
375: & CONDR1, CONDR2, ENTRA, ENTRAT, EPSLN, MAXPRJ, SCALEM,
376: & SCONDA, SFMIN, SMALL, TEMP1, USCAL1, USCAL2, XSC
377: INTEGER IERR, N1, NR, NUMRANK, p, q, WARNING
378: LOGICAL ALMORT, DEFR, ERREST, GOSCAL, JRACC, KILL, LSVEC,
379: & L2ABER, L2KILL, L2PERT, L2RANK, L2TRAN,
380: & NOSCAL, ROWPIV, RSVEC, TRANSP
381: * ..
382: * .. Intrinsic Functions ..
383: INTRINSIC DABS, DLOG, DMAX1, DMIN1, DBLE,
384: & MAX0, MIN0, IDNINT, DSIGN, DSQRT
385: * ..
386: * .. External Functions ..
387: DOUBLE PRECISION DLAMCH, DNRM2
388: INTEGER IDAMAX
389: LOGICAL LSAME
390: EXTERNAL IDAMAX, LSAME, DLAMCH, DNRM2
391: * ..
392: * .. External Subroutines ..
393: EXTERNAL DCOPY, DGELQF, DGEQP3, DGEQRF, DLACPY, DLASCL,
394: & DLASET, DLASSQ, DLASWP, DORGQR, DORMLQ,
395: & DORMQR, DPOCON, DSCAL, DSWAP, DTRSM, XERBLA
396: *
397: EXTERNAL DGESVJ
398: * ..
399: *
400: * Test the input arguments
401: *
402: LSVEC = LSAME( JOBU, 'U' ) .OR. LSAME( JOBU, 'F' )
403: JRACC = LSAME( JOBV, 'J' )
404: RSVEC = LSAME( JOBV, 'V' ) .OR. JRACC
405: ROWPIV = LSAME( JOBA, 'F' ) .OR. LSAME( JOBA, 'G' )
406: L2RANK = LSAME( JOBA, 'R' )
407: L2ABER = LSAME( JOBA, 'A' )
408: ERREST = LSAME( JOBA, 'E' ) .OR. LSAME( JOBA, 'G' )
409: L2TRAN = LSAME( JOBT, 'T' )
410: L2KILL = LSAME( JOBR, 'R' )
411: DEFR = LSAME( JOBR, 'N' )
412: L2PERT = LSAME( JOBP, 'P' )
413: *
414: IF ( .NOT.(ROWPIV .OR. L2RANK .OR. L2ABER .OR.
415: & ERREST .OR. LSAME( JOBA, 'C' ) )) THEN
416: INFO = - 1
417: ELSE IF ( .NOT.( LSVEC .OR. LSAME( JOBU, 'N' ) .OR.
418: & LSAME( JOBU, 'W' )) ) THEN
419: INFO = - 2
420: ELSE IF ( .NOT.( RSVEC .OR. LSAME( JOBV, 'N' ) .OR.
421: & LSAME( JOBV, 'W' )) .OR. ( JRACC .AND. (.NOT.LSVEC) ) ) THEN
422: INFO = - 3
423: ELSE IF ( .NOT. ( L2KILL .OR. DEFR ) ) THEN
424: INFO = - 4
425: ELSE IF ( .NOT. ( L2TRAN .OR. LSAME( JOBT, 'N' ) ) ) THEN
426: INFO = - 5
427: ELSE IF ( .NOT. ( L2PERT .OR. LSAME( JOBP, 'N' ) ) ) THEN
428: INFO = - 6
429: ELSE IF ( M .LT. 0 ) THEN
430: INFO = - 7
431: ELSE IF ( ( N .LT. 0 ) .OR. ( N .GT. M ) ) THEN
432: INFO = - 8
433: ELSE IF ( LDA .LT. M ) THEN
434: INFO = - 10
435: ELSE IF ( LSVEC .AND. ( LDU .LT. M ) ) THEN
436: INFO = - 13
437: ELSE IF ( RSVEC .AND. ( LDV .LT. N ) ) THEN
438: INFO = - 14
439: ELSE IF ( (.NOT.(LSVEC .OR. RSVEC .OR. ERREST).AND.
440: & (LWORK .LT. MAX0(7,4*N+1,2*M+N))) .OR.
441: & (.NOT.(LSVEC .OR. LSVEC) .AND. ERREST .AND.
442: & (LWORK .LT. MAX0(7,4*N+N*N,2*M+N))) .OR.
443: & (LSVEC .AND. (.NOT.RSVEC) .AND. (LWORK .LT. MAX0(7,2*N+M))) .OR.
444: & (RSVEC .AND. (.NOT.LSVEC) .AND. (LWORK .LT. MAX0(7,2*N+M))) .OR.
445: & (LSVEC .AND. RSVEC .AND. .NOT.JRACC .AND. (LWORK.LT.6*N+2*N*N))
446: & .OR. (LSVEC.AND.RSVEC.AND.JRACC.AND.LWORK.LT.MAX0(7,M+3*N+N*N)))
447: & THEN
448: INFO = - 17
449: ELSE
450: * #:)
451: INFO = 0
452: END IF
453: *
454: IF ( INFO .NE. 0 ) THEN
455: * #:(
456: CALL XERBLA( 'DGEJSV', - INFO )
457: END IF
458: *
459: * Quick return for void matrix (Y3K safe)
460: * #:)
461: IF ( ( M .EQ. 0 ) .OR. ( N .EQ. 0 ) ) RETURN
462: *
463: * Determine whether the matrix U should be M x N or M x M
464: *
465: IF ( LSVEC ) THEN
466: N1 = N
467: IF ( LSAME( JOBU, 'F' ) ) N1 = M
468: END IF
469: *
470: * Set numerical parameters
471: *
472: *! NOTE: Make sure DLAMCH() does not fail on the target architecture.
473: *
474:
475: EPSLN = DLAMCH('Epsilon')
476: SFMIN = DLAMCH('SafeMinimum')
477: SMALL = SFMIN / EPSLN
478: BIG = DLAMCH('O')
479: * BIG = ONE / SFMIN
480: *
481: * Initialize SVA(1:N) = diag( ||A e_i||_2 )_1^N
482: *
483: *(!) If necessary, scale SVA() to protect the largest norm from
484: * overflow. It is possible that this scaling pushes the smallest
485: * column norm left from the underflow threshold (extreme case).
486: *
487: SCALEM = ONE / DSQRT(DBLE(M)*DBLE(N))
488: NOSCAL = .TRUE.
489: GOSCAL = .TRUE.
490: DO 1874 p = 1, N
491: AAPP = ZERO
492: AAQQ = ZERO
493: CALL DLASSQ( M, A(1,p), 1, AAPP, AAQQ )
494: IF ( AAPP .GT. BIG ) THEN
495: INFO = - 9
496: CALL XERBLA( 'DGEJSV', -INFO )
497: RETURN
498: END IF
499: AAQQ = DSQRT(AAQQ)
500: IF ( ( AAPP .LT. (BIG / AAQQ) ) .AND. NOSCAL ) THEN
501: SVA(p) = AAPP * AAQQ
502: ELSE
503: NOSCAL = .FALSE.
504: SVA(p) = AAPP * ( AAQQ * SCALEM )
505: IF ( GOSCAL ) THEN
506: GOSCAL = .FALSE.
507: CALL DSCAL( p-1, SCALEM, SVA, 1 )
508: END IF
509: END IF
510: 1874 CONTINUE
511: *
512: IF ( NOSCAL ) SCALEM = ONE
513: *
514: AAPP = ZERO
515: AAQQ = BIG
516: DO 4781 p = 1, N
517: AAPP = DMAX1( AAPP, SVA(p) )
518: IF ( SVA(p) .NE. ZERO ) AAQQ = DMIN1( AAQQ, SVA(p) )
519: 4781 CONTINUE
520: *
521: * Quick return for zero M x N matrix
522: * #:)
523: IF ( AAPP .EQ. ZERO ) THEN
524: IF ( LSVEC ) CALL DLASET( 'G', M, N1, ZERO, ONE, U, LDU )
525: IF ( RSVEC ) CALL DLASET( 'G', N, N, ZERO, ONE, V, LDV )
526: WORK(1) = ONE
527: WORK(2) = ONE
528: IF ( ERREST ) WORK(3) = ONE
529: IF ( LSVEC .AND. RSVEC ) THEN
530: WORK(4) = ONE
531: WORK(5) = ONE
532: END IF
533: IF ( L2TRAN ) THEN
534: WORK(6) = ZERO
535: WORK(7) = ZERO
536: END IF
537: IWORK(1) = 0
538: IWORK(2) = 0
539: RETURN
540: END IF
541: *
542: * Issue warning if denormalized column norms detected. Override the
543: * high relative accuracy request. Issue licence to kill columns
544: * (set them to zero) whose norm is less than sigma_max / BIG (roughly).
545: * #:(
546: WARNING = 0
547: IF ( AAQQ .LE. SFMIN ) THEN
548: L2RANK = .TRUE.
549: L2KILL = .TRUE.
550: WARNING = 1
551: END IF
552: *
553: * Quick return for one-column matrix
554: * #:)
555: IF ( N .EQ. 1 ) THEN
556: *
557: IF ( LSVEC ) THEN
558: CALL DLASCL( 'G',0,0,SVA(1),SCALEM, M,1,A(1,1),LDA,IERR )
559: CALL DLACPY( 'A', M, 1, A, LDA, U, LDU )
560: * computing all M left singular vectors of the M x 1 matrix
561: IF ( N1 .NE. N ) THEN
562: CALL DGEQRF( M, N, U,LDU, WORK, WORK(N+1),LWORK-N,IERR )
563: CALL DORGQR( M,N1,1, U,LDU,WORK,WORK(N+1),LWORK-N,IERR )
564: CALL DCOPY( M, A(1,1), 1, U(1,1), 1 )
565: END IF
566: END IF
567: IF ( RSVEC ) THEN
568: V(1,1) = ONE
569: END IF
570: IF ( SVA(1) .LT. (BIG*SCALEM) ) THEN
571: SVA(1) = SVA(1) / SCALEM
572: SCALEM = ONE
573: END IF
574: WORK(1) = ONE / SCALEM
575: WORK(2) = ONE
576: IF ( SVA(1) .NE. ZERO ) THEN
577: IWORK(1) = 1
578: IF ( ( SVA(1) / SCALEM) .GE. SFMIN ) THEN
579: IWORK(2) = 1
580: ELSE
581: IWORK(2) = 0
582: END IF
583: ELSE
584: IWORK(1) = 0
585: IWORK(2) = 0
586: END IF
587: IF ( ERREST ) WORK(3) = ONE
588: IF ( LSVEC .AND. RSVEC ) THEN
589: WORK(4) = ONE
590: WORK(5) = ONE
591: END IF
592: IF ( L2TRAN ) THEN
593: WORK(6) = ZERO
594: WORK(7) = ZERO
595: END IF
596: RETURN
597: *
598: END IF
599: *
600: TRANSP = .FALSE.
601: L2TRAN = L2TRAN .AND. ( M .EQ. N )
602: *
603: AATMAX = -ONE
604: AATMIN = BIG
605: IF ( ROWPIV .OR. L2TRAN ) THEN
606: *
607: * Compute the row norms, needed to determine row pivoting sequence
608: * (in the case of heavily row weighted A, row pivoting is strongly
609: * advised) and to collect information needed to compare the
610: * structures of A * A^t and A^t * A (in the case L2TRAN.EQ..TRUE.).
611: *
612: IF ( L2TRAN ) THEN
613: DO 1950 p = 1, M
614: XSC = ZERO
615: TEMP1 = ZERO
616: CALL DLASSQ( N, A(p,1), LDA, XSC, TEMP1 )
617: * DLASSQ gets both the ell_2 and the ell_infinity norm
618: * in one pass through the vector
619: WORK(M+N+p) = XSC * SCALEM
620: WORK(N+p) = XSC * (SCALEM*DSQRT(TEMP1))
621: AATMAX = DMAX1( AATMAX, WORK(N+p) )
622: IF (WORK(N+p) .NE. ZERO) AATMIN = DMIN1(AATMIN,WORK(N+p))
623: 1950 CONTINUE
624: ELSE
625: DO 1904 p = 1, M
626: WORK(M+N+p) = SCALEM*DABS( A(p,IDAMAX(N,A(p,1),LDA)) )
627: AATMAX = DMAX1( AATMAX, WORK(M+N+p) )
628: AATMIN = DMIN1( AATMIN, WORK(M+N+p) )
629: 1904 CONTINUE
630: END IF
631: *
632: END IF
633: *
634: * For square matrix A try to determine whether A^t would be better
635: * input for the preconditioned Jacobi SVD, with faster convergence.
636: * The decision is based on an O(N) function of the vector of column
637: * and row norms of A, based on the Shannon entropy. This should give
638: * the right choice in most cases when the difference actually matters.
639: * It may fail and pick the slower converging side.
640: *
641: ENTRA = ZERO
642: ENTRAT = ZERO
643: IF ( L2TRAN ) THEN
644: *
645: XSC = ZERO
646: TEMP1 = ZERO
647: CALL DLASSQ( N, SVA, 1, XSC, TEMP1 )
648: TEMP1 = ONE / TEMP1
649: *
650: ENTRA = ZERO
651: DO 1113 p = 1, N
652: BIG1 = ( ( SVA(p) / XSC )**2 ) * TEMP1
653: IF ( BIG1 .NE. ZERO ) ENTRA = ENTRA + BIG1 * DLOG(BIG1)
654: 1113 CONTINUE
655: ENTRA = - ENTRA / DLOG(DBLE(N))
656: *
657: * Now, SVA().^2/Trace(A^t * A) is a point in the probability simplex.
658: * It is derived from the diagonal of A^t * A. Do the same with the
659: * diagonal of A * A^t, compute the entropy of the corresponding
660: * probability distribution. Note that A * A^t and A^t * A have the
661: * same trace.
662: *
663: ENTRAT = ZERO
664: DO 1114 p = N+1, N+M
665: BIG1 = ( ( WORK(p) / XSC )**2 ) * TEMP1
666: IF ( BIG1 .NE. ZERO ) ENTRAT = ENTRAT + BIG1 * DLOG(BIG1)
667: 1114 CONTINUE
668: ENTRAT = - ENTRAT / DLOG(DBLE(M))
669: *
670: * Analyze the entropies and decide A or A^t. Smaller entropy
671: * usually means better input for the algorithm.
672: *
673: TRANSP = ( ENTRAT .LT. ENTRA )
674: *
675: * If A^t is better than A, transpose A.
676: *
677: IF ( TRANSP ) THEN
678: * In an optimal implementation, this trivial transpose
679: * should be replaced with faster transpose.
680: DO 1115 p = 1, N - 1
681: DO 1116 q = p + 1, N
682: TEMP1 = A(q,p)
683: A(q,p) = A(p,q)
684: A(p,q) = TEMP1
685: 1116 CONTINUE
686: 1115 CONTINUE
687: DO 1117 p = 1, N
688: WORK(M+N+p) = SVA(p)
689: SVA(p) = WORK(N+p)
690: 1117 CONTINUE
691: TEMP1 = AAPP
692: AAPP = AATMAX
693: AATMAX = TEMP1
694: TEMP1 = AAQQ
695: AAQQ = AATMIN
696: AATMIN = TEMP1
697: KILL = LSVEC
698: LSVEC = RSVEC
699: RSVEC = KILL
700: *
701: ROWPIV = .TRUE.
702: END IF
703: *
704: END IF
705: * END IF L2TRAN
706: *
707: * Scale the matrix so that its maximal singular value remains less
708: * than DSQRT(BIG) -- the matrix is scaled so that its maximal column
709: * has Euclidean norm equal to DSQRT(BIG/N). The only reason to keep
710: * DSQRT(BIG) instead of BIG is the fact that DGEJSV uses LAPACK and
711: * BLAS routines that, in some implementations, are not capable of
712: * working in the full interval [SFMIN,BIG] and that they may provoke
713: * overflows in the intermediate results. If the singular values spread
714: * from SFMIN to BIG, then DGESVJ will compute them. So, in that case,
715: * one should use DGESVJ instead of DGEJSV.
716: *
717: BIG1 = DSQRT( BIG )
718: TEMP1 = DSQRT( BIG / DBLE(N) )
719: *
720: CALL DLASCL( 'G', 0, 0, AAPP, TEMP1, N, 1, SVA, N, IERR )
721: IF ( AAQQ .GT. (AAPP * SFMIN) ) THEN
722: AAQQ = ( AAQQ / AAPP ) * TEMP1
723: ELSE
724: AAQQ = ( AAQQ * TEMP1 ) / AAPP
725: END IF
726: TEMP1 = TEMP1 * SCALEM
727: CALL DLASCL( 'G', 0, 0, AAPP, TEMP1, M, N, A, LDA, IERR )
728: *
729: * To undo scaling at the end of this procedure, multiply the
730: * computed singular values with USCAL2 / USCAL1.
731: *
732: USCAL1 = TEMP1
733: USCAL2 = AAPP
734: *
735: IF ( L2KILL ) THEN
736: * L2KILL enforces computation of nonzero singular values in
737: * the restricted range of condition number of the initial A,
738: * sigma_max(A) / sigma_min(A) approx. DSQRT(BIG)/DSQRT(SFMIN).
739: XSC = DSQRT( SFMIN )
740: ELSE
741: XSC = SMALL
742: *
743: * Now, if the condition number of A is too big,
744: * sigma_max(A) / sigma_min(A) .GT. DSQRT(BIG/N) * EPSLN / SFMIN,
745: * as a precaution measure, the full SVD is computed using DGESVJ
746: * with accumulated Jacobi rotations. This provides numerically
747: * more robust computation, at the cost of slightly increased run
748: * time. Depending on the concrete implementation of BLAS and LAPACK
749: * (i.e. how they behave in presence of extreme ill-conditioning) the
750: * implementor may decide to remove this switch.
751: IF ( ( AAQQ.LT.DSQRT(SFMIN) ) .AND. LSVEC .AND. RSVEC ) THEN
752: JRACC = .TRUE.
753: END IF
754: *
755: END IF
756: IF ( AAQQ .LT. XSC ) THEN
757: DO 700 p = 1, N
758: IF ( SVA(p) .LT. XSC ) THEN
759: CALL DLASET( 'A', M, 1, ZERO, ZERO, A(1,p), LDA )
760: SVA(p) = ZERO
761: END IF
762: 700 CONTINUE
763: END IF
764: *
765: * Preconditioning using QR factorization with pivoting
766: *
767: IF ( ROWPIV ) THEN
768: * Optional row permutation (Bjoerck row pivoting):
769: * A result by Cox and Higham shows that the Bjoerck's
770: * row pivoting combined with standard column pivoting
771: * has similar effect as Powell-Reid complete pivoting.
772: * The ell-infinity norms of A are made nonincreasing.
773: DO 1952 p = 1, M - 1
774: q = IDAMAX( M-p+1, WORK(M+N+p), 1 ) + p - 1
775: IWORK(2*N+p) = q
776: IF ( p .NE. q ) THEN
777: TEMP1 = WORK(M+N+p)
778: WORK(M+N+p) = WORK(M+N+q)
779: WORK(M+N+q) = TEMP1
780: END IF
781: 1952 CONTINUE
782: CALL DLASWP( N, A, LDA, 1, M-1, IWORK(2*N+1), 1 )
783: END IF
784: *
785: * End of the preparation phase (scaling, optional sorting and
786: * transposing, optional flushing of small columns).
787: *
788: * Preconditioning
789: *
790: * If the full SVD is needed, the right singular vectors are computed
791: * from a matrix equation, and for that we need theoretical analysis
792: * of the Businger-Golub pivoting. So we use DGEQP3 as the first RR QRF.
793: * In all other cases the first RR QRF can be chosen by other criteria
794: * (eg speed by replacing global with restricted window pivoting, such
795: * as in SGEQPX from TOMS # 782). Good results will be obtained using
796: * SGEQPX with properly (!) chosen numerical parameters.
797: * Any improvement of DGEQP3 improves overal performance of DGEJSV.
798: *
799: * A * P1 = Q1 * [ R1^t 0]^t:
800: DO 1963 p = 1, N
801: * .. all columns are free columns
802: IWORK(p) = 0
803: 1963 CONTINUE
804: CALL DGEQP3( M,N,A,LDA, IWORK,WORK, WORK(N+1),LWORK-N, IERR )
805: *
806: * The upper triangular matrix R1 from the first QRF is inspected for
807: * rank deficiency and possibilities for deflation, or possible
808: * ill-conditioning. Depending on the user specified flag L2RANK,
809: * the procedure explores possibilities to reduce the numerical
810: * rank by inspecting the computed upper triangular factor. If
811: * L2RANK or L2ABER are up, then DGEJSV will compute the SVD of
812: * A + dA, where ||dA|| <= f(M,N)*EPSLN.
813: *
814: NR = 1
815: IF ( L2ABER ) THEN
816: * Standard absolute error bound suffices. All sigma_i with
817: * sigma_i < N*EPSLN*||A|| are flushed to zero. This is an
818: * agressive enforcement of lower numerical rank by introducing a
819: * backward error of the order of N*EPSLN*||A||.
820: TEMP1 = DSQRT(DBLE(N))*EPSLN
821: DO 3001 p = 2, N
822: IF ( DABS(A(p,p)) .GE. (TEMP1*DABS(A(1,1))) ) THEN
823: NR = NR + 1
824: ELSE
825: GO TO 3002
826: END IF
827: 3001 CONTINUE
828: 3002 CONTINUE
829: ELSE IF ( L2RANK ) THEN
830: * .. similarly as above, only slightly more gentle (less agressive).
831: * Sudden drop on the diagonal of R1 is used as the criterion for
832: * close-to-rank-defficient.
833: TEMP1 = DSQRT(SFMIN)
834: DO 3401 p = 2, N
835: IF ( ( DABS(A(p,p)) .LT. (EPSLN*DABS(A(p-1,p-1))) ) .OR.
836: & ( DABS(A(p,p)) .LT. SMALL ) .OR.
837: & ( L2KILL .AND. (DABS(A(p,p)) .LT. TEMP1) ) ) GO TO 3402
838: NR = NR + 1
839: 3401 CONTINUE
840: 3402 CONTINUE
841: *
842: ELSE
843: * The goal is high relative accuracy. However, if the matrix
844: * has high scaled condition number the relative accuracy is in
845: * general not feasible. Later on, a condition number estimator
846: * will be deployed to estimate the scaled condition number.
847: * Here we just remove the underflowed part of the triangular
848: * factor. This prevents the situation in which the code is
849: * working hard to get the accuracy not warranted by the data.
850: TEMP1 = DSQRT(SFMIN)
851: DO 3301 p = 2, N
852: IF ( ( DABS(A(p,p)) .LT. SMALL ) .OR.
853: & ( L2KILL .AND. (DABS(A(p,p)) .LT. TEMP1) ) ) GO TO 3302
854: NR = NR + 1
855: 3301 CONTINUE
856: 3302 CONTINUE
857: *
858: END IF
859: *
860: ALMORT = .FALSE.
861: IF ( NR .EQ. N ) THEN
862: MAXPRJ = ONE
863: DO 3051 p = 2, N
864: TEMP1 = DABS(A(p,p)) / SVA(IWORK(p))
865: MAXPRJ = DMIN1( MAXPRJ, TEMP1 )
866: 3051 CONTINUE
867: IF ( MAXPRJ**2 .GE. ONE - DBLE(N)*EPSLN ) ALMORT = .TRUE.
868: END IF
869: *
870: *
871: SCONDA = - ONE
872: CONDR1 = - ONE
873: CONDR2 = - ONE
874: *
875: IF ( ERREST ) THEN
876: IF ( N .EQ. NR ) THEN
877: IF ( RSVEC ) THEN
878: * .. V is available as workspace
879: CALL DLACPY( 'U', N, N, A, LDA, V, LDV )
880: DO 3053 p = 1, N
881: TEMP1 = SVA(IWORK(p))
882: CALL DSCAL( p, ONE/TEMP1, V(1,p), 1 )
883: 3053 CONTINUE
884: CALL DPOCON( 'U', N, V, LDV, ONE, TEMP1,
885: & WORK(N+1), IWORK(2*N+M+1), IERR )
886: ELSE IF ( LSVEC ) THEN
887: * .. U is available as workspace
888: CALL DLACPY( 'U', N, N, A, LDA, U, LDU )
889: DO 3054 p = 1, N
890: TEMP1 = SVA(IWORK(p))
891: CALL DSCAL( p, ONE/TEMP1, U(1,p), 1 )
892: 3054 CONTINUE
893: CALL DPOCON( 'U', N, U, LDU, ONE, TEMP1,
894: & WORK(N+1), IWORK(2*N+M+1), IERR )
895: ELSE
896: CALL DLACPY( 'U', N, N, A, LDA, WORK(N+1), N )
897: DO 3052 p = 1, N
898: TEMP1 = SVA(IWORK(p))
899: CALL DSCAL( p, ONE/TEMP1, WORK(N+(p-1)*N+1), 1 )
900: 3052 CONTINUE
901: * .. the columns of R are scaled to have unit Euclidean lengths.
902: CALL DPOCON( 'U', N, WORK(N+1), N, ONE, TEMP1,
903: & WORK(N+N*N+1), IWORK(2*N+M+1), IERR )
904: END IF
905: SCONDA = ONE / DSQRT(TEMP1)
906: * SCONDA is an estimate of DSQRT(||(R^t * R)^(-1)||_1).
907: * N^(-1/4) * SCONDA <= ||R^(-1)||_2 <= N^(1/4) * SCONDA
908: ELSE
909: SCONDA = - ONE
910: END IF
911: END IF
912: *
913: L2PERT = L2PERT .AND. ( DABS( A(1,1)/A(NR,NR) ) .GT. DSQRT(BIG1) )
914: * If there is no violent scaling, artificial perturbation is not needed.
915: *
916: * Phase 3:
917: *
918:
919: IF ( .NOT. ( RSVEC .OR. LSVEC ) ) THEN
920: *
921: * Singular Values only
922: *
923: * .. transpose A(1:NR,1:N)
924: DO 1946 p = 1, MIN0( N-1, NR )
925: CALL DCOPY( N-p, A(p,p+1), LDA, A(p+1,p), 1 )
926: 1946 CONTINUE
927: *
928: * The following two DO-loops introduce small relative perturbation
929: * into the strict upper triangle of the lower triangular matrix.
930: * Small entries below the main diagonal are also changed.
931: * This modification is useful if the computing environment does not
932: * provide/allow FLUSH TO ZERO underflow, for it prevents many
933: * annoying denormalized numbers in case of strongly scaled matrices.
934: * The perturbation is structured so that it does not introduce any
935: * new perturbation of the singular values, and it does not destroy
936: * the job done by the preconditioner.
937: * The licence for this perturbation is in the variable L2PERT, which
938: * should be .FALSE. if FLUSH TO ZERO underflow is active.
939: *
940: IF ( .NOT. ALMORT ) THEN
941: *
942: IF ( L2PERT ) THEN
943: * XSC = DSQRT(SMALL)
944: XSC = EPSLN / DBLE(N)
945: DO 4947 q = 1, NR
946: TEMP1 = XSC*DABS(A(q,q))
947: DO 4949 p = 1, N
948: IF ( ( (p.GT.q) .AND. (DABS(A(p,q)).LE.TEMP1) )
949: & .OR. ( p .LT. q ) )
950: & A(p,q) = DSIGN( TEMP1, A(p,q) )
951: 4949 CONTINUE
952: 4947 CONTINUE
953: ELSE
954: CALL DLASET( 'U', NR-1,NR-1, ZERO,ZERO, A(1,2),LDA )
955: END IF
956: *
957: * .. second preconditioning using the QR factorization
958: *
959: CALL DGEQRF( N,NR, A,LDA, WORK, WORK(N+1),LWORK-N, IERR )
960: *
961: * .. and transpose upper to lower triangular
962: DO 1948 p = 1, NR - 1
963: CALL DCOPY( NR-p, A(p,p+1), LDA, A(p+1,p), 1 )
964: 1948 CONTINUE
965: *
966: END IF
967: *
968: * Row-cyclic Jacobi SVD algorithm with column pivoting
969: *
970: * .. again some perturbation (a "background noise") is added
971: * to drown denormals
972: IF ( L2PERT ) THEN
973: * XSC = DSQRT(SMALL)
974: XSC = EPSLN / DBLE(N)
975: DO 1947 q = 1, NR
976: TEMP1 = XSC*DABS(A(q,q))
977: DO 1949 p = 1, NR
978: IF ( ( (p.GT.q) .AND. (DABS(A(p,q)).LE.TEMP1) )
979: & .OR. ( p .LT. q ) )
980: & A(p,q) = DSIGN( TEMP1, A(p,q) )
981: 1949 CONTINUE
982: 1947 CONTINUE
983: ELSE
984: CALL DLASET( 'U', NR-1, NR-1, ZERO, ZERO, A(1,2), LDA )
985: END IF
986: *
987: * .. and one-sided Jacobi rotations are started on a lower
988: * triangular matrix (plus perturbation which is ignored in
989: * the part which destroys triangular form (confusing?!))
990: *
991: CALL DGESVJ( 'L', 'NoU', 'NoV', NR, NR, A, LDA, SVA,
992: & N, V, LDV, WORK, LWORK, INFO )
993: *
994: SCALEM = WORK(1)
995: NUMRANK = IDNINT(WORK(2))
996: *
997: *
998: ELSE IF ( RSVEC .AND. ( .NOT. LSVEC ) ) THEN
999: *
1000: * -> Singular Values and Right Singular Vectors <-
1001: *
1002: IF ( ALMORT ) THEN
1003: *
1004: * .. in this case NR equals N
1005: DO 1998 p = 1, NR
1006: CALL DCOPY( N-p+1, A(p,p), LDA, V(p,p), 1 )
1007: 1998 CONTINUE
1008: CALL DLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV )
1009: *
1010: CALL DGESVJ( 'L','U','N', N, NR, V,LDV, SVA, NR, A,LDA,
1011: & WORK, LWORK, INFO )
1012: SCALEM = WORK(1)
1013: NUMRANK = IDNINT(WORK(2))
1014:
1015: ELSE
1016: *
1017: * .. two more QR factorizations ( one QRF is not enough, two require
1018: * accumulated product of Jacobi rotations, three are perfect )
1019: *
1020: CALL DLASET( 'Lower', NR-1, NR-1, ZERO, ZERO, A(2,1), LDA )
1021: CALL DGELQF( NR, N, A, LDA, WORK, WORK(N+1), LWORK-N, IERR)
1022: CALL DLACPY( 'Lower', NR, NR, A, LDA, V, LDV )
1023: CALL DLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV )
1024: CALL DGEQRF( NR, NR, V, LDV, WORK(N+1), WORK(2*N+1),
1025: & LWORK-2*N, IERR )
1026: DO 8998 p = 1, NR
1027: CALL DCOPY( NR-p+1, V(p,p), LDV, V(p,p), 1 )
1028: 8998 CONTINUE
1029: CALL DLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV )
1030: *
1031: CALL DGESVJ( 'Lower', 'U','N', NR, NR, V,LDV, SVA, NR, U,
1032: & LDU, WORK(N+1), LWORK, INFO )
1033: SCALEM = WORK(N+1)
1034: NUMRANK = IDNINT(WORK(N+2))
1035: IF ( NR .LT. N ) THEN
1036: CALL DLASET( 'A',N-NR, NR, ZERO,ZERO, V(NR+1,1), LDV )
1037: CALL DLASET( 'A',NR, N-NR, ZERO,ZERO, V(1,NR+1), LDV )
1038: CALL DLASET( 'A',N-NR,N-NR,ZERO,ONE, V(NR+1,NR+1), LDV )
1039: END IF
1040: *
1041: CALL DORMLQ( 'Left', 'Transpose', N, N, NR, A, LDA, WORK,
1042: & V, LDV, WORK(N+1), LWORK-N, IERR )
1043: *
1044: END IF
1045: *
1046: DO 8991 p = 1, N
1047: CALL DCOPY( N, V(p,1), LDV, A(IWORK(p),1), LDA )
1048: 8991 CONTINUE
1049: CALL DLACPY( 'All', N, N, A, LDA, V, LDV )
1050: *
1051: IF ( TRANSP ) THEN
1052: CALL DLACPY( 'All', N, N, V, LDV, U, LDU )
1053: END IF
1054: *
1055: ELSE IF ( LSVEC .AND. ( .NOT. RSVEC ) ) THEN
1056: *
1057: * .. Singular Values and Left Singular Vectors ..
1058: *
1059: * .. second preconditioning step to avoid need to accumulate
1060: * Jacobi rotations in the Jacobi iterations.
1061: DO 1965 p = 1, NR
1062: CALL DCOPY( N-p+1, A(p,p), LDA, U(p,p), 1 )
1063: 1965 CONTINUE
1064: CALL DLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, U(1,2), LDU )
1065: *
1066: CALL DGEQRF( N, NR, U, LDU, WORK(N+1), WORK(2*N+1),
1067: & LWORK-2*N, IERR )
1068: *
1069: DO 1967 p = 1, NR - 1
1070: CALL DCOPY( NR-p, U(p,p+1), LDU, U(p+1,p), 1 )
1071: 1967 CONTINUE
1072: CALL DLASET( 'Upper', NR-1, NR-1, ZERO, ZERO, U(1,2), LDU )
1073: *
1074: CALL DGESVJ( 'Lower', 'U', 'N', NR,NR, U, LDU, SVA, NR, A,
1075: & LDA, WORK(N+1), LWORK-N, INFO )
1076: SCALEM = WORK(N+1)
1077: NUMRANK = IDNINT(WORK(N+2))
1078: *
1079: IF ( NR .LT. M ) THEN
1080: CALL DLASET( 'A', M-NR, NR,ZERO, ZERO, U(NR+1,1), LDU )
1081: IF ( NR .LT. N1 ) THEN
1082: CALL DLASET( 'A',NR, N1-NR, ZERO, ZERO, U(1,NR+1), LDU )
1083: CALL DLASET( 'A',M-NR,N1-NR,ZERO,ONE,U(NR+1,NR+1), LDU )
1084: END IF
1085: END IF
1086: *
1087: CALL DORMQR( 'Left', 'No Tr', M, N1, N, A, LDA, WORK, U,
1088: & LDU, WORK(N+1), LWORK-N, IERR )
1089: *
1090: IF ( ROWPIV )
1091: & CALL DLASWP( N1, U, LDU, 1, M-1, IWORK(2*N+1), -1 )
1092: *
1093: DO 1974 p = 1, N1
1094: XSC = ONE / DNRM2( M, U(1,p), 1 )
1095: CALL DSCAL( M, XSC, U(1,p), 1 )
1096: 1974 CONTINUE
1097: *
1098: IF ( TRANSP ) THEN
1099: CALL DLACPY( 'All', N, N, U, LDU, V, LDV )
1100: END IF
1101: *
1102: ELSE
1103: *
1104: * .. Full SVD ..
1105: *
1106: IF ( .NOT. JRACC ) THEN
1107: *
1108: IF ( .NOT. ALMORT ) THEN
1109: *
1110: * Second Preconditioning Step (QRF [with pivoting])
1111: * Note that the composition of TRANSPOSE, QRF and TRANSPOSE is
1112: * equivalent to an LQF CALL. Since in many libraries the QRF
1113: * seems to be better optimized than the LQF, we do explicit
1114: * transpose and use the QRF. This is subject to changes in an
1115: * optimized implementation of DGEJSV.
1116: *
1117: DO 1968 p = 1, NR
1118: CALL DCOPY( N-p+1, A(p,p), LDA, V(p,p), 1 )
1119: 1968 CONTINUE
1120: *
1121: * .. the following two loops perturb small entries to avoid
1122: * denormals in the second QR factorization, where they are
1123: * as good as zeros. This is done to avoid painfully slow
1124: * computation with denormals. The relative size of the perturbation
1125: * is a parameter that can be changed by the implementer.
1126: * This perturbation device will be obsolete on machines with
1127: * properly implemented arithmetic.
1128: * To switch it off, set L2PERT=.FALSE. To remove it from the
1129: * code, remove the action under L2PERT=.TRUE., leave the ELSE part.
1130: * The following two loops should be blocked and fused with the
1131: * transposed copy above.
1132: *
1133: IF ( L2PERT ) THEN
1134: XSC = DSQRT(SMALL)
1135: DO 2969 q = 1, NR
1136: TEMP1 = XSC*DABS( V(q,q) )
1137: DO 2968 p = 1, N
1138: IF ( ( p .GT. q ) .AND. ( DABS(V(p,q)) .LE. TEMP1 )
1139: & .OR. ( p .LT. q ) )
1140: & V(p,q) = DSIGN( TEMP1, V(p,q) )
1141: IF ( p. LT. q ) V(p,q) = - V(p,q)
1142: 2968 CONTINUE
1143: 2969 CONTINUE
1144: ELSE
1145: CALL DLASET( 'U', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV )
1146: END IF
1147: *
1148: * Estimate the row scaled condition number of R1
1149: * (If R1 is rectangular, N > NR, then the condition number
1150: * of the leading NR x NR submatrix is estimated.)
1151: *
1152: CALL DLACPY( 'L', NR, NR, V, LDV, WORK(2*N+1), NR )
1153: DO 3950 p = 1, NR
1154: TEMP1 = DNRM2(NR-p+1,WORK(2*N+(p-1)*NR+p),1)
1155: CALL DSCAL(NR-p+1,ONE/TEMP1,WORK(2*N+(p-1)*NR+p),1)
1156: 3950 CONTINUE
1157: CALL DPOCON('Lower',NR,WORK(2*N+1),NR,ONE,TEMP1,
1158: & WORK(2*N+NR*NR+1),IWORK(M+2*N+1),IERR)
1159: CONDR1 = ONE / DSQRT(TEMP1)
1160: * .. here need a second oppinion on the condition number
1161: * .. then assume worst case scenario
1162: * R1 is OK for inverse <=> CONDR1 .LT. DBLE(N)
1163: * more conservative <=> CONDR1 .LT. DSQRT(DBLE(N))
1164: *
1165: COND_OK = DSQRT(DBLE(NR))
1166: *[TP] COND_OK is a tuning parameter.
1167:
1168: IF ( CONDR1 .LT. COND_OK ) THEN
1169: * .. the second QRF without pivoting. Note: in an optimized
1170: * implementation, this QRF should be implemented as the QRF
1171: * of a lower triangular matrix.
1172: * R1^t = Q2 * R2
1173: CALL DGEQRF( N, NR, V, LDV, WORK(N+1), WORK(2*N+1),
1174: & LWORK-2*N, IERR )
1175: *
1176: IF ( L2PERT ) THEN
1177: XSC = DSQRT(SMALL)/EPSLN
1178: DO 3959 p = 2, NR
1179: DO 3958 q = 1, p - 1
1180: TEMP1 = XSC * DMIN1(DABS(V(p,p)),DABS(V(q,q)))
1181: IF ( DABS(V(q,p)) .LE. TEMP1 )
1182: & V(q,p) = DSIGN( TEMP1, V(q,p) )
1183: 3958 CONTINUE
1184: 3959 CONTINUE
1185: END IF
1186: *
1187: IF ( NR .NE. N )
1188: * .. save ...
1189: & CALL DLACPY( 'A', N, NR, V, LDV, WORK(2*N+1), N )
1190: *
1191: * .. this transposed copy should be better than naive
1192: DO 1969 p = 1, NR - 1
1193: CALL DCOPY( NR-p, V(p,p+1), LDV, V(p+1,p), 1 )
1194: 1969 CONTINUE
1195: *
1196: CONDR2 = CONDR1
1197: *
1198: ELSE
1199: *
1200: * .. ill-conditioned case: second QRF with pivoting
1201: * Note that windowed pivoting would be equaly good
1202: * numerically, and more run-time efficient. So, in
1203: * an optimal implementation, the next call to DGEQP3
1204: * should be replaced with eg. CALL SGEQPX (ACM TOMS #782)
1205: * with properly (carefully) chosen parameters.
1206: *
1207: * R1^t * P2 = Q2 * R2
1208: DO 3003 p = 1, NR
1209: IWORK(N+p) = 0
1210: 3003 CONTINUE
1211: CALL DGEQP3( N, NR, V, LDV, IWORK(N+1), WORK(N+1),
1212: & WORK(2*N+1), LWORK-2*N, IERR )
1213: ** CALL DGEQRF( N, NR, V, LDV, WORK(N+1), WORK(2*N+1),
1214: ** & LWORK-2*N, IERR )
1215: IF ( L2PERT ) THEN
1216: XSC = DSQRT(SMALL)
1217: DO 3969 p = 2, NR
1218: DO 3968 q = 1, p - 1
1219: TEMP1 = XSC * DMIN1(DABS(V(p,p)),DABS(V(q,q)))
1220: IF ( DABS(V(q,p)) .LE. TEMP1 )
1221: & V(q,p) = DSIGN( TEMP1, V(q,p) )
1222: 3968 CONTINUE
1223: 3969 CONTINUE
1224: END IF
1225: *
1226: CALL DLACPY( 'A', N, NR, V, LDV, WORK(2*N+1), N )
1227: *
1228: IF ( L2PERT ) THEN
1229: XSC = DSQRT(SMALL)
1230: DO 8970 p = 2, NR
1231: DO 8971 q = 1, p - 1
1232: TEMP1 = XSC * DMIN1(DABS(V(p,p)),DABS(V(q,q)))
1233: V(p,q) = - DSIGN( TEMP1, V(q,p) )
1234: 8971 CONTINUE
1235: 8970 CONTINUE
1236: ELSE
1237: CALL DLASET( 'L',NR-1,NR-1,ZERO,ZERO,V(2,1),LDV )
1238: END IF
1239: * Now, compute R2 = L3 * Q3, the LQ factorization.
1240: CALL DGELQF( NR, NR, V, LDV, WORK(2*N+N*NR+1),
1241: & WORK(2*N+N*NR+NR+1), LWORK-2*N-N*NR-NR, IERR )
1242: * .. and estimate the condition number
1243: CALL DLACPY( 'L',NR,NR,V,LDV,WORK(2*N+N*NR+NR+1),NR )
1244: DO 4950 p = 1, NR
1245: TEMP1 = DNRM2( p, WORK(2*N+N*NR+NR+p), NR )
1246: CALL DSCAL( p, ONE/TEMP1, WORK(2*N+N*NR+NR+p), NR )
1247: 4950 CONTINUE
1248: CALL DPOCON( 'L',NR,WORK(2*N+N*NR+NR+1),NR,ONE,TEMP1,
1249: & WORK(2*N+N*NR+NR+NR*NR+1),IWORK(M+2*N+1),IERR )
1250: CONDR2 = ONE / DSQRT(TEMP1)
1251: *
1252: IF ( CONDR2 .GE. COND_OK ) THEN
1253: * .. save the Householder vectors used for Q3
1254: * (this overwrittes the copy of R2, as it will not be
1255: * needed in this branch, but it does not overwritte the
1256: * Huseholder vectors of Q2.).
1257: CALL DLACPY( 'U', NR, NR, V, LDV, WORK(2*N+1), N )
1258: * .. and the rest of the information on Q3 is in
1259: * WORK(2*N+N*NR+1:2*N+N*NR+N)
1260: END IF
1261: *
1262: END IF
1263: *
1264: IF ( L2PERT ) THEN
1265: XSC = DSQRT(SMALL)
1266: DO 4968 q = 2, NR
1267: TEMP1 = XSC * V(q,q)
1268: DO 4969 p = 1, q - 1
1269: * V(p,q) = - DSIGN( TEMP1, V(q,p) )
1270: V(p,q) = - DSIGN( TEMP1, V(p,q) )
1271: 4969 CONTINUE
1272: 4968 CONTINUE
1273: ELSE
1274: CALL DLASET( 'U', NR-1,NR-1, ZERO,ZERO, V(1,2), LDV )
1275: END IF
1276: *
1277: * Second preconditioning finished; continue with Jacobi SVD
1278: * The input matrix is lower trinagular.
1279: *
1280: * Recover the right singular vectors as solution of a well
1281: * conditioned triangular matrix equation.
1282: *
1283: IF ( CONDR1 .LT. COND_OK ) THEN
1284: *
1285: CALL DGESVJ( 'L','U','N',NR,NR,V,LDV,SVA,NR,U,
1286: & LDU,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,INFO )
1287: SCALEM = WORK(2*N+N*NR+NR+1)
1288: NUMRANK = IDNINT(WORK(2*N+N*NR+NR+2))
1289: DO 3970 p = 1, NR
1290: CALL DCOPY( NR, V(1,p), 1, U(1,p), 1 )
1291: CALL DSCAL( NR, SVA(p), V(1,p), 1 )
1292: 3970 CONTINUE
1293:
1294: * .. pick the right matrix equation and solve it
1295: *
1296: IF ( NR. EQ. N ) THEN
1297: * :)) .. best case, R1 is inverted. The solution of this matrix
1298: * equation is Q2*V2 = the product of the Jacobi rotations
1299: * used in DGESVJ, premultiplied with the orthogonal matrix
1300: * from the second QR factorization.
1301: CALL DTRSM( 'L','U','N','N', NR,NR,ONE, A,LDA, V,LDV )
1302: ELSE
1303: * .. R1 is well conditioned, but non-square. Transpose(R2)
1304: * is inverted to get the product of the Jacobi rotations
1305: * used in DGESVJ. The Q-factor from the second QR
1306: * factorization is then built in explicitly.
1307: CALL DTRSM('L','U','T','N',NR,NR,ONE,WORK(2*N+1),
1308: & N,V,LDV)
1309: IF ( NR .LT. N ) THEN
1310: CALL DLASET('A',N-NR,NR,ZERO,ZERO,V(NR+1,1),LDV)
1311: CALL DLASET('A',NR,N-NR,ZERO,ZERO,V(1,NR+1),LDV)
1312: CALL DLASET('A',N-NR,N-NR,ZERO,ONE,V(NR+1,NR+1),LDV)
1313: END IF
1314: CALL DORMQR('L','N',N,N,NR,WORK(2*N+1),N,WORK(N+1),
1315: & V,LDV,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR)
1316: END IF
1317: *
1318: ELSE IF ( CONDR2 .LT. COND_OK ) THEN
1319: *
1320: * :) .. the input matrix A is very likely a relative of
1321: * the Kahan matrix :)
1322: * The matrix R2 is inverted. The solution of the matrix equation
1323: * is Q3^T*V3 = the product of the Jacobi rotations (appplied to
1324: * the lower triangular L3 from the LQ factorization of
1325: * R2=L3*Q3), pre-multiplied with the transposed Q3.
1326: CALL DGESVJ( 'L', 'U', 'N', NR, NR, V, LDV, SVA, NR, U,
1327: & LDU, WORK(2*N+N*NR+NR+1), LWORK-2*N-N*NR-NR, INFO )
1328: SCALEM = WORK(2*N+N*NR+NR+1)
1329: NUMRANK = IDNINT(WORK(2*N+N*NR+NR+2))
1330: DO 3870 p = 1, NR
1331: CALL DCOPY( NR, V(1,p), 1, U(1,p), 1 )
1332: CALL DSCAL( NR, SVA(p), U(1,p), 1 )
1333: 3870 CONTINUE
1334: CALL DTRSM('L','U','N','N',NR,NR,ONE,WORK(2*N+1),N,U,LDU)
1335: * .. apply the permutation from the second QR factorization
1336: DO 873 q = 1, NR
1337: DO 872 p = 1, NR
1338: WORK(2*N+N*NR+NR+IWORK(N+p)) = U(p,q)
1339: 872 CONTINUE
1340: DO 874 p = 1, NR
1341: U(p,q) = WORK(2*N+N*NR+NR+p)
1342: 874 CONTINUE
1343: 873 CONTINUE
1344: IF ( NR .LT. N ) THEN
1345: CALL DLASET( 'A',N-NR,NR,ZERO,ZERO,V(NR+1,1),LDV )
1346: CALL DLASET( 'A',NR,N-NR,ZERO,ZERO,V(1,NR+1),LDV )
1347: CALL DLASET( 'A',N-NR,N-NR,ZERO,ONE,V(NR+1,NR+1),LDV )
1348: END IF
1349: CALL DORMQR( 'L','N',N,N,NR,WORK(2*N+1),N,WORK(N+1),
1350: & V,LDV,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR )
1351: ELSE
1352: * Last line of defense.
1353: * #:( This is a rather pathological case: no scaled condition
1354: * improvement after two pivoted QR factorizations. Other
1355: * possibility is that the rank revealing QR factorization
1356: * or the condition estimator has failed, or the COND_OK
1357: * is set very close to ONE (which is unnecessary). Normally,
1358: * this branch should never be executed, but in rare cases of
1359: * failure of the RRQR or condition estimator, the last line of
1360: * defense ensures that DGEJSV completes the task.
1361: * Compute the full SVD of L3 using DGESVJ with explicit
1362: * accumulation of Jacobi rotations.
1363: CALL DGESVJ( 'L', 'U', 'V', NR, NR, V, LDV, SVA, NR, U,
1364: & LDU, WORK(2*N+N*NR+NR+1), LWORK-2*N-N*NR-NR, INFO )
1365: SCALEM = WORK(2*N+N*NR+NR+1)
1366: NUMRANK = IDNINT(WORK(2*N+N*NR+NR+2))
1367: IF ( NR .LT. N ) THEN
1368: CALL DLASET( 'A',N-NR,NR,ZERO,ZERO,V(NR+1,1),LDV )
1369: CALL DLASET( 'A',NR,N-NR,ZERO,ZERO,V(1,NR+1),LDV )
1370: CALL DLASET( 'A',N-NR,N-NR,ZERO,ONE,V(NR+1,NR+1),LDV )
1371: END IF
1372: CALL DORMQR( 'L','N',N,N,NR,WORK(2*N+1),N,WORK(N+1),
1373: & V,LDV,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR )
1374: *
1375: CALL DORMLQ( 'L', 'T', NR, NR, NR, WORK(2*N+1), N,
1376: & WORK(2*N+N*NR+1), U, LDU, WORK(2*N+N*NR+NR+1),
1377: & LWORK-2*N-N*NR-NR, IERR )
1378: DO 773 q = 1, NR
1379: DO 772 p = 1, NR
1380: WORK(2*N+N*NR+NR+IWORK(N+p)) = U(p,q)
1381: 772 CONTINUE
1382: DO 774 p = 1, NR
1383: U(p,q) = WORK(2*N+N*NR+NR+p)
1384: 774 CONTINUE
1385: 773 CONTINUE
1386: *
1387: END IF
1388: *
1389: * Permute the rows of V using the (column) permutation from the
1390: * first QRF. Also, scale the columns to make them unit in
1391: * Euclidean norm. This applies to all cases.
1392: *
1393: TEMP1 = DSQRT(DBLE(N)) * EPSLN
1394: DO 1972 q = 1, N
1395: DO 972 p = 1, N
1396: WORK(2*N+N*NR+NR+IWORK(p)) = V(p,q)
1397: 972 CONTINUE
1398: DO 973 p = 1, N
1399: V(p,q) = WORK(2*N+N*NR+NR+p)
1400: 973 CONTINUE
1401: XSC = ONE / DNRM2( N, V(1,q), 1 )
1402: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) )
1403: & CALL DSCAL( N, XSC, V(1,q), 1 )
1404: 1972 CONTINUE
1405: * At this moment, V contains the right singular vectors of A.
1406: * Next, assemble the left singular vector matrix U (M x N).
1407: IF ( NR .LT. M ) THEN
1408: CALL DLASET( 'A', M-NR, NR, ZERO, ZERO, U(NR+1,1), LDU )
1409: IF ( NR .LT. N1 ) THEN
1410: CALL DLASET('A',NR,N1-NR,ZERO,ZERO,U(1,NR+1),LDU)
1411: CALL DLASET('A',M-NR,N1-NR,ZERO,ONE,U(NR+1,NR+1),LDU)
1412: END IF
1413: END IF
1414: *
1415: * The Q matrix from the first QRF is built into the left singular
1416: * matrix U. This applies to all cases.
1417: *
1418: CALL DORMQR( 'Left', 'No_Tr', M, N1, N, A, LDA, WORK, U,
1419: & LDU, WORK(N+1), LWORK-N, IERR )
1420:
1421: * The columns of U are normalized. The cost is O(M*N) flops.
1422: TEMP1 = DSQRT(DBLE(M)) * EPSLN
1423: DO 1973 p = 1, NR
1424: XSC = ONE / DNRM2( M, U(1,p), 1 )
1425: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) )
1426: & CALL DSCAL( M, XSC, U(1,p), 1 )
1427: 1973 CONTINUE
1428: *
1429: * If the initial QRF is computed with row pivoting, the left
1430: * singular vectors must be adjusted.
1431: *
1432: IF ( ROWPIV )
1433: & CALL DLASWP( N1, U, LDU, 1, M-1, IWORK(2*N+1), -1 )
1434: *
1435: ELSE
1436: *
1437: * .. the initial matrix A has almost orthogonal columns and
1438: * the second QRF is not needed
1439: *
1440: CALL DLACPY( 'Upper', N, N, A, LDA, WORK(N+1), N )
1441: IF ( L2PERT ) THEN
1442: XSC = DSQRT(SMALL)
1443: DO 5970 p = 2, N
1444: TEMP1 = XSC * WORK( N + (p-1)*N + p )
1445: DO 5971 q = 1, p - 1
1446: WORK(N+(q-1)*N+p)=-DSIGN(TEMP1,WORK(N+(p-1)*N+q))
1447: 5971 CONTINUE
1448: 5970 CONTINUE
1449: ELSE
1450: CALL DLASET( 'Lower',N-1,N-1,ZERO,ZERO,WORK(N+2),N )
1451: END IF
1452: *
1453: CALL DGESVJ( 'Upper', 'U', 'N', N, N, WORK(N+1), N, SVA,
1454: & N, U, LDU, WORK(N+N*N+1), LWORK-N-N*N, INFO )
1455: *
1456: SCALEM = WORK(N+N*N+1)
1457: NUMRANK = IDNINT(WORK(N+N*N+2))
1458: DO 6970 p = 1, N
1459: CALL DCOPY( N, WORK(N+(p-1)*N+1), 1, U(1,p), 1 )
1460: CALL DSCAL( N, SVA(p), WORK(N+(p-1)*N+1), 1 )
1461: 6970 CONTINUE
1462: *
1463: CALL DTRSM( 'Left', 'Upper', 'NoTrans', 'No UD', N, N,
1464: & ONE, A, LDA, WORK(N+1), N )
1465: DO 6972 p = 1, N
1466: CALL DCOPY( N, WORK(N+p), N, V(IWORK(p),1), LDV )
1467: 6972 CONTINUE
1468: TEMP1 = DSQRT(DBLE(N))*EPSLN
1469: DO 6971 p = 1, N
1470: XSC = ONE / DNRM2( N, V(1,p), 1 )
1471: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) )
1472: & CALL DSCAL( N, XSC, V(1,p), 1 )
1473: 6971 CONTINUE
1474: *
1475: * Assemble the left singular vector matrix U (M x N).
1476: *
1477: IF ( N .LT. M ) THEN
1478: CALL DLASET( 'A', M-N, N, ZERO, ZERO, U(NR+1,1), LDU )
1479: IF ( N .LT. N1 ) THEN
1480: CALL DLASET( 'A',N, N1-N, ZERO, ZERO, U(1,N+1),LDU )
1481: CALL DLASET( 'A',M-N,N1-N, ZERO, ONE,U(NR+1,N+1),LDU )
1482: END IF
1483: END IF
1484: CALL DORMQR( 'Left', 'No Tr', M, N1, N, A, LDA, WORK, U,
1485: & LDU, WORK(N+1), LWORK-N, IERR )
1486: TEMP1 = DSQRT(DBLE(M))*EPSLN
1487: DO 6973 p = 1, N1
1488: XSC = ONE / DNRM2( M, U(1,p), 1 )
1489: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) )
1490: & CALL DSCAL( M, XSC, U(1,p), 1 )
1491: 6973 CONTINUE
1492: *
1493: IF ( ROWPIV )
1494: & CALL DLASWP( N1, U, LDU, 1, M-1, IWORK(2*N+1), -1 )
1495: *
1496: END IF
1497: *
1498: * end of the >> almost orthogonal case << in the full SVD
1499: *
1500: ELSE
1501: *
1502: * This branch deploys a preconditioned Jacobi SVD with explicitly
1503: * accumulated rotations. It is included as optional, mainly for
1504: * experimental purposes. It does perfom well, and can also be used.
1505: * In this implementation, this branch will be automatically activated
1506: * if the condition number sigma_max(A) / sigma_min(A) is predicted
1507: * to be greater than the overflow threshold. This is because the
1508: * a posteriori computation of the singular vectors assumes robust
1509: * implementation of BLAS and some LAPACK procedures, capable of working
1510: * in presence of extreme values. Since that is not always the case, ...
1511: *
1512: DO 7968 p = 1, NR
1513: CALL DCOPY( N-p+1, A(p,p), LDA, V(p,p), 1 )
1514: 7968 CONTINUE
1515: *
1516: IF ( L2PERT ) THEN
1517: XSC = DSQRT(SMALL/EPSLN)
1518: DO 5969 q = 1, NR
1519: TEMP1 = XSC*DABS( V(q,q) )
1520: DO 5968 p = 1, N
1521: IF ( ( p .GT. q ) .AND. ( DABS(V(p,q)) .LE. TEMP1 )
1522: & .OR. ( p .LT. q ) )
1523: & V(p,q) = DSIGN( TEMP1, V(p,q) )
1524: IF ( p. LT. q ) V(p,q) = - V(p,q)
1525: 5968 CONTINUE
1526: 5969 CONTINUE
1527: ELSE
1528: CALL DLASET( 'U', NR-1, NR-1, ZERO, ZERO, V(1,2), LDV )
1529: END IF
1530:
1531: CALL DGEQRF( N, NR, V, LDV, WORK(N+1), WORK(2*N+1),
1532: & LWORK-2*N, IERR )
1533: CALL DLACPY( 'L', N, NR, V, LDV, WORK(2*N+1), N )
1534: *
1535: DO 7969 p = 1, NR
1536: CALL DCOPY( NR-p+1, V(p,p), LDV, U(p,p), 1 )
1537: 7969 CONTINUE
1538:
1539: IF ( L2PERT ) THEN
1540: XSC = DSQRT(SMALL/EPSLN)
1541: DO 9970 q = 2, NR
1542: DO 9971 p = 1, q - 1
1543: TEMP1 = XSC * DMIN1(DABS(U(p,p)),DABS(U(q,q)))
1544: U(p,q) = - DSIGN( TEMP1, U(q,p) )
1545: 9971 CONTINUE
1546: 9970 CONTINUE
1547: ELSE
1548: CALL DLASET('U', NR-1, NR-1, ZERO, ZERO, U(1,2), LDU )
1549: END IF
1550:
1551: CALL DGESVJ( 'G', 'U', 'V', NR, NR, U, LDU, SVA,
1552: & N, V, LDV, WORK(2*N+N*NR+1), LWORK-2*N-N*NR, INFO )
1553: SCALEM = WORK(2*N+N*NR+1)
1554: NUMRANK = IDNINT(WORK(2*N+N*NR+2))
1555:
1556: IF ( NR .LT. N ) THEN
1557: CALL DLASET( 'A',N-NR,NR,ZERO,ZERO,V(NR+1,1),LDV )
1558: CALL DLASET( 'A',NR,N-NR,ZERO,ZERO,V(1,NR+1),LDV )
1559: CALL DLASET( 'A',N-NR,N-NR,ZERO,ONE,V(NR+1,NR+1),LDV )
1560: END IF
1561:
1562: CALL DORMQR( 'L','N',N,N,NR,WORK(2*N+1),N,WORK(N+1),
1563: & V,LDV,WORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR )
1564: *
1565: * Permute the rows of V using the (column) permutation from the
1566: * first QRF. Also, scale the columns to make them unit in
1567: * Euclidean norm. This applies to all cases.
1568: *
1569: TEMP1 = DSQRT(DBLE(N)) * EPSLN
1570: DO 7972 q = 1, N
1571: DO 8972 p = 1, N
1572: WORK(2*N+N*NR+NR+IWORK(p)) = V(p,q)
1573: 8972 CONTINUE
1574: DO 8973 p = 1, N
1575: V(p,q) = WORK(2*N+N*NR+NR+p)
1576: 8973 CONTINUE
1577: XSC = ONE / DNRM2( N, V(1,q), 1 )
1578: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) )
1579: & CALL DSCAL( N, XSC, V(1,q), 1 )
1580: 7972 CONTINUE
1581: *
1582: * At this moment, V contains the right singular vectors of A.
1583: * Next, assemble the left singular vector matrix U (M x N).
1584: *
1585: IF ( N .LT. M ) THEN
1586: CALL DLASET( 'A', M-N, N, ZERO, ZERO, U(NR+1,1), LDU )
1587: IF ( N .LT. N1 ) THEN
1588: CALL DLASET( 'A',N, N1-N, ZERO, ZERO, U(1,N+1),LDU )
1589: CALL DLASET( 'A',M-N,N1-N, ZERO, ONE,U(NR+1,N+1),LDU )
1590: END IF
1591: END IF
1592: *
1593: CALL DORMQR( 'Left', 'No Tr', M, N1, N, A, LDA, WORK, U,
1594: & LDU, WORK(N+1), LWORK-N, IERR )
1595: *
1596: IF ( ROWPIV )
1597: & CALL DLASWP( N1, U, LDU, 1, M-1, IWORK(2*N+1), -1 )
1598: *
1599: *
1600: END IF
1601: IF ( TRANSP ) THEN
1602: * .. swap U and V because the procedure worked on A^t
1603: DO 6974 p = 1, N
1604: CALL DSWAP( N, U(1,p), 1, V(1,p), 1 )
1605: 6974 CONTINUE
1606: END IF
1607: *
1608: END IF
1609: * end of the full SVD
1610: *
1611: * Undo scaling, if necessary (and possible)
1612: *
1613: IF ( USCAL2 .LE. (BIG/SVA(1))*USCAL1 ) THEN
1614: CALL DLASCL( 'G', 0, 0, USCAL1, USCAL2, NR, 1, SVA, N, IERR )
1615: USCAL1 = ONE
1616: USCAL2 = ONE
1617: END IF
1618: *
1619: IF ( NR .LT. N ) THEN
1620: DO 3004 p = NR+1, N
1621: SVA(p) = ZERO
1622: 3004 CONTINUE
1623: END IF
1624: *
1625: WORK(1) = USCAL2 * SCALEM
1626: WORK(2) = USCAL1
1627: IF ( ERREST ) WORK(3) = SCONDA
1628: IF ( LSVEC .AND. RSVEC ) THEN
1629: WORK(4) = CONDR1
1630: WORK(5) = CONDR2
1631: END IF
1632: IF ( L2TRAN ) THEN
1633: WORK(6) = ENTRA
1634: WORK(7) = ENTRAT
1635: END IF
1636: *
1637: IWORK(1) = NR
1638: IWORK(2) = NUMRANK
1639: IWORK(3) = WARNING
1640: *
1641: RETURN
1642: * ..
1643: * .. END OF DGEJSV
1644: * ..
1645: END
1646: *
CVSweb interface <joel.bertrand@systella.fr>