Annotation of rpl/lapack/lapack/zgejsv.f, revision 1.7
1.6 bertrand 1: *> \brief \b ZGEJSV
2: *
3: * =========== DOCUMENTATION ===========
4: *
5: * Online html documentation available at
6: * http://www.netlib.org/lapack/explore-html/
7: *
8: *> \htmlonly
9: *> Download ZGEJSV + dependencies
10: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/zgejsv.f">
11: *> [TGZ]</a>
12: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/zgejsv.f">
13: *> [ZIP]</a>
14: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/zgejsv.f">
15: *> [TXT]</a>
16: *> \endhtmlonly
17: *
18: * Definition:
19: * ===========
20: *
21: * SUBROUTINE ZGEJSV( JOBA, JOBU, JOBV, JOBR, JOBT, JOBP,
22: * M, N, A, LDA, SVA, U, LDU, V, LDV,
23: * CWORK, LWORK, RWORK, LRWORK, IWORK, INFO )
24: *
25: * .. Scalar Arguments ..
26: * IMPLICIT NONE
27: * INTEGER INFO, LDA, LDU, LDV, LWORK, M, N
28: * ..
29: * .. Array Arguments ..
30: * COMPLEX*16 A( LDA, * ), U( LDU, * ), V( LDV, * ), CWORK( LWORK )
31: * DOUBLE PRECISION SVA( N ), RWORK( LRWORK )
32: * INTEGER IWORK( * )
33: * CHARACTER*1 JOBA, JOBP, JOBR, JOBT, JOBU, JOBV
34: * ..
35: *
36: *
37: *> \par Purpose:
38: * =============
39: *>
40: *> \verbatim
41: *>
42: *> ZGEJSV computes the singular value decomposition (SVD) of a complex M-by-N
43: *> matrix [A], where M >= N. The SVD of [A] is written as
44: *>
45: *> [A] = [U] * [SIGMA] * [V]^*,
46: *>
47: *> where [SIGMA] is an N-by-N (M-by-N) matrix which is zero except for its N
48: *> diagonal elements, [U] is an M-by-N (or M-by-M) unitary matrix, and
49: *> [V] is an N-by-N unitary matrix. The diagonal elements of [SIGMA] are
50: *> the singular values of [A]. The columns of [U] and [V] are the left and
51: *> the right singular vectors of [A], respectively. The matrices [U] and [V]
52: *> are computed and stored in the arrays U and V, respectively. The diagonal
53: *> of [SIGMA] is computed and stored in the array SVA.
54: *> \endverbatim
55: *>
56: *> Arguments:
57: *> ==========
58: *>
59: *> \param[in] JOBA
60: *> \verbatim
61: *> JOBA is CHARACTER*1
62: *> Specifies the level of accuracy:
63: *> = 'C': This option works well (high relative accuracy) if A = B * D,
64: *> with well-conditioned B and arbitrary diagonal matrix D.
65: *> The accuracy cannot be spoiled by COLUMN scaling. The
66: *> accuracy of the computed output depends on the condition of
67: *> B, and the procedure aims at the best theoretical accuracy.
68: *> The relative error max_{i=1:N}|d sigma_i| / sigma_i is
69: *> bounded by f(M,N)*epsilon* cond(B), independent of D.
70: *> The input matrix is preprocessed with the QRF with column
71: *> pivoting. This initial preprocessing and preconditioning by
72: *> a rank revealing QR factorization is common for all values of
73: *> JOBA. Additional actions are specified as follows:
74: *> = 'E': Computation as with 'C' with an additional estimate of the
75: *> condition number of B. It provides a realistic error bound.
76: *> = 'F': If A = D1 * C * D2 with ill-conditioned diagonal scalings
77: *> D1, D2, and well-conditioned matrix C, this option gives
78: *> higher accuracy than the 'C' option. If the structure of the
79: *> input matrix is not known, and relative accuracy is
80: *> desirable, then this option is advisable. The input matrix A
81: *> is preprocessed with QR factorization with FULL (row and
82: *> column) pivoting.
83: *> = 'G' Computation as with 'F' with an additional estimate of the
84: *> condition number of B, where A=B*D. If A has heavily weighted
85: *> rows, then using this condition number gives too pessimistic
86: *> error bound.
87: *> = 'A': Small singular values are not well determined by the data
88: *> and are considered as noisy; the matrix is treated as
89: *> numerically rank defficient. The error in the computed
90: *> singular values is bounded by f(m,n)*epsilon*||A||.
91: *> The computed SVD A = U * S * V^* restores A up to
92: *> f(m,n)*epsilon*||A||.
93: *> This gives the procedure the licence to discard (set to zero)
94: *> all singular values below N*epsilon*||A||.
95: *> = 'R': Similar as in 'A'. Rank revealing property of the initial
96: *> QR factorization is used do reveal (using triangular factor)
97: *> a gap sigma_{r+1} < epsilon * sigma_r in which case the
98: *> numerical RANK is declared to be r. The SVD is computed with
99: *> absolute error bounds, but more accurately than with 'A'.
100: *> \endverbatim
101: *>
102: *> \param[in] JOBU
103: *> \verbatim
104: *> JOBU is CHARACTER*1
105: *> Specifies whether to compute the columns of U:
106: *> = 'U': N columns of U are returned in the array U.
107: *> = 'F': full set of M left sing. vectors is returned in the array U.
108: *> = 'W': U may be used as workspace of length M*N. See the description
109: *> of U.
110: *> = 'N': U is not computed.
111: *> \endverbatim
112: *>
113: *> \param[in] JOBV
114: *> \verbatim
115: *> JOBV is CHARACTER*1
116: *> Specifies whether to compute the matrix V:
117: *> = 'V': N columns of V are returned in the array V; Jacobi rotations
118: *> are not explicitly accumulated.
119: *> = 'J': N columns of V are returned in the array V, but they are
120: *> computed as the product of Jacobi rotations, if JOBT .EQ. 'N'.
121: *> = 'W': V may be used as workspace of length N*N. See the description
122: *> of V.
123: *> = 'N': V is not computed.
124: *> \endverbatim
125: *>
126: *> \param[in] JOBR
127: *> \verbatim
128: *> JOBR is CHARACTER*1
129: *> Specifies the RANGE for the singular values. Issues the licence to
130: *> set to zero small positive singular values if they are outside
131: *> specified range. If A .NE. 0 is scaled so that the largest singular
132: *> value of c*A is around SQRT(BIG), BIG=DLAMCH('O'), then JOBR issues
133: *> the licence to kill columns of A whose norm in c*A is less than
134: *> SQRT(SFMIN) (for JOBR.EQ.'R'), or less than SMALL=SFMIN/EPSLN,
135: *> where SFMIN=DLAMCH('S'), EPSLN=DLAMCH('E').
136: *> = 'N': Do not kill small columns of c*A. This option assumes that
137: *> BLAS and QR factorizations and triangular solvers are
138: *> implemented to work in that range. If the condition of A
139: *> is greater than BIG, use ZGESVJ.
140: *> = 'R': RESTRICTED range for sigma(c*A) is [SQRT(SFMIN), SQRT(BIG)]
141: *> (roughly, as described above). This option is recommended.
142: *> ===========================
143: *> For computing the singular values in the FULL range [SFMIN,BIG]
144: *> use ZGESVJ.
145: *> \endverbatim
146: *>
147: *> \param[in] JOBT
148: *> \verbatim
149: *> JOBT is CHARACTER*1
150: *> If the matrix is square then the procedure may determine to use
151: *> transposed A if A^* seems to be better with respect to convergence.
152: *> If the matrix is not square, JOBT is ignored.
153: *> The decision is based on two values of entropy over the adjoint
154: *> orbit of A^* * A. See the descriptions of WORK(6) and WORK(7).
155: *> = 'T': transpose if entropy test indicates possibly faster
156: *> convergence of Jacobi process if A^* is taken as input. If A is
157: *> replaced with A^*, then the row pivoting is included automatically.
158: *> = 'N': do not speculate.
159: *> The option 'T' can be used to compute only the singular values, or
160: *> the full SVD (U, SIGMA and V). For only one set of singular vectors
161: *> (U or V), the caller should provide both U and V, as one of the
162: *> matrices is used as workspace if the matrix A is transposed.
163: *> The implementer can easily remove this constraint and make the
164: *> code more complicated. See the descriptions of U and V.
165: *> In general, this option is considered experimental, and 'N'; should
166: *> be preferred. This is subject to changes in the future.
167: *> \endverbatim
168: *>
169: *> \param[in] JOBP
170: *> \verbatim
171: *> JOBP is CHARACTER*1
172: *> Issues the licence to introduce structured perturbations to drown
173: *> denormalized numbers. This licence should be active if the
174: *> denormals are poorly implemented, causing slow computation,
175: *> especially in cases of fast convergence (!). For details see [1,2].
176: *> For the sake of simplicity, this perturbations are included only
177: *> when the full SVD or only the singular values are requested. The
178: *> implementer/user can easily add the perturbation for the cases of
179: *> computing one set of singular vectors.
180: *> = 'P': introduce perturbation
181: *> = 'N': do not perturb
182: *> \endverbatim
183: *>
184: *> \param[in] M
185: *> \verbatim
186: *> M is INTEGER
187: *> The number of rows of the input matrix A. M >= 0.
188: *> \endverbatim
189: *>
190: *> \param[in] N
191: *> \verbatim
192: *> N is INTEGER
193: *> The number of columns of the input matrix A. M >= N >= 0.
194: *> \endverbatim
195: *>
196: *> \param[in,out] A
197: *> \verbatim
198: *> A is COMPLEX*16 array, dimension (LDA,N)
199: *> On entry, the M-by-N matrix A.
200: *> \endverbatim
201: *>
202: *> \param[in] LDA
203: *> \verbatim
204: *> LDA is INTEGER
205: *> The leading dimension of the array A. LDA >= max(1,M).
206: *> \endverbatim
207: *>
208: *> \param[out] SVA
209: *> \verbatim
210: *> SVA is DOUBLE PRECISION array, dimension (N)
211: *> On exit,
212: *> - For WORK(1)/WORK(2) = ONE: The singular values of A. During the
213: *> computation SVA contains Euclidean column norms of the
214: *> iterated matrices in the array A.
215: *> - For WORK(1) .NE. WORK(2): The singular values of A are
216: *> (WORK(1)/WORK(2)) * SVA(1:N). This factored form is used if
217: *> sigma_max(A) overflows or if small singular values have been
218: *> saved from underflow by scaling the input matrix A.
219: *> - If JOBR='R' then some of the singular values may be returned
220: *> as exact zeros obtained by "set to zero" because they are
221: *> below the numerical rank threshold or are denormalized numbers.
222: *> \endverbatim
223: *>
224: *> \param[out] U
225: *> \verbatim
226: *> U is COMPLEX*16 array, dimension ( LDU, N )
227: *> If JOBU = 'U', then U contains on exit the M-by-N matrix of
228: *> the left singular vectors.
229: *> If JOBU = 'F', then U contains on exit the M-by-M matrix of
230: *> the left singular vectors, including an ONB
231: *> of the orthogonal complement of the Range(A).
232: *> If JOBU = 'W' .AND. (JOBV.EQ.'V' .AND. JOBT.EQ.'T' .AND. M.EQ.N),
233: *> then U is used as workspace if the procedure
234: *> replaces A with A^*. In that case, [V] is computed
235: *> in U as left singular vectors of A^* and then
236: *> copied back to the V array. This 'W' option is just
237: *> a reminder to the caller that in this case U is
238: *> reserved as workspace of length N*N.
239: *> If JOBU = 'N' U is not referenced, unless JOBT='T'.
240: *> \endverbatim
241: *>
242: *> \param[in] LDU
243: *> \verbatim
244: *> LDU is INTEGER
245: *> The leading dimension of the array U, LDU >= 1.
246: *> IF JOBU = 'U' or 'F' or 'W', then LDU >= M.
247: *> \endverbatim
248: *>
249: *> \param[out] V
250: *> \verbatim
251: *> V is COMPLEX*16 array, dimension ( LDV, N )
252: *> If JOBV = 'V', 'J' then V contains on exit the N-by-N matrix of
253: *> the right singular vectors;
254: *> If JOBV = 'W', AND (JOBU.EQ.'U' AND JOBT.EQ.'T' AND M.EQ.N),
255: *> then V is used as workspace if the pprocedure
256: *> replaces A with A^*. In that case, [U] is computed
257: *> in V as right singular vectors of A^* and then
258: *> copied back to the U array. This 'W' option is just
259: *> a reminder to the caller that in this case V is
260: *> reserved as workspace of length N*N.
261: *> If JOBV = 'N' V is not referenced, unless JOBT='T'.
262: *> \endverbatim
263: *>
264: *> \param[in] LDV
265: *> \verbatim
266: *> LDV is INTEGER
267: *> The leading dimension of the array V, LDV >= 1.
268: *> If JOBV = 'V' or 'J' or 'W', then LDV >= N.
269: *> \endverbatim
270: *>
271: *> \param[out] CWORK
272: *> \verbatim
273: *> CWORK is COMPLEX*16 array, dimension (MAX(2,LWORK))
274: *> If the call to ZGEJSV is a workspace query (indicated by LWORK=-1 or
275: *> LRWORK=-1), then on exit CWORK(1) contains the required length of
276: *> CWORK for the job parameters used in the call.
277: *> \endverbatim
278: *>
279: *> \param[in] LWORK
280: *> \verbatim
281: *> LWORK is INTEGER
282: *> Length of CWORK to confirm proper allocation of workspace.
283: *> LWORK depends on the job:
284: *>
285: *> 1. If only SIGMA is needed ( JOBU.EQ.'N', JOBV.EQ.'N' ) and
286: *> 1.1 .. no scaled condition estimate required (JOBA.NE.'E'.AND.JOBA.NE.'G'):
287: *> LWORK >= 2*N+1. This is the minimal requirement.
288: *> ->> For optimal performance (blocked code) the optimal value
289: *> is LWORK >= N + (N+1)*NB. Here NB is the optimal
290: *> block size for ZGEQP3 and ZGEQRF.
291: *> In general, optimal LWORK is computed as
292: *> LWORK >= max(N+LWORK(ZGEQP3),N+LWORK(ZGEQRF), LWORK(ZGESVJ)).
293: *> 1.2. .. an estimate of the scaled condition number of A is
294: *> required (JOBA='E', or 'G'). In this case, LWORK the minimal
295: *> requirement is LWORK >= N*N + 2*N.
296: *> ->> For optimal performance (blocked code) the optimal value
297: *> is LWORK >= max(N+(N+1)*NB, N*N+2*N)=N**2+2*N.
298: *> In general, the optimal length LWORK is computed as
299: *> LWORK >= max(N+LWORK(ZGEQP3),N+LWORK(ZGEQRF), LWORK(ZGESVJ),
300: *> N*N+LWORK(ZPOCON)).
301: *> 2. If SIGMA and the right singular vectors are needed (JOBV.EQ.'V'),
302: *> (JOBU.EQ.'N')
303: *> 2.1 .. no scaled condition estimate requested (JOBE.EQ.'N'):
304: *> -> the minimal requirement is LWORK >= 3*N.
305: *> -> For optimal performance,
306: *> LWORK >= max(N+(N+1)*NB, 2*N+N*NB)=2*N+N*NB,
307: *> where NB is the optimal block size for ZGEQP3, ZGEQRF, ZGELQ,
308: *> ZUNMLQ. In general, the optimal length LWORK is computed as
309: *> LWORK >= max(N+LWORK(ZGEQP3), N+LWORK(ZGESVJ),
310: *> N+LWORK(ZGELQF), 2*N+LWORK(ZGEQRF), N+LWORK(ZUNMLQ)).
311: *> 2.2 .. an estimate of the scaled condition number of A is
312: *> required (JOBA='E', or 'G').
313: *> -> the minimal requirement is LWORK >= 3*N.
314: *> -> For optimal performance,
315: *> LWORK >= max(N+(N+1)*NB, 2*N,2*N+N*NB)=2*N+N*NB,
316: *> where NB is the optimal block size for ZGEQP3, ZGEQRF, ZGELQ,
317: *> ZUNMLQ. In general, the optimal length LWORK is computed as
318: *> LWORK >= max(N+LWORK(ZGEQP3), LWORK(ZPOCON), N+LWORK(ZGESVJ),
319: *> N+LWORK(ZGELQF), 2*N+LWORK(ZGEQRF), N+LWORK(ZUNMLQ)).
320: *> 3. If SIGMA and the left singular vectors are needed
321: *> 3.1 .. no scaled condition estimate requested (JOBE.EQ.'N'):
322: *> -> the minimal requirement is LWORK >= 3*N.
323: *> -> For optimal performance:
324: *> if JOBU.EQ.'U' :: LWORK >= max(3*N, N+(N+1)*NB, 2*N+N*NB)=2*N+N*NB,
325: *> where NB is the optimal block size for ZGEQP3, ZGEQRF, ZUNMQR.
326: *> In general, the optimal length LWORK is computed as
327: *> LWORK >= max(N+LWORK(ZGEQP3), 2*N+LWORK(ZGEQRF), N+LWORK(ZUNMQR)).
328: *> 3.2 .. an estimate of the scaled condition number of A is
329: *> required (JOBA='E', or 'G').
330: *> -> the minimal requirement is LWORK >= 3*N.
331: *> -> For optimal performance:
332: *> if JOBU.EQ.'U' :: LWORK >= max(3*N, N+(N+1)*NB, 2*N+N*NB)=2*N+N*NB,
333: *> where NB is the optimal block size for ZGEQP3, ZGEQRF, ZUNMQR.
334: *> In general, the optimal length LWORK is computed as
335: *> LWORK >= max(N+LWORK(ZGEQP3),N+LWORK(ZPOCON),
336: *> 2*N+LWORK(ZGEQRF), N+LWORK(ZUNMQR)).
337: *> 4. If the full SVD is needed: (JOBU.EQ.'U' or JOBU.EQ.'F') and
338: *> 4.1. if JOBV.EQ.'V'
339: *> the minimal requirement is LWORK >= 5*N+2*N*N.
340: *> 4.2. if JOBV.EQ.'J' the minimal requirement is
341: *> LWORK >= 4*N+N*N.
342: *> In both cases, the allocated CWORK can accommodate blocked runs
343: *> of ZGEQP3, ZGEQRF, ZGELQF, SUNMQR, ZUNMLQ.
344: *>
345: *> If the call to ZGEJSV is a workspace query (indicated by LWORK=-1 or
346: *> LRWORK=-1), then on exit CWORK(1) contains the optimal and CWORK(2) contains the
347: *> minimal length of CWORK for the job parameters used in the call.
348: *> \endverbatim
349: *>
350: *> \param[out] RWORK
351: *> \verbatim
352: *> RWORK is DOUBLE PRECISION array, dimension (MAX(7,LWORK))
353: *> On exit,
354: *> RWORK(1) = Determines the scaling factor SCALE = RWORK(2) / RWORK(1)
355: *> such that SCALE*SVA(1:N) are the computed singular values
356: *> of A. (See the description of SVA().)
357: *> RWORK(2) = See the description of RWORK(1).
358: *> RWORK(3) = SCONDA is an estimate for the condition number of
359: *> column equilibrated A. (If JOBA .EQ. 'E' or 'G')
360: *> SCONDA is an estimate of SQRT(||(R^* * R)^(-1)||_1).
361: *> It is computed using SPOCON. It holds
362: *> N^(-1/4) * SCONDA <= ||R^(-1)||_2 <= N^(1/4) * SCONDA
363: *> where R is the triangular factor from the QRF of A.
364: *> However, if R is truncated and the numerical rank is
365: *> determined to be strictly smaller than N, SCONDA is
366: *> returned as -1, thus indicating that the smallest
367: *> singular values might be lost.
368: *>
369: *> If full SVD is needed, the following two condition numbers are
370: *> useful for the analysis of the algorithm. They are provied for
371: *> a developer/implementer who is familiar with the details of
372: *> the method.
373: *>
374: *> RWORK(4) = an estimate of the scaled condition number of the
375: *> triangular factor in the first QR factorization.
376: *> RWORK(5) = an estimate of the scaled condition number of the
377: *> triangular factor in the second QR factorization.
378: *> The following two parameters are computed if JOBT .EQ. 'T'.
379: *> They are provided for a developer/implementer who is familiar
380: *> with the details of the method.
381: *> RWORK(6) = the entropy of A^* * A :: this is the Shannon entropy
382: *> of diag(A^* * A) / Trace(A^* * A) taken as point in the
383: *> probability simplex.
384: *> RWORK(7) = the entropy of A * A^*. (See the description of RWORK(6).)
385: *> If the call to ZGEJSV is a workspace query (indicated by LWORK=-1 or
386: *> LRWORK=-1), then on exit RWORK(1) contains the required length of
387: *> RWORK for the job parameters used in the call.
388: *> \endverbatim
389: *>
390: *> \param[in] LRWORK
391: *> \verbatim
392: *> LRWORK is INTEGER
393: *> Length of RWORK to confirm proper allocation of workspace.
394: *> LRWORK depends on the job:
395: *>
396: *> 1. If only the singular values are requested i.e. if
397: *> LSAME(JOBU,'N') .AND. LSAME(JOBV,'N')
398: *> then:
399: *> 1.1. If LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G'),
400: *> then: LRWORK = max( 7, 2 * M ).
401: *> 1.2. Otherwise, LRWORK = max( 7, N ).
402: *> 2. If singular values with the right singular vectors are requested
403: *> i.e. if
404: *> (LSAME(JOBV,'V').OR.LSAME(JOBV,'J')) .AND.
405: *> .NOT.(LSAME(JOBU,'U').OR.LSAME(JOBU,'F'))
406: *> then:
407: *> 2.1. If LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G'),
408: *> then LRWORK = max( 7, 2 * M ).
409: *> 2.2. Otherwise, LRWORK = max( 7, N ).
410: *> 3. If singular values with the left singular vectors are requested, i.e. if
411: *> (LSAME(JOBU,'U').OR.LSAME(JOBU,'F')) .AND.
412: *> .NOT.(LSAME(JOBV,'V').OR.LSAME(JOBV,'J'))
413: *> then:
414: *> 3.1. If LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G'),
415: *> then LRWORK = max( 7, 2 * M ).
416: *> 3.2. Otherwise, LRWORK = max( 7, N ).
417: *> 4. If singular values with both the left and the right singular vectors
418: *> are requested, i.e. if
419: *> (LSAME(JOBU,'U').OR.LSAME(JOBU,'F')) .AND.
420: *> (LSAME(JOBV,'V').OR.LSAME(JOBV,'J'))
421: *> then:
422: *> 4.1. If LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G'),
423: *> then LRWORK = max( 7, 2 * M ).
424: *> 4.2. Otherwise, LRWORK = max( 7, N ).
425: *>
426: *> If, on entry, LRWORK = -1 or LWORK=-1, a workspace query is assumed and
427: *> the length of RWORK is returned in RWORK(1).
428: *> \endverbatim
429: *>
430: *> \param[out] IWORK
431: *> \verbatim
432: *> IWORK is INTEGER array, of dimension at least 4, that further depends
433: *> on the job:
434: *>
435: *> 1. If only the singular values are requested then:
436: *> If ( LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G') )
437: *> then the length of IWORK is N+M; otherwise the length of IWORK is N.
438: *> 2. If the singular values and the right singular vectors are requested then:
439: *> If ( LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G') )
440: *> then the length of IWORK is N+M; otherwise the length of IWORK is N.
441: *> 3. If the singular values and the left singular vectors are requested then:
442: *> If ( LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G') )
443: *> then the length of IWORK is N+M; otherwise the length of IWORK is N.
444: *> 4. If the singular values with both the left and the right singular vectors
445: *> are requested, then:
446: *> 4.1. If LSAME(JOBV,'J') the length of IWORK is determined as follows:
447: *> If ( LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G') )
448: *> then the length of IWORK is N+M; otherwise the length of IWORK is N.
449: *> 4.2. If LSAME(JOBV,'V') the length of IWORK is determined as follows:
450: *> If ( LSAME(JOBT,'T') .OR. LSAME(JOBA,'F') .OR. LSAME(JOBA,'G') )
451: *> then the length of IWORK is 2*N+M; otherwise the length of IWORK is 2*N.
452: *>
453: *> On exit,
454: *> IWORK(1) = the numerical rank determined after the initial
455: *> QR factorization with pivoting. See the descriptions
456: *> of JOBA and JOBR.
457: *> IWORK(2) = the number of the computed nonzero singular values
458: *> IWORK(3) = if nonzero, a warning message:
459: *> If IWORK(3).EQ.1 then some of the column norms of A
460: *> were denormalized floats. The requested high accuracy
461: *> is not warranted by the data.
462: *> IWORK(4) = 1 or -1. If IWORK(4) .EQ. 1, then the procedure used A^* to
463: *> do the job as specified by the JOB parameters.
464: *> If the call to ZGEJSV is a workspace query (indicated by LWORK .EQ. -1 or
465: *> LRWORK .EQ. -1), then on exit IWORK(1) contains the required length of
466: *> IWORK for the job parameters used in the call.
467: *> \endverbatim
468: *>
469: *> \param[out] INFO
470: *> \verbatim
471: *> INFO is INTEGER
472: *> < 0 : if INFO = -i, then the i-th argument had an illegal value.
473: *> = 0 : successful exit;
474: *> > 0 : ZGEJSV did not converge in the maximal allowed number
475: *> of sweeps. The computed values may be inaccurate.
476: *> \endverbatim
477: *
478: * Authors:
479: * ========
480: *
481: *> \author Univ. of Tennessee
482: *> \author Univ. of California Berkeley
483: *> \author Univ. of Colorado Denver
484: *> \author NAG Ltd.
485: *
486: *> \date June 2016
487: *
488: *> \ingroup complex16GEsing
489: *
490: *> \par Further Details:
491: * =====================
492: *>
493: *> \verbatim
494: *>
495: *> ZGEJSV implements a preconditioned Jacobi SVD algorithm. It uses ZGEQP3,
496: *> ZGEQRF, and ZGELQF as preprocessors and preconditioners. Optionally, an
497: *> additional row pivoting can be used as a preprocessor, which in some
498: *> cases results in much higher accuracy. An example is matrix A with the
499: *> structure A = D1 * C * D2, where D1, D2 are arbitrarily ill-conditioned
500: *> diagonal matrices and C is well-conditioned matrix. In that case, complete
501: *> pivoting in the first QR factorizations provides accuracy dependent on the
502: *> condition number of C, and independent of D1, D2. Such higher accuracy is
503: *> not completely understood theoretically, but it works well in practice.
504: *> Further, if A can be written as A = B*D, with well-conditioned B and some
505: *> diagonal D, then the high accuracy is guaranteed, both theoretically and
506: *> in software, independent of D. For more details see [1], [2].
507: *> The computational range for the singular values can be the full range
508: *> ( UNDERFLOW,OVERFLOW ), provided that the machine arithmetic and the BLAS
509: *> & LAPACK routines called by ZGEJSV are implemented to work in that range.
510: *> If that is not the case, then the restriction for safe computation with
511: *> the singular values in the range of normalized IEEE numbers is that the
512: *> spectral condition number kappa(A)=sigma_max(A)/sigma_min(A) does not
513: *> overflow. This code (ZGEJSV) is best used in this restricted range,
514: *> meaning that singular values of magnitude below ||A||_2 / DLAMCH('O') are
515: *> returned as zeros. See JOBR for details on this.
516: *> Further, this implementation is somewhat slower than the one described
517: *> in [1,2] due to replacement of some non-LAPACK components, and because
518: *> the choice of some tuning parameters in the iterative part (ZGESVJ) is
519: *> left to the implementer on a particular machine.
520: *> The rank revealing QR factorization (in this code: ZGEQP3) should be
521: *> implemented as in [3]. We have a new version of ZGEQP3 under development
522: *> that is more robust than the current one in LAPACK, with a cleaner cut in
523: *> rank deficient cases. It will be available in the SIGMA library [4].
524: *> If M is much larger than N, it is obvious that the initial QRF with
525: *> column pivoting can be preprocessed by the QRF without pivoting. That
526: *> well known trick is not used in ZGEJSV because in some cases heavy row
527: *> weighting can be treated with complete pivoting. The overhead in cases
528: *> M much larger than N is then only due to pivoting, but the benefits in
529: *> terms of accuracy have prevailed. The implementer/user can incorporate
530: *> this extra QRF step easily. The implementer can also improve data movement
531: *> (matrix transpose, matrix copy, matrix transposed copy) - this
532: *> implementation of ZGEJSV uses only the simplest, naive data movement.
533: *> \endverbatim
534: *
535: *> \par Contributor:
536: * ==================
537: *>
538: *> Zlatko Drmac, Department of Mathematics, Faculty of Science,
539: *> University of Zagreb (Zagreb, Croatia); drmac@math.hr
540: *
541: *> \par References:
542: * ================
543: *>
544: *> \verbatim
545: *>
546: *> [1] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm I.
547: *> SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1322-1342.
548: *> LAPACK Working note 169.
549: *> [2] Z. Drmac and K. Veselic: New fast and accurate Jacobi SVD algorithm II.
550: *> SIAM J. Matrix Anal. Appl. Vol. 35, No. 2 (2008), pp. 1343-1362.
551: *> LAPACK Working note 170.
552: *> [3] Z. Drmac and Z. Bujanovic: On the failure of rank-revealing QR
553: *> factorization software - a case study.
554: *> ACM Trans. Math. Softw. Vol. 35, No 2 (2008), pp. 1-28.
555: *> LAPACK Working note 176.
556: *> [4] Z. Drmac: SIGMA - mathematical software library for accurate SVD, PSV,
557: *> QSVD, (H,K)-SVD computations.
558: *> Department of Mathematics, University of Zagreb, 2008, 2016.
559: *> \endverbatim
560: *
561: *> \par Bugs, examples and comments:
562: * =================================
563: *>
564: *> Please report all bugs and send interesting examples and/or comments to
565: *> drmac@math.hr. Thank you.
566: *>
567: * =====================================================================
568: SUBROUTINE ZGEJSV( JOBA, JOBU, JOBV, JOBR, JOBT, JOBP,
569: $ M, N, A, LDA, SVA, U, LDU, V, LDV,
570: $ CWORK, LWORK, RWORK, LRWORK, IWORK, INFO )
571: *
572: * -- LAPACK computational routine (version 3.7.1) --
573: * -- LAPACK is a software package provided by Univ. of Tennessee, --
574: * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
575: * June 2017
576: *
577: * .. Scalar Arguments ..
578: IMPLICIT NONE
579: INTEGER INFO, LDA, LDU, LDV, LWORK, LRWORK, M, N
580: * ..
581: * .. Array Arguments ..
582: COMPLEX*16 A( LDA, * ), U( LDU, * ), V( LDV, * ),
583: $ CWORK( LWORK )
584: DOUBLE PRECISION SVA( N ), RWORK( LRWORK )
585: INTEGER IWORK( * )
586: CHARACTER*1 JOBA, JOBP, JOBR, JOBT, JOBU, JOBV
587: * ..
588: *
589: * ===========================================================================
590: *
591: * .. Local Parameters ..
592: DOUBLE PRECISION ZERO, ONE
593: PARAMETER ( ZERO = 0.0D0, ONE = 1.0D0 )
594: COMPLEX*16 CZERO, CONE
595: PARAMETER ( CZERO = ( 0.0D0, 0.0D0 ), CONE = ( 1.0D0, 0.0D0 ) )
596: * ..
597: * .. Local Scalars ..
598: COMPLEX*16 CTEMP
599: DOUBLE PRECISION AAPP, AAQQ, AATMAX, AATMIN, BIG, BIG1,
600: $ COND_OK, CONDR1, CONDR2, ENTRA, ENTRAT, EPSLN,
601: $ MAXPRJ, SCALEM, SCONDA, SFMIN, SMALL, TEMP1,
602: $ USCAL1, USCAL2, XSC
603: INTEGER IERR, N1, NR, NUMRANK, p, q, WARNING
604: LOGICAL ALMORT, DEFR, ERREST, GOSCAL, JRACC, KILL, LQUERY,
605: $ LSVEC, L2ABER, L2KILL, L2PERT, L2RANK, L2TRAN, NOSCAL,
606: $ ROWPIV, RSVEC, TRANSP
607: *
608: INTEGER OPTWRK, MINWRK, MINRWRK, MINIWRK
609: INTEGER LWCON, LWLQF, LWQP3, LWQRF, LWUNMLQ, LWUNMQR, LWUNMQRM,
610: $ LWSVDJ, LWSVDJV, LRWQP3, LRWCON, LRWSVDJ, IWOFF
611: INTEGER LWRK_ZGELQF, LWRK_ZGEQP3, LWRK_ZGEQP3N, LWRK_ZGEQRF,
612: $ LWRK_ZGESVJ, LWRK_ZGESVJV, LWRK_ZGESVJU, LWRK_ZUNMLQ,
613: $ LWRK_ZUNMQR, LWRK_ZUNMQRM
614: * ..
615: * .. Local Arrays
616: COMPLEX*16 CDUMMY(1)
617: DOUBLE PRECISION RDUMMY(1)
618: *
619: * .. Intrinsic Functions ..
620: INTRINSIC ABS, DCMPLX, CONJG, DLOG, MAX, MIN, DBLE, NINT, SQRT
621: * ..
622: * .. External Functions ..
623: DOUBLE PRECISION DLAMCH, DZNRM2
624: INTEGER IDAMAX, IZAMAX
625: LOGICAL LSAME
626: EXTERNAL IDAMAX, IZAMAX, LSAME, DLAMCH, DZNRM2
627: * ..
628: * .. External Subroutines ..
629: EXTERNAL DLASSQ, ZCOPY, ZGELQF, ZGEQP3, ZGEQRF, ZLACPY, ZLAPMR,
630: $ ZLASCL, DLASCL, ZLASET, ZLASSQ, ZLASWP, ZUNGQR, ZUNMLQ,
631: $ ZUNMQR, ZPOCON, DSCAL, ZDSCAL, ZSWAP, ZTRSM, ZLACGV,
632: $ XERBLA
633: *
634: EXTERNAL ZGESVJ
635: * ..
636: *
637: * Test the input arguments
638: *
639: LSVEC = LSAME( JOBU, 'U' ) .OR. LSAME( JOBU, 'F' )
640: JRACC = LSAME( JOBV, 'J' )
641: RSVEC = LSAME( JOBV, 'V' ) .OR. JRACC
642: ROWPIV = LSAME( JOBA, 'F' ) .OR. LSAME( JOBA, 'G' )
643: L2RANK = LSAME( JOBA, 'R' )
644: L2ABER = LSAME( JOBA, 'A' )
645: ERREST = LSAME( JOBA, 'E' ) .OR. LSAME( JOBA, 'G' )
646: L2TRAN = LSAME( JOBT, 'T' ) .AND. ( M .EQ. N )
647: L2KILL = LSAME( JOBR, 'R' )
648: DEFR = LSAME( JOBR, 'N' )
649: L2PERT = LSAME( JOBP, 'P' )
650: *
651: LQUERY = ( LWORK .EQ. -1 ) .OR. ( LRWORK .EQ. -1 )
652: *
653: IF ( .NOT.(ROWPIV .OR. L2RANK .OR. L2ABER .OR.
654: $ ERREST .OR. LSAME( JOBA, 'C' ) )) THEN
655: INFO = - 1
656: ELSE IF ( .NOT.( LSVEC .OR. LSAME( JOBU, 'N' ) .OR.
657: $ ( LSAME( JOBU, 'W' ) .AND. RSVEC .AND. L2TRAN ) ) ) THEN
658: INFO = - 2
659: ELSE IF ( .NOT.( RSVEC .OR. LSAME( JOBV, 'N' ) .OR.
660: $ ( LSAME( JOBV, 'W' ) .AND. LSVEC .AND. L2TRAN ) ) ) THEN
661: INFO = - 3
662: ELSE IF ( .NOT. ( L2KILL .OR. DEFR ) ) THEN
663: INFO = - 4
664: ELSE IF ( .NOT. ( LSAME(JOBT,'T') .OR. LSAME(JOBT,'N') ) ) THEN
665: INFO = - 5
666: ELSE IF ( .NOT. ( L2PERT .OR. LSAME( JOBP, 'N' ) ) ) THEN
667: INFO = - 6
668: ELSE IF ( M .LT. 0 ) THEN
669: INFO = - 7
670: ELSE IF ( ( N .LT. 0 ) .OR. ( N .GT. M ) ) THEN
671: INFO = - 8
672: ELSE IF ( LDA .LT. M ) THEN
673: INFO = - 10
674: ELSE IF ( LSVEC .AND. ( LDU .LT. M ) ) THEN
675: INFO = - 13
676: ELSE IF ( RSVEC .AND. ( LDV .LT. N ) ) THEN
677: INFO = - 15
678: ELSE
679: * #:)
680: INFO = 0
681: END IF
682: *
683: IF ( INFO .EQ. 0 ) THEN
684: * .. compute the minimal and the optimal workspace lengths
685: * [[The expressions for computing the minimal and the optimal
686: * values of LCWORK, LRWORK are written with a lot of redundancy and
687: * can be simplified. However, this verbose form is useful for
688: * maintenance and modifications of the code.]]
689: *
690: * .. minimal workspace length for ZGEQP3 of an M x N matrix,
691: * ZGEQRF of an N x N matrix, ZGELQF of an N x N matrix,
692: * ZUNMLQ for computing N x N matrix, ZUNMQR for computing N x N
693: * matrix, ZUNMQR for computing M x N matrix, respectively.
694: LWQP3 = N+1
695: LWQRF = MAX( 1, N )
696: LWLQF = MAX( 1, N )
697: LWUNMLQ = MAX( 1, N )
698: LWUNMQR = MAX( 1, N )
699: LWUNMQRM = MAX( 1, M )
700: * .. minimal workspace length for ZPOCON of an N x N matrix
701: LWCON = 2 * N
702: * .. minimal workspace length for ZGESVJ of an N x N matrix,
703: * without and with explicit accumulation of Jacobi rotations
704: LWSVDJ = MAX( 2 * N, 1 )
705: LWSVDJV = MAX( 2 * N, 1 )
706: * .. minimal REAL workspace length for ZGEQP3, ZPOCON, ZGESVJ
707: LRWQP3 = N
708: LRWCON = N
709: LRWSVDJ = N
710: IF ( LQUERY ) THEN
711: CALL ZGEQP3( M, N, A, LDA, IWORK, CDUMMY, CDUMMY, -1,
712: $ RDUMMY, IERR )
713: LWRK_ZGEQP3 = CDUMMY(1)
714: CALL ZGEQRF( N, N, A, LDA, CDUMMY, CDUMMY,-1, IERR )
715: LWRK_ZGEQRF = CDUMMY(1)
716: CALL ZGELQF( N, N, A, LDA, CDUMMY, CDUMMY,-1, IERR )
717: LWRK_ZGELQF = CDUMMY(1)
718: END IF
719: MINWRK = 2
720: OPTWRK = 2
721: MINIWRK = N
722: IF ( .NOT. (LSVEC .OR. RSVEC ) ) THEN
723: * .. minimal and optimal sizes of the complex workspace if
724: * only the singular values are requested
725: IF ( ERREST ) THEN
726: MINWRK = MAX( N+LWQP3, N**2+LWCON, N+LWQRF, LWSVDJ )
727: ELSE
728: MINWRK = MAX( N+LWQP3, N+LWQRF, LWSVDJ )
729: END IF
730: IF ( LQUERY ) THEN
731: CALL ZGESVJ( 'L', 'N', 'N', N, N, A, LDA, SVA, N, V,
732: $ LDV, CDUMMY, -1, RDUMMY, -1, IERR )
733: LWRK_ZGESVJ = CDUMMY(1)
734: IF ( ERREST ) THEN
735: OPTWRK = MAX( N+LWRK_ZGEQP3, N**2+LWCON,
736: $ N+LWRK_ZGEQRF, LWRK_ZGESVJ )
737: ELSE
738: OPTWRK = MAX( N+LWRK_ZGEQP3, N+LWRK_ZGEQRF,
739: $ LWRK_ZGESVJ )
740: END IF
741: END IF
742: IF ( L2TRAN .OR. ROWPIV ) THEN
743: IF ( ERREST ) THEN
744: MINRWRK = MAX( 7, 2*M, LRWQP3, LRWCON, LRWSVDJ )
745: ELSE
746: MINRWRK = MAX( 7, 2*M, LRWQP3, LRWSVDJ )
747: END IF
748: ELSE
749: IF ( ERREST ) THEN
750: MINRWRK = MAX( 7, LRWQP3, LRWCON, LRWSVDJ )
751: ELSE
752: MINRWRK = MAX( 7, LRWQP3, LRWSVDJ )
753: END IF
754: END IF
755: IF ( ROWPIV .OR. L2TRAN ) MINIWRK = MINIWRK + M
756: ELSE IF ( RSVEC .AND. (.NOT.LSVEC) ) THEN
757: * .. minimal and optimal sizes of the complex workspace if the
758: * singular values and the right singular vectors are requested
759: IF ( ERREST ) THEN
760: MINWRK = MAX( N+LWQP3, LWCON, LWSVDJ, N+LWLQF,
761: $ 2*N+LWQRF, N+LWSVDJ, N+LWUNMLQ )
762: ELSE
763: MINWRK = MAX( N+LWQP3, LWSVDJ, N+LWLQF, 2*N+LWQRF,
764: $ N+LWSVDJ, N+LWUNMLQ )
765: END IF
766: IF ( LQUERY ) THEN
767: CALL ZGESVJ( 'L', 'U', 'N', N,N, U, LDU, SVA, N, A,
768: $ LDA, CDUMMY, -1, RDUMMY, -1, IERR )
769: LWRK_ZGESVJ = CDUMMY(1)
770: CALL ZUNMLQ( 'L', 'C', N, N, N, A, LDA, CDUMMY,
771: $ V, LDV, CDUMMY, -1, IERR )
772: LWRK_ZUNMLQ = CDUMMY(1)
773: IF ( ERREST ) THEN
774: OPTWRK = MAX( N+LWRK_ZGEQP3, LWCON, LWRK_ZGESVJ,
775: $ N+LWRK_ZGELQF, 2*N+LWRK_ZGEQRF,
776: $ N+LWRK_ZGESVJ, N+LWRK_ZUNMLQ )
777: ELSE
778: OPTWRK = MAX( N+LWRK_ZGEQP3, LWRK_ZGESVJ,N+LWRK_ZGELQF,
779: $ 2*N+LWRK_ZGEQRF, N+LWRK_ZGESVJ,
780: $ N+LWRK_ZUNMLQ )
781: END IF
782: END IF
783: IF ( L2TRAN .OR. ROWPIV ) THEN
784: IF ( ERREST ) THEN
785: MINRWRK = MAX( 7, 2*M, LRWQP3, LRWSVDJ, LRWCON )
786: ELSE
787: MINRWRK = MAX( 7, 2*M, LRWQP3, LRWSVDJ )
788: END IF
789: ELSE
790: IF ( ERREST ) THEN
791: MINRWRK = MAX( 7, LRWQP3, LRWSVDJ, LRWCON )
792: ELSE
793: MINRWRK = MAX( 7, LRWQP3, LRWSVDJ )
794: END IF
795: END IF
796: IF ( ROWPIV .OR. L2TRAN ) MINIWRK = MINIWRK + M
797: ELSE IF ( LSVEC .AND. (.NOT.RSVEC) ) THEN
798: * .. minimal and optimal sizes of the complex workspace if the
799: * singular values and the left singular vectors are requested
800: IF ( ERREST ) THEN
801: MINWRK = N + MAX( LWQP3,LWCON,N+LWQRF,LWSVDJ,LWUNMQRM )
802: ELSE
803: MINWRK = N + MAX( LWQP3, N+LWQRF, LWSVDJ, LWUNMQRM )
804: END IF
805: IF ( LQUERY ) THEN
806: CALL ZGESVJ( 'L', 'U', 'N', N,N, U, LDU, SVA, N, A,
807: $ LDA, CDUMMY, -1, RDUMMY, -1, IERR )
808: LWRK_ZGESVJ = CDUMMY(1)
809: CALL ZUNMQR( 'L', 'N', M, N, N, A, LDA, CDUMMY, U,
810: $ LDU, CDUMMY, -1, IERR )
811: LWRK_ZUNMQRM = CDUMMY(1)
812: IF ( ERREST ) THEN
813: OPTWRK = N + MAX( LWRK_ZGEQP3, LWCON, N+LWRK_ZGEQRF,
814: $ LWRK_ZGESVJ, LWRK_ZUNMQRM )
815: ELSE
816: OPTWRK = N + MAX( LWRK_ZGEQP3, N+LWRK_ZGEQRF,
817: $ LWRK_ZGESVJ, LWRK_ZUNMQRM )
818: END IF
819: END IF
820: IF ( L2TRAN .OR. ROWPIV ) THEN
821: IF ( ERREST ) THEN
822: MINRWRK = MAX( 7, 2*M, LRWQP3, LRWSVDJ, LRWCON )
823: ELSE
824: MINRWRK = MAX( 7, 2*M, LRWQP3, LRWSVDJ )
825: END IF
826: ELSE
827: IF ( ERREST ) THEN
828: MINRWRK = MAX( 7, LRWQP3, LRWSVDJ, LRWCON )
829: ELSE
830: MINRWRK = MAX( 7, LRWQP3, LRWSVDJ )
831: END IF
832: END IF
833: IF ( ROWPIV .OR. L2TRAN ) MINIWRK = MINIWRK + M
834: ELSE
835: * .. minimal and optimal sizes of the complex workspace if the
836: * full SVD is requested
837: IF ( .NOT. JRACC ) THEN
838: IF ( ERREST ) THEN
839: MINWRK = MAX( N+LWQP3, N+LWCON, 2*N+N**2+LWCON,
840: $ 2*N+LWQRF, 2*N+LWQP3,
841: $ 2*N+N**2+N+LWLQF, 2*N+N**2+N+N**2+LWCON,
842: $ 2*N+N**2+N+LWSVDJ, 2*N+N**2+N+LWSVDJV,
843: $ 2*N+N**2+N+LWUNMQR,2*N+N**2+N+LWUNMLQ,
844: $ N+N**2+LWSVDJ, N+LWUNMQRM )
845: ELSE
846: MINWRK = MAX( N+LWQP3, 2*N+N**2+LWCON,
847: $ 2*N+LWQRF, 2*N+LWQP3,
848: $ 2*N+N**2+N+LWLQF, 2*N+N**2+N+N**2+LWCON,
849: $ 2*N+N**2+N+LWSVDJ, 2*N+N**2+N+LWSVDJV,
850: $ 2*N+N**2+N+LWUNMQR,2*N+N**2+N+LWUNMLQ,
851: $ N+N**2+LWSVDJ, N+LWUNMQRM )
852: END IF
853: MINIWRK = MINIWRK + N
854: IF ( ROWPIV .OR. L2TRAN ) MINIWRK = MINIWRK + M
855: ELSE
856: IF ( ERREST ) THEN
857: MINWRK = MAX( N+LWQP3, N+LWCON, 2*N+LWQRF,
858: $ 2*N+N**2+LWSVDJV, 2*N+N**2+N+LWUNMQR,
859: $ N+LWUNMQRM )
860: ELSE
861: MINWRK = MAX( N+LWQP3, 2*N+LWQRF,
862: $ 2*N+N**2+LWSVDJV, 2*N+N**2+N+LWUNMQR,
863: $ N+LWUNMQRM )
864: END IF
865: IF ( ROWPIV .OR. L2TRAN ) MINIWRK = MINIWRK + M
866: END IF
867: IF ( LQUERY ) THEN
868: CALL ZUNMQR( 'L', 'N', M, N, N, A, LDA, CDUMMY, U,
869: $ LDU, CDUMMY, -1, IERR )
870: LWRK_ZUNMQRM = CDUMMY(1)
871: CALL ZUNMQR( 'L', 'N', N, N, N, A, LDA, CDUMMY, U,
872: $ LDU, CDUMMY, -1, IERR )
873: LWRK_ZUNMQR = CDUMMY(1)
874: IF ( .NOT. JRACC ) THEN
875: CALL ZGEQP3( N,N, A, LDA, IWORK, CDUMMY,CDUMMY, -1,
876: $ RDUMMY, IERR )
877: LWRK_ZGEQP3N = CDUMMY(1)
878: CALL ZGESVJ( 'L', 'U', 'N', N, N, U, LDU, SVA,
879: $ N, V, LDV, CDUMMY, -1, RDUMMY, -1, IERR )
880: LWRK_ZGESVJ = CDUMMY(1)
881: CALL ZGESVJ( 'U', 'U', 'N', N, N, U, LDU, SVA,
882: $ N, V, LDV, CDUMMY, -1, RDUMMY, -1, IERR )
883: LWRK_ZGESVJU = CDUMMY(1)
884: CALL ZGESVJ( 'L', 'U', 'V', N, N, U, LDU, SVA,
885: $ N, V, LDV, CDUMMY, -1, RDUMMY, -1, IERR )
886: LWRK_ZGESVJV = CDUMMY(1)
887: CALL ZUNMLQ( 'L', 'C', N, N, N, A, LDA, CDUMMY,
888: $ V, LDV, CDUMMY, -1, IERR )
889: LWRK_ZUNMLQ = CDUMMY(1)
890: IF ( ERREST ) THEN
891: OPTWRK = MAX( N+LWRK_ZGEQP3, N+LWCON,
892: $ 2*N+N**2+LWCON, 2*N+LWRK_ZGEQRF,
893: $ 2*N+LWRK_ZGEQP3N,
894: $ 2*N+N**2+N+LWRK_ZGELQF,
895: $ 2*N+N**2+N+N**2+LWCON,
896: $ 2*N+N**2+N+LWRK_ZGESVJ,
897: $ 2*N+N**2+N+LWRK_ZGESVJV,
898: $ 2*N+N**2+N+LWRK_ZUNMQR,
899: $ 2*N+N**2+N+LWRK_ZUNMLQ,
900: $ N+N**2+LWRK_ZGESVJU,
901: $ N+LWRK_ZUNMQRM )
902: ELSE
903: OPTWRK = MAX( N+LWRK_ZGEQP3,
904: $ 2*N+N**2+LWCON, 2*N+LWRK_ZGEQRF,
905: $ 2*N+LWRK_ZGEQP3N,
906: $ 2*N+N**2+N+LWRK_ZGELQF,
907: $ 2*N+N**2+N+N**2+LWCON,
908: $ 2*N+N**2+N+LWRK_ZGESVJ,
909: $ 2*N+N**2+N+LWRK_ZGESVJV,
910: $ 2*N+N**2+N+LWRK_ZUNMQR,
911: $ 2*N+N**2+N+LWRK_ZUNMLQ,
912: $ N+N**2+LWRK_ZGESVJU,
913: $ N+LWRK_ZUNMQRM )
914: END IF
915: ELSE
916: CALL ZGESVJ( 'L', 'U', 'V', N, N, U, LDU, SVA,
917: $ N, V, LDV, CDUMMY, -1, RDUMMY, -1, IERR )
918: LWRK_ZGESVJV = CDUMMY(1)
919: CALL ZUNMQR( 'L', 'N', N, N, N, CDUMMY, N, CDUMMY,
920: $ V, LDV, CDUMMY, -1, IERR )
921: LWRK_ZUNMQR = CDUMMY(1)
922: CALL ZUNMQR( 'L', 'N', M, N, N, A, LDA, CDUMMY, U,
923: $ LDU, CDUMMY, -1, IERR )
924: LWRK_ZUNMQRM = CDUMMY(1)
925: IF ( ERREST ) THEN
926: OPTWRK = MAX( N+LWRK_ZGEQP3, N+LWCON,
927: $ 2*N+LWRK_ZGEQRF, 2*N+N**2,
928: $ 2*N+N**2+LWRK_ZGESVJV,
929: $ 2*N+N**2+N+LWRK_ZUNMQR,N+LWRK_ZUNMQRM )
930: ELSE
931: OPTWRK = MAX( N+LWRK_ZGEQP3, 2*N+LWRK_ZGEQRF,
932: $ 2*N+N**2, 2*N+N**2+LWRK_ZGESVJV,
933: $ 2*N+N**2+N+LWRK_ZUNMQR,
934: $ N+LWRK_ZUNMQRM )
935: END IF
936: END IF
937: END IF
938: IF ( L2TRAN .OR. ROWPIV ) THEN
939: MINRWRK = MAX( 7, 2*M, LRWQP3, LRWSVDJ, LRWCON )
940: ELSE
941: MINRWRK = MAX( 7, LRWQP3, LRWSVDJ, LRWCON )
942: END IF
943: END IF
944: MINWRK = MAX( 2, MINWRK )
945: OPTWRK = MAX( 2, OPTWRK )
946: IF ( LWORK .LT. MINWRK .AND. (.NOT.LQUERY) ) INFO = - 17
947: IF ( LRWORK .LT. MINRWRK .AND. (.NOT.LQUERY) ) INFO = - 19
948: END IF
949: *
950: IF ( INFO .NE. 0 ) THEN
951: * #:(
952: CALL XERBLA( 'ZGEJSV', - INFO )
953: RETURN
954: ELSE IF ( LQUERY ) THEN
955: CWORK(1) = OPTWRK
956: CWORK(2) = MINWRK
957: RWORK(1) = MINRWRK
958: IWORK(1) = MAX( 4, MINIWRK )
959: RETURN
960: END IF
961: *
962: * Quick return for void matrix (Y3K safe)
963: * #:)
964: IF ( ( M .EQ. 0 ) .OR. ( N .EQ. 0 ) ) THEN
965: IWORK(1:4) = 0
966: RWORK(1:7) = 0
967: RETURN
968: ENDIF
969: *
970: * Determine whether the matrix U should be M x N or M x M
971: *
972: IF ( LSVEC ) THEN
973: N1 = N
974: IF ( LSAME( JOBU, 'F' ) ) N1 = M
975: END IF
976: *
977: * Set numerical parameters
978: *
979: *! NOTE: Make sure DLAMCH() does not fail on the target architecture.
980: *
981: EPSLN = DLAMCH('Epsilon')
982: SFMIN = DLAMCH('SafeMinimum')
983: SMALL = SFMIN / EPSLN
984: BIG = DLAMCH('O')
985: * BIG = ONE / SFMIN
986: *
987: * Initialize SVA(1:N) = diag( ||A e_i||_2 )_1^N
988: *
989: *(!) If necessary, scale SVA() to protect the largest norm from
990: * overflow. It is possible that this scaling pushes the smallest
991: * column norm left from the underflow threshold (extreme case).
992: *
993: SCALEM = ONE / SQRT(DBLE(M)*DBLE(N))
994: NOSCAL = .TRUE.
995: GOSCAL = .TRUE.
996: DO 1874 p = 1, N
997: AAPP = ZERO
998: AAQQ = ONE
999: CALL ZLASSQ( M, A(1,p), 1, AAPP, AAQQ )
1000: IF ( AAPP .GT. BIG ) THEN
1001: INFO = - 9
1002: CALL XERBLA( 'ZGEJSV', -INFO )
1003: RETURN
1004: END IF
1005: AAQQ = SQRT(AAQQ)
1006: IF ( ( AAPP .LT. (BIG / AAQQ) ) .AND. NOSCAL ) THEN
1007: SVA(p) = AAPP * AAQQ
1008: ELSE
1009: NOSCAL = .FALSE.
1010: SVA(p) = AAPP * ( AAQQ * SCALEM )
1011: IF ( GOSCAL ) THEN
1012: GOSCAL = .FALSE.
1013: CALL DSCAL( p-1, SCALEM, SVA, 1 )
1014: END IF
1015: END IF
1016: 1874 CONTINUE
1017: *
1018: IF ( NOSCAL ) SCALEM = ONE
1019: *
1020: AAPP = ZERO
1021: AAQQ = BIG
1022: DO 4781 p = 1, N
1023: AAPP = MAX( AAPP, SVA(p) )
1024: IF ( SVA(p) .NE. ZERO ) AAQQ = MIN( AAQQ, SVA(p) )
1025: 4781 CONTINUE
1026: *
1027: * Quick return for zero M x N matrix
1028: * #:)
1029: IF ( AAPP .EQ. ZERO ) THEN
1030: IF ( LSVEC ) CALL ZLASET( 'G', M, N1, CZERO, CONE, U, LDU )
1031: IF ( RSVEC ) CALL ZLASET( 'G', N, N, CZERO, CONE, V, LDV )
1032: RWORK(1) = ONE
1033: RWORK(2) = ONE
1034: IF ( ERREST ) RWORK(3) = ONE
1035: IF ( LSVEC .AND. RSVEC ) THEN
1036: RWORK(4) = ONE
1037: RWORK(5) = ONE
1038: END IF
1039: IF ( L2TRAN ) THEN
1040: RWORK(6) = ZERO
1041: RWORK(7) = ZERO
1042: END IF
1043: IWORK(1) = 0
1044: IWORK(2) = 0
1045: IWORK(3) = 0
1046: IWORK(4) = -1
1047: RETURN
1048: END IF
1049: *
1050: * Issue warning if denormalized column norms detected. Override the
1051: * high relative accuracy request. Issue licence to kill nonzero columns
1052: * (set them to zero) whose norm is less than sigma_max / BIG (roughly).
1053: * #:(
1054: WARNING = 0
1055: IF ( AAQQ .LE. SFMIN ) THEN
1056: L2RANK = .TRUE.
1057: L2KILL = .TRUE.
1058: WARNING = 1
1059: END IF
1060: *
1061: * Quick return for one-column matrix
1062: * #:)
1063: IF ( N .EQ. 1 ) THEN
1064: *
1065: IF ( LSVEC ) THEN
1066: CALL ZLASCL( 'G',0,0,SVA(1),SCALEM, M,1,A(1,1),LDA,IERR )
1067: CALL ZLACPY( 'A', M, 1, A, LDA, U, LDU )
1068: * computing all M left singular vectors of the M x 1 matrix
1069: IF ( N1 .NE. N ) THEN
1070: CALL ZGEQRF( M, N, U,LDU, CWORK, CWORK(N+1),LWORK-N,IERR )
1071: CALL ZUNGQR( M,N1,1, U,LDU,CWORK,CWORK(N+1),LWORK-N,IERR )
1072: CALL ZCOPY( M, A(1,1), 1, U(1,1), 1 )
1073: END IF
1074: END IF
1075: IF ( RSVEC ) THEN
1076: V(1,1) = CONE
1077: END IF
1078: IF ( SVA(1) .LT. (BIG*SCALEM) ) THEN
1079: SVA(1) = SVA(1) / SCALEM
1080: SCALEM = ONE
1081: END IF
1082: RWORK(1) = ONE / SCALEM
1083: RWORK(2) = ONE
1084: IF ( SVA(1) .NE. ZERO ) THEN
1085: IWORK(1) = 1
1086: IF ( ( SVA(1) / SCALEM) .GE. SFMIN ) THEN
1087: IWORK(2) = 1
1088: ELSE
1089: IWORK(2) = 0
1090: END IF
1091: ELSE
1092: IWORK(1) = 0
1093: IWORK(2) = 0
1094: END IF
1095: IWORK(3) = 0
1096: IWORK(4) = -1
1097: IF ( ERREST ) RWORK(3) = ONE
1098: IF ( LSVEC .AND. RSVEC ) THEN
1099: RWORK(4) = ONE
1100: RWORK(5) = ONE
1101: END IF
1102: IF ( L2TRAN ) THEN
1103: RWORK(6) = ZERO
1104: RWORK(7) = ZERO
1105: END IF
1106: RETURN
1107: *
1108: END IF
1109: *
1110: TRANSP = .FALSE.
1111: *
1112: AATMAX = -ONE
1113: AATMIN = BIG
1114: IF ( ROWPIV .OR. L2TRAN ) THEN
1115: *
1116: * Compute the row norms, needed to determine row pivoting sequence
1117: * (in the case of heavily row weighted A, row pivoting is strongly
1118: * advised) and to collect information needed to compare the
1119: * structures of A * A^* and A^* * A (in the case L2TRAN.EQ..TRUE.).
1120: *
1121: IF ( L2TRAN ) THEN
1122: DO 1950 p = 1, M
1123: XSC = ZERO
1124: TEMP1 = ONE
1125: CALL ZLASSQ( N, A(p,1), LDA, XSC, TEMP1 )
1126: * ZLASSQ gets both the ell_2 and the ell_infinity norm
1127: * in one pass through the vector
1128: RWORK(M+p) = XSC * SCALEM
1129: RWORK(p) = XSC * (SCALEM*SQRT(TEMP1))
1130: AATMAX = MAX( AATMAX, RWORK(p) )
1131: IF (RWORK(p) .NE. ZERO)
1132: $ AATMIN = MIN(AATMIN,RWORK(p))
1133: 1950 CONTINUE
1134: ELSE
1135: DO 1904 p = 1, M
1136: RWORK(M+p) = SCALEM*ABS( A(p,IZAMAX(N,A(p,1),LDA)) )
1137: AATMAX = MAX( AATMAX, RWORK(M+p) )
1138: AATMIN = MIN( AATMIN, RWORK(M+p) )
1139: 1904 CONTINUE
1140: END IF
1141: *
1142: END IF
1143: *
1144: * For square matrix A try to determine whether A^* would be better
1145: * input for the preconditioned Jacobi SVD, with faster convergence.
1146: * The decision is based on an O(N) function of the vector of column
1147: * and row norms of A, based on the Shannon entropy. This should give
1148: * the right choice in most cases when the difference actually matters.
1149: * It may fail and pick the slower converging side.
1150: *
1151: ENTRA = ZERO
1152: ENTRAT = ZERO
1153: IF ( L2TRAN ) THEN
1154: *
1155: XSC = ZERO
1156: TEMP1 = ONE
1157: CALL DLASSQ( N, SVA, 1, XSC, TEMP1 )
1158: TEMP1 = ONE / TEMP1
1159: *
1160: ENTRA = ZERO
1161: DO 1113 p = 1, N
1162: BIG1 = ( ( SVA(p) / XSC )**2 ) * TEMP1
1163: IF ( BIG1 .NE. ZERO ) ENTRA = ENTRA + BIG1 * DLOG(BIG1)
1164: 1113 CONTINUE
1165: ENTRA = - ENTRA / DLOG(DBLE(N))
1166: *
1167: * Now, SVA().^2/Trace(A^* * A) is a point in the probability simplex.
1168: * It is derived from the diagonal of A^* * A. Do the same with the
1169: * diagonal of A * A^*, compute the entropy of the corresponding
1170: * probability distribution. Note that A * A^* and A^* * A have the
1171: * same trace.
1172: *
1173: ENTRAT = ZERO
1174: DO 1114 p = 1, M
1175: BIG1 = ( ( RWORK(p) / XSC )**2 ) * TEMP1
1176: IF ( BIG1 .NE. ZERO ) ENTRAT = ENTRAT + BIG1 * DLOG(BIG1)
1177: 1114 CONTINUE
1178: ENTRAT = - ENTRAT / DLOG(DBLE(M))
1179: *
1180: * Analyze the entropies and decide A or A^*. Smaller entropy
1181: * usually means better input for the algorithm.
1182: *
1183: TRANSP = ( ENTRAT .LT. ENTRA )
1184: *
1185: * If A^* is better than A, take the adjoint of A. This is allowed
1186: * only for square matrices, M=N.
1187: IF ( TRANSP ) THEN
1188: * In an optimal implementation, this trivial transpose
1189: * should be replaced with faster transpose.
1190: DO 1115 p = 1, N - 1
1191: A(p,p) = CONJG(A(p,p))
1192: DO 1116 q = p + 1, N
1193: CTEMP = CONJG(A(q,p))
1194: A(q,p) = CONJG(A(p,q))
1195: A(p,q) = CTEMP
1196: 1116 CONTINUE
1197: 1115 CONTINUE
1198: A(N,N) = CONJG(A(N,N))
1199: DO 1117 p = 1, N
1200: RWORK(M+p) = SVA(p)
1201: SVA(p) = RWORK(p)
1202: * previously computed row 2-norms are now column 2-norms
1203: * of the transposed matrix
1204: 1117 CONTINUE
1205: TEMP1 = AAPP
1206: AAPP = AATMAX
1207: AATMAX = TEMP1
1208: TEMP1 = AAQQ
1209: AAQQ = AATMIN
1210: AATMIN = TEMP1
1211: KILL = LSVEC
1212: LSVEC = RSVEC
1213: RSVEC = KILL
1214: IF ( LSVEC ) N1 = N
1215: *
1216: ROWPIV = .TRUE.
1217: END IF
1218: *
1219: END IF
1220: * END IF L2TRAN
1221: *
1222: * Scale the matrix so that its maximal singular value remains less
1223: * than SQRT(BIG) -- the matrix is scaled so that its maximal column
1224: * has Euclidean norm equal to SQRT(BIG/N). The only reason to keep
1225: * SQRT(BIG) instead of BIG is the fact that ZGEJSV uses LAPACK and
1226: * BLAS routines that, in some implementations, are not capable of
1227: * working in the full interval [SFMIN,BIG] and that they may provoke
1228: * overflows in the intermediate results. If the singular values spread
1229: * from SFMIN to BIG, then ZGESVJ will compute them. So, in that case,
1230: * one should use ZGESVJ instead of ZGEJSV.
1231: * >> change in the April 2016 update: allow bigger range, i.e. the
1232: * largest column is allowed up to BIG/N and ZGESVJ will do the rest.
1233: BIG1 = SQRT( BIG )
1234: TEMP1 = SQRT( BIG / DBLE(N) )
1235: * TEMP1 = BIG/DBLE(N)
1236: *
1237: CALL DLASCL( 'G', 0, 0, AAPP, TEMP1, N, 1, SVA, N, IERR )
1238: IF ( AAQQ .GT. (AAPP * SFMIN) ) THEN
1239: AAQQ = ( AAQQ / AAPP ) * TEMP1
1240: ELSE
1241: AAQQ = ( AAQQ * TEMP1 ) / AAPP
1242: END IF
1243: TEMP1 = TEMP1 * SCALEM
1244: CALL ZLASCL( 'G', 0, 0, AAPP, TEMP1, M, N, A, LDA, IERR )
1245: *
1246: * To undo scaling at the end of this procedure, multiply the
1247: * computed singular values with USCAL2 / USCAL1.
1248: *
1249: USCAL1 = TEMP1
1250: USCAL2 = AAPP
1251: *
1252: IF ( L2KILL ) THEN
1253: * L2KILL enforces computation of nonzero singular values in
1254: * the restricted range of condition number of the initial A,
1255: * sigma_max(A) / sigma_min(A) approx. SQRT(BIG)/SQRT(SFMIN).
1256: XSC = SQRT( SFMIN )
1257: ELSE
1258: XSC = SMALL
1259: *
1260: * Now, if the condition number of A is too big,
1261: * sigma_max(A) / sigma_min(A) .GT. SQRT(BIG/N) * EPSLN / SFMIN,
1262: * as a precaution measure, the full SVD is computed using ZGESVJ
1263: * with accumulated Jacobi rotations. This provides numerically
1264: * more robust computation, at the cost of slightly increased run
1265: * time. Depending on the concrete implementation of BLAS and LAPACK
1266: * (i.e. how they behave in presence of extreme ill-conditioning) the
1267: * implementor may decide to remove this switch.
1268: IF ( ( AAQQ.LT.SQRT(SFMIN) ) .AND. LSVEC .AND. RSVEC ) THEN
1269: JRACC = .TRUE.
1270: END IF
1271: *
1272: END IF
1273: IF ( AAQQ .LT. XSC ) THEN
1274: DO 700 p = 1, N
1275: IF ( SVA(p) .LT. XSC ) THEN
1276: CALL ZLASET( 'A', M, 1, CZERO, CZERO, A(1,p), LDA )
1277: SVA(p) = ZERO
1278: END IF
1279: 700 CONTINUE
1280: END IF
1281: *
1282: * Preconditioning using QR factorization with pivoting
1283: *
1284: IF ( ROWPIV ) THEN
1285: * Optional row permutation (Bjoerck row pivoting):
1286: * A result by Cox and Higham shows that the Bjoerck's
1287: * row pivoting combined with standard column pivoting
1288: * has similar effect as Powell-Reid complete pivoting.
1289: * The ell-infinity norms of A are made nonincreasing.
1290: IF ( ( LSVEC .AND. RSVEC ) .AND. .NOT.( JRACC ) ) THEN
1291: IWOFF = 2*N
1292: ELSE
1293: IWOFF = N
1294: END IF
1295: DO 1952 p = 1, M - 1
1296: q = IDAMAX( M-p+1, RWORK(M+p), 1 ) + p - 1
1297: IWORK(IWOFF+p) = q
1298: IF ( p .NE. q ) THEN
1299: TEMP1 = RWORK(M+p)
1300: RWORK(M+p) = RWORK(M+q)
1301: RWORK(M+q) = TEMP1
1302: END IF
1303: 1952 CONTINUE
1304: CALL ZLASWP( N, A, LDA, 1, M-1, IWORK(IWOFF+1), 1 )
1305: END IF
1306: *
1307: * End of the preparation phase (scaling, optional sorting and
1308: * transposing, optional flushing of small columns).
1309: *
1310: * Preconditioning
1311: *
1312: * If the full SVD is needed, the right singular vectors are computed
1313: * from a matrix equation, and for that we need theoretical analysis
1314: * of the Businger-Golub pivoting. So we use ZGEQP3 as the first RR QRF.
1315: * In all other cases the first RR QRF can be chosen by other criteria
1316: * (eg speed by replacing global with restricted window pivoting, such
1317: * as in xGEQPX from TOMS # 782). Good results will be obtained using
1318: * xGEQPX with properly (!) chosen numerical parameters.
1319: * Any improvement of ZGEQP3 improves overal performance of ZGEJSV.
1320: *
1321: * A * P1 = Q1 * [ R1^* 0]^*:
1322: DO 1963 p = 1, N
1323: * .. all columns are free columns
1324: IWORK(p) = 0
1325: 1963 CONTINUE
1326: CALL ZGEQP3( M, N, A, LDA, IWORK, CWORK, CWORK(N+1), LWORK-N,
1327: $ RWORK, IERR )
1328: *
1329: * The upper triangular matrix R1 from the first QRF is inspected for
1330: * rank deficiency and possibilities for deflation, or possible
1331: * ill-conditioning. Depending on the user specified flag L2RANK,
1332: * the procedure explores possibilities to reduce the numerical
1333: * rank by inspecting the computed upper triangular factor. If
1334: * L2RANK or L2ABER are up, then ZGEJSV will compute the SVD of
1335: * A + dA, where ||dA|| <= f(M,N)*EPSLN.
1336: *
1337: NR = 1
1338: IF ( L2ABER ) THEN
1339: * Standard absolute error bound suffices. All sigma_i with
1340: * sigma_i < N*EPSLN*||A|| are flushed to zero. This is an
1341: * agressive enforcement of lower numerical rank by introducing a
1342: * backward error of the order of N*EPSLN*||A||.
1343: TEMP1 = SQRT(DBLE(N))*EPSLN
1344: DO 3001 p = 2, N
1345: IF ( ABS(A(p,p)) .GE. (TEMP1*ABS(A(1,1))) ) THEN
1346: NR = NR + 1
1347: ELSE
1348: GO TO 3002
1349: END IF
1350: 3001 CONTINUE
1351: 3002 CONTINUE
1352: ELSE IF ( L2RANK ) THEN
1353: * .. similarly as above, only slightly more gentle (less agressive).
1354: * Sudden drop on the diagonal of R1 is used as the criterion for
1355: * close-to-rank-deficient.
1356: TEMP1 = SQRT(SFMIN)
1357: DO 3401 p = 2, N
1358: IF ( ( ABS(A(p,p)) .LT. (EPSLN*ABS(A(p-1,p-1))) ) .OR.
1359: $ ( ABS(A(p,p)) .LT. SMALL ) .OR.
1360: $ ( L2KILL .AND. (ABS(A(p,p)) .LT. TEMP1) ) ) GO TO 3402
1361: NR = NR + 1
1362: 3401 CONTINUE
1363: 3402 CONTINUE
1364: *
1365: ELSE
1366: * The goal is high relative accuracy. However, if the matrix
1367: * has high scaled condition number the relative accuracy is in
1368: * general not feasible. Later on, a condition number estimator
1369: * will be deployed to estimate the scaled condition number.
1370: * Here we just remove the underflowed part of the triangular
1371: * factor. This prevents the situation in which the code is
1372: * working hard to get the accuracy not warranted by the data.
1373: TEMP1 = SQRT(SFMIN)
1374: DO 3301 p = 2, N
1375: IF ( ( ABS(A(p,p)) .LT. SMALL ) .OR.
1376: $ ( L2KILL .AND. (ABS(A(p,p)) .LT. TEMP1) ) ) GO TO 3302
1377: NR = NR + 1
1378: 3301 CONTINUE
1379: 3302 CONTINUE
1380: *
1381: END IF
1382: *
1383: ALMORT = .FALSE.
1384: IF ( NR .EQ. N ) THEN
1385: MAXPRJ = ONE
1386: DO 3051 p = 2, N
1387: TEMP1 = ABS(A(p,p)) / SVA(IWORK(p))
1388: MAXPRJ = MIN( MAXPRJ, TEMP1 )
1389: 3051 CONTINUE
1390: IF ( MAXPRJ**2 .GE. ONE - DBLE(N)*EPSLN ) ALMORT = .TRUE.
1391: END IF
1392: *
1393: *
1394: SCONDA = - ONE
1395: CONDR1 = - ONE
1396: CONDR2 = - ONE
1397: *
1398: IF ( ERREST ) THEN
1399: IF ( N .EQ. NR ) THEN
1400: IF ( RSVEC ) THEN
1401: * .. V is available as workspace
1402: CALL ZLACPY( 'U', N, N, A, LDA, V, LDV )
1403: DO 3053 p = 1, N
1404: TEMP1 = SVA(IWORK(p))
1405: CALL ZDSCAL( p, ONE/TEMP1, V(1,p), 1 )
1406: 3053 CONTINUE
1407: IF ( LSVEC )THEN
1408: CALL ZPOCON( 'U', N, V, LDV, ONE, TEMP1,
1409: $ CWORK(N+1), RWORK, IERR )
1410: ELSE
1411: CALL ZPOCON( 'U', N, V, LDV, ONE, TEMP1,
1412: $ CWORK, RWORK, IERR )
1413: END IF
1414: *
1415: ELSE IF ( LSVEC ) THEN
1416: * .. U is available as workspace
1417: CALL ZLACPY( 'U', N, N, A, LDA, U, LDU )
1418: DO 3054 p = 1, N
1419: TEMP1 = SVA(IWORK(p))
1420: CALL ZDSCAL( p, ONE/TEMP1, U(1,p), 1 )
1421: 3054 CONTINUE
1422: CALL ZPOCON( 'U', N, U, LDU, ONE, TEMP1,
1423: $ CWORK(N+1), RWORK, IERR )
1424: ELSE
1425: CALL ZLACPY( 'U', N, N, A, LDA, CWORK, N )
1426: *[] CALL ZLACPY( 'U', N, N, A, LDA, CWORK(N+1), N )
1427: * Change: here index shifted by N to the left, CWORK(1:N)
1428: * not needed for SIGMA only computation
1429: DO 3052 p = 1, N
1430: TEMP1 = SVA(IWORK(p))
1431: *[] CALL ZDSCAL( p, ONE/TEMP1, CWORK(N+(p-1)*N+1), 1 )
1432: CALL ZDSCAL( p, ONE/TEMP1, CWORK((p-1)*N+1), 1 )
1433: 3052 CONTINUE
1434: * .. the columns of R are scaled to have unit Euclidean lengths.
1435: *[] CALL ZPOCON( 'U', N, CWORK(N+1), N, ONE, TEMP1,
1436: *[] $ CWORK(N+N*N+1), RWORK, IERR )
1437: CALL ZPOCON( 'U', N, CWORK, N, ONE, TEMP1,
1438: $ CWORK(N*N+1), RWORK, IERR )
1439: *
1440: END IF
1441: IF ( TEMP1 .NE. ZERO ) THEN
1442: SCONDA = ONE / SQRT(TEMP1)
1443: ELSE
1444: SCONDA = - ONE
1445: END IF
1446: * SCONDA is an estimate of SQRT(||(R^* * R)^(-1)||_1).
1447: * N^(-1/4) * SCONDA <= ||R^(-1)||_2 <= N^(1/4) * SCONDA
1448: ELSE
1449: SCONDA = - ONE
1450: END IF
1451: END IF
1452: *
1453: L2PERT = L2PERT .AND. ( ABS( A(1,1)/A(NR,NR) ) .GT. SQRT(BIG1) )
1454: * If there is no violent scaling, artificial perturbation is not needed.
1455: *
1456: * Phase 3:
1457: *
1458: IF ( .NOT. ( RSVEC .OR. LSVEC ) ) THEN
1459: *
1460: * Singular Values only
1461: *
1462: * .. transpose A(1:NR,1:N)
1463: DO 1946 p = 1, MIN( N-1, NR )
1464: CALL ZCOPY( N-p, A(p,p+1), LDA, A(p+1,p), 1 )
1465: CALL ZLACGV( N-p+1, A(p,p), 1 )
1466: 1946 CONTINUE
1467: IF ( NR .EQ. N ) A(N,N) = CONJG(A(N,N))
1468: *
1469: * The following two DO-loops introduce small relative perturbation
1470: * into the strict upper triangle of the lower triangular matrix.
1471: * Small entries below the main diagonal are also changed.
1472: * This modification is useful if the computing environment does not
1473: * provide/allow FLUSH TO ZERO underflow, for it prevents many
1474: * annoying denormalized numbers in case of strongly scaled matrices.
1475: * The perturbation is structured so that it does not introduce any
1476: * new perturbation of the singular values, and it does not destroy
1477: * the job done by the preconditioner.
1478: * The licence for this perturbation is in the variable L2PERT, which
1479: * should be .FALSE. if FLUSH TO ZERO underflow is active.
1480: *
1481: IF ( .NOT. ALMORT ) THEN
1482: *
1483: IF ( L2PERT ) THEN
1484: * XSC = SQRT(SMALL)
1485: XSC = EPSLN / DBLE(N)
1486: DO 4947 q = 1, NR
1487: CTEMP = DCMPLX(XSC*ABS(A(q,q)),ZERO)
1488: DO 4949 p = 1, N
1489: IF ( ( (p.GT.q) .AND. (ABS(A(p,q)).LE.TEMP1) )
1490: $ .OR. ( p .LT. q ) )
1491: * $ A(p,q) = TEMP1 * ( A(p,q) / ABS(A(p,q)) )
1492: $ A(p,q) = CTEMP
1493: 4949 CONTINUE
1494: 4947 CONTINUE
1495: ELSE
1496: CALL ZLASET( 'U', NR-1,NR-1, CZERO,CZERO, A(1,2),LDA )
1497: END IF
1498: *
1499: * .. second preconditioning using the QR factorization
1500: *
1501: CALL ZGEQRF( N,NR, A,LDA, CWORK, CWORK(N+1),LWORK-N, IERR )
1502: *
1503: * .. and transpose upper to lower triangular
1504: DO 1948 p = 1, NR - 1
1505: CALL ZCOPY( NR-p, A(p,p+1), LDA, A(p+1,p), 1 )
1506: CALL ZLACGV( NR-p+1, A(p,p), 1 )
1507: 1948 CONTINUE
1508: *
1509: END IF
1510: *
1511: * Row-cyclic Jacobi SVD algorithm with column pivoting
1512: *
1513: * .. again some perturbation (a "background noise") is added
1514: * to drown denormals
1515: IF ( L2PERT ) THEN
1516: * XSC = SQRT(SMALL)
1517: XSC = EPSLN / DBLE(N)
1518: DO 1947 q = 1, NR
1519: CTEMP = DCMPLX(XSC*ABS(A(q,q)),ZERO)
1520: DO 1949 p = 1, NR
1521: IF ( ( (p.GT.q) .AND. (ABS(A(p,q)).LE.TEMP1) )
1522: $ .OR. ( p .LT. q ) )
1523: * $ A(p,q) = TEMP1 * ( A(p,q) / ABS(A(p,q)) )
1524: $ A(p,q) = CTEMP
1525: 1949 CONTINUE
1526: 1947 CONTINUE
1527: ELSE
1528: CALL ZLASET( 'U', NR-1, NR-1, CZERO, CZERO, A(1,2), LDA )
1529: END IF
1530: *
1531: * .. and one-sided Jacobi rotations are started on a lower
1532: * triangular matrix (plus perturbation which is ignored in
1533: * the part which destroys triangular form (confusing?!))
1534: *
1535: CALL ZGESVJ( 'L', 'N', 'N', NR, NR, A, LDA, SVA,
1536: $ N, V, LDV, CWORK, LWORK, RWORK, LRWORK, INFO )
1537: *
1538: SCALEM = RWORK(1)
1539: NUMRANK = NINT(RWORK(2))
1540: *
1541: *
1542: ELSE IF ( ( RSVEC .AND. ( .NOT. LSVEC ) .AND. ( .NOT. JRACC ) )
1543: $ .OR.
1544: $ ( JRACC .AND. ( .NOT. LSVEC ) .AND. ( NR .NE. N ) ) ) THEN
1545: *
1546: * -> Singular Values and Right Singular Vectors <-
1547: *
1548: IF ( ALMORT ) THEN
1549: *
1550: * .. in this case NR equals N
1551: DO 1998 p = 1, NR
1552: CALL ZCOPY( N-p+1, A(p,p), LDA, V(p,p), 1 )
1553: CALL ZLACGV( N-p+1, V(p,p), 1 )
1554: 1998 CONTINUE
1555: CALL ZLASET( 'U', NR-1,NR-1, CZERO, CZERO, V(1,2), LDV )
1556: *
1557: CALL ZGESVJ( 'L','U','N', N, NR, V, LDV, SVA, NR, A, LDA,
1558: $ CWORK, LWORK, RWORK, LRWORK, INFO )
1559: SCALEM = RWORK(1)
1560: NUMRANK = NINT(RWORK(2))
1561:
1562: ELSE
1563: *
1564: * .. two more QR factorizations ( one QRF is not enough, two require
1565: * accumulated product of Jacobi rotations, three are perfect )
1566: *
1567: CALL ZLASET( 'L', NR-1,NR-1, CZERO, CZERO, A(2,1), LDA )
1568: CALL ZGELQF( NR,N, A, LDA, CWORK, CWORK(N+1), LWORK-N, IERR)
1569: CALL ZLACPY( 'L', NR, NR, A, LDA, V, LDV )
1570: CALL ZLASET( 'U', NR-1,NR-1, CZERO, CZERO, V(1,2), LDV )
1571: CALL ZGEQRF( NR, NR, V, LDV, CWORK(N+1), CWORK(2*N+1),
1572: $ LWORK-2*N, IERR )
1573: DO 8998 p = 1, NR
1574: CALL ZCOPY( NR-p+1, V(p,p), LDV, V(p,p), 1 )
1575: CALL ZLACGV( NR-p+1, V(p,p), 1 )
1576: 8998 CONTINUE
1577: CALL ZLASET('U', NR-1, NR-1, CZERO, CZERO, V(1,2), LDV)
1578: *
1579: CALL ZGESVJ( 'L', 'U','N', NR, NR, V,LDV, SVA, NR, U,
1580: $ LDU, CWORK(N+1), LWORK-N, RWORK, LRWORK, INFO )
1581: SCALEM = RWORK(1)
1582: NUMRANK = NINT(RWORK(2))
1583: IF ( NR .LT. N ) THEN
1584: CALL ZLASET( 'A',N-NR, NR, CZERO,CZERO, V(NR+1,1), LDV )
1585: CALL ZLASET( 'A',NR, N-NR, CZERO,CZERO, V(1,NR+1), LDV )
1586: CALL ZLASET( 'A',N-NR,N-NR,CZERO,CONE, V(NR+1,NR+1),LDV )
1587: END IF
1588: *
1589: CALL ZUNMLQ( 'L', 'C', N, N, NR, A, LDA, CWORK,
1590: $ V, LDV, CWORK(N+1), LWORK-N, IERR )
1591: *
1592: END IF
1593: * .. permute the rows of V
1594: * DO 8991 p = 1, N
1595: * CALL ZCOPY( N, V(p,1), LDV, A(IWORK(p),1), LDA )
1596: * 8991 CONTINUE
1597: * CALL ZLACPY( 'All', N, N, A, LDA, V, LDV )
1598: CALL ZLAPMR( .FALSE., N, N, V, LDV, IWORK )
1599: *
1600: IF ( TRANSP ) THEN
1601: CALL ZLACPY( 'A', N, N, V, LDV, U, LDU )
1602: END IF
1603: *
1604: ELSE IF ( JRACC .AND. (.NOT. LSVEC) .AND. ( NR.EQ. N ) ) THEN
1605: *
1606: CALL ZLASET( 'L', N-1,N-1, CZERO, CZERO, A(2,1), LDA )
1607: *
1608: CALL ZGESVJ( 'U','N','V', N, N, A, LDA, SVA, N, V, LDV,
1609: $ CWORK, LWORK, RWORK, LRWORK, INFO )
1610: SCALEM = RWORK(1)
1611: NUMRANK = NINT(RWORK(2))
1612: CALL ZLAPMR( .FALSE., N, N, V, LDV, IWORK )
1613: *
1614: ELSE IF ( LSVEC .AND. ( .NOT. RSVEC ) ) THEN
1615: *
1616: * .. Singular Values and Left Singular Vectors ..
1617: *
1618: * .. second preconditioning step to avoid need to accumulate
1619: * Jacobi rotations in the Jacobi iterations.
1620: DO 1965 p = 1, NR
1621: CALL ZCOPY( N-p+1, A(p,p), LDA, U(p,p), 1 )
1622: CALL ZLACGV( N-p+1, U(p,p), 1 )
1623: 1965 CONTINUE
1624: CALL ZLASET( 'U', NR-1, NR-1, CZERO, CZERO, U(1,2), LDU )
1625: *
1626: CALL ZGEQRF( N, NR, U, LDU, CWORK(N+1), CWORK(2*N+1),
1627: $ LWORK-2*N, IERR )
1628: *
1629: DO 1967 p = 1, NR - 1
1630: CALL ZCOPY( NR-p, U(p,p+1), LDU, U(p+1,p), 1 )
1631: CALL ZLACGV( N-p+1, U(p,p), 1 )
1632: 1967 CONTINUE
1633: CALL ZLASET( 'U', NR-1, NR-1, CZERO, CZERO, U(1,2), LDU )
1634: *
1635: CALL ZGESVJ( 'L', 'U', 'N', NR,NR, U, LDU, SVA, NR, A,
1636: $ LDA, CWORK(N+1), LWORK-N, RWORK, LRWORK, INFO )
1637: SCALEM = RWORK(1)
1638: NUMRANK = NINT(RWORK(2))
1639: *
1640: IF ( NR .LT. M ) THEN
1641: CALL ZLASET( 'A', M-NR, NR,CZERO, CZERO, U(NR+1,1), LDU )
1642: IF ( NR .LT. N1 ) THEN
1643: CALL ZLASET( 'A',NR, N1-NR, CZERO, CZERO, U(1,NR+1),LDU )
1644: CALL ZLASET( 'A',M-NR,N1-NR,CZERO,CONE,U(NR+1,NR+1),LDU )
1645: END IF
1646: END IF
1647: *
1648: CALL ZUNMQR( 'L', 'N', M, N1, N, A, LDA, CWORK, U,
1649: $ LDU, CWORK(N+1), LWORK-N, IERR )
1650: *
1651: IF ( ROWPIV )
1652: $ CALL ZLASWP( N1, U, LDU, 1, M-1, IWORK(IWOFF+1), -1 )
1653: *
1654: DO 1974 p = 1, N1
1655: XSC = ONE / DZNRM2( M, U(1,p), 1 )
1656: CALL ZDSCAL( M, XSC, U(1,p), 1 )
1657: 1974 CONTINUE
1658: *
1659: IF ( TRANSP ) THEN
1660: CALL ZLACPY( 'A', N, N, U, LDU, V, LDV )
1661: END IF
1662: *
1663: ELSE
1664: *
1665: * .. Full SVD ..
1666: *
1667: IF ( .NOT. JRACC ) THEN
1668: *
1669: IF ( .NOT. ALMORT ) THEN
1670: *
1671: * Second Preconditioning Step (QRF [with pivoting])
1672: * Note that the composition of TRANSPOSE, QRF and TRANSPOSE is
1673: * equivalent to an LQF CALL. Since in many libraries the QRF
1674: * seems to be better optimized than the LQF, we do explicit
1675: * transpose and use the QRF. This is subject to changes in an
1676: * optimized implementation of ZGEJSV.
1677: *
1678: DO 1968 p = 1, NR
1679: CALL ZCOPY( N-p+1, A(p,p), LDA, V(p,p), 1 )
1680: CALL ZLACGV( N-p+1, V(p,p), 1 )
1681: 1968 CONTINUE
1682: *
1683: * .. the following two loops perturb small entries to avoid
1684: * denormals in the second QR factorization, where they are
1685: * as good as zeros. This is done to avoid painfully slow
1686: * computation with denormals. The relative size of the perturbation
1687: * is a parameter that can be changed by the implementer.
1688: * This perturbation device will be obsolete on machines with
1689: * properly implemented arithmetic.
1690: * To switch it off, set L2PERT=.FALSE. To remove it from the
1691: * code, remove the action under L2PERT=.TRUE., leave the ELSE part.
1692: * The following two loops should be blocked and fused with the
1693: * transposed copy above.
1694: *
1695: IF ( L2PERT ) THEN
1696: XSC = SQRT(SMALL)
1697: DO 2969 q = 1, NR
1698: CTEMP = DCMPLX(XSC*ABS( V(q,q) ),ZERO)
1699: DO 2968 p = 1, N
1700: IF ( ( p .GT. q ) .AND. ( ABS(V(p,q)) .LE. TEMP1 )
1701: $ .OR. ( p .LT. q ) )
1702: * $ V(p,q) = TEMP1 * ( V(p,q) / ABS(V(p,q)) )
1703: $ V(p,q) = CTEMP
1704: IF ( p .LT. q ) V(p,q) = - V(p,q)
1705: 2968 CONTINUE
1706: 2969 CONTINUE
1707: ELSE
1708: CALL ZLASET( 'U', NR-1, NR-1, CZERO, CZERO, V(1,2), LDV )
1709: END IF
1710: *
1711: * Estimate the row scaled condition number of R1
1712: * (If R1 is rectangular, N > NR, then the condition number
1713: * of the leading NR x NR submatrix is estimated.)
1714: *
1715: CALL ZLACPY( 'L', NR, NR, V, LDV, CWORK(2*N+1), NR )
1716: DO 3950 p = 1, NR
1717: TEMP1 = DZNRM2(NR-p+1,CWORK(2*N+(p-1)*NR+p),1)
1718: CALL ZDSCAL(NR-p+1,ONE/TEMP1,CWORK(2*N+(p-1)*NR+p),1)
1719: 3950 CONTINUE
1720: CALL ZPOCON('L',NR,CWORK(2*N+1),NR,ONE,TEMP1,
1721: $ CWORK(2*N+NR*NR+1),RWORK,IERR)
1722: CONDR1 = ONE / SQRT(TEMP1)
1723: * .. here need a second oppinion on the condition number
1724: * .. then assume worst case scenario
1725: * R1 is OK for inverse <=> CONDR1 .LT. DBLE(N)
1726: * more conservative <=> CONDR1 .LT. SQRT(DBLE(N))
1727: *
1728: COND_OK = SQRT(SQRT(DBLE(NR)))
1729: *[TP] COND_OK is a tuning parameter.
1730: *
1731: IF ( CONDR1 .LT. COND_OK ) THEN
1732: * .. the second QRF without pivoting. Note: in an optimized
1733: * implementation, this QRF should be implemented as the QRF
1734: * of a lower triangular matrix.
1735: * R1^* = Q2 * R2
1736: CALL ZGEQRF( N, NR, V, LDV, CWORK(N+1), CWORK(2*N+1),
1737: $ LWORK-2*N, IERR )
1738: *
1739: IF ( L2PERT ) THEN
1740: XSC = SQRT(SMALL)/EPSLN
1741: DO 3959 p = 2, NR
1742: DO 3958 q = 1, p - 1
1743: CTEMP=DCMPLX(XSC*MIN(ABS(V(p,p)),ABS(V(q,q))),
1744: $ ZERO)
1745: IF ( ABS(V(q,p)) .LE. TEMP1 )
1746: * $ V(q,p) = TEMP1 * ( V(q,p) / ABS(V(q,p)) )
1747: $ V(q,p) = CTEMP
1748: 3958 CONTINUE
1749: 3959 CONTINUE
1750: END IF
1751: *
1752: IF ( NR .NE. N )
1753: $ CALL ZLACPY( 'A', N, NR, V, LDV, CWORK(2*N+1), N )
1754: * .. save ...
1755: *
1756: * .. this transposed copy should be better than naive
1757: DO 1969 p = 1, NR - 1
1758: CALL ZCOPY( NR-p, V(p,p+1), LDV, V(p+1,p), 1 )
1759: CALL ZLACGV(NR-p+1, V(p,p), 1 )
1760: 1969 CONTINUE
1761: V(NR,NR)=CONJG(V(NR,NR))
1762: *
1763: CONDR2 = CONDR1
1764: *
1765: ELSE
1766: *
1767: * .. ill-conditioned case: second QRF with pivoting
1768: * Note that windowed pivoting would be equaly good
1769: * numerically, and more run-time efficient. So, in
1770: * an optimal implementation, the next call to ZGEQP3
1771: * should be replaced with eg. CALL ZGEQPX (ACM TOMS #782)
1772: * with properly (carefully) chosen parameters.
1773: *
1774: * R1^* * P2 = Q2 * R2
1775: DO 3003 p = 1, NR
1776: IWORK(N+p) = 0
1777: 3003 CONTINUE
1778: CALL ZGEQP3( N, NR, V, LDV, IWORK(N+1), CWORK(N+1),
1779: $ CWORK(2*N+1), LWORK-2*N, RWORK, IERR )
1780: ** CALL ZGEQRF( N, NR, V, LDV, CWORK(N+1), CWORK(2*N+1),
1781: ** $ LWORK-2*N, IERR )
1782: IF ( L2PERT ) THEN
1783: XSC = SQRT(SMALL)
1784: DO 3969 p = 2, NR
1785: DO 3968 q = 1, p - 1
1786: CTEMP=DCMPLX(XSC*MIN(ABS(V(p,p)),ABS(V(q,q))),
1787: $ ZERO)
1788: IF ( ABS(V(q,p)) .LE. TEMP1 )
1789: * $ V(q,p) = TEMP1 * ( V(q,p) / ABS(V(q,p)) )
1790: $ V(q,p) = CTEMP
1791: 3968 CONTINUE
1792: 3969 CONTINUE
1793: END IF
1794: *
1795: CALL ZLACPY( 'A', N, NR, V, LDV, CWORK(2*N+1), N )
1796: *
1797: IF ( L2PERT ) THEN
1798: XSC = SQRT(SMALL)
1799: DO 8970 p = 2, NR
1800: DO 8971 q = 1, p - 1
1801: CTEMP=DCMPLX(XSC*MIN(ABS(V(p,p)),ABS(V(q,q))),
1802: $ ZERO)
1803: * V(p,q) = - TEMP1*( V(q,p) / ABS(V(q,p)) )
1804: V(p,q) = - CTEMP
1805: 8971 CONTINUE
1806: 8970 CONTINUE
1807: ELSE
1808: CALL ZLASET( 'L',NR-1,NR-1,CZERO,CZERO,V(2,1),LDV )
1809: END IF
1810: * Now, compute R2 = L3 * Q3, the LQ factorization.
1811: CALL ZGELQF( NR, NR, V, LDV, CWORK(2*N+N*NR+1),
1812: $ CWORK(2*N+N*NR+NR+1), LWORK-2*N-N*NR-NR, IERR )
1813: * .. and estimate the condition number
1814: CALL ZLACPY( 'L',NR,NR,V,LDV,CWORK(2*N+N*NR+NR+1),NR )
1815: DO 4950 p = 1, NR
1816: TEMP1 = DZNRM2( p, CWORK(2*N+N*NR+NR+p), NR )
1817: CALL ZDSCAL( p, ONE/TEMP1, CWORK(2*N+N*NR+NR+p), NR )
1818: 4950 CONTINUE
1819: CALL ZPOCON( 'L',NR,CWORK(2*N+N*NR+NR+1),NR,ONE,TEMP1,
1820: $ CWORK(2*N+N*NR+NR+NR*NR+1),RWORK,IERR )
1821: CONDR2 = ONE / SQRT(TEMP1)
1822: *
1823: *
1824: IF ( CONDR2 .GE. COND_OK ) THEN
1825: * .. save the Householder vectors used for Q3
1826: * (this overwrittes the copy of R2, as it will not be
1827: * needed in this branch, but it does not overwritte the
1828: * Huseholder vectors of Q2.).
1829: CALL ZLACPY( 'U', NR, NR, V, LDV, CWORK(2*N+1), N )
1830: * .. and the rest of the information on Q3 is in
1831: * WORK(2*N+N*NR+1:2*N+N*NR+N)
1832: END IF
1833: *
1834: END IF
1835: *
1836: IF ( L2PERT ) THEN
1837: XSC = SQRT(SMALL)
1838: DO 4968 q = 2, NR
1839: CTEMP = XSC * V(q,q)
1840: DO 4969 p = 1, q - 1
1841: * V(p,q) = - TEMP1*( V(p,q) / ABS(V(p,q)) )
1842: V(p,q) = - CTEMP
1843: 4969 CONTINUE
1844: 4968 CONTINUE
1845: ELSE
1846: CALL ZLASET( 'U', NR-1,NR-1, CZERO,CZERO, V(1,2), LDV )
1847: END IF
1848: *
1849: * Second preconditioning finished; continue with Jacobi SVD
1850: * The input matrix is lower trinagular.
1851: *
1852: * Recover the right singular vectors as solution of a well
1853: * conditioned triangular matrix equation.
1854: *
1855: IF ( CONDR1 .LT. COND_OK ) THEN
1856: *
1857: CALL ZGESVJ( 'L','U','N',NR,NR,V,LDV,SVA,NR,U, LDU,
1858: $ CWORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,RWORK,
1859: $ LRWORK, INFO )
1860: SCALEM = RWORK(1)
1861: NUMRANK = NINT(RWORK(2))
1862: DO 3970 p = 1, NR
1863: CALL ZCOPY( NR, V(1,p), 1, U(1,p), 1 )
1864: CALL ZDSCAL( NR, SVA(p), V(1,p), 1 )
1865: 3970 CONTINUE
1866:
1867: * .. pick the right matrix equation and solve it
1868: *
1869: IF ( NR .EQ. N ) THEN
1870: * :)) .. best case, R1 is inverted. The solution of this matrix
1871: * equation is Q2*V2 = the product of the Jacobi rotations
1872: * used in ZGESVJ, premultiplied with the orthogonal matrix
1873: * from the second QR factorization.
1874: CALL ZTRSM('L','U','N','N', NR,NR,CONE, A,LDA, V,LDV)
1875: ELSE
1876: * .. R1 is well conditioned, but non-square. Adjoint of R2
1877: * is inverted to get the product of the Jacobi rotations
1878: * used in ZGESVJ. The Q-factor from the second QR
1879: * factorization is then built in explicitly.
1880: CALL ZTRSM('L','U','C','N',NR,NR,CONE,CWORK(2*N+1),
1881: $ N,V,LDV)
1882: IF ( NR .LT. N ) THEN
1883: CALL ZLASET('A',N-NR,NR,CZERO,CZERO,V(NR+1,1),LDV)
1884: CALL ZLASET('A',NR,N-NR,CZERO,CZERO,V(1,NR+1),LDV)
1885: CALL ZLASET('A',N-NR,N-NR,CZERO,CONE,V(NR+1,NR+1),LDV)
1886: END IF
1887: CALL ZUNMQR('L','N',N,N,NR,CWORK(2*N+1),N,CWORK(N+1),
1888: $ V,LDV,CWORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR)
1889: END IF
1890: *
1891: ELSE IF ( CONDR2 .LT. COND_OK ) THEN
1892: *
1893: * The matrix R2 is inverted. The solution of the matrix equation
1894: * is Q3^* * V3 = the product of the Jacobi rotations (appplied to
1895: * the lower triangular L3 from the LQ factorization of
1896: * R2=L3*Q3), pre-multiplied with the transposed Q3.
1897: CALL ZGESVJ( 'L', 'U', 'N', NR, NR, V, LDV, SVA, NR, U,
1898: $ LDU, CWORK(2*N+N*NR+NR+1), LWORK-2*N-N*NR-NR,
1899: $ RWORK, LRWORK, INFO )
1900: SCALEM = RWORK(1)
1901: NUMRANK = NINT(RWORK(2))
1902: DO 3870 p = 1, NR
1903: CALL ZCOPY( NR, V(1,p), 1, U(1,p), 1 )
1904: CALL ZDSCAL( NR, SVA(p), U(1,p), 1 )
1905: 3870 CONTINUE
1906: CALL ZTRSM('L','U','N','N',NR,NR,CONE,CWORK(2*N+1),N,
1907: $ U,LDU)
1908: * .. apply the permutation from the second QR factorization
1909: DO 873 q = 1, NR
1910: DO 872 p = 1, NR
1911: CWORK(2*N+N*NR+NR+IWORK(N+p)) = U(p,q)
1912: 872 CONTINUE
1913: DO 874 p = 1, NR
1914: U(p,q) = CWORK(2*N+N*NR+NR+p)
1915: 874 CONTINUE
1916: 873 CONTINUE
1917: IF ( NR .LT. N ) THEN
1918: CALL ZLASET( 'A',N-NR,NR,CZERO,CZERO,V(NR+1,1),LDV )
1919: CALL ZLASET( 'A',NR,N-NR,CZERO,CZERO,V(1,NR+1),LDV )
1920: CALL ZLASET('A',N-NR,N-NR,CZERO,CONE,V(NR+1,NR+1),LDV)
1921: END IF
1922: CALL ZUNMQR( 'L','N',N,N,NR,CWORK(2*N+1),N,CWORK(N+1),
1923: $ V,LDV,CWORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR )
1924: ELSE
1925: * Last line of defense.
1926: * #:( This is a rather pathological case: no scaled condition
1927: * improvement after two pivoted QR factorizations. Other
1928: * possibility is that the rank revealing QR factorization
1929: * or the condition estimator has failed, or the COND_OK
1930: * is set very close to ONE (which is unnecessary). Normally,
1931: * this branch should never be executed, but in rare cases of
1932: * failure of the RRQR or condition estimator, the last line of
1933: * defense ensures that ZGEJSV completes the task.
1934: * Compute the full SVD of L3 using ZGESVJ with explicit
1935: * accumulation of Jacobi rotations.
1936: CALL ZGESVJ( 'L', 'U', 'V', NR, NR, V, LDV, SVA, NR, U,
1937: $ LDU, CWORK(2*N+N*NR+NR+1), LWORK-2*N-N*NR-NR,
1938: $ RWORK, LRWORK, INFO )
1939: SCALEM = RWORK(1)
1940: NUMRANK = NINT(RWORK(2))
1941: IF ( NR .LT. N ) THEN
1942: CALL ZLASET( 'A',N-NR,NR,CZERO,CZERO,V(NR+1,1),LDV )
1943: CALL ZLASET( 'A',NR,N-NR,CZERO,CZERO,V(1,NR+1),LDV )
1944: CALL ZLASET('A',N-NR,N-NR,CZERO,CONE,V(NR+1,NR+1),LDV)
1945: END IF
1946: CALL ZUNMQR( 'L','N',N,N,NR,CWORK(2*N+1),N,CWORK(N+1),
1947: $ V,LDV,CWORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR )
1948: *
1949: CALL ZUNMLQ( 'L', 'C', NR, NR, NR, CWORK(2*N+1), N,
1950: $ CWORK(2*N+N*NR+1), U, LDU, CWORK(2*N+N*NR+NR+1),
1951: $ LWORK-2*N-N*NR-NR, IERR )
1952: DO 773 q = 1, NR
1953: DO 772 p = 1, NR
1954: CWORK(2*N+N*NR+NR+IWORK(N+p)) = U(p,q)
1955: 772 CONTINUE
1956: DO 774 p = 1, NR
1957: U(p,q) = CWORK(2*N+N*NR+NR+p)
1958: 774 CONTINUE
1959: 773 CONTINUE
1960: *
1961: END IF
1962: *
1963: * Permute the rows of V using the (column) permutation from the
1964: * first QRF. Also, scale the columns to make them unit in
1965: * Euclidean norm. This applies to all cases.
1966: *
1967: TEMP1 = SQRT(DBLE(N)) * EPSLN
1968: DO 1972 q = 1, N
1969: DO 972 p = 1, N
1970: CWORK(2*N+N*NR+NR+IWORK(p)) = V(p,q)
1971: 972 CONTINUE
1972: DO 973 p = 1, N
1973: V(p,q) = CWORK(2*N+N*NR+NR+p)
1974: 973 CONTINUE
1975: XSC = ONE / DZNRM2( N, V(1,q), 1 )
1976: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) )
1977: $ CALL ZDSCAL( N, XSC, V(1,q), 1 )
1978: 1972 CONTINUE
1979: * At this moment, V contains the right singular vectors of A.
1980: * Next, assemble the left singular vector matrix U (M x N).
1981: IF ( NR .LT. M ) THEN
1982: CALL ZLASET('A', M-NR, NR, CZERO, CZERO, U(NR+1,1), LDU)
1983: IF ( NR .LT. N1 ) THEN
1984: CALL ZLASET('A',NR,N1-NR,CZERO,CZERO,U(1,NR+1),LDU)
1985: CALL ZLASET('A',M-NR,N1-NR,CZERO,CONE,
1986: $ U(NR+1,NR+1),LDU)
1987: END IF
1988: END IF
1989: *
1990: * The Q matrix from the first QRF is built into the left singular
1991: * matrix U. This applies to all cases.
1992: *
1993: CALL ZUNMQR( 'L', 'N', M, N1, N, A, LDA, CWORK, U,
1994: $ LDU, CWORK(N+1), LWORK-N, IERR )
1995:
1996: * The columns of U are normalized. The cost is O(M*N) flops.
1997: TEMP1 = SQRT(DBLE(M)) * EPSLN
1998: DO 1973 p = 1, NR
1999: XSC = ONE / DZNRM2( M, U(1,p), 1 )
2000: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) )
2001: $ CALL ZDSCAL( M, XSC, U(1,p), 1 )
2002: 1973 CONTINUE
2003: *
2004: * If the initial QRF is computed with row pivoting, the left
2005: * singular vectors must be adjusted.
2006: *
2007: IF ( ROWPIV )
2008: $ CALL ZLASWP( N1, U, LDU, 1, M-1, IWORK(IWOFF+1), -1 )
2009: *
2010: ELSE
2011: *
2012: * .. the initial matrix A has almost orthogonal columns and
2013: * the second QRF is not needed
2014: *
2015: CALL ZLACPY( 'U', N, N, A, LDA, CWORK(N+1), N )
2016: IF ( L2PERT ) THEN
2017: XSC = SQRT(SMALL)
2018: DO 5970 p = 2, N
2019: CTEMP = XSC * CWORK( N + (p-1)*N + p )
2020: DO 5971 q = 1, p - 1
2021: * CWORK(N+(q-1)*N+p)=-TEMP1 * ( CWORK(N+(p-1)*N+q) /
2022: * $ ABS(CWORK(N+(p-1)*N+q)) )
2023: CWORK(N+(q-1)*N+p)=-CTEMP
2024: 5971 CONTINUE
2025: 5970 CONTINUE
2026: ELSE
2027: CALL ZLASET( 'L',N-1,N-1,CZERO,CZERO,CWORK(N+2),N )
2028: END IF
2029: *
2030: CALL ZGESVJ( 'U', 'U', 'N', N, N, CWORK(N+1), N, SVA,
2031: $ N, U, LDU, CWORK(N+N*N+1), LWORK-N-N*N, RWORK, LRWORK,
2032: $ INFO )
2033: *
2034: SCALEM = RWORK(1)
2035: NUMRANK = NINT(RWORK(2))
2036: DO 6970 p = 1, N
2037: CALL ZCOPY( N, CWORK(N+(p-1)*N+1), 1, U(1,p), 1 )
2038: CALL ZDSCAL( N, SVA(p), CWORK(N+(p-1)*N+1), 1 )
2039: 6970 CONTINUE
2040: *
2041: CALL ZTRSM( 'L', 'U', 'N', 'N', N, N,
2042: $ CONE, A, LDA, CWORK(N+1), N )
2043: DO 6972 p = 1, N
2044: CALL ZCOPY( N, CWORK(N+p), N, V(IWORK(p),1), LDV )
2045: 6972 CONTINUE
2046: TEMP1 = SQRT(DBLE(N))*EPSLN
2047: DO 6971 p = 1, N
2048: XSC = ONE / DZNRM2( N, V(1,p), 1 )
2049: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) )
2050: $ CALL ZDSCAL( N, XSC, V(1,p), 1 )
2051: 6971 CONTINUE
2052: *
2053: * Assemble the left singular vector matrix U (M x N).
2054: *
2055: IF ( N .LT. M ) THEN
2056: CALL ZLASET( 'A', M-N, N, CZERO, CZERO, U(N+1,1), LDU )
2057: IF ( N .LT. N1 ) THEN
2058: CALL ZLASET('A',N, N1-N, CZERO, CZERO, U(1,N+1),LDU)
2059: CALL ZLASET( 'A',M-N,N1-N, CZERO, CONE,U(N+1,N+1),LDU)
2060: END IF
2061: END IF
2062: CALL ZUNMQR( 'L', 'N', M, N1, N, A, LDA, CWORK, U,
2063: $ LDU, CWORK(N+1), LWORK-N, IERR )
2064: TEMP1 = SQRT(DBLE(M))*EPSLN
2065: DO 6973 p = 1, N1
2066: XSC = ONE / DZNRM2( M, U(1,p), 1 )
2067: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) )
2068: $ CALL ZDSCAL( M, XSC, U(1,p), 1 )
2069: 6973 CONTINUE
2070: *
2071: IF ( ROWPIV )
2072: $ CALL ZLASWP( N1, U, LDU, 1, M-1, IWORK(IWOFF+1), -1 )
2073: *
2074: END IF
2075: *
2076: * end of the >> almost orthogonal case << in the full SVD
2077: *
2078: ELSE
2079: *
2080: * This branch deploys a preconditioned Jacobi SVD with explicitly
2081: * accumulated rotations. It is included as optional, mainly for
2082: * experimental purposes. It does perfom well, and can also be used.
2083: * In this implementation, this branch will be automatically activated
2084: * if the condition number sigma_max(A) / sigma_min(A) is predicted
2085: * to be greater than the overflow threshold. This is because the
2086: * a posteriori computation of the singular vectors assumes robust
2087: * implementation of BLAS and some LAPACK procedures, capable of working
2088: * in presence of extreme values, e.g. when the singular values spread from
2089: * the underflow to the overflow threshold.
2090: *
2091: DO 7968 p = 1, NR
2092: CALL ZCOPY( N-p+1, A(p,p), LDA, V(p,p), 1 )
2093: CALL ZLACGV( N-p+1, V(p,p), 1 )
2094: 7968 CONTINUE
2095: *
2096: IF ( L2PERT ) THEN
2097: XSC = SQRT(SMALL/EPSLN)
2098: DO 5969 q = 1, NR
2099: CTEMP = DCMPLX(XSC*ABS( V(q,q) ),ZERO)
2100: DO 5968 p = 1, N
2101: IF ( ( p .GT. q ) .AND. ( ABS(V(p,q)) .LE. TEMP1 )
2102: $ .OR. ( p .LT. q ) )
2103: * $ V(p,q) = TEMP1 * ( V(p,q) / ABS(V(p,q)) )
2104: $ V(p,q) = CTEMP
2105: IF ( p .LT. q ) V(p,q) = - V(p,q)
2106: 5968 CONTINUE
2107: 5969 CONTINUE
2108: ELSE
2109: CALL ZLASET( 'U', NR-1, NR-1, CZERO, CZERO, V(1,2), LDV )
2110: END IF
2111:
2112: CALL ZGEQRF( N, NR, V, LDV, CWORK(N+1), CWORK(2*N+1),
2113: $ LWORK-2*N, IERR )
2114: CALL ZLACPY( 'L', N, NR, V, LDV, CWORK(2*N+1), N )
2115: *
2116: DO 7969 p = 1, NR
2117: CALL ZCOPY( NR-p+1, V(p,p), LDV, U(p,p), 1 )
2118: CALL ZLACGV( NR-p+1, U(p,p), 1 )
2119: 7969 CONTINUE
2120:
2121: IF ( L2PERT ) THEN
2122: XSC = SQRT(SMALL/EPSLN)
2123: DO 9970 q = 2, NR
2124: DO 9971 p = 1, q - 1
2125: CTEMP = DCMPLX(XSC * MIN(ABS(U(p,p)),ABS(U(q,q))),
2126: $ ZERO)
2127: * U(p,q) = - TEMP1 * ( U(q,p) / ABS(U(q,p)) )
2128: U(p,q) = - CTEMP
2129: 9971 CONTINUE
2130: 9970 CONTINUE
2131: ELSE
2132: CALL ZLASET('U', NR-1, NR-1, CZERO, CZERO, U(1,2), LDU )
2133: END IF
2134:
2135: CALL ZGESVJ( 'L', 'U', 'V', NR, NR, U, LDU, SVA,
2136: $ N, V, LDV, CWORK(2*N+N*NR+1), LWORK-2*N-N*NR,
2137: $ RWORK, LRWORK, INFO )
2138: SCALEM = RWORK(1)
2139: NUMRANK = NINT(RWORK(2))
2140:
2141: IF ( NR .LT. N ) THEN
2142: CALL ZLASET( 'A',N-NR,NR,CZERO,CZERO,V(NR+1,1),LDV )
2143: CALL ZLASET( 'A',NR,N-NR,CZERO,CZERO,V(1,NR+1),LDV )
2144: CALL ZLASET( 'A',N-NR,N-NR,CZERO,CONE,V(NR+1,NR+1),LDV )
2145: END IF
2146:
2147: CALL ZUNMQR( 'L','N',N,N,NR,CWORK(2*N+1),N,CWORK(N+1),
2148: $ V,LDV,CWORK(2*N+N*NR+NR+1),LWORK-2*N-N*NR-NR,IERR )
2149: *
2150: * Permute the rows of V using the (column) permutation from the
2151: * first QRF. Also, scale the columns to make them unit in
2152: * Euclidean norm. This applies to all cases.
2153: *
2154: TEMP1 = SQRT(DBLE(N)) * EPSLN
2155: DO 7972 q = 1, N
2156: DO 8972 p = 1, N
2157: CWORK(2*N+N*NR+NR+IWORK(p)) = V(p,q)
2158: 8972 CONTINUE
2159: DO 8973 p = 1, N
2160: V(p,q) = CWORK(2*N+N*NR+NR+p)
2161: 8973 CONTINUE
2162: XSC = ONE / DZNRM2( N, V(1,q), 1 )
2163: IF ( (XSC .LT. (ONE-TEMP1)) .OR. (XSC .GT. (ONE+TEMP1)) )
2164: $ CALL ZDSCAL( N, XSC, V(1,q), 1 )
2165: 7972 CONTINUE
2166: *
2167: * At this moment, V contains the right singular vectors of A.
2168: * Next, assemble the left singular vector matrix U (M x N).
2169: *
2170: IF ( NR .LT. M ) THEN
2171: CALL ZLASET( 'A', M-NR, NR, CZERO, CZERO, U(NR+1,1), LDU )
2172: IF ( NR .LT. N1 ) THEN
2173: CALL ZLASET('A',NR, N1-NR, CZERO, CZERO, U(1,NR+1),LDU)
2174: CALL ZLASET('A',M-NR,N1-NR, CZERO, CONE,U(NR+1,NR+1),LDU)
2175: END IF
2176: END IF
2177: *
2178: CALL ZUNMQR( 'L', 'N', M, N1, N, A, LDA, CWORK, U,
2179: $ LDU, CWORK(N+1), LWORK-N, IERR )
2180: *
2181: IF ( ROWPIV )
2182: $ CALL ZLASWP( N1, U, LDU, 1, M-1, IWORK(IWOFF+1), -1 )
2183: *
2184: *
2185: END IF
2186: IF ( TRANSP ) THEN
2187: * .. swap U and V because the procedure worked on A^*
2188: DO 6974 p = 1, N
2189: CALL ZSWAP( N, U(1,p), 1, V(1,p), 1 )
2190: 6974 CONTINUE
2191: END IF
2192: *
2193: END IF
2194: * end of the full SVD
2195: *
2196: * Undo scaling, if necessary (and possible)
2197: *
2198: IF ( USCAL2 .LE. (BIG/SVA(1))*USCAL1 ) THEN
2199: CALL DLASCL( 'G', 0, 0, USCAL1, USCAL2, NR, 1, SVA, N, IERR )
2200: USCAL1 = ONE
2201: USCAL2 = ONE
2202: END IF
2203: *
2204: IF ( NR .LT. N ) THEN
2205: DO 3004 p = NR+1, N
2206: SVA(p) = ZERO
2207: 3004 CONTINUE
2208: END IF
2209: *
2210: RWORK(1) = USCAL2 * SCALEM
2211: RWORK(2) = USCAL1
2212: IF ( ERREST ) RWORK(3) = SCONDA
2213: IF ( LSVEC .AND. RSVEC ) THEN
2214: RWORK(4) = CONDR1
2215: RWORK(5) = CONDR2
2216: END IF
2217: IF ( L2TRAN ) THEN
2218: RWORK(6) = ENTRA
2219: RWORK(7) = ENTRAT
2220: END IF
2221: *
2222: IWORK(1) = NR
2223: IWORK(2) = NUMRANK
2224: IWORK(3) = WARNING
2225: IF ( TRANSP ) THEN
2226: IWORK(4) = 1
2227: ELSE
2228: IWORK(4) = -1
2229: END IF
2230:
2231: *
2232: RETURN
2233: * ..
2234: * .. END OF ZGEJSV
2235: * ..
2236: END
2237: *
CVSweb interface <joel.bertrand@systella.fr>