Return to zgsvj0.f CVS log | Up to [local] / rpl / lapack / lapack |
1.4 bertrand 1: *> \brief <b> ZGSVJ0 pre-processor for the routine zgesvj. </b>
1.1 bertrand 2: *
3: * =========== DOCUMENTATION ===========
4: *
1.4 bertrand 5: * Online html documentation available at
6: * http://www.netlib.org/lapack/explore-html/
1.1 bertrand 7: *
8: *> \htmlonly
1.4 bertrand 9: *> Download ZGSVJ0 + dependencies
10: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/zgsvj0.f">
11: *> [TGZ]</a>
12: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/zgsvj0.f">
13: *> [ZIP]</a>
14: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/zgsvj0.f">
1.1 bertrand 15: *> [TXT]</a>
1.4 bertrand 16: *> \endhtmlonly
1.1 bertrand 17: *
18: * Definition:
19: * ===========
20: *
21: * SUBROUTINE ZGSVJ0( JOBV, M, N, A, LDA, D, SVA, MV, V, LDV, EPS,
22: * SFMIN, TOL, NSWEEP, WORK, LWORK, INFO )
1.4 bertrand 23: *
1.1 bertrand 24: * .. Scalar Arguments ..
25: * INTEGER INFO, LDA, LDV, LWORK, M, MV, N, NSWEEP
26: * DOUBLE PRECISION EPS, SFMIN, TOL
27: * CHARACTER*1 JOBV
28: * ..
29: * .. Array Arguments ..
30: * COMPLEX*16 A( LDA, * ), D( N ), V( LDV, * ), WORK( LWORK )
31: * DOUBLE PRECISION SVA( N )
32: * ..
1.4 bertrand 33: *
1.1 bertrand 34: *
35: *> \par Purpose:
36: * =============
37: *>
38: *> \verbatim
39: *>
40: *> ZGSVJ0 is called from ZGESVJ as a pre-processor and that is its main
41: *> purpose. It applies Jacobi rotations in the same way as ZGESVJ does, but
42: *> it does not check convergence (stopping criterion). Few tuning
43: *> parameters (marked by [TP]) are available for the implementer.
44: *> \endverbatim
45: *
46: * Arguments:
47: * ==========
48: *
49: *> \param[in] JOBV
50: *> \verbatim
51: *> JOBV is CHARACTER*1
52: *> Specifies whether the output from this procedure is used
53: *> to compute the matrix V:
54: *> = 'V': the product of the Jacobi rotations is accumulated
55: *> by postmulyiplying the N-by-N array V.
56: *> (See the description of V.)
57: *> = 'A': the product of the Jacobi rotations is accumulated
58: *> by postmulyiplying the MV-by-N array V.
59: *> (See the descriptions of MV and V.)
60: *> = 'N': the Jacobi rotations are not accumulated.
61: *> \endverbatim
62: *>
63: *> \param[in] M
64: *> \verbatim
65: *> M is INTEGER
66: *> The number of rows of the input matrix A. M >= 0.
67: *> \endverbatim
68: *>
69: *> \param[in] N
70: *> \verbatim
71: *> N is INTEGER
72: *> The number of columns of the input matrix A.
73: *> M >= N >= 0.
74: *> \endverbatim
75: *>
76: *> \param[in,out] A
77: *> \verbatim
78: *> A is COMPLEX*16 array, dimension (LDA,N)
79: *> On entry, M-by-N matrix A, such that A*diag(D) represents
80: *> the input matrix.
81: *> On exit,
82: *> A_onexit * diag(D_onexit) represents the input matrix A*diag(D)
83: *> post-multiplied by a sequence of Jacobi rotations, where the
84: *> rotation threshold and the total number of sweeps are given in
85: *> TOL and NSWEEP, respectively.
86: *> (See the descriptions of D, TOL and NSWEEP.)
87: *> \endverbatim
88: *>
89: *> \param[in] LDA
90: *> \verbatim
91: *> LDA is INTEGER
92: *> The leading dimension of the array A. LDA >= max(1,M).
93: *> \endverbatim
94: *>
95: *> \param[in,out] D
96: *> \verbatim
97: *> D is COMPLEX*16 array, dimension (N)
98: *> The array D accumulates the scaling factors from the complex scaled
99: *> Jacobi rotations.
100: *> On entry, A*diag(D) represents the input matrix.
101: *> On exit, A_onexit*diag(D_onexit) represents the input matrix
102: *> post-multiplied by a sequence of Jacobi rotations, where the
103: *> rotation threshold and the total number of sweeps are given in
104: *> TOL and NSWEEP, respectively.
105: *> (See the descriptions of A, TOL and NSWEEP.)
106: *> \endverbatim
107: *>
108: *> \param[in,out] SVA
109: *> \verbatim
110: *> SVA is DOUBLE PRECISION array, dimension (N)
111: *> On entry, SVA contains the Euclidean norms of the columns of
112: *> the matrix A*diag(D).
113: *> On exit, SVA contains the Euclidean norms of the columns of
114: *> the matrix A_onexit*diag(D_onexit).
1.4 bertrand 115: *> \endverbatim
1.1 bertrand 116: *>
117: *> \param[in] MV
118: *> \verbatim
119: *> MV is INTEGER
1.8 bertrand 120: *> If JOBV = 'A', then MV rows of V are post-multipled by a
1.1 bertrand 121: *> sequence of Jacobi rotations.
122: *> If JOBV = 'N', then MV is not referenced.
123: *> \endverbatim
124: *>
125: *> \param[in,out] V
126: *> \verbatim
127: *> V is COMPLEX*16 array, dimension (LDV,N)
1.8 bertrand 128: *> If JOBV = 'V' then N rows of V are post-multipled by a
1.1 bertrand 129: *> sequence of Jacobi rotations.
1.8 bertrand 130: *> If JOBV = 'A' then MV rows of V are post-multipled by a
1.1 bertrand 131: *> sequence of Jacobi rotations.
132: *> If JOBV = 'N', then V is not referenced.
133: *> \endverbatim
134: *>
135: *> \param[in] LDV
136: *> \verbatim
137: *> LDV is INTEGER
138: *> The leading dimension of the array V, LDV >= 1.
1.8 bertrand 139: *> If JOBV = 'V', LDV >= N.
140: *> If JOBV = 'A', LDV >= MV.
1.1 bertrand 141: *> \endverbatim
142: *>
143: *> \param[in] EPS
144: *> \verbatim
145: *> EPS is DOUBLE PRECISION
146: *> EPS = DLAMCH('Epsilon')
147: *> \endverbatim
148: *>
149: *> \param[in] SFMIN
150: *> \verbatim
151: *> SFMIN is DOUBLE PRECISION
152: *> SFMIN = DLAMCH('Safe Minimum')
153: *> \endverbatim
154: *>
155: *> \param[in] TOL
156: *> \verbatim
157: *> TOL is DOUBLE PRECISION
158: *> TOL is the threshold for Jacobi rotations. For a pair
159: *> A(:,p), A(:,q) of pivot columns, the Jacobi rotation is
1.8 bertrand 160: *> applied only if ABS(COS(angle(A(:,p),A(:,q)))) > TOL.
1.1 bertrand 161: *> \endverbatim
162: *>
163: *> \param[in] NSWEEP
164: *> \verbatim
165: *> NSWEEP is INTEGER
166: *> NSWEEP is the number of sweeps of Jacobi rotations to be
167: *> performed.
168: *> \endverbatim
169: *>
170: *> \param[out] WORK
171: *> \verbatim
1.6 bertrand 172: *> WORK is COMPLEX*16 array, dimension (LWORK)
1.1 bertrand 173: *> \endverbatim
174: *>
175: *> \param[in] LWORK
176: *> \verbatim
177: *> LWORK is INTEGER
1.8 bertrand 178: *> LWORK is the dimension of WORK. LWORK >= M.
1.1 bertrand 179: *> \endverbatim
180: *>
181: *> \param[out] INFO
182: *> \verbatim
183: *> INFO is INTEGER
1.8 bertrand 184: *> = 0: successful exit.
185: *> < 0: if INFO = -i, then the i-th argument had an illegal value
1.1 bertrand 186: *> \endverbatim
187: *
188: * Authors:
189: * ========
190: *
1.4 bertrand 191: *> \author Univ. of Tennessee
192: *> \author Univ. of California Berkeley
193: *> \author Univ. of Colorado Denver
194: *> \author NAG Ltd.
1.1 bertrand 195: *
196: *> \ingroup complex16OTHERcomputational
197: *>
198: *> \par Further Details:
199: * =====================
200: *>
201: *> ZGSVJ0 is used just to enable ZGESVJ to call a simplified version of
202: *> itself to work on a submatrix of the original matrix.
203: *>
1.4 bertrand 204: *> Contributor:
1.1 bertrand 205: * =============
206: *>
1.4 bertrand 207: *> Zlatko Drmac (Zagreb, Croatia)
1.1 bertrand 208: *>
1.4 bertrand 209: *> \par Bugs, Examples and Comments:
1.1 bertrand 210: * ============================
211: *>
212: *> Please report all bugs and send interesting test examples and comments to
213: *> drmac@math.hr. Thank you.
214: *
215: * =====================================================================
216: SUBROUTINE ZGSVJ0( JOBV, M, N, A, LDA, D, SVA, MV, V, LDV, EPS,
217: $ SFMIN, TOL, NSWEEP, WORK, LWORK, INFO )
218: *
1.9 ! bertrand 219: * -- LAPACK computational routine --
1.1 bertrand 220: * -- LAPACK is a software package provided by Univ. of Tennessee, --
221: * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
222: *
223: IMPLICIT NONE
224: * .. Scalar Arguments ..
225: INTEGER INFO, LDA, LDV, LWORK, M, MV, N, NSWEEP
226: DOUBLE PRECISION EPS, SFMIN, TOL
227: CHARACTER*1 JOBV
228: * ..
229: * .. Array Arguments ..
230: COMPLEX*16 A( LDA, * ), D( N ), V( LDV, * ), WORK( LWORK )
1.4 bertrand 231: DOUBLE PRECISION SVA( N )
1.1 bertrand 232: * ..
233: *
234: * =====================================================================
235: *
236: * .. Local Parameters ..
237: DOUBLE PRECISION ZERO, HALF, ONE
238: PARAMETER ( ZERO = 0.0D0, HALF = 0.5D0, ONE = 1.0D0)
239: COMPLEX*16 CZERO, CONE
240: PARAMETER ( CZERO = (0.0D0, 0.0D0), CONE = (1.0D0, 0.0D0) )
241: * ..
242: * .. Local Scalars ..
243: COMPLEX*16 AAPQ, OMPQ
244: DOUBLE PRECISION AAPP, AAPP0, AAPQ1, AAQQ, APOAQ, AQOAP, BIG,
245: $ BIGTHETA, CS, MXAAPQ, MXSINJ, ROOTBIG, ROOTEPS,
246: $ ROOTSFMIN, ROOTTOL, SMALL, SN, T, TEMP1, THETA,
247: $ THSIGN
248: INTEGER BLSKIP, EMPTSW, i, ibr, IERR, igl, IJBLSK, ir1,
249: $ ISWROT, jbc, jgl, KBL, LKAHEAD, MVL, NBL,
250: $ NOTROT, p, PSKIPPED, q, ROWSKIP, SWBAND
251: LOGICAL APPLV, ROTOK, RSVEC
252: * ..
253: * ..
254: * .. Intrinsic Functions ..
1.4 bertrand 255: INTRINSIC ABS, MAX, CONJG, DBLE, MIN, SIGN, SQRT
1.1 bertrand 256: * ..
257: * .. External Functions ..
258: DOUBLE PRECISION DZNRM2
259: COMPLEX*16 ZDOTC
260: INTEGER IDAMAX
261: LOGICAL LSAME
262: EXTERNAL IDAMAX, LSAME, ZDOTC, DZNRM2
263: * ..
264: * ..
265: * .. External Subroutines ..
266: * ..
267: * from BLAS
1.6 bertrand 268: EXTERNAL ZCOPY, ZROT, ZSWAP, ZAXPY
1.1 bertrand 269: * from LAPACK
270: EXTERNAL ZLASCL, ZLASSQ, XERBLA
271: * ..
272: * .. Executable Statements ..
273: *
274: * Test the input parameters.
275: *
276: APPLV = LSAME( JOBV, 'A' )
277: RSVEC = LSAME( JOBV, 'V' )
278: IF( .NOT.( RSVEC .OR. APPLV .OR. LSAME( JOBV, 'N' ) ) ) THEN
279: INFO = -1
280: ELSE IF( M.LT.0 ) THEN
281: INFO = -2
282: ELSE IF( ( N.LT.0 ) .OR. ( N.GT.M ) ) THEN
283: INFO = -3
284: ELSE IF( LDA.LT.M ) THEN
285: INFO = -5
286: ELSE IF( ( RSVEC.OR.APPLV ) .AND. ( MV.LT.0 ) ) THEN
287: INFO = -8
1.4 bertrand 288: ELSE IF( ( RSVEC.AND.( LDV.LT.N ) ).OR.
1.1 bertrand 289: $ ( APPLV.AND.( LDV.LT.MV ) ) ) THEN
290: INFO = -10
291: ELSE IF( TOL.LE.EPS ) THEN
292: INFO = -13
293: ELSE IF( NSWEEP.LT.0 ) THEN
294: INFO = -14
295: ELSE IF( LWORK.LT.M ) THEN
296: INFO = -16
297: ELSE
298: INFO = 0
299: END IF
300: *
301: * #:(
302: IF( INFO.NE.0 ) THEN
303: CALL XERBLA( 'ZGSVJ0', -INFO )
304: RETURN
305: END IF
306: *
307: IF( RSVEC ) THEN
308: MVL = N
309: ELSE IF( APPLV ) THEN
310: MVL = MV
311: END IF
312: RSVEC = RSVEC .OR. APPLV
313:
1.4 bertrand 314: ROOTEPS = SQRT( EPS )
315: ROOTSFMIN = SQRT( SFMIN )
1.1 bertrand 316: SMALL = SFMIN / EPS
317: BIG = ONE / SFMIN
318: ROOTBIG = ONE / ROOTSFMIN
319: BIGTHETA = ONE / ROOTEPS
1.4 bertrand 320: ROOTTOL = SQRT( TOL )
1.1 bertrand 321: *
322: * .. Row-cyclic Jacobi SVD algorithm with column pivoting ..
323: *
324: EMPTSW = ( N*( N-1 ) ) / 2
325: NOTROT = 0
326: *
327: * .. Row-cyclic pivot strategy with de Rijk's pivoting ..
328: *
329:
330: SWBAND = 0
331: *[TP] SWBAND is a tuning parameter [TP]. It is meaningful and effective
332: * if ZGESVJ is used as a computational routine in the preconditioned
333: * Jacobi SVD algorithm ZGEJSV. For sweeps i=1:SWBAND the procedure
334: * works on pivots inside a band-like region around the diagonal.
335: * The boundaries are determined dynamically, based on the number of
336: * pivots above a threshold.
337: *
1.4 bertrand 338: KBL = MIN( 8, N )
1.1 bertrand 339: *[TP] KBL is a tuning parameter that defines the tile size in the
340: * tiling of the p-q loops of pivot pairs. In general, an optimal
341: * value of KBL depends on the matrix dimensions and on the
342: * parameters of the computer's memory.
343: *
344: NBL = N / KBL
345: IF( ( NBL*KBL ).NE.N )NBL = NBL + 1
346: *
347: BLSKIP = KBL**2
348: *[TP] BLKSKIP is a tuning parameter that depends on SWBAND and KBL.
349: *
1.4 bertrand 350: ROWSKIP = MIN( 5, KBL )
1.1 bertrand 351: *[TP] ROWSKIP is a tuning parameter.
352: *
353: LKAHEAD = 1
354: *[TP] LKAHEAD is a tuning parameter.
355: *
356: * Quasi block transformations, using the lower (upper) triangular
357: * structure of the input matrix. The quasi-block-cycling usually
358: * invokes cubic convergence. Big part of this cycle is done inside
359: * canonical subspaces of dimensions less than M.
360: *
361: *
362: * .. Row-cyclic pivot strategy with de Rijk's pivoting ..
363: *
364: DO 1993 i = 1, NSWEEP
365: *
366: * .. go go go ...
367: *
368: MXAAPQ = ZERO
369: MXSINJ = ZERO
370: ISWROT = 0
371: *
372: NOTROT = 0
373: PSKIPPED = 0
374: *
375: * Each sweep is unrolled using KBL-by-KBL tiles over the pivot pairs
376: * 1 <= p < q <= N. This is the first step toward a blocked implementation
377: * of the rotations. New implementation, based on block transformations,
378: * is under development.
379: *
380: DO 2000 ibr = 1, NBL
381: *
382: igl = ( ibr-1 )*KBL + 1
383: *
1.4 bertrand 384: DO 1002 ir1 = 0, MIN( LKAHEAD, NBL-ibr )
1.1 bertrand 385: *
386: igl = igl + ir1*KBL
387: *
1.4 bertrand 388: DO 2001 p = igl, MIN( igl+KBL-1, N-1 )
1.1 bertrand 389: *
390: * .. de Rijk's pivoting
391: *
392: q = IDAMAX( N-p+1, SVA( p ), 1 ) + p - 1
393: IF( p.NE.q ) THEN
394: CALL ZSWAP( M, A( 1, p ), 1, A( 1, q ), 1 )
1.4 bertrand 395: IF( RSVEC )CALL ZSWAP( MVL, V( 1, p ), 1,
1.1 bertrand 396: $ V( 1, q ), 1 )
397: TEMP1 = SVA( p )
398: SVA( p ) = SVA( q )
399: SVA( q ) = TEMP1
400: AAPQ = D(p)
401: D(p) = D(q)
402: D(q) = AAPQ
403: END IF
404: *
405: IF( ir1.EQ.0 ) THEN
406: *
407: * Column norms are periodically updated by explicit
408: * norm computation.
409: * Caveat:
410: * Unfortunately, some BLAS implementations compute SNCRM2(M,A(1,p),1)
411: * as SQRT(S=ZDOTC(M,A(1,p),1,A(1,p),1)), which may cause the result to
412: * overflow for ||A(:,p)||_2 > SQRT(overflow_threshold), and to
413: * underflow for ||A(:,p)||_2 < SQRT(underflow_threshold).
414: * Hence, DZNRM2 cannot be trusted, not even in the case when
415: * the true norm is far from the under(over)flow boundaries.
416: * If properly implemented DZNRM2 is available, the IF-THEN-ELSE-END IF
417: * below should be replaced with "AAPP = DZNRM2( M, A(1,p), 1 )".
418: *
1.4 bertrand 419: IF( ( SVA( p ).LT.ROOTBIG ) .AND.
1.1 bertrand 420: $ ( SVA( p ).GT.ROOTSFMIN ) ) THEN
421: SVA( p ) = DZNRM2( M, A( 1, p ), 1 )
422: ELSE
423: TEMP1 = ZERO
424: AAPP = ONE
425: CALL ZLASSQ( M, A( 1, p ), 1, TEMP1, AAPP )
1.4 bertrand 426: SVA( p ) = TEMP1*SQRT( AAPP )
1.1 bertrand 427: END IF
428: AAPP = SVA( p )
429: ELSE
430: AAPP = SVA( p )
431: END IF
432: *
433: IF( AAPP.GT.ZERO ) THEN
434: *
435: PSKIPPED = 0
436: *
1.4 bertrand 437: DO 2002 q = p + 1, MIN( igl+KBL-1, N )
1.1 bertrand 438: *
439: AAQQ = SVA( q )
440: *
441: IF( AAQQ.GT.ZERO ) THEN
442: *
443: AAPP0 = AAPP
444: IF( AAQQ.GE.ONE ) THEN
445: ROTOK = ( SMALL*AAPP ).LE.AAQQ
446: IF( AAPP.LT.( BIG / AAQQ ) ) THEN
1.4 bertrand 447: AAPQ = ( ZDOTC( M, A( 1, p ), 1,
1.1 bertrand 448: $ A( 1, q ), 1 ) / AAQQ ) / AAPP
449: ELSE
1.4 bertrand 450: CALL ZCOPY( M, A( 1, p ), 1,
1.1 bertrand 451: $ WORK, 1 )
1.4 bertrand 452: CALL ZLASCL( 'G', 0, 0, AAPP, ONE,
1.1 bertrand 453: $ M, 1, WORK, LDA, IERR )
454: AAPQ = ZDOTC( M, WORK, 1,
455: $ A( 1, q ), 1 ) / AAQQ
456: END IF
457: ELSE
458: ROTOK = AAPP.LE.( AAQQ / SMALL )
459: IF( AAPP.GT.( SMALL / AAQQ ) ) THEN
1.4 bertrand 460: AAPQ = ( ZDOTC( M, A( 1, p ), 1,
461: $ A( 1, q ), 1 ) / AAPP ) / AAQQ
1.1 bertrand 462: ELSE
1.4 bertrand 463: CALL ZCOPY( M, A( 1, q ), 1,
1.1 bertrand 464: $ WORK, 1 )
465: CALL ZLASCL( 'G', 0, 0, AAQQ,
466: $ ONE, M, 1,
467: $ WORK, LDA, IERR )
1.4 bertrand 468: AAPQ = ZDOTC( M, A( 1, p ), 1,
1.1 bertrand 469: $ WORK, 1 ) / AAPP
470: END IF
471: END IF
472: *
1.4 bertrand 473: * AAPQ = AAPQ * CONJG( CWORK(p) ) * CWORK(q)
474: AAPQ1 = -ABS(AAPQ)
475: MXAAPQ = MAX( MXAAPQ, -AAPQ1 )
1.1 bertrand 476: *
477: * TO rotate or NOT to rotate, THAT is the question ...
478: *
479: IF( ABS( AAPQ1 ).GT.TOL ) THEN
1.4 bertrand 480: OMPQ = AAPQ / ABS(AAPQ)
1.1 bertrand 481: *
482: * .. rotate
483: *[RTD] ROTATED = ROTATED + ONE
484: *
485: IF( ir1.EQ.0 ) THEN
486: NOTROT = 0
487: PSKIPPED = 0
488: ISWROT = ISWROT + 1
489: END IF
490: *
491: IF( ROTOK ) THEN
492: *
493: AQOAP = AAQQ / AAPP
494: APOAQ = AAPP / AAQQ
495: THETA = -HALF*ABS( AQOAP-APOAQ )/AAPQ1
496: *
497: IF( ABS( THETA ).GT.BIGTHETA ) THEN
1.4 bertrand 498: *
1.1 bertrand 499: T = HALF / THETA
500: CS = ONE
501:
502: CALL ZROT( M, A(1,p), 1, A(1,q), 1,
1.4 bertrand 503: $ CS, CONJG(OMPQ)*T )
1.1 bertrand 504: IF ( RSVEC ) THEN
1.4 bertrand 505: CALL ZROT( MVL, V(1,p), 1,
506: $ V(1,q), 1, CS, CONJG(OMPQ)*T )
1.1 bertrand 507: END IF
1.4 bertrand 508:
509: SVA( q ) = AAQQ*SQRT( MAX( ZERO,
1.1 bertrand 510: $ ONE+T*APOAQ*AAPQ1 ) )
1.4 bertrand 511: AAPP = AAPP*SQRT( MAX( ZERO,
1.1 bertrand 512: $ ONE-T*AQOAP*AAPQ1 ) )
1.4 bertrand 513: MXSINJ = MAX( MXSINJ, ABS( T ) )
1.1 bertrand 514: *
515: ELSE
516: *
517: * .. choose correct signum for THETA and rotate
518: *
1.4 bertrand 519: THSIGN = -SIGN( ONE, AAPQ1 )
520: T = ONE / ( THETA+THSIGN*
521: $ SQRT( ONE+THETA*THETA ) )
522: CS = SQRT( ONE / ( ONE+T*T ) )
1.1 bertrand 523: SN = T*CS
524: *
1.4 bertrand 525: MXSINJ = MAX( MXSINJ, ABS( SN ) )
526: SVA( q ) = AAQQ*SQRT( MAX( ZERO,
1.1 bertrand 527: $ ONE+T*APOAQ*AAPQ1 ) )
1.4 bertrand 528: AAPP = AAPP*SQRT( MAX( ZERO,
1.1 bertrand 529: $ ONE-T*AQOAP*AAPQ1 ) )
530: *
531: CALL ZROT( M, A(1,p), 1, A(1,q), 1,
1.4 bertrand 532: $ CS, CONJG(OMPQ)*SN )
1.1 bertrand 533: IF ( RSVEC ) THEN
1.4 bertrand 534: CALL ZROT( MVL, V(1,p), 1,
535: $ V(1,q), 1, CS, CONJG(OMPQ)*SN )
536: END IF
537: END IF
538: D(p) = -D(q) * OMPQ
1.1 bertrand 539: *
540: ELSE
541: * .. have to use modified Gram-Schmidt like transformation
542: CALL ZCOPY( M, A( 1, p ), 1,
543: $ WORK, 1 )
544: CALL ZLASCL( 'G', 0, 0, AAPP, ONE, M,
545: $ 1, WORK, LDA,
546: $ IERR )
547: CALL ZLASCL( 'G', 0, 0, AAQQ, ONE, M,
548: $ 1, A( 1, q ), LDA, IERR )
549: CALL ZAXPY( M, -AAPQ, WORK, 1,
550: $ A( 1, q ), 1 )
551: CALL ZLASCL( 'G', 0, 0, ONE, AAQQ, M,
552: $ 1, A( 1, q ), LDA, IERR )
1.4 bertrand 553: SVA( q ) = AAQQ*SQRT( MAX( ZERO,
1.1 bertrand 554: $ ONE-AAPQ1*AAPQ1 ) )
1.4 bertrand 555: MXSINJ = MAX( MXSINJ, SFMIN )
1.1 bertrand 556: END IF
557: * END IF ROTOK THEN ... ELSE
558: *
559: * In the case of cancellation in updating SVA(q), SVA(p)
560: * recompute SVA(q), SVA(p).
561: *
562: IF( ( SVA( q ) / AAQQ )**2.LE.ROOTEPS )
563: $ THEN
564: IF( ( AAQQ.LT.ROOTBIG ) .AND.
565: $ ( AAQQ.GT.ROOTSFMIN ) ) THEN
566: SVA( q ) = DZNRM2( M, A( 1, q ), 1 )
567: ELSE
568: T = ZERO
569: AAQQ = ONE
570: CALL ZLASSQ( M, A( 1, q ), 1, T,
571: $ AAQQ )
1.4 bertrand 572: SVA( q ) = T*SQRT( AAQQ )
1.1 bertrand 573: END IF
574: END IF
575: IF( ( AAPP / AAPP0 ).LE.ROOTEPS ) THEN
576: IF( ( AAPP.LT.ROOTBIG ) .AND.
577: $ ( AAPP.GT.ROOTSFMIN ) ) THEN
578: AAPP = DZNRM2( M, A( 1, p ), 1 )
579: ELSE
580: T = ZERO
581: AAPP = ONE
582: CALL ZLASSQ( M, A( 1, p ), 1, T,
583: $ AAPP )
1.4 bertrand 584: AAPP = T*SQRT( AAPP )
1.1 bertrand 585: END IF
586: SVA( p ) = AAPP
587: END IF
588: *
589: ELSE
590: * A(:,p) and A(:,q) already numerically orthogonal
591: IF( ir1.EQ.0 )NOTROT = NOTROT + 1
592: *[RTD] SKIPPED = SKIPPED + 1
593: PSKIPPED = PSKIPPED + 1
594: END IF
595: ELSE
596: * A(:,q) is zero column
597: IF( ir1.EQ.0 )NOTROT = NOTROT + 1
598: PSKIPPED = PSKIPPED + 1
599: END IF
600: *
601: IF( ( i.LE.SWBAND ) .AND.
602: $ ( PSKIPPED.GT.ROWSKIP ) ) THEN
603: IF( ir1.EQ.0 )AAPP = -AAPP
604: NOTROT = 0
605: GO TO 2103
606: END IF
607: *
608: 2002 CONTINUE
609: * END q-LOOP
610: *
611: 2103 CONTINUE
612: * bailed out of q-loop
613: *
614: SVA( p ) = AAPP
615: *
616: ELSE
617: SVA( p ) = AAPP
618: IF( ( ir1.EQ.0 ) .AND. ( AAPP.EQ.ZERO ) )
1.4 bertrand 619: $ NOTROT = NOTROT + MIN( igl+KBL-1, N ) - p
1.1 bertrand 620: END IF
621: *
622: 2001 CONTINUE
623: * end of the p-loop
624: * end of doing the block ( ibr, ibr )
625: 1002 CONTINUE
626: * end of ir1-loop
627: *
628: * ... go to the off diagonal blocks
629: *
630: igl = ( ibr-1 )*KBL + 1
631: *
632: DO 2010 jbc = ibr + 1, NBL
633: *
634: jgl = ( jbc-1 )*KBL + 1
635: *
636: * doing the block at ( ibr, jbc )
637: *
638: IJBLSK = 0
1.4 bertrand 639: DO 2100 p = igl, MIN( igl+KBL-1, N )
1.1 bertrand 640: *
641: AAPP = SVA( p )
642: IF( AAPP.GT.ZERO ) THEN
643: *
644: PSKIPPED = 0
645: *
1.4 bertrand 646: DO 2200 q = jgl, MIN( jgl+KBL-1, N )
1.1 bertrand 647: *
648: AAQQ = SVA( q )
649: IF( AAQQ.GT.ZERO ) THEN
650: AAPP0 = AAPP
651: *
652: * .. M x 2 Jacobi SVD ..
653: *
654: * Safe Gram matrix computation
655: *
656: IF( AAQQ.GE.ONE ) THEN
657: IF( AAPP.GE.AAQQ ) THEN
658: ROTOK = ( SMALL*AAPP ).LE.AAQQ
659: ELSE
660: ROTOK = ( SMALL*AAQQ ).LE.AAPP
661: END IF
662: IF( AAPP.LT.( BIG / AAQQ ) ) THEN
1.4 bertrand 663: AAPQ = ( ZDOTC( M, A( 1, p ), 1,
1.1 bertrand 664: $ A( 1, q ), 1 ) / AAQQ ) / AAPP
665: ELSE
666: CALL ZCOPY( M, A( 1, p ), 1,
667: $ WORK, 1 )
668: CALL ZLASCL( 'G', 0, 0, AAPP,
669: $ ONE, M, 1,
670: $ WORK, LDA, IERR )
671: AAPQ = ZDOTC( M, WORK, 1,
672: $ A( 1, q ), 1 ) / AAQQ
673: END IF
674: ELSE
675: IF( AAPP.GE.AAQQ ) THEN
676: ROTOK = AAPP.LE.( AAQQ / SMALL )
677: ELSE
678: ROTOK = AAQQ.LE.( AAPP / SMALL )
679: END IF
680: IF( AAPP.GT.( SMALL / AAQQ ) ) THEN
1.4 bertrand 681: AAPQ = ( ZDOTC( M, A( 1, p ), 1,
682: $ A( 1, q ), 1 ) / MAX(AAQQ,AAPP) )
683: $ / MIN(AAQQ,AAPP)
1.1 bertrand 684: ELSE
685: CALL ZCOPY( M, A( 1, q ), 1,
686: $ WORK, 1 )
687: CALL ZLASCL( 'G', 0, 0, AAQQ,
688: $ ONE, M, 1,
689: $ WORK, LDA, IERR )
690: AAPQ = ZDOTC( M, A( 1, p ), 1,
691: $ WORK, 1 ) / AAPP
692: END IF
693: END IF
694: *
1.4 bertrand 695: * AAPQ = AAPQ * CONJG(CWORK(p))*CWORK(q)
1.1 bertrand 696: AAPQ1 = -ABS(AAPQ)
1.4 bertrand 697: MXAAPQ = MAX( MXAAPQ, -AAPQ1 )
1.1 bertrand 698: *
699: * TO rotate or NOT to rotate, THAT is the question ...
700: *
701: IF( ABS( AAPQ1 ).GT.TOL ) THEN
1.4 bertrand 702: OMPQ = AAPQ / ABS(AAPQ)
1.1 bertrand 703: NOTROT = 0
704: *[RTD] ROTATED = ROTATED + 1
705: PSKIPPED = 0
706: ISWROT = ISWROT + 1
707: *
708: IF( ROTOK ) THEN
709: *
710: AQOAP = AAQQ / AAPP
711: APOAQ = AAPP / AAQQ
712: THETA = -HALF*ABS( AQOAP-APOAQ )/ AAPQ1
713: IF( AAQQ.GT.AAPP0 )THETA = -THETA
714: *
715: IF( ABS( THETA ).GT.BIGTHETA ) THEN
716: T = HALF / THETA
1.4 bertrand 717: CS = ONE
1.1 bertrand 718: CALL ZROT( M, A(1,p), 1, A(1,q), 1,
1.4 bertrand 719: $ CS, CONJG(OMPQ)*T )
1.1 bertrand 720: IF( RSVEC ) THEN
1.4 bertrand 721: CALL ZROT( MVL, V(1,p), 1,
722: $ V(1,q), 1, CS, CONJG(OMPQ)*T )
1.1 bertrand 723: END IF
1.4 bertrand 724: SVA( q ) = AAQQ*SQRT( MAX( ZERO,
1.1 bertrand 725: $ ONE+T*APOAQ*AAPQ1 ) )
1.4 bertrand 726: AAPP = AAPP*SQRT( MAX( ZERO,
1.1 bertrand 727: $ ONE-T*AQOAP*AAPQ1 ) )
1.4 bertrand 728: MXSINJ = MAX( MXSINJ, ABS( T ) )
1.1 bertrand 729: ELSE
730: *
731: * .. choose correct signum for THETA and rotate
732: *
1.4 bertrand 733: THSIGN = -SIGN( ONE, AAPQ1 )
1.1 bertrand 734: IF( AAQQ.GT.AAPP0 )THSIGN = -THSIGN
735: T = ONE / ( THETA+THSIGN*
1.4 bertrand 736: $ SQRT( ONE+THETA*THETA ) )
737: CS = SQRT( ONE / ( ONE+T*T ) )
1.1 bertrand 738: SN = T*CS
1.4 bertrand 739: MXSINJ = MAX( MXSINJ, ABS( SN ) )
740: SVA( q ) = AAQQ*SQRT( MAX( ZERO,
1.1 bertrand 741: $ ONE+T*APOAQ*AAPQ1 ) )
1.4 bertrand 742: AAPP = AAPP*SQRT( MAX( ZERO,
1.1 bertrand 743: $ ONE-T*AQOAP*AAPQ1 ) )
744: *
745: CALL ZROT( M, A(1,p), 1, A(1,q), 1,
1.4 bertrand 746: $ CS, CONJG(OMPQ)*SN )
1.1 bertrand 747: IF( RSVEC ) THEN
1.4 bertrand 748: CALL ZROT( MVL, V(1,p), 1,
749: $ V(1,q), 1, CS, CONJG(OMPQ)*SN )
1.1 bertrand 750: END IF
751: END IF
752: D(p) = -D(q) * OMPQ
753: *
754: ELSE
755: * .. have to use modified Gram-Schmidt like transformation
756: IF( AAPP.GT.AAQQ ) THEN
757: CALL ZCOPY( M, A( 1, p ), 1,
758: $ WORK, 1 )
759: CALL ZLASCL( 'G', 0, 0, AAPP, ONE,
760: $ M, 1, WORK,LDA,
761: $ IERR )
762: CALL ZLASCL( 'G', 0, 0, AAQQ, ONE,
763: $ M, 1, A( 1, q ), LDA,
764: $ IERR )
765: CALL ZAXPY( M, -AAPQ, WORK,
766: $ 1, A( 1, q ), 1 )
767: CALL ZLASCL( 'G', 0, 0, ONE, AAQQ,
768: $ M, 1, A( 1, q ), LDA,
769: $ IERR )
1.4 bertrand 770: SVA( q ) = AAQQ*SQRT( MAX( ZERO,
1.1 bertrand 771: $ ONE-AAPQ1*AAPQ1 ) )
1.4 bertrand 772: MXSINJ = MAX( MXSINJ, SFMIN )
1.1 bertrand 773: ELSE
774: CALL ZCOPY( M, A( 1, q ), 1,
775: $ WORK, 1 )
776: CALL ZLASCL( 'G', 0, 0, AAQQ, ONE,
777: $ M, 1, WORK,LDA,
778: $ IERR )
779: CALL ZLASCL( 'G', 0, 0, AAPP, ONE,
780: $ M, 1, A( 1, p ), LDA,
781: $ IERR )
1.4 bertrand 782: CALL ZAXPY( M, -CONJG(AAPQ),
1.1 bertrand 783: $ WORK, 1, A( 1, p ), 1 )
784: CALL ZLASCL( 'G', 0, 0, ONE, AAPP,
785: $ M, 1, A( 1, p ), LDA,
786: $ IERR )
1.4 bertrand 787: SVA( p ) = AAPP*SQRT( MAX( ZERO,
1.1 bertrand 788: $ ONE-AAPQ1*AAPQ1 ) )
1.4 bertrand 789: MXSINJ = MAX( MXSINJ, SFMIN )
1.1 bertrand 790: END IF
791: END IF
792: * END IF ROTOK THEN ... ELSE
793: *
794: * In the case of cancellation in updating SVA(q), SVA(p)
795: * .. recompute SVA(q), SVA(p)
796: IF( ( SVA( q ) / AAQQ )**2.LE.ROOTEPS )
797: $ THEN
798: IF( ( AAQQ.LT.ROOTBIG ) .AND.
799: $ ( AAQQ.GT.ROOTSFMIN ) ) THEN
800: SVA( q ) = DZNRM2( M, A( 1, q ), 1)
801: ELSE
802: T = ZERO
803: AAQQ = ONE
804: CALL ZLASSQ( M, A( 1, q ), 1, T,
805: $ AAQQ )
1.4 bertrand 806: SVA( q ) = T*SQRT( AAQQ )
1.1 bertrand 807: END IF
808: END IF
809: IF( ( AAPP / AAPP0 )**2.LE.ROOTEPS ) THEN
810: IF( ( AAPP.LT.ROOTBIG ) .AND.
811: $ ( AAPP.GT.ROOTSFMIN ) ) THEN
812: AAPP = DZNRM2( M, A( 1, p ), 1 )
813: ELSE
814: T = ZERO
815: AAPP = ONE
816: CALL ZLASSQ( M, A( 1, p ), 1, T,
817: $ AAPP )
1.4 bertrand 818: AAPP = T*SQRT( AAPP )
1.1 bertrand 819: END IF
820: SVA( p ) = AAPP
821: END IF
822: * end of OK rotation
823: ELSE
824: NOTROT = NOTROT + 1
825: *[RTD] SKIPPED = SKIPPED + 1
826: PSKIPPED = PSKIPPED + 1
827: IJBLSK = IJBLSK + 1
828: END IF
829: ELSE
830: NOTROT = NOTROT + 1
831: PSKIPPED = PSKIPPED + 1
832: IJBLSK = IJBLSK + 1
833: END IF
834: *
835: IF( ( i.LE.SWBAND ) .AND. ( IJBLSK.GE.BLSKIP ) )
836: $ THEN
837: SVA( p ) = AAPP
838: NOTROT = 0
839: GO TO 2011
840: END IF
841: IF( ( i.LE.SWBAND ) .AND.
842: $ ( PSKIPPED.GT.ROWSKIP ) ) THEN
843: AAPP = -AAPP
844: NOTROT = 0
845: GO TO 2203
846: END IF
847: *
848: 2200 CONTINUE
849: * end of the q-loop
850: 2203 CONTINUE
851: *
852: SVA( p ) = AAPP
853: *
854: ELSE
855: *
856: IF( AAPP.EQ.ZERO )NOTROT = NOTROT +
1.4 bertrand 857: $ MIN( jgl+KBL-1, N ) - jgl + 1
1.1 bertrand 858: IF( AAPP.LT.ZERO )NOTROT = 0
859: *
860: END IF
861: *
862: 2100 CONTINUE
863: * end of the p-loop
864: 2010 CONTINUE
865: * end of the jbc-loop
866: 2011 CONTINUE
867: *2011 bailed out of the jbc-loop
1.4 bertrand 868: DO 2012 p = igl, MIN( igl+KBL-1, N )
1.1 bertrand 869: SVA( p ) = ABS( SVA( p ) )
870: 2012 CONTINUE
871: ***
872: 2000 CONTINUE
873: *2000 :: end of the ibr-loop
874: *
875: * .. update SVA(N)
876: IF( ( SVA( N ).LT.ROOTBIG ) .AND. ( SVA( N ).GT.ROOTSFMIN ) )
877: $ THEN
878: SVA( N ) = DZNRM2( M, A( 1, N ), 1 )
879: ELSE
880: T = ZERO
881: AAPP = ONE
882: CALL ZLASSQ( M, A( 1, N ), 1, T, AAPP )
1.4 bertrand 883: SVA( N ) = T*SQRT( AAPP )
1.1 bertrand 884: END IF
885: *
886: * Additional steering devices
887: *
888: IF( ( i.LT.SWBAND ) .AND. ( ( MXAAPQ.LE.ROOTTOL ) .OR.
889: $ ( ISWROT.LE.N ) ) )SWBAND = i
890: *
1.4 bertrand 891: IF( ( i.GT.SWBAND+1 ) .AND. ( MXAAPQ.LT.SQRT( DBLE( N ) )*
1.2 bertrand 892: $ TOL ) .AND. ( DBLE( N )*MXAAPQ*MXSINJ.LT.TOL ) ) THEN
1.1 bertrand 893: GO TO 1994
894: END IF
895: *
896: IF( NOTROT.GE.EMPTSW )GO TO 1994
897: *
898: 1993 CONTINUE
899: * end i=1:NSWEEP loop
900: *
901: * #:( Reaching this point means that the procedure has not converged.
902: INFO = NSWEEP - 1
903: GO TO 1995
904: *
905: 1994 CONTINUE
906: * #:) Reaching this point means numerical convergence after the i-th
907: * sweep.
908: *
909: INFO = 0
910: * #:) INFO = 0 confirms successful iterations.
1.4 bertrand 911: 1995 CONTINUE
1.1 bertrand 912: *
913: * Sort the vector SVA() of column norms.
914: DO 5991 p = 1, N - 1
915: q = IDAMAX( N-p+1, SVA( p ), 1 ) + p - 1
916: IF( p.NE.q ) THEN
917: TEMP1 = SVA( p )
918: SVA( p ) = SVA( q )
919: SVA( q ) = TEMP1
920: AAPQ = D( p )
921: D( p ) = D( q )
922: D( q ) = AAPQ
923: CALL ZSWAP( M, A( 1, p ), 1, A( 1, q ), 1 )
924: IF( RSVEC )CALL ZSWAP( MVL, V( 1, p ), 1, V( 1, q ), 1 )
925: END IF
926: 5991 CONTINUE
927: *
928: RETURN
929: * ..
930: * .. END OF ZGSVJ0
931: * ..
932: END