1: *> \brief \b DLATRS3 solves a triangular system of equations with the scale factors set to prevent overflow.
2: *
3: * Definition:
4: * ===========
5: *
6: * SUBROUTINE DLATRS3( UPLO, TRANS, DIAG, NORMIN, N, NRHS, A, LDA,
7: * X, LDX, SCALE, CNORM, WORK, LWORK, INFO )
8: *
9: * .. Scalar Arguments ..
10: * CHARACTER DIAG, NORMIN, TRANS, UPLO
11: * INTEGER INFO, LDA, LWORK, LDX, N, NRHS
12: * ..
13: * .. Array Arguments ..
14: * DOUBLE PRECISION A( LDA, * ), CNORM( * ), SCALE( * ),
15: * WORK( * ), X( LDX, * )
16: * ..
17: *
18: *
19: *> \par Purpose:
20: * =============
21: *>
22: *> \verbatim
23: *>
24: *> DLATRS3 solves one of the triangular systems
25: *>
26: *> A * X = B * diag(scale) or A**T * X = B * diag(scale)
27: *>
28: *> with scaling to prevent overflow. Here A is an upper or lower
29: *> triangular matrix, A**T denotes the transpose of A. X and B are
30: *> n by nrhs matrices and scale is an nrhs element vector of scaling
31: *> factors. A scaling factor scale(j) is usually less than or equal
32: *> to 1, chosen such that X(:,j) is less than the overflow threshold.
33: *> If the matrix A is singular (A(j,j) = 0 for some j), then
34: *> a non-trivial solution to A*X = 0 is returned. If the system is
35: *> so badly scaled that the solution cannot be represented as
36: *> (1/scale(k))*X(:,k), then x(:,k) = 0 and scale(k) is returned.
37: *>
38: *> This is a BLAS-3 version of LATRS for solving several right
39: *> hand sides simultaneously.
40: *>
41: *> \endverbatim
42: *
43: * Arguments:
44: * ==========
45: *
46: *> \param[in] UPLO
47: *> \verbatim
48: *> UPLO is CHARACTER*1
49: *> Specifies whether the matrix A is upper or lower triangular.
50: *> = 'U': Upper triangular
51: *> = 'L': Lower triangular
52: *> \endverbatim
53: *>
54: *> \param[in] TRANS
55: *> \verbatim
56: *> TRANS is CHARACTER*1
57: *> Specifies the operation applied to A.
58: *> = 'N': Solve A * x = s*b (No transpose)
59: *> = 'T': Solve A**T* x = s*b (Transpose)
60: *> = 'C': Solve A**T* x = s*b (Conjugate transpose = Transpose)
61: *> \endverbatim
62: *>
63: *> \param[in] DIAG
64: *> \verbatim
65: *> DIAG is CHARACTER*1
66: *> Specifies whether or not the matrix A is unit triangular.
67: *> = 'N': Non-unit triangular
68: *> = 'U': Unit triangular
69: *> \endverbatim
70: *>
71: *> \param[in] NORMIN
72: *> \verbatim
73: *> NORMIN is CHARACTER*1
74: *> Specifies whether CNORM has been set or not.
75: *> = 'Y': CNORM contains the column norms on entry
76: *> = 'N': CNORM is not set on entry. On exit, the norms will
77: *> be computed and stored in CNORM.
78: *> \endverbatim
79: *>
80: *> \param[in] N
81: *> \verbatim
82: *> N is INTEGER
83: *> The order of the matrix A. N >= 0.
84: *> \endverbatim
85: *>
86: *> \param[in] NRHS
87: *> \verbatim
88: *> NRHS is INTEGER
89: *> The number of columns of X. NRHS >= 0.
90: *> \endverbatim
91: *>
92: *> \param[in] A
93: *> \verbatim
94: *> A is DOUBLE PRECISION array, dimension (LDA,N)
95: *> The triangular matrix A. If UPLO = 'U', the leading n by n
96: *> upper triangular part of the array A contains the upper
97: *> triangular matrix, and the strictly lower triangular part of
98: *> A is not referenced. If UPLO = 'L', the leading n by n lower
99: *> triangular part of the array A contains the lower triangular
100: *> matrix, and the strictly upper triangular part of A is not
101: *> referenced. If DIAG = 'U', the diagonal elements of A are
102: *> also not referenced and are assumed to be 1.
103: *> \endverbatim
104: *>
105: *> \param[in] LDA
106: *> \verbatim
107: *> LDA is INTEGER
108: *> The leading dimension of the array A. LDA >= max (1,N).
109: *> \endverbatim
110: *>
111: *> \param[in,out] X
112: *> \verbatim
113: *> X is DOUBLE PRECISION array, dimension (LDX,NRHS)
114: *> On entry, the right hand side B of the triangular system.
115: *> On exit, X is overwritten by the solution matrix X.
116: *> \endverbatim
117: *>
118: *> \param[in] LDX
119: *> \verbatim
120: *> LDX is INTEGER
121: *> The leading dimension of the array X. LDX >= max (1,N).
122: *> \endverbatim
123: *>
124: *> \param[out] SCALE
125: *> \verbatim
126: *> SCALE is DOUBLE PRECISION array, dimension (NRHS)
127: *> The scaling factor s(k) is for the triangular system
128: *> A * x(:,k) = s(k)*b(:,k) or A**T* x(:,k) = s(k)*b(:,k).
129: *> If SCALE = 0, the matrix A is singular or badly scaled.
130: *> If A(j,j) = 0 is encountered, a non-trivial vector x(:,k)
131: *> that is an exact or approximate solution to A*x(:,k) = 0
132: *> is returned. If the system so badly scaled that solution
133: *> cannot be presented as x(:,k) * 1/s(k), then x(:,k) = 0
134: *> is returned.
135: *> \endverbatim
136: *>
137: *> \param[in,out] CNORM
138: *> \verbatim
139: *> CNORM is DOUBLE PRECISION array, dimension (N)
140: *>
141: *> If NORMIN = 'Y', CNORM is an input argument and CNORM(j)
142: *> contains the norm of the off-diagonal part of the j-th column
143: *> of A. If TRANS = 'N', CNORM(j) must be greater than or equal
144: *> to the infinity-norm, and if TRANS = 'T' or 'C', CNORM(j)
145: *> must be greater than or equal to the 1-norm.
146: *>
147: *> If NORMIN = 'N', CNORM is an output argument and CNORM(j)
148: *> returns the 1-norm of the offdiagonal part of the j-th column
149: *> of A.
150: *> \endverbatim
151: *>
152: *> \param[out] WORK
153: *> \verbatim
154: *> WORK is DOUBLE PRECISION array, dimension (LWORK).
155: *> On exit, if INFO = 0, WORK(1) returns the optimal size of
156: *> WORK.
157: *> \endverbatim
158: *>
159: *> \param[in] LWORK
160: *> LWORK is INTEGER
161: *> LWORK >= MAX(1, 2*NBA * MAX(NBA, MIN(NRHS, 32)), where
162: *> NBA = (N + NB - 1)/NB and NB is the optimal block size.
163: *>
164: *> If LWORK = -1, then a workspace query is assumed; the routine
165: *> only calculates the optimal dimensions of the WORK array, returns
166: *> this value as the first entry of the WORK array, and no error
167: *> message related to LWORK is issued by XERBLA.
168: *>
169: *> \param[out] INFO
170: *> \verbatim
171: *> INFO is INTEGER
172: *> = 0: successful exit
173: *> < 0: if INFO = -k, the k-th argument had an illegal value
174: *> \endverbatim
175: *
176: * Authors:
177: * ========
178: *
179: *> \author Univ. of Tennessee
180: *> \author Univ. of California Berkeley
181: *> \author Univ. of Colorado Denver
182: *> \author NAG Ltd.
183: *
184: *> \ingroup doubleOTHERauxiliary
185: *> \par Further Details:
186: * =====================
187: * \verbatim
188: * The algorithm follows the structure of a block triangular solve.
189: * The diagonal block is solved with a call to the robust the triangular
190: * solver LATRS for every right-hand side RHS = 1, ..., NRHS
191: * op(A( J, J )) * x( J, RHS ) = SCALOC * b( J, RHS ),
192: * where op( A ) = A or op( A ) = A**T.
193: * The linear block updates operate on block columns of X,
194: * B( I, K ) - op(A( I, J )) * X( J, K )
195: * and use GEMM. To avoid overflow in the linear block update, the worst case
196: * growth is estimated. For every RHS, a scale factor s <= 1.0 is computed
197: * such that
198: * || s * B( I, RHS )||_oo
199: * + || op(A( I, J )) ||_oo * || s * X( J, RHS ) ||_oo <= Overflow threshold
200: *
201: * Once all columns of a block column have been rescaled (BLAS-1), the linear
202: * update is executed with GEMM without overflow.
203: *
204: * To limit rescaling, local scale factors track the scaling of column segments.
205: * There is one local scale factor s( I, RHS ) per block row I = 1, ..., NBA
206: * per right-hand side column RHS = 1, ..., NRHS. The global scale factor
207: * SCALE( RHS ) is chosen as the smallest local scale factor s( I, RHS )
208: * I = 1, ..., NBA.
209: * A triangular solve op(A( J, J )) * x( J, RHS ) = SCALOC * b( J, RHS )
210: * updates the local scale factor s( J, RHS ) := s( J, RHS ) * SCALOC. The
211: * linear update of potentially inconsistently scaled vector segments
212: * s( I, RHS ) * b( I, RHS ) - op(A( I, J )) * ( s( J, RHS )* x( J, RHS ) )
213: * computes a consistent scaling SCAMIN = MIN( s(I, RHS ), s(J, RHS) ) and,
214: * if necessary, rescales the blocks prior to calling GEMM.
215: *
216: * \endverbatim
217: * =====================================================================
218: * References:
219: * C. C. Kjelgaard Mikkelsen, A. B. Schwarz and L. Karlsson (2019).
220: * Parallel robust solution of triangular linear systems. Concurrency
221: * and Computation: Practice and Experience, 31(19), e5064.
222: *
223: * Contributor:
224: * Angelika Schwarz, Umea University, Sweden.
225: *
226: * =====================================================================
227: SUBROUTINE DLATRS3( UPLO, TRANS, DIAG, NORMIN, N, NRHS, A, LDA,
228: $ X, LDX, SCALE, CNORM, WORK, LWORK, INFO )
229: IMPLICIT NONE
230: *
231: * .. Scalar Arguments ..
232: CHARACTER DIAG, TRANS, NORMIN, UPLO
233: INTEGER INFO, LDA, LWORK, LDX, N, NRHS
234: * ..
235: * .. Array Arguments ..
236: DOUBLE PRECISION A( LDA, * ), CNORM( * ), X( LDX, * ),
237: $ SCALE( * ), WORK( * )
238: * ..
239: *
240: * =====================================================================
241: *
242: * .. Parameters ..
243: DOUBLE PRECISION ZERO, ONE
244: PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 )
245: INTEGER NBMAX, NBMIN, NBRHS, NRHSMIN
246: PARAMETER ( NRHSMIN = 2, NBRHS = 32 )
247: PARAMETER ( NBMIN = 8, NBMAX = 64 )
248: * ..
249: * .. Local Arrays ..
250: DOUBLE PRECISION W( NBMAX ), XNRM( NBRHS )
251: * ..
252: * .. Local Scalars ..
253: LOGICAL LQUERY, NOTRAN, NOUNIT, UPPER
254: INTEGER AWRK, I, IFIRST, IINC, ILAST, II, I1, I2, J,
255: $ JFIRST, JINC, JLAST, J1, J2, K, KK, K1, K2,
256: $ LANRM, LDS, LSCALE, NB, NBA, NBX, RHS
257: DOUBLE PRECISION ANRM, BIGNUM, BNRM, RSCAL, SCAL, SCALOC,
258: $ SCAMIN, SMLNUM, TMAX
259: * ..
260: * .. External Functions ..
261: LOGICAL LSAME
262: INTEGER ILAENV
263: DOUBLE PRECISION DLAMCH, DLANGE, DLARMM
264: EXTERNAL DLAMCH, DLANGE, DLARMM, ILAENV, LSAME
265: * ..
266: * .. External Subroutines ..
267: EXTERNAL DLATRS, DSCAL, XERBLA
268: * ..
269: * .. Intrinsic Functions ..
270: INTRINSIC ABS, MAX, MIN
271: * ..
272: * .. Executable Statements ..
273: *
274: INFO = 0
275: UPPER = LSAME( UPLO, 'U' )
276: NOTRAN = LSAME( TRANS, 'N' )
277: NOUNIT = LSAME( DIAG, 'N' )
278: LQUERY = ( LWORK.EQ.-1 )
279: *
280: * Partition A and X into blocks
281: *
282: NB = MAX( 8, ILAENV( 1, 'DLATRS', '', N, N, -1, -1 ) )
283: NB = MIN( NBMAX, NB )
284: NBA = MAX( 1, (N + NB - 1) / NB )
285: NBX = MAX( 1, (NRHS + NBRHS - 1) / NBRHS )
286: *
287: * Compute the workspace
288: *
289: * The workspace comprises two parts.
290: * The first part stores the local scale factors. Each simultaneously
291: * computed right-hand side requires one local scale factor per block
292: * row. WORK( I+KK*LDS ) is the scale factor of the vector
293: * segment associated with the I-th block row and the KK-th vector
294: * in the block column.
295: LSCALE = NBA * MAX( NBA, MIN( NRHS, NBRHS ) )
296: LDS = NBA
297: * The second part stores upper bounds of the triangular A. There are
298: * a total of NBA x NBA blocks, of which only the upper triangular
299: * part or the lower triangular part is referenced. The upper bound of
300: * the block A( I, J ) is stored as WORK( AWRK + I + J * NBA ).
301: LANRM = NBA * NBA
302: AWRK = LSCALE
303: WORK( 1 ) = LSCALE + LANRM
304: *
305: * Test the input parameters
306: *
307: IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN
308: INFO = -1
309: ELSE IF( .NOT.NOTRAN .AND. .NOT.LSAME( TRANS, 'T' ) .AND. .NOT.
310: $ LSAME( TRANS, 'C' ) ) THEN
311: INFO = -2
312: ELSE IF( .NOT.NOUNIT .AND. .NOT.LSAME( DIAG, 'U' ) ) THEN
313: INFO = -3
314: ELSE IF( .NOT.LSAME( NORMIN, 'Y' ) .AND. .NOT.
315: $ LSAME( NORMIN, 'N' ) ) THEN
316: INFO = -4
317: ELSE IF( N.LT.0 ) THEN
318: INFO = -5
319: ELSE IF( NRHS.LT.0 ) THEN
320: INFO = -6
321: ELSE IF( LDA.LT.MAX( 1, N ) ) THEN
322: INFO = -8
323: ELSE IF( LDX.LT.MAX( 1, N ) ) THEN
324: INFO = -10
325: ELSE IF( .NOT.LQUERY .AND. LWORK.LT.WORK( 1 ) ) THEN
326: INFO = -14
327: END IF
328: IF( INFO.NE.0 ) THEN
329: CALL XERBLA( 'DLATRS3', -INFO )
330: RETURN
331: ELSE IF( LQUERY ) THEN
332: RETURN
333: END IF
334: *
335: * Initialize scaling factors
336: *
337: DO KK = 1, NRHS
338: SCALE( KK ) = ONE
339: END DO
340: *
341: * Quick return if possible
342: *
343: IF( MIN( N, NRHS ).EQ.0 )
344: $ RETURN
345: *
346: * Determine machine dependent constant to control overflow.
347: *
348: BIGNUM = DLAMCH( 'Overflow' )
349: SMLNUM = DLAMCH( 'Safe Minimum' )
350: *
351: * Use unblocked code for small problems
352: *
353: IF( NRHS.LT.NRHSMIN ) THEN
354: CALL DLATRS( UPLO, TRANS, DIAG, NORMIN, N, A, LDA, X( 1, 1),
355: $ SCALE( 1 ), CNORM, INFO )
356: DO K = 2, NRHS
357: CALL DLATRS( UPLO, TRANS, DIAG, 'Y', N, A, LDA, X( 1, K ),
358: $ SCALE( K ), CNORM, INFO )
359: END DO
360: RETURN
361: END IF
362: *
363: * Compute norms of blocks of A excluding diagonal blocks and find
364: * the block with the largest norm TMAX.
365: *
366: TMAX = ZERO
367: DO J = 1, NBA
368: J1 = (J-1)*NB + 1
369: J2 = MIN( J*NB, N ) + 1
370: IF ( UPPER ) THEN
371: IFIRST = 1
372: ILAST = J - 1
373: ELSE
374: IFIRST = J + 1
375: ILAST = NBA
376: END IF
377: DO I = IFIRST, ILAST
378: I1 = (I-1)*NB + 1
379: I2 = MIN( I*NB, N ) + 1
380: *
381: * Compute upper bound of A( I1:I2-1, J1:J2-1 ).
382: *
383: IF( NOTRAN ) THEN
384: ANRM = DLANGE( 'I', I2-I1, J2-J1, A( I1, J1 ), LDA, W )
385: WORK( AWRK + I+(J-1)*NBA ) = ANRM
386: ELSE
387: ANRM = DLANGE( '1', I2-I1, J2-J1, A( I1, J1 ), LDA, W )
388: WORK( AWRK + J+(I-1)*NBA ) = ANRM
389: END IF
390: TMAX = MAX( TMAX, ANRM )
391: END DO
392: END DO
393: *
394: IF( .NOT. TMAX.LE.DLAMCH('Overflow') ) THEN
395: *
396: * Some matrix entries have huge absolute value. At least one upper
397: * bound norm( A(I1:I2-1, J1:J2-1), 'I') is not a valid floating-point
398: * number, either due to overflow in LANGE or due to Inf in A.
399: * Fall back to LATRS. Set normin = 'N' for every right-hand side to
400: * force computation of TSCAL in LATRS to avoid the likely overflow
401: * in the computation of the column norms CNORM.
402: *
403: DO K = 1, NRHS
404: CALL DLATRS( UPLO, TRANS, DIAG, 'N', N, A, LDA, X( 1, K ),
405: $ SCALE( K ), CNORM, INFO )
406: END DO
407: RETURN
408: END IF
409: *
410: * Every right-hand side requires workspace to store NBA local scale
411: * factors. To save workspace, X is computed successively in block columns
412: * of width NBRHS, requiring a total of NBA x NBRHS space. If sufficient
413: * workspace is available, larger values of NBRHS or NBRHS = NRHS are viable.
414: DO K = 1, NBX
415: * Loop over block columns (index = K) of X and, for column-wise scalings,
416: * over individual columns (index = KK).
417: * K1: column index of the first column in X( J, K )
418: * K2: column index of the first column in X( J, K+1 )
419: * so the K2 - K1 is the column count of the block X( J, K )
420: K1 = (K-1)*NBRHS + 1
421: K2 = MIN( K*NBRHS, NRHS ) + 1
422: *
423: * Initialize local scaling factors of current block column X( J, K )
424: *
425: DO KK = 1, K2-K1
426: DO I = 1, NBA
427: WORK( I+KK*LDS ) = ONE
428: END DO
429: END DO
430: *
431: IF( NOTRAN ) THEN
432: *
433: * Solve A * X(:, K1:K2-1) = B * diag(scale(K1:K2-1))
434: *
435: IF( UPPER ) THEN
436: JFIRST = NBA
437: JLAST = 1
438: JINC = -1
439: ELSE
440: JFIRST = 1
441: JLAST = NBA
442: JINC = 1
443: END IF
444: ELSE
445: *
446: * Solve A**T * X(:, K1:K2-1) = B * diag(scale(K1:K2-1))
447: *
448: IF( UPPER ) THEN
449: JFIRST = 1
450: JLAST = NBA
451: JINC = 1
452: ELSE
453: JFIRST = NBA
454: JLAST = 1
455: JINC = -1
456: END IF
457: END IF
458: *
459: DO J = JFIRST, JLAST, JINC
460: * J1: row index of the first row in A( J, J )
461: * J2: row index of the first row in A( J+1, J+1 )
462: * so that J2 - J1 is the row count of the block A( J, J )
463: J1 = (J-1)*NB + 1
464: J2 = MIN( J*NB, N ) + 1
465: *
466: * Solve op(A( J, J )) * X( J, RHS ) = SCALOC * B( J, RHS )
467: * for all right-hand sides in the current block column,
468: * one RHS at a time.
469: *
470: DO KK = 1, K2-K1
471: RHS = K1 + KK - 1
472: IF( KK.EQ.1 ) THEN
473: CALL DLATRS( UPLO, TRANS, DIAG, 'N', J2-J1,
474: $ A( J1, J1 ), LDA, X( J1, RHS ),
475: $ SCALOC, CNORM, INFO )
476: ELSE
477: CALL DLATRS( UPLO, TRANS, DIAG, 'Y', J2-J1,
478: $ A( J1, J1 ), LDA, X( J1, RHS ),
479: $ SCALOC, CNORM, INFO )
480: END IF
481: * Find largest absolute value entry in the vector segment
482: * X( J1:J2-1, RHS ) as an upper bound for the worst case
483: * growth in the linear updates.
484: XNRM( KK ) = DLANGE( 'I', J2-J1, 1, X( J1, RHS ),
485: $ LDX, W )
486: *
487: IF( SCALOC .EQ. ZERO ) THEN
488: * LATRS found that A is singular through A(j,j) = 0.
489: * Reset the computation x(1:n) = 0, x(j) = 1, SCALE = 0
490: * and compute A*x = 0 (or A**T*x = 0). Note that
491: * X(J1:J2-1, KK) is set by LATRS.
492: SCALE( RHS ) = ZERO
493: DO II = 1, J1-1
494: X( II, KK ) = ZERO
495: END DO
496: DO II = J2, N
497: X( II, KK ) = ZERO
498: END DO
499: * Discard the local scale factors.
500: DO II = 1, NBA
501: WORK( II+KK*LDS ) = ONE
502: END DO
503: SCALOC = ONE
504: ELSE IF( SCALOC * WORK( J+KK*LDS ) .EQ. ZERO ) THEN
505: * LATRS computed a valid scale factor, but combined with
506: * the current scaling the solution does not have a
507: * scale factor > 0.
508: *
509: * Set WORK( J+KK*LDS ) to smallest valid scale
510: * factor and increase SCALOC accordingly.
511: SCAL = WORK( J+KK*LDS ) / SMLNUM
512: SCALOC = SCALOC * SCAL
513: WORK( J+KK*LDS ) = SMLNUM
514: * If LATRS overestimated the growth, x may be
515: * rescaled to preserve a valid combined scale
516: * factor WORK( J, KK ) > 0.
517: RSCAL = ONE / SCALOC
518: IF( XNRM( KK ) * RSCAL .LE. BIGNUM ) THEN
519: XNRM( KK ) = XNRM( KK ) * RSCAL
520: CALL DSCAL( J2-J1, RSCAL, X( J1, RHS ), 1 )
521: SCALOC = ONE
522: ELSE
523: * The system op(A) * x = b is badly scaled and its
524: * solution cannot be represented as (1/scale) * x.
525: * Set x to zero. This approach deviates from LATRS
526: * where a completely meaningless non-zero vector
527: * is returned that is not a solution to op(A) * x = b.
528: SCALE( RHS ) = ZERO
529: DO II = 1, N
530: X( II, KK ) = ZERO
531: END DO
532: * Discard the local scale factors.
533: DO II = 1, NBA
534: WORK( II+KK*LDS ) = ONE
535: END DO
536: SCALOC = ONE
537: END IF
538: END IF
539: SCALOC = SCALOC * WORK( J+KK*LDS )
540: WORK( J+KK*LDS ) = SCALOC
541: END DO
542: *
543: * Linear block updates
544: *
545: IF( NOTRAN ) THEN
546: IF( UPPER ) THEN
547: IFIRST = J - 1
548: ILAST = 1
549: IINC = -1
550: ELSE
551: IFIRST = J + 1
552: ILAST = NBA
553: IINC = 1
554: END IF
555: ELSE
556: IF( UPPER ) THEN
557: IFIRST = J + 1
558: ILAST = NBA
559: IINC = 1
560: ELSE
561: IFIRST = J - 1
562: ILAST = 1
563: IINC = -1
564: END IF
565: END IF
566: *
567: DO I = IFIRST, ILAST, IINC
568: * I1: row index of the first column in X( I, K )
569: * I2: row index of the first column in X( I+1, K )
570: * so the I2 - I1 is the row count of the block X( I, K )
571: I1 = (I-1)*NB + 1
572: I2 = MIN( I*NB, N ) + 1
573: *
574: * Prepare the linear update to be executed with GEMM.
575: * For each column, compute a consistent scaling, a
576: * scaling factor to survive the linear update, and
577: * rescale the column segments, if necesssary. Then
578: * the linear update is safely executed.
579: *
580: DO KK = 1, K2-K1
581: RHS = K1 + KK - 1
582: * Compute consistent scaling
583: SCAMIN = MIN( WORK( I + KK*LDS), WORK( J + KK*LDS ) )
584: *
585: * Compute scaling factor to survive the linear update
586: * simulating consistent scaling.
587: *
588: BNRM = DLANGE( 'I', I2-I1, 1, X( I1, RHS ), LDX, W )
589: BNRM = BNRM*( SCAMIN / WORK( I+KK*LDS ) )
590: XNRM( KK ) = XNRM( KK )*(SCAMIN / WORK( J+KK*LDS ))
591: ANRM = WORK( AWRK + I+(J-1)*NBA )
592: SCALOC = DLARMM( ANRM, XNRM( KK ), BNRM )
593: *
594: * Simultaneously apply the robust update factor and the
595: * consistency scaling factor to B( I, KK ) and B( J, KK ).
596: *
597: SCAL = ( SCAMIN / WORK( I+KK*LDS) )*SCALOC
598: IF( SCAL.NE.ONE ) THEN
599: CALL DSCAL( I2-I1, SCAL, X( I1, RHS ), 1 )
600: WORK( I+KK*LDS ) = SCAMIN*SCALOC
601: END IF
602: *
603: SCAL = ( SCAMIN / WORK( J+KK*LDS ) )*SCALOC
604: IF( SCAL.NE.ONE ) THEN
605: CALL DSCAL( J2-J1, SCAL, X( J1, RHS ), 1 )
606: WORK( J+KK*LDS ) = SCAMIN*SCALOC
607: END IF
608: END DO
609: *
610: IF( NOTRAN ) THEN
611: *
612: * B( I, K ) := B( I, K ) - A( I, J ) * X( J, K )
613: *
614: CALL DGEMM( 'N', 'N', I2-I1, K2-K1, J2-J1, -ONE,
615: $ A( I1, J1 ), LDA, X( J1, K1 ), LDX,
616: $ ONE, X( I1, K1 ), LDX )
617: ELSE
618: *
619: * B( I, K ) := B( I, K ) - A( J, I )**T * X( J, K )
620: *
621: CALL DGEMM( 'T', 'N', I2-I1, K2-K1, J2-J1, -ONE,
622: $ A( J1, I1 ), LDA, X( J1, K1 ), LDX,
623: $ ONE, X( I1, K1 ), LDX )
624: END IF
625: END DO
626: END DO
627: *
628: * Reduce local scaling factors
629: *
630: DO KK = 1, K2-K1
631: RHS = K1 + KK - 1
632: DO I = 1, NBA
633: SCALE( RHS ) = MIN( SCALE( RHS ), WORK( I+KK*LDS ) )
634: END DO
635: END DO
636: *
637: * Realize consistent scaling
638: *
639: DO KK = 1, K2-K1
640: RHS = K1 + KK - 1
641: IF( SCALE( RHS ).NE.ONE .AND. SCALE( RHS ).NE. ZERO ) THEN
642: DO I = 1, NBA
643: I1 = (I-1)*NB + 1
644: I2 = MIN( I*NB, N ) + 1
645: SCAL = SCALE( RHS ) / WORK( I+KK*LDS )
646: IF( SCAL.NE.ONE )
647: $ CALL DSCAL( I2-I1, SCAL, X( I1, RHS ), 1 )
648: END DO
649: END IF
650: END DO
651: END DO
652: RETURN
653: *
654: * End of DLATRS3
655: *
656: END
CVSweb interface <joel.bertrand@systella.fr>