Annotation of rpl/lapack/lapack/dlatrs3.f, revision 1.1
1.1 ! bertrand 1: *> \brief \b DLATRS3 solves a triangular system of equations with the scale factors set to prevent overflow.
! 2: *
! 3: * Definition:
! 4: * ===========
! 5: *
! 6: * SUBROUTINE DLATRS3( UPLO, TRANS, DIAG, NORMIN, N, NRHS, A, LDA,
! 7: * X, LDX, SCALE, CNORM, WORK, LWORK, INFO )
! 8: *
! 9: * .. Scalar Arguments ..
! 10: * CHARACTER DIAG, NORMIN, TRANS, UPLO
! 11: * INTEGER INFO, LDA, LWORK, LDX, N, NRHS
! 12: * ..
! 13: * .. Array Arguments ..
! 14: * DOUBLE PRECISION A( LDA, * ), CNORM( * ), SCALE( * ),
! 15: * WORK( * ), X( LDX, * )
! 16: * ..
! 17: *
! 18: *
! 19: *> \par Purpose:
! 20: * =============
! 21: *>
! 22: *> \verbatim
! 23: *>
! 24: *> DLATRS3 solves one of the triangular systems
! 25: *>
! 26: *> A * X = B * diag(scale) or A**T * X = B * diag(scale)
! 27: *>
! 28: *> with scaling to prevent overflow. Here A is an upper or lower
! 29: *> triangular matrix, A**T denotes the transpose of A. X and B are
! 30: *> n by nrhs matrices and scale is an nrhs element vector of scaling
! 31: *> factors. A scaling factor scale(j) is usually less than or equal
! 32: *> to 1, chosen such that X(:,j) is less than the overflow threshold.
! 33: *> If the matrix A is singular (A(j,j) = 0 for some j), then
! 34: *> a non-trivial solution to A*X = 0 is returned. If the system is
! 35: *> so badly scaled that the solution cannot be represented as
! 36: *> (1/scale(k))*X(:,k), then x(:,k) = 0 and scale(k) is returned.
! 37: *>
! 38: *> This is a BLAS-3 version of LATRS for solving several right
! 39: *> hand sides simultaneously.
! 40: *>
! 41: *> \endverbatim
! 42: *
! 43: * Arguments:
! 44: * ==========
! 45: *
! 46: *> \param[in] UPLO
! 47: *> \verbatim
! 48: *> UPLO is CHARACTER*1
! 49: *> Specifies whether the matrix A is upper or lower triangular.
! 50: *> = 'U': Upper triangular
! 51: *> = 'L': Lower triangular
! 52: *> \endverbatim
! 53: *>
! 54: *> \param[in] TRANS
! 55: *> \verbatim
! 56: *> TRANS is CHARACTER*1
! 57: *> Specifies the operation applied to A.
! 58: *> = 'N': Solve A * x = s*b (No transpose)
! 59: *> = 'T': Solve A**T* x = s*b (Transpose)
! 60: *> = 'C': Solve A**T* x = s*b (Conjugate transpose = Transpose)
! 61: *> \endverbatim
! 62: *>
! 63: *> \param[in] DIAG
! 64: *> \verbatim
! 65: *> DIAG is CHARACTER*1
! 66: *> Specifies whether or not the matrix A is unit triangular.
! 67: *> = 'N': Non-unit triangular
! 68: *> = 'U': Unit triangular
! 69: *> \endverbatim
! 70: *>
! 71: *> \param[in] NORMIN
! 72: *> \verbatim
! 73: *> NORMIN is CHARACTER*1
! 74: *> Specifies whether CNORM has been set or not.
! 75: *> = 'Y': CNORM contains the column norms on entry
! 76: *> = 'N': CNORM is not set on entry. On exit, the norms will
! 77: *> be computed and stored in CNORM.
! 78: *> \endverbatim
! 79: *>
! 80: *> \param[in] N
! 81: *> \verbatim
! 82: *> N is INTEGER
! 83: *> The order of the matrix A. N >= 0.
! 84: *> \endverbatim
! 85: *>
! 86: *> \param[in] NRHS
! 87: *> \verbatim
! 88: *> NRHS is INTEGER
! 89: *> The number of columns of X. NRHS >= 0.
! 90: *> \endverbatim
! 91: *>
! 92: *> \param[in] A
! 93: *> \verbatim
! 94: *> A is DOUBLE PRECISION array, dimension (LDA,N)
! 95: *> The triangular matrix A. If UPLO = 'U', the leading n by n
! 96: *> upper triangular part of the array A contains the upper
! 97: *> triangular matrix, and the strictly lower triangular part of
! 98: *> A is not referenced. If UPLO = 'L', the leading n by n lower
! 99: *> triangular part of the array A contains the lower triangular
! 100: *> matrix, and the strictly upper triangular part of A is not
! 101: *> referenced. If DIAG = 'U', the diagonal elements of A are
! 102: *> also not referenced and are assumed to be 1.
! 103: *> \endverbatim
! 104: *>
! 105: *> \param[in] LDA
! 106: *> \verbatim
! 107: *> LDA is INTEGER
! 108: *> The leading dimension of the array A. LDA >= max (1,N).
! 109: *> \endverbatim
! 110: *>
! 111: *> \param[in,out] X
! 112: *> \verbatim
! 113: *> X is DOUBLE PRECISION array, dimension (LDX,NRHS)
! 114: *> On entry, the right hand side B of the triangular system.
! 115: *> On exit, X is overwritten by the solution matrix X.
! 116: *> \endverbatim
! 117: *>
! 118: *> \param[in] LDX
! 119: *> \verbatim
! 120: *> LDX is INTEGER
! 121: *> The leading dimension of the array X. LDX >= max (1,N).
! 122: *> \endverbatim
! 123: *>
! 124: *> \param[out] SCALE
! 125: *> \verbatim
! 126: *> SCALE is DOUBLE PRECISION array, dimension (NRHS)
! 127: *> The scaling factor s(k) is for the triangular system
! 128: *> A * x(:,k) = s(k)*b(:,k) or A**T* x(:,k) = s(k)*b(:,k).
! 129: *> If SCALE = 0, the matrix A is singular or badly scaled.
! 130: *> If A(j,j) = 0 is encountered, a non-trivial vector x(:,k)
! 131: *> that is an exact or approximate solution to A*x(:,k) = 0
! 132: *> is returned. If the system so badly scaled that solution
! 133: *> cannot be presented as x(:,k) * 1/s(k), then x(:,k) = 0
! 134: *> is returned.
! 135: *> \endverbatim
! 136: *>
! 137: *> \param[in,out] CNORM
! 138: *> \verbatim
! 139: *> CNORM is DOUBLE PRECISION array, dimension (N)
! 140: *>
! 141: *> If NORMIN = 'Y', CNORM is an input argument and CNORM(j)
! 142: *> contains the norm of the off-diagonal part of the j-th column
! 143: *> of A. If TRANS = 'N', CNORM(j) must be greater than or equal
! 144: *> to the infinity-norm, and if TRANS = 'T' or 'C', CNORM(j)
! 145: *> must be greater than or equal to the 1-norm.
! 146: *>
! 147: *> If NORMIN = 'N', CNORM is an output argument and CNORM(j)
! 148: *> returns the 1-norm of the offdiagonal part of the j-th column
! 149: *> of A.
! 150: *> \endverbatim
! 151: *>
! 152: *> \param[out] WORK
! 153: *> \verbatim
! 154: *> WORK is DOUBLE PRECISION array, dimension (LWORK).
! 155: *> On exit, if INFO = 0, WORK(1) returns the optimal size of
! 156: *> WORK.
! 157: *> \endverbatim
! 158: *>
! 159: *> \param[in] LWORK
! 160: *> LWORK is INTEGER
! 161: *> LWORK >= MAX(1, 2*NBA * MAX(NBA, MIN(NRHS, 32)), where
! 162: *> NBA = (N + NB - 1)/NB and NB is the optimal block size.
! 163: *>
! 164: *> If LWORK = -1, then a workspace query is assumed; the routine
! 165: *> only calculates the optimal dimensions of the WORK array, returns
! 166: *> this value as the first entry of the WORK array, and no error
! 167: *> message related to LWORK is issued by XERBLA.
! 168: *>
! 169: *> \param[out] INFO
! 170: *> \verbatim
! 171: *> INFO is INTEGER
! 172: *> = 0: successful exit
! 173: *> < 0: if INFO = -k, the k-th argument had an illegal value
! 174: *> \endverbatim
! 175: *
! 176: * Authors:
! 177: * ========
! 178: *
! 179: *> \author Univ. of Tennessee
! 180: *> \author Univ. of California Berkeley
! 181: *> \author Univ. of Colorado Denver
! 182: *> \author NAG Ltd.
! 183: *
! 184: *> \ingroup doubleOTHERauxiliary
! 185: *> \par Further Details:
! 186: * =====================
! 187: * \verbatim
! 188: * The algorithm follows the structure of a block triangular solve.
! 189: * The diagonal block is solved with a call to the robust the triangular
! 190: * solver LATRS for every right-hand side RHS = 1, ..., NRHS
! 191: * op(A( J, J )) * x( J, RHS ) = SCALOC * b( J, RHS ),
! 192: * where op( A ) = A or op( A ) = A**T.
! 193: * The linear block updates operate on block columns of X,
! 194: * B( I, K ) - op(A( I, J )) * X( J, K )
! 195: * and use GEMM. To avoid overflow in the linear block update, the worst case
! 196: * growth is estimated. For every RHS, a scale factor s <= 1.0 is computed
! 197: * such that
! 198: * || s * B( I, RHS )||_oo
! 199: * + || op(A( I, J )) ||_oo * || s * X( J, RHS ) ||_oo <= Overflow threshold
! 200: *
! 201: * Once all columns of a block column have been rescaled (BLAS-1), the linear
! 202: * update is executed with GEMM without overflow.
! 203: *
! 204: * To limit rescaling, local scale factors track the scaling of column segments.
! 205: * There is one local scale factor s( I, RHS ) per block row I = 1, ..., NBA
! 206: * per right-hand side column RHS = 1, ..., NRHS. The global scale factor
! 207: * SCALE( RHS ) is chosen as the smallest local scale factor s( I, RHS )
! 208: * I = 1, ..., NBA.
! 209: * A triangular solve op(A( J, J )) * x( J, RHS ) = SCALOC * b( J, RHS )
! 210: * updates the local scale factor s( J, RHS ) := s( J, RHS ) * SCALOC. The
! 211: * linear update of potentially inconsistently scaled vector segments
! 212: * s( I, RHS ) * b( I, RHS ) - op(A( I, J )) * ( s( J, RHS )* x( J, RHS ) )
! 213: * computes a consistent scaling SCAMIN = MIN( s(I, RHS ), s(J, RHS) ) and,
! 214: * if necessary, rescales the blocks prior to calling GEMM.
! 215: *
! 216: * \endverbatim
! 217: * =====================================================================
! 218: * References:
! 219: * C. C. Kjelgaard Mikkelsen, A. B. Schwarz and L. Karlsson (2019).
! 220: * Parallel robust solution of triangular linear systems. Concurrency
! 221: * and Computation: Practice and Experience, 31(19), e5064.
! 222: *
! 223: * Contributor:
! 224: * Angelika Schwarz, Umea University, Sweden.
! 225: *
! 226: * =====================================================================
! 227: SUBROUTINE DLATRS3( UPLO, TRANS, DIAG, NORMIN, N, NRHS, A, LDA,
! 228: $ X, LDX, SCALE, CNORM, WORK, LWORK, INFO )
! 229: IMPLICIT NONE
! 230: *
! 231: * .. Scalar Arguments ..
! 232: CHARACTER DIAG, TRANS, NORMIN, UPLO
! 233: INTEGER INFO, LDA, LWORK, LDX, N, NRHS
! 234: * ..
! 235: * .. Array Arguments ..
! 236: DOUBLE PRECISION A( LDA, * ), CNORM( * ), X( LDX, * ),
! 237: $ SCALE( * ), WORK( * )
! 238: * ..
! 239: *
! 240: * =====================================================================
! 241: *
! 242: * .. Parameters ..
! 243: DOUBLE PRECISION ZERO, ONE
! 244: PARAMETER ( ZERO = 0.0D+0, ONE = 1.0D+0 )
! 245: INTEGER NBMAX, NBMIN, NBRHS, NRHSMIN
! 246: PARAMETER ( NRHSMIN = 2, NBRHS = 32 )
! 247: PARAMETER ( NBMIN = 8, NBMAX = 64 )
! 248: * ..
! 249: * .. Local Arrays ..
! 250: DOUBLE PRECISION W( NBMAX ), XNRM( NBRHS )
! 251: * ..
! 252: * .. Local Scalars ..
! 253: LOGICAL LQUERY, NOTRAN, NOUNIT, UPPER
! 254: INTEGER AWRK, I, IFIRST, IINC, ILAST, II, I1, I2, J,
! 255: $ JFIRST, JINC, JLAST, J1, J2, K, KK, K1, K2,
! 256: $ LANRM, LDS, LSCALE, NB, NBA, NBX, RHS
! 257: DOUBLE PRECISION ANRM, BIGNUM, BNRM, RSCAL, SCAL, SCALOC,
! 258: $ SCAMIN, SMLNUM, TMAX
! 259: * ..
! 260: * .. External Functions ..
! 261: LOGICAL LSAME
! 262: INTEGER ILAENV
! 263: DOUBLE PRECISION DLAMCH, DLANGE, DLARMM
! 264: EXTERNAL DLAMCH, DLANGE, DLARMM, ILAENV, LSAME
! 265: * ..
! 266: * .. External Subroutines ..
! 267: EXTERNAL DLATRS, DSCAL, XERBLA
! 268: * ..
! 269: * .. Intrinsic Functions ..
! 270: INTRINSIC ABS, MAX, MIN
! 271: * ..
! 272: * .. Executable Statements ..
! 273: *
! 274: INFO = 0
! 275: UPPER = LSAME( UPLO, 'U' )
! 276: NOTRAN = LSAME( TRANS, 'N' )
! 277: NOUNIT = LSAME( DIAG, 'N' )
! 278: LQUERY = ( LWORK.EQ.-1 )
! 279: *
! 280: * Partition A and X into blocks
! 281: *
! 282: NB = MAX( 8, ILAENV( 1, 'DLATRS', '', N, N, -1, -1 ) )
! 283: NB = MIN( NBMAX, NB )
! 284: NBA = MAX( 1, (N + NB - 1) / NB )
! 285: NBX = MAX( 1, (NRHS + NBRHS - 1) / NBRHS )
! 286: *
! 287: * Compute the workspace
! 288: *
! 289: * The workspace comprises two parts.
! 290: * The first part stores the local scale factors. Each simultaneously
! 291: * computed right-hand side requires one local scale factor per block
! 292: * row. WORK( I+KK*LDS ) is the scale factor of the vector
! 293: * segment associated with the I-th block row and the KK-th vector
! 294: * in the block column.
! 295: LSCALE = NBA * MAX( NBA, MIN( NRHS, NBRHS ) )
! 296: LDS = NBA
! 297: * The second part stores upper bounds of the triangular A. There are
! 298: * a total of NBA x NBA blocks, of which only the upper triangular
! 299: * part or the lower triangular part is referenced. The upper bound of
! 300: * the block A( I, J ) is stored as WORK( AWRK + I + J * NBA ).
! 301: LANRM = NBA * NBA
! 302: AWRK = LSCALE
! 303: WORK( 1 ) = LSCALE + LANRM
! 304: *
! 305: * Test the input parameters
! 306: *
! 307: IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN
! 308: INFO = -1
! 309: ELSE IF( .NOT.NOTRAN .AND. .NOT.LSAME( TRANS, 'T' ) .AND. .NOT.
! 310: $ LSAME( TRANS, 'C' ) ) THEN
! 311: INFO = -2
! 312: ELSE IF( .NOT.NOUNIT .AND. .NOT.LSAME( DIAG, 'U' ) ) THEN
! 313: INFO = -3
! 314: ELSE IF( .NOT.LSAME( NORMIN, 'Y' ) .AND. .NOT.
! 315: $ LSAME( NORMIN, 'N' ) ) THEN
! 316: INFO = -4
! 317: ELSE IF( N.LT.0 ) THEN
! 318: INFO = -5
! 319: ELSE IF( NRHS.LT.0 ) THEN
! 320: INFO = -6
! 321: ELSE IF( LDA.LT.MAX( 1, N ) ) THEN
! 322: INFO = -8
! 323: ELSE IF( LDX.LT.MAX( 1, N ) ) THEN
! 324: INFO = -10
! 325: ELSE IF( .NOT.LQUERY .AND. LWORK.LT.WORK( 1 ) ) THEN
! 326: INFO = -14
! 327: END IF
! 328: IF( INFO.NE.0 ) THEN
! 329: CALL XERBLA( 'DLATRS3', -INFO )
! 330: RETURN
! 331: ELSE IF( LQUERY ) THEN
! 332: RETURN
! 333: END IF
! 334: *
! 335: * Initialize scaling factors
! 336: *
! 337: DO KK = 1, NRHS
! 338: SCALE( KK ) = ONE
! 339: END DO
! 340: *
! 341: * Quick return if possible
! 342: *
! 343: IF( MIN( N, NRHS ).EQ.0 )
! 344: $ RETURN
! 345: *
! 346: * Determine machine dependent constant to control overflow.
! 347: *
! 348: BIGNUM = DLAMCH( 'Overflow' )
! 349: SMLNUM = DLAMCH( 'Safe Minimum' )
! 350: *
! 351: * Use unblocked code for small problems
! 352: *
! 353: IF( NRHS.LT.NRHSMIN ) THEN
! 354: CALL DLATRS( UPLO, TRANS, DIAG, NORMIN, N, A, LDA, X( 1, 1),
! 355: $ SCALE( 1 ), CNORM, INFO )
! 356: DO K = 2, NRHS
! 357: CALL DLATRS( UPLO, TRANS, DIAG, 'Y', N, A, LDA, X( 1, K ),
! 358: $ SCALE( K ), CNORM, INFO )
! 359: END DO
! 360: RETURN
! 361: END IF
! 362: *
! 363: * Compute norms of blocks of A excluding diagonal blocks and find
! 364: * the block with the largest norm TMAX.
! 365: *
! 366: TMAX = ZERO
! 367: DO J = 1, NBA
! 368: J1 = (J-1)*NB + 1
! 369: J2 = MIN( J*NB, N ) + 1
! 370: IF ( UPPER ) THEN
! 371: IFIRST = 1
! 372: ILAST = J - 1
! 373: ELSE
! 374: IFIRST = J + 1
! 375: ILAST = NBA
! 376: END IF
! 377: DO I = IFIRST, ILAST
! 378: I1 = (I-1)*NB + 1
! 379: I2 = MIN( I*NB, N ) + 1
! 380: *
! 381: * Compute upper bound of A( I1:I2-1, J1:J2-1 ).
! 382: *
! 383: IF( NOTRAN ) THEN
! 384: ANRM = DLANGE( 'I', I2-I1, J2-J1, A( I1, J1 ), LDA, W )
! 385: WORK( AWRK + I+(J-1)*NBA ) = ANRM
! 386: ELSE
! 387: ANRM = DLANGE( '1', I2-I1, J2-J1, A( I1, J1 ), LDA, W )
! 388: WORK( AWRK + J+(I-1)*NBA ) = ANRM
! 389: END IF
! 390: TMAX = MAX( TMAX, ANRM )
! 391: END DO
! 392: END DO
! 393: *
! 394: IF( .NOT. TMAX.LE.DLAMCH('Overflow') ) THEN
! 395: *
! 396: * Some matrix entries have huge absolute value. At least one upper
! 397: * bound norm( A(I1:I2-1, J1:J2-1), 'I') is not a valid floating-point
! 398: * number, either due to overflow in LANGE or due to Inf in A.
! 399: * Fall back to LATRS. Set normin = 'N' for every right-hand side to
! 400: * force computation of TSCAL in LATRS to avoid the likely overflow
! 401: * in the computation of the column norms CNORM.
! 402: *
! 403: DO K = 1, NRHS
! 404: CALL DLATRS( UPLO, TRANS, DIAG, 'N', N, A, LDA, X( 1, K ),
! 405: $ SCALE( K ), CNORM, INFO )
! 406: END DO
! 407: RETURN
! 408: END IF
! 409: *
! 410: * Every right-hand side requires workspace to store NBA local scale
! 411: * factors. To save workspace, X is computed successively in block columns
! 412: * of width NBRHS, requiring a total of NBA x NBRHS space. If sufficient
! 413: * workspace is available, larger values of NBRHS or NBRHS = NRHS are viable.
! 414: DO K = 1, NBX
! 415: * Loop over block columns (index = K) of X and, for column-wise scalings,
! 416: * over individual columns (index = KK).
! 417: * K1: column index of the first column in X( J, K )
! 418: * K2: column index of the first column in X( J, K+1 )
! 419: * so the K2 - K1 is the column count of the block X( J, K )
! 420: K1 = (K-1)*NBRHS + 1
! 421: K2 = MIN( K*NBRHS, NRHS ) + 1
! 422: *
! 423: * Initialize local scaling factors of current block column X( J, K )
! 424: *
! 425: DO KK = 1, K2-K1
! 426: DO I = 1, NBA
! 427: WORK( I+KK*LDS ) = ONE
! 428: END DO
! 429: END DO
! 430: *
! 431: IF( NOTRAN ) THEN
! 432: *
! 433: * Solve A * X(:, K1:K2-1) = B * diag(scale(K1:K2-1))
! 434: *
! 435: IF( UPPER ) THEN
! 436: JFIRST = NBA
! 437: JLAST = 1
! 438: JINC = -1
! 439: ELSE
! 440: JFIRST = 1
! 441: JLAST = NBA
! 442: JINC = 1
! 443: END IF
! 444: ELSE
! 445: *
! 446: * Solve A**T * X(:, K1:K2-1) = B * diag(scale(K1:K2-1))
! 447: *
! 448: IF( UPPER ) THEN
! 449: JFIRST = 1
! 450: JLAST = NBA
! 451: JINC = 1
! 452: ELSE
! 453: JFIRST = NBA
! 454: JLAST = 1
! 455: JINC = -1
! 456: END IF
! 457: END IF
! 458: *
! 459: DO J = JFIRST, JLAST, JINC
! 460: * J1: row index of the first row in A( J, J )
! 461: * J2: row index of the first row in A( J+1, J+1 )
! 462: * so that J2 - J1 is the row count of the block A( J, J )
! 463: J1 = (J-1)*NB + 1
! 464: J2 = MIN( J*NB, N ) + 1
! 465: *
! 466: * Solve op(A( J, J )) * X( J, RHS ) = SCALOC * B( J, RHS )
! 467: * for all right-hand sides in the current block column,
! 468: * one RHS at a time.
! 469: *
! 470: DO KK = 1, K2-K1
! 471: RHS = K1 + KK - 1
! 472: IF( KK.EQ.1 ) THEN
! 473: CALL DLATRS( UPLO, TRANS, DIAG, 'N', J2-J1,
! 474: $ A( J1, J1 ), LDA, X( J1, RHS ),
! 475: $ SCALOC, CNORM, INFO )
! 476: ELSE
! 477: CALL DLATRS( UPLO, TRANS, DIAG, 'Y', J2-J1,
! 478: $ A( J1, J1 ), LDA, X( J1, RHS ),
! 479: $ SCALOC, CNORM, INFO )
! 480: END IF
! 481: * Find largest absolute value entry in the vector segment
! 482: * X( J1:J2-1, RHS ) as an upper bound for the worst case
! 483: * growth in the linear updates.
! 484: XNRM( KK ) = DLANGE( 'I', J2-J1, 1, X( J1, RHS ),
! 485: $ LDX, W )
! 486: *
! 487: IF( SCALOC .EQ. ZERO ) THEN
! 488: * LATRS found that A is singular through A(j,j) = 0.
! 489: * Reset the computation x(1:n) = 0, x(j) = 1, SCALE = 0
! 490: * and compute A*x = 0 (or A**T*x = 0). Note that
! 491: * X(J1:J2-1, KK) is set by LATRS.
! 492: SCALE( RHS ) = ZERO
! 493: DO II = 1, J1-1
! 494: X( II, KK ) = ZERO
! 495: END DO
! 496: DO II = J2, N
! 497: X( II, KK ) = ZERO
! 498: END DO
! 499: * Discard the local scale factors.
! 500: DO II = 1, NBA
! 501: WORK( II+KK*LDS ) = ONE
! 502: END DO
! 503: SCALOC = ONE
! 504: ELSE IF( SCALOC * WORK( J+KK*LDS ) .EQ. ZERO ) THEN
! 505: * LATRS computed a valid scale factor, but combined with
! 506: * the current scaling the solution does not have a
! 507: * scale factor > 0.
! 508: *
! 509: * Set WORK( J+KK*LDS ) to smallest valid scale
! 510: * factor and increase SCALOC accordingly.
! 511: SCAL = WORK( J+KK*LDS ) / SMLNUM
! 512: SCALOC = SCALOC * SCAL
! 513: WORK( J+KK*LDS ) = SMLNUM
! 514: * If LATRS overestimated the growth, x may be
! 515: * rescaled to preserve a valid combined scale
! 516: * factor WORK( J, KK ) > 0.
! 517: RSCAL = ONE / SCALOC
! 518: IF( XNRM( KK ) * RSCAL .LE. BIGNUM ) THEN
! 519: XNRM( KK ) = XNRM( KK ) * RSCAL
! 520: CALL DSCAL( J2-J1, RSCAL, X( J1, RHS ), 1 )
! 521: SCALOC = ONE
! 522: ELSE
! 523: * The system op(A) * x = b is badly scaled and its
! 524: * solution cannot be represented as (1/scale) * x.
! 525: * Set x to zero. This approach deviates from LATRS
! 526: * where a completely meaningless non-zero vector
! 527: * is returned that is not a solution to op(A) * x = b.
! 528: SCALE( RHS ) = ZERO
! 529: DO II = 1, N
! 530: X( II, KK ) = ZERO
! 531: END DO
! 532: * Discard the local scale factors.
! 533: DO II = 1, NBA
! 534: WORK( II+KK*LDS ) = ONE
! 535: END DO
! 536: SCALOC = ONE
! 537: END IF
! 538: END IF
! 539: SCALOC = SCALOC * WORK( J+KK*LDS )
! 540: WORK( J+KK*LDS ) = SCALOC
! 541: END DO
! 542: *
! 543: * Linear block updates
! 544: *
! 545: IF( NOTRAN ) THEN
! 546: IF( UPPER ) THEN
! 547: IFIRST = J - 1
! 548: ILAST = 1
! 549: IINC = -1
! 550: ELSE
! 551: IFIRST = J + 1
! 552: ILAST = NBA
! 553: IINC = 1
! 554: END IF
! 555: ELSE
! 556: IF( UPPER ) THEN
! 557: IFIRST = J + 1
! 558: ILAST = NBA
! 559: IINC = 1
! 560: ELSE
! 561: IFIRST = J - 1
! 562: ILAST = 1
! 563: IINC = -1
! 564: END IF
! 565: END IF
! 566: *
! 567: DO I = IFIRST, ILAST, IINC
! 568: * I1: row index of the first column in X( I, K )
! 569: * I2: row index of the first column in X( I+1, K )
! 570: * so the I2 - I1 is the row count of the block X( I, K )
! 571: I1 = (I-1)*NB + 1
! 572: I2 = MIN( I*NB, N ) + 1
! 573: *
! 574: * Prepare the linear update to be executed with GEMM.
! 575: * For each column, compute a consistent scaling, a
! 576: * scaling factor to survive the linear update, and
! 577: * rescale the column segments, if necesssary. Then
! 578: * the linear update is safely executed.
! 579: *
! 580: DO KK = 1, K2-K1
! 581: RHS = K1 + KK - 1
! 582: * Compute consistent scaling
! 583: SCAMIN = MIN( WORK( I + KK*LDS), WORK( J + KK*LDS ) )
! 584: *
! 585: * Compute scaling factor to survive the linear update
! 586: * simulating consistent scaling.
! 587: *
! 588: BNRM = DLANGE( 'I', I2-I1, 1, X( I1, RHS ), LDX, W )
! 589: BNRM = BNRM*( SCAMIN / WORK( I+KK*LDS ) )
! 590: XNRM( KK ) = XNRM( KK )*(SCAMIN / WORK( J+KK*LDS ))
! 591: ANRM = WORK( AWRK + I+(J-1)*NBA )
! 592: SCALOC = DLARMM( ANRM, XNRM( KK ), BNRM )
! 593: *
! 594: * Simultaneously apply the robust update factor and the
! 595: * consistency scaling factor to B( I, KK ) and B( J, KK ).
! 596: *
! 597: SCAL = ( SCAMIN / WORK( I+KK*LDS) )*SCALOC
! 598: IF( SCAL.NE.ONE ) THEN
! 599: CALL DSCAL( I2-I1, SCAL, X( I1, RHS ), 1 )
! 600: WORK( I+KK*LDS ) = SCAMIN*SCALOC
! 601: END IF
! 602: *
! 603: SCAL = ( SCAMIN / WORK( J+KK*LDS ) )*SCALOC
! 604: IF( SCAL.NE.ONE ) THEN
! 605: CALL DSCAL( J2-J1, SCAL, X( J1, RHS ), 1 )
! 606: WORK( J+KK*LDS ) = SCAMIN*SCALOC
! 607: END IF
! 608: END DO
! 609: *
! 610: IF( NOTRAN ) THEN
! 611: *
! 612: * B( I, K ) := B( I, K ) - A( I, J ) * X( J, K )
! 613: *
! 614: CALL DGEMM( 'N', 'N', I2-I1, K2-K1, J2-J1, -ONE,
! 615: $ A( I1, J1 ), LDA, X( J1, K1 ), LDX,
! 616: $ ONE, X( I1, K1 ), LDX )
! 617: ELSE
! 618: *
! 619: * B( I, K ) := B( I, K ) - A( J, I )**T * X( J, K )
! 620: *
! 621: CALL DGEMM( 'T', 'N', I2-I1, K2-K1, J2-J1, -ONE,
! 622: $ A( J1, I1 ), LDA, X( J1, K1 ), LDX,
! 623: $ ONE, X( I1, K1 ), LDX )
! 624: END IF
! 625: END DO
! 626: END DO
! 627: *
! 628: * Reduce local scaling factors
! 629: *
! 630: DO KK = 1, K2-K1
! 631: RHS = K1 + KK - 1
! 632: DO I = 1, NBA
! 633: SCALE( RHS ) = MIN( SCALE( RHS ), WORK( I+KK*LDS ) )
! 634: END DO
! 635: END DO
! 636: *
! 637: * Realize consistent scaling
! 638: *
! 639: DO KK = 1, K2-K1
! 640: RHS = K1 + KK - 1
! 641: IF( SCALE( RHS ).NE.ONE .AND. SCALE( RHS ).NE. ZERO ) THEN
! 642: DO I = 1, NBA
! 643: I1 = (I-1)*NB + 1
! 644: I2 = MIN( I*NB, N ) + 1
! 645: SCAL = SCALE( RHS ) / WORK( I+KK*LDS )
! 646: IF( SCAL.NE.ONE )
! 647: $ CALL DSCAL( I2-I1, SCAL, X( I1, RHS ), 1 )
! 648: END DO
! 649: END IF
! 650: END DO
! 651: END DO
! 652: RETURN
! 653: *
! 654: * End of DLATRS3
! 655: *
! 656: END
CVSweb interface <joel.bertrand@systella.fr>