Annotation of rpl/lapack/lapack/zlaunhr_col_getrfnp2.f, revision 1.2

1.1       bertrand    1: *> \brief \b ZLAUNHR_COL_GETRFNP2
                      2: *
                      3: *  =========== DOCUMENTATION ===========
                      4: *
                      5: * Online html documentation available at
                      6: *            http://www.netlib.org/lapack/explore-html/
                      7: *
                      8: *> \htmlonly
                      9: *> Download ZLAUNHR_COL_GETRFNP2 + dependencies
                     10: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/zlaunhr_col_getrfnp2.f">
                     11: *> [TGZ]</a>
                     12: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/zlaunhr_col_getrfnp2.f">
                     13: *> [ZIP]</a>
                     14: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/zlaunhr_col_getrfnp2.f">
                     15: *> [TXT]</a>
                     16: *> \endhtmlonly
                     17: *
                     18: *  Definition:
                     19: *  ===========
                     20: *
                     21: *       RECURSIVE SUBROUTINE ZLAUNHR_COL_GETRFNP2( M, N, A, LDA, D, INFO )
                     22: *
                     23: *       .. Scalar Arguments ..
                     24: *       INTEGER            INFO, LDA, M, N
                     25: *       ..
                     26: *       .. Array Arguments ..
                     27: *       COMPLEX*16         A( LDA, * ), D( * )
                     28: *       ..
                     29: *
                     30: *
                     31: *> \par Purpose:
                     32: *  =============
                     33: *>
                     34: *> \verbatim
                     35: *>
                     36: *> ZLAUNHR_COL_GETRFNP2 computes the modified LU factorization without
                     37: *> pivoting of a complex general M-by-N matrix A. The factorization has
                     38: *> the form:
                     39: *>
                     40: *>     A - S = L * U,
                     41: *>
                     42: *> where:
                     43: *>    S is a m-by-n diagonal sign matrix with the diagonal D, so that
                     44: *>    D(i) = S(i,i), 1 <= i <= min(M,N). The diagonal D is constructed
                     45: *>    as D(i)=-SIGN(A(i,i)), where A(i,i) is the value after performing
                     46: *>    i-1 steps of Gaussian elimination. This means that the diagonal
                     47: *>    element at each step of "modified" Gaussian elimination is at
                     48: *>    least one in absolute value (so that division-by-zero not
                     49: *>    possible during the division by the diagonal element);
                     50: *>
                     51: *>    L is a M-by-N lower triangular matrix with unit diagonal elements
                     52: *>    (lower trapezoidal if M > N);
                     53: *>
                     54: *>    and U is a M-by-N upper triangular matrix
                     55: *>    (upper trapezoidal if M < N).
                     56: *>
                     57: *> This routine is an auxiliary routine used in the Householder
                     58: *> reconstruction routine ZUNHR_COL. In ZUNHR_COL, this routine is
                     59: *> applied to an M-by-N matrix A with orthonormal columns, where each
                     60: *> element is bounded by one in absolute value. With the choice of
                     61: *> the matrix S above, one can show that the diagonal element at each
                     62: *> step of Gaussian elimination is the largest (in absolute value) in
                     63: *> the column on or below the diagonal, so that no pivoting is required
                     64: *> for numerical stability [1].
                     65: *>
                     66: *> For more details on the Householder reconstruction algorithm,
                     67: *> including the modified LU factorization, see [1].
                     68: *>
                     69: *> This is the recursive version of the LU factorization algorithm.
                     70: *> Denote A - S by B. The algorithm divides the matrix B into four
                     71: *> submatrices:
                     72: *>
                     73: *>        [  B11 | B12  ]  where B11 is n1 by n1,
                     74: *>    B = [ -----|----- ]        B21 is (m-n1) by n1,
                     75: *>        [  B21 | B22  ]        B12 is n1 by n2,
                     76: *>                               B22 is (m-n1) by n2,
                     77: *>                               with n1 = min(m,n)/2, n2 = n-n1.
                     78: *>
                     79: *>
                     80: *> The subroutine calls itself to factor B11, solves for B21,
                     81: *> solves for B12, updates B22, then calls itself to factor B22.
                     82: *>
                     83: *> For more details on the recursive LU algorithm, see [2].
                     84: *>
                     85: *> ZLAUNHR_COL_GETRFNP2 is called to factorize a block by the blocked
                     86: *> routine ZLAUNHR_COL_GETRFNP, which uses blocked code calling
1.2     ! bertrand   87: *> Level 3 BLAS to update the submatrix. However, ZLAUNHR_COL_GETRFNP2
1.1       bertrand   88: *> is self-sufficient and can be used without ZLAUNHR_COL_GETRFNP.
                     89: *>
                     90: *> [1] "Reconstructing Householder vectors from tall-skinny QR",
                     91: *>     G. Ballard, J. Demmel, L. Grigori, M. Jacquelin, H.D. Nguyen,
                     92: *>     E. Solomonik, J. Parallel Distrib. Comput.,
                     93: *>     vol. 85, pp. 3-31, 2015.
                     94: *>
                     95: *> [2] "Recursion leads to automatic variable blocking for dense linear
                     96: *>     algebra algorithms", F. Gustavson, IBM J. of Res. and Dev.,
                     97: *>     vol. 41, no. 6, pp. 737-755, 1997.
                     98: *> \endverbatim
                     99: *
                    100: *  Arguments:
                    101: *  ==========
                    102: *
                    103: *> \param[in] M
                    104: *> \verbatim
                    105: *>          M is INTEGER
                    106: *>          The number of rows of the matrix A.  M >= 0.
                    107: *> \endverbatim
                    108: *>
                    109: *> \param[in] N
                    110: *> \verbatim
                    111: *>          N is INTEGER
                    112: *>          The number of columns of the matrix A.  N >= 0.
                    113: *> \endverbatim
                    114: *>
                    115: *> \param[in,out] A
                    116: *> \verbatim
                    117: *>          A is COMPLEX*16 array, dimension (LDA,N)
                    118: *>          On entry, the M-by-N matrix to be factored.
                    119: *>          On exit, the factors L and U from the factorization
                    120: *>          A-S=L*U; the unit diagonal elements of L are not stored.
                    121: *> \endverbatim
                    122: *>
                    123: *> \param[in] LDA
                    124: *> \verbatim
                    125: *>          LDA is INTEGER
                    126: *>          The leading dimension of the array A.  LDA >= max(1,M).
                    127: *> \endverbatim
                    128: *>
                    129: *> \param[out] D
                    130: *> \verbatim
                    131: *>          D is COMPLEX*16 array, dimension min(M,N)
                    132: *>          The diagonal elements of the diagonal M-by-N sign matrix S,
                    133: *>          D(i) = S(i,i), where 1 <= i <= min(M,N). The elements can be
                    134: *>          only ( +1.0, 0.0 ) or (-1.0, 0.0 ).
                    135: *> \endverbatim
                    136: *>
                    137: *> \param[out] INFO
                    138: *> \verbatim
                    139: *>          INFO is INTEGER
                    140: *>          = 0:  successful exit
                    141: *>          < 0:  if INFO = -i, the i-th argument had an illegal value
                    142: *> \endverbatim
                    143: *>
                    144: *  Authors:
                    145: *  ========
                    146: *
                    147: *> \author Univ. of Tennessee
                    148: *> \author Univ. of California Berkeley
                    149: *> \author Univ. of Colorado Denver
                    150: *> \author NAG Ltd.
                    151: *
                    152: *> \ingroup complex16GEcomputational
                    153: *
                    154: *> \par Contributors:
                    155: *  ==================
                    156: *>
                    157: *> \verbatim
                    158: *>
                    159: *> November 2019, Igor Kozachenko,
                    160: *>                Computer Science Division,
                    161: *>                University of California, Berkeley
                    162: *>
                    163: *> \endverbatim
                    164: *
                    165: *  =====================================================================
                    166:       RECURSIVE SUBROUTINE ZLAUNHR_COL_GETRFNP2( M, N, A, LDA, D, INFO )
                    167:       IMPLICIT NONE
                    168: *
1.2     ! bertrand  169: *  -- LAPACK computational routine --
1.1       bertrand  170: *  -- LAPACK is a software package provided by Univ. of Tennessee,    --
                    171: *  -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
                    172: *
                    173: *     .. Scalar Arguments ..
                    174:       INTEGER            INFO, LDA, M, N
                    175: *     ..
                    176: *     .. Array Arguments ..
                    177:       COMPLEX*16         A( LDA, * ), D( * )
                    178: *     ..
                    179: *
                    180: *  =====================================================================
                    181: *
                    182: *     .. Parameters ..
                    183:       DOUBLE PRECISION   ONE
                    184:       PARAMETER          ( ONE = 1.0D+0 )
                    185:       COMPLEX*16         CONE
                    186:       PARAMETER          ( CONE = ( 1.0D+0, 0.0D+0 ) )
                    187: *     ..
                    188: *     .. Local Scalars ..
                    189:       DOUBLE PRECISION   SFMIN
                    190:       INTEGER            I, IINFO, N1, N2
                    191:       COMPLEX*16         Z
                    192: *     ..
                    193: *     .. External Functions ..
                    194:       DOUBLE PRECISION   DLAMCH
                    195:       EXTERNAL           DLAMCH
                    196: *     ..
                    197: *     .. External Subroutines ..
                    198:       EXTERNAL           ZGEMM, ZSCAL, ZTRSM, XERBLA
                    199: *     ..
                    200: *     .. Intrinsic Functions ..
                    201:       INTRINSIC          ABS, DBLE, DCMPLX, DIMAG, DSIGN, MAX, MIN
                    202: *     ..
                    203: *     .. Statement Functions ..
                    204:       DOUBLE PRECISION   CABS1
                    205: *     ..
                    206: *     .. Statement Function definitions ..
                    207:       CABS1( Z ) = ABS( DBLE( Z ) ) + ABS( DIMAG( Z ) )
                    208: *     ..
                    209: *     .. Executable Statements ..
                    210: *
                    211: *     Test the input parameters
                    212: *
                    213:       INFO = 0
                    214:       IF( M.LT.0 ) THEN
                    215:          INFO = -1
                    216:       ELSE IF( N.LT.0 ) THEN
                    217:          INFO = -2
                    218:       ELSE IF( LDA.LT.MAX( 1, M ) ) THEN
                    219:          INFO = -4
                    220:       END IF
                    221:       IF( INFO.NE.0 ) THEN
                    222:          CALL XERBLA( 'ZLAUNHR_COL_GETRFNP2', -INFO )
                    223:          RETURN
                    224:       END IF
                    225: *
                    226: *     Quick return if possible
                    227: *
                    228:       IF( MIN( M, N ).EQ.0 )
                    229:      $   RETURN
                    230: 
                    231:       IF ( M.EQ.1 ) THEN
                    232: *
                    233: *        One row case, (also recursion termination case),
                    234: *        use unblocked code
                    235: *
                    236: *        Transfer the sign
                    237: *
                    238:          D( 1 ) = DCMPLX( -DSIGN( ONE, DBLE( A( 1, 1 ) ) ) )
                    239: *
                    240: *        Construct the row of U
                    241: *
                    242:          A( 1, 1 ) = A( 1, 1 ) - D( 1 )
                    243: *
                    244:       ELSE IF( N.EQ.1 ) THEN
                    245: *
                    246: *        One column case, (also recursion termination case),
                    247: *        use unblocked code
                    248: *
                    249: *        Transfer the sign
                    250: *
                    251:          D( 1 ) = DCMPLX( -DSIGN( ONE, DBLE( A( 1, 1 ) ) ) )
                    252: *
                    253: *        Construct the row of U
                    254: *
                    255:          A( 1, 1 ) = A( 1, 1 ) - D( 1 )
                    256: *
                    257: *        Scale the elements 2:M of the column
                    258: *
                    259: *        Determine machine safe minimum
                    260: *
                    261:          SFMIN = DLAMCH('S')
                    262: *
                    263: *        Construct the subdiagonal elements of L
                    264: *
                    265:          IF( CABS1( A( 1, 1 ) ) .GE. SFMIN ) THEN
                    266:             CALL ZSCAL( M-1, CONE / A( 1, 1 ), A( 2, 1 ), 1 )
                    267:          ELSE
                    268:             DO I = 2, M
                    269:                A( I, 1 ) = A( I, 1 ) / A( 1, 1 )
                    270:             END DO
                    271:          END IF
                    272: *
                    273:       ELSE
                    274: *
                    275: *        Divide the matrix B into four submatrices
                    276: *
                    277:          N1 = MIN( M, N ) / 2
                    278:          N2 = N-N1
                    279: 
                    280: *
                    281: *        Factor B11, recursive call
                    282: *
                    283:          CALL ZLAUNHR_COL_GETRFNP2( N1, N1, A, LDA, D, IINFO )
                    284: *
                    285: *        Solve for B21
                    286: *
                    287:          CALL ZTRSM( 'R', 'U', 'N', 'N', M-N1, N1, CONE, A, LDA,
                    288:      $               A( N1+1, 1 ), LDA )
                    289: *
                    290: *        Solve for B12
                    291: *
                    292:          CALL ZTRSM( 'L', 'L', 'N', 'U', N1, N2, CONE, A, LDA,
                    293:      $               A( 1, N1+1 ), LDA )
                    294: *
                    295: *        Update B22, i.e. compute the Schur complement
                    296: *        B22 := B22 - B21*B12
                    297: *
                    298:          CALL ZGEMM( 'N', 'N', M-N1, N2, N1, -CONE, A( N1+1, 1 ), LDA,
                    299:      $               A( 1, N1+1 ), LDA, CONE, A( N1+1, N1+1 ), LDA )
                    300: *
                    301: *        Factor B22, recursive call
                    302: *
                    303:          CALL ZLAUNHR_COL_GETRFNP2( M-N1, N2, A( N1+1, N1+1 ), LDA,
                    304:      $                              D( N1+1 ), IINFO )
                    305: *
                    306:       END IF
                    307:       RETURN
                    308: *
                    309: *     End of ZLAUNHR_COL_GETRFNP2
                    310: *
                    311:       END

CVSweb interface <joel.bertrand@systella.fr>