Annotation of rpl/lapack/lapack/zsytri_3x.f, revision 1.4
1.1 bertrand 1: *> \brief \b ZSYTRI_3X
2: *
3: * =========== DOCUMENTATION ===========
4: *
5: * Online html documentation available at
6: * http://www.netlib.org/lapack/explore-html/
7: *
8: *> \htmlonly
9: *> Download ZSYTRI_3X + dependencies
10: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/zsytri_3x.f">
11: *> [TGZ]</a>
12: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/zsytri_3x.f">
13: *> [ZIP]</a>
14: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/zsytri_3x.f">
15: *> [TXT]</a>
16: *> \endhtmlonly
17: *
18: * Definition:
19: * ===========
20: *
21: * SUBROUTINE ZSYTRI_3X( UPLO, N, A, LDA, E, IPIV, WORK, NB, INFO )
22: *
23: * .. Scalar Arguments ..
24: * CHARACTER UPLO
25: * INTEGER INFO, LDA, N, NB
26: * ..
27: * .. Array Arguments ..
28: * INTEGER IPIV( * )
29: * COMPLEX*16 A( LDA, * ), E( * ), WORK( N+NB+1, * )
30: * ..
31: *
32: *
33: *> \par Purpose:
34: * =============
35: *>
36: *> \verbatim
37: *> ZSYTRI_3X computes the inverse of a complex symmetric indefinite
38: *> matrix A using the factorization computed by ZSYTRF_RK or ZSYTRF_BK:
39: *>
40: *> A = P*U*D*(U**T)*(P**T) or A = P*L*D*(L**T)*(P**T),
41: *>
42: *> where U (or L) is unit upper (or lower) triangular matrix,
43: *> U**T (or L**T) is the transpose of U (or L), P is a permutation
44: *> matrix, P**T is the transpose of P, and D is symmetric and block
45: *> diagonal with 1-by-1 and 2-by-2 diagonal blocks.
46: *>
47: *> This is the blocked version of the algorithm, calling Level 3 BLAS.
48: *> \endverbatim
49: *
50: * Arguments:
51: * ==========
52: *
53: *> \param[in] UPLO
54: *> \verbatim
55: *> UPLO is CHARACTER*1
56: *> Specifies whether the details of the factorization are
57: *> stored as an upper or lower triangular matrix.
58: *> = 'U': Upper triangle of A is stored;
59: *> = 'L': Lower triangle of A is stored.
60: *> \endverbatim
61: *>
62: *> \param[in] N
63: *> \verbatim
64: *> N is INTEGER
65: *> The order of the matrix A. N >= 0.
66: *> \endverbatim
67: *>
68: *> \param[in,out] A
69: *> \verbatim
70: *> A is COMPLEX*16 array, dimension (LDA,N)
71: *> On entry, diagonal of the block diagonal matrix D and
72: *> factors U or L as computed by ZSYTRF_RK and ZSYTRF_BK:
73: *> a) ONLY diagonal elements of the symmetric block diagonal
74: *> matrix D on the diagonal of A, i.e. D(k,k) = A(k,k);
75: *> (superdiagonal (or subdiagonal) elements of D
76: *> should be provided on entry in array E), and
77: *> b) If UPLO = 'U': factor U in the superdiagonal part of A.
78: *> If UPLO = 'L': factor L in the subdiagonal part of A.
79: *>
80: *> On exit, if INFO = 0, the symmetric inverse of the original
81: *> matrix.
82: *> If UPLO = 'U': the upper triangular part of the inverse
83: *> is formed and the part of A below the diagonal is not
84: *> referenced;
85: *> If UPLO = 'L': the lower triangular part of the inverse
86: *> is formed and the part of A above the diagonal is not
87: *> referenced.
88: *> \endverbatim
89: *>
90: *> \param[in] LDA
91: *> \verbatim
92: *> LDA is INTEGER
93: *> The leading dimension of the array A. LDA >= max(1,N).
94: *> \endverbatim
95: *>
96: *> \param[in] E
97: *> \verbatim
98: *> E is COMPLEX*16 array, dimension (N)
99: *> On entry, contains the superdiagonal (or subdiagonal)
100: *> elements of the symmetric block diagonal matrix D
101: *> with 1-by-1 or 2-by-2 diagonal blocks, where
1.3 bertrand 102: *> If UPLO = 'U': E(i) = D(i-1,i), i=2:N, E(1) not referenced;
1.1 bertrand 103: *> If UPLO = 'L': E(i) = D(i+1,i), i=1:N-1, E(N) not referenced.
104: *>
105: *> NOTE: For 1-by-1 diagonal block D(k), where
106: *> 1 <= k <= N, the element E(k) is not referenced in both
107: *> UPLO = 'U' or UPLO = 'L' cases.
108: *> \endverbatim
109: *>
110: *> \param[in] IPIV
111: *> \verbatim
112: *> IPIV is INTEGER array, dimension (N)
113: *> Details of the interchanges and the block structure of D
114: *> as determined by ZSYTRF_RK or ZSYTRF_BK.
115: *> \endverbatim
116: *>
117: *> \param[out] WORK
118: *> \verbatim
119: *> WORK is COMPLEX*16 array, dimension (N+NB+1,NB+3).
120: *> \endverbatim
121: *>
122: *> \param[in] NB
123: *> \verbatim
124: *> NB is INTEGER
125: *> Block size.
126: *> \endverbatim
127: *>
128: *> \param[out] INFO
129: *> \verbatim
130: *> INFO is INTEGER
131: *> = 0: successful exit
132: *> < 0: if INFO = -i, the i-th argument had an illegal value
133: *> > 0: if INFO = i, D(i,i) = 0; the matrix is singular and its
134: *> inverse could not be computed.
135: *> \endverbatim
136: *
137: * Authors:
138: * ========
139: *
140: *> \author Univ. of Tennessee
141: *> \author Univ. of California Berkeley
142: *> \author Univ. of Colorado Denver
143: *> \author NAG Ltd.
144: *
1.3 bertrand 145: *> \date June 2017
1.1 bertrand 146: *
147: *> \ingroup complex16SYcomputational
148: *
149: *> \par Contributors:
150: * ==================
151: *> \verbatim
152: *>
1.3 bertrand 153: *> June 2017, Igor Kozachenko,
1.1 bertrand 154: *> Computer Science Division,
155: *> University of California, Berkeley
156: *>
157: *> \endverbatim
158: *
159: * =====================================================================
160: SUBROUTINE ZSYTRI_3X( UPLO, N, A, LDA, E, IPIV, WORK, NB, INFO )
161: *
1.3 bertrand 162: * -- LAPACK computational routine (version 3.7.1) --
1.1 bertrand 163: * -- LAPACK is a software package provided by Univ. of Tennessee, --
164: * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
1.3 bertrand 165: * June 2017
1.1 bertrand 166: *
167: * .. Scalar Arguments ..
168: CHARACTER UPLO
169: INTEGER INFO, LDA, N, NB
170: * ..
171: * .. Array Arguments ..
172: INTEGER IPIV( * )
173: COMPLEX*16 A( LDA, * ), E( * ), WORK( N+NB+1, * )
174: * ..
175: *
176: * =====================================================================
177: *
178: * .. Parameters ..
179: COMPLEX*16 CONE, CZERO
180: PARAMETER ( CONE = ( 1.0D+0, 0.0D+0 ),
181: $ CZERO = ( 0.0D+0, 0.0D+0 ) )
182: * ..
183: * .. Local Scalars ..
184: LOGICAL UPPER
185: INTEGER CUT, I, ICOUNT, INVD, IP, K, NNB, J, U11
186: COMPLEX*16 AK, AKKP1, AKP1, D, T, U01_I_J, U01_IP1_J,
187: $ U11_I_J, U11_IP1_J
188: * ..
189: * .. External Functions ..
190: LOGICAL LSAME
191: EXTERNAL LSAME
192: * ..
193: * .. External Subroutines ..
194: EXTERNAL ZGEMM, ZSYSWAPR, ZTRTRI, ZTRMM, XERBLA
195: * ..
196: * .. Intrinsic Functions ..
197: INTRINSIC ABS, MAX, MOD
198: * ..
199: * .. Executable Statements ..
200: *
201: * Test the input parameters.
202: *
203: INFO = 0
204: UPPER = LSAME( UPLO, 'U' )
205: IF( .NOT.UPPER .AND. .NOT.LSAME( UPLO, 'L' ) ) THEN
206: INFO = -1
207: ELSE IF( N.LT.0 ) THEN
208: INFO = -2
209: ELSE IF( LDA.LT.MAX( 1, N ) ) THEN
210: INFO = -4
211: END IF
212: *
213: * Quick return if possible
214: *
215: IF( INFO.NE.0 ) THEN
216: CALL XERBLA( 'ZSYTRI_3X', -INFO )
217: RETURN
218: END IF
219: IF( N.EQ.0 )
220: $ RETURN
221: *
222: * Workspace got Non-diag elements of D
223: *
224: DO K = 1, N
225: WORK( K, 1 ) = E( K )
226: END DO
227: *
228: * Check that the diagonal matrix D is nonsingular.
229: *
230: IF( UPPER ) THEN
231: *
232: * Upper triangular storage: examine D from bottom to top
233: *
234: DO INFO = N, 1, -1
235: IF( IPIV( INFO ).GT.0 .AND. A( INFO, INFO ).EQ.CZERO )
236: $ RETURN
237: END DO
238: ELSE
239: *
240: * Lower triangular storage: examine D from top to bottom.
241: *
242: DO INFO = 1, N
243: IF( IPIV( INFO ).GT.0 .AND. A( INFO, INFO ).EQ.CZERO )
244: $ RETURN
245: END DO
246: END IF
247: *
248: INFO = 0
249: *
250: * Splitting Workspace
251: * U01 is a block ( N, NB+1 )
252: * The first element of U01 is in WORK( 1, 1 )
253: * U11 is a block ( NB+1, NB+1 )
254: * The first element of U11 is in WORK( N+1, 1 )
255: *
256: U11 = N
257: *
258: * INVD is a block ( N, 2 )
259: * The first element of INVD is in WORK( 1, INVD )
260: *
261: INVD = NB + 2
262:
263: IF( UPPER ) THEN
264: *
265: * Begin Upper
266: *
267: * invA = P * inv(U**T) * inv(D) * inv(U) * P**T.
268: *
269: CALL ZTRTRI( UPLO, 'U', N, A, LDA, INFO )
270: *
271: * inv(D) and inv(D) * inv(U)
272: *
273: K = 1
274: DO WHILE( K.LE.N )
275: IF( IPIV( K ).GT.0 ) THEN
276: * 1 x 1 diagonal NNB
277: WORK( K, INVD ) = CONE / A( K, K )
278: WORK( K, INVD+1 ) = CZERO
279: ELSE
280: * 2 x 2 diagonal NNB
281: T = WORK( K+1, 1 )
282: AK = A( K, K ) / T
283: AKP1 = A( K+1, K+1 ) / T
284: AKKP1 = WORK( K+1, 1 ) / T
285: D = T*( AK*AKP1-CONE )
286: WORK( K, INVD ) = AKP1 / D
287: WORK( K+1, INVD+1 ) = AK / D
288: WORK( K, INVD+1 ) = -AKKP1 / D
289: WORK( K+1, INVD ) = WORK( K, INVD+1 )
290: K = K + 1
291: END IF
292: K = K + 1
293: END DO
294: *
295: * inv(U**T) = (inv(U))**T
296: *
297: * inv(U**T) * inv(D) * inv(U)
298: *
299: CUT = N
300: DO WHILE( CUT.GT.0 )
301: NNB = NB
302: IF( CUT.LE.NNB ) THEN
303: NNB = CUT
304: ELSE
305: ICOUNT = 0
306: * count negative elements,
307: DO I = CUT+1-NNB, CUT
308: IF( IPIV( I ).LT.0 ) ICOUNT = ICOUNT + 1
309: END DO
310: * need a even number for a clear cut
311: IF( MOD( ICOUNT, 2 ).EQ.1 ) NNB = NNB + 1
312: END IF
313:
314: CUT = CUT - NNB
315: *
316: * U01 Block
317: *
318: DO I = 1, CUT
319: DO J = 1, NNB
320: WORK( I, J ) = A( I, CUT+J )
321: END DO
322: END DO
323: *
324: * U11 Block
325: *
326: DO I = 1, NNB
327: WORK( U11+I, I ) = CONE
328: DO J = 1, I-1
329: WORK( U11+I, J ) = CZERO
330: END DO
331: DO J = I+1, NNB
332: WORK( U11+I, J ) = A( CUT+I, CUT+J )
333: END DO
334: END DO
335: *
336: * invD * U01
337: *
338: I = 1
339: DO WHILE( I.LE.CUT )
340: IF( IPIV( I ).GT.0 ) THEN
341: DO J = 1, NNB
342: WORK( I, J ) = WORK( I, INVD ) * WORK( I, J )
343: END DO
344: ELSE
345: DO J = 1, NNB
346: U01_I_J = WORK( I, J )
347: U01_IP1_J = WORK( I+1, J )
348: WORK( I, J ) = WORK( I, INVD ) * U01_I_J
349: $ + WORK( I, INVD+1 ) * U01_IP1_J
350: WORK( I+1, J ) = WORK( I+1, INVD ) * U01_I_J
351: $ + WORK( I+1, INVD+1 ) * U01_IP1_J
352: END DO
353: I = I + 1
354: END IF
355: I = I + 1
356: END DO
357: *
358: * invD1 * U11
359: *
360: I = 1
361: DO WHILE ( I.LE.NNB )
362: IF( IPIV( CUT+I ).GT.0 ) THEN
363: DO J = I, NNB
364: WORK( U11+I, J ) = WORK(CUT+I,INVD) * WORK(U11+I,J)
365: END DO
366: ELSE
367: DO J = I, NNB
368: U11_I_J = WORK(U11+I,J)
369: U11_IP1_J = WORK(U11+I+1,J)
370: WORK( U11+I, J ) = WORK(CUT+I,INVD) * WORK(U11+I,J)
371: $ + WORK(CUT+I,INVD+1) * WORK(U11+I+1,J)
372: WORK( U11+I+1, J ) = WORK(CUT+I+1,INVD) * U11_I_J
373: $ + WORK(CUT+I+1,INVD+1) * U11_IP1_J
374: END DO
375: I = I + 1
376: END IF
377: I = I + 1
378: END DO
379: *
380: * U11**T * invD1 * U11 -> U11
381: *
382: CALL ZTRMM( 'L', 'U', 'T', 'U', NNB, NNB,
383: $ CONE, A( CUT+1, CUT+1 ), LDA, WORK( U11+1, 1 ),
384: $ N+NB+1 )
385: *
386: DO I = 1, NNB
387: DO J = I, NNB
388: A( CUT+I, CUT+J ) = WORK( U11+I, J )
389: END DO
390: END DO
391: *
392: * U01**T * invD * U01 -> A( CUT+I, CUT+J )
393: *
394: CALL ZGEMM( 'T', 'N', NNB, NNB, CUT, CONE, A( 1, CUT+1 ),
395: $ LDA, WORK, N+NB+1, CZERO, WORK(U11+1,1),
396: $ N+NB+1 )
397:
398: *
399: * U11 = U11**T * invD1 * U11 + U01**T * invD * U01
400: *
401: DO I = 1, NNB
402: DO J = I, NNB
403: A( CUT+I, CUT+J ) = A( CUT+I, CUT+J ) + WORK(U11+I,J)
404: END DO
405: END DO
406: *
407: * U01 = U00**T * invD0 * U01
408: *
409: CALL ZTRMM( 'L', UPLO, 'T', 'U', CUT, NNB,
410: $ CONE, A, LDA, WORK, N+NB+1 )
411:
412: *
413: * Update U01
414: *
415: DO I = 1, CUT
416: DO J = 1, NNB
417: A( I, CUT+J ) = WORK( I, J )
418: END DO
419: END DO
420: *
421: * Next Block
422: *
423: END DO
424: *
425: * Apply PERMUTATIONS P and P**T:
426: * P * inv(U**T) * inv(D) * inv(U) * P**T.
427: * Interchange rows and columns I and IPIV(I) in reverse order
428: * from the formation order of IPIV vector for Upper case.
429: *
430: * ( We can use a loop over IPIV with increment 1,
431: * since the ABS value of IPIV(I) represents the row (column)
432: * index of the interchange with row (column) i in both 1x1
433: * and 2x2 pivot cases, i.e. we don't need separate code branches
434: * for 1x1 and 2x2 pivot cases )
435: *
436: DO I = 1, N
437: IP = ABS( IPIV( I ) )
438: IF( IP.NE.I ) THEN
439: IF (I .LT. IP) CALL ZSYSWAPR( UPLO, N, A, LDA, I ,IP )
440: IF (I .GT. IP) CALL ZSYSWAPR( UPLO, N, A, LDA, IP ,I )
441: END IF
442: END DO
443: *
444: ELSE
445: *
446: * Begin Lower
447: *
448: * inv A = P * inv(L**T) * inv(D) * inv(L) * P**T.
449: *
450: CALL ZTRTRI( UPLO, 'U', N, A, LDA, INFO )
451: *
452: * inv(D) and inv(D) * inv(L)
453: *
454: K = N
455: DO WHILE ( K .GE. 1 )
456: IF( IPIV( K ).GT.0 ) THEN
457: * 1 x 1 diagonal NNB
458: WORK( K, INVD ) = CONE / A( K, K )
459: WORK( K, INVD+1 ) = CZERO
460: ELSE
461: * 2 x 2 diagonal NNB
462: T = WORK( K-1, 1 )
463: AK = A( K-1, K-1 ) / T
464: AKP1 = A( K, K ) / T
465: AKKP1 = WORK( K-1, 1 ) / T
466: D = T*( AK*AKP1-CONE )
467: WORK( K-1, INVD ) = AKP1 / D
468: WORK( K, INVD ) = AK / D
469: WORK( K, INVD+1 ) = -AKKP1 / D
470: WORK( K-1, INVD+1 ) = WORK( K, INVD+1 )
471: K = K - 1
472: END IF
473: K = K - 1
474: END DO
475: *
476: * inv(L**T) = (inv(L))**T
477: *
478: * inv(L**T) * inv(D) * inv(L)
479: *
480: CUT = 0
481: DO WHILE( CUT.LT.N )
482: NNB = NB
483: IF( (CUT + NNB).GT.N ) THEN
484: NNB = N - CUT
485: ELSE
486: ICOUNT = 0
487: * count negative elements,
488: DO I = CUT + 1, CUT+NNB
489: IF ( IPIV( I ).LT.0 ) ICOUNT = ICOUNT + 1
490: END DO
491: * need a even number for a clear cut
492: IF( MOD( ICOUNT, 2 ).EQ.1 ) NNB = NNB + 1
493: END IF
494: *
495: * L21 Block
496: *
497: DO I = 1, N-CUT-NNB
498: DO J = 1, NNB
499: WORK( I, J ) = A( CUT+NNB+I, CUT+J )
500: END DO
501: END DO
502: *
503: * L11 Block
504: *
505: DO I = 1, NNB
506: WORK( U11+I, I) = CONE
507: DO J = I+1, NNB
508: WORK( U11+I, J ) = CZERO
509: END DO
510: DO J = 1, I-1
511: WORK( U11+I, J ) = A( CUT+I, CUT+J )
512: END DO
513: END DO
514: *
515: * invD*L21
516: *
517: I = N-CUT-NNB
518: DO WHILE( I.GE.1 )
519: IF( IPIV( CUT+NNB+I ).GT.0 ) THEN
520: DO J = 1, NNB
521: WORK( I, J ) = WORK( CUT+NNB+I, INVD) * WORK( I, J)
522: END DO
523: ELSE
524: DO J = 1, NNB
525: U01_I_J = WORK(I,J)
526: U01_IP1_J = WORK(I-1,J)
527: WORK(I,J)=WORK(CUT+NNB+I,INVD)*U01_I_J+
528: $ WORK(CUT+NNB+I,INVD+1)*U01_IP1_J
529: WORK(I-1,J)=WORK(CUT+NNB+I-1,INVD+1)*U01_I_J+
530: $ WORK(CUT+NNB+I-1,INVD)*U01_IP1_J
531: END DO
532: I = I - 1
533: END IF
534: I = I - 1
535: END DO
536: *
537: * invD1*L11
538: *
539: I = NNB
540: DO WHILE( I.GE.1 )
541: IF( IPIV( CUT+I ).GT.0 ) THEN
542: DO J = 1, NNB
543: WORK( U11+I, J ) = WORK( CUT+I, INVD)*WORK(U11+I,J)
544: END DO
545:
546: ELSE
547: DO J = 1, NNB
548: U11_I_J = WORK( U11+I, J )
549: U11_IP1_J = WORK( U11+I-1, J )
550: WORK( U11+I, J ) = WORK(CUT+I,INVD) * WORK(U11+I,J)
551: $ + WORK(CUT+I,INVD+1) * U11_IP1_J
552: WORK( U11+I-1, J ) = WORK(CUT+I-1,INVD+1) * U11_I_J
553: $ + WORK(CUT+I-1,INVD) * U11_IP1_J
554: END DO
555: I = I - 1
556: END IF
557: I = I - 1
558: END DO
559: *
560: * L11**T * invD1 * L11 -> L11
561: *
562: CALL ZTRMM( 'L', UPLO, 'T', 'U', NNB, NNB, CONE,
563: $ A( CUT+1, CUT+1 ), LDA, WORK( U11+1, 1 ),
564: $ N+NB+1 )
565:
566: *
567: DO I = 1, NNB
568: DO J = 1, I
569: A( CUT+I, CUT+J ) = WORK( U11+I, J )
570: END DO
571: END DO
572: *
573: IF( (CUT+NNB).LT.N ) THEN
574: *
575: * L21**T * invD2*L21 -> A( CUT+I, CUT+J )
576: *
577: CALL ZGEMM( 'T', 'N', NNB, NNB, N-NNB-CUT, CONE,
578: $ A( CUT+NNB+1, CUT+1 ), LDA, WORK, N+NB+1,
579: $ CZERO, WORK( U11+1, 1 ), N+NB+1 )
580:
581: *
582: * L11 = L11**T * invD1 * L11 + U01**T * invD * U01
583: *
584: DO I = 1, NNB
585: DO J = 1, I
586: A( CUT+I, CUT+J ) = A( CUT+I, CUT+J )+WORK(U11+I,J)
587: END DO
588: END DO
589: *
590: * L01 = L22**T * invD2 * L21
591: *
592: CALL ZTRMM( 'L', UPLO, 'T', 'U', N-NNB-CUT, NNB, CONE,
593: $ A( CUT+NNB+1, CUT+NNB+1 ), LDA, WORK,
594: $ N+NB+1 )
595: *
596: * Update L21
597: *
598: DO I = 1, N-CUT-NNB
599: DO J = 1, NNB
600: A( CUT+NNB+I, CUT+J ) = WORK( I, J )
601: END DO
602: END DO
603: *
604: ELSE
605: *
606: * L11 = L11**T * invD1 * L11
607: *
608: DO I = 1, NNB
609: DO J = 1, I
610: A( CUT+I, CUT+J ) = WORK( U11+I, J )
611: END DO
612: END DO
613: END IF
614: *
615: * Next Block
616: *
617: CUT = CUT + NNB
618: *
619: END DO
620: *
621: * Apply PERMUTATIONS P and P**T:
622: * P * inv(L**T) * inv(D) * inv(L) * P**T.
623: * Interchange rows and columns I and IPIV(I) in reverse order
624: * from the formation order of IPIV vector for Lower case.
625: *
626: * ( We can use a loop over IPIV with increment -1,
627: * since the ABS value of IPIV(I) represents the row (column)
628: * index of the interchange with row (column) i in both 1x1
629: * and 2x2 pivot cases, i.e. we don't need separate code branches
630: * for 1x1 and 2x2 pivot cases )
631: *
632: DO I = N, 1, -1
633: IP = ABS( IPIV( I ) )
634: IF( IP.NE.I ) THEN
635: IF (I .LT. IP) CALL ZSYSWAPR( UPLO, N, A, LDA, I ,IP )
636: IF (I .GT. IP) CALL ZSYSWAPR( UPLO, N, A, LDA, IP ,I )
637: END IF
638: END DO
639: *
640: END IF
641: *
642: RETURN
643: *
644: * End of ZSYTRI_3X
645: *
646: END
647:
CVSweb interface <joel.bertrand@systella.fr>