1: *> \brief <b> DSGESV computes the solution to system of linear equations A * X = B for GE matrices</b> (mixed precision with iterative refinement)
2: *
3: * =========== DOCUMENTATION ===========
4: *
5: * Online html documentation available at
6: * http://www.netlib.org/lapack/explore-html/
7: *
8: *> \htmlonly
9: *> Download DSGESV + dependencies
10: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.tgz?format=tgz&filename=/lapack/lapack_routine/dsgesv.f">
11: *> [TGZ]</a>
12: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.zip?format=zip&filename=/lapack/lapack_routine/dsgesv.f">
13: *> [ZIP]</a>
14: *> <a href="http://www.netlib.org/cgi-bin/netlibfiles.txt?format=txt&filename=/lapack/lapack_routine/dsgesv.f">
15: *> [TXT]</a>
16: *> \endhtmlonly
17: *
18: * Definition:
19: * ===========
20: *
21: * SUBROUTINE DSGESV( N, NRHS, A, LDA, IPIV, B, LDB, X, LDX, WORK,
22: * SWORK, ITER, INFO )
23: *
24: * .. Scalar Arguments ..
25: * INTEGER INFO, ITER, LDA, LDB, LDX, N, NRHS
26: * ..
27: * .. Array Arguments ..
28: * INTEGER IPIV( * )
29: * REAL SWORK( * )
30: * DOUBLE PRECISION A( LDA, * ), B( LDB, * ), WORK( N, * ),
31: * $ X( LDX, * )
32: * ..
33: *
34: *
35: *> \par Purpose:
36: * =============
37: *>
38: *> \verbatim
39: *>
40: *> DSGESV computes the solution to a real system of linear equations
41: *> A * X = B,
42: *> where A is an N-by-N matrix and X and B are N-by-NRHS matrices.
43: *>
44: *> DSGESV first attempts to factorize the matrix in SINGLE PRECISION
45: *> and use this factorization within an iterative refinement procedure
46: *> to produce a solution with DOUBLE PRECISION normwise backward error
47: *> quality (see below). If the approach fails the method switches to a
48: *> DOUBLE PRECISION factorization and solve.
49: *>
50: *> The iterative refinement is not going to be a winning strategy if
51: *> the ratio SINGLE PRECISION performance over DOUBLE PRECISION
52: *> performance is too small. A reasonable strategy should take the
53: *> number of right-hand sides and the size of the matrix into account.
54: *> This might be done with a call to ILAENV in the future. Up to now, we
55: *> always try iterative refinement.
56: *>
57: *> The iterative refinement process is stopped if
58: *> ITER > ITERMAX
59: *> or for all the RHS we have:
60: *> RNRM < SQRT(N)*XNRM*ANRM*EPS*BWDMAX
61: *> where
62: *> o ITER is the number of the current iteration in the iterative
63: *> refinement process
64: *> o RNRM is the infinity-norm of the residual
65: *> o XNRM is the infinity-norm of the solution
66: *> o ANRM is the infinity-operator-norm of the matrix A
67: *> o EPS is the machine epsilon returned by DLAMCH('Epsilon')
68: *> The value ITERMAX and BWDMAX are fixed to 30 and 1.0D+00
69: *> respectively.
70: *> \endverbatim
71: *
72: * Arguments:
73: * ==========
74: *
75: *> \param[in] N
76: *> \verbatim
77: *> N is INTEGER
78: *> The number of linear equations, i.e., the order of the
79: *> matrix A. N >= 0.
80: *> \endverbatim
81: *>
82: *> \param[in] NRHS
83: *> \verbatim
84: *> NRHS is INTEGER
85: *> The number of right hand sides, i.e., the number of columns
86: *> of the matrix B. NRHS >= 0.
87: *> \endverbatim
88: *>
89: *> \param[in,out] A
90: *> \verbatim
91: *> A is DOUBLE PRECISION array,
92: *> dimension (LDA,N)
93: *> On entry, the N-by-N coefficient matrix A.
94: *> On exit, if iterative refinement has been successfully used
95: *> (INFO = 0 and ITER >= 0, see description below), then A is
96: *> unchanged, if double precision factorization has been used
97: *> (INFO = 0 and ITER < 0, see description below), then the
98: *> array A contains the factors L and U from the factorization
99: *> A = P*L*U; the unit diagonal elements of L are not stored.
100: *> \endverbatim
101: *>
102: *> \param[in] LDA
103: *> \verbatim
104: *> LDA is INTEGER
105: *> The leading dimension of the array A. LDA >= max(1,N).
106: *> \endverbatim
107: *>
108: *> \param[out] IPIV
109: *> \verbatim
110: *> IPIV is INTEGER array, dimension (N)
111: *> The pivot indices that define the permutation matrix P;
112: *> row i of the matrix was interchanged with row IPIV(i).
113: *> Corresponds either to the single precision factorization
114: *> (if INFO = 0 and ITER >= 0) or the double precision
115: *> factorization (if INFO = 0 and ITER < 0).
116: *> \endverbatim
117: *>
118: *> \param[in] B
119: *> \verbatim
120: *> B is DOUBLE PRECISION array, dimension (LDB,NRHS)
121: *> The N-by-NRHS right hand side matrix B.
122: *> \endverbatim
123: *>
124: *> \param[in] LDB
125: *> \verbatim
126: *> LDB is INTEGER
127: *> The leading dimension of the array B. LDB >= max(1,N).
128: *> \endverbatim
129: *>
130: *> \param[out] X
131: *> \verbatim
132: *> X is DOUBLE PRECISION array, dimension (LDX,NRHS)
133: *> If INFO = 0, the N-by-NRHS solution matrix X.
134: *> \endverbatim
135: *>
136: *> \param[in] LDX
137: *> \verbatim
138: *> LDX is INTEGER
139: *> The leading dimension of the array X. LDX >= max(1,N).
140: *> \endverbatim
141: *>
142: *> \param[out] WORK
143: *> \verbatim
144: *> WORK is DOUBLE PRECISION array, dimension (N,NRHS)
145: *> This array is used to hold the residual vectors.
146: *> \endverbatim
147: *>
148: *> \param[out] SWORK
149: *> \verbatim
150: *> SWORK is REAL array, dimension (N*(N+NRHS))
151: *> This array is used to use the single precision matrix and the
152: *> right-hand sides or solutions in single precision.
153: *> \endverbatim
154: *>
155: *> \param[out] ITER
156: *> \verbatim
157: *> ITER is INTEGER
158: *> < 0: iterative refinement has failed, double precision
159: *> factorization has been performed
160: *> -1 : the routine fell back to full precision for
161: *> implementation- or machine-specific reasons
162: *> -2 : narrowing the precision induced an overflow,
163: *> the routine fell back to full precision
164: *> -3 : failure of SGETRF
165: *> -31: stop the iterative refinement after the 30th
166: *> iterations
167: *> > 0: iterative refinement has been successfully used.
168: *> Returns the number of iterations
169: *> \endverbatim
170: *>
171: *> \param[out] INFO
172: *> \verbatim
173: *> INFO is INTEGER
174: *> = 0: successful exit
175: *> < 0: if INFO = -i, the i-th argument had an illegal value
176: *> > 0: if INFO = i, U(i,i) computed in DOUBLE PRECISION is
177: *> exactly zero. The factorization has been completed,
178: *> but the factor U is exactly singular, so the solution
179: *> could not be computed.
180: *> \endverbatim
181: *
182: * Authors:
183: * ========
184: *
185: *> \author Univ. of Tennessee
186: *> \author Univ. of California Berkeley
187: *> \author Univ. of Colorado Denver
188: *> \author NAG Ltd.
189: *
190: *> \ingroup doubleGEsolve
191: *
192: * =====================================================================
193: SUBROUTINE DSGESV( N, NRHS, A, LDA, IPIV, B, LDB, X, LDX, WORK,
194: $ SWORK, ITER, INFO )
195: *
196: * -- LAPACK driver routine --
197: * -- LAPACK is a software package provided by Univ. of Tennessee, --
198: * -- Univ. of California Berkeley, Univ. of Colorado Denver and NAG Ltd..--
199: *
200: * .. Scalar Arguments ..
201: INTEGER INFO, ITER, LDA, LDB, LDX, N, NRHS
202: * ..
203: * .. Array Arguments ..
204: INTEGER IPIV( * )
205: REAL SWORK( * )
206: DOUBLE PRECISION A( LDA, * ), B( LDB, * ), WORK( N, * ),
207: $ X( LDX, * )
208: * ..
209: *
210: * =====================================================================
211: *
212: * .. Parameters ..
213: LOGICAL DOITREF
214: PARAMETER ( DOITREF = .TRUE. )
215: *
216: INTEGER ITERMAX
217: PARAMETER ( ITERMAX = 30 )
218: *
219: DOUBLE PRECISION BWDMAX
220: PARAMETER ( BWDMAX = 1.0E+00 )
221: *
222: DOUBLE PRECISION NEGONE, ONE
223: PARAMETER ( NEGONE = -1.0D+0, ONE = 1.0D+0 )
224: *
225: * .. Local Scalars ..
226: INTEGER I, IITER, PTSA, PTSX
227: DOUBLE PRECISION ANRM, CTE, EPS, RNRM, XNRM
228: *
229: * .. External Subroutines ..
230: EXTERNAL DAXPY, DGEMM, DLACPY, DLAG2S, DGETRF, DGETRS,
231: $ SGETRF, SGETRS, SLAG2D, XERBLA
232: * ..
233: * .. External Functions ..
234: INTEGER IDAMAX
235: DOUBLE PRECISION DLAMCH, DLANGE
236: EXTERNAL IDAMAX, DLAMCH, DLANGE
237: * ..
238: * .. Intrinsic Functions ..
239: INTRINSIC ABS, DBLE, MAX, SQRT
240: * ..
241: * .. Executable Statements ..
242: *
243: INFO = 0
244: ITER = 0
245: *
246: * Test the input parameters.
247: *
248: IF( N.LT.0 ) THEN
249: INFO = -1
250: ELSE IF( NRHS.LT.0 ) THEN
251: INFO = -2
252: ELSE IF( LDA.LT.MAX( 1, N ) ) THEN
253: INFO = -4
254: ELSE IF( LDB.LT.MAX( 1, N ) ) THEN
255: INFO = -7
256: ELSE IF( LDX.LT.MAX( 1, N ) ) THEN
257: INFO = -9
258: END IF
259: IF( INFO.NE.0 ) THEN
260: CALL XERBLA( 'DSGESV', -INFO )
261: RETURN
262: END IF
263: *
264: * Quick return if (N.EQ.0).
265: *
266: IF( N.EQ.0 )
267: $ RETURN
268: *
269: * Skip single precision iterative refinement if a priori slower
270: * than double precision factorization.
271: *
272: IF( .NOT.DOITREF ) THEN
273: ITER = -1
274: GO TO 40
275: END IF
276: *
277: * Compute some constants.
278: *
279: ANRM = DLANGE( 'I', N, N, A, LDA, WORK )
280: EPS = DLAMCH( 'Epsilon' )
281: CTE = ANRM*EPS*SQRT( DBLE( N ) )*BWDMAX
282: *
283: * Set the indices PTSA, PTSX for referencing SA and SX in SWORK.
284: *
285: PTSA = 1
286: PTSX = PTSA + N*N
287: *
288: * Convert B from double precision to single precision and store the
289: * result in SX.
290: *
291: CALL DLAG2S( N, NRHS, B, LDB, SWORK( PTSX ), N, INFO )
292: *
293: IF( INFO.NE.0 ) THEN
294: ITER = -2
295: GO TO 40
296: END IF
297: *
298: * Convert A from double precision to single precision and store the
299: * result in SA.
300: *
301: CALL DLAG2S( N, N, A, LDA, SWORK( PTSA ), N, INFO )
302: *
303: IF( INFO.NE.0 ) THEN
304: ITER = -2
305: GO TO 40
306: END IF
307: *
308: * Compute the LU factorization of SA.
309: *
310: CALL SGETRF( N, N, SWORK( PTSA ), N, IPIV, INFO )
311: *
312: IF( INFO.NE.0 ) THEN
313: ITER = -3
314: GO TO 40
315: END IF
316: *
317: * Solve the system SA*SX = SB.
318: *
319: CALL SGETRS( 'No transpose', N, NRHS, SWORK( PTSA ), N, IPIV,
320: $ SWORK( PTSX ), N, INFO )
321: *
322: * Convert SX back to double precision
323: *
324: CALL SLAG2D( N, NRHS, SWORK( PTSX ), N, X, LDX, INFO )
325: *
326: * Compute R = B - AX (R is WORK).
327: *
328: CALL DLACPY( 'All', N, NRHS, B, LDB, WORK, N )
329: *
330: CALL DGEMM( 'No Transpose', 'No Transpose', N, NRHS, N, NEGONE, A,
331: $ LDA, X, LDX, ONE, WORK, N )
332: *
333: * Check whether the NRHS normwise backward errors satisfy the
334: * stopping criterion. If yes, set ITER=0 and return.
335: *
336: DO I = 1, NRHS
337: XNRM = ABS( X( IDAMAX( N, X( 1, I ), 1 ), I ) )
338: RNRM = ABS( WORK( IDAMAX( N, WORK( 1, I ), 1 ), I ) )
339: IF( RNRM.GT.XNRM*CTE )
340: $ GO TO 10
341: END DO
342: *
343: * If we are here, the NRHS normwise backward errors satisfy the
344: * stopping criterion. We are good to exit.
345: *
346: ITER = 0
347: RETURN
348: *
349: 10 CONTINUE
350: *
351: DO 30 IITER = 1, ITERMAX
352: *
353: * Convert R (in WORK) from double precision to single precision
354: * and store the result in SX.
355: *
356: CALL DLAG2S( N, NRHS, WORK, N, SWORK( PTSX ), N, INFO )
357: *
358: IF( INFO.NE.0 ) THEN
359: ITER = -2
360: GO TO 40
361: END IF
362: *
363: * Solve the system SA*SX = SR.
364: *
365: CALL SGETRS( 'No transpose', N, NRHS, SWORK( PTSA ), N, IPIV,
366: $ SWORK( PTSX ), N, INFO )
367: *
368: * Convert SX back to double precision and update the current
369: * iterate.
370: *
371: CALL SLAG2D( N, NRHS, SWORK( PTSX ), N, WORK, N, INFO )
372: *
373: DO I = 1, NRHS
374: CALL DAXPY( N, ONE, WORK( 1, I ), 1, X( 1, I ), 1 )
375: END DO
376: *
377: * Compute R = B - AX (R is WORK).
378: *
379: CALL DLACPY( 'All', N, NRHS, B, LDB, WORK, N )
380: *
381: CALL DGEMM( 'No Transpose', 'No Transpose', N, NRHS, N, NEGONE,
382: $ A, LDA, X, LDX, ONE, WORK, N )
383: *
384: * Check whether the NRHS normwise backward errors satisfy the
385: * stopping criterion. If yes, set ITER=IITER>0 and return.
386: *
387: DO I = 1, NRHS
388: XNRM = ABS( X( IDAMAX( N, X( 1, I ), 1 ), I ) )
389: RNRM = ABS( WORK( IDAMAX( N, WORK( 1, I ), 1 ), I ) )
390: IF( RNRM.GT.XNRM*CTE )
391: $ GO TO 20
392: END DO
393: *
394: * If we are here, the NRHS normwise backward errors satisfy the
395: * stopping criterion, we are good to exit.
396: *
397: ITER = IITER
398: *
399: RETURN
400: *
401: 20 CONTINUE
402: *
403: 30 CONTINUE
404: *
405: * If we are at this place of the code, this is because we have
406: * performed ITER=ITERMAX iterations and never satisfied the
407: * stopping criterion, set up the ITER flag accordingly and follow up
408: * on double precision routine.
409: *
410: ITER = -ITERMAX - 1
411: *
412: 40 CONTINUE
413: *
414: * Single-precision iterative refinement failed to converge to a
415: * satisfactory solution, so we resort to double precision.
416: *
417: CALL DGETRF( N, N, A, LDA, IPIV, INFO )
418: *
419: IF( INFO.NE.0 )
420: $ RETURN
421: *
422: CALL DLACPY( 'All', N, NRHS, B, LDB, X, LDX )
423: CALL DGETRS( 'No transpose', N, NRHS, A, LDA, IPIV, X, LDX,
424: $ INFO )
425: *
426: RETURN
427: *
428: * End of DSGESV
429: *
430: END
CVSweb interface <joel.bertrand@systella.fr>