|
ABSTRACT
We present a new parallel sparse LU factorization algorithm and code. The algorithm uses a column-preordering partial-pivoting unsymmetric-pattern multifrontal approach. Our baseline sequential algorithm is based on UMFPACK 4, but is somewhat simpler and is often somewhat faster than UMFPACK version 4.0. Our parallel algorithm is designed for shared-memory machines with a small or moderate number of processors (we tested it on up to 32 processors). We experimentally compare our algorithm with SuperLU_MT, an existing shared-memory sparse LU factorization with partial pivoting. SuperLU_MT scales better than our new algorithm, but our algorithm is more reliable and is usually faster. More specifically, on matrices that are costly to factor, our algorithm is usually faster on up to 4 processors, and is usually faster on 8 and 16. We were not able to run SuperLU_MT on 32. The main contribution of this article is showing that the column-preordering partial-pivoting unsymmetric-pattern multifrontal approach, developed as a sequential algorithm by Davis in several recent versions of UMFPACK, can be effectively parallelized.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
E. Anderson , Z. Bai , C. Bischof , L. S. Blackford , J. Demmel , Jack J. Dongarra , J. Du Croz , S. Hammarling , A. Greenbaum , A. McKenney , D. Sorensen, LAPACK Users' guide (third ed.), Society for Industrial and Applied Mathematics, Philadelphia, PA, 1999
|
 |
3
|
|
| |
4
|
|
| |
5
|
Davis, T. A. 2003. A column pre-ordering strategy for the unsymmetric-pattern multifrontal method. Tech. Rep. TR-03-006, Department of Computer and Information Science and Engineering, University of Florida. May.
|
 |
6
|
|
| |
7
|
Davis, T. A. and Duff, I. S. 1991. Unsymmetric-pattern multifrontal methods for parallel sparse LU factorization. Tech. Rep. TR-91-023, Department of Computer and Information Science and Engineering, University of Florida. Jan.
|
| |
8
|
|
 |
9
|
|
 |
10
|
|
 |
11
|
|
| |
12
|
|
| |
13
|
|
| |
14
|
|
 |
15
|
|
 |
16
|
|
| |
17
|
|
 |
18
|
|
| |
19
|
|
| |
20
|
|
| |
21
|
Gilbert, J. R. 1988. An efficinet parallel sparse partial pivoting algorithm. Tech. Rep. 88/45052-1, Christian Michelsen Institute, Bergen, Norway.
|
| |
22
|
Gilbert, J. R., Li, X. S., Ng, E. G., and Peyton, B. W. 2001. Computing row and column counts for sparse QR and LU factorization. BIT Numer. Math. 41, 4, 693--710.
|
| |
23
|
|
| |
24
|
Gilbert, J. R. and Ng, E. 1993. Predicting structure in nonsymmetric sparse matrix factorizations. In Graph Theory and Sparse Matrix Computation, A. George, J. R. Gilbert, and J. W. H. Liu, Eds. Springer-Verlag.
|
| |
25
|
Gilbert, J. R. and Peierls, T. 1988a. Sparse partial pivoting in time proportional to arithmetic operations. SIAM J. Sci. Statis. Comput. 9, 862--874.
|
| |
26
|
Gilbert, J. R. and Peierls, T. 1988b. Sparse partial pivoting in time proportional to arithmetic operations. SIAM J. Sci. Statis. Comput. 9, 862--874.
|
| |
27
|
Gilbert, J. R. and Schreiber, R. 1982. Nested dissection with partial pivoting. In Sparse Matrix Symposium 1982: Program and Abstracts. Fairfield Glade, Tennessee, 61.
|
 |
28
|
|
| |
29
|
|
 |
30
|
|
| |
31
|
|
 |
32
|
|
| |
33
|
|
| |
34
|
|
| |
35
|
|
 |
36
|
|
| |
37
|
Rozin, E. and Toledo, S. 2005. Locality of reference in sparse Cholesky methods. Electr. Trans. Numer. Anal. 21, 81--106.
|
| |
38
|
Supercomputing Technologies Group, MIT Laboratory for Computer Science 2001. Cilk-5.3.2 Reference Manual. Supercomputing Technologies Group, MIT Laboratory for Computer Science, Cambridge, MA. Available online at http://supertech.lcs.mit.edu/cilk.
|
|