Abstract
We show how to compute an LU factorization of a matrix when the factors of a leading principle submatrix are already known. The approach incorporates pivoting akin to partial pivoting, a strategy we call incremental pivoting. An implementation using the Formal Linear Algebra Methods Environment (FLAME) application programming interface (API) is described. Experimental results demonstrate practical numerical stability and high performance on an Intel Itanium2 processor-based server.
- Bientinesi, P., Gunnels, J. A., Myers, M. E., Quintana-Ortí, E. S., and van de Geijn, R. A. 2005. The science of deriving dense linear algebra algorithms. ACM Trans. Math. Softw. 31, 1 (Mar.), 1--26. Google ScholarDigital Library
- Bientinesi, P. and van de Geijn, R. 2006. Representing dense linear algebra algorithms: A farewell to indices. Tech. Rep. FLAME Working Note 17, CS-TR-2006-10, Department of Computer Sciences, The University of Texas at Austin.Google Scholar
- Cwik, T., van de Geijn, R., and Patterson, J. 1994. The application of parallel computation to integral equation models of electromagnetic scattering. J. Optic. Soc. Amer. A 11, 4 (Apr.), 1538--1545.Google Scholar
- Demmel, J. and Dongarra, J. 2005. LAPACK 2005 prospectus: Reliable and scalable software for linear algebra computations on high end computers. LAPACK Working Note 164 UT-CS-05-546, University of Tennessee. February.Google Scholar
- Geng, P., Oden, J. T., and van de Geijn, R. 1996. Massively parallel computation for acoustical scattering problems using boundary element methods. J. Sound Vibra. 191, 1, 145--165.Google ScholarCross Ref
- Goto, K. 2004. TACC software and tools. http://www.tacc.utexas.edu/resources/software/.Google Scholar
- Goto, K. and van de Geijn, R. A. 2008. Anatomy of a high-performance matrix multiplication. ACM Trans. Math. Softw. 34, 3 (to appear). Google ScholarDigital Library
- Gunnels, J. A., Henry, G. M., and van de Geijn, R. A. 2001. A family of high-performance matrix multiplication algorithms. In Proceedings of the International Conference on Computational Science (ICCS), Part I, V. N. Alexandrov et al., eds. Lecture Notes in Computer Science, vol. 2073. Springer, 51--60. Google ScholarDigital Library
- Gunter, B. and van de Geijn, R. 2005. Parallel out-of-core computation and updating of the QR factorization. ACM Trans. Math. Softw. 31, 1 (Mar.), 60--78. Google ScholarDigital Library
- Gustavson, F., Henriksson, A., Jonsson, I., Kågström, B., and Ling, P. 1998. Superscalar GEMM-based level 3 BLAS -- The on-going evolution of a portable and high-performance library. In Proceedings of the Workshop Applied Parallel Computing (PARA), Large Scale Scientific and Industrial Problems, B. K. et al., eds. Lecture Notes in Computer Science, vol. 1541. Springer, 207--215. Google ScholarDigital Library
- Higham, N. J. 2002. Accuracy and Stability of Numerical Algorithms, 2nd ed. Society for Industrial and Applied Mathematics, Philadelphia, PA. Google ScholarDigital Library
- Joffrain, T., Quintana-Ortí, E. S., and van de Geijn, R. A. 2005. Rapid development of high-performance out-of-core solvers. In Proceedings of the Workshop on Applied Parallel Computing (PARA 2004), J. Dongarra et al., eds. Lecture Notes in Computer Science, vol. 3732. Springer, 413--422. Google ScholarDigital Library
- Kågström, B., Ling, P., and Loan, C. V. 1995. Gemm-Based level 3 blas: High-Performance model, implementations and performance evaluation benchmark. LAPACK Working Note no. 107 CS-95-315, University of Tennessee. November.Google Scholar
- Kågström, B., Ling, P., and Loan, C. V. 1998. GEMM-Based level 3 BLAS: High performance model implementations and performance evaluation benchmark. ACM Trans. Math. Softw. 24, 3, 268--302. Google ScholarDigital Library
- Klimkowski, K. and van de Geijn, R. 1995. Anatomy of an out-of-core dense linear solver. In Proceedings of the International Conference on Parallel Processing. vol. III - Algorithms and Applications, 29--33.Google Scholar
- Sorensen, D. C. 1985. Analysis of pairwise pivoting in Gaussian elimination. IEEE Trans. Comput. C-34, 3, 274--278. Google ScholarDigital Library
- Stewart, G. W. 1998. Matrix Algorithms. Volume I: Basic Decompositions. SIAM, Philadelphia, PA.Google Scholar
- Toledo, S. 1997. Locality of reference in LU decomposition with partial pivoting. SIAM J. Matrix Anal. Appl. 18, 4, 1065--1081. Google ScholarDigital Library
- Toledo, S. 1999. A survey of out-of-core algorithms in numerical linear algebra. In External Memory Algorithms and Visualization, J. Abello and J. S. Vitter, eds. American Mathematical Society Press, Providence, RI, 161--180. Google ScholarDigital Library
- Toledo, S. and Gustavson, F. 1996. The design and implementation of SOLAR, a portable library for scalable out-of-core linear algebra computations. In Proceedings of the 4th Workshop on I/O in Parallel and Distributed Systems, 28--40. Google ScholarDigital Library
- Trefethen, L. N. and Schreiber, R. S. 1990. Average-Case stability of Gaussian elimination. SIAM J. Matrix Anal. Appl. 11, 3, 335--360. Google ScholarDigital Library
- Wilkinson, J. H. 1965. The Algebraic Eigenvalue Problem. Oxford University Press, London. Google ScholarDigital Library
- Yip, E. L. 1979. Fortran subroutines for out-of-core solutions of large complex linear systems. Tech. Rep. CR-159142, NASA.Google Scholar
Index Terms
- Updating an LU Factorization with Pivoting
Recommendations
LU Factorization with Panel Rank Revealing Pivoting and Its Communication Avoiding Version
We present block LU factorization with panel rank revealing pivoting (block LU_PRRP), a decomposition algorithm based on strong rank revealing QR panel factorization. Block LU_PRRP is more stable than Gaussian elimination with partial pivoting (GEPP), with a ...
Block pivoting implementation of a symmetric Toeplitz solver
Toeplitz matrices are characterized by a special structure that can be exploited in order to obtain fast linear system solvers. These solvers are difficult to parallelize due to their low computational cost and their closely coupled data operations. We ...
An Unsymmetric-Pattern Multifrontal Method for Sparse LU Factorization
Sparse matrix factorization algorithms for general problems are typically characterized by irregular memory access patterns that limit their performance on parallel-vector supercomputers. For symmetric problems, methods such as the multifrontal method ...
Comments