skip to main content
research-article

Dense Linear Algebra over Word-Size Prime Fields: the FFLAS and FFPACK Packages

Published: 01 October 2008 Publication History

Abstract

In the past two decades, some major efforts have been made to reduce exact (e.g. integer, rational, polynomial) linear algebra problems to matrix multiplication in order to provide algorithms with optimal asymptotic complexity. To provide efficient implementations of such algorithms one need to be careful with the underlying arithmetic. It is well known that modular techniques such as the Chinese remainder algorithm or the p-adic lifting allow very good practical performance, especially when word size arithmetic is used. Therefore, finite field arithmetic becomes an important core for efficient exact linear algebra libraries. In this article, we study high performance implementations of basic linear algebra routines over word size prime fields: especially matrix multiplication; our goal being to provide an exact alternate to the numerical BLAS library. We show that this is made possible by a careful combination of numerical computations and asymptotically faster algorithms. Our kernel has several symbolic linear algebra applications enabled by diverse matrix multiplication reductions: symbolic triangularization, system solving, determinant, and matrix inverse implementations are thus studied.

Supplementary Material

Electonic Appendix (apdx1.pdf)
The proof of Theorem 3.1 is given in an electronic appendix, available online in the ACM Digital Library.

References

[1]
Aho, A. V., Hopcroft, J. E., and Ullman, J. D. 1974. The Design and Analysis of Computer Algorithms. Addison-Wesley.
[2]
Anderson, E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A., and Sorensen, D. 1999. LAPACK Users’ Guide, Third ed. Society for Industrial and Applied Mathematics, Philadelphia, PA.
[3]
Bini, D. and Pan, V. 1994. Polynomial and Matrix Computations, Volume 1: Fundamental Algorithms. Birkhauser, Boston.
[4]
Brassel, M., Giorgi, P., and Pernet, C. 2003. LUdivine: A symbolic block LU factorisation for matrices over finite fields using BLAS. Poster, http://ljk.imag.fr/membres/Jean−Guillaume.Dumas/FFLAS/FFLAS_Download/ludivine_poster_eccad2003.ps.gz.
[5]
Bunch, J. R. and Hopcroft, J. E. 1974. Triangular factorization and inversion by fast matrix multiplication. Mathematics of Computation 28, 231--236.
[6]
Chen, Z. and Storjohann, A. 2003. Effective reductions to matrix multiplication. 9th International Conference on Applications of Computer Algebra (ACA’2003), Raleigh, North Carolina State University.
[7]
Coppersmith, D. and Winograd, S. 1990. Matrix multiplication via arithmetic progressions. J. Symbol. Computation 9, 3, 251--280.
[8]
Courrieu, P. 2005. Fast computation of Moore-Penrose inverse matrices. Neural Information Processing - Letters and Reviews 8, 2 (Aug.), 25--29.
[9]
Dixon, J. D. 1982. Exact solution of linear equations using p-adic expansions. Numerische Mathematik 40, 137--141.
[10]
Dongarra, J. and Eijkhout, V. 2003. Self-adapting numerical software and automatic tuning of heuristics. Lecture Notes in Computer Science 2660, 759--770.
[11]
Dongarra, J. J., Croz, J. D., Hammarling, S., and Duff, I. 1990. A set of level 3 basic linear algebra subprograms. Trans. Math. Soft. 16, 1 (Mar.), 1--17. http://doi.acm.org/10.1145/77626.79170.
[12]
Douglas, C. C., Heroux, M., Slishman, G., and Smith, R. M. 1994. Gemmw: a portable level 3 blas Winograd variant of Strassen’s matrix-matrix multiply algorithm. J. Computational Phys, 110, 1--10.
[13]
Dumas, J.-G., Giorgi, P., and Pernet, C. 2006. FFLAS-FFPACK: Finite field linear algebra subroutine/package. Software, http://ciel.ccsd.cnrs.fr/ciel−00000025. Feb.
[14]
Dumas, J.-G. 2004. Efficient dot product over finite fields. In Proceedings of the Seventh International Workshop on Computer Algebra in Scientific Computing, (Yalta, Ukraine). V. G. Ganzha, E. W. Mayr, and E. V. Vorozhtsov, Eds. Technische Universität München, Germany, 139--154.
[15]
Dumas, J.-G. 2007. Q-adic transform revisited. Tech. Rep. 0710.0510 {cs.SC}, ArXiv. Oct. http://hal.archives-ouvertes.fr/hal-00173894.
[16]
Dumas, J.-G., Gautier, T., Giesbrecht, M., Giorgi, P., Hovinen, B., Kaltofen, E., Saunders, B. D., Turner, W. J., and Villard, G. 2002a. LinBox: A generic library for exact linear algebra. In Proceedings of the 2002 International Congress of Mathematical Software (Beijing, China). A. M. Cohen, X.-S. Gao, and N. Takayama, Eds. World Scientific Pub, 40--50.
[17]
Dumas, J.-G., Gautier, T., and Pernet, C. 2002b. Finite field linear algebra subroutines. In Proceedings of the 2002 International Symposium on Symbolic and Algebraic Computation (Lille, France). T. Mora, Ed. ACM Press, New York, 63--74.
[18]
Dumas, J.-G., Giorgi, P., and Pernet, C. 2004. FFPACK: Finite field linear algebra package. In Proceedings of the 2004 International Symposium on Symbolic and Algebraic Computation (Santander, Spain). J. Gutierrez, Ed. ACM Press, New York, 119--126.
[19]
Dumas, J.-G., Pernet, C., and Roch, J.-L. 2006. Adaptive triangular system solving. In Challenges in Symbolic Computation Software. Dagstuhl Seminar proceedings 06271, paper 770.
[20]
Dumas, J.-G., Pernet, C., and Wan, Z. 2005. Efficient computation of the characteristic polynomial. In Proceedings of the 2005 International Symposium on Symbolic and Algebraic Computation (Beijing, China). M. Kauers, Ed. ACM Press, New York, 140--147.
[21]
Dumas, J.-G., Pernet, C., and Zhou, W. 2007. Memory efficient scheduling of Strassen-Winograd’s matrix multiplication algorithm. Tech. rep., arXiv:0707.2347v2. Aug. http://arxiv.org/abs/0707.2347v2.
[22]
Dumas, J.-G. and Roch, J.-L. 2002. On parallel block algorithms for exact triangularizations. Para. Comput. 28, 11 (Nov.), 1531--1548.
[23]
Dumas, J.-G., Saunders, B. D., and Villard, G. 2001. On efficient sparse integer matrix Smith normal form computations. J. Symbol. Computat. 32, 1/2 (July--Aug.), 71--99.
[24]
Gathen, J. v. and Gerhard, J. 1999. Modern Computer Algebra. Cambridge University Press, New York, NY.
[25]
Giorgi, P. 2003. From BLAS routines to finite field exact linear algebra solutions. 9th International Conference on Applications of Computer Algebra (ACA’2003), Raleigh, North Carolina State University.
[26]
Giorgi, P., Jeannerod, C.-P., and Villard, G. 2003. On the complexity of polynomial matrix computations. In Proceedings of the 2003 International Symposium on Symbolic and Algebraic Computation (Philadelphia, Pennsylvania), R. Sendra, Ed. ACM Press, New York, 135--142.
[27]
Golub, G. H. and Van Loan, C. F. 1996. Matrix Computations, third ed. Johns Hopkins Studies in the Mathematical Sciences. The Johns Hopkins University Press, Baltimore, MD.
[28]
Goto, K. and van de Geijn, R. 2002. On reducing tlb misses in matrix multiplication. Tech. Rep. TR-2002-55, University of Texas. Nov. FLAME working note #9.
[29]
Gustavson, F., Henriksson, A., Jonsson, I., and Kaagstroem, B. 1998. Recursive blocked data formats and BLAS’s for dense linear algebra algorithms. Lecture Notes in Computer Science 1541, 195--206.
[30]
Higham, N. J. 1990. Exploiting fast matrix multiplication within the level 3 BLAS. Trans. Math. Softw. 16, 4 (Dec.), 352--368.
[31]
Huss-Lederman, S., Jacobson, E. M., Johnson, J. R., Tsao, A., and Turnbull, T. 1996a. Implementation of Strassen’s algorithm for matrix multiplication. In Supercomputing ’96 Conference Proceedings: November 17--22, Pittsburgh, PA, ACM, Ed. ACM Press and IEEE Computer Society Press, New York, NY 10036, and 1109 Spring Street, Suite 300, Silver Spring, MD 20910. http://doi.acm.org/10.1145/369028.369096.
[32]
Huss-Lederman, S., Jacobson, E. M., Johnson, J. R., Tsao, A., and Turnbull, T. 1996b. Strassen’s algorithm for matrix multiplication : modeling analysis, and implementation. Tech. rep., Center for Computing Sciences. Nov. CCS-TR-96-17.
[33]
Ibarra, O. H., Moran, S., and Hui, R. 1982. A generalization of the fast LUP matrix decomposition algorithm and applications. J. Algor. 3, 1 (Mar.), 45--56.
[34]
Kaltofen, E. and Villard, G. 2005. On the complexity of computing determinants. Computational Complexity 13, 3-4, 91--130.
[35]
Kaporin, I. 2004. The aggregation and cancellation techniques as a practical tool for faster matrix multiplication. Theor. Comput. Sci. 315, 2-3, 469--510.
[36]
Laderman, J., Pan, V., and Sha, X.-H. 1992. On practical algorithms for accelerated matrix multiplication. Linear Algebra Appl. 162--164, 557--588.
[37]
Montgomery, P. L. 1985. Modular multiplication without trial division. Math. Computat. 44, 170 (Apr.), 519--521.
[38]
Montgomery, P. L. 1995. A block Lanczos algorithm for finding dependencies over gf(2). In Proceedings of the 1995 International Conference on the Theory and Application of Cryptographic Techniques (Saint-Malo, France). L. C. Guillou and J.-J. Quisquater, Eds. Lecture Notes in Computer Science, vol. 921. 106--120.
[39]
Noble, B. 1966. A method for computing the generalized inverse of a matrix. SIAM J. Numer. Anal. 3, 4 (Dec.), 582--584.
[40]
Odlyzko, A. M. 2000. Discrete logarithms: The past and the future. Designs, Codes, and Cryptography 19, 129--145.
[41]
Pernet, C. 2001. Implementation of Winograd’s matrix multiplication over finite fields using ATLAS level 3 BLAS. Tech. Rep. RR011122, Laboratoire Informatique et Distribution. July. http://ljk.imag.fr/membres/Jean−Guillaume.Dumas/FFLAS/FFLAS_Download/FFLAS_technical_report.ps.gz.
[42]
Saunders, B. D. 2001. Black box methods for least squares problems. In ISSAC 2001: July 22--25, 2001, University of Western Ontario, London, Ontario, Canada: proceedings of the 2001 International Symposium on Symbolic and Algebraic Computation, B. Mourrain, Ed. 297--302.
[43]
Shoup, V. 2002. NTL 5.3: A library for doing number theory. www.shoup.net/ntl.
[44]
Storjohann, A. 2005. The shifted number system for fast linear algebra on integer matrices. J. Complex. 21, 4, 609--650.
[45]
Strassen, V. 1969. Gaussian elimination is not optimal. Numerische Mathematik 13, 354--356.
[46]
Whaley, R. C., Petitet, A., and Dongarra, J. J. 2001. Automated empirical optimizations of software and the ATLAS project. Para. Comput. 27, 1--2 (Jan.), 3--35. http://www.netlib.org/utk/people/JackDongarra/PAPERS/atlas_pub.pdf.
[47]
Zassenhaus, H. 1978. A remark on the Hensel factorization method. Mathematics of Computation 32, 141 (Jan.), 287--292.

Cited By

View all

Index Terms

  1. Dense Linear Algebra over Word-Size Prime Fields: the FFLAS and FFPACK Packages

      Recommendations

      Comments

      Information & Contributors

      Information

      Published In

      cover image ACM Transactions on Mathematical Software
      ACM Transactions on Mathematical Software  Volume 35, Issue 3
      October 2008
      164 pages
      ISSN:0098-3500
      EISSN:1557-7295
      DOI:10.1145/1391989
      Issue’s Table of Contents
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 01 October 2008
      Accepted: 01 February 2008
      Revised: 01 January 2008
      Received: 01 February 2006
      Published in TOMS Volume 35, Issue 3

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. BLAS level 1-2-3
      2. Winograd’s symbolic matrix multiplication
      3. Word size prime fields
      4. determinant
      5. inverse
      6. linear algebra package
      7. matrix factorization

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)9
      • Downloads (Last 6 weeks)2
      Reflects downloads up to 17 Feb 2025

      Other Metrics

      Citations

      Cited By

      View all
      • (2024)Plaintext-Ciphertext Matrix Multiplication and FHE Bootstrapping: Fast and FusedAdvances in Cryptology – CRYPTO 202410.1007/978-3-031-68382-4_12(387-421)Online publication date: 18-Aug-2024
      • (2023)On a Two-Layer Modular ArithmeticACM Communications in Computer Algebra10.1145/3637529.363753457:3(133-136)Online publication date: 13-Dec-2023
      • (2023)Drinfeld Modules in SageMathACM Communications in Computer Algebra10.1145/3614408.361441757:2(65-71)Online publication date: 7-Aug-2023
      • (2023)Operations for D-Algebraic FunctionsACM Communications in Computer Algebra10.1145/3614408.361441557:2(51-56)Online publication date: 7-Aug-2023
      • (2023)Modular Matrix Multiplication on GPU for Polynomial System SolvingACM Communications in Computer Algebra10.1145/3614408.361441157:2(35-38)Online publication date: 7-Aug-2023
      • (2023)Using the Interval-Symbol Method with Zero Rewriting to Factor Polynomials over Algebraic Number FieldsACM Communications in Computer Algebra10.1145/3614408.361440957:2(21-30)Online publication date: 7-Aug-2023
      • (2023)Some fast algorithms multiplying a matrix by its adjointJournal of Symbolic Computation10.1016/j.jsc.2022.08.009115:C(285-315)Online publication date: 1-Mar-2023
      • (2022)Algorithm 1021: SPEX Left LU, Exactly Solving Sparse Linear Systems via a Sparse Left-looking Integer-preserving LU FactorizationACM Transactions on Mathematical Software10.1145/351902448:2(1-23)Online publication date: 26-May-2022
      • (2022)Learning the Rules of the Game: An Interpretable AI for Learning How to PlayIEEE Transactions on Games10.1109/TG.2021.306624514:2(253-261)Online publication date: Jun-2022
      • (2022)Almost all subgeneric third-order Chow decompositions are identifiableAnnali di Matematica Pura ed Applicata (1923 -)10.1007/s10231-022-01224-8201:6(2891-2905)Online publication date: 6-Jun-2022
      • Show More Cited By

      View Options

      Login options

      Full Access

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Figures

      Tables

      Media

      Share

      Share

      Share this Publication link

      Share on social media