research-article

Dense Linear Algebra over Word-Size Prime Fields: the FFLAS and FFPACK Packages

Authors:

Jean-Guillaume Dumas,

Clément PernetAuthors Info & Claims

ACM Transactions on Mathematical Software (TOMS), Volume 35, Issue 3

Article No.: 19, Pages 1 - 42

https://doi.org/10.1145/1391989.1391992

Published: 01 October 2008 Publication History

Abstract

In the past two decades, some major efforts have been made to reduce exact (e.g. integer, rational, polynomial) linear algebra problems to matrix multiplication in order to provide algorithms with optimal asymptotic complexity. To provide efficient implementations of such algorithms one need to be careful with the underlying arithmetic. It is well known that modular techniques such as the Chinese remainder algorithm or the p-adic lifting allow very good practical performance, especially when word size arithmetic is used. Therefore, finite field arithmetic becomes an important core for efficient exact linear algebra libraries. In this article, we study high performance implementations of basic linear algebra routines over word size prime fields: especially matrix multiplication; our goal being to provide an exact alternate to the numerical BLAS library. We show that this is made possible by a careful combination of numerical computations and asymptotically faster algorithms. Our kernel has several symbolic linear algebra applications enabled by diverse matrix multiplication reductions: symbolic triangularization, system solving, determinant, and matrix inverse implementations are thus studied.

Supplementary Material

Electonic Appendix (apdx1.pdf)

The proof of Theorem 3.1 is given in an electronic appendix, available online in the ACM Digital Library.

Download
64.24 KB

References

[1]

Aho, A. V., Hopcroft, J. E., and Ullman, J. D. 1974. The Design and Analysis of Computer Algorithms. Addison-Wesley.

Digital Library

[2]

Anderson, E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A., and Sorensen, D. 1999. LAPACK Users’ Guide, Third ed. Society for Industrial and Applied Mathematics, Philadelphia, PA.

Digital Library

[3]

Bini, D. and Pan, V. 1994. Polynomial and Matrix Computations, Volume 1: Fundamental Algorithms. Birkhauser, Boston.

Digital Library

[4]

Brassel, M., Giorgi, P., and Pernet, C. 2003. LUdivine: A symbolic block LU factorisation for matrices over finite fields using BLAS. Poster, http://ljk.imag.fr/membres/Jean−Guillaume.Dumas/FFLAS/FFLAS_Download/ludivine_poster_eccad2003.ps.gz.

[5]

Bunch, J. R. and Hopcroft, J. E. 1974. Triangular factorization and inversion by fast matrix multiplication. Mathematics of Computation 28, 231--236.

[6]

Chen, Z. and Storjohann, A. 2003. Effective reductions to matrix multiplication. 9th International Conference on Applications of Computer Algebra (ACA’2003), Raleigh, North Carolina State University.

[7]

Coppersmith, D. and Winograd, S. 1990. Matrix multiplication via arithmetic progressions. J. Symbol. Computation 9, 3, 251--280.

Digital Library

[8]

Courrieu, P. 2005. Fast computation of Moore-Penrose inverse matrices. Neural Information Processing - Letters and Reviews 8, 2 (Aug.), 25--29.

[9]

Dixon, J. D. 1982. Exact solution of linear equations using p-adic expansions. Numerische Mathematik 40, 137--141.

Digital Library

[10]

Dongarra, J. and Eijkhout, V. 2003. Self-adapting numerical software and automatic tuning of heuristics. Lecture Notes in Computer Science 2660, 759--770.

Digital Library

[11]

Dongarra, J. J., Croz, J. D., Hammarling, S., and Duff, I. 1990. A set of level 3 basic linear algebra subprograms. Trans. Math. Soft. 16, 1 (Mar.), 1--17. http://doi.acm.org/10.1145/77626.79170.

Digital Library

[12]

Douglas, C. C., Heroux, M., Slishman, G., and Smith, R. M. 1994. Gemmw: a portable level 3 blas Winograd variant of Strassen’s matrix-matrix multiply algorithm. J. Computational Phys, 110, 1--10.

Digital Library

[13]

Dumas, J.-G., Giorgi, P., and Pernet, C. 2006. FFLAS-FFPACK: Finite field linear algebra subroutine/package. Software, http://ciel.ccsd.cnrs.fr/ciel−00000025. Feb.

[14]

Dumas, J.-G. 2004. Efficient dot product over finite fields. In Proceedings of the Seventh International Workshop on Computer Algebra in Scientific Computing, (Yalta, Ukraine). V. G. Ganzha, E. W. Mayr, and E. V. Vorozhtsov, Eds. Technische Universität München, Germany, 139--154.

[15]

Dumas, J.-G. 2007. Q-adic transform revisited. Tech. Rep. 0710.0510 {cs.SC}, ArXiv. Oct. http://hal.archives-ouvertes.fr/hal-00173894.

[16]

Dumas, J.-G., Gautier, T., Giesbrecht, M., Giorgi, P., Hovinen, B., Kaltofen, E., Saunders, B. D., Turner, W. J., and Villard, G. 2002a. LinBox: A generic library for exact linear algebra. In Proceedings of the 2002 International Congress of Mathematical Software (Beijing, China). A. M. Cohen, X.-S. Gao, and N. Takayama, Eds. World Scientific Pub, 40--50.

[17]

Dumas, J.-G., Gautier, T., and Pernet, C. 2002b. Finite field linear algebra subroutines. In Proceedings of the 2002 International Symposium on Symbolic and Algebraic Computation (Lille, France). T. Mora, Ed. ACM Press, New York, 63--74.

Digital Library

[18]

Dumas, J.-G., Giorgi, P., and Pernet, C. 2004. FFPACK: Finite field linear algebra package. In Proceedings of the 2004 International Symposium on Symbolic and Algebraic Computation (Santander, Spain). J. Gutierrez, Ed. ACM Press, New York, 119--126.

Digital Library

[19]

Dumas, J.-G., Pernet, C., and Roch, J.-L. 2006. Adaptive triangular system solving. In Challenges in Symbolic Computation Software. Dagstuhl Seminar proceedings 06271, paper 770.

[20]

Dumas, J.-G., Pernet, C., and Wan, Z. 2005. Efficient computation of the characteristic polynomial. In Proceedings of the 2005 International Symposium on Symbolic and Algebraic Computation (Beijing, China). M. Kauers, Ed. ACM Press, New York, 140--147.

Digital Library

[21]

Dumas, J.-G., Pernet, C., and Zhou, W. 2007. Memory efficient scheduling of Strassen-Winograd’s matrix multiplication algorithm. Tech. rep., arXiv:0707.2347v2. Aug. http://arxiv.org/abs/0707.2347v2.

[22]

Dumas, J.-G. and Roch, J.-L. 2002. On parallel block algorithms for exact triangularizations. Para. Comput. 28, 11 (Nov.), 1531--1548.

Digital Library

[23]

Dumas, J.-G., Saunders, B. D., and Villard, G. 2001. On efficient sparse integer matrix Smith normal form computations. J. Symbol. Computat. 32, 1/2 (July--Aug.), 71--99.

Digital Library

[24]

Gathen, J. v. and Gerhard, J. 1999. Modern Computer Algebra. Cambridge University Press, New York, NY.

Digital Library

[25]

Giorgi, P. 2003. From BLAS routines to finite field exact linear algebra solutions. 9th International Conference on Applications of Computer Algebra (ACA’2003), Raleigh, North Carolina State University.

[26]

Giorgi, P., Jeannerod, C.-P., and Villard, G. 2003. On the complexity of polynomial matrix computations. In Proceedings of the 2003 International Symposium on Symbolic and Algebraic Computation (Philadelphia, Pennsylvania), R. Sendra, Ed. ACM Press, New York, 135--142.

Digital Library

[27]

Golub, G. H. and Van Loan, C. F. 1996. Matrix Computations, third ed. Johns Hopkins Studies in the Mathematical Sciences. The Johns Hopkins University Press, Baltimore, MD.

Digital Library

[28]

Goto, K. and van de Geijn, R. 2002. On reducing tlb misses in matrix multiplication. Tech. Rep. TR-2002-55, University of Texas. Nov. FLAME working note #9.

[29]

Gustavson, F., Henriksson, A., Jonsson, I., and Kaagstroem, B. 1998. Recursive blocked data formats and BLAS’s for dense linear algebra algorithms. Lecture Notes in Computer Science 1541, 195--206.

Digital Library

[30]

Higham, N. J. 1990. Exploiting fast matrix multiplication within the level 3 BLAS. Trans. Math. Softw. 16, 4 (Dec.), 352--368.

Digital Library

[31]

Huss-Lederman, S., Jacobson, E. M., Johnson, J. R., Tsao, A., and Turnbull, T. 1996a. Implementation of Strassen’s algorithm for matrix multiplication. In Supercomputing ’96 Conference Proceedings: November 17--22, Pittsburgh, PA, ACM, Ed. ACM Press and IEEE Computer Society Press, New York, NY 10036, and 1109 Spring Street, Suite 300, Silver Spring, MD 20910. http://doi.acm.org/10.1145/369028.369096.

Digital Library

[32]

Huss-Lederman, S., Jacobson, E. M., Johnson, J. R., Tsao, A., and Turnbull, T. 1996b. Strassen’s algorithm for matrix multiplication : modeling analysis, and implementation. Tech. rep., Center for Computing Sciences. Nov. CCS-TR-96-17.

[33]

Ibarra, O. H., Moran, S., and Hui, R. 1982. A generalization of the fast LUP matrix decomposition algorithm and applications. J. Algor. 3, 1 (Mar.), 45--56.

[34]

Kaltofen, E. and Villard, G. 2005. On the complexity of computing determinants. Computational Complexity 13, 3-4, 91--130.

Digital Library

[35]

Kaporin, I. 2004. The aggregation and cancellation techniques as a practical tool for faster matrix multiplication. Theor. Comput. Sci. 315, 2-3, 469--510.

Digital Library

[36]

Laderman, J., Pan, V., and Sha, X.-H. 1992. On practical algorithms for accelerated matrix multiplication. Linear Algebra Appl. 162--164, 557--588.

[37]

Montgomery, P. L. 1985. Modular multiplication without trial division. Math. Computat. 44, 170 (Apr.), 519--521.

[38]

Montgomery, P. L. 1995. A block Lanczos algorithm for finding dependencies over gf(2). In Proceedings of the 1995 International Conference on the Theory and Application of Cryptographic Techniques (Saint-Malo, France). L. C. Guillou and J.-J. Quisquater, Eds. Lecture Notes in Computer Science, vol. 921. 106--120.

Digital Library

[39]

Noble, B. 1966. A method for computing the generalized inverse of a matrix. SIAM J. Numer. Anal. 3, 4 (Dec.), 582--584.

[40]

Odlyzko, A. M. 2000. Discrete logarithms: The past and the future. Designs, Codes, and Cryptography 19, 129--145.

Digital Library

[41]

Pernet, C. 2001. Implementation of Winograd’s matrix multiplication over finite fields using ATLAS level 3 BLAS. Tech. Rep. RR011122, Laboratoire Informatique et Distribution. July. http://ljk.imag.fr/membres/Jean−Guillaume.Dumas/FFLAS/FFLAS_Download/FFLAS_technical_report.ps.gz.

[42]

Saunders, B. D. 2001. Black box methods for least squares problems. In ISSAC 2001: July 22--25, 2001, University of Western Ontario, London, Ontario, Canada: proceedings of the 2001 International Symposium on Symbolic and Algebraic Computation, B. Mourrain, Ed. 297--302.

Digital Library

[43]

Shoup, V. 2002. NTL 5.3: A library for doing number theory. www.shoup.net/ntl.

[44]

Storjohann, A. 2005. The shifted number system for fast linear algebra on integer matrices. J. Complex. 21, 4, 609--650.

Digital Library

[45]

Strassen, V. 1969. Gaussian elimination is not optimal. Numerische Mathematik 13, 354--356.

Digital Library

[46]

Whaley, R. C., Petitet, A., and Dongarra, J. J. 2001. Automated empirical optimizations of software and the ATLAS project. Para. Comput. 27, 1--2 (Jan.), 3--35. http://www.netlib.org/utk/people/JackDongarra/PAPERS/atlas_pub.pdf.

[47]

Zassenhaus, H. 1978. A remark on the Hensel factorization method. Mathematics of Computation 32, 141 (Jan.), 287--292.

Cited By

Bae YCheon JHanrot GPark JStehlé D(2024)Plaintext-Ciphertext Matrix Multiplication and FHE Bootstrapping: Fast and FusedAdvances in Cryptology – CRYPTO 202410.1007/978-3-031-68382-4_12(387-421)Online publication date: 18-Aug-2024
https://dl.acm.org/doi/10.1007/978-3-031-68382-4_12
Chen BLi YZima E(2023)On a Two-Layer Modular ArithmeticACM Communications in Computer Algebra10.1145/3637529.363753457:3(133-136)Online publication date: 13-Dec-2023
https://dl.acm.org/doi/10.1145/3637529.3637534
Ayotte DCaruso XLeudière AMusleh J(2023)Drinfeld Modules in SageMathACM Communications in Computer Algebra10.1145/3614408.361441757:2(65-71)Online publication date: 7-Aug-2023
https://dl.acm.org/doi/10.1145/3614408.3614417
Show More Cited By

Index Terms

Dense Linear Algebra over Word-Size Prime Fields: the FFLAS and FFPACK Packages
1. Mathematics of computing
  1. Mathematical analysis
    1. Numerical analysis
      1. Computations in finite fields
  2. Mathematical software

Recommendations

Finite field linear algebra subroutines
ISSAC '02: Proceedings of the 2002 international symposium on Symbolic and algebraic computation

In this paper we study different implementations of finite field arithmetic, essential foundation of computer algebra. We focus on Galois fields of word size cardinality at most, with any characteristic. Classical representations as machine integers, ...
Computational properties of pentadiagonal and anti-pentadiagonal block band matrices with perturbed corners
Abstract
In this paper, we derive an orthogonal block diagonalization and a number of formulas for pentadiagonal block band symmetric Toeplitz matrices and anti-pentadiagonal block band persymmetric Hankel matrices, both with perturbed corners. Namely, our ...
Explicit inverse matrices of Tribonacci skew circulant type matrices

The determinants and inverses of Tribonacci skew circulant type matrices are discussed in the paper. Firstly, Tribonacci skew circulant type matrices are defined. In addition, we show the invertibility of the Tribonacci skew circulant matrix and present ...

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Mathematical Software

ACM Transactions on Mathematical Software Volume 35, Issue 3

October 2008

164 pages

ISSN:0098-3500

EISSN:1557-7295

DOI:10.1145/1391989

Issue’s Table of Contents

Copyright © 2008 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 October 2008

Accepted: 01 February 2008

Revised: 01 January 2008

Received: 01 February 2006

Published in TOMS Volume 35, Issue 3

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Research
Refereed

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

57
Total Citations
View Citations
417
Total Downloads

Downloads (Last 12 months)9
Downloads (Last 6 weeks)2

Reflects downloads up to 17 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

Bae YCheon JHanrot GPark JStehlé D(2024)Plaintext-Ciphertext Matrix Multiplication and FHE Bootstrapping: Fast and FusedAdvances in Cryptology – CRYPTO 202410.1007/978-3-031-68382-4_12(387-421)Online publication date: 18-Aug-2024
https://dl.acm.org/doi/10.1007/978-3-031-68382-4_12
Chen BLi YZima E(2023)On a Two-Layer Modular ArithmeticACM Communications in Computer Algebra10.1145/3637529.363753457:3(133-136)Online publication date: 13-Dec-2023
https://dl.acm.org/doi/10.1145/3637529.3637534
Ayotte DCaruso XLeudière AMusleh J(2023)Drinfeld Modules in SageMathACM Communications in Computer Algebra10.1145/3614408.361441757:2(65-71)Online publication date: 7-Aug-2023
https://dl.acm.org/doi/10.1145/3614408.3614417
Tabuguia B(2023)Operations for D-Algebraic FunctionsACM Communications in Computer Algebra10.1145/3614408.361441557:2(51-56)Online publication date: 7-Aug-2023
https://dl.acm.org/doi/10.1145/3614408.3614415
Berthomieu JGraillat SLesnoff DMary T(2023)Modular Matrix Multiplication on GPU for Polynomial System SolvingACM Communications in Computer Algebra10.1145/3614408.361441157:2(35-38)Online publication date: 7-Aug-2023
https://dl.acm.org/doi/10.1145/3614408.3614411
Okuda KShirayanagi K(2023)Using the Interval-Symbol Method with Zero Rewriting to Factor Polynomials over Algebraic Number FieldsACM Communications in Computer Algebra10.1145/3614408.361440957:2(21-30)Online publication date: 7-Aug-2023
https://dl.acm.org/doi/10.1145/3614408.3614409
Dumas JPernet CSedoglavic A(2023)Some fast algorithms multiplying a matrix by its adjointJournal of Symbolic Computation10.1016/j.jsc.2022.08.009115:C(285-315)Online publication date: 1-Mar-2023
https://dl.acm.org/doi/10.1016/j.jsc.2022.08.009
Lourenco CChen JMoreno-Centeno EDavis T(2022)Algorithm 1021: SPEX Left LU, Exactly Solving Sparse Linear Systems via a Sparse Left-looking Integer-preserving LU FactorizationACM Transactions on Mathematical Software10.1145/351902448:2(1-23)Online publication date: 26-May-2022
https://dl.acm.org/doi/10.1145/3519024
Aurentz JNavarro AInsua D(2022)Learning the Rules of the Game: An Interpretable AI for Learning How to PlayIEEE Transactions on Games10.1109/TG.2021.306624514:2(253-261)Online publication date: Jun-2022
https://doi.org/10.1109/TG.2021.3066245
Torrance DVannieuwenhoven N(2022)Almost all subgeneric third-order Chow decompositions are identifiableAnnali di Matematica Pura ed Applicata (1923 -)10.1007/s10231-022-01224-8201:6(2891-2905)Online publication date: 6-Jun-2022
https://doi.org/10.1007/s10231-022-01224-8
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Issue’s Table of Contents