
IEEE-754 Precision-p base-β Arithmetic Implemented in Binary

Author:
Siegfried M. Rump

Institute for Reliable Computing, Hamburg University of Technology, Germany, and Visiting Professor at Waseda University, Faculty of Science and Engineering, Japan

ORCID: 0000-0002-4779-4800
ACM Transactions on Mathematical Software, Volume 49, Issue 4, Article No. 32, pp. 1–21. https://doi.org/10.1145/3596218

Published: 15 December 2023

Abstract

We show how an IEEE-754 conformant precision-p base-β arithmetic can be implemented based on some binary floating-point and/or integer arithmetic. This includes the four basic operations and square root subject to the five IEEE-754 rounding modes, namely the nearest roundings with roundTiesToEven and roundTiesToAway, the directed roundings downwards and upwards, as well as rounding towards zero. Exceptional values like ∞ or NaN are covered according to the IEEE-754 arithmetic standard.
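To make the five rounding modes concrete, here is a minimal Python sketch that rounds an exact integer to at most p significant base-β digits. It is an illustration only, not the algorithm of the paper or of the flbeta-toolbox: the function name `round_to_precision` is hypothetical, Python's unbounded integers stand in for the underlying binary arithmetic, and the mode strings follow the attribute names used in IEEE 754.

```python
def round_to_precision(m: int, beta: int, p: int, mode: str) -> int:
    """Round the integer m to at most p significant base-beta digits.

    Hypothetical helper for illustration; mode is one of the IEEE 754
    rounding-direction attribute names listed below.
    """
    sign = -1 if m < 0 else 1
    a = abs(m)

    # Count the base-beta digits of |m|.
    digits, t = 0, a
    while t > 0:
        t //= beta
        digits += 1
    if digits <= p:
        return m  # already representable, nothing to round

    unit = beta ** (digits - p)   # weight of the last kept digit
    lo, rem = divmod(a, unit)     # truncated significand and discarded part
    if rem == 0:
        return m                  # exact, no rounding needed

    if mode == "roundTowardZero":
        up = False
    elif mode == "roundTowardPositive":
        up = sign > 0             # magnitude grows only for positive m
    elif mode == "roundTowardNegative":
        up = sign < 0             # magnitude grows only for negative m
    elif mode == "roundTiesToEven":
        up = 2 * rem > unit or (2 * rem == unit and lo % 2 == 1)
    elif mode == "roundTiesToAway":
        up = 2 * rem >= unit
    else:
        raise ValueError(f"unknown rounding mode: {mode}")

    # lo + 1 may equal beta**p; the product is then a power of beta,
    # which is still representable in p digits.
    return sign * (lo + int(up)) * unit


print(round_to_precision(12350, 10, 3, "roundTiesToEven"))   # 12400
print(round_to_precision(12250, 10, 3, "roundTiesToEven"))   # 12200
print(round_to_precision(-12345, 10, 3, "roundTowardNegative"))  # -12400
```

The tie cases show the difference between the two nearest roundings: 12250 rounds to 12200 under roundTiesToEven but to 12300 under roundTiesToAway.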

The results of the precision-p base-β operations are computed using some underlying precision-q binary arithmetic. We distinguish two cases. When using a precision-q binary integer arithmetic, the base-β precision p is limited for all operations by β^{2p} ≤ 2^q, whereas using a precision-q binary floating-point arithmetic imposes stronger limits on the base-β precision, namely β^{2p} ≤ 2^q for addition and multiplication, β^{2p} ≤ 2^{q-1} for division and β^{2p} ≤ 2^{q-3} for the square root. Those limitations cannot be improved.
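As a rough illustration of what these limits mean in practice, the following Python sketch computes the largest admissible precision p for a given base β and binary precision q. It assumes the bounds are read as β^{2p} ≤ 2^{q-s} with s = 0 for integer arithmetic and for floating-point addition and multiplication, s = 1 for floating-point division, and s = 3 for the floating-point square root. The function name `max_precision` and the parameter `slack` are hypothetical and not part of the flbeta-toolbox.

```python
def max_precision(beta: int, q: int, slack: int = 0) -> int:
    """Largest p with beta**(2*p) <= 2**(q - slack).

    slack is the assumed reduction of the exponent: 0, 1 or 3 depending
    on the operation (see the lead-in); exact integer arithmetic is used.
    """
    if beta < 2:
        raise ValueError("base must be at least 2")
    limit = 1 << (q - slack)
    p = 0
    while beta ** (2 * (p + 1)) <= limit:
        p += 1
    return p


# uint64 (q = 64) versus binary64 (q = 53) for decimal arithmetic
print(max_precision(10, 64))            # 9
print(max_precision(10, 53))            # 7
# binary target format on binary64: the square root costs one digit
print(max_precision(2, 53))             # 26
print(max_precision(2, 53, slack=3))    # 25
```

Under this reading, uint64 admits up to 9 decimal digits while binary64 admits 7, which matches the statement that the integer variant allows larger precisions.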

The algorithms are implemented in a Matlab/Octave flbeta-toolbox with the choice of using uint64 or binary64 as underlying arithmetic. The former allows larger precisions, the latter is advantageous for the square root, whereas computing times are similar. The flbeta-toolbox offers precision-p base-β scalar, vector and matrix operations including sparse matrices as well as corresponding interval operations. The base β can be chosen in the range β ∊ [2,64]. The flbeta-toolbox will be part of Version 13 of INTLAB [18], the Matlab/Octave toolbox for reliable computing.

REFERENCES

[1] R. Brent and P. Zimmermann. 2010. Modern Computer Arithmetic. Cambridge University Press, New York, NY, USA.
[2] M. Fasi and M. Mikaitis. 2020. CPFloat: A C library for simulating low-precision arithmetic. MIMS EPrint 2020.22, University of Manchester. Accepted for publication in ACM TOMS; see also http://eprints.maths.manchester.ac.uk/2855/1/fami22.pdf
[3] Samuel A. Figueroa. 1995. When is double rounding innocuous? SIGNUM Newsl. 30, 3 (1995), 21–26.
[4] G. Flegar, F. Scheidegger, V. Novaković, G. Mariani, A. E. Tomàs, A. C. I. Malossi, and E. S. Quintana-Ortí. 2019. FloatX: A C++ library for customized floating-point arithmetic. ACM Trans. Math. Softw. 45 (2019), 1–40.
[5] L. Fousse, G. Hanrot, V. Lefèvre, P. Pélissier, and P. Zimmermann. 2005. MPFR: A multiple-precision binary floating-point library with correct rounding. Research Report RR-5753, INRIA. Code and documentation available at http://hal.inria.fr/inria-00000818
[6] N. J. Higham. 2002. Accuracy and Stability of Numerical Algorithms (2nd Ed.). SIAM Publications, Philadelphia.
[7] N. J. Higham and S. Pranesh. 2019. Simulating low precision floating-point arithmetic. SIAM Journal on Scientific Computing 41, 5 (2019), C585–C602.
[8] IEEE. 1987. ANSI/IEEE 854-1987, Standard for radix-independent floating-point arithmetic.
[9] IEEE. 2008. ANSI/IEEE 754-2008: IEEE standard for floating-point arithmetic. IEEE, New York.
[10] IEEE. 2019. IEEE standard for floating-point arithmetic. IEEE Std 754-2019 (revision of IEEE 754-2008), 1–84.
[11] C.-P. Jeannerod and S. M. Rump. 2017. On relative errors of floating-point operations: Optimal bounds and applications. Math. Comput. 87 (2017), 803–819.
[12] D. E. Knuth. 1998. The Art of Computer Programming: Seminumerical Algorithms (3rd Ed.), Vol. 2. Addison Wesley, Reading, Massachusetts.
[13] V. Lefèvre. 2013. Sipe: A mini-library for very low precision computations with correct rounding. http://hal.inria.fr/hal-00864580. Submitted.
[14] G. Meurant. 2020. FLOATP_toolbox, Matlab software, variable precision floating point arithmetic. https://gerard-meurant.pagesperso-orange.fr/soft_meurant_n.html
[15] C. B. Moler. 2019. Variable format half and quarter precision floating-point arithmetic. Cleve’s Corner, 2019. https://de.mathworks.com/matlabcentral/fileexchange/59085-cleve_lab
[16] J.-M. Muller, N. Brunie, F. de Dinechin, C.-P. Jeannerod, M. Joldes, V. Lefèvre, G. Melquiond, R. Revol, and S. Torres. 2018. Handbook of Floating-point Arithmetic (2nd Ed.). Birkhäuser, Boston.
[17] P. Roux. 2014. Innocuous double rounding of basic arithmetic operations. J. Formal. Reason. 7, 1 (2014), 131–142.
[18] S. M. Rump. 1999. INTLAB - INTerval LABoratory. In Developments in Reliable Computing, Tibor Csendes (Ed.). Kluwer Academic Publishers, Dordrecht, 77–104. http://www.ti3.tu-harburg.de/rump/intlab/index.html
[19] S. M. Rump. 2012. Fast interval matrix multiplication. Numerical Algorithms 61, 1 (2012), 1–34.
[20] S. M. Rump. 2017. IEEE754 precision-k base-β arithmetic inherited by precision-m base-β arithmetic for k < m. ACM Trans. Math. Software 43, 3 (2017).
[21] P. H. Sterbenz. 1974. Floating-point Computation. Prentice Hall, Englewood Cliffs, NJ.
[22] G. W. Stewart. 2009. Flap: A Matlab package for adjustable precision floating-point arithmetic. http://www.cs.umd.edu/stewart/flap/flap.html
[23] G. W. Stewart. 2014. Private communication.

Index Terms

Mathematics of computing → Mathematical software

Published in
ACM Transactions on Mathematical Software Volume 49, Issue 4
December 2023
226 pages
ISSN: 0098-3500
EISSN: 1557-7295
DOI: 10.1145/3637452
Editors: Zhaojun Bai (University of California at Davis, USA), Wolfgang Bangerth (Colorado State University, USA)
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 15 December 2023
- Online AM: 21 August 2023
- Accepted: 30 March 2023
- Revised: 10 October 2022
- Received: 6 December 2021
Author Tags
Floating-point arithmetic
precision-p
base-β
IEEE-754
double rounding
interval arithmetic
INTLAB