skip to main content
article

Building high accuracy bloom filters using partitioned hashing

Published: 12 June 2007 Publication History

Abstract

The growing importance of operations such as packet-content inspection, packet classification based on non-IP headers, maintaining flow-state, etc. has led to increased interest in the networking applications of Bloom filters. This is because Bloom filters provide a relatively easy method for hardware implementation of set-membership queries. However, the tradeoff is that Bloom filters only provide a probabilistic test and membership queries can result in false positives. Ideally, we would like this false positive probability to be very low. The main contribution of this paper is a method for significantly reducing this false positive probability in comparison to existing schemes. This is done by developing a partitioned hashing method which results in a choice of hash functions that set far fewer bits in the Bloom filter bit vector than would be the case otherwise. This lower fill factor of the bit vector translates to a much lower false positive probability. We show experimentally that this improved choice can result in as much as a ten-fold increase in accuracy over standard Bloom filters. We also show that the scheme performs much better than other proposed schemes for improving Bloom filters.

References

[1]
B. H. Bloom, "Space/time tradeoffs in hash coding with allowable errors", Communications of the ACM 13:7 (1970), 422--426.
[2]
A. Kirsch and M. Mitzenmacher, "Less hashing, same performance: building a better Bloom filter", ESA 2006.
[3]
S. Lumetta and M. Mitzenmacher, "Using the power of two choices to improve Bloom filters", Preprint version available at http://www.eecs.harvard.edu/~michaelm.
[4]
A. Broder and M. Mitzenmacher, "Network applications fo Bloom filters: a survey", Internet Mathematics, vol. 1. no. 4, pp. 485--509, 2005.
[5]
R. Motwani and P. Raghavan, "Randomized algorithms", Cambridge University Press, August, 1995.
[6]
Robert Sedgewick, "Algorithms in C", Addison-Wesley Professional, August, 2001.
[7]
A. Ostlin and R. Pagh, "Uniform hashing in constant time and linear space", Proceedings of STOC 2003, ACM.
[8]
A. Pagh, R. Pagh, and S. Rao, "An optimal Bloom filter replacement", SODA 2005.
[9]
B. Chazelle, J. Kilian, R. Rubinfeld, and A. Tal, "The Bloomier filter: an efficient data structure for static support lookup tables", SODA 2004.
[10]
C. Estan and G. Varghese, "New directions in traffic measurement and accounting", Proceedings of ACM SIGCOMM Conference, 2002.
[11]
L. Fan, P. Cao, J. Almeida, and A. Z. Broder, "Summary cache: a scalable wide-area web cache sharing protocol", IEEE/ACM Transactions on Networking 8:3 (2000), 281--293.
[12]
E. H. Spafford, "Opus: preventing weak password choices", Computer and Security 11 (1992), 273--278.
[13]
U. Manber and S. Wu, "An algorithm for approximate membership checking with application to password security", Information Processing Letters 50 (1994), 19--197.
[14]
F. Bonomi, M. Mitzenmacher, R. Panigrahy, S. Singh, and G. Varghese, "Beyond Bloom filters: from approximate membership checks to approximate state machines", Proceedings of ACM SIGCOMM 2006.

Cited By

View all
  • (2019)Locality-Sensitive Bloom Filter for Approximate Membership QuerySearchable Storage in Cloud Computing10.1007/978-981-13-2721-6_5(99-127)Online publication date: 9-Feb-2019
  • (2019)Blockchain Retrieval Model Based on Elastic Bloom FilterWeb Information Systems and Applications10.1007/978-3-030-30952-7_53(527-538)Online publication date: 20-Sep-2019
  • (2019)Provably secure and efficient PUF‐based broadcast authentication schemes for smart grid applicationsInternational Journal of Communication Systems10.1002/dac.393532:8Online publication date: 5-Mar-2019
  • Show More Cited By

Index Terms

  1. Building high accuracy bloom filters using partitioned hashing

      Recommendations

      Comments

      Information & Contributors

      Information

      Published In

      cover image ACM SIGMETRICS Performance Evaluation Review
      ACM SIGMETRICS Performance Evaluation Review  Volume 35, Issue 1
      SIGMETRICS '07 Conference Proceedings
      June 2007
      382 pages
      ISSN:0163-5999
      DOI:10.1145/1269899
      Issue’s Table of Contents
      • cover image ACM Conferences
        SIGMETRICS '07: Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
        June 2007
        398 pages
        ISBN:9781595936394
        DOI:10.1145/1254882
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 12 June 2007
      Published in SIGMETRICS Volume 35, Issue 1

      Check for updates

      Author Tags

      1. bloom filter
      2. hashing

      Qualifiers

      • Article

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)11
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 02 Mar 2025

      Other Metrics

      Citations

      Cited By

      View all
      • (2019)Locality-Sensitive Bloom Filter for Approximate Membership QuerySearchable Storage in Cloud Computing10.1007/978-981-13-2721-6_5(99-127)Online publication date: 9-Feb-2019
      • (2019)Blockchain Retrieval Model Based on Elastic Bloom FilterWeb Information Systems and Applications10.1007/978-3-030-30952-7_53(527-538)Online publication date: 20-Sep-2019
      • (2019)Provably secure and efficient PUF‐based broadcast authentication schemes for smart grid applicationsInternational Journal of Communication Systems10.1002/dac.393532:8Online publication date: 5-Mar-2019
      • (2016)Par-BF: A parallel partitioned Bloom filter for dynamic data setsThe International Journal of High Performance Computing Applications10.1177/109434201561845230:3(259-275)Online publication date: 28-Jul-2016
      • (2016)An approximate dynamic programming approach for improving accuracy of lossy data compression by Bloom filtersEuropean Journal of Operational Research10.1016/j.ejor.2016.01.042252:3(985-994)Online publication date: Aug-2016
      • (2015)Sampling Bloom Filter-Based Detection of Unknown RFID TagsIEEE Transactions on Communications10.1109/TCOMM.2015.240266063:4(1432-1442)Online publication date: Apr-2015
      • (2015)Completely Pinpointing the Missing RFID Tags in a Time-Efficient WayIEEE Transactions on Computers10.1109/TC.2013.19764:1(87-96)Online publication date: Jan-2015
      • (2015)One-hashing bloom filter2015 IEEE 23rd International Symposium on Quality of Service (IWQoS)10.1109/IWQoS.2015.7404748(289-298)Online publication date: Jun-2015
      • (2015)Leveraging traffic repetitions for high-speed deep packet inspection2015 IEEE Conference on Computer Communications (INFOCOM)10.1109/INFOCOM.2015.7218648(2578-2586)Online publication date: Apr-2015
      • (2015)An Insight Review on Bloom Filter and Its Variants with Applications: An Emerging Hash Based Membership Querying TechniqueJournal of Discrete Mathematical Sciences and Cryptography10.1080/09720529.2014.93212918:3(247-263)Online publication date: 2-Jun-2015
      • Show More Cited By

      View Options

      Login options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Figures

      Tables

      Media

      Share

      Share

      Share this Publication link

      Share on social media