skip to main content
10.1145/3064176.3064184acmconferencesArticle/Chapter ViewAbstractPublication PageseurosysConference Proceedingsconference-collections
research-article
Open Access

MiniCrypt: Reconciling Encryption and Compression for Big Data Stores

Published:23 April 2017Publication History

ABSTRACT

We propose MiniCrypt, the first key-value store that reconciles encryption and compression without compromising performance. At the core of MiniCrypt is an observation on data compressibility trends in key-value stores, which enables grouping key-value pairs into small key packs, together with a set of distributed systems techniques for retrieving, updating, merging and splitting encrypted packs. Our evaluation shows that MiniCrypt compresses data by as much as 4 times with respect to the vanilla key-value store, and can increase the server's throughput by up to two orders of magnitude by fitting more data in main memory.

References

  1. Conviva. http://www.conviva.com/.Google ScholarGoogle Scholar
  2. Memcached. http://www.memcached.Google ScholarGoogle Scholar
  3. Mongodb. http://www.mongodb.org.Google ScholarGoogle Scholar
  4. Redis. http://www.redis.io.Google ScholarGoogle Scholar
  5. D. J. Abadi, S. R. Madden, and M. Ferreira. Integrating Compression and Execution in Column-Oriented Database Systems. In ACM International Conference on Management of Data SIGMOD, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. R. Agarwal, A. Khandelwal, and I. Stoica. Succinct: Enabling queries on compressed data. In Proceedings of the 12th Symposium on Networked Systems Design and Implementation NSDI, Oakland, CA, May 2015.Google ScholarGoogle Scholar
  7. A. Arasu, S. Blanas, K. Eguro, M. Joglekar, R. Kaushik, D. Kossmann, R. Ramamurthy, P. Upadhyaya, and R. Venkatesan. Secure database-as-a-service with cipherbase. In Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data, pages 1033--1036. ACM, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. C. Binnig, S. Hildenbrand, and F. Farber. Dictionary-based order-preserving string compression for main memory column stores. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of data, pages 283--296. ACM, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. M. Blaze. A cryptographic file system for unix. In 1st ACM Conference on Communications and Computing Security, pages 9--16, Nov. 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. A. Boldyreva, N. Chenette, Y. Lee, and A. O'Neill. Order-preserving symmetric encryption. In Proceedings of the 28th Annual International Conference on the Theory and Applications of Cryptographic Techniques Eurocrypt, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. K. Challapalli. The internet of things: A time series data challenge. In IBM Big data and Analytics Hub, 2014.Google ScholarGoogle Scholar
  12. F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. Bigtable: A distributed storage system for structured data. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation OSDI, Seattle, WA, Nov. 2006.Google ScholarGoogle Scholar
  13. P. R. Clearinghouse. Chronology of data breaches. http://www.privacyrights.org/data-breach.Google ScholarGoogle Scholar
  14. J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. uinlan, R. Rao, L. Rolig, Y. Saito, M. Szymaniak, C. Taylor, R. Wang, and D. Woodford. Spanner: Googles globally-distributed database. In Proceedings of the 10th Symposium on Operating Systems Design and Implementation OSDI, Hollywood, CA, Oct. 2012.Google ScholarGoogle Scholar
  15. Daniel J. Abadi and Samuel R. Madden and Nabil Hachem. Column-Stores vs. Row-Stores: How Different Are They Really? In ACM International Conference on Management of Data SIGMOD, 2008.Google ScholarGoogle Scholar
  16. G. Danti. Linux compressors comparison on CentOS 6.5 x86-64: lzo vs lz4 vs gzip vs bzip2 vs lzma. In Hardware and software benchmark and analysis, 2014.Google ScholarGoogle Scholar
  17. G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, and W. Vogels. Dynamo: Amazons highly available key-value store. In Proceedings of the 21st ACM Symposium on Operating Systems Principles SOSP, Stevenson, WA, Oct. 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. J. Ellis. Lightweight transactions in cassandra 2.0. http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0.Google ScholarGoogle Scholar
  19. E.-J. Goh, H. Shacham, N. Modadugu, and D. Boneh. SiRiUS: Securing Remote Untrusted Storage. In Proceedings of the Tenth Network and Distributed System Security NDSS Symposium, pages 131--145. Internet Society (ISOC), Feb. 2003.Google ScholarGoogle Scholar
  20. O. Goldreich. Modern cryptography, probabilistic proofs and pseudorandomness, volume 1. Springer Science & Business Media, 1998.Google ScholarGoogle Scholar
  21. A. L. Holloway, V. Raman, G. Swart, and D. J. DeWitt. How to barter bits for chronons: compression and bandwidth trade offs for database scans. In Proceedings of the 2007 ACM SIGMOD international conference on Management of data, pages 389--400. ACM, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. M. Kallahalla, E. Riedel, R. Swaminathan. Wang, and K. Fu. Plutus: Scalable secure file sharing on untrusted storage. In 2nd USENI conference on File and Storage Technologies FAST '03, San Francisco, CA, Apr. 2003.Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. A. Khandelwal, R. Agarwal, and I. Stoica. Blowfish: Dynamic storage-performance tradeoff in data stores. In Proceedings of the USENI Symposium on Networked Systems Design and Implementation NSDI, volume 60, 2016.Google ScholarGoogle Scholar
  24. L. King. Why compression is a must in the big data era. In IBM Big Data and Analytics Hub, 2013.Google ScholarGoogle Scholar
  25. A. Lakshman and P. Malik. Cassandra: A decentralized structured storage system. In ACM SIGOPS Operating Systems Review, 44 2: 3540, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. A. Lamb, M. Fuller, R. Varadarajan, N. Tran, B. Vandiver, L. Doshi, and C. Bear. The Vertica Analytic Database: C-store 7 Years Later. Proceedings of the VLDB Endowment, 5(12):1790--1801, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. B. Lorica. The re-emergence of time-series. In O'Reilly Radar, 2013.Google ScholarGoogle Scholar
  28. D. Mazières and D. Shasha. Building secure file systems out of byzantine storage. Technical Report TR2002-826, NYU Department of Computer Science, May 2002.Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. D. L. Mills. Internet time synchronization: the network time protocol. IEEE Transactions on communications, 39(10):1482--1493, 1991. Google ScholarGoogle ScholarCross RefCross Ref
  30. J. Podesta. Findings of the big data and privacy working group review. In The White House Blog, 2014.Google ScholarGoogle Scholar
  31. R. A. Popa, F. H. Li, and N. Zeldovich. An ideal-security protocol for order-preserving encoding. In Proceedings of the 34th IEEE Symposium on Security and Privacy IEEE S P, pages 463--477, San Francisco, CA, May 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. R. A. Popa, C. M. S. Redfield, N. Zeldovich, and H. Balakrishnan. CryptDB: Protecting confidentiality with encrypted query processing. In Proceedings of the 23rd ACM Symposium on Operating Systems Principles SOSP, pages 85--100, Cascais, Portugal, Oct. 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. W. Zheng, A. Dave, J. G. Beekman, R. A. Popa, J. E. Gonzalez, and I. Stoica. Opaque: An oblivious and encrypted distributed analytics platform. In Proceedings of the 14th Symposium on Networked Systems Design and Implementation NSDI, Boston, MA, 2017.Google ScholarGoogle Scholar

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in
  • Published in

    cover image ACM Conferences
    EuroSys '17: Proceedings of the Twelfth European Conference on Computer Systems
    April 2017
    648 pages
    ISBN:9781450349383
    DOI:10.1145/3064176

    Copyright © 2017 Owner/Author

    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 23 April 2017

    Permissions

    Request permissions about this article.

    Request Permissions

    Check for updates

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate241of1,308submissions,18%

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader