ABSTRACT
With data becoming available in larger quantities and at higher rates, new data processing paradigms have been proposed to handle high-volume, fast-moving data. Data Stream Processing is one such paradigm, wherein transient data streams flow through sets of continuous queries that return results only when the data is of interest to the querier. To avoid the large costs of maintaining the infrastructure required to process these data streams, many companies outsource their computation to third-party cloud services. This outsourcing, however, can lead to private data being accessed by parties that a data provider may not trust. The literature offers solutions to this confidentiality and access control problem, but they fall short of a complete solution because of either immense overheads or the trust they require in these third-party services.
To address these issues, we have developed PolyStream, an enhancement to existing data stream management systems that enables data providers to specify attribute-based access control policies that are cryptographically enforced, while still allowing many types of in-network data processing. We detail the access control models and mechanisms used by PolyStream, and describe a novel use of security punctuations that enables flexible, online policy management and key distribution. We also describe how queries are submitted and executed using an unmodified Data Stream Management System, and show through an extensive evaluation that PolyStream yields a 550x performance gain versus StreamForce, the state-of-the-art system presented at CODASPY 2014, while providing greater functionality to the querier.
- D. Abadi et al. The design of the borealis stream processing engine. In CIDR, 2005.
- D. J. Abadi, D. Carney, U. Çetintemel, M. Cherniack, C. Convey, S. Lee, M. Stonebraker, N. Tatbul, and S. Zdonik. Aurora: a new model and architecture for data stream management. The VLDB Journal--The International Journal on Very Large Data Bases, 12(2):120--139, 2003.
- R. Adaikkalavan and T. Perez. Secure shared continuous query processing. In ACM SAC, pages 1000--1005, 2011.
- T. Akidau, A. Balikov, K. Bekirouglu, S. Chernyak, J. Haberman, R. Lax, S. McVeety, D. Mills, P. Nordstrom, and S. Whittle. Millwheel: fault-tolerant stream processing at internet scale. Proceedings of the VLDB Endowment, 6(11):1033--1044, 2013.
- D. T. T. Anh and A. Datta. Streamforce: outsourcing access control enforcement for stream data to the clouds. In Proceedings of the 4th ACM conference on Data and application security and privacy, pages 13--24, 2014.
- L. Aniello, R. Baldoni, and L. Querzoni. Adaptive online scheduling in storm. In Proceedings of the 7th ACM DEBS, pages 207--218. ACM, 2013.
- A. Arasu, S. Babu, and J. Widom. The cql continuous query language: semantic foundations and query execution. The VLDB Journal--The International Journal on Very Large Data Bases, 15(2):121--142, 2006.
- A. Arasu, M. Cherniack, E. Galvez, D. Maier, A. S. Maskey, E. Ryvkina, M. Stonebraker, and R. Tibbetts. Linear road: a stream data management benchmark. In Proceedings of the Thirtieth international conference on Very large data bases-Volume 30, pages 480--491. VLDB Endowment, 2004.
- S. Babu and J. Widom. Continuous queries over data streams. ACM Sigmod Record, 30(3):109--120, 2001.
- J. Bethencourt, A. Sahai, and B. Waters. Advanced crypto software collection: Ciphertext-policy attribute-based encryption. 2011.
- A. Boldyreva, N. Chenette, Y. Lee, and A. O'Neill. Order-preserving symmetric encryption. In Eurocrypt, pages 224--241. Springer, 2009.
- A. Boldyreva, N. Chenette, and A. O'Neill. Order-preserving encryption revisited: Improved security analysis and alternative solutions. In Advances in Cryptology--CRYPTO 2011, pages 578--595. Springer, 2011.
- B. Carminati, E. Ferrari, J. Cao, and K. L. Tan. A framework to enforce access control over data streams. ACM Transactions on Information and System Security (TISSEC), 13(3):28, 2010.
- B. Carminati, E. Ferrari, and K. L. Tan. Enforcing access control over data streams. In Proceedings of the 12th ACM symposium on Access control models and technologies, pages 21--30, 2007.
- B. Carminati, E. Ferrari, and K. L. Tan. Specifying access control policies on data streams. In Advances in Databases: Concepts, Systems and Applications, pages 410--421. Springer, 2007.
- V. Goyal, O. Pandey, A. Sahai, and B. Waters. Attribute-based encryption for fine-grained access control of encrypted data. In Proceedings of the 13th ACM conference on Computer and communications security, pages 89--98, 2006.
- M. Green, S. Hohenberger, and B. Waters. Outsourcing the decryption of abe ciphertexts. In USENIX Security Symposium, 2011.
- S. Halevi and P. Rogaway. A tweakable enciphering mode. In CRYPTO 2003, pages 482--499. Springer, 2003.
- J. Hur and D. K. Noh. Attribute-based access control with efficient revocation in data outsourcing systems. Parallel and Distributed Systems, IEEE Transactions on, 22(7):1214--1221, 2011.
- H. V. Jagadish et al. Big data and its technical challenges. Communications of the ACM, 57(7):86--94, Jul 2014.
- S. Jahid, P. Mittal, and N. Borisov. Easier: Encryption-based access control in social networks with efficient revocation. In Proceedings of the 6th ACM Symposium on Information, Computer and Communications Security, pages 411--415. ACM, 2011.
- X. Jin, R. Krishnan, and R. Sandhu. A unified attribute-based access control model covering dac, mac and rbac. In Data and applications security and privacy XXVI, pages 41--55. Springer, 2012.
- S. Kulkarni, N. Bhagat, M. Fu, V. Kedigehalli, C. Kellogg, S. Mittal, J. M. Patel, K. Ramasamy, and S. Taneja. Twitter heron: Stream processing at scale. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pages 239--250. ACM, 2015.
- W. Lindner and J. Meier. Securing the borealis data stream engine. In Database Engineering and Applications Symposium, 2006. IDEAS'06. 10th International, pages 137--147. IEEE, 2006.
- R. Nehme, E. A. Rundensteiner, and E. Bertino. A security punctuation framework for enforcing access control on streaming data. In IEEE 24th International Conference on Data Engineering (ICDE), pages 406--415, 2008.
- R. V. Nehme, H.-S. Lim, and E. Bertino. Fence: Continuous access control enforcement in dynamic data stream environments. In Proceedings of the third ACM conference on Data and application security and privacy, pages 243--254, 2013.
- W. S. Ng, H. Wu, W. Wu, S. Xiang, and K.-L. Tan. Privacy preservation in streaming data collection. In Proceedings of the 2012 IEEE 18th International Conference on Parallel and Distributed Systems, pages 810--815, 2012.
- P. Paillier. Public-key cryptosystems based on composite degree residuosity classes. In Proc. of Eurocrypt, pages 223--238, 1999.
- R. A. Popa, C. Redfield, N. Zeldovich, and H. Balakrishnan. Cryptdb: protecting confidentiality with encrypted query processing. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles, pages 85--100, 2011.
- StormProject. Storm: Distributed and fault-tolerant realtime computation. http://storm.incubator.apache.org/documentation/Home.html, 2014.
- N. Tatbul, U. Çetintemel, S. Zdonik, M. Cherniack, and M. Stonebraker. Load shedding in a data stream manager. In Proceedings of the 29th international conference on Very large data bases-Volume 29, pages 309--320, 2003.
- S. Tu, M. F. Kaashoek, S. Madden, and N. Zeldovich. Processing analytical queries over encrypted data. In Proceedings of the 39th international conference on Very Large Data Bases, pages 289--300, 2013.
- B. Wang, M. Li, S. S. Chow, and H. Li. A tale of two clouds: Computing on data encrypted under multiple keys. In Communications and Network Security (CNS), 2014 IEEE Conference on, pages 337--345. IEEE, 2014.
- S. Yu, C. Wang, K. Ren, and W. Lou. Attribute based data sharing with attribute revocation. In Proceedings of the 5th ACM Symposium on Information, Computer and Communications Security, pages 261--270. ACM, 2010.
- M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica. Spark: cluster computing with working sets. In Proceedings of the 2nd USENIX conference on Hot topics in cloud computing, volume 10, page 10, 2010.
Index Terms
- PolyStream: Cryptographically Enforced Access Controls for Outsourced Data Stream Processing