ABSTRACT
Peer-to-peer data integration, also known as Peer Data Management Systems (PDMSs), promises to extend the classical data integration approach to Internet scale. Unfortunately, several challenges remain before this promise can be realized. One of the biggest is preserving the privacy of the exchanged data as it passes through several intermediate peers. Another is protecting the mappings used for data translation. A third is protecting privacy without being unfair to any of the peers. This paper presents a novel query answering protocol for PDMSs that addresses these challenges. The protocol employs a technique based on noise selection and insertion to protect the query results, and a commutative encryption-based technique to protect the mappings and ensure fairness among peers. An extensive security analysis shows that the protocol is resilient to several possible types of attacks. We implemented the protocol within an established PDMS, the Hyperion system, and conducted an experimental study using real data from the healthcare domain. The results show that our protocol achieves its privacy and fairness goals while keeping query processing time at an interactive level.
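The abstract does not say which commutative cipher the protocol uses, so the following is only a minimal sketch of the commutativity property itself, assuming a Pohlig-Hellman-style exponentiation cipher over a prime field. The modulus `P` and the helpers `keygen`, `encrypt`, and `decrypt` are hypothetical names chosen for illustration; they are not taken from the paper or from the Hyperion system, and the parameters are not vetted for real use.

```python
import math
import secrets

# Assumption: a toy prime modulus (2**127 - 1 is a Mersenne prime).
# Illustrative only; not a parameter taken from the paper.
P = 2**127 - 1


def keygen(p=P):
    """Pick a random exponent coprime to p - 1 so that it has an inverse."""
    while True:
        e = secrets.randbelow(p - 3) + 2  # 2 <= e <= p - 2
        if math.gcd(e, p - 1) == 1:
            return e


def encrypt(m, e, p=P):
    """Encrypt integer m (1 <= m < p) as m**e mod p."""
    return pow(m, e, p)


def decrypt(c, e, p=P):
    """Undo encrypt() by raising to the inverse of e modulo p - 1."""
    d = pow(e, -1, p - 1)  # modular inverse; requires Python 3.8+
    return pow(c, d, p)


if __name__ == "__main__":
    key_a, key_b = keygen(), keygen()  # keys held by two different peers
    value = 123456789                  # stand-in for an encoded mapping value

    # Order of encryption does not matter: (m^a)^b == (m^b)^a mod p.
    ab = encrypt(encrypt(value, key_a), key_b)
    ba = encrypt(encrypt(value, key_b), key_a)
    print(ab == ba)

    # Layers can also be removed in either order.
    print(decrypt(decrypt(ab, key_a), key_b) == value)
```

Because the layers commute, two peers can each encrypt a value with their own key and compare the doubly encrypted results for equality without either peer seeing the other's plaintext, which is the kind of property a mapping-protection scheme can build on.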
Index Terms
- Preserving privacy and fairness in peer-to-peer data integration