ABSTRACT
Privacy concerns relating to sharing network traces have traditionally been handled via sanitization, which includes removal of sensitive data and IP address anonymization. We argue that sanitization is a poor solution for data sharing: it offers insufficient research utility to users and poor privacy guarantees to data providers.
We claim that a better balance in the utility/privacy trade-off inherent to network data sharing can be achieved via a new paradigm we propose: secure queries. In this paradigm, a data owner publishes a query language and an online portal, allowing researchers to submit sets of queries to be run on the data. Only certain operations are allowed on certain data fields, and only in specific contexts. These restrictions are specified in the provider's privacy policy and enforced by the language's interpreter. Query results returned to researchers consist of aggregate information, such as counts, histograms, and distributions, and not of individual packets. We discuss why secure queries provide higher privacy guarantees and higher research utility than sanitization, and present a design of the secure query language and a privacy policy.
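To make the paradigm concrete, the following is a minimal sketch of policy-enforced aggregate queries, not the language or policy format designed in the paper. All names here (`POLICY`, `run_query`, `PolicyViolation`, the field names, and the toy `TRACE`) are hypothetical and chosen only for illustration; the sketch also omits the context-dependent restrictions the abstract mentions.

```python
from collections import Counter

# Hypothetical privacy policy: the operations permitted on each field.
# A field that is absent (here, "src_ip") cannot be queried at all.
POLICY = {
    "dst_port": {"count", "histogram"},
    "pkt_len": {"count", "histogram", "mean"},
}


class PolicyViolation(Exception):
    """Raised when a query requests an operation the policy does not allow."""


def run_query(packets, field, op, policy=POLICY):
    """Evaluate one aggregate query, refusing anything the policy forbids."""
    if op not in policy.get(field, set()):
        raise PolicyViolation(f"{op} on {field} is not permitted")
    values = [p[field] for p in packets]
    if op == "count":
        return len(values)
    if op == "histogram":
        return dict(Counter(values))
    if op == "mean":
        # Return None for an empty trace instead of dividing by zero.
        return sum(values) / len(values) if values else None
    raise PolicyViolation(f"unknown operation {op}")


# Toy trace standing in for the provider's private data (assumed layout).
TRACE = [
    {"src_ip": "10.0.0.1", "dst_port": 80, "pkt_len": 60},
    {"src_ip": "10.0.0.2", "dst_port": 443, "pkt_len": 1500},
    {"src_ip": "10.0.0.1", "dst_port": 80, "pkt_len": 40},
]

if __name__ == "__main__":
    print(run_query(TRACE, "dst_port", "histogram"))
    try:
        run_query(TRACE, "src_ip", "histogram")
    except PolicyViolation as err:
        print("rejected:", err)
```

In this sketch the researcher only ever receives aggregates; the packets themselves stay with the provider, and a query on a field the policy does not list is rejected before any data is read.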
Index Terms
- Privacy-safe network trace sharing via secure queries