ABSTRACT
Website privacy policies are often long and difficult to understand. While research shows that Internet users care about their privacy, they do not have time to understand the policies of every website they visit, and most users hardly ever read privacy policies. Several recent efforts aim to crowdsource the interpretation of privacy policies and use the resulting annotations to build more effective user interfaces that provide users with salient policy summaries. However, very little attention has been devoted to studying the accuracy and scalability of crowdsourced privacy policy annotations, the types of questions crowdworkers can effectively answer, and the ways in which their productivity can be enhanced. Prior research indicates that most Internet users have great difficulty understanding privacy policies, which suggests limits to the effectiveness of crowdsourcing approaches. In this paper, we assess the viability of crowdsourcing privacy policy annotations. Our results suggest that, if carefully deployed, crowdsourcing can indeed produce non-trivial annotations and can also help identify elements of ambiguity in policies. We further introduce and evaluate a method to improve the annotation process by predicting and highlighting paragraphs relevant to specific data practices.
Index Terms
- Crowdsourcing Annotations for Websites' Privacy Policies: Can It Really Work?