research-article

Crowdsourcing Annotations for Websites' Privacy Policies: Can It Really Work?

Published: 11 April 2016

ABSTRACT

Website privacy policies are often long and difficult to understand. While research shows that Internet users care about their privacy, they do not have time to understand the policies of every website they visit, and most users hardly ever read privacy policies. Several recent efforts aim to crowdsource the interpretation of privacy policies and use the resulting annotations to build more effective user interfaces that provide users with salient policy summaries. However, very little attention has been devoted to studying the accuracy and scalability of crowdsourced privacy policy annotations, the types of questions crowdworkers can effectively answer, and the ways in which their productivity can be enhanced. Prior research indicates that most Internet users have great difficulty understanding privacy policies, suggesting limits to the effectiveness of crowdsourcing approaches. In this paper, we assess the viability of crowdsourcing privacy policy annotations. Our results suggest that, if carefully deployed, crowdsourcing can indeed result in the generation of non-trivial annotations and can also help identify elements of ambiguity in policies. We further introduce and evaluate a method to improve the annotation process by predicting and highlighting paragraphs relevant to specific data practices.

Published in

WWW '16: Proceedings of the 25th International Conference on World Wide Web
April 2016
1482 pages
ISBN: 9781450341431

Copyright © 2016 held by the International World Wide Web Conference Committee (IW3C2)

Publisher

International World Wide Web Conferences Steering Committee
Republic and Canton of Geneva, Switzerland

Acceptance Rates

WWW '16 Paper Acceptance Rate: 115 of 727 submissions, 16%
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
