Research Article

Using Worker Self-Assessments for Competence-Based Pre-Selection in Crowdsourcing Microtasks

Published: 23 August 2017

Abstract

Paid crowdsourcing platforms have evolved into remarkable marketplaces where requesters can tap into human intelligence for a multitude of purposes, and the workforce can benefit through monetary returns for investing their efforts. In this work, we focus on individual crowd worker competencies. Drawing from self-assessment theories in psychology, we show that crowd workers often lack awareness of their true level of competence. As a result, although workers intend to maintain a high reputation, they tend to participate in tasks that are beyond their competence. We reveal the diversity of individual worker competencies and make a case for competence-based pre-selection in crowdsourcing marketplaces. We show the implications of flawed self-assessments for real-world microtasks, and propose a novel worker pre-selection method that considers the accuracy of worker self-assessments. We evaluated our method in a sentiment analysis task and observed an improvement in accuracy of over 15% compared to traditional performance-based worker pre-selection. Similarly, our proposed method resulted in an improvement in accuracy of nearly 6% in an image validation task. Our results show that requesters in crowdsourcing platforms can benefit from considering worker self-assessments in addition to performance when pre-selecting workers.
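
To make the proposal concrete, the sketch below illustrates one way a requester could combine measured performance with self-assessment accuracy during pre-selection. This is only a minimal sketch, not the authors' exact formulation: the Worker record, the preselection_score and preselect functions, the equal weighting, and the 0.85 threshold are all illustrative assumptions.

```python
# Minimal sketch (assumed scoring scheme, not the authors' exact method):
# rank workers by blending measured accuracy on screening questions with
# how closely their self-assessment matches that measured accuracy.
from dataclasses import dataclass
from typing import List


@dataclass
class Worker:
    worker_id: str
    measured_accuracy: float       # fraction correct on screening/gold questions, in [0, 1]
    self_assessed_accuracy: float  # worker's own estimate of that fraction, in [0, 1]


def preselection_score(w: Worker, alpha: float = 0.5) -> float:
    """Blend measured performance with self-assessment accuracy."""
    # Self-assessment accuracy: 1 minus the gap between estimate and reality.
    self_assessment_accuracy = 1.0 - abs(w.self_assessed_accuracy - w.measured_accuracy)
    return alpha * w.measured_accuracy + (1.0 - alpha) * self_assessment_accuracy


def preselect(workers: List[Worker], threshold: float = 0.85, alpha: float = 0.5) -> List[Worker]:
    """Keep only workers whose blended score clears an (assumed) threshold."""
    return [w for w in workers if preselection_score(w, alpha) >= threshold]


if __name__ == "__main__":
    pool = [
        Worker("w1", measured_accuracy=0.80, self_assessed_accuracy=0.82),  # competent, well calibrated
        Worker("w2", measured_accuracy=0.80, self_assessed_accuracy=0.99),  # competent but overconfident
        Worker("w3", measured_accuracy=0.55, self_assessed_accuracy=0.95),  # unskilled and unaware
    ]
    for w in preselect(pool):
        print(w.worker_id, round(preselection_score(w), 3))  # prints: w1 0.89
```

With these assumed numbers, a purely performance-based filter at 0.75 accuracy would admit both the well-calibrated and the overconfident worker, whereas the blended score admits only the worker whose self-assessment matches their measured accuracy.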

      Published In

      ACM Transactions on Computer-Human Interaction, Volume 24, Issue 4
      August 2017
      182 pages
      ISSN:1073-0516
      EISSN:1557-7325
      DOI:10.1145/3132166

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 23 August 2017
      Accepted: 01 June 2017
      Revised: 01 April 2017
      Received: 01 December 2016
      Published in TOCHI Volume 24, Issue 4

      Author Tags

      1. Crowdsourcing
      2. microtasks
      3. performance
      4. pre-screening
      5. pre-selection
      6. self-assessment
      7. worker behaviour

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • European Commission within the H2020-ICT-2015 Programme (AFEL project)
