Research Article
DOI: 10.1145/2723372.2750550

iCrowd: An Adaptive Crowdsourcing Framework

Published: 27 May 2015

ABSTRACT

Crowdsourcing is widely accepted as a means of resolving tasks that machines are not good at. Unfortunately, crowdsourcing may yield relatively low-quality results if there is no proper quality control. Although previous studies attempt to eliminate "bad" workers by using qualification tests, the accuracies estimated from such qualifications can be unreliable, because workers have diverse accuracies across different tasks. The quality of the results could therefore be further improved by selectively assigning tasks to the workers who are well acquainted with them. To this end, we propose an adaptive crowdsourcing framework called iCrowd. iCrowd estimates a worker's accuracies on the fly by evaluating her performance on completed tasks, and predicts which tasks she is well acquainted with. When a worker requests a task, iCrowd assigns her a task for which she has the highest estimated accuracy among all online workers. Once a worker submits an answer to a task, iCrowd analyzes the answer and adjusts its estimates of her accuracies to improve subsequent task assignments. This paper studies two challenges that arise in iCrowd: how to estimate a worker's diverse accuracies from her completed tasks, and how to assign tasks instantly. We deploy iCrowd on Amazon Mechanical Turk and conduct extensive experiments on real datasets. Experimental results show that iCrowd achieves higher quality than existing approaches.
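
At a high level, the adaptive loop described above is: maintain per-task-type accuracy estimates for each worker, hand a requesting worker the pending task she is estimated to answer best, and refine the estimates once her answer is evaluated. The Python sketch below is only illustrative; the class names and the simple counting model are assumptions of this sketch, not iCrowd's actual estimation and assignment algorithms (which, per the abstract, also compare the requesting worker against all other online workers).

```python
from collections import defaultdict

# Illustrative sketch only: a simple counting model for per-(worker, domain)
# accuracy and a greedy assignment rule. Not the paper's actual algorithms.

class AccuracyEstimator:
    """Tracks how often each worker answers tasks of each domain correctly."""

    def __init__(self):
        # Start every (worker, domain) pair at 1 correct out of 2 answered,
        # i.e. an initial accuracy estimate of 0.5.
        self.correct = defaultdict(lambda: 1.0)
        self.answered = defaultdict(lambda: 2.0)

    def estimate(self, worker, domain):
        key = (worker, domain)
        return self.correct[key] / self.answered[key]

    def record_answer(self, worker, domain, was_correct):
        # Adjust the estimate once the worker's answer has been evaluated.
        key = (worker, domain)
        self.answered[key] += 1.0
        if was_correct:
            self.correct[key] += 1.0


def assign_task(estimator, worker, open_tasks):
    """Give the requesting worker the open (task_id, domain) pair on which
    her estimated accuracy is highest. This simplifies iCrowd's rule, which
    also takes the other online workers into account."""
    if not open_tasks:
        return None
    return max(open_tasks, key=lambda task: estimator.estimate(worker, task[1]))


# Usage example
est = AccuracyEstimator()
est.record_answer("w1", "sports", was_correct=True)    # sports estimate rises to 2/3
est.record_answer("w1", "movies", was_correct=False)   # movies estimate drops to 1/3
print(assign_task(est, "w1", [("t1", "sports"), ("t2", "movies")]))  # ('t1', 'sports')
```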

Published in

SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
May 2015, 2110 pages
ISBN: 9781450327589
DOI: 10.1145/2723372
Copyright © 2015 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

SIGMOD '15 paper acceptance rate: 106 of 415 submissions, 26%. Overall acceptance rate: 785 of 4,003 submissions, 20%.
