ABSTRACT
A crowdsourcing system, such as the Amazon Mechanical Turk (AMT), provides a platform for a large number of questions to be answered by Internet workers. Such systems have been shown to be useful for solving problems that are difficult for computers, including entity resolution, sentiment analysis, and image recognition. In this paper, we investigate the online task assignment problem: given a pool of n questions, which k questions should be assigned to a worker? A poor assignment may not only waste time and money, but may also hurt the quality of a crowdsourcing application that depends on the workers' answers. We propose to consider quality measures (also known as evaluation metrics) that are relevant to an application during the task assignment process. In particular, we explore how Accuracy and F-score, two widely used evaluation metrics for crowdsourcing applications, can facilitate task assignment. Since these two metrics assume that the ground truth of a question is known, we study variants of them that make use of the probability distributions derived from workers' answers. We further investigate online assignment strategies, which enable optimal task assignments. Since these algorithms are expensive, we propose solutions that attain high quality in linear time. We develop a system called the Quality-Aware Task Assignment System for Crowdsourcing Applications (QASCA) on top of AMT. We evaluate our approaches on five real crowdsourcing applications. We find that QASCA is efficient and attains better result quality (an improvement of more than 8%) than existing methods.
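The abstract mentions metric variants that replace unknown ground truth with probability distributions derived from workers' answers. The sketch below is a minimal illustration of that idea for Accuracy (it is not the paper's exact formulation): for each question, pick the label with the highest derived probability, and estimate the expected Accuracy as the average probability mass of the chosen labels. The `expected_accuracy` function and the example distributions are hypothetical.

```python
# Hedged sketch: estimating expected Accuracy from per-question label
# distributions (e.g., derived from workers' answers via majority vote or EM).
# This is an illustration of the idea, not the paper's actual algorithm.

def expected_accuracy(distributions):
    """For each question, choose the label with the highest probability;
    the expected Accuracy is the mean probability of the chosen labels."""
    picks = [max(d, key=d.get) for d in distributions]
    exp_acc = sum(d[p] for d, p in zip(distributions, picks)) / len(distributions)
    return picks, exp_acc

# Hypothetical distributions for three yes/no questions, e.g. the first
# one reflects 9 of 10 workers answering "yes".
dists = [
    {"yes": 0.9, "no": 0.1},
    {"yes": 0.4, "no": 0.6},
    {"yes": 0.7, "no": 0.3},
]
labels, acc = expected_accuracy(dists)
print(labels, round(acc, 4))  # ['yes', 'no', 'yes'] 0.7333
```

Assigning a worker the questions whose current distributions are most uncertain (e.g., probabilities near 0.5) is one intuitive way such an expected-quality measure can guide task assignment.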