Categorization of computing education resources with utilization of crowdsourcing

Authors:
Yinlin Chen, Virginia Tech, Blacksburg, VA, USA
Paul Logasa Bogen, Oak Ridge National Laboratory, Oak Ridge, TN, USA
Haowei Hsieh, University of Iowa, Iowa City, IA, USA
Edward A. Fox, Virginia Tech, Blacksburg, VA, USA
Lillian N. Cassel, Villanova University, Villanova, PA, USA

JCDL '12: Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries, June 2012, Pages 121–124, https://doi.org/10.1145/2232817.2232840

Published: 10 June 2012

ABSTRACT

The Ensemble Portal harvests resources from multiple heterogeneous federated collections. Managing these dynamically growing collections requires an automatic mechanism to categorize records into corresponding topics. We propose an approach that uses existing ACM DL metadata to build classifiers for harvested resources in the Ensemble project. We also present our experience using the Amazon Mechanical Turk platform to build ground-truth training data sets from Ensemble collections.
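
As a rough illustration of the two ideas in the abstract, the sketch below trains a small text classifier on labeled metadata records and merges several crowd workers' labels into one ground-truth label by majority vote. It is not the authors' implementation: the abstract does not name a classification algorithm or a label-aggregation rule, so the multinomial Naive Bayes model, the majority vote, the topic labels, and the sample records are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def majority_label(worker_labels):
    """Return the most frequent label, or None if empty or tied for first."""
    counts = Counter(worker_labels).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # label -> number of records
        self.word_counts = defaultdict(Counter)  # label -> token frequencies
        self.vocab = set()

    def fit(self, records):
        """Train on an iterable of (metadata_text, label) pairs."""
        for text, label in records:
            self.doc_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        """Return the label with the highest log-posterior score."""
        total_docs = sum(self.doc_counts.values())
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training records standing in for already-categorized metadata.
TRAINING_RECORDS = [
    ("Indexing and retrieval of documents in digital libraries", "information retrieval"),
    ("Query ranking for web search engines", "information retrieval"),
    ("Sorting algorithms and complexity analysis", "algorithms"),
    ("Graph algorithms and approximation complexity bounds", "algorithms"),
]

model = NaiveBayes().fit(TRAINING_RECORDS)

# Hypothetical harvested record and hypothetical crowd-worker judgments.
print(model.predict("Lecture notes on sorting algorithms"))
print(majority_label(["algorithms", "algorithms", "information retrieval"]))
```

Returning None on a tie is one possible design choice: such records could be sent back for additional judgments instead of being given an arbitrary label.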

Published in
JCDL '12: Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries
June 2012
458 pages
ISBN: 9781450311540
DOI: 10.1145/2232817
General Chairs:
Karim B. Boughida, The George Washington University, USA
Barrie Howard, The Library of Congress, USA
Program Chairs:
Michael L. Nelson, Old Dominion University, USA
Herbert Van de Sompel, Los Alamos National Laboratory, USA
Ingeborg Sølvberg, Norwegian University of Science & Technology, Norway
Copyright © 2012 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
amazon mechanical turk
classification
digital libraries
machine learning
Qualifiers
- research-article
Conference

Acceptance Rates
Overall Acceptance Rate: 415 of 1,482 submissions, 28%