Deep Active Learning for Text Classification

Published: 27 August 2018

ABSTRACT

In recent years, Active Learning (AL) has been applied successfully to text classification. However, traditional methods require researchers to attend carefully to feature extraction, and the choice of features strongly affects the final accuracy. In this paper, we propose a new method, Deep Active Learning (DAL), which uses a Recurrent Neural Network (RNN) as the acquisition function in Active Learning. With DAL there is no need to design features by hand, because the RNN uses its internal state to process input sequences directly. We show that DAL reaches accuracy on text classification that traditional Active Learning methods cannot, and that it reduces the large number of labeled instances that Deep Learning (DL) normally requires.
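The abstract gives no code; as a rough illustration of the loop it describes, the sketch below pairs a PyTorch LSTM classifier with least-confidence uncertainty sampling. The acquisition rule and all names here are our assumptions, since the abstract only states that an RNN serves as the acquisition model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RNNClassifier(nn.Module):
    """LSTM text classifier: token ids -> class logits."""
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_classes):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):                     # x: (batch, seq_len) token ids
        _, (h, _) = self.lstm(self.embed(x))  # h: (1, batch, hidden_dim)
        return self.fc(h[-1])                 # logits: (batch, num_classes)

def least_confidence(model, pool, k):
    """Return indices of the k pool instances the model is least sure about
    (one plausible acquisition rule; the paper may use a different one)."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(pool), dim=1)
    uncertainty = 1.0 - probs.max(dim=1).values
    return uncertainty.topk(k).indices

# Sketch of the active-learning loop (training and labeling omitted):
#   1. train the RNN on the current labeled set
#   2. idx = least_confidence(model, unlabeled_pool, k)
#   3. send unlabeled_pool[idx] to annotators, add them to the labeled set, repeat
```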

At the same time, we design a strategy for distributing labeling work among different workers. We show that with a properly chosen batch size we can save substantial time without reducing the model's accuracy. Building on this, we assign each worker a batch of instances whose size is determined by the worker's ability and the scale of the dataset, and we update that size as the worker's performance changes.
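The abstract does not spell out the allocation rule; one hedged reading is to size each worker's batch in proportion to an ability score and to update that score from the worker's observed labeling accuracy. Everything below (the Worker class, the proportional sizing, the update rate) is our illustration, not the paper's method.

```python
from dataclasses import dataclass

@dataclass
class Worker:
    name: str
    ability: float  # hypothetical skill score in (0, 1]

def assign_batches(instances, workers, base_batch):
    """Split unlabeled instances across workers, sizing each batch by
    ability (the sizing rule is illustrative; the paper only states that
    size depends on worker ability and dataset scale)."""
    total = sum(w.ability for w in workers)
    assignments, start = {}, 0
    for w in workers:
        size = min(round(base_batch * len(workers) * w.ability / total),
                   len(instances) - start)
        assignments[w.name] = instances[start:start + size]
        start += size
    return assignments

def update_ability(worker, accuracy, rate=0.3):
    """Move a worker's ability toward their observed labeling accuracy,
    so future batch sizes track performance."""
    worker.ability = (1 - rate) * worker.ability + rate * accuracy
```

For example, with two workers of ability 0.9 and 0.3 and base_batch = 50, the stronger worker would receive roughly 75 instances and the weaker one 25, given a large enough pool.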

Published in

ICVISP 2018: Proceedings of the 2nd International Conference on Vision, Image and Signal Processing
August 2018
402 pages
ISBN: 9781450365291
DOI: 10.1145/3271553

      Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

Overall Acceptance Rate: 186 of 424 submissions, 44%
