DOI: 10.1145/1772690.1772767
research-article

Cross-domain sentiment classification via spectral feature alignment

Published: 26 April 2010

Abstract

Sentiment classification aims to automatically predict the sentiment polarity (e.g., positive or negative) of user-published sentiment data (e.g., reviews, blogs). Although traditional classification algorithms can be used to train sentiment classifiers from manually labeled text data, the labeling work can be time-consuming and expensive. Meanwhile, users often use different words to express sentiment in different domains. If we directly apply a classifier trained in one domain to other domains, the performance will be very low due to the differences between these domains. In this work, we develop a general solution to sentiment classification when we do not have any labels in a target domain but have some labeled data in a different domain, regarded as the source domain. In this cross-domain sentiment classification setting, to bridge the gap between the domains, we propose a spectral feature alignment (SFA) algorithm to align domain-specific words from different domains into unified clusters, with the help of domain-independent words as a bridge. In this way, the clusters can be used to reduce the gap between the domain-specific words of the two domains, so that accurate sentiment classifiers can be trained for the target domain. Compared to previous approaches, SFA can discover a robust representation for cross-domain data by fully exploiting the relationship between the domain-specific and domain-independent words via simultaneously co-clustering them in a common latent space. We perform extensive experiments on two real-world datasets, and demonstrate that SFA significantly outperforms previous approaches to cross-domain sentiment classification.
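
The abstract describes the alignment step only at a high level. As a rough illustration of the idea, the sketch below embeds domain-specific words using the leading eigenvectors of a normalized bipartite graph that links them to domain-independent words, so that words sharing the same bridge words land close together. This is not the authors' implementation: the function names, the toy co-occurrence counts, the example words, the symmetric normalization, and the choice of k are all assumptions made for illustration.

```python
import numpy as np


def spectral_word_embedding(cooc, k):
    """Embed domain-specific words via a bipartite word graph.

    cooc: (n_specific, n_independent) non-negative co-occurrence counts
          (hypothetical input; how co-occurrence is counted is an assumption).
    k:    number of leading eigenvectors to keep.
    Returns an (n_specific, k) array, one row per domain-specific word.
    """
    cooc = np.asarray(cooc, dtype=float)
    n_spec, n_ind = cooc.shape
    # Symmetric affinity matrix of the bipartite graph: [[0, M], [M^T, 0]].
    affinity = np.zeros((n_spec + n_ind, n_spec + n_ind))
    affinity[:n_spec, n_spec:] = cooc
    affinity[n_spec:, :n_spec] = cooc.T
    # Symmetric normalization D^{-1/2} A D^{-1/2}; isolated nodes get weight 0.
    degree = affinity.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    inv_sqrt[degree > 0] = degree[degree > 0] ** -0.5
    normalized = inv_sqrt[:, None] * affinity * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the k largest.
    _, vecs = np.linalg.eigh(normalized)
    top = vecs[:, ::-1][:, :k]
    # Rows for the domain-specific words come first in the graph ordering.
    return top[:n_spec]


def cosine(a, b):
    """Cosine similarity between two embedding rows."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy data (invented): rows are domain-specific words from two domains,
# columns are the domain-independent bridge words "good" and "bad".
specific_words = ["compact", "hooked", "blurry", "boring"]
cooc = [[5, 0],   # "compact" co-occurs with "good"
        [4, 0],   # "hooked"  co-occurs with "good"
        [0, 6],   # "blurry"  co-occurs with "bad"
        [0, 3]]   # "boring"  co-occurs with "bad"

emb = spectral_word_embedding(cooc, k=2)
# Words that share a bridge word should point in the same direction.
print("compact vs hooked:", round(cosine(emb[0], emb[1]), 3))
print("compact vs blurry:", round(cosine(emb[0], emb[2]), 3))
```

In this toy graph the words attached to "good" and the words attached to "bad" form two disconnected groups, so the embedding separates them cleanly. Real co-occurrence data would be noisier, and a clustering or feature-augmentation step would typically follow.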


      Published In

      WWW '10: Proceedings of the 19th international conference on World wide web
      April 2010
      1407 pages
      ISBN:9781605587998
      DOI:10.1145/1772690

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. domain adaptation
      2. feature alignment
      3. opinion mining
      4. sentiment classification
      5. transfer learning

      Qualifiers

      • Research-article

      Conference

      WWW '10
      WWW '10: The 19th International World Wide Web Conference
      April 26 - 30, 2010
      Raleigh, North Carolina, USA

      Acceptance Rates

      Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

      Cited By

      • (2024) Few-Shot Methods for Aspect-Level Sentiment Analysis. Information 15:11 (664). DOI: 10.3390/info15110664. Online publication date: 22-Oct-2024
      • (2024) A Systematic Literature Review on Cross Domain Sentiment Analysis Techniques: PRISMA Approach. Annals of Emerging Technologies in Computing 8:4 (30-55). DOI: 10.33166/AETiC.2024.04.002. Online publication date: 1-Oct-2024
      • (2024) Transfer learning in robotics: An upcoming breakthrough? A review of promises and challenges. The International Journal of Robotics Research. DOI: 10.1177/02783649241273565. Online publication date: 13-Sep-2024
      • (2024) Efficient Unsupervised Domain Adaptation with PEFT Combinations. 2024 9th International Conference on Computer Science and Engineering (UBMK) (169-174). DOI: 10.1109/UBMK63289.2024.10773425. Online publication date: 26-Oct-2024
      • (2024) Cross-Domain Aspect-Based Sentiment Classification With Tripartite Graph Modeling. IEEE/ACM Transactions on Audio, Speech and Language Processing 32 (1623-1635). DOI: 10.1109/TASLP.2024.3365975. Online publication date: 14-Feb-2024
      • (2024) Renewing Iterative Self-Labeling Domain Adaptation With Application to the Spine Motion Prediction. IEEE Transactions on Automation Science and Engineering 21:3 (3543-3553). DOI: 10.1109/TASE.2023.3280900. Online publication date: Jul-2024
      • (2024) A Knowledge-Enhanced and Topic-Guided Domain Adaptation Model for Aspect-Based Sentiment Analysis. IEEE Transactions on Affective Computing 15:2 (709-721). DOI: 10.1109/TAFFC.2023.3292213. Online publication date: Apr-2024
      • (2024) Cross-Domain Knowledge Adaptation Using Iterative Scaling and Language Model. 2024 4th International Conference on Technological Advancements in Computational Sciences (ICTACS) (1592-1597). DOI: 10.1109/ICTACS62700.2024.10840895. Online publication date: 13-Nov-2024
      • (2024) Enabling Resource-Efficient AIoT System With Cross-Level Optimization: A Survey. IEEE Communications Surveys & Tutorials 26:1 (389-427). DOI: 10.1109/COMST.2023.3319952. Online publication date: Sep-2025
      • (2024) Cross-Domain Sentiment Classification with Mere Contrastive Learning and Improved Method. 2024 3rd International Conference on Artificial Intelligence and Computer Information Technology (AICIT) (1-10). DOI: 10.1109/AICIT62434.2024.10730527. Online publication date: 20-Sep-2024
