DOI: 10.1145/3159652.3159703
research-article

Dynamic Word Embeddings for Evolving Semantic Discovery

Published: 02 February 2018

ABSTRACT

Word evolution refers to the changing meanings and associations of words throughout time, as a byproduct of human language evolution. By studying word evolution, we can infer social trends and language constructs over different periods of human history. However, traditional techniques such as word representation learning do not adequately capture the evolving language structure and vocabulary. In this paper, we develop a dynamic statistical model to learn time-aware word vector representation. We propose a model that simultaneously learns time-aware embeddings and solves the resulting alignment problem. This model is trained on a crawled NYTimes dataset. Additionally, we develop multiple intuitive evaluation strategies of temporal word embeddings. Our qualitative and quantitative tests indicate that our method not only reliably captures this evolution over time, but also consistently outperforms state-of-the-art temporal embedding approaches on both semantic accuracy and alignment quality.
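The abstract does not spell out the objective, so the following is only a minimal sketch of one way to learn time-aware embeddings and keep them aligned in a single optimization. It assumes each time slice is summarized by a symmetric word-by-word matrix (for example a PMI-style co-occurrence matrix), factorizes every slice as `U[t] @ U[t].T`, and adds a penalty that keeps the embeddings of adjacent slices close, which is what ties the slices to a shared coordinate system. The function names, the hyperparameters `lam`, `tau`, `lr`, and `steps`, and the use of plain gradient descent are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def objective(Y, U, lam, tau):
    """Reconstruction error + ridge penalty + temporal smoothness penalty."""
    fit = sum(np.linalg.norm(Y[t] - U[t] @ U[t].T) ** 2 for t in range(len(Y)))
    ridge = lam * np.sum(U ** 2)
    smooth = tau * np.sum((U[1:] - U[:-1]) ** 2)
    return fit + ridge + smooth


def fit_dynamic_embeddings(Y, dim=2, lam=0.1, tau=1.0, lr=0.005, steps=1000, seed=0):
    """Jointly fit one embedding matrix per time slice (hypothetical helper).

    Y is a list of T symmetric (V x V) matrices, one per time slice.
    Returns U with shape (T, V, dim) and the objective value at each step.
    """
    rng = np.random.default_rng(seed)
    T, V = len(Y), Y[0].shape[0]
    U = 0.1 * rng.standard_normal((T, V, dim))
    history = []
    for _ in range(steps):
        history.append(objective(Y, U, lam, tau))
        grad = np.zeros_like(U)
        for t in range(T):
            residual = U[t] @ U[t].T - Y[t]
            # Gradient of ||Y - U U^T||_F^2 for symmetric Y, plus the ridge term.
            grad[t] = 4.0 * residual @ U[t] + 2.0 * lam * U[t]
            # Smoothness terms couple slice t to its temporal neighbors.
            if t > 0:
                grad[t] += 2.0 * tau * (U[t] - U[t - 1])
            if t < T - 1:
                grad[t] += 2.0 * tau * (U[t] - U[t + 1])
        U = U - lr * grad
    history.append(objective(Y, U, lam, tau))
    return U, history


if __name__ == "__main__":
    # Synthetic data: a slowly drifting low-rank matrix per time slice.
    rng = np.random.default_rng(1)
    A = 0.5 * rng.standard_normal((8, 2))
    slices = []
    for _ in range(4):
        A = A + 0.05 * rng.standard_normal(A.shape)
        slices.append(A @ A.T)
    U, history = fit_dynamic_embeddings(slices, dim=2)
    print("initial objective:", history[0])
    print("final objective:", history[-1])
```

Because the smoothness term enters the same objective as the per-slice fit, no separate post-hoc rotation step is needed in this sketch; setting `tau=0` would decouple the slices and leave each one free to settle in its own arbitrary rotation.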

    Published in
      WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining
      February 2018
      821 pages
      ISBN:9781450355810
      DOI:10.1145/3159652

      Copyright © 2018 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      WSDM '18 Paper Acceptance Rate: 81 of 514 submissions, 16%. Overall Acceptance Rate: 498 of 2,863 submissions, 17%.
