research-article

Dynamic Word Embeddings for Evolving Semantic Discovery

Authors:
Zijun Yao, Rutgers University, Newark, NJ, USA
Yifan Sun, Technicolor Research, Los Altos, CA, USA
Weicong Ding, Amazon, Seattle, WA, USA
Nikhil Rao, Amazon, Seattle, WA, USA
Hui Xiong, Rutgers University, Newark, NJ, USA

WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, February 2018, Pages 673–681
https://doi.org/10.1145/3159652.3159703

Published: 02 February 2018

ABSTRACT

Word evolution refers to the changing meanings and associations of words over time, a byproduct of the evolution of human language. By studying word evolution, we can infer social trends and language constructs across different periods of human history. However, traditional techniques such as word representation learning do not adequately capture the evolving language structure and vocabulary. In this paper, we develop a dynamic statistical model that learns time-aware word vector representations. The model simultaneously learns time-aware embeddings and solves the resulting alignment problem, and it is trained on a crawled NYTimes dataset. We also develop multiple intuitive strategies for evaluating temporal word embeddings. Our qualitative and quantitative tests indicate that our method not only reliably captures this evolution over time, but also consistently outperforms state-of-the-art temporal embedding approaches on both semantic accuracy and alignment quality.
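To make the idea of learning embeddings and their alignment at the same time more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the objective, so everything here is an assumption: each time slice t has a symmetric word-by-word matrix Y[t] (standing in for a co-occurrence statistic), each slice gets an embedding matrix U[t] fitted so that U[t] U[t]^T approximates Y[t], and a penalty weighted by tau pulls U[t] towards U[t-1] so that consecutive slices share one coordinate system. The function names, the toy data, the parameters lam and tau, and the gradient-descent solver with backtracking are all illustrative choices.

```python
import numpy as np


def joint_loss(Y, U, lam, tau):
    """Assumed objective: reconstruction + ridge + temporal smoothness."""
    loss = 0.0
    for t in range(len(Y)):
        loss += 0.5 * np.sum((Y[t] - U[t] @ U[t].T) ** 2)
        loss += 0.5 * lam * np.sum(U[t] ** 2)
        if t > 0:
            # Ties slice t to slice t-1; this is what keeps slices aligned.
            loss += 0.5 * tau * np.sum((U[t] - U[t - 1]) ** 2)
    return loss


def joint_grad(Y, U, lam, tau):
    """Gradient of joint_loss w.r.t. every U[t] (Y[t] must be symmetric)."""
    T = len(Y)
    grads = []
    for t in range(T):
        g = 2.0 * (U[t] @ U[t].T - Y[t]) @ U[t] + lam * U[t]
        if t > 0:
            g += tau * (U[t] - U[t - 1])
        if t < T - 1:
            g += tau * (U[t] - U[t + 1])
        grads.append(g)
    return grads


def fit(Y, dim, lam=0.1, tau=1.0, iters=200, seed=0):
    """Gradient descent with backtracking; a step is kept only if loss drops."""
    rng = np.random.default_rng(seed)
    U = [0.1 * rng.standard_normal((Y[0].shape[0], dim)) for _ in Y]
    history = [joint_loss(Y, U, lam, tau)]
    step = 1e-2
    for _ in range(iters):
        G = joint_grad(Y, U, lam, tau)
        while step > 1e-12:
            cand = [u - step * g for u, g in zip(U, G)]
            cand_loss = joint_loss(Y, cand, lam, tau)
            if cand_loss < history[-1]:
                U = cand
                history.append(cand_loss)
                step *= 2.0  # try a bolder step next iteration
                break
            step *= 0.5  # step too large, shrink and retry
        else:
            break  # no improving step found; stop early
    return U, history


def make_toy_slices(n_words=20, dim=3, n_slices=4, seed=1):
    """Hypothetical data: symmetric matrices from slowly drifting factors."""
    rng = np.random.default_rng(seed)
    base = rng.standard_normal((n_words, dim))
    slices = []
    for t in range(n_slices):
        B = base + 0.05 * t * rng.standard_normal((n_words, dim))
        slices.append(B @ B.T)
    return slices


if __name__ == "__main__":
    Y = make_toy_slices()
    U, history = fit(Y, dim=3)
    print(f"loss: {history[0]:.2f} -> {history[-1]:.2f}")
```

With tau set to 0 the slices decouple and each U[t] is only determined up to a rotation, which is the alignment problem the abstract refers to. A positive tau is one simple way to remove that ambiguity during training instead of aligning the slices afterwards.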

Index Terms

Information systems → Information systems applications → Data mining


Published in
WSDM '18: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining
February 2018
821 pages
ISBN: 9781450355810
DOI: 10.1145/3159652
General Chairs: Yi Chang (Jilin University, Huawei Inc.), Chengxiang Zhai (University of Illinois Urbana-Champaign)
Program Chairs: Yan Liu (University of Southern California), Yoelle Maarek (Amazon)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
dynamic word embeddings
word semantic analysis
Acceptance Rates
WSDM '18 Paper Acceptance Rate: 81 of 514 submissions, 16%
Overall Acceptance Rate: 498 of 2,863 submissions, 17%