DOI: 10.1145/3077136.3080813

Learning a Hierarchical Embedding Model for Personalized Product Search

Published: 07 August 2017

ABSTRACT

Product search is an important part of online shopping. In contrast to many search tasks, the objective of product search is not confined to retrieving relevant products. Instead, it focuses on finding items that satisfy the needs of individuals and lead to a purchase. These characteristics make search personalization essential for both customers and e-shopping companies. Purchase behavior in online shopping is highly personal, and users often provide rich feedback about their decisions (e.g., product reviews). However, the severe mismatch between the language of queries, products, and users makes traditional retrieval models based on bag-of-words assumptions less suitable for personalization in product search. In this paper, we propose a hierarchical embedding model that learns semantic representations for entities (i.e., words, products, users, and queries) at different levels from their associated language data. Our contributions are threefold: (1) our work is one of the initial studies on personalized product search; (2) our hierarchical embedding model is the first latent space model that jointly learns distributed representations for queries, products, and users with a deep neural network; (3) each component of our network is designed as a generative model, so that the whole structure is explainable and extendable. Following the methodology of previous studies, we constructed personalized product search benchmarks from Amazon product data. Experiments show that our hierarchical embedding model significantly outperforms existing product search baselines on multiple benchmark datasets.
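To make the idea of retrieval in a shared latent space concrete, the following is a minimal sketch and not the model described in the abstract. It assumes that query, user, and product vectors have already been learned, that the personalized intent is a convex combination of the query and user vectors, and that products are ranked by cosine similarity. The function names, the weight `lam`, and the toy vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def personalized_intent(query_vec, user_vec, lam=0.5):
    """Assumed combination rule: lam * query + (1 - lam) * user."""
    return [lam * q + (1.0 - lam) * u for q, u in zip(query_vec, user_vec)]


def rank_products(query_vec, user_vec, product_vecs, lam=0.5):
    """Return (product_id, score) pairs sorted by similarity to the intent vector."""
    intent = personalized_intent(query_vec, user_vec, lam)
    scored = [(pid, cosine(intent, vec)) for pid, vec in product_vecs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy 2-d embeddings, invented purely for illustration.
products = {"tent": [1.0, 0.0], "novel": [0.0, 1.0]}
query = [1.0, 1.0]    # an ambiguous query, equally close to both products
hiker = [1.0, 0.0]    # a user whose vector lies near "tent"
reader = [0.0, 1.0]   # a user whose vector lies near "novel"

hiker_ranking = rank_products(query, hiker, products)
reader_ranking = rank_products(query, reader, products)
```

With these toy vectors, the same query ranks a different product first for each user, which is the effect personalization is meant to achieve. Setting `lam=1.0` ignores the user vector and reduces the sketch to non-personalized latent space retrieval.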

    • Published in

      SIGIR '17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval
      August 2017
      1476 pages
      ISBN: 9781450350228
      DOI: 10.1145/3077136

      Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article

      Acceptance Rates

      SIGIR '17 Paper Acceptance Rate: 78 of 362 submissions, 22%
      Overall Acceptance Rate: 792 of 3,983 submissions, 20%
