DOI: 10.1145/1143844.1143859
Article

Dynamic topic models

Published: 25 June 2006

ABSTRACT

A family of probabilistic time series models is developed to analyze the time evolution of topics in large document collections. The approach is to use state space models on the natural parameters of the multinomial distributions that represent the topics. Variational approximations based on Kalman filters and nonparametric wavelet regression are developed to carry out approximate posterior inference over the latent topics. In addition to giving quantitative, predictive models of a sequential corpus, dynamic topic models provide a qualitative window into the contents of a large document collection. The models are demonstrated by analyzing the OCR'ed archives of the journal Science from 1880 through 2000.
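
The state space idea in the abstract can be illustrated with a minimal sketch. It assumes a Gaussian random walk on a single topic's natural parameters and a softmax map back to word probabilities; the abstract itself does not specify the noise model or the mapping, so both are assumptions made here for illustration. The names `simulate_topic_drift`, `sigma`, and `vocab_size` are hypothetical.

```python
import numpy as np


def softmax(x):
    """Map natural parameters to points on the probability simplex."""
    z = np.exp(x - x.max(axis=-1, keepdims=True))  # subtract max for stability
    return z / z.sum(axis=-1, keepdims=True)


def simulate_topic_drift(num_steps, vocab_size, sigma=0.1, seed=0):
    """Simulate one topic evolving over time (illustrative assumption only).

    The natural parameters follow a random walk:
        beta[t] = beta[t - 1] + Gaussian noise with standard deviation sigma
    Returns an array of shape (num_steps, vocab_size); each row is the
    topic's word distribution at one time slice.
    """
    rng = np.random.default_rng(seed)
    beta = np.zeros((num_steps, vocab_size))
    beta[0] = rng.normal(0.0, 1.0, size=vocab_size)  # initial state
    for t in range(1, num_steps):
        beta[t] = beta[t - 1] + rng.normal(0.0, sigma, size=vocab_size)
    return softmax(beta)


word_probs = simulate_topic_drift(num_steps=5, vocab_size=4)
```

A smaller `sigma` makes consecutive rows of `word_probs` more alike, which corresponds to a topic that changes slowly between time slices. Posterior inference for such a model, the subject of the abstract's variational approximations, is not covered by this sketch.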


Published in

ICML '06: Proceedings of the 23rd International Conference on Machine Learning
June 2006
1154 pages
ISBN: 1595933832
DOI: 10.1145/1143844

Copyright © 2006 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

ICML '06 paper acceptance rate: 140 of 548 submissions, 26%
Overall acceptance rate: 140 of 548 submissions, 26%
