research-article

On Estimation of Functional Causal Models: General Results and Application to the Post-Nonlinear Causal Model

Published: 17 December 2015

Abstract

Unlike constraint-based causal discovery, causal discovery based on functional causal models can identify the whole causal model under appropriate assumptions [Shimizu et al. 2006; Hoyer et al. 2009; Zhang and Hyvärinen 2009b]. Functional causal models represent the effect as a function of the direct causes together with an independent noise term. Examples include the linear non-Gaussian acyclic model (LiNGAM), the nonlinear additive noise model, and the post-nonlinear (PNL) model. Currently, there are two ways to estimate the parameters of these models: dependence minimization and maximum likelihood. In this article, we show that for any acyclic functional causal model, minimizing the mutual information between the hypothetical cause and the noise term is equivalent to maximizing the data likelihood with a flexible model for the distribution of the noise term. We then focus on estimation of the PNL causal model and propose to estimate it with a warped Gaussian process, with the noise modeled by a mixture of Gaussians. As a Bayesian nonparametric approach, it outperforms the previous approach based on mutual information minimization with nonlinear functions represented by multilayer perceptrons. We also show that, unlike in ordinary regression, the estimation results of the PNL causal model are sensitive to the assumption on the noise distribution. Experimental results on both synthetic and real data support our theoretical claims.
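
To make the PNL form concrete, the following is a minimal sketch, not the estimation method of the article. It assumes the usual PNL structure y = f2(f1(x) + e) with noise e drawn independently of the cause x and an invertible f2. The particular functions `f1` and `f2`, the Laplace noise, and the helper names `simulate_pnl` and `recover_noise` are illustrative assumptions chosen so that the inverse is available in closed form.

```python
import numpy as np


def f1(x):
    # Inner nonlinear effect of the cause (illustrative choice, not from the article).
    return np.tanh(2.0 * x) + 0.5 * x


def f2(z):
    # Outer invertible distortion (illustrative choice); sinh is strictly increasing.
    return np.sinh(z)


def f2_inverse(y):
    # Closed-form inverse of f2.
    return np.arcsinh(y)


def simulate_pnl(n, seed=0):
    """Draw (x, y, e) from the toy post-nonlinear model y = f2(f1(x) + e)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-2.0, 2.0, size=n)   # hypothetical cause
    e = rng.laplace(scale=0.3, size=n)   # non-Gaussian noise, drawn independently of x
    y = f2(f1(x) + e)
    return x, y, e


def recover_noise(x, y):
    """Invert the model with the true functions: e = f2^{-1}(y) - f1(x)."""
    return f2_inverse(y) - f1(x)


if __name__ == "__main__":
    x, y, e = simulate_pnl(1000)
    e_hat = recover_noise(x, y)
    print("max reconstruction error:", np.max(np.abs(e_hat - e)))
```

In an estimation setting the functions are unknown, so the recovered noise depends on the fitted functions; the two criteria named in the abstract (mutual information between the cause and this noise estimate, and data likelihood) are both functions of that estimate.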

References

  1. P. J. Bickel and K. A. Doksum. 1981. An analysis of transformations revisited. Journal of the American Statistical Association 76, 296--311.
  2. T. M. Cover and J. A. Thomas. 1991. Elements of Information Theory. Wiley.
  3. A. Gretton, K. Fukumizu, C. H. Teo, L. Song, B. Schölkopf, and A. J. Smola. 2008. A kernel statistical test of independence. In Advances in Neural Information Processing Systems 20. MIT Press, Cambridge, MA, 585--592.
  4. P. O. Hoyer, D. Janzing, J. Mooij, J. Peters, and B. Schölkopf. 2009. Nonlinear causal discovery with additive noise models. In Advances in Neural Information Processing Systems 21.
  5. A. Hyvärinen, J. Karhunen, and E. Oja. 2001. Independent Component Analysis. John Wiley & Sons.
  6. A. Hyvärinen and P. Pajunen. 1999. Nonlinear independent component analysis: Existence and uniqueness results. Neural Networks 12, 3, 429--439.
  7. D. Janzing, J. Mooij, K. Zhang, J. Lemeire, J. Zscheischler, P. Daniušis, B. Steudel, and B. Schölkopf. 2012. Information-geometric approach to inferring causal directions. Artificial Intelligence 182--183, 1--31.
  8. R. A. Levine and G. Casella. 2001. Implementations of the Monte Carlo EM algorithm. Journal of Computational and Graphical Statistics 10, 3, 422--439.
  9. J. Mooij, D. Janzing, J. Peters, and B. Schölkopf. 2009. Regression by dependence minimization and its application to causal inference in additive noise models. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML’09). 745--752.
  10. J. Mooij, O. Stegle, D. Janzing, K. Zhang, and B. Schölkopf. 2010. Probabilistic latent variable models for distinguishing between cause and effect. In Advances in Neural Information Processing Systems 23.
  11. J. Pearl. 2000. Causality: Models, Reasoning, and Inference. Cambridge University Press, Cambridge, MA.
  12. B. Schölkopf, D. Janzing, J. Peters, E. Sgouritsa, K. Zhang, and J. Mooij. 2012. On causal and anticausal learning. In Proceedings of the 29th International Conference on Machine Learning (ICML’12).
  13. S. Shimizu, P. O. Hoyer, A. Hyvärinen, and A. J. Kerminen. 2006. A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research 7, 2003--2030.
  14. E. Snelson, C. E. Rasmussen, and Z. Ghahramani. 2004. Warped Gaussian processes. In Advances in Neural Information Processing Systems 16.
  15. P. Spirtes, C. Glymour, and R. Scheines. 2001. Causation, Prediction, and Search (2nd ed.). MIT Press, Cambridge, MA.
  16. A. Taleb and C. Jutten. 1999. Source separation in post-nonlinear mixtures. IEEE Transactions on Signal Processing 47, 10, 2807--2820.
  17. M. Yamada and M. Sugiyama. 2010. Dependence minimizing regression with model selection for non-linear causal inference under non-Gaussian noise. In Proceedings of the 24th AAAI Conference on Artificial Intelligence (AAAI’10). 643--648.
  18. K. Zhang and L. Chan. 2005. Extended Gaussianization method for blind separation of post-nonlinear mixtures. Neural Computation 17, 2, 425--452.
  19. K. Zhang and A. Hyvärinen. 2009a. Acyclic causality discovery with additive noise: An information-theoretical perspective. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD’09).
  20. K. Zhang and A. Hyvärinen. 2009b. On the identifiability of the post-nonlinear causal model. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence.
  21. K. Zhang, J. Peters, D. Janzing, and B. Schölkopf. 2011. Kernel-based conditional independence test and application in causal discovery. In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI’11).
  22. K. Zhang, B. Schölkopf, K. Muandet, and Z. Wang. 2013a. Domain adaptation under target and conditional shift. In Proceedings of the 30th International Conference on Machine Learning.
  23. K. Zhang, Z. Wang, and B. Schölkopf. 2013b. On estimation of functional causal models: Post-nonlinear causal model as an example. In Proceedings of the IEEE 13th International Conference on Data Mining Workshops. 139--146.


    • Published in

      ACM Transactions on Intelligent Systems and Technology, Volume 7, Issue 2
      Special Issue on Causal Discovery and Inference
      January 2016
      270 pages
      ISSN:2157-6904
      EISSN:2157-6912
      DOI:10.1145/2850424
      • Editor: Yu Zheng

      Copyright © 2015 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 17 December 2015
      • Revised: 1 October 2014
      • Accepted: 1 October 2014
      • Received: 1 April 2014

      Qualifiers

      • research-article
      • Research
      • Refereed
