research-article

On Estimation of Functional Causal Models: General Results and Application to the Post-Nonlinear Causal Model

Authors:
Kun Zhang
Max-Planck Institute for Intelligent Systems, Tubingen, Germany

Zhikun Wang
Max-Planck Institute for Intelligent Systems, Tubingen, Germany

Jiji Zhang
Department of Philosophy, Lingnan University, Hong Kong

Bernhard Schölkopf
Max-Planck Institute for Intelligent Systems, Tubingen, Germany

ACM Transactions on Intelligent Systems and Technology, Volume 7, Issue 2, Article No. 13, pp. 1–22. https://doi.org/10.1145/2700476

Published: 17 December 2015


Abstract

Compared to constraint-based causal discovery, causal discovery based on functional causal models can identify the whole causal model under appropriate assumptions [Shimizu et al. 2006; Hoyer et al. 2009; Zhang and Hyvärinen 2009b]. Functional causal models represent the effect as a function of the direct causes together with an independent noise term. Examples include the linear non-Gaussian acyclic model (LiNGAM), the nonlinear additive noise model, and the post-nonlinear (PNL) model. Currently, there are two ways to estimate the parameters of these models: dependence minimization and maximum likelihood. In this article, we show that for any acyclic functional causal model, minimizing the mutual information between the hypothetical cause and the noise term is equivalent to maximizing the data likelihood with a flexible model for the distribution of the noise term. We then focus on estimation of the PNL causal model and propose to estimate it with the warped Gaussian process, with the noise modeled by a mixture of Gaussians. As a Bayesian nonparametric approach, it outperforms the previous approach based on mutual information minimization with nonlinear functions represented by multilayer perceptrons; we also show that, unlike in ordinary regression, the estimation results of the PNL causal model are sensitive to the assumption on the noise distribution. Experimental results on both synthetic and real data support our theoretical claims.
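To make the maximum-likelihood view concrete, the following is a minimal sketch, not the article's estimator. It assumes the usual two-variable PNL form, effect = f2(f1(cause) + noise) with f2 invertible, and scores a candidate inner function by the conditional log-likelihood obtained from the change-of-variables formula. The specific choices here (f1(x) = x + x^3, f2 = sinh, Laplace noise with scale 0.3, and all function names) are illustrative assumptions; the article itself uses a warped Gaussian process with mixture-of-Gaussians noise.

```python
import math
import random


def simulate_pnl(n, f1, f2, noise_scale, seed=0):
    """Draw n samples from y = f2(f1(x) + e), with e independent of x."""
    rng = random.Random(seed)
    xs, ys, es = [], [], []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        # Laplace noise: exponential magnitude with mean noise_scale, random sign.
        e = rng.expovariate(1.0 / noise_scale) * rng.choice((-1.0, 1.0))
        xs.append(x)
        es.append(e)
        ys.append(f2(f1(x) + e))
    return xs, ys, es


def laplace_logpdf(e, scale):
    """Log-density of a zero-mean Laplace distribution."""
    return -math.log(2.0 * scale) - abs(e) / scale


def pnl_conditional_loglik(xs, ys, f1, f2_inv, log_abs_df2_inv, noise_logpdf):
    """Sum of log p(y | x) = log p_e(f2_inv(y) - f1(x)) + log |d f2_inv / dy|."""
    total = 0.0
    for x, y in zip(xs, ys):
        e_hat = f2_inv(y) - f1(x)  # noise implied by the candidate model
        total += noise_logpdf(e_hat) + log_abs_df2_inv(y)
    return total


# Assumed ground truth for the illustration.
true_f1 = lambda x: x + x ** 3
xs, ys, es = simulate_pnl(2000, true_f1, math.sinh, noise_scale=0.3)

# Inverse of sinh is asinh; d asinh(y)/dy = 1 / sqrt(1 + y^2).
log_abs_dasinh = lambda y: -0.5 * math.log1p(y * y)
noise_logpdf = lambda e: laplace_logpdf(e, 0.3)

ll_true = pnl_conditional_loglik(
    xs, ys, true_f1, math.asinh, log_abs_dasinh, noise_logpdf)
ll_wrong = pnl_conditional_loglik(
    xs, ys, lambda x: 0.0, math.asinh, log_abs_dasinh, noise_logpdf)

# With the true functions, the implied noise matches the simulated noise.
recovered = [math.asinh(y) - true_f1(x) for x, y in zip(xs, ys)]

print("log-likelihood, true inner function:", ll_true)
print("log-likelihood, misspecified inner function:", ll_wrong)
```

In this sketch the noise density is fixed in advance. The abstract's point is that the noise model should instead be flexible, since PNL estimates are sensitive to the assumed noise distribution.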

References

P. J. Bickel and K. A. Doksum. 1981. An analysis of transformations revisited. Journal of the American Statistical Association 76, 296--311.
T. M. Cover and J. A. Thomas. 1991. Elements of Information Theory. Wiley.
A. Gretton, K. Fukumizu, C. H. Teo, L. Song, B. Schölkopf, and A. J. Smola. 2008. A kernel statistical test of independence. In Advances in Neural Information Processing Systems 20. MIT Press, Cambridge, MA, 585--592.
P. O. Hoyer, D. Janzing, J. Mooij, J. Peters, and B. Schölkopf. 2009. Nonlinear causal discovery with additive noise models. In Advances in Neural Information Processing Systems 21.
A. Hyvärinen, J. Karhunen, and E. Oja. 2001. Independent Component Analysis. John Wiley & Sons.
A. Hyvärinen and P. Pajunen. 1999. Nonlinear independent component analysis: Existence and uniqueness results. Neural Networks 12, 3, 429--439.
D. Janzing, J. Mooij, K. Zhang, J. Lemeire, J. Zscheischler, P. Daniušis, B. Steudel, and B. Schölkopf. 2012. Information-geometric approach to inferring causal directions. Artificial Intelligence 182--183, 1--31.
R. A. Levine and G. Casella. 2001. Implementations of the Monte Carlo EM algorithm. Journal of Computational and Graphical Statistics 10, 3, 422--439.
J. Mooij, D. Janzing, J. Peters, and B. Schölkopf. 2009. Regression by dependence minimization and its application to causal inference in additive noise models. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML’09). 745--752.
J. Mooij, O. Stegle, D. Janzing, K. Zhang, and B. Schölkopf. 2010. Probabilistic latent variable models for distinguishing between cause and effect. In Advances in Neural Information Processing Systems 23.
J. Pearl. 2000. Causality: Models, Reasoning, and Inference. Cambridge University Press, Cambridge, MA.
B. Schölkopf, D. Janzing, J. Peters, E. Sgouritsa, K. Zhang, and J. Mooij. 2012. On causal and anticausal learning. In Proceedings of the 29th International Conference on Machine Learning (ICML’12).
S. Shimizu, P. O. Hoyer, A. Hyvärinen, and A. J. Kerminen. 2006. A linear non-Gaussian acyclic model for causal discovery. Journal of Machine Learning Research 7, 2003--2030.
E. Snelson, C. E. Rasmussen, and Z. Ghahramani. 2004. Warped Gaussian processes. In Advances in Neural Information Processing Systems 16.
P. Spirtes, C. Glymour, and R. Scheines. 2001. Causation, Prediction, and Search (2nd ed.). MIT Press, Cambridge, MA.
A. Taleb and C. Jutten. 1999. Source separation in post-nonlinear mixtures. IEEE Transactions on Signal Processing 47, 10, 2807--2820.
M. Yamada and M. Sugiyama. 2010. Dependence minimizing regression with model selection for non-linear causal inference under non-Gaussian noise. In Proceedings of the 24th AAAI Conference on Artificial Intelligence (AAAI’10). 643--648.
K. Zhang and L. Chan. 2005. Extended Gaussianization method for blind separation of post-nonlinear mixtures. Neural Computation 17, 2, 425--452.
K. Zhang and A. Hyvärinen. 2009a. Acyclic causality discovery with additive noise: An information-theoretical perspective. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD’09).
K. Zhang and A. Hyvärinen. 2009b. On the identifiability of the post-nonlinear causal model. In Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence.
K. Zhang, J. Peters, D. Janzing, and B. Schölkopf. 2011. Kernel-based conditional independence test and application in causal discovery. In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence (UAI’11).
K. Zhang, B. Schölkopf, K. Muandet, and Z. Wang. 2013a. Domain adaptation under target and conditional shift. In Proceedings of the 30th International Conference on Machine Learning.
K. Zhang, Z. Wang, and B. Schölkopf. 2013b. On estimation of functional causal models: Post-nonlinear causal model as an example. In Proceedings of the IEEE 13th International Conference on Data Mining Workshops. 139--146.

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Philosophical/theoretical foundations of artificial intelligence


Published in
ACM Transactions on Intelligent Systems and Technology Volume 7, Issue 2
Special Issue on Causal Discovery and Inference
January 2016
270 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/2850424
Editor:
Yu Zheng
Microsoft Research, China
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 17 December 2015
- Revised: 1 October 2014
- Accepted: 1 October 2014
- Received: 1 April 2014
Author Tags
Causal discovery
functional causal model
maximum likelihood
post-nonlinear causal model
statistical independence
Qualifiers
- research-article
- Research
- Refereed