ABSTRACT
There are well-known algorithms for learning the structure of directed and undirected graphical models from data, but nearly all assume that the data consist of a single i.i.d. sample. In contexts such as fMRI analysis, data may consist of an ensemble of independent samples from a common data-generating mechanism that need not have identical distributions. Pooling such data can cause a number of well-known statistical problems, so each sample must be analyzed individually, which yields no increase in power from the presence of multiple samples. We show how existing constraint-based methods can be modified to learn structure from the aggregate of such data in a statistically sound manner. The prescribed method is simple to implement and is based on existing statistical methods employed in meta-analysis and other areas, but it works surprisingly well in this context, where there are increased concerns due to issues such as retesting. We report results for directed models, but the method is just as applicable to undirected models.
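As an illustration of the kind of meta-analytic tool the abstract alludes to, the sketch below combines the p-values that a conditional independence test returns on each sample into one aggregate p-value. It assumes Fisher's combined probability test, which is only one such pooling rule; the abstract does not say which rule the method uses, and the function names `fisher_combined_p` and `judged_independent` are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (an assumed choice).

    Under the joint null hypothesis, X = -2 * sum(ln p_i) follows a
    chi-square distribution with 2k degrees of freedom, where k is the
    number of p-values.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2

    # For even degrees of freedom 2k the chi-square survival function has
    # the closed form exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!.
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def judged_independent(p_values, alpha=0.05):
    """Hypothetical decision rule: keep the independence hypothesis
    unless the combined p-value falls at or below alpha."""
    return fisher_combined_p(p_values) > alpha


if __name__ == "__main__":
    # Per-sample p-values for the same conditional independence test,
    # one from each of three independent samples (made-up numbers).
    per_sample = [0.10, 0.10, 0.10]
    print(fisher_combined_p(per_sample))
    print(judged_independent(per_sample))
```

In this made-up case no single sample rejects independence at the 0.05 level, yet the combined p-value falls below it. That is the gain in power that analyzing each sample separately cannot provide.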
Index Terms
- Structure learning with independent non-identically distributed data