Structure learning with independent non-identically distributed data

Author:
Robert E. Tillman

Carnegie Mellon University, Pittsburgh, PA

ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning, June 2009, Pages 1041–1048
https://doi.org/10.1145/1553374.1553507

Published: 14 June 2009

ABSTRACT

There are well-known algorithms for learning the structure of directed and undirected graphical models from data, but nearly all assume that the data consist of a single i.i.d. sample. In contexts such as fMRI analysis, the data may instead consist of an ensemble of independent samples that share a common data-generating mechanism but may not have identical distributions. Pooling such data can cause a number of well-known statistical problems, so each sample must be analyzed individually, which yields no increase in power from the presence of multiple samples. We show how existing constraint-based methods can be modified to learn structure from the aggregate of such data in a statistically sound manner. The prescribed method is simple to implement and is based on existing statistical methods employed in meta-analysis and other areas, yet it works surprisingly well in this context, where issues such as retesting raise additional concerns. We report results for directed models, but the method is just as applicable to undirected models.
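
As a rough illustration of the kind of meta-analytic tool the abstract alludes to, the sketch below combines the p-values that one conditional independence test yields on each independent sample, using Fisher's method. The abstract does not say which combination rule the paper adopts, so the choice of Fisher's method, the function names `fisher_combine` and `judged_independent`, and the 0.05 threshold are all assumptions made for illustration only.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    Under the joint null hypothesis, -2 * sum(log p_i) follows a
    chi-square distribution with 2k degrees of freedom, where k is
    the number of p-values.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    k = len(p_values)
    # Half of the Fisher statistic; clamp inputs to avoid log(0).
    half_stat = -sum(math.log(max(p, 1e-300)) for p in p_values)
    if half_stat == 0:
        return 1.0  # every input p-value was 1
    # For even degrees of freedom 2k, the chi-square survival function
    # equals P(Poisson(half_stat) < k); sum those terms in log space.
    log_half = math.log(half_stat)
    tail = sum(
        math.exp(-half_stat + i * log_half - math.lgamma(i + 1))
        for i in range(k)
    )
    return min(1.0, tail)


def judged_independent(p_values, alpha=0.05):
    """Hypothetical decision rule for a constraint-based search:
    retain independence unless the combined p-value falls below alpha."""
    return fisher_combine(p_values) >= alpha
```

In a constraint-based search, a rule like `judged_independent` would stand in for the single-sample independence test: three samples with p-values 0.04, 0.06 and 0.05 are individually borderline, but their combined p-value is well below 0.05, which is where the gain in power from multiple samples would come from.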

Published in
ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning
June 2009, 1331 pages
ISBN: 9781605585161
DOI: 10.1145/1553374
General Chair: Andrea Danyluk (Williams College)
Program Chairs: Léon Bottou (NEC Laboratories America), Michael Littman (Rutgers University)
Copyright © 2009 by the author(s)/owner(s).
Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates
Overall Acceptance Rate: 140 of 548 submissions, 26%