Article

Mining multiple phenotype structures underlying gene expression profiles

Authors:
Chun Tang (State University of New York at Buffalo, Buffalo, NY), Aidong Zhang (State University of New York at Buffalo, Buffalo, NY)

CIKM '03: Proceedings of the twelfth international conference on Information and knowledge management, November 2003, Pages 418–425, https://doi.org/10.1145/956863.956942

Published: 03 November 2003


ABSTRACT

DNA microarray technology is now widely used in basic biomedical research for mRNA expression profiling and is increasingly being used to explore patterns of gene expression in clinical research. Automatically detecting phenotype structures from gene expression profiles can provide deep insight into the nature of many diseases and can aid the development of new drugs. While most previous studies focus only on mining the empirical phenotype structure that the experiment controls, it is also interesting to detect possible hidden phenotype structures underlying gene expression profiles. Since the number of samples is usually limited, such data sets are very sparse in the high-dimensional gene space. Furthermore, most of the genes of interest are buried in a large amount of noise. Unsupervised phenotype structure discovery in such sparse, high-dimensional data sets presents interesting but challenging problems. In this paper, we propose a model for simultaneously mining both empirical and hidden phenotype structures from gene expression data. We demonstrate the effectiveness and efficiency of the proposed method on various real-world data sets.


Index Terms

Mining multiple phenotype structures underlying gene expression profiles
1. Information systems
  1. Information systems applications
    1. Data mining


Published in
CIKM '03: Proceedings of the twelfth international conference on Information and knowledge management
November 2003
592 pages
ISBN:1581137230
DOI:10.1145/956863
General Chair: Donald Kraft (Louisiana State University)
Program Chairs: Ophir Frieder (Illinois Institute of Technology); Joachim Hammer (University of Florida); Sajda Qureshi (University of Nebraska, Omaha); Len Seligman (The MITRE Corporation)
Copyright © 2003 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
array data
bioinformatics
informative genes
phenotype
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
