Inferential Clustering Approach for Microarray Experiments with Replicated Measurements

Authors:
Miquel Salicru, Barcelona University, Spain
Sergi Vives, Barcelona University, Spain
Tian Zheng, Columbia University, New York
IEEE/ACM Transactions on Computational Biology and Bioinformatics, Volume 6, Issue 4, pp. 594–604, https://doi.org/10.1109/TCBB.2008.106

Published: 01 October 2009

Abstract

Cluster analysis has proven to be a useful tool for investigating the association structure among genes in a microarray data set. There is a rich literature on cluster analysis, and various techniques have been developed. Such analyses depend heavily on an appropriate (dis)similarity measure. In this paper, we introduce a general clustering approach based on confidence interval inferential methodology and apply it to gene expression data from microarray experiments. Emphasis is placed on data with low replication (three or five replicates). The proposed method makes more efficient use of the measured data and avoids the subjective choice of a dissimilarity measure. When applied to real data, this new methodology provides an easy-to-use bioinformatics solution for the cluster analysis of microarray experiments with replicates (see the Appendix). Although the method is presented in the framework of microarray experiments, it is a general algorithm that can be used to identify clusters in any situation. The method's performance is evaluated using simulated and publicly available data sets. Our results also clearly show that our method is not an extension of the conventional clustering methods based on correlation or Euclidean distance.
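
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of what clustering driven by confidence intervals, rather than by a chosen dissimilarity measure, could look like. It is not the authors' method. The assumptions are all ours: two genes are treated as indistinguishable when, in every condition, a 95% pooled-variance Student-t interval for the difference of their replicate means covers zero, and clusters are the connected components of that relation. The function names (`diff_interval`, `indistinguishable`, `cluster`) and the data layout are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

# Two-sided 95% Student-t critical values keyed by degrees of freedom.
# With r replicates per gene, df = 2 * (r - 1): r = 3 gives 4, r = 5 gives 8.
T_CRIT_95 = {4: 2.776, 8: 2.306}


def diff_interval(a, b):
    """95% pooled-variance interval for mean(a) - mean(b).

    Assumes both genes have the same number of replicates (3 or 5).
    """
    r = len(a)
    df = 2 * (r - 1)
    pooled_var = (variance(a) + variance(b)) / 2
    half_width = T_CRIT_95[df] * sqrt(2 * pooled_var / r)
    d = mean(a) - mean(b)
    return d - half_width, d + half_width


def indistinguishable(g1, g2):
    """True if the interval covers zero in every condition (our assumed rule)."""
    for a, b in zip(g1, g2):
        lo, hi = diff_interval(a, b)
        if not (lo <= 0 <= hi):
            return False
    return True


def cluster(genes):
    """Group genes into connected components of the 'indistinguishable' relation.

    `genes` maps a gene name to a list of conditions, each a list of replicates.
    """
    parent = {name: name for name in genes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(genes, 2):
        if indistinguishable(genes[u], genes[v]):
            parent[find(u)] = find(v)

    groups = {}
    for name in genes:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())
```

Under this rule the number of clusters falls out of the interval comparisons instead of being fixed in advance, which is one way to read the abstract's claim about avoiding a subjective dissimilarity choice. Connected components are a simplification: overlap of intervals is not transitive, so a real procedure would need to handle chains of pairwise-similar genes more carefully.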

Published in
IEEE/ACM Transactions on Computational Biology and Bioinformatics, Volume 6, Issue 4, October 2009, 185 pages
ISSN: 1545-5963
Publisher: IEEE Computer Society Press, Washington, DC, United States

Author Tags
Clustering analysis, confidence interval, gene expression data
