Inferential Clustering Approach for Microarray Experiments with Replicated Measurements

Published: 01 October 2009

Abstract

Cluster analysis has proven to be a useful tool for investigating the association structure among genes in a microarray data set. There is a rich literature on cluster analysis, and various techniques have been developed. Such analyses depend heavily on an appropriate (dis)similarity measure. In this paper, we introduce a general clustering approach based on confidence interval inferential methodology and apply it to gene expression data from microarray experiments. Emphasis is placed on data with low replication (three or five replicates). The proposed method makes more efficient use of the measured data and avoids the subjective choice of a dissimilarity measure. When applied to real data, the new methodology provides an easy-to-use bioinformatics solution for the cluster analysis of microarray experiments with replicates (see the Appendix). Although the method is presented in the framework of microarray experiments, it is a general algorithm that can be used to identify clusters in any situation. The method's performance is evaluated using simulated and publicly available data sets. Our results also clearly show that our method is not an extension of conventional clustering methods based on correlation or Euclidean distance.
