DOI: 10.1145/2695664.2695773
research-article

Discovering weighted motifs in gene co-expression networks

Published: 13 April 2015

ABSTRACT

An important dimension of complex networks is embedded in the weights of their edges. Incorporating this source of information into the analysis of a network can greatly enhance our understanding of it. This is the case for gene co-expression networks, which encapsulate information about the strength of correlation between gene expression profiles. Classical unweighted gene co-expression networks use thresholding to define connectivity, losing some of the information contained in the different connection strengths. In this paper, we propose a mining method capable of extracting information from weighted gene co-expression networks. We study groups of differently connected nodes and their importance as network motifs. We define a subgraph as a motif if the weights of the edges inside the subgraph follow a distribution significantly different from the one expected at random. We use the Kolmogorov-Smirnov test to calculate the significance score of the subgraph, avoiding the time-consuming generation of random networks to determine statistical significance. We apply our approach to gene co-expression networks related to three different types of cancer and also to two healthy datasets. The structure of the networks is compared using weighted motif profiles, and our results show that we are able to clearly distinguish the networks and separate them by type. We also compare the biological relevance of our weighted approach to that of a more classical binary motif profile, where edges are unweighted. We use shared Gene Ontology annotations on biological processes, cellular components and molecular functions. The results of gene enrichment analysis show that weighted motifs are biologically more significant than binary motifs.
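The significance test described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the reference ("random") distribution is approximated by the weights of all edges in the network, and the names `ks_two_sample`, `is_weighted_motif`, and the threshold `alpha` are hypothetical. The p-value comes from the asymptotic Kolmogorov series, so it is only approximate for small subgraphs.

```python
import math
from bisect import bisect_right


def ks_two_sample(sample_a, sample_b):
    """Return (D, p) for a two-sample Kolmogorov-Smirnov test.

    D is the largest gap between the two empirical CDFs. The p-value is
    the asymptotic approximation, which is rough for small samples.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    # The gap between the empirical CDFs can only change at observed values.
    d = max(abs(bisect_right(a, x) / n - bisect_right(b, x) / m) for x in a + b)
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        # The series converges slowly here and its value is essentially 1.
        return d, 1.0
    # Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 * k^2 * lam^2)
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(1.0, max(0.0, 2.0 * series))


def is_weighted_motif(subgraph_weights, background_weights, alpha=0.05):
    """Flag a subgraph whose edge weights differ from the background.

    Assumption for this sketch: the background is the list of all edge
    weights in the network; alpha is an illustrative threshold.
    """
    _, p = ks_two_sample(background_weights, subgraph_weights)
    return p < alpha


if __name__ == "__main__":
    # Toy data, not taken from the paper: evenly spread background weights
    # and a subgraph made only of strong correlations.
    background = [i / 100 for i in range(1, 100)]
    subgraph = [0.90, 0.92, 0.93, 0.95, 0.97, 0.99]
    print(ks_two_sample(background, subgraph))
    print(is_weighted_motif(subgraph, background))
```

Because the test compares two lists of weights directly, no ensemble of randomized networks has to be generated, which is the cost saving the abstract refers to.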

Published in

SAC '15: Proceedings of the 30th Annual ACM Symposium on Applied Computing
April 2015
2418 pages
ISBN: 9781450331968
DOI: 10.1145/2695664

Copyright © 2015 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

SAC '15 paper acceptance rate: 291 of 1,211 submissions, 24%
Overall acceptance rate: 1,650 of 6,669 submissions, 25%
