ABSTRACT
In recent years, genome-wide association studies (GWAS) have successfully identified loci that harbor genetic variants associated with complex diseases. However, the susceptibility loci identified by GWAS so far generally account for only a limited fraction of heritability in patient populations. More recently, considerable attention has been devoted to identifying epistatic interactions. However, the large number of locus pairs to be tested for epistasis poses significant challenges, both computational (run time) and statistical (multiple hypothesis testing).
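To make the scale of the pairwise testing problem concrete, the following minimal sketch counts the locus pairs in an exhaustive search and the corresponding Bonferroni-corrected significance threshold. The figure of 500,000 loci and the 0.05 significance level are illustrative assumptions, not values taken from the paper, and `pairwise_test_burden` is a hypothetical helper name.

```python
from math import comb


def pairwise_test_burden(num_loci, alpha=0.05):
    """Return (number of locus pairs, Bonferroni per-test threshold).

    num_loci and alpha are illustrative inputs, not values from the paper.
    """
    num_pairs = comb(num_loci, 2)  # n * (n - 1) / 2 unordered pairs
    return num_pairs, alpha / num_pairs


# Hypothetical GWAS panel of 500,000 loci (an assumed size).
pairs, threshold = pairwise_test_burden(500_000)
print(f"{pairs:,} pairs, per-test threshold {threshold:.1e}")
```

Because the number of pairs grows quadratically with the number of loci, any reduction in the set of candidate pairs eases both the run time and the multiple-testing correction.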
In this paper, we propose a new method to reduce the number of tests required to identify epistatic pairs of genomic loci. The key idea of the proposed algorithm is to reduce the data by identifying sets of loci that may be complementary in their association with the disease. Specifically, we identify population covering locus sets (PoCos), i.e., sets of loci that harbor at least one susceptibility allele in samples with the phenotype of interest. We then compute representative genotypes for the PoCos and assess the significance of the interactions between pairs of PoCos. We use the results of this assessment to prioritize pairs of loci to be tested for epistasis. We test the proposed method on two independent GWAS data sets of Type 2 Diabetes (T2D). Our experimental results show that the proposed method drastically reduces the number of hypotheses to be tested, enabling efficient identification of a larger number of statistically significant epistatic loci. Moreover, some of the identified epistatic pairs of loci are reproducible between the two data sets. We also show that the proposed method outperforms an existing method for prioritization of locus pairs.
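The idea of a population covering locus set can be illustrated with a small sketch. The abstract does not specify how PoCos are constructed or how representative genotypes are defined, so the code below makes two loudly stated assumptions: it builds a single PoCo with a plain greedy set-cover heuristic, and it defines the representative genotype as a logical OR over the loci in the set. The function names and the toy data are hypothetical and should not be read as the authors' algorithm.

```python
def greedy_poco(carrier, candidate_loci):
    """Greedily pick loci until every case sample is covered.

    carrier[s][l] is truthy if case sample s carries the susceptibility
    allele at locus l. Greedy set cover is an assumed heuristic here,
    not the construction used in the paper.
    Returns (selected loci, set of case samples left uncovered).
    """
    uncovered = set(range(len(carrier)))
    remaining = list(candidate_loci)
    poco = []
    while uncovered and remaining:
        # Locus that covers the most still-uncovered case samples.
        best = max(remaining, key=lambda l: sum(carrier[s][l] for s in uncovered))
        gained = {s for s in uncovered if carrier[s][best]}
        if not gained:
            break  # no remaining locus covers any uncovered sample
        poco.append(best)
        uncovered -= gained
        remaining.remove(best)
    return poco, uncovered


def representative_genotype(genotypes, poco):
    """Assumed representative genotype: 1 if a sample carries the
    susceptibility allele at any locus in the PoCo, else 0."""
    return [int(any(row[l] for l in poco)) for row in genotypes]


# Toy data: rows are samples, columns are three hypothetical loci.
cases = [[1, 0, 0], [1, 0, 1], [0, 1, 0], [0, 1, 1]]
controls = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]

poco, uncovered = greedy_poco(cases, [0, 1, 2])
case_rep = representative_genotype(cases, poco)
control_rep = representative_genotype(controls, poco)
print(poco, uncovered, case_rep, control_rep)
```

In this toy example, loci 0 and 1 together cover all four case samples, so locus 2 is not needed. Under these assumptions, representative genotypes of two PoCos could then be compared with a standard interaction test, and only locus pairs drawn from significantly interacting PoCo pairs would be carried forward to locus-level epistasis testing.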
Index Terms
- Prioritization of genomic locus pairs for testing epistasis