A new approach for gene prediction using comparative sequence analysis
Source: Symposium on Applied Computing
Proceedings of the 2005 ACM Symposium on Applied Computing
Santa Fe, New Mexico
SESSION: Bioinformatics (BIO)
Pages: 177 - 184  
Year of Publication: 2005
ISBN:1-58113-964-0
Authors
Rong Chen  University of Nebraska at Omaha, Omaha, NE
Hesham H. Ali  University of Nebraska at Omaha, Omaha, NE
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1066677.1066719

ABSTRACT

The availability of large fragments of genomic DNA makes it possible to apply comparative genomics to the identification of protein-coding regions. In this work, a comparative analysis is conducted on homologous genomic sequences from organisms at different evolutionary distances, and conservation of non-coding regions is found between closely related organisms. In contrast, more distantly related organisms show much less intron similarity, but also less conservation of exon structure. This study seeks to illuminate the impact of evolutionary distance on the performance of the proposed gene-finding program, which is based on cross-species sequence comparison. Based on the findings of the comparative study and on training with data sets, we propose a model in which coding sequence is identified by comparing sequences from multiple species, both close and appropriately distant. The reliability of the proposed method is evaluated in terms of sensitivity and specificity, and the results are compared to those obtained by other popular gene prediction programs. Provided that sequences can be found from other species at appropriate evolutionary distances, this approach could be applied to newly sequenced organisms for which no species-dependent statistical models are available.
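
The abstract does not say how sensitivity and specificity are computed. As a minimal sketch only, the Python below assumes the nucleotide-level convention common in gene-prediction evaluation, where sensitivity is TP / (TP + FN) and specificity is TP / (TP + FP). The function names, the 0-based end-exclusive interval format, and the example coordinates are all hypothetical and are not taken from the paper.

```python
def coding_mask(exons, length):
    """Mark coding positions for a sequence of the given length.

    `exons` is a list of (start, end) pairs, 0-based and end-exclusive
    (an assumed format chosen for this illustration).
    """
    mask = [False] * length
    for start, end in exons:
        for i in range(start, end):
            mask[i] = True
    return mask


def sensitivity_specificity(annotated, predicted, length):
    """Nucleotide-level sensitivity and specificity of a prediction.

    Assumed definitions (gene-prediction convention, not confirmed by
    the abstract): Sn = TP / (TP + FN), Sp = TP / (TP + FP).
    """
    truth = coding_mask(annotated, length)
    guess = coding_mask(predicted, length)
    tp = sum(1 for t, g in zip(truth, guess) if t and g)
    fn = sum(1 for t, g in zip(truth, guess) if t and not g)
    fp = sum(1 for t, g in zip(truth, guess) if g and not t)
    sn = tp / (tp + fn) if (tp + fn) else 0.0
    sp = tp / (tp + fp) if (tp + fp) else 0.0
    return sn, sp


# Hypothetical coordinates, for illustration only.
annotated_exons = [(10, 20), (40, 60)]
predicted_exons = [(12, 20), (40, 50), (70, 75)]
sn, sp = sensitivity_specificity(annotated_exons, predicted_exons, 100)
print(f"sensitivity={sn:.3f} specificity={sp:.3f}")
```

Under these assumed definitions, a predictor that misses part of an exon loses sensitivity, while one that calls coding sequence in an intron or intergenic region loses specificity, which is why the two are usually reported together.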


