ABSTRACT
In this paper, we present a graphical model for protein secondary structure prediction. The model extends segmental semi-Markov models (SSMMs) to exploit multiple sequence alignment profiles, which contain information from evolutionarily related sequences. A novel parameterized model is proposed as the likelihood function for the SSMM to capture the segmental conformation. By incorporating information from long-range interactions in β-sheets, the model is also capable of carrying out inference on contact maps. Numerical results on benchmark data sets show that incorporating the profiles yields substantial improvements and that the generalization performance is promising.