Comprehensive statistical method for protein fold recognition
Source: Annual Conference on Research in Computational Molecular Biology
Proceedings of the fourth annual international conference on Computational molecular biology
Tokyo, Japan
Pages: 76 - 85  
Year of Publication: 2000
ISBN:1-58113-186-0
Authors
Jadwiga R. Bieńkowska  BioMolecular Engineering Research Center, College of Engineering, Boston University, 36 Cummington Street, Boston, MA
Lihua Yu  BioMolecular Engineering Research Center, College of Engineering, Boston University, 36 Cummington Street, Boston, MA
Sophia Zarakhovich  BioMolecular Engineering Research Center, College of Engineering, Boston University, 36 Cummington Street, Boston, MA
Robert G. Rogers, Jr.  BioMolecular Engineering Research Center, College of Engineering, Boston University, 36 Cummington Street, Boston, MA
Temple F. Smith  BioMolecular Engineering Research Center, College of Engineering, Boston University, 36 Cummington Street, Boston, MA
Sponsor
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/332306.332347

ABSTRACT

We present a protein fold recognition method that uses a comprehensive statistical interpretation of structural Hidden Markov Models (HMMs). The structure/fold recognition is done by summing the probabilities of all sequence-to-structure alignments. Conventionally, Boltzmann statistics dictate that the optimal alignment can give an estimate of the lowest free energy of the sequence conformation imposed by the structural model. The alignment is optimized for a scoring function that is interpreted as a free energy of an amino acid in a structural environment. Near-optimal alignments are ignored, regardless of how likely they might be compared to the optimal alignment. Here we investigate an alternative view. A structure model can be seen as a statistical representation of an ensemble of similar structures. The optimal alignment is always the most probable, but sub-optimal alignments may have comparable probabilities. These sub-optimal alignments can be interpreted as optimal alignments to the “other” structures from the ensemble or optimal alignments under minor fluctuations in the scoring function. Summing probabilities for all alignments gives an estimate of sequence-model compatibility. We have built a set of structural HMMs for 188 protein structures, and have compared two methods for identifying the structure compatible with a sequence: by the optimal alignment probability and by the total probability. Fold recognition by total probability was 40% more accurate than fold recognition by the optimal alignment probability.
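
The contrast between the two scores in the abstract can be illustrated with a generic HMM. The sketch below is not the authors' structural HMM: the two-state models, the two-letter alphabet (H for hydrophobic, P for polar), the fold names, and every probability are made up for illustration. It only shows that the optimal alignment probability corresponds to a maximum over state paths (the Viterbi recursion) and the total probability to a sum over all state paths (the forward recursion), and that either score can be used to rank candidate models.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def path_score(obs, model, combine):
    """Dynamic programming over state paths, in log space.

    combine=max          gives the best single path (Viterbi).
    combine=log_sum_exp  gives the sum over all paths (forward).
    """
    start, trans, emit = model["start"], model["trans"], model["emit"]
    states = list(start)
    col = {s: math.log(start[s] * emit[s][obs[0]]) for s in states}
    for symbol in obs[1:]:
        col = {
            s: combine([col[p] + math.log(trans[p][s]) for p in states])
            + math.log(emit[s][symbol])
            for s in states
        }
    return combine(list(col.values()))

def optimal_log_prob(obs, model):
    """Log probability of the single most probable path."""
    return path_score(obs, model, max)

def total_log_prob(obs, model):
    """Log of the probability summed over every path."""
    return path_score(obs, model, log_sum_exp)

def recognize(obs, models, scorer):
    """Return the name of the model that scores highest for obs."""
    return max(models, key=lambda name: scorer(obs, models[name]))

# Hypothetical toy models; all names and numbers are assumptions.
TOY_MODELS = {
    "fold_A": {
        "start": {"helix": 0.5, "strand": 0.5},
        "trans": {"helix": {"helix": 0.9, "strand": 0.1},
                  "strand": {"helix": 0.2, "strand": 0.8}},
        "emit": {"helix": {"H": 0.3, "P": 0.7},
                 "strand": {"H": 0.6, "P": 0.4}},
    },
    "fold_B": {
        "start": {"helix": 0.5, "strand": 0.5},
        "trans": {"helix": {"helix": 0.5, "strand": 0.5},
                  "strand": {"helix": 0.5, "strand": 0.5}},
        "emit": {"helix": {"H": 0.5, "P": 0.5},
                 "strand": {"H": 0.7, "P": 0.3}},
    },
}

if __name__ == "__main__":
    seq = "HPPHHPHP"
    for name, model in TOY_MODELS.items():
        print(name, optimal_log_prob(seq, model), total_log_prob(seq, model))
    print("by optimal path:", recognize(seq, TOY_MODELS, optimal_log_prob))
    print("by total probability:", recognize(seq, TOY_MODELS, total_log_prob))
```

Because the sum over paths includes the best path, the total score is never lower than the optimal-path score for the same model. The two criteria can still rank models differently when one model spreads its probability over many near-optimal paths.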

