Bidirectional Long Short-Term Memory Networks for Predicting the Subcellular Localization of Eukaryotic Proteins

Published: 01 July 2007
Abstract

An algorithm for processing sequential data, called Bidirectional Long Short-Term Memory Networks (BLSTM), is introduced. This supervised learning method trains a special recurrent neural network to use very long-range symmetric sequence context, combining nonlinear processing elements with linear feedback loops that store long-range context. Applied to the sequence-based prediction of protein localization, the algorithm correctly predicts the localization of 93.3% of novel non-plant proteins and 88.4% of novel plant proteins, an improvement over feedforward and standard recurrent networks solving the same problem. The BLSTM system is available as a web service (http://www.stepc.gr/~synaptic/blstm.html).
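
The idea of combining a forward and a backward recurrent pass can be illustrated with a minimal sketch. This is not the authors' implementation: the one-hot residue encoding, the gate layout (including a forget gate), the hidden size, the random untrained weights, and every function name below are assumptions chosen for illustration only. The sketch shows how each sequence position ends up with features that depend on context from both directions, and where the linear feedback loop of the memory cell sits.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed 20-letter alphabet


def one_hot(residue):
    """Encode one residue as a 20-dimensional indicator vector (assumed encoding)."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(n_in, n_hidden, rng):
    """Random, untrained weights: one row per hidden unit and gate, over [x, h, bias]."""
    size = n_in + n_hidden + 1
    gates = ("input", "forget", "output", "candidate")
    return {g: [[rng.uniform(-0.1, 0.1) for _ in range(size)]
                for _ in range(n_hidden)] for g in gates}


def lstm_pass(inputs, weights, n_hidden):
    """Run one LSTM layer over the inputs; return the hidden state at every step."""
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    states = []
    for x in inputs:
        z = x + h + [1.0]  # current input, previous hidden state, bias term

        def pre(gate, j):
            return sum(w * v for w, v in zip(weights[gate][j], z))

        new_h, new_c = [], []
        for j in range(n_hidden):
            i_g = sigmoid(pre("input", j))
            f_g = sigmoid(pre("forget", j))
            o_g = sigmoid(pre("output", j))
            cand = math.tanh(pre("candidate", j))
            # Linear feedback loop: the cell state is carried forward additively,
            # scaled only by the forget gate, so it can hold long-range context.
            c_j = f_g * c[j] + i_g * cand
            new_c.append(c_j)
            new_h.append(o_g * math.tanh(c_j))
        h, c = new_h, new_c
        states.append(h)
    return states


def blstm_features(sequence, fwd, bwd, n_hidden):
    """Concatenate forward and backward hidden states for each sequence position."""
    inputs = [one_hot(r) for r in sequence]
    f_states = lstm_pass(inputs, fwd, n_hidden)
    # The backward layer reads the sequence reversed; flip its states back
    # so that index t refers to the same residue in both directions.
    b_states = lstm_pass(inputs[::-1], bwd, n_hidden)[::-1]
    return [f + b for f, b in zip(f_states, b_states)]


rng = random.Random(0)
n_hidden = 4
fwd = init_weights(len(AMINO_ACIDS), n_hidden, rng)
bwd = init_weights(len(AMINO_ACIDS), n_hidden, rng)
features = blstm_features("MKTAYIAK", fwd, bwd, n_hidden)  # hypothetical peptide
print(len(features), len(features[0]))
```

In a trained predictor these per-position features would feed an output layer that scores the localization classes; that layer and the training procedure are omitted here because the abstract does not describe them.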
