skip to main content
article

Combinatorial Approaches for Mass Spectra Recalibration

Published: 01 January 2008 Publication History

Abstract

Mass spectrometry has become one of the most popular analysis techniques in Proteomics and Systems Biology. With the creation of larger datasets, the automated recalibration of mass spectra becomes important to ensure that every peak in the sample spectrum is correctly assigned to some peptide and protein. Algorithms for recalibrating mass spectra have to be robust with respect to wrongly assigned peaks, as well as efficient due to the amount of mass spectrometry data. The recalibration of mass spectra leads us to the problem of finding an optimal matching between mass spectra under measurement errors.We have developed two deterministic methods that allow robust computation of such a matching: The first approach uses a computational geometry interpretation of the problem, and tries to find two parallel lines with constant distance that stab a maximal number of points in the plane. The second approach is based on finding a maximal common approximate subsequence, and improves existing algorithms by one order of magnitude exploiting the sequential nature of the matching problem. We compare our results to a computational geometry algorithm using a topological line-sweep.

References

[1]
W.J. Henzel, C. Watanabe, and J.T. Stults, "Protein Identification: The Origins of Peptide Mass Fingerprints," J. Am. Soc. Mass Spectrometry, vol. 14, pp. 931-942, 2003.
[2]
R. Matthiesen, M.B. Trelle, P. Hojrup, J. Bunkenborg, and O.N. Jensen, "VEMS 3.0: Algorithms and Computational Tools for Tandem Mass Spectrometry Based Identification of Post_Transla- tional Modifications in Proteins," J Proteome Research, vol. 4, no. 6, pp. 2338-2347, http://dx.doi.org/10.1021/pr050264q, 2005.
[3]
B. Ma, K. Zhang, C. Hendrie, C. Liang, M. Li, A. Doherty-Kirby, and G. Lajoie, "PEAKS: Powerful Software for Peptide de Novo Sequencing by Tandem Mass Spectrometry," Rapid Comm. Mass Spectrometry, vol. 17, no. 20, pp. 2337-2342, 2003.
[4]
B.-L. Adam, Y. Qu, J.W. Davis, M.D. Ward, M.A. Clements, L.H. Cazares, O.J. Semmes, P.F. Schellhammer, Y. Yasui, Z. Feng, and G.L. Wright Jr., "Serum Protein Fingerprinting Coupled with a Pattern-Matching Algorithm Distinguishes Prostate Cancer from Benign Prostate Hyperplasia and Healthy Men," Cancer Research, vol. 62, pp. 3609-3614, 2002.
[5]
J. Gobom, M. Mueller, V. Egelhofer, D. Theiss, H. Lehrach, and E. Nordhoff, "A Calibration Method that Simplifies and Improves Accurate Determination of Peptide Molecular Masses by MALDI-TOF MS," Analytical Chemistry, vol. 74, no. 15, pp. 3915-3923, 2002.
[6]
K.A. Baggerly, J.S. Morris, and K.R. Coombes, "Reproducibility of SELDI-TOF Protein Patterns in Serum: Comparing Datasets from Different Experiments," Bioinformatics, vol. 20, no. 5, pp. 777-785, 2004.
[7]
O.J. Semmes, Z. Feng, B.-L. Adam, L.L. Banez, W.L. Bigbee, D. Campos, L.H. Cazares, D.W. Chan, W.E. Grizzle, E. Izbicka, J. Kagan, G. Malik, D. McLerran, J.W. Moul, A. Partin, P. Prasanna, J. Rosenzweig, L.J. Sokoll, S. Srivastava, S. Srivastava, I. Thompson, M.J. Welsh, N. White, M. Winget, Y. Yasui, Z. Zhang, and L. Zhu, "Evaluation of Serum Protein Profiling by Surface-Enhanced Laser Desorption/Ionization Time-of-Flight Mass Spectrometry for the Detection of Prostate Cancer: I. Assessment of Platform Reproducibility," Clinical Chemistry, vol. 51, pp. 102-112, 2005.
[8]
M.W. Bern and D. Goldberg, "EigenMS: De Novo Analysis of Peptide Tandem Mass Spectra by Spectral Graph Partitioning," Proc. Ann. Int'l Conf. Research in Computational and Molecular Biology (RECOMB '05), vol. 3500, pp. 357-372, 2005.
[9]
E. Cheney, An Introduction to Approximation Theory, second ed., reprint of 1982 ed. Am. Math. Soc., 2000.
[10]
J.W. Wong, G. Cagney, and H.M. Cartwright, "SpecAlign-Processing and Alignment of Mass Spectra Datasets," Bioinformatics , vol. 21, no. 9, pp. 2088-2090, 2005.
[11]
W.E. Wolski, M. Lalowski, P. Jungblut, and K. Reinert, "Calibration of Mass Spectrometric Peptide Mass Fingerprint Data without Specific External or Internal Calibrants," BMC Bioinformatics, vol. 6, p. 203, 2005.
[12]
N. Jeffries, "Algorithms for Alignment of Mass Spectrometry Proteomic Data," Bioinformatics, vol. 21, no. 14, pp. 3066-3073, 2005.
[13]
R. Matthiesen, J. Bunkenborg, A. Stensballe, O.N. Jensen, K.G. Welinder, and G. Bauw, "Database-Independent, Database-Dependent, and Extended Interpretation of Peptide Mass Spectra in VEMS V2.0," Proteomics, vol. 4, no. 9, pp. 2583-2593, http:// dx.doi.org/10.1002/pmic.200300792, Sept. 2004.
[14]
A. Wool and Z. Smilansky, "Precalibration of Matrix-Assisted Laser Desorption/Ionization-Time of Flight Spectra for Peptide Mass Fingerprinting," Proteomics, vol. 2, no. 10, pp. 1365-1373, http://dx.doi.org/3.0.CO;2-9, 2002.
[15]
K.R. Clauser, P. Baker, and A.L. Burlingame, "Role of Accurate Mass Measurement (+/ - 10 ppm) in Protein Identification Strategies Employing MS or MS/MS and Database Searching," Analytical Chemistry, vol. 71, no. 14, pp. 2871-2882, July 1999.
[16]
V. Egelhofer, K. Büssow, C. Luebbert, H. Lehrach, and E. Nordhoff, "Improvements in Protein Identification by MALDI-TOF-MS Peptide Mapping," Analytical Chemistry, vol. 72, no. 13, pp. 2741-2750, July 2000.
[17]
T.J. Rivlin, An Introduction to the Approximation of Functions, reprint of 1969 ed. Dover, 1981.
[18]
S. Chattopadhyay and P. Das, "The K-Dense Corridor Problems," Pattern Recognition Letters, vol. 11, no. 7, pp. 463-469, 1990.
[19]
F.Y. Chin, C.A. Wang, and F.L. Wang, "Maximum Stabbing Line in 2D Plane," Proc. Ann. Int'l Conf. Computing and Combinatorics (COCOON '99), vol. 1627, pp. 379-388, 1999.
[20]
K.Q. Brown, "Geometric Transforms for Fast Geometric Algorithms," Report CMUCS-80-101, Dept. of Computer Science, Carnegie Mellon Univ., 1980.
[21]
M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf, Computational Geometry: Algorithms and Applications, second ed. Springer, 2000.
[22]
D.L. Souvaine and J.M. Steele, "Time- and Space-Efficient Algorithms for Least Median of Squares Regression," J. Am. Statistical Assoc., vol. 82, no. 399, pp. 794-801, 1987.
[23]
H. Edelsbrunner and D.L. Souvaine, "Computing Least Median of Squares Regression Lines and Guided Topological Sweep," J. Am. Statistical Assoc., vol. 85, no. 409, pp. 115-119, 1990.
[24]
H. Edelsbrunner and L.J. Guibas, "Topologically Sweeping an Arrangement," J. Computer and System Sciences, vol. 38, no. 1, pp. 165-194, 1989.
[25]
E. Rafalin, S. Souvaine, and I. Streinu, "Topological Sweep in Degenerate Cases," Proc. Fourth Workshop Algorithm Eng. and Experiments (ALENEX '02), vol. 2409, pp. 577-588, 2002.
[26]
P.J. Rousseeuw, "Least Median of Squares Regression," J. Am. Statistical Assoc., 1984.
[27]
H. Alt, K. Mehlhorn, H. Wagener, and E. Welzl, "Congruence, Similarity and Symmetries of Geometric Objects," Discrete and Computational Geometry, vol. 3, no. 3, pp. 237-256, 1988.
[28]
R. Karp and S. Li, "Two Special Cases of the Assignment Problem," Discrete Math., vol. 13, no. 2, pp. 129-142, 1975.
[29]
J. Colannino, M. Damian, F. Hurtado, J. Iacono, H. Meijer, S. Ramaswami, and G. Toussaint, "An O(n log n)-Time Algorithm for the Restriction Scaffold Assignment Problem," J. Computational Biology, vol. 13, no. 4, pp. 979-989, http:// www.liebertonline.com/doi/abs/10.1089/cmb.2006.13.979, 2006.
[30]
Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison, D. Sankoff and J.B. Kruskal, eds. Addison-Wesley, 1983.
[31]
E. Rafalin, "LMS Regression Using Guided Topological Sweep in Degenerate Cases," http://www.cs.tufts.edu/research/geometry /lms/, 2002.
[32]
M. Lefmann, C. Honisch, S. Boecker, N. Storm, F. von Wintzingerode, C. Schloetelburg, A. Moter, D. van den Boom, and U.B. Goebel, "A Novel Mass Spectrometry Based Tool for Genotypic Identification of Mycobacteria," J. Clinical Microbiology , vol. 42, no. 1, pp. 339-346, 2004.
[33]
T.J. Rivlin, Chebyshev Polynomials: From Approximation Theory to Algebra and Number Theory. Wiley-Interscience, 1990.
[34]
C.A. Taatjes, N. Hansen, A. McIlroy, J.A. Miller, J.P. Senosiain, S.J. Klippenstein, F. Qi, L. Sheng, Y. Zhang, T.A. Cool, J. Wang, P.R. Westmoreland, M.E. Law, T. Kasper, and K. Kohse-Höinghaus, "Enols Are Common Intermediates in Hydrocarbon Oxidation," Science, vol. 308, no. 5730, pp. 1887-1889, 2005.

Recommendations

Comments

Information & Contributors

Information

Published In

cover image IEEE/ACM Transactions on Computational Biology and Bioinformatics
IEEE/ACM Transactions on Computational Biology and Bioinformatics  Volume 5, Issue 1
January 2008
159 pages

Publisher

IEEE Computer Society Press

Washington, DC, United States

Publication History

Published: 01 January 2008
Published in TCBB Volume 5, Issue 1

Author Tags

  1. biotechnology
  2. combinatorial pattern matching
  3. computational geometry
  4. mass spectrometry

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 245
    Total Downloads
  • Downloads (Last 12 months)5
  • Downloads (Last 6 weeks)0
Reflects downloads up to 01 Mar 2025

Other Metrics

Citations

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media