skip to main content
article

Dynamic text and static pattern matching

Published: 01 May 2007 Publication History

Abstract

In this article, we address a new version of dynamic pattern matching. The dynamic text and static pattern matching problem is the problem of finding a static pattern in a text that is continuously being updated. The goal is to report all new occurrences of the pattern in the text after each text update. We present an algorithm for solving the problem where the text update operation is changing the symbol value of a text location. Given a text of length n and a pattern of length m, our algorithm preprocesses the text in time O(n log log m), and the pattern in time O(m log m). The extra space used is O(n + m log m). Following each text update, the algorithm deletes all prior occurrences of the pattern that no longer match, and reports all new occurrences of the pattern in the text in O(log log m) time. We note that the complexity is not proportional to the number of pattern occurrences, since all new occurrences can be reported in a succinct form.

References

[1]
Alon, N., and Naor, M. 1996. Derandomization, witnesses for boolean matrix multiplication and construction of perfect hash functions. Algorithmica 16, 4-5, 434--449.
[2]
Alstrup, S., Brodal, G. S., and Rauhe, T. 2000. Pattern matching in dynamic texts. In Proceedings of the 11th ACM-SIAM Symposium on Discrete Algorithms (SODA). SIAM, 819--828.
[3]
Beame, P., and Fich, F. E. 2002. Optimal bounds for the predecessor problem and related problems. J. Comput. Syst. Sci. 65, 1, 38--72.
[4]
Berkman, O., and Vishkin, U. 1994. Finding level-ancestors in trees. J. Comput. Syst. Sci. 48, 2, 214--229.
[5]
Buchsbaum, A. L. 2006. Personal communication.
[6]
Buchsbaum, A. L., Goodrich, M. T., and Westbrook, J. 2000. Range searching over tree cross products. In Proceedings of the 8th Annual European Symposium on Algorithms (ESA). Springer Verlag, Berlin. 120--131.
[7]
Cole, R., and Harihan, R. 1997. Tighter bounds on the exact complexity of string matching. SIAM J. Comput. 26, 3, 803--856.
[8]
Farach, M. 1997. Optimal suffix tree construction with large alphabets. In Proceedings of the 38th IEEE Symposium on Foundations of Computer Science (FOCS). 137--143.
[9]
Farach, M., and Muthukrishnan, S. 1996. Perfect hashing for strings: Formalization and algorithms. In Proceedings of the 7th Symposium on Combinatorial Pattern Matching (CPM). Springer Verlag, Berlin. 130--140.
[10]
Ferragina, P., and Grossi, R. 1995. Fast incremental text editing. In Proceedings of the 7th ACM-SIAM Symposium on Discrete Algorithms (SODA). 531--540.
[11]
Gabow, H. N., Bentley, J. L., and Tarjan, R. E. 1984. Scaling and related techniques for geometry problems. In Proceedings of the 16th ACM Symposium on Theory of Computing (STOC), vol. 67. 135--143.
[12]
Gu, M., Farach, M., and Beigel, R. 1994. An efficient algorithm for dynamic text indexing. In Proceedings of the 5th ACM-SIAM Symposium on Discrete Algorithms (SODA). 697--704.
[13]
Hagerup, T., Miltersen, P. B., and Pagh, R. 2001. Deterministic dictionaries. J. Alg. 41, 1, 69--85.
[14]
Harel, D., and Tarjan, R. 1984. Fast algorithms for finding nearest common ancestors. SIAM J. Comput. 13, 2, 338--355.
[15]
Kärkkäinen, J., and Sanders, P. 2003. Simple linear work suffix array construction. In Proceedings of the 30th International Colloquium on Automata, Languages and Programming (ICALP). Lecture Notes in Computer Science, vol. 2719. Springer Verlag, Berlin. 943--955.
[16]
Knuth, D., Morris, J., and Pratt, V. 1977. Fast pattern matching in strings. SIAM J. Comput. 6, 323--350.
[17]
Landau, G. M., and Vishkin, U. 1988. Fast string matching with k differences. J. Comput. Syst. Sci. 37, 1, 63--78.
[18]
Landau, G. M., and Vishkin, U. 1994. Pattern matching in a digitized image. Algorithmica 12, 4-5, 375--408.
[19]
Levenshtein, V. I. 1966. Binary codes capable of correcting, deletions, insertions and reversals. Soviet Phys. Dokl. 10, 707--710.
[20]
Lowrance, R., and Wagner, R. A. 1975. An extension of the string-to-string correction problem. J. ACM 22, 2, 177--183.
[21]
McCreight, E. M. 1976. A space-economical suffix tree construction algorithm. J. ACM 23, 262--272.
[22]
Sahinalp, S. C., and Vishkin, U. 1996. Efficient approximate and dynamic matching of patterns using a labeling paradigm. In Proceedings of the 37th IEEE Symposium on Foundations of Computer Science (FOCS). 320--328.
[23]
Schieber, B., and Vishkin, U. 1988. On finding lowest common ancestors: Simplification and parallelization. SIAM J. Comput. 17, 6, 1253--1262.
[24]
Smith, T., and Waterman, M. 1981. Identification of common molecular subsequences. J. Mol. Biol. 147, 195--197.
[25]
Ukkonen, E. 1995. On-Line construction of suffix trees. Algorithmica 14, 249--260.
[26]
van Emde Boas, P. 1974. An O(n log log n) online algorithm for the insert-extract min problem. Tech. Rep. TR 74-221, Cornell University, Department of Computer Science.
[27]
Weiner, P. 1973. Linear pattern matching algorithm. In Proceedings of the 14th IEEE Symposium on Switching and Automata Theory. 1--11.
[28]
Willard, D. E. 1983. Log-Logarithmic worst-case range queries are possible in space Theta(n). Inf. Process. Lett. 17, 2, 81--84.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Algorithms
ACM Transactions on Algorithms  Volume 3, Issue 2
May 2007
338 pages
ISSN:1549-6325
EISSN:1549-6333
DOI:10.1145/1240233
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 May 2007
Published in TALG Volume 3, Issue 2

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Dynamic text
  2. border trees
  3. static pattern

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)29
  • Downloads (Last 6 weeks)1
Reflects downloads up to 18 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Computing MEMs and Relatives on Repetitive Text CollectionsACM Transactions on Algorithms10.1145/370156121:1(1-33)Online publication date: 17-Dec-2024
  • (2023)Finding top-k longest palindromes in substringsTheoretical Computer Science10.1016/j.tcs.2023.114183979:COnline publication date: 10-Nov-2023
  • (2023)Frequency-Constrained Substring ComplexityString Processing and Information Retrieval10.1007/978-3-031-43980-3_28(345-352)Online publication date: 26-Sep-2023
  • (2023)Internal Longest Palindrome Queries in Optimal TimeWALCOM: Algorithms and Computation10.1007/978-3-031-27051-2_12(127-138)Online publication date: 22-Mar-2023
  • (2021)Detecting premature departure in online text-based counseling using logic-based pattern matchingInternet Interventions10.1016/j.invent.2021.10048626(100486)Online publication date: Dec-2021
  • (2021)Internal Dictionary MatchingAlgorithmica10.1007/s00453-021-00821-y83:7(2142-2169)Online publication date: 1-Jul-2021
  • (2020)A Linear-Time Algorithm for Seeds ComputationACM Transactions on Algorithms10.1145/338636916:2(1-23)Online publication date: 21-Apr-2020
  • (2020)Linear-time String Indexing and Analysis in Small SpaceACM Transactions on Algorithms10.1145/338141716:2(1-54)Online publication date: 9-Mar-2020
  • (2020)Faster algorithms for 1-mappability of a sequenceTheoretical Computer Science10.1016/j.tcs.2019.04.026812(2-12)Online publication date: Apr-2020
  • (2020)Indexing weighted sequencesInformation and Computation10.1016/j.ic.2019.104462270:COnline publication date: 1-Feb-2020
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media