skip to main content
10.1145/2902251.2902299acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
research-article

Schema Validation via Streaming Circuits

Published:15 June 2016Publication History

ABSTRACT

XML schema validation can be performed in constant memory in the streaming model if and only if the schema admits only trees of bounded depth - an acceptable assumption from the practical view-point. In this paper we refine this analysis by taking into account that data can be streamed block-by-block, rather then letter-by-letter, which provides opportunities to speed up the computation by parallelizing the processing of each block.

For this purpose we introduce the model of streaming circuits, which process words of arbitrary length in blocks of fixed size, passing constant amount of information between blocks.

This model allows us to transfer fundamental results about the circuit complexity of regular languages to the setting of streaming schema validation, which leads to effective constructions of streaming circuits of depth logarithmic in the block size, or even constant under certain assumptions on the input schema.

For nested-relational DTDs, a practically motivated class of bounded-depth XML schemas, we provide an efficient construction yielding constant-depth streaming circuits with particularly good parameters.

References

  1. Serge Abiteboul, Luc Segoufin, and Victor Vianu. Representing and querying XML with incomplete information. ACM Trans. Database Syst., 31(1):208--254, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Marcelo Arenas and Leonid Libkin. XML data exchange: Consistency and query answering. J. ACM, 55(2), 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. David A. Mix Barrington, Kevin Compton, Howard Straubing, and Denis Thérien. Regular languages in NC1. J. Computer and System Sciences, 44(3):478--499, 1992. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. G. J. Bex, F. Neven, and J. Van den Bussche. DTDs versus XML Schema: a practical study. In WEBDB 2004, pages 79--84, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. David Black-Schaffer. Block-Parallel Programming for Real-Time Applications on Multi-Core Processors. PhD thesis, Humboldt-Universitat zu Berlin, 2008. Available at http://cva.stanford.edu/people/davidbbs/black-schaffer_thesis_final.pdf. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. David Black-Schaffer and William J. Dally. Block-parallel programming for real-time embedded applications. In ICPP 2010, pages 297--306. IEEE Computer Society, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Mikołaj Bojanczyk, Howard Straubing, and Igor Walukiewicz. Wreath products of forest algebras, with applications to tree logics. Logical Methods in Computer Science, 8(3), 2012.Google ScholarGoogle Scholar
  8. Vince Bárány, Christof Löding, and Olivier Serre. Regularity problems for visibly pushdown languages. In Bruno Durand and Wolfgang Thomas, editors, STACS 2006, volume 3884 of Lecture Notes in Computer Science, pages 420--431. Springer Berlin Heidelberg, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Ashok K. Chandra, Steven Fortune, and Richard J. Lipton. Lower bounds for constant depth circuits for prefix problems. In Josep Dıaz, editor, ICALP 1983, volume 154 of Lecture Notes in Computer Science, pages 109--117. Springer, 1983. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Sang Cho and Dung T. Huynh. Finite-automaton aperiodicity is PSPACE-complete. Theoretical Computer Science, 88(1):99 -- 116, 1991. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Luc Dartois and Charles Paperman. Alternation hierarchies of first order logic with regular predicates. In Adrian Kosowski and Igor Walukiewicz, editors, FCT 2015, volume 9210 of Lecture Notes in Computer Science, pages 160--172. Springer, 2015.Google ScholarGoogle Scholar
  12. Tathagata Das, Yuan Zhong, Ion Stoica, and Scott Shenker. Adaptive stream processing using dynamic batch sizing. In Ed Lazowska, Doug Terry, Remzi H. Arpaci-Dusseau, and Johannes Gehrke, editors, SoCC 2014, pages 16:1--16:13. ACM, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Patrick Dymond. Input-driven languages are in log n depth. Inf. Process. Lett., 26(5):247--250, January 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Merrick Furst, James B. Saxe, and Michael Sipser. Parity, circuits, and the polynomial-time hierarchy. Theory of Computing Systems, 17:13--27, 1984.Google ScholarGoogle Scholar
  15. Karsten Henckell. Pointlike sets: the finest aperiodic cover of a finite semigroup. Journal of Pure and Applied Algebra, 55(1):85 -- 126, 1988.Google ScholarGoogle ScholarCross RefCross Ref
  16. Neil Immerman. Languages that capture complexity classes. SIAM J. Computing, 16(4):760--778, 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Rashid Khogali, Olivia Das, and Kaamran Raahemifar. Mobile parallel computing algorithms for single-buffered, speed-scalable processors. In TrustCom 2013 / ISPA-13 / IUCC-2013, pages 1832--1839. IEEE Computer Society, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Eryk Kopczynski. Invisible pushdown languages. In LICS 2016 (to appear), 2016. Available at http://arxiv.org/abs/1511.00289\,.Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Michal Koucký, Pavel Pudlák, and Denis Thérien. Bounded-depth circuits: separating wires from gates. In Harold N. Gabow and Ronald Fagin, editors, STOC 2005, pages 257--265. ACM, 2005. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Michal Koucký. Circuit complexity of regular languages. Theory of Computing Systems, 45(4):865--879, 2009. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Michal Koucký, Clemens Lautemann, Sebastian Poloczek, and Denis Thérien. Circuit lower bounds via Ehrenfeucht-Fraissé games. In CCC 2006, pages 190--201, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Viraj Kumar, P. Madhusudan, and Mahesh Viswanathan. Visibly pushdown automata for streaming XML. In WWW 2007, pages 1053--1062. ACM, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Dongwon Lee, Murali Mani, Frank Chiu, and Wesley W. Chu. Net & cot: translating relational schemas to XML schemas using semantic constraints. In Proceedings of the 2002 ACM CIKM International Conference on Information and Knowledge Management, McLean, VA, USA, November 4--9, 2002, pages 282--291. ACM, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Kyong-Ha Lee, Yoon-Joon Lee, Hyunsik Choi, Yon Dohn Chung, and Bongki Moon. Parallel data processing with MapReduce: a survey. SIGMOD Record, 40(4):11--20, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Robert McNaughton and Seymour Papert. Counter-Free Automata. The MIT Press, Cambridge, Mass., 1971.Google ScholarGoogle Scholar
  26. Charles Paperman. Circuits booléens, prédicats modulaires et langages réguliers. PhD dissertation, Université Paris Diderot, 2014. In French.Google ScholarGoogle Scholar
  27. Thomas Place and Marc Zeitoun. Separating regular languages with first-order logic. In Thomas A. Henzinger and Dale Miller, editors, CSL-LICS 2014, pages 75:1--75:10. ACM, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Thomas Place and Marc Zeitoun. Separation and the successor relation. In Ernst W. Mayr and Nicolas Ollinger, editors, STACS 2015, volume 30 of Leibniz International Proceedings in Informatics (LIPIcs), pages 662--675, Dagstuhl, Germany, 2015. Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik.Google ScholarGoogle Scholar
  29. Marcel-Paul Schützenberger. On finite monoids having only trivial subgroups. Information and Control, 8(2):190--194, 1965.Google ScholarGoogle ScholarCross RefCross Ref
  30. Luc Segoufin and Cristina Sirangelo. Constant-memory validation of streaming XML documents against DTDs. In Thomas Schwentick and Dan Suciu, editors, ICDT 2007, volume 4353 of Lecture Notes in Computer Science, pages 299--313. Springer, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Luc Segoufin and Victor Vianu. Validating streaming XML documents. In Lucian Popa, Serge Abiteboul, and Phokion G. Kolaitis, editors, PODS 2002, pages 53--64. ACM, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Howard Straubing. Finite Automata, Formal Logic, and Circuit Complexity. Birkh\"auser, Boston, 1994. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Schema Validation via Streaming Circuits

                    Recommendations

                    Comments

                    Login options

                    Check if you have access through your login credentials or your institution to get full access on this article.

                    Sign in
                    • Published in

                      cover image ACM Conferences
                      PODS '16: Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
                      June 2016
                      504 pages
                      ISBN:9781450341912
                      DOI:10.1145/2902251
                      • General Chair:
                      • Tova Milo,
                      • Program Chair:
                      • Wang-Chiew Tan

                      Copyright © 2016 ACM

                      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                      Publisher

                      Association for Computing Machinery

                      New York, NY, United States

                      Publication History

                      • Published: 15 June 2016

                      Permissions

                      Request permissions about this article.

                      Request Permissions

                      Check for updates

                      Qualifiers

                      • research-article

                      Acceptance Rates

                      PODS '16 Paper Acceptance Rate31of94submissions,33%Overall Acceptance Rate642of2,707submissions,24%

                    PDF Format

                    View or Download as a PDF file.

                    PDF

                    eReader

                    View online with eReader.

                    eReader