ABSTRACT
We propose an application of programming language techniques to lossless data compression, where tree data are compressed as functional programs that generate them. This "functional programs as compressed data" approach has several advantages. First, it follows from the standard argument of Kolmogorov complexity that the size of compressed data can be optimal up to an additive constant. Secondly, a compression algorithm is clean: it is just a sequence of beta-expansions for lambda-terms. Thirdly, one can use program verification and transformation techniques (higher-order model checking, in particular) to apply certain operations on data without decompression. In the paper, we present algorithms for data compression and manipulation based on the approach, and prove their correctness. We also report preliminary experiments on prototype data compression/transformation systems.
- H. Barendregt, M. Coppo, and M. Dezani-Ciancaglini. A filter lambda model and the completeness of type assignment. J. Symb. Log., 48(4):931--940, 1983.Google ScholarCross Ref
- C. H. Broadbent, A. Carayol, C.-H. L. Ong, and O. Serre. Recursion schemes and logical reflection. In Proceedings of LICS 2010, pages 120--129. IEEE Computer Society Press, 2010. Google ScholarDigital Library
- G. Busatto, M. Lohrey, and S. Maneth. Efficient memory representation of XML document trees. Inf. Syst., 33(4--5):456--474, 2008. Google ScholarDigital Library
- H. Comon, M. Dauchet, R. Gilleron, C. Loding, F. Jacquemard, D. Lugiez, S. Tison, and M. Tommasi. Tree automata techniques and applications. Available on: http://www.grappa.univ-lille3.fr/tata, 2007. release October, 12th 2007.Google Scholar
- M. Crochemore and W. Rytter. Jewels of Stringology. World Scientific, 2002.Google ScholarCross Ref
- J. Engelfriet and H. Vogler. High level tree transducers and iterated pushdown tree transducers. Acta Inf., 26(1/2):131--192, 1988. Google ScholarDigital Library
- A. J. Gill, J. Launchbury, and S. L. Peyton-Jones. A short cut to deforestation. In FPCA, pages 223--232, 1993. Google ScholarDigital Library
- M. Hague, A. Murawski, C.-H. L. Ong, and O. Serre. Collapsible pushdown automata and recursion schemes. In Proceedings of LICS 2008, pages 452--461. IEEE Computer Society, 2008. Google ScholarDigital Library
- M. Hutter. Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Springer-Verlag, Berlin, 2004. Google ScholarDigital Library
- H. Kaji, Y. Kida, and Y. Morimoto. Learning translation templates from bilingual text. In COLING, pages 672--678, 1992. Google ScholarDigital Library
- M. Karpinski, W. Rytter, and A. Shinohara. An efficient pattern matching algorithm for strings with short description. Nordic Journal on Computing, 4(2):129--144, 1997.Google Scholar
- T. Kida, T. Matsumoto, Y. Shibata, M. Takeda, A. Shinohara, and S. Arikawa. Collage system: a unifying framework for compressed pattern matching. Theor. Comput. Sci., 1(298):253--272, 2003. Google ScholarDigital Library
- T. Knapik, D. Niwinski, and P. Urzyczyn. Higher-order pushdown trees are easy. In Proceedings of FOSSACS 2002, volume 2303 of LNCS, pages 205--222. Springer-Verlag, 2002. Google ScholarDigital Library
- N. Kobayashi. Model-checking higher-order functions. In Proceedings of PPDP 2009, pages 25--36. ACM Press, 2009. Google ScholarDigital Library
- N. Kobayashi. Types and higher-order recursion schemes for verification of higher-order programs. In Proc. of POPL, pages 416--428, 2009. Google ScholarDigital Library
- N. Kobayashi. A practical linear time algorithm for trivial automata model checking of higher-order recursion schemes. In Proceedings of FOSSACS 2011, volume 6604 of LNCS, pages 260--274. Springer-Verlag, 2011. Google ScholarDigital Library
- N. Kobayashi, K. Matsuda, and A. Shonohara. Functional programs as compressed data. An extended version, available from the first author's web page, 2012.Google Scholar
- N. J. Larsson and A. Moffat. Offline dictionary-based compression. In Proc. Data Compression Conference '99 (DCC'99), page 296. IEEE Computer Society, 1999. Google ScholarDigital Library
- M. Li and P. M. B. Vitanyi. Kolmogorov complexity and its applications. In Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity, pages 187--254. The MIT Press, 1990. Google ScholarDigital Library
- M. Li and P. M. B. Vitanyi. An introduction to Kolmogorov complexity and its applications (3rd ed.). Texts in computer science. Springer-Verlag, 2009. Google ScholarDigital Library
- M. Lohrey, S. Maneth, and R. Mennicke. Tree structure compression with repair. In 2011 Data Compression Conference (DCC 2011), pages 353--362. IEEE Computer Society, 2011. Google ScholarDigital Library
- S. Maneth and G. Busatto. Tree transducers and tree compressions. In Proceedings of FOSSACS 2004, volume 2987 of LNCS, pages 363--377. Springer-Verlag, 2004.Google ScholarCross Ref
- W. Matsubara, S. Inenaga, A. Ishino, A. Shinohara, T. Nakamura, and K. Hashimoto. Efficient algorithms to compute compressed longest common substrings and compressed palindromes. Theor. Comput. Sci., 410(8--10):900--913, 2009. Google ScholarDigital Library
- C. G. Nevill-Manning and I. H. Witten. Compression and explanation using hierarchical grammars. Comput. J., 40(2/3):103--116, 1997.Google ScholarCross Ref
- C.-H. L. Ong. On model-checking trees generated by higher-order recursion schemes. In LICS 2006, pages 81--90. IEEE Computer Society Press, 2006. Google ScholarDigital Library
- W. Plandowski. Testing equivalence of morphisms on context-free languages. In ESA'94, volume 855 of LNCS, pages 460--470. Springer-Verlag, 1994. Google ScholarDigital Library
- W. Rytter. Application of Lempel-Ziv factorization to the approximation of grammar-based compression. Theor. Comput. Sci., 302(1--3):211--222, 2003. Google ScholarDigital Library
- W. Rytter. Grammar compression, LZ-encodings, and string algorithms with implicit input. In ICALP'04, volume 3142 of LNCS, pages 15--27. Springer-Verlag, 2004.Google Scholar
- H. Sakamoto. A fully linear-time approximation algorithm for grammar-based compression. Journal of Discrete Algorithms, 3(2--4):416--430, 2005.Google ScholarCross Ref
- H. L. Somers. Review article: Example-based machine translation. Machine Translation, 14(2):113--157, 1999. Google ScholarDigital Library
- R. Statman. The typed lambda-calculus is not elementary recursive. Theor. Comput. Sci., 9:73--81, 1979.Google ScholarCross Ref
- C. Stirling. Decidability of higher-order matching. Logical Methods in Computer Science, 5(3), 2009.Google Scholar
- J. Storer. NP-completeness results concerning data compression. Technical Report 234, Department of Electrical Engineering and Computer Science, Princeton University, Princeton, N.J., 1997.Google Scholar
- J. Tromp. Binary lambda calculus and combinatory logic. In Kolmogorov Complexity and Applications, volume 06051 of Dagstuhl Seminar Proceedings, 2006.Google Scholar
- T. Tsukada and N. Kobayashi. Untyped recursion schemes and infinite intersection types. In Proceedings of FOSSACS 2010, volume 6014 of LNCS, pages 343--357. Springer-Verlag, 2010. Google ScholarDigital Library
- S. van Bakel. Intersection type assignment systems. Theor. Comput. Sci., 151(2):385--435, 1995. Google ScholarDigital Library
- D. Vytiniotis and A. Kennedy. Functional pearl: every bit counts. In Proceedings of ICFP 2010, pages 15--26, 2010. Google ScholarDigital Library
Index Terms
- Functional programs as compressed data
Recommendations
Functional programs as compressed data
We propose an application of programming language techniques to lossless data compression, where tree data are compressed as functional programs that generate them. This "functional programs as compressed data" approach has several advantages. First, it ...
Ultrasound Beamforming Using Compressed Data
The rapid advancements in electronics technologies have made software-based beamformers for ultrasound array imaging feasible, thus facilitating the rapid development of high-performance and potentially low-cost systems. However, one challenge to ...
Comments