ACM Home Page
Please provide us with feedback. Feedback
Automatic inference of models for statistical code compression
Full text PdfPdf (631 KB)
Source Conference on Programming Language Design and Implementation archive
Proceedings of the ACM SIGPLAN 1999 conference on Programming language design and implementation table of contents
Atlanta, Georgia, United States
Pages: 242 - 246  
Year of Publication: 1999
ISBN:1-58113-094-5
Also published in ...
Author
Christopher W. Fraser  Microsoft Research, One Microsoft Way, Redmond, WA
Sponsors
SIGSOFT: ACM Special Interest Group on Software Engineering
SIGPLAN: ACM Special Interest Group on Programming Languages
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 7,   Downloads (12 Months): 24,   Citation Count: 13
Additional Information:

abstract   references   cited by   index terms   collaborative colleagues  

Tools and Actions: Review this Article  
Save this Article to a Binder    Display Formats: BibTex  EndNote ACM Ref   
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/301618.301672
What is a DOI?

ABSTRACT

This paper describes experiments that apply machine learning to compress computer programs, formalizing and automating decisions about instruction encoding that have traditionally been made by humans in a more ad hoc manner. A program accepts a large training set of program material in a conventional compiler intermediate representation (IR) and automatically infers a decision tree that separates IR code into streams that compress much better than the undifferentiated whole. Driving a conventional arithmetic compressor with this model yields code 30% smaller than the previous record for IR code compression, and 24% smaller than an ambitious optimizing compiler feeding an ambitious general-purpose data compressor.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
M. Burrows and D. J. Wheeler. A block-sorting lossless data compression algorithm. Digital SRC research report 124, 5/10/94.
 
3
D. Chickering, D. Heckerman, and C. Meek. A Bayesian approach to learning Bayesian networks with local structure. Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, Morgan Kaufman, 8/97.
4
 
5
M. Franz and T. Kistler. Slim binaries. TR 96-24, Dept of Information and Computer Science, University of California, irvine, 6/96.
 
6
M. Franz. Adaptive compression of syntax trees and iterative dynamic code optimization: Two basic technologies for mobile-object systems. TR 97-04, Dept of Information and Computer Science, University of California, Irvine, 2/97.
 
7
 
8
Christopher W. Fraser and Todd A. Proebsting. Custom Instruction Sets For Code Compression. Unpublished manuscript, http://research'micr~s~ft'c~rn/~t~ddpr~/papers/ pldi2.ps, 10/95.
 
9
Free Software Foundation. GCC - The GNU C Compiler. http://www.gnu.org/software/gcc, 8/13/98.
 
10
Jean-Loup Gailly and Mark Adler. The gzip home page. http://w3.gzip.org. 1/21/99.
 
11
 
12
 
13
A. Lempel and J. Ziv. On the complexity of finite sequences. IEEE Transactions on Information Theory 22(1):75-81, 1/76.
14
 
15
Julian Seward. The bzip2 and libbzip2 home page. http://www.muraroa.demon.co.uk, 2/11/99.
 
16
C. E. Shannon. Prediction and entropy of printed English. Bell System Technical Journal 30:50-64, 1/51.
17
 
18
Rik van de Wiel. Code compaction bibliography. http://www, win.tue.nl/csdpa/rikvdw/bibl.html, 2/3/99.
 
19
 
20
J. Ziv and A. Lempel. Compression of individual sequences via variable-rate coding. IEEE Transactions on Information Theory 24(5):530-536, 9/78.

CITED BY  13
 
 

Collaborative Colleagues:
Christopher W. Fraser: colleagues