skip to main content
10.1145/1375581.1375607acmconferencesArticle/Chapter ViewAbstractPublication PagespldiConference Proceedingsconference-collections
research-article

Grammar-based whitebox fuzzing

Published: 07 June 2008 Publication History

Abstract

Whitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. Unfortunately, the current effectiveness of whitebox fuzzing is limited when testing applications with highly-structured inputs, such as compilers and interpreters. These applications process their inputs in stages, such as lexing, parsing and evaluation. Due to the enormous number of control paths in early processing stages, whitebox fuzzing rarely reaches parts of the application beyond those first stages.
In this paper, we study how to enhance whitebox fuzzing of complex structured-input applications with a grammar-based specification of their valid inputs. We present a novel dynamic test generation algorithm where symbolic execution directly generates grammar-based constraints whose satisfiability is checked using a custom grammar-based constraint solver. We have implemented this algorithm and evaluated it on a large security-critical application, the JavaScript interpreter of Internet Explorer 7 (IE7). Results of our experiments show that grammar-based whitebox fuzzing explores deeper program paths and avoids dead-ends due to non-parsable inputs. Compared to regular whitebox fuzzing, grammar-based whitebox fuzzing increased coverage of the code generation module of the IE7 JavaScript interpreter from 53% to 81% while using three times fewer tests.

References

[1]
D. Aitel. The Advantages of Block-Based Protocol Analysis for Security Testing. Immunity Inc., February, 2002.
[2]
S. Artzi, A. Kie?un, J. Dolby, F. Tip, D. Dig, A. Paradkar, and M. D. Ernst. Finding bugs in dynamic Web applications. Technical Report MIT-CSAIL-TR-2008-006, MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, Feb. 2008.
[3]
D. Bird and C. Munoz. Automatic Generation of Random Self-Checking Test Cases. IBM Systems Journal, 22(3):229--245, 1983.
[4]
N. Borisov, D. Brumley, H. Wang, J. Dunagan, P. Joshi, and C. Guo. Generic application-level protocol analyzer and its language. In NDSS, 2007.
[5]
C. Boyapati, S. Khurshid, and D. Marinov. Korat: automated testing based on Java predicates. In ISSTA, 2002.
[6]
C. Cadar, V. Ganesh, P. Pawlowski, D. Dill, and D. Engler. EXE: automatically generating inputs of death. In CCS, 2006.
[7]
K. Claessen and J. Hughes. QuickCheck: A lightweight tool for random testing of Haskell programs. In ICFP, 2000.
[8]
D. Coppit and J. Lian. yagg: an easy-to-use generator for structured test inputs. In ASE, 2005.
[9]
W. Cui, J. Kannan, and H. J. Wang. Discoverer: Automatic protocol reverse engineering from network traces. In USENIX Security Symposium, 2007.
[10]
B. Daniel, D. Dig, K. Garcia, and D. Marinov. Automated testing of refactoring engines. In FSE, 2007.
[11]
M. Emmi, R. Majumdar, and K. Sen. Dynamic test input generation for database applications. In ISSTA, 2007.
[12]
J. E. Forrester and B. P. Miller. An Empirical Study of the Robustness of Windows NT Applications Using Random Testing. In Proceedings of the 4th USENIX Windows System Symposium, Seattle, August 2000.
[13]
P. Godefroid. Compositional Dynamic Test Generation. In POPL, 2007.
[14]
P. Godefroid, N. Klarlund, and K. Sen. DART: Directed automated random testing. In PLDI, 2005.
[15]
P. Godefroid, M. Levin, and D. Molnar. Active property checking. Technical Report MSR-TR-2007-91, Microsoft, 2007.
[16]
P. Godefroid, M. Levin, and D. Molnar. Automated whitebox fuzz testing. In NDSS, 2008.
[17]
K. Hanford. Automatic Generation of Test Cases. IBM Systems Journal, 9(4), 1970.
[18]
J. Hopcroft and J. Ullman. Introduction to automata theory, languages and computation. Addison-Wesley Series in Computer Science, 1979.
[19]
S. Khurshid and D. Marinov. TestEra: Specification-Based Testing of Java Programs Using SAT. In ASE, 2004.
[20]
J. King. Symbolic execution and program testing. Communications of the ACM, 19(7):385--394, 1976.
[21]
R. Lämmel and W. Schulte. Controllable combinatorial coverage in grammar-based testing. In TestCom, 2006.
[22]
R. Majumdar and K. Sen. LATEST: Lazy dynamic test input generation. Technical Report UCB/EECS-2007-36, EECS Department, University of California, Berkeley, 2007.
[23]
R. Majumdar and R.-G. Xu. Directed test generation using symbolic grammars. In ASE, 2007.
[24]
B. Malloy and J. Power. An interpretation of Purdom?s algorithm for automatic generation of test cases. In ICIS, 2001.
[25]
P. Maurer. Generating test data with enhanced context-free grammars. IEEE Software, 7(4), 1990.
[26]
B. McKenzie. Generating strings at random from a context free grammar. Technical Report TR-COSC 10/97, Department of Computer Science, University of Canterbury, 1997.
[27]
D. Melski and T. Reps. Interconvertbility of set constraints and context-free language reachability. In PEPM, 1997.
[28]
B. P. Miller, L. Fredriksen, and B. So. An empirical study of the reliability of UNIX utilities. Communications of the ACM, 33(12), 1990.
[29]
R. C. Moore. Removing left recursion from context-free grammars. In Proceedings of the first conference on North American chapter of the Association for Computational Linguistics, 2000.
[30]
C. Pacheco, S. K. Lahiri, M. D. Ernst, and T. Ball. Feedbackdirected random test generation. In ICSE, 2007.
[31]
R. Pang, V. Paxson, R. Sommer, and L. Peterson. binpac: a yacc for writing application protocol parsers. In IMC, 2006.
[32]
P. Purdom. A sentence generator for testing parsers. BIT Numerical Mathematics, 12(3), 1972.
[33]
D. J. Salomon and G. V. Cormack. Scannerless NSLR(1) parsing of programming languages. In PLDI, 1989.
[34]
K. Sen, D. Marinov, and G. Agha. CUTE: a concolic unit testing engine for C. In FSE, 2005.
[35]
E. Sirer and B. Bershad. Using production grammars in software testing. In DSL, 1999.
[36]
K. Sullivan, J. Yang, D. Coppit, S. Khurshid, and D. Jackson. Software assurance by bounded exhaustive testing. In ISSTA, 2004.
[37]
M. Sutton, A. Greene, and P. Amini. Fuzzing: Brute Force Vulnerability Discovery. Addison-Wesley, 2007.
[38]
M. Utting, A. Pretschner, and B. Legeard. A Taxonomy of Model-Based Testing. Department of Computer Science, The University of Waikato, New Zealand, Tech. Rep, 4, 2006.
[39]
G. Wassermann and Z. Su. Sound and precise analysis of Web applications for injection vulnerabilities. In PLDI, 2007.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
PLDI '08: Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2008
396 pages
ISBN:9781595938602
DOI:10.1145/1375581
  • General Chair:
  • Rajiv Gupta,
  • Program Chair:
  • Saman Amarasinghe
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 43, Issue 6
    PLDI '08
    June 2008
    382 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/1379022
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 June 2008

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. automatic test generation
  2. grammars
  3. program verification
  4. software testing

Qualifiers

  • Research-article

Conference

PLDI '08
Sponsor:

Acceptance Rates

Overall Acceptance Rate 406 of 2,067 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)220
  • Downloads (Last 6 weeks)16
Reflects downloads up to 18 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2025)LLM-enhanced evolutionary test generation for untyped languagesAutomated Software Engineering10.1007/s10515-025-00496-732:1Online publication date: 17-Feb-2025
  • (2024)MESSIProceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation10.5555/3691825.3691881(1009-1023)Online publication date: 16-Apr-2024
  • (2024)A Fuzzing Tool Based on Automated Grammar DetectionSoftware10.3390/software30400283:4(569-586)Online publication date: 14-Dec-2024
  • (2024)A Novel Network Protocol Syntax Extracting Method for Grammar-Based FuzzingApplied Sciences10.3390/app1406240914:6(2409)Online publication date: 13-Mar-2024
  • (2024)Runtime Tests for Memory Error Handlers of In-Memory Key Value Stores Using MemFIIEICE Transactions on Information and Systems10.1587/transinf.2024EDP7019E107.D:11(1408-1421)Online publication date: 1-Nov-2024
  • (2024)Incremental Context-free Grammar Inference in Black Box SettingsProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695494(1171-1182)Online publication date: 27-Oct-2024
  • (2024)FuSeBMC v4: Improving Code Coverage with Smart Seeds via BMC, Fuzzing and Static AnalysisFormal Aspects of Computing10.1145/366533736:2(1-25)Online publication date: 20-May-2024
  • (2024)Tacoma: Enhanced Browser Fuzzing with Fine-Grained Semantic AlignmentProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680351(1174-1185)Online publication date: 11-Sep-2024
  • (2024)AsFuzzer: Differential Testing of Assemblers with Error-Driven Grammar InferenceProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680345(1099-1111)Online publication date: 11-Sep-2024
  • (2024)UPBEAT: Test Input Checks of Q# Quantum LibrariesProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3652120(186-198)Online publication date: 11-Sep-2024
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media