ABSTRACT
The art of finding software vulnerabilities has been covered extensively in the literature, and there is a huge body of work on this topic. In contrast, the intentional insertion of exploitable, security-critical bugs has so far received little (public) attention. Wanting more bugs seems counterproductive at first sight, but the comprehensive evaluation of bug-finding techniques suffers from a lack of ground truth and the scarcity of bugs.
In this paper, we propose EvilCoder, a system to automatically find potentially vulnerable source code locations and modify the source code to be actually vulnerable. More specifically, we leverage automated program analysis techniques to find sensitive sinks which match typical bug patterns (e.g., a sensitive API function with a preceding sanity check), and try to find data-flow connections to user-controlled sources. We then transform the source code such that exploitation becomes possible, for example by removing or modifying input sanitization or other types of security checks. Our tool is designed to randomly pick vulnerable locations and possible modifications, such that it can generate numerous different vulnerabilities on the same software corpus. We evaluated our tool on several open-source projects such as libpng and vsftpd, where we found between 22 and 158 unique connected source-sink pairs per project. This translates to hundreds of potentially vulnerable data-flow paths and hundreds of bugs we can insert. We hope to support future bug-finding techniques by supplying freshly generated, bug-ridden test corpora so that such techniques can (finally) be evaluated and compared in a comprehensive and statistically meaningful way.
EvilCoder: automated bug insertion