ABSTRACT
The art of finding software vulnerabilities has been covered extensively in the literature, and there is a huge body of work on this topic. In contrast, the intentional insertion of exploitable, security-critical bugs has so far received little (public) attention. Wanting more bugs seems counterproductive at first sight, but the comprehensive evaluation of bug-finding techniques suffers from a lack of ground truth and the scarcity of bugs.
In this paper, we propose EvilCoder, a system to automatically find potentially vulnerable source code locations and modify the source code to be actually vulnerable. More specifically, we leverage automated program analysis techniques to find sensitive sinks which match typical bug patterns (e.g., a sensitive API function with a preceding sanity check), and try to find data-flow connections to user-controlled sources. We then transform the source code such that exploitation becomes possible, for example by removing or modifying input sanitization or other types of security checks. Our tool is designed to randomly pick vulnerable locations and possible modifications, such that it can generate numerous different vulnerabilities on the same software corpus. We evaluated our tool on several open-source projects such as libpng and vsftpd, where we found between 22 and 158 unique connected source-sink pairs per project. This translates to hundreds of potentially vulnerable data-flow paths and hundreds of bugs we can insert. We hope to support future bug-finding techniques by supplying freshly generated, bug-ridden test corpora so that such techniques can (finally) be evaluated and compared in a comprehensive and statistically meaningful way.
EvilCoder: automated bug insertion