
EvilCoder: automated bug insertion

Published: 05 December 2016
DOI: 10.1145/2991079.2991103

ABSTRACT

The art of finding software vulnerabilities has been covered extensively in the literature, and there is a huge body of work on the topic. In contrast, the intentional insertion of exploitable, security-critical bugs has so far received little (public) attention. Wanting more bugs seems counterproductive at first sight, but the comprehensive evaluation of bug-finding techniques suffers from a lack of ground truth and a scarcity of bugs.

In this paper, we propose EvilCoder, a system to automatically find potentially vulnerable source code locations and modify the source code to be actually vulnerable. More specifically, we leverage automated program analysis techniques to find sensitive sinks which match typical bug patterns (e.g., a sensitive API function with a preceding sanity check) and try to find data-flow connections to user-controlled sources. We then transform the source code such that exploitation becomes possible, for example by removing or modifying input sanitization or other types of security checks. Our tool is designed to randomly pick vulnerable locations and possible modifications, so that it can generate numerous different vulnerabilities on the same software corpus. We evaluated our tool on several open-source projects such as libpng and vsftpd, where we found between 22 and 158 unique connected source-sink pairs per project. This translates to hundreds of potentially vulnerable data-flow paths and hundreds of bugs we can insert. We hope to support future bug-finding techniques by supplying freshly generated, bug-ridden test corpora so that such techniques can (finally) be evaluated and compared in a comprehensive and statistically meaningful way.
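To make the described transformation concrete, the following is a minimal C sketch of our own; it is not taken from the paper, and all names (handle_packet_safe, handle_packet_buggy, BUF_SIZE) are hypothetical. It illustrates the bug pattern the abstract mentions: a user-controlled source (a length field parsed from a network packet) flows into a sensitive sink (memcpy) behind a preceding sanity check, and a weakened variant of the kind an EvilCoder-style rewrite might produce turns the sink into a stack buffer overflow.

    #include <string.h>
    #include <stdint.h>

    #define BUF_SIZE 64

    /* Before: the length field parsed from the packet (a user-controlled
       source) is fully validated before it reaches the sensitive sink. */
    void handle_packet_safe(const uint8_t *pkt, size_t pkt_len) {
        uint8_t buf[BUF_SIZE];
        if (pkt_len < 2)
            return;
        size_t payload_len = ((size_t)pkt[0] << 8) | pkt[1];  /* attacker-chosen */
        if (payload_len > pkt_len - 2 || payload_len > sizeof(buf))
            return;                            /* sanity check guards the sink */
        memcpy(buf, pkt + 2, payload_len);     /* sensitive sink */
        /* ... process buf ... */
    }

    /* After: an EvilCoder-style modification weakens (rather than deletes)
       the sanity check, so the user-controlled length may exceed the
       destination buffer. */
    void handle_packet_buggy(const uint8_t *pkt, size_t pkt_len) {
        uint8_t buf[BUF_SIZE];
        if (pkt_len < 2)
            return;
        size_t payload_len = ((size_t)pkt[0] << 8) | pkt[1];
        if (payload_len > pkt_len - 2)         /* buffer bound no longer checked */
            return;
        memcpy(buf, pkt + 2, payload_len);     /* overflows buf if payload_len > 64 */
        /* ... process buf ... */
    }

Note that the sketched modification drops only the buffer-size bound while keeping the packet-length bound, so the transformed code still looks plausible at a glance; the abstract's "removing or modifying input sanitization" covers exactly such partial weakenings as well as outright removal of checks.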


Published in

ACSAC '16: Proceedings of the 32nd Annual Conference on Computer Security Applications
December 2016, 614 pages
ISBN: 9781450347716
DOI: 10.1145/2991079
Copyright © 2016 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Overall acceptance rate: 104 of 497 submissions, 21%
