DOI: 10.1145/2024569.2024571
research-article

Labeling library functions in stripped binaries

Published: 05 September 2011

ABSTRACT

Binary code presents unique analysis challenges, particularly when debugging information has been stripped from the executable. Among the valuable information lost in stripping are the identities of standard library functions linked into the executable; knowing the identities of such functions can help to optimize automated analysis and is instrumental in understanding program behavior. Library fingerprinting attempts to restore the names of library functions in stripped binaries, using signatures extracted from reference libraries. Existing methods are brittle in the face of variations in the toolchain that produced the reference libraries and do not generalize well to new library versions. We introduce semantic descriptors, high-level representations of library functions that avoid the brittleness of existing approaches. We have extended a tool, unstrip, to apply this technique to fingerprint wrapper functions in the GNU C library. unstrip discovers functions in a stripped binary and outputs a new binary, with meaningful names added to the symbol table. Other tools can leverage these symbols to perform further analysis. We demonstrate that our semantic descriptors generalize well and substantially outperform existing library fingerprinting techniques.
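The contrast the abstract draws between brittle signatures and higher-level descriptors can be illustrated with a toy model. The sketch below is not the paper's algorithm or the unstrip implementation: the abstract does not define what a semantic descriptor contains, so the sketch assumes, purely for illustration, that a wrapper function can be summarized by the system call number it loads before trapping into the kernel. The instruction tuples, the `Descriptor` class, and the `label` helper are all hypothetical; only the Linux x86-64 system call numbers for `read` (0) and `write` (1) are real.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List, Optional, Tuple

# Toy instruction model (hypothetical): (mnemonic, destination, source).
Instruction = Tuple[str, str, str]
Function = List[Instruction]


@dataclass(frozen=True)
class Descriptor:
    """Hypothetical semantic descriptor: the system call a wrapper invokes."""
    syscall_number: int


def exact_signature(func: Function) -> Tuple[Instruction, ...]:
    """Brittle baseline: the literal instruction sequence is the signature."""
    return tuple(func)


def semantic_descriptor(func: Function) -> Optional[Descriptor]:
    """Record the constant held in eax when the first syscall executes."""
    eax: Optional[int] = None
    for mnemonic, dst, src in func:
        if mnemonic == "syscall":
            return Descriptor(eax) if eax is not None else None
        if dst == "eax":
            # Only a constant move yields a known value; other writes clobber it.
            eax = int(src) if mnemonic == "mov" and src.isdigit() else None
    return None


def label(
    func: Function,
    reference: Dict[str, Function],
    key: Callable[[Function], Optional[Hashable]],
) -> Optional[str]:
    """Return the reference name whose key equals func's key, if any."""
    target = key(func)
    if target is None:
        return None
    for name, ref_func in reference.items():
        if key(ref_func) == target:
            return name
    return None


# Reference library as built by one toolchain (Linux x86-64 syscall numbers).
REFERENCE: Dict[str, Function] = {
    "read": [("mov", "eax", "0"), ("syscall", "", ""), ("ret", "", "")],
    "write": [("mov", "eax", "1"), ("syscall", "", ""), ("ret", "", "")],
}

# The same wrapper as a different, hypothetical toolchain might emit it.
STRIPPED_WRITE: Function = [
    ("push", "rbx", ""),
    ("mov", "eax", "1"),
    ("syscall", "", ""),
    ("pop", "rbx", ""),
    ("ret", "", ""),
]
```

In this toy setting, `label(STRIPPED_WRITE, REFERENCE, exact_signature)` finds no match because the extra `push`/`pop` pair changes the instruction sequence, while `label(STRIPPED_WRITE, REFERENCE, semantic_descriptor)` returns `"write"` because the summarized behavior is unchanged. A real tool would have to recover such values from machine code through binary parsing and dataflow analysis rather than from hand-written tuples.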

    Published in

      PASTE '11: Proceedings of the 10th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools
      September 2011
      46 pages
      ISBN: 9781450308496
      DOI: 10.1145/2024569

      Copyright © 2011 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 57 of 159 submissions, 36%
