skip to main content
10.1145/2336717.2336725acmconferencesArticle/Chapter ViewAbstractPublication PagespldiConference Proceedingsconference-collections
research-article

Hash-flow taint analysis of higher-order programs

Published:15 June 2012Publication History

ABSTRACT

As web applications have grown in popularity, so have attacks on such applications. Cross-site scripting and injection attacks have become particularly problematic. Both vulnerabilities stem, at their core, from improper sanitization of user input.

We propose static taint analysis, which can verify the absence of unsanitized input errors at compile-time. Unfortunately, precise static analysis of modern scripting languages like Python is challenging: higher-orderness and complex control-flow collide with opaque, dynamic data structures like hash maps and objects. The interdependence of data-flow and control-flow make it hard to attain both soundness and precision.

In this work, we apply abstract interpretation to sound and precise taint-style static analysis of scripting languages. We first define λH, a core calculus of modern scripting languages, with hash maps, dynamic objects, higher-order functions and first class control. Then we derive a framework of k-CFA-like CESK-style abstract machines for statically reasoning about λH, but with hash maps factored into a "Curried Object store." The Curried object store---and shape analysis on this store---allows us to recover field sensitivity, even in the presence of dynamically modified fields. Lastly, atop this framework, we devise a taint-flow analysis, leveraging its field-sensitive, interprocedural and context-sensitive properties to soundly and precisely detect security vulnerabilities, like XSS attacks in web applications.

We have prototyped the analytical framework for Python, and conducted preliminary experiments with web applications. A low rate of false alarms demonstrates the promise of this approach.

References

  1. Pyblosxom. http://pyblosxom.bluesock.org/.Google ScholarGoogle Scholar
  2. Andrews, M. Guest editor's introduction: The state of web security. IEEE Security and Privacy 4 (July 2006), 14--15. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Chang, W., Streiff, B., and Lin, C. Efficient and extensible security enforcement using dynamic data flow analysis. In Conference on Computer and Communications Security (2008). Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Chase, D. R., Wegman, M., and Zadeck, F. K. Analysis of pointers and structures. In PLDI '90: Proceedings of the ACM SIGPLAN 1990 Conference on Programming Language Design and Implementation (New York, NY, USA, 1990), PLDI '90, ACM, pp. 296--310. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Denning, D. E. A lattice model of secure information flow. Commun. ACM 19, 5 (May 1976), 236--243. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Felleisen, M. The Calculi of Lambda-v-CS Conversion: A Syntactic Theory of Control and State in Imperative Higher-Order Programming Languages. PhD thesis, Indiana University, 1987. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Felleisen, M., and Friedman, D. P. Control operators, the SECD-machine, and the lambda-calculus. In 3rd Working Conference on the Formal Description of Programming Concepts (Aug. 1986).Google ScholarGoogle Scholar
  8. Flanagan, C., Sabry, A., Duba, B. F., and Felleisen, M. The essence of compiling with continuations. In PLDI '93: Proceedings of the ACM SIGPLAN 1993 Conference on Programming Language Design and Implementation (New York, NY, USA, June 1993), ACM, pp. 237--247. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Grossbart, Z. Search engine in python, 2007. http://www.zackgrossbart.com/hackito/search-engine-python/.Google ScholarGoogle Scholar
  10. Huang, Y. W., Yu, F., Hang, C., Tsai, C. H., Lee, D. T., and Kuo, S. Y. Securing web application code by static analysis and runtime protection. In WWW '04: Proceedings of the 13th international conference on World Wide Web (New York, NY, USA, 2004), ACM, pp. 40--52. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Jovanovic, N., Kruegel, C., and Kirda, E. Pixy: A static analysis tool for detecting web application vulnerabilities (short paper). In IN 2006 IEEE SYMPOSIUM ON SECURITY AND PRIVACY (2006), pp. 258--263. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Kozlov, D., and Petukhov. Implementation of tainted mode approach to finding security vulnerabilities for python technology.Google ScholarGoogle Scholar
  13. Meyerovich, L. A., and Livshits, B. ConScript: Specifying and enforcing Fine-Grained security policies for JavaScript in the browser. In Proceedings of the 2010 IEEE Symposium on Security and Privacy (Washington, DC, USA, 2010), SP '10, IEEE Computer Society, pp. 481--496. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Might, M. A-normalization: Why and how. http://matt.might.net/articles/a-normalization/.Google ScholarGoogle Scholar
  15. Might, M. You don't understand exceptions, but you should. http://matt.might.net/articles/implementing-exceptions/.Google ScholarGoogle Scholar
  16. Might, M. Environment Analysis of Higher-Order Languages. PhD thesis, Georgia Institute of Technology, June 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Might, M. Shape analysis in the absence of pointers and structure. In VMCAI 2010: International Conference on Verification, Model-Checking and Abstract Interpretation (Madrid, Spain, Jan. 2010), pp. 263--278. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Might, M., and Shivers, O. Improving flow analyses via Gamma-CFA: Abstract garbage collection and counting. In ICFP '06: Proceedings of the 11th ACM SIGPLAN International Conference on Functional Programming (New York, NY, USA, 2006), ACM, pp. 13--25. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Might, M., and Shivers, O. Exploiting reachability and cardinality in higher-order flow analysis. Journal of Functional Programming 18, Special Double Issue 5--6 (2008), 821--864. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Might, M., Smaragdakis, Y., and Van Horn, D. Resolving and exploiting the k-CFA paradox: Illuminating functional vs. object-oriented program analysis. In PLDI '10: Proceedings of the 2010 ACM SIGPLAN Conference on Programming Language Design and Implementation (2010), PLDI '10, ACM Press, pp. 305--315. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Minamide, Y. Static approximation of dynamically generated web pages. In WWW. 2005, pp. 432--441. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Moin moin. http://moinmo.in/.Google ScholarGoogle Scholar
  23. Nguyen-Tuong, A., Guarnieri, S., Greene, D., Shirley, J., and Evans, D. Automatically hardening web applications using precise tainting. In In 20th IFIP International Information Security Conference (2005), pp. 372--382.Google ScholarGoogle ScholarCross RefCross Ref
  24. Owasp. OWASP top 10 for 2010. https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project.Google ScholarGoogle Scholar
  25. Richards, G., Lebresne, S., Burg, B., and Vitek, J. An analysis of the dynamic behavior of JavaScript programs. In Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation (New York, NY, USA, 2010), PLDI '10, ACM, pp. 1--12. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Ruby. http://ruby-doc.org/docs/ProgrammingRuby/.Google ScholarGoogle Scholar
  27. Sabelfeld, A., and Myers, A. Language-Based Information-Flow security, 2003.Google ScholarGoogle Scholar
  28. Salib, M. Static Type Inference with Starkiller. In PyCon DC (2004).Google ScholarGoogle Scholar
  29. Seo, J., and Monica. InvisiType: Object-Oriented security policies. In NDSS (2010).Google ScholarGoogle Scholar
  30. Tripp, O., Pistoia, M., Fink, S. J., Sridharan, M., and Weisman, O. Taj: effective taint analysis of web applications. In Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation (New York, NY, USA, 2009), PLDI '09, ACM, pp. 87--97. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Van Horn, D., and Might, M. Abstracting abstract machines. In Proceedings of the 15th ACM SIGPLAN international conference on Functional programming (New York, NY, USA, 2010), ICFP '10, ACM, pp. 51--62. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Vogt, P., Nentwich, F., Jovanovic, N., Kirda, E., Kruegel, C., and Vigna, G. Cross site scripting prevention with dynamic data tainting and static analysis.Google ScholarGoogle Scholar
  33. Wall, L., Christiansen, T., Schwartz, R. L., and Potter, S. Programming Perl (2nd Edition). Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Wassermann, G., and Su, Z. Sound and precise analysis of web applications for injection vulnerabilities. In PLDI '07: Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation (New York, NY, USA, 2007), ACM, pp. 32--41. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Wassermann, G., and Su, Z. Static detection of cross-site scripting vulnerabilities. In Proceedings of the 30th international conference on Software engineering (Leipzig, Germany, 2008), ACM, pp. 171--180. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Hash-flow taint analysis of higher-order programs

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in
      • Published in

        cover image ACM Conferences
        PLAS '12: Proceedings of the 7th Workshop on Programming Languages and Analysis for Security
        June 2012
        91 pages
        ISBN:9781450314411
        DOI:10.1145/2336717

        Copyright © 2012 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 15 June 2012

        Permissions

        Request permissions about this article.

        Request Permissions

        Check for updates

        Qualifiers

        • research-article

        Acceptance Rates

        Overall Acceptance Rate43of77submissions,56%

        Upcoming Conference

        PLDI '24

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader