ABSTRACT
As web applications have grown in popularity, so have attacks on such applications. Cross-site scripting and injection attacks have become particularly problematic. Both vulnerabilities stem, at their core, from improper sanitization of user input.
We propose static taint analysis, which can verify the absence of unsanitized input errors at compile-time. Unfortunately, precise static analysis of modern scripting languages like Python is challenging: higher-orderness and complex control-flow collide with opaque, dynamic data structures like hash maps and objects. The interdependence of data-flow and control-flow make it hard to attain both soundness and precision.
In this work, we apply abstract interpretation to sound and precise taint-style static analysis of scripting languages. We first define λH, a core calculus of modern scripting languages, with hash maps, dynamic objects, higher-order functions and first class control. Then we derive a framework of k-CFA-like CESK-style abstract machines for statically reasoning about λH, but with hash maps factored into a "Curried Object store." The Curried object store---and shape analysis on this store---allows us to recover field sensitivity, even in the presence of dynamically modified fields. Lastly, atop this framework, we devise a taint-flow analysis, leveraging its field-sensitive, interprocedural and context-sensitive properties to soundly and precisely detect security vulnerabilities, like XSS attacks in web applications.
We have prototyped the analytical framework for Python, and conducted preliminary experiments with web applications. A low rate of false alarms demonstrates the promise of this approach.
- Pyblosxom. http://pyblosxom.bluesock.org/.Google Scholar
- Andrews, M. Guest editor's introduction: The state of web security. IEEE Security and Privacy 4 (July 2006), 14--15. Google ScholarDigital Library
- Chang, W., Streiff, B., and Lin, C. Efficient and extensible security enforcement using dynamic data flow analysis. In Conference on Computer and Communications Security (2008). Google ScholarDigital Library
- Chase, D. R., Wegman, M., and Zadeck, F. K. Analysis of pointers and structures. In PLDI '90: Proceedings of the ACM SIGPLAN 1990 Conference on Programming Language Design and Implementation (New York, NY, USA, 1990), PLDI '90, ACM, pp. 296--310. Google ScholarDigital Library
- Denning, D. E. A lattice model of secure information flow. Commun. ACM 19, 5 (May 1976), 236--243. Google ScholarDigital Library
- Felleisen, M. The Calculi of Lambda-v-CS Conversion: A Syntactic Theory of Control and State in Imperative Higher-Order Programming Languages. PhD thesis, Indiana University, 1987. Google ScholarDigital Library
- Felleisen, M., and Friedman, D. P. Control operators, the SECD-machine, and the lambda-calculus. In 3rd Working Conference on the Formal Description of Programming Concepts (Aug. 1986).Google Scholar
- Flanagan, C., Sabry, A., Duba, B. F., and Felleisen, M. The essence of compiling with continuations. In PLDI '93: Proceedings of the ACM SIGPLAN 1993 Conference on Programming Language Design and Implementation (New York, NY, USA, June 1993), ACM, pp. 237--247. Google ScholarDigital Library
- Grossbart, Z. Search engine in python, 2007. http://www.zackgrossbart.com/hackito/search-engine-python/.Google Scholar
- Huang, Y. W., Yu, F., Hang, C., Tsai, C. H., Lee, D. T., and Kuo, S. Y. Securing web application code by static analysis and runtime protection. In WWW '04: Proceedings of the 13th international conference on World Wide Web (New York, NY, USA, 2004), ACM, pp. 40--52. Google ScholarDigital Library
- Jovanovic, N., Kruegel, C., and Kirda, E. Pixy: A static analysis tool for detecting web application vulnerabilities (short paper). In IN 2006 IEEE SYMPOSIUM ON SECURITY AND PRIVACY (2006), pp. 258--263. Google ScholarDigital Library
- Kozlov, D., and Petukhov. Implementation of tainted mode approach to finding security vulnerabilities for python technology.Google Scholar
- Meyerovich, L. A., and Livshits, B. ConScript: Specifying and enforcing Fine-Grained security policies for JavaScript in the browser. In Proceedings of the 2010 IEEE Symposium on Security and Privacy (Washington, DC, USA, 2010), SP '10, IEEE Computer Society, pp. 481--496. Google ScholarDigital Library
- Might, M. A-normalization: Why and how. http://matt.might.net/articles/a-normalization/.Google Scholar
- Might, M. You don't understand exceptions, but you should. http://matt.might.net/articles/implementing-exceptions/.Google Scholar
- Might, M. Environment Analysis of Higher-Order Languages. PhD thesis, Georgia Institute of Technology, June 2007. Google ScholarDigital Library
- Might, M. Shape analysis in the absence of pointers and structure. In VMCAI 2010: International Conference on Verification, Model-Checking and Abstract Interpretation (Madrid, Spain, Jan. 2010), pp. 263--278. Google ScholarDigital Library
- Might, M., and Shivers, O. Improving flow analyses via Gamma-CFA: Abstract garbage collection and counting. In ICFP '06: Proceedings of the 11th ACM SIGPLAN International Conference on Functional Programming (New York, NY, USA, 2006), ACM, pp. 13--25. Google ScholarDigital Library
- Might, M., and Shivers, O. Exploiting reachability and cardinality in higher-order flow analysis. Journal of Functional Programming 18, Special Double Issue 5--6 (2008), 821--864. Google ScholarDigital Library
- Might, M., Smaragdakis, Y., and Van Horn, D. Resolving and exploiting the k-CFA paradox: Illuminating functional vs. object-oriented program analysis. In PLDI '10: Proceedings of the 2010 ACM SIGPLAN Conference on Programming Language Design and Implementation (2010), PLDI '10, ACM Press, pp. 305--315. Google ScholarDigital Library
- Minamide, Y. Static approximation of dynamically generated web pages. In WWW. 2005, pp. 432--441. Google ScholarDigital Library
- Moin moin. http://moinmo.in/.Google Scholar
- Nguyen-Tuong, A., Guarnieri, S., Greene, D., Shirley, J., and Evans, D. Automatically hardening web applications using precise tainting. In In 20th IFIP International Information Security Conference (2005), pp. 372--382.Google ScholarCross Ref
- Owasp. OWASP top 10 for 2010. https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project.Google Scholar
- Richards, G., Lebresne, S., Burg, B., and Vitek, J. An analysis of the dynamic behavior of JavaScript programs. In Proceedings of the 2010 ACM SIGPLAN conference on Programming language design and implementation (New York, NY, USA, 2010), PLDI '10, ACM, pp. 1--12. Google ScholarDigital Library
- Ruby. http://ruby-doc.org/docs/ProgrammingRuby/.Google Scholar
- Sabelfeld, A., and Myers, A. Language-Based Information-Flow security, 2003.Google Scholar
- Salib, M. Static Type Inference with Starkiller. In PyCon DC (2004).Google Scholar
- Seo, J., and Monica. InvisiType: Object-Oriented security policies. In NDSS (2010).Google Scholar
- Tripp, O., Pistoia, M., Fink, S. J., Sridharan, M., and Weisman, O. Taj: effective taint analysis of web applications. In Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation (New York, NY, USA, 2009), PLDI '09, ACM, pp. 87--97. Google ScholarDigital Library
- Van Horn, D., and Might, M. Abstracting abstract machines. In Proceedings of the 15th ACM SIGPLAN international conference on Functional programming (New York, NY, USA, 2010), ICFP '10, ACM, pp. 51--62. Google ScholarDigital Library
- Vogt, P., Nentwich, F., Jovanovic, N., Kirda, E., Kruegel, C., and Vigna, G. Cross site scripting prevention with dynamic data tainting and static analysis.Google Scholar
- Wall, L., Christiansen, T., Schwartz, R. L., and Potter, S. Programming Perl (2nd Edition). Google ScholarDigital Library
- Wassermann, G., and Su, Z. Sound and precise analysis of web applications for injection vulnerabilities. In PLDI '07: Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation (New York, NY, USA, 2007), ACM, pp. 32--41. Google ScholarDigital Library
- Wassermann, G., and Su, Z. Static detection of cross-site scripting vulnerabilities. In Proceedings of the 30th international conference on Software engineering (Leipzig, Germany, 2008), ACM, pp. 171--180. Google ScholarDigital Library
Index Terms
- Hash-flow taint analysis of higher-order programs
Recommendations
P/Taint: unified points-to and taint analysis
Static information-flow analysis (especially taint-analysis) is a key technique in software security, computing where sensitive or untrusted data can propagate in a program. Points-to analysis is a fundamental static program analysis, computing what ...
Pushdown control-flow analysis for free
POPL '16Traditional control-flow analysis (CFA) for higher-order languages introduces spurious connections between callers and callees, and different invocations of a function may pollute each other's return flows. Recently, three distinct approaches have been ...
Pushdown control-flow analysis for free
POPL '16: Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming LanguagesTraditional control-flow analysis (CFA) for higher-order languages introduces spurious connections between callers and callees, and different invocations of a function may pollute each other's return flows. Recently, three distinct approaches have been ...
Comments