Abstract
The most dangerous security-related software errors, according to the OWASP Top Ten 2017 list, affect web applications. They are injection attacks that exploit user-provided data to execute undesired operations: database access and updates (SQL injection); generation of malicious web pages (cross-site scripting injection); redirection to user-specified web pages (redirect injection); execution of OS commands and arbitrary scripts (command injection); loading of user-specified, possibly heavy or dangerous classes at run time (reflection injection); access to arbitrary files on the file system (path traversal); and storing user-provided data into heap regions normally assumed to be shielded from the outside world (trust boundary violation). All these attacks exploit the same weakness: unconstrained propagation of data from sources that the user of a web application controls into sinks whose activation might trigger dangerous operations. Although web applications are written in a variety of languages, Java remains a frequent choice, in particular for banking applications, where security has tangible relevance.
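The source-to-sink weakness described above can be illustrated with a minimal Java sketch. It is not taken from the article: the class `InjectionFlowSketch` and the methods `readUserInput` and `buildQuery` are hypothetical names, and the "source" is simulated by a plain method argument rather than a real servlet request.

```java
public class InjectionFlowSketch {

    // Hypothetical source: stands in for any value the user controls,
    // such as a request parameter (the name is an assumption).
    static String readUserInput(String rawParameter) {
        return rawParameter;
    }

    // Vulnerable pattern: tainted data is concatenated into the SQL text
    // that would later be handed to a sink such as a statement execution.
    static String buildQuery(String owner) {
        return "SELECT * FROM accounts WHERE owner = '" + owner + "'";
    }

    public static void main(String[] args) {
        String benign = buildQuery(readUserInput("alice"));
        String hostile = buildQuery(readUserInput("x' OR '1'='1"));
        System.out.println(benign);
        // The hostile input closes the quoted literal and adds a condition
        // that is always true, changing the structure of the query.
        System.out.println(hostile);
    }
}
```

In this sketch the user-controlled string reaches the query text unchanged, which is exactly the kind of explicit flow a taint analysis would report. The usual remedy in Java is a parameterized query (for instance through `java.sql.PreparedStatement`), which keeps user data out of the SQL text.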
This article defines a unified, sound protection mechanism against such attacks, based on the identification of all possible explicit flows of tainted data in Java code. Such flows can be arbitrarily complex and can pass through dynamically allocated data structures in the heap. The analysis is based on abstract interpretation and is interprocedural, flow-sensitive, and context-sensitive. Its notion of taint applies to reference (non-primitive) types dynamically allocated in the heap and is object-sensitive and field-sensitive. The analysis works by translating the program into Boolean formulas that model all possible data flows. Its implementation, within the Julia analyzer for Java and Android, found injection vulnerabilities in the Internet banking services and customer relationship management systems of large Italian banks, as well as in a set of open-source third-party applications. It also found the command injection at the origin of the 2017 Equifax data breach, one of the worst data breaches ever. For objective, repeatable results, this article also evaluates the implementation on two open-source security benchmarks: the Juliet Suite and the OWASP Benchmark for the automatic comparison of static analyzers for cybersecurity. We compared this technique against more than 10 other static analyzers, both free and commercial. These experiments show that ours is the only analysis for injection that is sound (up to well-stated limitations such as multithreading and native code) and works on industrial code, and that it is also much more precise than the other tools.
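To give an intuition for how data flows can be modeled with Boolean constraints, the following sketch treats each assignment as an implication "if the right-hand side is tainted, then the left-hand side is tainted" and computes the least set of tainted names. It is a deliberately simplified illustration, not the article's encoding: it ignores statement order and calling context, it tracks fields by a plain name such as `user.name`, and all identifiers (`TaintImplicationSketch`, `Implication`, `leastModel`, `example`) are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TaintImplicationSketch {

    // One implication "from is tainted => to is tainted".
    static final class Implication {
        final String from;
        final String to;

        Implication(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    // Least model of the implications: start from the sources and apply
    // every rule until no new name becomes tainted.
    static Set<String> leastModel(Set<String> sources, List<Implication> rules) {
        Set<String> tainted = new HashSet<>(sources);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Implication rule : rules) {
                if (tainted.contains(rule.from) && tainted.add(rule.to)) {
                    changed = true;
                }
            }
        }
        return tainted;
    }

    // Implications for a small, assumed program fragment.
    static Set<String> example() {
        List<Implication> rules = Arrays.asList(
            new Implication("param", "name"),       // name = param
            new Implication("name", "user.name"),   // user.name = name
            new Implication("user.name", "query"),  // query = "..." + user.name
            new Implication("const", "user.role")   // user.role = "guest"
        );
        return leastModel(Collections.singleton("param"), rules);
    }

    public static void main(String[] args) {
        Set<String> tainted = example();
        System.out.println("query tainted: " + tainted.contains("query"));
        System.out.println("user.role tainted: " + tainted.contains("user.role"));
    }
}
```

In this sketch the taint travels through the heap field `user.name` into `query`, while the sibling field `user.role` stays untainted because the two fields are tracked separately. That separation is a rough analogue of field sensitivity; the analysis described in the abstract is considerably richer, since it is also flow-, context-, and object-sensitive.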