research-article

Open Access

Static Identification of Injection Attacks in Java

Authors:
Fausto Spoto, Università di Verona, Italy and JuliaSoft Srl, Verona, Italy
Elisa Burato, JuliaSoft Srl, Verona, Italy
Michael D. Ernst, University of Washington, Seattle, WA, USA
Pietro Ferrara, JuliaSoft Srl, Verona, Italy
Alberto Lovato, Università di Verona, Verona, Italy
Damiano Macedonio, JuliaSoft Srl, Verona, Italy
Ciprian Spiridon, JuliaSoft Srl, Verona, Italy

ACM Transactions on Programming Languages and Systems, Volume 41, Issue 3, Article No. 18, pp. 1–58. https://doi.org/10.1145/3332371

Published: 02 July 2019

Abstract

The most dangerous security-related software errors, according to the OWASP Top Ten 2017 list, affect web applications. They are potential injection attacks that exploit user-provided data to execute undesired operations: database access and updates (SQL injection); generation of malicious web pages (cross-site scripting injection); redirection to user-specified web pages (redirect injection); execution of OS commands and arbitrary scripts (command injection); loading of user-specified, possibly heavy or dangerous classes at run time (reflection injection); access to arbitrary files on the file system (path-traversal); and storing user-provided data into heap regions normally assumed to be shielded from the outside world (trust boundary violation). All these attacks exploit the same weakness: unconstrained propagation of data from sources that the user of a web application controls into sinks whose activation might trigger dangerous operations. Although web applications are written in a variety of languages, Java remains a frequent choice, in particular for banking applications, where security has tangible relevance.
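
The source-to-sink weakness described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class `TaintFlowSketch`, the method `readUserInput` (a stand-in for a real source such as a servlet request parameter), and the `ParameterizedQuery` holder are hypothetical names, and no database is contacted. The sketch only shows how user-controlled text changes the structure of a query when it is concatenated into the string that reaches the sink, and how keeping the query text constant avoids that.

```java
import java.util.Collections;
import java.util.List;

class TaintFlowSketch {

    /** Hypothetical source: stands in for, e.g., a servlet request parameter. */
    public static String readUserInput(String rawParameter) {
        return rawParameter; // tainted: fully controlled by the user
    }

    /** Vulnerable shape: tainted data becomes part of the query text itself. */
    public static String buildQueryUnsafe(String userName) {
        return "SELECT * FROM users WHERE name = '" + userName + "'";
    }

    /** Hypothetical holder that keeps the query text and the user data apart. */
    public static final class ParameterizedQuery {
        public final String sql;
        public final List<String> parameters;

        ParameterizedQuery(String sql, List<String> parameters) {
            this.sql = sql;
            this.parameters = parameters;
        }
    }

    /** Safer shape: the query text is a constant; user data travels separately. */
    public static ParameterizedQuery buildQuerySafe(String userName) {
        return new ParameterizedQuery(
                "SELECT * FROM users WHERE name = ?",
                Collections.singletonList(userName));
    }

    public static void main(String[] args) {
        String tainted = readUserInput("' OR '1'='1");
        // The attacker's quote characters rewrite the WHERE clause.
        System.out.println(buildQueryUnsafe(tainted));
        // The query text is unaffected by the input.
        System.out.println(buildQuerySafe(tainted).sql);
    }
}
```

In real code the safer shape corresponds to binding parameters through a prepared statement rather than building the query by concatenation; a taint analysis reports the first method because data from the source reaches the query text unconstrained.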

This article defines a unified, sound protection mechanism against such attacks, based on the identification of all possible explicit flows of tainted data in Java code. Such flows can be arbitrarily complex, passing through dynamically allocated data structures in the heap. The analysis is based on abstract interpretation and is interprocedural, flow-sensitive, and context-sensitive. Its notion of taint applies to reference (non-primitive) types dynamically allocated in the heap and is object-sensitive and field-sensitive. The analysis works by translating the program into Boolean formulas that model all possible data flows. Its implementation, within the Julia analyzer for Java and Android, found injection security vulnerabilities in the Internet banking service and in the customer relationship management of large Italian banks, as well as in a set of open-source third-party applications. It found the command injection, which is at the origin of the 2017 Equifax data breach, one of the worst data breaches ever. For objective, repeatable results, this article also evaluates the implementation on two open-source security benchmarks: the Juliet Suite and the OWASP Benchmark for the automatic comparison of static analyzers for cybersecurity. We compared this technique against more than 10 other static analyzers, both free and commercial. The result of these experiments is that ours is the only analysis for injection that is sound (up to well-stated limitations such as multithreading and native code) and works on industrial code, and it is also much more precise than other tools.
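
The idea of modeling data flows with Boolean formulas can be sketched in a much simplified form. The code below is not the article's analysis: it is flow-insensitive, ignores the heap, and uses a plain fixpoint loop rather than binary decision diagrams. It assumes a hypothetical toy program (`name = request; query = prefix + name; greeting = "hello"; execute(query)`) and encodes each assignment as an implication "if the right-hand side is tainted, so is the left-hand side". The names `BooleanTaintSketch`, `Implication`, `leastModel`, and `toyProgram` are all invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BooleanTaintSketch {

    /** One clause "from tainted implies to tainted", i.e. (!from || to). */
    public static final class Implication {
        public final String from;
        public final String to;

        Implication(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Least model of the clauses, given the variables tainted at the sources. */
    public static Set<String> leastModel(List<Implication> formula, Set<String> sources) {
        Set<String> tainted = new HashSet<>(sources);
        boolean changed = true;
        while (changed) { // iterate until no clause adds a new tainted variable
            changed = false;
            for (Implication clause : formula) {
                if (tainted.contains(clause.from) && tainted.add(clause.to)) {
                    changed = true;
                }
            }
        }
        return tainted;
    }

    /** Clauses for the hypothetical toy program described in the lead-in. */
    public static List<Implication> toyProgram() {
        List<Implication> formula = new ArrayList<>();
        formula.add(new Implication("request", "name")); // name = request
        formula.add(new Implication("name", "query"));   // query = prefix + name
        formula.add(new Implication("prefix", "query")); // (both operands flow)
        // greeting = "hello" is a constant, so it contributes no clause.
        return formula;
    }

    public static void main(String[] args) {
        Set<String> tainted =
                leastModel(toyProgram(), new HashSet<>(Arrays.asList("request")));
        System.out.println("query tainted at sink: " + tainted.contains("query"));
        System.out.println("greeting tainted: " + tainted.contains("greeting"));
    }
}
```

Under these assumptions, a warning would be issued when the variable passed to the sink (`query` here) belongs to the least model; variables not forced to be tainted by any clause, such as `greeting`, stay untainted.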

Index Terms

Static Identification of Injection Attacks in Java
1. Security and privacy
  1. Formal methods and theory of security
    1. Logic and verification
  2. Software and application security
    1. Software security engineering
    2. Web application security
2. Theory of computation
  1. Semantics and reasoning
    1. Program reasoning
      1. Program analysis
    2. Program semantics
      1. Denotational semantics

Reviews

Reviewer: Mohammad Sadegh Kayhani Pirdehi

A programming language's security principles guarantee robustness and sustainability by detecting and neutralizing tainted objects in program code, which can be a source of vulnerabilities during the execution of a program. This paper covers injection attacks that exploit tainted user input in different scenarios, and proposes mechanisms based on semantic analysis to protect programs.

The paper begins with an illustrative introduction. It explains the destructive and malicious operations of injection attacks in database systems, web page rendering, operating system scripting, and so on. Static and dynamic analyses are presented as the traditional ways to prevent and combat possible threats and vulnerabilities, respectively. The paper's central theme is pre-execution, injection-preventing methods for Java that cover all possible code structures (code blocks, exception handling, recursion, and so on). The authors present "a sound static analysis that identifies if and where a Java bytecode program lets data flow from tainted user input (including servlet requests) into critical operations that might give rise to injections," as well as consider the possibility of false alarms. The paper includes examples of injection in program code and raises the issue of reachability-based taintedness.

The literature review covers the automatic identification of data injection for both dynamic and static analyses, where scalability and precision are considered essential criteria. Also featured is a theoretical framework of access paths that describes reachability-based taintedness via backwards taint analysis and models information via heap analyses.

As the basis for taint analysis, the paper gives a concrete semantics, in denotational form, for Java bytecode. It defines the entities of the language, for example, classes, instance methods, constructors, program states, exceptional states, exception handling, and so on. Next, using binary decision diagrams (BDDs) and an abstract interpretation of the concrete semantics, the possible ways of propagating taintedness are formalized. This theoretical framework serves as an analysis engine that reveals tainted connections in a flow-sensitive static analysis.

The paper extensively discusses the implementation within the Julia static analysis framework. Results on open-source and closed-source applications, true and false alarms, the iteration strategy, and BDD compaction are studied comprehensively. The paper presents a mature, security-oriented view of programming, equipped with a rich theoretical formalism. These features reflect the reliability and power of the provided framework for the static analysis of injection attacks. The paper is strongly recommended for all coders involved in the security of web development systems.

Published in
ACM Transactions on Programming Languages and Systems Volume 41, Issue 3
September 2019
278 pages
ISSN:0164-0925
EISSN:1558-4593
DOI:10.1145/3343145
Editor:
Andrew Myers
Cornell University, USA
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 2 July 2019
- Accepted: 1 May 2019
- Revised: 1 March 2019
- Received: 1 June 2017
Author Tags
SQL injection attack
Static analysis
XSS
abstract interpretation
taint analysis
web application security
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 19
- Total Downloads: 1,913
- Downloads (Last 12 months): 514
- Downloads (Last 6 weeks): 63