Abstract
Static information-flow analysis (especially taint-analysis) is a key technique in software security, computing where sensitive or untrusted data can propagate in a program. Points-to analysis is a fundamental static program analysis, computing what abstract objects a program expression may point to. In this work, we propose a deep unification of information-flow and points-to analysis. We observe that information-flow analysis is not a mere high-level client of points-to information, but it is indeed identical to points-to analysis on artificial abstract objects that represent different information sources. The very same algorithm can compute, simultaneously, two interlinked but separate results (points-to and information-flow values) with changes only to its initial conditions.
The benefits of such a unification are manifold. We can use existing points-to analysis implementations, with virtually no modification (only minor additions of extra logic for sanitization) to compute information flow concepts, such as value tainting. The algorithmic enhancements of points-to analysis (e.g., different flavors of context sensitivity) can be applied transparently to information-flow analysis. Heavy engineering work on points-to analysis (e.g., handling of the reflection API for Java) applies to information-flow analysis without extra effort. We demonstrate the benefits in a realistic implementation that leverages the Doop points-to analysis framework (including its context-sensitivity and reflection analysis features) to provide an information-flow analysis with excellent precision (over 91%) and recall (over 99%) for standard Java information-flow benchmarks.
The analysis comfortably scales to large, real-world Android applications, analyzing the Facebook Messenger app with more than 55K classes in under 7 hours.
- Steven Arzt and Eric Bodden. 2016. StubDroid: automatic inference of precise data-flow summaries for the android framework. In Proceedings of the 38th International Conference on Software Engineering, ICSE 2016, Austin, TX, USA, May 14-22, 2016. 725–735. DOI: Google ScholarDigital Library
- Steven Arzt, Siegfried Rasthofer, Christian Fritz, Eric Bodden, Alexandre Bartel, Jacques Klein, Yves Le Traon, Damien Octeau, and Patrick McDaniel. 2014. FlowDroid: Precise Context, Flow, Field, Object-sensitive and Lifecycle-aware Taint Analysis for Android Apps. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 259–269. DOI: Google ScholarDigital Library
- Martin Bravenboer and Yannis Smaragdakis. 2009a. Exception Analysis and Points-to Analysis: Better Together. In Proc. of the 18th International Symp. on Software Testing and Analysis (ISSTA ’09). ACM, New York, NY, USA, 1–12. DOI: Google ScholarDigital Library
- Martin Bravenboer and Yannis Smaragdakis. 2009b. Strictly Declarative Specification of Sophisticated Points-to Analyses. In Proc. of the 24th Annual ACM SIGPLAN Conf. on Object Oriented Programming, Systems, Languages, and Applications (OOPSLA ’09). ACM, New York, NY, USA. Google ScholarDigital Library
- Patrick Cousot and Radhia Cousot. 1977. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Conference Record of the Fourth ACM Symposium on Principles of Programming Languages. 238–252. Google ScholarDigital Library
- Patrick Cousot and Radhia Cousot. 1979. Systematic design of program analysis frameworks. In POPL ’79: Proceedings of the 6th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages. 269–282. Google ScholarDigital Library
- Johannes Dahse and Thorsten Holz. 2014. Static Detection of Second-order Vulnerabilities in Web Applications. In Proceedings of the 23rd USENIX Conference on Security Symposium (SEC’14). USENIX Association, Berkeley, CA, USA, 989–1003. http://dl.acm.org/citation.cfm?id=2671225.2671288Google Scholar
- Michael I. Gordon, Deokhwan Kim, Jeff H. Perkins, Limei Gilham, Nguyen Nguyen, and Martin C. Rinard. 2015. Information Flow Analysis of Android Applications in DroidSafe. In 22nd Annual Network and Distributed System Security Symposium, NDSS 2015, San Diego, California, USA, February 8-11, 2015. Google ScholarCross Ref
- David Hauzar and Jan Kofron. 2015. Framework for Static Analysis of PHP Applications. In 29th European Conference on Object-Oriented Programming (ECOOP 2015) (Leibniz International Proceedings in Informatics (LIPIcs)), John Tang Boyland (Ed.), Vol. 37. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 689–711. DOI: Google ScholarCross Ref
- Michael Hind. 2001. Pointer analysis: haven’t we solved this problem yet?. In Proc. of the 3rd ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE ’01). ACM, New York, NY, USA, 54–61. Google ScholarDigital Library
- Wei Huang, Yao Dong, Ana Milanova, and Julian Dolby. 2015. Scalable and Precise Taint Analysis for Android. In Proceedings of the 2015 International Symposium on Software Testing and Analysis (ISSTA 2015). ACM, New York, NY, USA, 106–117. DOI: Google ScholarDigital Library
- Neil Immerman. 1999. Descriptive Complexity. Springer. DOI: Google ScholarCross Ref
- Simon Holm Jensen, Anders Møller, and Peter Thiemann. 2009. Type Analysis for JavaScript. In Proc. 16th International Static Analysis Symposium (SAS) (LNCS), Vol. 5673. Springer-Verlag. Google ScholarDigital Library
- Andrew Johnson, Lucas Waye, Scott Moore, and Stephen Chong. 2015. Exploring and Enforcing Security Guarantees via Program Dependence Graphs. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’15). ACM, New York, NY, USA, 291–302. DOI: Google ScholarDigital Library
- Herbert Jordan, Bernhard Scholz, and Pavle Subotić. 2016. Soufflé: On Synthesis of Program Analyzers. Springer International Publishing, Cham, 422–430. DOI: Google ScholarCross Ref
- Rezwana Karim, Mohan Dhawan, Vinod Ganapathy, and Chung-chieh Shan. 2012. An Analysis of the Mozilla Jetpack Extension Framework. In Proceedings of the 26th European Conference on Object-Oriented Programming (ECOOP’12). Springer-Verlag, Berlin, Heidelberg, 333–355. DOI: Google ScholarDigital Library
- George Kastrinis and Yannis Smaragdakis. 2013. Hybrid Context-Sensitivity for Points-To Analysis. In Proc. of the 2013 ACM SIGPLAN Conf. on Programming Language Design and Implementation (PLDI ’13). ACM, New York, NY, USA. Google ScholarDigital Library
- Chris Lattner, Andrew Lenharth, and Vikram Adve. 2007. Making Context-Sensitive Points-to Analysis with Heap Cloning Practical For The Real World. In Proc. of the 2007 ACM SIGPLAN Conf. on Programming Language Design and Implementation (PLDI ’07). ACM, New York, NY, USA. Google ScholarDigital Library
- Johannes Lerch, Ben Hermann, Eric Bodden, and Mira Mezini. 2014. FlowTwist: Efficient Context-sensitive Inside-out Taint Analysis for Large Codebases. In Proceedings of the 22Nd ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE 2014). ACM, New York, NY, USA, 98–108. DOI: Google ScholarDigital Library
- Ondřej Lhoták. 2006. Program Analysis using Binary Decision Diagrams. Ph.D. Dissertation. McGill University.Google Scholar
- Li Li, Alexandre Bartel, Tegawendé F. Bissyandé, Jacques Klein, Yves Le Traon, Steven Arzt, Siegfried Rasthofer, Eric Bodden, Damien Octeau, and Patrick McDaniel. 2015a. IccTA: Detecting Inter-component Privacy Leaks in Android Apps. In Proceedings of the 37th International Conference on Software Engineering - Volume 1 (ICSE ’15). IEEE Press, Piscataway, NJ, USA, 280–291. http://dl.acm.org/citation.cfm?id=2818754.2818791Google ScholarDigital Library
- Yue Li, Tian Tan, Yulei Sui, and Jingling Xue. 2014. Self-Inferencing Reflection Resolution for Java. In Proc. of the 28th European Conf. on Object-Oriented Programming (ECOOP ’14). Springer, 27–53. Google ScholarDigital Library
- Yue Li, Tian Tan, and Jingling Xue. 2015b. Effective Soundness-Guided Reflection Analysis. In Static Analysis - 22nd International Symposium, SAS 2015, Saint-Malo, France, September 9-11, 2015, Proceedings (Lecture Notes in Computer Science), Sandrine Blazy and Thomas Jensen (Eds.), Vol. 9291. Springer, 162–180. DOI: Google ScholarCross Ref
- Tim Lindholm, Frank Yellin, Gilad Bracha, and Alex Buckley. 2015. The Java Virtual Machine Specification, Java SE 8 Edition.Google Scholar
- Benjamin Livshits. 2006. Improving Software Security with Precise Static and Runtime Analysis. Ph.D. Dissertation. Stanford University.Google Scholar
- Benjamin Livshits, John Whaley, and Monica S. Lam. 2005. Reflection Analysis for Java. In Proc. of the 3rd Asian Symp. on Programming Languages and Systems. Springer, 139–160. DOI: Google ScholarDigital Library
- Ana Milanova, Atanas Rountev, and Barbara G. Ryder. 2002. Parameterized object sensitivity for points-to and side-effect analyses for Java. In Proc. of the 2002 International Symp. on Software Testing and Analysis (ISSTA ’02). ACM, New York, NY, USA, 1–11. DOI: Google ScholarDigital Library
- Ana Milanova, Atanas Rountev, and Barbara G. Ryder. 2005. Parameterized object sensitivity for points-to analysis for Java. ACM Trans. Softw. Eng. Methodol. 14, 1 (2005), 1–41. DOI: Google ScholarDigital Library
- Durica Nikolić and Fausto Spoto. 2012. Definite Expression Aliasing Analysis for Java Bytecode. In Proc. of the 9th International Colloquium on Theoretical Aspects of Computing (ICTAC ’12), Vol. 7521. Springer, 74–89. DOI: Google ScholarDigital Library
- Veselin Raychev, Martin Vechev, and Andreas Krause. 2015. Predicting Program Properties from "Big Code". In Proceedings of the 42Nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’15). ACM, New York, NY, USA, 111–124. DOI: Google ScholarDigital Library
- Veselin Raychev, Martin Vechev, and Eran Yahav. 2014. Code Completion with Statistical Language Models. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 419–428. DOI: Google ScholarDigital Library
- Barbara G. Ryder. 2003. Dimensions of Precision in Reference Analysis of Object-Oriented Programming Languages. In Proc. of the 12th International Conf. on Compiler Construction (CC ’03). Springer, 126–137. DOI: Google ScholarCross Ref
- Max Schäfer, Manu Sridharan, Julian Dolby, and Frank Tip. 2011. Refactoring Java Programs for Flexible Locking. In Proceedings of the 33rd International Conference on Software Engineering (ICSE ’11). ACM, New York, NY, USA, 71–80. DOI: Google ScholarDigital Library
- Micha Sharir and Amir Pnueli. 1981. Two Approaches to Interprocedural Data Flow Analysis. In Program flow analysis: theory and applications, Steven S. Muchnick and Neil D. Jones (Eds.). Prentice-Hall, Inc., Englewood Cliffs, NJ, Chapter 7, 189–233.Google Scholar
- Olin Shivers. 1991. Control-Flow Analysis of Higher-Order Languages. Ph.D. Dissertation. Carnegie Mellon University.Google ScholarDigital Library
- Yannis Smaragdakis and George Balatsouras. 2015. Pointer Analysis. Foundations and TrendsÂő in Programming Languages 2, 1 (2015), 1–69. DOI: Google ScholarDigital Library
- Yannis Smaragdakis, George Balatsouras, George Kastrinis, and Martin Bravenboer. 2015. More Sound Static Handling of Java Reflection. In Proc. of the Asian Symp. on Programming Languages and Systems (APLAS ’15). Springer. Google ScholarCross Ref
- Yannis Smaragdakis, Martin Bravenboer, and Ondřej Lhoták. 2011. Pick Your Contexts Well: Understanding Object-Sensitivity. In Proc. of the 38th ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages (POPL ’11). ACM, New York, NY, USA, 17–30. Google ScholarDigital Library
- Manu Sridharan, Satish Chandra, Julian Dolby, Stephen J. Fink, and Eran Yahav. 2013. Alias Analysis for Object-Oriented Programs. In Aliasing in Object-Oriented Programming. Types, Analysis and Verification, Dave Clarke, James Noble, and Tobias Wrigstad (Eds.). Lecture Notes in Computer Science, Vol. 7850. Springer Berlin Heidelberg, 196–232. DOI: Google ScholarCross Ref
- Omer Tripp, Marco Pistoia, Patrick Cousot, Radhia Cousot, and Salvatore Guarnieri. 2013. ANDROMEDA: Accurate and Scalable Security Analysis of Web Applications. In Proceedings of the 16th International Conference on Fundamental Approaches to Software Engineering (FASE’13). Springer-Verlag, Berlin, Heidelberg, 210–225. DOI: Google ScholarDigital Library
- Omer Tripp, Marco Pistoia, Stephen J. Fink, Manu Sridharan, and Omri Weisman. 2009. TAJ: Effective Taint Analysis of Web Applications. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’09). ACM, New York, NY, USA, 87–97. DOI: Google ScholarDigital Library
- David Van Horn and Matthew Might. 2010. Abstracting Abstract Machines. In Proceedings of the 15th ACM SIGPLAN International Conference on Functional Programming (ICFP ’10). ACM, New York, NY, USA, 51–62. DOI: Google ScholarDigital Library
- Dennis Volpano, Cynthia Irvine, and Geoffrey Smith. 1996. A Sound Type System for Secure Flow Analysis. J. Comput. Secur. 4, 2-3 (Jan. 1996), 167–187. http://dl.acm.org/citation.cfm?id=353629.353648Google ScholarDigital Library
- Jingyue Wu, Yang Tang, Gang Hu, Heming Cui, and Junfeng Yang. 2012. Sound and Precise Analysis of Parallel Programs Through Schedule Specialization. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’12). ACM, New York, NY, USA, 205–216. DOI: Google ScholarDigital Library
Index Terms
- P/Taint: unified points-to and taint analysis
Recommendations
Hybrid Pruning: Towards Precise Pointer and Taint Analysis
Detection of Intrusions and Malware, and Vulnerability AssessmentAbstractPointer and taint analyses are the building blocks for several other static analysis techniques. Unfortunately, these techniques frequently sacrifice precision in favor of scalability by over-approximating program behaviors. Scaling these analyses ...
Bit-Level Taint Analysis
SCAM '14: Proceedings of the 2014 IEEE 14th International Working Conference on Source Code Analysis and ManipulationTaint analysis has a wide variety of applications in software analysis, making the precision of taint analysis an important consideration. Current taint analysis algorithms, including previous work on bit-precise taint analyses, suffer from shortcomings ...
Leveraging historical versions of Android apps for efficient and precise taint analysis
MSR '18: Proceedings of the 15th International Conference on Mining Software RepositoriesToday, computing on various Android devices is pervasive. However, growing security vulnerabilities and attacks in the Android ecosystem constitute various threats through user apps. Taint analysis is a common technique for defending against these ...
Comments