skip to main content
research-article
Open Access
Artifacts Evaluated & Functional

P/Taint: unified points-to and taint analysis

Published:12 October 2017Publication History
Skip Abstract Section

Abstract

Static information-flow analysis (especially taint-analysis) is a key technique in software security, computing where sensitive or untrusted data can propagate in a program. Points-to analysis is a fundamental static program analysis, computing what abstract objects a program expression may point to. In this work, we propose a deep unification of information-flow and points-to analysis. We observe that information-flow analysis is not a mere high-level client of points-to information, but it is indeed identical to points-to analysis on artificial abstract objects that represent different information sources. The very same algorithm can compute, simultaneously, two interlinked but separate results (points-to and information-flow values) with changes only to its initial conditions.

The benefits of such a unification are manifold. We can use existing points-to analysis implementations, with virtually no modification (only minor additions of extra logic for sanitization) to compute information flow concepts, such as value tainting. The algorithmic enhancements of points-to analysis (e.g., different flavors of context sensitivity) can be applied transparently to information-flow analysis. Heavy engineering work on points-to analysis (e.g., handling of the reflection API for Java) applies to information-flow analysis without extra effort. We demonstrate the benefits in a realistic implementation that leverages the Doop points-to analysis framework (including its context-sensitivity and reflection analysis features) to provide an information-flow analysis with excellent precision (over 91%) and recall (over 99%) for standard Java information-flow benchmarks.

The analysis comfortably scales to large, real-world Android applications, analyzing the Facebook Messenger app with more than 55K classes in under 7 hours.

References

  1. Steven Arzt and Eric Bodden. 2016. StubDroid: automatic inference of precise data-flow summaries for the android framework. In Proceedings of the 38th International Conference on Software Engineering, ICSE 2016, Austin, TX, USA, May 14-22, 2016. 725–735. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Steven Arzt, Siegfried Rasthofer, Christian Fritz, Eric Bodden, Alexandre Bartel, Jacques Klein, Yves Le Traon, Damien Octeau, and Patrick McDaniel. 2014. FlowDroid: Precise Context, Flow, Field, Object-sensitive and Lifecycle-aware Taint Analysis for Android Apps. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 259–269. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Martin Bravenboer and Yannis Smaragdakis. 2009a. Exception Analysis and Points-to Analysis: Better Together. In Proc. of the 18th International Symp. on Software Testing and Analysis (ISSTA ’09). ACM, New York, NY, USA, 1–12. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Martin Bravenboer and Yannis Smaragdakis. 2009b. Strictly Declarative Specification of Sophisticated Points-to Analyses. In Proc. of the 24th Annual ACM SIGPLAN Conf. on Object Oriented Programming, Systems, Languages, and Applications (OOPSLA ’09). ACM, New York, NY, USA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Patrick Cousot and Radhia Cousot. 1977. Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints. In Conference Record of the Fourth ACM Symposium on Principles of Programming Languages. 238–252. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Patrick Cousot and Radhia Cousot. 1979. Systematic design of program analysis frameworks. In POPL ’79: Proceedings of the 6th ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages. 269–282. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Johannes Dahse and Thorsten Holz. 2014. Static Detection of Second-order Vulnerabilities in Web Applications. In Proceedings of the 23rd USENIX Conference on Security Symposium (SEC’14). USENIX Association, Berkeley, CA, USA, 989–1003. http://dl.acm.org/citation.cfm?id=2671225.2671288Google ScholarGoogle Scholar
  8. Michael I. Gordon, Deokhwan Kim, Jeff H. Perkins, Limei Gilham, Nguyen Nguyen, and Martin C. Rinard. 2015. Information Flow Analysis of Android Applications in DroidSafe. In 22nd Annual Network and Distributed System Security Symposium, NDSS 2015, San Diego, California, USA, February 8-11, 2015. Google ScholarGoogle ScholarCross RefCross Ref
  9. David Hauzar and Jan Kofron. 2015. Framework for Static Analysis of PHP Applications. In 29th European Conference on Object-Oriented Programming (ECOOP 2015) (Leibniz International Proceedings in Informatics (LIPIcs)), John Tang Boyland (Ed.), Vol. 37. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 689–711. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  10. Michael Hind. 2001. Pointer analysis: haven’t we solved this problem yet?. In Proc. of the 3rd ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE ’01). ACM, New York, NY, USA, 54–61. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Wei Huang, Yao Dong, Ana Milanova, and Julian Dolby. 2015. Scalable and Precise Taint Analysis for Android. In Proceedings of the 2015 International Symposium on Software Testing and Analysis (ISSTA 2015). ACM, New York, NY, USA, 106–117. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Neil Immerman. 1999. Descriptive Complexity. Springer. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  13. Simon Holm Jensen, Anders Møller, and Peter Thiemann. 2009. Type Analysis for JavaScript. In Proc. 16th International Static Analysis Symposium (SAS) (LNCS), Vol. 5673. Springer-Verlag. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Andrew Johnson, Lucas Waye, Scott Moore, and Stephen Chong. 2015. Exploring and Enforcing Security Guarantees via Program Dependence Graphs. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’15). ACM, New York, NY, USA, 291–302. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Herbert Jordan, Bernhard Scholz, and Pavle Subotić. 2016. Soufflé: On Synthesis of Program Analyzers. Springer International Publishing, Cham, 422–430. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  16. Rezwana Karim, Mohan Dhawan, Vinod Ganapathy, and Chung-chieh Shan. 2012. An Analysis of the Mozilla Jetpack Extension Framework. In Proceedings of the 26th European Conference on Object-Oriented Programming (ECOOP’12). Springer-Verlag, Berlin, Heidelberg, 333–355. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. George Kastrinis and Yannis Smaragdakis. 2013. Hybrid Context-Sensitivity for Points-To Analysis. In Proc. of the 2013 ACM SIGPLAN Conf. on Programming Language Design and Implementation (PLDI ’13). ACM, New York, NY, USA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Chris Lattner, Andrew Lenharth, and Vikram Adve. 2007. Making Context-Sensitive Points-to Analysis with Heap Cloning Practical For The Real World. In Proc. of the 2007 ACM SIGPLAN Conf. on Programming Language Design and Implementation (PLDI ’07). ACM, New York, NY, USA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Johannes Lerch, Ben Hermann, Eric Bodden, and Mira Mezini. 2014. FlowTwist: Efficient Context-sensitive Inside-out Taint Analysis for Large Codebases. In Proceedings of the 22Nd ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE 2014). ACM, New York, NY, USA, 98–108. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Ondřej Lhoták. 2006. Program Analysis using Binary Decision Diagrams. Ph.D. Dissertation. McGill University.Google ScholarGoogle Scholar
  21. Li Li, Alexandre Bartel, Tegawendé F. Bissyandé, Jacques Klein, Yves Le Traon, Steven Arzt, Siegfried Rasthofer, Eric Bodden, Damien Octeau, and Patrick McDaniel. 2015a. IccTA: Detecting Inter-component Privacy Leaks in Android Apps. In Proceedings of the 37th International Conference on Software Engineering - Volume 1 (ICSE ’15). IEEE Press, Piscataway, NJ, USA, 280–291. http://dl.acm.org/citation.cfm?id=2818754.2818791Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Yue Li, Tian Tan, Yulei Sui, and Jingling Xue. 2014. Self-Inferencing Reflection Resolution for Java. In Proc. of the 28th European Conf. on Object-Oriented Programming (ECOOP ’14). Springer, 27–53. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Yue Li, Tian Tan, and Jingling Xue. 2015b. Effective Soundness-Guided Reflection Analysis. In Static Analysis - 22nd International Symposium, SAS 2015, Saint-Malo, France, September 9-11, 2015, Proceedings (Lecture Notes in Computer Science), Sandrine Blazy and Thomas Jensen (Eds.), Vol. 9291. Springer, 162–180. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  24. Tim Lindholm, Frank Yellin, Gilad Bracha, and Alex Buckley. 2015. The Java Virtual Machine Specification, Java SE 8 Edition.Google ScholarGoogle Scholar
  25. Benjamin Livshits. 2006. Improving Software Security with Precise Static and Runtime Analysis. Ph.D. Dissertation. Stanford University.Google ScholarGoogle Scholar
  26. Benjamin Livshits, John Whaley, and Monica S. Lam. 2005. Reflection Analysis for Java. In Proc. of the 3rd Asian Symp. on Programming Languages and Systems. Springer, 139–160. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Ana Milanova, Atanas Rountev, and Barbara G. Ryder. 2002. Parameterized object sensitivity for points-to and side-effect analyses for Java. In Proc. of the 2002 International Symp. on Software Testing and Analysis (ISSTA ’02). ACM, New York, NY, USA, 1–11. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Ana Milanova, Atanas Rountev, and Barbara G. Ryder. 2005. Parameterized object sensitivity for points-to analysis for Java. ACM Trans. Softw. Eng. Methodol. 14, 1 (2005), 1–41. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Durica Nikolić and Fausto Spoto. 2012. Definite Expression Aliasing Analysis for Java Bytecode. In Proc. of the 9th International Colloquium on Theoretical Aspects of Computing (ICTAC ’12), Vol. 7521. Springer, 74–89. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Veselin Raychev, Martin Vechev, and Andreas Krause. 2015. Predicting Program Properties from "Big Code". In Proceedings of the 42Nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’15). ACM, New York, NY, USA, 111–124. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Veselin Raychev, Martin Vechev, and Eran Yahav. 2014. Code Completion with Statistical Language Models. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA, 419–428. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Barbara G. Ryder. 2003. Dimensions of Precision in Reference Analysis of Object-Oriented Programming Languages. In Proc. of the 12th International Conf. on Compiler Construction (CC ’03). Springer, 126–137. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  33. Max Schäfer, Manu Sridharan, Julian Dolby, and Frank Tip. 2011. Refactoring Java Programs for Flexible Locking. In Proceedings of the 33rd International Conference on Software Engineering (ICSE ’11). ACM, New York, NY, USA, 71–80. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Micha Sharir and Amir Pnueli. 1981. Two Approaches to Interprocedural Data Flow Analysis. In Program flow analysis: theory and applications, Steven S. Muchnick and Neil D. Jones (Eds.). Prentice-Hall, Inc., Englewood Cliffs, NJ, Chapter 7, 189–233.Google ScholarGoogle Scholar
  35. Olin Shivers. 1991. Control-Flow Analysis of Higher-Order Languages. Ph.D. Dissertation. Carnegie Mellon University.Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Yannis Smaragdakis and George Balatsouras. 2015. Pointer Analysis. Foundations and TrendsÂő in Programming Languages 2, 1 (2015), 1–69. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Yannis Smaragdakis, George Balatsouras, George Kastrinis, and Martin Bravenboer. 2015. More Sound Static Handling of Java Reflection. In Proc. of the Asian Symp. on Programming Languages and Systems (APLAS ’15). Springer. Google ScholarGoogle ScholarCross RefCross Ref
  38. Yannis Smaragdakis, Martin Bravenboer, and Ondřej Lhoták. 2011. Pick Your Contexts Well: Understanding Object-Sensitivity. In Proc. of the 38th ACM SIGPLAN-SIGACT Symp. on Principles of Programming Languages (POPL ’11). ACM, New York, NY, USA, 17–30. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Manu Sridharan, Satish Chandra, Julian Dolby, Stephen J. Fink, and Eran Yahav. 2013. Alias Analysis for Object-Oriented Programs. In Aliasing in Object-Oriented Programming. Types, Analysis and Verification, Dave Clarke, James Noble, and Tobias Wrigstad (Eds.). Lecture Notes in Computer Science, Vol. 7850. Springer Berlin Heidelberg, 196–232. DOI: Google ScholarGoogle ScholarCross RefCross Ref
  40. Omer Tripp, Marco Pistoia, Patrick Cousot, Radhia Cousot, and Salvatore Guarnieri. 2013. ANDROMEDA: Accurate and Scalable Security Analysis of Web Applications. In Proceedings of the 16th International Conference on Fundamental Approaches to Software Engineering (FASE’13). Springer-Verlag, Berlin, Heidelberg, 210–225. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Omer Tripp, Marco Pistoia, Stephen J. Fink, Manu Sridharan, and Omri Weisman. 2009. TAJ: Effective Taint Analysis of Web Applications. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’09). ACM, New York, NY, USA, 87–97. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. David Van Horn and Matthew Might. 2010. Abstracting Abstract Machines. In Proceedings of the 15th ACM SIGPLAN International Conference on Functional Programming (ICFP ’10). ACM, New York, NY, USA, 51–62. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Dennis Volpano, Cynthia Irvine, and Geoffrey Smith. 1996. A Sound Type System for Secure Flow Analysis. J. Comput. Secur. 4, 2-3 (Jan. 1996), 167–187. http://dl.acm.org/citation.cfm?id=353629.353648Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Jingyue Wu, Yang Tang, Gang Hu, Heming Cui, and Junfeng Yang. 2012. Sound and Precise Analysis of Parallel Programs Through Schedule Specialization. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’12). ACM, New York, NY, USA, 205–216. DOI: Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. P/Taint: unified points-to and taint analysis

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image Proceedings of the ACM on Programming Languages
      Proceedings of the ACM on Programming Languages  Volume 1, Issue OOPSLA
      October 2017
      1786 pages
      EISSN:2475-1421
      DOI:10.1145/3152284
      Issue’s Table of Contents

      Copyright © 2017 Owner/Author

      This work is licensed under a Creative Commons Attribution International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 12 October 2017
      Published in pacmpl Volume 1, Issue OOPSLA

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader