skip to main content
research-article
Open access

Checking type safety of foreign function calls

Published: 01 August 2008 Publication History

Abstract

Foreign function interfaces (FFIs) allow components in different languages to communicate directly with each other. While FFIs are useful, they often require writing tricky low-level code and include little or no static safety checking, thus providing a rich source of hard-to-find programming errors. In this article, we study the problem of enforcing type safety across the OCaml-to-C FFI and the Java Native Interface (JNI). We present O-Saffire and J-Saffire, a pair of multilingual type inference systems that ensure C code that uses these FFIs accesses high-level data safely. Our inference systems use representational types to model C's low-level view of OCaml and Java values, and singleton types to track integers, strings, memory offsets, and type tags through C. J-Saffire, our Java system, uses a polymorphic flow-insensitive, unification-based analysis. Polymorphism is important because it allows us to precisely model user-defined wrapper functions and the more than 200 JNI functions. O-Saffire, our OCaml system, uses a monomorphic flow-sensitive analysis because, while polymorphism is much less important for the OCaml FFI flow-sensitivity is critical to track conditional branches, which are used when pattern matching OCaml data in C. O-Saffire also tracks garbage collection information to ensure that local C pointers to the OCaml heap are registered properly, which is not necessary for the JNI. We have applied O-Saffire and J-Saffire to a set of benchmarks and found many bugs and questionable coding practices. These results suggest that static checking of FFIs can be a valuable tool in writing correct multilingual software.

References

[1]
Auerbach, J., Barton, C., Chu-Carroll, M., and Raghavachari, M. 1999. Mockingbird: Flexible stub compilation from paris of declarations. In Proceedings of the 19th International Conference on Distributed Computing Systems. Austin, TX.]]
[2]
Barrett, D. J. 1998. Polylingual systems: An approach to seamless interoperability. Ph.D. thesis, University of Massachusetts, Amherst, MA.]]
[3]
Beazley, D. M. 1996. SWIG: An easy to use tool for integrating scripting languages with C and C++. USENIX 4th Annual Tcl/Tk Workshop.]]
[4]
Blume, M. 2001. No-longer-foreign: Teaching an ML compiler to speak C “natively”. In Proceedings of the 1st International Workshop on Multilanguage Infrastructure and Interoperability (BABEL'01). Firenze, Italy.]]
[5]
Bubba, J. F., Kaplan, A., and Wileden, J. C. 2001. The Exu approach to safe, transparent and lightweight interoperability. In Proceedings of the 25th International Computer Software and Applications Conference (COMPSAC'01). Chicago, IL.]]
[6]
Cannasse, N. 2004. Ocaml javalib. http://team.motion-twin.com/ncannasse/javaLib/.]]
[7]
Chandra, S. and Reps, T. W. 1999. Physical type checking for C. In Proceedings of the ACM SIGPLAN/SIGSOFT Workshop on Program Analysis for Software Tools and Engineering. Toulouse, France, 66--75.]]
[8]
Christensen, A. S., Møller, A., and Schwartzbach, M. I. 2003. Precise analysis of string expressions. In Proceedings of the 10th International Symposium on Static Analysis. San Diego, CA.]]
[9]
DeLine, R. and Fähndrich, M. 2004. The Fugue protocol checker: Is your software baroque? Tech. rep. MSR-TR-2004-07, Microsoft Research.]]
[10]
Fähndrich, M., Rehof, J., and Das, M. 2000. Scalable context-sensitive flow analysis using instantiation constraints. In Proceedings of the ACM Conference on Programming Language Design and Implementation. Vancouver B.C., Canada.]]
[11]
Felleisen, M. and Hieb, R. 1992. The revised report on the syntactic theories of sequential control and state. Theor. Comput. Sci. 103, 2, 235--271.]]
[12]
Finne, S., Leijen, D., Meijer, E., and Jones, S. P. 1999. Calling hell from heaven and heaven from hell. In Proceedings of the 4th ACM SIGPLAN International Conference on Functional Programming. Paris, France, 114--125.]]
[13]
Fisher, K., Pucella, R., and Reppy, J. 2001. A framework for interoperability. In Proceedings of the 1st International Workshop on Multilanguage Infrastructure and Interoperability (BABEL'01). Firenze, Italy.]]
[14]
Furr, M. and Foster, J. S. 2005a. Checking type safety of foreign function calls. In Proceedings of the ACM Conference on Programming Language Design and Implementation. Chicago, IL. 62--72.]]
[15]
Furr, M. and Foster, J. S. 2005b. Java SE 6 “Mustang” bug 6362203. http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6362203.]]
[16]
Furr, M. and Foster, J. S. 2006a. Checking type safety of foreign function calls. Tech. rep. CS-TR-4845, Computer Science Department, University of Maryland.]]
[17]
Furr, M. and Foster, J. S. 2006b. Polymorphic type inference for the JNI. In Proceedings of the 15th European Symposium on Programming. Vienna, Austria. To appear.]]
[18]
Gould, C., Su, Z., and Devanbu, P. 2004. Static Checking of Dynamically Generated Queries in Database Applications. In Proceedings of the 26th International Conference on Software Engineering (ICSE'04). Edinburgh, 645--654.]]
[19]
Gray, D. N., Hotchkiss, J., LaForge, S., Shalit, A., and Weinberg, T. 1998. Modern languages and Microsoft's component object model. Comm. ACM 41, 5, 55--65.]]
[20]
Grechanik, M., Batory, D., and Perry, D. E. 2004. Design of large-scale polylingual systems. In Proceedings of the 26th International Conference on Software Engineering (ICSE'04). Scot and. 357--366.]]
[21]
Hamilton, J. 1996. Interlanguage object sharing with SOM. In Proceedings of the Usenix Annual Technical Conference. San Diego, CA.]]
[22]
Hamilton, J. 2003. Language integration in the common language runtime. ACM SIGPLAN Notices 38, 2, 19--28.]]
[23]
Henglein, F. 1993. Type inference with polymorphic recursion. ACM Trans. Program. Lang. Syst. 15, 2, 253--289.]]
[24]
Huelsbergen, L. 1996. A portable C interface for standard ML of New Jersey. http://www.smlnj.org//doc/SMLNJ-C/smlnj-c.ps.]]
[25]
Java-Gnome Developers. 2005. Java bindings for the gnome and gtk libraries. http://java-gnome.sourceforge.net.]]
[26]
Jones, S. P. 2001. Tackling the awkward squad: Monadic input/output, concurrency, exceptions, and foreign-language calls in Haskell. In Engineernig Theories of Software Construction, T. Hoare, M. Broy, and R. Steinbruggen, Eds. IOS Press, 47--96.]]
[27]
Leroy, X. 2004. The Objective Caml system. Release 3.08, http://caml.inria.fr/distrib/ocaml-3.08/ocaml-3.08-refman.pdf.]]
[28]
Liang, S. 1999. The Java Native Interface: Programmer's Guide and Specification. Addison-Wesley.]]
[29]
Lindholm, T. and Yellin, F. 1997. The Java Virtual Machine Specification. Addison-Wesley.]]
[30]
Matthews, J. and Findler, R. B. 2007. Operational semantics for multi-language programs. In Proceedings of the 34th Annual ACM Symposium on Principles of Programming Languages. Nice, France, 3--10.]]
[31]
Meijer, E., Perry, N., and van Yzendoorn, A. 2001. Scripting .NET using Mondrian. In Proceedings of the 15th European Conference on Object-Oriented Programming (ECOOP'01). Budapest, Hungary.]]
[32]
Necula, G., McPeak, S., Rahul, S. P., and Weimer, W. 2002. CIL: Intermediate language and tools for analysis and transformation of C programs. In Proceedings of the 11th International Conference on Computer Construction. Grenoble, France.]]
[33]
Necula, G., McPeak, S., and Weimer, W. 2002. CCured: Type-safe retrofitting of legacy code. In Proceedings of the 29th Annual ACM Symposium on Principles of Programming Languages. Portland, OR. 128--139.]]
[34]
Nishimura, S. 1998. Static typing for dynamic messages. In Proceedings of the 25th Annual ACM Symposium on Principles of Programming Languages. San Diego, CA.]]
[35]
Object Management Group 2004. Common object request broker architecture: Core specification, Version 3.0.3. Object Management Group.]]
[36]
Rémy, D. 1989. Typechecking records and variants in a natural extension of ML. In Proceedings of the 16th Annual ACM Symposium on Principles of Programming Languages. Austin, TX. 77--88.]]
[37]
Tan, G., Appel, A. W., Chakradhar, S., Raghunathan, A., Ravi, S., and Wang, D. 2006. Safe java native interface. In Proceedings of the IEEE International Symposium on Secure Software Engineering. Arlington, VA.]]
[38]
Thiemann, P. 2005. Grammar-based analysis of string expressions. In Proceedings of the ACM SIGPLAN International Workshop on Types in Language Design and Implementation. Long Beach, CA.]]
[39]
Trifonov, V. and Shao, Z. 1999. Safe and principled language interoperation. In Proceedings of the 8th European Symposium on Programming. D. Swierstra, Ed. Lecture Notes in Computer Science, vol. 1576. Springer-Verlag, Berlin, Germany, 128--146.]]
[40]
Wright, A. K. and Cartwright, R. 1994. A practical soft type system for scheme. In Proceedings of the Conference on Lisp and Functional Programming. 250--262.]]

Cited By

View all
  • (2025)A Verified Foreign Function Interface between Coq and CProceedings of the ACM on Programming Languages10.1145/37048609:POPL(687-717)Online publication date: 9-Jan-2025
  • (2024)Cross-Language Taint Analysis: Generating Caller-Sensitive Native Code Specification for JavaIEEE Transactions on Software Engineering10.1109/TSE.2024.339225450:6(1518-1533)Online publication date: 27-May-2024
  • (2024)Bridging the language gap: an empirical study of bindings for open source machine learning libraries across software package ecosystemsEmpirical Software Engineering10.1007/s10664-024-10570-530:1Online publication date: 18-Oct-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Programming Languages and Systems
ACM Transactions on Programming Languages and Systems  Volume 30, Issue 4
July 2008
358 pages
ISSN:0164-0925
EISSN:1558-4593
DOI:10.1145/1377492
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 August 2008
Accepted: 01 July 2007
Revised: 01 December 2006
Received: 01 July 2006
Published in TOPLAS Volume 30, Issue 4

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. FFI
  2. Foreign function interface
  3. JNI
  4. Java
  5. Java Native Interface
  6. OCaml
  7. dataflow analysis
  8. flow-sensitive type system
  9. foreign function calls
  10. multilingual type inference
  11. multilingual type system
  12. representational type

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)120
  • Downloads (Last 6 weeks)22
Reflects downloads up to 02 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2025)A Verified Foreign Function Interface between Coq and CProceedings of the ACM on Programming Languages10.1145/37048609:POPL(687-717)Online publication date: 9-Jan-2025
  • (2024)Cross-Language Taint Analysis: Generating Caller-Sensitive Native Code Specification for JavaIEEE Transactions on Software Engineering10.1109/TSE.2024.339225450:6(1518-1533)Online publication date: 27-May-2024
  • (2024)Bridging the language gap: an empirical study of bindings for open source machine learning libraries across software package ecosystemsEmpirical Software Engineering10.1007/s10664-024-10570-530:1Online publication date: 18-Oct-2024
  • (2023)LiSA: A Generic Framework for Multilanguage Static AnalysisChallenges of Software Verification10.1007/978-981-19-9601-6_2(19-42)Online publication date: 22-Jul-2023
  • (2021)Towards a polyglot framework for factorized MLProceedings of the VLDB Endowment10.14778/3476311.347637214:12(2918-2931)Online publication date: 28-Oct-2021
  • (2021)A Multilanguage Static Analysis of Python Programs with Native C ExtensionsStatic Analysis10.1007/978-3-030-88806-0_16(323-345)Online publication date: 17-Oct-2021
  • (2018)The computer for the 21st century: present security & privacy challengesJournal of Internet Services and Applications10.1186/s13174-018-0095-29:1Online publication date: 4-Dec-2018
  • (2017)The Computer for the 21st Century: Security & Privacy Challenges after 25 Years2017 26th International Conference on Computer Communication and Networks (ICCCN)10.1109/ICCCN.2017.8038394(1-10)Online publication date: Jul-2017
  • (2017)MUSEUM: Debugging real-world multilingual programs using mutation analysisInformation and Software Technology10.1016/j.infsof.2016.10.00282(80-95)Online publication date: Feb-2017
  • (2016)Array length inference for C library bindingsProceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering10.1145/2970276.2970310(461-471)Online publication date: 25-Aug-2016
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media