skip to main content
10.1145/1250734.1250749acmconferencesArticle/Chapter ViewAbstractPublication PagespldiConference Proceedingsconference-collections
Article

Static specification inference using predicate mining

Published: 10 June 2007 Publication History

Abstract

The reliability and correctness of complex software systems can be significantly enhanced through well-defined specifications that dictate the use of various units of abstraction (e.g., modules, or procedures). Often times, however, specifications are either missing, imprecise, or simply too complex to encode within a signature, necessitating specification inference. The process of inferring specifications from complex software systems forms the focus of this paper. We describe a static inference mechanism for identifying the preconditions that must hold whenever a procedure is called. These preconditions may reflect both data flow properties (e.g., whenever p is called, variable x must be non-null) as well as control-flow properties (e.g., every call to p must bepreceded by a call to q). We derive these preconditions using a ninter-procedural path-sensitive dataflow analysis that gathers predicates at each program point. We apply mining techniques to these predicates to make specification inference robust to errors. This technique also allows us to derive higher-level specifications that abstract structural similarities among predicates (e.g., procedure p is called immediately after a conditional test that checks whether some variable v is non-null.) We describe an implementation of these techniques, and validate the effectiveness of the approach on a number of large open-source benchmarks. Experimental results confirm that our mining algorithms are efficient, and that the specifications derived are both precise and useful-the implementation discovers several critical, yet previously, undocumented preconditions for well-tested libraries.

References

[1]
R. Agrawal and R. Srikant. Fast algorithms for mining association rules. In Proc. 20th Int. Conf. Very Large Data Bases, VLDB, pages 487--499, 1994.
[2]
R. Agrawal and R. Srikant. Mining sequential patterns. In Eleventh International Conference on Data Engineering, pages 3--14, 1995.
[3]
R. Alur, P. Cerny, P. Madhusudan, and W. Nam. Synthesis of interface specidications for java classes. In POPL '05: Proceedings of the 32nd ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages, pages 98--109, 2005.
[4]
G. Ammons, R. Bodik, and J. Larus. Mining specifications. In POPL'02: Proceedings of the 29th ACM SIGPLAN--SIGACT Symposium on Principles of Programming Languages, pages 4--16, 2002.
[5]
G. Ammons, D. Mandelin, R. Bodik, and J. Larus. Debugging temporal specifications with concept analysis. In PLDI '03: Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language Design and Implementation, pages 182--195, 2003.
[6]
P. Anderson, T. Reps, and T. Teitelbaum. Design and implementation of a fine-grained software inspection tool. IEEE Trans. on Software Engineering, 29(8):721--733, August 2003.
[7]
T. Ball and S.K. Rajamani. Automatically validating temporal safety properties of interfaces. In SPIN 2001, Workshop on Model Checking of Software, LNCS 2057, pages 103--122, May 2001.
[8]
B. Blanchet. A computationally sound mechanized prover for security protocols. In IEEE Symposium on Security and Privacy, pages 140--154, 2006.
[9]
D. Burdick, M. Calimlim, J. Flannick, J. Gehrke, and T. Yiu. Mafia: A performance study of mining maximal frequent itemsets. In Workshop on Frequent Itemset Mining Implementations (FIMI'03), 2003.
[10]
M. Castro, M. Costa, and T. Harris. Securing software by enforcing data-flow integrity. In OSDI '06: Proceedings of the 7th Usenix Symposium on Operating Systems Design and Implementation, pages 147--160, 2006.
[11]
B. Chin, S. Markstrum, and T. Millstein. Semantic type qualifiers. In PLDI '05: Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 85--95, 2005.
[12]
C. Cortes, K. Fisher, D. Pregibon, and A. Rogers. Hancock: a language for extracting signatures from data streams. In KDD '00: Proceedings of the sixth ACM SIGKDD International Conference on Knowledge Discovery and Data mining, pages 9--17, 2000.
[13]
D. Engler, D. Chen, and A. Chou. Bugs as inconsistent behavior: A general approach to inferring errors in systems code. In SOSP'01: Proceedings of the 18th ACM Symposium on Operating Systems Principles, pages 57--72, 2001.
[14]
M. Ernst, J. Cockrell, W. Griswold, and D. Notkin. Dynamically discovering likely program invariants to support program evolution. IEEE TSE, 27(2):1--25, February 2001.
[15]
J. Fischer, R. Jhala, and R. Majumdar. Joining dataflow with predicates. In ESEC--FSE '05: 10th European Software Engineering Conference and 13th ACM SIGSOFT international symposium on Foundations of Software Engineering, pages 227--236, 2005.
[16]
J. Foster, T. Terauchi, and A. Aiken. Flow-sensitive type qualifiers. In PLDI '02: Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 1--12, 2002.
[17]
M. Furr and J. Foster. Checking type safety of foreign function calls. In PLDI '05: Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 62--72, 2005.
[18]
P. Godefroid, N. Klarslund, and K. Sen. Dart: Directed automated random testing. In PLDI '05: Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, pages 213--223, Chicago, Il, 2005.
[19]
D. Gunopulos, R. Khardon, H. Mannila, S. Saluja, H. Toivonen, and R. Sharma. Discovering all most specific sentences. ACM Transactions on Database Systems, 28:140--174, 2003.
[20]
T. Henzinger, R. Jhala, and R. Majumdar. Permissive interfaces. SIGSOFT Softw. Eng. Notes, 30(5):31--40, 2005.
[21]
G. J. Holzmann. The SPIN Model Checker: Primer and Reference Manual. Addison-Wesley, 2004.
[22]
R. Jhala and R. Majumdar. Path slicing. In PLDI '05: Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, pages 38--47, 2005.
[23]
T. Kremenek, P. Twohey, G. Back, A. Ng, and D. Engler. From uncertainty to belief: Inferring the specification within. In OSDI'06: Proceedings of the 7th Usenix Symposium on Operating Systems Design and Implementation, pages 161--176, 2006.
[24]
P. Lam, V. Kuncak, and M. Rinard. Generalized typestate checking for data structure consistency. In VMCAI '05: Proceedings of 6th International Conference on Verification, Model Checking and Abstract Interpretation, pages 430--447, 2005.
[25]
Z. Li, S. Lu, S. Myagmar, and Y. Zhou. Cp-miner: A tool for finding copy-paste and related bugs in operating system code. In OSDI '04: Proceedings of the 6th Usenix Symposium on Operating Systems Design and Implementation, pages 289--302, 2004.
[26]
Z. Li and Y. Zhou. Pr-miner: Automatically extracting implicit programming rules and detecting violations in large software code. In ESEC-FSE '05: 10th European Software Engineering Conference and 13th ACM SIGSOFT international symposium on Foundations of Software Engineering, pages 306--315, 2005.
[27]
B. Liblit, M. Naik, A. Zheng, A. Aiken, and M. Jordan. Scalable statistical bug isolation. In PLDI '05: Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, pages 15--26, Chicago, Illinois, 2005.
[28]
B. Livshits and T. Zimmermann. Dynamine: Finding common error patterns by mining software revision histories. In ESEC-FSE '05: 10th European Software Engineering Conference and 13th ACM SIGSOFT international symposium on Foundations of Software Engineering, pages 296--305, 2005.
[29]
D. Mandelin, L. Xu, R. Bodik, and D. Kimelman. Jungloid mining: helping to navigate the api jungle. In PLDI '05: Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, pages 48--61, 2005.
[30]
J. Palsberg. Closure analysis in constraint form. ACM Trans. Program. Lang. Syst., 17(1):47--62, 1995.
[31]
M. K. Ramanathan, A. Grama, and S. Jagannathan. Path-sensitive inference of function precedence protocols. In ICSE '07: Proceedings of the 29th International Conference on Software Engineering, 2007.
[32]
M. Rinard, C. Cadar, D. Dumitran, D. M. Roy, T. Leu, and W.S. Beebee. Enhancing server availability and security through failureoblivious computing. In OSDI '04: Proceedings of the 6th Usenix Symposium on Operating Systems Design and Implementation, 2004.
[33]
O. Shivers. Control-Flow Analysis of Higher-Order Languages. PhD thesis, Carnegie Mellon University, May 1991.
[34]
W. Weimer and G. Necula. Mining temporal specifications for error detection. In TACAS '05: Eleventh International Conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 461--476, April 2005.
[35]
J. Whaley, M. Martin, and M. Lam. Automatic extraction of objectoriented component interfaces. In Proceedings of the International Symposium of Software Testing and Analysis, ISSTA, 2002.
[36]
Y. Xie and A. Aiken. Scalable error detection using boolean satisfiability. In POPL '05: Proceedings of the 32nd ACM Symposium on Principles of Programming Languages, pages 351--363, 2005.
[37]
G. Yang. The complexity of mining maximal frequent itemsets and maximal frequent patterns. In KDD '04: Proceedings of the tenth ACM SIGKDD International Conference on Knowledge Discovery and Data mining, pages 344--353, 2004.
[38]
J. Yang, D. Evans, D. Bhardwaj, T. Bhat, and M. Das. Perracotta: Mining temporal api rules from imperfect traces. In ICSE '06: Proceedings of the International Conference on Software Engineering, 2006.
[39]
J. Yang, P. Twohey, D. Engler, and M. Musuvathi. Using model checking to find serious file system errors. In OSDI '04: Proceedings of the 6th Usenix Symposium on Operating Systems Design and Implementation, pages 273--288, San Francisco, CA, 2004.

Cited By

View all
  • (2024)An Empirical Study of API Misuses of Data-Centric LibrariesProceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement10.1145/3674805.3686685(245-256)Online publication date: 24-Oct-2024
  • (2024)API usage templates via structural generalizationJournal of Systems and Software10.1016/j.jss.2024.111974210:COnline publication date: 1-Apr-2024
  • (2023)Lightweight precise automatic extraction of exception preconditions in java methodsEmpirical Software Engineering10.1007/s10664-023-10392-x29:1Online publication date: 26-Dec-2023
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
PLDI '07: Proceedings of the 28th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2007
508 pages
ISBN:9781595936332
DOI:10.1145/1250734
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 42, Issue 6
    Proceedings of the 2007 PLDI conference
    June 2007
    491 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/1273442
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 10 June 2007

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. preconditions
  2. predicate mining
  3. program analysis
  4. specification inference

Qualifiers

  • Article

Conference

PLDI '07
Sponsor:

Acceptance Rates

Overall Acceptance Rate 406 of 2,067 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)48
  • Downloads (Last 6 weeks)4
Reflects downloads up to 03 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2024)An Empirical Study of API Misuses of Data-Centric LibrariesProceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement10.1145/3674805.3686685(245-256)Online publication date: 24-Oct-2024
  • (2024)API usage templates via structural generalizationJournal of Systems and Software10.1016/j.jss.2024.111974210:COnline publication date: 1-Apr-2024
  • (2023)Lightweight precise automatic extraction of exception preconditions in java methodsEmpirical Software Engineering10.1007/s10664-023-10392-x29:1Online publication date: 26-Dec-2023
  • (2022)A Hybrid Approach for Inference between Behavioral Exception API Documentation and Implementations, and Its ApplicationsProceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering10.1145/3551349.3560434(1-13)Online publication date: 10-Oct-2022
  • (2022)Assisting Example-Based API Misuse Detection via Complementary Artificial ExamplesIEEE Transactions on Software Engineering10.1109/TSE.2021.309324648:9(3410-3422)Online publication date: 1-Sep-2022
  • (2022)MisuseHint: A Service for API Misuse Detection Based on Building Knowledge Graph from Documentation and Codebase2022 IEEE International Conference on Web Services (ICWS)10.1109/ICWS55610.2022.00046(246-255)Online publication date: Jul-2022
  • (2022)What Is Thrown? Lightweight Precise Automatic Extraction of Exception Preconditions in Java Methods2022 IEEE International Conference on Software Maintenance and Evolution (ICSME)10.1109/ICSME55016.2022.00038(340-351)Online publication date: Oct-2022
  • (2021)Specification synthesis with constrained Horn clausesProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation10.1145/3453483.3454104(1203-1217)Online publication date: 19-Jun-2021
  • (2021)Mining API Constraints from Library and Client to Detect API Misuses2021 28th Asia-Pacific Software Engineering Conference (APSEC)10.1109/APSEC53868.2021.00024(161-170)Online publication date: Dec-2021
  • (2021)Documentation‐based functional constraint generation for library methodsSoftware Testing, Verification and Reliability10.1002/stvr.178531:8Online publication date: 26-Jul-2021
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media