skip to main content
10.1145/1273463.1273486acmconferencesArticle/Chapter ViewAbstractPublication PagesisstaConference Proceedingsconference-collections
Article

Finding what's not there: a new approach to revealing neglected conditions in software

Published: 09 July 2007 Publication History

Abstract

Neglected conditions are an important but difficult-to-find class of software defects. This paper presents a novel approach to revealing neglected conditions that integrates static program analysis and advanced data mining techniques to discover implicit conditional rules in a code base and to discover rule violations that indicate neglected conditions. The approach requires the user to indicate minimal constraints on the context of the rules to be sought, rather than specific rule templates. To permit this generality, rules are modeled as graph minors of program dependence graphs, and both frequent itemset mining and frequent subgraph mining algorithms are employed to identify candidate rules. We report the results of an empirical evaluation of the approach in which it was used to discover conditional rules and neglected conditions in ~25,000 lines of source code.

References

[1]
Apache.org. Apache HTTP Server Project. www.apache.org.
[2]
Budd, T. A., DeMillo, R. A., Lipton, R. J., and Sayward, F. G. Theoretical and empirical studies on using program mutation to test the functional correctness of programs. ACM Symp. on Principles of Prog. Lang., 7, pp. 220--233, 1980.
[3]
Burdick, D., Calimlim, M. and Gehrke, J. MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases, 17th Intl. Conf. on Data Eng. (ICDE '01), 2001.
[4]
Chelf, B., Engler, D., and Hallem, S. How to write system-specific, static checkers in metal. 2002 ACM Workshop on Program Analysis for Software Tools and Engineering (Charleston, SC, Nov. 18-19, 2002), 51--60.
[5]
Chockler, H., Kupferman, O., and Vardi, M. Coverage metrics for formal verification. Lecture Notes in Computer Science 2860, Springer, 2003, pp. 111--125.
[6]
Dunsmore, A., Roper, M., and Wood, M. Practical code inspection techniques for object-oriented systems: an experimental comparison. IEEE Softw 20, 4 (July 03), 21--29.
[7]
Engler, D., Chen, D. Y., Hallem, S., Chou, A., and Chelf, B. Bugs as deviant behavior: a general approach to inferring errors in systems code. 18th ACM Symp. on Operating Systems Principles (Banff, Canada, Oct. 2001), 57--72.
[8]
Engler, D. Meta-level compilation. metacomp.stanford.edu.
[9]
Fagan, M. E. Design and code inspections to reduce errors in program development. IBM Syst. J. 15, 3, 1976, 258--287.
[10]
Fatta, G., Leue, S., and Stegantova, E. Discriminative pattern mining in software fault detection. 3rd Intl. Wkshp. on Software Quality Assurance (Portland, OR, Nov. 2006).
[11]
Ferrante, J., Ottenstein, K. J., and Warren, J. D. The program dependence graph and its use in optimization. ACM Trans. Prog. Lang. Syst. 9, 3 (July 1987), 319--349.
[12]
Festa, P. Study says buffer overflow is most common security bug. C|Net News.com. news.com.com/2100-1001-233483.html.
[13]
Garey, M. R. and Johnson, D. S. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979
[14]
Grammatech. CodeSurfer. www.grammatech.com/products/codesurfer/overview.html.
[15]
Grammatech. Dependence graphs and program slicing. www.grammatech.com/research/slicing/slicingWhitepaper
[16]
Holder, L. B., Cook, D. J., and Djoko, S. Substructure discovery in the SUBDUE system. AAAI Workshop on Knowledge Discovery in Databases, 1994, 169--180.
[17]
Howard, M. and LeBlanc, D. Writing Secure Code, 2nd edition. Microsoft Press, 2003.
[18]
Howden, W. Reliability of the path analysis testing strategy. IEEE Trans. on Softw. Eng., SE-2, Sept. 1976, pp. 208--215.
[19]
Huan, J., Wang, W., and Prins, J. Efficient mining of frequent subgraphs in the presence of isomorphism. 3rd IEEE Int. Conf. on Data Mining (Piscataway, NJ, 2003), 549--552.
[20]
Huan, J., Wang, W., Prins, J., and Yang, J. SPIN: mining maximal frequent subgraphs from graph database. KDD 2004 (2004, Seattle, WA).
[21]
IBM. Orthogonal defect classification. Center for Software Eng., www.research.ibm.com/softeng/ODC/ODC.HTM.
[22]
Krinke, J. Identifying similar code with program dependence graphs. 8th Working Conf. on Reverse Engineering (Stuttgart, Germany, 2001).
[23]
Kuramochi, M. and Karypis, G. Finding frequent patterns in a large sparse graph. Technical Report #03-038, Dept. of Computer Sci. and Eng., Univ. of Minnesota.
[24]
Kuramochi, M. and Karypis, G. GREW: A Scalable Frequent Subgraph Discovery Algorithm. Technical Report #TR 04-024, CSE Dept., Univ. of Minnesota.
[25]
Li, Z. and Chou, Y. PR-Miner: Automatically Extracting Implicit Programming Rules and Detecting Violations in Large Software Code. ESEC-FSE '05 (Lisbon, Portugal, Sept. 05-09, 2005).
[26]
Li, Z., Lu, S., and Myagmar, S. CP-Miner: Finding Copy-Paste and Related Bugs in Large-Scale Software Code. IEEE Trans. Softw. Eng. 32, 3 (Mar. 2006), 176--192.
[27]
Liu, C., Yan X., and Han, J: GPLAG: Detection of Software Plagiarism by Program Dependence Graph Analysis. KDD '06 (Aug. 20-23, 2006, Philadelphia, PA, USA.).
[28]
Livshits, B. and Zimmermann, T. DynaMine: finding common error patterns by mining software revision histories. ESEC-FSE '05 (Lisbon, Portugal, Sept. 2005), 296--305.
[29]
McClure, S., Scambray, J., and Kurtz, G. Hacking Exposed: Network Security Secrets and Solutions, 5th edition, McGraw Hill, 2005.
[30]
Meyers, G. J. A controlled experiment in program testing and code walkthroughs/inspections. CACM 21, 9 (Sept. 1978), 760--768
[31]
Meyers, G. J. The Art of Software Testing. Wiley, 1979.
[32]
Mozilla.org. Bugzilla. https://bugzilla.mozilla.org/.
[33]
NIST. National Vulnerability Database. http://nvd.nist.gov/.
[34]
Raghavan, S., Rohana, R., Leon, D., Podgurski, A., and Augustine, V. Dex: A Semantic-Graph Differencing Tool for Studying Changes in Large Code Bases. 20th IEEE Intl. Conf. on Software Maintenance (Sept. 2004), 188--197.
[35]
Renieres, M. and Reiss, S. P. Fault localization with nearest neighbor queries. 18th International Conference on Automated Software Engineering (Oct. 2003), pp. 30--39.
[36]
Richardson, D. J. and Clarke, L. A. Partition analysis: a method combining testing and verification. IEEE Trans. on Softw. Eng., SE-11, 12 (Dec. 1985), 1477--1490.
[37]
Robertson, N. and Seymour, P. D. Graph Minors. I. Excluding a forest. Journal of Combinatorial Theory, Series B 35 (1), 1983, 39--61.
[38]
Sommerville, L., Sawyer, P., and Viller, S. Viewpoints for requirements elicitation: a practical approach. 3rd IEEE Intl. Conf. on Requirements Eng., April 1998.
[39]
Thayer, T. A., Lipow, M., and Nelson, E. C. Software reliability study. TRW-SS-76-03, Redondo Beach, California, TRW, March 1976.
[40]
Thomas, L., Valluri, S., and Karlapalem, K. MARGIN: Maximal frequent subgraph mining. 6th IEEE Intl. Conf. on Data Mining (2006).
[41]
Wilander, J. and Fak, P. Rule Matching Security Properties of Code using Dependence Graphs. 1st Intl. Workshop on Code Based Software Security Assessments (Pittsburgh, PA, November 2005).
[42]
Williams C. C. and Hollingsworth, J. K. Automatic mining of Source Code Repositories to Improve Bug Finding Techniques. IEEE Trans. of Soft. Eng. 31 (6), June 2005, 466--480.
[43]
Wing, J. M. and Vaziri-Farahani, M. 1995. Model checking software systems: a case study. 3rd ACM Symp. on Found. of Softw. Eng. (Washington, D.C., Oct. 1995), 128--139.
[44]
Yan, X. and Han, J. gSpan: Graph-Based Substructure Rule Mining. IEEE Intl. Conf. on Data Mining, Maebashi City, Japan (2002) 721--723.
[45]
Zhang, S., Yang, J., and Cheedella, V. Monkey: Approximate Graph Mining Based on Spanning Trees, to appear in ICDE, April 2007, Istanbul, Turkey.
[46]
Zhu, H., Hall, P. A., and May, J. H. Software unit test coverage and adequacy. ACM Comput. Surv. 29, 4 (Dec. 1997), 366--427.

Cited By

View all
  • (2024)Raisin: Identifying Rare Sensitive Functions for Bug DetectionProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639165(1-12)Online publication date: 20-May-2024
  • (2024)APP-Miner: Detecting API Misuses via Automatically Mining API Path Patterns2024 IEEE Symposium on Security and Privacy (SP)10.1109/SP54263.2024.00043(4034-4052)Online publication date: 19-May-2024
  • (2023)Mining Resource-Operation Knowledge to Support Resource Leak DetectionProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3616315(986-998)Online publication date: 30-Nov-2023
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
ISSTA '07: Proceedings of the 2007 international symposium on Software testing and analysis
July 2007
258 pages
ISBN:9781595937346
DOI:10.1145/1273463
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 July 2007

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. automatic defect detection
  2. frequent itemset mining
  3. frequent subgraph mining
  4. mining software repositories
  5. program dependences

Qualifiers

  • Article

Conference

ISSTA07
Sponsor:

Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%

Upcoming Conference

ISSTA '25

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)3
  • Downloads (Last 6 weeks)0
Reflects downloads up to 16 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Raisin: Identifying Rare Sensitive Functions for Bug DetectionProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639165(1-12)Online publication date: 20-May-2024
  • (2024)APP-Miner: Detecting API Misuses via Automatically Mining API Path Patterns2024 IEEE Symposium on Security and Privacy (SP)10.1109/SP54263.2024.00043(4034-4052)Online publication date: 19-May-2024
  • (2023)Mining Resource-Operation Knowledge to Support Resource Leak DetectionProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3616315(986-998)Online publication date: 30-Nov-2023
  • (2023)When to Say What: Learning to Find Condition-Message Inconsistencies2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)10.1109/ICSE48619.2023.00081(868-880)Online publication date: May-2023
  • (2022)Detecting Inconsistencies in If-Condition-Raise StatementsProceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering10.1145/3551349.3559514(1-3)Online publication date: 10-Oct-2022
  • (2022)Bug detection in Java code: An extensive evaluation of static analysis tools using Juliet Test SuitesSoftware: Practice and Experience10.1002/spe.318153:5(1125-1143)Online publication date: 29-Dec-2022
  • (2021)Identifying change patterns of API misuses from code changesScience China Information Sciences10.1007/s11432-019-2745-564:3Online publication date: 7-Feb-2021
  • (2020)SinkFinder: harvesting hundreds of unknown interesting function pairs with just one seedProceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3368089.3409678(1101-1113)Online publication date: 8-Nov-2020
  • (2019)Detecting Bugs by Discovering Expectations and Their ViolationsIEEE Transactions on Software Engineering10.1109/TSE.2018.281663945:10(984-1001)Online publication date: 1-Oct-2019
  • (2018)NAR-miner: discovering negative association rules from code for bug detectionProceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3236024.3236032(411-422)Online publication date: 26-Oct-2018
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media