Article

Finding what's not there: a new approach to revealing neglected conditions in software

Authors:

Ray-Yaung Chang,

Andy Podgurski,

Jiong YangAuthors Info & Claims

ISSTA '07: Proceedings of the 2007 international symposium on Software testing and analysis

Pages 163 - 173

https://doi.org/10.1145/1273463.1273486

Published: 09 July 2007 Publication History

Abstract

Neglected conditions are an important but difficult-to-find class of software defects. This paper presents a novel approach to revealing neglected conditions that integrates static program analysis and advanced data mining techniques to discover implicit conditional rules in a code base and to discover rule violations that indicate neglected conditions. The approach requires the user to indicate minimal constraints on the context of the rules to be sought, rather than specific rule templates. To permit this generality, rules are modeled as graph minors of program dependence graphs, and both frequent itemset mining and frequent subgraph mining algorithms are employed to identify candidate rules. We report the results of an empirical evaluation of the approach in which it was used to discover conditional rules and neglected conditions in ~25,000 lines of source code.

References

[1]

Apache.org. Apache HTTP Server Project. www.apache.org.

[2]

Budd, T. A., DeMillo, R. A., Lipton, R. J., and Sayward, F. G. Theoretical and empirical studies on using program mutation to test the functional correctness of programs. ACM Symp. on Principles of Prog. Lang., 7, pp. 220--233, 1980.

Digital Library

[3]

Burdick, D., Calimlim, M. and Gehrke, J. MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases, 17th Intl. Conf. on Data Eng. (ICDE '01), 2001.

Digital Library

[4]

Chelf, B., Engler, D., and Hallem, S. How to write system-specific, static checkers in metal. 2002 ACM Workshop on Program Analysis for Software Tools and Engineering (Charleston, SC, Nov. 18-19, 2002), 51--60.

Digital Library

[5]

Chockler, H., Kupferman, O., and Vardi, M. Coverage metrics for formal verification. Lecture Notes in Computer Science 2860, Springer, 2003, pp. 111--125.

[6]

Dunsmore, A., Roper, M., and Wood, M. Practical code inspection techniques for object-oriented systems: an experimental comparison. IEEE Softw 20, 4 (July 03), 21--29.

Digital Library

[7]

Engler, D., Chen, D. Y., Hallem, S., Chou, A., and Chelf, B. Bugs as deviant behavior: a general approach to inferring errors in systems code. 18th ACM Symp. on Operating Systems Principles (Banff, Canada, Oct. 2001), 57--72.

Digital Library

[8]

Engler, D. Meta-level compilation. metacomp.stanford.edu.

[9]

Fagan, M. E. Design and code inspections to reduce errors in program development. IBM Syst. J. 15, 3, 1976, 258--287.

Digital Library

[10]

Fatta, G., Leue, S., and Stegantova, E. Discriminative pattern mining in software fault detection. 3rd Intl. Wkshp. on Software Quality Assurance (Portland, OR, Nov. 2006).

Digital Library

[11]

Ferrante, J., Ottenstein, K. J., and Warren, J. D. The program dependence graph and its use in optimization. ACM Trans. Prog. Lang. Syst. 9, 3 (July 1987), 319--349.

Digital Library

[12]

Festa, P. Study says buffer overflow is most common security bug. C|Net News.com. news.com.com/2100-1001-233483.html.

[13]

Garey, M. R. and Johnson, D. S. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, 1979

Digital Library

[14]

Grammatech. CodeSurfer. www.grammatech.com/products/codesurfer/overview.html.

[15]

Grammatech. Dependence graphs and program slicing. www.grammatech.com/research/slicing/slicingWhitepaper

[16]

Holder, L. B., Cook, D. J., and Djoko, S. Substructure discovery in the SUBDUE system. AAAI Workshop on Knowledge Discovery in Databases, 1994, 169--180.

[17]

Howard, M. and LeBlanc, D. Writing Secure Code, 2nd edition. Microsoft Press, 2003.

Digital Library

[18]

Howden, W. Reliability of the path analysis testing strategy. IEEE Trans. on Softw. Eng., SE-2, Sept. 1976, pp. 208--215.

Digital Library

[19]

Huan, J., Wang, W., and Prins, J. Efficient mining of frequent subgraphs in the presence of isomorphism. 3rd IEEE Int. Conf. on Data Mining (Piscataway, NJ, 2003), 549--552.

Digital Library

[20]

Huan, J., Wang, W., Prins, J., and Yang, J. SPIN: mining maximal frequent subgraphs from graph database. KDD 2004 (2004, Seattle, WA).

Digital Library

[21]

IBM. Orthogonal defect classification. Center for Software Eng., www.research.ibm.com/softeng/ODC/ODC.HTM.

[22]

Krinke, J. Identifying similar code with program dependence graphs. 8th Working Conf. on Reverse Engineering (Stuttgart, Germany, 2001).

Digital Library

[23]

Kuramochi, M. and Karypis, G. Finding frequent patterns in a large sparse graph. Technical Report #03-038, Dept. of Computer Sci. and Eng., Univ. of Minnesota.

[24]

Kuramochi, M. and Karypis, G. GREW: A Scalable Frequent Subgraph Discovery Algorithm. Technical Report #TR 04-024, CSE Dept., Univ. of Minnesota.

[25]

Li, Z. and Chou, Y. PR-Miner: Automatically Extracting Implicit Programming Rules and Detecting Violations in Large Software Code. ESEC-FSE '05 (Lisbon, Portugal, Sept. 05-09, 2005).

Digital Library

[26]

Li, Z., Lu, S., and Myagmar, S. CP-Miner: Finding Copy-Paste and Related Bugs in Large-Scale Software Code. IEEE Trans. Softw. Eng. 32, 3 (Mar. 2006), 176--192.

Digital Library

[27]

Liu, C., Yan X., and Han, J: GPLAG: Detection of Software Plagiarism by Program Dependence Graph Analysis. KDD '06 (Aug. 20-23, 2006, Philadelphia, PA, USA.).

Digital Library

[28]

Livshits, B. and Zimmermann, T. DynaMine: finding common error patterns by mining software revision histories. ESEC-FSE '05 (Lisbon, Portugal, Sept. 2005), 296--305.

Digital Library

[29]

McClure, S., Scambray, J., and Kurtz, G. Hacking Exposed: Network Security Secrets and Solutions, 5th edition, McGraw Hill, 2005.

Digital Library

[30]

Meyers, G. J. A controlled experiment in program testing and code walkthroughs/inspections. CACM 21, 9 (Sept. 1978), 760--768

Digital Library

[31]

Meyers, G. J. The Art of Software Testing. Wiley, 1979.

Digital Library

[32]

Mozilla.org. Bugzilla. https://bugzilla.mozilla.org/.

[33]

NIST. National Vulnerability Database. http://nvd.nist.gov/.

[34]

Raghavan, S., Rohana, R., Leon, D., Podgurski, A., and Augustine, V. Dex: A Semantic-Graph Differencing Tool for Studying Changes in Large Code Bases. 20th IEEE Intl. Conf. on Software Maintenance (Sept. 2004), 188--197.

Digital Library

[35]

Renieres, M. and Reiss, S. P. Fault localization with nearest neighbor queries. 18th International Conference on Automated Software Engineering (Oct. 2003), pp. 30--39.

[36]

Richardson, D. J. and Clarke, L. A. Partition analysis: a method combining testing and verification. IEEE Trans. on Softw. Eng., SE-11, 12 (Dec. 1985), 1477--1490.

Digital Library

[37]

Robertson, N. and Seymour, P. D. Graph Minors. I. Excluding a forest. Journal of Combinatorial Theory, Series B 35 (1), 1983, 39--61.

[38]

Sommerville, L., Sawyer, P., and Viller, S. Viewpoints for requirements elicitation: a practical approach. 3rd IEEE Intl. Conf. on Requirements Eng., April 1998.

Digital Library

[39]

Thayer, T. A., Lipow, M., and Nelson, E. C. Software reliability study. TRW-SS-76-03, Redondo Beach, California, TRW, March 1976.

[40]

Thomas, L., Valluri, S., and Karlapalem, K. MARGIN: Maximal frequent subgraph mining. 6th IEEE Intl. Conf. on Data Mining (2006).

Digital Library

[41]

Wilander, J. and Fak, P. Rule Matching Security Properties of Code using Dependence Graphs. 1st Intl. Workshop on Code Based Software Security Assessments (Pittsburgh, PA, November 2005).

[42]

Williams C. C. and Hollingsworth, J. K. Automatic mining of Source Code Repositories to Improve Bug Finding Techniques. IEEE Trans. of Soft. Eng. 31 (6), June 2005, 466--480.

Digital Library

[43]

Wing, J. M. and Vaziri-Farahani, M. 1995. Model checking software systems: a case study. 3rd ACM Symp. on Found. of Softw. Eng. (Washington, D.C., Oct. 1995), 128--139.

Digital Library

[44]

Yan, X. and Han, J. gSpan: Graph-Based Substructure Rule Mining. IEEE Intl. Conf. on Data Mining, Maebashi City, Japan (2002) 721--723.

Digital Library

[45]

Zhang, S., Yang, J., and Cheedella, V. Monkey: Approximate Graph Mining Based on Spanning Trees, to appear in ICDE, April 2007, Istanbul, Turkey.

[46]

Zhu, H., Hall, P. A., and May, J. H. Software unit test coverage and adequacy. ACM Comput. Surv. 29, 4 (Dec. 1997), 366--427.

Digital Library

Cited By

Huang JNie JGong YYou WLiang BBian PRoychoudhury APaiva AAbreu RStorey M(2024)Raisin: Identifying Rare Sensitive Functions for Bug DetectionProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639165(1-12)Online publication date: 20-May-2024
https://dl.acm.org/doi/10.1145/3597503.3639165
Jiang JWu JLing XLuo TQu SWu Y(2024)APP-Miner: Detecting API Misuses via Automatically Mining API Path Patterns2024 IEEE Symposium on Security and Privacy (SP)10.1109/SP54263.2024.00043(4034-4052)Online publication date: 19-May-2024
https://doi.org/10.1109/SP54263.2024.00043
Wang CLou YPeng XLiu JZou BChandra SBlincoe KTonella P(2023)Mining Resource-Operation Knowledge to Support Resource Leak DetectionProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3616315(986-998)Online publication date: 30-Nov-2023
https://dl.acm.org/doi/10.1145/3611643.3616315
Show More Cited By

Index Terms

Finding what's not there: a new approach to revealing neglected conditions in software
1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging
2. Theory of computation
  1. Semantics and reasoning
    1. Program reasoning
      1. Program verification

Recommendations

An efficient pattern growth approach for mining fault tolerant frequent itemsets
Highlights
- Mining fault tolerant (FT) frequent itemsets are computationally expensive.
- ...
Abstract
Mining fault tolerant (FT) frequent itemsets from transactional databases are computationally more expensive than mining exact matching frequent itemsets. Previous algorithms mine FT frequent itemsets using Apriori heuristic. Apriori-...
Discovering highly expected utility itemsets for revenue prediction

Identifying patterns of items that are purchased frequently and generate high profits is crucial for inventory and profit management. However, neither approaches based on frequent itemsets nor those based on high-utility itemsets (HUIs) can meet this ...
An improvement for dEclat algorithm
ICUIMC '12: Proceedings of the 6th International Conference on Ubiquitous Information Management and Communication

The diffset format (the difference of two sets) has drastically reduced the running time and memory usage of the Eclat algorithm and the Eclat algorithm using diffset format is called dEclat algorithm. However, in some sparse datasets, diffset format ...

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences

ISSTA '07: Proceedings of the 2007 international symposium on Software testing and analysis

July 2007

258 pages

ISBN:9781595937346

DOI:10.1145/1273463

General Chair:
David S. Rosenblum
University College London, United Kingdom
,
Program Chair:
Sebastian Elbaum
University of Nebraska-Lincoln, USA

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

SIGPLAN: ACM Special Interest Group on Programming Languages

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 July 2007

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Article

Conference

ISSTA07

Sponsor:

ISSTA07: International Symposium on Software Testing and Analysis

July 9 - 12, 2007

London, United Kingdom

Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%

Upcoming Conference

ISSTA '25

Sponsor:
sigsoft

34th ACM SIGSOFT International Symposium on Software Testing and Analysis

June 25 - 28, 2025

Trondheim , Norway

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

49
Total Citations
View Citations
1,034
Total Downloads

Downloads (Last 12 months)3
Downloads (Last 6 weeks)0

Reflects downloads up to 16 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

Huang JNie JGong YYou WLiang BBian PRoychoudhury APaiva AAbreu RStorey M(2024)Raisin: Identifying Rare Sensitive Functions for Bug DetectionProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639165(1-12)Online publication date: 20-May-2024
https://dl.acm.org/doi/10.1145/3597503.3639165
Jiang JWu JLing XLuo TQu SWu Y(2024)APP-Miner: Detecting API Misuses via Automatically Mining API Path Patterns2024 IEEE Symposium on Security and Privacy (SP)10.1109/SP54263.2024.00043(4034-4052)Online publication date: 19-May-2024
https://doi.org/10.1109/SP54263.2024.00043
Wang CLou YPeng XLiu JZou BChandra SBlincoe KTonella P(2023)Mining Resource-Operation Knowledge to Support Resource Leak DetectionProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3616315(986-998)Online publication date: 30-Nov-2023
https://dl.acm.org/doi/10.1145/3611643.3616315
Bouzenia IPradel M(2023)When to Say What: Learning to Find Condition-Message Inconsistencies2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)10.1109/ICSE48619.2023.00081(868-880)Online publication date: May-2023
https://doi.org/10.1109/ICSE48619.2023.00081
Bouzenia I(2022)Detecting Inconsistencies in If-Condition-Raise StatementsProceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering10.1145/3551349.3559514(1-3)Online publication date: 10-Oct-2022
https://dl.acm.org/doi/10.1145/3551349.3559514
Amankwah RChen JSong HKudjo P(2022)Bug detection in Java code: An extensive evaluation of static analysis tools using Juliet Test SuitesSoftware: Practice and Experience10.1002/spe.318153:5(1125-1143)Online publication date: 29-Dec-2022
https://doi.org/10.1002/spe.3181
Liu WChen BPeng XSun QZhao W(2021)Identifying change patterns of API misuses from code changesScience China Information Sciences10.1007/s11432-019-2745-564:3Online publication date: 7-Feb-2021
https://doi.org/10.1007/s11432-019-2745-5
Bian PLiang BHuang JShi WWang XZhang JDevanbu PCohen MZimmermann T(2020)SinkFinder: harvesting hundreds of unknown interesting function pairs with just one seedProceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3368089.3409678(1101-1113)Online publication date: 8-Nov-2020
https://dl.acm.org/doi/10.1145/3368089.3409678
Bian PLiang BZhang YYang CShi WCai Y(2019)Detecting Bugs by Discovering Expectations and Their ViolationsIEEE Transactions on Software Engineering10.1109/TSE.2018.281663945:10(984-1001)Online publication date: 1-Oct-2019
https://doi.org/10.1109/TSE.2018.2816639
Bian PLiang BShi WHuang JCai YLeavens GGarcia APăsăreanu C(2018)NAR-miner: discovering negative association rules from code for bug detectionProceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3236024.3236032(411-422)Online publication date: 26-Oct-2018
https://dl.acm.org/doi/10.1145/3236024.3236032
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Table of Conten