DOI: 10.1145/1453101.1453110
Research article

Profile-guided program simplification for effective testing and analysis

Published: 09 November 2008

Abstract

Many testing and analysis techniques have been developed for in-house use. Although they are effective at discovering defects before a program is deployed, these techniques are often limited by the complexity of real-world code and thus miss program faults. It is the users of the program who eventually experience the failures caused by the undetected faults. To take advantage of the large number of program runs carried out by users, recent work has proposed techniques that collect execution profiles from users so that developers can perform post-deployment failure analysis. However, to protect users' privacy and to reduce run-time overhead, such profiles are usually not detailed enough for developers to identify or fix the root causes of the failures.
In this paper, we propose a novel approach that utilizes user execution profiles for more effective in-house testing and analysis. Our key insight is that execution profiles of program failures can be used to simplify a program while preserving its erroneous behavior. By simplifying a program and scaling down its complexity according to its profiles, in-house testing and analysis techniques can be applied more accurately and efficiently, and, pragmatically, defects that occur more often and are (arguably) more relevant to users are given priority during failure analysis. Specifically, we adapt statistical debugging on execution profiles to predict likely failure-related code and use a syntax-directed algorithm to trim failure-irrelevant code from a program while preserving its erroneous behavior as much as possible. We conducted case studies on a testing engine, CUTE, and a software model checker, BLAST, to evaluate our technique, using subject programs from the Aristotle Analysis System and the Software-artifact Infrastructure Repository (SIR). Our empirical results show that, on the simplified programs, CUTE and BLAST find more bugs with improved accuracy and performance: they detected 20 and 21 (out of 139) more bugs, respectively, in about half the time they took on the original test programs.
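The general idea behind profile-guided trimming can be sketched as follows. This is an illustrative simplification, not the paper's actual algorithm: the function names and the scoring formula are invented here, and real statistical debugging uses more sophisticated predictors, while the paper's trimming is syntax-directed rather than per-statement. The sketch scores each statement by how strongly its execution correlates with failing runs, then flags low-scoring statements as candidates for removal.

```python
# Hypothetical sketch of profile-guided simplification:
# score statements by their correlation with failing runs,
# then mark weakly failure-related statements as trim candidates.

def failure_scores(profiles):
    """profiles: list of (covered_statements: set, failed: bool) pairs,
    one per program run collected from users."""
    stats = {}  # statement -> (times seen in failing runs, times seen in passing runs)
    for covered, failed in profiles:
        for stmt in covered:
            f, p = stats.get(stmt, (0, 0))
            stats[stmt] = (f + 1, p) if failed else (f, p + 1)
    # Score = fraction of a statement's executions that occurred in failing
    # runs; a crude stand-in for statistical-debugging bug predictors.
    return {s: f / (f + p) for s, (f, p) in stats.items()}

def trim_candidates(profiles, threshold=0.5):
    """Statements scoring below the threshold are considered
    failure-irrelevant and are candidates for trimming."""
    scores = failure_scores(profiles)
    return {s for s, score in scores.items() if score < threshold}
```

In the paper's setting, the trimming step would additionally respect program syntax and dependences so that the simplified program still compiles and reproduces the failure; the threshold here is an arbitrary knob for the sketch.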




Published In

SIGSOFT '08/FSE-16: Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering
November 2008
369 pages
ISBN:9781595939951
DOI:10.1145/1453101
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. profiling
  2. program simplification
  3. statistical debugging
  4. testing and analysis

Qualifiers

  • Research-article

Conference

SIGSOFT '08/FSE-16

Acceptance Rates

Overall Acceptance Rate 17 of 128 submissions, 13%


Article Metrics

  • Downloads (last 12 months): 7
  • Downloads (last 6 weeks): 0
Reflects downloads up to 13 Feb 2025.


Cited By

  • (2021) Validating static warnings via testing code fragments. Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 540–552. DOI: 10.1145/3460319.3464832. Online publication date: 11-Jul-2021.
  • (2020) Differentially-private control-flow node coverage for software usage analysis. Proceedings of the 29th USENIX Conference on Security Symposium, pp. 1021–1038. DOI: 10.5555/3489212.3489270. Online publication date: 12-Aug-2020.
  • (2019) Certifying delta-oriented programs. Software and Systems Modeling (SoSyM), 18(5):2875–2906. DOI: 10.1007/s10270-018-00704-x. Online publication date: 2-Aug-2019.
  • (2016) PerfGuard: binary-centric application performance monitoring in production environments. Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, pp. 595–606. DOI: 10.1145/2950290.2950347. Online publication date: 1-Nov-2016.
  • (2015) Automating performance bottleneck detection using search-based application profiling. Proceedings of the 2015 International Symposium on Software Testing and Analysis, pp. 270–281. DOI: 10.1145/2771783.2771816. Online publication date: 13-Jul-2015.
  • (2009) Program Sifting. Proceedings of the 2009 16th Asia-Pacific Software Engineering Conference, pp. 275–282. DOI: 10.1109/APSEC.2009.44. Online publication date: 1-Dec-2009.
