ABSTRACT
As concurrent and distributed applications become more common, and debugging such applications is notoriously difficult, practical tools for the automatic debugging of concurrent applications are in demand. In previous work, we applied automatic debugging to noise-based testing of concurrent programs. The idea of noise-based testing is to increase the probability of observing a bug by adding timing "noise", via instrumentation, to the execution of the program. Finding a small subset of instrumentation points that causes the bug to manifest can then serve as an automatic debugging technique. Previously, we showed that Delta Debugging can pinpoint the bug location in some small programs.
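The noise-based testing idea can be illustrated with a minimal sketch (not the paper's actual instrumentation infrastructure, which targets Java programs): a classic lost-update race on a shared counter, where a random short sleep injected at an instrumentation point between the read and the write widens the window in which the race can manifest.

```python
import random
import threading
import time

def noise():
    """Timing 'noise': with some probability, sleep briefly at an
    instrumentation point to perturb the thread interleaving."""
    if random.random() < 0.5:
        time.sleep(random.uniform(0, 0.001))

counter = 0  # shared state with a lost-update race on it

def worker(iterations, with_noise):
    global counter
    for _ in range(iterations):
        tmp = counter        # read
        if with_noise:
            noise()          # noise injected between read and write
        counter = tmp + 1    # write: concurrent updates by other threads are lost

def run(with_noise, threads=4, iterations=50):
    """Run the racy workload; return True iff the bug manifested."""
    global counter
    counter = 0
    ts = [threading.Thread(target=worker, args=(iterations, with_noise))
          for _ in range(threads)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    return counter != threads * iterations
```

With noise enabled, the lost update manifests in almost every run; without it, the race may stay hidden for many runs because each thread tends to complete its read-modify-write before being preempted.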
In the work reported in this paper, we create and evaluate two algorithms for automatically pinpointing program locations in the vicinity of bugs, on a number of industrial programs. We discovered that the Delta Debugging algorithms do not scale, due to the non-monotonic nature of the concurrent debugging problem. Instead, we turned to a machine-learning feature-selection algorithm. The idea is to treat each instrumentation point as a feature, execute the program many times with different instrumentations, and correlate the features (instrumentation points) with the executions in which the bug was revealed. This works very well when the bug is very hard to reveal through instrumentation, corresponding to the case in which a very specific timing window is needed for the bug to manifest. However, in the more common case, when the bug is easy to reveal through instrumentation, the instrumentation points nearest the bug are not ranked high enough by the feature-selection algorithm. We show that in these cases the important quantity is not the absolute value of a feature's score but the derivative of that score along the program execution path.
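The feature-selection idea can be sketched as follows. This is a simplified illustration, not the paper's algorithm: each trial enables a random subset of instrumentation points, and each point is scored by the difference between the failure rate when its noise is enabled and when it is disabled. The trial function `simulated_trial` and the point names `p0`..`p5` are hypothetical stand-ins for real instrumented runs.

```python
import random
from collections import defaultdict

def localize(run_trial, points, trials=200):
    """Rank instrumentation points by a simple feature-selection score:
    failure rate with the point's noise enabled minus the failure rate
    with it disabled."""
    fails = defaultdict(lambda: [0, 0])   # point -> [fails when on, fails when off]
    counts = defaultdict(lambda: [0, 0])  # point -> [runs when on, runs when off]
    for _ in range(trials):
        enabled = {p for p in points if random.random() < 0.5}
        failed = run_trial(enabled)
        for p in points:
            side = 0 if p in enabled else 1
            counts[p][side] += 1
            fails[p][side] += int(failed)

    def score(p):
        on = fails[p][0] / counts[p][0] if counts[p][0] else 0.0
        off = fails[p][1] / counts[p][1] if counts[p][1] else 0.0
        return on - off

    return sorted(points, key=score, reverse=True)

def simulated_trial(enabled):
    """Hypothetical stand-in for an instrumented run: noise at point
    'p3' (near the bug) makes the failure very likely to manifest."""
    return random.random() < (0.9 if "p3" in enabled else 0.05)

random.seed(0)  # reproducible demo
points = [f"p{i}" for i in range(6)]
ranking = localize(simulated_trial, points, trials=400)
```

In this simulation, `p3` — the point closest to the injected bug — rises to the top of the ranking, since enabling its noise raises the failure rate far more than enabling any other point.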
As a number of groups expressed interest in this research, we built an open infrastructure for automatic debugging algorithms for concurrent applications, based on noise-injection concurrent testing via instrumentation. The infrastructure is described in this paper.
Instrumenting where it hurts: an automatic concurrent debugging technique