ABSTRACT
Database-centric programs form the backbone of many enterprise systems. Fixing defects in such programs takes considerable human effort because of the interplay between imperative code and database-centric logic. This paper presents a novel data-driven approach for automatically fixing bugs in the selection condition of database statements (e.g., the WHERE clause of SELECT statements), a common form of bug in such programs. Our key observation is that in real-world data, there is information latent in the distribution of the data that can be used to repair selection conditions efficiently. Given a faulty database program and input data, only a part of which induces the defect, our novelty is in determining the correct behavior for the defect-inducing data by taking advantage of the information revealed by the rest of the data. We accomplish this by employing semi-supervised learning to predict the correct behavior for the defect-inducing data and by patching up any inaccuracies in the prediction with a SAT-based combinatorial search. Next, we learn a compact decision tree for the correct behavior, including the correct behavior on the defect-inducing data. This tree suggests a plausible fix to the selection condition. We demonstrate the feasibility of our approach on seven real-world examples.
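The last step of the approach, learning a compact decision tree and reading a candidate selection condition from it, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the correct selected/not-selected label for every row is already known (in the paper this comes from semi-supervised learning plus the SAT-based search, neither of which is shown here), and it uses a generic greedy Gini-based tree learner as a stand-in. The table, the column names `amount` and `age_days`, and all function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of boolean labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, columns):
    """Return the (column, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for col in columns:
        values = sorted({r[col] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2  # midpoint between adjacent distinct values
            left = [l for r, l in zip(rows, labels) if r[col] <= thr]
            right = [l for r, l in zip(rows, labels) if r[col] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score - 1e-12:  # accept strict improvements only
                best, best_score = (col, thr), score
    return best

def learn_tree(rows, labels, columns, depth=3):
    """A leaf is a bool; an inner node is (column, threshold, left, right)."""
    split = best_split(rows, labels, columns) if depth > 0 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    col, thr = split
    left = [(r, l) for r, l in zip(rows, labels) if r[col] <= thr]
    right = [(r, l) for r, l in zip(rows, labels) if r[col] > thr]
    return (col, thr,
            learn_tree([r for r, _ in left], [l for _, l in left], columns, depth - 1),
            learn_tree([r for r, _ in right], [l for _, l in right], columns, depth - 1))

def predict(tree, row):
    """Follow the tree from the root to a leaf for one row."""
    while not isinstance(tree, bool):
        col, thr, left, right = tree
        tree = left if row[col] <= thr else right
    return tree

def to_condition(tree):
    """Render the tree as a disjunction of the paths that end in a True leaf."""
    def paths(node, conds):
        if isinstance(node, bool):
            return [" AND ".join(conds) or "TRUE"] if node else []
        col, thr, left, right = node
        return (paths(left, conds + [f"{col} <= {thr:g}"]) +
                paths(right, conds + [f"{col} > {thr:g}"]))
    found = paths(tree, [])
    return " OR ".join(f"({p})" for p in found) if found else "FALSE"

# Hypothetical rows paired with their assumed-correct selection labels.
data = [
    ({"amount": 50, "age_days": 10}, False),
    ({"amount": 80, "age_days": 40}, False),
    ({"amount": 90, "age_days": 15}, False),
    ({"amount": 120, "age_days": 5}, True),
    ({"amount": 130, "age_days": 25}, True),
    ({"amount": 140, "age_days": 50}, False),
    ({"amount": 150, "age_days": 20}, True),
    ({"amount": 200, "age_days": 45}, False),
    ({"amount": 300, "age_days": 60}, False),
]
rows = [r for r, _ in data]
labels = [l for _, l in data]

tree = learn_tree(rows, labels, ["amount", "age_days"])
condition = to_condition(tree)
print("WHERE", condition)  # prints: WHERE (age_days <= 32.5 AND amount > 105)
```

The thresholds are midpoints between observed values, so the printed condition is only a suggestion consistent with the data. A developer would still need to round it to a meaningful constant and confirm it against the intended behavior.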
Index Terms
- Data-guided repair of selection statements