ABSTRACT
While many educators have added software testing practices to their programming assignments, assessing the effectiveness of student-written tests using statement coverage or branch coverage has limitations. Researchers have begun investigating alternative approaches to assessing student-written tests, and this paper reports on an investigation of the quality of student-written tests in terms of the number of authentic, human-written defects those tests can detect. An experiment was conducted using 101 programs written for a CS2 data structures assignment in which students implemented a queue in two ways, using both an array-based and a link-based representation. Students were required to write their own software tests and were graded in part on the branch coverage they achieved. Using techniques from prior work, we were able to approximate the number of bugs present in the collection of student solutions and identify which of these were detected by each student-written test suite. The results indicate that, while students achieved an average branch coverage of 95.4% on their own solutions, their test suites detected an average of only 13.6% of the faults present in the entire program population. Further, there was a high degree of similarity among 90% of the student test suites. Analysis of the suites suggests that students followed naïve, "happy path" testing, writing basic test cases that cover mainstream expected behavior rather than tests designed to detect hidden bugs. These results suggest that educators should strive to reinforce test design techniques aimed at finding bugs, rather than simply at confirming that features work as expected.
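To illustrate the contrast the abstract draws between "happy path" tests and tests designed to expose defects, the sketch below shows two JUnit-style test cases against a hypothetical array-based queue. The class name ArrayQueue and its methods enqueue, dequeue, and isEmpty are assumptions for illustration only; the actual assignment's API and default capacity are not given here.

```java
import org.junit.Test;
import static org.junit.Assert.*;

// Illustrative tests against an assumed student class ArrayQueue<T>
// with enqueue(T), dequeue(), and isEmpty(); not the assignment's real API.
public class ArrayQueueTest {

    // A typical "happy path" test: enqueue a few items, dequeue them,
    // and confirm FIFO order. Tests like this raise branch coverage
    // but rarely expose subtle defects.
    @Test
    public void testEnqueueDequeueInOrder() {
        ArrayQueue<String> q = new ArrayQueue<>();
        q.enqueue("a");
        q.enqueue("b");
        assertEquals("a", q.dequeue());
        assertEquals("b", q.dequeue());
        assertTrue(q.isEmpty());
    }

    // A defect-oriented test: force the circular buffer to wrap around and
    // then grow past its initial capacity (assumed to be small), a common
    // source of off-by-one and copy-order bugs in array-based queues.
    @Test
    public void testWrapAroundThenResize() {
        ArrayQueue<Integer> q = new ArrayQueue<>();
        // Advance the front pointer so later enqueues wrap around the array.
        for (int i = 0; i < 5; i++) {
            q.enqueue(i);
            assertEquals(Integer.valueOf(i), q.dequeue());
        }
        // Fill well beyond any small initial capacity while wrapped.
        for (int i = 0; i < 20; i++) {
            q.enqueue(i);
        }
        // FIFO order must survive both the wrap-around and the resize.
        for (int i = 0; i < 20; i++) {
            assertEquals(Integer.valueOf(i), q.dequeue());
        }
        assertTrue(q.isEmpty());
    }
}
```

The second test is the kind the results argue for: it targets a boundary condition (wrap-around combined with resizing) that a coverage-driven happy-path suite can easily reach without actually checking.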