ABSTRACT
Plagiarism is an issue that all educators have had to deal with. Large numbers of students and assignments have led to the development of automated systems that detect code similarities in order to identify submissions that may have been plagiarised. These systems are of great value to assessors because they allow submissions to be processed automatically. However, automated systems also have potential drawbacks. In this study we explore and analyse the differences between various systems and compare their performance with manual checking. We first consider the methods students use when committing plagiarism. We then examine more closely the systems that can aid plagiarism detection, from their characteristics to how they work. In the process, we determine how these systems compare with our own system and how suitable they are for identifying potentially plagiarised submissions in our introductory C++ course.
- Evaluating plagiarism detection software for introductory programming assignments