DOI: 10.1145/2998551.2998558
research-article

Evaluating plagiarism detection software for introductory programming assignments

Published: 04 July 2016

ABSTRACT

Plagiarism is an issue that all educators have had to deal with. Large numbers of students and assignments have resulted in the development of automated systems to detect code similarities with the aim of identifying cases that may have been plagiarised. These systems are of great value to assessors, allowing them to process submissions automatically. However, these automated systems do present possible disadvantages and drawbacks. In this study we explore and analyse the differences between various systems as well as how their performance compares with manual checking. We consider the different methods students use when committing plagiarism. Then we examine more closely the systems that can aid plagiarism detection, ranging from their characteristics to how they work. In the process, we determine how these systems compare with our own system and their suitability for aiding the identification of submissions which may have been plagiarised in our introductory C++ course.

    Published in

      CSERC '16: Proceedings of the Computer Science Education Research Conference 2016
      July 2016
      52 pages
      ISBN: 9781450344920
      DOI: 10.1145/2998551

      Copyright © 2016 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      CSERC '16 Paper Acceptance Rate: 5 of 14 submissions, 36%
      Overall Acceptance Rate: 24 of 60 submissions, 40%
