
A Robust Machine Learning Technique to Predict Low-performing Students

Published: 16 January 2019

Abstract

As enrollments and class sizes in postsecondary institutions have increased, instructors have sought automated and lightweight means to identify students who are at risk of performing poorly in a course. This identification must happen early enough in the term for instructors to assist those students before they fall irreparably behind. This study describes a modeling methodology that predicts student final exam scores in the third week of the term, using the clicker data that is automatically collected for instructors when they employ the Peer Instruction pedagogy. The modeling technique uses a support vector machine binary classifier, trained on one term of a course, to predict outcomes in the subsequent term. We applied this modeling technique to five different courses across the computer science curriculum, taught by three different instructors at two different institutions. Our modeling approach offers a combination of strengths not found together in prior work, while maintaining accuracy competitive with that work. These strengths include using a lightweight source of student data, affording early detection of struggling students, and predicting outcomes across terms in a natural setting (different final exams, minor changes to course content), across multiple courses in a curriculum, and across multiple institutions.
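To make the setup concrete, the sketch below illustrates the kind of cross-term pipeline the abstract describes: train a support vector machine binary classifier on one term's early clicker data, then flag likely low performers in the following term. This is not the authors' implementation; the library (scikit-learn), the per-question correctness features, the bottom-quartile labeling of "low-performing," and all variable names are illustrative assumptions.

```python
# Minimal sketch (assumptions noted above), not the paper's actual pipeline.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def make_labels(final_exam_scores, quantile=0.25):
    """Label students as low-performing (1) if their final exam score falls
    in the bottom quantile of their term (assumed threshold)."""
    cutoff = np.quantile(final_exam_scores, quantile)
    return (final_exam_scores <= cutoff).astype(int)

# Each row: one student's clicker-question correctness (0/1) for the
# questions asked during the first three weeks of the term (placeholder data).
prior_term_X = (np.random.rand(120, 30) > 0.4).astype(float)
prior_term_final = np.random.rand(120) * 100
new_term_X = (np.random.rand(110, 30) > 0.4).astype(float)

# Train on the prior term; class_weight="balanced" helps when few students
# fall into the low-performing class.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", class_weight="balanced"))
model.fit(prior_term_X, make_labels(prior_term_final))

# Predict on the new term; students flagged 1 would be candidates for
# early intervention by the instructor.
at_risk_flags = model.predict(new_term_X)
print(f"Flagged {at_risk_flags.sum()} of {len(at_risk_flags)} students as at risk")
```

In practice the prior term's labels come from that term's own final exam, so the classifier can be retrained each year without any manual annotation; only the week-three clicker features are needed for the new term.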



Published in

ACM Transactions on Computing Education, Volume 19, Issue 3
September 2019, 333 pages
EISSN: 1946-6226
DOI: 10.1145/3308443

Copyright © 2019 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 1 May 2018
• Revised: 1 August 2018
• Accepted: 1 August 2018
• Published: 16 January 2019


      Qualifiers

      • research-article
      • Research
      • Refereed
