Predicting student success by mining enrolment data.

Thumbnail Image
Kovacic, Z.
Student success
Student engagement
Data mining
Study outcomes
Persistence in learning
Distance learning
Description of form
Rights holder
Issue Date
Peer-reviewed status
This paper explores the socio-demographic variables (age, gender, ethnicity, education, work status, and disability) and study environment (course programme and course block), that may influence persistence or dropout of the distance education students at the Open Polytechnic. It examines to what extent these factors, i.e. enrolment data help us in preidentifying successful and unsuccessful students. The data stored in the Open Polytechnic student management system from 2006 to 2009, covering over 450 students who enrolled to Information Systems course was used to perform a quantitative analysis of study outcome. Based on a data mining techniques (such as feature selection and classification trees) and logistic regression the most important factors for student success and a profile of the typical successful and unsuccessful students are identified. The empirical results show the following: (i) the most important factors separating successful from unsuccessful students are: ethnicity, course programme and course block; (ii) among classification tree growing methods Classification and Regression Tree (CART) was the most successful in growing the tree with an overall percentage of correct classification of 60.5%; (iii) both the risk estimated by the cross-validation and the gain diagram suggests that all trees, based only on enrolment data, are not quite good in separating successful from unsuccessful students, and (iv) the same conclusion was reached using the logistic regression. The implications of these results for academic and administrative staff are discussed.
Kovacic, Z. J. (2012). Predicting student success by mining enrolment data. Research in Higher Education, 15.