Marcelo Rinesi used our dataset to study the influence of interaction with the VLE and various demographic factors on the probability of passing the course. He concluded that there is a clear correlation between days spent interacting with the system and success. Socio-economic context and educational history play a significant role too.
Nathan Stevens from NYC Data Science Academy examined histograms of VLE clicks and the relationship between clicks and final grade in the courses. He also studied content access time. He found a strong relationship between each of these and students' final course grades, and reported the unexpected finding that older students were more active inside the VLE.
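
The kind of analysis described in these posts can be sketched in a few lines. The example below is a minimal illustration, not the code either author used: it counts distinct active days and total clicks per student and compares mean active days between students who passed and those who did not. The column names (id_student, date, sum_click, final_result) follow the OULAD studentVle and studentInfo tables; the rows themselves are made up, and the helper function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows in the shape of OULAD's studentVle table:
# (id_student, date, sum_click)
vle_rows = [
    (1, 0, 5), (1, 1, 3), (1, 1, 2), (1, 7, 9),
    (2, 0, 1),
    (3, 2, 4), (3, 3, 4), (3, 10, 6),
    (4, 5, 2), (4, 5, 1),
]

# Made-up outcomes in the shape of studentInfo.final_result
final_result = {1: "Pass", 2: "Fail", 3: "Distinction", 4: "Withdrawn"}


def activity_per_student(rows):
    """Return {id_student: (distinct active days, total clicks)}."""
    days = defaultdict(set)
    clicks = defaultdict(int)
    for student, day, n_clicks in rows:
        days[student].add(day)
        clicks[student] += n_clicks
    return {s: (len(days[s]), clicks[s]) for s in days}


def mean_active_days_by_outcome(rows, outcomes):
    """Mean active days for (passed, not passed) students.

    Assumption: "Pass" and "Distinction" both count as passing.
    """
    passed, not_passed = [], []
    for student, (n_days, _) in activity_per_student(rows).items():
        group = passed if outcomes[student] in {"Pass", "Distinction"} else not_passed
        group.append(n_days)
    return mean(passed), mean(not_passed)


print(activity_per_student(vle_rows))
print(mean_active_days_by_outcome(vle_rows, final_result))
```

On the real dataset the same grouping would be applied per module and presentation, since a student can appear in more than one course.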

Workshops and hackathons

Learning Analytics & Open Data Hackathon 3.0 at the University of British Columbia, Canada

The two-day event was held at the University of British Columbia, Canada. Over 100 participants explored and experimented with our dataset, and developed projects in the areas of social comparison and visualisation.

LAK18 Hackathon at Learning Analytics and Knowledge conference (LAK18) in Sydney, Australia

The principal aim of Hack@LAK18 was to enable multi-disciplinary thinking over key open challenges in Learning Analytics based on a problem-oriented, pragmatic approach. OULAD was one of the datasets recommended by the organisers.

Data Literacy for Learning Analytics at Learning Analytics and Knowledge conference (LAK16) in Edinburgh, Scotland

The workshop ran as part of the LAK 2016 conference. Its more than 30 participants explored how data literacy impacts learning analytics, both for practitioners and for end users. OULAD data was used to analyse and visualise learner data in order to understand the outputs of learning analytics on real examples.


This paper generalises the concept of the Self-Learner to the problem of a finite set of entities that are required to achieve a goal within a predefined deadline. The paper extends the original method, focusing especially on the problem of class imbalance. Again, the evaluation is performed on the OULAD dataset. The proposed improvements outperform the original method and show that the best results are achieved when domain-driven techniques are used to tackle the imbalance problem. The improvements were shown to be statistically significant using the Wilcoxon signed-rank test.
This paper presents the concept of a “self-learner” that builds machine learning models from the data generated during the current course. The approach utilises information about already submitted assessments, which introduces the problem of imbalanced data for training and testing the classification models. The presented method offers a solution when data from previous courses, which are usually used for training machine learning models, are not available. This situation typically occurs in new courses. OULAD was used as the data source for the experiments.
This paper is intended as a first introduction to academic data analysis: what we can get from it and what we need to do. It is the first in a series of reports that, taken together, will provide a complete and consistent view of how data mining can support tutoring. It mentions OULAD in its “Proposal for data-set format” section.
In this paper, the authors predict students’ success in an online course using regression, clustering and classification methods.
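
For readers unfamiliar with the Wilcoxon signed-rank test mentioned above for the Self-Learner extension, the following is a minimal sketch of how such a paired comparison works. It is not taken from the paper: the scores are made up, the function name signed_rank_test is hypothetical, and the exact p-value is computed by enumerating all sign assignments, which is only practical for a small number of pairs.

```python
from itertools import product


def signed_rank_test(baseline, improved):
    """Two-sided exact Wilcoxon signed-rank test for paired scores.

    Returns (statistic, p_value), where statistic = min(W+, W-).
    Pairs with zero difference are dropped.
    """
    diffs = [b - a for a, b in zip(baseline, improved) if b != a]

    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    statistic = min(w_plus, w_minus)

    # Exact p-value: under the null hypothesis every sign pattern is
    # equally likely, so count patterns at least as extreme as observed.
    total = sum(ranks)
    hits = 0
    for signs in product((0, 1), repeat=len(ranks)):
        wp = sum(r for r, s in zip(ranks, signs) if s)
        if min(wp, total - wp) <= statistic:
            hits += 1
    return statistic, hits / 2 ** len(ranks)


# Made-up scores (in percent) of two methods on the same six course runs.
baseline = [61, 58, 70, 65, 62, 59]
improved = [66, 60, 73, 64, 69, 65]
print(signed_rank_test(baseline, improved))
```

With real experiments there are usually more pairs, and a library routine such as scipy.stats.wilcoxon would normally be used instead of enumeration.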


Basic data analysis of OULAD by GitHub user marloft
Official repository for LAK18 Hackathon containing resources and support materials