Data-driven decision making is serving and transforming education. We approached the problem of predicting students' performance using multiple data sources from online courses, including one we created. Experimental results support preliminary conclusions about which data should be considered for the task.
Educational software data promises unique insights into students' study behaviors and drivers of success. While much work has been dedicated to performance prediction in massive open online courses, it is unclear whether the same methods can be applied to blended courses, and a deeper understanding of student strategies is often missing. We use pattern mining and models borrowed from Natural Language Processing (NLP) to understand student interactions and extract frequent strategies from a blended college course. Fine-grained clickstream data is collected through Diderot, a non-commercial educational support system that spans a wide range of functionalities. We find that interaction patterns differ considerably based on the assessment type students are preparing for, and many of the extracted features can be used for reliable performance prediction. Our results suggest that the proposed hybrid NLP methods can provide valuable insights even in the low-data setting of blended courses, given enough data granularity.
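As a rough illustration of how frequent strategies might be extracted from clickstream data, the sketch below counts action n-grams, a basic device from NLP, and keeps those that appear in a minimum number of sessions. The abstract does not specify the mining algorithm, so this is an assumption rather than the authors' method; the function `frequent_ngrams` and the action names are hypothetical.

```python
from collections import Counter


def frequent_ngrams(sessions, n=2, min_support=2):
    """Return n-grams of actions that occur in at least `min_support` sessions."""
    support = Counter()
    for actions in sessions:
        # Collect each n-gram once per session, so support counts sessions,
        # not raw occurrences.
        seen = {tuple(actions[i:i + n]) for i in range(len(actions) - n + 1)}
        support.update(seen)
    return {gram: count for gram, count in support.items() if count >= min_support}


# Hypothetical action names; a real system would log its own event types.
sessions = [
    ["open_notes", "watch_video", "post_question"],
    ["open_notes", "watch_video", "take_quiz"],
    ["take_quiz", "open_notes"],
]
patterns = frequent_ngrams(sessions, n=2, min_support=2)
print(patterns)
```

Counting support per session rather than per occurrence keeps one very active student from dominating the extracted patterns, which matters in a low-data setting.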
Many researchers have studied student academic performance with supervised and unsupervised learning using numerous data mining techniques. Neural networks often need a larger collection of observations to achieve sufficient predictive ability. Because the rate of poorly performing graduates is increasing, it is necessary to design a system that helps reduce this problem, as well as the incidence of students having to repeat due to poor performance or dropping out of school altogether in the middle of their academic career. It is therefore necessary to study each technique, along with its advantages and disadvantages, to determine which is more efficient and in which cases one should be preferred over another. The study aims to develop a system that predicts student performance with an Artificial Neural Network using student demographic traits, so as to assist the university in selecting candidates (students) with a high predicted chance of success for admission. The system draws on the previous academic records of admitted students, which should eventually lead to higher-quality graduates of the institution. The model was developed with a set of selected variables as input. It achieved an accuracy of over 92.3 percent, showing the potential effectiveness of Artificial Neural Networks as a predictive tool and a selection criterion for candidates seeking admission to a university.
We present a method for accurately predicting the long-term popularity of online content from early measurements of user access. Using two content sharing portals, Youtube and Digg, we show that by modeling the accrual of views and votes on content offered by these services we can predict the long-term dynamics of individual submissions from initial data. In the case of Digg, measuring access to given stories during the first two hours allows us to forecast their popularity 30 days ahead with remarkable accuracy, while downloads of Youtube videos need to be followed for 10 days to attain the same performance. The differing time scales of the predictions are shown to be due to differences in how content is consumed on the two portals: Digg stories quickly become outdated, while Youtube videos are still found long after they are initially submitted to the portal. We show that predictions are more accurate for submissions for which attention decays quickly, whereas predictions for evergreen content will be prone to larger errors.
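One simple way to forecast long-term popularity from early counts is to regress the logarithm of the late count on the logarithm of the early count. The abstract does not state the model form, so the log-linear relationship here is an assumption, and the function names and data are made up for illustration.

```python
import math


def fit_log_linear(early_counts, late_counts):
    """Least-squares fit of ln(late) = intercept + slope * ln(early).

    All counts must be positive, since the fit works on logarithms.
    """
    xs = [math.log(c) for c in early_counts]
    ys = [math.log(c) for c in late_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_late(early_count, slope, intercept):
    """Forecast the late count for a new item from its early count."""
    return math.exp(intercept + slope * math.log(early_count))


# Made-up training data in which the late count is exactly 10x the early count.
early = [10, 50, 200, 1000]
late = [100, 500, 2000, 10000]
slope, intercept = fit_log_linear(early, late)
forecast = predict_late(400, slope, intercept)
print(slope, forecast)
```

Working in log space keeps a few very popular items from dominating the fit, and it makes the forecast error multiplicative rather than additive.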
The hospital readmission rate is high for heart failure patients. Early detection of deterioration will help doctors prevent readmissions, thus reducing health care costs and providing patients with just-in-time intervention. Wearable devices (e.g., wristbands and smart watches) provide a convenient technology for continuous outpatient monitoring. In this paper, we explore the feasibility of monitoring outpatients using Fitbit Charge HR wristbands and the potential of machine learning models to predict clinical deterioration (readmissions and death) among outpatients discharged from the hospital. We developed and piloted a data collection system in a clinical study that involved 25 heart failure patients recently discharged from a hospital. The results from the clinical study demonstrated the feasibility of continuously monitoring outpatients using wristbands. We observed high levels of patient compliance in wearing the wristbands regularly and satisfactory yield, latency, and reliability of data collection from the wristbands to a cloud-based database. Finally, we explored a set of machine learning models to predict deterioration based on the Fitbit data. Through 5-fold cross-validation, k-nearest neighbors achieved the highest accuracy of 0.8800 for identifying patients at risk of deterioration using the health data from the beginning of the monitoring. Machine learning models based on multimodal data (step, sleep, and heart rate) significantly outperformed the traditional clinical approach based on the LACE index. Moreover, our proposed weighted-sample one-class SVM model can reach high accuracy (0.9635) for predicting deterioration happening in the future using data collected by a sliding window, which indicates the potential for allowing timely intervention.
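The evaluation protocol described above, k-nearest neighbors scored with 5-fold cross-validation, can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: the fold assignment (sample `i` goes to fold `i % n_folds`), the choice `k=3`, and the toy feature vectors are all assumptions.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, n_folds=5, k=3):
    """Fraction of samples classified correctly when each fold is held out once."""
    correct = 0
    for fold in range(n_folds):
        # Assumed fold assignment: sample i belongs to fold i % n_folds.
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct += sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
    return correct / len(X)


# Toy, already-scaled feature vectors (hypothetical); label 1 marks deterioration.
X = [(1.0, 1.2), (5.0, 5.1), (0.8, 1.0), (5.2, 4.9), (1.1, 0.9),
     (4.8, 5.3), (0.9, 1.1), (5.1, 5.0), (1.2, 1.0), (4.9, 4.8)]
y = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
accuracy = cross_val_accuracy(X, y, n_folds=5, k=3)
print(accuracy)
```

With real wearable data the features (steps, sleep, heart rate) sit on very different scales, so they would need to be standardized before computing distances.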
In a Massive Open Online Course (MOOC), predictive models of student behavior can support multiple aspects of learning, including instructor feedback and timely intervention. Ongoing courses, for which student outcomes are not yet known, must rely on models trained from the historical data of previously offered courses. It is possible to transfer models, but they often have poor prediction performance. One reason is that the features inadequately represent predictive attributes common to both courses. We present an automated transductive transfer learning approach that addresses this issue. It relies on a problem-agnostic, temporal organization of the MOOC clickstream data, where, for each student, for multiple courses, a set of specific MOOC event types is expressed for each time unit. It consists of two alternative transfer methods based on representation learning with auto-encoders: a passive approach using transductive principal component analysis and an active approach that uses a correlation alignment loss term. With these methods, we investigate the transferability of dropout prediction across similar and dissimilar MOOCs and compare with known methods. Results show improved model transferability and suggest that the methods are capable of automatically learning a feature representation that expresses common predictive characteristics of MOOCs.
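A correlation alignment (CORAL) loss penalizes the difference between the feature covariance matrices of the source and target courses. The sketch below uses the common formulation, the squared Frobenius norm of the covariance difference scaled by 1/(4d²) for d features; the exact term used in the paper may differ, and the function names and data are illustrative assumptions.

```python
def covariance(rows):
    """Sample covariance matrix (divides by n - 1) of a list of feature vectors."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
         for b in range(d)]
        for a in range(d)
    ]


def coral_loss(source, target):
    """Squared Frobenius distance between covariances, scaled by 1 / (4 d^2)."""
    d = len(source[0])
    cov_s = covariance(source)
    cov_t = covariance(target)
    squared = sum((cov_s[a][b] - cov_t[a][b]) ** 2 for a in range(d) for b in range(d))
    return squared / (4 * d * d)


# Made-up encoded features for a source course and a target course.
source = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
target = [[2 * v for v in row] for row in source]  # same shape, doubled scale
loss_same = coral_loss(source, source)
loss_shifted = coral_loss(source, target)
print(loss_same, loss_shifted)
```

In training, a term like this would be added to the auto-encoder's reconstruction loss so that the learned representation has similar second-order statistics in both courses.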