A New Two-step Ensemble Learning Model for Improving Stress Prediction of Automobile Drivers
May Al-Nashashibi1, Wa’el Hadi2, Nuha El-Khalili3, Ghassan Issa4, and Abed Alkarim AlBanna1
1Computer Science, University of Petra, Jordan
2Information Security, University of Petra, Jordan
3Software Engineering, University of Petra, Jordan
4School of IT, Skyline University, UAE
Abstract: Commuting in heavy traffic congestion is acknowledged as one of the key causes of stress. High levels of stress while driving have a profoundly negative effect on a driver’s actions and abilities, which can result in risks, hazards and accidents. There is therefore a recognized need to determine drivers’ stress levels and to predict the key causes of high stress. This work presents an ensemble machine learning framework for determining the stress levels of drivers. The study also provides a new dataset gathered from 14 drivers while they drove in Amman, Jordan. The data was collected with a wearable biomedical instrument that was attached to the driver and continuously recorded physiological data. The data was categorised into two classes: ‘Yes’, which represents the presence of stress, and ‘No’, which represents its absence. To limit the negative impact of the minority class of driver instances on stress prediction, an oversampling technique was applied. A two-step ensemble classifier was developed by combining the outputs of random forest, decision tree, and Repeated Incremental Pruning to Produce Error Reduction (RIPPER) classifiers, which were then fed into a Multi-Layer Perceptron neural network. The experimental findings show that the proposed framework is considerably more accurate and more scalable than all of the compared classifiers in terms of accuracy, g-mean and sensitivity.
Keywords: Ensemble learning, stress prediction, oversampling, data mining algorithms.
Received June 23, 2020; accepted January 6, 2021
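As a minimal sketch of two ingredients named in the abstract, the following Python illustrates oversampling of the minority class and the g-mean measure. The abstract does not state which oversampling technique was used, so random duplication of minority instances is an assumption here, and the function names are hypothetical. G-mean is taken in its usual sense as the geometric mean of sensitivity and specificity, with ‘Yes’ (stress present) as the positive class.

```python
import math
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many instances as the largest one.
    Assumption: random duplication; the paper's technique may differ."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def sensitivity_specificity(y_true, y_pred, positive="Yes"):
    """Return (sensitivity, specificity) for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec


def g_mean(y_true, y_pred, positive="Yes"):
    """Geometric mean of sensitivity and specificity."""
    sens, spec = sensitivity_specificity(y_true, y_pred, positive)
    return math.sqrt(sens * spec)
```

Unlike plain accuracy, g-mean drops to zero when a classifier ignores one class entirely, which is why it is a common companion to oversampling on imbalanced data.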