Construction of models for predicting arousal level in advance based on features of face images

Open Access
Conference Proceedings
Authors: Yuki Mekata, Miwa Nakanishi

Abstract: Background and Purpose: Sleepiness is a major factor in accidents and errors. Even as system automation advances, users must still monitor the state of the autonomous system or take over in an emergency. Thus, it is important for systems to understand the user’s arousal level and manage it appropriately. In recent years, a method has been proposed for maintaining arousal levels by having artificial intelligence agents converse with the user. This kind of interaction is considered a more natural way to maintain arousal level than conventional stimulation such as a beep sound. We therefore anticipate a system that not only detects sleepiness and responds to it reactively, but also predicts a decrease in arousal level in advance and responds to it proactively. In this study, aiming to realize such a system, we attempt to construct a model that predicts decreasing arousal levels in advance. We think this will lead to the optimization of system interaction. One existing assessment method has trained raters grade a user’s arousal level on a five-level scale based on facial expression. We therefore set out to construct a model that predicts these five levels of arousal from features of face images.
Method: In the experiment to obtain data for model construction, autonomous driving was used as an example of an autonomous system. Participants monitored autonomous driving for an hour. Participants could only change lanes by pressing a button; all other operations were handled by the autonomous system. During the task, participants reported their arousal level on a five-level scale by pressing a button every 30 seconds. In addition, participants’ face images were recorded at 60 Hz. Three male and three female participants took part in the experiments, and each participant completed the task three times.
From the face images, we used two kinds of features for model construction: texture-distribution features from Local Binary Patterns Histogram (LBPH) and embedding features obtained with FaceNet. Using these features and the subjective assessments of arousal level, we used machine learning to construct a model that predicts the future arousal level on the five-level scale.
Results: The output values were less biased when using a neural network than when using a random forest or a support vector machine. For models trained on individual data, the accuracy of predicting the arousal level 30 seconds ahead from the current features was around 35%. Although accuracy decreased as the prediction target moved further from the present, for horizons up to 120 seconds the accuracy remained greater than 30%, and the proportion of cases in which the predicted and actual values deviated by less than one was greater than 60%, which means that large deviations can be suppressed. We expect the model to be usable in a proactive system that predicts decreases in arousal level in advance and responds to them.
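
As an illustration only, the prediction setup and the two reported metrics might be sketched as follows. This is not the authors' code. It assumes one feature vector per 30-second reporting interval (the abstract does not say how the 60 Hz frames are aggregated) and a model that outputs a continuous value, so that "deviation less than one" differs from an exact match. The names `make_horizon_pairs`, `exact_accuracy`, and `within_one_rate` are hypothetical, and the data are placeholders.

```python
LABEL_INTERVAL_S = 30  # participants reported arousal every 30 seconds


def make_horizon_pairs(features, labels, steps_ahead):
    """Pair the feature vector at step t with the label at step t + steps_ahead."""
    if steps_ahead < 1:
        raise ValueError("steps_ahead must be at least 1")
    n = min(len(features), len(labels))
    return [(features[t], labels[t + steps_ahead]) for t in range(n - steps_ahead)]


def exact_accuracy(predicted, actual):
    """Fraction of predictions equal to the reported level after rounding."""
    hits = sum(1 for p, a in zip(predicted, actual) if round(p) == a)
    return hits / len(actual)


def within_one_rate(predicted, actual):
    """Fraction of predictions that deviate from the reported level by less than one."""
    close = sum(1 for p, a in zip(predicted, actual) if abs(p - a) < 1)
    return close / len(actual)


# Placeholder data: eight reporting intervals with made-up levels (1 to 5).
labels = [1, 1, 2, 2, 3, 3, 4, 5]
features = [[float(i)] for i in range(len(labels))]  # stand-in for LBPH/FaceNet features

# Predict 120 seconds ahead, i.e. four reporting intervals.
pairs = make_horizon_pairs(features, labels, steps_ahead=120 // LABEL_INTERVAL_S)

# Made-up model outputs, used only to exercise the metric functions.
predicted = [1.2, 2.8, 3.4, 4.9]
actual = [1, 2, 3, 3]
print(exact_accuracy(predicted, actual), within_one_rate(predicted, actual))
```

Reporting both metrics separates exact matches from near misses: a model can have modest exact accuracy while still rarely being off by a full level or more.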

Keywords: Arousal level, Artificial intelligence agents, Deep learning

DOI: 10.54941/ahfe1002442
