Transforming Elderly Care with Vision-Based Hand Gesture Recognition: A Deep Learning Framework
Open Access
Article
Conference Proceedings
Authors: Seyed Ali Ghorashi, Riazul Islam
Abstract: In today’s ever-evolving healthcare landscape, our objective is to bring technology closer to human experience. In this paper, a vision-based hand gesture recognition (HGR) system is introduced as a human-computer interface (HCI) that transforms natural movements into an intuitive medium for controlling devices and applications. This approach is designed to empower the elderly and enhance patient monitoring by offering a seamless, non-intrusive interface. By bridging compassion with cutting-edge technology, it aims to redefine care and create a more connected, responsive healthcare environment. The study begins by reviewing both traditional and contemporary HGR methodologies, with a focus on vision-based systems, which have shown greater potential for practical applications than conventional sensor-based systems. Traditional approaches, such as contour analysis, colour segmentation, and template matching, have demonstrated clear limitations. In contrast, deep learning approaches have gained widespread adoption in vision-based systems for their ability to learn hierarchical features and to handle complex, dynamic scenarios more effectively. However, existing models struggle to achieve real-time performance because of computational inefficiencies and environmental variation. This study introduces a deep learning-based detection framework that enables the practical application of HGR: for example, it supports gesture-based virtual assistants for interacting with digital devices and allows elderly residents in care homes to give instructions to remote caregivers. The framework consists of two deep learning models, one for region-of-interest (ROI) detection and another for classification. The recognition process begins with capturing RGB images, chosen for their versatility, affordability, and compatibility, which are then passed through YOLOv8 (You Only Look Once, version 8) for detection. The YOLOv8 model is trained on a subset (only 4.5%) of the HaGRID dataset, which comprises over 550,000 RGB images, to accurately locate hand regions. Once detection is complete, several image-processing techniques are applied to overcome typical HGR constraints: ROI extraction and grayscale conversion for faster computation, resizing for model optimization and distance variability, HSV conversion for lighting independence, and histogram equalization for enhanced feature extraction. The processed image is then fed into a convolutional neural network (CNN) for gesture classification. The network is designed with several convolutional layers, followed by dense layers with ReLU activation to introduce non-linearity and a softmax output layer for classification. To improve the model's robustness, data augmentation techniques such as random flipping and rotation are applied. The HGR system achieves 97% accuracy in both training and validation. The proposed framework effectively addresses the limitations of conventional HGR systems, many of which have been developed and tested only in controlled environments, and demonstrates its applicability in real-life situations.
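As an illustration of the detection stage described above, the following minimal sketch shows how a YOLOv8 model might be invoked to locate and crop the hand ROI. It assumes the ultralytics Python package; the weights file yolov8n_hand.pt and the confidence threshold are hypothetical placeholders, not artifacts released by the authors.

import cv2
from ultralytics import YOLO

# Hypothetical YOLOv8 weights fine-tuned for hand detection (assumption).
detector = YOLO("yolov8n_hand.pt")

frame = cv2.imread("frame.jpg")      # a single camera frame (OpenCV loads BGR)
results = detector(frame, conf=0.5)  # detect hands above an assumed threshold

boxes = results[0].boxes
if len(boxes) > 0:
    # take the first (highest-confidence) box and crop the hand ROI
    x1, y1, x2, y2 = map(int, boxes.xyxy[0].tolist())
    roi = frame[y1:y2, x1:x2]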
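The abstract names the preprocessing steps but not their order or parameters. The sketch below applies them in one plausible sequence using OpenCV; the 64x64 target size is an illustrative assumption, and using the V (value) channel of the HSV image as the single-channel map is one way to combine the stated HSV conversion and grayscaling before histogram equalization.

import cv2

def preprocess(roi, size=(64, 64)):
    # resize to the classifier's input size; also normalizes for camera distance
    roi = cv2.resize(roi, size)
    # HSV conversion decouples intensity from colour for lighting independence
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    # the V channel serves as a single-channel (grayscale) image
    v = hsv[:, :, 2]
    # histogram equalization enhances contrast for feature extraction
    return cv2.equalizeHist(v)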
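The classifier's exact depth and filter counts are not given in the abstract. The following Keras sketch instantiates the stated ingredients, several convolutional layers, dense layers with ReLU, a softmax output, and random flip/rotation augmentation, with illustrative sizes; the class count NUM_GESTURES is an assumption.

import tensorflow as tf
from tensorflow.keras import layers, models

NUM_GESTURES = 10  # assumption: the abstract does not state the class count

model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),
    layers.RandomFlip("horizontal"),   # augmentation: random flipping
    layers.RandomRotation(0.1),        # augmentation: random rotation
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),              # dense ReLU layer
    layers.Dense(NUM_GESTURES, activation="softmax"),  # gesture probabilities
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

Because Keras preprocessing layers are active only during training, the same model object can be used unchanged at inference time.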
Keywords: Artificial Intelligence, Machine Learning, Healthcare.
DOI: 10.54941/ahfe1005969