A Computer-Vision Approach to Accessible Robot Control: Hand Gesture Recognition for Users With Limited Mobility or Speech

Amin Majd; Mehdi Asadi; Juha Kalliovaara

doi:10.54941/ahfe1007677

AHFE International

Open Access

Article

Conference Proceedings

Abstract

Human–robot interaction increasingly demands intuitive, efficient, and accessible control mechanisms, particularly for users with physical or communication disabilities. Traditional interfaces—such as joysticks, keyboards, or voice commands—often impose significant cognitive or physical effort and may be unusable for individuals with impaired speech, hearing, or motor abilities. Recent advances in artificial intelligence and computer vision offer promising alternatives by enabling robots and autonomous systems to interpret human intentions directly from visual cues. This paper introduces a vision-based control framework that allows users to operate an autonomous drone through predefined hand gestures without any physical contact with a controller. The proposed system integrates real-time computer vision with control-system engineering to translate finger poses captured by a camera into actionable navigation commands. Our method employs PoseNet for robust hand-keypoint detection, combined with a custom gesture-classification module optimized for low-latency inference. The generated gesture classes are mapped to drone control instructions, enabling tasks such as takeoff, landing, directional movement, and hovering. The development process involved coordinated work across three subsystems: (1) Data Labeling, including dataset creation and annotation using CVAT and MATLAB; (2) Robot Interface and Connectivity, focusing on reliable communication between the vision module and the drone’s flight controller; and (3) AI Model Development, comprising model selection, training, and optimization using Python, OpenCV, TensorFlow, and Google Colab. Although the project encountered initial technical and organizational challenges, the iterative development cycle ultimately led to a stable, functional prototype. 
Experimental results demonstrate that the system can accurately recognize gesture commands in real time and maintain responsive drone control under various lighting and background conditions. The achieved performance highlights the feasibility of replacing traditional physical controllers with AI-driven gesture interfaces, providing an accessible alternative for users who cannot operate conventional input devices. Overall, this work contributes a practical and innovative solution for enhancing human–robot interaction through contact-free control. The presented framework has potential applications not only in assistive technologies but also in fields such as rescue operations, manufacturing, and interactive robotics, where intuitive and hands-free control is advantageous. The project also offered valuable interdisciplinary experience in computer vision, robotics, and software engineering, demonstrating the effectiveness of merging AI-based perception with control-system design.
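The step in which per-frame gesture classes are mapped to drone control instructions can be illustrated with a minimal Python sketch. It is not the authors' implementation. The gesture labels, the `Command` names, the `GestureCommandMapper` class, and the window, vote, and confidence thresholds are all hypothetical, since the abstract does not list the gesture set or any tuning values. The sketch assumes an upstream classifier that emits one `(label, confidence)` pair per camera frame, and it falls back to hovering whenever recent frames do not agree on a gesture.

```python
from collections import Counter, deque
from enum import Enum


class Command(Enum):
    """Drone actions named in the abstract; the string values are illustrative."""
    TAKEOFF = "takeoff"
    LAND = "land"
    FORWARD = "forward"
    BACK = "back"
    LEFT = "left"
    RIGHT = "right"
    HOVER = "hover"


# Hypothetical gesture vocabulary: the abstract does not specify the real one.
GESTURE_TO_COMMAND = {
    "thumbs_up": Command.TAKEOFF,
    "fist": Command.LAND,
    "open_palm": Command.HOVER,
    "point_up": Command.FORWARD,
    "point_down": Command.BACK,
    "point_left": Command.LEFT,
    "point_right": Command.RIGHT,
}


class GestureCommandMapper:
    """Turns noisy per-frame predictions into stable drone commands."""

    def __init__(self, window=5, min_votes=4, min_confidence=0.8):
        # All three thresholds are assumed values, chosen only for illustration.
        self.min_votes = min_votes
        self.min_confidence = min_confidence
        self._recent = deque(maxlen=window)  # sliding window of recent labels

    def update(self, label, confidence):
        """Record one frame's prediction and return the command to send."""
        usable = confidence >= self.min_confidence and label in GESTURE_TO_COMMAND
        # Unknown or low-confidence frames still occupy a slot, as None.
        self._recent.append(label if usable else None)

        votes = Counter(item for item in self._recent if item is not None)
        if votes:
            best, count = votes.most_common(1)[0]
            if count >= self.min_votes:
                return GESTURE_TO_COMMAND[best]
        # Safe default: hold position when no gesture is clearly recognized.
        return Command.HOVER
```

Calling `update` once per frame yields `Command.HOVER` until enough recent frames agree on a single gesture. Requiring several consistent frames adds a small delay, but it reduces the chance that one misclassified frame triggers a takeoff or landing.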

Keywords: Human And Robot Interaction, Computer Vision, Artificial Intelligence

Volume: Human Factors in Robots, Drones and Unmanned Systems