AI-powered real-time analysis of human activity in videos via smartphone.
Abstract
A major focus of computer vision research is the recognition of human activity from the visual information in audiovisual data using artificial intelligence. Researchers are currently exploring image-based approaches that use 3D CNNs, RNNs, or hybrid models to learn multiple levels of representation and abstraction, enabling fully automated feature extraction and activity analysis. These architectures, however, require powerful hardware to approach real-time processing, which makes them difficult to deploy on smartphones. At the same time, a growing share of video is recorded on smartphones, so classifying and tagging the activities performed while recording is still in progress would be useful in many settings, for example checking that motion sequences are performed correctly in the sports and health sector, or monitoring security-relevant environments (e.g., demonstrations, festivals) with automated alerting. Doing so requires a system architecture efficient enough to perform real-time analysis despite limited hardware. This paper addresses skeleton-based activity recognition on smartphones, in which the motion vectors of detected skeleton points, rather than pixel-based information, are analyzed for their spatial and temporal characteristics. The 3D skeleton points of a detected person are extracted using the AR framework integrated in the operating system, and their motion data is analyzed in real time by a custom-trained RNN. Because this approach is purely numerical, it enables time-efficient real-time processing and activity classification. The resulting system can detect a person in a live video stream recorded with a smartphone and classify the activity performed.
Successful deployment of the system in several field tests shows both that the described approach works in principle and that it can be transferred to a resource-constrained mobile environment.
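The pipeline summarized in the abstract (skeleton points, then motion vectors, then a recurrent classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. The joint count, the activity labels, the synthetic input frames, and the `TinyRNN` class are all hypothetical, and the weights are random rather than trained, so the output probabilities carry no meaning. The sketch only shows how purely numerical skeleton data, rather than pixels, could flow through such a model.

```python
import math
import random


def motion_vectors(frames):
    """Frame-to-frame displacement of every joint.

    frames: list of frames, each a list of (x, y, z) joint positions.
    Returns len(frames) - 1 flat feature vectors of length 3 * num_joints.
    """
    features = []
    for prev, curr in zip(frames, frames[1:]):
        step = []
        for (px, py, pz), (cx, cy, cz) in zip(prev, curr):
            step.extend((cx - px, cy - py, cz - pz))
        features.append(step)
    return features


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


class TinyRNN:
    """Hypothetical Elman-style recurrent classifier with untrained weights."""

    def __init__(self, input_size, hidden_size, num_classes, seed=0):
        rng = random.Random(seed)
        self.w_in = random_matrix(hidden_size, input_size, rng)
        self.w_rec = random_matrix(hidden_size, hidden_size, rng)
        self.w_out = random_matrix(num_classes, hidden_size, rng)
        self.hidden_size = hidden_size

    def classify(self, sequence):
        """Run the sequence through the recurrence, return class probabilities."""
        hidden = [0.0] * self.hidden_size
        for x in sequence:
            from_input = matvec(self.w_in, x)
            from_state = matvec(self.w_rec, hidden)
            hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
        return softmax(matvec(self.w_out, hidden))


# Assumed values for illustration only; the paper does not state them.
NUM_JOINTS = 4
LABELS = ["walking", "jumping", "standing"]

# Synthetic joint positions standing in for what an AR framework would supply.
frames = [[(0.01 * t * j, 0.02 * t, 0.0) for j in range(NUM_JOINTS)]
          for t in range(10)]

features = motion_vectors(frames)
model = TinyRNN(input_size=3 * NUM_JOINTS, hidden_size=8,
                num_classes=len(LABELS))
probabilities = model.classify(features)
```

Each time step here costs a few small matrix-vector products over a few dozen numbers, which is consistent with the abstract's point that a numerical skeleton representation is far cheaper to process than full image frames.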
Keywords: artificial intelligence system, computer vision
DOI: 10.54941/ahfe1003972
AHFE Open Access