Automated Ergonomic Problem and Solution Identification from Videos with a Knowledge-Retrieving Large Multimodal Model
Abstract
Workers across diverse industries experience a high prevalence of work-related musculoskeletal disorders (WMSDs), the leading cause of non-fatal injuries. To mitigate WMSDs, it is crucial for ergonomic experts to identify ergonomic problems and solutions. However, such manual identification is time-consuming and resource-intensive, which makes it difficult to implement across diverse workplaces and highlights the need for accessible tools for on-site personnel. Recent large multimodal models (LMMs) have shown potential for identifying ergonomic problems and solutions, owing to their strong scene understanding capabilities. However, LMMs often generate plausible but incorrect information, known as hallucination, particularly when handling long-context video inputs. This limitation is critical because workers without ergonomic expertise cannot verify the correctness of the identified problems and solutions, which can lead to ineffective or even harmful interventions. To address this, we aim to automatically identify ergonomic problems and solutions from videos, supported by guideline-based evidence, by applying an ergonomic knowledge-retrieving LMM. We developed an ergonomic knowledge retrieval pipeline that enables the LMM to retrieve ergonomic knowledge from a knowledge graph and ground its predictions in that knowledge. To evaluate the correctness of the identification and the relevance of the retrieved knowledge, we used accuracy and context precision, respectively, as evaluation metrics. Evaluation on 25 real-world workplace videos yielded an accuracy and context precision of 0.80, outperforming a state-of-the-art LMM. Our results highlight the importance of integrating ergonomic knowledge into LMMs for identifying ergonomic problems and solutions. Our knowledge-retrieving LMM automates ergonomic problem and solution identification grounded in verified knowledge, helping reduce WMSDs through easier, broader adoption.
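The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not say how either metric was computed, so everything here is an assumption: accuracy is taken as the fraction of videos whose prediction matches an expert label, and context precision is taken in the rank-weighted form commonly used in retrieval-augmented generation evaluation (mean of precision@k over the ranks that hold a relevant item). The function names and inputs are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of items whose predicted label equals the reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not predictions:
        raise ValueError("at least one prediction is required")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(predictions)


def context_precision(relevance: Sequence[bool]) -> float:
    """Rank-weighted precision for one ranked list of retrieved items.

    relevance[k] is True when the item at rank k + 1 was judged relevant.
    This is an assumed definition; the paper may use a different one.
    Returns 0.0 when no retrieved item is relevant.
    """
    hits = 0
    weighted_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            # precision@rank, accumulated only at ranks holding a relevant item
            weighted_sum += hits / rank
    return weighted_sum / hits if hits else 0.0
```

Under this assumed definition, a retrieval that ranks relevant guideline passages ahead of irrelevant ones scores higher than one that returns the same passages in a worse order, which is why it is a plausible measure of the relevance of retrieved knowledge.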
Keywords: Ergonomic Problem and Solution Identification, Large Multimodal Model, Knowledge Retrieval
DOI: 10.54941/ahfe1007792

AHFE Open Access