Human Interaction and Emerging Technologies (IHIET-FS 2025): Future Systems and Artificial Intelligence Applications


Editors: Tareq Ahram, Andrew Arewa, Seyed Ali Ghorashi

Topics: Future Systems and Artificial Intelligence Applications

Publication Date: 2025

ISBN: 978-1-964867-72-4

DOI: 10.54941/ahfe1005948

Articles

Enabling the Transfer of Large Files Across Security Domains in a Multinational Environment

In an environment in which data has to be transferred between different applications, nations and domains, various data security risks exist. This has particularly serious consequences if security-relevant data and information are processed. In this paper, we focus on the transfer of large files between several applications across security domains in the military sector. The exchange of data in a military multinational environment is particularly challenging because it is regulated by predefined architectures, concepts and technologies. Coalition Shared Data (defined by STANAG (Standardization Agreement) 4559) specifies services, interfaces and data models to exchange ISR (intelligence, surveillance and reconnaissance) information within a coalition. STANAG 4774 and STANAG 4778 as well as STANAG 5636 define essential (security) metadata and the (security) labeling of data to enable role- and security-based data management. We examined the exchange of large files between different security domains that include Coalition Shared Data services linked by a dedicated labeling service and a security gateway based on STANAG 4774 and STANAG 4778. Accredited security gateways supporting data exchange often come with limitations, e.g., on the file size that can be exchanged. Based on the existing systems, processes and requirements, a concept for the transfer of large files was developed, taking into account the technical and organizational requirements and constraints. Initial tests of the new approach were carried out in a laboratory demonstrator and demonstrated its fundamental functionality.
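
As an illustration of the kind of chunked transfer such a concept implies, the following Python sketch splits a file into parts below a hypothetical gateway size limit and records per-chunk hashes so the receiving domain can verify and reassemble the original. The size limit, file-naming scheme, and manifest format are assumptions for illustration only and are not taken from the STANAGs or from the authors' demonstrator.

    import hashlib
    from pathlib import Path

    CHUNK_LIMIT = 50 * 1024 * 1024  # hypothetical per-file limit enforced by the gateway (50 MB)

    def split_file(path: Path, out_dir: Path, limit: int = CHUNK_LIMIT) -> list[dict]:
        """Split a large file into gateway-compatible chunks and record their hashes."""
        out_dir.mkdir(parents=True, exist_ok=True)
        manifest = []
        with path.open("rb") as src:
            index = 0
            while chunk := src.read(limit):
                part = out_dir / f"{path.name}.part{index:04d}"
                part.write_bytes(chunk)
                manifest.append({"part": part.name, "sha256": hashlib.sha256(chunk).hexdigest()})
                index += 1
        return manifest

    def reassemble(manifest: list[dict], parts_dir: Path, target: Path) -> None:
        """Re-join chunks in order, verifying each hash before appending."""
        with target.open("wb") as dst:
            for entry in manifest:
                data = (parts_dir / entry["part"]).read_bytes()
                assert hashlib.sha256(data).hexdigest() == entry["sha256"], "chunk corrupted in transit"
                dst.write(data)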

Lorraine Hagemann, Philipp Klotz, Dennis Gießel
Open Access
Article
Conference Proceedings

Defining Autonomous Weapon Systems: A Conceptual Overview of Existing Definitory Attempts and the Effectiveness of Human Oversight

It is only “natural that proponents and opponents of Autonomous Weapon Systems (AWS) will seek to establish a definition that serves their aims and interests. The definitional discussion will not be a value-neutral discussion of facts, but ultimately one driven by political and strategic motivations” (UNIDIR, 2017). Though somewhat of a sullen statement, conceptually echoing the self-serving, outcome-oriented nature of ‘conflict’ espoused by the proverb ‘all is fair in love and war’ (attributed to the poet John Lyly), the aforementioned stipulation by the United Nations Institute for Disarmament Research (UNIDIR), coupled with the belief that some States are reluctant to engage in a broader definitional exercise, has recently proven rather prophetic. In effect, the concepts above have somewhat accurately described the existing state of the discourse surrounding the technology in question. Specifically, the divided stance on definitions can be evidenced in a wide array of publications. These include the 2024 ‘collation of responses’ report compiled by the Implementation Support Unit of the Convention on Certain Conventional Weapons (UNODA), the UNODA 2023 report, which focused specifically on individual definitions and characterisations from multiple countries, as well as results from domestic inquiries (such as those arising from the House of Lords AI in Weapon Systems Committee Report and the subsequent UK Government response). The latter two documents provide an informative snapshot as to the potential reasoning behind this discourse, underscoring the initial quotation’s poignancy. Specifically, the House of Lords (2023) Committee acknowledged the United Kingdom’s lack of an operational definition for Autonomous Weapon Systems, alongside the challenges such an absence may pose regarding regulation. Yet, the UK Government’s response stipulated that while it respected the general arguments put forward by the Committee, it did not intend to adopt an official or operative definition for AWS. The reasoning? Definitions are typically an aid to policy making and may serve as a starting point for a new legal instrument prohibiting certain types of systems, which would represent a threat to UK defence interests (Ministry of Defence, 2024). This stance, albeit not unique to the UK, aptly demonstrates that the suggested apprehension regarding the adverse effects of adopting a wider definition, insofar as it brings the legitimacy or legality of the encompassed technologies into question (UNIDIR, 2017), seems to have retained its prevalence. Irrespective of the explicit or implicit recognition of the extent to which International Humanitarian Law (IHL) should apply with respect to the potential use of weapons systems based on emerging technologies (UNIDIR, 2023), namely adherence to the principles of Distinction, Proportionality and Precaution in attack, it remains necessary to conduct an inquiry and ascertain what a value-neutral definition of AWS can be. Consequently, the effectiveness of human oversight must also be reviewed. Not only does the human element operate as an integral thematic component of existing definitions, but the value of its position as a prerequisite for adherence also needs to be ascertained.

Mark Tsagas
Open Access
Article
Conference Proceedings

Exploring the Effect of Wearable Digital Devices (WDDs) on Adverse Occupational Health and Safety Practices of High-Risk Workers

Globally, workers in high-risk industries are often exposed to hazards with devastating effects, leading to occupational health infections, injuries, and fatalities. Despite the advent of Wearable Digital Devices (WDDs), contemporary research examining their influence on the health and safety practices of high-risk industry workers is inadequate. Aim: The study explores the influence of wearable digital devices on managing adverse occupational health and safety practices among workers in high-risk industries. Research question: Does the use of wearable digital devices influence safety practices among high-risk industry workers? Methodology: A mixed (Quan+Qual) research method was followed for a holistic approach to the study’s variables. In addition, semi-structured interviews with senior managers and supervisors were conducted. Quantitative data were analysed using Microsoft Excel, and qualitative data through thematic analysis. Findings: 60% of the study’s participants agreed that WDDs such as smartwatches, digital helmets, and airbag vests are critical to managing the prevalence of adverse safety practices among workers on high-risk projects, although comprehensive utilisation of WDDs is expected to place financial pressure on high-risk industries. Conclusion: The study revealed that despite the relevance and importance of WDDs in occupational health and safety management, workers' habits and practices may limit their effectiveness in curbing adverse safety incidents. Also, due to cost implications, small high-risk companies may not be able to afford PPE with digital functionalities. Thus, novel non-technological approaches, such as behaviour-based training, are recommended. This is a supplementary study and part of ongoing PhD research that seeks to develop a conceptual framework for managing dysfunctional safety practices in high-risk industries.

Oluwafemi Olatoye, Andrew Arewa
Open Access
Article
Conference Proceedings

Evaluating the Effectiveness of Machine Learning Algorithms in Stock Price Prediction Across Different Time Frames

Financial markets, characterized by their volatility, uncertainty, complexity, and ambiguity (VUCA), pose significant challenges for accurate predictions. Investment has become increasingly intertwined with technological advancements, as machine learning models revolutionise the field of stock market trend predictions, offering potential solutions by processing large datasets, identifying trends, and minimizing human bias. While machine learning is increasingly applied in financial forecasting, understanding the relative strengths and weaknesses of different algorithms across varying time frames remains underexplored. This is especially relevant given the rise of algorithmic trading and new markets such as cryptocurrencies, underscoring the need for precise, data-driven predictions. The aim of this study is to evaluate the performance of machine learning algorithms in predicting stock prices within Singapore’s banking sector. The study explores how each algorithm performs when trained on different amounts of data, comparing their effectiveness for short-term, mid-term and long-term stock price predictions. To do this, historical stock prices were collected using the Yahoo Finance API, focusing on closing prices as the target variable. Using the data collected from major Singaporean banks, namely DBS, OCBC and UOB, this study evaluated the performance of various machine learning algorithms: Random Forest (RF), Support Vector Regression (SVR), K-Nearest Neighbors (KNN), Artificial Neural Networks (ANN), and Long Short-Term Memory (LSTM). The models were all trained on different datasets, and their predictions for the closing price on a specific date were recorded. Each model was evaluated using rigorous performance metrics, including percentage error, R² values, mean absolute error, and mean squared error, to determine their efficacy in capturing trends and minimising predictive inaccuracies. Different algorithms have distinct methods of learning patterns and handling data variability, and thus will perform differently under the same conditions. Hence, we hoped to gain greater insight into each model’s performance and assess their adaptability to the various time frames. This study contributes to the growing body of research on AI-driven financial forecasting by providing a comparative analysis of machine learning algorithms in Singapore’s banking sector. It highlights the need for flexibility in one’s approach to algorithmic trading to enhance prediction accuracy across diverse scenarios. The insights gained can aid financial analysts, traders, and decision-makers in developing data-driven strategies for stock market investments, ultimately promoting more informed decision-making and risk management in a volatile financial landscape.
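
A minimal sketch of the kind of pipeline the study describes, using one of the banks as an example: daily closing prices are pulled from Yahoo Finance, turned into lagged windows, and fed to a Random Forest scored with MAE, MSE, and R². The ticker symbol (D05.SI for DBS), window length, and split ratio are illustrative assumptions, not the paper's exact configuration.

    import numpy as np
    import yfinance as yf
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

    # Daily closing prices for one of the banks studied (DBS trades as D05.SI on Yahoo Finance).
    close = yf.download("D05.SI", start="2015-01-01", end="2024-12-31")["Close"].dropna().squeeze().to_numpy()

    # Turn the series into supervised examples: 30 lagged closes -> next day's close.
    window = 30
    X = np.array([close[i:i + window] for i in range(len(close) - window)])
    y = close[window:]

    # Chronological split so the test period lies strictly after the training period.
    split = int(len(X) * 0.8)
    X_train, X_test, y_train, y_test = X[:split], X[split:], y[:split], y[split:]

    model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
    pred = model.predict(X_test)

    print("MAE:", mean_absolute_error(y_test, pred))
    print("MSE:", mean_squared_error(y_test, pred))
    print("R² :", r2_score(y_test, pred))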

Kenneth Y T Lim, Amy Low, Isabella Lim
Open Access
Article
Conference Proceedings

Enhancing the Viability of Battery-Electric Trucks in Long-Distance Freight Transport: Assessing the User Acceptance of Overhead Line Technology

The impacts of climate change are becoming increasingly noticeable, highlighting the need for the transport sector to minimize CO₂ emissions. Battery-electric trucks (BEVs) offer a promising solution for reducing emissions in heavy-duty road transport. However, their limited range and long charging times reduce their overall attractiveness and usability. Dynamic charging via overhead line technology addresses these challenges by enabling trucks to additionally charge while driving, extending their range and reducing reliance on stationary charging. The “BEV Goes eHighway” (BEE) project investigates user acceptance and technical feasibility of this technology, focusing on retrofitting pantographs to existing BEVs. To integrate perspectives from decision-makers and users, an expert survey (N = 12 logistics specialists) and a pilot field study (N = 7 truck drivers) were conducted. Over 80% of experts supported integrating pantograph-equipped trucks into their fleets, with purchase price, operating costs, and maintenance costs being key factors. The field study tested two pantograph-equipped trucks—one battery-electric and one hybrid—revealing overall ease of use, but also optimization potential. Challenges include pantograph connection in poor weather and increased cognitive workload due to precise lane keeping. Users suggested improvements such as auditory and visual feedback and automated pantograph control. The results emphasize the dependence of a successful implementation on technological and infrastructural advancements as well as user acceptance. Future efforts should focus on improving pantograph reliability, automating key processes, and expanding field studies to validate scalability and usability.

Lotte Wagner-Douglas, Emma Höfer, Regina Linke, Stefan Ladwig
Open Access
Article
Conference Proceedings

Cognitive Science and Information Technologies in Team Sports: Enhancing Performance and Safety

This research and development program leverages the integration of cognitive science, experimental psychology, and advanced technologies to enhance performance and safety in team sports. Through the development and validation of three innovative applications—TAKTIK, SENIC, and ENTOURAGE—the program demonstrates the potential for interdisciplinary approaches to address critical challenges in tactical learning, injury detection, and concussion management. TAKTIK is a gamified, AI-driven playbook designed to support players in learning and retaining complex tactical formations in football. By generating dynamic, cognitive exercises tailored to individual response accuracy and speed, TAKTIK provides actionable feedback that improves engagement and comprehension of game strategies. A preliminary study involving high school and university football teams indicates significant enhancements in players’ learning processes, suggesting that TAKTIK effectively meets the cognitive demands of team sports athletes. SENIC (ENgaging and Immersive Cognitive Simulation) aims to advance concussion management by embedding cognitive tasks within sport-specific contexts to add dimensions of face validity and ecological validity. SENIC is a computer-based assessment that measures processing speed – reaction time to identify a change in ball (or puck) possession between on-screen players in video sequences of the team sport played (e.g., football, soccer, rugby, hockey, basketball) – and smooth pursuit through an external eye-tracking system. These dynamic indicators of cognitive functioning provide a comprehensive assessment of post-concussion impairments. Initial validation studies comparing SENIC to established tools, such as the ImPACT test, reveal promising evidence for its sensitivity and reliability in supporting return-to-play decisions, particularly in fast-paced team sports. ENTOURAGE builds on the insights gained from SENIC to extend concussion management beyond evaluation, focusing on education and decision-making support for athletes, parents, and coaches. By offering real-time insights and integrating seamlessly with SENIC data, ENTOURAGE empowers community stakeholders with the tools necessary for informed decision-making in stressful situations, such as managing potential concussions during games or practices. The complementary design of SENIC and ENTOURAGE reflects a unified framework aimed at democratizing access to effective concussion management tools in team sports. The overarching program relies on a shared approach across these tools, characterized by the use of artificial intelligence, gamification, and user-centered design to enhance cognitive engagement and ecological validity. Each application addresses the specific cognitive and practical challenges of team sports, emphasizing the importance of adaptive, contextually relevant solutions for both performance optimization and safety enhancement. By combining interdisciplinary research with advanced information technology, our program underscores the transformative potential of cognitive science in addressing complex challenges in team sports.

Sebastien Tremblay, Carolane Croteau, Mireille Patry, Helen Hodgetts, Cindy Chamberland
Open Access
Article
Conference Proceedings

Artificial Intelligence as Self-Instantiated, Temporally Continuous, Disturbance-Driven Adaptive World-Builder

Consciousness remains one of the most elusive features to replicate in artificial agents. This paper proposes a novel framework for artificial consciousness based on four integrative pillars: (1) self-instantiation, a mechanism for continuous self-representation and identity; (2) temporal continuity, preserving an internal narrative through persistent memory; (3) disturbance-driven adaptation, an intrinsic feedback loop that triggers learning in response to surprises or anomalies; and (4) autonomous world-building, the ability to construct and simulate internal models of the world. We propose that current AI models, despite their sophistication, are fundamentally constrained by functionalist architectures and cannot fulfill these requirements through computational scaling alone. Unlike Integrated Information Theory or Global Workspace Theory, our approach emphasizes the necessity of autonomous world-building and genuine temporal flow. Our experiments demonstrate that combining these pillars can yield emergent conscious-like behaviors in AI systems, allowing them to exhibit self-awareness, resilience, and creative problem solving beyond the capabilities of conventional models. The significance of this framework lies in bridging theoretical foundations of consciousness with practical AI design, providing a roadmap for developing more adaptive and interpretable intelligent agents while raising important ethical considerations about the potential moral status of truly conscious artificial systems.

Manuel Delaflor Rodriguez, Cecilia Delgado Solorzano, Carlos Toxtli
Open Access
Article
Conference Proceedings

Knowledge Evolution and Scientific Breakthroughs triggered by AI Hallucinations - A Paradigm Shift?

The interdisciplinary impact of artificial intelligence (AI) in science has been especially emphasized by the fact that both the 2024 Nobel Prizes in Physics and in Chemistry were awarded for pioneering research whose results are decisively based on artificial neural networks. The core of this outstanding achievement in chemistry has been described as capturing a full computational understanding of living matter at the atomic level (Abriata, 2024). An interesting detail behind this highly acclaimed success is that one of the laureates had praised AI hallucinations as the designers of de novo proteins (Anishchenko, 2021). AI hallucinations are defined as incorrect or misleading results, usually produced by models implementing generative AI. Hallucinating AI systems are particularly associated with large language models, chat bots and computer vision tools, and their occasionally nonsensical or altogether inaccurate outputs can be welcome in domains such as imaginary and visionary art, but they can have significant negative consequences in practical applications. AI systems lack human wisdom. They do not solve problems via understanding context or using ideas of their own. They work with predefined inputs and, in the case of generative AI, they generate new patterns, some of which may deviate from the knowledge implemented in the algorithm or even defy the wisdom of the algorithm designer. Still, they can prove to be compatible with reality, as is the case with de novo proteins. AI hallucinations could then be viewed as glimpses into a future yet to be created, for instance when introducing man-made proteins and organisms into the existing biosphere. Epistemological questions arising from the perspective that creative mistakes of AI can promote science more effectively than human ideas will be discussed. Possible risks connected to a rapid application of in silico results in structural biology, created mostly with machine learning, will also be considered.

Anastasia-Maria Leventi-Peetz, Nikolaos Zacharis
Open Access
Article
Conference Proceedings

Effectiveness of Knowledge Models for Visual Object Detection

There are various effective image processing methods for detecting moving objects in video, such as background subtraction, optical flow, and edge extraction. Some techniques, such as image difference detection, can detect targets only a few pixels in size. On the other hand, these methods do not recognize or distinguish the target itself. Image recognition based on machine learning can also detect objects in video; however, if the target does not contain enough pixels, it is difficult to find it by image recognition. In this study, we developed a mechanism for automatically detecting aircraft approaching an airport using visual cameras. When an aircraft is flying toward the airport to land, it appears in the image as a small dot that does not carry enough pixel information to be identified by image recognition techniques. A tower controller, however, can identify whether a small dot is an aircraft or not under similar conditions, because air traffic controllers make their decisions based on their operational knowledge and experience. Therefore, we aim to model the knowledge and experience of air traffic controllers explicitly, as rules for high-accuracy target detection. To detect and track a specific object continuously, a further mechanism is needed for identifying the object. In this study, we propose a rule-based model of air traffic controllers' operational knowledge for detecting the target in each video frame. We present techniques for the detection and selection of target aircraft that appear as small dots in the image, and then discuss the results of our validation of the effectiveness of the proposed approach.
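
A rough sketch of how such a rule-based filter might sit on top of standard motion detection: background subtraction proposes small moving blobs, and simple rules (blob size in pixels, position relative to an assumed horizon) keep only plausible aircraft candidates. The file name, thresholds, and rules below are illustrative placeholders, not the controllers' actual operational knowledge encoded in the paper.

    import cv2

    cap = cv2.VideoCapture("approach_camera.mp4")  # hypothetical tower-camera recording
    subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)

    ok, frame = cap.read()
    while ok:
        mask = subtractor.apply(frame)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        candidates = []
        for c in contours:
            x, y, w, h = cv2.boundingRect(c)
            area = cv2.contourArea(c)
            # Rule 1: a distant aircraft occupies only a handful of pixels.
            small_enough = 1 <= area <= 50
            # Rule 2: it should appear above the horizon, assumed here at 60% of frame height.
            above_horizon = y < 0.6 * frame.shape[0]
            if small_enough and above_horizon:
                candidates.append((x, y, w, h))
        # Candidates that persist and drift toward the runway over consecutive frames
        # would be accepted by further rules (not shown here).
        ok, frame = cap.read()
    cap.release()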

Satoru Inoue, Mark Brown, Taro Kanno
Open Access
Article
Conference Proceedings

Talent Development and Retention in Industry 4.0: Strategy to Overcome Talent Challenges in VUCA Environments and Drive Digital Transformation with Agility

Technologies have significantly changed the demands of work and the skills needed for the future. Many companies have migrated towards automated processes; however, it is crucial to strengthen professional growth from the formative stage, developing skills that complement industrial digitization. In countries such as Mexico, the lack of specialized talent represents a major obstacle to progress in digital transformation. This article analyzes an experience in a Mexican university in which an agile approach was implemented in the management of projects developed by students of different engineering degrees. The comparison between the use of predictive and agile methodologies showed significant improvements in academic performance and the quality of the proposed solutions. In VUCA contexts, where uncertainty and complexity are constant, agile approaches are not only relevant but necessary to train talent prepared for a globalized and constantly evolving labor market.

Gabriela Guadalupe Reyes Zárate, Eduardo Arturo Garzón Garnica
Open Access
Article
Conference Proceedings

Architectural Analysis of RFID Integration in Medical Device Logistics: A Healthcare Information Systems Study

This study presents a comprehensive architectural framework for implementing Radio Frequency Identification (RFID) technology in medical device logistics, addressing the unique challenges of healthcare supply chain management. The healthcare sector's stringent requirements for device tracking, sterilization protocols, and quality assurance measures create a complex environment where traditional RFID implementation approaches often prove insufficient. Through a detailed case study of a third-party logistics provider managing two distinct medical device accounts, we analyze the technical, operational, and financial implications of integrating RFID technology within existing healthcare information systems. The research employs a mixed-methods approach, combining qualitative analysis of stakeholder interviews with quantitative assessment of system performance metrics to develop a multi-layered architectural solution. The proposed framework introduces a novel five-layer architecture that integrates artificial intelligence capabilities with traditional RFID infrastructure, incorporating physical infrastructure, data processing, integration, application, and AI layers. This design addresses healthcare-specific challenges including sterilization requirements, regulatory compliance, and bidirectional inventory flow. Our implementation analysis reveals potential processing time reductions of up to 75% under optimal conditions, with projected annual cost savings ranging from $45,000 to $75,000. The system significantly improves inventory management efficiency, reducing annual audit completion time by 87.53% and decreasing tracking errors by 95%. The study contributes to the theoretical understanding and practical implementation of healthcare information systems by providing detailed architectural specifications and strategies. The hybrid system approach, combining RFID, Direct Part Marking (DPM), and traditional barcoding technologies, demonstrates superior reliability and cost-effectiveness, with expected break-even periods ranging from 2.1 to 2.9 years. Risk analysis identifies key challenges in technology integration, staff training, and system maintenance while proposing specific mitigation strategies for healthcare environments. The implementation framework includes comprehensive guidelines for managing the transition period, staff training requirements of 20–40 hours per person, and strategies for minimizing operational disruption during the 2–4-week deployment phase. This research extends the current literature by offering a comprehensive framework that bridges the gap between theoretical RFID capabilities and practical healthcare implementation requirements. The inclusion of artificial intelligence components - including computer vision systems, natural language processing, and predictive analytics - provides a forward-looking architecture capable of adapting to emerging healthcare technology needs. The findings suggest that successful RFID integration in medical device logistics requires careful consideration of both technical architecture and operational constraints while maintaining a focus on healthcare-specific requirements and standards. Future research directions identify opportunities for enhanced AI integration, predictive maintenance capabilities, and system optimization in medical device tracking and management, particularly in environments with complex sterilization requirements and high-volume inventory movement.

Vikraman Baskaran, Denny Nguyen
Open Access
Article
Conference Proceedings

Early Detection of Arthritis Using Convolutional Neural Networks and Explainable AI

Arthritis is a prevalent and debilitating musculoskeletal disorder that significantly impairs mobility, joint function, and overall quality of life for millions of individuals across the globe. The condition is characterized by chronic joint inflammation, cartilage degradation, stiffness, and persistent pain, often leading to long-term disability and increased healthcare dependency. As the global population continues to age, the healthcare impact and economic burden of arthritis are expected to rise substantially. Early detection and precise classification of arthritis are therefore essential for initiating effective treatment, slowing disease progression, improving long-term prognosis, and reducing healthcare system strain. However, traditional diagnostic approaches such as physical examination and manual interpretation of radiographic images by clinicians are often subjective, time-consuming, and prone to inter-observer variability. These limitations emphasize the urgent need for intelligent, reproducible, and scalable computer-aided diagnostic systems. This study proposes a novel deep learning-based framework that incorporates Explainable Artificial Intelligence (XAI) techniques to automatically classify the severity of arthritis using X-ray imaging data. Specifically, the research investigates and compares the performance of six widely recognized convolutional neural network (CNN) architectures: EfficientNetB5, ResNet50, InceptionV3, DenseNet121, VGG16, and MobileNetV2. These architectures were systematically trained and validated on a curated dataset of arthritis X-ray images, with the aim of identifying the most robust and efficient model for classifying different stages of arthritis severity with high diagnostic precision and generalizability. Among the evaluated models, the VGG16 architecture demonstrated the highest classification accuracy, achieving a performance of 96.17%, making it a strong candidate for clinical integration. DenseNet121 followed with an accuracy of 91.35%, while EfficientNetB5, InceptionV3, and ResNet50 each delivered competitive results within the range of 88% to 89%. MobileNetV2, although computationally lighter and more efficient in terms of processing speed, exhibited the lowest performance with an accuracy of 85.43%. These findings reveal that deeper and well-optimized CNN architectures tend to offer superior results for medical image classification tasks, particularly when image features are subtle and require high-level abstraction. To enhance the transparency and interpretability of the classification outcomes, the study integrates Gradient-weighted Class Activation Mapping (Grad-CAM) into the system pipeline. Grad-CAM generates visual heatmaps that identify and highlight the most influential regions within the X-ray images that guided the model’s predictions. This visual interpretability is essential for clinical adoption, as it allows healthcare professionals to validate and understand the AI’s decision-making process, thereby fostering trust, transparency, and accountability. The outcomes of this study illustrate the transformative potential of AI-powered tools in medical diagnostics. By automating the identification and classification of arthritis severity, the proposed framework can significantly reduce diagnostic delays, improve diagnostic consistency, and support more timely and informed treatment decisions. 
Future research directions include expanding the dataset to incorporate more diverse imaging samples, refining model architectures for enhanced real-time deployment, and integrating multimodal clinical data such as MRI scans, blood biomarkers, and patient history to further elevate diagnostic accuracy and support holistic arthritis management strategies.
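
The following Keras sketch illustrates the general recipe described above: an ImageNet-pretrained VGG16 backbone with a small severity-classification head, plus a Grad-CAM heatmap function for visual explanation. The class count, head layers, input size, and Grad-CAM target layer are assumptions for illustration rather than the study's exact configuration.

    import tensorflow as tf
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import VGG16

    NUM_CLASSES = 5   # e.g., five severity grades; the study's exact class set may differ

    # ImageNet-pretrained VGG16 backbone with a small classification head.
    base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
    base.trainable = False
    x = layers.GlobalAveragePooling2D()(base.output)
    x = layers.Dense(256, activation="relu")(x)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    model = models.Model(base.input, outputs)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    # model.fit(train_ds, validation_data=val_ds, epochs=20)   # train_ds/val_ds: hypothetical X-ray datasets

    def grad_cam(image, conv_layer="block5_conv3"):
        """Grad-CAM heatmap for one preprocessed X-ray of shape (224, 224, 3)."""
        grad_model = tf.keras.Model(model.input,
                                    [model.get_layer(conv_layer).output, model.output])
        with tf.GradientTape() as tape:
            conv_out, preds = grad_model(tf.expand_dims(image, 0))
            class_score = preds[:, tf.argmax(preds[0])]        # score of the predicted grade
        grads = tape.gradient(class_score, conv_out)           # gradient w.r.t. the feature maps
        weights = tf.reduce_mean(grads, axis=(0, 1, 2))        # per-channel importance
        cam = tf.reduce_sum(conv_out[0] * weights, axis=-1)
        cam = tf.nn.relu(cam) / (tf.reduce_max(cam) + 1e-8)    # normalise to [0, 1]
        return cam.numpy()                                     # low-resolution heatmap to overlay on the image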

Binta Ade-Olusile, Zainb Dawod, Saeed Sharif
Open Access
Article
Conference Proceedings

Transforming Mental Health Assessment: Machine Learning for Early Detection and Personalized Care Among College Students

The growing global incidence of mental health disorders underlines the urgent need for improved tools to enable early diagnosis and intervention. This study investigates the potential of machine learning models to predict mental health issues among college students by utilizing a dataset that includes a variety of demographic and behavioural characteristics. It employs several machine learning models, including Logistic Regression, Random Forest, Decision Tree, and XGBoost, trained on demographic, behavioural, and self-reported mental health information. Data preprocessing involved cleansing, normalization, and feature selection to optimize model performance. Models were trained and validated using cross-validation, and their performance was measured using metrics such as accuracy, precision, and ROC-AUC scores. Machine learning models, particularly Logistic Regression, show significant potential for improving mental health assessments by providing early, accurate, and scalable predictions. This study is significant in addressing the rising mental health challenges among college students by leveraging machine learning (ML) for early detection and personalized care. Traditional diagnostic methods, often time-consuming and subjective, are enhanced by ML’s ability to process large datasets for faster, accurate, and scalable assessments. The Logistic Regression model achieved an accuracy of 85% and a precision of 81%, demonstrating its reliability for general mental health predictions. By integrating demographic, behavioural, and physiological data, the study promotes tailored interventions while emphasizing ethical considerations like privacy and transparency. Its findings can guide institutions and policymakers in developing data-driven mental health programs, fostering healthier academic environments and advancing mental health care.
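
A minimal scikit-learn sketch of the Logistic Regression setup described above, with normalisation, cross-validation, and the reported metrics (accuracy, precision, ROC-AUC). The file name and column layout are hypothetical placeholders for the study's dataset.

    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_validate
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Hypothetical survey table: demographic/behavioural columns plus a binary label.
    df = pd.read_csv("student_mental_health.csv")
    X = pd.get_dummies(df.drop(columns=["mental_health_issue"]), drop_first=True)
    y = df["mental_health_issue"]

    # Normalisation + logistic regression, evaluated with 5-fold cross-validation.
    pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    scores = cross_validate(pipeline, X, y, cv=5,
                            scoring=["accuracy", "precision", "roc_auc"])
    for metric in ("test_accuracy", "test_precision", "test_roc_auc"):
        print(metric, scores[metric].mean().round(3))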

Madhav Theeng Tamang, Saeed Sharif, Seyed Ali Ghorashi
Open Access
Article
Conference Proceedings

Hybrid Deep Learning Healthcare AI Framework for Real-Time Human Pose Estimation and Remote Patient Monitoring to Support TKR Physiotherapy

Total Knee Replacement (TKR) rehabilitation critically depends on precise physiotherapy exercise execution, yet rising patient volumes and constrained clinical resources limit continuous supervision. This study presents an Artificial Intelligence (AI) framework for real-time assessment and feedback of TKR exercises using deep learning–based human pose estimation to empower remote rehabilitation. We investigate three architectures: a Dense Convolutional Neural Network (DCNN) incorporating frame decoupling for robust joint tracking; a pruned Generative Adversarial Network (Sparse GAN) optimized for computational efficiency; and a novel hybrid model that embeds the DCNN as a discriminator within the GAN model. A diverse dataset of over 10,000 annotated video clips, sourced from clinical environments and public repositories, was processed with OpenCV, and joint annotations were generated using OpenPose. Models were trained and evaluated using standard metrics (i.e., Precision, Recall, F1-score) alongside runtime and memory usage benchmarks. The hybrid architecture achieved the highest classification performance with an 86.01% F1-score, which demonstrates the synergetic benefits of combining rich feature extraction with generative refinement, though it incurred elevated computational costs. The Sparse GAN provided faster inference suitable for mobile deployment, with only a marginal decrease in F1-score. The standalone DCNN provided a balance between accuracy and speed, but it did not match the hybrid’s precision. These results highlight a fundamental trade-off between model complexity and real-time usability in AI-driven therapeutic monitoring. The hybrid model is optimal for clinical settings where accuracy is paramount, while the Sparse GAN provides a practical solution for resource-constrained and edge-based applications. Future work will explore model compression, hardware acceleration, and edge-computing strategies to further optimize performance. By demonstrating the viability of advanced pose estimation techniques in a physiotherapy context, this research contributes to the broader discourse on the use of AI in healthcare for scalable, autonomous rehabilitation tools across several medical and wellbeing domains.
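
As a small illustration of the kind of kinematic feature such a pose-estimation pipeline can derive, the sketch below computes a knee flexion angle from per-frame hip, knee, and ankle keypoints of the sort OpenPose produces. The coordinates are made up, and the actual feature set consumed by the paper's classifiers may differ.

    import numpy as np

    def joint_angle(hip, knee, ankle):
        """Knee flexion angle (degrees) from 2D joint coordinates of one frame."""
        thigh = np.asarray(hip, dtype=float) - np.asarray(knee, dtype=float)
        shank = np.asarray(ankle, dtype=float) - np.asarray(knee, dtype=float)
        cos_angle = np.dot(thigh, shank) / (np.linalg.norm(thigh) * np.linalg.norm(shank) + 1e-8)
        return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))

    # Per-frame (x, y) keypoints as produced by a pose estimator such as OpenPose (values invented).
    frames = [
        {"hip": (312, 240), "knee": (318, 352), "ankle": (321, 468)},   # near-straight leg
        {"hip": (305, 238), "knee": (330, 345), "ankle": (268, 420)},   # flexed during the exercise
    ]
    angles = [joint_angle(f["hip"], f["knee"], f["ankle"]) for f in frames]
    print([round(a, 1) for a in angles])   # a time series a downstream exercise classifier could consume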

Hisham Abougrad, Manasa Yegamati, Mimi Mather
Open Access
Article
Conference Proceedings

Development of an Automated System for Cardiomyocyte Activity Using Computer Vision

Computer vision, a pivotal field within computer science, empowers machines to interpret and analyse visual information such as images and videos. Its growing application in healthcare, particularly in the diagnosis and treatment of cardiac conditions, underscores its transformative potential. Traditional methods for detecting cardiac beat rates are largely manual, making them time-consuming and labour-intensive, thereby limiting their scalability in clinical contexts. To address this gap, there is a critical need for an automated system capable of identifying cells in video data and extracting key parameters such as beat rate, cell area during systole and diastole, and beat duration. This study introduces a novel computer vision-based framework that automates the detection of heart cell contractions from video recordings. By employing motion segmentation, masking techniques, and machine learning algorithms, the system efficiently identifies active cardiomyocytes, calculates beats per minute (BPM), and measures the time taken for a complete contraction-relaxation cycle. This approach not only improves diagnostic accuracy but also contributes to more efficient and scalable cardiac assessments, representing a significant advancement in computational healthcare.
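
A compact OpenCV sketch of the general approach: frame differencing yields a per-frame motion signal, and peaks in that signal are counted to estimate beats per minute. The file name, thresholds, and minimum peak spacing are illustrative assumptions; the paper's segmentation, masking, and cell-area measurements are more involved than this.

    import cv2
    import numpy as np
    from scipy.signal import find_peaks

    cap = cv2.VideoCapture("cardiomyocytes.avi")          # hypothetical microscopy recording
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0               # fall back if metadata is missing

    motion = []
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Mean absolute frame difference rises each time the cell contracts or relaxes.
        motion.append(cv2.absdiff(gray, prev_gray).mean())
        prev_gray = gray
    cap.release()

    motion = np.array(motion)
    # Each contraction shows up as a peak in the motion signal; enforce a minimum spacing
    # of 0.25 s between peaks so noise between beats is not counted.
    peaks, _ = find_peaks(motion, height=motion.mean() + motion.std(), distance=int(0.25 * fps))
    bpm = len(peaks) / (len(motion) / fps) * 60
    print(f"Estimated beat rate: {bpm:.1f} BPM")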

Mohammed Noman Mohammed Arif, Qazi Nadeem, Prashant Ruchaya, Mustansar Ali Ghazanfar
Open Access
Article
Conference Proceedings

Vision Transformer-Based Image Captioning for the Visually Impaired

Digital accessibility remains a central concern in Human-Computer Interaction (HCI), particularly for visually impaired individuals who depend on assistive technologies to interpret visual content. While image captioning systems have shown notable progress in high-resource languages, languages such as Indonesian, despite having a large speaker base, continue to be underserved. This disparity stems from the lack of annotated datasets and models that account for linguistic and cultural nuances, thereby limiting equitable access to visual information for Indonesian-speaking users. To address this gap, we present a bilingual image captioning framework aimed at improving digital accessibility for visually impaired users in the Indonesian-speaking community. We propose an end-to-end system that integrates a neural machine translation component with three deep learning-based captioning architectures: CNN-RNN, Vision Transformer with GPT-2 (ViT-GPT2), and Generative Adversarial Networks (GANs). The Flickr30k dataset was translated into Indonesian using leading machine translation models, with Google Translate achieving the highest scores across BLEU, METEOR, and ROUGE metrics. These translated captions served as training data for evaluating the image captioning models. Experimental results demonstrate that the ViT-GPT2 model outperforms the others, achieving the highest BLEU (0.2599) and ROUGE (0.3004) scores, reflecting its effectiveness in generating accurate and contextually rich captions. This work advances inclusive AI by developing culturally adaptive captioning models for underrepresented languages. By generating culturally and linguistically relevant captions for visually impaired users, the framework advances Human-Computer Interaction through more accessible and inclusive user-system communication. Beyond its technical contributions, this research addresses key challenges in Human-Computer Interaction (HCI) by enabling inclusive, multilingual assistive technologies. It supports the evolution of Next-Generation Work environments by equipping visually impaired individuals with tools to independently interpret visual information, an increasingly essential capability in AI-rich, visually oriented digital workspaces. In future work, the framework will be enhanced through multimodal pretraining and the integration of culturally enriched datasets, aiming to improve semantic accuracy and broaden its applicability to a wider range of linguistic communities.
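
For readers who want to try the architecture, the snippet below runs caption generation with the publicly available English ViT-GPT2 checkpoint on Hugging Face as a stand-in; the paper instead fine-tunes on machine-translated Indonesian captions from Flickr30k, so outputs and scores will differ. The image file name is a placeholder.

    from PIL import Image
    from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer

    # Public English ViT-GPT2 checkpoint used only as a stand-in for the paper's model.
    name = "nlpconnect/vit-gpt2-image-captioning"
    model = VisionEncoderDecoderModel.from_pretrained(name)
    processor = ViTImageProcessor.from_pretrained(name)
    tokenizer = AutoTokenizer.from_pretrained(name)

    image = Image.open("street_scene.jpg").convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    output_ids = model.generate(pixel_values, max_length=32, num_beams=4)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))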

Qazi Nadeem, Indra Dewaji, Nawaz Khan
Open Access
Article
Conference Proceedings

Assessment of Upper Limb Functional Workspace through Inertial Measurement Units: a Pilot Study

The assessment of the upper limb functional workspace in an ecological environment is important for the evaluation of clinical progress in persons suffering from musculoskeletal disorders or neurological impairments. Inertial Measurement Units (IMUs) represent a very effective technology for the assessment of human movement in ecological settings. This work presents a preliminary validation of a methodology for reconstructing and assessing the upper limb functional workspace explored during the daily routine in an ecological setting through IMUs. Participants in the study were involved in 7 hours of data acquisition with IMUs, performing two different protocols simulating an active and a non-active arm, respectively. For each of the two protocols, a workspace for each limb segment and each participant was reconstructed by evaluating the estimated spatial position of the sensors over time. A density and clusterization assessment was performed on each workspace through the application of a Gaussian kernel and the k-means algorithm. Next, workspaces from the non-active and the active protocols were compared by performing statistical tests on the distributions of points in the respective workspaces along the three spatial coordinates. Results showed significant differences between the two protocols (active and non-active) on every spatial coordinate and for each of the three upper limb segments (arm, forearm, hand), as well as different clusterization of the workspaces. The findings represent a preliminary confirmation of the applicability of IMUs to the assessment of changes in the functional workspace of the upper limb. Further developments may involve enlarging the sample size, testing on impaired persons, and performing assessments in more realistic scenarios.
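
A small sketch of the density and clusterization step described above: a Gaussian kernel density estimate and k-means are applied to 3D sensor positions. The synthetic points, cluster count, and the suggested per-axis test are illustrative assumptions, not the study's actual data or parameters.

    import numpy as np
    from scipy.stats import gaussian_kde
    from sklearn.cluster import KMeans

    # points: estimated sensor positions over time, one (x, y, z) row per sample.
    rng = np.random.default_rng(0)
    points = rng.normal(size=(5000, 3))          # placeholder for real IMU-derived positions

    # Density assessment with a Gaussian kernel: how densely each region of the workspace is visited.
    kde = gaussian_kde(points.T)
    density = kde(points.T)

    # Clusterization of the workspace with k-means (k chosen here arbitrarily as 3).
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(points)

    # A per-axis comparison between two protocols could then use, e.g., a
    # Mann-Whitney U test on the coordinate distributions (scipy.stats.mannwhitneyu).
    print(density[:5], labels[:10])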

Matteo Iurato, Ronny Stanzani, Mirko Job, Andres Gaggero, Igor Ingegnosi, Marco Testa
Open Access
Article
Conference Proceedings

In-depth analysis of nuclear data flow using graph theory and the Technology, Organization, People Model through the application of betweenness centrality measure and community detection

The incorporation of digital technologies is an essential factor in enhancing the efficiency and performance of nuclear facilities throughout their lifecycle. This encompasses various stages, from the design phase to the operational period. However, the nuclear industry faces significant challenges due to the intricate and diverse nature of its stakeholders, supply chain, and activities. Moreover, the exponential growth of complex information systems has created substantial challenges in data management within the nuclear sector. Traditional approaches to managing data often result in inefficiencies, inconsistencies, and inaccuracies, which can have severe consequences for the performance of nuclear facilities. To address these critical issues, this study proposes the application of graph theory for analyzing nuclear data flow and integrating human system integration (HSI) through the Technology Organization People (TOP) Model. The TOP Model is a comprehensive framework that considers the technological, organizational, and social aspects of complex systems, providing a holistic approach to understanding interactions within the information system. The dataset used in this study was synthetically generated to emulate real-world operations and provide a more accurate representation of the nuclear data information system. This study employs two graph theory methods to analyze the nuclear data flow: the betweenness centrality measure and spectral clustering. The betweenness centrality measure is used to identify critical nodes within the data network that are most central or influential in terms of data flow, thereby highlighting the key components involved in data transmission. Spectral clustering is employed to group similar nodes within the data network that share common data transmission characteristics, thereby facilitating insight into the underlying structure and dynamics of the nuclear data flow. This comprehensive approach facilitates more informed decisions regarding data management and optimization of data flow. The results of this study demonstrate the potential of this innovative methodology, combining graph theory and the TOP Model, offering a new way to address the challenges faced by the nuclear industry in managing complex nuclear information systems. The findings underscore the significance of integrating human factors into data management and provide a framework for enhancing the efficiency of nuclear facilities throughout their lifecycle. This study contributes to the development of more effective strategies for managing complex information systems within the nuclear sector, with implications for enhancing the performance and efficiency of nuclear facilities.
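
The two graph-theoretic analyses can be illustrated in a few lines with NetworkX and scikit-learn on a toy data-flow graph; the node names, edges, and cluster count below are invented for illustration and do not reflect the study's synthetic dataset.

    import networkx as nx
    from sklearn.cluster import SpectralClustering

    # Toy data-flow graph: nodes are systems/roles in the nuclear information flow,
    # edges indicate that data is transmitted between them (names are illustrative).
    edges = [
        ("design_db", "config_mgmt"), ("config_mgmt", "operations"),
        ("operations", "maintenance"), ("maintenance", "config_mgmt"),
        ("suppliers", "design_db"), ("regulator", "operations"),
        ("operations", "reporting"), ("reporting", "regulator"),
    ]
    G = nx.Graph(edges)

    # Betweenness centrality: which nodes sit on the most shortest paths of the data flow.
    centrality = nx.betweenness_centrality(G)
    print(sorted(centrality.items(), key=lambda kv: -kv[1])[:3])

    # Spectral clustering on the adjacency matrix groups nodes with similar transmission patterns.
    nodes = list(G.nodes)
    A = nx.to_numpy_array(G, nodelist=nodes)
    labels = SpectralClustering(n_clusters=2, affinity="precomputed", random_state=0).fit_predict(A)
    print(dict(zip(nodes, labels)))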

Luigui Salazar, Olivier Malhomme, Xianyun Zhuang, Robert Plana, Nicolas Bureau
Open Access
Article
Conference Proceedings

Social media and internet celebrity for social commerce intentional and behavioral recommendations

Social media is an online media platform based on interests and creative content formed by a group of Internet users. Internet celebrities are people who become famous on the Internet, increasing their popularity through their social networking or video websites. Social commerce (s-ecommerce) is the combination of social relations and commercial transaction activities. The combination of social media and Internet celebrities is an emerging model for the development of s-ecommerce. Recommendation systems are an effective alternative to search algorithms because they help users find items that they are unlikely to find on their own. Until now, businesses have relied on cookies to collect consumers' online data, especially to support advertising and collect keywords; as this practice changes, businesses will need to find alternative ways to collect user data and information. With recent advances in information technologies, recommendation systems are gradually moving towards intentional and behavioral recommendations. The Internet carries numerous signals that goods are in demand. Therefore, the behavioral signal targeting of traditional recommendation systems differs from the intentional signal targeting used for recommendations. Behavioral recommendation can be seen as a point-to-point marketing extension, where merchants find the people who want to buy a product and deliver that product. For example, behavioral recommendation occurs when a consumer clicks on a smartphone store catalog. The system infers "this person is looking for smartphones", so, for the next two weeks, when the consumer clicks on a website, ads for smartphones will pop up. Behavioral recommendations can only provide marketing/promotion based on past behavior records. On the other hand, intentional recommendation is a mindset that seeks to understand consumers' lives and intentions; by constantly collecting information about Internet users' behavior and monitoring events and information in consumers' lives, it gradually leads consumers to explore their needs, wants, and demands. Intentional recommendation depicts a person's life through the things (signals) the consumer clicks on. Based on this, it analyzes the person's specific profile and then presents further information that they may need (targeting). This information relates not only to commodities but also includes smartphone apps, news, social media, socializing, and gaming KOLs on smartphones. In this regard, this study considers signal targeting a method by which social media operators and Internet celebrity platforms can understand consumers' media tool preferences and individual lifestyles, so that operators can effectively recommend social commerce to consumers. Thus, this study implements a two-stage data mining analysis, including clustering analysis and association rules, on Taiwanese users (n=2,102) to investigate social media and Internet celebrity preferences and to find knowledge profiles/patterns/rules for s-ecommerce intentional and behavioral recommendations.
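
A compact sketch of a two-stage pipeline of this kind: k-means segments respondents, and association rules are then mined within a segment with mlxtend. The file name, one-hot layout, and thresholds are hypothetical; the study's actual variables and parameters are not reproduced here.

    import pandas as pd
    from mlxtend.frequent_patterns import apriori, association_rules
    from sklearn.cluster import KMeans

    # Hypothetical one-hot survey table: each row a respondent, each column a
    # media/celebrity/shopping preference flag (the real study surveys 2,102 Taiwanese users).
    df = pd.read_csv("survey_onehot.csv").astype(bool)

    # Stage 1: cluster respondents into preference segments.
    segments = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(df.astype(int))

    # Stage 2: mine association rules within one segment to obtain recommendation patterns.
    segment_df = df[segments == 0]
    itemsets = apriori(segment_df, min_support=0.1, use_colnames=True)
    rules = association_rules(itemsets, metric="confidence", min_threshold=0.6)
    print(rules[["antecedents", "consequents", "support", "confidence", "lift"]].head())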

Shu-Hsien Liao, Yao-Hsuan Yang
Open Access
Article
Conference Proceedings

SmartAI: Enhancing Scene Understanding by Combining Different AI Technologies

This paper introduces SmartAI, a novel framework that integrates Machine Learning (ML) and Knowledge Representation and Reasoning (KRR) to enhance AI capabilities in reasoning and adaptability. Inspired by Daniel Kahneman's Thinking, Fast and Slow theory, SmartAI leverages ML for rapid, intuitive processing (System 1) and KRR for deliberate, analytical reasoning (System 2). The framework emphasizes modularity, enabling seamless orchestration of these technologies without altering their core components. A case study on scene understanding demonstrates SmartAI's effectiveness in combining fast pattern recognition with in-depth contextual reasoning, achieving superior interpretive outcomes. Beyond scene understanding, SmartAI lays the foundation for context-aware AI applications in diverse fields such as healthcare, education, and autonomous systems. This work sets a precedent for integrating specialized AI technologies to achieve human-like cognitive flexibility. However, it introduces new challenges in effectively managing and orchestrating interactions between these complementary technologies, opening avenues for future research.

Adnan Agbaria, Yael Dubinsky
Open Access
Article
Conference Proceedings

Transforming Elderly Care with Vision-Based Hand Gesture Recognition: A Deep Learning Framework

In today’s ever-evolving healthcare landscape, our objective is to bring technology closer to human experience. In this paper, a vision-based hand gesture recognition (HGR) system is introduced as a human-computer interface (HCI) that transforms natural movements into an intuitive medium for device and application manipulation. This approach is designed to empower the elderly and enhance patient monitoring by offering a seamless, nonintrusive interface. By bridging compassion with cutting-edge technology, it aims to redefine care and create a more connected, responsive healthcare environment. The study begins by reviewing both traditional and contemporary HGR methodologies, with a focus on vision-based systems, which have shown greater potential for practical applications than conventional sensor-based systems. Traditional approaches, such as contour analysis, colour segmentation, and template matching, have demonstrated limitations. In contrast, deep learning approaches have gained widespread adoption in vision-based systems for their ability to learn hierarchical features and better handle complex, dynamic scenarios. However, existing models struggle with real-time performance due to computational inefficiencies and environmental variations. This study introduces a deep learning-based detection framework which enables the practical application of HGR. For example, it supports the development of gesture-based virtual assistants for interacting with digital devices and allows elderly residents in care homes to give instructions to remote caregivers. The framework consists of two deep learning models, one for region of interest (ROI) detection and another for classification. The recognition process begins with capturing RGB images, chosen for versatility, affordability, and compatibility, which are then passed through YOLOv8 (You Only Look Once version 8) for detection. The YOLOv8 model is trained on a subset, only 4.5%, of the HaGRID dataset, which comprises over 550,000 RGB images, to accurately locate hand regions. Once detection is complete, several image processing techniques are applied to overcome typical HGR constraints. These include ROI extraction and grayscale conversion for faster computation, resizing for model optimization and distance viability, HSV conversion for lighting independence, and histogram equalization for enhanced feature extraction. The processed image is then fed into a convolutional neural network (CNN) for gesture classification. The network is designed with several convolutional layers, followed by dense layers with ReLU activation to introduce non-linearity, and SoftMax activation for classification. To improve the model's robustness, data augmentation techniques such as random flipping and rotation are applied. The HGR system achieves 97% accuracy on training and validation. The proposed framework effectively addresses the limitations of conventional HGR systems—many of which have been developed and tested in controlled environments—demonstrating its applicability in real-life situations.
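
The preprocessing chain can be sketched roughly as follows with OpenCV and the Ultralytics YOLOv8 API: a detector proposes hand boxes, and each ROI is resized, converted to HSV, reduced to a single channel, and histogram-equalised before classification. The weights file, ROI size, and the exact ordering of steps are assumptions for illustration, not the paper's implementation.

    import cv2
    from ultralytics import YOLO

    detector = YOLO("hand_yolov8n.pt")   # hypothetical YOLOv8 weights fine-tuned on HaGRID hands

    def preprocess_roi(frame, box, size=(64, 64)):
        """Approximate the preprocessing chain described above for one detected hand ROI."""
        x1, y1, x2, y2 = map(int, box)
        roi = frame[y1:y2, x1:x2]                          # ROI extraction
        roi = cv2.resize(roi, size)                        # resizing for the classifier input
        hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)         # HSV for lighting independence
        value = hsv[:, :, 2]                               # value channel as a single-channel view
        return cv2.equalizeHist(value)                     # histogram equalisation

    frame = cv2.imread("webcam_frame.jpg")                 # placeholder input image
    result = detector(frame)[0]
    for box in result.boxes.xyxy.tolist():
        processed = preprocess_roi(frame, box)
        # processed (64x64, single channel) would be fed to the CNN classifier described above.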

Seyed Ali Ghorashi, Riazul Islam
Open Access
Article
Conference Proceedings

Negotiated agency: the constant deliberation between the player's abilities and developer's permissions

This essay attempts to shed light on the intricate relationship between player agency – the player's capacity to act and make choices in a videogame – and that of the developers – the limitations and possibilities creators must navigate in order to release their title. Firstly, we explore the complex concept of agency, drawing upon fields like philosophy and sociology. Then, we turn to the unique position videogames occupy among other mass media by placing interactivity at centre stage.

Thais Weiller, Mauricio Perin, Pedro Campos, Vanessa Cesário
Open Access
Article
Conference Proceedings

Developing Optimal Affordance Detection Technology Using Genetic Algorithm based on Posture Primitives on Atypical Surfaces

This study presents an optimization method for human behavior simulation in atypical architectural spaces using genetic algorithms based on posture primitives. Traditional simulation tools often fail to capture diverse movement possibilities in non-standard environments, limiting their application in architectural design. To address this, we introduce a novel affordance detection technology that extracts potential human actions directly from architectural geometry rather than relying on predefined scenarios. Our approach employs genetic algorithms to refine optimal action placements on complex surfaces iteratively. A set of posture primitives is modeled based on anthropometric data, and their spatial suitability is evaluated through computational iterations. By integrating Rhino, Grasshopper, and ActoViz, our system enables designers to visualize and analyze possible human interactions within atypical architectural forms. The physics engine incorporated in the system introduces behavioral noise, allowing for a wider range of movement possibilities. The key contribution of this study is the automatic generation of spatial affordances for human behavior simulation. Unlike conventional methods, our approach does not require predefined action sets but instead derives possible movements dynamically based on spatial conditions. By systematically analyzing posture primitives and their adaptability to different surfaces, this research provides a data-driven foundation for predicting human behavior in complex architectural environments. Furthermore, the proposed method enhances the architectural design process by offering real-time feedback on spatial affordances, allowing architects to optimize layouts based on anticipated user interactions. The ability to generate realistic behavior simulations supports a more intuitive and human-centered design approach, making this research highly applicable to various architectural and urban planning contexts. By integrating computational affordance detection into architectural design, this study contributes to the advancement of human behavior simulation, enabling architects and designers to predict and refine spatial experiences with greater accuracy and efficiency.
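
A toy genetic-algorithm loop of the kind described, optimising where on a surface a posture primitive fits best: the fitness function, mutation rate, and population size below are placeholders, since the real system evaluates placements against architectural geometry through Rhino, Grasshopper, and ActoViz.

    import random

    # Toy stand-in for the real evaluation: how suitable is placing a given posture
    # primitive (e.g., "sit", "lean") at normalised surface coordinates (u, v)?
    def fitness(candidate):
        u, v = candidate
        return -((u - 0.3) ** 2 + (v - 0.7) ** 2)   # peak suitability near (0.3, 0.7)

    def mutate(candidate, rate=0.1):
        return tuple(min(1.0, max(0.0, c + random.uniform(-rate, rate))) for c in candidate)

    def crossover(a, b):
        return (a[0], b[1])

    population = [(random.random(), random.random()) for _ in range(30)]
    for generation in range(50):
        population.sort(key=fitness, reverse=True)
        parents = population[:10]                              # elitist selection
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(len(population) - len(parents))]
        population = parents + children

    print("best placement:", max(population, key=fitness))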

Yun Gil Lee
Open Access
Article
Conference Proceedings

Lightweight Transformer for Robust Human Activity Recognition Using Smartphone IMU Data

Human-activity recognition (HAR) underpins a wide range of m-health, smart-home, and context-aware services, yet conventional approaches frequently struggle with overfitting, class imbalance, and limited capacity to capture long-range temporal dependencies. In this study, we introduce a lightweight, end-to-end Transformer pipeline that learns directly from raw smartphone inertial signals, eliminating the need for manually engineered features. We evaluate the approach on MotionDetection, a 12-channel dataset collected from 24 volunteers who performed a scripted series of everyday movements while carrying a Samsung Galaxy Note 20 Ultra. After windowing and minimal preprocessing, the Transformer attains 98% validation accuracy with no discernible overfitting. Relative to a strong CNN-BiLSTM baseline, it improves the macro F1-score by 3.6 percentage points while employing a smaller parameter budget, underscoring its computational efficiency. These findings indicate that the Transformer architecture can provide a robust, scalable foundation for real-world HAR on commodity mobile devices, paving the way for battery-friendly, on-device activity monitoring in health and ambient-assisted applications.
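
A minimal PyTorch sketch of such a lightweight Transformer classifier over raw IMU windows: a linear per-timestep embedding, a learned positional encoding, a small Transformer encoder, and mean pooling into a class head. Window length, model width, and the number of activity classes are assumptions, not the paper's exact hyperparameters.

    import torch
    import torch.nn as nn

    class IMUTransformer(nn.Module):
        """Lightweight Transformer encoder over windows of raw IMU samples."""
        def __init__(self, channels=12, window=128, d_model=64, classes=6):
            super().__init__()
            self.embed = nn.Linear(channels, d_model)                 # per-timestep projection
            self.pos = nn.Parameter(torch.zeros(1, window, d_model))  # learned positional encoding
            layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                               dim_feedforward=128, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)
            self.head = nn.Linear(d_model, classes)

        def forward(self, x):                 # x: (batch, window, channels)
            h = self.encoder(self.embed(x) + self.pos)
            return self.head(h.mean(dim=1))   # average-pool over time, then classify

    model = IMUTransformer()
    dummy = torch.randn(8, 128, 12)           # 8 windows of 128 samples x 12 IMU channels
    print(model(dummy).shape)                 # torch.Size([8, 6])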

Hossein Shahverdi, Seyed Ali Ghorashi
Open Access
Article
Conference Proceedings