Human-Computer Interaction & Emerging Technologies

Editors: Tareq Z. Ahram, Waldemar Karwowski, Pei-Luen Rau
Topics: Human Systems Interaction
Publication Date: 2025
ISBN: 978-1-964867-71-7
DOI: 10.54941/ahfe1006006
Articles
Critical Foresight of Human-Computer Interaction: A Review on Methods to Assess Ethical Risks and Side-Effects of Emerging Technologies
Innovative technologies often come with unforeseen side effects or even detrimental consequences, ranging from impacts on individual wellbeing and effects on society and social dynamics to environmental impacts and unwanted effects at the organizational level. The responsible design of human-computer interaction demands methods to systematically foresee and assess such potential negative consequences of technological innovations for humans and society, and ideally to use such insights to sensibly design and adjust a product concept and its features. In a comprehensive literature review of the methods landscape, we identified a total of 40 future-oriented methods designed or adaptable to elicit negative ethical or societal consequences of technological innovations. This paper describes different clusters of methods, classifies the methods along different criteria, and lists examples for each cluster. Based on a cross-analysis of the methods and reported best practices, we discuss recommendations for fruitfully combining structural elements of existing methods and sketch ideas for new, light-weight approaches to reflecting on the potentially harmful consequences of emerging technologies.
Sarah Diefenbach, Johannes Stoll, Daniel Ullrich
Open Access
Article
Conference Proceedings
Promoting Healthy Eating by Design: Opportunities for Meaningful Persuasive Technologies
While eating behavior has a considerable impact on people’s health and well-being, it is well known that changing food practices is an incredibly difficult endeavor. People often lack motivation to modify their diet, and a variety of barriers prevent them from adopting a healthy lifestyle. In this paper, we recount the preliminary findings of the PHaSE project by exploring how people conceptualize food and their eating behaviors. Through a series of co-design workshops, we discovered that people ascribe a variety of meanings to food, spanning from health concerns to emotional relief. These meanings play a crucial role in the process of change. Based on the study findings, we suggest that designers should address the meanings associated with food, rather than placing exclusive emphasis on the behavior to be changed. Additionally, they should promote sense-making and reflection on food practices.
Amon Rapp, Arianna Boldi
Open Access
Article
Conference Proceedings
Exploring the impact of AI-generated content on branded IP character design and user experience
This paper explores the practical application and potential value of AIGC (AI-generated content) technology in brand IP character design, examining its specific role in improving design efficiency, strengthening brand recognition, and deepening users' emotional connection. Through a literature review, we survey the core theory of AIGC technology and its development trends in brand design, and, combining case analysis with comparative analysis, contrast the advantages and disadvantages of traditional design methods and AIGC technology in depth. The study finds that AIGC technology significantly outperforms traditional methods in design efficiency, cost control, and creative expression, and shows particularly strong appeal in the young audience market. Through an in-depth analysis of multiple typical cases, the paper further summarizes key application scenarios of AIGC technology in the character design process, such as generating virtual idols, rapidly iterating brand IP images, and supporting cross-platform communication. However, the study also reveals limitations of AIGC applications, including insufficient emotional expression, a poor fit with brand tone, and data privacy and ethical issues. Based on these results, the paper proposes suggestions for optimizing the application of AIGC technology, such as enhancing models' affective computing capabilities, deepening the integration of brand research data, and promoting hybrid human-machine collaborative design models. The study shows that AIGC technology not only plays an important role in current brand IP character design but also points to new directions for the sustainable development of the future brand communication ecosystem.
Yijing Wang, Xin Hu
Open Access
Article
Conference Proceedings
The Impact of Visual Elements and Design Principles of Design Systems on Design Decisions
This study investigates the impact of visual elements (color, typography, icons, and layout) and design principles (consistency, simplicity, reusability, and accessibility) within design systems on designers' decision-making processes. A survey of 24 interaction designers with varying levels of experience was conducted to assess how these factors influence decision consistency and efficiency. Results show that color and layout are crucial in design decisions, while consistency and simplicity enhance decision-making efficiency. Designer experience significantly affects the emphasis placed on these elements, with senior designers prioritizing color, consistency, and simplicity, and junior designers focusing on reusability. The findings highlight the need for design systems to be adaptable to different experience levels to optimize design workflows and user experience.
Keyang Wu, Xiaojun Liu
Open Access
Article
Conference Proceedings
Real-Time Object Recognition with Neural Networks in Public Transport – Determining the Utilization of Vehicles using Existing Camera Systems
For security reasons, many local public transport vehicles now have cameras installed in their interiors. At present, these can and may only be used in Germany to investigate criminal offenses. Real-time object recognition offers the possibility of counting passengers with existing cameras without saving images or videos. This is not only important for revenue sharing, but can also provide information about bottlenecks when boarding and alighting, or be used to display empty areas in individual carriages of a train. This paper investigates whether object recognition is suitable for determining the utilization of public transport vehicles using individual images from the interior. Test images from the security cameras of a streetcar were analyzed and evaluated with a self-trained Faster R-CNN model. Accuracies of 70% were achieved in the detection of people and free seats.
Waldemar Titov, Julian Knust, Thomas Schlegel
Open Access
Article
Conference Proceedings
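The counting step this abstract describes, turning per-image detections into an occupancy estimate, can be sketched as follows. This is an illustrative post-processing sketch, not the authors' code: the detection-dict format, the label names ("person", "free_seat"), and the 0.5 confidence threshold are all assumptions.

```python
# Illustrative sketch: counting detected people and free seats from the output
# of an object detector such as Faster R-CNN. The detector is assumed to return,
# per image, a list of dicts with "label" and "score" entries.

def count_detections(detections, score_threshold=0.5):
    """Count occupancy-relevant classes among detections above a confidence threshold.

    detections: iterable of dicts with keys "label" (str) and "score" (float).
    Returns a dict with counts for "person" and "free_seat".
    """
    counts = {"person": 0, "free_seat": 0}
    for det in detections:
        if det["score"] >= score_threshold and det["label"] in counts:
            counts[det["label"]] += 1
    return counts


def utilization(counts):
    """Fraction of seats occupied, under the simplifying assumption that every
    detected person is seated and total seats = persons + free seats."""
    total = counts["person"] + counts["free_seat"]
    return counts["person"] / total if total else 0.0
```

In practice, the detections would come from the self-trained Faster R-CNN model applied to individual images from the interior cameras; no images or videos need to be stored for this aggregation.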
Accelerating Legacy Code Migration with Artificial Intelligence
Organizations relying on critical systems with decades-old code face significant challenges, making modernization an operational imperative due to issues like operational stability, security, and a lack of updated features. This process of transforming legacy code is challenging, but is now being accelerated and augmented by Artificial Intelligence (AI) and large language models (LLMs). This research investigates the use of various LLMs for legacy code translation, aiming not for a perfect solution, but to significantly assist senior software developers by accelerating the development process, enabling rapid prototyping and initial implementation. This approach allows senior engineers to refine and productize the solutions, ensuring quality and alignment with system requirements. The initial testing strategy involved evaluating small subsets of legacy code within the Motif Framework, with the ultimate goal to demonstrate AI’s role as an assistive tool for senior developers in accelerating code modernization efforts.
Amir Schur, Max Graves, Stephanie Heckel, David Vandine
Open Access
Article
Conference Proceedings
Design Evaluation System of AI-Generated Content in the Industrial Design of Construction Machinery
The application of Artificial Intelligence Generated Content (AIGC) technology in the industrial design of construction machinery has developed rapidly. However, its generated solutions differ significantly from designer-generated solutions, posing new challenges to traditional design evaluation methods. To address the evaluation challenges of AIGC solutions, this study proposes a design evaluation framework that combines comprehensiveness and efficiency. An evaluation framework based on the Analytic Hierarchy Process (AHP) was developed through expert interviews and literature analysis. The framework includes five criteria and fifteen sub-criteria to comprehensively assess the quality of AIGC design solutions. Subsequently, the evaluation indicators were streamlined and optimized to enhance efficiency. Experimental validation using practical construction machinery design cases demonstrated that this framework maintains scientific rigor while improving the efficiency of screening and decision-making for AIGC solutions. This study provides an efficient and reliable method for the preliminary scoring and filtering of AIGC solutions in the industrial design of construction machinery, contributing to improved decision-making processes in industrial design practices.
Yifan Yang, Hui Li
Open Access
Article
Conference Proceedings
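The weighting step of an AHP-based evaluation framework like the one described above can be illustrated with a short sketch. This is not the authors' implementation: the example matrix and the truncated random-index table are assumptions, following Saaty's standard AHP procedure (principal-eigenvector weights plus a consistency check).

```python
import numpy as np

def ahp_weights(pairwise):
    """Derive criterion weights from an AHP pairwise comparison matrix via the
    principal eigenvector, and report the consistency ratio (CR).

    pairwise: square reciprocal matrix, pairwise[i][j] = importance of i over j.
    Returns (weights summing to 1, CR); CR < 0.1 is conventionally acceptable.
    """
    A = np.asarray(pairwise, dtype=float)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)                # principal eigenvalue
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                            # normalize weights to sum to 1
    lam_max = eigvals[k].real
    ci = (lam_max - n) / (n - 1)               # consistency index
    # Saaty's random index, truncated to small matrix sizes for this sketch.
    ri = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}.get(n, 1.12)
    cr = ci / ri if ri else 0.0
    return w, cr
```

For the paper's five criteria and fifteen sub-criteria, the same computation would be run once per comparison matrix, with sub-criterion weights multiplied by their parent criterion's weight.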
EEG-Driven Personalized Visual Communication
In this paper, we explore technology that directly connects brainwaves to image systems, incorporating insights from brain science, computer science, and visual design. First, the paper reviews the pivotal literature on the evolution of and advances in electroencephalogram (EEG) technology. Second, it outlines the background and application scenarios of brain-computer interfaces (BCI) and investigates the main image-generating brain-computer interfaces (BCII). Finally, our team developed a personalized, human-centered BCD (brain-computer doodle) board utilizing EEG and BCII technologies, demonstrating its practical applications and the potential impact of our work.
Jun Chen
Open Access
Article
Conference Proceedings
Evaluating Map Orientation Methods in Smartphone Applications by Analyzing Search Time Through a Virtual Environment Experiment
In recent years, smartphone map applications have implemented numerous features that enhance usability, including adjustable map scale and orientation, display of the user's current location, and route guidance. Such advancements have made map reading significantly easier than with traditional paper maps, particularly by automating tasks like orientation and navigation. However, the diverse street patterns in certain urban areas can confuse users who are not proficient in map reading, such as those unfamiliar with technology or individuals in emergency situations. This confusion can reduce the benefit of map applications' seamless navigation, despite their advanced functionality. This study examines the effect of smartphone map orientation methods, such as North-up and Heading-up, as well as various street patterns and building layouts, on users' map reading abilities. The research aims to identify map display elements that facilitate self-location awareness and to gain insights that contribute to developing new map applications enabling effective navigation even for users who have difficulty reading maps. The experiment utilized a Head-Mounted Display (HMD) with an eye-tracking function to measure gaze data as an indicator of participants' map reading processes across different map display methods and street patterns. Participants were presented with modeled streets based on actual intersections through the HMD. A map of each location, simulating a smartphone map application, was displayed in the lower-right field of view. Specific buildings were highlighted in red on the map, and participants were instructed to locate the actual position of the highlighted building and orient themselves toward it. Three types of maps were tested: North-up maps with fixed orientation, Heading-up maps that rotate with body movement, and Control-up maps that allow free rotation.
These map types were combined with five different locations, resulting in 15 total trials. Analysis revealed statistically significant differences between map types, with Control-up maps resulting in significantly longer completion times and map viewing durations than the other map types. This result suggests that the Control-up method presents greater difficulty for users. Moreover, an analysis excluding outliers indicated that Heading-up maps tend to be slightly more efficient than North-up maps, suggesting that the Heading-up method may help reduce users' cognitive load during navigation. Additionally, locations with simple layouts and linear street intersections showed significantly longer completion times than locations with more complex layouts. This suggests that the simplicity of a street layout may increase task difficulty, possibly due to the lack of distinctive visual landmarks for orientation. However, the impact of location differences was reduced when Control-up maps were excluded from the analysis, indicating that appropriate map type selection could mitigate the difficulties associated with simple street layouts. To enhance the effectiveness of the Heading-up method suggested by this study, further improvements need to be considered, such as dynamically switching map display methods based on user movement and environmental conditions. Expanding research in this direction could contribute to the development of more user-friendly map navigation systems for a broader range of users and use cases.
Momoko Sakai, Yohsuke Yoshioka
Open Access
Article
Conference Proceedings
Optimizing Human-Machine Interfaces for Neuroergonomics: Cognitive Workload and Performance in sUAS Operations
The growing prevalence of small Unmanned Aerial Systems (sUAS), or drones, across industries, particularly aerial photography and surveying, creates a need for a deeper understanding of how control interface design impacts operator cognitive workload and performance. This study evaluates the effects of gyroscopic and traditional joystick-based control systems on operators’ cognitive workload as well as their mission performance under diverse environmental conditions. Participants perform standardized sUAS tasks while real-time electroencephalography (EEG) monitoring tracks cognitive workload via theta and alpha wave activity. Results indicate that gyroscopic controls, though intuitive, increase cognitive workload under high stress, whereas joystick controls provide more stability. Performance metrics show greater consistency with traditional controls, especially in demanding conditions. Insights from this study inform ergonomic improvements, tailored training, and real-time physiological monitoring for adaptive systems. By integrating neuroergonomics with human-machine interface design, this research advances sUAS usability, optimizing operator performance and safety in dynamic environments.
Suvipra Singh
Open Access
Article
Conference Proceedings
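The EEG-based workload tracking described above is commonly operationalized as a theta/alpha band-power ratio. The sketch below is an illustrative simplification of that idea, not the study's pipeline: the band limits (4–7 Hz theta, 8–12 Hz alpha), the single-channel periodogram estimate, and the bare ratio as a workload proxy are all assumptions.

```python
import numpy as np

def band_power(signal, fs, band):
    """Power of `signal` within frequency `band` (lo, hi) in Hz, estimated
    from the periodogram (squared magnitude of the real FFT)."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2
    lo, hi = band
    mask = (freqs >= lo) & (freqs <= hi)
    return psd[mask].sum()

def workload_index(signal, fs):
    """Theta/alpha power ratio, a common simplified EEG workload proxy:
    rising theta and falling alpha activity are associated with higher load."""
    theta = band_power(signal, fs, (4.0, 7.0))
    alpha = band_power(signal, fs, (8.0, 12.0))
    return theta / alpha if alpha else float("inf")
```

A real-time variant would apply this to a sliding window of samples per electrode, typically after artifact rejection and with a smoothed spectral estimate such as Welch's method.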
Exploring Virtual Keyboards for Text Entry in Virtual Reality
As Virtual Reality (VR) technology expands beyond entertainment, its potential as a workspace for professional tasks, such as programming, is growing. However, efficient and accurate text entry remains a significant challenge due to the lack of tactile feedback, the cognitive load of virtual typing, and ergonomic constraints. This study presents a novel custom VR keyboard designed to enhance user experience through multimodal feedback, ergonomic optimizations, and adaptive interaction models. Our custom keyboard features a smartphone-like layout with a hover effect, dynamic color transitions, haptic feedback via the SenseGlove Nova 2, auditory keypress cues, and a key-weighting mechanism to refine selection accuracy. To evaluate its performance, we conducted a controlled user study (N = 11), comparing it to a curved MRTK keyboard. Key performance metrics, including typing speed, error rate, keystroke accuracy, word accuracy, and overall user comfort, were analyzed. Results indicate that while the curved MRTK keyboard enabled faster typing speeds (8.99 WPM vs. 7.47 WPM) and higher keystroke accuracy (92.14% vs. 86.85%), the custom keyboard provided a more intuitive and engaging typing experience, particularly during prolonged use. Participants reported greater comfort with the custom keyboard, benefiting from its ergonomic design and tactile interactions. Although the custom keyboard had a slightly higher error rate (1.97% vs. 0.50%), it demonstrated improvements in user adaptability and interaction precision through multimodal feedback. Correlation analysis further revealed that enhanced haptic and visual feedback played a crucial role in improving text entry efficiency. These findings suggest that while traditional virtual keyboards may offer faster speeds, adaptive virtual keyboards with haptic integration can enhance user comfort and long-term usability. Future research should focus on refining interaction fidelity and optimizing hand tracking to further improve VR text entry performance for professional applications.
Kheireddine Azzez, Andrii Matviienko, Gerrit Meixner
Open Access
Article
Conference Proceedings
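The typing-speed and error-rate metrics reported above are usually computed with the standard text-entry definitions. The sketch below shows those conventional formulas (one "word" = 5 characters; character-level error rate via edit distance); the study's exact metric definitions may differ, so treat this as a reference implementation of the common convention, not the authors' code.

```python
def words_per_minute(transcribed, seconds):
    """Standard text-entry throughput: one 'word' = 5 characters."""
    return (len(transcribed) / 5.0) / (seconds / 60.0)

def error_rate(presented, transcribed):
    """Character-level error rate: Levenshtein (edit) distance between the
    presented and transcribed strings, normalized by the longer string."""
    m, n = len(presented), len(transcribed)
    # Row-by-row dynamic-programming edit distance.
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if presented[i - 1] == transcribed[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # deletion
                         cur[j - 1] + 1,     # insertion
                         prev[j - 1] + cost) # substitution / match
        prev = cur
    return prev[n] / max(m, n, 1)
```

With per-phrase timing and the presented/transcribed string pairs logged during a study session, these two functions suffice to reproduce WPM and error-rate comparisons between keyboard conditions.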
The application of an RGB-D camera for monitoring the allocation of visual attention among high-speed train drivers
With the continuous development of autonomous driving technology in China’s high-speed trains, the automatic train operation (ATO) system has begun to assume certain driving tasks, while the primary responsibility of drivers has progressively transitioned to a more pivotal supervisory role. However, in a long-term highly automated work environment, drivers may experience a decrement or even a complete loss of situation awareness (SA), which can precipitate delayed responses to emergencies, thereby compromising the safety of train operations. To understand the alterations in drivers’ SA during supervisory tasks, it is imperative to first acquire knowledge of their visual attention allocation. Consequently, this study proposes a monitoring method based on an RGB-D camera to investigate the visual attention allocation of high-speed train drivers across varying levels of SA. Initially, an RGB-D camera is employed to capture the driver’s 3D information during operation and to conduct face detection. Subsequently, the driver’s eye movements and head poses are analyzed using this 3D information. Thereafter, visual attention features are extracted from this information to estimate the visual attention allocation. Finally, experiments are conducted to analyze the changes in visual attention allocation of high-speed train drivers under different SA levels. The experimental results indicate that the RGB-D camera effectively monitors alterations in visual attention among high-speed train drivers with differing levels of SA, revealing that drivers with high SA allocate a greater proportion of their visual attention to the driver-machine interface than those with low SA. These findings offer a crucial reference for enhancing the supervisory efficiency and operational safety of high-speed train drivers.
Weiyi Shen, Beiyuan Guo
Open Access
Article
Conference Proceedings
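The final estimation step described above, turning per-frame gaze targets into an attention-allocation profile, reduces to a dwell-proportion computation. The sketch below illustrates that reduction; the area-of-interest (AOI) labels (e.g., "DMI" for the driver-machine interface) and the one-label-per-frame representation are assumptions, not the paper's pipeline.

```python
from collections import Counter

def attention_allocation(frame_aois):
    """Proportion of frames in which visual attention fell on each area of
    interest (AOI), given one AOI label per video frame.

    frame_aois: sequence of AOI labels (str) or None for frames where gaze
    was off-target or could not be resolved from head pose and eye movement.
    Returns a dict mapping each AOI to its share of the resolved frames.
    """
    labeled = [a for a in frame_aois if a is not None]
    total = len(labeled)
    if total == 0:
        return {}
    return {aoi: c / total for aoi, c in Counter(labeled).items()}
```

Comparing the resulting proportions (e.g., share of attention on "DMI") between driver groups is then a direct way to contrast high-SA and low-SA conditions.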
The Impact of Information Presentation Modes on Visual Search under Different Task Modalities
In recent years, the effect of information presentation on visual search in industrial application software has received much attention, but there is little research on this effect under different task modes. In this paper, industrial operation tasks are classified into three modes: unconscious viewing, daily operation, and emergency operation. Information presentation modes are likewise classified into three categories: text only, picture only, and a combination of picture and text. We explore the impact of the different presentation modes on performance under the three task intensities. A platform simulating a real operating environment was built for the different tasks, and behaviour was quantified by collecting participants' task completion times and correctness rates. The experimental data showed that the combination of graphics and text was the best presentation method in Tasks 1 and 2. In Task 1, the lower the information complexity, the smaller the gap between the three presentation methods; the higher the complexity, the lower the correctness rate of picture-only presentation. In Task 2, the greater the amount of information, the heavier the processing load of text-only display. In Task 3, which requires users to react quickly, combining graphics and text increases the amount of information, and displaying too much information reduces operating efficiency; it is therefore not the best presentation method for that task.
Xianchun Yang, Xiaojun Liu
Open Access
Article
Conference Proceedings
AI Agents as Knowledge Navigators: A Conceptual Framework for Multi-Agent Systems in Scientific Knowledge Management
The rapid and continuous expansion of scientific literature has led to an unprecedented increase in the volume of knowledge produced, significantly complicating its organization, retrieval, and effective utilization. Researchers face considerable challenges in managing this vast information landscape, particularly in terms of identifying relevant studies, maintaining contextual integrity, and integrating knowledge across multiple disciplines. Traditional database-driven search engines and static indexing methods often fall short in addressing these issues, as they lack the capacity to dynamically interpret scientific discourse, establish meaningful cross-disciplinary connections, and facilitate real-time knowledge synthesis. To overcome these limitations, this study proposes a novel approach to scientific knowledge management through a multi-agent artificial intelligence (AI) system. This system is designed to enhance interactive, context-aware, and dynamic information retrieval by leveraging a network of AI agents, each specialized in distinct scientific domains. These agents operate collaboratively, employing advanced adaptive learning mechanisms, context-sensitive reasoning, and cooperative problem-solving strategies to improve the organization and accessibility of scientific knowledge. The multi-agent framework integrates state-of-the-art Natural Language Processing (NLP) techniques, transformer-based architectures, and knowledge graph methodologies to provide a more nuanced understanding of scientific texts. By doing so, it enables automated cross-domain conceptual linking, validation of theoretical and experimental claims, and the seamless integration of newly acquired information into existing knowledge structures.
Moreover, the system incorporates reinforcement learning mechanisms to continuously optimize its retrieval and synthesis processes based on user interactions and evolving research trends. Beyond its immediate applications in knowledge retrieval, the proposed system fosters a paradigm shift in how scientific research is conducted, promoting more effective interdisciplinary collaboration and accelerating the development of innovative ideas. By facilitating automated synthesis of vast scientific corpora, it enables researchers to explore novel hypotheses, detect previously unrecognized connections between fields, and refine theoretical models with enhanced precision. Additionally, the ability of AI agents to autonomously process complex information reduces cognitive load on researchers, allowing them to focus on higher-order analytical tasks and creative problem-solving. This study contributes to the field of scientific knowledge management by introducing a scalable and adaptive AI-driven framework capable of supporting research in high-information-density environments. By bridging the gaps between disparate scientific disciplines and facilitating intelligent, real-time knowledge synthesis, the proposed system has the potential to revolutionize the way researchers interact with and utilize scientific information, ultimately advancing the efficiency and impact of knowledge discovery in the modern research landscape.
Paolo Gemelli, Laura Pagani, Mario Ivan Zignego, Alessandro Bertirotti
Open Access
Article
Conference Proceedings
Exploring Inductive and Deductive Qualitative Coding with AI: Investigating Inter-Rater Reliability between Large Language Model and Human Coders
Qualitative research provides valuable insights into complex human phenomena, but its coding processes are often time-intensive and labor-intensive. The advent of Large Language Models (LLMs) has introduced new opportunities to streamline qualitative analysis. This study investigates the application of LLMs in both inductive and deductive coding tasks using real-world datasets, assessing their ability to complement traditional coding methods. To address challenges such as privacy concerns, prompt customization, and integration with qualitative workflows, we developed QualiGPT, an API-based tool that facilitates efficient and secure qualitative coding. Our evaluation shows that the consistency level between AI-generated codes and human coders is acceptable, particularly for inductive coding tasks where themes are identified without prior frameworks. In our case study using data from a Discord community, GPT-4 achieved a Cohen's Kappa of 0.57 in inductive coding, demonstrating moderate agreement with human coders. For deductive coding, the inter-rater reliability between human coders and GPT-4 reached a Fleiss' Kappa of 0.46, indicating a promising level of consistency when applying pre-established codebooks. These findings highlight the potential of LLMs to augment qualitative research by improving efficiency and consistency while maintaining the contextual depth that human researchers provide. We also observed that LLMs demonstrated higher internal consistency compared to human coders when using a codebook for deductive coding, suggesting their value in standardizing coding approaches. Additionally, we explored a novel paradigm where LLMs function not merely as coding tools but as collaborative co-researchers that independently analyze data alongside humans. 
This approach leverages LLMs' strengths in generating high-quality themes and providing genuine content references, thereby enriching researchers' insights while maintaining human oversight to ensure contextual understanding and ethical standards. Nevertheless, challenges remain regarding prompt engineering, domain-specific training, and the risk of fabricated information, underscoring the importance of human validation in the final analysis. This research advances human-AI collaboration in qualitative methods by exploring AI-assisted coding and highlighting future improvements in interaction design.
He Zhang, Chuhao Wu, Jingyi Xie, Fiona Rubino, Sydney Graver, Jie Cai, Chanmin Kim, John Carroll
Open Access
Article
Conference Proceedings
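The inter-rater reliability figures reported above (Cohen's kappa of 0.57 for inductive coding) follow the standard chance-corrected agreement formula. The sketch below shows a minimal Cohen's kappa for two coders assigning one code per item; it is a textbook reference implementation, not the QualiGPT tool's code, and assumes single-label coding (Fleiss' kappa, used for the three-rater deductive comparison, generalizes this to more raters).

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders assigning one code per item:
    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
    p_e is the agreement expected by chance from each coder's marginals."""
    assert len(coder_a) == len(coder_b), "both coders must code the same items"
    n = len(coder_a)
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e) if p_e != 1 else 1.0
```

On the conventional interpretation scale, values around 0.41 to 0.60 (such as the reported 0.57 and 0.46) indicate moderate agreement, which is the basis for the "acceptable consistency" claim.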
Enhancing the Double Diamond: Human-AI Co-Creation in UI Design through AIGC Integration
The rapid development of artificial intelligence generated content (AIGC) technologies (such as ChatGPT, Midjourney, and Sora) has profoundly changed design practice by facilitating creativity and efficiency. As a structured design framework, the Double Diamond still plays an important role in the design process, and as AI technologies continue to develop, it can integrate them to better meet the needs of current design practice. This paper proposes an AIGC-enhanced Double Diamond that integrates AI tools (such as ChatGPT for idea generation, Midjourney/Sora for visualization, Wix ADI/Photoshop/Illustrator for prototyping, and Figma for delivery) with traditional and emerging design tools. We used a mental health application as a hypothetical scenario to assess the feasibility, flexibility, and inclusiveness of the framework across the four stages of Discover, Define, Develop, and Deliver. This paper aims to contribute to design theory and practice by extending research on the Double Diamond framework and providing practical insights for human-AI collaboration in UI design.
Risheng Liang, Sauman Chu, Lizhu Zhang
Open Access
Article
Conference Proceedings
Long seconds: how AI text-loading design affects the subjective feeling of waiting for users
An AI doctor agent is an intelligent medical tool that interacts with users through text-based dialogues to provide convenient health consultations and preliminary diagnostic services. However, during the interaction process, users inevitably encounter loading delays, making the design of appropriate text-based loading indicators particularly crucial. These loading indicators not only provide visual feedback but also help alleviate user anxiety, enhance trust in the system, and improve the overall user experience (UX). This study primarily explores the impact of different loading indicator designs—specifically, animation, gradient, and loop—on users' perceived waiting times and preferences. We conducted an experimental analysis with 30 participants recruited through convenience sampling. The results indicate that (1) loading indicators with gradient effects can effectively reduce users' subjective perception of loading duration, making the wait seem shorter; (2) animated and gradient loading methods can effectively reduce user boredom during waiting periods, helping to maintain a positive experiential state; (3) in practical use, users generally prefer animated loading presentations, finding them more engaging and visually appealing; and (4) compared to loop and gradient methods, the animated loading method received higher scores on the System Usability Scale (SUS), further validating its design advantages. The findings of this study provide important references for the practice and design of medical conversational agents and offer valuable insights for application development in other fields, enriching the theoretical foundation of generative AI within the domain of Human-Computer Interaction (HCI).
Jiwei He, Chien-Hsiung Chen
Open Access
Article
Conference Proceedings
A Metaverse Where Users, NPCs and AI Agents Can Coexist
The Metaverse is a rapidly evolving virtual ecosystem that offers endless possibilities for enhancing user experiences, encouraging interactivity, and allowing the exploration of immersive digital environments. This paper examines the critical role of NPCs (non-player characters) and AI agents in achieving these goals, emphasizing their ability to simulate lifelike behaviours and interactions that enrich the Metaverse. Drawing on recent advancements, such as the integration of AI bots on platforms like Facebook and Instagram, the paper explores how these agents can provide appropriate user support, entertainment, and companionship, while also enhancing the believability and vibrancy of virtual spaces. The paper also highlights a significant challenge: the overpopulation of user-created bots. Much like the rise of automated accounts on traditional social media platforms, an uncontrolled influx of these bots in the Metaverse could disrupt social dynamics, dilute authentic interactions, and overcrowd virtual environments. The resulting imbalance threatens to undermine the Metaverse's potential to deliver meaningful and enjoyable experiences. By researching and analysing the coexistence of NPCs, AI agents, users, and regulated bots, this paper promotes the idea of a Metaverse that supports both artificial and human interactions, ensuring a dynamic, engaging, and sociable virtual environment for all participants.
Panagiotis Markopoulos, Jouni Smed
Open Access
Article
Conference Proceedings
Soldier Perspectives in AI and Autonomous Systems
Artificially intelligent autonomous systems are a growing component of the Army’s battlefield assets. However, there is a realization that the design of these systems, especially armaments, must incorporate the user’s perspectives. These perspectives include desired features, concepts of operations, and limitations on the intelligence and autonomy of the system. DEVCOM Armaments Center’s Tactical Behavior Research Laboratory gathered information from over 130 Soldiers about their insights into artificial intelligence and autonomous weapons in operations. The survey was distributed in person to an international group of Soldiers (predominantly US) following force-on-force exercises and also distributed online. This presentation will give an overview of the guidance that these Soldiers provided to engineers and human systems integrators who are involved with developing intelligent autonomous weapons.
Elizabeth Mezzacappa, Dominic Cheng, Nikola Jovanovic, Jose Rodriguez, Alexis Cady, Robert Demarco, Darnell Simon, Jessika Decker, Nasir Jaffery, Hillary Rogers
Open Access
Article
Conference Proceedings
The New Fear of Missing Out (FOMO): Reflecting on the Human Impact of Overly Positive Social Media Through Interactive Installation
Fear of Missing Out (FOMO) is a psychosocial phenomenon characterized by anxiety over missing an event, activity, or opportunity, particularly when it is perceived to offer pleasure or value. The rise of social media has amplified this anxiety, as individuals are exposed to idealized portrayals of others’ lives, which can lead to feelings of dissatisfaction. Traditionally, people have attempted to mitigate FOMO by avoiding comparisons with others’ seemingly perfect lives. However, this strategy has become less effective due to evolving forms of FOMO. In modern society, FOMO is no longer confined to worrying about the activities of others; it now extends to the creation of positive content that individuals compare themselves to. When personal experiences fail to meet these idealized standards, anxiety and frustration ensue. This shift in FOMO dynamics complicates individuals’ ability to recognize and address their distress. As a result, new forms of FOMO have a profound impact on mental health and self-identity, mirroring the challenges faced in the pursuit of an idealized existence. Social media offers individuals the freedom to envision their potential, but this freedom paradoxically fuels anxiety. To explore this emerging form of FOMO, this project began with a formative study to identify the key factors of overly positive media that influence individuals. Based on these insights, a visual model was created to represent self-construction in social media, and a storyboard was developed to illustrate how people self-construct positive content. An interactive installation was then designed to simulate the process by which individuals unconsciously fall into self-comparison. Feedback from user experiences was collected to further optimize the design. The installation aims to raise awareness of new forms of FOMO, promote reflection, and encourage behavioural change. 
Ultimately, the project seeks to deepen public understanding of how modern FOMO impacts mental health and assist individuals in developing a more authentic relationship with social media.
Lina Xu, Haichuan Lin, Menglu Wang, Meng Nie
Open Access
Article
Conference Proceedings
Construction of a Fitts' Law Model for Ship Operators on Long Voyages
Long voyages are characterized by quarantine-like closure, 24-hour continuous duty, and extended passages to distant seas. This long-voyage state affects operators' operational abilities and, in turn, their performance in human-computer interaction, so interaction tasks that can be completed under conventional conditions may not be completed during a long voyage. However, the existing Fitts' law model does not account for the long-voyage state, which leads to errors in calculating human-computer interaction performance. Therefore, building on the original Fitts' law model, this paper designed an experiment to collect steering parameters under a 90-day long-voyage condition. Eight ship operators were selected for the collection of operating-parameter data, and a model of how the operators' Fitts' law parameters change over the 90-day voyage was constructed. The model can be used to calculate the execution time of human-computer interaction tasks for each day of a 90-day voyage.
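Fitts' law predicts movement time from target distance D and width W as MT = a + b·log2(2D/W). A day-dependent variant in the spirit of the model above can be sketched as follows; the linear drift and all coefficient values are illustrative assumptions, not the paper's fitted parameters.

```python
import math

def fitts_mt(a, b, distance, width):
    # Classic Fitts' law: MT = a + b * log2(2D / W), where the log term
    # is the index of difficulty of the pointing task.
    return a + b * math.log2(2 * distance / width)

def fitts_mt_voyage(day, distance, width, a0=0.2, b0=0.1, ka=0.001, kb=0.0005):
    # Hypothetical voyage-day model: intercept and slope drift linearly
    # with days at sea, so predicted movement time grows as the voyage
    # progresses. The paper instead fits these parameters from 90-day data.
    a = a0 + ka * day
    b = b0 + kb * day
    return fitts_mt(a, b, distance, width)
```

With such a model, the execution time of an interaction task can be predicted for any given day of the voyage by substituting that day into the drifting coefficients.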
Ning Li, Guofang Wang, Jing Fang
Open Access
Article
Conference Proceedings
Adaptive HMI for Cyber-Physical Systems: Facilitating Multi-Level System Understanding for Rapid Response and Recovery
In complex cyber-physical systems, especially those supporting high-stakes operational contexts, users face increasing demands to quickly comprehend and navigate intricate system data. As these systems grow in functionality, the volume and complexity of data presented on conventional interfaces can overwhelm users, particularly in scenarios requiring rapid response. During unexpected or unpredictable degradations of capability, users must transition swiftly from routine operations to diagnostic and triage actions, often under constrained timeframes. Conventional human-machine interfaces (HMIs) frequently lack the adaptive features needed to adjust not only the information presentation, but also the user interaction mechanisms based on the evolving system state. This limits user situational awareness and diagnostic efficiency. Additionally, traditional HMI designs do not adequately support multi-level system understanding, where both high-level and detailed abstractions of the system are often essential for making informed decisions under pressure. This gap between the increasing complexity of systems and the static nature of HMIs presents a critical challenge for maintaining operational resilience and effectiveness during degradation or recovery scenarios. To address these challenges, we are developing an adaptive human-machine interface (HMI) that leverages a dynamically reconfigurable display composed of digital OLED screens, each functioning as both a visual information source and an interactive control. Drawing on principles from cognitive systems engineering and ecological interface design (EID), the HMI is structured around an abstraction hierarchy, allowing users to view and interact with system states at multiple levels of abstraction, from high-level functional goals to detailed component states.
This multi-layered approach enables users to access context-specific information that dynamically shifts in response to ongoing system data displayed on the primary monitor, aligning with the principles of cognitive compatibility. By presenting critical information through adaptive iconography and visual cues on the OLED screens, the system facilitates rapid perception of key operational states, significantly enhancing users’ capacity for efficient diagnostic reasoning during degraded operations. The interface’s adaptability is critical to helping users shift focus seamlessly between strategic and tactical tasks, a need particularly acute during triage and recovery phases when response time is crucial. By encoding higher-order functional information, the adaptive HMI minimizes cognitive load and reduces the risk of errors by guiding attention to actionable insights relevant to the current operational context. This design also supports “direct manipulation” of data at multiple levels, allowing users to interact with core system function information directly from the HMI, thus reinforcing the system’s transparency and intelligibility. These features make it possible to sustain a level of operational resilience even under conditions of reduced capability, as users can rapidly construct a situational model of the system’s behavior and prioritize actions accordingly. The approach’s alignment with ecological interface design principles fosters a robust user interface by emphasizing perceptual cues that map to the underlying system structure, supporting faster recognition and more effective response. This HMI prototype ultimately represents a significant advancement in facilitating cognitive work across varying levels of detail and complexity, enabling users to better understand, diagnose, and restore critical functionality in cyber-physical systems under high-stress, time-sensitive conditions.
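The mapping from system state to displayed abstraction level can be sketched as a small data structure; the hierarchy below uses Rasmussen-style abstraction-hierarchy levels, and the node names and the two-mode selection rule are illustrative assumptions, not the prototype's actual design.

```python
# Illustrative abstraction hierarchy for an adaptive HMI: each level links
# higher-order functions to the lower-level elements that realize them.
HIERARCHY = {
    "functional_purpose": ["maintain_mission_capability"],
    "abstract_function": ["power_balance", "data_integrity"],
    "generalized_function": ["cooling", "routing"],
    "physical_function": ["pump_1", "switch_3"],
}
LEVELS = list(HIERARCHY)

def views_for(alert_level):
    """Select which abstraction levels to surface: routine operation shows
    only the top (goal-oriented) levels; a degraded state drills down to
    component detail to support triage and diagnosis."""
    return LEVELS[:2] if alert_level == "routine" else LEVELS
```

The adaptive behavior described above amounts to re-evaluating this selection whenever the monitored system state changes.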
Caroline Kingsley, Laura Mieses, Michael Jenkins
Open Access
Article
Conference Proceedings
Enabling and Advancing Adaptive HMI Ecosystems for Highly Configurable Multimodal Smartphone Control
In modern professional and operational environments, the increasing reliance on smartphone-based applications for mission-critical tasks has heightened the need for adaptive and efficient human-machine interfaces (HMIs). Many current smartphone interfaces, however, are limited in their flexibility to accommodate diverse input modalities (i.e., beyond the native touchscreen) or to adapt effectively to dynamic and high-stakes contexts. These limitations become especially evident in scenarios where operators must seamlessly switch between multiple control mechanisms—such as gestures, voice commands, or touch inputs—without sacrificing responsiveness or accuracy. Furthermore, most commercially available HMI solutions are designed with standardized user interactions in mind, which may hinder their usability for individuals with specific accessibility needs or for users working in constrained environments. This lack of adaptability can compromise an operator’s ability to interact quickly and accurately with necessary applications, potentially impacting job or task outcomes. The need for a more flexible, multimodal HMI ecosystem has never been greater, especially as smartphone use expands across professional, industrial, and defense sectors where varied input mechanisms could be leveraged to enhance productivity and effectiveness. To address these limitations in smartphone HMI, our team is developing "THALAMUS". The THALAMUS solution introduces a highly adaptable framework that allows for flexible, multimodal interaction with Android-based applications, empowering users to operate their devices through various input mechanisms without needing software-level integration with each application. THALAMUS leverages a customizable, invisible grid system on the touchscreen, enabling users to map inputs from diverse HMI peripherals—such as gesture gloves, voice commands, and body-worn controllers—onto specific areas of the screen.
This feature, combined with an ability to detect the active app and its state, allows THALAMUS to seamlessly transition between different grid configurations, tailoring its response to specific user and app requirements in real time. By enabling device-wide interaction customization, THALAMUS creates an “HMI amplifier” effect, allowing operators to integrate multiple control methods across any application without the need for modification, making it adaptable for a wide range of operational needs. The versatility of THALAMUS is particularly significant for professional and defense applications where hands-free, rapid, and reliable smartphone interaction is essential. In military environments, for instance, THALAMUS could enable operators to interact with situational awareness applications or communication systems using only voice or gesture controls, minimizing heads-down time and improving situational awareness. In other high-stakes contexts, such as healthcare or logistics, THALAMUS can enhance accessibility by allowing professionals to configure interaction methods based on immediate situational needs or individual ergonomic constraints. For accessibility-focused applications, THALAMUS’s customizability allows users to establish personalized controls, supporting users with unique physical or cognitive needs and enabling greater digital inclusivity. THALAMUS promises a foundational platform for multimodal HMI ecosystems, allowing diverse input devices to coalesce within a unified smartphone interaction framework. As an adaptable layer for managing multiple input modes, it facilitates highly configurable control environments that can evolve with emerging HMI peripherals, supporting future innovation in control modalities. THALAMUS thus represents an essential step toward more intelligent, adaptive HMI solutions, allowing device manufacturers and software developers to create tailored, mission-aligned user interfaces for varied operational communities.
Ultimately, THALAMUS not only enhances immediate usability but also establishes a scalable and flexible framework for future multimodal HMI ecosystem development across professional and operational domains.
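The invisible-grid concept can be illustrated with a minimal sketch: a normalized input from any peripheral is resolved to a grid cell, which is then translated into the screen coordinates where a synthetic touch would be injected. The grid dimensions, screen size, and function names here are assumptions for illustration, not THALAMUS internals.

```python
def grid_cell(x, y, rows, cols):
    """Map a normalized input (x, y in [0, 1)) from any peripheral
    (gesture glove, voice-selected index, etc.) to a (row, col) grid cell."""
    return min(int(y * rows), rows - 1), min(int(x * cols), cols - 1)

def cell_to_tap(row, col, rows, cols, screen_w, screen_h):
    """Convert a grid cell to the pixel coordinates of its centre,
    where a synthetic touch event would be injected."""
    return (int((col + 0.5) * screen_w / cols),
            int((row + 0.5) * screen_h / rows))
```

Because the grid is resolved outside any particular app, swapping in a per-app grid configuration (as THALAMUS does when it detects the active app) only changes `rows`/`cols`, not the input pipeline.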
Michael Jenkins, Daniel Ferris, Sean Kelly
Open Access
Article
Conference Proceedings
Investigation of image processing methods based on photographs for automatic posture recognition
Japan has the highest aging rate worldwide, underscoring the importance of maintaining daily health for older adults. Postural assessment serves as a valuable indicator of health status. The purpose of this study is to construct an automatic posture recognition model using photographs. As a preliminary investigation, pre-processing methods suitable for machine learning datasets were examined. Sagittal-plane photographs of 278 older adults were captured using Kinect v2. The photographs were cropped to exclude non-relevant areas and converted to grayscale. The cropped images then underwent background removal, four edge-detection methods (Prewitt, Sobel, Laplacian 4-neighbors, and Laplacian 8-neighbors), and silhouette extraction; together with the original images, this yielded seven distinct datasets. The images were classified into Ideal and Non-ideal posture categories by physical therapists. The recognition model employed a Support Vector Machine (SVM), with feature extraction using Scale-Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF). The dataset was divided into training (70%) and test (30%) subsets, with 15 cross-validation sets generated for robustness. Results showed that the Prewitt edge-detection method achieved the highest average F1 score (0.45 ± 0.07) with SIFT, while silhouette extraction yielded the best performance (0.48 ± 0.08) with SURF. The overall accuracy was relatively low; however, all pre-processing methods outperformed the merely cropped images, and a clear ordering of accuracy emerged. These results suggest that further improvements in accuracy could be achieved by tuning the recognition model, highlighting potential applicability to deep learning frameworks.
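Among the edge detectors compared, Prewitt is the simplest to state. A pure-Python sketch of its gradient-magnitude computation, with zero-padded borders, is given below for illustration; it is not the study's implementation.

```python
def prewitt_edges(gray):
    """Prewitt edge magnitude for a 2-D grayscale image given as a list
    of lists. Applies the horizontal and vertical Prewitt kernels and
    returns sqrt(gx^2 + gy^2) per pixel; borders are zero-padded."""
    kx = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]   # horizontal gradient
    ky = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]   # vertical gradient
    h, w = len(gray), len(gray[0])

    def px(r, c):
        # Zero padding outside the image.
        return gray[r][c] if 0 <= r < h and 0 <= c < w else 0

    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            gx = sum(kx[i][j] * px(r + i - 1, c + j - 1)
                     for i in range(3) for j in range(3))
            gy = sum(ky[i][j] * px(r + i - 1, c + j - 1)
                     for i in range(3) for j in range(3))
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out
```

Sobel differs only in its kernel weights (a 2 on the centre row/column), which is why the two detectors often produce similar datasets for downstream feature extraction.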
Naoki Sugiyama, Yoshihiro Kai, Hitoshi Koda, Toru Morihara, Noriyuki Kida
Open Access
Article
Conference Proceedings
Beyond the Response: How Timing and Context Shape Empathy, Responsiveness, and Social Presence in AI Conversations
This study investigates how conversational artificial intelligence (CAI) agents’ response delay and context awareness influence users’ perceptions of social presence, empathy, responsiveness, and engagement in the context of stress management and wellness for college students. Grounded in social response theory and social presence theory, this research focuses on the conceptual mechanisms through which these design features shape human-like interactions in CAI. Response delay refers to the intentional pause between a user’s input and the agent’s reply. Social response theory suggests that pauses reflect “thoughtfulness,” which aligns with human expectations of social interaction and thus fosters trust and engagement. A short delay (e.g., 2 seconds, accompanied by a visual indicator) may signal CAI’s humanlike cognitive effort and deliberation, enhancing users’ perception of social presence, or the sense of being with an intelligent, socially engaging entity. Context awareness refers to CAI’s ability to recall and integrate conversational history. A context-aware CAI references prior user interactions, ensuring coherence, personalization, and emotional resonance in its interaction with users. Social presence theory suggests that the ability to build on prior interactions enhances social presence. The increased social presence through CAI’s thoughtful delay and personalized responses based on context awareness may create a relationally engaging interaction, fostering user engagement with the CAI. This study will employ a 2 (response delay: yes vs. no) × 2 (context awareness: aware vs. unaware) between-subjects experiment. Four versions of a CAI, serving as a virtual wellness counselor for college students, will be designed to manipulate the four experimental conditions, and their impacts on users’ social presence, empathy, and responsiveness perceptions, as well as engagement, will be examined.
Findings will provide theoretical and practical insights into optimizing CAI design for human-like interactions in wellness contexts.
Haya Elbadawy, Wi-Suk Kwon
Open Access
Article
Conference Proceedings
The Impact of Secondary Task’s Perceived Value on Individuals' Creativity in Divergent Thinking Tasks
With the rapid advancement of information technology, media multitasking—the simultaneous use of media devices (e.g., computers, smartphones) to perform multiple tasks—has become increasingly common in daily life. Existing studies have indicated that media multitasking affects individuals’ creativity. One limitation of the previous research is that most studies focused only on utilitarian tasks; in practice, however, multitaskers are often attracted by hedonic tasks. The perceived value of a task (hedonic vs. utilitarian) is defined as the extent to which an individual’s intrinsic and extrinsic motives are satisfied. In this study, we conducted an experiment to examine how the perceived value of a secondary task influences individuals’ creativity performance in completing a primary task. Specifically, the study hypothesizes that the perceived value of the task influences attention and emotional experience, thereby impacting creativity through distinct pathways. Eighty-three participants were recruited for a laboratory experiment employing a single-factor, two-level design (perceived value of secondary tasks: hedonic vs. utilitarian), with participants randomly assigned to two groups. Participants were instructed to finish a primary creative writing task—listing as many new functions of a refrigerator as possible—and a secondary reading task within 20 minutes. The value type of the reading task was manipulated by setting the reading material as either learning material or entertainment news. Attention to the main task and the secondary task was measured using eye-tracking (Tobii Pro Spectrum). Participants’ emotional valence and arousal during the experiment were recorded through a facial expression analysis system, Noldus Facereader 9.0. Task-switching frequencies were recorded via a backend program.
After completing the tasks, participants filled out questionnaires to subjectively evaluate their attention and emotions. The findings reveal that the perceived value of secondary tasks significantly influences creativity performance in the primary task. Regarding attention, participants who conducted hedonic reading tasks had higher task-switching frequencies than those who conducted utilitarian ones, resulting in greater fluency of divergent answers. However, the effect of perceived value on creativity through the path of emotions was not supported in this study. Focusing on everyday multitasking scenarios, our study examines whether creativity varies with the value of different tasks. It vividly replicates multitasking scenarios common among university students, such as completing coursework or writing papers while simultaneously handling other tasks or browsing the web for leisure, thus achieving relatively high ecological validity. The present study reveals how the perceived value of tasks influences individuals’ creativity through attentional mechanisms. These findings provide evidence of the positive side of media multitasking.
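Task-switching frequency of the kind logged by the backend program can be derived from a timestamped sequence of attention labels; a minimal sketch, with the label names as assumptions:

```python
def switch_count(aoi_sequence):
    """Count transitions between areas of interest in a focus log,
    e.g. 'primary' (creative writing task) vs 'secondary' (reading task).
    Each change of label between consecutive samples is one switch."""
    return sum(1 for a, b in zip(aoi_sequence, aoi_sequence[1:]) if a != b)
```

Dividing this count by session duration gives the per-minute switching rate that the hedonic and utilitarian groups can be compared on.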
Yunxuan Xing, Zhuoyi Sha, Junhui Lin, Xingchen Zhou
Open Access
Article
Conference Proceedings
Unpacking the Invisible: Human Factors in a Data-Driven World
Design, at its core, is fundamentally concerned with human experience—shaping perception, environment, and behavior. In an increasingly interconnected and digital world, design ethnography and human factors research must evolve beyond traditional, verbally focused, and localized studies to incorporate innovative, technology-driven methodologies. This paper explores emerging approaches to human factors in design, with a focus on data ethnography and entangled ethnography as methods for gaining deeper insights into the complex interplay between perception, behavior, and technological ecosystems. Through the lens of Globally Inclusive Design, we examine the opportunities and ethical challenges these methodologies present, particularly in relation to AI systems. Ultimately, this paper demonstrates how advanced ethnographic methods can inform the development of human-centered AI systems that enhance usability, resilience, and well-being. By bridging methodological innovation with ethical considerations, it contributes to the ongoing discourse on human factors in complex technological and design ecosystems—advocating for design practices that reflect the diversity, complexity, and interconnectedness of human experiences.
Tarika Kumar, Matteo Zallio
Open Access
Article
Conference Proceedings
Interaction Design for AI Interfaces and Robots Incorporating Motion
This study explores the design of human-computer interaction (HCI) by using the movements of AI and robots. Specifically, it aims to (1) categorize the affective impact of different types of motion (e.g., gradual or sudden acceleration and deceleration or constant speed) on humans and (2) experimentally examine interactions with AI and robots that incorporate different types of motion to discuss how motion can be effectively designed in HCI. At this stage, findings suggest that motion itself is not merely a visual or mechanical feature, but a fundamental aspect of interaction design that can influence HCI.
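The motion types studied (constant speed versus gradual or sudden acceleration and deceleration) correspond to standard easing profiles over normalized time t in [0, 1]; the functions below are generic illustrations, not the stimuli used in the experiments.

```python
def linear(t):
    """Constant speed: position grows uniformly with time."""
    return t

def ease_in_out(t):
    """Gradual acceleration then deceleration (smoothstep curve),
    often read as smooth, 'natural' motion."""
    return t * t * (3 - 2 * t)

def sudden(t, jump_at=0.5):
    """Abrupt motion: the agent creeps, then snaps most of the way
    at one instant, which tends to read as startling."""
    return 0.1 * t if t < jump_at else 0.9 + 0.1 * t
```

All three profiles start at 0 and end at 1, so they move an agent over the same path in the same time; only the velocity profile, and hence the affective impression, differs.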
Akihiro Maehigashi, Seiji Yamada
Open Access
Article
Conference Proceedings
Constructing Hybrid-Driven Agents for MOBA through Player Modeling: Integrating Behavioral Analysis and Narrative Persona in Human-Centered AI
To address the challenges of rigid behavioral patterns, lack of semantic consistency, and insufficient emotional interaction in current MOBA game AI agents, this study proposes a Persona-Driven Game AI Agents (PDGAIA) framework grounded in Human-Centered AI (HCAI). The framework integrates Dual-Channel Player Profiling, a Narrative Consistency Engine, and Hierarchical Reinforcement Learning (HRL) to enhance the personalization, immersion, and tactical adaptability of game AI. Using Tencent’s MOBA game Honor of Kings as a case study, player personas are constructed from large-scale match logs and survey data. By combining objective behavioral indicators and subjective psychological modeling, core player personality dimensions and their tactical preferences are identified. Based on narrative strategies, the player-modeled character prototypes are transformed into AI agents with consistent personas, leveraging Hunyuan LLMs for multimodal expression. An HRL framework is then employed to decouple tactical decision-making from action execution, enabling the development of stylized AI combat systems. This framework successfully ensures a coherent persona expression across combat behavior, voice tone, and tactical communication. Experimental results demonstrate that the dual-path approach, which combines data-driven modeling and narrative persona construction, significantly improves anthropomorphism and social presence, delivering a more immersive and intelligent gaming experience. The proposed framework provides a reusable HCI methodology for constructing Game AI agents.
Chenxi Yan, Yuntian Zhang
Open Access
Article
Conference Proceedings
LLM Asks, You Write: Enhancing Human-AI Collaborative Writing Experience through Flipped Interaction
Large Language Models (LLMs) have offered unprecedented writing assistance, significantly improving the quality of writing outcomes. However, this assistance often relegates users to passive reviewers rather than active creators, potentially compromising their creative engagement and subjective experience in the writing process. To enhance users' writing engagement and agency while preserving the benefits of AI assistance, we propose a novel flipped interaction framework called Guided-Writing for human-LLM collaborative writing. Unlike the traditional Prompt-Generate mode, where users prompt LLMs to generate content, the Guided-Writing mode features controlled questioning from the LLM, guiding users to stay focused on their writing while leveraging the LLM's strengths in creative inspiration and text editing. Through a within-subjects experiment comparing both modes, our findings demonstrate that the Guided-Writing mode significantly enhances users' independent writing engagement and strengthens their sense of agency, ownership, self-achievement, and self-expression, while maintaining comparable mental workload. Moreover, users in the Guided-Writing mode exhibited greater willingness to take responsibility for their writing outcomes, with two-day post-experiment assessments indicating higher perceptions of content authenticity and reproducibility. This study demonstrates the practical benefits of the flipped interaction framework in enhancing users' writing experience and offers valuable insights for the future design of user-centric LLM-assisted writing tools.
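The flipped loop can be sketched as follows; `next_question` is a hypothetical stand-in for the LLM's controlled questioning, and the canned prompts are illustrative only, not the study's implementation.

```python
def next_question(draft_so_far):
    """Hypothetical stand-in for the LLM's controlled questioning:
    in the study this would be a model call conditioned on the draft."""
    prompts = ["What is your main point?",
               "Can you give a concrete example?",
               "How would you conclude?"]
    return prompts[min(len(draft_so_far), len(prompts) - 1)]

def guided_writing(answer_fn, turns=3):
    """Flipped interaction: the LLM asks, the human writes.
    Every sentence of the final text comes from the user (answer_fn);
    the model only steers, preserving agency and ownership."""
    draft = []
    for _ in range(turns):
        question = next_question(draft)
        draft.append(answer_fn(question))
    return " ".join(draft)
```

Contrast this with the Prompt-Generate mode, where the roles are reversed and the user mainly reviews model-generated text.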
Mingwei Chen, Pei-Luen Patrick Rau, Liang Ma
Open Access
Article
Conference Proceedings
RescueFlex: A Modular Intelligent Rescue Robot System for Complex Disaster Scenarios
This study presents RescueFlex, a modular intelligent rescue and transport robot system designed to address key challenges in complex disaster rescue scenarios. Based on a quadruped robot platform, the system integrates multi-robot collaboration, modular design, and autonomous navigation technologies, incorporating thermal imaging, search and rescue, material transport, and medical evacuation functional modules. The research context stems from rescue needs in disaster-prone regions, particularly in countries severely affected by disasters, such as China, where approximately 90% of disaster deaths in remote areas are caused by delayed rescue operations, with a critical rescue window of only 72 hours. Traditional rescue methods show significant limitations when facing complex environments: first, time inefficiency leads to limited coverage within the rescue window; second, poor environmental adaptability results in high equipment failure rates in extreme conditions; third, information processing delays are closely related to rescue failures and insufficient information sharing. The RescueFlex system addresses these challenges through user-centered design methods and human-computer interaction innovations. The system adopts a "user-centered" design philosophy, combining contextual design and systematic innovation approaches through in-depth user research, scenario analysis, and iterative design to create a highly modular and adaptive rescue system. The system architecture includes hardware systems (core navigation unit, functional modules, sensing systems), software systems (autonomous navigation, multi-robot collaboration, task planning), and interaction systems (intuitive interfaces, multimodal interaction). This research provides innovative solutions and design paradigms for the disaster rescue field from human-computer interaction and design perspectives, with significant implications for improving rescue efficiency and ensuring personnel safety.
Zhixin Cai
Open Access
Article
Conference Proceedings
Using the Analytic Hierarchy Process (AHP) to Prioritize Augmented Reality (AR) Device Considerations
When designing and developing emerging technologies, understanding user priorities for different product considerations is essential for ensuring that products meet user needs and expectations, leading to better adoption, satisfaction, and overall success in the market. Understanding and characterizing user priorities is critical for making design trade-offs and aligning design and development teams to high-value efforts. This study explores the application of the Analytic Hierarchy Process (AHP) to determine how users prioritize key product considerations in the selection of Augmented Reality (AR) devices. While other prioritization techniques such as Maximum Difference Scaling (MaxDiff) and the ranking method are commonly used to better understand user priorities, AHP provides a more accurate and reliable approach, resulting in priority weights that reflect both the relative importance and the intensity of user preferences across multiple factors. For this study, AR device users completed a web-based survey that included a series of pairwise comparison prompts to prioritize six AR device considerations: comfort, features and functionality, lens quality, price, battery life, and aesthetics. For each of the fifteen pairwise comparisons, participants first indicated which device consideration was more important to them. Then, the participants rated the strength of their preference for the more important factor. The results from a sample of 37 participants revealed comfort was the most significant factor, followed by features and functionality, and lens quality. Price and battery life were also important but ranked lower, while aesthetics was deemed the least important consideration. Utilizing AHP with a panel of remote participants proved to be an effective human-centered approach for prioritizing device considerations for AR devices.
The outputs from the AHP analysis establish not only a priority order but also priority weights, which offer deeper insights into exactly how important each factor is, revealing the relative intensity of user preferences. These priority weights can also be used to quantitatively evaluate products and prototypes, providing a more objective basis for comparison and decision-making. The methods utilized in this study facilitate a deeper understanding of user preferences and priorities, which can be applied to the development of many products and emerging technologies.
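AHP turns the fifteen pairwise judgments into priority weights via a reciprocal comparison matrix. A common approximation is the row geometric-mean method sketched below; the example matrix is illustrative, not the study's data.

```python
import math

def ahp_weights(matrix):
    """Priority weights from a reciprocal pairwise-comparison matrix,
    using the row geometric-mean method (a standard approximation to
    the principal-eigenvector weights). Weights sum to 1."""
    gm = [math.prod(row) ** (1 / len(row)) for row in matrix]
    total = sum(gm)
    return [g / total for g in gm]

# Illustrative 2-factor example: comfort judged 3x as important as price.
weights = ahp_weights([[1, 3], [1 / 3, 1]])  # roughly [0.75, 0.25]
```

In a full application, the 6×6 matrix built from all fifteen judgments would also be checked against AHP's consistency ratio before the resulting weights are trusted.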
Mohammad Jeelani, Michael Prieto
Open Access
Article
Conference Proceedings
Robot Autonomy through Learning from Multi-Camera Images and Human Selection Behavior
The COVID-19 pandemic, which began in early 2020, highlighted the need for technologies that can mitigate the risks of human exposure during infectious disease outbreaks. Given the ongoing threat of emerging pandemics, it is crucial to develop robotic systems that can be remotely operated by humans and eventually achieve autonomous behavior through learning from such interactions. As a fundamental study in this direction, this paper presents a method for enabling robots to operate autonomously in their environments. The proposed system integrates real-time and past images from multiple cameras and learns human selection behavior based on these images to enable autonomous decision-making. Experimental results demonstrate that the proposed system achieves significantly longer autonomous operation without collisions compared to the author’s previous approach.
Manabu Motegi
Open Access
Article
Conference Proceedings
PyPro: Think in Code. Grow in Logic!
The accelerating demand for computing professionals, fueled by over 500,000 unfilled positions and a projected need for 1.7 million more by 2030, underscores the urgent need to rethink how computer science (CS) is introduced to learners, particularly during the formative middle school years. Yet barriers such as limited early exposure, rigid curricula, and misconceptions about the field continue to hinder equitable access and sustained engagement. This concept paper introduces PyPro, a next-generation educational platform envisioned to transform the way students experience programming. PyPro integrates adaptive learning pathways, a conversational AI tutor, gamified challenges, and accessibility-first design to create a dynamic, inclusive, and student-centered environment for Python instruction. Based on the principles of personalized learning and interactive engagement, the platform reimagines CS education as a responsive, exploratory journey rather than a static instructional sequence. By continuously adapting to learner performance, offering on-demand support, and aligning content with student interests and needs, PyPro aims to cultivate computational confidence, deepen conceptual understanding, and promote long-term interest in CS. This paper explores PyPro not just as a tool, but as a conceptual model for how emerging technologies can reshape computer science education into a more equitable, engaging, and empowering experience for all learners.
Elijah Ballou, Osita Odunze, Michael Adeleke, Naja A Mack
Open Access
Article
Conference Proceedings
Traffic Sign Visual Recognition Study Based on Full-Reference Image Quality Assessment Algorithms
The factors influencing the visual recognizability of traffic signage are diverse. To study the synergistic effects and critical value models of these visual recognition elements, it is necessary to conduct experiments and collect data on the recognition influence factors. However, data collection based on human testers is limited by experimental conditions, making it difficult to establish large-scale datasets and avoid individual errors. The full-reference algorithm for image quality evaluation is a technique used in the field of computer image recognition to identify image distortions and assess distortion scores. Therefore, this study applies this method in cross-modal transfer learning, simulating drivers' clarity judgments for images representing different presentations of multivariable elements. By varying three variables—font height, weight, and recognition distance—this approach simulates signage presentation images at different recognition distances. Using a backpropagation neural network (BPNN) to model signage recognizability and the Zernike moment algorithm to simulate recognition for given variable settings, large-scale calculations are performed to analyze the regression curves of recognition variables and compare them with data from human testers. The results show that using computational algorithms to process images with different variables helps identify the critical points of these variables, significantly improving the accuracy and efficiency of the research. It also reduces the individual differences inherent in subjective judgment and can be applied to future studies on visual recognizability.
Duan Wu, Jingwen Tian, Renzhou Gui, Meng Wang, Yuhong Ma, Zhixuan Sun, Ruiyue Tang, Yaqi Wang, Jiawei Bi, Peng Gao
Open Access
Article
Conference Proceedings
Graphical Highlighting Study in Train Driving Interface
In recent years, with the advancement of train operation control technology and the rapid development of modern electronic technology, many electronic devices have been widely used in the cockpit, significantly enhancing the automation level and information processing capabilities of trains. However, this trend has also led to a surge in human-machine interaction information, resulting in a substantial increase in the number of displays and controllers in the cab. Excessive information and a complex driving interface not only increase the cognitive and operational burden on drivers but also seriously affect their work efficiency and decision-making quality, posing a major threat to the safety of train operations. Graphics in the driving interface are an important way for drivers to obtain various types of information during vehicle operation. Design issues such as their size, color, shape, and display mode often affect the efficiency of drivers' visual search. Highlighting is a visual strategy that marks a specific number of items in specific ways to improve the efficiency of visual search. This paper takes the train driving interface as the background and studies drivers' visual search efficiency for graphical information under the influence of different background colors, graphic colors, and flashing frequencies of the highlighting codes. Through the experiment, the reaction time and correct response rate of the subjects are obtained, along with the optimal combination of the three kinds of highlighting codes. Search efficiency is highest with a black background and red graphics flashing at 27 Hz, which can be used as the optimal solution for highlighting graphical information in the design of the train driving interface.
Wenqian Zhu, Jianrun Zhang
Open Access
Article
Conference Proceedings
The Impact of Color Combinations on Recognition Efficiency and Visual Fatigue in Industrial Alarm Systems
This study delved deeply into the influence of foreground and background color combinations on recognition efficiency and visual load in industrial alarm systems. By conducting experiments with 15 participants and 54 color combinations, descriptive statistical analysis indicated that color combinations significantly affected the reaction time of visual recognition. Foreground colors, particularly those with higher saturation and luminance, played a crucial role in promoting visual recognition, while the impact of background colors was relatively less pronounced. A two-way ANOVA was performed, revealing that foreground color was a dominant factor influencing reaction time, while the background color and their interaction had relatively minor significant effects. The research findings can be applied to optimize the color design of industrial alarm systems, thus enhancing information recognition efficiency and reducing the probability of operator errors. Moreover, these results have implications for other visual-information-based fields. Future research could explore long-term effects and more complex color-related factors.
Yiwei Xie, Xiaojun Liu
Open Access
Article
Conference Proceedings
Inclusive Gaming Through Brain-Computer Interfaces: The Mind Mastery Experience
The gaming industry continues to evolve, yet accessibility barriers persist, affecting 91% of gamers with disabilities. Mind Mastery, a Brain-Computer Interface (BCI) game developed in Unity 3D, addresses these challenges by enabling hands-free gameplay through electroencephalography (EEG) signals, gyroscopic head movements, and blink detection. Utilizing the Muse 2 EEG headset, the system reduces hair-related interference and optimizes signal filtering for improved accuracy. EEG data is streamed in real-time via MuseLSL and integrated into Unity 3D using LSL4Unity, ensuring responsive and adaptive gameplay. Players interact through gyroscopic head tilts, blink detection, and concentration-based controls, enabling actions such as jumping, shrinking, growing, and force-falling without traditional input devices. Developed with an emphasis on real-time signal processing and adaptive feedback, Mind Mastery incorporates gyroscopic calibration, blink detection, and dynamic difficulty adjustment to enhance user experience. A usability study with 22 participants demonstrated Mind Mastery's potential while highlighting areas for improvement, including error handling, blink calibration, and increased challenge variability. In response, levels have been refined, and blink calibration and error-handling enhancements are in progress. Future development will integrate advanced motor imagery detection for thought-based commands, alongside an intelligent conversational agent that provides real-time guidance and adaptive feedback based on player performance. Additionally, Mind Mastery will expand to include educational components with gamification elements, where players develop cognitive skills while earning rewards for brain-activity control and completing educational challenges.
This educational integration will transform Mind Mastery into a comprehensive tool for cognitive development, valuable in therapeutic and educational settings while maintaining its core mission of breaking traditional barriers in gaming accessibility. By advancing EEG-based interaction and real-time BCI integration, Mind Mastery demonstrates the potential of neurotechnology in fostering inclusivity in interactive experiences.
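The blink-detection control described above can be illustrated with a simplified amplitude-threshold detector. This is a hedged sketch, not the Mind Mastery pipeline (which streams Muse 2 data via MuseLSL into Unity); the threshold, refractory window, and synthetic trace below are illustrative assumptions.

```python
def detect_blinks(samples, threshold=150.0, refractory=64):
    """Toy blink detector: flag samples whose amplitude exceeds a threshold.

    `samples` is a list of EEG amplitudes (in microvolts) from a frontal
    channel, where eye blinks appear as large artifacts. After each
    detection, `refractory` samples are skipped so one blink is not
    counted twice. These parameter values are illustrative only.
    """
    blinks, skip_until = [], -1
    for i, v in enumerate(samples):
        if i >= skip_until and abs(v) >= threshold:
            blinks.append(i)
            skip_until = i + refractory
    return blinks

# Synthetic trace: baseline noise with two large deflections ("blinks").
trace = [5.0] * 300
trace[80] = 200.0   # first blink artifact
trace[90] = 180.0   # still inside the refractory window -> not re-counted
trace[220] = 210.0  # second blink
print(detect_blinks(trace))  # -> [80, 220]
```

In a real BCI game loop, each detected blink index would be translated into a discrete game action (for example, a jump), which is why the refractory window matters: without it, a single blink artifact spanning many samples would fire the action repeatedly.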
Osita Odunze, Elijah Ballou, Abdul Kanu, Andrew Kelly, Eric Void, Naja A Mack
Open Access
Article
Conference Proceedings
Human and Machine as Seen at the Co-Creation Age: A Co-Word Analysis in Human Machine Co-creation (2014–2024)
This paper explores the evolving landscape of human-machine co-creation, focusing on its development in the context of the ACM Conference on Human Factors in Computing Systems (CHI) from 2014 to 2024. We employ co-word analysis to identify emerging trends, central themes, and the intellectual trajectory of this field. The study highlights the shift from viewing machines as mere tools to recognizing them as collaborative partners in creative processes. By understanding these dynamics, we aim to provide insights into the implications of this paradigm shift for creativity, innovation, and societal impact, ultimately fostering a more inclusive and effective approach to human-machine interaction in various domains.
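The core step of a co-word analysis like the one described above is counting how often keyword pairs co-occur within the same paper; these counts become edge weights in a network whose clusters indicate themes. The sketch below is a minimal illustration with hypothetical keyword lists, not the study's CHI corpus or its analysis code.

```python
from itertools import combinations
from collections import Counter

def coword_counts(keyword_lists):
    """Count co-occurrences of keyword pairs across papers.

    Each inner list holds one paper's keywords; every unordered pair of
    distinct keywords in a paper contributes one co-occurrence. Pairs are
    stored in sorted order so (a, b) and (b, a) count as the same edge.
    """
    pairs = Counter()
    for kws in keyword_lists:
        for a, b in combinations(sorted(set(kws)), 2):
            pairs[(a, b)] += 1
    return pairs

# Hypothetical keyword lists from three papers.
papers = [
    ["co-creation", "generative AI", "creativity"],
    ["co-creation", "creativity", "design tools"],
    ["generative AI", "creativity"],
]
counts = coword_counts(papers)
print(counts[("co-creation", "creativity")])  # -> 2
```

Over a decade of proceedings, shifts in which pairs carry the heaviest weights trace the intellectual trajectory of the field, such as the move from tool-centric to partner-centric framings of machines.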
Mengyao Guo, Jinda Han, Ze Gao, Yuan Zhuang, Xingting Wu
Open Access
Article
Conference Proceedings
Object-Oriented Encapsulation-Based Virtual Equipment Modeling for Digital-Twin Production Systems
As digital-twin technology becomes integral to smart manufacturing, the creation of high-fidelity yet maintainable virtual equipment models is critical for effective production-unit debugging and system-level coordination. This paper presents an object-oriented encapsulation approach that leverages information-hiding and interface-uniformity mechanisms to modularize device states, interfaces, and behavior logic. By encapsulating each physical device within a self-contained class exposing only standardized input/output methods, the proposed method reduces coupling and enhances model reusability. We further introduce a hierarchical composition strategy, enabling seamless aggregation from single devices to work-unit, production-line, and workshop-level models. On top of this, we develop a semantic-signal aggregation framework and event-triggering mechanism that automatically translate low-level physical signals into discrete events for control and scheduling. A case study of a riveting workstation demonstrates improvements in interface consistency, modeling accuracy, and extensibility. The results confirm that our encapsulation-based modeling method offers a portable, scalable, and easily maintainable solution for digital-twin production systems, laying a solid foundation for advanced debugging workflows and intelligent decision support.
Fengyi Feng, Xiaojun Liu
Open Access
Article
Conference Proceedings
Passthrough Extended Reality in Maritime Commissioning
In this paper, a feasibility study of passthrough Extended Reality (XR) with maritime commissioning as a use case is presented. Passthrough XR is a technology designed to implement Augmented Reality (AR) with modern Virtual Reality (VR) devices. The driving force for this research arises from the shipbuilding industry's need to optimise installation and validation processes during critical phases, such as sea trials and the larger commissioning process. Prior research into the employment of XR technologies within the shipbuilding industry shows that tools allowing hands-free operation should be favoured, and the use of video passthrough HMDs should be avoided due to the (then) limited capabilities offered by the technology. The research involved the development of an XR environment designed for analysing the passthrough capabilities of modern VR HMDs using the Meta Quest 3 platform. The primary objective of this case study was to assess the maturity of contemporary mobile XR technologies for industrial applications within the shipbuilding sector via a testing session held for participants linked to the shipbuilding industry (n=33). The results revealed potential for the contemporary application of passthrough XR technologies in shipbuilding.
Joni Rajamäki, Mirva Tapola, Mikko Salonen, Olli Heimo, Teijo Lehtonen
Open Access
Article
Conference Proceedings
Augmented Reality Aids Logistics: Augmenting Workers' Abilities During Customs Inspections
Random goods inspections are a critical component of customs operations, ensuring that imported and exported goods comply with relevant regulations. Currently, these inspections rely on manual efforts by operators, making them susceptible to delays and increased costs, often attributed to "human errors." This paper shifts the narrative from "human errors" to "human factors", emphasizing the role of cognitive and operational challenges in inspection tasks. We introduce “Virtual Storage Assistant”, a wearable Augmented Reality tool designed to enhance the efficiency and accuracy of customs inspections. By leveraging Augmented Reality, the proposed system aims to bridge the gap between: (1) the required performance standards for the inspection process, and (2) the specific cognitive abilities and skills of customs workers, thereby streamlining operations and reducing associated inefficiencies.
Sara Buonocore, Edoardo Granata, Antonia Maria Tulino, Giuseppe Di Gironimo
Open Access
Article
Conference Proceedings
Effects of Computer Aided Design Modeling System with Hand Gesture and Eye Tracking Interface
As the core tool of industrial product design, computer aided design (CAD) software mainly adopts the traditional mouse and keyboard (MK) interface, which limits the naturalness and intuitiveness of the CAD modeling process. Recently, the emerging multimodal interface combining hand gesture and eye tracking (HE) has provided a natural and efficient interaction method, and the use of this novel interface for CAD modeling has become an important development trend. This study aims to analyze the effects on user performance in a practical application, where an experimental comparison between HE and traditional MK interfaces is set up based on a CAD modeling system. With this prototype HE interface system for CAD modeling, users are able to draw, move, zoom, and rotate models by hand gesture and select target models by eye tracking. To assess the practical efficacy of the HE interface in CAD modeling, a user experiment was conducted involving 16 participants. They were tasked with performing specific CAD commands, including model drawing, selecting, moving, zooming, and rotating. Each of them needed to use both interfaces (i.e., HE and MK interfaces) separately to complete specific modeling tasks. Metrics recorded included task operation time, accuracy, and user preference. The results indicate that the HE interface reduces operation time compared to the traditional MK interface: the average operation time is reduced by more than 10%, with a corresponding increase in operation efficiency. However, average operation accuracy is slightly compromised, with a decrease of less than 1% in precision when using the HE interface.
Yuekang Wang, Jia Hao, Hongwei Niu, Qing Xue, Xiaonan Yang, Fei Di
Open Access
Article
Conference Proceedings
Adaptive Control Point Manipulation Technique for Free-Form Deformation Based on Gesture Interaction
Currently, the majority of Computer-Aided Design (CAD) software adopts a feature-based parametric modeling approach, which effectively captures and represents the detailed attributes of design objects. Despite its precision and utility, this method demands extensive parameter adjustments within the constructed solid models to accommodate design modifications. Such complexity imposes significant constraints on the fluidity and flexibility required during the conceptual design phase, thereby impeding the expression of innovative design ideas. Furthermore, traditional CAD software predominantly relies on the "mouse and keyboard" interaction paradigm. This approach maps three-dimensional manipulation commands onto one-dimensional or two-dimensional input signals, which reduces the naturalness and intuitiveness of the modeling process. Consequently, there is a pressing need to develop advanced three-dimensional modeling methodologies that support free-form deformation (FFD) and integrate novel interaction paradigms to enhance user experience and foster design creativity. The concept of free-form deformation, first introduced by Sederberg and Parry (1986), involves embedding a geometric object into a control lattice defined by a set of control points. The deformation of the lattice, driven by manipulating these control points, is propagated to the embedded object, resulting in its transformation. Over time, free-form deformation technology has evolved significantly. A method for deforming geometric models was proposed to enhance usability by replacing grid-based operations with direct point manipulation. This approach was later refined with Dirichlet free-form deformation, which automates the generation of influence regions and calculates spatial density using constraint points. 
Subsequently, advancements introduced adaptive control point generation, multi-resolution editing, and hierarchical deformation, significantly improving the flexibility and precision of free-form deformation techniques. Parallel to advancements in FFD, gesture-based interaction has emerged as a novel interaction technology. Gesture interaction provides a more natural and intuitive mode of operation compared to traditional WIMP user interfaces by capturing hand movements and translating them into control commands. Prior research has demonstrated that gesture-based systems are not only easier for new users to adopt but also enable more immersive and efficient manipulation of virtual objects. However, challenges remain in bridging the gap between gesture-based input and precise control in modeling environments, particularly in the context of free-form deformation. This paper proposes a gesture-based adaptive control point manipulation technique for free-form deformation, which integrates gesture interaction into the free-form deformation process. To address the interaction-amplitude mapping issue inherent in gesture interaction, an adaptive control point manipulation method based on mapping functions is designed, aiming to improve the user interaction experience and enhance model editing efficiency during the conceptual design phase. First, the study compares and analyzes the basis functions and deformation tools of free-form deformation to optimize the initial control point layout. Three mapping functions are developed to address the mapping discrepancies between hand gesture movement speed and control point movement speed during gesture interaction. To evaluate the interaction efficiency and operation smoothness of these three mapping modes, a comparative experiment is conducted involving 15 participants, which identifies the optimal mapping mode for interaction performance.
Each of the participants needs to perform 3 rounds of control point movement operations in the experimental scenarios with 5 different movement distances preset by the system, under three different mapping modes, for a total of 45 operations. Metrics recorded include task completion time, operation accuracy and user preference. An adaptive control method for control points based on the optimal mapping function is designed, defining adaptive parameters within the mapping pattern and providing an adaptive parameter adjustment method for new users. Finally, an evaluation experiment is conducted with 10 participants to assess the effectiveness of the proposed adaptive control methods. Participants are required to perform specified model deformation tasks using both the "mouse and keyboard" interaction paradigm and the gesture-based interaction paradigm. Key metrics, including interaction accuracy and user preferences, are recorded for analysis. The results indicate that the gesture-based interaction paradigm not only achieves satisfactory interaction accuracy but also offers notable advantages in terms of enhanced immersion and reduced fatigue compared to the traditional "mouse and keyboard" interaction paradigm.
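The Sederberg-Parry formulation referenced above can be illustrated in one dimension: a point's parametric coordinate weights each control-point offset with a Bernstein basis function, so moving one control point deforms the embedded geometry smoothly around it. `ffd_1d` and its lattice below are a minimal sketch under that classic formulation, not the paper's adaptive manipulation method or its mapping functions.

```python
from math import comb

def bernstein(i, n, t):
    """Bernstein basis polynomial B_{i,n}(t)."""
    return comb(n, i) * t**i * (1 - t)**(n - i)

def ffd_1d(x, offsets):
    """Displace a point x in [0, 1] through a 1-D lattice of control offsets.

    Each control point i contributes its offset weighted by B_{i,n}(x),
    the one-axis restriction of the trivariate Bernstein scheme used in
    Sederberg-Parry free-form deformation.
    """
    n = len(offsets) - 1
    return x + sum(bernstein(i, n, x) * offsets[i] for i in range(n + 1))

# Four control points; only the second one is pulled by +0.3.
offsets = [0.0, 0.3, 0.0, 0.0]
print(round(ffd_1d(0.0, offsets), 3))  # -> 0.0 (endpoints stay fixed)
print(round(ffd_1d(1.0, offsets), 3))  # -> 1.0
print(round(ffd_1d(0.5, offsets), 3))  # interior points shift smoothly
```

In a gesture-driven system, the offsets would be updated from hand movement through a mapping function relating hand speed to control-point speed, which is exactly the mapping discrepancy the paper's three candidate functions are designed to resolve.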
Jia Hao, Yuxuan Liu, Hongwei Niu, Xiaonan Yang, Yuhan Hu
Open Access
Article
Conference Proceedings
Augmented Reality for Learning Traditional Wicker Weaving
Traditional craft learning methods often fail to effectively convey the tactile and material-specific nuances of practices like wicker weaving. This study explores the use of augmented reality (AR) animations to enhance the instructional process by integrating sensory cues and interactive overlays. Focusing on two common pain points in the early stages of wicker weaving, we developed AR prototypes using Blender 4.2, incorporating visual aids like colors, directional arrows, and guided pauses. A comparative experiment was conducted with two groups: the test group learning via AR and the control group through traditional video tutorials. While the control group exhibited fewer weaving flaws, the test group produced structurally superior baskets and demonstrated better retention of step-by-step actions. Qualitative feedback highlighted AR's potential to engage users, though it also introduced higher frustration levels. Our findings suggest AR's promise in preserving and modernizing traditional crafts, bridging digital tools with physical learning.
Ian Garcia, Janne Huys, Louis Van Bouwel, Alexander Van Volsom, Jouke Verlinden
Open Access
Article
Conference Proceedings
Toward Intuitive Interaction: A Cognitive Workflow Analysis of Human-Robot Interaction in Extended Reality Interfaces
Extended Reality (XR) technologies promise to enhance human–robot interaction (HRI) by offering intuitive spatial interfaces and immersive input. However, traditional evaluation methods, such as task completion time, error rates, or NASA-TLX, often obscure where cognitive and physical demands arise or are reduced within the interaction process. This study conducts a cognitive workflow analysis of XR interfaces by integrating established methodologies: Goal-Directed Task Analysis (GDTA), Norman's Seven Stages of Action, and Applied Cognitive Task Analysis (ACTA). These methods collectively trace how interface design affects user cognition, from mission goals to task-level interactions, revealing specific gulfs of execution and evaluation. We apply this approach to compare two XR interface types, a grid-based menu and spatial affordance-based pop-ups, within an emergency-response scenario using Microsoft HoloLens 2. The analysis uncovers hidden cognitive challenges, such as inefficient visual search and occlusion issues, often missed by conventional metrics. The findings offer XR designers actionable insights into usability challenges and demonstrate how cognitive analysis can guide more intuitive interface development.
Pattaraorn Yu, Arisara Jiamsanguanwong, Gim Song Soh
Open Access
Article
Conference Proceedings