Human Error, Reliability, Resilience, and Performance

AHFE International

Accelerating Open Access Science in Human Factors Engineering and Human-Centered Computing

Editors: Ronald Boring

Topics: Human Error, Reliability & Performance

Publication Date: 2025

ISBN: 978-1-964867-49-6

DOI: 10.54941/ahfe1005984

Articles

Virtual Human in the Loop (VHITL): Generating Synthetic Human Performance Data with HUNTER

The Human Unimodel for Nuclear Technology to Enhance Reliability (HUNTER) software is used for dynamic human reliability analysis (HRA). HUNTER creates a digital human twin (or virtual operator) that interfaces with a digital twin (or nuclear power plant simulator). HUNTER is procedure driven, reflecting a characteristic of safety domains in which much decision making is rule based and captured in procedures. The outputs of HUNTER extend beyond the typical outputs of an HRA estimation method and approach the level of human performance data acquired from human-in-the-loop (HITL) studies using operators and a plant simulator. An advantage of HUNTER is that it creates a virtual human in the loop (VHITL). As such, HUNTER is a unique source of synthetic data on human performance. This paper highlights the use of HUNTER in automated human factors evaluations. HUNTER augments HITL studies by providing a virtual tool to screen human interactions with novel technologies in the control room.

Ronald Boring, Thomas Ulrich, Roger Lew, Jooyoung Park

Open Access

Article

Conference Proceedings

Time Distribution Analysis for Task Primitives to Support Dynamic Human Reliability Analysis

To support data collection for dynamic human reliability analysis (HRA), this study investigates time distributions for task primitives defined in the Goals, Operators, Methods, and Selection rules (GOMS)–Human Reliability Analysis (HRA) method and Human Reliability data EXtraction (HuREX). GOMS-HRA was developed to provide cognition-based time and human error probability (HEP) information for dynamic HRA calculations within the Human Unimodel for Nuclear Technology to Enhance Reliability (HUNTER) framework, while HuREX is a comprehensive HRA data collection method developed by the Korea Atomic Energy Research Institute (KAERI). In this paper, we examine time distributions by using experimental data collected from the Simplified Human Error Experimental Program (SHEEP) study, which proposes an HRA data collection framework to complement full-scope simulator research and gather input data for dynamic HRA by using simplified simulators such as the Rancor Microworld simulator. This paper investigates whether the time required for GOMS-HRA and HuREX task primitives fits any of 13 statistical distributions. Additionally, we compare and discuss the time distributions obtained from both student operators and professional operators. The study identified several time distributions for five GOMS-HRA and four HuREX task primitives. In the future, the results are expected to provide objective reference data on the elapsed time for task primitives and aid in realistically simulating scenarios within dynamic HRA.
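
The idea of checking task-primitive times against candidate distributions can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes only two candidates (lognormal and exponential) rather than the 13 examined in the paper, compares them by AIC as an assumed criterion, and uses hypothetical elapsed times rather than SHEEP data.

```python
import math
import statistics


def fit_lognormal(times):
    """Maximum-likelihood lognormal fit: mean and population std of log-times."""
    logs = [math.log(t) for t in times]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)
    loglik = sum(
        -math.log(t * sigma * math.sqrt(2 * math.pi))
        - (math.log(t) - mu) ** 2 / (2 * sigma ** 2)
        for t in times
    )
    return {"name": "lognormal", "params": (mu, sigma), "loglik": loglik, "k": 2}


def fit_exponential(times):
    """Maximum-likelihood exponential fit: rate is the reciprocal of the mean."""
    rate = 1.0 / statistics.fmean(times)
    loglik = sum(math.log(rate) - rate * t for t in times)
    return {"name": "exponential", "params": (rate,), "loglik": loglik, "k": 1}


def aic(fit):
    """Akaike information criterion; lower means a better trade-off."""
    return 2 * fit["k"] - 2 * fit["loglik"]


# Hypothetical elapsed times (seconds) for one task primitive; not study data.
times = [2.1, 2.8, 3.0, 3.4, 3.9, 4.2, 4.8, 5.5, 6.9, 9.3]

fits = [fit_lognormal(times), fit_exponential(times)]
best = min(fits, key=aic)
for fit in fits:
    print(f"{fit['name']:<12} AIC = {aic(fit):.2f}")
print("best candidate:", best["name"])
```

With more candidates, the same loop extends by adding one fitting function per distribution; a goodness-of-fit test could replace AIC if a pass/fail decision per distribution is wanted.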

Taewon Yang, Ronald Boring, Jooyoung Park

Open Access

Article

Conference Proceedings

Methodology for analyzing the resilience capabilities of manufacturing companies

In an increasingly interconnected and globalized business world, companies today face the reality that “black swan” events and unpredictable disruptions are occurring more and more frequently. These shocks, which in the past were considered rare exceptions, have now become the rule. In this era of unexpected disruptions, companies are forced to strengthen their robustness and adaptability. The objective of this paper is to identify a company's resilience capabilities based on value stream mapping. To this end, the assessment of external turbulence and of the company's own resilience forms the basis for developing agile solutions to new types of problems in resilience situations.

Jessica Mack, Oliver Scholtz

Open Access

Article

Conference Proceedings

Modeling cognitive behavior of human errors based on ACT-R: Design of color cued operation switching task

Although the mechanization of labor and the automation of work have been advancing in recent years, accidents caused by human error continue to occur. To prevent such accidents, it is essential to identify human errors and understand their underlying mechanisms. One common approach in research is to conduct experiments using cognitive tasks to measure human error and analyze its mechanisms based on the obtained data. However, few studies have attempted to understand the data generation process by simulating the rules governing measurement data using cognitive models. Therefore, this study aims to construct a cognitive model using the ACT-R cognitive architecture and to develop a cognitive task for measuring cognitive behavior and human error. Furthermore, by conducting experiments with the developed cognitive task, we will examine whether it is possible to measure cognitive behavior when human error occurs. Specifically, we designed a cognitive task in which participants perceive two displays, each showing two numbers in a predetermined order, perform arithmetic operations, and input their answers using a keyboard. The task incorporates four colors for the displayed numbers (green, purple, red, and black), with each color corresponding to a different arithmetic operation. To achieve precise measurement, a video camera was placed between the two displays to capture the participants’ faces from the front, enabling accurate eye-gaze tracking. A total of six undergraduate and graduate students aged 18 to 29, all enrolled at Kyoto University, participated in the experiment. The experiment was conducted individually, and each participant completed the cognitive task seven times, including practice trials. By comparing the results with those of previous studies, we confirmed that the error rate was significantly improved.

Keisuke Takeuchi, Kimi Ueda, Hirotake Ishii, Hiroshi Shimoda

Open Access

Article

Conference Proceedings

Reanalyzing the BP Texas City Refinery Accident with FRAM (Functional Resonance Analysis Method) - 20 years of complexity and learning

This study reanalyzes the BP Texas City refinery accident of March 2005 using the Functional Resonance Analysis Method (FRAM), based on technical-scientific materials such as reports, articles, and documents from the institutions involved and regulatory agencies, as well as interviews with former employees. The goal is to uncover the human factors and their complex interactions that are overlooked by traditional risk assessment techniques, which are suited to linear systems but limited for complex high-risk workplaces such as an oil refinery. FRAM was chosen for its ability to address the interactions in a complex sociotechnical system, enhancing a human factors approach. This reanalysis revealed the significant influence of organizational elements, such as a fragmented culture and workforce reduction, on decision-making through hierarchical structures. Even two decades later, the study highlights that there is still much to learn from this event, especially as FRAM enables a deeper understanding of the complexities inherent in high-risk work environments, which make up most workplaces in the O&G industry, from new plants to their decommissioning. The findings underscore the limitations of linear methodologies in analyzing complex sociotechnical systems and provide a broader understanding of the event, emphasizing the importance of advanced approaches to address the variability and interconnectedness of tightly coupled high-risk process plants.

Josue Franca, Erik Hollnagel

Open Access

Article

Conference Proceedings

Correlation between Headquarter Placement and Mirroring Collected Intel to Gain Knowledge on an Adversaries Headquarter Location based on Gender: An ISR Assessment

Intelligence, Surveillance, and Reconnaissance (ISR) analysts are provided with a tremendous amount of information that needs to be accurately processed, exploited, and disseminated (PED) in order to provide our warfighters an advantage on the battlefield. Analysts can obtain valuable information about enemy territory by using ISR sources such as imagery intelligence (IMINT), signals intelligence (SIGINT), and human intelligence (HUMINT) (Boury-Brisset, Kolodny, and Pham, 2016). ISR analysts can have varying strategies when capturing intelligence and often gain experience and expertise on the job rather than learning the necessary skills in a training environment. To address this problem, ISR subject matter experts (SMEs) developed Intrage, an ISR tabletop board game that trains Air Force intelligence personnel in skills relevant to ISR operations. By investigating different aspects of gameplay, we can better understand the complex decision-making process that warfighters face day-to-day and enhance future training and technology development. For example, a previous study on Intrage investigated whether military experience influenced headquarter placement. Evidence was found that individuals with military experience centralized their headquarter location compared to individuals with no military experience (Nelson et al., 2024). With this knowledge, we can modify and improve training to include a less predictable outcome. In addition to this finding, demographic characteristics such as gender could extend this discovery and increase its impact. Therefore, the purpose of this study was to determine whether individuals mirrored their headquarter placement with the adversary headquarter placement with respect to gender. Method: Before collecting data, the study protocol was approved by the U.S. Air Force Research Laboratory (AFRL) Institutional Review Board (IRB). The goal of this study was to determine whether a relationship exists between mirroring headquarter placement and adversary headquarter placement collections based on gender. Participants were recruited via email from WPAFB and completed the task online through Qualtrics. A sample of 50 participants (25 military and 25 non-military) completed the research study. The sample consisted of 11 female and 39 male participants. Participants were provided with the study objectives and were instructed to place their headquarters within a single quadrant in the northern region (i.e., Region A, B, or C) of the map. Additionally, participants were able to collect four times in the southern region of the map (i.e., Region E, F, or G) in an attempt to identify the adversary headquarters. The four collects contained a honeycomb pattern. Analysis: Statistical analyses were conducted in RStudio using functions from various packages (R Core Team, 2022). An analysis of variance (ANOVA) was conducted to determine whether there was a statistically significant difference between mirroring headquarter placement and adversary headquarter placement collections based on gender. Results: The ANOVA detected a statistically significant difference with respect to mirroring headquarter placement and adversary headquarter placement collections based on gender (p = 0.04). Two of the eleven female participants (18%) conducted collections on the adversary in a mirroring manner with respect to identifying headquarter placement, whereas twenty-one of the thirty-nine male participants (54%) did so. Discussion: The findings of this study provide evidence that the tendency for individuals' headquarter placement to mirror the adversary headquarter placement differs by gender. Additionally, more men mirrored headquarter placement than women; however, this should be investigated further with a greater sample size. As research continues, various findings will assist in the ongoing development and maturity of Intrage. In addition, future research should evaluate whether personality differences influence headquarter placement to mirror adversary headquarter placement.
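
The reported counts are enough to sketch how a one-way ANOVA F statistic arises for this comparison. The sketch assumes, since the abstract does not say so, that mirroring was coded as 1 and non-mirroring as 0 and that gender was the only factor; it is not the authors' R analysis. Under those assumptions the statistic comes out at roughly 4.6 on (1, 48) degrees of freedom, which lies above the 5% critical value of about 4.04.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the groups."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups for x in g
    )

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Counts from the abstract; the 0/1 coding is an assumption of this sketch.
female = [1] * 2 + [0] * 9     # 2 of 11 mirrored
male = [1] * 21 + [0] * 18     # 21 of 39 mirrored

f_stat, df1, df2 = one_way_anova_f([female, male])
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

With a binary outcome and groups this unequal in size, a test of two proportions would be a common alternative to check the same question.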

Jenna Cotter, Justin Nelson, Samuel Johnston, Timothy Heggedahl, Justin Morgan

Open Access

Article

Conference Proceedings

Cognitive and Task Predictors of Naval Submarine School Academic Performance: A Pilot Study

Retaining highly qualified and trained service members (SMs) is critical for maintaining the readiness of the U.S. military to execute its mission. Unplanned losses, related to SM termination before completing their first contract, harm readiness and incur unanticipated expenses. Improved prediction of an SM’s academic performance during initial skills training could improve operational outcomes by reducing SM separations related to poor grades. Cognitive assessments that evaluate skills specific to military occupational specialties may help predict training performance, yield opportunities for customized intervention, or guide the selection of SMs to jobs that match their cognitive skills and abilities. We compared three machine learning algorithms (linear discriminant analysis [LDA], K-Nearest Neighbors [KNN], and Support Vector Machine [SVM]), which classified the initial skills training scores of 22 SMs as low (score cut-off < 75%) or high (score cut-off > 85%) on five separate exams administered during military ascension training, using performance on a ten-task cognitive assessment battery. The battery measured neurocognitive domains of attention, visual learning, working memory, abstraction, and vigilance. The cut-off scores characterized the lower and upper performance range. The resulting models exhibited modest predictive capabilities in classifying academic exam performance, with recall and precision performance in the 50th and 60th percentile. Only the KNN and SVM models exhibited better-than-chance classification performance (p < .001). Separately, correlational analyses found that performance on a simulated sonar task accounted for 31% of the variance in academic performance. The findings of this study imply that future research should add these promising cognitive measures to aid in screening and help more students achieve academic success.
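
A minimal sketch of one of the three classifier families named above, K-Nearest Neighbors, evaluated with leave-one-out cross-validation and scored by precision and recall. Everything specific here is hypothetical: the two features, the eight data points, the choice of k = 3, and the use of leave-one-out are assumptions for illustration, not the study's data or validation scheme.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority label among the k training points closest to the query."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def leave_one_out(data, k=3):
    """Predict each point from all the others; return (true, predicted) pairs."""
    pairs = []
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        pairs.append((label, knn_predict(train, features, k)))
    return pairs


# Hypothetical normalized scores on two cognitive tasks, with exam outcome.
data = [
    ((0.20, 0.30), "low"), ((0.30, 0.20), "low"),
    ((0.25, 0.35), "low"), ((0.60, 0.60), "low"),
    ((0.80, 0.70), "high"), ((0.70, 0.80), "high"),
    ((0.75, 0.90), "high"), ((0.35, 0.30), "high"),
]

pairs = leave_one_out(data, k=3)
accuracy = sum(t == p for t, p in pairs) / len(pairs)

# Precision and recall for detecting the "low" class.
tp = sum(t == "low" and p == "low" for t, p in pairs)
fp = sum(t == "high" and p == "low" for t, p in pairs)
fn = sum(t == "low" and p == "high" for t, p in pairs)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

With a sample as small as the 22 SMs in the study, a single misclassified case moves precision and recall by several points, so estimates of this kind carry wide uncertainty.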

Sylvia Guillory, Anna Jane Brown, Dawn Debrodt, Chad Peltier, Jeffrey Bolkhovsky

Open Access

Article

Conference Proceedings

The Impact of Cultural Background on Perception and Understanding in Learning: A Neuroscientific and Psychological Perspective

This paper explores the impact of cultural background on an individual's perception and understanding in the context of learning and training, through a neuroscientific and psychological lens. The aim is to synthesize key insights from interdisciplinary research, focusing on the connections between neuroscience, psychology, education, and culture. By reviewing a wide range of peer-reviewed studies, the paper examines how culture influences cognition, emotion, and morality. It also highlights the persistence of neuromyths among educators, particularly in Spanish-speaking countries and Latin America, and discusses their implications for pedagogical practices (Lithander et al., 2024; Gleichgerrcht et al., 2015). The paper examines neurobiological perspectives on cognition, emotion, and moral reasoning (Cohen, 2005; Shenhav et al., 2017), offering a deeper understanding of how cultural contexts shape emotional and moral behavior. Additionally, the review explores neuroscientific research on the neural mechanisms underlying social learning, memory, and hierarchy-related interactions, contributing to the understanding of the social dimensions of neuroscience (Pan et al., 2022). The role of creativity, brain plasticity, and physical activity in aging populations is also examined (Frith et al., 2022). Cognitive mechanisms, including rhythm perception (Grahn, 2012), intertemporal decision-making (Shenhav et al., 2017), and language development (Sussman et al., 2023), are analyzed from a neuroscientific perspective, providing insight into how the brain processes complex cognitive tasks. Furthermore, the paper discusses emerging research linking neuroscience with cultural anthropology (Sarto-Jackson et al., 2017), emphasizing the bidirectional influences between biology, environment, and behavior. 
By exploring how neuroscience informs transcultural psychiatry, social learning, and moral decision-making, the paper highlights the feedback loop between culture, behavior, and mental health (Choudhary & Kirmayer, 2009). In conclusion, this review emphasizes the need for interdisciplinary approaches to better understand the brain, cognition, and behavior, and calls for further integration of neuroscientific, psychological, and cultural perspectives to advance our knowledge of human development, learning, and societal functioning.

Abner Flores, Alexander Paselk, Teresa Irish

Open Access

Article

Conference Proceedings

Does Military Experience Influence Intel Collection Efficacy when Providing Chatter Locations on a Geographical Map

Future military planning relies heavily on the information collected from Intelligence, Surveillance, and Reconnaissance (ISR) operations to support data-driven decision-making. In particular, ISR collections that utilize imagery intelligence (IMINT) can detect, track, and target our adversaries' ground movement behaviors and headquarter locations in near-real time. However, understanding when and why IMINT collections should be conducted is a challenging problem that intel analysts face. To address this issue, the 711th Human Performance Wing at Wright-Patterson Air Force Base developed Intrage. Intrage is a strategic decision-making game with the premise of accelerating the understanding of ISR operations. Methods: The study consisted of two groups, 25 military participants and 25 non-military participants from WPAFB. Participants were provided with an overview of Intrage and asked to complete two phases of the game. In Phase I, participants were provided the Intrage map with chatter locations and asked to conduct four intel collections. Following Phase I, participants were informed that their collections were inconclusive. In Phase II, participants were provided the Intrage map with the same chatter locations and asked to conduct two new intel collections. The objective was to determine whether military and non-military participants differ in intel collection efficacy when provided chatter locations on the fictional map of Intrage. Results: An analysis of variance was performed on the number of collections conducted when the collection encompassed four or fewer chatter locations, five to seven chatter locations, and eight or more chatter locations. No statistically significant difference was detected between groups when conducted collections consisted of four or fewer chatter locations. However, there was a statistically significant difference between groups when conducted collections consisted of five to seven chatter locations (p=0.02). Military participants conducted significantly fewer intel collections compared to non-military participants. In addition, there was a statistically significant difference between groups when conducted collections consisted of eight or more chatter locations (p=0.03). Military participants conducted significantly more intel collections compared to non-military participants. Moreover, in Phase II there was no significant difference between groups with respect to conducted collections and provided chatter locations. Conclusion: The findings provide evidence that military experience does influence intel collection efficacy when provided chatter locations on a geographical map. Nevertheless, as both military and non-military participants engaged in additional phases of Intrage, a learning effect was observed, resulting in similar performance metrics.

Justin Nelson, Justin Morgan, Timothy Heggedahl, Samuel Johnston, Jenna Cotter

Open Access

Article

Conference Proceedings

Similar known and later discovered wildland fire human, psychological, and fire weather causal relationships saved lives on two separate wildfires 23 years apart

This paper examines the parallel human factors, psychological elements, and weather conditions that influenced survival outcomes in two fatal Arizona wildfires: the June 1990 Dude Fire and the June 2013 Yarnell Hill Fire. The analysis reveals how lessons learned from the earlier fire directly contributed to life-saving decisions 23 years later, challenging the notion that historical lessons go unheeded. Through examination of deep-seated systemic drivers, the paper explores how similar weather patterns, fire behavior, and human factors resulted in multiple fatalities in both incidents, while also highlighting how proper application of learned experience saved lives. The research questions whether current wildland fire management adequately promotes and ensures strict adherence to established Rules of Engagement and principles of entrapment avoidance. This analysis provides valuable insights for improving future training, site visits, and staff rides while acknowledging the impossibility of preventing all fatalities despite best practices. These findings strongly suggest that while complete prevention of wildland fire fatalities is impossible, properly integrating human factors training with accurate and truthful practical knowledge can significantly reduce them.

Fred J Schoeffler

Open Access

Article

Conference Proceedings

Engineering a Cognitive Load Assessment System Through Multimodal Sensor Fusion

Accurate quantification of cognitive load is essential for optimizing human-computer interaction systems. Methods: This study recruited 159 healthy participants and employed a hierarchical n-back task paradigm (0-back to 3-back) to induce graded levels of cognitive load. Multimodal physiological signals, including electroencephalogram (EEG), heart rate variability (HRV), and electrodermal activity (EDA), were recorded simultaneously to construct a cognitive load dataset encompassing three modalities. Temporal and frequency domain features were extracted from EEG signals, temporal and frequency domain parameters from HRV signals, and phase-amplitude integration and SCR frequency from EDA signals. A Kruskal-Wallis test was used to analyze significant differences in physiological indices across different cognitive load levels. Finally, a multiple linear regression model was employed to quantify the contribution of each modality's features to cognitive load classification. Results: (1) A significant suppression of alpha band power in the eyes-open resting state validated the effectiveness of the EEG signal acquisition system; (2) With increasing task difficulty, the alpha and theta power of EEG and the LF value of HRV showed significant monotonic increasing trends (p < 0.05), confirming the sensitivity of multimodal physiological signals to changes in cognitive load; (3) The regression model revealed that EEG features had the highest contribution (β = 0.57). Conclusion: This study proposes a framework for cognitive load quantification based on multimodal feature fusion, providing a theoretical and empirical foundation for the development of high-precision cognitive load assessment models.
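
The Kruskal-Wallis comparison across load levels can be sketched as follows. The feature values are hypothetical (three observations per n-back level, standing in for something like EEG theta power), and the function omits the tie correction that a full implementation would apply; it is an illustration of the rank-based statistic, not the study's pipeline.

```python
def average_ranks(values):
    """1-based ranks of the values, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical feature values per load level (0-back to 3-back); not study data.
levels = [
    [4.1, 4.6, 5.0],   # 0-back
    [4.4, 5.2, 5.5],   # 1-back
    [5.3, 5.9, 6.1],   # 2-back
    [5.8, 6.4, 6.8],   # 3-back
]

h_stat = kruskal_wallis_h(levels)
print(f"H = {h_stat:.3f} with {len(levels) - 1} degrees of freedom")
```

H is compared against a chi-square distribution with (number of groups - 1) degrees of freedom; for three degrees of freedom the 5% critical value is 7.815, although with only three observations per group the chi-square approximation is rough.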

Qichao Zhao, Jianming Yang, Qingju Wang, Ruiyu Zhu, Lili Guo, Qian Zhou, Ping Wu, Lin Shen

Open Access

Article

Conference Proceedings

Important Human Actions for Advanced Reactors: Implications for Risk Analysis

Advances in technology have led to the development of new reactor designs and significant changes in the human-system interfaces used by operators to monitor and control commercial nuclear power plants (NPPs). These advances have led the way to novel concepts of operations (ConOps) that are very different from those used in operations for traditional NPPs, i.e., large light water reactors. Accordingly, in collaboration with Idaho National Laboratory (INL), the U.S. Nuclear Regulatory Commission (NRC) started multiple projects to support the NRC’s guidance for human factors engineering (HFE) reviews of advanced reactor applications. As a part of these efforts, INL staff members are currently working on a project to risk-inform the scope of HFE reviews when there are changes to important human actions (IHAs). Specifically, the purpose of the project is to capture specific regulatory aspects of these novel ConOps proposals and risk-inform HFE reviews related to IHAs.

Jooyoung Park, Torrey Mortenson, Rachael Hill, Casey Kovesdi, Ronald Boring

Open Access

Article

Conference Proceedings

Rancor-HUNTER - data collection and virtual operator modeling tool

Historically, human reliability analysis (HRA) methods have required analysts to develop static models of human error within a predetermined human failure event through expert estimation. Objective quantitative models of human performance in the form of a virtual operator offer an avenue to perform dynamic HRA through Monte Carlo simulation. Idaho National Laboratory developed the Human Unimodel for Nuclear Technology to Enhance Reliability (HUNTER) as a dynamic HRA framework and has demonstrated success in modeling existing scenarios. More data is needed to create generalizable virtual human models to explore undefined human failure event space. The Rancor Microworld Simulator is a simplified nuclear process control simulator that students can easily be trained to use and that can be modified to support the development of concepts of operation for advanced reactors. Rancor-HUNTER fills the data collection niche by integrating the Rancor Microworld Simulator with the HUNTER software. Through this integration, Rancor automates human performance data collection with a digital procedure system compatible with HUNTER’s procedure database. This digital procedure system provides implicit task-level goals associated with each step, which allows HRA-coded data to be collected automatically. Therefore, high-resolution sequences of operator behaviors can now be automatically recorded with error rates and time distributions, which collectively represent a virtual human. A use case on developing a concept of operations for thermal power dispatch highlights the benefits of Rancor-HUNTER in augmenting human factors evaluation through automated human performance data collection, which can also be used to inform HRA models.
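
How per-step time distributions and error rates combine into a virtual operator under Monte Carlo simulation can be sketched in a few lines. This is not HUNTER or Rancor code: the three-step procedure, the lognormal timing model, the human error probabilities, and the rule that any step error fails the run are all hypothetical assumptions chosen for illustration.

```python
import math
import random

# Hypothetical procedure: (step name, median time in s, log-space sigma, HEP).
PROCEDURE = [
    ("check indicator", 5.0, 0.4, 0.01),
    ("adjust control", 10.0, 0.4, 0.05),
    ("verify response", 3.0, 0.4, 0.02),
]


def run_once(rng):
    """Simulate one pass through the procedure; return (total time, failed)."""
    total_time = 0.0
    failed = False
    for _name, median, sigma, hep in PROCEDURE:
        # Step duration drawn from a lognormal with the given median.
        total_time += rng.lognormvariate(math.log(median), sigma)
        # A step error occurs with probability equal to its HEP.
        if rng.random() < hep:
            failed = True
    return total_time, failed


def simulate(n_runs, seed=0):
    """Return (mean completion time, fraction of runs with at least one error)."""
    rng = random.Random(seed)
    runs = [run_once(rng) for _ in range(n_runs)]
    mean_time = sum(t for t, _ in runs) / n_runs
    failure_rate = sum(1 for _, failed in runs if failed) / n_runs
    return mean_time, failure_rate


mean_time, failure_rate = simulate(20000)
print(f"mean completion time: {mean_time:.1f} s")
print(f"estimated failure probability: {failure_rate:.3f}")
```

In this simplified model the steps are independent, so the estimated failure probability should approach 1 - (1 - 0.01)(1 - 0.05)(1 - 0.02); a dynamic HRA model would typically let timing and error rates depend on plant state and earlier steps.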

Thomas Ulrich, Ronald Boring, Roger Lew, Jisuk Kim, Dylan Jurski, Olugbenga Gideon, Kelly Dickerson

Open Access

Article

Conference Proceedings

Evaluation model for human error response training

Currently, many companies provide human error response training. The objective of this training is for employees to gain knowledge about human error, participate in safety activities on their own, and make use of this knowledge in actual workplaces. However, many companies do not properly measure the effectiveness of this education. Therefore, this study examined the creation of a model that can evaluate the effectiveness of human error response education. We created a new questionnaire tool by referring to engagement surveys currently conducted in various companies and questionnaires that measure individual personality characteristics. We then administered the questionnaire at an IT company that conducts human error response education and attempts to measure its effectiveness in terms of exercise scores. Multiple regression analysis was conducted based on the results of the questionnaire and the exercise scores, and a model was created to enable measurement of the effectiveness of the education. In this study, we were able to find clues for creating an evaluation index to measure the effectiveness of human error response education. However, the measurement of effectiveness was unclear in some cases, and issues remained regarding the accuracy of the measurement. Once this model is established, it is expected that companies that have not yet been able to measure the effectiveness of their human error response training will be able to do so by using a simple questionnaire. Based on the results of this study, we plan to further expand the data and create an evaluation index that will enable more accurate measurement of the effectiveness of education.

Yuka Banno, Yusaku Okada, Yu Shibuya

Open Access

Article

Conference Proceedings