Context-aware LLMs for healthcare requirements engineering

Valeria Resendez; Andrew Hornback; Harinishree Sathu; J. Ben Tamo; Yining Yuan; May Wang; Nese Baz; Funda Yildirim; Russell Chan; Maria Fernanda Cabrera; Simone Borsci

doi:10.54941/ahfe1007500

AHFE International

Open Access

Conference Proceedings

Abstract

Requirements engineering (RE) is a collaborative, context-dependent, and resource-intensive process, particularly in highly regulated domains such as healthcare. Recent advances in large language models (LLMs) have raised questions about their potential to support early-stage requirements elicitation. However, integrating LLMs introduces an additional mediation layer between contextual knowledge and articulated system requirements. Drawing on Norman’s concepts of the gulf of execution and the gulf of evaluation, this study examines under what contextual conditions LLMs approximate human expert–elicited requirements. We conducted a 3 × 3 × 3 simulation study comparing three LLMs (GPT-5.2, Claude 4.5 Sonnet, and Gemini 3 Pro), three knowledge conditions (none, proposal-based, and literature-based), and three expert-role prompts (none, pediatrician, and geneticist). Each of the 27 combinations was repeated 50 times, producing a total of 1,350 outputs. Results show significant variation in requirement quantity across models and knowledge conditions, but consistently low semantic alignment with human expert requirements. Retrieval-augmented knowledge reduced output volume without improving alignment with human expert requirements. Role prompting produced marginal effects. All models demonstrated high within-condition reliability, indicating stable but moderately aligned outputs. These findings suggest that LLMs may function more as scaffolding tools for drafting requirements than as expert emulators. While LLMs do not operationalize contextual knowledge into expert-level requirements, they may support early RE processes.
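The factorial design described in the abstract can be enumerated in a few lines. The following is a minimal sketch, not the authors' code: the factor levels are taken from the abstract, while the function name `build_design` and the dictionary keys are hypothetical names introduced here for illustration. It only confirms the arithmetic of the design (27 cells × 50 repetitions = 1,350 runs) and does not call any model.

```python
from itertools import product

# Factor levels as listed in the abstract.
MODELS = ["GPT-5.2", "Claude 4.5 Sonnet", "Gemini 3 Pro"]
KNOWLEDGE = ["none", "proposal-based", "literature-based"]
ROLES = ["none", "pediatrician", "geneticist"]
REPETITIONS = 50


def build_design(repetitions=REPETITIONS):
    """Return one record per planned run (hypothetical helper, for illustration)."""
    return [
        {"model": m, "knowledge": k, "role": r, "run": i}
        for m, k, r in product(MODELS, KNOWLEDGE, ROLES)
        for i in range(1, repetitions + 1)
    ]


design = build_design()

# Distinct (model, knowledge, role) cells in the 3 x 3 x 3 design.
cells = {(d["model"], d["knowledge"], d["role"]) for d in design}

print(len(cells), len(design))  # 27 1350
```

Each record corresponds to one generated output, so a table of results keyed on these three factors would have 50 rows per cell.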

Keywords: Requirement Elicitation, Human-AI Collaboration, Retrieval-Augmented Generation
