Application Potential of Large Language Models as Product User Experience Evaluation Tools
Abstract
This study explores whether general-purpose Large Language Models (LLMs) can generate User Experience (UX) evaluations during the product verification phase, addressing two problems of traditional UX research methods: difficult user recruitment and long scheduling cycles. The User Experience Honeycomb model serves as the theoretical framework, and the off-the-shelf GPT-4o is the experimental model. Combining optimized prompt engineering with multimodal inputs, the study compares LLM and human-user evaluations in terms of evaluation coverage rate, language style, and problem perspectives. The collected evaluation data are coded and analyzed using a deductive-inductive approach. The results indicate that the thematic overlap rate between LLM-generated and human user evaluations reaches 81.05%, suggesting considerable potential for simulating human users in producing experience evaluations. Textual analysis shows that LLM-generated UX evaluations are strong in systematic analysis, professional expression, and proactive risk identification, but limited in capturing nuanced emotions and dynamic interaction details. In addition, UX evaluation efficiency improves by 88.0% relative to evaluation by human users. The study recommends a Hybrid Intelligence evaluation model that pairs the systematic analysis capabilities of LLMs with human users' acute perception of emotions and immediate experiences, to improve both the efficiency and the comprehensiveness of UX research.
Keywords: User Experience, Large Language Models, Product Verification, UX Honeycomb, Hybrid Intelligence
DOI: 10.54941/ahfe1008057
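The abstract reports a thematic overlap rate but does not state how it is computed. As an illustration only, the sketch below assumes one plausible definition: the share of human-coded themes that also appear among the LLM-coded themes. The function name thematic_overlap_rate, the theme labels, and the toy data are all hypothetical and are not taken from the paper; the labels merely borrow facet names from the UX Honeycomb.

```python
def thematic_overlap_rate(human_themes, llm_themes):
    """Share of human-coded themes that also appear in the LLM-coded themes.

    Assumption: this definition is illustrative; the paper's exact
    formula is not given in the abstract.
    """
    human = set(human_themes)
    if not human:
        raise ValueError("human_themes must not be empty")
    shared = human & set(llm_themes)
    return len(shared) / len(human)


# Hypothetical coded themes, labeled by UX Honeycomb facet.
human = {
    "usable: confusing menu",
    "desirable: dull colors",
    "findable: hidden search",
    "credible: unclear pricing",
}
llm = {
    "usable: confusing menu",
    "findable: hidden search",
    "credible: unclear pricing",
    "accessible: low contrast",
}

rate = thematic_overlap_rate(human, llm)
print(f"{rate:.2%}")  # 3 of 4 human themes are shared -> 75.00%
```

Under this definition, themes raised only by the LLM (here, the accessibility item) do not lower the rate; a symmetric measure such as the Jaccard index would treat them differently.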
AHFE Open Access