Application Potential of Large Language Models as Product User Experience Evaluation Tools

Xiaoyue Mao; Jun Zhang; Yijing Yang; Kaiyang Tang

doi:10.54941/ahfe1008057

AHFE International

Open Access

Article

Conference Proceedings

Abstract

This study explores the potential of general-purpose Large Language Models (LLMs) to generate User Experience (UX) evaluations during the product verification phase, addressing problems in traditional UX research methods such as difficult user recruitment and long scheduling cycles. Using the User Experience Honeycomb model as the theoretical framework, the research adopts the off-the-shelf GPT-4o as the experimental model. Combining optimized prompt engineering with multimodal inputs, it compares LLM-generated and human user evaluations in terms of evaluation coverage rate, language style, and problem perspectives. The collected evaluation data are coded and analyzed with a combined deductive-inductive approach. The results indicate that the thematic overlap rate between LLM-generated and human user evaluations reaches 81.05%, which suggests that LLMs have significant potential to simulate human users in producing experience evaluations. Textual analysis reveals that LLM-generated UX evaluations are strong in systematic analysis, professional expression, and proactive risk identification, but limited in capturing nuanced emotions and dynamic interaction details. Additionally, UX evaluation efficiency improves by 88.0% relative to human users. The study recommends a Hybrid Intelligence evaluation model that leverages the systematic analysis capabilities of LLMs while incorporating human users' acute perception of emotions and immediate experiences, in order to enhance both the efficiency and the comprehensiveness of UX research.
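The abstract does not state how the thematic overlap rate is computed. As a minimal sketch only, assuming the rate is the share of human-coded themes that also appear among the LLM-coded themes, it could be calculated as below. The function name, the formula, and the theme labels are illustrative assumptions, not the authors' method or data.

```python
def thematic_overlap_rate(human_themes, llm_themes):
    """Share of human-coded themes that also occur in the LLM-coded themes.

    Assumed definition (not taken from the paper):
        |human ∩ llm| / |human|
    """
    human = set(human_themes)
    llm = set(llm_themes)
    if not human:
        raise ValueError("human_themes must not be empty")
    return len(human & llm) / len(human)


# Hypothetical theme codes, loosely named after UX Honeycomb facets.
human = {"findable", "usable", "credible", "desirable"}
llm = {"findable", "usable", "credible", "accessible"}

rate = thematic_overlap_rate(human, llm)
print(f"{rate:.2%}")  # 75.00%
```

Under this assumed definition the rate is asymmetric: themes raised only by the LLM (here, "accessible") do not lower it, so a separate count of LLM-only themes would be needed to capture what the abstract calls proactive risk identification.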

Keywords: User Experience, Large Language Models, Product Verification, UX Honeycomb, Hybrid Intelligence
