Can Machine Learning Replace Expert Evaluation? An AI-Powered Platform for XR, Tangible, and Haptic User Interfaces Automated Testing
Open Access
Article
Conference Proceedings
Authors: Mohammad Mustafa, Ahmed Seffah
Abstract: The rapid proliferation of novel interaction paradigms—including Extended Reality (XR), tangible and natural user interfaces, and haptic systems—together with evolving user experience attributes such as emotional, social, and collaborative interactions, has fundamentally transformed human–computer interaction. These interaction modalities, increasingly integrated with artificial intelligence, expose the limitations of traditional usability and UX evaluation methods developed for conventional graphical user interfaces. Human–AI interaction introduces distinct design challenges related to transparency, explainability, user trust, and shared control, requiring evaluation approaches that shift from technology-centered assessment toward genuinely human-centered analysis.

This research proposes a metric-based, AI-powered evaluation platform designed to systematically test and benchmark contemporary user interfaces and interaction modalities. The platform employs enhanced event logging and behavioral coding mechanisms to capture comprehensive multimodal interaction data, including task performance metrics, behavioral patterns, temporal action sequences, gaze trajectories, gesture dynamics, and physiological responses. Machine learning models analyze this multidimensional data to predict standard usability attributes—efficiency, effectiveness, and satisfaction as defined by ISO 9241-11—alongside emerging UX dimensions such as cognitive load, emotional engagement, and social presence.

The platform architecture integrates high-performance XR-capable computing hardware with professional-grade graphics processing, multi-monitor visualization, eye-tracking sensors, haptic feedback devices, motion-capture systems, and biometric sensing infrastructure. Data collection is supported by curated training datasets derived from prior usability studies, expert heuristic evaluations, standardized task baselines, and cross-cultural UX assessment data.
A custom-engineered, shock-resistant, transportable case with integrated power management and modular connectivity enables rapid deployment across laboratory, field, and organizational contexts, addressing ecological validity limitations inherent in traditional lab-based usability testing.

Automated usability evaluation is driven by a hybrid AI framework combining three complementary models. Random Forest ensemble classifiers provide robust prediction across heterogeneous interaction data while offering interpretable feature importance measures to identify key usability determinants. Long Short-Term Memory (LSTM) networks model temporally ordered interaction sequences, enabling detection of behavioral signatures associated with confusion, flow states, hesitation, and error recovery. Support Vector Machines with radial basis function kernels capture complex non-linear relationships in high-dimensional usability data and perform effectively under limited expert-labeled training conditions.

The central hypothesis posits that AI-driven automated usability evaluation can achieve accuracy and reliability comparable to expert-conducted heuristic evaluations and cognitive walkthroughs, validated through controlled A/B experiments comparing automated predictions with expert assessments across diverse interface types and user populations.

By democratizing access to rigorous UX evaluation through automation, multimodal sensing, and portable deployment, the proposed platform aims to accelerate iterative design cycles for emerging interactive and human–AI systems while maintaining methodological rigor comparable to expert evaluation.
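The hybrid framework described above could be sketched along the following lines. This is a minimal, hypothetical illustration in scikit-learn, not the authors' implementation: the feature names, synthetic data, labeling rule, and soft-vote ensemble are assumptions made for the example, and the LSTM component is omitted for brevity.

```python
# Hypothetical sketch of a two-model usability-issue classifier in the spirit
# of the paper's hybrid framework (Random Forest + RBF-kernel SVM).
# All features, data, and labels here are synthetic and illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic multimodal session features (one row per test session), standing
# in for e.g. task time, error count, mean fixation duration, gesture length.
X = rng.normal(size=(200, 4))
# Synthetic binary "usability issue present" labels with a simple planted rule
# so the models have signal to learn.
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
svm = SVC(kernel="rbf", probability=True, random_state=0).fit(X, y)

# Soft-vote ensemble: average the two models' class-1 probabilities.
proba = (rf.predict_proba(X)[:, 1] + svm.predict_proba(X)[:, 1]) / 2
pred = (proba >= 0.5).astype(int)

# The forest exposes interpretable feature importances, as the abstract notes
# for identifying key usability determinants.
importances = rf.feature_importances_
print("most informative feature index:", importances.argmax())
```

In a real deployment the synthetic `X` would be replaced by logged interaction features, and the LSTM branch would consume the temporally ordered event sequences separately before its predictions are fused with the two classifiers shown here.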
Keywords: AI as user testing platform, XR, Haptics, Wearable, Human-AI interaction, Usability evaluation and user tests
DOI: 10.54941/ahfe1007182
AHFE Open Access