An Adaptive XR Platform for Multimodal Public Speaking Training and Performance Assessment

Lia Cardoso; Hugo Correia; Bernardo Marques; Paulo Dias; Samuel Silva; Beatriz Sousa Santos

doi:10.54941/ahfe1007283

AHFE International

Open Access

Article

Conference Proceedings

Abstract

Public speaking is a fundamental competence across academic, professional, and social contexts, yet an estimated 77% of the population experiences public speaking anxiety (PSA), manifesting through psychological symptoms – such as fear of judgment and avoidance behaviours – and physiological responses including increased heart rate and electrodermal activity. Traditional training approaches face limitations in scalability, reproducibility, and objective performance assessment, often relying on subjective rubrics and offering limited opportunities for repeated practice in realistic settings. Extended Reality (XR) technologies have emerged as a promising alternative, as immersive virtual environments can reliably elicit anxiety responses comparable to real audiences while providing safe, controllable, and repeatable practice scenarios. However, most existing XR platforms focus primarily on exposure, without systematically capturing or leveraging the multimodal signals that reflect a speaker's internal state and communicative behaviour. Incorporating behavioural data such as gaze patterns, gesture dynamics, and spatial movement, alongside psychophysiological signals and speech acoustics, is essential to move beyond subjective evaluation toward objective characterization of user states, enabling personalized feedback and real-time training adaptation. This paper presents the design, implementation, and preliminary assessment of an adaptive XR platform for public speaking training that integrates these multimodal data streams within a five-layer architecture (Environment Generation, Data Gathering, Analysis, Feedback, and Visualization), instantiated through a modular microservices approach. A preliminary usability assessment with ten participants demonstrates the platform's learnability and task completion effectiveness, advancing beyond exposure-only systems toward intelligent, data-driven, and personalized skill development.

Keywords: Extended Reality, Public Speaking Anxiety, Multimodal Data Analysis, Adaptive Feedback, Virtual Reality Training

Published in: Human Interaction and Emerging Technologies (IHIET-FS 2026): Future Systems and Design Applications