User trust in a depression screening app when outcomes are labelled as either AI-generated or doctor-generated: a pilot study.

AHFE International

Open Access

Article

Conference Proceedings

Authors: Yeganeh Shahsavar, Avishek Choudhury

Abstract: This pilot study quantified and compared user trust in a web-based depression screening app whose outcomes were labeled as either AI-generated or doctor-generated. The app calculated a depression score from user input and presented two screening outcomes, one labeled “doctor-generated” and the other “AI-generated.” Participants were then asked to select the outcome they trusted more. Seventeen individuals participated in the study. Although the two outcomes were identical, 11 participants chose the AI-generated outcome (group-AI) and 6 chose the doctor-generated outcome (group-DR). To assess trust and attention, electroencephalogram (EEG) signals were recorded during the task: attention was measured through Alpha activity at Pz, and trust through Beta activity at Fc1 and Fc2. After the intervention, participants’ perceived trust in the outcomes was measured using a survey. Mean normalized power spectral density (PSD) values were calculated and correlated with the survey-based trust scores, and PSDs and trust scores were compared both between and within the AI and DR groups. Across all participants, the mean PSD value for attention (Pz) was 0.116 µV²/Hz, and the values for trust (Fc1 and Fc2) were 0.648 µV²/Hz and 0.646 µV²/Hz, respectively. The mean survey trust score was 3.118 for the AI-based outcome and 3.235 for the doctor-based outcome. A weak to moderate correlation was observed between survey trust scores and PSD values at Fc1 and Fc2. Group-AI exhibited lower Alpha power at Pz (0.108 µV²/Hz) and higher Beta power at Fc1 (0.660 µV²/Hz) and Fc2 (0.659 µV²/Hz) than group-DR, which showed higher Alpha power at Pz (0.131 µV²/Hz) and lower Beta power at Fc1 (0.626 µV²/Hz) and Fc2 (0.621 µV²/Hz).
In conclusion, our findings suggest that although participants may express a marginal preference for doctor-generated outcomes in self-reported trust, their EEG data reveal a more nuanced picture, in which those choosing AI-based outcomes may exhibit higher levels of trust at a cognitive level.
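
As a quick consistency check on the figures reported in the abstract, the overall mean PSD per channel should equal the average of the two group means weighted by group size (11 in group-AI, 6 in group-DR). The sketch below assumes this is how the overall means relate to the group means; the abstract does not state it explicitly, and the names `pooled_mean`, `GROUP_SIZES`, `GROUP_MEANS`, and `REPORTED_OVERALL` are illustrative only.

```python
# Group sizes and group-level mean normalized PSD values (µV²/Hz),
# copied from the abstract. The pooling step itself is an assumption.
GROUP_SIZES = {"AI": 11, "DR": 6}
GROUP_MEANS = {
    "Pz": {"AI": 0.108, "DR": 0.131},
    "Fc1": {"AI": 0.660, "DR": 0.626},
    "Fc2": {"AI": 0.659, "DR": 0.621},
}
# Overall means across all 17 participants, as reported in the abstract.
REPORTED_OVERALL = {"Pz": 0.116, "Fc1": 0.648, "Fc2": 0.646}


def pooled_mean(channel_means, sizes):
    """Return the mean of the group means, weighted by group size."""
    total = sum(sizes.values())
    return sum(channel_means[group] * sizes[group] for group in sizes) / total


for channel, means in GROUP_MEANS.items():
    pooled = pooled_mean(means, GROUP_SIZES)
    reported = REPORTED_OVERALL[channel]
    print(f"{channel}: pooled={pooled:.3f}, reported={reported:.3f}")
```

Under this assumption, the pooled values agree with the reported overall means to three decimal places for all three channels, which suggests the group-level and overall figures are internally consistent.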

Keywords: User trust, artificial intelligence, depression, digital health, human factors

DOI: 10.54941/ahfe1005708
