AI-Driven Music Generation and Emotion Conversion

AHFE International

Accelerating Open Access Science in Human Factors Engineering and Human-Centered Computing

Open Access

Article

Conference Proceedings

Authors: Xinwei Gao, Deng Kai Chen, Zhiming Gou, Lin Ma, Ruisi Liu, Di Zhao, Jaap Ham

Abstract: With the integration of Generative Adversarial Networks (GANs), Artificial Intelligence Generated Content (AIGC) overcomes algorithmic limitations, significantly enhancing generation quality and diversifying generation types. This advancement profoundly impacts AI music generation, fostering emotionally warm compositions capable of forging empathetic connections with audiences. AI interprets input prompts to generate music imbued with semantic emotions. This study aims to assess the accuracy of AI music generation in conveying semantic emotions and its impact on empathetic audience connections. Ninety audio clips were generated across three music generation tools (Google MusicLM, Stable Audio, and MusicGen), using four emotion prompts (Energetic, Distressed, Sluggish, and Peaceful) based on the Dimensional Emotion Model and two generation forms (text-to-music and music-to-music). An emotional judgment experiment involving 26 subjects was conducted, comparing their valence and arousal judgments of the audio clips. A multi-way analysis of variance showed that the AI music generation software had a significant main effect on the accuracy of emotion conversion. Owing to the diversity of its generation forms, MusicGen had a lower conversion accuracy than Google MusicLM and Stable Audio. There was a significant interaction effect of generation form and emotion prompt on conversion accuracy. In the text-to-music form, the differences in accuracy between emotion prompts were statistically significant, except for the difference between Distressed and Peaceful. Compared with the text-to-music form, the music-to-music form showed a statistically significant emotion conversion ability for low arousal. The diversity of AI software input elements (i.e., text or music) may affect the effectiveness of emotional expression in music generation. In the text-to-music form, the ability of different software to convey different emotions according to different prompts was inconsistent. This study advances computer music co-composition and improvisation abilities, facilitating AI music applications in fields such as medical rehabilitation, education, psychological healing, and virtual reality experiences.
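
The abstract does not spell out how conversion accuracy was scored. One plausible reading, sketched below under stated assumptions, is that a judgment counts as accurate when the listener's valence and arousal ratings fall in the quadrant of the Dimensional Emotion Model that the prompt targets. The quadrant assignment for the four prompts, the zero-centered rating scale, the field names, and the sample ratings are all illustrative assumptions, not details taken from the paper.

```python
from collections import defaultdict

# ASSUMPTION: quadrant layout (valence sign, arousal sign) for each prompt.
# The abstract names the prompts but does not state this mapping.
TARGET_QUADRANT = {
    "Energetic": (1, 1),     # positive valence, high arousal
    "Distressed": (-1, 1),   # negative valence, high arousal
    "Sluggish": (-1, -1),    # negative valence, low arousal
    "Peaceful": (1, -1),     # positive valence, low arousal
}


def sign(x):
    """Return 1, -1, or 0 depending on the sign of x."""
    return 1 if x > 0 else -1 if x < 0 else 0


def is_match(prompt, valence, arousal):
    """True when a rating lands in the quadrant targeted by the prompt.

    ASSUMPTION: ratings are centered so that 0 is the neutral midpoint.
    """
    return (sign(valence), sign(arousal)) == TARGET_QUADRANT[prompt]


def accuracy_by_condition(ratings):
    """Share of matching judgments per (software, form, prompt) condition."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for r in ratings:
        key = (r["software"], r["form"], r["prompt"])
        totals[key] += 1
        hits[key] += is_match(r["prompt"], r["valence"], r["arousal"])
    return {key: hits[key] / totals[key] for key in totals}


# Hypothetical ratings, made up for illustration only (not study data).
ratings = [
    {"software": "MusicGen", "form": "text-to-music",
     "prompt": "Energetic", "valence": 0.6, "arousal": 0.8},
    {"software": "MusicGen", "form": "text-to-music",
     "prompt": "Energetic", "valence": -0.2, "arousal": 0.5},
    {"software": "Stable Audio", "form": "music-to-music",
     "prompt": "Peaceful", "valence": 0.4, "arousal": -0.7},
]
result = accuracy_by_condition(ratings)
```

Per-condition accuracies of this kind could then serve as the dependent variable in a multi-way analysis of variance with software, generation form, and emotion prompt as factors, which is the type of analysis the abstract reports.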

Keywords: AI-Music, Emotion Conversion, Emotion model

DOI: 10.54941/ahfe1004679
