Dramatizing Everyday Conversations: A Context-Aware BGM Recommendation System Using Generative AI
Abstract
Conventional music recommendation systems often rely on predefined emotional values or direct user interaction, which makes it difficult to incorporate nuanced conversational context. To address this limitation, we propose a system that recommends background music (BGM) for everyday conversations based on contextual analysis with a generative AI model, Gemini. The system takes conversational audio as input, transcribes the spoken dialogue into text, analyzes the content using Gemini, and then identifies similar scenes and their BGMs in a preconstructed dataset of 12 BGMs drawn from the Japanese TV drama “Ichiban Sukina Hana (My Most Favorite Flower)”. By matching real-life conversations with relatable dramatic contexts, the system aims to enhance the immersion and emotional resonance of ordinary dialogues. To evaluate it, we held live conversations on 12 conversation themes and tested whether the expected BGM was recommended from the audio input. In 10 of the 12 trials (83.3%), the expected BGM appeared within the top three ranks. In the trials where it fell outside the top three, the conversations were related to the assigned themes but emphasized more specific sub-contexts, partly diverging from the original intent of the theme, which likely caused other BGMs to be prioritized. In addition, the actual conversational content did not always match what was anticipated, which also contributed to recommendations that differed from the target. These findings suggest that making conversation themes more concrete and reproducible would increase the likelihood that the BGM aligned with each theme is recommended.
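The matching and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how scene similarity is computed, so a simple bag-of-words cosine similarity stands in for it, and the names `rank_bgms`, `top_k_hit_rate`, and the scene descriptions are hypothetical. The transcription and Gemini analysis steps are assumed to have already produced a text summary of the conversation.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_bgms(summary: str, scenes: dict[str, str]) -> list[str]:
    """Return BGM ids ordered from most to least similar to the summary.

    `scenes` maps a BGM id to a text description of the drama scene it
    accompanies (hypothetical data layout, assumed for illustration).
    """
    query = Counter(summary.lower().split())
    scores = {
        bgm: cosine(query, Counter(text.lower().split()))
        for bgm, text in scenes.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def top_k_hit_rate(trials: list[tuple[str, list[str]]], k: int = 3) -> float:
    """Fraction of trials whose expected BGM appears in the top k ranks.

    Each trial is a pair (expected BGM id, ranked list of BGM ids).
    """
    hits = sum(expected in ranking[:k] for expected, ranking in trials)
    return hits / len(trials)


# Placeholder scene descriptions, invented for this sketch only.
scenes = {
    "bgm_a": "friends laughing together at dinner",
    "bgm_b": "quiet farewell at the station",
    "bgm_c": "argument about work and money",
}
ranking = rank_bgms("two friends laughing over dinner", scenes)
```

With this metric, 10 hits in 12 trials gives 10/12 ≈ 0.833, which is the 83.3% top-three rate reported in the abstract.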
Keywords: Generative AI, BGM recommendation, Affective computing, Dialogue analysis, Drama-based dataset
DOI: 10.54941/ahfe1006863
AHFE Open Access