Semantic Structure and Importance Extraction from Sequential Conversational Data via Dimensional Reduction
Abstract
The analysis of spoken data from panel discussions, policy dialogues, and educational meetings has gained increasing importance in both academic research and professional practice. However, conventional approaches to Japanese conversation analysis have relied heavily on keyword matching or surface‑level text similarity, making it difficult to capture deeper semantic relationships, topic transitions, and latent discourse structures. In addition, Japanese natural language processing pipelines often rely on environment-sensitive morphological analyzers, which hinder reproducibility and large-scale processing. To address these limitations, this study proposes a robust and semantically enriched framework for conversation understanding based on a composite distributed representation. The proposed method integrates three layers of linguistic information: (1) contextual sentence embeddings generated by a multilingual transformer model, (2) word embeddings obtained from fastText, and (3) co‑occurrence vectors that capture lexical association patterns within the conversation. Sudachi is employed for Japanese text preprocessing to ensure stable and reproducible morphological analysis. By combining these components into a unified composite vector, the framework simultaneously represents global sentence‑level meaning and local lexical relationships. Using this representation, a directed graph is constructed that incorporates both temporal adjacency and semantic proximity between utterances, enabling the visualization of key conversational connections. To evaluate the effectiveness of the composite representation, dimensionality‑reduction algorithms are applied to examine whether semantically similar utterances naturally form coherent clusters in low‑dimensional space. The resulting clusters are assessed for consistency and interpretability, demonstrating that the proposed representation successfully captures meaningful conversational structure.
Keywords: Conversational Semantics, Distributed Representations, Dimensionality Reduction (MAPE), Nonlinear Embedding, Semantic Clustering
DOI: 10.54941/ahfe1008009
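The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not state how the three components are combined or how semantic proximity is thresholded, so everything below is an assumption made for illustration: the normalize-then-concatenate scheme, the `weights` parameter, the cosine-similarity cutoff `sim_threshold`, and the function names `composite_vectors` and `build_directed_graph` are all hypothetical. The inputs are assumed to be precomputed per-utterance arrays (sentence embeddings, averaged fastText word vectors, and co-occurrence vectors), one row per utterance in temporal order.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so no component dominates by magnitude."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def composite_vectors(sentence_emb, word_emb, cooc_emb, weights=(1.0, 1.0, 1.0)):
    """Concatenate three per-utterance representations into one vector.

    Assumption: each component is L2-normalized, optionally weighted,
    and then concatenated. The source does not specify this scheme.
    """
    parts = [
        w * l2_normalize(np.asarray(p, dtype=float))
        for w, p in zip(weights, (sentence_emb, word_emb, cooc_emb))
    ]
    return np.hstack(parts)


def build_directed_graph(vectors, sim_threshold=0.8):
    """Return directed edges (i, j, kind), always from earlier to later utterance.

    'temporal' edges link each utterance to the one that follows it.
    'semantic' edges link non-adjacent pairs whose cosine similarity
    reaches sim_threshold (a hypothetical cutoff).
    """
    unit = l2_normalize(np.asarray(vectors, dtype=float))
    sim = unit @ unit.T  # pairwise cosine similarity
    n = len(unit)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if j == i + 1:
                edges.append((i, j, "temporal"))
            elif sim[i, j] >= sim_threshold:
                edges.append((i, j, "semantic"))
    return edges


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder arrays standing in for real embeddings of 5 utterances.
    sent = rng.normal(size=(5, 8))
    word = rng.normal(size=(5, 4))
    cooc = rng.normal(size=(5, 6))
    comp = composite_vectors(sent, word, cooc)
    for edge in build_directed_graph(comp, sim_threshold=0.5):
        print(edge)
```

The resulting composite matrix could then be passed to any dimensionality-reduction routine to inspect whether semantically similar utterances cluster together; which algorithm the authors use is not specified in the abstract.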


AHFE Open Access