A Peer-to-Peer Corpus for Conversational Agents for Long-Distance Relationships
Authors: Nicholas Caporusso, Naryn Samuel, Devyn Ferman
Abstract: In addition to automated Decision Support Systems, intelligent agents are increasingly used in many domains to help humans accomplish their tasks. Research has demonstrated that virtual agents, particularly conversational agents and chatbots, can effectively support users by answering frequent questions, offering simpler and more natural interfaces for searching for information, and providing human-like interaction. Several studies have introduced chatbots in the healthcare domain for a variety of tasks: in addition to analyzing their performance in accomplishing the expected goals, research has confirmed the effectiveness of conversational agents in creating engaging and positive user experiences. Recently, dedicated chatbots have been introduced for addressing mental health conditions. Several systems have been developed to support individuals who suffer from mental disorders, who are in situations of stress and anxiety, or who are reluctant to seek mental health advice. The availability of corpora is crucial for developing and training conversational agents. To this end, dedicated groups on social networks, forums, and online peer support communities are especially useful for acquiring datasets that address specific circumstances and issues, improving the design of Natural Language Processing (NLP) systems, and training Machine Learning algorithms. The scientific literature documents repositories and corpora generated by collecting information publicly available on Twitter, Reddit, Facebook, and other websites. In this paper, we introduce an annotated corpus specifically dedicated to supporting couples who are in long-distance relationships.
Our work leverages the experience of thousands of individuals who were separated during the COVID-19 pandemic, when the governments of many countries enacted travel restrictions that prevented couples from reuniting and forced them into long-distance relationships. Travel bans, which were introduced in March 2020 and maintained through most of 2021, especially affected thousands of bi-national couples subject to visa-related restrictions, who experienced prolonged stress, anxiety, and depression that impacted their mental health. We collected publicly available data published on websites and groups dedicated to offering peer-to-peer support to individuals who have been separated from their partners due to COVID-19-related restrictions. Our corpus contains over 16 months of data and more than 3500 posts and their reactions, which provide a rich representation of the conversations and interactions that happened in the group and offer insight into peer-support dynamics. In addition to completely anonymizing the corpus, we removed less meaningful content and produced interaction and engagement statistics for each post, which are especially useful for analyzing the relevance of the content to the community. Furthermore, we annotated the corpus using both machine methods based on NLP models (e.g., BERT) and human classification. Specifically, each entry is associated with sentiment and emotion labels. Although our data refer to situations caused by COVID-19-related travel restrictions, the content of the corpus is applicable to long-distance relationships in general, and it is particularly suitable for conducting research on peer support as well as for developing new applications, conversational agents, and intelligent systems that can offer psychological and sentimental help to individuals and couples.
Keywords: Chatbots, Natural Language Processing, Machine Learning, BERT