Novice and expert performance in a knowledge graph-driven assistive dialogue system
Open Access
Article
Conference Proceedings
Authors: Shannon Briggs, Emily Conway, Clare Arrington, Kelsey Rook, Tomek Strzalkowski, Abraham Sanders, Erfan El-hossami, Collen Roller
Abstract: This paper describes an LLM-supported assistive dialog system and a proposed evaluation methodology for that system, developed for Air Force intelligence analysts in intelligence reporting scenarios. Part of this technology has been described in prior work, which developed the core assistive dialog functionality (Sanders et al., 2022). This paper expands the previous system by incorporating retrieval-augmented generation (RAG) with a recommendation system. The recommendation system is informed by knowledge graphs that describe analysts' progress through information foraging and sensemaking processes, and the system customizes its assistance based on that progress and the analyst's level of expertise. We anticipate that this assistive dialog system will improve metrics of efficiency, effectiveness, information absorption, and retention. From this aggregate of schemas, we developed an ideal sample schema designed to inform the assistive dialog system and help track participants during intelligence analysis tasks. We are specifically interested in the performance difference between expert and novice analysts in this arena; however, development of a large language model that adapts to user expertise has not been previously attempted. Our proposed evaluations are therefore exploratory in nature, intended to establish baseline behaviors. The existing literature is clear about expert behaviors across a variety of domains (Ericsson & Kintsch, 1995; Ward et al., 2008; Williams & Ericsson, 2005) and has informed our approach to designing the behavior of the assistive dialog system to support users with varying levels of expertise. In our evaluations, we aim to determine whether an assistive dialog system can reduce the training time needed to narrow the performance gap between novices and experts, and whether the system benefits experts by accelerating their current workflow processes.
To study this, we have designed the dialog system's behavior to differ based on the background of the participant. The dialog system interacts with experts in a support capacity, working to lessen the cognitive load of analysis. For novice users, however, the assistive dialog system engages in a guiding capacity, assisting novices in tasks with which they have less familiarity and confidence, in order to increase efficiency, accuracy, and information retention and absorption. The proposed testing scenario is a lab study designed to monitor participants during an intelligence task they could encounter in a typical workday. Participants are allotted time to complete an intelligence task: to gather information and deliver an intelligence estimate at the end of the testing scenario. Success of the testing scenario is judged by a combination of human and machine metrics. Human metrics being observed and collected include efficiency measures, such as time-to-task. Other evaluation metrics include evaluation of the cognitive schema direction of the assistive dialog system, to determine whether the participant is successfully directed or influenced during the course of the evaluation.
Works Cited
Ericsson, K. A., & Kintsch, W. (1995). Long-term working memory. Psychological Review, 102(2), 211.
Sanders, A., Strzalkowski, T., Si, M., Chang, A., Dey, D., Braasch, J., & Wang, D. (2022). Towards a progression-aware autonomous dialogue agent. arXiv preprint arXiv:2205.03692.
Ward, P., Farrow, D., Harris, K. R., Williams, A. M., Eccles, D. W., & Ericsson, K. A. (2008). Training perceptual-cognitive skills: Can sport psychology research inform military decision training? Military Psychology, 20(sup1), S71-S102.
Williams, A. M., & Ericsson, K. A. (2005). Perceptual-cognitive expertise in sport: Some considerations when applying the expert performance approach. Human Movement Science, 24(3), 283-307.
Keywords: LLM, RAG, knowledge graphs, human evaluations, decision making, information processing, mental models
DOI: 10.54941/ahfe1006139