TauchiGPT_V2: An Offline Agent-based Opensource AI Tool designed to Assist in Academic Research

Open Access
Article
Conference Proceedings
Authors: Ahmed Farooq, Jari Kangas, Mounia Ziat, Roope Raisamo

Abstract: Recent progress in artificial intelligence, particularly deep learning, has ushered in a new era of autonomously generated content spanning text, audio, and visuals. Large Language Models (LLMs) such as ChatGPT, Llama2, Claude, and PaLM 2 are now developed enough not only to fill in the gaps within user-generated content, but also to create unique content of their own, using predefined styles, formats, and writing techniques. With selective modelling and fine-tuning on relevant training data, LLMs can output original content for a wide range of tasks previously considered solely the domain of human creativity. However, within academic research and development, this AI renaissance has yet to make a meaningful impact in the pedagogical domains. Crafting a tailored R&D instrument capable of intricate research procedures previously presented a formidable challenge in terms of expertise, time, and financial resources. Within this context, the latest developments in Generative Pre-trained Transformers (GPT) and their foundational structures offer a way forward, given their potential to exploit pre-trained LLMs for optimizing standard research operations. Our previous work on autonomous agents shows that existing tools and deductive reasoning techniques built on the LangChain framework can be combined into a customized tool for academic research. This study builds on existing work in autonomous agents and open-source LLMs to develop TAUCHI-GPT_V2, a novel adaptation of the academic research assistant. TAUCHI-GPT_V2, conceptualized as an open-source initiative, is built on top of the LangChain architecture, employs LLaMA2-13b as the core LLM, and ingests users’ own data and files to provide highly relevant contextual results. In this paper, we discuss how TAUCHI-GPT_V2 uses a custom, offline, local vector database (vectorDB) to parse users’ personal files and output relevant contextual results within a chat interface.
We also put the tool to the test by having academic researchers use it within their daily workflow and report on its efficacy and reliability, with respect to both hallucinations and the citation of relevant information, when used to support academic research-related tasks.
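
The offline retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function stands in for a real embedding model, `LocalVectorStore` and `build_prompt` are hypothetical names, and the file names and note texts are invented placeholders. It only shows the general idea of indexing a user's own files locally, retrieving the most similar chunks for a question, and passing them, tagged with their source, to a local LLM.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words term-frequency vector.

    Assumption: a real system would use a learned embedding model instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LocalVectorStore:
    """Hypothetical in-memory store of (source, chunk, vector) entries."""

    def __init__(self):
        self.entries = []

    def add(self, source, chunk):
        # Index one text chunk together with the file it came from.
        self.entries.append((source, chunk, embed(chunk)))

    def search(self, query, k=2):
        # Rank all chunks by similarity to the query and keep the top k.
        query_vec = embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine(query_vec, entry[2]),
            reverse=True,
        )
        return [(source, chunk) for source, chunk, _ in ranked[:k]]


def build_prompt(question, hits):
    """Assemble retrieved chunks, tagged by source file, into an LLM prompt."""
    context = "\n".join(f"[{source}] {chunk}" for source, chunk in hits)
    return (
        "Answer using only the context below and cite the source in brackets.\n"
        f"{context}\n\nQuestion: {question}"
    )


# Invented placeholder notes standing in for a user's personal files.
store = LocalVectorStore()
store.add("notes_haptics.txt",
          "Vibrotactile feedback improved target selection time in the touchscreen study.")
store.add("notes_budget.txt",
          "Travel budget for the conference was approved in March.")
store.add("notes_methods.txt",
          "Participants completed the touchscreen study in a driving simulator.")

question = "What did the touchscreen study find about feedback?"
hits = store.search(question, k=2)
prompt = build_prompt(question, hits)
print(prompt)
```

Tagging each retrieved chunk with its source file is one simple way to let the model cite where an answer came from, which is relevant to the hallucination and citation checks mentioned above.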

Keywords: Artificial Intelligence, Large Language Models (LLMs), Generative Pre-trained Transformers (GPT), Human-Computer Interaction, Open-source LLMs

DOI: 10.54941/ahfe1004567
