Project AVIAN-S: Development of a Natural Language Processing Model for Analyzing Aviation Safety Event Reports
Authors: R Jordan Hinson, Edward Bynum, Amelia Kinsella, Katherine Berry, Michael Sawyer
Abstract: Voluntary Safety Reporting Programs (VSRPs) allow civil aviation authorities, operators, and manufacturers to actively monitor and identify potential safety issues within their operations. These first-hand reports enable organizations to develop and implement safety and efficiency improvements based on front-line observations. The National Aeronautics and Space Administration (NASA) operates the Aviation Safety Reporting System (ASRS) to empower the aviation industry and its participants to report observed safety problems, discrepancies, or deficiencies. ASRS receives, processes, and publicly releases thousands of reports annually. For example, 6,428 ASRS reports detailing events that occurred in 2019 are currently available, and any interested party can download these reports and their associated data. Researchers and analysts often then read each report and manually label factors of interest to gain safety insights. This manual process can be labor-intensive and relies on the ongoing efforts of subject-matter experts. The full potential of voluntary safety reporting data can be difficult to realize because limited resources are available to analyze and summarize these data. New machine learning techniques involving natural language processing offer opportunities to assess and label factors of interest within safety reports more efficiently and effectively. A novel machine learning model, AVIAN-S, has been developed and trained to identify human factors issues within aviation safety reports. The AVIAN-S model has been built and iteratively trained on over 50,000 rows of manually classified aviation safety reporting data. The model uses machine learning and natural language processing to automate the process of labeling aviation safety reporting data and codifying reporter narratives according to an established human factors taxonomy.
This paper will describe lessons learned from the initial model development iterations and present interim results of the model as applied across a set of sample event reports. The paper will further discuss the challenges and implications of using natural language processing to identify human factors issues emerging from this or other large aviation safety reporting data sets.
Keywords: Aviation Safety, Human Factors, Artificial Intelligence, Machine Learning, Natural Language Processing
Cite this paper: