Early Callsign Highlighting using Automatic Speech Recognition to Reduce Air Traffic Controller Workload

Open Access
Conference Proceedings
Authors: Shruthi Shetty, Hartmut Helmke, Matthias Kleinert, Oliver Ohneiser

Abstract: The primary task of an air traffic controller (ATCo) is to issue instructions to pilots. However, the first verbal contact is often initiated by the pilot, so the ATCo must search for the aircraft radar label that corresponds to the callsign uttered by the pilot. It would therefore be useful to have a controller assistance system that recognizes the spoken callsign directly from the speech data and highlights it in the ATCo display as early as possible. We propose to use an automatic speech recognition (ASR) system to first obtain the speech-to-text transcription and then extract the spoken callsign from the transcription. As high callsign recognition performance is required, we use surveillance data, which significantly reduces callsign recognition error rates. When using ASR transcriptions of ATCo utterances from Isavia data (HAAWAII project), we initially obtain a callsign recognition error rate of 6.2%, which improves to 2.8% when surveillance data is used. For ATC operational speech data obtained from the air navigation service provider NATS for the London approach area, we currently obtain a callsign recognition rate of 93.8% for both ATCo and pilot utterances on automatic transcriptions generated by an ASR system with a word error rate of 5.1%. When surveillance data is not used, the callsign recognition rate drops significantly to 82.7%, which indicates the importance of surveillance data for callsign recognition. Once the callsign is spoken, we are able to recognize it within a second, which would be of great value to ATCos, especially in high-traffic situations with high workload.
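
The two-step idea in the abstract (transcribe first, then extract the callsign with the help of surveillance data) can be illustrated with a minimal sketch. It is not the authors' implementation: the telephony table, the similarity measure, the `min_score` threshold, and the function names `spoken_form` and `best_callsign` are all assumptions made for illustration. The sketch only considers callsigns that are currently present in the surveillance data and picks the one whose spoken form best matches a window of the ASR transcription.

```python
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny airline telephony table (assumption).
TELEPHONY = {"BAW": "speedbird", "DLH": "lufthansa"}
DIGITS = dict(zip("0123456789", ["zero", "one", "two", "three", "four",
                                 "five", "six", "seven", "eight", "nine"]))
LETTERS = dict(zip("ABCDEFGHIJKLMNOPQRSTUVWXYZ", [
    "alfa", "bravo", "charlie", "delta", "echo", "foxtrot", "golf", "hotel",
    "india", "juliett", "kilo", "lima", "mike", "november", "oscar", "papa",
    "quebec", "romeo", "sierra", "tango", "uniform", "victor", "whiskey",
    "xray", "yankee", "zulu"]))


def spoken_form(callsign):
    """Expand a callsign such as 'BAW123' into the words a speaker might say."""
    prefix, suffix = callsign[:3], callsign[3:]
    if prefix in TELEPHONY:
        words = [TELEPHONY[prefix]]
    else:
        # Unknown designator: fall back to spelling every character.
        words, suffix = [], callsign
    for ch in suffix:
        words.append(DIGITS.get(ch) or LETTERS.get(ch, ch.lower()))
    return words


def best_callsign(transcript, surveillance_callsigns, min_score=0.6):
    """Return the surveillance callsign that best matches the transcript.

    Returns None when no candidate reaches min_score (assumed threshold).
    """
    words = transcript.lower().split()
    best, best_score = None, 0.0
    for callsign in surveillance_callsigns:
        expected = spoken_form(callsign)
        n = len(expected)
        # Slide a window of the expected length over the transcript words.
        for i in range(max(1, len(words) - n + 1)):
            score = SequenceMatcher(None, words[i:i + n], expected).ratio()
            if score > best_score:
                best, best_score = callsign, score
    return best if best_score >= min_score else None


in_sector = ["BAW128", "BAW123", "DLH4AB"]  # callsigns from surveillance data
print(best_callsign("speedbird one two three descend flight level eight zero",
                    in_sector))
```

Restricting the candidates to aircraft known from surveillance data means that a partially misrecognized transcription can still resolve to a plausible callsign, which is the intuition behind the error rate reduction reported in the abstract.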

Keywords: air traffic controller workload, automatic speech recognition, callsign highlighting

DOI: 10.54941/ahfe1002493
