OCR-based Quality Assessment and Auxiliary Review System for Semantic Information Extraction from Engineering Drawings
Abstract
Optical Character Recognition (OCR) has been widely adopted to extract textual information from legacy engineering drawings, with the aim of transforming image-based PDF documents into semantically enriched digital models. However, drawing quality varies with source and format, which degrades OCR performance and lowers the accuracy of the extraction results. Manual review is therefore needed to correct OCR outputs, requiring additional time and labor. To address this issue, the authors propose an OCR-based quality assessment method combined with an auxiliary review system to improve both the accuracy and the efficiency of textual information extraction. A set of semantic- and task-driven criteria was designed to evaluate drawing quality. A dataset of 50 bridge plans in PDF format was annotated with “high” or “low” quality labels, and the textual content was manually transcribed for OCR performance evaluation. The proposed method applies Tesseract OCR to extract textual information and automate the quality assessment process. Token-level confidence scores are computed, and drawings with an average score below 80 are classified as low-quality. In the auxiliary review system, detected tables are reconstructed, and cells containing text below this confidence threshold are highlighted, enabling reviewers to focus on potentially error-prone regions. Experiments on the annotated dataset showed that the proposed method achieved a precision of 97.14% and a recall of 87.18% in quality classification. By excluding low-quality drawings, precision increased by 17.84% and recall increased by 18.96% in information extraction. Additionally, the auxiliary review system highlighted 36.81% of the cells, indicating a potential reduction of over 60% in manual review time. Overall, the proposed method provides a lightweight approach to improving OCR-based semantic information extraction from engineering drawings in terms of accuracy and review efficiency.
Keywords: Optical Character Recognition, engineering drawings, semantic information extraction, quality filtering, auxiliary review system
DOI: 10.54941/ahfe1007033
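The confidence-based filtering and cell highlighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `classify_drawing` and `cells_to_review` are hypothetical, the confidences are assumed to be Tesseract's 0-100 word-level scores (with -1 marking non-text boxes), and flagging a cell when any of its tokens falls below the threshold is an assumption, since the abstract does not say how token scores are aggregated per cell.

```python
from statistics import mean

# Threshold stated in the abstract, on Tesseract's 0-100 confidence scale.
CONFIDENCE_THRESHOLD = 80


def classify_drawing(token_confidences, threshold=CONFIDENCE_THRESHOLD):
    """Label a drawing 'low' when its mean token confidence is below the threshold."""
    # Assumption: negative values (e.g. -1 for non-text boxes) are ignored.
    valid = [c for c in token_confidences if c >= 0]
    if not valid:
        # Assumption: a drawing with no recognized tokens is treated as low quality.
        return "low"
    return "low" if mean(valid) < threshold else "high"


def cells_to_review(cell_confidences, threshold=CONFIDENCE_THRESHOLD):
    """Return the keys of table cells that contain at least one low-confidence token.

    cell_confidences maps a (row, column) key to the list of token
    confidences found in that cell (a hypothetical data layout).
    """
    return [
        cell
        for cell, confs in cell_confidences.items()
        if any(0 <= c < threshold for c in confs)
    ]


if __name__ == "__main__":
    print(classify_drawing([95, 90, 85]))
    print(classify_drawing([70, 60, -1, 95]))
    print(cells_to_review({(0, 0): [92, 88], (0, 1): [75, 91], (1, 0): []}))
```

Under this reading, only the flagged cells are shown to the reviewer. With 36.81% of cells highlighted, as reported above, the remaining 63.19% would not need inspection, which is consistent with the stated reduction of over 60%.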
AHFE Open Access