Combining large language models with linguistic features for the readability complexity assessment of texts
Open Access
Article
Conference Proceedings
Authors: Diego Palma, Christian Soto
Abstract: Reading is a fundamental skill for students, as the acquisition of knowledge throughout formal education is largely mediated by text. Texts vary widely in their complexity, and assessing whether a text is appropriately complex for a given reader is essential in educational contexts. Traditional approaches to text complexity rely on shallow surface proxies, such as word frequency, sentence length, or lexical diversity, but these features fail to fully capture cohesion and coherence, two central dimensions of readability. Moreover, comprehension depends not only on textual features but also on the reader's background knowledge, which remains difficult to approximate computationally. This work addresses these challenges by proposing a hybrid computational-linguistics framework for text complexity assessment that integrates classical readability measures and leverages large language models (LLMs). Our contributions are threefold.

First, we develop a novel set of linguistic features designed to approximate cohesion and coherence more effectively than traditional shallow measures. These features are based on discourse patterns, lexical distribution, and semantic similarity between text segments, leveraging embeddings from transformer-based models. Specifically, we introduce new coherence features derived from segmentation heuristics and sentence embeddings, including measures of givenness, lexical diversity based on word distributions, and relative semantic distances across segments. These features capture how information is introduced and developed across a text, reflecting its readability at a deeper level than word counts or syntactic proxies alone.

Second, we design a hybrid approach that combines these linguistic features with a fine-tuned LLM acting as a background-knowledge assessor.
While the linguistic features model structural and semantic regularities internal to the text, the LLM contributes an externalized knowledge base that helps approximate the background knowledge readers may bring to comprehension. By treating the LLM's judgments as an additional feature set, we establish a hybrid model that integrates the strengths of both paradigms.

Third, to support reproducibility and further research, we compile and release a new corpus of Spanish educational texts, drawn from the Chilean school system and annotated with grade-level labels. The dataset contains 656 texts spanning grades 1 through 8, and we provide detailed linguistic feature extractions alongside the labels.

Our experimental evaluation compares three approaches: a fine-tuned LLM (GPT-4o), a machine learning model trained solely on linguistic features, and the proposed hybrid model. Results show that the LLM alone performs poorly on this task (accuracy = 0.18), whereas the linguistic-features model achieves substantially higher accuracy (0.61). Most importantly, the hybrid model outperforms both baselines, achieving 0.75 accuracy, thereby demonstrating the complementary value of combining linguistic insights with LLM-based judgments. Feature analysis further shows that our proposed measures, including KL divergence, lexical diversity, semantic distances, and givenness, are among the strongest predictors of text complexity, highlighting their explanatory power.

In summary, this work advances the state of the art in text complexity assessment by proposing new semantic-based readability features, integrating them with LLMs to approximate reader knowledge, and validating the approach on a novel educational corpus. The findings demonstrate that hybrid models are not only more accurate but also more theoretically aligned with multidimensional views of text comprehension, bridging computational linguistics with educational applications.
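The coherence features named in the abstract (relative semantic distances between segments, distribution-based lexical diversity via KL divergence, and givenness) can be sketched in plain Python. The abstract does not specify the segmentation heuristics, the embedding model, or the smoothing scheme, so the helpers below assume precomputed segment embeddings and token lists and use additive smoothing; they are illustrative only, not the authors' implementation.

```python
import math
from collections import Counter

def cosine(u, v):
    # Cosine similarity between two dense vectors (e.g., sentence embeddings).
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def adjacent_semantic_distances(seg_embeddings):
    # Relative semantic distance between consecutive segments:
    # 1 - cosine similarity of their precomputed embeddings.
    return [1.0 - cosine(a, b)
            for a, b in zip(seg_embeddings, seg_embeddings[1:])]

def kl_divergence(seg_a_tokens, seg_b_tokens, alpha=1.0):
    # KL divergence between additively smoothed word distributions of two
    # segments; higher values suggest a larger vocabulary shift.
    vocab = set(seg_a_tokens) | set(seg_b_tokens)
    ca, cb = Counter(seg_a_tokens), Counter(seg_b_tokens)
    na = len(seg_a_tokens) + alpha * len(vocab)
    nb = len(seg_b_tokens) + alpha * len(vocab)
    return sum(((ca[w] + alpha) / na)
               * math.log(((ca[w] + alpha) / na) / ((cb[w] + alpha) / nb))
               for w in vocab)

def givenness(prev_tokens, seg_tokens):
    # Proportion of a segment's tokens already seen earlier in the text,
    # a rough proxy for given vs. new information.
    if not seg_tokens:
        return 0.0
    seen = set(prev_tokens)
    return sum(1 for t in seg_tokens if t in seen) / len(seg_tokens)
```

Identical adjacent segments yield a semantic distance and KL divergence of zero, while orthogonal embeddings or disjoint vocabularies push both measures up, which is the intuition behind using them as coherence signals.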
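Treating the LLM's judgment as an additional feature set, as the hybrid model does, could look like the following sketch. The encoding (a one-hot vector over the eight grade levels appended to the linguistic features) and the downstream classifier are assumptions for illustration; the abstract does not specify how the judgments are encoded.

```python
def hybrid_features(linguistic_feats, llm_grade, n_grades=8):
    # One-hot encode the LLM's predicted grade (1..n_grades) and append it
    # to the linguistic feature vector, so a downstream classifier can
    # weigh both signal sources jointly.
    one_hot = [1.0 if g == llm_grade else 0.0
               for g in range(1, n_grades + 1)]
    return list(linguistic_feats) + one_hot
```

The combined vector would then be fed to any standard classifier trained on the grade-level labels; the key design point is that the LLM's output is just another feature, letting the model learn when to trust it.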
Keywords: Natural Language Processing, Intelligent Systems, NLP, Artificial Intelligence, Computational Linguistics, Large Language Models, LLM
DOI: 10.54941/ahfe1007028
AHFE Open Access