Predicative potential of lexical parameters: text complexity assessment in Russian language textbooks for 5-7 grades
- Authors: Andreeva M.I. (1, 2), Zamaletdinov R.R. (2), Borisova A.S. (3)
- Affiliations:
  1. Kazan State Medical University
  2. Kazan (Volga Region) Federal University
  3. RUDN University
- Issue: Vol 22, No 4 (2024): LINGUISTIC PROFILES OF RUSSIAN TEXTS: GOING FROM FORM TO MEANING
- Pages: 518-539
- Section: Key Issues of Russian Language Research
- URL: https://journals.rudn.ru/russian-language-studies/article/view/42906
- DOI: https://doi.org/10.22363/2618-8163-2024-22-4-518-539
- EDN: https://elibrary.ru/APRGCY
Abstract
This study addresses the pressing issue of assessing how lexical parameters influence text complexity. The research draws on a specialized linguistic corpus of 15 modern Russian language textbooks for grades 5-7, totaling 811,911 words. The study aims to identify the scale and dynamics of vocabulary change in Russian language textbooks for grades 5-7. The research algorithm included the following stages: (a) identifying the size and content of the vocabulary in modern Russian language textbooks for grades 5-7, (b) assessing the share of linguistic terms in that vocabulary, and (c) identifying complexity predictors, i.e. parameters that demonstrate a statistically significant correlation with readability. The analytical part of the study was preceded by a meta-description of the corpus, its tokenization, lemmatization, and segmentation into fragments of approximately 1,000 words. Text parameters were calculated using the text profiler RuLingva, and correlation strength was assessed with STATISTICA. To ensure the reliability of the results, the relationships between lexical parameters and text readability were analyzed at two levels: the textbook level (with average indicators for the 15 textbooks for grades 5-7) and the level of 1,000-word fragments. The readability index proved slightly lower than expected: it was anticipated to be 1.0-1.5 levels higher. This may be characteristic of the Russian language textbook as a genre and may indicate the eclecticism of academic texts, which include fragments of research discourse (rules and theory), fiction (exercises), and instructional discourse (task texts). The research demonstrated that the share of linguistic terms does not exceed 2% of the textbook vocabulary, but their share in the texts rises to 13%.
The statistical analysis indicates that the following indices are text complexity predictors: ‘lexical density’, cohesion (global and local overlaps of nouns and arguments), ‘descriptiveness’ (the ratio of adjectives to nouns), ‘narrativity’ (the ratio of verbs to nouns), and the share of nouns in the genitive case. Prospects for further research include studying verbs and pronouns as complexity predictors in Russian language textbooks.
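The quantities named in the abstract can be illustrated with a minimal sketch. It is not the RuLingva or STATISTICA workflow used in the study. It assumes tokens already carry coarse part-of-speech tags (Universal Dependencies style labels such as NOUN, ADJ, VERB, which are an assumption here), it takes lexical density to be the share of content words among all tokens (a common definition that the abstract itself does not spell out), and it uses Pearson's coefficient as one possible correlation measure. All function names are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word form, coarse part-of-speech tag); tag set is an assumption

# Assumption: content words are nouns, verbs, adjectives and adverbs.
CONTENT_POS = {"NOUN", "VERB", "ADJ", "ADV"}


def split_into_fragments(tokens: List[Token], size: int = 1000) -> List[List[Token]]:
    """Cut a token list into consecutive fragments of about `size` tokens.

    A tail shorter than half a fragment is merged into the previous fragment,
    so fragments are only approximately `size` tokens long.
    """
    fragments = [tokens[i:i + size] for i in range(0, len(tokens), size)]
    if len(fragments) > 1 and len(fragments[-1]) < size // 2:
        tail = fragments.pop()
        fragments[-1] = fragments[-1] + tail
    return fragments


def lexical_indices(fragment: List[Token]) -> Dict[str, float]:
    """Compute three ratio-based indices for one fragment."""
    total = len(fragment)
    nouns = sum(1 for _, pos in fragment if pos == "NOUN")
    adjectives = sum(1 for _, pos in fragment if pos == "ADJ")
    verbs = sum(1 for _, pos in fragment if pos == "VERB")
    content = sum(1 for _, pos in fragment if pos in CONTENT_POS)
    return {
        # Assumed definition: content words divided by all tokens.
        "lexical_density": content / total if total else 0.0,
        # As described in the abstract: ratio of adjectives to nouns.
        "descriptiveness": adjectives / nouns if nouns else 0.0,
        # As described in the abstract: ratio of verbs to nouns.
        "narrativity": verbs / nouns if nouns else 0.0,
    }


def pearson_r(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation between two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)
```

In such a setup, `lexical_indices` would be applied to every fragment returned by `split_into_fragments`, and each index would then be correlated with a per-fragment readability score via `pearson_r`. Averaging the per-fragment values within a textbook would give the textbook-level view mentioned in the abstract.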
About the authors
Mariia I. Andreeva
Kazan State Medical University; Kazan (Volga Region) Federal University
Author for correspondence.
Email: mariia99andreeva@yandex.ru
ORCID iD: 0000-0002-5760-0934
SPIN-code: 9243-6995
Scopus Author ID: 57195974758
ResearcherId: ABF-7003-2020
PhD in Philology, Associate Professor of the Department of Foreign Languages, Kazan State Medical University; Senior Researcher at the research laboratory ‘Multidisciplinary Text Investigation’, Kazan (Volga Region) Federal University
49 Butlerov St, Kazan, the Republic of Tatarstan, 420012, Russian Federation; 18 Kremlevskaya St, Kazan, 420008, Russian Federation
Radif R. Zamaletdinov
Kazan (Volga Region) Federal University
Email: director.ifmk@gmail.com
ORCID iD: 0000-0002-2692-1698
SPIN-code: 4027-8784
Scopus Author ID: 56027359900
ResearcherId: M-2174-2013
Doctor Habil. (Philology), Professor, Director of the Institute of Philology and Intercultural Communication, Head of the Department of General Linguistics and Turkology
18 Kremlevskaya St, Kazan, 420008, Russian Federation
Anna S. Borisova
RUDN University
Email: borisova-as@rudn.ru
ORCID iD: 0000-0002-7395-7028
SPIN-code: 2332-6093
Scopus Author ID: 57194527178
ResearcherId: AAH-9347-2019
PhD in Philology, Associate Professor of the Department of Foreign Languages, Faculty of Philology
6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation
Supplementary files
Source: Calculated by M.I. Andreeva, R.R. Zamaletdinov, A.S. Borisova based on the research corpus.
