<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article>
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ali="http://www.niso.org/schemas/ali/1.0/" article-type="research-article" dtd-version="1.2" xml:lang="en"><front><journal-meta><journal-id journal-id-type="publisher-id">Russian Language Studies</journal-id><journal-title-group><journal-title xml:lang="en">Russian Language Studies</journal-title><trans-title-group xml:lang="ru"><trans-title>Русистика</trans-title></trans-title-group></journal-title-group><issn publication-format="print">2618-8163</issn><issn publication-format="electronic">2618-8171</issn><publisher><publisher-name xml:lang="en">Peoples’ Friendship University of Russia named after Patrice Lumumba</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">27498</article-id><article-id pub-id-type="doi">10.22363/2618-8163-2021-19-3-331-345</article-id><article-categories><subj-group subj-group-type="toc-heading" xml:lang="en"><subject>Mediadidactics and electronic means of instruction</subject></subj-group><subj-group subj-group-type="toc-heading" xml:lang="ru"><subject>Медиадидактика и электронные средства обучения</subject></subj-group><subj-group subj-group-type="article-type"><subject>Research Article</subject></subj-group></article-categories><title-group><article-title xml:lang="en">Textometr: an online tool for automated complexity level assessment of texts for Russian language learners</article-title><trans-title-group xml:lang="ru"><trans-title>Текстометр: онлайн-инструмент определения уровня сложности текста по русскому языку как иностранному</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author"><name-alternatives><name xml:lang="en"><surname>Laposhina</surname><given-names>Antonina N.</given-names></name><name xml:lang="ru"><surname>Лапошина</surname><given-names>Антонина 
Николаевна</given-names></name></name-alternatives><bio xml:lang="en"><p>leading expert, Laboratory of Cognitive and Linguistic Studies</p></bio><bio xml:lang="ru"><p>ведущий эксперт лаборатории когнитивных и лингвистических исследований</p></bio><email>ANLaposhina@pushkin.institute</email><xref ref-type="aff" rid="aff1"/></contrib><contrib contrib-type="author"><name-alternatives><name xml:lang="en"><surname>Lebedeva</surname><given-names>Maria Yu.</given-names></name><name xml:lang="ru"><surname>Лебедева</surname><given-names>Мария Юрьевна</given-names></name></name-alternatives><bio xml:lang="en"><p>Candidate of Philology, leading researcher of the Laboratory of Cognitive and Linguistic Research, Associate Professor of the Department of Methods of Teaching Russian as a Foreign Language</p></bio><bio xml:lang="ru"><p>кандидат филологических наук, ведущий научный сотрудник лаборатории когнитивных и лингвистических исследований, доцент кафедры методики преподавания РКИ</p></bio><email>MULebedeva@pushkin.institute</email><xref ref-type="aff" rid="aff1"/></contrib></contrib-group><aff-alternatives id="aff1"><aff><institution xml:lang="en">Pushkin State Russian Language Institute</institution></aff><aff><institution xml:lang="ru">Государственный институт русского языка имени А.С. 
Пушкина</institution></aff></aff-alternatives><pub-date date-type="pub" iso-8601-date="2021-09-28" publication-format="electronic"><day>28</day><month>09</month><year>2021</year></pub-date><volume>19</volume><issue>3</issue><issue-title xml:lang="en">VOL 19, NO3 (2021)</issue-title><issue-title xml:lang="ru">ТОМ 19, №3 (2021)</issue-title><fpage>331</fpage><lpage>345</lpage><history><date date-type="received" iso-8601-date="2021-09-28"><day>28</day><month>09</month><year>2021</year></date></history><permissions><copyright-statement xml:lang="en">Copyright © 2021, Laposhina A.N., Lebedeva M.Y.</copyright-statement><copyright-statement xml:lang="ru">Copyright © 2021, Лапошина А.Н., Лебедева М.Ю.</copyright-statement><copyright-year>2021</copyright-year><copyright-holder xml:lang="en">Laposhina A.N., Lebedeva M.Y.</copyright-holder><copyright-holder xml:lang="ru">Лапошина А.Н., Лебедева М.Ю.</copyright-holder><ali:free_to_read xmlns:ali="http://www.niso.org/schemas/ali/1.0/"/><license><ali:license_ref xmlns:ali="http://www.niso.org/schemas/ali/1.0/">http://creativecommons.org/licenses/by/4.0</ali:license_ref></license></permissions><self-uri xlink:href="https://journals.rudn.ru/russian-language-studies/article/view/27498">https://journals.rudn.ru/russian-language-studies/article/view/27498</self-uri><abstract xml:lang="en"><p style="text-align: justify;">Evaluating the accessibility of a text is an urgent and labor-intensive task when preparing texts for teaching Russian as a foreign language. At the same time, the procedure for assigning a text to one of the levels on the CEFR scale (from A1 to C2) is well formalized and described in the professional literature, which makes it possible to automate. This paper presents Textometr, a new free web-based tool that estimates the CEFR level of any given Russian text and reports other key statistics relevant for adapting it for foreign students. 
The automated level assessment is based on a regression model trained on a dataset of more than 800 texts from Russian textbooks for foreigners, using several machine learning and natural language processing methods. In addition to the CEFR level, the tool provides information relevant for adapting the text to educational tasks: lists of keywords and of candidate words for a vocabulary list, statistics on the text coverage by frequency lists and CEFR-graded vocabulary lists (lexical minima), a frequency list of the text, and a forecast of the time needed for reading. The tool's shortcomings at the current stage of development and proposed ways of addressing them are also discussed. Finally, the results of an experimental evaluation of the tool's quality and the directions for its further development are reported. Textometr can provide helpful information not only to teachers and methodologists, but also to textbook authors and publishers who need to check that text content matches the declared level and educational goals.</p></abstract><trans-abstract xml:lang="ru"><p style="text-align: justify;">Оценка текста с точки зрения его языковой доступности представляется крайне актуальной и трудозатратной задачей в процессе его подготовки к занятию по русскому языку как иностранному. С другой стороны, процесс отнесения текста к одному из уровней по шкале CEFR (от А1 до С2) является достаточно формализованным и описанным в методической литературе, что открывает возможности по его автоматизации. Цель исследования - описать возможности и методику использования нового онлайн-инструмента «Текcтометр» для автоматического анализа уровня сложности текста по шкале CEFR и его подготовки к уроку русского языка в иностранной аудитории. Материалом для построения математической модели по определению уровня текста послужили более чем 800 текстов из современных учебников по русскому языку как иностранному. 
В процессе разработки концепции и создания сервиса применялись методы теоретического анализа научно-методической литературы и регламентирующих документов в области русского языка как иностранного, анкетирования и тестирования учащихся и преподавателей, машинного обучения и автоматической обработки текстов на естественном языке. В результате установлены и описаны основные возможности сервиса: определение уровня текста по шкале CEFR, предоставление информации, полезной для адаптации текста к учебным задачам, такой как списки ключевых слов и слов - оптимальных кандидатов в словарь к данному тексту, статистика по покрытию текста лексическими минимумами ТРКИ и списками частотных слов русского языка, меры лексического разнообразия текста, прогноз времени, необходимого для разных видов чтения текста. Выявлены недостатки работы сервиса на данном этапе разработки и предложены пути их решения. Приведены результаты экспериментальной проверки качества работы инструмента и намечены векторы дальнейшего развития сервиса. 
Сервис может быть полезен преподавателям, методистам, а также авторам пособий и представителям издательств для проверки соответствия текстового материала заявленному уровню и учебным целям.</p></trans-abstract><kwd-group xml:lang="en"><kwd>Russian as a foreign language</kwd><kwd>educational text</kwd><kwd>text complexity</kwd><kwd>reading</kwd><kwd>text adapting</kwd><kwd>computational linguodidactics</kwd><kwd>computer assisted language learning</kwd><kwd>Russian language learning</kwd><kwd>web tools</kwd></kwd-group><kwd-group xml:lang="ru"><kwd>русский язык как иностранный</kwd><kwd>учебный текст</kwd><kwd>сложность текста</kwd><kwd>обучение чтению</kwd><kwd>адаптация текстов</kwd><kwd>компьютерная лингводидактика</kwd><kwd>компьютерные технологии</kwd><kwd>преподавание русского языка</kwd><kwd>интернет-ресурсы</kwd><kwd>обучение русскому языку</kwd></kwd-group><funding-group/></article-meta></front><body></body><back><ref-list><ref id="B1"><label>1.</label><citation-alternatives><mixed-citation xml:lang="en">Alexander, P.A., &amp; Jetton, T.L. (1996). The role of importance and interest in the processing of text. Educational Psychology Review, 8(1), 89–121.</mixed-citation><mixed-citation xml:lang="ru">Арутюнов А.Р. Теория и практика создания учебника русского языка для иностранцев. М. : Русский язык, 1990. 167 с.</mixed-citation></citation-alternatives></ref><ref id="B2"><label>2.</label><citation-alternatives><mixed-citation xml:lang="en">Arutyunov, A.R. (1990). Theory and practice of creating a textbook of the Russian language for foreigners. Moscow: Russkii Yazyk Publ. (In Russ.)</mixed-citation><mixed-citation xml:lang="ru">Бим И.Л. Методика обучения иностранным языкам как наука и проблемы школьного учебника. М. : Русский язык, 1977. 288 с.</mixed-citation></citation-alternatives></ref><ref id="B3"><label>3.</label><citation-alternatives><mixed-citation xml:lang="en">Bim, I.L. (1977). 
Methods of teaching foreign languages as a science and problems of a school textbook. Moscow: Russkii Yazyk Publ. (In Russ.)</mixed-citation><mixed-citation xml:lang="ru">Вятютнев М.Н. Теория учебника русского языка как иностранного (методические основы). М. : Русский язык, 1984. 144 с.</mixed-citation></citation-alternatives></ref><ref id="B4"><label>4.</label><citation-alternatives><mixed-citation xml:lang="en">Chen, X., &amp; Meurers, D. (2016). Characterizing text difficulty with word frequencies. Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications (June 16, 2016), 11, 84–94. San Diego, CA, USA.</mixed-citation><mixed-citation xml:lang="ru">Зализняк А.А. Русское именное словоизменение. М. : Наука, 1967. 373 с.</mixed-citation></citation-alternatives></ref><ref id="B5"><label>5.</label><citation-alternatives><mixed-citation xml:lang="en">DuBay, W. (2004). The principles of readability. Costa Mesa, CA: Impact Information.</mixed-citation><mixed-citation xml:lang="ru">Лапошина А.Н. Корпус текстов учебников РКИ как инструмент анализа учебных материалов // Русский язык за рубежом. 2020. № 6 (283). С. 22-28.</mixed-citation></citation-alternatives></ref><ref id="B6"><label>6.</label><citation-alternatives><mixed-citation xml:lang="en">Graesser, A.C., McNamara, D.S., Cai, Z., Conley, M., Li, H., &amp; Pennebaker, J. (2014). Coh-Metrix measures text characteristics at multiple levels of language and discourse. The Elementary School Journal, 15(2), 210–229.</mixed-citation><mixed-citation xml:lang="ru">Лапошина А.Н. Опыт экспериментального исследования сложности текстов по РКИ // Динамика языковых и культурных процессов в современной России : материалы VI Конгресса РОПРЯЛ (Уфа, 11-14 октября 2018 г.) : сборник статей. 2018. Вып. 6. С. 1544-1549</mixed-citation></citation-alternatives></ref><ref id="B7"><label>7.</label><citation-alternatives><mixed-citation xml:lang="en">Karpov, N., Baranova, J., &amp; Vitugin, F. (2014). 
Single-sentence readability prediction in Russian. Proceedings of Analysis of Images, Social Networks, and Texts conference (AIST), (3), 91–100.</mixed-citation><mixed-citation xml:lang="ru">Лапошина А.Н., Лебедева М.Ю. Корпусный подход к решению проблемы отбора лексики в обучении РКИ // Slavica Helsingiensia. 2019. № 52. С. 359-368.</mixed-citation></citation-alternatives></ref><ref id="B8"><label>8.</label><citation-alternatives><mixed-citation xml:lang="en">Keskisärkkä, R., &amp; Jönsson, A. (2013). Investigations of synonym replacement for Swedish. Northern European Journal of Language Technology, (3), 41–59.</mixed-citation><mixed-citation xml:lang="ru">Микк Я.А. Оптимизация сложности учебного текста : в помощь авторам и редакторам. М. : Просвещение, 1981. 119 с.</mixed-citation></citation-alternatives></ref><ref id="B9"><label>9.</label><citation-alternatives><mixed-citation xml:lang="en">Laposhina, A.N. (2018). Insights from an experimental study on the text complexity for Russian as a foreign language. The Dynamics of Linguistic and Cultural Processes in Modern Russia: Proceedings of the VI Congress of ROPRYAL, (6), 1544–1549. (In Russ.)</mixed-citation><mixed-citation xml:lang="ru">Миллер Л.В., Политова Л.В., Рыбакова И.Я. Жили-были... 28 уроков русского языка для начинающих : учебник. СПб. : Златоуст, 2016. 112 с.</mixed-citation></citation-alternatives></ref><ref id="B10"><label>10.</label><citation-alternatives><mixed-citation xml:lang="en">Laposhina, A.N. (2020). A corpus of Russian textbook materials for foreign students as an instrument of an educational content analysis. Russian Language Abroad, (6(283)), 22–28. (In Russ.)</mixed-citation><mixed-citation xml:lang="ru">Система лексических минимумов современного русского языка : 10 лексических списков : от 500 до 5000 самых важных русских слов / под ред. В.В. Морковкина. М. : Астрель, 2003. 
768 с.</mixed-citation></citation-alternatives></ref><ref id="B11"><label>11.</label><citation-alternatives><mixed-citation xml:lang="en">Laposhina, A.N., &amp; Lebedeva, M.U. (2019). Corpus approach to vocabulary selection for learning Russian as a foreign language. Slavica Helsingiensia, (52), 359–368. (In Russ.)</mixed-citation><mixed-citation xml:lang="ru">Томина Ю.А. Объективная оценка языковой трудности текстов (описание, повествование, рассуждение, доказательство) : дис. ... канд. пед. наук. М., 1985. 225 с.</mixed-citation></citation-alternatives></ref><ref id="B12"><label>12.</label><citation-alternatives><mixed-citation xml:lang="en">Laposhina, A.N., Veselovskaya, T.S., Lebedeva, M.U., &amp; Kupreshchenko, O.F. (2018). Automated text readability assessment for Russian second language learners. Dialogue 2018: Proceedings of the International Conference, 17(24), 396–406.</mixed-citation><mixed-citation xml:lang="ru">Alexander P.A., Jetton T.L. The role of importance and interest in the processing of text // Educational Psychology Review. 1996. No 8 (1). Pp. 89-121.</mixed-citation></citation-alternatives></ref><ref id="B13"><label>13.</label><citation-alternatives><mixed-citation xml:lang="en">Mikk, Ya.A. (1981). Optimizing the complexity of educational text: A help for authors and editors. Moscow: Prosveshchenie Publ. (In Russ.)</mixed-citation><mixed-citation xml:lang="ru">Chen X., Meurers D. Characterizing text difficulty with word frequencies // Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications. 2016. Vol. 11. Pp. 84-94.</mixed-citation></citation-alternatives></ref><ref id="B14"><label>14.</label><citation-alternatives><mixed-citation xml:lang="en">Miller, L.V., Politova, L.V., &amp; Rybakova, I.Ya. (2016). Once upon a time... 28 Russian lessons for beginners: Textbook. Saint Petersburg: Zlatoust Publ. (In Russ.)</mixed-citation><mixed-citation xml:lang="ru">DuBay W. The principles of readability. 
Costa Mesa, CA : Impact Information, 2004. 76 p.</mixed-citation></citation-alternatives></ref><ref id="B15"><label>15.</label><citation-alternatives><mixed-citation xml:lang="en">Morkovkin, V.V. (Ed.). (2003). The system of lexical minima of the modern Russian language: 10 lexical lists: From 500 to 5000 of the most important Russian words. Moscow: Astrel Publ. (In Russ.)</mixed-citation><mixed-citation xml:lang="ru">Graesser A.C., McNamara D.S., Cai Z., Conley M., Li H., Pennebaker J. Coh-Metrix measures text characteristics at multiple levels of language and discourse // The Elementary School Journal. 2014. No. 15 (2). Pp. 210-229</mixed-citation></citation-alternatives></ref><ref id="B16"><label>16.</label><citation-alternatives><mixed-citation xml:lang="en">Nation, P. (2006). How Large a vocabulary is needed for reading and listening? Canadian Modern Language Review, (63), 59–81.</mixed-citation><mixed-citation xml:lang="ru">Karpov N., Baranova J., Vitugin F. Single-sentence readability prediction in Russian // Proceedings of Analysis of Images, Social Networks, and Texts Conference (AIST). 2014. Vol. 3. Pp. 91-100</mixed-citation></citation-alternatives></ref><ref id="B17"><label>17.</label><citation-alternatives><mixed-citation xml:lang="en">Qian, D.D. (2002). Investigating the relationship between vocabulary knowledge and academic reading performance: An assessment perspective. Language Learning, 52(3), 513–536.</mixed-citation><mixed-citation xml:lang="ru">Keskisärkkä R., Jönsson A. Investigations of synonym replacement for Swedish // Northern European Journal of Language Technology. 2013. No 3. Pp. 41-59</mixed-citation></citation-alternatives></ref><ref id="B18"><label>18.</label><citation-alternatives><mixed-citation xml:lang="en">Reynolds, R. (2016). Insights from Russian second language readability classification: complexity-dependent training requirements, and feature evaluation of multiple categories. 
Proceedings of the 11th Workshop on the Innovative Use of NLP for Building Educational Applications, 11, 289–300.</mixed-citation><mixed-citation xml:lang="ru">Laposhina A.N., Veselovskaya T.S., Lebedeva M.U., Kupreshchenko O.F. Automated text readability assessment for Russian second language learners // Dialogue 2018 : Proceedings of the International Conference. 2018. Vol. 17. Issue 24. Pp. 396-406</mixed-citation></citation-alternatives></ref><ref id="B19"><label>19.</label><citation-alternatives><mixed-citation xml:lang="en">Sharoff, S., Kurella, S., &amp; Hartley, A. (2008). Seeking needles in the web’s haystack: Finding texts suitable for language learners. Proceedings of the 8th Teaching and Language Corpora Conference (TaLC-8) (pp. 365–370). Lisbon.</mixed-citation><mixed-citation xml:lang="ru">Nation P. How large a vocabulary is needed for reading and listening? // Canadian Modern Language Review. 2006. No 63. Pp. 59-81</mixed-citation></citation-alternatives></ref><ref id="B20"><label>20.</label><citation-alternatives><mixed-citation xml:lang="en">Sharoff, S., Umanskaya, E., &amp; Wilson, J. (2013). A frequency dictionary of Russian: Core vocabulary for learners. New York: Routledge.</mixed-citation><mixed-citation xml:lang="ru">Qian D.D. Investigating the relationship between vocabulary knowledge and academic reading performance: an assessment perspective // Language Learning. 2002. No 52 (3). Pp. 513-536</mixed-citation></citation-alternatives></ref><ref id="B21"><label>21.</label><citation-alternatives><mixed-citation xml:lang="en">To, V., &amp; Le, T. (2013). Lexical density and readability: A case study of English textbooks. Proceedings of the Australian Systemic Functional Linguistics Association Conference (October 1–3, 2013) (pp. 61–71). Melbourne.</mixed-citation><mixed-citation xml:lang="ru">Reynolds R. 
Insights from Russian second language readability classification : complexity-dependent training requirements, and feature evaluation of multiple categories // Proceedings of the 11th Workshop on the Innovative Use of NLP for Building Educational Applications. 2016. Vol. 11. Pp. 289-300</mixed-citation></citation-alternatives></ref><ref id="B22"><label>22.</label><citation-alternatives><mixed-citation xml:lang="en">Tomina, Yu.A. (1985). Objective assessment of the language difficulty of texts (description, narration, reasoning, argumentation) (Candidate dissertation, Moscow). (In Russ.)</mixed-citation><mixed-citation xml:lang="ru">Sharoff S., Kurella S., Hartley A. Seeking needles in the web’s haystack : finding texts suitable for language learners // Proceedings of the 8th Teaching and Language Corpora Conference (TaLC-8). Lisbon, 2008. Pp. 365-370</mixed-citation></citation-alternatives></ref><ref id="B23"><label>23.</label><citation-alternatives><mixed-citation xml:lang="en">Vyatyutnev, M.N. (1984). Textbook theory of Russian as a foreign language (methodological foundations). Moscow: Russkii Yazyk Publ. (In Russ.)</mixed-citation><mixed-citation xml:lang="ru">Sharoff S., Umanskaya E., Wilson J. A frequency dictionary of Russian: core vocabulary for learners. New York: Routledge, 2013. 400 p</mixed-citation></citation-alternatives></ref><ref id="B24"><label>24.</label><citation-alternatives><mixed-citation xml:lang="en">Zaliznak, A.A. (1967). Russian nominal inflection. Moscow: Nauka Publ. (In Russ.)</mixed-citation><mixed-citation xml:lang="ru">To V., Le T. Lexical density and readability : a case study of English textbooks // Proceedings of the Australian Systemic Functional Linguistics Association Conference. Melbourne, 2013. Pp. 61-71</mixed-citation></citation-alternatives></ref></ref-list></back></article>
