A cognitive linguistic approach to analysis and correction of orthographic errors

Abstract

In this paper, we apply usage-based linguistic analysis to systematize the inventory of orthographic errors observed in the writing of non-native users of Russian. The data come from a longitudinal corpus (560K tokens) of non-native academic writing. Traditional spellcheckers mark errors and suggest corrections, but do not attempt to model why errors are made. Our approach makes it possible to recognize not only the errors themselves, but also their conceptual causes, which lie in misunderstandings of Russian phonotactics and morphophonology and of the way these are represented by orthographic conventions. With this linguistically based system in place, we can propose targeted grammar explanations that improve users’ command of Russian morphophonology rather than merely correcting errors. Based on errors attested in the corpus, we introduce a taxonomy of errors organized by pedagogical domain. On the basis of this taxonomy, we then create a set of mal-rules to expand an existing finite-state analyzer of Russian. The resulting morphological analyzer assigns specific error tags to wordforms that fit our taxonomy. For each error tag, we also develop an accompanying grammar explanation to help users understand why the diagnosed error occurs and how to correct it. Using our augmented analyzer, we build a web application that allows users to type or paste a text and receive detailed feedback on, and corrections of, common Russian morphophonological and orthographic errors.
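The mal-rule idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' finite-state implementation: a small set of words stands in for the analyzer's lexicon, regular expressions stand in for finite-state mal-rules, and the error tags and explanation texts are hypothetical names chosen for illustration. The rules shown encode well-known Russian spelling conventions (и rather than ы after ж, ш, ч, щ and after к, г, х; а rather than я after ч, щ).

```python
import re

# Toy lexicon standing in for a full morphological analyzer (illustrative data only).
LEXICON = {"жить", "машина", "чай", "книги"}

# Each mal-rule: (hypothetical error tag, pattern matching the erroneous spelling,
# replacement that restores the standard spelling).
MAL_RULES = [
    ("Err/Orth-i-after-husher", re.compile(r"(?<=[жшчщ])ы"), "и"),
    ("Err/Orth-i-after-velar", re.compile(r"(?<=[кгх])ы"), "и"),
    ("Err/Orth-a-after-husher", re.compile(r"(?<=[чщ])я"), "а"),
]

# Hypothetical grammar explanations keyed by error tag.
EXPLANATIONS = {
    "Err/Orth-i-after-husher": "After ж, ш, ч, щ write и, not ы.",
    "Err/Orth-i-after-velar": "After к, г, х write и, not ы.",
    "Err/Orth-a-after-husher": "After ч, щ write а, not я.",
}


def analyze(token):
    """Return (standard form or None, list of tags) for a single token."""
    word = token.lower()
    if word in LEXICON:
        return (word, [])  # recognized as a correct wordform
    for tag, pattern, replacement in MAL_RULES:
        candidate = pattern.sub(replacement, word)
        # The mal-rule applies only if undoing it yields a known wordform.
        if candidate != word and candidate in LEXICON:
            return (candidate, [tag])
    return (None, ["Unknown"])


if __name__ == "__main__":
    for token in ["жыть", "чяй", "машина"]:
        form, tags = analyze(token)
        notes = [EXPLANATIONS.get(t, "") for t in tags]
        print(token, form, tags, notes)
```

The point of the design is that each tag identifies a specific spelling convention rather than a generic "misspelled" flag, so feedback can explain the rule behind the error instead of only supplying the corrected form.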

About the authors

Robert Reynolds

UiT The Arctic University of Norway; Brigham Young University

Email: robert_reynolds@byu.edu
ORCID iD: 0000-0003-0306-087X

Assistant Research Professor in the Office of Digital Humanities

Tromsø, Norway; Provo, Utah, USA

Laura Janda

UiT The Arctic University of Norway

Email: laura.janda@uit.no
ORCID iD: 0000-0001-5047-1909

Professor of Russian in the Department of Language and Culture

Tromsø, Norway

Tore Nesset

UiT The Arctic University of Norway

Author for correspondence.
Email: tore.nesset@uit.no
ORCID iD: 0000-0003-1308-3506

Professor of Russian Linguistics in the Department of Language and Culture

Tromsø, Norway

Copyright © Reynolds R., Janda L., Nesset T., 2022

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
