The words that make fake stories go viral: A corpus-based approach to analyzing Russian Covid-19 disinformation

Abstract

Since the outbreak of the Covid-19 pandemic in 2020, the spread of the new virus has been accompanied by a growing infodemic that poses a danger to Internet users. Social media and online messengers have been instrumental in making fake stories about Covid-19 go viral. The lack of an efficient instrument for classifying digital texts as true or fake remains a major challenge. Deceptive content and its specific characteristics attract the attention of many linguists, making it one of the most popular contemporary topics in corpus-based research. This paper explores the language of viral Covid-related fake stories and identifies specific linguistic features that distinguish fake stories from real (authentic) news, using quantitative and qualitative approaches to text analysis. The study draws on a self-compiled diachronic corpus of misleading Russian coronavirus-related social media posts (a target corpus of 897 texts), which were virally shared by Russian users through social media platforms and mobile messengers from March 2020 to March 2022, and on a reference corpus containing genuine materials about the virus. First, we compared the two corpora using an interpretable set of features across language levels to determine whether there is evidence of significant variation between the language of fake and real news. Then, we used frequency profiling to extract other over-represented groups of words from both corpora. Finally, we analyzed the corresponding contexts to establish whether these features can be considered linguistic trends in Russian Covid-related fake story making. The findings on the role of these over-represented groups of words in fake narratives about coronavirus demonstrate the efficiency of frequency profiling in identifying lexical patterns of the language of deception.
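
The frequency profiling step can be illustrated with a minimal sketch. The abstract does not say which keyness statistic or software was used, so the code below assumes the log-likelihood (G2) statistic that is commonly used for comparing word frequencies across two corpora. The function names `log_likelihood` and `keywords` and the toy token lists are hypothetical and serve only as illustration; they are not the authors' implementation or data.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood (G2) keyness score for one word.

    a, b: frequency of the word in the target and reference corpus
    c, d: total token counts of the target and reference corpus
    """
    # Expected frequencies if the word were equally likely in both corpora
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(target_tokens, reference_tokens, top_n=10):
    """Return words over-represented in the target corpus, highest G2 first."""
    t = Counter(target_tokens)
    r = Counter(reference_tokens)
    nt, nr = sum(t.values()), sum(r.values())
    scored = []
    for word in set(t) | set(r):
        a, b = t[word], r[word]
        # Keep only words whose relative frequency is higher in the target
        if a / nt > b / nr:
            scored.append((word, log_likelihood(a, b, nt, nr)))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Toy, invented token lists (assumption: real input would be tokenized,
# possibly lemmatized, Russian texts from the two corpora)
fake = ["urgent", "urgent", "urgent", "virus"]
real = ["virus", "virus", "study", "virus"]
top = keywords(fake, real)
```

In a real study the scores would be compared against a significance threshold, and the ranked words would then be inspected in context, which corresponds to the qualitative step described in the abstract.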

About the authors

Alina Monogarova

Pyatigorsk State University

Email: alinach12@yandex.ru
ORCID iD: 0000-0003-4098-0341

Assistant Professor in the English Language and Professional Communication Department at Pyatigorsk State University, Russia. Her research interests include corpus linguistics, text mining and text analysis, and the standardization of developing terminologies.

Pyatigorsk, Russia

Tatyana Shiryaeva

Pyatigorsk State University

Principal contact for editorial correspondence.
Email: shiryaevat@list.ru
ORCID iD: 0000-0001-5508-8407

Professor of Linguistics and Head of the English Language and Professional Communication Department at Pyatigorsk State University, Russia. She is Editor-in-Chief of the research journal Professional Communication: Top Issues of Linguistics and Teaching Methods. Her research interests focus on discourse analysis, sociocognitive linguistics with particular emphasis on professional discourse studies, the theory and practice of intercultural professional and business communication, English for specific purposes, genre analysis and pragmatics. She is the author or co-author of over 200 publications, including research articles in high-ranking journals such as Heliyon, Humanities and Social Sciences Reviews, International Journal of Arabic-English Studies, and Journal of Language and Education.

Pyatigorsk, Russia

Elena Tikhonova

MGIMO University

Email: etihonova@gmail.com
ORCID iD: 0000-0001-8252-6150

Associate Professor at the Department of Foreign Languages of MGIMO University, Moscow, Russia. She is also Deputy Editor-in-Chief of the Journal of Language and Education. Her areas of interest include discourse analysis, sociocognitive linguistics, and psycholinguistics. She conducts research in the field of English for specific purposes, genre analysis, pragmatics, and academic writing. She has authored numerous articles in high-impact international journals. She is a member and lecturer of the Association of Scientific Editors and Publishers (ASEP).

Moscow, Russia

Copyright © Monogarova A., Shiryaeva T., Tikhonova E., 2023

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.