Text content variables as a function of comprehension: Propositional discourse analysis

Marina I. Solnyshkina; Солнышкина Марина Ивановна; Elena V. Harkova; Харькова Елена Владимировна; Yulia N. Ebzeeva; Эбзеева Юлия Николаевна

doi:10.22363/2687-0088-35915

Authors: Solnyshkina M.I.¹, Harkova E.V.¹, Ebzeeva Y.N.²
Affiliations:
1. Kazan Federal University
2. RUDN University
Issue: Vol 27, No 4 (2023): Modern Languages and Cultures: Varieties, Functions and Ideologies in Cognitive Perspective
Pages: 938-956
Section: Articles
URL: https://journals.rudn.ru/linguistics/article/view/37237
DOI: https://doi.org/10.22363/2687-0088-35915
EDN: https://elibrary.ru/ZAJKUE
ID: 37237

Abstract

The impact of text complexity on immediate recalls and on the range of metadiscourse markers remains a research niche owing to the lack of multidisciplinary data needed to shed light on the issue. The current study aims to identify the effects of text complexity and Russian-English discourse differences on immediate text-based recalls, with respect to the amount and type of information reproduced. For this purpose we engaged 94 native Russian speakers as respondents in a text-retelling task to explore the number of propositions recalled from an opinion article and the range of discourse markers employed. The reading text and the text-based recalls were contrasted on the informative and linguistic levels. The informative complexity of the reading text was evaluated on the basis of propositional analysis, and the linguistic complexity was assessed on the basis of descriptive parameters (word and sentence length, proportion of long words), readability index, word complexity and range of metadiscourse markers. The study revealed that the complexity level of the reading text is a strong predictor of propositional recall. The comparative analysis indicated a slight decrease in the metrics of descriptive parameters. We also found that high ability readers favor superordinate propositions, recalling about 60% of them while losing over 70% of the subordinate propositions. They also tend to shift the metadiscourse patterns of the original text from interactive to more logical ones by losing hedges, emphatics and evidentials. The study furthers our understanding of cross-linguistic differences in the use of metadiscourse; its results will find application in discourse complexology and natural language processing.

Keywords

propositions, text complexity, reading comprehension, cognitive model, automatic text analyzer, natural language processing

Full Text

Introduction

Reading models acknowledged in the modern research paradigm offer interpretation of what reading involves and how reading comprehension works (van den Broek et al. 1995, Zhang 2017). Experts in the area agree that difficulty in reading is a function of the processing level required by the reading purpose and text complexity (Weir et al. 2009: 160). Although there have been multiple studies on text comprehension, little research has examined the amount of information recalled after reading. As for the measures employed to assess reading comprehension, to the best of our knowledge, they are few and include either different types of questions or recalls. Multiple choice and true/false questions are the most popular and, according to many scholars, easy to use (Crossley et al. 2014). Sharing the view on comprehension questions’ ability to provide “an indication of comprehension”, researchers also agree on their internal constraints as comprehension questions “generally query only a small number of the ideas found in a text” which can also be guessed (cf. Crossley & McNamara 2016: 2). Another important reason to resort to other means of comprehension assessment is inability of questions to reflect abstract assumptions accommodated by models of comprehension (Kintsch 1998).

The obvious alternative is recall (free, cued or serial; oral or written), which has long been advocated as an objective measure of reading comprehension. On the other hand, recall has also been widely criticized for the additional procedural constraints it imposes. The most frequently cited of these is the need to account for readers’ working memory span and speech generation strategies (cf. Chang 2006). Nonetheless, immediate recall is still considered by many cognitive scientists a reliable means of understanding the nature and depth of readers’ comprehension (Fletcher et al. 1995).

The idea of propositions as a measure of comprehension is firmly founded in numerous cognitive studies. The generally accepted theory states that written immediate recalls allow readers to freely reproduce the reading text propositions as well as elaborate extra-textual generations (Kintsch 1998, Crossley & McNamara 2016). If the experiment settings do not limit the recall time, the range and number of propositions reproduced in a text-based recall depend on readers’ linguistic and cognitive skills only (Bergman & Roediger 1999). Besides, there are no prompts provided to guess the correct answers as is the case with comprehension questions.

Our goal in this study is to assess the upper limit of high ability readers’ capacity to reproduce the information of a reading text in written text-based recalls, a research goal that has been generally neglected in previous studies on text comprehension (Aubry et al. 2021, Hickey & Gilheany 2003, Kulik 1992). This approach allows us to address the following research questions: 1. How much of the original reading text can high ability readers of English as a foreign language (further EFL readers) reproduce in immediate written recalls? 2. What patterns of change emerge when the reading text is conveyed in a recall? 3. Do Russian EFL readers render the discourse markers of the original English text, or do they tend to omit or substitute them? We also test two hypotheses: (1) high ability C2^[1] EFL readers will recall about 50% of the superordinate propositions of the reading text provided their proficiency level corresponds to the reading text complexity; (2) the propositions reproduced in written text-based recalls carry predominantly factual information rather than subjective claims. These hypotheses relate largely to the ‘extraction’ and ‘attribution’ strategies of respondents (Novikov 2007) as well as to the possible facilitating effects of differences in Russian and English expository discourse patterns.

Literature review: reading comprehension and text-based recalls

Experimental results in studies aimed at revealing the text variables that facilitate comprehension vary tremendously, reflecting differences in datasets, participants and settings. Many researchers approach the problems of reading comprehension and written text-based recalls by comparing texts of different complexity and readers of varying ability. The majority of these studies look for evidence that respondents differ in their ability to identify and reproduce text-organizing structures, although they traditionally focus on native readers’ abilities in text comprehension. For example, Taylor (1980) argues that poor readers are less likely than good readers to organize their recalls according to the structure of the original text. Discourse markers have been shown to facilitate comprehension (Irwin 1980) only if the reading text complexity corresponds to readers’ abilities (Spyridakis & Standal 1987).

A number of recent studies have also examined the effects of text complexity on EFL readers’ text comprehension and recalls (Crossley & McNamara 2016, Crossley et al. 2014, Kim et al. 2018). S. Crossley and D. McNamara (2016) contrasted the text-retelling performance of EFL readers to confirm the hypothesis that if a text corresponds to readers’ linguistic proficiency, more propositions are reproduced and elaborated. However, Russian EFL readers have not yet been involved in similar studies of propositional recall and the metadiscourse model as a function of text complexity and reader-text alignment.

Employing an algorithm acknowledged in the field of discourse complexity, in the current study we assess and distinguish between the linguistic and informational complexity of a reading text (Bulté & Housen 2012). Linguistic complexity manifests itself in parameters on five language levels, i.e. phonological (number of syllables, etc.), morphological (parts of speech ratio, grammatical category ratios, etc.), lexical (frequency, lexical diversity and density), syntactic (sentence length, distance to the main verb, etc.) and discourse (referential cohesion, deep cohesion, etc.) (Gatiyatullina et al. 2020, Solovyev et al. 2022). Informative (content), cognitive or propositional complexity is traditionally viewed as the organization of constructs and their similarity (Burleson & Caplan 1998), or as the number of propositions or idea units which an interlocutor encodes in a given language task to convey a certain message content (Ellis & Barkhuizen 2005, Bulté & Housen 2012). Researchers illustrate this by contrasting two EFL writers: if one generated 30 propositions or idea units while the other managed to produce only 15, the propositional complexity of the first writer is higher than that of the second (Bulté & Housen 2012: 24). In the modern research paradigm it is estimated as a function of (1) the number of propositions per text, (2) propositional density, i.e. P-density, or (3) the number of new concepts per proposition (Chall 1999, Fletcher 1981, Vipond 1980).

The term and notion of ‘propositions’ once borrowed into psycholinguistic studies from Fillmore’s case grammar (2002) have since been viewed as units of text comprehension and cognition. The main verb of the clause and all its arguments are considered as one superordinate or main proposition while additional modifying elements constitute subordinate or additional propositions (Fletcher 1981, Kintsch 1998, Vipond 1980). Experts in the area argue that superordinate propositions are better recalled and longer stored in people’s memory than structurally subordinate propositions (Ziafar & Namaziandost 2020).

Another important difference in text parameters is that between metadiscourse and propositional information: metadiscourse is concerned with the organization and stance of the writer (Hyland 2004: 109) while propositional information is “information relating to the world beyond the text itself” (Halliday 1994: 70). Vande Kopple argues that “many discourses have at least two levels. On one level, we supply information about the subject of our text. On this level, we expand propositional content. On the other level, the level of metadiscourse, we do not add propositional material but help our receivers organize, classify, interpret, evaluate and react to such material. Metadiscourse, therefore, is discourse about discourse or communication about communication” (1985:83).

Employing the concept of proximity, which embodies the idea of interaction and occurs when authors establish mutual interaction through rhetorical features (Alipour & Jahanbin 2020: 799), and Hyland’s definition of metadiscourse as “discourse about discourse” (Hyland 2005, 2010), researchers divide metadiscourse markers into two categories: interpersonal and interactional (Waller 2015). These are further subdivided into frame markers including logical connectives (refer to discourse acts, sequences and stages), transitional markers (express relations between clauses), code glosses (elaborate propositional meaning), evidential markers (refer to information in other texts), endophoric markers (refer to information in other parts of the text), attitude markers (express the writer’s attitude toward the propositional information), boosters (emphasize certainty and close dialogue), hedges (withhold comment and open dialogue), engagement markers (explicitly build a relationship with the reader) and relational markers or self-mention (explicitly refer to the writer) (Hyland 2004, 2005, 2010). Metadiscourse markers are used to present authorial claims, to express a perspective on authorial statements, and to enter into a dialogue with the reader (Hyland 1996, Aull & Lancaster 2014, Alipour & Jahanbin 2020, Bolsunovskaya et al. 2015, Boginskaya 2022). They “imply trustworthiness and concerns of addressees” (Alipour & Jahanbin 2020).

Participants, materials and methods

Participants. Ninety-four university students (13 males and 81 females), all native Russian speakers majoring in Education and English as a Foreign Language with A2-C2 (CEFR) levels of proficiency, volunteered to participate in the research and served as the experiment subjects. With each participant's written permission, we obtained their EFL scores for the previous semester, thus defining their EFL proficiency. Based on their composite score, we employed a median split to form three groups of participants: high (C2 proficiency level or above), average (B1) and low ability (A2 or below). Only the high ability students (n = 10) with C2 proficiency level were involved in the current research.

Dataset. The dataset for the study comprises (1) the article “Why Your Kid’s Bad Behavior May Be a Good Thing” from The New York Times online magazine (Moyer 2021), of 332 tokens, which we used as the reading text; and (2) ten recalls of the article with a total size of 1338 tokens. The choice of the text was not random: we selected a text expected to match the interests of the participants, i.e. third-year students majoring in Education and English as pre-service teachers. Relying on Teun A. van Dijk and Walter Kintsch’s view that “persons who understand real events or speech events are able to construct a mental representation, and especially a meaningful representation, only if they have more general knowledge about such events” (1983: 17), we assume that the experiment subjects are familiar with the main idea of the text, i.e. parenting, and the professional vocabulary used in it.

The linguistic complexity of the reading text was determined with the help of the text analyzer TextInspector (textinspector.com) as C2, thus matching the subjects’ reading proficiency. TextInspector provides metrics of numerous text parameters and matches them with CEFR proficiency levels (Table 1 below). These metrics are validated as statistically significant in distinguishing between different reading levels, and the TextInspector developers argue that they ensure high reliability of the scores. The set of descriptive text metrics comprises the following: average syllables per word, average syllables per sentence, average words per sentence, syllables per 100 words, and percentage of words with more than 2 syllables (see Table 1 below). The readability level of the reading text, identified at 12.09 FKGL, indicates that the text can be understood by an average student with 12 years of formal schooling (see more in Solnyshkina et al. 2022).
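The reported readability score can be checked against the descriptive metrics. The sketch below assumes the standard Flesch-Kincaid Grade Level formula; how TextInspector computes the index internally is not stated in the source, and the function name is ours. The inputs are the reading-text values from Table 1.

```python
def flesch_kincaid_grade(words_per_sentence: float, syllables_per_word: float) -> float:
    """Standard Flesch-Kincaid Grade Level formula (assumed, not confirmed by the source)."""
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Reading-text metrics as reported in Table 1.
words_per_sentence = 22.13
syllables_per_word = 161.45 / 100  # derived from "syllables per 100 words"

grade = flesch_kincaid_grade(words_per_sentence, syllables_per_word)
print(round(grade, 2))  # 12.09
```

Under this assumption the two descriptive metrics reproduce the reported grade of 12.09, which suggests that the readability index adds no information beyond sentence length and word length.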

The informational (content) complexity of the reading text was determined on the basis of propositional analysis as the number of ideas expressed in the text. In effect, it reflects the amount of information, measured in propositions, communicated by the author to the interlocutor (Smolik et al. 2016). Propositional analysis, validated in numerous studies (cf. Kintsch 1998, Embretson & Wetzel 1987, Yus 2018, Korovina 2020), implies identifying predicates and assigning semantic role labels to their arguments. We demonstrated the algorithm and stages of propositional analysis in our previous research (see Petrova et al. 2022, Petrova & Solnyshkina 2021), in which, in full accordance with the modern paradigm, we distinguish between superordinate and subordinate propositions (Waters 1983).

Table 1. Text Linguistic Complexity: Why Your Kid’s Bad Behavior May Be a Good Thing

Types of parameters	Parameter	Metric	Proficiency level
Descriptive	Average syllables per word	1.61	C2+
	Average syllables per sentence	35.73	C2
	Average words per sentence	22.13	C2+
	Syllables per 100 words	161.45	C2+
	Words with more than 2 syllables %	13.86	C2
Readability	Flesch Kincaid Reading Grade² (FKGL)	12.09	C2
Lexical Sophistication: English Vocabulary Profile,% of words (types)	A1	84 (45.41%)	C2
	A2	20 (10.81%)
	B1	31 (16.76%)
	B2	16 (8.65%)
	C1	9 (4.86%)
	C2	5 (2.70%)
Metadiscourse	% of all Metadiscourse Markers (types) in the text	12.97	C1+
Metadiscourse	% of all Metadiscourse Markers (tokens) in the text	14.55	C1

E.g. These parents set strict limits, but they are also warm and respectful with their children and sometimes willing to negotiate.

The sentence above contains six superordinate propositions referred to the AGENT parents and three subordinate propositions. Superordinate propositions are nominated with verbs (set), verbal nouns (limits) and adjectives (respectful, warm). Comprehension of the clause (but they are also warm…) is ensured by the anaphoric referential cohesion of the pronoun they and the antecedent parents.

PROP 1 (superordinate): set (These parents)

PROP 2 (superordinate): limits (parents); PROP 2_1 (subordinate): strict (MOD)

PROP 3 (superordinate): warm (parents)

PROP 4 (superordinate): respectful (parents); PROP 4_1 (subordinate): children (PATIENT)

PROP 5 (superordinate): will negotiate (parents); PROP 5_1 (subordinate): sometimes (TIME)

PROP 6 (superordinate): expectations (parents)
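The propositional breakdown above can be held in a small data structure so that superordinate and subordinate propositions are counted mechanically. This is only an illustrative sketch: the class and field names are our own assumptions, and the entries simply transcribe the listing above.

```python
from dataclasses import dataclass, field


@dataclass
class Proposition:
    """A superordinate proposition with its attached subordinate propositions."""
    predicate: str
    argument: str
    # Each subordinate proposition is stored as a (content, semantic role) pair.
    subordinate: list = field(default_factory=list)


# Transcription of the example sentence's analysis (PROP 1 to PROP 6).
EXAMPLE = [
    Proposition("set", "parents"),
    Proposition("limits", "parents", [("strict", "MOD")]),
    Proposition("warm", "parents"),
    Proposition("respectful", "parents", [("children", "PATIENT")]),
    Proposition("will negotiate", "parents", [("sometimes", "TIME")]),
    Proposition("expectations", "parents"),
]


def count_propositions(props):
    """Return (number of superordinate, number of subordinate) propositions."""
    superordinate = len(props)
    subordinate = sum(len(p.subordinate) for p in props)
    return superordinate, subordinate


print(count_propositions(EXAMPLE))  # (6, 3)
```

The counts agree with the six superordinate and three subordinate propositions stated for this sentence.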

Propositional analysis was conducted for all the sentences of the reading text; the results, presented in tabular format, contain the number of superordinate and subordinate propositions in each sentence (see Table 2 below as an illustration).

Propositions of the reading text were scored independently by two professional linguists, experts in the area of propositional analysis with experience in identifying semantic roles in previous research. In scoring the text, one point was given for each correct proposition. A total of 98 points for superordinate propositions and 108 subordinate propositions were identified. The number and type of the propositions identified in the reading text by each expert were later compared and the correlation revealed between the two experts was 0.93 which indicates a very strong relationship.
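Agreement between two raters of this kind can be quantified with a correlation coefficient. The source does not name the statistic used, so the sketch below assumes Pearson's r, and the per-sentence counts are hypothetical values chosen only to show the calculation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sentence proposition counts from two raters (not study data).
rater_a = [4, 1, 2, 1, 3, 3]
rater_b = [4, 1, 2, 2, 3, 3]

print(round(pearson(rater_a, rater_b), 2))
```

A value close to 1 means that the raters rank the sentences almost identically. It does not by itself show that they assign identical counts, which is why agreement statistics are sometimes reported alongside correlations.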

Table 2. Propositional analysis of the Reading text (part)

Text segment	Number of superordinate propositions	Number of subordinate propositions
One of Dr. Loeb’s recent studies, which followed kids from ages 13 to 32, found that	4	3
children whose parents were psychologically controlling	1	5
[children] were less academically successful and [children were] less liked by their peers in adolescence	2	4
compared with kids whose parents were not psychologically controlling.	1	5
As adults, they were also less likely to be in healthy romantic relationships.	3	3
Other research has linked parental psychological control with antisocial behavior and anxiety in kids.	3	10

To further contrast the reading text and the text-based recalls, we also computed the metadiscourse profile (Fig. 1, 3 below), i.e. the rhetorical aspect embodied by diverse markers that enforce writer-reader interaction. TextInspector (textinspector.com) elicits, categorizes and counts metadiscourse markers of 13 classes, including frame markers (announce goals, label stages, topic shifts, sequencing), code glosses (called, known as, such as), endophorics, hedges (certain amount, likely, may, might, sometimes), logical connectives (also, and, but, or, so), relational markers (your), attitude markers, emphatics or boosters (certainly, indeed, should, sure), evidentials (found that, research/studies show/s, said, suggests) and person markers (Hyland 2005).
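To make the idea of a metadiscourse profile concrete, the sketch below counts markers from a few of the classes listed above in the sample sentence about authoritative parents. It is a deliberately naive dictionary lookup under our own assumptions, not TextInspector's algorithm; the marker lists contain only the examples quoted in this article, and the function name is hypothetical.

```python
import re
from collections import Counter

# Example markers quoted in the text; real marker inventories are much larger.
MARKERS = {
    "hedges": ["certain amount", "likely", "may", "might", "sometimes"],
    "logical connectives": ["also", "and", "but", "or", "so"],
    "emphatics": ["certainly", "indeed", "should", "sure"],
    "code glosses": ["called", "known as", "such as"],
}


def marker_profile(text):
    """Count whole-word (or whole-phrase) marker occurrences per class."""
    lowered = text.lower()
    counts = Counter()
    for marker_class, phrases in MARKERS.items():
        for phrase in phrases:
            pattern = r"\b" + re.escape(phrase) + r"\b"
            counts[marker_class] += len(re.findall(pattern, lowered))
    return counts


sentence = ("These parents set strict limits, but they are also warm and "
            "respectful with their children and sometimes willing to negotiate.")
profile = marker_profile(sentence)
print(profile["logical connectives"], profile["hedges"])  # 4 1
```

A lookup of this kind cannot tell whether a word such as "may" or "so" is actually functioning as a marker in context, so its counts are at best an upper bound on what a human annotator would accept.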

Experiment Procedure. The experiment was conducted in four stages and lasted for about 90 minutes. On Stage 1, participants were provided with a general overview of the study and their role in it. They were informed that they would be asked to read a text for comprehension and written recall. On Stage 2, before involving respondents in the experiment, we conducted field testing to verify that the participants were unfamiliar with the article topic. Based on the answers to three questions on the topic of the reading text, it was concluded that the respondents had no prior knowledge of the subject. On Stage 3, the subjects (a) were instructed to read and (b) read through the text twice in the free reading-time condition. The reading time did not exceed 10 min. On Stage 4, the participants were provided with individual laptops and wrote their recalls. The text-based recalls generated by the participants were marked 1G, 2G, 3G, 7B, 1A, 2A, 5C, 3B, 8B, 10B.

Analysis

On completion of the experiment, we conducted three levels of recall analysis: holistic, parametric and propositional. The holistic analysis was conducted by two experts separately to assess each recall’s content conformity with the reading text. As all recalls of the C2 participants contained the macroproposition “Authoritative approach as a balance between harsh and permissive ways is an effective kind of parenting”, they were found eligible to enter the next stages of the analysis.

As part of the parametric analysis we evaluated readability, descriptive, lexical and metadiscourse parameters in each recall. With the help of TextInspector, we obtained values of the following metrics: average syllables per word, average syllables per sentence, average words per sentence, syllables per 100 words, percentage of words with more than 2 syllables, Flesch Kincaid reading grade, CEFR level, % of all metadiscourse markers (types) in the text, and % of all metadiscourse markers (tokens) in the text.

After identifying mean values of all the parameters in the recalls we contrasted them with those in the reading text on the four levels: descriptive, readability, vocabulary profiles and metadiscourse (Table 3).

Table 3. Linguistic parameters: the reading text vs recalls (mean)

Type of parameter	Parameter	Reading text (metric)	Recalls (mean metric)
Descriptive	Average syllables per word	1.61	1.51
	Average syllables per sentence	35.73	29.71
	Average words per sentence	22.13	19.71
	Syllables per 100 words	161.45	150.72
	Words with more than 2 syllables %	13.86	10.87
Readability	Flesch Kincaid Reading Grade	12.09	9.88
Vocabulary Profile	CEFR level	C2	C1
Metadiscourse	% of all Metadiscourse Markers (types) in the text	12.97	12.22
Metadiscourse	% of all Metadiscourse Markers (tokens) in the text	14.55	9.92

The metrics in Table 3 indicate that C2 EFL readers demonstrate high lexical and syntactic abilities in written recalls, only slightly decreasing the complexity level, i.e. from C2 to C1. Linguistic parameters, including metadiscourse numbers, do not differ significantly.
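The size of each shift between the reading text and the recalls can be expressed as a relative change. The sketch below is a simple illustration using the values reported in the table of linguistic parameters above; the dictionary and function names are our own.

```python
# (reading text, recalls mean) pairs as reported in the table of linguistic parameters.
PARAMETERS = {
    "Average syllables per word": (1.61, 1.51),
    "Average syllables per sentence": (35.73, 29.71),
    "Average words per sentence": (22.13, 19.71),
    "Syllables per 100 words": (161.45, 150.72),
    "Words with more than 2 syllables %": (13.86, 10.87),
    "Flesch Kincaid Reading Grade": (12.09, 9.88),
    "Metadiscourse markers (types) %": (12.97, 12.22),
    "Metadiscourse markers (tokens) %": (14.55, 9.92),
}


def relative_change(reading, recall):
    """Percentage change of the recall mean relative to the reading text."""
    return (recall - reading) / reading * 100


for name, (reading, recall) in PARAMETERS.items():
    print(f"{name}: {relative_change(reading, recall):+.1f}%")
```

Relative changes are descriptive only. With ten recalls and a single reading text they say nothing about statistical significance, which would require the per-recall values.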

As for the vocabulary profiles (see Figure 1) measured with TextInspector, we observe that the shares of low-level vocabulary (A1-B1) increased while the high-level shares (B2-C2) decreased, lowering the linguistic complexity level by one step, from C2 to C1. For the comparison we used types (distinct word forms), not tokens (individual occurrences of words), thus ensuring a better picture of vocabulary frequencies.

Figure 1. Vocabulary Profile: Reading text vs Recalls (mean)

The most interesting dynamics are observed in the range and number of discourse markers (see Figure 2). The bar chart demonstrates a clear increase in the share of logical connectors, from 2.7% (types) in the reading text to 4.44% in the recalls (mean), which is the highest share among the marker classes in the recalls. Code gloss types also increase, from 1.62% to 2.22%. Hedges, evidentials and emphatics, in contrast, show a strong tendency to decrease: the share of hedges and emphatics more than halved, dropping from over 2.6% in the reading text to 1.11% in the recalls, and the share of evidentials decreased by about one percentage point, from 3.24% to 2.22%.

Figure 2. Discourse markers: Reading text vs Reading Text‐based Written Recalls (mean)

Noteworthy is that evidentials, which are the most frequent type of discourse marker in the reading text, dropped markedly in the recalls. All the above testifies to the fact that even high ability (C2) Russian EFL readers transfer their native metadiscourse pattern to the recalls and shift the original metadiscourse model of the text, losing emphatics (certainly, indeed, should, sure), hedges (certain amount, likely, may, might, sometimes) and evidentials (found that, research, said, shows, studies, suggests) while increasing the number of code glosses (known as, such as) and logical connectors (also, and, but, or, so). These two classes are also confirmed to be much more frequent in Russian academic discourse (Blinova 2019). Two more points about the written recalls are crucial: (1) logical connectors, while acquiring a much higher frequency, narrowed in range from 12 to 5; (2) Russian EFL readers, as representatives of a “reader-responsible” culture (see Hinds 1987), also tend to add discourse markers of a sequencing type (firstly, secondly), thus increasing the logical organization of their recalls.

Propositional analysis. The propositions of all written recalls were also scored independently by the two raters who had previously measured the propositional complexity of the reading text and carried out the holistic assessment of the text-based recalls. The number and range of the propositions produced by each subject were later compared with those in the reading text. The correlation between the two raters was 0.87, which indicates relatively strong agreement.

While evaluating propositions in each recall we scored only text-based propositions; elaborations and distorted inferences were not taken into account. One point was given for each correctly recalled or inferred proposition, so that a maximum of 98 points (100%) for superordinate propositions and 108 points (100%) for subordinate propositions was possible. The absolute and relative (%) scores of the propositional recalls are presented in Table 4 below.

Table 4. Propositional recall: absolute and relative (%) indices of informational complexity

Code	Superordinate propositions recalled (absolute, out of 98)	Superordinate propositions recalled (%)	Subordinate propositions recalled (absolute, out of 108)	Subordinate propositions recalled (%)
1A	41	41.8%	20	18.5%
3B	41	41.8%	24	22.2%
10B	45	45.9%	23	21.2%
2A	47	47.9%	27	25.0%
2G	60	61.2%	31	28.7%
1G	66	67.3%	32	29.6%
8B	68	69.4%	38	35.1%
5C	70	69.4%	36	33.3%
7B	71	71.4%	42	38.9%
3G	73	74.5%	35	32.4%
Mean		59.0%		28.5%

As demonstrated in Table 4, the reconstructed texts contain on average about 60% of the superordinate propositions and about 30% of the subordinate propositions.
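The relative scores follow from the absolute counts and the two maxima (98 superordinate and 108 subordinate propositions). The sketch below recomputes them from the counts as printed in the table above; the variable and function names are ours. Recomputed values can differ from the printed percentages by a few tenths, and the rows for 5C and 7B, and with them the superordinate mean, do not reproduce exactly from the printed counts.

```python
TOTAL_SUPERORDINATE = 98
TOTAL_SUBORDINATE = 108

# code: (superordinate recalled, subordinate recalled), as printed in the table.
RECALLS = {
    "1A": (41, 20), "3B": (41, 24), "10B": (45, 23), "2A": (47, 27),
    "2G": (60, 31), "1G": (66, 32), "8B": (68, 38), "5C": (70, 36),
    "7B": (71, 42), "3G": (73, 35),
}


def percent(count, total):
    """Share of recalled propositions, in percent, rounded to one decimal."""
    return round(100 * count / total, 1)


for code, (sup, sub) in RECALLS.items():
    print(code, percent(sup, TOTAL_SUPERORDINATE), percent(sub, TOTAL_SUBORDINATE))

mean_sup = sum(sup for sup, _ in RECALLS.values()) / len(RECALLS)
mean_sub = sum(sub for _, sub in RECALLS.values()) / len(RECALLS)
print("Mean", percent(mean_sup, TOTAL_SUPERORDINATE), percent(mean_sub, TOTAL_SUBORDINATE))
```

Either way, the recomputed means stay close to the rounded figures used in the discussion: roughly 60% of superordinate and 30% of subordinate propositions.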

Discussion

This study compares a reading text and high ability EFL readers’ performance on immediate written recall in order to determine the number of propositions and the range of discourse markers reproduced. We demonstrate how reader-text complexity alignment, although contributing to the amount of information reproduced in the immediate recalls of Russian readers, does not facilitate reconstructing the metadiscourse structure of the original English text (or constructing a similar one). Of the ninety-four Russian university students who initially participated in this study, we analyzed and compared the response patterns of ten participants whose C2 (CEFR) proficiency in EFL was confirmed by their previous semester scores. For the data analysis, we generated descriptive statistics of the reading text as well as of each text-based recall and used four measures of comparison, i.e. readability, vocabulary profile, discourse marker range and the ratio of the propositions recalled correctly, i.e. propositional recall. The results showed that an average high ability Russian reader recalls about 60% of the superordinate and 30% of the subordinate propositions of the English reading text, thus exercising his/her ability to discriminate and select communicatively relevant information. In their recalls readers lose more subordinate (about 2/3) than superordinate (about 2/5) propositions of the reading text. Another finding indicates that the propositions reproduced in the recalls carry predominantly factual information rather than the subjective claims which the author of the original reading text expressed through numerous metadiscourse markers, i.e. hedges, emphatics and evidentials. These data suggest that even when the reading text complexity matches the language proficiency of Russian C2 EFL readers, they tend to focus mostly on textual rather than metadiscourse information.

The most obvious causes of the identified differences in the distribution and range of discourse markers between the reading text and the recalls are either disparities between Russian and English discourse patterns or readers’ individual lack of competence in English metadiscourse features. We would also like to point out that by resorting predominantly to the “extraction” strategy and “endo-vocabulary” in their recalls, readers in fact demonstrate their inability (or reluctance) to apply the strategy of “attribution”, which would qualify them as capable of shifting semantic dominants and widening the area of semantic cover. Decoding and recalling the content is not followed or accompanied by engagement of background knowledge or by establishing connections between parts of the text or pieces of information extracted from the reading text. Neither did we observe the inclusion of elements of an evaluative or emotional nature in the recalls. Thus, we can say that it is mainly the content of the text that is reproduced. A deeper understanding of intra-textual links corresponds to the stage of concept formation, which implies the involvement of emotional, evaluative and subjective components.

As mentioned earlier, our findings provide strong evidence of the subjects’ ability to comprehend and immediately reproduce up to 60% of the reading text propositions if its complexity is fully aligned with readers’ proficiency. The results we obtained, although different, are not inconsistent with the findings of Bergman & Roediger (1999). In their study, Bergman & Roediger (1999) registered the number of accurately reproduced propositions in three settings: immediately, one week and six months after reading the text. The researchers documented that 26% of the propositions retrieved in immediate recalls were accurate. The difference from the number of propositions recalled in our experiment may have at least two causes: Bergman & Roediger (1999) (1) did not assess the subjects’ language proficiency and (2) involved all undergraduate students who volunteered to participate in the experiment. Thus, the results they report are averages for the general population, whereas we focus on high ability students only.

In light of the balance between text complexity and readers’ competence, the findings on the differences in the range and number of metadiscourse markers were less expected. Nonetheless, they are consistent with the conclusion made by A. Kotelnikova (2020) in her research on EFL readers’ comprehension strategies. Her research was based on A. Novikov’s theory of compressing information (2007), according to which readers resort to either an ‘extraction’ or an ‘attribution’ strategy in text-based recalls. When implementing the ‘extraction’ strategy, a subject delivers the text content using vocabulary ‘extracted’ from the reading text. This vocabulary is referred to as ‘endo-lexis’. Metadiscourse or, in Novikov’s words, ‘some external information in the text’, on the other hand, is dealt with, if encountered, by means of the ‘attribution’ strategy. Attribution here means that, based on his/her individual experience, the reader does not reproduce but generates meanings, ‘attributing’ his/her own experience and using his/her own ‘exo-lexis’, i.e. vocabulary absent from the text. Hence, although C2 readers are generally assumed to have mastered all types of reading skills and to be able to use different types of linking words, i.e. metadiscourse markers, in their speech, the recalls we collected show readers’ preference for reproducing the text content and “attributing” senses. Our expectation that interactional markers would be recalled in a similar way to factual information was not upheld by our results.

Another possible reason for the above is the difference in cultures described by John Hinds’ (1987) division of national writing cultures into “writer-responsible” and “reader-responsible”. “In a writer-responsible culture like English”, for example, “metadiscourse markers are used to guide readers through a text” (Adel 2006: 149), while in a reader-responsible culture like Russian (see Blinova 2019), “connections between various parts of a text are more commonly left implicit”. The findings indicate that even high ability Russian speakers tend to transfer Russian patterns of organizing their ideas into EFL writing. The differences in recall patterns affecting participants’ speech production result in (a) differences in Russian and English discourse and (b) a reduction in the variety and range of discourse markers, a highly probable component of the simplification of the metadiscourse model in text-based recalls of all types. The findings suggest the importance of raising readers’ awareness of the way metadiscourse markers frame their speech production. This gains particular importance in light of the fact that the role of metadiscourse is increasingly “recognized for natural language processing applications like text-mining and information extraction” (Sandor 2007: 97).

Conclusion

The current research explores the effect of text complexity on high proficiency readers’ ability to recall propositional content and reproduce metadiscourse markers of a reading text. It focuses on C2 EFL Russian readers, thus dealing with a specific group of participants whose immediate text-based recalls, on the one hand, belong to an under-researched area but, on the other, attract growing interest in the context of studying the aptitude of talented students.

The results are confirmed for high ability EFL Russian students in situations where their reading proficiency matches the C2 (CEFR) complexity of the reading text. Validation of the results (considered a research prospect) implies widening the demographic range and increasing the number of respondents. Any other type of participants, including average and low proficiency students, as well as reader-text mismatch, requires further investigation. Further research in the area could focus on different types of texts, such as short stories and reports, to investigate whether they generate different discourse patterns.

The results may contribute to research on the semantic variables of recall strategies and text comprehension. They can also be applied in further research on the impact of text complexity on the amount and range of metadiscursive elements generated in written recall. As the problem of text-reader alignment in EFL practice still remains a research niche, primarily due to its multidisciplinarity and the need for joint efforts of linguists, cognitive scientists and psychologists, our findings are relevant to applied linguistics, education and speech generation studies. A complete solution to the problem (if possible) may enable researchers and practitioners to determine optimal linguistic and cognitive factors of text comprehension and recall. So far, the findings contribute to our understanding of differences in metadiscourse strategies in English and Russian discourses.

¹ The C2 level is the highest level of proficiency in the Common European Framework of Reference (CEFR) and is viewed as that of a near-native speaker.

² The FKGL formula implemented in TextInspector estimates the number of years of formal schooling generally required to comprehend a text (Teunyev et al. 2022): $\mathrm{FKGL} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59$
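
As an illustration of the readability formula in footnote 2, the following minimal sketch computes the Flesch-Kincaid Grade Level from three raw counts. The function name and the sample counts are hypothetical and are not taken from the study; how TextInspector itself counts words, sentences and syllables is not described here, so the counts are assumed to be supplied by the caller.

```python
def flesch_kincaid_grade(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Flesch-Kincaid Grade Level from raw counts (illustrative helper, not TextInspector's code)."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("total_words and total_sentences must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical counts: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(total_words=100, total_sentences=5, total_syllables=150)
print(round(grade, 2))  # prints 9.91
```

With these made-up counts the text would be rated at roughly the tenth year of schooling; longer sentences or longer words raise the estimate.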

About the authors

Marina I. Solnyshkina

Kazan Federal University

Author for correspondence.
Email: mesoln@yandex.ru
ORCID iD: 0000-0003-1885-3039

Doctor Habil. of Philology, Professor of the Department of Theory and Practice of Teaching Foreign Languages, Head of “Text Analytics” Research Lab, Institute of Philology and Intercultural Communication of Kazan Federal University, Kazan, Russia. Her research interests include linguistic complexology, corpus linguistics, and lexicography.

Kazan, Russia

Elena V. Harkova

Kazan Federal University

Email: halenka@rambler.ru
ORCID iD: 0000-0001-7582-6622

Doctor of Philology, Associate Professor of the Department of Theory and Practice of Teaching Foreign Languages at the Institute of Philology and Intercultural Communication of Kazan Federal University, Kazan, Russia. Her research interests embrace intercultural communication, theory and practice of translation, teaching English, and lexicography.

Kazan, Russia

Yulia N. Ebzeeva

RUDN University

Email: ebzeeva-jn@rudn.ru
ORCID iD: 0000-0002-0043-7590

Doctor of Social Sciences, First Vice-Rector - Vice Rector for Education and Head of Foreign Language Department, RUDN University. She is a member of the international scientific committee of QS. She actively participates in international conferences and forums, has spoken at the Council of Europe, and has repeatedly acted as an expert on linguistic and migration issues. Her research interests include French lexicology and stylistics, translation studies, intercultural communication, sociolinguistics, migration studies and educational policy.

Moscow, Russia

References

Adel, Annelie. 2006. Metadiscourse in L1 and L2 English. Philadelphia: John Benjamins.
Alipour, Mohammad & Parastoo Jahanbin. 2020. A comparative study of proximity in Iranian and American newspaper editorials. Russian Journal of Linguistics 24 (4). 796-815. https://doi.org/10.22363/2687-0088-2020-24-4-796-815
Aubry, Alexandre, Corentin Gonthier & Béatrice Bourdin. 2021. Explaining the high working memory capacity of gifted children: Contributions of processing skills and executive control. Acta Psychologica 218, 103358. 1-12. https://doi.org/10.1016/j.actpsy.2021.103358
Aull, Laura & Zak Lancaster. 2014. Linguistic markers of stance in early and advanced academic writing: A corpus-based comparison. Written Communication 31 (2). 151-183.
Bergman, Erik T. & Henry L. Roediger. 1999. Can Bartlett’s repeated reproduction experiments be replicated? Memory & Cognition 27. 937-947. https://doi.org/10.3758/BF03201224
Blinova, Olga. 2019. Teaching academic writing at university level in Russia through massive open online courses: National traditions and global challenges. Proceedings of INTED 2019 Conference 11th-13th March 2019, Valencia, Spain. 6085-6090. https://doi.org/10.2139/ssrn.3504163
Boginskaya, Olga. 2022. Functional categories of hedges: A diachronic study of Russian research article abstracts. Russian Journal of Linguistics 26 (3). 645-667. https://doi.org/10.22363/2687-0088-30017
Bolsunovskaya, Lyudmila, Yulia Zeremskaya & Natalia Dubrovskaya. 2015. Types of discourse markers in Russian and English research papers on geology, oil and gas. Tomsk State Pedagogical University Bulletin 4 (157). 117-123. (In Russ.).
Bulté, Bram & Alex Housen. 2012. Defining and operationalising L2 complexity. In Alex Housen, Folkert Kuiken & Ineke Vedder (eds.), Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA, 21-46. Amsterdam: John Benjamins. https://doi.org/10.1075/lllt.32.02bul
Burleson, Brant R. & Scott E. Caplan. 1998. Cognitive complexity. In James C. McCroskey, John A. Daly, Marcelo M. Marti & Michael J. Beatty (eds.), Communication and personality: Trait perspectives, 230-286. Cresskill, NJ: Hampton Press.
Chang, Yuh F. 2006. On the use of the immediate recall task as a measure of second language reading comprehension. Language Testing 23 (4). 520-543. https://doi.org/10.1191/0265532206lt340
Chall, Jeanne. 1999. Varying approaches to readability measurement. Revue Québécoise de Linguistique. Quebec Journal of Linguistics 25 (1). 23-40. https://doi.org/10.7202/603125ar
Crossley, Scott, Hae Sung Yang & Danielle McNamara. 2014. What’s so simple about simplified texts? A computational and psycholinguistic investigation of text comprehension and text processing. Reading in a Foreign Language 26. 92-113.
Crossley, Scott & Danielle McNamara. 2016. Text-based recall and extra-textual generations resulting from simplified and authentic texts. Reading in a Foreign Language 28 (1). 1-19.
Ellis, Rod & Gary Barkhuizen. 2005. Analysing Learner Language. Oxford: OUP.
Embretson, Susan & Douglas Wetzel. 1987. Component latent trait models for paragraph comprehension tests. Applied Psychological Measurement 11 (2). 175-193. https://doi.org/10.1177/014662168701100207
Fillmore, Charles. 2002. Form and Meaning in Language, Vol. 1: Papers on Semantic Roles. Center for the Study of Language and Information.
Fletcher, Charles. 1981. Short-term memory processes in text comprehension. Journal of Verbal Learning & Verbal Behavior 20 (5). 564-574. https://doi.org/10.1016/s0022-5371(81)90183-3
Fletcher, Paul, Chris D. Frith, Paul Grasby, Tim Shallice, Richard Frackowiak & Raymond Dolan. 1995. Brain systems for encoding and retrieval of auditory-verbal memory: An in vivo study in humans. Brain 118 (2). 401-416. https://doi.org/10.1093/brain/118.2.401
Gatiyatullina, Galia, Ludmila Gorodetskaya, Marina Solnyshkina & Elzara Gafiyatova. 2020. Investigating the differences between prepared and spontaneous speech characteristics: Descriptive approach. International Journal of Criminology and Sociology 9. 2591-2598.
Gatiyatullina, Galia, Marina Solnyshkina, Roman Kupriyanov & Chulpan Ziganshina. 2023. Lexical density as a complexity predictor: The case of Science and Social Studies textbooks. Research Result. Theoretical and Applied Linguistics 9 (1). 11-26. https://doi.org/10.18413/2313-8912-2023-9-1-0-2
Graesser, Arthur & Danielle McNamara. 2011. Computational analyses of multilevel discourse comprehension. Topics in Cognitive Science 3. 371-398. https://doi.org/10.1111/j.1756-8765.2010.01081.x
Halliday, Michael A. K. 1994. An Introduction to Functional Grammar (2nd edition). London: Edward Arnold.
Hickey, Tina & Sheila Gilheany. 2003. High ability children and their reading needs. In G. Shiels and U. Ní Dhálaigh (eds.), Other ways of seeing: Diversity in language and literacy, 65-74. Dublin: Reading Association of Ireland.
Hinds, John. 1987. Reader versus Writer Responsibility: A New Typology. In Ulla Connor & Robert Kaplan (eds.), Writing across languages: Analysis of L2 texts, 141-152. MA: Addison-Wesley.
Hyland, Ken. 1996. Writing without conviction? Hedging in scientific research articles. Applied Linguistics 17. 433-454.
Hyland, Ken. 2004. Disciplinary interactions: Metadiscourse in L2 postgraduate writing. Journal of Second Language Writing 13. 133-151. https://doi.org/10.1016/j.jslw.2004.02.001
Hyland, Ken. 2005. Metadiscourse: Exploring Interaction in Writing. London: Continuum.
Hyland, Ken. 2010. Constructing proximity: Relating to readers in popular and professional science. Journal of English for Academic Purposes 9 (2). 116-127. https://doi.org/10.1016/j.jeap.2010.02.003
Irwin, Judith W. 1980. The effects of explicitness and clause order on the comprehension of reversible causal relationships. Reading Research Quarterly 14. 477-488.
Kintsch, Walter. 1998. Comprehension: A Paradigm for Cognition. Cambridge, New York, NY, USA: Cambridge University Press.
Korovina, Irina V. 2020. System of deictic coordinates and intertextual deixis in academic discourse. Russian Journal of Linguistics 24 (4). 876-898. https://doi.org/10.22363/2687-0088-2020-24-4-876-898
Kotelnikova, Anastasiya. 2020. Two strategies of understanding. Bulletin of PNIPU. Problems of linguistics and pedagogy. PNRPU Linguistics and Pedagogy Bulletin 4. 70-78.
Kulik, James A. 1992. Analysis of the research on ability grouping: Historical and contemporary perspectives. Research Based Monograph No. 9204. Storrs: National Research Center on the Gifted and Talented, University of Connecticut.
Lee, Icy. 2002. Helping students develop coherence in writing. English Teaching Forum 40. 32-39.
Novikov, Anatoliy. 2007. Text and its Semantic Dominants. Moscow. Publishing House of the Institute of Linguistics of the Russian Academy of Sciences. (In Russ.).
Petrova, Anna, El'zara Gizzatullina-Gafiyatova, Nadezhda Sytinan & Marina Solnyshkina. 2022. Technologies in analysis and computing immediate recalls. Lecture Notes in Networks and Systems 342. 660-673. https://doi.org/10.1007/978-3-030-89477-1_63
Petrova, Anna & Marina Solnyshkina. 2021. Immediate recall as a secondary text: Referential parameters, pragmatics and propositions. Russian Journal of Linguistics 25 (1). 221-249. https://doi.org/10.22363/2687-0088-2021-25-1-221-249
Sandor, Agnès. 2007. Modeling metadiscourse conveying the author's rhetorical strategy in biomedical research abstracts. Revue Française de Linguistique Appliquée XII. 97-108.
Solnyshkina, Marina, Valery Solovyev, El’zara Gizzatullina-Gafiyatova & Ekaterina Martynova. 2022. Text complexity as interdisciplinary problem. Voprosy Kognitivnoy Lingvistiki 1. 18-39. https://doi.org/10.20916/1812-3228-2022-1-18-39
Smolik, Filip, Hana Stepankova, Martin Vyshnalek & Nikolai Tomas. 2016. Propositional density in spoken and written language of Czech-speaking patients with mild cognitive impairment. Journal of Speech Language and Hearing Research 59 (6). https://doi.org/10.1044/2016_JSLHR-L-15-0301
Solovyev, Valery, Marina Solnyshkina & Danielle McNamara. 2022. Computational linguistics and discourse complexology: Paradigms and research methods. Russian Journal of Linguistics 26 (2). 275-316. https://doi.org/10.22363/2687-0088-31326
Solovyev, Valery, Mihai Dascalu & Marina Solnyshkina. 2023. Discourse complexity: Driving forces of the new paradigm. Research Result. Theoretical and Applied Linguistics 9 (1). 4-10. https://doi.org/10.18413/2313-8912-2023-9-1-0-1
Spyridakis, Jan H. & Timothy C. Standal. 1987. Signals in expository prose: Effects on reading comprehension. Reading Research Quarterly 22. 285-298.
Taylor, Barbara M. 1980. Children's memory for expository text after reading. Reading Research Quarterly 15. 399-411.
Van den Broek, Paul, Kirsten Risden & Elizabeth Husebye-Hartmann. 1995. The role of readers’ standards for coherence in the generation of inferences during reading. In Robert F. Lorch Jr. & Edward J. O’Brien (eds.), Sources of coherence in reading, 353-373. Lawrence Erlbaum Associates.
Van Dijk, Teun & Walter Kintsch. 1983. Strategies of Discourse Comprehension. New York: Academic.
Vande Kopple, William J. 1985. Some explanatory discourse on metadiscourse. College Composition and Communication 36. 82-93.
Vipond, Douglas. 1980. Micro- and Macro-processes in Text. Journal of Verbal Learning and Verbal Behaviour 19. 276-296. https://doi.org/10.1016/S0022-5371(80)90230-3
Waters, Harriet. 1983. Superordinate-subordinate structure in prose passages and the importance of propositions. Journal of Experimental Psychology: Learning, Memory, and Cognition 9 (2). 294-299. https://doi.org/10.1037/0278-7393.9.2.294
Weir, Cyril, Roger Hawkey, Anthony Green & Sarojani Devi. 2009. The cognitive processes underlying the academic reading construct as measured by IELTS. IELTS Research Reports 9. 157-189.
Yus, Francisco. 2018. Attaching feelings and emotions to propositions: Some insights on irony and internet communication. Russian Journal of Linguistics 22 (1). 94-107.
Zhang, Limei. 2018. Metacognitive and Cognitive Strategy Use in Reading Comprehension: A Structural Equation Modelling Approach. Singapore: Springer Nature Singapore. https://doi.org/10.1007/978-981-10-6325-1
Ziafar, Meisam & Ehsan Namaziandost. 2020. A formulaic approach to propositional density and readability. International Journal of Innovation and Research in Educational Sciences 6 (6). 816-822.
