Multimodal Communicative Moves in Expositive Dialogue: Common and Novel Topic Elaboration

Abstract

The study explores the distribution and structure of multimodal clusters presenting a series of communicative moves in expositive dialogues: Request, Elaboration, and Response. We hypothesize that multimodal clustering of moves will be predetermined by the use of either common (for both participants) or novel topic elaboration as a nucleus move within the cluster. To test this, we conducted a multimodal experiment which recorded the participants’ gesture with a motion capture system (Perception Neuron Motion Capture) and gaze with eye-tracking glasses (Tobii Pro Glasses 2), as well as their speech and overall multimodal behavior with a stationary camera. The study reveals significant differences in the use of both face-oriented gaze and contact-establishing gesture as modulated by Request and Response moves within common or novel topic elaboration clusters; however, face-oriented gaze manifests both higher frequency and diversity. Mutual face-oriented gaze prevails at the Request move preceding common topic elaboration, whereas elaborating a novel topic is found to produce a more involved gaze reaction of the listener during the Response moves. Additionally, simultaneous verbal moves (by both participants) are more typical of common topic elaboration. The results provide evidence that social interaction and communication in expositive dialogue are processed multimodally and that this predetermines the role of gaze, gesture and verbal moves in communicative move clusters.

Introduction

Studies of multimodal collaboration in discourse have recently integrated experimental methods, since one of the key directions in multimodal research is now prognostic analysis aimed at predicting how collaborative information construal might affect multimodal behavior. In this study, we address the multimodal structure of expositive dialogue, hypothesizing that the clustering of its communicative moves may predetermine the use of multimodal resources (gaze, gesture and speech) within these moves. Methodologically, the work is rooted in the research on communicative moves in collaboration [1–3] and on the use of multimodal resources in communicative moves [4–6]. Previous studies specify three types of communicative moves in a dialogue, namely Request, Topic Elaboration and Response; however, the clustering of multimodal resources in shaping these moves has only recently become the focus of experimental studies [7].

The research data are samples of spontaneous expositive dialogues collected by the authors of the paper. The paper advances the following research questions: 1) What are the types of verbal communicative moves and their functions in expository dialogue? 2) What is the distribution of multimodal communicative moves? 3) How is the distribution of nonverbal moves modulated by the use of verbal moves within multimodal clusters? Following the studies which claim that gaze, gesture and speech co-perform in processing social interaction and communication [8–10], we hypothesize that the structure of multimodal clusters will be determined by advancing different types of topic [11; 12], specifically either a common topic (shared by both participants) or a novel topic in the expositive dialogue.

The work is structured as follows. First, we present the Theoretical Framework covering 1) the studies of communicative moves and their functions in collaborative dialogue and 2) the studies of multimodal resources in communicative moves. Second, we introduce the multimodal experiment design and methods. Next, the Results and Discussion are presented, which specify 1) the verbal communicative moves and their functions in expositive dialogue, 2) the distribution of multimodal communicative moves in expositive discourse and 3) the clusters of Request, Elaboration and Response modulated by the use of common and novel topic. In the concluding remarks, we identify the research output and the prognostic prospects of its results.

Theoretical Framework

  1. Discourse studies of communicative moves in dialogue

Although the linguistic studies exploring dialogue unities mostly differentiated two major communicative moves, Request and Response [2; 3], discourse studies additionally specify the Elaboration move [1; 7], since apart from questioning and answering, the discourse of the dialogue advances topics and subtopics or inserts comments on them [1. P. 147], which allows the speakers to construe a discourse of a particular type: narrative, descriptive, argumentative or expositive [13]. As we focus on the expositive discourse type, we may well expect that its Elaboration moves, which constitute the nuclei of the communication (dialogue) unity, will manifest specificity modulated by the expository function of this discourse type.

The problem of clustering these moves in a dialogue was formulated in earlier studies; still, it did not receive an adequate solution, since it became obvious that a move in a dialogue is not necessarily related to, or dependent on, only the preceding move in the linear order of moves [1; 14]. In [1. P. 66–67], the author claims that the phenomenon of multiple antecedence is quite common, and a response, for instance, may serve several purposes, being at once a correction or confirmation of the answer it surpasses and an answer to the topical question. Further studies mostly explored the communicative structure of verbal moves [15; 16] and the discourse markers which shape it [17–21]. However, with the growth of multimodal communication studies, it has become necessary to develop methods for exploring the clustering of both verbal and nonverbal communicative moves [5; 6]. In [7], for instance, the prevailing order of these moves was identified for descriptive discourse; it conformed to the formula Request – Response – Elaboration; however, the author claims that the potential of different modalities (gaze, gesture, speech) in their realization is still to be explored.

Following [2; 3], who claim that a dialogical unity is organized around its center, a topic, we presume that move clustering may be explored via the communicative function of the topic which is elaborated in the dialogue. Linguistics offers two different approaches to topic / theme: the first, developed in discourse functional grammar, recognizes the topic / comment distinction (following the distinction of theme and rheme), and the second, exploring “the flow of consciousness” [12] in forming the chunks of topics, recognizes the given / new topic distinction; we therefore have to clarify the view adopted in this paper. Following the second tradition, we consider a discourse topic as an “aggregate of coherently related events, states, and referents that are held together in some form in the speaker’s semiactive consciousness” [12. P. 121]. Importantly, in expositive dialogue the topic is not restricted to one basic-level topic elaborated by both participants, nor does this topic elaboration follow the narrative scheme explored in [11; 12]. Expositive dialogue manifests the collaborative construal of fuzzy referents in demarcating, ranging, enumerating and contrasting them [22–24]; consequently, two types of topics may be additionally differentiated: the common topic, which corresponds to recent, left-hand, specifying, causing, given, repeated information brought forward by both dialogue participants, and the novel topic, which corresponds to prominent, emergent, ad hoc information initiated by one of the participants. We further presume that clustering communicative moves in expository dialogue may follow a specific pattern determined by elaborating either a Common or a Novel topic.

  2. Multimodal resources in communicative moves

Clustering communicative modalities in communication has recently become the interest of ‘the social brain’ studies which claim that brain structures involved in human social interaction and communication are responsible for processing social information by reading signals from the face, gaze and action [8–10].

In terms of verbal moves, their social interaction function can be observed via the use of the discourse markers as expressions displaying the semantic relations between the moves employed in them [25; 26]. Structurally, discourse markers may relate to one of three classes, contrastive, elaborative, and implicative [27]; still, their communicative functions are drawn from the role they play in a communicative unit in Request and Response moves. Pragmatically, verbal requests are typically described as direct and indirect (for a review cf. [28]) with indirect semantic (rhetorical) functions being further specified in [29] where the authors identify contact-establishing, controlling, metacommunicative and specifying functions. Responses are analogously classified as direct and indirect, where indirect responses are commonly associated with evasiveness and silencing. However, we expect that the discourse functions for the communicative moves are additionally modulated by the discourse type, here expositive discourse; consequently, we can specify them in this study.

As for gestures, they engage an interlocutor and facilitate sensorimotor patterns of brain activation that determine specific behavioral responses [30]. According to Kendon (1995) [4], gestures play a significant role in communication and can be perceived as discourse markers; such gestures are known as conversational or interactive gestures. They act as a direct reference to the interlocutor, as they are oriented towards the interlocutor in their form and direction [31]; in this study we will refer to them as contact-establishing gestures. Fewer studies specify the role of gesture as modified by the communicative moves. Those that do claim that the congruent nature of gestures contributes to a better explanation [32–34], or that gestures form a component of a request, e.g., to ask for help in a moment of difficulty [35], in order to initiate, maintain, regulate, or terminate interaction and to convey communicative intentions. Some studies indicate the importance of contact-establishing gestures in addressing the interlocutor in problem-solving tasks [36] and cooperation [37]. Gestures are also significant for the emotional response [38].

It has also been established that eye gaze plays a substantial role in turn-taking; it serves as a signal conveying the willingness to establish communication [39; 40]: direct gaze shows the intention of the speaker to interact, whereas an averted gaze displays the unwillingness to initiate a relation [9]. Monitoring the gaze patterns during a conversation helps establish mutual understanding, especially in a joint action [41–43]. In a dialogue a speaker gazes at the listener at the end of their speech, which is a turn-yielding cue that allows the speaker to check the understanding of the message [44], while a person tends to look more at the interlocutor when listening than when speaking [31]. Goodwin suggested that the interlocutor should be gazing at the speaker when the speaker is gazing at the hearer [45]. Beattie proposed that after taking the turn and providing their output the speaker tends to move the gaze away from the interlocutor in order to maintain speech fluency and reduce cognitive load [46]. Speakers tend to use the gaze window (i.e., mutual gaze) in order to coordinate their actions: it is not the speaker’s response that elicits the speech of the interlocutor; rather, the speech reactions or backchannels (such as hm, mmm, uh huh, etc.) of the interlocutor terminate the gaze of the speaker. So, the speaker does not look at the interlocutor to monitor the feedback (i.e., the reaction), but to solicit a response [31].

Multimodal experiment design and methods

To explore multimodal clustering of communicative moves, we conducted an experiment simulating a face-to-face expositive dialogue. The participants were students aged 18–21. The experimental task required the participants to agree upon one main difference between the members of each pair of close synonyms, such as «огонь – пламя» / “fire – flame”, «мертвец – труп» / “dead man – corpse”, «битва – схватка» / “battle – fight”, «чепуха – ерунда» / “nonsense – rubbish”, etc., 14 pairs altogether. Prior to the experiment the participants signed the consent form and were outfitted with the following equipment: (1) a motion capture system (Perception Neuron Motion Capture) and (2) eye-tracking glasses (Tobii Pro Glasses 2, 1920×1080, 25 FPS) (Figure 1).

Fig. 1. Experiment setting
Source: photos from the archive of the authors. Prior to the experiment the participants signed the consent form.

Three cameras recorded the experiment: two cameras were built into the eye tracker, which made it possible to see the perspective of the speaker, and one camera (Sony HXR-NX30P, 1920×1080 FHD) was installed in front of the participants. For the purposes of this research, we used a multimodal corpus with a duration of approximately 57 minutes. The data from the motion capture system were collected in Axis Neuron, and the data from the eye-tracking glasses were retrieved using Tobii Pro Glasses Controller. To analyze the material we used ELAN, the annotation software developed by the Max Planck Institute for Psycholinguistics, which allowed us to annotate verbal and nonverbal moves.

To explore the distribution of multimodal communicative moves and to identify the clusters of Request, Elaboration and Response, we have followed a series of steps.

At Step 1 we annotate the multimodal data, identifying the Common and Novel Topic Elaboration moves that serve as the centers or nuclei of the move clusters, as well as the presence / absence of Request and Response shaping these clusters. This procedure allows us to identify the functions of moves as well as the frequency of single communicative moves.

At Step 2 we identify the role of verbal and nonverbal modalities shaping single communicative moves within the clusters. To do so, we address each type of communicative move (Request, Common and Novel Topic Elaboration, and Response) as manifested in speech alone (in both direct and indirect modes), in Face-oriented gaze and in Contact-establishing gesture. This procedure allows us to determine the distribution of multimodal communicative moves.

At Step 3 we apply an additional annotating and processing method to identify the presence / absence of verbal and nonverbal modalities in shaping each cluster in each participant’s communicative behavior. At this step, we obtain the aligned structure of the clusters modulated by both the presence of three types of communicative moves and each of the verbal and nonverbal modalities. Finally, this allows us to determine and contrast the specifics of move clustering in two cluster types, with the nuclei of Common and Novel Topic Elaboration.

Results and Discussion

  1. Verbal communicative moves and their functions in expositive dialogue

The analyzed multimodal corpus comprised 42 collaboration (joint action) units or problem-solving tasks. Each collaboration unit presented a series of move clusters advanced by either or both participants explaining the differences between a pair of close synonyms. To distinguish between the use of common and novel topic in the participants’ speech, we adopted the following procedure: 1) identifying the rhematic component of the verbal move of the first participant; 2) determining its semantic correspondences in the thematic and rhematic components of prior verbal moves of the second participant; 3) if it repeated, intensified, specified or generalized the components of prior verbal moves of the second participant, we considered this move Common Topic Elaboration; 4) if no repeating, intensifying, specifying or generalizing of the components of prior verbal moves of the second participant was identified, we considered this move Novel Topic Elaboration. For instance, the first participant’s verbal move Пламя это что-то какое-то больше как костер что-то большое а огонь может быть и спичка и свечка (flame is something more like a bonfire, something big, and fire can be a match and a candle) followed by the second participant’s verbal move Да да пламя это что-то большое крупное (Yes yes flame is something big large) illustrates the participants sharing the same idea while explaining the differences between flame and fire. More precisely, it is the second participant who adopts the idea; therefore, his verbal move is identified as Common Topic Elaboration, whereas the first participant’s verbal move is Novel Topic Elaboration.
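The decision rule in the procedure above can be sketched in a few lines. This is only an illustration: the semantic judgment itself was made by the annotators, so the sketch assumes that each move already carries hand-assigned relation labels, and the label names and the function `classify_elaboration` are hypothetical.

```python
from typing import Iterable

# Relations to the partner's prior moves that mark a move as continuing a shared topic.
# The label strings are illustrative, not taken from the authors' annotation scheme.
COMMON_RELATIONS = {"repeating", "intensifying", "specifying", "generalizing"}


def classify_elaboration(relations_to_prior_moves: Iterable[str]) -> str:
    """Return the Elaboration type given the relations an annotator found."""
    found = set(relations_to_prior_moves) & COMMON_RELATIONS
    return "Common Topic Elaboration" if found else "Novel Topic Elaboration"


# Flame / fire example: the first participant introduces the contrast,
# the second participant restates it.
p1_move = classify_elaboration([])             # no correspondence to earlier moves
p2_move = classify_elaboration(["repeating"])  # hypothetical label for the restatement
print(p1_move)
print(p2_move)
```

Keeping the relation labels separate from the rule makes it easy to see that the Common / Novel distinction depends only on whether any link to the partner's prior moves was annotated.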

Altogether, the number of communication units was 630, with the nuclei moves Common and Novel Topic Elaboration accounting for 388 and 242 units respectively, which means that participants far more frequently collaborated on a common topic, intensifying, specifying or generalizing it. Novel Topic Elaboration appeared at the start of each collaboration unit, most commonly in the moves of both participants, since they seemed eager to advance their personal view of the differences, and only after several subsequent moves did they “agree” to adjust their opinion of the differences to something the other participant had mentioned. Also present was the sequence in which one participant offered a series of verbal moves elaborating on a Novel topic while the second participant remained silent or produced indirect verbal moves, Requests or Responses, and only then proceeded to elaborating on a Common or Novel Topic. For instance, in страх мне кажется может быть и парализующим / он больше на тебя воздействует чем боязнь / боязнь например высоты (fear it seems can be paralyzing / it affects you more than apprehension / apprehension for instance of heights) the first participant advances three successive verbal moves which are Novel Topic Elaboration before the second participant intervenes with a Request ну а страх например высоты? (and what about the fear of heights?).

We further distinguished the communicative functions of the three verbal moves in expository dialogue. Request is used to attract attention in послушай (listen), ну смотри (well look), to state the conditions for the communication in нам нужно решить (we have to decide), to request for repetition in еще раз? (one more time?) and to request for clarification in почему? (why?) or тогда ты можешь объяснить? (then can you explain?). Response can also perform several functions: it expresses consent in да соглашусь (yes I will agree), discord in да нет это же не то… (no it’s not like that), hesitation in таааак это у нас (wel-l-l-l we have), assessment in о супер (oh that’s cool) or emotion in я с ума сойду (I’ll go crazy). Common Topic Elaboration is expressed in specifying or giving additional information/details on the topic and information sequencing in а кара это когда ты не пришел на занятие и потом не сдал тест (and a punishment is when you missed a class and then didn’t pass a test), intensifying or restating in а кара – это что-то более масштабное (and a punishment is more widescale) and also generalizing or summarizing of what has been said in ну короче это что-то более масштабное (so this is something of a bigger scale). Novel Topic Elaboration is expressed in advancing a statement with a semantically novel rhematic component, e.g., in это божья кара и еще что-то там (it’s God’s punishment and something like this).

These discourse functions of Topic Elaboration in dialogue specify the functions of expositive discourse, which are demarcating, ranging, enumerating and contrasting referents [22–24]; as seen, apart from referent construal, expositive dialogue also contributes to their foregrounding, which stimulates communication. Additionally, the functions of Request in discourse presented in [29] were itemized to comply with the context of expositive discourse: stating the conditions for the communication was found to advance the metacommunicative function, and requesting repetition and clarification were found to extend the specifying function.

Both verbal Request and Response may be expressed directly or indirectly. Indirect vocalization appears in hesitations, repetitions, murmuring, etc. Since distinguishing between Request and Response in this case is complicated, we introduced a separate annotation category of Indirect verbal move (vocalization). In our recordings we found 314 cases of the Indirect verbal move, which we further classified as signs that might complement, precede or follow the nucleus move of a multimodal communicative cluster. The activity and diversity of indirect verbal moves in expositive dialogue justify the need to consider them a specific verbal move, which agrees with the distinction of discourse markers advanced in [27], where implicative markers appear alongside contrastive and elaborative ones in shaping Request, Response and Elaboration moves.

  2. The distribution of multimodal communicative moves in expositive discourse

In Table 1 we present the overall data on the distribution of verbal communicative moves, as well as the data on the synchronized activity of verbal and nonverbal moves. To obtain the data, we employed the ELAN-embedded function which allows one to explore synchronized events in different annotation layers, here Request, Response, Common Topic Elaboration, Novel Topic Elaboration, and Indirect Verbal Move as synchronized with Contact-Establishing Gesture (CE Gesture), Face-Oriented Gaze (FO Gaze), and with both CE Gesture and FO Gaze.

Table 1. Verbal and nonverbal moves distribution

| Verbal moves | Total | With CE Gesture | With FO Gaze | With CE Gesture and FO Gaze |
| --- | --- | --- | --- | --- |
| Verbal Request | 193 | 61 | 140 | 58 |
| Verbal Response | 284 | 43 | 191 | 43 |
| Common Topic Elaboration | 388 | 117 | 327 | 125 |
| Novel Topic Elaboration | 242 | 78 | 195 | 65 |
| Indirect Verbal Move | 314 | 43 | 172 | 53 |

Source: compiled by the authors.
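The synchronization counts in Table 1 rest on temporal overlap between annotation layers. The following sketch shows one way such counts could be computed once the layers are exported as time intervals. It does not reproduce ELAN's own function; the interval values are invented toy data, and the names `overlaps` and `count_synchronized` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start_ms, end_ms) of one annotation


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share any stretch of time."""
    return a[0] < b[1] and b[0] < a[1]


def count_synchronized(verbal: List[Interval], *layers: List[Interval]) -> int:
    """Count verbal annotations that overlap at least one annotation in every given layer."""
    return sum(
        1
        for v in verbal
        if all(any(overlaps(v, n) for n in layer) for layer in layers)
    )


# Invented timings for three Request annotations and two nonverbal layers.
requests = [(0, 900), (2000, 2600), (5000, 5400)]
fo_gaze = [(500, 1500), (5300, 6000)]
ce_gesture = [(2100, 2300), (5350, 5500)]

with_gaze = count_synchronized(requests, fo_gaze)
with_gesture = count_synchronized(requests, ce_gesture)
with_both = count_synchronized(requests, fo_gaze, ce_gesture)
print(with_gaze, with_gesture, with_both)
```

Under this definition a verbal move counts as synchronized with both nonverbal layers when it overlaps each of them, whether or not the gaze and the gesture overlap each other; a stricter three-way overlap would be an equally defensible choice.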

The results show that while CE Gesture synchronized with cluster nuclei (Elaboration on a Common topic and Elaboration on a Novel topic) was found in 195 cases out of 630 uses, FO Gaze synchronized with cluster nuclei was observed in 512 cases, which is 2.63 times higher. We hypothesized that there might be a difference in the use of nonverbal moves as modulated by the type of verbal nucleus move, Common or Novel Topic Elaboration. The Chi-squared tests, however, did not support this hypothesis: with χ²=0.301 at p=0.584 for CE Gesture and χ²=0.123 at p=0.726 for FO Gaze, each modulated by Common or Novel Topic Elaboration, we cannot claim that there is a difference in the use of nonverbal moves with the verbal nuclei moves. We further hypothesized that there might be a difference in the use of nonverbal moves as modulated by the other verbal move types, Request and Response. The Chi-squared tests yielded χ²=18.272 at p<0.001 for CE Gesture synchronized with Request and Response, and χ²=1.511 at p=0.219 for FO Gaze synchronized with Request and Response, which means that Contact-Establishing Gesture is observed significantly more frequently with Verbal Request than with Response, while Face-Oriented Gaze did not show the same tendency. The results somewhat specify the claim presented in [38], where gestures were found to frequently accompany emotional response. In expositive dialogue expressing emotion is not the major discourse function of Response; presumably for this reason contact-establishing gesture was uncommon in this Response type. Additionally, since we found that CE Gesture is more commonly used with Request, the results conform to the findings of frequent gesture use in Request advanced in [35]. Still, we also observe the tendency to use contact-establishing gesture in explanation tasks [32–34], which in our case are performed via exposition.
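For readers who wish to retrace the arithmetic, the sketch below recomputes the Request versus Response tests and the CE Gesture test for the nuclei from the counts in Table 1. It assumes a Pearson chi-squared test on a 2×2 table (move accompanied or not accompanied by the nonverbal move) without continuity correction, which is an assumption about the procedure rather than something the text states; the function name `chi2_2x2` is hypothetical. Under that assumption the statistics come out at 18.272, 1.511 and 0.301.

```python
import math


def chi2_2x2(yes_a: int, total_a: int, yes_b: int, total_b: int):
    """Pearson chi-squared (no continuity correction) for two groups with a yes/no outcome."""
    a, b = yes_a, total_a - yes_a
    c, d = yes_b, total_b - yes_b
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Counts from Table 1: (moves with the nonverbal move, all moves of that type).
ce_request_vs_response = chi2_2x2(61, 193, 43, 284)
fo_request_vs_response = chi2_2x2(140, 193, 191, 284)
ce_common_vs_novel = chi2_2x2(117, 388, 78, 242)

for label, (chi2, p) in [
    ("CE Gesture, Request vs Response", ce_request_vs_response),
    ("FO Gaze, Request vs Response", fo_request_vs_response),
    ("CE Gesture, Common vs Novel", ce_common_vs_novel),
]:
    print(f"{label}: chi2={chi2:.3f}, p={p:.3f}")
```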

However, the results did not convey the tendency found in [39; 40] who observed the frequency of FO Gaze as manifesting the willingness to establish communication, which means it should prevail at the Request move. We expect that the possible explanation for it may be that in expositive dialogue the Request precedes two types of Topic elaboration moves, Common and Novel, and presumably, mutual FO Gaze will prevail at the Request move preceding solely Common Topic Elaboration; this hypothesis we will test further.

Importantly, we also observed the differences in the collaboration which were found in the three pairs of participants. They are presented in Table 2.

Table 2. Distribution of verbal and nonverbal moves by pairs of participants (PoP)

| Verbal moves | PoP1: with CE Gesture | PoP1: with FO Gaze | PoP1: with CE Gesture and FO Gaze | PoP2: with CE Gesture | PoP2: with FO Gaze | PoP2: with CE Gesture and FO Gaze | PoP3: with CE Gesture | PoP3: with FO Gaze | PoP3: with CE Gesture and FO Gaze |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Verbal Request | 19 | 44 | 21 | 36 | 86 | 30 | 6 | 10 | 7 |
| Verbal Response | 23 | 106 | 25 | 16 | 69 | 13 | 4 | 16 | 5 |
| Common Topic Elaboration | 54 | 142 | 62 | 52 | 170 | 49 | 11 | 15 | 14 |
| Novel Topic Elaboration | 38 | 92 | 38 | 26 | 87 | 21 | 14 | 16 | 6 |
| Indirect Verbal Move | 23 | 90 | 32 | 16 | 63 | 14 | 4 | 19 | 7 |

Source: compiled by the authors.

As seen in Table 2, both verbal and nonverbal moves vary among the pairs of participants. PoP 1 and PoP 2 are similar in terms of verbal and nonverbal communicative moves, while PoP 3 participants used verbal moves quite rarely, and these moves were not frequently accompanied by nonverbal moves. Even though the participants are of similar age and occupation, the participants in PoP 3 tended to find a quick and optimal solution without engaging in long debates about the difference between synonyms. For instance, if we compare the timelines of all PoPs, PoP 3 spent the least amount of time finding differences between the 14 pairs of synonyms (approximately 11 min), whereas the same task took approximately 20:40 min in PoP 1 and about 23:20 min in PoP 2. This may indicate a difference in the strategies used to accomplish the given task, with PoP 3 preferring the strategy of rapid search and quick consent.
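The comparison between pairs can be retraced with a short script. The sketch below takes the counts from Tables 1 and 2 as given, checks that the per-pair counts add up to the corpus totals, and sums the FO Gaze column for each pair; the variable and function names are illustrative only.

```python
# Per-pair counts from Table 2, in the order
# (with CE Gesture, with FO Gaze, with CE Gesture and FO Gaze) for PoP1, PoP2, PoP3.
table2 = {
    "Verbal Request": [(19, 44, 21), (36, 86, 30), (6, 10, 7)],
    "Verbal Response": [(23, 106, 25), (16, 69, 13), (4, 16, 5)],
    "Common Topic Elaboration": [(54, 142, 62), (52, 170, 49), (11, 15, 14)],
    "Novel Topic Elaboration": [(38, 92, 38), (26, 87, 21), (14, 16, 6)],
    "Indirect Verbal Move": [(23, 90, 32), (16, 63, 14), (4, 19, 7)],
}

# Corpus-level counts from Table 1, same column order.
table1 = {
    "Verbal Request": (61, 140, 58),
    "Verbal Response": (43, 191, 43),
    "Common Topic Elaboration": (117, 327, 125),
    "Novel Topic Elaboration": (78, 195, 65),
    "Indirect Verbal Move": (43, 172, 53),
}


def column_sums(rows):
    """Add up the three pairs column by column."""
    return tuple(sum(col) for col in zip(*rows))


# Every row of Table 2 should sum to the matching row of Table 1.
consistent = all(column_sums(rows) == table1[move] for move, rows in table2.items())

FO_GAZE = 1  # index of the FO Gaze column
fo_gaze_by_pair = [
    sum(rows[pair][FO_GAZE] for rows in table2.values()) for pair in range(3)
]
print(consistent)
print(fo_gaze_by_pair)
```

The FO Gaze totals (474 and 475 for PoP 1 and PoP 2 against 76 for PoP 3) show the similarity of the first two pairs and the much lower activity of the third.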

  3. The clusters of Request, Elaboration and Response modulated by Common or Novel Topic Elaboration

In this section, we present the results of cluster distribution, first irrespective of the Elaboration move type (with Common and Novel Topic Elaboration taken together), as we presume that this distribution conveys the specifics of multimodal collaboration in expositive dialogue. Next, we proceed to presenting the results for each Elaboration move type separately.

Multimodally, each communication unit with Participant 1 (P1) and Participant 2 (P2) maintaining a piece of expositive dialogue may have comprised a series of moves in Request, Elaboration and Response which can be manifested in the following schema:

Request [[P1: FO Gaze, CE Gesture, Verbal move (Direct/Indirect)] [P2: FO Gaze, CE Gesture, Verbal move (Direct/Indirect)]]

Elaboration [[P2: FO Gaze, CE Gesture, Common / Novel Topic Elaboration Verbal Move] [P1: FO Gaze, CE Gesture, Verbal move]]

Response [[P1: FO Gaze, CE Gesture, Verbal move (Direct/Indirect)] [P2: FO Gaze, CE Gesture, Verbal move (Direct/Indirect)]]

This schema shows the maximum possible potential of modalities employed in communicating either Common or Novel Topic.

The minimal possible potential of modalities is shown in the following schema:

Request [[P1: 0] [P2: 0]]

Elaboration [[P2: Common / Novel Topic Elaboration Verbal Move] [P1: 0]]

Response [[P1: 0] [P2: 0]]

This schema still illustrates a communication unit, since it involves the Elaboration move, although one that is neither directly or indirectly requested nor directly or indirectly responded to. The question is then which multimodal schemas are more typical of expositive dialogues. To determine this, we annotated the 630 communicative moves (388 Common Topic Elaboration and 242 Novel Topic Elaboration) following the schemas presented above. This allowed us to identify the distribution of moves in each communicative unit. As the total number of possible moves within a communication unit was equal to 24, the possible number of their combinations was 2^24; still, we expected that several combinations of moves would recur constantly. The results show that the most frequent combination occurs in only 8 cases and follows the schema:

Request [[P1: FO Gaze] [P2: FO Gaze]]

Elaboration [[P2: FO Gaze, Common / Novel Topic Elaboration Verbal Move] [P1: FO Gaze]]

Response [[P1: FO Gaze] [P2: FO Gaze]]

We have also disclosed 6 cases presenting the schemes 1) with the absence of FO Gaze in P1 in Request, Elaboration and Response moves, 2) with the absence of FO Gaze in P2 in Request, the absence of FO Gaze in P1 in Response move, 3) with the absence of FO Gaze in P1 in Response move; and 5 cases presenting the scheme with the absence of FO Gaze in P1 in Request. Therefore, we can confirm that it is the variance in FO Gaze and not in CE Gesture or Verbal Move which contributes to the multimodal specificity of expository dialogue. However, since the number of such instances is small, we further proceeded to analyzing the multimodal moves separately in Request, Elaboration and Response moves.
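The figure of 24 possible moves per communication unit, and the size of the resulting combination space, can be made explicit with a small enumeration. The sketch assumes that each of the three moves offers four binary channels to each of the two participants (FO Gaze, CE Gesture, Direct Verbal Move, Indirect Verbal Move); this reading is inferred from the schemas and from the modes analyzed later, and the variable names are illustrative.

```python
from itertools import product

moves = ("Request", "Elaboration", "Response")
participants = ("P1", "P2")
# Assumed channel inventory; each channel is either present or absent in a slot.
channels = ("FO Gaze", "CE Gesture", "Direct Verbal Move", "Indirect Verbal Move")

# One slot per (move, participant, channel) triple.
slots = list(product(moves, participants, channels))
n_slots = len(slots)

# Each slot is binary, so the number of presence/absence combinations is 2 to the power of n_slots.
n_combinations = 2 ** n_slots

print(n_slots)         # 24
print(n_combinations)  # 16777216
```

With almost 17 million possible combinations and only 630 annotated units, it is unsurprising that even the most frequent full pattern recurs only a handful of times, which motivates analyzing Request, Elaboration and Response separately.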

The typical communicative moves clusters for Request are:

Request [[P1: FO Gaze] [P2: FO Gaze]] (75 cases),

Request [[P1: FO Gaze] [P2: 0]] (75 cases),

Request [[P1: 0] [P2: 0]] (71 cases); far less common is

Request [[P1: 0] [P2: FO Gaze]] (8 cases).

The results show that it is in most cases the first participant who initiates the collaboration (via gaze) in expository dialogue.
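The claim about who initiates contact can be checked by tallying the four Request patterns listed above. The sketch takes the fourth pattern to mean gaze by P2 only, which is an assumption about the notation, and covers only the 229 cases reported for these four patterns.

```python
# Keys are (P1 uses FO Gaze, P2 uses FO Gaze); values are the reported case counts.
request_clusters = {
    (True, True): 75,    # mutual gaze
    (True, False): 75,   # gaze by P1 only
    (False, False): 71,  # no gaze
    (False, True): 8,    # gaze by P2 only (assumed reading of the fourth pattern)
}

total_cases = sum(request_clusters.values())
p1_gaze_cases = sum(n for (p1, _), n in request_clusters.items() if p1)
p2_gaze_cases = sum(n for (_, p2), n in request_clusters.items() if p2)

print(total_cases)    # 229
print(p1_gaze_cases)  # 150
print(p2_gaze_cases)  # 83
```

P1 directs gaze at the partner in 150 of the 229 listed cases, against 83 for P2, which is the asymmetry the paragraph above describes.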

The typical communicative moves clusters for Elaboration are:

Elaboration [[P2: FO Gaze, Common / Novel Topic Elaboration Verbal Move] [P1: FO Gaze]] (127 cases),

Elaboration [[P2: FO Gaze, Common / Novel Topic Elaboration Verbal Move] [P1: FO Gaze, Direct Verbal Move]] (86 cases), and

Elaboration [[P2: FO Gaze, CE Gesture, Common / Novel Topic Elaboration Verbal Move] [P1: FO Gaze]] (54 cases).

We observe that in most cases the participants maintain the gaze contact, frequently they both simultaneously elaborate their topic, and quite frequently the elaborating participant complements his elaboration with contact-establishing gesture.

The typical communicative moves clusters for Response are:

Response [[P1: FO Gaze] [P2: FO Gaze]] (78 cases),

Response [[P1: 0] [P2: FO Gaze]] (76 cases),

Response [[P1: 0] [P2: 0]] (49 cases),

Response [[P1: FO Gaze] [P2: 0]] (41 cases),

Response [[P1: FO Gaze, Direct Verbal Move] [P2: FO Gaze]] (38 cases).

Therefore, mutual contact in Response move is more commonly maintained via gaze, and only then via direct verbal response.

The results obtained largely conform to the findings of [45], where it was suggested that the interlocutor should be gazing at the speaker when the speaker is gazing at the hearer. The most frequent clusters in all communicative moves manifest the mutual FO Gaze exchange. However, since FO Gaze was found to frequently complement the verbal moves, the results do not confirm those presented in [31], where it is claimed that a person tends to look more at the interlocutor when listening than when speaking. Presumably, this is explained by the nature of expositive dialogue, which is aimed at reaching a common decision. Consequently, the results conform to the view expressed in [41–43], which relates gaze patterns during a conversation to joint action and establishing mutual understanding. Additionally, the results conform to the findings presented in [31], which show that a speaker does not look at the interlocutor to monitor the feedback, but to solicit a response. We found that FO Gaze at the Response move was more common for the participant expecting (soliciting) a response than for the one presenting it. This may account for the higher attentional involvement of the participant eager to advance the next move in contributing to the joint action.

Next, we expect to determine the differences in multimodal clusters of communicative moves as modulated by either Common or Novel Topic Elaboration.

First, we conducted a One-Way ANOVA test to identify whether there are significant differences in the use of communicative moves (24 moves including verbal and nonverbal moves in Request, Elaboration, and Response) with Common vs. Novel Topic Elaboration. Significant differences in the moves are presented in Table 3.

Table 3 Significant differences in the use of communicative moves within the clusters: Common vs. Novel Topic Elaboration

| Communicative move | Participant | Mode | F | df2 | p |
|---|---|---|---|---|---|
| Request | P1 (asking) | FO Gaze | 30.4755 | 483 | <0.001 |
| | | CE Gesture | 24.9466 | 618 | <0.001 |
| | P2 | FO Gaze | 13.9102 | 522 | <0.001 |
| | | CE Gesture | 5.3025 | 584 | 0.022 |
| | | Indirect Verbal Move | 13.0613 | 406 | <0.001 |
| Elaboration | P1 | FO Gaze | 4.4092 | 456 | 0.036 |
| | | CE Gesture | 9.5897 | 588 | 0.002 |
| | P2 (elaborating) | Indirect Verbal Move | 8.2464 | 356 | 0.004 |
| Response | P2 | Direct Verbal Move | 4.7714 | 614 | 0.029 |
| | | Indirect Verbal Move | 10.7811 | 404 | 0.001 |

Source: compiled by the authors.
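To illustrate the kind of comparison behind Table 3, the sketch below computes the F statistic of a classic one-way ANOVA for one move coded as present (1) or absent (0) in each communication unit. The toy data, the function name `one_way_anova_f` and the binary coding are assumptions made for illustration only. The df2 values in Table 3 differ from row to row, so the authors may have used a variance-corrected variant that this sketch does not reproduce.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df1, df2) for a classic one-way ANOVA over lists of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df1 = k - 1
    df2 = n_total - k
    f_stat = (ss_between / df1) / (ss_within / df2)
    return f_stat, df1, df2


# Hypothetical toy data: 1 = the move occurs in a communication unit, 0 = it does not.
common_units = [1, 1, 1, 0, 1, 1, 0, 1]
novel_units = [0, 1, 0, 0, 1, 0, 0, 0]

f_stat, df1, df2 = one_way_anova_f([common_units, novel_units])
print(round(f_stat, 4), df1, df2)  # 4.6667 1 14
```

A larger F means that the difference between the Common and Novel group means is large relative to the variation inside each group; the p-value is then read from the F distribution with (df1, df2) degrees of freedom.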

The results confirm that the largest differences between Common and Novel Topic Elaboration are found at the Request move. They appear in all modalities except the Direct Verbal Move. Interestingly, at the Elaboration move we observe larger differences in CE Gesture than in FO Gaze, although we did not identify this feature in the overall move distribution (see above). At the Response move the differences were found only in the use of verbal moves, which means that the use of nonverbal moves is hardly modified by the distinction between Common and Novel Topic Elaboration.

Second, we determine the typical multimodal clusters in the communication units representing Common and Novel Topic Elaboration separately.

For Request Move representing Common Topic Elaboration, the following clusters are typical:

Request [[P1: FO Gaze] [P2: FO Gaze]] (52 cases),

Request [[P1: FO Gaze] [P2: 0]] (50 cases),

Request [[P1: 0] [P2: 0]] (33 cases),

Request [[P1: 0] [P2: FO Gaze]] (23 cases).

For the same move representing Novel Topic Elaboration, we found the following typical clusters:

Request [[P1: 0] [P2: 0]] (38 cases),

Request [[P1: FO Gaze] [P2: 0]] (25 cases),

Request [[P1: FO Gaze] [P2: FO Gaze]] (23 cases),

Request [[P1: 0] [P2: FO Gaze]] (20 cases).
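Because the two elaboration types differ in the number of units, the raw counts listed above are easier to compare as shares. The following sketch normalizes each count by the total of the four listed Request clusters only (an assumption: less frequent clusters that are not listed in the text are ignored, so the shares are not shares of all communication units).

```python
# Counts of typical Request clusters as listed in the text: (P1 mode, P2 mode) -> cases.
common = {("FO Gaze", "FO Gaze"): 52, ("FO Gaze", "0"): 50,
          ("0", "0"): 33, ("0", "FO Gaze"): 23}
novel = {("0", "0"): 38, ("FO Gaze", "0"): 25,
         ("FO Gaze", "FO Gaze"): 23, ("0", "FO Gaze"): 20}


def shares(counts):
    """Convert cluster counts into shares of the listed clusters."""
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}


mutual = ("FO Gaze", "FO Gaze")
common_share = shares(common)[mutual]
novel_share = shares(novel)[mutual]
print(f"{common_share:.1%} {novel_share:.1%}")  # 32.9% 21.7%
```

Under this normalization, mutual FO Gaze accounts for about a third of the listed Request clusters in Common Topic Elaboration and about a fifth in Novel Topic Elaboration.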

Therefore, if we have to contrast the typicality of multimodal requests in Common and Novel Topic Elaboration, the following pictures (Fig. 2a and 2b) might display the difference.

Fig. 2. Typical multimodal Request: in Common Topic Elaboration (a); in Novel Topic Elaboration (b)
Source: photos from the archive of the authors. Prior to the experiment the participants signed the consent form.

The results confirm our previously advanced hypothesis that mutual FO Gaze prevails at the Request move preceding Common Topic Elaboration only. This conforms to and specifies the results obtained in [39; 40], and shows that only the willingness to establish communication at the Request move is accompanied by FO Gaze.

For Elaboration Move representing Common Topic Elaboration, the following clusters are typical:

Elaboration [[P1: FO Gaze] [P2: FO Gaze, Common Topic Verbal move]] (71 cases),

Elaboration [[P1: FO Gaze, Verbal move] [P2: FO Gaze, Common Topic Verbal move]] (57 cases),

Elaboration [[P1: FO Gaze] [P2: FO Gaze, CE Gesture, Common Topic Verbal move]] (32 cases).

For the same move representing Novel Topic Elaboration, we found the following typical clusters:

Elaboration [[P1: FO Gaze] [P2: FO Gaze, Novel Topic Verbal move]] (56 cases),

Elaboration [[P1: FO Gaze, Verbal move] [P2: FO Gaze, Novel Topic Verbal move]] (29 cases),

Elaboration [[P1: FO Gaze] [P2: FO Gaze, CE Gesture, Novel Topic Verbal move]] (22 cases).

We observe that although the frequency order of the clusters is the same, the gap between the first and the second most frequent cluster is far wider in Novel Topic Elaboration (56 vs. 29 cases) than in Common Topic Elaboration (71 vs. 57 cases). Therefore, we can claim that a simultaneous verbal move (by both participants) is far more typical of Common Topic Elaboration than of Novel Topic Elaboration, which may be illustrated by Fig. 3a and 3b.
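One simple way to quantify the gap between the two most frequent Elaboration clusters is the ratio of the second count to the first. This is a sketch using the counts listed above; the helper name `second_to_first_ratio` and the choice of a ratio as the measure are illustrative assumptions, not the authors' procedure.

```python
def second_to_first_ratio(counts):
    """Ratio of the second most frequent cluster count to the most frequent one."""
    first, second = sorted(counts, reverse=True)[:2]
    return second / first


# Elaboration cluster counts as listed in the text, in frequency order.
common_elaboration = [71, 57, 32]
novel_elaboration = [56, 29, 22]

common_ratio = second_to_first_ratio(common_elaboration)  # 57 / 71
novel_ratio = second_to_first_ratio(novel_elaboration)    # 29 / 56
print(round(common_ratio, 2), round(novel_ratio, 2))  # 0.8 0.52
```

A ratio close to 1 means that the cluster with simultaneous verbal moves is almost as frequent as the leading cluster; it is about 0.8 for Common Topic Elaboration and about 0.5 for Novel Topic Elaboration.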

Fig. 3. Typical multimodal Elaboration in Novel Topic Elaboration: both participants are advancing verbal moves (a); one participant is advancing a verbal move (b)
Source: photos from the archive of the authors. Prior to the experiment the participants signed the consent form.

For Response Move representing Common Topic Elaboration, the following typical clusters were identified:

Response [[P1: 0] [P2: FO Gaze]] (48 cases),

Response [[P1: FO Gaze] [P2: FO Gaze]] (42 cases),

Response [[P1: 0] [P2: 0]] (30 cases).

For the same move representing Novel Topic Elaboration, we found the following typical clusters:

Response [[P1: FO Gaze] [P2: FO Gaze]] (36 cases),

Response [[P1: 0] [P2: FO Gaze]] (28 cases),

Response [[P1: 0] [P2: 0]] (19 cases).

Presumably, a novel topic produces a more involved reaction on the part of the listener. Therefore, we can claim that it is the FO Gaze of the first participant that specifies the difference, which may be illustrated by Fig. 4a and 4b.

Fig. 4. Typical multimodal Response: in Common Topic Elaboration (a); in Novel Topic Elaboration (b)
Source: photos from the archive of the authors. Prior to the experiment the participants signed the consent form.

Overall, the results show that there are particular differences in the use of multimodal moves presenting Common and Novel Topic Elaboration, and that they can be found in the clusters of Request, Elaboration and Response moves. The most striking differences are observed in the use of Requests; still, it is the face-oriented gaze that is mostly responsible for these differences. The results specify the way the brain structures are involved in human social interaction and communication processed multimodally [8–10] by ranking the roles of gaze, gesture and verbal moves in communicative move clusters of two basic types, advancing common and novel topics in expositive dialogue. They also show that, in contrast to other discourse types, the ordering of communicative moves may display specificity, with the prevailing order of moves being Request – Elaboration – Response in expositive dialogue and Request – Response – Elaboration in descriptive communication [7]. While promoting the notion of Common and Novel Topic differentiation, the study additionally confirms the methodological efficiency of discourse topic studies, which might further contribute to exploring collaboration and communication in multimodal systems.

Concluding remarks

The research, aimed at specifying the multimodal organization of expositive dialogue, has allowed us to reveal the distribution of multimodal communicative moves and their discourse functions, and also to identify their clusters. To accomplish these tasks, we advanced the notion of common and novel topic in structuring the clusters comprising Request, Topic Elaboration and Response moves in face-oriented gaze, contact-establishing gesture, and direct and indirect verbal moves.

The study has shown that the differentiation of common and novel topics, which elaborates on the earlier notions of discourse topics, suffices to distinguish between two hyper-clusters of multimodal moves, with the Common Topic Elaboration Verbal move and the Novel Topic Elaboration Verbal move serving as their nuclei. In the data obtained during the multimodal experiment we identified 630 communication units, of which 388 were Common and 242 were Novel Topic Elaboration units, which contrasts the roles of the two hyper-clusters in expositive dialogues. The study also itemized the discourse functions of each verbal move within the dialogue, which allowed us to maintain the collaborative discourse specificity of expositive dialogue in contrast to other dialogue formats.

Further distribution, contingency and variance analyses have shown that, while there are significant differences in the use of nonverbal moves as modulated by Request and Response moves within Common or Novel Topic Elaboration move clusters, the largest differences between Common and Novel Topic Elaboration are found at the Request move. They appear in both face-oriented gaze and contact-establishing gesture; however, it was the gaze differences that manifested higher diversity (along with higher gaze activity) in the moves. The results indicate that mutual face-oriented gaze prevails at the Request move preceding Common Topic Elaboration only, which shows that only the willingness to establish communication at the Request move is accompanied by gaze. A novel topic is found to produce a more involved reaction on the part of the listener during the Response moves, which are more frequently accompanied by the listener's face-oriented gaze. Additionally, we can claim that a simultaneous verbal move (by both participants) is far more typical of Common Topic Elaboration than of Novel Topic Elaboration.

The results largely conform to prior experimental findings; still, they specify the functions of discourse moves, their multimodal distribution and the diversity typical of expositive dialogue. Overall, the results indicate that social interaction and communication are processed multimodally, which predetermines the roles of gaze, gesture and verbal moves in communicative move clusters. Among the most important findings of this study are the interrelations between the use of multimodal resources and the type of topic, common or novel, advanced in a communicative unit, as well as the defined structure of the multimodal move clusters which organize these units.

Hopefully, the procedure developed and the results achieved may be used to predict the clines in multimodal resource use in expositive dialogue, and also in other discourse types contrasted with the expositive type under consideration.


About the authors

Maria I. Kiose

Moscow State Linguistic University; Institute of Linguistics RAS

Author for correspondence.
Email: maria_kiose@mail.ru
ORCID iD: 0000-0001-7215-0604
Scopus Author ID: 56642747500
ResearcherId: AAB-7989-2019

D.Sc. in Philology, Associate Professor, Leading Researcher of the Centre for Socio-Cognitive Studies of Moscow State Linguistic University; Leading Researcher; Laboratory for multichannel communication Institute of Linguistics, RAS

38, Ostozhenka, Moscow, Russian Federation, 119034; 1, B. Kislovsky, Moscow, Russian Federation, 125009

Anna V. Leonteva

Moscow State Linguistic University; Institute of Linguistics RAS

Email: lentevanja27@gmail.com

PhD in Linguistics, Researcher at the Center for Socio-Cognitive Discourse Studies (SCoDis), Moscow State Linguistic University; Junior Researcher, Laboratory for multichannel communication Institute of Linguistics, RAS

38, Ostozhenka, Moscow, Russian Federation, 119034; 1, B. Kislovsky, Moscow, Russian Federation, 125009

Olga V. Agafonova

Moscow State Linguistic University; Institute of Linguistics RAS

Email: olga.agafonova92@gmail.com
ORCID iD: 0000-0002-2184-163X
SPIN-code: 3516-1947

Junior Researcher at the Center for Socio-Cognitive Discourse Studies (SCoDis), Moscow State Linguistic University; Junior Researcher, Laboratory for multichannel communication Institute of Linguistics, RAS

38, Ostozhenka, Moscow, Russian Federation, 119034; 1, B. Kislovsky, Moscow, Russian Federation, 125009

Andrey A. Petrov

Moscow State Linguistic University; Institute of Linguistics RAS

Email: petrov.drew@yandex.ru
ORCID iD: 0000-0003-0368-8800
SPIN-code: 7290-4450

Researcher at the Center for Socio-Cognitive Discourse Studies (SCoDis), Moscow State Linguistic University; Researcher, Laboratory for multichannel communication Institute of Linguistics, RAS

38, Ostozhenka, Moscow, Russian Federation, 119034; 1, B. Kislovsky, Moscow, Russian Federation, 125009

References

  1. Carlson, L. (1983). Dialogue Games. An Approach to Discourse Analysis. Dordrecht: D. Reidel Publ.
  2. Baranov, A.N. & Kreydlin, G.E. (1992). Illocutionary Forcing in the Structure of Dialogue. Topics in the Study of Language, 2, 84-99. (In Russ.).
  3. Shvedova, N.Yu. (2003). Essays on the syntax of Russian spoken speech. Moscow: Azbukovnik. (In Russ.).
  4. Kendon, A. (1995). Gestures as illocutionary and discourse structure markers in Southern Italian conversation. Journal of Pragmatics, 23(3), 247-279.
  5. Brône, G., Feyaerts, K. & Oben, B. (2013). Multimodal turn-taking in dialogue: on the interplay of eye gaze, speech and gesture. In: Proceedings of AFLiCo5: Empirical approaches to multi-modality and to language variation. Lille. pp. 21-22.
  6. Kendrick, K.H. & Holler, J. (2017). Gaze direction signals response preference in conversation. Research on Language and Social Interaction, 50, 12-32.
  7. Korotaev, N.A. (2023). Collaborative constructions in Russian conversations: A multichannel perspective. In: Computational Linguistics and Intellectual Technologies. Proceedings from the Annual International Conference “Dialogue”, 22. Moscow. pp. 250-258. (In Russ.).
  8. Haxby, J.V., Hoffman, E.A. & Gobbini, M.I. (2000). The distributed human neural system for face perception. Trends in Cognitive Sciences, 4, 223-233.
  9. Senju, A. & Johnson, M.H. (2009). The eye contact effect: mechanisms and development. Trends in Cognitive Sciences, 13, 127-134.
  10. Burgoon, J.K., Guerrero, L.K. & Floyd, K. (2010). Nonverbal Communication. London: Routledge.
  11. Givón, T. (1987). Beyond foreground and background. In: R.S. Tomlin (Ed.) Coherence and grounding in discourse. Amsterdam: Benjamins. pp. 175-188.
  12. Chafe, W. (1994). Discourse, Consciousness, and Time. The Flow and Displacement of Conscious Experience in Speaking and Writing. Chicago: The University of Chicago Press.
  13. Longacre, R. (1983). The grammar of discourse. New York: Plenum.
  14. Duncan, S. (1974). On the structure of speaker-auditor interaction during speaking turns. Language in Society, 3, 161-180.
  15. Mann, W.C. & Thompson, S.A. (1987). Rhetorical Structure Theory. A Theory of text Organization. California: University of Southern California.
  16. Kibrik, A.A. & Podlesskaya, V.I. (2009). Night Dream Stories. A corpus study of spoken Russian discourse. Moscow: Languages of Slavic Culture.
  17. Maschler, Y. & Schiffrin, D. (2015). Discourse Markers: Language, Meaning, and Context. In: The Handbook of Discourse Analysis. Second Edition. Vol. I. London: Bloomsbury Publishing Plc. pp. 189-221.
  18. Sharonov, I.A. (2016). Discursive words and communicatives. In: Computational Linguistics and Intellectual Technologies. Papers from the Annual International Conference “Dialogue”, 15. Moscow. pp. 605-615. (In Russ.).
  19. Sherstinova, T.Yu. (2016). The Structure of everyday dialogue as the sequence of speech acts. In: Computational Linguistics and Intellectual Technologies. Proceedings from the Annual International Conference “Dialogue”, 15. Moscow. pp. 616-631.
  20. Bogdanova-Beglarian, N., Sherstinova, T., Blinova, O. & Martynenko, G. (2017). Linguistic Features and Sociolinguistic Variability in Everyday Spoken Russian. In: Karpov A., Potapova R., Mporas I. (eds) Speech and Computer. SPECOM 2017. Lecture Notes in Computer Science. Vol. 10458. Springer: Cham. pp. 503-511.
  21. Troshchenkova, E.V. & Blinova, O.V. (2020). Pragmatic markers in the aspect of communicative alignment. Science Journal of Volgograd State University. Linguistics, 19(3), 49-58.
  22. Nippold, M.A. & Scott, C.M. (eds.). (2010). Expository discourse in children, adolescents, and adults: Development and disorders. New York: Psychology Press.
  23. Lundine, J.P. (2016). The Language of Learning: Expository Discourse and the Influences of Cognition and Language [dissertation]. Columbus: The Ohio State University publ.
  24. Iriskhanova, O.K., Kiose, M.I., Leonteva, A.V. & Agafonova, O.V. (2023). Vague reference in expository discourse: multimodal regularities of speech and gesture. In: Computational Linguistics and Intellectual Technologies. Proceedings from the Annual International Conference “Dialogue”, 22. Moscow. pp. 172-180.
  25. Schiffrin, D. (1987). Discourse Markers. Studies in Interactional Sociolinguistics. Vol. 5. Cambridge: Cambridge University Press.
  26. Blakemore, D. (2002). Relevance and Linguistic Meaning: The Semantics and Pragmatics of Discourse Markers. Cambridge: Cambridge University Press.
  27. Fraser, B. (2015). The combining of Discourse Markers. A beginning. Journal of Pragmatics, 86, 48-53.
  28. Putina, O.N. (2021). Functioning of discourse markers in a dialogue unity Request-Response (featuring Russian and English languages) [dissertation]. Perm. (In Russ.).
  29. Bulygina, T.V. & Shmelev, A.D. (1982). Dialogical functions of several types of questions. The Bulletin of the Russian Academy of Sciences. Studies in Literature and Language, 41(4), 314-326. (In Russ.).
  30. Curioni, A., Knoblich, G.K., Sebanz, N. & Sacheli, L.M. (2020). The engaging nature of interactive gestures. PLoS ONE, 15(4). https://doi.org/10.1371/journal.pone.0232128
  31. Bavelas, J.B., Chovil, N., Coates, L. & Roe, L. (1995). Gestures Specialized for Dialogue. Personality and Social Psychology Bulletin, 21(4), 394-405. https://doi.org/10.1177/0146167295214010
  32. Alibali, M.W., Spencer, R., Knox, L. & Kita, S. (2011). Spontaneous gestures influence strategy choices in problem solving. Psychological Science, 22, 1138-1144.
  33. Göksun, T., Goldin-Meadow, S., Newcombe, N. & Shipley, T. (2013). Individual differences in mental rotation: What does gesture tell us? Cognitive Processing, 14, 153-162.
  34. Kang, S., Tversky, B. & Black, J.B. (2014). Gesture and speech in explanations to experts and novices. Spatial Cognition and Computation, 15, 1-26.
  35. Stam, G. & Tellier, M. (2021). Gesture Helps Second and Foreign Language Learning and Teaching. In: Morgenstern, A. & Goldin-Meadow, S. (eds). Gesture in Language: Development Across the Lifespan. Berlin: Mouton de Gruyter. pp. 336-363.
  36. Beilock, S.L. & Goldin-Meadow, S. (2010). Gesture changes thought by grounding it in action. Psychological Science, 21, 1605-1611.
  37. Yasui, E. (2013). Collaborative idea construction: Repetition of gestures and talk in joint brainstorming. Journal of Pragmatics, 46(1), 157-172.
  38. Rodero, E. (2022). Effectiveness, Attractiveness, and Emotional Response to Voice Pitch and Hand Gestures in Public Speaking. Frontiers in Communication, 7, 869084. https://doi.org/10.3389/fcomm.2022.869084
  39. George, N. & Conty, L. (2008). Facing the gaze of others. Neurophysiologie Clinique, 38(3), 197-207.
  40. Brône, G., Oben, B., Jehoul, A., Vranjes, J. & Feyaerts, K. (2017). Eye gaze and viewpoint in multimodal interaction management. Cognitive Linguistics, 28(3), 449-483.
  41. Richardson, D.C. & Dale, R. (2005). Looking to understand: The coupling between speakers’ and listeners’ eye movements and its relationship to discourse comprehension. Cognitive Science, 29(6), 1045-1060.
  42. Clark, H.H. & Krych, M.A. (2004). Speaking while monitoring addressees for understanding. Journal of Memory and Language, 50(1), 62-81.
  43. Amati, F. & Brennan, S.E. (2018). Eye gaze as a cue for recognizing intention and coordinating joint action. In: Brône, G. & Oben, B. (eds.), Eye-tracking in Interaction: Studies on the role of eye gaze in dialogue, 21-46. Amsterdam / Philadelphia: John Benjamins Publ.
  44. Rossano, F. (2012). Gaze in conversation. In: J. Sidnell & T. Stivers (eds.) The handbook of conversation analysis. Malden, MA: Wiley-Blackwell. pp. 308-329.
  45. Goodwin, C. (1981). Conversational organization: Interaction between speakers and hearers. New York: Academic Press.
  46. Beattie, G.W. (1979). Planning units in spontaneous speech: Some evidence from hesitations in speech and speaker gaze direction in conversation. Linguistics, 17, 61-78.


Copyright (c) 2023 Kiose M.I., Leonteva A.V., Agafonova O.V., Petrov A.A.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
