Towards a New Linguistic Model for Detecting Political Lies

Abstract


The present study addresses the problem of how the two US presidential candidates Donald Trump and Hillary Clinton use statements judged to be false by the Politifact site while delivering their campaign speeches. Two corpora of Clinton’s and Trump’s alleged lies were compiled. Each corpus contained 16 statements judged to be false or ridiculously untrue (‘pants on fire’) by the Pulitzer Prize-winning site Politifact. Some statements were accompanied by the video recordings in which they appeared; others had no associated video recordings, either because they were tweets or because the events had not been recorded on YouTube or elsewhere. The present research made use of CBCA (Criteria-Based Content Analysis), but only as a stepping stone for building a new model of detecting lies in political discourse that suits the characteristics of campaign discourse. This furnished the qualitative dimension of the research. As for the quantitative dimension, data were analyzed using software, namely LIWC (Linguistic Inquiry and Word Count), with a focus on the content analysis of deception cues that can be matched with the computerized findings. When VSA (Voice Stress Analysis) was required, Praat was used. Statistical analyses were occasionally applied to support the results. The study concluded that the New Model (NM) is not context-sensitive, being a quantitative one, and is thus numerically oriented in its decisions. By contrast, when qualitative analysis intervenes, especially in examining Politifact rulings, context plays a crucial role in passing judgements on deceptive vs. non-deceptive discourse.

INTRODUCTION

Lying is usually defined as not telling the truth. However, what is more important than this simplistic definition is why lying has become significant in human communication. DePaulo et al (1996) maintain that people lie in 31 percent of their social interactions. Their study thus points to the amount of lying committed, but how can this amount be studied linguistically in political campaign discourse?

Although political campaign discourse is part of the overarching political discourse, its language is unique in that it possesses a number of characteristics. One feature, according to Emerich et al (2001), is the recurrence of imagery as a means of rendering the campaign discourse charismatic. Another feature is the use of the ‘consilience’ strategy, where the candidate-audience understanding stems from the mediation and embrace of different language, values and traditions in an attempt to encourage the listeners to remember the common principles shared by the candidate and voter (Frank and McPhail, 2005). A third feature, Fairclough maintains (2006), is that campaign language is capable of weaving visions and imaginaries which can change realities, obfuscate realities and construe them ideologically. A fourth is the topics that dominate campaign speeches. As Donella (1988) contends, campaign speeches serve as emotional triggers, spanning a range of issues such as the environment, taxes and good governance that can guarantee good jobs, among others. A final feature is the focus on populism as a discursive strategy that juxtaposes the virtuous populace with a corrupt elite and views the former as the sole legitimate source of political power (cf. Bonikowski and Gidron, 2015). Thus, campaign discourse is basically emotional and is geared towards canvassing support from voters. Given this picture of campaign discourse, it is legitimate to ask how presidential candidates can strike a balance between emotionalism and truth-telling.
They are required to be as persuasive as possible while at the same time sounding truthful. This inherently impinges on their ability to remain consistent and reliable all the time. The two US presidential candidates Hillary Clinton and Donald Trump are now running in the election as representatives of the Democratic Party and the Republican Party, respectively. The two nominees have delivered several speeches and written posts and tweets on social media in the course of their campaigning. These electioneering channels can be a rich source for examining whether they tell the truth or not.

A special site called Politifact (www.politifact.com) was set up years ago to gauge the veracity of American politicians’ releases. The site contains thousands of excerpts from past and present US politicians, including updates on Clinton’s and Trump’s statements. A Pulitzer Prize winner, the site claims that it adopts a criterion-based analysis of any statement. Such an analysis attempts to answer the following set of questions:

- Is the statement based on a fact that is subject to verification?
- Is the statement leaving a particular impression that may be misleading?
- Is the statement significant (barring slips of the tongue)?
- Is the statement likely to be carried over and repeated by others?
- Would a typical person hear or read the statement and wonder: Is that true?

The result is a meter that has six pointers as follows:

TRUE - The statement is accurate; nothing significant is missing.
MOSTLY TRUE - The statement is accurate but needs clarification or additional information.
HALF TRUE - The statement is partially accurate but leaves out important details or takes things out of context.
MOSTLY FALSE - The statement contains an element of truth but ignores critical facts that would give a different impression.
FALSE - The statement is not accurate.
PANTS ON FIRE - The statement is not accurate and makes a ridiculous claim.
The website claims that it is sometimes necessary to consider factors such as context, timing, promise-keeping, etc. At other times it resorts to acoustic analysis, as was done with a contentious statement by Clinton about raising taxes on the middle class that was flagged by Trump’s supporters.

Another dimension in the present research is the use of LIWC (Linguistic Inquiry and Word Count), developed and continually updated by Pennebaker and others since 2000. LIWC is a text-analysis tool that reads a given text and counts the percentage of words that reflect different emotions, thinking styles, social concerns, and even parts of speech. The part of this electronic tool most relevant to the present research is that it includes two dimensions that directly affect judgements on truth-telling, namely Authenticity and Emotional Tone. Authenticity refers to writing that is personal and honest. Emotional Tone is scored such that higher numbers are more positive and upbeat and lower numbers are more negative.

The present paper attempts to examine 16 statements for each candidate judged by Politifact as false (whether downright false or ‘pants on fire’). This study derives its significance from the fact that it provides a suitable vantage point for investigating the topic of lying in the context of political discourse, particularly the case of Clinton and Trump, as a major human interactive encounter. This is set within the context of contrasting two US candidates’ speeches with the aid of a linguistic model of analysis, which will eventually lead to a better understanding of the nature of lying as a verbal immediacy activity in political campaign discourse.

1. LINGUISTIC APPROACHES TO LIE DETECTION

Linguistic approaches to lie detection can be divided into three categories: communication approaches, disfluency-based approaches (usually acoustically oriented), and holistic approaches. The review below provides a bird’s eye view of the three approaches in turn.
Three studies can be subsumed under the communication category. The first is Zuckerman et al’s (1981). As early as 1981, Zuckerman et al focused on the meta-analysis of deception detection (traditionally known as the Four-Factor Theory), and stated that no cue or cues to deception could be accurate all the time because deception was an individual psychological process. The second is Newman et al’s (2003), where they investigated linguistic features that discern true from false stories. They applied a computerized analysis of five independent samples, achieving a classification of liars and truth-tellers at a rate of 67% when the topic was constant and a rate of 61% overall. When compared to truth-tellers, liars exhibited lower cognitive complexity, used fewer self-references and other-references, and showed a tendency towards more negative emotive words. The third is Zhou et al’s (2004a). They foregrounded the Interpersonal Deception Theory. The theory is based on the assumption that deceivers’ number of words, verbal self-distancing tactics, and use of adjectives and adverbs increase during a conversation. Thus, while communicating, deceivers use feedback from the recipient’s messages to modify their deception strategy. According to this theory, cues to deception are divided into three categories: verbal, nonverbal, and physiological.

Some other studies later laid much emphasis on disfluencies in speech, particularly pauses, as a viable linguistic marker of false statements. Anolli and Ciceri (1997) found that a longer time lapse occurs between a question and a lie than the response latency found in truthful statements. A major study in this direction is Benus et al. (2006), where they made use of a corpus of spontaneously recorded interviews to investigate the relationship between the distributional and prosodic features of silent and filled pauses and the interviewee’s intention to deceive the interviewer.
They concluded that the use of pauses correlated more with truthful than with deceptive speech. They also found that prosodic features extracted from filled pauses, as well as features describing contextual prosodic information in adjacent phonetic environments of the filled pauses, may facilitate the detection of lies in speech. Demenko (2008) attempted to introduce voice stress extraction and classification into the investigation of deceptive speech. She drew on the authentic Poznan police database of recordings from the 997 emergency phone line, selecting 20,000 recordings out of 60,000, of which around one hundred were acoustically analyzed. It was concluded that the range of fundamental frequency per se did not correlate with stress, whereas the shift in fundamental frequency register constituted the primary indicator of stress. Through Linear Discriminant Analysis based on 12 acoustic features, it was shown that it is possible to distinguish the categories of neutral, depressive, stressed and highly stressed speech.

Arciuli et al. (2009) followed suit and examined the frequency of use of the filler ‘um’ during lying versus truth-telling statements in two laboratory-elicited lies about a murder case. They found that, within participants, instances of ‘um’ were fewer during lying than during truth-telling. These results suggested that the frequency of ‘um’ can be reliably used to differentiate between deceptive and non-deceptive statements in ordinary communication. Therefore, the filler ‘um’ may not be accurately categorized as an instance of filled pauses, whose increase is proportionate to increased cognitive load. Rather, it may assume a lexical status similar to interjections, and so constitute an important part of authentic, natural communication.

Latency, or gaps in discourse, was also used in recent studies as another indicator of deceptive speech.
In fact, there are several studies in that domain; however, the best-known is Reynolds and Rendle-Short’s (2011). They adopted a rigorous methodological framework of conversation analysis (CA) as an analytic toolkit to demonstrate the importance of context, particularly interactional context, when researching cues to deception in order to understand whether there is a relationship between response latency and deception. They thus followed DePaulo et al. (2003)[37], who emphasized the interactional context in detecting lies in speech. Reynolds and Rendle-Short examined data from outside laboratory settings taken from The Jeremy Kyle Show, adopting strict criteria for the data collection. The criteria were based on how participants in the outside-laboratory interactions formulate their verbal output. Lies were detected according to the following criteria: (1) agreement by the liar that a lie had occurred; (2) explicit labelling of talk as lies by other participants; and (3) the liar’s ‘revision’ of a prior action, thereby changing the course of action, in a ‘lie relevant’ sequential context. They found that participants in the show could display a longer transition space to signal that a concessionary stance is close, or they could reduce the transition space to reduce the risk of an upcoming turn, which can be considered a concession.

Preferring an overall perspective, Kirchhübel and Howard (2011) explored the acoustic changes in speech in deceptive statements. Truthful, deceptive and control speech was collected from ten speakers during an interview. Results were displayed according to the parameters of fundamental frequency, intensity and vowel formants. They found that no significant correlation could be established for any of the acoustic features, a result that runs counter to many mainstream studies in the field.

The holistic approach, on the other hand, is adopted by Picornell (2012), where she examined deception in written witness statements.
She employed marked sentence structures to code discourse markers in written narratives, and mapped the progression of lying as it unfolded through the course of the narrative based on the interaction of linguistic cues. She found that what may be important is not the individual cues, but the way they are utilized. The same approach is also adopted by Burgoon et al (2012), where they focused on whether indicators of truth or deception are context-independent or context-sensitive. The factors they suggested are motivation and modality. A 2 (veracity: truthful/deceptive) by 2 (incentives: high/low) by 3 (modality: FtF/audio/text) factorial experiment revealed that linguistic indicators are significantly related to veracity, but that the results are highly sensitive to context.

In view of the previous review, there appears to be a gap in the studies that combine content analysis (i.e. the linguistic features of a potential liar’s output) with the prosodic features that verify spots in the speech that signal lying (i.e. latency responses, pauses, fillers, speech errors and the like) in political discourse. Bringing the two dimensions together in one project that studies lies committed by politicians in English would eventually enrich the field, and help formulate a new theoretical framework that can be applied in a wider context. The present research project is an attempt at studying how lies can be detected in human interactions, especially political discourse in English.

2. CONTEXT OF THE PROBLEM

The present study addresses the problem of how the two US presidential candidates Donald Trump and Hillary Clinton use statements judged to be false by Politifact while delivering their campaign speeches. A normal search through Google would yield 6 pages that provide discussions on how both candidates lie to their audiences, each page having 10 hits. This suggests that the candidates’ use of lies is a widely discussed phenomenon that merits further research.
However, there are few studies that tackle the presidential candidates’ lies. Wortham and Lorcher (1999) suggested embedded metapragmatics to investigate politicians’ lies by examining television network news coverage of the 1992 and 1996 US presidential campaigns. Their article describes an approach to the social functions of language, which draws heavily on Bakhtin, and gives a more formal account of embedded metapragmatic constructions. Another extended study is David Corn’s (2004) book entitled The Lies of George Bush. Although the book is an amalgam of Bush’s lies about health programs, Iraq and tax policies, it does not offer a linguistic approach that can be put to use in further analysis. Moreover, the tone of the book is polemic, and sometimes reads like a personal war. A third study, by Kangas (2014), focused on computerized analysis of politicians’ discourse, and touched on honesty as composed of the z-scores of exclusive words, references to self, references to others, motion words and negative emotion words. The paper did not allot ample space to deceptive discourse, having a major focus on how software could analyze political discourse. Therefore, it is important to draw attention to the impact of lies on the US candidates’ image, to how the amount of lying and/or truthfulness can be linguistically analyzed, and to how various linguistic tools can contribute to detecting these lies in the candidates’ speeches and, at times, tweets.

3. METHODS AND DATA

3.1. Corpus

Two corpora of Clinton’s and Trump’s alleged lies were compiled. Each corpus contained 16 statements judged to be false or ridiculously untrue (‘pants on fire’) by the Pulitzer Prize-winning site Politifact. Some statements were accompanied by the video recordings in which they appeared; others had no associated video recordings, either because they were tweets or because the events had not been recorded on YouTube or elsewhere.
All in all, the two corpora comprise 1536 words (639 for Clinton’s statements and 897 for Trump’s statements) and their 16 videos[38] are 7.02 minutes in total length (3.02 minutes for Clinton and 4 minutes for Trump).

3.2. A note on the method of analysis

3.2.1. Model of analysis

One major approach to lie detection is CBCA (Criteria-Based Content Analysis), one of the major elements of Statement Validity Assessment (SVA), a technique developed to determine the credibility of child witnesses’ testimonies in trials for sexual offenses and recently applied to assessing testimonies given by adults (cf. Raskin and Esplin 1991). The present research makes use of CBCA, but only as a stepping stone for building a new model of detecting lies in political discourse that suits the characteristics of campaign discourse. This will furnish the qualitative dimension of the research. As for the quantitative dimension, it will analyze data using software, namely LIWC, and will also focus on the content analysis of the deception cues that can be matched with the results obtained from computerized findings. When VSA (Voice Stress Analysis) is required, Praat will be used. Statistical analyses will also be occasionally applied to support the results.

Based on an extensive reading of the literature on the linguistic markers of deceptive speech, the holistic approach was favored for a number of reasons. First, the present model can be considered the first to subject political campaign speeches and/or posts and tweets to lie detection analyses. It is therefore difficult to zoom in on one aspect, such as acoustics, at the expense of the others. Second, the model adopted here is just a starting point that can be broadened to include other modifications, and it is therefore far from perfect. It just highlights how campaign discourse may diverge from the norms of truthful speech.
Third, the present model is adapted from Burgoon et al’s (2012) version, which is summarized in the following table.

Table 1 Linguistic classes and indicators

Linguistic Categories and Operationalizations of Indicators

Quantity: Refers to the length of an utterance, expressed at the lowest level in terms of morphemes and at the highest levels in terms of entire utterances or turns at talk
1. Syllables (morphemes and affixes)
2. Verbs (words that characteristically are the grammatical center of a predicate and express an act, occurrence, or mode of being)

Complexity: The degree to which a lexical item has few or many syllables (lexical complexity) or a sentence has few or many phrases and clauses (syntactic complexity)
1. Big words (# of words with 6 or more characters)
2. Readability (indices, e.g., Flesch-Kincaid or SMOG index, that measure reading grade level or difficulty of comprehending a segment of text)

Diversity: Degree to which a segment of text uses many unique words and phrases relative to the total number of words or phrases in it
1. Lexical diversity (total # of different words divided by total # of words, i.e., percentage of unique words in all words)

Specificity: Degree to which a segment of text is concrete and specific or abstract
1. Sensory details (sensory experiences such as sounds, smells, physical sensations and visual details)
2. Expressivity (a measure of vividness, quantified as the relationship of # of adjectives + # of adverbs, divided by # of nouns + # of verbs)

Uncertainty: Degree to which words or constructions introduce ambiguity in meaning
1. Modal verbs (auxiliary verbs like would, should, could that are characteristically used with a verb of predication)

Verbal Nonimmediacy: Language that expresses and creates psychological distance
1. Passive voice (form of a verb used when the subject is being acted upon rather than doing something)

Personalization: Pronoun use that increases the specificity of reference to self and others
1. Self-reference (first-person singular pronouns: I, me, my)
2. Second person reference (you-references)

Affect: Words and expressions that convey the subjective aspect of an emotion apart from bodily changes
1. Affect ratio (number of affect-laden words from a dictionary of affect terms relative to total number of words)
2. Pleasantness (positive or negative feelings associated with a term, based on pre-scaled dictionary of terms)

Activation: Degree of dynamism expressed by emotional terms, based on pre-scaled dictionary of terms

Informality: Degree of adherence to formal, standard language forms
1. Typographical errors (# of errors in written text)

Cognitive Processes: Terms describing the respondent’s thinking process (e.g., “thought”, “surmised”)

Cognitive Difficulty: Degree of nonfluencies in a segment of text
1. Filled pauses (um, er, ah, you know, and similar nonlexical expressions that do not disrupt the flow of speech and substitute for a silent pause)

The above table seems at first sight to be comprehensive, yet it contains a number of redundancies that can be conflated. For example, informality is not a viable marker of deception and can be excluded. The same is true for readability, which is measured for written texts only and can be difficult to apply to speeches. An alternative benchmark, as suggested by Burgoon and Qin (2006), is the average sentence length[39]. Moreover, the idea of relating cognitive difficulty to filled pauses runs counter to the view held by Arciuli et al (2009), according to which false statements usually contain fewer ‘um’ instances than truthful statements.
Finally, theirs being a predictive study, Burgoon and her colleagues did not include two important aspects: (a) the minimum amount (or percentage) of each feature that should be present for a statement to be false and (b) a rating scale that could locate the degree of veracity. The same problem is also detected in LIWC, where the scale from 0-100 cannot be reliable in cases where half of the statement is true and the rest is false. The present model thus adopted Vrij and Winkel’s (1991), Connell’s (2012) and Picornell’s (2012) results, which can be summarized in the following points:

1. Deceivers use fewer first-person pronouns than truth tellers.
2. Deceivers use more words and more exact language (psychological distancing) than truth tellers.
3. Deceivers’ language is simpler (shorter clauses) than that of truth tellers.
4. Deceivers are more uncertain (passive voice usage).
5. Deceivers exhibit a higher cognitive load (through simpler structures and cognitive verbs).
6. Deceivers exhibit more tension through higher pitch.

Therefore, for the purposes of the present research, the following table summarizes the new model with the scale included:

Table 2 A modified version of Burgoon et al’s (2012) model (the New Model)

Indicator/Marker | Truthful | Half-Truthful | False | Ridiculously False

1. Complexity: The degree to which a lexical item has few or many syllables (lexical complexity) or a sentence has few or many phrases and clauses (syntactic complexity)
a. Big words (more than 6 characters or three syllables, excluding proper names) | 100-89% | 90-59% | 60-10% | 9-0%
b. Average sentence length (relative to longest sentence in the same piece of discourse) | 100-89% | 90-59% | 60-10% | 9-0%

2. Specificity: Degree to which a segment of text is concrete and specific or abstract
a. Sensory details (sensory experiences such as sounds, smells, physical sensations and visual details) | 100-89% | 90-59% | 60-10% | 9-0%
b. Lexical density (a measure of vividness, quantified as the relationship of # of adjectives + # of adverbs divided by # of nouns + # of verbs) | 0-10% | 11-60% | 61-90% | 91-<100%

3. Uncertainty: Degree to which words or constructions introduce ambiguity in meaning
a. Modal verbs | 0-10% | 11-60% | 61-90% | 91-<100%
b. Qualifiers like ‘somewhat’, ‘maybe’, etc. | 0-10% | 11-60% | 61-90% | 91-<100%

4. Verbal Non-immediacy: Terms or constructions that express and create psychological distance
a. Passive voice | 0-10% | 11-60% | 61-90% | 91-<100%

5. Personalization: Pronoun use that increases the specificity of reference to self and others
a. Self-reference | 100-89% | 90-59% | 60-10% | 9-0%
b. Second and third person references | 0-10% | 11-60% | 61-90% | 91-<100%

6. Emotiveness: Words or terms that convey emotions
a. Affect ratio (number of affect-laden words from a dictionary of affect terms relative to total number of words) | 0-10% | 11-60% | 61-90% | 91-<100%

7. Cognitive process terms: Terms describing the respondent’s thinking process (e.g., “thought,” “surmised”) | 0-10% | 11-60% | 61-90% | 91-<100%

8. VSA (voice stress analysis): Acoustic features that signal tension on the part of the deceiver
a. Higher pitch (means are calculated; a pitch amounts to zero if below 65 Hz for males and if below 100 Hz for females*) | 0-10% | 11-60% | 61-90% | 91-<100%
b. Fillers, especially ‘um’ | 0-10% | 11-60% | 61-90% | 91-<100%

Total = degree of veracity | 100% | 99-50% | 49-5% | 4-0%

* According to Pernet and Belin’s (2012) study.

It is clear from the above table that eight indicators are adopted in the present model. They have been adapted from Burgoon et al’s (2012) version.
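Several of the indicators in Table 2 are simple surface counts. As a rough illustration only, the sketch below computes three of them for a short text: the share of big words (more than 6 characters), the average sentence length relative to the longest sentence, and the share of first-person singular pronouns. The function names, the tokenization rule and the naive sentence splitter are assumptions of this sketch, not part of the model as described; the syllable criterion and the exclusion of proper names are omitted for brevity.

```python
import re

# First-person singular forms listed under Self-reference in Table 1.
SELF_REFERENCES = {"i", "me", "my"}


def tokenize(text):
    """Return lowercase word tokens (naive rule, an assumption of this sketch)."""
    normalized = text.lower().replace("\u2019", "'")  # treat curly apostrophes as straight
    return re.findall(r"[a-z]+(?:'[a-z]+)?", normalized)


def split_sentences(text):
    """Split after '.', '!' or '?' followed by whitespace (naive, an assumption)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def surface_indicators(text):
    """Compute three hypothetical surface indicators as percentages."""
    words = tokenize(text)
    lengths = [len(tokenize(s)) for s in split_sentences(text)]
    lengths = [n for n in lengths if n > 0]
    if not words or not lengths:
        raise ValueError("text contains no words")
    big = sum(1 for w in words if len(w) > 6)  # 'more than 6 characters'
    self_refs = sum(1 for w in words if w in SELF_REFERENCES)
    mean_length = sum(lengths) / len(lengths)
    return {
        "big_words_pct": 100.0 * big / len(words),
        # average sentence length relative to the longest sentence
        "avg_sentence_len_pct": 100.0 * mean_length / max(lengths),
        "self_reference_pct": 100.0 * self_refs / len(words),
    }


if __name__ == "__main__":
    sample = ("I think this is a major challenge and I want us to address it. "
              "Not one word from the other side.")
    for name, value in surface_indicators(sample).items():
        print(f"{name}: {value:.2f}")
```

Indicators that need part-of-speech information (lexical density, modal verbs, passive voice) or an affect dictionary would require a tagger or lexicon and are deliberately left out of this sketch.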
Some indicators follow a reverse order of intensity on the scale from truthful to ridiculously false, since deceivers may have fewer self-references than truth-tellers, yet they may have more cognitive verbs such as ‘think’, ‘believe’, ‘guess’, etc. In any event, the new model is a so-called ‘test-bed’ for manually checking veracity in political campaign discourse, and will be compared with LIWC and Politifact judgements. It is noteworthy that the degree of veracity is calculated by summing up the percentages obtained in all the indicators. Then the total is divided by the 11 indicators and sub-indicators. Where there is no video available to measure pitch, the pitch indicator is excluded and the degree is calculated relative to 10 indicators only.

4. Data Analysis

The analysis of the data follows a three-way measure:

1. New Model-LIWC Agreement/Discrepancy
2. New Model-Politifact Agreement/Discrepancy
3. LIWC-Politifact Agreement/Discrepancy

Under each of the first two sections, the eight indicators will be examined.

4.1. New Model-LIWC Agreement/Discrepancy

The New Model (henceforth NM) differs considerably from the LIWC tool. The following table summarizes the results obtained in both NM and LIWC for Clinton’s statements.

Table 3 NM and LIWC results for Clinton’s statements*

Statement | NM | LIWC
Benghazi | 22.22 | 35.4
FBI | 25.76 | 99.9
GOP | 17.5 | 37.2
Mortgage | 12.97 | 2.1
Gun factory | 14.18 | 1.0
Healthcare | 14.93 | 67.3
ISIS | 12.63 | 50.4
Legislation | 14.22 | 78.9
Hampshire | 17.64 | 20.2
Oil | 21.67 | 98.0
Sanders | 15.11 | 96.0
Scott | 17.47 | 32.4
Not a thing in America | 20.06 | 1.0
Education | 25.90 | 2.4
Clean Power | 22.31 | 1.0
Emails | 15.92 | 43.4

* Statements are named after their central themes. For verbatim transcripts of Clinton’s statements selected, visit Politifact’s website: http://www.politifact.com/personalities/hillary-clinton/statements/byruling/false.

It is clear from the above table that 3 statements are judged by LIWC to be half-truthful, i.e. scoring between roughly 96 and 99.9%, while they are labeled false by NM. This discrepancy is not just found in the direction of truthfulness, so to say, but it also figures clearly in the direction of ridiculously false statements. Thus, 4 statements are judged as ridiculously false by LIWC while they are only false as labeled by NM. The problem is one of degree. If the rating scale proposed by NM is applied, then the above discrepancies are obviously problematic, since a statement cannot be true and false at the same time. The scale proposed in NM can be illustrated below:

100 | 99-50 | 49-5 | 4-0
Truthful | Half-truthful | False | Ridiculously false

Fig. 1: An envisaged continuum of the NM veracity scale

This leads to considering 18.75% of LIWC results as completely inaccurate and 25% as partially inaccurate. In the first case, the discrepancy points to false statements being judged as truthful, while in the second case, a statement is false but is labeled as ridiculously false. However, if taken from the point of view of LIWC, a statement is false if it does not attain 100% on its scale. In view of this, the above discrepancy vanishes, but the question of degree is not fully tackled. In other words, a statement which attains 99.9% on the LIWC scale cannot be true although it has only a fraction left to be true. This interpretation causes the 99.9% statements to be equal to the 1.0% statements, which is a baffling decision. The same is true for statements which are considered half-truthful from the point of view of NM: they range from 65 to 79%, and are false according to LIWC, though their veracity is more than their falsehood. As for the rest of the statements, which are judged by both NM and LIWC to be false, they suffer from the same obstacle of degree. A statement, for example, can be 17.5 on the NM scale but 37.2 on LIWC. The net result is that both are false, yet they are not on a par with each other on the ‘falsity scale’, so to speak.
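The averaging rule and the veracity scale described above can be sketched as follows. This is a minimal illustration, not the study's own implementation: the function names are hypothetical, a missing indicator (such as pitch when no video exists) is represented by `None`, and because the published scale leaves small gaps between bands (for example between 99 and 100), the sketch assumes that 100, 50 and 5 act as lower bounds of the bands.

```python
def degree_of_veracity(indicator_scores):
    """Average the available indicator percentages.

    indicator_scores: list of percentages, with None for an indicator that
    could not be measured (e.g. pitch when no video is available).
    """
    available = [score for score in indicator_scores if score is not None]
    if not available:
        raise ValueError("no indicator scores available")
    return sum(available) / len(available)


def nm_label(score):
    """Map a 0-100 score onto the NM scale (band edges are an assumption)."""
    if score >= 100:
        return "truthful"
    if score >= 50:
        return "half-truthful"
    if score >= 5:
        return "false"
    return "ridiculously false"


if __name__ == "__main__":
    # Scores taken from Table 3: (statement, NM, LIWC).
    for name, nm, liwc in [("Hampshire", 17.64, 20.2), ("FBI", 25.76, 99.9)]:
        print(name, "| NM:", nm_label(nm), "| LIWC read on the NM scale:", nm_label(liwc))
```

Reading a LIWC score through the same bands is what makes the discrepancy visible: the Hampshire statement lands in the same band under both tools, whereas the FBI statement lands in different bands.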
A similar situation is found in analyzing Trump’s statements. The following table summarizes the NM and LIWC results for Trump’s statements:

Table 4 NM and LIWC results for Trump’s statements*

Statement | NM | LIWC
Clinton campaign | 11.99 | 63.5
Coal | 16.23 | 86.4
Cruz | 13.96 | 2.8
Economy | 10.67 | 17.0
FBI | 23.85 | 1.0
Freddie | 15.05 | 3.0
Iran | 17.56 | 33.6
Iraq | 14.85 | 96.2
ISIS | 11.53 | 1.0
ISIS foundation | 19.63 | 1.0
Marshal | 21.39 | 41.4
Money laundering | 10.75 | 1.0
Muslims | 17.0 | 36.4
Obamacare | 20.15 | 7.2
Ohio | 15.31 | 8.3
Second amendment | 23.51 | 1.0

* Statements are named after their central themes. For verbatim transcripts of Trump’s statements selected, visit Politifact’s website: http://www.politifact.com/personalities/donald-trump/statements.

It is clear from the above table that 3 statements are judged by LIWC to be half-truthful, i.e. scoring between roughly 63 and 96%, while they are labeled false by NM. This discrepancy is not just found in the direction of truthfulness, so to say, but it also figures clearly in the direction of ridiculously false statements. Thus, 6 statements are judged as ridiculously false by LIWC while they are only false as labeled by NM. The problem is again one of degree. The conclusion is similar to the one reached when discussing Clinton’s statements: 18.75% of LIWC’s results are completely inaccurate and 37.5% are partially inaccurate. In the first case, the discrepancy points to false statements being judged as truthful, while in the second case, a statement is false but is labeled as ridiculously false. However, if taken from the point of view of LIWC, a statement is false if it does not attain 100% on its scale.

Statistics can come to the aid of the analysis at this point. The ANOVA analysis yields the following two tables:

Table 5 ANOVA results for NM (Clinton and Trump)

Source | SS | df | MS | F | p
Between | 22.884 | 1 | 22.884 | 1.229 | 0.276
Within | 558.527 | 30 | 18.618 | |
Total | 581.411 | 31 | | |

p > 0.05: not significant (significance would require p below 0.05).
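The decomposition in Table 5 can be retraced from the NM columns of Tables 3 and 4. The sketch below is a plain-Python one-way ANOVA with the two candidates as groups; the function and variable names are illustrative assumptions. It computes the sums of squares, degrees of freedom and F only; obtaining p would additionally require the F distribution, which is not implemented here.

```python
# NM scores copied from Table 3 (Clinton) and Table 4 (Trump).
clinton_nm = [22.22, 25.76, 17.5, 12.97, 14.18, 14.93, 12.63, 14.22,
              17.64, 21.67, 15.11, 17.47, 20.06, 25.90, 22.31, 15.92]
trump_nm = [11.99, 16.23, 13.96, 10.67, 23.85, 15.05, 17.56, 14.85,
            11.53, 19.63, 21.39, 10.75, 17.0, 20.15, 15.31, 23.51]


def mean(values):
    return sum(values) / len(values)


def one_way_anova(*groups):
    """Return SS between, SS within, df between, df within and F."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-groups: squared distance of each group mean from the grand mean,
    # weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups: squared distance of each score from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_value


if __name__ == "__main__":
    ss_b, ss_w, df_b, df_w, f_value = one_way_anova(clinton_nm, trump_nm)
    print(f"SS between = {ss_b:.3f} (df {df_b})")
    print(f"SS within  = {ss_w:.3f} (df {df_w})")
    print(f"F          = {f_value:.3f}")
```

Run on these data, the sketch should land close to the SS and F values reported in Table 5, up to rounding. Note that the grouping variable is the candidate, so the test compares Clinton's scores with Trump's within one tool.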
Table 6
ANOVA results for LIWC (Clinton and Trump)

Source     SS            df    MS           F        p
Between    2,207.801     1     2,207.801    1.859    0.183
Within     35,625.010    30    1,187.500
Total      37,832.811    31

p > 0.05: not significant (p would have to be less than 0.05 to be significant).

It is clear that p is not significant in either case, i.e. for NM across Clinton’s and Trump’s statements and for LIWC across both candidates. Statistically, this means that NM and LIWC are equal in their judgements when broadly compared according to the ANOVA results. However, if this mode of analysis is the only one adopted, the details are not fully addressed. Table 3 above shows that only one statement appears to receive similar judgements by NM and LIWC, namely the Hampshire one: it scores 17.64 and 20.2 on NM and LIWC, respectively. The difference of 2.56 can be considered negligible, and this can be considered the only point of agreement between NM and LIWC.

4.2. New Model-Politifact Agreement/Discrepancy

In this section, quantitative analysis is not possible, since Politifact does not provide numerical figures that can be set side by side with the NM results. The alternative, by nature, is qualitative analysis. The following table summarizes the qualitative results of both NM and Politifact for Clinton’s statements:

Table 7
NM and Politifact results for Clinton’s statements

Statement                 NM       Politifact
Benghazi                  False    False
FBI                       False    False
GOP                       False    False
Mortgage                  False    False
Gun factory               False    False
Healthcare                False    False
ISIS                      False    False
Legislation               False    False
Hampshire                 False    False
Oil                       False    False
Sanders                   False    False
Scott                     False    False
Not a thing in America    False    False
Education                 False    False
Clean Power               False    False
Emails                    False    False

It is clear that the results of both NM and Politifact are identical. The discrepancies detected in LIWC are not there. The sole comment that can be made is related to the indicators of Lexical Density and VSA in NM.
In 81.25% of the statements examined, Lexical Density scores point to the falsity of the statements in question, but the remaining 18.75% point to ridiculously false statements according to NM. Consider, for example, the following statement by Clinton:

“I think this is a major challenge and I want us to address it. Not one word from the other side. And you take somebody like Governor Walker of Wisconsin, who seems to be delighting in slashing the investment in higher education in his state. And most surprisingly to me, rejecting legislation that would have made it tax deductible for you, on your income tax, to deduct the amount of your loan payments. I don’t know why he wants to raise taxes on students. But that’s the result when you don’t look for ways to help people who are not sitting around asking for something, who are actually working hard every day to get ahead.”

This long statement has a Lexical Density score of 93.3%, being full of verbs and nouns. The problem is that the higher the lexical density, the higher the falsity score a statement attains (where details are provided to cover up any misinformation). According to NM, this statement is ridiculously false, while Politifact judges it false due to its context. Politifact maintains that it is true that Governor Walker did not publicly support the Democratic-sponsored measures that would have provided the tax deduction, but he had never rejected such legislation, either. This inherently means that Clinton passed judgement without a sufficient amount of information. In a sense, the details of the indicators would at times point to judgements that are different from the overall decision of whether a statement is false or not, and this is where context plays its role.

As for the VSA scores, the NM provides a mean of 44.87%, which indicates that Clinton’s statement is half-truthful. The upper bound for female voice pitch is 525 Hz, while the lower bound is 100 Hz.
A Praat spectrogram has been created for a section of this statement as follows:

Fig. 2: A spectrogram for the first part of Clinton’s example statement

In this illustration, the blue streaks refer to pitch contours: they range from 239 Hz to 148 Hz. This means that Clinton is not stressed; she speaks normally. Yet, later in the same segment, she starts to lose control and shout:

Fig. 3: A spectrogram for the second part of Clinton’s example statement

The pitch contours change from 239 Hz to 394.1 Hz, which indicates emotional speech, and thus the deceptive part starts at the extract “who seems to be delighting in slashing the investment in higher education in his state”. This is exactly what Politifact states about the context of Clinton’s judgement: Governor Walker remained silent about the tax decision; he was neither delighted nor averse. This also tallies with Demenko’s (2008) study of pitch contours in stressed males and females, which reveals that the average frequency for extremely stressed females is 366 Hz. Stress is a major indicator of deception (cf. Ekman, 1991).

As for Trump’s statements, the following table summarizes the qualitative results of both NM and Politifact:

Table 8
NM and Politifact results for Trump’s statements

Statement            New Model    Politifact
Clinton campaign     False        Ridiculously false
Coal                 False        False
Cruz                 False        Ridiculously false
Economy              False        Mostly false
FBI                  False        False
Freddie              False        False
Iran                 False        False
Iraq                 False        False
ISIS                 False        Ridiculously false
ISIS foundation      False        False
Marshal              False        Ridiculously false
Money laundering     False        False
Muslims              False        False
Obamacare            False        False
Ohio                 False        False
Second amendment     False        False

There are five discrepancies, which means that 31.25% of NM decisions are not accurate. Lexical Density scores point to the falsity of most of the statements in question, but the remaining 37.50% point to ridiculously false statements according to NM. Again, context has to be taken into account.
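The 31.25% figure can be reproduced by counting the rows of Table 8 in which the two labels differ. This is a minimal sketch using the labels exactly as listed in the table; the variable names are illustrative, and treating ‘Mostly false’ and ‘Ridiculously false’ as mismatches with NM’s ‘False’ follows the count of five discrepancies given above.

```python
# Politifact rulings from Table 8; NM labels every statement "False".
NM_LABEL = "False"
politifact = {
    "Clinton campaign": "Ridiculously false",
    "Coal": "False",
    "Cruz": "Ridiculously false",
    "Economy": "Mostly false",
    "FBI": "False",
    "Freddie": "False",
    "Iran": "False",
    "Iraq": "False",
    "ISIS": "Ridiculously false",
    "ISIS foundation": "False",
    "Marshal": "Ridiculously false",
    "Money laundering": "False",
    "Muslims": "False",
    "Obamacare": "False",
    "Ohio": "False",
    "Second amendment": "False",
}

# A discrepancy is any statement whose Politifact ruling differs from NM's.
mismatches = [name for name, ruling in politifact.items() if ruling != NM_LABEL]
rate = 100 * len(mismatches) / len(politifact)
print(mismatches, rate)
```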
Recourse to VSA might clarify the moot point. One case in point is the statement accusing marshals in Colorado and Ohio of incompetence. The following spectrogram illustrates the variations in pitch:

Fig. 4: A spectrogram for Trump’s statement about marshals

Trump’s pitch oscillates between 279 Hz and 312 Hz, especially when he speaks about fire marshals. This tallies with Demenko’s (2008) study of pitch contours in stressed males and females, which reveals that the average frequency for extremely stressed males is 238 Hz. Stress is a major indicator of deception (cf. Ekman, 1991). As a concluding remark for this section, it is important to juxtapose context and VSA in order to achieve a sound judgement in deceptive speech analysis. Relying on Lexical Density and/or context alone would lead to erroneous decisions.

4.3. LIWC-Politifact Agreement/Discrepancy

Here, again, quantitative analysis is not possible. The following table summarizes the LIWC and Politifact results for Clinton’s statements:

Table 9
LIWC and Politifact results for Clinton’s statements

Statement                 LIWC according to NM scale    LIWC     Politifact
Benghazi                  False                         False    False
FBI                       Half-truthful                 False    False
GOP                       False                         False    False
Mortgage                  Ridiculously false            False    False
Gun factory               Ridiculously false            False    False
Healthcare                Half-truthful                 False    False
ISIS                      Half-truthful                 False    False
Legislation               Half-truthful                 False    False
Hampshire                 False                         False    False
Oil                       Half-truthful                 False    False
Sanders                   Half-truthful                 False    False
Scott                     False                         False    False
Not a thing in America    Ridiculously false            False    False
Education                 Ridiculously false            False    False
Clean Power               Ridiculously false            False    False
Emails                    False                         False    False

The two columns provided for the LIWC decisions are meant to show that according to the scale proposed under section 4.1, discrepancy is easily detected, but according to the ‘loose’ criteria of LIWC (where the two extremes 0 and 100 are at work), the
discrepancy is absent. As for the first column, this is a glaring example of discrepancy. The LIWC results point to six statements that are half-truthful, which means that more than 50% of each statement is true. Since LIWC does not provide detailed results for its ‘authenticity’ indicator, it is clear that there is a major problem with the program. Even false statements are considered ridiculously false in five cases out of sixteen. It can be said that LIWC vacillates between the two extremes of truthful and ridiculously false without an intermediate level. The reason for this is two-fold. First, LIWC, like the present NM, is not context-sensitive. Second, according to the developers of LIWC, Pennebaker et al (2015), the program has mean standard deviations (SD) of 0.70 and 0.32% for certainty and anxiety, respectively. The two dimensions are closely related in the study of deceptive discourse, and the above statements might have fallen within this level of SD. A similar situation is found in Trump’s statements.
The following table summarizes the LIWC and Politifact results for Trump’s statements:

Table 10
LIWC and Politifact results for Trump’s statements

Statement            LIWC according to NM scale    LIWC     Politifact
Clinton campaign     Half-truthful                 False    Ridiculously false
Coal                 Half-truthful                 False    False
Cruz                 Ridiculously false            False    Ridiculously false
Economy              False                         False    Mostly false
FBI                  Ridiculously false            False    False
Freddie              Ridiculously false            False    False
Iran                 False                         False    False
Iraq                 Half-truthful                 False    False
ISIS                 Ridiculously false            False    Ridiculously false
ISIS foundation      Ridiculously false            False    False
Marshal              False                         False    Ridiculously false
Money laundering     Ridiculously false            False    False
Muslims              False                         False    False
Obamacare            Ridiculously false            False    False
Ohio                 Ridiculously false            False    False
Second amendment     Ridiculously false            False    False

The two columns provided for the LIWC decisions are meant to show that according to the scale proposed under section 4.1, discrepancy is easily detected, but according to the ‘loose’ criteria of LIWC (where the two extremes 0 and 100 are at work), the discrepancy occurs in 5 cases out of 16, i.e. 31.25%. As for the column labeled ‘LIWC according to NM scale’, the discrepancy here might be located within the sub-scale of falsity. Seven cases of Politifact’s false statements are judged by LIWC as ridiculously false. Again, this can be attributed to the SD of certainty and anxiety as provided by the LIWC developers.

5. CONCLUSIONS

The analysis of the data in the previous section and sub-sections leads to a number of conclusions. First, it is clear that NM is not context-sensitive, being a quantitative model, and is thus numerically oriented in its decisions. The comparisons carried out show that the present model is capable of making the falsity decision correctly in all the cases, unlike LIWC. The point that merits discussion is the degrees that need to be proposed for each point in the NM scale.
From a semantic point of view, the two extremes ‘true’ and ‘false’ are binary antonyms, not subject to midway shades. However, the demands of accuracy necessitate that such shades or points be either added or taken into consideration. In a sense, a statement that scores, for example, 51% on the NM scale is false despite the fact that it has a 49% residue of truth within it. Thus, even when the ‘loose’ LIWC 0-100 scale is subdivided into bands such as 50-5%, the problem of gradation still persists. This points to the necessity of subdividing each of the NM points into further sub-points such as ‘fully truthful’, ‘mostly truthful’, ‘fully false’, ‘mostly false’, etc. The same is mostly true for Lexical Density. Although this indicator easily detects false statements based on its numerical value, there are cases where it labels statements as ‘ridiculously false’ when they are just false.

Second, when qualitative analysis intervenes, especially in examining Politifact rulings, context plays a crucial role in passing judgements on deceptive vs. non-deceptive discourse. The numerical values obtained from both NM and LIWC are at stake in this way, and VSA can be used to detect how far contextual analysis is capable of standing the test of falsity vs. truthfulness. However, the main disadvantage of VSA is that it is also ‘loose’ in that the values obtained from pitch contours are indicative of tension as broadly associated with stressed-out liars. Stress can likewise affect truthful speakers, especially when faced with unusual situations or when interrogated, for example.

Third, emotions and authenticity are provided as two separate dimensions in LIWC, but in NM, despite being different indicators, their sum is used to reach the final decision on whether a statement is deceptive or not. This means that it is left to the judgement and professionalism of the LIWC user to consider the two dimensions together when passing his/her judgement.
In the case of NM, in contrast, the two indicators cannot be separated except for statistical purposes. The question is whether LIWC acknowledges emotions as indicative of deception or not. Leaving this question open gives LIWC an edge over NM and other models, since it is yet to be decided in the literature whether tension is necessarily a sign of deceptive discourse.

In view of these conclusions, there are some limitations to NM. It is a proposed model, subject to testing in other mainstream instances. The real test of NM is whether it can be put to use in socio-political situations such as parliamentary and presidential campaigning in both the US and non-Anglophone countries. The subdivisions of the falsity and truth degrees may also be a major improvement in terms of accuracy. Moreover, the comparisons with LIWC and Politifact showed that context is as important as numerical figures.

© Amr M. El-Zawawy, 2017

Amr M El-Zawawy

Alexandria University

Email: amrzuave@yahoo.com
El-Guish Road, El-Shatby, 21526 Alexandria, Egypt

  • All False statements involving Hillary Clinton (Accessed on August 02, 2016). Retrieved from http://www.politifact.com/personalities/hillary-clinton/statements/byruling/false
  • All False statements involving Donald Trump (Accessed on August 03, 2016). Retrieved from http://www.politifact.com/personalities/donald-trump/statements
  • Anolli, L., & Ciceri, R. (1997). The Voice of Deception: Vocal Strategies of Naive and Able Liars. Journal of Nonverbal Behavior, 21, 259-284
  • Arciuli, J., Villar, G., & Mallard, D. (2009). Lies, Lies and More Lies. Proceedings of the 31st Annual Conference of the Cognitive Science Society (CogSci 2009), 2329-2334
  • Vrij A. (2000) Detecting Lies and Deceit: The Psychology of Lying and the Implications for Professional Practice. New York: John Wiley & Sons
  • Benus, S., Enos, F., Hirschberg, J., & Shriberg, E. (2006, May). Pauses in Deceptive Speech. Speech Prosody, vol. 18, 2-5
  • DePaulo, B.M., Kashy, D.A., Kirkendol, S.E., Wyer, M.M. and Epstein J.A. (1996) Lying in Everyday Life. Journal of Personality and Social Psychology, vol. 70, 979-995
  • DePaulo, B.M., Lindsay, J.J., Malone, B.E., Muhlenbruck, L., Charlton, K. and Cooper, H. (2003) Cues to Deception. Psychological Bulletin, Vol. 129, 74-118
  • Bonikowski, B., & Gidron, N. (2015). The Populist Style in American Politics: Presidential Campaign Discourse, 1952-1996. Social Forces, sov. 120
  • Burgoon, J.K., & Qin, T. (2006). The Dynamic Nature of Deceptive Verbal Communication. Journal of Language and Social Psychology, 25(1), 76-96
  • Burgoon, J.K., Hamel, L., & Qin, T. (2012). Predicting Veracity from Linguistic Indicators. European Intelligence and Security Informatics Conference (EISIC)
  • Connell, C. (2012) Linguistic Cues to Deception. MA thesis, Virginia Polytechnic Institute and State University. (Accessed on August 22, 2016). Retrieved from https://vtechworks.lib.vt.edu/bitstream/handle/10919/32465/Connell_CA_T_2012rev.pdf?sequence=4&isAllowed=y
  • Corn, D. (2004). The lies of George W. Bush: Mastering the politics of deception. Crown
  • Demenko, G. (2008, May). Voice Stress Extraction. Speech Prosody, 6-9
  • Donella, M. (1988) A Guide to American Campaign Language. (Accessed on July, 2016). Retrieved from: http://www.sustainabilityinstitute.org/dhm_archive/search.php?display_article=vn251languageed
  • Ekman, P. (1991). Telling Lies: Clues to Deceit in the Marketplace, Politics, and Marriage. WW Norton & Company
  • Emrich, C.G., Brower, H.H., Feldman, J.M., & Garland, H. (2001). Images in Words: Presidential Rhetoric, Charisma, and Greatness. Administrative Science Quarterly, 46(3), 527-557
  • Fairclough, N. (2006) Tony Blair and the Language of Politics. UK: Routledge
  • Frank, D.A., & McPhail, M.L. (2005). Barack Obama's Address to the 2004 Democratic National Convention: Trauma, Compromise, Consilience, and the (Im)possibility of Racial Reconciliation. Rhetoric & Public Affairs, 8(4), 571-593
  • Kangas, S.E. (2014). What Can Software Tell us About Political Candidates?: A Critical Analysis of a Computerized Method for Political Discourse. Journal of Language and Politics, 13(1), 77-97
  • Kirchhübel, C., & Howard, D.M. (2013). Detecting Suspicious Behaviour Using Speech: Acoustic Correlates of Deceptive Speech - An Exploratory Investigation. Applied Ergonomics, 44(5), 694-702
  • Newman, M.L., Pennebaker, J.W., Berry, D.S., & Richards, J.M. (2003). Lying Words: Predicting Deception from Linguistic Styles. Personality and Social Psychology Bulletin, 29, 665-675
  • Pennebaker, J.W., Boyd, R.L., Jordan, K., & Blackburn, K. (2015). The Development and Psychometric Properties of LIWC 2015. UT Faculty/Researcher Works
  • Pernet, C.R., & Belin, P. (2012). The Role of Pitch and Timbre in Voice Gender Categorization. Frontiers in psychology, 3, 23
  • Picornell, I. (2013). Analysing Deception in Written Witness Statements. Linguistic Evidence in Security, Law and Intelligence, 1(1), 41-50
  • Raskin, D., & Esplin, P. (1991). Statement Validity Assessment: Interview Procedures and Content Analysis of Children’s Statements of Sexual Abuse. Behavioral Assessment, 13, 265-291
  • Reynolds, E., & Rendle-Short, J. (2011). Cues to Deception in Context: Response Latency/Gaps in Denials and Blame Shifting. British Journal of Social Psychology, 50(3), 431-449
  • Vrij, A., & Winkel, F.W. (1991). Cultural Patterns in Dutch and Surinam Nonverbal Behavior: An Analysis of Simulated Police/Citizen Encounters. Journal of Nonverbal behavior, 15(3), 169-184
  • Wortham, S., Locher, M. (1999). Embedded Metapragmatics and Lying Politicians. Language & Communication, 19(2), 109-125
  • Zhou, L., Burgoon, J.K., Twitchell, D.P., Qin, T.T., and Nunamaker, J.F., Jr. (2004) A Comparison of Classification Methods for Predicting Deception in Computer-Mediated Communication. Journal of Management Information Systems, 20, 4, 139-165
  • Zuckerman, M., DePaulo, B.M., & Rosenthal, R. (1981). Verbal and Nonverbal Communication of Deception. In L. Berkowitz (Ed.). Advances in Experimental Social Psychology, Vol. 14, 1-59. New York: Academic Press
  • Praat: doing phonetics by computer. Downloaded from http://www.fon.hum.uva.nl/praat



Copyright (c) 2017 El-Zawawy A.M.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.