TOWARDS A NEW LINGUISTIC MODEL FOR DETECTING POLITICAL LIES

The present study addresses the problem of how the two US presidential candidates Donald Trump and Hillary Clinton use statements judged to be false by the Politifact site while delivering their campaign speeches. Two corpora of Clinton’s and Trump’s alleged lies were compiled. Each corpus contained 16 statements judged to be false or ridiculously untrue (‘pants on fire’) by the Pulitzer Prize Winner site Politifact. Some statements were accompanied by the video recordings where they appeared; others had no video recordings affiliated because they are either tweets or their events had not been recorded on Youtube or elsewhere. The present research made use of CBCA (Criteria-based Content Analysis) but as a stepping stone for building a new model of detecting lies in political discourse to suit the characteristics of campaign discourse. This furnished the qualitative dimension of the research. As for the quantitative dimension, data were analyzed using software, namely LIWC (Linguistic Inquiry & Word Count), and also focused on the content analysis of the deception cues that can be matched with the results obtained from computerized findings. When VSA (Voice Stress Analysis) was required, Praat was used. Statistical analyses were occasionally applied to reach highly accurate results. The study concluded that the New Model (NM) is not context-sensitive, being a quantitative one, and is thus numerically oriented in its decisions. Moreover, when qualitative analysis intervenes, especially in examining Politifact rulings, context plays a crucial role in passing judgements on deceptive vs. non-deceptive discourse.


INTRODUCTION
Lying is usually defined as not telling the truth. However, what is more important than this simplistic definition is why lying has become significant in human communication. DePaulo et al. (1996) maintain that people lie in 31 percent of their social interactions. Their study thus points to the amount of lying committed, but how can this amount be studied linguistically in political campaign discourse?
Although political campaign discourse is part of the overarching political discourse, its language is unique in that it possesses a number of characteristics. One feature, according to Emerich et al. (2001), is the recurrence of imagery as a means of rendering the campaign discourse charismatic. Another feature is the use of the 'consilience' strategy, where the candidate-audience understanding stems from the mediation and embrace of different language, values and traditions in an attempt to encourage the listeners to remember the common principles shared by the candidate and voter (Frank and McPhail, 2005). A third feature, Fairclough maintains (2006), is that campaign language is capable of weaving visions and imaginaries which can change realities, obfuscate realities and construe them ideologically. A fourth is the topics that dominate campaign speeches. As Donella (1988) contends, campaign speeches serve as emotional triggers, spanning a range of issues such as the environment, taxes and good governance which can guarantee good jobs, among others. A final feature is the focus on populism as a discursive strategy that juxtaposes the virtuous populace with a corrupt elite and views the former as the sole legitimate source of political power (cf. Bonikowski and Gidron, 2015). Thus, campaign discourse is basically emotional and is geared towards canvassing support from voters.
Given this picture of campaign discourse, it is legitimate to ask how presidential candidates can strike a balance between emotionalism and truth-telling. They are required to be as persuasive as possible while at the same time sounding truthful. This inherently impinges on their ability to remain consistent and reliable all the time.
The two US presidential candidates Hillary Clinton and Donald Trump are now running in the elections as representatives of the Democratic Party and the Republican Party, respectively. The two nominees have delivered several speeches and written posts and tweets on social media in the course of their campaigning. These electioneering channels can be a rich source for examining whether they tell the truth or not.
A special site called Politifact (www.politifact.com) was set up years ago to gauge the veracity of American politicians' releases. The site contains thousands of excerpts from past and present US politicians, including updates on Clinton's and Trump's statements. As a Pulitzer Prize winner, the site claims that it adopts a criterion-based analysis of any statement. Such an analysis attempts to answer the following set of questions:
Is the statement based on a fact that is subject to verification?
Is the statement leaving a particular impression that may be misleading?
Is the statement significant (barring slips of the tongue)?
Is the statement likely to be carried over and repeated by others?
Would a typical person hear or read the statement and wonder: Is that true?
The result is a meter that has six pointers as follows:
TRUE - The statement is accurate; nothing significant is missing.
MOSTLY TRUE - The statement is accurate but needs clarification or additional information.
HALF TRUE - The statement is partially accurate but leaves out important details or takes things out of context.
MOSTLY FALSE - The statement contains an element of truth but ignores critical facts that would give a different impression.
FALSE - The statement is not accurate.
PANTS ON FIRE - The statement is not accurate and makes a ridiculous claim.
The website claims that it is sometimes necessary to consider factors such as context, timing, promise-keeping, etc. At other times, they resort to acoustic analysis, as was done with a contentious statement by Clinton about raising taxes on the middle class that was detected by Trump's supporters.
Another dimension in the present research is the use of LIWC (Linguistic Inquiry and Word Count), developed and continually updated by Pennebaker and others since 2000. LIWC is a software tool that reads a given text and counts the percentage of words that reflect different emotions, thinking styles, social concerns, and even parts of speech. The most relevant part of this electronic tool to the present research is that it includes two dimensions that directly affect judgements on truth-telling, namely Authenticity and Emotional Tone. Authenticity refers to writing that is personal and honest. Emotional Tone is scored such that higher numbers are more positive and upbeat and lower numbers are more negative.
The present paper attempts to examine 16 statements for each candidate judged by Politifact as false (whether downright false or 'pants on fire'). This study derives its significance from the fact that it provides a suitable vantage point for investigating the topic of lying in the context of political discourse, particularly the case of Clinton and Trump, as a major human interactive encounter. This is set within the context of contrasting two US candidates' speeches with the aid of a linguistic model of analysis, which will eventually lead to a better understanding of the nature of lying as a verbal immediacy activity in political campaign discourse.

LINGUISTIC APPROACHES TO LIE DETECTION
Linguistic approaches to lie detection can be divided into three categories: communication approaches, disfluency-based approaches (usually acoustically oriented), and holistic approaches. The review below provides a bird's eye view of the three approaches in turn.
Three studies can be subsumed under the communication category. The first is Zuckerman et al.'s (1981). As early as 1981, Zuckerman et al. focused on the meta-analysis of deception detection (traditionally known as the Four-Factor Theory), and stated that no cue or cues to deception could be accurate all the time because deception was an individual psychological process.
The second is Newman et al.'s (2003), where they investigated linguistic features that discern true from false stories. They applied a computerized analysis of five independent samples, achieving a classification of liars and truth-tellers at a rate of 67% when the topic was constant and a rate of 61% overall. When compared to truth-tellers, liars exhibited lower cognitive complexity, used fewer self-references and other-references, and showed a tendency towards more negative emotive words.
The third is Zhou et al.'s (2004a). They foregrounded the Interpersonal Deception Theory. The theory is based on the assumption that deceivers' number of words, verbal self-distancing tactics, and use of adjectives and adverbs increase during a conversation. Thus, while communicating, deceivers use feedback from recipients' messages to modify their deception strategy. According to this theory, cues to deception are divided into three categories: verbal, nonverbal, and physiological.
Some other studies later laid much emphasis on disfluencies in speech, particularly pauses, as a viable linguistic marker of false statements. Anolli and Ciceri (1997) found that a longer time lapse occurs between the question and the lie than the response latency that occurs in truthful statements. A major study in this direction is Benus et al. (2006), where they made use of a corpus of spontaneously recorded interviews to investigate the relationship between the distributional and prosodic features of silent and filled pauses and the interviewee's intention to deceive the interviewer. They concluded that the use of pauses correlated more with truthful than with deceptive speech. They also found that prosodic features extracted from filled pauses, as well as features describing contextual prosodic information in adjacent phonetic environments of the filled pauses, may facilitate the detection of lies in speech.
Demenko (2008) attempted to introduce voice stress extraction and classification into the investigation of deceptive speech. She made use of the authentic Poznan police database with the recordings of the 997 emergency phone number, and selected 20,000 recordings out of 60,000, of which around a hundred were acoustically analyzed. It was concluded that the range of fundamental frequency per se did not correlate with stress, whereas the shift in fundamental frequency register constituted the primary indicator of stress. Through Linear Discriminant Analysis based on 12 acoustic features, it was shown that it is possible to distinguish the categories of neutral, depressive, stressed and highly stressed speech. Arciuli et al. (2009) followed suit and examined the frequency of use of the filler 'um' during lying versus truth-telling statements in two laboratory-elicited lies about a murder case. They found that, within participants, statements exhibited fewer instances of 'um' during lying compared to truth-telling. These results suggest that the frequency of 'um' is a relevant marker where lying is concerned, and can thus be used to differentiate between deceptive and non-deceptive statements in ordinary communication. Therefore, the filler 'um' may not be accurately categorized as an instance of filled pauses, whose increase is proportionate with increased cognitive load. Rather, it may assume a lexical status similar to interjections, and so constitute an important part of authentic, natural communication.
Latency, or gaps in discourse, was also used in recent studies as another indicator of deceptive speech. In fact, there are several studies in that domain; however, the best-known is Reynolds and Randle-Short's (2011). They adopted a rigorous methodological framework of conversation analysis (CA) as an analytic tool kit to demonstrate the importance of context, particularly interactional context, when researching cues to deception in order to understand whether there is a relationship between response latency and deception. They thus followed DePaulo et al. (2003), who emphasized the interactional context in detecting lies in speech. Reynolds and Randle-Short examined data from outside laboratory settings taken from The Jeremy Kyle Show, adopting strict criteria for the data collection. Criteria were based on how participants in the outside-laboratory interactions formulate their verbal output. Lies were detected according to the following criteria: (1) agreement by the liar that a lie had occurred; (2) explicit labelling of talk as lies by other participants; and (3) the liar's 'revision' of a prior action, thereby changing the course of action, in a 'lie relevant' sequential context. They found that participants in the show could display a longer transition space to signal that a concessionary stance is close, or they could reduce the transition space to reduce the risk of an upcoming turn, which can be considered a concession.
Preferring an overall perspective, Kirchhübel and Howard (2011) explored the acoustic changes in speech in deceptive statements. Truthful, deceptive and control speech was collected from ten speakers during an interview. Results were displayed according to the parameters of fundamental frequency, intensity and vowel formants. They found that no significant correlation could be established for any of the acoustic features, a result that runs counter to many mainstream studies in the field.
The holistic approach, on the other hand, is adopted by Picornell (2012), where she examined deception in written witness statements. She employed marked sentence structures to code discourse markers in written narratives, and mapped the progression of lying as it unfolded through the course of the narrative based on the interaction of linguistic cues. She found that what may be important is not the individual cues, but the way they are utilized.
The same approach is also adopted by Burgoon et al. (2012), where they focused on whether indicators of truth or deception are context-independent or context-sensitive. The factors they suggested are motivation and modality. Their 2 (veracity: truthful/deceptive) by 2 (incentives: high/low) by 3 (modality: FtF/audio/text) factorial experiment revealed that linguistic indicators are significantly related to veracity, but the results are highly sensitive to context.
In view of the previous review, there appears to be a gap in the studies that focus on content analysis (i.e. the linguistic features of a potential liar's output) and the prosodic features that verify spots in the speech that signal lying, i.e. latency responses, pauses, fillers, speech errors and the like, in political discourse. Bringing the two dimensions together in one project that studies lies committed by politicians in English would eventually enrich the field, and help formulate a new theoretical framework open to application in a wider context. The present research project is an attempt at studying how lies can be detected in human interactions, especially political discourse in English.

CONTEXT OF THE PROBLEM
The present study addresses the problem of how the two US presidential candidates Donald Trump and Hillary Clinton use statements judged to be false by Politifact while delivering their campaign speeches. A normal search through Google would yield 6 pages that provide discussions on how both candidates lie to their audiences, each page having 10 hits. This suggests that the candidates' use of lies is a widespread phenomenon that merits further research. However, there are few studies that tackle the presidential candidates' lies. Wortham and Lorcher (1999) suggested embedded metapragmatics to investigate politicians' lies by examining television network news coverage of the 1992 and 1996 US presidential campaigns. Their article describes an approach to the social functions of language, which draws heavily on Bakhtin, and gives a more formal account of embedded metapragmatic constructions.
Another extended study is David Corn's (2004) book entitled The Lies of George Bush. Although the book is an amalgam of Bush's lies about health programs, Iraq and tax policies, it does not offer a linguistic approach that can be put to use in further analysis. Moreover, the tone of the book is polemic, and sometimes sounds like a personal war. Still, a third study by Kangas (2014) focused on computerized analysis of politicians' discourse, and touched on honesty as composed of the z-scores of exclusive words, references to self, references to others, motion words and negative emotion words. The paper did not allot ample space to deceptive discourse, having a major focus on how software could analyze political discourse.
Therefore, it is important to draw attention to the impact of lies on the US candidates' image. The amount of lying and/or truthfulness can be linguistically analyzed, and various linguistic tools can contribute to detecting these lies in the candidates' speeches and sometimes tweets.

Corpus
Two corpora of Clinton's and Trump's alleged lies were compiled. Each corpus contained 16 statements judged to be false or ridiculously untrue ('pants on fire') by the Pulitzer Prize winner site Politifact. Some statements were accompanied by the video recordings where they appeared; others had no video recordings affiliated because they are either tweets or their events had not been recorded on YouTube or elsewhere. All in all, the two corpora comprise 1536 words (639 for Clinton's statements and 897 for Trump's statements) and their 16 videos are 7.02 minutes in total length (3.02 minutes for Clinton and 4 minutes for Trump).

Model of analysis
One major approach to investigating the field of lie detection is CBCA (Criteria-based Content Analysis), one of the major elements of Statement Validity Assessment (SVA), a technique developed to determine the credibility of child witnesses' testimonies in trials for sexual offenses and recently applied to assessing testimonies given by adults (cf. Raskin and Esplin 1991). The present research makes use of CBCA but as a stepping stone for building a new model of detecting lies in political discourse to suit the characteristics of campaign discourse. This will furnish the qualitative dimension of the research. As for the quantitative dimension, it will analyze data using software, namely LIWC, and will also focus on the content analysis of the deception cues that can be matched with the results obtained from computerized findings. When VSA (Voice Stress Analysis) is required, Praat will be used. Statistical analyses will also be occasionally applied to reach highly accurate results.
Based on an extensive reading of the literature on the linguistic markers of deceptive speech, the holistic approach was favored for a number of reasons. First, the present model can be considered the first to subject political campaign speeches and/or posts and tweets to lie detection analyses. It is difficult to zoom in on one aspect, such as acoustics, at the expense of other ones. Second, the model adopted here is just a starting point that can be broadened to include other modifications, and it is therefore far from being perfect. It just highlights how campaign discourse may diverge from the norms of truthful speech. Third, the present model is adapted from Burgoon et al.'s (2012) version, which is summarized in the following table.

Linguistic Categories and Operationalizations of Indicators

Quantity
Refers to the length of an utterance, expressed at the lowest level in terms of morphemes and at the highest levels in terms of entire utterances or turns at talk
1. Verbs (words that characteristically are the grammatical center of a predicate and express an act, occurrence, or mode of being)

Complexity
The degree to which a lexical item has few or many syllables (lexical complexity) or a sentence has few or many phrases and clauses (syntactic complexity)
1. Big words (# of words with 6 or more characters)
2. Readability (indices, e.g., Flesch-Kincaid or SMOG index, that measure reading grade level or difficulty of comprehending a segment of text)

Diversity
Degree to which a segment of text uses many unique words and phrases relative to the total number of words or phrases in it
1. Lexical diversity (total # of different words divided by total # of words, i.e., percentage of unique words in all words)

Specificity
Degree to which a segment of text is concrete and specific or abstract
1. Sensory details (sensory experiences such as sounds, smells, physical sensations and visual details)
2. Expressivity (a measure of vividness, quantified as the relationship of # of adjectives + # of adverbs, divided by # of nouns + # of verbs)

Uncertainty
Degree to which words or constructions introduce ambiguity in meaning
1. Modal verbs (auxiliary verbs like would, should, could that are characteristically used with a verb of predication)

Verbal Nonimmediacy
Language that expresses and creates psychological distance
1. Passive voice (form of a verb used when the subject is being acted upon rather than doing something)

Personalization
Pronoun use that increases the specificity of reference to self and others
1. Second person reference (you references)

Affect
Words and expressions that convey the subjective aspect of an emotion apart from bodily changes
1. Affect ratio (number of affect-laden words from a dictionary of affect terms relative to total number of words)
2. Pleasantness (positive or negative feelings associated with a term, based on pre-scaled dictionary of terms)
Activation (degree of dynamism expressed by emotional terms, based on pre-scaled dictionary of terms)

Informality
Degree of adherence to formal, standard language forms

Cognitive Difficulty
Degree of nonfluencies in a segment of text
1. Filled pauses (um, er, ah, you know, and similar nonlexical expressions that do not disrupt the flow of speech and substitute for a silent pause)

The above table seems at first sight to be comprehensive, yet it contains a number of redundancies that can be conflated. For example, informality is not a viable marker of deception and can be excluded. The same is true for readability, which is measured for written texts only and can be difficult to apply to speeches. An alternative benchmark, as suggested by Burgoon and Qin (2006), is the average sentence length. Moreover, the idea of relating cognitive difficulty to filled pauses runs counter to the view held by Arciuli et al. (2009), where false statements usually contain fewer 'um' instances than truthful statements. Finally, theirs being a predictive study, Burgoon and her colleagues omitted two important aspects: (a) the minimum amount (or percentage) of each feature that should be available for a statement to be false and (b) a rating scale that could locate the degree of veracity. The same problem is also detected in LIWC, where the scale from 0-100 cannot be reliable in cases where half of the statement is true and the rest is false. The present model thus adopted Vrij and Winkel's (1991), Connell's (2012) and Picornell's (2012) results, which can be summarized in the following points:
1. Deceivers use fewer first-person pronouns than truth tellers.
2. Deceivers use more words and more exact language (psychological distancing) than truth tellers.
3. Deceivers' language is simpler (shorter clauses) than that of truth tellers.
4. Deceivers are more uncertain (passive voice usage).
5. Deceivers exhibit a higher cognitive load (through simpler structures and cognitive verbs).
6. Deceivers exhibit more tension through higher pitch.
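Several of the operationalizations listed above are simple counts, so they can be illustrated with a short sketch. The Python code below is a minimal illustration only, not the procedure used in this study or by Burgoon et al.; the function names and the regular-expression tokenizer are assumptions introduced for the example. It computes lexical diversity (unique words divided by total words), the number of big words (six or more characters), and the average sentence length mentioned as an alternative to readability.

```python
import re


def tokenize(text):
    """Lower-case the text and return its word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def lexical_diversity(text):
    """Number of different words divided by the total number of words."""
    words = tokenize(text)
    return len(set(words)) / len(words) if words else 0.0


def big_word_count(text, min_chars=6):
    """Count words with at least `min_chars` letters (apostrophes ignored)."""
    return sum(1 for w in tokenize(text) if len(w.replace("'", "")) >= min_chars)


def average_sentence_length(text):
    """Mean number of words per sentence, splitting on '.', '!' and '?'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return len(tokenize(text)) / len(sentences)


if __name__ == "__main__":
    sample = "The senator rejected the legislation. We will win."
    print(lexical_diversity(sample))
    print(big_word_count(sample))
    print(average_sentence_length(sample))
```

Indicators that depend on part-of-speech information (verbs, modal verbs, passive voice, expressivity) or on an affect dictionary would need a tagger or a lexicon and are deliberately left out of this sketch.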
Therefore, for the purposes of the present research, the following table summarizes the new model with the scale included:
It is clear from the above table that eight indicators are adopted in the present model. They have been adapted from Burgoon et al.'s (2012) version. Some indicators follow a reverse order of intensity on the scale from truthful to ridiculously false, since deceivers may have fewer self-references than truth-tellers, yet they may have more cognitive verbs such as 'think', 'believe', 'guess', etc. In any event, the new model is a so-called 'test-bed' for manually checking veracity in political campaign discourse, and will be compared with LIWC and Politifact judgements.
It is noteworthy that the degree of veracity is calculated by summing up the percentages obtained in all the indicators. The total is then divided by the 11 indicators and sub-indicators. In the case where there is no video available to measure pitch, the pitch indicator is excluded and the degree is calculated relative to 10 indicators only.
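The calculation described above amounts to an arithmetic mean over the available indicators. The following Python sketch shows one way to express it; the function name, the argument layout (ten text-based percentages plus an optional pitch percentage) and the sample values are assumptions made for illustration, not figures from the study.

```python
def veracity_degree(text_indicator_percentages, pitch_percentage=None):
    """Average the indicator percentages into a single degree.

    text_indicator_percentages: the 10 non-pitch indicator and
        sub-indicator percentages (assumed layout).
    pitch_percentage: the pitch indicator, or None when no video exists,
        in which case the mean is taken over 10 values instead of 11.
    """
    scores = list(text_indicator_percentages)
    if pitch_percentage is not None:
        scores.append(pitch_percentage)
    if not scores:
        raise ValueError("at least one indicator percentage is required")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    hypothetical = [50.0] * 10  # illustrative values only
    print(veracity_degree(hypothetical))          # no video: divide by 10
    print(veracity_degree(hypothetical, 100.0))   # with pitch: divide by 11
```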

DATA ANALYSIS
The analysis of the data follows a three-way measure:
1. New Model-LIWC Agreement/Discrepancy
2. New Model-Politifact Agreement/Discrepancy
3. LIWC-Politifact Agreement/Discrepancy
Under each of the first two sections, the nine indicators will be examined.

New Model LIWC Agreement/Discrepancy
The New Model (henceforth NM) is greatly different from the LIWC tool. The following table summarizes the results obtained in both NM and LIWC for Clinton's statements.
It is clear from the above table that 3 statements are judged by LIWC to be half-truthful, i.e. around 98 and 99%, while they are labeled false by NM. This discrepancy is not just found in the direction of truthfulness, so to say, but it also figures clearly in the direction of ridiculously false statements. Thus, 4 statements are judged as ridiculously false by LIWC while they are only false as labeled by NM. The problem is one of degree. If the rating scale proposed by NM is applied, then the above discrepancies are obviously problematic, since a statement cannot be true and false at the same time. The scale proposed in NM can be illustrated below:
This leads to considering 18.75% of LIWC results as completely inaccurate and 25% as partially inaccurate. In the first case, the discrepancy points to false statements judged as truthful, while in the second case, a statement is false but is labeled as ridiculously false. However, if taken from the point of view of LIWC, a statement is false if it does not attain 100% on its scale. In view of this, the above discrepancy vanishes, but the question of degree is not fully tackled. In other words, a statement which attains 99.9% on the LIWC scale cannot be true although it has only a fraction left to be true. This interpretation causes the 99.9% statements to be equal to 1.0% statements, which is a baffling decision. The same is true for statements which are considered half-truthful from the point of view of NM: they range from 65 to 79%, and are false according to LIWC, though their veracity is more than their falsehood.
As for the rest of the statements which are judged by both NM and LIWC to be false, they suffer from the same obstacle of degree. A statement, for example, can be 17.5 on the NM scale but 37.2 on LIWC. The net result is that both are false, yet they are not on a par with each other on the 'falsity scale', so to speak.
A similar situation is found in analyzing Trump's statements. The following table summarizes the NM and LIWC results for Trump's statements:
It is clear from the above table that 3 statements are judged by LIWC to be half-truthful, i.e. around 63 and 99%, while they are labeled false by NM. This discrepancy is not just found in the direction of truthfulness, so to say, but it also figures clearly in the direction of ridiculously false statements. Thus, 6 statements are judged as ridiculously false by LIWC while they are only false as labeled by NM. The problem is again one of degree. The conclusion is similar to the one reached when discussing Clinton's statements: 18.75% of LIWC's results are completely inaccurate and 37.5% are partially inaccurate. In the first case, the discrepancy points to false statements judged as truthful, while in the second case, a statement is false but is labeled as ridiculously false. However, if taken from the point of view of LIWC, a statement is false if it does not attain 100% on its scale.
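The percentages reported for Clinton's and Trump's statements follow from simple counts over the 16 statements in each corpus (3 of 16 is 18.75%, 4 of 16 is 25%, 6 of 16 is 37.5%). The Python sketch below shows how such rates could be derived from two lists of labels. The label names, the `FALSE_SIDE` grouping and the label lists are illustrative assumptions reconstructed from the counts in the text, not the study's actual per-statement data.

```python
from collections import Counter

# Labels assumed to lie on the "false" side of the scale.
FALSE_SIDE = {"false", "ridiculously false"}


def classify_pair(nm_label, liwc_label):
    """Return 'agreement', 'partial' or 'complete' for one statement."""
    if nm_label == liwc_label:
        return "agreement"
    if nm_label in FALSE_SIDE and liwc_label in FALSE_SIDE:
        return "partial"   # both false, but to a different degree
    return "complete"      # one judgement leans truthful, the other false


def discrepancy_rates(nm_labels, liwc_labels):
    """Percentage of statements in each agreement/discrepancy class."""
    pairs = list(zip(nm_labels, liwc_labels))
    counts = Counter(classify_pair(nm, liwc) for nm, liwc in pairs)
    return {kind: 100 * counts[kind] / len(pairs)
            for kind in ("agreement", "partial", "complete")}


if __name__ == "__main__":
    # Hypothetical labels matching the counts given for Clinton.
    nm = ["false"] * 16
    liwc = ["half-truthful"] * 3 + ["ridiculously false"] * 4 + ["false"] * 9
    print(discrepancy_rates(nm, liwc))
```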

Statistics can come to the aid of the analysis at this point. The ANOVA analysis yields the following two tables:
It is clear that p is not significant in either case: the NM for Clinton's and Trump's statements, and LIWC for both candidates. Statistically, this means that the NM and LIWC are equal in their judgements when broadly compared according to ANOVA results. However, if this mode of analysis is the only one adopted, the details are not fully addressed. Table 3 above shows that only one statement appears to receive similar judgements by NM and LIWC, namely the Hampshire one: it scores 17.64 and 20.2 on NM and LIWC, respectively. The 2.56% difference can be considered insignificant, and this can be considered the only point of agreement between NM and LIWC.
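For readers who wish to retrace this step, the F statistic of a one-way ANOVA can be computed directly from its textbook definition (between-group mean square divided by within-group mean square). The Python sketch below is a generic illustration; the function name and the sample scores are hypothetical, and the grouping of scores is an assumption, since the original tables are not reproduced here. Obtaining the p value additionally requires the F distribution, which is available in statistical packages and is not computed in this sketch.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more scores than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


if __name__ == "__main__":
    # Hypothetical scores for two groups of statements.
    group_a = [1.0, 2.0, 3.0]
    group_b = [2.0, 3.0, 4.0]
    print(one_way_anova_f([group_a, group_b]))
```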

New Model Politifact Agreement/Discrepancy
In this section, quantitative analysis is not possible, since Politifact does not provide numerical figures that can be set side by side with the NM results. The alternative, by nature, is qualitative analysis. The following table summarizes the qualitative results of both NM and Politifact for Clinton's statements:
It is clear that the results of both NM and Politifact are identical. The discrepancies detected in LIWC are not there. The sole comment that can be made is related to the indicators of Lexical Density and VSA in NM. In 81.25% of the statements examined, Lexical Density scores point to the falsity of the statements in question, but the remaining 18.75% point to ridiculously false statements according to NM. Consider, for example, the following statement by Clinton: "I think this is a major challenge and I want us to address it. Not one word from the other side. And you take somebody like Governor Walker of Wisconsin, who seems to be delighting in slashing the investment in higher education in his state. And most surprisingly to me, rejecting legislation that would have made it tax deductible for you, on your income tax, to deduct the amount of your loan payments. I don't know why he wants to raise taxes on students. But that's the result when you don't look for ways to help people who are not sitting around asking for something, who are actually working hard every day to get ahead."
This long statement has a Lexical Density score of 93.3%, being full of verbs and nouns. The problem is that the higher the lexical density, the higher the falsity score a statement attains (where details are provided to cover up any misinformation). According to NM, this statement is ridiculously false, while Politifact judges it false due to its context. Politifact maintains that it is true that Governor Walker did not publicly support the Democratic-sponsored measures that would have provided the tax deduction, but he had never rejected such legislation, either. This inherently means that Clinton passed judgement without a sufficient amount of information. In a sense, the details of the indicators would at times point to judgements that are different from the overall decision of whether a statement is false or not, and this is the role of context.
As for the VSA scores, the NM provides a mean of 44.87%, which indicates that Clinton's statement is half-truthful. The upper bound for female voice pitch is 525 Hz, while the lower bound is 100 Hz. A Praat spectrogram has been created for a section of this statement as follows:
In this illustration, the blue streaks refer to pitch contours: they range from 239 Hz to 148 Hz. This means that Clinton is not stressed; she speaks normally. Yet, in another analysis later in the same segment, she starts to lose control and shout:
The pitch contours change from 239 Hz to 394.1 Hz, which indicates emotional speech, and thus the deceptive part starts at the extract "who seems to be delighting in slashing the investment in higher education in his state". This is exactly what Politifact states about the context of Clinton's statement: Governor Walker remained silent about the tax decision; he expressed neither delight nor repugnance. This also tallies with Demenko's (2008) study of pitch contours in stressed males and females, which reveals that the average frequency for extremely stressed females is 366 Hz. Stress is a major indicator of deception (cf. Ekman, 1991).
As for Trump's statements, the following table summarizes the qualitative results of both NM and Politifact:
There are five discrepancies, which means that 31.25% of NM decisions (5 of 16) are not accurate. In 62.50% of the statements, Lexical Density scores point to the falsity of the statements in question, but the remaining 37.50% point to ridiculously false statements according to NM. Again, context has to be taken into account.
Recourse to VSA may help to settle the moot point. One case in point is the statement accusing marshals in Colorado and Ohio of incompetence. The following spectrogram illustrates the variations in pitch: Trump's pitch oscillates between 279 Hz and 312 Hz, especially when he speaks about fire marshals. This tallies with Demenko's (2008) study of pitch contours in stressed males and females, which reports that the average frequency for extremely stressed males is 238 Hz, a level that both values exceed. Stress is a major indicator of deception (cf. Ekman, 1991).
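The comparison between measured pitch and the reference frequencies cited from Demenko (2008) can be made explicit with a small sketch. The two reference values are those quoted in the text; the decision rule (a pitch at or above the reference counts as stressed) and the names used are assumptions for illustration, not a rule stated by NM or by Demenko.

```python
# Average frequencies for extremely stressed speakers, as cited in the text
# from Demenko (2008).
STRESSED_MEAN_HZ = {"female": 366.0, "male": 238.0}


def reaches_stress_reference(pitch_hz: float, sex: str) -> bool:
    """Assumed rule: flag a pitch value at or above the cited reference."""
    return pitch_hz >= STRESSED_MEAN_HZ[sex]


# Pitch values reported in the text for the two speakers.
for label, pitch, sex in [
    ("Clinton, first part", 239.0, "female"),
    ("Clinton, second part", 394.1, "female"),
    ("Trump, lower value", 279.0, "male"),
    ("Trump, upper value", 312.0, "male"),
]:
    print(label, reaches_stress_reference(pitch, sex))
```

Under this assumed rule, only the first part of Clinton's statement falls below the reference, which matches the reading given in the text.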
As a concluding remark for this section, it is important to juxtapose context and VSA in order to achieve a sound judgement in deceptive speech analysis. Relying on Lexical Density and/or context alone would lead to erroneous decisions.

LIWC Politifact Agreement/Discrepancy
Here again, quantitative analysis is not possible. The following table summarizes the LIWC and Politifact results for Clinton's statements:
The two columns provided for the LIWC decisions are meant to show that, according to the scale proposed under section 3.1, discrepancy is easily detected, but according to the 'loose' criteria of LIWC (where the two extremes 0 and 100 are at work), the discrepancy occurs in 5 cases out of 16, i.e. 31.25%. As for the column labeled 'LIWC according to NM scale', the discrepancy here might be located within the sub-scale of falsity. Seven cases of Politifact's false statements are judged by LIWC as ridiculously false. Again, this can be attributed to the SD of certainty and anxiety as provided by LIWC developers.
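The percentages quoted in this and the preceding sub-section all follow from counts over the 16 statements in each corpus. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and only counts stated in the text are used.

```python
def discrepancy_rate(n_discrepant: int, n_total: int) -> float:
    """Return discrepant cases as a percentage of all cases."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_discrepant / n_total


# 5 discrepant cases out of 16 statements, as reported for LIWC vs. Politifact.
print(discrepancy_rate(5, 16))
```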

CONCLUSIONS
The analysis of the data in the previous section and sub-sections leads to a number of conclusions. First, it is clear that NM is not context-sensitive, being a quantitative model, and is thus numerically oriented in its decisions. The comparisons carried out show that the present model is capable of making the falsity decision correctly in all the cases, unlike LIWC. The point that merits discussion is the degrees that need to be proposed for each point in the NM scale. From a semantic point of view, the two extremes 'true' and 'false' are binary antonyms, not subject to midway shades. However, the demands of accuracy necessitate that such shades or points be either added or taken into consideration. In a sense, a statement that scores, for example, 51% on the NM scale is false despite the fact that it has 49% residuals of truth within. Thus, even when subdividing the 'loose' LIWC 0-100 scale into 50-5%, the problem of gradation still persists. This points to the necessity of subdividing each of the NM points into further sub-points such as 'fully truthful', 'mostly truthful', 'fully false', 'mostly false', etc. The same is mostly true for Lexical Density. Although this indicator easily detects false statements based on its numerical value, there are cases where it labels statements as 'ridiculously false' when they are just false.
Second, when qualitative analysis intervenes, especially in examining Politifact rulings, context plays a crucial role in passing judgements on deceptive vs. non-deceptive discourse. The numerical values obtained from both NM and LIWC are at stake in this way, and the VSA can be used to detect how contextual analysis is capable of standing the test of falsity vs. truthfulness. However, the main disadvantage of VSA is that it is also 'loose' in that the values obtained from pitch contours are indicative of tension as broadly associated with stressed-out liars. Stress can likewise affect truthful speakers, especially when faced with unusual situations or when interrogated, for example.
Third, emotions and authenticity are provided as two separate dimensions in LIWC, but in NM, despite being different indicators, their sum is used to reach the final decision on whether a statement is deceptive or not. This means that it is left to the taste and professionalism of the LIWC user to consider the two dimensions together when passing his/her judgement. In the case of NM, in contrast, the two indicators cannot be separated except for statistical purposes. The question is whether LIWC acknowledges emotions as indicative of deception or not. Leaving this question open gives LIWC an edge over NM and other models, since it is yet to be decided in the literature whether tension is necessarily a sign of deceptive discourse.
In view of these conclusions, there are some limitations to NM. It is a proposed model, subject to testing in other mainstream instances. The real test of NM is whether it can be put to use in socio-political situations such as parliamentary and presidential campaigning in both the US and non-Anglophone countries. The subdivisions of the falsity and truth degrees may also be a major improvement in terms of accuracy. Moreover, the comparisons with LIWC and Politifact showed that context is as important as numerical figures.
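The subdivision of the NM scale proposed in the first conclusion can be sketched as a simple mapping from a score to a graded label. The only fixed point taken from the text is that a score of 51% counts as false; the midpoint handling (a score of exactly 50 is placed on the truthful side) and the 25-point band width are hypothetical choices made purely for illustration.

```python
def nm_sublabel(score: float, midpoint: float = 50.0, band: float = 25.0) -> str:
    """Map a 0-100 NM score to a graded label (cut-offs are assumptions)."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must lie between 0 and 100")
    if score > midpoint:
        # Falsity side: scores far above the midpoint are 'fully false'.
        return "fully false" if score > midpoint + band else "mostly false"
    # Truthful side: scores far below the midpoint are 'fully truthful'.
    return "fully truthful" if score < midpoint - band else "mostly truthful"


# The example from the text: 51% is false, with 49% residual truth.
print(nm_sublabel(51.0))
```

Any such cut-offs would need to be validated against further data before being adopted as part of the model.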

Fig. 1: An envisaged continuum of the NM veracity scale

Fig. 2: A spectrogram for the first part of Clinton's example statement

Fig. 3: A spectrogram for the second part of Clinton's example statement

Table 2: A modified version of Burgoon et al's (2012) model (the New Model)
Burgoon et al (2012, p. 324) mentioned a similar criterion in their definition of complexity when maintaining that it refers to 'a sentence [which] has few or many phrases and clauses (syntactic complexity)'; they then subsumed readability under it. It is well-documented that Flesch-Kincaid readability tests are used with children and adults. SMOG is used particularly for checking health messages.
* According to Pernet and Belin's (2012) study.