Author: Patrizia Anesa (patrizia.anesa@unibg.it)
Abstract
Online scams have reached a high level of sophistication, and their diffusion rate is constantly increasing at the global level. This study investigates the persuasion strategies that emerge in two common scams: advance-fee frauds (AFF) and romance scams. The paper analyses how scammers regularly exploit their victims' errors of judgment through the creation of sophisticated, credible narratives and persuasive lexical choices. The investigation draws on previous research on scams (cf. Anesa, 2020; Arinas Pellón et al., 2005) and aims to gain a finer understanding of how linguistic and discursive strategies are employed for fraudulent purposes. Our hypothesis is that the two types of scams draw on similar communicative strategies. Specifically, the analysis is based on Lea et al.'s (2009, pp. 25-34) taxonomy of motivational and cognitive judgment errors. Authentic examples of scams are included in two corpora compiled for this study. The analytical section describes the main linguistic, rhetorical, and discursive devices that scammers employ to succeed in their fraudulent attempts. The scammer aims to fabricate trust through discursive, generic, and linguistic practices that can deceive potential victims in romance scams and advance-fee scams. The types of fraud included in the two corpora are different, but the strategies employed by scammers show a certain degree of resemblance. In both romance and AFF scams, the criminals manage to create a gap between the victims' desire to believe in the truthfulness of the story and their ability to process the information rationally. As a result, once the victims are engaged in the fraudulent mechanism, they tend to disregard the possibility that the text may not be genuine. Genre cognisance is also astutely exploited by the scammers to defraud their victims. For example, advance-fee frauds follow the traditional structural elements of business proposals, thus matching people's expectations. In a similar vein, in romance scams the scammers' profiles tend to mimic genuine textual production. Consequently, the scammers manage to exploit the victims' desire to believe in the message they read, which reduces their ability to process it rationally and cautiously.
References
Anesa, P. (2020). Lovextortion: Persuasion strategies in romance cybercrime. Discourse, Context and Media, 35, 1-8.
Arinas Pellón, I., Gozalo Sáinz, M. J., & González González, T. (2005). Nigerian letters, Dutch lottery and teaching an ESP genre. In L. Sierra & E. Hernández (Eds.), Lenguas para fines específicos (VIII): Investigación y enseñanza (pp. 89-96). Alcalá de Henares: Universidad de Alcalá de Henares.
Lea, S., Fischer, P., & Evans, K. (2009). The Psychology of Scams: Provoking and Committing Errors of Judgment. Exeter: Office of Fair Trading.
Keywords: advance-fee frauds; romance scams; cognitive judgment error; persuasion strategies
Biodata
Dr Patrizia Anesa is a researcher in English Language and Translation at the Department of Foreign Languages, Literatures and Cultures at the University of Bergamo, Italy. She holds a PhD in English Studies. She is a member of the Research Centre on Specialised Languages (CERLIS) and Associate Editor of the IDEA project (International Dialects of English Archive). Her research interests lie mostly in specialised discourse. She is currently conducting research into the applications of Conversation Analysis in LSP and into knowledge asymmetries in expert-lay communication. She is a member of the editorial board of The International Journal of Law, Language & Discourse. Dr Anesa has also worked extensively on World Englishes, and her monograph Lexical innovation in world Englishes: Cross-fertilisation and evolving paradigms (2019, London: Routledge) recently received the 2020 ESSE Book Award.
Author: Pedro Castillo Mollá (pcmolla@outlook.es)
Abstract
This piece of research analyses the role of palatalisation of the phoneme /t/ for speaker identification purposes. The investigation focuses on cases in which speakers imitate a Russian accent in Peninsular Spanish. The paper asks three questions: (a) Is there a relationship between the changes in the values of three distinctive features of the phoneme /t/ and a specific group of speakers? (b) What degree of palatalisation characterises each group? And (c) is it possible to calculate the likelihood ratio of a Spanish speaker feigning a Slavic accent by examining the realisations of /t/? We raise two hypotheses, a null hypothesis (H0) and an alternative hypothesis (H1). H0 predicts that the changes in the values of each feature (dependent variables) are independent of a specific group of speakers (independent variables). H1 predicts that the changes in the values of each feature are dependent on a specific group of speakers. Briefly, the experimental design involves 1) informants: 90 speakers; 2) instruments: a phonetically rich test, sounds artificially modified with PRAAT, and surveys; 3) tools: PRAAT (version 6.1.09), IBM SPSS Statistics (version 22.0) and IPA symbols for phonetic transcription; and 4) an oral corpus with spontaneous speech. The corpus contains palatalised and non-palatalised allophones of /t/ and the vowels shaping syllables with such allophones. Regarding the procedure, we perform three different but related types of analysis: articulatory, acoustic and perceptual. Findings from this experiment support the alternative hypothesis because the palatalisation of /t/ was found to be higher for Russian than for Spanish speakers. Furthermore, it was possible to measure the likelihood ratio of Spanish speakers feigning a Russian accent. We hope that this piece of research may fill the gap left by traditional analyses in Speaker Recognition and Speaker Identification concerning palatalised voice quality, and encourage further research into accent imitation and disguise in Peninsular Spanish and Russian.
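As background for question (c), the likelihood ratio in forensic speaker comparison is standardly expressed as the probability of the observed evidence under one hypothesis relative to the other. The formulation below is a generic sketch stated in terms of the abstract's H0 and H1, not the author's specific computation:
```latex
\[
\mathrm{LR} = \frac{p(E \mid H_{1})}{p(E \mid H_{0})}
\]
```
where E is the acoustic evidence (here, the measured degree of palatalisation of /t/): values above 1 favour H1, and values below 1 favour H0.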
Keywords: forensic phonetics; speaker identification; speaker recognition; voice quality; accent disguise
Biodata
Pedro Castillo Mollá graduated from the University of Alicante in Spanish Language and Literature in 2018, specialising in Applied Linguistics. His BA thesis, Límites entre la subordinación sustantiva y adverbial: Una aproximación al problema, was devoted to the analysis of Spanish syntax and supervised by Prof. Dr María Antonia Martínez Linares. In 2020 Castillo Mollá graduated from the same university with a Master's degree in English and Spanish for Specific Purposes. Owing to his interest in language as evidence, he wrote his MA thesis on the Identification of Spanish speakers feigning a Russian accent by palatalisation of /t/: New perspectives for the analysis of voice quality, supervised by Prof. Dr Victoria Guillén-Nieto. At present, he is studying for his PhD in forensic phonetics.
Author: Chi-Hé Elder (c.elder@uea.ac.uk)
Abstract
This paper examines how misunderstandings are identified, negotiated and resolved in police-suspect interviews. In the UK, police interviewers are trained to obtain as neutral an account of events as possible. However, the interviewer is also responsible for obtaining institutionally accepted testimony, which requires steering the interview in specific directions. Previous work has identified potential sources of misunderstanding to which interviewers must be sensitive to obtain quality testimony from the suspect. For example, interviewers should avoid specialised jargon that, even if not used to be intentionally deceptive, may impact the quality of testimony acquired from the suspect (Filipović, 2019), and should clarify suspects' use of vague, ambiguous and contextual language (e.g., 'I wish it didn't happen') (Shuy, 2017). They also need to ensure that questions are satisfactorily addressed as intended, as suspects can evade providing direct answers by capitalising on unintended but inferable aspects of the interviewer's question, thereby still appearing compliant with the questioning process (Haworth, 2006). Such considerations highlight that the process of eliciting testimony is not simply a case of interviewees offering information; rather, the interviewer plays a significant role in shaping the testimony to fulfil institutional requirements. This paper addresses the question: How do institutional constraints affect the process of meaning negotiation in police interviews when misunderstandings arise? Examples are extracted, transcribed and anonymised from a corpus of 15 interviews, totalling 8h24m, between police interviewers and suspects, obtained from a local police constabulary and made available for research at the author's institution. Instances of 'other-initiated repair' are identified in which there is an explicit request to explain the meaning of what has been said (i.e., the request is clarificatory of existing content, as opposed to additive). The extracts are analysed using principles from the interactional pragmatics developed by Elder and Haugh (2018), examining which meanings are put on record and resolved and which are ignored or left unresolved, for example, when suspects respond to interviewers' questions in deviant or unexpected ways, or when interviewers ignore irrelevant but inferable meanings in suspects' contributions that do not conform to the topic at hand. Overall, this paper will provide a greater understanding of how testimony is negotiated and reformulated between interviewer and suspect within the confines of institutional requirements, which can inform supplementary training on how interviewers manage misunderstandings in the interview room.
References
Elder, C., & Haugh, M. (2018). The interactional achievement of speaker meaning: Towards a formal account of conversational inference. Intercultural Pragmatics, 15(5), 593-625.
Filipović, L. (2019). Police interviews with suspects: Communication problems in evidence-gathering and possible solutions. Pragmatics and Society, 10(1), 9-31.
Haworth, K. (2006). The dynamics of power and resistance in police interview discourse. Discourse and Society, 17(6), 739-759.
Shuy, R. W. (2017). Deceptive Ambiguity by Police and Prosecutors. Oxford: Oxford University Press.
Keywords: misunderstandings; police interviews; speaker meaning; institutional constraints; negotiating testimony
Biodata
Chi-Hé Elder is a Lecturer in Linguistics at the University of East Anglia, specialising in socio-pragmatics, a branch of linguistics at the interface between sociolinguistics and pragmatics. She completed her PhD on the pragmatics of conditionals at the University of Cambridge (published as Context, Cognition and Conditionals, 2019, Palgrave Macmillan) before moving to UEA to complete her Leverhulme Early Career Research Fellowship on the role of misunderstandings in a theory of 'speaker meaning' and the nature of pragmatic inferencing. Dr Elder's approach combines insights from Conversation Analysis, which observes the on-record ways in which speakers make themselves understood, with a post-Gricean perspective that views meaning as stemming from speakers' communicative intentions. Her current work is testing the approach's applicability in different interactional contexts, including political debate, doctor-patient consultations, and police-suspect interviews. Dr Elder teaches a range of undergraduate courses in pragmatics and intercultural communication and supervises PhD and Master's projects on various topics relating to socio-pragmatics.
Author: Awni Etaywe (awnietaywe2@gmail.com)
Abstract
Recent linguistic research suggests a direct link between moral values and evaluative language use, and also that meta-values (e.g., care) offer a coding frame and interpretive lens based on interpersonal relationships and violent forms of behaviour (cf. Kádár, Parvaresh, & Ning, 2019). To date, limited forensic linguistic research has been undertaken on terrorist communicated threats as a moral, evaluative construct, despite recognition of the moral element of terrorism (cf. Seto, 2002). This paper adds to the literature on the linguistic features of the terrorist threat text-type by analysing the discoursal functions of terrorist threats and threat types. The sample consists of twelve texts made publicly available on the internet and produced by Osama bin Laden of al-Qaeda, Shekau of Boko Haram, and Tarrant of the far-right. A close textual analysis is undertaken to categorise the texts into threat types, following Gales (2010). The analysis then zooms into the discourse function(s) of each text as realised in evaluative textbites that give rise to patterns in attitudinal meaning types and identifiable values, which are deployed as a basis for disaffiliation and can colour a text with hostile attitudes and a moral justification. The analytical approach draws on the social semiotic approach to social affiliation, grounded in Systemic Functional Linguistics, which uses Martin and White's (2005) appraisal framework to explain how attitude is realised through evaluative language. For example, in one of Shekau's texts, 'You Christians cheated and killed us to the extent of eating our flesh like cannibals' incorporates a coupling, or textbite, of 'You Christians' with negative judgement, annotated as [ideation: You Christians/attitude: negative judgement] and instantiating a 'Christians are bad: cheaters and killers' disaligning value. This paper foregrounds the disaffiliative function of threat texts, whereby an author uses couplings to distance victims and construct rival participants with conflicting values while communicating intentions to cause terror or inflict injury on a targeted social group, their property or rights. It foregrounds terrorists' pursuit of value disalignment with victims in such a way that the text producer benefits, has their current benefits reinforced or protected, or seeks to redress a moral breach against an in-group's moral order (cf. Culpeper, Iganski, & Sweiry, 2017). Findings from this piece of research show that the threats in the dataset are mostly direct or direct-conditional. Discoursal functions range between manipulation, which serves to preserve a threatener's ideological and physical territories; retribution, which responds to the disruption of the in-group's moral order; and identity-damage, where couplings instantiate the values of negative 'others' merely because these belong to a particular category, and a threatener's symbolic benefit is achieved by damaging the victim's symbolic power. The paper's findings present terrorist threat texts as boulomaic texts (i.e., with an attitude to cause harm) produced by deontic participants. A semantics-based approach to threat functions is proposed by focusing on attitudinal meaning-making resources and their role in social disaffiliation.
Keywords: discourse function; evaluative textbites; linguistic evidence; threat-text; value disalignment
Biodata
Awni Etaywe is a linguist with an interest in hate and extremist discourse. He is currently researching public statements by terrorist groups as a site for a set of criminal offences, including threats and incitement to violence. Through a forensic linguistic study that applies corpus analysis tools and Systemic Functional Linguistic discourse analysis tools to a corpus of terrorist statements of four extremist groups, Etaywe empirically examines patterns of attitudinal meaning and lexico-grammatical features as discursive markers of terrorists' ideological perceptions, semantic orientation, and personal and relational identities. By focusing on the interpersonal meaning-making resources and the role of social (dis)affiliation, his work makes a methodological contribution to the study of how terrorists use language to justify their communicated threats and to motivate social networks, provoke hatred, and incite violence towards out-groups. Etaywe's work adds to the linguistic research that seeks to bridge the gap between cognitive, computational, and functional stylistic approaches to authorship analysis. His research also proposes semantics-based tools and procedures for understanding discourses of violence and hate speech.
Author: Luna Filipović (l.filipovic@uea.ac.uk)
Abstract
Confessions in policing contexts are elicited in different ways and usually during lengthy exchanges. Sometimes suspects are unaware that they have made an inadvertent confession. On other occasions interviewers may think they have obtained a confession while the interview is ongoing, only to find upon subsequent examination of the transcript that none has been given, in which case we have a mistaken confession. Why do these misunderstandings happen? More importantly, since the consequences of such occurrences for justice and individual lives are potentially severe, can they be prevented? The goal of the research project presented in this talk was to find out how and why misunderstandings lead to inadvertent and mistaken confessions in two very different legal contexts: the United Kingdom and the United States. I studied a unique database of authentic transcribed police interviews with suspects from both countries (100 in total, 50 monolingual and 50 bilingual) to gain in-depth insights into the linguistic and communicative similarities and differences between the two approaches to law enforcement. UK and US police interview styles have not been analysed in this way before, and they are particularly relevant for comparative purposes since a) they involve very different interviewing strategies (the Cognitive Interview in the UK vs the Reid Technique in the US), and b) both models are widely adopted in jurisdictions across the world. A recent detailed review of a significant number of international studies of the two questioning methods (Meissner et al., 2012) highlights that, overall, the UK-type information-gathering approach produces significantly more true confessions, whereas the US-type accusatorial approach produces significantly more false confessions. However, there has been no full linguistic study of confessions that are performed inadvertently, by concurring only with parts (not wholes) of lengthy statements that function as question prompts for admission or denial, or by misunderstanding what was said. I hypothesised, based on the previous studies available (Berk-Seligson, 2009; Filipović, 2019), that there would be 'inadvertent confessions' in both UK and US data, but that their incidence would be higher in the US than in the UK context because of the different communicative pressures (confession elicitation vs gathering quality evidence). The methodology is structured around three hypothesised sources of misunderstanding: 1) incomplete reference ('I saw it': what does 'it' refer to exactly?), 2) lexical, semantic and syntactic complexity of words and structures (e.g., 'evidence sufficient to preclude the imposition of penalty'), and 3) unresolved semantic/syntactic ambiguity ('He dropped her on the stairs': on purpose or not?). The research findings and authentic examples from the data are beneficial to students and practitioners in the legal and linguistic fields, such as police officers, lawyers, judges, interpreters and academic and professional educators. They pave the way for empirically informed professional training and further investigations into how language is used and manipulated to achieve different institutional or individual goals within the different justice systems of the world.
Keywords: confessions; miscommunication; police interpreting; police interviews; UK vs the US
Author: Luna Filipović (l.filipovic@uea.ac.uk)
Abstract
This talk is about interpreter-assisted police interviews in the UK and police interrogations in the US, and the challenges that multilingual interactions in this context pose for justice systems and for efforts towards equality before the law. The title of this paper is purposefully ambiguous: 'interpreting challenges' refers both to the challenges FOR interpreting as a process in a highly sensitive legal context and to the challenges OF interpreting and the effects of interpreter-mediated communication on the information obtained. The aim of this talk is accordingly twofold: 1) to explain the linguistic, communication and interpreting difficulties specific to the context of a police investigation, and 2) to illustrate with authentic, concrete examples what happens when these difficulties are not dealt with properly. Several studies in forensic linguistics have observed and addressed many of the problems that the increase in multilingual exchanges and the need for quality interpreting in legal contexts pose for justice systems (see Filipović, 2019 for a recent overview). In this paper a novel and comprehensive perspective is offered. By contrastively analysing extensive data from two different jurisdictions, namely the US and the UK, we can see which problems are pervasive and shared across jurisdictions despite the very different approaches to police questioning: investigative interviewing in the UK, underpinned by the principles of the Cognitive Interview, vs the interrogation method based on the Reid Technique in the US. At the same time, we have an opportunity to gain a comparative insight into two very different systems of language service provision and production of evidence. In the US, bilingual police officers still act as interpreters, while this is not allowed in the UK, where only professional registered interpreters are used. On the other hand, the US provides professional control interpreters post-interview, who produce bilingual verbatim transcripts and check for interpreting errors, while no such provision is available in the UK, where the transcripts are monolingual and non-verbatim. The two types of practice will be contrasted and exemplified with authentic excerpts from 100 bilingual interviews with victims, witnesses and suspects, involving multiple languages in combination with English (Spanish, Portuguese, Lithuanian, Russian). We focus on the following sources of difficulty identified during the data analysis: 1) the complexity of police speak, 2) language typology contrasts and the lack of perfect translation equivalence, 3) the interpreting of paralinguistic features such as attitude markers, and 4) the cost of unresolved miscommunication. We will explain why tackling these problems is necessary to achieve equality in access to justice and fair treatment of speakers with limited or zero English proficiency. This work has already been included in various training programmes for police officers and interpreters in the UK, and I shall report on these practical applications of primary empirical research and show what we have learned about how to translate research into practice effectively.
References
Filipović, L. (2019). Bilingualism in action: Theory and practice. Cambridge: Cambridge University Press.
Keywords: interpreting; interrogation; investigative interview; LEP speakers; policespeak
Biodata
Luna Filipović (PhD Cantab) is a Professor of Language and Cognition at the University of East Anglia, UK. She specialises in experimental psycholinguistics, bilingualism, and the semantic and syntactic processing of typologically different languages. Her recent research examines language effects on the memory, verbalisation and translation of witnesses' accounts of events. Dr Filipović has conducted experiments showing how the specific language spoken by a witness or suspect can affect the quantity and quality of information given, and explaining how, why and when this information can be distorted in translation, impacting witness memory and jury judgment. She has studied multilingual police interviews in both the UK and the US for 20 years and has identified important problems in police communication and police interpreting; solutions to these are proposed within the TACIT Project that she leads (https://www.tacit.org.uk/). For more details please see Professor Filipović's academic webpage: https://people.uea.ac.uk/l_filipovic and the Leverhulme Trust newsletter (page 15); the Trust recently awarded her a prestigious Research Fellowship for her work on the TACIT Project (https://www.leverhulme.ac.uk/sites/default/files/2020_09.pdf).
Author: Eilika Fobbe (efobbe@gmx.de)
Abstract
This paper examines the role of pragmatics in forensic authorship analysis and makes suggestions for a more comprehensive integration of pragmatic approaches to written texts. While there are many studies on surface-structure features, pragmatics seems to be only occasionally considered in authorship attribution tasks, although it is well established in other research areas in language and law. By examining the communicative aspects of forensic texts in detail, pragmatic approaches contribute methodologically to the forensic analysis of authorship and may extend the available linguistic database by analysing deeper layers of text. Unfortunately, many forensic texts are short (less than 200 words) and therefore provide very little linguistic data, hindering the use of automated systems and statistical analysis. From a pragmatic point of view, each text represents a materialised attempt to deal with a communicative situation, and its pragmatic analysis reveals the author's communicative skills, ideas and strategies for handling it. Moreover, integrating the textual level into the linguistic analysis may reveal stylistic traits that can only partially be derived from surface-structural elements and can prove to be highly individual. Therefore, the use of pragmatics could help identify an author even on the basis of a relatively small amount of data. An early application of pragmatics as part of discourse analysis is Coulthard's work on Derek Bentley's confession and other questioned testimonies. Pragmatically oriented studies related to authorship analysis include, for instance, Kaplan's examination of the pragmatic functions of syntactic structures and Brinker's analysis of thematic text patterns. Other studies refer more indirectly to pragmatic aspects through genre-dependent features of texts or only use isolated pragmatics-related features. We aim to illustrate through selected examples how these studies include pragmatic aspects, what they consider pragmatic features, and what they can contribute to the analysis of authorship. Furthermore, by comparing the different approaches, we will use instructive examples to illustrate the advantages of a more text-oriented approach. The paper shows that the use of pragmatic features in forensic authorship analysis currently varies widely and draws on different areas of linguistic research. Since pragmatics offers great potential for forensic applications, future research should aim to link existing approaches more closely and to harmonise them within a specifically forensic orientation.
References
Brinker, K., Cölfen, H., & Pappert, S. (2018). Linguistische Textanalyse: Eine Einführung in Grundbegriffe und Methoden (9th ed.). Berlin: Erich Schmidt.
Coulthard, M., & Johnson, A. (Eds.) (2010). An Introduction to forensic linguistics: Language in evidence. London, New York: Routledge.
Kaplan, J. R. (1998). Pragmatic contributions to the interpretation of a will. International Journal of Speech, Language and the Law, 5 (2), 107-126.
Turell, M. T. (2010). The use of textual, grammatical and sociolinguistic evidence in forensic text comparison. International Journal of Speech, Language and the Law, 17(2), 211-250.
Keywords: authorship analysis; pragmatics; written discourse analysis; stylistic feature; text linguistics
Biodata
After studying Indo-Germanic, German and Sanskrit philology at the University of Göttingen, Eilika Fobbe received her doctorate in historical linguistics in 2003. She then began research into forensic linguistics and worked as a postdoctoral researcher in the Department of German Philology at the University of Göttingen and later in the Department of German as a Foreign Language at the University of Greifswald. In 2004 she joined a joint research project between the University of Göttingen and the Federal Criminal Police Office (BKA). In 2012 she published an introduction to forensic linguistics that has become an important reference in Germany. Dr Fobbe has given lectures at the universities of Bremen, Magdeburg and Hildesheim, and has also worked as a linguistic expert for law firms and courts. Since 2019 she has been employed by the Forensic Institute of the BKA in Wiesbaden, Germany. Her research interests are authorship analysis, deception detection, pragmatic stylistics, and the history of science.
Author: Victoria Guillén-Nieto (victoria.guillen@ua.es)
Abstract
In their excellent state-of-the-art article on veracity evaluation, Nicklaus and Stein (2020) point to the fact that veracity evaluation is dominated by forensic psychologists and argue that 'adding technical linguistic knowledge to the toolbox of veracity evaluation is necessary to analyse evidence given in the very medium that is the object of the scientific study of language' (2020, p. 24). This paper analyses the use of Grice's Cooperative Principle (CP) as a linguistic tool for statement veracity assessment. Grice explains that there are many occasions on which speakers fail to observe the CP and its maxims of conversation. One of these types of non-observance is violating a maxim, that is, the unostentatious non-observance of a maxim. In other words, the speaker who violates a maxim does not expect the hearer to know or realise that she is doing so. This deceptive strategy is, in effect, the essence of lying and deception. We hypothesise that lying has a ripple effect on the maxims of conversation. We ask the following questions: How does lying affect the maxims of conversation in a fabricated statement? Can we consider the violation of the maxims of conversation a subtle linguistic behaviour linked to lying in fabricated statements? If so, can we add the violation of the maxims of conversation as a pragmatic indicator to the toolbox of veracity evaluation? The analysis is grounded in empirical work on two cases of sexual abuse against minor girls aged 14-15 years. In both cases, the expert linguist was asked to write a complementary forensic linguistic report to support the findings of psychology-based approaches, e.g., Reality Monitoring (RM), Criteria-Based Content Analysis (CBCA), and Statement Validity Assessment (SVA). In each case, the expert linguist was asked to analyse the video recording of a minor girl's statement to determine whether or not there were specific linguistic indicators that could support the hypothesis that the statement was fabricated. In addition, the expert linguist made a transcript of the verbal statement, including paralinguistic and kinesic details. The linguistic evaluative report referred to all levels of linguistic analysis: lexical, syntactic, semantic, and pragmatic. This paper draws attention to the growing interest that linguistic evaluative reporting has raised among forensic psychologists and lawyers in the Spanish legal context, who have started to see the benefits of adding linguistics to forensic practice. The paper also breaks new ground in proposing the CP and the maxims of conversation as a pragmatic tool for statement veracity assessment, moving beyond isolated, clichéd language indicators.
References
Grice, H. P. (1975). Logic and conversation; Further notes on logic and conversation. The William James Lectures. Published as Part I of Grice, H. P. (1989), Studies in the way of words. Cambridge, MA: Harvard University Press.
Nicklaus, M., & Stein, D. (2020). The role of linguistics in veracity evaluation. International Journal of Language & Law (JLL), 9, 23-47. http://dx.doi.org/10.14762/jll.2020.023
Keywords: forensic linguistics; cooperative principle; maxims of conversation; statement veracity assessment
Biodata
Victoria Guillén-Nieto is a Professor of Applied Linguistics at the Department of English Studies at the University of Alicante. She teaches forensic linguistics on the Master's in English and Spanish for Specific Purposes and the Master's in Forensic Sciences and Criminal Investigation. She is currently directing the Dual Master's in English and Spanish for Specific Purposes and Forensic Linguistics, co-organised by the University of Alicante and the East China University of Political Science and Law (ECUPL) in Shanghai. In September 2019 she was elected President of the International Association of Language and Law (ILLA) for linguistics. Her research interests are mostly in forensic linguistics (language as evidence). She has co-edited, with Dieter Stein, the volume Language as evidence: Doing forensic linguistics (Palgrave, 2021). At present, she is preparing two volumes: Hate speech: Linguistic approaches, for the series Foundations in language and law (Janet Giltrow & Dieter Stein, Eds., Walter de Gruyter), and The language of harassment: Pragmatic perspectives (Lexington Books). Since 2009, Dr Guillén-Nieto has provided professional linguistic services as an expert linguist in Spain, Switzerland, Sweden, and the USA. She has provided sworn testimony in over twenty cases relating to author identification, trademark disputes, plagiarism detection, veracity evaluation, gender violence, and sexual harassment.
Author: Alberto Hijazo-Gascón (a.hijazo-gascon@uea.ac.uk)
Abstract
This paper aims to raise awareness of the challenges legal interpreters face in police interviews. They need to be faithful to the original speech and, at the same time, render idiomatic versions in the target language. Previous studies have shown that time pressure, memory and note-taking skills, among other factors, may lead the interpreter to condense meaning in the target language, which results in additions, omissions, or alterations of the original text's meaning (Hale, 2007). The interpreted version also frequently involves changes in communicative style and register (Krouglov, 1999). Furthermore, different languages have different ways of encoding meaning, making it difficult for the interpreter to find an equivalent construction. For example, some semantic categories are compulsory in some languages but not in others. In this vein, intentionality in Spanish is always marked, e.g., 'lo tiré' ('I threw it') vs 'se me cayó' ('it happened to me that it fell'), while it is not always marked in English, e.g., 'I dropped it', which can be intentional or unintentional (Filipović, 2007, 2017; Filipović & Hijazo-Gascón, 2018). Thus, interpreters need to choose between more idiomatic versions that may not encode all the semantic information of the source text and versions that convey all the semantic information but may sound unnatural in the target language. These linguistic differences might seem minor at first sight but can have crucial consequences in a legal context. This paper presents an analysis of Spanish-English bilingual police interviews in California (USA). These interviews are transcribed by a control interpreter, who includes translation notes. The analysis presented here is based on the cases in which the two interpreters diverge. The results show different types of inaccuracies that can be classified into two groups. The first group consists of differences due to general interpreting-skills challenges, such as the addition or loss of intensity, the omission of information about emotional states, or changes in the use of euphemisms. The second group includes differences concerning semantic contrasts without an equivalent in the other language, such as non-agentive constructions, motion verbs and modal verbs. These inaccuracies can influence the rapport between the interlocutors and how they perceive each other, interfering with the interviewing strategy. This research contributes to raising awareness of the complexity of police interpreting and advocates the importance of transcribing police interpreting in major cases.
References
Filipović, L. (2007). Language as a witness: Insights from cognitive linguistics. International Journal of Speech, Language and the Law, 14(2), 245-267.
Filipović, L. (2017). Applied language typology: Applying typological insights in professional practice. Languages in Contrast, 17(2), 255-278.
Filipović, L., & Hijazo-Gascón, A. (2018). Interpreting meaning in police interviews: Applied Language Typology in a Forensic Linguistics context. VIAL: Vigo International Journal of Applied Linguistics, 15, 67-104.
Hale, S. B. (2007). Community interpreting. Basingstoke: Palgrave Macmillan.
Krouglov, A. (1999). Police interpreting: Politeness and sociocultural context. The Translator, 5(2), 285-302.
Keywords: police interpreting; applied typology; translation and interpreting; transcripts
Biodata
Dr Alberto Hijazo-Gascón holds a degree in Spanish Language and Hispanic Literature from the University of Zaragoza, Spain. He also graduated with a Master's in Applied Linguistics and Teaching Spanish as a Foreign Language from Nebrija University in Madrid. He completed a PhD in Hispanic Linguistics at the University of Zaragoza in 2011. During his doctorate, he visited the University of Southern Denmark, the University of Lund (Sweden), the University of California Berkeley (USA), and the Max Planck Institute for Psycholinguistics (Netherlands). In 2012, Dr Hijazo-Gascón worked as a researcher and associate tutor for general and comparative linguistics at the University of Zaragoza. He has been working as a Lecturer in Intercultural Communication and Spanish at UEA since September 2012. His latest research involves analysing interpreter-mediated police interviews, looking into how specific typological differences among languages pose challenges for interpreters, and into the potential impact of these challenges.
Author: Dámaso Izquierdo-Alegría (dizquierdo@unav.es)
Abstract
Languages possess a vast array of devices that modulate in very different ways the responsibility we assume when we talk, such as supposedly, allegedly, visibly, obviously, in my opinion, or I guess, among many others. Most of them have been semantically analysed within the linguistic categories of evidentiality and epistemic modality (cf. Aikhenvald, 2004; Nuyts, 2001) and have been pragmatically included among hedges and boosters (Hyland, 2019). Markers of evidentiality and epistemic modality may have a significant impact in potentially defamatory texts: whereas boosting strategies might be seen by lawyers and judges as unambiguous linguistic evidence of the defendant's commitment to punishable statements, hedging strategies may help defendants escape liability. Previous studies in the field of forensic linguistics have not focused on the strategic role of markers of epistemic modality and evidentiality in cases involving defamation and related language crimes, such as hate discourse or threats, although there are some notable exceptions, such as the mentions of such items in the judgments analysed by Shuy (2010) or the mitigating strategies identified in threatening discourse by Gales (2011). This paper aims to transfer specialised knowledge about markers of epistemic modality, evidentiality and related notions to a defamation case in order to clarify their role in potentially defamatory discourse. The defamation case examined in this paper was accessed through Court decision no. 269/2015 given by the Provincial Court in Alicante (Spain). In 2015, a famous Spanish TV host published a post on her blog in which she expressed a very negative opinion about a prominent regional politician. The latter brought a claim against the former for defamation because she considered her honour and dignity damaged. More specifically, the plaintiff claimed that she had been accused of bribery and corruption in the original text, particularly through the following sentence: 'Yo no sé qué dirán los jueces sobre sus corruptelas, pero para mí usted no es una presunta chorizo, es una chorizo sin paliativos...' ('I don't know what the judges will say about your corrupt practices, but, to me, you are not an alleged crook, you are an unmitigated crook…'). This sentence contains different devices that may mitigate (e.g., 'presunta' ('alleged'), 'para mí' ('to me')) or strengthen (e.g., 'chorizo sin paliativos' ('unmitigated crook')) one's commitment to the state of affairs described. Identifying the type and degree of responsibility assumed by the defendant was hindered by this combination of hedges and boosters, as evinced in the analyses proposed by the contending parties and in the judges' reasoning. I will show that transferring specialised research on evidentiality, epistemic modality and related notions would have been extremely useful for understanding which devices construct the potentially ambiguous epistemic commitment assumed by the defendant in her post and for unveiling how boosting and mitigating strategies interact with each other in the text, which might have helped the judges in their legal reasoning.
Keywords: defamation; epistemic modality; evidentiality; hedges; boosters
Biodata
Dámaso Izquierdo Alegría is a postdoctoral researcher at the Institute for Culture and Society (ICS) at the University of Navarra (Pamplona, Spain) within the research group 'Public Discourse', as well as a teacher of general linguistics and Spanish language at the associate centres of the UNED in Pamplona and Tudela. In addition, Dr Izquierdo Alegría has been a visiting researcher at the universities of Heidelberg (Germany), Antwerp (Belgium), Lancaster (United Kingdom) and Tartu (Estonia). His research interests include evidentiality, epistemic modality, discourse analysis, forensic linguistics and corpus linguistics. Dr Izquierdo Alegría has published in well-known journals such as Zeitschrift für romanische Philologie, Oralia, Estudios Filológicos or Cahiers de Lexicologie, and with several publishing houses such as Iberoamericana Vervuert, Peter Lang or EUNSA, among others. He has co-authored publications with Ramón González Ruiz (Universidad de Navarra, Spain), Patrick Dendale (University of Antwerp, Belgium), Óscar Loureda (University of Heidelberg, Germany), and Bert Cornillie (KU Leuven, Belgium).
Authors: Andriana Maria Korasidi (andrianakorasidi@gmail.com); George Mikros (gmikros@gmail.com); Katerina Frantzi (frantzi@aegean.gr)
Abstract
Terrorism in Greece has mainly been committed by far-left revolutionary organisations which claimed responsibility for their actions and simultaneously set out their political ideas in numerous proclamations and communiqués. This paper primarily uses a machine learning approach based on linguistic evidence to capture psychological warning behaviours in written texts by four Greek far-left and radical anarchist terrorist groups that have conducted a considerable number of attacks since 1975 and are strongly associated with terrorist incidents. Using various unsupervised NLP methods, we extract the documents' main topics and identify the common ideological framework expressed in the selected manifestos. Concerning the Greek case, corpus-based techniques have previously been applied to authorship identification, the detection of hate speech, and the comparison of related ideologies by implementing text mining clustering techniques. However, the psychological dimensions that underpin the Greek terrorist mindset have not so far been investigated. Therefore, based on the hypothesis, derived from previous research, that terrorist texts present a linguistic feature set that is psychologically meaningful and correlated with violence, we set two questions: 1) What are the linguistic features and sentiment-related aspects of the language of violence adopted by Greek terrorist groups? 2) Do left-wing terrorist organisations in Greece share a common ideological background? We use four corpora consisting of proclamations signed by four well-known Greek terrorist groups. In order to extract information from the texts and provide insights about their authors from a psychological standpoint, we use LIWC, a computerised word-counting tool that looks for and counts words in psychology-relevant categories across multiple text files. We then analyse the distribution of semantic word clusters ('topics') in our corpora by implementing topic modeling techniques and other concept-visualisation methods (word and collocation clouds, and Multidimensional Scaling on vocabulary). Preliminary results of our research indicate that particular LIWC categories align with the 'terrorist mind', in that specific lexical categories demonstrate distinct psychological properties and capture meaning in our experimental settings. Moreover, many of the ideas discussed in the texts share a common background, while reflecting underlying differences due to changes in the sociopolitical context.
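As a concrete illustration of the unsupervised topic-extraction step described above, the following is a minimal sketch using scikit-learn's LDA implementation; the toy documents and all parameter values are illustrative assumptions, not the authors' actual data, tools, or settings (the study itself uses LIWC, which is proprietary):
```python
# Minimal sketch of unsupervised topic extraction over a corpus of
# proclamations. The two toy documents and all parameters are placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

proclamations = [
    "the state and its mechanisms of repression must be resisted ...",
    "capitalist exploitation and state violence demand struggle ...",
]

vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(proclamations)   # document-term matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(dtm)

# Show the ten highest-weighted words for each inferred topic.
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = weights.argsort()[::-1][:10]
    print(f"Topic {k}:", ", ".join(terms[i] for i in top))
```
In a real setting, the inferred topic-word distributions would then be compared across the four group-specific corpora to test for a shared ideological framework.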
Keywords: Greek terrorism; sentiment analysis; ideology; LIWC; topic modeling
Biodata
Andriana Maria Korasidi is a PhD candidate in forensic linguistics and a member of the Computational Stylistics Lab at the National and Kapodistrian University of Athens. She has been studying the language of terrorism since 2018 and has taken part in numerous research projects. She graduated from the Faculty of Philology of the National and Kapodistrian University of Athens (UoA) and completed the postgraduate programme Technoglossia, carried out in collaboration with the School of Electrical and Computer Engineering of the National Technical University of Athens and the Institute for Language and Speech Processing. Her MA thesis was on feature selection methods in authorship identification using a small e-mail dataset. She has participated as a speaker in national and international conferences on linguistics and education. Her main research interests are computational stylistics and forensic linguistics.
Before teaching on the MA Program in Digital Humanities at HBKU, George Mikros was a Professor of Computational and Quantitative Linguistics at the University of Athens in Greece. Dr Mikros is the Director of the Computational Stylistics Lab and Adjunct Professor at the Department of Applied Linguistics at the University of Massachusetts, Boston, USA. He held the position of Research Associate at the Institute for Language and Speech Processing and was part of research groups that developed notable language resources and NLP tools for Modern Greek. He has held a Teaching Associate position at the Hellenic Open University since 1999 and has been the Director of the undergraduate programme in Spanish Language and Culture since 2016. He has authored five monographs and more than 80 papers published in peer-reviewed journals, conference proceedings, and edited volumes. Dr Mikros has been a member of the Council of the International Association of Quantitative Linguistics (IQLA) since 2007, and in 2018 he was elected its President. Prof. Mikros has been a keynote or invited speaker at many international conferences, workshops and summer schools related to digital humanities and quantitative linguistics.
Katerina T. Frantzi is a Professor of Informatics—Corpus Processing, Director of the Informatics Laboratory, and Head of the Department of Mediterranean Studies, School of Humanities, University of the Aegean, Rhodes, Greece. She graduated from the Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens, Greece. She won a National Research Scholarship at the Institute of Informatics and Telecommunications, National Center for Scientific Research, NCSR Demokritos, Attica, Greece. She received her PhD from the Department of Computing and Mathematics, Manchester Metropolitan University, in collaboration with the Centre for Computational Linguistics, Language Engineering Department, University of Manchester Institute of Science and Technology (UMIST), Manchester, UK. Dr Frantzi has been a visiting researcher at the Department of Information Science, Faculty of Science, University of Tokyo, Tokyo, Japan. Her current research interests are in corpus construction, processing and applications to linguistics, language teaching, political sciences, communication sciences, humanities and social sciences.
Authors: Trang Lam (doantrang1110@gmail.com); Jérémy Demange (jeremy.demange@cyu.fr); Julien Longhi (julien.longhi@cyu.fr)
Abstract
Authorship attribution is an important topic in NLP and has received notable attention in many research communities. It is an interdisciplinary area involving stylometry, information retrieval and machine learning. Most previous research in the area has been devoted to long, formal texts, because of the inherent difficulties in determining the authorship of a short text. However, with the advent of social media, texts are often short. Therefore, the question is: Is it possible to predict the author of a text that may not exceed 280 characters? Our paper analyses the authorship attribution of short texts, especially tweets, using deep learning methods. The dataset used in our study consists of 42,923 tweets collected from 11 candidates during the 2017 French presidential election. This corpus is available on the Ortolang website. It must be noted that political tweets are not necessarily representative of tweets written by the general public (standard tweets). However, we consider that this type of tweet still provides the appropriate characteristics for evaluating the potential of a deep learning approach for authorship attribution (Longhi, 2017). In authorship attribution, the selection of stylometric features plays a critical role; more than 1,000 features have been proposed (Rudman, 1998). For our experiment, we are mostly concerned with word- and character-level n-grams, because previous work in the field supports the effectiveness of these two types of features in authorship attribution systems (Stamatatos, 2013). Before using NLTK tools to extract word and character n-grams, we removed all special characters, numbers, and links from the data. In terms of model construction, we investigate the performance of Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks. We built six models, two CNNs and four LSTMs, and applied them to the two feature types selected. FastText embeddings are used as the pre-trained word vectors. The experimental evaluation showed that the CNNs achieved better accuracy than the LSTMs (83% against 81%). Although these are not outstanding results, deep learning remains a promising approach, since it can outperform traditional machine learning algorithms (Naive Bayes, SVM, Decision Tree, Random Forest). This research is a first step towards examining the performance of deep learning in authorship attribution. In the future, we would like to apply these models to different datasets that are less biased and more reliable (we are currently collecting articles and news items to obtain a set of predefined authors on a well-defined topic).
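To make the model-construction step concrete, here is a minimal sketch of a character-level CNN classifier in Keras. It is not the authors' exact architecture: the byte-level encoding, layer sizes, and toy tweets are illustrative assumptions.
```python
# Minimal character-level CNN for tweet authorship attribution (a sketch,
# not the authors' exact architecture). Inputs are byte-encoded tweets.
import numpy as np
from tensorflow.keras import layers, models

MAX_LEN = 280     # tweets are capped at 280 characters
VOCAB = 256       # byte-level vocabulary
N_AUTHORS = 11    # candidates in the 2017 election corpus

model = models.Sequential([
    layers.Embedding(input_dim=VOCAB, output_dim=64),
    layers.Conv1D(128, kernel_size=3, activation="relu"),  # trigram-like filters
    layers.GlobalMaxPooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(N_AUTHORS, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

def encode(tweet: str) -> np.ndarray:
    """Encode a tweet as a zero-padded sequence of byte values."""
    ids = list(tweet.encode("utf-8"))[:MAX_LEN]
    return np.array(ids + [0] * (MAX_LEN - len(ids)))

# Toy training call with two placeholder tweets and author indices.
x = np.stack([encode("Exemple de tweet politique."), encode("Un autre tweet.")])
y = np.array([0, 1])
model.fit(x, y, epochs=1, verbose=0)
```
The convolution-plus-global-max-pooling design is one common way to let filters act as learned n-gram detectors over short texts, which matches the abstract's focus on character n-gram features.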
References
Longhi, J. (2017). Humanités, numérique: des corpus au sens, du sens aux corpus. Questions de communication.
Rudman, J. (1998). The state of authorship attribution studies: Some problems and solutions. Computers and the Humanities, 31, 351-365.
Stamatatos, E. (2013). On the robustness of authorship attribution based on character n-gram features. Journal of Law & Policy, 21, 421-439.
Keywords: authorship attribution; stylometry; machine learning; social media
Biodata
Trang Lam is currently a PhD student in Linguistics at CY Cergy Paris University. Her thesis focuses on political discourse. More precisely, her research involves discourse analysis, the detection of ideological discourse, and sentiment analysis (including polarity detection and emotion detection). During her Master's in NLP at the New Sorbonne University (Paris 3), she became interested in machine learning and deep learning. Her Master's thesis was an initial attempt to investigate the performance of deep learning methods in the task of authorship attribution.
Jérémy Demange is a developer who started coding on his own at a young age. He studied at the IUT of Cergy-Pontoise. For the past year and a half, he has worked as a developer and design engineer at the IDHN laboratory together with Julien Longhi. He has been involved in many laboratory projects and has worked on the development of several small text-analysis applications. He also contributed to the CHEMI IRITA project, which deals with authorship attribution, developing a functional web application for document analysis. Furthermore, Demange developed an interactive application for co-reference detection. He intends to develop a complete application integrating corpus-building functionality (with built-in scraping), pre-processing functions and text analysis tools.
Julien Longhi is a Professor of Linguistics at the University of Cergy-Pontoise in Paris. He specialises in the discourse analysis of political and media texts, focusing on ideologies, social media and digital humanities. He has published books, articles, and edited volumes in pragmatics, semantics, corpus linguistics, and discourse analysis. Dr Longhi is currently working on two major projects: one investigating ideology detection on Twitter and the other looking at risk and security discourses in collaboration with security authorities. In addition to collaborating with authorities, he has opened a platform, http://ideo2017.ensea.fr/, on which members of the public can analyse politicians' tweets. Dr Longhi is also an active commentator on current political matters in French journalistic media and blogs.
Author: Robert Leonard (robert.a.leonard@hofstra.edu)
Abstract
The analysis of language evidence in investigations is often nonscientific. Skilled language users such as investigators and judges may understand how language works, but not to the degree that scientific linguistics does. False confessions can be instrumental in convicting defendants, yet often go undetected. I focus here on confessions written by the police and attributed falsely to the accused. How can it be demonstrated that a confession did or did not originate with a suspect? One answer is authorship analysis. In the US, the standards for the admission of scientific testimony are Daubert and Frye. However, the author has been successful in having judges admit his authorship methodology and allow his testimony numerous times in murder and high-stakes monetary cases. This paper asks two questions: 1) Did the accused, Antwan Cubie, himself author the murder confession, or did he not? 2) Do the patterns in the so-called confession match the linguistic patterns of known contemporaneous writings of Cubie, or do they better match those of the testifying detective? Forensic linguistics narrows the suspect pool of possible authors, discerns demographic information from language evidence, and then, given samples from subjects, helps identify or exclude possible authors. Both qualitative and quantitative methods are used. Qualitative methods are largely inductive. The linguistic analysis aims to discern nonrandom patterns indicating whether a hypothesis of common authorship explains the data better than a hypothesis of independent authorship. Important tools are corpus analysis and theoretical apparatuses such as the Community of Practice and the sociolinguistic variable. The Court and the Torture Commission have received our analysis, and a major law firm has now taken Cubie on as a pro bono appeals client. Supervised by faculty, Hofstra forensic linguistics interns work with Law School interns in analysing the evidence and appeal possibilities, especially in capital cases in which language evidence (typically a recorded conversation, an interrogation, or a confession) played a crucial role in a defendant's conviction and death sentence. Their ongoing research using authentic data increases our ever-improving understanding of language and of the efficacy of its application to real-world cases.
References
Coulthard, M. (2004). Author identification, idiolect, and linguistic uniqueness. Applied Linguistics, 25, 431-447.
Grant, T. (2013). TXT 4N6: Method, consistency, and distinctiveness in the analysis of SMS text messages. Journal of Law and Policy, 21, 467-494.
Leonard, R. A. (2017). Forensic linguistics. In V. van Hasselt, & M. Bourke (Eds.), Handbook of behavioral criminology: Contemporary strategies and issues. US: Springer.
Leonard, R. A., Ford, J., & Christensen, T. K. (2017). Forensic linguistics: Applying the science of linguistics to issues of the law. Hofstra Law Review, 45, 881.
Shuy, R. W. (1993). Language crimes: The use and abuse of language evidence in the courtroom. Cambridge: Blackwell.
Keywords: authorship; confessions; innocence project; exonerations; Daubert
Biodata
Robert A. Leonard is a Professor of Linguistics, Director of the Hofstra Graduate Program in Linguistics: Forensic Linguistics, and Director of the Forensic Linguistics Death Penalty Innocence Project, a joint venture with Hofstra Law School. Leonard partnered with Roger Shuy, Georgetown Distinguished Research Professor Emeritus and the founder of forensic linguistics in the US, and has since worked on hundreds of cases. Dr Leonard has consulted with the FBI and with police, counter-terrorism, and intelligence agencies throughout the US, Canada, the United Kingdom, and Europe, working on cases and training agents in the use of forensic linguistics in law enforcement, threat assessment, and counter-terrorism. The FBI's Behavioral Analysis Unit (BAU) recruited Leonard to help train its agents in forensic linguistic techniques and to advise on its Communicated Threat Assessment Database (CTAD). Other clients have included innumerable defence teams, Apple, Facebook, and the Prime Minister of Canada. The New Yorker calls Leonard '[O]ne of the foremost language detectives in the country' (the New Yorker article is available at http://www.newyorker.com/magazine/2012/07/23/words-on-trial). Wikipedia's biography of him is available at http://en.wikipedia.org/wiki/Robert_A._Leonard.
Author: Elena Morandini (elenamo71@gmail.com)
Abstract
Hate speech and violent language detection with computational linguistics tools is a highly topical field. If such technology could be applied to criminal jargon, law enforcement agencies would have new resources to improve investigations and evidence gathering against organised crime. For example, if it were proven that the Mafia's particular and semantically unique use of Sicilian dialect is not found in other criminal jargon, such cutting-edge technology could help detect Mafia language among millions of words in messages or voice recordings. To fill the gap in Natural Language Processing (NLP) applied to crime language for computational forensic linguistics and jurisprudence purposes, this investigation takes the innovative approach of applying Machine Learning (ML) tools to the not yet detectable Mafia language found in electronic surveillance transcriptions, which are used in Italian courts as evidence. The theoretical underpinning of this research is how NLP approaches linguistic problems through supervised learning. Moving from document classification to Mafia language identification, the alternative hypothesis was confirmed: a Mafia language variable can be differentiated and automatically detected from a non-Mafia variable, represented by any other criminal jargon. The lack of linguistic references regarding crime language identification from an NLP perspective was a limitation in establishing a research methodology. An empirical trial-and-error approach was therefore embraced, partly following Corpus Linguistics for text collection and standard NLP procedures for the analysis. A modular approach was used to reduce the samples into tokens and then into lemmas, which were analysed with PoS tagging and parsing. Word frequencies and keywords were determined with AntConc and RStudio. Once the linguistic elements in the samples of both language variables had been identified, a quantitative semantic analysis was carried out with T-Lab to extract specific features. After the distinctive linguistic elements of both types of language had been identified, a further analysis examined whether an ML model could learn to identify them from labelled examples of Mafia language. For this purpose, the Weka toolkit was used with TF/IDF settings in the StringToWordVector filter to give further weight to the keywords already identified in the content analysis. With a 70% success rate in identifying Mafia language, the results show an improvement over the majority-class baseline, calculated as a starting point before applying more complex models to obtain more robust results in the Weka experiment. This pioneering investigation opens new perspectives in language analysis with ML tools for computational forensic linguistics and legal proceedings. Multidisciplinary teams of IT specialists, legal experts, law enforcement agencies, and linguists could directly analyse electronic eavesdropping files with ML tools, saving investigators time and energy. Attention would be given only to those conversations that raise a high percentage of suspicious elements. Relevant information would be collected, analysed, and classified faster, with more consistent results and better cross-validation, to extract meaningful data more likely to be admitted as evidence in court, fighting the Mafia with its own words.
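As a rough illustration of the final classification step, the sketch below mirrors the setup described above (TF/IDF term weighting plus a supervised classifier, evaluated against a majority-class baseline) in scikit-learn rather than in Weka; all texts and labels are invented placeholders, not data from the study.

```python
# Illustrative sketch only: the study used Weka's StringToWordVector filter
# with TF/IDF; this mirrors the same idea in scikit-learn. The snippets and
# labels below are invented placeholders, not surveillance data.
from sklearn.dummy import DummyClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB

texts = [  # placeholder transcription snippets
    "pizzo quartiere rispetto", "cosca territorio onore",
    "famiglia mandamento appalti", "picciotto paese messaggio",
    "partita roba consegna", "soldi giro contatto",
    "carico porto stanotte", "telefono incontro domani",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = Mafia language, 0 = other criminal jargon

X = TfidfVectorizer().fit_transform(texts)            # TF/IDF term weights

baseline = DummyClassifier(strategy="most_frequent")  # majority-class baseline
model = MultinomialNB()                               # supervised classifier

print("baseline:", cross_val_score(baseline, X, labels, cv=2).mean())
print("model:   ", cross_val_score(model, X, labels, cv=2).mean())
```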
References
Cavallaro, L., Ficara, A., de Meo, P., Fiumara, G., Catanese, S., Bagdasar, O., & Liotta, A. (2020). Disrupting resilient criminal networks through data analysis: The case of Sicilian Mafia. ArXiv, 1-12.
Chung, Y.-L., Kuzmenko, E., Tekiroğlu, S. S., & Guerini, M. (2019). CONAN - COunter NArratives through Nichesourcing: A multilingual dataset of responses to fight online hate speech. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 2819-2829.
Falcone, G., & Padovani, M. (1991). Cose di Cosa Nostra (6th ed.). BUR Rizzoli.
Razavi, A. H., Inkpen, D., Uritsky, S., & Matwin, S. (2010). Offensive language detection using multi-level classification. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 6085, 16-27.
Witten, I. H., Frank, E., & Hall, M. A. (2011). Data mining: Practical machine learning tools and techniques (3rd ed.). Burlington, MA: Elsevier.
Keywords: Mafia language; computational forensic linguistics; NLP; ML; tools
Biodata
Elena Morandini holds a Bachelor of Arts degree in English and Spanish Foreign Languages and Literatures from the University of Udine. In 2021 she graduated from the University of Alicante with a Master’s degree in English and Spanish for Specific Purposes, specialising in computational forensic linguistics. She has been a freelance translator since 2000.
Author: Yenny Eliana Sotomayor Marcelo (yeliana9@gmail.com)
Abstract
This paper focuses on the analysis of the conceptual content of trademarks including personal names. The aim is to offer a reliable and scientific framework for resolving opposition proceedings. The conceptual content of these trademarks is not clearly stated. Instead, they are indistinctly regarded either as (1) signs identifying an individual or as (2) signs merely used to denote goods/services. This situation has created uncertainty, with two discordant positions: (a) personal names do not have conceptual content, and (b) personal names have conceptual content. Trademark linguistics is an established area of forensic linguistics (Butters, 2010). Trademark law seeks to enforce its language planning by granting rights over the ownership of words (Shuy, 2002). However, the functions of language in trademarks have not yet been fully identified in the legal literature. Thus, considerations about meaning require a starting point in linguistics, with a broader and highly contextualised interpretation. The linguist's contribution can help determine current usage by linking notions from the linguistic field with legal rules and, eventually, by proposing changes in the law. However, Guillén-Nieto (2011) affirms that forensic linguistics is a relative newcomer among the forensic sciences in civil law countries. For this reason, even where linguistic analysis has been admitted in some cases, there is ultimately no defined value for this contribution. This paper asks four questions: 1) Should trademarks including personal names be treated as signs denoting only goods/services? 2) How can we define their 'associative content'? 3) How can we assess the conceptual content of trademarks including personal names? And 4) how important is the conceptual content for the analysis of trademarks including personal names? The paper adopts a semiotic and linguistic framework. The semiotic analysis looks into the nature of these signs and how they are interpreted. Linguistics, through lexical, semantic, and pragmatic analysis, assists in determining their conceptual content. Semiotics defines 1) the signifier as a personal name and 2) the signifier as a trademark, each with fixed designations. At the lexical level, etymology identifies roots and variants. Onomastics identifies the onymic object both as a personal name and as a trademark. The semantics of proper names contains two basic senses: 1) personal name and 2) trademark, each with its referents and assigned meanings.
References
Butters, R. (2010). Trademark linguistics. Trademarks: Language that one owns. In M. Coulthard, & A. Johnson, (Eds.), The Routledge handbook of forensic linguistics (pp. 351-364). New York, NY: Routledge.
Guillén-Nieto, V. (2011). The linguist as expert witness in the community trademark courts. ITL - International Journal of Applied Linguistics, 162, 63-83.
Shuy, R. W. (2002). Linguistic battles in trademark disputes. New York: Palgrave Macmillan.
Keywords: trademarks including personal names; likelihood of confusion; European Union trade mark law; forensic linguistics; conceptual content
Biodata
Yenny Eliana Sotomayor is a Peruvian lawyer with a professional degree (summa cum laude) granted by the Universidad de Lima, Peru. She specialises in civil and business law. She later graduated from the University of Alicante with a Master’s degree in English and Spanish for Specific Purposes, obtaining an honorific mention for her Master's dissertation on trademark linguistics. Her early working experience was mainly in civil litigation before the courts of justice in Peru. She subsequently moved to Asia to work for international companies in Japan and Hong Kong, acquiring extensive knowledge of intellectual property, international law, and legal translation in English, Spanish, Portuguese, French, and Italian. Sotomayor is currently working on her PhD in trademark linguistics at the University of Alicante.
Author: Joy Steigler (joy.steigler@uni-muenster.de)
Abstract
Written witness statements are an important step in an investigative process: their evaluation as credible or unreliable reports can determine the form and extent of further investigations conducted by the police. Studies have shown that, regardless of a statement's content, linguistic features can significantly influence the perception of a witness's trustworthiness. For instance, disturbances of speech fluency such as stuttering can decrease the impression of false reporting (Addington, 1971). However, the analysis of witness credibility has so far focused on two perspectives: predominantly, studies of witness credibility strive to identify actual (verbal) truth indicators, while the smaller number of studies analysing credibility attribution processes investigate either non-verbal or extralinguistic (e.g., prosodic) features. In response to this research gap, the PhD project presented here shifts the focus to intralinguistic features. Since it has been shown that the impression of competence leads to the impression of credibility (see Nawratil, 2006), the present study follows the hypothesis that written statements containing prestigious language features receive higher credibility attributions. Prestigious language is operationalised as educational language containing characteristics such as a complex and condensed nominal style. The empirical approach consists of two steps: text manipulation and a subjective rating by test persons using a semantic differential. The text manipulation comprises the replacement of simplified colloquial features with educational language features; for example, active structures are exchanged for depersonalising constructions. In the rating, test groups were asked to evaluate the written witness statements on a classic semantic differential including dichotomous poles such as 'trustworthy - untrustworthy'. Half of the test groups read the authentic statements; the other half read the manipulated reports. In total, the rated corpus consists of 35 authentic witness statements (provided through a collaboration with a traffic commissariat and the prosecution of Münster, approved by the Ministry of Justice of North Rhine-Westphalia) and 35 manipulated counterparts. The present results provide strong evidence that witnesses using educational language receive higher credibility scores. Furthermore, as an excursus, the present study also attempts to categorise the text type of written witness statements, revealing an enormous range of techniques used by the witnesses to formulate the incidents. Contrary to expectations, the inventory of the authentic statements shows, inter alia, a dominance of conceptually oral constructions, and colloquial forms are frequently used. In addition, quantitative parameters in particular, such as text and sentence length, seem to vary with the degree of involvement. Although the written statement is one of the first and therefore most relevant measures used by the police in investigations, it has not been considered in forensic linguistic research until now. The results show that certain language features can indeed influence the impression of credibility. Moreover, witnesses seem to struggle with producing this unfamiliar text type, resulting in scarcely comprehensible reports. Therefore, linguistic research on witness statements can and should be used to create guidelines supporting their production and evaluation.
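A minimal sketch of how the two rating conditions could be compared, assuming credibility scores collected on a numeric semantic-differential scale; the figures below are invented placeholders, not the study's data.

```python
# A minimal sketch, not the study's actual analysis: it compares mean
# credibility ratings for the two conditions (authentic vs. manipulated
# statements). All numbers are invented placeholders.
from scipy import stats

# Hypothetical per-rater credibility scores from a 1-7 semantic differential,
# recoded so that higher values mean more credible
authentic = [3.8, 4.1, 3.5, 4.0, 3.6, 3.9]
manipulated = [4.6, 5.0, 4.8, 4.4, 4.9, 4.7]  # educational-language versions

t, p = stats.ttest_ind(manipulated, authentic)  # independent-samples t-test
print(f"t = {t:.2f}, p = {p:.4f}")
```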
Keywords: written witness statements; credibility attribution
Biodata
Joy Steigler is a final-year PhD candidate at the WWU Graduate School for Empirical and Applied Linguistics. Steigler is analysing the impact of language attitudes on credibility attribution in written witness statements. Steigler is also a research assistant in the interdisciplinary DFG collaborative research project Law and Literature, in the subproject How and why do courts quote? Within this project, she authored one encyclopedia article about legal linguistics and another about quotations, combining a legal and a linguistic view (both to be published in May 2021).
Author: Magdalena Szczyrbak (magdalena.szczyrbak@uj.edu.pl)
Abstract
Expert testimony has been the focus of numerous legal, psychological and philosophical studies (Ward, 2017). Lawyer-witness interaction, in turn, has been a prominent research area in linguistic scholarship, which has looked at such aspects as questioning strategies, the manifestation of power and authority, and the effect of presentational style on juries (Cotterill, 2003; Heffer, 2005). Since expert witnesses do not report what they have personally experienced or observed, they offer assessments based on 'sufficient facts or data', and they communicate expert knowledge with a 'reasonable degree of scientific (or discipline) certainty'. This aspect is reflected in the linguistic practices such witnesses pursue when interacting with counsel and in the questioning tactics that the opposing counsel employs to undermine the validity of expert testimony during cross-examination. It is in cross-examination that the voice of law, with its fact-finding principles determining what counts as 'evidence' or 'truth', meets the voice of science, with its primary goal of discovering the truth. In my talk, adopting a discourse-analytic perspective, I will explain how the troubled relationship between science and law becomes manifest during the cross-examination of two medical experts in a jury trial and how the two resist the narrative imposed by the opposing counsel. Using data from the California v. Murray trial, I will present the strategies identified in a corpus-assisted analysis of the trial transcripts (collocates of I, you and we) as well as those identified through a qualitative analysis of the trial videos, and I will demonstrate the role that these strategies play in the negotiation of the status of expert knowledge. As the analysis suggests, the counsel's turns are characterised by the use of hypotheticals (would you expect) as well as reliance on negation to challenge the witness's expertise (didn't you?). The expert witnesses, on the other hand, use communication verbs to mark resistance (I would disagree), negation to signal non-commitment (I'm not aware), as well as markers of expert identity (we as clinicians) and of limited knowledge (it's outside my realm). The tension between the legal and medical worlds is also visible in the choice and preferred interpretation of medical terminology (insomnia vs sedation). Summing up, the findings demonstrate which interactional strategies are typical of expert witness cross-examination in the adversarial system. In addition, they provide further insights into the presentational style of expert witnesses and into how they communicate expert knowledge or its lack. As such, they add to the body of research on courtroom epistemics and may be applied in future studies comparing the styles and stances of expert and fact witnesses.
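As a rough illustration of the corpus-assisted step, the sketch below counts window-based collocates of the pronouns I, you and we in a transcript fragment; it is not the author's actual procedure, and the fragment is an invented placeholder.

```python
# A toy collocate count (not the author's actual procedure): words occurring
# within a +/-2-token window of each pronoun. The 'transcript' is invented.
from collections import Counter

transcript = ("i would disagree with that i am not aware of such a study "
              "we as clinicians would not expect that didn't you say so").split()

def collocates(tokens, node, window=2):
    """Count words occurring within `window` positions of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo, hi = max(0, i - window), i + window + 1
            counts.update(t for t in tokens[lo:hi] if t != node)
    return counts

for pronoun in ("i", "you", "we"):
    print(pronoun, collocates(transcript, pronoun).most_common(3))
```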
References
Cotterill, J. (2003). Language and power in court: A linguistic analysis of the O.J. Simpson trial. Basingstoke: Palgrave Macmillan.
Heffer, C. (2005). The language of jury trial: A corpus-aided analysis of legal-lay discourse. Basingstoke: Palgrave Macmillan.
Ward, T. (2017). Expert testimony, law and epistemic authority. Journal of Applied Philosophy, 34(2), 263-277.
Keywords: courtroom interaction; expert testimony; expert witness; jury trial; stance
Biodata
Magdalena Szczyrbak is an Assistant Professor at the Institute of English Studies, Jagiellonian University, Kraków, Poland. Dr Szczyrbak's research interests lie mainly in discourse analysis and corpus-assisted discourse studies applied to legal discourse and, in particular, in the study of stance-related and evaluative patterns in legal genres. She is the author of the book The realisation of concession in the discourse of judges: A genre perspective (2014). She has also published several articles in scholarly journals and contributed several chapters to edited volumes, most of which concern legal discourse analysis. Her most recent studies focus on the use of mental and communication verbs in courtroom examinations and judicial opinions and on their role in epistemic and evidential marking. She is currently researching stance-related interaction patterns in jury trials, with an emphasis on the strategies pursued in expert witness examinations. Dr Szczyrbak is a member of several professional associations (International Language and Law Association, American Pragmatics Association, Polish Linguistic Society), and she is a certified Polish-English translator and court interpreter.
Author: Hans van Halteren (hvh@let.ru.nl)
Abstract
When trying to identify or verify the author of a disputed text, we use textual features that are to some degree distinguishing for the author(s) in question. We build on our experience (in expert-based studies) or on statistics (in computer-based studies) to determine those features. In both cases, our opinions on ‘good’ features are based on available undisputed texts. One problem with this approach is that the undisputed texts may differ from the disputed texts. Especially in a forensic context, it is likely that the available background material is quite different from the disputed texts. Potential ‘solutions’ to this problem so far generally consist of 1) ignoring the problem altogether and assuming that the value of the features survives a text type change, or 2) demanding that the background material be of the same type as the disputed texts. As the circumstances often rule out the second solution, I investigate the viability of the first one. I expect that not all features will be equally distinguishing in different text types, but that some types of features, especially those based on syntax and richness, may be more resistant to text type changes than others. My experimental data consists of the British National Corpus (BNC), with the extracted features being the same as those used for author identification in the chapter on Automatic Authorship Investigation in the Palgrave volume on Forensic Linguistics (Guillén Nieto & Stein, 2021): character n-grams, token n-grams, syntactic n-grams (i.e. subtrees of the constituency analysis trees), syntactic rewrites, and richness measures (both lexical and syntactic). For the token and syntactic features, there are both the original forms and variants that mask topic-dependent words. Based on the corpus documentation, I identified 19 authors who have samples in two different text types. The pairings range from academic texts in different fields, or academic and non-academic texts within the same field, to prose fiction versus academic text in the technical-engineering field. For those authors, I determine how each author's feature values in each text type compare to the other authors' values in those text types. I then examine whether these relative values are similar across the different text types. Finally, I determine how robust specific features and types of features are against text type changes. My findings are relevant to any future investigation, forensic or not, in which there is insufficient background material of the same nature as the disputed text(s). Furthermore, if my hypothesis is correct, they will indicate which types of features can be used in this situation. At the moment of submitting this abstract, I do not yet have results. However, the data has already been processed, and full results will be presented at the conference.
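To make one of the feature types named above concrete, the sketch below extracts character trigram profiles from two texts and compares them with cosine similarity; it is a toy illustration under assumed settings, not the extraction pipeline actually used in the study.

```python
# A toy illustration (not the study's pipeline): character trigram profiles
# compared with cosine similarity. The two texts are invented placeholders.
from collections import Counter
from math import sqrt

def char_ngrams(text, n=3):
    """Relative frequencies of character n-grams in a text."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()}

def cosine(p, q):
    """Cosine similarity between two n-gram frequency profiles."""
    dot = sum(p[g] * q[g] for g in set(p) & set(q))
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm

text_a = "the experimental data consists of the british national corpus"
text_b = "the extracted features are character and token n-grams"
print(cosine(char_ngrams(text_a), char_ngrams(text_b)))
```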
Keywords: author recognition; text classification features; cross-genre; forensic linguistics; English
Biodata
Hans van Halteren is a researcher at the Centre for Language Studies at Radboud University in Nijmegen. His fields of study are corpus linguistics, language variation and forensic linguistics, in which he applies various computational methods, including machine learning. Within forensic linguistics, Dr van Halteren works on the foundations of automatic authorship recognition and especially on charting what might be measured in texts to determine authorship. In this, he is especially interested in textual features that derive from the syntactic structure of the utterances in the texts, a topic he introduced to the field in his most cited article so far: Baayen, van Halteren, & Tweedie (1996), Outside the Cave of Shadows: Using syntactic annotation to enhance authorship attribution, Literary and Linguistic Computing. This year, Dr van Halteren has also started to investigate the use of deep learning techniques for authorship studies, but that will be the topic of another talk.