=Paper= {{Paper |id=Vol-3033/paper77 |storemode=property |title=Linguistic Cues of Deception in a Multilingual April Fools’ Day Context |pdfUrl=https://ceur-ws.org/Vol-3033/paper77.pdf |volume=Vol-3033 |authors=Katerina Papantoniou,Panagiotis Papadakos,Giorgos Flouris,Dimitris Plexousakis |dblpUrl=https://dblp.org/rec/conf/clic-it/PapantoniouPFP21 }} ==Linguistic Cues of Deception in a Multilingual April Fools’ Day Context== https://ceur-ws.org/Vol-3033/paper77.pdf
    Linguistic Cues of Deception in a Multilingual April Fools’ Day Context

Katerina Papantoniou (1,2), Panagiotis Papadakos (2), Giorgos Flouris (2), Dimitris Plexousakis (1,2)
              1. Computer Science Department, University of Crete, Greece
                   2. Institute of Computer Science, FORTH, Greece
             {papanton, papadako, fgeo, dp}@ics.forth.gr



Abstract

In this work we consider the collection of deceptive April Fools’ Day (AFD) news articles as a useful addition to existing datasets for deception detection tasks. Such collections have an established ground truth and are relatively easy to construct across languages. As a result, we introduce a corpus that includes diachronic AFD and normal articles from Greek newspapers and news websites. On top of that, we build a rich linguistic feature set, and analyze and compare its deception cues with the only AFD collection currently available, which is in English. Following a current research thread, we also discuss the individualism/collectivism dimension in deception with respect to these two datasets. Lastly, we build classifiers by testing various monolingual and crosslingual settings. The results showcase that AFD datasets can be helpful in deception detection studies, and are in alignment with the observations of other deception detection works.

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1    Introduction

April Fools’ Day (AFD for short) is a long-standing custom, mostly in Western societies. It is the only day of the year when practical jokes and deception are expected. This is the case for all social interactions, including journalism, which is generally considered to aim at the presentation of truth. Every year on this day, newspapers and news websites take part in an unofficial competition to invent the most believable, but untrue story. In this respect, AFD news articles fall into the deception spectrum, as they satisfy widely accepted definitions of deception as in Masip et al. (2005).

The massive participation of news media in this custom establishes a rich corpus of deceptive articles from a diversity of sources. Although AFD articles may exploit common linguistic instruments with satire news, like exaggeration, humour, irony and paralogism, they are usually considered a distinct category. This is mainly due to the fact that they also employ other mechanisms which characterize deception in general, like sophisms, and changes in cognitive load and emotions (Hauch et al., 2015) to deceive their audience. AFD articles are often believable, and there exist cases where sophisticated AFD articles have been reproduced by major international news agencies worldwide[1].

[1] https://www.nationalgeographic.com/history/article/150331-april-fools-day-hoax-prank-history-holiday

This motivated us to extend our previous work on linguistic cues of deception and their relation to the cultural dimension of individualism and collectivism (Papantoniou et al., 2021), in the context of the AFD. That work examines if differences in the usage of linguistic cues of deception (e.g., pronouns) across cultures can be identified and attributed to the individualism/collectivism divide.

Specifically, the contributions of this work are:

    • A new corpus that includes diachronic AFD and normal articles from Greek newspapers and news websites[2], adding one more AFD collection to the currently unique one in English (Dearden and Baron, 2019).
    • A study and discussion of the linguistic cues of deception that prevail in the Greek and English collections, along with their similarities.
    • A discussion on whether the consideration of the individualism/collectivism cultural dimension in the context of AFD aligns with the results of our previous work.
    • An examination of the performance of various classifiers in identifying AFD articles, including multilanguage setups.

[2] The collection is available in: https://gitlab.isl.ics.forth.gr/papanton/elaprilfoolcorpus

2    Related Work

The creation of reliable and realistic ground truth datasets for the deception detection task is challenging (Fitzpatrick and Bachenko, 2012). Crowdsourcing, in the form of online campaigns in which people express themselves in a truthful and/or deceitful manner for a small payment, is a well established way to collect deceptive data (Ott et al., 2011). Real-life situations such as trials (Soldner et al., 2019) or the use of data from board games have also been employed (Peskov et al., 2020). Another popular approach is the reuse of content from sites that debunk articles like fake news and hoaxes (Wang, 2017; Kochkina et al., 2018). Lastly, satire news is another way to collect deceptive texts, but with some particularities due to humorous deception (Skalicky et al., 2020).

The only work that explores AFD articles is that of Dearden and Baron (2019). They collected 519 AFD and 519 truthful stories and articles in English over a period of 14 years. A large set of features was exploited to identify deception cues in AFD stories. Structural complexity and level of detail were among the most valuable features, while applying the same feature set to a fake news dataset resulted in similar observations.

To the best of our knowledge, the only deception related dataset for the Greek language is that of Karidi et al. (2019). This work proposed an automatic process for the creation of a corpus of fake news and hoax articles, but unfortunately the created corpus over Greek websites is not available. If we also consider that the creation of a Greek dataset for deception through crowdsourcing is a cumbersome and expensive task, that is further hindered by the exceptionally limited number of native Greek crowd workers, it is easy to understand why there is a lack of datasets.

Regarding the individualism/collectivism cultural dimension, it constitutes a well-known division of cultures that concerns the degree to which members of a culture value individual over group goals and vice versa. In individualism, ties between individuals are loose and individuals are expected to take care of only themselves and their immediate families, whereas in collectivism ties in society are stronger. In Papantoniou et al. (2021) there is a preliminary effort, driven by prior work in the discipline of psychology (Taylor et al., 2017), to examine if deception cues are altered across cultures and if this can be attributed to this divide. Among the conclusions were that people from individualistic cultures employ more third and fewer first person pronouns to distance themselves from the deceit when they are deceptive, whereas in the collectivism group this trend is milder, signalling the effort of the deceiver to distance the group from the deceit. In addition, in individualistic cultures positive sentiment is employed in deceptive language, whereas in collectivist cultures there is a restraint of expression of sentiment both in truthful and deceptive texts.

To this end, this work explores the deception-related characteristics of a new Greek corpus based on AFD articles from a variety of sources, and compares them with the English ones[3]. Further, since related studies (Triandis and Vassiliou, 1972; Hofstede, 1980; Koutsantoni, 2005) describe Greece as a culture with more collectivistic characteristics (by using country as proxy for culture), we also discuss differences in deception cues along this cultural dimension.

[3] We also experimented with data from the limited number of satirical and hoaxes sources of the Greek Web. We do not discuss them here though, since the classifiers reported excellent accuracy showcasing the lack of diversity and the existence of domain specific information in the collected data.

3    Corpus Creation

The AFD articles have been gathered by hand because a crawling-based collection approach was not applicable in our case. Since the news websites industry in Greece is not large enough to yield an acceptable number of crawled AFD articles, we had to additionally collect articles from the press, including articles from the pre-WWW era. Specifically, we visited the local library that maintains a printed archive of newspapers and searched for disclosure articles in the issues after the 1st of April, took photos of the AFD articles, and then used OCR and manual inspection to extract the text. In addition we contacted national and local news media providers to get access to their digitized archives. The rest were gathered from the Web.

The articles were categorized thematically into the following five categories: society, culture, politics, world, and sports. If no category was provided by the original source, we manually annotated the articles. For each article we kept the title, the main body, the published date, the name, the type of the source (newspaper or news website), and (if available) the caption, the subtitle and the author. As preprocessing steps we applied spellcheck and normalization. The correction of spelling mistakes was necessary primarily for articles extracted through OCR tools, although spelling errors were identified in other articles too. Normalization was performed for homogeneity reasons in the texts retrieved from the 80’s, since we observed language differences in some forms (e.g., in the suffix of genitive case), which are remains of an old form of Modern Greek[4].

For the truthful collection we used the same manual procedure and we tried to have a balanced dataset in terms of thematic categories. The truthful collection consists of articles that have been published in days relatively close to the 1st of April in order to have articles that do not differ significantly with respect to their topics, mentioned named entities, etc.

Since the AFD tradition is alive in Greece, we were able to locate a lot of such articles from various newspapers and news websites for our corpus (112 different sources). Specifically, we managed to collect 254 truthful and 254 deceptive articles spanning the period 1979–2021. Tables 1 and 2 present some statistics of the corpus.

4    Features Analysis

For the analysis of AFD articles we adapt and build upon the feature set used in Papantoniou et al. (2021), adapting it to the Greek language. The resulting feature set consists of 64 features for the Greek language and 75 for the English, due to the smaller availability of linguistic resources for Greek (e.g., in sentiment lexicons). For the analysis we performed the non-parametric Mann–Whitney U test (two-tailed) with a 99% confidence interval (CI) and α = 0.01. Table 3 depicts the results of this analysis for the Greek (elAFD) and English (enAFD) datasets[5].

In both datasets, positive sentiment is related to the deceptive articles, while negative sentiment is related to the truthful articles. The only exception concerns the enAFD dataset, where for the NRC lexicon the opposite holds (NRC is one of the six sentiment lexicons used for features in English). In addition, negative emotions like anger, fear and sadness are related to truthful news articles in both datasets. The use of positive emotive language during deception may be a strategy for deceivers to maintain social harmony, as also noticed by other studies (Newman et al., 2003; Pérez-Rosas et al., 2018). The difference in the use of emotional language between truthful and deceptive news is more intense in the enAFD dataset, where five out of the eight emotions in the NRC lexicon are found to be statistically significant. This is in alignment with the results in Papantoniou et al. (2021) for individualistic and collectivistic cultures.
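The per-feature test used in this section can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the p-value uses the normal approximation with tie correction and no continuity correction, and the effect size is computed as the rank-biserial correlation, which is only one common choice (the paper does not state how r was obtained). The sign convention (positive r when values are higher in deceptive texts) is chosen to match Table 3.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(deceptive: Sequence[float],
                   truthful: Sequence[float]) -> Tuple[float, float, float]:
    """Two-tailed Mann-Whitney U test (normal approximation).

    Returns (U for the deceptive group, p-value, rank-biserial r).
    Assumption: r = 2U / (n1 * n2) - 1, positive when the feature tends
    to be higher in deceptive texts.
    """
    n1, n2 = len(deceptive), len(truthful)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    combined = list(deceptive) + list(truthful)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0

    # Variance of U under the null hypothesis, corrected for ties.
    n = n1 + n2
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))

    if variance == 0:
        p_value = 1.0  # all observations identical
    else:
        z = (u1 - n1 * n2 / 2.0) / math.sqrt(variance)
        p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed
    r = 2.0 * u1 / (n1 * n2) - 1.0
    return u1, p_value, r


if __name__ == "__main__":
    # Hypothetical per-article values of one feature (e.g. adverb ratio).
    adverbs_deceptive = [0.09, 0.11, 0.08, 0.12, 0.10, 0.13]
    adverbs_truthful = [0.06, 0.07, 0.09, 0.05, 0.08, 0.07]
    u, p, r = mann_whitney_u(adverbs_deceptive, adverbs_truthful)
    print(f"U={u:.1f}  p={p:.4f}  r={r:.2f}  significant={p < 0.01}")
```

In a full analysis this function would be applied once per feature, keeping the features with p below the chosen α and an absolute effect size above the reporting threshold.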
Measure             Truthful    Deceptive
Num. of articles    254         254
Avg. length         336         255
Min. length         57          33
Max. length         1347        1163

Table 1: Overview of the dataset.

Topic       Truthful    Deceptive
culture     20          24
politics    85          78
society     86          118
sports      22          29
world       41          5

Table 2: Distribution of articles per topic.

[4] https://en.wikipedia.org/wiki/Katharevousa
[5] All the features are described in https://gitlab.isl.ics.forth.gr/papanton/elaprilfoolcorpus

Further, deceptive texts seem to be related with an increased use of adverbs in both datasets. This can be related to the lower concreteness of deceptive texts as discussed in Kleinberg et al. (2019) and it is in line with many theories of deception like Reality Monitoring (Johnson et al., 1998), Criteria based Content Analysis (Undeutsch, 1989) and the Verifiability Approach (Nahari et al., 2014). This also explains the prevalence of the number of named entities, spatial related words, conjunctions and WDAL imagery score in truthful texts in the enAFD dataset and the use of more motion verbs in deceptive texts in the elAFD dataset. According to cognitive load theory (Sweller, 2011), in deceptive texts the language is less specific and consists of simpler constructs. The same holds for modality, another common feature among the datasets, that is considered a signal of subjectivity that provides a degree of uncertainty. In addition, hedges in the enAFD dataset also express some feeling of doubt or hesitancy.

Lexical diversity as expressed by the type-token ratio (TTR), that is the ratio of unique words to the total number of tokens, is related to the deceptive texts. This seems to contradict all the above, but could be attributed to the fact that deceptive texts are shorter. Although this is more evident in the case of the enAFD dataset, it also holds for the elAFD dataset (see Table 1).

Boosters, which are words that express confidence (e.g., certainly), are quite discriminative for deceptive texts for the enAFD dataset. Moreover we observe the connection of the future tense with deception and of the past with truth. The above were also marked in Papantoniou et al. (2021) in a different domain from the news articles domain.

Finally, first person pronouns have been found to be rather discriminative of deceptive texts in various deception detection and cultural studies, including Papantoniou et al. (2021). However, in this study pronouns are statistically significant only for the enAFD dataset. This probably reflects idiosyncrasies of the news domain, since articles mainly present facts objectively and not opinions, and as a result the use of first person pronouns is avoided. This holds for the elAFD dataset that includes AFD articles from the news sites and the press, and not for the enAFD dataset that consists of various types of AFD articles and stories collected from the web through crowdsourcing[6].

[6] https://aprilfoolsdayontheweb.com/2004.html

Deceptive                   Truthful
                      elAFD
adverbs (0.31)              punctuation (-0.17)
adj. & adv. (0.27)          nrc sadness (-0.17)
TTR (0.27)                  plosives (-0.16)
pos. sentiment (0.21)       nrc anger (-0.15)
modal verbs (0.17)          nrc fear (-0.14)
motion verbs (0.117)        vowels (-0.14)
                            consonants (-0.14)
                      enAFD
boosters (0.39)             NE num. (-0.27)
modal verbs (0.35)          spatial num. (-0.26)
TTR (0.31)                  conjunctions (-0.24)
future (0.27)               nrc fear (-0.23)
adverbs (0.2)               past (-0.23)
1st pers. pp (0.2)          nrc sadness (-0.23)
mpqa pos. (0.2)             nrc anger (-0.21)
nrc neg.* (-0.2)            nrc trust (-0.21)
2nd pers. pp (0.19)         avg. word len. (-0.17)
1st pers. pp pl. (0.18)     collectivism (-0.16)
sentiwordnet pos. (0.17)    nrc pos.* (-0.16)
demonstrative (0.17)        wdal imagery (-0.15)
hedges (0.17)               mpqa neg. (-0.14)
adj & adv (0.16)            nasals (-0.14)
present (0.15)              fbs neg. (-0.14)
vader sentiment (0.14)      consonants (-0.13)
verb num. (0.14)            anew arousal (-0.13)
pers. pron. (0.12)          prepositions (-0.12)
total pronouns (0.11)       fricatives (-0.11)
                            3rd per. pp sg. (-0.11)
                            avg. preverb len. (-0.11)
                            nrc disgust (-0.1)

Table 3: The statistically significant features (p<0.1) with at least a small effect size (r>0.1) for the elAFD and enAFD datasets. The features are in ascending p value order. We also report the effect size. Features with moderate effect size (r>0.3) are bold, while common features between the datasets are underlined. pp denotes personal pronouns.

5    Classification

We evaluated the predictive performance of different feature sets and approaches for AFD datasets, including logistic regression experiments[7] and fine-tuned monolingual BERT models for each language[8] (Devlin et al., 2019; Koutsikakis et al., 2020). We also performed cross lingual experiments by exploiting the multilingual BERT model (mBERT) to examine if there are similarities among AFD datasets captured by BERT.

[7] We employ the Weka API (Hall et al., 2009)
[8] We used tensorflow 2.2.0, keras 2.3.1, and the bert-for-tf2 0.14.4 implementation of google-research/bert, over an AMD Radeon VII card and the ROCm 3.7 platform.

A stratified split of the datasets was used to create training, testing, and validation subsets with a 70-20-10 ratio. For the cross lingual experiment we trained and validated a model over 80% and 20% of a language specific dataset respectively, and then tested the performance of the model over the other dataset. We report the results on test sets, while validation subsets were used for fine-tuning the hyper-parameters of the algorithms. For the logistic regression, the parameters tuned through brute force were: a) Weka algorithm (SimpLog|Log: simple logistic (Landwehr et al., 2005) or logistic (Le Cessie and Van Houwelingen, 1992)), b) all n-grams of size in [a, b], with a ≤ b and a, b ∈ [1, 3] ((a, b)), c) stemming (stem), d) attribute selection (attrsel) (applicable only to the Log algorithm since it is the default for SimpLog), e) stopwords removal (stop), and f) lowercase conversion (lowercase). For the BERT experiments, the hyperparameters were tuned by randomly sampling 60 combinations of values, keeping the combination that gave the minimum validation loss. Early stopping with patience 4 was used and the maximum number of epochs was set to 20. The tuned hyperparameters were: learning rate, batch size, dropout rate, max token length, and randomness seeds.

In all cases, we report Recall (R), Precision (P), F-measure (F), Accuracy (A) and AUC (A′). Since the datasets are balanced the majority baseline is 50%. The input for the models consists of the concatenation of the title, the subtitle, the body of the articles and the caption text. Since titles are important for deception detection (Horne and Adali, 2017) and BERT processes texts of up to 512 wordpieces, we placed the title first.

5.1   Logistic Regression Experiments

The examined feature sets were: a) the features presented in section 4 (ling), b) n-gram features, i.e., phoneme-gram (ph-gram), character-gram (char-gram), word-gram (w-gram), POS-gram (pos-gram), and syntactic-gram (sn-gram) (the latter for the enAFD only), and c) the linguistic+ model that represents the best model that combines the linguistic features with any of the n-gram features. The results are presented in Tables 4 and 5. With * we mark the setups with a statistically significant difference to the best setup regarding accuracy, based on a two-proportion z-test (1-tailed) with a 99% CI. We observe that the combination of linguistic features with uni/bi/tri-grams for the elAFD dataset and the unigrams for the enAFD are the best setups. For the enAFD dataset, the second best model is the combination of linguistic features with trigrams. SimpLog seems to perform better, while stemming, lowercase conversion and stopwords removal are generally beneficial.

Best setup                                  R    P    F    A′   A
ling.SimpLog                                62   76   68   82   71
ph-gram(1,2),attrsel,Log *                  70   67   68   77   68
char-gram(3,3),SimpLog *                    72   68   70   76   69
w-gram(1,2),SimpLog                         68   73   71   80   72
pos-gram(2,3),SimpLog *                     72   65   68   75   67
ling.+word,(1,3),stop,lowercase,SimpLog     74   79   76   85   77

Table 4: Logistic regression results for elAFD.

Best setup                                  R    P    F    A′   A
ling.Log *                                  66   80   72   87   75
ph-gram(1,1),SimpLog                        80   77   78   84   78
char-gram(1,3),attrsel,Log *                76   72   74   80   73
w-gram(1,1),stem,SimpLog                    79   81   80   87   80
pos-gram(3,3),SimpLog *                     71   69   70   76   69
sn-gram(2,2),SimpLog *                      80   68   73   77   71
ling.+word,(3,3),stop,lowercase,SimpLog     74   80   77   87   78

Table 5: Logistic regression results for enAFD.

5.2   BERT Experiments

In these experiments, we fine-tuned BERT by adding a task-specific linear classification layer on top, using the sigmoid activation function. We also combined BERT with linguistic features by concatenating the embedding of the [CLS] token with the linguistic features, and passing the resulting vector to the task-specific classifier (with a slightly modified architecture). The results of the experiments are presented in Table 6. Although BERT outperformed the logistic regression experiments in both datasets, the differences are not statistically significant. In addition, the combination with linguistic features is not beneficial. Multilingual BERT models perform worse, especially for Greek. In the cross lingual experiments the classifiers' performance is limited to about 60% accuracy in both experiments, showcasing that the BERT layers are not able to capture language agnostic information from our datasets.

Model           R    P    F    A′   A
elbert          85   70   77   79   79
elbert+ling     68   83   75   77   77
elmbert         16   57   25   52   52
elmbert+ling    62   78   69   72   72
enbert          79   86   82   83   83
enbert+ling     69   87   77   79   79
enmbert         37   97   54   68   68
enmbert+ling    50   95   66   74   74
en→el mbert     31   73   44   60   60
el→en mbert     22   84   35   59   59

Table 6: BERT models evaluation results.

6    Conclusion and Future Work

We introduced a new dataset with AFD news articles in Greek and analyzed and compared its deception cues with another English one. The results
showcased the use of emotional language, especially of positive sentiment, in
deceptive articles, which is even more prevalent in the individualistic English
dataset. Further, deceptive articles use less concrete language, as manifested
by the increased use of adverbs, hedges, and boosters and the reduced use of
named entities, spatial-related words, and conjunctions compared to the
truthful ones. The future and past tenses were correlated with deceptive and
truthful articles, respectively. All of the above mainly aligns with previous
work (Papantoniou et al., 2021), except for some differences in the usage of
pronouns in the Greek dataset, which are attributed to the idiosyncrasies of
the news domain. The deployed classifiers achieved adequate accuracy, with no
statistically significant differences between the best logistic regression and
the BERT models.
   In the future we aim to create even more crosslingual datasets for deception
detection tasks through crowdsourcing and by employing the Chattack platform
(Smyrnakis et al., 2021).

Acknowledgement

This work has received funding from the Hellenic Foundation for Research and
Innovation (H.F.R.I.) under the “1st Call for H.F.R.I. Research Projects to
support Faculty Members & Researchers and the procurement of high-cost research
equipment grant” (Project Number: 4195).

References

Edward Dearden and Alistair Baron. 2019. Fool’s Errand: Looking at April Fools
  Hoaxes as Disinformation through the Lens of Deception and Humour. April.
  20th International Conference on Computational Linguistics and Intelligent
  Text Processing, CICLing 2019; Conference date: 07-04-2019 through
  13-04-2019.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT:
  Pre-training of Deep Bidirectional Transformers for Language Understanding.
  In Proceedings of the 2019 Conference of the North American Chapter of the
  Association for Computational Linguistics: Human Language Technologies,
  Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota,
  June. Association for Computational Linguistics.

Eileen Fitzpatrick and Joan Bachenko. 2012. Building a Data Collection for
  Deception Research. In Proceedings of the Workshop on Computational
  Approaches to Deception Detection, pages 31–38, Avignon, France, April.
  Association for Computational Linguistics.

Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann,
  and Ian H. Witten. 2009. The WEKA data mining software: an update. SIGKDD
  Explorations, 11(1):10–18.

Valerie Hauch, Iris Blandón-Gitlin, Jaume Masip, and Siegfried L. Sporer. 2015.
  Are Computers Effective Lie Detectors? A Meta-Analysis of Linguistic Cues to
  Deception. Personality and Social Psychology Review, 19(4):307–342. PMID:
  25387767.

Geert Hofstede. 1980. Culture’s consequences: International differences in
  work-related values. Sage Publications.

Benjamin D. Horne and Sibel Adali. 2017. This Just In: Fake News Packs a Lot in
  Title, Uses Simpler, Repetitive Content in Text Body, More Similar to Satire
  than Real News. ArXiv, abs/1703.09398.

Marcia K. Johnson, Julie G. Bush, and Karen J. Mitchell. 1998. Interpersonal
  Reality Monitoring: Judging the Sources of Other People’s Memories. Social
  Cognition, 16(2):199–224.

Bennett Kleinberg, Isabelle van der Vegt, Arnoud Arntz, and Bruno Verschuere.
  2019. Detecting deceptive communication through linguistic concreteness, Mar.

Elena Kochkina, Maria Liakata, and Arkaitz Zubiaga. 2018. PHEME dataset for
  Rumour Detection and Veracity Classification.

Dimitra Koutsantoni. 2005. Greek Cultural Characteristics and Academic Writing.
  Journal of Modern Greek Studies, 23:97–138, 05.

John Koutsikakis, Ilias Chalkidis, Prodromos Malakasiotis, and Ion
  Androutsopoulos. 2020. GREEK-BERT: The Greeks Visiting Sesame Street. In
  11th Hellenic Conference on Artificial Intelligence, SETN 2020, pages
  110–117, New York, NY, USA. Association for Computing Machinery.

Niels Landwehr, Mark Hall, and Eibe Frank. 2005. Logistic Model Trees. Machine
  Learning, 59(1):161–205, May.

S. Le Cessie and J.C. Van Houwelingen. 1992. Ridge Estimators in Logistic
  Regression. Applied Statistics, 41(1):191–201.

Jaume Masip, Siegfried L. Sporer, Eugenio Garrido, and Carmen Herrero. 2005.
  The detection of deception with the reality monitoring approach: a review of
  the empirical evidence. Psychology, Crime & Law, 11(1):99–122.
Galit Nahari, Aldert Vrij, and Ronald P. Fisher. 2014. The Verifiability
  Approach: Countermeasures Facilitate its Ability to Discriminate Between
  Truths and Lies. Applied Cognitive Psychology, 28(1):122–128.

Matthew L. Newman, James W. Pennebaker, Diane S. Berry, and Jane M. Richards.
  2003. Lying Words: Predicting Deception from Linguistic Styles. Personality
  and Social Psychology Bulletin, 29(5):665–675. PMID: 15272998.

Myle Ott, Yejin Choi, Claire Cardie, and Jeffrey T. Hancock. 2011. Finding
  Deceptive Opinion Spam by Any Stretch of the Imagination. In Proceedings of
  the 49th Annual Meeting of the Association for Computational Linguistics:
  Human Language Technologies - Volume 1, HLT ’11, pages 309–319, Stroudsburg,
  PA, USA. Association for Computational Linguistics.

Katerina Papantoniou, Panagiotis Papadakos, Theodore Patkos, Giorgos Flouris,
  Ion Androutsopoulos, and Dimitris Plexousakis. 2021. Deception detection in
  text and its relation to the cultural dimension of
  individualism/collectivism. Natural Language Engineering. Also appeared as
  an arXiv preprint arXiv:2105.12530.

Verónica Pérez-Rosas, Bennett Kleinberg, Alexandra Lefevre, and Rada Mihalcea.
  2018. Automatic Detection of Fake News. In Proceedings of the 27th
  International Conference on Computational Linguistics, pages 3391–3401,
  Santa Fe, New Mexico, USA, August. Association for Computational Linguistics.

Denis Peskov, Benny Cheng, Ahmed Elgohary, Joe Barrow, Cristian
  Danescu-Niculescu-Mizil, and Jordan Boyd-Graber. 2020. It Takes Two to Lie:
  One to Lie, and One to Listen. In Proceedings of the 58th Annual Meeting of
  the Association for Computational Linguistics, pages 3811–3854, Online, July.
  Association for Computational Linguistics.

Danae Pla Karidi, Harry Nakos, and Yannis Stavrakas. 2019. Automatic Ground
  Truth Dataset Creation for Fake News Detection in Social Media. In Hujun Yin,
  David Camacho, Peter Tino, Antonio J. Tallón-Ballesteros, Ronaldo Menezes,
  and Richard Allmendinger, editors, Intelligent Data Engineering and Automated
  Learning – IDEAL 2019, pages 424–436, Cham. Springer International
  Publishing.

Stephen Skalicky, Nicholas Duran, and Scott A Crossley. 2020. Please, Please,
  Just Tell Me: The Linguistic Features of Humorous Deception. Dialogue &
  Discourse, 11(2):128–149, December.

Emmanouil Smyrnakis, Katerina Papantoniou, Panagiotis Papadakos, and Yannis
  Tzitzikas. 2021. Chattack: A Gamified Crowd-sourcing Platform for Tagging
  Deceptive & Abusive Behaviour. In European Conference on Information
  Retrieval, pages 549–553. Springer.

Felix Soldner, Verónica Pérez-Rosas, and Rada Mihalcea. 2019. Box of Lies:
  Multimodal Deception Detection in Dialogues. In Proceedings of the 2019
  Conference of the North American Chapter of the Association for Computational
  Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers),
  pages 1768–1777, Minneapolis, Minnesota, June. Association for Computational
  Linguistics.

John Sweller. 2011. Chapter Two - Cognitive Load Theory. volume 55 of
  Psychology of Learning and Motivation, pages 37–76. Academic Press.

Paul J. Taylor, Samuel Larner, Stacey M. Conchie, and Tarek Menacere. 2017.
  Culture moderates changes in linguistic self-presentation and detail
  provision when deceiving others. Royal Society Open Science, 4(6):170128,
  June.

Harry C. Triandis and Vasso Vassiliou. 1972. Interpersonal influence and
  employee selection in two cultures. Journal of Applied Psychology,
  56:140–145.

Udo Undeutsch. 1989. The Development of Statement Reality Analysis, pages
  101–119. Springer Netherlands, Dordrecht.

William Yang Wang. 2017. "Liar, Liar Pants on Fire": A New Benchmark Dataset
  for Fake News Detection. In Regina Barzilay and Min-Yen Kan, editors,
  Proceedings of the 55th Annual Meeting of the Association for Computational
  Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 2: Short
  Papers, pages 422–426. Association for Computational Linguistics.