<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>UPV-UMA at CheckThat! Lab: Verifying Arabic Claims using a Cross Lingual Approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bilal Ghanem</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Goran Glavas</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anastasia Giachanou</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Simone Paolo Ponzetto</string-name>
          <email>simoneg@informatik.uni-mannheim.de</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paolo Rosso</string-name>
          <email>prosso@dsic.upv.es</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francisco Rangel</string-name>
          <email>francisco.rangel@autoritas.es</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Autoritas Consulting</institution>
          ,
          <addr-line>Valencia</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>PRHLT Research Center, Universitat Politecnica de Valencia</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Mannheim</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper we present our team participation at the CheckThat! 2019 lab - Task 2 on Arabic claim verification. We propose a cross-lingual approach to detect the factuality of claims using three main steps: evidence retrieval, evidence ranking, and textual entailment. Our approach achieves the best performance in subtask-D, with an F1 of 0.62. Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CLEF 2019, 9-12 September 2019, Lugano, Switzerland.</p>
      </abstract>
      <kwd-group>
        <kwd>Claims Factuality</kwd>
        <kwd>Evidence Retrieval</kwd>
        <kwd>Cross-Lingual Word Embeddings</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Rumours in news media and political debates may shape people's beliefs. Public
opinion can be easily manipulated, and this can sometimes lead to severe
consequences, harming individuals, religions, and other victims. For
example, in 2016 a man opened fire on a Washington pizzeria because of a fake
claim that the pizzeria was housing young children as sex slaves as
part of a child abuse ring led by the presidential candidate Hillary Clinton [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
The spread of these claims is rapid and uncontrolled, which makes their
verification hard and time consuming. Thus, automated methods have been
proposed to facilitate the verification process.
      </p>
      <p>
        The Arabic language has a large number of speakers around the world.
However, since Arabic has a limited number of Natural Language Processing
(NLP) resources, there is a widening gap between
this language and other languages regarding the availability of NLP systems.
Recently, there have been various research attempts on Arabic NLP tasks,
such as fact checking [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], author profiling [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], and irony detection [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        In this paper, we present our participation in the CheckThat! Lab - Task 2 [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
for detecting the factuality of Arabic claims in general news topics. Our approach
infers the veracity using a Natural Language Inference (NLI) system trained
on English to predict whether an Arabic pair of sentences entail each other.
To do that, we use cross-lingual embeddings.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Previous works on claims' factuality can be roughly split into two main
approaches: external sources-based, and context-based. The external-sources-based
approaches pass a claim to external search engines (e.g., Google, Bing), and then
they build various features from the results. Ghanem et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] proposed to pass
the claims to the Google and Bing search engines in order to retrieve evidence, and
then they extracted features such as the similarity between the claims and the snippets,
as well as the Alexa rank of the retrieved links. Finally, the authors used these
features to train a Random Forest classifier. A similar approach was proposed
by Karadzhov et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] who computed the cosine similarity between the claim
and the top N results and fed these similarities into a Long Short-Term Memory
(LSTM) network.
      </p>
      <p>
        On the other hand, the context-based approaches use a different way of
inferring the factuality. Castillo et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] used text characteristics, user-based,
topic-based, and tweet propagation-based features. Similarly, Mukherjee and
Weikum [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] proposed a continuous conditional random field model that
exploits several signals of interaction between a set of features (e.g., language of
the news, source trustworthiness, and users' confidence).
      </p>
    </sec>
    <sec id="sec-3">
      <title>Task Description</title>
      <p>Given a set of Arabic claims with their relevant documents (web pages), the
goal of the task is to predict the factuality of these claims using the provided
web pages. Task 2 has 4 different sub-tasks, but we decided to participate in
two of them, namely tasks B and D. Task B aims to predict how useful a
web page is with respect to a claim, and the target labels are: very useful for
verification, useful for verification, not useful, or not relevant. Task D aims to
find the claim's factuality (True or False). This task is organized in 2 cycles:
in cycle 1 the factuality should be estimated using the provided unlabeled web
pages, whereas in cycle 2 it is estimated using the useful web pages (the very useful
and useful labels). The organizers provided the web pages in a real scenario, where
the participants had to retrieve the evidence and then compare it to the claim.</p>
      <p>Regarding the task data, the organizers provided 10 Arabic claims with their
corresponding web pages, with between 26 and 50 web page results
for each claim. These web pages were provided in their original form (HTML
format). For the test set, the organizers provided 59 claims to be verified.</p>
      <sec id="sec-3-1">
        <title>Footnotes</title>
        <p>4. https://www.alexa.com/siteinfo</p>
        <p>5. https://sites.google.com/view/clef2019-checkthat/task-2-evidence-factuality</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Proposed Approach</title>
      <p>We propose an approach that consists of the following three main steps: evidence
retrieval, evidence ranking, and textual entailment. Figure 1 shows a schematic
overview of our approach.</p>
      <p>
        Evidence Retrieval: In the first step, we read the content of the articles and
then we split them into sentences using comma (,) and dot (.) as delimiters,
following previous work [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. To obtain the best recall, we retrieve
the top N sentences most similar to the claim using cosine similarity over character
n-grams. We use n-grams of length 5 and 6, chosen experimentally. In
addition, we tried to retrieve the most similar sentences using Named Entities
(NEs), but we found that there are some sentences without named entities, like:
      <p>Translation: Cold drinks reduce colds and their symptoms</p>
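The retrieval step described above can be sketched as follows. This is an illustrative re-implementation (the function and variable names are ours, not from the paper): sentences are split on dots and commas, very short ones are dropped, and the rest are ranked by cosine similarity over character 5- and 6-grams.

```python
from collections import Counter
import math

def char_ngrams(text, ns=(5, 6)):
    """Character n-gram counts (lengths 5 and 6, as in the paper)."""
    grams = Counter()
    for n in ns:
        for i in range(len(text) - n + 1):
            grams[text[i:i + n]] += 1
    return grams

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_evidence(claim, document, top_n=20, min_chars=35):
    """Split on '.' and ',', drop very short sentences, rank by similarity."""
    sentences = [s.strip() for part in document.split('.')
                 for s in part.split(',')]
    sentences = [s for s in sentences if len(s) >= min_chars]
    claim_grams = char_ngrams(claim)
    scored = [(cosine(claim_grams, char_ngrams(s)), s) for s in sentences]
    scored.sort(key=lambda x: -x[0])
    return scored[:top_n]
```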
      <p>In this step, we discard very short sentences (fewer than 35 characters; such
sentences appear when a dot and a comma occur close together). Finally, we pass
the top 20 sentences to the next step.</p>
      <p>
        Evidence Ranking: For this step, we rank the top 20 sentences using word
embeddings. For each claim-evidence pair, we measure their similarity and we
rank the evidence based on the similarity values. For the word embeddings, we
use the Arabic fastText pretrained model (footnote 7). We explore the following
three different similarity techniques:
1. Cosine over embeddings: We calculate the average of the word embeddings
of each sentence, and compute the cosine similarity.
2. Cosine over weighted embeddings: We calculate the average of the words'
embeddings weighted by the Term Frequency-Inverse Document Frequency
(TF-IDF) weighting scheme, and then we compute the cosine similarity on
the two weighted sentence vectors. We compute the TF-IDF weights using
the Comparable Wikipedia Corpus [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
3. DynaMax: An unsupervised and non-parametric similarity measure
based on fuzzy set theory that dynamically extracts good features from the
word embeddings depending on the sentence pair [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
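The second technique (TF-IDF-weighted embedding cosine) can be sketched as below. This is a minimal illustration with toy embedding and IDF dictionaries standing in for the fastText vectors and Wikipedia-derived weights; all names are ours.

```python
import numpy as np

def sentence_vector(tokens, embeddings, idf):
    """TF-IDF-weighted average of word vectors (technique 2 above)."""
    dim = len(next(iter(embeddings.values())))
    vec, total_weight = np.zeros(dim), 0.0
    tf = {t: tokens.count(t) / len(tokens) for t in tokens}
    for t in tokens:
        if t in embeddings:
            w = tf[t] * idf.get(t, 1.0)   # TF-IDF weight of this token
            vec += w * np.asarray(embeddings[t])
            total_weight += w
    return vec / total_weight if total_weight else vec

def weighted_cosine(s1, s2, embeddings, idf):
    """Cosine similarity between two TF-IDF-weighted sentence vectors."""
    v1 = sentence_vector(s1, embeddings, idf)
    v2 = sentence_vector(s2, embeddings, idf)
    denom = np.linalg.norm(v1) * np.linalg.norm(v2)
    return float(v1 @ v2 / denom) if denom else 0.0
```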
      <p>Since the training dataset is very small, it was not possible to find the best
similarity technique statistically. Thus, we decided to manually investigate the
ranked sentences, and we found that DynaMax places the most semantically
similar evidence sentences at the top ranks.</p>
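The DynaMax-Jaccard measure of [19] can be sketched as follows: the word vectors of both sentences form a dynamic feature universe, each sentence gets a fuzzy membership vector over that universe via max-pooled (clipped) dot products, and the score is the fuzzy Jaccard ratio. This is our reading of the technique, not the authors' code.

```python
import numpy as np

def dynamax_jaccard(x, y):
    """DynaMax-Jaccard similarity.
    x, y: arrays of word vectors, shapes (n, d) and (m, d)."""
    u = np.vstack([x, y])                      # dynamic feature universe
    mx = np.maximum(x @ u.T, 0).max(axis=0)    # fuzzy membership, sentence 1
    my = np.maximum(y @ u.T, 0).max(axis=0)    # fuzzy membership, sentence 2
    num = np.minimum(mx, my).sum()             # fuzzy intersection
    den = np.maximum(mx, my).sum()             # fuzzy union
    return float(num / den) if den else 0.0
```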
      <p>
        Textual Entailment: For this step, we train a system on par with
state-of-the-art results on the NLI task, namely the Enhanced Sequential Inference
Model (ESIM) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. We follow the implementation details of [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. We train the
ESIM on a large NLI corpus for English, namely MultiNLI [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. Since the claims'
language is Arabic, we first project the Arabic word embeddings into the vector
space of the English word embeddings (footnote 8) we used during the training of the ESIM
model. To this end, we learn a linear projection matrix by solving the
Procrustes problem [
        <xref ref-type="bibr" rid="ref17 ref6">17,6</xref>
        ] using 5K automatically obtained English-Arabic word
translations as supervision (footnote 9). To evaluate the performance of our model, we use
the multilingual XNLI corpus [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], created by translating the development and test sets
of the MultiNLI corpus. Our cross-lingually transferred ESIM system achieved
58% accuracy on the Arabic test set of the XNLI corpus.
      </p>
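The orthogonal Procrustes projection used to align the Arabic embeddings with the English space admits a closed-form SVD solution; a minimal sketch (our naming, with the translation-pair matrices as input) looks like this:

```python
import numpy as np

def learn_projection(src, tgt):
    """Solve the orthogonal Procrustes problem: find the orthogonal W
    minimizing ||src @ W - tgt||_F via SVD.
    src, tgt: (n, d) matrices of aligned translation-pair embeddings."""
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

def project(embedding, w):
    """Map a source-language (Arabic) vector into the target (English) space."""
    return embedding @ w
```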
      <p>In this step of our approach, we receive a claim with its 20 ranked sentences
from the Evidence Ranking step. We feed the claim with each ranked sentence
to the ESIM model and we estimate their prediction probabilities with respect
to the Entailment, Neutral, and Contradiction labels. Since each claim is represented by
20 predictions, we weight the predictions in one of two ways:
1. Similarity Weighting: We weight the predictions by the evidence ranking
similarity values. Given the prediction probability P of one of the classes c,
we weight it as: P_c = sum_{i=0}^{19} P_{c,i} * SentenceSimilarity_i.
2. Majority Class: Given the NLI predictions P for each claim, we extract
the majority class by counting, over the 20 sentences, how often each class is
the argmax of P.</p>
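The two weighting schemes can be sketched as follows (an illustrative re-implementation; the ESIM probabilities and similarity values are stand-ins, and the final True/False rule compares the entailment and contradiction scores as described later in this section):

```python
from collections import Counter
import numpy as np

# Label order assumed throughout: (entailment, neutral, contradiction)

def similarity_weighting(probs, sims):
    """Weight each sentence's class probabilities by its ranking similarity:
    P_c = sum_i P_{c,i} * SentenceSimilarity_i.
    probs: (n, 3) ESIM softmax outputs; sims: (n,) similarity values."""
    probs, sims = np.asarray(probs), np.asarray(sims)
    return (probs * sims[:, None]).sum(axis=0)

def majority_class(probs):
    """Count how often each label is the argmax over the n sentences."""
    counts = Counter(int(np.argmax(p)) for p in probs)
    return np.array([counts.get(i, 0) for i in range(3)])

def verdict(scores):
    """2-class output: True iff the entailment score exceeds contradiction."""
    return scores[0] > scores[2]
```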
      <sec id="sec-4-1">
        <title>Footnotes</title>
        <p>7. https://fasttext.cc/docs/en/crawl-vectors.html</p>
        <p>8. We used English fastText embeddings: https://github.com/facebookresearch/fastText</p>
        <p>9. The 5K words were obtained by translating the most frequent words appearing in an
English Wikipedia corpus using Google Translate.</p>
        <p>Finally, after weighting the predictions for each claim, we infer the final
2-classes prediction (True, False) from the 3 NLI classes using the following
rule: f(P_entailment, P_contradiction) = True if P_entailment &gt; P_contradiction,
and False otherwise.</p>
        <p>For the Majority Class weighting method, P_entailment and P_contradiction
of a claim are represented by the count frequency of the class instead of its
probability.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Experiments and Results</title>
      <p>Task2 subtask-B: In this subtask, we use the first two steps of our approach to
submit a run. In the first step, we retrieve the sentences from the web pages using
character n-grams. Here, we retrieve all the sentences with a cosine similarity
value greater than 0. Then, we pass them to the next step, where we rank them
based on the word embeddings. At this step, we discard the ranks and we only
average the sentences' similarity values for each web page (WP_avg). Then, with a
rule-based method f(WP_avg), we map the web pages' averaged values into the 4
classes using experimentally set thresholds (among them 0.45): very useful for the
highest values of WP_avg, useful and not useful for intermediate ranges, and not
relevant for the lowest.</p>
      <p>In cases where we do not get any sentence from the retrieval process, we set
WP_avg to -1. The thresholds are set experimentally. Table 1 presents the results
of subtask-B, for both the 2-classes and 4-classes prediction. Our submission for
the 2-classes prediction obtains the best performance, though still lower than the
baseline provided by the organizers. For the 4-classes prediction, we obtain a
lower overall rank, below the baseline as well.</p>
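The rule-based mapping can be sketched as below. The threshold values here are illustrative placeholders of our own (the paper only states that they were set experimentally, with 0.45 among them), and the function name is ours:

```python
def classify_page(wp_avg):
    """Map a page's averaged sentence similarity to the four labels.
    Thresholds are illustrative placeholders, not the paper's exact values."""
    if wp_avg == -1:          # no sentence retrieved for this page
        return "not relevant"
    if wp_avg > 0.6:          # placeholder upper threshold
        return "very useful"
    if wp_avg > 0.45:         # the one threshold stated in the paper
        return "useful"
    return "not useful"
```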
      <p>Task2 subtask-D: For subtask-D, we use our three-steps approach. For each of
the two cycles (see the Task Description section) we submit two runs, one using
the Similarity Weighting and the other using the Majority Class. (We submitted
our runs for cycle-1 late, so the organizers considered them as submissions for
cycle-2.)</p>
      <p>Table 2 presents the results on the test set for subtask-D. Considering
the second cycle submissions' results, since they are less biased, we observe that
the similarity value weighting clearly performs better than the majority class
method. We obtain the best performing runs in both cycles, higher than the
baselines by 0.25 F1 on average.
In our experiments, we feed the first 20 sentences to the ESIM
model. In Figure 2, we investigate the effect of varying the number of sentences
considered for each claim on the test set. We use the second cycle (given the
labeled web pages) in this experiment.</p>
      <p>Understanding the causes of errors of our approach is important for future
improvements. We manually examined the predictions to understand the causes
of errors. We categorize them into the following cases:
1. Un-famous news: Some of the truthful claims were not covered by many
news sites. We found that our approach retrieved only a few correct evidence
sentences (two or three), while the rest of the evidence describes things related
to the main entity but not the same claim issue. Since in our approach we
use the first 20 evidence sentences to infer the factuality, the first 3 similar
evidence sentences, for example, voted positively for the factuality of the claim,
while the remaining 17 voted negatively. This kind of error can be resolved by
using a dynamic number of evidence sentences for each claim instead of a fixed one.
2. The spread of false rumors: The spread of rumors over the web can
mislead people. Since our approach is based on retrieving the claim's evidence
from the web, the existence of these false rumors can consequently mislead
our system. As an example, given the following false claim:
Fig. 2: The performance of our approach on the test set using (a) the Similarity
Value weighting and (b) Majority Class, varying the number of evidence
sentences.
Translation: Rifaat al-Assad, the uncle of Bashar al-Assad, died in a
hospital in Paris
Our approach retrieved the following evidence, which supports the claim:
Translation: News about the death of the butcher of Hama and Palmyra
prisons, Rifaat al-Assad, in a Paris hospital
This evidence was retrieved from a Twitter post. Considering only news
agencies as sources of news, where random users are not allowed to post news,
could prevent these errors.
3. Inaccurate sentence segmentation: The Arabic language has a
complicated sentence structure, where using dots to split a document into sentences
is an inaccurate step. Following previous work on Arabic, we used dot (.)
and comma (,) to split the evidence documents into sentences. We found
that in some cases, the important evidence sentence in a document has a
comma between the object and predicate. As an example:
Translation: Egypt executed 15 militants convicted of attacks that resulted
in the deaths of a number of military and police men in the Sinai Peninsula
The evidence in a document was presented as follows:</p>
      <p>[Arabic evidence sentence]
Translation: The Prison Service carried out a fourth death sentence against 15
accused, (COMMA) for killing officers and soldiers of the armed forces in
northern Sinai
The comma between the sentence's parts split the evidence, making it
unsupportive of the claim.
4. Weak ESIM predictions: We found some claims whose evidence was
retrieved correctly but the ESIM model was unable to verify them. We argue
that this kind of error is due to the aligned cross-lingual embeddings.</p>
    </sec>
    <sec id="sec-6">
      <title>Conclusion and Future Work</title>
      <p>In this paper, we presented our participation in the CheckThat! lab - Task 2 at
CLEF-2019. We presented an approach that consists of 3 main steps for Arabic
claims verification. Our proposed approach managed to achieve a good
performance. Also, the error analysis showed that our cross-lingual
model is solid, since the majority of errors were due to the other causes discussed above.
As future work, we plan to address the error cases we identified for
more effective retrieval, ranking, and prediction.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>The work of Paolo Rosso and Francisco Rangel was made possible by NPRP grant
9-175-1-033 from the Qatar National Research Fund (a member of Qatar
Foundation). The statements made herein are solely the responsibility of the authors.
The work of Paolo Rosso was partially funded by the Spanish MICINN under
the research project MISMIS-FAKEnHATE on Misinformation and
Miscommunication in social media: FAKE news and HATE speech
(PGC2018-096212-B-C31). The work of Goran Glavas was carried out within the scope of the AGREE
project supported by the Eliteprogramm of the Baden-Württemberg Stiftung.
Anastasia Giachanou is supported by the SNSF Early Postdoc Mobility grant
P2TIP2 181441 under the project Early Fake News Detection on Social Media,
Switzerland.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Castillo</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mendoza</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Poblete</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Information Credibility on Twitter</article-title>
          .
          <source>In: Proceedings of the 20th international conference on World Wide Web</source>
          . pp.
          <volume>675</volume>
          -
          <issue>684</issue>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ling</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Inkpen</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Enhanced LSTM for Natural Language Inference</article-title>
          .
          <source>arXiv preprint arXiv:1609.06038</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Conneau</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lample</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rinott</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bowman</surname>
            ,
            <given-names>S.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwenk</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoyanov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>XNLI: Evaluating Cross-lingual Sentence Representations</article-title>
          . arXiv preprint arXiv:
          <year>1809</year>
          .
          <volume>05053</volume>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Elsayed</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nakov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          , Barrón-Cedeño,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Suwaileh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            , Da San Martino, G.,
            <surname>Atanasova</surname>
          </string-name>
          ,
          <string-name>
            <surname>P.</surname>
          </string-name>
          :
          <article-title>Overview of the CLEF-2019 CheckThat!: Automatic Identification and Verification of Claims. In: Experimental IR Meets Multilinguality, Multimodality, and</article-title>
          <string-name>
            <surname>Interaction. LNCS</surname>
          </string-name>
          , Lugano, Switzerland (
          <year>September 2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ghanem</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montes-y Gomez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <string-name>
            <surname>UPV-INAOE-AutoritasCheck</surname>
            <given-names>That</given-names>
          </string-name>
          :
          <article-title>An Approach based on External Sources to Detect Claims Credibility</article-title>
          .
          <source>Proceedings of the International Conference of the Cross-Language Evaluation Forum for European Languages, CLEF '18</source>
          ,
          <string-name>
            <surname>Avignon</surname>
          </string-name>
          , France, September. (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Glavas</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Litschko</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruder</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vulic</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>How to (Properly) Evaluate CrossLingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions</article-title>
          . arXiv preprint arXiv:
          <year>1902</year>
          .
          <volume>00508</volume>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hasanain</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suwaileh</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elsayed</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          Barrón-Cedeño,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <surname>P.</surname>
          </string-name>
          :
          <article-title>Overview of the CLEF-2019 CheckThat! Lab on Automatic Identification and Verification of Claims. Task 2: Evidence and Factuality</article-title>
          .
          <source>CEUR Workshop Proceedings</source>
          , CEURWS.org, Lugano, Switzerland (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Karadzhov</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nakov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marquez</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          Barrón-Cedeño,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Koychev</surname>
          </string-name>
          ,
          <string-name>
            <surname>I.</surname>
          </string-name>
          :
          <source>Fully Automated Fact Checking Using External Sources. arXiv preprint arXiv:1710.00341</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Karoui</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zitoune</surname>
            ,
            <given-names>F.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moriceau</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          :
          <article-title>Soukhria: Towards an Irony Detection System for Arabic in Social Media</article-title>
          .
          <source>Procedia Computer Science</source>
          <volume>117</volume>
          ,
          <issue>161</issue>
          -
          <fpage>168</fpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>Y.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Papineni</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roukos</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Emam</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hassan</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Language Model Based Arabic Word Segmentation</article-title>
          .
          <source>In: Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics - Volume</source>
          <volume>1</volume>
          . pp.
          <fpage>399</fpage>
          -
          <lpage>406</lpage>
          . Association for Computational Linguistics (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Mukherjee</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weikum</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Leveraging Joint Interactions for Credibility Analysis in News Communities</article-title>
          .
          <source>In: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management</source>
          . pp.
          <fpage>353</fpage>
          -
          <lpage>362</lpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Nakov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barrón-Cedeño</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Elsayed</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Suwaileh</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Màrquez</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zaghouani</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Atanasova</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kyuchukov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Da San Martino</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Overview of the CLEF2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims</article-title>
          .
          <source>In: Proceedings of the International Conference of the Cross-Language Evaluation Forum for European Languages</source>
          . pp.
          <fpage>372</fpage>
          -
          <lpage>387</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tschuggnall</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stamatatos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Overview of PAN17</article-title>
          . In:
          <source>International Conference of the Cross-Language Evaluation Forum for European Languages</source>
          . pp.
          <fpage>275</fpage>
          -
          <lpage>290</lpage>
          . Springer (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rangel Pardo</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ghanem</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Charfi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>ARAP: Arabic Author Profiling Project for Cyber-Security</article-title>
          .
          <source>Sociedad Española para el Procesamiento del Lenguaje Natural</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Saad</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alijla</surname>
            ,
            <given-names>B.O.</given-names>
          </string-name>
          :
          <article-title>Wikidocsaligner: An Off-the-Shelf Wikipedia Documents Alignment Tool</article-title>
          . In:
          <source>Proceedings of the 2017 Palestinian International Conference on Information and Communication Technology</source>
          . pp.
          <fpage>34</fpage>
          -
          <lpage>39</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Simpson</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Man Pleads Guilty in Washington Pizzeria Shooting over Fake News</article-title>
          . https://www.reuters.com/article/us-washingtondc-gunman/man-pleads-guilty-in-washington-pizzeria-shooting-over-fake-news-idUSKBN16V1XC (
          <year>2017</year>
          ), [Online; accessed 10-May-2019]
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>S.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Turban</surname>
            ,
            <given-names>D.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hamblin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hammerla</surname>
            ,
            <given-names>N.Y.</given-names>
          </string-name>
          :
          <article-title>Offline Bilingual Word Vectors, Orthogonal Transformations and the Inverted Softmax</article-title>
          .
          <source>In: Proceedings of ICLR</source>
          (
          <year>2017</year>
          ), https://arxiv.org/abs/1702.03859
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nangia</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bowman</surname>
            ,
            <given-names>S.R.</given-names>
          </string-name>
          :
          <article-title>A Broad-coverage Challenge Corpus for Sentence Understanding Through Inference</article-title>
          .
          <source>arXiv preprint arXiv:1704.05426</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Zhelezniak</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Savkov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shen</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moramarco</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Flann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hammerla</surname>
            ,
            <given-names>N.Y.</given-names>
          </string-name>
          :
          <article-title>Don't Settle for Average, Go for the Max: Fuzzy Sets and Max-Pooled Word Vectors</article-title>
          .
          <source>arXiv preprint arXiv:1904.13264</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>