<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>The Copenhagen Team Participation in the Factuality Task of the Competition of Automatic Identification and Verification of Claims in Political Debates of the CLEF-2018 Fact Checking Lab</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dongsheng Wang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jakob Grue Simonsen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Birger Larsen</string-name>
          <email>birger@hum.aau.dk</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christina Lioma</string-name>
          <email>c.lioma@di.ku.dk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Copenhagen (DIKU)</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Given a set of political debate claims that have already been identified as worth checking, we consider the task of automatically checking the factuality of these claims. In particular, given a sentence that is worth checking, the goal is for the system to determine whether the claim is likely to be true, false, half-true, or whether it is unsure of its factuality. We implement a variety of models, including Bayes, SVM and RNN models, to either assist our model step-wise or to act as potential baselines. We then develop additional multi-scale Convolutional Neural Networks (CNNs) with different kernel sizes that learn from external sources whether a claim is true, false, half-true or unsure, as follows: we treat claims as search engine queries and step-wise retrieve the top-N documents from Google using as much of the original claim as possible. We strategically select the most relevant, yet sufficient, documents with respect to the claims, and extract features, such as the title, the total number of results returned, and the snippet, to train the prediction model. We submitted results for the SVM and the CNNs, and the overall performance of our techniques is successful, achieving the overall best performing run in the competition (with the lowest error rate, 0.7050, from our SVM and the highest accuracy, 46.76%, from our CNNs).</p>
      </abstract>
      <kwd-group>
        <kwd>political debates</kwd>
        <kwd>RNN</kwd>
        <kwd>CNN</kwd>
        <kwd>fact checking</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Given a set of political debate claims that have already been identified as
worth checking, we consider the task of automatically checking the factuality of these
claims. In particular, given a sentence that is worth checking, the goal is for the
system to determine whether the claim is likely to be true, half-true, false, or
whether it is unsure of its factuality.</p>
      <p>
        One of the two examples given by the organizers [<xref ref-type="bibr" rid="ref8">8</xref>] is shown in Table 1, where
Hillary Clinton mentions Bill Clinton's work in the 1990s, followed by a claim
made by Donald Trump stating that former president Clinton approved the North
American Free Trade Agreement (NAFTA). This last statement by Trump is
judged to be HALF-TRUE, because it was George H.W. Bush who signed the
approval for NAFTA, but Bill Clinton who signed it into law.
CLINTON: I think my husband did a pretty good job in the 1990s.
      </p>
      <p>CLINTON: I think a lot about what worked and how we can make it work again...
TRUMP: Well, he approved NAFTA... (HALF-TRUE)</p>
      <p>As CLEF provides limited data (only 82 unique claims with labels), while the
task of fact checking relies on labeled data to train prediction models, finding
suitable datasets for training is the first basic step. Furthermore, the task at
hand is more complex than traditional binary prediction (True/False), as graded
truth values must be predicted, including the difficult "Half-True". We take
three main objectives into consideration:
1. Select external claims with labels and a suitable proportion of samples.
2. Retrieve the most relevant, yet sufficient, external sources (documents)
for the claims.
3. Find the best models and parameters and tune them to their best
performance.</p>
      <p>The three objectives are met by proceeding in a stepwise manner. Selecting
external claims of high quality is the basis of the subsequent steps. The multiple
labels and their sample proportions have to be taken into account when
selecting datasets with different labeling. Subsequently, retrieving the most
relevant, yet adequate, documents for these claims is essential for building
the training models. Finally, the selected document features should be fitted to
the different models, whose parameters should be tuned to improve the
final results.</p>
    </sec>
    <sec id="sec-2">
      <title>Approaches Used and Progress Beyond State-of-the-Art</title>
      <p>Our approach is as follows: we use step-wise modeling, from data selection,
preprocessing and retrieval to model training, with the aim of choosing a suitable
proportion of labeled samples and the supporting documents that we are going to
employ. Specifically, we take advantage of a simple Bayesian model to analyze
label impact and data sufficiency in the data processing stage, and of step-wise
search in the supporting-document retrieval stage.</p>
      <p>
        For training the model, we employ CNN and RNN models, as well as an
SVM. RNN models have been employed for similar tasks, e.g., by Ma et al.
[<xref ref-type="bibr" rid="ref6">6</xref>] for detecting rumors in microblogs
by capturing the variation of contextual information of relevant posts over time.
Closer to our aims, Karadzhov et al. [<xref ref-type="bibr" rid="ref4">4</xref>]
investigate a fact checking task, and we implement a similar framework, as shown
in Figure 2, with two simplifications compared to
[<xref ref-type="bibr" rid="ref4">4</xref>]: (a) we use only 5 Google snippets,
whereas the original authors use 4 units consisting of one Google snippet, one
Bing snippet and two triplets of rolling sentences from Google and Bing,
respectively; and (b) we calculate only one similarity, namely pairwise TF-IDF
cosine similarity, whereas [<xref ref-type="bibr" rid="ref4">4</xref>] calculates
the average of cosine over TF-IDF, cosine over embeddings, and containment.
      </p>
      <p>
        CNNs were adopted for sentence-level classification by Kim
[<xref ref-type="bibr" rid="ref5">5</xref>] and have been demonstrated to
improve performance on NLP classification tasks. In our CNN model, shown in
Figure 1 and inspired by [<xref ref-type="bibr" rid="ref10">10</xref>], we employ
multi-scale CNNs with different kernel sizes to overcome the drawback of a simple
convolutional kernel with a fixed window size over the encoded semantics of
documents: it is hard to determine the window size for simple convolutional
kernels, because small windows normally require deeper networks to capture
critical information, while large window sizes result in loss of local information.
Therefore, multi-scale CNNs, together with one additional feature (the total
number of results returned), are designed to represent the comprehensive
contextual information of the text. Specifically, on the first layer we encode the
semantics of the documents of concatenated snippets into low-dimensional vectors
with word2vec [<xref ref-type="bibr" rid="ref7">7</xref>]. On the second layer, we
apply multi-scale CNNs with different kernel sizes over the embedded word vectors.
In our experimental settings, we concatenate four CNNs with kernel sizes 1, 2, 3
and 4, respectively, each followed by a max-pooling layer and dropout. We use a
static word-vector channel. In the CNN representation layer, we add the total
number of results returned by the first search (step-wise search is discussed in
Section 3.2) for each claim as additional information. Because this total return
has a large numeric range, we discretize it into eight equally-sized categories
based on the statistics of the training samples.
      </p>
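      <p>To make this architecture concrete, the sketch below shows one plausible
implementation of the multi-scale branch in Python with Keras. It is a minimal
sketch under assumptions: the filter count, dropout rate, sequence length and the
helper names (build_model, discretize_total_return, NUM_BINS) are illustrative
and not the exact settings of our runs.</p>
      <preformat>
# Sketch of the multi-scale CNN; hyper-parameters are assumptions.
import numpy as np
from tensorflow.keras import Input, Model, layers

SEQ_LEN, EMB_DIM, NUM_BINS, NUM_CLASSES = 200, 300, 8, 3

def build_model(embedding_matrix):
    # Text branch: static word2vec embeddings (our static channel).
    words = Input(shape=(SEQ_LEN,), dtype="int32")
    emb = layers.Embedding(embedding_matrix.shape[0], EMB_DIM,
                           weights=[embedding_matrix], trainable=False)(words)
    # Four parallel convolutions with kernel sizes 1-4, each followed by
    # max-pooling and dropout, then concatenated.
    branches = []
    for k in (1, 2, 3, 4):
        c = layers.Conv1D(filters=100, kernel_size=k, activation="relu")(emb)
        c = layers.GlobalMaxPooling1D()(c)
        c = layers.Dropout(0.5)(c)
        branches.append(c)
    # Side input: the discretized total return of the first search,
    # one-hot encoded into eight bins.
    total_return = Input(shape=(NUM_BINS,), dtype="float32")
    merged = layers.Concatenate()(branches + [total_return])
    out = layers.Dense(NUM_CLASSES, activation="softmax")(merged)
    model = Model(inputs=[words, total_return], outputs=out)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

def discretize_total_return(values, train_values):
    """One-hot encode values into eight equally-sized (quantile) bins."""
    edges = np.quantile(train_values, np.linspace(0, 1, NUM_BINS + 1)[1:-1])
    return np.eye(NUM_BINS)[np.searchsorted(edges, values)]
</preformat>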
      <p>While each step of our approach uses only simple, well-known methods, our
progress beyond state-of-the-art methods consists of the combination of the
following:
1. We use step-wise modeling instead of a single mixed model in the final step,
i.e., we use traditional Bayes models for data preprocessing, including data
selection (label mapping) and external source analysis (sufficiency analysis),
and then build a CNN model based on the resulting conclusions.
2. We employ step-wise searching when retrieving supporting documents, keeping
as much of the original claim as possible while strategically retrieving enough
documents, instead of just using keywords.
3. We employ multi-scale CNNs with multiple kernel sizes, together with the
discretized total return, to represent the contextual information. We assume that
multi-scale CNNs can obtain comprehensive information and that the total
return of a claim represents the intensity of attention it receives, which to some
degree reflects its hidden status.
We describe how we collect claims with labels from Politifact in Section 3.1,
how we retrieve supporting documents for these claims in Section 3.2, and we
give a short description of the word embedding we utilize in Section 3.3.</p>
      <sec id="sec-2-1">
        <title>Claims with labels from Politifact</title>
        <p>Due to the small set of claim samples (a total of 82 unique claims with
labels) provided by the CLEF-2018 Fact Checking Lab, we use Politifact as an external
source of claims and labels. Specifically, we crawl the Politifact Truth-O-Meter
statements from www.politifact.com, which is operated by editors and
reporters from the Tampa Bay Times. For the Truth-O-Meter webpage,
Politifact staffers research statements on U.S. politics and label them as "true",
"mostly true", "half true", "mostly false", "false", or "pants on fire" (the latter for
outrageously false claims). We obtain a total of 4,604 statements/claims from
Politifact, as shown in Table 2.</p>
        <p>The task of the CLEF-2018 Fact Checking Lab requires us to predict claims as
one of three labels: "true", "false" and "half-true". Therefore, we map the
six Politifact labels into three categories and remove some ambiguous labels.
Table 3 shows three examples of label mapping: for Map1, we map all six labels into
three; for Map2, we additionally remove Mostly False; and for Map3, Mostly True
as well. In the experiments (Section 4.1), we compare the performance of each
mapping and select the best one for the training dataset.</p>
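        <p>As an illustration, the following is a minimal sketch of the Map3 variant
in Python; the exact label spellings and the treatment of the dropped classes
follow our reading of Table 3 and should be taken as assumptions.</p>
        <preformat>
# Map3 (assumed spellings): drop the ambiguous "mostly" classes and
# fold "pants on fire" into false.
MAP3 = {
    "true": "true",
    "half-true": "half-true",
    "false": "false",
    "pants-on-fire": "false",  # outrageously false claims count as false
    # "mostly-true" and "mostly-false" are discarded under Map3
}

def map_label(politifact_label):
    """Return the three-way label, or None for claims we discard."""
    return MAP3.get(politifact_label)

claims = [("Statement A", "pants-on-fire"), ("Statement B", "mostly-true")]
mapped = [(text, map_label(label)) for text, label in claims
          if map_label(label) is not None]
</preformat>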
        <p>In addition, we investigate how much overlap Politifact has with the test
set. Checking for exactly identical claims, we find none, i.e., there is no
overlap. Using Levenshtein distance to detect similar claims, we still find no
matching claims when the similarity threshold is set to 0.8, and only three claims
that are, to some degree, similar when the threshold is lowered below
0.8.</p>
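        <p>This check can be sketched as follows, using difflib's SequenceMatcher
ratio as a stand-in for normalized Levenshtein similarity (the implementation we
actually used is not shown here):</p>
        <preformat>
from difflib import SequenceMatcher

def similarity(a, b):
    # Stand-in for normalized Levenshtein similarity in [0, 1].
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def overlapping_claims(politifact_claims, test_claims, threshold=0.8):
    """Return pairs of claims whose similarity reaches the threshold."""
    return [(p, t)
            for p in politifact_claims
            for t in test_claims
            if similarity(p, t) >= threshold]
</preformat>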
        <p>It is noted that, of all the retrieved URLs, there were a total of 1,310 bad
URLs out of 84,451 (a ratio of 1.55%) in our training dataset. For model building
and training on our own dataset, we use the Politifact data without
further processing or filtering. For the test data from CLEF, we used the
URL-filter function provided by CLEF to filter out bad URLs when retrieving
supporting documents from Google, and output the prediction result. According to
our internal testing, performance does not seem to be negatively impacted by
keeping bad URLs in the training dataset, and is sometimes even slightly better
after filtering.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Document Retrieval from a Search Engine</title>
        <p>We retrieve supporting documents and texts, in addition to the claim texts
themselves, in order to train the prediction models. To that end, we retrieve the
top-N documents from Google. Compared to the classical approach of analyzing
and shortening claims into keywords, we use a step-wise searching method to
preserve the semantics as much as possible.</p>
        <p>
          The reason for this is that we conjecture that using the whole sentence
preserves more of the original semantics, including speaking habits and
commonly-used sequences of words. We thus use each whole claim verbatim as a Google
query, at the risk of retrieving only few documents. We subsequently apply
step-wise searching to fill up a list of documents as follows. First, we initialize an
empty document set for each claim. We then use the whole claim as a query to
retrieve documents and populate the set with the results. If the set has fewer than
N documents (N=20 in our concrete experiments), we remove the stop words
from the claim and search again, adding the retrieved documents to the set. If
the set still contains fewer than N documents, we search once more using only the
nouns, verbs and adjectives, obtained with part-of-speech (POS) tagging (using the
Stanford POS tagger [<xref ref-type="bibr" rid="ref9">9</xref>]), and populate the
set in the same way as for the second search. We ended up being able to retrieve
20 documents for each of the claims. Note that we do not necessarily use all of
them, as we re-rank the documents by cosine similarity and employ, for example,
only the top 4 or 5 snippets among them.
        </p>
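        <p>The procedure can be sketched as below; google_search is a hypothetical
wrapper around the search API we used, and the NLTK tokenizer and POS tagger
stand in for the Stanford tagger mentioned above.</p>
        <preformat>
import nltk  # assumes stopwords, punkt and the POS tagger are downloaded

STOPWORDS = set(nltk.corpus.stopwords.words("english"))
CONTENT_TAGS = ("NN", "VB", "JJ")  # noun/verb/adjective tag prefixes

def stepwise_search(claim, google_search, n=20):
    """Fill a document set for a claim with up to three query variants.

    google_search(query) is a hypothetical function returning a list of
    result dicts with at least a "url" key.
    """
    docs, seen = [], set()

    def add(results):
        for doc in results:
            if doc["url"] not in seen:
                seen.add(doc["url"])
                docs.append(doc)

    # 1) The whole claim, verbatim, to preserve the original semantics.
    add(google_search(claim))
    if len(docs) >= n:
        return docs[:n]
    # 2) The claim without stop words.
    tokens = nltk.word_tokenize(claim)
    add(google_search(" ".join(t for t in tokens
                               if t.lower() not in STOPWORDS)))
    if len(docs) >= n:
        return docs[:n]
    # 3) Only nouns, verbs and adjectives, via POS tagging.
    keywords = [t for t, tag in nltk.pos_tag(tokens)
                if tag.startswith(CONTENT_TAGS)]
    add(google_search(" ".join(keywords)))
    return docs[:n]
</preformat>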
      </sec>
      <sec id="sec-2-3">
        <title>Pre-trained embedding</title>
        <p>
          For the CNN and RNN, we employ existing pre-trained word vectors, word2vec
[<xref ref-type="bibr" rid="ref7">7</xref>], for our word embedding. word2vec
was released by Google, trained on roughly 100 billion words of Google News with
the continuous bag-of-words model, and provides 300-dimensional vectors.
        </p>
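        <p>Loading these vectors and building an index-aligned embedding matrix can
be done with gensim, for instance; the zero vector for out-of-vocabulary words and
the reserved padding row are assumptions of this sketch.</p>
        <preformat>
import numpy as np
from gensim.models import KeyedVectors

# The publicly released 300-dimensional Google News vectors.
w2v = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

def embedding_matrix(vocabulary, dim=300):
    """Embedding matrix aligned with the vocabulary; unseen words get zeros."""
    matrix = np.zeros((len(vocabulary) + 1, dim))  # row 0 reserved for padding
    for i, word in enumerate(vocabulary, start=1):
        if word in w2v:
            matrix[i] = w2v[word]
    return matrix
</preformat>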
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Analysis of the Results</title>
      <p>In Sections 4.1 and 4.2, we perform experiments with a Bayesian classifier
to investigate how to map the six labels into three, and to determine how many
documents (snippets) we need to fit the models. We do not use more advanced
classifiers such as RNNs or CNNs for this analysis, as their word embedding
layer is hidden and hard to explain, and the neural network layers proper are
sensitive to parameters rather than to the semantics of the texts. Conversely, it
is usually easier to understand how a trained Bayesian classifier based on
bag-of-words and n-grams reflects the semantics of texts in a simple and
straightforward way. In Section 4.3, we compare the performance of the
different models on the test dataset.</p>
      <sec id="sec-3-1">
        <title>Label mapping</title>
        <p>The six Politifact labels must be mapped into three: True, False, and Half-True.
Some labels are ambiguous, e.g., Mostly True can be either True or Half-True.
Therefore, we tune the mapping using a Bayes classification model on the three
combination cases listed in Table 3. As shown in Figure 3, Map3 shares the
highest accuracy with Map1 and has the highest macro F1 and macro recall.
Therefore, we adopt the Map3 mapping for our training data. We discard the
claims with labels not listed in Map3; the resulting statistics are shown in
Table 4 and apply to all other models as well.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Number of snippets</title>
        <p>We vary the number of document snippets that are concatenated and
compare the resulting performance (accuracy, macro F1, etc.) to determine the
number of snippet texts needed. We rank the documents according to their
pairwise TF-IDF similarity with the claims and select the top N (we test N=1 to
N=10 in our experiments) to concatenate.</p>
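        <p>A minimal sketch of this ranking step with scikit-learn (the vectorizer
settings are assumptions):</p>
        <preformat>
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def top_n_snippets(claim, snippets, top_n=5):
    """Rank snippets by TF-IDF cosine similarity to the claim; keep the top N."""
    vec = TfidfVectorizer(stop_words="english")
    matrix = vec.fit_transform([claim] + snippets)
    sims = cosine_similarity(matrix[0], matrix[1:]).ravel()
    order = sims.argsort()[::-1][:top_n]
    return " ".join(snippets[i] for i in order)
</preformat>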
        <p>As shown in Figure 4, using two snippets leads to the highest performance,
5 snippets to the second-best, and 4 and 8 to the third-best. In short, this ranking
(2, 5, 4, 8, etc.) tells us which numbers of snippets are worth trying. Since
training a deep learning model is time-consuming, such an analysis narrows the
scope of choices and lets us focus on the parameters of the models themselves.
Note, however, that the results by number of snippets are quite unstable, and
no firm conclusions can be drawn. In our experiments, we primarily use 2 and 5
snippets (the highest and second highest) and attempt to find the combination of
model parameters and number of snippets with the best performance.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Performance of different models</title>
        <p>For the Bayes and SVM classifiers, we employ the Bag-of-Words (BOW) model
for the English texts, tokenizing them and removing stop words; we also adopt
TF-IDF (term frequency times inverse document frequency) term weighting. We use
grid search to tune the parameters of each model. We rank the documents of each
claim according to their similarity with the claim and concatenate the first five
snippets into one document. As shown in Table 5, Naive Bayes with grid search
reaches its best performance of 53.98% and 43.90% on the Politifact samples
(20% of the Politifact samples = 743 samples) and on CLEF (82 samples),
respectively. We report the two best models each for the CNNs and RNNs, on the
Politifact and CLEF test sets respectively, labeled CNN1, CNN2, RNN1 and RNN2
in the table. We observe that the RNN performs worse than the CNN on the
Politifact samples, achieving 47.22% and 46.88%, respectively, with similar
performance of 45.12% and 46.34% on the CLEF samples. In contrast, the CNN
reaches an accuracy of 55.56% and 51.56%, respectively, on the Politifact samples,
but only 42.68% and 48.78% on the CLEF samples.</p>
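        <p>In spirit, the SVM+grid-search run corresponds to a scikit-learn pipeline
like the one below; the parameter grid is a hypothetical example, not the grid we
actually searched.</p>
        <preformat>
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# BOW features with stop-word removal and TF-IDF weighting, then a linear SVM.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english")),
    ("clf", LinearSVC()),
])

# Hypothetical search grid; each model was tuned in the same fashion.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.01, 0.1, 1, 10],
}

search = GridSearchCV(pipeline, param_grid, scoring="accuracy", cv=5)
# texts: concatenated top-5 snippets per claim; labels: true/half-true/false
# search.fit(texts, labels); print(search.best_params_, search.best_score_)
</preformat>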
        <p>We submitted the SVM+grid-search and CNN models, because the SVM is more
stable across most test cases, while the CNN has relatively good overall
performance but is less stable. We ran the two models on the final, unlabeled test
dataset from CLEF and output the prediction results. The test results from CLEF
are shown in Table 6. Within our two groups of results, we observe that on the
MAE (mean absolute error) metric the SVM outperforms the CNN, while on
accuracy CNN2 outperforms the SVM. Moreover, for each single metric, one of our
two runs outperforms the runs of all the other teams.</p>
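        <p>For reference, the sketch below shows how such an MAE can be computed over
ordinally encoded truth labels; the 0/1/2 encoding of the three labels is our
assumption for illustration, not the official CLEF implementation.</p>
        <preformat>
# Assumed ordinal encoding: the distance between labels matters for MAE.
ORDINAL = {"true": 0, "half-true": 1, "false": 2}

def mae(gold, predicted):
    """Mean absolute error over ordinally encoded truth labels."""
    return sum(abs(ORDINAL[g] - ORDINAL[p])
               for g, p in zip(gold, predicted)) / len(gold)

def accuracy(gold, predicted):
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)
</preformat>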
        <p>Several further empirical observations are evident from our experiments:
1. The RNN model is more unstable than the other models, and is sensitive not
only to the parameters but also to the number of epochs.
2. The traditional models, Bayes and SVM, sometimes perform worse than the
neural network-based approaches, but are much more robust in terms of
performance.
3. We originally conjectured that the documents of a claim could carry
temporal information that an RNN model could exploit. Hence, we also tried
re-ranking the documents of a claim by year and fitting the RNN model on that
order. This does not improve performance. One possible reason is that some
documents lack time information, and placing them in a ranked list (at the
front or the rear) just introduces uncertainty. Another reason might be that
we use only a few documents, so the ordering effect is not as pronounced as we
assumed.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Perspectives for Future Work</title>
      <p>
        Our current work employs only lexical and syntactic context. In future work,
we plan to add information about semantic structures and argumentation flow; we
believe that this will aid our methods in identifying some of the most egregious
common examples of poor reasoning or argumentation (e.g., logical fallacies).
One related work is that of Ba et al. [<xref ref-type="bibr" rid="ref1">1</xref>],
who extract entities and relations from the web and Twitter and gather the
conflicting information. Secondly, while we have found that combinations of
simplified versions of several methods from the literature work well, we aim to
investigate whether tuned versions of the original methods (e.g.,
[<xref ref-type="bibr" rid="ref4">4</xref>]) may improve our results.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgement</title>
      <p>This work is funded by the European Union's Horizon 2020 research and
innovation programme under the Marie Sklodowska-Curie grant agreement No 721321.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Ba</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Berti-Equille</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Shah</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Hammady</surname>
          </string-name>
          . VERA:
          <article-title>A platform for veracity estimation over web data</article-title>
          .
          <source>In Proceedings of the 25th International Conference on World Wide Web, WWW</source>
          <year>2016</year>
          , Montreal, Canada,
          <source>April 11-15</source>
          ,
          <year>2016</year>
          , Companion Volume, pages
          <volume>159</volume>
          –
          <fpage>162</fpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elsayed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Suwaileh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Màrquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Atanasova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zaghouani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kyuchukov</surname>
          </string-name>
          , G. Da San Martino, and
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          .
          <article-title>Overview of the CLEF-2018 CheckThat! lab on automatic identification and verification of political claims, task 2: Factuality</article-title>
          . In L. Cappellato,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-Y.</given-names>
            <surname>Nie</surname>
          </string-name>
          , and L. Soulier, editors,
          <source>CLEF 2018 Working Notes. Working Notes of CLEF 2018 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings</source>
          , Avignon, France,
          <year>September 2018</year>
          .
          CEUR-WS.org
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>C.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Simonsen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Lioma</surname>
          </string-name>
          .
          <article-title>The Copenhagen Team Participation in the Check-Worthiness Task of the Competition of Automatic Identification and Verification of Claims in Political Debates of the CLEF-2018 Fact Checking Lab</article-title>
          .
          <source>Technical report, CLEF Fact Checking Lab</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>G.</given-names>
            <surname>Karadzhov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Màrquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Koychev</surname>
          </string-name>
          .
          <article-title>Fully automated fact checking using external sources</article-title>
          . In R. Mitkov and G. Angelova, editors,
          <source>Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP</source>
          <year>2017</year>
          , Varna, Bulgaria, September 2 -
          <issue>8</issue>
          ,
          <year>2017</year>
          , pages
          <fpage>344</fpage>
          –
          <fpage>353</fpage>
          . INCOMA Ltd.,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kim</surname>
          </string-name>
          .
          <article-title>Convolutional neural networks for sentence classification</article-title>
          .
          <source>In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29</source>
          ,
          <year>2014</year>
          , Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL
          , pages
          <volume>1746</volume>
          –
          <fpage>1751</fpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>J.</given-names>
            <surname>Ma</surname>
          </string-name>
          , W. Gao,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mitra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kwon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. J.</given-names>
            <surname>Jansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-F.</given-names>
            <surname>Wong</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Cha</surname>
          </string-name>
          .
          <article-title>Detecting rumors from microblogs with recurrent neural networks</article-title>
          .
          <source>In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence</source>
          ,
          <source>IJCAI'16</source>
          , pages
          <fpage>3818</fpage>
          –
          <fpage>3824</fpage>
          . AAAI Press,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          , I. Sutskever,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Corrado</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          .
          <source>In Advances in neural information processing systems</source>
          , pages
          <volume>3111</volume>
          –
          <fpage>3119</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elsayed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Suwaileh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Màrquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zaghouani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Gencheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kyuchukov</surname>
          </string-name>
          , and G. Da San Martino.
          <article-title>Overview of the CLEF-2018 lab on automatic identification and verification of claims in political debates</article-title>
          .
          <source>In Working Notes of CLEF</source>
          <year>2018</year>
          – Conference and Labs of the Evaluation Forum, CLEF '18, Avignon
          , France,
          <year>September 2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          and
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <article-title>Enriching the knowledge sources used in a maximum entropy part-of-speech tagger</article-title>
          .
          <source>In Proceedings of the 2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing</source>
          and Very Large Corpora:
          <article-title>Held in Conjunction with the 38th Annual Meeting of the Association for Computational Linguistics</article-title>
          - Volume
          <volume>13</volume>
          , EMNLP '
          <volume>00</volume>
          , pages
          <fpage>63</fpage>
          –
          <fpage>70</fpage>
          ,
          Stroudsburg
          , PA, USA,
          <year>2000</year>
          .
          <article-title>Association for Computational Linguistics</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Huang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Z.</given-names>
            <surname>Deng</surname>
          </string-name>
          .
          <article-title>Densely connected CNN with multi-scale feature attention for text classification</article-title>
          .
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>