<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Neural Weakly Supervised Fact Check-Worthiness Detection with Contrastive Sampling-Based Ranking Loss</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Casper Hansen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christian Hansen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jakob Grue Simonsen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christina Lioma</string-name>
          <email>c.lioma@di.ku.dk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Copenhagen</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper describes the winning approach used by the Copenhagen team in the CLEF-2019 CheckThat! lab. Given a political debate or speech, the aim is to predict which sentences should be prioritized for fact-checking by creating a ranked list of sentences. While many approaches for check-worthiness exist, we are the first to directly optimize the sentence ranking, as all previous work has solely used standard classification-based loss functions. We present a recurrent neural network model that learns a sentence encoding, from which a check-worthiness score is predicted. The model is trained by jointly optimizing a binary cross entropy loss and a ranking-based pairwise hinge loss. We obtain sentence pairs for training through contrastive sampling, where for each sentence we find the k most semantically similar sentences with the opposite label. To increase the generalizability of the model, we utilize weak supervision by using an existing check-worthiness approach to weakly label a large unlabeled dataset. We experimentally show that both weak supervision and the ranking component improve the results individually (MAP increases of 25% and 9%, respectively), and that using them together improves the results even more (39% increase). Through a comparison to existing state-of-the-art check-worthiness methods, we find that our approach improves the MAP score by 11%.</p>
      </abstract>
      <kwd-group>
        <kwd>fact check-worthiness</kwd>
        <kwd>neural networks</kwd>
        <kwd>contrastive ranking</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Tasks performed</title>
      <p>
        The Copenhagen team participated in Task 1 [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] of the CLEF 2019 Fact Checking
Lab (CheckThat!) on Automatic Identification and Verification of Claims [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. This
report details our approach and results.
      </p>
      <p>The aim of Task 1 is to identify sentences in a political debate that should
be prioritized for fact-checking: given a debate, the goal is to produce a ranked
list of all sentences based on their worthiness for fact checking.</p>
      <p>Examples of check-worthy sentences are shown in Table 1. In the first
example, Hillary Clinton mentions Bill Clinton's work in the 1990s, followed by a
claim made by Donald Trump stating that President Clinton approved the North
American Free Trade Agreement (NAFTA). In the second example, Hillary
Clinton mentions Donald Trump's beliefs about climate change. While this may be
more difficult to fact-check, it is still considered an interesting claim and thus
check-worthy.</p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>Examples of check-worthy sentences, shown with their surrounding context.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Speaker</th><th>Sentence</th><th>Check-worthy</th></tr>
          </thead>
          <tbody>
            <tr><td>CLINTON</td><td>I think my husband did a pretty good job in the 1990s.</td><td></td></tr>
            <tr><td>CLINTON</td><td>I think a lot about what worked and how we can make it work again...</td><td></td></tr>
            <tr><td>TRUMP</td><td>Well, he approved NAFTA...</td><td>yes</td></tr>
            <tr><td>CLINTON</td><td>Take clean energy</td><td></td></tr>
            <tr><td>CLINTON</td><td>Some country is going to be the clean-energy superpower of the 21st century.</td><td></td></tr>
            <tr><td>CLINTON</td><td>Donald thinks that climate change is a hoax perpetrated by the Chinese.</td><td>yes</td></tr>
            <tr><td>CLINTON</td><td>I think it's real.</td><td></td></tr>
            <tr><td>TRUMP</td><td>I did not.</td><td></td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-2">
      <title>Main objectives of experiments</title>
      <p>The task of check-worthiness can be considered part of the fact-checking pipeline,
which traditionally consists of three steps:</p>
      <list list-type="order">
        <list-item><p>Detect sentences that are interesting to fact-check.</p></list-item>
        <list-item><p>Gather evidence and background knowledge for each sentence.</p></list-item>
        <list-item><p>Manually or automatically estimate veracity.</p></list-item>
      </list>
      <p>
        Sentences detected in step 1 for further processing are described as being
check-worthy. This detection can be considered a filtering step that limits the
total computational processing needed in the later steps. In practice,
sentences are ranked according to their check-worthiness such that they can be
processed in order of importance. Thus, the ability to correctly rank check-worthy
sentences above non-check-worthy ones is essential for automatic check-worthiness
methods to be useful in practice. However, existing check-worthiness methods
[
        <xref ref-type="bibr" rid="ref10 ref11 ref16 ref19 ref5 ref6 ref9">10,16,11,5,6,9,19</xref>
        ] do not directly model this aspect, as they are all based on
traditional classification-based training objectives.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Related work</title>
      <p>
        Most existing check-worthiness methods are based on feature engineering to
extract meaningful features. Given a sentence, ClaimBuster [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] predicts
check-worthiness by extracting a set of features (sentiment, statement length,
Part-of-Speech (POS) tags, named entities, and tf-idf weighted bag-of-words), and uses
an SVM classifier for the prediction. Patwari et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] presented an approach
based on similar features, together with contextual features based on the sentences
immediately preceding and succeeding the one being assessed, and certain
hand-crafted POS patterns. The prediction is made by a multi-classifier system
based on a dynamic clustering of the data. Gencheva et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] also extend the
features used by ClaimBuster to include more context, such as the sentence's
position in the debate segment, segment sizes, similarities between segments,
and whether the debate opponent was mentioned. In the CLEF 2018
competition on check-worthiness detection [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], Hansen et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] showed that a recurrent
neural network with multiple word representations (word embeddings,
part-of-speech tagging, and syntactic dependencies) could obtain state-of-the-art results
for check-worthiness prediction. Hansen et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] later extended this work with
weak supervision based on a large collection of unlabeled political speeches and
showed significant improvements compared to existing state-of-the-art methods.
This paper directly improves the work done by Hansen et al. by integrating a
ranking component into the model trained via contrastive sampling.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Neural Check-Worthiness Model</title>
      <p>
        Our Neural Check-Worthiness Model (NCWM) uses a dual sentence
representation, where each word is represented by both a word embedding and its syntactic
dependencies within the sentence. The word embedding is a traditional word2vec
model [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] that aims at capturing the semantics of the sentence. The syntactic
dependencies of a word aim to capture the role of that word in modifying the
semantics of other words in the sentence [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. We use a syntactic dependency
parser [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] to map each word to its dependency (as a tag) within the sentence,
which is then converted to a one-hot encoding. This combination of capturing
both semantics and syntactic structure has been shown to work well for the
check-worthiness task [
        <xref ref-type="bibr" rid="ref6 ref9">6,9</xref>
        ]. For each word in a sentence, the word embedding
and one-hot encoding are concatenated and fed to a recurrent neural network
with Long Short-Term Memory (LSTM) units as memory cells (see Figure 1).
The outputs of the LSTM cells are aggregated using an attention weighted sum,
where each weight is computed as:
        <disp-formula id="eq1">
          <label>(1)</label>
          <tex-math>\alpha_t = \frac{\exp(\mathrm{score}(h_t))}{\sum_i \exp(\mathrm{score}(h_i))}</tex-math>
        </disp-formula>
      </p>
      <p>where <italic>h<sub>t</sub></italic> is the output of the LSTM cell at time <italic>t</italic>, and score(·) is a learned
function that returns a scalar. Finally, the attention weighted sum is transformed
to the check-worthiness score by a sigmoid transformation, such that the score
lies between 0 and 1.</p>
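      <p>As an illustration of Equation 1, the following minimal Python sketch computes the attention weights as a softmax over per-word scores, forms the attention weighted sum, and maps it to a score between 0 and 1. It is not the authors' TensorFlow implementation: the linear scorer (<monospace>score_w</monospace>), the linear output layer (<monospace>out_w</monospace>), and the toy hidden states are assumptions made for the example, since the text only states that score(·) is a learned function returning a scalar.</p>
      <preformat preformat-type="code">
```python
import math


def attention_weights(scores):
    """Softmax over the per-timestep scalar scores (Equation 1)."""
    m = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, score_fn):
    """Attention weighted sum of the LSTM outputs h_1..h_M."""
    alphas = attention_weights([score_fn(h) for h in hidden_states])
    dim = len(hidden_states[0])
    pooled = [
        sum(a * h[d] for a, h in zip(alphas, hidden_states))
        for d in range(dim)
    ]
    return pooled, alphas


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical parameters (assumptions): a linear scorer and a linear output layer.
score_w = [0.5, -0.25]
out_w = [1.0, 2.0]


def linear_score(h):
    return sum(w * x for w, x in zip(score_w, h))


# Stand-in LSTM outputs for a three-word sentence.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, alphas = attention_pool(hidden, linear_score)
check_worthiness = sigmoid(sum(w * x for w, x in zip(out_w, pooled)))
```
      </preformat>
      <p>In this toy input the first word receives the largest weight because its score (0.5) is the highest of the three; the weights always sum to 1.</p>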
      <p>[Figure 1: overview of the NCWM. Layers from bottom to top: Sentence (word 1 to word M),
Representation (word embedding and syntactic dependency for each word), Memory (one LSTM
cell per word), Aggregation (attention), and Prediction (check-worthiness score).]</p>
      <sec id="sec-4-4">
        <p>Loss functions. The model is jointly trained using both a classification and
a ranking loss function. For the classification loss, we use the standard binary cross
entropy loss. For the ranking loss, we use a hinge loss based on the computed
check-worthiness score of sentence pairs with opposite labels. To obtain these
pairs we use contrastive sampling, such that for each sentence we sample the k
most semantically similar sentences with the opposite label, i.e., for check-worthy
sentences we sample k non-check-worthy sentences. In order to estimate the
semantic similarity we compute an average word embedding vector of all words in a
sentence, and then use the cosine similarity to find the k most semantically
similar sentences with the opposite label. The purpose of this contrastive sampling is
to enable the model to better learn the small differences between check-worthy
and non-check-worthy sentences. The combination of both the classification and
ranking loss enables the model to learn accurate classifications while ensuring
the predicted scores are sensible for ranking.</p>
        <p>Our approach is summarized in Figure 1, and in the following the underlined
values were found to perform the best during validation. The cross validation
consisted of a fold for each training speech and debate. The LSTM has {50,
100, 150, 200} hidden units, a dropout of {0, 0.1, 0.3, 0.5} was applied to the
attention weighted sum, and we used a batch size of {40, 80, 120, 160, 200}.
For the contrastive sampling we found the 5 most semantically similar sentences
with the opposite label. For the syntactic dependency parsing we use spaCy<fn id="fn1"><label>1</label><p>https://spacy.io/</p></fn>,
and for the neural network implementation TensorFlow.</p>
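        <p>The contrastive sampling and the joint loss described above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the hinge margin (<monospace>margin</monospace>), the weighting between the two losses (<monospace>rank_weight</monospace>), and all function names are hypothetical, because the text does not specify them. The toy call uses k = 1 for brevity, whereas the experiments use the 5 most similar sentences.</p>
        <preformat preformat-type="code">
```python
import math


def mean_vector(word_vectors):
    """Average the word embeddings of one sentence."""
    dim = len(word_vectors[0])
    return [sum(v[d] for v in word_vectors) / len(word_vectors) for d in range(dim)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def contrastive_pairs(sent_vectors, labels, k):
    """For each sentence, pick the k most similar sentences with the opposite label."""
    pairs = []
    for i, (vec, label) in enumerate(zip(sent_vectors, labels)):
        opposite = [j for j, other in enumerate(labels) if other != label]
        opposite.sort(key=lambda j: cosine(vec, sent_vectors[j]), reverse=True)
        for j in opposite[:k]:
            # Each pair is stored as (check-worthy index, non-check-worthy index).
            pairs.append((i, j) if label == 1 else (j, i))
    return pairs


def binary_cross_entropy(score, label, eps=1e-7):
    score = min(max(score, eps), 1.0 - eps)  # avoid log(0)
    return -(label * math.log(score) + (1 - label) * math.log(1.0 - score))


def pairwise_hinge(pos_score, neg_score, margin=1.0):
    """Zero only when the check-worthy score exceeds the other by the margin."""
    return max(0.0, margin - (pos_score - neg_score))


def joint_loss(scores, labels, pairs, margin=1.0, rank_weight=1.0):
    """Classification loss plus ranking loss (margin and weight are assumptions)."""
    bce = sum(binary_cross_entropy(s, y) for s, y in zip(scores, labels)) / len(scores)
    rank = sum(pairwise_hinge(scores[p], scores[n], margin) for p, n in pairs)
    rank = rank / max(len(pairs), 1)
    return bce + rank_weight * rank


# Toy data: four sentence vectors (already averaged) and their labels.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [1, 0, 1, 0]
pairs = contrastive_pairs(vectors, labels, k=1)
loss = joint_loss([0.9, 0.1, 0.9, 0.1], labels, pairs)
```
        </preformat>
        <p>Because every sentence samples its own neighbours, the same pair can occur more than once (here each pair appears twice); whether duplicates are kept or removed is a design choice that the text leaves open.</p>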
        <p>
          To train a more generalizable model we employ weak supervision [
          <xref ref-type="bibr" rid="ref18 ref3 ref6 ref8">3,6,8,18</xref>
          ]
by using an existing check-worthiness approach, ClaimBuster<sup>2</sup> [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], to weakly
label a large collection of unlabeled political speeches and debates. We obtain
271 political speeches and debates by Hillary Clinton and Donald Trump from
the American Presidency Project<sup>3</sup>. This weakly labeled dataset is used for
pretraining our model. To create a pretrained word embedding, we crawl documents
related to all U.S. elections available through the American Presidency Project,
e.g., press releases, statements, speeches, and public fundraisers, resulting in
15,059 documents. This domain-specific pretraining was also done by Hansen et
al. [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], and was shown to perform significantly better than a word embedding
pretrained on a large general corpus like Google News<sup>4</sup>.
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Results</title>
      <p>For evaluation we use the official test dataset of the competition, while choosing
the hyperparameters based on a 19-fold cross validation (1 fold for each training
speech and debate). Following the competition guidelines, we report the MAP
and P@k metrics for the full test data, only the 3 debates, and only the 4
speeches. This split was chosen to investigate how the performance
varies depending on the setting.</p>
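      <p>For readers unfamiliar with the two metrics, the sketch below computes P@k and MAP from ranked binary labels using their common definitions. It is an illustrative assumption rather than the official competition scorer, which may differ in details such as tie handling; the function names are hypothetical.</p>
      <preformat preformat-type="code">
```python
def rank_by_score(scores, labels):
    """Order the gold labels by descending predicted check-worthiness score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def precision_at_k(ranked_labels, k):
    """Fraction of the top-k ranked sentences that are check-worthy."""
    return sum(ranked_labels[:k]) / k


def average_precision(ranked_labels):
    """Mean of precision@i over the ranks i that hold a check-worthy sentence."""
    hits = 0
    total = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(ranked_labels_per_document):
    """Average precision, averaged over speeches and debates."""
    aps = [average_precision(r) for r in ranked_labels_per_document]
    return sum(aps) / len(aps)


# One toy document: check-worthy sentences end up at ranks 1 and 3.
ranked = rank_by_score([0.9, 0.7, 0.6, 0.1], [1, 0, 1, 0])
ap = average_precision(ranked)   # (1/1 + 2/3) / 2
p_at_2 = precision_at_k(ranked, 2)
```
      </preformat>
      <p>Both metrics are computed per speech or debate and then averaged, which is why a single badly ranked debate can lower the reported score noticeably when only a few documents are evaluated.</p>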
      <p>
        Overall, our Neural Check-Worthiness Model (NCWM) obtained first
place in the competition with a MAP of 0.1660 (primary run). To investigate
the effect of the ranking component and the weak supervision (see Table 2), we
also report the results when these are not part of NCWM. The model without
the ranking component is similar to the state-of-the-art work by Hansen et al.
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] (contrastive-1 run), and the model without either the ranking component
or weak supervision is similar to earlier work by Hansen et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The results
show that the ranking component and weak supervision lead to notable
improvements, both individually and when combined. The inclusion of weak supervision
leads to the largest individual MAP improvement (25% increase), while the
individual improvement of the ranking component is smaller (9% increase). We
observe that the ranking component's improvement is marginally larger when
weak supervision is included (11% increase with weak supervision compared to
9% without), showing that even a weakly labeled signal is beneficial
for learning the correct ranking. Combining both the ranking component and
weak supervision leads to a MAP increase of 39% compared to a model without
either of them, which highlights the benefit of using both for the task
of check-worthiness, as the combination provides an improvement larger than the
individual parts.
      </p>
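      <p>The percentages quoted above are relative MAP increases and can be re-derived from the MAP values on the full test data in Table 2. The short sketch below does the arithmetic; the dictionary keys and the helper name are illustrative, while the four MAP values are the ones reported for the full test set.</p>
      <preformat preformat-type="code">
```python
# MAP on the full test data, as reported in Table 2.
map_full = {
    "NCWM": 0.1660,
    "w/o ranking": 0.1496,             # weak supervision only
    "w/o WS": 0.1305,                  # ranking component only
    "w/o ranking and w/o WS": 0.1195,  # neither
}


def relative_increase(new, old):
    """Relative improvement of new over old, e.g. 0.25 for a 25% increase."""
    return (new - old) / old


base = map_full["w/o ranking and w/o WS"]
gains = {
    "weak supervision alone": relative_increase(map_full["w/o ranking"], base),
    "ranking alone": relative_increase(map_full["w/o WS"], base),
    "both combined": relative_increase(map_full["NCWM"], base),
    "ranking on top of weak supervision": relative_increase(
        map_full["NCWM"], map_full["w/o ranking"]
    ),
}

# Rounded to whole percentages for comparison with the text.
percent = {name: round(100 * value) for name, value in gains.items()}
```
      </preformat>
      <p>Rounded to whole percentages, these come out as 25, 9, 39, and 11, matching the figures in the text.</p>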
      <p>To investigate the performance on speeches and debates individually, we split
the test data and report the performance metrics on each of the sets. In both of
them we observe a similar trend as for the full dataset, i.e., that both the ranking
component and weak supervision lead to improvements individually and when
combined. However, the MAP on the debates is significantly lower than for the
speeches (0.0538 and 0.2502, respectively). We believe the reason for this
difference is related to two issues: (i) All speeches are by Donald Trump and 15
out of 19 training speeches and debates have Donald Trump as a participant,
thus the model is better trained to predict sentences by Donald Trump. (ii)
Debates are often more varied in content compared to a single speech, and contain
participants who are not well represented in the training data. Issue (i) can be
alleviated by obtaining larger quantities and more varied training data, while
issue (ii) may simply be due to debates being inherently more difficult to predict.
Models better equipped to handle the dynamics of debates could be a possible
direction to solve this.</p>
      <table-wrap id="tab2">
        <label>Table 2</label>
        <caption>
          <p>MAP and P@k on the full test data, the speeches, and the debates, with and without the ranking component and weak supervision (WS).</p>
        </caption>
        <table>
          <thead>
            <tr><th>Model</th><th>MAP</th><th>P@1</th><th>P@5</th><th>P@20</th><th>P@50</th></tr>
          </thead>
          <tbody>
            <tr><th colspan="6">Test (Full)</th></tr>
            <tr><td>NCWM</td><td>0.1660</td><td>0.2857</td><td>0.2571</td><td>0.1571</td><td>0.1229</td></tr>
            <tr><td>NCWM (w/o ranking) [<xref ref-type="bibr" rid="ref6">6</xref>]</td><td>0.1496</td><td>0.1429</td><td>0.2000</td><td>0.1429</td><td>0.1143</td></tr>
            <tr><td>NCWM (w/o WS)</td><td>0.1305</td><td>0.1429</td><td>0.1714</td><td>0.1429</td><td>0.1200</td></tr>
            <tr><td>NCWM (w/o ranking and w/o WS) [<xref ref-type="bibr" rid="ref9">9</xref>]</td><td>0.1195</td><td>0.1429</td><td>0.1429</td><td>0.1143</td><td>0.1057</td></tr>
            <tr><th colspan="6">Test (Speeches)</th></tr>
            <tr><td>NCWM</td><td>0.2502</td><td>0.5000</td><td>0.3500</td><td>0.2375</td><td>0.1800</td></tr>
            <tr><td>NCWM (w/o ranking) [<xref ref-type="bibr" rid="ref6">6</xref>]</td><td>0.2256</td><td>0.2500</td><td>0.3000</td><td>0.2250</td><td>0.1800</td></tr>
            <tr><td>NCWM (w/o WS)</td><td>0.1937</td><td>0.2500</td><td>0.3000</td><td>0.2000</td><td>0.1600</td></tr>
            <tr><td>NCWM (w/o ranking and w/o WS) [<xref ref-type="bibr" rid="ref9">9</xref>]</td><td>0.1845</td><td>0.2500</td><td>0.2500</td><td>0.1875</td><td>0.1450</td></tr>
            <tr><th colspan="6">Test (Debates)</th></tr>
            <tr><td>NCWM</td><td>0.0538</td><td>0.0000</td><td>0.1333</td><td>0.0500</td><td>0.0467</td></tr>
            <tr><td>NCWM (w/o ranking) [<xref ref-type="bibr" rid="ref6">6</xref>]</td><td>0.0482</td><td>0.0000</td><td>0.0667</td><td>0.0333</td><td>0.0267</td></tr>
            <tr><td>NCWM (w/o WS)</td><td>0.0462</td><td>0.0000</td><td>0.0000</td><td>0.0667</td><td>0.0667</td></tr>
            <tr><td>NCWM (w/o ranking and w/o WS) [<xref ref-type="bibr" rid="ref9">9</xref>]</td><td>0.0329</td><td>0.0000</td><td>0.0000</td><td>0.0167</td><td>0.0533</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <fn-group>
        <fn id="fn2"><label>2</label><p>https://idir.uta.edu/claimbuster/</p></fn>
        <fn id="fn3"><label>3</label><p>https://web.archive.org/web/20170606011755/http://www.presidency.ucsb.edu/</p></fn>
        <fn id="fn4"><label>4</label><p>https://code.google.com/archive/p/word2vec/</p></fn>
      </fn-group>
    </sec>
    <sec id="sec-6">
      <title>Conclusion and future work</title>
      <p>
        We presented a recurrent neural model that directly models the ranking of
check-worthy sentences, which no previous work has done. This was done through a
hinge loss based on contrastive sampling, where the most semantically similar
sentences with opposite labels were sampled for each sentence. Additionally, we
utilized weak supervision through an existing check-worthiness method to label
a large unlabeled dataset of political speeches and debates. We experimentally
verified that both the sentence ranking and weak supervision lead to notable
MAP improvements (increases of 9% and 25%, respectively) compared
to a model without either of them, while using both leads to an improvement
greater than the individual parts (39% increase). In comparison to a
state-of-the-art check-worthiness model [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], we found our approach to perform 11% better
on the MAP metric, while also achieving first place in the competition.
      </p>
      <p>
        In future work we plan to investigate approaches for better modelling
check-worthiness in debates, as this is important for real-world applications of
check-worthiness systems. Specifically, we plan to (1) investigate how context [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] can
be included to better model the dynamics of a debate compared to a speech;
(2) explore the use of speed reading for sentence filtering [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]; and (3) extend the
evaluation of this task beyond MAP and P@k, for instance using evaluation
measures of both relevance and credibility [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>P.</given-names>
            <surname>Atanasova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          , G. Karadzhov,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mohtarami</surname>
          </string-name>
          , and G. Da San Martino.
          <article-title>Overview of the CLEF-2019 CheckThat! Lab on Automatic Identification and Verification of Claims. Task 1: Check-Worthiness</article-title>
          .
          <source>CEUR Workshop Proceedings</source>
          , Lugano, Switzerland,
          <year>2019</year>
          . CEUR-WS.org.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Choi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tetreault</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Stent</surname>
          </string-name>
          .
          <article-title>It depends: Dependency parser comparison using a web-based evaluation tool</article-title>
          .
          <source>In Annual Meeting of the Association for Computational Linguistics</source>
          , pages
          <fpage>387</fpage>
          –
          <lpage>396</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>M.</given-names>
            <surname>Dehghani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zamani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Severyn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kamps</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. B.</given-names>
            <surname>Croft</surname>
          </string-name>
          .
          <article-title>Neural ranking models with weak supervision</article-title>
          .
          <source>In ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , pages
          <fpage>65</fpage>
          –
          <lpage>74</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>T.</given-names>
            <surname>Elsayed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Suwaileh</surname>
          </string-name>
          , G. Da San Martino, and
          <string-name>
            <given-names>P.</given-names>
            <surname>Atanasova</surname>
          </string-name>
          .
          <article-title>Overview of the CLEF-2019 CheckThat!: Automatic Identification and Verification of Claims</article-title>
          .
          <source>In Experimental IR Meets Multilinguality, Multimodality, and Interaction</source>
          , LNCS, Lugano, Switzerland, September
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>P.</given-names>
            <surname>Gencheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Marquez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Koychev</surname>
          </string-name>
          .
          <article-title>A context-aware approach for detecting worth-checking claims in political debates</article-title>
          .
          <source>In International Conference Recent Advances in Natural Language Processing</source>
          , pages
          <fpage>267</fpage>
          –
          <lpage>276</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>C.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Alstrup</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Grue</given-names>
            <surname>Simonsen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Lioma</surname>
          </string-name>
          .
          <article-title>Neural check-worthiness ranking with weak supervision: Finding sentences for fact-checking</article-title>
          .
          <source>In Companion Proceedings of the 2019 World Wide Web Conference</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>C.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Alstrup</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Simonsen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Lioma</surname>
          </string-name>
          .
          <article-title>Neural speed reading with structural-jump-lstm</article-title>
          .
          <source>ICLR</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>C.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Simonsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Alstrup</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Lioma</surname>
          </string-name>
          .
          <article-title>Unsupervised neural generative semantic hashing</article-title>
          .
          <source>In ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>C.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Simonsen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Lioma</surname>
          </string-name>
          .
          <article-title>The Copenhagen team participation in the check-worthiness task of the competition of automatic identification and verification of claims in political debates of the CLEF-2018 fact checking lab</article-title>
          . In CLEF-2018
          <source>CheckThat! Lab</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>N.</given-names>
            <surname>Hassan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Arslan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Tremayne</surname>
          </string-name>
          .
          <article-title>Toward automated fact-checking: Detecting check-worthy factual claims by claimbuster</article-title>
          .
          <source>In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , pages
          <fpage>1803</fpage>
          –
          <lpage>1812</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. I. Jaradat,
          <string-name>
            <given-names>P.</given-names>
            <surname>Gencheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Marquez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          .
          <article-title>ClaimRank: Detecting check-worthy claims in Arabic and English</article-title>
          .
          <source>In Conference of the North American Chapter of the Association for Computational Linguistics</source>
          , pages
          <fpage>26</fpage>
          –
          <lpage>30</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>C.</given-names>
            <surname>Lioma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Simonsen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Larsen</surname>
          </string-name>
          .
          <article-title>Evaluation measures for relevance and credibility in ranked lists</article-title>
          . In J. Kamps, E. Kanoulas, M. de Rijke, H. Fang, and E. Yilmaz, editors,
          <source>Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval, ICTIR 2017</source>
          , Amsterdam, The Netherlands, October 1-4, 2017, pages
          <fpage>91</fpage>
          –
          <lpage>98</lpage>
          . ACM,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>C.</given-names>
            <surname>Lioma</surname>
          </string-name>
          and
          <string-name>
            <given-names>C. J. K.</given-names>
            <surname>van Rijsbergen</surname>
          </string-name>
          .
          <article-title>Part of speech n-grams and information retrieval</article-title>
          .
          <source>French Review of Applied Linguistics, Special issue on Information Extraction and Linguistics</source>
          , <volume>XIII</volume>(<issue>2008/1</issue>):<fpage>9</fpage>–<lpage>22</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Corrado</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          .
          In <source>Advances in Neural Information Processing Systems</source>
          , pages <fpage>3111</fpage>–<lpage>3119</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elsayed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Suwaileh</surname>
          </string-name>
          , et al.
          <article-title>Overview of the CLEF-2018 CheckThat! lab on automatic identification and verification of political claims</article-title>
          .
          In <source>International Conference of the CLEF Association</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>A.</given-names>
            <surname>Patwari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Goldwasser</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Bagchi</surname>
          </string-name>
          .
          <article-title>Tathya: A multi-classifier system for detecting check-worthy statements in political debates</article-title>
          .
          In <source>ACM on Conference on Information and Knowledge Management</source>
          , pages <fpage>2259</fpage>–<lpage>2262</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>D.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Lima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Simonsen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Lioma</surname>
          </string-name>
          .
          <article-title>Contextual compositionality detection with external knowledge bases and word embeddings</article-title>
          . In
          <string-name>
            <given-names>S.</given-names>
            <surname>Amer-Yahia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mahdian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Goel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Houben</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lerman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>McAuley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. A.</given-names>
            <surname>Baeza-Yates</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Zia</surname>
          </string-name>
          , editors,
          <source>Companion of The 2019 World Wide Web Conference, WWW 2019</source>
          , San Francisco, CA, USA, May 13-17, 2019, pages <fpage>317</fpage>–<lpage>323</lpage>
          . ACM,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>H.</given-names>
            <surname>Zamani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. B.</given-names>
            <surname>Croft</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Culpepper</surname>
          </string-name>
          .
          <article-title>Neural query performance prediction using weak supervision from multiple signals</article-title>
          .
          In <source>International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , pages <fpage>105</fpage>–<lpage>114</lpage>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>C.</given-names>
            <surname>Zuo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Karakas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Banerjee</surname>
          </string-name>
          .
          <article-title>A hybrid recognition system for check-worthy claims using heuristics and supervised learning</article-title>
          .
          In <source>CLEF-2018 CheckThat! Lab</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>