<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CheckThat! Automatic Identification and Verification of Claims: IIT(ISM) @CLEF'19 Check-Worthiness</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ritesh Kumar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shivansh Prakash</string-name>
          <email>helloshivanshprakash@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shashank Kumar</string-name>
          <email>shashank0218@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rajendra Pamula</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science and Engineering, Indian Institute of Technology (Indian School of Mines) Dhanbad</institution>
          ,
          <addr-line>826004</addr-line>
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper describes the work that we did at Indian Institute of Technology (ISM) Dhanbad for CheckThat!: Automatic Identification and Verification of Claims at CLEF 2019. As required by CLEF 2019, we submitted a single run for its Check-Worthiness task. Two-fold cross-validation was used to select a model for submission to the CheckThat! Lab at CLEF 2019. For our run, we used SVM and LSTM methods. Overall, our performance is not satisfactory; however, as new entrants to the field, our scores are encouraging enough to keep working towards better results in the future.</p>
      </abstract>
      <kwd-group>
        <kwd>LSTM</kwd>
        <kwd>SVM</kwd>
        <kwd>Feature extraction</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Investigative journalists and volunteers work hard to get to the core of
a claim and to present solid evidence for or against it. In this age of
information, an abundant amount of data is readily available, so manual
fact-checking is very time-consuming; automatic methods have therefore been
proposed as a means of speeding up the process [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Moreover, some steps of the fact-checking pipeline receive little
attention: estimating check-worthiness, in particular, is a poorly understood
problem. That is why we address the problem of estimating the check-worthiness
of statements and claims. An overview of the CheckThat! shared task can be
found in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In brief, given a political debate or a transcribed speech, segmented
into sentences with annotated speakers, we have to identify which sentences
should be prioritized for fact-checking. This is a ranking task: systems are
required to produce one score per sentence, which determines the ranking. The
task is performed in English.
      </p>
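<p>Since the task is ranking-based and the official measure is mean average precision, the scoring logic can be sketched as follows. This is a minimal illustration, not the official CLEF scorer; the function name and example scores are our own.</p>

```python
def average_precision(scores, labels):
    """Average precision for one debate: rank sentences by predicted
    score (descending) and average the precision at each check-worthy hit."""
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

# MAP is the mean of average_precision over all debates/speeches.
print(average_precision([0.9, 0.2, 0.7, 0.1], [1, 0, 0, 1]))  # -> 0.75
```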
      <p>
        The rate at which a statement in an interview, a press release, or a
tweet can spread across the globe almost instantly has created an unprecedented
situation [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. There is almost no time to cross-check a statement or a claim against
the facts, and this has proved critical in politics, e.g. during the 2016 US
Presidential Campaign, whose result is said to have been affected by fake news
and claims spread through social media. As it became apparent that this problem
can greatly affect our lives, a number of fact-checking initiatives were
started, led by organizations such as FactCheck and Snopes [5-7].
      </p>
      <p>The rest of the paper is organized as follows. Section 2 describes the
dataset. Section 3 describes our methodology: data preprocessing, feature
extraction, and the models we used. Section 4 presents our results. Finally,
Section 5 concludes with directions for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Data</title>
      <p>
        We participated in Task 1. A detailed description of the dataset can be
found in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The training data consists of 19 files containing political debates
and speeches. Each file contains a debate or speech split into sentences; each
line holds a single sentence, its speaker, and a label assigned by expert
annotators: 0 if the sentence is not worth checking and 1 if it is. The data
comprises a total of 16,421 sentences, of which 440 are labeled as
check-worthy. The dataset is thus imbalanced, with only 2.68% of the sentences
belonging to the target (check-worthy) class. Two instances of this training
data, with their speakers and labels, are shown below:
300 SANDERS Let's talk about climate change. 0
301 SANDERS Do you think there's a reason why not one Republican has the
guts to recognize that climate change is real, and that we need to transform our
energy system? 1
The test data is a collection of seven files consisting of debates and
speeches. In this task, we do not use any external knowledge beyond
domain-independent language resources such as parsers and lexicons. Instead, we
concentrate on extracting linguistic features that can indicate the
check-worthiness of the sentences.
      </p>
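<p>The layout above (line number, speaker, sentence, 0/1 label) suggests a straightforward loader. The sketch below assumes tab-separated fields and a hypothetical file path; it also reports the class imbalance described above.</p>

```python
# Sketch: parse one debate file (assumed tab-separated: line number,
# speaker, sentence, 0/1 label, as in the examples above) and report
# the share of check-worthy sentences.
def load_debate(path):
    rows = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) == 4:
                _, speaker, sentence, label = parts
                rows.append((speaker, sentence, int(label)))
    return rows

def check_worthy_share(rows):
    positive = sum(label for _, _, label in rows)
    return 100.0 * positive / len(rows)

# Over the full training set (16,421 sentences, 440 positive) this
# yields roughly 2.68%.
```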
    </sec>
    <sec id="sec-3">
      <title>Methodology</title>
      <p>Data Preprocessing and Feature Extraction.
Data preprocessing is required to prepare the data for the systems that
classify and rank statements according to their check-worthiness score. Rich
feature extraction is needed so that the ranking relies on language constructs
rather than on heuristics or encyclopedic knowledge. Syntactic and semantic
features are extracted from both speeches and debates to represent sentences
consistently, and every sentence is converted into a vector. These features
are: lexical features, sentence embeddings, stylometric features, semantic
features, affective features, and metadata features. First of all, we perform
speaker normalization by assigning each speaker a unique id, because the same
speaker appears in the training data under different names in different
instances. The sentences in the training data are first tokenized; we then
remove stop-words, and the remaining tokens are stemmed. A single file
containing the sentences extracted from the required files is built and used as
the training data.</p>
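<p>The tokenize, stop-word-removal, and stemming steps can be sketched as below. The stop-word list and the toy suffix-stripping stemmer are stand-ins for the real resources (the paper's stemmer is Snowball-style [3]); all names here are our own.</p>

```python
import re

# Toy stop-word list; a real system would use a full lexicon.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and",
              "that", "we", "do", "you"}

def tokenize(sentence):
    # Lowercase and keep alphabetic runs only.
    return re.findall(r"[a-z]+", sentence.lower())

def stem(token):
    # Naive suffix stripping as a stand-in for a Snowball stemmer.
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: len(token) - len(suffix)]
    return token

def preprocess(sentence):
    return [stem(t) for t in tokenize(sentence) if t not in STOP_WORDS]

print(preprocess("Do you think that climate change is real?"))
# -> ['think', 'climate', 'change', 'real']
```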
      <p>We use two approaches for this task: a Support Vector Machine (SVM)
and a recurrent neural network with Long Short-Term Memory (LSTM) layers. We
use an SVM because of its simplicity, its ease of use, and its ability to avoid
over-fitting through its regularization parameter. However, since the data is
more aptly described as a sequence of sentences, a model that takes this
sequential nature into account should be a better choice. Since SVMs do not, we
also implement a recurrent neural network model with memory units in the form
of LSTM layers.</p>
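<p>The SVM side can be sketched as follows. The paper does not name its toolkit, so scikit-learn, the synthetic feature vectors, and all parameter values are our assumptions; <code>class_weight="balanced"</code> is one common way to counteract the heavy class imbalance noted above.</p>

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-in for the extracted sentence feature vectors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))          # 200 sentences, 8 features
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 1.2).astype(int)

# The regularization parameter C is what guards against over-fitting;
# class_weight="balanced" compensates for the rare positive class.
svm = SVC(C=1.0, kernel="rbf", class_weight="balanced", probability=True)
svm.fit(X, y)

# Rank sentences by the probability of the check-worthy class.
scores = svm.predict_proba(X)[:, 1]
ranking = np.argsort(-scores)
```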
      <p>For our model, we use a layer of LSTM cells with an input shape of
(batch size, number of time steps, feature dimension). There are 64 LSTM cells
in the layer, so its output shape is (64). After the LSTM layer, we add a dense
layer of 64 neurons with a Rectified Linear Unit (ReLU) activation function. A
dropout layer is also added to avoid over-fitting. Finally, a softmax output
layer produces the class probabilities for each input instance.</p>
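<p>The layer sizes above translate directly into a model definition. This is a sketch under assumptions: Keras as the framework, the time-step count, the feature dimension, and the dropout rate are ours; only the 64-unit LSTM, the 64-neuron ReLU dense layer, the dropout layer, and the 2-way softmax come from the text.</p>

```python
import tensorflow as tf

TIME_STEPS, FEATURE_DIM = 10, 32   # hypothetical input dimensions

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(TIME_STEPS, FEATURE_DIM)),
    tf.keras.layers.LSTM(64),                       # 64 LSTM cells -> (batch, 64)
    tf.keras.layers.Dense(64, activation="relu"),   # dense ReLU layer
    tf.keras.layers.Dropout(0.5),                   # guards against over-fitting
    tf.keras.layers.Dense(2, activation="softmax")  # class probabilities
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```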
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <p>The complete training data is divided into two parts: a training part
and a validation part. The validation part consists of one speech and one
debate selected from the training data. The models are trained on the training
part and evaluated on the validation part. Many models with different
parameters are generated and evaluated, and the model giving the best results
is selected. We were provided with code that helped us evaluate our models
under various evaluation metrics. We obtained better results with the recurrent
neural network model than with the SVM. The scores obtained by our run are
given in Table 1. The official evaluation measure used by CLEF'19 is MAP. For
comparison, we also show the best score in the task, achieved by the run
Copenhagen(*):
Copenhagen(*) 1 .1660 .4176 .1387</p>
      <p>ISMD16titlefield 10 .0835 .2238 .0714</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>This year we participated in Task 1 of CheckThat!: Automatic
Identification and Verification of Claims. An SVM and a recurrent neural
network model were used with supervised learning to classify and rank sentences
in political debates and speeches according to their check-worthiness score. A
rich feature set was extracted to represent the sentences as well as possible,
to tackle the class imbalance, and to avoid relying on heuristics or
encyclopedic knowledge. Two-fold cross-validation was used to select a model
for submission to the CheckThat! Lab at CLEF 2019. While there is no denying
that our overall performance is average, the initial results suggest what
should be done next. This work has shown us many interesting possibilities for
future work, and there is a lot of room for improvement. The linguistic form of
information is under-studied and can be explored in more depth for better
results. We used shallow syntactic features in this work; this could be
improved by using deeper syntactic features to represent sentences. Further
study is required to include more linguistic and non-linguistic features that
may affect the check-worthiness score of phrases. Furthermore, we could use
more complex recurrent neural network models to achieve better results. We
shall explore some of these directions in the coming days.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Recasens</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Danescu-Niculescu-Mizil</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jurafsky</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Linguistic Models for Analyzing and Detecting Biased Language</article-title>
          . In:
          <article-title>Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</article-title>
          .
          <source>vol. 1</source>
          , pp.
          <fpage>1650</fpage>
          -
          <lpage>1659</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name><surname>Elsayed</surname>, <given-names>T.</given-names></string-name>,
          <string-name><surname>Nakov</surname>, <given-names>P.</given-names></string-name>,
          <string-name><surname>Barrón-Cedeño</surname>, <given-names>A.</given-names></string-name>,
          <string-name><surname>Hasanain</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>Suwaileh</surname>, <given-names>R.</given-names></string-name>,
          <string-name><surname>Da San Martino</surname>, <given-names>G.</given-names></string-name>,
          <string-name><surname>Atanasova</surname>, <given-names>P.</given-names></string-name>:
          <article-title>Overview of the CLEF-2019 CheckThat!: Automatic Identification and Verification of Claims</article-title>
          . In:
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. LNCS</source>
          , Springer, Lugano, Switzerland, September
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Porter</surname>
            ,
            <given-names>M.F.</given-names>
          </string-name>
          :
          <article-title>Snowball: A Language for Stemming Algorithms</article-title>
          . http://snowball.tartarus.org/texts/introduction.html (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vaithyanathan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Thumbs up?: sentiment classification using machine learning techniques</article-title>
          . In:
          <source>Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, Volume 10</source>
          , pp.
          <fpage>79</fpage>
          -
          <lpage>86</lpage>
          . Association for Computational Linguistics (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corrado</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>Efficient Estimation of Word Representations in Vector Space</article-title>
          .
          <source>arXiv preprint arXiv:1301.3781</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Loria</surname>
          </string-name>
          , S.: TextBlob: Simplified Text Processing. http://textblob.readthedocs.org/en/dev/ (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Cazalens</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lamarre</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leblay</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Manolescu</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tannier</surname>
            ,
            <given-names>X.:</given-names>
          </string-name>
          <article-title>A content management perspective on fact-checking</article-title>
          . In:
          <article-title>"Journalism, Misinformation and Fact Checking" alternate paper track of</article-title>
          <source>The Web Conference</source>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name><surname>Atanasova</surname>, <given-names>P.</given-names></string-name>,
          <string-name><surname>Nakov</surname>, <given-names>P.</given-names></string-name>,
          <string-name><surname>Karadzhov</surname>, <given-names>G.</given-names></string-name>,
          <string-name><surname>Mohtarami</surname>, <given-names>M.</given-names></string-name>,
          <string-name><surname>Da San Martino</surname>, <given-names>G.</given-names></string-name>:
          <article-title>Overview of the CLEF-2019 CheckThat! Lab on Automatic Identification and Verification of Claims. Task 1: Check-Worthiness</article-title>
          , CEUR Workshop Proceedings,
          <year>2019</year>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>