<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Estimating Credibility of News Authors from their WIKI Validated Predictions</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Navya Yarrabelly</string-name>
          <email>yarrabelly.navya@research.iiit.ac.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kamalakar Karlapalem</string-name>
          <email>kamal@iiit.ac.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>In: D. Albakour, D. Corney, J. Gonzalo, M. Martinez,</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>B. Poblete, A. Vlachos (eds.): Proceedings of the NewsIR'18, Workshop at ECIR</institution>
          ,
          <addr-line>Grenoble, France, 26-March-2018, published at http://ceur-ws.org</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>DSAC, IIIT Hyderabad</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we consider a set of articles or reports by journalists or others in which they predict or promise something about the future. The problem we address is determining the credibility of the authors based on how often their predictions come true. The two specific problems we address are extracting the predictions from the articles and annotating them with various prediction attributes, and then determining the truth of these predictions, using Wikipedia as a credible source from which to extract relevant facts that can ascertain the validity of the predictions. We propose and build an end-to-end system for automated prediction validation (APV) that extracts future speculations and predictions from news articles and social media. We considered 28 news articles and extracted 97 predictions from them; the credibility scores (F-scores) for these articles range from 0.57 to 0.71.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        In newspaper articles, many journalists evaluate the
current state of affairs and predict possible future
scenarios. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] estimates from their investigations that
nearly one-third of news articles contain predictive
statements. It is therefore imperative to identify
the passages, sentences and phrases of news articles
that predict future scenarios. A person well versed
in reading articles can easily identify the
predictive aspects of a news article and, over time, gains some
assurance about which articles or news agencies
correctly predict future scenarios. It is
important and necessary, therefore, to enhance our ability to
computationally determine the credibility of
journalists based on their ability to predict future
scenarios correctly. As a step in this direction, we
take up the automatic verification of predictive
statements against facts collected from credible information
sources. This task of machine reading at scale combines the
difficulties of relevant article retrieval (finding the
significant facts) with those of machine comprehension of
content (entailment of predictions from facts).
      </p>
      <p>Copyright © 2018 for the individual papers by the papers'
authors. Copying permitted for private and academic purposes.
This volume is published and copyrighted by its editors.</p>
      <p>Consider the following prediction published on date
'd'.</p>
      <p>Example: "The Reserve Bank of India may lower the
economic growth projection for 2017-18 to 6.7 per cent
later this month, from its August forecast of 7.3 per
cent, in view of issues with GST implementation and
lower kharif output estimates." In the above
predictive sentence, we have to precisely extract and validate
only the predictive part, "The Reserve Bank of India
may lower the economic growth projection for 2017-18
to 6.7 per cent". The clause "in view of issues with GST
implementation and lower kharif output estimates" is the
premise on which the prediction is made, and "from its
August forecast of 7.3 per cent" is a supporting clause.
The reference future date for this prediction, "later this
month", is translated to the actual date 'd+30'. The facts
relevant to the predictive part that are published
after the target date 'd+30' are extracted to determine
the entailment relation from fact to prediction.</p>
      <p>Contributions: The main contributions of the
proposed approach are:</p>
      <p>(1) To translate predictions into structured queries, we
annotate the predictions with a wide range of
attributes (in Table 1). These can further be used by an
IR system to retrieve predictions made in reference to
a future time period, targeting an event, etc.</p>
      <p>(2) For each prediction, we also report a timeline of its relevant facts
and analysis, along with the fact sources confirming the truth
of the prediction. News IR systems can also come up
with recommendations or follow-up links for an article
being read, based on the predictive attributes and on the
timeline of facts extracted.</p>
      <p>(3) We propose an approach to tackle open-domain
prediction validation using Wikipedia as the unique
knowledge source.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Research has in the past focused on how to answer
questions but has not devoted attention to discerning
the accuracy of the predictions or promises made. To
the best of our knowledge, [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] is the only work that
focused on estimating the validity of predictions, by
calculating cosine similarity between predicted news
and the relevant events that actually occurred. In contrast, we
offer semantic and syntactic analysis based on the
structure of relation triplets in a predictive sentence and
incorporate domain-specific knowledge into the
system. Also, their retrieval model is limited to the (manually
collected) topics contained in the predictions. Though
applications of future information retrieval have been
studied by a number of researchers, study of the
problem of validating predictions from a Natural Language
Understanding perspective is limited. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] presents a
search engine for future and past events relevant to
a user's query. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] automatically generates summaries
of future events related to queries. Their methods
rely on extracting and processing statements
containing future temporal references. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] retrieves and ranks
predictions that are relevant to a news article using
four features: term similarity, entity-based similarity, topic
similarity, and temporal similarity.
      </p>
      <sec id="sec-2-1">
        <title>Relevance to Fact Checking and QA Systems</title>
        <p>
          To some extent, our problem can be compared with
fact-checking and question answering systems.
Though research has been done on the truth
assessment of fact statements relying on iterative peer
voting, leveraging language to infer the accuracy of fact
candidates has only started. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] calculates the credibility
of an uncertain fact by comparing it with other related facts.
Fact validity is estimated by the co-occurrence degree
of the doubt object and predicate, relying on page
counts for web queries. [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] proposed to convert a
fact-checking question into a set of factoid-style
questions and validated the answers against those retrieved
by factoid question answering systems. Our problem
differs from existing fact-checking and
question answering systems in its retrieval problem, as we
only have to validate the predictive part of a sentence
and retrieve the relevant facts that occurred within
the implicit temporal constraints imposed.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Predictions</title>
      <sec id="sec-3-1">
        <title>Predictions Extraction</title>
        <p>From each article, we annotate the sentences as
predictive or factual using the implementation from [15].
It also identifies the predictive phrase in the prediction
and resolves the scope of the prediction in a complex
sentence.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Semantic Graph Model for Predictive Sentence Simplification</title>
        <p>
          News articles often contain long and syntactically
complex sentences with relevant dependent relations
spanning various clauses. It is necessary to identify
constituents that commonly supply no more than
contextual background information. Inspired by the work
on sentence simplification using relation graphs<sup>1</sup> and
syntactic sub-structures [
          <xref ref-type="bibr" rid="ref1 ref11">11, 1</xref>
          ], we followed a syntax-based
sentence simplification approach to identify
such constituents and to annotate predictions with
various attributes. We constructed a Triplet-Level
Semantic Graph Model (TLSGM), which has
relation triplets as vertices and in which the semantic relationships
between the triplets govern the edges. From
the TLSGM, we identified the core triplets of the
predictive part of the sentence and dis-embedded other
peripheral triplets w.r.t. the head predictive phrase
extracted in Section 3.1. Only these core triplets
are then validated to determine the accuracy of the
prediction.
        </p>
        <p>Vertices: Vertices in the TLSGM represent
(subject, predicate phrase, object) relation triplets
extracted from the prediction.</p>
        <p>Edges: An edge N1 → N2 between two nodes
represents the semantic relation of node N2 w.r.t. node N1.
Edges can be formed either from the subject or object
of a node to another node describing or modifying the
noun phrase of that subject or object, following the rules for
noun descriptors, while edges formed from a predicate
to another node follow the verb descriptor rules given
below. We illustrate the descriptor rules using the example
sentences given below.</p>
        <p>1. Example 1: Mary Kom, who won Bronze at
London Olympics, still has a fifty-fifty chance of
gaining a wildcard entry to the 2016 Rio Olympics.
(Mary Kom, has, fifty-fifty chance) is the head
predictive triplet (H).</p>
        <p>2. Example 2: The Reserve Bank of India is likely
to leave interest rates unchanged in order to keep
inflation rate controlled.</p>
      </sec>
      <sec id="sec-3-4">
        <title>Rules for Noun Descriptors</title>
        <p><sup>1</sup> https://github.com/Lambda-3/Graphene</p>
        <p>Modifiers and dependents of the head of the noun
phrase of either the subject or object of a triplet are
discussed below, categorized by dependency
relation.</p>
        <p>acl:relcl, appos: A relative clause modifier from
the head noun of an NP to the head of a relative
clause. The clause introduced by this dependency only
gives additional information on the noun phrase and
does not remark on the future predictive action,
which is our focus of interest. Example 1 has the relation
acl:relcl(Kom, won) from the subject of node H, and
the edge between H and N2: (Mary Kom, won, Bronze
at London Olympics) is only an additional descriptor
of H. Node N2 and its edges are pruned from the graph.</p>
        <p>acl: An adjectival clause introduced by a noun.</p>
        <p>If the dependent is a verb and has no subject of its own,
it takes the object of the governing triplet as its subject. Example 1 has a
relation acl(chance, gaining) from the object of H, and
the edge between H and N2: (fifty-fifty chance,
gaining, wildcard entry to the 2016 Rio Olympics) further
specifies the predictive action of H; hence node
N2 is retained in the graph.</p>
        <p>If the dependent is an adjective, it will only
describe the subject or object. This relation is also used
for optional depictives to modify the nominal of which
it provides a secondary predication. Example 2 has
a relation acl(rates, unchanged) from the object of H,
and the edge between H and N2: (interest rates,
unchanged) acts as a qualifier reference for the entities
contained in the prediction.</p>
      </sec>
      <sec id="sec-3-5">
        <title>Rules for Verb Descriptors</title>
        <p>xcomp : An open clausal complement (xcomp) of
a VP, without its own subject, whose reference is
determined by an external subject.</p>
        <p>If the governor of the relation contains an object
of its own, the clause introduced by xcomp provides
attributes to the relation contained by the governor
predicate and acts as a purpose or consequence clause.
Ex: "Microsoft share values may go down by 10
dollars to give space to the new iPhone launch." We create
an edge (Microsoft share values, may go down, by 10
dollars) → (, give, space to the new iPhone launch),
governed by the relation xcomp(go, give).</p>
        <p>If the governor of the relation does not contain an
object, the dependent predicate modifies the head
predicate. We modify the predicate of the current node to
include the dependent predicate connected by the xcomp
relation. Example 2 has a relation xcomp(likely,
leave). We modify H to (The Reserve Bank of India,
is likely to leave, interest rates unchanged).</p>
        <p>ccomp for a verb: A clausal complement of a verb
is a dependent clause with an internal subject which
functions like an object of the verb or adjective. The
clause introduced further describes the future course of
action referred to by the governor predicate P. Ex: "Modi
promised that Indian GDP growth rate would cross
8% this year" has a relation ccomp(promised, cross),
which adds an edge from (Modi, promised, ) to (GDP
growth rate, would cross, 8%).</p>
        <p>advcl: An adverbial clause modifier of a VP or
S is a clause modifying the verb to introduce a
temporal, consequence, conditional or purpose clause,
adding specificity to the head clause. Example 2
has a relation advcl(leave, keep), which adds an edge
from H to triplet N2: (RBI, to keep, inflation rate
controlled). The validity of the predictive sentence
should be determined regardless of the truth
of the purpose or conditional clause. Hence node N2
and its edges are discarded from the graph.</p>
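        <p>The noun and verb descriptor rules above can be read as a pruning pass over the triplet graph. The following is a minimal sketch under stated assumptions, not the system's actual implementation: the names Triplet, PRUNED_RELATIONS and prune_graph are hypothetical, the triplets and dependency labels are written by hand rather than produced by a parser, and the xcomp predicate-merging rule and the qualifier labeling of adjectival acl nodes are left out.</p>
        <preformat>
```python
from dataclasses import dataclass, field

# Relations whose dependent node only describes or motivates the head
# (an assumption drawn from the rules in the text): relative clauses,
# appositives, and adverbial purpose/conditional clauses are pruned.
PRUNED_RELATIONS = {"acl:relcl", "appos", "advcl"}


@dataclass
class Triplet:
    subject: str
    predicate: str
    obj: str
    # Outgoing edges as (dependency relation, child triplet) pairs.
    edges: list = field(default_factory=list)


def prune_graph(head):
    """Return the core triplets reachable from the head predictive triplet."""
    core = [head]
    for relation, child in head.edges:
        if relation in PRUNED_RELATIONS:
            continue  # drop the node together with everything below it
        core.extend(prune_graph(child))
    return core


# Example 1 from the text, with hand-written triplets.
head = Triplet("Mary Kom", "has", "fifty-fifty chance")
won = Triplet("Mary Kom", "won", "Bronze at London Olympics")
gaining = Triplet("fifty-fifty chance", "gaining",
                  "wildcard entry to the 2016 Rio Olympics")
head.edges = [("acl:relcl", won), ("acl", gaining)]

core_predicates = [t.predicate for t in prune_graph(head)]
print(core_predicates)
```
        </preformat>
        <p>For Example 1 the relative-clause node (Mary Kom, won, Bronze at London Olympics) is dropped, while the head triplet and the acl node that specifies the predictive action are kept.</p>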
      </sec>
      <sec id="sec-3-6">
        <title>Prediction Attributes</title>
        <p>Each node in the TLSGM is further classified and
labeled with reference to the root node, i.e. the head
prediction node of the graph. We determined the
characteristics of the following constituents using a
number of syntactic features (dependency relation types,
constituency-based parse trees, as well as POS and
NER labels). Attributes: (Action; Event; Event
location; Event Time; Purpose / Consequence of
predictive action; Premise; Conditional clause; Qualifier
Reference, which adds specificity attributes of the
entities involved in the prediction; Numeric Quantifier
Reference; Certainty Perspective, to isolate predictive
stances taken by an author from third parties' voices
that are presented by the author).</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Extracting Relevant Facts</title>
      <p>In the following section, we describe our system for
Automatic Prediction Validation (APV), which
consists of three components: (1) a Keyword Selection
module to select keywords specific to the predictive part,
dis-embedding the linguistic peripheral clauses
identified in Section 3.2; (2) the Document Retriever module
for finding facts relevant to the prediction; and (3) a
machine comprehension model, Document Reader, for
ascertaining the accuracy of predictions from a small
collection of relevant facts.</p>
      <sec id="sec-4-1">
        <title>Keyword Selection</title>
        <p>Obtaining the pertinent facts relevant to the
prediction is in itself a complicated problem.
Predictions carry event-based and temporal constraints,
clausal complements, appositives, relative clauses, etc.,
which add specificity to or modify the action of an event.
To overcome the query drift introduced by
these clauses, we further dis-embed keywords
expressing the time constraints, premise clauses, certainty
perspective (annotated in Section 3.2) and the
speculative words used. We identify the headword of the
predictive phrase and use a rule-based approach
to detect the predictive sentence fragments
and to select keywords pertaining to the predictive
action and its attributes in the sentence. Let K be the
set of relation triplets. We add the head vertex of the
graph (TLSGM) to K and recursively add selected nodes
reached through its edges to K. We select nodes with edge labels
corresponding to Action, Event, Qualifier and
Quantifier References as described in Table 1. We then issue
proximity queries in which the subject, predicate and object
occur within a window of 7 words. We further expand
the query set iteratively by adding purpose clauses, and
expand the keywords in a query with their synonyms.</p>
        <p>Example: For the predictive sentence "Lizzie
Armitstead is predicted to win gold medal in cycling road
race at the Rio Olympics."</p>
        <p>Query: (Lizzie Armitstead win gold medal) OR
(Lizzie Armitstead win cycling road race) OR
(Lizzie Armitstead win Rio Olympics)</p>
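        <p>The query construction described above can be sketched as follows. This is only an illustration: the function name proximity_query is hypothetical, the triplets are written by hand, and the quoted-phrase form followed by a tilde and a number is the proximity syntax of Lucene's classic query parser, which is assumed here rather than confirmed as the exact form used by the system. Synonym expansion and purpose-clause expansion are not shown.</p>
        <preformat>
```python
def proximity_query(triplets, window=7):
    """Build one quoted proximity clause per (subject, predicate, object).

    Clauses are joined with OR, mirroring the example query in the text.
    """
    clauses = []
    for subject, predicate, obj in triplets:
        # Skip empty slots so that partial triplets still yield a clause.
        terms = " ".join(part for part in (subject, predicate, obj) if part)
        clauses.append('"{}"~{}'.format(terms, window))
    return " OR ".join(clauses)


# Hand-written triplets for the Lizzie Armitstead example.
triplets = [
    ("Lizzie Armitstead", "win", "gold medal"),
    ("Lizzie Armitstead", "win", "cycling road race"),
    ("Lizzie Armitstead", "win", "Rio Olympics"),
]
query = proximity_query(triplets)
print(query)
```
        </preformat>
        <p>Each clause asks for the subject, predicate and object terms to occur within 7 positions of one another, which corresponds to the window of 7 words stated above.</p>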
      </sec>
      <sec id="sec-4-5">
        <title>Candidate Relevant Facts Extraction From Wikipedia</title>
        <p>
          To extract pertinent facts which can ascertain the
accuracy of the predictions, we used Wikipedia as a
knowledge source. Wikipedia's publicly available APIs<sup>2</sup>
for accessing the revision history of each article, together with its
up-to-date knowledge marked with timestamps, make it a
reliable source for event-based prediction validation.
We used tagme<sup>3</sup> as a semantic interpreter that maps
fragments of natural language text into a weighted
sequence of Wikipedia concepts relevant to the input.
Using the query set from the above step (Section 4.1), we
extracted the top 50 documents from a local Lucene
index of the English Wikipedia dump. To further extract
the relevant snippet from an article, we only included
the article content with revision dates occurring within
the time window referenced by the temporal constraints
extracted for the validity of the prediction. We used
the word2vec Python implementation of Gensim [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ],
with Wikipedia as the corpus, to generate
embeddings that represent contextual term vectors. Inspired
by [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], we adapted the Zero Filter, Terms Filter, Exact
Sequence Filter, Normalization Filter, N-grams Filter and
Density Filter to extract and sort the relevant
candidate facts from the retrieved articles. Additionally, we
implemented the following filters:
        </p>
        <p>• Distance Filter: Assigns a score to a fact based
on the distance between the subject and object of
each triplet in the prediction.</p>
        <p>• Category Filter: For all the annotated Wikipedia
concepts in the prediction and facts, we build
category vectors and assign a score based on the
cosine similarity between the prediction category
vector and the fact category vector.</p>
        <p>• Wikipedia concept relevance: Cumulative
pairwise similarity score of the Wikipedia
concepts extracted from the prediction and from the fact's context in
the Wiki article.</p>
        <p>• Context similarity: Distributional semantic
similarity score between words and phrases from the
prediction and the fact.</p>
        <p><sup>2</sup> https://www.mediawiki.org/wiki/API:Search
<sup>3</sup> https://tagme.d4science.org/tagme/
        </p>
        <p>From these candidate facts, we kept the top 100
facts sorted by their current score.</p>
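        <p>As an illustration of the Category Filter described above, the sketch below scores a (prediction, fact) pair by the cosine similarity of their category vectors. It is a minimal sketch under assumptions: the vectors are plain category counts, the category labels are invented for the example, and the function name cosine_similarity is hypothetical; the weighting actually used by the system is not specified in the text.</p>
        <preformat>
```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[key] * b[key] for key in a.keys() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector shares nothing with any other vector
    return dot / (norm_a * norm_b)


# Hypothetical category labels for the concepts annotated in each text.
prediction_categories = Counter(["Olympic cyclists", "2016 Summer Olympics"])
fact_categories = Counter(["2016 Summer Olympics", "Road bicycle racing"])

score = cosine_similarity(prediction_categories, fact_categories)
print(round(score, 2))
```
        </preformat>
        <p>With one shared category out of two on each side, the score is 0.5; facts whose concepts fall into unrelated categories score closer to 0 and sink in the ranking.</p>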
      </sec>
      <sec id="sec-4-6">
        <title>Validation of Predictions</title>
        <p>Our approach translates the prediction and the
fact into a semantic representation, incorporating
knowledge from external sources, and then tries to determine
whether the representation of the prediction is subsumed by
that of the fact.</p>
        <p>We pass all the (prediction, fact) pairs to two
components: (1) the RATSR framework (described below)
and (2) an RTE system, which performs rich
syntactic analysis of the linguistic phenomena between the
entailment pair.</p>
      </sec>
      <sec id="sec-4-7">
        <title>Relation Alignment for Textual Similarity Recognition (RATSR)</title>
        <p>The RATSR framework has
three major components: 1. Preprocessor. Prediction
and fact pairs are annotated with a range of analytical
tools. 2. Graph Generator. Applies metrics to
compare triplets in specified annotation views to generate a
match graph over the Prediction and Fact constituents
of the entailment pair. 3. Alignment Score. Filters the
edges in the match graph and computes a scoring function
based on the alignment output.</p>
        <p>
          (1) Preprocessor: Sentence and word
segmentation; POS tagging; dependency parsing; named entity
recognition; co-reference resolution; temporal
expression identifiers; Wikipedia concepts annotator; multi-word
expression identifiers<sup>4</sup>; phrasal verb identifiers;
Quantifier and Qualifier references. These resources
are used for annotating both predictions and facts at
the sentence level and the triplet level.
        </p>
        <p>
          (2) Graph Generator: Similarity metrics are applied to
the relevant constituent pairs drawn from the
Prediction and Fact. [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] uses relation triplet similarity, computed by
calculating similarity across subject, verb and object
pairs from PPDB [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ], as a feature for stance classification.
We construct a relation match graph (RMG) by
iterating over each triplet in the prediction and the fact,
calculating similarity over various views to give a
similarity score between the two triplets being compared,
and creating an edge with the similarity score as its weight.
We propose methods for computing similarity between triplets for
the various annotations mentioned in the pre-processing
step.
        </p>
        <p><sup>4</sup> https://radimrehurek.com/gensim/models/phrases.html#id2</p>
        <p>
          • Triplet Similarity Score using Latent Semantic
Analysis Models (Score = S1): Adapting the
implementation from [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] and using multiplication
as the vector composition operator for phrases with
more than one word, we define the similarity of
SPO triplets using distributional models as given
below. The probability that fact triplet t<sub>f</sub>: (s<sub>f</sub>, v<sub>f</sub>, o<sub>f</sub>) implies
prediction triplet t<sub>p</sub>: (s<sub>p</sub>, v<sub>p</sub>, o<sub>p</sub>) is
          <disp-formula id="eq-1">
            <label>(1)</label>
            <tex-math>P(t_p \rightarrow t_f) = P(s_p \mid t_f)\,\bigl(1 - P(s_p)\bigr) + P(v_p \mid t_f)\,\bigl(1 - P(v_p)\bigr) + P(o_p \mid t_f)\,\bigl(1 - P(o_p)\bigr)</tex-math>
          </disp-formula>
          where
          <disp-formula id="eq-2">
            <label>(2)</label>
            <tex-math>P(s_p \mid t_f) = P(s_p \mid s_f) + P(s_p \mid v_f) + P(s_p \mid o_f)</tex-math>
          </disp-formula>
        </p>
        <sec id="sec-4-7-2">
          <p>
• Triplet Similarity Score using Lexical
Semantic Models (Score = S2): We calculate
similarity scores between subject, predicate and object
pairs from the prediction and the fact using synonym and
antonym similarity from WordNet, PPDB and
Wikipedia concept similarity; hyponym and
hypernym similarity from the Wikipedia and WordNet
taxonomy structures; the length of the path between
two entities in DBpedia; and numeric reference
similarity. We then combine these scores to give a
cumulative lexical similarity score between the two
triplets.</p>
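          <p>The distributional score S1 can be illustrated with a small worked computation. The sketch below follows our reading of the formula for S1, in which each prediction term x in (s_p, v_p, o_p) contributes P(x | t_f) multiplied by (1 - P(x)), and P(x | t_f) is the sum of the pairwise scores of x against s_f, v_f and o_f; the subtraction signs are an assumption, since the operators did not survive in the extracted equation. The function names and all numbers are hypothetical stand-ins for scores that would come from a distributional model.</p>
          <preformat>
```python
def conditional(term, fact_triplet, pairwise):
    """P(term | t_f): sum of pairwise scores against s_f, v_f and o_f."""
    return sum(pairwise.get((term, fact_term), 0.0) for fact_term in fact_triplet)


def implication_score(pred_triplet, fact_triplet, pairwise, prior):
    """Sum over s_p, v_p, o_p of P(x | t_f) * (1 - P(x))."""
    return sum(
        conditional(term, fact_triplet, pairwise) * (1.0 - prior.get(term, 0.0))
        for term in pred_triplet
    )


pred = ("Armitstead", "win", "gold")
fact = ("Armitstead", "finished", "fifth")
# Toy numbers standing in for scores from a distributional model.
pairwise = {("Armitstead", "Armitstead"): 0.9, ("win", "finished"): 0.2}
prior = {"Armitstead": 0.1, "win": 0.5, "gold": 0.5}

s1 = implication_score(pred, fact, pairwise, prior)
print(round(s1, 2))
```
          </preformat>
          <p>The factor (1 - P(x)) down-weights frequent terms, so a match on a rare entity name contributes more to the score than a match on a common verb.</p>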
          <p>
            (3) Alignment: The goal of the alignment component is
to decompose the text and hypothesis into semantic
constituents, and to determine which prediction triplet
should be aligned to which fact triplet. In
contrast to aligning words [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ] from prediction to fact, we
align triplets to exploit the semantic roles of the
constituents; to facilitate the analysis of the specific
prediction attributes (in Table 1) that are matched in the
fact; and also to validate against a cluster of relevant
facts. We used a maximum-weight perfect bipartite
graph matching algorithm to align triplets from the
prediction to relevant triplets from the facts.
          </p>
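          <p>The alignment step can be illustrated on a toy match graph. The sketch below finds a maximum-weight perfect matching by trying every permutation, which is a brute-force stand-in chosen to keep the example dependency-free; it is not the algorithm used in the system, and for realistic sizes a Hungarian-style routine (for example SciPy's linear_sum_assignment) would be the usual choice. The function name align_triplets and the similarity values are hypothetical.</p>
          <preformat>
```python
from itertools import permutations


def align_triplets(weights):
    """Return (best_total, pairs) for a square similarity matrix.

    weights[i][j] is the similarity between prediction triplet i and fact
    triplet j. Every permutation is tried, so this only suits tiny inputs.
    """
    size = len(weights)
    best_total, best_pairs = float("-inf"), []
    for perm in permutations(range(size)):
        total = sum(weights[i][perm[i]] for i in range(size))
        if total > best_total:
            best_total = total
            best_pairs = [(i, perm[i]) for i in range(size)]
    return best_total, best_pairs


# Toy similarity scores: rows are prediction triplets, columns fact triplets.
weights = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.1, 0.7, 0.2],
]
total, pairs = align_triplets(weights)
print(pairs)
```
          </preformat>
          <p>Here prediction triplets 0, 1 and 2 are aligned to fact triplets 0, 2 and 1 respectively, the assignment with the largest summed similarity; the summed weight can then feed the alignment score.</p>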
          <p>From the similarity scores obtained from the
RATSR framework and an RTE system [?], we set
threshold limits to label the entailment pair as true,
false or unrelated.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Results &amp; Discussion</title>
      <p>Dataset Preparation: We collected two datasets,
one of predictions in the sports domain and the other
of campaign promises made by Barack Obama. We
automatically extracted predictions from articles on the
Rio Olympics from 6 sites (denoted as A<sup>5</sup>, B<sup>6</sup>, C<sup>7</sup>, D<sup>8</sup>,
E<sup>9</sup>, F<sup>10</sup>) and manually selected the predictions that
can be objectively evaluated and those that can be
reduced to factoid questions. The 'Olympics Predictions'
dataset consists of 97 predictions made for various
events in the trials for the Rio Olympics and in the Rio Olympics
2016. We further manually annotated each prediction
as true if it came true and false otherwise. We
collected the second dataset, 'Obama Promises', from
politifact<sup>11</sup>, where each promise is labeled as 'broken',
'promise kept' or 'compromised'. We collected a set
of 257 such promises that can be objectively
evaluated.</p>
      <p>We evaluated the predictions and obtained labels
using our prediction validation system on the two
datasets. Table 3 compares these labels
against the actual labels in terms of accuracy. Table 2 presents
the reliability scores obtained (normalized by the
number of predictions) for the 6 sources we considered.</p>
      <p>Discussion: 'Obama Promises' contains
multi-sentence predictions and requires more robust NLP
modules to identify the main predictive clause that has
to be validated, as distinct from other supporting predictive
clauses (example: "Create a $10 billion fund to
help homeowners refinance or sell their homes.
The Fund will not help speculators, people who bought
vacation homes or people who falsely represented their
incomes"). The high false negative rate can be attributed
to drift in both the facts retrieval module and the
validation module, caused by other insignificant predictive
clauses. 'Olympics Predictions' contains mostly event-based
predictions, and the high false positive rate for this
dataset is partly due to omitting explicit negative
entity similarity in the context of a given prediction. For
example, the entities 'Usain Bolt' and 'Wayde van
Niekerk' are negatively related in the context of
'winning a medal at Rio Olympics'. This negative
similarity should be translated into negative triplet similarity
and further into labeling the prediction-fact entailment pair as a
contradicting relation. We plan to
address this in our future work by generating alternative
statements for a prediction, automatically
identifying the doubt unit in a sentence and filling it with
relevant comparable entities or phrases.</p>
      <p><sup>5</sup> https://www.eurosport.co.uk
<sup>6</sup> http://edition.cnn.com/
<sup>7</sup> https://www.foxsports.com.au
<sup>8</sup> http://www.couriermail.com.au/
<sup>9</sup> https://www.theguardian.com/
<sup>10</sup> https://www.thehindu.com/news
<sup>11</sup> http://www.politifact.com/truth-o-meter/promises/obameter/browse/
      </p>
      <p>[15] Navya Yarrabelly and Kamalakar Karlapalem.
Extracting predictive statements with their scope from
news articles. In The 12th International AAAI
Conference on Web and Social Media (ICWSM-18),
Submitted for publication.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Gabor</given-names>
            <surname>Angeli</surname>
          </string-name>
          , Melvin Jose Johnson Premkumar, and
          <string-name>
            <given-names>Christopher D</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <article-title>Leveraging linguistic structure for open domain information extraction</article-title>
          .
          <source>In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing</source>
          (Volume
          <volume>1</volume>
          : Long Papers), pages
          <fpage>344</fpage>
          –
          <lpage>354</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>William</given-names>
            <surname>Ferreira</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Vlachos</surname>
          </string-name>
          .
          <article-title>Emergent: a novel data-set for stance classification</article-title>
          .
          <source>In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , pages
          <fpage>1163</fpage>
          –
          <lpage>1168</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Adam</given-names>
            <surname>Jatowt</surname>
          </string-name>
          , Kensuke Kanazawa, Satoshi Oyama, and
          <string-name>
            <given-names>Katsumi</given-names>
            <surname>Tanaka</surname>
          </string-name>
          .
          <article-title>Supporting analysis of future-related information in news archives and the web</article-title>
          .
          <source>In Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries</source>
          , pages
          <fpage>115</fpage>
          –
          <lpage>124</lpage>
          . ACM,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Hiroshi</given-names>
            <surname>Kanayama</surname>
          </string-name>
          , Yusuke Miyao, and
          <string-name>
            <given-names>John</given-names>
            <surname>Prager</surname>
          </string-name>
          .
          <article-title>Answering yes/no questions via question inversion</article-title>
          .
          <source>Proceedings of COLING 2012</source>
          , pages
          <fpage>1377</fpage>
          –
          <lpage>1392</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Kensuke</given-names>
            <surname>Kanazawa</surname>
          </string-name>
          , Adam Jatowt, and
          <string-name>
            <given-names>Katsumi</given-names>
            <surname>Tanaka</surname>
          </string-name>
          .
          <article-title>Improving retrieval of future-related information in text collections</article-title>
          .
          <source>In Proceedings of the 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology</source>
          , volume
          <volume>01</volume>
          , pages
          <fpage>278</fpage>
          -
          <lpage>283</lpage>
          . IEEE Computer Society,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Nattiya</given-names>
            <surname>Kanhabua</surname>
          </string-name>
          , Roi Blanco, and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Matthews</surname>
          </string-name>
          .
          <article-title>Ranking related news predictions</article-title>
          .
          <source>In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval</source>
          , pages
          <fpage>755</fpage>
          -
          <lpage>764</lpage>
          . ACM,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Hideki</given-names>
            <surname>Kawai</surname>
          </string-name>
          , Adam Jatowt, Katsumi Tanaka, Kazuo Kunieda, and
          <string-name>
            <given-names>Keiji</given-names>
            <surname>Yamada</surname>
          </string-name>
          .
          <article-title>Chronoseeker: Search engine for future and past events</article-title>
          .
          <source>In Proceedings of the 4th International Conference on Ubiquitous Information Management and Communication</source>
          , page
          <fpage>25</fpage>
          . ACM,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Mio</given-names>
            <surname>Kobayashi</surname>
          </string-name>
          , Ai Ishii, Chikara Hoshino, Hiroshi Miyashita, and
          <string-name>
            <given-names>Takuya</given-names>
            <surname>Matsuzaki</surname>
          </string-name>
          .
          <article-title>Automated historical fact-checking by passage retrieval, word statistics, and virtual question-answering</article-title>
          .
          <source>In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)</source>
          , volume
          <volume>1</volume>
          , pages
          <fpage>967</fpage>
          -
          <lpage>975</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Dmitrijs</given-names>
            <surname>Milajevs</surname>
          </string-name>
          , Mehrnoosh Sadrzadeh, and Thomas Roelleke.
          <article-title>IR meets NLP: On the semantic similarity between subject-verb-object phrases</article-title>
          .
          <source>In Proceedings of the 2015 International Conference on The Theory of Information Retrieval</source>
          , pages
          <fpage>231</fpage>
          -
          <lpage>240</lpage>
          . ACM,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Piero</given-names>
            <surname>Molino</surname>
          </string-name>
          , Pierpaolo Basile, Annalina Caputo, Pasquale Lops, and
          <string-name>
            <given-names>Giovanni</given-names>
            <surname>Semeraro</surname>
          </string-name>
          .
          <article-title>Exploiting distributional semantic models in question answering</article-title>
          .
          <source>In Semantic Computing (ICSC), 2012 IEEE Sixth International Conference on</source>
          , pages
          <fpage>146</fpage>
          -
          <lpage>153</lpage>
          . IEEE,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Christina</given-names>
            <surname>Niklaus</surname>
          </string-name>
          , Bernhard Bermeitinger, Siegfried Handschuh, and
          <string-name>
            <given-names>Andre</given-names>
            <surname>Freitas</surname>
          </string-name>
          .
          <article-title>A sentence simplification system for improving relation extraction</article-title>
          .
          <source>arXiv preprint arXiv:1703.09013</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Ellie</given-names>
            <surname>Pavlick</surname>
          </string-name>
          , Pushpendre Rastogi, Juri Ganitkevitch, Benjamin Van Durme, and
          <string-name>
            <given-names>Chris</given-names>
            <surname>Callison-Burch</surname>
          </string-name>
          .
          <article-title>PPDB 2.0: Better paraphrase ranking, fine-grained entailment relations, word embeddings, and style classification</article-title>
          .
          <source>In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)</source>
          , volume
          <volume>2</volume>
          , pages
          <fpage>425</fpage>
          -
          <lpage>430</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Radim</given-names>
            <surname>Rehurek</surname>
          </string-name>
          and
          <string-name>
            <given-names>Petr</given-names>
            <surname>Sojka</surname>
          </string-name>
          .
          <article-title>Software Framework for Topic Modelling with Large Corpora</article-title>
          .
          <source>In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks</source>
          , pages
          <fpage>45</fpage>
          -
          <lpage>50</lpage>
          , Valletta, Malta, May
          <year>2010</year>
          . ELRA. http://is.muni.cz/publication/884893/en.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Yusuke</given-names>
            <surname>Yamamoto</surname>
          </string-name>
          and
          <string-name>
            <given-names>Katsumi</given-names>
            <surname>Tanaka</surname>
          </string-name>
          .
          <article-title>Finding comparative facts and aspects for judging the credibility of uncertain facts</article-title>
          .
          <source>Web Information Systems Engineering - WISE 2009</source>
          , pages
          <fpage>291</fpage>
          -
          <lpage>305</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>