<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Where's the Why? In Search of Chains of Causes for Query Events</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Suchana Datta</string-name>
          <email>suchana.datta@ucdconnect.ie</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Derek Greene</string-name>
          <email>derek.greene@ucd.ie</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Debasis Ganguly</string-name>
          <email>debasis.ganguly1@ie.ibm.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dwaipayan Roy</string-name>
          <email>dwaipayan.roy@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mandar Mitra</string-name>
          <email>mandar@isical.ac.in</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>IBM Research Europe</institution>
          ,
          <addr-line>Dublin</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Indian Institute of Science, Education and Research</institution>
          ,
          <addr-line>Kolkata</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Indian Statistical Institute</institution>
          ,
          <addr-line>Kolkata</addr-line>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>School of Computer Science, University College Dublin</institution>
        </aff>
      </contrib-group>
      <abstract>
<p>Traditional information retrieval systems are primarily focused on finding topically-relevant documents, which are descriptive of a particular query concept. However, when working with sources such as collections of news articles, a user might often want to identify not only those documents which describe a news event, but also documents which explain the chain of events that potentially led to that event occurring. These associations might be complex, involving a number of causal factors. Motivated by this information need, we formulate the task of causal information retrieval. We provide a literature survey of causality-related research, and explain how the proposed task differs from standard retrieval problems. We then empirically investigate the ability of popular retrieval methods to retrieve causally-relevant documents. Our results demonstrate that the performance of traditional methods is not up to the mark for this task, and that causal information retrieval remains an open challenge worthy of further research.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>Introduction</title>
      <p>
        Faced with any situation or event, it is a fundamental part of human nature
to ask `why?' and `how?', as we attempt to understand the context in which
we find ourselves. The same can be said when we seek to analyze any complex
chain of events in modern society. As a concrete example, we may want to
understand `why was the UAE-Israel peace accord signed?' so that we can analyze
its after-effects. Consequently, we often try to map events in the form of
cause-effect relations. Over the years, the study of cause-effect relations has focused on
uncovering the inter-relationships among different phenomena in terms of cause
and effect [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Sometimes these associations are immediately evident to us, such
as `smoking causes lung cancer'. However, these associations can often be rather
complex, involving a combination of causal factors that might have
led to an observed event, together with a number of further precursory
components that might have triggered events present in these causal factors in a
recursive fashion. In the example above, the immediate causal factors might include
Israel's settlement plan or Trump's diplomatic strategy [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. However, if we look
further for preceding causes of Israel's settlement plan, factors such as
acquiring global recognition or improving relations with the Middle East might be
notable. The literature emphasizes that in most situations there will be no definitive
rules around how cause-effect relations should be structured [
        <xref ref-type="bibr" rid="ref15 ref30">15,30</xref>
        ]. It is rather
difficult to explicitly enumerate the causes (in the form of short text segments) of
these complex cause-effect relationships. Rather, these causal factors are spread
across a number of multi-topical documents. In that sense, it is perhaps better
to present this information to users, leaving them the task of subjectively figuring
out the potential causes.
      </p>
      <p>Traditional search systems concentrate on matching terms between
documents and a user query. However, this might not cover the situation where a
user's search is intended to reveal the causes which led to a specific event. In this
paper, we investigate this gap in the information retrieval literature by
addressing two associated research questions:
- RQ-1: Is a new research paradigm required to address the requirements of
identifying causally-relevant information (i.e., causal information retrieval)?
- RQ-2: Is a traditional search system adequate for these requirements?
To address these key questions, in Section 2 we provide a detailed survey of the
causality research to date. In Section 3 we explore the emergence and
the challenges of the causal information retrieval task and conduct a few experiments
to investigate whether or not existing models can meet the requirements of that
task. We conclude in Section 4 with suggestions for further research in this area.</p>
    </sec>
    <sec id="sec-2">
      <title>Literature Review</title>
      <p>
        Identifying the inherent nature of cause-effect relations from text has been
explored in multiple ways, although largely in the context of textual entailment
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. However, we are interested in capturing document-level causal information,
rather than working at the sentence level. In this section, we provide a high-level
overview of various existing approaches designed to capture cause-effect
associations, which will help us to frame the problem of causal information retrieval.
Causal Relation Extraction. With the increasing popularity of deep neural
architectures, the study of causation is now more based around
counterfactuals (i.e., what might have happened?). Initially, however, causality was more closely
related to identifying semantic relations between a cause and an effect [
        <xref ref-type="bibr" rid="ref30 ref33">30,33</xref>
        ].
While sentence-level entailment has been harnessed to capture causal
characteristics [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], other authors have investigated causal relations between two queries
[
        <xref ref-type="bibr" rid="ref32">32</xref>
        ], which eventually led to the idea of using event pairs [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Later, the
authors in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] attempted to establish causality within texts by predicting event
causality, i.e., causality between event pairs (e.g. `police arrested him' because
`he killed someone'). Nonetheless, these approaches are concerned with
sentential cause-effect relation extraction, whereas we investigate causality spread
across a document collection for a given query.
      </p>
      <p>
        Graph-based approaches. Graphs provide a convenient way to visualize
cause-effect relations. While the authors in [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] proposed a non-parametric graph-based
framework to trace causal inferences, other works [
        <xref ref-type="bibr" rid="ref24 ref8">8,24</xref>
        ] used Directed Acyclic
Graphs (DAGs) to represent causal relations, and later the focus shifted to Bayesian
Networks [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ]. On the other hand, the authors in [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] focused on solving
event-pair causality relations encoded in text (e.g. we `recognized' the problem and
`took' care of it) with graphs. Thus, graph pattern based techniques primarily
focus on the identification of event pairs from text and study their patterns with
probabilistic measures. However, for our task, selecting candidate events from a
larger set of events that are likely to be related to the query event is the primary
challenge, as causal events might not hold any direct relation with the query.
Causal Knowledge Bases. Research on causality that made use of
domain-independent knowledge was first introduced in the late 1990s and continues
today. As knowledge-based causality developed gradually, researchers attempted to
explore automatic causal relation acquisition (specifically common cause-effect
propositions) [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] and to exploit semantic properties of predicates [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] which
efficiently find contradictory pairs (e.g. `destroy cancer' vs. `develop cancer'). The
knowledge-base pattern approach was extended in [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ], where a set of patterns
was initially used to create a network of causes and effects, leading to a relational
embedding method.
      </p>
      <p>
        Document Classification. Causality has also been shown to be relevant in
document classification, where the relationship between features and classes is
often complex. Paul [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] sought to answer the question of `which term features
cause documents to have the class labels that they do?', and developed a
propensity score matching technique for selecting important features. The work in [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ]
considered the causal inference task as a classification problem and, using logistic
regression, illustrated how to analyze causality in a variety of datasets. The
authors took into account factors such as missing data and measurement errors,
which often hinder downstream causal analysis.
      </p>
      <p>
        Future Scenario Generation/Prediction. Contingency discourse problems
in NLP, specifically new event prediction, consider causal relation extraction
from text data to be particularly challenging [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. The authors in [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]
initiated this research with the automatic compilation and generalization of a
sequence of events from different web corpora. However, other researchers argue
that, in order to address causality, either two of the events in consecutive
sentences must hold an inter-sentential contingent relation [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] or there should be
a pre-trained event-causality chaining database generated from web data [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
Therefore, future scenario prediction problems require prior event knowledge,
which is unlikely in our case, as users may have no prior knowledge about the
plausible causes of a query event.
      </p>
      <p>
        Question-Answering. The NLP literature highlights that question-answering
(QA) systems exploit the inherent nature of causality by disambiguating the
pervasive nature of causal relations [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], which aids the identification of inter- and
intra-sentential causal links between terms and clauses to answer `why' questions [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ].
Recently, a decision support system [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] was proposed to foresee the consequences
of queries like `Should I join the military?' or `Should I move to California?'.
Another group of researchers focused on a new variant of QA, referred to as
common sense causality identification [
        <xref ref-type="bibr" rid="ref11 ref12">12,11</xref>
        ]. This causality variant helped to
disambiguate discourse relations and reason with sentence proximity by making
use of knowledge-bases. Thus, QA approaches involve either lexical or syntactic
pattern generation, or the extraction of morphological features between cause and
effect. Therefore, they do not fit tasks where causal documents are unlikely
to have any definite pattern with the query event.
      </p>
      <p>
        Deep Causal Relations. Since 2018, causality has been incorporated into
classical CNN models [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], and has also been used to furnish a general abstraction
over deep unsupervised learning methods [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. The work in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] focused on the salient
concepts extracted from a target CNN network, which further helped to estimate
the information captured by activations in the target network. Conversely, the
authors in [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] proposed the use of a knowledge-based CNN to identify causal
relations from natural language text.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Causal Information Retrieval</title>
      <p>The techniques described in Section 2 consider causal relations either at the
sentence level or within a single document. In certain cases, these methods involve
prior knowledge about causal events, while in other cases they require some
predefined lexical, syntactic or morphological relations. However, these techniques
do not cover more nuanced causes and effects in larger document collections,
such as those we hope to capture with retrieval models. To address the research
questions introduced in Section 1, we propose a theoretical model of causality
from an IR perspective. We propose an associated workflow, and we then
investigate to what extent the requirements of causal search diverge from those of
topical search. We do this by analyzing the performance of different standard
retrieval models on a benchmark dataset with causal annotations.</p>
      <sec id="sec-3-1">
        <title>Why do we need a Causal Retrieval Model?</title>
        <p>In practice, information retrieval tasks are addressed by making use of term
overlaps between a query and documents, where the notion of relevance varies
depending on the task specifications. As an example of this, consider the query
`American military officers at Abu Ghraib prison accused', and a set of sample
top-ranked document excerpts for this query (see Table 1). Now, if the task is
to retrieve documents that are related to the topic itself, then any document
highlighting an accusation against US military officers, offensive treatment
towards detainees, leaked pictures of their torture, steps taken by the US government,
etc. is considered relevant.</p>
        <p>Table 1: Sample top-ranked document excerpts for the query `Accused American
military officers in Abu Ghraib prison'.</p>
        <p>Topical, RelDoc 1: The US is investigating a series of allegations of abuse, including
sexual humiliation, of prisoners by the US military in Iraq's Abu Ghraib jail... The first
American military intelligence soldier to be court-martialled over the Abu Ghraib abuse
scandal was sentenced today to eight months in jail...</p>
        <p>Topical, RelDoc 2: The torture in Abu Ghraib prison reflects the breakdown in the
chain of command in the US military... abuse is everywhere routine. One cornerstone of
this new US policy seems to be to outsource the task of interrogating... where torture
is routine, like Syria or Egypt...</p>
        <p>Causal, RelDoc 1: ...a female US soldier dragging an Iraqi detainee on the prison
floor like a dog on a leash, one end of which is shown tied to the man's neck... one
detainee handcuffed to a bunk bed in Baghdad's Abu Ghraib prison, his arms pulled so
wide apart that his back is arched...</p>
        <p>Causal, RelDoc 2: ...they were savagely beaten and repeatedly humiliated by
American soldiers working on the night shift at Tier 1A in Abu Ghraib during the holy
month of Ramazan... they were pressed to denounce Islam or were force-fed pork and
liquor... They forced us to walk like dogs on our hands and knees... hitting us hard on
our face and chest...</p>
        <p>As such, four of the documents in Table 1 might be deemed relevant, and retrieval
using term overlap suffices for the task. On the other hand, if the task shifts to
identifying causally-relevant documents recursively (i.e., query event, cause event,
cause event, and so on) for the same query, the notion of relevance would now be
concentrated on `why US military officers are accused' and the chain of further
precursory causal events. In that case, reports on officers' torture stories, detainees'
statements accusing officers, evidence published in newspapers, etc. are likely to
meet the requirements of the task at this level (say, level-i), and from the next level
onward (i.e., level-(i+1)) we would be finding further prevalent causes given the
effect event at level-i. Thus, only two of the documents in Table 1, labelled as
`Causal', appear to be causally relevant to the aforementioned query. The question
then arises as to whether term overlap between query and documents is adequate to
meet the current task specifications, or whether it requires a different approach; we
investigate this in the later part of this paper.</p>
        <p>Moreover, events that are eventually reported by news media are often
triggered by a series of causes spread over an extended period of time. Consequently,
making the initial query more specific by adding cause-related keywords, such as
`American military officers accusation causes (or reasons)', and then using a
traditional IR system is unlikely to retrieve relevant information, since details
regarding the causes of the event might not be explicitly reported in news articles.
However, such causality-specific information could be discovered by analyzing
a number of documents and associating the latent relationships between their
terms, along with the series of triggering causes.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Model Architecture</title>
        <p>For a causal retrieval model, we assume the user is searching for cause-related
information, and that there exists some agent or system to assist the user. Given a query
event Q = {q0, q1, ..., qn}, the user seeks documents containing causal information
related to the query, and the search is performed over a fixed document collection
C. The causal retrieval model aims to present causally-connected information in
a recursive fashion. That is, given an event, it finds possible causes for that event,
and given those causes (i.e. additional events), the system then finds what might
have caused those, successively (see Figure 1). Here, each succession represents
one level. We now formally describe this process.</p>
        <p>We assume that in an n-term query Q, a small text snippet (i.e. a sequence of
terms) is considered as the potential causal query (i.e. the effect event), which we
refer to as the initial query event Q. Therefore, Q can be represented as the 0th event
at level-0 (i.e. no retrieval is performed yet), which we denote as D^(0)_(0,0). At the
next level (i.e. level-1), given the query D^(0)_(0,0), the system displays a set of the
top-ranked k documents to the user, denoted D^(1) = {D_1, D_2, ..., D_k}. Here, each
document D_j can be further fragmented into short text segments, each of which might
be a potential event having preceding causes. Thus, we constitute D_j = {D^(1)_(j,1),
D^(1)_(j,2), ..., D^(1)_(j,i), ..., D^(1)_(j,n(D_j))}, where D^(1)_(j,i) denotes the ith
event identified at level-1 from the document retrieved at the jth rank. Assume that at
level-1, the text segment D^(1)_(j,i) is recognized as a potential event which has a
precursory chain of causes. Consequently, D^(1)_(j,i) will act as the query at level-1
and retrieve a further set of k causally-relevant documents, which will be treated as
level-2. In this situation, the effect query event D^(1)_(j,i) could be displayed to the
user as a hyperlink, which could expand to another new set of ranked documents once it
is clicked by the user. As shown in Figure 1, the candidate effect event D^(1)_(j,i) is
considered as the root of the sub-tree, and it further expands to an immediate level
with a new ranked list of documents D^(2) = {D_1, D_2, ..., D_k}. Thus, we again
fragment each document D_j into short text segments, identify a potential effect event
for the next level of retrieval, and the process continues recursively.</p>
        <p>Fig. 1: Workflow of a user's experience in an interactive causality search
interface. [The figure shows a tree of ranked lists for the query `assassination of
osama': the i-th event identified from the document retrieved at the j-th rank at
level-1 (e.g. `hostage taking moscow') acts as the root of a subtree that expands,
after the user clicks on it, into a new ranked list at level-2.]</p>
        <p>Fig. 2: (a) Cosine similarities between topical and causal documents. (b) Term
associations related to bin Laden's assassination.</p>
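The recursive workflow described above can be sketched in a few lines of Python. This is an illustrative sketch only: `retrieve`, `score` and `extract_candidate_events` are hypothetical placeholders for a real ranked-retrieval backend and a proper event-segmentation step, both of which the model leaves open.

```python
# Illustrative sketch of the recursive causal retrieval workflow.
# `retrieve`, `score` and `extract_candidate_events` are hypothetical
# placeholders, not the implementation used in the paper.

def score(query, doc):
    """Placeholder relevance score: simple term-overlap count."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query, collection, k):
    """Return the top-k documents of `collection` ranked against `query`."""
    return sorted(collection, key=lambda d: score(query, d), reverse=True)[:k]

def extract_candidate_events(doc, max_events=3):
    """Fragment a document into short text segments (candidate effect events)."""
    return [s.strip() for s in doc.split('.') if s.strip()][:max_events]

def causal_search(query, collection, k=2, max_level=2, level=1):
    """Recursively expand each candidate effect event into its own ranked
    list, mirroring the level-by-level expansion of Figure 1."""
    if level > max_level:
        return {}
    tree = {}
    for doc in retrieve(query, collection, k):
        for event in extract_candidate_events(doc):
            # Each identified event acts as the query at the next level.
            tree[event] = causal_search(event, collection, k, max_level, level + 1)
    return tree
```

In an interactive interface, each key of the returned tree would be rendered as a clickable hyperlink whose children are materialized lazily on demand, rather than eagerly as in this sketch.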
        <p>
          Evidently, at each level of this process, the main challenge involves retrieving
the top-ranked causally-relevant documents pertaining to the event. Therefore,
in the next section we investigate the problem analytically to find the answer to
our second research question: is a traditional search system adequate for the
requirements of the causal information retrieval task?
Recently, the authors in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] claimed that, for a given query event, the two sets
of relevant documents (topical and causal) will have only a partial term
overlap. With the help of a pseudo-relevance feedback technique, they used
high term sampling probabilities for terms that are infrequent in the
pseudo-relevant document set to identify causal documents. However, prioritizing
infrequent terms might not always be helpful, especially in cases where the query is quite
broad, such as `Assassination of Osama bin Laden'. We illustrate this situation
in Figure 2b, where it is clear that many terms, such as Bush, Iran, SEALs,
and typhoid, are quite infrequent. However, these terms might not lead us to the
actual causes of the event.
        </p>
        <p>
          Therefore, to investigate the nature of causally-relevant documents and how
they are coupled with topically-relevant ones, in this paper we conduct a number of
experiments on the open dataset proposed in the shared task [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. The collection
consists of 303,291 news articles collected from Telegraph India
(https://www.telegraphindia.com). The organizers also provide 25 query topics which
have a causal information need and annotated relevance judgements, each related to a
different news event. We measure the cosine similarity between the two associated
relevance judgement sets (topical and causal) based on their term associations, as
depicted in Figure 2a. We observe that news events which might have been triggered by
multiple causes, such as `Assassination of Osama bin Laden' (topic-1), or which
involve prominent figures or organizations that are often reported in news articles,
such as `Maharashtra chief minister resigned' (topic-3), have poor similarity between
the two sets of documents. This reflects the fact that the causal results for such
events have a small term overlap with the topical set. In contrast, the similarity
value increases substantially if events have either a smaller number of causal
factors, such as `Carphone Warehouse terminated deal with Channel 4' (topic-19), or
are related to less significant entities, for example `Court blocks Facebook in
Pakistan'. Such cases exhibit considerable term overlap, which we validate with
retrieval experiments later in this paper. Furthermore, we explore this association
with a couple of experiments and discuss our observations in the following subsections.
        </p>
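The cosine similarity measurement between the two judgement sets can be reproduced along the following lines. This is an illustrative sketch: the toy documents below are hypothetical, whereas in the experiments the term vectors are built from the actual judged documents of each topic.

```python
from collections import Counter
from math import sqrt

def term_vector(docs):
    """Aggregate term frequencies over a set of documents into one vector."""
    counts = Counter()
    for doc in docs:
        counts.update(doc.lower().split())
    return counts

def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = sqrt(sum(v * v for v in vec_a.values()))
    norm_b = sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy topical and causal judgement sets (hypothetical excerpts):
topical = term_vector(["abuse allegations at the prison",
                       "soldier court-martialled over the scandal"])
causal = term_vector(["detainees beaten at the prison",
                      "torture photographs leaked"])
sim = cosine_similarity(topical, causal)
```

A low `sim` value for a topic indicates that its causally-relevant documents share few terms with its topically-relevant ones, which is the pattern observed for multi-cause topics such as topic-1.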
      </sec>
      <sec id="sec-3-3">
        <title>Experimental Setup</title>
        <p>
          Since we aim to investigate the notion of causal relevance for query events, we
analyze the performance of a number of standard retrieval models, in order
to obtain an insight into whether these models can address the requirements
of causality. Firstly, we employ a retrieval framework with the BM25 ranking
function, to see whether query term overlap with documents can capture causes or
not. We name this method `BM25', as reported in Table 2. Next, we evaluate
how classical language model-based retrieval performs, specifically a linearly
smoothed language model with: (i) Jelinek-Mercer smoothing; (ii) Dirichlet smoothing
[
          <xref ref-type="bibr" rid="ref35">35</xref>
          ]. We refer to these methods as `LM-JM' and `LM-DIR' respectively.
        </p>
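For reference, the two ranking functions take their textbook forms. The sketch below is a simplified illustration rather than the tuned implementation used in the experiments; `df` stands for a document-frequency table and `coll_prob` for the collection language model, both assumptions of this sketch.

```python
from math import log

def bm25_score(query_terms, doc_terms, df, n_docs, avg_dl, k1=1.2, b=0.75):
    """Textbook BM25: tf saturation via k1, length normalization via b."""
    dl = len(doc_terms)
    score = 0.0
    for t in set(query_terms):
        tf = doc_terms.count(t)
        if tf == 0 or t not in df:
            continue
        idf = log((n_docs - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avg_dl))
    return score

def lm_jm_score(query_terms, doc_terms, coll_prob, lam=0.7):
    """Jelinek-Mercer smoothed query likelihood: linear mix of the document
    model and the collection model, scored in log space."""
    dl = len(doc_terms)
    score = 0.0
    for t in query_terms:
        p_doc = doc_terms.count(t) / dl if dl else 0.0
        p = lam * p_doc + (1 - lam) * coll_prob.get(t, 1e-9)
        score += log(p)
    return score
```

Dirichlet smoothing (LM-DIR) differs only in the mixing weight, which becomes document-length dependent, dl / (dl + mu), instead of the fixed lambda used here.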
        <p>It is evident that there are specific representative terms for each query event
which account for the difference between its corresponding topical and causal
document sets. Usually, query narrations are good sources of these representative
terms, as they clearly express the information need for the associated task. Therefore,
the next method that we investigate is `BM25-TN' (i.e. search using the Title along
with the Narration and rank by BM25), where we use topic narrations as queries,
which in turn may lead us to a causally-relevant document set. Based on the
intuition that terms close to the query event in an N-dimensional word vector space
might be useful for capturing causes, we examine whether query reformulation
with word2vec word vectors can capture causality. We make use of a pre-trained
model, built on the Telegraph collection described previously, to help us to learn
query-term associations. Once trained, this model can recommend related terms
that are similar to the query terms, which might potentially be causally relevant.
Thus, we select m nearby candidate terms for expanding the query to identify
causal documents from the target collection, ranking them using BM25 (referred
to as `BM25-W2V').</p>
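The BM25-W2V expansion step can be sketched as follows. The embedding table here is a hypothetical toy stand-in; in the experiments this role is played by the word2vec model trained on the Telegraph collection.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def expand_query(query_terms, embeddings, m=2):
    """Append the m nearest neighbours (by embedding cosine) of each
    query term to the query before ranking with BM25."""
    expanded = list(query_terms)
    for t in query_terms:
        if t not in embeddings:
            continue
        neighbours = sorted(
            (w for w in embeddings if w not in query_terms),
            key=lambda w: cosine(embeddings[t], embeddings[w]),
            reverse=True,
        )
        expanded.extend(neighbours[:m])
    return expanded

# Toy 2-d embedding table (hypothetical values for illustration only):
embeddings = {
    "assassination": [0.9, 0.1],
    "raid": [0.85, 0.2],
    "operation": [0.8, 0.3],
    "monsoon": [0.0, 1.0],
}
expanded = expand_query(["assassination"], embeddings, m=2)
```

The expanded term list is then issued to the same BM25 ranker as before; only the query changes, not the ranking function.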
        <p>Finally, we explored the method `BM25-CS' (Causality Specific), where we
make the query more specific to the causal information need. We consider that
a user might build queries including one or more causality-indicative terms. For
instance, `Assassination of Osama bin Laden causes (or reasons)' might sound
more reasonable than `Assassination of Osama bin Laden', if the search intention
is to find the causes of the event. Therefore, we made use of a subset of 25
synonyms for the term `cause' to formulate more causality-specific queries on
which to search. This set includes terms such as: {induce, lead, produce, provoke,
compel, elicit, evoke, incite, introduce, kick off, kindle, motivate, reason}.
Parameter Settings. The parameters associated with BM25, specifically k1
(used for term frequency scaling) and b (term frequency normalization by
document length), were varied in the ranges [0.1, 1.5] and [0.1, 0.9] respectively,
in steps of 0.1. We also tuned the smoothing parameter for the method LM-JM in the
range [0.1, 0.9] (varied in steps of 0.1), and for LM-DIR in [500, 2000] (varied in
steps of 100). Additionally, we varied the number of candidate expansion terms chosen
by BM25-W2V from 50 to 200, in steps of 10. Table 2 reports the optimal results
achieved by tuning parameters using grid search.</p>
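The parameter sweep amounts to an exhaustive grid search. The sketch below illustrates it with the (k1, b) ranges stated above; the evaluation function is a hypothetical stand-in for computing MAP against the relevance judgements.

```python
def grid_search(evaluate, k1_range, b_range):
    """Return the (k1, b) pair maximizing `evaluate`, a MAP-style metric."""
    best, best_score = None, float("-inf")
    for k1 in k1_range:
        for b in b_range:
            s = evaluate(k1, b)
            if s > best_score:
                best_score, best = s, (k1, b)
    return best, best_score

# Ranges used for BM25: k1 in [0.1, 1.5], b in [0.1, 0.9], steps of 0.1.
k1_range = [round(0.1 * i, 1) for i in range(1, 16)]
b_range = [round(0.1 * i, 1) for i in range(1, 10)]

# Hypothetical smooth evaluation surface, for illustration only;
# in practice evaluate(k1, b) would run retrieval and compute MAP.
best_params, best_map = grid_search(
    lambda k1, b: -((k1 - 1.2) ** 2 + (b - 0.75) ** 2),
    k1_range, b_range,
)
```

The same sweep structure applies to the LM-JM lambda, the LM-DIR mu, and the BM25-W2V expansion-term count, with the ranges given in the text.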
      </sec>
      <sec id="sec-3-4">
        <title>Observations</title>
        <p>From our results we make a number of observations. Firstly, it is clear from
Table 2 that, irrespective of the examined model architecture, the performance of
traditional retrieval algorithms drops considerably when they attempt to find causal
information, in comparison with topical search. Secondly, BM25 improves
recall marginally over linearly smoothed language models. However, the
Dirichlet-smoothed LM appears to be as efficient as BM25 in terms of precision. Thirdly,
as discussed in Section 3.4, topic narrations are expected to lead us to the causal
chain of a query event and should shift the search from topical relevance
to causal. In practice, BM25-TN proves to be competent at capturing
more cause-related information than topical in the retrieved relevant set (i.e.
increased recall), which is our primary intention. Fourthly, it is evident that blindly
formulating a query that itself mentions the search intention (i.e. BM25-CS),
or expanding a query with terms that are closely associated in the vector space
of the target collection (i.e. BM25-W2V), is not adequate to widen the search
scope; rather, it might deviate the search intention from the actual topic to a
large extent by adding noise.</p>
        <p>BM25
LM-JM
LM-DIR
BM25-TN
BM25-W2V
BM25-CS</p>
        <sec id="sec-3-4-1">
          <title>Topical</title>
        </sec>
        <sec id="sec-3-4-2">
          <title>Causal</title>
        </sec>
        <sec id="sec-3-4-3">
          <title>MAP Recall NDCG P@5 MAP Recall NDCG P@5</title>
          <p>To obtain a better understanding of document associations, we plot per-query
MAP histograms for both topical and causal relevance for three of the standard
retrieval frameworks (see Figure 3). We also show the topical-causal MAP
distributions for each of the 25 queries in Figure 4. In Section 3.3, we argued that
the cosine similarity values between the topical and causal sets of documents are
influenced by: (i) the number of causal factors (inversely proportional); (ii) whether
the query has any association with familiar entities (also an inverse relation). The
results show that the MAP values obtained for the sets of topics justify this
argument. For example, topic-6: Babri Masjid demolition case against Advani
(an Indian politician) and topic-22: Lalu Prasad Yadav (a member of the Indian
Parliament who was accused of multiple scams) convicted achieved lower MAP
for the causality task than for the topical one. Conversely, for cases such as topic-8:
Court blocks Facebook in Pakistan (a single-cause query with no prominent entity) and
topic-21: Praveen Mahajan accused (a non-public figure), traditional models
performed well in terms of causality.</p>
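The per-query MAP values discussed above derive from average precision; a minimal sketch, with a hypothetical ranking judged against separate topical and causal qrels:

```python
def average_precision(ranking, relevant):
    # AP for one query: average of precision@k at each rank k that
    # holds a relevant document; MAP is the mean of AP over all queries.
    hits, prec_sum = 0, 0.0
    for k, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            prec_sum += hits / k
    return prec_sum / len(relevant) if relevant else 0.0

# One hypothetical ranking for a query, judged against two qrel variants.
ranking = ["d3", "d1", "d7", "d2", "d9"]
topical_rel = {"d3", "d2"}   # topically relevant documents
causal_rel = {"d7", "d9"}    # causally relevant documents

ap_topical = average_precision(ranking, topical_rel)  # (1/1 + 2/4) / 2 = 0.75
ap_causal = average_precision(ranking, causal_rel)    # (1/3 + 2/5) / 2 ≈ 0.37
```

The same ranking thus scores very differently under the two relevance notions, which is exactly the gap between topical and causal MAP that Figures 3 and 4 visualise.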
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>Causal retrieval is important in situations where a user's search is focused on
finding the plausible causes of an event mentioned in the search query, for
instance when a user wishes to investigate the chain of preceding occurrences in
the context of event-driven news. In this paper, we have presented a high-level
literature survey on causality, covering the last three decades. We have observed
that there is a gap in the literature in terms of research on causality search. In
an effort to mitigate this gap, we have formally defined the problem of causal
information retrieval, and explained how it differs from traditional topical search.
Furthermore, we have conducted experiments which demonstrate that traditional
methods from the information retrieval literature, which are focused on topical
relevance, provide limited utility in finding causally relevant documents. This
reinforces the view that causal information retrieval remains an open challenge
which is worthy of further research in the IR community.</p>
      <p>Taking this into account, we have proposed an architecture for a recursive
causal retrieval model that will help users to perform an in-depth exploration of
the causality pertaining to a news event, and of the chain of causes which led
to that event. Implementing the recursive model, conducting comprehensive
offline experiments to evaluate it, and performing an extensive user
study will form the most important future extensions of our work.</p>
      <p>Acknowledgement. This work was supported by Science Foundation Ireland
(SFI) under Grant Number SFI/12/RC/2289_P2.</p>
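The recursive exploration envisaged above might be organised as a simple depth-bounded recursion. The following is a sketch under our own assumptions: `retrieve_causes` and its toy causal index are hypothetical stand-ins for an actual causal retrieval model, not part of the proposed system.

```python
def retrieve_causes(event):
    # Hypothetical single-step causal retrieval: a toy lookup table
    # stands in for a real causal retrieval model over a news collection.
    toy_causal_index = {
        "fuel price hike": ["crude oil price rise", "currency depreciation"],
        "crude oil price rise": ["supply cut by producers"],
    }
    return toy_causal_index.get(event, [])

def causal_chain(event, max_depth=2, depth=0):
    # Recursively expand each retrieved cause into its own causes,
    # yielding a tree of (event, children) pairs of depth at most max_depth.
    if depth >= max_depth:
        return (event, [])
    children = [causal_chain(c, max_depth, depth + 1)
                for c in retrieve_causes(event)]
    return (event, children)

tree = causal_chain("fuel price hike")
```

Each recursion level corresponds to one step back along the chain of causes; a deployed system would replace the lookup with a retrieval call and would need a stopping criterion beyond a fixed depth.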
    </sec>
  </body>
</article>