<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Reasoning about Conditional Beliefs for the Winograd Schema Challenge</article-title>
      </title-group>
      <contrib-group>
<contrib contrib-type="author">
          <string-name>Denis Golovin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jens Claßen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christoph Schwering</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Knowledge-Based Systems Group, RWTH Aachen University</institution>
          ,
          <addr-line>Aachen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Computer Science and Engineering, The University of New South Wales</institution>
          ,
          <addr-line>Sydney</addr-line>
          ,
          <country country="AU">Australia</country>
        </aff>
      </contrib-group>
      <abstract>
<p>The Winograd Schema Challenge has been proposed as an alternative to the Turing Test for measuring a machine's intelligence by letting it solve difficult pronoun resolution problems that cannot be tackled by statistical analysis alone. While many solutions so far are based on machine learning and natural language processing, we believe that a knowledge-based approach is better suited. In particular, we propose to employ a logic for conditional beliefs that is capable of dealing with incomplete or even inconsistent information (which commonsense knowledge often is). It does so by formalising the observation that humans often reason by picturing different contingencies of what the world could be like, and then choose to believe what is thought to be most plausible. We discuss and evaluate an implementation where relevant commonsense background information is obtained from the ConceptNet semantic network, translated into our formalism, and processed by a reasoner for our logic.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>1 Introduction</title>
      <p>The Winograd Schema Challenge (WSC) has been proposed by
Levesque, Davis and Morgenstern [2012] as an alternative to
the Turing Test for measuring a machine’s intelligence. While
agreeing with Turing’s primary concern, which is focusing
on the observable behaviour exhibited by the system, the aim
is to avoid some of the pitfalls that come with a free-form
conversation as Turing had in mind. Most importantly, in
order to pass as a person, a machine will have to resort to
deception to be able to answer questions about its height,
childhood, and so on. Moreover, the judgement of what passes
as human-like conversation is highly subjective and may vary
depending on the interrogator performing the test.</p>
      <p>The central idea behind the WSC is to instead let the
machine answer a number of questions of a specific form. As an
example, consider the statement</p>
    </sec>
    <sec id="sec-2">
<p>The delivery truck zoomed by the school bus because it was going so slow.</p>
      <p>The question is “What was going slow?”, i.e., the task is
to disambiguate to which of the two parties mentioned in
the statement (“the delivery truck”, “the school bus”) a
pronoun (“it”) refers. Each schema contains a special word (here:
“slow”) that, when replaced by an alternate word (“fast”),
changes the answer. While obvious to an English-speaking
human reader, such questions pose a real challenge for a
machine since the given statement alone does not provide enough
information to derive an answer, but rather requires having
appropriate commonsense background knowledge and being
able to reason about it. Winograd schemas are designed to be
“Google-proof”, meaning that they should not be solvable by
simple statistical analysis: If instead of a “delivery truck” the
above sentence spoke of a “sports car”, it would not be hard to
detect a correlation to words associated with fast movement.</p>
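<p>For concreteness, the structure just described can be captured in a small data type. The following Python sketch is purely illustrative; the field names and the swap method are ours, not part of the WSC specification.</p>
<preformat>
```python
from dataclasses import dataclass

@dataclass
class WinogradSchema:
    """One Winograd schema: a sentence with two candidate parties,
    a pronoun, and a special word whose alternate flips the answer."""
    sentence: str    # contains the special word
    parties: tuple   # the two candidate referents
    pronoun: str
    special: str     # e.g. "slow"
    alternate: str   # e.g. "fast"
    answer: int      # index of the correct party for `sentence`

    def alternate_sentence(self):
        """The second half of the schema pair: special word swapped."""
        return self.sentence.replace(self.special, self.alternate)

    def alternate_answer(self):
        """By design, swapping the special word flips the referent."""
        return 1 - self.answer

schema = WinogradSchema(
    sentence="The delivery truck zoomed by the school bus "
             "because it was going so slow.",
    parties=("the delivery truck", "the school bus"),
    pronoun="it",
    special="slow",
    alternate="fast",
    answer=1,  # "slow" refers to the school bus (party index 1)
)
print(schema.alternate_sentence())
print(schema.parties[schema.alternate_answer()])
```
</preformat>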
<p>The WSC competitions held so far show that the task is
indeed difficult and likely cannot be solved by classical natural
language processing alone. Robust commonsense reasoning,
on the other hand, will undoubtedly be beneficial in other areas,
in particular when it comes to domestic robots and
human-machine interaction in general.</p>
<p>Such a system will not only need access to a large corpus
of background information that preferably covers all areas
of the wide spectrum of everyday human life, but will also
need to function appropriately in light of the fact that such
information is often incomplete or even conflicting. Humans
are very capable of this, and we usually do so by picturing
different contingencies of what the world could be like, and
then choosing to believe what we think the most plausible
scenario is.</p>
      <p>In this paper, we present an approach to the WSC that is
based on a logic that formalises these concepts, the logic of
conditional beliefs [Schwering et al., 2017]. We designed and
implemented a system that
• translates the WSC sentence into a logical formula,
• extracts relevant background information from a
knowledge corpus and translates it into our logical language,
and
• performs reasoning to decide what the most plausible
answer is.</p>
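<p>To make the division of labour concrete, the three steps can be sketched as a toy pipeline. The stage functions, the miniature knowledge corpus, and the crude inference rule below are illustrative stand-ins for ReVerb, ConceptNet, and the BO reasoner; they are not our actual implementation.</p>
<preformat>
```python
# Toy pipeline mirroring the three bullet points above.

def translate(sentence):
    """Step 1: WSC sentence to a logical triple (hand-coded here)."""
    return ("zoomed_by", "#1", "#2")  # party #1 zoomed by party #2

# Step 2: relevant background facts, already in relational form.
TOY_CORPUS = [
    ("zoomed_by", "MannerOf", "travel_rapidly"),
    ("travel_rapidly", "Antonym", "go_slow"),
]

def plausible_answer(triple, query_concept, corpus):
    """Step 3: a crude stand-in for plausible reasoning. The active
    party inherits properties of its verb; if the queried concept is
    an antonym of such a property, the other party is the answer."""
    verb, active, passive = triple
    props = {b for (a, r, b) in corpus if a == verb and r == "MannerOf"}
    antonyms = {(a, b) for (a, r, b) in corpus if r == "Antonym"}
    for prop in props:
        if (prop, query_concept) in antonyms:
            return passive  # implausible for the active party
    return active

triple = translate("The delivery truck zoomed by the school bus "
                   "because it was going so slow.")
print(plausible_answer(triple, "go_slow", TOY_CORPUS))
```
</preformat>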
      <p>The rest of this paper is organised as follows. In Section 2
we discuss related work. Section 3 shows an overview of
our system’s architecture. We then present the formal
framework of our approach in Section 4. Section 5 describes the
implementation of our system, while Section 6 presents an
evaluation. Finally, we conclude in Section 7.</p>
      <sec id="sec-2-1">
<title>2 Related Work</title>
        <p>Rahman and Ng [2012] view the WSC as a special form of
pronoun resolution problem and propose an approach based
on machine learning. Their system relies on abstract feature
vectors extracted from eight different sources, among which
Google and Lexical Features turned out to be most useful
in their analysis. For the former, they construct a number of
queries from the verb governing the pronoun in the input
sentence, the words following it, and the two candidate parties,
and then count the number of results returned by the search
engine. The latter are obtained from a training set and represent
the presence or absence of certain predefined structures in the
sentence. They achieve an accuracy of 73.1% on their test set,
where the overall dataset of 941 sentence pairs was created by
30 students and split using a 70/30 training/test ratio. It should
be noted that they considered a relaxed version of the problem:
while the WSC only allows for instances where resolving the
pronoun requires background knowledge not expressed in the
statement, they did not impose this condition on sentences.</p>
        <p>Sharma et al. [2015] present a three-step method based on
their semantic parser “K-parser”: first, the input sentence is
brought into a graph representation encoding the conceptual
classes of words as well as their dependencies and semantic
relations. Next, a series of Google queries is constructed where
nominal entities are replaced by wild cards and verbs are
substituted by their synonyms. The sentences thus found are then
matched against the input by comparing their semantic graphs
by a set of predefined rules. For evaluation, they selected 71
sentences from the WSC corpus, resulting in 53 answers, 49
of which were correct.</p>
        <p>Bailey et al. [2015] introduce a new formalism for reasoning
about the correlation between sentences, which can either be
positive or negative. Intuitively, the correlation between two
sentences is positive when after hearing the first sentence, the
hearer views the second sentence as more plausible than before.
Negative correlation on the other hand lowers the plausibility.
For example,</p>
      </sec>
    </sec>
    <sec id="sec-3">
<p>The delivery truck zoomed by the school bus.</p>
      <p>is positively correlated with</p>
    </sec>
    <sec id="sec-4">
<p>The school bus was going so slow.</p>
      <p>but negatively with</p>
    </sec>
    <sec id="sec-5">
<p>The truck was going so slow.</p>
<p>The authors of the aforementioned paper present the
correlation calculus as a deductive system to formalise this idea.
It extends classical predicate calculus by a new binary operator
expressing that two formulas F and G are positively correlated;
a negative correlation between F and G is expressed as a positive
correlation between F and ¬G. They show how
their formalism can be used to derive such correlation
formulas from a set of formulas encoding background information
and thus justify the answers for several WSC instances. While
their work makes an important step towards solving the WSC,
they admit that it so far does not address the important problem
of generating the needed commonsense background axioms
(these were hand coded in their examples), nor did they come
up with an implementation that would allow automating the
reasoning process and evaluating the approach against others.
Their formalism also currently lacks the capability to handle
non-monotonic aspects such as the statement</p>
    </sec>
    <sec id="sec-6">
<p>If the shelf is level, then the sculpture will not roll off.</p>
<p>which, when read as a default rule, allows us to withdraw the
conclusion that the sculpture does not roll off in the face of new
information, e.g., that there is an earthquake.</p>
      <p>Finally, Liu et al. [2017] propose another approach based on
machine learning, which is used (a) in a knowledge acquisition
step to identify cause-effect relationships between commonly
used words from large text corpora and (b) to find association
relationships in the form of conditional probabilities between
pairs of discrete events. They manually selected a subset of
70 Winograd schemas, among which their method achieves an
accuracy of 70%.</p>
      <sec id="sec-6-1">
<title>3 System Architecture Overview</title>
        <p>The architecture of our system is visualised in Figure 1. In
a preprocessing step, tuples that represent the acting parties,
their actions, and their relationships are extracted from the
input Winograd schema sentences. For instance, the Winograd
schema from Section 1 leads to tuples (truck, zoomed by, bus)
and (zoomed by, goes slow). Then a commonsense repository,
ConceptNet, is employed to derive further information about
the extracted words. These facts come in the form of binary
relations; for instance, it may tell us that zooming by is a form
of travelling rapidly. From these facts, a conditional
knowledge base is constructed in the logic BO, from which
the system aims to infer through automated reasoning how to
disambiguate the pronoun in the most plausible way. In the
school-bus scenario, the system compares the plausibility of
party #1 (the truck) versus party #2 (the school bus) being the
slow party.</p>
      </sec>
      <sec id="sec-6-2">
<title>4 A Logic of Plausible Beliefs</title>
        <p>Reasoning in our framework is based on a logic of plausible
beliefs [Schwering et al., 2017] called BO. This logic features
non-monotonic conditionals of the form</p>
        <p>If some premise holds,
then plausibly some consequent is true,
which enables us to represent and reason about more and less
plausible contingencies. We expect this to be crucial in the
context of the WSC, where we aim to find the correct answer
by comparing the plausibility of the different candidates. In
the following we briefly introduce the formal machinery of
this logic.</p>
<p>The terms in our language consist of infinitely many
first-order variables, usually denoted as x or y, and infinitely many
(standard) names n ∈ N = {#1, #2, ...}. Formulas are of the
form
• P(t1, ..., tk), (t1 = t2), (α ∧ β), ¬α, ∀x α,
• B(φ ⇒ ψ),
• O{φ1 ⇒ ψ1, ..., φm ⇒ ψm},
where P is a predicate symbol other than =, the ti are terms, α and
β are formulas, and the φi and ψi are objective formulas, that is,
the φi and ψi mention no B or O operators. A formula is ground
when it contains no variables. A sentence is a formula without
free variables. Let ∨, ∃, ⊃, ≡ be the usual abbreviations, let ⊤
stand for ∀x (x = x), and let ⊥ abbreviate ¬⊤.</p>
        <p>
Standard names can be seen as special constants that
represent all the individuals in the universe. They considerably
simplify the semantics compared to the classical Tarskian
model theory; in particular, quantification can be handled by
simply substituting standard names. The formula B(φ ⇒ ψ)
intuitively expresses a conditional belief, namely the belief
that if φ is true, then plausibly ψ is also true. The formula
O{φ1 ⇒ ψ1, ..., φm ⇒ ψm} goes one step further and says
that the conditional beliefs B(φi ⇒ ψi) are all that is
believed; everything that is not a consequence of these
conditional beliefs is implicitly not believed. The concept is called
only-believing; it generalises Levesque’s only-knowing [
          <xref ref-type="bibr" rid="ref8">1990</xref>
          ]
to the case of conditional beliefs and shares many properties
with Pearl’s System Z [
          <xref ref-type="bibr" rid="ref8">1990</xref>
          ]. Only-believing is particularly
useful to capture the idea of a conditional knowledge base: the
knowledge base is all the agent believes.
        </p>
<p>To investigate such problems, we define a possible-worlds
semantics for BO. A world is a set of ground non-equality
atoms. An epistemic state ~e is an infinite sequence of sets
of worlds e1, e2, ... such that e1 ⊆ e2 ⊆ ... and for some
p ∈ {1, 2, ...}, ep = ep+1 = ....</p>
<p>Intuitively, a world is a truth assignment to the ground
non-equality atoms, that is, to all P(n1, ..., nj) where the ni
are standard names. An epistemic state stratifies sets of such
worlds with the subset relation. The intuition behind an
epistemic state ~e is to model a system of spheres: e1 contains
the most-plausible worlds, e2 adds the second-most-plausible
worlds, and so on. Note that in any epistemic state ~e only
finitely many different sets of worlds are allowed, since the
definition requires that ep = ep+1 = ... for some p. (Just as well
we could have defined an epistemic state to be a non-empty
finite sequence of supersets; the present definition however is
often easier to work with.)</p>
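<p>These definitions are straightforward to mirror in code. In the sketch below (our illustrative rendering, not part of the formal development), a world is a frozenset of ground atoms and an epistemic state is given by its finitely many distinct spheres, the last of which stands for the constant tail.</p>
<preformat>
```python
# Worlds as frozensets of ground atoms; an epistemic state as its
# finitely many distinct spheres e1 ⊆ e2 ⊆ ... (the constant tail
# ep = ep+1 = ... is represented by the last listed sphere).

w1 = frozenset({("Slow", "#2")})                  # only the bus is slow
w2 = frozenset({("Slow", "#1")})                  # only the truck is slow
w3 = frozenset({("Slow", "#1"), ("Slow", "#2")})  # both are slow

def sphere(e, p):
    """Return sphere e_p, using the constant tail beyond the list."""
    return e[min(p, len(e)) - 1]

def is_epistemic_state(e):
    """Check the defining condition e1 ⊆ e2 ⊆ ... on the listed part."""
    return all(e[i].issubset(e[i + 1]) for i in range(len(e) - 1))

e = [{w1}, {w1, w2}, {w1, w2, w3}]
print(is_epistemic_state(e), sphere(e, 5) == {w1, w2, w3})
```
</preformat>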
<p>Given an epistemic state ~e and a world w, we can define the
truth of a sentence α in BO, written ~e, w |= α. We let α[x/n]
denote the result of substituting all free occurrences of x in α by n.
The objective part of the semantics is defined inductively as
follows:
1. ~e, w |= P(n1, ..., nj) iff P(n1, ..., nj) ∈ w;
2. ~e, w |= (n1 = n2) iff n1 and n2 are identical names;
3. ~e, w |= (α ∧ β) iff ~e, w |= α and ~e, w |= β;
4. ~e, w |= ¬α iff not ~e, w |= α;
5. ~e, w |= ∀x α iff ~e, w |= α[x/n] for all n ∈ N.</p>
        <p>Notice that Rule 5 handles quantification by substitution of
standard names.</p>
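<p>Rules 1 to 5 translate directly into a recursive evaluator. The following sketch uses our own encoding of formulas as nested tuples, with a finite universe of names standing in for the infinite set N; this suffices here because only the listed names ever occur in the toy vocabulary.</p>
<preformat>
```python
# Objective formulas as nested tuples:
#   ("atom", P, n1, ..., nk), ("eq", n1, n2),
#   ("and", a, b), ("not", a), ("forall", x, a).

NAMES = ("#1", "#2")  # finite stand-in for the infinite name set N

def subst(formula, var, name):
    """Substitute standard name `name` for variable `var`."""
    if isinstance(formula, tuple):
        return tuple(subst(f, var, name) for f in formula)
    return name if formula == var else formula

def holds(w, f):
    """Rules 1-5: truth of an objective sentence in world w."""
    op = f[0]
    if op == "atom":
        return f[1:] in w                          # Rule 1
    if op == "eq":
        return f[1] == f[2]                        # Rule 2
    if op == "and":
        return holds(w, f[1]) and holds(w, f[2])   # Rule 3
    if op == "not":
        return not holds(w, f[1])                  # Rule 4
    if op == "forall":                             # Rule 5
        return all(holds(w, subst(f[2], f[1], n)) for n in NAMES)
    raise ValueError(op)

w = {("Slow", "#2")}
f = ("forall", "x", ("not", ("and", ("atom", "Slow", "x"),
                             ("eq", "x", "#1"))))
print(holds(w, f))  # no slow party is #1 in this world
```
</preformat>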
<p>Before we proceed with the semantics of conditional belief,
we define the plausibility ⌊~e | φ⌋ of an objective sentence φ in
~e as the index of the first sphere consistent with φ:
⌊~e | φ⌋ = min {p | p = ∞ or ~e, w |= φ for some w ∈ ep},
where ∞ ∉ {1, 2, ...} represents an “undefined” plausibility,
with the understanding that p + 1 = ∞ and p &lt; ∞ for all
p ∈ {1, 2, ...}. The semantics of beliefs is then as follows:
6. ~e, w |= B(φ ⇒ ψ) iff for all p ≥ 1, if p ≤ ⌊~e | φ⌋ and
w′ ∈ ep, then ~e, w′ |= (φ ⊃ ψ);
7. ~e, w |= O{φ1 ⇒ ψ1, ..., φm ⇒ ψm} iff ~e, w |= B(φi ⇒ ψi)
for all i and every sphere is maximal, i.e., ep contains every
world w′ with ~e, w′ |= (φi ⊃ ψi) for all i such that
⌊~e | φi⌋ ≥ p.
The intuitive meaning of Rule 6 is that B(φ ⇒ ψ) holds iff the
most-plausible φ-worlds also satisfy ψ. Note that B(φ ⇒ ψ)
is vacuously true when there is no φ-world. As a consequence,
B(¬φ ⇒ ⊥) can be used to express that φ is known.</p>
<p>Only-believing O{φ1 ⇒ ψ1, ..., φm ⇒ ψm} has the effect
of B(φi ⇒ ψi) plus maximising every sphere: it requires the
sphere ep to contain all worlds that satisfy all (φi ⊃ ψi) for
which ⌊~e | φi⌋ ≥ p. It turns out that this definition gives rise
to a procedure that generates the unique system of spheres ~e
that corresponds to O{φ1 ⇒ ψ1, ..., φm ⇒ ψm} [Schwering
et al., 2017], which means that a conditional knowledge base
has a unique semantic representation as a system of spheres.</p>
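<p>The sphere-generation idea can be illustrated over a finite world space. The following is our own propositional re-implementation of the concept (formulas are Python predicates, and a consistent conditional KB is assumed); it is not the algorithm of Schwering et al., only a sketch of the same construction.</p>
<preformat>
```python
from itertools import combinations

def only_believe(worlds, conditionals):
    """Build the distinct spheres e1, e2, ... for a conditional KB:
    each sphere contains exactly the worlds satisfying phi -> psi for
    every conditional whose premise was not satisfiable earlier."""
    spheres, active = [], list(conditionals)
    while True:
        sph = {w for w in worlds
               if all((not phi(w)) or psi(w) for (phi, psi) in active)}
        spheres.append(sph)
        # Premises satisfiable in this sphere get this plausibility,
        # so they no longer constrain later (larger) spheres.
        settled = [c for c in active if any(c[0](w) for w in sph)]
        active = [c for c in active if c not in settled]
        if not settled:  # nothing new settles: construction is done
            return spheres

# All 8 worlds over atoms b(ird), p(enguin), f(lies).
atoms = ("b", "p", "f")
worlds = [frozenset(s) for r in range(4) for s in combinations(atoms, r)]
conds = [
    (lambda w: "b" in w, lambda w: "f" in w),      # birds fly
    (lambda w: "p" in w, lambda w: "b" in w),      # penguins are birds
    (lambda w: "p" in w, lambda w: "f" not in w),  # penguins do not fly
]
e = only_believe(worlds, conds)
print(len(e), frozenset({"p", "b"}) in e[1], frozenset({"p", "b"}) in e[0])
```
</preformat>
<p>On this example the construction yields three distinct spheres: penguin-worlds are absent from the most-plausible sphere but appear in the second, mirroring the System Z-like ranking discussed above.</p>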
        <p>While reasoning in the logic as presented here is
undecidable given its first-order nature, a decidable (and sometimes
tractable) variant based on the theory of limited belief has been
developed and implemented [Schwering and Lakemeyer, 2016;
Schwering, 2017].</p>
      </sec>
      <sec id="sec-6-3">
        <title>5 Implementation</title>
<p>In order to automate the process of solving the WSC through
reasoning with conditional beliefs, we need to (a) translate
a given WSC instance into formulas of BO, (b) provide
necessary background knowledge in the form of a KB of BO
formulas, and (c) call the BO reasoner to answer the query.
Here we adopt an approach based on ConceptNet 5 [Speer
and Havasi, 2012], a large semantic network that represents
knowledge in a graph structure where each node is a concept
(e.g., “zoom”, “going fast”) and each edge corresponds to one
of 28 different relations (e.g., “IsA”, “RelatedTo”). Apart from
manually entered data, the network integrates information
from various sources, including subsets of the OpenCyc and
DBPedia ontologies as well as the Wiktionary and WordNet
3.0 dictionaries. The origin of each piece of information is kept
in metadata associated with each edge, in particular a numerical
weight representing the strength of the assertion. Weights have
a default value of 1 but can be higher or lower, where a negative
value expresses that the assertion is false or irrelevant.</p>
        <p>As opposed to fully-fledged commonsense knowledge bases
such as Cyc [Matuszek et al., 2006], ConceptNet 5 does not
come with a pre-defined formal semantics and/or inference
engine. It can rather be viewed as an intermediate step between
the intuitive, informal knowledge used by humans on the one
hand and a formal, logic-based representation on the other.
This gives us, as developers, the freedom to attach meaning to
assertions in a way that suits our application.</p>
<p>In our case, we identified 24 of the 28 relations to be useful
and mapped them to conditional beliefs. Consider the IsA
relation: each corresponding tuple has the form IsA(A, B), where
A and B are concepts. Assuming that we represent concepts by
unary predicates, we can encode the assertion as an ordinary
belief (expressing that it is plausible to assume that if x is an
A, it is also a B):</p>
<p>⊤ ⇒ ∀x (A(x) ⊃ B(x)) (1)</p>
<p>However, the statement will then only hold in the most
plausible sphere, which essentially amounts to doing monotonic
reasoning only. We hence instead use the following encoding:</p>
<p>A(x) ⇒ B(x) (2)</p>
<p>That is, the most plausible A’s are assumed to be B’s. Similar
mappings have to be defined for the remaining relations. The
ones we used are summarised in Table 1, where A ⇒ B stands
for formula (2), A ⇒ ¬B is shorthand for A(x) ⇒ ¬B(x),
etc.</p>
<p>5.2 Representing Winograd Schemas in BO</p>
<p>To create the knowledge base, we start with the given WSC
sentence, e.g.</p>
      </sec>
    </sec>
    <sec id="sec-7">
<p>The fish ate the worm. It was tasty.</p>
<p>Here, the word tasty can be replaced by hungry, and the
reference of it then changes from worm to fish. Note that this is
a structure that many (but not all) Winograd schemas exhibit:
they start with a statement (unambiguously) describing a
situational relationship between two parties (“the fish ate the
worm”), followed by an ambiguous description of an effect of it
(“it was tasty”). We use the natural language processing tool
ReVerb [Fader et al., 2011] to obtain a structured representation
of the former in the form of a triple (fish, ate, worm).</p>
<p>Note that it is possible to replace both parties with
placeholders and still understand the sentence and resolve the pronoun.
The requirement that the names of the two parties do not carry
any relevant information for solving a WSC, actually intended
as a main difficulty, can here be used to our advantage: instead
of “fish” we simply denote the first party by the standard name #1,
and similarly the “worm” as second party by #2. We will not
have to worry about reasoning about the relationship between
a fish and a worm, but can rather concentrate on the relation
between the concepts ate and tasty.</p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>ConceptNet relations used by our system, their descriptions, and their mappings to BO conditionals.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Relation</th><th>Description</th><th>Mapping</th></tr>
          </thead>
          <tbody>
            <tr><td>IsA</td><td>A is a kind of B</td><td>A ⇒ B</td></tr>
            <tr><td>NotIsA</td><td>A is not a kind of B</td><td>A ⇒ ¬B</td></tr>
            <tr><td>RelatedTo</td><td>A is related to B</td><td>A ⇒ B and B ⇒ A</td></tr>
            <tr><td>DerivedFrom</td><td>A is derived from B</td><td>A ⇒ B</td></tr>
            <tr><td>Antonym</td><td>A is the opposite of B</td><td>¬(A ∧ B)</td></tr>
            <tr><td>MotivatedBy</td><td>A is motivated by B</td><td>B ⇒ A</td></tr>
            <tr><td>Synonym</td><td>A is a synonym of B</td><td>A ≡ B</td></tr>
            <tr><td>FormOf</td><td>B is the root word of A</td><td>A ≡ B</td></tr>
            <tr><td>HasPrerequisite</td><td>in order for A to happen, B needs to happen</td><td>A ⇒ B</td></tr>
            <tr><td>UsedFor</td><td>A is used for B</td><td>A ⇒ B</td></tr>
            <tr><td>NotUsedFor</td><td>A is not used for B</td><td>A ⇒ ¬B</td></tr>
            <tr><td>PartOf</td><td>A is part of B</td><td>B ⇒ A</td></tr>
            <tr><td>HasA</td><td>B belongs to A, either as an inherent part or due to a social construct of possession</td><td>A ⇒ B</td></tr>
            <tr><td>CapableOf</td><td>A is capable of B</td><td>A ⇒ B</td></tr>
            <tr><td>NotCapableOf</td><td>A is not capable of B</td><td>A ⇒ ¬B</td></tr>
            <tr><td>ObstructedBy</td><td>A is a goal that can be prevented by B</td><td>B ⇒ ¬A</td></tr>
            <tr><td>HasProperty</td><td>A has B as a property</td><td>A ⇒ B</td></tr>
            <tr><td>NotHasProperty</td><td>A does not have B as a property</td><td>B ⇒ ¬A</td></tr>
            <tr><td>Desires</td><td>A is a conscious entity that typically wants B</td><td>A ⇒ B</td></tr>
            <tr><td>NotDesires</td><td>A is a conscious entity that typically does not want B</td><td>A ⇒ ¬B</td></tr>
            <tr><td>CreatedBy</td><td>B is a process or agent that creates A</td><td>A ⇒ B</td></tr>
            <tr><td>DefinedAs</td><td>A and B overlap considerably in meaning, and B is a more explanatory version of A</td><td>A ⇒ B</td></tr>
            <tr><td>Entails</td><td>if A is happening, B is also happening</td><td>A ⇒ B</td></tr>
            <tr><td>MannerOf</td><td>A is a specific way to do B</td><td>A ⇒ B</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-7-3">
<p>The triple (fish, ate, worm) now gives us the certain
information that the fish (#1) is the active party in the event “ate”,
which we represent by adding</p>
<p>¬ate(#1) ⇒ ⊥ (3)</p>
<p>to the KB. Next, we consult ConceptNet to check if more
information about the two parties is available. Specifically, we try to
use the fact that the worm (#2) is the passive party in the event.
We hence search for a concept with similar properties as
an eaten worm, that is to say, something which is commonly
known to be eaten, i.e., something which is the passive party
in the event of ate. After determining eaten to be the passive
form of ate through the common root word eat, we query
ConceptNet for instances of the ReceivesAction relation (which
we only use for this specific purpose). In the example,
ConceptNet suggests that the concept apple is commonly known to
be eaten, since ReceivesAction(apple, eaten) has a high weight.
We hence represent that the worm is the passive party by the
formula</p>
<p>⊤ ⇒ apple(#2) (4)</p>
<p>The new expression can be read as: it is believed that the second
party is active in being an “apple”. The fact that the passive
form of ate is eaten is further represented as follows:</p>
<p>ate(x) ⇒ ¬eaten(x)
¬ate(x) ⇒ eaten(x) (5)</p>
      </sec>
    </sec>
    <sec id="sec-8">
<p>The new beliefs can be read as: it is believed that if party x is
active in the event “ate”, it presumably is not active in the event
“eaten”, and vice versa. Finally, we add the belief that if a party
is active in the event of eaten, then it is presumably active in
the event of being an apple:</p>
<p>eaten(x) ⇒ apple(x) (6)</p>
<p>Note that beliefs of the forms (4) to (6) are only generated in
case a corresponding ReceivesAction edge actually exists (which is
often not the case); only those of the form (3) are guaranteed
to be present in every WSC instance.</p>
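<p>The generation of formulas (3) to (6) can be sketched as follows. The ConceptNet and root-word lookups are faked with dictionaries, and the formulas are kept as plain strings in our notation; this is an illustration of the translation scheme, not our actual code.</p>
<preformat>
```python
# Sketch of Section 5.2's KB generation for "The fish ate the worm."

RECEIVES_ACTION = {"eaten": "apple"}  # stands in for ReceivesAction(apple, eaten)
PASSIVE_FORM = {"ate": "eaten"}       # via the common root word "eat"

def winograd_kb(triple):
    """Build the conditional KB for a (party1, verb, party2) triple."""
    _, verb, _ = triple
    kb = [f"¬{verb}(#1) ⇒ ⊥"]                       # formula (3), always
    passive = PASSIVE_FORM.get(verb)
    concept = RECEIVES_ACTION.get(passive)
    if concept:                    # (4)-(6) only if such an edge exists
        kb.append(f"⊤ ⇒ {concept}(#2)")             # formula (4)
        kb.append(f"{verb}(x) ⇒ ¬{passive}(x)")     # formula (5)
        kb.append(f"¬{verb}(x) ⇒ {passive}(x)")
        kb.append(f"{passive}(x) ⇒ {concept}(x)")   # formula (6)
    return kb

for formula in winograd_kb(("fish", "ate", "worm")):
    print(formula)
```
</preformat>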
      <sec id="sec-8-1">
        <title>5.3 Reasoning</title>
<p>Starting with the formulas generated for the WSC sentence
as demonstrated in Section 5.2, we search for a path in
ConceptNet that connects the two target concepts ate and tasty
identified from the WSC sentence (using normalisation such as
the elimination of stop words and word stemming where needed). All
relations on this path are mapped to formulas as described in
Section 5.1, instantiated with the standard names #1, #2
representing the two parties. We then call the BO reasoner on the
resulting KB with the two queries</p>
<p>tasty(#1) ∨ tasty(#2) ⇒ tasty(#1)
tasty(#1) ∨ tasty(#2) ⇒ tasty(#2) (7)</p>
<p>A decision is then made if exactly one of them comes out as
true and the other as false.</p>
        <sec id="sec-8-1-1">
<title>6 Evaluation</title>
        </sec>
      </sec>
      <sec id="sec-8-2">
        <title>6.1 Quantitative Evaluation</title>
<p>We performed a preliminary experimental evaluation of our
implemented system on the publicly available Winograd schema
collection (<ext-link ext-link-type="uri" xlink:href="http://www.cs.nyu.edu/faculty/davise/papers/WinogradSchemas/WS.html">http://www.cs.nyu.edu/faculty/davise/papers/WinogradSchemas/WS.html</ext-link>)
containing 144 schemas in total. Since ConceptNet
does not directly support querying paths between concepts, we
search in a two-stage process. First, a subgraph of ConceptNet
is generated in memory, which we do by starting with the
target concepts and successively adding adjacent edges and
nodes, where for each node the (at most) 30 edges with the highest
weight are considered. We then search for the actual path by
means of Dijkstra’s algorithm, alternately using edge costs
of 1/weight or 1 (the latter because ConceptNet’s
weights are sometimes not reliable).</p>
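<p>The second stage is a standard shortest-path search. The sketch below runs Dijkstra with 1/weight edge costs over an invented miniature subgraph; the nodes and weight values are ours and purely illustrative.</p>
<preformat>
```python
import heapq

# Toy weighted ConceptNet-like subgraph: node -> [(neighbour, weight)].
# A higher weight means a stronger assertion, hence cost 1/weight.
GRAPH = {
    "zoom":   [("whizz", 2.0), ("sound", 0.5)],
    "whizz":  [("travel rapidly", 2.0)],
    "sound":  [("wave", 1.0)],
    "wave":   [("travel rapidly", 0.2)],
    "travel rapidly": [("going fast", 2.0)],
    "going fast": [],
}

def cheapest_path(graph, start, goal):
    """Dijkstra with edge costs 1/weight; returns the node path."""
    queue = [(0.0, start, [start])]
    done = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if node in done:
            continue
        done.add(node)
        for nxt, weight in graph.get(node, []):
            if nxt not in done:
                heapq.heappush(queue, (cost + 1.0 / weight, nxt, path + [nxt]))
    return None  # no connection within the subgraph

print(cheapest_path(GRAPH, "zoom", "going fast"))
```
</preformat>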
<p>Table 2 shows the results for increasing depth limits in
terms of the number of decisions made (recall that this means
that exactly one of the two queries comes out true) and the
fraction of correctly answered sentences. While obviously a
search depth of zero yields no results, neither does a limit of
one, meaning none of the concepts mentioned in Winograd
schemas are directly connected in ConceptNet, showing that
these schemas are indeed hard in some sense. Once we set
the limit to two or above, decisions are made with increasing
accuracy (though not much better than guessing). With the
current implementation we were not able to increase the depth
beyond three due to memory limitations.</p>
        <table-wrap id="tab2">
          <label>Table 2</label>
          <caption>
            <p>Decisions made and correct answers for increasing depth limits.</p>
          </caption>
          <table>
            <thead>
              <tr><th>depth</th><th>decisions</th><th>correctly answered</th></tr>
            </thead>
            <tbody>
              <tr><td>0</td><td>0</td><td>0</td></tr>
              <tr><td>1</td><td>0</td><td>0</td></tr>
              <tr><td>2</td><td>103</td><td>52 (50.5%)</td></tr>
              <tr><td>3</td><td>159</td><td>85 (53.5%)</td></tr>
            </tbody>
          </table>
        </table-wrap>
<p>After looking into some of the examples where wrong or no
decisions were made, we adjusted our method slightly. First,
we know that exactly one of the two parties is referred to by
the pronoun. Moreover, while one party (the fish) actively
participates (eats) in the described scenario as encoded by (3),
often the other party (the worm) is not the active one. We
hence changed the antecedents of the belief conditionals in the
queries (7) to formulas of the form</p>
        <p>(tasty (#1) tasty (#2)) ^ :ate(#2 )
where denotes exclusive-or. Furthermore, we found that the
returned paths disproportionally often contain the RelatedTo,
Synonym and IsA relations. To discourage such overuse, we
introduced a penalty on the weights for the most frequently
appearing relations. These modifications, with depth limit
three, resulted in 93 decisions and a success rate of 54:8%,
i.e., 51 correct and 42 incorrect answers.</p>
      </sec>
      <sec id="sec-8-3">
        <title>6.2 Qualitative Evaluation</title>
<p>To illustrate our system’s operation, let us look in more depth
at a schema successfully resolved by it, namely “The
delivery truck zoomed by the school bus because it was
going so fast/slow”. ReVerb maps the sentence to the triple
(delivery truck, zoomed by, school bus), yielding the two
target concepts zoomed by and going fast (or going slow,
respectively). A search in ConceptNet returns the following
path between them: zoomed –RelatedTo– zoom –Synonym–
whizz –RelatedTo– sound –RelatedTo– wave –RelatedTo–
water –RelatedTo– boat –IsA– going fast. The KB constructed
from this is then</p>
<p>{¬zoomed(#1) ⇒ ⊥;
zoomed(n) ⇒ zoom(n);
zoom(n) ⇒ zoomed(n);
⊤ ⇒ (whizz(n) ≡ zoom(n));
whizz(n) ⇒ sound(n);
sound(n) ⇒ wave(n);
wave(n) ⇒ sound(n);
wave(n) ⇒ water(n);
water(n) ⇒ wave(n);
boat(n) ⇒ water(n);
water(n) ⇒ boat(n);
boat(n) ⇒ going fast(n) | n ∈ {#1, #2}}</p>
<p>which correctly suggests that party #1, i.e., the delivery truck,
is the correct one. For the alternate sentence we obtain the KB</p>
<p>{¬zoomed(#1) ⇒ ⊥;
zoomed(n) ⇒ zoom(n);
zoom(n) ⇒ zoomed(n);
zoom(n) ⇒ travel(n);
rush(n) ⇒ travel(n);
zoom(n) ⇒ travel rapidly(n);
⊤ ⇒ (hurry(n) ≡ rush(n));
⊤ ⇒ (hurry(n) ≡ travel rapidly(n));
⊤ ⇒ ¬(rush(n) ∧ go slow(n)) | n ∈ {#1, #2}}</p>
<p>which similarly relies heavily on RelatedTo, Synonym, and
IsA, but additionally uses that rush is an antonym of go slow.
The system again yields the correct answer #2 (the school bus)
to the query “What is going slow?”.</p>
<p>The examples demonstrate that most generated
conditionals seem reasonable and relevant for the schema, though they
are sometimes interspersed with ones that are less intuitive. The
latter is often due to the RelatedTo relation, whose
meaning is on the one hand very vague, and which on the other
hand accounts for the majority of edges in ConceptNet. In
any case, even if the system yields a wrong answer, we
obtain human-understandable justifications for the automatically
made decisions, which we believe is a major benefit of a
knowledge-based approach to the WSC.</p>
        <sec id="sec-8-3-1">
<title>7 Conclusion</title>
          <p>We presented a solution to the Winograd Schema Challenge
based on reasoning about conditional beliefs, where
knowledge bases are generated by means of the NLP tool ReVerb
and the semantic net ConceptNet 5. Unlike many other
approaches that rely on searching text repositories (Google), we
thus adopt a knowledge-based approach that, when it comes
to the reasoning part, has the benefit that the generated BO
KB can be regarded as a human-readable justification for the
decision made by the system, which moreover comes with
a clear formal semantics. We also turn a major difficulty
of the WSC to our advantage, namely that the two parties
in a schema do not carry any relevant information and could
equally be replaced by placeholders.</p>
          <p>While we do not achieve a very high quantitative success
rate on the WSC dataset, our results are consistently better than
chance, indicating that this may be a step in the right direction.
Our system’s performance of course heavily depends on the
quality of data fed into it, both from the NLP module’s and
ConceptNet’s side. In particular, information in ConceptNet on
many topics is rather sparse and relies heavily on the RelatedTo
relation, whose meaning is very vague and ambiguous.</p>
          <p>There are many directions for future work. First of all,
further experimentation with our system may be in order. Our
mapping from ConceptNet relations to belief conditionals is
not set in stone, and quite simple in the sense that many
relations are represented in the same way. It would be interesting
(and not much effort to implement in our existing framework)
to try other translations. For example, at the moment our
system neglects that there may be interactions between the
different relations, e.g. we may want to deduce that X is not capable
of doing Y from the facts that X is not capable of doing Z and
that Z is a prerequisite of Y.</p>
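<p>Such an interaction could be realised by a simple forward pass over the extracted facts before translation. The following sketch implements only the specific NotCapableOf/HasPrerequisite deduction mentioned above, on invented triples; it is a possible direction, not an existing feature of our system.</p>
<preformat>
```python
# Derive NotCapableOf(X, Y) from NotCapableOf(X, Z) and
# HasPrerequisite(Y, Z): whoever cannot do a prerequisite of Y
# cannot do Y either. Facts are (subject, relation, object) triples.

def derive_incapabilities(facts):
    prereq = {(y, z) for (y, r, z) in facts if r == "HasPrerequisite"}
    derived = set()
    for (x, r, z) in facts:
        if r == "NotCapableOf":
            for (y, z2) in prereq:
                if z2 == z:
                    derived.add((x, "NotCapableOf", y))
    return derived - set(facts)  # only the newly deduced triples

facts = [
    ("penguin", "NotCapableOf", "fly"),
    ("migrate overseas", "HasPrerequisite", "fly"),
]
print(derive_incapabilities(facts))
```
</preformat>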
          <p>Furthermore, ConceptNet was only one possible choice for
a commonsense knowledge repository, and the system may
benefit from trying other alternatives such as the
Never-Ending Language Learning (NELL) [Mitchell et al., 2015] KB. In contrast to
ConceptNet, NELL employs a large number of lower level
relations such as playsInstrument which on the one hand would
reduce ambiguity, but on the other hand increase the required
modelling effort, i.e., constructing a mapping.</p>
        </sec>
      </sec>
      <sec id="sec-8-4">
        <title>Acknowledgments.</title>
        <p>This work was supported by the German Research
Foundation (DFG) research unit FOR 1513 on Hybrid Reasoning for
Intelligent Systems, project A1.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [Bailey et al.,
          <year>2015</year>
          ]
          <string-name>
            <given-names>Dan</given-names>
            <surname>Bailey</surname>
          </string-name>
          , Amelia Harrison, Yuliya Lierler, Vladimir Lifschitz, and
          <string-name>
            <given-names>Julian</given-names>
            <surname>Michael</surname>
          </string-name>
          .
          <article-title>The Winograd schema challenge and reasoning about correlation</article-title>
          .
          <source>In Working Notes of the Symposium on Logical Formalizations of Commonsense Reasoning</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Fader et al.,
          <year>2011</year>
          ]
          <string-name>
            <given-names>Anthony</given-names>
            <surname>Fader</surname>
          </string-name>
          , Stephen Soderland, and
          <string-name>
            <given-names>Oren</given-names>
            <surname>Etzioni</surname>
          </string-name>
          .
          <article-title>Identifying relations for open information extraction</article-title>
          .
          <source>In Proceedings of the Conference on Empirical Methods in Natural Language Processing</source>
          , pages
          <fpage>1535</fpage>
          -
          <lpage>1545</lpage>
          . Association for Computational Linguistics,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Levesque et al.,
          <year>2012</year>
          ]
          <string-name>
            <given-names>Hector J.</given-names>
            <surname>Levesque</surname>
          </string-name>
          , Ernest Davis, and
          <string-name>
            <given-names>Leora</given-names>
            <surname>Morgenstern</surname>
          </string-name>
          .
          <article-title>The Winograd Schema Challenge</article-title>
          . In Gerhard Brewka, Thomas Eiter, and Sheila A. McIlraith, editors,
          <source>Proceedings of the Thirteenth International Conference on the Principles of Knowledge Representation and Reasoning (KR</source>
          <year>2012</year>
          ), pages
          <fpage>552</fpage>
          -
          <lpage>561</lpage>
          . AAAI Press,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Levesque, 1990]
          <string-name>
            <given-names>Hector J.</given-names>
            <surname>Levesque</surname>
          </string-name>
          .
          <article-title>All I know: a study in autoepistemic logic</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>42</volume>
          (
          <issue>2</issue>
          ),
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [Liu et al.,
          <year>2017</year>
          ] Quan Liu, Hui Jiang, Andrew Evdokimov,
          <string-name>
            <given-names>Zhen-Hua</given-names>
            <surname>Ling</surname>
          </string-name>
          , Xiaodan Zhu, Si Wei, and
          <string-name>
            <given-names>Yu</given-names>
            <surname>Hu</surname>
          </string-name>
          .
          <article-title>Cause-effect knowledge acquisition and neural association model for solving a set of Winograd schema problems</article-title>
          .
          <source>In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI)</source>
          , pages
          <fpage>2344</fpage>
          -
          <lpage>2350</lpage>
          . AAAI Press,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Matuszek et al.,
          <year>2006</year>
          ]
          <string-name>
            <given-names>Cynthia</given-names>
            <surname>Matuszek</surname>
          </string-name>
          , John Cabral, Michael Witbrock, and
          <string-name>
            <given-names>John</given-names>
            <surname>DeOliveira</surname>
          </string-name>
          .
          <article-title>An introduction to the syntax and content of Cyc</article-title>
          .
          <source>In Proceedings of the 2006 AAAI Spring Symposium on Formalizing and Compiling Background Knowledge and Its Applications to Knowledge Representation and Question Answering</source>
          , pages
          <fpage>44</fpage>
          -
          <lpage>49</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [Mitchell et al.,
          <year>2015</year>
          ]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mitchell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Hruschka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Talukdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Betteridge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Carlson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Dalvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gardner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kisiel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Krishnamurthy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Lao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mazaitis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nakashole</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Platanios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ritter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Samadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Settles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wijaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saparov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Greaves</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Welling</surname>
          </string-name>
          .
          <article-title>Never-ending learning</article-title>
          .
          <source>In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI-15)</source>
          , pages
          <fpage>2302</fpage>
          -
          <lpage>2310</lpage>
          . AAAI Press,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [Pearl, 1990]
          <string-name>
            <given-names>Judea</given-names>
            <surname>Pearl</surname>
          </string-name>
          .
          <article-title>System Z: A natural ordering of defaults with tractable applications to nonmonotonic reasoning</article-title>
          .
          <source>In Proceedings of the Third Conference on Theoretical Aspects of Reasoning about Knowledge (TARK)</source>
          , pages
          <fpage>121</fpage>
          -
          <lpage>135</lpage>
          . Morgan Kaufmann,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [Rahman and Ng, 2012]
          <string-name>
            <given-names>Altaf</given-names>
            <surname>Rahman</surname>
          </string-name>
          and
          <string-name>
            <given-names>Vincent</given-names>
            <surname>Ng</surname>
          </string-name>
          .
          <article-title>Resolving complex cases of definite pronouns: The Winograd schema challenge</article-title>
          .
          In Jun'ichi Tsujii, James Henderson, and Marius Pasca, editors,
          <source>Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL</source>
          <year>2012</year>
          ), pages
          <fpage>777</fpage>
          -
          <lpage>789</lpage>
          . ACL,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [Schwering and Lakemeyer, 2016]
          <string-name>
            <given-names>Christoph</given-names>
            <surname>Schwering</surname>
          </string-name>
          and
          <string-name>
            <given-names>Gerhard</given-names>
            <surname>Lakemeyer</surname>
          </string-name>
          .
          <article-title>Decidable reasoning in a first-order logic of limited conditional belief</article-title>
          .
          <source>In Proceedings of the Twenty-Second European Conference on Artificial Intelligence (ECAI)</source>
          , pages
          <fpage>1379</fpage>
          -
          <lpage>1387</lpage>
          . IOS Press,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [Schwering et al.,
          <year>2017</year>
          ]
          <string-name>
            <given-names>Christoph</given-names>
            <surname>Schwering</surname>
          </string-name>
          , Gerhard Lakemeyer, and
          <string-name>
            <given-names>Maurice</given-names>
            <surname>Pagnucco</surname>
          </string-name>
          .
          <article-title>Belief revision and projection in the epistemic situation calculus</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>251</volume>
          :
          <fpage>62</fpage>
          -
          <lpage>97</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [Schwering, 2017]
          <string-name>
            <given-names>Christoph</given-names>
            <surname>Schwering</surname>
          </string-name>
          .
          <article-title>Limbo: A reasoning system for limited belief</article-title>
          .
          <source>In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI)</source>
          , pages
          <fpage>5246</fpage>
          -
          <lpage>5248</lpage>
          . AAAI Press,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [Sharma et al.,
          <year>2015</year>
          ]
          <string-name>
            <given-names>Arpit</given-names>
            <surname>Sharma</surname>
          </string-name>
          , Nguyen Ha Vo, Somak Aditya, and
          <string-name>
            <given-names>Chitta</given-names>
            <surname>Baral</surname>
          </string-name>
          .
          <article-title>Towards addressing the Winograd schema challenge - building and using a semantic parser and a knowledge hunting module</article-title>
          .
          <source>In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI)</source>
          , pages
          <fpage>1319</fpage>
          -
          <lpage>1325</lpage>
          . AAAI Press,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [Speer and Havasi, 2012]
          <string-name>
            <given-names>Robert</given-names>
            <surname>Speer</surname>
          </string-name>
          and
          <string-name>
            <given-names>Catherine</given-names>
            <surname>Havasi</surname>
          </string-name>
          .
          <article-title>Representing general relational knowledge in ConceptNet 5</article-title>
          .
          <source>In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC)</source>
          , pages
          <fpage>3679</fpage>
          -
          <lpage>3686</lpage>
          . European Language Resources Association (ELRA),
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>