<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Reasoning about Conditional Beliefs for the Winograd Schema Challenge</article-title>
      </title-group>
      <contrib-group>
<contrib contrib-type="author">
          <string-name>Denis Golovin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jens Claßen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christoph Schwering</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Knowledge-Based Systems Group, RWTH Aachen University</institution>
          ,
          <addr-line>Aachen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Computer Science and Engineering, The University of New South Wales</institution>
          ,
          <addr-line>Sydney</addr-line>
          ,
          <country country="AU">Australia</country>
        </aff>
      </contrib-group>
      <abstract>
<p>The Winograd Schema Challenge has been proposed as an alternative to the Turing Test for measuring a machine's intelligence by letting it solve difficult pronoun resolution problems that cannot be tackled by statistical analysis alone. While many solutions so far are based on machine learning and natural language processing, we believe that a knowledge-based approach is better suited. In particular, we propose to employ a logic for conditional beliefs that is capable of dealing with incomplete or even inconsistent information (which commonsense knowledge often is). It does so by formalising the observation that humans often reason by picturing different contingencies of what the world could be like, and then choose to believe what is thought to be most plausible. We discuss and evaluate an implementation where relevant commonsense background information is obtained from the ConceptNet semantic network, translated into our formalism, and processed by a reasoner for our logic.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>1 Introduction</title>
      <p>The Winograd Schema Challenge (WSC) has been proposed by
Levesque, Davis and Morgenstern [2012] as an alternative to
the Turing Test for measuring a machine’s intelligence. While
agreeing with Turing’s primary concern, which is focusing
on the observable behaviour exhibited by the system, the aim
is to avoid some of the pitfalls that come with a free-form
conversation as Turing had in mind. Most importantly, in
order to pass as a person, a machine will have to resort to
deception to be able to answer questions about its height,
childhood, and so on. Moreover, the judgement of what passes
as human-like conversation is highly subjective and may vary
depending on the interrogator performing the test.</p>
      <p>The central idea behind the WSC is to instead let the
machine answer a number of questions of a specific form. As an
example, consider the statement</p>
    </sec>
    <sec id="sec-2">
<p>The delivery truck zoomed by the school bus because it was going so slow.</p>
      <p>The question is “What was going slow?”, i.e., the task is
to disambiguate to which of the two parties mentioned in
the statement (“the delivery truck”, “the school bus”) a
pronoun (“it”) refers. Each schema contains a special word (here:
“slow”) that, when replaced by an alternate word (“fast”),
changes the answer. While obvious to an English-speaking
human reader, such questions pose a real challenge for a
machine since the given statement alone does not provide enough
information to derive an answer, but rather requires having
appropriate commonsense background knowledge and being
able to reason about it. Winograd schemas are designed to be
“Google-proof”, meaning that they should not be solvable by
simple statistical analysis: If instead of a “delivery truck” the
above sentence spoke of a “sports car”, it would not be hard to
detect a correlation to words associated with fast movement.</p>
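<p>For concreteness, the structure just described can be captured in a small data type. The following Python sketch is purely illustrative; the field names and the swap method are ours, not part of the WSC specification.</p>
<preformat>
```python
from dataclasses import dataclass

@dataclass
class WinogradSchema:
    """One Winograd schema: a sentence with two candidate parties,
    a pronoun, and a special word whose alternate flips the answer."""
    sentence: str    # contains the special word
    parties: tuple   # the two candidate referents
    pronoun: str
    special: str     # e.g. "slow"
    alternate: str   # e.g. "fast"
    answer: int      # index of the correct party for `sentence`

    def alternate_sentence(self):
        """The second half of the schema pair: special word swapped."""
        return self.sentence.replace(self.special, self.alternate)

    def alternate_answer(self):
        """By design, swapping the special word flips the referent."""
        return 1 - self.answer

schema = WinogradSchema(
    sentence="The delivery truck zoomed by the school bus "
             "because it was going so slow.",
    parties=("the delivery truck", "the school bus"),
    pronoun="it",
    special="slow",
    alternate="fast",
    answer=1,  # "slow" refers to the school bus (party index 1)
)
print(schema.alternate_sentence())
print(schema.parties[schema.alternate_answer()])
```
</preformat>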
<p>The WSC competitions held so far show that the task is
indeed difficult and likely cannot be solved by classical natural
language processing alone. Robust commonsense reasoning,
on the other hand, will undoubtedly be beneficial in other areas,
in particular when it comes to domestic robots and
human-machine interaction in general.</p>
<p>Such a system will not only need access to a large corpus
of background information that preferably covers all areas
of the wide spectrum of everyday human life, but will also
need to function appropriately in light of the fact that such
information is often incomplete or even conflicting. Humans
are very capable of this, and we usually do so by picturing
different contingencies of what the world could be like, and
then choosing to believe what we think the most plausible
scenario is.</p>
      <p>In this paper, we present an approach to the WSC that is
based on a logic that formalises these concepts, the logic of
conditional beliefs [Schwering et al., 2017]. We designed and
implemented a system that
• translates the WSC sentence into a logical formula,
• extracts relevant background information from a
knowledge corpus and translates it into our logical language,
and
• performs reasoning to decide what the most plausible
answer is.</p>
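<p>To make the division of labour concrete, the three steps can be sketched as a toy pipeline. The stage functions, the miniature knowledge corpus, and the crude inference rule below are illustrative stand-ins for ReVerb, ConceptNet, and the BO reasoner; they are not our actual implementation.</p>
<preformat>
```python
# Toy pipeline mirroring the three bullet points above.

def translate(sentence):
    """Step 1: WSC sentence to a logical triple (hand-coded here)."""
    return ("zoomed_by", "#1", "#2")  # party #1 zoomed by party #2

# Step 2: relevant background facts, already in relational form.
TOY_CORPUS = [
    ("zoomed_by", "MannerOf", "travel_rapidly"),
    ("travel_rapidly", "Antonym", "go_slow"),
]

def plausible_answer(triple, query_concept, corpus):
    """Step 3: a crude stand-in for plausible reasoning. The active
    party inherits properties of its verb; if the queried concept is
    an antonym of such a property, the other party is the answer."""
    verb, active, passive = triple
    props = {b for (a, r, b) in corpus if a == verb and r == "MannerOf"}
    antonyms = {(a, b) for (a, r, b) in corpus if r == "Antonym"}
    for prop in props:
        if (prop, query_concept) in antonyms:
            return passive  # implausible for the active party
    return active

triple = translate("The delivery truck zoomed by the school bus "
                   "because it was going so slow.")
print(plausible_answer(triple, "go_slow", TOY_CORPUS))
```
</preformat>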
      <p>The rest of this paper is organised as follows. In Section 2
we discuss related work. Section 3 shows an overview of
our system’s architecture. We then present the formal
framework of our approach in Section 4. Section 5 describes the
implementation of our system, while Section 6 presents an
evaluation. Finally, we conclude in Section 7.</p>
      <sec id="sec-2-1">
<title>2 Related Work</title>
        <p>Rahman and Ng [2012] view the WSC as a special form of
pronoun resolution problem and propose an approach based
on machine learning. Their system relies on abstract feature
vectors extracted from eight different sources, among which
Google and Lexical Features turned out to be most useful
in their analysis. For the former, they construct a number of
queries from the verb governing the pronoun in the input
sentence, the words following it, and the two candidate parties,
and then count the number of results returned by the search
engine. The latter are obtained from a training set and represent
the presence or absence of certain predefined structures in the
sentence. They achieve an accuracy of 73.1% on their test set,
where the overall dataset of 941 sentence pairs was created by
30 students and split using a 70/30 training/test ratio. It should
be noted that they considered a relaxed version of the problem:
while the WSC only allows for instances where resolving the
pronoun requires background knowledge not expressed in the
statement, they did not impose this condition on sentences.</p>
        <p>Sharma et al. [2015] present a three-step method based on
their semantic parser “K-parser”: first, the input sentence is
brought into a graph representation encoding the conceptual
classes of words as well as their dependencies and semantic
relations. Next, a series of Google queries is constructed where
nominal entities are replaced by wild cards and verbs are
substituted by their synonyms. The sentences thus found are then
matched against the input by comparing their semantic graphs
by a set of predefined rules. For evaluation, they selected 71
sentences from the WSC corpus, resulting in 53 answers, 49
of which were correct.</p>
        <p>Bailey et al. [2015] introduce a new formalism for reasoning
about the correlation between sentences, which can either be
positive or negative. Intuitively, the correlation between two
sentences is positive when after hearing the first sentence, the
hearer views the second sentence as more plausible than before.
Negative correlation on the other hand lowers the plausibility.
For example,</p>
      </sec>
    </sec>
    <sec id="sec-3">
<p>The delivery truck zoomed by the school bus.</p>
      <p>is positively correlated with</p>
    </sec>
    <sec id="sec-4">
<p>The school bus was going so slow.</p>
      <p>but negatively with</p>
    </sec>
    <sec id="sec-5">
<p>The truck was going so slow.</p>
<p>The authors of the aforementioned paper present the
correlation calculus as a deductive system to formalise this idea.
It extends classical predicate calculus by a new binary operator
expressing that two formulas F and G are positively correlated;
a negative correlation between F and G is expressed as a positive
correlation between F and ¬G. They show how
their formalism can be used to derive such correlation
formulas from a set of formulas encoding background information
and thus justify the answers for several WSC instances. While
their work makes an important step towards solving the WSC,
they admit that it so far does not address the important problem
of generating the needed commonsense background axioms
(these were hand coded in their examples), nor did they come
up with an implementation that would allow automating the
reasoning process and evaluating the approach against others.
Their formalism also currently lacks the capability to handle
non-monotonic aspects such as the statement</p>
    </sec>
    <sec id="sec-6">
<p>If the shelf is level, then the sculpture will not roll off.</p>
<p>which, when read as a default rule, allows us to withdraw the
conclusion that the sculpture does not roll off in the face of new
information, e.g., that there is an earthquake.</p>
      <p>Finally, Liu et al. [2017] propose another approach based on
machine learning, which is used (a) in a knowledge acquisition
step to identify cause-effect relationships between commonly
used words from large text corpora and (b) to find association
relationships in the form of conditional probabilities between
pairs of discrete events. They manually selected a subset of
70 Winograd schemas, among which their method achieves an
accuracy of 70%.</p>
      <sec id="sec-6-1">
<title>3 System Architecture Overview</title>
        <p>The architecture of our system is visualised in Figure 1. In
a preprocessing step, tuples that represent the acting parties,
their actions, and their relationships are extracted from the
input Winograd schema sentences. For instance, the Winograd
schema from Section 1 leads to tuples (truck, zoomed by, bus)
and (zoomed by, goes slow). Then a commonsense repository,
ConceptNet, is employed to derive further information about
the extracted words. These facts come in the form of binary
relations; for instance, it may tell us that zooming by is a form
of travelling rapidly. From these facts, a conditional
knowledge base is constructed in the logic BO, from which
the system aims to infer through automated reasoning how to
disambiguate the pronoun in the most plausible way. In the
school-bus scenario, the system compares the plausibility of
party #1 (the truck) versus party #2 (the school bus) being the
slow party.</p>
      </sec>
      <sec id="sec-6-2">
<title>4 A Logic of Plausible Beliefs</title>
        <p>Reasoning in our framework is based on a logic of plausible
beliefs [Schwering et al., 2017] called BO. This logic features
non-monotonic conditionals of the form</p>
        <p>If some premise holds,
then plausibly some consequent is true,
which enables us to represent and reason about more and less
plausible contingencies. We expect this to be crucial in the
context of the WSC, where we aim to find the correct answer
by comparing the plausibility of the different candidates. In
the following we briefly introduce the formal machinery of
this logic.</p>
<p>The terms in our language consist of infinitely many
first-order variables, usually denoted as x or y, and infinitely many
(standard) names n ∈ N = {#1, #2, ...}. Formulas are of the
form
• P(t1, ..., tk), (t1 = t2), (α ∧ β), ¬α, ∀x α,
• B(φ ⇒ ψ),
• O{φ1 ⇒ ψ1, ..., φm ⇒ ψm},
where P is a predicate symbol other than =, the ti are terms, α and
β are formulas, and the φi and ψi are objective formulas, that is,
the φi and ψi mention no B or O operators. A formula is ground
when it contains no variables. A sentence is a formula without
free variables. Let ∨, ∃, ⊃, ≡ be the usual abbreviations, let ⊤
stand for ∀x (x = x), and let ⊥ abbreviate ¬⊤.</p>
        <p>
Standard names can be seen as special constants that
represent all the individuals in the universe. They considerably
simplify the semantics compared to the classical Tarskian
model theory; in particular, quantification can be handled by
simply substituting standard names. The formula B(φ ⇒ ψ)
intuitively expresses a conditional belief, namely the belief
that if φ is true, then plausibly ψ is also true. The formula
O{φ1 ⇒ ψ1, ..., φm ⇒ ψm} goes one step further and says
that the conditional beliefs B(φi ⇒ ψi) are all that is
believed; everything that is not a consequence of these
conditional beliefs is implicitly not believed. The concept is called
only-believing; it generalises Levesque’s only-knowing [
          <xref ref-type="bibr" rid="ref8">1990</xref>
          ]
to the case of conditional beliefs and shares many properties
with Pearl’s System Z [
          <xref ref-type="bibr" rid="ref8">1990</xref>
          ]. Only-believing is particularly
useful to capture the idea of a conditional knowledge base: the
knowledge base is all the agent believes.
        </p>
<p>To investigate such problems, we define a possible-worlds
semantics for BO. A world is a set of ground non-equality
atoms. An epistemic state ~e is an infinite sequence of sets
of worlds e1, e2, ... such that e1 ⊆ e2 ⊆ ... and for some
p ∈ {1, 2, ...}, ep = ep+1 = ....</p>
<p>Intuitively, a world is a truth assignment to the ground
non-equality atoms, that is, to all P(n1, ..., nj) where the ni
are standard names. An epistemic state stratifies sets of such
worlds with the subset relation. The intuition behind an
epistemic state ~e is to model a system of spheres: e1 contains
the most-plausible worlds, e2 adds the second-most-plausible
worlds, and so on. Note that in any epistemic state ~e only
finitely many different sets of worlds are allowed, since the
definition requires that ep = ep+1 = ... for some p. (Just as well
we could have defined an epistemic state to be a non-empty
finite sequence of supersets; the present definition however is
often easier to work with.)</p>
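<p>These definitions are straightforward to mirror in code. In the sketch below (our illustrative rendering, not part of the formal development), a world is a frozenset of ground atoms and an epistemic state is given by its finitely many distinct spheres, the last of which stands for the constant tail.</p>
<preformat>
```python
# Worlds as frozensets of ground atoms; an epistemic state as its
# finitely many distinct spheres e1 ⊆ e2 ⊆ ... (the constant tail
# ep = ep+1 = ... is represented by the last listed sphere).

w1 = frozenset({("Slow", "#2")})                  # only the bus is slow
w2 = frozenset({("Slow", "#1")})                  # only the truck is slow
w3 = frozenset({("Slow", "#1"), ("Slow", "#2")})  # both are slow

def sphere(e, p):
    """Return sphere e_p, using the constant tail beyond the list."""
    return e[min(p, len(e)) - 1]

def is_epistemic_state(e):
    """Check the defining condition e1 ⊆ e2 ⊆ ... on the listed part."""
    return all(e[i].issubset(e[i + 1]) for i in range(len(e) - 1))

e = [{w1}, {w1, w2}, {w1, w2, w3}]
print(is_epistemic_state(e), sphere(e, 5) == {w1, w2, w3})
```
</preformat>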
<p>Given an epistemic state ~e and a world w, we can define the
truth of a sentence α in BO, written ~e, w |= α. We let α[x/n]
denote the result of substituting all free occurrences of x in α by n.
The objective part of the semantics is defined inductively as
follows:
1. ~e, w |= P(n1, ..., nj) iff P(n1, ..., nj) ∈ w;
2. ~e, w |= (n1 = n2) iff n1 and n2 are identical names;
3. ~e, w |= (α ∧ β) iff ~e, w |= α and ~e, w |= β;
4. ~e, w |= ¬α iff not ~e, w |= α;
5. ~e, w |= ∀x α iff ~e, w |= α[x/n] for all n ∈ N.</p>
        <p>Notice that Rule 5 handles quantification by substitution of
standard names.</p>
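<p>Rules 1 to 5 translate directly into a recursive evaluator. The following sketch uses our own encoding of formulas as nested tuples, with a finite universe of names standing in for the infinite set N; this suffices here because only the listed names ever occur in the toy vocabulary.</p>
<preformat>
```python
# Objective formulas as nested tuples:
#   ("atom", P, n1, ..., nk), ("eq", n1, n2),
#   ("and", a, b), ("not", a), ("forall", x, a).

NAMES = ("#1", "#2")  # finite stand-in for the infinite name set N

def subst(formula, var, name):
    """Substitute standard name `name` for variable `var`."""
    if isinstance(formula, tuple):
        return tuple(subst(f, var, name) for f in formula)
    return name if formula == var else formula

def holds(w, f):
    """Rules 1-5: truth of an objective sentence in world w."""
    op = f[0]
    if op == "atom":
        return f[1:] in w                          # Rule 1
    if op == "eq":
        return f[1] == f[2]                        # Rule 2
    if op == "and":
        return holds(w, f[1]) and holds(w, f[2])   # Rule 3
    if op == "not":
        return not holds(w, f[1])                  # Rule 4
    if op == "forall":                             # Rule 5
        return all(holds(w, subst(f[2], f[1], n)) for n in NAMES)
    raise ValueError(op)

w = {("Slow", "#2")}
f = ("forall", "x", ("not", ("and", ("atom", "Slow", "x"),
                             ("eq", "x", "#1"))))
print(holds(w, f))  # no slow party is #1 in this world
```
</preformat>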
<p>Before we proceed with the semantics of conditional belief,
we define the plausibility ⌊~e | φ⌋ of an objective sentence φ in
~e as the index of the first sphere consistent with φ:
⌊~e | φ⌋ = min {p | p = ∞ or ~e, w |= φ for some w ∈ ep},
where ∞ ∉ {1, 2, ...} represents an “undefined” plausibility,
with the understanding that p + 1 = ∞ and p &lt; ∞ for all
p ∈ {1, 2, ...}. The semantics of beliefs is then as follows:
6. ~e, w |= B(φ ⇒ ψ) iff for all p ≥ 1, if p ≤ ⌊~e | φ⌋ and
w′ ∈ ep, then ~e, w′ |= (φ ⊃ ψ);
7. ~e, w |= O{φ1 ⇒ ψ1, ..., φm ⇒ ψm} iff ~e, w |= B(φi ⇒ ψi)
for all i and every sphere is maximal, i.e., ep contains every
world w′ with ~e, w′ |= (φi ⊃ ψi) for all i such that
⌊~e | φi⌋ ≥ p.
The intuitive meaning of Rule 6 is that B(φ ⇒ ψ) holds iff the
most-plausible φ-worlds also satisfy ψ. Note that B(φ ⇒ ψ)
is vacuously true when there is no φ-world. As a consequence,
B(¬φ ⇒ ⊥) can be used to express that φ is known.</p>
<p>Only-believing O{φ1 ⇒ ψ1, ..., φm ⇒ ψm} has the effect
of B(φi ⇒ ψi) plus maximising every sphere: it requires the
sphere ep to contain all worlds that satisfy all (φi ⊃ ψi) for
which ⌊~e | φi⌋ ≥ p. It turns out that this definition gives rise
to a procedure that generates the unique system of spheres ~e
that corresponds to O{φ1 ⇒ ψ1, ..., φm ⇒ ψm} [Schwering
et al., 2017], which means that a conditional knowledge base
has a unique semantic representation as a system of spheres.</p>
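<p>The sphere-generation idea can be illustrated over a finite world space. The following is our own propositional re-implementation of the concept (formulas are Python predicates, and a consistent conditional KB is assumed); it is not the algorithm of Schwering et al., only a sketch of the same construction.</p>
<preformat>
```python
from itertools import combinations

def only_believe(worlds, conditionals):
    """Build the distinct spheres e1, e2, ... for a conditional KB:
    each sphere contains exactly the worlds satisfying phi -> psi for
    every conditional whose premise was not satisfiable earlier."""
    spheres, active = [], list(conditionals)
    while True:
        sph = {w for w in worlds
               if all((not phi(w)) or psi(w) for (phi, psi) in active)}
        spheres.append(sph)
        # Premises satisfiable in this sphere get this plausibility,
        # so they no longer constrain later (larger) spheres.
        settled = [c for c in active if any(c[0](w) for w in sph)]
        active = [c for c in active if c not in settled]
        if not settled:  # nothing new settles: construction is done
            return spheres

# All 8 worlds over atoms b(ird), p(enguin), f(lies).
atoms = ("b", "p", "f")
worlds = [frozenset(s) for r in range(4) for s in combinations(atoms, r)]
conds = [
    (lambda w: "b" in w, lambda w: "f" in w),      # birds fly
    (lambda w: "p" in w, lambda w: "b" in w),      # penguins are birds
    (lambda w: "p" in w, lambda w: "f" not in w),  # penguins do not fly
]
e = only_believe(worlds, conds)
print(len(e), frozenset({"p", "b"}) in e[1], frozenset({"p", "b"}) in e[0])
```
</preformat>
<p>On this example the construction yields three distinct spheres: penguin-worlds are absent from the most-plausible sphere but appear in the second, mirroring the System Z-like ranking discussed above.</p>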
        <p>While reasoning in the logic as presented here is
undecidable given its first-order nature, a decidable (and sometimes
tractable) variant based on the theory of limited belief has been
developed and implemented [Schwering and Lakemeyer, 2016;
Schwering, 2017].</p>
      </sec>
      <sec id="sec-6-3">
        <title>5 Implementation</title>
<p>In order to automate the process of solving the WSC through
reasoning with conditional beliefs, we need to (a) translate
a given WSC instance into formulas of BO, (b) provide
necessary background knowledge in the form of a KB of BO
formulas, and (c) call the BO reasoner to answer the query.
Here we adopt an approach based on ConceptNet 5 [Speer
and Havasi, 2012], a large semantic network that represents
knowledge in a graph structure where each node is a concept
(e.g., “zoom”, “going fast”) and each edge corresponds to one
of 28 different relations (e.g., “IsA”, “RelatedTo”). Apart from
manually entered data, the network integrates information
from various sources, including subsets of the OpenCyc and
DBPedia ontologies as well as the Wiktionary and WordNet
3.0 dictionaries. The origin of each piece of information is kept
in metadata associated with each edge, in particular a numerical
weight representing the strength of the assertion. Weights have
a default value of 1 but can be higher or lower, where a negative
value expresses that the assertion is false or irrelevant.</p>
        <p>As opposed to fully-fledged commonsense knowledge bases
such as Cyc [Matuszek et al., 2006], ConceptNet 5 does not
come with a pre-defined formal semantics and/or inference
engine. It can rather be viewed as an intermediate step between
the intuitive, informal knowledge used by humans on the one
hand and a formal, logic-based representation on the other.
This gives us, as developers, the freedom to attach meaning to
assertions in a way that suits our application.</p>
<p>In our case, we identified 24 of the 28 relations to be useful
and mapped them to conditional beliefs. Consider the IsA
relation: each corresponding tuple has the form IsA(A, B), where
A and B are concepts. Assuming that we represent concepts by
unary predicates, we can encode the assertion as an ordinary
belief (expressing that it is plausible to assume that if x is an
A, it is also a B):</p>
<p>⊤ ⇒ ∀x (A(x) ⊃ B(x)) (1)</p>
<p>However, the statement will then only hold in the most
plausible sphere, which essentially amounts to doing monotonic
reasoning only. We hence instead use the following encoding:</p>
<p>A(x) ⇒ B(x) (2)</p>
<p>That is, the most plausible A’s are assumed to be B’s. Similar
mappings have to be defined for the remaining relations. The
ones we used are summarised in Table 1, where A ⇒ B stands
for formula (2), A ⇒ ¬B is shorthand for A(x) ⇒ ¬B(x),
etc.</p>
<p>5.2 Representing Winograd Schemas in BO</p>
<p>To create the knowledge base, we start with the given WSC
sentence, e.g.</p>
      </sec>
    </sec>
    <sec id="sec-7">
<p>The fish ate the worm. It was tasty.</p>
<p>Here, the word tasty can be replaced by hungry, and the
reference of it then changes from worm to fish. Note that this is
a structure that many (but not all) Winograd schemas exhibit:
they start with a statement (unambiguously) describing a
situational relationship between two parties (“the fish ate the
worm”), followed by an ambiguous description of an effect of it
(“it was tasty”). We use the natural language processing tool
ReVerb [Fader et al., 2011] to obtain a structured representation
of the former in the form of a triple (fish, ate, worm).</p>
<p>Note that it is possible to replace both parties with
placeholders and still understand the sentence and resolve the pronoun.
The requirement that the names of the two parties do not carry
any relevant information for solving a WSC, actually intended
as a main difficulty, can here be used to our advantage: instead
of “fish” we simply denote the first party by the standard name #1,
and similarly the “worm” as second party by #2. We will not
have to worry about reasoning about the relationship between
a fish and a worm, but can rather concentrate on the relation
between the concepts ate and tasty.</p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>ConceptNet relations used by our system, their descriptions, and their mappings to BO conditionals.</p>
        </caption>
        <table>
          <thead>
            <tr><th>Relation</th><th>Description</th><th>Mapping</th></tr>
          </thead>
          <tbody>
            <tr><td>IsA</td><td>A is a kind of B</td><td>A ⇒ B</td></tr>
            <tr><td>NotIsA</td><td>A is not a kind of B</td><td>A ⇒ ¬B</td></tr>
            <tr><td>RelatedTo</td><td>A is related to B</td><td>A ⇒ B and B ⇒ A</td></tr>
            <tr><td>DerivedFrom</td><td>A is derived from B</td><td>A ⇒ B</td></tr>
            <tr><td>Antonym</td><td>A is the opposite of B</td><td>¬(A ∧ B)</td></tr>
            <tr><td>MotivatedBy</td><td>A is motivated by B</td><td>B ⇒ A</td></tr>
            <tr><td>Synonym</td><td>A is a synonym of B</td><td>A ≡ B</td></tr>
            <tr><td>FormOf</td><td>B is the root word of A</td><td>A ≡ B</td></tr>
            <tr><td>HasPrerequisite</td><td>in order for A to happen, B needs to happen</td><td>A ⇒ B</td></tr>
            <tr><td>UsedFor</td><td>A is used for B</td><td>A ⇒ B</td></tr>
            <tr><td>NotUsedFor</td><td>A is not used for B</td><td>A ⇒ ¬B</td></tr>
            <tr><td>PartOf</td><td>A is part of B</td><td>B ⇒ A</td></tr>
            <tr><td>HasA</td><td>B belongs to A, either as an inherent part or due to a social construct of possession</td><td>A ⇒ B</td></tr>
            <tr><td>CapableOf</td><td>A is capable of B</td><td>A ⇒ B</td></tr>
            <tr><td>NotCapableOf</td><td>A is not capable of B</td><td>A ⇒ ¬B</td></tr>
            <tr><td>ObstructedBy</td><td>A is a goal that can be prevented by B</td><td>B ⇒ ¬A</td></tr>
            <tr><td>HasProperty</td><td>A has B as a property</td><td>A ⇒ B</td></tr>
            <tr><td>NotHasProperty</td><td>A does not have B as a property</td><td>B ⇒ ¬A</td></tr>
            <tr><td>Desires</td><td>A is a conscious entity that typically wants B</td><td>A ⇒ B</td></tr>
            <tr><td>NotDesires</td><td>A is a conscious entity that typically does not want B</td><td>A ⇒ ¬B</td></tr>
            <tr><td>CreatedBy</td><td>B is a process or agent that creates A</td><td>A ⇒ B</td></tr>
            <tr><td>DefinedAs</td><td>A and B overlap considerably in meaning, and B is a more explanatory version of A</td><td>A ⇒ B</td></tr>
            <tr><td>Entails</td><td>if A is happening, B is also happening</td><td>A ⇒ B</td></tr>
            <tr><td>MannerOf</td><td>A is a specific way to do B</td><td>A ⇒ B</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-7-3">
<p>The triple (fish, ate, worm) now gives us the certain
information that the fish (#1) is the active party in the event “ate”,
which we represent by adding</p>
<p>¬ate(#1) ⇒ ⊥ (3)</p>
<p>to the KB. Next, we consult ConceptNet to check if more
information about the two parties is available. Specifically, we try to
use the fact that the worm (#2) is the passive party in the event.
We hence search for a concept with similar properties as
an eaten worm, that is to say, something which is commonly
known to be eaten, i.e., something which is the passive party
in the event of ate. After determining eaten to be the passive
form of ate through the common root word eat, we query
ConceptNet for instances of the ReceivesAction relation (which
we only use for this specific purpose). In the example,
ConceptNet suggests that the concept apple is commonly known to
be eaten, since ReceivesAction(apple, eaten) has a high weight.
We hence represent that the worm is the passive party by the
formula</p>
<p>⊤ ⇒ apple(#2) (4)</p>
<p>The new expression can be read as: it is believed that the second
party is active in being an “apple”. The fact that the passive
form of ate is eaten is further represented as follows:</p>
<p>ate(x) ⇒ ¬eaten(x)
¬ate(x) ⇒ eaten(x) (5)</p>
      </sec>
    </sec>
    <sec id="sec-8">
<p>The new beliefs can be read as: it is believed that if party x is
active in the event “ate”, it presumably is not active in the event
“eaten”, and vice versa. Finally, we add the belief that if a party
is active in the event of eaten, then it is presumably active in
the event of being an apple:</p>
<p>eaten(x) ⇒ apple(x) (6)</p>
<p>Note that beliefs of the forms (4) to (6) are only generated in
case a corresponding ReceivesAction edge actually exists (which is
often not the case); only those of the form (3) are guaranteed
to be present in every WSC instance.</p>
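<p>The generation of formulas (3) to (6) can be sketched as follows. The ConceptNet and root-word lookups are faked with dictionaries, and the formulas are kept as plain strings in our notation; this is an illustration of the translation scheme, not our actual code.</p>
<preformat>
```python
# Sketch of Section 5.2's KB generation for "The fish ate the worm."

RECEIVES_ACTION = {"eaten": "apple"}  # stands in for ReceivesAction(apple, eaten)
PASSIVE_FORM = {"ate": "eaten"}       # via the common root word "eat"

def winograd_kb(triple):
    """Build the conditional KB for a (party1, verb, party2) triple."""
    _, verb, _ = triple
    kb = [f"¬{verb}(#1) ⇒ ⊥"]                       # formula (3), always
    passive = PASSIVE_FORM.get(verb)
    concept = RECEIVES_ACTION.get(passive)
    if concept:                    # (4)-(6) only if such an edge exists
        kb.append(f"⊤ ⇒ {concept}(#2)")             # formula (4)
        kb.append(f"{verb}(x) ⇒ ¬{passive}(x)")     # formula (5)
        kb.append(f"¬{verb}(x) ⇒ {passive}(x)")
        kb.append(f"{passive}(x) ⇒ {concept}(x)")   # formula (6)
    return kb

for formula in winograd_kb(("fish", "ate", "worm")):
    print(formula)
```
</preformat>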
      <sec id="sec-8-1">
        <title>5.3 Reasoning</title>
<p>Starting with the formulas generated for the WSC sentence
as demonstrated in Section 5.2, we search for a path in
ConceptNet that connects the two target concepts ate and tasty
identified from the WSC sentence (using normalisation such as
the elimination of stop words and word stemming where needed). All
relations on this path are mapped to formulas as described in
Section 5.1, instantiated with the standard names #1, #2
representing the two parties. We then call the BO reasoner on the
resulting KB with the two queries</p>
<p>tasty(#1) ∨ tasty(#2) ⇒ tasty(#1)
tasty(#1) ∨ tasty(#2) ⇒ tasty(#2) (7)</p>
<p>A decision is then made if exactly one of them comes out as
true and the other as false.</p>
        <sec id="sec-8-1-1">
<title>6 Evaluation</title>
        </sec>
      </sec>
      <sec id="sec-8-2">
        <title>6.1 Quantitative Evaluation</title>
<p>We performed a preliminary experimental evaluation of our
implemented system on the publicly available Winograd schema
collection (<ext-link ext-link-type="uri" xlink:href="http://www.cs.nyu.edu/faculty/davise/papers/WinogradSchemas/WS.html">http://www.cs.nyu.edu/faculty/davise/papers/WinogradSchemas/WS.html</ext-link>)
containing 144 schemas in total. Since ConceptNet
does not directly support querying paths between concepts, we
search in a two-stage process. First, a subgraph of ConceptNet
is generated in memory, which we do by starting with the
target concepts and successively adding adjacent edges and
nodes, where for each node the (at most) 30 edges with the highest
weight are considered. We then search for the actual path by
means of Dijkstra’s algorithm, alternately using edge costs
of 1/weight or 1 (the latter because ConceptNet’s
weights are sometimes not reliable).</p>
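<p>The second stage is a standard shortest-path search. The sketch below runs Dijkstra with 1/weight edge costs over an invented miniature subgraph; the nodes and weight values are ours and purely illustrative.</p>
<preformat>
```python
import heapq

# Toy weighted ConceptNet-like subgraph: node -> [(neighbour, weight)].
# A higher weight means a stronger assertion, hence cost 1/weight.
GRAPH = {
    "zoom":   [("whizz", 2.0), ("sound", 0.5)],
    "whizz":  [("travel rapidly", 2.0)],
    "sound":  [("wave", 1.0)],
    "wave":   [("travel rapidly", 0.2)],
    "travel rapidly": [("going fast", 2.0)],
    "going fast": [],
}

def cheapest_path(graph, start, goal):
    """Dijkstra with edge costs 1/weight; returns the node path."""
    queue = [(0.0, start, [start])]
    done = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path
        if node in done:
            continue
        done.add(node)
        for nxt, weight in graph.get(node, []):
            if nxt not in done:
                heapq.heappush(queue, (cost + 1.0 / weight, nxt, path + [nxt]))
    return None  # no connection within the subgraph

print(cheapest_path(GRAPH, "zoom", "going fast"))
```
</preformat>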
<p>Table 2 shows the results for increasing depth limits in
terms of the number of decisions made (recall that this means
that exactly one of the two queries comes out true) and the
fraction of correctly answered sentences. While obviously a
search depth of zero yields no results, neither does a limit of
one, meaning none of the concepts mentioned in Winograd
schemas are directly connected in ConceptNet, showing that
these schemas are indeed hard in some sense. Once we set
the limit to two or above, decisions are made with increasing
accuracy (though not much better than guessing). With the
current implementation we were not able to increase the depth
beyond three due to memory limitations.</p>
        <table-wrap id="tab2">
          <label>Table 2</label>
          <caption>
            <p>Decisions made and correct answers for increasing depth limits.</p>
          </caption>
          <table>
            <thead>
              <tr><th>depth</th><th>decisions</th><th>correctly answered</th></tr>
            </thead>
            <tbody>
              <tr><td>0</td><td>0</td><td>0</td></tr>
              <tr><td>1</td><td>0</td><td>0</td></tr>
              <tr><td>2</td><td>103</td><td>52 (50.5%)</td></tr>
              <tr><td>3</td><td>159</td><td>85 (53.5%)</td></tr>
            </tbody>
          </table>
        </table-wrap>
<p>After looking into some of the examples where wrong or no
decisions were made, we adjusted our method slightly. First,
we know that exactly one of the two parties is referred to by
the pronoun. Moreover, while one party (the fish) actively
participates (eats) in the described scenario as encoded by (3),
often the other party (the worm) is not the active one. We
hence changed the antecedents of the belief conditionals in the
queries (7) to formulas of the form</p>
        <p>(tasty (#1) tasty (#2)) ^ :ate(#2 )
where denotes exclusive-or. Furthermore, we found that the
returned paths disproportionally often contain the RelatedTo,
Synonym and IsA relations. To discourage such overuse, we
introduced a penalty on the weights for the most frequently
appearing relations. These modifications, with depth limit
three, resulted in 93 decisions and a success rate of 54:8%,
i.e., 51 correct and 42 incorrect answers.</p>
      </sec>
      <sec id="sec-8-3">
        <title>6.2 Qualitative Evaluation</title>
<p>To illustrate our system’s operation, let us look in more depth
at a schema successfully resolved by it, namely “The
delivery truck zoomed by the school bus because it was
going so fast/slow”. ReVerb maps the sentence to the triple
(delivery truck, zoomed by, school bus), yielding the two
target concepts zoomed by and going fast (or going slow,
respectively). A search in ConceptNet returns the following
path between them: zoomed –RelatedTo– zoom –Synonym–
whizz –RelatedTo– sound –RelatedTo– wave –RelatedTo–
water –RelatedTo– boat –IsA– going fast. The KB constructed
from this is then</p>
<p>{¬zoomed(#1) ⇒ ⊥;
zoomed(n) ⇒ zoom(n);
zoom(n) ⇒ zoomed(n);
⊤ ⇒ (whizz(n) ≡ zoom(n));
whizz(n) ⇒ sound(n);
sound(n) ⇒ wave(n);
wave(n) ⇒ sound(n);
wave(n) ⇒ water(n);
water(n) ⇒ wave(n);
boat(n) ⇒ water(n);
water(n) ⇒ boat(n);
boat(n) ⇒ going fast(n) | n ∈ {#1, #2}}</p>
<p>which correctly suggests that party #1, i.e., the delivery truck,
is the correct one. For the alternate sentence we obtain the KB</p>
<p>{¬zoomed(#1) ⇒ ⊥;
zoomed(n) ⇒ zoom(n);
zoom(n) ⇒ zoomed(n);
zoom(n) ⇒ travel(n);
rush(n) ⇒ travel(n);
zoom(n) ⇒ travel rapidly(n);
⊤ ⇒ (hurry(n) ≡ rush(n));
⊤ ⇒ (hurry(n) ≡ travel rapidly(n));
⊤ ⇒ ¬(rush(n) ∧ go slow(n)) | n ∈ {#1, #2}}</p>
<p>which similarly relies heavily on RelatedTo, Synonym, and
IsA, but additionally uses that rush is an antonym of go slow.
The system again yields the correct answer #2 (the school bus)
to the query “What is going slow?”.</p>
<p>The examples demonstrate that most generated
conditionals seem reasonable and relevant for the schema, though they
are sometimes interspersed with ones that are less intuitive. The
latter is often due to the RelatedTo relation, whose
meaning is on the one hand very vague, and which on the other
hand accounts for the majority of edges in ConceptNet. In
any case, even if the system yields a wrong answer, we
obtain human-understandable justifications for the automatically
made decisions, which we believe is a major benefit of a
knowledge-based approach to the WSC.</p>
        <sec id="sec-8-3-1">
<title>7 Conclusion</title>
          <p>We presented a solution to the Winograd Schema Challenge
based on reasoning about conditional beliefs, where
knowledge bases are generated by means of the NLP tool ReVerb
and the semantic net ConceptNet 5. Unlike many other
approaches that rely on searching text repositories (Google), we
thus adopt a knowledge-based approach that, when it comes
to the reasoning part, has the benefit that the generated BO
KB can be regarded as a human-readable justification for the
decision made by the system, which moreover comes with
a clear formal semantics. We also turn a major difficulty
of the WSC to our advantage, namely that the two parties
in a schema do not carry any relevant information and could
equally be replaced by placeholders.</p>
          <p>While we do not achieve a very high quantitative success
rate on the WSC dataset, our results are consistently better than
chance, indicating that this may be a step in the right direction.
Our system’s performance of course heavily depends on the
quality of data fed into it, both from the NLP module’s and
ConceptNet’s side. In particular, information in ConceptNet on
many topics is rather sparse and relies heavily on the RelatedTo
relation, whose meaning is very vague and ambiguous.</p>
          <p>There are many directions for future work. First of all,
further experimentation with our system may be in order. Our
mapping from ConceptNet relations to belief conditionals is
not set in stone, and quite simple in the sense that many
relations are represented in the same way. It would be interesting
(and not much effort to implement in our existing framework)
to try other translations. For example, at the moment our
system neglects that there may be interactions between the
different relations, e.g. we may want to deduce that X is not capable
of doing Y from the facts that X is not capable of doing Z and
that Z is a prerequisite of Y.</p>
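<p>Such an interaction could be realised by a simple forward pass over the extracted facts before translation. The following sketch implements only the specific NotCapableOf/HasPrerequisite deduction mentioned above, on invented triples; it is a possible direction, not an existing feature of our system.</p>
<preformat>
```python
# Derive NotCapableOf(X, Y) from NotCapableOf(X, Z) and
# HasPrerequisite(Y, Z): whoever cannot do a prerequisite of Y
# cannot do Y either. Facts are (subject, relation, object) triples.

def derive_incapabilities(facts):
    prereq = {(y, z) for (y, r, z) in facts if r == "HasPrerequisite"}
    derived = set()
    for (x, r, z) in facts:
        if r == "NotCapableOf":
            for (y, z2) in prereq:
                if z2 == z:
                    derived.add((x, "NotCapableOf", y))
    return derived - set(facts)  # only the newly deduced triples

facts = [
    ("penguin", "NotCapableOf", "fly"),
    ("migrate overseas", "HasPrerequisite", "fly"),
]
print(derive_incapabilities(facts))
```
</preformat>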
          <p>Furthermore, ConceptNet was only one possible choice for
a commonsense knowledge repository, and the system may
benefit from trying other alternatives such as the
Never-Ending Language Learning (NELL) [Mitchell et al., 2015] KB. In contrast to
ConceptNet, NELL employs a large number of lower level
relations such as playsInstrument which on the one hand would
reduce ambiguity, but on the other hand increase the required
modelling effort, i.e., constructing a mapping.</p>
        </sec>
      </sec>
      <sec id="sec-8-4">
        <title>Acknowledgments.</title>
        <p>This work was supported by the German Research
Foundation (DFG) research unit FOR 1513 on Hybrid Reasoning for
Intelligent Systems, project A1.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [Bailey et al.,
          <year>2015</year>
          ]
          <string-name>
            <given-names>Dan</given-names>
            <surname>Bailey</surname>
          </string-name>
          , Amelia Harrison, Yuliya Lierler, Vladimir Lifschitz, and
          <string-name>
            <given-names>Julian</given-names>
            <surname>Michael</surname>
          </string-name>
          .
          <article-title>The Winograd schema challenge and reasoning about correlation</article-title>
          .
          <source>In Working Notes of the Symposium on Logical Formalizations of Commonsense Reasoning</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [Fader et al.,
          <year>2011</year>
          ]
          <string-name>
            <given-names>Anthony</given-names>
            <surname>Fader</surname>
          </string-name>
          , Stephen Soderland, and
          <string-name>
            <given-names>Oren</given-names>
            <surname>Etzioni</surname>
          </string-name>
          .
          <article-title>Identifying relations for open information extraction</article-title>
          .
          <source>In Proceedings of the Conference on Empirical Methods in Natural Language Processing</source>
          , pages
          <fpage>1535</fpage>
          -
          <lpage>1545</lpage>
          . Association for Computational Linguistics,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [Levesque et al.,
          <year>2012</year>
          ]
          <string-name>
            <given-names>Hector J.</given-names>
            <surname>Levesque</surname>
          </string-name>
          , Ernest Davis, and
          <string-name>
            <given-names>Leora</given-names>
            <surname>Morgenstern</surname>
          </string-name>
          .
          <article-title>The Winograd Schema Challenge</article-title>
          . In Gerhard Brewka, Thomas Eiter, and Sheila A. McIlraith, editors,
          <source>Proceedings of the Thirteenth International Conference on the Principles of Knowledge Representation and Reasoning (KR</source>
          <year>2012</year>
          ), pages
          <fpage>552</fpage>
          -
          <lpage>561</lpage>
          . AAAI Press,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [Levesque, 1990]
          <string-name>
            <given-names>Hector J.</given-names>
            <surname>Levesque</surname>
          </string-name>
          .
          <article-title>All I know: a study in autoepistemic logic</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>42</volume>
          (
          <issue>2</issue>
          ),
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [Liu et al.,
          <year>2017</year>
          ] Quan Liu, Hui Jiang, Andrew Evdokimov,
          <string-name>
            <given-names>Zhen-Hua</given-names>
            <surname>Ling</surname>
          </string-name>
          , Xiaodan Zhu, Si Wei, and
          <string-name>
            <given-names>Yu</given-names>
            <surname>Hu</surname>
          </string-name>
          .
          <article-title>Cause-effect knowledge acquisition and neural association model for solving a set of Winograd schema problems</article-title>
          .
          <source>In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI)</source>
          , pages
          <fpage>2344</fpage>
          -
          <lpage>2350</lpage>
          . AAAI Press,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [Matuszek et al.,
          <year>2006</year>
          ]
          <string-name>
            <given-names>Cynthia</given-names>
            <surname>Matuszek</surname>
          </string-name>
          , John Cabral, Michael Witbrock, and
          <string-name>
            <given-names>John</given-names>
            <surname>DeOliveira</surname>
          </string-name>
          .
          <article-title>An introduction to the syntax and content of Cyc</article-title>
          .
          <source>In Proceedings of the 2006 AAAI Spring Symposium on Formalizing and Compiling Background Knowledge and Its Applications to Knowledge Representation and Question Answering</source>
          , pages
          <fpage>44</fpage>
          -
          <lpage>49</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [Mitchell et al.,
          <year>2015</year>
          ]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mitchell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Hruschka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Talukdar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Betteridge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Carlson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Dalvi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gardner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kisiel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Krishnamurthy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Lao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mazaitis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mohamed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nakashole</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Platanios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ritter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Samadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Settles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wijaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saparov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Greaves</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Welling</surname>
          </string-name>
          .
          <article-title>Never-ending learning</article-title>
          .
          <source>In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI-15)</source>
          , pages
          <fpage>2302</fpage>
          -
          <lpage>2310</lpage>
          . AAAI Press,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [Pearl, 1990]
          <string-name>
            <given-names>Judea</given-names>
            <surname>Pearl</surname>
          </string-name>
          .
          <article-title>System Z: A natural ordering of defaults with tractable applications to nonmonotonic reasoning</article-title>
          .
          <source>In Proceedings of the Third Conference on Theoretical Aspects of Reasoning about Knowledge (TARK)</source>
          , pages
          <fpage>121</fpage>
          -
          <lpage>135</lpage>
          . Morgan Kaufmann,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [Rahman and Ng, 2012]
          <string-name>
            <given-names>Altaf</given-names>
            <surname>Rahman</surname>
          </string-name>
          and
          <string-name>
            <given-names>Vincent</given-names>
            <surname>Ng</surname>
          </string-name>
          .
          <article-title>Resolving complex cases of definite pronouns: The Winograd schema challenge</article-title>
          .
          In Jun'ichi Tsujii, James Henderson, and Marius Pasca, editors,
          <source>Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL</source>
          <year>2012</year>
          ), pages
          <fpage>777</fpage>
          -
          <lpage>789</lpage>
          . ACL,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [Schwering and Lakemeyer, 2016]
          <string-name>
            <given-names>Christoph</given-names>
            <surname>Schwering</surname>
          </string-name>
          and
          <string-name>
            <given-names>Gerhard</given-names>
            <surname>Lakemeyer</surname>
          </string-name>
          .
          <article-title>Decidable reasoning in a first-order logic of limited conditional belief</article-title>
          .
          <source>In Proceedings of the Twenty-Second European Conference on Artificial Intelligence (ECAI)</source>
          , pages
          <fpage>1379</fpage>
          -
          <lpage>1387</lpage>
          . IOS Press,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [Schwering et al.,
          <year>2017</year>
          ]
          <string-name>
            <given-names>Christoph</given-names>
            <surname>Schwering</surname>
          </string-name>
          , Gerhard Lakemeyer, and
          <string-name>
            <given-names>Maurice</given-names>
            <surname>Pagnucco</surname>
          </string-name>
          .
          <article-title>Belief revision and projection in the epistemic situation calculus</article-title>
          .
          <source>Artificial Intelligence</source>
          ,
          <volume>251</volume>
          :
          <fpage>62</fpage>
          -
          <lpage>97</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [Schwering, 2017]
          <string-name>
            <given-names>Christoph</given-names>
            <surname>Schwering</surname>
          </string-name>
          .
          <article-title>Limbo: A reasoning system for limited belief</article-title>
          .
          <source>In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI)</source>
          , pages
          <fpage>5246</fpage>
          -
          <lpage>5248</lpage>
          . AAAI Press,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [Sharma et al.,
          <year>2015</year>
          ]
          <string-name>
            <given-names>Arpit</given-names>
            <surname>Sharma</surname>
          </string-name>
          , Nguyen Ha Vo, Somak Aditya, and
          <string-name>
            <given-names>Chitta</given-names>
            <surname>Baral</surname>
          </string-name>
          .
          <article-title>Towards addressing the Winograd schema challenge - building and using a semantic parser and a knowledge hunting module</article-title>
          .
          <source>In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI)</source>
          , pages
          <fpage>1319</fpage>
          -
          <lpage>1325</lpage>
          . AAAI Press,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [Speer and Havasi, 2012]
          <string-name>
            <given-names>Robert</given-names>
            <surname>Speer</surname>
          </string-name>
          and
          <string-name>
            <given-names>Catherine</given-names>
            <surname>Havasi</surname>
          </string-name>
          .
          <article-title>Representing general relational knowledge in ConceptNet 5</article-title>
          .
          <source>In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC)</source>
          , pages
          <fpage>3679</fpage>
          -
          <lpage>3686</lpage>
          . European Language Resources Association (ELRA),
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>