=Paper= {{Paper |id=Vol-2052/paper12 |storemode=property |title=Towards Representing What Readers of Fiction Believe |pdfUrl=https://ceur-ws.org/Vol-2052/paper12.pdf |volume=Vol-2052 |authors=Toryn Q. Klassen,Hector J. Levesque,Sheila A. McIlraith |dblpUrl=https://dblp.org/rec/conf/commonsense/KlassenLM17 }} ==Towards Representing What Readers of Fiction Believe== https://ceur-ws.org/Vol-2052/paper12.pdf
                       Towards Representing What Readers of Fiction Believe

                      Toryn Q. Klassen and Hector J. Levesque and Sheila A. McIlraith
                                                    Department of Computer Science
                                                          University of Toronto
                                                  {toryn,hector,sheila}@cs.toronto.edu




Abstract

Despite the extensive literature on the problem of story understanding, there has been little focus on formally representing some forms of knowledge that are specific to stories, such as how the reader expects information to be presented over the course of reading. To illustrate, the reader of a mystery story may expect to eventually find out who is guilty, and also that the author may first try to mislead them about who is guilty. We propose literary logic, a formalism based on work by Friedman and Halpern for reasoning about dynamic systems, and apply it in representing this sort of knowledge. We also consider issues relating to carrying over world knowledge into fiction, and knowledge of genre conventions.

1 Introduction

Story understanding is a long-standing problem in artificial intelligence, with notable early work from the 1970s (Charniak 1972; Schank and Abelson 1977). McCarthy (1990), in a memo originally from 1976, pointed out that stories raise problems for commonsense reasoning. Research has continued, and recent years have seen a proliferation of corpora of stories in various mediums with accompanying questions for machine learning purposes, including MCTest (Richardson, Burges, and Renshaw 2013), ROCStories (Mostafazadeh et al. 2016), MovieQA (Tapaswi et al. 2016), and COMICS (Iyyer et al. 2017). In this paper, we are concerned with the task of a reader answering questions about a story after reading (some prefix of) it.¹ We propose a logic for the purpose of determining how the reader would answer such questions based on their various types of background knowledge.

¹ We will not be considering summarization or further tasks that have been suggested as part of story understanding (Michael 2013).

Much of the work on story understanding has focused on the “world knowledge” needed to understand stories. For example, Charniak (1972) devoted a chapter to how knowledge about piggy banks can be used in understanding passages about them. In representing stereotypical events, scripts (Schank and Abelson 1977) also encode world knowledge, like that tips are given at restaurants after eating.

However, there are other forms of knowledge that are also relevant. Diakidoy et al. (2014) suggested that readers have “story knowledge” such as expectations that characters’ efforts would meet with complications, but they did not try to represent that in their argumentation-based approach to story understanding. Some information of that sort could be represented in a story grammar (Rumelhart 1975). Charniak and Goldman (1989) pointed out the significance of readers assuming that mentioned objects are going to be relevant. In work on using abduction to interpret text (Hobbs, Stickel, and Martin 1993), it’s been suggested that the abductive explanations might refer to such things as authors’ plans. Despite interest in interpreting literature (Hobbs 1990), this has not been much focused on in the context of stories. We may note that if scripts are learned from corpora, as by Chambers and Jurafsky (2009), they probably end up also capturing information about what events authors find noteworthy. Chaturvedi, Peng, and Roth (2017) consider several forms of knowledge in trying to predict the correct ending of a story, including knowledge of patterns of sentiment in stories.

Forms of knowledge which have not been explored in much depth include what the reader believes that they will come to learn from reading (parts of) the story, and what the reader thinks the author will try to make them believe over time. For example, the reader may believe that they will learn from reading a mystery who was guilty, but that the author will try to make them believe at some time that an innocent character is guilty. Or the reader may believe that if they haven’t been told the main character’s eye color by halfway through a book, they’ll never find it out. In this paper, we apply an approach to modelling belief and time to this sort of representational problem. We also focus specifically on fiction, unlike most AI story understanding research.

Applying world knowledge to fiction is more complicated than to non-fiction. In the philosophical literature, there has been substantial work on defining “truth in fiction”. Lewis (1978) noted the phenomenon of carry-over, that “factual premisses [...] may carry over into the fiction, not because there is anything explicit in the fiction to make them true, but rather because there is nothing to make them false” (p. 42). To use an example of his, we may assume that Sherlock Holmes does not have a third nostril. Lewis offered multiple definitions of truth in fiction; his “Analysis I” said that

    A sentence of the form “In the fiction f, φ” is non-vacuously true iff some world where f is told as known fact and φ is true differs less from our actual world, on balance, than does any world where f is told as known fact and φ is not true.

His “Analysis II” was similar but instead of considering differences from the actual world, considered differences from the worlds where the common beliefs of the fiction’s community of origin were true. Others have used similar ideas, e.g. Walton (1990) had his “Reality Principle” and “Mutual Belief Principle” which roughly correspond to Lewis’s analyses, though Walton regarded them only as rules of thumb. Genre information – e.g., about time travel (Morgenstern 2014), or that dragons breathe fire – is another sort of knowledge, which it is unclear how to incorporate into these sorts of definitions; perhaps the most detailed approach attempting to do so was given by Bonomi and Zucchi (2003).

These philosophical approaches were not fully formalized and expressed in logic. The logics designed by philosophers for dealing with fiction (Woods 1974; Heintz 1979) have usually focused on other issues, like handling inconsistent stories (which we will not be addressing in this paper).

The formal logic we present in this paper, which we call literary logic (LL), is a variant of the logic used by Friedman and Halpern (1999) to model belief revision in dynamical systems. We argue that LL can be used to represent various forms of knowledge relevant to story understanding. We focus on two main issues: representing readers’ expectations about stories (which may take into account genre-specific information), and the carry-over of world knowledge and its interaction with genre knowledge (e.g. about dragons). Literary logic provides temporal features that we apply to the first issue (though they may also have a role to play with respect to the second), and non-monotonic aspects that are useful for both. The outline of this paper is as follows. Section 2 describes the syntax and semantics of literary logic, and notes some of its properties. Section 3 shows how the question-answering task can be formalized, describing how we can make use of abnormality predicates (McCarthy 1986) in specifying the reader’s initial epistemic state. Section 4 formalizes some examples of reader knowledge: we consider carry-over (and incorporating genre knowledge) in section 4.1, and then expectations about mystery stories in section 4.2. Section 5 discusses related work, and section 6 concludes with a discussion of future work.

2 Literary logic

This section describes the language LL, which is closely based on the logic of Friedman and Halpern, which provided for modelling the accessibility and plausibility of possible worlds over time. The major differences include that LL is first-order, is evaluated with respect to finite rather than infinite timelines (because stories are finite and are read in finite time), and includes the complete set of past and future temporal operators from Lichtenstein, Pnueli, and Zuck (1985). LL describes the beliefs of a reader over time as they read a discourse, a sequence of logical sentences representing a story, one sentence per time step.

A very visible feature of LL (that is mostly just for clarity of presentation) is that we have two sorts of predicates, “real” and “imaginary” ones, and have a special “in imagination” operator I. The idea is that real predicates describe properties in the real world (including “literary” properties, like the genre of the story being read) while imaginary predicates (that hold only “in imagination”) describe properties that apply within the world of the story being read. The reader’s beliefs about the extensions of both sorts of predicates can change over the course of reading.

2.1 Syntax

The syntax of LL involves both terms and predicates.

A term is either a standard name or a variable. There is a countably infinite set N = {#1, #2, #3, . . . } of standard names. Intuitively, these stand for all the objects that we may want to refer to, including not just real-life things like piggy banks, but also theoretical literary concepts like what Van Inwagen (1977) called “creatures of fiction”, like the character Sherlock Holmes or his pipe. There also is a countably infinite set of variables. Note that the logic does not have constants or function symbols, though the standard names can be thought of as constant symbols that satisfy the unique name assumption and an infinitary version of domain closure. For a discussion of why standard names are useful, see Levesque and Lakemeyer (2000, section 2.2).

As previously indicated, there are two (non-empty) sets of predicate symbols, the real Φr and the imaginary Φf (which do not have to be disjoint). Each predicate P from either set has an arity, ar(P), which is the number of terms that it takes as arguments. So, to give a typical example, there could be a real unary predicate Rabbit that indicates its argument is a rabbit in reality, and an imaginary unary predicate also called Rabbit that indicates its argument is a rabbit in the world of the story under consideration. The set of real predicates would also typically include predicates to express literary propositions; for example, there could be a 0-ary real predicate FantasyGenre which would indicate that the story was in the fantasy genre.

A real atom is a string of the form P(t1, . . . , tk), where P ∈ Φr, k = ar(P), and t1, . . . , tk are terms. Similarly, an imaginary atom is a string of the form Q(t1, . . . , tk), where Q ∈ Φf. We will say that an atom is ground if no variables appear in it. We will assume that there is a unary predicate Mentioned ∈ Φr, which we will later give the special meaning of picking out those standard names that appear within the discourse.

The formulas of LL are the expressions of the form φ generated by the grammar below, where P is a real atom, Q is an imaginary atom, x is a variable, and t1 and t2 are terms.

    α := Q | ¬α | (α ∧ α) | (t1 = t2) | ∃x(α)
    φ := P | ¬φ | (φ ∧ φ) | (t1 = t2) | ∃x(φ) | Iα | Dα | ○φ | ●φ | φ U φ | φ S φ | φ ⇝ φ

We will also be talking about α-type formulas, which are expressions of the form α generated by the grammar (though the φ-type formulas are what we will mean when we refer to LL formulas). A variable x appearing in an (α- or φ-type) formula is said to be free if it does not appear within a subformula of the form ∃x(φ), and a formula with no free variables is called a sentence. The use to which we will put α-type sentences is that the discourse being read is a sequence of α-type sentences, which describe the world of the story.
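As an illustration of the α-type fragment of the grammar, the following is a minimal Python sketch of one way the abstract syntax and the free-variable test could be encoded. It is not part of the formalism: the class names (Name, Var, Atom, Not, And, Eq, Exists) and the helpers free_vars and is_sentence are hypothetical, and a variable is treated as free in the usual occurrence-based sense (it has an occurrence outside every ∃ that binds it).

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Name:
    """A standard name such as #1, stored as its positive index (assumed encoding)."""
    index: int


@dataclass(frozen=True)
class Var:
    """A variable, identified by a string (assumed encoding)."""
    ident: str


Term = Union[Name, Var]


@dataclass(frozen=True)
class Atom:
    """An imaginary atom Q(t1, ..., tk)."""
    pred: str
    args: Tuple[Term, ...]


@dataclass(frozen=True)
class Not:
    body: "Alpha"


@dataclass(frozen=True)
class And:
    left: "Alpha"
    right: "Alpha"


@dataclass(frozen=True)
class Eq:
    left: Term
    right: Term


@dataclass(frozen=True)
class Exists:
    var: Var
    body: "Alpha"


# The five productions of the alpha-type grammar.
Alpha = Union[Atom, Not, And, Eq, Exists]


def free_vars(f: Alpha) -> FrozenSet[Var]:
    """Return the variables that occur in f outside any quantifier binding them."""
    if isinstance(f, Atom):
        return frozenset(t for t in f.args if isinstance(t, Var))
    if isinstance(f, Eq):
        return frozenset(t for t in (f.left, f.right) if isinstance(t, Var))
    if isinstance(f, Not):
        return free_vars(f.body)
    if isinstance(f, And):
        return free_vars(f.left) | free_vars(f.right)
    if isinstance(f, Exists):
        return free_vars(f.body) - {f.var}
    raise TypeError(f"not an alpha-type formula: {f!r}")


def is_sentence(f: Alpha) -> bool:
    """A sentence is a formula with no free variables."""
    return not free_vars(f)


# Example formulas: ∃x(Dragon(x)) and (Dragon(x) ∧ Knight(#2)).
x = Var("x")
dragon_exists = Exists(x, Atom("Dragon", (x,)))
open_formula = And(Atom("Dragon", (x,)), Atom("Knight", (Name(2),)))
```

Under this encoding, is_sentence(dragon_exists) holds, so ∃x(Dragon(x)) could appear in a discourse, whereas open_formula has x free and could not.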
An LL sentence (that is, a φ-type sentence) describes the real world, and can include modal operators to describe the reader’s beliefs over time.

The operators ¬, ∧, ∃, and = are familiar from first-order logic, and we can use them to define abbreviations like ∨, ⊃, ≡, and ∀ in the usual ways. It’s convenient to have a symbol ⊤ that always takes a true truth value; let ⊤ := ∀x(x = x).

We will read Iα as “α is imagined” or “α is true in imagination” (though the I operator serves a technical function and is not intended to formalize a commonsense notion of imagination). We will read Dα as “The last sentence read of the discourse was α”.

The operators ○ (“next”), ● (“previous”), U (“until”), and S (“since”) are standard temporal logic operators.² They describe time for the reader, who reads one sentence of the discourse each time step. We can define further temporal operators in the usual ways: ◇ (“eventually”) by ◇φ := ⊤ U φ, □ (“always in the future”) by □φ := ¬◇¬φ, ◆ (“sometime in the past”) by ◆φ := ⊤ S φ, and ■ (“always in the past”) by ■φ := ¬◆¬φ. We can also define an operator ¸ (“after reading”) by ¸ φ := ◇(φ ∧ ¬○⊤), so ¸ φ means that φ is true at the final time (i.e., when the entire story has been read), and µ (“initially”) by µ φ := ◆(φ ∧ ¬●⊤), so that µ φ means φ is true at time 0 (the initial time).

² Our “next” and “previous” operators are the “strong” versions.

The formula φ1 ⇝ φ2 means that φ2 is true in all the most plausible accessible worlds in which φ1 is true. We could follow Friedman and Halpern in defining a belief operator B with the abbreviation Bφ := ⊤ ⇝ φ (that is, φ is believed if it is true in all the most plausible accessible worlds), but instead let us give a more general definition: if ψ is any sentence, let

    Bψ φ := (µ ψ) ⇝ φ.    (1)

We can think of Bψ φ as indicating that φ is believed by an agent who initially considers it impossible that ψ is false. We will call ψ the knowledge base (or KB) of the agent (though ψ is not necessarily true). Note that ψ cannot include the Bψ operator, for then Bψ φ would not expand to a finite sentence, but ψ can contain Bψ′ for a suitable different sentence ψ′. We also define a “knowledge” operator Kψ by

    Kψ φ := ((µ ψ) ∧ ¬φ) ⇝ ¬⊤.    (2)

The result is that Kψ φ is true if the agent with knowledge base ψ considers it impossible that φ is false.

We define (a subjective version of) fictional truth to be what the reader, after reading the entire story, believes is imagined:

    Fψ α := ¸ Bψ Iα.    (3)

We can read Fψ α as saying that α is (subjectively) fictionally true. The reason why we want to consider the final time (and so use the ¸ operator) is that fictional truth is determined by the story as a whole, which has only been fully consumed at the final time.

Furthermore, we define [α]φ by [α]φ := (○Dα ⊃ ○φ). So [α]φ says that φ is true after reading α (provided that the next sentence is actually α). We will abbreviate sequences of such operators with [α1; . . . ; αk]φ := [α1] · · · [αk]φ. So, e.g., [α1; α2]φ abbreviates (○Dα1 ⊃ ○(○Dα2 ⊃ ○φ)).

2.2 Semantics

The grammar provides two types of formulas, denoted by α and φ. While our goal in this section is to define satisfaction and validity with respect to φ-type sentences, let us first define a satisfaction relation ⊨ that specifies when an α-type sentence is satisfied by an interpretation π (which we take to be a set of imaginary ground atoms). We will use the notation α[x/c] to indicate the formula obtained by replacing all free occurrences of the variable x in α by c ∈ N.

 1. π ⊨ Q(c1, . . . , ck) iff Q(c1, . . . , ck) ∈ π
 2. π ⊨ ¬α iff π ⊭ α
 3. π ⊨ (α1 ∧ α2) iff π ⊨ α1 and π ⊨ α2
 4. π ⊨ (c1 = c2) iff c1 and c2 are identical names
 5. π ⊨ ∃x(α) iff π ⊨ α[x/c] for some c ∈ N

This is just an established way of giving the semantics of a first-order logic with substitutional quantification (Levesque and Lakemeyer 2000). Now, let us make some definitions.

Definition 1 (discourse). A discourse is a finite sequence of α-type sentences, ending with End, a special sentence not appearing earlier (we do not have to introduce a new symbol for this; we can just take End = ⊤).

Definition 2 (complex world). A (complex) world is a tuple w = ⟨wr, wf, wd⟩ where wr is a set of real ground atoms, wf is a set of imaginary ground atoms, and wd is a discourse s.t. (1) if wd(i) = α for any i then wf ⊨ α, and (2) c ∈ N appears in a sentence of wd iff Mentioned(c) ∈ wr.

Intuitively, wr is the set of all real ground atoms that are true in the world w, wf is the set of all imaginary ground atoms that are true in w, and wd is a formal representation of the story that is told in w. Note that wf represents one way of “completing” the fictional world in a way compatible with the story being told. What wf makes true is not the same as what is fictionally true (as determined by a Fψ operator).

The sentences of a discourse, unlike those of a natural language story, are not indexical relative to the “current” time within the story. So, for example, the rather trivial story “John picked up a block. Then he put it back down.” could get encoded (in a style based on Maslan, Roemmele, and Gordon (2015)) as the following discourse: ⟨John(#1) ∧ Block(#2) ∧ Pickup(#1, #2, #3), Precedes(#3, #4) ∧ Putdown(#1, #2, #4), End⟩. The last arguments to Pickup and Putdown are meant to be the names of event instances, so e.g. Pickup(#1, #2, #3) says that #3 is an event in which #1 picked up #2, and Precedes expresses the events’ temporal ordering (with respect to time within the story, not time for the reader). The point here however is not the specific way these sentences represent time, but that they are like what Quine (1968) called “eternal sentences” in that their truth does not depend on their time of evaluation. We will also expect a discourse to usually provide standard names for relevant objects and events, as our example did.

In order to provide semantics for the ⇝ operator, we need a way to represent plausibility. Friedman and Halpern did so using the very general notion of a plausibility space; we will use what can be considered a special case of that, a version of the popular “system of spheres” representation
(Lewis 1973; Grove 1988; Bonomi and Zucchi 2003). Below we will use W to denote the set of all complex worlds.

Definition 3 (system of spheres). A system of spheres is a set S of subsets (“spheres”) of W such that (1) for any two spheres U ∈ S and V ∈ S, either U ⊆ V or V ⊆ U, (2) for any non-empty set V ⊆ W, there is a ⊆-minimal sphere C such that C ∩ V ≠ ∅, and (3) W ∈ S.

A system of spheres can also be thought of as a total preorder ⪯ on worlds, where w ⪯ v (“w is at least as plausible as v”) if every sphere containing v also contains w. Every system of spheres has a “central” sphere (the ⊆-minimal sphere C such that C ∩ W ≠ ∅) containing the ⪯-minimal (most plausible) worlds.

The ⇝ operator depends not just on the plausibility of worlds, but on which worlds are (currently) accessible.

Definition 4. For b a non-negative integer, the accessibility relation at time b, ∼b ⊆ W × W, is given by w ∼b v iff |wd| ≥ b, |vd| ≥ b, and wd(i) = vd(i) for 1 ≤ i ≤ b.

Intuitively, at time b the reader will not consider possible any world with a discourse not starting with the same b sentences they have read so far. For a world w with |wd| ≥ b we may use the notation [w]∼b := {v ∈ W : w ∼b v}. That is, [w]∼b is the set of worlds accessible from w at time b.

The satisfaction of a literary logic sentence φ is given relative to a system of spheres ⪯, a world w = ⟨wr, wf, wd⟩ ∈ W, and a time b ∈ {0, 1, . . . , n}, where n = |wd|, the length of the discourse wd. The recursive rules for when ⪯, w, b satisfy φ, written ⪯, w, b ⊩ φ, are given below:

 1. ⪯, w, b ⊩ P(c1, . . . , ck) iff P(c1, . . . , ck) ∈ wr
 2. ⪯, w, b ⊩ ¬φ iff ⪯, w, b ⊮ φ
 3. ⪯, w, b ⊩ (φ1 ∧ φ2) iff ⪯, w, b ⊩ φ1 and ⪯, w, b ⊩ φ2
 4. ⪯, w, b ⊩ (c1 = c2) iff c1 and c2 are identical names
 5. ⪯, w, b ⊩ ∃x(φ) iff ⪯, w, b ⊩ φ[x/c] for some c ∈ N
 6. ⪯, w, b ⊩ Iα iff wf ⊨ α
 7. ⪯, w, b ⊩ Dα iff b > 0 and wd(b) = α
 8. ⪯, w, b ⊩ ○φ iff b < n and ⪯, w, b + 1 ⊩ φ
 9. ⪯, w, b ⊩ ●φ iff b > 0 and ⪯, w, b − 1 ⊩ φ
10. ⪯, w, b ⊩ φ1 U φ2 iff ⪯, w, j ⊩ φ2 for some j such that b ≤ j ≤ n and ⪯, w, k ⊩ φ1 for all k s.t. b ≤ k < j
11. ⪯, w, b ⊩ φ1 S φ2 iff ⪯, w, j ⊩ φ2 for some j such that 0 ≤ j ≤ b and ⪯, w, k ⊩ φ1 for all k s.t. j < k ≤ b
12. ⪯, w, b ⊩ φ1 ⇝ φ2 iff ⪯, v, b ⊩ φ2 for every v ∈ min⪯ {u ∈ [w]∼b : ⪯, u, b ⊩ φ1}

We will write ⪯, w ⊩ φ if ⪯, w, 0 ⊩ φ. We will write ⊩ φ (“φ is valid”) if ⪯, w ⊩ φ for every system of spheres ⪯ and world w.

2.3 Properties

To understand the Bψ operator, it is helpful to introduce another accessibility relation, ∼ψb ⊆ W × W, where b is a time and ψ a sentence.

Definition 5. Given a system of spheres ⪯, a time b, and sentence ψ, define w ∼ψb v iff w ∼b v and ⪯, v, 0 ⊩ ψ.

Intuitively, w ∼ψb v if at world w and time b, the reader with knowledge base ψ considers world v possible.

Observation 1. ⪯, w, b ⊩ Bψ φ iff ⪯, v, b ⊩ φ for every v ∈ min⪯([w]∼ψb).

Bψ can be shown to be a K45 operator (supporting positive and negative introspection). There is also remembrance of past beliefs, e.g. we have ⊩ Bψ φ ⊃ Bψ Bψ φ.

Observation 2. While (I(α1 ∨ α2) ⊃ (Iα1 ∨ Iα2)) and (I(∃xα) ⊃ ∃xIα) are valid for any α1 and α2, Fψ(α1 ∨ α2) ⊃ (Fψ α1 ∨ Fψ α2) and Fψ(∃xα1) ⊃ ∃xFψ α1 are not (assuming for the last that some Q ∈ Φf has nonzero arity).

Note that if Observation 2 did not hold the behavior of the Fψ operator would contradict the generally accepted idea that fiction is incomplete (Doležel 1995) and so there is no answer to the question of, for example, exactly how many children Lady Macbeth had in Shakespeare’s Macbeth (Wolterstorff 1976). As McCarthy (1990) wrote, “In a made-up story, questions about middle names or what year the story occurred in do not necessarily have an answer”.

Observation 3. ⊩ (Dα ⊃ Fψ α). That is, whatever the discourse includes is fictionally true for any reader.

Note this means we cannot encode metaphorical language. Also, Walton (1990, §4.5) raised philosophical questions on how literally some other aspects of stories should be taken, such as whether Shakespearean characters are really fictionally uttering the poetic speeches attributed to them (or, more simply, whether characters are really speaking English in English-language stories). Our formalism does not offer a choice in how to answer that.

3 Applying LL to question-answering

We want to formalize how a reader would answer questions after reading (a prefix of) a story. As part of this formalization, we want to specify (within the language) not just what the reader initially believes, but what things the reader initially considers more plausible than others (so as to determine exactly how the reader’s beliefs evolve in response to reading). In LL, following some previous work on belief revision (Friedman and Halpern 1999; Shapiro et al. 2011), the plausibility of worlds does not actually change over time, but only the accessibility relation. That nonetheless suffices to allow whether a proposition is believed to change back and forth over time (see Shapiro et al. 2011, section 6). This suggests we can fix one system of spheres to always use, and just set the initial accessibility relation appropriately (which the ψ in the Bψ operator has the effect of doing). In this section, we will define the ‘‖∼’ relation, the analogue of ⊩ for a particular fixed system of spheres. Then question-answering can be done by determining which expressions of the form ‖∼ [α1; . . . ; αk]Bψ φ hold, i.e. what a reader with a KB ψ of background knowledge (of possibly various types) believes after reading the first k sentences of the discourse.

To define the specific system of spheres, we will apply the idea of circumscription (McCarthy 1986) and have the plausibility of worlds be inversely related to the sizes of the extensions of distinguished “abnormality” predicates. Suppose that we have a finite set of abnormality predicates, each
with an associated priority (a positive integer). If Ab is a       sequence of all leading universally quantified variables). If
k-ary abnormality predicate of priority i, we will say that        φ(x⃗) uses only operators from first-order logic and does not
Ab(c1, . . . , ck) is a priority i ground atom. For a world w,     include real atoms for which there are not imaginary counterparts,
let Ci(w) ∈ {0, 1, 2, . . . } ∪ {∞} be the total number of         then I(φ(x⃗)) is also a formula. Then you could
priority i ground atoms in wr and wf. Let the partial order        automatically generate the sentence ∀x⃗(Ab(x⃗) ∨ I(φ(x⃗))),
≺CIRC ⊆ W × W be defined by w ≺CIRC v if there is some i           where Ab is some abnormality predicate (of appropriate arity)
for which Ci(w) < Ci(v) and Cj(w) ≤ Cj(v) for all j ≤ i.           not used in ψ. This new sentence, roughly a defeasible
Note that ≺CIRC is a prioritized version of the preference         imaginary copy of ∀x⃗(φ(x⃗)), could be conjoined with ψ.
relation from cardinality-based circumscription (Liberatore           Carry-over by humans is probably more complicated than
and Schaerf 1997; Moinard 2000). The associated preorder           that. Ryan (1991, ch. 3) proposed restrictions on what should
⪯CIRC can be seen to satisfy the system of spheres definition.     get carried over, including that the existence of real people
Definition 6 (‖∼). For w a world, b a time, and φ an LL            or geographic locations should only be carried over into
sentence, we define ‖∼ by w, b ‖∼ φ if ⪯CIRC, w, b ⊩ φ,            fictions that name at least one real person or location. A
and we define ‖∼ φ if ⪯CIRC, w ⊩ φ for every world w.              psychological experiment of Weisberg and Goldstein (2009)
                                                                   suggested that people are less likely to carry over facts into
   Using the fixed system of spheres ⪯CIRC, essentially the        fictions differing from reality in other ways.
reader represented using the Bψ operator “only knows” the             Below we consider the interaction of carry-over with fictional
knowledge base ψ (see Levesque (1990)) but also applies            conventions in two examples of philosophical origin.
circumscription to determine their beliefs. So we have, e.g.,
‖∼ B(P ⊃ ab) ¬P. Note that Observations 2 and 3 still apply if     Scrulch the dragon. Lewis (1978, p. 45) gave a case where
the use of ‘⊩’ in them is replaced by ‘‖∼’, and Observation        fictional truth depends on more than world knowledge:
1 works for any system of spheres, including ⪯CIRC.                  Suppose I write a story about the dragon Scrulch, a
                                                                     beautiful princess, a bold knight, and what not. It is a
4     Examples of formalizing reader knowledge                       perfectly typical instance of its stylized genre, except
As a reader reads, they draw conclusions about the imag-             that I never say that Scrulch breathes fire. Does he nev-
inary world of the story, and also about real-world liter-           ertheless breathe fire in my story? Perhaps so, because
ary truths, like the genre of a story. Consider the knowl-           dragons in that sort of story do breathe fire. But the ex-
edge base ψ = I(∀x(Knight(x) ∧ ¬Ab(x) ⊃ Man(x))) ∧                   plicit content does not make him breathe fire. Neither
(I(∃xDragon(x )) ⊃ FantasyGenre) which states that (in               does background, since in actuality and according to
imagination) knights are normally men, and that the ex-              our beliefs there are no animals that breathe fire.
istence of (imaginary) dragons is a sign of the story be-          For us there is no difficulty in writing additional sentences
longing to the fantasy genre. If a reader with this knowl-         that describe how things in imagination are different from in
edge reads as the first sentence of discourse (Dragon(# 1) ∧       reality, such as Ab ∨I(∀x(Dragon(x ) ⊃ BreathesFire(x)).
Knight(# 2)), which is a formal version of “There was              Here Ab would represent the abnormality of a story about
a knight and a dragon”, we would want them to believe              dragons which didn’t breathe fire. We would have to give Ab
that the story is a fantasy and that there is in imagina-          sufficiently high priority so that this sentence would overrule
tion a man, and that is what we have: k∼ [Dragon(# 1) ∧            any carried over beliefs about animals not breathing fire in
Knight(# 2)](Bψ FantasyGenre ∧ Bψ IMan(# 2)).                      general. Note that the sentence does not do anything to spec-
   Expectations about the story’s development can also be          ify the fire-breathing abilities of real dragons; despite believ-
represented. The expectation that a story will literally fol-      ing that fictional dragons normally breathe fire, the reader
low a version of the rule of “Chekhov’s gun” – if a gun            could still regard real dragons that breathe fire as (even) less
is shown hanging on the wall in one scene, it should               plausible than real dragons that do not breathe fire.
be fired by the end of the story – can be written as
∀x∀e1 ∃e2 Mentioned (e1 ) ∧ I(HangingOnWall (x, e1 ) ∧            Recognizing a witch Walton (1990, §4.3) gave a number
Gun(x)) ∧ ¬Ab ⊃ (Mentioned (e2 ) ∧ I(Firing(x, e2 ))) .            of examples of tricky cases about fictional truth, including
That is, if the eventuality e1 of a gun hanging on the wall is     one about what information is needed to recognize a fictional
mentioned, then normally a firing event e2 is also mentioned.      character as a witch. He wrote (p. 161, 164) the following
(The reader would also need further knowledge about events         (about drawing, but clearly also relevant to other media):
to prevent considering that e1 = e2 .) How to encode the gen-        Any child can draw a witch. Depicting a woman with a
eral underlying pragmatic principle is less clear.                   black cape, conical hat, and long nose will usually do
                                                                     the trick. [...] The fact that fictionally there is a witch
4.1    Carry-over and genre conventions                              is implied by the fact that fictionally there is a woman
In the example with the knight, to make the belief that              with a black cape, conical hat, and long nose. But it is
knights are normally men applicable to fiction, we enclosed          not the case that were there (in the real world) a long-
it in an I operator. However, we would prefer to write beliefs       nosed woman decked out in black cape and conical hat
about the real world, and have them automatically get car-           [...], there would be a witch. [...] Although it is fictional
ried over to fiction. A first, syntactic, approximation to that      in a mutually recognized legend that there are witches
is the following: Suppose the knowledge base ψ is a con-             and that they have long noses and wear conical hats,
junction including a conjunct ∀~x(φ(~x)) (~x abbreviates the         it is much less clearly fictional in it that, were there a
  woman of that description, she would be a witch. Is it part of the legend that there are no Halloween parties, or that nonwitches never dress thus [...]?

This idea, that someone described in a stereotypically witch-like way is a witch while not necessarily everyone in the world of the story with witch-like characteristics is a witch, can be expressed in literary logic. We can do so by writing ∀x(I(WitchLike(x)) ∧ Mentioned(x) ∧ ¬Ab(x) ⊃ I Witch(x)). Then the I Witch(x) conclusion is not drawn for every x having witch-like characteristics, but only for those also mentioned in the discourse (recall the special Mentioned predicate). This is an example of scoped non-monotonic reasoning (Etherington, Kraus, and Perlis 1991).

4.2 Expectations about mystery stories

A reader may expect when reading a mystery story to eventually find out who is guilty. In this section, we will use the unary imaginary predicate symbol G(x) to mean that x is guilty (in imagination).

Suppose that ψ is the KB of a reader, and we want to inform this reader that they should expect to find out who is guilty. The proposition below shows how we can extend ψ into a KB ψ′ so a reader knowing only ψ′ believes they will find out who is guilty (assuming that a reader knowing only the original KB, ψ, does not believe that they won’t find out who’s guilty – in other words, that ‖∼ ¬B_ψ ¬∃x(F_ψ G(x))).

Proposition 1. Let G be a unary imaginary predicate. Suppose ψ is an LL sentence s.t. ‖∼ ¬B_ψ ¬∃x(F_ψ G(x)). Let ψ′ = ψ ∧ ∃x(F_ψ G(x)). Then ‖∼ B_ψ′ ∃x(F_ψ′ G(x)).

Proof. We want to prove that for every world w, we have w, 0 ‖∼ B_ψ′ ∃x(F_ψ′ G(x)). To do that, we want to show that v, 0 ‖∼ ∃x(F_ψ′ G(x)) for every v ∈ min([w]_{∼^0_ψ′}). Fix an arbitrary such v (if there are none, we are done), and let n = |v_d|. We have that v, 0 ‖∼ ψ′ and so (by the definition of ψ′) v, 0 ‖∼ ψ and v, 0 ‖∼ ∃x(F_ψ G(x)). Let c ∈ N be such that v, 0 ‖∼ F_ψ G(c). Then v, n ‖∼ B_ψ I G(c). Therefore, for each v′ ∈ min([v]_{∼^n_ψ}), we have v′, n ‖∼ I G(c). It can be shown that min([v]_{∼^n_ψ′}) ⊆ min([v]_{∼^n_ψ}), which means that for each v* ∈ min([v]_{∼^n_ψ′}), we have v*, n ‖∼ I G(c). Hence v, n ‖∼ B_ψ′ I G(c), and v, 0 ‖∼ ∃x(F_ψ′ G(x)).

We could also consider representing the knowledge that the reader won’t find out who’s guilty until very near the end, which Brewer and Lichtenstein (1982) considered to be an example of a “curiosity discourse organization”. The non-trivial part would be formalizing the vague “very near” (to say the reader won’t have a belief about who’s guilty at a precise time – say, three sentences before the end – we could simply write something like ¸ ¬∃x B_ψ I G(x)). Brewer and Lichtenstein suggested that the purpose of stories is to entertain, and that three ways that authors accomplish this is by creating suspense, surprise, and curiosity by manipulating when information gets revealed to the reader. Our next example might be considered a case of surprise.

A genre-savvy reader might think that in a mystery story, it’s true that “The first character the author tries to make you suspect of being guilty is innocent.” Under an assumption of authorial competence we can roughly paraphrase that as “The first character a naïve reader would suspect of being guilty is innocent.” Consider a reader with KB ψ; let us suppose that they are the reader the author would be able to trick into suspecting the wrong character. How could we extend their KB to make them genre-savvy?

As a prelude to that, consider the following formula:

  φ(x) = B_ψ I G(x) ∧ ∀y(y ≠ x ⊃ ¬B_ψ I G(y))    (4)

Recalling that “” is the “previously” operator, we can read φ(x) as “x is believed to be guilty (in imagination) and for all y not equal to x, y was not previously believed to be guilty (in imagination)”, where the beliefs are understood to be those of the reader with KB ψ. So, for a standard name c, φ(c) is true at a time iff c is the (unique) first character believed to be guilty (in imagination).

Below, proposition 2 shows a sentence (incorporating φ(x) as a subformula) we can conjoin to ψ to produce the knowledge base ψ′ of a savvy reader, and establishes that if φ(c) is ever true (i.e., that c is the first character the reader with KB ψ believes is guilty) then the reader with KB ψ′ will believe that c is not guilty (unless that reader knows that c is guilty, e.g. because the discourse includes G(c) explicitly).

Proposition 2. Let ψ be an LL sentence, let Ab be a 0-ary real abnormality predicate of higher priority than any abnormality predicate appearing in ψ, let G be a unary imaginary predicate, and (as in Equation 4) let φ(x) = B_ψ I G(x) ∧ ∀y(y ≠ x ⊃ ¬B_ψ I G(y)). Then define ψ′ = ψ ∧ (Ab ∨ ∀x(φ(x) ⊃ I¬G(x))). Then

  ‖∼ ∀x((φ(x) ∧ ¬K_ψ′ I G(x)) ⊃ B_ψ′ I¬G(x)).

Proof. Suppose for a world w, time b, and name c, we have w, b ‖∼ φ(c) ∧ ¬K_ψ′ I G(c). We want to show that w, b ‖∼ B_ψ′ I¬G(c). It can be shown that w, b ‖∼ ¬K_ψ′ Ab, and therefore (because all worlds in which Ab is false are more plausible than all others) w, b ‖∼ B_ψ′ ¬Ab. So w, b ‖∼ B_ψ′ ∀x(φ(x) ⊃ I¬G(x)), and so w, b ‖∼ B_ψ′ (φ(c) ⊃ I¬G(c)). So we will be done if we can show that w, b ‖∼ B_ψ′ φ(c), i.e. that v, b ‖∼ φ(c) for every v ∈ min([w]_{∼^b_ψ′}). This follows because for any such v, the discourse v_d must agree with w_d on the first b entries, which is enough to give φ(c) the same truth value there.

5 Discussion and related work

5.1 Regarding non-monotonic reasoning

We have space only to make a couple remarks in this section.

Many forms of circumscription, including prioritized circumscription, have been considered in the literature (see e.g. (Lifschitz 1994)). Cardinality-based circumscription (Liberatore and Schaerf 1997; Moinard 2000) has the advantage of being simple to work with because there will always be a set of most plausible worlds in which any sentence is true, if that sentence is true in any worlds.

The way that we can use the ψ in the B_ψ operator to refer to what is and isn’t believed by an agent with knowledge ψ′ recalls hierarchic autoepistemic logic (Konolige 1988). In common with that system, and unlike standard autoepistemic logic (Levesque 1990), the issue of there being multiple “stable expansions” of an agent’s beliefs does not arise.
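As a rough illustration of the prioritized, cardinality-based preference relation and the belief it induces, the following Python sketch works over a finite propositional language. It is only a simplified stand-in for LL, not its semantics: there is no time, no discourse, and no real/imaginary split. The function names (`all_worlds`, `cost`, `strictly_better`, `most_plausible`, `believes`) are hypothetical, abnormality atoms are assumed to be grouped into priority levels with index 0 compared first, and belief is approximated as truth in all most plausible models of the knowledge base.

```python
from itertools import product

def all_worlds(atoms):
    """Every truth assignment over the given atoms, as a dict."""
    return [dict(zip(atoms, values))
            for values in product([False, True], repeat=len(atoms))]

def cost(world, ab_levels):
    """C_i(w): how many abnormality atoms of priority level i are true in w."""
    return tuple(sum(world[a] for a in level) for level in ab_levels)

def strictly_better(cw, cv):
    """w is preferred to v: some i has C_i(w) < C_i(v) and C_j(w) <= C_j(v) for all j <= i."""
    return any(cw[i] < cv[i] and all(cw[j] <= cv[j] for j in range(i + 1))
               for i in range(len(cw)))

def most_plausible(worlds, ab_levels):
    """Worlds that no other world in the list is strictly preferred to."""
    costs = [cost(w, ab_levels) for w in worlds]
    return [w for w, cw in zip(worlds, costs)
            if not any(strictly_better(cv, cw) for cv in costs)]

def believes(kb, query, atoms, ab_levels):
    """Assumed reading of B_kb(query): query holds in all most plausible models of kb."""
    models = [w for w in all_worlds(atoms) if kb(w)]
    return all(query(w) for w in most_plausible(models, ab_levels))

# Knowledge base (P ⊃ ab) with a single abnormality atom "ab".
atoms = ["P", "ab"]
ab_levels = [["ab"]]
kb = lambda w: (not w["P"]) or w["ab"]

believes_not_p = believes(kb, lambda w: not w["P"], atoms, ab_levels)
believes_p = believes(kb, lambda w: w["P"], atoms, ab_levels)

# Two priority levels: when one abnormality is forced, the level-0 one is avoided.
atoms2 = ["ab1", "ab2"]
levels2 = [["ab1"], ["ab2"]]
kb2 = lambda w: w["ab1"] or w["ab2"]
prefers_low_priority_ab = believes(kb2, lambda w: w["ab2"] and not w["ab1"],
                                   atoms2, levels2)
```

Under these assumptions, the first query mirrors the example ‖∼ B_(P ⊃ ab) ¬P: the only most plausible model of the knowledge base makes both P and ab false. Because the costs are finite counts, any satisfiable sentence has at least one most plausible model, which is the convenience of the cardinality-based approach noted above.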
5.2 Regarding stories and fiction

Wilensky (1983) briefly discussed “dynamic points” in stories, involving violations of the expectations of a character or the reader. He wrote (p. 616) that “Only with recourse to events that are supposed to transpire in the reader during the course of understanding a text can the discourse structure of a theory of stories be stated.” LL, of course, is expressly designed to allow referring to changing beliefs of the reader.

Michael (2013) also considered encoding reader expectations in a formal system. He gave an example of encoding the expectation “that the story clarifies at each instance whether it is day or night” (which is not unreasonable for a story told in a visual medium), though the brief outline given of the semantics for his system does not cover how disjunctive expectations like that should be handled. Michael also considers the case of the reader being told additional information about what the author expects them to infer.

We may note that there is also work in the AI subfield of narrative generation that concerns itself with reader expectations. For example, the “Prevoyant” system (Bae and Young 2014) is supposed to generate narratives that are surprising.

Rapaport and Shapiro (1995) presented a computational approach to handling carry-over in story understanding in their SNePS system. Beliefs about reality are copied into a “story world context” and belief revision is used to deal with any conflicts that may arise as the story is read (they also discuss an alternative approach using a “story operator”). Genre conventions are not discussed, and since time for the reader does not play an explicit role, it’s not clear how reader beliefs about the future could be represented or queried.

The ISAAC story understanding system (Moorman and Ram 1994) has a so-called “creative understanding” process that allows for modifying pre-existing concepts in an attempt to understand a story. Moorman and Ram did not relate this to the philosophical literature on carry-over, and further investigation of that would be interesting.

Bonomi and Zucchi (2003) gave an approach to combining carry-over with genre conventions. They consider having two systems of spheres (the worlds in these, unlike our complex worlds, are not split into real and imaginary parts), one centered on B_x, the set of worlds conforming to the “overt beliefs” of the author of x (the fiction in question), and one centered on R_x, the set of worlds following the conventions for x. Fictional truth is determined by what is true in all the closest worlds to B_x from among those worlds that are closest to R_x in which the “directly generated content”³ of x is true. Note this requires which conventions x follows to be already known, while LL allows for reasoning about that.

³ This is not the same as the literal content, as they also consider (in an unformalized way) the narrator and their reliability.

We have not much considered interaction between expectations and carry-over. However, Martínez-Bonati (1983) suggested that the reader expects to quickly find out how realistic the world of the story is (p. 188):

  If I read a few narrative sentences implying a system of reality not different from ordinary life, I will rapidly tend to solidify my expectations into a “realistic” fictional horizon. [...] A similar promptness will be an attribute of the projection of fictional horizons that are traditional and well-known (for example, the fabulous world of speaking animals).

At an intermediate point in reading, the reader’s beliefs about what carries over may be influenced not just by what the author has said about the fictional world, but by what the reader believes the author will say in the rest of the story.

Our formal discourses require temporal information be explicitly encoded, like some other approaches (Diakidoy et al. 2014; Maslan, Roemmele, and Gordon 2015). While this is not like the ordinary use of natural language, it is much less complicated. For logic-based approaches that do try to deal with those kinds of issues, see Episodic Logic (Schubert and Hwang 2000) and Segmented Discourse Representation Theory (Asher and Lascarides 2005).

The form of fictional truth we have formalized is reader-dependent. Whether what a text means should depend on the reader is controversial (Hirst 2008). An objective form might be implemented in a multi-agent version of literary logic, by defining objective fictional truth in terms of the knowledge of an ideal reader which other readers have beliefs about. (LL in this paper is not truly multi-agent, despite the parameterized B_ψ operators, since one reader cannot reason about another’s beliefs without specifying what the latter’s KB is.)

6 Conclusion

We have argued that our logic LL, following on the work of Friedman and Halpern, can represent various forms of knowledge relevant to story understanding, and so be used to determine how a reader with such knowledge would answer questions about a story. We encourage investigating how other formal approaches developed for modelling belief over time could similarly be useful in story understanding.

In future work, we plan to further investigate carry-over and to construct fully worked out examples with complete stories. Also, it should be possible to replace the Mentioned predicate with epistemic constructions. Another point is that LL models the reader as logically omniscient (Hintikka 1975), seeing all consequences of its own beliefs, but it would be interesting to consider resource-bounded readers, as that is what real authors write for.

Acknowledgments

This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).

References

Asher, N., and Lascarides, A. 2005. Logics of Conversation. Studies in natural language processing. Cambridge University Press.

Bae, B.-C., and Young, R. M. 2014. A computational model of narrative generation for surprise arousal. IEEE Transactions on Computational Intelligence and AI in Games 6(2):131–143.

Bonomi, A., and Zucchi, S. 2003. A pragmatic framework for truth in fiction. Dialectica 57(2):103–120.

Brewer, W. F., and Lichtenstein, E. H. 1982. Stories are to entertain: A structural-affect theory of stories. Journal of Pragmatics 6(5–6):473–486.
Chambers, N., and Jurafsky, D. 2009. Unsupervised learning of narrative schemas and their participants. In ACL 2009, 602–610.

Charniak, E., and Goldman, R. 1989. Plan recognition in stories and in life. In UAI 1989, 54–59.

Charniak, E. 1972. Toward a model of children’s story comprehension. MIT AI Laboratory Technical Report 266.

Chaturvedi, S.; Peng, H.; and Roth, D. 2017. Story comprehension for predicting what happens next. In EMNLP 2017, 1604–1615.

Diakidoy, I.-A.; Kakas, A.; Michael, L.; and Miller, R. 2014. Story comprehension through argumentation. In Computational Models of Argument - Proceedings of COMMA 2014, 31–42.

Doležel, L. 1995. Fictional worlds: Density, gaps, and inference. Style 29(2):201–214.

Etherington, D. W.; Kraus, S.; and Perlis, D. 1991. Nonmonotonicity and the scope of reasoning. Artificial Intelligence 52(3):221–261.

Friedman, N., and Halpern, J. Y. 1999. Modeling belief in dynamic systems, part II: Revision and update. Journal of Artificial Intelligence Research (JAIR) 10:117–167.

Grove, A. 1988. Two modellings for theory change. Journal of Philosophical Logic 17(2):157–170.

Heintz, J. 1979. Reference and inference in fiction. Poetics 8(1):85–99.

Hintikka, J. 1975. Impossible possible worlds vindicated. Journal of Philosophical Logic 4(4):475–484.

Hirst, G. 2008. The future of text-meaning in computational linguistics. In Proceedings of the 11th International Conference on Text, Speech and Dialogue (TSD 2008), 3–11.

Hobbs, J. R.; Stickel, M.; and Martin, P. 1993. Interpretation as abduction. Artificial Intelligence 63:69–142.

Hobbs, J. R. 1990. Literature and Cognition. Number 21 in CSLI Lecture Notes. Center for the Study of Language and Information.

Iyyer, M.; Manjunatha, V.; Guha, A.; Vyas, Y.; Boyd-Graber, J.; Daumé III, H.; and Davis, L. 2017. The amazing mysteries of the gutter: Drawing inferences between panels in comic book narratives. In CVPR 2017.

Konolige, K. 1988. Hierarchic autoepistemic theories for nonmonotonic reasoning. In AAAI 1988, 439–443.

Levesque, H. J., and Lakemeyer, G. 2000. The logic of knowledge bases. MIT Press.

Levesque, H. J. 1990. All I know: A study in autoepistemic logic. Artificial Intelligence 42(2-3):263–309.

Lewis, D. 1973. Counterfactuals. Harvard University Press.

Lewis, D. 1978. Truth in fiction. American Philosophical Quarterly 15(1):37–46.

Liberatore, P., and Schaerf, M. 1997. Reducing belief revision to circumscription (and vice versa). Artificial Intelligence 93(1):261–296.

Lichtenstein, O.; Pnueli, A.; and Zuck, L. D. 1985. The glory of the past. In Proceedings of the Conference on Logic of Programs, 196–218.

Lifschitz, V. 1994. Circumscription. In Handbook of Logic in AI and Logic Programming, volume 3. Oxford University Press. 298–352.

Martínez-Bonati, F. 1983. Towards a formal ontology of fictional worlds. Philosophy and Literature 7(2):182–195.

Maslan, N.; Roemmele, M.; and Gordon, A. S. 2015. One hundred challenge problems for logical formalizations of commonsense psychology. In Commonsense 2015, 107–113.

McCarthy, J. 1986. Applications of circumscription to formalizing common-sense knowledge. Artificial Intelligence 28(1):89–116.

McCarthy, J. 1990. An example for natural language understanding and the AI problems it raises. In Lifschitz, V., ed., Formalizing Common Sense: Papers by John McCarthy. Ablex Publishing Corporation. 70–76.

Michael, L. 2013. Story understanding... calculemus! In Commonsense 2013.

Moinard, Y. 2000. Note about cardinality-based circumscription. Artificial Intelligence 119(1):259–273.

Moorman, K., and Ram, A. 1994. A model of creative understanding. In AAAI 1994, 74–79.

Morgenstern, L. 2014. Representing and reasoning about time travel narratives: Foundational concepts. In KR 2014, 642–645.

Mostafazadeh, N.; Chambers, N.; He, X.; Parikh, D.; Batra, D.; Vanderwende, L.; Kohli, P.; and Allen, J. 2016. A corpus and cloze evaluation for deeper understanding of commonsense stories. In NAACL HLT 2016, 839–849.

Quine, W. V. 1968. Propositional objects. Crítica: Revista Hispanoamericana de Filosofía 2(5):3–29.

Rapaport, W. J., and Shapiro, S. C. 1995. Cognition and fiction. In Duchan, J. F.; Bruder, G. A.; and Hewitt, L. E., eds., Deixis in Narrative: A Cognitive Science Perspective. Lawrence Erlbaum Associates, Inc. 107–128.

Richardson, M.; Burges, C. J.; and Renshaw, E. 2013. MCTest: A challenge dataset for the open-domain machine comprehension of text. In EMNLP 2013, 193–203.

Rumelhart, D. 1975. Notes on a schema for stories. In Bobrow, D. G., and Collins, A. M., eds., Representation and Understanding: Studies in Cognitive Science. Academic Press, Inc. 211–236.

Ryan, M.-L. 1991. Possible Worlds, Artificial Intelligence, and Narrative Theory. Indiana University Press.

Schank, R. C., and Abelson, R. P. 1977. Scripts, Plans, Goals, and Understanding: An Inquiry into Human Knowledge Structures. Lawrence Erlbaum Associates, Inc.

Schubert, L. K., and Hwang, C. H. 2000. Episodic Logic meets Little Red Riding Hood: A comprehensive, natural representation for language understanding. In Iwanska, L., and Shapiro, S. C., eds., Natural Language Processing and Knowledge Representation: Language for Knowledge and Knowledge for Language. MIT/AAAI Press. 111–174.

Shapiro, S.; Pagnucco, M.; Lespérance, Y.; and Levesque, H. J. 2011. Iterated belief change in the situation calculus. Artificial Intelligence 175(1):165–192.

Tapaswi, M.; Zhu, Y.; Stiefelhagen, R.; Torralba, A.; Urtasun, R.; and Fidler, S. 2016. MovieQA: Understanding stories in movies through question-answering. In CVPR 2016.

Van Inwagen, P. 1977. Creatures of fiction. American Philosophical Quarterly 14(4):299–308.

Walton, K. L. 1990. Mimesis as Make-Believe: On the Foundations of the Representational Arts. Harvard University Press.

Weisberg, D. S., and Goldstein, J. 2009. What belongs in a fictional world? Journal of Cognition and Culture 9(1):69–78.

Wilensky, R. 1983. Story grammars versus story points. Behavioral and Brain Sciences 6(4):579–623.

Wolterstorff, N. 1976. Worlds of works of art. The Journal of Aesthetics and Art Criticism 35(2):121–132.

Woods, J. 1974. The Logic of Fiction: A Philosophical Sounding of Deviant Logic. Mouton.