Extracting Evidence Summaries from Detective Novels

                       Aditya Motwani                          Aayush Naik
                    DSAC, IIIT Hyderabad                   DSAC, IIIT Hyderabad
               aditya.motwani@research.iiit.ac.in      aayush.naik@students.iiit.ac.in
                                     Kamalakar Karlapalem
                                     DSAC, IIIT Hyderabad
                                          kamal@iiit.ac.in


                                                        Abstract
                       Literary novels are a mine of information stored in a narrative. Infor-
                       mation Extraction (IE) on these novels can be directed by a specific aim
                       to derive knowledge of interest. In this paper, we explore the problem
                       of extracting evidence summaries for all characters in a detective novel.
                       But in this domain, there is no standard annotated text which provides
                       the ground truth for evidence and there exists no fixed metric of evalu-
                       ation. For extracting summaries, we explore unsupervised methods of
                       learning. We first assume that all characters are equally likely to be
                       culprits and build evidence against each character. Given this evidence,
                       we assess the likelihood of each character being the culprit, eventually
                       arriving at the true culprit(s). Our contribution is the formulation of
                       evidence summaries for characters in determining the culprit. Our ex-
                       periments are based on a corpus of detective novels spanning the last
                       hundred years. We evaluate these summaries for comprehension with
                       the help of human readers and broadly categorize novels into three
                       groups.


1    Introduction
Detective Novels are a mine of information embedded in a narrative. In this paper, we explore the problem
of extracting evidence summaries from detective novels. A big difficulty and novelty of this problem can be
attributed to the widely varying writing style of different authors. This makes it very difficult for standard
Information Extraction (IE) based techniques to extract important sentences in novels across multiple books
and authors. A standard approach for this task would be to train state-of-the-art deep networks and learn what
constitutes as valid evidence. To train deep networks for such information extraction, we require a large training
corpus of annotated data. But in this domain of detective novels, there is no existing annotated data, and there
is no fixed metric of evaluation. Moreover, making such a corpus is not as simple as just labeling a sentence as
clue or not a clue by just a cursory glance at the sentence. One needs to have the relevant context from the
novel and a good understanding of what piece of information could be considered as a valid clue. Therefore it is
difficult to extract data in a supervised way. Therefore, our methods primarily focus on unsupervised techniques.

Copyright c 2019 for the individual papers by the paper’s authors. Copying permitted for private and academic purposes. This
volume is published and copyrighted by its editors.
In: A. Jorge, R. Campos, A. Jatowt, S. Bhatia (eds.): Proceedings of the Text2StoryIR’19 Workshop, Cologne, Germany, 14-April-
2019, published at http://ceur-ws.org
These techniques exploit the underlying semantic structure in a novel using spacy’s [HM17] sentence vectors as
mathematical representation which attempts to overcome the challenge of manual annotation of sentences.
   For the purpose of our experiments we build a corpus of detective novels that primarily revolve around a
singular culprit who is involved in a crime like robbery or murder. To incorporate various writing styles, we pick
popular authors over the last century. Some of the prominent authors in the corpus that we analyze are Arthur
Conan Doyle, Agatha Christie, Stieg Larsson, J.K. Rowling, Dan Brown and Lee Child. The standard approach
to solve this problem is to first extract evidence and then deduce the likely culprit. However, without a concrete
definition of valid evidence, this problem is difficult to address. So, our methods can be viewed as approaching
the problem backwards. We extract and evaluate the possible evidence against every character in the novel. We
only collect the evidence relating to the crime that was revealed before the culprit is declared. This gives us an
evidence map of each character. An analysis of this map by human evaluators gives us the likelihood of the said
character being the culprit.
   We focus our analysis on the text before the aforementioned juncture. Given the juncture of revelation as a
priori information, we define what constitutes as valid evidence (see Section 3) using different algorithms and
metrics. Our first step is to use trends pertaining to the domain of detective novels to determine the important
character (especially the culprit) in novels.
   We need to extract the evidence relating to the crime that was revealed before the culprit is declared. Using
heuristics defined in section 3.1 we find the approximate position of the culprit revealing sentence (defined in
section 3.0.2) in the novel by narrowing down to the relevant paragraph. Since the paragraph contains only a
few sentences, most of which are related to the culprit, it is feasible to find the revealing sentence.
   We use our algorithm which refines text into potential evidence. We then validate our results by employing
human readers given two summaries of each character - one with baseline summaries (described in section 3.3)
and the other with summaries from our approaches - and comparing them in table 2. From our gathered evidence
summaries and our responses from our human readers, we broadly classify the novels into three categories based
on the extracted evidence.
   In this paper, we talk about related work in section 2. We detail the overview of our solution and some
necessary definitions in section 3. Next, we describe the exact methods and algorithms used in section 3.1 and
3.2. We provide and explain our results in section 4. We finally conclude and talk about future work in section
5.


2   Related Work
Summarization of large text documents is an important task in the NLP domain. However, summarization
techniques for texts with a narrative structure are challenging to produce. [LE11] attempted a method for
evidence discovery along with visual aid in understanding the characters and structure in detective novel genre.
Moen et.al [MPH+ 16] generate extractive summaries on medical reports using word space models in distributional
semantic modelling. Medical reports involves temporal patient history, and unpacks in similar ways to detective
novels.
   Some of the popular techniques for summarizations happen graph based. Algorithms such as [TDC10] has
been attempted on different domains using different algorithms. Thakkar et al [TDC10] use a similarity metric
based on common words in sentences to generate a graph and then use the PageRank algorithm to find important
sentences in the domain of web articles. In our corpus, we observe there is no correlation between PageRank of
sentences and their importance with respect to evidence against the culprit. Aker et al [ACG10] use A* search
with different heuristics for multi-document summarization on an established ground truth. In comparison, our
problem does not have an established ground truth or a pre-annotated corpus of sentences for the A* search.


3   Evidence extraction from detective novels
3.0.1   Evidence

Any piece of information that increases the confidence of the reader that a particular character is the culprit can
be categorised as evidence. In the context of our problem, every sentence in list of sentences of the output of
the search algorithm is a potential evidence sentence. In table 1, we show some examples of evidence sentences
retrieved by our search algorithm for the novel The ABC Murders [Chr03] by Agatha Christie.
                                    Table 1: Clues from The ABC Murders


                 Miss Grey here told us that she did not see or speak to any stranger on the
                 day that Sir Carmichael Clarke was murdered.
                 There was the possibility that the footmarks might have been made by some-
                 body else who happened to have the same kind of studs in his shoes.
                 According to the police theory, Ralph was wearing another pair of the same
                 kind, and I found out that it was true that he had two pairs.


3.0.2    Culprit Sentence
The first sentence in the novel that makes it unambiguous and lucid who the true culprit is.
  Very often, this is an assertion or deduction by the prime detective following which, the detective explains
how he/she arrives at that deduction. Or sometimes, it’s a confession by the culprit.

3.1     Culprit Frequency Binning
We divide the text into buckets and glean information from aggregation of these buckets. The information we
are looking for is the location in the novel where the culprit is revealed.
   First we look for the approximate location (paragraph) where the culprit is revealed. To achieve this, we
use a bucket optimization algorithm. We want to maximize the frequency of occurrence of the culprit in a
paragraph. Using the bucketing algorithm, we optimize the size of a paragraph to get the best frequency. We
group consecutive sentences into bins of sentences and count the frequency of the culprit in each bin. By varying
bin size, we arrive at an optimal paragraph size where there is a sharp increase in the mentions of the culprit.
Once we’ve identified the paragraph, our search space has reduced drastically. At the beginning of the culprit

 Algorithm 1: Culprit Bucket Optimization
  Function bucket Sentence[] doc, Integer min bucket size, Integer max bucket size
     Integer bucket size = min bucket size
     Integer optimal bucket size = min bucket size
     Sentence[] culprit bucket
     Integer max mentions = 0
     while bucket size ≤ max bucket size do
         Sentence[][] Buckets = emptyset()
         Sentence[] bucket = Buckets[bucket size*i to bucket size*(i+1)] : for i in range(number of buckets)
         Divide the list of sentences into buckets of size bucket size
         Count the number of times the culprit is mentioned in each bucket
         Let mentions and bucket hold the values for the most mentions among all buckets and the bucket
           with most mentions respectively
         if mentions > max mentions then
             max mentions = mentions
             optimal bucket size = bucket size
             culprit bucket = bucket
         end
         bucket size = bucket size +1
     end
     return culprit bucket, optimal bucket size
  end

paragraph, we add an asserting sentence which implicates the culprit. But such a sentence would not exist for
a character who is not the culprit. For example, in the Murder of Roger Ackroyd by Agatha Cristie, we add
the sentence “Ursala Borne killed Roger Ackroyd.” as Ursala Borne was not the culprit. This helps us maintain
uniformity of the culprit sentence. That is, even if the text lacks an explicit implication of the culprit or any
other character, we can add such an implication at the beginning. This is then used to extract evidence against
all the characters. Additionally, this maintains consistency of the culprit sentence across novels. If no evidence
against a particular character is found, then, we can state that that character was not the culprit with high
certainty.

3.2   The Search Algorithm - Augmented A* Search
The search algorithm is a higher-order function that takes in a strategy function along with a starting sentence
and a goal sentence. The strategy defines how and which sentences are picked. The aim is to find an optimal
path, starting at the first sentence of the novel and ending at the culprit-revealing sentence.
   Our Augmented A* algorithm in two parts - a general graph search algorithm [Algorithm (2)] that takes in
a strategy as input and the augmented A* strategy [Algorithm (3)]. We use a variant of the A*- search (with
additional hyperparameters) as the strategy function input to the higher-order graph search function.
   The final output of the graph search is a list of sentences. This list of sentences is analogous to an extractive
evidence summary. An example is shown in table 3.

 Algorithm 2: Graph Search
  Function search Sentence[] doc, Sentence start, Sentence goal, Strategy strategy
     Sentence current = start
     Set(Sentence) frontier = ∅
     Set(Sentence) visited = ∅
     while current != goal do
        yield current
        visited = visited ∪ {current}
        Set(Sentence) new nodes = Set(neighbors(current)) - visited - frontier
        frontier = frontier ∪ new nodes
        next node, dist from start = strategy(doc, current, frontier, goal, dist from start)
        current = next node
        frontier = frontier - {current}
     end
  end


 Algorithm 3: A* Strategy
  Strategy astar Sentence[] doc, Sentence current, Set(Sentence) frontier, Sentence goal, Map
   distance from start, Float α, Float β, Function goal heuristic, Function distance
      if distance from start is empty then
          distance from start = sent : ∞ forall sent in doc
          distance from start[current] = 0
      end
      forall Sentence neighbor in neighbors(current) do
          dist from start[neighbor] = min(dist from start[neighbor], dist from start[current] +
            distance(current, neighbor))
      end
      Sentence next node = argminnode {node | f (node) = minnode0 α.dist from start[x]+β.goal heuristic(goal,
        x)}
      return next node, dist from start
  end


3.3   Baseline
We use the graph based summarization method Lexrank [ER04] as a baseline for creating evidence based sum-
maries. Lexrank works by building a graph representation of text. It then computes sentence importance based
on the eigenvector centrality and cosine similarity in the graph representation of sentences. We then compute
a connectivity matrix based on intra-sentence cosine similarity. We find the closest sentences to the culprit
sentence and create an extractive summary of the character. The baseline output is a generic character summary
of the characters, not explicitly focusing on evidence.
4      Results
Evidence sentences are obfuscated in the narrative of a novel. There is no standard metric of evaluation or a
baseline as such. We present a best-effort scenario by comparing our methods with the baseline and evaluating
with human readers.
   We evaluated our evidence summaries with 10 human evaluators who have read all the books or read the
long summaries on Wikipedia and Sparknotes. The long summaries were a good representation of the story as
verified by the authors of this paper. Initially, the evaluators are presented with the summaries from our baseline
method and then they are presented with evidence summaries from our Augmented AStar. The participants
were asked to report if the evidence summaries presented to them for different characters in the novel assisted
them in identifying the culprit in the novel. The responses of the participants were broadly categorized into
three classes - “The evidence summaries were useful in finding the right culprit (CI)”,“The evidence summaries
are misleading and direct the crime to a non-culprit character (N CI)” and “The evidence summaries are not
informative in labelling any character as culprit (N A)”.
   From table 2, we see that the Augmented A* Search has a higher CI percentage across novels, as compared
to the first group. The highest CI score for the baseline lexrank is 30%. That is, 3/10 readers found the baseline
summary useful in finding the culprit in the best case scenario of Eye of the Needle. In contrast, the second
group presented with our Augmented A* reported a highest with 8/10 readers reporting CI (helpful in culprit
identification) for Red Dragon.
   We also categorize the novels into three types based on the rules detailed by vans et al [VD15] on writing
detective stories and our intuition into - hard, elusive, and trivial. Hard novels are the ones in which most of the
twenty rules given by [VD15] are violated and/or simply not written for reader deduction. Some of these rules
include

    • “The reader should have the same opportunity as the detective to solve the crime”.
    • “The villain must be found by logical deduction, not luck, accident, or un-motivated confessions”.
    • “The villain has to be someone who plays a prominent part of the story”.

  This makes it impossible for the reader to deduce who the culprit is. Elusive novels follow most of the rules,
and thus it is possible for a keen reader to deduce the culprit. Trivial novels may or may not follow the rules,
but they just fail at hiding the culprit properly or don’t even aim to do so in the first place.
  From the response of our readers presented with our Augmented A* summaries we draw the following inferences
-

    1. Novels with a high percentage of readers who fall in N A, can be said to provide negligible information about
       the culprit before revelation - hard.
    2. Novels with high percentage of readers who fall in CI are the ones which make culprit guessing easy and do
       not have sufficient obfuscation - trivial.
    3. Novels with high percentage of readers who fall in N CI seem to provide only subtle obfuscated information
       about the culprit - elusive.

   In the trivial category, the highest CI percentage is 80% for Eye of the Needle and Red Dragon. In Red
Dragon, the author Thomas Harris, aims to show the culprit in a gruesome fashion and how the culprit Francis
Dolarhyde eludes the detective. The author does not aim to hide the identity of the culprit.
   Stories that falls into the hard category typically surprise the reader by hiding information. In Hound of
Baskerville and The Adventure of the Blue Carbuncle, Sherlock Holmes withholds critical information until
revealing the culprit, but it leaves the reader in awe of the detective who could draw such amazing deductions.
This makes guessing who the culprit very difficult.
   In the elusive category, the highest percentage of readers in N CI is 70% indicating a certain degree of
obfuscation of evidence using narrative tricks. This is done by making it hard to sift out the evidence using certain
crafty tricks like addition of phony clues, misdirection by adding spurious culprits and mention of seemingly
irrelevant information which turns out to be key in the near future. An astute reader can always identify the
culprit in advance. Summaries for Agatha Christie novels contain lines which readers overlook but are found to
be of importance. For example, in The Murder of Roger Ackroyd [Chr26], since the reader does not expect the
 Novel Title                       Author(s)                              Percentage Score                     Revelation Category
                                                            Baseline Lexrank               A* Search
                                                        (CI)    (N CI)     (N A)    (CI)    (N CI)     (N A)
 Adventure of Blue Carbuncle       Arthur Conan Doyle     0        10        90      10       10         80    hard
 Hound of Baskerville              Arthur Conan Doyle     0        10        90       0       10         90    hard
 A Study in Scarlet                Arthur Conan Doyle     0         0       100       0        0        100    hard
 The Sign of the Four              Arthur Conan Doyle     0         0       100       0        0        100    hard
 The ABC Murders                   Agatha Christie       10        10        80      20       60         20    elusive
 The Murder of Roger Ackroyd       Agatha Christie        0        10        90      20       70         10    elusive
 Murder in Mesopotamia             Agatha Christie       10        10        80      10       60         30    elusive
 And Then There Were None          Agatha Christie       10        20        70      10       40         50    hard
 The Mysterious Affair at Styles   Agatha Christie        0        20        80      30       50         20    elusive
 The Girl with the Dragon Tattoo   Stieg Larsson         20         0        80      40       50         10    elusive
 The Maltese Falcon                Dashiell Hammett      10        10        80      20       20         60    hard
 The Moonstone                     Wilkie Collins        10        10        80      20       20         60    hard
 Eye of the Needle                 Ken Follett           30        10        60      80       10         10    trivial
 Shock for the Secret Seven        Enid Blyton           10        10        80      30       10         60    hard
 Red Dragon                        Thomas Harris         20        10        70      80       10         10    trivial
 Killing Floor                     Lee Child             10        10        80      20       30         50    hard
 Angels and Demons                 Dan Brown             10        20        70      30       20         50    hard
 The Lost Symbol                   Dan Brown             10        20        70      30       20         50    trivial
 In the Woods                      Tana French           10        10        80      20       20         60    hard
 The Cuckoo’s Calling              J. K. Rowling         10         0        80      10       60         30    elusive


Table 2: Comparison of baseline vs our solution over 20 novels. CI is the percentage of readers who found the
summaries useful in finding the right culprit. N CI is the percentage of readers who found the summaries as
leading to a non-culprit character. N A is the percentage of readers who found the summary as not informative
in labelling any character as the culprit.

narrator to be the culprit, many clues are left unnoticed; and in Murder on the Orient Express 12 out of the 13
suspects are the culprits − something a reader would not expect apriori.

5    Conclusion and Future Work
In this paper, we explore the problem of extracting evidence summaries for characters in a novel. As there is no
existing annotated data, we explore an unsupervised method of learning. We use an augmented A* algorithm
which takes in an implicit graph representation and finds an optimal path to the culprit reveal sentence. The
results show that the evidence summaries of the culprit contain much more information than the summaries
generated for characters who are innocent of the crime. Furthermore, we evaluate our evidence summaries by
presenting them to human readers and classify the novels into three categories based on the ease in which they
were able to identify the culprit.
   As future work, we would like to explore abstractive summarization as a technique to generate coherent
evidence summaries. An interesting application would be in the domain of voice-activated agents, which would
provide culprit narratives for humans. This agent would have the capability to extract facts from text and discuss
detective novels with humans. Such an application could be realized on a voice-activated agent like Amazon
Echo or Google Assistant.

References
[ACG10]        Ahmet Aker, Trevor Cohn, and Robert Gaizauskas. Multi-document summarization using a* search
               and discriminative training. In Proceedings of the 2010 conference on empirical methods in natural
               language processing, pages 482–491. Association for Computational Linguistics, 2010.

[Chr26]        Agatha Christie. The murder of roger ackroyd. Collins, 1926.

[Chr03]        A. Christie. The ABC Murders: A Hercule Poirot Mystery. Hercule Poirot Mysteries. HarperCollins,
               2003.

[ER04]         Günes Erkan and Dragomir R Radev. Lexrank: Graph-based lexical centrality as salience in text
               summarization. Journal of artificial intelligence research, 22:457–479, 2004.

[HM17]         Matthew Honnibal and Ines Montani. spacy 2: Natural language understanding with bloom embed-
               dings, convolutional neural networks and incremental parsing. To appear, 2017.

[LE11]         A.L. Louis and A.P. Engelbrecht. Unsupervised discovery of relations for analysis of textual data.
               Digital Investigation, 7(3):154 – 171, 2011.
[MPH+ 16] Hans Moen, Laura-Maria Peltonen, Juho Heimonen, Antti Airola, Tapio Pahikkala, Tapio Salakoski,
          and Sanna Salanterä. Comparison of automatic summarisation methods for clinical free text notes.
          Artificial intelligence in medicine, 67:25–37, 2016.

[TDC10]    Khushboo S Thakkar, Rajiv V Dharaskar, and MB Chandak. Graph-based algorithms for text sum-
           marization. In Emerging Trends in Engineering and Technology (ICETET), 2010 3rd International
           Conference on, pages 516–519. IEEE, 2010.
[VD15]     S.S. Van Dine. Twenty Rules For Writing Detective Stories. Booklassic, 2015.

6   Appendix
                   Table 3: Evidence Summary of the A.B.C murders [Chr03] (last 10 lines)

          A newspaper might not print the first letter, but by the time the second crime took place, A.B.C.
          could have been assured of all the publicity the press could give.
          The great idea of such a killer is to hide his tracks - not to advertise them When we consider the
          four victims selected - or at any rate three of them (for I know very little of Mr. Downes or Mr.
          Earlsfield), we realize that if he had chosen, the murderer could have done away with them without
          incurring any suspicion.
          I may say that his description, as given me by Miss Grey, did not quite correspond with my own
          picture of the man who strangled Betty Barnard ”I will pass over the next stages quickly.
          A fourth murder was committed - the murder of a man named George Earlsfield - it was supposed
          in mistake for a man named Downes, who was something of the same build and who was sitting
          near him in the cinema. ”
          “Supposing, my friends, that while Cust committed three of the crimes - the A, C and D crimes -
          he did not commit the B crime.”
          “Up to the time of the Barnard murder, no facts about the A.B.C. murders had been made public.
          It therefore followed that whoever killed Betty Barnard must have had access to facts known only
          to certain persons - myself, the police, and certain relations and neighbours of Mrs Ascher.
          “I had now only to review the various crimes and find the possible guilty person.
          Also I learned that he had his holiday early in August, which rendered it unlikely that he had
          anything to do with the Churston crime.
          The man I had known a long time in my secret mind was the same as the man whom I had known
          as a person.