Introduction

Extracting Evidence Summaries from Detective Novels

Aditya Motwani Aayush Naik DSAC

IIIT Hyderabad DSAC

IIIT Hyderabad kamal@iiit.ac.in

Literary novels are a mine of information stored in a narrative. Information Extraction (IE) on these novels can be directed by a speci c aim to derive knowledge of interest. In this paper, we explore the problem of extracting evidence summaries for all characters in a detective novel. But in this domain, there is no standard annotated text which provides the ground truth for evidence and there exists no xed metric of evaluation. For extracting summaries, we explore unsupervised methods of learning. We rst assume that all characters are equally likely to be culprits and build evidence against each character. Given this evidence, we assess the likelihood of each character being the culprit, eventually arriving at the true culprit(s). Our contribution is the formulation of evidence summaries for characters in determining the culprit. Our experiments are based on a corpus of detective novels spanning the last hundred years. We evaluate these summaries for comprehension with the help of human readers and broadly categorize novels into three groups.

Introduction

These techniques exploit the underlying semantic structure in a novel using spacy's [HM17] sentence vectors as mathematical representation which attempts to overcome the challenge of manual annotation of sentences.

For the purpose of our experiments we build a corpus of detective novels that primarily revolve around a singular culprit who is involved in a crime like robbery or murder. To incorporate various writing styles, we pick popular authors over the last century. Some of the prominent authors in the corpus that we analyze are Arthur Conan Doyle, Agatha Christie, Stieg Larsson, J.K. Rowling, Dan Brown and Lee Child. The standard approach to solve this problem is to rst extract evidence and then deduce the likely culprit. However, without a concrete de nition of valid evidence, this problem is di cult to address. So, our methods can be viewed as approaching the problem backwards. We extract and evaluate the possible evidence against every character in the novel. We only collect the evidence relating to the crime that was revealed before the culprit is declared. This gives us an evidence map of each character. An analysis of this map by human evaluators gives us the likelihood of the said character being the culprit.

We focus our analysis on the text before the aforementioned juncture. Given the juncture of revelation as a priori information, we de ne what constitutes as valid evidence (see Section 3) using di erent algorithms and metrics. Our rst step is to use trends pertaining to the domain of detective novels to determine the important character (especially the culprit) in novels.

We need to extract the evidence relating to the crime that was revealed before the culprit is declared. Using heuristics de ned in section 3.1 we nd the approximate position of the culprit revealing sentence (de ned in section 3.0.2) in the novel by narrowing down to the relevant paragraph. Since the paragraph contains only a few sentences, most of which are related to the culprit, it is feasible to nd the revealing sentence.

We use our algorithm which re nes text into potential evidence. We then validate our results by employing human readers given two summaries of each character - one with baseline summaries (described in section 3.3) and the other with summaries from our approaches - and comparing them in table 2. From our gathered evidence summaries and our responses from our human readers, we broadly classify the novels into three categories based on the extracted evidence.

In this paper, we talk about related work in section 2. We detail the overview of our solution and some necessary de nitions in section 3. Next, we describe the exact methods and algorithms used in section 3.1 and 3.2. We provide and explain our results in section 4. We nally conclude and talk about future work in section 5. 2

Related Work

Summarization of large text documents is an important task in the NLP domain. However, summarization techniques for texts with a narrative structure are challenging to produce. [LE11] attempted a method for evidence discovery along with visual aid in understanding the characters and structure in detective novel genre. Moen et.al [MPH+16] generate extractive summaries on medical reports using word space models in distributional semantic modelling. Medical reports involves temporal patient history, and unpacks in similar ways to detective novels.

Some of the popular techniques for summarizations happen graph based. Algorithms such as [TDC10] has been attempted on di erent domains using di erent algorithms. Thakkar et al [TDC10] use a similarity metric based on common words in sentences to generate a graph and then use the PageRank algorithm to nd important sentences in the domain of web articles. In our corpus, we observe there is no correlation between PageRank of sentences and their importance with respect to evidence against the culprit. Aker et al [ACG10] use A* search with di erent heuristics for multi-document summarization on an established ground truth. In comparison, our problem does not have an established ground truth or a pre-annotated corpus of sentences for the A* search. 3

Evidence extraction from detective novels

3.0.1

Evidence Any piece of information that increases the con dence of the reader that a particular character is the culprit can be categorised as evidence. In the context of our problem, every sentence in list of sentences of the output of the search algorithm is a potential evidence sentence. In table 1, we show some examples of evidence sentences retrieved by our search algorithm for the novel The ABC Murders [Chr03] by Agatha Christie. Miss Grey here told us that she did not see or speak to any stranger on the day that Sir Carmichael Clarke was murdered.

There was the possibility that the footmarks might have been made by somebody else who happened to have the same kind of studs in his shoes.

According to the police theory, Ralph was wearing another pair of the same kind, and I found out that it was true that he had two pairs. 3.0.2

Culprit Sentence The rst sentence in the novel that makes it unambiguous and lucid who the true culprit is.

Very often, this is an assertion or deduction by the prime detective following which, the detective explains how he/she arrives at that deduction. Or sometimes, it's a confession by the culprit. 3.1

Culprit Frequency Binning We divide the text into buckets and glean information from aggregation of these buckets. The information we are looking for is the location in the novel where the culprit is revealed.

First we look for the approximate location (paragraph) where the culprit is revealed. To achieve this, we use a bucket optimization algorithm. We want to maximize the frequency of occurrence of the culprit in a paragraph. Using the bucketing algorithm, we optimize the size of a paragraph to get the best frequency. We group consecutive sentences into bins of sentences and count the frequency of the culprit in each bin. By varying bin size, we arrive at an optimal paragraph size where there is a sharp increase in the mentions of the culprit. Once we've identi ed the paragraph, our search space has reduced drastically. At the beginning of the culprit Algorithm 1: Culprit Bucket Optimization

Function bucket Sentence[] doc, Integer min bucket size, Integer max bucket size

Integer bucket size = min bucket size Integer optimal bucket size = min bucket size Sentence[] culprit bucket Integer max mentions = 0 while bucket size max bucket size do

Sentence[][] Buckets = emptyset() Sentence[] bucket = Buckets[bucket size*i to bucket size*(i+1)] : for i in range(number of buckets) Divide the list of sentences into buckets of size bucket size Count the number of times the culprit is mentioned in each bucket Let mentions and bucket hold the values for the most mentions among all buckets and the bucket with most mentions respectively if mentions > max mentions then max mentions = mentions optimal bucket size = bucket size culprit bucket = bucket end bucket size = bucket size +1 end return culprit bucket, optimal bucket size end paragraph, we add an asserting sentence which implicates the culprit. But such a sentence would not exist for a character who is not the culprit. For example, in the Murder of Roger Ackroyd by Agatha Cristie, we add the sentence \Ursala Borne killed Roger Ackroyd." as Ursala Borne was not the culprit. This helps us maintain uniformity of the culprit sentence. That is, even if the text lacks an explicit implication of the culprit or any other character, we can add such an implication at the beginning. This is then used to extract evidence against all the characters. Additionally, this maintains consistency of the culprit sentence across novels. If no evidence against a particular character is found, then, we can state that that character was not the culprit with high certainty. The search algorithm is a higher-order function that takes in a strategy function along with a starting sentence and a goal sentence. The strategy de nes how and which sentences are picked. The aim is to nd an optimal path, starting at the rst sentence of the novel and ending at the culprit-revealing sentence.

Our Augmented A* algorithm in two parts - a general graph search algorithm [Algorithm (2)] that takes in a strategy as input and the augmented A* strategy [Algorithm (3)]. We use a variant of the A*- search (with additional hyperparameters) as the strategy function input to the higher-order graph search function.

The nal output of the graph search is a list of sentences. This list of sentences is analogous to an extractive evidence summary. An example is shown in table 3.

Algorithm 2: Graph Search

Function search Sentence[] doc, Sentence start, Sentence goal, Strategy strategy Sentence current = start Set(Sentence) frontier = ; Set(Sentence) visited = ; while current != goal do yield current visited = visited [ fcurrentg Set(Sentence) new nodes = Set(neighbors(current)) - visited - frontier frontier = frontier [ new nodes next node, dist from start = strategy(doc, current, frontier, goal, dist from start) current = next node frontier = frontier - fcurrentg end end

Algorithm 3: A* Strategy

Strategy astar Sentence[] doc, Sentence current, Set(Sentence) frontier, Sentence goal, Map distance from start, Float , Float , Function goal heuristic, Function distance if distance from start is empty then distance from start = sent : 1 forall sent in doc distance from start[current] = 0 end forall Sentence neighbor in neighbors(current) do dist from start[neighbor] = min(dist from start[neighbor], dist from start[current] + distance(current, neighbor)) end Sentence next node = argminnodefnode j f (node) = minnode0 :dist from start[x]+ :goal heuristic(goal, x)g return next node, dist from start end 3.3

Baseline We use the graph based summarization method Lexrank [ER04] as a baseline for creating evidence based summaries. Lexrank works by building a graph representation of text. It then computes sentence importance based on the eigenvector centrality and cosine similarity in the graph representation of sentences. We then compute a connectivity matrix based on intra-sentence cosine similarity. We nd the closest sentences to the culprit sentence and create an extractive summary of the character. The baseline output is a generic character summary of the characters, not explicitly focusing on evidence. Evidence sentences are obfuscated in the narrative of a novel. There is no standard metric of evaluation or a baseline as such. We present a best-e ort scenario by comparing our methods with the baseline and evaluating with human readers.

We evaluated our evidence summaries with 10 human evaluators who have read all the books or read the long summaries on Wikipedia and Sparknotes. The long summaries were a good representation of the story as veri ed by the authors of this paper. Initially, the evaluators are presented with the summaries from our baseline method and then they are presented with evidence summaries from our Augmented AStar. The participants were asked to report if the evidence summaries presented to them for di erent characters in the novel assisted them in identifying the culprit in the novel. The responses of the participants were broadly categorized into three classes - \The evidence summaries were useful in nding the right culprit (CI)",\The evidence summaries are misleading and direct the crime to a non-culprit character (N CI)" and \The evidence summaries are not informative in labelling any character as culprit (N A)".

From table 2, we see that the Augmented A* Search has a higher CI percentage across novels, as compared to the rst group. The highest CI score for the baseline lexrank is 30%. That is, 3/10 readers found the baseline summary useful in nding the culprit in the best case scenario of Eye of the Needle. In contrast, the second group presented with our Augmented A* reported a highest with 8/10 readers reporting CI (helpful in culprit identi cation) for Red Dragon.

We also categorize the novels into three types based on the rules detailed by vans et al [VD15] on writing detective stories and our intuition into - hard, elusive, and trivial. Hard novels are the ones in which most of the twenty rules given by [VD15] are violated and/or simply not written for reader deduction. Some of these rules include \The reader should have the same opportunity as the detective to solve the crime". \The villain must be found by logical deduction, not luck, accident, or un-motivated confessions". \The villain has to be someone who plays a prominent part of the story".

This makes it impossible for the reader to deduce who the culprit is. Elusive novels follow most of the rules, and thus it is possible for a keen reader to deduce the culprit. Trivial novels may or may not follow the rules, but they just fail at hiding the culprit properly or don't even aim to do so in the rst place.

From the response of our readers presented with our Augmented A* summaries we draw the following inferences 1. Novels with a high percentage of readers who fall in N A, can be said to provide negligible information about the culprit before revelation - hard. 2. Novels with high percentage of readers who fall in CI are the ones which make culprit guessing easy and do not have su cient obfuscation - trivial. 3. Novels with high percentage of readers who fall in N CI seem to provide only subtle obfuscated information about the culprit - elusive.

In the trivial category, the highest CI percentage is 80% for Eye of the Needle and Red Dragon. In Red Dragon, the author Thomas Harris, aims to show the culprit in a gruesome fashion and how the culprit Francis Dolarhyde eludes the detective. The author does not aim to hide the identity of the culprit.

Stories that falls into the hard category typically surprise the reader by hiding information. In Hound of Baskerville and The Adventure of the Blue Carbuncle, Sherlock Holmes withholds critical information until revealing the culprit, but it leaves the reader in awe of the detective who could draw such amazing deductions. This makes guessing who the culprit very di cult.

In the elusive category, the highest percentage of readers in N CI is 70% indicating a certain degree of obfuscation of evidence using narrative tricks. This is done by making it hard to sift out the evidence using certain crafty tricks like addition of phony clues, misdirection by adding spurious culprits and mention of seemingly irrelevant information which turns out to be key in the near future. An astute reader can always identify the culprit in advance. Summaries for Agatha Christie novels contain lines which readers overlook but are found to be of importance. For example, in The Murder of Roger Ackroyd [Chr26], since the reader does not expect the Adventure of Blue Carbuncle Hound of Baskerville A Study in Scarlet The Sign of the Four The ABC Murders The Murder of Roger Ackroyd AMnudrdTerheinn MTheesroepoWtaemreiaNone The Mysterious Affair at Styles The Girl with the Dragon Tattoo The Maltese Falcon The Moonstone Eye of the Needle Shock for the Secret Seven Red Dragon Killing Floor Angels and Demons The Lost Symbol In the Woods The Cuckoo's Calling Arthur Conan Doyle Arthur Conan Doyle Arthur Conan Doyle Arthur Conan Doyle Agatha Christie Agatha Christie Agatha Christie Agatha Christie Agatha Christie Stieg Larsson Dashiell Hammett Wilkie Collins Ken Follett Enid Blyton Thomas Harris Lee Child Dan Brown Dan Brown Tana French J. K. Rowling (NA) 80 90 100 100 20 10 30 50 20 10 60 60 10 60 10 50 50 50 60 30 hard hard hard hard elusive elusive elusive hard elusive elusive hard hard trivial hard trivial hard hard trivial hard elusive narrator to be the culprit, many clues are left unnoticed; and in Murder on the Orient Express 12 out of the 13 suspects are the culprits something a reader would not expect apriori. 5

Conclusion and Future Work

In this paper, we explore the problem of extracting evidence summaries for characters in a novel. As there is no existing annotated data, we explore an unsupervised method of learning. We use an augmented A* algorithm which takes in an implicit graph representation and nds an optimal path to the culprit reveal sentence. The results show that the evidence summaries of the culprit contain much more information than the summaries generated for characters who are innocent of the crime. Furthermore, we evaluate our evidence summaries by presenting them to human readers and classify the novels into three categories based on the ease in which they were able to identify the culprit.

As future work, we would like to explore abstractive summarization as a technique to generate coherent evidence summaries. An interesting application would be in the domain of voice-activated agents, which would provide culprit narratives for humans. This agent would have the capability to extract facts from text and discuss detective novels with humans. Such an application could be realized on a voice-activated agent like Amazon Echo or Google Assistant. [ACG10]

A. Christie. The ABC Murders: A Hercule Poirot Mystery. Hercule Poirot Mysteries. HarperCollins, 2003.

Gunes Erkan and Dragomir R Radev. Lexrank: Graph-based lexical centrality as salience in text summarization. Journal of arti cial intelligence research, 22:457{479, 2004.

Matthew Honnibal and Ines Montani. spacy 2: Natural language understanding with bloom embeddings, convolutional neural networks and incremental parsing. To appear, 2017.

A.L. Louis and A.P. Engelbrecht. Unsupervised discovery of relations for analysis of textual data.

Digital Investigation, 7(3):154 { 171, 2011. [TDC10] [VD15]

S.S. Van Dine. Twenty Rules For Writing Detective Stories. Booklassic, 2015. 6

Appendix

A newspaper might not print the rst letter, but by the time the second crime took place, A.B.C. could have been assured of all the publicity the press could give.

The great idea of such a killer is to hide his tracks - not to advertise them When we consider the four victims selected - or at any rate three of them (for I know very little of Mr. Downes or Mr. Earls eld), we realize that if he had chosen, the murderer could have done away with them without incurring any suspicion.

I may say that his description, as given me by Miss Grey, did not quite correspond with my own picture of the man who strangled Betty Barnard "I will pass over the next stages quickly. A fourth murder was committed - the murder of a man named George Earls eld - it was supposed in mistake for a man named Downes, who was something of the same build and who was sitting near him in the cinema. " \Supposing, my friends, that while Cust committed three of the crimes - the A, C and D crimes he did not commit the B crime." \Up to the time of the Barnard murder, no facts about the A.B.C. murders had been made public. It therefore followed that whoever killed Betty Barnard must have had access to facts known only to certain persons - myself, the police, and certain relations and neighbours of Mrs Ascher. \I had now only to review the various crimes and nd the possible guilty person. Also I learned that he had his holiday early in August, which rendered it unlikely that he had anything to do with the Churston crime.

The man I had known a long time in my secret mind was the same as the man whom I had known as a person.

[MPH+16] Hans

Moen

, Laura-Maria

Peltonen

, Juho Heimonen, Antti Airola, Tapio Pahikkala, Tapio Salakoski, and Sanna Salantera. Comparison of automatic summarisation methods for clinical free text notes . Arti cial intelligence in medicine , 67 : 25 { 37 , 2016 .

Khushboo S Thakkar , Rajiv V Dharaskar, and

Chandak . Graph-based algorithms for text summarization . In Emerging Trends in Engineering and Technology (ICETET) , 2010 3rd International Conference on, pages 516 { 519 . IEEE, 2010 .