Argumentation-driven information extraction for online crime reports

Marijn Schraagen (Information and Computing Sciences, Utrecht University, M.P.Schraagen@uu.nl)
Daphne Odekerken (Information and Computing Sciences, Utrecht University, D.Odekerken@uu.nl)
Bas Testerink (Dutch National Police, Bas.Testerink@politie.nl)
Floris Bex (Information and Computing Sciences, Utrecht University; Institute for Law, Technology and Society, Tilburg University, F.J.Bex@uu.nl)

Abstract

A new system is currently being developed to assist the Dutch National Police in the assessment of crime reports submitted by civilians. This system uses Natural Language Processing techniques to extract observations from text. These observations are used in a formal reasoning system to construct arguments supporting conclusions based on the extracted observations, and possibly to ask the complainant who files the report extra questions during the intake process. The aim is to develop a dynamic question-asking system which automatically learns effective and user-friendly strategies. The proposed approach is planned to be integrated in the daily workflow at the Dutch National Police, in order to provide increased efficiency and transparency in the processing of crime reports.

Keywords: Argumentation, Information Extraction, Relation Extraction

Copyright © CIKM 2018 for the individual papers by the papers' authors. Copyright © CIKM 2018 for the volume as a collection by its editors. This volume and its papers are published under the Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

The ideas presented in this paper are part of a collaborative initiative of the Dutch National Police and Utrecht University for developing a framework for (semi-)autonomous business processes in the police organization, using technologies from text and data analytics together with computational argumentation and dialog. One project under the umbrella of this initiative concerns technologies to improve the intake of criminal reports submitted by civilians on the topic of online trade fraud, which concerns cases such as fake webshops and malicious second-hand traders on trading platforms (e.g., eBay). Around 40,000 reports are filed each year, and the legal background for trade fraud is a single article of the Dutch Criminal Code (art. 326) and a relatively small set of cases that are used as legal precedents. This high volume and the relative simplicity of such cases makes them ideal for further automated processing.

For the case of online trade fraud, the Dutch police currently collects online-submitted crime reports using a web interface which requires citizens to fill out several predefined fields (such as the name of the counterparty, bank account number, etc.) as well as a free text description of the situation. Using this information the police decides to either (a) discard the report because it does not concern trade fraud, (b) accept the report and include it in the police database for further processing, or (c) ask follow-up questions (by e-mail) to the complainant in case more information is needed. In the current situation, human analysts have to read through all incoming reports and decide on either (a), (b) or (c). To improve the efficiency of this assessment, we aim to develop a system that automatically determines the appropriate course of action given a report.

One way of handling (possible) trade fraud reports is to train an algorithm to automatically determine which action to take given a complete incoming report.
This was explored in previous research [KSBB17, va18], where classifiers were trained to classify reports as being of class (a - discard report) or of class (b - accept report), based on the elements of the fraud report (address of suspect, trade site that was used, shallow linguistic features). Given that the data is highly skewed – only 16% of the incoming reports is normally discarded by human analysts – the results are promising, with an F1-score of 67.5% for class discard, 95.2% for class accept and a macro-average F1-score of 80.8%.

One important issue with the above solution is that for a machine learning classifier it cannot be explained satisfactorily why a complaint was discarded or accepted. For example, one important feature that is used as input for the final classifier FC is the output of another classifier WC trained on the (lemmatized) words of the free text field. The explanation of FC's decision to accept a report is then, for instance, that the classifier WC gives a probability of 0.8 to accept, based on the occurrence of certain words (such as "never" and "tickets") in the report text. In a legal or law enforcement application, however, we need transparent explanations that make sense from a legal and common-sense perspective, not explanations that are based on certain patterns in the data. For example, we want to know that the complainant who filed the report bought tickets from the (suspect) counterparty, but that these tickets were never delivered.

In order to automatically assess trade fraud reports submitted online, we turn to a combination of symbolic, argument-based reasoning about a case (similar to [PS96]) and non-symbolic information extraction techniques that use machine learning. These extraction techniques are intended to find basic observations such as "this report concerns a ticket for a music concert", "money was paid by the complainant to the counterparty" and "nothing was delivered to the complainant", and to use these observations as premises in legal arguments to infer that, for example, the report concerns a possible case of fraud and should therefore be considered for further processing. Thus, the non-symbolic algorithms are fine-grained: the basic observations are close to sentences in the original report texts, so their occurrence can be explained by exactly those sentences, and more complex conclusions based on multiple factors in a case can be checked by means of the argumentation.

In the rest of this paper, we discuss the concepts of our intake system. The design and implementation of the system is part of ongoing research; the discussion in the current paper is therefore primarily intended to be conceptual (leaving a full evaluation for future work). The paper is structured as follows: Section 2 discusses the argumentation theory and inference mechanisms that constitute the basis of the automated reasoning about fraud. The process of collecting complaint information can be modeled in different ways, which is described in Section 3. One of the proposed approaches involves a dialog with the complainant, which requires a question-asking policy, as described in Section 4. As a prerequisite for argumentative inference, the basic observations need to be extracted from the input given by the complainant. For textual input, natural language processing (NLP) techniques are required for this task. The observations as used in the graph generally denote a relation between entities, e.g., a send-relation involving the complainant, the counterparty and a package as relation elements. The classification of entities and relations is described in Section 5, and the corresponding data annotation in Section 6. Section 7 concludes the paper and discusses next steps.

[Figure 1: Example argumentation graph. Observation nodes (shown with a gray background): not delivered, waited, paid, false location, false website. Rules R1–R4 connect these observations to the conclusions not sent, deception and fraud.]
2 Argumentation Theory

The Dutch Criminal Code defines fraud as "misleading through false contact details, deceptive tricks or an accumulation of lies". These elements can be traced back to observations or observable facts collected from the victim and relevant third parties. Based on the legal definitions in the Dutch Criminal Code, the relevant case law and knowledge of the working procedures of the police analysts who currently assess the fraud reports, we have constructed an argumentation theory about online trade fraud. To construct the argumentation theory, the right balance needs to be found in the level of detail for observations. On the one hand we want an observation to be directly observable from the input document, for instance 'no mention of payments occurs in this document'. On the other hand, observations that are too detailed lead to a large argumentation theory, which is more difficult to construct, maintain and use in argument inference. We try to find a balance by interacting with the police-side users of the system, such as the people that handle incoming complaints. If they think a statement is obvious from a document then we do not require an argumentation structure for that statement; such statements are candidates for becoming observables. For other statements, such as 'this document concerns fraud', it is not immediately obvious and we require some argumentation as to why a crime is committed in that case. Currently, we work with an argumentation theory of 46 rules and 26 observable facts [Ber18].

The argumentation rules and observables can be modeled in an argumentation graph, where (sets of) observations provide support or counter-evidence for other propositions. A simplified example argumentation graph is presented in Figure 1. Inference rules are conjunctive, e.g., 'if the package is not delivered and the complainant waited for a reasonable period of time then the package is not sent' (R1). The observation nodes are indicated with a gray background in Figure 1.

Once the graph is constructed, the observed nodes are used as input to infer conclusions. As per ASPIC+ [Pra10] definitions, we use inference trees as the data structure with which to represent arguments. An inference tree consists of a set of premises and a conclusion connected by rules, with possible intermediate conclusions (which are in turn premises for further conclusions) in between. For example, in Figure 1, the inference tree for the conclusion not sent contains the premises and the conclusion of rule R1, whereas the inference tree for the conclusion fraud contains all nodes in the graph. Arguments may attack each other because of inconsistent conclusions (rebutting attack) or because a conclusion contradicts a premise of another argument (premise attack). Given a set of arguments and the attack relation, we determine the set of acceptable arguments by calculating the grounded extension from Dung's abstract argumentation framework [Dun95]. The grounded extension contains all arguments that are conflict-free and that defend themselves against any attackers; that is, if argument A in the grounded extension is attacked by argument B which is not in the grounded extension, then there is an argument C in the grounded extension that attacks B and thus defends A. Other options than grounded semantics exist, but grounded semantics fit nicely with the conservative nature of legal processes and can be computed in polynomial time given the arguments.

Consider for example the situation that a package has been sent, but the recipient was not at home and the delivery service issued a note that the package has been returned to the sender. For the purposes of the example, assume that the counterparty in fact has bad intentions (e.g., sending a defective product) and has used a false address. In this case the propositions false location, not delivered, waited and paid are true, but not sent is observed to be false (given the note from the delivery service). Based on these observations and the argumentation graph presented in Figure 1, using a forward chaining algorithm the conclusions deception, not sent and fraud can be inferred. However, the conclusion not sent conflicts with the observation sent. Therefore not sent and the dependent conclusion fraud are not in the grounded extension, which consists of the observation set and the conclusion deception.
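The example can be made concrete in code. The following Python sketch is our own illustrative reading of Figure 1 (the rule names R1–R4 appear in the figure, but their exact bodies are assumptions; the actual theory has 46 rules): forward chaining derives every conclusion whose conjunctive premises hold, and a second pass blocks any conclusion whose negation is directly observed, together with everything that depends on it, mimicking grounded acceptance for this small acyclic graph.

```python
# Sketch of the Figure 1 example. The rule bodies below are illustrative
# assumptions; the paper's full theory contains 46 rules.
RULES = {
    "R1": ({"not delivered", "waited"}, "not sent"),
    "R2": ({"false location"}, "deception"),
    "R3": ({"false website"}, "deception"),
    "R4": ({"paid", "not sent", "deception"}, "fraud"),
}

def forward_chain(observations):
    """Derive every conclusion whose (conjunctive) premises hold."""
    derived, changed = set(observations), True
    while changed:
        changed = False
        for premises, conclusion in RULES.values():
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

def grounded(observations):
    """Keep only conclusions that survive attacks by observations.

    A derived conclusion p is defeated when its negation ('not p', or p
    without the 'not ') is itself observed; anything inferred from a
    defeated conclusion is thereby defeated as well.
    """
    def negation(p):
        return p[4:] if p.startswith("not ") else "not " + p

    accepted, changed = set(observations), True
    while changed:
        changed = False
        for premises, conclusion in RULES.values():
            if (premises <= accepted and conclusion not in accepted
                    and negation(conclusion) not in observations):
                accepted.add(conclusion)
                changed = True
    return accepted

obs = {"false location", "not delivered", "waited", "paid", "sent"}
print(sorted(forward_chain(obs) - obs))  # → ['deception', 'fraud', 'not sent']
print(sorted(grounded(obs) - obs))       # → ['deception']
```

As in the text, forward chaining alone derives not sent and fraud, while the grounded pass keeps only deception, because the observation sent defeats not sent and everything built on it.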
When sufficient information is available, the argument inference will result in a stable state [1]. We say that a certain conclusion is stable if either A) an argument for it is included in the grounded extension and more information does not change this, or B) there is an argument for the conclusion but this argument, or any other argument for the conclusion, can never be in the grounded extension, or C) no argument can be made and neither will this be possible with more information. For instance, consider a case where the counterparty has refunded the payment to the complainant. In that case, there is no legal basis anymore to convict the counterparty of fraud. So if this proposition is observed for a case, then the system can establish that there will never be an argument for fraud in the grounded extension. The information necessary to reach a stable state needs to be provided by the complainant (possibly combined with information from third parties, such as banks or trade websites). The interaction with the complainant can be modeled in different ways, which will be discussed in the next section.

[1] Stability is not fully calculated (due to computational complexity). Instead, we deploy a heuristic that runs in time polynomial in the number of argument graph edges.
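The stability conditions can be illustrated with a brute-force check that, unlike the deployed polynomial heuristic, simply enumerates every possible completion of the still-unknown observables and requires the verdict to be identical in all of them. The mini-theory below (including the refunded observable and its blocking effect) is a simplified stand-in for the actual 46-rule theory.

```python
from itertools import product

# Illustrative mini-theory (an assumption, not the paper's actual rules):
# fraud requires deception, a missing delivery and a payment, and is
# blocked when the payment was refunded (condition B in the text).
RULES = [
    ({"false location"}, "deception"),
    ({"not delivered", "waited"}, "not sent"),
    ({"paid", "not sent", "deception"}, "fraud"),
]
OBSERVABLES = ["false location", "not delivered", "waited", "paid", "refunded"]

def derives_fraud(observations):
    derived, changed = set(observations), True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return "fraud" in derived and "refunded" not in observations

def is_stable(known_true, known_false):
    """Brute-force stability: the fraud verdict must be identical in
    every possible completion of the still-unknown observables."""
    unknown = [o for o in OBSERVABLES
               if o not in known_true and o not in known_false]
    verdicts = set()
    for values in product([True, False], repeat=len(unknown)):
        obs = set(known_true) | {o for o, v in zip(unknown, values) if v}
        verdicts.add(derives_fraud(obs))
    return len(verdicts) == 1

# Once a refund is observed, no completion can yield fraud: stable.
print(is_stable({"refunded"}, set()))  # True
# With everything unknown, the verdict still depends on future answers.
print(is_stable(set(), set()))         # False
```

This exhaustive check is exponential in the number of unknown observables, which is exactly why the deployed system resorts to the polynomial heuristic mentioned in the footnote.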
3 User interaction

As mentioned earlier, the Dutch police currently collects a report (including free text but also predefined fields for addresses, trade sites, etc.) using a web interface. The argumentation system as described in Section 2 can be based on this document by providing a conclusion (i.e., fraud or not fraud) if a stable state is reached, and suggesting to ask follow-up questions otherwise. Here, the full report document is used as input to instantiate propositions in the argumentation graph.

Alternatively, the user interaction model can be changed into a dialog paradigm. In this case the complainant does not file a report document; instead the system guides the complainant through the reporting process by asking a number of questions. After each question the argumentation graph is updated using the reply of the complainant, and the dialog is finished when the argumentation reaches a stable state. The questions can be selected dynamically, such that the argumentation advances towards a stable state with each question. This approach is similar to the current practice for reporting a crime at a police station, where a police officer asks a number of questions in order to fill out a crime report.

Note that, for practical purposes, the two approaches can be considered as opposite ends of the same methodology, i.e., providing a complete document to the argumentation graph is essentially a dialog with a single user response. Similarly, a question within a multi-step dialog can result in a complex user response which can be considered a short document. Regardless of the length of the dialog, the answers need to be parsed and processed in order to extract relevant information. This could be avoided by using closed-form questions with a list of predefined answers (e.g., 'Did you receive a package?', 'How long did you wait?'); however, such a dialog may prove to be insufficient for users to explain the details of the situation.

4 Question policy learning

When using a dialog between the system and the complainant, we want the system to get to a stable state as efficiently as possible. Determining whether a state is stable consists of hypothesizing over all possible future questions. As this is generally infeasible to do, we turn to machine learning methods to train a questioning strategy that approximates an ideal solution.

The policy that is to be learned maps observed propositions to questions that can be asked or to a terminating action (accept/reject). An action results in some response from the user, which consists of new observations and possibly inferred conclusions (both propositions) that are added to the already known propositions. As a result, we may view the state of the system as a set of propositions and the actions as non-deterministic transitions between states. If we model this as a Markov Decision Process, then we can use Q-learning [Wat89] to train a policy. Note that this requires the assumption that the answer to a question is independent of how the current set of propositions was obtained. For our Q-learning approach we require a reward function. In order to promote efficient dialogs, we give a small penalty for each action. To promote stability, we give a high reward for reaching stable states. Finally, alongside a reward function we need a user model that realistically provides responses to questions (the probabilities of transitions in the Markov Decision Process). To this end we currently work with hand-written models. When the system is deployed it will gather user data, and a data-driven model will then replace the initial model.

As an example, using the argumentation graph in Figure 1, consider the state in which false location is known to be true and all other propositions are unknown. Suppose the Q-learning algorithm selects 'ask for false website' as the next action to evaluate. This question can support deception as a conclusion, however this conclusion was already supported by false location. The new state after asking this question therefore has the same reward value as the previous state, while the penalty is increased by performing the question action. This will lead the Q-learning algorithm to reject this state-action pair as part of the policy, and to consider alternative actions instead.
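The dialog setting described above can be sketched as a small tabular Q-learning loop. Everything concrete below — the question list, the hand-written answer probabilities and the placeholder stability test — is an assumed stand-in, not the project's actual model; only the reward structure (a small penalty per question, a high reward on reaching a stable state) follows the description above.

```python
import random
from collections import defaultdict

# Hand-written user model (hypothetical probabilities): the chance that
# the complainant answers "yes" to each question, independent of history.
USER_MODEL = {"false location": 0.4, "false website": 0.3,
              "not delivered": 0.8, "waited": 0.7, "paid": 0.9}
QUESTIONS = list(USER_MODEL)

def is_stable(state):
    # Placeholder stability test: a deception ground is established and
    # the payment/delivery facts are all settled (either way).
    settled = {q for q in QUESTIONS if q in state or "not " + q in state}
    return bool(({"false location", "false website"} & state)
                and {"not delivered", "waited", "paid"} <= settled)

def step(state, question):
    """Simulate asking one question; returns (next_state, reward)."""
    yes = random.random() < USER_MODEL[question]
    nxt = frozenset(state | {question if yes else "not " + question})
    return nxt, (100.0 if is_stable(nxt) else 0.0) - 1.0  # -1 per action

def train(episodes=5000, alpha=0.1, gamma=0.95, eps=0.1):
    Q = defaultdict(float)
    for _ in range(episodes):
        state = frozenset()
        while not is_stable(state):
            open_qs = [q for q in QUESTIONS
                       if q not in state and "not " + q not in state]
            if not open_qs:
                break
            if random.random() < eps:
                q = random.choice(open_qs)
            else:
                q = max(open_qs, key=lambda a: Q[(state, a)])
            nxt, r = step(state, q)
            nxt_qs = [a for a in QUESTIONS
                      if a not in nxt and "not " + a not in nxt]
            best_next = 0.0 if is_stable(nxt) else max(
                (Q[(nxt, a)] for a in nxt_qs), default=0.0)
            Q[(state, q)] += alpha * (r + gamma * best_next - Q[(state, q)])
            state = nxt
    return Q

random.seed(0)
Q = train()
start = frozenset()
policy_first = max(QUESTIONS, key=lambda a: Q[(start, a)])
print("first question:", policy_first)
```

Because every question carries a penalty, the learned policy avoids questions that cannot change the reachable reward, which is exactly the false-website effect described in the example above.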
5 Extracting entities and relations

As described in Section 3, user input (either from report documents or from dialog responses) needs to be mapped to propositions in the argumentation graph. These propositions generally consist of a relation between relevant entities (people, objects, locations, etc.) described in the complaint. Various techniques can be used to extract entities and relations between entities from text, ranging from dictionary lists and syntactic patterns to complex parsing algorithms and machine learning models. For Dutch legal data the Dutch dependency parser Frog [BCD07] can be used for named entity recognition, for which the performance on legal data was evaluated in previous work [SBB17]. For relation extraction, the development and evaluation of automatic methods is an ongoing effort in the current research project, as described in the remainder of this section.

In order to use these techniques effectively in a law enforcement application, the expected result from text processing should be considered carefully. Given the domain, for example, knowing whether the victim has paid the counterparty is essential, whereas other information containing entities (e.g., details of contacts with other victims) is not relevant for legal reasoning. The relevance of certain types of information determines how data should be collected and processed in developing entity and relation extraction methods. This includes a mapping from nodes in the argumentation graph to entities and relations in the text. However, other propositions may not be represented directly in the text, such as the use of a false website. In such cases, partial information may be present in the text (e.g., the counterparty operated a website), while the proposition itself can only be validated after considering information from a third party (e.g., checking with the ISP to prove that the website is fake). In both cases, however, the legal definition of the crime (as expressed in the argumentation graph) is essential for the development of text processing methods.

6 Data annotation for relation extraction

As we stated in Section 5, some of the propositions in the argumentation graph are based on relations between entities in the crime report documents. We plan to use supervised machine learning techniques (see for example [XML+15]) to automatically extract these relations, which have been shown to provide high accuracy for the current dataset in preliminary experiments. Therefore, crime report documents need to be annotated with the concepts identified in the domain analysis process. Concepts of interest include residence, payment and delivery information. Each concept has a number of associated properties for which annotation could prove useful. These properties are listed in Table 1.

Table 1: Examples of properties of interest for annotation.

  concept        property
  residence      person name, location, large distance;
                 role: complainant, counterparty, related, unrelated
  send, receive  sender, recipient, object;
                 indicator of relation;
                 validity: true, false, unclear;
                 type: product, payment, contact, other;
                 roles sender, recipient: complainant, counterparty, other;
                 object: fake, broken, other

Residence relations are interesting as they may indicate the deceptive trick in which the fraudster gives a false address. In that case, we often find in the report that the actual occupant of the address was an unrelated person who did not know anything about the advertisement. Furthermore, the address is often in a remote location, facilitating the fraudster to (falsely) promise to send items per mail. To be able to detect these situations in the future, we annotate the person name, the location, the role of the person and the large distance property. In future research the processing of coreferential expressions (e.g., the token he referring to an earlier mention of John Smith in a document or dialog) will be addressed.
The the development of text processing methods. send and receive relations are ternary, having a sender, receiver and object, although some of the entities may 6 Data annotation for relation extrac- be omitted in the text: for instance, in the sentence ‘I did not receive anything’ the sender is missing. For tion the sender and receiver, we annotate the corresponding As we stated in Section 5, some of the propositions character indices and the role (complainant, counter- in the argumentation graph are based on relations be- party or other). In some complaints, the complainant tween entities in the crime report documents. We plan reports that he or she received a broken product. This to use supervised machine learning techniques (see for suggests a civil case instead of a fraud case. There- example [XML+ 15]) to automatically extract these re- fore, we annotate not only the character indices but lations, which has shown to provide high accuracy also the state of the product. Furthermore, we anno- for the current dataset in preliminary experiments. tate the word(s) indicating a send or receive relation Therefore, crime report documents need to be anno- and the validity of the relation. An example anno- tated with the concepts identified in the domain anal- tated sentence (translated for illustration purposes) is ysis process. Concepts of interest include residence, provided in Figure 2. payment and delivery information. Each concept has We plan to use the annotations in a classifier that, a number of associated properties for which annota- given a set of tokens, decides if they are entities in tion could prove useful. These properties are listed in one of the aforementioned relations. The output of Table 1. this classifier can then be mapped to propositions for Residence relations are interesting as they may indi- the argumentation graph. 
We plan to use the annotations in a classifier that, given a set of tokens, decides whether they are entities in one of the aforementioned relations. The output of this classifier can then be mapped to propositions for the argumentation graph. Such a classifier is intended to operate on free text input, using simple features such as the presence of selected keywords as well as more complex features such as lemmas or grammatical dependency paths. For real-world free text the computation of these features may be unreliable (e.g., as a result of misspellings in the source text) or, even with reliable features, a real-world example may not conform to the regularities found in the training set. However, using a suitable classifier and an appropriate training set size, the model is expected to generalize over such irregularities to a certain extent. Moreover, as mentioned in Section 3, using a dialog component within the system will provide some context to interpret the results of the relation classifier.
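The intended pipeline — keyword-presence features, a relation decision, and a mapping to a proposition for the argumentation graph — can be illustrated with a tiny perceptron over keyword features. The keyword list, the four training sentences and the perceptron itself are stand-ins for the project's actual classifier and data, shown only to make the feature-to-proposition flow concrete.

```python
# Keyword-presence features -> send-relation decision -> proposition.
# Keywords, training data and model are illustrative assumptions.
KEYWORDS = ["transferred", "paid", "sent", "received", "nothing", "never"]

def features(sentence):
    tokens = sentence.lower().split()
    return [1.0 if kw in tokens else 0.0 for kw in KEYWORDS]

def train_perceptron(data, epochs=20):
    """Plain perceptron updates; label 1 = a send relation is present."""
    w, b = [0.0] * len(KEYWORDS), 0.0
    for _ in range(epochs):
        for sentence, label in data:
            x = features(sentence)
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            if pred != label:
                for i, xi in enumerate(x):
                    w[i] += (label - pred) * xi
                b += (label - pred)
    return w, b

TRAIN = [
    ("i transferred the money to his account", 1),
    ("i paid for the tickets", 1),
    ("the advertisement looked professional", 0),
    ("we met at the market", 0),
]
w, b = train_perceptron(TRAIN)

def to_propositions(sentence):
    """Map a detected send relation to a proposition for the graph."""
    score = sum(wi * xi for wi, xi in zip(w, features(sentence))) + b
    return {"paid"} if score > 0 else set()

print(to_propositions("I transferred 100 euro but never received the package"))
# → {'paid'}
```

A production version would of course replace the bag of keywords with the lemma and dependency-path features mentioned above, but the output contract stays the same: a set of propositions that can be fed to the argumentation graph.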
7 Conclusion

In this paper we have described an approach to automatically handle the intake of criminal reports filed online by citizens. The proposed approach combines different types of techniques (i.e., natural language processing, argumentation and Q-learning) to obtain a system that A) handles natural language, B) produces arguments for complex conclusions and hence provides understandable and legally sensible explanations for decisions regarding complaint reports, and C) is capable of gathering information from its environment efficiently, by only asking the most relevant questions to the user and terminating the process if no more relevant information is to be found.

The algorithms and implementations presented in this paper are currently under development, and a number of prototypes are working or nearing completion. Furthermore, parts of the system, such as automatically drawing conclusions using the argumentation graph, the named entity recognition and basic relation extraction, have been implemented in the existing development systems at the Dutch National Police.

The techniques developed are generalizable beyond the domain of online trade fraud. Extending the system to other domains will involve a substantial (knowledge) engineering effort: argumentation theories will have to be built for different domains, and algorithms for extracting new types of observations will have to be trained. While our solution thus suffers from the classical "knowledge engineering bottleneck" that has hampered knowledge-based systems for decades, we believe the focus on smaller, relatively simple assessments makes truly autonomous systems more feasible. Furthermore, building and maintaining a small argumentation theory may be more suitable for general IT personnel at the Dutch Police than training machine learning algorithms on a new dataset. Finally, the algorithms for entity and relation extraction are aimed to be as general as possible, with good performance in different domains. Thus, other tasks and processes within the police organization can be gradually incorporated into the framework.

References

[Ber18] Jeroen Bergers. Improving online trade fraud complaint handling using argumentation theory. BSc thesis, Utrecht University, 2018.

[BCD07] Antal van den Bosch, Bertjan Busser, Sander Canisius, and Walter Daelemans. An efficient memory-based morphosyntactic tagger and parser for Dutch. In Selected Papers of the 17th Computational Linguistics in the Netherlands Meeting, pages 99–114. Netherlands Graduate School of Linguistics, 2007.

[Dun95] Phan Minh Dung. On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence, 77:321–357, 1995.

[va18] Ilse van 't Hul. Improving online trade fraud complaint classification by applying machine learning techniques. BSc thesis, Utrecht University, 2018.

[KSBB17] William Kos, Marijn Schraagen, Matthieu Brinkhuis, and Floris Bex. Classification in a skewed online trade fraud complaint corpus. In Preproceedings of the 29th Benelux Conference on Artificial Intelligence, pages 172–183, 2017.

[Pra10] Henry Prakken. An abstract framework for argumentation with structured arguments. Argument & Computation, 1(2):93–124, 2010.

[PS96] Henry Prakken and Giovanni Sartor. A dialectical model of assessing conflicting arguments in legal reasoning. Artificial Intelligence and Law, 4:331–368, 1996.

[SBB17] Marijn Schraagen, Matthieu Brinkhuis, and Floris Bex. Evaluation of named entity recognition in Dutch online criminal complaints. Computational Linguistics in The Netherlands Journal, 7:3–16, 2017.

[Wat89] Chris Watkins. Learning from Delayed Rewards. PhD thesis, King's College London, 1989.

[XML+15] Yan Xu, Lili Mou, Ge Li, Yunchuan Chen, Hao Peng, and Zhi Jin. Classifying relations via long short term memory networks along shortest dependency paths. In Proceedings of EMNLP 2015, pages 1785–1794. Association for Computational Linguistics, 2015.