CEUR-WS Vol-2582, paper 3 — https://ceur-ws.org/Vol-2582/paper3.pdf
What are you thinking? Explaining conversational agent responses for criminal investigations

Sam Hepenstal, Defence Science and Technology Laboratory, UK
Leishi Zhang, Middlesex University London, UK
Neesha Kodagoda, Middlesex University London, UK
B. L. William Wong, Middlesex University London, UK

ABSTRACT
The adoption of complex artificial intelligence (AI) systems in environments that involve high risk and high consequence decision making is severely hampered by critical design issues. These issues include system transparency, which covers (i) the explainability of results and (ii) the ability of a user to inspect and verify system goals and constraints. We present a novel approach to designing a transparent conversational agent (CA) AI system for information retrieval to support criminal investigations. Our method draws from Cognitive Task Analysis (CTA) interviews to inform the system architecture, and Emergent Themes Analysis (ETA) of questions about an investigation scenario, to understand the explanation needs of different system components. Furthermore, we implement our design approach to develop a preliminary prototype CA, named Pan, which demonstrates transparency provision. We propose to use Pan for exploring system requirements further in the future. Our approach enables complex AI systems, such as Pan, to be used in sensitive environments, introducing capabilities which otherwise would not be available.

CCS CONCEPTS
• Computer systems organization → Human-centered computing.

KEYWORDS
explainability, criminal intelligence analysis, conversational agents, transparency

ACM Reference Format:
Sam Hepenstal, Leishi Zhang, Neesha Kodagoda, and B. L. William Wong. 2020. What are you thinking? Explaining conversational agent responses for criminal investigations. In Proceedings of the IUI workshop on Explainable Smart Systems and Algorithmic Transparency in Emerging Technologies (ExSS-ATEC’20). Cagliari, Italy, 7 pages.

ExSS-ATEC’20, March 2020, Cagliari, Italy
© 2020 Crown copyright (2020), Dstl. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 INTRODUCTION
Artificial intelligence (AI) based conversational agent (CA) technologies are complex systems which are increasing in popularity [5, 6], because they provide more intuitive, natural, and faster access to information. They could benefit criminal investigations, where repeated information retrieval tasks are performed by analysts and the volume of data that requires filtering and processing is significant. In June 2019 Cressida Dick, the Commissioner of the Metropolitan Police, explained that “sifting through vast amounts of phone and computer data is partly to blame (for low solved crime rates) as it slows down investigations” [16]. A more natural interaction, which removes the requirement for analysts to translate their questions into restrictive syntax or structures, could speed up this process significantly. If an analyst were able to communicate with their data in the same way as they do with their colleagues, through natural language, then they could achieve significant time savings and speed up investigations.

However, for complex applications to be used in high risk and high consequence domains, transparency is crucial. If an analyst misinterprets system processes and information caveats when retrieving information in a live investigation then the impacts can be serious, for example leading to errors such as directing resources to the wrong location, or failing to find a vulnerable victim. Misinterpretation is a particular risk where there are subjectivities, such as when using a CA to interpret human intentions. We define transparency as the ease with which a user can (i) explain results provided by a system, in addition to (ii) being able to inspect and verify the goals and constraints of the system within context [3]. Without transparency, including appropriate levels of audit, complex systems cannot be used by intelligence analysts to support their investigations.

The domain of intelligence analysis is broad and diverse, therefore we have focused upon a narrow spectrum of criminal intelligence analysis and information retrieval tasks. To develop the prototype we first gathered and analysed data from CTA interviews with operational police analysts to identify the way they recognise, construct, and develop their questioning strategies in an investigation. We captured important attributes within the interview data linked to the Recognition-Primed Decision (RPD) model [9] and applied Formal Concept Analysis (FCA), a mathematical method to transform question objects and associated functional attributes into lattice structures, to identify intention concepts (contribution 1). We can therefore provide an explanation structure for each intention, and the underlying system processes, which mirrors the way in which humans recognise situations. We propose that this approach enhances the ability to inspect system behaviour and deliver transparency.

We also present findings from scenario based interviews with a different set of operational analysts. In these we looked to identify what information is required in explanations of the various components that form a CA. The interview data is distilled to distinct statements made by the analysts and further refined using Emergent Themes Analysis (ETA), to form an explanation framework
covering CA system components (contribution 2). We describe a novel CA prototype, named Pan (contribution 3), designed to address transparency issues, using our findings from the two sets of interviews.

The work discussed in this paper provides a preliminary investigation of transparency issues for information retrieval with complex systems, in the specific domain of criminal investigations. In future work we plan to probe further and to evaluate the prototype through experimentation with intelligence analysts.

2 RELATED WORK
Analysts play an important role in criminal investigations, as the results of their analysis underpin decision making by police commanders. For example, intelligence analysis directs the prioritisation of lines of inquiry in an investigation and assessments of key suspects. The process of intelligence analysis involves repetitive and intellectually non-trivial information retrieval tasks where “each piece of insight leads to intense periods of manual information gathering” [4]. For example, if a new lead is provided about a suspicious vehicle, analysts would ask questions such as ‘who owns the vehicle?’ and ‘is the vehicle linked to any previous incidents?’ If an intelligent system can improve this process the impact could be significant.

Manual formulation of query syntax or interactions with traditional analysis tools can be cumbersome and time consuming. A more natural interaction, which removes the requirement for analysts to translate their questions into restrictive syntax or structures, could speed up this process significantly. If an analyst were able to communicate with their data in the same way as they do with their colleagues, through natural language, then they could achieve significant time savings and speed up investigations.

We define typical CAs as being able to understand users by matching their input pattern to a particular task category (intention), for example through ‘Artificial Intelligence Markup Language’ (AIML) [15], where the intention triggers a set of functional processes. For banal tasks, such as playing a music playlist, the risks of an incorrect or misleading response are low and the resulting consequences limited. As a result, traditional CAs have not been built with algorithmic transparency in mind. If you ask Google Assistant, for example, why it has provided a particular response, it will not be able to tell you and instead responds with humour, such as ‘Let’s let mysteries remain mysteries.’ This is not appropriate for use in criminal investigations where decisions can have serious impact, for example to direct resources towards the wrong suspect.

2.1 Criminal investigations need explaining
Some research to date has touched on the need for a CA to be able to explain its responses. Preece et al. describe the ability to ask a CA ‘why’ it has provided a particular response, so an analyst can obtain the agent’s rationale. An explanation could be “a summary of some reasoning or provenance for facts” [13]. This understanding of explanation is consistent with research into explainable machine learning, where the focus is placed upon the specifics of the data retrieved, or the internals of a model. Gilpin et al. [2] define eXplainable AI (XAI) as a combination of interpretability and completeness, where interpretability is linked to explaining the internals of a system and completeness to describing it as accurately as possible.

Intelligence analysis is a field where analysts operate in complex, subjective, uncertain and ambiguous environments, and a simple explanation of the data or a model which defines a response is not enough to satisfy their needs for understanding; for example, the method applied by the system may present significant constraints of which the analyst is not aware. Previous research has looked at this issue and developed a design framework for algorithmic transparency [3]. This describes the necessity to go beyond XAI when designing intelligent systems, to include visibility of the system goals and constraints within context of the situation. Context relates to the usage and user, including a user’s mental model for the ways in which the CA system works. Users who have a different mental model to the realities of the system can encounter difficulties and are prone to error [12].

2.2 Structuring Human-Machine Recognition
In a policing scenario, when an analyst is presented with a situation they immediately look to make sense of it. They apply experience to recognise aspects of the situation and construct a plausible narrative explanation with supporting evidence. Klein [9] presents the Recognition-Primed Decision (RPD) model to characterise how humans recognise and respond to situations, including their cues, expectancies, actions and goals. The RPD model was first developed to understand how experienced people can make rapid decisions, using a case study on fireground commanders, another high risk and high consequence domain.

We desire a CA that can recognise situations in a similar fashion and respond to analyst questions appropriately. We also need analysts to recognise the behaviour of a CA in each situation when it attempts to understand and respond to the analyst. We propose that the RPD model provides a useful foundation for designing CA intentions, so that a CA can recognise analyst inputs, in addition to an explanation structure so that its behaviour can also be recognised and understood by the analyst.

3 MODELLING CA INTENTIONS
3.1 Participants and Method
We conducted Cognitive Task Analysis (CTA) interviews, applying the Critical Decision Method [9], with four intelligence analysts to delve into a particularly memorable incident for each. The analysts have a minimum of 5 years operational experience. In this study, we analyse interview data to identify the thought processes of analysts, including the questions they asked during their investigations and requirements for responses.

3.2 Analysis and Results
For each interview we attempted to understand how analysts identified what was happening and the information they needed to advance their investigations. Critical to this process is how analysts recognise and respond to situations. We analysed analyst interview statements, structuring them against the Recognition-Primed Decision (RPD) model [7], and found that the model is appropriate to capture and explain their processing of information in an investigation (Table 1). We propose that the RPD model, therefore, provides
Table 1: RPD Mapping from Interview Statements (Example from Interview 1)

Transcript Statement [CTA: A1, 11:30]: “We had no idea initially what the kidnap was for. We were searching associates, we looked for any previous criminal convictions, we spoke to neighbours, and telephone information for his phone. One of the neighbours had suspected he had been kidnapped, and a witness had seen him being bundled into a car and alerted the police because they knew he was vulnerable.”
- Goals: Understand the motive, the risk to the victim, and possible suspects
- Cues: Man gone missing. Thought he had been kidnapped due to witness report. Known to be vulnerable
- Expectancies: There is information for victim within existing databases
- Actions: Searched known associates, looked for previous convictions, spoke to neighbours and witnesses, looked at telephone information
- Why?: To reduce scope of investigation and assess level of risk
- What for?: To direct next steps of investigation and better use experience to recognise patterns

Extracted Questions:

“What people are associates of victim?”
- Goals: Find associates
- Cues: Victim name
- Expectancies: The victim knows the offenders
- Actions: Search for people connected to victim name
- Why?: To find potential suspects
- What for?: So that inquiries can be made into suspects

“Does the victim have any previous convictions?”
- Goals: Find convictions
- Cues: Victim name
- Expectancies: The victim has been targeted before
- Actions: Search for convictions directly linked to victim name
- Why?: To understand past victimisation or offending
- What for?: To assess risk and inform prioritisation

“What calls have involved the victim’s phone?”
- Goals: Find calls
- Cues: Victim phone number
- Expectancies: The victim has been involved in recent calls
- Actions: Search for calls involving phone number
- Why?: To find recent communications
- What for?: To identify possible leads or location

a concise and clear representation of an analyst’s behaviour when retrieving information, and thus can be used to give an explanation structure for their intentions. We can design system processes that mirror this representation.

In Table 1, we also show how we extract individual questions asked by analysts from interview statements and can structure them against the RPD model. Furthermore, we can interpret the RPD attributes more generically to suit multiple questions of the same type. During the interviews each analyst provided many examples of their information needs and the questions that they asked when performing an investigation. For example, one analyst stated that “I looked through every database for the victim’s name, custody records, PNC (Police National Computer), stop and search, vehicles he drove, to see if he had been stopped and searched with other people in the vehicle and if they had been named.” [CTA: A1, 15:00]. From this statement, we can extract a number of questions posed by the analyst that could be directed towards a CA, including “how many vehicles have travelled to the victim’s address?” To answer this question the analyst provides cues for ‘vehicles’, ‘travelled’ and ‘victim’s address’. Their goal is to retrieve summary information, i.e. ‘how many’, and they are interested in finding a specific pattern of data in the database which connects the cues. Table 2 provides a different example, with generic RPD attributes.

Table 2: Example FCA-RPD Objects and Attributes

FCA Object: “Has [victim name] been reported in any activity?” (Recognition-Primed Decision aspects)
- Cues: Pass specific input details (Victim Name, Activity)
- Goals: Present confirmation
- Expectancies: Expected that input details and pattern exist
- Actions: Perform adjacent information search for entities extracted
- Why?: Retrieve list for further exploration
- What for?: To find new lines of inquiry

In this paper, we present how RPD attributes can be used to dynamically model analyst intentions for searching and retrieving information, through Formal Concept Analysis (FCA). FCA is an analysis approach which is effective at knowledge discovery and provides intuitive visualisations of hidden meaning in data [1]. FCA represents the subject domain through a formal context made of objects and attributes of the subject domain [14]. By breaking down analyst questions and structuring their components against the RPD model, we extract attributes which can be used by a CA to process a response. In this study we identified specific RPD attributes which address over 500 analyst questions, akin to those described
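The formal context described here can be sketched in a few lines of Python. In FCA, every concept intent is an intersection of object attribute sets, so for a small context we can enumerate concepts naively. The question texts follow examples quoted in this paper, but the attribute labels and the context itself are illustrative assumptions for this sketch, not the paper's full set of 500+ questions or its actual attribute names.

```python
from itertools import combinations

# Toy formal context: objects are analyst questions, attributes are
# generic RPD-derived functional attributes a CA needs to answer them.
# Attribute labels are illustrative assumptions.
CONTEXT = {
    "How many vehicles are in our database?":
        {"cue: class", "action: adjacent search", "goal: summary count"},
    "What people are associates of victim?":
        {"cue: entity", "action: adjacent search", "goal: retrieve list"},
    "Has [victim name] been reported in any activity?":
        {"cue: entity", "cue: class", "action: adjacent search"},
}

def intents(context):
    """All concept intents: intersections of the objects' attribute sets,
    plus the full attribute set (the intent of the empty extent)."""
    result = {frozenset(set().union(*context.values()))}
    sets = [frozenset(a) for a in context.values()]
    for r in range(1, len(sets) + 1):
        for combo in combinations(sets, r):
            result.add(frozenset.intersection(*combo))
    return result

def extent(context, intent):
    """Objects that share every attribute in the given intent."""
    return {q for q, attrs in context.items() if intent <= attrs}

# Each (extent, intent) pair is a formal concept; concepts group questions
# that can be answered by the same combination of functional attributes.
for i in sorted(intents(CONTEXT), key=len):
    print(len(extent(CONTEXT, i)), "question(s) share:", sorted(i))
```

For contexts of realistic size, a proper algorithm (e.g. NextClosure, as implemented in tools like Concept Explorer, which the paper uses for Figure 1) would replace the exponential enumeration above.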
by analysts in interviews. We then performed FCA to identify in-           Table 3: ETA Snapshot for Clarification of System Processes
tention concepts. In our case, the subject domain comprises the
intentions of an analyst when they ask questions in an investiga-           Broad         Sub-Theme        Framework        Statement
tion. Therefore, FCA objects are questions including “Has [victim           Theme                          Area
name] been reported in any activity?”. FCA attributes are the RPD
                                                                            System        Clarification Clarification I am concerned that
model specifics which the CA must recognise and act upon in or-
                                                                            Processes     of system                   info is missing because
der to answer each question, such as the action ‘Perform adjacent
                                                                                          inputs.                     of search criteria.
information search’. Attributes can be simple methods, for example
looking for single shortest paths or a pre-defined SPARQL pattern,
                                                                                          Clarification                     Understanding         as
or they can be more advanced capabilities, such as clustering similar
                                                                                          of system                         a tool is also important
instances. Importantly, each generic RPD attribute corresponds to
                                                                                          processes                         for the whole system,
a functional process and therefore can be developed as a module.
                                                                                                                            such as when and
FCA allows us to group modules together to form intentions, with
                                                                                                                            where to use it.
question objects that can be used to train text classification for the
user input to the CA.
                                                                                                                            How have the re-
   The lattice, as shown in Figure 1, presents distinct object group-
                                                                                                                            sults been worked out
ings. The final layer of concept circles are complete concept inten-
                                                                                                                            and what methods have
tions, where all parts of the RPD are considered. The circles are
                                                                                                                            been applied?
sized based upon the number of associated questions. We can see
that three questions in our set can be answered by combining the
highlighted attributes. These attributes can answer the question,          Table 4: CA Explanation Area Framework and Sub-Themes
‘how many vehicles are in our database?’, with ‘vehicle’ as a cue.
The CA looks for adjacent information i.e. where there are instances
                                                                                   Framework            ETA Sub-Themes
of the class ‘vehicles’, presents a summary count, and retrieves a list.
                                                                                   Area
To provide transparency we propose we can simply present what
attributes, and therefore functional processes, underpin a concept                 Clarification        Clarification of data attributes
with their descriptions. Our model-agnostic and modular approach                                        and structure, entity details, sys-
is akin to what Molnar [11] describes as the future of machine                                          tem input variables, metrics,
learning interpretability. We have used the concept lattice to define                                   question language, system pro-
the intentions that an analyst can trigger through a CA interface,                                      cesses, response methods, re-
where each intention reflects our explanation structure; the RPD                                        sponse language.
model.                                                                             Continuation         Provide information to support
                                                                                                        continuation of investigation,
4 UNDERSTANDING CA RESPONSES                                                                            including use of past interac-
                                                                                                        tions to move to next.
4.1 Participants and Method                                                        Exploration          Associated/additional data in re-
We interviewed four intelligence analysts with more than 10 years                                       sponses or on periphery, inten-
operational experience, from a different organisation to those in-                                      tion match, system processes,
terviewed previously. We aimed to explore their requirements for understanding the responses and processes of a CA in the context of a criminal investigation. Each interview lasted an hour, and we presented interviewees with a series of questions and corresponding CA responses under two explanation conditions, switching the order of presentation. In one condition, responses described the data alone (1); in the other, the data and system processes (2). We were not attempting to test the differences between conditions; rather, we used them as a starting point from which we could explore additional needs. Throughout the interviews a single researcher took extensive notes, from which individual statements were extracted. In total, 114 distinct statements were extracted, with counts for each analyst ranging from 24 to 34.

(continued table rows)
... source documents.
Justification | Provide information to justify selected system processes and the data defining the response.
Verification  | Additional details for entities, correct intention match and impact/constraints of system processes. Check data reliability.

4.2   Analysis and Results
To analyse the statements we used an approach called Emergent Themes Analysis (ETA), as described by Wong and Blandford [19, 20], in which broad themes (similar ideas and concepts) are identified, indexed and collated. ETA is useful for giving a feeling of what the data is about, with structure, and is fast and practical [10]. A single researcher analysed the statements and identified that they could be coded against the core functional components of a CA, for example 'System Processes' as shown in Table 3. From these components, we have drawn out the specific understanding needed for CA responses as sub-themes. The sub-themes are further categorised to form a general framework (Table 4) for explanation needs from an intelligent CA system.
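The paper reports only the outcome of this coding (e.g. the "Verification (4)" counts that appear later in Table 5), not the tallying itself. As a minimal sketch of how such counts could be produced, assuming invented statement codes, analyst IDs, and a two-analyst threshold that are all illustrative rather than taken from the study:

```python
# Hypothetical sketch: tally coded interview statements per
# (CA component, framework area) pair, keeping pairs that at
# least `min_analysts` distinct analysts mentioned.
from collections import defaultdict

CODED = [  # (analyst, CA component theme, framework area) - invented examples
    ("A1", "System Processes", "Verification"),
    ("A2", "System Processes", "Verification"),
    ("A3", "System Processes", "Verification"),
    ("A4", "System Processes", "Verification"),
    ("A1", "Data", "Clarification"),
    ("A2", "Data", "Clarification"),
    ("A1", "Response", "Exploration"),  # only one analyst: dropped below
]

def tally(coded, min_analysts=2):
    """Count distinct analysts per (component, area); filter rare pairs."""
    analysts = defaultdict(set)
    for analyst, component, area in coded:
        analysts[(component, area)].add(analyst)
    return {key: len(who) for key, who in analysts.items()
            if len(who) >= min_analysts}
```

The threshold mirrors the later reporting convention of only listing framework areas "where at least two analysts made associated statements".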
What are you thinking? Explaining conversational agent responses for criminal investigations                                 ExSS-ATEC’20, March 2020, Cagliari, Italy




               Figure 1: Concept Lattice for RPD Model Intentions (computed and drawn with Concept Explorer [21])
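As a rough illustration of how formal concepts like those in the lattice above are derived, the following sketch enumerates all (extent, intent) pairs for a toy formal context. The questions (objects) and attribute names are invented stand-ins, not Pan's actual intention data:

```python
# Toy Formal Concept Analysis (FCA): objects are example analyst
# questions, attributes are hypothetical action attributes. A formal
# concept is a pair (extent, intent) where the extent is exactly the
# set of objects sharing the intent, and vice versa.
from itertools import combinations

CONTEXT = {
    # question (object)           -> attributes it requires (illustrative)
    "who has X called?":           {"find_links"},
    "how are X and Y connected?":  {"find_links", "shortest_path"},
    "what is the address of X?":   {"entity_lookup"},
}

def derive_objects(attrs, context):
    """All objects having every attribute in attrs (the extent)."""
    return {o for o, a in context.items() if attrs <= a}

def derive_attrs(objects, context):
    """All attributes shared by every object in the set (the intent)."""
    if not objects:
        return set().union(*context.values())
    return set.intersection(*(context[o] for o in objects))

def concepts(context):
    """Enumerate formal concepts by closing every attribute subset."""
    all_attrs = sorted(set().union(*context.values()))
    found = set()
    for r in range(len(all_attrs) + 1):
        for subset in combinations(all_attrs, r):
            extent = derive_objects(set(subset), context)
            intent = derive_attrs(extent, context)
            found.add((frozenset(extent), frozenset(intent)))
    return found
```

Each concept corresponds to a node in a lattice like Figure 1; a tool such as Concept Explorer [21] performs the same derivation at scale and draws the result.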


   Exploring the interview data through the ETA method and structure is helpful when we come to design CA components. For example, examining Table 3 again, we can see that to provide understanding of system processes to an analyst we need to allow for clarification of both input variables and processes. Drawing upon details in the statements, we can see that it is important to clarify any constraints related to the search inputs, the general capabilities of the system as a whole, and the specific processes applied in any instance. We can incorporate explanations that provide clarification of these aspects, in addition to solutions for other themes extracted through ETA, into the design of our prototype application.
   An analyst's ability to have clarification, verification and justification of system processes is crucially important, as identified by all analysts interviewed. This finding supports the framework for providing algorithmic transparency presented by Hepenstal et al. [3] and reiterates the need to go beyond traditional approaches to explainable AI (XAI), which focus upon explanations of the important features for a model and accuracy measures. Specific concerns included a need to justify follow-up questions and the underlying rationale of the system for use in court (ETA: A1; Q2; C1). Additionally, analysts needed an understanding of the system processes selected by the CA, including descriptions of the methods applied (all analysts, multiple statements) and inherent constraints, such as the questions which cannot be answered by the CA and information which has been omitted by the process (ETA: A2; Q1; C2 | A3; Q2; C2 | A4; Q4; C2). Essentially, analysts need to be able to justify, clarify and verify the CA intention triggered by their query and the related functional attributes. We believe our RPD explanation structure provides a neat mechanism to pick apart the system processes and provide, for each, the understanding required.
   In Table 4, we display the framework areas and related sub-themes that emerged from ETA. Specific areas in the explanation framework can be linked to existing models for sensemaking, such as the Data Frame Model [8] for elaborating and reframing questions, or Toulmin's model for argumentation [17] to provide justification. Table 5 presents the key framework areas for each component theme, where at least two analysts made associated statements, together with a summary of sub-themes specific to both CA component and framework area. Different CA components draw more heavily on particular aspects of the framework, and therefore our ETA analysis helps us to design and tailor explanations for each component.

5   CA PROTOTYPE
We have developed an initial prototype CA application called Pan, which uses FCA to define the different intention concepts to which it can respond. The objects (questions) which are attached to a concept are used as training data for machine learning text classification, so that a user's question can be matched to an appropriate intention. Each intention concept has associated attributes, and we have developed methods to handle these as individual models, which create query syntax and interact with the database. In this way, FCA can combine multiple distinct combinations of attribute models flexibly to meet different analyst intentions. We propose that by combining model-based attributes with FCA to define intention concepts we provide a highly flexible approach to developing CA intentions. The objects and corresponding RPD attributes are critical for providing visibility to an analyst for the responses given by a CA and are akin to explainability scenarios, i.e. "narratives of possible use that seek to answer the questions: who will use this


system and what might they need to know to be able to make sense of its outputs?" [18]
   Our work to identify the core understanding needs for CA components has helped to inform the design of explanations for different parts of the system, for example, when the CA matches user input to an intention concept, triggers associated attribute models, and responds. The explanation provides information required for an analyst to understand the CA component themes of 'Data', 'Extracted Entities', and 'Response'. As an analyst types their query and entities are extracted, they are provided with identifier information where possible. We have also designed the ability for an analyst to inspect and verify system goals and constraints. In our prototype we allow the user to step into the intention concept which has been triggered, through a dialog window, so they can inspect and verify clear textual descriptions, with our explanation structure, of the cues, goals, actions, expectancies and purpose of the intention. For example, when a concept triggers the action for finding single shortest path connections between instances, the analyst is presented with a description that includes any constraints to be wary of: specifically, that it will not find longer paths or consider multiple routes. These caveats will impact how the analyst considers any information returned or how to rephrase their question. The attribute descriptions for each RPD module hang together as a narrative, akin to explainability scenarios. We intend to run experiments with Pan and operational intelligence analysts to validate our understanding of explanation needs and our RPD explanation structure for CA intentions.

          Table 5: CA Component Core Understanding Needs

CA Component Theme | Framework Area (common for multiple analysts) | Summary of Sub-theme(s)
Extracted Entities | Clarification + Verification (3) | More information on entities extracted, for clarification and verification.
CA Intention Interaction | Clarification (3), Continuation (2) | Clear language to understand classification (i.e. no confusing response metric) and information to support continuation of investigation.
System Processes | Continuation (4), Verification (4), Clarification (3), Exploration (2), Justification (2) | User wants system understanding to support continuation of investigation, to allow them to verify processes are correct, explore them in more or less detail, and justify their use/approach and constraints.
Data | Clarification (3) | Clarification of data updates and source, and data structure to aid forming questions.
Response | Clarification (4), Justification (4), Exploration (2) | Justification of response with underlying data, clarification of language (not trying to be human) and terminology, ability to explore results in more detail.

6   USE CASES AND INITIAL FEEDBACK
In order for AI systems to be used for high risk and high consequence decision making they must provide transparency of their reasoning. As put by one analyst, "[the principal analyst] said none of my analysts would stand up in court where the beginning point of their evidence is an algorithm" [CTA: A4, 32:30] and that "You have to be able to trace it (your reasoning) all the way back to evidentially explain why you did each part... an analyst always has to justify what they have done, so does a system." [CTA: A4, 35:00] We believe that Pan addresses these issues by providing algorithmic transparency of its reasoning, using an architecture that aids recognition and explanations that meet our explanation framework. Early feedback from analysts on our approach is positive, opening routes for Pan to be tested in high risk and high consequence application domains where traditional CAs would not be deployed.

7   CONCLUSIONS AND FUTURE WORK
In this paper, we describe our approach to capturing and modelling analyst thought when retrieving information in a criminal investigation. We also present analysis to understand analysts' needs for explanations from a complex CA system. Finally, we describe a prototype CA which incorporates FCA and RPD to build intention concepts and is therefore, we believe, transparent by design. We plan to evaluate the transparency impacts of our approach to intention concept design, gather additional requirements, and validate our explanation framework through experimentation with operational analysts. To date we have not explored how a CA should present its responses to an analyst; we will therefore explore how explanations are communicated, such as the specific textual or visual method.
   The role of investigation scope was prominent in CTA interviews with analysts, where their questions were framed by the initial scope, thus introducing the risk that important information beyond the scope is missed. We will consider how CAs can help mitigate the constraints of investigation scope, through machine reasoning for example. Analysts expressed the desire to avoid obvious follow-up questions, so it would be helpful for a CA to predict and explore additional questions autonomously. One approach for this is to model investigation paths as a Bayesian network. Transparency is a critical issue in autonomous systems, and our explanation structure could help understanding by aiding the explanation of system behaviours across model states.
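The idea of modelling investigation paths as a Bayesian network could, in its simplest form, look like the following sketch: a current question node and candidate follow-up nodes, ranked by their marginal probability of yielding new evidence. The question nodes and all probabilities here are entirely hypothetical, not derived from the study data:

```python
# Toy sketch: rank candidate follow-up questions in a two-layer
# Bayesian network. Each node is binary (question yields evidence or
# not); candidates depend on the current question via a conditional
# probability table (CPT). All numbers are illustrative assumptions.

P_CURRENT = 0.6  # P(current question, e.g. "who called X?", yields evidence)

# CPTs: P(follow-up yields evidence | current question yielded evidence)
CANDIDATES = {
    "where was X at the time?": {True: 0.8, False: 0.3},
    "who does X live with?":    {True: 0.5, False: 0.4},
}

def marginal_yield(cpt, p_parent):
    """Marginalise out the parent: P(child) = sum_v P(child|v) * P(v)."""
    return cpt[True] * p_parent + cpt[False] * (1 - p_parent)

def rank_follow_ups(candidates, p_parent):
    """Return candidate questions sorted by marginal probability, best first."""
    scores = {q: marginal_yield(cpt, p_parent) for q, cpt in candidates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Ranking by a marginal computed from an explicit CPT keeps the suggestion step inspectable: the same RPD-style explanation structure could surface the table entries behind each suggested question.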


ACKNOWLEDGMENTS
This research was assisted by experienced military and police analysts who work for the Defence Science Technology Laboratory (Dstl) and the National Crime Agency (NCA) in the United Kingdom.

REFERENCES
 [1] Simon Andrews, Babak Akhgar, Simeon Yates, Alex Stedmon, and Laurence Hirsch. 2013. Using formal concept analysis to detect and monitor organised crime.
 [2] Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael Specter, and Lalana Kagal. 2018. Explaining Explanations: An Approach to Evaluating Interpretability of Machine Learning.
 [3] Sam Hepenstal, Neesha Kodagoda, Leishi Zhang, Pragya Paudyal, and B. L. William Wong. 2019. Algorithmic Transparency of Conversational Agents. In IUI ATEC. Los Angeles.
 [4] Sam Hepenstal, B. L. William Wong, Leishi Zhang, and Neesha Kodagoda. 2019. How analysts think: A preliminary study of human needs and demands for AI-based conversational agents.
 [5] Bret Kinsella. 2018. Amazon Echo Device Sales Break New Records, Alexa Tops Free App Downloads for iOS and Android, and Alexa Down in Europe on Christmas Morning. https://voicebot.ai/2018/12/26/amazon-echo-device-sales-break-new-records-alexa-tops-free-app-downloads-for-ios-and-android-and-alexa-down-in-europe-on-christmas-morning/
 [6] Bret Kinsella. 2019. NPR Study Says 118 Million Smart Speakers Owned by U.S. Adults. https://voicebot.ai/2019/01/07/npr-study-says-118-million-smart-speakers-owned-by-u-s-adults/
 [7] Gary Klein. 1993. A Recognition Primed Decision (RPD) Model of Rapid Decision Making.
 [8] Gary Klein, B. Moon, and R. Hoffman. 2006. Making Sense of Sensemaking 2: A Macrocognitive Model. IEEE Intelligent Systems 21, 5 (2006), 88–92.
 [9] G. A. Klein, R. Calderwood, and D. MacGregor. 1989. Critical decision method for eliciting knowledge. IEEE Transactions on Systems, Man, and Cybernetics 19, 3 (1989), 462–472.
[10] Neesha Kodagoda and William Wong. 2009. Cognitive Task Analysis of Low and High Literacy Users: Experiences in Using Grounded Theory and Emergent Themes Analysis. In Human Factors and Ergonomics Society Annual Meeting Proceedings (2009).
[11] Christoph Molnar. 2019. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable.
[12] Donald Norman. 1983. Design rules based on analyses of human error. Commun. ACM 26, 4 (1983), 254–258.
[13] Alun Preece, William Webberley, Dave Braines, Erin G. Zaroukian, and Jonathan Z. Bakdash. 2017. Sherlock: Experimental Evaluation of a Conversational Agent for Mobile Information Tasks. IEEE Transactions on Human-Machine Systems 47, 6 (2017), 1017–1028.
[14] Nadeem Qazi, B. L. W. Wong, Neesha Kodagoda, and Rick Adderley. 2016. Associative search through formal concept analysis in criminal intelligence analysis. IEEE.
[15] Nicole M. Radziwill and Morgan C. Benton. 2017. Evaluating quality of chatbots and intelligent conversational agents. arXiv preprint arXiv:1704.04579 (2017).
[16] Danny Shaw. 2019. Crime solving rates 'woefully low', Met Police Commissioner says. www.bbc.co.uk
[17] Stephen E. Toulmin. 1958. The Uses of Argument. Cambridge University Press.
[18] Christine Wolf. 2019. Explainability scenarios: towards scenario-based XAI design. In Proceedings of IUI '19. ACM, 252–257.
[19] B. L. William Wong and Ann Blandford. 2004. Describing Situation Awareness at an Emergency Medical Dispatch Centre. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Vol. 48. SAGE Publications, Los Angeles, CA, 285–289.
[20] William Wong and Ann Blandford. 2002. Analysing ambulance dispatcher decision making: trialing emergent themes analysis. In Vetere, F., Johnson, L., and Kushinsky, R. (eds.), Ergonomics Society of Australia, Canberra, Australia.
[21] Serhiy A. Yevtushenko. 2000. System of data analysis "Concept Explorer" (in Russian). Russia, 127–134.