<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>IUI Workshops'19</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Towards an Explainable Threat Detection Tool</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alison Smith-Renner</string-name>
          <email>alison.smith@dac.us</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rob Rua</string-name>
          <email>rob.rua@dac.us</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mike Colony</string-name>
          <email>mike.colony@dac.us</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Decisive Analytics Corporation</institution>
          ,
          <addr-line>Arlington, VA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <volume>20</volume>
      <issue>2019</issue>
      <abstract>
        <p>In general, threats can be loosely divided into two categories: known threats and unknown threats. Traditional threat detection systems are limited to the identification of known threats that have been previously encountered and labeled by a security expert. These supervised learning systems are able to learn to detect and identify known threats but are unable to react to unknown threats. To this end, we have developed an unsupervised learning anomaly detection system to identify anomalous behavior without training data. Our system's interactive interface supports human-machine teaming to classify these identified anomalies as threats or benign events; however, system transparency is required to enhance operator trust and improve their feedback into the system. Transparency in this case is particularly challenging as our anomaly detection framework is based on algorithms which are inherently hard to explain (neural networks). In this paper, we introduce a real-world task and system that requires transparency, and we propose explanation methods for increasing the transparency of our threat detection tool alongside a user study for evaluating these explanations.</p>
      </abstract>
      <kwd-group>
        <kwd>Anomaly detection</kwd>
        <kwd>explanations</kwd>
        <kwd>transparency</kwd>
        <kwd>human-machine teaming</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <p>• Human-centered computing~Human computer
interaction (HCI) • Human-centered computing~HCI design
and evaluation methods • Human-centered
computing~Interactive systems and tools</p>
    </sec>
    <sec id="sec-2">
      <title>1 Introduction</title>
      <p>IUI Workshops’19, March 20, 2019, Los Angeles, USA
Copyright © 2019 for the individual papers by the papers’ authors. Copying permitted
for private and academic purposes. This volume is published and copyrighted by its
editors.</p>
      <p>Forward deployed military installations face unique challenges
when automating threat detection in security monitoring systems.
In particular, development of a general framework for identifying
emerging security threats poses two technical obstacles: (1) the
security framework must be robust to an environment where
“normal” activities are initially unknown and (2) the framework
must support collaboration with operators to determine what types
of anomalous behavior constitute an actual threat. To this end, we
have developed an unsupervised anomaly detection system,
DAART, (for Detection of Anomalous Activity in Real Time), to
identify anomalous behavior without training data. DAART’s
interactive interface supports human-machine teaming to classify
these identified anomalies as threats or benign events.</p>
      <p>Traditional threat detection systems are limited to the
identification of known threats that have been previously
encountered and labeled by a security expert, such as a man
wielding a gun or an intruder in a restricted location. The
unsupervised nature of the DAART system additionally supports
identification and action on unknown threats, which is necessary to
adapt to ever changing environments. A by-product of this,
however, is an initial trend towards recall over precision, meaning
many benign activities may be alerted to the user. DAART’s Active
Learning component learns from operator feedback in the form of
accepting or rejecting alerts (alert-level feedback) to better
distinguish benign anomalous behavior from threats. A
human-in-the-loop system, such as this, requires system transparency to
improve operator trust, accelerate operator workflow, and better
enable operators to provide the valuable feedback required to
improve the system’s threat classifications.</p>
      <p>
        A threat detection system may err in two distinct ways: (1) false
positives in which benign behavior is predicted to be a threat and
(2) false negatives in which a threat is considered benign (and
therefore not alerted to the user). Operator trust is negatively
affected if a system produces many false positives without
explanation or if the operator cannot confirm whether the system
produces false negatives. System transparency in the form of
alert-level and system-level explanations therefore enhances trust,
because users can better understand when and why a system makes
mistakes as well as to ensure the system doesn’t miss any potential
threat behavior, respectively [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>
        Not all anomalies are threats and not all threats are equally
important. System transparency accelerates operator workflow by
providing the evidence needed to quickly and accurately prioritize
and determine the validity of threats. Finally, the DAART system
improves with user feedback, so the goal is to elicit the best feedback
possible from users while minimizing the time and effort required to
provide it. System transparency enhances the feedback process
because users’ feedback is improved when they have an
understanding of how the system works and why an alert is
considered anomalous. Furthermore, users’ time and effort are
minimized when providing feedback through the same
visualizations presented to them for explanation purposes, as these
are already familiar [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>In this paper we present the DAART system for identifying
anomalous behavior without training data, and we propose
interactive explanation methods for improved operator trust,
accelerated workflow, and enhanced operator feedback through
system transparency. In particular, we propose methods for
determining and displaying explanation information, such as
multimodal localization (or attention) and normalcy exemplar, and an
interactive explanation interface to present these and other simple
explanation types (system confidence, alternate classifications,
features) to users for promoting transparency and providing a
means for user feedback. We additionally propose a user study to
evaluate these various interactive explanation methods for the
DAART system.</p>
    </sec>
    <sec id="sec-3">
      <title>2 Background</title>
    </sec>
    <sec id="sec-4">
      <title>2.1 Anomaly Detection</title>
      <p>
        Detecting anomalies in sensor data requires a standardized feature
representation of the incoming data. Traditionally, these features
are defined by expert scientists who specialize in particular sensor
modalities. More recently, supervised machine learning models
have been able to outperform expert-defined features in their
descriptiveness about the original sensor data [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. DAART
improves on this, leveraging state-of-the-art research in
unsupervised convolutional feature learning [
        <xref ref-type="bibr" rid="ref1 ref6">1,6</xref>
        ] to generate
comparably discriminative features without the need for
human-labeled training data. While these extracted features are not as
easily understood by a human as expert-specified features, they
have more expressive power when used for tasks such as anomaly
detection. Importantly, this approach is sensor agnostic, meaning it
can be applied to any sensor data, including, but not limited to, EO
and IR video, audio, and acoustic sensors.
      </p>
    </sec>
    <sec id="sec-5">
      <title>2.2 User Feedback</title>
      <p>
        Interactive machine learning systems incorporate end-user
feedback to re-train underlying algorithms and improve their
output. Users may provide this feedback in the form of interactively
labeling data [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], as part of an interactive training phase [
        <xref ref-type="bibr" rid="ref7 ref8">7,8</xref>
        ], to
fix specific system mistakes [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], or to inject their domain expertise
into the system [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        The system we present here builds on interactive machine
learning techniques, such as accepting and rejecting system’s output
[
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] and interactive clustering [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] for improved threat
classification. We additionally propose to enhance the system with
support for richer user feedback, such as modifying feature weights
[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <title>2.3 System Transparency</title>
      <p>
        There is growing interest in system transparency, or explainable
artificial intelligence (XAI), driven in part by both DARPA’s XAI
initiative [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and the European Union’s data protection law for
“right to explanation” [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. We aim for transparency in our
anomaly detection system as it supports operator decision making
[
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], improves trust [
        <xref ref-type="bibr" rid="ref14 ref16">14,16</xref>
        ], and aids users in better providing
feedback to the system [
        <xref ref-type="bibr" rid="ref13 ref20">13,20</xref>
        ] as well as motivating them to do so
[
        <xref ref-type="bibr" rid="ref19">19</xref>
        ].
      </p>
      <p>
        System transparency can be provided through explanations or
visualizations that provide insight into what the system is doing and
why it is doing it (see [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] for a survey). In particular, prior work has
identified explanation types [
        <xref ref-type="bibr" rid="ref13 ref17">13,17</xref>
        ] to improve end user
understanding of complex systems. We propose to implement and
evaluate these explanation types in the DAART system.
      </p>
    </sec>
    <sec id="sec-7">
      <title>3 DAART</title>
      <p>
        We generate discriminative features and perform anomaly
detection using an approach based on Generative Adversarial
Networks (GANs) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        Generative Adversarial Networks (GANs) have demonstrated
the ability to generate images from a random noise vector that are
able to fool another model attempting to determine if the image is
real or fake. A GAN consists of two competing models. A Generator
(G) model learns how to transform random noise into a fake image.
A Discriminator (D) model then tries to determine if the fake image
is real or fake. Over time both models are trained until the fake
images are indistinguishable from the real images. An Adversarial
Learned Inference (ALI) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] model is an adaptation of a GAN that
can be exploited for anomaly detection. The ALI model modifies a
standard GAN by adding an Encoder (E) that simultaneously learns
to generate a latent input vector that will allow the Generator (G)
model to fool the Discriminator (D) model. To generate
discriminative features in DAART, we use an approach based on
GANomaly [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], an extension of ALI.1
      </p>
      <p>GANomaly extends ALI to perform anomaly detection as part of
the feature extraction process. This approach consists of three
subnetworks: A Generator (G), Encoder (E), and Discriminator (D).
During the training process, the GANomaly model is trained only
on normal images, in essence learning a model of normalcy. The
images (video frames, or vectors from other sensor types) are fed
into the Generator, which learns two things: (1) a lower dimensional
mapping (z), and (2) to reconstruct the image (xʹ). The reconstructed
image is then fed into both the Encoder and Discriminator. The
Encoder learns a second lower dimensional mapping of the
reconstructed image (zʹ), and the Discriminator learns to tell the
difference between real and fake images. During the training
process, the Generator learns by minimizing the loss between the
original image and the fake image (x - xʹ). The Encoder learns by
minimizing the loss between the first and second lower dimensional
spaces (z - zʹ).</p>
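      <p>As a rough illustrative sketch (not the authors' implementation; the stand-in arrays and function names are ours), the two training objectives described above can be written as a reconstruction term and a latent-consistency term:</p>
      <preformat>
```python
import numpy as np

def ganomaly_losses(x, x_recon, z, z_recon):
    """Sketch of the two GANomaly training terms: the Generator
    minimizes the image reconstruction error (x - x'), and the
    Encoder minimizes the latent error (z - z')."""
    recon_loss = np.abs(x - x_recon).mean()   # contextual loss over pixels
    latent_loss = np.abs(z - z_recon).mean()  # loss between z and z'
    return recon_loss, latent_loss

# Stand-in tensors: a "frame", its reconstruction, and the two latent codes.
x, x_recon = np.zeros((64, 64)), np.full((64, 64), 0.1)
z, z_recon = np.zeros(128), np.full(128, 0.5)
rl, ll = ganomaly_losses(x, x_recon, z, z_recon)
```
      </preformat>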
      <p>Once trained, GANomaly is used in live operations to test each
input vector (e.g. video frame) to compute an anomaly score. Since
the model was trained only on normal behavior, an anomaly score
that determines whether an input is anomalous or not can be
computed as the L1 distance between the lower dimensional
mapping learned by the generator (z) and the mapping the encoder
learns from the reconstructed image (zʹ), normalized to a fixed range.</p>
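      <p>The scoring step can be sketched as follows; the min-max rescaling to [0, 1] is an assumption drawn from the GANomaly formulation rather than a detail given here:</p>
      <preformat>
```python
import numpy as np

def anomaly_score(z, z_recon):
    """Raw anomaly score: distance between the latent mapping of the
    input (z) and that of its reconstruction (z'). Trained only on
    normal data, a large score indicates an anomalous input."""
    return float(np.linalg.norm(z - z_recon, ord=1))

def normalize_scores(scores):
    """Min-max rescale a batch of raw scores to [0, 1] so a single
    threshold can be applied across readings."""
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    if hi == lo:
        return np.zeros_like(scores)
    return (scores - lo) / (hi - lo)
```
      </preformat>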
    </sec>
    <sec id="sec-8">
      <title>3.2 Scopes and Normalcy Model</title>
      <p>Scopes define specific filters on sensor and facility conditions that
the user wants the anomaly detection system to be restricted to
when discovering anomalies. When a Scope is specified, all data
that matches the defined filters will be processed by the anomaly
detection algorithm, which calculates an n-dimensional probability
distribution of the data over the features learned during feature
extraction. The result is a baseline normalcy model defining what
sensor data is considered normal activity.</p>
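      <p>A minimal sketch of such a baseline, assuming a diagonal Gaussian over the learned feature space (the text specifies only an n-dimensional probability distribution, so the Gaussian form is illustrative):</p>
      <preformat>
```python
import numpy as np

class NormalcyModel:
    """Baseline normalcy model over the learned feature space.
    The diagonal Gaussian here is an illustrative assumption."""

    def fit(self, features):
        # features: (num_observations, n) array of extracted features
        self.mean = features.mean(axis=0)
        self.var = features.var(axis=0) + 1e-8  # avoid division by zero
        return self

    def log_likelihood(self, x):
        """Log density of observation x under the fitted model;
        lower values mean x looks less like normal activity."""
        return -0.5 * np.sum(np.log(2 * np.pi * self.var)
                             + (x - self.mean) ** 2 / self.var, axis=-1)
```
      </preformat>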
      <p>
        This baseline normalcy model is incrementally updated each
time new sensor data is ingested into the DAART system. We use
this normalcy model to compute a strangeness metric proposed in
prior work [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]: incoming sensor data is compared against the
baseline model to determine how strange the observation is
compared to normal behavior. Once this strangeness metric has
been calculated for each individual sensor modality, the metrics are
merged across all modalities to determine whether an incident
observed by multiple sensors is anomalous.
      </p>
      <p>1 We performed a qualitative comparison of GANomaly and ALI and found that GANomaly demonstrated superior results and stability in the training phase.</p>
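      <p>The per-modality strangeness and the cross-modality merge can be sketched as follows; the z-score-style metric and the weighted-mean fusion rule are illustrative stand-ins, not the transduction-based metric of [2]:</p>
      <preformat>
```python
import numpy as np

def strangeness(observation, baseline):
    """Illustrative strangeness metric: distance of a new observation
    from the baseline mean, in units of baseline spread. The cited
    transduction-based metric is more involved; this z-score stand-in
    only mirrors the behavior described above."""
    dev = (observation - baseline["mean"]) / np.sqrt(baseline["var"])
    return float(np.linalg.norm(dev))

def merge_modalities(scores, weights=None):
    """Fuse per-modality strangeness scores into one incident-level
    score; the weighted mean is an assumed fusion rule."""
    scores = np.asarray(scores, dtype=float)
    w = np.ones_like(scores) if weights is None else np.asarray(weights, dtype=float)
    return float((scores * w).sum() / w.sum())
```
      </preformat>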
    </sec>
    <sec id="sec-9">
      <title>3.3 Interactive Threat Classification</title>
      <p>When the DAART system identifies anomalous activity, it alerts the
user. An example of the DAART system upon identifying
anomalous activity is shown in Figure 2.</p>
      <p>Figure 2 (right) shows the alert, which includes a video clip of
the anomalous activity and a timestamp at which the activity occurs.
Users interact with an alert to either specify that it is “Not a threat”
or provide a threat class for it. In addition to viewing the clip, the
system also explains the anomaly using a timeseries chart showing
the strangeness score of the anomaly compared to prior readings as
shown by Figure 2 (left). Users can additionally modify the
strangeness threshold above which an anomaly is alerted. We
propose additional methods for explanation and user feedback in the
following section.</p>
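      <p>The user-adjustable strangeness threshold amounts to a simple gate over the timeseries of scores; a minimal sketch:</p>
      <preformat>
```python
def alert_indices(scores, threshold):
    """Gate a timeseries of strangeness scores with the user-adjustable
    threshold: only readings above it are surfaced as alerts."""
    return [i for i, s in enumerate(scores) if s > threshold]
```
      </preformat>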
    </sec>
    <sec id="sec-10">
      <title>4 Explanation Methods</title>
    </sec>
    <sec id="sec-11">
      <title>4.1 Localization (Attention)</title>
      <p>For operators to better understand the anomalies and threats that
DAART alerts them to, it would be ideal to be able to isolate which
part of the sensor reading was anomalous. In the case of EO video,
this could mean showing the user a bounding box which identifies
where in the video stream the anomalous activity is occurring. This
type of functionality is extremely valuable in helping the operator
decide what threat label to assign to new unknown threats, and to
help them better determine what course of action is reasonable in
response to a threat.</p>
      <p>Because of the fully unsupervised GAN-based approach DAART
uses for anomaly detection, localization of the anomalous activity
in sensor readings is non-trivial. Unlike many supervised
approaches, in which the detection of specific objects or actions are
triggers for anomaly or threat alerts, the current GANomaly-based
unsupervised approach uses a more context-oriented approach
which examines the entire sensor reading at once.</p>
      <p>
        Recently, however, because of the popularity of GAN-based
techniques for unsupervised machine learning, approaches have
been developed for fully unsupervised object detection and
localization using GANs [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. These approaches use introspection of
the hidden layers of the GAN feature extractor, mapped back to the
original input space, to identify which areas in the input data are
contributing to the network recognizing an object.
      </p>
      <p>We propose to integrate this introspection-based approach into
DAART's unsupervised GAN-based feature learning, allowing anomalies
detected using those learned features to be localized. The
operator-facing DAART explanation interface will then be updated to show
which parts of an anomalous sensor reading are most responsible
for the reading being considered anomalous.</p>
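      <p>As a hedged stand-in for the proposed GAN-introspection method [4], a model-agnostic occlusion scan produces the same kind of output: a map of how much each input region contributes to the anomaly score.</p>
      <preformat>
```python
import numpy as np

def occlusion_heatmap(frame, score_fn, patch=8):
    """Model-agnostic localization sketch (an assumption, not the
    proposed introspection technique): occlude one region at a time
    and record how much the anomaly score drops. score_fn maps a
    frame to a scalar anomaly score."""
    base = score_fn(frame)
    h, w = frame.shape
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            masked = frame.copy()
            masked[i:i + patch, j:j + patch] = 0.0  # occlude one region
            # how much the score drops when this region is hidden
            heat[i // patch, j // patch] = base - score_fn(masked)
    return heat
```
      </preformat>
      <p>The cell with the largest drop marks the region most responsible for the reading being flagged, which can be drawn as a bounding box.</p>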
    </sec>
    <sec id="sec-12">
      <title>4.2 Normalcy Exemplar</title>
      <p>We propose to generate normalcy exemplars that can be displayed
to operators to compare against detected threats. These normalcy
explanations allow the system to describe what the
situation typically looks like to help explain why a new instance is
deemed anomalous or threatening.</p>
      <p>We propose two techniques for generating normalcy exemplars
for this comparison. The first technique is to simply determine the
existing normal exemplar (non-anomalous prior reading) that is
most similar to the anomalous input using the feature space. A
side-by-side view displays that exemplar sensor reading against the
detected anomalous reading for comparison. Furthermore,
bounding boxes can be added to highlight differences in the
anomalous input by utilizing the localization information. One
limitation to this technique is that not all differences between the
normal and anomalous scenes are important. The second, and more
complex, proposed technique accounts for this limitation by
generating a synthetic “normal” feature vector that is similar to the
anomalous reading, but without the features that make it an
anomaly. The GAN then generates a synthetic sensor reading from
this feature vector. In this case, the only difference between the two
displayed exemplars are the elements of the input that make it
anomalous.</p>
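      <p>The first technique, nearest-neighbor retrieval in the learned feature space, can be sketched as follows (Euclidean distance is an assumed choice):</p>
      <preformat>
```python
import numpy as np

def nearest_normal_exemplar(anomaly_features, normal_features):
    """Retrieve the index of the non-anomalous prior reading whose
    feature vector is closest to the anomalous input, for the
    side-by-side comparison view."""
    distances = np.linalg.norm(normal_features - anomaly_features, axis=1)
    return int(np.argmin(distances))
```
      </preformat>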
      <p>We can similarly use the GAN to determine normalcy exemplars
for other data types, such as audio and acoustic sensor data, but a
challenge of this task will be determining appropriate ways to
expose this information to operators. Audio can be handled
similarly to imagery, for example, by providing two audio clips the
operator can listen to for comparison. However, for the other data
types, we will work with operators to determine what view of each
modality fits best into their existing threat detection workflow as
part of this task.</p>
    </sec>
    <sec id="sec-13">
      <title>4.3 Interactive Explanation Interface</title>
      <p>
        In addition to the explanation information discussed in prior
sections, prior work has introduced simple explanation types [
        <xref ref-type="bibr" rid="ref13 ref17">13,17</xref>
        ]
shown to improve end user understanding of complex algorithm
processes. These types include the classification, system confidence
in the classification, human-understandable features of the
classifier, and alternate classifications. We propose to implement a
set of explanation information within a DAART interactive
explanation interface to best support system transparency and user
feedback. Figure 3 is a notional representation of a sample anomaly
input and the explanation information that might be displayed to
the user. How many and which of these explanation types to display
to the user must be chosen to maximize transparency without
overwhelming or confusing the operator [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. In particular, prior
work has shown that confidence should only be displayed when it
is high or else will result in negative impacts on trust [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. We
propose a user study for evaluating these explanation types in the
following section.
      </p>
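      <p>A minimal sketch of assembling the alert-level explanation fields under the cited guidance that confidence is shown only when high [16]; the 0.9 floor and the field names are illustrative assumptions:</p>
      <preformat>
```python
def explanation_payload(threat_class, confidence, alternatives,
                        confidence_floor=0.9):
    """Assemble the alert-level explanation fields described above.
    Confidence is included only when it is high; the 0.9 floor and
    the field names are illustrative."""
    payload = {"threat_class": threat_class, "alternatives": alternatives}
    if confidence >= confidence_floor:
        payload["confidence"] = confidence
    return payload
```
      </preformat>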
      <p>[Figure 3: Notional explanation interface for a detected anomaly, showing a normal exemplar beside the detected anomaly, the system confidence (99.71%), the assigned threat class ("Smoke Bomb"), alternative classes (smoke bomb, fire, fog), a localization (attention) view, and a threat cluster graph with clusters for fog, smoke bomb, and fire.]</p>
      <p>
        These explanation presentations additionally provide an
intuitive means for user feedback [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], which yields more and better
feedback from the user in the loop. Operators can provide alert-level
feedback to correct classifications by interacting with the assigned
classification (accepting or rejecting the class) and furthermore by
interacting with the alternative classes to select the correct class if
it exists. Operators can provide system-level feedback by interacting
with the features or localization information from the input that
resulted in the classification. Additionally, as the DAART tool
utilizes clusters of previously classified anomalies to perform
classification, we propose to expose these clusters to the user as part
of the interactive explanation interface. This view provides a global
explanation of all threat data and how it is understood by the
system. An operator may notice that two separate clusters can be
merged to represent a single threat type or that a single cluster
should be split to represent two distinct threat types.
      </p>
    </sec>
    <sec id="sec-14">
      <title>5 User Study</title>
      <p>We outline a user study of our proposed explanation interface to
evaluate the effects of varied explanation information on trust,
feedback quality, and overall human-machine team performance.</p>
    </sec>
    <sec id="sec-15">
      <title>5.1 Research Questions</title>
      <p>
        The goal of the proposed study is to answer the following research
questions:
Q1: Which explanation information yields the optimal
human-machine team?
Similar to [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], we hope to determine a set of the explanation
information to provide to the user to maximize performance while
reducing system complexity.
      </p>
      <p>Q2: How is trust affected by varied explanation information?
As trust is particularly important to the adoption of a system such
as ours in the military domain, we intend to evaluate the effect of
varied explanation information on system trust.</p>
    </sec>
    <sec id="sec-16">
      <title>5.2 Method</title>
      <p>To support examination of the identified research questions, we
propose a crowdsourced user study. We will identify a dataset and
specific task that is representative of real system usage, but also
approachable to non-security experts. This might include video on
a street corner for which the human-machine team is tasked with
identifying suspicious behavior or video replay from a tower
defense-style game2 for which the human-machine team is tasked
with identifying aggressive behavior towards a base. In the study,
we will hold all aspects of the DAART system constant, but simply
vary the explanation information shown to the user during the task.
After the task we will evaluate the human-machine team
performance, as well as ask users to score the system in terms of
trust, frustration, and complexity. In this way, we can study the effects
of the various explanations on user experience.</p>
    </sec>
    <sec id="sec-17">
      <title>6 Conclusion</title>
      <p>In this paper we present an unsupervised anomaly detection system,
DAART, that identifies anomalies from normal behavior and
classifies those anomalies as threats through interaction with
operators. We additionally propose an explanation interface
towards the goal of a DAART system that users not only
understand and trust but that is also maximally accurate due to increased,
improved user feedback. Our proposed user study aims to evaluate
the explanation interface to increase effectiveness and reduce
complexity.</p>
    </sec>
    <sec id="sec-18">
      <title>ACKNOWLEDGMENTS</title>
      <p>This work was supported by AFRL contract FA8650-18-P-1628 and
advised by Dr. Olga Mendoza-Schrock and Mr. Todd Rovito. This
publication was cleared for public release via 88ABW-2019-0665.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] Samet Akcay, Amir Atapour-Abarghouei, and Toby P. Breckon. 2018. GANomaly: Semi-Supervised Anomaly Detection via Adversarial Training. 1-16. arXiv:1805.06725v3</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] Daniel Barbará, Carlotta Domeniconi, and James P. Rogers. 2006. Detecting outliers using transduction and statistical testing. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '06). https://doi.org/10.1145/1150402.1150413</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] Or Biran and Courtenay Cotton. 2017. Explanation and Justification in Machine Learning: A Survey. 1st Workshop on Explainable Artificial Intelligence.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] Junsuk Choe, Joo Hyun Park, and Hyunjung Shim. 2018. Generative Adversarial Networks for Unsupervised Object Co-localization. Retrieved from http://arxiv.org/abs/1806.00236</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] Jeff Donahue and Trevor Darrell. 2017. Adversarial Feature Learning. ICLR.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] Alexey Dosovitskiy, Philipp Fischer, Jost Tobias Springenberg, Martin Riedmiller, and Thomas Brox. 2016. Discriminative unsupervised feature learning with exemplar convolutional neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence. https://doi.org/10.1109/TPAMI.2015.2496141</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] Jerry Alan Fails and Dan R. Olsen. 2003. Interactive machine learning. In Proceedings of the 8th International Conference on Intelligent User Interfaces (IUI '03), 39-45. https://doi.org/10.1145/604045.604056</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] Rebecca Fiebrink, Dan Trueman, and Perry R. Cook. 2009. A meta-instrument for interactive, on-the-fly machine learning. In Proceedings of New Interfaces for Musical Expression (NIME), 3.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Ian</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jean</given-names>
            <surname>Pouget-Abadie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Mehdi</given-names>
            <surname>Mirza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Bing</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>David</given-names>
            <surname>Warde-Farley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sherjil</given-names>
            <surname>Ozair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Aaron</given-names>
            <surname>Courville</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Yoshua</given-names>
            <surname>Bengio</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Generative Adversarial Nets</article-title>
          .
          <source>Advances in Neural Information Processing Systems</source>
          <volume>27</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Dave</given-names>
            <surname>Gunning</surname>
          </string-name>
          .
          <article-title>Explainable Artificial Intelligence (XAI)</article-title>
          .
          <source>Defense Advanced Research Projects Agency (DARPA)</source>
          . Retrieved October 10
          ,
          <year>2018</year>
          from https://www.darpa.mil/program/explainable-artificial-intelligence
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Yuening</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jordan</given-names>
            <surname>Boyd-Graber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Brianna</given-names>
            <surname>Satinoff</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Alison</given-names>
            <surname>Smith</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Interactive topic modeling</article-title>
          .
          <source>Machine Learning</source>
          <volume>95</volume>
          ,
          <issue>3</issue>
          :
          <fpage>423</fpage>
          -
          <lpage>469</lpage>
          . https://doi.org/10.1007/s10994-013-5413-0
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Alex</given-names>
            <surname>Krizhevsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ilya</given-names>
            <surname>Sutskever</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Geoffrey E.</given-names>
            <surname>Hinton</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>ImageNet Classification with Deep Convolutional Neural Networks</article-title>
          .
          <source>Advances in Neural Information Processing Systems</source>
          <volume>25</volume>
          (
          <issue>NIPS2012</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Todd</given-names>
            <surname>Kulesza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Margaret</given-names>
            <surname>Burnett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Weng-Keen</given-names>
            <surname>Wong</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Simone</given-names>
            <surname>Stumpf</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Principles of Explanatory Debugging to Personalize Interactive Machine Learning</article-title>
          .
          <source>In Proceedings of the 20th International Conference on Intelligent User Interfaces - IUI '15</source>
          ,
          <fpage>126</fpage>
          -
          <lpage>137</lpage>
          . https://doi.org/10.1145/2678025.2701399
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Todd</given-names>
            <surname>Kulesza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Simone</given-names>
            <surname>Stumpf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Margaret</given-names>
            <surname>Burnett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sherry</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Irwin</given-names>
            <surname>Kwan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Weng-Keen</given-names>
            <surname>Wong</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Too much, too little, or just right? Ways explanations impact end users' mental models</article-title>
          .
          <source>In Proceedings of IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC)</source>
          ,
          <fpage>3</fpage>
          -
          <lpage>10</lpage>
          . https://doi.org/10.1109/VLHCC.2013.6645235
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Hanseung</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jaeyeon</given-names>
            <surname>Kihm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jaegul</given-names>
            <surname>Choo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>John</given-names>
            <surname>Stasko</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Haesun</given-names>
            <surname>Park</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>iVisClustering: An Interactive Visual Document Clustering via Topic Modeling</article-title>
          .
          <source>Computer Graphics Forum</source>
          <volume>31</volume>
          ,
          <issue>3pt3</issue>
          :
          <fpage>1155</fpage>
          -
          <lpage>1164</lpage>
          . https://doi.org/10.1111/j.1467-8659.2012.03108.x
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Brian Y.</given-names>
            <surname>Lim</surname>
          </string-name>
          and
          <string-name>
            <given-names>Anind K.</given-names>
            <surname>Dey</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Investigating intelligibility for uncertain context-aware applications</article-title>
          .
          <source>In Proceedings of the 13th international conference on Ubiquitous computing - UbiComp '11</source>
          ,
          <fpage>415</fpage>
          . https://doi.org/10.1145/2030112.2030168
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Brian Y.</given-names>
            <surname>Lim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Anind K.</given-names>
            <surname>Dey</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Avrahami</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Why and why not explanations improve the intelligibility of context-aware intelligent systems</article-title>
          .
          <source>Proceedings of the 27th International Conference on Human Factors in Computing Systems - CHI '09</source>
          :
          <fpage>2119</fpage>
          -
          <lpage>2129</lpage>
          . https://doi.org/10.1145/1518701.1519023
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Hema</given-names>
            <surname>Raghavan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Madani</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Rosie</given-names>
            <surname>Jones</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Active learning with feedback on features and instances</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Al M.</given-names>
            <surname>Rashid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kimberly</given-names>
            <surname>Ling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Regina D.</given-names>
            <surname>Tassone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Paul</given-names>
            <surname>Resnick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Robert</given-names>
            <surname>Kraut</surname>
          </string-name>
          , and
          <string-name>
            <given-names>John</given-names>
            <surname>Riedl</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Motivating Participation by Displaying the Value of Contribution</article-title>
          .
          <source>In Proceedings of the SIGCHI conference on Human Factors in computing systems</source>
          ,
          <fpage>955</fpage>
          -
          <lpage>958</lpage>
          . https://doi.org/10.1145/1124772.1124915
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Stephanie L.</given-names>
            <surname>Rosenthal</surname>
          </string-name>
          and
          <string-name>
            <given-names>Anind K.</given-names>
            <surname>Dey</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Towards Maximizing the Accuracy of Human-Labeled Sensor Data</article-title>
          .
          <source>In Proceedings of the International Conference on Intelligent User Interfaces</source>
          ,
          <fpage>259</fpage>
          -
          <lpage>268</lpage>
          . https://doi.org/10.1145/1719970.1720006
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Burr</given-names>
            <surname>Settles</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Closing the loop: Fast, interactive semi-supervised annotation with queries on features and instances</article-title>
          .
          <source>In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Shilman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Desney S.</given-names>
            <surname>Tan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Patrice</given-names>
            <surname>Simard</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>CueTIP: a mixed-initiative interface for correcting handwriting errors</article-title>
          .
          <source>Proceedings of the ACM Symposium on User Interface Software and Technology</source>
          . https://doi.org/10.1145/1166253.1166304
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Kimberly</given-names>
            <surname>Stowers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Nicholas</given-names>
            <surname>Kasdaglis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Michael</given-names>
            <surname>Rupp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jessie</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Daniel</given-names>
            <surname>Barber</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Barnes</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Insights into human-agent teaming: Intelligent agent transparency and uncertainty</article-title>
          .
          <source>In Advances in Intelligent Systems and Computing</source>
          ,
          <fpage>149</fpage>
          -
          <lpage>160</lpage>
          . https://doi.org/10.1007/978-3-319-41959-6_13
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <collab>The European Parliament and The Council of The European Union</collab>
          .
          <year>2016</year>
          .
          <article-title>General Data Protection Regulation</article-title>
          . http://eurlex.europa.eu/pri/en/oj/dat/2003/l_285/l_28520031101en00330037.pdf
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Kevin</given-names>
            <surname>Ward</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jack</given-names>
            <surname>Davenport</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Human-machine interaction to disambiguate entities in unstructured text and structured datasets</article-title>
          .
          <source>In SPIE Conference on Next-Generation Analyst V.</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>