<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Explainable AI User Interface for Facilitating Collaboration between Domain Experts and AI Researchers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Meng Shi</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Celal Savur</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elizabeth Watkins</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ramesh Manuvinakurike</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gesem Gudino Mejia</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Richard Beckwith</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giuseppe Raffa</string-name>
        </contrib>
        <aff>Intel Labs, Intel Corporation</aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <fpage>26</fpage>
      <lpage>28</lpage>
      <abstract>
        <p>The adoption of AI approaches is increasing across domains, yet the study of explainable AI user interfaces for domain experts remains limited. One potential benefit provided by this type of user interface is to facilitate better data collection and improve in-domain model training. With the advancement of explainable AI (XAI) methods for end-users, domain experts can more easily collaborate with AI researchers and contribute to the process of building and deploying domain-relevant AI models. We propose an XAI interface for domain experts in a manufacturing setting that provides transparency into multi-modal AI systems and supports domain expert collaboration with AI researchers to fine-tune models through active feedback. In this paper, we report early findings of a user study with this XAI interface, with participants including both domain experts and AI researchers. These early findings hold promise for supporting improved system understanding by end users as well as cross-functional collaboration between domain experts and AI researchers.</p>
      </abstract>
      <kwd-group>
        <kwd>Explainable User Interface</kwd>
        <kwd>XAI for transparency</kwd>
        <kwd>XAI for human computer collaboration</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The field of artificial intelligence (AI) has progressed remarkably in recent years. This
progress enables new services, such as AI assistance systems, which will be adopted not only by
households but also by professionals. AI support can streamline operations, support human
performance in working environments, and augment human capabilities across tasks.</p>
      <p>However, AI assistance systems in these areas encounter very real problems: a limited
amount of data for training, difficulty incorporating domain knowledge, and challenges in
fine-tuning the models.</p>
      <p>Having domain experts collaborate with AI researchers in the development of the AI
assistance system could potentially help alleviate the problems mentioned above, since expert
feedback is a rich source of data that incorporates domain knowledge. However, such
collaboration requires the domain expert to also be familiar with the design and back-end
workings of such AI assistance systems, which is almost impossible without an explainable AI
approach.</p>
      <p>In this paper, we report a user interface designed to support explainable AI (XAI) for a complex
multi-modal AI assistance system with a goal of facilitating collaboration between domain
experts in manufacturing (e.g., technicians on the factory floor) and developers working on
multimodal ambient sensing systems for human performance and task support.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        In the first few years of its history, the XAI field focused on developing algorithms to explain or
interpret the behavior of "black box" models [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] by collecting and displaying information about,
for example, parameters or weights which were especially significant in the production of a
prediction or output. Most of this work presumed the consumers of explanations to be AI
engineers or researchers looking to fine-tune or improve their own models. More recently, a
subset of the field has taken a "human-centered" turn, focusing in part on how non-AI researchers
might also benefit from XAI [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1,2,3</xref>
        ]. This turn included tailoring XAI interfaces and form factors
for end-users rather than AI researchers [
        <xref ref-type="bibr" rid="ref5 ref6">5,6</xref>
        ].
      </p>
      <p>Our contribution builds on this work to facilitate collaboration between two groups: AI
researchers and one type of end-user, namely users who have expert-level knowledge of their
domain of work, i.e., domain experts. Facilitating collaboration between these two groups can
address key challenges of data availability, data quality, and the resource demands of model
fine-tuning. We propose an XAI interface through which domain experts collaborate with AI
researchers to give direct input into AI models. A key innovation is the step of actionability: we
intend our UI to support end-user action and cross-functional collaboration, in the form of
domain experts providing feedback on model performance, to improve the system.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Task Guidance System</title>
      <p>We developed a proof-of-concept (POC) XAI user interface prototype for a multi-modal AI
system providing performance support in a manufacturing scenario. As this prototype is a kind of
perceptually guided task guidance system, we call it “Task Guidance System” (TGS). TGS, as shown
in Fig. 1, is designed to facilitate collaboration between domain experts and AI researchers in
fine-tuning the various components of the multi-modal AI system.</p>
      <p>
        The TGS visualizes the multiple models used by the multi-modal AI assistance system for Action
Recognition (MS-TCN [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]), Automatic Speech Recognition (Whisper [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]), Intent Recognition (BERT
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]), etc., which infer the user’s actions and predict their next steps. The visualization is
presented as a dashboard that applies modular design principles to all the AI models in use.
The TGS can help domain experts understand the AI models: with TGS, domain experts can
see how their inputs are taken up by the models and compare the predictions generated by each
model to the phenomena those predictions represent. This is possible because TGS visualizes how
each model analyzes its inputs and presents each model’s output.
      </p>
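To make the modular-dashboard idea concrete, the following is a minimal sketch of how each model could be wrapped in a uniform "panel" so its input and output appear side by side for a given moment in time. All class and field names here are illustrative assumptions, not the actual TGS implementation.

```python
# Hypothetical sketch of a modular dashboard: each AI model is wrapped in a
# uniform "panel" so its inputs and outputs can be displayed side by side.
# Names and example values are illustrative assumptions, not the TGS code.

from dataclasses import dataclass


@dataclass
class ModelPanel:
    name: str        # e.g. "Action Recognition (MS-TCN)"
    input_view: str  # what the model consumed (frame time, transcript, ...)
    output: str      # the model's prediction at this moment
    score: float     # model confidence, shown to the domain expert


def render_dashboard(panels):
    """Render one time-aligned snapshot of every model's input and output."""
    lines = []
    for p in panels:
        lines.append(f"[{p.name}] in: {p.input_view} | out: {p.output} ({p.score:.2f})")
    return "\n".join(lines)


snapshot = [
    ModelPanel("Action Recognition (MS-TCN)", "video t=12.4s", "tighten bolt", 0.61),
    ModelPanel("Speech Recognition (Whisper)", "audio t=12.4s", "pass me the torque wrench", 0.93),
    ModelPanel("Intent Recognition (BERT)", "transcript", "request_tool", 0.88),
]
print(render_dashboard(snapshot))
```

Keeping every model behind the same small interface is what lets the dashboard show the entire data flow, including cases where one model's output feeds another's input.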
      <p>In addition, TGS allows domain experts with ground-truth knowledge to critique these
predictions through its dialogue system and thereby lets AI researchers fine-tune the AI models.
To support this collaboration, prompts encourage users to provide feedback when the models
request it. These prompts include an uncertainty score for vision predictions, an importance score
noting which words in an utterance were most indicative of a match, highlighted inconsistencies
between different models, etc.</p>
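The prompting logic described above can be sketched roughly as follows. This is a hypothetical illustration under assumed names and thresholds (`build_feedback_prompts`, the 0.5 uncertainty cutoff, and the example values are all inventions for exposition, not the authors' implementation).

```python
# Hypothetical sketch: deciding when a TGS-style dashboard might prompt a
# domain expert for feedback. Function name, thresholds, and example values
# are illustrative assumptions, not the authors' implementation.

def build_feedback_prompts(vision_pred, intent_pred, uncertainty, word_importance):
    """Collect prompts asking the domain expert to confirm or correct predictions."""
    prompts = []

    # Prompt when the vision model's uncertainty score is high.
    if uncertainty > 0.5:
        prompts.append(
            f"Vision model is unsure (uncertainty {uncertainty:.2f}): "
            f"is the current action really '{vision_pred}'?"
        )

    # Surface the words most indicative of the intent match (importance scores).
    top_words = sorted(word_importance, key=word_importance.get, reverse=True)[:3]
    prompts.append(
        f"Intent '{intent_pred}' was matched mainly on: {', '.join(top_words)}. "
        "Does that match your meaning?"
    )

    # Highlight inconsistencies between different models.
    if vision_pred != intent_pred:
        prompts.append(
            f"The vision model sees '{vision_pred}' but the speech intent is "
            f"'{intent_pred}'. Which is correct?"
        )
    return prompts


prompts = build_feedback_prompts(
    vision_pred="pick up screwdriver",
    intent_pred="pick up wrench",
    uncertainty=0.72,
    word_importance={"wrench": 0.9, "pick": 0.4, "up": 0.1},
)
for p in prompts:
    print(p)
```

The design intent is that the system, not the expert, initiates the exchange: feedback is requested exactly where the models are uncertain or disagree, which is where expert ground truth is most valuable.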
      <p>The contribution of this work is significant because the obstacles mentioned earlier may hold us
back from building systems that can help humans achieve their full potential. The XAI UI we
propose is intended to directly address these obstacles.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Methods: Two User Studies</title>
      <p>The XAI UI was evaluated remotely with two user studies, one with domain experts and one with
AI researchers. Wizard-of-Oz presentations of the UI were combined with qualitative interview
protocols for both studies.</p>
      <p>The first study was conducted with two domain experts interacting with the AI assistance
system, including the XAI UI. Both experts had several months of experience working with the AI
system (without the XAI UI). A semi-structured interview protocol was designed and carried out
to elicit and capture data about the domain experts' cognitive model of the system and TGS,
whether the information displayed had any impact on their understanding of model predictions,
and their willingness to provide feedback to AI researchers.</p>
      <p>The second study looked at three AI researchers experienced in multi-modal AI assistance
systems. A second semi-structured interview protocol was designed to elicit and capture
perceptions of the UI, what domain knowledge might be important for their goals as AI
researchers, and how they feel this knowledge can be adequately captured from domain experts
through TGS. All interview results were analyzed collaboratively by two of the authors using
thematic analysis.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>All five interviews took place remotely, via a video conferencing platform. Most interviews
took between 30 and 90 minutes, with one outlier Wizard-of-Oz and interview session taking
roughly two hours.</p>
      <sec id="sec-5-1">
        <title>AI researchers reported that they anticipate deriving important benefits from domain experts' feedback, especially in domains of high dynamism, complexity, or those with diverse users.</title>
        <p>AI researchers confirmed that they could derive value from capturing domain experts'
knowledge. They expressed that this data could be particularly valuable for fine-tuning models in
domains characterized by high degrees of change, or domains or applications that may be made
more complex by the need to serve multiple users or multiple different types of users.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Aligned multi-modal data streams are valuable when presented together.</title>
        <p>AI researchers report that it is important to see the entire data flow from domain experts' raw
performance data to the final output. For example, in a multi-modal AI system, one model's
output may rely on another's output.</p>
      </sec>
      <sec id="sec-5-3">
        <title>It is valuable to have direct access to domain experts' feedback.</title>
        <p>Domain experts demonstrated a willingness to help AI researchers using TGS, and identified
novel challenges AI researchers might face in negotiating a space of multiple or changing
methods or objects. All domain experts reported that comprehending the information in the UI
was challenging. Even so, domain expert interaction with TGS provided valuable feedback for AI
researchers. For example, domain experts pointed out instances where the task model was too
specific and pointed to variants on the task that should be supported by the AI assistance system.
Without access to the nuanced domain knowledge held by domain experts, AI researchers
fine-tuning their models could risk incorrectly labeling actions or objects that they might not
know are acceptable alternative methods for accomplishing a task. We found that when domain
experts interpret data differently than AI researchers, XAI information can reveal to domain
experts that their input is needed to rectify an inaccurate model prediction.</p>
      </sec>
      <sec id="sec-5-4">
        <title>We also found that domain experts' curiosity about model inferences was increased through their exposure to the XAI interface.</title>
        <p>When domain experts saw incorrect model predictions, they verbally challenged the
discrepancies between the prediction and their understanding of the correctness of the process
being analyzed: one domain expert asked, unprompted, “Why does [the UI] think I'm holding
[item] when I'm clearly holding [a different item]?” when shown the TGS UI. We believe this
suggests that further engagement will bring an increased willingness to engage with and correct
inaccurate predictions. This was confirmed by explicit inquiries about their willingness to
provide feedback: both domain experts expressed that they would be both willing and
enthusiastic to provide input back to AI researchers via the interface.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Future Work</title>
      <p>This paper presents our ongoing XAI dashboard work and the findings of an early user study.
A larger-scale user study will be performed to more robustly evaluate the visual content and
affordances of the UI and dashboard. We plan to design additional XAI visualizations across
multiple form factors, including popular approaches such as heatmaps. We also plan
visualizations plotting prediction history to identify outlier predictions and generate user-facing
alerts. Our plans also include providing more options to edit models directly through TGS, which
we hope will make fine-tuning models faster and more flexible. We hope our application can
support scaling systems more quickly and easily across domains.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] U. Ehsan, P. Wintersberger, E. A. Watkins, C. Manger, G. Ramos, J. D. Weisz, H. Daumé III, A. Riener, and M. O. Riedl, Human-Centered Explainable AI (HCXAI): Coming of Age. In Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems (CHI EA '23). Association for Computing Machinery, New York, NY, USA, Article 353, 1-7. https://doi.org/10.1145/3544549.3573832</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] U. Ehsan, P. Wintersberger, Q. V. Liao, E. A. Watkins, C. Manger, H. Daumé III, A. Riener, and M. O. Riedl, Human-Centered Explainable AI (HCXAI): Beyond Opening the Black-Box of AI. In Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems (CHI EA '22). Association for Computing Machinery, New York, NY, USA, Article 109, 1-7. https://doi.org/10.1145/3491101.3503727</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] U. Ehsan and M. O. Riedl, Human-centered Explainable AI: Towards a Reflective Sociotechnical Approach. arXiv preprint arXiv:2002.01092. https://doi.org/10.48550/arXiv.2002.01092</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] F. Emmert-Streib, O. Yli-Harja, and M. Dehmer, Explainable Artificial Intelligence and Machine Learning: A reality rooted perspective. arXiv preprint arXiv:2001.09464. https://doi.org/10.48550/arXiv.2001.09464</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] S. Kim, S. Choo, D. Park, H. Park, C. S. Nam, J. Y. Jung, and S. Lee, Designing an XAI interface for BCI experts: A contextual design for pragmatic explanation interface based on domain knowledge in a specific context. International Journal of Human Computer Studies, 174, 103009. https://doi.org/10.1016/j.ijhcs.2023.103009</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] S. S. Kim, E. A. Watkins, O. Russakovsky, R. Fong, and A. Monroy-Hernández, "Help Me Help the AI": Understanding How Explainability Can Support Human-AI Interaction. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23). Association for Computing Machinery, New York, NY, USA, Article 250, 1-17. https://doi.org/10.1145/3544548.3581001</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] Y. A. Farha and J. Gall, MS-TCN: Multi-stage temporal convolutional network for action segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), pp. 3575-3584. https://doi.org/10.48550/arXiv.1903.01945</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] A. Radford, J. W. Kim, T. Xu, G. Brockman, C. McLeavey, and I. Sutskever, Robust speech recognition via large-scale weak supervision. In International Conference on Machine Learning (2023), pp. 28492-28518. PMLR. https://doi.org/10.48550/arXiv.2212.04356</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] J. Devlin, M. W. Chang, K. Lee, and K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018). https://doi.org/10.48550/arXiv.1810.04805</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>