<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ExDocS: Evidence based Explainable Document Search</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sayantan Polley</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Atin Janki</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marcus Thiel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juliane Hoebel-Mueller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Nuernberger</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Otto von Guericke University Magdeburg</institution>
          ,
          <addr-line>Universitätsplatz 2, 39106 Magdeburg, Germany - first authors with</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present an explainable document search system (ExDocS), based on a re-ranking approach, that uses textual and visual explanations to explain document rankings to non-expert users. ExDocS attempts to answer questions such as “Why is document X ranked at Y for a given query?”, “How do we compare multiple documents to understand their relative rankings?”. The contribution of this work is on re-ranking methods based on various interpretable facets of evidence such as term statistics, contextual words, and citation-based popularity. Contribution from the user interface perspective consists of providing intuitive accessible explanations such as: “document X is at rank Y because of matches found like Z” along with visual elements designed to compare the evidence and thereby explain the rankings. The quality of our re-ranking approach is evaluated on benchmark data sets in an ad-hoc retrieval setting. Due to the absence of ground truth of explanations, we evaluate the aspects of interpretability and completeness of explanations in a user study. ExDocS is compared with a recent baseline - explainable search system (EXS), that uses a popular posthoc explanation method called LIME. In line with the “no free lunch” theorem, we find statistically significant results showing that ExDocS provides an explanation for rankings that are understandable and complete but the explanation comes at the cost of a drop in ranking quality.</p>
      </abstract>
      <kwd-group>
        <kwd>Explainable Rankings</kwd>
        <kwd>XIR</kwd>
        <kwd>XAI</kwd>
        <kwd>Re-ranking</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <sec id="sec-1-1">
        <p>
          Explainability in Artificial Intelligence (XAI) is currently a vibrant research topic that attempts to make AI systems transparent and trustworthy to the concerned stakeholders. Research in the XAI domain is interdisciplinary but is primarily led by the development of methods from the machine learning (ML) community. From the classification perspective, e.g., in a diagnostic setting, a doctor may be interested to know how the prediction for a disease is made by the AI-driven solution. XAI methods in ML are typically based on exploiting features associated with a class label, development of add-on model-specific methods like LRP [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], model-agnostic ways such as LIME [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] or causality-driven methods [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. The explainability problem in IR is inherently different from a classification setting. In IR, the user may be interested to know how a certain document is ranked for the given query or why a certain document is ranked higher than others [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Often an explanation is an answer to a why question [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
        <p>In this work, Explainable Document Search (ExDocS), we focus on a non-web ad-hoc text retrieval setting and aim to answer the following research questions:
1. Why is a document X ranked at Y for a given query?
2. How do we compare multiple documents to understand their relative rankings?
3. Are the explanations provided interpretable and complete?</p>
        <p>
          There have been works [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], [7] in the recent past that attempted to address related questions such as "Why is a document relevant to the query?" by adapting XAI methods such as LIME [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] primarily for neural rankers. We argue that the idea of relevance has deeper connotations related to the semantic and syntactic notion of similarity in text. Hence, we try to tackle the XAI problem from a ranking perspective. Based on interpretable facets we provide a simple re-ranking method that is agnostic of the retrieval model. ExDocS provides local textual explanations for each document (Part D in Fig. 1). The re-ranking approach enables us to display the “math behind the rank” for each of the retrieved documents (Part E in Fig. 1). Besides, we also provide a global explanation in the form of a comparative view of multiple retrieved documents (Fig. 4).
        </p>
        <p>We discuss relevant work for explainable rankings in section two. We describe our contribution to the re-ranking approach and methods to generate explanations in section three. Next, in section four, we discuss the quantitative evaluation of rankings on benchmark data sets and a comparative qualitative evaluation with an explainable search baseline in a user study. To our knowledge, this is one of the first works comparing two explainable search systems in a user study. In section five, we conclude that ExDocS provides explanations that are interpretable and complete. The results are statistically significant in the Wilcoxon signed-rank test. However, the explanations come at a cost of reduced ranking performance, paving the way for future work. The ExDocS system is online<sup>1</sup> and the source code is available on request for reproducible research.</p>
        <p>The 1st International Workshop on Causality in Search and Recommendation (CSR’21), July 15, 2021, Online.
sayantan.polley@ovgu.de (S. Polley*); atin.janki@ovgu.de (A. Janki*); marcus.thiel@ovgu.de (M. Thiel); juliane.hoebel@ovgu.de (J. Hoebel-Mueller); andreas.nuernberger@ovgu.de (A. Nuernberger)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>The earliest attempts at making search results explainable can be seen in the visualization paradigms [8, 9, 10] that aimed at explaining term distribution and statistics. Mi and Jiang [11] noted that IR systems were among the earliest research fields to offer interpretations of system decisions and outputs, through search result summaries. The areas of product search [12] and personalized professional search [13] have explored explanations for search results by creating knowledge graphs based on users' logs. In [14], Melucci made a preliminary study and suggested that structural equation models from the causal perspective can be used to generate explanations for search systems. Related to explainability, the perspective of ethics and fairness [15, 16] is also often encountered in IR, whereby the retrieved data may be related to disadvantaged people or groups. In [17], a categorization of fairness in rankings is devised based on the use of pre-processing, in-processing, or</p>
      <p>Figure 2: Contribution of Query Terms for relevance</p>
      <p><sup>1</sup> https://tinyurl.com/ExDocSearch</p>
    </sec>
    <sec id="sec-4">
      <title>3. Concept: Re-ranking via Interpretable Facets</title>
      <p>The concept behind ExDocS is based on the re-ranking of interpretable facets of evidence such as term statistics, contextual words, and citation-based popularity. Each of these facets is also a selectable search criterion in the search interface. We have a motivation to provide a
• Contextual and Synonym Search: ‘contextual words’ (term-count of query words + expanded contextual words). Contextual words are word-embeddings + synonyms in this case.
• Keyword Search with Popularity score: ‘citation-based popularity’ (popularity score of a document)</p>
      <p>Based on benchmark ranking performance, we empirically determine a weighted combination of these facets, which is also available as a search criterion choice in the interface. Additionally, we provide local and global visual explanations: the local ones visualize the contribution of features (expanded query terms) for each document, and the global ones compare these contributions across multiple documents (refer to the Evidence Graph in the lower part of Fig. 4).</p>
      <p>We benchmark our retrieval performance by comparing with [21] (refer to Table 1) and confirm that our ranking approach needs improvement to at least match the baseline performance metrics.</p>
      <preformat>input : q = {w_1, w_2, ..., w_n}, D = {d_1, d_2, ..., d_m}, facet
output: A re-ranked doc list

Select top-k docs from D using cosine similarity, giving D_k = {d'_1, d'_2, ..., d'_k}
for i ← 1 to k do
    if facet == ‘term statistics’ or ‘contextual words’ then
        evidence(d'_i) ← Σ_{w ∈ q} count(w, d'_i)
        // count(w, d'_i) is count of term w in d'_i
    end
    if facet == ‘citation-based popularity’ then
        evidence(d'_i) ← popularityScore(d'_i)
        // popularityScore(d'_i) could be inLinks count, PageRank or HITS score of d'_i
    end
end
Rerank all docs in D_k using evidence
return D_k</preformat>
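      <p>The re-ranking step of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the ExDocS implementation: the function names evidence and rerank, the (doc_id, tokens, popularity) tuple layout, and the toy documents are hypothetical, and the top-k documents are assumed to have been selected already by a first-stage retriever such as cosine similarity.</p>
      <preformat>
```python
from collections import Counter


def evidence(query_terms, doc_tokens, facet, popularity=0.0):
    """Evidence score of one document for the chosen facet."""
    if facet in ("term statistics", "contextual words"):
        counts = Counter(doc_tokens)
        # Sum of count(w, d) over the (possibly expanded) query terms.
        return sum(counts[w] for w in query_terms)
    if facet == "citation-based popularity":
        # Could be an in-link count, a PageRank score or a HITS score.
        return popularity
    raise ValueError("unknown facet: " + facet)


def rerank(query_terms, top_k_docs, facet):
    """Re-rank the top-k documents by descending evidence.

    top_k_docs is a list of (doc_id, tokens, popularity) tuples
    (a hypothetical layout chosen for this sketch).
    """
    scored = [
        (doc_id, evidence(query_terms, tokens, facet, popularity))
        for doc_id, tokens, popularity in top_k_docs
    ]
    # sorted() is stable, so ties keep their first-stage order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


docs = [
    ("d1", "solar power plant".split(), 3.0),
    ("d2", "solar energy and solar panels".split(), 1.0),
    ("d3", "wind energy".split(), 7.0),
]
print(rerank(["solar", "energy"], docs, "term statistics"))
print(rerank(["solar", "energy"], docs, "citation-based popularity"))
```
      </preformat>
      <p>With the toy data, the term-statistics facet places d2 first (three matching terms), while the popularity facet places d3 first. For the ‘contextual words’ facet, the caller would pass the expanded query terms (query words plus contextual words and synonyms) as query_terms.</p>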
      <sec id="sec-4-1">
        <title>Algorithm 1: Re-ranking algorithm</title>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Evaluation</title>
      <sec id="sec-5-1">
        <p>We have two specific focus areas in evaluation. The first one is related to the quality of the rankings and the second one is related to the explainability aspect. We leave out evaluation of the popularity score model for future work.</p>
        <sec id="sec-5-1-1">
          <title>4.1. Evaluation of re-ranking algorithm</title>
        </sec>
      </sec>
      <sec id="sec-5-2">
        <p>We evaluated the re-ranking algorithm on the TREC Disk 4 &amp; 5 (-CR) dataset. The evaluations were carried out using the trec_eval [20] package. We used the TREC-6 ad-hoc queries (topics 301-350) and used only the ‘Title’ of each topic as the query. We noticed that the Keyword Search, Contextual Search, Synonym Search, and Contextual Synonym Search systems were unable to beat the ‘Baseline ExDocS’ (OOTB Apache Solr) on metrics such as MAP, R-Precision, and NDCG (refer to Table 1).</p>
        <p>4.2. Evaluation of explanations</p>
        <p>We performed a user study to qualitatively evaluate the explanations. To compare ExDocS’s explanations with those of EXS, we integrated EXS’s explanation model into our interface. By keeping the look and feel of both systems alike, we tried to reduce the users’ bias towards either system.</p>
        <p>4.2.1. User study setup</p>
      </sec>
      <sec id="sec-5-3">
        <p>A total of 32 users participated in a lab-controlled user study. 30 users were from a computer science background, while 26 users had a fair knowledge of information retrieval systems. Each user was asked to test out both systems, and the questionnaire was formatted in a Latin-block design. The names of the systems were masked as System-A (EXS) and System-B (ExDocS).</p>
        <p>4.2.2. Metrics for evaluation</p>
        <p>
          We use the existing definitions ([
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] and [22]) of Interpretability, Completeness and Transparency in the community with respect to evaluation in XAI. The following factors are used for evaluating the quality and effectiveness of explanations:
• Interpretability: describing the internals of a system in human-understandable terms [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
• Completeness: describing the operation of a system accurately and allowing the system’s behavior to be anticipated in future [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
• Transparency: an IR system should be able to demonstrate to its users and other interested parties, why and how the proposed outcomes were achieved [22].
        </p>
        <sec id="sec-5-3-1">
          <title>4.3. Results and Discussion</title>
        </sec>
      </sec>
      <sec id="sec-5-4">
        <p>We discuss the results of our experiments and draw conclusions to answer the research questions.</p>
        <p>RQ1. Why is a document X ranked at Y for a
given query?
We answer this question by providing the individual
textual explanation for every document (refer to Part D of
Fig. 1) on the ExDocS interface. The “math behind the
rank” (refer to Part E of Fig. 1) of a document is explained
as a percentage of the evidence with respect to the best
matching document.</p>
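        <p>The “math behind the rank” described above, i.e. the evidence of each document expressed as a percentage of the evidence of the best matching document, can be sketched as follows. This is an illustrative sketch only: the function name evidence_percentages, the dictionary layout and the example scores are assumptions and are not taken from the ExDocS source code.</p>
        <preformat>
```python
def evidence_percentages(evidence_by_doc):
    """Map each document to its evidence as a percentage of the best one."""
    best = max(evidence_by_doc.values(), default=0)
    if best == 0:
        # No evidence anywhere: report 0% for every document.
        return {doc_id: 0.0 for doc_id in evidence_by_doc}
    return {
        doc_id: round(100.0 * score / best, 2)
        for doc_id, score in evidence_by_doc.items()
    }


# Hypothetical evidence scores for three retrieved documents.
print(evidence_percentages({"d1": 12, "d2": 9, "d3": 3}))
```
        </preformat>
        <p>In this example the best matching document d1 receives 100%, d2 receives 75% and d3 receives 25%, which is the kind of relative figure a user can read directly off the explanation.</p>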
        <p>RQ2. How do we compare multiple documents
to understand their relative rankings?
We provide an option to compare multiple documents
through visual and textual paradigms (refer to Fig. 4). The
evidence can be compared and contrasted, which helps the
user understand why a document is ranked higher
or lower than others.</p>
        <p>RQ3. Are the generated explanations
interpretable and complete?
We evaluate the quality of the explanations in terms of
their interpretability and completeness. Empirical
evidence from the user study on Interpretability:
1. 96.88% of the users understood the textual
explanations of ExDocS
2. 71.88% of the users understood the relation
between the query term and features (synonyms or
contextual words) shown in the explanation
3. Users gave a mean rating of 4 out of 5 (standard
deviation = 1.11) to ExDocS on the
understandability of the percentage calculation for rankings,
shown as part of the explanations</p>
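        <p>The user ratings of the two systems are compared with the Wilcoxon signed-rank (WSR) test. A minimal sketch of how the test statistic can be computed from paired ratings is given here. It assumes the common formulation in which W is the smaller of the positive and negative rank sums and zero differences are discarded; the function name and the ratings are hypothetical and are not the data of the user study.</p>
        <preformat>
```python
def wilcoxon_signed_rank(ratings_a, ratings_b):
    """Return (W, n_r): the smaller signed-rank sum and the number of
    non-zero paired differences."""
    diffs = [b - a for a, b in zip(ratings_a, ratings_b) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i != len(ordered):
        # Find the run of tied absolute differences starting at i.
        j = i
        while j != len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        # Tied values share the average of the ranks i+1 .. j.
        average_rank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    # d == abs(d) holds exactly for the positive differences.
    positive = sum(r for r, d in zip(ranks, ordered) if d == abs(d))
    negative = sum(ranks) - positive
    return min(positive, negative), len(ordered)


# Hypothetical 1-5 ratings given by twelve users to two systems.
system_a = [3, 4, 2, 3, 4, 3, 2, 4, 3, 3, 5, 2]
system_b = [4, 4, 4, 5, 3, 4, 4, 5, 3, 5, 4, 3]
print(wilcoxon_signed_rank(system_a, system_b))
```
        </preformat>
        <p>For these made-up ratings the function returns W = 7.0 with 10 non-zero differences. The difference is called significant when W falls below the tabulated critical value for the chosen significance level and number of non-zero differences.</p>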
      </sec>
      <sec id="sec-5-5">
        <p>When users were explicitly asked whether they could “gather an understanding of how the system functions based on the given explanations”, they gave a positive response with a mean rating of 3.84 out of 5 (standard deviation = 0.72). The above-mentioned empirical evidence indicates that the ranking explanations provided by ExDocS can be deemed interpretable.</p>
        <p>Empirical evidence from the user study on Completeness:
1. All users found the features shown in the explanation of ExDocS to be reasonable (i.e. sensible or fairly good)
2. 90.63% of the users understood through the comparative explanations of ExDocS why a particular document was ranked higher or lower than other documents</p>
        <p>Moreover, 78.13% of total users claimed that they could anticipate ExDocS behavior in the future based on the understanding gathered through explanations (individual and comparative). Based on the above empirical evidence we argue that the ranking explanations generated by ExDocS can be assumed to be complete.</p>
        <p>Transparency: We investigate if the explanations make ExDocS more transparent [22] to the user. Users gave ExDocS a mean rating of 3.97 out of 5 (standard deviation = 0.86) on ‘Transparency’ based on the individual (local) explanations. In addition to that, 90.63% of the total users indicated that ExDocS became more transparent after reading the comparative (global) explanations. This indicates that explanations make ExDocS more transparent to the user.</p>
        <p>Comparison of explanations between ExDocS and EXS:
Both systems performed similarly in terms of transparency and completeness. However, users found ExDocS explanations to be more interpretable compared to those of EXS (refer to Fig. 5), and this comparison was statistically significant in the WSR test (|W| &lt; W_critical(α = 0.05, N_r = 10) = 10, where |W| = 5.5).</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusion and Future Work</title>
      <sec id="sec-6-1">
        <p>In this work, we present an Explainable Document Search (ExDocS) system that attempts to explain document rankings to a non-expert user using a combination of textual and visual elements. We make use of word embeddings and the WordNet thesaurus to expand the user query. We use various interpretable facets such as term statistics, contextual words, and citation-based popularity. Re-ranking results from a simple vector space model with such interpretable facets helps us to explain the “math behind the rank” to an end-user. We evaluate the explanations by comparing ExDocS with another explainable search baseline in a user study. We find statistically significant results that ExDocS provides interpretable and complete explanations. However, it was difficult to find a clear winner between both systems in all aspects. In line with the “no free lunch” theorem, the results show a drop in ranking quality on benchmark data sets as the cost of getting comprehensible explanations. This paves the way for ongoing research to include user feedback to adapt the rankings and explanations. ExDocS is currently being evaluated in domain-specific search settings like law search, where explainability is a key factor to gain user trust.</p>
        <p>
5th International Conference on Data Science and Advanced Analytics (DSAA), IEEE, 2018, pp. 80–89.</p>
        <p>[7] Z. T. Fernando, J. Singh, A. Anand, A Study on the Interpretability of Neural Retrieval Models Using DeepSHAP, in: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR’19, Association for Computing Machinery, New York, NY, USA, 2019, p. 1005–1008.</p>
        <p>[8] M. A. Hearst, TileBars: Visualization of Term Distribution Information in Full Text Information Access, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’95, ACM Press/Addison-Wesley Publishing Co., USA, 1995, p. 59–66.</p>
        <p>[9] O. Hoeber, M. Brooks, D. Schroeder, X. D. Yang, TheHotMap.Com: Enabling Flexible Interaction in Next-Generation Web Search Interfaces, in: Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 01, WI-IAT ’08, IEEE Computer Society, USA, 2008, p. 730–734.</p>
        <p>[10] M. A. Soliman, I. F. Ilyas, K. C.-C. Chang, URank: Formulation and Efficient Evaluation of Top-k Queries in Uncertain Databases, in: Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, SIGMOD ’07, Association for Computing Machinery, New York, NY, USA, 2007, p. 1082–1084.</p>
        <p>[11] S. Mi, J. Jiang, Understanding the Interpretability of Search Result Summaries, in: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR’19, Association for Computing Machinery, New York, NY, USA, 2019, p. 989–992.</p>
        <p>[12] Q. Ai, Y. Zhang, K. Bi, W. B. Croft, Explainable Product Search with a Dynamic Relation Embedding Model, ACM Trans. Inf. Syst. 38 (2019).</p>
        <p>[13] S. Verberne, Explainable IR for personalizing professional search, in: ProfS/KG4IR/Data: Search@SIGIR, 2018.</p>
        <p>[14] M. Melucci, Can Structural Equation Models Interpret Search Systems?, in: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR’19, Association for Computing Machinery, New York, NY, USA, 2019. URL: https://ears2019.github.io/Melucci-EARS2019.pdf.</p>
        <p>[15] A. J. Biega, K. P. Gummadi, G. Weikum, Equity of attention: Amortizing individual fairness in rankings, in: The 41st International ACM SIGIR conference on Research &amp; Development in Information Retrieval, 2018, pp. 405–414.</p>
        <p>[16] S. C. Geyik, S. Ambler, K. Kenthapadi, Fairness-Aware Ranking in Search and Recommendation Systems with Application to LinkedIn Talent Search, in: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining, KDD ’19, Association for Computing Machinery, New York, NY, USA, 2019, p. 2221–2231. URL: https://doi.org/10.1145/3292500.3330691. doi:10.1145/3292500.3330691.</p>
        <p>[17] C. Castillo, Fairness and Transparency in Ranking, SIGIR Forum 52 (2019) 64–71.</p>
        <p>[18] V. Chios, Helping results assessment by adding explainable elements to the deep relevance matching model, in: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, Association for Computing Machinery, New York, NY, USA, 2020. URL: https://ears2020.github.io/accept_papers/2.pdf.</p>
        <p>[19] D. Roy, S. Saha, M. Mitra, B. Sen, D. Ganguly, I-REX: A Lucene Plugin for EXplainable IR, in: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, CIKM ’19, Association for Computing Machinery, New York, NY, USA, 2019, p. 2949–2952.</p>
        <p>[20] C. Buckley, et al., The trec_eval evaluation package, 2004.</p>
        <p>[21] D. K. Harman, E. Voorhees, The Sixth Text REtrieval Conference (TREC-6), US Department of Commerce, Technology Administration, National Institute of Standards and Technology (NIST), 1998.</p>
        <p>[22] A. Olteanu, J. Garcia-Gathright, M. de Rijke, M. D. Ekstrand, Workshop on Fairness, Accountability, Confidentiality, Transparency, and Safety in Information Retrieval (FACTS-IR), in: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019, pp. 1423–1425.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E. L.</given-names>
            <surname>Mencia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fürnkranz</surname>
          </string-name>
          ,
          <article-title>Efficient multilabel classification algorithms for large-scale problems in the legal domain</article-title>
          ,
          <source>in: Semantic Processing of Legal Texts</source>
          , Springer,
          <year>2010</year>
          , pp.
          <fpage>192</fpage>
          -
          <lpage>215</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Binder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Montavon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Klauschen</surname>
          </string-name>
          ,
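          <string-name>
            <given-names>K.-R.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Samek</surname>
          </string-name>
          ,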
          <article-title>On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation</article-title>
          ,
          <source>PloS one</source>
          <volume>10</volume>
          (
          <year>2015</year>
          )
          <elocation-id>e0130140</elocation-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>"Why Should I Trust You?": Explaining the Predictions of Any Classifier</article-title>
          ,
          <source>in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source>
          , KDD '16,
          <publisher-name>Association for Computing Machinery</publisher-name>
          ,
          <publisher-loc>New York, NY, USA</publisher-loc>
          ,
          <year>2016</year>
          , p.
          <fpage>1135</fpage>
          -
          <lpage>1144</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pearl</surname>
          </string-name>
          , et al.,
          <article-title>Causal inference in statistics: An overview</article-title>
          ,
          <source>Statistics surveys</source>
          <volume>3</volume>
          (
          <year>2009</year>
          )
          <fpage>96</fpage>
          -
          <lpage>146</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Anand</surname>
          </string-name>
          ,
          <article-title>EXS: Explainable Search Using Local Model Agnostic Interpretability</article-title>
          ,
          <source>in: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining</source>
          , WSDM '19,
          <publisher-name>Association for Computing Machinery</publisher-name>
          ,
          <publisher-loc>New York, NY, USA</publisher-loc>
          ,
          <year>2019</year>
          , p.
          <fpage>770</fpage>
          -
          <lpage>773</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L. H.</given-names>
            <surname>Gilpin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Bau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. Z.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bajwa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Specter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kagal</surname>
          </string-name>
          ,
          <article-title>Explaining explanations: An overview of interpretability of machine learning</article-title>
          ,
          <source>in: 2018 IEEE</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>