<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Global Interpretability for ProtoPNet Using Rule-Based Explanations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alec Parise</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Brian Mac Namee</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University College Dublin</institution>
          ,
          <addr-line>Dublin</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Deep learning models are undeniably powerful but often criticised for their “black-box” nature. Prototypical Part Networks (ProtoPNets) address this by providing local, prototype-based explanations for individual predictions. However, while local insights are useful, they fail to capture the model's overall behaviour, a critical shortcoming when domain experts need to diagnose, validate and refine complex models. In this paper we propose a method that converts ProtoPNet activations into human-readable, prototype-based rules using a RIPPER-style induction algorithm. This rule-driven perspective not only elucidates the model's global decision-making process but also offers actionable insights for model debugging and enhancement.</p>
      </abstract>
      <kwd-group>
        <kwd>Explainable AI</kwd>
        <kwd>Interpretable Machine Learning</kwd>
        <kwd>ProtoPNet</kwd>
        <kwd>RIPPER</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Deep learning models achieve high accuracy but often lack transparency, limiting trust in
critical applications [
        <xref ref-type="bibr" rid="ref19 ref20">20, 19</xref>
        ]. Methods like the Prototypical Part Network (ProtoPNet) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] have
been developed to address this. ProtoPNet augments a CNN with a prototype layer that learns
representative image patches; during inference, the prediction for an image is explained
by showing the user the patches that most strongly influenced it, a process inspired
by prototype theory [
        <xref ref-type="bibr" rid="ref11 ref12">26, 11, 12</xref>
        ]. While such local explanations offer intuitive, example-based
insights, they do not capture the model’s global decision process. Global interpretability is
essential for understanding model behavior, troubleshooting errors, and refining models,
particularly in fields like medical imaging.
      </p>
      <p>
        In our work, we convert ProtoPNet’s local explanations into global, rule-based ones using an
adapted RIPPER rule induction algorithm [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], inspired by frameworks such as AIMEE [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This
approach maps prototype contributions into if-then rules, yielding a human-readable summary
of the model’s decision-making. We validate our method on the Caltech bird dataset [25] by
comparing the rule-based algorithm’s adherence, accuracy, and complexity to the original
model, addressing challenges like the trade-off between adherence and interpretability and
prototype-feature alignment [
        <xref ref-type="bibr" rid="ref1 ref11 ref12 ref14">1, 11, 12, 14</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Prototype-based neural networks have emerged as a promising approach to enhance
interpretability in deep learning. ProtoPNet [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] augments CNNs with a prototype layer that learns
representative image patches, providing case-based explanations inspired by prototype
theory [26]. Subsequent work has addressed challenges such as redundancy and scalability through
methods like prototype merging [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] and decision-tree variants [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Alternative interpretability
approaches include post-hoc attribution methods (e.g., LIME [24], SHAP [23]) and concept
bottleneck models. Despite these advances, achieving a coherent global explanation of model
behavior remains challenging. Our work builds on these foundations by transforming ProtoPNet’s
local explanations into global, rule-based insights via a RIPPER-style induction algorithm.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Generating Global Explanations</title>
      <p>
        Our goal is to move from instance-level prototype explanations to a global, rule-based
representation of model behavior. We combine a CNN backbone (e.g., ResNet [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]) with ProtoPNet [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] to
obtain local interpretability, and then convert the resulting prototype activations into a set of
human-readable if-then rules using a RIPPER-style induction algorithm [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This section details
the process of extracting prototype activations, converting them into binary presence-absence
features via a hard gating mechanism, and inducing global rules.
      </p>
      <sec id="sec-3-1">
        <title>3.1. Extracting Prototype Activations</title>
        <p>ProtoPNet augments a CNN with a prototype layer that learns representative image patches.
The process involves:
1. Feature Extraction: An input image x is passed through the convolutional layers in
the CNN backbone. Each convolutional filter extracts local features, and as the image
propagates through the network, these features are spatially organized into a feature
map: a multi-dimensional array where each element corresponds to a small, localized
region (or “patch”) of the original image.
2. Prototype Comparison: Each patch z is compared to a set of learned prototypes using a
similarity metric (typically the negative squared Euclidean distance).
3. Activation Scoring: A max-pooling operation aggregates the similarity scores for each
prototype p_j:</p>
        <p>a(x, p_j) = max_{z ∈ f(x)} sim(z, p_j),</p>
        <p>where f(x) denotes the feature map and sim(z, p_j) the similarity between patch z and
prototype p_j. This score is then weighted by a non-negative coefficient via a ReLU
activation:</p>
        <p>s(x, p_j) = a(x, p_j) × ReLU(w(p_j)).</p>
        <p>The resulting prototype-activation profile highlights the image regions that most influence the
classification.</p>
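        <p>The following is a minimal sketch of this scoring pipeline, assuming a PyTorch backbone;
the function and argument names are illustrative and not taken from the ProtoPNet code base:</p>
        <preformat>import torch

def prototype_scores(x, backbone, prototypes, w):
    """x: (B, 3, H, W) images; prototypes: (P, D) learned vectors;
    w: (P,) per-prototype coefficients."""
    fmap = backbone(x)                               # (B, D, h, w) feature map
    B, D, h, wd = fmap.shape
    patches = fmap.permute(0, 2, 3, 1).reshape(B, h * wd, D)
    protos = prototypes.unsqueeze(0).expand(B, -1, -1)
    sim = -(torch.cdist(patches, protos) ** 2)       # negative squared distance
    a = sim.max(dim=1).values                        # max-pool over patches: (B, P)
    return a * torch.relu(w)                         # s(x, p_j) = a(x, p_j) * ReLU(w_j)
</preformat>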
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Binary Feature Creation via Hard Gating Method</title>
        <p>To induce interpretable rules, we convert the continuous activation scores s(x, p_j) into binary
presence-absence indicators using a hard gating mechanism. This approach discretizes the
activations while retaining a differentiable approximation during training. The process involves
three key steps:
1. Noise Injection: To simulate stochastic sampling and encourage exploration of binary
states, we add Gumbel noise to the activation score:</p>
        <p>ŝ(x, p_j) = s(x, p_j) + g(x, p_j),</p>
        <p>where the Gumbel noise is generated as</p>
        <p>g(x, p_j) = −log(−log(u(x, p_j)))</p>
        <p>with u(x, p_j) ∼ Uniform(0, 1).</p>
        <p>This transformation converts uniformly distributed random values into noise that
effectively follows the Gumbel (extreme value) distribution, which is useful for modeling the
binary decision process.
2. Soft Gating: The noisy score ŝ(x, p_j) is then passed through a temperature-scaled
sigmoid function to obtain a soft probability:</p>
        <p>q(x, p_j) = σ(ŝ(x, p_j) / τ).</p>
        <p>The temperature parameter τ controls the sharpness of the sigmoid output: a lower τ
results in a steeper function, closely approximating a hard threshold, while a higher τ
produces a smoother transition. This soft gating step maintains differentiability, which is
critical for gradient-based optimization during training.
3. Thresholding: Finally, the soft probability q(x, p_j) is converted into a binary decision
using a threshold of 0.5:</p>
        <p>presence(x, p_j) = 1 if q(x, p_j) ≥ 0.5, and 0 otherwise.</p>
        <p>This discretization yields a clear binary indicator of whether prototype p_j is considered
active in image x. The resulting binary presence-absence matrix is then used as input
for the RIPPER-style rule induction algorithm.</p>
        <p>Overall, this hard gating method effectively captures uncertainty in the activation signals
while providing a differentiable pathway for training and a clear binary representation for
downstream rule extraction.</p>
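        <p>A minimal sketch of the gating step, again assuming PyTorch; the straight-through
estimator in the final line is one common way to realise the differentiable pathway and is
our assumption rather than a detail of the original description:</p>
        <preformat>import torch

def hard_gate(s, tau=0.5, training=True):
    """Binarize activation scores s of shape (batch, prototypes)."""
    if training:
        u = torch.rand_like(s).clamp_(1e-9, 1 - 1e-9)  # u ~ Uniform(0, 1)
        g = -torch.log(-torch.log(u))                  # Gumbel noise
        s = s + g                                      # noise injection
    q = torch.sigmoid(s / tau)                         # soft gate in (0, 1)
    hard = (q >= 0.5).float()                          # binary presence indicator
    # forward pass uses the hard 0/1 value; gradients flow through q
    return hard + q - q.detach() if training else hard
</preformat>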
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Adapted RIPPER Rule Induction</title>
        <p>
          After converting each image’s continuous prototype activations into binary indicators, we train a
RIPPER-style rule learner [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] to form a global, logical view of prototype-based classifications. In
standard RIPPER, a one-vs-all approach iteratively adds or removes conditions (e.g., “Prototype
p_j is present”) to maximize target class coverage while minimizing misclassifications, followed
by pruning to avoid overfitting.
        </p>
        <p>
          Our implementation extends RIPPER to better accommodate prototype-based features by:
1. Cross-Class Penalty: Adding a term to the First Order Inductive Learner (FOIL) [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] gain
calculation that penalizes conditions referencing non-target prototypes.
2. Merging Redundant Single-Literal Rules: Combining multiple single-literal rules
for the same class into a single rule with an AND clause, reducing redundancy while
preserving coverage and accuracy.
3. Top-N Prototype-Based Fallbacks: Constructing minimal fallback rules from the
top-N most frequently activated prototypes when a class remains uncovered.
4. Adaptive Rule Growth and Pruning: Iteratively growing rules by adding conditions
that maximize the modified FOIL gain (see the sketch below), then pruning using a
hold-out set and removing a fraction of positive examples to prevent redundancy.
        </p>
        <p>This adapted RIPPER method yields a global set of rules consistent with ProtoPNet’s decisions,
providing a structured framework to analyze the model’s internal logic.</p>
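        <p>For concreteness, a sketch of the modified gain from steps 1 and 4; the penalty
weighting shown is illustrative, not the exact term used in our implementation:</p>
        <preformat>import math

def modified_foil_gain(p0, n0, p1, n1, cross_class_penalty=0.0):
    """FOIL gain for adding a condition: p0/n0 (p1/n1) count the positive/
    negative examples covered before (after) the condition is added.
    cross_class_penalty is nonzero when the condition references a
    non-target prototype."""
    if p0 == 0 or p1 == 0:
        return float("-inf")
    gain = p1 * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))
    return gain - cross_class_penalty
</preformat>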
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiment Design</title>
      <p>This section describes the design of an experiment performed to evaluate the approach to
generating global explanations for ProtoPNet models described in Section 3. The evaluation uses
10 randomly selected classes from the Caltech Bird Dataset [25]. 300 images from these classes
were randomly selected for training the ProtoPNet, and this set was expanded to 1794 augmented
images (using random flips, skewing, and other perturbations) to enhance robustness.</p>
      <p>For binarized feature extraction, 280 test images spanning the 10 classes were sampled from
the original dataset. Seventy percent of these images were used for training and cross-validating
the RIPPER rule induction model, with the remaining 30% reserved for testing the ability of the
rule-based representation to replicate ProtoPNet’s behavior.</p>
      <p>To further ensure the reliability of our rule induction process, we performed dynamic
hyperparameter tuning using the Optuna framework [27] in conjunction with 5-fold cross-validation.
Rather than setting rule induction hyperparameters (e.g., max_conditions, min_coverage,
prune_size) arbitrarily, Optuna systematically explores the search space of possible
values, leveraging the Tree-structured Parzen Estimator (TPE) [28] to converge toward optimal
configurations (see the sketch below).</p>
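      <p>A minimal sketch of this tuning loop using the Optuna API; cross_val_adherence is a
hypothetical stand-in for our 5-fold cross-validation routine, and the search ranges are
illustrative:</p>
      <preformat>import optuna

def cross_val_adherence(params):
    """Hypothetical stand-in: run 5-fold CV of the rule learner with
    these hyperparameters and return mean adherence to ProtoPNet."""
    raise NotImplementedError

def objective(trial):
    # search space over the rule-induction hyperparameters named above
    params = {
        "max_conditions": trial.suggest_int("max_conditions", 2, 8),
        "min_coverage": trial.suggest_int("min_coverage", 1, 20),
        "prune_size": trial.suggest_float("prune_size", 0.1, 0.5),
    }
    return cross_val_adherence(params)

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=100)
</preformat>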
      <p>We assessed the quality of the generated explanations with three metrics:
• Adherence: The percentage of cases where the RIPPER model’s predictions agree with
ProtoPNet on a hold-out test set.
• Accuracy: The RIPPER model’s prediction accuracy compared to ground truth labels and
ProtoPNet’s performance.
• Complexity: Measured by the total number of extracted rules and the average number
of conditions per rule.</p>
      <p>These metrics were computed both overall and per class. ProtoPNet achieved a baseline
accuracy of 85.13% on the test set. Our evaluation focuses on whether the extracted rules
faithfully capture ProtoPNet’s internal reasoning while remaining interpretable for end users.</p>
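      <p>All three metrics reduce to simple computations over the surrogate’s outputs; a sketch,
with all argument names illustrative:</p>
      <preformat>import numpy as np

def evaluate_surrogate(rule_preds, protopnet_preds, y_true, rules):
    """rules: list of extracted rules, each a list of conditions."""
    rule_preds = np.asarray(rule_preds)
    adherence = np.mean(rule_preds == np.asarray(protopnet_preds))
    accuracy = np.mean(rule_preds == np.asarray(y_true))
    n_rules = len(rules)
    avg_conditions = np.mean([len(r) for r in rules])
    return adherence, accuracy, n_rules, avg_conditions
</preformat>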
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>[Figure: three example rules extracted for class 7 (Black Billed Cuckoo); one of them is:]</p>
      <preformat>IF Contains Prototype 78 AND Contains Prototype 77 AND NOT Contains Prototype 22 THEN class=7</preformat>
      <p>This particular rule highlights the significance of certain plumage patterns (Prototypes 78
and 77), while simultaneously indicating the absence of another plumage pattern (Prototype 22),
for predicting the Black Billed Cuckoo. Collectively, the three rules in the figure illustrate
how a RIPPER-style surrogate can provide multiple, human-readable explanations for what a
ProtoPNet model has learned.</p>
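      <p>Reading such a rule is mechanical: it fires when every positive literal is present and
every negated literal is absent in the binary matrix from Section 3.2. A sketch with
hypothetical names:</p>
      <preformat>def rule_fires(presence, positives, negatives):
    """presence maps prototype index to 0/1; the rule above corresponds
    to positives=[78, 77], negatives=[22] for class 7."""
    return (all(presence[j] == 1 for j in positives)
            and all(presence[j] == 0 for j in negatives))
</preformat>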
      <p>To assess the quality of these global explanations, we evaluated their adherence and accuracy
(see Section 4). Our approach achieved an adherence of 70.37% and a rule-based accuracy of
71.60%. Additionally, the method produced a total of 71 rules with an average of 2.42 conditions
per rule. These metrics suggest that the approach effectively captures key decision cues while
maintaining interpretability. Overall, by integrating both the presence and absence of
prototypes, the approach extracts a comprehensive yet concise set of rules that elucidate the
model’s learned knowledge, thereby offering the potential to streamline analysis and enhance
human interpretability.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>This paper describes an approach to bridge the gap between local, prototype-based explanations
and a global, rule-based perspective. By moving beyond single-image explanations, it reveals
common patterns that a model has learned, offering a high-level, comprehensible map of how
prototypes shape model decisions.</p>
      <p>Future work will compare this strategy to established approaches such as decision trees and
concept bottleneck models, in order to further contextualize its performance and scalability.
Additionally, we plan to extend our evaluation framework by incorporating alignment with
domain expert knowledge, analyzing coverage and specificity, validating generalizability on
new or perturbed data, and conducting user studies to assess interpretability. We will also
investigate modifications to the approach, including different gating mechanisms, different rule
extraction algorithms, and the integration of human feedback into the rule-building process.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>This publication has emanated from research conducted with the financial support of Science
Foundation Ireland under Grant number 18/CRT/6183. For the purpose of Open Access, the
author has applied a CC BY public copyright licence to any Author Accepted Manuscript version
arising from this submission.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any generative-AI tools in the preparation of this article.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tao</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barnett</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rudin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Su</surname>
            ,
            <given-names>J.K.</given-names>
          </string-name>
          :
          <article-title>This looks like that: Deep learning for interpretable image recognition</article-title>
          .
          <source>Advances in Neural Information Processing Systems</source>
          ,
          <volume>32</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Piorkowski</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , et al.:
          <article-title>AIMEE: An Interactive Model Explanation Environment for Deep Neural Networks</article-title>
          .
          <source>In: Proceedings of the [Conference]</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>W.W.</given-names>
          </string-name>
          :
          <article-title>Fast Effective Rule Induction</article-title>
          .
          <source>Machine Learning</source>
          ,
          <volume>2</volume>
          ,
          <fpage>115</fpage>
          -
          <lpage>137</lpage>
          (
          <year>1995</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>He</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ren</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sun</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Deep Residual Learning for Image Recognition</article-title>
          .
          <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source>
          ,
          <fpage>770</fpage>
          -
          <lpage>778</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Chiaburu</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haußer</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bießmann</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>CoProNN: Concept-based Prototypical Nearest Neighbors for Explaining Vision Models</article-title>
          .
          <source>arXiv preprint arXiv:2404.14830</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Richards</surname>
            ,
            <given-names>B. L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mooney</surname>
            ,
            <given-names>R.J.</given-names>
          </string-name>
          :
          <article-title>First-order Theory Revision</article-title>
          .
          <source>In: Machine Learning Proceedings</source>
          <year>1991</year>
          , pp.
          <fpage>447</fpage>
          -
          <lpage>451</lpage>
          . Elsevier (
          <year>1991</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Bontempelli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teso</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tentori</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Giunchiglia</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passerini</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Concept-level debugging of part-prototype networks</article-title>
          .
          <source>arXiv preprint arXiv:2205.15769</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Heinrich</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sick</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scholz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>AudioProtoPNet: An interpretable deep learning model for bird sound classification</article-title>
          .
          <source>arXiv preprint arXiv:2404.10420</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Sinhamahapatra</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shit</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sekuboyina</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Husseini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schinz</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lenhart</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Menze</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kirschke</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roscher</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guennemann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Enhancing Interpretability of Vertebrae Fracture Grading using Human-interpretable Prototypes</article-title>
          .
          <source>arXiv preprint arXiv:2404.02830</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Nauta</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Bree</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seifert</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Neural Prototype Trees for Interpretable Fine-Grained Image Recognition</article-title>
          .
          <source>In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</source>
          ,
          <fpage>14933</fpage>
          -
          <lpage>14943</lpage>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Sourati</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deshpande</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ilievski</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gashteovski</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Saralajew</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Robust Text Classification: Analyzing Prototype-Based Networks</article-title>
          .
          <source>arXiv preprint arXiv:2311.06647</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Narayanan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bergen</surname>
            ,
            <given-names>K.J.</given-names>
          </string-name>
          :
          <article-title>Prototype-Based Methods in Explainable AI and Emerging Opportunities in the Geosciences</article-title>
          .
          <source>arXiv preprint arXiv:2410.19856</source>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rudin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shah</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          :
          <article-title>The Bayesian Case Model: A Generative Approach for CaseBased Reasoning and Prototype Classification</article-title>
          .
          <source>Advances in Neural Information Processing Systems</source>
          ,
          <volume>27</volume>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Rymarczyk</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , Struski, Ł.,
          <string-name>
            <surname>Górszczak</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lewandowska</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tabor</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zieliński</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Interpretable image classification with differentiable prototypes assignment</article-title>
          .
          <source>In: European Conference on Computer Vision</source>
          ,
          <fpage>351</fpage>
          -
          <lpage>368</lpage>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>A.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Netzorg</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cheng</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Improving Prototypical Visual Explanations with Reward Reweighing, Reselection, and Retraining</article-title>
          .
          <source>In: Proceedings of the Forty-first International Conference on Machine Learning</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Rymarczyk</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , Struski, Ł.,
          <string-name>
            <surname>Tabor</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zieliński</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Protopshare: Prototypical parts sharing for similarity discovery in interpretable image classification</article-title>
          .
          <source>In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery &amp; Data Mining</source>
          ,
          <fpage>1420</fpage>
          -
          <lpage>1430</lpage>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Netzorg</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Improving prototypical part networks with reward reweighing, reselection, and retraining</article-title>
          .
          <source>arXiv preprint arXiv:2307.03887</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Vilone</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Longo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>A quantitative evaluation of global, rule-based explanations of post-hoc, model agnostic methods</article-title>
          .
          <source>Frontiers in Artificial Intelligence</source>
          ,
          <volume>4</volume>
          ,
          <issue>717899</issue>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Rudin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead</article-title>
          .
          <source>Nature Machine Intelligence</source>
          ,
          <volume>1</volume>
          (
          <issue>5</issue>
          ),
          <fpage>206</fpage>
          -
          <lpage>215</lpage>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>Lipton</surname>
            ,
            <given-names>Z.C.</given-names>
          </string-name>
          :
          <article-title>The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery</article-title>
          .
          <source>Queue</source>
          ,
          <volume>16</volume>
          (
          <issue>3</issue>
          ),
          <fpage>31</fpage>
          -
          <lpage>57</lpage>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>A comparative analysis of model agnostic techniques for explainable artificial intelligence</article-title>
          .
          <source>Research Reports on Computer Science</source>
          ,
          <volume>25</volume>
          -
          <fpage>33</fpage>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <surname>Yuksekgonul</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zou</surname>
          </string-name>
          , J.:
          <article-title>Post-hoc concept bottleneck models</article-title>
          . arXiv preprint
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>