<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>This part looks alike this: identifying important parts of explained instances and prototypes</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jacek Karolczak</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jerzy Stefanowski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Poznan University of Technology, Institute of Computing Science</institution>
          ,
          <addr-line>ul. Piotrowo 2, 60-695 Poznań</addr-line>
          ,
          <country country="PL">Poland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Although prototype-based explanations provide a human-understandable way of representing model predictions, they often fail to direct user attention to the most relevant features. We propose a novel approach to identify the most informative features within prototypes, termed alike parts. Using feature importance scores derived from an agnostic explanation method, it emphasizes the most relevant overlapping features between an instance and its nearest prototype. Furthermore, the feature importance score is incorporated into the objective function of the prototype selection algorithms to promote global prototype diversity. Through experiments on six benchmark datasets, we demonstrate that the proposed approach improves user comprehension while maintaining or even increasing predictive accuracy.</p>
      </abstract>
      <kwd-group>
        <kwd>prototype-based explanation</kwd>
        <kwd>feature importance</kwd>
        <kwd>user attention guidance</kwd>
        <kwd>local and global explanations</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Research on explaining black-box machine learning methods, which has been developing intensively in
recent years, has led to the introduction of a great number of explanation methods; see, e.g. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Since prototypes correspond to training data, they are easier for humans to understand compared to
more complex explanation methods [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Prototypes can serve as a local explanation by associating
predictions with similar examples or as a global explanation to illustrate model decision boundaries
using a limited number of representative instances.
      </p>
      <p>
        Although in general prototypes can be applied to different types of data, in this paper we focus on
tabular data, i.e., the description of examples in the form of vectors of (feature, value) pairs. However,
their interpretation may be a challenge, especially when there are too many features [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. For local
explanations in particular, human users may encounter difficulties in assessing which features are most
important for the prediction of the considered instance. Furthermore, it can be expected for global
explanations that the discovered prototypes are not only well spread over the learning data space but
are simultaneously characterized by quite diversified subsets of the most important features.
      </p>
      <p>
        Recall that similar expectations have been examined for other data modalities. For images,
prototypical parts networks were introduced to identify characteristic patches instead of complete images [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
However, for tabular data, the decomposition into meaningful parts remains underexplored. To bridge
this gap, we introduce the identification of the most important features in prototypes. This is achieved by
applying an agnostic explanation method for computing the feature importance of the black-box model,
and offers a more refined perspective than existing techniques. Such subsets of features can be exploited
for local or global approaches and support users in better interpreting the provided explanations.
      </p>
      <p>Our approach uses feature importance in two ways. First, we identify alike parts by highlighting the
most informative overlapping features between an instance and its nearest prototype, directing the
user’s attention to a limited number of key features when interpreting a model prediction. Second, we
incorporate feature importance into the prototype selection objective function to promote diversity,
which aids in identifying alike parts. These strategies balance interpretability and diversity, enhancing
both local explanations and prototype selection. The methods are evaluated on benchmark datasets,
with source code available on GitHub1.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>A dataset D consists of N instances (learning examples), expressed as D = {(x_i, y_i)}_{i=1}^{N}, where each
x_i ∈ X represents a d-dimensional feature vector, and y_i ∈ Y denotes its corresponding label. In this
work, we consider tabular data in a feature-value format. We assume the presence of a classifier h
trained on D, which serves as a black-box model to make predictions. The classifier h maps an input
instance x to a predicted label: h(x) ↦ ŷ.</p>
      <p>From our perspective, a prototype is a representative instance selected from the dataset, i.e., an element
(x_j, ŷ_j), where ŷ_j denotes the class assignment of the instance made by the classifier h. Typically, the
set of prototypes, denoted as P, is a subset of D, such that P = {(x_j, ŷ_j)}_{j=1}^{M}, where M ≪ N (P ⊂ D),
ensuring that the number of prototypes is much smaller than the total size of the training dataset.</p>
      <p>
        Some prototype selection methods use kernel functions and vector quantization [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], while KNN-based
methods share similar principles. IKNN_PSLFW [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], for example, partitions data into class-specific
subsets and selects prototypes farthest from other classes. However, most methods rely on standard
distance measures in the original attribute space, requiring a similarity definition that supports diverse
data types (binary, numerical, categorical) and is robust to scaling diferences. However, most of
these algorithms exploit standard distance measures in the original attribute space, which requires the
definition of similarity that supports diferent data types and is immune to diferent scales.
      </p>
      <p>
        More recent proposals mitigate these distance limitations by considering the proximity of instances in
the new space, referring to predictions of the black-box model; see the tree-space prototypes developed
for explaining ensembles. The first algorithm, SM-A, introduced in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], searches for prototypes –
medoids in this space. However, it requires the user to specify the expected number of prototypes. This
limitation was later addressed by A-PETE [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which automates prototype selection.
      </p>
      <p>
        Although numerous methods have been proposed to assess feature influence for black-box model
predictions, they have not been widely applied in conjunction with prototype-based explanations.
Popular techniques such as SHAP [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] used as a local explanation yield a vector whose length equals the
number of features, where each value is the importance score of an individual feature, helping to
understand the behavior of the model for specific instances.
      </p>
      <p>
        Despite multiple studies on prototypes for tabular data, only a few papers discuss how prototypes
should be presented to end users. In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], some prototype visualizations are provided, such as 2D scatter
plots or self-organizing maps; however, they are suitable only for low-dimensional data and ultimately
do not focus user attention on specific parts.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Method</title>
      <p>In Section 3.1 we will first present our proposal to support the local explanation of the example
prediction by the nearest prototype. Then, in Section 3.2 we will generalize it to create a diverse global
set of prototypes.</p>
      <sec id="sec-3-1">
        <title>3.1. Identifying important parts</title>
        <p>
          Following [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], when there are many features, a prototype as a whole can be difficult to comprehend, which in turn
makes it difficult to explain the prediction of a black-box model. Some features within the prototype
may be of high importance, while others may have low importance to the specific prediction that is
being explained.
        </p>
        <sec id="sec-3-1-1">
          <title>1https://github.com/jkarolczak/important-parts-of-prototypes</title>
          <p>Table 1. Finding alike parts for the instance and its prototype from the Apple Quality dataset. The first two rows present the
feature importance values for the instance and its prototype, respectively. The third row shows the computed
weights, obtained as the element-wise product of normalized feature importance scores (Formula 2). The bottom
row indicates the binary mask, which selects the most relevant shared features (those with weights above the
mean), denoted by '1'.</p>
          <p>
                        Size   Weight  Sweetness  Crunchiness  Juiciness  Ripeness  Acidity
            Instance   -2.77   -1.08     -1.72        1.38        0.19      3.65      0.31
            Prototype  -0.97   -0.20     -3.07        0.00       -0.52      3.16     -0.52
            Weights     0.18    0.02      0.27        0.00        0.00      0.51      0.00
            Mask        1       0         1           0           0         1         0
          </p>
          <p>Therefore, we propose a method that identifies the most informative features shared between an
instance and its prototype, guiding the user’s attention to a concise subset of features. Further, we refer
to them as alike parts, where the importance of features within the alike part is similarly high in both
the instance and its nearest prototype.</p>
          <p>
            To explain the instance x by its nearest prototype p_j, we first identify the alike parts by computing
feature importance scores for each feature i ∈ {1, . . . , d} in the classification of x and p_j by the
classifier h, denoted as φ_i(h, x) and φ_i(h, p_j), respectively. We use the SHAP method [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ] to quantify the
influence of each feature, as it is one of the most widely used methods for feature importance estimation.
However, any feature importance method can be applied in this context. To ensure comparability, the
raw importance scores are normalized, as they can vary in magnitude. We treat both positive and
negative scores equally by squaring them, which avoids cancellations and enables the identification of
similarities and differences between the instance and its prototype:
          </p>
          <p>φ̂_i(h, x) = (φ_i(h, x))² / Σ_{k=1}^{d} (φ_k(h, x))² ,   φ̂_i(h, p_j) = (φ_i(h, p_j))² / Σ_{k=1}^{d} (φ_k(h, p_j))² .   (1)</p>
          <p>The weight of the i-th feature is then defined as the element-wise product of the normalized importance scores:</p>
          <p>w_i = φ̂_i(h, x) · φ̂_i(h, p_j) .   (2)</p>
          <p>Finally, a binary mask selects the most relevant shared features, i.e., those whose weight exceeds the mean weight:</p>
          <p>m_i = 𝟙[ w_i &gt; (1/d) · Σ_{k=1}^{d} w_k ] .   (3)</p>
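The squaring, normalization, element-wise weighting, and mean-threshold masking described in this section can be sketched in a few lines of NumPy. This is an illustrative reimplementation, not the authors' released code; `alike_parts` is a hypothetical helper name, and in practice the raw scores would come from a SHAP explainer:

```python
import numpy as np

def alike_parts(phi_x, phi_p):
    """Identify alike parts between an instance and its nearest prototype.

    phi_x, phi_p: 1-D arrays of raw feature importance scores (e.g. SHAP
    values) for the instance and the prototype. Returns the weights and
    the binary mask over features.
    """
    phi_x = np.asarray(phi_x, dtype=float)
    phi_p = np.asarray(phi_p, dtype=float)
    # Square the scores to treat positive and negative influence equally,
    # then normalize so the scores of each vector sum to one.
    norm_x = phi_x**2 / np.sum(phi_x**2)
    norm_p = phi_p**2 / np.sum(phi_p**2)
    # Element-wise product of the normalized importances (the weights w_i).
    w = norm_x * norm_p
    # Keep features whose weight exceeds the mean weight (the mask m_i).
    mask = (w > w.mean()).astype(int)
    return w, mask
```

Note that the mask threshold is the mean weight, so at least one feature is always selected whenever the weights are not all equal.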
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. New definition of optimization problem</title>
        <p>
          The prototype selection algorithms discussed in this paper, such as [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ], define the task of identifying
representative data points as a k-medoids problem, which is solved using a greedy approximation
algorithm. Typically, the k-medoids problem minimizes a distance function dist between each training
example x_i and its nearest prototype p. This is expressed as follows:
        </p>
        <p>L(P) = Σ_{i=1}^{|D|} min_{p∈P} dist(x_i, p) ,   (4)</p>
        <p>
          where |D| refers to the cardinality of the training set. The choice of the distance function
dist varies between different algorithms. In neural network-based approaches, it can be a dot product
between trainable embeddings [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], or in tree ensembles, a specialized tree distance metric [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ].
        </p>
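A minimal sketch of such a greedy k-medoid approximation, assuming a generic pairwise distance function over raw instances; `greedy_medoids` is a hypothetical name, and the cited algorithms [6, 7] actually operate in the tree space of the ensemble rather than on raw features:

```python
import numpy as np

def greedy_medoids(D, k, dist):
    """Greedy approximation of the k-medoid objective: repeatedly add the
    candidate prototype that most reduces the total distance from each
    instance to its nearest selected prototype."""
    n = len(D)
    # Precompute pairwise distances; candidates are drawn from D itself.
    pair = np.array([[dist(D[i], D[j]) for j in range(n)] for i in range(n)])
    selected = []
    best = np.full(n, np.inf)  # distance of each instance to nearest prototype
    for _ in range(k):
        # Total cost of the prototype set if candidate j were added.
        costs = [np.minimum(best, pair[:, j]).sum() for j in range(n)]
        j = int(np.argmin(costs))
        selected.append(j)
        best = np.minimum(best, pair[:, j])
    return selected
```

For instance, on the 1-D points [0.0, 0.1, 5.0, 5.1] with absolute difference as the distance and k = 2, the greedy procedure picks one medoid from each cluster.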
        <p>To strengthen diversification in feature importance, we propose extending the objective function by
including an additional feature importance component FI, defined as the sum over features of the products of the
normalized feature importance of the i-th feature of instance x and its nearest prototype p_j:</p>
        <p>FI(x, p_j) = Σ_{i=1}^{d} [ (φ_i(h, x))² (φ_i(h, p_j))² / ( Σ_{k=1}^{d} (φ_k(h, x))² · Σ_{k=1}^{d} (φ_k(h, p_j))² ) ] .   (5)</p>
        <p>The FI scores can be calculated once for all x prior to optimization and cached for efficiency. The
revised function incorporates both the minimization of the distance between each instance and its
nearest prototype and an additional term weighted by λ to account for the feature importance score.
The first term promotes that each instance in the dataset is well represented by a prototype, promoting
compact coverage of D by assigning each instance to its closest prototype, while the second encourages
diversification in the feature importance across prototypes. The revised function is formally defined as:</p>
        <p>L'(P) = Σ_{i=1}^{|D|} min_{p∈P} ( dist(x_i, p) + λ · FI(x_i, p) ) .   (6)</p>
        <p>
          This modification enables a more nuanced global prototype selection, with λ balancing distance and
feature importance. The updated formulation improves prototype selection for identifying alike parts.
The proposed method is robust to missing values, assuming that the selected components can handle
them. In this paper, we used prototype selection algorithms [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ] based on RF [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], and SHAP, both of
which natively support missing values. Therefore, the method does not require additional preprocessing
for missing data.
        </p>
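Under the same simplifying assumptions as above, the revised greedy selection can be illustrated by adding the cached λ·FI term to the pairwise cost before the medoid search; all names here are hypothetical placeholders, not the authors' released implementation:

```python
import numpy as np

def fi_term(norm_x, norm_p):
    # FI(x, p): sum of products of normalized squared importances.
    return float(np.sum(norm_x * norm_p))

def greedy_medoids_fi(pair_dist, fi_matrix, k, lam):
    """Greedy selection under the revised objective, where the per-pair
    cost is dist(x_i, p_j) + lam * FI(x_i, p_j). Both pair_dist[i, j]
    and fi_matrix[i, j] are precomputed and cached, as the FI scores
    do not change during optimization."""
    score = pair_dist + lam * fi_matrix
    n = score.shape[0]
    selected, best = [], np.full(n, np.inf)
    for _ in range(k):
        costs = [np.minimum(best, score[:, j]).sum() for j in range(n)]
        j = int(np.argmin(costs))
        selected.append(j)
        best = np.minimum(best, score[:, j])
    return selected
```

With λ = 0 this reduces to the original distance-only objective, which matches the role of λ described in the ablation study.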
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments</title>
      <p>
        As discussed in Section 3.2, the proposed optimization method can be adapted to various algorithms. We
applied this modification to prototype selection algorithms optimizing tree distance: A-PETE, SM-A, and
G-KM [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ], to explain the Random Forest (RF) ensemble [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. All use greedy medoid selection, with
key differences: G-KM selects an equal number of prototypes per class (greedy k-Medoid approximation
computed within classes); SM-A [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] selects the prototype providing the greatest improvement across all
classes; and A-Pete [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] automates this by stopping based on relative improvements (see [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] for pseudo
codes). For evaluation, we use four benchmark datasets that have a subset of globally important features:
Australia Rain2, Breast Cancer3, Diabetes4, and Passenger Satisfaction5; and two, Apple Quality6 and
Wine Quality7, which exhibit high feature importance across all features.
      </p>
      <p>The experiments are organized as follows: Section 4.1 presents examples of alike parts identification
on real data and how extending the optimized function improves this process. Section 4.2 aims to
quantify the quality of the proposed improvements by comparing our modified prototype selection
methods with the original ones, highlighting the impact of our changes. Section 4.3 presents an ablation
study that analyzes the contribution of the λ factor to algorithm performance.</p>
      <sec id="sec-4-1">
        <title>4.1. Studying the methods in action</title>
        <p>
          Finding alike parts on real data is shown in Table 1, illustrating how feature importance for both the
instance and prototype is used to compute weights. Table 2 compares how alike parts of an instance
and its nearest prototype are selected using the original (raw) and FI-informed versions of the A-Pete
for the Diabetes dataset. Incorporating feature importance into A-Pete's optimization led to different
selections than the raw algorithm when generating prototypes from black-box RF [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ].
        </p>
        <p>
          For example, when using the prototype from raw A-PETE, only Glucose is highlighted as a
feature important for both the instance and prototype. Meanwhile, the FI-informed algorithm also
highlights Diabetes Pedigree Function and Age, which aligns with established medical knowledge on
diabetes risk factors [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. This demonstrates the potential of our method to facilitate the identification
of more meaningful relationships between instances and prototypes.
        </p>
        <p>A visual comparison of the globally generated sets of prototypes and selected important attributes
for the Diabetes dataset is presented in Figure 1. The figure contrasts prototypes generated using
the original (raw) A-Pete algorithm with those generated using the FI-informed approach. The figure
demonstrates that the FI-informed algorithm yields more diversified prototypes that highlight parts
varying between prototypes – the sixth feature was selected as important only when FI was included in
the target function. A similar phenomenon was observed for Australia Rain and Breast Cancer – certain
features were considered significant only when using the FI-informed version of the algorithm.</p>
        <p>The proposed approach was validated on the test subset of each dataset to quantitatively compare
the frequency of features identified as important. In Figure 2, which presents the results, one can observe
that the frequency with which each feature is highlighted differs between the original and FI-informed strategies.
This difference is especially noticeable for the G-KM algorithm: when prototypes are selected using the
FI-informed strategy, certain features are highlighted that were not emphasized by the raw algorithm.</p>
        <sec id="sec-4-1-1">
          <title>2https://www.kaggle.com/datasets/jsphyg/weather-dataset-rattle-package</title>
          <p>3https://www.kaggle.com/datasets/rahmasleam/breast-cancer
4https://www.kaggle.com/datasets/mathchi/diabetes-data-set
5https://www.kaggle.com/datasets/teejmahal20/airline-passenger-satisfaction
6https://www.kaggle.com/datasets/nelgiriyewithana/apple-quality
7https://www.kaggle.com/datasets/taweilo/wine-quality-dataset-balanced-classification</p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Predictive performance in comparison to original versions of the algorithms</title>
        <p>
          This section compares the accuracy achieved by a surrogate model based on prototypes, as it was
done in [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ]. The surrogate model uses a 1-nearest neighbor (1-NN) search within the set of selected
prototypes and is evaluated on classifying instances from a test set. We specifically examine the impact
of our modified prototype selection method, which incorporates feature importance.
        </p>
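The surrogate evaluation can be sketched as follows, assuming Euclidean distance for the 1-NN search (the evaluation distance is not specified in this section); `surrogate_accuracy` is a hypothetical helper name:

```python
import numpy as np

def surrogate_accuracy(prototypes, proto_labels, X_test, y_test):
    """1-NN surrogate evaluation: assign each test instance the label of
    its nearest prototype and report accuracy against the true labels."""
    prototypes = np.asarray(prototypes, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    # Euclidean distance from every test instance to every prototype.
    d = np.linalg.norm(X_test[:, None, :] - prototypes[None, :, :], axis=-1)
    pred = np.asarray(proto_labels)[d.argmin(axis=1)]
    return float(np.mean(pred == np.asarray(y_test)))
```

A higher surrogate accuracy indicates that the selected prototypes cover the decision regions of the black-box model well.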
        <p>Figure 3 illustrates how algorithm-specific hyperparameters and the weighting factor λ influence
prototype selection and consequently impact accuracy, with λ controlling the extent to which feature
importance is incorporated into the optimization function. The results show that the modified approach
maintains or improves predictive performance with respect to the main parameters. Similar information is
presented in Table 3, where the values corresponding to the accuracy optima found are presented for
the original and the FI-incorporated algorithms.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Ablation study</title>
        <p>Here, we analyze the impact of the parameter λ on prototype selection by examining how it
influences the alikeness between an explained instance and its prototype. Figure 4 shows how mean
feature importance and alike-part length vary with λ. The results indicate that as λ increases, the
mean feature importance similarity tends to rise, suggesting that a high λ encourages the selection of
prototypes that align more closely with important features of the explained instance. However, this
trend is not strictly monotonic and careful tuning is required, with λ ≤ 2.0 often providing a good
balance, although the optimal value depends on the dataset. To determine the optimal value of λ, grid
search or Bayesian optimization can be used to tune λ and other algorithm-specific parameters, aiming
to maximize the accuracy of a surrogate 1-NN model.</p>
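The grid-search tuning mentioned above can be sketched generically; `tune_lambda`, `select_fn`, and `eval_fn` are hypothetical placeholders standing in for the FI-informed prototype selection and the surrogate 1-NN evaluation on a validation split:

```python
def tune_lambda(lams, select_fn, eval_fn):
    """Grid search over the weighting factor lambda: run prototype
    selection for each candidate value and keep the one that maximizes
    the surrogate model's validation accuracy."""
    best_lam, best_acc = None, float("-inf")
    for lam in lams:
        prototypes = select_fn(lam)   # FI-informed prototype selection
        acc = eval_fn(prototypes)     # surrogate 1-NN accuracy
        if acc > best_acc:
            best_lam, best_acc = lam, acc
    return best_lam, best_acc
```

The same loop structure would wrap a Bayesian optimizer by replacing the fixed grid with proposed candidates.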
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>This work introduces an innovative approach to prototype-based explanations, enhancing their
interpretability by directing user attention to the most important features of both the prototype and
the classified instance, the so-called alike parts. By incorporating feature importance into the
prototype selection, our proposal bridges a gap in the literature where these two aspects were previously
considered separately. The experimental results suggest that this integration improves the clarity of
the explanation while preserving and, in some cases, even improving the predictive accuracy (see
Section 4.2). Incorporating feature importance leads to selecting prototypes with different, often more
meaningful, alike parts. This was shown with the Diabetes dataset, where our method identified
features such as Age and Pedigree Function as crucial, aligning with established medical knowledge
(see Section 4.1). Moreover, it can extend beyond the tested algorithms, G-KM, SM-A, A-PETE, and a
black-box RF. Importantly, Section 4.3 shows that adjusting the weighting factor λ fine-tunes the balance
between feature importance and distance minimization, highlighting adaptability to different tasks.
Future research should explore its effectiveness from the user perspective, assessing whether these
explanations enhance human understanding of model decisions. Furthermore, evaluating the approach
on non-tabular modalities, such as images and text, is necessary to assess its broader applicability.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This research was funded in part by the National Science Centre, Poland, OPUS grant no.
2023/51/B/ST6/00545, and in part by the PUT SBAD grant 0311/SBAD/0752.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Bodria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Giannotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Guidotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Naretto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pedreschi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rinzivillo</surname>
          </string-name>
          ,
          <article-title>Benchmarking and survey of explanation methods for black box models</article-title>
          ,
          <source>Data Mining and Knowledge Discovery</source>
          <volume>37</volume>
          (
          <year>2023</year>
          )
          <fpage>1719</fpage>
          -
          <lpage>1778</lpage>
          . doi:10.1007/s10618-023-00933-9.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>O. Menis</given-names>
            <surname>Mastromichalakis</surname>
          </string-name>
          , G. Filandrianos,
          <string-name>
            <given-names>J.</given-names>
            <surname>Liartis</surname>
          </string-name>
          , E. Dervakos, G. Stamou,
          <article-title>Semantic prototypes: Enhancing transparency without black boxes</article-title>
          ,
          <source>in: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management</source>
          , CIKM '24, Association for Computing Machinery, New York, NY, USA,
          <year>2024</year>
          , p.
          <fpage>1680</fpage>
          -
          <lpage>1688</lpage>
          . doi:10.1145/3627673.3679795.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. J.</given-names>
            <surname>Barnett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          ,
          <article-title>This looks like that: deep learning for interpretable image recognition</article-title>
          ,
          <source>in: Proceedings of the 33rd International Conference on Neural Information Processing Systems</source>
          , Curran Associates Inc., Red Hook, NY, USA,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.-M.</given-names>
            <surname>Schleif</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Villmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hammer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <article-title>Efficient kernelized prototype based classification</article-title>
          ,
          <source>International Journal of Neural Systems</source>
          <volume>21</volume>
          (
          <year>2011</year>
          )
          <fpage>443</fpage>
          -
          <lpage>457</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , H. Xiao,
          <string-name>
            <given-names>R.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>K-nearest neighbors rule combining prototype selection and local feature weighting for classification</article-title>
          ,
          <source>Knowledge-Based Systems</source>
          <volume>243</volume>
          (
          <year>2022</year>
          )
          <fpage>108451</fpage>
          . doi:10.1016/j.knosys.2022.108451.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Soloviev</surname>
          </string-name>
          , G. Hooker, M. T. Wells,
          <article-title>Tree space prototypes: Another look at making tree ensembles interpretable</article-title>
          ,
          <source>in: Proceedings of the 2020 ACM-IMS on Foundations of Data Science Conference, FODS '20</source>
          ,
          <year>2020</year>
          , p.
          <fpage>23</fpage>
          -
          <lpage>34</lpage>
          . doi:10.1145/3412815.3416893.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Karolczak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Stefanowski</surname>
          </string-name>
          ,
          A-PETE:
          <article-title>Adaptive prototype explanations of tree ensembles</article-title>
          ,
          <source>in: Progress in Polish Artificial Intelligence Research</source>
          , volume
          <volume>5</volume>
          , Warsaw University of Technology,
          <year>2024</year>
          , pp.
          <fpage>2</fpage>
          -
          <lpage>8</lpage>
          . URL: https://pages.mini.pw.edu.pl/~estatic/pliki/PP-RAI_2024_proceedings.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Lundberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.-I.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>A unified approach to interpreting model predictions</article-title>
          ,
          <source>in: Proceedings of the 31st International Conference on Neural Information Processing Systems</source>
          , NIPS'17, Curran Associates Inc.,
          <year>2017</year>
          , p.
          <fpage>4768</fpage>
          -
          <lpage>4777</lpage>
          . doi:10.5555/3295222.3295230.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Biehl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hammer</surname>
          </string-name>
          , T. Villmann,
          <article-title>Prototype-based models in machine learning</article-title>
          ,
          <source>WIREs Cognitive Science</source>
          <volume>7</volume>
          (
          <year>2016</year>
          )
          <fpage>92</fpage>
          -
          <lpage>111</lpage>
          . doi:10.1002/wcs.1378.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>O.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          ,
          <article-title>Deep learning for case-based reasoning through prototypes: a neural network that explains its predictions</article-title>
          ,
          <source>in: Proceedings of the 32nd AAAI Conference on Artificial Intelligence</source>
          , AAAI Press,
          <year>2018</year>
          . doi:10.5555/3504035.3504467.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L.</given-names>
            <surname>Breiman</surname>
          </string-name>
          , Random forests,
          <source>Machine Learning</source>
          <volume>45</volume>
          (
          <year>2001</year>
          )
          <fpage>5</fpage>
          -
          <lpage>32</lpage>
          . doi:10.1023/A:1010933404324.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kautzky-Willer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Harreiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pacini</surname>
          </string-name>
          ,
          <article-title>Sex and gender differences in risk, pathophysiology and complications of type 2 diabetes mellitus</article-title>
          ,
          <source>Endocrine Reviews</source>
          <volume>37</volume>
          (
          <year>2016</year>
          )
          <fpage>278</fpage>
          -
          <lpage>316</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>