<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>AnyCBMs: How to Turn Any Black Box into a Concept Bottleneck Model</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gabriele Dominici</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pietro Barbiero</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Giannini</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Gjoreski</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marc Langheinrich</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Università della Svizzera Italiana</institution>
          ,
          <addr-line>Lugano</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Università di Siena</institution>
          ,
          <addr-line>Siena</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Interpretable deep learning aims at developing neural architectures whose decision-making processes can be understood by their users. Among these techniques, Concept Bottleneck Models enhance the interpretability of neural networks by integrating a layer of human-understandable concepts. These models, however, necessitate training a new model from scratch, consuming significant resources and failing to utilize already trained large models. To address this issue, we introduce “AnyCBM”, a method that transforms any existing trained model into a Concept Bottleneck Model with minimal impact on computational resources. We provide both theoretical and experimental insights showing the effectiveness of AnyCBMs in terms of classification performance and the effectiveness of concept-based interventions on downstream tasks.</p>
      </abstract>
      <kwd-group>
        <kwd>Interpretability</kwd>
        <kwd>Explainable AI</kwd>
        <kwd>Concept Learning</kwd>
        <kwd>Concept Bottleneck Models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Numerous national and international regulatory frameworks underscore the transformative
potential of artificial intelligence (AI). However, they also warn of the inherent risks associated
with such powerful technology, emphasizing the importance of careful monitoring and strict
protections. For instance, the recent AI Act [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] aims to implement detailed regulations for AI
systems, ensuring their safety, transparency, and accountability. Similarly, in the US, the federal
government issued an executive order that proposes principles for trustworthy AI. Hence,
interpretable AI has become a crucial aspect of modern machine learning to address concerns
over the opaque nature of deep learning (DL) models [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. The quest for transparency has been
driven by the need to understand the decision-making processes of AI systems, particularly
in critical areas where ethical [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and legal [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] implications of these systems’ decisions are
significant.
      </p>
      <p>
        Concept Bottleneck Models (CBMs) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] are a family of differentiable models aiming to
increase DL interpretability [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. These models map input data (e.g., pixel intensities) to
human-understandable concepts (e.g., shapes, colors), and then use these concepts to predict the labels of
a downstream classification task. However, existing CBMs necessitate training a new model
from scratch even in settings where trained or fine-tuned models already exist. In these
scenarios, current CBM architectures would consume significant resources re-training or
fine-tuning possibly large models. As a result, this limitation restricts CBMs’ ability
to be adopted in new domains. To bridge this gap, we introduce Any Concept Bottleneck
Models (AnyCBMs, Figure 1), a method to transform any black-box neural architecture into an
interpretable CBM. The key innovation of AnyCBMs lies in a neural model mapping black-box
embeddings into a set of supervised concepts and then mapping the predicted concepts back to
black-box embeddings. This allows AnyCBMs to be applied to any layer of a trained black box
and to perform concept-based interventions as in standard CBMs. Results demonstrate that
AnyCBMs match black-box performance in classification accuracy on downstream tasks and
CBM performance in concept accuracy. In addition, AnyCBMs can steer the behaviour of a
black-box model by acting on human-understandable concepts as effectively as CBMs.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>
        Concept-based models f : C → Y learn a map from a concept space C to a task space Y [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
If concepts are semantically meaningful, then humans can interpret this mapping by tracing
predictions back to the most relevant concepts [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. When the features of the input space are
hard for humans to reason about (such as pixel intensities), concept-based models work on the
output of a concept encoder g : X → C mapping the input space X to the concept space
C [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. These architectures are known as Concept Bottleneck Models (CBMs) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. In general,
training a CBM may require a dataset where each sample consists of input features
x ∈ X ⊆ R^d (e.g., an image’s pixels), ground-truth concepts c ∈ C ⊆ {0, 1}^k (i.e., a binary
vector with concept annotations, when available), and task labels y ∈ Y ⊆ {0, 1}^t (e.g., an
image’s classes). During training, a CBM is encouraged to align its predictions to task labels,
i.e., y ≈ ŷ = f(g(x)). Similarly, the concept predictor can be supervised when concept labels
are available, i.e., c ≈ ĉ = g(x). We indicate concept and task predictions as ĉ = g(x) and
ŷ = f(ĉ), respectively. When concept labels are not available, they can still be extracted
with unsupervised techniques [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ], which makes CBMs applicable to a wide range of
applications.
      </p>
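      <p>To make the notation above concrete, the two-stage pipeline ĉ = g(x), ŷ = f(ĉ) can be sketched in a few lines of numpy. This is a toy illustration with randomly initialized, hypothetical weights, not the architecture used in this paper:</p>
      <preformat>
```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Toy concept encoder g : X -> C (4 input features -> 3 concept scores).
W_g = rng.normal(size=(4, 3))
def g(x):
    return sigmoid(x @ W_g)

# Toy task predictor f : C -> Y (3 concepts -> 2 task scores).
W_f = rng.normal(size=(3, 2))
def f(c):
    return sigmoid(c @ W_f)

x = rng.normal(size=(5, 4))   # batch of 5 inputs
c_hat = g(x)                  # concept predictions, each in [0, 1]
y_hat = f(c_hat)              # task predictions routed through the bottleneck
```
      </preformat>
      <p>Because every task prediction passes through c_hat, editing a concept value directly changes the downstream prediction, which is what makes concept-based interventions possible.</p>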
    </sec>
    <sec id="sec-3">
      <title>3. AnyCBM: Turning Black Boxes into Concept Bottleneck</title>
    </sec>
    <sec id="sec-4">
      <title>Models</title>
      <p>AnyCBM (Figure 1) is a method designed to convert any opaque neural network architecture
into an interpretable Concept Bottleneck Model (CBM). The fundamental innovation of
AnyCBMs involves the use of an external model that processes embeddings from a trained
black-box model. These embeddings, denoted as h(x) ∈ H(X) ⊆ R^e, are encoded into a set of
supervised concepts c ∈ C. Subsequently, these concepts are mapped back into embeddings
h′(x) ∈ H′(X) ⊆ R^e. This process allows the embedding space of the black-box model to be
translated into a more understandable and interpretable form, where each concept represents a
meaningful feature or characteristic that explains the decision-making process of the neural
network. The following definition formalizes AnyCBMs.</p>
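      <p>A minimal numpy sketch of this construction follows; the decomposition of the black box into an embedding stage and remaining layers, together with all weights and function names, are hypothetical placeholders rather than the authors’ implementation:</p>
      <preformat>
```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)

# Frozen black box: an embedding stage h and the remaining layers t.
W_h = rng.normal(size=(4, 8))     # h : X -> H(X), 8-dim embeddings
W_t = rng.normal(size=(8, 2))     # t : H(X) -> Y, remaining layers
def h(x):
    return np.tanh(x @ W_h)
def t(e):
    return sigmoid(e @ W_t)

# AnyCBM module attached to the frozen embeddings.
W_psi = rng.normal(size=(8, 3))   # concept predictor: H(X) -> C
W_phi = rng.normal(size=(3, 8))   # task encoder: C -> H(X)
def psi(e):
    return sigmoid(e @ W_psi)
def phi(c):
    return np.tanh(c @ W_phi)

x = rng.normal(size=(5, 4))
e = h(x)                 # black-box embeddings (black-box weights untouched)
c_hat = psi(e)           # interpretable concept layer
y_hat = t(phi(c_hat))    # prediction routed through the concepts
```
      </preformat>
      <p>In training, psi and phi would be fit so that phi(psi(h(x))) stays close to the black box’s own embeddings while c_hat matches the concept labels; here they are left random purely to show the data flow.</p>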
      <p>Definition 3.1 (AnyCBM). Given a black box model f : X → Y, with intermediate embeddings
h(x) ∈ H(X) and h′(x) ∈ H′(X) connected by the black-box transformation t : H(X) → H′(X),
and a set of concepts C, an AnyCBM is a tuple of models (ψ, φ) such that the following diagram
commutes:</p>
      <p>H(X) −→ψ C −→φ H′(X),  i.e.  φ ∘ ψ = t.</p>
      <p>More specifically, the concept predictor ψ : H(X) → C encodes black box embeddings into
concepts, and the task encoder φ : C → H′(X) maps concepts back into black box embeddings.
In practice, the commutative diagram describes how the interpretable mapping through C
via ψ and φ should be consistent with the direct transformation t of the black box. Also,
properties and capabilities of AnyCBMs can directly be derived from the commutative diagram,
as it constrains the relationships among the transformations ψ, φ, and t.</p>
      <p>In the following we present two practical case studies.</p>
      <p>Case 1: t is the identity function on H(X). When t is the identity function, t(h(x)) = h(x) for
all h(x) ∈ H(X), and H(X) = H′(X). The diagram simplifies, and we have:</p>
      <p>φ ∘ ψ = id</p>
      <p>Theorem 3.2. If t is the identity function on H(X), then φ is injective:</p>
      <p>t = id =⇒ φ : C ˓→ H(X)
(1)</p>
      <p>Proof. Assume φ(c1) = φ(c2). Since ψ is surjective, there exist h1, h2 ∈ H(X) such that
ψ(h1) = c1 and ψ(h2) = c2. Then,</p>
      <p>h1 = φ(ψ(h1)) = φ(c1) = φ(c2) = φ(ψ(h2)) = h2.</p>
      <p>Thus, c1 = c2, proving that φ is injective.</p>
      <p>
        Significance: This property implies that φ can uniquely reconstruct elements of H(X) from
C, despite ψ not being injective. For example, if ψ represents a lossy compression, then φ
could be an error-correcting decoding where no information is lost despite compression.
Case 2: independent training. In many practical cases, concept predictors and task encoders
are independently trained to reduce concept leakage [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. In this common setting, we can prove
another property of AnyCBMs’ task encoders.
      </p>
      <p>Theorem 3.3. If ψ and φ are independently trained and t is a multi-layer neural network, then
φ cannot be surjective.</p>
      <p>Proof. Assume for contradiction that φ is surjective. The surjectivity of φ would require
that every point in H′(X) is the image of some point in C. Given the independent training, the
domain of φ is finite, specifically the 2^k binary concept vectors. Since H′(X) ⊆ R^e, the mapping
φ : C → H′(X) would have to cover a continuum from a set with finite cardinality 2^k, which is a
contradiction. Hence, φ cannot be surjective.</p>
      <p>Significance: This theorem indicates that the surjectivity of φ depends on the way we
train the concept bottleneck. This means that, under independent training, AnyCBMs are not
invertible, even when t represents an invertible transformation.</p>
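      <p>The cardinality argument can be checked numerically: over binary concept vectors, any task encoder has an image with at most 2^k points and therefore cannot cover a continuous embedding space. A small sketch under these assumptions, with an arbitrary toy map standing in for the task encoder:</p>
      <preformat>
```python
import numpy as np
from itertools import product

rng = np.random.default_rng(2)
k, e = 3, 8                       # 3 binary concepts, 8-dim embeddings
W_phi = rng.normal(size=(k, e))   # toy task encoder phi : C -> H(X)

def phi(c):
    return np.tanh(np.asarray(c) @ W_phi)

# Enumerate the entire domain of phi: all 2^k binary concept vectors.
domain = list(product([0.0, 1.0], repeat=k))
image = {tuple(np.round(phi(c), 6)) for c in domain}

# The image is finite (at most 2^k distinct embeddings), so phi cannot
# be surjective onto a continuum of embeddings.
n_domain, n_image = len(domain), len(image)
```
      </preformat>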
    </sec>
    <sec id="sec-5">
      <title>4. Experiments</title>
      <p>Our experiments aim to answer the following questions:
• How does AnyCBMs’ classification performance on concepts and downstream tasks compare
to standard CBMs and black boxes?
• How effective are concept interventions in AnyCBM compared to concept interventions
in CBM?
• Is it possible to train AnyCBM with a dataset slightly different from the one used to train
the black-box model?
This section describes essential information about the experiments.</p>
      <sec id="sec-5-1">
        <title>4.1. Data &amp; task setup</title>
        <p>
          In our experiments, we use two different datasets commonly used to evaluate CBMs: MNIST
even/odd [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], where the task is to predict whether handwritten digits are even or odd; and
CUB [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], where the task is to predict bird species based on bird characteristics.
        </p>
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Evaluation</title>
        <p>
          In our analysis, we use ROC-AUC scores to measure classification performance on concepts and
downstream tasks, and to measure the effectiveness of concept-based interventions in improving
downstream classification performance. To measure the effectiveness of interventions,
we follow an approach similar to the one described by Espinosa Zarlenga et al. [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. First, we
perturb the latent embeddings by adding a small random noise a few layers before predicting
concepts, both in AnyCBM and CBM. Then, we intervene on a portion of the concepts with the
ground truth. Finally, we test our assumption about the possibility of training AnyCBM on a
different dataset with concepts. We train the black-box model on an MNIST even/odd dataset
with RGB images. Then, we train AnyCBM on a version of MNIST that contains greyscale
images with associated concepts. All results are reported using the mean and standard error
over five different runs with different parameter initializations.
        </p>
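        <p>The intervention step described above can be sketched as follows; intervene is a hypothetical helper illustrating the replacement of predicted concepts with ground truth, not the authors’ evaluation code:</p>
        <preformat>
```python
import numpy as np

rng = np.random.default_rng(3)

def intervene(c_pred, c_true, fraction, rng):
    """Replace a random fraction of predicted concept columns with ground truth."""
    k = c_pred.shape[1]
    n_fix = int(round(fraction * k))
    idx = rng.choice(k, size=n_fix, replace=False)
    c_out = c_pred.copy()
    c_out[:, idx] = c_true[:, idx]
    return c_out

c_pred = rng.uniform(size=(5, 10))                       # noisy concept predictions
c_true = rng.integers(0, 2, size=(5, 10)).astype(float)  # ground-truth concepts

c_fixed = intervene(c_pred, c_true, fraction=0.5, rng=rng)
# c_fixed now carries ground-truth values on half of the concept columns;
# feeding it to the task head and re-computing ROC-AUC measures how
# effective the intervention is on the downstream task.
```
        </preformat>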
      </sec>
      <sec id="sec-5-3">
        <title>4.3. Baselines</title>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Key findings</title>
      <p>In our experiments, we compare AnyCBMs with standard CBMs and with an end-to-end
black-box model in terms of generalisation performance. We compare AnyCBMs’ interventions with
the effectiveness of interventions in standard CBMs.</p>
      <p>AnyCBMs match black box and CBM performances in terms of classification accuracy
on concepts and downstream tasks (Table 1). AnyCBMs perform just as well as the
original black-box models on which they are based when it comes to accurately completing
tasks. Additionally, the accuracy with which these models handle concepts is equal to that of
other similar Concept Bottleneck Model architectures. This suggests that AnyCBMs could be a
valuable tool for making existing black-box models easier to understand. Using AnyCBMs, we
might be able to explain how these complex models work and, in particular, which
information is encoded inside the layers of the models, making them more transparent and accessible
for further analysis and improvement.</p>
      <p>AnyCBM interventions are as effective as in Concept Bottleneck Models (Figure 2).
AnyCBMs are as responsive to concept-based interventions as standard CBMs. This means
that when concepts predicted by AnyCBMs are manually changed by human experts at test
time, they effectively impact the downstream task accuracy. This finding underlines the ability
of AnyCBMs to interact with domain experts as would be expected of CBMs. In
addition, this represents a successful method to steer the behaviour of the model by modifying
human-understandable concepts.</p>
      <p>[Table 1: Task and Concept ROC-AUC on MNIST even/odd; reported values: 99.8 ± 0.0; 99.8 ± 0.0 and 99.8 ± 0.0; 99.6 ± 0.0 and 98.8 ± 0.3. Row labels not recoverable from the extraction.]</p>
      <p>AnyCBM can be trained on a dataset different from the one used to train the
black-box model (Table 2). One can initially train a black-box model with one dataset, which could
be larger or more beneficial for addressing the downstream task. Subsequently, the AnyCBM
module can be trained on a slightly different dataset that includes concept annotations. As
demonstrated in Table 2, this approach does not compromise the model’s performance in terms
of task accuracy when both the black-box model and AnyCBM are applied to the original
dataset. It also predicts concepts in the original dataset with reasonable accuracy, even when there is
a distribution shift. This indicates that AnyCBM can alleviate a significant constraint of CBMs,
which is the requirement for concept annotations in the dataset used to train the entire model.
In addition, the dataset used to train the AnyCBM module could contain only input and concept
annotations, without the need for label annotations.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Discussion</title>
      <p>Advantages In the age of Large Models with billions of parameters, the development of
solutions that do not require retraining to enhance their capabilities is crucial. AnyCBMs
successfully meet this need, as they do not require the alteration of the weights of a pre-trained
black-box model. This enables any black-box model to acquire the extra features of CBMs, such
as the interpretability of the latent space and the capacity to change the model’s behaviour
through concept interventions. Furthermore, we believe that AnyCBM can be trained using
a dataset that is smaller than the one used to train the original black-box because it has a
consistently smaller number of parameters. Interestingly, the dataset can even be distinct (for
instance, we might train the model with a dataset without concepts while training AnyCBM with
a slightly different dataset that has only concept annotations), mitigating the CBMs’ constraint
of needing concept annotations for the training set used to train the model. Under these
circumstances, it might be intriguing to determine whether certain concepts can be accurately
predicted from the latent embeddings of black-box models. If some concepts are unpredictable,
this could suggest that the black-box models did not grasp that particular concept in the prior
training, either due to the dataset employed or its irrelevant role in task prediction.
Limitations Although the model gains the benefits of CBMs, it also takes on some of their
drawbacks. The primary constraint is the necessity for concept data to train the AnyCBM
component, although this is somewhat alleviated by the reduced need for concept annotations
and the option to utilise an alternate dataset for their extraction.</p>
      <p>
        Future work We underscore the importance of delving deeper into AnyCBM and its benefits,
while also trying to mitigate its drawbacks. For example, it would be intriguing to examine its
application in multimodal contexts, where automatic concept extraction could be feasible, as
suggested in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
    </sec>
    <sec id="sec-8">
      <title>7. Conclusion</title>
      <p>This paper introduces Any Concept Bottleneck Models (AnyCBMs), a method for transforming
opaque neural networks into interpretable Concept Bottleneck Models (CBMs), allowing for
insights into the decision-making process in terms of concept-based explanations and
interventions. This paper analyses practical case studies that demonstrate the properties and limitations
of AnyCBMs in enhancing interpretability while maintaining high classification performance,
from both a theoretical and an experimental perspective. These results suggest that AnyCBMs
could represent a computationally effective solution to enhance the interpretability of existing
trained or fine-tuned black-box neural networks, also allowing for concept-based interventions
in the black-box latent space.</p>
      <p>This study was funded by TRUST-ME (project 205121L_214991), SmartCHANGE (GA No.
101080965) and XAI-PAC (PZ00P2_216405) projects.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>T.</given-names>
            <surname>Madiega</surname>
          </string-name>
          ,
          <source>Artificial intelligence act</source>
          ,
          <source>European Parliament: European Parliamentary Research Service</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bussone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Stumpf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>O'Sullivan</surname>
          </string-name>
          ,
          <article-title>The role of explanations on trust and reliance in clinical decision support systems</article-title>
          , in: 2015 international conference on healthcare informatics, IEEE,
          <year>2015</year>
          , pp.
          <fpage>160</fpage>
          -
          <lpage>169</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          ,
          <article-title>Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead</article-title>
          ,
          <source>Nature Machine Intelligence</source>
          <volume>1</volume>
          (
          <year>2019</year>
          )
          <fpage>206</fpage>
          -
          <lpage>215</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Durán</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. R.</given-names>
            <surname>Jongsma</surname>
          </string-name>
          ,
          <article-title>Who is afraid of black box algorithms? On the epistemological and ethical basis of trust in medical AI</article-title>
          ,
          <source>Journal of Medical Ethics</source>
          <volume>47</volume>
          (
          <year>2021</year>
          )
          <fpage>329</fpage>
          -
          <lpage>335</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lo Piano</surname>
          </string-name>
          ,
          <article-title>Ethical principles in machine learning and artificial intelligence: cases from the field and possible ways forward</article-title>
          ,
          <source>Humanities and Social Sciences Communications</source>
          <volume>7</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P. W.</given-names>
            <surname>Koh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. S.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mussmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Pierson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <article-title>Concept bottleneck models</article-title>
          ,
          <source>in: International Conference on Machine Learning, PMLR</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>5338</fpage>
          -
          <lpage>5348</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ghorbani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Abid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zou</surname>
          </string-name>
          ,
          <article-title>Interpretation of neural networks is fragile</article-title>
          ,
          <source>in: Proceedings of the AAAI conference on artificial intelligence</source>
          , volume
          <volume>33</volume>
          ,
          <year>2019</year>
          , pp.
          <fpage>3681</fpage>
          -
          <lpage>3688</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.-K.</given-names>
            <surname>Yeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Arik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.-L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Pfister</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ravikumar</surname>
          </string-name>
          ,
          <article-title>On completeness-aware concept-based explanations in deep neural networks</article-title>
          ,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>33</volume>
          (
          <year>2020</year>
          )
          <fpage>20554</fpage>
          -
          <lpage>20565</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ghorbani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wexler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <article-title>Towards automatic concept-based explanations</article-title>
          ,
          <source>arXiv preprint arXiv:1902.03129</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Magister</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kazhdan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liò</surname>
          </string-name>
          , Gcexplainer:
          <article-title>Human-in-the-loop concept-based explanations for graph neural networks</article-title>
          ,
          <source>arXiv preprint arXiv:2107.11889</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Oikarinen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.-W.</given-names>
            <surname>Weng</surname>
          </string-name>
          ,
          <article-title>Label-free concept bottleneck models</article-title>
          ,
          <year>2023</year>
          . arXiv:2304.06129.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mahinpei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Lage</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Doshi-Velez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <article-title>Promises and pitfalls of black-box concept learning models</article-title>
          ,
          <source>arXiv preprint arXiv:2106.13314</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P.</given-names>
            <surname>Barbiero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ciravegna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Giannini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lió</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Melacci</surname>
          </string-name>
          ,
          <article-title>Entropy-based logic explanations of neural networks</article-title>
          ,
          <source>in: Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>36</volume>
          ,
          <year>2022</year>
          , pp.
          <fpage>6046</fpage>
          -
          <lpage>6054</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C.</given-names>
            <surname>Wah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Branson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Welinder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Perona</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Belongie</surname>
          </string-name>
          ,
          <article-title>The caltech-ucsd birds-200-2011 dataset</article-title>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Espinosa Zarlenga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Barbiero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ciravegna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Marra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Giannini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Diligenti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Shams</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Precioso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Melacci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Weller</surname>
          </string-name>
          , et al.,
          <article-title>Concept embedding models</article-title>
          ,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>35</volume>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>