<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Neurosymbolic Framework for Bias Correction in Convolutional Neural Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Parth Padalkar</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Natalia Ślusarz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ekaterina Komendantskaya</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gopal Gupta</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Heriot-Watt University</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Southampton University</institution>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Texas at Dallas</institution>
          ,
          <addr-line>Richardson</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Recent efforts in interpreting Convolutional Neural Networks (CNNs) focus on translating the activation of CNN filters into a stratified Answer Set Program (ASP) rule-set. The CNN filters are known to capture high-level image concepts, so each predicate in the rule-set is mapped to the concept that its corresponding filter represents. Hence, the rule-set exemplifies the decision-making process of the CNN w.r.t. the concepts that it learns for any image classification task. These rule-sets help in understanding the biases of CNNs, although correcting the biases remains a challenge. We introduce a neurosymbolic framework called NeSyBiCor for bias correction in a trained CNN. Given symbolic concepts, as ASP constraints, that the CNN is biased towards, we convert the concepts to their corresponding vector representations. Then, the CNN is retrained using our novel semantic similarity loss that pushes the filters away from (or towards) learning the undesired (or desired) concepts. The final ASP rule-set obtained after retraining satisfies the constraints to a high degree, thus showing the revision in the knowledge of the CNN. We demonstrate that our NeSyBiCor framework successfully corrects the biases of CNNs trained with subsets of classes from the Places dataset, while sacrificing minimal accuracy and improving interpretability.</p>
      </abstract>
      <kwd-group>
        <kwd>Neurosymbolic AI</kwd>
        <kwd>CNN</kwd>
        <kwd>Semantic Loss</kwd>
        <kwd>Answer Set Programming</kwd>
        <kwd>XAI</kwd>
        <kwd>Representation Learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Deep learning models can be biased based on the training data. One infamous illustration of this
bias is exemplified by the “wolf in the snow” problem [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], where convolutional neural networks
(CNNs) erroneously identify a husky as a wolf due to the presence of snow in the background.
This happened because they learnt to associate ‘snow’ with ‘wolf’ based on the training data.
This bias can lead to dire consequences if such models are deployed in sensitive scenarios such as disease
diagnosis ([
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]) and autonomous vehicle operation ([
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]).
      </p>
      <p>
        Recent works have shown that it is possible to obtain the knowledge of a trained CNN in the
form of a symbolic rule-set, more specifically as a stratified Answer Set Program ([
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]). The
authors proposed NeSyFOLD, a framework where the activation of filters in the final CNN layer
represents predicate truth values in the rule-set, revealing concepts learned by the model and
their relation to the target class. CNN filters, which are n × n matrices, capture image concepts.
Predicates are labeled according to the concepts identified by these filters. The FOLD-SE-M [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
Rule-Based Machine Learning (RBML) algorithm is used to extract the rules from the last-layer
filters.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Methodology</title>
      <p>We introduce the NeSyBiCor (Neuro-Symbolic Bias Correction) framework to aid in correcting
pre-identified biases of a CNN. The ASP rule-set that NeSyFOLD generates from the
bias-corrected CNN serves to validate the effectiveness of the framework. The pre-identified biases
are presented as semantic constraints based on concepts that should and should not be used
to predict the class of an image. These concepts can be selected by scrutinizing the
rule-set generated by NeSyFOLD. We map the undesirable/desirable semantic concepts to their
corresponding vector representations learnt by the filters. Next, we retrain the CNN with a
novel semantic similarity loss function which is a core component of our framework. The loss
function is designed to minimize the similarity of the representations learnt by the filters with
the undesirable concepts and maximize the similarity to the desirable ones. Once the retraining
is complete, we use the NeSyFOLD framework again to extract the refined rule-set. Hence, our
approach provides a way to refine a given ASP rule-set subject to some semantic constraints.
Figure 1 illustrates the framework.</p>
      <p>Computing Concept Representation Vectors: The first step is to obtain concept representation
vectors for each desired and undesired concept specified in the semantic constraints of each
target class. A CNN filter can be flattened into a vector representing the detected patterns. For
example, in the rule-set (blue box) shown in Fig. 1, the filter associated with the predicate ‘sky1/1’
detects blue sky patterns, while ‘sky2/1’ detects evening sky patterns in desert road images. To
compute the concept representation vector for sky, we calculate the filter representation vectors
for all predicates containing sky in their name that are positively linked to the ‘desert road’ class;
here, ‘sky1/1’ and ‘sky2/1’ are two such predicates. Their filter representation vectors
are computed by averaging the output vectors of the top-10 images these filters activate in the
training set. The final concept representation vector for sky is the mean of the ‘sky1/1’ and
‘sky2/1’ vectors. This process is repeated for every desired and undesired concept.
Calculating the Semantic Similarity Loss: The semantic similarity loss ℒ_sim, for a training
set with N images and a CNN with k filters in the last layer, is defined as:</p>
      <p>ℒ_sim = ∑_{i=1}^{N} ∑_{j=1}^{k} [ α ∑_{b ∈ B} cos_sim(r_ij, r_b) − β ∑_{g ∈ G} cos_sim(r_ij, r_g) ]    (1)
Here, cos_sim calculates the cosine similarity between two vectors. r_ij represents the filter
output from the j-th filter of the i-th image, while r_b and r_g are the concept representation
vectors for undesired concepts b ∈ B and desired concepts g ∈ G, respectively. α and β are
hyperparameters that balance the influence of these terms.</p>
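      <p>To make Eq. (1) concrete, the following is a minimal NumPy sketch of the semantic similarity loss; the function and variable names are ours, and a real implementation would use the training framework's differentiable tensor operations rather than plain NumPy:</p>
      <p>
```python
import numpy as np

def cos_sim(u, v):
    """Cosine similarity between two 1-D vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def semantic_similarity_loss(filter_outputs, undesired, desired, alpha=1.0, beta=1.0):
    """Eq. (1): summed over N images and k last-layer filters,
    alpha * (similarity to undesired concept vectors, b in B)
    minus beta * (similarity to desired concept vectors, g in G)."""
    loss = 0.0
    for image in filter_outputs:        # i = 1..N images
        for r_ij in image:              # j = 1..k flattened filter outputs
            loss += alpha * sum(cos_sim(r_ij, r_b) for r_b in undesired)
            loss -= beta * sum(cos_sim(r_ij, r_g) for r_g in desired)
    return loss

# A filter output aligned with an undesired concept raises the loss,
# while alignment with a desired concept lowers it.
outputs = [[[1.0, 0.0]]]                # one image, one filter
print(semantic_similarity_loss(outputs, undesired=[[1.0, 0.0]], desired=[[0.0, 1.0]]))  # → 1.0
```
      </p>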
      <p>The loss increases when filter vectors resemble undesired concepts and decreases when they
are closer to desired concepts, similar to the word2vec loss function. As training progresses, the
filters are encouraged to learn desired concepts while avoiding undesired ones. The total loss
is defined as the sum of the cross-entropy loss ℒ_CE and the semantic similarity loss ℒ_sim, i.e., ℒ_total = ℒ_CE + ℒ_sim.
Recalibrating the Concept Representation Vectors: We propose rectifying all the concept
representation vectors for each class after every t epochs during training. We do this by running
the NeSyFOLD framework after every t epochs and obtaining a new rule-set from the partially
retrained CNN. We then aggregate the new concept representation vectors with the old concept
representation vectors by taking their mean.</p>
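      <p>The vector computations described above can be sketched as follows; the helper names are ours (not NeSyFOLD's API), and ranking the top-10 activating images by the norm of the filter output is an assumption made for illustration:</p>
      <p>
```python
import numpy as np

def filter_representation(per_image_outputs, top_n=10):
    """Representation vector of one filter: the mean of its flattened output
    vectors over the top-n training images that activate it most strongly
    (ranked here by output norm -- an illustrative choice)."""
    ranked = sorted(per_image_outputs, key=lambda v: float(np.linalg.norm(v)), reverse=True)
    return np.mean(ranked[:top_n], axis=0)

def concept_representation(filter_vectors):
    """Representation of a concept (e.g. 'sky'): the mean of the representation
    vectors of all filters whose predicates mention it (e.g. sky1/1, sky2/1)."""
    return np.mean(filter_vectors, axis=0)

def recalibrate(old_concept_vec, new_concept_vec):
    """Aggregation performed every t epochs: average the concept vector obtained
    from the freshly extracted rule-set with the previous one."""
    return (np.asarray(old_concept_vec) + np.asarray(new_concept_vec)) / 2.0

# Toy example: one 'sky' concept built from two filters.
sky1 = filter_representation([[2.0, 0.0], [0.0, 1.0], [4.0, 0.0]], top_n=2)  # mean of top-2 -> [3.0, 0.0]
sky2 = np.array([1.0, 1.0])
sky = concept_representation([sky1, sky2])     # -> [2.0, 0.5]
sky = recalibrate(sky, [0.0, 0.0])             # averaged with a new estimate
```
      </p>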
    </sec>
    <sec id="sec-3">
      <title>3. Experiments and Results</title>
      <p>
        We train a CNN on subsets of the Places [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] dataset and compare the results with the rule-set
obtained using NeSyFOLD before and after the bias correction with our NeSyBiCor framework.
Details of the experiments can be found elsewhere (Padalkar et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]). Figure 2 shows the
initial and bias-corrected rule-sets for two subsets of the Places dataset. The des subset comprises
the ‘desert road’ and ‘street’ classes, and the defs subset comprises the ‘desert road’, ‘forest road’,
and ‘street’ classes.
      </p>
      <p>The undesired concepts for the ‘desert road’ class are ‘sky’ and ‘building’. In RULE-SET 1,
rule 2 uses the ‘sky1/1’ predicate to determine if the image belongs to the ‘desert road’ class.
In the bias-corrected rule-set, RULE-SET 1*, there is no ‘sky’ based predicate. Moreover, the
only predicate positively linked with the ‘desert road’ class is the ‘ground1_road1/1’ predicate,
which is based on the desired concept ‘ground’ and refers to the corresponding filter learning a
pattern comprising specific types of patches of ‘ground’ and ‘road’. Thus, it is clear that at
the end of the bias correction, few if any of the filters learn representations of the ‘sky’ or
‘building’ concepts, hence correcting the “bias” of the CNN towards them while predicting the
‘desert road’ class.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Future Work</title>
      <p>Our framework, in addition to correcting the bias of a CNN, also allows the user to fine-tune
the bias based on general concepts according to their specific needs or application. To the
best of our knowledge, this is the first method that performs bias correction by using the learnt
representations of the CNN filters in a targeted manner. We show through our experiments that
the bias-corrected rule-set is highly effective at avoiding the classification of images based on
undesired concepts. It is also more likely to classify the images based on the desired concepts.</p>
      <p>
        Finally, the work we presented here may be used to extend implementations of loss functions
based on Differentiable Logics [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ].
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Ribeiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <article-title>"why should i trust you?": Explaining the predictions of any classifier</article-title>
          ,
          <year>2016</year>
          . URL: https://arxiv.org/abs/1602.04938. doi:10.48550/ARXIV.1602.04938.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Müezzinoglu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Baygin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Tuncer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. D.</given-names>
            <surname>Barua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Baygin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dogan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Tuncer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. E.</given-names>
            <surname>Palmer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. H.</given-names>
            <surname>Cheong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U. R.</given-names>
            <surname>Acharya</surname>
          </string-name>
          ,
          <article-title>Patchresnet: Multiple patch division-based deep feature fusion framework for brain tumor classification using MRI images</article-title>
          ,
          <year>2023</year>
          . URL: https://doi.org/10.1007/s10278-023-00789-x. doi:10.1007/S10278-023-00789-X.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Barea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Bergasa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Romera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>López-Guillén</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Tradacete</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>López</surname>
          </string-name>
          ,
          <article-title>Integrating state-of-the-art cnns for multi-sensor 3d vehicle detection in real autonomous driving environments</article-title>
          ,
          <source>in: Proc. ITSC</source>
          , IEEE,
          <year>2019</year>
          , pp.
          <fpage>1425</fpage>
          -
          <lpage>1431</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Padalkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <article-title>Nesyfold: A framework for interpretable image classification</article-title>
          ,
          <source>in: Proc. AAAI</source>
          , AAAI Press,
          <year>2024</year>
          , pp.
          <fpage>4378</fpage>
          -
          <lpage>4387</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Padalkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <article-title>Using logic programming and kernel-grouping for improving interpretability of convolutional neural networks</article-title>
          ,
          <source>in: Proc. PADL</source>
          , volume
          <volume>14512</volume>
          <source>of LNCS</source>
          , Springer,
          <year>2024</year>
          , pp.
          <fpage>134</fpage>
          -
          <lpage>150</lpage>
          . URL: https://doi.org/10.1007/978-3-031-52038-9_9. doi:10.1007/978-3-031-52038-9_9.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <article-title>FOLD-SE: an efficient rule-based machine learning algorithm with scalable explainability</article-title>
          ,
          <source>in: Proc. PADL</source>
          , volume
          <volume>14512</volume>
          <source>of LNCS</source>
          , Springer,
          <year>2024</year>
          , pp.
          <fpage>37</fpage>
          -
          <lpage>53</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lapedriza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Khosla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Oliva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Torralba</surname>
          </string-name>
          ,
          <article-title>Places: A 10 million image database for scene recognition</article-title>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Padalkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ślusarz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Komendantskaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <article-title>A neurosymbolic framework for bias correction in cnns</article-title>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fischer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Balunovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Drachsler-Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gehr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vechev</surname>
          </string-name>
          ,
          <article-title>Dl2: training and querying neural networks with logic</article-title>
          ,
          <source>in: International Conference on Machine Learning, PMLR</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>1931</fpage>
          -
          <lpage>1941</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>N.</given-names>
            <surname>Ślusarz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Komendantskaya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Daggitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Stewart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Stark</surname>
          </string-name>
          ,
          <article-title>Logic of differentiable logics: Towards a uniform semantics of DL</article-title>
          ,
          <source>in: 24th International Conference on Logic for Programming</source>
          ,
          <source>Artificial Intelligence and Reasoning</source>
          ,
          <source>LPAR</source>
          <year>2023</year>
          ,
          <year>2023</year>
          , pp.
          <fpage>473</fpage>
          -
          <lpage>493</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Daggitt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Kokke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Atkey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ślusarz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Arnaboldi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Komendantskaya</surname>
          </string-name>
          ,
          <article-title>Vehicle: Bridging the embedding gap in the verification of neuro-symbolic programs</article-title>
          ,
          <year>2024</year>
          . URL: https://doi.org/10.48550/arXiv.2401.06379. doi:10.48550/ARXIV.2401.06379. arXiv:2401.06379.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>