<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Bridging Symbolic and Sub-Symbolic AI: Towards Cooperative Transfer Learning in Multi-Agent Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Matteo Magnini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Ciatto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Omicini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dipartimento di Informatica - Scienza e Ingegneria (DISI), Alma Mater Studiorum-Università di Bologna</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Cooperation and knowledge sharing are of paramount importance in the evolution of an intelligent species. Knowledge sharing requires a set of symbols with a shared interpretation, enabling effective communication in support of cooperation. The engineering of intelligent systems may then benefit from the distribution of knowledge among multiple components capable of cooperation and symbolic knowledge sharing. Accordingly, in this paper, we propose a roadmap for the exploitation of knowledge representation and sharing to foster higher degrees of artificial intelligence. We do so by envisioning intelligent systems as composed of multiple agents capable of cooperative (transfer) learning, Co(T)L for short. In CoL, agents can improve their local (sub-symbolic) knowledge by exchanging (symbolic) information with one another. In CoTL, agents can also learn new tasks autonomously by sharing information about similar tasks. Along this line, we motivate the introduction of Co(T)L and discuss its benefits and feasibility.</p>
      </abstract>
      <kwd-group>
        <kwd>transfer learning</kwd>
        <kwd>multi-agent systems</kwd>
        <kwd>artificial general intelligence</kwd>
        <kwd>symbolic knowledge extraction</kwd>
        <kwd>symbolic knowledge injection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Human beings can perform a huge number of different tasks: if a human needs to learn a new
task, they can typically manage to do so easily. Broadly speaking, humans may learn new skills
in two ways: by generalising experience (e.g., via inductive reasoning), or by deductively
inferring new knowledge from what they already hold or can get from others, e.g., by talking (direct
communication) or reading (indirect communication) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In the former case, novel knowledge
is formed in the learner’s mind. Conversely, in the latter case, knowledge needs to be
represented via symbolic means (e.g., words, gestures, etc.), in order for communication – and
therefore transfer of meaning – to occur. In particular, knowledge acquisition also requires the
learner to reason about how to exploit the acquired knowledge in practice.
      </p>
      <p>When learners are computational agents (rather than humans), algorithms are available to
mimic basic cognitive capabilities such as induction, communication, knowledge representation,
and reasoning. These have been developed under the umbrella of symbolic artificial intelligence
(AI) and machine learning (ML). However, unlike the human case, symbolic AI and ML algorithms
are commonly tailored to solving one or a few tasks at a time, and they are not meant to take
advantage of interaction, cooperation, or knowledge exchange.</p>
      <p>
        This paper stems from the idea that ML-based intelligent systems could and should take
advantage of the exchange of symbolic knowledge to improve their learning capabilities
[
        <xref ref-type="bibr" rid="ref2 ref3 ref4">2, 3, 4</xref>
        ]. In particular, we argue that symbolic knowledge exchange may have a role to play in
letting software agents attain the capability of learning to learn new tasks.
      </p>
      <p>Along this line, we envision two kinds of intelligent systems: Cooperative Learning
(CoL) and Cooperative Transfer Learning (CoTL). CoL systems are multi-agent systems (MAS)
whose agents can retrieve / provide knowledge about a specific task from / to other agents,
so as to exploit that knowledge during learning and possibly at inference time. CoTL systems
are CoL systems whose agents can acquire, exploit, and combine knowledge about different
related tasks so as to learn to execute novel tasks they were not designed for. Both kinds of
systems are able to mimic the learning process of human societies, albeit to different extents.
In other words, agents help each other by sharing (predominantly) symbolic and (possibly)
sub-symbolic knowledge about the tasks they need to do, similarly to what humans do.</p>
      <p>Accordingly, in this paper we propose a roadmap for the exploitation of knowledge
representation and sharing to foster higher degrees of artificial intelligence, via CoL and CoTL. In
particular, we analyse the requirements of both CoL and CoTL w.r.t. the state of the art, and
discuss how they could be realised in principle. Along this line, the paper is organised as follows.
Section 2 provides definitions for symbolic and sub-symbolic knowledge representations along
with techniques to manipulate knowledge. Section 3 introduces the definition of CoL and CoTL
systems providing general agent architectures. Finally, Section 4 discusses the main advantages
of CoL and CoTL, draws conclusions, and provides insights about future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <sec id="sec-2-1">
        <title>2.1. Symbolic vs. Sub-symbolic Knowledge</title>
        <p>Symbols are carriers of meaning that people may exploit in communication, e.g., words, traffic
signs, flags, etc. They are commonly used to represent knowledge in a way that is interpretable
for humans. Furthermore, symbols can be automatically processed by algorithms, and, therefore,
by computational agents.</p>
        <p>
          Following the definition given in [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], a symbolic representation [of knowledge] consists of:
(i) a set of symbols, (ii) a set of grammatical rules enabling possibly infinite combinations of
those symbols, and (iii) the possibility to assign meaning to elementary or combined symbols.
Formal logics – such as propositional logic (PL), knowledge graphs (KG), Horn clauses and
first-order logic (FOL) – are notable examples of symbolic knowledge representation means.
        </p>
        <p>
          Formal logics allow for both intensional and extensional knowledge representation. In
particular, an intensional definition represents data indirectly, by describing the elements
of relations or sets via other relations or sets. Because of its intensional representation (hence
compactness), domain independence, and versatility, logic can be used as a lingua franca for
knowledge representation [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
        <p>
          Conversely, sub-symbolic representations violate the definition provided in [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. In fact, they
commonly represent data as arrays of numbers of fixed size (violation of item (ii)), and
knowledge as functions over such arrays. Notably, each component of any array carries little
meaning by itself (violation of item (iii)), unless considered with its local context (neighbouring
numbers in the array).
        </p>
        <p>
          Sub-symbolic functions, such as neural networks (NN), are widely used in ML tasks. The vast
majority of NN consist of a directed acyclic graph of neurons, each composed of several
connection weights plus a bias value and an activation function. NN (and in general any
sub-symbolic predictor) cannot be conveniently interpreted by humans: even small NN would
require a significant cognitive effort to be partially understood by the human mind. Therefore, NN
are used as black-boxes [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] and this is accepted in exchange for their high predictive performance.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Symbolic Knowledge Extraction vs. Injection</title>
        <p>
          Symbolic Knowledge Extraction (SKE) is the set of methods accepting trained sub-symbolic
predictors as input and producing symbolic knowledge as output, in such a way that the
extracted knowledge reflects the behaviour of the predictor with fidelity [
          <xref ref-type="bibr" rid="ref10 ref8 ref9">8, 9, 10</xref>
          ]. Literature
provides several SKE algorithms: some may focus on extraction out of classifiers (cf. [
          <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
          ])
or regressors (cf. [
          <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
          ]). Virtually all those methods extract knowledge in the form of
propositional rules.
        </p>
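        <p>As a minimal illustration of the idea, the sketch below performs a naive pedagogical extraction: it treats a predictor as an oracle, queries it on every boolean input, and records one propositional rule per input. The predictor black_box, its weights, and the feature names are hypothetical stand-ins, not taken from any cited SKE algorithm; exhaustive enumeration is only viable for a handful of boolean features.</p>
        <preformat>
```python
from itertools import product

FEATURES = ("fever", "cough", "rash")  # hypothetical boolean inputs


def black_box(x):
    """Stand-in for a trained sub-symbolic classifier (hypothetical weights)."""
    score = 12 * x["fever"] + 10 * x["cough"] - 8 * x["rash"] - 11
    return "flu" if score > 0 else "healthy"


def extract_rules(predict, features):
    """Pedagogical extraction: query the predictor on every boolean input
    and record one propositional rule (body, head) per input."""
    rules = []
    for values in product((0, 1), repeat=len(features)):
        x = dict(zip(features, values))
        body = tuple(name if value else "not " + name for name, value in x.items())
        rules.append((body, predict(x)))
    return rules


def holds(literal, x):
    """Check a single literal such as 'fever' or 'not rash' against an input."""
    if literal.startswith("not "):
        return x[literal[4:]] == 0
    return x[literal] == 1


def apply_rules(rules, x):
    """Return the head of the first rule whose body is satisfied by x."""
    for body, head in rules:
        if all(holds(literal, x) for literal in body):
            return head
    return None


def fidelity(rules, predict, features):
    """Fraction of inputs on which the rules agree with the predictor."""
    inputs = [dict(zip(features, v)) for v in product((0, 1), repeat=len(features))]
    agreements = sum(apply_rules(rules, x) == predict(x) for x in inputs)
    return agreements / len(inputs)


rules = extract_rules(black_box, FEATURES)
for body, head in rules:
    print(head, ":-", ", ".join(body))
```
        </preformat>
        <p>Here fidelity is measured as the fraction of inputs on which the extracted rules and the predictor agree, which is one simple way of reading the fidelity requirement stated above.</p>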
        <p>Although SKE is usually used as a way to explain black-box predictors to humans post hoc,
it may serve other purposes. For instance, knowledge extracted via SKE could be exploited
to help ML during training. In this case, representing knowledge symbolically brings several
benefits: the extracted knowledge is agnostic w.r.t. the original predictor, and it is compact due
to intensionality.</p>
        <p>
          Dually to SKE, Symbolic Knowledge Injection (SKI) is the set of algorithms affecting how
sub-symbolic predictors draw their inferences by making them consistent with some prior
symbolic knowledge [
          <xref ref-type="bibr" rid="ref15 ref16 ref17">15, 16, 17</xref>
          ]. There are three main ways to provide such knowledge: (i) by
altering the loss function used during training to induce an error whenever the prediction
of the network violates the knowledge, (ii) by modifying the underlying architecture in such a
way that the additional parts “mimic” the knowledge, and (iii) by generating input data for the
predictor from the knowledge. There are several SKI algorithms in the literature covering all such
approaches, and supporting the injection of different logic formalisms. For instance, references
[
          <xref ref-type="bibr" rid="ref18 ref19">18, 19</xref>
          ] deal with FOL, reference [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] uses Horn logic, while references [
          <xref ref-type="bibr" rid="ref21 ref22">21, 22, 23, 24, 25</xref>
          ]
target PL. Notably, virtually all SKI methods target NN because of their superior predictive
performance, as well as their malleability.
        </p>
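        <p>Approach (i) can be sketched as follows, under stated assumptions: a single hypothetical rule (fever AND cough implies flu), a fuzzy reading of the rule based on min and max, and a binary cross-entropy data loss. None of these choices is prescribed by the cited SKI methods; the sketch only shows how a violation term can be added to the loss.</p>
        <preformat>
```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for one example; p is the predicted P(flu)."""
    eps = 1e-12  # avoids log(0)
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))


def rule_violation(x, p):
    """Degree to which prediction p violates the rule: fever AND cough -> flu.

    The premise is scored with a fuzzy conjunction (min); the rule is violated
    to the extent that the premise is truer than the predicted conclusion.
    """
    premise = min(x["fever"], x["cough"])
    return max(0.0, premise - p)


def injected_loss(x, p, y, weight=1.0):
    """Data loss plus a penalty that grows with the rule violation."""
    return cross_entropy(p, y) + weight * rule_violation(x, p)


patient = {"fever": 1.0, "cough": 1.0}
print(injected_loss(patient, p=0.2, y=1))  # penalised: prediction contradicts the rule
print(injected_loss(patient, p=0.9, y=1))  # small penalty: prediction mostly agrees
```
        </preformat>
        <p>The weight parameter (an assumption of this sketch) balances fitting the data against complying with the injected knowledge.</p>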
        <p>Generally speaking, SKI improves the efficiency or effectiveness of the sub-symbolic predictors
it is applied to (e.g., accuracy, training time, data greediness, etc.). The common SKI workflow
requires a human expert providing domain-specific knowledge to be injected. However, this is
not a strict requirement. In fact, symbolic knowledge may be provided not only by humans, but
by other computational agents as well. For instance, the knowledge to be injected may be the
result of some prior SKE process.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Transfer vs. Multi-Task Learning</title>
        <p>Let us denote as ‘task’ any kind of supervised ML task. Accordingly, Transfer Learning (TL) is
the set of techniques aimed at letting a predictor M, targeting task t, take advantage of the
knowledge K acquired by some prior predictor M′, trained on some other task t′. Of course,
tasks t and t′ are assumed to be similar to some extent. The main objectives of TL are to:
(i) reduce the amount of data required to train M, (ii) speed up its training, and (iii) improve its
predictive performance.</p>
        <p>TL algorithms from the literature differ w.r.t. two major dimensions: what to transfer and how to
transfer [26]. Of course, another relevant aspect is when to transfer (cf. Sections 2.3 and 3.1).
Finally, similarity among tasks is yet another fundamental aspect, whose assessment is often left to
the experience of practitioners.</p>
        <p>Notably, TL has been most successfully applied to convolutional NN – in particular, those pre-trained on ImageNet
[27] – for biomedical image processing [28]. However, despite their variety, most TL techniques
only support the transfer of sub-symbolic knowledge. In fact, the transferred knowledge
commonly consists of the shallowest layers of a NN, which are transplanted into another NN, of
which only the deeper layers are then re-trained. Hence, to the best of our knowledge, there are
no TL algorithms explicitly leveraging symbolic knowledge transfer.</p>
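        <p>The layer transplant described above can be sketched without any ML library. In the sketch below, layers are plain dictionaries (a hypothetical format) and the function transplant is an illustrative name: the shallowest layers are copied and frozen, while the deeper ones are created afresh and left trainable.</p>
        <preformat>
```python
import copy


def transplant(source_layers, n_shared, new_layer_sizes):
    """Build a new network that reuses the first n_shared layers of a trained one.

    Layers are plain dicts here (hypothetical format): the reused layers are
    copied and frozen, the remaining ones are created afresh and left trainable.
    """
    shared = [dict(copy.deepcopy(layer), trainable=False)
              for layer in source_layers[:n_shared]]
    fresh = [{"weights": [0.0] * size, "trainable": True}
             for size in new_layer_sizes]
    return shared + fresh


source = [
    {"weights": [0.1, 0.2], "trainable": True},   # shallow layer
    {"weights": [0.3, 0.4], "trainable": True},   # shallow layer
    {"weights": [0.5], "trainable": True},        # task-specific output layer
]
target = transplant(source, n_shared=2, new_layer_sizes=[3])
print([layer["trainable"] for layer in target])
```
        </preformat>
        <p>What is transferred here is a set of numeric weights, which is why this form of TL remains sub-symbolic.</p>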
        <p>Multi-task Learning (MTL) is a set of mechanisms aimed at improving the performance of a
predictor via transfer learning [29]. More precisely, given a set of n similar tasks {t₁, …, tₙ}
(according to some notion of task similarity), MTL aims at learning the n tasks altogether, by
training as many predictors M₁, …, Mₙ. In doing so, MTL attempts to improve the performance
of each Mᵢ by taking advantage of the knowledge acquired while training the other predictors [30].</p>
        <p>Unlike TL techniques, where one task receives the knowledge from the
others, in MTL all tasks receive knowledge from the others, simultaneously. Similarly to TL,
virtually all MTL techniques rely on sub-symbolic knowledge transfer.</p>
        <p>MTL techniques may be classified w.r.t. whether they target homogeneous or
heterogeneous tasks. Two tasks t₁, t₂ are homogeneous when they share the same input and
output attributes (names and types). What may differ is data sampling and its distribution.
Conversely, heterogeneous tasks have different attributes, possibly with no overlap [31].</p>
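        <p>The distinction can be made concrete with a small check over task schemas. The sketch below assumes each task is described by a dictionary mapping input and output attribute names to types; this schema format, the function name compare_tasks, and the example tasks are all hypothetical.</p>
        <preformat>
```python
def compare_tasks(task_a, task_b):
    """Classify two tasks as homogeneous or heterogeneous.

    Each task is a dict with 'inputs' and 'outputs', both mapping attribute
    names to types (a hypothetical schema format).
    """
    same_inputs = task_a["inputs"] == task_b["inputs"]
    same_outputs = task_a["outputs"] == task_b["outputs"]
    shared_inputs = set(task_a["inputs"]).intersection(task_b["inputs"])
    kind = "homogeneous" if same_inputs and same_outputs else "heterogeneous"
    return kind, shared_inputs


flu_north = {"inputs": {"fever": bool, "cough": bool}, "outputs": {"flu": bool}}
flu_south = {"inputs": {"fever": bool, "cough": bool}, "outputs": {"flu": bool}}
allergy = {"inputs": {"cough": bool, "pollen": float}, "outputs": {"allergy": bool}}

print(compare_tasks(flu_north, flu_south))
print(compare_tasks(flu_north, allergy))
```
        </preformat>
        <p>The set of shared inputs returned for heterogeneous tasks indicates how much overlap is available for transfer, if any.</p>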
        <p>In MTL the question “where to transfer” is not avoided like in TL. Especially for heterogeneous
tasks, the problem of computing a ‘degree of similarity’ is still open. Empirically, one could test
if two or more tasks are related by applying MTL itself: if the overall performance increases
using MTL, then the tasks can be considered as similar.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Contributions</title>
      <sec id="sec-3-1">
        <title>3.1. Cooperative Learning</title>
        <p>
          A CoL system is a MAS where agents can retrieve knowledge about a task from other agents
and provide knowledge to others when requested. Explicit knowledge sharing – especially
symbolic – is of paramount importance for the MAS as a whole, as it enables agent-to-agent
knowledge transfer [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
        </p>
        <p>[Figure 1a: Agent’s architecture for CoL.]</p>
        <p>To support CoL, agents should be endowed with some fundamental capabilities, namely:</p>
        <sec id="sec-3-1-1">
          <title>1. learning from experience and updating their behaviour accordingly,</title>
        </sec>
        <sec id="sec-3-1-2">
          <title>2. representing their inner behavioural specification in symbolic form,</title>
        </sec>
        <sec id="sec-3-1-3">
          <title>3. updating their behaviour to comply with some symbolic specification,</title>
        </sec>
        <sec id="sec-3-1-4">
          <title>4. interacting with each other, possibly exchanging symbolic specifications.</title>
          <p>As the reader may notice, capabilities 2 and 3 are complementary. When combined with
capability 4, these may pave the way to cooperation among agents, aimed at learning by interaction.
Finally, capability 1 is necessary to let some agents learn novel behaviours independently of
others.</p>
          <p>When actually building CoL systems, capability 1 is likely supported by sub-symbolic ML. In
particular, each agent is assumed to be endowed with some ML predictor, supporting learning
from local data. However, since MAS are commonly composed of heterogeneous agents serving
disparate purposes, many predictors of diverse sorts are likely to be exploited within the same
system.</p>
          <p>To support capability 4 in spite of heterogeneity, agents should agree on common, shared
symbolic representation means by which behavioural specifications could be described and
later exchanged. Along this line, SKE and SKI may serve the purposes of capabilities 2 and 3,
respectively.</p>
          <p>Figure 1a shows the general design of CoL agents. Each CoL agent must be equipped with
(possibly multiple) SKE and SKI algorithms, in order to support symbolic knowledge I/O. When
queried, an agent may extract symbolic knowledge from its inner ML predictor (via some SKE
technique) and send it to the querying agent. The recipient may then update its local predictor
by injecting the received knowledge into it. Knowledge pre- and post-processing steps (e.g.,
pruning / merging / selecting formulae) may occur before SKI or after SKE, to regulate which
particular chunks of knowledge are actually transferred.</p>
          <p>Crucial choices to be addressed during the design of a CoL system are: (i) the supported
formalisms for knowledge representation, and (ii) an appropriate SKE/SKI toolkit w.r.t. the
knowledge representation of the underlying predictor(s) (an almost straightforward choice for pedagogical SKE
techniques). About (i), one could choose FOL over PL or KG for its expressiveness. However, the
more expressive a formalism is, the fewer techniques are available for it: it is therefore reasonable
to also consider less expressive logics for (ii).</p>
          <p>It is worth mentioning that a single agent can be initially trained even in the absence of prior
knowledge. At some point in the training process, the agent may extract knowledge, combine it
with other knowledge received from other agents, inject it back, and continue training. In this
way, the agent performs several train-extract-fix-inject cycles, in order to boost its performance
w.r.t. the target task. In principle, at every new cycle, the extracted knowledge is more and
more accurate in describing how to approach the task, because the predictor itself is more and
more accurate in its predictions thanks to better prior knowledge.</p>
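          <p>The control flow of a train-extract-fix-inject cycle can be sketched as follows. Everything in this sketch is a placeholder chosen for brevity: the class ToyAgent, whose predictor is a simple majority-vote lookup table, stands in for an ML predictor, while extract and inject stand in for actual SKE and SKI algorithms, and fix resolves conflicts by keeping the agent's own rule.</p>
          <preformat>
```python
class ToyAgent:
    """Hypothetical CoL agent whose 'predictor' is a majority-vote table."""

    def __init__(self, data):
        self.data = list(data)  # local (input, label) examples
        self.table = {}         # learned behaviour: input -> label
        self.prior = {}         # symbolic knowledge injected so far

    def train(self):
        counts = {}
        for x, y in self.data:
            counts.setdefault(x, {}).setdefault(y, 0)
            counts[x][y] += 1
        learned = {x: max(c, key=c.get) for x, c in counts.items()}
        # injected knowledge only fills the gaps left open by local data
        self.table = {**self.prior, **learned}

    def extract(self):
        """SKE stand-in: expose the behaviour as 'input -> label' rules."""
        return dict(self.table)

    def inject(self, rules):
        """SKI stand-in: store the rules as prior knowledge for training."""
        self.prior.update(rules)


def fix(own, received):
    """Merge two rule sets, keeping the agent's own rule on conflict."""
    merged = dict(received)
    merged.update(own)
    return merged


def col_cycle(agent, peers, rounds=2):
    """Run a few train-extract-fix-inject cycles against the given peers."""
    for _ in range(rounds):
        agent.train()
        knowledge = agent.extract()
        for peer in peers:
            knowledge = fix(knowledge, peer.extract())
        agent.inject(knowledge)
    agent.train()
    return agent


a = ToyAgent([("rain", "umbrella")])
b = ToyAgent([("sun", "hat")])
b.train()
col_cycle(a, [b])
print(a.table)
```
          </preformat>
          <p>After the cycles, agent a also covers the input it never observed locally, thanks to the knowledge received from b.</p>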
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Cooperative Transfer Learning</title>
        <p>A CoTL system is a MAS where agents can retrieve knowledge about several tasks from other
agents and provide knowledge to others when requested. Unlike simple CoL systems, agents in
CoTL systems may exploit knowledge (either their own, or that of other agents) about related
tasks to learn novel tasks they were not originally designed for. In other words, the ultimate
goal of CoTL systems is to make agents capable of “learning to learn”.</p>
        <p>Learning to learn [novel tasks] is an extension of the well-known definition of ML [32]
introduced in [33]. More precisely, given:
• a set of tasks T = {t₁, …, tₙ},
• trainable experience for each task, {e₁, …, eₙ} = ℰ (e.g., ML predictors or symbolic
knowledge bases), and
• a performance measure for each task, {p₁, …, pₙ}, s.t. pᵢ : T × ℰ → ℝ,
a computational agent is capable of learning to learn when each pᵢ increases as a function of all
items in ℰ, as well as of n.</p>
        <p>Relevant practical aspects about CoTL concern when and how to transfer experience.
Concerning the ‘when’, humans should not intervene in the process and arbitrarily choose which
tasks are correlated with the others, as it would be infeasible. Therefore, agents must also be
endowed with the ability of computing similarity among different tasks. The choice of which
similarity measure(s) to use is up to the designer (e.g., similarity based, distance based, etc.), or it could also be
treated as a task to learn. The interested reader may find useful insights in [29].</p>
        <p>Let us now focus on ‘how’ to transfer experience. If an agent is dealing with homogeneous
tasks (that is, tasks with the same input and output space but different data distributions), it can
easily use the knowledge of one task while addressing the other. Heterogeneous tasks, instead,
are different beasts: they can differ in both input and output features, and there may also be
no overlap at all. Consider for instance the case of two heterogeneous classification tasks t₁
and t₂, for which an agent owns experience in the form of logic knowledge bases K₁ and K₂
composed of Horn clauses. That agent may then transfer knowledge from one task to the other
via the following procedure:
1. if there exists some rule r ∈ K₁ ∪ K₂ s.t. both the head and the body of r only refer to
input features and classes which are shared between t₁ and t₂, then the rule can
be used as-is;
2. if the body of r refers to shared input features and the head targets classes which are
either t₁- or t₂-specific, then the rule could be used anyway by some SKI algorithm (e.g.,
[34]); otherwise, it is necessary to find a mapping between the specific classes of the two
tasks, if it exists (e.g., just renaming, linear dependencies with other labels, etc.);
3. finally, if both shared and task-specific features are referred to in r, one could:
a) relax the rule (e.g., considering only terms involving shared features), then go to
step 1;
b) find a mapping between task-specific features of the two tasks [31], then go to step 1;
c) if none of the previous steps is possible (e.g., resulting in an empty body), then ignore
the rule.</p>
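        <p>The procedure above can be sketched for propositional Horn rules as follows. The rule encoding (a head plus a tuple of body features), the function transfer_rule, the status labels, and the optional feature_map and class_map arguments are all assumptions of this sketch; in particular, a rule whose head cannot be mapped is simply returned unchanged and flagged as left to an SKI algorithm.</p>
        <preformat>
```python
def transfer_rule(rule, target_features, target_classes,
                  feature_map=None, class_map=None):
    """Adapt one Horn rule (head, body) of a source task to a target task.

    Returns (status, rule) with status among 'as-is', 'adapted',
    'ski-only' and 'ignored' (hypothetical labels).
    """
    head, body = rule
    feature_map = feature_map or {}
    class_map = class_map or {}
    status = "as-is"

    # Step 3: the body mentions features the target task does not have
    if not set(body).issubset(target_features):
        # 3b: rename the features for which a mapping is known
        mapped = [feature_map.get(f, f) for f in body]
        # 3a: relax the rule by dropping what is still task-specific
        body = tuple(f for f in mapped if f in target_features)
        if not body:
            return "ignored", None  # 3c: nothing usable is left
        status = "adapted"

    # Step 2: the head is a class the target task does not have
    if head not in target_classes:
        if head not in class_map:
            return "ski-only", (head, body)  # left to an SKI algorithm
        head = class_map[head]
        status = "adapted"

    # Step 1: head and body only refer to shared symbols
    return status, (head, body)


features = {"fever", "cough"}
classes = {"flu", "healthy"}
print(transfer_rule(("flu", ("fever", "cough")), features, classes))
print(transfer_rule(("flu", ("fever", "rash")), features, classes))
print(transfer_rule(("flu", ("rash",)), features, classes))
```
        </preformat>
        <p>Steps are checked in reverse order on purpose: relaxing or mapping the body first makes it possible to fall through to the checks of steps 2 and 1, as the procedure prescribes.</p>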
        <p>A similar procedure can be applied to other sorts of tasks (e.g., regression) and adapted to deal
with other forms of knowledge representation.</p>
        <p>Figure 1b shows a general design for a CoTL agent. In the same way as for CoL systems,
multiple SKE and SKI algorithms must be available for each agent in order to exploit symbolic
knowledge.</p>
        <p>In addition to the components of a CoL agent, the core component of a CoTL agent is the task similarity score.
Accordingly, we assume designers provide a function s : T² → ℝ≥0, where T is the task space, to
evaluate the degree of similarity between two tasks.</p>
        <p>Another relevant design aspect of CoTL is the criterion for selecting the knowledge of related
tasks. For instance, designers may leverage a threshold-based approach, selecting the knowledge
of all the tasks having an s score greater than the threshold. Alternatively, one can use the
knowledge of the k most related tasks.</p>
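        <p>Both selection criteria can be sketched in a few lines. The sketch assumes that each task is summarised by the set of its attribute names and uses the Jaccard index as the similarity score; this is only one possible non-negative score, chosen here for illustration, and the task names are hypothetical.</p>
        <preformat>
```python
def jaccard(a, b):
    """Hypothetical similarity score: overlap between two attribute sets."""
    union = a | b
    return len(a.intersection(b)) / len(union) if union else 0.0


def select_by_threshold(target, tasks, score, threshold):
    """Names of all tasks whose similarity to the target exceeds the threshold."""
    return [name for name, attrs in tasks.items() if score(target, attrs) > threshold]


def select_top_k(target, tasks, score, k):
    """Names of the k tasks most similar to the target, best first."""
    ranked = sorted(tasks, key=lambda name: score(target, tasks[name]), reverse=True)
    return ranked[:k]


target = {"fever", "cough", "age"}
known_tasks = {
    "flu": {"fever", "cough"},
    "allergy": {"cough", "pollen"},
    "credit": {"income", "age", "debt"},
}
print(select_by_threshold(target, known_tasks, jaccard, threshold=0.5))
print(select_top_k(target, known_tasks, jaccard, k=2))
```
        </preformat>
        <p>The threshold-based criterion may select no task at all, whereas the top-k criterion always returns k tasks (when available), however weakly related they are.</p>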
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion and Conclusions</title>
      <p>The joint exploitation of symbolic and sub-symbolic knowledge representation means, as well
as knowledge manipulation tools, to build MAS that mimic humans’ knowledge sharing
capabilities is a promising research direction. It has the potential to overcome current limitations
of the state of the art, such as: (i) TL considers sub-symbolic knowledge representation but
not symbolic one, therefore it is not human-interpretable; (ii) MTL has to train predictors on
related tasks at the same time, implying intrinsic difficulties to scale; and (iii) MTL is currently
tailored to sub-symbolic knowledge alone.</p>
      <p>CoL and CoTL systems can give great impulse to the study and development of intelligent
systems. A non-exhaustive list of advantages of CoL and CoTL includes: (i) they could provide
more and more accurate human-interpretable explanations for a task; (ii) they may increase the
performance of an agent/predictor in solving a single task; (iii) the improvement of one agent in
solving a task should lead other agents to improve; (iv) the improvement on one task could lead
towards the improvement on other tasks; (v) learning is a continuous and automated process
that does not require human intervention.</p>
      <p>However, there are a number of challenges to be addressed for research on CoL and CoTL
to proceed. First, in spite of the many algorithms proposed in the literature so far, running
implementations of SKE and SKI algorithms are rare. Second, the choice of how to deal with
task similarity in CoTL is not trivial and can affect the performance of the whole system. Third,
trust should eventually be taken into account. How good is the knowledge that an agent is
receiving? What is the reputation of an agent? Finally, there is still the need for datasets –
probably smaller than those required without CoL and CoTL – to successfully train predictors on new
tasks. Indeed, humans can perform new tasks quite well even with just an explanation of how
to do them and without explicit training (e.g., playing a new game). Achieving such an ability would be a
big leap towards artificial general intelligence.</p>
      <p>Summarising, this paper introduces the novel concepts of Cooperative Learning and Cooperative
Transfer Learning within the scope of MAS. These systems integrate both symbolic and
sub-symbolic knowledge representation and manipulation tools to mimic the learning process of
human society. The paper proposes a general agent architecture for both CoL and CoTL
and discusses their advantages and limitations.</p>
      <p>This preliminary work is a forerunner for future empirical work on CoL and CoTL. Proceeding
in order of increasing complexity, the first works will address CoL: starting from a train-extract-fix-inject
local workflow and then performing tests on a whole MAS. The next step will be investigating
CoTL systems and a new kind of CoTL MAS capable of learning a new task without the explicit
need for a dataset.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This paper was partially supported by the CHIST-ERA IV project “Expectation” (CHIST-ERA19-XAI-005), co-funded by the EU and the Italian MUR (Ministry for University and Research).</p>
      <p>edge, in: S. J. Hanson, J. D. Cowan, C. L. Giles (Eds.), Advances in Neural Information Processing Systems 5, [NIPS Conference, Denver, Colorado, USA, November 30 - December 3, 1992], Morgan Kaufmann, 1992, pp. 871–878. URL: http://papers.nips.cc/paper/638-network-structuring-and-training-using-rule-based-knowledge.</p>
      <p>[23] Z. Hu, X. Ma, Z. Liu, E. H. Hovy, E. P. Xing, Harnessing deep neural networks with logic rules, in: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, volume 1: Long Papers, The Association for Computer Linguistics, 2016, pp. 2410–2420. doi:10.18653/v1/p16-1228.</p>
      <p>[24] M. Diligenti, S. Roychowdhury, M. Gori, Integrating prior knowledge into deep learning, in: X. Chen, B. Luo, F. Luo, V. Palade, M. A. Wani (Eds.), Proceedings of the 16th IEEE International Conference on Machine Learning and Applications (ICMLA), Cancun, Mexico, December 18-21, 2017, IEEE, 2017, pp. 920–923. doi:10.1109/ICMLA.2017.00-37.</p>
      <p>[25] M. Magnini, G. Ciatto, A. Omicini, A view to a KILL: Knowledge injection via lambda layer, in: A. Ferrando, V. Mascardi (Eds.), WOA 2022 – 23rd Workshop “From Objects to Agents”, volume 3261 of CEUR Workshop Proceedings, Sun SITE Central Europe, RWTH Aachen University, 2022, pp. 61–76. URL: http://ceur-ws.org/Vol-3261/paper5.pdf.</p>
      <p>[26] S. J. Pan, Q. Yang, A survey on transfer learning, IEEE Transactions on Knowledge and Data Engineering 22 (2010) 1345–1359. doi:10.1109/TKDE.2009.191.</p>
      <p>[27] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, L. Fei-Fei, ImageNet: A large-scale hierarchical image database, in: 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009, pp. 248–255. doi:10.1109/CVPR.2009.5206848.</p>
      <p>[28] H. Shin, H. R. Roth, M. Gao, L. Lu, Z. Xu, I. Nogues, J. Yao, D. J. Mollura, R. M. Summers, Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning, IEEE Transactions on Medical Imaging 35 (2016) 1285–1298. doi:10.1109/TMI.2016.2528162.</p>
      <p>[29] R. Caruana, Multitask learning, Mach. Learn. 28 (1997) 41–75. doi:10.1023/A:1007379606734.</p>
      <p>[30] Y. Zhang, Q. Yang, A survey on multi-task learning, IEEE Transactions on Knowledge and Data Engineering 34 (2022) 5586–5609. doi:10.1109/TKDE.2021.3070203.</p>
      <p>[31] O. Day, T. M. Khoshgoftaar, A survey on heterogeneous transfer learning, Journal of Big Data 4 (2017) 29. doi:10.1186/s40537-017-0089-0.</p>
      <p>[32] T. M. Mitchell, Machine learning, International Edition, McGraw-Hill Series in Computer Science, McGraw-Hill, 1997. URL: https://www.worldcat.org/oclc/61321007.</p>
      <p>[33] S. Thrun, L. Y. Pratt, Learning to learn: Introduction and overview, in: S. Thrun, L. Y. Pratt (Eds.), Learning to Learn, Springer, 1998, pp. 3–17. doi:10.1007/978-1-4615-5529-2_1.</p>
      <p>[34] M. Magnini, G. Ciatto, A. Omicini, KINS: Knowledge injection via network structuring, in: R. Calegari, G. Ciatto, A. Omicini (Eds.), CILC 2022 – Italian Conference on Computational Logic, volume 3204 of CEUR Workshop Proceedings, CEUR-WS, 2022, pp. 254–267. URL: http://ceur-ws.org/Vol-3204/paper_25.pdf.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Charney</surname>
          </string-name>
          , T. Ormerod,
          <article-title>Review of communication at a distance: The influence of print on sociocultural organization and change, by David S. Kaufer and Kathleen M. Carley and human reasoning: the psychology of deduction, by J. St.B.T. Evans, S.E. Newstead and R.M.J. Byrne</article-title>
          ,
          <source>International Journal of Human-Computer Studies</source>
          <volume>40</volume>
          (
          <year>1994</year>
          )
          <fpage>1067</fpage>
          -
          <lpage>1073</lpage>
          . doi:10.1006/ijhc.1994.1048.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Ciatto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Najjar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-P.</given-names>
            <surname>Calbimonte</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvaresi</surname>
          </string-name>
          ,
          <article-title>Towards explainable visionary agents: License to dare and imagine</article-title>
          , in: D. Calvaresi, A. Najjar, M. Winikoff, K. Främling (Eds.),
          <source>Explainable and Transparent AI and Multi-Agent Systems. Third International Workshop, EXTRAAMAS 2021, Virtual Event, May 3-7, 2021, Revised Selected Papers</source>
          , volume
          <volume>12688</volume>
          of Lecture Notes in Computer Science, Springer Nature, Basel, Switzerland,
          <year>2021</year>
          , pp.
          <fpage>139</fpage>
          -
          <lpage>157</lpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>030</fpage>
          -82017-
          <issue>6</issue>
          _
          <fpage>9</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Omicini</surname>
          </string-name>
          ,
          <article-title>Not just for humans: Explanation for agent-to-agent communication</article-title>
          , in:
          <string-name>
            <given-names>G.</given-names>
            <surname>Vizzari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Palmonari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Orlandini</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the AIxIA 2020 Discussion Papers Workshop co-located with the 19th International Conference of the Italian Association for Artificial Intelligence (AIxIA2020)</source>
          , Anywhere, November 27th, 2020, volume
          <volume>2776</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2020</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          . URL: http://ceur-ws.org/Vol-2776/paper-1.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvaresi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ciatto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Najjar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Aydoğan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Van der Torre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Omicini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. I.</given-names>
            <surname>Schumacher</surname>
          </string-name>
          ,
          <article-title>Expectation: Personalized explainable artificial intelligence for decentralized agents with heterogeneous knowledge</article-title>
          , in:
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvaresi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Najjar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Winikoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Främling</surname>
          </string-name>
          (Eds.),
          <source>Explainable and Transparent AI and Multi-Agent Systems. Third International Workshop, EXTRAAMAS 2021, Virtual Event, May 3-7, 2021, Revised Selected Papers</source>
          , volume
          <volume>12688</volume>
          of Lecture Notes in Computer Science, Springer Nature, Basel, Switzerland,
          <year>2021</year>
          , pp.
          <fpage>331</fpage>
          -
          <lpage>343</lpage>
          . doi:10.1007/978-3-030-82017-6_20.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>van Gelder</surname>
          </string-name>
          ,
          <article-title>Why distributed representation is inherently non-symbolic</article-title>
          , in: G. Dorfner (Ed.),
          <source>Konnektionismus in Artificial Intelligence und Kognitionsforschung. Proceedings 6. Österreichische Artificial Intelligence-Tagung (KONNAI)</source>
          , Salzburg, Österreich, 18. bis 21. September 1990, volume
          <volume>252</volume>
          of Informatik-Fachberichte, Springer,
          <year>1990</year>
          , pp.
          <fpage>58</fpage>
          -
          <lpage>66</lpage>
          . doi:10.1007/978-3-642-76070-9_6.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Ciatto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Calegari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Omicini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvaresi</surname>
          </string-name>
          ,
          <article-title>Towards XMAS: eXplainability through Multi-Agent Systems</article-title>
          , in:
          <string-name>
            <given-names>C.</given-names>
            <surname>Savaglio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Fortino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ciatto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Omicini</surname>
          </string-name>
          (Eds.),
          <source>AI&amp;IoT 2019 - Artificial Intelligence and Internet of Things 2019</source>
          , volume
          <volume>2502</volume>
          of CEUR Workshop Proceedings, Sun SITE Central Europe, RWTH Aachen University,
          <year>2019</year>
          , pp.
          <fpage>40</fpage>
          -
          <lpage>53</lpage>
          . URL: http://ceur-ws.org/Vol-2502/paper3.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Z. C.</given-names>
            <surname>Lipton</surname>
          </string-name>
          ,
          <article-title>The mythos of model interpretability</article-title>
          ,
          <source>Commun. ACM</source>
          <volume>61</volume>
          (
          <year>2018</year>
          )
          <fpage>36</fpage>
          -
          <lpage>43</lpage>
          . doi:10.1145/3233231.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R.</given-names>
            <surname>Andrews</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Diederich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Tickle</surname>
          </string-name>
          ,
          <article-title>Survey and critique of techniques for extracting rules from trained artificial neural networks</article-title>
          ,
          <source>Knowledge-Based Systems</source>
          <volume>8</volume>
          (
          <year>1995</year>
          )
          <fpage>373</fpage>
          -
          <lpage>389</lpage>
          . doi:10.1016/0950-7051(96)81920-4.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Guidotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Monreale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ruggieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Turini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Giannotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pedreschi</surname>
          </string-name>
          ,
          <article-title>A survey of methods for explaining black box models</article-title>
          ,
          <source>ACM Computing Surveys</source>
          <volume>51</volume>
          (
          <year>2018</year>
          )
          <fpage>1</fpage>
          -
          <lpage>42</lpage>
          . doi:10.1145/3236009.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R.</given-names>
            <surname>Calegari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ciatto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Omicini</surname>
          </string-name>
          ,
          <article-title>On the integration of symbolic and sub-symbolic techniques for XAI: A survey</article-title>
          ,
          <source>Intelligenza Artificiale</source>
          <volume>14</volume>
          (
          <year>2020</year>
          )
          <fpage>7</fpage>
          -
          <lpage>32</lpage>
          . doi:10.3233/IA-190036.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M. W.</given-names>
            <surname>Craven</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Shavlik</surname>
          </string-name>
          ,
          <article-title>Using sampling and queries to extract rules from trained neural networks</article-title>
          , in: W. W. Cohen, H. Hirsh (Eds.),
          <source>Machine Learning, Proceedings of the Eleventh International Conference</source>
          , Rutgers University, New Brunswick, NJ, USA, July 10-13, 1994, Morgan Kaufmann,
          <year>1994</year>
          , pp.
          <fpage>37</fpage>
          -
          <lpage>45</lpage>
          . doi:10.1016/b978-1-55860-335-6.50013-1.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M. W.</given-names>
            <surname>Craven</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Shavlik</surname>
          </string-name>
          ,
          <article-title>Extracting tree-structured representations of trained networks</article-title>
          , in: D. S. Touretzky, M. Mozer, M. E. Hasselmo (Eds.),
          <source>Advances in Neural Information Processing Systems</source>
          <volume>8</volume>
          , NIPS, Denver, CO, USA, November 27-30, 1995, MIT Press,
          <year>1995</year>
          , pp.
          <fpage>24</fpage>
          -
          <lpage>30</lpage>
          . URL: http://papers.nips.cc/paper/1152-extracting-tree-structured-representations-of-trained-networks.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Huysmans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Baesens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanthienen</surname>
          </string-name>
          ,
          <article-title>ITER: an algorithm for predictive regression rule extraction</article-title>
          , in:
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Tjoa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Trujillo</surname>
          </string-name>
          (Eds.),
          <source>Data Warehousing and Knowledge Discovery</source>
          , 8th International Conference, DaWaK 2006, Krakow, Poland, September 4-8, 2006, Proceedings, volume
          <volume>4081</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2006</year>
          , pp.
          <fpage>270</fpage>
          -
          <lpage>279</lpage>
          . doi:10.1007/11823728_26.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>F.</given-names>
            <surname>Sabbatini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ciatto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Omicini</surname>
          </string-name>
          ,
          <article-title>GridEx: An algorithm for knowledge extraction from black-box regressors</article-title>
          , in:
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvaresi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Najjar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Winikoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Främling</surname>
          </string-name>
          (Eds.),
          <source>Explainable and Transparent AI and Multi-Agent Systems - Third International Workshop, EXTRAAMAS 2021, Virtual Event, May 3-7, 2021, Revised Selected Papers</source>
          , volume
          <volume>12688</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2021</year>
          , pp.
          <fpage>18</fpage>
          -
          <lpage>38</lpage>
          . doi:10.1007/978-3-030-82017-6_2.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Besold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>d'Avila Garcez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bowman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Domingos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hitzler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kühnberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Lamb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lowd</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M. V.</given-names>
            <surname>Lima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>de Penning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Pinkas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Poon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Zaverucha</surname>
          </string-name>
          ,
          <article-title>Neural-symbolic learning and reasoning: A survey and interpretation</article-title>
          ,
          <source>CoRR</source>
          abs/1711.03902 (
          <year>2017</year>
          ). URL: http://arxiv.org/abs/1711.03902. arXiv:1711.03902.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. S.</given-names>
            <surname>Meel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Kankanhalli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Soh</surname>
          </string-name>
          ,
          <article-title>Embedding symbolic knowledge into deep networks</article-title>
          , in:
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Wallach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Larochelle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Beygelzimer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>d'Alché-Buc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. B.</given-names>
            <surname>Fox</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Garnett</surname>
          </string-name>
          (Eds.),
          <source>Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>4235</fpage>
          -
          <lpage>4245</lpage>
          . URL: https://proceedings.neurips.cc/paper/2019/hash/7b66b4fd401a271a1c7224027ce111bc-Abstract.html.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>L.</given-names>
            <surname>von Rueden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mayer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Beckh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Georgiev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Giesselbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Heese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Kirsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Walczak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Pfrommer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ramamurthy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Garcke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bauckhage</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schuecker</surname>
          </string-name>
          ,
          <article-title>Informed machine learning - a taxonomy and survey of integrating prior knowledge into learning systems</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>1</lpage>
          . doi:10.1109/TKDE.2021.3079836.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>D. H.</given-names>
            <surname>Ballard</surname>
          </string-name>
          ,
          <article-title>Parallel logical inference and energy minimization</article-title>
          , in: T. Kehler (Ed.),
          <source>Proceedings of the 5th National Conference on Artificial Intelligence</source>
          . Philadelphia, PA, USA, August 11-15, 1986. Volume 1: Science, Morgan Kaufmann,
          <year>1986</year>
          , pp.
          <fpage>203</fpage>
          -
          <lpage>209</lpage>
          . URL: http://www.aaai.org/Library/AAAI/1986/aaai86-033.php.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A. S.</given-names>
            <surname>d'Avila Garcez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Gabbay</surname>
          </string-name>
          ,
          <article-title>Fibring neural networks</article-title>
          , in:
          <string-name>
            <given-names>D. L.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ferguson</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the Nineteenth National Conference on Artificial Intelligence, Sixteenth Conference on Innovative Applications of Artificial Intelligence, July 25-29, 2004, San Jose, California, USA</source>
          , AAAI Press / The MIT Press,
          <year>2004</year>
          , pp.
          <fpage>342</fpage>
          -
          <lpage>347</lpage>
          . URL: http://www.aaai.org/Library/AAAI/2004/aaai04-055.php.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>R.</given-names>
            <surname>Manhaeve</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dumancic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kimmig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Demeester</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>De Raedt</surname>
          </string-name>
          ,
          <article-title>Neural probabilistic logic programming in DeepProbLog</article-title>
          ,
          <source>Artificial Intelligence</source>
          <volume>298</volume>
          (
          <year>2021</year>
          ) 103504. doi:10.1016/j.artint.2021.103504.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>G. G.</given-names>
            <surname>Towell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Shavlik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. O.</given-names>
            <surname>Noordewier</surname>
          </string-name>
          ,
          <article-title>Refinement of approximate domain theories by knowledge-based neural networks</article-title>
          , in: H. E. Shrobe,
          <string-name>
            <given-names>T. G.</given-names>
            <surname>Dietterich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. R.</given-names>
            <surname>Swartout</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 8th National Conference on Artificial Intelligence</source>
          . Boston, Massachusetts, USA, July 29 - August 3, 1990, 2 Volumes, AAAI Press / The MIT Press,
          <year>1990</year>
          , pp.
          <fpage>861</fpage>
          -
          <lpage>866</lpage>
          . URL: http://www.aaai.org/Library/AAAI/1990/aaai90-129.php.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>V.</given-names>
            <surname>Tresp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hollatz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ahmad</surname>
          </string-name>
          ,
          <article-title>Network structuring and training using rule-based knowl-</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>