               Empowering Knowledge Bases:
               A Machine Learning Perspective
                  (Invited Paper and Talk)

                                   Claudia d’Amato

                  Computer Science Department – University of Bari
                            claudia.damato@uniba.it



        Abstract. The construction of Knowledge Bases quite often requires the intervention of knowledge engineers and domain experts, resulting in a time-consuming task. Alternative approaches have been developed for building knowledge bases from existing sources of information such as web pages and crowdsourcing; seminal examples are NELL, DBpedia, YAGO and several others. With the goal of building very large sources of knowledge, as recently with Knowledge Graphs, even more complex integration processes have been set up, involving multiple sources of information, human expert intervention and crowdsourcing. Despite significant efforts for making Knowledge Graphs as comprehensive and reliable as possible, they tend to suffer from incompleteness and noise, due to the complex building process. Nevertheless, even in highly human-curated knowledge bases cases of incompleteness can be found: for instance, disjointness axioms are quite often missing. Machine learning methods have been proposed with the purpose of refining, enriching, completing and possibly raising potential issues in existing knowledge bases, while showing the ability to cope with noise. The talk will concentrate on classes of mostly symbol-based machine learning methods, specifically focusing on concept learning, rule learning and disjointness axioms learning problems, showing how the developed methods can be exploited for enriching existing knowledge bases. The talk will also highlight how a key element of the illustrated solutions is the integration of background knowledge, deductive reasoning and the evidence coming from the mass of data. The last part of the talk will be devoted to the presentation of an approach for injecting background knowledge into numeric-based embedding models to be used for predictive tasks on Knowledge Graphs.1


1     Introduction
The construction of Knowledge Bases (KBs) quite often results in a time-consuming task, since the intervention of knowledge engineers and domain experts is needed. For this reason, alternative solutions have been developed, exploiting
1
    Copyright © 2021 for this paper by its authors. Use permitted under Creative
    Commons License Attribution 4.0 International (CC BY 4.0).

existing sources of information such as web pages; seminal examples are NELL2 ,
DBpedia3 , YAGO4 . Recently, the construction of very large KBs, named Knowledge Graphs (KGs), has been targeted, and several examples already exist, spanning from enterprise products, such as those built by Google and Amazon, to open KGs such as Freebase5 and Wikidata6 , to mention just a few.
    A KG is a graph of data intended to convey knowledge of the real world, conforming to a graph-based data model where nodes represent entities of interest and edges represent possibly different relations between these entities. Furthermore, the data graph can be enhanced with a schema, generally by the use of ontologies that are employed to define and reason about the semantics of nodes and edges [24]. KGs are very large data collections requiring even more complex integration processes than in the past, involving multiple sources of information, human expert intervention and crowdsourcing. Despite significant efforts for making KGs as comprehensive as possible, it is well known [24] that they tend to suffer from incompleteness and noise, due to the complex building process. Nevertheless, even in highly human-curated KBs, such as some ontologies, cases of incompleteness can be found: for instance, disjointness axioms are quite often missing [43], which, as a consequence, may let noise arise, that is, information that is invalid with respect to the domain of reference even though the KB remains consistent.
    Machine Learning (ML) methods, mostly grounded on inductive approaches, have been proposed for refining, enriching, completing and possibly raising issues in existing KBs, while showing the ability to cope with noise [13]. Problems such as link prediction, but also query answering and instance retrieval, have been regarded as classification problems. Suitable methods, often inspired by symbol-based solutions in the Inductive Logic Programming (ILP) [14] field (aiming at inducing a hypothesised logic program from background knowledge and a collection of examples), have been proposed [9, 27, 31, 19, 39]. Most of them are able to cope with expressive representation languages such as Description Logics (DLs) [2], the theoretical foundation of OWL7 , the standard representation language in the Semantic Web [4], and with the Open World Assumption (OWA) typically adopted there, differently from the Closed World Assumption (CWA) usually made in traditional ML settings. Also, problems such as ontology refinement and enrichment at the terminology/schema level, e.g. providing complex descriptions for a given concept name or assessing disjointness axioms, have been regarded as concept learning problems to be solved via supervised/unsupervised inductive learning methods for DL representations [17, 18, 30, 42, 36].
    Nowadays, numeric-based ML methods such as embeddings [6, 26] and deep learning [15] solutions are receiving major attention because of their impressive ability to scale when applied to very large data collections.

2
  http://rtw.ml.cmu.edu/rtw/
3
  https://www.dbpedia.org/
4
  https://yago-knowledge.org/
5
  https://developers.google.com/freebase
6
  https://www.wikidata.org/
7
  https://www.w3.org/OWL/

Mostly KG refinement tasks, and specifically link/type prediction and triple classification, are targeted, with the goal of limiting incompleteness in KGs. Nevertheless, the important gain in terms of scalability that numeric-based methods are obtaining comes at a price, penalizing: a) the possibility to have interpretable models as a result of a learning process; b) the ability to exploit deductive (and complementary forms of) reasoning; c) the expressiveness of the representations to be considered and the compliance with the OWA.
    For these reasons, the talk will focus primarily on the advances in the class of symbol-based ML methods, specifically analyzing concept learning, rule learning and disjointness axioms learning problems, and the related solutions to be used for enriching existing KBs. The key point is the integration of background knowledge, deductive reasoning and the evidence coming from the mass of data. The main idea consists in exploiting the evidence coming from assertional knowledge and deductive reasoning for drawing plausible conclusions, to be possibly represented with intensional models. The last part of the talk will be dedicated to the presentation of an approach for injecting background knowledge into numeric-based ML methods, specifically embedding models, to be used for predictive tasks on KGs.
    Accordingly, in the following, concept learning (see Sect. 2), rule learning (see Sect. 3) and disjointness axioms learning (see Sect. 4) are briefly surveyed, while semantically enriched numeric-based solutions are illustrated in Sect. 5. Discussions on future research directions and conclusions are reported in Sect. 6.


2     Concept Learning for Ontology Enrichment

With the purpose of enriching ontologies at the terminological level, methods for learning concept descriptions for a given concept name have been proposed. The problem has been regarded as a supervised concept learning problem, aiming at approximating an intensional DL definition, given a set of individuals of an ontological KB acting as positive/negative training examples.
    Various solutions, e.g. DL-Foil [17] and celoe [30] (part of the DL-Learner suite8 ), have been formalized. They are mostly grounded on a separate-and-conquer (sequential covering) strategy: a new concept description is built by specializing, via suitable refinement operators, a partial solution so as to correctly cover (i.e. decide a consistent classification for) as many training instances as possible. Whilst DL-Foil works under the OWA, celoe works under the CWA. Both of them may end up in sub-optimal solutions. To overcome this issue, DL-Focl [37], Parcel [40] and SpACEL [41] have been proposed. DL-Focl is an optimized version of DL-Foil, implementing a basic greedy covering strategy. Parcel combines top-down and bottom-up refinements in the search space. Specifically, the learning problem is split into various sub-problems, according to a divide-and-conquer strategy, which are solved by running celoe as a subroutine.
8
    https://dl-learner.org/.

Once the partial solutions are obtained, they are combined in a bottom-up fashion. SpACEL extends Parcel by performing a symmetrical specialization of a concept description.
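    To make the shared separate-and-conquer scheme concrete, the following is a minimal, illustrative Python sketch; all names (covers, refine, the list-based encoding of candidate descriptions) are hypothetical simplifications, since actual systems such as DL-Foil or celoe refine DL concept expressions and delegate coverage (instance checking) to a reasoner.

```python
# Illustrative sketch of the separate-and-conquer scheme, NOT the actual
# DL-Foil/celoe code: here a "description" is a list of boolean conditions
# over individuals, whereas real systems refine DL concept expressions and
# call a reasoner for instance checking.

def covers(description, individual):
    """Hypothetical instance check: all conditions of the description hold."""
    return all(cond(individual) for cond in description)

def refine(description, conditions):
    """Hypothetical downward refinement operator: specialize by one condition."""
    return [description + [c] for c in conditions if c not in description]

def learn(positives, negatives, conditions):
    """Sequential covering: specialize a partial description until it rules
    out all negatives, then discard the positives it covers and iterate."""
    solution, uncovered = [], list(positives)
    while uncovered:
        desc = []                                   # the top concept covers all
        while any(covers(desc, n) for n in negatives):
            candidates = refine(desc, conditions)
            if not candidates:                      # operator exhausted
                break
            # Greedy choice: keep covering as many positives as possible.
            desc = max(candidates,
                       key=lambda d: sum(covers(d, p) for p in uncovered))
        remaining = [p for p in uncovered if not covers(desc, p)]
        if len(remaining) == len(uncovered):        # no progress: stop early
            break
        solution.append(desc)                       # one disjunct of the result
        uncovered = remaining
    return solution
```

In DL terms, each returned description corresponds to one disjunct of a concept expression to be suggested as (part of) a definition for the target concept name.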
    These solutions proved able to learn approximated concept descriptions for a target concept name, to be used for possibly introducing new (inclusion or equivalence) axioms in the KB. Nevertheless, quite often relatively small ontological KBs have been considered in the experiments, revealing that these solutions currently have limited ability to scale to very large KBs such as KGs.

3     Rule Learning for Knowledge Completion
Knowledge completion consists in finding new information at the assertional level, that is, facts that are missing in a considered KB. This task has received increasing attention with the development of KGs, which are well known to be incomplete, since it is also strongly related to the link prediction task (see Sect. 5).
    One of the most well-known systems for knowledge completion of RDF9 knowledge bases is AMIE [19]. Inspired by association rule mining [1] and the ILP literature, AMIE aims at mining logic rules from RDF knowledge bases with the final goal of predicting new facts, that is, RDF triples. AMIE (and its optimized version AMIE+ [20]) currently represents the most scalable rule mining system for learning rules on large RDF data collections, and it is also explicitly tailored to support the OWA. However, it does not exploit any form of deductive reasoning.
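    As an illustration of how mined rules predict new facts and are scored in an OWA-friendly way, the following is a minimal sketch, on hypothetical toy data, of AMIE-style metrics (support, head coverage, PCA confidence) for the fixed rule hasChild(x, z) ∧ hasChild(z, y) ⇒ hasGrandchild(x, y); AMIE itself searches a large space of such rules with far more efficient machinery.

```python
from collections import defaultdict

# Hypothetical toy KG as (subject, predicate, object) triples.
kg = {
    ("anna", "hasChild", "bob"), ("bob", "hasChild", "carl"),
    ("anna", "hasGrandchild", "carl"),
    ("dora", "hasChild", "eve"), ("eve", "hasChild", "fred"),
    # hasGrandchild(dora, fred) is absent: under the OWA, unknown, not false.
}

def body_instantiations(kg):
    """All (x, y) satisfying the body hasChild(x, z) & hasChild(z, y)."""
    children = defaultdict(set)
    for s, p, o in kg:
        if p == "hasChild":
            children[s].add(o)
    return {(x, y)
            for x, kids in children.items()
            for z in kids
            for y in children.get(z, ())}

def rule_metrics(kg, predictions, head="hasGrandchild"):
    """Support, head coverage and PCA confidence of body => head(x, y)."""
    head_pairs = {(s, o) for s, p, o in kg if p == head}
    support = len(predictions & head_pairs)
    head_coverage = support / len(head_pairs)
    # PCA: a prediction counts against the rule only if the KG already knows
    # *some* head fact for that subject (partial completeness), instead of
    # treating every unobserved triple as false (CWA).
    known = {s for s, _ in head_pairs}
    pca_body = sum(1 for s, _ in predictions if s in known)
    return support, head_coverage, support / pca_body

preds = body_instantiations(kg)
print(rule_metrics(kg, preds))   # (1, 1.0, 1.0)
print(preds - {(s, o) for s, p, o in kg if p == "hasGrandchild"})
# {('dora', 'fred')}: the new fact the rule predicts
```

On this toy KG the standard (CWA) confidence would be 0.5, because hasGrandchild(dora, fred) is unobserved; the PCA confidence is 1.0, since the missing fact is treated as unknown and is in fact returned as a prediction.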
    A related rule mining system, similarly based on a level-wise generate-and-test strategy, has been proposed in [12]. It aims at learning SWRL rules [25] from OWL ontologies while exploiting schema-level information and deductive reasoning during the learning process. Both AMIE and the solution presented in [12] showed the ability to mine useful rules and to predict new assertional knowledge. However, the solution proposed in [12] showed a limited ability to scale, due to the exploitation of the reasoning capabilities. Nevertheless, with [12] additional rules could be obtained, to be exploited for generalizing towards other kinds of axioms (besides facts), such as general inclusion axioms, to be only validated by a domain expert or knowledge engineer.

4     Learning Disjointness Axioms
Disjointness axioms are essential for making explicit the negative knowledge
about a domain, yet they are often overlooked when modelling ontologies [43]
(thus also affecting the efficacy of reasoning services). To tackle this problem,
automated methods for discovering these axioms from the data distribution have
been devised.
    A solution grounded on association rule mining [1] has been proposed in [42]. It is based on comparatively studying the correlation between classes, namely via association rules, negative association rules and correlation coefficients. Background knowledge and reasoning capabilities are used to a limited extent. Different approaches have instead been proposed in [31] and [3], where relational
9
    https://www.w3.org/RDF/

learning methods and techniques based on formal concept analysis, respectively, have been employed for the purpose. However, no specific assessment of the quality of the induced axioms is made.
    A different solution has been proposed in [36] where, moving from the assumption that two or more concepts may be mutually disjoint when the sets of their (known) instances do not overlap, the problem has been regarded as a clustering problem, aiming at finding partitions of similar individuals of the KB according to a cohesion criterion quantifying the degree of homogeneity of the individuals in an element of the partition. Specifically, the problem has been cast as a conceptual clustering problem, where the goal is both to find the best possible partitioning of the individuals and to induce intensional definitions of the corresponding classes, expressed in standard representation languages. Emerging disjointness axioms are captured by the employment of terminological cluster trees (TCTs) and by minimizing the risk of mutual overlap between concepts. Once the TCT is grown, groups of (disjoint) clusters located at sibling nodes identify concepts involved in candidate disjointness axioms to be derived. Unlike [42], which is based on the statistical correlation between instances, the empirical evaluation of [36] showed its ability to discover disjointness axioms also involving complex concept descriptions, thanks to the exploitation of the underlying ontology as background knowledge.
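    The driving intuition, i.e. proposing disjointness when known extensions never overlap, can be sketched as follows on hypothetical toy data; the actual TCT method is considerably more refined, since it clusters individuals by similarity and derives complex concept descriptions rather than comparing asserted extensions pairwise.

```python
# Minimal sketch of the driving intuition, on hypothetical toy data: concepts
# whose known instances never overlap are candidates for disjointness axioms.
instances = {
    "Person":  {"anna", "bob", "carl"},
    "Company": {"acme", "initech"},
    "Student": {"bob", "carl"},        # overlaps Person: not disjoint with it
}

def disjointness_candidates(instances, min_size=2):
    """Propose DisjointWith(C, D) when both extensions carry some evidence
    (at least min_size known instances) and share no individual."""
    names = sorted(instances)
    return [(c, d)
            for i, c in enumerate(names) for d in names[i + 1:]
            if len(instances[c]) >= min_size and len(instances[d]) >= min_size
            and not instances[c] & instances[d]]

print(disjointness_candidates(instances))
# [('Company', 'Person'), ('Company', 'Student')]
```

Under the OWA such candidates remain hypotheses to be scored (as done in [36] via the grown TCT) rather than axioms to be asserted outright, since non-overlap of known instances does not entail disjointness.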


5   Enriched Embedding Models for Knowledge Graph Completion

Symbol-based ML methods adopt symbols for representing the entities and relationships of a domain and infer generalizations, ideally readily interpretable even by humans, that provide new insights into the data. Differently, numeric-based methods typically adopt feature-vector (propositional) representations and cannot provide interpretable models, but they usually turn out to be rather scalable. For this reason, numeric-based solutions have been mainly used for performing link prediction in KGs (also referred to as knowledge graph completion), which amounts to predicting the existence (or the probability of correctness) of triples in large KGs, which are known to often miss facts. Mostly the RDF representation language has been targeted, almost no reasoning is exploited, and more expressive languages (such as OWL) are basically disregarded.
    Among others, KG embedding methods have received the greatest attention. They aim at converting the data graph into an optimal low-dimensional space in which graph structural information and graph properties are preserved as much as possible [6, 26]. The low-dimensional spaces enable computationally efficient solutions that scale better with the KG dimensions. Graph embedding methods may differ in their main building blocks: the representation space (e.g. point-wise, complex, discrete, Gaussian, manifold), the encoding model (e.g. linear, factorization, neural models) and the scoring function (that can be based on distance, energy, semantic matching or other criteria) [26]. In any case, the objective consists in learning embeddings such that, for distance-based scoring functions, the score of a valid (positive) triple is lower than the score of an invalid (negative) triple.
    TransE [5] has been the very first embedding model registering very high scalability performance. The method relies on a stochastic optimization process that iteratively updates the distributed representations by decreasing the score (i.e. the distance) of the positive, observed triples, while increasing the score of unobserved triples standing as negative examples. The embeddings of all entities and predicates in the KG are learned by minimizing a margin-based ranking loss. Nevertheless, TransE proved limited in properly representing various types of properties, such as reflexive ones, and 1-to-N , N -to-1 and N -to-N relations. To tackle this limitation, moving from TransE, a large family of models has originated; among others, TransR [32] has been proposed, which turned out to be more suitable for handling non 1-to-1 relations.
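    For concreteness, the following is a minimal numpy sketch of the TransE scoring function and of one stochastic update of its margin-based ranking loss, consistent with the lower-is-better convention above; dimensions, learning rate and the toy triples are illustrative assumptions, and real implementations add embedding normalization, mini-batching and further refinements.

```python
import numpy as np

# Toy dimensions and hyper-parameters: illustrative assumptions only.
rng = np.random.default_rng(0)
n_ent, n_rel, dim, margin, lr = 5, 2, 16, 1.0, 0.01
E = rng.normal(size=(n_ent, dim))      # one embedding vector per entity
R = rng.normal(size=(n_rel, dim))      # one embedding vector per relation

def score(h, r, t):
    """TransE score ||h + r - t||: lower means more plausible."""
    return np.linalg.norm(E[h] + R[r] - E[t])

def sgd_step(pos, neg):
    """One stochastic update of the margin-based ranking loss
    max(0, margin + score(pos) - score(neg)) for a triple pair."""
    (h, r, t), (h2, _, t2) = pos, neg  # neg: pos with head or tail corrupted
    loss = margin + score(h, r, t) - score(h2, r, t2)
    if loss <= 0:
        return 0.0                     # pair already separated by the margin
    # Gradients of the two L2 distances w.r.t. the embeddings involved.
    g_pos = (E[h] + R[r] - E[t]) / (score(h, r, t) + 1e-9)
    g_neg = (E[h2] + R[r] - E[t2]) / (score(h2, r, t2) + 1e-9)
    E[h] -= lr * g_pos;  E[t] += lr * g_pos      # pull positive triple closer
    E[h2] += lr * g_neg; E[t2] -= lr * g_neg     # push negative triple apart
    R[r] -= lr * (g_pos - g_neg)
    return loss

# Usage: the observed triple (0, 0, 1) versus a tail-corrupted counterpart.
print(sgd_step((0, 0, 1), (0, 0, 2)))
```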
    An important point that needs to be highlighted is that, because the RDF representation is mostly tackled, most of the considered data collections only contain positive (training) examples, since false facts are usually not encoded. However, as negative examples are needed when learning vector embeddings, two different approaches are generally adopted for obtaining them: either corrupting true/observed triples, with the goal of generating plausible negative examples, or making a local closed-world assumption (LCWA), in which the data collection is assumed to be locally complete [35]. In both cases, wrong negative information may be generated and thus used when training and learning the embedding models. Even more so, existing embedding models do not make use of the additional semantic information that may be encoded when more expressive representation languages are adopted.
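    A minimal sketch of the standard, uninformed corruption procedure on a hypothetical toy KG illustrates the false-negative risk just discussed: filtering candidates only against observed triples cannot rule out true-but-missing facts.

```python
import random

# Hypothetical toy KG: only positive triples are encoded.
entities = ["anna", "bob", "acme", "initech"]
kg = {("anna", "worksFor", "acme"), ("bob", "worksFor", "initech")}

def corrupt(triple, kg, entities):
    """Replace the head or the tail with a random entity; the result is only
    checked against *observed* triples, so a true-but-missing fact (e.g. a
    second employer of anna) could still be returned as a 'negative'."""
    h, r, t = triple
    while True:
        candidate = ((random.choice(entities), r, t) if random.random() < 0.5
                     else (h, r, random.choice(entities)))
        if candidate not in kg:
            return candidate

print(corrupt(("anna", "worksFor", "acme"), kg, entities))
# e.g. ('anna', 'worksFor', 'initech'): plausible, yet possibly a false negative
```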
    Recently, empowered embedding models, targeting KG refinement tasks, have been proposed. In [22] a KG embedding method that also considers logical rules has been formalized, where triples in the KG and rules are represented in a unified framework. A common loss over both representations is defined, which is minimized to learn the embeddings. This proposal resulted in a novel solution, but the specific form of prior knowledge that needs to be available for the KG constitutes its main drawback. A similar drawback also applies to [34], where a solution based on adversarial training is formalized, exploiting Datalog clauses to encode assumptions which are used to regularize neural link predictors. An inconsistency loss is derived that measures the degree of violation of such assumptions on a set of adversarial examples.
    Complementary solutions have been developed. Besides graph structural information, they exploit the additional knowledge that is typically available when rich representation languages such as RDFS and OWL are employed. Differently from the works previously mentioned, these solutions do not require any specific additional formalism for representing prior knowledge. Particularly, [33] proved the effectiveness of combining embedding methods with strategies relying on reasoning services for the injection of background knowledge, so as to enhance the performance of a specific predictive model. Following this line, TransOWL, aiming at injecting background knowledge during the learning process, and its upgraded version TransROWL, where a newly defined and more suitable loss function and scoring function are also exploited, have been proposed [11, 10]. These solutions can take advantage of an informed corruption process that leverages reasoning capabilities, while limiting the amount of false negatives that a less informed random corruption process may cause. Specifically, TransOWL formalizes a model characterized by two main components devoted to injecting background knowledge into the embedding-based model during the training phase: 1) Reasoning: it is used for generating corrupted triples that certainly represent negative instances, thus avoiding false negatives, for a more effective model training. Precisely, using a reasoner, corrupted triples are generated exploiting available RDF/OWL axioms, particularly domain, range, disjointWith, functionalProperty; 2) Background Knowledge Injection: a set of different axioms, namely equivalentClass, equivalentProperty, inverseOf and subClassOf, is employed for defining constraints on the score function to be used in the training phase, so that the resulting vectors, related to such axioms, reflect their specific properties. As a consequence, new triples are also added to the training set on the grounds of these axioms.
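    The contrast with the uninformed procedure sketched earlier can be illustrated as follows; the schema dictionary and type map are hypothetical stand-ins for what a TransOWL-style system would obtain by querying an actual OWL reasoner.

```python
# Hypothetical schema and type assertions; a TransOWL-style system would
# obtain this information from an OWL reasoner rather than hard-coded maps.
schema = {
    "domain":   {"worksFor": "Person"},
    "range":    {"worksFor": "Company"},
    "disjoint": {frozenset({"Person", "Company"})},
}
types = {"anna": "Person", "bob": "Person", "acme": "Company"}

def certainly_negative(triple, schema, types):
    """A corrupted triple is a *safe* negative when its head/tail type is
    disjoint with the property's declared domain/range: no false negatives."""
    h, r, t = triple
    def clash(entity, required):
        return (required is not None and
                frozenset({types[entity], required}) in schema["disjoint"])
    return clash(h, schema["domain"].get(r)) or clash(t, schema["range"].get(r))

print(certainly_negative(("anna", "worksFor", "acme"), schema, types))  # False
print(certainly_negative(("acme", "worksFor", "anna"), schema, types))  # True
```

Corrupted triples passing this test can safely serve as negatives during training, whereas uninformed random corruptions must be accepted on faith.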
    These models have been proved to improve effectiveness on link prediction and triple classification tasks when compared to models such as TransE and TransR, which are grounded on the use of a random corruption process and structural graph properties only. Nevertheless, some shortcomings also emerged, precisely when some of the considered schema axioms were missing, suggesting that additional efforts need to be pursued in this direction.


6   Discussion and Conclusions

Existing symbol-based ML methods are not actually comparable, in terms of scalability, to numeric-based ML solutions. Nevertheless, as already discussed in Sect. 1 and Sect. 5, this gain does not come for free, since it is obtained mostly by giving up expressive representation languages, such as OWL, and by almost forgetting one of the most powerful characteristics of these languages, that is, being empowered with deductive reasoning capabilities that allow new knowledge to be derived. Furthermore, differently from symbol-based methods, numeric-based solutions lack the ability to provide interpretable models, thus limiting the possibility to interpret and understand the motivations for the returned results. Additionally, tasks such as concept and/or disjointness axioms learning cannot be performed without symbol-based methods, which can certainly benefit from very large amounts of information for providing potentially more accurate results.
    Even if some initial results have been registered for semantically enriched embedding solutions (Sect. 5), significant research efforts need to be devoted to developing ML solutions that, while keeping scalability, are able to target expressive representations as well as to provide interpretable models. This actually means pushing towards the integration of numeric and symbolic approaches. Some discussions in this direction have been developed by the Neural-Symbolic Learning and Reasoning community [21, 23], which seeks to integrate principles from neural network learning and logical reasoning. The main conclusion has been that neural-symbolic integration appears particularly suitable for applications characterized by the joint availability of large amounts of (heterogeneous) data and knowledge descriptions, which is actually the case for KGs. Additionally, a set of key challenges and opportunities has been outlined [21], such as: how to represent expressive logics within neural networks, how neural networks should reason with variables, or how to extract symbolic representations from trained neural networks. Preliminary results have been recently registered, encouraging the pursuit of this research direction. An example is represented by SimplE [28], a scalable tensor-based factorization model that is able to learn interpretable embeddings incorporating logical rules through weight tying. Ideas for extracting propositional rules from trained neural networks under a background knowledge have been illustrated in [29], showing that the exploitation of background knowledge allows: a) reducing the cardinality of the extracted rule sets; b) reproducing the input-output function of the trained neural network.
    A conceptual sketch for explaining the classification behavior of artificial neural networks in a non-propositional setting, with the use of background knowledge, has been proposed in [38]. This sheds light on another important issue, that is, the necessity of providing explanations for ML results [8], particularly when they come from very large KBs. The solution illustrated in [38] is in agreement with the idea of exploiting symbol-based interpretable models to explain conclusions [14]. Nevertheless, interpretable models describe how solutions are obtained but not why they are obtained, whilst, as argued in [16, 21], providing an explanation should mean supplying a line of reasoning illustrating the decision-making process of a model using human-understandable features [7]. Hence, in a broader sense, providing an explanation requires opening the box of the reasoning process and making it understandable.


References

 1. Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets
    of items in large databases. In: Buneman, P., et al. (eds.) Proc. of the 1993
    ACM SIGMOD Int. Conf. on Management of Data, pp. 207–216. ACM (1993).
    https://doi.org/10.1145/170035.170072
 2. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F.
    (eds.): Description Logic Handbook, 2nd edition. Cambridge University Press
    (2010). https://doi.org/10.1017/CBO9780511711787
 3. Baader, F., Ganter, B., Sertkaya, B., Sattler, U.: Completing description logic
    knowledge bases using formal concept analysis. In: Veloso, M. (ed.) IJCAI 2007,
    Proceedings of the 20th International Joint Conference on Artificial Intelligence.
    pp. 230–235. AAAI Press (2007)
 4. Berners-Lee, T., Hendler, J., Lassila, O.: The semantic web. Scientific American
    284(5), 34–43 (2001)
 5. Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating
    embeddings for modeling multi-relational data. In: Burges, C.J.C., et al. (eds.)
    Proceedings of NIPS 2013, pp. 2787–2795. Curran Associates, Inc. (2013)

 6. Cai, H., Zheng, V.W., Chang, K.: A comprehensive survey of graph
    embedding: Problems, techniques, and applications. IEEE Transac-
    tions on Knowledge and Data Engineering 30(09), 1616–1637 (2018).
    https://doi.org/10.1109/TKDE.2018.2807452
 7. Chen, J., Lécué, F., Pan, J., Horrocks, I., Chen, H.: Knowledge-based transfer
    learning explanation. In: Thielscher, M., et al. (eds.) Principles of Knowledge Rep-
    resentation and Reasoning: Proc. of the 16th International Conference, KR 2018.
    pp. 349–358. AAAI Press (2018)
 8. d’Amato, C.: Logic and learning: Can we provide explanations in the current
    knowledge lake? In: Bonatti, P., et al. (eds.) Knowledge Graphs: New Directions
    for Knowledge Representation on the Semantic Web (Dagstuhl Seminar 18371),
    Dagstuhl Reports, vol. 8, pp. 37–38. Schloss Dagstuhl–Leibniz-Zentrum fuer Infor-
    matik (2019). https://doi.org/10.4230/DagRep.8.9.29
 9. d’Amato, C., Fanizzi, N., Esposito, F.: Query answering and ontology population:
    An inductive approach. In: Bechhofer, S., et al. (eds.) The Semantic Web: Research
    and Applications, 5th European Semantic Web Conference, ESWC 2008, Proceed-
    ings. LNCS, vol. 5021, pp. 288–302. Springer (2008). https://doi.org/10.1007/978-3-540-68234-9_23
10. d’Amato, C., Quatraro, N.F., Fanizzi, N.: Embedding models for knowledge graphs
    induced by clusters of relations and background knowledge. In: 1st International
    Joint Conf. on Learning & Reasoning, IJCLR 2021, Proceedings (2021), to appear
11. d’Amato, C., Quatraro, N.F., Fanizzi, N.: Injecting background knowledge into
    embedding models for predictive tasks on knowledge graphs. In: Verborgh, R., et
    al. (eds.) The Semantic Web International Conference, ESWC 2021, Proceedings.
    LNCS, vol. 12731, pp. 441–457. Springer (2021). https://doi.org/10.1007/978-3-030-77385-4_26
12. d’Amato, C., Tettamanzi, A., Tran, D.M.: Evolutionary discovery of multi-
    relational association rules from ontological knowledge bases. In: Blomqvist, E.,
    et al. (eds.) Knowledge Engineering and Knowledge Management International
    Conference, EKAW 2016, Proceedings. LNCS, vol. 10024, pp. 113–128 (2016).
    https://doi.org/10.1007/978-3-319-49004-5_8
13. d’Amato, C.: Machine learning for the semantic web: Lessons learnt
    and next research directions. Semantic Web 11(1), 195–203 (2020).
    https://doi.org/10.3233/SW-200388
14. De Raedt, L. (ed.): Logical and Relational Learning: From ILP to MRDM (Cog-
    nitive Technologies). Springer-Verlag (2008)
15. Deng, L., Yu, D. (eds.): Deep Learning: Methods and Applications. NOW Publish-
    ers (2014). https://doi.org/10.1561/2000000039
16. Doran, D., Schulz, S., Besold, T.: What does explainable AI really mean? A new
    conceptualization of perspectives. In: Besold, T.R., Kutz, O. (eds.) Proceedings of
    the 1st International Workshop on Comprehensibility and Explanation in AI and
    ML 2017 co-located with 16th Int. Conf. of the Italian Association for Artificial
    Intelligence (AI*IA 2017), CEUR Work. Proc., vol. 2071. CEUR-WS.org (2017)
17. Fanizzi, N., d’Amato, C., Esposito, F.: DL-FOIL concept learning in description
    logics. In: Zelezný, F., Lavrac, N. (eds.) Inductive Logic Programming, 18th In-
    ternational Conference, ILP 2008, Proceedings. LNCS, vol. 5194, pp. 107–121.
    Springer (2008). https://doi.org/10.1007/978-3-540-85928-4_12
18. Fanizzi, N., d’Amato, C., Esposito, F.: Metric-based stochastic con-
    ceptual clustering for ontologies. Inf. Syst. 34(8), 792–806 (2009).
    https://doi.org/10.1016/j.is.2009.03.008

19. Galárraga, L., Teflioudi, C., Hose, K., Suchanek, F.M.: AMIE: association rule
    mining under incomplete evidence in ontological knowledge bases. In: Schwabe,
    D., et al. (eds.) 22nd International World Wide Web Conference, WWW ’13. pp.
    413–422. International World Wide Web Conferences Steering Committee / ACM
    (2013). https://doi.org/10.1145/2488388.2488425
20. Galárraga, L., Teflioudi, C., Hose, K., Suchanek, F.M.: Fast rule mining in on-
    tological knowledge bases with AMIE+. VLDB Journal 24(6), 707–730 (2015).
    https://doi.org/10.1007/s00778-015-0394-1
21. d’Avila Garcez, A., Besold, T., De Raedt, L., Földiák, P., Hitzler, P., Icard, T.,
    Kühnberger, K., Lamb, L., Miikkulainen, R., Silver, D.: Neural-symbolic learn-
    ing and reasoning: Contributions and challenges. In: 2015 AAAI Spring Sym-
    posia. AAAI Press (2015), http://www.aaai.org/ocs/index.php/SSS/SSS15/
    paper/view/10281
22. Guo, S., Wang, Q., Wang, L., Wang, B., Guo, L.: Jointly embedding knowledge
    graphs and logical rules. In: Proceedings of EMNLP 2016. pp. 192–202. ACL
    (2016). https://doi.org/10.18653/v1/D16-1019
23. Hitzler, P., Bianchi, F., Ebrahimi, M., Sarker, M.K.: Neural-symbolic inte-
    gration and the semantic web. Semantic Web Journal 11(1), 3–11 (2020).
    https://doi.org/10.3233/SW-190368
24. Hogan, A., Blomqvist, E., Cochez, M., d’Amato, C., de Melo, G., Gutiérrez,
    C., Gayo, J.L., Kirrane, S., Neumaier, S., Polleres, A., Navigli, R., Ngomo,
    A.N., Rashid, S., Rula, A., Schmelzeisen, L., Sequeda, J., Staab, S., Zim-
    mermann, A.: Knowledge graphs. ACM Computing Surveys 54, 1–37 (2021).
    https://doi.org/10.1145/3447772
25. Horrocks, I., Patel-Schneider, P., Boley, H., Tabet, S., Grosof, B., Dean, M.:
    SWRL: A semantic web rule language combining OWL and RuleML. W3C Member
    Submission (2004), https://www.w3.org/Submission/SWRL/
26. Ji, S., Pan, S., Cambria, E., Marttinen, P., Yu, P.S.: A survey on knowledge graphs:
    Representation, acquisition and applications. arXiv:2002.00388 (2020)
27. Józefowska, J., Lawrynowicz, A., Lukaszewski, T.: The role of semantics in mining
    frequent patterns from knowledge bases in description logics with rules. TPLP
    10(3), 251–289 (2010). https://doi.org/10.1017/S1471068410000098
28. Kazemi, S., Poole, D.: Simple embedding for link prediction in knowledge graphs.
    In: Bengio, S., et al. (eds.) Advances in Neural Information Processing Systems 31:
    Annual Conf. on Neural Information Processing Systems 2018, NeurIPS 2018, pp.
    4289–4300. ACM (2018)
29. Labaf, M., Hitzler, P., Evans, A.: Propositional rule extraction from neural net-
    works under background knowledge. In: Besold, T.R., et al. (eds.) Proceedings of
    the Twelfth International Workshop on Neural-Symbolic Learning and Reasoning,
    NeSy 2017. CEUR Workshop Proceedings, vol. 2003. CEUR-WS.org (2017)
30. Lehmann, J., Auer, S., Bühmann, L., Tramp, S.: Class expression
    learning for ontology engineering. J. Web Semant. 9(1), 71–81 (2011).
    https://doi.org/10.1016/j.websem.2011.01.001
31. Lehmann, J., Bühmann, L.: ORE - a tool for repairing and enriching knowledge
    bases. In: Patel-Schneider, P.F., et al. (eds.) The Semantic Web - ISWC 2010 - 9th
    International Semantic Web Conference, ISWC 2010, Revised Selected Papers, Part
    II. LNCS, vol. 6497, pp. 177–193. Springer (2010). https://doi.org/10.1007/978-3-642-17749-1_12
32. Lin, Y., Liu, Z., Sun, M., Liu, Y., Zhu, X.: Learning entity and relation embeddings
    for knowledge graph completion. In: AAAI 2015 Proceedings. p. 2181–2187. AAAI
    Press (2015)

33. Minervini, P., Costabello, L., Muñoz, E., Novácek, V., Vandenbussche, P.: Reg-
    ularizing knowledge graph embeddings via equivalence and inversion axioms. In:
    Ceci, M., et al. (eds.) Proceedings of ECML PKDD 2017, Part I. LNAI, vol. 10534,
    pp. 668–683. Springer (2017). https://doi.org/10.1007/978-3-319-71249-9_40
34. Minervini, P., Demeester, T., Rocktäschel, T., Riedel, S.: Adversarial sets for reg-
    ularising neural link predictors. In: Elidan, G., et al. (eds.) UAI 2017 Proceedings.
    AUAI Press (2017)
35. Nickel, M., Murphy, K., Tresp, V., Gabrilovich, E.: A review of relational machine
    learning for knowledge graphs. Proceedings of the IEEE 104(1), 11–33 (2016).
    https://doi.org/10.1109/JPROC.2015.2483592
36. Rizzo, G., d’Amato, C., Fanizzi, N., Esposito, F.: Terminological cluster trees for
    disjointness axiom discovery. In: Blomqvist, E., et al. (eds.) The Semantic Web -
    14th International Conference, ESWC 2017, Proceedings, Part I. LNCS, vol. 10249,
    pp. 184–201. Springer (2017). https://doi.org/10.1007/978-3-319-58068-5_12
37. Rizzo, G., Fanizzi, N., d’Amato, C., Esposito, F.: A framework for tackling
    myopia in concept learning on the web of data. In: Faron-Zucker, C., et al.
    (eds.) Knowledge Engineering and Knowledge Management International Confer-
    ence, EKAW 2018, Proceedings. LNCS, vol. 11313, pp. 338–354. Springer (2018).
    https://doi.org/10.1007/978-3-030-03667-6_22
38. Sarker, M., Xie, N., Doran, D., Raymer, M., Hitzler, P.: Explaining trained neu-
    ral networks with semantic web technologies: First steps. In: Besold, T.R., et al.
    (eds.) Proceedings of the 12th Int. Workshop on Neural-Symbolic Learning and
    Reasoning, NeSy 2017. CEUR Workshop Proc., vol. 2003. CEUR-WS.org (2017)
39. Tiddi, I., d’Aquin, M., Motta, E.: Dedalo: Looking for clusters explanations in a
    labyrinth of linked data. In: Presutti, V., et al. (eds.) The Semantic Web: Trends
    and Challenges - 11th Int. Conference, ESWC 2014, Proceedings. LNCS, vol. 8465,
    pp. 333–348. Springer (2014). https://doi.org/10.1007/978-3-319-07443-6_23
40. Tran, A., Dietrich, J., Guesgen, H., Marsland, S.: An approach to parallel class
    expression learning. In: Bikakis, A., Giurca, A. (eds.) Rules on the Web: Research
    and Applications - 6th Int. Symposium, RuleML 2012, Proc., LNCS, vol. 7438, pp.
    302–316. Springer (2012). https://doi.org/10.1007/978-3-642-32689-9_25
41. Tran, A., Dietrich, J., Guesgen, H., Marsland, S.: Parallel symmetric class expres-
    sion learning. J. of Machine Learning Research 18(64), 1–34 (2017)
42. Völker, J., Fleischhacker, D., Stuckenschmidt, H.: Automatic acquisition
    of class disjointness. Journal of Web Semantics 35(P2), 124–139 (2015).
    https://doi.org/10.1016/j.websem.2015.07.001
43. Wang, T.D., Parsia, B., Hendler, J.: A survey of the web ontology landscape. In:
    Cruz, I., et al. (eds.) The Semantic Web - ISWC 2006, 5th Int. Semantic Web Con-
    ference Proceedings. LNCS, vol. 4273. Springer (2006). https://doi.org/10.1007/11926078_49