<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Welcome, newborn entity! On handling newly generated entities in ontology transformation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vojtěch Svátek</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ondřej Zamazal</string-name>
          <email>ondrej.zamazal@vse.cz</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kateřina Haniková</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>David Chudán</string-name>
          <email>david.chudan@vse.cz</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mohammad Javad Saeedizade</string-name>
          <email>javad.saeedizade@liu.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eva Blomqvist</string-name>
          <email>eva.blomqvist@liu.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Linköping University</institution>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Prague University of Economics and Business</institution>
          ,
          <addr-line>Czechia</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <fpage>26</fpage>
      <lpage>28</lpage>
      <abstract>
        <p>Modeling can be seen as both an engineering and a design task. Even when clear requirements are available for an ontology modeling endeavour, the requirements can often be met in several different ways. Moreover, when requirements change and evolve, a certain modeling style, or pattern, may no longer be the best choice. However, refactoring an ontology due to such changes in requirements, or due to the desire to align better with other external ontologies or data sets, is a complex and tedious process that also requires extensive expertise. Attempting to automate part of the ontology transformation process, by identifying typical transformation patterns and creating tool support for their semi-automated application, is therefore an important research topic. One specific sub-task of such automation is the naming of new entities (e.g., classes or properties) that are generated through the pattern application. In this paper we discuss the need for such automated naming support, and show the feasibility of introducing Large Language Models (LLMs) to take the automation one step further.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology transformation</kwd>
        <kwd>OWL</kwd>
        <kwd>entity naming</kwd>
        <kwd>Large language models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        When developing an ontology, the ontology engineers must choose one of the (sometimes many) possible
ontological representations of the abstract situation being modeled [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The different alternatives can
be characterized as conforming to certain modeling styles or modeling patterns. For example, a complex
relationship can be expressed as a chain of simpler relationships (e.g., a person wasBornIn a city (or
settlement), which is, in turn, linked by locatedIn to a country), or we can create a shortcut using a
compound relationship (e.g., wasBornInCountry directly linking a person with a country). The choice of
modeling style impacts downstream applications that consume the ontology and apply its ontological
viewpoint; e.g., a knowledge graph visualization tool may or may not be able to employ the chain
property to downsize a diagram, depending on whether this property is an explicit part of the
ontology. One solution to the ontology rigidity problem (entailed by once-made modeling decisions) is
to enable posterior transformation of an ontology across different modeling styles/patterns. To manage
the complexity and resource needs of such transformations, ontology transformation could be fully
automated, or, more likely, semi-automated. A prerequisite of automation is the availability of reusable
transformation patterns, either a priori or, potentially, via on-the-fly discovery of such patterns by
advanced AI methods.
      </p>
      <p>The idea of automated transformation of an ontology into its (fully or partially) semantically
equivalent structural variants, fitting particular tasks, was probably first proposed by Zamazal et al. in the
PatOMat project [2]. The user interaction was supported by a graphical user interface, GUIPOT, and
a graphical editor of transformation patterns was also provided [3]. The tools developed in PatOMat
were used in a number of pilot use cases [4]. However, the high amount of user interaction in the
to-be-transformed entity selection phase, as well as the need to manually tune the new entities’ lexical
labels (generated via simple lexico-syntactic heuristics) after the transformation, prevented wider
uptake of the approach. The EvoPat tool [5], in contrast, handled not only the schema
transformation but also the instance data transformation. However, due to its RDF-triple granularity, expressing
high-level logical patterns in it would have led to low comprehensibility of the transformation pattern
representations. It also did not address entity naming, either in detection or in transformation.</p>
      <p>Aside from such early endeavors, the ontology transformation task has not been extensively
researched in recent years, presumably due to the challenges of formulating the transformation patterns,
identifying the cases when they can be applied, and managing the effects of their application. However,
with the growing base of potential transformation patterns as well as with the current capabilities of
generative AI, we argue that this is the time to attempt such automation anew. Specifically, Large Language
Models (LLMs) should be capable of supporting the matching between ontologies and transformation
patterns as well as of handling some parts of the transformation effect, such as the generation and
naming of new entities. The value of such methods is also supported by the fact that the number of
published RDF datasets is steadily increasing, and their reuse could bring tangible benefits if not
hindered by discrepancies between their modeling styles.</p>
      <p>In this paper we focus on a specific problem within ontology transformation: handling the new
entities (e.g., properties or classes) that result from the application of a transformation pattern. In
Section 2 we illustrate a number of ontology transformation patterns and characterize their ‘new entity
generation’ effect. In Section 3 we then present the results of preliminary experiments in providing
meaningful names for such entities via LLMs.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Transformation patterns and their newly generated entities</title>
      <p>A formal framework for describing ontology transformation patterns has already been provided in
earlier work [2]. Here we only provide the intuition that a transformation pattern (in the context of
OWL2OWL transformation) consists of (i) the source pattern, (ii) the target pattern, and (iii) semantic
links between the source and target. The source and target are both OWL patterns, i.e., syntactically,
fragments of ontologies containing placeholders in the position of some or all entities; the semantic
links are primarily responsible for providing a name (i.e., typically, an rdfs:label value and/or a
human-readable IRI fragment) for every newly generated entity.<sup>1</sup></p>
      <p>Bootstrapping a set of transformation patterns can be done thanks to their similarity to alignment
patterns [6] used to align existing, independently developed ontologies. Specifically, so-called complex
alignments define correspondences not between atomic entities but between (class or property)
expressions of which at least one is compound. Note, however, that while the complex alignment patterns
structurally correspond to transformation patterns, they do not tackle the problem of handling new
entities, which is central to this paper, since no new entity comes into being in the ontology alignment process.</p>
      <p>Table 1 shows seven verbally described transformation patterns (some having alignment patterns as
precursors). Note that some are representatives of larger transformation pattern families, the members
of which share some properties, in particular, the nature of the output entities.</p>
      <p>Given the lack of space, we only explain the patterns semi-formally, providing, for each pattern
(numbered from 1 to 7):
        <list list-type="order">
          <list-item><p>A tentative name for the pattern.</p></list-item>
          <list-item><p>The meta-type of the new entities – class, object property (OP) or data property (DP) – and their
connection to pre-existing ontology entities via RDFS axioms (subclass, domain and range).</p></list-item>
          <list-item><p>The nature of the entailment (OWL) axiom that can be used to entail the ABox statements – class
instantiation or property assertion – for the new entity.</p></list-item>
        </list>
      </p>
      <p><sup>1</sup> Our current implementation of transformation patterns thus actually calls this third part of the pattern ‘naming
transformation’, for simplicity.</p>
      <p>A more formal description of each pattern is available on an auxiliary page.<sup>2</sup></p>
      <p>Pattern 1 is the shortcutting of two chained<sup>3</sup> properties by one new property [7], as in the initial
‘city/country of birth’ example.</p>
      <p>Pattern 2 is derived from a published alignment pattern [6]: if we have an
object property with a defined domain and range (such as :hasReviewDecision with domain :Paper and
range :Decision), and the range class has a subclass (e.g., :AcceptanceDecision), then we can create
a new corresponding subclass of the domain class (‘class of papers with acceptance decision’), equivalent
to a restriction of the property to the subclass of the property’s range.</p>
      <p>Pattern 3 is similar to Pattern
1, but with slightly different input and somewhat different background semantics. The properties
in this pattern do not form a ‘chain’, since the directions of the two source properties are opposite,
thus requiring an inverse property restriction in the entailment. For example, we may have a class
:Marriage that is in the domain of two properties, :involvesHusband and :involvesWife; from this
we derive a new object property expressing the direct link (‘married to’) between the husband and wife.</p>
      <p>Patterns 4 and 5 are straightforward: each simply replaces one property with another,
either its inverse (Pattern 4, e.g., :livesIn with :hasInhabitant) or a property of another kind (Pattern 5, DP instead of OP). Note that
no OWL axiom can assure the ABox entailment for Pattern 5, as it requires the production of a
string literal for the target data property, substituting the corresponding IRI value of the source object
property (e.g., instead of the IRI dbpedia:Amsterdam there would be "Amsterdam"^^xsd:string).
The literal could be produced either by parsing the IRI string or by fetching the value of a ‘naming’
property such as rdfs:label or dcterms:title, but neither approach is guaranteed to work.</p>
      <p>Pattern
6 also generates a data property from an object property such that both have the same
domain (e.g., ‘Person’), but with a different meaning: the target, integer-valued data property
(e.g., ‘numberOfCitizenships’) counts the source object property (e.g., ‘isCitizenOf’)
assertions for the given RDF subject.</p>
      <p>Finally, Pattern 7 is more complex and more content-oriented
than the previous ‘purely structural’ ones. Semantically, the input is a ternary relation
expressing the fact that an object <italic>x</italic> plays a role <italic>r</italic> with respect to an object <italic>y</italic> (for example, Bob plays the role
of researcher at CWI); syntactically, however, the relation is reified to a class (e.g., :RolePlaying,
with outgoing relations :involvesRolePlayer, :involvesRole and :involvesRelatedObject). The
transformation then generates two output entities, as a kind of combination of Pattern 2 and Pattern
1. The first new entity is a subclass of the class of <italic>x</italic> (e.g., :Researcher as subclass of :Person), as
a ‘typification’ of the role. The new subclass can be defined through an OWL property restriction,
but a compound one (such as Researcher ≡ ∃ ^involvesRolePlayer . (∃ involvesRole . ResearcherRole),
in description logic notation), with the link between <italic>x</italic> and the reified relationship node inverted (^).
The second new entity is an object property (e.g., ‘researches at’) directly connecting the new class
(:Researcher)<sup>4</sup> with the class of <italic>y</italic> (say, :Institute). Here, we cannot even express the
entailment in OWL, but would need a rule language (formally, a Horn clause), since multiple
non-chained antecedent atoms are needed (e.g., researchesIn(X,Y) :- involvesRolePlayer(Z,X),
involvesRole(Z,ResearcherRole), involvesRelatedObject(Z,Y)<sup>5</sup>).</p>
      <p><sup>2</sup> https://github.com/Onto-DESIDE-VSE/TransformationPatterns/tree/main/experiments/LLMs%20experiments/EKAW24</p>
      <p><sup>3</sup> In the sense that the range of one is the domain of the other.</p>
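      <p>To make the ABox effect of Pattern 1 and of the Pattern 7 rule concrete, the following minimal sketch materializes the new property assertions by plain joins over a set of triples. It is only an illustration under stated assumptions, not the implementation used in our tooling: it uses neither an OWL reasoner nor a rule engine, the function names are hypothetical, and the individuals Alice, Prague, Carol, rp1, rp2 and the role StudentRole are invented for the example.</p>
      <preformat><![CDATA[
```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def shortcut_chain(triples: Set[Triple], first: str, second: str,
                   new_prop: str) -> Set[Triple]:
    """Pattern 1: derive new_prop(x, z) from first(x, y) and second(y, z)."""
    # Index the second property by its subject so the join is a lookup.
    second_by_subject = {}
    for s, p, o in triples:
        if p == second:
            second_by_subject.setdefault(s, set()).add(o)
    return {
        (x, new_prop, z)
        for x, p, y in triples
        if p == first
        for z in second_by_subject.get(y, ())
    }


def role_shortcut(triples: Set[Triple], role: str,
                  new_prop: str) -> Set[Triple]:
    """Pattern 7 rule: new_prop(X, Y) :- involvesRolePlayer(Z, X),
    involvesRole(Z, role), involvesRelatedObject(Z, Y)."""
    players, related, nodes_with_role = {}, {}, set()
    for s, p, o in triples:
        if p == "involvesRolePlayer":
            players.setdefault(s, set()).add(o)
        elif p == "involvesRelatedObject":
            related.setdefault(s, set()).add(o)
        elif p == "involvesRole" and o == role:
            nodes_with_role.add(s)
    # All three atoms share the reified node Z, so we join on it.
    return {
        (x, new_prop, y)
        for z in nodes_with_role
        for x in players.get(z, ())
        for y in related.get(z, ())
    }


# Hypothetical ABox; only Bob, CWI and ResearcherRole come from the text.
abox = {
    ("Alice", "wasBornIn", "Prague"),
    ("Prague", "locatedIn", "Czechia"),
    ("rp1", "involvesRolePlayer", "Bob"),
    ("rp1", "involvesRole", "ResearcherRole"),
    ("rp1", "involvesRelatedObject", "CWI"),
    ("rp2", "involvesRolePlayer", "Carol"),
    ("rp2", "involvesRole", "StudentRole"),
    ("rp2", "involvesRelatedObject", "CWI"),
}

born_in_country = shortcut_chain(abox, "wasBornIn", "locatedIn",
                                 "wasBornInCountry")
researches_in = role_shortcut(abox, "ResearcherRole", "researchesIn")
print(born_in_country)
print(researches_in)
```
]]></preformat>
      <p>In both functions the name of the new property is passed in as a parameter; choosing it is exactly the naming problem that the structural part of a pattern leaves open.</p>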
      <p>Note that we only discussed simple transformation patterns that are applied to a single
ontology fragment in one match. Some other patterns, such as conversions of class hierarchies to SKOS
(instance-level) taxonomies, require bulk operation and would be more difficult to describe.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Large Language Model (LLM) as namegiver</title>
      <p>All the mentioned transformation patterns feature the creation of new entities, and except for OP to DP
conversion (where likely, although not certainly, the data property could retain the name of the
object one), labeling them is not trivial. Earlier approaches [4] employed template-based NLG
techniques for such entity naming, sometimes with a grain of linguistic knowledge (such as WordNet
derivative forms, allowing, e.g., nominalization of verbs). Nowadays, however, it seems natural to resort
to LLMs for this task, which should ensure fluency of the output, and possibly also some degree of
semantic abstraction, leading to parsimony in terms of the number of tokens involved. The last point
is particularly important if the transformed entities are to appear in analytical applications, whether
knowledge graph visualization, or, say, knowledge graph mining, where output hypotheses/models
may consist of numerous interconnected entities.</p>
      <p>To assess the feasibility of the approach we have carried out preliminary, merely qualitative
experiments on generating names for new entities in the context of Pattern 1 (a new property) and Pattern 2
(a new class). For each pattern we gathered a dataset of randomly selected examples from the Archivo
repository.<sup>6</sup> A human annotator (one of the authors of this paper) then suggested a name for each
new property (Pattern 1) and for each new class (Pattern 2). For Pattern 1 we selected 16 examples that
were deemed the most interesting and suitable for the pattern. For Pattern 2, 25 examples were
selected. These examples were used to execute the experiments with LLMs and assess their output
suggestions.</p>
      <p>For the LLM-based naming of generated entities, we performed two experiments. The first (a)
used prompts in natural language only, and the second (b) used prompts including the input
ontology structures in Turtle syntax. The intuition was to explore whether the LLMs can handle
requests involving the actual pattern structure, as expressed in a formal language (possibly benefiting
from the LLM having also been trained on code samples in this language), or whether such structures
first need to be transformed into natural language (involving an additional step with potential sources
of error).</p>
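      <p>The difference between the two prompting styles can be sketched as follows for a Pattern 1 example. This is a hedged illustration only: the prompt wording, the helper names, the example.org namespace and the label-to-IRI conversion are assumptions made for this sketch and do not reproduce the prompts actually used in the experiments.</p>
      <preformat><![CDATA[
```python
from typing import Tuple

# (class A, property 1, class B, property 2, class C), given as labels
Chain = Tuple[str, str, str, str, str]


def to_iri_fragment(label: str, upper_first: bool) -> str:
    """Turn a label into a CamelCase (class) or camelCase (property) name."""
    words = label.split()
    head = words[0].capitalize() if upper_first else words[0].lower()
    return head + "".join(w.capitalize() for w in words[1:])


def natural_language_prompt(chain: Chain, n_options: int = 3) -> str:
    """Variant (a): the chain is verbalized in plain English."""
    a, p1, b, p2, c = chain
    return (
        f'In an ontology, the class "{a}" is linked by the property "{p1}" '
        f'to the class "{b}", which is linked by the property "{p2}" '
        f'to the class "{c}". Suggest {n_options} names for a new property '
        f'that links "{a}" directly to "{c}".'
    )


def turtle_prompt(chain: Chain, n_options: int = 3) -> str:
    """Variant (b): the chain is given as a Turtle fragment."""
    a, p1, b, p2, c = chain
    ca, cb, cc = (to_iri_fragment(x, True) for x in (a, b, c))
    q1, q2 = (to_iri_fragment(x, False) for x in (p1, p2))
    turtle = "\n".join([
        "@prefix : <http://example.org/onto#> .",  # hypothetical namespace
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
        f":{q1} a owl:ObjectProperty ; rdfs:domain :{ca} ; rdfs:range :{cb} .",
        f":{q2} a owl:ObjectProperty ; rdfs:domain :{cb} ; rdfs:range :{cc} .",
    ])
    return (
        f"{turtle}\n\n"
        f"Suggest {n_options} names for a new object property with domain "
        f":{ca} and range :{cc} that shortcuts the chain above."
    )


chain = ("Meeting", "meeting notes", "Meeting Notes",
         "contains action item", "Action Item")
nl_prompt = natural_language_prompt(chain)
ttl_prompt = turtle_prompt(chain)
print(nl_prompt)
print(ttl_prompt)
```
]]></preformat>
      <p>Asking for several options in one prompt, as the hypothetical n_options parameter does, makes it possible to see whether at least one suggestion is suitable.</p>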
      <p>Even though these experiments are very preliminary, only a first attempt to assess the quality of LLM
support capabilities, the results are promising.<sup>7</sup> In the experiments with Pattern 2, however, it seems
that the LLM provides better results for this task when it is prompted with natural language
only (experiment a) rather than with Turtle syntax (experiment b).</p>
      <p>A certain level of agreement was observed between the LLM suggestions and the human naming,
and also between different LLMs. For example, suggestions made for the chain of properties (Pattern 1)
with the following classes and properties “Meeting - meeting notes - Meeting Notes - contains action item
- Action Item” included:
        <list list-type="bullet">
          <list-item><p>Suggested property (human) – assigned action item</p></list-item>
          <list-item><p>GPT-4o (natural language prompt, without output specification) – generates action item or has
action item</p></list-item>
          <list-item><p>GPT-4o (natural language prompt, 3 outputs required) – action item assigned in or outcome of
meeting</p></list-item>
          <list-item><p>GPT-4o (Turtle representation) – has action item or includes task</p></list-item>
          <list-item><p>GPT-4o (Turtle representation, three-shot prompting) – has action item or includes action item</p></list-item>
        </list>
      </p>
      <p>In some cases, the results gained from LLMs appeared even more suitable than the names suggested by
a human. For example, for the chain of properties (Pattern 1) “Organization
- delivery location - Residence Object - address - Basic Address”, the human annotator suggested calling the
new property between “Organization” and “Basic Address” “address of delivery location”, while Gemini
Pro and GPT-4o suggested “has delivery address”, which is more natural. This indicates that, apart from
the fully automated case, even in a semi-automated setting, using LLM suggestions may allow human
users to think of better formulations.</p>
      <p><sup>4</sup> Note the change of the property’s domain to the subclass of the original domain, with no counterpart in Pattern 1 or Pattern 2.</p>
      <p><sup>5</sup> We alter the entity notation in the examples according to the standard syntax (RDF vs. description logic vs. Horn rules); the
identity of entities should, however, be clear from the lexical content of the tokens.</p>
      <p><sup>6</sup> https://archivo.dbpedia.org/</p>
      <p><sup>7</sup> Full set of results available from https://github.com/Onto-DESIDE-VSE/TransformationPatterns/tree/main/experiments/LLMs%20experiments/EKAW24</p>
      <p>Of course, there were also some ‘hallucinations’<sup>8</sup> returned by some of the models, i.e., suggestions
that do not make sense. For example, for Pattern 2 with the example “Person - has period - Period of Life
Honour”, one LLM suggestion for the new subclass of person was “Honored Period Person”. With a
different prompt, however, the same model suggested “Honored Person”. This is also the reason why we
decided to prompt each model for more than one option, so that we could observe whether it is capable of
coming up with some suitable name for a new entity even in light of occasional random responses.</p>
      <p>It should be noted that while, for simplicity, we referred to the mentioned research activities as
‘experiments’, no rigorous experimental protocol was followed; the analysis of results
rather took the form of a group discussion over the individual results, followed by seeking agreement on
whether the overall observation on the models’ performance is positive or not.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions and future work</title>
      <p>Adequately treating, and plausibly naming, newly generated entities is a crucial challenge in the
automation of ontology transformations. Such challenges need to be overcome to make RDF knowledge
graphs adaptable to new requirements and applications, in a scalable way. Throughout the paper, we
have illustrated the need for handling of new entities in many common transformation patterns, and
provisionally demonstrated the feasibility of using LLMs for the naming of these new entities.</p>
      <p>We plan to extend the preliminary attempts to use LLMs for new entity naming to more rigorous
experiments with multiple language models and across a larger set of transformation patterns. Upon
the successful completion of these experiments, we will integrate the invocation of properly configured
LLMs into our new ontology transformation tool currently under development [8].<sup>9</sup> A further challenge
to be investigated is the interference between the applications of different transformation
patterns (or manifold application of the same pattern) on the same ontology. In order to better grasp this
phenomenon, as well as for various other purposes, we plan to design an ontology and knowledge graph
of OWL2OWL transformation patterns, capturing both syntactic and semantic (linking to relevant
foundational ontologies) aspects of these patterns.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>This work has been supported by the EU’s Horizon Europe grant no. 101058682 (Onto-DESIDE).</p>
      <p><sup>8</sup> Since we are dealing with names for general entities and not with facts, using the term ‘hallucination’ for inadequate output
might not be the best fit.</p>
      <p><sup>9</sup> The PatOMat2 tool, https://github.com/Onto-DESIDE-VSE/patomat2, is a follow-up of the original PatOMat, with numerous
enhancements.</p>
      <ref-list>
        <title>References</title>
        <ref id="ref2">
          <mixed-citation>[2] O. Sváb-Zamazal, V. Svátek, L. Iannone, Pattern-based ontology transformation service exploiting OPPL and OWL-API, in: P. Cimiano, H. S. Pinto (Eds.), Knowledge Engineering and Management by the Masses - 17th International Conference, EKAW 2010, Lisbon, Portugal, October 11-15, 2010. Proceedings, volume 6317 of Lecture Notes in Computer Science, Springer, 2010, pp. 105–119. URL: https://doi.org/10.1007/978-3-642-16438-5_8. doi:10.1007/978-3-642-16438-5_8.</mixed-citation>
        </ref>
        <ref id="ref3">
          <mixed-citation>[3] O. Sváb-Zamazal, M. Dudás, V. Svátek, User-friendly pattern-based transformation of OWL ontologies, in: A. ten Teije, J. Völker, S. Handschuh, H. Stuckenschmidt, M. d’Aquin, A. Nikolov, N. Aussenac-Gilles, N. Hernandez (Eds.), Knowledge Engineering and Knowledge Management - 18th International Conference, EKAW 2012, Galway City, Ireland, October 8-12, 2012. Proceedings, volume 7603 of Lecture Notes in Computer Science, Springer, 2012, pp. 426–429. URL: https://doi.org/10.1007/978-3-642-33876-2_39. doi:10.1007/978-3-642-33876-2_39.</mixed-citation>
        </ref>
        <ref id="ref4">
          <mixed-citation>[4] O. Zamazal, V. Svátek, PatOMat - versatile framework for pattern-based ontology transformation, Comput. Informatics 34 (2015) 305–336. URL: http://www.cai.sk/ojs/index.php/cai/article/view/1138.</mixed-citation>
        </ref>
        <ref id="ref5">
          <mixed-citation>[5] C. Rieß, N. Heino, S. Tramp, S. Auer, EvoPat - pattern-based evolution and refactoring of RDF knowledge bases, in: P. F. Patel-Schneider, Y. Pan, P. Hitzler, P. Mika, L. Zhang, J. Z. Pan, I. Horrocks, B. Glimm (Eds.), The Semantic Web - ISWC 2010 - 9th International Semantic Web Conference, ISWC 2010, Shanghai, China, November 7-11, 2010, Revised Selected Papers, Part I, volume 6496 of Lecture Notes in Computer Science, Springer, 2010, pp. 647–662. URL: https://doi.org/10.1007/978-3-642-17746-0_41. doi:10.1007/978-3-642-17746-0_41.</mixed-citation>
        </ref>
        <ref id="ref6">
          <mixed-citation>[6] F. Scharfe, O. Zamazal, D. Fensel, Ontology alignment design patterns, Knowl. Inf. Syst. 40 (2014) 1–28. URL: https://doi.org/10.1007/s10115-013-0633-y. doi:10.1007/s10115-013-0633-y.</mixed-citation>
        </ref>
        <ref id="ref7">
          <mixed-citation>[7] A. Krisnadhi, N. Karima, P. Hitzler, R. Amini, V. Rodríguez-Doncel, K. Janowicz, Ontology design patterns for linked data publishing, in: P. Hitzler, A. Gangemi, K. Janowicz, A. Krisnadhi, V. Presutti (Eds.), Ontology Engineering with Ontology Design Patterns - Foundations and Applications, volume 25 of Studies on the Semantic Web, IOS Press, 2016, pp. 201–232. URL: https://doi.org/10.3233/978-1-61499-676-7-201. doi:10.3233/978-1-61499-676-7-201.</mixed-citation>
        </ref>
        <ref id="ref8">
          <mixed-citation>[8] O. Zamazal, M. Ledvinka, V. Svátek, PatOMat2: A Tool for Pattern-Based Ontology Transformation using SPARQL, in: I. Novalija, C. Badenes-Olmedo (Eds.), Companion Proceedings of the 24th International Conference on Knowledge Engineering and Knowledge Management, Amsterdam, Netherlands, CEUR Workshop Proceedings, To Appear. CEUR-WS.org, 2024.</mixed-citation>
        </ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Shimizu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hammar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hitzler</surname>
          </string-name>
          , Modular ontology modeling,
          <source>Semantic Web</source>
          <volume>14</volume>
          (
          <year>2023</year>
          )
          <fpage>459</fpage>
          -
          <lpage>489</lpage>
          . URL: https://doi.org/10.3233/SW-222886. doi:10.3233/SW-222886.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>