<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>OWL2Gen: Towards a Configurable Ontology Generator for Benchmarking</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gunjan Singh</string-name>
          <email>gunjans@iiitd.ac.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ashwat Kumar</string-name>
          <email>ashwat16023@iiitd.ac.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sumit Bhatia</string-name>
          <email>sumit.bhatia@adobe.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Raghava Mutharaju</string-name>
          <email>raghava.mutharaju@iiitd.ac.in</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Adobe Research</institution>
          ,
          <addr-line>New Delhi</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Knowledgeable Computing and Reasoning Lab, IIIT-Delhi</institution>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>OWL 2, Ontology Generator</institution>
          ,
          <addr-line>Ontology Reasoner, Benchmarking</addr-line>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Topics</institution>
          ,
          <addr-line>Posters and Demos</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Recent advancements in OWL 2 reasoners have significantly enhanced their capabilities. However, scaling challenges persist, especially with expressive profiles such as OWL 2 DL. Effective benchmarking is essential to address these challenges. We discuss our effort towards establishing a configurable ontology generator, OWL2Gen, that empowers users to custom-build benchmark ontologies by specifying the types of axioms they need and their respective counts. With a user-friendly interface that lists all OWL 2 constructs, OWL2Gen offers an adaptable approach to ontology generation, facilitating more detailed and nuanced performance evaluations. The code and documentation are available under the Apache 2.0 license.</p>
        <p>ER2024: Companion Proceedings of the 43rd International Conference on Conceptual Modeling: ER Forum, Special Topics, Posters and Demos. CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        In the past decade, there has been remarkable progress in the development of reasoners that
support expressive ontology languages such as OWL 2. Despite the advancements, OWL 2
reasoners still struggle to scale well, especially for expressive language profiles like OWL 2
DL [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. To build high-quality reasoners, developers need to identify and improve performance
bottlenecks in their existing systems. A reasoner benchmark aids developers in evaluating
their system’s performance, addressing limitations, and paving the way for further research to
enhance performance and functionality. In particular, a reasoner needs to be evaluated from
several aspects, such as (a) Coverage, i.e., support for different OWL 2 language constructs
and their combinations, as well as support for various OWL 2 profiles (EL, QL, and RL) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ];
(b) Scalability, i.e., the ability to handle large and expressive ontologies; (c) Performance, i.e.,
evaluation under constraints such as reasoning time and memory consumption.
      </p>
      <p>
        Current benchmarks, such as LUBM [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], UOBM [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and OWL2Bench [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], offer a certain degree
of flexibility but remain limited in scope. They allow users to assess reasoner performance by
generating ABoxes of varying sizes while the TBox remains fixed. OntoBench [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] provides a web
interface for selecting OWL 2 constructs but lacks comprehensive scalability and performance
evaluation. PyGraft [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] generates domain-agnostic ontologies but does not cover all OWL 2
constructs.
      </p>
      <p>To address these limitations, we present our configurable Ontology Generator, OWL2Gen.
Inspired by OntoBench, we provide an interface that allows users to select and specify both
the choice of constructs and their individual counts, facilitating the generation of custom-built
ontologies. It leverages existing OWL 2 DL TBoxes to generate ontologies, providing a practical
evaluation scenario. OWL2Gen’s user-friendly interface simplifies ontology creation, making it
accessible to users with varying levels of familiarity with OWL.</p>
      <p>In the following sections, we describe OWL2Gen’s methodology and outline our plans
for future enhancements, aiming to expand its capabilities for more comprehensive
benchmarking of OWL 2 reasoners.</p>
    </sec>
    <sec>
      <title>2. OWL2Gen</title>
      <p>OWL2Gen is a user-friendly application designed for flexible and configurable ontology
generation. The frontend, detailed in Section 2.1, is developed using HTML, CSS, and JavaScript,
and it interacts with a Java-based backend through a REST interface. The backend, outlined in
Section 2.2, is implemented using the Spring Framework and utilizes the OWL API<sup>1</sup> to create
customized ontologies.</p>
      <sec id="sec-2-1">
        <title>2.1. Graphical User Interface (GUI)</title>
        <p>The graphical interface of OWL2Gen is designed for simplicity and efficiency, so that users
with varying levels of expertise can easily generate customized ontologies. As shown in Figure 1,
the main interface features a configuration panel that lists all OWL 2 constructs, grouped
into eight distinct categories according to the OWL 2 Quick Reference Guide<sup>2</sup>. This structure
mirrors the one used in OntoBench. The categories are arranged within frames in the GUI, and
predefined buttons allow users to quickly select presets, such as choosing all elements within a
specific category.</p>
        <p>Once the user selects the desired OWL 2 constructs, they can specify the quantity needed for
each selected construct. After configuring the desired settings, the user can click the “Generate”
button. At this point, OWL2Gen communicates with the backend via the REST API, where the
user’s configuration is processed and transformed into a fully functional ontology.</p>
        <p>OWL2Gen supports several OWL serialization formats, making it adaptable to different
applications and system requirements. Users select their preferred serialization format from
a drop-down menu that includes all formats supported by the OWL API, currently
Turtle, Manchester, Functional, OWL/XML, and RDF/XML.</p>
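        <p>To make the interaction concrete, the following is a minimal sketch of how a user's selection might be packaged before it is sent to the backend. The field names (constructs, format), the construct identifiers, and the build_request helper are assumptions made for illustration only; they are not taken from the OWL2Gen REST interface.</p>
        <preformat><![CDATA[
```python
import json

# Serialization formats named in the text above.
SUPPORTED_FORMATS = {"Turtle", "Manchester", "Functional", "OWL/XML", "RDF/XML"}


def build_request(construct_counts, serialization="RDF/XML"):
    """Validate a selection of constructs and return it as a JSON string.

    construct_counts maps a construct name to the number of axioms wanted.
    The payload layout is hypothetical, not the tool's real schema.
    """
    if serialization not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported serialization format: {serialization}")
    for name, count in construct_counts.items():
        if not isinstance(count, int) or count < 1:
            raise ValueError(f"count for {name} must be a positive integer")
    payload = {"constructs": construct_counts, "format": serialization}
    return json.dumps(payload, sort_keys=True)


# Example: 50 existential restrictions and 200 subclass axioms, as Turtle.
body = build_request({"ObjectSomeValuesFrom": 50, "SubClassOf": 200}, "Turtle")
print(body)
```
]]></preformat>
        <p>Validating counts and the format name before the request is sent keeps malformed configurations from reaching the generation step.</p>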
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Methodology</title>
        <p>The methodology leverages existing OWL 2 DL TBoxes to generate ontologies, allowing users
to either provide their own input ontologies or use the default ones provided. In this approach,
we first create separate buckets for each OWL 2 construct, all initially empty. Once the user
selects the input base ontologies, axioms belonging to different constructs are sorted into the
corresponding buckets. For example, an axiom like Faculty ⊑ ∃worksFor.College would be
placed in the existential restriction bucket. These axioms serve as the foundation for generating
new ontologies. If the user selects existential restriction, axioms from this bucket will be picked
for inclusion in the new ontology.</p>
        <p><sup>1</sup> https://owlapi.sourceforge.net/</p>
        <p><sup>2</sup> https://www.w3.org/TR/2012/REC-owl2-quick-reference-20121211/</p>
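        <p>The bucketing step described in this section can be sketched as follows. This is a simplified illustration, not the OWL2Gen implementation: axioms are modeled as plain records tagged with a single construct name, whereas a real OWL API axiom may involve several constructs at once. The Axiom record, the CONSTRUCTS list, and fill_buckets are hypothetical names.</p>
        <preformat><![CDATA[
```python
from typing import NamedTuple, Optional


class Axiom(NamedTuple):
    """Toy axiom record: sub ⊑ filler, or sub ⊑ ∃prop.filler when prop is set."""
    construct: str
    sub: str
    prop: Optional[str]
    filler: str


# A small, illustrative subset of OWL 2 constructs (assumed names).
CONSTRUCTS = ["SubClassOf", "ObjectSomeValuesFrom", "ObjectAllValuesFrom"]


def fill_buckets(axioms):
    """Create one empty bucket per construct, then sort the axioms into them."""
    buckets = {name: [] for name in CONSTRUCTS}
    for axiom in axioms:
        if axiom.construct in buckets:
            buckets[axiom.construct].append(axiom)
    return buckets


base_tbox = [
    Axiom("ObjectSomeValuesFrom", "Faculty", "worksFor", "College"),
    Axiom("SubClassOf", "Faculty", None, "Employee"),
]
buckets = fill_buckets(base_tbox)
for name, content in buckets.items():
    print(name, len(content))
```
]]></preformat>
        <p>Buckets for constructs that do not occur in the input TBox simply stay empty, which is why the input ontology must already contain every construct the user wants to select.</p>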
        <p>If the required count for a construct is less than or equal to the available axioms, they are
used directly. However, if the required count exceeds the number of axioms in the bucket, new
axioms are generated by replacing term(s) in the original axiom. For instance, introducing a
new class Faculty_1 can lead to generating an axiom like Faculty_1 ⊑ ∃worksFor.College.
This approach ensures consistency and readability by aligning the naming conventions with the
original vocabulary. The decision on which terms to replace is influenced by factors such as
maximizing class reuse and ensuring sufficient connections between entities to avoid unnecessary
new classes. Our generation pipeline leverages data structures that rank potential axioms based
on the usage frequency of the entities involved in the axiom and the interconnections among
them so far. These structures are consulted before generating new axioms, optimizing class
and property reuse while maintaining complex connections. Additionally, we introduce some
randomness into the process, such as determining whether Faculty and Faculty_1 should
both be subclasses of Employee or if they should be subclasses of Employee and Employee_1,
respectively.</p>
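        <p>A minimal sketch of the selection and term-replacement step is given below. It assumes axioms are (subclass, property, filler) triples and approximates the usage-based ranking by always choosing the least-used template axiom, with ties broken randomly. The ranking structures actually used in OWL2Gen are not specified here and may differ; select_axioms is a hypothetical helper.</p>
        <preformat><![CDATA[
```python
import random
from collections import Counter


def select_axioms(bucket, required, rng=None):
    """Return `required` axioms from a bucket of (sub, prop, filler) triples.

    Existing axioms are used first. If more are needed, new axioms are derived
    by renaming the subclass of a template axiom (Faculty -> Faculty_1, ...),
    keeping the property and filler so that existing entities are reused.
    """
    if required < 0:
        raise ValueError("required count must not be negative")
    if not bucket:
        raise ValueError("cannot generate axioms from an empty bucket")
    rng = rng or random.Random(0)
    if required <= len(bucket):
        return list(bucket[:required])

    result = list(bucket)
    usage = Counter()  # how many variants each template has produced so far
    while len(result) < required:
        # Stand-in ranking: prefer the least-used templates, ties broken randomly.
        least = min(usage[axiom] for axiom in bucket)
        template = rng.choice([a for a in bucket if usage[a] == least])
        usage[template] += 1
        sub, prop, filler = template
        result.append((f"{sub}_{usage[template]}", prop, filler))
    return result


bucket = [("Faculty", "worksFor", "College")]
for axiom in select_axioms(bucket, 3):
    print(axiom)
```
]]></preformat>
        <p>Only the subclass is renamed in this sketch, so the property worksFor and the class College are shared by every generated axiom; a fuller version would also decide, possibly at random, when to introduce new fillers or superclasses.</p>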
        <p>Since we rely on an existing input TBox, some axioms may include related hierarchical or
domain/range axioms. For example, alongside Faculty ⊑ ∃worksFor.College, additional
axioms such as Faculty ⊑ Employee, College ⊑ University, and University ⊑ Organization
might be relevant. To account for this, we provide users the option to retain these
hierarchical and domain/range connections in the generated ontology, ensuring a more complete and
coherent structure that better reflects real-world relationships and reasoning scenarios.</p>
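        <p>One way to picture the optional retention of hierarchical axioms is as an upward traversal of the subclass hierarchy, starting from the classes that occur in the selected axioms. The sketch below assumes the hierarchy is available as a simple mapping from a class to its direct superclasses; the mapping and the related_hierarchy function are illustrative assumptions, and domain/range axioms are omitted for brevity.</p>
        <preformat><![CDATA[
```python
def related_hierarchy(selected_classes, subclass_of):
    """Collect (class, superclass) pairs reachable upward from the selection."""
    kept = []
    seen = set()
    stack = sorted(selected_classes)  # sorted for a deterministic visiting order
    while stack:
        cls = stack.pop()
        if cls in seen:
            continue
        seen.add(cls)
        for parent in subclass_of.get(cls, []):
            kept.append((cls, parent))
            stack.append(parent)
    return kept


# Hierarchy from the running example in the text.
SUBCLASS_OF = {
    "Faculty": ["Employee"],
    "College": ["University"],
    "University": ["Organization"],
}

# Classes mentioned by the selected axiom Faculty ⊑ ∃worksFor.College.
for pair in related_hierarchy({"Faculty", "College"}, SUBCLASS_OF):
    print(pair)
```
]]></preformat>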
        <p>Furthermore, while reasoners may handle one set of axioms generated for certain constructors
efficiently, a different set of axioms with the same constructors could cause computational
blow-ups due to unforeseen interactions. To address this variability, we generate four distinct
ontologies in each run. Given that each constructor’s bucket contains multiple axioms, this
approach enables the creation of varied ontologies. Further details on the approach and design
choices are available in the documentation at https://github.com/kracr/owl2gen.</p>
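        <p>The idea of producing several distinct ontologies from the same buckets can be illustrated with seeded random sampling. The sketch below draws four different selections from one bucket and rejects repeats. It is an assumption about how such variation could be obtained, not a description of the OWL2Gen code; sample_variants and the axiom strings are hypothetical.</p>
        <preformat><![CDATA[
```python
import random
from math import comb


def sample_variants(bucket, count, variants=4, seed=0):
    """Draw `variants` distinct selections of `count` axioms from one bucket."""
    if comb(len(bucket), count) < variants:
        raise ValueError("bucket too small to yield that many distinct selections")
    rng = random.Random(seed)
    seen = set()
    selections = []
    while len(selections) < variants:
        pick = frozenset(rng.sample(bucket, count))
        if pick not in seen:  # skip selections that were already produced
            seen.add(pick)
            selections.append(sorted(pick))
    return selections


bucket = [f"axiom_{i}" for i in range(6)]
for selection in sample_variants(bucket, 3):
    print(selection)
```
]]></preformat>
        <p>Fixing the seed makes a run reproducible, which matters when the resulting ontologies are used to compare reasoners.</p>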
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Use Case Scenarios</title>
        <p>The OWL2Gen ontology generator presents many use cases that significantly enhance research
and practical applications in ontology engineering. The key application lies in benchmarking
various ontology reasoning systems, including conventional and emerging neuro-symbolic
reasoners. Since the generation algorithm leverages an input ontology, users can generate
ontologies in two ways: first, by applying the same set of OWL 2 constructs to the same input
ontology, and second, by using the same constructs across different input ontologies. This
results in ontologies with varying structural characteristics, even with identical constructs.
Researchers can compare these generated ontologies and observe how they exhibit different
performance characteristics, highlighting the impact on reasoning efficiency. Additionally,
users can generate ontologies by gradually increasing the count of individual constructs or
combinations of constructs. This iterative process helps in systematically identifying bottlenecks
in reasoning performance.</p>
        <p>Further, the generator allows users to create ontologies that range from simple structures
with fewer axioms to more complex designs with intricate relationships. This variation enables
researchers to investigate specific cases where smaller ontologies, despite their simplicity, may
challenge reasoning engines due to unexpected complexities in their design. Conversely, larger
ontologies, which are often more structured, can demonstrate faster reasoning times, providing
insights into how these systems handle scalability and efficiency.</p>
        <p>Moreover, the configurable nature of OWL2Gen enables users to produce ontologies tailored
to the unique requirements of neuro-symbolic reasoners, which integrate neural networks with
symbolic reasoning. By generating ontologies that reflect the particular demands of
neuro-symbolic tasks, users can explore how these systems respond to different structural designs and
reasoning scenarios.</p>
        <p>Additionally, OWL2Gen enhances the benchmarking of visualization tools by providing a rich
variety of generated ontologies that can be used to assess and compare their effectiveness. By
creating ontologies with diverse structures and complexities, researchers can evaluate how well
different visualization platforms represent intricate relationships and semantics. This process
not only helps identify the strengths and weaknesses of visualization tools but also drives
improvements in their design. For beginners, OWL2Gen offers a user-friendly interface to
experiment with various ontology configurations and understand their implications. In sensitive
domains, it can simulate ontologies that meet specific regulatory requirements, enabling safe
and compliant exploration of data relationships. This multifaceted utility positions OWL2Gen
as a vital asset for academic research and practical applications across diverse fields.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Conclusion and Future Work</title>
      <p>In this paper, we present our initial efforts to develop a configurable ontology generator that
supports all OWL 2 language constructs. Currently, our tool enables the generation of scalable
ontologies based on input TBoxes; however, it requires users to provide a complete TBox with
all necessary constructs upfront, which limits flexibility for various applications. To address
this limitation, we are working on a more general approach to enhance the tool’s usability.
Our vision is to create a configurable ontology generator that dynamically adapts to varying
input requirements without needing a fully predefined TBox. We also plan to expand support
for various OWL 2 profiles (EL, QL, RL) to ensure compliance with syntactic and semantic
constraints. Additionally, we aim to support incremental ontology building, allowing users
to start with basic constructs and progressively add complexity, guided by an
integrated visualization tool that shows the complexity of the generated
ontology.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Hitzler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Krötzsch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Parsia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. F.</given-names>
            <surname>Patel-Schneider</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rudolph</surname>
          </string-name>
          , et al.,
          <article-title>OWL 2 Web Ontology Language Primer</article-title>
          ,
          <source>W3C recommendation 27</source>
          (
          <year>2009</year>
          )
          <fpage>123</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Krötzsch</surname>
          </string-name>
          ,
          <article-title>OWL 2 profiles: An introduction to lightweight ontology languages</article-title>
          , in: T. Eiter, T. Krennwallner (Eds.),
          <source>Reasoning Web. Semantic Technologies for Advanced Query Answering - 8th International Summer School 2012, Vienna, Austria, September 3-8, 2012. Proceedings</source>
          , volume
          <volume>7487</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2012</year>
          , pp.
          <fpage>112</fpage>
          -
          <lpage>183</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1007/978-3-642-33158-9_4</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Heflin</surname>
          </string-name>
          ,
          <article-title>LUBM: A Benchmark for OWL Knowledge Base Systems</article-title>
          ,
          <source>Journal of Web Semantics</source>
          <volume>3</volume>
          (
          <year>2005</year>
          )
          <fpage>158</fpage>
          -
          <lpage>182</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Qiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Xie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Towards a Complete OWL Ontology Benchmark</article-title>
          , in:
          <source>The Semantic Web: Research and Applications</source>
          , Springer Berlin Heidelberg,
          <year>2006</year>
          , pp.
          <fpage>125</fpage>
          -
          <lpage>139</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>G.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhatia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mutharaju</surname>
          </string-name>
          ,
          <article-title>OWL2Bench: A Benchmark for OWL 2 Reasoners</article-title>
          , in:
          <source>The Semantic Web - ISWC 2020 - 19th International Semantic Web Conference, ISWC 2020</source>
          , Lecture Notes in Computer Science, Springer,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>V.</given-names>
            <surname>Link</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lohmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Haag</surname>
          </string-name>
          ,
          <article-title>OntoBench: Generating Custom OWL 2 Benchmark Ontologies</article-title>
          , in:
          <source>International Semantic Web Conference</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>122</fpage>
          -
          <lpage>130</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>N.</given-names>
            <surname>Hubert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Monnin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>d'Aquin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Monticolo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Brun</surname>
          </string-name>
          ,
          <article-title>PyGraft: Configurable generation of synthetic schemas and knowledge graphs at your fingertips</article-title>
          , in:
          <source>The Semantic Web - 21st International Conference, ESWC 2024, Hersonissos, Crete, Greece, May 26-30, 2024, Proceedings, Part II</source>
          , volume
          <volume>14665</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2024</year>
          , pp.
          <fpage>3</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>