<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Martin Ledvinka</string-name>
          <email>martin.ledvinka@fel.cvut.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Miroslav Blaško</string-name>
          <email>miroslav.blasko@fel.cvut.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michal Med</string-name>
          <email>michal.med@fel.cvut.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Workshop</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Vienna, Austria</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Faculty of Electrical Engineering, Czech Technical University in Prague</institution>
          ,
          <addr-line>Technická 2, Prague 6</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Terminology</institution>
          ,
          <addr-line>Glossary, Thesaurus, SKOS</addr-line>
        </aff>
      </contrib-group>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Many domains use specific terminologies to describe concepts. Being able to explicitly manage such terminologies instead of relying on their common knowledge is beneficial both for newcomers and for people for whom the terminology is not their daily bread and butter. This is especially true for legislative terminologies. We present TermIt, a Semantic Web-based terminology manager that allows domain experts to create and manage high-quality terminologies, link them to normative documents as well as use them to annotate other related documents. We discuss the architecture of the system and the technologies used in its development.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>People in different domains tend to use specific terminologies to describe concepts. The meaning of
these terms can often conflict with the meaning understood by lay folk (an issue in GitHub issue
tracking does not necessarily mean a problem with the software), or there may exist subtle differences
in the terms’ semantics. In the case of legislation or other normative documents, these subtle differences
may have serious consequences. For example, according to the Czech Ordinance No. 268/2009 Coll.
(Ordinance on technical requirements for buildings), a building is an above-ground structure, including
its underground portion, spatially concentrated and externally enclosed for the most part by perimeter
walls and roof construction. On the other hand, according to the Czech Law No. 406/2000 Coll.
(Energy Management Act), a building is an above-ground structure and its underground parts, spatially
concentrated and externally largely enclosed by external walls and roof structure, in which energy is
used to modify the indoor environment for heating or cooling purposes. The difference between these
two meanings can be important when communicating with a public office or in a legal dispute.</p>
      <p>It is therefore advantageous to be able to manage a terminology of the domain consisting of terms,
where each term has a label, i.e., the phrase it is called by, and a definition
describing the term’s meaning.</p>
      <p>TermIt was created to support the creation, management and sharing of such terminologies.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Use Case Overview</title>
      <p>
        TermIt is a terminology manager based on SKOS (Simple Knowledge Organization System) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. It is an
information system with two main use cases: (1) terminology management and (2) document annotation.
      </p>
      <sec id="sec-2-1">
        <sec id="sec-2-1-1">
          <title>2.1. Terminology Management</title>
          <p>The terms in TermIt are grouped into glossaries, where each glossary corresponds to a
skos:ConceptScheme.<sup>1</sup> Each term is a skos:Concept with basic attributes such as
skos:prefLabel, skos:definition, etc. A hierarchy of terms can be built within a glossary
(using skos:broader) as well as across glossaries (using skos:broadMatch). Additional non-hierarchical
relationships between terms can be specified using skos:related(Match) and skos:exactMatch. Users
are also able to create and use additional attributes that are not covered by the built-in data model.</p>
          <p>To support multilingual use cases (for example, the Czech Standardization Agency uses TermIt
to map building construction glossaries in cooperation with other European standardization agencies),
all string-based term attributes may contain language-tagged values. The only requirement for a term
is to have a label (skos:prefLabel) in the glossary’s configured primary language.</p>
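          <p>As a rough illustration of the data model described above, the following Python sketch represents a term with language-tagged labels, checks the primary-language label requirement, and serializes the term to Turtle using the SKOS properties named in the text. It is not TermIt code: the Term class, its field names, the helper functions, and the example.org IRIs are hypothetical and serve only to make the structure concrete.</p>
          <preformat><![CDATA[
```python
from dataclasses import dataclass, field

SKOS = "http://www.w3.org/2004/02/skos/core#"


@dataclass
class Term:
    """Hypothetical in-memory stand-in for a skos:Concept."""
    iri: str
    pref_labels: dict                                  # language tag -> label
    definitions: dict = field(default_factory=dict)    # language tag -> definition
    broader: list = field(default_factory=list)        # parent terms, same glossary
    broad_match: list = field(default_factory=list)    # parent terms, other glossaries


def has_primary_label(term, primary_language):
    """Check the single requirement from the text: a label in the primary language."""
    return bool(term.pref_labels.get(primary_language, "").strip())


def _literal(text, lang):
    """Render a language-tagged Turtle literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def to_turtle(term, glossary_iri):
    """Serialize the term as Turtle; the glossary plays the role of the concept scheme."""
    lines = [f"<{term.iri}> a skos:Concept ;",
             f"    skos:inScheme <{glossary_iri}> ;"]
    for lang, label in sorted(term.pref_labels.items()):
        lines.append(f"    skos:prefLabel {_literal(label, lang)} ;")
    for lang, text in sorted(term.definitions.items()):
        lines.append(f"    skos:definition {_literal(text, lang)} ;")
    for parent in term.broader:
        lines.append(f"    skos:broader <{parent}> ;")
    for parent in term.broad_match:
        lines.append(f"    skos:broadMatch <{parent}> ;")
    # Turn the trailing " ;" of the last statement into the closing " ."
    lines[-1] = lines[-1][:-2] + " ."
    return f"@prefix skos: <{SKOS}> .\n\n" + "\n".join(lines)


# Hypothetical example data
GLOSSARY = "http://example.org/glossary/building-law"
building = Term(
    iri=GLOSSARY + "/term/building",
    pref_labels={"en": "building", "cs": "budova"},
    definitions={"en": "An above-ground structure, including its underground portion."},
    broader=[GLOSSARY + "/term/structure"],
)

if __name__ == "__main__":
    print(has_primary_label(building, "cs"))
    print(to_turtle(building, GLOSSARY))
```
]]></preformat>
          <p>Keeping labels and definitions as maps keyed by language tag mirrors the language-tagged literals of RDF, so the primary-language check reduces to a single lookup.</p>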
          <p>
            Although domain experts, who are the primary intended users of TermIt, are not expected to do
detailed modeling of the relationships between terms (this task requires an ontology engineer), TermIt
does offer basic modeling capabilities to more advanced users. One already mentioned is the SKOS-based
ability to create term hierarchies. Another is the ability to classify terms into categories. By default, a
UFO-based [
            <xref ref-type="bibr" rid="ref2">2</xref>
            ] classification scheme is supported, but different options may be provided. Should a need
arise to create a conceptual model of the domain, an ontology engineer could use a modeling tool such
as OntoGrapher [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ] (see Section 3) to further specify term data.
          </p>
          <p>
            Data quality is a concern in many cases where people provide the input for an information system
and terminology management is no exception. On the contrary, since the terminology is often based
on normative documents or used in highly regulated conditions, the quality of the terminology is
important. The solution in TermIt is twofold: 1) basic consistency rules, such as the requirement for a
term to have a label that is unique within its glossary, are enforced; 2) a validation service that checks
the quality of terms in a glossary using a (configurable) set of SHACL [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ] rules is used. These rules
check, for example, that a term has a definition, that the term’s label is not used as another term’s
synonym (skos:altLabel), that a term’s identifier is in line with the glossary namespace, etc.
          </p>
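          <p>The three example checks listed above can be approximated in plain Python. The sketch below is only a stand-in for illustration: TermIt expresses such checks as SHACL rules, whereas this code uses ordinary dictionaries with hypothetical keys (id, label, definition, alt_labels), and the case-insensitive label comparison is an assumption of the sketch rather than documented behavior.</p>
          <preformat><![CDATA[
```python
def validate_glossary(namespace, terms):
    """Return (term_id, message) pairs for every violated check.

    `terms` is a list of dicts with the hypothetical keys
    "id", "label", "definition" and "alt_labels".
    """
    # Map every synonym (lower-cased) to the set of terms declaring it
    synonym_owners = {}
    for term in terms:
        for alt in term.get("alt_labels", []):
            synonym_owners.setdefault(alt.lower(), set()).add(term["id"])

    issues = []
    for term in terms:
        # Check 1: the term has a definition
        if not term.get("definition"):
            issues.append((term["id"], "missing definition"))
        # Check 2: the label is not used as another term's synonym
        others = synonym_owners.get(term["label"].lower(), set()) - {term["id"]}
        if others:
            issues.append((term["id"], "label used as a synonym of another term"))
        # Check 3: the identifier is in line with the glossary namespace
        if not term["id"].startswith(namespace):
            issues.append((term["id"], "identifier outside glossary namespace"))
    return issues


# Hypothetical example data
NS = "http://example.org/glossary/"
TERMS = [
    {"id": NS + "building", "label": "building",
     "definition": "An above-ground structure.", "alt_labels": ["structure"]},
    {"id": NS + "structure", "label": "Structure",
     "definition": "", "alt_labels": []},
    {"id": "http://other.org/roof", "label": "roof",
     "definition": "Top covering of a building.", "alt_labels": []},
]

if __name__ == "__main__":
    for term_id, message in validate_glossary(NS, TERMS):
        print(term_id, "-", message)
```
]]></preformat>
          <p>Returning a list of findings instead of raising on the first violation matches the purpose described above, which is to give the user feedback on what to improve rather than to reject the data.</p>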
          <p>Other terminology management features of TermIt include tracking changes made by users, the
ability to create glossary snapshots, comment on terms, and a simple state-based workflow. The user
may search for glossaries and terms using full-text search or use faceted search for more fine-grained
term exploration. Glossaries may be exported/imported to/from RDF as well as MS Excel.</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>2.2. Document Annotation</title>
          <p>Terminologies are often based on normative documents. TermIt therefore supports the attachment
of documents (possibly consisting of multiple files) to glossaries. These documents can be used as a
seed to populate the glossary based on suggestions from Annotace<sup>2</sup> [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] – a text analysis service, or as a
reference for the definition of existing terms in a glossary. In fact, the definition of the term can be
marked directly in the document, providing a clear link to its source.
          </p>
          <p>Another use case for document annotation is having an existing glossary and finding the occurrences
of its terms in the document. For example, the Prague Institute for Planning and Development maintains
multiple glossaries that are then used to provide definitions of terms occurring in the documents it
publishes. The Portal of Prague Planning Analytical Materials (PAM)<sup>3</sup> is one of the sites where such
documents are published. Figure 1 illustrates how the PAM Portal and its maintainers use TermIt.</p>
          <p>It is important to note that the document annotation is semi-automatic – the service suggests
occurrences of terms, but an authorized user is ultimately required to approve or reject these occurrences.
If a new version of a document is uploaded to TermIt, previous occurrences are retained, as long as the
changes in the document are not too significant. Annotace expects the content to be HTML (or plain
text that it transforms to HTML) and produces RDFa-annotated HTML.<sup>4</sup></p>
          <p><sup>1</sup> skos: is the prefix for http://www.w3.org/2004/02/skos/core#</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Technical Overview</title>
      <p>
        TermIt as a system consists of the TermIt application and multiple optional services; Figure 2 illustrates
its architecture. The TermIt application itself is designed as a monolith with a back-end written in Java
and a front-end written in TypeScript. The back-end follows the layered architectural style [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which is
typical for web applications and promotes separating the business logic of the application from the
infrastructural data access and web service layers. The additional services that can be used as part of
TermIt are:
      </p>
      <p>
        Annotace [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] is a text analysis service that can be used to annotate HTML documents with occurrences
of terms from the provided glossaries and suggest new terms based on their significance in the
text. The current implementation supports two lemmatizers: Spark<sup>5</sup> and MorphoDiTa [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] (more
suitable for Czech).
      </p>
      <p>The validation service (as already mentioned in Section 2.1) is used to determine the quality of the term
metadata and to give the user feedback on what they should improve. The service contains
a predefined set of SHACL rules, and the caller may choose which ones to use for individual
validation calls.</p>
      <p>An authentication service compatible with the OAuth2 standard can be used instead of the built-in
internal user management and authentication mechanism. This configuration is suitable when
the target organization already has an authentication solution in place or when TermIt is to be
used with OntoGrapher (see below).</p>
      <p>
        OntoGrapher [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is a Web-based conceptual modeling tool designed primarily for ontology
engineers, facilitating a more detailed description of relationships between terms.
      </p>
      <sec id="sec-3-1">
        <title>3.1. Technologies</title>
        <p>TermIt uses Semantic Web technologies at all levels of its implementation, yet the software libraries
it uses ensure that it does not differ from regular Web applications in terms of software engineering
and development. The data are stored in a triple store. In particular, GraphDB<sup>6</sup> is used for its superior
performance and support for custom inference rules. TermIt uses a number of rules, for example, to
infer inverse relationships between terms or between a term and the glossary to which it belongs. RDF named
graphs are used to structure the data – each glossary is stored in a separate named graph (its IRI<sup>7</sup>
coincides with the glossary’s identifier) and additional named graphs are used for related technical data
(change records, comments, etc.).</p>
        <p><sup>2</sup> https://github.com/kbss-cvut/annotace, accessed 2025-07-30</p>
        <p><sup>3</sup> https://uap.iprpraha.cz/en, accessed 2025-07-30</p>
        <p><sup>4</sup> https://www.w3.org/TR/rdfa-core/, accessed 2025-07-30</p>
        <p><sup>5</sup> https://sparknlp.org/, accessed 2025-07-30</p>
        <p><sup>6</sup> https://graphdb.ontotext.com/, accessed 2025-07-30</p>
        <sec id="sec-3-1-1">
          <p>
            The repository is accessed using the Java OWL Persistence API (JOPA) [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ] – a semantic data
persistence library. The back-end uses Spring Boot<sup>8</sup> and provides a REST API for the front-end and other
clients to use. The API is documented by an OpenAPI<sup>9</sup> specification. The API supports JSON and
JSON-LD as data-exchange formats. The front-end communicates with the back-end using JSON-LD,
but other clients (such as the aforementioned Portal of Prague Planning Analytical Materials) use JSON.
JSON-LD (de)serialization is handled by the JB4JSON-LD library [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ], which is also developed by TermIt’s
authors.
          </p>
          <p>The front-end is a client-side application written in TypeScript using React. It follows common React
application development practices – it uses a state management library, routing, authentication, and
authorization workflow – and the user interface (UI) is localized to Czech and English. Since all UI
texts are extracted into localization files, adding a translation for another language would be trivial.
As mentioned earlier, the front-end communicates with the back-end using JSON-LD with context.
From experience, this increases the amount of data transferred for a single object retrieval, but in
situations where objects repeat in the content, the amount of data actually decreases significantly,
because JB4JSON-LD uses an object’s identifier for its repeated occurrences. The jsonld.js<sup>10</sup> library is then used to
ensure that all references are replaced with the full object for the rest of the front-end code.</p>
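          <p>The reference replacement step can be pictured with a small sketch. The Python function below is not the jsonld.js API and is written in Python rather than TypeScript purely for illustration; it assumes that a repeated occurrence is serialized as an object holding only an @id key, and the property names label, glossary, and title are hypothetical compacted terms.</p>
          <preformat><![CDATA[
```python
def resolve_references(document):
    """Replace {"@id": ...}-only nodes with the full node of the same identifier."""
    full_nodes = {}

    def collect(node):
        # Remember the first full (more than just "@id") version of every node
        if isinstance(node, dict):
            node_id = node.get("@id")
            if node_id is not None and len(node) > 1:
                full_nodes.setdefault(node_id, node)
            for value in node.values():
                collect(value)
        elif isinstance(node, list):
            for item in node:
                collect(item)

    def substitute(node, seen):
        if isinstance(node, list):
            return [substitute(item, seen) for item in node]
        if not isinstance(node, dict):
            return node
        node_id = node.get("@id")
        if node_id is not None and len(node) == 1 and node_id in full_nodes:
            node = full_nodes[node_id]
        if node_id is not None:
            if node_id in seen:
                # Cycle on the current path: keep a bare reference
                return {"@id": node_id}
            seen = seen | {node_id}
        return {key: substitute(value, seen) for key, value in node.items()}

    collect(document)
    return substitute(document, frozenset())


# Hypothetical payload: the glossary is serialized in full only once
payload = [
    {"@id": "ex:building", "label": "building",
     "glossary": {"@id": "ex:glossary", "title": "Building Act"}},
    {"@id": "ex:roof", "label": "roof",
     "glossary": {"@id": "ex:glossary"}},
]
resolved = resolve_references(payload)

if __name__ == "__main__":
    print(resolved[1]["glossary"])
```
]]></preformat>
          <p>The payload shows where the size reduction comes from: the glossary is transferred in full once and by identifier afterwards, and the client restores the full objects before the rest of the code uses them.</p>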
          <p>The whole system is published and deployed using the Docker<sup>11</sup> containerization platform. This
ensures the runtime is the same for all instances and nothing (besides Docker itself) needs to be installed
on the host system.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Deployments</title>
      <p>
        TermIt has been used or is planned to be used by several organizations:
• Prague Institute for Planning and Development
• Czech Standardization Agency
• Czech Digital and Information Agency – used in a project of an assembly line of semantic
vocabularies for the Czech legislation [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]
• ČEPS – Czech Transmission System Operator – currently in evaluation phase
• Ministry of Health of the Czech Republic – currently in negotiations
      </p>
      <p><sup>7</sup> Internationalized Resource Identifier</p>
      <p><sup>8</sup> https://spring.io/projects/spring-boot, accessed 2025-07-30</p>
      <p><sup>9</sup> https://spec.openapis.org/oas/latest.html, accessed 2025-07-30</p>
      <p><sup>10</sup> https://github.com/digitalbazaar/jsonld.js, accessed 2025-07-30</p>
      <p><sup>11</sup> https://www.docker.com/, accessed 2025-07-30</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>We have introduced TermIt, a terminology manager based on Semantic Web technologies. We argued
that a clear definition of terminology is important, especially in domains where misunderstanding the
meaning of a word or a phrase can have an impact on a conversation, a dispute, or a legal act.</p>
      <p>Building TermIt with Semantic Web technologies has had a number of benefits. A well-established
model (SKOS) was used to describe the terms, including their relationships, while retaining the possibility
to easily define additional properties. The terms have stable and globally valid identifiers in the form
of IRIs, and the terminologies can be published as machine-readable data. On the other hand, some
drawbacks have also been discovered – mainly issues with performance for larger amounts of data
(glossaries with hundreds or thousands of terms or deep hierarchies).</p>
      <p>TermIt can be used in cooperation with additional tools, such as a validation service, a text analysis
service, and a tool for modeling detailed relationships between terms.</p>
      <p>
        A more detailed description of TermIt use cases as well as a comparison with other relevant tools can
be found in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
    <sec id="sec-7">
      <title>A. Online Resources</title>
      <p>TermIt source code is available on GitHub under the GPL license:</p>
      <p>• TermIt back-end at https://github.com/kbss-cvut/termit,
• TermIt front-end at https://github.com/kbss-cvut/termit-ui,
• TermIt Docker Compose configuration (including installation instructions) at
https://github.com/kbss-cvut/termit-docker.</p>
      <p>A demo instance of TermIt is available at https://kbss.felk.cvut.cz/termit-demo.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Miles</surname>
          </string-name>
          , S. Bechhofer,
          <article-title>SKOS Simple Knowledge Organization System Reference</article-title>
          ,
          <source>W3C Recommendation, W3C</source>
          ,
          <year>2009</year>
          . URL: https://www.w3.org/TR/skos-reference, accessed 2025-07-30.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Guizzardi</surname>
          </string-name>
          ,
          <source>Ontological Foundations for Structural Conceptual Models, Ph.D. thesis</source>
          , University of Twente,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Binder</surname>
          </string-name>
          , P. Křemen,
          <article-title>OntoGrapher: a Web-based Tool for Ontological Conceptual Modeling</article-title>
          ,
          <source>in: Proceedings of the 21st CIAO! Doctoral Consortium, and Enterprise Engineering Working Conference Forum 2021 co-located with 11th Enterprise Engineering Working Conference (EEWC 2021)</source>
          , CEUR-WS,
          <year>2021</year>
          . URL: https://ceur-ws.org/Vol-3115/paper2.pdf, accessed 2025-07-30.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Knublauch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kontokostas</surname>
          </string-name>
          ,
          <article-title>Shapes Constraint Language (SHACL)</article-title>
          ,
          <source>W3C Recommendation, W3C</source>
          ,
          <year>2017</year>
          . URL: https://www.w3.org/TR/2017/REC-shacl-20170720/.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Saeeda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Med</surname>
          </string-name>
          , M. Ledvinka,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blaško</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Křemen</surname>
          </string-name>
          ,
          <article-title>Entity linking and lexico-semantic patterns for ontology learning</article-title>
          ,
          <source>in: The Semantic Web</source>
          , Springer, Cham,
          <year>2020</year>
          , pp.
          <fpage>138</fpage>
          -
          <lpage>153</lpage>
          . doi:10.1007/978-3-030-49461-2_9.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Buschmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Meunier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Rohnert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sommerlad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stal</surname>
          </string-name>
          ,
          <source>Pattern-Oriented Software Architecture Volume 1: A System of Patterns</source>
          , volume
          <volume>1</volume>
          , Wiley, Hoboken, New Jersey,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Straková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Straka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hajič</surname>
          </string-name>
          ,
          <article-title>Open-Source Tools for Morphology, Lemmatization, POS Tagging and Named Entity Recognition</article-title>
          ,
          <source>in: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations</source>
          , Association for Computational Linguistics, Baltimore, Maryland,
          <year>2014</year>
          , pp.
          <fpage>13</fpage>
          -
          <lpage>18</lpage>
          . URL: http://www.aclweb.org/anthology/P/P14/P14-5003.pdf, accessed 2025-07-30.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ledvinka</surname>
          </string-name>
          , P. Křemen,
          <article-title>JOPA: Accessing Ontologies in an Object-oriented Way</article-title>
          ,
          <source>in: Proceedings of the 17th International Conference on Enterprise Information Systems - Volume 1: ICEIS</source>
          , INSTICC, SciTePress,
          <year>2015</year>
          , pp.
          <fpage>212</fpage>
          -
          <lpage>221</lpage>
          . doi:10.5220/0005400302120221.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ledvinka</surname>
          </string-name>
          ,
          <article-title>Java Binding for JSON-LD</article-title>
          ,
          <source>in: Proceedings of the 19th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST</source>
          , INSTICC, SciTePress,
          <year>2023</year>
          , pp.
          <fpage>207</fpage>
          -
          <lpage>214</lpage>
          . doi:10.5220/0012168500003584.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K.</given-names>
            <surname>Klíma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blaško</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Křemen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Nečaský</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ledvinka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Binderová</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Švagr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Kopecký</surname>
          </string-name>
          ,
          <article-title>Assembly line: a tool for collaborative modeling of ontologies in public administration</article-title>
          ,
          <source>in: 2023 ACM/IEEE International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C)</source>
          , IEEE,
          <year>2023</year>
          , pp.
          <fpage>24</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P.</given-names>
            <surname>Křemen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Med</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blaško</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Saeeda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ledvinka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Buzek</surname>
          </string-name>
          ,
          <article-title>TermIt: Managing normative thesauri</article-title>
          ,
          <source>Semantic Web</source>
          <volume>16</volume>
          (
          <year>2025</year>
          ) SW-243547. doi:10.3233/SW-243547.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>