<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Qurator</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Human-Friendly, Machine-Readable: Coreon MKS for Visual Curation of Semantic Content and Linked Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alena Vasilevich</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Wetzel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Coreon GmbH</institution>
          ,
          <addr-line>Rungestrasse 20, 10179 Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>3</volume>
      <fpage>19</fpage>
      <lpage>23</lpage>
      <abstract>
        <p>In this work we introduce the Coreon Multilingual Knowledge System, a visual tool for concept-based data modeling and curation. The Coreon environment is built on the visual paradigm; its goal is to offer a comprehensible and user-friendly solution both for domain experts who are well-versed in concept-modelling dialects and for ad-hoc users who are new to the curation of semantic content. We describe the mechanism of the tool, which is machine-readable and powered by a language-agnostic knowledge graph capable of embedding the non-deterministic phenomena of human language.</p>
      </abstract>
      <kwd-group>
        <kwd>Visual data curation tools</kwd>
        <kwd>semantic knowledge management</kwd>
        <kwd>terminology management</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Coreon MKS: Digital Curation Tool with a Visual Aptitude</title>
      <p>Coreon Multilingual Knowledge System (MKS)<sup>1</sup> is a visual web-based tool that aims to
commoditize the curation and maintenance of structured data [4]. Having witnessed domain experts’
frustration with cumbersome software, we set out to lower instrumental barriers and
improve the working experience of maintainers, who may struggle with obscure data-modeling tools,
standards, format intricacies, and their limitations [5].</p>
      <p>Instead of scrolling through endless flat lists of entries, maintainers can use dynamic,
context-illustrating navigation to scale up and down within the data resource, zooming in on
specific parts or branches that require their attention.</p>
      <p>With the visual paradigm and model adaptability as its core features, Coreon MKS offers the
user the following key functionalities:
• a collaborative environment for easy data curation, accessible to domain experts who
have no previous experience with ontology-editing or knowledge-management software.
In Coreon, users can create, combine, and edit terminologies, ontologies, taxonomies,
controlled vocabularies, and graph-based data collections via familiar drag-and-drop
actions, without extensive training or study of software manuals;
• a customizable browser-based interface, suitable for ad-hoc users and domain experts,
for internal and external contributors. The system reacts to the user’s role: it hides
preconfigured elements and selected properties that are flagged as visible only
to power users, whereas maintainers have all interaction elements at their disposal;
• support for a variety of established formats: SKOS, TBX, and RDF in all relevant syntax
flavours (Turtle, N3, JSON-LD, etc.);
• validation of data against custom-established constraints and rules;
• data transparency: MKS tracks all record changes, tracing the complete evolution of the
data, which is often crucial for institutions and highly regulated industries;
• seamless interaction and smooth integration with third-party systems and services via
a RESTful Web API or the SPARQL<sup>2</sup> protocol, which relies on well-established LLOD
standards;
• adherence to the Linked Data principles, supporting interoperability and re-usability of
language resources [6];
• deployment as a lightweight web-based SaaS that requires no software installation, patches, or
configuration updates;
• a Single Sign-On mechanism, based on the OASIS SAML2 standard, that enables simple
integration into enterprise authentication environments.</p>
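      <p>The role-dependent interface mentioned in the list above can be pictured with a minimal Python sketch. The role names, the <monospace>power_user_only</monospace> flag, and the function <monospace>visible_properties</monospace> are assumptions made for illustration only; they are not taken from Coreon's actual implementation or API.</p>
      <preformat><![CDATA[
```python
# Hypothetical role names; the real system's roles are not specified in the text.
POWER_ROLES = {"maintainer", "admin"}

# Invented example properties of one concept.
properties = [
    {"key": "definition", "value": "A flat-panel display.", "power_user_only": False},
    {"key": "internal note", "value": "Check source.", "power_user_only": True},
]


def visible_properties(props, role):
    """Return the properties a given role may see.

    Power users see everything; all other roles only see properties
    that are not flagged as power-user-only.
    """
    if role in POWER_ROLES:
        return list(props)
    return [p for p in props if not p["power_user_only"]]


guest_view = [p["key"] for p in visible_properties(properties, "guest")]
maintainer_view = [p["key"] for p in visible_properties(properties, "maintainer")]
```
]]></preformat>
      <p>In this sketch the filtering happens in one place, so a newly flagged property is hidden from ad-hoc users without any change to the data itself.</p>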
    </sec>
    <sec id="sec-3">
      <title>3. Data Model</title>
      <p>MKS is a semantic knowledge repository with a language-independent knowledge graph
as its backbone (see Figures 1, 2). Conceptually, the solution brings together the best of Knowledge
Organization Systems (KOS) and Terminology Management Systems (TMS). MKS
leverages the lean structure of SKOS<sup>3</sup> yet does not neglect natural-language phenomena, bringing
along the expressiveness of ISO’s TermBase eXchange (TBX)<sup>4</sup>.</p>
      <p><sup>2</sup> https://www.w3.org/TR/rdf-sparql-query/
<sup>3</sup> https://www.w3.org/2004/02/skos/
<sup>4</sup> https://www.tbxinfo.net/</p>
      <p>Coreon is based on an open data model: the user does not have to fit the rigid types of complex
conceptual models but instead gets maximum flexibility in defining and configuring properties
to describe the world. The model is therefore schema-less and easily adaptable to any metadata
requirements. Data maintainers have different data types at their disposal to best define a
concept and its terms (see Figure 3). In production environments, Coreon repositories use
up to 100 custom properties to fully describe a domain.</p>
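      <p>The open, schema-less model described above can be illustrated with a minimal Python sketch. The class and method names (<monospace>Property</monospace>, <monospace>Concept</monospace>, <monospace>add_property</monospace>) and all example values are hypothetical and do not reflect Coreon's internal data structures; the sketch only shows the idea that property keys and value types are chosen by the maintainer rather than fixed in advance.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Property:
    """A free-form property; key and value type are chosen by the maintainer."""
    key: str
    value: Any
    lang: Optional[str] = None  # optional language tag for textual values


@dataclass
class Concept:
    """Hypothetical concept record without a fixed set of attributes."""
    concept_id: str
    properties: list = field(default_factory=list)

    def add_property(self, key, value, lang=None):
        # No schema check: any key and any value type is accepted.
        self.properties.append(Property(key, value, lang))
        return self

    def get(self, key):
        """Return all values stored under a property key."""
        return [p.value for p in self.properties if p.key == key]


# Invented example data.
concept = Concept("c-001")
concept.add_property("definition", "A display with a flat panel.", lang="en")
concept.add_property("approved", True)
concept.add_property("usage count", 42)
```
]]></preformat>
      <p>Because properties are stored as a list of key-value entries, adding a new kind of metadata in this sketch requires no change to the record definition.</p>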
    </sec>
    <sec id="sec-4">
      <title>4. Catering for Both Knowledge and Language</title>
      <p>In a repository, every element receives a unique identifier: an individual persistent ID that
unambiguously locates it, regardless of whether it is a concept, term, property, or
concept relation. In essence, two kinds of information, i.e. knowledge via the semantically
linked concepts and language via the terms, are modelled and stored separately from each other
while being linked through unique concept identifiers.</p>
      <p>From the knowledge perspective, the linking in MKS is performed not at the term but
at the concept level. This approach makes concept maps completely independent from the
terminological information. By linking entries this way, it becomes possible to model knowledge
for phenomena that reflect the non-deterministic nature of the human language, such as word
sense ambiguity, synonymy, and multilingualism. Each concept can be populated with unlimited
descriptive and documentary information, in dozens of languages. Linking per concept also
ensures smooth maintenance of relations without additional data clutter: relation edges are
independent from labels, terms and their variants, and other metadata.</p>
      <p>From the language perspective, Coreon’s concept model captures the following aspects:
• multi-directionality: terms in all languages that belong to the same concept are all stored
in one record. Instead of maintaining several databases per language, a user can simply
change the desired language with a mouse-click;
• unlimited number of terms: there is no need to maintain links between terms, nor is
there a single synonym field. Terms become synonyms simply by being stored within the
same concept; there is no limit on the number of terms/synonyms for a concept;
• term autonomy: each term can carry a full set of descriptive metadata. This
means that all terms in all languages can be exhaustively described; different terms
can become preferred to support different contexts, e.g., in one situation the full form
European Central Bank, in another its acronym ECB.</p>
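      <p>The three aspects listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Coreon's data model: the classes <monospace>Term</monospace> and <monospace>Concept</monospace>, the metadata key <monospace>preferred_in</monospace>, and the context names "legal" and "press" are all invented for the example.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A term with its own metadata (hypothetical fields)."""
    value: str
    lang: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Concept:
    """Terms in any language live in one record, keyed by a concept ID."""
    concept_id: str
    terms: list = field(default_factory=list)

    def synonyms(self, lang):
        # Terms are synonyms simply because they share the concept;
        # there is no separate synonym field or term-to-term link.
        return [t.value for t in self.terms if t.lang == lang]

    def preferred(self, lang, context):
        """Return the term flagged as preferred for a context, if any."""
        for t in self.terms:
            if t.lang == lang and context in t.metadata.get("preferred_in", []):
                return t.value
        return None


# The contexts "legal" and "press" are assumptions for illustration.
ecb = Concept("c-ecb", [
    Term("European Central Bank", "en", {"preferred_in": ["legal"]}),
    Term("ECB", "en", {"preferred_in": ["press"], "type": "acronym"}),
    Term("Europäische Zentralbank", "de", {"preferred_in": ["legal"]}),
])
```
]]></preformat>
      <p>Switching the language in this sketch is just a different argument to the same record, for example <monospace>ecb.synonyms("de")</monospace>, which mirrors the single-record, multi-directional design described in the list.</p>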
      <p>In contrast with TMSs, the Coreon knowledge graph gives users control over large
amounts of data. Related concepts (e.g. flat screen, LCD screen, and TFT display) are not just
listed under the letters F, L, and T, which are alphabetically remote. Rather, they are semantically
linked as parent-child concepts and rendered in proximity to each other on the concept map.</p>
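      <p>Concept-level linking can be sketched as follows. The concept IDs, the assumed hierarchy (flat screen as the parent of the other two concepts), and the helper <monospace>map_neighbours</monospace> are hypothetical; the sketch only illustrates that relation edges connect concept IDs rather than terms, so that renaming a term leaves the edges untouched.</p>
      <preformat><![CDATA[
```python
# Labels are stored per concept ID; IDs and hierarchy are invented.
labels = {"c1": "flat screen", "c2": "LCD screen", "c3": "TFT display"}

# Hierarchical edges (child ID -> parent ID) never mention a label.
broader = {"c2": "c1", "c3": "c1"}


def map_neighbours(concept_id):
    """Labels of the concepts adjacent to one concept: its parent and children."""
    parent = broader.get(concept_id)
    children = sorted(c for c, p in broader.items() if p == concept_id)
    ids = ([parent] if parent else []) + children
    return [labels[i] for i in ids]


neighbours_before = map_neighbours("c1")

# Renaming a term changes only the label table, not the relation edges.
labels["c3"] = "TFT screen"
neighbours_after = map_neighbours("c1")
```
]]></preformat>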
    </sec>
    <sec id="sec-5">
      <title>5. UI Components</title>
      <p>A read-only demo MKS containing the European Union’s multilingual thesaurus (EUROVOC)<sup>5</sup> can
be accessed and explored at https://www.coreon.com/dashboard/eurovoc.</p>
      <p>Coreon users interact with the environment through four main UI components:</p>
      <p>• Concept View (Figure 3): displays all the information of one selected concept, in all
languages, together with all metadata properties;
• Concept Map (Figure 4): displays the inter-concept relations in a tree-like hierarchy for
easy and structured navigation and supports multiple inheritance;
• Term List: alphabetical, lexical access to all terms;
• Search field: directly searches and locates records in the concept map.</p>
      <p>The Coreon UI renders not only the hierarchical relations (explicitly shown in the concept-map
mode in Figure 4) but also associated relations. To avoid the visualization challenges that
graph complexity can trigger, associated relations are clustered per relation type and are currently
displayed for one concept at a time.</p>
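      <p>Clustering associated relations per relation type for a single concept can be sketched with a simple grouping step. The relation types, concept IDs, and the function <monospace>cluster_relations</monospace> are invented for illustration, and the sketch considers only outgoing edges; it is not a description of how the Coreon UI is implemented.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict

# (source concept, relation type, target concept); all values are invented.
relations = [
    ("c-lcd", "uses", "c-backlight"),
    ("c-lcd", "uses", "c-liquid-crystal"),
    ("c-lcd", "produced by", "c-manufacturer"),
    ("c-tft", "uses", "c-transistor"),
]


def cluster_relations(concept_id, edges):
    """Group the non-hierarchical relations of one concept by relation type."""
    clusters = defaultdict(list)
    for source, rel_type, target in edges:
        if source == concept_id:  # only the selected concept is shown
            clusters[rel_type].append(target)
    return dict(clusters)


lcd_clusters = cluster_relations("c-lcd", relations)
```
]]></preformat>
      <p>Restricting the grouping to one selected concept keeps the number of rendered edges small, which matches the motivation given above for showing associated relations one concept at a time.</p>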
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>We believe that Coreon MKS removes instrumental barriers to conceptual data modeling,
becoming a helping hand for domain experts, data maintainers, and other more or less
tech-savvy stakeholders. MKS ensures that its users can focus on their domain-modeling tasks
without the additional burden of format intricacies or the need for extensive help from IT specialists.</p>
      <p>Aside from being a tool for visual data maintenance, MKS ensures smooth cooperation
between domain experts and other professionals interested in the final outcome of the data
modeling activities. Being machine-readable, Coreon MKS promotes resource re-usability
and integration into solutions that often benefit from injections of structured data (e.g.
development of virtual assistants, optimization of business processes, fine-tuning of machine
translation and machine-learning models).</p>
      <p>using scatterplot matrix navigation, IEEE Transactions on Visualization and Computer
Graphics 14 (2008) 1141–1148.
[3] N. Tang, E. Wu, G. Li, Towards democratizing relational data visualization, in: Proceedings
of the 2019 International Conference on Management of Data, 2019, pp. 2025–2030.
[4] M. Wetzel, Multilinguale taxonomien mit coreon. wissens- und sprachmanagement in einer
lösung, Rechte, Rendite, Ressourcen. Wirtschaftliche Aspekte des
Terminologiemanagements 14 (2014) 41–51.
[5] W. Ziegler, Metadaten für intelligenten content, Intelligente Information: Schriften zur
Technischen Kommunikation 22 (2017) 51–66.
[6] A. Vasilevich, M. Wetzel, Multilingual knowledge systems as linguistic linked open data for
european language grid, in: S. Carvalho, R. R. Souza (Eds.), Proceedings of the Workshops
and Tutorials held at LDK 2021 co-located with the 3rd Language, Data and Knowledge
Conference (LDK 2021), CEUR Workshop Proceedings, CEUR-WS.org, Zaragoza, Spain,
2021, pp. 126–134. URL: http://ceur-ws.org/Vol-3064/02_Alena_Vasilevich.pdf.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Keim</surname>
          </string-name>
          ,
          <article-title>Visual exploration of large data sets</article-title>
          ,
          <source>Communications of the ACM</source>
          <volume>44</volume>
          (
          <year>2001</year>
          )
          <fpage>38</fpage>
          -
          <lpage>44</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N.</given-names>
            <surname>Elmqvist</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dragicevic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-D.</given-names>
            <surname>Fekete</surname>
          </string-name>
          ,
          <article-title>Rolling the dice: Multidimensional visual exploration</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>