<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Evolutional Based Data-Driven Quality Model for Ontologies</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff1">
          <label>1</label>
          <institution>Rostock University</institution>
          ,
          <addr-line>18051 Rostock</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <fpage>0000</fpage>
      <lpage>0003</lpage>
      <abstract>
<p>The use of ontologies continues unabated, with new usage scenarios emerging through the rise of artificial intelligence. This increases the importance of, and the need for, automatic, reliable evaluation techniques. Even though numerous ontology metrics have been proposed in the past years, the interpretation of these metrics remains arbitrary. The specific influence of these simple metrics on quality attributes has not been validated in a scientifically sound approach. The goal of this doctorate is to establish and validate a link between comprehensive quality attributes like “understandability” or “completeness” and the metrics proposed in the literature. Using a data-centric research design, the objective is the identification of quality grades and improvement recommendations through the application of a novel, data-driven quality framework. This has the potential to support especially inexperienced ontology engineers in assessing their work and in creating better ontologies. The novelty of this research lies in the data-centricity of its design. By collecting large amounts of evolutional ontology metric data, statistically relevant correlations between these metrics over time are to be identified. This enables the validation of already proposed quality attributes and the identification of new ones.</p>
      </abstract>
      <kwd-group>
        <kwd>Ontology evaluation</kwd>
        <kwd>ontology quality</kwd>
        <kwd>ontology metrics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction / Problem Statement</title>
      <p>The influence of individual metrics on ontology quality is not researched systematically. For example, how, and in what kind of composition, do the metrics proposed by Tartir et al. in the OntoQA framework [5] influence the understandability of an ontology? How do Gangemi et al.’s graph metrics [6] influence the reusability of an ontology? The impact of specific metrics on particular quality attributes is often not described and, if it is, not validated in an empirically sound approach. Even though the missing validation of the proposed metrics is criticized in many papers [7–9], it has not yet received much attention from the research community. Further, most of the metrics remain isolated due to their heterogeneous structure, degree of formality, and different objectives [10].</p>
      <p>These shortcomings limit the explanatory power and comprehensibility of ontology metrics. Especially inexperienced modelers face challenges in selecting the right metrics for the right goals. Even though ontology metrics are calculated objectively, their interpretation remains subjective [11].</p>
      <p>Validated measurements of ontology quality attributes can help modelers to develop high-quality ontologies based on their intended usage scenario. The envisioned purpose of this Ph.D. project is a translation of the abstract metrics into measurements for high-level quality dimensions like, among others, “completeness”, “clarity”, or “adaptability”. As a side effect, based on the calculated quality score, improvement recommendations can be derived, highlighting the artifacts that are the most influential factors for each quality dimension. In effect, it is expected that this can not only lead to better ontologies but, in the long term, also to a better-trained modeling workforce.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>Related work for this research endeavor originates from the field of ontology
evaluation. Over the past years, various evaluation methods have been proposed. The
following section discusses the most influential and relevant publications and motivates this
doctoral research through the shortcomings of the earlier approaches.</p>
      <p>Often referenced is the categorization by Burton-Jones et al. into syntactic, semantic, pragmatic, and social quality, with a total of 10 associated metrics [7]. This paper already assigns metrics to quality constructs like lawfulness or clarity. Nevertheless, the framework depends heavily on user input: one has to provide weights for the metrics and their aggregation. The overall 14 user-assigned weights make the application of this quality framework arbitrary, and the practical implications of the framework regarding attributes like ontology reuse, understandability, or computational efficiency are not examined.</p>
      <p>OntoQA by Tartir et al. provides another set of metrics, categorized into schema, class, and knowledge-base metrics. Even though an interpretation is provided for most of the results, a holistic view of the ontology is missing and the metrics remain isolated [5].</p>
      <p>Duque-Ramos et al. adapted the software quality framework SQuaRE towards ontologies, naming it OQuaRE. Here, ontologies are measured using 14 metrics. For these measurements, threshold values are provided for grading on a numerical scale from 1 to 5. Further, these metrics are mapped to quality characteristics like testability or modularity [12]. Using an expert evaluation, the relation between the quality attributes and the calculated metrics was empirically validated [13]. The maturity of this framework regarding the validation of metrics and their linkage to quality attributes exceeds the other approaches. However, even though the relationship between metrics and quality dimensions was established, it merely associates the calculated metrics with quality attributes, without proposing a composition.</p>
      <p>A recent review on metrics by Lourdusamy and John came to the same conclusion and criticized the lack of a holistic view of ontologies as well as the lack of empirical validation [14].</p>
      <p>Most of the current approaches merely hypothesize the combination of the isolated metrics into intuitive, comprehensive quality aggregations. Even if a validation is performed, it is often done in a rather narrow quantitative study with limited significance. In conclusion, metrics for ontologies have already been researched extensively; the room for novelties in that area is limited. To ensure a useful application, though, these proposed metrics need to be validated and set into a useful context. This context can be provided by the development of a holistic ontology quality framework as proposed in this Ph.D. proposal.</p>
    </sec>
    <sec id="sec-3">
      <title>Research Questions</title>
      <p>In the previous sections, current challenges were outlined regarding the support of (especially inexperienced) knowledge engineers through empirically validated quality measurements and improvement recommendations. The following research questions are derived from the identified shortcomings of the currently available approaches:</p>
      <sec id="sec-3-1">
        <title>RQ 1: How do Ontology Metrics Develop over the Evolution of Ontologies?</title>
        <p>Ontology metrics provide just a snapshot of an artifact that is itself dynamic. By taking historical data into account, the evolution of the ontology can be made visible. As Malone et al. stated, ontologies develop in different stages that all come with specific characteristics regarding the performed changes [15]. It is expected that these stages can be identified using the historical development of the calculated metrics. The maturity level of an ontology then indicates the kind of assessment and evaluation that is needed to rate and improve the model.</p>
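        <p>To illustrate the idea of stage detection, the following sketch labels the development stages of an ontology from the relative change of a single metric across its history. This is a hypothetical, strongly simplified approach; the metric name, the values, and the 5% threshold are illustrative assumptions, not results of this research.</p>

```python
from dataclasses import dataclass

# Hypothetical snapshot of one metric taken at each commit of an
# ontology's history (names and values are illustrative only).
@dataclass
class Snapshot:
    commit: str
    class_count: int

def growth_rates(history):
    """Relative change of the metric between consecutive snapshots."""
    return [
        (curr.class_count - prev.class_count) / prev.class_count
        for prev, curr in zip(history, history[1:])
    ]

def label_stages(rates, active_threshold=0.05):
    """Crude stage labels: 'active' while the metric still grows
    noticeably, 'mature' once the changes level off."""
    return ["active" if r >= active_threshold else "mature" for r in rates]

history = [Snapshot("c1", 100), Snapshot("c2", 150),
           Snapshot("c3", 155), Snapshot("c4", 156)]
rates = growth_rates(history)
stages = label_stages(rates)
```

<p>In this toy history, the first revision grows strongly ("active") while the later ones only change marginally ("mature"), which is the kind of pattern the stage identification would look for.</p>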
      </sec>
      <sec id="sec-3-3">
        <title>RQ 2: How do Ontology Metrics Correlate with Each Other, and how do these Correlations form Comprehensive Quality Dimensions?</title>
        <p>The next research question is concerned with the correlations between metrics. The combination of multiple isolated metrics into comprehensive quality attributes is expected to improve the understandability of the evaluation. A validation based on the statistical analysis of a large metric repository ensures their significance. Further, the reduction in the number of displayed measurements can improve the comparability between ontologies.</p>
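        <p>A minimal sketch of the intended correlation analysis follows; the metric names and values are purely illustrative assumptions. Pairwise Pearson correlations over metric columns reveal which metrics move together and are therefore candidates for a joint quality dimension.</p>

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical metric columns, one value per analyzed ontology.
metrics = {
    "class_count":      [10, 20, 30, 40],
    "subclass_ratio":   [0.9, 1.8, 3.1, 3.9],   # moves with class_count
    "annotation_ratio": [0.8, 0.3, 0.6, 0.1],   # unrelated
}

# All pairwise correlations between the metric columns.
pairs = {}
names = list(metrics)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        pairs[(a, b)] = pearson(metrics[a], metrics[b])
```

<p>Strongly correlated pairs (here, class_count and subclass_ratio) would be examined as members of one aggregated quality attribute, while uncorrelated metrics remain separate.</p>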
      </sec>
      <sec id="sec-3-4">
        <title>RQ 3: Under what Conditions is the Quality of an Ontology Sufficient for a Given Use-Case?</title>
        <p>To be able to infer a quality indication from measurements, an interpretation of these values is obligatory. Without guidance, their interpretation remains arbitrary, especially for inexperienced ontology engineers. Based on the analysis of large ontology repositories, their use cases, and their respective metric values, the goal is to identify quality scores that are common for frequently used knowledge representations. These common values can then serve as threshold recommendations for future ontology developments.</p>
      </sec>
      <sec id="sec-3-6">
        <title>RQ 4: Which Improvement Recommendations can be Derived from the Metric Calculations?</title>
        <p>Awareness of the quality of an artifact is indispensable for its improvement. However, especially for inexperienced knowledge engineers, a purely numerical assessment is not sufficient to ensure the creation of better ontologies. It is argued that the inexperienced workforce needs more support in the form of recommendations. Based on the quality attribute calculations that are the output of RQ 2 and their suggested scores that are the output of RQ 3, the goal is to derive modeling recommendations from the assessed characteristics. These scores enable modelers to fix the weaknesses in their ontology design that have the most significant impact on the respective quality score.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Research Plan and Preliminary Results</title>
      <p>This section highlights the current methodological plan, as well as the steps that have been achieved so far. Especially the former is subject to change in the future as the maturity of this research grows.</p>
      <sec id="sec-4-1">
        <title>Development of the Technical Prerequisites</title>
        <p>As stated in section 2, most of the current approaches are based on argumentative quality aggregations and limited statistical validation. The approach proposed for this doctorate follows a more data-centric paradigm. Based on the extensive analysis of sizeable evolutional data sets, metric correlations from the literature shall either be validated or new ones detected. The data will be collected through the tool “OntoMetrics” of Rostock University. First proposed in 2016 by Lantow [16], it offers 81 metrics, mostly based on the work of [17] and [5]. While it currently supports only the analysis of single ontology files, in the future the tool will be extended towards support for git repositories and an easy-to-use GitLab and GitHub (gitlab.com, github.com) interface. All metrics of the analyzed data sets will be stored for analysis. It is expected that the further development of the OntoMetrics tool increases its usage and enables a growing database of analyzed ontologies. Recently, a large company approached Rostock University for collaboration in assessing their ontologies. The collaboration with this industry partner can also lead to a growing database of collected ontology metrics.</p>
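        <p>In a strongly simplified form, the planned repository analysis could look like the following sketch. The naive regex-based class count merely stands in for the 81 metrics the OntoMetrics tool computes; the function names and data are illustrative assumptions, not the tool's actual interface.</p>

```python
import re

def class_count(owl_text):
    """Deliberately naive metric: count owl:Class declarations
    in the raw OWL/RDF text (a real tool would parse the ontology)."""
    return len(re.findall(r"<owl:Class\b", owl_text))

def analyze_history(revisions):
    """revisions: list of (commit_id, file_content) pairs, oldest first.
    Reduces each revision of an ontology file to one metric record."""
    return [(cid, class_count(text)) for cid, text in revisions]

history = analyze_history([
    ("a1", '<owl:Class rdf:about="#A"/>'),
    ("b2", '<owl:Class rdf:about="#A"/><owl:Class rdf:about="#B"/>'),
])
```

<p>The records produced per revision would then be stored in the metric database for the statistical analyses described below.</p>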
      </sec>
      <sec id="sec-4-2">
        <title>Methodical Blueprint for Answering the Research Questions</title>
        <p>As soon as enough data is collected to infer valid correlations, the answering of the research questions can begin. The planned methodological process is shown in figure 1 below. As the research is still at an early stage, the outlined process may be subject to change in the future. Research question 1 is first concerned with the description of the gathered data. An initial overview of how the various ontologies differ can provide an insight into the variance of metrics as well as their historical development. It guides the derivation of the metric aggregations for the next research question, RQ 2. These aggregations shall include compositions of metrics into comprehensible quality dimensions like “understandability”, “reusability”, “learnability”, and more. The creation of the metric compositions can be based on pure data analysis using statistical data-mining approaches, on metric aggregations already proposed in the literature, or on a combination of both.</p>
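        <p>One possible form of such a composition is sketched below: metrics found to correlate are combined into one named quality dimension by averaging their z-scores. The member metrics, their values, and the averaging scheme are illustrative assumptions, not validated results of this research.</p>

```python
import math

def zscores(values):
    """Standardize a metric column to mean 0 and unit variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

# Hypothetical member metrics of one dimension,
# one value per analyzed ontology.
metrics = {
    "annotation_ratio":   [0.2, 0.5, 0.8, 0.9],
    "naming_consistency": [0.3, 0.4, 0.9, 1.0],
}

# "Understandability" as the mean z-score of its member metrics.
cols = [zscores(v) for v in metrics.values()]
understandability = [sum(vals) / len(vals) for vals in zip(*cols)]
```

<p>Standardizing before averaging keeps metrics with different ranges comparable; more elaborate weighting schemes would be derived from the data analysis itself.</p>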
        <p>The distinctive feature between this and previous research is the intense focus on the data-centric validation of the measurements. Across the various analyzed ontologies, it is expected that identified commonalities hold not only for a limited set of ontologies but for a statistically relevant proportion of the ontologies that are available for analysis.</p>
        <p>Once the measurements are identified and validated, the next step, for RQ 3, is to find common, recommended threshold values for these metrics. Most likely, a prior classification of the ontology is necessary – an upper-level ontology will look different than a task-dependent domain ontology. With the metrics repository, it is expected to derive threshold values for evaluation scores for the quality framework. An example of such a threshold could be that most mature, often-used domain ontologies have an “understandability” of at least 4.3. This enables ontology engineers to directly compare their work against other ontologies that they know or that are used in their organization, while at the same time having an idea when a desirable quality level is reached.</p>
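        <p>Deriving such a threshold could, as a hedged sketch, be as simple as taking a lower percentile of the scores observed in mature, widely used ontologies of the same category. The percentile choice and all score values below are illustrative assumptions.</p>

```python
def percentile(values, q):
    """Nearest-rank percentile, q in [0, 100]."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(q / 100 * (len(ordered) - 1))))
    return ordered[k]

# Hypothetical "understandability" scores of mature domain ontologies.
understandability_scores = [4.1, 4.3, 4.4, 4.6, 4.8, 3.9, 4.5, 4.2]

# Recommend the 25th percentile: most mature ontologies score above it.
threshold = percentile(understandability_scores, 25)
```

<p>A new ontology scoring below this threshold would be flagged as falling short of what comparable, established ontologies achieve.</p>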
        <p>The metric compositions that are the output of RQ 2 are later also the input for RQ 4. As ontologies get analyzed using these measurements, there are metrics within these compositions that may have a significant impact on the calculated score. They can therefore be identified as the most essential items to improve.</p>
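        <p>A minimal sketch of this identification step follows: the recommended metric is the one whose shortfall, weighted by its influence on the composed score, is largest. The weights, targets, and measured values are illustrative assumptions; the real compositions are the subject of RQ 2.</p>

```python
# Hypothetical weighted composition of metrics into one quality score.
weights  = {"depth": 0.5, "annotation_ratio": 0.3, "fan_out": 0.2}
targets  = {"depth": 1.0, "annotation_ratio": 1.0, "fan_out": 1.0}
measured = {"depth": 0.9, "annotation_ratio": 0.4, "fan_out": 0.7}

def shortfall_impact(metric):
    """Score contribution lost relative to the target value."""
    return weights[metric] * (targets[metric] - measured[metric])

# The improvement recommendation: fix the metric whose weighted
# shortfall depresses the composed score the most.
recommendation = max(weights, key=shortfall_impact)
```

<p>Here the low annotation ratio costs more of the composed score than the slightly sub-target depth, so it would be surfaced as the first thing to improve.</p>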
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Evaluation of the Research</title>
      <p>The created metric compositions for the measurement of quality attributes of RQ 2 and the created threshold values of RQ 3 are based on statistical analysis. Using further collected test data, the mathematical validity of the metric aggregations can be demonstrated. But a mere mathematical construct does not guarantee usefulness to the consumer of such metrics. To perform this kind of user-centric evaluation and thereby ensure that the research questions are answered to a sufficient degree, a last evaluation stage is required. The form of this confirmatory study is not yet decided. There are, however, various ideas for confirmatory studies.</p>
      <p>An often-used approach for validating usefulness to the user is a quantitative survey. Here, it is possible to ask about the perceived level of experience, as well as the perceived usefulness of the various quality measurements.</p>
      <p>In the beginning, we set up the hypothesis that especially inexperienced users need automated guidance from an empirically grounded evaluation tool. To connect the perceived impact of the tool with reality, the survey could be aligned with metrics and metadata from the git repositories. For some metrics, the metadata can give a clear indication of validity: reusability, for example, should be visible in downloads or forks. If the questionnaire also captured the user name for a given repository service, then a direct link between the perceived quality of individual users and their performance can be established. Higher perceived productivity should affect the number and size of commits. An improved perceived personal performance should be observable through more elaborate modeling techniques, thus resulting in an improved metric rating for their contributions. Using this approach, the actual effect of the proposed metrics on the performance of the modeler can be observed.</p>
    </sec>
    <sec id="sec-6">
      <title>Discussion and Further Outlook</title>
      <p>At first glimpse, the field of ontology evaluation seems to be mature. The technology has existed since the early 1990s, and much work has been published. However, as outlined in section 2, today’s ontology evaluation approaches lack a sound empirical validation and the translation into human-understandable, reliable metrics. This shortcoming is addressed by this doctorate. This research can help especially inexperienced knowledge engineers to create better ontologies by providing comprehensible, reliable feedback, including improvement recommendations.</p>
      <p>This research is focused on the structural attributes of ontologies. Measurement methodologies that require additional input, like gold-standard, data-driven, or task/application-based evaluation [18], are not in the scope of this research. It is therefore not assessed whether an ontology that might be insufficient from a structural perspective excels in a specific usage scenario. It is likewise not planned to include user ratings or domain terminology coverage in the quality measurements. Nonetheless, further research might consider combining different methodologies with the results of this scientific work.</p>
      <p>The collection of data is the main requirement for the planned research activities. As stated in section 4, the tool “OntoMetrics” will be the foundation for further data collection activities and is being developed towards the integration of historical data and repository services. As soon as these services are integrated, publicly available ontologies like the gene ontology (github.com/geneontology) can be analyzed for a first outlook on RQ 1, while the tool builds up a more extensive user base and therefore collects more data.</p>
      <p>The acceptance of the tool OntoMetrics and the associated data gathering is an essential activity for the success of this research. The size and quality of the database will strongly affect the quality and validity of the research, as well as the next steps. Recently, the respective tool has gained some interest not only from researchers but also from industry partners, and further collaboration is planned. This supports our hypothesis of a need for better evaluation of the growing number of ontologies and further motivates the optimism for the success of this data-driven ontology evaluation research.</p>
      <p>Acknowledgment. I would like to give special thanks to Prof. Kurt Sandkuhl for his mentoring and support.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>Alm, R., Kiehl, S., Lantow, B., Sandkuhl, K.: Applicability of Quality Metrics for Ontologies on Ontology Design Patterns. In: Filipe, J. (ed.) Proceedings of the 5th International Conference on Knowledge Engineering and Ontology Development (IC3K), Vilamoura, Algarve, Portugal, 9/19/2013-9/22/2013, pp. 48-57. SciTePress (2013). https://doi.org/10.5220/0004541400480057</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>Ashraf, J., Hussain, O.K., Hussain, F.K.: A Framework for Measuring Ontology Usage on the Web. The Computer Journal (2013). https://doi.org/10.1093/comjnl/bxs134</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>Sicilia, M.A., Rodríguez, D., García-Barriocanal, E., Sánchez-Alonso, S.: Empirical findings on ontology metrics. Expert Systems with Applications (2012). https://doi.org/10.1016/j.eswa.2011.11.094</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>Hammar, K.: Content Ontology Design Patterns: Qualities, Methods, and Tools. Ph.D. thesis, Linköping University (2017)</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>Duque-Ramos, A., Fernández-Breis, J.T., Stevens, R., Aussenac-Gilles, N.: OQuaRE: A SQuaRE-based approach for evaluating the quality of ontologies. Journal of Research and Practice in Information Technology 43, 159-176 (2011)</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>Duque-Ramos, A., Fernández-Breis, J.T., Iniesta, M., Dumontier, M., Egaña Aranguren, M., Schulz, S., Aussenac-Gilles, N., Stevens, R.: Evaluation of the OQuaRE framework for ontology quality. Expert Systems with Applications (2013). https://doi.org/10.1016/j.eswa.2012.11.004</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>Lourdusamy, R., John, A.: A review on metrics for ontology evaluation. In: Proceedings of the 2nd International Conference on Inventive Systems and Control (ICISC), Coimbatore, 01/19/2018-01/20/2018, pp. 1415-1421. IEEE (2018). https://doi.org/10.1109/ICISC.2018.8399041</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>Malone, J., Stevens, R.: Measuring the level of activity in community built bio-ontologies. Journal of Biomedical Informatics (2013). https://doi.org/10.1016/j.jbi.2012.04.002</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>Lantow, B.: OntoMetrics: Application of on-line ontology metric calculation. In: Johansson, B., Vencovský, F. (eds.) Joint Proceedings of the BIR Workshops and Doctoral Consortium, Prague, Czech Republic, 09/14/2016-09/16/2016 (2016)</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>Gangemi, A., Catenacci, C., Ciaramita, M., Lehmann, J., Gil, R., Bolici, F., Strignano, O.: Ontology evaluation and validation. An integrated formal model for the quality diagnostic task (2005)</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>Brank, J., Grobelnik, M., Mladenic, D.: A Survey of Ontology Evaluation Techniques. In: Proceedings of the Conference on Data Mining and Data Warehouses (SiKDD 2005), Ljubljana, Slovenia, pp. 166-170 (2005)</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>