<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>(Linked) Data Quality Assessment: An Ontological Approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aparna Nayak</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bojan Bozic</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luca Longo</string-name>
          <email>luca.longo@tudublin.ie</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>SFI Centre for Research Training in Machine Learning, School of Computer Science, Technological University Dublin</institution>
          ,
          <addr-line>Dublin</addr-line>
          ,
          <country>Republic of Ireland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The effective functioning of data-intensive applications usually requires that the dataset be of high quality. The required quality depends on the task the data will be used for. However, it is possible to identify task-independent data quality dimensions that relate solely to the data themselves and can be extracted with the help of rule mining/pattern mining. In order to assess and improve data quality, we propose an ontological approach to report data quality violated triples. Our goal is to provide data stakeholders with a set of methods and techniques to guide them in assessing and improving data quality.</p>
      </abstract>
      <kwd-group>
        <kwd>Data quality assessment</kwd>
        <kwd>Data quality improvement</kwd>
        <kwd>Linked data</kwd>
        <kwd>Root cause analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        Data quality can be perceived as 'fitness for use' for a given application or
use case. It is often determined by assessing whether the data meet the user's
requirements. Assessing data quality usually requires computing a large number of quality
metrics rather than a single metric for a particular application.
A broad range of data quality dimensions, categories of such dimensions, and
metrics for measuring these dimensions are defined in [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. High-quality
data leads to better decision-making across the application, whereas poor-quality
data can be refined using multiple available techniques [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>The linked data principles promote publishing data and interlinking them in
a machine-readable format using Semantic Web standards. Knowledge graphs
are seen as one of the essential components in realising the Semantic Web
vision. A knowledge graph is a graph-based knowledge base that can be represented
as an RDF graph. Nodes in these graphs represent entities or literals, while edges
represent relations either between entities or between entities and literals. An
RDF graph consists of a set of RDF triples, where each triple (s, p, o) is an ordered tuple of
the following RDF terms: a subject s ∈ U ∪ B, a predicate p ∈ U, and an object
o ∈ U ∪ B ∪ L. An RDF term is either a Uniform Resource Identifier (URI, U), a
blank node (B) or a literal (L). Nodes are usually associated with a type, which is a
class in the case of an entity or a datatype in the case of a literal.</p>
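      <p>The term sets U, B and L and the constraint on each triple position can be illustrated with a minimal Python sketch. The class names (URI, BlankNode, Literal), the function is_valid_triple and the example.org identifiers are assumptions introduced for illustration only; they are not part of the proposed framework.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: URI


Term = Union[URI, BlankNode, Literal]


def is_valid_triple(s: Term, p: Term, o: Term) -> bool:
    """Check s in (U or B), p in U, and o in (U or B or L)."""
    return (
        isinstance(s, (URI, BlankNode))
        and isinstance(p, URI)
        and isinstance(o, (URI, BlankNode, Literal))
    )


# Hypothetical example terms.
XSD_INTEGER = URI("http://www.w3.org/2001/XMLSchema#integer")
alice = URI("http://example.org/alice")
age = URI("http://example.org/age")

ok = is_valid_triple(alice, age, Literal("42", XSD_INTEGER))
bad = is_valid_triple(Literal("42", XSD_INTEGER), age, alice)  # literal as subject
```
]]></preformat>
      <p>In this sketch, ok is true because the literal appears only in the object position, while bad is false because a literal cannot be a subject.</p>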
      <p>The proposed model is intended to provide a unified approach for publishing
data quality information along with a quality-improved dataset, while exposing the root
causes of data quality violated triples. The general workflow for
assessing quality and determining the root causes of violations is as follows: 1) identify a
dataset, 2) uplift the data if they are not in RDF, 3) choose quality dimensions,
4) detect root causes of data quality violated triples, and 5) apply data quality
improvement techniques. Without the use of external knowledge bases, the model
provides stakeholders with a set of techniques for identifying quality problems
and automatic suggestions for improving the overall quality of the dataset.</p>
      <p>In summary, the proposed method defines
strategies to assess data quality. The remainder of this article is structured as
follows. Section 2 describes the applicability of the proposed method. Existing
methods and ontologies to assess data quality are discussed in Section 3. Section 4
presents the research questions that the proposed research aims to answer. Section
5 addresses the research questions in detail. Section 6 discusses the preliminary
proposed ontology along with the gaps in existing frameworks. Section 7 outlines the
evaluation plan. Finally, Section 8 reflects on the overall strengths of the framework.</p>
    </sec>
    <sec id="sec-2">
      <title>Relevancy</title>
      <p>This research proposal aims to deliver an end-to-end system that uplifts non-RDF data
to RDF, assesses its quality, and identifies the root causes of quality violated triples
in order to improve quality. The effort and time required to preprocess the data will be
reduced if methods to identify the root causes of data quality violated triples are
provided. The proposed system is relevant for all data publishers, contributors,
and consumers, as the assessment of data quality will also locate the quality
violated triples. Moreover, the proposed approach aims to publish the data in
RDF, which is a machine-understandable format. Users can decide whether or
not to fix a problem in the dataset by looking at the suggested quality
improvements and the exact cause of the problem.</p>
    </sec>
    <sec id="sec-3">
      <title>Related work</title>
      <p>
        Data quality is described as a multidimensional concept [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Assessment of
data quality involves inspecting multiple dimensions. Relevant dimensions for
linked data quality have been elucidated exhaustively in literature [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ],
[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Data quality assessment is followed by generating the quality report in a
standard format. To manage data quality assessment reports effectively, several
data quality ontologies such as Data Quality Management (DQM) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], Reasoning
Violation Ontology (RVO) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], Data Quality Ontology (daQ) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], Data Quality
Vocabulary (DQV) 1 [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] have been proposed. Except for RVO, all of these ontologies
are designed to represent data quality assessment reports. RVO is
a dedicated reasoning-error ontology that helps to process errors.
      </p>
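      <p>As a rough illustration of what a machine-readable assessment report entry can look like, the following Python sketch serialises a single measurement as Turtle using terms from the W3C Data Quality Vocabulary (dqv:QualityMeasurement, dqv:computedOn, dqv:isMeasurementOf, dqv:value). The function name measurement_turtle, the example.org IRIs and the value 0.93 are assumptions made for this example and do not come from the proposed ontology.</p>
      <preformat preformat-type="code"><![CDATA[
```python
DQV = "http://www.w3.org/ns/dqv#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def measurement_turtle(measurement: str, dataset: str, metric: str, value: float) -> str:
    """Serialise one quality measurement as a Turtle snippet using DQV terms."""
    return (
        f"@prefix dqv: <{DQV}> .\n"
        f"@prefix xsd: <{XSD}> .\n"
        "\n"
        f"<{measurement}> a dqv:QualityMeasurement ;\n"
        f"    dqv:computedOn <{dataset}> ;\n"
        f"    dqv:isMeasurementOf <{metric}> ;\n"
        f'    dqv:value "{value}"^^xsd:double .\n'
    )


# Hypothetical IRIs and score, for illustration only.
report = measurement_turtle(
    "http://example.org/measurement/1",
    "http://example.org/dataset",
    "http://example.org/metric/datatypeConsistency",
    0.93,
)
print(report)
```
]]></preformat>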
      <sec id="sec-3-1">
        <title>1 https://www.w3.org/TR/vocab-dqv/</title>
        <p>
          There exist several frameworks such as Luzzu [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], SemQuire [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], LD Sniffer
[
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], TripleCheckMate [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], RDFUnit [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] and SWIQA [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], which aim to assess
linked datasets. These frameworks differ in terms of scalability, generation of a
quality report to publish results, the total number of metrics assessed, and the use of an
external knowledge base. None of the frameworks discussed here adopts
data quality improvement techniques or identifies the root causes of quality
violated triples. LiQuate [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], Sieve [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] and RDF improvements [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] are some
frameworks that improve data quality after assessing a couple of metrics.
        </p>
        <p>
          The root cause analysis technique aims at identifying triples that have violated
data quality. In the literature, there exist multiple data validation techniques
such as SHACL 2 and rule-based reasoning [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. These validation techniques
help to validate the shape of RDF data rather than to assess and improve its quality.
An extension of Luzzu [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] identifies data quality violated triples based on data
quality metric evaluation, which helps stakeholders to prioritize and fix the errors.
However, all the data quality assessment or validation approaches require either
an external knowledge base or an ontology or both. In our research, we found 14
quality dimensions in the literature that are both quantifiable and yield quality
violated triples. We focus on the intrinsic dimensions, as these are explicitly relevant
to assessing quality at the A-Box level.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Research questions</title>
      <p>The research questions (RQs) deal with uplifting structured data and with the assessment and
enhancement steps, considering the quality dimensions. The RQs related to
the proposed method can be summarised as follows.</p>
      <p>RQ1: To what extent can comma-separated values be uplifted to a linked data
format, dependent on or independent of the domain vocabulary? Multiple mapping
languages exist for uplifting a dataset to RDF. When the domain vocabulary is
at hand, we aim to reuse the most promising mapping languages and define our own
ontology to fill the gaps in missing vocabulary. Otherwise, the
input data can be used to learn an ontology. Ontology learning also helps to assess the
consistency of the dataset.</p>
      <p>RQ2: To what extent can the quality of linked data be assessed to identify the
root causes of data quality violated triples without an external knowledge base? To
identify the root causes of data quality violated triples, we have to
investigate the quality dimensions, focusing in particular on the intrinsic dimensions,
which can be defined independently of external data sources.</p>
      <p>RQ3: To what extent can the performance of the proposed model be significantly
improved by identifying domain-dependent characteristics? The proposed
model is compared with baseline methods on diverse datasets to analyse
scalability, domain-dependent/independent behaviour and applicability. It also
requires building a synthetic dataset to showcase data quality metrics coverage.</p>
      <p>RQ4: To what extent can all the quality metrics under consideration be
improved without degrading existing data quality? A data quality improvement
technique for each metric is identified and applied to understand the overall data</p>
      <sec id="sec-4-1">
        <title>2 https://www.w3.org/TR/shacl</title>
        <p>quality. This helps to identify correlated metrics and understand the efficacy of
data quality improvement techniques.</p>
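        <p>The uplift step raised in RQ1 can be sketched in Python for the simplest case, in which no domain vocabulary is available and property names are derived from the column headers. This is only an illustrative stand-in for a mapping language: the namespace http://example.org/, the function uplift_csv and the sample data are assumptions, every value is emitted as a plain literal, and escaping of control characters is omitted.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace


def uplift_csv(text: str, key_column: str) -> list[str]:
    """Turn each non-empty CSV cell into one N-Triples line.

    The value in key_column identifies the subject; every other column
    becomes a predicate named after its header.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}resource/{quote(row[key_column])}>"
        for column, value in row.items():
            if column == key_column or not value:
                continue  # skip the key itself and empty cells
            predicate = f"<{BASE}property/{quote(column)}>"
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} {predicate} "{escaped}" .')
    return lines


sample = "id,name,age\n1,Alice,42\n2,Bob,\n"
triples = uplift_csv(sample, "id")
for line in triples:
    print(line)
```
]]></preformat>
        <p>The empty age cell for the second row produces no triple, which is one place where a later completeness check could report a missing value.</p>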
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Approach</title>
      <p>The proposed approach aims to help stakeholders to i) assess data quality
problems, ii) identify triples that have violated quality, and iii) understand effective
strategies for solving the detected problems. The entire framework can be implemented
as a linear approach, summarised as follows:</p>
      <list list-type="bullet">
        <list-item>
          <p>Data quality assessment: defines actions to acquire the values of the identified
data quality dimensions.</p>
          <list list-type="bullet">
            <list-item><p>Definition of the quality dimensions to assess</p></list-item>
            <list-item><p>Measurement of the selected quality dimensions</p></list-item>
            <list-item><p>Reporting the data quality measurement</p></list-item>
          </list>
        </list-item>
        <list-item>
          <p>Root cause analysis: analyses the triples for data quality violations.</p>
          <list list-type="bullet">
            <list-item><p>Identification of quality dimensions</p></list-item>
            <list-item><p>Designing an ontology to report root cause analysis</p></list-item>
            <list-item><p>Representation of the results</p></list-item>
          </list>
        </list-item>
        <list-item>
          <p>Suggestions to improve triples that have violated quality: defines a process
to rectify the violations when quality is poor.</p>
          <list list-type="bullet">
            <list-item><p>Building a knowledge base to correct the common errors identified in
root cause analysis</p></list-item>
            <list-item><p>Implementation of a rule-based method to identify appropriate data quality
improvement techniques</p></list-item>
          </list>
        </list-item>
      </list>
      <sec id="sec-5-1">
        <title>Data quality assessment</title>
        <p>Computing a data quality assessment usually requires an understanding of the
metrics. One of the aims of the proposed method is to assess data at the A-Box
level without relying on an external knowledge base. Metrics that belong to the
intrinsic dimensions and some of the metrics that belong to the representational
dimensions focus on evaluating both the A-Box and the T-Box. Table 1 gives an overview of
the metrics that will be considered to assess data quality. Rule mining algorithms
will be used to learn frequent patterns in the linked data. This helps to assess
metrics such as no misuse of properties and correct domain and range definition,
among others. A clustering algorithm helps to assess metrics such as no
misplaced classes/properties and detection of outliers. Metrics that either do
not require an external knowledge base or require only a static external knowledge
base are identified and considered for the assessment.</p>
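        <p>The idea of learning frequent patterns from the data and using them to flag deviating triples can be illustrated with a deliberately simple, frequency-based stand-in for a rule mining algorithm. In this Python sketch, a rule "predicate implies datatype" is kept when the dominant datatype covers at least min_confidence of the usages of that predicate. The function names, the threshold of 0.8 and the sample triples are assumptions for illustration and are not the algorithm proposed here.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter, defaultdict

# (subject, predicate, lexical value, datatype)
Triple = tuple[str, str, str, str]


def mine_range_rules(triples: list[Triple], min_confidence: float = 0.8) -> dict[str, str]:
    """Learn 'predicate -> datatype' rules from how each predicate is actually used."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for _, predicate, _, datatype in triples:
        counts[predicate][datatype] += 1
    rules = {}
    for predicate, counter in counts.items():
        datatype, frequency = counter.most_common(1)[0]
        if frequency / sum(counter.values()) >= min_confidence:
            rules[predicate] = datatype
    return rules


def find_violations(triples: list[Triple], rules: dict[str, str]) -> list[Triple]:
    """Return the triples whose datatype contradicts a learned rule."""
    return [t for t in triples if t[1] in rules and t[3] != rules[t[1]]]


data = [
    ("ex:a", "ex:age", "42", "xsd:integer"),
    ("ex:b", "ex:age", "37", "xsd:integer"),
    ("ex:c", "ex:age", "29", "xsd:integer"),
    ("ex:d", "ex:age", "51", "xsd:integer"),
    ("ex:e", "ex:age", "63", "xsd:integer"),
    ("ex:f", "ex:age", "old", "xsd:string"),
    ("ex:a", "ex:note", "x", "xsd:string"),
    ("ex:b", "ex:note", "7", "xsd:integer"),
]
rules = mine_range_rules(data)
violations = find_violations(data, rules)
```
]]></preformat>
        <p>Only ex:age yields a rule here, since ex:note is split evenly between two datatypes; the single flagged triple is the one with the string value. No external ontology is consulted, which is the property the sketch is meant to show.</p>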
        <p>Table 2 gives an overview of metrics that require only computation. The
'Assessment only' column indicates that the specified metric is used only for assessment
purposes and that root cause analysis of quality violations is not applied to it. When RDF
data are considered in raw form, the metrics listed under 'direct use of RDF
data' can be assessed.</p>
        <p>
          The root causes of violations in the triples will be identified using data quality
metrics focused on evaluating intrinsic dimensions. Of the 24 quality metrics
identified by [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], eleven require neither an ontology nor a dynamic knowledge
base. The remaining dimensions are not considered mainly for two reasons:
1) the input data are required to be stored on the client side, and 2) external data
sources are needed. For example, quality dimensions in the accessibility category (availability,
security, performance, interlinking, licensing) are not taken into account because
they evaluate metrics when data are stored on the
server. The contextual dimensions, which focus on the context of the task at hand, are also not
considered for evaluation. The supported dimensions are intrinsic and
representational. Hence, the strengths of the proposed model lie in enforcing intrinsic and
representational data quality dimensions.
        </p>
      </sec>
      <sec id="sec-5-2">
        <title>Root cause analysis</title>
        <p>Root cause analysis helps to identify the reason for the violations. To develop
an effective corrective measure that fixes and prevents such adverse outcomes
in the future, it is critical to first understand the cause. The evaluation of the
root cause is followed by the formulation of recommendations. For example, a
datatype mismatch is a common quality problem. When the system detects one, a
recommendation can be given for the desired datatype along with subject and
object information.</p>
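        <p>The datatype mismatch example can be sketched in Python as follows. The sketch checks the lexical form of a literal against a few simplified patterns for XSD datatypes and, on failure, records where the problem lies and which known datatype would accept the value. The Violation record, the check_datatype function, the pattern table and the ex: identifiers are hypothetical and only indicate the kind of information a root cause report could carry.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re
from dataclasses import dataclass
from typing import Optional

# Simplified lexical patterns for a few XSD datatypes (not the full XSD grammar).
PATTERNS = {
    "xsd:integer": re.compile(r"[+-]?\d+"),
    "xsd:decimal": re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)"),
    "xsd:boolean": re.compile(r"true|false|1|0"),
}


@dataclass
class Violation:
    subject: str
    predicate: str
    obj: str
    location: str                   # part of the triple at fault
    problem: str                    # metric that failed
    recommendation: Optional[str]   # suggested datatype, if one fits


def check_datatype(s: str, p: str, lexical: str, declared: str) -> Optional[Violation]:
    """Return a Violation if the lexical form does not fit the declared datatype."""
    pattern = PATTERNS.get(declared)
    if pattern is None or pattern.fullmatch(lexical):
        return None  # unknown datatype or well-formed literal: nothing to report
    # Recommend the first known datatype whose pattern accepts the value.
    suggestion = next(
        (name for name, rx in PATTERNS.items() if rx.fullmatch(lexical)), None
    )
    return Violation(s, p, lexical, "object", "datatype mismatch", suggestion)


report = check_datatype("ex:a", "ex:height", "1.82", "xsd:integer")
clean = check_datatype("ex:a", "ex:age", "42", "xsd:integer")
```
]]></preformat>
        <p>For the first call the report points at the object and recommends xsd:decimal; the second call returns no violation.</p>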
      </sec>
      <sec id="sec-5-3">
        <title>Data quality improvement</title>
        <p>This stage aids in the improvement of data quality. The identification of quality
violated triples is the result of root cause analysis. Depending on the data quality
assessment value for each assessed metric, quality improvement techniques can
be applied. Data quality improvement methods help to remove outliers, correct
malformed datatype literals, and infer missing triples, among others. The proposed
method also aims to present suggestions in the form of add/delete/modify operations on the
erroneous triple, which could lead to an overall data quality improvement.</p>
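        <p>The add/delete/modify suggestions can be represented as plain data and applied only once a user accepts them. The following Python sketch shows one possible shape for this; the Suggestion record, the apply_suggestions function and the sample triples are assumptions for illustration, not the proposed implementation.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional

Triple = tuple[str, str, str]


@dataclass(frozen=True)
class Suggestion:
    action: str                           # "add", "delete" or "modify"
    triple: Triple                        # triple to add, delete, or replace
    replacement: Optional[Triple] = None  # only used by "modify"


def apply_suggestions(graph: set[Triple], suggestions: list[Suggestion]) -> set[Triple]:
    """Return a new graph with the accepted suggestions applied; the input is not mutated."""
    result = set(graph)
    for s in suggestions:
        if s.action == "add":
            result.add(s.triple)
        elif s.action == "delete":
            result.discard(s.triple)
        elif s.action == "modify" and s.replacement is not None:
            if s.triple in result:
                result.remove(s.triple)
                result.add(s.replacement)
        else:
            raise ValueError(f"unsupported suggestion: {s}")
    return result


graph = {
    ("ex:a", "ex:age", "forty-two"),
    ("ex:a", "ex:age", "-5"),
}
fixes = [
    Suggestion("modify", ("ex:a", "ex:age", "forty-two"), ("ex:a", "ex:age", "42")),
    Suggestion("delete", ("ex:a", "ex:age", "-5")),
    Suggestion("add", ("ex:a", "rdf:type", "ex:Person")),
]
improved = apply_suggestions(graph, fixes)
```
]]></preformat>
        <p>Returning a new set rather than editing in place keeps the original dataset available, so quality metrics can be recomputed before and after to check that an improvement did not degrade another metric.</p>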
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Discussion</title>
      <p>Existing data quality assessment approaches compute quality metrics and
report the assessment result as a numeric value. Prevailing assessment methods
require an external knowledge base in the form of a vocabulary/ontology or a
dictionary. Linked data are verified against the given external knowledge base to
generate an assessment report. The proposed approach makes use of ontology
learning, rule mining, and static dictionaries to assess the quality of the data.
Ontology learning and rule mining algorithms help to discover frequent
patterns that can be used to assess metrics such as consistency, syntactic validity
and outliers. Dataset users should be able to know the root causes
of the data quality violated triples. Root cause analysis helps to identify the
location of the problem, such as the subject, predicate, or object, along with the type
of the quality problem. The quality problem type refers to the metric that has failed on
a particular triple. The frameworks presented in Section 3 do not identify the
quality violated triples; they only help in the assessment of metrics. An ontology
will be proposed to report data quality violated triples as well as the data quality
assessment. A preliminary version of the proposed ontology is shown in Figure 1. Moreover,
suggestions on quality violated triples will be given in terms of add/modify/delete
operations, which help to improve data quality.</p>
    </sec>
    <sec id="sec-7">
      <title>Evaluation Plan</title>
      <p>The outcome of this research project will be a tool that supports data scientists
in uplifting datasets to RDF format, assessing data quality and analysing the root causes
of data quality violated triples to improve data quality. Multiple features must
be evaluated to understand the success of the proposed approach. However, the
main contribution of the work focuses on assessing data quality and analysing the
root causes of data quality violated triples to improve the quality. The model's
performance is also evaluated in terms of scalability and the time taken to identify
the root causes of violations in the knowledge graph. Moreover, metrics coverage
can be evaluated by considering a) a synthetic dataset, b) the results obtained by
other tools for comparison, and c) results that have to be verified manually.</p>
      <p>The synthetic dataset is carefully designed to verify the coverage of all the
implemented metrics. Apart from the synthetic dataset, the model is evaluated
on multiple datasets to verify applicability. The correctness of the
model is verified by comparing the proposed model with existing tools and with
the help of manual evaluation. Thus, our hypothesis, which addresses all the research
questions defined in Section 4, is stated as follows.</p>
      <p>Hypothesis: The time required to improve data quality after identifying the
root causes of violations is less than the time required when applying a random data quality improvement
technique directly to the raw data.</p>
    </sec>
    <sec id="sec-8">
      <title>Reflections</title>
      <p>To the best of our knowledge, quality assessment, root cause analysis and quality
improvement are rarely managed simultaneously in linked data. Therefore, our
goal is to fill this gap by proposing a framework that helps data providers and
consumers to assess and improve data quality. Moreover, the features offered by
this framework will be integrated into the ML framework to understand data
quality and test the applicability of our proposal.</p>
      <p>Acknowledgements This publication has emanated from research conducted
with the financial support of Science Foundation Ireland under Grant number
18/CRT/6183. For the purpose of Open Access, the author has applied a CC BY
public copyright licence to any Author Accepted Manuscript version arising from
this submission.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Albertoni</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Isaac</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Introducing the data quality vocabulary (DQV)</article-title>
          .
          <source>Semantic Web</source>
          <volume>12</volume>
          (
          <issue>1</issue>
          ),
          <fpage>81</fpage>
          –
          <lpage>97</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bozic</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brennan</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Feeney</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mendel-Gleason</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Describing reasoning results with rvo, the reasoning violations ontology</article-title>
          . In:
          <article-title>MEPDaW and LDQ co-located with ESWC</article-title>
          .
          <source>CEUR Workshop Proceedings</source>
          , vol.
          <volume>1585</volume>
          , pp.
          <fpage>62</fpage>
          –
          <lpage>69</lpage>
          .
          <string-name>
            <surname>CEUR-WS.org</surname>
          </string-name>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Debattista</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lange</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Luzzu - A methodology and framework for linked data quality assessment</article-title>
          .
          <source>ACM J. Data Inf. Qual</source>
          .
          <volume>8</volume>
          (
          <issue>1</issue>
          ),
          <fpage>4:1</fpage>
          –
          <lpage>4:32</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Debattista</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lange</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <article-title>: daq, an ontology for dataset quality information</article-title>
          .
          <source>In: Proceedings of the Workshop on Linked Data on the Web co-located with WWW. CEUR Workshop Proceedings</source>
          , vol.
          <volume>1184</volume>
          .
          <string-name>
            <surname>CEUR-WS.org</surname>
          </string-name>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Fürber</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hepp</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Swiqa - a semantic web information quality assessment framework</article-title>
          .
          <source>In: 19th European Conference on Information Systems</source>
          , ECIS. p.
          <volume>76</volume>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. Furber,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Hepp</surname>
          </string-name>
          ,
          <string-name>
            <surname>M.</surname>
          </string-name>
          :
          <article-title>Towards a vocabulary for data quality management in semantic web architectures</article-title>
          .
          <source>In: Proceedings of the 2011 EDBT/ICDT Workshop on Linked Web Data Management</source>
          . pp.
          <fpage>1</fpage>
          –
          <lpage>8</lpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Hogan</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Harth</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passant</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Decker</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Polleres</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Weaving the pedantic web</article-title>
          .
          <source>In: Proceedings of the WWW 2010 Workshop on Linked Data on the Web. CEUR Workshop Proceedings</source>
          , vol.
          <volume>628</volume>
          .
          <string-name>
            <surname>CEUR-WS.org</surname>
          </string-name>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Kontokostas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Westphal</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hellmann</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cornelissen</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zaveri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Test-driven evaluation of linked data quality</article-title>
          .
          <source>In: 23rd International World Wide Web Conference</source>
          , WWW. pp.
          <fpage>747</fpage>
          –
          <lpage>758</lpage>
          .
          <string-name>
            <surname>ACM</surname>
          </string-name>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Kontokostas</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zaveri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Triplecheckmate: A tool for crowdsourcing the quality assessment of linked data. In: Knowledge Engineering and the Semantic Web</article-title>
          .
          <source>Communications in Computer and Information Science</source>
          , vol.
          <volume>394</volume>
          , pp.
          <fpage>265</fpage>
          –
          <lpage>272</lpage>
          . Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Langer</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Siegert</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Göpfert</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gaedke</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Semquire - assessing the data quality of linked open data sources based on DQV</article-title>
          .
          <source>In: Current Trends in Web Engineering - ICWE. Lecture Notes in Computer Science</source>
          , vol.
          <volume>11153</volume>
          , pp.
          <fpage>163</fpage>
          –
          <lpage>175</lpage>
          . Springer (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Meester</surname>
            ,
            <given-names>B.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Heyvaert</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arndt</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dimou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verborgh</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>RDF graph validation using rule-based reasoning</article-title>
          .
          <source>Semantic Web</source>
          <volume>12</volume>
          (
          <issue>1</issue>
          ),
          <fpage>117</fpage>
          -
          <lpage>142</lpage>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Mendes</surname>
            ,
            <given-names>P.N.</given-names>
          </string-name>
          , Muhleisen, H.,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Sieve: linked data quality assessment and fusion</article-title>
          .
          <source>In: Proceedings of the 2012 Joint EDBT/ICDT Workshops</source>
          . pp.
          <fpage>116</fpage>
          -
          <lpage>123</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Mihindukulasooriya</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>García-Castro</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gómez-Pérez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>LD sniffer: A quality assessment tool for measuring the accessibility of linked data</article-title>
          .
          <source>In: Knowledge Engineering and Knowledge Management - EKAW. Lecture Notes in Computer Science</source>
          , vol.
          <volume>10180</volume>
          , pp.
          <fpage>149</fpage>
          -
          <lpage>152</lpage>
          . Springer (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Paulheim</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Knowledge graph refinement: A survey of approaches and evaluation methods</article-title>
          .
          <source>Semantic Web</source>
          <volume>8</volume>
          (
          <issue>3</issue>
          ),
          <fpage>489</fpage>
          -
          <lpage>508</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Radulovic</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mihindukulasooriya</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>García-Castro</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gómez-Pérez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>A comprehensive quality model for linked data</article-title>
          .
          <source>Semantic Web</source>
          <volume>9</volume>
          (
          <issue>1</issue>
          ),
          <fpage>3</fpage>
          -
          <lpage>24</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Ruckhaus</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vidal</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Castillo</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Burguillos</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baldizan</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          :
          <article-title>Analyzing linked data quality with LiQuate</article-title>
          .
          <source>In: The Semantic Web: ESWC. Lecture Notes in Computer Science</source>
          , vol.
          <volume>8798</volume>
          , pp.
          <fpage>488</fpage>
          -
          <lpage>493</lpage>
          . Springer (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Strong</surname>
            ,
            <given-names>D.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>Y.W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>R.Y.</given-names>
          </string-name>
          :
          <article-title>Data quality in context</article-title>
          .
          <source>Communications of the ACM</source>
          <volume>40</volume>
          (
          <issue>5</issue>
          ),
          <fpage>103</fpage>
          -
          <lpage>110</lpage>
          (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Vaidyambath</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Debattista</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Srivatsa</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brennan</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>An intelligent linked data quality dashboard</article-title>
          .
          <source>In: Proceedings for the 27th AIAI Irish Conference on Artificial Intelligence and Cognitive Science. CEUR Workshop Proceedings</source>
          , vol.
          <volume>2563</volume>
          , pp.
          <fpage>341</fpage>
          -
          <lpage>352</lpage>
          .
          <publisher-name>CEUR-WS.org</publisher-name>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Wand</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>R.Y.</given-names>
          </string-name>
          :
          <article-title>Anchoring data quality dimensions in ontological foundations</article-title>
          .
          <source>Commun. ACM</source>
          <volume>39</volume>
          (
          <issue>11</issue>
          ),
          <fpage>86</fpage>
          -
          <lpage>95</lpage>
          (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Zaveri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rula</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maurino</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pietrobon</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Quality assessment for linked data: A survey</article-title>
          .
          <source>Semantic Web</source>
          <volume>7</volume>
          (
          <issue>1</issue>
          ),
          <fpage>63</fpage>
          -
          <lpage>93</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>