<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A SHACL-based Data Consistency Solution for Contract Compliance Verification</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Robert David</string-name>
          <email>robert.david@graphwise.ai</email>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Albin Ahmeti</string-name>
          <email>albin.ahmeti@graphwise.ai</email>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Geni Bushati</string-name>
          <email>geni.bushati@sti2.at</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amar Tauqeer</string-name>
          <email>amar.tauqeer@wur.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anna Fensel</string-name>
          <email>anna.fensel@wur.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Artificial Intelligence Chair Group, Wageningen University &amp; Research</institution>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Semantic Technology Institute, Department of Computer Science, Universität Innsbruck</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Semantic Web Company GmbH</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Vienna University of Economics and Business</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Vienna University of Technology</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In recent years, there have been many developments for GDPR-compliant data access and sharing based on consent. For more complex data sharing scenarios, where consent might not be sufficient, many parties rely on contracts. Before a contract is signed, it must undergo the process of contract negotiation within the contract lifecycle, which consists of negotiating the obligations associated with the contract. Contract compliance verification (CCV) provides a means to verify whether a contract is GDPR-compliant, i.e., whether it adheres to legal obligations without violations. The rise of knowledge graph (KG) adoption, enabling semantic interoperability using well-defined semantics, allows CCV to be applied to KGs. In the scenario of different participants negotiating obligations, there is a need for data consistency to ensure that CCV is done correctly. Recent work introduced the automated contracting tool (ACT), a KG-based and ODRL-employing tool for GDPR CCV, which was developed in the Horizon 2020 project smashHit (https://smashhit.eu). In this work, we propose a novel approach to overcome some limitations of ACT. We semi-automatically resolve CCV inconsistencies by providing repair strategies, which automatically propose (optimal) solutions to the user to re-establish data consistency and thereby support them in managing GDPR-compliant contract lifecycle data. We have implemented the approach, integrated it into ACT and tested its correctness and performance against basic CCV consistency requirements.</p>
      </abstract>
      <kwd-group>
        <kwd>Privacy protection</kwd>
        <kwd>GDPR</kwd>
        <kwd>Contract compliance verification</kwd>
        <kwd>Data consistency</kwd>
        <kwd>Constraint languages</kwd>
        <kwd>SHACL</kwd>
        <kwd>Knowledge graphs</kwd>
        <kwd>Logic programming</kwd>
        <kwd>Answer set programming</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        One of the legal bases that must be satisfied under the General Data Protection Regulation (GDPR) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] is
informed consent before sharing any actual data. In certain cases, consent might not be sufficient and
parties need to establish a contract for data sharing. This is typical for online services that comprise
Business-to-Business (B2B) and Business-to-Consumer (B2C) data sharing, where a contract specifies
the terms and obligations that define each party’s responsibilities. The contract lifecycle comprises the
phases of negotiation, signing, execution, auditing, and termination or renewal. In this context,
Contract Compliance Verification (CCV) is about auditing and provides a means to verify whether a
contract is GDPR-compliant, that is, to check and report violations, e.g., for not fulfilling an obligation.
      </p>
      <p>
        Recent approaches to digital contracting that tackle the CCV challenge are often based on knowledge
graphs (KGs) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] as a means to improve interoperability, interpretation, and contextualization of data,
employing precise semantics through ontologies and controlled vocabularies. The automated contracting
tool (ACT) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] has been tested and used in two data sharing scenarios, namely car insurance and
smart cities. Especially in the latter case, a vast amount of data pertaining to contracts is shared, calling
for a more scalable, trusted, and secure platform for data sharing. We define our challenges as:
      </p>
      <p>(i) Identifying and reporting CCV violations by defining and integrating constraints for ACT using the
Shapes Constraint Language (SHACL).</p>
      <p>(ii) Fixing CCV violations, i.e., semi-automatically repairing violations while preserving
the semantics of the contract.</p>
      <p>
        To cope with these challenges, we first establish integrity constraints using SHACL. In the context of the
ACT tool, it may occur that CCV violations are reported based on inconsistencies present in received
data. For such cases of CCV violations, we employ SHACL validation as a declarative approach to
identify inconsistencies, and SHACL repairs [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to semi-automatically correct identified inconsistencies
with a minimum of user intervention. This is achieved by defining and implementing repair strategies
for inconsistent data reported by CCV. These repair strategies i) extend our previous work on SHACL
repairs [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] to provide use case specific repair optimizations and ii) integrate with our ACT tool for users
to semi-automatically fix inconsistent data reported by CCV.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        There are several systems developed to manage contracts consistently between participants, as described
in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Related work for automated contract management can be found in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
Regulation compliance checking is shown in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] using ontologies and SPARQL. Automated compliance
checking for RDF data is presented in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. CCV Architecture</title>
      <p>Contract Lifecycle. The contracting process involves drafting, negotiating terms, signing, executing
obligations, and performing CCV checks during the auditing phase. Finally, the process ends with the
termination or renewal of the contract.</p>
      <p>Data Model. The data for contracts and related elements is described using the Contract Ontology (CO),
which incorporates the Financial Industry Business Ontology (FIBO) to semantically describe contract
lifecycle elements. This work focuses on the fibo:Contract class, whose status is represented by the property
co:hasContractStatus, and on the co:Obligation class. Contracts are linked to their obligations via the co:hasObligations
property, and the obligation state is represented by co:hasState.</p>
      <p>
        ACT [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is an application that is used to create and manage contracts between different parties along
the contract lifecycle. Using its interface, users can associate clauses and obligations with contracts.
The ACT UI flags any obligation that a party has not fulfilled by its due
time, i.e., GDPR compliance has been violated. Whenever a contract or
obligation is created or changed, CCV validation is performed by a SHACL processor. The validation checks
conformance with the consistency requirements pertaining to obligations, such as whether the status
of a contract is consistent with the states of its obligations. Furthermore, ACT has been extended to display
data inconsistencies whenever they occur, and it offers to restore consistency by letting the user select one of the repair
choices in a human-in-the-loop approach.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. CCV Consistency and Repairs</title>
      <p>The CCV consistency requirements were originally determined for the contract data in the smashHit KG,
which was built for the use cases of the smashHit project. For achieving CCV consistency in relation to
ACT, we introduce repair strategies to automatically ensure that every contract has a clearly defined and
consistent status regarding the contract lifecycle.</p>
      <p>Recent work presented SHACL repairs – an approach to repair RDF graphs to satisfy given SHACL
constraints. It is based on explanations for constraint violations, which are represented as sets of
additions and deletions for the RDF graph to achieve conformance. The implementation uses answer
set programming (ASP) to provide repair choices as part of minimal models. In our context of fixing
violations reported by CCV, we aim to automatically determine an optimal repair choice to minimize
human-in-the-loop intervention. In cases where this is not possible, the ACT UI has been extended to allow users to
select their preferred choice.</p>
      <p>Consequently, we extend SHACL repairs with repair strategies for ACT, which formalize optimizations
to automatically determine optimal repair choices that comply with the CCV consistency requirements. We
implement a repair strategy program on top of the SHACL repair program, which parses RDF repair
strategies and automatically generates additional ASP rules to determine optimal choices.</p>
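      <p>The rule-generation step can be pictured with a minimal Python sketch. It is not the authors' repair strategy program: the dictionary layout of a parsed constraint, the function names asp_term and strategy_to_asp, and the ASP predicate del/3 are all hypothetical, and the sketch assumes that each strategy constraint is read as forbidding the listed repair action.</p>
      <preformat>
```python
def asp_term(value):
    """Quote an RDF name as an ASP string constant."""
    return '"' + value + '"'


def strategy_to_asp(constraints):
    """Emit one ASP integrity constraint per forbidden repair action.

    Hypothetical encoding: each constraint is a dict with a 'path', an
    'action' ("delete" or "add") and an optional list of 'values' that
    restricts the object position. The predicates del/3 and add/3 are
    assumptions of this sketch, not the encoding of the repair program.
    """
    predicate = {"delete": "del", "add": "add"}
    rules = []
    for c in constraints:
        atom = predicate[c["action"]]
        path = asp_term(c["path"])
        values = c.get("values")
        if values:
            # Forbid the action only for the listed object values.
            for v in values:
                rules.append(":- %s(%s, X, %s)." % (atom, path, asp_term(v)))
        else:
            # Forbid the action for every triple with this property.
            rules.append(":- %s(%s, X, Y)." % (atom, path))
    return rules


# Two illustrative constraints: never delete co:hasObligations triples,
# and never delete a co:hasState triple whose value is co:ViolatedState.
constraints = [
    {"path": "co:hasObligations", "action": "delete"},
    {"path": "co:hasState", "action": "delete", "values": ["co:ViolatedState"]},
]

for rule in strategy_to_asp(constraints):
    print(rule)
```
      </preformat>
      <p>Under these assumptions, each generated rule is an integrity constraint that removes every answer set containing the forbidden repair action, so only the remaining repair choices are offered.</p>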
      <p>In the following, we provide an example for one of the ACT consistency requirements. The following
SHACL shape represents the consistency requirement: "A Contract is violated if at least one associated
Obligation has the state set to violated".</p>
      <preformat>:ContractViolationShape a sh:NodeShape ;
  sh:targetClass fibo:Contract ;
  sh:property [ sh:path :hasContractStatus ; sh:maxCount 1 ] ;
  sh:or (
    [ sh:not [ sh:property [ sh:path ( co:hasObligations co:hasState ) ; sh:hasValue co:ViolatedState ] ] ]
    [ sh:property [ sh:path co:hasContractStatus ; sh:hasValue co:statusViolated ] ]
  ) .</preformat>
      <p>Consistency for ContractViolationShape means that a contract must not have a non-violated status if one of
its obligations is violated. In case of a violation, the repair program either deletes hasObligations, removes
the ViolatedState from the obligations, or adds statusViolated to the contract. The repair strategy we
define for CCV semantics only allows adding statusViolated for hasContractStatus, so that the repaired data represents the
actual real-world situation. We define two constraints for the repair strategy, one for each deletion to be excluded.</p>
      <preformat>:ViolatedContractStrategy a shr:RepairStrategy ;
  shr:hasConstraint [ sh:path co:hasObligations ; shr:action shr:delete ] ;
  shr:hasConstraint [ sh:path co:hasState ; shr:action shr:delete ; shr:values ( co:ViolatedState ) ] .</preformat>
      <p>We assume the following data graph:</p>
      <preformat>:contb2b a fibo:Contract ; co:hasContractStatus :statusFulfilled ; co:hasObligations :ob_1 .
:ob_1 a co:Obligation ; co:hasState co:ViolatedState .</preformat>
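      <p>To make the consistency requirement concrete, the following minimal Python sketch hand-codes the two conditions of ContractViolationShape and applies them to the data graph above. It is an illustration only and does not use the SHACL processor employed by ACT. Triples are plain tuples, prefixes are dropped, and the function names objects, conforms and violating_contracts are hypothetical.</p>
      <preformat>
```python
def objects(graph, subject, predicate):
    """Return every o with (subject, predicate, o) in the graph."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


def conforms(graph, contract):
    """Hand-coded check of the two conditions in ContractViolationShape."""
    statuses = objects(graph, contract, "hasContractStatus")
    # sh:maxCount 1 on the contract status
    if len(statuses) > 1:
        return False
    # Path ( hasObligations hasState ) reaches ViolatedState
    violated_obligation = any(
        "ViolatedState" in objects(graph, ob, "hasState")
        for ob in objects(graph, contract, "hasObligations")
    )
    # sh:or: no violated obligation, or the contract status is statusViolated
    return (not violated_obligation) or ("statusViolated" in statuses)


def violating_contracts(graph):
    """Return the Contract instances that fail the check, sorted by name."""
    contracts = {s for (s, p, o) in graph if p == "type" and o == "Contract"}
    return sorted(c for c in contracts if not conforms(graph, c))


# The example data graph, with prefixes dropped (an assumption of this sketch).
graph = {
    ("contb2b", "type", "Contract"),
    ("contb2b", "hasContractStatus", "statusFulfilled"),
    ("contb2b", "hasObligations", "ob_1"),
    ("ob_1", "type", "Obligation"),
    ("ob_1", "hasState", "ViolatedState"),
}

print(violating_contracts(graph))
```
      </preformat>
      <p>For this data graph the check reports contb2b, because its obligation ob_1 is in ViolatedState while the contract status is still statusFulfilled.</p>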
      <p>Without the repair strategy, the repair program would return the following two minimal repairs with
deletions D1 and D2, which both violate the CCV semantics:</p>
      <p>D1 = {hasObligations(contb2b, ob_1)}</p>
      <p>D2 = {hasState(ob_1, ViolatedState)}</p>
      <p>In contrast, when we apply the repair strategy, the repair program will return a new minimal repair under
the constraints added by the repair strategy with additions A and deletions D:</p>
      <p>A = {hasContractStatus(contb2b, statusViolated)}</p>
      <p>D = {hasContractStatus(contb2b, statusFulfilled)}</p>
      <p>With repair strategies, the data can now only be repaired by assigning the contract status statusViolated
to the contract, thereby implementing the expected CCV semantics.</p>
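      <p>The effect of the repair strategy on this example can be reproduced with a small brute-force Python sketch. It is not the ASP-based repair program: it enumerates sets of additions and deletions by increasing size and keeps the smallest ones that restore conformance. It assumes that minimality means the fewest changes, that type triples are never modified, and that the candidate changes are the four listed in the code. All function and variable names are hypothetical.</p>
      <preformat>
```python
from itertools import combinations


def objects(graph, subject, predicate):
    """Return every o with (subject, predicate, o) in the graph."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


def conforms(graph, contract):
    """Hand-coded check of ContractViolationShape (prefixes dropped)."""
    statuses = objects(graph, contract, "hasContractStatus")
    if len(statuses) > 1:  # sh:maxCount 1
        return False
    violated = any(
        "ViolatedState" in objects(graph, ob, "hasState")
        for ob in objects(graph, contract, "hasObligations")
    )
    return (not violated) or ("statusViolated" in statuses)


def minimal_repairs(graph, contract, candidates, forbidden=frozenset()):
    """Return all smallest sets of changes that make the contract conform.

    A change is ("add" or "del", triple); changes in `forbidden` are skipped,
    which plays the role of the repair strategy in this sketch.
    """
    allowed = [c for c in candidates if c not in forbidden]
    for size in range(len(allowed) + 1):
        found = []
        for changes in combinations(allowed, size):
            repaired = set(graph)
            for action, triple in changes:
                if action == "add":
                    repaired.add(triple)
                else:
                    repaired.discard(triple)
            if conforms(repaired, contract):
                found.append(set(changes))
        if found:
            return found
    return []


graph = {
    ("contb2b", "hasContractStatus", "statusFulfilled"),
    ("contb2b", "hasObligations", "ob_1"),
    ("ob_1", "hasState", "ViolatedState"),
}
candidates = [
    ("del", ("contb2b", "hasObligations", "ob_1")),
    ("del", ("ob_1", "hasState", "ViolatedState")),
    ("del", ("contb2b", "hasContractStatus", "statusFulfilled")),
    ("add", ("contb2b", "hasContractStatus", "statusViolated")),
]
# The strategy forbids the two deletions that contradict CCV semantics.
strategy = frozenset(candidates[:2])

without_strategy = minimal_repairs(graph, "contb2b", candidates)
with_strategy = minimal_repairs(graph, "contb2b", candidates, strategy)
print(without_strategy)
print(with_strategy)
```
      </preformat>
      <p>Without the strategy, the sketch finds the two single-deletion repairs D1 and D2. With the strategy, the only remaining smallest repair combines the addition A with the deletion D, since adding statusViolated alone would leave the contract with two statuses.</p>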
      <p>Evaluation of CCV Repairs. Tests were conducted using contract data from the smashHit project
to evaluate correctness and performance, including bulk performance tests. The results in Fig. 1
show that data with up to 3000 inconsistencies was repaired within a minute. Runtime grows
superlinearly with the number of inconsistencies. Overall, the test results indicate promising
performance in practice and support the approach’s correctness and scalability. The implementation and
test data are available on GitHub (https://github.com/robert-david/shacl-repairs/tree/ccv-repair-strategies).</p>
      <p>Fig. 1. Repair performance (x-axis: data graph inconsistencies).</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions &amp; Future Work</title>
      <p>This work presents a new approach for semi-automatically resolving data inconsistencies in CCV using
the ACT tool, simplifying the repair process and helping to manage GDPR-compliant contract data.
Future work will focus on handling more complex constraints, evaluating repair quality, and expanding
the approach to broader use cases, ultimately improving the applicability and reliability of semantic
contract environments and supporting the adoption of formalized contracts in automated systems.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B.</given-names>
            <surname>Wolford</surname>
          </string-name>
          ,
          <article-title>General Data Protection Regulation (GDPR)</article-title>
          . Available online: https://gdpr.eu/what-is-gdpr/,
          <year>2022</year>
          . Accessed on 20 July 2022.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Tauqeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kurteva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Chhetri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ahmeti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fensel</surname>
          </string-name>
          ,
          <article-title>Automated GDPR contract compliance verification using knowledge graphs</article-title>
          ,
          <source>Inf.</source>
          <volume>13</volume>
          (
          <year>2022</year>
          )
          <fpage>447</fpage>
          . doi:10.3390/info13100447.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Tauqeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fensel</surname>
          </string-name>
          ,
          <article-title>GDPR data sharing contract management and compliance verification tool</article-title>
          ,
          <source>Software Impacts</source>
          <volume>21</volume>
          (
          <year>2024</year>
          )
          <fpage>100653</fpage>
          . doi:10.1016/j.simpa.2024.100653.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ahmetaj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>David</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Šimkus</surname>
          </string-name>
          ,
          <article-title>Repairing SHACL constraint violations using answer set programming</article-title>
          , in:
          <source>The Semantic Web - ISWC 2022: 21st International Semantic Web Conference, Virtual Event, October 23-27, 2022, Proceedings</source>
          , Springer-Verlag, Berlin, Heidelberg,
          <year>2022</year>
          , pp.
          <fpage>375</fpage>
          -
          <lpage>391</lpage>
          . doi:10.1007/978-3-031-19433-7_22.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R. G.</given-names>
            <surname>Brown</surname>
          </string-name>
          ,
          <article-title>The Corda platform: An introduction</article-title>
          ,
          <source>Retrieved</source>
          <volume>27</volume>
          (
          <year>2018</year>
          ). doi:10.13140/RG.2.2.30487.37284.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Shi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>A blockchain-driven electronic contract management system for commodity procurement in electronic power industry</article-title>
          ,
          <source>IEEE Access</source>
          <volume>9</volume>
          (
          <year>2021</year>
          )
          <fpage>9473</fpage>
          -
          <lpage>9480</lpage>
          . doi:10.1109/ACCESS.2021.3049562.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Aldewereld</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dignum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-H.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <article-title>Compliance checking of organizational interactions</article-title>
          ,
          <source>ACM Trans. Manage. Inf. Syst</source>
          .
          <volume>5</volume>
          (
          <year>2015</year>
          ). doi:10.1145/2629630.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Lomuscio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Qu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Solanki</surname>
          </string-name>
          ,
          <article-title>Towards verifying compliance in agent-based web service compositions</article-title>
          , volume
          <volume>1</volume>
          ,
          <year>2008</year>
          , pp.
          <fpage>265</fpage>
          -
          <lpage>272</lpage>
          . doi:10.1145/1402383.1402424.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Molina-Jimenez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shrivastava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Strano</surname>
          </string-name>
          ,
          <article-title>A model for checking contractual compliance of business interactions</article-title>
          ,
          <source>IEEE Transactions on Services Computing</source>
          <volume>5</volume>
          (
          <year>2011</year>
          )
          <fpage>276</fpage>
          -
          <lpage>289</lpage>
          . doi:10.1109/TSC.2011.37.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K. R.</given-names>
            <surname>Bouzidi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Fies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Faron-Zucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Zarli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. L.</given-names>
            <surname>Thanh</surname>
          </string-name>
          ,
          <article-title>Semantic web approach to ease regulation compliance checking in construction industry</article-title>
          ,
          <source>Future Internet</source>
          <volume>4</volume>
          (
          <year>2012</year>
          )
          <fpage>830</fpage>
          -
          <lpage>851</lpage>
          . doi:10.3390/fi4030830.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L.</given-names>
            <surname>Robaldo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Pacenza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zangari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Calegari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Calimeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Siragusa</surname>
          </string-name>
          ,
          <article-title>Efficient compliance checking of RDF data</article-title>
          ,
          <source>Journal of Logic and Computation</source>
          (
          <year>2023</year>
          ). doi:10.1093/logcom/exad034.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>