<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>LIVE: a Tool for Checking Licenses Compatibility between Vocabularies and Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Guido Governatori</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ho-Pun Lam</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antonino Rotolo</string-name>
          <email>antonino.rotolo@unibo.it</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Serena Villata</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ghislain Atemezing</string-name>
          <email>auguste.atemezing@eurecom.fr</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabien Gandon</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>INRIA Sophia Antipolis</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>NICTA Queensland Research Laboratory</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Bologna</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In the Web of Data, licenses specifying the terms of use and reuse are associated not only with datasets but also with vocabularies. However, support for taking vocabulary licenses into account is even scarcer than for dataset licenses. In this paper, we present LIVE, a framework that supports data publishers in verifying license compatibility, taking into account both the licenses associated with the vocabularies and those assigned to the data built using such vocabularies.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The license of a dataset in the Web of Data can be specified within the data or outside
of it, for example in a separate document that links to the data. In line with the Web of
Data philosophy [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], licenses for such datasets should be specified in RDF, for instance
through the Dublin Core vocabulary. Despite such guidelines, considerable effort is still
needed to improve the association of licenses with data on the Web, and to process licensed
material in an automated way. The scenario becomes even more complex when another
essential component of the Web of Data is taken into account: the vocabularies. Our
goal is to support the data provider in assigning a license to her data and in verifying
its compatibility with the licenses associated with the adopted vocabularies. We address
this goal by proposing an online framework called LIVE (LIcenses VErification)
that exploits the formal approach to license composition proposed in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] to verify the
compatibility of a set of heterogeneous licenses. After retrieving the licenses
associated with the vocabularies used in the dataset under analysis, LIVE supports data providers
in verifying whether the license assigned to the dataset is compatible with those of the
vocabularies, and returns a warning when this is not the case.
      </p>
    </sec>
    <sec id="sec-2">
      <title>The LIVE framework</title>
      <p>The LIVE framework is a JavaScript application that combines HTML and Bootstrap.
Hence, installation has no prerequisites. Since the tool is written in JavaScript, execution
time is best monitored with the performance.now() function. We use the
10 LOD datasets with the highest number of links towards other LOD datasets, available
at http://lod-cloud.net/state/#links. For each of the URLs in Datahub, we
retrieve the VoID<sup>3</sup> file in Turtle format, and we use the voidChecker function<sup>4</sup> of the
LIVE tool to retrieve the associated license, if any. The input of the LIVE framework
(Figure 1) consists of the dataset (URI or VoID) whose license has to be verified. The
framework is composed of two modules. The first module retrieves the
vocabularies used in the dataset and, for each vocabulary, retrieves the associated license<sup>5</sup>
(if any) by querying the LOV repository. The second module takes as input the set of
licenses (i.e., the licenses of the vocabularies used in the dataset as well as the license
assigned to the dataset) and verifies whether they are compatible with each other. The
result returned by the module is a yes/no answer. In case of a negative answer, the data
provider is invited to change the license associated with the dataset and to check again
with the LIVE framework whether further inconsistencies arise.</p>
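      <p>As an illustration of the timing approach mentioned above, the following is a minimal sketch of how performance.now() can be wrapped around a step to measure it. The helper name timeIt and the stand-in workload are assumptions made for this example; this is not the measurement code of the LIVE tool.</p>
      <preformat><![CDATA[
```javascript
// Hypothetical helper: time a synchronous step with performance.now().
function timeIt(label, fn) {
  const start = performance.now(); // high-resolution timestamp, in milliseconds
  const result = fn();
  const elapsedMs = performance.now() - start;
  return { label, result, elapsedMs };
}

// Stand-in workload (sums the integers 0..99999); a real measurement would
// call the retrieval step or the compatibility step here instead.
const run = timeIt("dummy-step", () => {
  let total = 0;
  for (let i = 0; i < 100000; i++) {
    total += i;
  }
  return total;
});

console.log(run.label + ": " + run.elapsedMs.toFixed(3) + " ms");
```
]]></preformat>
      <p>Because performance.now() returns fractional milliseconds, the same wrapper can in principle report both millisecond-scale and second-scale steps in one unit.</p>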
      <p>[Figure 1: the LIVE framework. Labels recoverable from the figure: a request to check the consistency of licensing information for dataset D; the licenses retrieval module, which retrieves the vocabularies used in the dataset and retrieves the licenses for the selected vocabularies from LOV; the vocabularies and data licenses passed to the licenses compatibility module; and the output "Warning: licenses are not compatible".]</p>
      <p>
        Retrieving licensing information from vocabularies and datasets. Two use cases are
taken into account: a SPARQL endpoint, or a VoID file in Turtle syntax. In the first
use case, the tool retrieves the named graphs present in the repository, and the
user is asked to select the URI of the graph that needs to be checked. With that
information, a SPARQL query is triggered, looking for entities declared as owl:Ontology or
voaf:Vocabulary, or appearing as the object of the void:vocabulary property. The final step is to
look up the LOV catalogue to check whether these vocabularies declare any license. There are two
options for checking the license: (i) a “strict checking”, where the FILTER clause
contains exactly the namespace of the submitted vocabulary, or (ii) a “domain checking”,
where only the domain of the vocabulary is used in the FILTER clause. The latter option
is recommended when only one vocabulary has to be checked for the license. In the
second use case, the module parses a VoID file using an N3 parser for JavaScript<sup>6</sup>, and
then collects the vocabularies declared in the file, querying LOV<sup>7</sup> again to check their
licensing information. Once the URIs of the licenses associated with the vocabularies and
the dataset have been retrieved, the module retrieves the machine-readable descriptions of the
licenses from the dataset of licenses [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p><sup>3</sup> http://www.w3.org/TR/void/</p>
      <p><sup>4</sup> http://www.eurecom.fr/~atemezin/licenseChecker/voidChecker.html</p>
      <p>
        <sup>5</sup> Note that the LIVE framework relies on the dataset of machine-readable licenses (RDF, Turtle
syntax) presented in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
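      <p>The two queries of the SPARQL endpoint use case can be sketched as follows. This is only an illustration under stated assumptions: the function names buildVocabularyQuery and buildLicenseQuery, the variable names, the use of dcterms:license as the property carrying the license, and the reading of "domain" as the host name of the namespace URI are all choices made for this example and are not taken from the LIVE implementation.</p>
      <preformat><![CDATA[
```javascript
// Prefixes for the vocabularies named in the text (dcterms is an assumption).
const PREFIXES = [
  "PREFIX owl: <http://www.w3.org/2002/07/owl#>",
  "PREFIX voaf: <http://purl.org/vocommons/voaf#>",
  "PREFIX void: <http://rdfs.org/ns/void#>",
  "PREFIX dcterms: <http://purl.org/dc/terms/>",
].join("\n");

// Query 1: entities declared as owl:Ontology or voaf:Vocabulary, or used as
// the object of void:vocabulary, inside the named graph chosen by the user.
function buildVocabularyQuery(graphUri) {
  return `${PREFIXES}
SELECT DISTINCT ?vocab WHERE {
  GRAPH <${graphUri}> {
    { ?vocab a owl:Ontology }
    UNION { ?vocab a voaf:Vocabulary }
    UNION { ?dataset void:vocabulary ?vocab }
  }
}`;
}

// Query 2: license lookup. "strict" compares the exact namespace; any other
// mode is treated as "domain" and matches on the host name only.
// Note: inputs are interpolated without escaping, which is fine for a sketch
// but not for untrusted input.
function buildLicenseQuery(vocabNamespace, mode) {
  const filter = mode === "strict"
    ? `FILTER (STR(?vocab) = "${vocabNamespace}")`
    : `FILTER (CONTAINS(STR(?vocab), "${new URL(vocabNamespace).hostname}"))`;
  return `${PREFIXES}
SELECT ?vocab ?license WHERE {
  ?vocab dcterms:license ?license .
  ${filter}
}`;
}

console.log(buildLicenseQuery("http://purl.org/linked-data/cube#", "domain"));
```
]]></preformat>
      <p>In this sketch the domain variant matches every vocabulary hosted under the same host name, which is consistent with recommending it only when a single vocabulary is being checked.</p>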
      <p>
        Licenses compatibility verification. The logic proposed in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and the licenses
compatibility verification process have been implemented using SPINdle [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], a defeasible logic
reasoner capable of reasoning over defeasible theories with hundreds of thousands of rules.
      </p>
      <p>[Figure 2. Labels recoverable from the figure: licenses retrieval, RDF-Defeasible Theory Translator, Theories Composer, reasoning engine, and compatibility checker, arranged in a user interface and a reasoning layer; the labeled flows are the composed theory, contextual information, results, and conclusions.]</p>
      <p>As depicted in Figure 2, after a query is received from the user, the selected licenses
(represented in RDF) are translated into the DFL formalism supported by SPINdle
using the RDF-Defeasible Theory Translator. That is, each RDF triple is translated
into a defeasible rule based on the subsumption relation between the subject and the object
of the triple. In our case, we can use the subject and the object of the RDF triple as
the antecedent and the head of a defeasible rule, respectively. In addition, the translator
supports direct import from the Web and processing of RDF data into SPINdle theories.</p>
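      <p>The subject-to-antecedent, object-to-head mapping described above can be sketched as follows. This is a toy illustration, not the RDF-Defeasible Theory Translator itself: the function names, the example.org triple, the use of local names as literals, and the "label: antecedent =&gt; head" surface form for a defeasible rule are all assumptions made for this example.</p>
      <preformat><![CDATA[
```javascript
// Reduce a URI to its local name so that it can be used as a literal in a rule.
function localName(uri) {
  const cut = Math.max(uri.lastIndexOf("#"), uri.lastIndexOf("/"));
  return uri.slice(cut + 1);
}

// Translate each triple into one rule string: the subject becomes the
// antecedent and the object becomes the head. The predicate is not used here.
function triplesToRules(triples) {
  return triples.map(
    (t, i) => `r${i + 1}: ${localName(t.subject)} => ${localName(t.object)}`
  );
}

// Hypothetical input triple (example.org data invented for illustration).
const triples = [
  {
    subject: "http://example.org/licenses#Reproduction",
    predicate: "http://www.w3.org/2000/01/rdf-schema#subClassOf",
    object: "http://example.org/licenses#Use",
  },
];

const rules = triplesToRules(triples);
console.log(rules.join("\n")); // prints "r1: Reproduction => Use"
```
]]></preformat>
      <p>A full translator would also have to carry the deontic information of each license (permissions, obligations, prohibitions), which this sketch leaves out.</p>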
      <p>
        The translated defeasible theories are then composed into a single defeasible
theory based on the logic proposed in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], using the Theories Composer. Afterwards, the
composed theory, together with other contextual information (as defined by the user), is
loaded into the SPINdle reasoner to perform a compatibility check before the results are returned
to the user.
      </p>
      <p><sup>6</sup> https://github.com/RubenVerborgh/N3.js</p>
      <p><sup>7</sup> Since the LOV endpoint does not support the JSON format in the results, we have uploaded the
data to eventmedia.eurecom.fr/sparql.</p>
      <p>We have evaluated the time performance of the LIVE framework in two steps.
First, we evaluate the time performance of the licenses compatibility module: it needs
about 6 ms to compute the compatibility of two licenses. Second, we evaluate the time
performance (Chrome v. 34) of the whole LIVE framework for the 10 LOD datasets
with the highest number of links towards other LOD datasets, considering both the
licenses retrieval module and the licenses compatibility module. The results show that
LIVE provides the compatibility evaluation in less than 5 seconds for 7 of the selected
datasets. The time performance of LIVE is mostly affected by the first module, while the
compatibility module does not produce a significant overhead. For instance, consider
Linked Dataspaces<sup>8</sup>, a dataset for which we retrieve the licensing information of both
the dataset and the adopted vocabularies. In this case, LIVE retrieves 48
vocabularies in 13.20 s; the license of the dataset is CC-BY, and the PDDL license is attached to one
of the vocabularies<sup>9</sup>. The time for verifying the compatibility is 8 ms, leading to a total
of 13.208 s.</p>
    </sec>
    <sec id="sec-3">
      <title>Future perspectives</title>
      <p>We have introduced the LIVE framework for licenses compatibility. The goal of the
framework is to verify the compatibility of the licenses associated with the vocabularies
used to create an RDF dataset and the license associated with the dataset itself. Several
points remain as future work. In particular, in the present paper
we consider vocabularies as data, but this is not the only possible interpretation. For
instance, we may see vocabularies as a kind of compiler: once the dataset has been
created, the external vocabularies are no longer used. In this case, what is a
suitable way of defining a compatibility verification? We will investigate this issue, and
we will also evaluate the usability of the online LIVE tool in order to improve
the user interface.</p>
      <p><sup>8</sup> http://270a.info/</p>
      <p><sup>9</sup> http://purl.org/linked-data/cube</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Cabrio</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aprosio</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Villata</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>These are your rights: A natural language processing approach to automated rdf licenses generation</article-title>
          .
          <source>In: ESWC2014</source>
          ,
          LNCS
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Governatori</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rotolo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Villata</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gandon</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>One license to compose them all - a deontic logic approach to data licensing on the web of data</article-title>
          .
          <source>In: International Semantic Web Conference (1). Lecture Notes in Computer Science</source>
          , vol.
          <volume>8218</volume>
          , pp.
          <fpage>151</fpage>
          -
          <lpage>166</lpage>
          . Springer (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Heath</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Linked Data: Evolving the Web into a Global Data Space</article-title>
          . Morgan &amp; Claypool
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Lam</surname>
            ,
            <given-names>H.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Governatori</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>The making of SPINdle</article-title>
          .
          <source>In: Proceedings of RuleML, LNCS 5858</source>
          . pp.
          <fpage>315</fpage>
          -
          <lpage>322</lpage>
          . Springer (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>