<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An FCA Framework for Knowledge Discovery in SPARQL Query Answers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Melisachew Wudage Chekol</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Amedeo Napoli</string-name>
          <email>amedeo.napoli@inria.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>LORIA (INRIA, CNRS, and Université de Lorraine)</institution>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Formal concept analysis (FCA) is used for knowledge discovery within data. In FCA, concept lattices are well suited to the classification and organization of data. Hence, they can also be used to visualize the answers of a SPARQL query instead of the usual answer formats such as RDF/XML, JSON, CSV, and HTML. Consequently, in this work, we apply FCA to reveal and visualize hidden relations within SPARQL query answers by means of concept lattices.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Recently, the amount of semantically rich data available on the Web has grown
considerably. Since the conception of the linked data publishing principles, over 295
linked (open) datasets (LOD) have been produced<sup>1</sup>. A reasonable number of
these datasets provide endpoints for querying their contents.
Querying is mainly done through SPARQL, the query language recommended by the W3C.
The answers to SPARQL queries are usually returned in formats such as RDF/XML,
JSON, CSV, text, TSV, JavaScript, XML, spreadsheet, N-Triples, and HTML. It
might be interesting to analyse, mine, and then visualize hidden relations within
the answers. These tasks can be carried out using Formal Concept Analysis
(FCA).</p>
      <p>
        FCA is used for knowledge discovery within data represented by means of
objects and their attributes [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Concept lattices can reveal hidden relations
within data and can be used for organizing, classifying, and even mining data.
A survey of the benefits of FCA to the semantic web (SW), and vice versa, is
given in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] (in particular ontology completion [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]). Additionally, studies in
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] are based on FCA for managing SW data. The former provides a navigable
entry point to linked data by means of questions. It
transforms an RDF graph into a formal context in which the subject
of an RDF triple becomes an object and the composition of the predicate and object
of the triple becomes an attribute. The latter requires the user to specify variables
corresponding to the objects and attributes of a context. These variables are used
to create a SPARQL query, which is used to extract content from linked data
in order to build the formal context. Following this line, we propose a way to
organize Semantic Web data, and more precisely SPARQL query answers, by means
of concept lattices. As a result, the user is able to visualize, navigate, and
classify the answers w.r.t. their context.
      </p>
      <p><sup>1</sup> http://linkeddata.org/</p>
      <p>[Figure 1: architecture of the framework. Diagram labels: Keyword, SELECT Query, Object and Attribute, Linked data, CONSTRUCT Query, Formal Context, Concept Lattice.]</p>
      <sec id="sec-1-7">
        <p>
          For that, we
propose the architecture depicted in Figure 1, based on three components, which
are discussed below:
1. Keyword search: In this component, a keyword (for instance, "14 juillet") is
used to search a specified dataset for information about that keyword.
To do so, a URI produced from the keyword is sent to the dataset to check
whether it exists. If it does, a CONSTRUCT query containing the
keyword is sent to the endpoint of the dataset. The answers are collected
and organized to create a formal context, as explained in the next section.
2. SELECT Query: This component builds a formal context out of the answers
of a SPARQL query. SELECT queries are converted into CONSTRUCT
queries so that the answers form RDF graphs. Again, this is illustrated
in the next section.
3. Variables corresponding to objects and attributes of a formal context: This
component enables the user to specify precisely the objects and attributes
of the formal context. A SPARQL query is formed from these variables and sent
to a chosen SPARQL endpoint. The answers of the query are collected to
build a formal context, from which a concept lattice is constructed. This is
the approach considered by the authors in [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. An example is proposed in
the next section.
        </p>
        <p>In each case, the objective is to build a formal context and then to build the
associated concept lattice.</p>
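        <p>The conversion of a SELECT query into a CONSTRUCT query, used by the second component, could look roughly as follows. This is a minimal Python sketch under stated assumptions, not the implementation of the framework: it assumes that the query has the simple form SELECT ... WHERE { basic graph pattern }, without FILTER, OPTIONAL, or solution modifiers such as LIMIT, and it reuses the WHERE pattern as the CONSTRUCT template. The helper name select_to_construct and the example query are hypothetical.</p>
        <preformat><![CDATA[
```python
import re

# Hypothetical helper: rewrite "SELECT ... WHERE { P }" as
# "CONSTRUCT { P } WHERE { P }".
# Assumption: P is a plain basic graph pattern (no FILTER, OPTIONAL, or
# subqueries) and nothing follows the closing brace (no LIMIT, ORDER BY),
# so P is also a valid CONSTRUCT template.
_SELECT_SHAPE = re.compile(
    r"^(?P<prologue>.*?)\bSELECT\b.*?\bWHERE\s*\{(?P<pattern>.*)\}\s*$",
    flags=re.IGNORECASE | re.DOTALL,
)


def select_to_construct(select_query: str) -> str:
    """Return a CONSTRUCT query whose template is the WHERE pattern."""
    match = _SELECT_SHAPE.match(select_query.strip())
    if match is None:
        raise ValueError("expected a query of the form SELECT ... WHERE { ... }")
    prologue = match.group("prologue")  # PREFIX declarations, kept unchanged
    pattern = match.group("pattern").strip()
    return f"{prologue}CONSTRUCT {{ {pattern} }} WHERE {{ {pattern} }}"


if __name__ == "__main__":
    example = (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?film ?genre WHERE { ?film dbo:genre ?genre }"
    )
    print(select_to_construct(example))
```
]]></preformat>
        <p>The triples produced by such a CONSTRUCT query can then be turned into a formal context in the same way as any other RDF graph.</p>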
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Proposal</title>
      <p>
        A formal context represents data using objects and their attributes. Formally, it
is a triple K = (G, M, I) where G is a set of objects, M is a set of attributes,
and I ⊆ G × M is a binary relation. A derivation operator (′) is used to compute
the formal concepts of a context. Given a set of objects A, the derivation operator
computes the maximal set of attributes shared by the objects in A, denoted
by A′ (dually, for a set of attributes B, B′ is the maximal set of objects that have all the attributes in B). A formal concept is a pair
(A, B) where A′ = B and B′ = A. The set of formal concepts, ordered by set
inclusion, forms a concept lattice [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
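      <p>The derivation operators and the enumeration of formal concepts can be illustrated with a small Python sketch. The toy context below is hypothetical, and the enumeration is deliberately naive (it closes every subset of objects, which is exponential in the number of objects); it is meant only to make the definitions concrete, not to stand for the algorithm used in the framework.</p>
      <preformat><![CDATA[
```python
from itertools import combinations

# Hypothetical toy context: each object is mapped to the attributes it has.
CONTEXT = {
    "film1": {"drama", "comedy"},
    "film2": {"drama"},
    "film3": {"comedy", "romance"},
}
ALL_ATTRIBUTES = set().union(*CONTEXT.values())


def common_attributes(objects):
    """A' : attributes shared by every object in A (all of M if A is empty)."""
    shared = set(ALL_ATTRIBUTES)
    for obj in objects:
        shared.intersection_update(CONTEXT[obj])
    return shared


def common_objects(attributes):
    """B' : objects that have every attribute in B."""
    wanted = set(attributes)
    return {obj for obj, attrs in CONTEXT.items() if wanted.issubset(attrs)}


def formal_concepts():
    """Naive enumeration: for every object subset A, (A'', A') is a concept."""
    concepts = set()
    objects = sorted(CONTEXT)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset)
            extent = common_objects(intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    # Print concepts from the smallest extent to the largest.
    for extent, intent in sorted(formal_concepts(), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```
]]></preformat>
      <p>Ordering the resulting pairs by inclusion of their object sets gives the concept lattice of the toy context.</p>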
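      <p>Definition 1 (RDF as a Formal Context). Given an RDF graph G and a
transformation function (written σ here), a formal context is obtained from G as follows:</p>
      <p>– If ⟨o<sub>1</sub>, rdf:type, C⟩ ∈ G, then σ(⟨o<sub>1</sub>, rdf:type, C⟩) is the cross-table in which o<sub>1</sub> has the attribute C:</p>
      <table-wrap>
        <table>
          <thead>
            <tr><th/><th>C</th></tr>
          </thead>
          <tbody>
            <tr><td>o<sub>1</sub></td><td>×</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>– If ⟨o<sub>1</sub>, R, o<sub>2</sub>⟩ ∈ G, then σ(⟨o<sub>1</sub>, R, o<sub>2</sub>⟩) is the cross-table in which o<sub>1</sub> has the attribute ∃R.⊤ and o<sub>2</sub> has the attribute ∃R⁻.⊤:</p>
      <table-wrap>
        <table>
          <thead>
            <tr><th/><th>∃R.⊤</th><th>∃R⁻.⊤</th></tr>
          </thead>
          <tbody>
            <tr><td>o<sub>1</sub></td><td>×</td><td/></tr>
            <tr><td>o<sub>2</sub></td><td/><td>×</td></tr>
          </tbody>
        </table>
      </table-wrap>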
      <p>
        We consider a core fragment of RDFS called ρdf [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which contains the
minimal vocabulary ρdf = {sp, sc, type, dom, range}, where sp denotes the
subproperty relation, sc the subclass relation, and dom and range the domain and range of a property. This fragment
was proven to be minimal and well-behaved in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Its semantics corresponds to
that of full RDFS. Triples containing schema information are transformed as follows:
1. If ⟨C<sub>1</sub>, rdfs:subClassOf, C<sub>2</sub>⟩ ∈ G, then σ(⟨C<sub>1</sub>, rdfs:subClassOf, C<sub>2</sub>⟩) =
∀o ∈ G. (o, C<sub>1</sub>) ∈ I ⇒ (o, C<sub>2</sub>) ∈ I.
2. If ⟨R<sub>1</sub>, rdfs:subPropertyOf, R<sub>2</sub>⟩ ∈ G, then σ(⟨R<sub>1</sub>, rdfs:subPropertyOf, R<sub>2</sub>⟩)
= ∀o<sub>1</sub>, o<sub>2</sub> ∈ G. (o<sub>1</sub>, ∃R<sub>1</sub>.⊤) ∈ I ⇒ (o<sub>1</sub>, ∃R<sub>2</sub>.⊤) ∈ I.
3. If ⟨R, rdfs:domain, C⟩ ∈ G, then σ(⟨R, rdfs:domain, C⟩) = ∀o<sub>1</sub> ∈ G.
(o<sub>1</sub>, ∃R.⊤) ∈ I ⇒ (o<sub>1</sub>, C) ∈ I.
4. If ⟨R, rdfs:range, C⟩ ∈ G, then σ(⟨R, rdfs:range, C⟩) = ∀o<sub>2</sub> ∈ G.
(o<sub>2</sub>, ∃R⁻.⊤) ∈ I ⇒ (o<sub>2</sub>, C) ∈ I.
      </p>
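      <p>The transformation of instance triples and the four schema rules can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the authors' implementation: triples are plain tuples of prefixed names, the attribute ∃R.⊤ is encoded as the string "exists R" and ∃R⁻.⊤ as "exists R^-", the schema rules are applied repeatedly until no new incidence pair is derived, and the function names and example triples are hypothetical.</p>
      <preformat><![CDATA[
```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROPERTY = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
SCHEMA_PREDICATES = {SUBCLASS, SUBPROPERTY, DOMAIN, RANGE}


def exists(prop, inverse=False):
    """String encoding (an assumption) of the attributes 'exists R.T' and 'exists R^-.T'."""
    return f"exists {prop}^-" if inverse else f"exists {prop}"


def rdf_to_incidence(triples):
    """Return the incidence relation I as a set of (object, attribute) pairs."""
    triples = list(triples)
    incidence = set()
    # Instance triples: the two cases of the definition.
    for s, p, o in triples:
        if p == RDF_TYPE:
            incidence.add((s, o))
        elif p not in SCHEMA_PREDICATES:
            incidence.add((s, exists(p)))
            incidence.add((o, exists(p, inverse=True)))
    # Schema triples: each one is an implication "premise => conclusion"
    # between attributes, applied until a fixpoint is reached.
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == SUBCLASS:
                premise, conclusion = s, o
            elif p == SUBPROPERTY:
                premise, conclusion = exists(s), exists(o)
            elif p == DOMAIN:
                premise, conclusion = exists(s), o
            elif p == RANGE:
                premise, conclusion = exists(s, inverse=True), o
            else:
                continue
            derived = {(obj, conclusion) for obj, attr in incidence if attr == premise}
            if not derived.issubset(incidence):
                incidence.update(derived)
                changed = True
    return incidence


if __name__ == "__main__":
    # Hypothetical example graph.
    graph = [
        ("ex:film1", "rdf:type", "ex:Comedy"),
        ("ex:film1", "ex:directedBy", "ex:person1"),
        ("ex:Comedy", "rdfs:subClassOf", "ex:Film"),
        ("ex:directedBy", "rdfs:subPropertyOf", "ex:madeBy"),
        ("ex:directedBy", "rdfs:domain", "ex:Film"),
        ("ex:directedBy", "rdfs:range", "ex:Director"),
    ]
    for pair in sorted(rdf_to_incidence(graph)):
        print(pair)
```
]]></preformat>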
      <p>Using the transformations above, users are able to build a formal context from an RDF graph or from a set
of SPARQL query answers. Several existing algorithms can then compute
the concept lattice associated with a formal context, and any of them can be used in our
framework.</p>
      <p>Example: consider a SPARQL query that selects film titles (as objects of the
formal context) and genres (as attributes) from DBpedia to populate a formal
context. The resulting formal context is shown in Figure 2.</p>
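      <p>A query of this kind, and the step from its answers to a formal context, might look as follows in Python. This is a hedged sketch: the query text (including the use of dbo:Film, rdfs:label, and dbo:genre) is an assumption about how the example could be phrased, the sample response is made-up placeholder data in the W3C SPARQL JSON results layout, and the function names are hypothetical. No endpoint is contacted.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict

# Assumed phrasing of the example query (not taken from the paper).
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?title ?genre WHERE {
  ?film a dbo:Film ;
        rdfs:label ?title ;
        dbo:genre ?genre .
}
"""

# Placeholder answers in the SPARQL JSON results layout
# (head.vars and results.bindings); the values are invented.
SAMPLE_RESPONSE = {
    "head": {"vars": ["title", "genre"]},
    "results": {"bindings": [
        {"title": {"type": "literal", "value": "Film A"},
         "genre": {"type": "literal", "value": "Drama"}},
        {"title": {"type": "literal", "value": "Film A"},
         "genre": {"type": "literal", "value": "Comedy"}},
        {"title": {"type": "literal", "value": "Film B"},
         "genre": {"type": "literal", "value": "Drama"}},
    ]},
}


def context_from_bindings(response, object_var, attribute_var):
    """Map each value of object_var to the set of its attribute_var values."""
    context = defaultdict(set)
    for binding in response["results"]["bindings"]:
        if object_var in binding and attribute_var in binding:
            obj = binding[object_var]["value"]
            context[obj].add(binding[attribute_var]["value"])
    return dict(context)


def cross_table(context):
    """Render the context as a tab-separated cross-table with 'x' marks."""
    attributes = sorted(set().union(*context.values()))
    lines = ["\t".join([""] + attributes)]
    for obj in sorted(context):
        marks = ["x" if attr in context[obj] else "" for attr in attributes]
        lines.append("\t".join([obj] + marks))
    return "\n".join(lines)


if __name__ == "__main__":
    films = context_from_bindings(SAMPLE_RESPONSE, "title", "genre")
    print(cross_table(films))
```
]]></preformat>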
      <p>A possible concept lattice obtained from the formal context associated with
the query answers is depicted in Figure 3. We now have a classification of the
results of the query w.r.t. a given topic or constraint. This is comparable to
querying the Web with a search engine such as Google, except that here the
answers are classified w.r.t. the user's constraints.</p>
    </sec>
    <sec id="sec-3">
      <title>Conclusion</title>
      <p>SPARQL query answers are provided in different formats (RDF/XML, CSV,
JSON, TTL, and others), which do not reveal hidden semantics in the answers.</p>
      <p>Concept lattices are useful in this regard. In this work, we used concept lattices
to hierarchically organize and analyse the content of query answers.</p>
      <p>This is ongoing work, and we are currently implementing the procedure.
We still have to investigate how well it scales, given the size of SPARQL query answers
over linked data. Overall, this work shows some of the benefits that FCA can
bring to the semantic web.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sertkaya</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sattler</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          :
          <article-title>Completing description logic knowledge bases using formal concept analysis</article-title>
          .
          <source>In: Proc. of IJCAI</source>
          . vol.
          <volume>7</volume>
          , pp.
          <fpage>230</fpage>
          –
          <lpage>235</lpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>d'Aquin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motta</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Extracting relevant questions to an RDF dataset using formal concept analysis</article-title>
          .
          <source>In: Proceedings of the Sixth International Conference on Knowledge Capture</source>
          . pp.
          <fpage>121</fpage>
          –
          <lpage>128</lpage>
          .
          <publisher-name>ACM</publisher-name>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Ganter</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wille</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <source>Formal Concept Analysis</source>
          . Springer, Berlin (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Kirchberg</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leonardi</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tan</surname>
            ,
            <given-names>Y.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Link</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ko</surname>
            ,
            <given-names>R.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>B.S.</given-names>
          </string-name>
          :
          <article-title>Formal concept discovery in semantic web data</article-title>
          .
          <source>In: ICFCA</source>
          . pp.
          <fpage>164</fpage>
          –
          <lpage>179</lpage>
          . Springer-Verlag (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Muñoz</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pérez</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gutierrez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Minimal deductive systems for RDF</article-title>
          .
          <source>In: The Semantic Web: Research and Applications</source>
          , pp.
          <fpage>53</fpage>
          –
          <lpage>67</lpage>
          . Springer (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Sertkaya</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>A survey on how description logic ontologies benefit from FCA</article-title>
          .
          <source>In: CLA</source>
          . vol.
          <volume>672</volume>
          , pp.
          <fpage>2</fpage>
          –
          <lpage>21</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>