<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Fuzzy Ontology-Approach to improve Semantic Information Retrieval</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Silvia Calegari</string-name>
          <email>calegari@disco.unimib.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elie Sanchez</string-name>
          <email>elie.sanchez@medecine.univ-mrs.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>7 Bd Jean Moulin</institution>
          ,
          <addr-line>13385 Marseille Cedex 5</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dipartimento di Informatica, Sistemistica e Comunicazione, Università di Milano-Bicocca, V.le Sarca 336/14</institution>
          ,
          <addr-line>20126 Milano</addr-line>
          ,
          <country country="IT">Italia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper shows how a Fuzzy Ontology-based approach can improve the semantic retrieval of documents. After a formal definition of a Fuzzy Knowledge Base, a special type of new non-taxonomic fuzzy relationship, called a (semantic) correlation, is discussed. These correlations, first assigned by experts, are updated after querying, or when a document is inserted into a database. An Information Retrieval algorithm is then introduced that derives a unique path among the entities involved in the query in order to obtain maximal semantic associations in the knowledge domain.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        Ontologies, in the sense of a formal, explicit specification of a shared conceptualisation
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], constitute a key component of the Semantic Web, facilitating a machine-processable
representation of information. Two-valued logical methods are insufficient
to handle the ill-structured, uncertain or imprecise information encountered in real-world
knowledge. A tolerance for imprecision, through a positive use of Fuzzy Logic, may be
exploited to enhance the power of the Semantic Web [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. It has been shown that Fuzzy
Logic makes it possible to bridge the gap between human-understandable soft logic and
machine-readable hard logic. Indeed, Fuzzy Logic has been naturally integrated into
ontologies in order to define a new theoretical paradigm called Fuzzy Ontology [
        <xref ref-type="bibr" rid="ref4 ref5 ref6">4, 5, 6</xref>
        ].
      </p>
      <p>Recently, an increasing number of approaches to Information Retrieval have
proposed models based on concepts rather than on keywords. Accordingly, in this work
ontologies are combined with objects (stored in a database) in order to search for new
documents semantically correlated to the user’s query.</p>
      <p>
        In this paper, the notion of Fuzzy Concept Network (FCN), introduced in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], is
extended by incorporating database objects, so that concepts and documents can
be represented in the network in a similar way. An Information Retrieval algorithm
using an Object-Fuzzy Concept Network (O-FCN) is then introduced and described. This algorithm
derives a unique path among the entities involved in the query in order to obtain
the maximum semantic associations in the knowledge domain.
      </p>
      <p>
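        A formal Fuzzy Ontology is now introduced (see also [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]). This approach
depends purely on an application choice. We consider a formal Fuzzy Ontology
as a quadruple <inline-formula><tex-math>O_F = \{\mathcal{C}, \mathcal{R}, \mathcal{F}, \mathcal{A}\}</tex-math></inline-formula>, where <inline-formula><tex-math>\mathcal{C}</tex-math></inline-formula> is a set of fuzzy concepts (the terms concept and entity
are used interchangeably). The set of entities of the fuzzy ontology is denoted by <inline-formula><tex-math>E</tex-math></inline-formula>. <inline-formula><tex-math>\mathcal{R}</tex-math></inline-formula> is a set
of fuzzy relations. Each <inline-formula><tex-math>R \in \mathcal{R}</tex-math></inline-formula> is an n-ary fuzzy relation on the domain of entities, <inline-formula><tex-math>R : E^n \to [0, 1]</tex-math></inline-formula>.
In particular, <inline-formula><tex-math>\mathcal{R} = T \cup T_{not}</tex-math></inline-formula>, where <inline-formula><tex-math>T</tex-math></inline-formula> is the set of the taxonomic relations
and <inline-formula><tex-math>T_{not}</tex-math></inline-formula> is the set of the non-taxonomic relations. <inline-formula><tex-math>\mathcal{F}</tex-math></inline-formula> is a set of fuzzy relations on the
set of entities <inline-formula><tex-math>E</tex-math></inline-formula> and a specific domain contained in <inline-formula><tex-math>D = \{\text{integers}, \text{strings}, \ldots\}</tex-math></inline-formula>, and
<inline-formula><tex-math>\mathcal{A}</tex-math></inline-formula> is a set of axioms expressed in a proper logical language.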
      </p>
      <p>
        Note that an OWL ontology may even include only instances. In our approach instances
are kept separate from the ontology; the advantage is that one ontology can have multiple sets of instances
that conform to it. Using this definition, it is possible to introduce the notion of Fuzzy
Knowledge Base. Our definition is based on the view of an ontology for the Semantic
Web in which knowledge is expressed in a Description Logic-based ontology as a triple
<inline-formula><tex-math>\langle \mathcal{T}, \mathcal{R}, \mathcal{A} \rangle</tex-math></inline-formula>, where <inline-formula><tex-math>\mathcal{T}</tex-math></inline-formula>, <inline-formula><tex-math>\mathcal{R}</tex-math></inline-formula> and <inline-formula><tex-math>\mathcal{A}</tex-math></inline-formula> are respectively a TBox, an RBox and an ABox [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Thus, by
using a fuzzy ontology, the knowledge of a domain is defined so as to correspond to
a Description Logic (DL) knowledge base.
      </p>
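      <p>Definition 1. A Fuzzy Knowledge Base is a pair defined as:</p>
      <p><disp-formula><tex-math>KB_F = (O_F, I)</tex-math></disp-formula>
where <inline-formula><tex-math>O_F</tex-math></inline-formula> is a Fuzzy Ontology as previously defined and <inline-formula><tex-math>I</tex-math></inline-formula> is a set of instances
associated with the fuzzy ontology. Furthermore, every concept <inline-formula><tex-math>C \in \mathcal{C}</tex-math></inline-formula> is a fuzzy set on the
domain of the instances, defined as <inline-formula><tex-math>C : I \to [0, 1]</tex-math></inline-formula>.</p>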
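      <p>As a minimal illustration of Definition 1, a fuzzy concept can be stored as a mapping from instances to membership degrees in [0, 1]. This is only a sketch under assumptions, not the implementation used in this work: the class names, field names and sample degrees are hypothetical, and the ontology is reduced to its set of concepts.</p>
      <preformat preformat-type="code">
```python
from dataclasses import dataclass, field


@dataclass
class FuzzyConcept:
    """A concept C seen as a fuzzy set C : I -> [0, 1] (hypothetical class)."""
    name: str
    membership: dict = field(default_factory=dict)

    def assign(self, instance: str, value: float) -> None:
        if value > 1.0 or 0.0 > value:
            raise ValueError("membership degree must lie in [0, 1]")
        self.membership[instance] = value

    def degree(self, instance: str) -> float:
        # Instances never assigned a degree are treated as non-members.
        return self.membership.get(instance, 0.0)


@dataclass
class FuzzyKnowledgeBase:
    """KB_F = (O_F, I), with O_F reduced here to its concepts (assumption)."""
    concepts: dict = field(default_factory=dict)
    instances: set = field(default_factory=set)

    def add_membership(self, concept: str, instance: str, value: float) -> None:
        self.instances.add(instance)
        self.concepts.setdefault(concept, FuzzyConcept(concept)).assign(instance, value)


kb = FuzzyKnowledgeBase()
kb.add_membership("architecture", "doc_12", 0.8)   # illustrative degrees only
kb.add_membership("architecture", "photo_3", 0.4)
```
      </preformat>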
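      <p>In this context the set <inline-formula><tex-math>I</tex-math></inline-formula> is identified with the objects stored in the database, i.e.
<inline-formula><tex-math>O_{DB} = I</tex-math></inline-formula> and <inline-formula><tex-math>C : O_{DB} \to [0, 1]</tex-math></inline-formula>. In particular, the set of objects can consist of
documents, digital pictures, notes and so on, i.e. <inline-formula><tex-math>O_{DB} = \{\mathcal{D}, \mathcal{P}, \mathcal{N}, \ldots\}</tex-math></inline-formula>, where <inline-formula><tex-math>\mathcal{D}</tex-math></inline-formula> is a set of
documents, <inline-formula><tex-math>\mathcal{P}</tex-math></inline-formula> is a set of digital pictures, <inline-formula><tex-math>\mathcal{N}</tex-math></inline-formula> is a set of notes, etc.</p>
      <p>A new fuzzy relationship: Correlation. In the Semantic Web area of research, a crucial
topic is the definition of dynamic domain knowledge that adapts itself to the context. To
achieve this aim, it is necessary to handle the trade-off between the correct
definition of an object (given by the ontology structure) and the actual meaning assigned
to the artifact by humans (i.e. the experience-based context assumed by every person
according to his or her specific knowledge).</p>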
      <p>
        In [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] a system was proposed that achieves these objectives. It
consists in determining a semantic correlation among the entities that are searched
together, for example in a query, or when a document is inserted into the database.
In particular, a fuzzy weight on the correlations is also assigned by an expert, according to
his or her experience, during the definition of the ontological domain. A
correlation is a binary non-taxonomic fuzzy relation <inline-formula><tex-math>corr : E \times E \to [0, 1]</tex-math></inline-formula>, where
<inline-formula><tex-math>E = \{e_1, e_2, \ldots, e_n\}</tex-math></inline-formula> is the set of the entities contained in the ontology. This defines
how the entities are linked semantically. The closer the <inline-formula><tex-math>corr</tex-math></inline-formula> value is to 1, the more
strongly the two entities considered are semantically associated.
      </p>
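      <p>The correlation relation described above can be sketched as a sparse table of weighted entity pairs. The following is an illustrative sketch only: the class and method names are hypothetical, the pairs are stored as ordered pairs because no symmetry is stated here, unrecorded pairs are assumed to have correlation 0, and the sample values are made up. The rule by which correlations are updated after a query is not reproduced.</p>
      <preformat preformat-type="code">
```python
class Correlations:
    """corr : E x E -> [0, 1], stored sparsely (hypothetical helper class)."""

    def __init__(self):
        self._corr = {}

    def set(self, e1: str, e2: str, value: float) -> None:
        if value > 1.0 or 0.0 > value:
            raise ValueError("correlation must lie in [0, 1]")
        self._corr[(e1, e2)] = value

    def get(self, e1: str, e2: str) -> float:
        # Pairs with no recorded correlation are treated as unrelated (assumption).
        return self._corr.get((e1, e2), 0.0)

    def strongest(self, e1: str):
        """Return (entity, value) for the entity most correlated with e1, or None."""
        candidates = [(b, v) for (a, b), v in self._corr.items() if a == e1]
        return max(candidates, key=lambda pair: pair[1]) if candidates else None


corr = Correlations()
corr.set("model", "scale", 0.9)    # values an expert might assign (made up)
corr.set("model", "colour", 0.3)
```
      </preformat>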
      <p>In this way, the fuzzy ontology offers a solution to the trade-off in the knowledge
base and can dynamically adapt itself to the context in which it is introduced.</p>
      <p>Information Retrieval Algorithm using O-FCN and its Evaluation</p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] we introduced a Fuzzy Concept Network (FCN) to represent the dynamic
behaviour of fuzzy ontologies. In particular, the FCN representation lets us introduce
a new semantic network based on the correlations defined in the fuzzy ontology. An
ontology, however, makes it possible to handle a complete knowledge base and thus to reason on the
instances. In this work we extend this possibility by inserting the objects of the domain
stored in the database directly into the FCN. In this way, we can reason directly with
the elements of the specific application simply by visiting the FCN graph. An extended
FCN definition that includes the objects of the domain is given below.
      </p>
      <p>
        Definition 2. An Object-Fuzzy Concept Network (O-FCN) is a weighted graph <inline-formula><tex-math>N_{fo} = \{O_{DB}, N_f\}</tex-math></inline-formula>,
where <inline-formula><tex-math>O_{DB}</tex-math></inline-formula> is the set of the objects stored in the database and <inline-formula><tex-math>N_f = \{E, F, m\}</tex-math></inline-formula>
is a Fuzzy Concept Network (FCN). Each object is described by entities
of the FCN, i.e. <inline-formula><tex-math>\forall o_i \in O_{DB},\ o_i = \{e_1, \ldots, e_n\}</tex-math></inline-formula> where <inline-formula><tex-math>e_1, \ldots, e_n \in E</tex-math></inline-formula>.
      </p>
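      <p>To make Definition 2 concrete, the sketch below stores an O-FCN as a set of entities, a table of weighted links and a table of objects, each object being described by a set of entities. It is an illustration under assumptions rather than the prototype itself: all names are hypothetical, the link weights stand in for correlation values, and the components F and m of the FCN are not modelled beyond those weights.</p>
      <preformat preformat-type="code">
```python
from dataclasses import dataclass, field


@dataclass
class OFCN:
    """Object-Fuzzy Concept Network sketch: entities, weighted links, objects."""
    entities: set = field(default_factory=set)
    links: dict = field(default_factory=dict)      # (e1, e2) -> weight in [0, 1]
    objects: dict = field(default_factory=dict)    # object id -> set of entities

    def link(self, e1: str, e2: str, weight: float) -> None:
        self.entities.update((e1, e2))
        self.links[(e1, e2)] = weight

    def add_object(self, obj_id: str, described_by: set) -> None:
        # Definition 2 requires every describing entity to belong to E.
        unknown = described_by - self.entities
        if unknown:
            raise ValueError(f"entities not in the network: {sorted(unknown)}")
        self.objects[obj_id] = set(described_by)

    def objects_for(self, entity: str) -> set:
        """Objects of O_DB whose description contains the given entity."""
        return {o for o, ents in self.objects.items() if entity in ents}


net = OFCN()
net.link("model", "scale", 0.9)                 # made-up weight
net.add_object("doc_1", {"model", "scale"})
net.add_object("photo_7", {"scale"})
```
      </preformat>
      <p>Keeping the objects in the same structure as the entities is what lets a single graph visit reach both concepts and documents.</p>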
      <p>The set <inline-formula><tex-math>O_{DB}</tex-math></inline-formula> identifies all the information contained in the database, such
as documents, digital pictures, videos, and so on.</p>
      <p>Fig. 1 gives a 3D graphical representation of the prototype of a small O-FCN.
The thickness of a link indicates how strongly the two entities are correlated:
the thicker the link, the more correlated the entities (i.e. the closer the
fuzzy value is to 1).</p>
      <p>
        A recent application area for Information Retrieval Systems (IRS) is Semantic Web
research. Indeed, the need for a better definition of IRS emerged in order to
retrieve semantic information considered useful for a user query. Information Retrieval
is a domain that involves the organization, storage, retrieval and display of information
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In order to extend the query vector, a new algorithm based on
fuzzy ontology has been proposed. When navigating the O-FCN it is possible to find semantic links among
the concepts: for each term specified in the query, a unique path is defined at each step,
corresponding to the maximum correlation value. A brief step-by-step description of
this new algorithm is given below (see also Fig. 2):
      </p>
      <preformat>'O-FCN'-IR Search (E_q : word vector)
  1: 'O-FCN'-based E_q extension (pruning phase)
  2: 'O-FCN'-based documents extraction
  3: 'O-FCN'-based relevance calculation (cosine distance)
  return ranking of the documents</preformat>
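      <p>The three steps listed above can be sketched as follows. This is a hedged illustration, not the algorithm as implemented in this work: the function names, the pruning threshold, the bound on the path length, the weight given to added entities and the use of 0/1 document vectors are all assumptions made to obtain a runnable example, and the ranking uses cosine similarity (higher is better) as a stand-in for the cosine measure mentioned in step 3.</p>
      <preformat preformat-type="code">
```python
import math


def extend_query(query, links, threshold=0.5, max_steps=3):
    """Step 1: from each query term follow the strongest link, pruning weak ones.

    Returns {entity: weight}; query terms get weight 1.0 and each added entity
    gets the weight of the link that reached it (an assumed weighting scheme).
    """
    weights = {term: 1.0 for term in query}
    for term in query:
        current = term
        for _ in range(max_steps):
            out = [(b, w) for (a, b), w in links.items()
                   if a == current and b not in weights]
            if not out:
                break
            nxt, w = max(out, key=lambda pair: pair[1])
            if threshold > w:          # pruning: stop at weak correlations
                break
            weights[nxt] = w
            current = nxt
    return weights


def extract_documents(weights, objects):
    """Step 2: keep the objects described by at least one extended entity."""
    return {o: ents for o, ents in objects.items() if ents.intersection(weights)}


def cosine(weights, entities):
    """Step 3: cosine similarity between query weights and a 0/1 document vector."""
    dot = sum(w for e, w in weights.items() if e in entities)
    norm_q = math.sqrt(sum(w * w for w in weights.values()))
    norm_d = math.sqrt(len(entities))
    return dot / (norm_q * norm_d) if norm_q and norm_d else 0.0


def ofcn_ir_search(query, links, objects, threshold=0.5):
    """Run the three steps and return (object, score) pairs, best first."""
    weights = extend_query(query, links, threshold)
    docs = extract_documents(weights, objects)
    scored = [(o, cosine(weights, ents)) for o, ents in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up network: link weights and object descriptions are illustrative only.
links = {("model", "scale"): 0.9, ("scale", "plan"): 0.4, ("model", "colour"): 0.3}
objects = {"d1": {"model", "scale"}, "d2": {"plan"}, "d3": {"scale", "colour"}}
ranking = ofcn_ir_search(["model"], links, objects)
```
      </preformat>
      <p>In this toy network the query term "model" is extended with "scale", the weaker link to "plan" is pruned, and the object described only by "plan" is therefore not extracted.</p>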
      <p>
        The O-FCN is involved in all the steps of the algorithm in order to
semantically enrich the results obtained. As a result, document retrieval is easier
to process than in the previous version, which used only the FCN [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The algorithm input is a
vector <inline-formula><tex-math>E_q</tex-math></inline-formula> identifying the terms in the query. The first step (1) uses these terms to locate
the unique path with the maximum correlation value among them. <inline-formula><tex-math>E_q</tex-math></inline-formula> is extended by
navigating the O-FCN recursively. The “pruning phase” is now inserted directly into the
query extension algorithm. In this way, it is possible to find immediately the important
entities, i.e. those that are most semantically correlated with the <inline-formula><tex-math>E_q</tex-math></inline-formula> set. In step (2) the O-FCN
is used to extract the documents directly from the network. Finally,
in the last step, the O-FCN is used to calculate the relevance of the documents in order to
sort them in decreasing order. The final score of a document is evaluated through a
cosine distance among the weights of each entity; this is done for normalisation purposes.
These scores are finally sorted in order to obtain a ranking of the documents.
      </p>
      <p>
        Evaluation. A creative learning environment is the context chosen to test the new
Information Retrieval algorithm based on the O-FCN. In particular, the ATELIER (Architecture
and Technologies for Inspirational Learning Environments) project was used.
ATELIER is an EU-funded project that was part of the Disappearing Computer
initiative. The aim of this project was to build a digitally enhanced environment supporting
a creative learning process in architecture and interaction design education. In this
context, the evolution of the O-FCN is mainly driven by the words of the
documents inserted in a hyper-media database (HMDB) and by the entities written
by the students when defining a query.
      </p>
      <p>We studied the dynamic evolution of the O-FCN by examining 485 documents and
200 queries submitted by the students. For each query a user could include up to
5 different concepts and could semantically enrich the request using the
following list of concept modifiers: little, enough, moderately, quite, very, totally.
The algorithm was tested in two different situations: the classical (crisp) and the fuzzy approach.
In the crisp case, the value 1.0 was assigned to all the
correlation values and the concept modifiers were not included in the students' queries.
In the fuzzy case, all the parameters described in this paper were considered.</p>
      <p>
        Fuzzy recall and fuzzy precision measures [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] are the parameters used to
evaluate the retrieval algorithm in these two situations, the crisp and the fuzzy case.
Table 1 reports the average values of fuzzy precision and fuzzy recall for the 200
queries performed with the two approaches. Retrieved documents are ranked up to a
threshold <inline-formula><tex-math>\theta</tex-math></inline-formula>. In particular, we chose three values of <inline-formula><tex-math>\theta</tex-math></inline-formula> (0.35, 0.50 and 0.75) to
validate the algorithm in different situations.
      </p>
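      <p>The exact definitions of fuzzy precision and fuzzy recall are those of the cited reference and are not restated here. As a hedged sketch, the code below uses a common fuzzy-set generalisation in which the overlap between retrieved and relevant documents is the sum of the minima of their degrees; this choice, the function name, the way the threshold is applied and the sample degrees are assumptions made for illustration.</p>
      <preformat preformat-type="code">
```python
def fuzzy_precision_recall(retrieved, relevant, theta=0.5):
    """Fuzzy precision and recall over documents scored at least theta.

    retrieved: {doc: retrieval score in [0, 1]}
    relevant:  {doc: judged relevance degree in [0, 1]}
    The min-based overlap is an assumed definition, not taken from the paper.
    """
    kept = {d: s for d, s in retrieved.items() if s >= theta}
    overlap = sum(min(s, relevant.get(d, 0.0)) for d, s in kept.items())
    total_retrieved = sum(kept.values())
    total_relevant = sum(relevant.values())
    precision = overlap / total_retrieved if total_retrieved else 0.0
    recall = overlap / total_relevant if total_relevant else 0.0
    return precision, recall


# Made-up scores and relevance degrees, evaluated at two of the thresholds.
retrieved = {"d1": 0.9, "d2": 0.6, "d3": 0.2}
relevant = {"d1": 1.0, "d2": 0.3, "d4": 0.7}
p_050, r_050 = fuzzy_precision_recall(retrieved, relevant, theta=0.50)
p_075, r_075 = fuzzy_precision_recall(retrieved, relevant, theta=0.75)
```
      </preformat>
      <p>Raising the threshold keeps fewer documents, which in this toy example raises precision and lowers recall.</p>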
      <p>In Table 1 the crisp approach appears similar to the fuzzy one, and the relevance
of the retrieved documents is more or less the same. In the fuzzy case, however, a
better accuracy of the relevant documents has been observed. This result was derived
from an analysis of the coefficient of variation based on the fuzzy precision measure (here <inline-formula><tex-math>CV_P</tex-math></inline-formula>).
In detail,
<disp-formula><tex-math>CV_P := \left( \frac{\sigma}{P_F} \right) \cdot 100</tex-math></disp-formula>
where <inline-formula><tex-math>\sigma</tex-math></inline-formula> is the standard deviation calculated on the
relevance of the documents and <inline-formula><tex-math>P_F</tex-math></inline-formula> is the fuzzy precision. The coefficient of variation is a useful statistic for
comparing the degree of variation from one data series to another: in general, the larger
this number, the greater the variability in the data. Figure 3 depicts the trend of <inline-formula><tex-math>CV_P</tex-math></inline-formula>
for the fuzzy and crisp approaches, for each query. We observe
higher <inline-formula><tex-math>CV_P</tex-math></inline-formula> values in the fuzzy case for all the queries analysed. This means that the
fuzzy approach identifies more refinement and accuracy than the crisp one.</p>
      <p>Fig. 3. Trend of CVP value for each query. (Series: Fuzzy, Crisp; axes: θ-value at 0.35, 0.5 and 0.75, and # query.)</p>
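      <p>The coefficient of variation defined above is a one-line computation. The sketch below assumes the population form of the standard deviation, which is not specified in the text; the function name and the sample relevance values are hypothetical.</p>
      <preformat preformat-type="code">
```python
import statistics


def cv_p(relevances, fuzzy_precision):
    """CV_P = (sigma / P_F) * 100, sigma being the std. dev. of the relevances."""
    if fuzzy_precision == 0:
        raise ValueError("fuzzy precision must be non-zero")
    sigma = statistics.pstdev(relevances)   # population form: an assumption
    return sigma / fuzzy_precision * 100


# Made-up relevance scores: a spread-out series and a constant one.
spread = cv_p([0.2, 0.4, 0.6, 0.8], fuzzy_precision=0.5)
flat = cv_p([1.0, 1.0, 1.0], fuzzy_precision=0.8)
```
      </preformat>
      <p>A series of identical relevance values has zero standard deviation and therefore a coefficient of zero, whereas graded relevance values yield a positive coefficient.</p>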
    </sec>
    <sec id="sec-2">
      <title>Conclusion</title>
      <p>It has been shown how the introduction of Fuzzy Ontologies, derived models and new
structures can improve an Information Retrieval System. More extensive developments
will be presented in a forthcoming journal paper. The methodology handles the
trade-off between the correct definition of an object, given by the ontology structure,
and the actual meaning assigned by individuals. It thus offers the opportunity to
exploit additional knowledge hidden in entity-document relationships, or semantic
correlations, after querying a database, and also to enrich the semantics of the system.
After analysis, the relevance results showed better accuracy in the fuzzy
case than in the crisp one.</p>
    </sec>
    <sec id="sec-3">
      <title>Acknowledgements</title>
      <p>The work presented in this paper has been partially supported by the ATELIER project
(IST-2001-33064). Particular thanks are due to Fabio Farina for his contribution to the
numerical validation section.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Gruber</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>A Translation Approach to Portable Ontology Specifications</article-title>
          .
          <source>Knowledge Acquisition</source>
          <volume>5</volume>
          (
          <year>1993</year>
          )
          <fpage>199</fpage>
          -
          <lpage>220</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Sanchez</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Fuzzy Logic and the Semantic Web</article-title>
          .
          <source>Capturing Intelligence</source>
          .
          <source>Elsevier</source>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Zadeh</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>From Search Engines to Question-Answering Systems - The Problems of World Knowledge, Relevance, Deduction and Precisiation</article-title>
          . In
          <string-name>
            <surname>Sanchez</surname>
          </string-name>
          , E., ed.:
          <article-title>Fuzzy Logic and the Semantic Web</article-title>
          .
          <source>Capturing Intelligence</source>
          .
          <source>Elsevier</source>
          (
          <year>2006</year>
          )
          <fpage>163</fpage>
          -
          <lpage>210</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Calegari</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ciucci</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Fuzzy Ontology, Fuzzy Description Logics and Fuzzy-OWL</article-title>
          .
          <source>In: Proceedings of WILF 2007</source>
          .
          <article-title>Volume 4578 of LNCS</article-title>
          . (
          <year>2007</year>
          )
          <article-title>In printing</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Calegari</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ciucci</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Fuzzy Ontology and Fuzzy-OWL in the KAON Project</article-title>
          .
          <source>In: FUZZIEEE 2007. IEEE International Conference on Fuzzy Systems</source>
          (
          <year>2007</year>
          )
          <article-title>In printing</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Sanchez</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yamanoi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Fuzzy ontologies for the semantic web</article-title>
          . In Larsen,
          <string-name>
            <given-names>H.L.</given-names>
            ,
            <surname>Pasi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            ,
            <surname>Arroyo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.O.</given-names>
            ,
            <surname>Andreasen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Christiansen</surname>
          </string-name>
          , H., eds.
          <source>: FQAS. LNCS 4027</source>
          , Springer (
          <year>2006</year>
          )
          <fpage>691</fpage>
          -
          <lpage>699</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Calegari</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Farina</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Fuzzy Ontologies and Scale-free Networks Analysis</article-title>
          .
          <source>International Journal of Computer Science and Applications IV(II)</source>
          (
          <year>2007</year>
          )
          <fpage>125</fpage>
          -
          <lpage>144</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McGuinness</surname>
            ,
            <given-names>D.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
          </string-name>
          , P.F., eds.:
          <article-title>The Description Logic Handbook: Theory, Implementation, and Applications</article-title>
          . Cambridge University Press (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Salton</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McGill</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          :
          <article-title>Introduction to Modern Information Retrieval</article-title>
          .
          <publisher-name>McGraw-Hill</publisher-name>
          , Inc., New York, NY, USA (
          <year>1986</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Sanchez</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pierre</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Fuzzy Logic and Genetic Algorithms in Information Retrieval</article-title>
          . In Yamakawa, T., ed.
          <source>: Proceedings of the 3rd Int. Conf. on Fuzzy Logic, Neural Nets and Soft Computing</source>
          , Jono Printing Co.
          (
          <year>1994</year>
          )
          <fpage>29</fpage>
          -
          <lpage>35</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>