Using Concept Formal Analysis for Cooperative
     8VLQJ&RQFHSW)RUPDO$QDO\VLVIRU&RRSHUDWLYH
              Information Retrieval
                                 ,QIRUPDWLRQ5HWULHYDO
                                            
                   Ibtissem Nafkha1 , Samir Elloumi1
                                                     , and Ali Jaoua2
  1
                             Ibtissem NAFKHA1, Samir ELLOUMI1, Ali JAOUA2
      University
            1        of   Tunis,
              University of Tunis,Department
                                        DepartmentofofComputer     Science,Campus
                                                       Computer Science,     Campus   Universitaire,
                                                                                   Universitaire,
              Le Belvédère,
                     Le Belvédère, 1060,   Tunis,   Tunisia.Phone/Fax:      +216  71
                                      1060, Tunis, Tunisia.Phone/Fax: +216 71 885190 885190
                               2
                                 University
                                    (2)       of Qatar,
                                        University        Faculty
                                                   of Qatar, Facultyof
                                                                    of Sciences,
                                                                       Sciences,
                          Department       of Computer
                                  Department   of ComputerScience,    Doha,Qatar.
                                                             Science, Doha,   Qatar.
        1                                                                    2
          {ibtissem.nafkha,samir.elloumi}@fst.rnu.tn,                          jaoua@qu.edu.qa
                       {ibtissem.nafkha,samir.elloumi}@fst.rnu.tn, jaoua@qu.edu.qa
                                              
                    . The potentials of formal concept analysis (FCA) for information
          retrieval (IR) have been highlighted by a number of research studies since its
          inception. The growth of the web has favoured the emergence of new search
          applications. In this paper, we will focus on the unique features of FCA for
          searching in distributed information and for reducing the size of the set
          information. The development of a FCA-based applications for distributed
          information returns a major gain and the obtained results are promising. This
          study has several perspectives for real and fuzzy data.

 ,QWURGXFWLRQ
    Along with the growth of the world wide web, information retrieval systems gain
 importance since they are often the only way to find the few documents actually
 relevant to a specific question in the vast quantities of text available. Moreover, with
 the advent of the Web along with the unprecedented amount of information available
 in electronic format and its distributed structure, Formal Concept Analysis (FCA) is
 more useful and practical than ever, because this technology addresses important
 limitations of the systems that currently support users in their quest for information.
 Over the last few years, the range of functionality has been expanded to include new
 tasks such as data reduction and collaborative (or cooperative) information retrieval.
 In fact, due to the huge quantity of available information and its distributed structure,
 it is necessary to abstract it and eliminate the redundancy data. In this context, a
 method for data reduction based on the formal concept analysis is proposed in
 [16,17]. At the same time, new IR domains have been investigated including different
 types of information (email messages, web documents,..). Thus, there is nowadays a
 much better awareness of the strengths and limitations of this technique for organising
 and searching distributed information. We are interested by searching in distributed
 information. So, this paper is organized as follows. In section 2, we introduce some
 basic definitions on formal analysis. Then in section 3, we present the data reduction.
 The section 4 is devoted to the presentation of several methods for searching in
 distributed information. In section 5, we present the complexity of those methods.
 
 0DWKHPDWLFDO)RXQGDWLRQV
 
   Among the mathematical theories found recently with important applications in
 computer science, lattice theory has a specific place for data organization, information


c V. Snášel, R. Bělohlávek (Eds.): CLA 2004, pp. 120–136, ISBN 80-248-0597-9.
  VŠB – Technical University of Ostrava, Dept. of Computer Science, 2004.
                                                  1
     Using Concept Formal Analysis for Cooperative Information Retrieval              121


engineering, data mining and for reasoning. It may be considered as the mathematical
tool    that     unifies   data    and    knowledge    or    information    retrieval
[1,2,3,4,5,6,7,8,9,10,18,19,20]. In this section, we define formal context, formal
concept, Galois connection and the lattice of concepts associated to the formal
context.

)RUPDO&RQWH[W

'HILQLWLRQ : A formal context is a triple k = <O,P,R>, where O is a finite set of
elements called objects, P a finite set of elements called properties and R is a binary
relation defined between O and P. The notations (g,m), or R(g,m)=1, mean that
“formal object g verifies property m in relation R" [12,21].

([DPSOHLet O be a set of some animals and P be a set of the properties such as :
a= needs water, b= lives in water, c= lives on land, d= can move around.
Let O= {Leech, Bream, Frog, Dog} and P={a, b, c, d}. The following context may be
defined by the table 1.

                                            a    b   c    d
                             1   Leech      1    1   0    1
                             2   Bream      1    1   0    1
                             3   Frog       1    1   1    1
                             4   Dog        1    0   1    1
                       Table 1 : An example of a formal context.

*DORLV&RQQHFWLRQ

'HILQLWLRQLet A ⊆ Ο and B ⊆ P two finite sets, R a relation on O x P. For both sets
A and B, operators I(A) and K (B) are defined as :
       I (A) = {m | ∀g, g ∈ A Æ (g,m) ∈ R}
       K (B) = {g | ∀m, m ∈ B Æ (g,m) ∈ R}

  Operator I defines the properties shared by all elements of A. Operator K defines
objects sharing the same properties included in set B. Operators I and K define a
Galois Connection between sets O and P [12].

3URSRVLWLRQ  Operators I and K define a Galois connection between O and P, such
that if A1, A2 are subsets of O, and B1, B2 are two subsets of P, then f and h verify
the following properties [12] :
        • A1⊆ A2 ⇒ I(A1) ⊇ I (A2)                    (1)
        • B1⊆ B2 ⇒ K (B1) ⊇ K(B2)                    (2)
        • A1⊆ KRI(A1) and B1 ⊆ IRK(B1) (3)
        • A⊆ K (B) ⇔ B ⊆ I(A)                         (4)
        • I= IRKRI and K = KRIRK(5)


                                            2
122    Ibtissem Nafkha, Samir Elloumi, Ali Jaoua


 )RUPDO&RQFHSW
 
 'HILQLWLRQA formal concept of the context <O,P,R> is a pair (A,B), where A ⊆ Ο,
 B ⊆ P, such I (A) = B and K (B) = A. Sets A and B are called respectively the
 domain (extent) and range (intent) of the formal concept.
 
 &RQFHSW/DWWLFH
 
 'HILQLWLRQFrom a formal context <O,P,R>, we can extract all possible concepts. In
 [12], we prove that the set of all concepts may be organized as a lattice, when we
 define the following partial order relation << between two concepts, (A1,B1) <<
 (A2,B2) ⇔ (A1 ⊆ A2 ) et (B2 ⊆ B1). The concepts (A1,B1) and (A2,B2) are called
 nodes in the lattice.

 (TXLYDOHQFHEHWZHHQDQREMHFWDQGDVXEVHWRIRWKHUREMHFWV
 
 'HILQLWLRQLet x ∈ O be an object, A⊆ Ο be a finite set and R a relation on O × P.
 We define FORVXUH [ = g(f(x)) and FORVXUH $  g(f(A)) [16,17].

 'HILQLWLRQLet x ∈ O be an object, A⊆ Ο be a finite set and R be a relation on O×P.
 We say that an object x is equivalent to the objects A, relatively to a relation R, if and
 only if, {x}∪A is a domain of a concept of R, and that the closure
 (x)=closure(A)={x}∪ A, where x ∉ A.
 
 ([DPSOH  Let R be the following relation, in the table, with 5 objects
 {O1,O2,O3,O4,O5} and three attributes {A,B,C}.
                                        A       B      C
                                O1       1      1      1
                                O2       1      0      1
                                O3       1      0      0
                                O4       0      1      1
                                O5       0      0      1
                              Table 2 : Example of a relation R

 O5 is equivalent to {O1,O2,O4}, the reason is that the concept containing O5 is
 CP={O1,O2,O4,O5}×{C} ; and inversely the concept containing {O1,O2,O4} is also
 CP.
 
 &XUUHQW6HDUFK
   Using FCA can complement the existing search systems to address some of their
 main limitations. Basically, FCA exploits the similarity between documents in order
 to offer an automatically support structure (i.e., the document lattice) in which to
 place the information retrieval process. The document lattice can be used to improve
 basic individual search strategies [1,2,4,13]. Moreover, query refinement is one of the


                                             3
       Using Concept Formal Analysis for Cooperative Information Retrieval               123


most natural applications of concept lattices. Its main objective is to recover from the
null-output or the information overload problem. The concept lattice may be used to
make a transformation between the representation of a query and the representation of
each document [5,6,7,8,9]. The query is merged into the document lattice and each
document is ranked according to the length of the shortest path linking the query to
the document concept. In the other hand, on the set of terms describing the document,
there exist hierarchies in the form of thesaurus [4,10,13,14]. The information
searching using FCA takes as input a query that will be forwarded to a selected search
engine [6,7,8]. The first pages retrieved by the search engine in response to the query
are collected and parsed. At this point, a set of index units that describe each returned
document is generated; such indices are next used to build the concept lattice
corresponding to the retrieved results. The last step consists of showing the lattice to
the user and managing the subsequent interaction between the user and the system. In
spite of such limitations such as for larger information collection, generally we get a
huge number of references, we are interest to build a FCA-based system for
distributed information, which may affect both the efficiency and the effectiveness of
the overall system. These limitations can be overcome by using data reduction that
will be described in the next section.

&RQFHSWXDO5HGXFWLRQ0HWKRG

  The fundamental question for data reduction is the following : How can we reduce
data without losing any knowledge ? The advantage of reduced data is that it can be
used directly as a prototype for making decision, for supervised learning or reasoning
[16,17]. The proposed algorithm is based on the elimination of any object that may be
replaced by a subset of the equivalent objects [17]. Firstly, we use relational join
operator in order to obtain a total relation. Secondly, we apply reduction algorithm on
the final table to minimize the number of rows. Let R be the initial binary relation,
representing the database, and RD be the expected output of the equivalent reduced
concept. We describe in detail the different steps of the algorithm.

Algorithm reduction (RD, R)
   Initialize RD=R
   For each object x in the domain of the remaining context RD,
   we do the following steps :
     1)we find the set of its properties P (i.e. subset of
        attributes satisfied by x).
     2) we find the set of objects S, except x, sharing all the
        properties P.
     3)if S is not empty, we check if object x is included in
        the set of objects sharing the same properties as S. In
        the positive case, object x is removed from context RD.
End of the loop
End.

  This method takes place on two steps. The first step consists in applying the
relational join operator on the set of the tables. The second step is the execution of the
conceptual reduction algorithm on the result table. The major problem is that the


                                            4
124    Ibtissem Nafkha, Samir Elloumi, Ali Jaoua


 treatment of an important size databases needs a considerable storage space and
 makes the reduction process sultry. Our idea is to propose an algorithm which makes
 simultaneously the join and the reduction of the tables set in only one reduced table.
 Then, we will present our system in the next section.

 6HDUFKHQKDQFHGE\)&$
   In this section we will present some search that may be improved through a FCA
 for distributed information.

 &RRSHUDWLYHFRQFHSWXDO,QIRUPDWLRQ5HWULHYDO6\VWHP
      To improve upon regular Information Retrieval System (IRS), cooperative
 conceptual information retrieval system (C2IRS) was proposed [18]. As it is shown in
 figure 1, C2IRS is composed of two parts :
      1. Part 1 is the cooperative information retrieval system that we present in the
         next subsection. In this phase, C2IRS uses a conceptual approach in the
         searching process from different databases. Each local result of each CIRS is a
         concept that we store in an $QVZHU$UUD\. We assume that we have N different
         formal contexts, so we can get a maximum of N concepts in the $QVZHU$UUD\.
      2. Part 2 is the final Answer Formulation that we describe in subsection 4.1.2.
         First of all, we apply the 0HUJLQJ$OJRULWKP on the $QVZHU$UUD\. Then, we
         apply the 0HUJLQJ $OJRULWKP on the modified $QVZHU DUUD\ in order to
         obtain the final answer. We detail those algorithms with an illustrative
         example.
                                                                    
                                                                  )*#&+
                                                              
                                     !""$# %                                        !" $# %

                                                                              
                                         & '                       «                          (

                                !"" '                                                      !" ,(
                                                                  «

                                                -  ! %!$.* -   /0#% $ !

                                                             132 456 7*48 9;: <
 
 
                                Figure 1: Architecture of C2IRS
 &RRSHUDWLYH,QIRUPDWLRQ5HWULHYDO6\VWHPDifferent information retrieval systems
 (IRS) cooperate to give a most complete answer to a query. Each IRS has a local
 database on which we apply the Galois connection (GC) to retrieve the documents


                                                   5
      Using Concept Formal Analysis for Cooperative Information Retrieval                                 125


satisfying the query. In the query, we express that we want to find documents which
are indexed by some indexing terms Qr. To resolve a query (Qr), each CIRS executes
the 5HWULHYH $OJRULWKP proposed in [18] on its local database (DBi). We apply the
Galois connection only on the query terms Tl existing in the local database. Thus, the
result of each CIRS, stored in the $QVZHU DUUD\, is a concept which domain is the
local searching terms Tl and its codomain is the retrieved documents.

([DPSOHLet the following illustrative example. Let the set of databases 1, 2 and 3
presented by formal contexts in figure 2. For the query of the form :"Which
documents are indexed by the terms t2, t3 and t5", the query is formed by three terms
t2, t3 and t5.
             a) database 1          b) database 2           c) database 3
                   t1 W=  W>          t1 W=  W?               t1 W >  W ? 
             A        0                 B         1       0     0    A       0           
             B        1       0     1       G         0              B       1           
             C        1                  J        0       0     1    C       1    1        0
             D        0       0     1       K         1       0     0    G       0           
                                            L         1       0     1

                 ""  "Q                  ""  "Q                       ""  "Q
                 % R    $Q/                % R    $Q/                     % R    $Q/
           Answer
             I J        K             I L                  M             J L                ONPM
                     Figure 2 : Cooperative Information Retrieval System

  We apply the 5HWULHYH $OJRULWKP already presented on the set of databases. The
Galois connection applied on the set of query terms, existing in the first database,
Tl={t2,t3}, gives the set of documents {A,C}. Then, we put the first concept found
({t2,t3},{ A, C}) in the first cell of the $QVZHU array.

  By the same way, we continue to find concept from each contexts . We obtain four
concepts from the several databases that we place in the following $QVZHUarray.
                                   
                         W= W>          W= W ?                W> W ? 
                         AC                G                  A B G
                                    Table 3 : THE @BACED;FHG ARRAY.
    Based on this array, we construct the final answer by the application of the
)LQDOB$QVZHUB)RUPXODWLRQ algorithm proposed in [18]. We detail the steps of this
algorithm in the next subsection.


                                                            6
126    Ibtissem Nafkha, Samir Elloumi, Ali Jaoua


 )LQDO$QVZHUIRUPXODWLRQ In this section, we present the second part of our system
 which is the final answer formulation. The formulation of the final answer consists in
 the application of two proposed algorithms on the $QVZHU array. This algorithm
 combines the concepts of the AQVZHU array according to some conditions. The final
 answer is built somehow iteratively. Initially, the final answer is an the empty set. We
 read the set of concepts element by element. For each element, if the domain of a
 concept is equal or greater than the terms of the query, then we add its range (set of
 document references) to the final answer. If this condition is not satisfied, we build
 intersection between its range and ranges of other concepts in the answer array (i.e. a
 subset of documents), and we compute the union of the domains (to find a subset of
 terms). We continue to build these intersections until we find at least all the terms of
 the query.
 
 ([DPSOHFor our example, we remind that our query is {t2,t3,t5}. Initially, the final
 answer is an empty set. We read the first concept of $QVZHU array. Their terms are
 different to the query. So, we merge their terms with terms of the second concept and
 we compute the intersection between their ranges. The union of terms is the set {t2, t3,
 t5} which is equal to the query. But the result of the intersection between documents is
 an empty set. So, we continue with the next concept. We merge the terms of the first
 and the last concept. The result is the set {t2,t3,t5} which is equal to the query. The
 intersection between their documents gives the set of documents {A}. We add this set
 to the final answer. By repeating this process with to the next concept, we find the
 union of terms {t2,t5,t3} which is equal to the query. So, we add the result of the
 intersection of documents that is the set {G} to the final answer. The final answer is
 now the set {A,G}. The final answer obtained is the set of documents {A,G} that will
 be deliver to the user.
 The retrieval process using this system is based on a formal concept analysis. For
 each database, we search the local concept. We use a cooperative approach to
 formulate the final answer basing on extracted concepts from several databases. The
 major problem is that when we search for pertinent information, we generally get a
 huge number of references. Moreover, the treatment of an important size databases
 needs a considerable storage space and makes the research process sultry. So, we need
 to minimize the size of each database. In the next section, we present the cooperative
 conceptual information retrieval system based on data reduction.
 
  &RRSHUDWLYH &RQFHSWXDO ,QIRUPDWLRQ 5HWULHYDO 6\VWHP EDVHG RQ 'DWD
 5HGXFWLRQ
   To improve upon cooperative conceptual information retrieval system, we propose
 a system for cooperative conceptual information retrieval basing on data reduction.
 As it is shown in figure 3, our proposed system is composed of two parts :
      1. Part 1 is detailed in the next subsection. In this phase, our proposed system
           uses a conceptual approach, proposed in [17], in the reducing process of
           different databases. Each database is reduced in order to minimize its size
           and obtain a reduced database (RDB) and the equivalent objects set.
      2. Part 2 is a search process which will be described in the next subsection.
           Basing on the set of reduced databases (RDB) and the equivalent objects set,


                                            7
     Using Concept Formal Analysis for Cooperative Information Retrieval                                                  127


         we cooperate to give a most complete answer to a query by using the
         cooperating process, proposed in [18], that will be detailed with an
         illustrative example.

                                SUT$V&W
                                                                    s,[W\,t


                                                                      DB1              p"p        DBN
                                  h abV Wi

                                                                           XEY _` V Z \ a["cqU[ \ [
                                                                                gUVra ` \ ] Y _
                   s[W\vu
                           XBY Y Z V W[ \ ] ^"V
                            XEY _` V Z \ ab["c
                          d _ e Y Wf[ \ ] Y _
                              gUV\ W] V^"["c                        RDB1
                                                                                           p"p"p RDB          N


                               j3kblmHno
                                                                           wyxa] ^ ["c V _ \vz{"|}V ` \ T,~bV\


           Figure 3 : Cooperative Conceptual Information retrieval based on data reduction

&RQFHSWXDO GDWD UHGXFWLRQ In [17], an accurate algorithm for data reduction is
proposed. This algorithm assumes that we can minimize data without losing any
knowledge. This algorithm is based on the elimination of any object that may be
replaced by a subset of equivalent objects as defined in [17]. We describe in detail the
different steps of the algorithm and we show its application on the example. Let R be
the initial binary context, representing the database, and RD be the expected output of
the equivalent reduced concept.

([DPSOHLet the following databases presented by the formal contexts in figure.
For the database 1, the document D is equivalent to {B,C,A}, the reason is that the
concept containing D is C1={A,B,C,D}×{t3}; and inversely the concept containing
{A,B,C} is also C1 that may be obtained by using the Galois connection. This means
that document D can be removed without modifying the initial knowledge database.
By the same way, we can remove B and J from the database 2. Moreover, we remove
the document A from the database 3. The reduced databases are showed in the
following table.
                   t1 W =  W >          t1 W =  W ?         t1 W >  W ? 
             A    0                            G   0                                 B       1             
             B    1       0        1               K   1       0       0                     C       1        1      0
             C    1                            L   1       0       1                     G       0             
                                    Table 4 : The reduced databases.


                                                           8
128    Ibtissem Nafkha, Samir Elloumi, Ali Jaoua


   Based on the reduced databases and the set of equivalent documents, the
 cooperative conceptual search process is executed to respond to the query. We detail
 the steps of this system in the next subsection.


 &RRSHUDWLYH &RQFHSWXDO ,QIRUPDWLRQ 5HWULHYDO In this section, we present the
 second part of our system which is a cooperative conceptual information retrieval
 which is proposed in [18]. Our system is formed by several conceptual information
 retrieval systems (CIRS) that cooperate to give a most complete answer to a query.
 Each one has a local reduced database on which we apply the Galois connection to
 retrieve the sites satisfying the query. In the query, we express that we want to find
 sites which are indexed by some indexing properties. We apply the Galois connection
 only on the query properties existing in the local reduced database. Thus, the result of
 each CIRS is a concept which domain is the local searching properties and its range is
 the retrieved sites. From this set of concepts, we formulate the final answer by
 applying a merging and a combining algorithms proposed in [18].
 
 ([DPSOH Let the reduced databases presented in table 4. For the query, treated in
 the last section, of the form : "Which documents are indexed by the properties t2 , t3
 and t5", the query is formed by three properties t2 , t3 and t5. We will find three
 concepts from reduced databases by applying Galois connection. The reduced
 database 1 contains only the properties t2 and t3. So, the first concept is {t2,t3}×{A,C}.
 The concepts founds from the second and the third reduced databases are
 simultaneous {t2,t5}×{G} and {t3,t5}×{B,G}. In order to formulate the final answer,
 we combine these concepts by applying merging and combining algorithms. The final
 answer is built somehow iteratively. Initially, the final answer is an the empty set. We
 treat the set of concepts element by element. We read the first concept, their
 properties are different to the query. So, we merge their properties with properties of
 the second concept and we compute the intersection between their ranges. The union
 of properties is the set {t2,t3,t5} which is equal to the query. The intersection of
 documents is empty. At this stage, the final answer is an empty set. Then, we compute
 the union of properties and the intersection of documents between the first and the
 third concept. The result of this merge is the set { t2,t3,t5} which is equal to the query.
 The result of the intersection between documents is an empty set So, we continue
 with the next concept. We merge the properties of the second and the last concept.
 The result is the set {t2,t3,t5} which is equal to the query. So, we add the result of the
 intersection of documents that is the set {G} to the final answer. The final answer is
 now the set {G}. The final answer obtained is the set of documents {G} that will be
 delivered to the user. The answer can be enriched by the additional information
 existing in the equivalent objects set. We have A is equivalent to the document G.
 The enriched final answer is so {G,A} that will be delivered to the user. We remark
 that we obtain the same answer as the cooperative conceptual information retrieval
 system presented in the last section that its databases contain in total 13 lines.
 Moreover, our proposed system is applied on the reduced databases which have only
 9 lines that is more easy. Our system handles bases which have a small sizes resulting
 of the conceptual reduction of the initial bases having an important size. Thus, our


                                             9
     Using Concept Formal Analysis for Cooperative Information Retrieval                  129


proposed system uses a storage space less important than that used by the cooperative
conceptual information retrieval system and returns then the quicker research process.
The major problem is that the treatment of databases which have an important size
needs a considerable storage space and makes the reduction process sultry. Our idea is
to propose an algorithm which makes simultaneously the join and the reduction of the
tables set in only one reduced table. In the next section, we present our system.

 &RQFHSWXDO LQIRUPDWLRQ UHWULHYDO V\VWHP EDVHG RQ FRRSHUDWLYH FRQFHSWXDO
GDWDUHGXFWLRQ

  In this section, we present an approach for information retrieval [19] that is
composed of two parts as it is shows in the following figure :
    3. Part 1 : From a set of formal contexts representing the databases, we apply
      the proposed algorithm for FRRSHUDWLYH FRQFHSWXDO GDWD UHGXFWLRQ in order to
      obtain the reduced table and the set of equivalent objects.
    4. Part 2 : The research is based on this reduced table. To respond to a query
      that is a set of terms, we apply the Galois connection to retrieve the documents
      satisfying the query. The final answer can be enriched by the set of equivalent
      objects.
                                       s,[W\t

                                             DB1                          DBN


                                           XEY Y Z V&W[ \ ] ^"V XEY _` V Z \ ab["c
                                                 qU[ \ [gUVra ` \ ] Y _

                                          Equivalent                       gBqU
                                           objects

                                           s[W\vu
               SUT$V&W  h aV&Wi         XEY _` V Z \ ab["cbd _e Y Wf[ \ ] Y _
                                                         gUV\ W] V^"["c


                                                     jklmHno
Figure 4 : Architecture of the conceptual information retrieval system based on cooperative
conceptual data reduction

  The advantage of this approach is that it is much faster than other methods. Indeed,
the treatment of databases which have an important size needs a considerable storage
space and makes the research process sultry. So, we need to minimize the size of the
databases set. Based on the reduced database, the conceptual research process is
executed. In the query, we express that we want to find documents which are indexed
by some indexing terms. We apply the Galois connection only on the query terms


                                              10
130    Ibtissem Nafkha, Samir Elloumi, Ali Jaoua


 existing in the reduced documentary database to formulate the answer that will be
 delivered to the user. 
 
 &RRSHUDWLYHFRQFHSWXDOGDWDUHGXFWLRQ In this section, we present the principle of
 our method for the cooperative conceptual data reduction and show how we apply in
 order to generate the reduced table and the set of equivalent objects. The algorithm
 proposed in [19] generates the reduced table from a set of tables without using the
 join operator. It is based on the use of a graph containing the concepts found in each
 new line addition. In another words, the construction of the concepts lattice is done in
 an incremental manner. This concepts lattice contains only the significant objects. The
 reduced table generation from the set of tables is done in an incremental manner. Our
 algorithm has as input the set of tables T and as output the graph G, the reduced table
 RT and the set of equivalent objects. It is based on the following proposition.

 3URSRVLWLRQLet C = (dom , range) be a concept and Ci = (domi , rangei), i = 1..N, be
 its children.Let E = dom \ ( ∪ domi ), where the operator “\” is the minus between
 two sets. ,I E is not empty DQG range = ( ∩ rangei ). 7KHQ E contains the objects to
 be eliminated.
 
    We execute the principle of the cooperative conceptual data reduction on the same
 example to prove that it give the same result as the conceptual reduction data.

 ([DPSOH  Let the following databases presented in figure 2. We apply now our
 proposition. The table RT and the graph G are empty. For the first object
 {A}×{t2,t3,t5} generates a new node , that has as domain {A} and as range is
                                    
 {t2,t3,t5}, that is added to the graph.
                                                     W= W> W?
                                Figure 5 : The graph G.

 The line is added to the table RT :
                                                t2           t3       t5
                                    A           1            1        1
                      Table 5 : The table RT after adding the first object.

         The treatment of the second document {B}×{t1,t3,t5} generates a new node ,
 that is (B, {t1,t3,t5}),in the graph. The treatment with the node gives the intersection
 {t3,t5}. We add a new node that its domain is {A,B} and its range is the set {t3,t5}.
                                     
                                        N            W> W?
 
                        1
                                                         
                            A        t2 t3 t5
                                                                  N        W W> W?
                                Figure 6 : The graph G.


                                                       11
     Using Concept Formal Analysis for Cooperative Information Retrieval                                                           131


The line is added to the table RT that becomes :
                                             t1            t2          t3           t5
                                 A           0             1           1            1
                                 B           1             0           1            1
                     Table 6 : The table RT after adding the second object.

        For the object {C}×{t1,t2,t3}, a new node is added to the graph. The treatment
with the node gives the intersection {t2,t3}.We add a new node that its domain is
{A,C} and its range is {t2,t3}. The treatment with the node  generate a new node 
that is its domain is {B,C} and its range is the intersection found {t1,t3}.
                                                      3
                                                                                             
                                      W= W>             AB              t3 t5
                                                                                                 NP         W W> 
                                                                               2
                                            1
                         W W= W>         A               t2 t3 t5                B            t1 t3 t5

                                                   Figure 7 : The graph G.

The line is added to the table. The table is now :
                                             t1            t2          t3           t5
                                 A           0             1           1            1
                                 B           1             0           1            1
                                 C           1             1           1            0
                      Table 7 : The table RT after adding the third object.

       For the object {D}×{t3}, a new node  is added to the graph. The treatment
with the node and gives an intersection {t3} that is its range, it’s a modified node.
We add to domain the set {A,B,C}. We verify the propriety of the proposition
presented. Indeed, we remark that the domain of the node is different from the union
of the domains of their children   and  ({A,B,C,D} <> {A,C}∪{A,B}∪{B,C})
and its range is equal to the intersection of the ranges of their children ({t3}= {t2,t3}∩
{t3,t5}∩ {t1,t3}). So, the object D (D ={A,B,C,D} \ {A,C} ∪ {A,B} ∪ {B,C}) will be
eliminate from the node and from RT. The document D is equivalent to {A,B,C}.
The graph becomes :
                                            
                                                                 ONP                      W> 

                                5                                       3                                                      6
                                      AC            t2 t3              AB                t3 t5              BC         t1 t3

                      4                                 1                                    2
                          C           t1 t2 t3          A              t2 t3 t5                    B        t1 t3 t5
                                                   Figure 8 : The graph G.


                                                            12
132       Ibtissem Nafkha, Samir Elloumi, Ali Jaoua


 The new row is added to the RT that becomes :
                                                 T1           T2    T3       T5
                                        A        0            1     1        1
                                        B        1            0     1        1
                                        C        1            1     1        0
                        Table 8 : The table RT after adding the new object.

 By the same way, we continue the treatment for the others objects. Finally, we obtain
 the following graph.
              7                                   8                                   9
                  ABC         t3                   ABL              t5                BCL         t1

          5                         3                                    6                                     10
              AC      t2 t3             AB            t3 t5         BC        t1 t3          BL        t1 t5

              4                                                1                  2
                  C      t1 t2 t3           A           t2 t3 t5         B        t1 t3 t5

                                                   Figure 9 : The graph G.

      Finally, we obtain the reduced database RT that is presented in the following table.
                                                 t1           t2    t3       t5
                                        B        1            0     1        1
                                        C        1            1     1        0
                                        G        0            1     1        1
                                        L        1            0     0        1
                                                Table 9 : The table RT.

   The document D is equivalent to the set of documents {B,C,G}, the document J is
 equivalent to the set {B,G} and the document K is equivalent to the set {B,C,L}.
 Basing on the reduced database and the set of equivalent objects, we execute the
 conceptual research process that is detailed in the next section.

 &RQFHSWXDO LQIRUPDWLRQ UHWULHYDO The information retrieval system (IRS) is based
 on the reduced database to give a most complete answer to a query. We apply the
 Galois connection (GC) on the reduced database to retrieve the documents satisfying
 the query. In the query, we express that we want to find documents which are indexed
 by some indexing terms Qr. To resolve a query (Qr), we apply the Galois connection
 on the query terms Tl. Thus, the result of IRS is a concept which range is the
 searching terms Tl and its domain is the retrieved documents.
 
 ([DPSOH   Let us treat the databases presented in the table 9. For a query of the
 form:"Which documents are indexed by the terms t2, t3 and t5", the query is formed by


                                                               13
     Using Concept Formal Analysis for Cooperative Information Retrieval              133


three terms t2, t3 and t5. We will find the concept from reduced database by applying
Galois connection. The concept found is {t2, t3,t5}×{G}. The final answer obtained is
the document G that will be delivered to the user. We have in the set of equivalent
object the information that the document G is equivalent to the document A. So, the
final answer will be {A,G}. We remark that we obtain the same result as the
cooperative information retrieval based on the data reduction, presented in the last
section, that treats three databases containing in total 9 rows. Moreover, our proposed
system is applied on the four rows reduced database that is more easy. In the next
section, we present the complexity of all presented systems for information retrieval.

&RPSOH[LW\$QDO\VLV

The complexity analysis of a treatment is ensured by calculating :
     i) the time complexity, which is the number of operations of the treatment, and
     ii)the space complexity, which is the number of memories cases used by the
        treatment.
We compute the complexity analysis of cooperative conceptual information retrieval
system, conceptual information retrieval system based on data reduction and
conceptual information retrieval system based on cooperative data reduction.

&RRSHUDWLYH&RQFHSWXDO,QIRUPDWLRQ5HWULHYDO6\VWHP
        We present the complexity of the cooperative conceptual information retrieval
system (C2IRS) proposed in [11]. We suppose that the database has Q sites and P
properties. We remind first of all the steps of this system:
         - Phase 1: the search of concepts from different databases.
         - Phase 2: the formulation of the final answer from the found concepts.

7LPH&RPSOH[LW\We suppose that each database has Qsites and Pproperties. The
time complexity of the system C2IRS is  Q P Q. So, &  (n,m) = 2nm + n.

6SDFHFRPSOH[LW\The system C2IRS uses k matrix with n lines and m columns. So,
it uses (k*n*m) memory cells. So, the space complexity of this method is CS = knm.

 &RRSHUDWLYH FRQFHSWXDO LQIRUPDWLRQ UHWULHYDO LQIRUPDWLRQ EDVHG RQ 'DWD
UHGXFWLRQ
   We suppose that the database has Qsites and Pproperties and the reduced database
has Q¶sites and P¶properties. We remind first all steps of this system:
         - Phase 1: conceptual data reduction applied on several databases.
         - Phase 2: cooperative conceptual information retrieval applied on the reduced
              databases.

7LPHFRPSOH[LW\ It is easy to find the complexity &¶  (QP) for this system which is
equal in : &¶  (n,m) = &¶ }  v (n,m) + &¶ }"   (Q¶P¶).
The complexity of phase 1 is O(n*m). The phase 2 realize (2(Q¶ P¶ Q¶ operations
[17]. So, the system realizes Q P>2(Q¶ P¶ Q¶@operations.


                                         14
134    Ibtissem Nafkha, Samir Elloumi, Ali Jaoua


 So, &¶  (n,m) = nm + 2n’m’ + n’.
 Since the size of the reduced database is lesser than that of the initial database
 (n’<<n). So, the complexity of our proposed system is lesser than that of the C2IRS .
 &¶  (n,m) = nm + 2n’m’+n’ << &  (n,m) = 2nm + n.
 
 6SDFH &RPSOH[LW\ Our retrieval system handles a reduced database. So, we can
 reserve only (k*n’*m’) memory cells. So, the space complexity of our system is
 C’S=kn’m’.
 We note that our algorithm uses a less number of memory cells that the C2IRS :
 C’S<<CS. This can be explained by the integration of the conceptual data reduction in
 our proposed system.
 
 &RQFHSWXDOLQIRUPDWLRQUHWULHYDOEDVHGRQ&RRSHUDWLYHGDWDUHGXFWLRQ
 
 7LPH FRPSOH[LW\ If we have K tables, each one has Q lines and P properties, we
 realize N iterations where N = nn. In each iteration, we cover the concepts graph G
 and we do then ||G|| operations where ||G|| represents the graph cardinality. So, we do
 ||G|| update operations on the graph (adding or modification operation). So , C’T ≈
 O(N* ||G||2).
 Then, our algorithm have a complexity less then that of the conceptual data reduction
 considering the cardinality of the graph is too small compared to the number of
 properties (||G||<< M)
 
 6SDFH&RPSOH[LW\Our algorithm treat in each iteration one line. So, we can reserve
 only M memory cases. For the treatment of each line, the algorithm uses a graph that
 is constructed in incremental manner. This graph necessities 8*||G|| memory cells
 since for each graph node, we reserve 8 memory cells. So, the space complexity of
 our algorithm is C’S=M+(8*||G||).
         We note that our algorithm uses a less number of memory cells that the
 conceptual data reduction : C’S<<CS. This can be to explain by the fact why the
 conceptual data reduction treats a matrix of N lines and M columns on the other hand
 our algorithm treats only one line.

 &RQFOXVLRQ
 
        In this paper, we have presented three conceptual information retrieval
 systems. The cooperative conceptual information retrieval system combines the
 results of several information retrieval systems. Each one has its local formal context
 representing a database on which we apply a conceptual search to obtain a first set of
 concepts. From this set of concepts, we formulate the final answer by applying a
 proposed algorithm merging and combining concepts with a suitable way. The
 volume of the available information is increasing exponentially. So, generally we get
 a huge number of reference. It is necessary to abstract it and eliminate the redundancy
 data. However, our main idea is to integrate the data reduction in the search process.
 The information retrieval is based on the data reduction. From a set of formal contexts
 presenting a databases set, we apply the relational join operator in order to fusion the


                                           15
      Using Concept Formal Analysis for Cooperative Information Retrieval                     135


tables. Basing on the join resulting table, we apply the data reduction algorithm to
obtain the reduced table on which the search process is executed. The major problem
is that the treatment of an important size databases needs a considerable storage space
and makes the reduction and the research process sultry. We are presented an
information retrieval based on a cooperative conceptual data reduction. We have
developed a fast algorithm, that combines the join and the reduction steps, for
minimizing formal contexts and we have apply it in an information retrieval system.
From a set of databases, we obtain a reduced database on which we apply a
conceptual research for responding to a query. This application was experimented on
many databases and the obtained results are promising. This study has several
perspectives for real and fuzzy data. 

5HIHUHQFHV
1. Aboud M., Chrisment C., Razouk R., Florence S., Soulé-Dupuy : Query a Hypertext
    Information Retrieval System by use of Classification. Information Processing and
    Management, 29(3), (1993) 387-396
2. Amati G., Carpineto C. , and Romano G. : FUB at TREC-10 Web Track: A Probabilistic
    Framework for Topic Relevance Term Weighting. In Proceedings of the 10th Text
    REtrieval Conference (TREC-10), NIST Special Publication 500-250, Gaithersburg, MD,
    USA (2001) 182-191.
3. Bordat J.P.: Calcul pratique du treillis de Galois d'
                                                       une correspondance. Math. Sci. Hum., 96,
    (1986) 31-47.
4. Carpineto C. and Romano G. : Using Concept Lattices for Text Retrieval and Mining. In the
    first International Conference on Formal Concept Analysis, Darmstadt, Germany, (2003).
5. Carpineto C. and Romano G. : Information retrieval through hybrid navigation of lattice
    representations. International Journal of Human-Computer Studies, 45(5), (1996) 553-578.
6. Carpineto C. and Romano G : A lattice conceptual clustering system and its application to
    browsing retrieval. Machine Learning, 24(2), (1996) 1-28.
7. Carpineto C. and G. Romano. Effective reformulation of Boolean queries with concept
    lattices. In Proceedings of the 3rd International Conference on Flexible Query-Answering
    Systems, pages 83-94, Roskilde, Denmark, 1998.
8. Cole R. and Eklund P. : Browsing semi-structured web texts using formal concept analysis.
    In Proceedings of the 9th International Conference on Conceptual Structures, Stanford, CA,
    USA, (2001) 319-332.
9. Efthimiadis E. : Query expansion. In M. E. Williams, editor, Annual Review of Information
    Systems and Technology, v31, American Society for Information Science, Silver Spring,
    Maryland, USA, (1996) 121-187.
10. Ferrfie S. and Ridoux O. : A file system based on concept analysis. In Proceedings of the
    1st International Conference on Computational Logic, London, UK, (2000) 1033-1047.
11. Gammoudi M. M., and Nafkha I.: A formal method for inheritance graph
   hierarchy construction. Information Sciences 140(3-4), (2002) 295-317.
12. Ganter B. and Wille R. : Formal Concept Analysis - Mathematical Foundations. Springer,
    1999.
13. Godin R. and Mili. H. : Building and Maintaining Analysis Level Class Hierarchies Using
    Galois Lattices. In Proceedings of the 8th Annual Conference on Object Oriented
    Programming Systems Languages and Applications, Washington, D.C., USA, (1993) 394-
    410.
14. Godin R., Missaoui R., and April A. : Experimental comparison of navigation in a Galois
    lattice with conventional information retrieval methods. International Journal of Man-
    Machine Studies, 38: (1993) 747-767.


                                             16
136     Ibtissem Nafkha, Samir Elloumi, Ali Jaoua


 15. Godin R. , Saunders E. , and Jecsei J. : Lattice model of browsable data spaces. Journal of
     Information Sciences, 40: (1986) 89-116.
 16. Jaoua A., Bsaies Kh., and Consmtini W. : May reasoning be reduced to an Information
     Retrieval problem. Relational Methods in Computer Science, Quebec, Canada, (1999).
 17. Jaoua A., Al-Rashdi A., AL-Muraikhi H., Al-Subaiey M., Al-Ghanim N., and Al-Misaifri
     S.: Conceptual Data Reduction, Application for Reasoning and Learning. The 4th Workshop
     on Information and Computer Science, KFUPM, Dhahran, Saudi Arabia, (2002).
 18. Nafkha I., Elloumi S. and Jaoua A. : Conceptual Cooperative Information Retrieval System.
     In International Arab Conference on Information Technology, Doha December 16-19,
     Qatar, (2002) 534-539.
 19. Nafkha I., Elloumi S. and Jaoua A. : Conceptual Information Retrieval System based on
     cooperative conceptual data reduction. 1St International Conference on Information &
     Communication Technologies : from Theory to Applications, Syria, (2004).
 20. Salton G. : Automatic Text Processing: The Transformation, Analysis, and Retrieval of
     Information by Computer. Addison Wesley, 1989.


                                              17