=Paper=
{{Paper
|id=None
|storemode=property
|title=Querying Relational Concept Lattices
|pdfUrl=https://ceur-ws.org/Vol-959/paper26.pdf
|volume=Vol-959
|dblpUrl=https://dblp.org/rec/conf/cla/AzmehHNHV11
}}
==Querying Relational Concept Lattices==
Querying Relational Concept Lattices Z. Azmeh1 , M. Huchard1 , A. Napoli , M. Rouane-Hacene3 , and P. Valtchev3 2 1 LIRMM, 161, rue Ada, F-34392 Montpellier Cedex 5 2 LORIA, B.P. 239, F-54506 Vandœuvre-lès-Nancy 3 Dépt. d’informatique, UQÀM, C.P. 8888, Succ. Centre-Ville Montréal, Canada Abstract. Relational Concept Analysis (RCA) constructs conceptual abstractions from objects described by both own properties and inter- object links, while dealing with several sorts of objects. RCA produces lattices for each category of objects and those lattices are connected via relational attributes that are abstractions of the initial links. Navigating such interrelated lattice family in order to find concepts of interest is not a trivial task due to the potentially large size of the lattices and the need to move the expert’s focus from one lattice to another. In this paper, we investigate the navigation of a concept lattice family based on a query expressed by an expert. The query is defined in the terms of RCA. Thus it is either included in the contexts (modifying the lattices when feasible), or directly classified in the concept lattices. Then a navigation schema can be followed to discover solutions. Different navigation possibilities are discussed. Keywords: Formal Concept Analysis, Relational Concept Analysis, Re- lational Queries. 1 Introduction Recently [1], we worked on the problem of selecting suitable Web services for instantiating an abstract calculation workflow. This workflow can be seen as a DAG whose nodes are abstract tasks (like book a hotel room) and directed edges are connections between the tasks, which often correspond to a data flow (like connecting reserve a train ticket and book a hotel room: train dates and time- table are transmitted from reserve a train ticket to book a hotel room). The selection is based on quality-of-service (QoS) properties like response time or availability and on the composability quality between services chosen for neigh- bor tasks in the workflow. Besides, we aim at identifying and storing a set of backup services adapted to each task. To be efficient in the replacement of a fail- ing Web service by another, we want to organize each set of backup Web services by a partial order that expresses the quality criteria and helps to choose a good trade-off for instantiating the abstract workflow. Analyzing such multi-relational data is a complex problem, which can be approached by various methods includ- ing querying, visualization, statistics, or rule extraction (data mining). We proposed an approach based on Relational Concept Analysis (an itera- tive version of Formal Concept Analysis) to solve this problem, because of its multi-relational nature. Web services are filtered and grouped by tasks they may satisfy (e. g. the Web services for booking a hotel room). In formal contexts (one for each task), we associate the Web services and their QoS criteria. For exam- ple, the service HotelsService by lastminutetravel.com would be described by low response time, medium availability (classical scaling is applied to the QoS val- ues). In relational contexts we encode the composability levels in each directed edge of the workflow. Given an edge of the workflow, the composition quality depends on the way output data of the source task cover input data of the end- ing task, and the need for data adaptation. A relational context encodes for example the relation Adaptable-Fully-Composable between services for reserve a train ticket and services for book a hotel room. In this relation TravelService by puturist.com is connected to HotelsService by lastminutetravel.com if output data of TravelService can be used, with a slight adaptation, to fill input data of HotelsService. The concept lattice family we obtain (one Web service lattice for each task of the workflow) makes it possible: (1) to select a Web service for each task based on QoS and composability criteria, (2) to memorize classified alternatives for each task. Due to the nature of our problem, we are interested in classifying indepen- dently the Web services corresponding to the tasks and not classifying the solu- tions. By solution, we mean a set of Web services, each of which can instantiate a task of the workflow. If a particular service fails or is no more available, the goal is to constitute a new working combination out of the old one, with the smallest number of service replacements. To the best of our knowledge, this problem area has not been investigated in-depth prior to our study, especially in the context of Relational Context Analysis [7, 6]. Therefore, we believe that it would be use- ful to generalize and report what we learned in our experience. In more general terms, we have multi-relational data and a question which contains variables we want to instantiate, and we aim at: – Finding a specific set of objects that satisfy the query. An answer is composed of objects, each object instantiates one variable; – Classifying, for each variable, the objects depending on the way they satisfy (or not) the query, to find alternative answers. In this paper, we put the problem in a more general framework, which as- sumes an unrestricted relational context family and a query given by an expert. The query can be seen as a DAG, where some nodes are labelled by variables and some others are labelled by objects. The nodes roughly correspond to the formal (object-attribute) contexts and the edges correspond to the relational (object- object) contexts. A set of lattices is built using Relational Concept Analysis and the existential scaling operator. We assume that an expert gives a total ordering of the edges of the DAG. Then an algorithm navigates the lattices following this ordering. This navigation allows us to determine objects that answer the query. These objects with their position in the lattices are what the expert wants to explore, to extract a solution and store the alternatives. In the following, Section 2 reminds the main principles of Relational Concept Analysis (RCA). Section 3 defines the model of queries in the RCA framework that we consider in this paper. Section 4 presents and discusses an algorithm that navigates the concept lattice family using a query. Related work is presented in Section 5 and we conclude in Section 6. 2 Background on Relational Concept Analysis For FCA, we use the notations of [4]. In RCA [5], the objects are classified not only according to the attributes they share, but also according to the links be- tween them. Let us take the following case study. We consider a list of countries, a list of restaurants, a list of Mexican dishes, a list of ingredients, and finally a list of salsas. We impose some relations between these entities {Country, Restau- rant, MexicanDish, Ingredient, Salsa}, such that: a Country ”has” a Restaurant; a Restaurant ”serves” a MexicanDish; a MexicanDish ”contains” an Ingredient; an Ingredient is ”made-in” a Country; and finally a Salsa is ”suitable-with” a MexicanDish. We express these entities and their relations by the DAG in Fig. 1. We capture an instantiation of this entity-relationship diagram in a relational context family. Fig. 1. The entities of the Mexican food example (left). The query schema (right) Definition 1. A relational context family RCF is a pair (K, R) where K is a set of formal (object-attribute) contexts Ki = (Oi , Ai , Ii ) and R is a set of relational (object-object) contexts rij ⊆ Oi × Oj , where Oi (domain of rij ) and Oj (range of rij ) are the object sets of the contexts Ki and Kj , respectively. The RCF corresponding to our example contains five formal contexts and five relational contexts, illustrated in Table 1 (except the made-in relational context, which is not used in this paper for sake of simplicity). An RCF is used in an iterative process to generate at each step a set of concept lattices. First concept lattices are built using the formal contexts only. Then, in the following steps, a scaling mechanism translates the links between objects into Table 1. Relational Context Family for mexican dishes America Europe r1 r2 r3 r4 r5 r6 r7 Asia d1 d2 d3 d4 d5 d6 Restaurant mx MexicanDish en × us ca Chili’s es lb fr Country Burritos × Chipotle × Canada × × Enchiladas × El Sombrero × England × × Fajitas × Hard Rock × France × × Nachos × Mi Casa × Lebanon × × Quesadillas × Taco Bell × Mexico × × Tacos × Old el Paso × Spain × × USA × × i10 i11 i12 i1 i2 i3 i4 i5 i6 i7 i8 i9 Ingredient medium-hot chicken × beef × pork × vegetables × mild beans × hot s1 s2 s3 s4 rice × Salsa cheese × Fresh Tomato × × guacamole × Roasted Chili-Corn × × sour-cream × Tomatillo-Green Chili × × lettuce × Tomatillo-Red Chili × × corn-tortilla × flour-tortilla × contains chickenbeef porkvegetablesbeansricecheeseguacamolesour-creamlettucecorn-tortillaflour-tortilla Burritos × × × × × × × × × × × Enchiladas × × × × Fajitas × × × × × × × × Nachos × × × × Quesadillas × × × × × Tacos × × × × × × × has Chili’s Chipotle El Sombrero Hard Rock Mi Casa Taco Bell Old el Paso Canada × × × × × England × × × France × × × Lebanon × × × Mexico × × × Spain × × USA × × × × × × serves Burritos Enchiladas Fajitas Nachos Quesadillas Tacos Chili’s × × × Chipotle × × El Sombrero × × × × × × Hard Rock × × Mi Casa × × × × × Taco Bell × × × × Old el Paso × suitable-with Burritos Enchiladas Fajitas Nachos Quesadillas Tacos Fresh Tomato × × × × × × Roasted Chili-Corn × × Tomatillo-Green Chili × × Tomatillo-Red Chili × × × × × × conventional FCA attributes and derives a collection of lattices whose concepts are linked by relations. For example, the existential scaled relation (that we will use in this paper) captures the following information: if an object os is linked to another object ot , then in the scaled relation, this link is encoded in a relational attribute assigned to os . This relational attribute states that os is linked to a concept, which clusters ot with other objects. This is used to form new groups, for example the group (See Concept 84) of restaurants, which serve at least one dish containing sour cream (such dishes are grouped in Concept 75). The steps are repeated until reaching the stability of lattices (when no more new concepts are generated). For mexican dishes, four lattices of the concept lattice family are represented in Figures 3 and 4. The ingredient lattice is presented in Fig. 2. Definition 2. Let rij ⊆ Oi × Oj be a relational context. The exists scaled ∃ ∃ relation rij is defined as rij ⊆ Oi × B(Oj , A, I), such that for an object oi and a concept c: (oi , c) ∈ rij ⇐⇒ ∃ x, x ∈ o0i ∩ Extent(c). ∃ In this definition, A is any set of attributes maybe including relational at- tributes, which are defined below. Definition 3. A relational attribute (s r c) is composed of a scaling operator s (for example exists), a relation r ∈ R, and a concept c. It results from scaling a relation rij ∈ R where rij ⊆ Oi × Oj . It expresses a relation between the objects o ∈ Oi with the concepts of B(Oj , A, I). An existential relational attribute is denoted by ∃rij c where c ∈ B(Oj , A, I). For example: the Concept 50 in the Country lattice owns the relational attribute ∃has Concept 60. This expresses that each country in Concept 50 (Canada and USA) has at least a restaurant in Concept 60 extent (El Som- brero or Mi Casa). Fig. 2. The concept lattice for ingredients of the RCF in Table 1 (concepts names are reduced to C n). 3 Introducing Relational Queries In this section, we define the notion of query and answer to a query. First (section 3.1) we recall simple queries that help navigating concept lattices [7]. Then (section 3.2), we generalize to relational queries that lead the navigation across a concept lattice family. 3.1 Simple queries Definition 4. A query (including its answer) on a context K = (O, A, I), de- noted by q|K (or q when it is not ambiguous), is a pair q = (oq , aq ), such that oq is the query object(s) i.e. the set of objects satisfying the query (or the an- swer set), and aq is the set of attributes defining the constraint of the query. By definition, we have: o0q ⊇ aq , where aq ⊆ A. Fig. 3. Country and restaurant lattices for exists and the RCF in Table 1. For example q|Kcountry = ({England, F rance, Spain}, {Europe}) is a query on the country context (in Table 1), asking for countries in Europe. Another example q|KM exicanDish = ({}, {rice, corn-tortilla}) When aq is closed, solving the query consists in finding the concept C = (a0q , aq ). To ensure that such a concept exists, a virtual query object ovq that satisfies ovq0 = aq can be added to the context (as an additional line). Then, three types of answers can be interesting: the more precise answers are in a0q , less constrained (with less attributes) answers are in extents of super-concepts of C, more constrained (with more attributes) answers are in extents of sub-concepts of C. When aq is not closed, and we don’t use the virtual query object, searching for answers needs to find first the more general concept C whose intent contains aq . Now we will define more generally what we mean by relational queries. Fig. 4. Dishes and salsa lattices for exists and the RCF in Table 1. 3.2 Relational queries In this study, a relational query is composed of several simple queries, to which we add relational constraints. The relational constraints are expressed via virtual query objects (variables), one for each formal context, where we want to find an object. A virtual query object may have relations (according to the relational contexts) with objects of other contexts, as well as with other virtual query objects. Definition 5. A relational query Q on a relational context family (K, R) is a pair Q = (Aq , Ovq , Rq ), where: 1. Aq is a set of simple queries Aq = {q|Ki = (oq |Ki , aq |Ki ) | q|Ki is a query on Ki ∈ K} 2. There is a one-to-one mapping between Aq and Ovq , where Ovq is the set of virtual query objects. 3. Rq is a set of relational constraints Rq = {(ov q|Ki , rij , Oq )}, where ov q|Ki is the virtual object associated with q|Ki , Oq ⊆ Oj ∪ {ov q|Kj }, with ov q|Kj is the virtual object associated with Kj . For example, let us consider the following query: I am searching for a country with the attribute ”fr”, a restaurant in this country serving Mexican dish contain- ing (chicken, cheese, and corn-tortilla), and a salsa which is ”hot” and suitable with the dish. This query can be translated into a relational query Qexample = (Aq , Ovq , Rq ) as follows: Aq = {qcountry , qrest. , qdish , qsalsa }, aqcountry = {f r}, aqrest. = aqdish = ∅, aqsalsa = {hot}. Ovq = {ov qdish , ov qcountry , ov qrest. , ov qsalsa } Rq = {(ov qdish , contains, {chicken, cheese, corn-tortilla}), (ov qcountry , has, {ov qrest. }), (ov qrest. , serves, {ov qdish }), (ov qsalsa , suitable-with, {ov qdish })}. By definition, a query corresponds to the data model, and must respect the schema of the RCF (see in Fig. 1). An answer to the relational query is included in the answers of the simple queries. For our example, the answers of the simple queries would be oqcountry = {F rance}, oqrest. contains all the restaurants, oqdish contains all the dishes, oqsalsa = {T omatillo-Red Chili}. If we consider these objects connected with the relations, this forms what we call the maximal answer graph. In this graph, we are interested in the subgraphs that cover the query (they have at least one object per element of Aq ). These subgraphs are included in the graph of Fig. 5. There are various interesting forms of answer: having exactly one object per element of Aq , or having several objects per element of Aq . Fig. 5. The subgraph containing all the answers with the relations between the objects corresponding to the relational query example. Definition 6. An answer to a relational query Q = (Aq , Ovq , Rq ) is a set of objects X having a unique object per each context that is involved in the query: X =< oi | oi ∈ Oi with 1 ≤ i ≤ n > These objects satisfy the query Q = (Aq , Ovq , Rq ), when they have the requested attributes: ∀ q|Ki ∈ Aq , ∃ oi ∈ X : o0i ⊇ aq|Ki and they are connected as expected: ∀ (ov q|Ki , r, Oq ) ∈ Rq with r ⊆ Oi × Oj , (and thus : Oq ⊆ Oj ∪ {ov q|Kj }) and ∀ o ∈ Oq , we have : 1. if o ∈ Oj , we have (oi , o) ∈ r 2. if o = ovq|K , we have (oi , oj ) ∈ r with oj ∈ X ∩ Oj j For our example, the set of answers to the relational query, is: {{F rance, El Sombrero, Enchiladas, T omatillo- Red Chili}, {F rance, El Sombrero, Quesadillas, T omatillo-Red Chili}, {F rance, El Sombrero, T acos, T omatillo-Red Chili}, {F rance, Old el P aso, T acos, T omatillo-Red Chili}}. Answers can be provided with an aggregated form which can be found in lattices, as we explain below. They allow us to discover sets of equivalent objects relatively to the answer. E.g. {Enchiladas, Quesadillas, T acos} are equivalent objects if we choose F rance and ElSombrero. Definition 7. An aggregated answer to a query Q = (Aq , Ovq , Rq ) is the set AR containing the sets Si , such that: – there is a one-to-one mapping between AR and Aq which maps each q|Ki to a set Si – ∀ q|Ki ∈ Aq , ∀ oi ∈ Si , o0i ⊇ q|Ki (objects of Si have the requested attributes) – when (ov q|Ki , r, Oq ) ∈ Rq - if ov q|Kj ∈ Oq , r ⊆ Oi × Oj , thus : ∀ oi ∈ Si , ∀ oj ∈ Sj , Sj ∈ AR, we have (oi , oj ) ∈ r (virtual objects are connected if requested) - f or each oj ∈ Oq ∩Oj we have : (oi , oj ) ∈ r (connections with particular objects are satisfied). For example, an aggregated answer for our query is {Scountry , Srest. , Sdish , Ssalsa } = {{F rance}, {ElSombrero}, {Enchiladas, Quesadillas, T acos}, {T omatillo- RedChili}} 4 Navigating a Concept Lattice Family w.r.t. a Query In this section, we explain how the navigation between the concept lattices can be guided by a relational query. Following relational attributes that lead us from one lattice to another, we navigate a graph whose nodes are the concept lattices. In a first subsection, we propose an algorithm which gives a general navigation schema that applies to concept lattices built with the existential scaling. Then we present several variations of this navigation algorithm. 4.1 A query-based navigation algorithm Our approach for navigating the concept lattices along the relational attributes is based on the observations made during an experimental use of RCA, for finding the appropriate Web services to implement an abstract calculation workflow [1]. We consider an RCF and a query that respects the RCF relations. From our experience, we observed that an expert often expresses his query by a phrase, where the chronology of the principal verbs (relations) gives a natural path for the query flow. This will be our hypothesis. Let us consider the query previously specified: I am searching for a country, with the basic attribute ”fr”, that has a restaurant which serves dishes containing chicken, cheese and corn-tortilla; I am searching for a hot salsa suitable with this dish. In order to simplify the notation, we use the same notation for queries q|ki and the virtual objects ov q|Ki . The query path is a total ordering of the arcs of the query (the query itself is a DAG in general). For our example, the path is the total ordering for Rq given by {(qcountry , has, {qrestaurant }), (qrestaurant , serves, {qdish }), (qdish , contains, {chicken, cheese, corn-tortilla}), (qsalsa , suitable-with, {qdish })}. Each arc cor- responds to a relation used in the query. All the relations involved inside a query are covered by this path. This translation of the expert query determines a composition on the relations. The query path does not always correspond to a directed chain in the object graph (e.g. dishes are the end of two of the considered relations (serves and suitable-with)). We propose the algorithms 1 to 3 (an additional procedure is needed which combines two others) for navigating through a concept lattice family using queries. During the exploration, we fill a set X by objects that will constitute an answer at the end (at most one object for each formal context). In this section, the algorithm is presented as an automatic procedure. Its use to guide an expert in its manual exploration of the data is discussed afterwards. Algorithm 1 identifies three main cases: – line 2, the arc connects two query objects, e.g. (qcountry , has, {qrestaurant }); – line 5, the arc connects a query object to original objects e.g. (qdish , contains, {chicken, cheese, corn-tortilla}); – line 8, the arc connects a query object to another query object and to original objects e.g. (qdish , contains, {qingredient , chicken, cheese, corn-tortilla}). Each of these cases considers, for a given arc a, whether the partial answer X already contains a source object or (inclusively) a target object. When the arc connects a query object to another query object a = (q|Ks , rst , q|Kt ), (Algorithm 2), four cases are possible. – X does not contain any object for Ks and any ot for Kt : we identify the highest concept that introduces the attributes of q|Ks and we select an object in its extent (lines 3-5). Then the algorithm continues on the next conditional statement (to find a target). – X contains an object os for Ks and an object ot for Kt selected in previous steps: we just check if os owns the relational attribute pointing at the object concept introducing ot , that is γot (line 8)1 . – X contains only an object os for Ks . We should find a target. We identify, under the meet of the concepts that introduce the attributes of q|Kt , one of the lowest concepts to which os points (lines 12-14). We select a target in its extent. – X contains only an object ot for Kt . We should find a source. We identify the meet of the concepts that introduce the attributes of q|Ks and the relational attribute that points to ot (lines 20-23). We select a source in its extent. When the arc connects a query object to original objects a = (q|Ks , rst , Oq ) (Algorithm 3): – Either X contains an object for Ks and we need to check if the relational attributes confirm that this object is connected to all the original objects in Oq ) (line 4); – Or we have to select an object for Ks , owning the attributes of the query q|Ks and owning the relational attributes ending in the concepts introducing the original objects (line 9-11). The algorithm for the last case is a combination of the algorithms for the two other cases. Note that whenever a condition is not verified, we have to backtrack, this is not specified in the algorithm for sake of simplicity. If the query path forms also a directed chain in the entity-relationship diagram, the main algorithm is a depth-first search. But in the general case, in some steps, when we consider an arc, we assigned to X an object for the end of the arc, and we need to find a source object. For example, we start with the arc (qcountry , has, {qrestaurant }) where the query path begins. We have to identify a source object os satisfying the query {f r} (Definition 4). For example, we choose the object France appearing the extent of Concept4 , whose intent contains fr. We extract the relational attributes of os = F rance, having the form ∃rst C). They are in practice in the lattices denoted by r : C. For example, we obtain has:Concept 19, has:Concept 15, has:Concept 60, has:Concept 16, etc. We keep the relational attributes with the concepts satisfying the target query in the corresponding lattice and discard the rest. In our example, the qrestaurant is empty. A relational attribute with the smallest concept (Ct ) is the one to consider that leads us to find a solution. We choose Concept 15 among the available smallest concepts. Let ∃ rst Ct be the selected relational attribute (if it exists). The object ot must be in the extent of Ct . In our example, we select El Sombrero. Then we consider the query-to-query arc (qrestaurant , serves, {qdish }). Given that an object is selected for Krestaurant , we look for a possible target object, led by the query qdish = ∅ and the relational attributes owned by the object 1 We remind that γo is the object concept introduced by o. concept Concept 15 which introduces El Sombrero. Suppose we choose (line 13) a relational attribute that targets one of the minimum concepts, namely serves : Concept 23 (but serves : Concept 26 or serves : Concept 25 are also possible). This leads us to Concept 23, in the extent of which we select Enchiladas. Dealing with the next arc (qdish , contains, {chicken, cheese, corn-tortilla}) involves, since we have already selected a dish, to verify (Algorithm 3, line 4) that, the object concept γEnchiladas owns all the relational attributes that go to object concepts introducing chicken, cheese, and corn-tortilla. These are contains : γ chicken = Concept 29, contains : γ cheese = Concept 36 and contains : γ corn − tortilla = Concept 40 and they are indeed inherited by γEnchiladas = Concept 23. When the arc (qsalsa , suitable-with, {qdish }) is considered, the target (Enchi- ladas) is in X. Thus we identify a source in the extent of the Concept 47, which satisfies the target query {hot}. Its intent contains suitable − with : Concept 23 which is Enchiladas. A target object (Tomatillo-Red Chili) is selected in the extent of Concept 47. The answer is now complete. Algorithm 1: Navigate(RCF, Q, PQ ) //PQ = (ak ) | ak = rij and rij ∈ RQ Data: (K, R): an RCF; Q = (Aq , Ovq , Rq ): a query on (K, R); and a query path Result: X: an object set (answer for Q) or fail foreach arc a ∈ PQ do 1 if a = (q|Ks , rst , q|Kt ) then 2 Case pure query 3 else 4 if a = (q|Ks , rst , Oq ) with Oq ⊆ Ot then 5 Case pure objects 6 else 7 if a = (q|Ks , rst , q|Kt ) with q|Kt ∈ Oq then 8 Case query and objects 9 4.2 Variations about the algorithm Integrating queries into the contexts. One approach that was investigated in the case of simple queries consists of integrating the virtual query object in the context, then building the concept lattice. This can also be done for rela- tional queries. A relational query Q = (Aq , Ovq , Rq ) can be integrated into an RCF by adding the virtual query objects ovq|K into the context Ki . Each vir- i tual query object ov q|Ki owns the attributes of the query aq|Ki and for each arc (ovq|K , rij , ovq|K ), the relational context of rij is enriched by a line for ovq|K , i j i Algorithm 2: Case pure query Let a = (q|Ks , rst , q|Kt ) 1 if // X does not contain a source and a target for the current arc a 2 X ∩ Os = ∅ and X ∩ Ot = ∅ then // select a source in the extent of a concept that verifies the source query 3 Let Cs be the highest concept having Intent (Cs ) ⊇ q|Ks select os ∈ Extent(Cs ) 4 X ← X ∪ {os } 5 if // X contains a source and a target for the current arc a 6 X ∩ Os = {os } and X ∩ Ot = {ot } then // verify that the source is connected to the target 7 check ∃rst γot ∈ Intent(γos ) 8 else 9 if // X contains a source for the current arc a 10 X ∩ Os = {os } then // select a target in the extent of a concept that verifies the target query 11 and is connected to the source Let Ct be the highest concept having Intent (Ct ) ⊇ q|Kt 12 and Ct ∈ min(C | ∃ (∃rst C) ∈ Intent(γos )) 13 select ot ∈ Extent(Ct ) 14 X ← X ∪ {ot } 15 else 16 // X contains a target for the current arc a 17 // select a source in the extent of a concept that verifies the source query18 and is connected to the target 19 Let ot ∈ X ∩ Ot 20 Let Cs be the highest concept having Intent (Cs ) ⊇ q|Ks 21 and ∃rst γot ∈ Intent(Cs ) 22 select os ∈ Extent(Cs ) 23 X ← X ∪ {os } 24 Algorithm 3: Case pure objects Let a = (q|Ks , rst , Oq ) with Oq ⊆ Ot 1 if // X contains a source for the current arc a 2 X ∩ Os = {os } then // verify that the source is connected to the objects in Oq 3 check ∀o ∈ Oq , ∃rst γo ∈ Intent(γos ) 4 else 5 // X does not contain a possible source 6 // select a source in the extent of a concept that verifies the source query 7 // and is connected to the target objects 8 Let Cs be the highest concept having Intent (Cs ) ⊇ q|Ks 9 and ∀o ∈ Oq , ∃rst γo ∈ Intent(Cs ) 10 select os ∈ Extent(Cs ) 11 X ← X ∪ {os } 12 a column for ovq|K and the relation (ovq|K , ovq|K )2 . We generate the corre- j i j sponding concept lattice family, considering the existential scaling 3 . Locating the highest concept that introduces all the attributes of each query of each con- cerned context, now is much more easy because it introduces the virtual query object. Then, we can navigate in a similar way as before. Opportunities of browsing offered by the exploration. As we explained before, the algorithm described in the previous section can be understood as an automatic procedure to determine a solution to a query. Nevertheless, it is more interesting to use it as a guiding method for the exploration of data by a human expert. Each object selection is a departure point for inspecting the objects of the selected concept, and, explore the neighborhood, going up by relaxing constraints or going down by adding constraints. A point in favor of the lattices is that they do not only give us a solution, but they also classify the objects of the solutions and provide a navigation structure. They also carry other information about the objects which can be useful for the expert: attributes that objects of the answer set have necessarily, attributes that appear simultaneously as attributes of the answer, etc. In our Web service application, we preferred the solution which integrates the query in RCF because it was easier to identify the answers. The lattices show how the existing objects match and differ from the query, thanks to the factorization of attributes between the query and the existing objects. Nevertheless, having several queries at the same time would not be efficient. Thus, the solution has been used only for specific problems. An incremental algorithm can be used to introduce the query, which enlightens the process of modifying the lattice and highlights the structure of the data. We can keep the original lattice (before query integration), and save the query objects together with the resulting concepts in an auxiliary structure. This way, we can always easily go back to the original lattices. 5 Related Work ER-compatible data, e.g., relational databases, and concept lattices have a long history of collaboration. First attempts to apply FCA to that sort of data go back to the introduction of concept graphs by R. Wille in the mid-90s [8]. The standard approach is rooted in the translation of an ER model into a power- context family (PCF) where basically everything is represented within a formal context [9]. Thus, inter-object links of various arities (i.e., tuples of different sizes) are reified and hence become formal objects of a dedicated context (one per arity). The overall reasoning is therefore uniformly based on the formal concepts. 2 See our example in Table http://www.lirmm.fr/~huchard/RCA_queries/ mexicoExistsWithQuery.rcft.html) 3 It is represented in Figure http://www.lirmm.fr/~huchard/RCA_queries/ mexicoExistsWithQuery.rcft.svg While this brings an undeniable mathematical strength in the formalization of the data processing and, in particular, querying, there are some issues with the expressiveness. Indeed, while formal concepts are typically based on a doubly universal quantification, the relational query languages mostly apply existential one. Alternatives to the PCF in the interpretation of concept graphs have been proposed that involve the notions of nested graphs and cuts [2]. It was shown that the resulting formalism, called Nested Query Graphs, have the same expressive power over relational data as first order predicate logic and hence can be used as a visual representation of most mainstream SQL queries. Existing approaches outside the concept graphs-based paradigm (see [3, 6]) follow a more conventional coding schema. Here inter-object links are modeled either through a particular sort of formal attributes or they reside in a differ- ent binary tables that match two sorts of individuals among them (instead of matching a set of individuals against a set of properties). Our own relational concept analysis framework is akin to this second category of approaches, hence our querying mechanisms are closer in spirit to those presented in the aforemen- tioned papers. For instance, in [3], the author proposes a language modeled w.r.t. SPARQL (the query language associated with the RDF language) to query relational data within the logical concept analysis (LCA) framework. The idea is to explore the relation structure of the data, starting from a single object and following its links to other objects. The language admits advanced constructs such as negation and disjunction and therefore qualifies as a fully-fledged relational query language. Recently, a less expressive language has been proposed in [6] for the brows- ing of a relational database content while taking advantage of the underlying conceptual structure. As the author himself admits, the underlying data for- mat used to ground the language semantics, the linked context family, is only slightly different from our own relational context family construct. The queries are limited here to conjunctions and existential quantifiers, yet variables are ad- mitted. Consequently, query topologies are akin to general graphs: In actuality, the browsing engine comprises a factorization mechanism enabling the discovery of identical extensions in the query graph which are subsequently merged. The downside of remaining free of the extensive commitments made by the concept graphs formalism both in terms of syntax and of semantics is the lack of unified methodological and mathematical framework beneath this second group of approaches. As a result, these diverge on a wide range of aspects which makes their in-depth comparison a hard task. First, there is an obvious query language expressiveness gap: On that axis, the two extremes are occupied by the LCA- and the RCA-based approaches, respec- tively, the former being the most expressive and the latter, the less expressive one. Then, the role played by the concept lattices vs. the query resolution is specific in each case. While in the LCA-based approach the concepts seem to be formed on the fly, in [6] the author seems to imply that they are constructed beforehand. Despite this distinction, in both cases the concept lattice is a sec- ondary structure that supports query resolution. In our own approach however, lattices are not only constructed prior to querying, but they also incorporate relational information in the intents of their concepts. In this sense, they are the primary structures whereas the queries are intended as navigational support. 6 Conclusion In this paper, we have presented a query-based navigation approach that helps an expert to explore a concept lattice family. The approach was based on an ap- plication of Relational Concept Analysis to the selection of suitable Web services for instantiating an abstract service composition. There are many perspectives of this work. In our Web service experience, we tested other scaling operators (like the covers operator) that offers other results, and helps to find more easily the aggregate answers. The query language can be made more expressive (including quantifiers). For example, we can request dishes containing only {chicken, cheese, ...}, which means that the universal scaling operator shall be used in the RCA process for this particular relation. Besides, the query path can be calculated, rather than being defined by the expert, suggesting more efficient exploration paths. References 1. Azmeh, Z., Driss, M., Hamoui, F., Huchard, M., Moha, N., Tibermacine, C.: Selec- tion of composable web services driven by user requirements. In: ICWS. pp. 395–402. IEEE Computer Society (2011) 2. Dau, F., Correia, J.H.: Nested concept graphs: Applications for databases and math- ematical foundations. In: Contribution to ICCS 2003. Skaker Verlag (2003) 3. Ferré, S.: Conceptual navigation in RDF graphs with SPARQL-Like Queries. In: Kwuida, L., Sertkaya, B. (eds.) ICFCA. LNCS, vol. 5986, pp. 193–208. Springer (2010) 4. Ganter, B., Wille, R.: Formal Concept Analysis, Mathematical Foundations. Sprin- ger-Verlag (1999) 5. Huchard, M., Rouane-Hacene, M., Roume, C., Valtchev, P.: Relational concept dis- covery in structured datasets. Ann. Math. Artif. Intell. 49(1-4), 39–76 (2007) 6. Kötters, J.: Object configuration browsing in relational databases. In: Valtchev, P., Jäschke, R. (eds.) ICFCA. Lecture Notes in Computer Science, vol. 6628, pp. 151–166. Springer (2011) 7. Messai, N., Devignes, M.D., Napoli, A., Smaı̈l-Tabbone, M.: Querying a bioin- formatic data sources registry with concept lattices. In: Dau, F., Mugnier, M.L., Stumme, G. (eds.) ICCS. LNCS, vol. 3596, pp. 323–336. Springer (2005) 8. Wille, R.: Conceptual graphs and formal concept analysis. In: Lukose, D., Delugach, H.S., Keeler, M., Searle, L., Sowa, J.F. (eds.) ICCS. Lecture Notes in Computer Science, vol. 1257, pp. 290–303. Springer (1997) 9. Wille, R.: Formal concept analysis and contextual logic. In: Hitzler, P., Scharfe, H. (eds.) Conceptual Structures in Practice. pp. 137–173. Chapman and Hall/CRC (2009)