Adaptation knowledge discovery for cooking using closed itemset extraction

Emmanuelle Gaillard, Jean Lieber, and Emmanuel Nauer
LORIA (UMR 7503—CNRS, INRIA, Nancy University), BP 239, 54506 Vandœuvre-lès-Nancy, France
First-Name.Last-Name@loria.fr

Abstract. This paper is about adaptation knowledge (AK) discovery for the Taaable system, a case-based reasoning system that adapts cooking recipes to user constraints. The AK comes from the interpretation of closed itemsets (CIs) whose items correspond to the ingredients that have to be removed, kept, or added. An original approach is proposed for building the context on which CI extraction is performed. This approach focuses on a restrictive selection of objects and on a specific ranking based on the form of the CIs. Several experiments are proposed in order to improve the quality of the extracted AK and to decrease the computation time. This chain of experiments can be seen as an iterative knowledge discovery process: the analysis following each experiment leads to a more sophisticated experiment, until some concrete and useful results are obtained.

Keywords: adaptation knowledge discovery, closed itemset, data preprocessing, case-based reasoning, cooking.

1 Introduction

This paper addresses the adaptation challenge proposed by the Computer Cooking Contest (http://computercookingcontest.net/), which consists in adapting a given cooking recipe to specific constraints. For example, the user wants to adapt a strawberry pie recipe because she has no strawberries. The underlying question is: which ingredient(s) should the strawberries be replaced with?

Adapting a recipe by substituting some ingredients with others requires cooking knowledge, and adaptation knowledge in particular. Taaable, a case-based reasoning (CBR) system, addresses this problem using an ingredient ontology. This ontology is used to search for the ingredient(s) closest to the one that has to be replaced. In this approach, the notion of "being close to" is given by the distance between ingredients in the ontology. In the previous example, Taaable proposes to replace the strawberries with other berries (e.g. raspberries, blueberries, etc.). However, this approach is limited, because two ingredients which are close in the ontology are not necessarily interchangeable, and because introducing a new ingredient in a recipe may be incompatible with some other ingredient(s) of the recipe or may require adding other ingredients.

This paper extends the approach proposed in [2] for extracting this kind of adaptation knowledge (AK). The approach is based on closed itemset (CI) extraction, in which items are the ingredients that have to be removed, kept, or added in order to adapt the recipe. This paper introduces two original contributions. The first one concerns the way the binary context on which CI extraction is performed is built, by focusing on a restrictive selection of objects according to the objectives of the knowledge discovery process. The second one concerns the way the CIs are filtered and ranked, according to their form.

The paper is organised as follows: Section 2 specifies the problem in its whole context and introduces Taaable, which will integrate the discovered AK in its reasoning process. Section 3 gives preliminaries for this work, introducing CI extraction, case-based reasoning, and related work. Section 4 explains our approach; several experiments and evaluations are described and discussed.
2 Context and motivations

2.1 Taaable

The Computer Cooking Contest is an international contest that aims at comparing systems that make inferences about cooking. A candidate system has to use the recipe base given by the contest to propose a recipe matching the user query. This query is a set of constraints such as inclusion or rejection of ingredients, the type or the origin of the dish, and the compatibility with some diets (vegetarian, nut-free, etc.).

Taaable [1] is a system that was originally designed as a candidate for the Computer Cooking Contest. It is also used as a brain teaser for research in knowledge-based systems, including knowledge discovery, ontology engineering, and CBR. Like many CBR systems, Taaable uses an ontology to retrieve the recipes that are the most similar to the query. Taaable retrieves and creates cooking recipes by adaptation. If there exist recipes exactly matching the query, they are returned to the user; otherwise the system retrieves similar recipes (i.e. recipes that partially match the target query) and adapts them, creating new ones. Searching for similar recipes is guided by several ontologies, i.e. hierarchies of classes (ingredient hierarchy, dish type hierarchy, and dish origin hierarchy), in order to relax constraints by generalising the user query. The goal is to find the most specific generalisation of the query (the one with the minimal cost) for which recipes exist in the recipe base. Adaptation consists in substituting some ingredients of the retrieved recipes with the ones required by the query. So, Taaable retrieves recipes using query generalisation, then adapts them by substitution. This section gives a simplified description of the Taaable system; for more details about the Taaable inference engine, see e.g. [1]. For example, to adapt the "My Strawberry Pie" recipe to the constraint "no Strawberry", the system first generalises Strawberry into Berry, then specialises Berry into, say, Raspberry.

2.2 Domain ontology

An ontology O defines the main classes and relations relevant to cooking. O is a set of atomic classes organised into several hierarchies (ingredient, dish type, dish origin, etc.). Given two classes B and A of this ontology, A is subsumed by B, denoted by B ⊒ A, if the set of instances of A is included in the set of instances of B. For instance, Berry ⊒ Blueberry and Berry ⊒ Raspberry.

2.3 Taaable adaptation principle

Let R be a recipe and Q be a query such that R does not exactly match Q (otherwise, no adaptation would be needed). For example, Q = "no Strawberry" and R = "My Strawberry Pie". The basic ontology-driven adaptation in Taaable follows the generalisation/specialisation principle explained hereafter (in a simplified way). First, R is generalised (in a minimal way) into Γ(R) that matches Q. For example, Γ may be the substitution Strawberry ⇝ Berry. Second, Γ(R) is specialised into Σ(Γ(R)) that still matches Q. For example, Σ is the substitution Berry ⇝ Raspberry (the class Berry is too abstract for a recipe and must be made precise). This adaptation approach has at least two limits. First, the choice of Σ is made at random: there is no reason to choose raspberries instead of blueberries, unless additional knowledge is given. Second, when such an ingredient substitution is made, it may occur that some ingredients should be added to or removed from R. These limits point out the usefulness of additional knowledge for adaptation.
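To make this generalisation/specialisation principle concrete, here is a minimal Python sketch, assuming a toy ingredient hierarchy reduced to a parent map; it is not Taaable's ontology or inference engine. It also exhibits the first limit mentioned above: the specialisation step picks an arbitrary sibling of the removed ingredient.

```python
# Toy sketch of the generalisation/specialisation principle (Section 2.3).
# The hierarchy, recipe and choice strategy are illustrative only.

PARENT = {  # child -> direct superclass in the (toy) ingredient hierarchy
    "Strawberry": "Berry",
    "Raspberry": "Berry",
    "Blueberry": "Berry",
    "Berry": "Fruit",
}
CHILDREN = {}
for child, parent in PARENT.items():
    CHILDREN.setdefault(parent, []).append(child)

def adapt(recipe_ingredients, forbidden):
    """Generalise the forbidden ingredient into its superclass, then
    specialise that class into some admissible sibling."""
    generalisation = PARENT[forbidden]                 # e.g. Strawberry -> Berry
    candidates = [c for c in CHILDREN[generalisation] if c != forbidden]
    substitute = candidates[0]                         # arbitrary choice: the limit discussed above
    return [substitute if ing == forbidden else ing for ing in recipe_ingredients]

recipe = ["Sugar", "Water", "Strawberry", "PieCrust", "Cornstarch", "CoolWhip"]
print(adapt(recipe, "Strawberry"))   # Strawberry replaced by an arbitrary Berry (here Raspberry)
```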
3 Preliminaries

3.1 Itemset extraction

Itemset extraction is a family of data-mining methods for extracting regularities from data, by aggregating object items that appear together. Like FCA [8], itemset extraction algorithms start from a formal context K, defined by K = (G, M, r), where G is a set of objects, M is a set of items, and r is the relation on G × M stating that an object is described by an item [8]. Table 1 shows an example of context, in which recipes are described by the ingredients they require: G is a set of 5 objects (the recipes R, R1, R2, R3, and R4) and M is a set of items (the ingredients Sugar, Water, Strawberry, etc.).

[Table 1. Formal context representing ingredients used in recipes: one row per recipe R, R1, R2, R3, R4 and one column per ingredient (Sugar, Water, Strawberry, PieCrust, Cornstarch, CoolWhip, Raspberry, etc.); a cross marks that the recipe requires the ingredient. The original table layout could not be recovered from the source.]

An itemset I is a set of items, and the support of I, support(I), is the number of objects of the formal context having every item of I. I is frequent, with respect to a threshold σ, whenever support(I) ≥ σ. I is closed if it has no proper superset J (I ⊊ J) with the same support. For example, {Sugar, Raspberry} is an itemset and support({Sugar, Raspberry}) = 2, because 2 recipes require both Sugar and Raspberry. However, {Sugar, Raspberry} is not a closed itemset, because {Sugar, PieCrust, Raspberry} has the same support. Another, equivalent, definition of closed itemsets can be given on the basis of a closure operator (·)″ defined as follows. Let I be an itemset and I′ be the set of objects that have all the items of I: I′ = {x ∈ G | ∀i ∈ I, x r i}. In a dual way, let X be a set of objects and X′ be the set of properties shared by all the objects of X: X′ = {i ∈ M | ∀x ∈ X, x r i}. This defines two operators: (·)′ : I ∈ 2^M ↦ I′ ∈ 2^G and (·)′ : X ∈ 2^G ↦ X′ ∈ 2^M. These operators can be composed into an operator (·)″ : I ∈ 2^M ↦ I″ ∈ 2^M. An itemset I is said to be closed if it is a fixed point of (·)″, i.e., I″ = I. In the following, "CIs" stands for closed itemsets, and "FCIs" stands for frequent CIs. For σ = 3, the FCIs of this context are {Sugar, PieCrust}, {Sugar, PieCrust, Cornstarch}, {Sugar, Water}, {Sugar}, {Water}, {PieCrust}, and {Cornstarch}. For the following experiments, the Charm algorithm [12], which efficiently computes the FCIs, is used through Coron, a software platform implementing a rich set of algorithmic methods for symbolic data mining [11].
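The following Python sketch illustrates these definitions (support, closure operator, closed itemsets) by brute force on a tiny fragment of the running example; it only mirrors the definitions above and is not the Charm algorithm or the Coron platform used in the experiments.

```python
from itertools import combinations

# Two objects of the running example: R and a recipe R1 whose ingredients can be
# read off variation (1) in Section 3.3. A real context has many more objects.
context = {
    "R":  {"Sugar", "Water", "Strawberry", "PieCrust", "Cornstarch", "CoolWhip"},
    "R1": {"Sugar", "PieCrust", "Cornstarch", "Raspberry", "Gelatin"},
}

def support(itemset, ctx):
    """Number of objects described by every item of the itemset."""
    return sum(1 for items in ctx.values() if itemset <= items)

def closure(itemset, ctx):
    """I'' : items shared by all the objects whose description contains I."""
    extent = [items for items in ctx.values() if itemset <= items]
    return set.intersection(*extent) if extent else set()

def closed_itemsets(ctx, sigma=1):
    """All itemsets I with support(I) >= sigma and I'' = I (frequent closed itemsets)."""
    items = sorted(set().union(*ctx.values()))
    result = []
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            i = set(combo)
            if support(i, ctx) >= sigma and closure(i, ctx) == i:
                result.append(i)
    return result

for ci in closed_itemsets(context):
    print(sorted(ci), support(ci, context))
```

On this two-object fragment, the only closed itemsets are the two full recipe descriptions (support 1) and their intersection {Cornstarch, PieCrust, Sugar} (support 2).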
3.2 Case-based reasoning

Case-based reasoning (CBR [10]) consists in answering queries with the help of previous experience units called cases. In Taaable, a case is a recipe and a query represents user constraints. In many systems, including Taaable, CBR consists in the retrieval of a case from the case base and in the adaptation of the retrieved case into an adapted case that solves the query. Retrieval in Taaable is performed by minimal generalisation of the query (cf. Section 2.3). Adaptation can be a simple substitution (e.g., substitute strawberry with any berry), but it can be improved thanks to the use of some domain-specific AK. This motivates the research on AK acquisition.

3.3 Related work

AK may be acquired in various ways. It may be collected from experts [6], it may be acquired using machine learning techniques [9], or it may be acquired semi-automatically, using data-mining techniques and knowledge discovery principles [3,4]. This paper addresses automatic AK discovery. Previous works, such as those proposed by d'Aquin et al. with the Kasimir project in the medical domain [5] and by Badra et al. in the context of a previous work on Taaable [2], are the foundations of our work. Kasimir is a CBR system applied to decision support for breast cancer treatment. In Kasimir, a case is a treatment used for a given patient. The patient is described by characteristics (age, tumour size and location, etc.) and the treatment consists in applying medical instructions. In order to discover AK, cases that are similar to the target case are first selected. Then, FCIs are computed on the variations between the target case and the similar cases. FCIs matching a specific form are interpreted for generating AK [5].

Badra et al. use this approach to make cooking adaptations in Taaable [2]. Their work aims at comparing pairs of recipes depending on the ingredients they contain. A recipe R is represented by the set of its ingredients, Ingredients(R). For example, the recipe "My Strawberry Pie" is represented by

Ingredients("My Strawberry Pie") = {Sugar, Water, Strawberry, PieCrust, Cornstarch, CoolWhip}

Let (R, R′) be a selected pair of recipes. According to [2], the representation of a pair is denoted by ∆, where ∆ represents the variation of ingredients between R and R′. Each ingredient ing is marked by −, =, or +:

– ing− ∈ ∆ if ing ∈ Ingredients(R) and ing ∉ Ingredients(R′), meaning that ing appears in R but not in R′.
– ing+ ∈ ∆ if ing ∉ Ingredients(R) and ing ∈ Ingredients(R′), meaning that ing appears in R′ but not in R.
– ing= ∈ ∆ if ing ∈ Ingredients(R) and ing ∈ Ingredients(R′), meaning that ing appears both in R and in R′.

Building a formal context about ingredient variations in cooking recipes. Suppose we want to compare the recipe R with the four recipes (R1, R2, R3, R4) given in Table 1. Variations between R = "My Strawberry Pie" and a recipe Ri have the form of a conjunction of marked ingredients ⋀_j ing_{i,j}^{mark_{i,j}}. For example:

∆R,R1 = Sugar= ∧ Water− ∧ Strawberry− ∧ PieCrust= ∧ Cornstarch= ∧ CoolWhip− ∧ Raspberry+ ∧ Gelatin+   (1)

According to these variations, a formal context K = (G, M, I) can be built (cf. Table 2 for the running example):

– G = {∆R,Ri}_i;
– M is the set of ingredient variations: M = {ing_{i,j}^{mark_{i,j}}}_{i,j}. In particular, M contains all the conjuncts of ∆R,R1 (Strawberry−, etc., cf. (1));
– (g, m) ∈ I if g ∈ G, m ∈ M, and m is a conjunct of g; for example, (∆R,R1, Strawberry−) ∈ I.

[Table 2. Formal context for ingredient variations in pairs of recipes (R, Ri): one row per variation ∆R,R1, ..., ∆R,R4 and one column per marked ingredient (Sugar=, Water−, Strawberry−, PieCrust=, Cornstarch=, CoolWhip−, Raspberry+, Gelatin+, etc.); a cross marks that the marked ingredient is a conjunct of the variation. The original table layout could not be recovered from the source.]

Interpretation. In the formal context, an ingredient marked with + (resp. −) is an ingredient that has to be added (resp. removed). An ingredient marked with = is an ingredient common to R and Ri.
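The construction of a variation ∆R,R′ can be sketched as follows in Python; the function and variable names are ours, and the marked ingredients are encoded as plain strings ending with '-', '=' or '+'.

```python
def delta(R, R_prime):
    """Variation of ingredients from R to R', following [2]: each ingredient is
    marked '-', '=' or '+' according to where it occurs."""
    R, R_prime = set(R), set(R_prime)
    marks  = {ing + "-" for ing in R - R_prime}          # removed
    marks |= {ing + "=" for ing in R & R_prime}          # kept
    marks |= {ing + "+" for ing in R_prime - R}          # added
    return marks

R  = {"Sugar", "Water", "Strawberry", "PieCrust", "Cornstarch", "CoolWhip"}
R1 = {"Sugar", "PieCrust", "Cornstarch", "Raspberry", "Gelatin"}

print(sorted(delta(R, R1)))
# ['CoolWhip-', 'Cornstarch=', 'Gelatin+', 'PieCrust=', 'Raspberry+',
#  'Strawberry-', 'Sugar=', 'Water-']   (cf. variation (1))

# The variation context (Table 2) has one object delta(R, Ri) per selected
# recipe Ri and one attribute per marked ingredient occurring in some variation.
variation_context = {"dR,R1": delta(R, R1)}   # ... one entry per recipe pair
```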
4 Adaptation knowledge discovery

AK discovery is based on the same scheme as knowledge discovery in databases (KDD [7]). The main steps of the KDD process are data preparation, data mining, and interpretation of the extracted units of information. Data preparation relies on formatting data for being used by data-mining tools and on filtering operations for focusing on special subsets of objects and/or items, according to the objectives of KDD. Data-mining tools are applied for extracting regularities from the data. These regularities then have to be interpreted; filtering operations may also be performed at this step, because of the (often) huge size of the data-mining results or of the noise included in these results. All the steps are guided by an analyst.

The objective of our work is the extraction of some AK useful for adapting a given recipe to a query. The work presented in the following focuses on filtering operations, in order to extract the cooking adaptations from a formal context encoding ingredient variations between pairs of recipes. The database used as entry point of the process is the Recipe Source database (http://www.recipesource.com/), which contains 73795 cooking recipes. For the sake of simplicity, we consider in the following the problem of adapting R by substituting one or several ingredient(s) with one or several ingredient(s) (but the approach can be generalised to removing more ingredients, and can also be used for adding ingredient(s) to a recipe). Three experiments are presented; they address the same adaptation problem: adapting the recipe R = "My Strawberry Pie", with Ingredients("My Strawberry Pie") = {Sugar, Water, Strawberry, PieCrust, Cornstarch, CoolWhip}, to the query "no Strawberry". In each experiment, a formal context about ingredient variations in recipes is built; then, FCIs are extracted and filtered for proposing cooking adaptations. The first two experiments focus on object filtering, selecting recipes which are more and more similar to the "My Strawberry Pie" recipe: the first experiment uses recipes of the same dish type (i.e. pie dish) as "My Strawberry Pie" instead of recipes of any type; the second experiment applies a more precise filtering based on the similarity between the "My Strawberry Pie" recipe and the recipes used for generating the formal context on ingredient variations.

4.1 A first approach with closed itemsets

As introduced in [2], a formal context is defined, where objects are ordered pairs of recipes (R, R′) and properties are ingredients marked with +, =, or − for representing the ingredient variations from R to R′. The formal context which is built is similar to the example given in Table 2. In each pair of recipes, the first element is the recipe R = "My Strawberry Pie" that must be adapted; the second element is a recipe of the same dish type as R which, moreover, does not contain the ingredient that has to be removed. In our example, this corresponds to pie dish recipes which do not contain strawberry. From this formal context, CIs are computed and then interpreted in order to acquire adaptation rules.

Experiment. 3653 pie dish recipes that do not contain strawberry are found in the Recipe Source database. The formal context, with 3653 objects × 1355 items, produces 107,837 CIs (no minimal support is used).

Analysis. Some interesting CIs can be found. For example, {PieCrust−, Strawberry−, Cornstarch−, CoolWhip−, Water−, Sugar−}, with a support of 1657, contains all the ingredients of R with a − mark, meaning that there are 1657 recipes which have no common ingredient with the recipe R. In the same way, {PieCrust−, Strawberry−, Cornstarch−, CoolWhip−, Water−}, with a support of 2590, means that 2590 recipes share only the Sugar ingredient with R, because sugar is the sole ingredient of R which is not included in this CI. The same analysis can be done for {PieCrust−, Strawberry−, Cornstarch−, CoolWhip−, Sugar−} (support of 1900), for water, etc.

Conclusion. The CIs are too numerous to be presented to the analyst. Only 1996 of the 3653 pie dish recipes without strawberry share at least one ingredient with R; there are too many recipes with nothing in common with R. A first filter can be used to limit the number of objects of the formal context.
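The object filter mentioned in this conclusion, which is applied in the next experiment, can be sketched as follows; the recipe base is given a hypothetical in-memory representation (the actual Recipe Source data and the Taaable pipeline are not shown).

```python
def candidate_recipes(recipe_base, R_ingredients, dish_type, forbidden):
    """Object filter used to build the variation context: keep only the recipes
    of the same dish type as R, not containing the forbidden ingredient, and
    sharing at least one ingredient with R. `recipe_base` maps a recipe name to
    a (dish_type, set of ingredients) pair -- a hypothetical representation."""
    R_ingredients = set(R_ingredients)
    selected = {}
    for name, (dtype, ingredients) in recipe_base.items():
        if dtype != dish_type or forbidden in ingredients:
            continue
        if not (R_ingredients & ingredients):     # no common ingredient with R
            continue
        selected[name] = ingredients
    return selected

# Example call (toy recipe base):
# base = {"My Raspberry Pie": ("pie dish", {"Sugar", "PieCrust", "Raspberry"}), ...}
# candidate_recipes(base, R, "pie dish", "Strawberry")
```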
4.2 Filtering recipes with at least one common ingredient

Experiment. The formal context, with 1996 objects × 813 items, produces 22,408 CIs (no minimal support is used), ranked by decreasing support.

Results. The top five FCIs are:
– {Strawberry−} with a support of 1996;
– {Strawberry−, CoolWhip−} with a support of 1916;
– {Strawberry−, PieCrust−} with a support of 1757;
– {Strawberry−, PieCrust−, CoolWhip−} with a support of 1679;
– {Strawberry−, Cornstarch−} with a support of 1631.

Analysis. Several observations can be made. The first FCI containing an ingredient marked by + ({Strawberry−, Egg+}, with a support of 849) appears only at the 46th position. Moreover, there are 45 FCIs with one ingredient marked by + among the first 100 FCIs, and no FCI with more than one ingredient marked by +. A substituting ingredient ing can only be found in CIs containing ing+, meaning that there exists a recipe containing ing which is not in R. So, FCIs that do not contain any + mark cannot be used for finding a substitution proposition, and they are numerous among the first 100 FCIs of a support-based ranking (recall that adaptation by simply removing an ingredient is not considered here). Among the first 100 FCIs, there are only 15 FCIs containing both an ingredient marked by + and an ingredient marked by =. In an FCI I, the = mark on an ingredient ing means that ing is common to R and to the recipe(s) involved in the creation of I. So, an ingredient marked by = guarantees a certain similarity (based on the ingredients that are used) between the recipes R and R′ compared by ∆R,R′. If an FCI I contains a potential substituting ingredient, marked by +, but does not contain any =, the risk in proposing a cooking adaptation from I is very high, because the recipe the potential substituting ingredient comes from has no common ingredient with R. In the first 100 FCIs, the only potential substituting ingredients (i.e. the ingredients marked by +) are egg, salt, and butter, which are not satisfactory from a cooking viewpoint for substituting the strawberries. We have conducted similar experiments with other recipes R and other queries, and the same observations can be made.

Conclusion. From these observations, it can be concluded that a ranking based on support alone is not efficient for finding relevant cooking adaptation rules, because the most frequent CIs do not contain potential substituting ingredients and, moreover, have no common ingredient with R.

4.3 Filtering and ranking CIs according to their form

To extract realistic adaptations, CIs with a maximum of ingredients marked by = are searched for. We consider that a substitution is acceptable if 50% of the ingredients of R are preserved and if the adaptation does not introduce too many ingredients; we also limit the number of introduced ingredients to 50% of the initial number of ingredients of R. For the experiment with R = "My Strawberry Pie", which initially contains 6 ingredients, this means that at least 3 ingredients must be preserved and at most 3 ingredients can be added. In terms of CIs, this corresponds to CIs containing at least 3 ingredients marked with = and at most 3 ingredients marked with +.

Experiment. Using this filter on the CIs produced by the previous experiment reduces the number of CIs to 505.
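A possible sketch of this form-based filter, reusing the string encoding of marked ingredients used in the previous sketches (the function name and the threshold parameters are ours):

```python
def acceptable_form(ci, nb_ingredients_of_R, keep_ratio=0.5, add_ratio=0.5):
    """Filter of Section 4.3: keep a CI only if it preserves at least keep_ratio
    of R's ingredients ('=' marks) and introduces at most add_ratio * |R| new
    ones ('+' marks). With the 6-ingredient recipe R, this means >= 3 '=' marks
    and <= 3 '+' marks."""
    nb_kept  = sum(1 for item in ci if item.endswith("="))
    nb_added = sum(1 for item in ci if item.endswith("+"))
    return (nb_kept >= keep_ratio * nb_ingredients_of_R
            and nb_added <= add_ratio * nb_ingredients_of_R)

# filtered = [ci for ci in all_cis if acceptable_form(ci, 6)]
```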
However, because some CIs are more relevant than others, they must be ranked according to several criteria. We use the following rules, given in priority order:

1. A CI must contain a + in order to provide a potential substituting ingredient.
2. A CI which has more = marks than another one is more relevant. This criterion promotes the pairs which have a larger set of common ingredients.
3. A CI which has fewer − marks than another one is more relevant. This criterion promotes adaptations which remove fewer ingredients.
4. A CI which has fewer + marks than another one is more relevant. This criterion promotes adaptations which add fewer ingredients.
5. If two CIs cannot be ranked according to the 4 criteria above, the more frequent CI is considered to be the more relevant.

Results. The first 5 CIs ranked according to the previous criteria are:
– {Water=, Sugar=, Strawberry−, CoolWhip−, Cornstarch=, PieCrust=, Salt+} with a support of 5;
– {Water=, Sugar=, Strawberry−, CoolWhip−, Cornstarch=, PieCrust=, LemonJuice+} with a support of 4;
– {Water=, Sugar=, Strawberry−, CoolWhip−, Cornstarch=, PieCrust=, LemonJuice+, CreamCheese+} with a support of 2;
– {Water=, Sugar=, Strawberry−, CoolWhip−, Cornstarch=, PieCrust=, LemonJuice+, WhippingCream+} with a support of 2;
– {Water=, Sugar=, Strawberry−, CoolWhip−, Cornstarch=, PieCrust=, LemonJuice+, LemonPeel+} with a support of 2.

Analysis. One can observe that potential substituting ingredients appear in the first 5 CIs and that each CI preserves 4 (of 6) ingredients. The low supports of these CIs confirm that searching for frequent CIs is not compatible with our need, which is to extract CIs with a specific form.

Conclusion. Ranking the CIs according to our particular criteria is more efficient than using a support-based ranking. This kind of ranking can also be seen as a filter on CIs. However, this approach requires computing all the CIs, because the support of the interesting CIs is low.

4.4 More restrictive formal context building according to the form of interesting CIs

The computation time can be improved by applying a more restrictive selection of recipe pairs at the formal context building step, which drastically decreases the size of the formal context. Indeed, as the expected form of the CIs is known, recipe pairs that cannot produce CIs of the expected form can be removed. This can also be seen as a selection of the recipes that are similar enough to R. R′ is considered similar enough to R if at least a proportion σ= = 50% of the ingredients of R also occur in R′ (cf. (2)) and if the number of ingredients of R′ that are not used in R, relative to the number of ingredients of R, is at most σ+ = 50% (cf. (3)). These two conditions express for ∆R,R′ the same similarity conditions as those considered on CIs in Section 4.3.

|Ingredients(R) ∩ Ingredients(R′)| / |Ingredients(R)| ≥ σ=   (2)

|Ingredients(R′) \ Ingredients(R)| / |Ingredients(R)| ≤ σ+   (3)

Experiment. Among the 1996 pie dish recipes not containing strawberry, only 20 recipes satisfy these two conditions. The formal context, with 20 objects × 40 items, produces only 21 CIs (no minimal support is used).
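Both the five-criteria ranking of Section 4.3 and the recipe-pair pre-filter defined by conditions (2) and (3) can be sketched as follows; the function names are ours, and the CIs are assumed to be given together with their support.

```python
def similar_enough(R, R_prime, sigma_eq=0.5, sigma_plus=0.5):
    """Recipe-pair pre-filter of Section 4.4, i.e. conditions (2) and (3)."""
    R, R_prime = set(R), set(R_prime)
    kept  = len(R & R_prime) / len(R)        # proportion of R's ingredients kept
    added = len(R_prime - R) / len(R)        # new ingredients, relative to |R|
    return kept >= sigma_eq and added <= sigma_plus

def ranking_key(ci_and_support):
    """Sort key implementing criteria 2-5 of Section 4.3 (criterion 1, the
    presence of at least one '+', is assumed to be enforced beforehand as a
    filter). Python sorts in ascending order, so maximised counts are negated."""
    ci, support = ci_and_support
    nb_eq    = sum(1 for item in ci if item.endswith("="))
    nb_minus = sum(1 for item in ci if item.endswith("-"))
    nb_plus  = sum(1 for item in ci if item.endswith("+"))
    return (-nb_eq, nb_minus, nb_plus, -support)

# ranked = sorted(cis_with_support, key=ranking_key)
```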
Results. The first 5 CIs satisfying the form introduced in the previous section, ranked according to the same criteria, are:
– {Water=, Sugar=, Cornstarch=, PieCrust=, Strawberry−, CoolWhip−, RedFoodColoring+, Cherry+} with a support of 1;
– {Water=, Sugar=, Cornstarch=, PieCrust−, Strawberry−, CoolWhip−, PieShell+} with a support of 6;
– {Water=, Sugar=, Cornstarch=, PieCrust−, Strawberry−, CoolWhip−, Raspberry+} with a support of 3;
– {Water=, Sugar−, Cornstarch=, PieCrust=, Strawberry−, CoolWhip−, Apple+, AppleJuice+} with a support of 3;
– {Water=, Sugar=, Cornstarch=, PieCrust−, Strawberry−, CoolWhip−, Peach+, PieShell+} with a support of 2.

Analysis. According to these CIs, the first potential substituting ingredients are RedFoodColoring, Cherry, PieShell, Raspberry, Apple, and Peach. Each CI preserves 3 or 4 (of 6) ingredients, and two CIs add 2 ingredients.

Conclusion. This approach reduces the computation time without reducing the quality of the results. Moreover, it yields the best potential adaptations among the first CIs.

4.5 From CIs to adaptation rules

As Taaable must propose a recipe adaptation, the CIs containing potential substituting ingredients must be transformed: a CI does not represent a direct cooking adaptation. For example, the third CI of the last experiment contains Raspberry+ simultaneously with CoolWhip− and PieCrust−. Removing the pie crust (i.e. PieCrust−) can look surprising for a pie dish, but one must keep in mind that a CI does not correspond to a real recipe, but to an abstraction of the variations between R and a set of recipes. So, producing a complete adaptation requires going back to the ∆R,Ri in order to get all the ingredient variations that will take part in the adaptation. For example, for the CI {Water=, Sugar=, Cornstarch=, PieCrust−, Strawberry−, CoolWhip−, Raspberry+}, the ∆R,Ri (with i ∈ [1; 3]) are the ones given in Table 3.

Table 3. Formal context for ingredient variations in the pairs of recipes (R, Ri) supporting this CI. Its three rows are:
– ∆R,R1 = Water= ∧ Sugar= ∧ Cornstarch= ∧ PieCrust− ∧ Strawberry− ∧ CoolWhip− ∧ Raspberry+ ∧ Gelatin+ ∧ GCPieCrust+;
– ∆R,R2 = Water= ∧ Sugar= ∧ Cornstarch= ∧ PieCrust− ∧ Strawberry− ∧ CoolWhip− ∧ Raspberry+ ∧ FoodColor+ ∧ PieShell+;
– ∆R,R3 = Water= ∧ Sugar= ∧ Cornstarch= ∧ PieCrust− ∧ Strawberry− ∧ CoolWhip− ∧ Raspberry+ ∧ PieShell+.

The adaptation rules extracted from these 3 recipe variations are:
– {CoolWhip, PieCrust, Strawberry} ⇝ {Gelatin, GCPieCrust, Raspberry};
– {CoolWhip, PieCrust, Strawberry} ⇝ {FoodColor, PieShell, Raspberry};
– {CoolWhip, PieCrust, Strawberry} ⇝ {PieShell, Raspberry}.

For R2 and R3, PieShell is added in replacement of PieCrust; in R1, GCPieCrust plays the role of PieCrust. These three recipe variations propose to replace Strawberry with Raspberry. For R1 (resp. R2), Gelatin (resp. FoodColor) is also added. Finally, the three recipe variations propose to remove the CoolWhip.

Our approach guarantees ingredient compatibility, under the assumption that the recipe base used for the adaptation rule extraction process contains only good recipes, i.e. recipes which do not contain ingredient incompatibilities. Indeed, as the adaptation rules are extracted from real recipes, good combinations of ingredients are preserved. So, when introducing a new ingredient ing1 (marked by ing1+), removing another ingredient ing2 (marked by ing2−) may be required: the reason is that no recipe involved in the creation of the CI from which the adaptation rules are extracted uses both ing1 and ing2. In the same way, adding a supplementary ingredient ing3 (marked by ing3+) in addition to ing1 is obtained from recipes which use both ing1 and ing3.
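A sketch of this last step, turning a retained CI and its supporting variations into adaptation rules of the form {removed ingredients} ⇝ {added ingredients}; as above, the string encoding of marked ingredients is ours.

```python
def adaptation_rules(ci, variations):
    """Turn a retained CI into concrete adaptation rules (Section 4.5): for every
    variation delta(R, Ri) supporting the CI, the ingredients marked '-' are
    removed and the ingredients marked '+' are added. `variations` maps a recipe
    name to its set of marked ingredients."""
    rules = []
    for name, marks in variations.items():
        if not set(ci) <= marks:        # keep only the variations supporting the CI
            continue
        removed = {item[:-1] for item in marks if item.endswith("-")}
        added   = {item[:-1] for item in marks if item.endswith("+")}
        rules.append((removed, added))
    return rules

# With the third variation of Table 3, for example, the CI of the running example
# yields the rule ({'CoolWhip', 'PieCrust', 'Strawberry'}, {'PieShell', 'Raspberry'}).
```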
Applying FCA to these ∆R,Ri produces the concept lattice presented in Fig. 1, in which the top node is the retained CI. This node can be seen as a generic cooking adaptation, and navigating down the lattice leads to more specific adaptations. The KDD loop is thus closed: after (1) selecting and formatting the data, (2) applying a data-mining CI extraction algorithm, and (3) interpreting the results, a new set of data is selected, on which a data-mining algorithm (here, FCA) can then be applied.

[Fig. 1. The lattice computed on the formal context given in Table 3.]

We have chosen to return to the user the adaptation rules generated from the first 5 CIs. So, the system proposes results where Strawberry could be replaced (in addition to some other ingredient additions or removals) by "RedFoodColoring and Cherry", by Raspberry with optional Gelatin or FoodColor, by Peach with optional FoodColor or LemonJuice, by "HeavyCream and LemonRind", or by "Apple and AppleJuice".

5 Conclusion

This paper shows how adaptation knowledge can be extracted efficiently for addressing a cooking adaptation challenge. Our approach focuses on CIs with a particular form, because the support is not a good ranking measure for this problem. A ranking method based on 5 criteria explicitly specified for this adaptation problem is proposed; the support is used in addition, to distinguish CIs which satisfy the 5 criteria in the same way.

Beyond the application domain, this study points out that KD is not only a data-mining issue: the preparation and interpretation steps are also important. Moreover, it highlights the iterative nature of KD: starting from a first experiment with few a priori assumptions about the form of the results, which are too numerous to be interpreted, it arrives at an experiment with a precise aim that gives results that are easy to interpret as adaptation rules.

It has been argued in this paper that this approach is better than the basic adaptation approach (based on substituting an ingredient with another one, on the basis of the ontology), in that it avoids some ingredient incompatibilities and makes some specialisation choices. However, a careful study remains to be done in order to compare these approaches experimentally.

A short-term future work is to integrate this AK discovery into the online Taaable system, following the principles of opportunistic KD [2]. A mid-term future work consists in using the ontology during the KD process. The idea is to add new items, deduced thanks to the ontology (e.g. the properties Cream− and Milk+ entail the variation Dairy=). First experiments have already been conducted, but they raise interpretation difficulties. Indeed, the extracted CIs contain abstract terms (such as Dairy= or Flavoring+) that are not easy to interpret.

References
1. F. Badra, R. Bendaoud, R. Bentebitel, P.-A. Champin, J. Cojan, A. Cordier, S. Després, S. Jean-Daubias, J. Lieber, T. Meilender, A. Mille, E. Nauer, A. Napoli, and Y. Toussaint. Taaable: Text Mining, Ontology Engineering, and Hierarchical Classification for Textual Case-Based Cooking. In ECCBR Workshops, Workshop of the First Computer Cooking Contest, pages 219–228, 2008.
2. F. Badra, A. Cordier, and J. Lieber. Opportunistic Adaptation Knowledge Discovery. In L. McGinty and D. C. Wilson, editors, 8th International Conference on Case-Based Reasoning - ICCBR 2009, volume 5650 of Lecture Notes in Computer Science, pages 60–74, Seattle, United States, July 2009. Springer.
3. S. Craw, N. Wiratunga, and R. C. Rowe. Learning adaptation knowledge to improve case-based reasoning. Artificial Intelligence, 170(16-17):1175–1192, 2006.
4. M. d'Aquin, F. Badra, S. Lafrogne, J. Lieber, A. Napoli, and L. Szathmary. Case base mining for adaptation knowledge acquisition. In International Joint Conference on Artificial Intelligence, IJCAI'07, pages 750–756, 2007.
5. M. d'Aquin, S. Brachais, J. Lieber, and A. Napoli. Decision Support and Knowledge Management in Oncology using Hierarchical Classification. In K. Kaiser, S. Miksch, and S. W. Tu, editors, Proceedings of the Symposium on Computerized Guidelines and Protocols - CGP-2004, volume 101 of Studies in Health Technology and Informatics, pages 16–30, Prague, Czech Republic, 2004. IOS Press.
6. M. d'Aquin, J. Lieber, and A. Napoli. Adaptation Knowledge Acquisition: a Case Study for Case-Based Decision Support in Oncology. Computational Intelligence (an International Journal), 22(3/4):161–176, 2006.
7. U. M. Fayyad, G. Piatetsky-Shapiro, and P. Smyth. From data mining to knowledge discovery in databases. AI Magazine, pages 37–54, 1996.
8. B. Ganter and R. Wille. Formal Concept Analysis: Mathematical Foundations. Springer, 1999.
9. K. Hanney and M. T. Keane. Learning Adaptation Rules From a Case-Base. In I. Smith and B. Faltings, editors, Advances in Case-Based Reasoning - Third European Workshop, EWCBR'96, LNAI 1168, pages 179–192. Springer, 1996.
10. C. K. Riesbeck and R. C. Schank. Inside Case-Based Reasoning. Lawrence Erlbaum Associates, Inc., Hillsdale, New Jersey, 1989.
11. L. Szathmary and A. Napoli. CORON: A Framework for Levelwise Itemset Mining Algorithms. In Supplementary Proceedings of the Third International Conference on Formal Concept Analysis (ICFCA '05), Lens, France, pages 110–113, 2005.
12. M. J. Zaki and C.-J. Hsiao. CHARM: An efficient algorithm for closed itemset mining. In SIAM International Conference on Data Mining, SDM'02, pages 33–43, 2002.