=Paper=
{{Paper
|id=Vol-1953/healthRecSys17_paper_11
|storemode=property
|title=Investigating Substitutability of Food Items in Consumption Data
|pdfUrl=https://ceur-ws.org/Vol-1953/healthRecSys17_paper_11.pdf
|volume=Vol-1953
|authors=Sema Akkoyunlu,Cristina Manfredotti,Antoine Cornuéjols,Nicolas Darcel,Fabien Delaere
|dblpUrl=https://dblp.org/rec/conf/recsys/AkkoyunluMCDD17
}}
==Investigating Substitutability of Food Items in Consumption Data==
* Sema Akkoyunlu, UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, Paris, France, sema.akkoyunlu@agroparistech.fr
* Cristina Manfredotti, UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, Paris, France, cristina.manfredotti@agroparistech.fr
* Antoine Cornuéjols, UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, Paris, France, antoine.cornuejols@agroparistech.fr
* Nicolas Darcel, UMR PNCA, AgroParisTech, INRA, Université Paris-Saclay, Paris, France, nicolas.darcel@agroparistech.fr
* Fabien Delaere, Danone Nutricia Research, Palaiseau, France, fabien.delaere@danone.com

===ABSTRACT===
Food-based dietary guidelines are insufficiently followed by consumers. One of the principal explanations of this failure is that they are too general and do not take eating habits into account. Providing personalized dietary recommendations via a nutrition recommender system can hence help people improve their eating habits. Understanding eating habits is a keystone for building a context-aware recommender system that delivers personalized dietary recommendations. As a first step towards this goal, we explore food relationships on real-world data using the INCA 2 dataset, a French consumption survey. We particularly focus on extracting food substitutions, i.e., food items that can replace each other. We consider that two food items can be substituted if they are consumed in similar contexts. We define the context in the nutrition field and we introduce a measure of substitutability between food items, based on consumption data, that encodes the context.

===CCS CONCEPTS===
• Information systems → Information extraction; Recommender systems

===KEYWORDS===
recommender system, substitution, food consumption, nutrition

===ACM Reference format===
Sema Akkoyunlu, Cristina Manfredotti, Antoine Cornuéjols, Nicolas Darcel, and Fabien Delaere. 2017. Investigating substitutability of food items in consumption data. In Proceedings of the Second International Workshop on Health Recommender Systems co-located with ACM RecSys 2017, Como, Italy, August 2017 (HealthRecSys’17), 5 pages.

International Workshop on Health Recommender Systems, August 2017, Como, Italy. ©2017. Copyright for the individual papers remains with the authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.

===1 INTRODUCTION===
Nutritional quality of diets is proven to be an important factor in health dysfunctions. The risk of developing modern chronic diseases such as cardiovascular diseases, obesity or diabetes is linked to unhealthy eating habits [13].

In order to promote healthy and sustainable diets and prevent chronic diseases, public health agencies produce dietary guidelines targeted at the general population. However, it has been noted that compliance with the guidelines is usually low although awareness of the food-based dietary recommendations is rather good [9]. Nutrition-related knowledge does not imply adherence to dietary guidelines. Several causes explain this phenomenon: cultural and personal preferences, difficulty of implementing dietary changes, availability of food items [12]. Nutritionists stress that it is crucial to understand consumers’ behaviours in order to make practical food-based recommendations, because making changes is challenging [4].

One fair assumption is that people are more likely to follow recommendations if these are acceptable from their point of view. We hypothesize that user acceptance is a prerequisite for compliance and could be improved by producing user-tailored recommendations that take dietary habits into account. In the long term, our objective is to build a nutrition recommender system that takes dietary habits into account in order to encourage people toward healthier alternatives with high compliance.

In food-related recommender systems, the recommended items are recipes [6], [8] or food items themselves [7]. We rather want to build a food-item-based recommender system that delivers messages such as "instead of eating X, eat Y". In order to deliver relevant recommendations, it is important for a recommender system to know the substitutability relationships between items [11], [14]. This is important in food recommender systems as well.

Moreover, it has been shown that context-aware recommender systems produce better recommendations than recommender systems that do not take the context into account [2]. In order to extract meaningful relationships between food items, our model considers contextual information. To the best of our knowledge, one study has tackled the subject of food substitutability based on real-world consumption data [1]. However, it does not take into account contextual information such as the type of meal, across which substitutability relationships can be highly different.

In this paper, we specifically investigate food substitutability. To do that, we define the concept of dietary context as the set of food items a food is consumed with, and the concept of food intake context as the setting of food consumption. Our intuition is that two food items are substitutable if they are consumed in similar dietary contexts and that substitutability differs according to the food intake context.

The rest of the paper is organized as follows. Section 2 describes our methodology. Section 3 reports the results. Finally, in Section 4, we discuss our results and present our future perspectives.

===2 OUR APPROACH===

====2.1 Notations and problem statement====
Let <math>X</math> be the set of food items. A meal is a collection of food items consumed within the same timeframe. For instance, {coffee, bread, jam, juice} is a meal. The meal database <math>DB</math> is the set of all meals. Let us denote by <math>DB_{\text{breakfast}}</math> the database of breakfasts and by <math>DB_{\text{lunch}}</math> the database of lunches.

Our objective is to mine the food pair substitutability applied by consumers when they compose their meals. Given a database of meals, we want to extract substitutability relationships based on the way people consume food. No nutritional information is used during this process. Instead, contextual information is used in order to extract meaningful substitutability relationships.

====2.2 Defining Context====
The notion of context is quite complex and difficult to define universally. In the field of recommender systems, the context is usually defined according to the field of application of the system. In the nutrition field, we define two types of contexts: the dietary context and the food intake context.

The dietary context of a food item <math>x</math> is the set of food items <math>c</math> with which <math>x</math> is consumed. For instance, in the meal {coffee, bread, jam, juice}, the dietary context of {coffee} is {bread, jam, juice}. We think that the dietary context is fundamental when seeking substitutability of food items because the way people compose their meals is intrinsically dependent on the relationships between the items.

The food intake context is defined as the set of all variables such as the type of the meal (breakfast, lunch, dinner, snack), the location (home, workplace, restaurant) and the participants (family, friends, coworkers, alone). This corresponds to the notion of context usually used in context-aware recommender systems [2].

There are three paradigms for incorporating context in recommender systems: contextual pre-filtering, contextual post-filtering and contextual modelling [2]. Contextual pre-filtering (post-filtering) consists in splitting the dataset according to contextual variables before (after) applying algorithms. Contextual modelling consists in incorporating contextual information in the algorithm. In our framework, the dietary context is used to model substitutability whereas the food intake context is used for contextual pre-filtering.

Our objective is to investigate substitutability among food items based on the assumption that two food items are highly substitutable if they are consumed in similar dietary contexts and in the same intake context.

Investigating all possible dietary contexts of a food item is computationally expensive because the number of possible dietary contexts is exponential in the number of food items and the length of the dietary context. The number of interesting contexts is actually limited by the characteristics of the available data. Instead of investigating all the dietary contexts of a food item, we decided to explore collections of meals that differ only by one item. We define a dietary context <math>c</math> of a meal database as the intersection of a set of meals <math>S_m</math> such that:

<math>\operatorname{len}(c) = \max_{x \in S_m} \operatorname{len}(x) - 1 \qquad (1)</math>

Let us define the substitutable set <math>S_c</math> associated to a dietary context <math>c</math> as the set of food items such that the context <math>c</math> plus one item of <math>S_c</math> can be effectively consumed together. For instance, the substitutable set of the dietary context <math>c</math> = {bread, jam, juice} might be <math>S_c</math> = {coffee, tea, yogurt}.

====2.3 Mining substitutable items====
To efficiently retrieve interesting sets of dietary contexts and their substitutable sets, we propose an approach based on graph mining techniques. Let us denote the meal graph <math>G = (V, E)</math> where <math>V</math> is the set of nodes representing meals from the database and <math>E</math> is the set of edges such that two nodes are connected if there is at most one item that changes between them. A meal should appear at least once in the database in order to appear as a node in the graph. Figure 1 is a simple illustration of a meal network.

Figure 1: Example of a simple meal network

Designed in this way, the nodes of the substitutable set of a dietary context are adjacent. They form a sub-graph that is completely connected. Such an object is called a clique in graph mining. More specifically, the nodes form a maximal clique. A maximal clique is a clique to which another node cannot be added. In our setting, discovering substitutable sets is similar to mining maximal cliques in a graph. In this paper we use the Bron-Kerbosch algorithm [5] to search for maximal cliques.

Not all discovered maximal cliques are interesting for our study. We want cliques such that the intersection of the nodes is a dietary context as defined above. We denote these cliques as substitutable cliques. However, we may encounter cliques as in Figure 2. In this case, the intersection of the nodes is {A} and we cannot derive a substitutable set from this clique.

Figure 2: Example of an uninteresting clique (nodes ABC, ABD and AED)

To avoid retrieving uninteresting cliques, we apply Algorithm 1, which keeps only the substitutable cliques.

Algorithm 1: Find substitutable clique

 function isSubclique(clique)
     context = getContext(clique)
     lenmax = max(len(x) for x in clique)
     if lenmax - len(context) = 1 then
         return True
     else
         return False

For instance, when we apply our algorithm to the example of Figure 1, we get that this graph is a maximal clique and, more particularly, a substitutable clique. The context is {bread, butter} and the substitutable set associated to this context is {coffee, tea, milk, jam, nothing}. In this particular case, it is possible to substitute an item by nothing because {bread, butter} can be consumed as such.

====2.4 Computing a substitutability score====
Substitutability is not a binary relationship because there are different degrees of substitutability. Moreover, if two items are consumed together, they are less substitutable because they might be associated. Therefore, we need a function to quantify the relationship of substitutability that incorporates the possibility of associativity. Our hypothesis is that two items are highly substitutable if they are consumed in similar dietary contexts.

We want to compute a substitutability score such that:
# Two items are highly substitutable if they are consumed in similar contexts.
# Two items are less substitutable if they are consumed together.
# Substitutability is a symmetrical relationship.

Let us denote, for an item <math>x</math>, the context set <math>C_x</math> as the set of dietary contexts in which <math>x</math> is a substitutable item. If the cardinality of <math>C_x</math>, denoted <math>|C_x|</math>, is high, then <math>x</math> is substitutable in many dietary contexts.

For two items <math>x</math> and <math>y</math>, condition (1) is described by the intersection of <math>C_x</math> and <math>C_y</math>. If <math>|C_x \cap C_y|</math> is high, then <math>x</math> and <math>y</math> are consumed in similar contexts.

We denote by <math>A_{x:y}</math> the set of contexts of <math>x</math> where <math>y</math> appears:

<math>A_{x:y} = \{ c \in C_x \mid y \in c \} \qquad (2)</math>

The cardinality of <math>A_{x:y}</math> denotes how strongly <math>y</math> is associated to <math>x</math>.

Taking into account these considerations, we propose the substitutability score inspired by the Jaccard index [10]:

<math>f(x, y) = \frac{|C_x \cap C_y|}{|C_x \cup C_y| + |A_{x:y}| + |A_{y:x}|} \qquad (3)</math>

The score equals 1 when <math>x</math> and <math>y</math> appear in exactly the same contexts and <math>A_{x:y} = A_{y:x} = \emptyset</math>. If <math>x</math> and <math>y</math> are never consumed in the same context then the score equals 0. The higher <math>|A_{x:y}| + |A_{y:x}|</math> is, the higher the association of <math>x</math> and <math>y</math> is and the lower the score is.

===3 EXPERIMENTS===

====3.1 The INCA 2 database====
The French dataset INCA 2 (https://www.data.gouv.fr/fr/datasets/donnees-de-consommations-et-habitudes-alimentaires-de-letude-inca-2-3/) is the result of a survey conducted during 2006-2007 about individual food consumption. Individual 7-day food diaries are reported for 2624 adults and 1455 children over several months, taking into account possible seasonality in eating habits. A day is composed of three main meals: breakfast, lunch and dinner. The moments in between are denoted as snacking. For the main meals, the location (home, work, school, outdoor) and the companion (family, friends, coworkers, alone) are registered.

The 1280 food entries are organized in 44 groups and 110 sub-groups of food items. We chose to consider the medium level of hierarchy in order to capture both inter-group and intra-group substitution relationships.

Only adults are considered in this paper. All meals are gathered in a meal database <math>DB_{\text{meals}}</math> regardless of the type of meal. The database can be split according to contextual information in order to get better results [3]. We compare the results of our methodology on three datasets: <math>DB_{\text{breakfastlunch}}</math> (breakfasts and lunches together), <math>DB_{\text{breakfast}}</math> and <math>DB_{\text{lunch}}</math>.

====3.2 Results====
Applying our algorithm on <math>DB_{\text{breakfast}}</math> yields 2368 contexts. Some of these and their substitutable sets are given in Table 1. Our results are coherent. For example, either bread, rusk or viennoiserie can be consumed for breakfast with coffee, sugar and water.

{| class="wikitable"
|+ Table 1: Results of context and substitutable set retrieval for breakfasts
! Context !! Substitutable set
|-
| coffee, sugar, water, butter || bread, rusk, viennoiserie
|-
| tea/infusions, donuts || yogurt, sugar, jam/honey, nothing
|}

We applied our algorithm to the three datasets. The results are reported in Table 2. We obtain inter-group substitutions such as {potatoes ⇒ green beans} but also intra-group substitutions such as {bread ⇒ rusk}.

The substitutions proposed are consistent with eating habits. Substitutes of drinks are also drinks: the substitutes of coffee are tea, cocoa and chicory. It is also the case for spreadable food items: the substitutes for butter at breakfast are spreadable items. No semantic information describing how a food item can be eaten is available in the dataset, and yet considering the dietary context helps us retrieve this kind of information.

Substitutions between food items of the same nutritional food groups are found. For instance, the substitutes for potatoes are pasta and rice: they all contain starches.

{| class="wikitable"
|+ Table 2: Top 3 substitutable items for several items for breakfast and lunch
! rowspan="2" | Food item
! colspan="2" | Breakfast and lunch
! colspan="2" | Breakfast
! colspan="2" | Lunch
|-
! Substitute item (ordered by score) !! Score !! Substitute item (ordered by score) !! Score !! Substitute item (ordered by score) !! Score
|-
| rowspan="3" | Bread
| Rusk || 0.2234 || Rusk || 0.3716 || Fruits || 0.0497
|-
| Viennoiserie || 0.1359 || Viennoiserie || 0.2010 || Yogurt || 0.0490
|-
| Cakes || 0.0745 || Cakes || 0.1243 || Potatoes || 0.0468
|-
| rowspan="3" | Coffee
| Tea || 0.2799 || Tea || 0.4219 || Sodas || 0.065
|-
| Cocoa || 0.1729 || Chicory || 0.2550 || Yogurt || 0.0642
|-
| Chicory || 0.1486 || Cocoa || 0.2255 || Fruits || 0.0633
|-
| rowspan="3" | Tea
| Coffee || 0.2799 || Coffee || 0.4219 || Cakes || 0.0536
|-
| Cocoa || 0.1721 || Chicory || 0.1965 || Viennoiserie || 0.0417
|-
| Chicory || 0.1289 || Cocoa || 0.1462 || Coffee || 0.0412
|-
| rowspan="3" | Cocoa
| Chicory || 0.2171 || Chicory || 0.2211 || Cereal bars || 0.25
|-
| Coffee || 0.1729 || Coffee || 0.2077 || Preprocessed vegetables || 0.0526
|-
| Tea || 0.1289 || Tea || 0.1965 || Hamburgers || 0.0256
|-
| rowspan="3" | Butter
| Margarine || 0.2413 || Margarine || 0.4030 || Margarine || 0.0602
|-
| Honey/jam || 0.0924 || Chocolate spread || 0.1240 || Fruits || 0.0431
|-
| Chocolate spread || 0.0786 || Honey/jam || 0.1175 || Sauces || 0.0431
|-
| rowspan="3" | Milk
| Juice || 0.1409 || Yogurt || 0.1815 || Doughnut || 0.0869
|-
| Yogurt || 0.1264 || Juice || 0.1504 || Other milk || 0.0666
|-
| Sugar || 0.1089 || Tap water || 0.1361 || Milk in powder || 0.0625
|-
| rowspan="3" | Wine
| Sodas || 0.0814 || rowspan="3" | / || rowspan="3" | / || Sodas || 0.0860
|-
| Beer || 0.0704 || Tap water || 0.0755
|-
| Tap water || 0.0412 || Beer || 0.0746
|-
| rowspan="3" | Pizza
| Sandwich baguette || 0.2429 || rowspan="3" | / || rowspan="3" | / || Sandwich baguette || 0.2810
|-
| Other sandwiches || 0.1729 || Other sandwiches || 0.2177
|-
| Meals with pasta or potatoes || 0.1513 || Meals with pasta or potatoes || 0.1658
|-
| rowspan="3" | Potatoes
| Pasta || 0.1111 || rowspan="3" | / || rowspan="3" | / || Pasta || 0.1142
|-
| Green beans || 0.0922 || Green beans || 0.0941
|-
| Rice || 0.0602 || Rice || 0.0616
|}

===4 DISCUSSION AND CONCLUSIONS===
We proposed a score of substitutability based on consumption data that can be used in a recommender system together with other scores, such as a nutritional score that takes into account the nutritional contribution of the substitution and a user preference score. The substitutability score is based on the assumption that two items are substitutable if they are consumed in similar contexts. Preliminary results on the INCA 2 dataset show that this assumption helps retrieve substitutability relationships based on consumption data.

When we split the dataset according to the contextual variable "type of meal", the substitutes and the scores are different. Coffee can be substituted by tea, chicory and cocoa for breakfast whereas for lunch, it can be substituted by sodas, yogurt and fruits. Food items are consumed differently according to the type of meal. The relationship of substitutability is therefore different too.

A difference of scale in scores is noted according to the type of meal. It may be due to the fact that the diversity of food items consumed during lunch is higher than during breakfast. A rescaling factor based on the diversity of the type of meal can be introduced. As future work we plan to investigate this aspect and implement the nutritional score and the user-preference-related score.

===5 ACKNOWLEDGEMENT===
This study was funded by Danone Nutricia Research.

===REFERENCES===
* [1] Achananuparp, P., and Weber, I. Extracting food substitutes from food diary via distributional similarity. CoRR abs/1607.08807 (2016).
* [2] Adomavicius, G., and Tuzhilin, A. Context-aware recommender systems. In Recommender Systems Handbook, F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, Eds. Springer US, 2011, pp. 217–253.
* [3] Baltrunas, L., and Ricci, F. Context-based splitting of item ratings in collaborative filtering. In Proceedings of the Third ACM Conference on Recommender Systems (New York, NY, USA, 2009), RecSys ’09, ACM, pp. 245–248.
* [4] Bier, D. M., Derelian, D., German, J. B., Katz, D. L., Pate, R. R., and Thompson, K. M. Improving compliance with dietary recommendations. Nutrition Today 43, 5 (Sep 2008), 180–187.
* [5] Bron, C., and Kerbosch, J. Algorithm 457: Finding all cliques of an undirected graph. Commun. ACM 16, 9 (Sept. 1973), 575–577.
* [6] Freyne, J., and Berkovsky, S. Intelligent food planning: personalized recipe recommendation. In Proceedings of the 15th International Conference on Intelligent User Interfaces, IUI 2010, Hong Kong, China, February 7-10, 2010 (2010), pp. 321–324.
* [7] Ge, M., Elahi, M., Fernández-Tobías, I., Ricci, F., and Massimo, D. Using tags and latent factors in a food recommender system. In Proceedings of the 5th International Conference on Digital Health 2015 (New York, NY, USA, 2015), DH ’15, ACM, pp. 105–112.
* [8] Harvey, M., Ludwig, B., and Elsweiler, D. You are what you eat: Learning user tastes for rating prediction. In String Processing and Information Retrieval - 20th International Symposium, SPIRE 2013, Jerusalem, Israel, October 7-9, 2013, Proceedings (2013), pp. 153–164.
* [9] Ivens, B. J., and Smith Edge, M. Translating the Dietary Guidelines to Promote Behavior Change: Perspectives from the Food and Nutrition Science Solutions Joint Task Force. J Acad Nutr Diet 116, 10 (Oct 2016), 1697–1702.
* [10] Jaccard, P. The distribution of the flora in the alpine zone.1. New Phytologist 11, 2 (1912), 37–50.
* [11] McAuley, J. J., Pandey, R., and Leskovec, J. Inferring networks of substitutable and complementary products. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, NSW, Australia, August 10-13, 2015 (2015), pp. 785–794.
* [12] Webb, D., and Byrd-Bredbenner, C. Overcoming consumer inertia to dietary guidance. Adv Nutr 6, 4 (Jul 2015), 391–396.
* [13] World Health Organization. Diet, nutrition and the prevention of chronic diseases: report of a joint WHO/FAO expert consultation. Tech. rep., 2003.
* [14] Zheng, J., Wu, X., Niu, J., and Bolivar, A. Substitutes or complements: another step forward in recommendations. In Proceedings 10th ACM Conference on Electronic Commerce (EC-2009), Stanford, California, USA, July 6–10, 2009 (2009), pp. 139–146.
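As an illustration of the mining step described in Section 2.3, the following Python sketch builds a meal graph, enumerates maximal cliques and applies the filter of Algorithm 1. It is a minimal sketch under stated assumptions, not the authors' implementation: the edge rule in `adjacent` (two distinct meals differ by at most one item on each side) is one possible reading of "at most one item changes", the recursion is the basic Bron-Kerbosch scheme without pivoting, and all function names are hypothetical. The toy meals reproduce the {bread, butter} example discussed for Figure 1.

```python
from itertools import combinations


def adjacent(a, b):
    # Assumed edge rule: distinct meals that differ by at most one item on each side.
    return a != b and len(a - b) <= 1 and len(b - a) <= 1


def build_meal_graph(meals):
    # Each distinct meal (a frozenset of items) becomes one node.
    nodes = {frozenset(m) for m in meals}
    graph = {m: set() for m in nodes}
    for a, b in combinations(nodes, 2):
        if adjacent(a, b):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    # Basic Bron-Kerbosch recursion (no pivoting); returns a list of frozensets of nodes.
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            if r:
                cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(frozenset(), set(graph), set())
    return cliques


def get_context(clique):
    # The candidate dietary context is the intersection of all meals in the clique.
    return frozenset.intersection(*clique)


def is_substitutable_clique(clique):
    # Algorithm 1: keep the clique if the context is one item shorter than the longest meal.
    len_max = max(len(meal) for meal in clique)
    return len_max - len(get_context(clique)) == 1


def substitutable_set(clique):
    # Items left once the context is removed; an empty remainder is reported as "nothing".
    context = get_context(clique)
    items = set()
    for meal in clique:
        rest = meal - context
        items.add(next(iter(rest)) if rest else "nothing")
    return context, items


# Toy data mirroring the {bread, butter} example.
meals = [
    {"bread", "butter", "coffee"},
    {"bread", "butter", "tea"},
    {"bread", "butter", "milk"},
    {"bread", "butter", "jam"},
    {"bread", "butter"},
]
for clique in maximal_cliques(build_meal_graph(meals)):
    if is_substitutable_clique(clique):
        context, items = substitutable_set(clique)
        print(sorted(context), sorted(items))
# prints: ['bread', 'butter'] ['coffee', 'jam', 'milk', 'nothing', 'tea']
```

Under the same assumed edge rule, the three meals {A, B, C}, {A, B, D} and {A, C, D} also form a maximal clique, but their intersection is {A}, so the filter rejects it.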
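The substitutability score of Equation (3) in Section 2.4 can be written directly from the definitions of the context sets and association sets. The sketch below is only illustrative: the input mapping `substitutable_sets` (dietary context to substitutable set), the function names and the toy contexts are hypothetical, and returning 0 when both context sets are empty is an assumption the paper does not discuss.

```python
def context_sets(substitutable_sets):
    # Map each item x to C_x, the set of contexts in which x is a substitutable item.
    c = {}
    for context, items in substitutable_sets.items():
        for item in items:
            c.setdefault(item, set()).add(context)
    return c


def score(x, y, c):
    # f(x, y) = |C_x & C_y| / (|C_x | C_y| + |A_x:y| + |A_y:x|)
    cx, cy = c.get(x, set()), c.get(y, set())
    a_xy = {ctx for ctx in cx if y in ctx}  # contexts of x in which y appears
    a_yx = {ctx for ctx in cy if x in ctx}  # contexts of y in which x appears
    denom = len(cx | cy) + len(a_xy) + len(a_yx)
    # Assumption: score is 0 when neither item has any context.
    return len(cx & cy) / denom if denom else 0.0


# Hypothetical toy contexts (not taken from INCA 2).
substitutable_sets = {
    frozenset({"bread", "butter"}): {"coffee", "tea", "milk"},
    frozenset({"bread", "jam"}): {"coffee", "tea"},
    frozenset({"coffee", "bread"}): {"tea", "milk"},
}
c = context_sets(substitutable_sets)
print(score("coffee", "tea", c))   # 2 shared contexts / (3 in union + 0 + 1) = 0.5
print(score("coffee", "milk", c))  # 1 shared context / (3 in union + 0 + 1) = 0.25
```

In this toy example tea is a substitutable item in a context that contains coffee, so the association term adds 1 to the denominator and the coffee/tea score falls from 2/3 (the plain Jaccard index of the two context sets) to 0.5. The expression is symmetric in x and y, as required by condition (3).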