An Evaluation of Recommendation Algorithms for Online Recipe Portals

Christoph Trattner
University of Bergen, Norway
christoph.trattner@uib.no

David Elsweiler
University of Regensburg, Germany
david.elsweiler@ur.de

ABSTRACT

Better models of food preferences are required to realise the oft touted potential of food recommenders to aid with the obesity crisis. Many of the food recommender evaluations in the literature have been performed with small convenience samples, which limits our confidence in the generalisability of the results. In this work we test a range of collaborative filtering (CF) and content-based (CB) recommenders on a large dataset crawled from the web, consisting of naturalistic user interaction data over a 15 year period. The results reveal strengths and limitations of the different approaches. While CF approaches consistently outperform CB approaches when testing on the complete dataset, our experiments show that to improve on CB approaches, CF methods require a large number of users (> 637 when sampling randomly). Moreover, the results show that different facets of recipe content offer utility. In particular, one of the strongest content related features was a measure of health derived from guidelines from the UK Food Standards Agency. This finding underlines the challenges we face as a community in developing recommender algorithms that improve the healthfulness of the food people choose to eat.

KEYWORDS

Online recipes; recommender systems

ACM Reference Format:
Christoph Trattner and David Elsweiler. 2019. An Evaluation of Recommendation Algorithms for Online Recipe Portals. In Proceedings of the 4th International Workshop on Health Recommender Systems co-located with 13th ACM Conference on Recommender Systems (HealthRecSys ’19), 5 pages.

HealthRecSys ’19, September 20, 2019, Copenhagen, Denmark
© 2019 Copyright for the individual papers remains with the authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). This volume is published and copyrighted by its editors.

1 INTRODUCTION

Food recommenders (e.g. [11, 15]) and studies of online recipes (e.g. [24, 40]) have received increased research attention of late. A key motivation for this is often health, with recommender systems being touted as a means to help people change dietary habits and address costly societal problems, such as diabetes and obesity [7, 11].

Diverse studies have been published, offering insight into the contextual factors influencing recipe preference [28, 40] and the future popularity of recipes [36], as well as providing an understanding of the links between recipe preference and incidence of eating related illness [37]. A further strain of research has attempted to incorporate health in the food recommendation problem by, for example, building nutritional content into the recommendation process [15, 19, 34] or by recommending meal plans, which tailor recommendations to users’ nutritional needs over time [6].

Providing healthful food recommendations using any of the suggested strategies necessitates, however, that we can accurately model and predict the food individual users would actually like to eat. We have, as yet, limited understanding as to which recommender algorithms work best [33], and the studies that have been performed typically focus on one approach in isolation (e.g. recipe ingredients [11] or properties of the associated image [14]). Moreover, past work has tended to employ datasets derived from small scale user studies [11, 19], limiting our confidence in the generalisability of the results. In this work, we test a number of competitive collaborative filtering (CF) and content-based (CB) recommenders on a large scale naturalistic dataset, similar to those that have been studied for cultural [24, 40] or epidemiological [37] reasons using data science methods. We formulate the problem as is typically done in recommendation experiments, using past feedback from a given user to predict future interactions by that same user [26]. The aim is not only to compare and contrast different models, but also to examine the utility of different facets of content, which are diverse in the case of online recipes, and to establish how these influence the recommendation performance. The main findings include that:

• CF methods consistently outperform CB methods over the full dataset.
• CF requires either a small number of highly active users or over six hundred randomly selected users to achieve competitive performance.
• There is a useful signal in the CB facets, which would be valuable in cold-start situations.
• One of the most robust content features is the nutritional healthiness of the recipe as defined by a measure derived from the United Kingdom Food Standards Agency (FSA). This highlights that users are typically consistent in their nutritional preferences over time and emphasizes the challenges faced in changing eating habits.

The remainder of the paper is structured as follows: Sections 3 and 4 describe the data basis and experimental setup, respectively. Section 5 then reports the results of two rounds of experiments, the first of which uses the full dataset, while the second employs a bootstrapping approach to test algorithms on sub-samples of the data of various sizes. Section 6 summarises the findings and sets these in context against the literature, which is reviewed in the following section.

2 RELATED WORK

In this section two bodies of related work are reviewed. The first focuses on the evaluation of food recommender algorithms. The second summarises studies of user interaction with online recipe portals, which provide insight into human food preference and the variables influencing this.

2.1 Food Recommendation

Efforts to design automated systems to recommend meals can be traced to the mid-1980s, when case-based planning was employed [18, 21]. More recent efforts have focused on rating prediction, using either aspects of recipe content or ratings data with collaborative filtering approaches. Freyne et al. [11] showed that recommendations could be improved by decomposing recipes into individual ingredients and building user profiles comprising ingredients users liked, based on ratings for the recipes containing these ingredients. Harvey et al. extended the approach and improved performance by creating positive and negative profiles for users and reducing the dimensionality of the matrices [19].

Other CB approaches have employed visual signals. Yang and colleagues demonstrated that algorithms designed to extrapolate important visual aspects of food images outperform baseline methods [42, 43]. Elsweiler et al. [8] also show that automatically extracted low-level image features, such as brightness, colourfulness and sharpness, can be useful for predicting user food preference.

A second approach has been to exploit ratings data using collaborative filtering (CF) techniques. Freyne and Berkovsky tested a nearest neighbour approach, which offered poorer performance than the content approach described above [11]. Ge et al. [15] tested a matrix factorization solution that fuses ratings information and user supplied tags to achieve significantly better prediction accuracy than content-based and standard matrix factorization baselines. Several studies report that the best results are achieved when CF and CB approaches are combined in hybrid models [11, 14, 19].

A common motivator for food recommendation work has been to promote healthy nutrition. One approach is to rely on rules derived from domain experts to meet daily energy requirements [13] or to focus on the nutritional requirements of specific groups such as the elderly [10] or body-builders [38]. Others have tailored recommendations based on the user’s calorific or other nutritional needs [15, 16, 34] or existing nutritional habits [31], or have combined recommendations to meet requirements [6]. Again, approaches have been published for specific target groups, e.g. diabetics [25].

2.2 Studies of Food Behaviour using Online Recipe Portals

While not focusing on recommendation, a large body of recent work sheds light on food preferences by studying interactions with online food portals. Analysing the nutritional content of these portals using metrics derived from the World Health Organisation (WHO) and the United Kingdom Food Standards Agency (FSA) has found recipes to be mainly unhealthy, although healthy recipes can be found [35]. Overall, people tend to interact with the least healthy recipes most often [34]. There is, nevertheless, heterogeneity in the user-base with respect to the nutritional properties of recipes interacted with, and a growing body of evidence reports correlations between recipes accessed via search engines, recipe portals and social media and the incidence of diet-related illness [1, 3, 29, 37]. Moreover, clear weekly and seasonal trends can be observed in the way users interact with recipes, both in terms of the contained ingredients and the nutritional value of the recipes (fat, proteins, carbohydrates, and calories) [23, 40]. Other work has reported different interaction patterns for users of different genders [28, 39] and for users who live in different geographical areas within a country [40, 44]. The number of variables shown to relate to eating habits highlights just how challenging a problem food recommendation is.

The brief review of literature above has highlighted the increasing popularity of food recsys research and that a key motivator is the desire to build systems that promote healthy nutrition. Key takeaways from the review are as follows:

• While several evaluations have CF and CB baselines, no extensive comparison of CF and CB approaches in the food recsys domain has been published.
• Moreover, no detailed investigation of the different aspects of content that may be useful is available, and much of the recipe content (recipe description, cooking steps, cooking time etc.) has not been evaluated.
• Finally, the evaluations performed to date have typically been performed on small, artificially generated test collections.

3 MATERIALS

To address the identified gaps in the literature, in this work we make use of a web crawl of the online platform Allrecipes.com to evaluate diverse CF and CB approaches in the recipe recommendation context.

The platform was crawled between the 20th and 24th of July, 2015. We retrieved 60,983 recipes published by 25,037 users between the years 2000 and 2015 through the sitemap that is available in the robots.txt file of the website. In this paper we only make use of the 58,263 recipes where nutrition information was available. The basic statistics of this dataset can be found in Table 1.

Table 1: Basic statistics of the Internet recipes dataset obtained from Allrecipes.com.

  Total published recipes                       60,983
  Recipes containing nutrition information      58,263
  Recipes rated                                 46,713
  Ratings                                    1,032,226
  Users providing ratings                      125,762

In addition to the core recipe components (such as recipe title, ingredient list, number of servings and instructions), we also collected for each recipe the associated image, comments provided by users, rating information and nutrition facts¹, such as total energy (kCal), protein (g), carbohydrate (g), sugar (g), salt (g), fat (g) and saturated fat (g) content (measured per 100 g of the recipe).

¹ Allrecipes.com estimates the nutritional facts for an uploaded recipe by matching the contained ingredients with those in the ESHA research database [9]. The ESHA system is used by popular companies such as McDonald’s and Kellogg’s.

Allrecipes.com is just one of many online recipe portals. Other popular sites include Food.com, Epicurious.com, Yummly.com and Cooks.com. We chose Allrecipes.com because, at the time of writing, it claims to be the world’s largest food-focused social network: the site has a community of over 40 million users from 24 countries who annually visit 3 billion recipes [2]. This claim has been corroborated by services such as eBizMBA, which ranks Allrecipes.com as the most popular recipe website [5]. This means that we analyze not only a large scale dataset, but also the most popular recipe platform on the Web.

4 EXPERIMENTAL SETUP

We ran a series of experiments evaluating the performance of 6 prominent recommender algorithms on the rating data using the LibRec² framework. The algorithms tested are: Random item ranking (our baseline), Most Popular item ranking (MostPopular), user- and item-based collaborative filtering (denoted as UserKNN and ItemKNN) [30], Bayesian Personalized Ranking (BPR) [26], Weighted matrix factorization (WRMF) [22] and Latent Dirichlet Allocation (LDA) [17].

² http://www.librec.net/

For the content-based approaches we derived in total 20 different features, which we used to compute similarities between recipes. Below we briefly summarise these features and their corresponding sets:

• Title: For the title feature set, we derived 5 similarity features. Four are based on string distances: Levenshtein distance, Longest Common Subsequence (LCS) distance, Jaro-Winkler distance and bi-gram distance. To obtain a similarity value between two recipes based on these features we calculate $1 - \mathrm{dist}(r_i, r_j)$. For the fifth feature, we employ LDA topic modelling on the recipe titles using Mallet with Gibbs sampling. The number of topics was set to 100. Hence, for each recipe we derive a vector of dimension one hundred capturing the topic distribution. To calculate similarities between recipes we employ the cosine similarity metric.
• Image: For the image feature set we employed, on the one hand, image attractiveness measures such as image brightness, sharpness, contrast, colorfulness and entropy and, on the other hand, deep convolutional neural network (CNN) features from a pre-trained VGG-16 model [32]. For each image we derive one embedding vector of dimension 4096 and calculate cosine similarity between recipes on these vectors. To measure the similarity between two recipes based on the image attractiveness metrics [36] we employ the inverse Manhattan distance, i.e. $1 - |\mathrm{metric}(r_i) - \mathrm{metric}(r_j)|$.
• Ingredients: To calculate similarities between recipes on the ingredient level, we derived four different features. First, the ingredient text itself was brought to a TF-IDF representation to calculate cosine similarity between recipes. Second, we employed LDA again to derive a topic distribution and calculated cosine similarity between recipes on those vectors. Finally, we employed the normalized ingredient strings to calculate similarities between recipes using both cosine similarity and Jaccard. In the case of cosine, we normalized the quantities of each ingredient to 100 g of a recipe and used the normalized quantity values as the frequency indicator.
• Directions: From the directions block we computed two similarity features, based again on an LDA topic vector representation of the text as well as on a TF-IDF vector representation. Similarities were again computed employing the cosine similarity measure on these vectors.
• Ratings: Here we rely on the number of ratings of a recipe as well as the average rating. To compute similarities between recipes on these indicators we rely again on the inverse Manhattan distance, i.e. $1 - |\mathrm{metric}(r_i) - \mathrm{metric}(r_j)|$.
• Health: In order to measure the healthiness of a recipe we rely on the following macronutrients: ‘fat’, ‘saturated fat’, ‘sugar’ and ‘salt’ (measured per 100 g of the recipe). This allows us to measure the healthiness of a recipe according to international standards as introduced in 2007 by the Food Standards Agency (FSA) [12]. There are also other standards that can be applied, such as the ones provided by the World Health Organization (WHO) [41] or the HEI metric as proposed by the CDC [20]. We employ the standards provided by the FSA, as this is currently the most robust method to estimate the healthiness of online recipes. The metric was also used in related work [34]. The scale ranges from 4 for very healthy recipes to 12 for very unhealthy recipes. Throughout the paper we refer to this metric as ‘FSA score’.

For each of the features described above, we derive a scoring function that is computed as follows:

\[
\mathrm{score}(u, i)_{\mathit{feature}} = \frac{\sum_{p \in P_u} \mathrm{sim}(i, p)}{|P_u|} \tag{1}
\]

where $P_u$ is the set of items of a user $u$, $i$ is an arbitrary item, and $\mathrm{sim}(i, p)$ is any of the above mentioned similarity metrics between items $i$ and $p$.

For each feature set we calculate scores based on the linear combination of the similarities³.

³ Parameters were tuned to the optimum using grid search.

As in previous work [26], we operationalise the experiments as a personalized ranking problem (item recommendation). The aim here is to provide a user with a ranked list of items where the ranking has to be inferred from the implicit behavior of the user (e.g. recipes rated in the past). Implicit feedback systems, such as those studied in [26], are challenging as only positive observations are available. The non-observed user-item pairs (e.g. a user has not cooked a recipe yet) are a mixture of real negative feedback (the user is not interested in cooking the recipe) and missing values (the user might want to cook the recipe in the future). We use 5-fold cross validation as the protocol for all the experiments and report the recommendation performance results employing AUC as the performance metric [27].

To reduce data sparsity, a well-known issue in collaborative filtering-based methods [27], in the first experiments we apply a p-core filter approach [4], using only user profiles with at least 20 rating interactions⁴ and recipes that have been rated at least 20 times by the users. This results in a final dense dataset comprising 1273 users, 1031 items and 50,681 interactions. To study the effects of different numbers of users on performance, we report a second set of bootstrapped experiments using smaller dense samples of heavy users (using the same criteria as above), and varying collection sizes using standard random sampling, referred to as ‘sparse samples’ in the text. These experiments were repeated 100 times each and the average performance is reported.

⁴ We transfer all ratings to positive feedback, i.e. any rating is counted as positive feedback and any non-interaction as negative feedback. This makes sense as 95% of all ratings in the Allrecipes.com dataset are 5-star ratings, see also [36].

5 RESULTS

The results of the experiments on the full dataset are shown in Table 2. The CF methods clearly outperform the content-based approaches. The best performing CF method (BPR) achieved an AUC score of .7094 and the remaining CF methods demonstrated AUC scores of > .686. This compares to .5883 achieved by the linear combination of content features (= CB:All).

Table 2: Results of the recommender experiment, collaborative (CF) vs content-based (CB), in the dense data sample with all users. Best features in each set (CF and CB) are bolded. Top-5 (↑) and Bottom-5 (↓) single content features are also marked.

  Method  Algorithm                       AUC
  CF      BPR                             .7094
          WRMF                            .6881
          UserKNN                         .6962
          ItemKNN                         .6909
          MostPopular                     .6864
          LDA                             .6863
  CB      Title:Levenshtein-Distance      .5468 (↑)
          Title:Bigram-Distance           .5500 (↑)
          Title:LCS-Distance              .5424
          Title:LDA-Text-Cosine           .5353
          Title:Jaro-Winkler-Distance     .5324
          Title:All                       .5523
          Image:Cosine-Embeddings         .5322
          Image:Colorfulness-Distance     .5072 (↓)
          Image:Contrast-Distance         .5175
          Image:Sharpness-Distance        .5109
          Image:Entropy-Distance          .5080 (↓)
          Image:Brightness-Distance       .4991 (↓)
          Image:All                       .5425
          Ingredients:Cosine-Text         .5547
          Ingredients:Cosine-LDA-Text     .5653 (↑)
          Ingredients:Jaccard             .5502
          Ingredients:Cosine              .5575
          Ingredients:All                 .5718
          Directions:Cosine-LDA-Text      .5606 (↑)
          Directions:Cosine-Text          .5210
          Directions:All                  .5731
          Ratings:Number-Distance         .4789 (↓)
          Ratings:Average-Distance        .4832 (↓)
          Ratings:All                     .5249
          Health:FSA                      .5775 (↑)
          CB:All                          .5883
          Random                          .4989

Examining the performance of different aspects of content (title, image, ingredients, directions and health) shows that there is a signal in each of these aspects. This is a sign of consistency in terms of the properties of the recipes that individual users tend to rate. The fact that the combined model “All” does not achieve a large improvement over these signals individually is perhaps an indication that a linear combination is not the best means to combine them. One of the strongest content-based features is the FSA score (AUC=.5775). Again, this hints at consistency in user preference, this time in terms of the healthiness of the recipes that individual users interact with.

To complement these initial results and better understand the relationship between CF and CB methods and the amount of data required to achieve strong recommendation performance with these approaches, we performed the bootstrapping study as described above. The results are presented in Figure 1.

[Figure 1 plots: AUC (y-axis) against Number of Users [%] (x-axis, 1 to 100) for the algorithms BPR and CB:All. Panel (A): Dense Data Samples (p-core=20). Panel (B): Sparse Data Samples (no p-core).]

Figure 1: (A) shows the results in the dense data samples (= p-core filtered) where each user has at least 20 item interactions and each item has been interacted with at least 20 times, (B) shows the results in the sparse data samples (= no p-core).

In a first test, see Figure 1 (A), we sampled only from active users, that is, we derived test sets of various sizes where users had rated at least 20 items and the items involved had also received at least 20 ratings. Taking this dense sample showed that even a small number of users can attain stable performance. With only 1% of all users (N=13) the CF technique (BPR) is able to outperform the content approach. Nevertheless, when users are selected at random from the dataset and no p-core filter is applied, see Figure 1 (B), which we argue is a much more realistic setup [4], many more users are required on average to achieve an equivalent performance. Whereas the CB approaches achieve a consistent performance (AUC > .54) regardless of the number of users studied, half of the dataset (50%, N=637) is required before the CF methods outperform the CB approach.
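To make the content-based scoring function of Equation 1 and the AUC evaluation described in Section 4 more concrete, the following is a minimal sketch in Python. It is an illustration under stated assumptions rather than the pipeline used in the experiments (which relied on LibRec): the feature vectors are invented toy data, the function names `cosine`, `cb_score` and `user_auc` are hypothetical, and the AUC is computed per user as the fraction of (held-out item, unobserved item) pairs that are ranked correctly, which is one common reading of AUC for implicit feedback.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cb_score(item, profile, features):
    """Equation 1: mean similarity between `item` and the items in the user's profile P_u."""
    return sum(cosine(features[item], features[p]) for p in profile) / len(profile)


def user_auc(scores, train_items, test_items):
    """Per-user AUC: share of (held-out positive, unobserved item) pairs ranked
    correctly, with ties counted as half. Assumed evaluation scheme."""
    observed = set(train_items) | set(test_items)
    negatives = [j for j in scores if j not in observed]
    pairs = [(scores[p], scores[n]) for p in test_items for n in negatives]
    if not pairs:
        raise ValueError("need at least one held-out item and one unobserved item")
    correct = sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0 for sp, sn in pairs)
    return correct / len(pairs)


# Toy data (invented): five recipes described by a two-dimensional feature vector,
# e.g. a tiny topic distribution. Recipes 0-2 are similar; 3-4 form a second group.
features = {
    0: [1.0, 0.0],
    1: [0.9, 0.1],
    2: [0.8, 0.2],
    3: [0.0, 1.0],
    4: [0.1, 0.9],
}
train_items = [0, 1]  # the user's profile P_u (recipes rated in the past)
test_items = [2]      # held-out recipe the user also rated

scores = {i: cb_score(i, train_items, features) for i in features}
auc = user_auc(scores, train_items, test_items)
print(f"AUC for the toy user: {auc:.3f}")
```

In this sketch a single similarity feature is used; combining several features, as in the CB:All model, would amount to a weighted sum of the per-feature scores with weights chosen by a search procedure such as the grid search mentioned in Section 4.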
AUC score of .7094 and the remaining CF methods demonstrated 27 An Evaluation of Recommendation Algorithms for Online Recipe Portals HealthRecSys ’19, September 20, 2019, Copenhagen, Denmark 6 SUMMARY & CONCLUSION [18] Kristian J Hammond. 1986. CHEF: A Model of Case-based Planning.. In AAAI. 267ś271. In this work we have tested competitive recommendation algorithms [19] Morgan Harvey, Bernd Ludwig, and David Elsweiler. 2013. You are what you eat: on a large online recipe dataset. While algorithms of these types Learning user tastes for rating prediction. In International Symposium on String Processing and Information Retrieval. Springer, 153ś164. have been evaluated before (e.g. [11, 19]), no systematic evaluation [20] HEI. 2016. Healthy Eating Index. (Oct. 2016). https://www.cnpp.usda.gov/health has been performed on naturalistic data of this type for only recipes yeatingindex and no results have been published with respect to what signal can [21] Thomas R Hinrichs. 1989. Strategies for adaptation and recovery in a design problem solver. In Proc. of Workshop CBR ’89. 115ś118. be ofered by diferent facets of recipe content. [22] Yifan Hu, Yehuda Koren, and Chris Volinsky. Collaborative iltering for implicit Our primary inding is that CF outperformed CB in our exper- feedback datasets. In Proc. of ICDM’08. 263ś272. [23] Tomasz Kusmierczyk, Christoph Trattner, and Kjetil Nùrvåg. Temporality in iments. This is a diferent result from the literature - both [11] online food recipe consumption and production. In Proc. of WWW ’15. and [19] report ingredient based CB methods outperforming CF [24] Tomasz Kusmierczyk, Christoph Trattner, and Kjetil Nùrvåg. 2016. Understanding baselines. The small size datasets in these past studies, however, and Predicting Online Food Recipe Production Patterns. In Proc. of HT ’16. 243ś 248. suggests the results to be compatible. It is only after data for several [25] Maiyaporn Phanich, Phathrajarin Pholkul, and Suphakant Phimoltares. 2010. 
hundred (in our experiment 637) users is available that CF methods Food recommendation system using clustering analysis for diabetic patients. In start to outperform CB. Proc. of ISA ’10. 1ś8. [26] Stefen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. With respect to recipe content, the performance of FSA high- BPR: Bayesian personalized ranking from implicit feedback. In Proc. of UIAI’09. lights the challenge in changing people’s habits. This aligns with 452ś461. [27] Francesco Ricci, Lior Rokach, and Bracha Shapira. 2011. Introduction to recom- past work revealing that the majority of users tend to prefer un- mender systems handbook. Springer. healthy food, a smaller group preferred healthy recipes, but both [28] Markus Rokicki, Eelco Herder, Tomasz Kusmierczyk, and Christoph Trattner. groups were consistent in their judgments over time [19]. As a com- Plate and Prejudice: Gender Diferences in Online Cooking. In Proc. of UMAP ’16. 207ś215. munity we need to think hard about how these group members can [29] Alan Said and Alejandro Bellogín. You are What You Eat! Tracking Health be targeted with recommendations that might alter this situation. Through Recipe Interactions. In Proc. of RSWeb ’14. [30] Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl. Item-based collaborative iltering recommendation algorithms. In Proc. of WWW ’01. 285ś REFERENCES 295. [1] Soiane Abbar, Yelena Mejova, and Ingmar Weber. You Tweet What You Eat: [31] Hanna Schäfer and Martijn C Willemsen. 2019. Rasch-based tailored goals for Studying Food Consumption Through Twitter. In Proc. of CHI ’15. nutrition assistance systems. In Proc. of IUI ’19. 18ś29. [2] Allrecipes. 2016. Allrecipe.com Press report. available at http://press.allrecipes.c [32] Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks om/. Last accessed on 22.03.2019. (2016). for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014). 
[3] Munmun De Choudhury and Sanket S Sharma. Characterizing Dietary Choices, [33] Christoph Trattner and David Elsweiler. 2017. Food recommender systems: Nutrition, and Language in Food Deserts via Social Media. In Proc. of CSCW ’16. important contributions, challenges and future research directions. arXiv preprint [4] Stephan Doerfel, Robert Jäschke, and Gerd Stumme. 2016. The role of cores in arXiv:1711.02760 (2017). recommender benchmarking for social bookmarking systems. ACM Transactions [34] Christoph Trattner and David Elsweiler. 2017. Investigating the healthiness on Intelligent Systems and Technology (TIST) 7, 3 (2016), 40. of internet-sourced recipes: implications for meal planning and recommender [5] Ebizma. 2017. Ebizma rankings for recipe websites. available at http://www.ebiz systems. In Proc. of WWW ’17. 489ś498. mba.com/articles/recipe-websites. Last accessed on 22.03.2019. (2017). [35] Christoph Trattner, David Elsweiler, and Simon Howard. 2017. estimating the [6] David Elsweiler and Morgan Harvey. Towards automatic meal plan recommend- healthiness of internet recipes: a cross-sectional study. Frontiers in public health ations for balanced nutrition. In Proc. of RecSys ’15. 313ś316. 5 (2017), 16. [7] David Elsweiler, Morgan Harvey, Bernd Ludwig, and Alan Said. Bringing the [36] Christoph Trattner, Dominik Moesslang, and David Elsweiler. 2018. On the "healthy" into Food Recommenders. In Proc. of DRMS ’15. 33ś36. predictability of the popularity of online recipes. EPJ Data Science 7, 1 (2018), 20. [8] David Elsweiler, Christoph Trattner, and Morgan Harvey. Exploiting Food Choice [37] Christoph Trattner, Denis Parra, and David Elsweiler. 2017. Monitoring obesity Biases for Healthier Recipe Recommendation. In Proc. of SIGIR ’17. 575ś584. prevalence in the United States through bookmarking activities in online food [9] ESHA. 2016. Nutrition Labeling Software. available at http://www.esha.com/. portals. PloS one 12, 6 (2017), e0179144. 
Last accessed on 22.03.2019. (2016). [38] Piyaporn Tumnark, Filipe Almeida da Conceição, João Paulo Vilas-Boas, Leandro [10] Vanesa Espín, María V Hurtado, and Manuel Noguera. 2016. Nutrition for Elder Oliveira, Paulo Cardoso, Jorge Cabral, and Nonchai Santibutr. 2013. Ontology- Care: a nutritional semantic recommender system for the elderly. Expert Systems based personalized dietary recommendation for weightlifting. In Proc. of Int. WS 33, 2 (2016), 201ś210. on Computer Science in Sports. 44ś49. [11] Jill Freyne and Shlomo Berkovsky. Intelligent Food Planning: Personalized Recipe [39] Claudia Wagner and Luca Maria Aiello. 2015. Men eat on Mars, Women on Recommendation. In Proc. of IUI ’10. 321ś324. Venus?: An Empirical Study of Food-Images.. In Proc. of WebSci ’15. 63ś1. [12] FSA. 2016. Guide to creating a front of pack (FoP) nutrition label for pre-packed [40] Claudia Wagner, Philipp Singer, and Markus Strohmaier. 2014. The nature and products sold through retail outlet. available at https://www.food.gov.uk/sites/de evolution of online food preferences. EPJ Data Science 3, 1 (2014), 1ś22. fault/iles/multimedia/pdfs/pdf-ni/fop-guidance.pdf. Last accessed on 22.03.2019. [41] Joint Who and FAO Expert Consultation. 2003. Diet, nutrition and the prevention (2016). of chronic diseases. World Health Organ Tech Rep Ser 916, i-viii (2003). [13] Dhomas Hatta Fudholi, Noppadol Maneerat, and Ruttikorn Varakulsiripunth. [42] Longqi Yang, Yin Cui, Fan Zhang, John P Pollak, Serge Belongie, and Deborah 2009. Ontology-based daily menu assistance system. In Proc. of ECTICON ’09. Estrin. 2015. Plateclick: Bootstrapping food preferences through an adaptive 694ś697. visual interface. In Proc. of CIKM ’15. 183ś192. [14] Xiaoyan Gao, Fuli Feng, Xiangnan He, Heyan Huang, Xinyu Guan, Chong Feng, [43] Longqi Yang, Cheng-Kang Hsieh, Hongjian Yang, John P Pollak, Nicola Dell, Zhaoyan Ming, and Tat-Seng Chua. 2018. 
Visually-aware Collaborative Food Serge Belongie, Curtis Cole, and Deborah Estrin. 2017. Yum-Me: A Personalized Recommendation. arXiv preprint arXiv:1810.05032 (2018). Nutrient-Based Meal Recommender System. ACM Transactions on Information [15] Mouzhi Ge, Francesco Ricci, and David Massimo. Health-aware Food Recom- Systems (TOIS) 36, 1 (2017), 7. mender System. In Proc. of RecSys ’15. 333ś334. [44] Yu-Xiao Zhu, Junming Huang, Zi-Ke Zhang, Qian-Ming Zhang, Tao Zhou, and [16] Elizabeth Gorbonos, Yang Liu, and Chính T Hoàng. 2018. NutRec: Nutrition Yong-Yeol Ahn. 2013. Geography and similarity of regional cuisines in China. Oriented Online Recipe Recommender. In Proc. of WI ’18. 25ś32. PloS one 8, 11 (2013), e79161. [17] Tom Griiths. 2002. Gibbs sampling in the generative model of latent dirichlet allocation. (2002). 28