Exploring eating behaviours modelling for user clustering

Sema Akkoyunlu, UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, Paris, France, sema.akkoyunlu@agroparistech.fr
Cristina Manfredotti, UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, Paris, France, cristina.manfredotti@agroparistech.fr
Antoine Cornuéjols, UMR MIA-Paris, AgroParisTech, INRA, Université Paris-Saclay, Paris, France, antoine.cornuejols@agroparistech.fr
Nicolas Darcel, UMR PNCA, AgroParisTech, INRA, Université Paris-Saclay, Paris, France, nicolas.darcel@agroparistech.fr
Fabien Delaere, Danone Nutricia Research, Palaiseau, France, fabien.delaere@danone.com

ABSTRACT
Food based dietary guidelines are not fully adopted by consumers. One of the principal explanations for this failure is that they are too general and do not take into account eating habits. Experts in nutrition believe that providing personalized dietary recommendations via a nutrition recommender system can help people improve their eating behaviours. Understanding eating habits is a keystone in order to build a context aware recommender system that delivers personalized dietary recommendations. As a step towards this goal, we propose a method for representing food consumptions based on Doc2Vec for discovering clusters of eating behaviours. We compare our method to the state of the art methods used in the nutrition community.

CCS CONCEPTS
• Information systems → Information extraction; • Human-centered computing → User models;

KEYWORDS
food recommender systems; user modelling; eating behaviours; Doc2Vec

ACM Reference format:
Sema Akkoyunlu, Cristina Manfredotti, Antoine Cornuéjols, Nicolas Darcel, and Fabien Delaere. 2018. Exploring eating behaviours modelling for user clustering. In Proceedings of Third International Workshop on Health Recommender Systems co-located with Twelfth ACM Conference on Recommender Systems, Vancouver, BC, Canada, October 6, 2018 (HealthRecSys'18), 6 pages.

∗ HealthRecSys'18, October 6, 2018, Vancouver, BC, Canada. ©2018 Copyright for the individual papers remains with the authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.

1 INTRODUCTION
Most chronic diseases such as diabetes, obesity and cardiovascular diseases are correlated to unhealthy eating habits [25]. In order to help people to adopt healthier eating habits, public health agencies have created dietary guidelines targeted to the general population. These guidelines can be food based, for instance "eat at least 5 fruit or vegetable per day", "limit your consumption of salt"¹ or nutrient based "XX gram of iron per day". However, compliance with the guidelines is relatively low although the awareness about food based dietary guidelines is rather good [8]. Several causes contribute to this phenomenon: cultural and personal preferences, difficulty of implementing dietary changes, availability and price of food items [22]. One solution to this problem could be to provide a food recommender system able to take into account most of these causes. Early studies showed that web-based personalized interventions are more effective than standard public health advice for inducing compliance with healthy eating recommendations [6]. Moreover, changing eating habits is challenging, thus food based recommendations should be easy to follow [1]. But for recommendations to be practical, one should first understand consumers' eating behaviour.

¹ http://solidarites-sante.gouv.fr/IMG/pdf/PNNS_2011-2015.pdf

In food related recommender systems, the recommended objects are recipes [4] [20], food items [5] or menus [3]. Recipe recommendation systems take advantage of users' past recipe ratings to propose recipes that they might like. Menu based recommendation systems combine meals that users showed preference for with nutritional constraints based on the nutritional requirements of users. Food item based recommendation systems are designed to learn the users' tastes for food items. Most of them use popular recommendation algorithms often based on matrix factorization techniques which learn an embedding space for representing users and food items simultaneously. However, this representation does not take into account that food items are seldom consumed in isolation and that users' preferences for food items can change in response to the other food items consumed (i.e. the dietary context) and to the context of consumption (e.g. eating croissant for breakfast is acceptable, but it is not for lunch). It seems necessary to take into account these aspects for increasing the efficacy of food item recommendation in real-life settings. Context-aware recommender systems seem therefore to be the appropriate approach. However, modelling the context is highly dependent on the domain at hand. It is thus necessary to first model eating behaviours and understand how the context impacts eating behaviours.

Several dietary assessment methods are available: the food frequency questionnaire (FFQ), 24-hour dietary recall (24HR) and food diaries. FFQ are easy to implement and cost-effective; however, the questionnaire is tailored by research groups with a specific aim in mind. Besides, its accuracy is not enough for recommendation purposes. The 24HR method is an interview that requires 30 minutes; it is rather precise, but one day of consumption per user is not sufficient to learn preferences. Food diaries are a prospective open-ended food consumption assessment method where consumers write down all the food items and beverages consumed over a specific time period [19]. Quite often, the time periods go from 3 to 7 consecutive days. The main advantages are that no interviewer is required, the whole process can be automated and adapted for recommendation purposes and, provided several days of consumption are recorded, changes in diet can be captured. Throughout the paper, the toy food diaries dataset in Table 1 will be used to illustrate the user modelling methods.

User       ID_meal  Meals
Anna       m1       coffee, cereals
           m2       pasta, beef, fruits
           m3       coffee
           m4       rice, vegetable, fruits
Bob        m3       coffee
           m5       pizza, soda
           m3       coffee
           m6       pasta, soda
Christian  m7       tea, cereals
           m8       pasta, vegetable
           m9       tea, cereals, fruits
           m4       rice, vegetable, fruits

Table 1: Toy example of food consumption data (10 food items, 4 meals per user)

Dietary behaviour is modelled using two main types of methods: theoretical ones and empirical ones [15]. Theoretical methods use dietary indexes developed by research groups or agencies in order to rank the healthiness of eating behaviours. Indexes are constructed based on the current knowledge in nutrition but can also include current dietary guidelines and recommendations which are usually generated from empirical research. However, Newby et al. [15] stress the fact that there can be conflicts when there is no scientific consensus about what a healthy behaviour is before analysis. It results in indexes that measure different definitions of a healthy behaviour. In empirical methods, there is no nutritional a priori about eating behaviours, i.e. there is no definition of what a healthy behaviour is. Patterns are found with no nutritional a priori. We only focus on empirical ones as our goal is to learn dietary behaviours based on consumption data in an unsupervised way. In the literature, two methods stand out for discovering eating behaviours: clustering and factor analysis. Cluster analysis aims at discovering groups of behaviours, while factor analysis seeks the most relevant factors. Clustering may use factor analysis as a preprocessing step. Thus, the K-Means algorithm is often applied to the matrix of consumption of food items directly [17] or after dimension reduction using e.g. Principal Component Analysis (PCA) [21] or Non-Negative Matrix Factorisation [26].

To our knowledge, there is no comprehensive review about methods used for empirically deriving eating patterns [15]. Each study works on its own dataset and, most of the time, only one method of dimension reduction is applied for deriving eating behaviours. There is no apparent gold standard method, but the existing literature seems to favour the use of PCA.

These methods are reductionist: they only consider food items alone. Nutrition experts argue that this reductionist perspective may not be efficient for recommendation purposes: deeper and more complex information is needed [23]. Opposed to the reductionist viewpoint, the holistic approach considers the diet as "a dynamic interaction of the parts of their synthesis" [7]. Food item interactions should accordingly be used for modelling eating behaviours.

One solution would be to consider dietary data in a meal-based form. Meal pattern analysis provides more details regarding the way people compose their meals [24] and could provide more insights for characterising eating behaviours. This approach takes into account the complexity of the diet and aims at overcoming the limitations of the study of foods in isolation [7]. A meal based approach for discovering eating behaviours was introduced by Woolhead et al. [24]. They used frequent itemsets to generate a generic meal classification. They derived 63 generic meals across all meal types and computed mean daily nutrient intakes associated to the generic meals. For each subject, the mean daily energy percentage contribution of each generic meal type was computed. Then PCA was applied to discover eating behaviours. The authors themselves argue that this methodology induces a subjective classification. Besides, relying on frequent itemsets to code meals may overlook eating patterns that are infrequent at the population level but frequent at the individual level, discarding these patterns as noise. This shows the necessity of an adequate representation of meals.

Developing a food recommender system that takes into account the meals and their context, and not only food items, requires that two main challenges be met: (1) finding a proper meal description model in which distances between meals can be computed and (2) discovering an adequate way of aggregating several meals for computing distances between users in order to discover clusters of eating behaviours.

In this paper, our contribution is twofold: we propose a novel domain of application of word embedding to user profiling and we compare three approaches to describe eating behaviours. We propose a new approach to model meal representation by applying the Doc2Vec algorithm [10] in order to learn a meal embedding space. This allows, in turn, the use of a cosine similarity adapted to matrices to compute similarities between users and infer clusters of users. Moreover, in food based approaches, we compare the state of the art methods with Doc2Vec applied on users.

The rest of the paper is organised as follows. Section 2 describes methods for user modelling. Section 3 reports the results of our experiments on a real-world dataset. We discuss the results in Section 4 and we finally conclude in Section 5.

2 METHODS

2.1 Food-based methods

2.1.1 State of the art methods.
In alimentation behaviour science, researchers work mostly on food items. They transform food consumption data into matrices where the columns correspond to the frequency or the quantity of consumption of food items and the rows to users as shown in Figure 1.

Figure 2 is an illustration of what applying Doc2Vec algorithm on individual eating consumptions means.

The next step consists in applying Principal Component
Analysis (PCA) or Non-Negative Matrix Factorization (NMF) [11]. PCA consists in finding a set of linearly uncorrelated variables, called principal components, that capture as much as possible the variance of the data points. NMF is similar to PCA but imposes a non-negativity constraint on the parameters of the model. This is found useful in many domains such as signal processing and recommender systems, because it is more amenable to interpretation by experts [12].

Figure 1: Matrix of consumption of the toy example

Clusters of eating behaviours are then discovered by applying the K-Means algorithm on the result of PCA or NMF. In order to find the optimal number of clusters, a popular clustering evaluation metric is used, the silhouette coefficient [18].

2.1.2 Another food based method: applying Doc2Vec to users.
Word2Vec is a popular model for word embedding. Doc2Vec, proposed by [10], is an extension of Word2Vec: instead of learning word embeddings, the model learns distributed representations of arbitrarily large units of text such as sentences, paragraphs or documents. It was proposed in two flavours: DBOW (Distributed Bag Of Words) and DMPV (Distributed Memory version of Paragraph Vector). DBOW is simpler than DMPV as it does not take into account the order of the words when learning the embedding space. It is the version that is suited for our task as the order does not matter. Besides, empirical evaluations of Doc2Vec showed that DBOW performs better than DMPV [9].

The food based approach considers that a user is described by the frequency of consumption of single food items. Similarly, a user can be considered as a document where the food items eaten over a specific amount of time play the role of words.

Figure 2: Application of Doc2Vec on user consumptions in the food based approach

Individual documents of consumption are fed in the model. The result is an embedding space of users based on their eating consumptions which means that each user is described by a set of coordinates. Users are represented as vectors in this figure because similarity between users is computed with cosine similarity, a metric commonly used in document retrieval. It is the cosine of the angle between two user vectors. At this stage of the method, we compute the similarity matrix of users.

Our goal is then to cluster users according to their similarity. Spectral clustering is a method that exploits similarity measures by considering data points as nodes of a weighted connected graph. Clusters are found by partitioning this graph based on the eigenvectors of the Laplacian matrix derived from the similarity matrix. Choosing the optimal number of clusters is often a problem for clustering algorithms. There are several heuristics adapted for spectral clustering. The heuristic advised by [13] is the eigengap heuristic. With the eigenvalues of the Laplacian sorted in increasing order, the optimal number of clusters k is the number such that the difference between consecutive eigenvalues $\lambda_{k+1} - \lambda_k$ is large. Justification for this procedure is provided in [13].

2.2 A novel meal based method using Doc2Vec

2.2.1 Learning an embedding space for meals.
Meals are defined as combinations of food items simultaneously consumed by one user at a single moment of consumption on one survey day. Meals are actually lists of food items. In the meal based approach, the objective is to be able to compute similarities between meals in order to compute similarities between users to derive clusters of users. However, it is not trivial to compute similarity between two meals, for example between {pasta, beef, fruits} and {rice, vegetable, fruits}.

A straightforward idea would be to define first a similarity between food items and then define a way to summarize those similarities to compute a similarity between meals. We observe two problems with this idea. First, there is no domain similarity measure between food items. One can use classification of food items as a proxy to a similarity measure. But there are lots of classification schemes in the literature. Second, this approach is against the philosophy of the holistic approach as it ignores interactions that may exist between food items.

An elegant way of learning such interactions is to learn an embedding space with Doc2Vec. Indeed, the embedding is learned in such a way that similar meals are closer in the induced space as shown in Figure 3. Each user is now described by a matrix where the rows correspond to the meals and the columns to the coordinates of meals in the Doc2Vec induced space.

Figure 3: Application of Doc2Vec on meals in the meal based approach

determined by using an internal clustering evaluation score, the silhouette score. The optimal number of clusters is found when the silhouette score is maximised. For PCA and NMF, we vary the number of clusters between 2 and 30 and compute the silhouette score. The score is maximised for k = 9.

Loadings of factors of PCA and NMF can give a hint about the new representation space of users. Figure 4 shows the loadings of factors of PCA according to food items. For ease of reading, only food items whose absolute value of contribution to any factor is greater than 0.005 are displayed. NMF factors are shown in Figure 5. The food items are displayed if their contribution to any factor is greater than 0.3.

2.2.2 Computing distance between users.
Once the meal representation is learned, the challenge becomes one of computing a similarity between users. In our approach, this amounts to computing the similarity between two documents by taking into account the distances between sentences. Indeed, meals can be considered sentences of users who are documents. Mathematically speaking, this amounts to computing a similarity between matrices.
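As a concrete illustration of a similarity between two user matrices, the following minimal sketch computes a cosine similarity based on the Frobenius inner product. It is an illustration only, not the authors' implementation: the function name `frobenius_cosine`, the made-up 2-dimensional meal vectors, the convention of returning 0 for all-zero matrices, and the choice to truncate both matrices to the shorter number of rows (meals) before computing the inner product and the norms are all assumptions of this example.

```python
import math
from typing import List

Matrix = List[List[float]]  # one row per meal, one column per embedding coordinate


def frobenius_cosine(a: Matrix, b: Matrix) -> float:
    """Cosine similarity between two user matrices via the Frobenius inner product.

    Assumption of this sketch: both matrices are truncated to their first
    min(len(a), len(b)) rows, so extra meals of the longer user are ignored.
    """
    n_rows = min(len(a), len(b))
    a, b = a[:n_rows], b[:n_rows]
    # Frobenius inner product: sum of element-wise products, meal i against meal i.
    inner = sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    norm_a = math.sqrt(sum(x * x for row in a for x in row))
    norm_b = math.sqrt(sum(y * y for row in b for y in row))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero matrices
    return inner / (norm_a * norm_b)


# Hypothetical meal embeddings (made-up numbers, not learned with Doc2Vec).
anna = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # three meals
bob = [[1.0, 0.0], [0.0, 1.0]]               # same first two meals as anna
christian = [[0.0, 1.0], [1.0, 0.0]]         # same two meals, swapped positions

sim_anna_bob = frobenius_cosine(anna, bob)
sim_anna_christian = frobenius_cosine(anna, christian)
```

In this toy setting, `anna` and `bob` share their first two meals and obtain a similarity of 1 up to floating-point rounding, whereas `christian`, who has the same two meals in swapped positions, obtains a similarity of 0 with `anna`. This shows how strongly a row-by-row comparison depends on the position of each meal.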
Such a similarity was introduced in [14]. The authors of the paper proposed a cosine kernel in order to compute the similarity between the documents A and B in Equation 1:

$$\cos(A, B) = \frac{\langle A, B \rangle}{\lVert A \rVert_F \cdot \lVert B \rVert_F} \qquad (1)$$

where $\langle \cdot, \cdot \rangle$ is the Frobenius inner product and $\lVert \cdot \rVert_F$ the Frobenius norm. Using the Frobenius inner product makes it possible to compare the similarity of the sentences to determine the similarity of the documents. Let us denote $s_A$ and $s_B$ the number of sentences in document A and document B respectively. This formula implies that the cosine similarity is computed between the first sentences of both documents, then the second ones and so on until the $\min(s_A, s_B)$-th sentences. If one document is longer than the other one, the last sentences of the longer document are not taken into account for the similarity computation. For eating behaviour modelling, this means that two consumers are similar if they eat similar meals at the same moment of the day on the same day. This is a rather strong assumption concerning eating behaviour modelling.

3 EXPERIMENTS

3.1 INCA2 dataset
The INCA2 dataset² consists of individual 7-day food records collected during 2006-2007 from 2,624 adult French consumers over several months in order to take into account seasonality. A close-ended list of 1,342 food items organized in 122 sub-groups and in 44 groups was used for coding the dietary records. Further detail about the survey methods can be found in [2]. We decide to work on sub-groups because the vocabulary is larger than when considering groups while having enough repetitions unlike when considering food items. We do not impose the number of clusters to be the same for all the methods as we want to see if the number of clusters that each method discovers is different and whether the clusters overlap.

² https://www.data.gouv.fr/fr/datasets/donnees-de-consommations-et-habitudes-alimentaires-de-letude-inca-2-3/

3.2 PCA and NMF on consumption data
The state of the art methods require the selection of two parameters: the number of components C of the dimensionality reduction method and the number of clusters K. The number of clusters k is

Figure 4: Factor loadings of PCA: explaining the new representation space

3.3 Doc2Vec on users
We constitute the corpus by aggregating the food item consumptions per user, each user constituting a document. We use the Gensim implementation of Doc2Vec in order to learn our model. The corpus contains 2,624 documents. After learning the model, we compute the cosine similarity of users and perform spectral clustering. The eigengap heuristic gives an optimal number of 5 clusters.

3.4 Doc2Vec on meals
We gather the corpus of meals by aggregating food items consumed at the same moment of consumption, on the same day, by the same user. The corpus consists of 37,283 unique meals. A meal embedding is learned using the Gensim Doc2Vec implementation. For each user, the vector of each of their meals is computed, leading to user matrices. The similarity matrix between users is obtained by applying the cosine kernel to user matrices. Spectral clustering is applied and the number of clusters is determined using the eigengap heuristic. It yields 3 clusters.

Figure 5: Factor loadings of NMF: explaining the new representation space

3.5 Comparison of the clustering results
Our goal now is to compare the clustering results and determine in which cases a food-based approach is adequate and the contribution of a meal-based approach. In order to compare agreement between clustering results, we compute the Adjusted Rand Index (ARI) [16]. It is a popular measure which consists in computing the agreement between two clustering results, i.e. two partitions. ARI is recommended for cases where the number of clusters is different, which is our case. ARI takes values in [−1, 1], 1 meaning that both clusterings agree; values close to 0 mean that the clusterings agree no more than random assignments would.

                           FOOD BASED                 MEAL BASED
                 PCA     NMF     Doc2Vec users    Doc2Vec meals
PCA              1       0.93    0.14             0.017
NMF                      1       0.13             0.018
Doc2Vec users                    1                0.013
Doc2Vec meals                                     1

Table 2: Comparison of clustering results with Adjusted Rand Index

We also plot in Figure 6 the repartition of users in clusters across the methods. From one method to another, the cluster numbers are attributed arbitrarily and do not hold meaning.

Figure 6: Repartition of users in clusters per method

4 DISCUSSION

4.1 Comparison of PCA and NMF for food based user modelling
No matter the factorization method used before the clustering step, the clustering results are very similar according to the Adjusted Rand Index. This means that the choice of the factorization method for clustering users based on their food consumptions is not primordial. However, as shown in Figures 4 and 5, the eating behaviours discovered are different. The coefficients of PCA can be interpreted as consumption when positive and non-consumption when negative. For instance, the eating behaviour 0 consists in drinking tap water but not spring or mineral water. We can also extract information such as that those who consume coffee do not consume tea and vice versa. By contrast, the coefficients of NMF are non-negative, hence the interpretation only concerns food consumptions. For instance, the eating behaviour 0 consists in eating all types of vegetables. The extracted eating behaviours are different according to the dimensionality reduction method. We recommend testing both methods to compare extracted eating behaviours as the insights provided by both methods can be interesting.

4.2 Contribution of Doc2Vec for the food based approach
We apply Doc2Vec directly to users in order to challenge the state of the art methods in food based approaches as we want to see how the NLP method performs on this task. The Doc2Vec method on users yields a smaller number of clusters and the clustering results are rather different. A major drawback of this method is that eating behaviours cannot be inspected as easily as in the state of the art methods. Further analysis is needed in order to understand why clustering results are so different. This method is adequate if the objective is to extract clusters of consumers; however, in its current state, this approach is not really adapted if explanations are expected about eating behaviours. Being able to identify eating behaviours is key for recommendation purposes as explanations may be needed for people to implement the recommendations. Usually, the performance of a neural language model is computed on supervised tasks such as document retrieval or analogies. We are in an unsupervised setting, which complicates the assessment of the performance of the learned meal embedding.

4.3 Comparison of the state of the art food based approach and the meal based approach
It is in the meal based approach that the number of clusters is the smallest. This shows that consumers of this dataset are less diverse with regard to their way of composing their meals, as we only find 3 clusters. This result should be interpreted in the light of the assumption made about eating behaviours. We consider that two consumers are similar in the meal based approach if they consume similar meals at the same moment of the day on the same day, a strong assumption on 7-day food diary data. This may lead to rather low values of similarity overall between users, yielding fewer clusters. It would be interesting to investigate the relaxation of this assumption by assuming that users are similar if they consume similar meals regardless of the day of consumption or the moment of consumption. Again, it is difficult to extract eating behaviours as the model is not designed for this purpose. Another language model could be used for modelling food consumption, the Latent Dirichlet Allocation (LDA) model.

5 CONCLUSION
In this paper we explore user modelling in food consumption for clustering users for recommendation purposes. We compare two state of the art methods in the nutrition community. Our conclusion is that both methods yield more or less the same clustering results. However, the eating behaviours discovered are different. Moreover, we propose a new food-based approach by considering food consumptions as textual data and learning an embedding model with Doc2Vec. The application of Doc2Vec to user food consumption is adequate for user clustering; however, it is not adapted for extracting eating behaviours. We argued the importance of having a holistic approach toward nutrition in order to make acceptable recommendations. We propose a new meal based approach which consists in learning a meal embedding space and then computing user similarity based on their meals' similarity. The usage of NLP for food data analysis is promising. However, if clusters that can be explained are needed (which is often the case), then it is better to resort to generative language models such as LDA. Further work will investigate the use of LDA for modelling eating behaviours.

6 ACKNOWLEDGEMENT
This study was funded by Danone Nutricia Research.

REFERENCES
[1] Bier, D. M., Derelian, D., German, J. B., Katz, D. L., Pate, R. R., and Thompson, K. M. Improving compliance with dietary recommendations. Nutrition Today 43, 5 (sep 2008), 180–187.
[2] Dubuisson, C., Lioret, S., Touvier, M., Dufour, A., Calamassi-Tran, G., Volatier, J.-L., and Lafay, L. Trends in food and nutritional intakes of french adults from 1999 to 2007: results from the INCA surveys. British Journal of Nutrition 103, 07 (dec 2009), 1035.
[3] Elsweiler, D., and Harvey, M. Towards automatic meal plan recommendations for balanced nutrition. In Proceedings of the 9th ACM Conference on Recommender Systems, RecSys 2015, Vienna, Austria, September 16-20, 2015 (2015), pp. 313–316.
[4] Freyne, J., and Berkovsky, S. Recommending food: Reasoning on recipes and ingredients. In User Modeling, Adaptation, and Personalization, 18th International Conference, UMAP 2010, Big Island, HI, USA, June 20-24, 2010. Proceedings (2010), pp. 381–386.
[5] Ge, M., Elahi, M., Fernández-Tobías, I., Ricci, F., and Massimo, D. Using tags and latent factors in a food recommender system. In Proceedings of the 5th International Conference on Digital Health 2015 (New York, NY, USA, 2015), DH '15, ACM, pp. 105–112.
[6] Hageman, P. A., Pullen, C. H., Hertzog, M., and Boeckner, L. S. Effectiveness of tailored lifestyle interventions, using web-based and print-mail, for reducing blood pressure among rural women with prehypertension: main results of the wellness for women: DASHing towards health clinical trial. International Journal of Behavioral Nutrition and Physical Activity 11, 1 (dec 2014).
[7] Hoffmann, I. Transcending reductionism in nutrition research. The American Journal of Clinical Nutrition 78, 3 (sep 2003), 514S–516S.
[8] Ivens, B. J., and Smith Edge, M. Translating the Dietary Guidelines to Promote Behavior Change: Perspectives from the Food and Nutrition Science Solutions Joint Task Force. J Acad Nutr Diet 116, 10 (Oct 2016), 1697–1702.
[9] Lau, J. H., and Baldwin, T. An empirical evaluation of doc2vec with practical insights into document embedding generation. In Proceedings of the 1st Workshop on Representation Learning for NLP, Rep4NLP@ACL 2016, Berlin, Germany, August 11, 2016 (2016), pp. 78–86.
[10] Le, Q., and Mikolov, T. Distributed representations of sentences and documents. In Proceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32 (2014), ICML'14, JMLR.org, pp. II–1188–II–1196.
[11] Lee, D. D., and Seung, H. S. Learning the parts of objects by nonnegative matrix factorization. Nature 401 (1999), 788–791.
[12] Luo, X., Zhou, M., Xia, Y., and Zhu, Q. An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems. IEEE Trans. Industrial Informatics 10, 2 (2014), 1273–1284.
[13] Luxburg, U. A tutorial on spectral clustering. Statistics and Computing 17, 4 (Dec. 2007), 395–416.
[14] Mijangos, V., Sierra, G., and Montes, A. Sentence level matrix representation for document spectral clustering. Pattern Recognition Letters 85 (2017), 29–34.
[15] Newby, P. K., and Tucker, K. L. Empirically derived eating patterns using factor or cluster analysis: a review. Nutr. Rev. 62, 5 (May 2004), 177–203.
[16] Rand, W. M. Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association 66, 336 (dec 1971), 846–850.
[17] Reedy, J., Wirfalt, E., Flood, A., Mitrou, P. N., Krebs-Smith, S. M., Kipnis, V., Midthune, D., Leitzmann, M., Hollenbeck, A., Schatzkin, A., and Subar, A. F. Comparing 3 dietary pattern methods–cluster analysis, factor analysis, and index analysis–with colorectal cancer risk: The NIH-AARP diet and health study. American Journal of Epidemiology 171, 4 (dec 2009), 479–487.
[18] Rousseeuw, P. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 1 (Nov. 1987), 53–65.
[19] Shim, J.-S., Oh, K., and Kim, H. C. Dietary assessment methods in epidemiologic studies. Epidemiology and Health (jul 2014), e2014009.
[20] Teng, C.-Y., Lin, Y.-R., and Adamic, L. A. Recipe recommendation using ingredient networks. In Proceedings of the 4th Annual ACM Web Science Conference (New York, NY, USA, 2012), WebSci '12, ACM, pp. 298–307.
[21] Thorpe, M. G., Milte, C. M., Crawford, D., and McNaughton, S. A. A comparison of the dietary patterns derived by principal component analysis and cluster analysis in older australians. International Journal of Behavioral Nutrition and Physical Activity 13, 1 (feb 2016).
[22] Webb, D., and Byrd-Bredbenner, C. Overcoming consumer inertia to dietary guidance. Advances in Nutrition 6, 4 (jul 2015), 391–396.
[23] Wendel, S., Dellaert, B. G., Ronteltap, A., and van Trijp, H. C. Consumers' intention to use health recommendation systems to receive personalized nutrition advice. BMC Health Services Research 13, 1 (apr 2013).
[24] Woolhead, C., Gibney, M. J., Walsh, M. C., Brennan, L., and Gibney, E. R. A generic coding approach for the examination of meal patterns. The American Journal of Clinical Nutrition 102, 2 (jun 2015), 316–323.
[25] World Health Organization. Diet, nutrition and the prevention of chronic diseases: report of a joint who/fao expert consultation. Tech. rep., 2003.
[26] Zetlaoui, M., Feinberg, M., Verger, P., and Clemençon, S. Extraction of food consumption systems by nonnegative matrix factorization (NMF) for the assessment of food choices. Biometrics 67, 4 (2011), 1647–1658.
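As a supplementary illustration of the Adjusted Rand Index used in Section 3.5 to compare partitions, the sketch below implements the usual pair-counting form of the index in plain Python. It is not the code used for the experiments: the function name `adjusted_rand_index`, the toy label lists, and the convention of returning 1.0 in degenerate cases (fewer than two users, or both partitions trivial) are assumptions of this example.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def adjusted_rand_index(labels_a: Sequence[Hashable],
                        labels_b: Sequence[Hashable]) -> float:
    """Pair-counting Adjusted Rand Index between two partitions of the same users."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same users")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # convention chosen here when there are fewer than two users

    # Pairs of users grouped together in both partitions (contingency table cells).
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs of users grouped together in each partition taken separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions use a single cluster
    return (together_both - expected) / (maximum - expected)


# Made-up cluster assignments for six users (not results from the paper).
method_1 = [0, 0, 1, 1, 2, 2]
method_2 = [1, 1, 0, 0, 2, 2]  # same grouping, different cluster numbers
ari_same_grouping = adjusted_rand_index(method_1, method_2)
```

Because the index only looks at which users are grouped together, the two toy partitions above agree fully even though their cluster numbers differ, and the two partitions may also have different numbers of clusters.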