
August

HeASe: An AI-powered Framework to Promote Healthy and Sustainable Eating

Alessandro Petruzzelli

Cataldo Musto

Michele Ciro Di Carlo

Giovanni Tempesta

Giovanni Semeraro

University of Bari Aldo Moro, via Orabona 4, Bari, 70125, Italy

2024


This paper introduces Healthy And Sustainable eating (HeASe), a comprehensive framework designed to promote healthy and sustainable eating by leveraging large language models and food retrieval techniques. As global concerns about nutrition and environmental sustainability escalate, effective solutions that help people eat better and improve their knowledge and self-awareness about food become imperative. To this end, given an input recipe, our framework first identifies a set of substitute meals by exploiting a retrieval strategy based on macro-nutrients, then relies on large language models to re-rank candidate recipes based on their healthiness and sustainability. As shown in our experiments, the methodology can expose individuals to better dietary choices, potentially contributing to overall well-being and reducing the ecological footprint of food consumption.

Keywords: Food Recommendation, Large Language Models, Health-aware Recommender Systems, Sustainability

1. Introduction

Today, the food industry is efficient and offers a variety of fresh and processed options. However, every step of the agricultural and food chain raises environmental concerns. Land use, water consumption, and air emissions all have an impact on the environment. While technological advancements create new markets and opportunities, they must also address these environmental challenges. To mitigate the environmental footprint of the food chain, a fundamental shift in consumer behavior is essential. Indeed, we must transition towards a dietary paradigm that prioritizes both individual health and environmental sustainability [1]. This necessitates a move away from conventional consumption patterns and towards a more mindful approach to food choices. All these principles are in line with several Sustainable Development Goals (SDGs), in particular SDG3 (Good Health and Well-being) and SDG12 (Responsible Consumption and Production).

In recent years, food recommendation systems (RSs) [2] have emerged as a promising avenue to guide consumers toward healthier and more sustainable dietary choices. These systems can be categorized into two primary types: health-aware and sustainability-aware RSs [3]. Health-aware food RSs [4] aim to assist users in defining daily diets that align with their nutritional needs and health goals. These systems typically achieve this by balancing user preferences with various health-related factors. Previous methods have tried to incorporate healthiness by replacing ingredients with healthier alternatives [5, 6] or by incorporating nutritional facts as function constraints [7, 8]. In [9], a post-filtering method has been proposed to score recipes based on health criteria. While these approaches have shown promise in promoting healthier eating habits, they often face limitations. Notably, methods that directly substitute ingredients or impose hard constraints on healthiness can significantly alter the recipe's original characteristics, potentially compromising user satisfaction. Additionally, post-filtering approaches may discard potentially healthy recipes that fall below an arbitrary threshold, limiting user choice.

On the other hand, sustainability-aware food RSs solely consider the environmental impact related to food consumption. For instance, in [3], the authors introduce a system that exploits information about the water footprint. In particular, it promotes recipes with ingredients whose production needs a lower quantity of water. While interesting and certainly novel, this approach fails to capture the complete picture of a recipe's impact, ignoring other sustainability aspects such as carbon emissions [10], which play a key role in assessing the sustainability of a recipe. To sum up, the analysis of the state of the art showed that there is a scarcity of systems that jointly tackle the problem of providing food suggestions that are healthy and sustainable at the same time.

Accordingly, we propose a novel framework that aims to fill this gap by exploiting large language models (LLMs) and a recipe similarity formula based on macro-nutrients. In particular, given an input (not sustainable) recipe, we first use macro-nutrients to identify suitable alternatives, then we rank them based on our sustainability score, and we finally exploit large language models (i.e., GPT-3.5 Turbo [11]) to select an alternative recipe that is both healthy and sustainable. To the best of our knowledge, the use of LLMs to identify sustainable food alternatives is a completely novel research direction.

In our vision, this approach acknowledges that health-conscious consumers often consider not only the nutritional value of food but also its environmental impact. So, by incorporating a sustainability score for each ingredient, the framework can identify recipes that encompass both individual well-being and environmental responsibility. A toy example showing the behavior of the framework is presented in Figure 1, while the contributions of the paper can be summarized as follows:
• Sustainability Score: we introduce a strategy to estimate the sustainability of a recipe based on the information about the water and carbon footprint of its ingredients.
• Dataset: we release a new dataset that extends HUMMUS [12] with sustainability and healthiness scores for ingredients. In particular, we provided all the recipes in the dataset with information about environmental aspects. This will encourage and foster research in the area of sustainability-aware food RSs.
• HeASe Framework: we propose a framework that provides users with more sustainable and healthier recipes by exploiting: (a) recipe similarity based on macro-nutrients; (b) sustainability and healthiness scores; (c) a selection mechanism based on LLMs.
• Evaluation: we showed that our sustainability scores make it possible to identify similar but more sustainable recipes. Moreover, we also showed that LLMs can be particularly effective in selecting the most suitable alternative given a pool of candidate recipes. Both these directions have been scarcely investigated in the state of the art.

2. Assessing Healthiness and Sustainability

2.1. Calculating Healthiness of Recipes

Determining the "healthiness" of a recipe is a complex issue, heavily influenced by its nutrient composition and individual dietary needs. The concept of healthy food has evolved significantly, with past approaches focusing on factors like calorie information [4], cholesterol levels [13], or multiple nutrients like protein, sodium, and saturated fats [14].

Today, we have a more comprehensive framework based on guidelines from international health organizations like the World Health Organization (WHO) [15]. The WHO recommends daily intake ranges for 15 macro-nutrients. Based on these intakes, in the HUMMUS dataset [12] the authors created a single score reflecting a recipe's overall healthiness. In particular, the method relies on the "traffic light" system proposed by [16]: each macro-nutrient range is assigned a color based on its perceived healthfulness (green for healthy, yellow for moderate, red for unhealthy), and each color is mapped to a range of scores. The individual scores of the macro-nutrients are then added up and normalized to create a final WHO score ranging from 0 (very healthy) to 14 (very unhealthy) for each recipe. Given a recipe $R$, from now on the healthiness of the recipe calculated as just described is denoted as $WHO(R)$. For more details on the formula, we refer the reader to [12].

2.2. Calculating Sustainability of Recipes

While some previous attempts exist for calculating the healthiness of a recipe, the assessment of sustainability is relatively newer and scarcely investigated. Indeed, sustainability is a complex and constantly developing field, with no single universally accepted method. This makes it challenging to objectively compare the environmental impact of different recipes. One of the first attempts in this direction is represented by the SU-EATABLE Life (SEL) dataset [17], which provides carbon footprint (CF) and water footprint (WF) data for various food ingredients.

In this work, we tackle the task of assessing the sustainability of the recipes available in the HUMMUS dataset by properly processing the information encoded in the SEL dataset. In particular, the process is organized as follows:
1. Pre-process the SU-EATABLE Life (SEL) dataset: We remove noise by eliminating items lacking both footprints, removing unnecessary characters from names, and filtering out stopwords and adjectives.
2. Match ingredients with recipes: We match ingredients in the SEL dataset with those in each recipe from the HUMMUS dataset.
3. Handle missing ingredients: To ensure comprehensive matching, we perform additional steps:
• Check if the SEL ingredient name is contained within the recipe ingredient name.
• Check if the recipe ingredient name is contained within the SEL ingredient name.
• For ingredients still unmatched after the above checks, we use sentence transformers¹ to calculate the similarity between missing ingredients and matched ones in SEL, with a threshold of 0.98. We then manually reviewed the similarities to further refine the matches.
4. Manual intervention for high-occurrence missing ingredients: We manually addressed 87 missing ingredients with over 1000 occurrences, identifying 19 potential associations.
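The matching cascade can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `match_ingredient`, the toy SEL names, and the `similarity` callable are hypothetical. In the paper the similarity comes from a sentence-transformer model; here it is passed in as a placeholder argument so that the sketch stays self-contained, and it is assumed to be applied only to ingredients left unmatched by the containment checks.

```python
from typing import Callable, Iterable, Optional


def match_ingredient(
    recipe_ing: str,
    sel_names: Iterable[str],
    similarity: Callable[[str, str], float],
    threshold: float = 0.98,
) -> Optional[str]:
    """Return the SEL entry matched to a recipe ingredient, or None."""
    name = recipe_ing.lower().strip()
    sel = [s.lower().strip() for s in sel_names]

    # Step 2: direct match on the cleaned names.
    if name in sel:
        return name
    # Step 3, first check: SEL name contained in the recipe ingredient name.
    for s in sel:
        if s in name:
            return s
    # Step 3, second check: recipe ingredient name contained in the SEL name.
    for s in sel:
        if name in s:
            return s
    # Step 3, third check: embedding similarity with a strict threshold.
    best = max(sel, key=lambda s: similarity(name, s), default=None)
    if best is not None and similarity(name, best) >= threshold:
        return best
    return None  # left for manual review (step 4)


sel = ["olive oil", "beef", "rice"]
no_model = lambda a, b: 0.0  # placeholder: no embedding model is loaded here
print(match_ingredient("Extra virgin olive oil", sel, no_model))
print(match_ingredient("tofu", sel, no_model))
```

With the placeholder similarity, the first call is resolved by the containment check and the second stays unmatched, which is the case the manual step is meant to cover.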

Based on the previous strategy, given an ingredient $i$ we can obtain its corresponding water and carbon footprints, denoted as $WF(i)$ and $CF(i)$.

Next, to evaluate the overall environmental impact of an ingredient, we designed a new metric named Ingredient Sustainability Score (ISS), calculated as follows:

$$ISS(i) = \alpha \times WF(i) + \beta \times CF(i) \qquad (1)$$

where:
• $i$ represents the specific ingredient.
• $WF(i)$ denotes the water footprint of ingredient $i$.
• $CF(i)$ represents the carbon footprint of $i$.
• $\alpha$ and $\beta$ are weighting factors, with $\alpha = 0.2$ and $\beta = 0.8$.²

¹ https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

² This weighting scheme prioritizes the carbon footprint over the water footprint, reflecting the generally greater environmental impact of greenhouse gas emissions compared to water use. Of course, different weighting schemes may be adopted as well.

Next, based on the ISS scores for ingredients, we define a scoring function for recipes. To this end, we first rank the ingredients $i_0, \dots, i_{|R|-1}$ of a recipe $R$ in decreasing order of their ISS. Then, we define the Recipe Sustainable Score (RSS) for a recipe $R$ as:

$$RSS(R) = \sum_{k=0}^{|R|-1} ISS(i_k)\, e^{-k} \qquad (2)$$

where $i_k$ represents the $k$-th ingredient of the recipe, based on the previous ranking.

The intuition behind this formula is to give greater importance to the ingredients with a higher carbon and water footprint (i.e., those that have a greater environmental impact). Unlike a simple average, which gives identical importance to all the ingredients, this strategy gives more importance to ingredients that are not sustainable. Indeed, this discounting mechanism ensures that the overall recipe score reflects the dominance of the main ingredient while incorporating the influence of additional ingredients. Finally, the ultimate sustainability score (SuS) of a recipe is computed as:

$$SuS(R) = 1 - \frac{RSS(R) - MinRSS}{MaxRSS - MinRSS} \qquad (3)$$

Here MinRSS and MaxRSS are the minimum and maximum RSS scores obtained over the dataset of recipes, respectively, and are used as a normalization factor. It is important to note that the Sustainability Score is calculated based on the water and carbon footprint of all the ingredients of the recipe. Since both footprints measure negative environmental impacts and SuS is defined as one minus the normalized RSS, a higher SuS indicates a more sustainable recipe. A qualitative evaluation of the effectiveness of our formula is provided next.
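The three scores can be sketched in a few lines of Python. This is an illustrative sketch rather than the released code: the function names and the toy footprint values are hypothetical, and the discount is assumed to be exponential in the rank, $e^{-k}$, with ingredients sorted by decreasing ISS.

```python
import math

ALPHA, BETA = 0.2, 0.8  # water and carbon weights used for the ISS


def iss(wf: float, cf: float) -> float:
    """Ingredient Sustainability Score: weighted sum of the two footprints."""
    return ALPHA * wf + BETA * cf


def rss(footprints: list[tuple[float, float]]) -> float:
    """Recipe Sustainable Score from one (water, carbon) pair per ingredient."""
    scores = sorted((iss(wf, cf) for wf, cf in footprints), reverse=True)
    # Rank k = 0 is the least sustainable ingredient and gets full weight;
    # the following ingredients are discounted by exp(-k) (assumed discount).
    return sum(s * math.exp(-k) for k, s in enumerate(scores))


def sus(rss_value: float, min_rss: float, max_rss: float) -> float:
    """Min-max normalised sustainability score; higher means more sustainable."""
    return 1.0 - (rss_value - min_rss) / (max_rss - min_rss)


# Toy footprints (hypothetical values, not taken from the SEL dataset).
recipes = {
    "beef stew": [(15.0, 25.0), (0.3, 0.4)],
    "lentil soup": [(1.2, 0.9), (0.3, 0.4)],
}
rss_values = {name: rss(fp) for name, fp in recipes.items()}
lo, hi = min(rss_values.values()), max(rss_values.values())
for name, value in rss_values.items():
    print(name, round(sus(value, lo, hi), 3))
```

In this two-recipe toy dataset the recipe with the largest RSS is mapped to 0 and the one with the smallest RSS to 1, which mirrors the role of MinRSS and MaxRSS as normalization bounds.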

2.3. Description of the Dataset

As mentioned in the previous sections, one of the contributions of the paper is a new dataset providing information about the sustainability of recipes. Our dataset is based on the Health-aware User-centered recoMMendation and argUment-enabling data Set (HUMMUS) [12]. This dataset is built on top of the existing FoodKG [18] knowledge graph. The authors have added more data to the graph by collecting additional information for each recipe. They have also included valuable features such as nutritional scores from WHO, FDA, and Nutriscore. This dataset has over 507,000 recipes, and each recipe contains details about ingredients, macro-nutrients (calories, total fat, etc.), and other relevant information organized into tags. The tags provide information about key recipe aspects like main ingredients (meat, pork, fruit) and dish category (main course, dessert, breakfast). The dataset contains a set of 902 unique tag values.

To ensure the dataset's quality, we performed some pre-processing steps. We removed duplicate recipes, those missing any tags, and those lacking any listed ingredients. This process helped to refine the dataset and improve its overall usability, reducing the number of recipes to 214,800.

Next, we applied the pipeline described in Section 2.2 to calculate the SuS score for each recipe. However, during this process, we noticed that not all ingredients could be matched, even after manual checking. To maintain the overall quality of the dataset, we decided to remove recipes where more than 30% of ingredients could not be matched in the SEL dataset. This additional filtering reduced the number of recipes to 100,870.

Finally, we categorized recipes with three sustainability labels based on their sustainability scores:
• High ($SuS \geq 0.9$): Representing highly sustainable recipes (16,433 recipes).
• Medium ($0.5 < SuS < 0.9$): Representing moderately sustainable recipes (79,157 recipes).
• Low ($SuS \leq 0.5$): Indicating recipes with low sustainability (5,280 recipes).
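The filtering rule and the labelling thresholds translate directly into two small helpers. The function names below are hypothetical and only illustrate the rules stated in this section.

```python
def keep_recipe(n_ingredients: int, n_unmatched: int, max_missing: float = 0.30) -> bool:
    """Keep a recipe unless more than 30% of its ingredients have no SEL match."""
    return n_ingredients > 0 and n_unmatched / n_ingredients <= max_missing


def sustainability_label(sus: float) -> str:
    """Map a SuS score in [0, 1] to the High / Medium / Low label."""
    if sus >= 0.9:
        return "High"
    if sus > 0.5:
        return "Medium"
    return "Low"


print(keep_recipe(10, 3), keep_recipe(10, 4))
print(sustainability_label(0.95), sustainability_label(0.7), sustainability_label(0.5))
```

Note that the boundary values fall on the High side at 0.9 and on the Low side at 0.5, as in the label definitions.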

Some examples of the recipes that were classified in each category will be provided next. Moreover, the dataset together with the labels we calculated was used in our experiments to assess the effectiveness of the strategy and was released as a contribution of the work.

3. Description of the Framework

This section introduces the HeASe framework. As previously stated (see Figure 1), the goal of the framework is to automatically suggest a similar-but-healthier and more sustainable alternative to an input recipe given by a user. To make the framework easier to understand, we break down the process into four steps, each corresponding to a component in Figure 2.

3.1. Step 1: Encoding Module

The workflow starts with the Encoding Module. In a nutshell, this module takes the input recipe and returns a vector encoding the characteristics of the recipe in terms of macro-nutrients. This is a mandatory step, since we want to identify recipes that are healthier and more sustainable, but also similar to the input. Accordingly, it is necessary to understand the nutritional values and characteristics of a recipe.

To this end, we exploited a pre-trained transformer fine-tuned on the recipe domain³ to encode the input recipe based on the name of the recipe. Next, we calculate the similarity between the input recipe and the names of the other recipes available in the dataset. If a match with a similarity score exceeding 0.99 is found, we obtain a precise match, meaning that a recipe with (almost) the same name exists in the dataset. Otherwise, the most similar recipes are returned. In this way, the framework is able to manage both exact and non-exact matching.

In case of exact match, the output of the module is a vector encoding the values of the macro-nutrients of the matched recipe, together with the descriptive tags available in the dataset. Conversely, in case of non-exact matching, the macro-nutrients of the input recipe are obtained as the centroid vector of the macro-nutrients of the similar recipes previously identified by the transformer.
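A minimal sketch of the two branches of the Encoding module is shown below. It assumes that the name similarities have already been computed by the transformer and are passed in as a dictionary; the function name `encode_recipe` and the neighbourhood size `k` are assumptions made for illustration, since the number of similar recipes used for the centroid is not specified here.

```python
from statistics import fmean


def encode_recipe(similarities, macros, k=3, exact_threshold=0.99):
    """Return a macro-nutrient vector for the input recipe.

    similarities: {recipe_name: similarity between its name and the input name}
    macros: {recipe_name: list of macro-nutrient values}
    """
    ranked = sorted(similarities, key=similarities.get, reverse=True)
    best = ranked[0]
    if similarities[best] > exact_threshold:
        # Exact match: reuse the macro-nutrients of the matched recipe.
        return list(macros[best])
    # Non-exact match: centroid of the k most similar recipes.
    neighbours = [macros[name] for name in ranked[:k]]
    return [fmean(column) for column in zip(*neighbours)]


macros = {"carbonara": [600.0, 25.0], "amatriciana": [500.0, 15.0]}
print(encode_recipe({"carbonara": 0.995, "amatriciana": 0.60}, macros))
print(encode_recipe({"carbonara": 0.80, "amatriciana": 0.60}, macros, k=2))
```

The first call takes the exact-match branch; the second averages the two neighbours component by component.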

³ https://huggingface.co/davanstrien/autotrain-recipes-2451975973

3.2. Step 2: Retrieval Module

As mentioned in the previous step, the Encoding module generates a representation of the input recipe based on its macro-nutrients. Such a representation is then used to search for similar recipes. To address this task, we calculated the similarity in terms of macro-nutrients between the input recipe (as returned by the Encoding module) and all the recipes in the dataset, based on the cosine similarity. This allowed us to retrieve recipes that closely matched the input recipe in terms of their nutritional composition.

Moreover, we also used the tags that are available for each recipe as a further element to improve the quality of the retrieved recipes. In particular, we only return recipes that are similar and share at least one tag (e.g., pasta, breakfast, japanese, etc.) with the input recipe provided by the user. In this way, we prevent very different recipes from being included in the output of the Retrieval module.
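The retrieval step can be illustrated with a pure-Python sketch that combines cosine similarity on macro-nutrient vectors with the tag filter. The names `cosine` and `retrieve`, the `match_all_tags` switch, and the toy recipes are hypothetical; the switch is read here as "the candidate must contain every tag of the input recipe", which is one possible interpretation of strict tag matching.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_vec, query_tags, dataset, top_n=100, match_all_tags=False):
    """Return the names of the top_n most similar recipes passing the tag filter.

    dataset: iterable of (name, macro_vector, tags) triples.
    """
    query_tags = set(query_tags)
    candidates = []
    for name, vec, tags in dataset:
        tags = set(tags)
        # Default: at least one shared tag; strict mode: all input tags present.
        ok = query_tags <= tags if match_all_tags else bool(query_tags & tags)
        if ok:
            candidates.append((cosine(query_vec, vec), name))
    candidates.sort(reverse=True)
    return [name for _, name in candidates[:top_n]]


# Toy dataset (hypothetical values): calories, total fat, saturated fat.
dataset = [
    ("veggie pasta", [350, 8, 2], {"pasta", "vegetarian"}),
    ("beef pasta", [360, 9, 3], {"pasta", "meat"}),
    ("pancakes", [350, 8, 2], {"breakfast"}),
]
print(retrieve([350, 8, 2], {"pasta"}, dataset, top_n=2))
```

Here "pancakes" is excluded despite having identical macro-nutrients, because it shares no tag with the input.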

3.3. Step 3: Ranking Module

Once similar recipes are obtained, it is necessary to rank them in order to identify an alternative that is more sustainable and healthier. This role is played by the Ranking module, whose goal is to take as input the recipes previously returned by the Retrieval module and identify the better alternatives for the user. To rank the recipes, we defined a new function called HeASe Score (HS), defined as follows:

$$HS(R) = \alpha \cdot SuS(R) + \beta \cdot WHO(R) \qquad (4)$$

where:
• $R$ represents a recipe.
• $SuS(R)$ is a function that returns the sustainability score of $R$, as described in Section 2.2.
• $WHO(R)$ is a function that returns the WHO score of a given recipe.
• $\alpha$ and $\beta$ are hyperparameters that weight the importance of each factor.

At the end of this step, a list of ranked alternative recipes is obtained. It is worth emphasizing that the workflow can also stop after this step, by returning to the user the top-1 recipe retrieved by the system based on the HeASe score. However, we also implemented a Selection module based on LLMs to assess whether the knowledge encoded in large language models can be exploited to better handle this task.
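As a rough sketch, ranking by the HeASe Score amounts to sorting candidates by a weighted sum. The function names are hypothetical, the default weights 0.7 and 0.3 are the values reported in the experimental setting, and both scores are assumed here to be expressed on a common scale where higher is better.

```python
def hease_score(sus, who, alpha=0.7, beta=0.3):
    """HeASe Score: weighted sum of sustainability and healthiness scores."""
    return alpha * sus + beta * who


def rank_candidates(candidates, alpha=0.7, beta=0.3):
    """Sort (name, sus, who) triples by decreasing HeASe Score."""
    return sorted(
        candidates,
        key=lambda c: hease_score(c[1], c[2], alpha, beta),
        reverse=True,
    )


# Toy candidates (hypothetical scores in [0, 1], higher is better).
ranked = rank_candidates([("rich curry", 0.2, 0.9), ("quinoa bowl", 0.9, 0.5)])
print([name for name, _, _ in ranked])
```

With these weights the sustainability term dominates, so the candidate with the higher SuS comes first even though its healthiness score is lower.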

3.4. Step 4: Selection Module

Finally, in the Selection module, the output previously obtained from the Ranking module is processed by using LLMs, specifically GPT-3.5 Turbo, in order to select the most suitable alternative to the recipe provided as input by the user. To carry out this step we specifically designed a strategy inspired by Retrieval-Augmented Generation (RAG) [19], which takes as input the list of candidate recipes and asks the LLM to select the most suitable one. This is done through a zero-shot prompt that is used to query the LLM, leaving it the task of identifying the most suitable candidate recipe based on the knowledge encoded in the language model. An example of such a prompt is provided below. As shown in the example, we populate the prompt with the recipes previously identified and we let GPT pick the more sustainable alternative recipe. To mitigate potential biases like positional bias [20], the retrieved recipes are shuffled and inserted into the prompt without any additional information.

    Using your knowledge, please rank (if necessary) the following recipes from most to least recommended based on a balance of sustainability and healthiness:
    1. Recipe: Healthy Salad
    2. Recipe: Quinoa Bowl
    3. Recipe: Veggie Stir-Fry
    Which one should I choose? Return just the name.

It is crucial to note that the lack of information about the input recipe is intentional and derives from the experiment’s ultimate objective. We aim to assess the LLM’s ability to accurately identify the recipe with higher values of sustainability and healthiness without relying on specific recipe details.

Of course, one of the goals of the experiment will be to assess the effectiveness of LLMs in the task of automatically identifying healthy and sustainable recipes.
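Building the selection prompt can be sketched as follows. The function name `build_selection_prompt` and the `seed` parameter are hypothetical, and the call to the LLM itself is deliberately left out; the sketch only shows how the candidate names are shuffled and inserted into the prompt template.

```python
import random


def build_selection_prompt(recipe_names, seed=None):
    """Build the zero-shot selection prompt from a list of candidate names."""
    names = list(recipe_names)
    # Shuffle so that the ranker's order does not leak into the prompt.
    random.Random(seed).shuffle(names)
    lines = [
        "Using your knowledge, please rank (if necessary) the following "
        "recipes from most to least recommended based on a balance of "
        "sustainability and healthiness:"
    ]
    lines += [f"{i}. Recipe: {name}" for i, name in enumerate(names, start=1)]
    lines.append("Which one should I choose? Return just the name.")
    return "\n".join(lines)


prompt = build_selection_prompt(
    ["Healthy Salad", "Quinoa Bowl", "Veggie Stir-Fry"], seed=0
)
print(prompt)
```

Only the recipe names are included, so neither the scores computed by the Ranking module nor the input recipe are visible to the model.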

4. Experimental Evaluation

This section explores the effectiveness of the proposed metrics and framework through experiments addressing the following Research Questions (RQs):
• RQ1 - Scoring Effectiveness: Can SuS and HeASe scores actually rank recipes based on sustainability and healthiness?
• RQ2 - Retrieval Effectiveness: Is the framework able to successfully identify suitable food alternatives?
• RQ3 - LLM-based Selection Effectiveness: Can LLMs be leveraged to automatically select sustainable alternatives?

4.1. Experimental Setting

Dataset and Evaluation Protocol. All the experiments rely on the dataset previously described in Section 2.3, which is also available online in our repository⁴. Based on this dataset, we evaluated the performance of the framework by providing an input recipe and by checking whether the alternative identified by the framework is healthier and/or more sustainable. To guarantee the soundness of the protocol, we evaluated the performance of the HeASe system across diverse scenarios:
1. Low Sustainability: based on 100 randomly selected recipes labeled as "Low" in sustainability.
2. Medium Sustainability: based on 100 randomly selected recipes labeled as "Medium" in sustainability.
3. High Health: based on 100 randomly selected recipes with a WHO score above average.
4. Unknown Recipes: based on 30 recipes not present in the recipe dataset.

These scenarios allow us to assess the framework's efficacy in different contexts. For instance, for the "Low Sustainability" scenario we expect significant improvements in the output recipe's sustainability and healthiness compared to the input. However, we also evaluate the framework's performance in more challenging settings (i.e., high health, based on recipes that are already healthy, or unknown, in order to also assess the effectiveness of non-exact matching in the retrieval phase).

Implementation Details and Model Parameters. The model uses a pre-trained transformer encoder with a hidden dimensionality of 768. This allows the model to efficiently find similarities between the input text and recipe titles, even when the input doesn't perfectly match the recipe title. As for the Retrieval module, the number of alternative recipes returned based on macro-nutrient similarity is set to 100. The recipe representation is based on its macro-nutrients, which include: Calories [cal], Total Fat [g], Saturated Fat [g], Cholesterol [mg], Sodium [mg], Dietary Fiber [g], Sugars [g], and Protein [g]. As regards the scoring function in the Ranking module, the best configuration for the model was achieved by setting the alpha and beta values in Equation 4 to 0.7 and 0.3, respectively.

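Evaluation Metric. We evaluate the performance of the HeASe system by calculating the mean percentage increment of each metric for each scenario. Given an input recipe ($R$) and a list of possible alternatives ($a_0, \dots, a_{N-1}$) returned by the system, we compute the following:

$$WHO\_incr = \frac{1}{N} \sum_{k=0}^{N-1} \frac{WHO(a_k) - WHO(R)}{WHO(R)} \qquad (5)$$

$$SuS\_incr = \frac{1}{N} \sum_{k=0}^{N-1} \frac{SuS(a_k) - SuS(R)}{SuS(R)} \qquad (6)$$

$$HeASe\_incr = \frac{1}{N} \sum_{k=0}^{N-1} \frac{HS(a_k) - HS(R)}{HS(R)} \qquad (7)$$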

Intuitively, these metrics calculate the increase (if any) in terms of healthiness and sustainability of the recipe retrieved by the framework compared to the input one.
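The three increments share the same structure, so a single helper is enough to illustrate them. The function name is hypothetical, the scores in the example are toy values, and the result is returned as a fraction (multiply by 100 for a percentage).

```python
def mean_relative_increment(input_score, alternative_scores):
    """Mean relative change of the alternatives' scores w.r.t. the input score."""
    if not alternative_scores or input_score == 0:
        raise ValueError("need at least one alternative and a non-zero input score")
    n = len(alternative_scores)
    return sum((s - input_score) / input_score for s in alternative_scores) / n


# Toy example: input recipe with SuS 0.5 and two alternatives.
sus_incr = mean_relative_increment(0.5, [0.75, 1.0])
print(f"SuS_incr = {sus_incr:.2f}")
```

The same helper applies to WHO and HeASe scores by passing the corresponding values; a negative result indicates that, on average, the alternatives score lower than the input.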

Sensitivity Analysis. Finally, to investigate how the performance of the system changes as different parameters vary, we also carried out a sensitivity analysis based on the following key factors:
• Tags matching: This option controls how strictly the recipe tags need to match between the input recipe and the retrieved items. By setting it to true, the framework only outputs recipes that share all the same tags with the input recipe.
• Retrieved items: This parameter determines the number of alternative recipes retrieved as recommendations.

4.2. Discussion of the Results

RQ1 - Scoring Function Effectiveness: To answer RQ1, we present the top-5 and worst-5 recipes based on SuS and HeASe scores.

• Top-5 Recipes (Tables 1 and 3): as shown in the tables, this includes recipes like "Homemade Oatmeal," "Quinoa-Toasted," and "Seasoned Rice", which excel in both sustainability and healthiness, achieving high SuS and HeASe scores. These options likely prioritize plant-based ingredients and simple preparation methods, reducing environmental impact and promoting nutritional value. Generally speaking, we can state that the list of the more sustainable and healthy recipes confirms the effectiveness of the scoring function we designed.
• Worst-5 Recipes (Tables 2 and 4): Conversely, recipes like "Rich Lamb Curry," "Five Meat Chili," and "Middle Eastern Stew" score poorly in both categories. These dishes likely contain significant amounts of meat, which can contribute to a higher environmental footprint and potentially lower overall health benefits. Also in this case, we can state that the poorly sustainable recipes are correctly identified through our scoring function.

The disparity between metrics: Interestingly, the top and bottom scorers for SuS do not entirely overlap with those for HeASe. "Boiled Radishes" and "Granita" for example, rank highly in SuS but not in HeASe. This suggests that some sustainable practices might not always translate directly to health benefits, and vice versa, highlighting the need for a balanced metric like HeASe.

To sum up, we can answer RQ1 by stating that the qualitative analysis we provided generally confirmed the effectiveness of the scoring function we introduced in this paper.

RQ2 - Retrieval Effectiveness: To answer RQ2, we conducted several tests to evaluate the effectiveness of the framework, that is to say, to assess whether the alternative recipes retrieved through our pipeline are healthier and more sustainable w.r.t. the input recipe. In particular, for each of the 100 recipes in each scenario (see Section 4.1) we retrieved the 100 most similar recipes based on macro-nutrients, we ranked them based on our HeASe score, and we calculated the average increase in terms of healthiness and sustainability for all the recipes. The results are reported in Table 5.

As shown in Table 5, the results confirmed the effectiveness of the approach, since the proposed alternative recipes are healthier and more sustainable, on average, in all the experimental scenarios we considered. The results are consistent across all the different scenarios, even if the gaps reflect the complexity of the task. Indeed, when poorly sustainable recipes are used as input to the framework, a large average increase emerges across all the alternatives. Although this was expected, the size of the increase is noteworthy. It is also important to note that an average increase in terms of sustainability is obtained when recipes that are already healthy are used as input.

Next, the results of the sensitivity analysis are shown in Figures 3 and 4. Due to space constraints, we only reported the plots for two scenarios, i.e., the "Low Sustainability" scenario and the "High Health" scenario. The other scenarios follow a similar trend. The plots clearly show that the framework achieves better performance as the number of alternative recipes increases, which confirms our choice of retrieving and ranking 100 similar recipes. In particular, as shown in Figure 4a, this is a necessary choice for the "High Health" scenario, since by considering the top-1 and top-10 recipes retrieved we have an average decrease in sustainability. Conversely, by increasing the number of recipes, the overall healthiness and sustainability are higher. While this suggests that alternative strategies for retrieval and ranking need to be investigated in the future, proper tuning of the parameters still guarantees good performance.

Finally, Figures 3b and 4b show the results obtained by varying the tag matching strategy. The results reveal slight differences, with configurations that don't require matching all tags generally producing better results. This means that when the retrieved recipes need to match all the tags of the input recipe, less relevant recipes may be returned. To sum up, all the results of the sensitivity analysis showed that the platform generally performs well, but a proper choice of parameters may lead to more effective results.

Finally, a table of examples reports the alternative recipes returned by the platform based on different input recipes. As shown in the table, in all the reported settings the alternative recipe is healthier, more sustainable, and sufficiently similar to the input one. This confirmed the effectiveness of the design choices. More tests can be carried out by running our online demo⁵.

⁵ https://github.com/GiovTemp/SustainaMeal_Case_Study

RQ3 - LLM-based Selection Effectiveness: Finally, to answer RQ3, we evaluated the ability of GPT-3.5 Turbo to automatically pick the more sustainable alternative in a pool of candidate recipes retrieved by the system. The process follows the steps described in the Selection module of the framework. Due to limitations in prompt length, we experimented with a smaller set of alternatives (i.e., 10 candidate recipes). The analysis with a longer prompt is left as future work. In Table 7, we compare the healthiness and sustainability of the recipe with the highest score calculated by the Ranker to the recipe identified by GPT among the top-10 returned by the Ranker. As shown in the table, the LLM showed a notable ability to exploit its own knowledge about responsible food consumption to automatically select the best recipe in a pool of 10 candidates. Indeed, when compared with the top-1 recipes previously picked, the average sustainability and healthiness of the selected recipes is generally higher. These findings suggest that LLMs can effectively leverage the strengths of both retrieval and generation techniques to identify recipes that are both sustainable and healthy. This is an important finding of this work, showing the effectiveness of LLMs in a novel and scarcely investigated research direction.

5. Discussion and Future Works

The framework described in this paper aligns with SDG3 and SDG12. In particular, we foresee the following impact:
- SDG3 - Good Health and Well-being: Promoting Healthier Diets. The framework focuses on encouraging individuals to adopt healthier eating habits. By leveraging our system, users can explore and choose recipes that contribute to a balanced and nutritious diet. This directly contributes to the goal of ensuring good health and well-being by promoting better nutrition and reducing the risk of diet-related diseases.
- SDG12 - Responsible Consumption and Production: Ingredient Substitution. The framework contributes to responsible consumption by helping users identify more sustainable substitute ingredients in recipes. This aligns with SDG12's focus on ensuring sustainable consumption by promoting eco-friendly and ethically sourced ingredients.

In summary, the HeASe framework contributes to SDG3 by promoting healthier diets and better well-being and to SDG12 by encouraging responsible consumption and production practices. By combining technology-driven solutions with user engagement and education, the project seeks to address the interconnected challenges of health and sustainability in the context of food choices. In future work, we will evaluate different strategies for the selection of alternative recipes, and we will evaluate their effectiveness with real users.

Acknowledgements

We acknowledge the support of the PNRR project FAIR - Future AI Research (PE00000013), Spoke 6 Symbiotic AI under the NRRP MUR program funded by the NextGenerationEU and project PHaSE (CUP H53D23003530006) - Promoting Healthy and Sustainable Eating through Interactive and Explainable AI Methods, funded by MUR under the PRIN program. Additionally, we acknowledge the CINECA award under the ISCRA initiative (class C project: IscrC_LLM_REC), for the availability of high-performance computing resources and support.

The Semantic Web – ISWC 2019, Springer International Publishing, Cham, 2019, pp. 146–162.

[19] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, et al., Retrieval-augmented generation for knowledge-intensive NLP tasks, Advances in Neural Information Processing Systems 33 (2020) 9459–9474.

[20] P. Wang, L. Li, L. Chen, Z. Cai, D. Zhu, B. Lin, Y. Cao, Q. Liu, T. Liu, Z. Sui, Large language models are not fair evaluators, 2023. arXiv:2305.17926.

[1]

Hartmann ,

Lazzarini ,

Funk ,

Siegrist , Measuring consumers' knowledge of the environmental impact of foods , Appetite 167 ( 2021 ) 105622 .

[2]

Trattner ,

Elsweiler , Food recommender systems: important contributions, challenges and future research directions , arXiv preprint arXiv:1711.02760 ( 2017 ).

[3]

Gallo ,

Landro ,

R. La

Grassa ,

Turconi , Food recommendations for reducing water footprint , Sustainability 14 ( 2022 ). URL: https://www.mdpi.com/2071-1050/14/7/3833. doi: 10 .3390/ su14073833.

[4]

Ge ,

Ricci ,

Massimo , Health-aware food recommender system , in: Proceedings of the 9th ACM Conference on Recommender Systems , RecSys '15, Association for Computing Machinery, New York, NY, USA, 2015 , p. 333 - 334 . URL: https://doi.org/10.1145/2792838.2796554. doi: 10 .1145/2792838.2796554.

[5]

C.-Y.

Teng ,

Y.-R.

Lin ,

L. A.

Adamic , Recipe recommendation using ingredient networks , in: Proceedings of the 4th annual ACM web science conference , 2012 , pp. 298 - 307 .

[6]

Elsweiler ,

Trattner ,

Harvey , Exploiting food choice biases for healthier recipe recommendation , in: Proceedings of the 40th international acm sigir conference on research and development in information retrieval , 2017 , pp. 575 - 584 .

[7]

Elsweiler ,

Harvey ,

Ludwig ,

Said , Bringing the "healthy" into food recommenders , in: International Workshop on Decision Making and Recommender Systems , 2015 . URL: https: //api.semanticscholar.org/CorpusID:1838398.

[8]

Y.-K.

Ng ,

Jin , Personalized recipe recommendations for toddlers based on nutrient intake and food preferences , in: Proceedings of the 9th international conference on management of digital ecosystems , 2017 , pp. 243 - 250 .

[9]

Trattner ,

Elsweiler , Investigating the healthiness of internet-sourced recipes: implications for meal planning and recommender systems , in: Proceedings of the 26th international conference on world wide web , 2017 , pp. 489 - 498 .

[10]

Pandey ,

Agrawal ,

J. S.

Pandey , Carbon footprint: current methods of estimation, Environmental monitoring and assessment 178 ( 2011 ) 135 - 160 .

[11]

Brown ,

Mann ,

Ryder ,

Subbiah ,

J. D.

Kaplan ,

Dhariwal ,

Neelakantan ,

Shyam ,

Sastry ,

Askell , et al., Language models are few-shot learners , Advances in neural information processing systems 33 ( 2020 ) 1877 - 1901 .

[12]

Bölz ,

Nurbakova ,

Calabretto ,

Gerl ,

Brunie ,

Kosch , Hummus: A linked, healthinessaware, user-centered and argument-enabling recipe data set for recommendation , in: Proceedings of the 17th ACM Conference on Recommender Systems , RecSys '23, Association for Computing Machinery, New York, NY, USA, 2023 , p. 1 - 11 . URL: https://doi.org/10.1145/3604915.3609491. doi: 10 .1145/3604915.3609491.

[13]

Starke ,

Trattner ,

Bakken ,

Johannessen ,

Solberg , The cholesterol factor: Balancing accuracy and health in recipe recommendation through a nutrient-specific metric , in: Proceedings of the 1st Workshop on Multi-Objective Recommender Systems (MORS 2021 ), 2021 .

[14]

R. Yera

Toledo ,

A. A.

Alzahrani ,

Martínez , A food recommender system considering nutritional information and user preferences , IEEE Access 7 ( 2019 ) 96695 - 96711 . doi: 10 .1109/ACCESS. 2019 . 2929413 .

[15]

W. H.

Organization , Healthy diet, https://www.who.int/news-room/fact-sheets/detail/healthy-diet, 2020 .

[16]

Sacks ,

Rayner ,

Swinburn , Impact of front-of-pack 'trafic-light'nutrition labelling on consumer food purchases in the uk , Health promotion international 24 ( 2009 ) 344 - 352 .

[17]

Petersson ,

Secondi ,

Magnani ,

Antonelli ,

Dembska ,

Valentini ,

Varotto ,

Castaldi , A multilevel carbon and water footprint dataset of food commodities , Scientific data 8 ( 2021 ) 127 .

[18]

Haussmann ,

Seneviratne ,

Chen , Y. Ne'eman , J. Codella,

C.-H.

Chen ,

D. L.

McGuinness ,

M. J.

Zaki , Foodkg: A semantics-driven knowledge graph for food recommendation , in: C. Ghidini , O.

Hartig , M.

Maleshkova , V.

Svátek , I. Cruz ,

Hogan ,

Song ,

Lefrançois ,

Gandon (Eds.),