<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>What's On My Plate: Towards Recommending Recipe Variations for Diabetes Patients</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Markus Rokicki</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eelco Herder</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elena Demidova</string-name>
          <email>demidovag@L3S.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>L3S Research Center</institution>
          ,
          <addr-line>Hannover</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>As community-based recipe platforms continue to grow in popularity, recipe recommendation is an active research area. Simultaneously, the analysis of online recipes can provide us with insights on dietary patterns in particular communities. In this paper, we focus on recipe recommendation for a user group that is constrained in terms of choices: diabetes patients need to balance their diet more than the average person and to be aware of the nutritional value of their meals. First, we discuss the type of situations where diabetes-specific food recommendations are desirable. Further, we analyze how people's age and gender interact with food intake. Based on a large dataset, we explore how variations in 'canonical meals' can be exploited for recommending which alternatives better fit the user's dietary requirements.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Diabetes is a widespread chronic disease that affects about 10% of the Western
population. Patients suffering from diabetes have to make many vital decisions
on a daily basis, including: am I allowed to eat this, what is my blood sugar
level, how much insulin should I take right now? Particularly those who have
just been diagnosed with diabetes experience difficulties facing such decisions.</p>
      <p>The GlycoRec project<sup>1</sup> aims to develop a system that provides diabetes
patients with personalized support and advice for improving their everyday lives.
GlycoRec will provide patients with information and advice regarding their
nutrition, physical activities, and the use of medicine. This empowers patients to
better communicate their needs to their doctors and advisors, and to better
implement advice and stated goals in their everyday lives.</p>
      <p>In this paper, we present our first steps towards personalized nutritional
advice. Nutritional information is available online<sup>2</sup>, but these databases
only provide information on single ingredients and/or standardized products
such as ready-made meals. Our goal is to provide diabetes patients with
recommendations and feedback in everyday situations, including: (i) How many
carbohydrates can I expect the Thai curry on the menu to contain? and (ii)
Which recipe variation best matches both my dietary restrictions and my taste?</p>
      <p>1 https://www.pfh.de/hochschule/forschung/forschungsprojekt-glycorec.html</p>
      <p>2 The quality and acceptance of databases with nutritional information vary wildly.
In Germany, an established resource is http://www.mri.bund.de/de/service/datenbanken/bundeslebensmittelschluessel.html.</p>
      <p>Based on an extensive dataset from the German recipe website Kochbar.de<sup>3</sup>,
we show that there are differences in eating patterns with respect to gender, age
group, and dietary restrictions, such as diabetes. We cluster popular meal names
into 'canonical meals' and analyze to what extent these meals vary in terms of
ingredients and nutritional value. These insights provide directions for strategies
to find the best recipe or for adapting recipes to user needs and preferences.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Diabetes mellitus is a widespread disease that requires constant attention from
patients and their caregivers. Research has shown that effective prevention
of diabetes-related complications involves lifestyle changes, including
increased physical activity and a diet that is associated with lower blood
pressure [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Telemedicine and the improved acceptance of smartphones and
tablet computers among patients and physicians contribute to improved patient
guidance and self-empowerment [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. A particular focus of guidance is nutrition.
      </p>
      <p>
        Online recipe recommenders can play an important role in generating healthy
meal plans. Even though the ingredients used are the major reason for liking or
disliking a meal, there are health-conscious users who also take nutritional
information into account [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In a feasibility study on recipe recommendation, Freyne
and Berkovsky found that both content-based (e.g. ingredients) and
collaborative approaches (taste, context) should be taken into account [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        An in-depth analysis of how users choose and adapt recipes is given by Teng
et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Making use of complement and substitution networks, they show which
ingredients users add, remove, pair or substitute. This allows them to predict
which variation of a recipe will receive the best ratings.
      </p>
      <p>
        Kusmierczyk et al. analyzed data from the German community platform
Kochbar and found clear seasonal and weekly trends in online food recipe
production, both in terms of nutritional value (fat, proteins, carbohydrates and
calories) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and in terms of ingredient combinations and experimentation [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. West
et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] analyzed similar patterns for the American population, with slightly
different results. They were also able to automatically detect anomalous days,
such as bank holidays and other celebrations, and users who aimed to change their diet.
      </p>
      <p>
        Making use of these and other insights, the team from IBM Watson created
the prototype Chef Watson, which automatically creates recipes that match user
preferences, based on existing recipes from the Bon Appetit recipe website [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>Dataset and Preprocessing</title>
      <p>
        We use a crawl from Kochbar.de, a German online food community website to
which users can upload and rate cooking recipes, provided by Kusmierczyk et
al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The dataset encompasses more than 400 thousand recipes published
between 2008 and 2014. In addition to information on ingredients and preparation,
more than 330 thousand recipes also contain nutrition facts. Almost 200
thousand users provided more than 2.7 million comments and 7.7 million ratings.
The ratings are on a Likert scale but, surprisingly, they are overwhelmingly
positive (99.1% gave a rating of 5).
      </p>
      <p>3 https://www.kochbar.de</p>
      <p>We consider only the 309 thousand recipes that contain valid information
on energy (in kJ and kcal), carbohydrates, proteins, and fat. For each recipe,
a mostly structured list of ingredients (with quantities) is given. We extracted
the ingredients and performed simple normalization by converting the text to
lowercase, normalizing whitespace, removing text in parentheses, and splitting
on conjunctions such as "and" and "or". This process yields more than 300
thousand ingredients, with an average of 10 ingredients per recipe. As only a
moderate number of ingredients (2,258) occur frequently (in at least 100 recipes),
this simple preprocessing is sufficient for a first analysis of the dataset.</p>
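      <p>The normalization steps described above can be sketched as follows. This is a minimal illustration, not the code used for the study: the function names, the German conjunctions "und" and "oder" (added because the recipes are in German), and the rule of counting an ingredient once per recipe are assumptions.</p>
      <preformat preformat-type="code">
```python
import re
from collections import Counter

# Assumed conjunction list: the text names "and" and "or"; the German
# equivalents are added here because the recipes are written in German.
CONJUNCTIONS = ("and", "or", "und", "oder")


def normalize_ingredient(raw, conjunctions=CONJUNCTIONS):
    """Return the normalized ingredient names found in one raw ingredient string."""
    text = raw.lower()
    # Remove parenthesized remarks such as "(frisch gemahlen)".
    text = re.sub(r"\([^)]*\)", " ", text)
    # Collapse runs of whitespace into single spaces.
    text = re.sub(r"\s+", " ", text).strip()
    # Split on conjunctions that stand as separate words.
    pattern = r"\s+(?:" + "|".join(map(re.escape, conjunctions)) + r")\s+"
    return [part.strip() for part in re.split(pattern, text) if part.strip()]


def frequent_ingredients(recipes, min_recipes=100):
    """Return the ingredients that occur in at least min_recipes recipes.

    recipes is an iterable of ingredient-string lists, one list per recipe.
    """
    counts = Counter()
    for ingredient_lines in recipes:
        names = set()
        for line in ingredient_lines:
            names.update(normalize_ingredient(line))
        counts.update(names)  # each ingredient counts once per recipe
    return {name for name, n in counts.items() if n >= min_recipes}
```
      </preformat>
      <p>For example, under these assumptions normalize_ingredient("Salz  und Pfeffer (frisch gemahlen)") yields the two ingredients "salz" and "pfeffer", while "Oregano" is left intact because conjunctions are only matched as separate words.</p>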
    </sec>
    <sec id="sec-4">
      <title>Differences in Food Intake</title>
      <p>As a first step, we aim to identify differences in the recipes created by different
user groups. Among the 200 thousand users, 95 thousand provided information
regarding gender (25 thousand male, 70 thousand female users) and 57 thousand
provided information regarding their age (mean 42.2, median 42). In addition,
we are interested in diabetes patients. However, apparently most patients did not
disclose the fact that they suffer from diabetes: from the profile information we
are able to identify only 65 users who suffer from diabetes or have close relatives
with diabetes, a number too small for an effective analysis. Similarly, only about
3,000 of the recipes have been labeled as 'diabetes friendly'. By contrast, about
220,000 recipes are marked as gluten free, 137,000 as lactose free and (only)
37,000 recipes as vegetarian.</p>
      <p>[Figure: average nutrients (g/100g) of carbohydrates, fat, and proteins by user age group: [10, 20), [20, 30), [30, 40), [40, 50), [50, 60), [60, 70), [70, 80].]</p>
    </sec>
    <sec id="sec-5">
      <title>Canonical Meals and their Variations</title>
      <p>
        As discussed by [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], most users do not search for recipes based on their nutritional
values, but rather look for recipes with certain ingredients or for a particular
dish. As is to be expected, many variations of popular recipes can be found
at Kochbar.de. In order to find differences in nutritional value within a
particular type of dish, we selected the 200 most frequently used recipe titles as
'canonical meals', to which we assigned all recipes whose title contained
the title of the canonical meal. The 'top meals' are shown in Table 1. From the
selection one can clearly see that the user base of Kochbar.de is German.
      </p>
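      <p>The selection and assignment of canonical meals can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used for the study: the function names are hypothetical, and titles are compared case-insensitively after trimming whitespace, which the text does not specify.</p>
      <preformat preformat-type="code">
```python
from collections import Counter


def canonical_meals(titles, top_n=200):
    """Pick the top_n most frequent recipe titles as canonical meals.

    Titles are lowercased and trimmed first (an assumption).
    """
    counts = Counter(title.strip().lower() for title in titles)
    return [title for title, _ in counts.most_common(top_n)]


def assign_recipes(titles, meals):
    """Map each canonical meal to the indices of all recipes whose title contains it."""
    assignment = {meal: [] for meal in meals}
    for idx, title in enumerate(titles):
        lowered = title.lower()
        for meal in meals:
            if meal in lowered:
                assignment[meal].append(idx)
    return assignment
```
      </preformat>
      <p>With substring matching, a recipe can be assigned to more than one canonical meal if its title contains several canonical titles; whether such overlaps should be resolved is left open in this sketch.</p>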
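      <sec id="sec-5-2">
        <table-wrap>
          <label>Table 1</label>
          <table>
            <thead>
              <tr><th>Canonical Meal</th><th>Recipes</th><th>Ratings</th><th>Comments</th></tr>
            </thead>
            <tbody>
              <tr><td>Kartoffelsalat (potato salad)</td><td>1,863</td><td>41,226</td><td>14,117</td></tr>
              <tr><td>Pizza</td><td>1,812</td><td>38,250</td><td>13,210</td></tr>
              <tr><td>Käsekuchen (cheese cake)</td><td>1,681</td><td>36,807</td><td>12,085</td></tr>
              <tr><td>Apfelkuchen (apple pie)</td><td>1,450</td><td>34,935</td><td>11,982</td></tr>
              <tr><td>Gulasch (goulash)</td><td>1,187</td><td>28,668</td><td>10,162</td></tr>
              <tr><td>Nudelsalat (pasta salad)</td><td>1,706</td><td>30,196</td><td>9,782</td></tr>
              <tr><td>Eierlikör (egg liqueur)</td><td>1,085</td><td>27,482</td><td>9,360</td></tr>
              <tr><td>Pfannkuchen (pancake)</td><td>1,221</td><td>24,492</td><td>8,550</td></tr>
              <tr><td>Lasagne (lasagna)</td><td>1,187</td><td>22,570</td><td>8,034</td></tr>
              <tr><td>Tiramisu</td><td>1,358</td><td>22,931</td><td>7,773</td></tr>
            </tbody>
          </table>
        </table-wrap>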
        <p>To find out to what extent canonical meals vary in nutritional value, we
selected three different, representative meals and calculated the means and
standard deviations (see Table 2). We also analyzed which ingredients are associated
with recipes that are high and low in carbohydrates, fat, proteins, and energy
by calculating the average levels for all meals and sorting them accordingly.</p>
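        <p>The two analyses described above can be sketched as follows. This is a minimal illustration, not the code used for the study: the function names and the record layout (a dict with an "ingredients" list and one numeric entry per nutrient) are hypothetical, and the sample standard deviation is an assumption, since the text does not say which estimator was used.</p>
        <preformat preformat-type="code">
```python
from collections import defaultdict
from statistics import mean, stdev


def nutrient_summary(recipes, nutrient):
    """Return (mean, standard deviation) of one nutrient over the recipes of a meal."""
    values = [recipe[nutrient] for recipe in recipes]
    # Sample standard deviation (assumption); undefined for a single recipe.
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread


def rank_ingredients(recipes, nutrient, min_recipes=1):
    """Sort ingredients by the average nutrient level of the recipes that use them."""
    levels = defaultdict(list)
    for recipe in recipes:
        for ingredient in set(recipe["ingredients"]):
            levels[ingredient].append(recipe[nutrient])
    averages = {
        ingredient: mean(values)
        for ingredient, values in levels.items()
        if len(values) >= min_recipes
    }
    # Highest average first: the head of the list holds ingredients associated
    # with high levels, the tail those associated with low levels.
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)
```
        </preformat>
        <p>The min_recipes threshold is an added parameter: raising it filters out rare ingredients whose averages rest on only a few recipes.</p>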
        <p>Potato salad is low in protein, but the standard deviations are relatively high.
Ingredients associated with high protein are meat and fish; low-protein recipes
contain vegetables instead, such as pickles, radish, olives and asparagus. The
same pattern can be found for lasagna. Low-fat cheese cake is associated with
low-fat milk products and high-fat cheese cake with chocolate and cream cheese.</p>
      </sec>
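      <sec id="sec-5-8">
        <table-wrap>
          <label>Table 2</label>
          <table>
            <thead>
              <tr><th rowspan="2">Meal</th><th colspan="2">Carbohydrates</th><th colspan="2">Fat</th><th colspan="2">Proteins</th><th colspan="2">Kcal</th></tr>
              <tr><th>Mean</th><th>Std dev</th><th>Mean</th><th>Std dev</th><th>Mean</th><th>Std dev</th><th>Mean</th><th>Std dev</th></tr>
            </thead>
            <tbody>
              <tr><td>Potato salad</td><td>10.26</td><td>8.025</td><td>13.09</td><td>17.64</td><td>3.27</td><td>3.60</td><td>171.25</td><td>151.17</td></tr>
              <tr><td>Cheese cake</td><td>29.93</td><td>15.68</td><td>11.97</td><td>9.29</td><td>7.32</td><td>2.97</td><td>256.29</td><td>95.96</td></tr>
              <tr><td>Lasagna</td><td>8.44</td><td>10.81</td><td>13.23</td><td>11.00</td><td>8.06</td><td>6.18</td><td>185.43</td><td>137.25</td></tr>
            </tbody>
          </table>
        </table-wrap>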
        <p>These findings confirm our expectation that it is possible to estimate the
nutritional value of certain dishes from the recipes associated with these dishes
and that one can identify ingredients associated with high or low levels of food
value. This provides a good base for developing food recommender systems that
suggest 'better recipes' and alternative ingredients for particular recipes.</p>
        <p>Existing food databases for diabetes patients are incomplete and inconsistent.
Moreover, they only contain ingredients and standardized products. Diabetes
patients typically learn over time what they can eat and what not, but this is
often not sufficient for many common situations. In restaurants, served meals are
often 'black boxes' in terms of nutritional value, which causes great uncertainty
among diabetes patients, especially when trying out new meals while on vacation
in foreign countries.</p>
        <p>To provide patients with advice and information on the nutritional value of
a meal, and to recommend alternative meals or ingredients to them, we aim to
exploit recipe sites such as Kochbar.de, which contain user-provided recipes. This
paper provides some first insights and confirms the feasibility of the approach.</p>
        <p>A particular challenge for the food recommender will be to provide detailed
feedback on the precision of its estimations and the resulting recommendations.
Particularly for 'canonical meals' with many variations, estimations may need
to be refined with additional user input and feedback.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgment</title>
      <p>The GlycoRec project is funded by the Federal Ministry of Education and
Research (BMBF) under the funding scheme Adaptive, Learning Systems
(Adaptive, lernende Systeme).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Firth</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <article-title>Cooking by numbers</article-title>
          .
          <source>New Scientist</source>
          <volume>225</volume>
          ,
          <issue>3003</issue>
          (
          <year>2015</year>
          ),
          <fpage>19</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Freyne</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Berkovsky</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <article-title>Recommending food: Reasoning on recipes and ingredients</article-title>
          . In
          <source>User Modeling, Adaptation, and Personalization</source>
          .
          <year>2010</year>
          , pp.
          <fpage>381</fpage>
          -
          <lpage>386</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Harvey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ludwig</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Elsweiler</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <article-title>Learning user tastes: a first step to generating healthy meal plans?</article-title>
          <source>In First International Workshop on Recommendation Technologies for Lifestyle Change (LIFESTYLE 2012)</source>
          (
          <year>2012</year>
          ), p.
          <fpage>18</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Kusmierczyk</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trattner</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Nørvåg</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <article-title>Temporal patterns in online food innovation</article-title>
          .
          <source>In 5th Temporal Web Analytics Workshop (TempWeb) at WWW 2015</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Kusmierczyk</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Trattner</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Nørvåg</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <article-title>Temporality in online food recipe consumption and production</article-title>
          .
          <source>In Proc. of WWW</source>
          (
          <year>2015</year>
          ), vol.
          <volume>15</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Spinas</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <article-title>Screening, diagnostik und management von diabetes mellitus und diabetischen folgeerkrankungen</article-title>
          .
          <source>Therapeutische Umschau</source>
          <volume>57</volume>
          ,
          <issue>1</issue>
          (
          <year>2000</year>
          ),
          <fpage>12</fpage>
          -
          <lpage>21</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Schildt</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Mertens</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <article-title>Chronic care management of diabetes mellitus - telemedicine as option in a changing supply situation with general practitioners (gp)</article-title>
          .
          <source>Diabetes aktuell für die Hausarztpraxis</source>
          <volume>10</volume>
          ,
          <issue>06</issue>
          (
          <year>2012</year>
          ),
          <fpage>262</fpage>
          -
          <lpage>268</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Teng</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Adamic</surname>
            ,
            <given-names>L. A.</given-names>
          </string-name>
          <article-title>Recipe recommendation using ingredient networks</article-title>
          .
          <source>CoRR abs/1111.3919</source>
          (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>West</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>White</surname>
            ,
            <given-names>R. W.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Horvitz</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          <article-title>From cookies to cooks: Insights on dietary patterns via analysis of web usage logs</article-title>
          .
          <source>In Proc. 22nd Conf. World Wide Web</source>
          (
          <year>2013</year>
          ), pp.
          <fpage>1399</fpage>
          -
          <lpage>1410</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>