Towards IoT Workflows for Kitchens Enabled with Ambient Intelligence: A Position Paper

Filippos Ventirozos¹,², Riza Batista-Navarro²,∗, Bijan Parsia² and Sarah Clinch²

¹ Department of Computing and Mathematics, Manchester Metropolitan University, Oxford Road, Manchester M15 6BH, UK
² Department of Computer Science, University of Manchester, Oxford Road, Manchester M13 9PL, UK

Abstract
The Internet of Things and ambient intelligence are together enabling smart environments that can anticipate and meet user needs. However, Mark Weiser’s fully-fledged vision for ambient intelligence is still a long-held dream. This article explores user requirements for creating an ambient intelligent kitchen. Cooking can involve complex interactions between IoT devices and users, making it challenging to design effective IoT workflows. We argue that a successful IoT workflow for cooking should be context-aware, offer varying levels of automation, and provide detailed guidance, among other things. This paper then explores natural language processing technologies, along with an end-user development framework, to fulfil these requirements and make the vision attainable, thereby empowering users to control their smart kitchens.

Keywords
Internet of Things, Ambient Intelligence, Smart Kitchens, End-User Development, Meta-design, Natural Language Processing, Workflows

1. Introduction

The embedding of networked sensors and actuators into our surroundings (i.e., the Internet of Things, or IoT), together with associated processing, is enabling novel applications at work, in cities and at home. Collected data can be used to understand the current state and to trigger actions intended to meet context-specific needs, creating “smart” environments, interchangeably referred to as environments with ambient intelligence (AmI) in this paper. AmI is a variant of artificial intelligence, in which a “smart environment” anticipates and transparently meets its users’ needs [1].
Although such technology is useful in a wide range of domains, one common vision for it is the realisation of smart homes. The kitchen has a critical role in household activity, particularly in enabling meal preparation. Although individuals now cook both less frequently and for shorter durations, eating at home remains a significant domestic activity [2, 3]. Home cooking comes with a number of associated benefits including reduced household expense, increased nutrition, reduced carbon footprint and opportunities to connect with family members. It is estimated that almost 50% of the time spent on cooking will be automated within the next 10 years [4].

However, a smart kitchen environment can be highly diverse in contrast with other household environments. Consider, for example, a simple lighting controller. Such a device determines the need for light (checking for motion or occupancy, general light levels, and perhaps relevant activities such as watching TV) and then acts to meet the need. This works well when the user behaviour meshes well with the set of actions and triggers. For pre-configured sets of IoT devices, this entails that the behaviours are sufficiently similar so as to fit the set of triggers appropriately. However, kitchen use is highly diverse [5] and cooking itself often requires complex chains of actions (measuring in different ways, mixing at different speeds at different times, heating in stages), making it challenging to ensure that technology successfully anticipates and meets the needs of its users.

AAPEI ’24: The First International Workshop on Adjustable Autonomy and Physical Embodied Intelligence, October 20, 2024, Santiago de Compostela, Spain
∗ Corresponding author.
f.ventirozos@mmu.ac.uk (F. Ventirozos); riza.batista@manchester.ac.uk (R. Batista-Navarro); bijan.parsia@manchester.ac.uk (B. Parsia); sarah.clinch@manchester.ac.uk (S. Clinch)
© 2024 Copyright for this paper by its authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073.

We hypothesise that one of the barriers to ambient kitchen adoption is the limited ability of end-users to understand, control and exercise agency over their IoT workflow, the backbone of the operations that support the end-user in the kitchen. We propose that much of the “heavy lifting” of constructing IoT workflows can be done with natural language processing (NLP). The majority of modern recipes, regardless of their heterogeneous forms, contain instructions for completing a process in a series of steps that involve food ingredients, utensils and devices. For cooking devices, these recipes will also provide information about the specific type of device, temperature and duration (“Heat oven to 220 °C […] Bake for 25-30 mins”).

In this position paper, we explore the “ideal” IoT workflow from the perspective of the end-user. We conducted a small set of qualitative interviews to obtain information on how smart devices may support different types of users in their cooking activities. We then propose the application of NLP, as well as computer vision, to retrieve cooking knowledge to form the basis of cooking IoT workflows. Furthermore, we discuss how end-user development (EUD) meta-design principles can augment the IoT workflow.

2. Understanding User Requirements in the Kitchen

In order to gain an understanding of the needs of potential end-users of kitchens enabled with ambient intelligence, we interviewed five participants selected based on convenience sampling [6]. We ensured that the three primary personas in the kitchen identified by Kerr et al. [5], described in Figure 1, were represented within the set of participants: the “Beginner”, the “Experienced Challenger” and the “Family Chef”.

Figure 1: Three primary kitchen personas: the beginner, the experienced challenger, and the family chef (adapted from Kerr et al. [5]).

2.1. Participant Profiles

Participants were recruited via email using mailing lists and were supplied with a participant information sheet. Each participant was asked a series of demographic questions, including questions about participants’ cooking habits. Subsequently, participants were presented with a set of six questions that were formulated based on key themes identified in the study by Kerr et al. [5]. These themes centred specifically on the cooking process, exploring areas where smart devices might offer valuable assistance. In addition, participants were asked two questions in relation to ambient intelligence in the kitchen, which were framed as “what if” scenarios [7].

Our participants were all adults aged 30+ who cooked on at least two days a week. All participants reported engaging in some form of novel cooking practices (i.e., being adventurous in cooking) although participants differed in how they realised this, e.g., by tweaking known recipes, following new recipes online, or trying their own creations. Such practices were most common for our one beginner participant, most likely due to the fact that they had a very limited set of established cooking behaviours.

Participant personas were derived by the lead author based on a combination of closed-question responses and responses to the open questions described above. This approach allowed us to differentiate between someone who cooks often and with experience, and someone who cooks often but with limited confidence and experience. Similarly, concerns about cooking duration were used alongside information about the number of children in the household to identify one participant as a family chef. Of our five participants, three were identified as experienced challengers, one as a beginner and one as a family chef. Table 1 presents a summary of the participants’ characteristics based on their responses to the questions on demographics and cooking habits.
Table 1: A summary of the participants’ responses to our questions on demographics and cooking habits. Their corresponding personas are also indicated.

Participant  Age    Cooking Frequency  Novelty in Cooking     # Adults (# Children)  Persona
1            40-50  7 days/week        monthly                1                      Exp. Challenger
2            40-50  7 days/week        weekly                 1                      Exp. Challenger
3            50-60  2-3 days/week      at least once a month  2 or more              Exp. Challenger
4            30-40  2 days/week        twice a month          5 (3)                  Family Chef
5            30-40  5 days/week        every 2-3 days         1                      Beginner

2.2. Thematic Analysis

Based on the participants’ responses, we identified various themes that are outlined below.

2.2.1. Following Individual Cooking Steps

Precision of Measurements. Several participants raised issues related to the measurement of ingredients. This was true not only for the beginner but also for our three experienced challengers, especially when cooking food using international recipes. For instance, their local supplies and cooking practices prompted them to determine quantities in metric units, but the recipes they tried to follow would use American terms such as “a stick” (for butter), or ambiguous expressions such as “a little bit” or “enough”. All participants suggested a role for their imagined smart kitchen whereby users are provided assistance when it comes to the measurement of ingredients, interpreting ambiguous measurements, conversion between different units, dealing with the lack of measuring equipment, and abstracting over measurements by simply prompting the user to put more (or less) of an ingredient into the dish.

Distractions and Keeping Track. Prior research [5] had suggested that family chefs were partially characterised by distractions that arose during cooking. However, our interviews showed that distractions affected all participants.
What distinguished our family chef and beginner (apart from level of experience) was their higher likelihood of facing external distractions (e.g., from family members, chores for the family chef) over cooking-related ones (e.g., finding utensils, preparing ingredients). An experienced challenger noted that the use of technology to aid their cooking (e.g., using a tablet to display a recipe or to look up information) also led to external distractions (e.g., being notified of new emails). Our experienced challengers also placed much stronger emphasis on distractions that emerged as part of the cooking process. These often emerged as a result of parallel activities that were either inherent in the cooking process itself, or that came as a result of working through a dish without planning ahead (e.g., beginning to cook some ingredients without having prepared those needed in later stages).

All participants agreed that distractions adversely affected their cooking, causing forgetfulness about steps or the need for repeated information checks. Some employed coping strategies, like tasting for missing ingredients, but these were not always effective. Participants suggested that automated prompts tailored to current situations could help them refocus, especially against external distractions. They envisioned various state-tracking tools, ranging from progress-monitoring cameras to an “AI spoon” that could taste and adjust food seasoning or consistency.

Step Presentation. All participants used recipes in some form (cookbooks, notes, online videos, blogs) but with varying preferences. The beginner and family chef favoured online videos. Nonetheless, everyone found them challenging to follow in real time, requiring frequent pausing and rewinding. Some participants preferred text and images but noted layout issues. One experienced challenger suggested that audio dictation combined with images would be valuable.
During the interviews, even without being prompted to discuss technology-based interventions, participants made references to conversational agents as a means for obtaining feedback to handle uncertainty, e.g., to seek advice on particular cooking steps.

General Cooking Knowledge. Participants faced challenges in following recipes due to insufficient guidance. Our beginner participant sought more advice on adjusting flavours, such as spice levels, while the family chef had difficulty replicating popular dishes accurately. Even non-beginners frequently turned to the web for extra information or clarifications. The online information accessed included ingredient substitutions, user reviews and technique demonstrations. Lastly, one experienced challenger emphasised the importance of timely information retrieval.

2.2.2. Dependencies between Cooking Steps

The family chef suggested that at times the optimal ordering of steps may not be correctly represented in a recipe, whilst all the experienced challengers identified challenges in determining when time periods had elapsed (particularly when these were expressed as event occurrences, e.g., “when the onions have browned”). Overall, determining how long it takes an end-user to follow a particular step, and how this might affect the succeeding steps, is vital in following a recipe workflow. Participants acknowledged the need for ambient kitchens to know the dependencies between cooking steps. Finally, a participant highlighted potential safety benefits of an imagined smart kitchen, e.g., for preventing cross-contamination, where an end-user could be told to wash certain utensils used with certain ingredients, before using them in a succeeding step.

2.2.3. Record Keeping and Sharing

All our participants identified some value in keeping records within the kitchen.
This includes taking notes when they decide to change an ingredient during cooking, or recording the outcome of a certain deviation; such records are helpful when adapting dishes to individuals’ tastes, or for noting dishes that emerged from unintended changes to the cooking process. All participants stated that they would find value in having a kitchen that can automatically log their activities both for personal use and for sharing with others.

2.2.4. Personalisation

Our participants also identified a number of other opportunities for personalisation in a smart kitchen, with adaptations to ingredients (to cater to food preferences) and kitchen equipment. In the case of the former, participants envisaged systems that would adjust quantities in accordance with personal preferences (e.g., for seasoning, spices or texture), scale a recipe to household size, or make more significant dish adaptations in line with dietary requirements. With respect to kitchen equipment, participants frequently suggested that cooking times could be adjusted to account for differences in oven temperature, either by altering instructions given to the user or by having the kitchen itself control the cooking process.

2.2.5. Agency

Agency refers to giving users control over the degree of automation, allowing fallback to user control in cases of uncertainty. That is, users can choose which parts of the recipe workflow they want to modify, execute or carry out manually. Our participants had very different expectations about the degree to which the kitchen itself would intervene in cooking tasks. In many cases, participants simply expected that the smart kitchen would make their existing processes more efficient by providing additional information or prompts. However, some participants also saw a role for technology ranging from automating simple manual tasks (e.g., adjusting oven temperature) through to very high levels of automation (i.e., equivalent to having access to a physical assistant).
However, as envisaged levels of automation increased, so too did concerns about agency or the potential loss of an enjoyable personal cooking practice. We note that our beginner participant raised neither of these two concerns, but they did express scepticism, despite their desire for one, about whether a fully automated kitchen would be realistic in the short term, because of the importance of personal judgement and creativity in the cooking process. Lastly, the beginner participant raised concerns that automation may not respond promptly or appropriately in unexpected circumstances (e.g., if something catches fire), leading to increased risk.

3. Technological Recommendations: AI and EUD

In this section, we provide recommendations on how the end-user requirements outlined in the previous section can be addressed through a combination of AI and IoT workflows.

3.1. Understanding Recipes through Natural Language Processing

Natural language processing (NLP) provides an effective way to extract procedures and measurements from cooking recipes. We examine relevant literature, including multi-modal approaches, with a view to automatically generating IoT workflows solely from these recipes.

End-user needs in relation to precision of measurements can be partially met by the NLP task of named entity recognition (NER) [8]. However, recipes often use expressions that are relative rather than absolute (e.g., “fill until water covers the pot”). Also, they do not always specify all the ingredients necessary in each step, since readers are expected to use their common sense knowledge in interpreting the steps. More recent studies have explored the use of multi-modal data to supplement ingredients and tools omitted in text, e.g., by using any images shown for each cooking step [9]. Hence, it is evident that research undertaken in analysing text and images can provide end-users with more realistic measurements.
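As a rough illustration of the measurement-related needs discussed above, the sketch below pulls quantities out of a recipe step and converts them to metric units. It is a deliberately simple, regex-based stand-in for a trained NER model and is not the method of any cited work; the names `extract_measurements`, `TO_METRIC` and `VAGUE_PHRASES`, as well as the conversion table itself, are assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Illustrative (assumed) conversion factors to metric units. A real system
# would need a curated, ingredient-aware table, since e.g. a cup of flour
# and a cup of water weigh different amounts.
TO_METRIC = {
    "cup": (236.6, "ml"),   # US customary cup
    "tbsp": (14.8, "ml"),
    "tsp": (4.9, "ml"),
    "oz": (28.35, "g"),
    "lb": (453.6, "g"),
    "stick": (113.0, "g"),  # US stick of butter, roughly 113 g
}

# Expressions that cannot be converted and should be referred back to the user.
VAGUE_PHRASES = ("a little bit", "enough", "to taste")

# A number (or "a"/"an", read as 1) followed by a known unit.
QUANTITY = re.compile(
    r"\b(?P<num>\d+(?:\.\d+)?|an?)\s+(?P<unit>cups?|tbsp|tsp|oz|lbs?|sticks?)\b",
    re.IGNORECASE,
)


@dataclass
class Measurement:
    source_text: str
    amount: Optional[float]
    unit: Optional[str]
    needs_clarification: bool


def extract_measurements(step: str) -> List[Measurement]:
    """Return metric quantities found in a step, plus flagged vague phrases."""
    found = []
    for match in QUANTITY.finditer(step):
        num = match.group("num").lower()
        amount = 1.0 if num in ("a", "an") else float(num)
        unit_key = match.group("unit").lower().rstrip("s")  # "cups" -> "cup"
        factor, metric_unit = TO_METRIC[unit_key]
        found.append(
            Measurement(match.group(0), round(amount * factor, 1), metric_unit, False)
        )
    lowered = step.lower()
    for phrase in VAGUE_PHRASES:
        if phrase in lowered:
            found.append(Measurement(phrase, None, None, True))
    return found


results = extract_measurements(
    "Melt a stick of butter, add 2 cups of milk and a little bit of salt."
)
for item in results:
    print(item)
```

Vague expressions are flagged rather than guessed at, which mirrors the participants' suggestion that the kitchen should prompt the user when a quantity cannot be determined.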
Previously reported NLP studies employed a variety of transformer-based models [10, 11, 12] to infer how the state or location of an entity (ingredient) changes as it undergoes a process. For example, seminal work by Dalvi et al. [12] assembles a state change matrix (Move, Create, Destroy, No change) from a series of instructional steps (including those from recipes), building a graph that denotes which actions depend on which. Advances in this domain, including multi-modal approaches (i.e., adding information from images or videos), could resolve participants’ needs for keeping track of the state of various ingredients involved in a recipe.

The computer vision (CV) tasks of action recognition and object detection for the cooking domain [13, 14, 15, 16] could aid in step presentation by enabling a conversational agent to give prompts to a user when necessary. Such agents, largely underpinned by large language models (LLMs) [17], have been shown to be capable of interactively communicating with end-users [18]. In addition, combining action recognition with conversational agents has been tested in the cooking domain [19]. Thus, existing approaches within both the NLP and CV literature could accommodate presentation requirements.

General cooking knowledge can be addressed by NLP through the employment of pre-trained language models [18]. These models were trained on a massive number of documents, including cooking recipes. These types of models could address participants’ needs when seeking further information regarding a cooking step or an ingredient, e.g., through question answering.

A number of NLP studies [20, 21, 22, 23] focussed on extracting a workflow of cooking actions and their corresponding ingredients or cooking tools. To some extent, this could help users in dealing with dependencies between cooking steps.
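To make the idea of step dependencies concrete, the following sketch represents a recipe as a dependency graph and groups its steps into waves that could be carried out in parallel. The recipe steps and the helper name `plan_in_waves` are hypothetical; only Python's standard `graphlib` module is used, and in practice the graph would come from a recipe-parsing model rather than being written by hand.

```python
from graphlib import TopologicalSorter

# Hypothetical steps for a simple dish. Each key maps to the set of steps
# that must be finished before it can start.
dependencies = {
    "chop onions": set(),
    "preheat oven": set(),
    "brown onions": {"chop onions"},
    "assemble dish": {"brown onions"},
    "bake": {"assemble dish", "preheat oven"},
}


def plan_in_waves(deps):
    """Group steps into waves; steps within a wave have no unmet dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


waves = plan_in_waves(dependencies)
for number, steps in enumerate(waves, start=1):
    print(number, steps)
```

Grouping by wave, rather than producing a single linear order, leaves room for the agent to suggest tasks that can proceed side by side (such as preheating while chopping), which relates to the ordering and planning-ahead difficulties participants described.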
Given the above, it should be feasible to parse a textual recipe into an IoT workflow for automating kitchen tasks, one that specifies how each device should be configured or operated (e.g., in terms of mode, start time, duration and temperature). However, despite the most recent advances in NLP and in multi-modal modelling, challenges persist due to the nature of recipe data. Instructional text is usually not precise and some measurements and configurations (including timings, device settings and quantities) still need to be inferred, and may vary depending on each user’s available devices and preferences. For instance, the instruction “bake for 30 mins, until golden brown” will require the inference of the following steps on the part of an agent with ambient intelligence:

• setting the temperature
• opening the door
• expecting the dish inside
• performing a check as to whether the dish is golden brown or if 30 minutes have passed, and stopping the process if so
• alerting the user to the next step.

To some extent, these issues are mitigated by constructing recipe databases with precise information. For instance, process-oriented case-based reasoning can suggest recipes to end-users who come with very specific queries in relation to number of calories or type of cuisine [24]. However, such databases require recipes to have been already parsed into a structured format, which poses a hindrance to large-scale automation. Furthermore, in case-based reasoning, the conveyed cooking actions are directed towards the user rather than devices or machines.

Although it is evident that NLP can support the construction of IoT workflows for a kitchen with ambient intelligence, to our knowledge, NLP for recipes remains relatively under-explored. Furthermore, there exist no testbeds for ambient intelligent kitchens that allow for incorporating NLP models for extrinsic evaluation.

3.2. End-User Development

The field of EUD places particular emphasis on personalisation.
This dimension is not addressed through natural language processing of recipes, as recipes are generally based on collective insights from community members such as cooks and food bloggers, rather than individual user preferences. In the EUD paradigm, and more broadly in end-user programming, users can specify rules to govern system behaviour in response to particular events or contexts. For example, Ghiani et al. [25] offered mechanisms for users to set natural language rules that dictate the behaviour of kitchen appliances, such as turning off the oven when the user leaves the house.

Our interviews indicated that users envision agency over a variety of assistive devices with different levels of autonomy. In contrast, existing research on ambient kitchens usually focusses on a limited set of devices with predefined functionalities [26, 27, 28]. Furthermore, commercial products that we are aware of, such as Thermomix TM6¹, Bosch Cookit², TOKIT³ and Xiaomi Smart Cooking Robot⁴, primarily offer smart devices with a range of pre-configured functionalities. However, they lack multi-device collaboration and proactive personalisation features. Despite significant progress in the field, we agree with the findings presented by Marikyan et al. [29]: there are still barriers to achieving collaboration between multiple (kitchen) devices. Additionally, there is a notable absence of adaptive integration for new devices and features, particularly those that can proactively sense and respond to end-user needs.

We propose the construction of IoT workflows that leverage existing recipe mining technologies using NLP, augmented by an EUD meta-design framework.
According to Fischer’s definition [30]:

“Meta-design is a conceptual framework for defining and creating social and technical infrastructures in which new forms of collaborative design can occur.”

¹ https://www.thermomix.com/tm6
² https://cookit.bosch-home.com/de/faq/
³ https://uk.tokitglobal.com/
⁴ https://www.mi.com/de/product/xiaomi-smart-cooking-robot/

Inspired by Ventirozos et al. [31], we envision a hierarchical meta-design that allows stakeholders with varying degrees of technical expertise to collaborate. The aim is to integrate lower-level semantic representations, such as Petri nets and behaviour trees, with higher-level IoT EUD programming styles, including natural language and block-based visual programming languages. This approach empowers end-users with their preferred tools for crafting their own workflows, thereby reducing the burden on professionals to design workflows for each new device and recipe. Additionally, it places the control in the hands of the end-users, allowing them to test and tailor their ambient intelligence settings transparently, thereby fulfilling the dual requirements of personalisation and agency (i.e., determining which functionalities they want to be automated).

Moreover, we acknowledge that the ambient kitchen ecosystem involves multiple stakeholders. These range from IoT device vendors and R&D practitioners (developing new models for NLP, object detection, and action recognition) to chefs, food bloggers, nutritionists and food engineers, all contributing their unique perspectives. Our EUD meta-design vision aims to democratise the ambient intelligence space by enabling these various stakeholders to participate in the development of ambient kitchens, even if they possess limited technical knowledge.

To illustrate our vision, consider the following scenario demonstrating the future utility of kitchens enabled with ambient intelligence:

Jo plans a festive dinner for the family.
The grandparents want their favourite dishes, so Jo decides to cook them by reusing her recipe workflows from last year. The kids, however, want a new recipe they saw online. Also, they want to cook it with their grandma. The AmI kitchen agent suggests the best order in which to carry out the recipes, divided between Jo and the grandma with the kids. Once the cooking starts, the agent reminds grandma what to put in and observes that the utensils are cleaned properly between cutting the meat and the veggies. The kids ask the AmI agent where mango grows. The agent provides answers and videos to educate them, and shows all the recipes that they can try with it. Suddenly, the bell rings and the uncle arrives, bringing an air fryer. Jo thinks that the new device can alleviate some of the work of the other devices, so she asks the agent to incorporate the air fryer into the recipes. The agent looks in its knowledge base for parts of the recipes that can be done with the air fryer and shows them to Jo. Jo decides that it is preferable not to have the grandparents’ recipe on the air fryer since they might not be used to it, so she informs the agent about this by voice and also says that she wants to automate most of the recipes since there is a lot to prepare. After the big dinner, Jo sees that the combined IoT workflow is a success, so it is shared with friends. One of the friends is a restaurant owner, who liked one of the recipes with the air fryer; she reuses the workflow and asks her partner to inspect it and edit it using his preferred workflow coding tool for their semi-automated kitchen.

4. Discussion & Limitations

Our investigation of technological recommendations is by no means exhaustive and a larger interviewee sample would potentially elicit further requirements. Nevertheless, our study indicates that domain requirements for an AmI kitchen could be satisfied with NLP and EUD technologies to some extent (with some limitations in terms of cohesion and evaluation).
Moreover, even from a small sample of participants, the need for a framework to support agency (and varying degrees of automation) and personalisation was evident. We suggest further research in EUD meta-design to create new inclusive environments for human collaboration and augmented intelligence to democratise and speed up the development of AmI kitchens. 5. Conclusion In this position paper, we sought to enrich understanding of different kinds of users’ cooking behaviours and their associated requirements for a kitchen enabled with ambient intelligence (AmI kitchen). We demonstrate that NLP and other relevant AI methods can satisfy most of the requirements, and propose the use of EUD meta-design to complement the aforementioned approaches to fulfil the agency and personalisation requirements needed for user adoption. Acknowledgments We would like to thank ARM Ltd and the UK EPSRC under grant number EP/S513842/1 (studentship 2109081), whose funding made this study possible. Declaration on Generative AI The authors have not employed any Generative AI tools. References [1] A. Vasilakos, W. Pedrycz, Ambient Intelligence, Wireless Networking, And Ubiquitous Computing, Artech House, Inc., USA, 2006. [2] A. Gatley, M. Caraher, T. Lang, A qualitative, cross cultural examination of attitudes and behaviour in relation to cooking habits in france and britain, Appetite 75 (2014) 71 – 81. URL: http:// www.sciencedirect.com/science/article/pii/S0195666313005011. doi:https://doi.org/10.1016/ j.appet.2013.12.014 . [3] G. Ma, Food, eating behavior, and culture in chinese society, Journal of Ethnic Foods 2 (2015) 195–199. URL: https://www.sciencedirect.com/science/article/pii/S2352618115000657. doi:https: //doi.org/10.1016/j.jef.2015.11.004 . [4] V. Lehdonvirta, L. P. Shi, E. Hertog, N. Nagase, Y. Ohta, The future(s) of unpaid work: How sus- ceptible do experts from different backgrounds think the domestic sphere is to automation?, PLOS ONE 18 (2023) 1–16. 
URL: https://doi.org/10.1371/journal.pone.0281282. doi:10.1371/journal. pone.0281282 . [5] S. J. Kerr, O. Tan, J. C. Chua, Cooking personas: Goal-directed design requirements in the kitchen, International Journal of Human-Computer Studies 72 (2014) 255–274. URL: https:// www.sciencedirect.com/science/article/pii/S1071581913001365. doi:https://doi.org/10.1016/ j.ijhcs.2013.10.002 . [6] S. J. Stratton, Population research: Convenience sampling strategies, Prehospital and Disaster Medicine 36 (2021) 373–374. doi:10.1017/S1049023X21000649 . [7] C. Potts, K. Takahashi, A. Anton, Inquiry-based requirements analysis, IEEE Software 11 (1994) 21–32. doi:10.1109/52.268952 . [8] S. R. Gunamgari, S. Dandapat, M. Choudhury, Hierarchical recursive tagset for annotating cooking recipes, in: Proceedings of the 11th International Conference on Natural Language Processing, NLP Association of India, Goa, India, 2014, pp. 353–361. URL: https://aclanthology.org/W14-5149. [9] Y. Zhang, Y. Yamakata, K. Tajima, Supplementing omitted named entities in cooking procedural text with attached images, in: 2021 IEEE 4th International Conference on Multimedia Information Processing and Retrieval (MIPR), 2021, pp. 199–205. doi:10.1109/MIPR51284.2021.00037 . [10] L. Zhang, H. Xu, A. Kommula, C. Callison-Burch, N. Tandon, OpenPI2.0: An improved dataset for entity tracking in texts, in: Y. Graham, M. Purver (Eds.), Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, St. Julian’s, Malta, 2024, pp. 166–178. URL: https://aclanthology.org/2024.eacl-long.10. [11] N. Tandon, K. Sakaguchi, B. Dalvi, D. Rajagopal, P. Clark, M. Guerquin, K. Richardson, E. Hovy, A dataset for tracking entities in open domain procedural text, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Online, 2020, pp. 
6408–6417. URL: https://aclanthology.org/2020.emnlp-main.520. doi:10.18653/v1/2020.emnlp- main.520 . [12] B. Dalvi, N. Tandon, A. Bosselut, W.-t. Yih, P. Clark, Everything happens for a reason: Discovering the purpose of actions in procedural text, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Linguistics, Hong Kong, China, 2019, pp. 4496–4505. URL: https://aclanthology.org/D19-1457. doi:10.18653/v1/D19- 1457 . [13] S. Bansal, S. Khandelwal, S. Gupta, D. Goyal, Kitchen activity recognition based on scene context, in: 2013 IEEE International Conference on Image Processing, 2013, pp. 3461–3465. doi:10.1109/ ICIP.2013.6738714 . [14] J. Monteiro, R. Granada, R. C. Barros, F. Meneguzzi, Deep neural networks for kitchen activity recognition, in: 2017 International Joint Conference on Neural Networks (IJCNN), 2017, pp. 2048–2055. doi:10.1109/IJCNN.2017.7966102 . [15] M. Ramadan, A. El-Jaroudi, Action detection and classification in kitchen activities videos us- ing graph decoding, The Visual Computer 39 (2023) 799–812. URL: https://doi.org/10.1007/ s00371-021-02346-5. doi:10.1007/s00371- 021- 02346- 5 . [16] I. Azurmendi, E. Zulueta, J. M. Lopez-Guede, J. Azkarate, M. González, Cooktop sensing based on a YOLO object detection algorithm, Sensors 23 (2023). URL: https://www.mdpi.com/1424-8220/23/ 5/2780. doi:10.3390/s23052780 . [17] S. Schöbel, A. Schmitt, D. Benner, M. Saqr, A. Janson, J. M. Leimeister, Charting the evolution and future of conversational agents: A research agenda along five waves and new frontiers, Information Systems Frontiers 26 (2024) 729–754. URL: https://doi.org/10.1007/s10796-023-10375-9. doi:10.1007/s10796- 023- 10375- 9 . [18] J. I. Choi, S. Kuzi, N. Vedula, J. Zhao, G. Castellucci, M. Collins, S. Malmasi, O. Rokhlenko, E. 
Agichtein, Wizard of tasks: A novel conversational dataset for solving real-world tasks in conversational settings, in: Proceedings of the 29th International Conference on Computational Linguistics, International Committee on Computational Linguistics, Gyeongju, Republic of Korea, 2022, pp. 3514–3529. URL: https://aclanthology.org/2022.coling-1.310.
[19] R. Kojima, O. Sugiyama, K. Nakadai, Audio-visual scene understanding utilizing text information for a cooking support robot, in: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015, pp. 4210–4215. doi:10.1109/IROS.2015.7353973.
[20] J. Jermsurawong, N. Habash, Predicting the structure of cooking recipes, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Lisbon, Portugal, 2015, pp. 781–786. URL: https://aclanthology.org/D15-1090. doi:10.18653/v1/D15-1090.
[21] O. Abend, S. B. Cohen, M. Steedman, Lexical event ordering with an edge-factored model, in: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, Denver, Colorado, 2015, pp. 1161–1171. URL: https://aclanthology.org/N15-1122. doi:10.3115/v1/N15-1122.
[22] Y. Yamakata, S. Mori, J. Carroll, English recipe flow graph corpus, in: Proceedings of the Twelfth Language Resources and Evaluation Conference, European Language Resources Association, Marseille, France, 2020, pp. 5187–5194. URL: https://aclanthology.org/2020.lrec-1.638.
[23] D. P. Papadopoulos, E. Mora, N. Chepurko, K. W. Huang, F. Ofli, A. Torralba, Learning program representations for food images and cooking recipes, in: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 16538–16548. doi:10.1109/CVPR52688.2022.01606.
[24] R. Bergmann, L. Grumbach, L. Malburg, C.
Zeyen, ProCAKE: A process-oriented case-based reasoning framework, in: ICCBR Workshops, volume 2567, 2019, pp. 156–161.
[25] G. Ghiani, M. Manca, F. Paternò, C. Santoro, Personalization of context-dependent applications through trigger-action rules, ACM Trans. Comput.-Hum. Interact. 24 (2017). URL: https://doi.org/10.1145/3057861. doi:10.1145/3057861.
[26] A. Neumann, C. Elbrechter, N. Pfeiffer-Leßmann, R. Kõiva, B. Carlmeyer, S. Rüther, M. Schade, A. Ückermann, S. Wachsmuth, H. J. Ritter, “KogniChef”: A cognitive cooking assistant, KI - Künstliche Intelligenz 31 (2017) 273–281. URL: https://doi.org/10.1007/s13218-017-0488-6. doi:10.1007/s13218-017-0488-6.
[27] B. Bouchard, K. Bouchard, A. Bouzouane, A smart cooking device for assisting cognitively impaired users, Journal of Reliable Intelligent Environments 6 (2020) 107–125. URL: https://doi.org/10.1007/s40860-020-00104-3. doi:10.1007/s40860-020-00104-3.
[28] G. Kondylakis, G. Galanakis, N. Partarakis, X. Zabulis, Semantically annotated cooking procedures for an intelligent kitchen environment, Electronics 11 (2022). URL: https://www.mdpi.com/2079-9292/11/19/3148. doi:10.3390/electronics11193148.
[29] D. Marikyan, S. Papagiannidis, E. Alamanos, A systematic review of the smart home literature: A user perspective, Technological Forecasting and Social Change 138 (2019) 139–154. URL: https://www.sciencedirect.com/science/article/pii/S0040162517315676. doi:10.1016/j.techfore.2018.08.015.
[30] G. Fischer, E. Giaccardi, Meta-design: A Framework for the Future of End-User Development, Springer Netherlands, Dordrecht, 2006, pp. 427–457. URL: https://doi.org/10.1007/1-4020-5386-X_19. doi:10.1007/1-4020-5386-X_19.
[31] F. Ventirozos, R. Batista-Navarro, S. Clinch, D.
Arellanes, IoT cooking workflows for end-users: A comparison between behaviour trees and the DX-MAN model, in: 2021 ACM/IEEE International Conference on Model Driven Engineering Languages and Systems Companion (MODELS-C), 2021, pp. 341–350. doi:10.1109/MODELS-C53483.2021.00057.

A. Interview Questions

I. Demographics and cooking habits:
1. What age range describes you?
2. How often do you cook?
3. How often do you cook something new?
4. Do you usually cook for more than 5 people?
5. Do you have children? If yes, how many?

II. Questions drawn from the study by Kerr et al. [5]:
1. How easily can you follow a recipe? What kind of modality do you prefer and why?
2. Do you anticipate any advances in device configurations? For instance, devices that take out the guesswork and are automatically turned on/off when necessary.
3. How easy is it for you to get advice on specific steps of a recipe? What do you find most cumbersome?
4. If you weigh ingredients, do you find the task irksome? Would automation in weighing be helpful? For instance, the scale could know which ingredient is being measured and what the correct weight is; the same applies to liquids.
5. Whenever you have inspirations whilst cooking, do you log them, and if so, how? Do you think this process could be improved by an intelligent kitchen knowing what ingredients and device configurations you used?
6. If you share recipes, would the above process also be helpful?
7. Do you experience distractions whilst cooking? What is the most annoying thing about it?

III. Automation and Instruction:
1. Can you imagine yourself cooking by giving instructions to your kitchen to do the labour-intensive tasks? How would you envision that? Would you be worried that the automation could screw some things up?
2. What if Ambient Intelligence could adapt to your needs and your environment, tailoring cooking recipes and steps for you?