Fostering Inexperienced User Participation in ML-based Systems Design: A Literature Review of Visual Language Tools

Serena Versino1,*, Tommaso Turchi1 and Alessio Malizia1,2
1 University of Pisa, Largo Bruno Pontecorvo, 3, 56127 Pisa (Italy)
2 Molde University College, Molde (Norway)

Abstract

The application of Artificial Intelligence (AI) technologies in various sectors is based on machine learning (ML) systems, which, despite their transformative potential, can be complex and opaque for non-technical users. This review explores the role of Visual Programming Languages (VPLs) in lowering these barriers and enhancing the accessibility of ML-based system design for domain experts. We examine the application of ML processes through VPLs, seeking tools that open AI to a broader audience while identifying current challenges and future research directions. Bridging the gap between experts and the broader society is necessary, especially in sectors where responsible and trustworthy AI systems play a pivotal role in decision-making. By democratizing AI, we aim to provide socio-technical conditions that enable users with diverse backgrounds to actively contribute to the design of ML-based systems, enhancing their understanding and trust. Therefore, this literature review also addresses how VPL-based tools incorporate features for interpretability and collaboration. Our findings reveal that tools either lack comprehensive customizability, demand computing proficiency, or lack interpretability features. These limitations can hinder synergistic communication between users and intelligent systems, uncovering a research gap in the development of VPLs suited for novices engaged in the design of ML-based systems.

Keywords

visual programming language, participation, AI democratization, machine learning

1. Introduction

Nowadays, Artificial Intelligence (AI) is transforming business, academia, and socio-cultural dynamics alike.
AI applications range widely, from facilitating language translation and email spam filtering to enhancing virtual personal assistant functionalities for scheduling. Moreover, AI is instrumental in refining medical diagnoses, boosting agricultural efficiency, aiding in climate change efforts, and increasing production system efficiency via predictive maintenance [1]. Therefore, AI integration across diverse sectors has the potential to drive innovation in product development, decision-making processes, and organizational efficiencies, marking a pivotal shift in operational paradigms [2, 3]. Educational institutions are similarly adapting, revising pedagogical approaches to integrate AI, reflecting its transformative impact on teaching and learning methodologies [4]. Recent trends in AI development are propelling society towards an increasingly algorithmic era [5]. The European Commission's white paper emphasizes that this trajectory of AI will significantly influence our future, though the exact nature of AI's interaction with people and its subsequent impact remains uncertain [1]. Although AI systems are often perceived as fair and precise, their performance can vary significantly across different domains.

Proceedings of the 1st International Workshop on Designing and Building Hybrid Human–AI Systems (SYNERGY 2024), Arenzano (Genoa), Italy, June 03, 2024.
* Corresponding author.
serena.versino@phd.unipi.it (S. Versino); tommaso.turchi@unipi.it (T. Turchi); alessio.malizia@unipi.it (A. Malizia)
0000-0002-9860-9142 (S. Versino); 0000-0001-6826-9688 (T. Turchi); 0000-0002-2601-7009 (A. Malizia)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073.
AI technology can entail a number of potential risks, such as opaque decision-making and gender-based or other forms of discrimination. For example, recommender systems utilize algorithms to manipulate search engine outcomes based on user inquiries, thus impacting consumption decisions [6], shaping public opinion, and influencing societal perceptions [7]. These systems filter and prioritize information based on underlying factors such as browsing history and demographic data [8]. At the core of AI technology lie sophisticated ML-based systems, trained on human data that encompasses a broad spectrum of demographics, cultures, and personal traits of those who generate it. The growing complexity of algorithms has centralized their development and management among a small group of technical experts, such as software developers, and increased society's dependence on their expertise [9]. Domain specialists are often excluded from the design process of ML-based systems, which limits their understanding of these systems and relegates them to the role of mere end-users. Conversely, individuals with high computing proficiency often lack insight into the specific operational domains of their applications. This gap raises concerns about the societal impact, transparency, and trustworthiness of ML-based systems [10, 11]. Therefore, closing the knowledge divide between domain specialists and computing professionals is crucial for ensuring ethical and fair decision-making in these systems [11]. This objective can be achieved by facilitating broader participation in the design of ML-based systems across different levels of expertise. The democratization of AI encourages participation from a broad user base by fostering socio-technical ecosystems that equip diverse societal segments with the tools to navigate the challenges brought by AI advancements.
Therefore, AI democratization seeks to harmonize the technical knowledge of computing professionals with the nuanced understanding of domain-specific practitioners, ensuring that AI systems are ethically aligned and contextually relevant [12]. End-User Development (EUD) has emerged as a pivotal strategy for this cultural transformation. It enables users to transition from passive roles, such as consumers of artifacts and systems, to active roles, like designers [12, 13]. By facilitating knowledge reformulation, enabling creative expression, and fostering content generation, EUD allows diverse audiences to design and create their own tools and artifacts. This cultural transformation has given rise to cultures of participation, where multidisciplinary teams collaborate within socio-technical settings to achieve common goals [14, 15]. These teams span the spectrum of computer users: from those who program, such as computing professionals, to those who use applications for productivity, such as domain specialists. While the objective is to empower domain specialists to develop and modify systems, it does not shift the burden of designing high-quality systems onto them. Instead, EUD and Human-Centered AI (HCAI) offer the necessary support for end-users, who are most familiar with their requirements, to adapt and improve their systems. HCAI research, for example, explores innovative methods to engage novice users through visual user interfaces [16]. In educational contexts, tools like Visual Programming Languages (VPLs) and no-code platforms such as Scratch [17] prioritize user-friendly experiences by simplifying complex computational operations. Engaging users in the design of ML-based systems through such participatory approaches can support the broad appropriation and integration of trustworthy AI technologies across various domains [18, 19].
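To make concrete how block-based tools such as Scratch shield users from syntax, consider a minimal sketch in which a program is just a nested structure of blocks walked by an interpreter. The block vocabulary below is an illustrative assumption, not Scratch's actual block format:

```python
# A tiny interpreter for a block-based program: the program is a tree of
# blocks, so users compose structure instead of writing syntax.
def run(block, env):
    kind = block[0]
    if kind == "num":        # ("num", literal) -> constant value
        return block[1]
    if kind == "get":        # ("get", name) -> read a variable
        return env[block[1]]
    if kind == "add":        # ("add", a, b) -> sum of two sub-blocks
        return run(block[1], env) + run(block[2], env)
    if kind == "set":        # ("set", name, value_block) -> assign
        env[block[1]] = run(block[2], env)
    elif kind == "repeat":   # ("repeat", times, body) -> loop over blocks
        for _ in range(block[1]):
            for b in block[2]:
                run(b, env)

# "repeat 3 times: set x to x + 2", starting from x = 0
env = {"x": 0}
run(("repeat", 3, [("set", "x", ("add", ("get", "x"), ("num", 2)))]), env)
print(env["x"])  # 6
```

Because users only combine well-formed blocks, syntax errors become impossible by construction, which is precisely the barrier-lowering property the EUD literature attributes to such tools.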
This work is based on the main research question: 'Can VPL-based frameworks foster the participation of both novice and expert practitioners in the design of ML-based systems?' This study contributes to the research in Hybrid Human-AI Systems by exploring the application of ML techniques through VPLs, aiming to reveal how VPLs can democratize AI and promote synergistic communication between novice users and ML-based systems. Research for this review was conducted through a search of publications within the ACM and IEEE digital libraries. This study is organized as follows: Section 1 introduces the topic. Section 2 provides background and highlights contributions from EUD in VPLs. Section 3 discusses related works. Section 4 outlines the methodology for the literature review, detailing the data collection, search processes, exclusion criteria, and paper selection. Section 5 delves into the literature analysis. Finally, Section 6 focuses on the discussion and conclusions.

2. Background

This section explores the historical progression and current state of EUD and VPLs, along with advancements in user interface technologies. It also discusses the integration of Explainable AI (XAI) techniques into the design of ML-based systems to enhance domain specialists' trust in and understanding of these systems.

2.1. End-User Development (EUD)

Since the 1960s, the development of various programming languages has been driven by the goal of enhancing coding accessibility, catering to educational purposes and user empowerment [20]. Initially, software development was predominantly the domain of computing specialists, which left end-users with little to no influence over the design and functionality of software [21]. The advent of EUD in the late 1980s, coupled with advancements in personal computing, marked a paradigm shift in this dynamic.
EUD revolutionized the way users interact with software by enabling them to configure systems and develop applications, thereby democratizing software design and modification beyond what was previously possible within the domain of professional software engineering [13]. This transformation covered the entire software development lifecycle [22]. Central to this transformation was the adoption of participatory design principles, engaging end-users directly in the system design process. Such participation transformed users from passive participants into active contributors, who could influence software design without needing extensive coding skills [13]. Concurrently, advances in AI technology began to emerge as powerful tools for solving real-world problems. These advancements brought a renewed focus to computing, ranging from knowledge representation and utilization to system assembly, and encompassing activities such as perception, reasoning, and decision-making [23]. Despite its advantages, the application of EUD often focused on short-term problem-solving, occasionally sidestepping the traditional, more complex methodologies necessary for developing sustainable, long-term AI applications. This tendency persisted until recent years, when a growing body of research began to support efforts to bridge the knowledge and involvement gap between professional software designers and end-users.

2.2. Visual Programming Languages (VPLs)

To overcome the technical barriers that novices face with coding, educational approaches have incorporated visual components that intuitively represent programming concepts, like pressing buttons or spatial movement. For instance, VPLs utilize visual representations of programming logic, facilitating an intuitive approach to software development [24]. At the core of programming languages are syntax and semantics, respectively the structure of the language and the meaning conveyed. In the review by Kuhail et al.
[25], the merging of two well-established taxonomies, namely those of Myers [26] and of Burnett and Baker [27], yields four distinct categories of VPLs: block-based, form-based, diagram-based, and icon-based languages. Block-based languages simplify programming by allowing users to construct programs using drag-and-drop code blocks, thus reducing syntax errors and focusing on conceptual understanding (e.g., tools like Scratch [17] and TAPAS [28]). Icon-based languages use graphical icons, easing the integration of diverse content sources and supporting novices in creating Personal Information Spaces [29]. Form-based languages enable the configuration of forms and computational cells through both textual and visual elements, facilitating the definition of data interdependencies [30]. Diagram-based or flow-based languages employ a data flow paradigm represented as directed graphs [31, 32], making complex data processing understandable through visual nodes and arcs; an example is Grasshopper [33], used in the architecture domain.

2.3. Graphical, Tangible and Natural Interfaces

VPLs integrate visual elements into syntax, which can enable inexperienced users to design and improve software via graphical interfaces [34]. Graphical User Interfaces (GUIs) and Tangible User Interfaces (TUIs) represent significant advancements in facilitating the comprehension of intricate concepts through interactive engagement and manipulation. GUIs, traditionally based on mouse and keyboard inputs, constrain user interactions to predefined mechanisms. TUIs leverage direct manipulation of physical objects such as blocks or cards to enhance the understanding of complex concepts, thereby accelerating improvements in software usability [35]. Further evolution has led to the development of Natural User Interfaces (NUIs), which exploit innate human capabilities such as touch, vision, and speech, offering an intuitive and natural means of digital interaction [36].
NUIs can utilize diverse mediums for digital interaction. Through cameras and sensors, they enable touch interfaces that allow direct manipulation of digital content via touchscreens. For instance, voice recognition devices allow users to interact using natural language commands, while gesture recognition devices interpret body movements, and facial expression recognition devices enable interfaces to respond to users' emotions. NUIs also extend into augmented and virtual reality, enabling interactions with digital content overlaid on the real world or in virtual environments.

2.4. Explainable AI (XAI)

The challenges encountered by novices entering AI technology extend beyond computing barriers. In recent years, the inherently complex nature of ML-based systems has raised ethical concerns regarding the fairness of their decision-making processes and their explainability. For instance, cases including the investigation into Goldman Sachs for gender-based credit discrimination1, observed biases in Amazon's automated hiring processes2, and ethnic disparities in the COMPAS algorithm3 uncovered the need for improved transparency in such processes. These instances showed that the successful adoption of ML-based systems in their domain-specific applications relies on decision makers' comprehension and trust. Similar to human interactions, trust in ML-based systems should be established on a foundation of mutual understanding and shared values. Indeed, our confidence in these systems increases when we understand their underlying processes, enabling us to intervene and ensure that decision-making aligns with ethical standards [12]. At the current state, decision makers, who are domain specialists, adopt AI technology as end-users, meaning they are not necessarily ML experts. However, they require a clear understanding of ML-based systems to make informed decisions about their deployment.
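One game-theoretic idea widely used to build such understanding is the Shapley value, which splits a prediction among input features by averaging each feature's marginal contribution over all subsets of the other features. The toy model, input, and background values below are illustrative assumptions; production explanation libraries approximate this exponential computation efficiently:

```python
from itertools import combinations
from math import factorial

def model(x):
    # Illustrative "black-box": linear terms plus one interaction term.
    return 3 * x[0] + 2 * x[1] - x[2] + 0.5 * x[0] * x[1]

def shapley_values(f, x, background):
    """Exact Shapley values: feature i's importance is its marginal
    contribution averaged over every subset of the other features."""
    n = len(x)
    def v(subset):
        # Features in `subset` keep their actual value; the rest are
        # replaced by the background (baseline) value.
        z = [x[i] if i in subset else background[i] for i in range(n)]
        return f(z)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            for s in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += weight * (v(set(s) | {i}) - v(set(s)))
    return phi

x, bg = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
phi = shapley_values(model, x, bg)
# Efficiency property: attributions sum to f(x) - f(background).
assert abs(sum(phi) - (model(x) - model(bg))) < 1e-9
```

Note how the interaction term (0.5 · x0 · x1) is split evenly between features 0 and 1, a property that makes such attributions intuitively fair to non-experts.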
To tackle these challenges, researchers have developed frameworks such as Shneiderman's model for HCAI [16]. This framework emphasizes methodologies that ensure human control, interpretability, and transparency, while enhancing the automation of ML-based systems [37]. In the realm of XAI research, this is facilitated, for example, through the application of SHapley Additive exPlanations (SHAP). SHAP is a model-agnostic approach that employs game theory to assign importance values to features for individual predictions. This technique generates data perturbations to measure the impact on model output, aiding in detecting potential biases [38]. Despite significant advancements, the development of XAI techniques remains in its early stages, particularly in the realm of data visualizations [39]. The complexity of these techniques often challenges novices, providing only a partial glimpse into the underlying ML processes, which still appear as black boxes to domain specialists. Current research in XAI and HCAI aims to refine interpretability methods by incorporating more effective techniques [40] and to develop strategies that directly involve domain specialists in the design of ML-based systems [16].

3. Related works

The body of literature shows an enduring interest in VPLs and user interfaces within the field of Human-Computer Interaction (HCI). Daniel D. Hils [41] anticipated that flow-based languages could widen the appeal of visual programming by applying it to new domains, introducing visual programming to domain specialists. Boshernitsan and Downes [42] observed a shift towards graphical displays in VPLs but cautioned against abandoning text-based languages due to challenges in readability and navigation. Later, Rouly et al.
[43] emphasized the importance of user interface design in the usability of integrated development environments (IDEs), suggesting a design approach that favors simplicity and user-centric controls. They highlighted the role of incorporating HCI theories, such as those proposed by Green and Petre [44] regarding cognitive dimensions in visual programming, to enhance IDE design and usability. Studies by Mason and Dave [45] explored the benefits of VPLs in reducing the complexity associated with programming, thereby making these tools more accessible to novices. Further exploring the educational impact of VPLs, Noone and Mooney [46] examined their effects on learning programming, observing that VPLs can lead to increased interest among students. They cited the example of Scratch [17], a block-based language recognized for its ability to lessen the cognitive load on learners, thus enabling them to concentrate more on understanding programming concepts rather than tackling the intricacies of syntax. This approach has been integrated into new educational taxonomies designed to leverage the advantages of VPLs in the educational domain [47]. However, further explorations suggested that flow-based languages could offer a more intuitive understanding of programming concepts for beginners compared to block-based languages [45]. Meanwhile, Ray [48] delved into the ecosystem surrounding VPLs, reporting their extensive use in system simulation and multimedia, as well as the predominance of open-source environments.

1 MIT Technology Review: Gender Bias in Goldman Sachs' Apple Card Algorithm
2 Reuters: Bias in Amazon's AI Recruitment Process
3 ProPublica: Ethnic Bias in COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) Risk Assessment Algorithm
Despite their advantages in visualizing programming logic, facilitating logical understanding, and enhancing portability across various devices, VPLs faced challenges such as poor user interfaces, slow code generation, a lack of standardized models, and an absence of abstraction layers that hindered their growth. Then, Kuhail et al. [49] pointed out the lack of studies analyzing evidence-based visual programming approaches in domains beyond robotics, IoT, and education, highlighting an emerging interest in interactive displays, AI contexts, and data science. They reported a sharp increase in VPL research publications between 2017 and 2019, focusing on block-based and flow-based languages. Key evaluation metrics identified in their survey included completion time, number of errors, perceived usability, usefulness, workload, and cognitive dimensions [44]. The authors emphasized the need for integrating conversational agents and ML models to aid end-users in developing and debugging visual programming projects, suggesting a forward path for enhancing the accessibility and efficiency of VPLs. In their recent work, El Kamouchi et al. [50] study the use of low-code/no-code (LC/NC) technologies in web/mobile development and healthcare, observing widespread adoption in AI-powered systems. They emphasize the advantages of LC/NC technologies in reducing costs and accelerating development, while also pointing out ongoing challenges, like restrictions associated with proprietary software and performance issues. Across the surveyed literature, the authors identify challenges and limitations of VPL-based tools, such as inadequate user interfaces, the absence of standardized models, limited user-friendliness for beginners, and the complexity inherent in ML-based applications. This review addresses this research gap by examining the application of ML techniques through VPLs, including the presence of efforts to enhance trust and comprehension in ML decision-making processes.

4.
Methodology

Following the Kitchenham and Charters [51] framework, our analysis began with a planning phase dedicated to reviewing the existing literature on VPLs. This preliminary investigation highlighted a gap in the literature on VPL-based systems within the realm of ML for domain experts. We then formulated and agreed on the research questions and established a review protocol. This protocol outlined the search strategy and determined the criteria for including and excluding studies. Following the retrieval of articles from selected databases, we carried out the execution phase, characterized by a two-step screening process. Initially, articles were screened based on their titles and abstracts, followed by a more detailed examination using the defined exclusion criteria. Throughout this second stage, the pertinence of each paper to our review was evaluated. In the final phase, the articles that met our criteria were analyzed to answer the research questions and report the findings. In this section, we outline the rationale behind our research questions (4.1), detail the search process (4.2), define the exclusion criteria (4.3), and present the paper selection derived from this procedure (4.4), which resulted in the identification of the 38 most pertinent articles published between 1994 and 2024 from a pool of 2,363 collected papers.

4.1. Research Questions

The research questions are crafted to explore the application of VPL-based tools in the ML context for domain specialists, aiming to uncover areas that require further exploration. Our aim is to examine the use of VPL-based tools, identify the application domains generating the most interest, investigate the types of VPLs employed, and assess how user experience and usability have been evaluated. Such questions will offer an overview of the field, covering technological facets as well as user and application considerations.
Our literature review addresses the following research questions:

RQ1: Which VPL-based tools have been used in designing ML-based systems? We aim to uncover the technical features of VPL-based tools within ML applications, fostering a deeper understanding of their strengths and limitations.

RQ2: Which kinds of VPL-based tools for ML-based system design are available, and in what ways have they been implemented? By investigating the various types of VPLs used (e.g., block-based or flow-based), we aim to reveal underexplored areas and potential limitations in current methodologies.

RQ3: What are the ML application domains where VPL-based tools find their use? This question aims to highlight the domains that have been the focus of research, shedding light on explored areas and opportunities for further development.

RQ4: What access modalities are available for designing ML-based systems? Exploring the range of access modalities will enable us to identify potential limitations within existing solutions.

RQ5: What is the background of users who have used VPL-based tools in ML application domains? We seek to identify user profiles, determining whether the primary users are computing experts or domain experts.

RQ6: How have the usability and user participation of VPL-based systems been evaluated? Grasping how usability and user participation assessments are applied can determine their current scope and the potential for advancements in research.

4.2. Search Process

We collected publications from the digital repositories of the Association for Computing Machinery (ACM) and the Institute of Electrical and Electronics Engineers (IEEE) as of February 2024. We executed searches using a keyword string designed to capture studies intersecting the domains of VPLs and ML.
Our search strategy employed the following keyword string:

("visual programming language" OR "visual language" OR "visual programming" OR "visual programming environment" OR "visual environment") AND ("graphical user interface" OR "graphical interface" OR "software" OR "visual block" OR "visual graph" OR "Block based" OR "Flow based") AND ("machine learning" OR "deep learning" OR "data mining")

4.3. Exclusion criteria

We defined a specific set of selection criteria to assess the relevance of papers to our study. These criteria were applied as follows: 1) papers must be authored in English; 2) each paper must include a title, abstract, and keywords for accurate identification, in order to maintain the integrity of the selection process; 3) the focus of the papers must be on the application of VPLs in the context of ML; studies that concentrate solely on interaction with a single object were not considered; 4) papers of four pages or more were included, as they provide enough content for a thorough analysis.

4.4. Paper selection

We found a total of 2,363 articles across the chosen digital libraries. Among them, 1,538 articles were sourced from the IEEE library, with the other 825 articles coming from the ACM Digital Library. We compiled references to these articles in BibTeX format, subsequently processing them with the 'bibtexparser' and 'pandas' Python libraries. Details of the selection procedure are concisely illustrated in Fig. 1 by the PRISMA flow diagram.

Figure 1: Identification process for paper selection

During the initial screening phase of the 2,363 articles, which considered the title, abstract, keywords, and authors, we removed 7 duplicates through manual review.
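The bookkeeping of such a screening pass can be sketched with pandas, which the authors report using alongside bibtexparser. The records below are made up for illustration (in practice they would come from the parsed BibTeX exports):

```python
import pandas as pd

# Illustrative records standing in for parsed BibTeX entries; titles and
# field values are invented for this sketch.
records = [
    {"title": "Visual ML Tool A", "abstract": "...", "keywords": "vpl; ml", "pages": 10},
    {"title": "Visual ML Tool A", "abstract": "...", "keywords": "vpl; ml", "pages": 10},
    {"title": "Short Note on VPLs", "abstract": "...", "keywords": "vpl", "pages": 2},
    {"title": "Unidentifiable Paper", "abstract": None, "keywords": None, "pages": 8},
]
df = pd.DataFrame(records)

# Screening sketch mirroring the review's criteria: drop duplicates,
# require identifying metadata, and keep papers of four or more pages.
df = df.drop_duplicates(subset="title")
df = df.dropna(subset=["abstract", "keywords"])
df = df[df["pages"] >= 4]
print(len(df))  # 1 record survives this toy screening
```

Automating these mechanical filters leaves only the relevance judgment (criterion 3) to manual review, which is how the two-step process described above scales to thousands of records.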
Additionally, 347 articles were excluded due to missing data necessary for the correct identification of the paper, and 1,908 articles were removed for not aligning with our research focus (such as those exclusively discussing either VPLs or ML applications, or systematic reviews solely on VPLs or ML). We then applied the exclusion criteria to the remaining 101 articles to ascertain their final relevance and suitability for inclusion in our study. Afterward, we excluded an additional 63 articles, primarily for lacking relevant VPL aspects linked to ML application (that is, minimal or no use of VPL-based tools) or for insufficient length. Finally, the search phase led to a collection of 38 articles.

5. Literature Analysis

The publication timeline presented in Tab. 1 reveals an early phase of exploration for VPLs within the context of ML during the early 1990s. Despite the initial introduction of VPLs in the 1980s, this period is marked by an evolution in the field, which has led to the sophisticated technologies we see today. Since 2018, a significant rise in interest towards VPLs has emerged, mirroring the need for user-friendly tools alongside the escalating complexity and wide adoption of AI-based systems.

Table 1: Distribution of selected articles per year

Year   1994 2005 2006 2013 2014 2015 2017 2018 2019 2020 2021 2022 2023 | Total
Count     1    1    1    2    1    1    1    5    5    3    5    5    7 |    38

A synthesis of the primary themes related to our research questions is provided in Fig. 2. The diagram features boxes corresponding to each research question, organizing the categories found in the content analysis of the collected articles. Each category provides the count of associated contributions. In the box for UX evaluation methods, articles are cross-referenced, as studies can employ diverse evaluation metrics (see Tab. 2 in the Appendix for the full list of evaluation metrics).

Figure 2: Summary of the key aspects of the research questions and literature review contributions

5.1.
Methods and Tools for VPLs in ML

The methods and tools section in Fig. 2 aims to address two research questions, RQ1: "Which VPL-based tools have been used in designing ML-based systems?" and RQ2: "What types of VPL-based tools for ML-based system design are available, and how have they been implemented?" Software is typically developed using text-based programming languages, such as Java and Python, and is often coupled with user interfaces to enhance the understanding of complex concepts through interactive engagement and manipulation [25]. These interfaces may include GUIs, TUIs, and NUIs. Given this context, and since many publications do not thoroughly detail the visual language used for the ML application (whether block-based or flow-based) or specify the programming language employed, our study focuses on the information explicitly provided by the authors. This review uncovered 35 tools providing GUIs for ML-based system design. We identified 19 Java-based tools — including 8 block-based examples such as Prompt Sapper [52] and 9 flow-based ones like Visual Apriori [53] — alongside 14 Python tools, of which 4 are block-based (e.g., Milo [54]).

5.1.1. Customization

Customization is pivotal for users aiming to tailor ML-based systems to specific domain requirements. However, in this review many tools were not explicitly described by their authors as customizable. While some authors have highlighted their products' customizability, including features like the creation of new nodes or blocks and the input of parameters for fine-tuning activities, our findings show that such customization, beyond basic parameter adjustments, often demands computing expertise. This requirement can limit accessibility for novices. Among the 29 customizable tools, 10 are block-based and 16 are flow-based, suggesting a potential prevalence of flow-based tools.
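Since most of the customizable tools identified are flow-based, it is worth recalling concretely what that paradigm means: a program is a directed graph whose nodes transform data flowing along the arcs. A minimal sketch, with illustrative node names and functions:

```python
# Minimal sketch of the data-flow paradigm behind flow-based VPLs: the
# program is a directed graph; nodes compute values, arcs carry data.
graph = {
    "load":      {"deps": [],            "fn": lambda: [4.0, 8.0, 15.0, 16.0]},
    "normalize": {"deps": ["load"],      "fn": lambda xs: [x / max(xs) for x in xs]},
    "mean":      {"deps": ["normalize"], "fn": lambda xs: sum(xs) / len(xs)},
}

def evaluate(graph, node, cache=None):
    """Evaluate a node by first evaluating its dependencies (a depth-first
    traversal of the directed graph), caching each result once computed."""
    cache = {} if cache is None else cache
    if node not in cache:
        args = [evaluate(graph, dep, cache) for dep in graph[node]["deps"]]
        cache[node] = graph[node]["fn"](*args)
    return cache[node]

result = evaluate(graph, "mean")
print(result)  # 0.671875
```

In tools like Orange or KNIME, users wire such nodes visually; customization then amounts to adding a new node (a function plus its dependencies) rather than editing program text.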
Block-based In the category of Java-developed block-based tools, the literature mentions tools such as Scratch [55, 56], along with its implementations including Tooee [57], LevelUp [58], and Interactive Machine Learning Sandbox [59], as well as TinyML (an implementation of ML Blocks) [60]. Within the Python ecosystem for block-based tools, examples include DeepBlocks [61] and GNU Radio Companion [62]. Additionally, the review identifies tools offering customizable components in both Java and Python, such as Rupai (Blockly) [63]. Flow-based Among the seven tools identified as Python-based and flow-based, all offer customization capabilities, highlighting Python’s popularity and its suitability for such applica- tions4 . Examples are Orange [64] (along with its implementations like Goldenberry [65, 66]), DL-IDE [67], SMILE (Simple Machine Learning) [68], DeepVisual [69], and Graphical AI [70]. In the category of Java and flow-based tools, examples include aFlux [71], Rapsai (Rapid Appli- cation Prototyping System for AI) [72, 73], RapidMiner [74], KNIME [75], Yale [76], Node-RED [77], and OneLabeler [78]. When considering tool integrations and implementations as separate entities, the distribution of GUI-equipped tools that are both flow-based and customizable, and written in either Java 4 IEEE Spectrum: The Top Programming Languages 2023 or Python, appears to be nearly balanced. Moreover, focusing solely on the aspect of GUI and customization — irrespective of the programming language used — the majority emerges as flow-based, with 16 tools, compared to 10 that are block-based. This disparity can partly due to the fact that not all authors specify the programming languages utilized. Some cases, such as Marcelle [79], CO-ML [80], and WEKA - Machine Learning workbench [81], exhibit particular ambiguities. 
For Marcelle and CO-ML, the available documentation falls short of specifying whether these tools are developed using Java, Python, or a mix of both, and it does not categorize them explicitly as block-based or flow-based. While WEKA is identified as Java-based, its documentation lacks clear information on the VPL approach it employs. These omissions may highlight the complexities involved in implementing VPLs within ML design. Finally, our review identified a few tools that play a role in providing methods to clarify the inner workings of black-box models or to elucidate ML mechanisms, for instance by mentioning XAI techniques. Among such tools are Gest [82], CO-ML [80], Mix & Match [83], and Rapsai [73].

5.2. Interaction Modality

The interaction modality of these tools can affect their accessibility and usability for users with limited experience. Our analysis indicates that drag-and-drop is the primary interaction modality, enabling users to easily manipulate and connect nodes or blocks in a visual workspace. This modality is typical of block-based and flow-based languages, with our review reporting 27 such tools. However, examples like Mix & Match [83] explore alternative approaches. It is a hybrid physical-digital toolkit that integrates GUIs with tangible tokens that users manipulate to design ML-based systems and perform typical ML tasks, such as supervised and unsupervised classification. Another example is Gest [82], an ML gesture recognition system through which children utilize a sensor to engage with ML concepts.

5.3. Application Domains

Equipping domain experts with the necessary tools to participate in the design process can support the development of unbiased and trustworthy ML-based systems. For instance, leveraging their specialized knowledge can enable increased control over recommender systems, which shape our choices by tailoring search results to our queries, thus influencing our consumption patterns, public opinion, and societal perceptions [6, 7].
Such control can prevent these systems from filtering and prioritizing information based on opaque criteria, like browsing habits and user demographics [8]. Given this premise, the third box in Fig. 2 delves into research question RQ3: 'What are the ML application domains where VPL-based tools find their use?' This review reports that VPLs are mainly utilized within the field of computer science (13 papers), with tools such as DeepGraph [84]. The education sector is likewise represented in 13 papers, with tools like Scratch [55, 56, 57, 58, 59] being employed to introduce children to the concepts underlying ML processes. In the industry sector, tools like PaddlePaddle [85] can empower companies to train their employees to become proficient in both ML processes and business applications. In healthcare, VPLs provide valuable tools for domain experts, as shown in 5 papers. For instance, KNIME [75] is used to develop an ML-based system aimed at predicting hospital admissions. Similarly, RapidMiner [74] is applied in biomedical informatics for visual workflow design, thereby enhancing healthcare decisions and facilitating the early diagnosis and prediction of diseases. The Workflow Designer [86] enables users to prototype and manage complex ML workflows, such as those involving electroencephalography signals. Additionally, there are tools for managing ML pipelines in the cloud for specific applications, such as diabetes treatment, using Lemonade [87]. In these cases, widely used ML models, including K-Nearest Neighbors, Naïve Bayes, Decision Trees, Support Vector Machines, and Deep Neural Networks, have been deployed and assessed. 5.4. Accessibility In this section, we investigate the accessibility of VPL-based tools, as it can affect the participation of a broader audience. Easy accessibility can enhance inclusivity, improve the overall user experience, and increase usability for all users.
Tools that allow end-user modifications can be adapted to specific domains, preventing exclusionary experiences. We assess accessibility through two modalities: ease of user modification and mode of access. For the first modality, we evaluate whether the tool is open-source, which enables users to freely inspect, modify, and enhance it, or proprietary, which includes restrictions imposed by the owner. For the second modality, we examine the access method of the application development environment, whether through a web browser or a desktop application. We evaluated both modalities together, recognizing that the ease of user modification represents a deeper form of accessibility. Therefore, we address RQ4, 'What access modalities are available for designing ML-based systems?', by exploring these two modalities (see Accessibility box in Fig. 2). Aligned with existing literature on VPLs in the IoT domain [48], we expected a prevalence of open-source web applications. Our review indeed confirmed this expectation, with 28 papers indicating a preference for open-source environments, of which 13 specifically favor web applications such as [60, 63, 79, 72, 52, 70, 87, 84]. This tendency reflects a strategic effort to extend access more broadly and address the accessibility hurdles that proprietary desktop-based platforms (e.g., LabVIEW [88]) present. In our review, we identified eight papers featuring examples of open-source applications developed specifically for desktop environments, including [75, 64, 65, 66, 69]. Additionally, we found seven applications, such as [59, 58, 55, 56, 57], that are developed for both desktop and web platforms. Finally, we found an application [83] that employs a hybrid model combining TUIs and GUIs. This application incorporates both open-source and proprietary components, and is partially developed for both web and desktop platforms. 5.5.
End-users This section aims to address the research question RQ5: 'What is the background of users who have used VPL-based tools in ML application domains?' VPLs leverage visual representations of programming logic to offer an intuitive approach to software design, making them particularly accessible to users with little to no programming experience. This review reveals their application by domain specialists (17 papers) working in sectors like healthcare and agriculture, as well as by students within educational settings. In nine papers, VPLs have been utilized across various proficiency levels, with expectations of more in-depth use by experienced practitioners, such as for ML integration. Nevertheless, these tools' interfaces can facilitate the prototyping of ML-based systems by domain experts in healthcare and education. In the computer science domain (10 papers), including computer vision, IoT, and AI engineering, experts have utilized VPL-based tools to mitigate syntax errors and identify areas for improvement in ML processes. Research provides examples demonstrating that collaboration in co-design activities can effectively engage children in the development of new Intelligent User Interfaces (IUIs) using modalities such as speech, gesture, and writing. This participation can empower them to conceptualize and propose ideas for complex technical systems that integrate AI processes [89]. This review reports some initiatives aimed at enhancing collaboration among practitioners with diverse levels of expertise, such as Marcelle [79], CO-ML [80], and Rapsai [72]. Similarly, the Mix & Match tool [83] employs a hybrid model combining TUIs and GUIs to foster collaborative design efforts. 5.6. User Experience Evaluation Methods A key aspect of the study was to examine the extent of user participation in evaluating their interaction with the proposed VPL-based tool.
In addressing RQ6, 'How have usability and user participation in VPL-based systems been assessed?', our analysis revealed that 12 studies conducted evaluations of the usability and user experience of these systems (see Tab. 2 in Appendix). The other studies focused on computational performance, employing traditional ML evaluation metrics like accuracy, F1 score, and loss. These 12 studies evaluated the usability of VPL-based systems through a range of methods: Likert scales, open-ended questions, and custom questionnaires (5 papers), task completion times (6 papers), the think-aloud protocol (2 papers), the Affinity for Technology Interaction (ATI) Scale and the USE Questionnaire (2 papers), and both NASA-TLX and SUS assessments. NASA-TLX and SUS were used mainly in two studies: one assessing the usability and cognitive load of a flow-based system for junior data scientists in comparison to tabular and code-based representations [67], and another evaluating the effectiveness of diverse VPLs for domain experts in healthcare, biomedical laboratories, and education [90]. One study [83] employed the USE Questionnaire (USEQ) to measure usefulness, satisfaction, and ease of use. Studies employing Likert scales, such as [59], focused on the design of a prototype VPL-based system, where participants rated aspects such as interface components, visualization clarity, and system interaction. In certain instances, more tailored evaluation criteria were utilized, such as custom questionnaires [70] exploring users' experiences with the VPL-based tool through specific questions on their preferences for developing AI/ML graphically and their favored programming languages. Another paper [52] leveraged the cognitive dimensions framework [44] to assess usability at different developmental stages of the VPL-based prototype. The sample sizes of the user studies varied from 4 to 30 participants, with an average of 17 individuals.
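As an aside for readers unfamiliar with these instruments, the SUS score follows a fixed scoring rule over ten 1-5 Likert items (odd items are positively worded, even items negatively worded, and the adjusted sum is scaled to 0-100). A minimal sketch of that rule, with invented responses purely for illustration:

```python
def sus_score(responses):
    """Compute a System Usability Scale score from ten 1-5 Likert responses.

    Odd-numbered items (positively worded) contribute (response - 1);
    even-numbered items (negatively worded) contribute (5 - response).
    The summed contributions are multiplied by 2.5, yielding a 0-100 score.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten item responses")
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# All-neutral responses (all 3s) yield the midpoint score.
print(sus_score([3] * 10))  # → 50.0
```

Note that SUS yields a single overall score, whereas instruments like NASA-TLX decompose workload into subscales, which is one reason the two are often administered together.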
Two studies each had 30 participants [82, 52], while one study did not specify participant numbers [79] (see Tab. 2). In terms of participant demographics, seven of the twelve studies disclosed an age range of participants from 10 to 56 years old. However, five of the twelve studies [54, 58, 70, 79, 78] omitted details on participants' ages, with [70] lacking almost any information regarding its participants. In one study [82], the age of participants ranged from 10 to 13, as the research aimed at developing methods to teach ML concepts to children. Additionally, six of the twelve studies detailed the gender distribution among participants, which was not always even; only [83] showed a balanced gender distribution. In the case of [52], gender information was provided for 18 of the 30 participants. Across all studies, out of a total of 122 participants, 44% were female, 46% were male, and 10% were not specified. Overall, the data indicate that the assessment of users' participation with VPL-based tools has been underemphasized, with a greater focus placed on the computational efficiency of the deployed systems than on users' experience. 6. Discussion and Conclusions The integration of AI across various industries primarily relies on ML-based systems, which, despite their transformative potential, can be complex and inaccessible to those without a technical background. This review examines how VPLs can mitigate these barriers, thereby making the design of ML-based systems more accessible to domain specialists. It investigates the application of ML processes through VPLs, aiming to identify tools that democratize AI by addressing both existing challenges and potential areas for future research. Given that leveraging the expertise of domain specialists can enhance trust and trustworthiness in ML decisions, this study investigates the extent to which VPL-based tools integrate interpretability techniques and promote collaborative work environments.
Through a systematic examination of 38 articles, selected from an initial pool of 2,363, this review sheds light on the potential of VPLs to contribute to the democratization of AI and enhance its accessibility. Employed technologies Our findings reveal that ML-based system development primarily employs GUIs based on flow-based programming languages, allowing for user customization. The programming language used, whether Java or Python, alongside the choice of a flow-based design, does not inherently limit customization capabilities. However, the focus on customization features suggests that such GUIs can be more easily manipulated by users with computing expertise. The review also reports a limited number of tools that contribute to demystifying the operations of black-box models or explaining ML mechanisms, for example by incorporating XAI techniques. The accessibility of tools for domain specialists can be influenced by the interaction modality. Our analysis shows that drag-and-drop functionality is the predominant mode of interaction, simplifying complex tasks and enhancing user experience. Despite their user-friendly design for beginners, our review reveals that VPLs are mainly used by computing experts for technological developments, and in education to teach ML concepts to students. Recent research in education is exploring advancements in ML and sensor technology to augment interactive learning experiences. For example, we identified efforts to introduce ML-based gesture recognition systems that utilize physical input devices through the use of sensors [82]. Such systems can enhance the understanding of ML concepts among novices by enabling them to collect data, design ML models, and iteratively refine these models based on feedback. The significant evolution in microprocessors, memory, cameras, and sensors over the past decade has facilitated gestural interaction, signifying a shift toward NUIs [36].
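The collect-train-classify loop that such sensor-based tools expose to novices can be reduced to a few lines. The sketch below is purely illustrative and is not the method of any surveyed tool: it uses invented accelerometer-style feature vectors and a simple nearest-centroid rule, a minimal stand-in for the K-Nearest Neighbors family of models mentioned earlier.

```python
import math

# Hypothetical labeled sensor readings: each gesture maps to a list of
# 2-D feature vectors (e.g., mean and peak acceleration). Values invented.
training_data = {
    "wave":  [(0.2, 1.8), (0.3, 2.0), (0.25, 1.9)],
    "shake": [(1.5, 0.4), (1.7, 0.5), (1.6, 0.3)],
}

def centroid(points):
    """Component-wise mean of a list of feature vectors."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def classify(sample, data):
    """Assign the label whose training centroid is nearest (Euclidean)."""
    centroids = {label: centroid(pts) for label, pts in data.items()}
    return min(centroids, key=lambda lbl: math.dist(sample, centroids[lbl]))

print(classify((0.28, 1.85), training_data))  # → wave
```

In a VPL, each of these steps (data collection, model training, classification) would typically appear as a separate block or node, which is precisely what makes the iterative refine-and-retest loop visible to novices.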
Contemporary literature provides evidence of tools that embody NUI principles directly. For instance, InteractML [91] simplifies the development and adjustment of ML models for creators of all backgrounds, using a node-based graph and a virtual reality interface, with minimal programming required. Although these ongoing technological advancements are expected to generate a wave of innovative applications in the near future, this review found scant literature on the integration of NUIs with VPLs. Application domains, accessibility and evaluation metrics Our study revealed that beyond education and computer science, few domains, such as the healthcare sector, have adopted VPL-based tools (e.g., KNIME and RapidMiner). This finding highlights an opportunity to further explore the capabilities of flow-based programming languages in specific domains. By assessing their limitations and identifying possible enhancements, we can broaden the reach of VPLs to a more diverse audience. The findings indicate a significant trend toward adopting open-source platforms accessible through web applications, consistent with earlier research insights. VPLs can accommodate various expertise levels, simplifying complex tasks for novice users and empowering computing experts. However, they are primarily utilized by novices in educational settings and by experts in computer science. This evidence may explain the lack of initiatives aimed at encouraging collaboration between novices and experts. Finally, the variety in evaluation methods, from tailored custom metrics to broader questionnaires like SUS and NASA-TLX, highlights the lack of standardized methodologies for evaluating the usability of VPL-based systems, along with user participation and experience. In summary, our review of VPL-based tools in the ML context reveals a common problem.
A significant number of these tools are not customizable, lack features for interpretability, or require substantial computing expertise for effective use. This finding reveals a gap in research towards developing ML-based systems that are readily accessible to domain specialists without deep computing knowledge. By integrating XAI techniques, we could improve understanding of ML decision-making processes. To address this gap, our future work will introduce PyFlowML (a demo is available on YouTube), a prototype developed within an open-source, flow-based environment tailored for widespread adoption. With a focus on customizability and user-friendliness, we plan to assess whether PyFlowML can streamline ML processes and integrate XAI techniques, thereby improving trust and trustworthiness among novices. PyFlowML is currently being tested by both experts and end-users, and we plan to compare its usability with tools like KNIME. This comparison could contribute to setting benchmarks for developing VPL-based tools designed to foster the participation of domain specialists in the design of ML-based systems. Limitations This systematic literature review aims to explore how VPL-based tools can engage domain experts in designing trustworthy ML-based systems from an HCI perspective. This study's robustness could be influenced by factors like study selection, which drew primarily from digital libraries such as IEEE Xplore and ACM; these house a vast collection of conference papers and journal articles relevant to our focus. However, the coverage of these libraries, while valuable, is not all-encompassing, potentially affecting the comprehensiveness of our findings. Acknowledgments Research partly funded by PNRR - M4C2 - Investimento 1.3, Partenariato Esteso PE00000013 - "FAIR - Future Artificial Intelligence Research" - Spoke 1 "Human-centered AI", funded by the European Commission under the NextGeneration EU programme. References [1] E.
Commission, White paper on artificial intelligence - a European approach to excellence and trust, Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:52020DC0065&from=EN, 2020. [2] S. Makridakis, The forthcoming artificial intelligence (AI) revolution: Its impact on society and firms, In: Futures, Volume 90, Pages 46-60, 2017. doi:10.1016/j.futures.2017.03.006. [3] E. Brynjolfsson, A. McAfee, The second machine age: Work, progress, and prosperity in a time of brilliant technologies, Book published by WW Norton & Company, 2014. [4] D. Baidoo-Anu, L. Owusu Ansah, Education in the era of generative artificial intelligence (AI): Understanding the potential benefits of ChatGPT in promoting teaching and learning, In: Journal of AI, Volume 7, Number 1, Pages 52-62, 2023. doi:10.61969/jai.1337500. [5] B. Shneiderman, C. Plaisant, M. Cohen, S. Jacobs, N. Elmqvist, N. Diakopoulos, Grand challenges for HCI researchers, In: Interactions, Volume 23, Number 5, Pages 24-25, 2016. doi:10.1145/2977645. [6] M. Mansoury, H. Abdollahpouri, M. Pechenizkiy, B. Mobasher, R. Burke, Feedback loop and bias amplification in recommender systems, In: arXiv, 2020. doi:10.48550/ARXIV.2007.13019. [7] S. Milano, M. Taddeo, L. Floridi, Recommender systems and their ethical challenges, AI & SOCIETY, Volume 35, Issue 4, Pages 957-967, 2020. doi:10.1007/s00146-020-00950-y. [8] M. Makhortykh, A. Urman, R. Ulloa, Detecting race and gender bias in visual representation of AI on web search engines, In: Communications in Computer and Information Science, Pages 36-50, Springer International Publishing, 2021. doi:10.1007/978-3-030-78818-6_5. [9] Y. N. Harari, Why technology favors tyranny, In: The Atlantic, Volume 322, Number 3, Pages 64-73, 2018. [10] N. R. Council, Beyond productivity: Information technology, innovation, and creativity, Book published by National Academies Press, 2003. [11] B.
Shneiderman, Bridging the gap between ethics and practice: Guidelines for reliable, safe, and trustworthy human-centered AI systems, In: ACM Transactions on Interactive Intelligent Systems (TiiS), Volume 10, Number 4, Pages 1-31, 2020. [12] G. Fischer, End-user development: Empowering stakeholders with artificial intelligence, meta-design, and cultures of participation, In: Proceedings, Springer-Verlag, Berlin, Heidelberg, 2021. doi:10.1007/978-3-030-79840-6_1. [13] H. Lieberman, F. Paternò, M. Klann, V. Wulf, End-user development: An emerging paradigm, In: End User Development, Pages 1-8, Springer, 2006. [14] G. Fischer, Understanding, fostering, and supporting cultures of participation, In: Interactions, Volume 18, Number 3, Pages 42-53, 2011. [15] G. Fischer, D. Fogli, A. Mørch, A. Piccinno, S. Valtolina, Design trade-offs in cultures of participation: Empowering end users to improve their quality of life, In: Behaviour & Information Technology, Volume 39, Number 1, Pages 1-4, 2020. doi:10.1080/0144929X.2020.1691346. [16] B. Shneiderman, Human-centered AI, Book published by Oxford University Press, 2022. [17] S. Dasgupta, B. M. Hill, Scratch community blocks: Supporting children as data scientists, In: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, Pages 3620-3631, Denver, Colorado, USA, CHI '17, 2017. doi:10.1145/3025453.3025847. [18] F. Paternò, V. Wulf, New perspectives in end-user development, Book published by Springer, 2017. [19] A. Halfaker, R. S. Geiger, ORES: Lowering barriers with participatory machine learning in Wikipedia, In: Proc. ACM Hum.-Comput. Interact., Volume 4, CSCW2, Article 148, Pages 1-37, 2020. doi:10.1145/3415219. [20] C. Kelleher, R. Pausch, Lowering the barriers to programming: A taxonomy of programming environments and languages for novice programmers, ACM Comput. Surv., Vol. 37, No. 2, Article 83, June 2005. doi:10.1145/1089733.1089734. [21] J. C. Brancheau, J. C.
Wetherbe, Key issues in information systems management, MIS Quarterly, pp. 23-45, 1987. [22] A. J. Ko, R. Abraham, L. Beckwith, A. Blackwell, M. Burnett, M. Erwig, C. Scaffidi, J. Lawrance, H. Lieberman, B. Myers, M. B. Rosson, G. Rothermel, M. Shaw, S. Wiedenbeck, The state of the art in end-user software engineering, ACM Comput. Surv., Vol. 43, No. 3, Article 21, April 2011. doi:10.1145/1922649.1922658. [23] P. H. Winston, Artificial intelligence, Addison-Wesley Longman Publishing Co., Inc., 1984. [24] F. Paternò, End user development: Survey of an emerging field for empowering people, In: International Scholarly Research Notices, Volume 2013, Hindawi, 2013. [25] M. A. Kuhail, S. Farooq, R. Hammad, M. Bahja, Characterizing visual programming approaches for end-user developers: A systematic review, In: IEEE Access, Volume 9, Pages 14181-14202, 2021. doi:10.1109/ACCESS.2021.3051043. [26] B. A. Myers, Taxonomies of visual programming and program visualization, In: Journal of Visual Languages and Computing, Volume 1, Number 1, Pages 97-123, March 1990. [27] M. M. Burnett, M. J. Baker, A classification system for visual programming languages, In: Journal of Visual Languages and Computing, Volume 5, Number 3, Pages 287-300, September 1994. [28] T. Turchi, A. Malizia, Fostering computational thinking skills with a tangible blocks programming environment, In: 2016 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), Pages 232-233, 2016. doi:10.1109/VLHCC.2016.7739692. [29] C. Ardito, M. F. Costabile, G. Desolda, R. Lanzilotti, M. Matera, A. Piccinno, M. Picozzi, User-driven visual composition of service-based interactive spaces, In: Journal of Visual Languages & Computing, Volume 25, Number 4, Pages 278-296, 2014. doi:10.1016/j.jvlc.2014.01.003. [30] M. M. Burnett, A. L.
Ambler, Interactive visual data abstraction in a declarative visual programming language, In: Journal of Visual Languages & Computing, Volume 5, Number 1, Pages 29-60, 1994. doi:10.1006/jvlc.1994.1003. [31] D. D. Hils, Visual languages and computing survey: Data flow visual programming languages, In: Journal of Visual Languages & Computing, Volume 3, Number 1, Pages 69-101, 1992. doi:10.1016/1045-926X(92)90034-J. [32] K. N. Whitley, L. R. Novick, D. Fisher, Evidence in favor of visual representation for the dataflow paradigm: An experiment testing LabVIEW's comprehensibility, International Journal of Human-Computer Studies, Vol. 64, No. 4, pp. 281-303, 2006. [33] B. McNeel, S. Davidson, Grasshopper, Online resource, 2023. URL: http://www.grasshopper3d.com/. [34] M. M. Burnett, D. W. McIntyre, Visual programming, In: Computer-Los Alamitos, Volume 28, Pages 14-14, IEEE Institute of Electrical and Electronics, 1995. [35] T. Turchi, A. Malizia, A human-centred tangible approach to learning computational thinking, EAI Endorsed Transactions on Ambient Systems, Vol. 3, No. 9, 2016. [36] D. A. Norman, Natural user interfaces are not natural, Interactions, Vol. 17, No. 3, pp. 6-10, 2010. [37] M. Turek, Explainable AI (XAI), DARPA, 2018. URL: https://www.darpa.mil/program/explainable-artificial-intelligence. [38] M. Ibrahim, M. Louie, C. Modarres, J. Paisley, Global explanations of neural networks: Mapping the landscape of predictions, In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society (AIES '19), 2019. [39] A. Das, P. Rad, Opportunities and challenges in explainable artificial intelligence (XAI): A survey, arXiv preprint arXiv:2006.11371, 2020. [40] D. Slack, S. Hilgard, E. Jia, S. Singh, H.
Lakkaraju, Fooling LIME and SHAP: Adversarial attacks on post hoc explanation methods, In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (AIES '20), 2020. doi:10.1145/3375627.3375830. [41] D. D. Hils, Visual languages and computing survey: Data flow visual programming languages, Journal of Visual Languages & Computing, Vol. 3, No. 1, pp. 69-101, 1992. doi:10.1016/1045-926X(92)90034-J. [42] M. Boshernitsan, M. S. Downes, Visual programming languages: A survey, Computer Science Division, University of California, Los Angeles, CA, USA, 2004. [43] J. M. Rouly, J. D. Orbeck, E. Syriani, Usability and suitability survey of features in visual IDEs for non-programmers, Proceedings of the 6th Workshop on Evaluation and Usability of Programming Languages and Tools, PLATEAU '14, pp. 31-42, Portland, Oregon, USA, 2014. doi:10.1145/2688204.2688207. [44] T. R. G. Green, M. Petre, Usability analysis of visual programming environments: A 'cognitive dimensions' framework, In: Journal of Visual Languages & Computing, Volume 7, Number 2, Pages 131-174, 1996. doi:10.1006/jvlc.1996.0009. [45] D. Mason, K. Dave, Block-based versus flow-based programming for naive programmers, In: 2017 IEEE Blocks and Beyond Workshop, Pages 25-28, 2017. doi:10.1109/BLOCKS.2017.8120405. [46] M. Noone, A. Mooney, Visual and textual programming languages: A systematic review of the literature, Journal of Computers in Education, Vol. 5, pp. 149-174, 2018. [47] D. Saito, A. Sasaki, H. Washizaki, Y. Fukazawa, Y. Muto, Program learning for beginners: Survey and taxonomy of programming learning tools, Presented at the 2017 IEEE 9th International Conference on Engineering Education (ICEED), pp. 137-142, 2017. doi:10.1109/ICEED.2017.8251181. [48] P. P. Ray, A survey on visual programming languages in internet of things, Scientific Programming, Vol. 2017, 2017. [49] M. A. Kuhail, S. Farooq, R. Hammad, M.
Bahja, Characterizing visual programming approaches for end-user developers: A systematic review, IEEE Access, Vol. 9, pp. 14181-14202, 2021. [50] H. E. Kamouchi, M. Kissi, O. E. Beggar, Low-code/no-code development: A systematic literature review, Presented at the 2023 14th International Conference on Intelligent Systems: Theories and Applications (SITA), pp. 1-8, 2023. [51] B. Kitchenham, S. Charters, Guidelines for performing systematic literature reviews in software engineering, Issue 2, January 2007. [52] Y. Cheng, J. Chen, Q. Huang, Z. Xing, X. Xu, Q. Lu, Prompt Sapper: An LLM-empowered production tool for building AI chains, ACM Transactions on Software Engineering and Methodology, 2023. [53] A. Mahanti, R. Alhajj, Visual interface for online watching of frequent itemset generation in Apriori and Eclat, Fourth International Conference on Machine Learning and Applications (ICMLA'05), 6 pp., 2005. [54] A. Rao, A. Bihani, M. Nair, Milo: A visual programming environment for data science education, 2018 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), pp. 211-215, 2018. [55] W. Shi, Z. Dong, L. Zhang, Graphical platform of intelligent algorithm development for object detection of educational drone, 2021 China Automation Congress (CAC), pp. 6780-6784, 2021. [56] P. Plaza, M. Castro, J. M. Sáez-López, E. Sancristobal, R. Gil, A. Menacho, F. García-Loro, B. Quintana, S. Martin, M. B. et al., Promoting computational thinking through visual block programming tools, 2021 IEEE Global Engineering Education Conference (EDUCON), pp. 1131-1136, 2021. [57] Y. Park, Y. Shin, Tooee: A novel Scratch extension for K-12 big data and artificial intelligence education using text-based visual blocks, IEEE Access, Vol. 9, pp. 149630-149646, 2021. [58] T. Reddy, R. Williams, C.
Breazeal, LevelUp: Automatic assessment of block-based machine learning projects for AI education, 2022 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), pp. 1-8, 2022. [59] G. Nodalo, J. M. S. III, J. Valenzuela, J. A. Deja, On building design guidelines for an interactive machine learning sandbox application, Proceedings of the 5th International ACM In-Cooperation HCI and UX Conference, pp. 70-77, 2019. [60] R. Williams, M. Moskal, P. D. Halleux, ML Blocks: A block-based, graphical user interface for creating TinyML models, 2022 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), pp. 1-5, 2022. [61] T. Calò, L. D. Russis, Towards a visual programming tool to create deep learning models, Companion Proceedings of the 2023 ACM SIGCHI Symposium on Engineering Interactive Computing Systems, pp. 38-44, 2023. [62] R. Anil, R. Danymol, H. Gawande, R. Gandhiraj, Machine learning plug-ins for GNU Radio Companion, 2014 International Conference on Green Computing Communication and Electrical Engineering (ICGCCEE), pp. 1-5, 2014. [63] M. H. Masum, T. S. Rifat, S. M. Tareeq, H. Heickal, A framework for developing graphically programmable low-cost robotics kit for classroom education, Proceedings of the 10th International Conference on Education Technology and Computers, pp. 22-26, 2018. [64] J. Demšar, T. Curk, A. Erjavec, Č. Gorup, T. Hočevar, M. Milutinovič, M. Možina, M. Polajnar, M. Toplak, A. S. et al., Orange: Data mining toolbox in Python, Journal of Machine Learning Research, Vol. 14, No. 1, pp. 2349-2353, 2013. [65] S. Rojas-Galeano, N. Rodriguez, Goldenberry: EDA visual programming in Orange, Proceedings of the 15th Annual Conference Companion on Genetic and Evolutionary Computation, pp. 1325-1332, 2013. [66] L. P. Garzón-Rodriguez, H. A. Diosa, S.
Rojas-Galeano, Deconstructing GAs into visual software components, Proceedings of the Companion Publication of the 2015 Annual Conference on Genetic and Evolutionary Computation, pp. 1125-1132, 2015. [67] S. G. Tamilselvam, N. Panwar, S. Khare, R. Aralikatte, A. Sankaran, S. Mani, A visual programming paradigm for abstract deep learning model development, Proceedings of the 10th Indian Conference on Human-Computer Interaction, pp. 1-11, 2019. [68] I. Khodnenko, S. V. Ivanov, A. Lantseva, A lightweight visual programming tool for machine learning and data manipulation, 2020 International Conference on Computational Science and Computational Intelligence (CSCI), pp. 981-985, 2020. [69] C. Xie, H. Qi, L. Ma, J. Zhao, DeepVisual: A visual programming tool for deep learning systems, 2019 IEEE/ACM 27th International Conference on Program Comprehension (ICPC), pp. 130-134, 2019. [70] A. Shen, Y. Sun, GraphicalAI: A user-centric approach to develop artificial intelligence and machine learning applications using a visual and graphical language, 2021 4th International Conference on Data Storage and Data Engineering, pp. 52-58, 2021. [71] T. Mahapatra, I. Gerostathopoulos, C. Prehofer, S. G. Gore, Graphical Spark programming in IoT mashup tools, 2018 Fifth International Conference on Internet of Things: Systems, Management and Security, pp. 163-170, 2018. [72] R. Du, N. Li, J. Jin, M. Carney, X. Yuan, R. Iyengar, P. Yu, A. Kowdle, A. Olwal, Experiencing rapid prototyping of machine learning based multimedia applications in Rapsai, Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, pp. 1-4, 2023. [73] R. Du, N. Li, J. Jin, M. Carney, S. Miles, M. Kleiner, X. Yuan, Y. Zhang, A. Kulkarni, X. L. et al., Rapsai: Accelerating machine learning prototyping of multimedia applications through visual programming, Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pp. 1-23, 2023. [74] M. Bjaoui, H. Sakly, M. Said, N. Kraiem, M.
S. Bouhlel, Depth insight for data scientists with RapidMiner «an innovative tool for AI and big data towards medical applications», Proceedings of the 2nd International Conference on Digital Tools & Uses Congress, pp. 1-6, 2020. [75] R. Tsoni, V. Kaldis, I. Kapogianni, A. Sakagianni, G. Feretzakis, V. S. Verykios, A machine learning pipeline using KNIME to predict hospital admission in the MIMIC-IV database, 2023 14th International Conference on Information, Intelligence, Systems & Applications (IISA), pp. 1-6, 2023. [76] I. Mierswa, M. Wurst, R. Klinkenberg, M. Scholz, T. Euler, Yale: Rapid prototyping for complex data mining tasks, Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 935-940, 2006. [77] R. Machhamer, J. Altenhofer, K. Ueding, L. Czenkusch, F. Stolz, M. Harth, M. Mattern, A. Latif, S. Haab, J. H. et al., Visual programmed IoT beehive monitoring for decision aid by machine learning based anomaly detection, 2020 9th Mediterranean Conference on Embedded Computing (MECO), pp. 1-5, 2020. [78] Y. Zhang, Y. Wang, H. Zhang, B. Zhu, S. Chen, D. Zhang, OneLabeler: A flexible system for building data labeling tools, Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, pp. 1-22, 2022. [79] J. Françoise, B. Caramiaux, T. Sanchez, Marcelle: Composing interactive machine learning workflows and interfaces, The 34th Annual ACM Symposium on User Interface Software and Technology, pp. 39-53, 2021. [80] T. Tseng, J. K. Chen, M. Abdelrahman, M. B. Kery, F. Hohman, A. Hilliard, R. B. Shapiro, Collaborative machine learning model building with families using CO-ML, Proceedings of the 22nd Annual ACM Interaction Design and Children Conference, pp. 40-51, 2023. [81] G. Holmes, A. Donkin, I. H. Witten, WEKA: A machine learning workbench, Proceedings of ANZIIS'94 - Australian and New Zealand Intelligent Information Systems Conference, pp. 357-361, 1994. [82] T. Hitron, Y. Orlev, I. Wald, A. Shamir, H.
Erel, O. Zuckerman, Can children understand machine learning concepts? the effect of uncovering black boxes, Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 1–11, 2019. [83] A. Jansen, S. Colombo, Mix & match machine learning: An ideation toolkit to design machine learning-enabled solutions, Proceedings of the Seventeenth International Conference on Tangible, Embedded, and Embodied Interaction, pp. 1–18, 2023. [84] Q. Hu, L. Ma, J. Zhao, Deepgraph: A pycharm tool for visualizing and understanding deep learning models, 2018 25th Asia-Pacific Software Engineering Conference (APSEC), pp. 628–632, 2018. doi:10.1109/APSEC.2018.00079. [85] R. Bi, T. Xu, M. Xu, E. Chen, Paddlepaddle: A production-oriented deep learning platform facilitating the competency of enterprises, 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), pp. 92–99, 2022. doi:10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00046. [86] P. Ježek, L. Vařeka, Workflow designer - a web application for visually designing eeg signal processing pipelines, 2019 IEEE 19th International Conference on Bioinformatics and Bioengineering (BIBE), pp. 368–373, 2019. doi:10.1109/BIBE.2019.00072. [87] W. dos Santos, L. F. M. Carvalho, G. de P. Avelar, A. Silva, L. M. Ponce, D. Guedes, W. Meira, Lemonade: A scalable and efficient spark-based platform for data analytics, 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), pp. 745–748, 2017. doi:10.1109/CCGRID.2017.142. [88] D. Kaya, M. Türk, Comparing the performance of the kernel functions in the lda-svm based classification algorithm in the labview environment, 2018 International Conference on Artificial Intelligence and Data Processing (IDAP), pp. 1–4, 2018. doi:10.1109/IDAP.2018.8620788. [89] J.
Woodward, Z. McFadden, N. Shiver, A. Ben-hayon, J. C. Yip, L. Anthony, Using co-design to examine how children conceptualize intelligent interfaces, Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 1–14, 2018. doi:10.1145/3173574.3174149, CHI '18, Montreal QC, Canada. [90] C. Schütze, A. Groß, B. Wrede, B. Richter, Enabling non-technical domain experts to create robot-assisted therapeutic scenarios via visual programming, Companion Publication of the 2022 International Conference on Multimodal Interaction, pp. 166–170, 2022. [91] C. Hilton, N. Plant, C. G. Díaz, P. Perry, R. Gibson, B. Martelli, M. Zbyszynski, R. Fiebrink, M. Gillies, Interactml: Making machine learning accessible for creative practitioners working with movement interaction in immersive media, Proceedings of the 27th ACM Symposium on Virtual Reality Software and Technology, Art. No. 23, n.pag., 2021. doi:10.1145/3489849.3489879, VRST '21, Osaka, Japan.

A. User-based testing details

Table 2: User-based testing details

Paper | Procedure (N. Tasks) | Evaluation Methods | N. Users | Users Age | Users Type | User Proficiency | Users Gender
[54] | Predefined task (1) | Open questions | 20 | n/a | university students | inexpert | n/a
[58] | Predefined task (2) | Likert scale, custom questionnaire, Task completion time | 25 | n/a | university students | inexpert | 16 females, 7 males
[67] | Predefined task (3) | Task completion time, SUS, NASA TLX, open questions | 18 | 19–24 years old | university students | expert and inexpert | 7 females, 11 males
[82] | Predefined task (3) | Open questions | 30 | 10–13 years old | children | inexpert | 10 females, 20 males
[59] | Predefined task (1) | Likert scale, open questions | 10 | 19–25 years old | university students and professionals | expert and inexpert | n/a
[70] | Predefined task (1) | Open questions | 4 | n/a | n/a | n/a | n/a
[79] | Predefined task (2) | Custom questionnaire, Think-aloud protocol | n/a | n/a | university students and professionals | expert and inexpert | n/a
[90] | Predefined task (1) | ATI Scale, SUS, NASA TLX, Think-aloud protocol | 9 | 26–54 years old (mean = 41) | professionals | inexpert | 7 females, 2 males
[78] | Predefined task (1) | Task completion time, open questions | 8 | n/a | professionals | expert | n/a
[83] | Predefined task (2) | Likert scale, open questions, USEQ, Task completion time | 12 | 18–34 years old | university students | inexpert | 6 females, 6 males
[52] | Predefined task (4) | Likert scale, Cognitive Dimensions, Task completion time | 30 | 18–25 years old | university students | expert and inexpert | 8 females, 10 males, 12 not specified
[73] | Predefined task (2) | Likert scale, open questions, custom questionnaire, Task completion time | 22 | 26–56 years old | professionals | expert | n/a