The Impact of different gamification types in the context of data literacy: An online experiment

Nikoletta-Zampeta Legaki¹, Daniel Fernández Galeote¹ and Juho Hamari¹

¹ Gamification Group, Faculty of Information Technology and Communication Sciences, Tampere University, Finland

Abstract

As the pace at which humans create data increases, a new challenge for individuals and society is to turn a world full of data into a data-driven world. However, data and statistical literacy remain difficult topics to engage with, whereas gamification is emerging as a promising technique to improve motivation. Therefore, we developed software composed of interactive charts and tools aiming to teach data literacy in four different versions: (i) challenge- (badges), (ii) immersion- (avatars; story), and (iii) social-based (competition) gamification, along with (iv) a control version (no gamification), to compare the effects of different gamification types on learning outcomes. We conducted four-group, random-assignment, pre- and post-test online experiments with students (N=181) from various courses, schools, and educational levels. The primary results of our experiments show a statistically significant improvement in students' performance of almost 44% from using the software. Gamification types did not result in statistically significant differences in students' learning outcomes, suggesting optimism regarding the contribution of interactive data visualization in improving data literacy.

Keywords

Gamification, data literacy, statistical literacy, education, exploratory data analysis

6th International GamiFIN Conference 2022 (GamiFIN 2022), April 26-29 2022, Finland
EMAIL: zampeta.legaki@tuni.fi (N.-Z. Legaki); daniel.fernandezgaleote@tuni.fi (D. Fernández Galeote); juho.hamari@tuni.fi (J. Hamari)
ORCID: 0000-0002-2707-8364 (N.-Z. Legaki); 0000-0002-5197-146X (D. Fernández Galeote); 0000-0002-6573-588X (J. Hamari)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction

We produce and consume data with great ease and frequency. Even a smartphone can process and visualize data more easily than ever before. However, this data explosion is only as beneficial as the insights that we can get from it, and we still lack a fact-based view of our world [1]. Data literacy skills have become so critical that they have been suggested as a course in secondary education, highlighting the importance of "reading and writing with data" [2]. These skills help not only to understand the data around us, but also support a more rational approach to societal problems, realizing what is happening by using data, and eventually dealing rationally with societal challenges such as climate change or the COVID-19 pandemic. While we must address these challenges, we still lack motivation to gain data insights to transform a world full of data into a data-driven society [3].

Understanding data might lead to data-driven and well-informed decision-making at a personal or societal level [4]. Thus, there are online databases worldwide that provide data sets regarding a plethora of global topics (e.g., economy, the environment, etc.). Despite these initiatives, people are still discouraged from being statistically aware of this data, resulting in a wrong perception of social and economic realities [1]. Even students in statistics courses are reluctant to partake in them because they consider them complicated [5, 6]. Introductory statistics courses are an important part of various disciplines, but few teaching approaches have been documented and the public understanding of these topics has not advanced sufficiently, making research in this direction valuable [3].

These gaps in pedagogical approaches related to engagement and motivation have recently spurred significant interest in employing design principles from games as a pedagogical avenue. Games and gamification have been linked with intrinsic motivation, so they can be beneficial in an educational context [7]. The online and virtual educational environments that the COVID-19 pandemic forced into use make the integration of game-based learning in education even more appealing. Gamification, or enhancing a product or service by providing an experience like those afforded by games, has already been applied in statistics education with mostly positive results [8]. Nevertheless, literature reviews on this topic call for more empirical research and rigorous comparison among different gamified strategies to identify methods that effectively attract users' interest and support education. Empirical research on the effects of individual or simple motivational affordances and their comparison regarding learning outcomes [9, 10] is also crucial for understanding the effects of gamification and its integration in education.

Therefore, in this study, we investigate the effects of gamification on learning outcomes in the context of data literacy. We developed a web-based application that presents statistical concepts of Exploratory Data Analysis (hereafter EDA) using a variety of charts and real data. This application supports three additional versions, one for each of the following gamification types: challenge-, immersion-, social-based [11, 10]. We conducted a series of online experiments supporting random assignment of students (N=181) to one of the treatments (i.e., control, challenge-, immersion- and social-based gamification), using an online pre- and post-test experimental design. Students come from different courses, schools, and educational levels. This study aims to present an interactive data educational application which teaches basic EDA topics, in a data literacy context, using real data sets about societal challenges, and to investigate the impact of different gamification types on learning outcomes.

2. Background

2.1. Data Literacy

Our society is becoming more data-rich every day. The penetration of digital technology in our daily life has accelerated the digitization and datafication of our society. These have shed light on the necessity of critical education about data as a set of strategies to support individuals in being aware of, and understanding, their data [12], because data is only as useful as the insights we can get from it [13]. Data literacy has been suggested as a solution leading to a data-driven society [14]. However, there is not yet a convergence regarding a definitive set of strategies to achieve data-literate adults. Data literacy is described broadly as "a set of abilities around the use of data as part of everyday thinking and reasoning for solving real-world problems" [15] or "the ability to understand and use data effectively (to inform decisions)" [13, 16], and it is a crucial life skill nowadays [15], from finding employment to supporting decision-making [4]. However, data literacy is still in its infancy. Recent studies argue that it should be composed of data understanding and data use, but there is no consensus yet about its components. Other studies tie data literacy with statistical literacy [17] and claim the need for public understanding of statistics as a rising societal need to deal with a series of mis-es (e.g., misinformation, misunderstandings, etc.) [3, 18]. In this regard, statistical literacy, or "the ability to interpret, critically evaluate, and communicate about statistical information" [19], has been suggested as a desired goal for different educational levels and professions (e.g., researchers, various practitioners) too [3, 6]. Acknowledging that there is no universal definition of data literacy, for the needs of this study we use the broad definition referred to above, supported by the concept of statistical literacy, and see it as an evolving concept that helps individuals to use data as part of everyday thinking, get insights from a data-rich society by using common statistical tools, and supports solving real-world problems [14].

We are bombarded with statistical news daily, yet statistics and related fields remain too complicated and difficult to engage with for adult learners [3, 20, 5]. Considering the range of disciplines that provide at least an introduction to statistics course, the importance of data literacy skills, and the need for data-literate students and citizens [21], individuals' reluctance to engage with statistics is an urgent problem. In addition, the content of data or statistical literacy courses needs to be improved by including real data sets [22] and linking statistical concepts with everyday life topics [23, 24]. A few initiatives have been suggested to promote data or statistical literacy in lifelong learning with promising results, e.g., the use of digital technology, workshops with real problems, and data and game-based/gamified activities that harness creativity [15, 25, 26, 2]. However, their limited number, along with a lack of research on data literacy pedagogy [12, 27], set the basis for exploring gamification's potential in this topic.

2.2. Gamification and data literacy

Hence, this study examines gamification's potential given its mainly positive impact in education. Gamification, defined as an "intentional process of transforming any activity, system, service … into one which affords positive experiences, skills, and practices similar to those afforded by games" [28], has been implemented in a variety of fields, and most empirical studies focus on education [10, 29]. Specifically, gamification has been mostly employed in Computer Science, Mathematics and Engineering [30, 31], noting mainly positive results regarding psychological and behavioral outcomes [10]. Gamification has gained acceptance in e-learning educational environments [32] as well, which makes it a useful addendum for virtual learning.

There are also studies in favor of gamification in the data and statistics fields, especially in digital formats [8, 33]. Most of these studies focus on introductory statistics courses, which include topics related to data and statistical literacy skills (e.g., basic EDA topics [40], chart interpretation). These courses are often taught in various disciplines. Thus, the need to motivate students becomes crucial. Other initiatives to increase student motivation in these fields also use interactive software or persuasive data visualization [35, 36], with mostly optimistic results. Although most of the studies list the positive impact of gamification or persuasive data visualization, their effects seem to vary depending on the context and the audience [37].

Gamification is sometimes categorized into three types based on the motivational affordances used, i.e., challenge/achievement (focusing on the feeling of competence and using points, badges, etc.), immersion (putting emphasis on avatars and narrative), and social (concentrating on competition and/or collaboration) [11, 10, 38]. [11] attempted to link these gamification types with the satisfaction of different intrinsic needs, i.e., autonomy, competence, and relatedness needs, finding differences among their effects but concluding that gamification has a positive impact overall. Despite the variety of motivational affordances, most of the studies still focus on the triad of points-leaderboards-badges [10, 29, 37]. On top of that, negative results in education [39, 37] call for more research and cautious design of gamification in the education process. [31, 10, 9] also mention the need for more controlled empirical studies or studies that empirically compare individual or different types of motivational affordances and further examine the conditions in which gamification becomes effective in different contexts. Combining the above-mentioned need to evaluate gamification impacts and the need for increasing and improving teaching methods regarding data literacy, within this study we designed and implemented a system that supports three gamified versions (i.e., challenge-, immersion-, social-based gamification), along with a control version, that aims to teach data literacy concepts through interactive charts and real data. Next, we conducted random assignment experiments using this system to address the above-mentioned gap.

3. Methods and data

3.1. Participants

We conducted online experiments in four different schools. Schools were recruited through the first author's research network, with teachers agreeing on participant incentives with the first author. The total sample is composed of N=181 (58.56% male; 40.89% female; 0.55% non-binary or other) university students. The participants were students in different courses and schools, with different educational levels (undergraduate, postgraduate, and MBA students) as follows:

• 62 students; Forecasting Techniques; School of Electrical & Computer Engineering, National Technical University of Athens, Greece; class 2021; 4th year of undergraduate studies (FT ECE-NTUA).
• 18 students; Experimental Research Methods; Business Administration Department, University of Thessaly, Greece; class 2021; 2nd year of MBA studies (RM MBA-UTH).
• 36 students; Quantitative Methods in Decision Making; Business Administration Department, University of Thessaly, Greece; class 2021; 2nd year of MBA studies (DS MBA-UTH).
• 22 students; Multimedia and Hyper-media Theory; University of Pretoria, South Africa; class 2021; 2nd year of undergraduate studies (MHT UPR).
• 30 students; Forecasting and Data Analytics via Gamification, Summer School; Tampere University, Finland; class 2021; under/postgraduate (FDAG TAU).
• 13 students; Special Issues in the Time Project Management; Business Administration Department, University of Thessaly, Greece; class 2021; 4th year of undergraduate studies (PM UTH).

3.2. Materials

All materials were designed and implemented to meet the goals of this study, i.e., to effectively teach basic EDA concepts and compare the effects of different gamification types on learning outcomes. The materials of our experiments are composed of three main parts: a pre-test online questionnaire, a web-based educational application supporting four different versions (a control, and three versions of gamified learning), and a post-test online questionnaire. Every version of the web-based application corresponds to one of the following: control (i.e., no gamification elements), challenge- (i.e., badges), immersion- (i.e., avatars and a story related to the presented data), and social-based (i.e., illustrated text about the participant's rank among others) gamification. All the materials are in English. A more detailed description of each part follows.

3.2.1. Pre- and post-test

The pre- and post-tests have the same structure, number of questions, and topics. They are composed of 30 multiple choice questions related to EDA topics and misconceptions regarding worldwide data. Even though data literacy is gaining more importance and there are some courses available online, our search for standardized tests of data literacy skills did not yield any result. Thus, a 30-question test was constructed by reviewing related online courses, data literacy and EDA literature [40, 34, 41], using part of Analytics Vidhya's test (https://www.analyticsvidhya.com/; a few questions from a test about fundamental statistics skills), including data about the UN 2030 Sustainable Development Goals (hereafter SDGs; https://sdgs.un.org/goals), and considering our expertise in statistics and forecasting. We arrived at the following structure regarding both EDA and data related topics: central tendency (3 questions), spread (5 questions), growth rate (2 questions), graph interpretation and data knowledge about SDGs (14 questions), re-expression of data and COVID-19 pandemic (2 questions), and regression and correlation (4 questions). Calculating the mean of some numbers or answering about the percentage of countries with laws against sexual harassment at work are some examples. We also included 2 attention questions. The order of the questions and answers and the description of some questions might slightly differ, but the questions related to the SDGs and the calculations needed to answer the pre- and post-tests are the same. The content of the questions is interwoven with the learning objectives and the content of the application. The pre- and post-tests are exactly the same for all the different versions.

3.2.2. Description of the (gamified) educational application

We designed the content and then implemented a publicly available web-based application from scratch, which aims to teach basic EDA [40, 34, 41] and compare different gamification types (i.e., challenge-, immersion-, social-based) regarding the learning outcomes. A brief description of the content and the design of the application follows:

Content design. The main EDA topics were chosen according to our learning objectives, well-known relevant literature [46, 34, 41], our target audience (i.e., adults), time limitations (a full round should last for approximately 1 hour), and the conditions of an online experiment. We opted for an online system to have students from various educational backgrounds and levels because of the COVID-19 pandemic restrictions. Our main goal was to provide data literacy skills to a broad audience, even to those without a strong statistical background. We borrowed some of the thematic axes from online courses, combining them with relevant literature [40, 34, 41] and our expertise. The web-based application is divided into 5 discrete pages/levels, and every page is linked with one topic as follows: page 1: central tendency; page 2: the spread of data; page 3: chart interpretation; page 4: re-expression of data; page 5: regression and correlation. The higher the level, the more complicated the topic is, integrating a scaffolding approach [42]. For each level, an extra motivational affordance appears (see Figure 1). In addition, we included small interactive exercises as practice in four out of the five pages and colorful buttons at the top of every page that explained important topics using language as plain as possible. Figure 1 shows the 3rd page of the application, the colorful buttons with further explanation upon clicking on them, and an interactive chart with an in-application question. Moreover, we selected data related to the SDGs and other societal challenges such as the COVID-19 pandemic for the in-application examples and charts to increase engagement with these issues. Online open data sources were used, including Our World in Data, The World Bank, GAPMINDER's database and report, and others mentioned in the application.

Figure 1: Screenshot from the gamified application, immersion-based version.

Gamification design. One of the main goals of this study is to compare the three gamification types [10, 11] in an online application, along with having no gamification, regarding EDA and data literacy learning outcomes. The distinction between gamification types is based mainly on different player motivational directions and the game elements used, and it has been associated with slight differences in need satisfaction [11]. Following the design guidelines by [43], we conducted a brainstorming session with six gamification experts from the authors' research group interested in gamification design for data literacy. Mainly, we focused on the integration of one motivational affordance per type to provide a game-like experience to participants with different stimuli but keep the same settings and interface whenever possible to have comparable versions. More iterations took place among the first and third authors of this study. We decided that the motivational affordance should be represented at the top and bottom of the page (see gamification placeholder, Figure 1), include verbal feedback, and be updated, by adding one more element, for each page. Regarding challenge-based gamification, we decided to include badges (i.e., bronze, silver, gold, and ruby stars). If the user's answers met the criteria for each page, then a new badge was added in the gamification placeholder. For the immersion-based version, a series of avatars related to the sustainable development agenda pillars (i.e., people, prosperity, planet, peace, and partnerships) were created, along with an illustrated story related to the SDGs. Figure 1 shows the illustrated story for a user who has completed page 3, so three stages of the story have been opened. The story progresses based on the user's correct answers in the application. Social-based gamification was presented via a competition element materialized through illustrated quotes and messages indicating the user's rank as compared to others. So, having completed the tasks for every page, the user discovered or lost an additional badge (challenge version), contributed or not to the sustainability goals based on the story (immersion version), and improved or not their ranking (social version). The spaces dedicated to the pictures and texts were blank for the control version.

A full round in the application. Initially, the user reads navigation instructions according to the version that they have been randomly assigned to. For the challenge-, immersion-, and social-based gamification there are additional descriptions regarding the respective elements. Additionally, for the immersion-based gamification, the user selects an avatar. After saving their choice, the user moves to page 1. Every page is composed of 4 to 10 interactive charts, a question for each, and interactive calculators to help answer the questions. For each page, participants should correctly answer an increasing number of questions to meet the criteria and gain the extra respective badge, or contribute to the story that they participate in, or upgrade their rank. The cumulative number of participants' correct answers defines the competitive messages they receive regarding their rank. A participant needs to answer all the questions for each page to proceed to the next one, and all the completed pages remain available. Feedback regarding correct answers is available only for the completed levels, with the respective motivational affordance per gamification type. A full round is done when the tasks in page 5 are completed. Then, the user reaches the post-test. During the round, users can log out (progress is saved). A user who has completed a full round cannot participate again.

3.2.3. Procedure and experimental design

The experiments were conducted in the context of six courses at different schools. All the participants received the same instructions regarding the application. However, students in FT ECE-NTUA, DS MBA-UTH, RM MBA-UTH, and PM UTH had the instructions in Greek and the incentive for participation was a bonus of 1 out of 10 in the course's final grade instead of an equivalent exercise in the final exam. Students in FDAG TAU and MHT UPR received the instructions in English and their participation was mandatory as part of the course. Participants were instructed to use a computer. They were aware of the possibility to log out and sign in, and the time available for participation varied per school. Students in FT ECE-NTUA had a month available to register and complete a full round, students in MHT UPR, DS MBA-UTH, RM MBA-UTH, and PM UTH had two weeks, and students in FDAG TAU had one week. These differences are due to the different course settings.

All participants had to register and give informed consent to proceed. Upon their registration, they had to complete all the pre-test questions, which were not available afterward. There was no feedback regarding the pre-test questions. Having saved their answers, they were randomly assigned to one of the four conditions, i.e., control, challenge-, immersion-, or social-based gamification, using the sample() function of the R base package. Then, participants had to read the instructions. Additional instructions were provided to participants based on the version that they had been assigned to. For example, participants assigned to the immersion version had to read additional instructions regarding the game elements and select one of the avatars.

Having read the instructions, and independently of the version, participants were directed to page 1, where they needed to answer questions based on the provided charts and/or use calculators. Then, they got feedback about their choices. Participants assigned to one of the gamified conditions had one new gamification element (a badge, a strip of a story, or their illustrated rank) available at the top and bottom of every page, based on their performance. The same process was followed for all the available pages, having different datasets/charts and tasks per page, up to page 5. All the previous levels along with correct answers were available while participants moved to the next pages. Having saved their answers on page 5, participants were directed to the post-test. All participants had to answer the same post-test questions. All the previous pages (apart from the pre-test) were available without the correct answers. Figure 2 illustrates the experimental design and the procedure that was followed.

4. Results

The objective of this study is to present and evaluate an online application which uses real worldwide data to teach basic data literacy topics and investigate the impact of three gamification types regarding the learning outcomes. Hence, we collected students' performance in pre- and post-tests. Student performance for each test was calculated as the sum of the correct answers. Considering all the questions as equivalent, the maximum score per test is equal to the number of questions, i.e., 30. The overall statistical approach to this study's results is divided into two steps. Initially, the collected data from both pre- and post-test questionnaires are examined using descriptive statistics to explore the group means, standard deviations, and numbers. Figure 3 illustrates the performances of students per test and treatment and Table 1 presents the descriptive statistics per group and school, too.

In terms of an overall evaluation of the educational application, the mean value of the students' post-test performance (M=19.03, SD=5.05) is higher than their pre-test performance (M=13.24, SD=4.09), as expected. We conducted a paired t-test at a 95% confidence level. The null hypothesis (H0: the mean difference between pre- and post-test is zero) is rejected (t=20.634, df=180, p<0.001); thus, using the suggested educational application improves mean performance in the context of data literacy by 43.73%, resulting in a large effect size (d=1.26).

Table 1: Students' performance per gamification type (pre- vs. post-test).
| Group | N | Pre-test M | Pre-test SD | Post-test M | Post-test SD | Difference M | Difference SD | Improvement | Wilcoxon signed-rank test | Effect size |
|---|---|---|---|---|---|---|---|---|---|---|
| Control | 48 | 13.29 | 3.57 | 19.15 | 4.78 | 5.85 | 3.86 | 44.04% | Z=19.5, p<0.001 | r=0.839 (large) |
| Challenge-based | 46 | 12.72 | 4.23 | 17.67 | 5.38 | 4.96 | 3.83 | 38.97% | Z=29.5, p<0.001 | r=0.817 (large) |
| Immersion-based | 55 | 13.58 | 4.52 | 19.13 | 5.16 | 5.55 | 3.83 | 40.83% | Z=13.5, p<0.001 | r=0.842 (large) |
| Social-based | 32 | 13.34 | 3.95 | 20.62 | 4.44 | 7.28 | 3.1 | 54.57% | Z=0, p<0.001 | r=0.875 (large) |

Figure 2: Flowchart of the experimental design.

Figure 3: Student performance in pre- and post-tests per gamification type.

Next, the impact of different gamification types regarding learning outcomes in EDA topics is examined. Analysis of covariance (ANCOVA) was chosen to examine the effects of using different gamification types on student performance, controlling for initial differences in the pre-test. However, having different group sizes and not meeting the assumption of a linear relationship between the dependent variable and the covariate, we follow the non-parametric alternative, using the sm and fANCOVA packages in R (v3.5.3) for validation. Four curves have been calculated based on polynomial regression with automatic smoothing parameter selection via AICC for curve fitting. Based on the comparison of the four non-parametric regression curves, the null hypothesis "H0: there is no difference between the 4 curves" cannot be rejected (T=21.08, p=0.741). Acknowledging the limitations of non-parametric analysis, we also conduct a one-way ANOVA on performance change. Our sample meets the ANOVA assumptions (i.e., normal distribution, homogeneity of variance, and independence of observations). No statistically significant differences were detected (F(3,177)=2.563, p=0.056). This is in line with the non-parametric analysis, and it implies that all groups showed a similar learning gain for all the different gamification types, including the control group, regarding student performance change.
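The headline figures in this section can be re-derived from the reported summary statistics alone. The following Python sketch is an illustration, not the authors' analysis code (the study itself used R). It recomputes the overall relative improvement from the two reported means, recovers the reported effect size under the assumption that Cohen's d was standardized by the root-mean-square of the pre- and post-test standard deviations (the paper does not state which variant was used), and checks Table 1 for internal consistency. All function and variable names are hypothetical and chosen for this example.

```python
from math import sqrt

# Overall summary statistics as reported in the text (N = 181).
PRE_MEAN, PRE_SD = 13.24, 4.09
POST_MEAN, POST_SD = 19.03, 5.05

# Table 1 rows: group -> (N, pre-test mean, post-test mean, reported improvement in %).
TABLE_1 = {
    "control": (48, 13.29, 19.15, 44.04),
    "challenge": (46, 12.72, 17.67, 38.97),
    "immersion": (55, 13.58, 19.13, 40.83),
    "social": (32, 13.34, 20.62, 54.57),
}


def relative_improvement(pre_mean, post_mean):
    """Gain expressed as a percentage of the pre-test mean."""
    return (post_mean - pre_mean) / pre_mean * 100


def cohens_d_rms(pre_mean, pre_sd, post_mean, post_sd):
    """Cohen's d standardized by the root-mean-square of the two SDs.

    Assumption: this standardizer is chosen only because it is a common
    convention; the paper does not say how d was computed.
    """
    rms_sd = sqrt((pre_sd ** 2 + post_sd ** 2) / 2)
    return (post_mean - pre_mean) / rms_sd


def weighted_mean(pairs):
    """Mean over (weight, value) pairs, used to pool the group means."""
    total_weight = sum(weight for weight, _ in pairs)
    return sum(weight * value for weight, value in pairs) / total_weight


overall_gain = relative_improvement(PRE_MEAN, POST_MEAN)
overall_d = cohens_d_rms(PRE_MEAN, PRE_SD, POST_MEAN, POST_SD)
total_n = sum(row[0] for row in TABLE_1.values())
pooled_pre = weighted_mean([(n, pre) for n, pre, _, _ in TABLE_1.values()])
pooled_post = weighted_mean([(n, post) for n, _, post, _ in TABLE_1.values()])

if __name__ == "__main__":
    print(f"overall improvement: {overall_gain:.2f}%")
    print(f"Cohen's d (RMS standardizer): {overall_d:.2f}")
    print(f"participants in Table 1: {total_n}")
    print(f"pooled pre/post means: {pooled_pre:.2f} / {pooled_post:.2f}")
    for group, (n, pre, post, reported) in TABLE_1.items():
        recomputed = relative_improvement(pre, post)
        print(f"{group}: recomputed {recomputed:.2f}% vs reported {reported:.2f}%")
```

Under these assumptions the sketch reproduces the reported 43.73% improvement and d=1.26, and the group sizes and group means in Table 1 pool back to N=181 and to the overall means of 13.24 and 19.03. The per-group percentages recomputed from the rounded table means differ from the reported ones by a few hundredths of a percentage point, presumably because the reported values were computed from unrounded means.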
In order to further examine the differences between pre- and post-tests in the different gamification strategies, we opted for non-parametric tests, due to the violation of the normal distribution assumption in the groups. A Wilcoxon signed-rank test for each type indicates that the mean post-test ranks were statistically significantly higher than the mean pre-test ranks for each type of gamification, at a 95% confidence level. Table 1 presents mean values, standard deviations, numbers for each type, differences in pre- and post-test performance, and the improvement, along with the respective Wilcoxon effect sizes.

5. Discussion and conclusions

5.1. Discussion

Overall, the results suggest that the use of the online application improved learning outcomes regarding data literacy, as investigated in this study, using interactive charts and tools, with real data sets related to current societal challenges, i.e., COVID-19 and SDGs. Other studies suggest that learning objectives in the context of data literacy can be achieved by using data visualization techniques and statistics as persuasive technology means, that is, as interactive technology that aims to change a person's attitudes or behavior [44]. They can also promote critical thinking among students [45, 46] and provide opportunities in the teaching-learning process [47, 48]. Although the use of persuasive technology, or captology as it is called in [44], is not yet mature in education, there are some preliminary positive indicators about its potential on some learning variables such as attitude and motivation, but further research is needed [49]. Hence, the noted improvement of all the versions, control included, could be an effect of the interactive charts and tools provided and the chosen thematic areas, which could positively impact motivation and eventually learning. This finding could be in accordance with [35], who examine the effects of data visualization as a persuasive visualization tool that might positively impact people's attitude and memorization or even improve accuracy, as in a Bayesian reasoning problem [36]. However, further empirical research is needed in this area [35].

Another important finding is that the integration of different gamification types did not result in statistically significant differences in students' learning outcomes. Despite the mentioned positive effects of gamification in education [10, 8], there are a few studies commenting on the potential negative effects of gamification [37]. Based on [37], 35% of the reviewed papers mentioned indifference as an effect, where gamification had an impact neither for better nor for worse. Our results are in line with [50, 51], where there was no significant impact of gamification in e-learning interventions on students' performance, even though in these studies the participants' initial motivation was higher and the described interventions lasted longer. In our study, the novelty effect or research fatigue might also contribute to this lack of effect on performance, since most of the students completed the full activity, on average, in two hours, rather than logging out and signing in.

5.2. Limitations

There are some limitations regarding the design of the application. All the versions, control included, contain interactive charts, icons/emojis, and colorful buttons. The control version does not include any gamification, but it might be playfully framed given its interactivity, diverse colors, and user-friendly design. So, even the control version might afford a playful experience. Even though badges, avatars and a story, and competition are representative of the challenge-, immersion-, and social-based gamification [10], our results are limited regarding the implementation, the sample, and the described context.

Another limitation refers to the sample and the procedure of the experiments. The sample sizes, the difference in students' schools, and years of study within the students' distribution into different conditions might affect the homogeneity of slopes between pre- and post-test performances. Since a non-parametric analysis was conducted, the validity of the results is not affected. The difference in incentives needs to be mentioned as well. Students in FT ECE-NTUA, RM MBA-UTH, DS MBA-UTH, and PM UTH received 1 point out of 10 as a bonus in their grade, instead of an equivalent exercise at the end. However, students in FDAG TAU and MHT UPR participated in the application as part of mandatory assignments to successfully pass the course. Finally, despite the difference in the instructions about the available time to complete a full round, all students but one completed a full round in a maximum of three days. This study focuses only on the impact of a web-based application on data literacy and the comparison among different gamification types. However, both pre- and post-test questionnaires comprise more questions than the knowledge questions, which might lead to research fatigue. A larger sample is suggested, and completing a full round during three days, even though the noted improvement shows the potential of this approach.

5.3. Conclusions

In our study, a (gamified) application was designed and implemented to teach data literacy and compare the impact of different gamification types on learning outcomes. Our results indicate an average of 43.73% improvement in learning outcomes and suggest optimism regarding the contribution of interactive data visualization, interactive tools, and a friendly user interface in improving data literacy. However, we should be more skeptical about the integration of gamification when there is already a system with these characteristics as a basis. Employing a larger sample of the general public will strengthen the results and support data literacy teaching. In addition, investigating gameful experience constructs and connecting the used gamification features with the improvement in specific learning outcomes and data literacy topics will provide insightful perspectives regarding the impact of gamification design choices in data literacy.

6. Acknowledgements

This work has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Sklodowska-Curie, grant agreement No 840809, the Academy of Finland Flagship Program (337653 Forest-Human-Machine Interplay (UNITE)) and the Nessling Foundation (project No 202100217).

7. References

[1] H. Rosling, Factfulness, Flammarion, 2019.
[2] C. D'Ignazio, R. Bhargava, Databasic: Design principles, tools and activities for data literacy learners, The Journal of Community Informatics 12 (2016).
[3] F. C. Von Roten, Do we need a public understanding of statistics?, Public Understanding of Science 15 (2006) 243–249.
Journal of Information Management 45 (2019) 191–210.
[11] N. Xi, J. Hamari, Does gamification satisfy needs? a study on the relationship between gamification features and intrinsic need satisfaction, International Journal of Information Management 46 (2019) 210–221.
[12] L. Pangrazio, J. Sefton-Green, The social utility of 'data literacy', Learning, Media and Technology 45 (2020) 208–220.
[13] M. Frank, J. Walker, J. Attard, A. Tygel, Data literacy-what is it and how can we make it happen?, The Journal of Community Informatics 12 (2016).
[14] T. Koltay, Data literacy: in search of a name and identity, Journal of Documentation (2015).
[15] A. Wolff, D. Gooch, J. J. C. Montaner, U. Rashid, G. Kortuem, Creating an understanding of data literacy for a data-driven society, The Journal of Community Informatics 12 (2016).
[16] E. B. Mandinach, E. S. Gummer, A systemic view of implementing data literacy in educator preparation, Educational Researcher 42 (2013) 30–37.
[17] M. Shields, Information literacy, statistical [4] A. Yoon, A. Copeland, P. J. McNally, literacy, data literacy, IASSIST quarterly 28 Empowering communities with data: Role of (2005) 6–6. data intermediaries for communities’ data [18] K. K. Wallman, Enhancing statistical utilization, Proceedings of the Association literacy: Enriching our society, Journal of the for Information Science and Technology 55 American Statistical Association 88 (1993) (2018) 583–592. 1-8. [5] M. D. Albritton, P. R. McMullen, Classroom [19] I. Gal, Adults’ statistical literacy: Meanings, integration of statistics and management com-ponents, responsibilities, International science via forecasting, Decision Sciences statistical review 70 (2002) 1–25. Journal of Innovative Education 4 (2006) [20] J. B. Ramsey, Why do students find statistics 331. so difficult, Proceedings of the 52th Session [6] D. J. Rumsey, Statistical literacy as a goal for of the ISI. Helsinki (1999) 10–18. introductory statistics courses, Journal of [21] F. C. von Roten, Y. de Roten, Statistics in statistics education 10 (2002). science and in society: From a state-of-the- [7] N.-Z. Legaki, K. Karpouzis, V. Assima- art to a new research agenda, Public kopoulos, J. Hamari, Gamification to avoid Understanding of Science22 (2013) 768-784. cognitive biases: An experiment of gamify- [22] R. W. Erwin Jr, Data literacy: Real-world ing a forecasting course, Technological learning through problem-solving with data Forecasting and Social Change 167 (2021) sets, American Secondary Education (2015) 120725. 18-26. [8] Z. Legaki, J. Hamari, Gamification in stati- [23] B. Berikan, S. Özdemir, Investigating stics education: A literature review (2020). “problem-solving with datasets” as an [9] L. E. Nacke, C. S. Deterding, The maturing implementation of computational thinking: of gamification research, Computers in A literature review, Journal of Educational Human Behaviour (2017) 450–454. Computing Research 58 (2020) 502–534. 
[10] J. Koivisto, J. Hamari, The rise of [24] A. Yoon, A. Copeland, Understanding social motivational information systems: A review im-pact of data on local communities, Aslib of gamification research, International Journal of Information Management (2019). 30 [25] S. Werning, Making data playable: A game [38] N.-Z. Legaki, N. Xi, J. Hamari, K. co-creation method to promote creative data Karpouzis, V. Assimakopoulos, The effect of literacy., Journal of Media Literacy challenge-based gamification on learning: Education 12 (2020) 88–101. An experiment in the context of statistics [26] R. Bhargava, C. D’Ignazio, Designing tools education, International journal of human- and activities for data literacy learners, in: computer studies 144 (2020) 102496. Workshop on Data Literacy, Webscience, [39] M. D. Hanus, J. Fox, Assessing the effects of 2015. gamification in the classroom: A longitudinal [27] A. Wolff, M. Wermelinger, M. Petre, study on intrinsic motivation, social Exploring design principles for data literacy comparison, satisfaction, effort, and activities to sup-port children’s inquiries academic performance, Computers& from complex data, Inter-national Journal of education 80 (2015) 152–161. Human-Computer Studies 129(2019) 41–54. [40] M. Komorowski, D. C. Marshall, J. D. [28] J. G. Hamari, G. Ritzer, C. Rojek, The Salciccioli, Y. Crutain, Exploratory data Blackwell encyclopedia of sociology, 2019. analysis, Secondary analysis of electronic [29] M. Trinidad, M. Ruiz, A. Calderón, A health records (2016) 185–203. bibliometric analysis of gamification [41] R. K. Pearson, Exploratory data analysis research, IEEE Access 9(2021) 46505- using R, CRC Press, 2018. 46544. [42] M. Stewart, Understanding learning: theories [30] J. Swacha, State of research on gamification and critique, in: University teaching in focus, in education: A bibliometric survey, Routledge, 2012, pp. 3–20. Education Sciences11 (2021) 69. [43] B. Morschheuser, L. Hassan, K. Werder, J. [31] C. 
Dichev, D. Dicheva, Gamifying Hamari, How to design gamification? a education: what is known, what is believed method for engineering gamified software, and what remains uncertain: a critical review, Information and Software Technology 95 International journal of educational (2018) 219–237. technology in higher education 14 (2017)1- [44] B. J. Fogg, Persuasive computers: 36. perspectives and research directions, in: [32] A. N. Saleem, N. M. Noori, F. Ozdamli, Proceedings of the SIGCHI conference on Gamification applications in e-learning: A Human factors in computing systems, 1998, literature review, Technology, Knowledge pp. 225–232. and Learning (2021) 1–21. [45] S. Forbes, J. Chapman, J. Harraway, D. [33] A. Lekka, E. Toki, C. Tsolakidis, J. Pange, Stirling, C. Wild, Use of data visualisation in Literature review on educational games for the teaching of statistics: A new zealand learning statistics, in: 2017 IEEE Global perspective., Statistics Education Research Engineering Education Conference Journal 13 (2014). (EDUCON), IEEE, 2017, pp. 844–847. [46] L. Ryan, Visualization techniques to [34] J. W. Tukey, et al., Exploratory data analysis, cultivate data literacy, in: Advances in volume 2, Reading, Mass., 1977. exemplary instruction, CreateSpace, 2015. [35] A. V. Pandey, A. Manivannan, O. Nov, M. [47] S. Devincenzi, V. Kwecko, F. P. de Toledo, Satterthwaite, E. Bertini, The persuasive F. P. Mota, J. Casarin, S. S. da Costa Botelho, power of data visualization, IEEE Persuasive technology: Applications in transactions on visualization and computer education, in: 2017 IEEE Frontiers in Edu- graphics 20 (2014) 2211–2220. cation Conference (FIE), IEEE, 2017, pp. 1- [36] L. Micallef, P. Dragicevic, J.-D. Fekete, 7. Assessing the effect of visualizations on [48] B. J. Fogg, Persuasive technology: using bayesian reasoning through crowdsourcing, computers to change what we think and do, IEEE transactions on visualization and Ubiquity 2002(2002) 2. 
computer graphics 18 (2012)2536–2545. [49] S. Agnisarman, K. C. Madathil, L. Stanley, [37] A. M. Toda, P. H. Valle, S. Isotani, The dark A survey of empirical studies on persuasive side of gamification: An overview of technologies to promote sustainable living, negative effects of gamification in education, Sustainable Computing: Informatics and in: Researcher links workshop: higher Systems 19 (2018) 112–122. education for all, Springer, 2017, pp. 143– [50] P. M. Papadopoulos, T. Lagkas, S. N. 156. Demetriadis, How revealing rankings affects 31 student attitude and performance in a peer review learning environment, in: Interna- tional Conference on Computer Supported Education, Springer, 2015, pp. 225-240. [51] A. Domínguez, J. Saenz-de Navarrete, L. De- Marcos,L. Fernández-Sanz, C. Pagés, J.-J. Martínez-Herráiz, Gamifying learning experiences: Practical implications and outcomes, Computers & education 63(2013) 380–392. 32
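
As a supplementary illustration of the non-parametric comparison reported in the results (a Wilcoxon signed-rank test on paired pre- and post-test scores, with an effect size per gamification type), the computation can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' analysis script: the function name `wilcoxon_signed_rank` is hypothetical, the scores are made-up placeholder values, the p-value uses the normal approximation without continuity correction, and the effect size r = |Z| / sqrt(n), with n the number of non-zero pairs, is an assumed convention, since the paper does not state which formula was used.

```python
import math


def wilcoxon_signed_rank(pre, post):
    """Paired Wilcoxon signed-rank test (normal approximation, two-sided).

    Zero differences are dropped and tied absolute differences receive
    average ranks. Returns a dict with the rank sums, Z, p-value and the
    effect size r = |Z| / sqrt(n) (assumed convention, see lead-in).
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")

    # Paired differences, discarding pairs with no change.
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within tie groups.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation with tie-corrected variance.
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean_w) / math.sqrt(var_w)
    p_value = math.erfc(abs(z) / math.sqrt(2))

    return {
        "n": n,
        "w_plus": w_plus,
        "w_minus": w_minus,
        "z": z,
        "p_value": p_value,
        "effect_size_r": abs(z) / math.sqrt(n),
    }


# Placeholder scores for one hypothetical group (not data from the study).
pre_scores = [4, 5, 3, 6, 2, 5, 4, 3]
post_scores = [6, 8, 7, 5, 7, 11, 4, 10]
print(wilcoxon_signed_rank(pre_scores, post_scores))
```

In practice the same test would be run once per condition (challenge-, immersion-, social-based, and control), and a statistics library would normally be preferred over a hand-written routine, especially for small groups where an exact p-value is more appropriate than the normal approximation.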