Developing and Validating a Multidimensional AI Literacy Questionnaire: Operationalizing AI Literacy for Higher Education

Gabriele Biagini, Stefano Cuomo and Maria Ranieri
University of Florence, Florence, Italy
gabriele.biagini@unifi.it (G. Biagini); stefano.cuomo@unifi.it (S. Cuomo); maria.ranieri@unifi.it (M. Ranieri)
ORCID: 0000-0002-6203-122X (G. Biagini); 0000-0003-3174-7337 (S. Cuomo); 0000-0002-8080-5436 (M. Ranieri)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract
As Artificial Intelligence (AI) permeates numerous aspects of daily life, fostering AI literacy in higher education becomes vital. This study presents the development and validation of an AI Literacy Questionnaire designed to assess AI literacy across four dimensions: knowledge-related, operational, critical, and ethical. The questionnaire builds upon the framework proposed by Cuomo et al. (2022) and covers a broad spectrum of skills and knowledge, offering a comprehensive and versatile tool for measuring AI literacy. The instrument's reliability and construct validity have been confirmed through rigorous statistical analyses of data collected from a sample of university students. The study acknowledges the challenges posed by the lack of a universally accepted definition of AI literacy and proposes this questionnaire as a robust starting point for further research and development. The AI Literacy Questionnaire provides a crucial resource for educators, policymakers, and researchers as they navigate the complexities of AI literacy in an increasingly AI-infused world.

Keywords
Artificial Intelligence, AI literacy, Scale development, Questionnaire

1. Introduction

1.1. The Pertinence of AI Literacy
With its rapid advancement, Artificial Intelligence (AI) is increasingly permeating areas of daily life and is used in contexts ranging from medicine to literature [1,2]. In this dynamic landscape, higher education institutions have a unique opportunity to enhance students' critical skills and knowledge in AI. To remain relevant, higher education must confront the demands of this rapidly evolving world, and one crucial aspect is fostering AI literacy among students as a critical academic skill [3,4,5]. Traditionally, AI concepts have been taught primarily in universities, with a focus on computer science and engineering principles [3,6,7,8]. This approach has generated obstacles and barriers to the development of AI literacy among the public [9]. Furthermore, while the importance of AI literacy research has grown in recent years, there is still no widely accepted definition of AI literacy [1,10]: being "AI literate" commonly refers to the capacity to comprehend, use, monitor, and critically reflect on AI applications, without necessarily possessing the ability to develop AI models or applications oneself [9,10].

1.2. Assessing AI Literacy
Even though there is no consensus on what AI literacy is, several efforts have been made to develop measurement tools that capture the multidimensionality of AI literacy.
However, while some of them were developed specifically for evaluating AI literacy after a course [11,12], other questionnaires focus on only a few dimensions of AI, such as emotive or collaborative aspects, and sidestep the very construct of AI literacy because of its intrinsic complexity [1]. The "Attitudes Towards Artificial Intelligence Scale" [13], the "General Attitudes Towards Artificial Intelligence Scale" [14], and the "Artificial Intelligence Anxiety Scale" [15] are three examples of this phenomenon. To address this limitation, we first constructed a multidimensional framework for AI literacy rooted in the Calvani et al. (2008) concept of digital literacy, which provided the ground for the Cuomo et al. [10] AI literacy framework. We then developed an AI Literacy Questionnaire that incorporated items from existing assessment tools, as well as new or adapted items, all aligned with the original AI literacy framework. In this paper, we aim to present the assessment tool we have developed, focusing on the validation procedure carried out to ensure its reliability. Before presenting the evaluation tool and the validation process, we introduce the background of the study, that is, the above-mentioned AI literacy framework.

1.3. The AI literacy framework: A multidimensional approach
The complexity and multifaceted nature of AI literacy call for a comprehensive framework that addresses the different aspects at the core of AI understanding. Our previous research proposed a novel approach consisting of four key dimensions that collectively encompass the full spectrum of AI literacy [10]. Together, these dimensions provide a multifaceted lens through which AI literacy can be explored, assessed, and cultivated. They emphasize the necessity of moving beyond mere passive consumption of AI towards a more critical and responsible understanding, thereby offering a holistic, integrative pathway for approaching AI literacy. In detail, the framework comprises the following dimensions:
- Knowledge-related dimension: it encompasses the understanding of fundamental AI concepts, focusing on basic skills and attitudes that do not require preliminary technological knowledge [5]. It includes understanding AI types, machine learning principles, and various AI applications such as artificial vision and voice recognition.
- Operational dimension: focused on applying AI concepts in various contexts [16], it emphasizes the ability to design and implement algorithms, solve problems using AI tools, and develop simple AI applications to enhance analytical and critical thinking [17].
- Critical dimension: highlighting AI's potential to engage students in cognitive, creative, and critical discernment activities [18], it underscores the importance of effective communication and collaboration with AI technologies and of critically evaluating their impact on society.
- Ethical dimension: concerning the responsible and conscious use of AI technologies, this dimension stresses a balanced view of the delicate ethical issues raised by AI, such as the delegation of personal decisions to a machine (e.g., job placement or therapeutic pathways), and emphasizes the growing attention towards "AI ethics", encompassing transparency, fairness, responsibility, privacy, and security.

Building upon this multidimensional framework, our research takes a pioneering step towards an empirical understanding of AI literacy.
The existing literature, as previously mentioned, tends to focus on singular aspects of AI or addresses AI literacy in a more compartmentalized manner. In contrast, our framework serves as a robust foundation for a newly developed questionnaire designed to probe the intricate layers of the knowledge-related, operational, critical, and ethical dimensions of AI. This alignment between theoretical structure and practical assessment tool marks a significant innovation in the field. By weaving these dimensions into a cohesive instrument, the questionnaire promises not only to assess AI literacy in a more comprehensive manner but also to ignite further research and applications that recognize the richness and complexity of engaging with AI. In the following section, we delve into the specific design and methodology of the questionnaire, elucidating how it encapsulates the full breadth of the AI literacy landscape.

2. Methodology
Questionnaire-based survey methods are extensively employed in social science, business management, and clinical research to gather quantitative data from consumers, customers, and patients [19]. When creating a new questionnaire, researchers may consult existing questionnaires with standard formats found in the literature. This article outlines the process of designing and developing an empirical questionnaire, as well as validating its reliability and consistency using various statistical methods. The empirical research method employs a survey-based approach that involves several key steps. The questionnaire was developed following the recommendations of DeVellis [20], and its development included the following steps: clearly determine the construct to measure, generate the item pool, determine the format for measurement, have the initial item pool reviewed by experts, administer the items to a development sample, and finally evaluate the items.

2.1. Identifying the constructs related to the topic
A thorough review of the literature was conducted to determine the meaningful dimensions that conceptually represent the idea of AI literacy. This review included insights from seminal works, including those by Floridi [21,22], Ng [5], and Selwyn [23], among others, and from authoritative sources such as the European Commission [24,25,26,27,28], the Joint Research Centre [29], and the Organisation for Economic Co-operation and Development [30,31,32]. It led to the development of the already presented AI literacy framework [10], comprising the knowledge-related, operational, critical, and ethical dimensions. These dimensions and their definitions (see Section 1.3 above) provided the ground for conceptually mapping the existing tools measuring AI literacy or some of its aspects. The results of this analysis are illustrated in the next section.

2.2. Item generation
As a first step of the item generation process, we further developed our framework by identifying more analytical descriptors for the four main dimensions that the questionnaire aimed to investigate, that is, knowledge-related, operational, critical, and ethical. To this purpose, we examined the relevant literature as well as seminal institutional documents in the field (from the European Commission, JRC, OECD, UNESCO, UNICEF, etc.). As a result, we operationalized our framework by mapping the emerging conceptual elements onto it, thus identifying relevant sub-dimensions. Those conceptual elements provided the ground for the generation of the items.
The graph below (Figure 1) summarizes the item generation process, from the examination of the literature to the identification of appropriate descriptors and the creation of the items.

Figure 1: Graphical process for item generation. The figure summarizes the number of items derived from each source:
- Literature review on existing frameworks (competencies, concepts, dimensions): 38 AI literacy items (10 for the AI knowledge-related dimension, 14 for the AI operational dimension, 8 for the AI critical dimension, 6 for the AI ethics dimension).
- Literature review on institutional sources (European Commission, High-Level Expert Group, JRC, OECD, UNESCO, UNICEF): 38 AI literacy items (4 knowledge-related, 8 operational, 10 critical, 16 ethics).
- Literature review on seminal books/papers (Floridi, Selwyn, LeCun, Russell, Bengio): 42 AI literacy items (8 knowledge-related, 10 operational, 12 critical, 12 ethics).
- Final survey draft: 118 AI literacy items (22 knowledge-related, 32 operational, 30 critical, 34 ethics).

Before proceeding with the development of a preliminary draft of the questionnaire, in addition to the analysis of the conceptual elements of AI literacy, a review of already validated questionnaires on related topics, such as technology competence or digital literacy, was conducted in order to select items that could be adapted for measuring AI literacy. Table 1 summarizes the results of the tools' examination. Only then were we able to arrive at a final survey draft capable of covering a range of AI-related knowledge, skills, attitudes, and behaviors that are relevant in today's rapidly evolving technological landscape.

Table 1
Existing surveys reviewed.
| Questionnaire name | Author | Questionnaire purpose | Target | Validation process | N. of items |
| Assessment of non-experts' AI literacy | Laupichler et al., 2023 | Support the development of a scale for the assessment of AI literacy | Non-experts | Content validation but no factor loadings | 38 items |
| Artificial intelligence literacy scale (AILS) | B. Wang et al., 2022 | Assess the self-reported competence of users in using AI | AI users (expert and non-expert) | Complete validation (EFA, CFA, reliability) | 12 items |
| AI anxiety scale (AIAS) | Y.-Y. Wang & Wang, 2022 | Develop a standardized tool to measure AI anxiety | Citizens (expert and non-expert) | Complete validation (EFA, CFA, reliability) | 21 items |
| Attitude Towards Artificial Intelligence (ATAI) | Sindermann et al., 2021 | Measure trust in and usage of several specific AI products | Citizens (expert and non-expert) | Complete validation (EFA, CFA, reliability) | 5 items |
| General Attitudes towards Artificial Intelligence Scale (GAAIS) | Schepman & Rodway, 2020 | Inform legislators and organisations developing AI about their acceptance by end users | Citizens (expert and non-expert) | Complete validation (EFA, CFA, reliability) | 20 items |

Descriptors or conceptual elements that recurred in at least two independent sources were transformed into items. We paid close attention to ensuring that the questionnaire covered a comprehensive range of AI literacy dimensions, while maintaining clarity and relevance. By following this process, the initial scale was developed, with 22 items focused on the AI knowledge-related dimension, 32 on the AI operational dimension, 30 on the AI critical dimension, and 34 on the AI ethical dimension.
The following table (Table 2) contains some sample items to clarify the final output of the item generation phase.

Table 2
Initial item generation results.
- Knowledge-related dimension (22 items; references [5, 10, 49, 50, 51]). Description: knowing and understanding AI definitions, applications, and their fundamental workings. Sample question: "When it comes to AI, I feel my knowledge on the subject would be:". Matrix options: "Know and understand AI theoretical foundations"; "Know and understand the AI basic mathematical functions behind the algorithm".
- Operational dimension (32 items; references [5, 10, 15, 29, 30, 31, 32, 49]). Description: using AI concepts, expertise, and applications in various contexts. Sample question: "In your opinion, the following tasks could be supported by AI?". Matrix options: "Supporting emergency services"; "News reporting"; "Emotional support".
- Critical dimension (30 items; references [14, 15, 23, 32, 46]). Description: AI applications for critical thinking abilities (such as evaluating, appraising, predicting, and designing). Sample question: "How much do you agree with the following statements?". Matrix options: "Artificially intelligent systems make many errors"; "An artificially intelligent agent would be better than an employee in many routine jobs".
- Ethical dimension (34 items; references [21, 22, 24, 25, 26, 27, 28, 29, 46, 47, 48, 49, 50, 51]). Description: human-centered factors (such as justice, responsibility, openness, ethics, and safety). Sample question: "How much do you believe the following considerations affect the trustworthiness of AI?". Matrix options: "Social impact: the risk that AI will further concentrate power and wealth in the hands of the few"; "Democratic impact: the impact of AI technologies on democracies"; "Work impact: the impact of AI on the labour market and how different demographic groups might be affected".

2.3. Expert reviews and face validity
Face validity is crucial because it assesses whether the questionnaire measures what it intends to measure. It involves reviewing the questionnaire and determining whether the items and their wording seem relevant and appropriate for measuring the construct of interest, that is, AI literacy. To ensure the face validity of the questionnaire, we enlisted the help of a panel of experts (N = 5) in the fields of AI and educational assessment. It is worth noting that the use of a small group of experts for assessing content validity was considered appropriate in this study, as it focused on a cognitive task that did not require an in-depth understanding of the phenomenon being examined [33,34,35]. These experts were well-versed in AI literacy and possessed a deep understanding of the questionnaire's intended constructs. A draft questionnaire was provided to them, and their feedback on the clarity, relevance, and appropriateness of each item was requested. To ensure a shared understanding of the four AI literacy constructs, the definitions were shared with each expert. The content validation process consisted of the following steps. The expert panel carefully reviewed each item and provided valuable insights and suggestions for improvement. They pointed out any items that seemed unclear, redundant, or irrelevant to the construct being measured. Their feedback was essential in refining the questionnaire and ensuring that it truly captured the essence of AI literacy. The experts were initially asked to categorize each item into one of the four dimensions of our AI literacy framework (i.e., knowledge-related, operational, critical, ethical), following the methodology advocated by Schriesheim and colleagues [35].
If at least four out of the five experts assigned the same classification to an item, it was considered as clearly addressing a concept. Of the 118 items, 15 were unclassified or erroneously categorized by two experts, while another 23 were misclassified or unclassified by multiple experts. These items were therefore excluded from the study. The remaining items were then refined in phrasing and format according to the experts' suggestions: 14 items were rephrased, and 20 items related to the impact of AI in education were moved outside the main corpus of the questionnaire into an appendix that can be used in educational contexts as a wider information section.

2.4. The sample and procedures
The next step in validating a questionnaire is the administration of the survey. This step involves collecting data from a sample of participants who complete the questionnaire. The purpose of questionnaire administration is to gather responses that will be used to evaluate the reliability, validity, and overall methodological robustness of the questionnaire. Following the advice of Likert and Hinkin, the survey uses a 5-point Likert scale, which was deemed more suitable given that the questionnaire would be administered online. The questionnaire was designed to be presented electronically on computers or cellphones, allowing for easy transmission and distribution via the Internet. The study was conducted online in May 2023 via the survey tool Qualtrics, while all analyses were implemented using the statistical software R and jamovi [36,37]. The questionnaire was administered to a convenience sample consisting of first-year (2023) student teachers in Primary Education at the University of Florence. After removing missing data, the sample comprised 191 student teachers of Primary Education, including 178 females (93.19%) and 11 males (5.76%). Ages ranged from 18-24 (60.21%) to 55-64 (0.52%), while the highest degree of education completed was high school graduation for 128 respondents (67.55%) and a 3-year university degree for 37 respondents (19.15%). Table 3 summarizes the sample characteristics.

Table 3
Sample characteristics.
| Characteristic | Items | % | Frequency |
| Gender | Male | 5.76% | 11 |
|  | Female | 93.19% | 178 |
|  | Prefer not to say | 1.05% | 2 |
| Age | 18-24 | 60.21% | 115 |
|  | 25-34 | 26.18% | 50 |
|  | 35-44 | 8.90% | 17 |
|  | 45-54 | 4.19% | 8 |
|  | 55-64 | 0.52% | 1 |
| School level of employment (if already working) | Early Childhood | 22.68% | 22 |
|  | Elementary | 69.07% | 67 |
|  | High School | 3.09% | 3 |
|  | Special classes (Support) | 5.15% | 5 |
| Highest degree or level of education completed | High school | 67.55% | 128 |
|  | 3-year University | 19.15% | 37 |
|  | 5-year University | 8.51% | 17 |
|  | 1st-level Master | 3.19% | 6 |
|  | Doctorate | 0.52% | 1 |
|  | 2nd-level Master | 1.06% | 2 |
| Professional experience (in years) | < 5 years | 77.32% | 75 |
|  | 5-10 years | 17.53% | 17 |
|  | 10-20 years | 3.09% | 3 |
|  | > 20 years | 2.06% | 2 |

3. Results

3.1. Reliability and validity
The reliability of a questionnaire can be considered as the consistency of the survey results. As measurement error is present in content sampling, changes in respondents, and differences across raters, the consistency of a questionnaire can be evaluated through its internal consistency. Internal consistency is a measure of the inter-correlation of the items of the questionnaire and hence of the consistency with which the intended construct is measured. Internal consistency is commonly estimated using the coefficient alpha [38], also known as Cronbach's alpha.
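For reference, the indices used in this section can be written in their standard textbook formulations [38,41]; these formulas are background material and are not reproduced from the original instrument documentation. For a scale of k items, coefficient alpha, composite reliability (CR), and average variance extracted (AVE) are:

\[
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^{2}_{Y_i}}{\sigma^{2}_{X}}\right),
\qquad
\mathrm{CR} = \frac{\left(\sum_{i}\lambda_i\right)^{2}}{\left(\sum_{i}\lambda_i\right)^{2} + \sum_{i}\operatorname{Var}(\varepsilon_i)},
\qquad
\mathrm{AVE} = \frac{\sum_{i}\lambda_i^{2}}{\sum_{i}\lambda_i^{2} + \sum_{i}\operatorname{Var}(\varepsilon_i)},
\]

where \(\sigma^{2}_{Y_i}\) is the variance of item i, \(\sigma^{2}_{X}\) the variance of the total score, \(\lambda_i\) the standardized factor loadings of the construct's items, and \(\varepsilon_i\) the corresponding error terms.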
According to expert suggestions, a Cronbach's alpha of at least 0.70 indicates adequate internal consistency of a questionnaire [20,39]. A low Cronbach's alpha value (below 0.70) represents poor internal consistency and, hence, poor inter-relatedness between items. In our survey, Cronbach's alpha, McDonald's omega [40], the composite reliability (CR), and the average variance extracted (AVE) were used to assess the survey's reliability and validity. The findings are shown in Table 4. The survey's overall Cronbach's alpha was 0.953, while the values for the four constructs were, respectively, 0.880, 0.908, 0.941, and 0.914. Although each individual construct showed a reliability greater than 0.70, the instrument as a whole scored 0.953, indicating that it is more reliable than the individual constructs. The scale's convergent validity was evaluated using the CR and AVE criteria set out by Fornell and Larcker [41]. Cronbach's alpha is a more subjective measure of reliability than CR, and CR values of 0.70 and higher are regarded as satisfactory [42]. The AVE compares the variance captured by a construct to the variance due to measurement error; according to Hair et al. [42], values greater than 0.5 indicate satisfactory convergence. In our scale, CR values were higher than 0.7 and AVE values exceeded 0.5, indicating acceptable convergence.

Table 4
Results of Cronbach's alpha, McDonald's omega, AVE and CR.
| Framework dimension | Cronbach's α | McDonald's ω | Average variance extracted | Composite reliability | N. of elements |
| Knowledge-related dimension | 0.880 | 0.888 | 0.521 | 0.916 | 8 |
| Operational dimension | 0.908 | 0.910 | 0.513 | 0.926 | 12 |
| Critical dimension | 0.941 | 0.950 | 0.522 | 0.915 | 10 |
| Ethical dimension | 0.914 | 0.924 | 0.520 | 0.914 | 10 |
| Total | 0.953 | 0.956 | 0.531 | 0.940 | 40 |

3.2. Identifying the underlying components
The underlying structure of the 60-item measure was further examined through exploratory factor analysis (EFA). Component or factor loadings indicate which factors are measured by which questions: questions that measure the same construct should load onto the same factor, and factor loadings range from -1.0 to 1.0. The factorial structure of the survey scale was investigated by means of principal component analysis (PCA), which indicated a four-component structure, as hypothesized by the framework. The four components were rotated using an orthogonal rotation technique (varimax) to simplify the loading pattern. According to the PCA results, the four components with eigenvalues larger than 1.00 accounted for 69.68% of the total extracted variance. This study followed the five rules frequently used as criteria for deciding whether to retain or eliminate items: (1) retain components exceeding the basic root criterion (eigenvalue > 1.00); (2) eliminate items with factor loadings below 0.50; (3) eliminate items with significant loadings on multiple factors; (4) require at least three indicators or items per factor; and (5) eliminate single-item factors [33, 42, 43, 44]. Eventually, 40 items were retained from the 60, with 8 items focused on the AI knowledge-related dimension, 12 on the AI operational dimension, 10 on the AI critical dimension, and 10 on the AI ethical dimension. The results of the EFA are shown in Table 5. Assumption checks for the final four-factor model resulted in a significant Bartlett's test of sphericity, χ² = 2375, df = 528, p < .001, showing a viable correlation matrix that deviated significantly from an identity matrix, while the overall Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO MSA) was 0.835, indicating amply sufficient sampling.
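The paper reports that the analyses were run in R and jamovi [36,37] but does not list the specific packages or code. The sketch below is therefore only an illustrative reconstruction, assuming the commonly used psych and lavaan packages and a hypothetical data frame `ailq` holding the Likert-scale responses, with item columns named as in Table 5 (KW1-KW8, OP1-OP12, CR1-CR10, ET1-ET10).

```r
# Illustrative reconstruction (not the authors' code): assumption checks,
# PCA with varimax rotation, reliability indices, and the four-factor CFA.
library(psych)    # KMO, Bartlett's test, PCA, alpha, omega
library(lavaan)   # CFA and fit indices

# ailq: data frame with the item responses (hypothetical name)
R <- cor(ailq, use = "pairwise.complete.obs")

# Sampling adequacy and sphericity checks reported in Section 3.2
KMO(R)                                   # overall MSA (reported as 0.835)
cortest.bartlett(R, n = nrow(ailq))      # Bartlett's test of sphericity

# Principal component analysis with varimax rotation, four components
pca_fit <- principal(ailq, nfactors = 4, rotate = "varimax")
print(pca_fit$loadings, cutoff = 0.5)    # suppress loadings below .50, as in Table 5

# Reliability per dimension (example: knowledge-related items)
alpha(ailq[, paste0("KW", 1:8)])         # Cronbach's alpha
omega(ailq[, paste0("KW", 1:8)])         # McDonald's omega (total)

# Confirmatory factor analysis of the four-factor model
model <- '
  KW  =~ KW1 + KW2 + KW3 + KW4 + KW5 + KW6 + KW7 + KW8
  OP  =~ OP1 + OP2 + OP3 + OP4 + OP5 + OP6 + OP7 + OP8 + OP9 + OP10 + OP11 + OP12
  CRI =~ CR1 + CR2 + CR3 + CR4 + CR5 + CR6 + CR7 + CR8 + CR9 + CR10
  ET  =~ ET1 + ET2 + ET3 + ET4 + ET5 + ET6 + ET7 + ET8 + ET9 + ET10
'
cfa_fit <- cfa(model, data = ailq, estimator = "ML")
fitMeasures(cfa_fit, c("cfi", "tli", "rmsea", "srmr"))  # compare with Table 6
```

The estimator and rotation options actually used by the authors are not specified beyond what is described above; for 5-point Likert items, ordinal estimators (e.g., WLSMV) are often preferred. The sketch mirrors the analysis steps, not the original code.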
In the confirmatory factor analysis (CFA), the model with the 40 items loading on the four factors described above emerged as acceptable, with CFI = 0.959, TLI = 0.950, RMSEA = 0.041, and SRMR = 0.054 (Table 6).

Table 5
Results of the exploratory factor analysis (loadings of each item on its factor; absolute values less than 0.5 were suppressed).
- Knowledge-related dimension: KW1 = 0.782, KW2 = 0.680, KW3 = 0.809, KW4 = 0.876, KW5 = 0.571, KW6 = 0.857, KW7 = 0.738, KW8 = 0.740.
- Operational dimension: OP1 = 0.665, OP2 = 0.713, OP3 = 0.608, OP4 = 0.759, OP5 = 0.824, OP6 = 0.663, OP7 = 0.668, OP8 = 0.679, OP9 = 0.804, OP10 = 0.671, OP11 = 0.752, OP12 = 0.759.
- Critical dimension: CR1 = 0.748, CR2 = 0.824, CR3 = 0.713, CR4 = 0.617, CR5 = 0.680, CR6 = 0.729, CR7 = 0.607, CR8 = 0.678, CR9 = 0.857, CR10 = 0.729.
- Ethical dimension: ET1 = 0.566, ET2 = 0.547, ET3 = 0.720, ET4 = 0.681, ET5 = 0.621, ET6 = 0.790, ET7 = 0.763, ET8 = 0.836, ET9 = 0.824, ET10 = 0.789.

Table 6
Model fit statistics.
| CFI | TLI | SRMR | RMSEA | RMSEA 90% CI (lower) | RMSEA 90% CI (upper) |
| 0.959 | 0.950 | 0.0538 | 0.0411 | 0.0211 | 0.0573 |
Note. The four-factor model is the theoretical model. CFI = comparative fit index; TLI = Tucker-Lewis index; RMSEA = root mean square error of approximation; SRMR = standardised root mean square residual. TLI and CFI values greater than .900 and RMSEA values less than .050 suggest adequate model fit.

4. Discussion
This study presents the development and validation of a 40-item assessment scale that provides academics with an instrument for evaluating users' critical skills in using AI along its fundamental constructs (i.e., knowledge-related, operational, critical, and ethical). Through the creation and validation of a new AI literacy scale, it sought to advance our understanding of AI literacy. The proposed approach is rooted in the Calvani et al. [45] notion of digital literacy, which provided the conceptual ground for the Cuomo et al. [10] AI literacy framework. Following DeVellis's recommendations [20], we carried out a scoping review to generate suitable items (n = 118) related to AI literacy, had the item pool refined by the experts (n = 60), and then used EFA and CFA to demonstrate the questionnaire's reliability (α = 0.95, AVE = 0.53). According to the findings of the factor analysis, the theoretical model based on four separate constructs, as suggested by the adopted framework [10], emerged as the most suitable conceptualization of AI literacy. The other analyses, such as CR (0.94), also suggested good construct validity. When putting the questionnaire to use in practice, a few points are noteworthy. The first is that the instrument as a whole is more trustworthy than the individual constructs: its overall score was above 0.95, whereas the four constructs showed reliability coefficients greater than 0.70. Therefore, rather than using the separate constructs, it is advisable to use the instrument as a whole, in line with the multidimensionality of AI literacy. Furthermore, we intend to advance and promote future research in this field by defining the AI literacy domain and offering useful measurement tools, that is, by conceptualizing AI literacy and creating appropriate methods for evaluating it. In this way, designers will be better able to portray realistic user models and, subsequently, constructs able to explain AI systems based on these models. In the landscape of questionnaires aimed at evaluating AI literacy, the novelty and strength of our questionnaire lie in its comprehensive approach to the multidimensional nature of AI literacy.
While existing scales [13,14,15] primarily target specific or isolated aspects of AI, such as emotive or collaborative dimensions, or were developed for evaluating AI literacy after a course [7,12], our questionnaire rigorously acknowledges and assesses the intrinsic complexity of AI literacy by embracing a multifaceted perspective and providing a more nuanced, holistic understanding of individuals' comprehension of, attitudes towards, and engagement with AI. This focus not only fills a critical gap in the existing literature but also offers new pathways for educators, policymakers, and researchers to cultivate a more profound and integrative AI literacy across various sectors and populations.

5. Limitations
It is important to emphasize that these conclusions cannot be generalized, given the characteristics of the sample (i.e., a convenience sample, therefore neither probabilistic nor representative of the reference population). Furthermore, the sample was primarily drawn from higher education; representatives of other subpopulations, such as secondary education, may have slightly different perspectives on various aspects of AI literacy. Future studies should therefore examine the extent to which the item set is applicable to other fields. Moreover, to better understand the subject and promote the creation of conditions suitable for the implementation of successful educational AI literacy paths, additional research in this field is required.

6. Conclusion
In conclusion, this study underscores the importance and urgency of AI literacy measurement tools. In an era where AI is ubiquitous and integral to many aspects of our lives, the need for AI literacy is no longer a prospective necessity but a present one. Recognizing the multiplicity of definitions and the obstacles in the development of AI literacy, we developed an assessment tool based on a multidimensional framework [10]. Grounded in the concept of digital literacy [45] and embracing various aspects of AI literacy including knowledge, skills, attitudes, and behaviors, this tool has been thoroughly validated, showing high reliability and construct validity. Our research contributes to the ongoing academic discourse by proposing a theoretically and empirically sound instrument for assessing AI literacy. We acknowledge that, given the diverse definitions and applications of AI literacy, the tool we have developed is by no means definitive, but it offers a robust starting point for educators, researchers, and policymakers. Future research must continue refining the conceptualization and measurement of AI literacy and explore how this literacy impacts students' ability to engage with AI and the broader effects this engagement has on society. The journey to widespread AI literacy is undoubtedly a complex one, but it is a journey we must undertake with vigor and commitment if we are to equip the next generation with the tools they need to navigate a world increasingly mediated by AI.

References
[1] M. C. Laupichler, A. Aster, & T. Raupach, Delphi study for the development and preliminary validation of an item set for the assessment of non-experts' AI literacy, in: Computers and Education: Artificial Intelligence, vol. 4, 2023, pp. 100126. doi: 10.1016/j.caeai.2023.100126.
[2] J. Southworth, K. Migliaccio, J. Glover, J. Glover, D. Reed, C. McCarty, J. Brendemuhl, & A.
Thomas, Developing a model for AI Across the curriculum: Transforming the higher education landscape via innovation in AI literacy, in: Computers and Education: Artificial Intelligence, vol. 4, 2023, pp. 100127. doi: 10.1016/j.caeai.2023.100127.
[3] M. Kandlhofer, G. Steinbauer, S. Hirschmugl-Gaisch, & P. Huber, Artificial Intelligence and Computer Science in Education: From Kindergarten to University, Paper presented at the 2016 IEEE Frontiers in Education Conference (FIE), 2016.
[4] R. Luckin, M. Cukurova, C. Kent, & B. Du Boulay, Empowering educators to be AI-ready, in: Computers and Education: Artificial Intelligence, vol. 3, 2022, pp. 100076. doi: 10.1016/j.caeai.2022.100076.
[5] D. T. K. Ng, J. K. L. Leung, S. K. W. Chu, & M. S. Qiao, Conceptualizing AI literacy: An exploratory review, in: Computers and Education: Artificial Intelligence, vol. 2, 2021, pp. 100041. doi: 10.1016/j.caeai.2021.100041.
[6] J. W. K. Ho, & M. Scadding, Classroom activities for teaching artificial intelligence to primary school students, in S. C. Kong, D. Andone, G. Biswas, H. U. Hoppe, T. Hsu, B. C. Kuo, K. Y. Li, C. Looi, M. Milrad, J. Sheldon, J. Shih, K. Sin, K. Song, & J. Vahrenhold (Eds.), Proceedings of international conference on computational thinking education 2019, The Education University of Hong Kong, 2019, pp. 157–159.
[7] S. C. Kong, & H. Abelson (Eds.), Computational thinking education in K–12: Artificial intelligence literacy and physical computing, MIT Press, 2022.
[8] S.-C. Kong, & G. Zhang, Evaluating an Artificial Intelligence Literacy Programme for Developing University Students' Conceptual Understanding, Literacy, Empowerment and Ethical Awareness, 2023.
[9] D. Long, & B. Magerko, What is AI Literacy? Competencies and Design Considerations, in: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 2020, pp. 1–16. doi:10.1145/3313831.3376727.
[10] S. Cuomo, G. Biagini, & M. Ranieri, Artificial Intelligence Literacy, che cos'è e come promuoverla. Dall'analisi della letteratura ad una proposta di Framework, in: Media Education, 2022. doi:10.36253/me-13374.
[11] Y. Dai, C. S. Chai, P. Y. Lin, M. S. Y. Jong, Y. Guo, & J. Qin, Promoting students' well-being by developing their readiness for the artificial intelligence age, in: Sustainability, vol. 12, no. 16, 2020. doi:10.3390/su12166597.
[12] S.-C. Kong, W. M.-Y. Cheung, & O. Tsang, Evaluating an artificial intelligence literacy programme for empowering and developing concepts, literacy and ethical awareness in senior secondary students, in: Education and Information Technologies, 2022. doi:10.1007/s10639-022-11408-7.
[13] C. Sindermann, P. Sha, M. Zhou, J. Wernicke, H. S. Schmitt, M. Li, & C. Montag, Assessing the attitude towards artificial intelligence: Introduction of a short measure in German, Chinese, and English Language, in: KI-Künstliche Intelligenz, vol. 35, no. 1, 2021, pp. 109–118.
[14] A. Schepman, & P. Rodway, Initial validation of the general attitudes towards Artificial Intelligence Scale, in: Computers in Human Behavior Reports, vol. 1, 2020, pp. 100014. doi: 10.1016/j.chbr.2020.100014.
[15] Y. Y. Wang, & Y. S. Wang, Development and validation of an artificial intelligence anxiety scale: An initial application in predicting motivated learning behavior, in: Interactive Learning Environments, vol. 30, no. 4, 2022, pp. 619–634. doi:10.1080/10494820.2019.1674887.
[16] S. Druga, S. T. Vu, E. Likhith, & T.
Qiu, Inclusive AI literacy for kids around the world, in: Proceedings of FabLearn 2019, 2019, pp. 104–111. doi:10.1145/3311890.3311904.
[17] S. Kim, Y. Jang, W. Kim, S. Choi, H. Jung, S. Kim, & H. Kim, Why and What to Teach: AI Curriculum for Elementary School, 2021, p. 8.
[18] J. Su, Y. Zhong, & D. T. K. Ng, A meta-review of literature on educational approaches for teaching AI at the K-12 levels in the Asia-Pacific region, in: Computers and Education: Artificial Intelligence, vol. 3, 2022, pp. 100065. doi: 10.1016/j.caeai.2022.100065.
[19] A. Aithal & P. S. Aithal, Development and Validation of Survey Questionnaire & Experimental Data – A Systematical Review-based Statistical Approach, in: International Journal of Management, Technology, and Social Sciences (IJMTS), vol. 5, no. 2, 2020, pp. 233–251. DOI: http://doi.org/10.5281/zenodo.4179499.
[20] R. F. DeVellis, Scale Development: Theory and Applications, Vol. 21, Sage Publications, 2016.
[21] L. Floridi, J. Cowls, M. Beltrametti, R. Chatila, P. Chazerand, V. Dignum, C. Luetge, R. Madelin, U. Pagallo, F. Rossi, B. Schafer, P. Valcke, & E. Vayena, AI4People—An Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations, in: Minds and Machines, vol. 28, no. 4, 2018, pp. 689–707. http://dx.doi.org/10.1007/s11023-018-9482-5.
[22] L. Floridi, Introduction–The Importance of an Ethics-First Approach to the Development of AI, in: Ethics, Governance, and Policies in Artificial Intelligence, 2021, pp. 1–4.
[23] N. Selwyn, The future of AI and education: Some cautionary notes, in: European Journal of Education, vol. 57, no. 4, 2022, pp. 620–631. doi:10.1111/ejed.12532.
[24] European Commission, Joint Research Centre (JRC), The impact of Artificial Intelligence on learning, teaching, and education, Publications Office, 2018. https://data.europa.eu/doi/10.2760/12297.
[25] European Commission, Shaping Europe's digital future—European strategy for data, 2021.
[26] European Commission, High-Level Expert Group on Artificial Intelligence, 2018. Available online at: https://digital-strategy.ec.europa.eu/en/policies/expert-group-ai.
[27] European Commission, Pilot the Assessment List of the Ethics Guidelines for Trustworthy AI, 2019. Available online at: https://ec.europa.eu/futurium/en/ethics-guidelines-trustworthy-ai/register-piloting-process-0.html.
[28] European Commission, On Artificial Intelligence - A European approach to excellence and trust, Technical report, Brussels, 2020.
[29] European Commission, Joint Research Centre (JRC) & Organisation for Economic Co-operation and Development (OECD), AI watch, national strategies on artificial intelligence: a European perspective, Publications Office of the European Union, 2021. doi:10.2760/069178.
[30] Organisation for Economic Co-operation and Development (OECD), Bridging the digital gender divide: Include, upskill, innovate, OECD Publishing, 2018a. http://www.oecd.org/digital/bridging-the-digital-gender-divide.pdf.
[31] Organisation for Economic Co-operation and Development (OECD), Future of education and skills 2030: Conceptual learning framework, OECD Publishing, 2018b. https://www.oecd.org/education/2030/Education-and-AI-preparing-forthe-future-AI-Attitudes-and-Values.pdf.
[32] Organisation for Economic Co-operation and Development (OECD), Recommendation of the Council on Artificial Intelligence, OECD Publishing, 2019. https://legalinstruments.oecd.org/en/instruments/OECD-LEGAL-0449.
[33] J. C. Anderson, & D. W.
Gerbing, Predicting the performance of measures in a confirmatory factor analysis with a pretest assessment of their substantive validities, in: Journal of Applied Psychology, vol. 76, no. 5, 1991, pp. 732–740.
[34] T. R. Hinkin, A brief tutorial on the development of measures for use in survey questionnaires, in: Organizational Research Methods, vol. 1, no. 1, 1998, pp. 104–121.
[35] C. A. Schriesheim, K. J. Powers, T. A. Scandura, C. C. Gardiner, & M. J. Lankau, Improving construct measurement in management research: Comments and a quantitative approach for assessing the theoretical content adequacy of paper-and-pencil survey-type instruments, in: Journal of Management, vol. 19, no. 2, 1993, pp. 385–417.
[36] R Core Team, R: A language and environment for statistical computing [Computer software], 2018. https://cran.r-project.org/.
[37] Jamovi Project, jamovi (Version 0.9) [Computer Software], 2019. Available at: https://www.jamovi.org/.
[38] L. J. Cronbach, Coefficient alpha and the internal structure of tests, in: Psychometrika, vol. 16, 1951, pp. 297–334.
[39] J. Nunnally, Psychometric Theory, New York: McGraw-Hill, 1978.
[40] R. P. McDonald, Test theory: A unified treatment, Mahwah, NJ: L. Erlbaum Associates, 1999.
[41] C. Fornell, & D. F. Larcker, Evaluating Structural Equation Models with Unobservable Variables and Measurement Error, in: Journal of Marketing Research, vol. 18, no. 1, 1981, pp. 39–50.
[42] J. E. Hair Jr, R. E. Anderson, R. L. Tatham, & W. C. Black, Multivariate data analysis (5th ed.), Upper Saddle River, NJ: Prentice-Hall, 1998.
[43] J. Hair, W. C. Black, B. J. Babin, & R. E. Anderson, Multivariate data analysis (7th ed.), Upper Saddle River, NJ: Pearson Education International, 2010.
[44] D. W. Straub, Validating instruments in MIS research, in: MIS Quarterly, vol. 13, no. 2, 1989, pp. 147–169.
[45] A. Calvani, A. Cartelli, A. Fini, & M. Ranieri, Models and Instruments for Assessing Digital Competence at School, in: Journal of E-Learning and Knowledge Society, vol. 4, no. 3, 2008, pp. 183–193.
[46] United Nations Children's Fund, Policy guidance on AI for children: Draft 1.0, 2020. Available at: https://www.unicef.org/globalinsight/media/1171/file/UNICEF-Global-Insight-policy-guidance-AI-children-draft-1.0-2020.pdf.
[47] United Nations Children's Fund, AI policy guidance: How the world responded, 2021a. Available at: https://www.unicef.org/globalinsight/stories/ai-policy-guidance-how-world-responded.
[48] United Nations Children's Fund, Policy guidance on AI for children 2.0, UNICEF, 2021b. Available at: https://www.unicef.org/globalinsight/media/2356/file/UNICEF-Global-Insight-policy-guidance-AI-children-2.0-2021.pdf.
[49] United Nations Educational, Scientific and Cultural Organization (UNESCO), Beijing consensus on artificial intelligence and education, 2019a. Available at: https://unesdoc.unesco.org/ark:/48223/pf0000368303.
[50] United Nations Educational, Scientific and Cultural Organization (UNESCO), Stepping up AI for social good, 2019b.
[51] United Nations Educational, Scientific and Cultural Organization (UNESCO), AI and education: Guidance for policy makers, 2021. Available at: https://unesdoc.unesco.org/ark:/48223/pf0000376709.
[52] B. Wang, P. L. P. Rau, & T. Yuan, Measuring user competence in using artificial intelligence: Validity and reliability of artificial intelligence literacy scale, in: Behaviour and Information Technology, 2022. doi:10.1080/0144929X.2022.2072768.