Challenges of Enforcing Regulations in Artificial Intelligence Act — Analyzing Quantity Requirement in Data and Data Governance

Farhana Ferdousi Liza
School of Computing Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK.

Abstract
To make Artificial Intelligence (AI) systems and services accountable and regulated in the European Union market, in April 2021 the European Commission published a proposal 'Laying Down Harmonised Rules on Artificial Intelligence (Artificial Intelligence Act)', widely known as the Artificial Intelligence Act (AI Act). Since then, many concerns have been raised about compliance and whether the regulations are enforceable. However, to the best of our knowledge, none of these works provided an explicit technical analysis of the challenges in enforcing the regulation. Among the 85 Articles in the AI Act, we focus on Article 10, the central regulatory requirement for data and data governance. In this paper, we analyze a specific requirement, data quantity, to show the challenges of enforcing this requirement in a principled way. In our analysis, we use deep learning modeling and machine learning generalization theory.

Keywords
Artificial Intelligence Act, Future Technologies, Generalization Theory, Deep Learning Modeling.

1st International Workshop on Imagining the AI Landscape After the AI Act (in conjunction with the first International Conference on Hybrid Human-Artificial Intelligence), Vrije Universiteit Amsterdam, Amsterdam, Netherlands, June 13, 2022
F.Liza@uea.ac.uk (F. F. Liza); ORCID 0000-0003-4854-5619 (F. F. Liza)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org)

1. Introduction
In April 2021, the European Commission proposed regulations on Artificial Intelligence (AI) and published the Artificial Intelligence Act (AI Act) [1]. This landmark regulatory proposal aims to create a clear regulatory environment for AI providers and users and to protect AI users from the harmful effects of AI deployments. The AI Act aims at facilitating the development of a single market for lawful and safe AI. Once it comes into force, this Act will have an international impact beyond the European Economic Area [2, 3]; a critical assessment is therefore important for sustainable and fair regulation of AI systems.

AI Act Article 3(1), together with Annex I, defines Artificial Intelligence (AI) in a way that covers a wide range of computational methods designed to perform certain tasks, such as generating content, assisting in decision-making about people's social security benefits, or predicting an individual's risk of committing fraud or defaulting on a loan. The covered AI systems and approaches include machine learning approaches, such as supervised, unsupervised, and reinforcement learning, using a wide variety of methods including deep learning. AI is the capability of a computer system to mimic human cognitive functions such as learning and problem-solving. One way to train a computer system to mimic human learning and reasoning is to use a neural network, a series of algorithms, inspired by the brain, that model complex data representations. The neural network helps the computer system achieve AI through deep learning. In this paper, we use this particular type of modeling for our analysis; however, the analysis applies to any AI system capable of learning and reasoning that mimics human intelligence.
AI systems have the potential to improve livelihoods and overall economic and societal welfare. AI-powered systems may help to respond to key global challenges, such as emergency management and the novel coronavirus pandemic. For example, storm Eunice was predicted five days before it hit the UK using numerical simulation techniques, which provided intervention opportunities to reduce casualties. If AI techniques could predict such events even earlier, it would be possible to prevent more casualties. This makes Schultz et al. [4] wonder 'Can deep learning beat numerical weather prediction?'. Moreover, AI systems can contribute to better healthcare services [5], improve the supply chain [6], and help in fast decision making [7] and service deployment [8, 9]. AI systems can also benefit the public sector in several ways, for example, by efficiently allocating resources to personalized public services tailored to individual circumstances, or by planning resource allocation early based on historical data.

While AI systems have unprecedented potential for society, they also come with substantial risks, raising a variety of legal and ethical challenges. AI systems have the potential to unpredictably harm people's life, health, and autonomy. They can also affect fundamental values of European society and around the world, leading to breaches of fundamental rights such as freedom of expression and assembly, non-discrimination, or the right to an effective remedy and a fair trial, as well as consumer protection [10, 11, 12].

To mitigate the risks, the Commission has proposed a risk-based regulatory approach. High-risk AI systems that use techniques involving the training of models on training, validation, and testing data sets are required to meet the criteria referred to in AI Act Article 10. We particularly appreciate that the AI Act is imposing regulations on data and data governance to regulate AI; however, a closer analysis shows that the proposed AI Act may need improvement in many areas. This paper is a basis for a preliminary assessment of the AI Act for further discussion on data and data governance.

In this paper, we will mainly concentrate on deep learning modeling (i.e., deep neural network models), as a representative AI system inspired by brain computation or human intelligence, and on the generalization theory of machine learning. Without loss of generality, the discussion applies to any AI model, specifically those capable of modeling complex nonlinear data representations.

2. Related works
Since the publication of the AI Act proposal, many critical assessments of the regulation and its impact have been published [13, 14]. Fiazza [14] raised the concern of enforcing AI Act Article 10 by stating that:

To touch briefly on a topic worthy of a longer discussion, the Proposal mandates in Article 10 that all datasets used "shall be relevant, representative, free of errors and complete," "exhibiting the appropriate statistical properties," and that they be evaluated for possible biases. In so doing, the Proposal represents in the wider context of AI a problem already visible in the General Data Protection Regulation [5]: it is not presently known how to fulfill the bill. The problem is especially thorny in medical robotics in connection to accounting for rare events, anatomical variants and rare pathologies in the datasets.

The sentence 'it is not presently known how to fulfill the bill' captures the main challenge of enforcing the AI Act.
We have analyzed the data quantity requirement in detail to show that it is not possible to precisely quantify the amount of data required for training an AI system. Similarly, Ebers et al. [13] recognize the impossibility of enforcing the regulation in practice and raise the concern that the regulation in its current state might hamper innovation. They mainly considered issues including data quality, bias, and the free-of-errors requirement on the dataset. Human Rights Watch (HRW) has carried out a critical evaluation of the AI Act and concluded that the regulation can endanger the social safety net1. To our knowledge, no work has been done to show that compliance with, and enforcement of, Article 10 is possible in a principled way.

1 https://www.hrw.org/news/2021/11/10/how-eus-flawed-artificial-intelligence-regulation-endangers-social-safety-net

We share the community's concerns. To be able to understand the challenges and to provide sustainable requirements, it is important to analyze the requirements from a technical perspective. Although existing works raise valid concerns, none of them provided an explicit technical analysis of the challenges in enforcing the regulation. Specifically, none of the studies have analyzed the data quantity requirement in AI Act Article 10. In this paper, we provide a technical analysis emphasizing the challenges of enforcing the 'data quantity' requirement.

3. AI Act Article 10 and GDPR
In the AI Act, the EU Commission proposes a risk-based categorization of AI systems with four levels of risk (unacceptable risk, high risk, limited risk, and minimal risk) and related regulatory obligations and restrictions. AI Act Article 10 outlines the central regulatory requirement for data and data governance. These requirements are for high-risk AI systems, which are defined both by general characteristics (AI Act Article 6) and by specifically targeted applications (AI Act Annex III). AI Act Article 6 defines 'high-risk' AI systems as those where the AI system is intended to be used as a safety component of a product, or is itself a product, and this product is subject to an existing third-party conformity assessment. AI Act Annex III defines the targeted applications, including Biometric identification, Law enforcement, and Employment. In addition, the Commission has the power to directly designate an AI system as high-risk by adding it to AI Act Annex III (AI Act Article 84). Although analysis of the risk categorization is outside the scope of this paper, the proposed risk categorization seems unrealistic. If an AI system is developed outside the listed application areas and later poses an unforeseen high or unacceptable risk, the system may be penalized with high fines, or the Commission cannot hold it accountable because it was not listed in Annex III at the time of AI system development.

If a high-risk AI system is trained with data, its training, validation, and test datasets must meet the quality criteria specified in paragraphs 2 to 5 of AI Act Article 10. Paragraph 2 requires that appropriate data governance and data management procedures apply, including design choices, data collection, relevant data preparation processes, a prior assessment of the availability, quantity, and suitability of the required datasets, examination of bias, and identification of possible data gaps.
Similarly, paragraph 3 requires that the datasets be relevant, representative, free of errors, and complete, and paragraph 4 requires that, to the extent necessary for the intended purpose, they correspond to the characteristics or elements specific to the particular geographic, behavioral, or functional context in which the high-risk AI system is intended to be used. Although the Commission has outlined these requirements, it does not define these characteristics, or its understanding of 'data governance', at a level that can be used for designing relevant compliance and enforcement techniques. The requirements therefore remain subjective where objective requirements are needed for enforceable compliance. This results in uncertainty in the technical implementation of compliance and enforcement, and in legal complexity.

Like the General Data Protection Regulation (GDPR), the AI Act standardizes requirements for the handling of data. If an AI system uses personal data to develop high-risk AI systems, providers will be obliged to comply with the data handling requirements of the AI Act and the personal data processing requirements of the GDPR. The question arises: in the case of violating any requirements, would providers be penalized twice? This question arises because the EU presents itself as a single interlocutor, not only in the management of personal data (GDPR) but now also for AI systems and services. Clarification of how these laws will be applied in the case of violations would therefore encourage the sustainable development of AI systems and services within the EU's long-term vision for AI. Otherwise, from a techno-legal perspective, the fear of being penalized under two related laws, especially given the vague, non-explicit criteria in the AI Act, will hinder the development of AI by discouraging potential investment.

4. Challenges to Enforce AI Act Article 10 Data Quantity Requirement
In this section, we will mainly analyze one specific requirement — data quantity — and two other related requirements — free of errors and complete data — in AI Act Article 10 to emphasize the challenges of complying with the proposed regulation from a technical perspective. AI Act Article 10(2)(e) has the following prior assessment requirement:

a prior assessment of the availability, quantity and suitability of the data sets that are needed;

To be able to pre-assess the data quantity for an AI system, we would need a complete understanding of the real process that we aim to model and a complete understanding of how a model makes a decision. When our understanding of a real process is limited, modeling the process leads to larger uncertainty. This is because we do not know or understand the process, and we can only come up with our own take on the process, which might not be representative of the true process. For example, Saltelli and Funtowicz [15] have described climate change modeling, where a better understanding of the process led to larger uncertainty. In that scenario, the Intergovernmental Panel on Climate Change produced larger prediction uncertainty ranges, as opposed to smaller ones, as more and more processes, scenarios, and models were incorporated and cascading uncertainties made the effect of the lack of understanding felt in the final estimates.
The fundamental problem is that the current state of statistics and nonconvex optimization theory cannot provide probability distributions that represent real-world complex data distributions and cannot explain why an AI system (i.e., a deep learning model) performs well in empirical settings for real-world problems [16]. With the current state of computational learning theory, statistics, and mathematics, it is thus not possible to precisely quantify the dataset needed for AI systems, including deep learning models. To elaborate, we will use the generalization theory of machine learning.

The quantity of the dataset is related to a concept called model generalization, which is not yet understood for non-linear machine learning models, including deep learning models. To briefly describe generalization: when we train a model, we do not just want the model to perform well on the training data. We want it to generalize to data that the model has not seen during training; this unseen data is called held-out test data. For example, in human learning, if a human is told that a lion is dangerous (training), they can reason that a tiger might be dangerous even if they were never warned about the tiger (held-out testing). To measure a model's generalization performance, we measure its performance on a held-out test dataset. If a model works well on the training set but fails to generalize on a held-out dataset, we conclude that the model has overfitted. Overfitting is problematic for generalized learning.

Improving generalization (or preventing overfitting) in AI models, especially for neural networks, is still somewhat based on trial and error. The current approach depends on the data distribution: with a change in data distribution (e.g., an increase in the dataset due to the availability of new data in a real scenario), the whole trial-and-error process needs to be repeated. Of course, there are a few simple strategies that can help reduce overfitting (e.g., reducing model capacity, early stopping, regularization and weight decay, ensembles, data augmentation, and stochastic regularization), but none of these techniques guarantees generalized performance; that is, we do not know when these models will fail to generalize. Therefore, we do not have learning guarantees, error bounds, or science and technology that shed light on the explicit evaluation of generalization. That means it is not possible to specify the required quantity of data for an AI system in real-world scenarios. A more realistic AI system would need to deal with out-of-distribution learning, and generalization to arbitrary out-of-distribution data is impossible [17] with the current state of computational learning theory, statistics, and mathematics. The AI Act does not specify the expected generalization performance, so it is difficult to design a prior assessment that gives reasonable confidence that the end-users of a high-risk system will not incur harm because of a data quantity issue.

In conventional learning theory, there is a relationship between the number of parameters and overfitting. When the number of parameters is larger than the number of training examples, the model is called an over-parameterized model. From the perspective of optimization and computational learning theory, overparameterization should lead to overfitting; the sketch below illustrates held-out evaluation and this classical parameters-versus-examples intuition on a toy example.
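To make the held-out evaluation and the parameters-versus-examples comparison concrete, here is a minimal, self-contained sketch in Python. It is not the Act's or any cited work's method: the synthetic sine process, the polynomial model family, the noise level, the split sizes, and the chosen degrees are all illustrative assumptions, and only NumPy is assumed to be available.

```python
# Minimal sketch: held-out evaluation of a toy model family, comparing the
# parameter count with the number of training examples. All numbers below
# (process, noise, split, degrees) are illustrative assumptions.
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# Stand-in for an unknown real process: y = sin(3x) + noise.
x = rng.uniform(-1.0, 1.0, size=40)
y = np.sin(3.0 * x) + rng.normal(scale=0.1, size=x.shape)

# Held-out split: the model never sees the last 10 points during training.
x_train, y_train = x[:30], y[:30]
x_test, y_test = x[30:], y[30:]

def fit_and_evaluate(degree):
    """Fit a polynomial of the given degree; return (params, train MSE, test MSE)."""
    model = Polynomial.fit(x_train, y_train, deg=degree)  # degree + 1 parameters
    train_mse = float(np.mean((model(x_train) - y_train) ** 2))
    test_mse = float(np.mean((model(x_test) - y_test) ** 2))
    return degree + 1, train_mse, test_mse

print(f"training examples: {len(x_train)}")
for degree in (2, 9, 20):
    n_params, train_mse, test_mse = fit_and_evaluate(degree)
    # A low training error combined with a much larger held-out error is the
    # empirical signature of overfitting; classical theory expects this gap to
    # grow as the parameter count approaches the number of training examples.
    print(f"parameters={n_params:2d}  train_mse={train_mse:.4f}  test_mse={test_mse:.4f}")
```

Even in this toy setting, the degree at which the held-out error deteriorates depends on the noise level and on where the 40 sampled points happen to fall; this dependence on the (unknown) data distribution is precisely what prevents fixing a required data quantity in advance.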
However, it is not clear why overparameterization does not lead to overfitting in deep learning models [18]. The community has concluded that to understand the deep learning framework we might need to rethink conventional generalization theory [19], and it is still working on a theoretical explanation of generalization [16]. For example, Simon et al. [20] concluded that 'The generalization of deep neural networks remains mysterious, with a full quantitative theory still out of reach'. Essentially, Simon et al. [20] tried to formulate a principled theory of generalization and concluded that more work remains to be done to arrive at a general theory of deep learning generalization.

Despite the lack of theoretical understanding, one of the benefits of using a deep neural network is that, with domain expertise and heuristic strategies [21], deep learning modeling can perform empirically well on high-dimensional large datasets, whereas other models would become intractable. Therefore, it is common to perform studies on how algorithm performance scales with dataset size [22]. This means that algorithm performance improves in deep learning modeling with an increase in sample size, whereas conventional models may even fail to model such high-dimensional datasets. However, the domain expertise and heuristic strategies that make deep learning models perform better than conventional models do not give us reasonable pre-assessment criteria for a sample size (i.e., data quantity) that would make deep learning models achieve acceptable generalization performance within a techno-legal framework.

Generally, for a machine learning approach, no theory can precisely quantify the data needed for any type of real-world AI system modeling. Usually, the data size (i.e., sample size) is chosen based on different heuristics [23, 24], including a factor of the number of classes, a factor of the number of input features, or a factor of the number of model parameters (illustrated in the sketch below). However, these are all heuristics, and therefore model correctness and generalization rely on error analysis and different statistical significance tests. Even when all these heuristics are followed, there is still a possibility of mistakes (e.g., under adversarial attacks) by well-trained machine learning models, both conventional models [25] and deep learning models [26]. Whereas deep learning is remarkable in providing solutions to problems that were not possible using conventional machine learning techniques, adversarial attacks and the defenses against them are still under investigation. Are adversarial attacks related to the data quantity used in training? We do not know the answer. Therefore, in the presence of adversarial attacks, these heuristic approaches are not sufficiently reliable for providing a good generalization guarantee.

AI Act Article 10(3) requires the following criteria for a dataset:

Training, validation and testing data sets shall be relevant, representative, free of errors and complete.

As we have discussed, estimating the exact quantity (i.e., sample size) of the dataset for AI systems is impossible; a direct consequence is that it is not possible to evaluate the completeness of the dataset. Moreover, in an unsupervised learning paradigm, it is difficult to evaluate the 'free of errors' criterion, as there is no explicit human labeling of the dataset and it is not clear what 'error' means in this context.
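As a concrete illustration of how loose the sample-size rules of thumb mentioned above are, the following sketch compares three commonly quoted heuristics for a hypothetical classifier. The multipliers and the example model (10 classes, 32x32 RGB inputs, about 1.2 million parameters) are assumptions chosen purely for illustration; none of these numbers comes from the AI Act or from the cited works.

```python
# Minimal sketch of rule-of-thumb sample-size estimates (cf. [23, 24]).
# The multipliers below are illustrative assumptions, not prescribed values;
# the point is that reasonable heuristics disagree by orders of magnitude.

def heuristic_sample_sizes(n_classes, n_features, n_parameters):
    """Return several rule-of-thumb estimates of the required training-set size."""
    return {
        "per-class rule (1000 x classes)": 1000 * n_classes,
        "per-feature rule (10 x features)": 10 * n_features,
        "per-parameter rule (10 x parameters)": 10 * n_parameters,
    }

# Hypothetical example: 10 classes, 3,072 input features (32x32 RGB images),
# and roughly 1.2 million trainable parameters (all assumed numbers).
estimates = heuristic_sample_sizes(n_classes=10, n_features=3 * 32 * 32,
                                   n_parameters=1_200_000)
for rule, n in estimates.items():
    print(f"{rule:>40s}: {n:>12,d} examples")
```

For the same hypothetical model, the three rules span roughly ten thousand to twelve million examples, and none of them accounts for representativeness, distribution shift, or adversarial robustness; this is why such heuristics cannot underpin a principled compliance check on data quantity.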
One of the scenarios in which the deep learning paradigm excels is unsupervised feature learning. For example, deep learning models are needed for high-dimensional domains (e.g., text, image, video) where manual feature engineering is difficult [27]. In this context, defining an error-free dataset would be difficult given the size of the datasets used in the unsupervised learning framework. Overall, it is necessary to define quantifiable errors to design meaningful compliance and to make regulation enforcement fair for an AI system. Such unsupervised features work well with downstream applications; however, there is still no theory to explain why automated feature learning using deep learning techniques works so well downstream. At the same time, there is no theory of why adversarial examples [28, 29] can cause AI systems to make mistakes. Does this relate to data-related error or to model-related error?

Machine learning has a close relation to statistics. "All models are wrong, but some are useful" is a famous quote often attributed to the British statistician George E. P. Box. Statisticians and researchers try to develop models aiming to predict the behavior of a certain process, for instance, the selling trend of a product or the demand for taxis in a city [30]. The idea of this quote is that every single model will be wrong, meaning that it will never represent the exact real process. This is true for most machine learning and AI systems, as no model can represent the exact real process. Moreover, we do not have an understanding of the representation learned by an AI system (i.e., a deep learning model). For example, adversarial examples are not distinguishable from the original examples to human eyes. It is possible that AI systems have a different interpretation of the learned representation or features than humans, and there is scope for theoretical and empirical research to understand deep learning models. It is difficult to enforce a regulation on the completeness and freedom from errors of a dataset until we have a more in-depth understanding of AI systems, including deep learning models.

In attempting to regulate AI systems for high-risk applications, the proposed data and data governance requirement criteria in the AI Act would pose problematic circumstances from a techno-legal perspective, especially when statistical, mathematical, and computational learning theory is not advanced enough to give evaluation metrics for those criteria. For example, anyone can claim that the dataset used for an AI system is of the right quantity for modeling, and no principled evaluation framework is available to validate this claim. Without a principled quantitative or qualitative evaluation framework, these criteria are discursively meaningless [31], easy to manipulate, and not fit for purpose.

When Ursula von der Leyen pledged that, within 100 days of her election as President of the European Commission, she would propose new legislation on AI2, Floridi [32] remarked that it was a reasonable strategy but an unrealistic timeline. Philosophically, the initiative was a starting point to ensure that the development of AI in the EU is ethically sound, legally acceptable, socially equitable, and environmentally sustainable. The underpinning vision is to have AI systems and services that support the economy, society, and the environment.

2 https://ec.europa.eu/commission/presscorner/detail/en/ip_20_403
Fulfilling such a vision is not simple, and it will take time and effort to reach a final AI Act that can come close to fulfilling it. Yet the vision, like von der Leyen's pledge, remains reasonable because the EU is set to deliver such a challenging philosophical framework. More clarity on the expectations for the assessment of data and data governance from the legal perspective, with a corresponding technical evaluation framework, will allow this unified, post-Westphalian approach [33] to have several positive effects. AI companies and vendors will have to deal with one EU regulation for their AI systems and services, not with the individual Member States, when they have to prove that they comply with the new legislation.

5. Conclusion
In this paper, we have concentrated on a requirement in Article 10 which is related to data and data governance. There are other issues3, such as application-specific risk categorization, that need further consideration. The requirements of the AI Act would have an impact on society, future technologies, and innovation. For example, the global law firm Taylor Wessing has reported that the double regulatory compliance burden (i.e., GDPR and AI Act) on data governance might have financial consequences: the new AI Act imposes higher fines than the GDPR for violating the requirements under Article 10, namely up to EUR 30,000,000 or, in the case of companies, up to 6% of the total annual worldwide turnover of the previous financial year, whichever is higher4. This might limit innovation, which would have economic consequences. To conclude, based on our analysis, we call for technically implementable AI Act regulations, which will have long-term international, social, and economic implications. In particular, we call for more clarity from lawmakers on their expectations of data and data governance.

3 https://www.bbc.co.uk/news/technology-56745730
4 https://www.taylorwessing.com/en/interface/2021/ai-act/fines-under-the-ai-act---a-bottomless-pit

Acknowledgement
We thank the three anonymous reviewers whose comments and suggestions helped improve and clarify this manuscript.

References
[1] E. Commission, Proposal for a Regulation Laying down Harmonised Rules on Artificial Intelligence (Artificial Intelligence Act); COM (2021) 206 final; European Commission: Brussels, Belgium, 2021.
[2] D. Svantesson, The european union artificial intelligence act: Potential implications for australia, Alternative Law Journal (2021) 1037969X211052339.
[3] G. Greenleaf, The 'brussels effect' of the EU's 'AI act' on data privacy outside europe (june 7, 2021), Privacy Laws Business International Report 1, 3-7, UNSW Law Research 171 (2021) 3–7. URL: https://ssrn.com/abstract=3898904.
[4] M. Schultz, C. Betancourt, B. Gong, F. Kleinert, M. Langguth, L. Leufen, A. Mozaffari, S. Stadtler, Can deep learning beat numerical weather prediction?, Philosophical Transactions of the Royal Society A 379 (2021) 20200097.
[5] A. Panesar, Machine learning and AI for healthcare, Springer, 2019.
[6] R. Toorajipour, V. Sohrabpour, A. Nazarpour, P. Oghazi, M. Fischl, Artificial intelligence in supply chain management: A systematic literature review, Journal of Business Research 122 (2021) 502–517.
[7] R. Vaishya, M. Javaid, I. H. Khan, A. Haleem, Artificial intelligence (ai) applications for covid-19 pandemic, Diabetes & Metabolic Syndrome: Clinical Research & Reviews 14 (2020) 337–339.
[8] H. J. Wilson, P. R.
Daugherty, Collaborative intelligence: Humans and ai are joining forces, Harvard Business Review 96 (2018) 114–123.
[9] J. M. Corchado, P. Chamoso, G. Hernández, A. S. R. Gutierrez, A. R. Camacho, A. González-Briones, F. Pinto-Santos, E. Goyenechea, D. Garcia-Retuerta, M. Alonso-Miguel, B. B. Hernandez, D. V. Villaverde, M. Sanchez-Verdejo, P. Plaza-Martínez, M. López-Pérez, S. Manzano-García, R. S. Alonso, R. Casado-Vara, J. P. Tejedor, F. d. l. Prieta, S. Rodríguez-González, J. Parra-Domínguez, M. S. Mohamad, S. Trabelsi, E. Díaz-Plaza, J. A. Garcia-Coria, T. Yigitcanlar, P. Novais, S. Omatu, Deepint.net: A rapid deployment platform for smart territories, Sensors 21 (2021). URL: https://www.mdpi.com/1424-8220/21/1/236. doi:10.3390/s21010236.
[10] M. Ebers, S. Navas, Algorithms and law, Cambridge University Press, 2020.
[11] S. Gerke, T. Minssen, G. Cohen, Ethical and legal challenges of artificial intelligence-driven healthcare, in: Artificial intelligence in healthcare, Elsevier, 2020, pp. 295–336.
[12] T. Tzimas, Legal and Ethical Challenges of Artificial Intelligence from an International Law Perspective, volume 46, Springer Nature, 2021.
[13] M. Ebers, V. R. Hoch, F. Rosenkranz, H. Ruschemeier, B. Steinrötter, The european commission's proposal for an artificial intelligence act—a critical assessment by members of the robotics and ai law society (rails), J 4 (2021) 589–603.
[14] M.-C. Fiazza, The eu proposal for regulating ai: Foreseeable impact on medical robotics, in: 2021 20th International Conference on Advanced Robotics (ICAR), 2021, pp. 222–227. doi:10.1109/ICAR53236.2021.9659429.
[15] A. Saltelli, S. Funtowicz, When all models are wrong, Issues in Science and Technology 30 (2014) 79–85.
[16] T. J. Sejnowski, The unreasonable effectiveness of deep learning in artificial intelligence, Proceedings of the National Academy of Sciences 117 (2020) 30033–30038. URL: https://www.pnas.org/doi/abs/10.1073/pnas.1907373117. doi:10.1073/pnas.1907373117.
[17] H. Ye, C. Xie, T. Cai, R. Li, Z. Li, L. Wang, Towards a theoretical framework of out-of-distribution generalization, Advances in Neural Information Processing Systems 34 (2021).
[18] Z. Allen-Zhu, Y. Li, Y. Liang, Learning and generalization in overparameterized neural networks, going beyond two layers, in: H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, R. Garnett (Eds.), Advances in Neural Information Processing Systems, volume 32, Curran Associates, Inc., 2019. URL: https://proceedings.neurips.cc/paper/2019/file/62dad6e273d32235ae02b7d321578ee8-Paper.pdf.
[19] C. Zhang, S. Bengio, M. Hardt, B. Recht, O. Vinyals, Understanding deep learning requires rethinking generalization, International Conference on Learning Representations (2017).
[20] J. B. Simon, M. Dickens, M. R. DeWeese, Neural tangent kernel eigenvalues accurately predict generalization, CoRR abs/2110.03922 (2021). URL: https://arxiv.org/abs/2110.03922. arXiv:2110.03922.
[21] F. F. Liza, M. Grzes, Improving language modelling with noise contrastive estimation, in: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, AAAI'18/IAAI'18/EAAI'18, AAAI Press, 2018.
[22] F. F. Liza, M.
Grzes, Relating RNN layers with the spectral WFA ranks in sequence modelling, in: Proceedings of the Workshop on Deep Learning and Formal Languages: Building Bridges, Association for Computational Linguistics, Florence, 2019, pp. 24–33. URL: https://www.aclweb.org/anthology/W19-3903. doi:10.18653/v1/W19-3903.
[23] S. J. Raudys, A. K. Jain, et al., Small sample size effects in statistical pattern recognition: Recommendations for practitioners, IEEE Transactions on Pattern Analysis and Machine Intelligence 13 (1991) 252–264.
[24] R. Krishnaiah, L. Kanal, Dimensionality and sample size considerations in pattern recognition practice, Handbook of Statistics, 1982.
[25] H. Xiao, H. Xiao, C. Eckert, Adversarial label flips attack on support vector machines, in: Proceedings of the 20th European Conference on Artificial Intelligence, ECAI'12, IOS Press, NLD, 2012, p. 870–875.
[26] A. Chakraborty, M. Alam, V. Dey, A. Chattopadhyay, D. Mukhopadhyay, A survey on adversarial attacks and defences, CAAI Transactions on Intelligence Technology 6 (2021) 25–45.
[27] A. Halevy, P. Norvig, F. Pereira, The unreasonable effectiveness of data, IEEE Intelligent Systems 24 (2009) 8–12.
[28] I. J. Goodfellow, J. Shlens, C. Szegedy, Explaining and harnessing adversarial examples, 2014. URL: https://arxiv.org/abs/1412.6572. doi:10.48550/ARXIV.1412.6572.
[29] S. Garg, G. Ramakrishnan, BAE: BERT-based adversarial examples for text classification, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020, pp. 6174–6181.
[30] F. Rodrigues, I. Markou, F. C. Pereira, Combining time-series and textual data for taxi demand prediction in event areas: A deep learning approach, Information Fusion 49 (2019) 120–129.
[31] M. R. Olsson, Michel Foucault: discourse, power/knowledge, and the battle for truth, Leckie, Gloria J (2010) 63–74.
[32] L. Floridi, The european legislation on ai: A brief analysis of its philosophical approach, Philosophy & Technology 34 (2021) 215–222.
[33] A. Linklater, Citizenship and sovereignty in the post-westphalian state, European Journal of International Relations 2 (1996) 77–103.