Artificial Intelligence Systems Producing Books: Questions of Agency¹

Maurizio Lana
Università del Piemonte Orientale, piazza Roma 36, 13100 Vercelli, Italy

Abstract
The publication of the book Beta Writer. 2019. Lithium-Ion Batteries. A Machine-Generated Summary of Current Research. New York, NY: Springer, produced with Artificial Intelligence software, prompts analysis and reflection in several areas. The first concerns what Artificial Intelligence systems are able to do in the production of informative texts. This raises the question of whether and how an Artificial Intelligence software system can be treated as the author of a text it has produced. Assessing whether this is correct and possible leads to a re-examination of the current conception, which takes it for granted that the author is a person. This, in turn, when confronted with texts produced by AI systems, necessarily raises the question of whether such systems, like the author-person, are endowed with agency. The article concludes that Artificial Intelligence systems are characterised by a distributed agency, shared with those who designed them and operate them, and that a new type of author must be defined and recognised.

Keywords
Author, Artificial Intelligence, Book Production, Agency

1. Introduction
In 2019, Springer published in print and digital a volume entitled «Lithium-Ion Batteries. A Machine-Generated Summary of Current Research» [1], whose main feature is that it was produced by means of an ad hoc Artificial Intelligence system, so much so that the author was named «Beta Writer». The appearance on the publishing scene of a software author brings a final missing element to the consideration of digital libraries, which until now assumed human authorship for publications. The software author destabilizes the existing principles on which cataloguing is based, and the production of a fully digital book is the first step towards a fully digital library.
It is no longer possible to simply state that «digital libraries are libraries»[2], or that «a digital library is an online collection of digital objects, of assured quality, supported by services necessary to allow users to retrieve and exploit the resources», or that «a digital library forms an integral part of the services of a library»[3]: these statements are different ways of normalizing the concept of digital library by framing it within the familiar concept of physical library. At the origin of the physical library lies the human authorship of analogue publishing products; the products evolve and become digital, but they always have a physical person as author: without the author there is no product, without the author there is no library. But the software author, fully consistent with a completely digital library, disrupts this established conceptual and operational structure. Through the issue of authorship, well-known and much-discussed problems emerge: «do Artificial Intelligence systems have agency?», i.e., do they have the ability to act subjectively in a free way in a given context? «do Artificial Intelligence systems understand the world?», i.e., are they capable of operating appropriately both syntactically and semantically²? Here we will focus on questions of agency that arise from the productive activity we have alluded to.

IRCDL 2022: 18th Italian Research Conference on Digital Libraries, February 24–25, 2022, Padova, Italy
maurizio.lana@uniupo.it (M. Lana)
0000-0002-7520-1195 (M. Lana)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

1 This contribution constitutes an extended abstract of the article "Artificial Intelligence Systems and Problems of the Author Concept. Reflections on recent publishing products" to be published in May 2022 on JLIS.it.
Beta Writer's book does not constitute a perfect example of agency, as will be seen. But the context in which it was generated (the publisher-author relationship) presents it as such, and so we will start from this assumption.

2. Beta Writer. 2019. «Lithium-Ion Batteries. A Machine-Generated Summary of Current Research». New York, NY: Springer.
The press release announcing the book [7] explained that it contains a review of scientific articles on developments in lithium battery research, a review described as «machine-generated» and «automatically compiled by an algorithm», two substantially equivalent expressions. Springer's partner in this activity was a group of researchers from the «Applied Computational Linguistics Lab» of the Goethe Universität Frankfurt³. It is clear that this was an editorial choice by the publisher, who intended to draw attention to the product (the book) also, and especially, outside the circles of Artificial Intelligence experts.

2.1. Technical aspects of book production
In the Introduction of «Lithium-Ion Batteries. A Machine-Generated Summary of Current Research» [9], Christian Chiarcos and Niko Schenk of the «Applied Computational Linguistics Lab» at Goethe Universität discuss the procedure of generating (writing) the book and the selection of sources:
«we decided for a relatively conservative approach, a workflow based on 1. document clustering and ordering, 2. extractive summarization, and 3. paraphrasing of the generated extracts.»
There are three operational steps. The first is «document clustering», not something like «searching and extracting articles from SpringerLink», which is taken as an obvious precondition, a level 0. The raison d'être of the book, in a field with a very large scientific production, is to create a coherent structure of content: to organize the sources by theme («clustering and ordering») and to write, for each theme, an introductory summary to the chapter.
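As a purely illustrative sketch, and not the pipeline actually used for the Beta Writer book, the first two steps of the workflow quoted above can be rendered in a few lines of Python. Everything here is an assumption made for the example: the function names (`tokenize`, `cosine`, `cluster`, `extract_summary`), the toy corpus `DOCS`, the similarity threshold of 0.3, and the choice of a simple bag-of-words cosine similarity with greedy grouping and frequency-based sentence scoring. The third step, paraphrasing, is left out.

```python
import math
import re
from collections import Counter

# Toy corpus, invented for this example only.
DOCS = [
    "Lithium anode materials improve battery capacity. Anode coatings reduce degradation.",
    "New anode materials for lithium battery cells. Coatings protect the anode.",
    "Solid electrolyte safety in thermal runaway. Electrolyte additives improve safety.",
]


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.3):
    """Step 1 (hypothetical): greedy grouping by lexical similarity.

    A document joins the first cluster whose first member is similar
    enough to it; otherwise it starts a new cluster.
    """
    bags = [Counter(tokenize(d)) for d in docs]
    clusters = []  # each cluster is a list of document indices
    for i, bag in enumerate(bags):
        for members in clusters:
            if cosine(bag, bags[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def extract_summary(text, n_sentences=1):
    """Step 2 (hypothetical): keep the sentences with the highest
    average word frequency, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(tokenize(text))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    return " ".join(sentences[i] for i in sorted(ranked[:n_sentences]))


if __name__ == "__main__":
    # Each cluster plays the role of a "chapter" with an extracted summary.
    for chapter, members in enumerate(cluster(DOCS), start=1):
        text = " ".join(DOCS[i] for i in members)
        print(chapter, members, extract_summary(text))
```

Even in this toy form the sketch makes one point visible: the grouping threshold, the scoring rule and the corpus are all fixed by whoever writes and runs the code, before any text is "generated".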
Chiarcos and Schenk explain:
«In preparation for generating a book, we identify a seed set of source documents as a thematic data basis for the final book, which serve as input to the pipeline. These documents are obtained by searching for keywords in publication titles or by means of meta data annotations.»
The nuance is subtle and lies entirely in the word seed: a set of sources on the topic is identified and used as a seed, as the initial input to the AI software⁴. One would like to know more about this step, but as Henning Schönenberger, director of the «data development» sector at Springer Nature, puts it, «it becomes increasingly difficult to understand how a result has been actually derived»⁵.

Trained with the set of articles identified by humans, the AI software extracted from SpringerLink a collection of 1086 publications selected based on words in the title or metadata, and by year of publication. This collection was then processed to sort and group the sources and thus create the structure of the publication. The developers chose to cluster the publications on the basis of the textual similarity of the documents, which provided the lexical data for the operation⁶ of sorting and grouping the sources: first the "core thematic topics" were identified, which gave rise to the chapters; then, within them, the "subtopics", which gave rise to the sections. The summary index was developed with the intervention of human experts [1].

2 The reflection on syntactic/semantic is at the center of Durante [4], who gives as an example a chess game against a computer: for the human player the game is semantic, that is, the choice of moves is part of an overall strategic vision; for the Artificial Intelligence program it is syntactic, because in response to the human player's move all the possible subsequent legal moves are calculated. Part of the complexity of society's confrontation with AI systems, however, lies in believing that the game is the same. The misunderstanding first manifests itself explicitly in the Dartmouth Program [5]: "the artificial intelligence problem is taken to be that of making a machine behave in ways that would be called intelligent if a human were so behaving", but it is already present in the Turing test, which is presented as an imitation game [6]. The importance of the 1955 "Program" lies in the fact that it is the first formulation of an overall project for Artificial Intelligence, by authors of great importance: McCarthy, Minsky, Rochester and Shannon.
3 In [8] we see that the project that led to the publication of the book is entitled "Schwach überwachte Verfahren zur Bibliographie-analyse", methods for weakly supervised bibliographic analysis. The collaboration that led to the publication of the volume began in 2014 and is part of a framework of various projects of the Lab that have philological, linguistic, and digital humanities imprints.
4 This is the procedure adopted with neural networks: you show the software a type of desired outcome (in this case: the example articles chosen by topic, quality, etc.) so that it produces other similar outcomes (in this case, identifying other articles on the topic). See [10] and [11] as introductions to the topic.
5 It is precisely in reference to these issues that a critical orientation called XAI, eXplainable Artificial Intelligence, is developing, whose purpose is "to make a shift towards more transparent AI. It aims to create a suite of techniques that produce more explainable models while maintaining high performance levels", on which see for example [12], which presents a review of studies on various forms of XAI, or [13]. More accessible, but no less robust, is the article [14], which has many visual components. XAI in turn is an expression of the push by society and scientists towards "ethical AI", for which valid entry points are [15], [16] and [17].

3. Agency and authorship in bibliographic perspective
The author referred to by cataloguing systems is defined as a person (or, alternatively, as an organization, since an organization is made up of people), both in the Italian reference framework:
«For cataloguing purposes, responsibility means the relationship that links a work, or one of its expressions, to one or more persons or corporate bodies that conceived, composed, realised, modified or performed it.» [18]
and, just by way of example for a very different system, in the United States:
«The U.S. Copyright Office will register an original work of authorship, provided that the work was created by a human being.» [19]
Contiguous to the question of who can be, and who is, the author is the question of what the author does, what the author's activity consists of. Dublin Core, by defining "creator" as «an entity (a person, an organization, or a service) primarily responsible for making the resource»⁷, broadens and generalizes the meaning of author, because the horizon widens from the book to the "resource". In the Italian context, the bibliographic reflection on the concept of author takes concrete form in the REICAT rules⁸ which, in the section discussing the various forms of responsibility, always refer to persons: persons variously known or unknown, persons forming groups, or persons hiding their real name, as individuals or as groups⁹. But Artificial Intelligence software is neither a person nor an entity. On the cover of the book on lithium batteries, the character sequence "Beta Writer" is presented in such a way that it can be understood (constructivist approach, cf. [21]) as the name of a personal author, or a pseudonym of a personal author (analogous to "Romain Gary"), or the name of a collective author (analogous to "Luther Blissett"[18])¹⁰. But Beta Writer is none of these things.

6 Specifically, recursive non-hierarchical clustering: PCA (principal component analysis) with a constraint to generate 4 clusters (the chapters) and for each of them subclusters each consisting of the 25 most relevant elements.
7 https://www.dublincore.org/specifications/dublin-core/dcmi-terms/elements11/creator/
8 A critical analysis of the meaning of author in the Italian cultural perspective can start from [20], which, for the present time, highlights the transition from traditional cataloguing to metadata-creation methodologies.
9 One might think of solving the problem by using ISBD-ER, since it is a publication produced with Artificial Intelligence systems; but this is not an electronic publication. It is instead a normal printed publication, even if its production process has been entirely digital except for the final step. Therefore, the problems of authorship identification arise in the traditional context of cataloguing monographs.
10 See [20]: «There are numerous instances where the author appears on the title page of a book with wording that is generic, misleading, or intentionally deceptive.»

4. The agency of Artificial Intelligence systems
A concise presentation of the main themes regarding agency touches on at least three interrelated conceptual structures: individual agency, the concept of agent, and agency theory. The notion of individual agency is «centered on a self with the capacity to effectively act upon the world»[22]. Agency is both the original engine of action and its reflexive product: the subject recognizes himself as endowed with agency (he recognizes himself as an agent) insofar as he is able to effectively act upon the world¹¹. Artificial Intelligence systems can be interpreted as having complete and independent agency (agents who do not have a principal, agents who are fully 'principals of themselves', as we generally conceive of an adult, sentient, intelligent person, without cognitive or physical disabilities), or as having agency shared with other agents (in the context of a relationship in which a principal instructs multiple agents to act), or as having no agency (mechanical tools: a hammer with which you hammer a nail). The first description of Artificial Intelligence ante litteram is Turing's: in his famous 1950 article [6] he speaks of an imitation game and describes the test that the machine will pass when an interlocutor at the keyboard is unable to distinguish between the responses of a machine and those of a human being. This is very close to a classic definition of Artificial Intelligence formulated in the «Dartmouth program» of 1955: «making a machine behave in ways that would be called intelligent if a human were so behaving»¹². In both cases there is no mention of agency, but intelligence is manifested, presumably, in the ability to act.
Taddeo and Floridi in 2018 reformulate the 1955 concept in richer and more nuanced terms, abandoning the idea of imitation:
«a growing resource of interactive, autonomous, self-learning agency, which enables computational artifacts to perform tasks that otherwise would require human intelligence to be executed successfully» [16]
and precisely on the topic of agency they propose a more complex reading:
«the effects of decisions or actions based on AI are often the result of countless interactions among many actors, including designers, developers, users, software, and hardware. This is known as distributed agency. With distributed agency comes distributed responsibility» [16]
That is, the functioning of an Artificial Intelligence system expresses the implications and consequences of the choices made by those who produced it (distributed agency), and all these people are co-responsible (distributed responsibility) with the Artificial Intelligence system for the decisions and actions it takes¹³. This is what emerges from the production process of the "Beta Writer" book, presented here in summary form.

5. Conclusions
The 2019 "Beta Writer event" is a watershed between a before and an after: a before in which the author was unquestionably conceived as a person, with all that this entails from the point of view of cataloguing and bibliographic reflection, and an after in which we have acknowledged that the author can be a mixed constellation of people, software and computers in a constant feedback loop, as Licklider had already written in 1960¹⁴. The quiet certainty that an author is a person has therefore been shaken. This "Beta Writer event" was foretold when Barthes and Foucault, in 1967-1969, announced the death of the author.
The author-constellation is radically different from the author-person, and this makes it necessary to rethink that cataloguing foundation which is the author, recognizing in it the possibility of new complexities and depths. Today we may think that this is reasoning about borderline events, questions that arise exceptionally about events that happen rarely and do not touch the ordinary course of the bibliographic world. We believe, instead, that in the years to come these borderline events will become frequent and then ordinary, and that it is therefore necessary to prepare the conceptual tools to manage them.

11 This is one of the main theses of [23].
12 The passage is usually cited with reference to [24], which published in 2006 an abridged version of the "Proposal for the Dartmouth Summer research project on Artificial Intelligence", conceived and written in 1955 but never previously published in print. [24] is much cited in the scientific literature, but the passage for which it is generally cited is not found there. It is found instead in [5], which contains the full "Proposal".
13 The UN report on Artificial Intelligence [26] points out, among other things, that the extension of the agency of Artificial Intelligence entails a symmetrical reduction of the space of human agency.
14 It is necessary to re-read what Licklider wrote in 1960 about the symbiosis between man and computer [27], because it helps to free oneself from that 'presentism' that leads one to think that in the world of digital technology every moment of the present brings a radical novelty that renders obsolete, and therefore negligible, the theoretical and cultural reflection previously constructed, or that would even make it useless to construct such a reflection, since it would be obsolete at the very moment in which it is formulated.

6. References
[1] Beta Writer: Lithium-ion batteries. A Machine-Generated Summary of Current Research.
Springer, New York, NY (2019). https://doi.org/10.1007/978-3-030-16800-1.
[2] AIB, Gruppo biblioteche digitali: Nuovo Manifesto per le biblioteche digitali, https://www.aib.it/struttura/commissioni-e-gruppi/gruppo-di-lavoro-biblioteche-digitali/2020/82764-nuovo-manifesto-per-le-biblioteche-digitali/, last accessed 2020/06/11.
[3] IFLA, UNESCO: Manifesto for Digital Libraries, https://www.ifla.org/files/assets/digital-libraries/documents/ifla-unesco-digital-libraries-manifesto.pdf (2018).
[4] Durante, M.: Potere computazionale: l’impatto delle ICT su diritto, società, sapere. Meltemi, Milano (2019).
[5] McCarthy, J., Minsky, M.L., Rochester, N., Shannon, C.E.: A proposal for the Dartmouth summer research project on Artificial Intelligence, https://rockfound.rockarch.org/documents/20181/35639/AI.pdf/a6db3ab9-0f2a-4ba0-8c28-beab66b2c062 (1955).
[6] Turing, A.M.: Computing machinery and intelligence. Mind. LIX, 433–460 (1950). https://doi.org/10.1093/mind/LIX.236.433.
[7] Springer Nature: Springer Nature publishes its first machine-generated book, https://www.springer.com/gp/about-springer/media/press-releases/corporate/springer-nature-machine-generated-book/16590126, last accessed 2021/07/01.
[8] Projects and Cooperations - Applied Computational Linguistics Lab Goethe University Frankfurt, Germany, http://www.acoli.informatik.uni-frankfurt.de/projects.html, last accessed 2021/07/01.
[9] Schoenenberger, H., Chiarcos, C., Schenk, N.: Preface. In: Lithium-ion batteries. A Machine-Generated Summary of Current Research. Springer, New York, NY (2019). https://doi.org/10.1007/978-3-030-16800-1.
[10] Hagan, M.T., Demuth, H.B., Beale, M.H.: Neural network design. PWS Pub, Boston (1996).
[11] Wang, S.-C.: Artificial Neural Network. In: Interdisciplinary Computing in Java Programming. pp. 81–100. Springer US, Boston, MA (2003). https://doi.org/10.1007/978-1-4615-0377-4_5.
[12] Adadi, A., Berrada, M.: Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI). IEEE Access. 6, 52138–52160 (2018). https://doi.org/10.1109/ACCESS.2018.2870052.
[13] Barredo Arrieta, A., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., Garcia, S., Gil-Lopez, S., Molina, D., Benjamins, R., Chatila, R., Herrera, F.: Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion. 58, 82–115 (2020). https://doi.org/10.1016/j.inffus.2019.12.012.
[14] Gunning, D., Aha, D.: DARPA’s Explainable Artificial Intelligence (XAI) Program. AIMag. 40, 44–58 (2019). https://doi.org/10.1609/aimag.v40i2.2850.
[15] Floridi, L., Cowls, J., Beltrametti, M., Chatila, R., Chazerand, P., Dignum, V., Luetge, C., Madelin, R., Pagallo, U., Rossi, F.: AI4People—an ethical framework for a good AI society: opportunities, risks, principles, and recommendations. Minds and Machines. 28, 689–707 (2018). https://doi.org/10.1007/s11023-018-9482-5.
[16] Taddeo, M., Floridi, L.: How AI can be a force for good. Science. 361, 751–752 (2018). https://doi.org/10.1126/science.aat5991.
[17] Mittelstadt, B.: Principles alone cannot guarantee ethical AI. Nature Machine Intelligence. 1, 501–507 (2019). https://doi.org/10.1038/s42256-019-0114-4.
[18] ICCU: Regole italiane di catalogazione: REICAT. ICCU, Roma (2009).
[19] U.S. Copyright Office: Compendium of U.S. Copyright Office practices. Washington, DC (2021).
[20] Guerrini, M.: Dalla catalogazione alla metadatazione: tracce di un percorso. Associazione italiana biblioteche, Roma (2020).
[21] Svenonius, E.: The intellectual foundation of information organization. MIT Press, Cambridge, Mass (2000).
[22] Gubrium, J.F., Holstein, J.A.: Individual agency, the ordinary, and postmodern life. Sociological Quarterly. 36, 555–570 (1995). https://doi.org/10.1111/j.1533-8525.1995.tb00453.x.
[23] Mehan, H., Wood, H.: The Reality of Ethnomethodology. John Wiley & Sons, Ltd, New York (1975).
[24] McCarthy, J., Minsky, M.L., Rochester, N., Shannon, C.E.: A proposal for the Dartmouth summer research project on Artificial Intelligence. AI Magazine. 27, 3 (2006).
[25] Yang, G.-Z., Bellingham, J., Dupont, P.E., Fischer, P., Floridi, L., Full, R., Jacobstein, N., Kumar, V., McNutt, M., Merrifield, R., Nelson, B.J., Scassellati, B., Taddeo, M., Taylor, R., Veloso, M., Wang, Z.L., Wood, R.: The grand challenges of Science Robotics. Sci. Robot. 3, eaar7650 (2018). https://doi.org/10.1126/scirobotics.aar7650.
[26] United Nations, Kaye, D.: Report of the Special Rapporteur on the promotion and protection of the right to freedom of opinion and expression. United Nations, New York (2018).
[27] Licklider, J.C.R.: Man-Computer Symbiosis. IRE Trans. Hum. Factors Electron. HFE-1, 4–11 (1960). https://doi.org/10.1109/THFE2.1960.4503259.