Artificial Intelligence Systems Producing Books: Questions of
Agency1
Maurizio Lana
Università del Piemonte Orientale, piazza Roma 36, 13100 Vercelli, Italy

                 Abstract
                 The publication of the book Beta Writer. 2019. Lithium-Ion Batteries. A Machine-Generated
                 Summary of Current Research. New York, NY: Springer, produced with Artificial Intelligence
                 software, prompts analysis and reflection in several areas. The first concerns what Artificial
                 Intelligence systems are able to do in the production of informative texts. This raises the
                 question of whether and how an Artificial Intelligence software system can be treated as the
                 author of a text it has produced. Assessing whether this is correct and possible leads to a
                 re-examination of the current conception whereby it is taken for granted that the author is a person.
                 This, in turn, in the face of texts produced by AI systems, necessarily raises the question of whether
                 they, like the author-person, are endowed with agency. The article concludes that Artificial
                 Intelligence systems are characterised by a distributed agency, shared with those who designed
                 them and operate them, and that a new type of author must be defined and recognised.

                 Keywords
                 Author, Artificial Intelligence, Book Production, Agency

1. Introduction
    In 2019, Springer published, in print and digital form, a volume entitled «Lithium-Ion Batteries. A
Machine-Generated Summary of Current Research» [1] whose main feature is that it was produced by
means of an ad hoc Artificial Intelligence system, so much so that the author was named «Beta Writer».
    The appearance on the publishing scene of a software author brings a final missing element to the
consideration of digital libraries, which until now assumed human authorship for publications. The
software author destabilizes the existing principles on which cataloguing is based, and the production
of a fully digital book is the first step towards a fully digital library. It is no longer possible to simply
state that «digital libraries are libraries»[2], or that «a digital library is an online collection of digital
objects, of assured quality, supported by services necessary to allow users to retrieve and exploit the
resources», or that «a digital library forms an integral part of the services of a library»[3]: these
statements imply different ways of normalizing the concept of digital library by framing it within the
familiar concept of physical library.
    At the origin of the physical library there is the human authorship of analogue publishing products.
The products evolve and become digital, but always have a physical person as author: without the author
there is no product, without the author there is no library. But the software author, fully consistent with
a completely digital library, disrupts this established conceptual and operational structure.
    Through the issue of authorship, well-known problems emerge, which are much discussed: «do
Artificial Intelligence systems have agency?» i.e., do they have the ability to act subjectively in a free
way in a given context? «do Artificial Intelligence systems understand the world?» i.e., are they capable
of operating appropriately both syntactically and semantically2?


IRCDL 2022: 18th Italian Research Conference on Digital Libraries, February 24–25, 2022, Padova, Italy
   maurizio.lana@uniupo.it (M. Lana)
   0000-0002-7520-1195 (M. Lana)
              ©️ 2022 Copyright for this paper by its authors.
              Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
              CEUR Workshop Proceedings (CEUR-WS.org)
1
 This contribution constitutes an extended abstract of the article "Artificial Intelligence Systems and Problems of the Author Concept.
Reflections on recent publishing products" to be published in May 2022 on JLIS.it.
    Here we will focus on questions of agency that arise from the productive activity we have alluded
to. Beta Writer's book does not constitute a perfect example of agency, as will be seen. But the context
in which it was generated (the publisher-author relationship) presents it as such, and so we will start
from this assumption.

2. Beta Writer. 2019. «Lithium-Ion Batteries. A Machine-Generated Summary
   of Current Research». New York, NY: Springer.
    The press release announcing the release of the book [7] explained that it contains a review of
scientific articles on developments in lithium battery research, a review defined as «machine-generated»
and «automatically compiled by an algorithm», two expressions that are substantially similar. Springer's
partner in this activity was a group of researchers from the «Applied Computational Linguistics Lab»
of the Goethe Universität Frankfurt3. This was clearly an editorial choice by the publisher, who
intended to draw attention to the product (the book) beyond, and especially beyond, the circles of
Artificial Intelligence experts.

2.1.       Technical aspects of book production
    In the Introduction of «Lithium-Ion Batteries. A Machine-Generated Summary of Current
Research» [9], Christian Chiarcos and Niko Schenk of the «Applied Computational Linguistics Lab» at
Goethe Universität discuss the procedure of generating (writing) the book and the selection of sources:
we decided for a relatively conservative approach, a workflow based on
    1. document clustering and ordering,
    2. extractive summarization, and
    3. paraphrasing of the generated extracts.
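
    The second step of the workflow can be made concrete with a minimal sketch. It assumes a simple
    word-frequency scoring of sentences, which is only one of many possible techniques and is not
    claimed to be the method used by the developers of Beta Writer; the function names and the
    stopword list are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the source).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are",
             "for", "on", "with", "that", "this", "by"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def extractive_summary(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order.

    A sentence's score is the average document-wide frequency of its words,
    so sentences that repeat the dominant vocabulary are preferred.
    """
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore document order
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    sample = ("Lithium batteries store energy. "
              "Lithium batteries degrade over time. "
              "The weather was nice.")
    print(extractive_summary(sample, max_sentences=2))
```

    The sketch only selects existing sentences: it is the third step, paraphrasing, that would turn
    such extracts into new wording.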
    There are three operational steps. The first is «document clustering», not something like «searching
and extracting articles from SpringerLink», which is an obvious prerequisite, a level 0. The raison d'être of
the book, in a field where there is a very large scientific production, is to create a coherent structure of
content: to organize the sources by themes («clustering and ordering») and, for each theme, to write an
introductory summary to each chapter. Chiarcos and Schenk explain:

    In preparation for generating a book, we identify a seed set of source documents as a thematic data
    basis for the final book, which serve as input to the pipeline. These documents are obtained by
    searching for keywords in publication titles or by means of meta data annotations.
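
    The selection described in this passage can be illustrated with a minimal sketch. It assumes that
    each publication record carries a title, a year, and a list of metadata keywords; the `Publication`
    class, the `select_seed` function, and the optional `min_year` filter are hypothetical names
    introduced here for illustration and are not taken from the Beta Writer pipeline.

```python
from dataclasses import dataclass, field


@dataclass
class Publication:
    """Hypothetical minimal record for one source document."""
    title: str
    year: int
    keywords: list = field(default_factory=list)


def select_seed(publications, query_terms, min_year=None):
    """Keep publications whose title or metadata keywords contain a query term.

    Matching is a case-insensitive substring test; `min_year`, when given,
    drops older publications.
    """
    terms = [t.lower() for t in query_terms]
    seed = []
    for pub in publications:
        if min_year is not None and pub.year < min_year:
            continue
        searchable = " ".join([pub.title, *pub.keywords]).lower()
        if any(term in searchable for term in terms):
            seed.append(pub)
    return seed


if __name__ == "__main__":
    catalogue = [
        Publication("Anode materials for lithium-ion batteries", 2017),
        Publication("Graph theory", 2018, ["combinatorics"]),
        Publication("Electrolyte design", 2016, ["lithium-ion battery"]),
    ]
    for pub in select_seed(catalogue, ["lithium-ion"]):
        print(pub.year, pub.title)
```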

   The nuance is subtle and is all in the word seed: we identify a set of sources on the topic and use
them as a seed, as the initial input to the AI software4. One would like to know more about this step,
but as Henning Schönenberger, director of the sector «data development» in Springer Nature, puts it «it
becomes increasingly difficult to understand how a result has been actually derived»5.
2
  The reflection on syntactic/semantic is at the center of Durante [4], who gives as an example a chess game against a computer: for the human
player the game is semantic, that is, the choice of moves is part of an overall strategic vision, for the Artificial Intelligence program it is
syntactic because in response to the move of the human player all the possible subsequent legal moves are calculated. Part of the complexity
of confronting society with AI systems, however, is believing that the game is the same. The misunderstanding manifests itself for the first
time explicitly in the Dartmouth Program [5]: "the artificial intelligence problem is taken to be that of making a machine behave in ways that
would be called intelligent if a human were so behaving" but it is already present in the Turing test which is presented as an imitation game
[6]. The importance of the 1955 "Program" is given by the fact that it is the first formulation of an overall project for Artificial Intelligence,
by authors of great importance: McCarthy, Minsky, Rochester and Shannon.
3
  In [8] we see that the project that led to the publication of the book is entitled "Schwach überwachte Verfahren zur Bibliographie-analyse",
methods for weakly supervised bibliographic analysis. The collaboration that led to the publication of the volume had been initiated in 2014
and is part of a framework of various projects of the Lab that have philological, linguistic, and digital humanities imprints.
4
  This is the procedure adopted with neural networks: you show the software a type of desired outcome (in this case: the example articles
chosen by topic, quality, etc.) so that it produces other similar outcomes (in this case, identifying other articles on the topic). See [10] and [11]
as introductions to the topic.
5
  It is precisely in reference to these issues that a critical orientation called XAI, eXplainable Artificial Intelligence, is developing whose
purpose is "to make a shift towards more transparent AI. It aims to create a suite of techniques that produce more explainable models while
    Trained with the set of articles identified by humans, the AI software extracted from SpringerLink
a collection of 1086 publications selected based on words in the title or metadata, and by year of
publication. This collection was then processed to sort and group the sources and thus create the
structure of the publication. The developers chose to cluster the publications on the basis of textual
similarity of the documents, which provided the lexical data for the operation6 of sorting and grouping
the sources: first the "core thematic topics" were identified, which gave rise to the chapters; then within
them the "subtopics", the sections. The summary index was developed with the intervention of human
experts [1].
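
    Grouping documents by textual similarity can be sketched as follows. The example assumes a
    bag-of-words representation, cosine similarity, and a greedy threshold rule for assigning each
    document to a group; this is a deliberately simple stand-in and not the PCA-based procedure
    adopted by the developers. The function names and the threshold value are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Word-count vector of a text (lowercase alphabetic tokens)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def cluster_by_similarity(docs, threshold=0.3):
    """Greedy grouping: a document joins the most similar existing cluster
    if the similarity reaches `threshold`; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [int]}
    for index, doc in enumerate(docs):
        vector = bag_of_words(doc)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vector, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(index)
            best["centroid"].update(vector)  # add the new word counts
        else:
            clusters.append({"centroid": Counter(vector), "members": [index]})
    return [cluster["members"] for cluster in clusters]


if __name__ == "__main__":
    titles = [
        "cathode materials for lithium batteries",
        "electrolyte safety and thermal stability",
        "new cathode materials improve lithium batteries",
        "thermal stability of the electrolyte",
    ]
    print(cluster_by_similarity(titles))
```

    In such a sketch each resulting group would play the role of a chapter, and applying the same
    function again inside a group would yield its sections.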

3. Agency and authorship in bibliographic perspective
   The author referred to by the cataloguing systems is defined as a person (or, alternatively, as an
organization, since an organization is made up of people), both in the Italian reference framework:

    Per responsabilità, ai fini catalografici, si intende la relazione che lega un’opera o una delle sue
    espressioni a una o più persone o enti che l’hanno concepita, composta, realizzata, modificata o
    eseguita. [18]

    and, just by way of example for a very different system, in the United States:

    The U.S. Copyright Office will register an original work of authorship, provided that the work was
    created by a human being. [19].

    Contiguous to the question of who can be the author/who is the author, is the question of what the
author does, what his activity consists of. Dublin Core, by defining "creator" as «an entity (a person, an
organization, or a service) primarily responsible for making the resource»7, broadens and generalizes
the meaning of author because the horizon widens from the book to the "resource".
    In the Italian context, the bibliographic reflection on the concept of author is concretized and
expressed in the REICAT rules8 which, in the section in which the various forms of
responsibility are discussed, always refer to persons: variously known or unknown persons, persons
forming groups, or persons hiding their real name as individuals or as groups9. But Artificial Intelligence
software is neither a person nor an entity. On the cover of the book on lithium batteries, the character
sequence "Beta Writer" is presented in such a way that it can be understood (constructivist approach,
cf. [21]) as the name of a personal author, or a pseudonym of a personal author (analogous to "Romain
Gary") or the name of a collective author (analogous to "Luther Blissett"[18])10. But Beta Writer is none
of these things.

4. The agency of Artificial Intelligence systems
   A concise presentation of the main themes regarding agency touches on at least 3 interrelated
conceptual structures: individual agency, the concept of agent, and agency theory. The notion of
individual agency is «centered on a self with the capacity to effectively act upon the world»[22]. Agency
is both the original engine of action and its reflexive product: the subject recognizes himself as endowed
with agency (he recognizes himself as an agent) insofar as he is able to effectively act upon the world11.


maintaining high performance levels" on which see for example [12] which presents a review of studies on various forms of XAI or [13].
Friendly, but no less robust, is the article by [14] which has many visual components. XAI in turn is an expression of society's and scientists'
push towards "ethical AI", for which valid entry points are [15], [16] and [17].
6
  Specifically, recursive non-hierarchical clustering: PCA (principal component analysis) with a constraint to generate 4 clusters (the chapters)
and for each of them subclusters each consisting of the 25 most relevant elements.
7
  https://www.dublincore.org/specifications/dublin-core/dcmi-terms/elements11/creator/
8
   A critical analysis of the meaning of author in the Italian cultural perspective can start from [20], which highlights the present-day
transition from traditional cataloguing to methodologies of metadata creation.
9
  One might think of solving the problem by using ISBD-ER, since this is a publication produced with
Artificial Intelligence systems; but it is not an electronic publication. It is instead a normal printed publication, even if its production process
has been entirely digital except for the final step. Therefore, the problems of authorship identification arise in the traditional context of
cataloguing monographs.
10
   See [20]: « There are numerous instances where the author appears on the title page of a book with wording that is generic, misleading, or
intentionally deceptive.»
    Artificial Intelligence systems can be interpreted as having complete and independent agency
(agents who do not have a principal, agents who are fully 'principals of themselves', as we generally
conceive of an adult, sentient, intelligent person, without cognitive or physical disabilities), or as having
agency shared with other agents (in the context of a relationship in which a principal instructs multiple
agents to act), or as having no agency (mechanical tools: a hammer with which you hammer a nail).
The first description of Artificial Intelligence ante litteram is that of Turing, who in 1950 speaks of an
imitation game in the famous article[6] describing a test that the machine passes when an
interlocutor at the keyboard is unable to distinguish between the responses of a machine and those of
a human being. This is very close to a classic definition of Artificial Intelligence formulated in
the «Dartmouth program» of 1955: «making a machine behave in ways that would be called intelligent
if a human were so behaving»12. In both cases there is no mention of agency, but intelligence is
manifested, presumably, in the ability to act. Taddeo and Floridi in 2018 reformulate the 1955 concept
in richer and more nuanced terms, abandoning the idea of imitation:

     a growing resource of interactive, autonomous, self-learning agency, which enables computational
     artifacts to perform tasks that otherwise would require human intelligence to be executed
     successfully [16]

     and precisely on the topic of agency they propose a more complex reading:

     the effects of decisions or actions based on AI are often the result of countless interactions among
     many actors, including designers, developers, users, software, and hardware. This is known as
     distributed agency. With distributed agency comes distributed responsibility [16]

   That is, the functioning of an Artificial Intelligence system expresses the implications and
consequences of the choices made by those who produced it (distributed agency), and all these people
are co-responsible (distributed responsibility) with the Artificial Intelligence system for the decisions
and actions it takes13. This is what emerges from the production process of the "Beta Writer" book
outlined above.

5. Conclusions
    The 2019 "Beta Writer event" is a watershed between a before and an after: a before in which the
author was unquestionably conceived as a person, with all that this entails from the point of view of
cataloguing and bibliographic reflection, and an after in which we have acknowledged that the author
can be a mixed constellation of people, software and computers in a constant feedback loop, as
already described in 1960 by Licklider14. The quiet certainty that an author is a person has therefore
been shaken. This "Beta Writer event" was foretold when Barthes and Foucault in 1967-1969
announced the death of the author. The author-constellation is radically different from the author-person
and this makes it necessary to rethink that cataloguing foundation which is the author, by recognizing
in it the possibility of new complexities and depths. Today we may think that this is reasoning about
borderline events, questions that arise exceptionally about events that happen rarely and that
do not touch the ordinary course of the bibliographic world. We believe, instead, that in the years to
come these limit events will become frequent and then ordinary, and that it is therefore necessary to
prepare the conceptual tools to manage them.


11
   This is one of the main theses of [23].
12
   The passage is usually cited with reference to [24] who published in 2006 an abridged version of the "Proposal for the Dartmouth Summer
research project on Artificial Intelligence" conceived and written in 1955 but never previously published in print. [24] is much cited in the
scientific literature, but the passage for which it is generally cited is not found there. It is found instead in [5] which contains the full "Proposal".
13
   The UN report on Artificial Intelligence [26] points out, among other things, that the extension of the agency of Artificial Intelligence entails
a symmetrical reduction of the space of human agency.
14
   It is necessary to re-read what Licklider wrote in 1960 about the symbiosis between man and computer [27] because it helps to free oneself
from that 'presentism' that leads one to think that in the world of digital technology every moment of the present brings a radical novelty that
renders obsolete and therefore negligible the theoretical and cultural reflection previously constructed. Or perhaps it would even make it
useless to construct a theoretical/cultural reflection since it would be obsolete at the very moment in which it is formulated.

6. References
[1] Beta Writer: Lithium-ion batteries. A Machine-Generated Summary of Current Research.
     Springer, New York, NY (2019). https://doi.org/10.1007/978-3-030-16800-1.
[2] AIB, Gruppo biblioteche digitali: Nuovo Manifesto per le biblioteche digitali,
     https://www.aib.it/struttura/commissioni-e-gruppi/gruppo-di-lavoro-biblioteche-digitali/2020/82764-nuovo-manifesto-per-le-biblioteche-digitali/,
     last accessed 2020/06/11.
[3] IFLA, UNESCO: Manifesto for Digital Libraries,
     https://www.ifla.org/files/assets/digital-libraries/documents/ifla-unesco-digital-libraries-manifesto.pdf (2018).
[4] Durante, M.: Potere computazionale: l’impatto delle ICT su diritto, società, sapere. Meltemi,
     Milano (2019).
[5] McCarthy, J., Minsky, M.L., Rochester, N., Shannon, C.E.: A proposal for the Dartmouth summer
     research project on Artificial Intelligence,
     https://rockfound.rockarch.org/documents/20181/35639/AI.pdf/a6db3ab9-0f2a-4ba0-8c28-beab66b2c062 (1955).
[6] Turing, A.M.: Computing machinery and intelligence. Mind. LIX, 433–460 (1950).
     https://doi.org/10.1093/mind/LIX.236.433.
[7] Springer Nature: Springer Nature publishes its first machine-generated book,
     https://www.springer.com/gp/about-springer/media/press-releases/corporate/springer-nature-machine-generated-book/16590126,
     last accessed 2021/07/01.
[8] Projects and Cooperations - Applied Computational Linguistics Lab Goethe University Frankfurt,
     Germany, http://www.acoli.informatik.uni-frankfurt.de/projects.html, last accessed 2021/07/01.
[9] Schoenenberger, H., Chiarcos, C., Schenk, N.: Preface. In: Lithium-ion batteries. A Machine-
     Generated Summary of Current Research. Springer, New York, NY (2019).
     https://doi.org/10.1007/978-3-030-16800-1.
[10] Hagan, M.T., Demuth, H.B., Beale, M.H.: Neural network design. PWS Pub, Boston (1996).
[11] Wang, S.-C.: Artificial Neural Network. In: Interdisciplinary Computing in Java Programming.
     pp. 81–100. Springer US, Boston, MA (2003). https://doi.org/10.1007/978-1-4615-0377-4_5.
[12] Adadi, A., Berrada, M.: Peeking Inside the Black-Box: A Survey on Explainable Artificial
     Intelligence (XAI). IEEE Access. 6, 52138–52160 (2018).
     https://doi.org/10.1109/ACCESS.2018.2870052.
[13] Barredo Arrieta, A., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., Garcia,
     S., Gil-Lopez, S., Molina, D., Benjamins, R., Chatila, R., Herrera, F.: Explainable Artificial
     Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI.
     Information Fusion. 58, 82–115 (2020). https://doi.org/10.1016/j.inffus.2019.12.012.
[14] Gunning, D., Aha, D.: DARPA’s Explainable Artificial Intelligence (XAI) Program. AIMag. 40,
     44–58 (2019). https://doi.org/10.1609/aimag.v40i2.2850.
[15] Floridi, L., Cowls, J., Beltrametti, M., Chatila, R., Chazerand, P., Dignum, V., Luetge, C., Madelin,
     R., Pagallo, U., Rossi, F.: AI4People—an ethical framework for a good AI society: opportunities,
     risks, principles, and recommendations. Minds and Machines. 28, 689–707 (2018).
     https://doi.org/10.1007/s11023-018-9482-5.
[16] Taddeo, M., Floridi, L.: How AI can be a force for good. Science. 361, 751–752 (2018).
     https://doi.org/10.1126/science.aat5991.
[17] Mittelstadt, B.: Principles alone cannot guarantee ethical AI. Nature Machine Intelligence. 1, 501–
     507 (2019). https://doi.org/10.1038/s42256-019-0114-4.
[18] ICCU: Regole italiane di catalogazione: REICAT. ICCU, Roma (2009).
[19] U.S. Copyright Office: Compendium of U.S. Copyright Office practices. Washington, DC
     (2021).
[20] Guerrini, M.: Dalla catalogazione alla metadatazione: tracce di un percorso. Associazione italiana
     biblioteche, Roma (2020).
[21] Svenonius, E.: The intellectual foundation of information organization. MIT Press, Cambridge,
     Mass (2000).
[22] Gubrium, J.F., Holstein, J.A.: Individual agency, the ordinary, and postmodern life. Sociological
     Quarterly. 36, 555–570 (1995). https://doi.org/10.1111/j.1533-8525.1995.tb00453.x.
[23] Mehan, H., Wood, H.: The Reality of Ethnomethodology. John Wiley & Sons, Ltd, New York
     (1975).
[24] McCarthy, J., Minsky, M.L., Rochester, N., Shannon, C.E.: A proposal for the Dartmouth summer
     research project on Artificial Intelligence. AI Magazine. 27, 3 (2006).
[25] Yang, G.-Z., Bellingham, J., Dupont, P.E., Fischer, P., Floridi, L., Full, R., Jacobstein, N., Kumar,
     V., McNutt, M., Merrifield, R., Nelson, B.J., Scassellati, B., Taddeo, M., Taylor, R., Veloso, M.,
     Wang, Z.L., Wood, R.: The grand challenges of Science Robotics. Sci. Robot. 3, eaar7650 (2018).
     https://doi.org/10.1126/scirobotics.aar7650.
[26] United Nations, Kaye, D.: Report of the Special Rapporteur on the promotion and protection of
     the right to freedom of opinion and expression. United Nations, New York (2018).
[27] Licklider, J.C.R.: Man-Computer Symbiosis. IRE Trans. Hum. Factors Electron. HFE-1, 4–11
     (1960). https://doi.org/10.1109/THFE2.1960.4503259.