Proceedings of the
Workshop on Computational Methods in the
Humanities 2022 (COMHUM 2022)
University of Lausanne, June 9–10, 2022


Preface
At the turn of the 2020s, a defining characteristic of digital humanities remains the remarkably wide
spectrum of viewpoints they encompass, ranging from a pure engineering perspective applied to
humanities data to the use of well-established humanities research methods to investigate born-digital
artifacts. In this framework, the COMHUM workshop series positions itself as an international forum
primarily devoted to the following research questions:

1. Which computational methods are most appropriate for dealing with the particular challenges posed
   by humanities research, e.g., uncertainty, vagueness, incompleteness, but also with different
   positions (points of view, values, criteria, perspectives, approaches, readings, etc.)?
2. How can such computational methods be applied to concrete research questions in the humanities?

   The second edition of the Workshop on Computational Methods in the Humanities (COMHUM 2022)
took place on June 9 and 10, 2022, at the University of Lausanne (UNIL) [1].
   The first day, introduced by a keynote from Vincent Labatut (Avignon University), was devoted
to the specific topic of computational methods for constructing and analyzing character networks.
This topic has ramifications in a variety of disciplines, including linguistics, literary analysis, digital
humanities, and game studies. It is of particular interest for a number of research initiatives at UNIL
and in neighboring institutions. Our goal was to bring together researchers from different communities
studying character networks using computational and methodologically explicit approaches, to review
the state of the art in this domain and to sketch its future developments.
   In the spirit of the first edition of the COMHUM workshop, the second day was open to submissions
on any topic pertaining to theoretical or applied research on computational methods for humanities
research broadly conceived.
   We invited researchers to submit abstracts of 500 to 1000 words. These abstracts were reviewed in a
double-blind fashion by members of the program committee; all submissions received several
independent reviews. Of the 23 submissions we received, we accepted 18. Authors of accepted abstracts
presented their research at the workshop as talks; the abstracts are published on the workshop Web
site [2].
   The COMHUM 2022 workshop was organized by members of the Lausanne Lab for Computational
and Statistical Text Analysis (LLIST) [3]: François Bavaud, Guillaume Guex, Coline Métrailler (co-chair),
Davide Picca, Michael Piotrowski, Yannick Rochat (chair), and Aris Xanthos, with assistance from
Stéphanie Pichot. It was hosted by the Department of Language and Information Sciences [4] and
received support from the Center for Linguistics and the Science of Language [5], both part of the Faculty
of Arts of the University of Lausanne.
   This volume constitutes the proceedings of COMHUM 2022. It contains revised and expanded
versions of seven of the papers presented at the workshop. Both the original abstracts and the expanded
versions were peer-reviewed by at least two members of the program committee (see below); the seven
revised and expanded papers that constitute these proceedings underwent another round of reviews.
   We would like to thank everybody who contributed, in one way or another, to making COMHUM 2022
a success.

                                                                                                   Yannick Rochat
                                                                                                 Coline Métrailler
                                                                                                Michael Piotrowski
                                                                                                          (editors)

1. See https://ceur-ws.org/Vol-2314/ for the proceedings of the first edition, COMHUM 2018.
2. https://wp.unil.ch/llist/en/event/comhum2022/
3. https://wp.unil.ch/llist/
4. https://www.unil.ch/sli/
5. https://www.unil.ch/clsl/


Program Committee
• François Bavaud, University of Lausanne (Switzerland)
• Giovanni Colavizza, University of Bologna (Italy)
• Guillaume Guex, University of Lausanne (Switzerland)
• Vincent Labatut, Avignon University (France)
• Coline Métrailler (co-chair), University of Lausanne (Switzerland)
• Fabian Moss, Julius-Maximilians-Universität of Würzburg (Germany)
• Davide Picca, University of Lausanne (Switzerland)
• Michael Piotrowski, University of Lausanne (Switzerland)
• Yannick Rochat (chair), University of Lausanne (Switzerland)
• Elena Spadini, University of Basel (Switzerland)
• Mathieu Triclot, University of Technology of Belfort-Montbéliard (France)
• Aris Xanthos, University of Lausanne (Switzerland)

  Please note that some members of the program committee have changed affiliation since the
workshop; we have chosen to list them with their current affiliation at the publication date of this volume.


Contents
Modelling Usage Information in a Legacy Dictionary: From TEI Lex-0 to Ontolex-Lemon
   Bruno Almeida, Rute Costa, Ana Salgado, Margarida Ramos, Laurent Romary, Fahad Khan, Sara
   Carvalho, Mohamed Khemakhem, Raquel Silva, Toma Tasovac (p. 5)
Data Augmentation for Robust Character Detection in Fantasy Novels
   Arthur Amalvy, Vincent Labatut, Richard Dufour (p. 23)
A (Dis)similarity Index for Comparing Two Character Networks Based on the Same Story
   François Bavaud, Coline Métrailler (p. 33)
A Framework for Embedding Entities in a Textual Narrative: a Case Study on Les Misérables
   Guillaume Guex (p. 43)
Introducing VISU: Vagueness, Incompleteness, Subjectivity, and Uncertainty in Art Provenance Data
   Fabio Mariani (p. 63)
Narrative Flow: A Formal Distant Reading Approach for Interactive Narratives
   Coline Métrailler (p. 85)
Exploring Naming Inventories for Architectural Elements for Use in Multi-modal Machine Learning
   Applications
   Ronja Utescher, Aaron Pattee, Ferdinand Maiwald, Jonas Bruschke, Stephan Hoppe, Sander Münster,
   Florian Niebling, Sina Zarrieß (p. 95)

