Editorial Introduction to Biographical Data in a Digital World 2019

Angel Daza¹, Antske Fokkens¹,², Petya Osenova³, Kiril Simov³, Alexander Popov³, Paul Arthur⁴, Thierry Declerck⁵, Ronald Sluijter⁶, Serge ter Braake⁷ and Eveline Wandl-Vogt⁸

¹ Vrije Universiteit Amsterdam, ² Eindhoven University of Technology, ³ Bulgarian Academy of Sciences, ⁴ Edith Cowan University, ⁵ German Research Center for Artificial Intelligence (DFKI), ⁶ Huygens ING, ⁷ Stichting UvO, ⁸ Austrian Academy of Sciences, Ars Electronica Research Institute knowledge for humanity

Abstract

The third edition of the Biographical Data in a Digital World Conference took place on 5 and 6 September 2019 in Varna, Bulgaria. The conference included nine long and four short presentations and a session with work groups on modeling biographical data. These proceedings include nine full papers that were accepted upon single-blind review.

1 Introduction

The first two editions of Biographical Data in a Digital World gave many researchers a first opportunity to connect with others working on digital biographical resources. They brought together a wide variety of perspectives from historians, librarians, literary scholars, computer scientists and computational linguists. Alongside differences and new angles, we also found commonalities: a shared interest in the richness and variety of the resources, challenges with gaps in the data and data quality, approaches to data representation and horizons waiting to be explored on several levels.

Projects in various countries are steadily making progress, resources are growing and methods for exploring them are improving. At an international level, we find a continuous wish to connect. As a community, we want to exchange ideas and learn from each other, but in particular, we are interested in identifying connections among our resources. This desire forms the main motivation for continuing to organize events in the BD-conference series, which resulted in the third edition of this conference and (after a long, partially Covid-19 related delay) these proceedings.

2 Biographical Data in a Digital World 2019

The third edition of Biographical Data in a Digital World took place in Varna, Bulgaria in September 2019. Presenters could submit either a full paper or an abstract only. Full papers were reviewed by three or four reviewers in a single-blind review process. Abstracts were verified for relevance. We received ten full papers and five abstracts.

Because interdisciplinary venues bring together different publication cultures, researchers from various domains in digital humanities may not be familiar with the workshop and conference proceedings publication culture common in computer science. We therefore opted for three modes of acceptance for full papers: regular acceptance (with a chance to make updates and encouragement to incorporate feedback for the camera-ready version), acceptance with minor revisions (with a requirement to address the main criticism of reviewers, which would be verified by the editors) and major revisions (with an independent review round of the revised paper, which may lead to a decision that the work cannot be included in the proceedings). For all these modes of acceptance, authors could come and present their work. Camera-ready versions were collected after the event, so that authors could also make use of feedback from other participants.

Four papers received a regular accept, three were accepted conditionally based on minor revisions, two papers received a major revision request and one was rejected. All five abstracts were deemed relevant and accepted for presentation (without publication). With one author not being able to make it, there were thirteen presentations around various topics at the event. The authors of all accepted papers submitted revised versions. The minor revision papers were checked by the editorial board, which also formed a new reviewing committee for the two remaining papers. One needed an extra round of minor revisions, after which all completed papers could be included in the proceedings. An overview of the papers is given below.

In addition to the work presented in these papers, various discussions around connecting resources from different countries took place during the event. This resulted in the proposal of the Horizon2020 project InTaVia (https://intavia.eu), which is currently running and brings together researchers from Austria, Denmark, Germany, the Netherlands and Slovenia who met through the Biographical Data conference series.

3 Overview of Papers

The papers in these proceedings cover three themes: i) building digital biographical resources, ii) applying NLP tools for biographical data mining, and iii) exploring capabilities of digital resources with case studies.

Building Digital Biographical Resources

Bhreathnach et al. (2019) identify existing biases, features and omissions in the Irish biographical database Ainm (https://www.ainm.ie/Info.aspx?Topic=resources.en). They analyze the distribution of biographies in terms of people's lifespans, gender and birthplace, as well as the professions present in the resource. They found, for example, that even though the resource spans the 15th to 21st centuries, the database has a heavy bias towards people living in the 19th and 20th centuries (around 75% of the subjects). This work also quantitatively confirms an already acknowledged bias towards people from certain regions and professions.

Hyvönen et al. (2019) argue for shifting towards Linked Data as the paradigm for publishing and using biographical dictionaries. To support their arguments, they describe the data service and semantic portal BiographySampo (https://seco.cs.aalto.fi/projects/biografiasampo/en/), where biography texts are enriched with 16 external data sources. Their paper outlines how reasoning over the structured graph can be used to expand and discover serendipitous relations among entities, persons, and places.

Povroznik (2019) studies the collective portraits of deputies of local self-government in Russia in the second half of the 19th century. With this goal in mind, the paper describes the problems of searching, organizing, modelling, analyzing and presenting data in the process of building a useful database for prosopographical research.

Vogeler et al. (2019) discuss the viability of an international prosopographical framework. The authors propose a set of data resources, interfaces and analytical tools to integrate and facilitate access to prosopographical resources. Their solution is a data model that can be accessed through a RESTful API. The paper illustrates this idea with a concrete example that applies the proposed methods to resources built for the Austrian Prosopographical database (APIS).

NLP Tools for Biographical Data Mining

Magistry et al. (2019) use Natural Language Processing (NLP) tools to extract information and create a structured data resource based on the Biographical Dictionary of Republican China (BDRC). Their approach also uses the constructed graphs as a means of exploring relationships centered on the education and positions of the people in the database.

Plum et al. (2019) also apply information extraction methods, in their case to identify relevant biography candidates in large databases. The automatic extraction methods are run on two popular data sources, Wikipedia and Wikidata, for the case study of people who had an impact in the Republic of Austria and died between 1951 and 2019. The authors conclude that their NLP pipeline can help to identify suitable candidates and extract relevant information.

Exploring Capabilities of Digital Resources with Case Studies

Filipov et al. (2019) present a visualization tool for interactive analysis of multiple biographical timelines, relating different biographies through a specific set of events and locations. Specifically, they show the example of biographies connected to Austrian music history. Through their visualization technique, new potential narratives can be created and contextualized, adding an extra layer to the process of historical research.

Koho et al. (2019) introduce a database of Finnish prisoners of war in the Soviet Union. The project targets researchers of military history. It aims at centralizing all information available from different data sources and linking it to each prisoner. This database facilitates research that explores the details, relations and subgroups of each individual biography.

Mayr et al. (2019) describe research that is part of the Polycube project. Polycube's goal is to visualize multiple data dimensions such as space, categories and relations over time. This paper raises in particular the question of how visualizations of life and work go together. The authors answer it with the case study of Charles W. Cushman, showing how these visual-analytical frames of reference could provide a multimodal, narrative framework for biographical knowledge exploration and communication.

We would like to thank the program committee members for their careful and critical reviews, which supported the selection process and helped authors improve their papers. We would also like to thank the other members of the editorial board for their input during discussions and the careful checks and reviews for the conditionally accepted papers. Many special thanks go to the local organizers, Petya Osenova, Kiril Simov and Alexander Popov, for an excellent job resulting in an impeccably organized conference. Finally, we would like to thank everyone involved for their patience and understanding for the delays in bringing out these proceedings.

Angel Daza & Antske Fokkens, Chief Editors.

Full Editorial Board

Angel Daza, Vrije Universiteit Amsterdam
Antske Fokkens, Vrije Universiteit Amsterdam & Eindhoven University of Technology
Petya Osenova, Bulgarian Academy of Sciences
Kiril Simov, Bulgarian Academy of Sciences
Alexander Popov, Bulgarian Academy of Sciences
Paul Arthur, Edith Cowan University
Serge ter Braake, Stichting UvO
Thierry Declerck, DFKI Saarland
Ronald Sluijter, Huygens ING
Eveline Wandl-Vogt, Austrian Academy of Sciences, Ars Electronica Research Institute knowledge for humanity

Program Committee

Peter Bol, Harvard University
Serge ter Braake, Stichting UvO
Thierry Declerck, DFKI Saarland
Lonneke Geerlings, Stichting UvO
Gernot Howanitz, Universität Innsbruck
Anders Ingram, University of Oxford
Bärbel Kröger, Akademie der Wissenschaften zu Göttingen
Eetu Mäkelä, Aalto University
Lodewijk Petram, Huygens ING
Katharina Prager, Ludwig Boltzmann Institute for Digital History
Matthias Reinert, Neue Deutsche Biographie, Germany
Matthias Schlögl, Österreichische Akademie der Wissenschaften
Kiril Simov, Bulgarian Academy of Sciences
Ronald Sluijter, Huygens ING
Petra Vide Ogrin, Slovenian Academy of Sciences and Arts
Georg Vogeler, Österreichische Akademie der Wissenschaften
Marcos Zampieri, Rochester Institute of Technology
Kalliopi Zervanou, Leiden University
Joris van Zundert, Huygens ING

4 References

Úna Bhreathnach, Cathal Burke, Jeaic Mag Fhinn, Gearóid Ó Cleircín, and Brian Ó Raghallaigh. 2019. A quantitative analysis of biographical data from Ainm, the Irish-language biographical database. In Proceedings of the Third Conference on Biographical Data in a Digital World 2019, Varna, Bulgaria.
CEUR.

Velitchko Filipov, Nathalie Soursos, Viktor Schetinger, Susana Zapke, and Silvia Miksch. 2019. Exiled but not forgotten: Investigating commemoration of musicians in Vienna after 1945 through visual analytics. In Proceedings of the Third Conference on Biographical Data in a Digital World 2019, Varna, Bulgaria. CEUR.

Eero Hyvönen, Petri Leskinen, Minna Tamper, Heikki Rantala, Esko Ikkala, Jouni Tuominen, and Kirsi Keravuori. 2019. Linked data – a paradigm shift for publishing and using biography collections on the Semantic Web. In Proceedings of the Third Conference on Biographical Data in a Digital World 2019, Varna, Bulgaria. CEUR.

Mikko Koho, Esko Ikkala, and Eero Hyvönen. 2019. Reassembling the lives of Finnish prisoners of the Second World War on the Semantic Web. In Proceedings of the Third Conference on Biographical Data in a Digital World 2019, Varna, Bulgaria. CEUR.

Pierre Magistry, Cécile Armand, and Christian Henriot. 2019. Mining the Biographical Dictionary of Republican China. From print to network exploration. In Proceedings of the Third Conference on Biographical Data in a Digital World 2019, Varna, Bulgaria. CEUR.

Eva Mayr, Saminu Salisu, Velitchko A. Filipov, Günther Schreder, Roger A. Leite, Silvia Miksch, and Florian Windhager. 2019. Visualizing biographical trajectories by historical artifacts: A case study based on the photography collection of Charles W. Cushman. In Proceedings of the Third Conference on Biographical Data in a Digital World 2019, Varna, Bulgaria. CEUR.

Alistair Plum, Marcos Zampieri, Constantin Orăsan, Eveline Wandl-Vogt, and Ruslan Mitkov. 2019. Large-scale data harvesting for biographical data. In Proceedings of the Third Conference on Biographical Data in a Digital World 2019, Varna, Bulgaria. CEUR.

Nadezhda Povroznik. 2019. Reconstructing data for modelling collective biography: A case of zemstvo deputies in Russia in the second half of XIX century. In Proceedings of the Third Conference on Biographical Data in a Digital World 2019, Varna, Bulgaria. CEUR.

Georg Vogeler, Gunter Vasold, and Matthias Schlögl. 2019. Data exchange in practice: Towards a prosopographical API. In Proceedings of the Third Conference on Biographical Data in a Digital World 2019, Varna, Bulgaria. CEUR.