Early Modern Book Catalogues and Multilingualism: Identifying Multilingual Texts and Translations using Titles

Yann Ryan yann.ryan@kuleuven.be Faculty of Arts KU Leuven

Blijde-Inkomststraat 21 3000 Leuven Belgium

Margherita Fantoli margherita.fantoli@kuleuven.be Faculty of Arts KU Leuven

Blijde-Inkomststraat 21 3000 Leuven Belgium

ISSN 1613-0073

Keywords: multilingualism, metadata, transformer models, few-shot classification, library catalogues

In this paper we assess whether Early Modern book titles can be exploited to track two aspects of multilingualism in book publishing: publications featuring multiple languages, and the distinction between editions of works in their original language and in translation. To this end we leverage the manually annotated language information available in two book catalogues: the Collectio Academica Antiqua, recording publications of scholars of the Old University of Leuven (1425-1797), and a subset of Eighteenth Century Collections Online, namely publications of Ancient Greek and Latin works. We evaluate three different approaches: we train a simple tf-idf based support vector classifier, we fine-tune a multilingual transformer model (BERT), and we use a few-shot approach with a pre-trained sentence transformer model. To better understand the results, we make use of SHAP, a library for explaining the output of any machine learning model. We conclude that while the few-shot prediction is not currently usable for this task, the tf-idf approach and BERT fine-tuning are comparable and both usable. BERT shows better results for the task of identifying translations and when generalizing across different datasets.

Introduction

Metadata catalogues, particularly library catalogues, are increasingly valuable for reconstructing the cultural and intellectual life of the past [33,18,30,27,19]. These catalogues provide insights into both cultural artefacts and the actors behind the publishing industry, often spanning vast temporal and spatial ranges. Widely implemented metadata schemes such as MARC21 1 and Dublin Core 2 facilitate large-scale mining of these resources. The manual creation of catalogues, relying on experts familiar with the epoch and place covered, as well as cataloguing best practices, ensures their reliability as data sources.

In this paper, we investigate whether machine learning and Large Language Models can support the labelling of Early Modern book records in relation to language. Specifically, we explore the use of titles to identify multilingual publications and to distinguish between works published in their original language and those translated. The full titles recorded in several catalogues of Early Modern books are highly informative regarding the linguistic form of the book's content: they may mention the translator, the language in which the text is printed, and the language from which the text is translated. A typical example is provided by the title 'A poetical translation of the works of Horace: with the original text, and critical notes collected from his best Latin and French commentators. By the Rev. d Mr. Philip Francis. In four volumes. '. This paper aims to answer three research questions:

• RQ1: Do the titles recorded in catalogues of Early Modern books contain sufficient information to predict if they were multilingual or monolingual, and printed in the original language or translated?
• RQ2: Which approach yields the best results: a simple tf-idf classifier, training a Large Language Model, or adopting a few-shot approach?
• RQ3: Given the heterogeneity of Early Modern publications, can models trained on one dataset yield satisfactory results on others? Does the diversification of training data improve the results on the datasets analyzed?

The work is structured as follows: in Section 2, we discuss the importance of multilingualism for Early Modern studies and the current possibilities for automatic language information extraction. Section 3 introduces the two datasets used in this experiment. 3 In Section 4, we describe the tasks (Section 4.1) and models (Section 4.2) employed. Finally, Sections 5 and 6 present the results and discuss the potential of this approach.

Related work

Early Modern Europe was marked by multilingualism. As Latin's dominance as the lingua franca waned, vernacular languages began to emerge in scientific and literary production. This shift influenced various practices in the printed press, drawing interest from linguistics, book history, literary studies, and translation studies [2]. A key focus is the reception of classical texts. During Humanism and the Renaissance, Ancient Greek and Latin gained prominence, and, on the one hand, reading original works became central to humanistic education [22,17]. On the other hand, this interest led to significant translation efforts, impacting the cultural landscape [4,11,20].

This study examines two datasets reflecting aspects of Early Modern multilingualism: the diverse linguistic environment of the Low Countries and the evolving practice of printing classical authors in England. The Low Countries, a multilingual hub due to their political situation [13,38], saw significant scholarly activity around the Old University of Leuven, captured by the catalog Collectio Academica Antiqua (CAA). The CAA features several Ancient Greek and Latin authors, reflecting the high value placed on classics in the Low Countries' learned society, as exemplified by the curriculum of the Collegium Trilingue [15,14,6]. In England, we focus on the printing of Classics in the eighteenth century. The influence of Ancient Greek and Latin on Grammar School curricula and the role of translations in circulating classics have been well documented [39,3,41]. This resulted in multilingual publications recorded in catalogs such as the English Short Title Catalog (ESTC) and Eighteenth Century Collections Online (ECCO), the latter used in this study. More details are provided in Section 3.

Our work utilizes long titles of Early Modern books to annotate their linguistic characteristics. Book titles have been leveraged for metadata enrichment and large-scale analysis in several studies: from the decline of the average length of modern British novel titles [25], to genre classification [26], 4 and topic modeling (two examples based on art catalogs are [10,5]). Recent experiments have leveraged language and multimodal models to semantically enrich metadata sets [40,1,24,31]. In this paper, we assess whether titles can be used to track multilingualism phenomena in a catalogue (i.e., to enrich metadata with specific language information). As noted by Hatzel, Stiemer, Biemann, and Gius [12], traditional, feature-based machine learning approaches are still widely applied in the Humanities. Hence, we compare a tf-idf-based classifier with the performance of Large Language Models (LLMs) [37] (here, BERT [8]), particularly trained for multilingual sentence classification. Transformer-based LLMs are increasingly used for annotation and to enrich metadata or analyse historical text collections, for example to predict the year of publication from text [42], or to investigate genre within books [28]. The availability of multilingual and historical text models, through easy-to-use APIs such as HuggingFace, means that the potential for such models to enhance research or augment our bibliographic understanding of large collections has greatly increased in recent years. Given the high resource cost of fine-tuning LLMs, we also test a few-shot approach for the same task, where only a few examples are used to tune the model (see Section 4.2).

We aim to achieve two objectives: label a work as multilingual or monolingual and identify whether it is printed in the original language or translated. These tasks, while related to language identification [16], are tailored to Early Modern book history: a title may be monolingual but indicate a multilingual work, and identifying the title's language alone is insufficient to determine if it is a translation or an original edition. The presence of multiple languages in metadata sets has already been recognized as a major challenge in metadata processing [23].

Data

The present study relies on two datasets: the CAA 5 from KU Leuven, and a version of Eighteenth Century Collections Online (ECCO) 6 manually enriched by a group of students. The CAA is curated by the Special Collections of KU Leuven Libraries and comprises books related to the Old University of Leuven (1425-1797), mostly by scholars who, at some point in their career, were affiliated with this university. The CAA version used for this study (exported on 28 July 2023) comprises 3660 holdings, each described in a MARC XML record. ECCO is a digital database assembled by Gale that stores the (OCRed) full text of a collection of 184,536 titles published in the eighteenth century. Within this collection, we identified the set of classical publications as those authored by Ancient Greek or Latin authors living before the sixth century. 7 The total number of classical editions amounts to 5237 rows. We refer to this dataset as ECCO-classics. These two datasets were chosen because of their meticulous language annotation, their partial chronological overlap, and the shared presence of classics (several classical works were printed in Early Modern Flanders, and feature in the CAA), 8 but also because of clear differences in the languages included and in cultural and geographical background: these characteristics make them useful sets for comparing how well the different approaches generalize.

Linguistic annotation

Both datasets have been manually annotated with respect to language. The MARC21 metadata schema includes a specific field for language annotation (041), further specified by several subfields, two of which are used in the CAA: 'a', indicating the language of the record, and 'h', indicating the original language. Hence, multilingual works are those including several 'a' codes, regardless of the presence of an 'h' code. Monolingual works include only one 'a' code. Within the monolingual works, some also include an 'h' code, which is recorded when the original language differs from the language of the edition. We speak of a monolingual edition if no 'h' code is recorded, and of a monolingual translation if one is recorded (and is consequently different from the 'a' code). Monolingual translations are usually works translated into a single target language and published without the original text. We include only monolingual works when identifying translations, because for multilingual works it is hard to single out the function of the different languages and to be sure that one of them is used for translation. An example of a multilingual work in the CAA is 'Les dialogvesde Iean Loys Vives, traduits de Latin en François pour l'exercice des deux langues .../Les dialogues de Jean Loys Vives', which is labeled as French and Latin. Table 1 lists the most frequently attested language combinations for multilingual works in the CAA.
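The labelling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `language_codes` and `label_record` are hypothetical, the sample record is invented to mirror a French translation from Latin, and real catalogue exports may encode field 041 differently (for instance with several codes in one subfield).

```python
import xml.etree.ElementTree as ET

# Namespace of the MARC XML ("MARC21 slim") schema.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

def language_codes(record_xml):
    """Return the 041 $a and 041 $h codes of one MARC XML record."""
    root = ET.fromstring(record_xml)
    a_codes, h_codes = [], []
    for field in root.iter("{http://www.loc.gov/MARC21/slim}datafield"):
        if field.get("tag") != "041":
            continue
        for sub in field.findall("marc:subfield", MARC_NS):
            code = (sub.text or "").strip()
            if sub.get("code") == "a":
                a_codes.append(code)
            elif sub.get("code") == "h":
                h_codes.append(code)
    return a_codes, h_codes

def label_record(a_codes, h_codes):
    """Apply the rule from the text: several 'a' codes mean multilingual;
    one 'a' code plus an 'h' code means monolingual translation."""
    if not a_codes:
        return None  # assumption: records without a language code stay unlabelled
    if len(set(a_codes)) > 1:
        return "multilingual"
    return "monolingual translation" if h_codes else "monolingual edition"

# Invented sample record (hypothetical, for illustration only).
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="041" ind1="1" ind2=" ">
    <subfield code="a">fre</subfield>
    <subfield code="h">lat</subfield>
  </datafield>
</record>"""

print(label_record(*language_codes(SAMPLE)))
```

Keeping the parsing and the labelling rule in separate functions makes it easy to apply the same rule to records that arrive in another serialisation.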

The 'Histoire de Notre-Dame de Hale,par Juste Lipse ... Traduit du latin, & augmentée de plusieurs merveilles, venues en lumière depuis la mort de l'auteur' is the title of a work labeled as monolingual translation. Table 2 shows the most frequent pairs of original and target languages in the CAA. As both Table 1 and 2 demonstrate, translation of the classical languages (Ancient Greek and Latin) plays a central role in the multilingualism of the academic production. The same schema was used to label the books in ECCO-classics, and the most frequently attested language combinations are likewise shown in Table 1 and 2. An example of a multilingual work is 'Phaedri Augusti liberti Fabularum aesopiarum libri quinque. Or, a correct latin edition of the Fables of Phaedrus: with a new literal English translation, and a copious parsingindex; Whereby young Beginners may easily and speedily attain the Knowledge of the Latin Tongue. By a gentleman of the University of Cambridge. For the Use of Schools', while an example of a monolingual translation is 'The iliad of Homer. Translated by Alexander Pope, Esq. '.

Methodology

Tasks

As mentioned above, we aim to classify the titles according to two criteria: whether the edition is monolingual or multilingual (multilingual task henceforth), and whether, if it is monolingual, it contains a work in its original language or in translation (monolingual translation task henceforth). We work with four combinations of the datasets, as listed in Table 3: the CAA, ECCO-classics, balanced CAA, 9 and ECCO and CAA combined. Each dataset was split 80-20 into training and test sets.

Multilingual and translated works are proportionately more frequent in the ECCO-classics dataset, because printing multilingual editions (i.e. the original text plus a commentary or a translation in a modern language) was common practice for the circulation of classical works. When testing the different models, we evaluate two options: training on each dataset separately and testing on each dataset separately, or training on the union of the two and testing both on the separate datasets and on their combination. In this way, we want to assess both the capacity of the separate models to generalize, and whether increasing and diversifying the training data improves the final results (RQ3).
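The evaluation design above can be sketched as follows. The sketch assumes a plain shuffled 80-20 split with a fixed seed, and it assumes that the 'combined' set is the union of the per-dataset training and test portions; the text does not state either detail, and the function names are hypothetical.

```python
import random
from itertools import product

def split_80_20(records, seed=0):
    """Shuffle a copy of the records and cut it 80-20 (assumption: no stratification)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]

def build_splits(datasets, seed=0):
    """datasets maps a name to a list of (title, label) pairs."""
    splits = {name: split_80_20(rows, seed) for name, rows in datasets.items()}
    # Assumption: 'combined' reuses the per-dataset splits, so no test title
    # of one dataset leaks into the combined training portion.
    combined_train = [row for train, _ in splits.values() for row in train]
    combined_test = [row for _, test in splits.values() for row in test]
    splits["combined"] = (combined_train, combined_test)
    return splits

def evaluation_pairs(splits):
    """Every (train set, test set) name pair, including cross-dataset ones."""
    return list(product(splits, repeat=2))

# Toy data standing in for the two catalogues.
toy = {
    "caa": [(f"caa title {i}", i % 2) for i in range(10)],
    "ecco": [(f"ecco title {i}", i % 2) for i in range(20)],
}
toy_splits = build_splits(toy)
print(evaluation_pairs(toy_splits))
```

Enumerating all pairs makes the cross-dataset cases (train on one catalogue, test on the other) explicit, which is what RQ3 asks about.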

Models and approaches

To answer RQ2, we tested three different approaches: (1) a simple tf-idf model with Linear Support Vector classification [35] (ML henceforth), (2) fine-tuning a Large Language Model (BERT henceforth), and (3) taking a few-shot approach to fine-tune a sentence transformer model (SetFit henceforth). For the ML task, we performed minimal preprocessing of the titles (they were made lowercase, and punctuation was stripped), and created a common vocabulary comprising CAA and ECCO titles. For each model trained, we performed hyperparameter optimization over the n-gram range (all combinations of monograms, bigrams and trigrams), the norm used for penalizing the model and avoiding overfitting ('l1', 'l2', 'elasticnet', None), and whether to weight the classes to limit the impact of very frequent classes ('weighted', None).
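As a rough illustration of the preprocessing and of the size of the search space, the following standard-library sketch lowercases a title, strips ASCII punctuation, and enumerates the hyperparameter grid. Several points are assumptions rather than details from the text: punctuation is deleted rather than replaced by a space, "all combinations" of n-grams is read as every range from (1, 1) to (3, 3), and the dictionary keys are hypothetical names, not the arguments of any particular library.

```python
import string
from itertools import product

# Translation table that deletes ASCII punctuation (assumption: non-ASCII
# punctuation, if any, is left untouched).
_PUNCT = str.maketrans("", "", string.punctuation)

def preprocess(title):
    """Lowercase the title, strip punctuation, and normalise whitespace."""
    return " ".join(title.lower().translate(_PUNCT).split())

# Assumed reading of "all combinations of monograms, bigrams and trigrams".
NGRAM_RANGES = [(lo, hi) for lo in (1, 2, 3) for hi in (1, 2, 3) if lo <= hi]
PENALTIES = ["l1", "l2", "elasticnet", None]
CLASS_WEIGHTS = ["weighted", None]

GRID = [
    {"ngram_range": ngrams, "penalty": penalty, "class_weight": weight}
    for ngrams, penalty, weight in product(NGRAM_RANGES, PENALTIES, CLASS_WEIGHTS)
]

print(preprocess("The iliad of Homer. Translated by Alexander Pope, Esq. "))
print(len(GRID), "configurations per trained model")
```

Under these assumptions the grid has 6 x 4 x 2 = 48 configurations, which is small enough for an exhaustive search.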

For the BERT approach, we fine-tuned the base model bert-base-multilingual-cased [7], using the HuggingFace API and packages. We used the model hyperparameters set out in the HuggingFace documentation for fine-tuning BERT for text classification [32], and for this paper, we have not performed hyperparameter optimization on them.

For the few-shot experiment the aim was to provide a small number of examples which were as representative as possible with respect to each task. Separate sets were made for the multilingual and translation tasks. For the multilingual task, the final training set contains 5 examples from each of the languages or language pairs, and an equal number of monolingual and multilingual titles, from both the ECCO and CAA datasets, resulting in about 80 examples in the train set. The train set for the translation task was constructed in a similar way but with an even number of original language and translated works. These were then evaluated using the same test sets as above.
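One possible way to assemble such a training set is sketched below. It is not the authors' procedure: it assumes each record carries a language key (a single language such as 'lat' or a pair such as 'fre-lat') and a class label, samples up to five titles per key, and then trims the larger class so that both classes are equally represented. All names are hypothetical.

```python
import random
from collections import defaultdict

def sample_few_shot(records, per_key=5, seed=0):
    """records: iterable of (title, language_key, label) triples."""
    rng = random.Random(seed)
    by_key = defaultdict(list)
    for title, lang_key, label in records:
        by_key[lang_key].append((title, label))

    # Draw up to `per_key` titles for every language or language pair.
    by_label = defaultdict(list)
    for lang_key in sorted(by_key):
        rows = by_key[lang_key]
        for title, label in rng.sample(rows, min(per_key, len(rows))):
            by_label[label].append(title)
    if not by_label:
        return []

    # Trim every class to the size of the smallest one.
    n = min(len(titles) for titles in by_label.values())
    return [
        (title, label)
        for label in sorted(by_label)
        for title in rng.sample(by_label[label], n)
    ]

# Toy records: three monolingual languages and two multilingual pairs.
toy_records = [
    (f"{key} title {i}", key, "monolingual" if "-" not in key else "multilingual")
    for key in ("lat", "fre", "dut", "fre-lat", "eng-lat")
    for i in range(10)
]
few_shot = sample_few_shot(toy_records)
print(len(few_shot))
```

With eight languages and eight language pairs at five titles each, this procedure would yield the roughly 80 examples mentioned above.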

To perform the few-shot classification, the SetFit library was used. SetFit is a framework for few-shot fine-tuning of SentenceTransformers models: it fine-tunes a pretrained SentenceTransformers model [29] using a contrastive training approach. SentenceTransformers is a form of Transformer-based Large Language Model which can be trained to generate embedding representations at the sentence, paragraph, or document level (rather than at the word level as in a regular LLM). These embeddings are then generally used for tasks such as semantic textual similarity or semantic search. SetFit has been shown to achieve performance comparable to an LLM-based approach on tasks such as text classification, but with far less data and training time [36]. We used the pre-trained SentenceTransformers model distiluse-base-multilingual-cased-v2 and the hyperparameters from the examples set out in the introductory guide [34]. We then fine-tuned the SentenceTransformers model using a small number of examples.

For each set of results we recorded the accuracy, as well as the precision, recall and f1 scores separately for each class. We include tables comparing the results of the two main tasks, plus the full tables as an appendix. Moreover, we used the SHAP (SHapley Additive exPlanations) library [21] to understand the features most relevant in the classification by the model. SHAP is based on Shapley values, a game-theory approach to explanations which aims to calculate the contribution of each feature in an instance of a prediction. We used the SHAP library to produce plots which highlight tokens and spans of text based on their contribution to the prediction (Figure 1). These plots can then be interpreted qualitatively.
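The class-wise scores can be computed directly from the predictions. The sketch below uses only the standard library and a toy, deliberately imbalanced example (not data from the paper) to show how a high accuracy can coexist with a zero f1 score on the rare class; the function name `per_class_scores` is hypothetical.

```python
def per_class_scores(y_true, y_pred, cls):
    """Return (precision, recall, f1) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)
    fp = sum(t != cls and p == cls for t, p in pairs)
    fn = sum(t == cls and p != cls for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy example: 2 titles of the rare class 0, 18 of class 1,
# and a classifier that always predicts class 1.
y_true = [0] * 2 + [1] * 18
y_pred = [1] * 20

print("accuracy:", accuracy(y_true, y_pred))
print("class 0 (p, r, f1):", per_class_scores(y_true, y_pred, 0))
print("class 1 (p, r, f1):", per_class_scores(y_true, y_pred, 1))
```

Reporting the scores per class keeps the behaviour on the minority class visible, which a single accuracy figure would hide.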

Results

Quantitative results

The most relevant results are shown below; for the full set see the Appendix. Table 4 summarises the performance of the models trained on the 'combined' dataset and tested on both the individual and combined datasets. We report the class-wise f-scores because the classes are very unevenly distributed, particularly for the CAA, and so the accuracy score is not a good indication of performance. Tables 5 and 6 give direct comparisons between the models on the multilingual and translation tasks, listing the difference obtained by subtracting the score of the ML model from that of the BERT model (negative numbers mean the BERT model performed worse). Tables 7 to 10 in the Appendix provide the details of precision, recall and f1 for the ML and BERT models, on each task, for each class.

RQ1: Titles can be exploited for tracking multilingualism

As can be seen from Table 4, both BERT and the ML method gave quite comparable results across both tasks and all datasets. The SetFit method performed noticeably worse in most cases, except when tested on the combined CAA and ECCO dataset. Overall, the results can be considered satisfactory, which leads to the conclusion that titles can be used for this purpose (RQ1). However, the task requires an extended set of labeled training data.

RQ2: Comparison of the approaches

Tables 5 and 6 give direct comparisons between the models, listing the difference obtained by subtracting the score of the ML model from that of the BERT model (negative numbers mean the BERT model performed worse). These show that, in many cases, the tf-idf approach performed considerably better at distinguishing multilingual from monolingual works (with the exception of the model trained on the CAA and tested on ECCO). For the BERT model, in particular, Table 7 (in Appendix A.1) shows that the identification of the 0 class (i.e. multilingual works) is particularly problematic: recall values tend to be rather low, which indicates that the model tends to predict 'monolingual' for most titles.

For the translation task, there is slightly more variation between results of the approaches. The ML model has very low recall of the 1 class (translated work) when trained on the CAA and tested on ECCO, meaning almost all true positives (translations) are missed. This is a significant drawback since it is, for multilingualism studies, the class of interest. The BERT model performs reasonably well except again struggling with the recall of translated works when trained on the CAA and tested on another dataset. Most notable was the ability to identify ECCO translated documents using the model trained only on the CAA, both the full test dataset and the smaller 'balanced' set, as well as the other way around. For this task, BERT was able to generalize much better than the ML method when testing on a different dataset than the one on which it was trained.

The performance of the SetFit method (see the Appendix, Table 11) followed a pattern comparable to that of the BERT models. It similarly had low recall and precision for the 0 class (multilingual works), but performed well in most tests on the translation task, with just 40 examples of each class, across multiple languages.

RQ3: Specificity/generality of the training

In general, the ML and BERT models, when trained on examples from across datasets, are able to perform reasonably well, meaning that a training set made from a combined dataset of ECCO and the CAA gives satisfactory results. Both the ML method and the BERT fine-tuned model give very similar results.

Both models perform very well at identifying monolingual/multilingual works when trained and tested on ECCO. Models trained and tested on ECCO fared better in general, while still underperforming when applied to the CAA test dataset.

The results from models trained on one dataset and tested on the other are much worse. In particular, models trained on the CAA and tested on ECCO perform very badly at both recall and precision of the multilingual class. Again, there is little difference between the ML and BERT models, though the BERT model performs marginally better. The 'CAA balanced' model, trained on a sample of the CAA containing an equal number of monolingual/multilingual titles, balanced across the various target languages, did not perform significantly better than the CAA model, though it was marginally better and much quicker to train. However, the very small number of records might represent a limitation.

Since for the SetFit method we used a mix of examples coming from both datasets, RQ3 does not apply to this model.

Qualitative results

To understand qualitatively which parts of the text drove the classification, we used SHAP explanations and looked at a range of true positive, true negative, false positive and false negative predictions. Here, we focus on the BERT model trained on the CAA and tested on both CAA and ECCO for the prediction of multilingual texts (a particularly 'difficult' combination).

When the model wrongly labels a multilingual title as monolingual, two patterns generally recur:

• There is no trace of multilingualism in the title (e.g. the Latin title 'Specimen doctrine traditae ab anno MDCXCI.usque ad annum MDCXCVI. inclusive. ' doesn't contain any mention of parts in a different language).
• Most of these titles, despite containing hints of multilingualism, are fully in Latin. The wrong prediction might be due to the fact that the CAA contains a lot of Latin monolingual titles, and hence Latin context is considered monolingual despite possible multilingual records. Figure 2 shows a very long title in Latin with an explicit mention of a translated bit ('cum latina interpretatione') being entirely assigned to monolingual (blue) by the model.

Table 6: Comparative results for the translation task, for bert-base-multilingual-cased approach and tf-idf/SVM. Number reported is the tf-idf result subtracted from the BERT result. Numbers under zero mean that the BERT approach performed worse. Acc, r, p, and f1 denote accuracy, recall, precision, and f-score respectively.

Another recurrent trend in both false and true predictions is the role of Greek: the word 'Greek' (or Graecae, in Graecam linguam) is always used as a predictor of multilingualism, even when the work is monolingual (either in the original language or in translation). Figures 3 and 4 show two monolingual works whose titles contain the word 'Greek'. In both cases, the word heavily impacts the 'multilingual' component, even though the output differs for the two predictions. This might be due to the fact that in the CAA Ancient Greek texts usually come with translations or notes in a modern language. Text in the Greek alphabet also seems to be used to identify multilingual texts. This raises the issue of the dependency of the models on these specific dataset features. Furthermore, in some cases the model treats text that we would read as a sign of multilingualism as evidence for a monolingual work, for example phrases like 'original subjoined', 'notes at the end', or 'on the opposite page'. One example of this can be seen in Figure 5. This is because these phrases are not found in the CAA titles for multilingual works. The 'combined' model does not have this bias: in its case, words relating to notes or annotations contribute to a positive prediction of a work as multilingual, as one might expect.

Words like 'translated', or 'lexicon' across languages increase the output of the model in identifying multilingual works, which is close to what we would expect.

Discussion of relevance and possible uses

Overall, these experiments suggest that this is a difficult problem to solve using machine learning methods. In particular, the approaches do not seem to generalise well, even using multilingual LLMs, which we hoped might recognise different styles of title if they were in some way semantically similar. This is perhaps because the way that multilingual and translated works are signified in a title is varied and changes over time and across languages. Despite these reservations, when trained on examples across both datasets, the performance of both traditional machine learning and LLM methods was at a level which we deem usable in real-world applications.

The multilingual fine-tuned BERT has some advantages over traditional ML approaches in identifying translated works but performs worse when distinguishing multilingual works. This seems to be because the signifiers for translated works are more descriptive and straightforward (e.g. 'translated from' or 'made English by'). The multilingual approach means that these kinds of phrases tend to be picked up by the model in different languages.

The few-shot method using SetFit shows some promise in a number of tasks, but does not, from our experiments, seem to be a 'silver bullet' for low-resource metadata enrichment of this kind. However, with a very well thought-out and diverse set of examples, it may be possible to build a model which can be trained and used for inference on real-world data. An ideal real-world scenario for metadata enrichment may involve collecting a small number of examples from a specific dataset or collection, fine-tuning a bespoke but small model, and applying it only to that collection. As of yet, however, our experiments suggest that the multilingual capabilities of SetFit or SentenceTransformers are not enough to get high-quality results on this task without at least some annotation of the target dataset.

Conclusions

Automatically enriched metadata has significant value for heritage collections catalogue data, potentially helping to increase the accuracy and findability of records. If the purpose is to obtain enriched metadata, our experiments show some promise and could potentially be operationalised in the future. In fact, traditional ML methods may be enough in many cases, particularly for identifying multilingual works, and have big advantages in terms of ease of use and use of resources. In some cases, methods such as keyword search or regular expressions might also provide acceptable results, though when using multilingual datasets, machine learning methods should have an advantage.
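A keyword baseline of the kind mentioned above could look like the following sketch. The cue list is an assumption built from phrases quoted in this paper ('translated', 'made English by', 'Traduit du latin') and is far from exhaustive; the function name `mentions_translation` is hypothetical. Note that such a pattern only detects that a title mentions a translation: it cannot tell a monolingual translation from a multilingual edition that prints the original alongside it.

```python
import re

# Assumed cue list, drawn from phrases quoted in the paper; extend per collection.
TRANSLATION_CUES = re.compile(
    r"\b(?:translat\w*|made\s+english|traduit\w*)\b",
    re.IGNORECASE,
)

def mentions_translation(title):
    """Return True if the title contains one of the assumed translation cues."""
    return bool(TRANSLATION_CUES.search(title))

titles = [
    "The iliad of Homer. Translated by Alexander Pope, Esq.",
    "Histoire de Notre-Dame de Hale,par Juste Lipse ... Traduit du latin",
    "Specimen doctrine traditae ab anno MDCXCI.usque ad annum MDCXCVI. inclusive.",
]
for title in titles:
    print(mentions_translation(title), "|", title)
```

Each additional language requires its own hand-written cues, which is where a multilingual model may have the advantage noted above.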

Furthermore, we suggest that certain evaluation metrics are more important than others, particularly with library catalogue data, which is likely to be very unevenly distributed with regard to language and classes. This is of course dependent on the particular task and use case. If the purpose is to improve catalogue metadata, for example, the recall of the multilingual or translated classes may be particularly important, as it may be better to find additional false positives which can then be checked manually afterwards, rather than aiming for precision but missing some relevant works. If the information is not necessarily intended to be 'fed back' to a catalogue but used for bibliographic data science at scale, it may be more important to focus on the overall f-scores to get a broad, albeit imperfect, accuracy.

A.2. Multilingual/Monolingual Task: TFIDF/SVM

Figure 1: Example of a text plot from the Python SHAP library. In this case, parts of the text contributing to the identification of the title as a translation are highlighted in red.

Figure 2: Example of a text plot from the Python SHAP library. In this case, parts of the text contributing to the identification of the title as multilingual are highlighted in red. The title was labeled as monolingual while being multilingual.

Figure 3: Example of a text plot from the Python SHAP library. In this case, parts of the text contributing to the identification of the title as multilingual are highlighted in red. The title was labeled as multilingual while being monolingual. The word Greek heavily contributes to the multilingual prediction.

Figure 5: Example of a SHAP plot showing a work from ECCO predicted as monolingual by the CAA-trained model. Parts of the text which we would intuitively expect to make it likely to be multilingual are in this case contributing to the prediction of the instance as monolingual.

Table 2: Most attested language combinations in monolingual translations of CAA and ECCO-classics.

source-target languages   # CAA   source-target languages   # ECCO
lat-dut                   51      grc-eng                   1198
lat-fre                   34      lat-eng                   926
fre-dutch                 11      grc-lat                   11
lat-ger                   11      grc-fre                   26

Table 3: Number of records per class in the four datasets used.

dataset                    monolingual   multilingual   monolingual ed.   monolingual transl.
CAA                        3466          194            3291              175
balanced CAA monolingual   200           194            not used          not used
balanced CAA translation   not used      not used       350               175
ECCO-classics              550           1765           1156              609
combined                   7020          1877           4513              2507

Table 4: Class-wise f-scores for the fine-tuned BERT, SVM, and SetFit methods using combined CAA + ECCO datasets.

                                 Multilingual Task           Translation Task
Model    Train      Test        F-score (0)   F-score (1)   F-score (0)   F-score (1)
ML       combined   caa         0.82          0.99          0.99          0.89
ML       combined   ecco        0.91          0.97          0.98          0.99
ML       combined   combined    0.75          0.97          0.98          0.96
BERT     combined   caa         0.81          0.99          1.00          0.99
BERT     combined   ecco        0.91          0.97          0.99          0.90
BERT     combined   combined    0.78          0.97          0.98          0.96
SetFit   Few-shot   caa         0.16          0.90          0.98          0.06
SetFit   Few-shot   ecco        0.51          0.74          0.59          0.33
SetFit   Few-shot   combined    0.42          0.82          0.80          0.23

Table 5: Comparative results for the monolingual/multilingual task, for bert-base-multilingual-cased approach and tf-idf/SVM. Number reported is the tf-idf result subtracted from the BERT result. Numbers under zero mean that the BERT approach performed worse. Acc, r, p, and f1 denote accuracy, recall, precision, and f-score respectively.

Train          Test           Acc     r (0)   p (0)   f1 (0)  r (1)   p (1)   f1 (1)
caa            caa            0.00    0.06    0.01    0.04    0.00    0.00    0.41
caa            ecco           0.00    0.18    -0.52   0.25    -0.06   0.03    -0.01
caa            combined       -0.01   0.09    -0.19   0.06    -0.01   0.01    0.00
caa            caa_balanced   -0.14   -0.34   -0.10   -0.24   -0.04   -0.14   -0.10
ecco           ecco           0.03    0.03    0.06    0.05    0.02    0.01    0.01
ecco           caa            -0.18   0.56    0.02    0.11    -0.22   0.02    -0.12
ecco           combined       -0.14   -0.10   -0.50   -0.37   -0.15   -0.01   -0.09
ecco           caa_balanced   0.07    0.34    -0.62   0.24    -0.38   0.22    0.02
combined       combined       0.00    0.08    -0.03   0.03    -0.01   0.01    0.00
combined       caa            0.01    0.06    -0.13   -0.01   -0.01   0.00    0.00
combined       ecco           0.00    0.00    0.01    0.00    0.00    0.00    0.00
combined       caa_balanced   0.08    0.20    -0.13   0.05    -0.08   0.22    0.09
caa_balanced   caa_balanced   -0.10   -0.09   -0.28   -0.17   -0.10   0.09    0.00
caa_balanced   caa            -0.39   -0.09   -0.26   -0.36   -0.39   -0.01   -0.27
caa_balanced   ecco           0.01    0.06    0.03    0.05    0.00    0.02    0.00

Table 8 :8Performance results for multilingual/monolingual task and TFIDF/SVMTrainTestAccr (0)p (0)f1 (0)r (1)p (1)f1 (1)caacaa0.960.560.580.570.980.980.57caaecco0.770.011.000.021.000.770.87caacombined0.900.260.880.400.990.900.94caacaa_balanced0.990.981.000.991.000.970.99eccoecco0.900.760.820.790.950.930.94eccocaa0.930.060.080.070.970.960.97eccocombined0.960.910.920.920.980.980.98eccocaa_balanced0.480.091.000.161.000.450.62combinedcombined0.940.660.850.750.980.950.97combinedcaa0.970.720.960.821.000.990.99combinedecco0.960.900.920.910.980.970.97combinedcaa_balanced0.850.731.000.851.000.740.85caa_balancedcaa_balanced0.850.730.920.810.910.720.81caa_balancedcaa0.920.970.340.500.911.000.95caa_balancedecco0.610.390.260.310.680.790.73TrainTestAccr (0)p (0)f1 (0)r (1)p (1)f1 (1)caacaa0.980.990.990.990.740.720.73caaecco0.750.960.590.730.620.970.76caacombined0.870.990.830.900.640.980.77caacaa_balanced0.971.000.960.980.911.000.95eccoecco0.960.910.970.940.990.950.97eccocaa0.660.650.990.780.870.100.18eccocombined0.830.730.990.840.990.680.80eccocaa_balanced0.610.470.920.620.910.440.59combinedcombined0.970.970.990.980.970.950.96combinedcaa0.991.001.001.000.900.900.90combinedecco0.990.990.980.990.991.000.99combinedcaa_balanced0.961.000.950.970.881.000.94caa_balancedcaa_balanced0.870.860.940.900.880.740.81caa_balancedcaa0.930.921.000.960.970.370.54caa_balancedecco0.880.870.820.840.890.920.91caa_balancedcombined0.910.900.960.930.930.850.89

A.3. Translation Task: BERT

Table 9: Performance results for translation task and fine-tuned BERT

The data and the code are available at: https://github.com/mfantoli/CHR2024_multilingualism

Enriching metadata based on book titles is also of interest to GLAM institutions, as demonstrated by a recent experiment on British Library data: https://living-with-machines.github.io/genre-classification/01_BL_fiction_non_fiction.html

https://dial.uclouvain.be/digitization/en/digital-collection/old-academic-collection

https://www.gale.com/primary-sources/eighteenth-century-collections-online

More information on the identification of classical authors is provided in [9].

We haven't counted the exact number of classical works in the CAA, but, as an example, there are at least five editions of Homer, more than 10 editions of Cicero, etc.

We kept double the number of monolingual editions compared to monolingual translations in order to still achieve enough critical mass in the number of examples.

Acknowledgments

We want to express our gratitude to the STUDIUM.AI team, in particular to Violet Soen, whose efforts enabled this research. In addition, we would like to thank the KU Leuven Libraries staff, in particular the metadata and digitization services, for sharing the CAA metadata and the related documentation. Finally, we would like to thank the Computational History group of Helsinki for providing the framework and infrastructure for annotating the ECCO training data.

References

D. Ali, K. Milleville, S. Verstockt, N. Van de Weghe, S. Chambers, J. M. Birkholz. Computer vision and machine learning approaches for metadata enrichment to improve searchability of historical newspaper collections. Journal of Documentation, 2023. doi:10.1108/jd-01-2022-0029.

P. Auger, S. Brammall. Multilingual texts and practices in early modern Europe. New York, NY: Routledge, 2023.

T. W. Baldwin. William Shakspere's Small Latine and Lesse Greeke. Urbana: University of Illinois Press, 1944.

B. Bistué. Collaborative Translation as a Model for Multilingual Printing in Early Renaissance Editions of Aesop's Fables. In: P. Auger, S. Brammall, Multilingual texts and practices in early modern Europe. New York, NY: Routledge, 2023.

M. Bowman. Text-mining metadata: What can titles tell us of the history of modern and contemporary art? Journal of Cultural Analytics 8(1), 2023. doi:10.22148/001c.74602.

N. Constantinidou. Printers of the Greek Classics and Market Distribution in the Sixteenth Century: The Case of France and the Low Countries. In: R. Kirwan and S. Mullins, Specialist Markets in the Early Modern Book World, 40, 2015.

J. Devlin, M. Chang, K. Lee, K. Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805, 2018.

J. Devlin, M.-W. Chang, K. Lee, K. Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: J. Burstein, C. Doran, T. Solorio, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1. Minneapolis, Minnesota: Association for Computational Linguistics, 2019. doi:10.18653/v1/N19-1423.

M. Fantoli, J. Suomela, T. Van Hal, M. Depauw, L. Virkki, M. Tolonen. Quantifying the Presence of Ancient Greek and Latin Classics in Early Modern Britain. Journal of Cultural Analytics, forthcoming.

C. Garcia-Zorita, A. R. Pacios. Topic modelling characterization of Mudejar art based on document titles. Digital Scholarship in the Humanities 33(3), 2018. doi:10.1093/llc/fqx055.

S. Gillespie. The Availability of the Classics. Readers, Writers, Translation, Performance. In: The Oxford History of Classical Reception in English Literature, vol. 2. Oxford University Press, 2015.

H. O. Hatzel, H. Stiemer, C. Biemann, E. Gius. Machine learning in computational literary studies. it - Information Technology 65(4-5), 2023. doi:10.1515/itit-2023-0041.

T. Hermans. Multilingualism and Translation in the Early Modern Low Countries. In: K. Bennett, A. Cattaneo, Language Dynamics in the Early Modern Period. New York: Routledge, 2022, 20. doi:10.4324/9781003092445.

R. Hoven. Enseignement du grec et livres scolaires dans les anciens Pays-Bas et la Principauté de Liège de 1483 à 1600. Deuxième partie: 1551-1600. Gutenberg-Jahrbuch 55, 1980.

R. Hoven. Enseignement du grec et livres scolaires dans les anciens Pays-Bas et la Principauté de Liège de 1483 à 1600. Première partie: 1483-1550. Gutenberg-Jahrbuch 54, 1979.

T. Jauhiainen, M. Lui, M. Zampieri, T. Baldwin, K. Lindén. Automatic Language Identification in Texts: A Survey. Journal of Artificial Intelligence Research 65, 2019. doi:10.1613/jair.1.11675.

H. Jones. Printing the Classical Text. Brill, 2021.

L. Lahti, N. Ilomäki, M. Tolonen. A Quantitative Study of History in the English Short-Title Catalogue (ESTC), 1470-1800. LIBER Quarterly: The Journal of the Association of European Research Libraries 25(2), 2015. doi:10.18352/lq.10112.

L. Lahti, J. Marjanen, H. Roivainen, M. Tolonen. Bibliographic Data Science and the History of the Book (c. 1500-1800). Cataloging & Classification Quarterly 57(1), 2019. doi:10.1080/01639374.2018.1543747.

H. B. Lathrop. Translations from the Classics into English from Caxton to Chapman, 1477-1620. University of Wisconsin Studies in Language and Literature 35. Madison: University of Wisconsin, 1933.

S. M. Lundberg, S.-I. Lee. A Unified Approach to Interpreting Model Predictions. In: I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett, Advances in Neural Information Processing Systems 30. Curran Associates, Inc., 2017.

P. Mack. Humanism and the Classical Tradition. In: G. Campbell, The Oxford History of the Renaissance. Oxford: Oxford University Press, 2023. doi:10.1093/oso/9780192886699.003.0001.

V. Malínek, T. Umerle, E. Gray, I. Heibi, P. Király, C. Klaes, P. Korytkowski, D. Lindemann, A. Moretti, C. Panušková, R. Péter, M. Tolonen, A. Tomczyńska, O. Vimr. Open Bibliographical Data Workflows and the Multilinguality Challenge. Journal of Open Humanities Data 10(27), 2024. doi:10.5334/johd.190.

M. Martorana, T. Kuhn, L. Stork, J. Van Ossenbruggen. Text classification of column headers with a controlled vocabulary: leveraging LLMs for metadata enrichment. 2024.

F. Moretti. Style, Inc. Reflections on Seven Thousand Titles (British Novels, 1740-1850). Critical Inquiry 36(1), 2009. doi:10.1086/606125.

J. A. Nolazco-Flores, A. V. Guerrero-Galván, C. Del-Valle-Soto, L. P. Garcia-Perera. Genre Classification of Books on Spanish. IEEE Access 11, 2023. doi:10.1109/access.2023.3332997.

R. Péter, Z. Szántó, Z. Biacsi, G. Berend, V. Bilicki. Multilingual Analysis and Visualization of Bibliographic Metadata and Texts With the AVOBMAT Research Tool. Journal of Open Humanities Data 10(23), 2024. doi:10.5334/johd.175.

I. Rastas, Y. Ciarán Ryan, I. Tiihonen, M. Qaraei, L. Repo, R. Babbar, E. Mäkelä, M. Tolonen, F. Ginter. Explainable Publication Year Prediction of Eighteenth Century Texts with the BERT Model. In: N. Tahmasebi, S. Montariol, A. Kutuzov, S. Hengchen, H. Dubossarsky, L. Borin, Proceedings of the 3rd Workshop on Computational Approaches to Historical Language Change. Dublin, Ireland: Association for Computational Linguistics, 2022. doi:10.18653/v1/2022.lchange-1.7.

N. Reimers, I. Gurevych. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2019.

Y. C. Ryan, M. Tolonen. The Evolution of Scottish Enlightenment Publishing. The Historical Journal 67(2), 2024. doi:10.1017/s0018246x23000614.

Z. Tan, D. Li, S. Wang, A. Beigi, B. Jiang, A. Bhattacharjee, M. Karami, J. Li, L. Cheng, H. Liu. Large Language Models for Data Annotation: A Survey. 2024. doi:10.48550/arxiv.2402.13446.

Text classification. 2024.

M. Tolonen, E. Mäkelä, L. Lahti. The Anatomy of Eighteenth Century Collections Online (ECCO). Eighteenth-Century Studies 56(1), 2022. doi:10.1353/ecs.2022.0060.

L. Tunstall. SetFit: Efficient Few-Shot Learning Without Prompts. 2022.

L. Tunstall, N. Reimers, U. E. S. Jo, L. Bates, D. Korat, M. Wasserblat, O. Pereg. Efficient Few-Shot Learning Without Prompts. 2022. doi:10.48550/arxiv.2209.11055.

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin. Attention is All You Need. 2017.

Vertalen in de Nederlanden: een cultuurgeschiedenis. Amsterdam: Boom, 2021.

F. Watson. The English Grammar Schools to 1660: Their Curriculum and Practice. 2nd ed. London: Frank Cass & Co, 1968.

M. Wevers, N. Vriend, A. de Bruin. What to do with 2.000.000 Historical Press Photos? The Challenges and Opportunities of Applying a Scene Detection Algorithm to a Digitised Press Photo Collection. TMG Journal for Media History 25(1), 1, 2022. doi:10.18146/tmg.815.

P. Wilson. The Place of Classics in Education and Publishing. In: D. Hopkins, C. Martindale, The Oxford History of Classical Reception in English Literature, vol. 3. Oxford and New York: Oxford University Press, 2012.

J. Zhang, Y. C. Ryan, I. Rastas, F. Ginter, M. Tolonen, R. Babbar. Detecting Sequential Genre Change in Eighteenth-Century Texts. In: F. Karsdorp, A. Lassche, K. Nielbo, Proceedings of the Computational Humanities Research Conference. Antwerp, Belgium: CEUR Workshop Proceedings 3290, 2022.