<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Transfer learning for Renaissance illuminated manuscripts: starting a journey from classification to interpretation</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Valeria</forename><surname>Minisini</surname></persName>
							<email>valeria.minisini@uniroma1.it</email>
							<affiliation key="aff0">
								<orgName type="department">Dipartimento di Scienze dell&apos;Antichità</orgName>
								<orgName type="institution">Sapienza Università di Roma</orgName>
								<address>
									<addrLine>Piazzale Aldo Moro 5</addrLine>
									<postCode>00185</postCode>
									<settlement>Rome</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department">National Research Council - Institute of Heritage Science (CNR-ISPC)</orgName>
								<address>
									<addrLine>Via Salaria KM 29300</addrLine>
									<postCode>00015</postCode>
									<settlement>Monterotondo</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Giorgio</forename><surname>Gosti</surname></persName>
							<email>giorgio.gosti@cnr.it</email>
							<affiliation key="aff1">
								<orgName type="department">National Research Council - Institute of Heritage Science (CNR-ISPC)</orgName>
								<address>
									<addrLine>Via Salaria KM 29300</addrLine>
									<postCode>00015</postCode>
									<settlement>Monterotondo</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Bruno</forename><surname>Fanini</surname></persName>
							<email>bruno.fanini@cnr.it</email>
							<affiliation key="aff1">
								<orgName type="department">National Research Council - Institute of Heritage Science (CNR-ISPC)</orgName>
								<address>
									<addrLine>Via Salaria KM 29300</addrLine>
									<postCode>00015</postCode>
									<settlement>Monterotondo</settlement>
									<country key="IT">Italy</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Transfer learning for Renaissance illuminated manuscripts: starting a journey from classification to interpretation</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">7BED71BE698B3AF2C92A0AD9ED6754F1</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:10+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Deep Neural Network</term>
					<term>Transfer Learning</term>
					<term>Illuminated Manuscript</term>
					<term>Image Classification</term>
					<term>Layout Analysis</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In recent years illuminated manuscripts have been extensively digitized, providing an unprecedented amount of material for computer vision research and for the creation of better-performing neural networks. In addition to character recognition, which has long been the main application field, these new resources have allowed the adoption of machine learning to detect decorations and miniatures, which usually occupy only a few pages of a codex but attract great attention, especially from the non-academic public. This paper presents ongoing research that aims to demonstrate how transfer learning on digitized artworks can improve a pretrained deep neural network's ability to recognize handwritten pages and identify layout elements and, specifically, figurative miniatures in Renaissance manuscripts, to be used in an immersive interface for consulting and comparing images based on their iconography. After a brief introduction contextualizing the changes brought by massive cultural heritage digitization, we present some of the most interesting research conducted on both manuscripts and artworks. Next, we describe the dataset built to train the model, focusing on its composition and the image classification system adopted. We then explain the training strategy chosen to minimize human effort by dividing the dataset into three groups, before concluding with the first results obtained so far and the prospects for future development.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Until the mid-15th century, illuminated manuscripts were the main vehicle for the circulation and preservation of knowledge among the Western upper classes, remaining in use even after Johannes Gutenberg's introduction of movable type printing. This type of cultural artifact, difficult to access due to the fragility of its materials, has found new life thanks to digitization, which has made available to scholars and the general public many little-known works that rarely leave the repositories.</p><p>Each complete codex comprises hundreds of pages, which not only contain text but are also richly decorated with miniatures that share subjects and iconography with other artistic expressions such as paintings and sculptures. Although illuminations occupy only a limited number of sheets, they are the components that most attract non-academic audiences, making the often incomprehensible written content partly accessible.</p><p>To enhance the thousands of volumes made available online, we are developing an immersive visualization interface that allows users to move easily between digitized pages, search for figurative miniatures, and relate reproductions of different works sharing the same subject. To make this possible, a deep neural network is being trained on a specially constructed dataset containing reproductions of Medieval and Renaissance volumes together with artworks. Our objective is to obtain a model that can recognize elements of the page layout and identify handwritten sheets within heterogeneous collections of items. 
Given that deep neural networks require massive datasets with high-quality annotations that are labor-intensive to produce, we implemented an Interactive Machine Learning approach, in which a domain expert and an AI expert cooperate to train a model with domain-specific knowledge using transfer learning <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref>, to rapidly compile increasingly larger databases.</p><p>Section 2 outlines the state of the art, while Section 3 presents the dataset, describing its composition and image classification system, and then the technique adopted for training the model. Finally, Section 4 discusses conclusions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related work</head><p>Most datasets of digitized handwritten and early printed texts are composed of materials merged according to a shared language, creation period and resource-distributing institution <ref type="bibr" target="#b2">[3]</ref>. They are often small compared with newspaper <ref type="bibr" target="#b3">[4]</ref> and museum collections <ref type="bibr" target="#b4">[5]</ref>, frequently containing only a subset of pages from the volumes or a few complete codices <ref type="bibr" target="#b5">[6]</ref>, such as the HBA dataset <ref type="bibr" target="#b6">[7]</ref>, whose images originate from 11 books of the French digital library Gallica. In several cases the dataset needs to be augmented with new items; one example is <ref type="bibr" target="#b7">[8]</ref>, which uses generative adversarial networks (GANs) to create synthetic pages to train the model.</p><p>The use of machine learning, and specifically neural networks, to facilitate the study of historical documents is an established field of research <ref type="bibr" target="#b8">[9]</ref>. The most investigated task has been text recognition <ref type="bibr" target="#b9">[10]</ref>, but work on graphic elements, such as layout analysis <ref type="bibr" target="#b7">[8]</ref>, drop cap letter extraction <ref type="bibr" target="#b10">[11]</ref>, figure gesture classification through template-based detectors <ref type="bibr" target="#b11">[12]</ref> and miniature retrieval <ref type="bibr" target="#b12">[13]</ref>, has gradually shifted attention from text to artistic aspects. 
Bounding-box approaches have also been applied to illumination detection <ref type="bibr" target="#b13">[14]</ref> and iconographic recognition <ref type="bibr" target="#b14">[15]</ref>, with good results.</p><p>In the historical-artistic field, among the several neural network architectures, ResNet <ref type="bibr" target="#b15">[16]</ref> has been used on its own <ref type="bibr" target="#b16">[17]</ref> or within more complex pipelines <ref type="bibr" target="#b17">[18]</ref>, mostly because it performs well on labeling and detection tasks and is easily trainable, given that its residual functions allow better error propagation. Transfer learning is often used to train ResNet architectures on small datasets <ref type="bibr" target="#b17">[18,</ref><ref type="bibr" target="#b18">19]</ref>.</p><p>For an in-depth review of "human-in-the-loop" approaches, refer to <ref type="bibr" target="#b19">[20]</ref>; in particular, <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref> discuss Interactive Machine Learning. Label Studio is a flexible labeling tool that implements active learning <ref type="bibr" target="#b20">[21]</ref>. Likewise, ilastik <ref type="bibr" target="#b21">[22]</ref> and Cellpose <ref type="bibr" target="#b22">[23]</ref> are machine learning tools that offer Interactive Machine Learning capabilities.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">The Project</head><p>Collective interest in illuminated manuscripts is still lower than the attraction exerted by works of art, and comes almost entirely from relatively few expert researchers, owing to language barriers and the complexity of interpreting handwritten sheets. In addition to the textual part, which is simultaneously the main vehicle of and greatest obstacle to content communication, the miniatures that captivate the viewer can be used to build accessible experiential paths based on a visual language understood regardless of one's country of origin or historical knowledge. The first step in creating an interface that lets the user explore manuscript pages through their more intuitive graphical content was to automate their recognition.</p><p>To develop the dataset required for deep neural network transfer learning, we implemented an iterative incremental training cycle composed of four main phases. First, a domain expert annotates or corrects the labels of a smaller dataset. Second, an AI expert uses transfer learning and hyperparameter optimization to train a deep neural network. Third, the model predicts the labels of the larger dataset. The cycle then restarts from the first phase, with the domain expert correcting the annotations predicted for the larger dataset. Labels were created using Label Studio <ref type="bibr" target="#b20">[21]</ref>.</p></div>
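The four-phase iterative cycle can be sketched as a generic loop. This is an illustrative reconstruction only: the function names below are placeholders standing in for the human annotation step, the transfer-learning step, and the prediction step, not the project's actual tooling.

```python
def run_iml_cycle(datasets, train, predict, expert_correct):
    """Iterative incremental training over progressively larger datasets.

    Phase 1: the domain expert annotates (or corrects) labels.
    Phase 2: the AI expert trains a model via transfer learning.
    Phase 3: the model pre-labels the next, larger dataset.
    Phase 4: the cycle restarts with expert correction of those predictions.
    """
    labels = expert_correct(datasets[0], {})        # phase 1: manual annotation
    for current, larger in zip(datasets, datasets[1:]):
        model = train(current, labels)              # phase 2: (re)train the model
        predicted = predict(model, larger)          # phase 3: pre-label larger set
        labels = expert_correct(larger, predicted)  # phase 4: expert correction
    return train(datasets[-1], labels)              # final model on the full set
```

In this project the three dataset sizes would be 400, 4,000 and 41,398 items, with Label Studio serving as the correction interface.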
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">The Dataset</head><p>To train our deep neural network to identify the different visual components of a page and to distinguish miniatures from ornaments or illuminated initials, we assembled an original training dataset. It contains images from digitized manuscript volumes, mainly dating to the Medieval and Renaissance periods, together with reproductions of incunabula and later printed copies, as well as works of art and objects typically found in digital museum catalogs, often featuring figurative subjects with a sacred theme as decoration.</p><p>Files come exclusively from institutional databases such as the US Library of Congress, The Metropolitan Museum of Art of New York and the J. Paul Getty Museum Collection of Los Angeles, but also from European ones such as the Staatsbibliothek of Berlin and the Koninklijke Bibliotheek of the Netherlands. We preferred images in the public domain or released under the Creative Commons CC-BY license. Furthermore, by using different sources we sought variety in scanning and shooting conditions as well as in reproduction quality, selecting both overall views and detail shots.</p><p>Each element was classified with an alphanumeric code that allows its origin to be traced quickly and provides information about the depicted subject, the century of production, and the specific author or, if unknown, at least its geographical area. As the number of elements increased, a prefix was also added to each series to distinguish the type of physical object digitized, the associated category and the subclass. An example of the resulting code is "P.01_S(An)_14BMGetty", where "P.01" indicates the category and subclass, "S(An)" identifies the subject as Saint Anne, "14" is the century of creation, "BM" the author's initials and "Getty" the provenance institution. 
At the same time, this system kept low the risk of inserting multiple copies of a work, especially for cases such as engravings, and allowed images with the same themes to be associated already in the early stages of the research, so that the examples of a poorly represented subject could be increased.</p><p>In total, the dataset consists of 45,798 items divided into two macro-categories, volumes (P) and art (A), and into ten subgroups: manuscript sheets (P.01), printed pages (P.02), paintings (A.01), engravings (A.02), drawings (A.03), sculptures (A.04), stained glass windows (A.05), tapestries (A.06), art prints (A.07) and other objects (A.08). Since the project focuses on manuscripts, the number of images dedicated to them is higher than for all other typologies, constituting 68.25% of the total, and was chosen to cover a wide variety of layouts and styles. They chiefly come from books of hours, psalters, missals, breviaries, and bibles, selected primarily for the presence of illustrations. The miniatures in the chosen manuscripts are both full-page and inserted in the text.</p><p>Always privileging the figurative element, sheets without illustrations were classified according to the presence of decorations and, only in their absence, according to text or musical scores, considering the dominant component. Four summary partitions were thus obtained in the main category, useful first to balance the composition of the dataset, avoiding the insertion of too many text pages compared to the other elements, and afterwards for the subdivision into three progressively larger training groups of 400, 4,000 and 41,398 items.</p></div>
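As an illustration, classification codes of this form can be decomposed mechanically. The field boundaries assumed here, in particular the split between the author's initials and the institution name, are our own reading of the single published example, not a documented specification.

```python
import re

# Hypothetical parser for codes like "P.01_S(An)_14BMGetty".
CODE_RE = re.compile(
    r"^(?P<category>[PA])\.(?P<subclass>\d{2})"  # macro-category and subclass
    r"_S\((?P<subject>[^)]+)\)"                  # depicted subject
    r"_(?P<century>\d{1,2})"                     # century of creation
    r"(?P<author>[A-Z]+?)"                       # author's initials (non-greedy)
    r"(?P<institution>[A-Z][a-z].*)$"            # provenance institution
)

def parse_code(code: str) -> dict:
    """Split a dataset item code into its named fields."""
    m = CODE_RE.match(code)
    if not m:
        raise ValueError(f"unrecognized classification code: {code!r}")
    return m.groupdict()
```

The non-greedy initials group relies on the institution name starting with a capital followed by a lowercase letter; codes that break that convention would need a different rule.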
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">The Training</head><p>The smallest group consists of 330 pages and 70 artworks, a proportion that remains constant in the second set and undergoes slight variations in the third, where the category "Pages" contains a total of 34,447 items, to which 6,951 images of the other category are added. In the preprocessing steps we standardized the dataset by resizing images to 1280 × 1280 pixels while maintaining proportions.</p><p>To limit human effort in the annotation phase, we manually labeled only the first group, reporting the presence or co-presence in the pages of figurative miniatures (Fig), decorations (Deco), text (Text), music scores (Mus) and illuminated initials (Let), with a minimum of one and a maximum of four tags per page, for a total of 818. This set, split into 70% training and 30% test, was used to initially train our deep neural network to distinguish between pages and art and between manuscript and printed pages, as well as to detect the layout elements.</p><p>We use a transfer learning approach based on a ResNet architecture <ref type="bibr" target="#b15">[16]</ref> pretrained on the ImageNet dataset <ref type="bibr" target="#b23">[24]</ref>. We planned to test ResNet-50 and ResNet-101 as well as ResNet-18. Since we had a relatively small dataset, we started with ResNet-18, because models with fewer parameters are less likely to be in an over-parametrization regime. We obtained unexpectedly good results for both the training and the test metrics. In future work, we will test the larger networks.</p><p>In the training phase, we augmented the dataset by transforming the images passed by the dataloader through the following sequence of transformations. First, the dataloader resized the images to 256 × 256, then it randomly cropped and resized the pictures to 256 × 256, and finally it normalized the images according to ResNet's suggested parameters. 
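The resizing and normalization steps can be sketched in plain NumPy. The authors' actual implementation is not specified; the normalization statistics below are the ImageNet mean and standard deviation commonly recommended for ResNet inputs, and the random crop-and-resize is simplified here to a random crop of an already-resized image.

```python
import numpy as np

# ImageNet channel statistics commonly used to normalize ResNet inputs.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def fit_within(w, h, target=1280):
    """New (width, height) fitting inside a target x target box,
    maintaining proportions, as in the 1280 x 1280 preprocessing step."""
    scale = target / max(w, h)
    return round(w * scale), round(h * scale)

def augment(img, rng, crop=256):
    """Random crop to crop x crop, then channel-wise ImageNet normalization.
    `img` is an HxWx3 float array with values in [0, 1]."""
    h, w, _ = img.shape
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    patch = img[top:top + crop, left:left + crop]
    return (patch - IMAGENET_MEAN) / IMAGENET_STD
```

A real pipeline would typically apply the equivalent transforms on tensors inside the dataloader rather than on raw arrays.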
Initially, a distinct group of deep neural networks was trained on the first group without the "Art" category; this allowed us to verify that art images do not negatively impact prediction accuracy.</p><p>We considered two gradient descent training methods: Stochastic Gradient Descent (SGD) and Adam. For SGD we used an exponential learning rate scheduler with a starting learning rate of 0.001, momentum 0.9, a scheduler step size of 7, and gamma 0.1. For Adam, we settled on a learning rate of 0.001.</p><p>We also compared two types of transfer learning: fine-tuning all weights, or freezing the feature-extraction layers during training and fine-tuning only the last layer.</p><p>Figure <ref type="figure" target="#fig_0">1</ref> shows the accuracy levels achieved: in almost all cases, fine-tuning all weights with SGD gave the best results, even if the difference is not very large. The accuracy rate is almost always above 90%, and particularly high in distinguishing pages from art and in detecting text or music, while the lowest value was recorded for illuminated initials, mostly because of the inherent ambiguity between simple ornamental letters and historiated initials.</p><p>Given these promising results, the models with the highest accuracy for each target were used to automatically predict the labels of the 4,000 images of the second set. The domain expert then carefully checked the predictions to identify specific model biases and corrected the assigned labels. These models demonstrated their ability to recognize with high precision the presence of decorations and illustrations on the pages, especially when there are borders and the scene occupies a large portion of the space. They are, however, less precise in distinguishing between miniatures inserted in the text and illuminated initials, often confusing the two elements and labeling some of the former as "Let". 
They also incorrectly classified blank pages as "Text" and applied the same tag to some sculptures together with the correct "Fig". In only ten cases was the prediction completely wrong: of the 4,000 images, only 1,549 required manual intervention, saving several hours of work and yielding an overall total of 7,205 annotations. Comparing the automatically applied tags with the corrected ones, the most significant decrease was observed for "Text", which went from 3,174 to 2,854, followed by "Let", reduced from 966 to 936. On the other hand, the "Fig" and "Deco" classes increased, going from 1,495 to 1,765 and from 1,214 to 1,292 respectively.</p><p>The accuracy of the predictions was considered satisfactory overall, with a rate of 0.98 for the recognition of figurative miniatures, 0.88 for text, 0.86 for decorations and 0.74 for illuminated initials. However, a particular difficulty was observed with musical scores printed after the 15th century, which are not recognized as such, significantly reducing the number of labels assigned by the system: only 251 out of the actual 358.</p></div>
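The SGD learning rate schedule reported in the training section (starting rate 0.001, step size 7, gamma 0.1) corresponds to a step decay, which can be written as a small helper. This is an illustrative reconstruction of the schedule's arithmetic, not the authors' training code.

```python
def stepped_lr(epoch, base_lr=0.001, step_size=7, gamma=0.1):
    """Learning rate decayed by `gamma` every `step_size` epochs:
    epochs 0-6 use base_lr, epochs 7-13 use base_lr * gamma, and so on."""
    return base_lr * gamma ** (epoch // step_size)
```

In a PyTorch-style setup the same behavior is what a step scheduler applied to an SGD optimizer (momentum 0.9) would produce.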
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusions</head><p>The initial recognition tests on the layout elements gave very encouraging results, as accuracy levels higher than 80% are not often obtained with such small datasets. Our approach allows for a gradual and efficient labeling process, and in this paper we stop at the end of the first step of the incremental learning cycle. In future work, we plan to progressively grow the number of annotated items to verify whether similar results are maintained for the third, larger group.</p><p>Once this phase is completed, a further step will be taken to improve the dataset with more complex labeling tasks such as instance-level semantic segmentation. Specifically, we will gradually move from image classification to the detection of specific components, using the activation zones as a guide to automatically place boxes and then perform the semantic segmentation of any miniatures present.</p><p>The masks thus obtained will be used to enrich an interactive and immersive web3D system based on an open-source framework <ref type="bibr" target="#b24">[25]</ref>, which will be investigated in the upcoming stages of the research. 
This will give the user the possibility to inspect and query large numbers of pages in a 3D space, juxtapose similar images, and compare them with art reproductions.</p><p>In doing so, we intend to demonstrate how, through the use of new technologies, figurative language can be used to structure narratives capable of making artifacts that were originally the prerogative of a few available to all, freeing a delicate cultural property from its physical and conservation limits and exploiting iconography as an interpretative system that can be understood independently of the spoken language.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Prediction accuracy for target labels for the 400 images dataset.</figDesc><graphic coords="4,72.00,65.61,451.28,188.04" type="bitmap" /></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Interactive machine learning for health informatics: when do we need the humanin-the-loop?</title>
		<author>
			<persName><forename type="first">A</forename><surname>Holzinger</surname></persName>
		</author>
		<idno type="DOI">10.1007/s40708-016-0042-6</idno>
	</analytic>
	<monogr>
		<title level="j">Brain Informatics</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="119" to="131" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Interactive machine learning in data exploitation</title>
		<author>
			<persName><forename type="first">R</forename><surname>Porter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Theiler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Hush</surname></persName>
		</author>
		<idno type="DOI">10.1109/MCSE.2013.74</idno>
	</analytic>
	<monogr>
		<title level="j">Computing in Science and Engineering</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="page" from="12" to="20" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">A survey of historical document image datasets</title>
		<author>
			<persName><forename type="first">K</forename><surname>Nikolaidou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Seuret</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Mokayed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Liwicki</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10032-022-00405-8</idno>
	</analytic>
	<monogr>
		<title level="j">International Journal on Document Analysis and Recognition (IJDAR)</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="page" from="305" to="338" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">The newspaper navigator dataset: extracting headlines and visual content from 16 million historic newspaper pages in Chronicling America</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">C G</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mears</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Jakeway</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ferriter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Adams</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Yarasavage</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Thomas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zwaard</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">S</forename><surname>Weld</surname></persName>
		</author>
		<idno type="DOI">10.1145/3340531.3412767</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 29th ACM International Conference on Information &amp; Knowledge Management, CIKM &apos;20</title>
				<meeting>the 29th ACM International Conference on Information &amp; Knowledge Management, CIKM &apos;20<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="3055" to="3062" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">The Met Dataset: instance-level recognition for artworks</title>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">A</forename><surname>Ypsilantis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Garcia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Han</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ibrahimi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><forename type="middle">V</forename><surname>Noord</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Tolias</surname></persName>
		</author>
		<ptr target="https://arxiv.org/abs/2202.01747" />
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">DIVA-HisDB: a precisely annotated large dataset of challenging medieval manuscripts</title>
		<author>
			<persName><forename type="first">F</forename><surname>Simistira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Seuret</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Eichenberger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Garz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Liwicki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ingold</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICFHR.2016.0093</idno>
	</analytic>
	<monogr>
		<title level="m">15th International Conference on Frontiers in Handwriting Recognition (ICFHR)</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="471" to="476" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">HBA 1.0: a pixel-based annotated dataset for historical book analysis</title>
		<author>
			<persName><forename type="first">M</forename><surname>Mehri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Héroux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Mullot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-P</forename><surname>Moreux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Coüasnon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Barrett</surname></persName>
		</author>
		<idno type="DOI">10.1145/3151509.3151528</idno>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 4th International Workshop on Historical Document Imaging and Processing, HIP &apos;17</title>
				<meeting>the 4th International Workshop on Historical Document Imaging and Processing, HIP &apos;17<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="107" to="112" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Approximate ground truth generation for semantic labeling of historical documents with minimal human effort</title>
		<author>
			<persName><forename type="first">N</forename><surname>Rahal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Vögtlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Ingold</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10032-024-00475-w</idno>
	</analytic>
	<monogr>
		<title level="j">International Journal on Document Analysis and Recognition (IJDAR)</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<biblScope unit="page" from="335" to="347" />
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Deep learning for historical document analysis and recognition-a survey</title>
		<author>
			<persName><forename type="first">F</forename><surname>Lombardi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Marinai</surname></persName>
		</author>
		<idno type="DOI">10.3390/jimaging6100110</idno>
	</analytic>
	<monogr>
		<title level="j">Journal of Imaging</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page">110</biblScope>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Manuscripts character recognition using machine learning and deep learning</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Islam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><forename type="middle">E</forename><surname>Iacob</surname></persName>
		</author>
		<idno type="DOI">10.3390/modelling4020010</idno>
	</analytic>
	<monogr>
		<title level="j">Modelling</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="168" to="188" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Towards historical document indexing: extraction of drop cap letters</title>
		<author>
			<persName><forename type="first">M</forename><surname>Coustaty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Pareti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Vincent</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J.-M</forename><surname>Ogier</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10032-011-0152-x</idno>
	</analytic>
	<monogr>
		<title level="j">International Journal on Document Analysis and Recognition (IJDAR)</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page" from="243" to="254" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Detecting gestures in medieval images</title>
		<author>
			<persName><forename type="first">J</forename><surname>Schlecht</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Carqué</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ommer</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICIP.2011.6115669</idno>
	</analytic>
	<monogr>
		<title level="m">2011 18th IEEE International Conference on Image Processing</title>
				<meeting><address><addrLine>Brussels, Belgium</addrLine></address></meeting>
		<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="1285" to="1288" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Miniature illustrations retrieval and innovative interaction for digital illuminated manuscripts</title>
		<author>
			<persName><forename type="first">D</forename><surname>Borghesani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Grana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Cucchiara</surname></persName>
		</author>
		<idno type="DOI">10.1007/s00530-013-0315-3</idno>
	</analytic>
	<monogr>
		<title level="j">Multimedia Systems</title>
		<imprint>
			<biblScope unit="volume">20</biblScope>
			<biblScope unit="page" from="65" to="79" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Illumination detection in IIIF medieval manuscripts using deep learning</title>
		<author>
			<persName><forename type="first">F</forename><surname>Aouinti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Eyharabide</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Fresquet</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Billiet</surname></persName>
		</author>
		<idno type="DOI">10.16995/dm.8073</idno>
	</analytic>
	<monogr>
		<title level="j">Digital Medievalist</title>
		<imprint>
			<biblScope unit="volume">15</biblScope>
			<biblScope unit="page" from="1" to="18" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">AI4MSS : un esperimento di intelligenza artificiale alla Biblioteca Apostolica Vaticana</title>
		<author>
			<persName><forename type="first">P</forename><surname>Manoni</surname></persName>
		</author>
		<idno type="DOI">10.1400/294065</idno>
	</analytic>
	<monogr>
		<title level="m">Guardando oltre i confini : partire dalla tradizione per costruire il futuro delle biblioteche : studi e testimonianze per i 70 anni di Mauro Guerrini</title>
				<editor>
			<persName><forename type="first">G</forename><surname>Bergamin</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Possemato</surname></persName>
		</editor>
		<imprint>
			<publisher>AIB - Associazione Italiana Biblioteche</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="231" to="244" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Deep residual learning for image recognition</title>
		<author>
			<persName><forename type="first">K</forename><surname>He</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Sun</surname></persName>
		</author>
		<idno type="DOI">10.1109/CVPR.2016.90</idno>
	</analytic>
	<monogr>
		<title level="m">IEEE Conference on Computer Vision and Pattern Recognition (CVPR)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="770" to="778" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Compare the performance of the models in art classification</title>
		<author>
			<persName><forename type="first">W</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Qiu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Jiang</surname></persName>
		</author>
		<idno type="DOI">10.1371/journal.pone.0248414</idno>
	</analytic>
	<monogr>
		<title level="j">PLoS ONE</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">A dataset and a convolutional model for iconography classification in paintings</title>
		<author>
			<persName><forename type="first">F</forename><surname>Milani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Fraternali</surname></persName>
		</author>
		<idno type="DOI">10.1145/3458885</idno>
	</analytic>
	<monogr>
		<title level="j">Journal on Computing and Cultural Heritage</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page">46</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Transfer learning for the visual arts: the multi-modal retrieval of Iconclass codes</title>
		<author>
			<persName><forename type="first">N</forename><surname>Banar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Daelemans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kestemont</surname></persName>
		</author>
		<idno type="DOI">10.1145/3575865</idno>
	</analytic>
	<monogr>
		<title level="j">Journal on Computing and Cultural Heritage</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="page">32</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Human-in-the-loop machine learning: a state of the art</title>
		<author>
			<persName><forename type="first">E</forename><surname>Mosqueira-Rey</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Hernández-Pereira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Alonso-Ríos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bobes-Bascarán</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Fernández-Leal</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10462-022-10246-w</idno>
	</analytic>
	<monogr>
		<title level="j">Artificial Intelligence Review</title>
		<imprint>
			<biblScope unit="volume">56</biblScope>
			<biblScope unit="page" from="3005" to="3054" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<monogr>
		<title level="m" type="main">Label Studio: data labeling software</title>
		<author>
			<persName><forename type="first">M</forename><surname>Tkachenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Malyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Holmanyuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Liubimov</surname></persName>
		</author>
		<ptr target="https://github.com/HumanSignal/label-studio" />
		<imprint>
			<date type="published" from="2020" to="2024">2020-2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<analytic>
		<title level="a" type="main">ilastik: interactive machine learning for (bio)image analysis</title>
		<author>
			<persName><forename type="first">S</forename><surname>Berg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Kutra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Kroeger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">N</forename><surname>Straehle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">X</forename><surname>Kausler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Haubold</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Schiegg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Ales</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Beier</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Rudy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Eren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">I</forename><surname>Cervantes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Beuttenmueller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Wolny</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Koethe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">A</forename><surname>Hamprecht</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kreshuk</surname></persName>
		</author>
		<idno type="DOI">10.1038/s41592-019-0582-9</idno>
	</analytic>
	<monogr>
		<title level="j">Nature Methods</title>
		<imprint>
			<biblScope unit="volume">16</biblScope>
			<biblScope unit="page" from="1226" to="1232" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">Cellpose 2.0: how to train your own model</title>
		<author>
			<persName><forename type="first">M</forename><surname>Pachitariu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Stringer</surname></persName>
		</author>
		<idno type="DOI">10.1038/s41592-022-01663-4</idno>
	</analytic>
	<monogr>
		<title level="j">Nature Methods</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<biblScope unit="page" from="1634" to="1641" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">ImageNet large scale visual recognition challenge</title>
		<author>
			<persName><forename type="first">O</forename><surname>Russakovsky</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Su</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Krause</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Satheesh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Karpathy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Khosla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Bernstein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">C</forename><surname>Berg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Fei-Fei</surname></persName>
		</author>
		<idno type="DOI">10.1007/s11263-015-0816-y</idno>
	</analytic>
	<monogr>
		<title level="j">International Journal of Computer Vision</title>
		<imprint>
			<biblScope unit="volume">115</biblScope>
			<biblScope unit="page" from="211" to="252" />
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">ATON: an open-source framework for creating immersive, collaborative and liquid web-apps for cultural heritage</title>
		<author>
			<persName><forename type="first">B</forename><surname>Fanini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ferdani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Demetrescu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Berto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Annibale</surname></persName>
		</author>
		<idno type="DOI">10.3390/app112211062</idno>
	</analytic>
	<monogr>
		<title level="j">Applied Sciences</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page">11062</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
