Automatic Annotation of Legal References (Allegationes) in the Liber Extra’s Ordinary Gloss

Andrea Esuli 1,*, Vincenzo Roberto Imperia 2 and Giovanni Puccetti 1

1 Istituto di Scienza e Tecnologie dell’Informazione “A. Faedo”, via G. Moruzzi 1 – 56124 Pisa PI, Italy
2 Università degli Studi di Palermo - Dipartimento di Giurisprudenza, via Maqueda, 172 – 90134 Palermo PA, Italy

andrea.esuli@isti.cnr.it (A. Esuli); vincenzoroberto.imperia@unipa.it (V. R. Imperia); giovanni.puccetti@isti.cnr.it (G. Puccetti)
https://esuli.it/ (A. Esuli); https://gpucce.github.io/ (G. Puccetti)
ORCID: 0000-0002-5725-4322 (A. Esuli); 0000-0001-5029-181X (V. R. Imperia); 0000-0003-1866-5951 (G. Puccetti)

Abstract
The study of normative corpora of the past is a key activity in the fields of Religious Studies and Legal History. The development of intelligent software tools that support this activity is of paramount importance to support the digital transformation of the community. We present an interdisciplinary activity that led to an accurate automatic annotation of legal references in the Liber Extra’s Ordinary Gloss. An index of legal references has been derived from the annotations, enabling the creation of novel navigation and data analysis tools. The contribution of this work is twofold: the index is already by itself a valuable resource for the discipline, and we detail the process that led to its production, showing that an effective result can be delivered by a small team with limited resources. Both the index and the code are made publicly available.

Keywords
Legal references, Information Extraction, Conditional Random Fields, Dataset

IRCDL 2025: 21st Conference on Information and Research Science Connecting to Digital and Library Science, February 20-21, 2025, Udine, Italy
* Corresponding author.
† Contributions: Esuli and Puccetti wrote the code, ran the experiments, and wrote Section 2. Imperia made the manual annotation and wrote Sections 1 and 3. All the authors wrote the abstract and the conclusions, and revised the final version of the paper.
© 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. Introduction

ITSERR [1] (Italian Strengthening of the ESFRI RI RESILIENCE) is an interdisciplinary and distributed Research Infrastructure for Religious Studies. The development of innovative AI-based tools that support the digital transformation of the community is one of the many goals of the project. In this context we present a contribution to the “GNORM” software, which aims to provide users with a set of tools and functionalities to facilitate research on large normative corpora of the past. One case study is to enable the consultation of the text of the Corpus Iuris Canonici [2] and its Glossa Ordinaria (Ordinary Gloss) [3], giving users access to the network embodied by the visual model in which legal texts (manuscripts and printed books) were structured, as described in Section 1.1.

1.1. Glossa and Allegationes, from Medieval Books to Computer Science

Before examining the specific case study presented here and evaluating the concrete applications in the field of legal history resulting from the development of automatic annotation techniques for legal allegations, it is necessary to clarify the meaning of terms such as gloss, ordinary gloss, and allegations, and in particular to consider the nature of allegations in the intellectual context of medieval jurists.

The term glossa (gloss) refers to “a brief annotation composed and written to explain a text and addressing either its terminology and its exterior trappings or its animating spirit and its underlying principles” [4]. From the late 11th century and for several centuries thereafter, the gloss served as the principal paratextual tool through which law masters in law schools, and later universities, explained and commented on Roman and Canon law compilations, enabling students and practitioners to engage
with technically complex texts. The term glossa ordinaria (ordinary gloss) refers to apparatus of glosses compiled by some eminent jurists, renowned for their exceptional quality and thoroughness. Their widespread acceptance ensured their consistent reproduction alongside the normative reference text, establishing them as the ordinary apparatus [5].

The phenomenon of normative texts accompanied by an ordinary gloss is evident in the manuscript production of the late Middle Ages and in the earliest printed works of the late 15th century, continuing into the significant editorial initiatives of the 16th century. Manuscripts and printed texts retained the standard page layout essentially unchanged, with the normative text and the ordinary gloss arranged according to a model that has been defined, in its essential features, as the “agora template” [7], as shown in Figure 1. This spatial configuration, with the normative text at the center framed by the glosses, mirrored the dialogical nature of the legal interpretation method: the authoritative text was accompanied by commentary that explained, questioned, or contradicted it. It can be affirmed that it is the very content of these apparatus that qualifies them as “hypertexts” [8], since they demand the active participation of the reader. The margins of the normative text contain various components, each serving a specific function, but collectively acting as tools to help the user understand the main text, making it more accessible and consultable. Particular attention will be paid here to one of these components, the allegationes, defined by Hermann Kantorowicz [9] as references to auctoritates that must justify every assertion or, at least, deal with those that at first glance appear to contradict it. These references are, in the vast majority of cases, citations of legal norms, presented in a highly abbreviated form, using abbreviations and numbers that correspond to the conventional criteria for identifying the legal source containing the norm. These references, initially sporadic in the earliest layers of glosses, gradually increased in number until they became quantitatively predominant, alongside the refinement of interpretive techniques developed by jurists.

Figure 1: A page from the Liber Extra’s Ordinary Gloss [6]. The box of text in the upper center of the page is the normative text from the Liber Extra. The surrounding text consists of the glosses, which also contain the legal references.

Much recent work [10, 11, 12, 13] has reaffirmed and clarified that allegations are not merely citations: the use of a reference constitutes an appeal to an authority that not only lends greater solidity to the legal argument, but forms its very foundation. The use of legal references by medieval jurists thus had a complex significance.
It demonstrated that interpretative operations were carried out on legitimate grounds, based, on the one hand, on applicable and valid legal norms and, on the other, on the consistent use of legal categories shared within a common cultural and intellectual context [11]. These conceptual premises are closely related and provide the backdrop for considerations concerning more practical aspects. The first pertains to the citation style of allegationes, which presupposed a normative text that was now stable, fixed, and unalterable. This allowed the reader to pinpoint the referenced passage with certainty [13]. The second concerns the nature and purpose of these references. The primary aim was to create a network in which the various sedes materiae were organised and linked, thus enabling the reader-user to navigate vast normative compilations [8]. Moreover, the use of legal references could take on an even more incisive argumentative function if, in addition to a principle or rule derived from the normative text, allegations pro and contra were included. This technique quickly evolved into a distinct literary genre, with many works constructed according to these specific criteria [14]. The formal and substantive nature of the allegationes in the works of the jurists underscores the need to exploit the potential of computer science, and specifically machine learning, to develop accurate tools capable of preserving their peculiarities in future digital editions of medieval legal texts [13].

2. Annotation of legal references

The complex and stratified genesis of the ordinary apparatus to normative compilations, closely linked to the very concept of the authoritative text in the Middle Ages, prevented the preparation of critical editions in the modern sense; consulting them still requires recourse to manuscripts or early modern printed editions [11]. This is also the case for the Corpus iuris canonici. For this reason, the Glossa Ordinaria (Ordinary Gloss) to the Decretales Gregorii IX, better known as the Liber Extra, was chosen for the design and development of an automatic annotation system for the legal allegations. Promulgated by the Pope with the bull Rex Pacificus, the legal collection is subdivided into 5 books, 185 titles, and 1971 chapters, with a total of 9872 lemmas. The gloss of each lemma contains a variable number of legal references. From the text of the 1582 Editio Romana [3], which is the common reference edition, a complete digital transposition was made by Edward A. Reno III, as part of the project “The Digital Decretals” [15]. The files containing the digital text can be found on the project’s website, together with specific references to the sections of the apparatus not included in the transcription work, as well as the interventions made to standardise elements of spelling, abbreviation, punctuation, and numbering in the text.

Figure 2 illustrates the process that led to an accurate automatic annotation of the whole Liber Extra’s Ordinary Gloss, based on the annotation of a small subset by a human expert, followed by the creation of an index of all the annotations, in which every legal reference, pointing to a specific norm, is linked to a lemma, chapter, and title of the Liber Extra. The following sections detail this process.

2.1. The annotation schema

An annotation is a span of text marked to identify relevant information of a specific type.
The annotation schema we defined identifies four types of entities:

• Annotations of type glossed lemma (“Lemma glossato”) indicate the specific terms that are glossed. Legal references are included in the gloss text.
• Annotations of type legal reference (“Allegazione normativa”) mark the references to legal norms or regulations.
• The last two annotation types are title (“Titolo”) and chapter (“Capitolo”), which together with the lemma precisely identify the position of the legal reference in the Liber Extra.

The annotation of these entities thus enables building a link between the Liber Extra and the legal norms and regulations that are crucial for the interpretative framework of the book. The annotation data also include the exact character positions of the beginning and end of the annotated text in the digital text.

2.2. Expert annotation

A domain expert performed the annotation of the legal references on a subset of the book. The expert annotated 12 titles out of 185, for a total of 4578 annotations. The titles were randomly sampled across the whole book to better cover the different content of each of its sections; following Reno’s notation [15], they are: 1.02, 1.11, 1.33, 2.02, 2.23, 3.02, 3.26, 4.17, 4.19, 5.01, 5.03, 5.23. The expert focused only on annotating the legal references, as the annotations of the other three types were made afterwards in a completely automatic way, as described in Section 2.3.1.

The tool used by the expert to perform the manual annotation is INCEpTION [16]. INCEpTION is a popular annotation platform designed for collaborative and efficient text annotation. It supports a wide range of tasks, including named entity recognition, relation extraction, and classification. INCEpTION supports the manual annotation with a dedicated GUI, and also by providing automated suggestions for spans of text that it evaluates as potential new annotations, which the expert can validate with a single mouse click. The automated suggestions are based on the definition of a recommender, i.e., a machine-learning algorithm that trains a model by continuously observing the annotations made by the expert. The automatic annotation model trained by INCEpTION is a valuable help for the annotator, yet it is not accurate enough to perform a complete automatic annotation of the rest of the book. We thus used the 12 annotated titles as a training set for a proper batch training process of more accurate models, as detailed in Section 2.3.

Figure 2: An illustration of the process that led to the construction of the index of all the legal references in the Liber Extra’s Glossa Ordinaria.

2.3. Training a model for the automatic annotation of legal references

The code to replicate the automatic annotation is published with an open source license [17]. We focus here only on the legal references, as the annotation of the other types of entities has been solved using different methods, as detailed in Section 2.3.1.

The automatic annotation process is modeled as a word labeling problem. We adopted the BILOU labeling schema [18], in which each word is assigned one of a set of labels, depending on whether or not it is part of a legal reference (O label when it is not). For words that are part of a legal reference, different labels are used depending on whether the word is at the beginning of the annotation (B label), is the last word (L label), is inside the sequence of words that form the legal reference (I label), or whether the legal reference is composed of a single word (U label).
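As a concrete illustration of the BILOU encoding, the following minimal sketch converts character-level spans of legal references into per-word labels. It is not taken from the released code [17]: the whitespace tokenization and the (start, end) span format are assumptions made for illustration.

```python
def bilou_labels(text, spans):
    """text: plain text of a gloss; spans: list of (start, end) character
    offsets of annotated legal references. Returns parallel lists of
    words and BILOU labels, one label per whitespace-separated word."""
    words, labels, pos = [], [], 0
    for word in text.split():
        start = text.index(word, pos)   # character offset of this word
        end = start + len(word)
        pos = end
        covering = [(s, e) for (s, e) in spans if s <= start and end <= e]
        if not covering:
            labels.append("O")          # outside any legal reference
        else:
            s, e = covering[0]
            first = text[s:start].strip() == ""  # no earlier word in the span
            last = text[end:e].strip() == ""     # no later word in the span
            labels.append("U" if first and last else
                          "B" if first else
                          "L" if last else "I")
        words.append(word)
    return words, labels


# Toy Latin context invented for illustration; the span covers "22. q. 4 incommutabilis"
words, labels = bilou_labels("ut 22. q. 4 incommutabilis dicit", [(3, 26)])
print(list(zip(words, labels)))
# [('ut','O'), ('22.','B'), ('q.','I'), ('4','I'), ('incommutabilis','L'), ('dicit','O')]
```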
We tested two approaches to train the automatic annotation model: a “traditional” statistical machine learning algorithm, i.e., Conditional Random Fields [19] (CRFs), and the fine-tuning of a transformer-based Large Language Model (LLM) [20]. A preliminary test of few-shot prompting [21] with a generative LLM yielded very low accuracy and was discarded.

CRFs are probabilistic graphical models used to model sequential or structured data by defining conditional probabilities of a set of target variables given a set of observed variables, allowing for the incorporation of context and interdependencies among the variables in the target space. In natural language processing (NLP), CRFs are often used for tasks like named entity recognition or part-of-speech tagging, where the output labels (e.g., tags) are interdependent. We tested two configurations for the CRFs graph: a basic configuration in which the observed variables are all the word bigrams in a context window of three words before and after the one to be annotated, and a rich configuration that considers bigrams and trigrams in a context window of size seven.

Figure 3: The annotation interface in INCEpTION. The document being annotated is shown in the center of the page. Text highlighted in different colors indicates the annotated entities of different types: orange for the “Lemma glossato” (glossed lemma), green for the “Allegazione normativa” (legal reference). The left-side bar lists all the annotations in the document, the right-side bar shows the details of the currently selected annotation.

For the fine-tuning of LLMs we tested two models: the original BERT [22], as a baseline, and Latin BERT [23], a state-of-the-art model for Latin. The fine-tuning of the LLMs was performed using the Transformers Python package [24], training for 10 epochs with a learning rate of 2·10^-5, a weight decay of 0.01, and a batch size of 32. Sequences longer than 512 tokens were split into multiple sequences.

The comparison of the methods is based on a 10-fold cross-validation on the expert-annotated data. The annotated data is split into ten parts, and the splits are kept the same across all the tested methods. For every fold, one tenth of the annotated data is used as test data and the remaining nine tenths as training data. The process is repeated for all folds, collecting for each tested method an automatic annotation of all the data. The accuracy of the automatic annotation is determined by comparing it with the annotations from the expert, using a token-and-blank evaluation model [25].

Table 1: Results of cross-validation experiments (10 folds) on the expert-annotated data. “Training Time” reports the time required to train an automatic annotation model on the entire set of expert-annotated data.

Method          | Accuracy | Training Time | Hardware       | Model size
CRFs basic      | 0.891    | 7m            | CPU (20 cores) | 0.5 MB
CRFs rich       | 0.978    | 21m           | CPU (20 cores) | 1.1 MB
BERT [22]       | 0.837    | 13m           | GPU (4 A40)    | 411.8 MB
Latin BERT [23] | 0.924    | 13m           | GPU (4 A40)    | 423.3 MB

Results in Table 1 show that CRFs in the rich configuration performed with close to perfect accuracy. The more complex graph of the rich configuration required three times the training time of the basic configuration, yet the training time is still short and the improvement is worth the additional cost. The comparison between BERT and Latin BERT shows the impact of the main pre-training language on the task. The tokenizer of BERT unsurprisingly struggles with Latin words. For example, the five-word expression “Quae est radix omnium malorum” is tokenized by BERT into 19 tokens, whereas Latin BERT produces exactly five tokens. This allowed Latin BERT to better identify the relevant elements of the language that are related to the expression of legal references, whereas BERT struggled with very short sub-word tokens, which are evidently less statistically related to the concept of legal reference.
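For the CRF approach, which ultimately performed best, the feature templates are described above only in prose. The following rough sketch shows how the windowed n-gram features and the training could be realized with the sklearn-crfsuite package; the package choice, the variable names, and the toy data are our assumptions for illustration, not the released implementation [17].

```python
import sklearn_crfsuite  # assumed library; the released code may use a different CRF implementation


def window_features(words, i, window=3, orders=(2,)):
    """Feature dict for the i-th word: the word itself plus all word
    n-grams (of the given orders) that fall inside a +/- `window` context."""
    feats = {"word": words[i].lower()}
    lo, hi = max(0, i - window), min(len(words), i + window + 1)
    ctx = [w.lower() for w in words[lo:hi]]
    for n in orders:
        for j in range(len(ctx) - n + 1):
            feats[f"{n}gram=" + " ".join(ctx[j:j + n])] = 1.0
    return feats


def featurize(sentences, window, orders):
    return [[window_features(ws, i, window, orders) for i in range(len(ws))]
            for ws in sentences]


# Toy data standing in for the expert-annotated titles (invented for illustration).
train_sentences = [["ut", "22.", "q.", "4", "incommutabilis", "dicit"]]
train_labels = [["O", "B", "I", "I", "L", "O"]]

# "basic" configuration: bigrams, window of 3; "rich": bigrams and trigrams, window of 7
X_train = featurize(train_sentences, window=7, orders=(2, 3))
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=200)
crf.fit(X_train, train_labels)
print(crf.predict(featurize(train_sentences, window=7, orders=(2, 3))))
```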
The comparison of CRFs with LLMs highlights that, in this annotation task, CRFs have several advantages. The most obvious is that CRFs achieve the best accuracy. No less important is that CRFs require fewer computational resources. The training time of CRFs was measured on a desktop with a single Intel i9 CPU, while the LLM times were measured on a dedicated server with 4 NVIDIA A40 GPUs, costing roughly ten times as much as the desktop computer. Similarly, the sizes of the final models show a clear advantage for CRFs. The differences in training time are not very relevant, considering the differences in hardware and the fact that even the longest training time is relatively short. CRFs make it possible to train an annotation model on personal hardware, enabling the adoption of machine learning by small research groups with limited computational resources. Further experiments on the fine-tuning of LLMs might have reduced the gap with CRFs, yet the high accuracy obtained by CRFs satisfied our goals and did not justify the additional computational costs.

2.3.1. Automatic annotation of other entities

The annotation of chapters and titles was made using a regular expression, exploiting the specific format used in The Digital Decretals [15]. For the annotation of the glossed lemmas, we only had an ordered list of the glossed lemmas, leaving us to identify their positions in the text. We solved this with a search algorithm that considers the constraints imposed by the order of the list. For example, the word “Omnipotens” occurs 31 times in the whole text, but its only instance as a glossed lemma occurs after the glossed lemma “Incomprehensibilis” and before “Ineffabilis”. Solving all the constraints for all the glossed lemmas allowed us to find the exact position of each of them.

2.4. Building the index of legal references

The automatic annotation of the whole text identified 41784 legal references [26], each linked to a glossed lemma, a chapter, a title, and a book part (among the five parts of the Ordinary Gloss corresponding to the five books of the Liber Extra). The automatic annotation thus had a multiplier effect of about 9.1 with respect to the number of annotations from the expert. We split each legal reference into two parts: the “title” or “section” that contains the referenced norm, and the string that identifies the norm. For example, the legal reference “22. q. 4, incommutabilis” points to Gratian’s Decretum, Causa 22, Quaestio 4, and its chapter (or canon) “incommutabilis” (according to today’s citation standard: C.22 q.4 c.9). This allowed us to identify a total of 1795 referenced titles or sections from various collections, i.e., the Corpus Iuris Canonici and the Corpus Iuris Civilis.

The index enables researchers to perform the activities presented in Section 1. In addition to its use as a browsing resource, the index itself can be the subject of analysis. For example, Table 2 lists the ten titles or sections with the most references in the Ordinary Gloss, considering the whole Liber Extra and each of its books separately. This is a simple example of a data analysis enabled by the index; a sketch of how such a ranking can be computed is shown after the table.

Table 2: The ten sections or titles with the most references in the Ordinary Gloss, considering the whole Liber Extra or each book of the Liber Extra separately. Numbers indicate the number of references.

Liber Extra         | Book 1              | Book 2               | Book 3                   | Book 4                   | Book 5
de elect. 1217      | de elect. 739       | de appell. 499       | de praeben. 225          | de spons. 200            | de sent. excom. 405
de appell. 930      | de offi. deleg. 366 | de testib. 363       | 12. q. 2 181             | de despon. impub. 100    | de simon. 223
de offi. deleg. 721 | de rescript. 361    | de offi. deleg. 219  | de iure patron. 177      | de eo qui dux. 63        | de accusat. 172
de sent. excom. 689 | de appell. 296      | de elect. 207        | de elect. 164            | qui fil. sint legit. 59  | 11. q. 3 150
de rescript. 585    | de praeben. 158     | de praescrip. 153    | 16. q. 1 162             | de eo qui cog. cons. 55  | de homic. 127
de testib. 545      | ff. de procur. 121  | de restit. spol. 145 | de decim. 160            | de cond. appos. 48       | de privileg. 116
de praeben. 439     | de re iudic. 104    | de iureiur. 135      | de conver. coniug. 108   | de frig. et malef. 45    | 1. q. 1 114
12. q. 2 388        | ff. de recepti. 103 | de re iudic. 132     | 16. q. 7 108             | de divort. 37            | de haeret. 107
de simon. 379       | de sent. excom. 91  | 2. q. 6 129          | de censib. 103           | 27. q. 2 36              | 50. dist. 102
11. q. 3 359        | de renunciat. 91    | de probat. 124       | de concess. praeben. 95  | de cons. et affin. 34    | de elect. 87
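Rankings such as those in Table 2 can be recomputed from the index with a few lines of pandas. The column names and the toy rows below are assumptions made for illustration; see the published index [26] for the actual schema.

```python
import pandas as pd

# Toy records standing in for the published index (book, title, referenced section, referenced norm).
refs = pd.DataFrame([
    {"book": 1, "title": "1.02", "section": "de elect.", "norm": "cum inter"},
    {"book": 1, "title": "1.02", "section": "de elect.", "norm": "licet"},
    {"book": 2, "title": "2.02", "section": "de appell.", "norm": "ut debitus"},
])

# Ten most-referenced sections over the whole Ordinary Gloss
top_overall = refs.groupby("section").size().nlargest(10)

# The same ranking computed separately for each book of the Liber Extra
counts = refs.groupby(["book", "section"]).size().rename("n").reset_index()
top_per_book = (counts.sort_values(["book", "n"], ascending=[True, False])
                      .groupby("book").head(10))

print(top_overall)
print(top_per_book)
```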
3. Qualitative analysis

A qualitative evaluation of the results of the automatic annotation process is particularly relevant when considering the degree of correctness and accuracy in relation to the amount of data processed. Regarding the correctness of the annotation, the system’s ability to distinguish legal references from the rest of the text, which is usually written in a discursive and argumentative style in keeping with the explanatory function of the gloss, is impressive. Concerning the accuracy of the annotation, the system has demonstrated, in most cases, its ability to delimit the exact scope of legal references. There are a few exceptions when the text of the allegation is unusually long or contains overly specific references.

There are no particular issues in identifying allegations that refer to the Liber Extra or to the three partes of Gratian’s Decretum, as each is clearly defined in its citation form. The same holds true for the allegations to the Code or the Institutes of the Corpus Iuris Civilis, which are usually clearly preceded by their respective symbols (C. or Inst.). The automatic annotation has a recurring imprecision when identifying references to the Digest, preceded by the abbreviation “ff.”, which is erroneously omitted in many cases. In this case the automatic annotation model showed a preference for shorter annotations, considering that in most cases both versions of the annotation, with or without “ff.”, are potentially correct, yet refer to different sources. The specific nature of this error made it easy to fix with a simple automatic post-processing of the annotations, which checks whether a “ff.” immediately precedes an annotation and, if so, adds it to the annotation (a sketch is shown at the end of this section); we found no cases in which a “ff.” preceding an annotation had to be excluded. The published version of the index [26] is thus not affected by this issue.

The cases in which the system detects text passages as legal allegations when they are not are quantitatively insignificant. Particularly interesting are the cases where the system, due to the syntactic structure of the text, has detected actual allegations that are “non-legal”, such as references to Gospel passages.
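A minimal sketch of the “ff.” post-processing step described above, under the assumption that annotations are represented as (start, end) character offsets; the released code [17] may implement it differently.

```python
import re


def extend_ff(text, spans):
    """If a predicted annotation is immediately preceded by the Digest marker
    "ff." (possibly separated by whitespace), extend the span to include it."""
    fixed = []
    for start, end in spans:
        m = re.search(r"ff\.\s*$", text[:start])
        if m:
            start = m.start()
        fixed.append((start, end))
    return fixed


# Toy example: the predicted span covers "de procur. l. 1" and gains the preceding "ff."
text = "ut ff. de procur. l. 1"
print(extend_ff(text, [(7, 22)]))   # [(3, 22)]
```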
In conclusion, the large number of correct annotations and the limited effort required to correct the erroneous ones, which will be the next step of the project’s activities, confirm the validity and potential of the annotation approach we presented.

4. Conclusions

In the field of Legal History, the possibility of using an automatic annotation system for legal allegations is proving to be a valuable tool. The development of coordinated and interconnected databases of normative texts and their commentary apparatus, as well as the creation of digital editions of fundamental works of medieval jurists, are just a few examples of potential applications. Moreover, the creation of specific reference ontologies could enable the training of increasingly complex retrieval systems, capable of extracting and sorting a vast amount of data that would otherwise be unmanageable and would make larger projects practically infeasible.

In the case study presented in this work, we have been able to annotate a large amount of text by exploiting the annotation made by a human expert on a tenth of the corpus, and then a very effective machine learning setup to obtain a complete, accurate annotation of the whole text. The outcome of this research activity is twofold: we produced a valuable resource, the index, that will contribute to the GNORM software and support studies on the Ordinary Gloss and the Liber Extra; and we have demonstrated an effective low-resource pipeline that can be replicated for similar activities in the fields of Religious Studies, Legal History, and many related disciplines in the humanities.

Acknowledgments

This work was supported by the project "Italian Strengthening of ESFRI RI RESILIENCE" (ITSERR) funded by the European Union under the NextGenerationEU funding scheme (CUP: B53C22001770006).

References

[1] ITSERR (Italian Strengthening of the ESFRI RI RESILIENCE), 2024. URL: https://itserr.it.
[2] K. Pennington, Corpus iuris canonici, in: J. Otaduy, A. Viana, J. S. Rueda (Eds.), Diccionario general de derecho canónico, Thomson Reuters Aranzadi, 2012, pp. 757–765.
[3] Bernard of Parma, Glossa Ordinaria to Decretals Gregory IX, in: Decretales D. Gregorii Papae IX. suae integritati una cum glossis restitutae. Cum privilegio Gregorii XIII. Pont. Max. et aliorum Principum, Romae, In Aedibus Populi Romani, 1582.
[4] M. Bellomo, The common legal past of Europe, 1000–1800, 1995.
[5] G. Dolezalek, Glosses and the juridical genre “apparatus glossarum” in the middle ages, Rivista Internazionale di Diritto Comune 32 (2021) 9–54.
[6] Decretales D. Gregorii Papae IX. suae integritati una cum glossis restitutae. Cum privilegio Gregorii XIII. Pont. Max. et aliorum Principum, Romae, In Aedibus Populi Romani, 1582, available at UCLA Library Digital Collections, 2024. URL: https://digital.library.ucla.edu/catalog/ark:/21198/zz0014rx7w?cv=35.
[7] A. M. Hespanha, Form and content in early modern legal books, Rechtsgeschichte-Legal History 12 (2008) 12–50.
[8] G. Speciale, Apparatus: ipertesto vivo e aperto, Ius Commune. Zeitschrift für Europäische Rechtsgeschichte 28 (2001) 47–59.
[9] H. U. Kantorowicz, Die allegationen im späteren mittelalter, Archiv für Urkundenforschung (1935) 15–29.
[10] D. Quaglioni, Licet allegare poetas: formanti letterari del diritto fra medioevo ed età moderna, in: Poesia e diritto nel Due e Trecento italiano (Memoria del tempo; 65), 2019, pp. 209–219.
[11] S. Menzinger, Reflections on the connection between author and text in medieval juridical production, Historia et ius 11 (2017).
[12] S. Menzinger, The past, the others, himself: The open dialogue of a medieval legal author with his text, in: Sicut dicit: Editing Ancient and Medieval Commentaries on Authoritative Texts, Turnhout, 2019, pp. 273–299.
[13] S. Menzinger, Interazione tra testo e ‘citazione’ nella dottrina giuridica civilistica: secoli XII e XIII, in: Juristische Glossierungstechniken als Mittel rechtswissenschaftlicher Rationalisierungen, Erich Schmidt Verlag, 2022, pp. 15–26.
[14] P. Weimar, Argumenta brocardica, Studia Gratiana 14 (1967) 89–123.
[15] E. Reno, The Digital Decretals, https://www.digitaldecretals.com/, 2024. [Online; accessed 1-November-2024].
[16] J.-C. Klie, M. Bugert, B. Boullosa, R. E. de Castilho, I. Gurevych, The INCEpTION platform: Machine-assisted and knowledge-oriented interactive annotation, in: Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations, Association for Computational Linguistics, 2018, pp. 5–9. URL: http://tubiblio.ulb.tu-darmstadt.de/106270/.
[17] A. Esuli, G. Puccetti, A Machine Learning pipeline to automatically annotate legal references (allegationes) in the Liber Extra’s Ordinary Gloss, 2024. URL: https://github.com/aesuli/CIC_annotation. doi:10.5281/zenodo.14381817.
[18] L. Ratinov, D. Roth, Design challenges and misconceptions in named entity recognition, in: Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL-2009), 2009, pp. 147–155.
[19] C. Sutton, A. McCallum, An introduction to conditional random fields, Foundations and Trends® in Machine Learning 4 (2012) 267–373.
[20] A. Vaswani, et al., Attention is all you need, Advances in Neural Information Processing Systems (2017).
[21] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, Language models are unsupervised multitask learners, OpenAI blog 1 (2019) 9.
[22] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: J. Burstein, C. Doran, T. Solorio (Eds.), Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational Linguistics, Minneapolis, Minnesota, 2019, pp. 4171–4186. URL: https://aclanthology.org/N19-1423. doi:10.18653/v1/N19-1423.
[23] D. Bamman, P. J. Burns, Latin BERT: A contextual language model for classical philology, arXiv preprint arXiv:2009.10053 (2020).
[24] T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, et al., Transformers: State-of-the-art natural language processing, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2020, pp. 38–45.
[25] A. Esuli, F. Sebastiani, Evaluating information extraction, in: International Conference of the Cross-Language Evaluation Forum for European Languages, Springer, 2010, pp. 100–111.
[26] A. Esuli, V. R. Imperia, G. Puccetti, Automatic Annotation of the Legal References in the Liber Extra’s Ordinary Gloss (1.0) [Data set], 2024. doi:10.5281/zenodo.14381709.