=Paper=
{{Paper
|id=Vol-2949/short1
|storemode=property
|title=The IMAGO Project: Towards a Knowledge Base of Medieval and Renaissance Geographical Works (extended abstract)
|pdfUrl=https://ceur-ws.org/Vol-2949/short1.pdf
|volume=Vol-2949
|authors=Valentina Bartalesi,Nicolò Pratelli
|dblpUrl=https://dblp.org/rec/conf/swodch/BartalesiP21
}}
==The IMAGO Project: Towards a Knowledge Base of Medieval and Renaissance Geographical Works (extended abstract)==
Valentina Bartalesi [0000-0001-9024-0822] and Nicolò Pratelli [0000-0003-0364-922X]
ISTI-CNR, Via G. Moruzzi 1, 56124 Pisa, Italy
{valentina.bartalesi,nicolo.pratelli}@isti.cnr.it
Abstract. The image of the world created by Medieval and Renaissance culture was
crucial to the development of Western thought in European history. To the best of
our knowledge, Medieval and Renaissance geographical works have not been studied
using digital methods. The three-year (2020-2023) Italian national research project
IMAGO - Index Medii Aevi Geographiae Operum - aims at providing a systematic
overview of this literature using Semantic Web technologies. As the first step
towards developing tools that support scholars in creating, evolving and consulting
a knowledge base (KB) of the geographical works, we created an OWL 2 DL ontology.
To favour reuse and maximize interoperability, we developed the ontology as an
extension of two reference ontologies, namely the CIDOC CRM vocabulary and its
extension FRBRoo, including its in-progress reformulation, LRMoo. In this paper,
we present the project, the ontology and the tool that we developed to populate it.
Furthermore, we present a preliminary study to map the works collected in the
IMAGO KB to the manuscripts stored in the KB of the Mapping Manuscript Migrations
project.
Keywords: Semantic Web · Medieval geographical works · Digital Humanities.
1 Introduction
The image of the world created by Medieval and Renaissance culture was crucial to
the development of Western thought in European history. During the Middle Ages,
geographical descriptions were mostly used to collect human knowledge into
encyclopedic works or to provide universal chronicles [11]. Specific descriptions
of lands, cities, places, monuments and buildings were also supplied as a guide to
the pilgrims travelling to the Holy Land, Rome and Santiago de Compostela [10]. By
the end of the Middle Ages and the beginning of Renaissance Humanism, a clearer
image of the world was defined thanks to the discovery of ancient geographical
models (especially the works of Ptolemy and Strabo): detailed information from the
past helped to produce more accurate geographical descriptions and maps.
Furthermore, the genre of geographical description reached a further and decisive
turning point during the period of exploration travels and discoveries: the
description and representation of the New World, along with the reassessment of
physical space, laid the basis of modern geography [4].
Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons
License Attribution 4.0 International (CC BY 4.0).
Until now, Medieval and Renaissance geographical works have not been studied using
digital methods. The three-year (2020-2023) Italian national research project
IMAGO - Index Medii Aevi Geographiae Operum - aims at providing a systematic
overview of this literature using Semantic Web technologies, in order to make this
knowledge available as Linked Open Data (LOD) [2] and to develop automatic search
and visualisation services on the collected data. In particular, the project aims
to produce and make available to users a complete survey of Medieval and Humanistic
geographical works, providing: (i) a classification of authors, genres and
contents; (ii) a list of the manuscript tradition and printed editions for each
work; (iii) a list of critical editions of some of the more representative works;
(iv) a Medieval Latin toponymy index.
As the first step towards developing tools that support scholars in creating,
evolving and consulting a knowledge base (KB) of the geographical works, we created
an OWL 2 DL ontology [9] that formally represents this knowledge. To favour reuse
and maximize interoperability, we developed our ontology as an extension of two
reference ontologies, namely the CIDOC CRM [5] vocabulary and its extension
FRBRoo [6], including its in-progress reformulation, LRMoo [12].
The final aim of the project is the creation of a Web application allowing scholars
to freely access and visualise the data collected in the IMAGO knowledge base. The
idea is to advance the study of Medieval and Renaissance Humanism geography by
giving scholars better insight into this field from many perspectives, such as
Medieval Latin toponymy and the identification of historical places. The Web
application will also host a special section on Medieval and Renaissance
cartography, in order to provide a digital collection of the most interesting maps
and drawings.
2 The IMAGO Ontology
As the first step towards developing tools that support scholars in creating,
evolving and consulting a KB of the Medieval and Renaissance geographical works, we
created an ontology that formally represents this knowledge. The IMAGO ontology
results from a close collaboration between ISTI-CNR and the scholars from the
University of Pisa and the University of Salento, experts in Latin and Italian
Literature and Linguistics, who are involved in the project. The methodology we
followed to develop the ontology is well known and is the one usually adopted to
create formal vocabularies in the Semantic Web research field. The main novelty
introduced by our research is the use of Semantic Web technologies to formally
represent the scientific domain of the geographical Latin works written during the
Middle Ages and the Renaissance. Although other research projects have used
semantic technologies to represent ancient manuscript corpora [7, 1, 3, 8], no
scientific research applying a Semantic Web approach has been conducted in this
specific field. Furthermore, the information about the geographical Latin works
written during the Middle Ages and the Renaissance is dispersed across paper books,
which makes a systematic overview of the geographical literature impossible and
prevents a well-ordered perception of how it gradually took shape over time. The
IMAGO project aims at making this information available in digital form to both
scholars and general users. We defined a conceptualisation of the domain of
knowledge and then formalised it using classes and properties from two existing
ontologies that we chose as reference vocabularies, namely the CIDOC CRM and its
extension FRBRoo, including its in-progress reformulation LRMoo. We adopted many
terms from these ontologies to maximize the interoperability of our representation.
Finally, we added our own classes and properties to represent the terms that we did
not find in the reference vocabularies. The resulting ontology is expressed in the
OWL 2 DL language. Our conceptual idea is that the domain of the geographical work
can be represented using some main categories. The first ones are the author and
title of a work. For each work, the literary genre is specified along with the
toponyms that represent the places described or reported in the work. Furthermore,
for each work, several metadata about the related manuscripts and printed editions
are added. In particular, for each manuscript the following knowledge is reported:
the name of the author and the title of the work in the forms that appear in the
manuscript; the library in which the manuscript is held; the location of the
library; the signature (shelfmark) and the folios of the manuscript; the incipit
and explicit of the dedication/proem, if they exist; the incipit and explicit of
the text; the date of creation of the manuscript; the secondary sources.
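The manuscript description above can be read as a record schema. The following Python sketch is only an illustration of that list of fields: the class name, the field names and the sample values are hypothetical and are not taken from the IMAGO ontology or tool. The date is stored as a pair of years, following the choice of treating dates as time intervals.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ManuscriptRecord:
    """Hypothetical container for the manuscript metadata listed in the text."""

    author_as_in_manuscript: str
    title_as_in_manuscript: str
    library: str
    library_place: str
    signature: str
    folios: str
    text_incipit: str
    text_explicit: str
    # Dates are treated as time intervals: (earliest year, latest year).
    date: Tuple[int, int]
    # The dedication/proem may be missing, so these two fields are optional.
    proem_incipit: Optional[str] = None
    proem_explicit: Optional[str] = None
    secondary_sources: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        start, end = self.date
        if start > end:
            raise ValueError("date interval must satisfy start <= end")


# Invented placeholder values, used only to show how a record is built.
example = ManuscriptRecord(
    author_as_in_manuscript="Author name as written",
    title_as_in_manuscript="Title as written",
    library="Example Library",
    library_place="Example City",
    signature="MS 1",
    folios="1r-20v",
    text_incipit="First words of the text",
    text_explicit="Last words of the text",
    date=(1400, 1450),
)
```

Keeping the proem fields optional mirrors the "if they exist" condition in the description, and the interval check rejects dates whose end precedes their start.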
On the other hand, for each printed edition the following knowledge is reported:
the author, the title and the curator’s name of the edition; the place and the date
of publication; the publisher; the format of the edition; the number of pages; the
information about the images reported in the edition; some general notes that the
scholars intend to add as comments on the edition; the name of the author of the
introduction, the text of the introduction and the text of the dedications;
information about whether the edition is a first edition or a reprint; primary and
secondary sources of the edition; the ecdotic typology. Table 1 reports the classes
we used to represent our main concepts, and Table 2 lists the properties we used to
express the semantic relationships among concepts. As a notational convention, the
CIDOC CRM uses the letters “E” and “P” to indicate classes and properties
respectively, whereas FRBRoo (and its revision LRMoo) uses the letters “F” and “R”.
Note that we modelled dates as time intervals and that we preferred the classes of
FRBRoo and LRMoo to the corresponding classes of CRM when the concepts to be
represented fell under the semantics of bibliographic information. Furthermore, we
used F2 Expression instead of F1 Work to represent a work, since by work we mean a
particular edition of that work.
Table 1. Classes used to represent our main concepts
Concept | Class
Author | subclass of E39 Actor
Work | equivalent to F2 Expression
Work creation | equivalent to F28 Expression Creation
Genre | subclass of E55 Type
Toponym | subclass of E41 Appellation
Manuscript | subclass of F5 Item
Printed Edition | subclass of F3 Manifestation
Library | subclass of F11 Corporate Body
Place | equivalent to E53 Place
Geographical Coordinate | equivalent to E94 Space Primitive
Signature | equivalent to E42 Identifier
Folios | subclass of E19 Physical Object
Date | equivalent to E52 Time-Span
Curator/Publisher | subclass of E39 Actor
Table 2. Properties used to represent relations among the main concepts
Relation (R) between concepts | Property
R(Work creation event, Author) | equivalent to P14 carried out by
R(Work creation event, Work) | equivalent to R17 created
R(Manuscript, Title) | equivalent to P102 has title
R(Printed Edition, Title) | equivalent to P102 has title
R(Manuscript, Library) | equivalent to P50 has current keeper
R(Place, Geographical coordinates) | equivalent to P168 place is defined by
R(Manuscript, Signature) | equivalent to P1 is identified by
R(Manuscript, Folios) | equivalent to P46 is composed of
R(Manuscript, Date) | equivalent to P4 has time-span
R(Printed Edition, Date) | equivalent to P4 has time-span
R(Printed Edition, Curator) | subproperty of P14 carried out by
R(Printed Edition, Publisher) | subproperty of P14 carried out by
R(Printed Edition, Format) | equivalent to R69 specifies physical form
R(Printed Edition, Page) | equivalent to P106 is composed of
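To make the alignments in the two tables more concrete, the sketch below turns a few rows of Table 1 into OWL axioms serialised as Turtle, using only the Python standard library. It is an illustration under stated assumptions: the `imago:` namespace is a placeholder (the project's actual IRI is not given here), the `crm:` and `frbroo:` IRIs are the namespaces commonly used for those vocabularies, and the local names such as `E39_Actor` follow their usual naming pattern.

```python
# The "imago" namespace is a placeholder, not the project's real IRI.
PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "crm": "http://www.cidoc-crm.org/cidoc-crm/",
    "frbroo": "http://iflastandards.info/ns/fr/frbr/frbroo/",
    "imago": "https://example.org/imago/",
}

# A few rows of Table 1: (concept, kind of alignment, reference class).
CLASS_ALIGNMENT = [
    ("Author", "subclass", "crm:E39_Actor"),
    ("Work", "equivalent", "frbroo:F2_Expression"),
    ("Genre", "subclass", "crm:E55_Type"),
    ("Manuscript", "subclass", "frbroo:F5_Item"),
    ("Place", "equivalent", "crm:E53_Place"),
]

# OWL/RDFS predicate that expresses each kind of alignment.
AXIOM_PREDICATE = {
    "subclass": "rdfs:subClassOf",
    "equivalent": "owl:equivalentClass",
}


def to_turtle(alignment):
    """Serialise the alignment rows as Turtle class axioms."""
    lines = [f"@prefix {prefix}: <{iri}> ." for prefix, iri in PREFIXES.items()]
    lines.append("")
    for concept, relation, target in alignment:
        predicate = AXIOM_PREDICATE[relation]
        lines.append(f"imago:{concept} a owl:Class ; {predicate} {target} .")
    return "\n".join(lines)


turtle = to_turtle(CLASS_ALIGNMENT)
print(turtle)
```

Property alignments from Table 2 could be expressed in the same way with `rdfs:subPropertyOf` and `owl:equivalentProperty`.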
To improve the level of interoperability of the ontology, we used specific
terminological resources when possible. For the individuals of the class Genre, we
used the Soggettario Nazionale¹, a standard thesaurus created and maintained by the
National Central Library of Florence. To represent the instances of the following
classes we used Wikidata [14] as reference KB: (i) Toponym, which represents the
places that are described or reported in the work; (ii) Library, in which the
manuscript is stored; (iii) Place, which represents the location of a library. For
the ecdotic typology of the printed edition, we did not find a suitable
terminology, thus we created a short controlled vocabulary to satisfy our
representational aims.
¹ https://thes.bncf.firenze.sbn.it/
To populate the ontology, we developed a semi-automatic Web tool that allows
scholars to insert knowledge through a user-friendly interface. The tool was
created to reduce the time needed to insert knowledge and to avoid the introduction
of mistakes, thanks to the use of predefined lists of works, authors, libraries,
places and literary genres. The geographical coordinates of the places are also
assigned automatically. The labels and the corresponding IRIs² contained in these
predefined lists are extracted from the Wikidata knowledge base [14] and the
MIRABILE database³. Figure 1 shows the main interface of the tool.
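How the tool retrieves its predefined lists is not detailed here, so the following Python sketch is only one plausible way to obtain library labels and coordinates from Wikidata. It assumes the public Wikidata Query Service, the item Q7075 (library) and the properties P31 (instance of) and P625 (coordinate location). The query is shown as a string and is not executed; the helper name `parse_wkt_point` is hypothetical and only covers plain decimal coordinates.

```python
import re
from typing import Tuple

# Illustrative SPARQL for the Wikidata Query Service: libraries (Q7075)
# with an English label and a coordinate location (P625).
LIBRARY_QUERY = """
SELECT ?library ?label ?coord WHERE {
  ?library wdt:P31 wd:Q7075 ;
           wdt:P625 ?coord ;
           rdfs:label ?label .
  FILTER(LANG(?label) = "en")
}
LIMIT 100
"""

# Matches WKT points with plain decimal numbers, e.g. "Point(10.4 43.7)".
_POINT = re.compile(
    r"^Point\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)$",
    re.IGNORECASE,
)


def parse_wkt_point(literal: str) -> Tuple[float, float]:
    """Return (latitude, longitude) from a WKT point literal.

    Wikidata serialises coordinate values with longitude first,
    then latitude, so the two numbers are swapped on return.
    """
    match = _POINT.match(literal.strip())
    if match is None:
        raise ValueError(f"not a WKT point: {literal!r}")
    longitude = float(match.group(1))
    latitude = float(match.group(2))
    return latitude, longitude


latitude, longitude = parse_wkt_point("Point(10.4 43.7)")
```

Parsing the coordinate literal once, when the list is built, is what would let the interface assign coordinates to a selected place without further input from the scholar.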
At the current stage of the project, our KB includes 250 works, 206 authors and
614 libraries, and the scholars have started to insert detailed knowledge about
manuscripts and printed editions of these works. The KB also includes seven
different literary genres, four types of editions and six ecdotic typologies.
Fig. 1. The interface of the tool used by the scholars to insert the knowledge
² https://www.w3.org/International/articles/idn-and-iri/
³ www.mirabileweb.it
3 Adding Knowledge about Manuscript Migration to the IMAGO KB
Mapping Manuscript Migrations (MMM) [3] is a project developed with funding from
the Trans-Atlantic Platform under its Digging into Data Challenge (2017-2019).
Using Linked Open Data principles and Semantic Web technologies, MMM unites records
from three datasets: the Schoenberg Database of Manuscripts⁴ at the University of
Pennsylvania, the Bibale⁵ database at the Institut de recherche et d’histoire des
textes, and the Medieval Manuscripts Catalogue⁶ at the University of Oxford. Within
MMM, a data model was developed to serve the aims of the project, but it is general
enough to be used by anyone who wants to represent manuscript provenance data. It
incorporates concepts from several existing ontologies, including Erlangen
CIDOC-CRM for events, FRBRoo for bibliographic information, and the Getty Thesaurus
of Geographic Names for physical locations. The data model also includes its own
classes and properties, which serve both unique instances in the source datasets
and manuscript studies in general. The knowledge stored in the MMM KB is relevant
to the IMAGO project. In particular, the information on how the manuscripts have
travelled across time and space from their place of production to their current
locations could significantly enrich the IMAGO KB. Since MMM and IMAGO use the same
reference vocabularies, the level of interoperability between the two ontologies is
high. We have conducted a preliminary study to map our works to the manuscripts
stored in the MMM KB. Querying the MMM KB, we found that about 20% of the works
collected in the IMAGO KB are also present in the MMM KB. We plan to integrate the
knowledge related to these shared manuscripts in order to give more complete
information to the users of the IMAGO Web application.
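The matching procedure behind the overlap figure is not described, so the sketch below shows just one simple way such a share could be computed: works are compared on a normalised (author, title) pair. The function names, the matching rule and the sample data are assumptions made for illustration; they are not the method used in the study.

```python
import unicodedata
from typing import Iterable, Set, Tuple

Work = Tuple[str, str]  # (author, title)


def normalize(text: str) -> str:
    """Lower-case, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_marks.lower().split())


def work_keys(works: Iterable[Work]) -> Set[Work]:
    """Build the set of normalised (author, title) keys."""
    return {(normalize(author), normalize(title)) for author, title in works}


def overlap_ratio(imago: Iterable[Work], mmm: Iterable[Work]) -> float:
    """Share of IMAGO works whose key also occurs among the MMM works."""
    imago_keys = work_keys(imago)
    if not imago_keys:
        return 0.0
    shared = imago_keys & work_keys(mmm)
    return len(shared) / len(imago_keys)


# Invented toy data: five works on one side, one of them also on the other.
imago_works = [
    ("Author A", "Descriptio One"),
    ("Author B", "Descriptio Two"),
    ("Author C", "Descriptio Three"),
    ("Author D", "Descriptio Four"),
    ("Authòr E", "Descriptio  Five"),
]
mmm_works = [("author e", "descriptio five"), ("Author Z", "Other Work")]

ratio = overlap_ratio(imago_works, mmm_works)
```

Exact matching on normalised strings is deliberately strict; variant titles and author name forms, which are common in manuscript catalogues, would need authority identifiers or fuzzy matching.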
⁴ https://sdbm.library.upenn.edu/
⁵ https://bibale.irht.cnrs.fr/
⁶ https://medieval.bodleian.ox.ac.uk/
4 Conclusion and Future Work
In this paper, we have presented the research developed within the Italian national
research project IMAGO - Index Medii Aevi Geographiae Operum (2020-2023). IMAGO
aims at creating a KB of the Medieval and Renaissance geographical works that
report the description and representation of the world in the VI-XV centuries. The
knowledge included in the KB is formally represented following the Linked Open Data
paradigm and using the languages of the Semantic Web (OWL 2 DL). Indeed, to the
best of our knowledge, until now no scientific research has applied digital methods
in a systematic way in this field of studies. We have presented the ontology we
have developed to formally represent the knowledge about these geographical works.
The IMAGO ontology has been implemented as an extension of two standard
vocabularies: CIDOC CRM and FRBRoo (and its ongoing reformulation LRMoo). On the
basis of the ontology, we have developed a tool that is used by the scholars who
are inserting data into our KB. We have also presented a preliminary study to map
the works collected in the IMAGO KB to the manuscripts stored in the KB of the
Mapping Manuscript Migrations project. As future work, we first plan to evaluate
the ontology. In particular, we plan to conduct two different types of evaluation:
an automatic evaluation and an evaluation involving users. For the first type, we
plan to use the automatic OntoQA system [13], which allows us to evaluate both the
model and the KB. For the second type, we plan to submit a specific questionnaire
to the scholars who are currently populating the ontology. After analysing the
evaluation results, we will, if necessary, revise and extend our ontology. The
long-term aim of the project is to develop a Web application that allows scholars
and general users to retrieve and consult the data collected in the IMAGO KB in a
user-friendly way (e.g. tables, maps, CSV files).
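As an indication of what a metric-based evaluation of the model and the KB could measure, the sketch below computes three ratios in the spirit of OntoQA [13]: relationship richness, attribute richness and class richness. The formulas reflect our reading of those metrics and should be checked against the OntoQA paper; the function names and the counts in the example are invented and are not measurements of the IMAGO ontology.

```python
def _ratio(numerator: int, denominator: int) -> float:
    """Divide safely, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def relationship_richness(non_inheritance: int, subclass: int) -> float:
    """Share of non-inheritance relationships among all relationships."""
    return _ratio(non_inheritance, non_inheritance + subclass)


def attribute_richness(attributes: int, classes: int) -> float:
    """Average number of attributes per class in the schema."""
    return _ratio(attributes, classes)


def class_richness(populated_classes: int, classes: int) -> float:
    """Share of schema classes that have at least one instance in the KB."""
    return _ratio(populated_classes, classes)


# Invented counts, used only to show the calls.
rr = relationship_richness(non_inheritance=3, subclass=1)
ar = attribute_richness(attributes=6, classes=4)
cr = class_richness(populated_classes=2, classes=4)
```

The first two ratios describe the schema alone, while class richness depends on the instances and therefore changes as the scholars populate the KB.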
References
1. Barzaghi, S., Palmirani, M., Peroni, S.: Development of an ontology for modelling medieval manuscripts: The case of Progetto Irnerio. Umanistica Digitale 9, 117–140 (2020)
2. Bauer, F., Kaltenböck, M.: Linked open data: The essentials. Edition mono/monochrom, Vienna 710 (2011)
3. Burrows, T., Emery, D., Fraas, M., Hyvönen, E., Ikkala, E., Koho, M., Lewis, D., Morrison, A., Page, K., Ransom, L., Thomson, E., Tuominen, J., Velios, A., Wijsman, H.: Mapping Manuscript Migrations: Digging into data for researching the history and provenance of medieval and renaissance manuscripts: White paper (August 2020), https://diggingintodata.org/file/1281/download?token=x59u8fFQ
4. Defilippis, D.: Da Flavio Biondo a Leandro Alberti. Corografia e antiquaria tra Quattro e Cinquecento, atti del convegno di studi (Foggia, 2 febbraio 2006), a cura di D. Defilippis (2009)
5. Doerr, M.: The CIDOC conceptual reference module: An ontological approach to semantic interoperability of metadata. AI Mag. 24(3), 75–92 (Sep 2003)
6. Doerr, M., Bekiari, C., LeBoeuf, P.: FRBRoo, a conceptual model for performing arts. In: 2008 Annual Conference of CIDOC. pp. 06–18. CIDOC – ICOM International Committee for Documentation (2008)
7. Gehrke, S., Frunzeanu, E., Charbonnier, P., Muffat, M.: Biblissima’s prototype on medieval manuscript illuminations and their context. In: SW4SHD@ESWC. pp. 43–48 (2015)
8. Jordanous, A., Lawrence, K.F., Hedges, M., Tupman, C.: Exploring manuscripts: Sharing ancient wisdoms across the semantic web. In: Proceedings of the 2nd International Conference on Web Intelligence, Mining and Semantics. pp. 1–12 (2012)
9. Krötzsch, M.: OWL 2 profiles: An introduction to lightweight ontology languages. In: Reasoning Web International Summer School. pp. 112–183. Springer (2012)
10. Menestò, E.: Relazioni di viaggi e di ambasciatori. Cavallo, G pp. 535–599 (1994)
11. Potthast, A.: Repertorium fontium historiae medii aevi (1962)
12. Riva, P., Žumer, M.: FRBRoo, the IFLA Library Reference Model, and now LRMoo: A circle of development. In: IFLA WLIC 2018 Conference, Transform Libraries, Transform Societies (2017)
13. Tartir, S., Arpinar, I.B., Moore, M., Sheth, A.P., Aleman-Meza, B.: OntoQA: Metric-based ontology quality analysis (2005)
14. Vrandečić, D.: Wikidata: A new platform for collaborative data collection. In: Proceedings of the 21st International Conference on World Wide Web. pp. 1063–1064 (2012)