<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>MBD</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Leveraging Linked Data to develop rich discovery services for big heterogeneous cultural data; the case of the National Cultural Heritage Aggregator SearchCulture.gr</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Georgia Angelaki</string-name>
          <email>angelaki@ekt.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Harris Georgiadis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Agathi Papanoti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elena Lagoudi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Documentation Centre</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>3</volume>
      <fpage>18</fpage>
      <lpage>19</lpage>
      <abstract>
        <p>Metadata heterogeneity is a cultural aggregator's biggest challenge. In this paper, we present the ten-year development of SearchCulture.gr, the Greek National Cultural Heritage Aggregator, in establishing a robust and scalable aggregation infrastructure and a public portal providing access to nearly 1 million objects. SearchCulture.gr's success lies in its enrichment strategy, which applies state-of-the-art semantic technologies and the power of Linked Data to provide both fine-grained search capabilities and a multitude of browsing options, such as displaying objects on a map and using advanced queries to create engaging thematic exhibitions that showcase the breadth and depth of Greek cultural heritage. Thanks to this innovative semantic enrichment strategy with regard to persons, types of objects, themes, geolocations and timespans/historic periods, SearchCulture.gr effectively addresses the fundamental user search questions (“who,” “when,” “what,” and “where”), resulting in very high precision in search results and overall offering the most advanced discovery services among Europeana's national and thematic aggregators.</p>
      </abstract>
      <kwd-group>
        <kwd>semantic enrichment</kwd>
        <kwd>linked data</kwd>
        <kwd>SearchCulture.gr</kwd>
        <kwd>aggregation</kwd>
        <kwd>Europeana</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>therefore the whole process of delivering data to Europeana is streamlined for the providers. In
addition, the whole process is free for them.</p>
      <p>SearchCulture.gr’s Interoperability Guidelines are gradually being adopted in the national calls
for funding of digitisation projects and by the Ministry of Culture. This initiative helps establish a
common standard for interoperability and quality of digitization at the national level.</p>
      <p>However, the Interoperability Guidelines alone are insufficient to address the semantic
heterogeneity of the diverse collections aggregated in SearchCulture.gr. Achieving homogenization
and semantic interoperability across all collections is essential for an aggregator to offer targeted
search and browsing functionalities in large-scale datasets. To address this, we developed a
state-of-the-art semi-automatic semantic enrichment strategy and infrastructure that we use to enrich
the aggregated content, along with a set of targeted Linked Data vocabularies. The EKT Vocabularies,
which are presented later in the text, extend, translate and link to established LOD vocabularies such
as the Virtual International Authority File, UNESCO Thesaurus, Getty AAT and Geonames.</p>
      <p>As a result of this process, every ingested item record is enriched with the new fields "EKT
type", "EKT subject", "EKT person (creator/referred)", "EKT place" and "EKT historical period".
These fields answer the questions "What is it?", "What does it refer to?", "Who?", "Where?" and "When?",
respectively.</p>
      <p>In this paper we present the challenges we encountered, the methodology followed, the tools
deployed and the new search and browsing functionalities and map-based discovery services that
were gradually developed by leveraging the power of the semantic web.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Challenges</title>
      <p>The source metadata ingested from a large number of providers is heterogeneous and varies in
quality. Common challenges include typos, differences in syntax, the use of synonymous terms,
variations in language, use of both broader and narrower concepts, and inconsistent use of singular
and plural forms. For instance, dates may be formatted as "first half of the 19th century," "1800-1850,"
or "early 19th century." Similarly, a person’s name might appear as "Rigas Feraios," "Feraios Rigas,"
or "Rigas Velestinlis," all referring to the same individual. Additionally, a place name like Tripoli
could refer to either Tripoli in Arcadia or Tripoli in Libya, among other examples.</p>
    </sec>
    <sec id="sec-3">
      <title>3. The Semantic Enrichment Scheme in SearchCulture.gr</title>
      <p>SearchCulture.gr transforms harvested data into the Europeana Data Model (EDM), which is natively
supported in the aggregator. To leverage the power of the semantic web, EDM features a number of
classes devoted to the representation of “contextual” entities such as persons and places.</p>
      <p>
        In accordance with EDM, which supports a Proxy mechanism to hold different views of the same
Cultural Heritage Object (CHO) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], the enrichment scheme in SearchCulture.gr is based on adding
links (URI references) to terms from Linked Open Data (LOD) vocabularies, stored in separate ‘EKT’
fields in the CHOs’ metadata. These links are produced from curated mappings between source metadata
values and terms from target vocabularies.
      </p>
      <p>The scheme is implemented in Semantics.gr, a platform developed in-house by EKT for developing,
curating and interlinking vocabularies, thesauri and authority files and for publishing them as LOD.
Besides SKOS, the platform supports any data model that can be expressed as an OWL ontology.</p>
      <p>Semantics.gr also contains a Mapping Tool used to set Enrichment Mapping Rules (EMRs) in order
to perform bulk data enrichment in aggregator databases and repositories. The GUI environment
includes advanced automated functionalities that help the curator easily define EMRs from source
datasets (resources/terms from other vocabularies, metadata records or aggregated metadata values
or phrases) to terms from a target vocabulary (see Fig. 1 for a validated mapping in Semantics.gr).</p>
      <p>For each dataset and target vocabulary, a dedicated Mapping Form is created. The enrichment
tool supports automatic suggestion of EMRs, which is based on string similarity matching between
metadata field values and indexed labels of vocabulary entries (e.g. skos:prefLabel and
skos:altLabel). Besides the primary source field (e.g. dc:creator, dc:contributor), other fields
(e.g. dc:subject, dc:description) can also be used as filters in order to set more refined EMRs.</p>
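      <p>The label-based suggestion step can be illustrated with a minimal Python sketch. It is not the Semantics.gr implementation: the similarity measure (difflib's SequenceMatcher from the Python standard library), the 0.8 threshold, the function names and the two-entry vocabulary are all assumptions chosen for illustration.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import unicodedata
from difflib import SequenceMatcher


def normalize(label):
    """Case-fold, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFD", label.casefold())
    no_accents = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return " ".join(no_accents.split())


def suggest(value, vocabulary, threshold=0.8):
    """Return (term_id, score) pairs scoring at least `threshold`, best first."""
    target = normalize(value)
    hits = []
    for term_id, labels in vocabulary.items():
        # A term is scored by its most similar label (preferred or alternative).
        score = max(SequenceMatcher(None, target, normalize(lbl)).ratio() for lbl in labels)
        if score >= threshold:
            hits.append((term_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical vocabulary: term id -> indexed labels.
VOCABULARY = {
    "person/rigas": ["Rigas Feraios", "Rigas Velestinlis"],
    "person/venizelos": ["Eleftherios Venizelos"],
}

print(suggest("Rigas Fereos", VOCABULARY))
```
]]></preformat>
      <p>In this sketch a misspelled value such as "Rigas Fereos" is still suggested for the correct entry, whereas a reordered form such as "Feraios Rigas" falls below the assumed threshold. Cases of the latter kind are where a curator would set the rule manually.</p>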
      <p>Τhe curator can create complex logical expressions using the logical operators AND, OR and NOT
on the filters to avoid false positives. For instance, an EMR may assign items with dc:type “image”
to the vocabulary term “vase” if they have a dc:subject value “vase” or “oenochoe” but NOT a
dc:subject value “drawing representation”. Another EMR could map items with dc:type “image” and
dc:subject “drawing representation” to the vocabulary term “drawing”. When the automatic
suggestion function fails to produce correct rules, the curator can set EMRs manually.</p>
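      <p>The two example rules above can be written down as a small Python sketch. The rule representation (predicates combined with all_of, any_of and not_) is a hypothetical encoding for illustration only and does not reflect how Semantics.gr stores or evaluates EMRs.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass
from typing import Callable

Record = dict  # metadata field name -> list of string values


def has(field, value):
    """Predicate: the record's field contains the given value."""
    return lambda rec: value in rec.get(field, [])


def all_of(*preds):
    """Logical AND over predicates."""
    return lambda rec: all(p(rec) for p in preds)


def any_of(*preds):
    """Logical OR over predicates."""
    return lambda rec: any(p(rec) for p in preds)


def not_(pred):
    """Logical NOT of a predicate."""
    return lambda rec: not pred(rec)


@dataclass
class Rule:
    target_term: str
    condition: Callable[[Record], bool]


# The two EMRs described in the text, in this hypothetical encoding.
RULES = [
    Rule("vase", all_of(
        has("dc:type", "image"),
        any_of(has("dc:subject", "vase"), has("dc:subject", "oenochoe")),
        not_(has("dc:subject", "drawing representation")),
    )),
    Rule("drawing", all_of(
        has("dc:type", "image"),
        has("dc:subject", "drawing representation"),
    )),
]


def enrich(record):
    """Return the target terms of every rule whose condition holds."""
    return [rule.target_term for rule in RULES if rule.condition(record)]


print(enrich({"dc:type": ["image"], "dc:subject": ["vase", "drawing representation"]}))
```
]]></preformat>
      <p>The NOT filter is what keeps a drawing of a vase from being typed as a vase: the record in the usage line matches only the second rule.</p>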
      <p>The Mapping Form incorporates a self-improving automatic suggestion mechanism. The manual
mappings from metadata values to usually similar or broader vocabulary terms constitute valuable
knowledge that enhances the effectiveness of autosuggestions in future enrichments, thereby
reducing the need for manual assignments. The curator decides whether a manual EMR should be
remembered by bookmarking it.</p>
      <p>
        For every different type of semantic enrichment, various extensions were made to the main
enrichment scheme to provide additional functionality that would further support and automate
the process as described in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        Validated mappings are served on request via a RESTful API in JSON format which can be used
by the aggregator or repository to enrich the collection easily and en masse. The tool is thoroughly
described in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>In summary, the semantic enrichment strategy contains the following steps:</p>
      <list list-type="bullet">
        <list-item><p>Enrichments are performed per collection and per field in a Mapping Form.</p></list-item>
        <list-item><p>Source metadata stay intact and enrichment values are clearly marked as “EKT” fields.</p></list-item>
        <list-item><p>All mappings produced, either automatic or manual, are always validated by an expert
curator.</p></list-item>
        <list-item><p>Development of target vocabularies is done in Semantics.gr. These are all bilingual and
hierarchical if applicable. They are often adaptations, extensions and translations of popular
Linked Data vocabularies such as Geonames, UNESCO thesaurus, etc.</p></list-item>
        <list-item><p>All vocabularies are openly licensed and published in Semantics.gr via open APIs to allow for
re-use.</p></list-item>
        <list-item><p>Last but not least, all related work is published and communicated via webinars, articles, etc.,
following EKT’s open science approach.</p></list-item>
      </list>
    </sec>
    <sec id="sec-4">
      <title>4. Retrospective mass-scale enrichments</title>
      <p>We started developing vocabularies and applying semantic enrichments to our collections in 2016.
Each semantic enrichment was initially run as a dedicated retrospective enrichment project
covering all datasets ingested in SearchCulture.gr up to that point and was then incorporated in the
standard ingestion process for new collections.</p>
      <p>Cultural heritage item types<sup>4</sup>. First, we created an original SKOS-based LOD vocabulary
of 193 terms that covers different types of cultural artifacts and is linked to Getty AAT. Metadata
records were enriched with a separate field "EKT type" that holds references to the vocabulary’s
terms.</p>
      <p>
        Greek Time Periods<sup>5</sup>. Next, we set out to homogenize and normalize chronologies and historical
periods. The vocabulary Greek Time Periods was developed in 2017 and is constructed according
to the semantic class edm:Timespan of Europeana’s EDM. It contains 169 terms that cover Greek
history from 8000 BC to today. It is hierarchical and bilingual. Depending on whether the original
temporal documentation is based on period labels or chronologies, we adopted two fundamentally
different enrichment strategies: historical period-driven enrichment and chronology-driven
enrichment, respectively. In the chronology-driven enrichment, chronological values are
homogenized into years or year ranges and then, based on the results, the items are enriched with
the corresponding terms from the historical periods vocabulary. Our time normalization method is
fully extensible and parametric, takes language descriptors into consideration and covers four types
of temporal expressions: centuries, ranges of centuries, years/dates and year ranges. As a result,
original metadata records were enriched with two distinct fields, "EKT chronology" and "EKT
historical period". A detailed presentation of the item types and chronology/period-related
enrichments is provided in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
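      <p>The chronology-driven strategy can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the method described in [3]: it handles English expressions for AD dates only, it counts the n-th century as the years (n-1)*100+1 to n*100, the qualifier offsets are arbitrary, and the two periods are placeholders rather than entries of the Greek Time Periods vocabulary.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re

ORDINAL = r"(\d{1,2})(?:st|nd|rd|th)"
DASH = r"\s*[-\u2013]\s*"  # hyphen or en dash

# Assumed language descriptors: year offsets (from, to) within a century.
QUALIFIERS = {
    "first half of the": (0, 50),
    "second half of the": (50, 100),
    "early": (0, 30),
    "late": (70, 100),
}


def century_start(n):
    """First year of the n-th century AD under the assumed convention (19 -> 1801)."""
    return (n - 1) * 100 + 1


def normalize(text):
    """Return (start_year, end_year), or None if the expression is not recognised."""
    t = text.strip().strip(".,").lower()
    m = re.fullmatch(r"(\d{3,4})" + DASH + r"(\d{3,4})", t)
    if m:  # year range, e.g. "1800-1850"
        return int(m.group(1)), int(m.group(2))
    if re.fullmatch(r"\d{3,4}", t):  # single year
        return int(t), int(t)
    m = re.fullmatch(ORDINAL + DASH + ORDINAL + r" centur(?:y|ies)", t)
    if m:  # range of centuries, e.g. "18th-19th century"
        return century_start(int(m.group(1))), century_start(int(m.group(2))) + 99
    m = re.fullmatch(r"(?:(.+?) )?" + ORDINAL + r" century", t)
    if m:  # single century with an optional qualifier
        qualifier, start = m.group(1), century_start(int(m.group(2)))
        if qualifier is None:
            lo, hi = 0, 100
        elif qualifier in QUALIFIERS:
            lo, hi = QUALIFIERS[qualifier]
        else:
            return None  # unknown descriptor: leave for the curator
        return start + lo, start + hi - 1
    return None


def periods_for(span, periods):
    """Names of all periods whose (start, end) interval overlaps the span."""
    start, end = span
    return [name for name, (p_start, p_end) in periods.items()
            if start <= p_end and p_start <= end]


# Placeholder periods, not taken from the actual vocabulary.
PERIODS = {"period-A": (1701, 1820), "period-B": (1821, 1900)}

for value in ["first half of the 19th century,", "1800-1850", "early 19th century."]:
    span = normalize(value)
    print(value, span, periods_for(span, PERIODS))
```
]]></preformat>
      <p>Normalizing first to a year range and only then matching against period intervals keeps the two steps independent, so new descriptors or periods can be added as data rather than as code.</p>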
      <p>Subjects. The next iteration of enrichments was thematic, adding a new SKOS-based field "EKT
Subject" that includes references to terms of a bilingual and hierarchical vocabulary of subjects that
is interlinked to the UNESCO Thesaurus (EKT version<sup>6</sup>, 1,391 terms) and a complementary
vocabulary of Thematic Tags<sup>7</sup> that covers more specific topics (607 terms).</p>
      <p>Persons (creators/referred persons). Next, we extended our enrichment scheme to persons,
distinguishing between creators and referred persons (e.g. a person depicted in a photograph, the cast of
a film, the recipient of a letter or the subject of a biography). As with the other enrichments, this
involved identifying person entities in metadata and mapping them to entries from a structured
vocabulary, the Notable Persons in Greek History and Culture<sup>8</sup>. The vocabulary has reached 9,540
terms modelled according to the edm:Agent class and linking to VIAF, Wikidata and other online LD
resources. Each entry was enriched, when possible, with metadata regarding place of birth and death,
date of birth and death, sex, occupation, bibliographic references and links to established resources,
such as the Virtual International Authority File (VIAF), Wikipedia and IMDB.</p>
      <p>
        During the process, a complementary vocabulary was introduced, that of
“Professions/Occupations”<sup>9</sup>. It is an LOD vocabulary conforming to the skos:Concept semantic class
and consists of 371 terms. It is hierarchical and bilingual, and its terms refer to occupations such as
merchants, doctors and military officers, clergy positions, noble titles, affiliations with social
movements (such as feminists or socialists) and types of artists and literary creators. The terms of this
vocabulary were used to classify entries in the Vocabulary of Persons. A detailed presentation of the
person-related enrichments is provided in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>
        <fn><label>4</label><p>https://www.semantics.gr/authorities/vocabularies/ekt-item-types</p></fn>
        <fn><label>5</label><p>https://www.semantics.gr/authorities/vocabularies/time-periods/vocabulary-entries</p></fn>
        <fn><label>6</label><p>https://www.semantics.gr/authorities/vocabularies/ekt-unesco</p></fn>
        <fn><label>7</label><p>https://www.semantics.gr/authorities/vocabularies/thematic_tags</p></fn>
        <fn><label>8</label><p>https://www.semantics.gr/authorities/vocabularies/searchculture-persons</p></fn>
        <fn><label>9</label><p>https://www.semantics.gr/authorities/vocabularies/professions-occupations</p></fn>
      </p>
      <p>
Places. The last retrospective semantic enrichment process concerned geographical information.
Utilizing the GeoNames API, a "starter set" of ~6K terms was selected, comprising entities belonging
to the first level of each country’s administrative hierarchy and cities with a population over 100k
globally. For Greece, the threshold was set to three levels of administrative divisions and all
settlements of more than 1K inhabitants. The resulting “Vocabulary of geographical names
GeoNames (EKT version)”<sup>10</sup> is hierarchical and conforms to the edm:Place contextual class of EDM.
At the end of our enrichments, it reached ~12K terms.
      </p>
      <p>
        In addition to the main vocabulary, a supplementary EKT vocabulary<sup>11</sup> was developed to include
features that did not fall under the strict administrative hierarchy described above, adding, for
example, historical areas (e.g., Soviet Union), regions that span several states (e.g., the
Balkans) or geomorphological features that may cross state borders, such as rivers. These
two vocabularies are interconnected using two custom fields, ekt:isPartOfMatch and
ekt:hasPartMatch (similar to skos:broaderMatch and skos:narrowerMatch, respectively, from the
SKOS data model), in order to express the hierarchical “Has-Part” relationships between the two
vocabularies. A detailed presentation of the spatial enrichments can be found in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>Finally, it is worth noting that the EKT Vocabularies, being controlled, standardised, dynamic,
extensible and open, constitute particularly valuable resources that can also be used by cultural
institutions in their primary documentation, enhancing the quality, interconnectivity, integration
and multilinguality of the original metadata. The reuse of such resources improves accessibility to
cultural heritage at national level and achieves economies of scale in documentation processes.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Limitations</title>
      <p>There are inherent limitations in the enrichment process which derive from Dublin Core (DC)-based
metadata models, including EDM, regarding spatial and person information representation.</p>
      <p>In DC and in EDM, geographical information is represented in the properties dcterms:spatial or
dc:coverage in an equivocal way: a toponym can either indicate the place where an item was created,
where it is being kept, or its subject. Regardless, therefore, of whether more nuanced place-based
information is included in the source metadata, at aggregation level this is often compromised due
to the limited expressivity inherent in EDM and other DC-based schemata. This is why the new EKT
place field produced by our enrichments may point to any place the item relates to, without
distinguishing the kind of relation.</p>
      <p>Similarly, “dc:contributor” could be the author or the subject of a book, the director, the
screenwriter or the actor of a play, the sculptor, the model or the photographer of an antiquity, the
sender, the receiver or the subject of a letter, etc. Moreover, while the creator in most cases is given
in “dc:creator”, persons that appear in photographs or are the subject of a book, sometimes appear
in “dc:subject”, and at other times in “dc:contributor”, “dc:title” or “dc:description”. Ultimately, we
decided that the enrichment on persons should improve SearchCulture.gr by allowing users to easily
find all works of a creator (regardless of their specific role in the creation) and all the CHOs referring
to a person (regardless of the kind of reference). We therefore created two separate fields, “EKT
creator” and “EKT referred person”, one for each of these two kinds of person enrichment. These
are the fields used for person-driven search, browsing and faceting.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Search and browsing functionalities</title>
      <p>The semantic enrichment workflow aims at enhancing the user experience by providing more
detailed and meaningful connections between data points in SearchCulture.gr. The array of
Discovery Services built on the back of the semantic enrichments offers various gateways into the
aggregated content. This approach not only improves the usability of the platform but also supports
deeper scholarly research and public engagement with Greece's cultural heritage.</p>
      <p>
        <fn><label>10</label><p>https://www.semantics.gr/authorities/vocabularies/geonames-places-earth/vocabulary-entries?language=en</p></fn>
        <fn><label>11</label><p>https://www.semantics.gr/authorities/vocabularies/geonames-supplementary-places/vocabulary-entries</p></fn>
      </p>
      <p>The controlled vocabularies developed are integrated in a flexible way throughout the portal as
search, faceted filtering and browsing options. The fact that they are bilingual also allows
English-speaking users to navigate the mostly Greek content to a certain extent.</p>
      <p>First of all, every controlled vocabulary has its own presentation page which offers dedicated
search and hierarchical (if applicable) browsing options. Fig 2 illustrates the page for Persons. A
subset of terms also appears on the homepage as tag clouds acting as an entry point for the user.</p>
      <p>EKT vocabularies are incorporated in the respective fields in the advanced search. Every field
includes autosuggestion and disambiguation functionality. The fields can be combined to enable
highly targeted searches. For example, Fig. 3 (a) illustrates a query that searches for “photos” related
to the “Balkan Wars”, referring to “E. Venizelos”, involving a “King” and focusing on “Crete”.</p>
      <p>The EKT vocabularies are also used as facets allowing further filtering of the results, as shown in
Fig 2 (b). All resulting enrichment terms are active links pointing to all items enriched with that term.
Every such term is accompanied by its semantic resource in Semantics.gr (marked with a pink “s”)
where one can see the complete record of the resource, such as other linked resources from other
vocabularies, etc.</p>
      <p>In order to exploit the hierarchy incorporated in the respective vocabularies (places, item types,
subjects), we also index all broader terms for each item, using a separate auxiliary Solr field, thus
supporting hierarchical searching and faceting. This way, for example, when a user searches for items
from “Attica” (the prefecture), the results will also include items from “Athens”.</p>
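      <p>The idea of indexing broader terms alongside the assigned term can be sketched in plain Python. The sketch uses an in-memory list instead of Solr, and the field names (ekt_place, ekt_place_hier), the identifiers and the small hierarchy are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
# Illustrative fragment of a place hierarchy: term -> broader term.
BROADER = {"Athens": "Attica", "Attica": "Greece", "Chania": "Crete", "Crete": "Greece"}


def with_broader(terms, broader):
    """Return the terms plus all their ancestors, de-duplicated, in first-seen order."""
    expanded = []
    for term in terms:
        current = term
        while current is not None and current not in expanded:
            expanded.append(current)
            current = broader.get(current)
    return expanded


def index_item(item_id, places):
    """Build an index document with an auxiliary field holding the expanded terms."""
    return {
        "id": item_id,
        "ekt_place": list(places),
        "ekt_place_hier": with_broader(places, BROADER),
    }


def search_by_place(docs, place):
    """Hierarchical search: match the place itself or any narrower place."""
    return [doc["id"] for doc in docs if place in doc["ekt_place_hier"]]


DOCS = [index_item("item-1", ["Athens"]), index_item("item-2", ["Chania"])]
print(search_by_place(DOCS, "Attica"))
```
]]></preformat>
      <p>Expanding at index time keeps queries simple: a search for a broader place is an ordinary term match on the auxiliary field, with no hierarchy traversal at query time.</p>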
      <p>Moreover, the enrichments are used to locate the items on an interactive map (Fig. 5). The
implementation of the map navigation was based on leaflet.js and OpenStreetMap. To support the
display of a large number of items on the map, items appear in clusters that can be further expanded
as the user clicks or zooms on the map. Users can retrieve items belonging to a specific place or all
items located in the current map frame (within its coordinates). The new visualization feature was
also added to the Thematic Exhibitions, providing a new dimension with regards to showcasing the
items included.</p>
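      <p>Retrieving all items located in the current map frame amounts to a bounding-box test on coordinates. The following Python sketch shows that test in isolation. It is not the portal's Leaflet.js code, and the function names, item records and approximate coordinates are assumptions for illustration.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def in_frame(lat, lon, south, west, north, east):
    """True if (lat, lon) lies inside the frame; handles frames crossing the antimeridian."""
    if not south <= lat <= north:
        return False
    if west <= east:
        return west <= lon <= east
    # The frame wraps around longitude 180, e.g. west=170, east=-170.
    return lon >= west or lon <= east


def items_in_frame(items, south, west, north, east):
    """Identifiers of all geolocated items that fall inside the frame."""
    return [item["id"] for item in items
            if in_frame(item["lat"], item["lon"], south, west, north, east)]


# Hypothetical geolocated items with approximate coordinates.
ITEMS = [
    {"id": "item-1", "lat": 37.98, "lon": 23.73},  # roughly Athens
    {"id": "item-2", "lat": 35.34, "lon": 25.13},  # roughly Heraklion
    {"id": "item-3", "lat": 40.64, "lon": 22.94},  # roughly Thessaloniki
]

print(items_in_frame(ITEMS, south=35.0, west=23.0, north=38.5, east=26.0))
```
]]></preformat>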
      <p>By combining the enrichments with the presentation features, we are also able to offer more
"niche" and playful search options, making the content more engaging and interesting overall. For
example, one could search for female writers born in Crete at the beginning of the 20th century and
approach this search in various ways, such as through the advanced search box, the person browsing
page, or the map.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Thematic exhibitions</title>
      <p>In response to the urgent need for digital cultural content for education and general use during
the closure of physical cultural spaces in the COVID pandemic, the SearchCulture.gr team
developed the thematic exhibitions functionality and created a series of exhibitions that were
well received by the public, increasing user numbers significantly.</p>
      <p>The Thematic Exhibitions feature a smaller or larger set of CHOs from different digital collections
that share storytelling relevance. The back end of this functionality also utilizes semantic
enrichments. The curator uses a Query Form (similar to the Advanced Search Box) to select a Type,
Place, Person, or Subject, and enters search parameters such as time, type sub-category, inclusion or
exclusion of items, collections, or organizations. Following traditional curatorial practices, the digital
curation process begins with a conceptual query. The curator refines this query through trial and
error, selecting and organizing objects to create a coherent and compelling narrative by layering,
juxtaposing, comparing, combining, or excluding objects. As the query is refined through this
iterative process, the selected items begin to form a conceptually cohesive and narratively unified
exhibition.</p>
      <p>The Query Form's output is the Exhibition Page, meticulously designed with a bilingual interface
(Greek and English). The interpretative phase involves using tools to clarify and narrate the
exhibition's stories, incorporating images, quotes, and hyperlinks to provide context, depth, and
additional information. The exhibition can be presented through map-based storytelling or a grid
layout, depending on the content's relevance. Each virtual exhibition includes a key image, a title,
and context-providing subtitle, accompanied by an interpretative text and links to external
references for further reading. The Exhibitions Query Form enables editors to create thematic
exhibitions by publishing the retrieved query results in a curated, organized manner.</p>
      <p>With the aim of showcasing various aspects of Greek cultural heritage, the Thematic Exhibitions
developed range from simple, type-based archaeological exhibitions—such as the story of the oil
lamp—to more nuanced topics, like Olfactory or Industrial Heritage. From 2020 to 2024, eighty (80)
Thematic Exhibitions were created, covering arts and crafts, archaeology, music and theater, religion,
architecture, folklore, oral traditions, and social issues. Overall, the goal is to cover both the
mainstream and the less visible: fine arts and crafts, men and women in history, eponymous
works and anonymous creativity, power narratives and minority reports. The intention is to
be inclusive in the storytelling approach, covering many eras, media, locations, types, and subjects
and mixing major historical events or important aspects of heritage with lesser-known or less visible themes.
As the number of exhibitions grew rapidly, search and keyword-based faceted filtering were added
to enhance the discoverability of the exhibitions.</p>
      <p>
        Following the incorporation of geolocation enrichments in 2023, a series of place-based
exhibitions was developed, highlighting many areas of Asia Minor where Hellenism thrived, as well
as Greek islands and regions of strong local interest. For a detailed presentation of the new
functionality see [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>Recently, a new sub-category of thematic exhibitions was released focusing on people, either
individuals or groups of people that share the same ideas, passions, genres of writing or painting
(Fig. 7).</p>
      <p>The strategic editorial approach of the national aggregator SearchCulture.gr aligns closely with
that of Europeana, integrating best practices outlined in the Europeana Editorial Guidelines<sup>12</sup> and
utilizing digital storytelling techniques recommended by the Europeana Network Association. These
techniques include making content personal, blending expertise with an informal tone, uncovering
hidden stories, using visual and audio materials, ensuring clear narrative structure, beginning with
specific details, and employing evocative imagery.</p>
      <p>The curated exhibitions are promoted via EKT’s social media and have significantly increased
traffic to the portal and engagement with it.</p>
    </sec>
    <sec id="sec-8">
      <title>8. Expert knowledge in the enrichment and digital curation processes</title>
      <p>The SearchCulture.gr team consists of four core members: one technical developer who also serves
as the scientific supervisor of the infrastructure, one digital heritage expert responsible for network
development, and two archaeologists working full-time on the semantic enrichment discussed earlier
and digital curation processes described below. Since 2016, several humanities scientists have
contributed on a part-time basis to the semantic enrichment process.</p>
      <p>It is essential to emphasize that the involvement of expert knowledge throughout the process is
indispensable for ensuring the quality and validity of the vast amount of EMRs, mappings, and
vocabulary terms created over the years, as well as of the enrichments themselves. In addition,
curators have provided invaluable feedback throughout the gradual development of the semantic
enrichment infrastructure.</p>
      <p>Furthermore, the disambiguation and identification process often requires substantial historical
knowledge and thorough research using reliable online and offline resources. For instance, over 5,000
settlements in the Greek territory were renamed (sometimes as frequently as every 20 years) during
the 20th century for historical and political reasons. One example is the village of Γκρόπινο (of
Bulgarian origin), which was renamed Τρόπινο in 1928 as part of an effort to "Hellenize" the name.
In 1940, it was renamed again to Βαλτολείβαδο (meaning "meadow with swamps," a reference to its
natural surroundings), and finally, in 1961, it was given the more "elegant" name Δάφνη (Laurel).
Archival research in this case was essential for SearchCulture.gr curators to accurately assign the
correct Geoname, as related resources are not readily available.</p>
      <p>
        <fn><label>12</label><p>https://pro.europeana.eu/page/writing-for-europeana-pro</p></fn>
      </p>
      <p>Moreover, the deep familiarity with incoming collections gained through semantic enrichment
significantly facilitates and improves the digital curation process of thematic exhibitions, as
discussed in the previous section.</p>
      <p>All the above create a virtuous cycle between data ingestion, semantic enrichment, infrastructure
development, collection curation, and presentation. The investment in expert human curation
required for the semi-automatic enrichment of the collections is, therefore, justified by the
tremendous added value it brings to the discovery of the collections, as presented in this paper,
at least for collections of the size ingested in SearchCulture.gr.</p>
    </sec>
    <sec id="sec-9">
      <title>9. Related work</title>
      <p>
        Discoverability and re-use can be influenced by the level of metadata heterogeneity and semantic
interoperability [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Different semantic enrichment strategies are, therefore, adopted by large
cultural heritage aggregators as a means to contextualise resources, disambiguate, add
multilinguality and offer search and browsing functionalities across multiple heterogeneous source
datasets [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>
        Among the domain and thematic aggregators that form the Europeana Aggregators’ Forum, some
require that data be enriched prior to ingestion, transferring the responsibility to the providers [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ],
others undertake semantic enrichment post ingestion [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], while the majority just indexes string
data without applying any semantic enrichment before delivering data to Europeana.
      </p>
      <p>
        Europeana enriches aggregated CHOs by automatically linking text strings found in the metadata
to controlled terms from established LOD vocabularies such as Geonames, GEMET and DBpedia
[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ],[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. However, fully automated enrichment on structured fields (such as dc:type) adopts an
“enrich-if-you-can” strategy, horizontally, resulting in non-negligible percentages of mistakes [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]
and in relatively low enrichment coverage, despite using extremely large target thesauri such as
DBpedia and Geonames. Additionally, Europeana does not employ sophisticated methods to extract
related information from other descriptive fields or to create more nuanced matches (e.g., between
synonymous terms).
      </p>
      <p>
        Automated annotation methods on more descriptive fields (such as dc:title) yield similar
challenges [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. All these techniques enhance searchability and multilingualism. However, due to
the relatively low enrichment coverage, the extensive target thesauri used, and the significant
percentage of enrichment errors, they cannot achieve sufficient homogenization to enable
aggregators to offer advanced methods for content exploration, such as browsing and faceting on
enriched fields.
      </p>
      <p>
        In the comparative evaluation performed by the EuropeanaTech Task Force [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], the need for
human-in-the-loop methodologies to complement automatically produced enrichments is implied.
Many EU-funded projects, such as Enrich+, St George on a Bike and Europeana XX, deal with the
complexities of fully automatic or crowdsourced enrichments. SAGE [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] is a semantic
enrichment and validation platform that deploys state-of-the-art AI tools assisted by
human-in-the-loop validation mechanisms to produce automatic mappings. However, it lacks some of the
sophisticated features provided in Semantics.gr, such as the use of filters to refine mappings.
      </p>
      <p>Regarding the search and browsing functionality offered, the way heterogeneous data is curated,
whether effectively or not, is reflected in the options presented to users. For example, place-based
search is available through Deutsche Fotothek<sup>13</sup> and the German Digital Library<sup>14</sup>. CulturaItalia.it<sup>15</sup>
provides place-based filtering of results, while the Swedish Kringla<sup>16</sup> offers province-based filtering
and map-based search, though it geolocates only a fraction of the objects on the map and the
functionality is only available in Swedish. Among other cross-domain aggregators, the German
Digital Library provides advanced disambiguation and search functionality for persons, using the
GND thesaurus as the source vocabulary and the OpenRefine tool to enrich person values. Each
individual has a landing page that includes basic biographical information and a list of works for
which they are either the author or a referenced entity.</p>
      <p>
        However, among all the large cultural heritage data aggregators investigated, and to the best of
our knowledge as members of the Europeana Aggregators’ Forum, SearchCulture.gr is the only
national cross-domain aggregator that adopts such a systematic and meticulous semantic enrichment
strategy across the majority of the EDM contextual classes. This approach enables us to build very
fine-grained search and browsing options, effectively answering the "who," "what," "when," and
"where" questions in a refined and sophisticated manner. The enrichment has markedly improved
the searchability of the SearchCulture.gr collections, as demonstrated in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>10. Conclusions and future work</p>
      <p>
Semantic enrichments improve the quality and validity of data, add a layer of multilingualism to
search, clarify concepts and individuals, support the conceptual interconnection between items
found across various repositories, and enhance advanced search capabilities. Highlighting often
unexpected correlations between people, places, themes, historical periods, and types of content
opens new horizons for understanding and researching Greek culture.
      </p>
      <p>
        In the context of these related efforts, the semantic enrichment scheme presented in this paper is aligned with
the Europeana TF recommendations [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] and achieves high coverage and effective disambiguation
because i) it adjusts to the documentation particularities of the individual collections, ii) it combines
self-improving, automatic and fuzzy-based suggestions with a suite of tools that support the curation
and disambiguation process, iii) it uses controlled target vocabularies that are gradually expanded to
cover the needs of the specific collections, and iv) it employs expert knowledge for the validation of
the mappings.
      </p>
      <p>This model of systematic curation brings added value to aggregation and supports the
development of advanced possibilities for searching and navigating the heterogeneous richness of
the country’s cultural heritage.</p>
      <p>Building further on such a robust, versatile, and scalable semantic infrastructure, our future plans
include exploring the representation of Events and Intangible Cultural Heritage, developing relevant
targeted semantic vocabularies, opening the thematic exhibitions functionality to end users,
exploring both crowdsourcing and AI technologies to improve metadata quality, and adding more
engaging visualization elements to our portal.</p>
    </sec>
    <sec id="sec-10">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
      <p><sup>13</sup> https://www.deutschefotothek.de/</p>
      <p><sup>14</sup> https://www.deutsche-digitale-bibliothek.de/?lang=en</p>
      <p><sup>15</sup> http://culturaitalia.it/</p>
      <p><sup>16</sup> https://www.kringla.nu/kringla/</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>ΕΚΤ</surname>
          </string-name>
          <year>2024</year>
          , Basic Interoperability Guidelines https://ariadne.ekt.gr/ariadne/handle/20.500.12776/17194, 2nd edition.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <source>Europeana Data Model Primer</source>
          , https://pro.europeana.eu/files/Europeana_Professional/Share_your_data/Technical_requirements/EDM_Documentation/EDM_Primer_130714.pdf
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Georgiadis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Papanoti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Paschou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roubani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chardouveli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sachini</surname>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Using type and temporal semantic enrichment to boost content discoverability and multilingualism in the Greek cultural aggregator SearchCulture.gr</article-title>
          .
          <source>International Journal of Metadata, Semantics and Ontologies</source>
          ,
          <volume>13</volume>
          (
          <issue>1</issue>
          ),
          <fpage>75</fpage>
          -
          <lpage>92</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Georgiadis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Papanoti</surname>
          </string-name>
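          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lagoudi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Angelaki</surname>
          </string-name>
          ,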
          <string-name>
            <given-names>N.</given-names>
            <surname>Vasilogamvrakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panagopoulou</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Sachini</surname>
          </string-name>
          ,
          <article-title>Enriching the Greek National Cultural Aggregator with Key Figures in Greek History and Culture: Challenges, Methodology, Tools and Outputs</article-title>
          .
          <source>International Conference on Theory and Practice of Digital Libraries TPDL 2022</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>H.</given-names>
            <surname>Georgiadis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Angelaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lagoudi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Vasilogamvrakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Stamatis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Papanoti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Panagopoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bartzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Charlaftis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Angelidi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hardouveli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Karagianni</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Sachini</surname>
          </string-name>
          (
          <year>2022</year>
          ).
          <article-title>Publishing LOD Vocabularies in Any Schema with Semantics.gr</article-title>
          . In: Garoufallou, E., Ovalle-Perandones, MA., Vlachidis, A. (eds)
          <source>Metadata and Semantic Research. MTSR 2021</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Papanoti</surname>
          </string-name>
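          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lagoudi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Angelaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Georgiadis</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Sachini</surname>
          </string-name>
          ,
          <article-title>Greek Culture on the Map: Place-based Enrichment Scheme at the Greek National Cultural Data Aggregator</article-title>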
          .
          <source>MTSR 2023</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Papanoti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Angelaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lagoudi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Georgiadis</surname>
          </string-name>
          ,
          <article-title>Every good search tells a story: Creating virtual exhibitions on the Greek national cultural aggregator through query-based curation</article-title>
          .
          <source>5th CAA-GR CONFERENCE 2024</source>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E.</given-names>
            <surname>Garoufallou</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Papatheodorou</surname>
          </string-name>
          :
          <article-title>A critical introduction to Metadata for e-Science and eResearch</article-title>
          ,
          <source>International Journal of Metadata, Semantics and Ontologies (IJMSO)</source>
          ,
          <volume>9</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>1</fpage>
          -
          <lpage>4</lpage>
          (
          <year>2014</year>
          ). http://dx.doi.org/10.1504/IJMSO.2014.059143
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Peroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Tomasi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Vitali</surname>
          </string-name>
          :
          <article-title>The aggregation of heterogeneous metadata in Web-based cultural heritage collections. A case study</article-title>
          .
          <source>International Journal of Web Engineering and Technology</source>
          , Vol.
          <volume>8</volume>
          (
          <issue>4</issue>
          ), pp.
          <fpage>412</fpage>
          -
          <lpage>432</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Smith</surname>
          </string-name>
          :
          <article-title>Linked open data and aggregation infrastructure in the cultural heritage sector, A case study of SOCH, a linked data aggregator for Swedish open cultural heritage</article-title>
          , in
          <source>Information and Knowledge Organisation in Digital Humanities</source>
          ,
          <publisher-name>Routledge</publisher-name>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <article-title>Linked Open Data for Libraries, Archives and Museums</article-title>
          ,
          <source>EuropeanaTech Insight</source>
          , Issue
          <issue>7</issue>
          , https://pro.europeana.eu/page/issue-7-lodlam
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>H.</given-names>
            <surname>Manguinhas</surname>
          </string-name>
          ,
          <source>Europeana Semantic Enrichment Framework</source>
          , technical documentation,
          <year>2016</year>
          , available at https://pro.europeana.eu/page/europeana-semantic-enrichment
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Stiller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Petras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gäde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Isaac</surname>
          </string-name>
          :
          <article-title>Automatic Enrichments with Controlled Vocabularies in Europeana: Challenges and Consequences</article-title>
          . EuroMed:
          <fpage>238</fpage>
          -
          <lpage>247</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>E.</given-names>
            <surname>Agirre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Lopez de Lacalle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Soroa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fernando</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stevenson</surname>
          </string-name>
          :
          <article-title>Matching Cultural Heritage items to Wikipedia</article-title>
          . In:
          <source>Proc. LREC 2012</source>
          . Istanbul, Turkey (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <collab>EuropeanaTech Task Force on a Multilingual and Semantic Enrichment Strategy</collab>
          :
          <source>Final report</source>
          , https://pro.europeana.eu/project/multilingual-and-semantic-enrichment-strategy
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <collab>Europeana Task Force on Enrichment and Evaluation</collab>
          :
          <source>Final Report</source>
          , https://pro.europeana.eu/project/evaluation-and-enrichments
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>E.</given-names>
            <surname>Kaldeli</surname>
          </string-name>
          ,
          <article-title>Combining AI tools with human validation to enrich cultural heritage metadata</article-title>
          , https://pro.europeana.eu/post/combining-ai-tools-with-human-validation-to-enrich-culturalheritage-metadata
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>