=Paper= {{Paper |id=Vol-1963/paper488 |storemode=property |title=Adopting Semantic Technologies in Public Health Documentation |pdfUrl=https://ceur-ws.org/Vol-1963/paper488.pdf |volume=Vol-1963 |authors=Joffrey Decourselle,Frédéric Riondet |dblpUrl=https://dblp.org/rec/conf/semweb/DecourselleR17 }} ==Adopting Semantic Technologies in Public Health Documentation== https://ceur-ws.org/Vol-1963/paper488.pdf
      Adopting Semantic Technologies in Public
               Health Documentation

                Joffrey Decourselle (1) and Frédéric Riondet (2)

        (1) LIRIS, UMR5205, Université Claude Bernard Lyon 1, Lyon, France
                        joffrey.decourselle@liris.cnrs.fr
          (2) Central Documentation - Hospices Civils de Lyon, France
                          frederic.riondet@chu-lyon.fr



       Abstract. We present a success story on the adoption of semantic
       technologies for the library of the second-largest university hospital
       in France. The project was divided into three parts: preprocessing,
       semantic enrichment, and data integration. This abstract introduces
       the research challenges faced in the project as well as the outcomes
       obtained so far.


Keywords: health documentation, semantics, migration, enrichment

    The Central Documentation of the second-largest university hospital in
France, the Hospices Civils de Lyon (HCL), holds about 500,000 online
bibliographic records, each containing metadata about documents such as
articles, journals, books, or legislative texts in the health domain. The
availability of such data is crucial for the work of at least 25,000 health
professionals, researchers, or students. Yet, while these users have become
more and more demanding in terms of search and exploration features,
improving the HCL digital library was impossible due to the limited
capabilities of the legacy library system and the lack of flexibility of the
cataloging model. Thus, the decision was taken to adopt Semantic Web
technologies for the management of HCL's catalog.
    The bibliographic domain has inherited two centuries of deep changes in
cataloging formats, from card catalogs to computer databases. On the one
hand, these changes have mainly increased the complexity of the librarian's
job; on the other hand, they have not reconciled users with library services.
Many digital library catalogs are still isolated and inefficient because the
digitized legacy cataloging models lack semantics. Related studies agree that
the necessary improvements in libraries should come from deep changes in
cataloging models, adopting a more semantic way to represent bibliographic
relationships and improving knowledge discovery. Thus, new standards and
visions from the bibliographic domain, such as FRBR [3], as well as from the
Semantic Web community served as a basis to address HCL's issues. Our main
objective was to automatically migrate and enrich the whole catalog into
semantic linked data before integrating the result into a system based on
Semantic Web technologies.

[3] https://www.loc.gov/cds/downloads/FRBR.PDF

Preprocessing. The first step of the project was to analyze the catalog in
order to tune a system for the automated semantic enrichment process. This
task mostly consisted of extracting all the valuable knowledge patterns from
data scattered across the records. For HCL, this task was overwhelming due to
the variety of documents in the catalog. Thus, we relied on earlier research
studies [4] to automatically analyze the catalog and detect both
inconsistencies and valuable knowledge patterns. The next step was to
implement the rules extracted from the analysis in an enrichment tool. Yet,
because most existing solutions only provide basic mappings as rules, they
were too limited to correctly handle all the high-level knowledge patterns in
the catalog without writing overly complex and redundant mappings. Thus, we
proposed a novel approach that encapsulates mappings and rules in a
pattern-based migration model. This model aims to help domain experts discuss
and manage the migration rules easily, thanks to its clear representation of
high-level patterns and its graph model.
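
One simple way to picture the catalog analysis described above is to profile
which combinations of fields the records actually use. The following is a
minimal sketch under stated assumptions, not the method of the cited study:
records are assumed to be plain dictionaries, and the field names and the
helpers `profile_catalog` and `rare_signatures` are hypothetical.

```python
from collections import Counter

def profile_catalog(records):
    """Count how often each combination of populated fields occurs."""
    signatures = Counter()
    for record in records:
        # A signature is the sorted tuple of fields that carry a value.
        signature = tuple(sorted(k for k, v in record.items() if v))
        signatures[signature] += 1
    return signatures

def rare_signatures(signatures, max_count=1):
    """Return signatures seen at most max_count times (possible inconsistencies)."""
    return [sig for sig, n in signatures.items() if n <= max_count]

# Hypothetical toy catalog: two article-like records and one odd record.
records = [
    {"title": "A", "author": "X", "journal": "J1"},
    {"title": "B", "author": "Y", "journal": "J2"},
    {"title": "C", "author": "", "isbn": "123"},
]
signatures = profile_catalog(records)
suspects = rare_signatures(signatures)
```

Frequent signatures hint at candidate knowledge patterns, while rare ones
point to records that a domain expert may want to review.
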
Semantic enrichment. The automated transformation of a catalog is a costly
process in which each record must be evaluated against the dedicated rules,
both to construct the new knowledge base and to extract additional knowledge
from external sources. In the context of HCL, we chose to rely on our
pattern-based model of rules, implemented as a directed graph of patterns in
which each pattern holds the conditions and mappings written during the first
phase. Hence, we adapted a migration process to evaluate each record using
this graph. The major improvement of this approach comes from the inheritance
of conditions between patterns of the graph, which avoids useless computation
of mappings and thereby improves the overall performance of the system.
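
The benefit of inherited conditions can be illustrated with a small sketch.
It is an illustration under assumptions, not the authors' implementation:
the `Pattern` class, the `evaluate` function, the record fields, and the
output keys are all hypothetical, and the graph is simplified to a tree.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, str]

@dataclass
class Pattern:
    name: str
    condition: Callable[[Record], bool]          # must hold for this pattern
    mapping: Callable[[Record], Dict[str, str]]  # produces output properties
    children: List["Pattern"] = field(default_factory=list)

def evaluate(pattern: Pattern, record: Record, output: Dict[str, str]) -> Dict[str, str]:
    """Apply a pattern and its descendants to one record."""
    if not pattern.condition(record):
        # Children inherit this condition: when it fails, the whole
        # subtree is skipped and none of its mappings are computed.
        return output
    output.update(pattern.mapping(record))
    for child in pattern.children:
        evaluate(child, record, output)
    return output

# Hypothetical patterns: every document needs a title; articles refine it.
article = Pattern(
    name="article",
    condition=lambda r: r.get("type") == "article",
    mapping=lambda r: {"rdf:type": "Article"},
)
root = Pattern(
    name="document",
    condition=lambda r: "title" in r,
    mapping=lambda r: {"title": r["title"]},
    children=[article],
)

result = evaluate(root, {"title": "Asthma care", "type": "article"}, {})
skipped = evaluate(root, {"type": "article"}, {})
```

In the second call the record has no title, so the root condition fails and
the article pattern is never evaluated, which is the saving that condition
inheritance is meant to provide.
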
Data integration. The company Progilone provided the documentation system
Syrtis [5] to integrate the migrated and enriched data from the previous
phase. Syrtis relies on a graph-based model to provide various features for
managing semantic data, such as RDF graph visualization and semantic search.
We only had to provide a pivot model of mappings between the data generated
by the enrichment phase and the Syrtis vocabulary. In the end, the whole HCL
catalog was successfully migrated and integrated into Syrtis and is available
online [6]. The new online catalog based on Syrtis records at least 800
visits per month.
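
A pivot model of mappings can be thought of as a lookup from the predicates
produced by the enrichment phase to the terms of the target vocabulary. The
sketch below assumes data represented as (subject, predicate, object) tuples;
the predicate names, the `target:` terms, and the `apply_pivot` helper are
hypothetical and do not reflect the actual Syrtis vocabulary.

```python
# Hypothetical pivot mapping: source predicate -> target vocabulary term.
PIVOT = {
    "title": "target:hasTitle",
    "creator": "target:hasAuthor",
}

def apply_pivot(triples, pivot):
    """Rename predicates via the pivot mapping and collect unmapped triples."""
    mapped, unmapped = [], []
    for subject, predicate, obj in triples:
        if predicate in pivot:
            mapped.append((subject, pivot[predicate], obj))
        else:
            # Kept aside so that missing mappings can be reviewed.
            unmapped.append((subject, predicate, obj))
    return mapped, unmapped

triples = [
    ("doc1", "title", "Asthma care"),
    ("doc1", "creator", "X"),
    ("doc1", "shelfmark", "B-12"),
]
mapped, unmapped = apply_pivot(triples, PIVOT)
```

Collecting the unmapped triples separately makes gaps in the pivot model
visible instead of silently dropping data.
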

The use of semantic technologies to manage the catalog offers users new
possibilities to navigate along bibliographic relationships and better
discover the rich knowledge that was hitherto hidden in records. It also
reduces the cataloging effort for practitioners, because intellectual or
editorial information is now cleanly separated into related entities instead
of being aggregated in textual lists. Interoperability of the catalog is also
improved thanks to the graph model and the standard vocabularies, which make
it easier to integrate additional data from other repositories. Our ongoing
work focuses on the automated detection of knowledge patterns in these
sources to ease their integration.

[4] https://hal.archives-ouvertes.fr/hal-01324529
[5] http://www.progilone.fr/en/syrtis
[6] https://documentationcentrale.docchu-lyon.fr/