<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Adopting Semantic Technologies in Public Health Documentation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Joffrey Decourselle</string-name>
          <email>joffrey.decourselle@liris.cnrs.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Frederic Riondet</string-name>
          <email>frederic.riondet@chu-lyon.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Central Documentation - Hospices Civils de Lyon</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>LIRIS, UMR5205</institution>
          ,
          <addr-line>Universite Claude Bernard Lyon 1, Lyon</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>We present a success story on the adoption of semantic technologies for the library of the second-largest university hospital in France. The project was divided into three parts: preprocessing, semantic enrichment, and data integration. This abstract introduces the research challenges faced in the project as well as the outcomes obtained so far.</p>
      </abstract>
      <kwd-group>
        <kwd>health documentation</kwd>
        <kwd>semantics</kwd>
        <kwd>migration</kwd>
        <kwd>enrichment</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>The Central Documentation of the second-largest university hospital in
France, the Hospices Civils de Lyon (HCL), holds about 500,000 online
bibliographic records, each containing metadata about documents such as articles,
journals, books, or legislative texts in the health domain. The availability of such
data is crucial for the work of at least 25,000 health professionals, researchers,
and students. Yet, while these users have become more and more demanding in terms
of search and exploration features, improving the HCL digital library was
impossible due to the limited capabilities of the legacy library system and
the lack of flexibility of the cataloging model. Thus, the decision was taken to
adopt semantic web technologies for the management of HCL's catalog.</p>
      <p>The bibliographic domain has inherited two centuries of deep changes in
cataloging formats, from old-fashioned cards to computer databases, which, on the
one hand, have mainly increased the complexity of the librarian's job while, on
the other hand, failing to reconcile users with library services. Many digital
library catalogs are still isolated and inefficient due to the lack of semantics
in the digitized legacy cataloging models. Related studies agree that the
necessary improvements in libraries should come from deep changes in cataloging
models, adopting a more semantic way to represent bibliographic relationships
and improving knowledge discovery. Thus, new standards and visions from the
bibliographic domain, such as FRBR (https://www.loc.gov/cds/downloads/FRBR.PDF),
as well as from the Semantic Web community, served as a basis to address HCL's
issues. Our main objective was to automatically migrate and enrich the whole
catalog into semantic linked data before integrating it into a system based on
semantic web technologies.
</p>
      <p>Preprocessing. The first step of the project aimed at analyzing the catalog
to tune a system for the automated semantic enrichment process. This task
mostly consisted of extracting all the valuable knowledge patterns from data
scattered across the records. For HCL, this task was overwhelming due to the
variety of documents in the catalog. Thus, we relied on earlier research studies
to automatically analyze the catalog and detect both inconsistencies and valuable
knowledge patterns. The next step was to implement the rules extracted from the
analysis in an enrichment tool. Yet, because most existing solutions only provide
basic mappings as rules, they were too limited to correctly handle all the
high-level knowledge patterns in the catalog without writing overly complex and
redundant mappings. Thus, we proposed a novel approach that encapsulates
mappings and rules in a pattern-based migration model. This model aimed to help
domain experts discuss and manage the migration rules easily, thanks to its
clear representation of high-level patterns and its graph model.
</p>
      <p>Semantic enrichment. The automated transformation of a catalog is a costly
process in which each record must be evaluated against the dedicated rules, both
to construct the new knowledge base and to extract additional knowledge from
external sources. In the context of HCL, we chose to rely on our pattern-based
model of rules, which was implemented as a directed graph of patterns where
each pattern holds conditions and mappings written during the first phase. Hence,
we adapted a migration process to evaluate each record using this graph. The
major improvement of this approach came from the inheritance of conditions
between patterns of the directed graph, which prevented useless computation of
mappings and ultimately enhanced the global performance of the system.
</p>
      <p>Data integration. The company Progilone provided the documentation system
Syrtis to integrate the migrated and enriched data from the previous phase.
Syrtis relies on a graph-based model to provide various features for managing
semantic data, such as RDF graph visualization and semantic search. We only had to
provide a pivot model of mappings between the data generated by the
enrichment phase and the Syrtis vocabulary. In the end, the whole HCL catalog was
successfully migrated and integrated into Syrtis and is available online.
The new online catalog based on Syrtis records at least 800 visits per month.
</p>
      <p>The use of semantic technologies to manage the catalog offers new
possibilities for users to navigate between bibliographic relationships and better
discover the rich knowledge that was hitherto hidden in records. It also reduces the
cataloging effort for practitioners because intellectual and editorial information is
now cleanly separated into related entities instead of being aggregated in textual
lists. Interoperability of the catalog is also improved thanks to the graph model
and the standard vocabularies, making it easier to integrate additional data from
other repositories. Our ongoing work focuses on the automated detection
of knowledge patterns in these sources to ease their integration.</p>
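      <p>To make the graph of patterns used during semantic enrichment more concrete, the following is a minimal sketch of how condition inheritance can avoid useless mapping computation. It is not the implementation used in the project: the Pattern class, the migrate function, the record fields (title, type, journal, isbn) and the target property names are all assumptions chosen for illustration, and the graph is simplified to a tree.</p>
      <preformat preformat-type="code">
```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, str]


@dataclass
class Pattern:
    """A node of the pattern graph (hypothetical structure, not the authors' model)."""
    name: str
    condition: Callable[[Record], bool]
    mapping: Callable[[Record], Dict[str, str]]
    children: List["Pattern"] = field(default_factory=list)


def migrate(record: Record, pattern: Pattern) -> Dict[str, str]:
    """Apply a pattern and, only if its condition holds, its descendants."""
    if not pattern.condition(record):
        # Children inherit this condition, so the whole sub-graph is skipped
        # and none of its mappings are computed.
        return {}
    result = dict(pattern.mapping(record))
    for child in pattern.children:
        result.update(migrate(record, child))
    return result


# Illustrative patterns; field names and property names are assumptions.
article = Pattern(
    name="article",
    condition=lambda r: r.get("type") == "article",
    mapping=lambda r: {"dcterms:isPartOf": r["journal"]},
)
book = Pattern(
    name="book",
    condition=lambda r: r.get("type") == "book",
    mapping=lambda r: {"bibo:isbn": r["isbn"]},
)
root = Pattern(
    name="document",
    condition=lambda r: "title" in r,
    mapping=lambda r: {"dcterms:title": r["title"]},
    children=[article, book],
)

record = {"title": "Hand hygiene in hospitals", "type": "article", "journal": "Example Journal"}
print(migrate(record, root))
```
      </preformat>
      <p>In this sketch, a record that fails the condition of the root pattern never reaches the article or book patterns, so their conditions and mappings are not evaluated at all. A real pattern graph in which several parents share a child would additionally need to track visited patterns to avoid evaluating a shared child twice.</p>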
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>