Project Coli-conc: Mapping Library Knowledge Organisation Systems

Uma Balakrishnan, Morsheda Akter. Verbundzentrale der GBV, Germany

Abstract: Knowledge organisation systems (KOS) across libraries in German-speaking countries have always struggled with a lack of homogeneity and of consistent subject indexing. Various tools have been developed to map different KOS, either manually or through algorithms that allow automatic or semi-automatic mapping [1, 2, 3]. Given the complexity and variety of KOS, there is a need for an overarching system that facilitates and promotes consistent subject indexing and improves access to information and the use of KOS. The first step towards designing and deploying such a system was the conversion of various KOS data from their proprietary formats into a uniform format. To meet this requirement, Coli-conc developed the JSKOS format [4] in the pilot phase of the project. The newly created format is based on SKOS and JSON-LD; it eases the use of KOS in web applications and allows more flexibility in displaying KOS data. To accelerate the intellectual mapping process and make it more efficient, the project proposes the mapping tool Cocoda, which adopts a dashboard design. This approach gathers data on the KOS being mapped from different sources, consolidates them and displays them concisely on a single screen for at-a-glance monitoring. Besides enabling search by term/caption or notation in a source or target KOS, the web-based tool presents the hierarchical structure of a queried term/class and permits browsing of the higher-order concepts of the selected KOS. Additionally, the tool automatically generates mappings for a selected term/notation and offers options to edit the newly created mappings, save them in the centralized VZG mapping database and export them in JSKOS and various other formats.
Furthermore, the Coli-conc infrastructure includes a KOS-Registry, a platform for concordances with a web interface, and a JSKOS API for KOS and mappings, all of which are provided as stand-alone services.

[1] Walter, A.-K., Mayr, P., Petras, V., Baerisch, S. (2007)
[2] Pfeffer, M. (2013)
[3] Lauser, B., Johannsen, G., Caracciolo, C., Keizer, J., van Hage, W. R., Mayr, P. (2008)
[4] Voß, J. (2016)

Features of Coli-conc and technical specifications

1. Coli-conc System Architecture

The system offers an infrastructure to support the semi-automatic creation of mappings. Furthermore, it facilitates the use and exchange of KOS and their mappings. The application has a modular design and consists of the following core parts:
• Coli-conc Web (CCWeb)
• Data Converter
• JSKOS API for KOS and Mappings (KOS-API, KK-API)
• Database Server

Figure 1: Coli-Conc System Architecture (labels in the figure: CCWeb, Cocoda, Sources, KOS-Representation, KK-Suggest, KK-Measure, KOS-API, KK-API, Web Interface, Concordances, KOS-Registry, Converter & Uploader, User Administration, Database Server)

1.1. CCWeb

CCWeb is the heart of the system architecture. It comprises:
• The Mapping tool – Cocoda
• KOS-Registry
• Concordances
• User Admin module and a Web Interface

Figure 2: CCWeb (labels in the figure: Web Interface, Presentation, CCWeb Application, Concordances, KOS-Registry, Cocoda, User Administration, Database Server, Database)

1.1.1. Mapping Tool – Cocoda

Cocoda is designed to perform multiple tasks that speed up the intellectual process of building mappings between library KOS. The main components of the tool are:
• KOS-Representation Module: searches, browses and displays hierarchical intra-KOS structural and content information for concept clarification.
• KK-Suggest Module: combines several techniques and workflows (statistical co-occurrences, direct concept mapping and retrieval of stored mappings from the concordance database) to maximize the results and generate mapping candidates for a selected source and target KOS.
• KK-Measure Module: monitors and assesses the quality of the KOS and their mappings.

Figure 3: Mapping Tool – Cocoda

1.1.2. KOS-Registry

The KOS-Registry holds a collection of library KOS that are actively in use in the German-speaking countries [5]. The records have been classified by type and enriched with metadata based on the NKOS AP [6] format. An interface enables keyword search or selection of a specific type of KOS through a drop-down menu, as well as retrieval of the metadata of each KOS in the selected set. The application provides an export function for different formats (XLS, JSON and JSKOS) [7]. A script continually updates the registry through automatic data acquisition from BARTOC [8].

Figure 4: KOS-Registry

[5] Balakrishnan, U., Agne, J.M. (2016)
[6] NKOS AP: http://nkos.slis.kent.edu/nkos-ap.html
[7] Voß, J., Ledl, A., Balakrishnan, U. (2016)
[8] BARTOC: https://bartoc.org/

1.1.3. Concordances

To store, manage and access mappings from Coli-conc and other related projects, and to integrate them into the KK-Suggest module, the project has developed a concordance platform as part of the CCWeb. The platform's database currently contains over 200,000 mappings and will be expanded with the help of partner institutions.

Figure 5: Concordances

2. Technologies and Frameworks deployed for CCWeb

Figure 6: Technologies and frameworks for CCWeb. The figure lists Java, JavaScript, JavaServer Faces (JSF) v 2.2, the Jersey bundle, Apache Tomcat v 7.0 or higher, J2EE (Java 2 Platform, Enterprise Edition) as application server, Eclipse as IDE, MongoDB v 3.2.1 as database server, PrimeFaces v 5.0 for the user interface and XHTML for request forms.

3. Data Conversion

One of the main components of the Coli-conc system is the JSKOS format, which was modelled specifically for the project.
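To give a sense of what such data can look like, the following minimal Python sketch builds a JSKOS-style concept and a mapping between two concepts. The field names (uri, type, notation, prefLabel, inScheme, from, to, memberSet, fromScheme, toScheme) are taken from the JSKOS specification (Voß, 2016) as best understood here and should be verified against it. The helper functions, URIs, notations and labels are hypothetical and invented purely for illustration; they do not come from any real KOS.

```python
import json

SKOS = "http://www.w3.org/2004/02/skos/core#"


def make_concept(uri, notation, label, scheme_uri, lang="en"):
    """Build a minimal JSKOS-style concept (hypothetical helper)."""
    return {
        "uri": uri,
        "type": [SKOS + "Concept"],
        "notation": [notation],
        "prefLabel": {lang: label},
        "inScheme": [{"uri": scheme_uri}],
    }


def make_mapping(source, target, relation="exactMatch"):
    """Build a minimal JSKOS-style mapping between two concepts (hypothetical helper)."""
    return {
        "type": [SKOS + relation],
        "from": {"memberSet": [{"uri": source["uri"], "notation": source["notation"]}]},
        "to": {"memberSet": [{"uri": target["uri"], "notation": target["notation"]}]},
        "fromScheme": source["inScheme"][0],
        "toScheme": target["inScheme"][0],
    }


# Invented example data: two concepts from two made-up schemes.
source = make_concept(
    "http://example.org/kos-a/100", "100", "Philosophy", "http://example.org/kos-a"
)
target = make_concept(
    "http://example.org/kos-b/CA", "CA", "Philosophy", "http://example.org/kos-b"
)

mapping = make_mapping(source, target)
print(json.dumps(mapping, indent=2, ensure_ascii=False))
```

Because the records are plain dictionaries, json.dumps produces JSON that a web application can consume directly, which is the property the format is intended to provide.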
An application that converts DDC (Dewey Decimal Classification) and RVK (Regensburg Classification) data from their proprietary formats into JSKOS has been developed using the MARC4J package [9]. For other KOS (Basic Classification, Allgemeine Systematik für Öffentliche Bibliotheken, …) and for mappings, the mc2skos Python script [10] was extended to support JSKOS. The conversion scripts for mappings are available at: https://github.com/gbv/cocoda-mappings.

Figure 7: DDC Marc21 XML data converter Sequence Diagram

[9] MARC4J package: http://svn.k-int.com/default/components/marc4j/tutorial.html
[10] mc2skos Python script: https://pypi.python.org/pypi/mc2skos

4. JSKOS API

An additional feature of the Coli-conc architecture is the JSKOS API. One of the key objectives of the project is to provide uniform and easy access to KOS and their mappings on the web. This has been achieved through the creation of the JSKOS API. So far, the service has been implemented as a database application for DDC, RVK and the Basic Classification (BC), and as wrappers to access the Gemeinsame Normdatei (GND), Wikidata and Open Researcher and Contributor ID (ORCID). However, use of the DDC API is subject to a license requirement.

Figure 8: DDC-JSKOS API

Project Partner Institutions

The project is funded by the German Research Foundation. It has received support from the German National Library, several expert groups, large academic libraries and international institutions.

Figure 9: Project Partner Institutions

References

Balakrishnan, U., Agne, J.M. (2016). Coli-conc Survey - Ergebnisse der Online-Umfrage zur Sacherschließung und Konkordanzprojekten. Retrieved from http://coli-conc.gbv.de/publications/
Lauser, B., Johannsen, G., Caracciolo, C., Keizer, J., van Hage, W. R., Mayr, P. (2008). Comparing human and automatic thesaurus mapping approaches in the agricultural domain. 10th International Conference on Dublin Core and Metadata Applications. Retrieved from https://arxiv.org/abs/0808.2246
Pfeffer, M.
(2013). Automatic creation of mappings between classification systems. Workshop Klassifikation und Sacherschließung (LIS'2013), University of Luxembourg (Presentation). Retrieved from https://publikationen.bibliothek.kit.edu/1000035777
Voß, J., Ledl, A., Balakrishnan, U. (2016). Uniform description and access to Knowledge Organization Systems with BARTOC and JSKOS. Proceedings of TOTh conference 2016. Retrieved from https://zenodo.org/record/438019
Voß, J. (2016). JSKOS data format for knowledge organization systems. Retrieved from https://gbv.github.io/jskos/jskos.html
Walter, A.-K., Mayr, P., Petras, V., Baerisch, S. (2007). Kompetenzzentrum Modellbildung und Heterogenitätsbehandlung (KoMoHe). Retrieved from https://www.gesis.org/forschung/drittmittelprojekte/archiv/komohe/