=Paper= {{Paper |id=Vol-3214/WS6Paper3 |storemode=property |title=Semantic Discovery and Selection of Data Connectors in International Data Spaces |pdfUrl=https://ceur-ws.org/Vol-3214/WS6Paper3.pdf |volume=Vol-3214 |authors=Danniar Reza Firdausy,Patrício de Alencar Silva,Marten van Sinderen,Maria Eugenia Iacob |dblpUrl=https://dblp.org/rec/conf/iesa/FirdausySSI22 }} ==Semantic Discovery and Selection of Data Connectors in International Data Spaces== https://ceur-ws.org/Vol-3214/WS6Paper3.pdf
Semantic Discovery and Selection of Data Connectors in
International Data Spaces
Danniar Reza Firdausy1, Patrício de Alencar Silva1,2, Marten van Sinderen1 and Maria
Eugenia Iacob1
1
 University of Twente, Drienerlolaan 5, Enschede, 7522 NB, The Netherlands
2
 Rio Grande do Norte State University (UERN) Federal University of the Semi-Arid Region (UFERSA),
Mossoró, RN, Brazil


                                Abstract
                                Data sovereignty is the right that individuals and organizations have to control the access to
                                and the disclosure of their private and sensitive data. In Europe, the International Data Spaces
                                Association (IDSA) aims to promote this right by proposing technical and organizational
                                guidelines to help companies build trusted data exchange ecosystems. The IDSA suggests the
                                IDS Connectors as software components necessary to enforce data sovereignty on the
                                technical level. Among many possible functionalities, an IDS Connector could enable (1)
                                data exchange between data owners' and data users' enterprise systems in a standardized
                                communication protocol; (2) data access policy enforcement; and (3) internal data
                                transformation operations, e.g., integration, mapping, or merging. However, software and
                                service providers may soon start offering IDS Connectors with different configurations
                                through multiple platforms on the Web, making the practical adoption of the IDS
                                architectural guidelines more difficult, especially for small and medium enterprises. We
                                propose developing an IDS Connector Store to discover and select IDS Connectors in IDS
                                ecosystems to cope with this issue. The store will operate as a metadata repository to describe
                                the connectors according to contextual information, e.g., the business domain, pricing model,
                                and data access policies enforced. This paper reports on the current state of this research
                                endeavor by providing a threefold contribution. First, it elaborates on research questions,
                                methods, and goals to address the design problem at hand. Second, it presents an ontology
                                requirements specification document highlighting competency questions related to
                                discovering and selecting IDS Connectors in an IDS ecosystem. Last, it provides the first
                                conceptual draft of an ontology for IDS Connectors described in OntoUML posed for
                                discussion among the conceptual modeling community and to guide meaningful and further
                                specification in Web Ontology Language (OWL).

                                Keywords
                                International data spaces, IDS, ontology, semantic web, discoverability, data sovereignty

1. Introduction

   IT-based platforms have proven their significance in facilitating data-sharing and interoperability
among organizations [1]. A few of their benefits perceived by business entities are improving their
planning process, enhancing their capability to fulfill large work orders, and stimulating the creation
of new business models [2, 3]. Despite these gains, establishing a data-sharing ecosystem has
challenges, e.g., conflicting data formats and standards between companies' software systems and the
lack of technical enforcement in disclosing sensitive data [4, 5].

Proceedings of the Workshop of I-ESA’22, March 23–24, 2022, Valencia, Spain
EMAIL: d.r.firdausy@utwente.nl (D.R. Firdausy); p.dealencarsilva@utwente.nl (P. de Alencar Silva); m.j.vansinderen@utwente.nl (M. van
Sinderen); m.e.iacob@utwente.nl (M.E. Iacob)
ORCID: 0000-0001-9743-9754 (D.R. Firdausy); 0000-0001-6827-1024 (P. de Alencar Silva); 0000-0001-7118-1353 (M. van Sinderen);
0000-0002-4004-0117 (M.E. Iacob)
                           © 2022 Copyright for this paper by its authors.
                           Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

                           CEUR Workshop Proceedings (CEUR-WS.org)
   Subsequently, these hurdles in carrying out data exchange led to the advancement of the
International Data Spaces (IDS). This initiative is a decentralized and usage policy-enforced data-
sharing ecosystem that puts forward trust, security, interoperability, and data sovereignty in mind [6,
7]. IDS grants the participants access to join and share data in the ecosystem through the IDS
Connectors. These software components will be critical to enabling secure data exchange between a
data provider and data consumer by enforcing usage policies for the data consumer to use, process,
and proliferate the shared data [8]. As a gateway between a company's enterprise systems and their
partners', the IDS Connectors may soon be offered and supplied by software and service providers in
numerous configurations to satisfy diverse sets of participants' needs and capabilities. As a result,
information about how to obtain these IDS Connectors will become increasingly scattered across the
Web, hampering potential participants' ability to discover them and ultimately limiting their
adoption of the IDS vision. To cope with this issue, we propose the idea of an IDS Connector Store,
a repository of metadata describing the functionality and contextual information of data
connectors.
   In the meantime, Semantic Web technology has gained more prominence in information sharing. It
has been implemented in an increasing variety of contexts in recent years to enhance the
discoverability and accessibility of resources on the Web [9]. One of the building blocks of the
Semantic Web is the ontology, a formal and explicit specification of a conceptualization that is
used to add a layer of metadata to the described resources to define their meaning [10]. This
metadata layer makes Web resources easier for software agents to interpret, so that they can return
more refined search results to human users. Even though considerable research has
adopted the Semantic Web approach, work that addresses the widespread proliferation
of the IDS Connectors is still scarce. Therefore, we propose the application of Ontology and
Semantic Web technology to the development of the IDS Connector Store to facilitate the
discoverability and selection process of the IDS Connectors.
   The rest of this paper is organized as follows. Section 2 describes the methodological steps
required to develop the ontology for the IDS Connectors. Section 3
elaborates on producing the ontology requirements specification document, highlighting the
competency questions related to discovering and selecting the IDS Connectors. Finally, Sections 4
and 5 present the first conceptual draft of the ontology described in
OntoUML and open it for discussion within the conceptual modeling community, with a view to
specifying the ontology further in Web Ontology Language (OWL).

2. Research methods
    The first prominent step in developing an ontology-based software application is the formulation
of the ontology requirements specification document (ORSD). In this research, we adopt the scenario-
based NeOn Methodology, emphasizing the reuse of existing ontological and non-ontological
resources in developing the ontology [11]. In addition to the requirements specification activity
guidelines, this methodology also provides a template to formulate the ORSD as a filling card that
describes the purpose, scope, implementation language, intended end-user, intended uses,
requirements, and pre-glossary terms of the ontology under design [12]. The process followed to
produce the ORSD based on this methodology will be discussed in more detail in the next section.
    This paper aims to produce a preliminary ontology for the IDS Connectors to facilitate their
discoverability and selection for the IDS participants. To maintain interoperability with the domain
reference ontology, NeOn suggests a quick search of knowledge resources for possible reuse during
the development. For this purpose, the IDSA has published the IDS Information Model that describes
the fundamental concepts of the IDS, covering entities from the participants to the infrastructure
components [13]. This IDS Information Model grounds the ontology proposed in this work. The
resulting conceptual model is depicted in OntoUML [14]. This model serves as the basis for further
implementation into OWL to describe the IDS Connectors and distinguish them with subject-
predicate-object triples according to the Resource Description Framework (RDF) format [15].
Through this semantic annotation, several sentences can be formed to explain the IDS Connectors.
For instance, company A maintains an IDS Connector X, IDS Connector Y is offered in a flat-rate
pricing model, or IDS Connector Y complies with GS1 standards [16, 17]. As a result, software
agents will be able to discover the IDS Connectors that are appropriate to their data exchange
demands.
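
The three example statements above can be written down as subject-predicate-object triples. The following is a minimal sketch in plain Python, not an OWL or RDF implementation: the namespace `https://example.org/ids-store#` and the predicate names `maintains`, `offeredIn`, and `compliesWith` are hypothetical placeholders and are not taken from the IDS Information Model.

```python
# Hypothetical namespace; the real ontology would define its own IRIs.
NS = "https://example.org/ids-store#"

# Each statement is a (subject, predicate, object) triple.
TRIPLES = [
    ("CompanyA", "maintains", "ConnectorX"),
    ("ConnectorY", "offeredIn", "FlatRate"),
    ("ConnectorY", "compliesWith", "GS1"),
]


def to_ntriples(triple):
    """Serialize one triple as an N-Triples line with IRIs in angle brackets."""
    return " ".join(f"<{NS}{term}>" for term in triple) + " ."


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Which connectors comply with GS1?
gs1_connectors = [s for s, _, _ in match(TRIPLES, p="compliesWith", o="GS1")]

for t in TRIPLES:
    print(to_ntriples(t))
print(gs1_connectors)
```

A software agent would issue the same kind of pattern against a triple store instead of an in-memory list; the wildcard lookup only illustrates how a single triple pattern selects connectors.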

3. IDS connector ontology requirement specification

    The requirements specification identifies the ontology's purpose, scope, and implementation
language. As presented in Table 1, three main end-users that will take advantage of the knowledge
given by the IDS Connector ontology are listed. The business representatives are the first target users
due to their interest in spotting potential business opportunities in the current business landscape. For
the potential IDS participants, the presence of their partners and the prospect of securing a strategic
partnership with other existing participants signal the value accessible to them by participating in the
data space. Such a scenario might influence their willingness to join the IDS ecosystem.
    Conversely, the interests of the existing participants can take many forms. One example is to find
other prospective partners to engage in strategic information exchange to leverage their value chain
performance. The IT representatives will need to translate these business strategies into IT
implementation strategies by identifying the IDS Connectors that match their needs and
capabilities. Such a demand leads to concerns about which IDS Connectors fit their business domain
or industrial standards adopted for data exchange. In response, software and service providers will be
interested in making their IDS Connectors discoverable by external software applications.

Table 1
IDS Connector Ontology Requirement Specification Document
Purpose
To serve the IDS Connector Store as a knowledge base in describing the IDS Connector to guide
existing and potential participants of the IDS ecosystem to the relevant IDS Connector.
Scope
The ontology focuses on describing and discovering the IDS Connectors according to contextual
information (e.g., the business domain, pricing model, and enforced data access policy) with the
granularity represented by the competency questions.
Implementation Language
The ontology is represented in OntoUML, with further translation into OWL.
Intended End-Users
 User 1. Business representatives of potential and existing IDS participants
 User 2. IT representatives of potential and current IDS participants
 User 3. Software and Service Providers who develop and supply IDS Connectors
 User 4. Scholars who are keen to explore the ontology's knowledge representation capabilities
Intended Uses
 Use 1. Software and Service Providers publish their offered IDS Connectors' metadata on the IDS
        Connector Store to make their offered IDS Connectors discoverable.
 Use 2. Business representatives search for IDS-compliant partners operating in the same
        business domain, complying with common standards, etc.
 Use 3. IT representatives search for IDS Connectors that match their needs and capabilities.
 Use 4. Scholars search for, use, and import the ontology into their proof-of-concept IDS
        implementations.
Ontology Requirements
Non-Functional Requirements
NFR 1. The ontology must at least use English.
NFR 2. The ontology must comply with, reuse, and integrate with the existing IDS Ontology specified
       under the IDS Information Model.
Functional Requirements: Competency Questions
                                  CQG1. IT Representatives
CQ 1. Which software providers offer IDS Connectors?
CQ 2. Which IDS Connectors are developed for a specific business domain?
CQ 3. Which IDS Connectors comply with a particular standard?
CQ 4. Which IDS Connectors are offered in this pricing model?
CQ 5. Which IDS Connectors support these data usage agreements?
CQ 6. Which IDS Connectors are developed using this application development framework?
CQ 7. Which IDS Connectors are offered in this deployment context?
                               CQG2. Business Representatives
CQ 8. Which IDS participants use a particular IDS Connector from a specific Software Provider?
CQ 9. Which IDS participants operate in a particular business domain?
CQ 10. Which IDS participants comply with a particular standard?
Pre-Glossary of Terms
Terms from Competency Questions & Frequency
- IDS Connector              - Business Domain                     - Data Usage Agreement
- Participant                - Standards                           - Technology
- Software Provider          - Pricing Model                       - Deployment
Objects and Terms for Answers
- Gatewise IDS Connector, Supplydrive IDS Connector, TradeCloud IDS Connector.
- Vandaglas B.V., Van Egmond Groep, Meijer Metal
- ECI Software Solutions, Tradecloud
- Transport Logistics, Glass Manufacturing, Steel Manufacturing, etc.
- OTM, GS1, EDI4STEEL, etc.
- Flat Rate, Freemium, Pay per User, Pay per Feature, etc.
- Delete After Interval Agreement, Connector-restricted Agreement, Logging Agreement, etc.
- Java, Spring Boot, JavaScript, NodeJS, VueJS, Python, etc.
- On-Premise, Cloud: SaaS, etc.

    The functional and non-functional requirements specification then follows the Ontology
Engineering activity. The ontology development work on the IDS itself was initiated by the IDSA by
publishing the IDS Information Model. This work grounded the development of this IDS Connector
ontology, putting it as a prominent non-functional requirement. On top of this, several functional
requirements in the form of competency questions lead to the semantic-enabled discovery and
selection process of the IDS Connectors. These questions are grouped by considering the immediate
interests of the relevant end-users. Frequent terms are extracted from the competency questions,
leading to the enumeration of objects for answering the end-users' queries. We instantiate the entities
listed in Table 1 above by referring to the literature [5, 18], industrial standards [16, 17, 19], IDSA
documentation and publications [8, 13, 20], and the publication of the Smart Connected Supplier
Network (SCSN), one of the IDS forerunners in the Dutch manufacturing supply chain [21, 22].
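
One way to picture the extraction of frequent terms from the competency questions is a simple count of how many questions mention each pre-glossary term. The sketch below assumes case-insensitive substring matching, which is an illustrative simplification and not necessarily the procedure followed in this work. The question texts follow Table 1.

```python
from collections import Counter

COMPETENCY_QUESTIONS = [
    "What software provider offers IDS Connectors?",
    "Which IDS Connectors are developed for a specific business domain?",
    "Which IDS Connectors are complying with a particular standard?",
    "Which IDS Connectors are offered in this pricing model?",
    "Which IDS Connectors support these data usage agreements?",
    "Which IDS Connectors are developed using this application development framework?",
    "Which IDS Connectors are offered in this deployment context?",
    "Which IDS participants use a particular IDS Connector from a specific Software Provider?",
    "Which IDS participants operate in a particular business domain?",
    "Which IDS participants comply with a particular standard?",
]

# Pre-glossary terms (singular forms, so plurals also match as substrings).
CANDIDATE_TERMS = [
    "IDS Connector", "Participant", "Software Provider", "Business Domain",
    "Standard", "Pricing Model", "Data Usage Agreement", "Technology",
    "Deployment",
]


def term_frequencies(questions, terms):
    """Count, per term, the number of questions that mention it."""
    counts = Counter()
    for question in questions:
        text = question.lower()
        for term in terms:
            if term.lower() in text:
                counts[term] += 1
    return counts


freq = term_frequencies(COMPETENCY_QUESTIONS, CANDIDATE_TERMS)
for term, n in freq.most_common():
    print(f"{term}: {n}")
```

Literal matching misses terms that the questions express with other words: "Technology" is never counted because CQ 6 speaks of an "application development framework". This is one reason a manual review of the glossary remains necessary.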

4. Preliminary IDS connector ontology conceptual model

    Using the IDS Reference Architecture Model (RAM) and the IDS Information Model (IM) as a
starting point, we have identified several concepts relevant to answering the CQs above, namely the
Participant and the Connector concepts [8, 13]. As shown in Figure 1, we identify the former concept
as the IDS Actor and extend it further into two specializations. The Core IDS Actors refer to the
participants who either own and provide or request and use data. Meanwhile, the Supporting IDS
Actors are associated with parties that ensure the continuation of the data-sharing ecosystem. The
Software and Service Provider carries this duty by providing essential components to participate in
the data space.
Figure 1: Preliminary IDS Connector Ontology Conceptual Model

   Meanwhile, the Broker Service Provider supports the core actors by letting them look up other
actors and the Connectors those actors use, through the functionality offered by the
IDS Connector Store. In addition, the Supporting IDS Actor also covers other roles, such as the
Clearing House and Identity Provider. However, as the IDS RAM describes, these roles can be
assumed by the same organization that takes the part of the Broker Service Provider [8].
   The IDSA expresses the Connectors from several different perspectives. On the one hand, the IDS
IM describes the concept of a Connector to be the generalization of the Base Connector, Trusted
Connector, App Store, and Participant Information Service [13]. Here, we distinguish these
Connectors into the Core Connector and Supporting Connector, each used by the corresponding type
of role. The IDS RAM justifies this distinction by describing that the functions falling into the
supporting category, i.e., the App Store Provider, Broker Service Provider, and Identity Provider, rely
on the Connector technology to carry out their functions [8]. On the other hand, the IDS RAM also
characterizes the Connector from its Deployment Context, Security Profile, Catalog, and Host. The
Deployment Context is designated as the Connector's deployment environment, i.e., on-premises or
cloud-based. Security Profile explicates the Connector's capability to enact a secure data exchange
and processing environment. Host signals the communication protocol that the Connector supports to
expose resources, e.g., HTTPS URLs or MQTT topics. Finally, the Catalog facilitates
participant discovery in the ecosystem based on the digital resources that the Connector provides or
consumes.
    We extend the Connector concept with additional properties to facilitate its discovery and
selection. The Business Domain describes the context for which the Connector is
specialized. Standards refer to the industry standards with which the Connector complies. The Pricing Model
implies how the end-users are expected to pay for the Connector's usage and acquisition. The
Application Framework informs which technology stacks are used to develop and support the
Connector's runtime. Finally, the Data Usage Agreement is understood as the contract composed of
the Data Usage Policy Pattern and agreed by the interacting Core IDS Actor to govern the data usage.
As of now, five types of data usage patterns are supported by the Connector, and more patterns may be
added in the future.
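
To illustrate how these properties could support selection, the sketch below models a connector description as a plain Python record and filters a small catalog by criteria. It is an assumption-laden illustration, not the OntoUML model itself: the class name `ConnectorDescription`, its field names, and the two catalog entries ("Connector X", "Connector Y") are hypothetical, and only the property values reuse vocabulary from Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorDescription:
    """Hypothetical metadata record for one connector in the store."""
    name: str
    business_domain: str
    standards: frozenset
    pricing_model: str
    application_frameworks: frozenset
    deployment_context: str
    data_usage_agreements: frozenset


# Made-up catalog entries, used only to exercise the filter.
CATALOG = [
    ConnectorDescription(
        name="Connector X",
        business_domain="Transport Logistics",
        standards=frozenset({"OTM", "GS1"}),
        pricing_model="Flat Rate",
        application_frameworks=frozenset({"Java", "Spring Boot"}),
        deployment_context="On-Premise",
        data_usage_agreements=frozenset({"Logging Agreement"}),
    ),
    ConnectorDescription(
        name="Connector Y",
        business_domain="Steel Manufacturing",
        standards=frozenset({"EDI4STEEL"}),
        pricing_model="Pay per User",
        application_frameworks=frozenset({"Python"}),
        deployment_context="Cloud: SaaS",
        data_usage_agreements=frozenset({"Delete After Interval Agreement"}),
    ),
]


def select(catalog, business_domain=None, standard=None,
           pricing_model=None, deployment_context=None):
    """Return the connectors that satisfy every criterion that is not None."""
    return [
        c for c in catalog
        if (business_domain is None or c.business_domain == business_domain)
        and (standard is None or standard in c.standards)
        and (pricing_model is None or c.pricing_model == pricing_model)
        and (deployment_context is None or c.deployment_context == deployment_context)
    ]


matches = select(CATALOG, standard="GS1", deployment_context="On-Premise")
print([c.name for c in matches])
```

Each keyword argument corresponds to one of the competency questions for IT representatives (e.g., CQ 3 for `standard`, CQ 7 for `deployment_context`), and combining arguments narrows the selection.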

5. Conclusion
    This paper introduced a preliminary ontology conceptual model to describe IDS Connectors that
will serve as a façade of a Connector Store. Since this ontology is still under development, we plan to
refine it further to accommodate more relevant competency questions to facilitate the discovery of
IDS Connectors and IDS actors. The next step is to translate this model into an OWL representation for
semi-automated machine reasoning. We will then populate the resulting representation with instances of
IDS Connectors and IDS Actors and load it into a triple store such as TriplyDB to make it publicly
available [23]. Next, we plan to evaluate the published ontology by translating the competency
questions into SPARQL queries to verify the ontology's correctness, consistency, and completeness.
We also plan to validate the utility of the ontology by assessing the relevance of the posed
competency questions with experts and the end-users identified in Table 1, and by verifying whether
the answers returned by the SPARQL queries correspond to end-users' expectations. Lastly, the completed
and published ontology will be operationalized into a proof-of-concept implementation of the IDS
Connector Store by connecting the triple store with a user-interfacing application to support the
discovery and selection of IDS Connectors in an IDS-compliant data-sharing ecosystem.
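
The planned verification step can be pictured as follows: a competency question is expressed as a set of triple patterns with variables, evaluated against instance data, and the answers are compared with what the end-user expects. The sketch below is a toy pattern matcher in plain Python standing in for a SPARQL engine. The predicates `uses` and `offeredBy` and all instance names are hypothetical.

```python
# Toy instance data; every name here is a made-up placeholder.
TRIPLES = [
    ("ParticipantA", "uses", "ConnectorX"),
    ("ParticipantB", "uses", "ConnectorY"),
    ("ParticipantC", "uses", "ConnectorX"),
    ("ConnectorX", "offeredBy", "ProviderP"),
    ("ConnectorY", "offeredBy", "ProviderQ"),
]


def is_var(term):
    """Variables are written with a leading question mark, as in SPARQL."""
    return term.startswith("?")


def solve(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for pattern_term, value in zip(first, triple):
            if is_var(pattern_term):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(pattern_term, value) != value:
                    ok = False
                    break
            elif pattern_term != value:
                ok = False
                break
        if ok:
            yield from solve(rest, triples, candidate)


# CQ 8: which participants use a connector offered by ProviderP?
CQ8 = [
    ("?participant", "uses", "?connector"),
    ("?connector", "offeredBy", "ProviderP"),
]

answers = sorted(b["?participant"] for b in solve(CQ8, TRIPLES))
expected = ["ParticipantA", "ParticipantC"]
print(answers, answers == expected)
```

The shared variable `?connector` joins the two patterns, which mirrors how a SPARQL basic graph pattern would answer CQ 8. In the envisioned setup the expected answers would come from domain experts and end-users rather than being written into the script.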

6. Acknowledgements

   This research is financially supported by the Dutch Ministry of Economic Affairs and co-financed
via TKI DINALOG and NWO. The CLICKS project has granted funding for this work (grant no.
439.19.633). CLICKS is the acronym for Connecting Logistics Interfaces, Converters, Knowledge,
and Standards. The authors thank the involved consortium partners for their support and the
anonymous reviewers for their constructive feedback.

7. References

[1] M. L. Markus, Q. N. Bui, Going concerns: The governance of interorganizational coordination
     hubs, Journal of Management Information Systems 28 (2012) 163-198.
     doi: 10.2753/MIS0742-1222280407
[2] M. Banek, D. Juric, D. Pintar, Z. Skocir, M. Vranic, B. Vrdoljak, E-business infrastructure for
     supporting the integration of tourist services, in: 2008 50th International Symposium ELMAR,
     IEEE, New York, 2008, pp. 289-292.
[3] X. Wang, C. Zhang, Y. Jin, X. Zhao, CPSP: A Cloud-based Production Service Platform
     Supporting Co-Manufacturing of Cross-Enterprise, in: 2018 IEEE 22nd International Conference
     on Computer Supported Cooperative Work in Design (CSCWD), IEEE, New York, 2018, pp.
     455-460, doi: 10.1109/CSCWD.2018.8465354.
[4] A. Braud, G. Fromentoux, B. Radier, O. Le Grand, The road to European digital sovereignty
     with Gaia-X and IDSA, IEEE Network 35 (2021) 4-5. doi: 10.1109/MNET.2021.9387709.
[5] I. Lopes-Martínez, L. Paradela-Fournier, J. Rodríguez-Acosta, J. L. Castillo-Feu, M. I. Gómez-
     Acosta, A. Cruz-Ruiz, The use of GS1 standards to improve the drugs traceability system in a
     3PL Logistic Service Provider, DYNA 85 (2018) 39-48.
[6] S. Dalmolen, H. Bastiaansen, E. Somers, S. Djafari, M. Kollenstart, M. Punter, Maintaining
     control over sensitive data in the Physical Internet: Towards an open, service oriented, network-
     model for infrastructural data sovereignty, 2019. URL:
     https://repository.tno.nl/islandora/object/uuid%3Ab2e6952a-06ed-46e1-b186-fc25932b28c3
[7] B. Otto, M. Jarke, Designing a multi-sided data platform: findings from the International Data
     Spaces case, Electronic Markets 29 (2019) 561-580. doi: 10.1007/s12525-019-00362-x.
[8] International Data Spaces Association, IDSA Reference Architecture Model Version 3.0, 2019.
     URL: https://internationaldataspaces.org/wp-content/uploads/IDS-Reference-Architecture-Model-3.0-2019.pdf
[9] K. Janowicz, F. Van Harmelen, J. A. Hendler, P. Hitzler, Why the data train needs semantic rails,
     AI Magazine 36 (2015) 5-14.
[10] S. Salma, M. Bouneffa, C. Habiba, Ontology and Semantic Web in Logistic Applications: State
     of the Art, in: 2019 7th Mediterranean Congress of Telecommunications (CMT), IEEE, New
     York, 2019, pp. 1-4. doi: 10.1109/CMT.2019.8931374.
[11] A. Gómez-Pérez, M. C. Suárez-Figueroa, NeOn methodology for building ontology networks: a
     scenario-based methodology, 2009. URL:
     https://oa.upm.es/5475/1/INVE_MEM_2009_64399.pdf
[12] M. C. Suárez-Figueroa, A. Gómez-Pérez, B. Villazón-Terrazas, How to write and use the
     ontology requirements specification document, in: OTM Confederated International
     Conferences "On the Move to Meaningful Internet Systems", Springer, Berlin, 2009, pp. 966-982.
[13] IDSA, The International Data Spaces (IDS) Information Model, 2021. URL:
     https://github.com/International-Data-Spaces-Association/InformationModel.
[14] G. Guizzardi, Ontological foundations for structural conceptual models, 2005. URL:
     https://ris.utwente.nl/ws/portalfiles/portal/6042428/thesis_Guizzardi.pdf
[15] T. Berners-Lee, J. Hendler, O. Lassila, The semantic web, Scientific American 284 (2001) 34-43.
[16] GS1, GS1 Transport & Logistics, 2021. URL:
     https://www.gs1.org/industries/transport-and-logistics.
[17] OpenTripModel, What is the Open Trip Model?, 2021. URL:
     https://www.opentripmodel.org/page/about.
[18] W. Bol Raap, M.-E. Iacob, M. v. Sinderen, S. Piest, An architecture and common data model for
     open data-based cargo-tracking in synchromodal logistics, in: OTM Confederated International
     Conferences "On the Move to Meaningful Internet Systems", Springer, Berlin, 2016, pp. 327-343.
[19] INAD Industrie Software B.V., EDI4STEEL, 2022. URL: https://www.edi4steel.eu/about/.
[20] IDSA, Dataspace Connector, 2021. URL:
     https://github.com/International-Data-Spaces-Association/DataspaceConnector.
[21] C. Stolwijk, F. Berkers, Scalability and agility of the Smart Connected Supplier Network
     approach, 2020. URL:
     https://repository.tudelft.nl/islandora/object/uuid%3A36745cb0-3d5f-4f79-9034-93e02e80529c
[22] SCSN, Smart-Connected Supplier Network (SCSN) Addressbook, 2020. URL:
     https://broker.ids.smart-connected.nl/#home.
[23] TriplyDB, The Network Effect for your Data, 2022. URL: https://triply.cc/.