<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>OBG-gen: Ontology-Based GraphQL Server Generation for Data Integration</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Huanyu Li</string-name>
          <email>huanyu.li@liu.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olaf Hartig</string-name>
          <email>olaf.hartig@liu.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rickard Armiento</string-name>
          <email>rickard.armiento@liu.se</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Patrick Lambrix</string-name>
          <email>patrick.lambrix@liu.se</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer and Information Science, Linköping University</institution>
          ,
          <addr-line>Linköping</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Physics, Chemistry and Biology, Linköping University</institution>
          ,
          <addr-line>Linköping</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Swedish e-Science Research Centre</institution>
          ,
          <addr-line>Linköping</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>A GraphQL server contains two building blocks: (1) a GraphQL schema defining the types of data objects that can be requested; (2) resolver functions fetching the relevant data from underlying data sources. GraphQL can be used for data integration if the GraphQL schema provides an integrated view of data from multiple data sources, and the resolver functions are implemented accordingly. However, there does not exist a semantics-aware approach to use GraphQL for data integration. We proposed a framework using GraphQL for data integration in which a global domain ontology informs the generation of a GraphQL server. Furthermore, we implemented a prototype of this framework, OBG-gen. In this paper, we demonstrate OBG-gen in a real-world data integration scenario in the materials design domain and in a synthetic benchmark scenario.</p>
      </abstract>
      <kwd-group>
        <kwd>GraphQL</kwd>
        <kwd>Ontology</kwd>
        <kwd>Data Integration</kwd>
        <kwd>GraphQL Server Generation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>GraphQL<sup>1</sup> is a conceptual framework for building Web APIs. The framework introduces a
so-called GraphQL schema (Figure 1a) that defines the types of data objects that can be requested,
and resolver functions (Figure 1b) that specify how to retrieve data from underlying
data sources. Another building block of the framework is the GraphQL query language for
expressing data retrieval requests (Figure 1c). The example schema contains an object type
(University) with a field definition UniversityID, whose value type is String, and a field
definition departments, whose value type is [Department]. It also contains two input object
types (UniversityFilter and StringFilter) that capture the notion of filter expressions.
For instance, the query accepts the argument (UniversityID: {_eq: "u1"}) according to the
UniversityFilter definition, which represents "UniversityID is equal to 'u1'". Additionally,
a GraphQL schema presumes the Query type as the root operation type for queries. The example
schema has the UniversityList field definition, whose return type is [University], a
list of universities.</p>
      <preformat>type University {
  UniversityID: String
  departments: [Department] }
input UniversityFilter {
  UniversityID: StringFilter
  _and: [UniversityFilter] }
input StringFilter {
  _eq: String
  _in: [String] }
type Query {
  UniversityList(filter: UniversityFilter): [University] }</preformat>
      <p>Figure 1a: GraphQL schema example.</p>
      <preformat>const UniversityList = (uid) =&gt; {
  /* assume the underlying data source is a relational database
     containing a table named university with an id column */
  let data = db_connection.select().from('university').where('id', uid);
  /* assume University is an object defined according to the type
     in the schema */
  let allUniversities = data.then(rows =&gt; new University(rows[0]));
  return allUniversities;
};</preformat>
      <p>Figure 1b: Resolver function example.</p>
      <preformat>{
  UniversityList(
    filter:{
      UniversityID:{_eq:"u1"}
    })
  {
    departments
    {
      head
    }
  }
}</preformat>
      <p>Figure 1c: Query example.</p>
      <p>GraphQL can be used for data integration by building a GraphQL server
over underlying data sources, where the GraphQL schema provides an integrated view of the data
and the resolver functions implement access to the data sources. However, there
is no semantics-aware approach to employing GraphQL for data integration. Consequently,
developers need to write program code (i.e., resolver functions) to populate the various
elements of a GraphQL schema. In our previous work, we provided a semantics-aware approach
to employing GraphQL for data integration, which is a global-as-view approach [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], with formal
methods to generate the GraphQL server [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In this paper, we demonstrate the implemented
prototype of this approach, OBG-gen.<sup>2</sup>
      </p>
      <p>http://huanyuli.se (H. Li); https://olafhartig.de (O. Hartig); https://rickard.armiento.se (R. Armiento); https://www.ida.liu.se/~patla00/ (P. Lambrix)</p>
      <p>© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (ceur-ws.org).</p>
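      <p>To make the filter semantics of the example schema concrete, the following minimal sketch evaluates a UniversityFilter value against a small in-memory list. It is an illustration under stated assumptions, not OBG-gen code: the UNIVERSITIES data and the function names matches_string_filter, matches_university_filter and university_list are hypothetical, and only the _eq, _in and _and fields from Figure 1a are handled.</p>
      <preformat>
```python
from typing import Any, Dict, List, Optional

# Hypothetical in-memory stand-in for the relational "university" table.
UNIVERSITIES = [
    {"UniversityID": "u1", "departments": ["d1", "d2"]},
    {"UniversityID": "u2", "departments": ["d3"]},
]


def matches_string_filter(value: str, string_filter: Dict[str, Any]) -> bool:
    """Check a value against a StringFilter with optional _eq and _in keys."""
    if "_eq" in string_filter and value != string_filter["_eq"]:
        return False
    if "_in" in string_filter and value not in string_filter["_in"]:
        return False
    return True


def matches_university_filter(row: Dict[str, Any], uni_filter: Dict[str, Any]) -> bool:
    """Check a row against a UniversityFilter (UniversityID and _and)."""
    if "UniversityID" in uni_filter:
        if not matches_string_filter(row["UniversityID"], uni_filter["UniversityID"]):
            return False
    # _and holds a list of nested UniversityFilter values that must all hold.
    return all(matches_university_filter(row, sub) for sub in uni_filter.get("_and", []))


def university_list(filter_arg: Optional[Dict[str, Any]] = None) -> List[Dict[str, Any]]:
    """Sketch of a resolver for Query.UniversityList: keep rows accepted by the filter."""
    if filter_arg is None:
        return list(UNIVERSITIES)
    return [row for row in UNIVERSITIES if matches_university_filter(row, filter_arg)]


# The filter from the query example: UniversityID is equal to "u1".
print(university_list({"UniversityID": {"_eq": "u1"}}))
```
</preformat>
      <p>Calling university_list without an argument returns every row, which corresponds to querying UniversityList without a filter.</p>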
    </sec>
    <sec id="sec-3">
      <title>2. Approach</title>
      <p><sup>2</sup> All the material related to OBG-gen is available online at https://github.com/LiUSemWeb/OBG-gen.</p>
      <p>[Figure: GraphQL Query Answering Process.]</p>
      <p>Our schema generator (as shown in Algorithm 1) first iterates over the concept names. For each concept (e.g., University), the concept name is used as the name of the type to be generated in the GraphQL schema (University); the name concatenated with 'Filter' is used as the name of an input type to be generated (UniversityFilter); and the name concatenated with 'List' is used as the name of a field of the Query type (UniversityList). Additionally, each such field of the Query type is assigned an argument named 'filter' whose type is the corresponding input type (e.g., filter: UniversityFilter for UniversityList). In the next step, the algorithm iterates over the general concept inclusions (GCIs). Taking the GCI University ⊑ ∀departments.Department as an example, the algorithm generates the field definitions departments: [Department] for the University type and departments: [DepartmentFilter] for the UniversityFilter type. For the GCI University ⊑ =1 UniversityID.String, the algorithm generates the field definitions UniversityID: String for the University type and UniversityID: StringFilter for the UniversityFilter type.</p>
      <preformat>Algorithm 1: Schema Generator
Input:  a set of concepts, C; a set of GCIs, G
Output: a GraphQL schema, S
for c ∈ C do
    extend S with an empty object type, c
    extend S with an empty input type, cFilter
    add field/argument declarations to the Query type
for g ∈ G do
    if g is of the form P ⊑ Q then
        extend S with an empty interface type, Q
        extend S with an input type, QFilter
        extend S with field/argument declarations to the Query type
        extend S with declaration that P implements Q
    else
        /* g is of the other forms */
        extend S with field declarations to the corresponding object type and its Filter input type</preformat>
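      <p>The two GCI forms discussed above can be sketched in a few lines of Python. This is a simplified illustration, not the OBG-gen implementation: the tuple encoding of GCIs, the kind labels all_values and exactly_one, and the function names render and generate_schema are assumptions made for the sketch. The branch for GCIs of the form P ⊑ Q (interface types) is omitted, and the _and field is added to every filter type only because UniversityFilter in Figure 1a contains one.</p>
      <preformat>
```python
from typing import Dict, List, Tuple

# A GCI is encoded as (kind, concept, property, value_type); this encoding is
# an assumption of the sketch, not a format used by OBG-gen.
GCI = Tuple[str, str, str, str]

# Predefined filter type for String values, copied from Figure 1a.
STRING_FILTER = "input StringFilter {\n  _eq: String\n  _in: [String]\n}"


def render(keyword: str, name: str, fields: List[str]) -> str:
    """Render one type definition in GraphQL schema definition language."""
    body = "\n".join("  " + field for field in fields)
    return f"{keyword} {name} {{\n{body}\n}}"


def generate_schema(concepts: List[str], gcis: List[GCI]) -> str:
    """Build a schema string from concept names and two forms of GCIs."""
    object_fields: Dict[str, List[str]] = {c: [] for c in concepts}
    # Every filter type gets an _and field, mirroring UniversityFilter in Figure 1a.
    filter_fields: Dict[str, List[str]] = {c: [f"_and: [{c}Filter]"] for c in concepts}
    # One Query field per concept, with a 'filter' argument of the matching input type.
    query_fields = [f"{c}List(filter: {c}Filter): [{c}]" for c in concepts]

    for kind, concept, prop, value_type in gcis:
        if kind == "all_values":      # universal restriction: list-valued field
            object_fields[concept].append(f"{prop}: [{value_type}]")
            filter_fields[concept].append(f"{prop}: [{value_type}Filter]")
        elif kind == "exactly_one":   # exact cardinality 1: single-valued field
            object_fields[concept].append(f"{prop}: {value_type}")
            filter_fields[concept].append(f"{prop}: {value_type}Filter")
        else:
            raise ValueError(f"unsupported GCI form: {kind}")

    parts = [STRING_FILTER]
    for c in concepts:
        parts.append(render("type", c, object_fields[c]))
        parts.append(render("input", c + "Filter", filter_fields[c]))
    parts.append(render("type", "Query", query_fields))
    return "\n".join(parts)


schema = generate_schema(
    ["University", "Department"],
    [
        ("all_values", "University", "departments", "Department"),
        ("exactly_one", "University", "UniversityID", "String"),
    ],
)
print(schema)
```
</preformat>
      <p>In this usage example, Department has no GCIs, so its object type is rendered without fields. A GraphQL server would reject such a type; the sketch assumes that a real ontology supplies at least one GCI per concept.</p>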
      <p>The generic resolver function includes two technical components, QueryParser and Evaluator,
as shown in Figure 3a. The QueryParser parses a query that includes a filter expression given as
an input argument, and outputs the corresponding abstract syntax trees (ASTs) for the input
argument and the query structure, respectively. Figure 3b shows example ASTs for a filter
expression and a query structure according to the query example in Figure 1c. The QueryParser
parses the query, converts the filter expression into a union of conjunctive expressions (arrow
①), and generates an AST for each conjunctive expression and an AST for the query structure
(arrow ②). Then, the filter expressions (frame ⓐ) and the query fields (frame ⓑ) are evaluated.
The Evaluator is responsible for sending requests to underlying data sources and fetching data
according to an AST. During evaluation of the filter expression, for each AST representing a
conjunctive (sub-)expression, an evaluator is called to request data satisfying the conjunctive
(sub-)expression. After a call to an evaluator based on an AST, data representing the requested
type, which contains identifier information, is returned. During evaluation of the query fields,
the identifier information is an input in the call to the evaluator (arrow ③). Taking the query in …</p>
      <p>Figure 3: (a) Outlined generic resolver function. (b) Example Abstract Syntax Trees.</p>
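      <p>The conversion of a filter expression into a union of conjunctive expressions can be illustrated with a short sketch. It is one possible way to perform the rewriting, not the QueryParser of OBG-gen: the function name to_dnf and the tuple encoding of atomic conditions are hypothetical, and the _or operator is an assumption, since the example schema in Figure 1a only shows _and.</p>
      <preformat>
```python
from itertools import product
from typing import Any, Dict, List, Tuple

Atom = Tuple[str, str, Any]   # (field, operator, value)
Conjunction = List[Atom]      # all atoms of a conjunction must hold


def to_dnf(filter_expr: Dict[str, Any]) -> List[Conjunction]:
    """Rewrite a nested filter into a union (list) of conjunctive expressions."""
    # Each entry of `parts` is the union-of-conjunctions form of one conjunct.
    parts: List[List[Conjunction]] = []
    for key, value in filter_expr.items():
        if key == "_and":
            # Every operand of _and is a further conjunct of this filter.
            parts.extend(to_dnf(sub) for sub in value)
        elif key == "_or":
            # A disjunction contributes the union of its operands' conjunctions.
            parts.append([conj for sub in value for conj in to_dnf(sub)])
        else:
            # A field condition such as {"UniversityID": {"_eq": "u1"}}.
            parts.append([[(key, op, operand) for op, operand in value.items()]])
    # Distribute AND over OR: pick one conjunction per conjunct and merge them.
    return [sum(choice, []) for choice in product(*parts)]


example = {
    "_or": [
        {"UniversityID": {"_eq": "u1"}},
        {"UniversityID": {"_eq": "u2"}},
    ],
    "name": {"_in": ["A", "B"]},
}
for conjunction in to_dnf(example):
    print(conjunction)
```
</preformat>
      <p>Each resulting conjunction could then be handed to an evaluator call on its own, with the results combined as a union, which matches the per-conjunction evaluation described above.</p>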
    </sec>
    <sec id="sec-4">
      <title>3. Demonstration</title>
      <p>
        We demonstrate OBG-gen in a real-world data integration scenario in the materials design
domain and in a synthetic benchmark scenario, the Linköping GraphQL Benchmark (LinGBM) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
The demonstration is available on a public page,<sup>3</sup> with pointers to an introduction video, detailed
evaluation results and live GraphQL servers for the two demonstration scenarios.
      </p>
      <p>
        Materials Design Domain Demonstration. This demonstration focuses on a real-world
scenario in the field of materials design: integrating data from two data sources that follow
different data models. We will demonstrate that the GraphQL server, generated based on the
Materials Design Ontology (MDO) [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ], can provide integrated access to data from
heterogeneous data sources (i.e., data can be requested with a single GraphQL query, without materializing
the underlying data). The domain ontology used in this demonstration aims to improve
interoperability in the field for data integration. Therefore, the generated GraphQL schema
serves as an integrated view of materials design data. We write 12 GraphQL queries in total,
7 of which contain filter conditions. Some of the queries are of domain interest and are written
based on competency questions used for developing MDO. The other queries are written for
testing the functionalities of the tool. One example query, as shown in Listing 1, gets all the
calculations pertaining to silicon-based materials with a band gap above 2.0.<sup>4</sup>
      </p>
      <p>LinGBM Demonstration. LinGBM is a performance benchmark for GraphQL server
implementations. It provides a scalable dataset regarding the University domain and specifies
key technical challenges (e.g., relationship traversal) of GraphQL server implementations. In
addition, it contains query templates covering different technical challenges. Therefore, in this
scenario, we focus on demonstrating: (1) the generalizability and applicability of our approach
for data access in a different domain; (2) the current coverage of our approach in terms of key
technical challenges (e.g., attribute retrieval, relationship traversal, searching and filtering).
We use the GraphQL schema provided by LinGBM and manually define semantic mappings
to construct a GraphQL server. We select 7 query templates from LinGBM to create query
instances. One query template is used to construct queries that request all the publications
whose titles contain a specific string (e.g., "formalization").</p>
      <p><sup>3</sup> https://liusemweb.github.io/obg-gen/demo/</p>
      <p><sup>4</sup> This query is of domain interest because semiconductor materials with band gaps above 2 electronvolts are referred
to as wide-bandgap semiconductors. Such semiconductors are widely used in various electronic devices.</p>
    </sec>
    <sec id="sec-5">
      <title>4. Conclusion</title>
      <p>This paper has briefly introduced OBG-gen, a prototype implementation for generating
GraphQL servers. Using OBG-gen, GraphQL application developers can avoid constructing
GraphQL servers from scratch. In the future, we will work on supporting more query features
(e.g., order by) in the generic resolver function. We will also follow the development of the GraphQL language
and explore the possibility of formally generating new features based on ontologies.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>De Giacomo</surname>
          </string-name>
          ,
          <article-title>Data Integration: A Logic-Based Perspective</article-title>
          ,
          <source>AI Magazine</source>
          <volume>26</volume>
          (
          <year>2005</year>
          )
          <fpage>59</fpage>
          -
          <lpage>59</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1609/aimag.v26i1.1799</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Ontology-Driven Data Access and Data Integration with an Application in the Materials Design Domain</article-title>
          ,
          <source>Ph.D. thesis</source>
          ,
          <year>2022</year>
          . doi:
          <pub-id pub-id-type="doi">10.3384/9789179292683</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Hartig</surname>
          </string-name>
          ,
          <article-title>LinGBM: A Performance Benchmark for Approaches to Build GraphQL Servers</article-title>
          ,
          <source>in: Web Information Systems Engineering - WISE 2022 - 23rd International Conference on Web Information Systems Engineering</source>
          ,
          <year>2022</year>
          . doi:
          <pub-id pub-id-type="doi">10.1007/978-3-031-20891-1_16</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Armiento</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lambrix</surname>
          </string-name>
          ,
          <article-title>An Ontology for the Materials Design Domain</article-title>
          ,
          <source>in: The Semantic Web - ISWC 2020 - 19th International Semantic Web Conference</source>
          ,
          <year>2020</year>
          .
          doi:
          <pub-id pub-id-type="doi">10.1007/978-3-030-62466-8_14</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lambrix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Armiento</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Hartig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Abd Nikooie Pour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>The materials design ontology</article-title>
          ,
          <source>Semantic Web</source>
          (
          <year>2023</year>
          ).
          doi:
          <pub-id pub-id-type="doi">10.3233/SW-233340</pub-id>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>