<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>OKBQA Framework towards an open collaboration for development of natural language question-answering systems over knowledge bases</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jin-Dong Kim</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christina Unger</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Axel-Cyrille Ngonga Ngomo</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andre Freitas</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Young-gyun Hahm</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jiseong Kim</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sangha Nam</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gyu-Hyun Choi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jeong-uk Kim</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ricardo Usbeck</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Myoung-Gu Kang</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Key-Sun Choi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DBCLS</institution>
          ,
          <addr-line>178-4-4 Wakashiba, Kashiwa-shi, Chiba</addr-line>
          ,
          <country country="JP">JAPAN</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>KAIST</institution>
          ,
          <addr-line>291 Daehak-ro, Guseong-dong, Yuseong-gu, Daejeon</addr-line>
          ,
          <country country="KR">Korea</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Bielefeld</institution>
          ,
          <addr-line>Universitätsstraße 25, 33615 Bielefeld</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Paderborn</institution>
          ,
          <addr-line>Warburger Str. 100, 33098 Paderborn</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Passau</institution>
          ,
          <addr-line>Innstraße 41, 94032 Passau</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>Young Plus Soft Corp.</institution>
          ,
          <addr-line>71 Karak-ro, Songpa-gu, Seoul</addr-line>
          ,
          <country country="KR">Korea</country>
        </aff>
      </contrib-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Owing to recent advances in Semantic Web (SW) technology, the amount of Linked Data
(LD) available, particularly in the Resource Description Framework (RDF) format, is increasing
rapidly (http://lod-cloud.net). However, LD is still used mostly by SW experts.
There are two main obstacles to making LD accessible to ordinary Web users:
(1) the need to learn the query language, SPARQL, and (2) the need to know
the schemas underlying the various datasets to be queried. Approaches to easing
access to LD include graphical query interfaces [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], agent-based systems [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], and
natural language (NL) interfaces [
        <xref ref-type="bibr" rid="ref2 ref3 ref4 ref5 ref6 ref8">2, 3, 4, 5, 6, 8</xref>
        ]. Among them, NL interfaces are
receiving increasing interest because of their high expressive power and low learning cost.
      </p>
      <p>Typically, a natural language question-answering (NLQA) system takes
natural language queries as input. The queries are then converted into a structured
query language, e.g., SPARQL, and the resulting query is used to consult one or more
knowledge bases (KBs), e.g., SPARQL endpoints. While there are a number of relevant
previous works, it is widely understood that the development of an NLQA
system requires expertise in various technologies, e.g., natural language processing
(NLP), database schema analysis, and inference. There is thus a natural
call for collaboration among interested parties.</p>
      <p>With the goal to provide a platform of open collaboration for development
of NLQA systems, the OKBQA framework has been developed. Recently, it has
reached a milestone: (1) the core module categories for NLQA systems have been identified
and their APIs documented, (2) a repository of OKBQA-compatible modules
has been implemented, with 24 modules registered, and (3) a prototype demo system
has been implemented, with two workflows set up for QA in English and Korean.
This manuscript presents a summary of the OKBQA framework, and the
demo presentation will show how the system supports collaboration in the
development of NLQA systems.</p>
    </sec>
    <sec id="sec-2">
      <title>OKBQA Framework</title>
      <p>An OKBQA repository is implemented and maintained to provide a venue
for sharing information about modules developed for the OKBQA framework
(http://repository.okbqa.org). The registration of modules is open to anybody.
At the time of writing, there are 24 modules registered to the repository.</p>
    </sec>
    <sec id="sec-3">
      <title>Demonstration</title>
      <p>A prototype demo system is developed and maintained as a public service
(http://ws.okbqa.org/wui-2016/), (1) to demonstrate how workflows in OKBQA
actually work, and (2) to support development of modules for the framework.
Currently, two workflows have been set up for QA in English and Korean. Users
can choose a workflow and try it with natural language queries.</p>
      <p>Note that the performance of the currently available workflows may not yet be
competitive, because they are composed by connecting modules developed by
different groups without much tuning for harmonization. To improve the
performance, further development of the modules is required, and the interface of the
OKBQA demo system is designed to support it.</p>
      <p>Firstly, the interface allows users to modify the workflows. Note that a
module for the OKBQA framework is required to be a REST service, and a workflow
is defined as a sequence of URIs (of the REST services). This means that anyone
can develop a module to replace one in a predefined workflow. Suppose one has
developed a new disambiguation module (DM), either by improving an existing one or by
implementing it from scratch. Once it is deployed as a REST service, the new DM can be tested
in a workflow by simply specifying (the URL of) the DM as the DM component
of the workflow.</p>
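      <p>The idea of a workflow as a sequence of service URIs, with one component swapped for a newly deployed module, can be illustrated with a minimal sketch. It is not the OKBQA implementation: the example.org URLs, the function names, and the stand-in transport fake_call are all hypothetical, and the sketch assumes that each module simply receives the output of the previous one. A real transport would issue an HTTP request in whatever payload format the module APIs specify.</p>
      <preformat>
```python
from typing import Callable, Sequence

# Hypothetical transport type: takes (url, payload) and returns the module output.
Transport = Callable[[str, object], object]


def run_workflow(workflow: Sequence[str], question: str, call: Transport) -> object:
    """Send the question to the first module and chain each output to the next."""
    data: object = question
    for url in workflow:
        data = call(url, data)
    return data


def replace_module(workflow: Sequence[str], position: int, new_url: str) -> list[str]:
    """Return a copy of the workflow with the component at `position` swapped."""
    modified = list(workflow)
    modified[position] = new_url
    return modified


def fake_call(url: str, payload: object) -> object:
    """Stand-in for an HTTP request: only records which service was visited."""
    return f"{payload} -> {url.rsplit('/', 1)[-1]}"


# Placeholder URLs (assumptions), not real OKBQA endpoints.
reference = [
    "http://example.org/tgm",
    "http://example.org/dm",
    "http://example.org/qgm",
    "http://example.org/agm",
]

# Test a newly deployed DM by naming its URL as the DM component.
custom = replace_module(reference, 1, "http://example.org/my-dm")
result = run_workflow(custom, "Which river flows in Seoul?", fake_call)
print(result)
```
      </preformat>
      <p>Because the workflow is plain data, swapping a module leaves the reference workflow untouched, so both variants can be run on the same question and compared.</p>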
      <p>Secondly, the interface allows users to inspect the input and output of each
module during execution of a workflow. Figure 2 shows a screenshot of the
interface during the execution of the English workflow with the example query
"Which river flows in Seoul?". The left pane shows the progress of the workflow,
and the right pane shows the input and output of each module. With this design,
the OKBQA framework may be thought of as an SDK for the development of
NLQA systems.</p>
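      <p>Per-module inspection of this kind amounts to recording the input and output of every step while the workflow runs. The following minimal sketch shows one way to do so; the Step record, the run_with_trace function, the example.org URLs, and the stand-in transport are hypothetical names introduced for illustration and do not describe how the demo system is implemented.</p>
      <preformat>
```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Step:
    """One executed module: where it ran, what it received, what it returned."""
    url: str
    module_input: object
    module_output: object


def run_with_trace(
    workflow: Sequence[str],
    question: str,
    call: Callable[[str, object], object],
) -> list[Step]:
    """Run the modules in order and keep the input and output of each one."""
    trace: list[Step] = []
    data: object = question
    for url in workflow:
        output = call(url, data)
        trace.append(Step(url, data, output))
        data = output
    return trace


def fake_call(url: str, payload: object) -> object:
    """Stand-in for an HTTP request to a module (assumption, no network used)."""
    return f"{payload} -> {url.rsplit('/', 1)[-1]}"


# Placeholder URLs (assumptions), not real OKBQA endpoints.
workflow = ["http://example.org/tgm", "http://example.org/dm"]
trace = run_with_trace(workflow, "Which river flows in Seoul?", fake_call)
for step in trace:
    print(step.url, "| in:", step.module_input, "| out:", step.module_output)
```
      </preformat>
      <p>Keeping the whole trace, rather than only the final answer, makes it possible to see at which module a wrong answer first arises.</p>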
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>The OKBQA framework is developed as a platform of open collaboration for the
development of NLQA systems. The modules developed for the framework can
be found at the OKBQA repository, and they are all freely available as
open-source projects. The prototype demo system is maintained as a public service to
support distributed, voluntary development of modules for the framework.</p>
      <p>There is much room for improvement in the framework. For example, the
composition of a workflow is not yet sufficiently flexible, and the performance
of the current reference workflows is not yet competitive. Nevertheless, we believe it
is a significant milestone that such a framework has begun to organize
distributed contributions. We hope this presentation will be an opportunity to
receive feedback from interested parties and to invite potential collaborators.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>The development of OKBQA Framework is supported by the ExoBrain project
(http://exobrain.kr/). JDK is supported by the Life Science Database
Integration Project funded by National Bioscience Database Center (NBDC) of Japan
Science and Technology Agency (JST). ACNN and RU are supported by the
H2020 project HOBBIT (GA no. 688227) and the EuroStars projects DIESEL
(01QE1512C) and QAMEL (01QE1549C). The authors thank all the
participants in the OKBQA hackathons so far for their contributions.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kopp</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Accessing the Web of Data through embodied virtual characters</article-title>
          .
          <source>Semantic Web</source>
          <volume>1</volume>
          (
          <issue>1</issue>
          , 2),
          <fpage>83</fpage>
          –
          <lpage>88</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Freitas</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oliveira</surname>
            ,
            <given-names>J.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Curry</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carlos</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Silva</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Treo: Combining entity-search, spreading activation and semantic relatedness for querying linked data</article-title>
          .
          <source>In: 1st Workshop on Question Answering over Linked Data (QALD-1), Workshop at 8th Extended Semantic Web Conference (ESWC)</source>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Hamon</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grabar</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mougin</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Querying biomedical linked data with natural language questions</article-title>
          .
          <source>Semantic Web</source>
          <volume>8</volume>
          (
          <issue>4</issue>
          ),
          <fpage>581</fpage>
          –
          <lpage>599</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cohen</surname>
            ,
            <given-names>K.B.</given-names>
          </string-name>
          :
          <article-title>Natural language query processing for SPARQL generation: A prototype system for SNOMED-CT</article-title>
          .
          <source>In: Proceedings of BioLink SIG meeting</source>
          <year>2013</year>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Lopez</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fernandez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motta</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stieler</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>PowerAqua: Supporting Users in Querying and Exploring the Semantic Web</article-title>
          .
          <source>Semantic Web</source>
          <volume>3</volume>
          (
          <issue>3</issue>
          ),
          <fpage>249</fpage>
          –
          <lpage>265</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Rozinajova</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macko</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Using Natural Language to Search Linked Data</article-title>
          .
          <source>In: Semantic Keyword-based Search on Structured Data Sources</source>
          . pp.
          <fpage>179</fpage>
          –
          <lpage>189</lpage>
          . Springer (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Russell</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smart</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>NITELIGHT: A Graphical Editor for SPARQL Queries</article-title>
          .
          <source>In: 7th International Semantic Web Conference (ISWC 2008)</source>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Unger</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          , Buhmann, L.,
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ngonga Ngomo</surname>
            ,
            <given-names>A.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gerber</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Template-based Question Answering over RDF Data</article-title>
          .
          <source>In: Proceedings of the 21st International Conference on World Wide Web</source>
          . pp.
          <fpage>639</fpage>
          –
          <lpage>648</lpage>
          . WWW '12,
          <publisher-name>ACM</publisher-name>
          , New York, NY, USA (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>