<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An RDF Platform for Generating Web API for Open Government Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pattama Krataithong</string-name>
          <email>pattama.kra@nectec.or.th</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marut Buranarach</string-name>
          <email>marut.bur@nectec.or.th</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nuttanont Hongwarittorrn</string-name>
          <email>nth@cs.tu.ac.th</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thepchai Supnithi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Faculty of Science and Technology, Thammasat University</institution>
          ,
          <addr-line>Pathumthani</addr-line>
          ,
          <country country="TH">Thailand</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Language and Semantic Technology Laboratory, National Electronics and Computer Technology Center (NECTEC)</institution>
          ,
          <addr-line>Pathumthani</addr-line>
          ,
          <country country="TH">Thailand</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Most datasets in open data portals are in tabular spreadsheet formats, e.g. CSV and XLS. To increase the value and reusability of these datasets, they should be made available in RDF format, which supports better data querying and data integration. However, publishing and querying RDF requires specialized knowledge and skills. In this poster, we present a platform for publishing and querying datasets in RDF that does not require the user to know RDF or SPARQL. The framework supports semi-automatic construction of RDF data and RESTful APIs from datasets in tabular format. It provides automatic schema detection, i.e. data type detection, and generation of the ontology and RDF data mapping. A RESTful API is provided on top of the SPARQL data querying service for each published RDF dataset. A platform prototype was developed and demonstrated using some datasets from the Data.go.th website. Current research directions include automatic dataset API generation based on a Web crawler and validator, and development of an intelligent search engine over the dataset APIs.</p>
      </abstract>
      <kwd-group>
        <kwd>dataset management</kwd>
        <kwd>open data platform</kwd>
        <kwd>RDF data publishing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The number of datasets on the Thailand open government data portal, i.e. Data.go.th,
is continually increasing. The majority of datasets on this portal are in tabular formats
such as Excel and CSV. Based on the 5-star open data model (http://5stardata.info/en/), the Resource Description
Framework (RDF) is a standard data format that can support linked open data. There
are two important standards for integrating data. First, RDF is a standard format for
integrating data based on URI and XML syntax. Second, the Web Ontology Language
(OWL) is important for describing linked data in terms of classes and properties.</p>
      <p>Consuming RDF data is usually achieved by querying a SPARQL endpoint. A
developer who wants to use the SPARQL endpoint must know both
SPARQL and RDF. Our work proposes a Web API as an easier way to retrieve
RDF-based open data. Providing a Web API over the
RDF datasets has several advantages, including:</p>
      <list list-type="bullet">
        <list-item>
          <p>Data as a service: developers who do not have a background in RDF and
SPARQL can query a dataset via a RESTful API service.</p>
        </list-item>
        <list-item>
          <p>Standard data format: developers do not need to learn a new data format, because
query results are returned in the standard JSON format.</p>
        </list-item>
      </list>
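      <p>As an illustration of how a RESTful layer can hide SPARQL from the developer, the following is a minimal sketch and not the platform's actual implementation. It assumes a hypothetical base URI (http://example.org/dataset/), a hypothetical one-graph-per-dataset layout, and the hypothetical helper names build_sparql and flatten_bindings. The first helper turns REST-style equality filters into a SPARQL query; the second converts a response in the standard SPARQL JSON results format into plain JSON records.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re

# Hypothetical base URI for published datasets; not taken from the platform.
BASE = "http://example.org/dataset/"
# Only simple names are allowed, so user input cannot alter the query structure.
NAME = re.compile(r"[A-Za-z0-9_]+")


def escape_literal(value):
    """Escape characters that would end or break a SPARQL string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def build_sparql(dataset_id, filters, limit=10):
    """Turn REST-style equality filters into a SPARQL SELECT query string."""
    for name in [dataset_id, *filters]:
        if not NAME.fullmatch(name):
            raise ValueError(f"invalid name: {name!r}")
    graph = BASE + dataset_id
    lines = [
        "SELECT ?row ?property ?value",
        f"FROM <{graph}>",
        "WHERE {",
        "  ?row ?property ?value .",
    ]
    for field, value in sorted(filters.items()):
        lines.append(f'  ?row <{graph}/{field}> "{escape_literal(value)}" .')
    # In this simplified query, LIMIT caps the number of triples, not rows.
    lines += ["}", f"LIMIT {int(limit)}"]
    return "\n".join(lines)


def flatten_bindings(sparql_json):
    """Group SPARQL JSON result bindings into one plain dict per row."""
    records = {}
    for binding in sparql_json["results"]["bindings"]:
        row = binding["row"]["value"]
        # Use the last path segment of the property URI as the JSON key.
        key = binding["property"]["value"].rsplit("/", 1)[-1]
        records.setdefault(row, {})[key] = binding["value"]["value"]
    return list(records.values())
```
]]></preformat>
      <p>Under these assumptions, a request such as build_sparql("schools", {"province": "Pathumthani"}, limit=5) would produce the query sent to the SPARQL endpoint, and flatten_bindings would reshape the endpoint's response before it is returned to the API client as JSON.</p>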
      <p>In this poster, we present a platform that provides data management support for
RDF data publishing and consumption. The platform was developed using the Ontology
Application Management (OAM) framework [1]. The platform prototype is
available at the Demo-api.data.go.th website, which exemplifies deployment of the
platform using some datasets from the Data.go.th website.</p>
    </sec>
    <sec id="sec-2">
      <title>RDF Dataset Management Process</title>
    </sec>
    <sec id="sec-3">
      <title>Usage Scenarios</title>
    </sec>
    <sec id="sec-4">
      <title>Discussion and Research Directions</title>
      <p>This poster describes a semi-automatic framework for generating RDF datasets
from open tabular data. The framework allows users to publish their datasets in
RDF format and query the data via a Web API without requiring knowledge of RDF
or SPARQL. One difficulty is that some datasets are not in a valid tabular
format [4]. In addition, human intervention is still required, which limits the scalability
of the framework. One of our research directions is to develop a Web crawler and
validator that automatically retrieve the valid datasets from the
Data.go.th website and create APIs for them. We are also developing an intelligent search system that allows
the user to search the data in the datasets via the APIs using a semi-natural-language
UI. The automatic approach for generating APIs for the datasets is shown in Fig. 6.</p>
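      <p>To make the notions of a valid tabular format and data type detection more concrete, the following is a minimal sketch of the kind of checks a validator might run before an API is generated. The specific rules (non-empty and unique column names, a consistent number of cells per row) and the helper names validate_table and detect_type are assumptions made for illustration; they are not the validation rules of the platform.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import csv
import io


def validate_table(csv_text):
    """Return a list of problems; an empty list means the table looks valid."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header = [name.strip() for name in rows[0]]
    problems = []
    if any(name == "" for name in header):
        problems.append("header has an empty column name")
    if len(set(header)) != len(header):
        problems.append("header has duplicate column names")
    # Row numbers are 1-based and count the header as row 1.
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"row {number} has {len(row)} cells, expected {len(header)}"
            )
    return problems


def detect_type(values):
    """Guess a column type from its cell values: integer, decimal, or string."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "string"
    # Python's int() and float() are more lenient than a real validator
    # should be (for example, they accept "1_000" and "nan").
    for name, cast in (("integer", int), ("decimal", float)):
        try:
            for v in non_empty:
                cast(v)
            return name
        except ValueError:
            continue
    return "string"
```
]]></preformat>
      <p>In a crawler pipeline of the kind described above, a dataset would be passed on to API generation only when validate_table reports no problems, and detect_type could then suggest a data type for each column.</p>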
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <source>Knowl. Eng.</source>
          <volume>26</volume>
          ,
          <issue>01</issue>
          ,
          <fpage>115</fpage>
          -
          <lpage>145</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Krataithong</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buranarach</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Supnithi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Hongwarittorrn</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Semi-Automatic Framework for Generating RDF Dataset from Open Data</article-title>
          .
          <source>In: Proc. of the 11th International Symposium on Natural Language Processing (SNLP2016)</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Krataithong</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buranarach</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Supnithi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>RDF Dataset Management Framework for Data.go.th</article-title>
          .
          <source>In: Proc. of the 10th International Conference on Knowledge, Information and Creativity Support Systems (KICSS2015)</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Ermilov</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Auer</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Stadler</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>User-driven semantic mapping of tabular data</article-title>
          .
          <source>In: Proc. of the 9th International Conference on Semantic Systems (I-SEMANTICS '13)</source>
          ,
          <fpage>105</fpage>
          (
          <year>2013</year>
          ). doi: 10.1145/2506182.2506196
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>