<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Business Analytics Using The Knowledge Graph Built Over The Russian Legal Entities Registry</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eugene Hlyzov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergey Isaev</string-name>
          <email>isaevg@datafabric.cc</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yury Emelyanov</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dmitry Pavlov</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olga Belyaeva</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dmitry Mouromtsev</string-name>
          <email>mouromtsev@mail.ifmo.ru</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Olga Parkhimovich</string-name>
          <email>olya.parkhimovich@gmail.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maxim Kolchin</string-name>
          <email>kolchinmax@niuitmo.ru</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DataFabric Ltd.</institution>
          ,
          <addr-line>St.Petersburg</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>ITMO University</institution>
          ,
          <addr-line>St.Petersburg</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Vismart Ltd.</institution>
          ,
          <addr-line>St.Petersburg</addr-line>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Basic facts. URL of the live web application: http://tree.datafabric.cc<sup>1</sup> (launched on 17.07.2017). Product type: subscription-based commercial web application. Application domains: business intelligence, data analytics. Semantic technologies (ST) employed: RDF, OWL, knowledge graphs, SPARQL. Volume: 2.8 billion triples, 10 million companies, 12 million individual entrepreneurs, 27 million persons, 333 Gb of raw unstructured data.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Over the last 3-4 years, Russian authorities have made large volumes
of data collected and maintained by government services open and accessible
to the public. The knowledge concealed in these data is of great business
value and opens prominent new opportunities for services that offer convenient
and efficient ways of finding, exploring and discovering the hidden relations and
structures that the data contain. In our case, the published open data required
a tremendous effort in merging various pieces, checking and distilling them before
they could be used in an application. We demonstrate how linked data and
visual analytics can satisfy the information needs of official registry data
exploration and save the user's time in the process of business investigation.
Finally, from the market perspective, despite the existence of well-established tools
for business intelligence over open government data, we observe an
unsatisfied demand for affordable, easy-to-use and lightweight services targeted at
individuals and small and medium businesses.</p>
      <p>Our web service allows users to a) search the registry for an entity of
interest, b) create a graph and visually explore its connections to other companies,
persons, etc., and c) produce a user-generated report about the entity's background. The
service requires a subscription fee to be paid up front. To access the full
functionality of the service for reviewing purposes, please use the following credentials:
login: iswc2017review@example.com, password: demo.</p>
      <p><sup>1</sup> English version (for demo purposes only): http://tree-i8n.apps.datafabric.cc</p>
      <sec>
        <title>ST in the Product</title>
        <sec>
          <title>The added value of ST</title>
          <p>Expected business value from utilizing ST: a) simpler and cheaper integration of additional sources of data; b) cheaper custom development; c) richer user experience through discovery-provoking visual exploration of company context.</p>
          <p>The role of ST in the architecture of the application: The application comprises four major parts: a data extract-transform-load (ETL) pipeline hosted on Google Cloud, an RDF triple store deployed on the open source version of Blazegraph<sup>2</sup>, a server-side logic and request handling service running on Kotlin and Node.js, and frontend user interface logic and rendering implemented using ReactJS and the open source Ontodia library<sup>3</sup>. We employ ST at all stages of data transformation, from storing the data to visualizing it: the data is stored in an RDF graph with an OWL ontology applied to it in the ETL process, it is queried with SPARQL, and it is visualized as a graph by converting RDF into nodes and edges.</p>
        </sec>
        <sec>
          <title>Advantages of using ST in our application</title>
          <list list-type="bullet">
            <list-item>
              <p>We apply ST standards to explicate the knowledge concealed in the raw data by transforming the semi-structured XML files into a cumulative RDF graph with a well-defined data schema expressed in the form of an OWL ontology.</p>
            </list-item>
            <list-item>
              <p>On the basis of ST we align the internal graph-based data structures with their graph-based diagrammatic visualization in the Ontodia library [<xref ref-type="bibr" rid="ref1">1</xref>], thus reducing the number of data transformation steps and simplifying our code base.</p>
            </list-item>
            <list-item>
              <p>With the help of ST we extend the graph with additional data sources, which carry their own schemas, with minimal effort.</p>
            </list-item>
            <list-item>
              <p>By employing the underlying ontology we gain an effective and automated data publishing mechanism that assures convenient data consumption, which contributes greatly to the visibility of our product.<sup>4</sup></p>
            </list-item>
          </list>
        </sec>
        <sec>
          <title>Challenges that arise from locking in to ST</title>
          <list list-type="bullet">
            <list-item>
              <p>To introduce regular software developers to the tools of the Semantic Web by selecting the most mature technologies and utilities.</p>
            </list-item>
            <list-item>
              <p>To resolve the mismatch between the ontologies that come from knowledge engineers and the requirements of our use case regarding querying the data.</p>
            </list-item>
            <list-item>
              <p>To adjust the queries and system architecture to achieve the stability and performance expected of a commercial product on complex data calls.</p>
            </list-item>
          </list>
          <p><sup>2</sup> https://github.com/blazegraph/database</p>
          <p><sup>3</sup> https://github.com/ontodia-org/ontodia</p>
          <p><sup>4</sup> https://github.com/DataFabricRus/ontology-fts</p>
        </sec>
      </sec>
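      <sec>
        <title>Illustrative sketch: converting RDF triples into nodes and edges</title>
        <p>The conversion of RDF into nodes and edges mentioned in the architecture description can be illustrated with a minimal Python sketch. It is not the implementation used in the application or in Ontodia. The <monospace>http://example.org/registry/</monospace> namespace, the <monospace>founderOf</monospace> property, the sample triples and the <monospace>triples_to_diagram</monospace> helper are hypothetical names introduced here for illustration only. The sketch assumes that triples arrive as plain tuples with a flag telling whether the object is an IRI, that type and label triples become node attributes, and that the remaining IRI-valued triples become edges.</p>
        <preformat>
```python
from dataclasses import dataclass, field

# Hypothetical namespace for illustration; not taken from the authors' ontology.
EX = "http://example.org/registry/"
# Standard RDF and RDFS property IRIs.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


@dataclass
class Node:
    iri: str
    types: list = field(default_factory=list)
    label: str = ""


@dataclass(frozen=True)
class Edge:
    source: str
    predicate: str
    target: str


def triples_to_diagram(triples):
    """Split (subject, predicate, object, object_is_iri) tuples into nodes and edges.

    rdf:type and rdfs:label triples become node attributes, other triples
    whose object is an IRI become edges, and other literals are ignored.
    """
    nodes = {}
    edges = []

    def node(iri):
        # Create the node on first sight, reuse it afterwards.
        return nodes.setdefault(iri, Node(iri))

    for s, p, o, o_is_iri in triples:
        subject = node(s)
        if p == RDF_TYPE:
            subject.types.append(o)
        elif p == RDFS_LABEL:
            subject.label = o
        elif o_is_iri:
            node(o)
            edges.append(Edge(s, p, o))
    return nodes, edges


# Made-up sample data: one company, one person, one relation between them.
SAMPLE = [
    (EX + "company/1", RDF_TYPE, EX + "Company", True),
    (EX + "company/1", RDFS_LABEL, "Example LLC", False),
    (EX + "person/7", RDF_TYPE, EX + "Person", True),
    (EX + "person/7", EX + "founderOf", EX + "company/1", True),
]

nodes, edges = triples_to_diagram(SAMPLE)
```
        </preformat>
        <p>Because the stored data and the diagram share the same graph shape, such a conversion needs no intermediate relational or object model, which is one way to read the claim that ST reduces the number of data transformation steps.</p>
      </sec>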
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Mouromtsev</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pavlov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Emelyanov</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morozov</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Razdyakonov</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Galkin</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>The simple web-based tool for visualization and sharing of semantic data and ontologies</article-title>
          .
          <source>In: Int-l Semantic Web Conf. (Posters &amp; Demos)</source>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>