<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>My Health Dictionary: Study on Web Service using Program Information Data-hub as Linked Open Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Masaru Miyazaki</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Makoto Urakawa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ichiro Yamada</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kikuka Miura</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Taro Miyazaki</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hiroshi Fujisawa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Toshio Nakagawa</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Japan Broadcasting Corporation (NHK)</institution>
          ,
          <addr-line>Tokyo</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>With the evolution of the global Internet, it has become increasingly common for companies to automatically exchange various types of data among themselves. In addition, content providers, such as broadcasting stations, are being required to change their content-serving strategy so that the content can be delivered to viewers via various external services. To address this strategy change, this paper proposes program-related information as machine-readable web data that can be used in internal and external services. We report on the construction of a program information data-hub using the Linked Open Data (LOD) standard format recommended by the World Wide Web Consortium. Results obtained by prototyping the data-hub and associated web services show that services employing a variety of program information can be realized by representing knowledge about the content as LOD data.</p>
      </abstract>
      <kwd-group>
        <kwd>linked open data</kwd>
        <kwd>program information</kwd>
        <kwd>health</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The circulation of large amounts of content on the Internet, the rapid spread of
mobile devices such as smartphones, and the appearance of new viewing styles
such as time-shift playback have significantly changed the behavior
of TV viewers. In line with this change, content providers such as
broadcasting stations are being required to change their content-serving strategy
so that content is delivered to viewers via various external services,
rather than simply waiting for viewers to access it via broadcast and video-on-demand
services. We believe that an effective way to achieve this objective
is to use semantic web and Linked Open Data (LOD) technology to build a
"Web of Data." Consequently, we are studying how to build a hub
comprising various types of data associated with TV programs by converting
program-related information and external data into LOD. In this demonstration, we
outline the program information data-hub and introduce a prototype
health information service that uses the data-hub.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Works</title>
      <p>
        The BBC recognized at an early stage that LOD technologies could improve the
utilization and presence of its content, and it has been working
to build a content data space using LOD. It has published
LOD for content such as program episodes and music artist information, which
can be referenced from a variety of external sources. Moreover, on the
Wildlife<sup>1</sup> web site, animal species and behavior information is organized as
RDF, so a user can immediately reach relevant information and programs.
Further, its efforts to connect various pieces of program information as LOD
by crowdsourcing tags are highly regarded in the academic field [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>In this study, we aim to realize a more advanced data space that can be
used by both the internal and external services of broadcasting stations. To this end, we
have constructed a program information data-hub that accumulates not only
existing program information but also knowledge about the program contents,
and we are currently exploring its possibilities.</p>
    </sec>
    <sec id="sec-2a">
      <title>Development of program information data-hub</title>
      <p>
NHK, Japan's public broadcaster, has accumulated a variety of
program-related information in in-house databases for the purpose of providing
broadcast information on its web site. However, much of the information was
stored in relational database (RDB) tables and, as a result, was underutilized for
collaboration with other new services. For example, a vegetable that is introduced as a
disease prevention measure in a health program could also be introduced as an
ingredient for a recipe in a cooking program. In this way, programs often provide
information that shares common concepts with other programs. By connecting
such programs to each other via the common concepts, it becomes possible to create new
services that connect programs transversely. With the aim of realizing
such services, we gathered the program-related information (such as the locations of
images, videos, and web sites) residing in the in-house databases, transformed it
to RDF, and automatically constructed an RDF store called the program
information data-hub. We used the Programmes Ontology of the BBC as a reference in
describing the schema of our data-hub, and extended it so that it can describe
NHK's own program information. Next, to enable cooperation
with various external services, we automatically extracted performer names
and important words from the program information, and added links
to the vocabulary of DBpedia Japanese<sup>2</sup>. Further, as external
knowledge, we automatically added links to a "knowledge map," which is a
program-related knowledge dataset that is currently being built.</p>
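      <p>The transformation from tabular records to RDF described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the conversion actually used for the data-hub: the namespace http://example.org/nhk/, the property name "mentions", and the record fields are hypothetical, while the Programmes Ontology, Dublin Core, and DBpedia Japanese namespaces are the publicly documented ones.</p>
      <preformat><![CDATA[
```python
from urllib.parse import quote

# Hypothetical namespace for illustration; the paper does not state the real one.
NHK = "http://example.org/nhk/"
PO = "http://purl.org/ontology/po/"        # BBC Programmes Ontology
DC = "http://purl.org/dc/elements/1.1/"    # Dublin Core elements
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DBPEDIA_JA = "http://ja.dbpedia.org/resource/"


def uri(value):
    """Wrap a URI string in N-Triples angle brackets."""
    return "<" + value + ">"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def record_to_ntriples(record):
    """Turn one tabular program record (a dict) into N-Triples lines."""
    subject = uri(NHK + "program/" + quote(record["id"]))
    lines = [
        f"{subject} {uri(RDF_TYPE)} {uri(PO + 'Episode')} .",
        f"{subject} {uri(DC + 'title')} {literal(record['title'])} .",
    ]
    # Link each extracted keyword to a DBpedia Japanese resource.
    # The linking property "mentions" is an assumption made for this sketch.
    for word in record.get("keywords", []):
        target = uri(DBPEDIA_JA + quote(word))
        lines.append(f"{subject} {uri(NHK + 'mentions')} {target} .")
    return lines


# Illustrative record; the field names are hypothetical.
record = {"id": "nhkdwc0112", "title": "Dining with the chef", "keywords": ["酢"]}
triples = record_to_ntriples(record)
for line in triples:
    print(line)
```
]]></preformat>
      <p>In this sketch each record yields one type triple, one title triple, and one link triple per extracted keyword. A production conversion would more likely rely on an RDF library and the full extended schema.</p>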
      <p>
        The knowledge map comprises two types of data. The first is the "concept map,"
which consists of data obtained by analyzing a large text corpus on the web.
It represents semantic relations between words, such as causal relations
and hyponymy relations [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ][
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The second is the "content map," which is generated
by analyzing the summary text of a program. The generation method consists of two
processes: topic extraction and relation estimation based on TF-IDF statistics
and the semantic relations in the concept map. The content map represents the relations between words and
the associated programs.
      </p>
      <p>1 http://www.bbc.co.uk/nature/wildlife</p>
      <p>2 http://ja.dbpedia.org</p>
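      <p>As a rough illustration of the topic extraction step, the following sketch ranks the words of each program summary by a plain TF-IDF score. It is not the authors' method: the whitespace tokenization, the scoring formula, the function name, and the sample summaries are all assumptions made for this example (Japanese text would require morphological analysis), and the relation estimation step is not covered.</p>
      <preformat><![CDATA[
```python
import math
from collections import Counter


def tfidf_topics(summaries, top_k=2):
    """Return the top_k highest-scoring TF-IDF words for each summary."""
    # Naive whitespace tokenization; an assumption made for this sketch.
    tokenized = {pid: text.lower().split() for pid, text in summaries.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many summaries each word appears.
    doc_freq = Counter()
    for words in tokenized.values():
        doc_freq.update(set(words))

    topics = {}
    for pid, words in tokenized.items():
        counts = Counter(words)
        scores = {
            w: (c / len(words)) * math.log(n_docs / doc_freq[w])
            for w, c in counts.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        topics[pid] = ranked[:top_k]
    return topics


# Illustrative summaries; not taken from the data-hub.
summaries = {
    "cooking": "vinegar salad recipe with vinegar dressing",
    "health": "hypertension prevention with diet",
    "travel": "travel with family",
}
print(tfidf_topics(summaries))
```
]]></preformat>
      <p>Words that occur in every summary, such as "with" in this toy data, receive a score of zero and therefore rank below words that are specific to one program.</p>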
      <p>A structural example of a knowledge map associated with the program
information is shown in Fig. 1. The concept map includes 28 relation types and approximately 1,012
million word relations in total.</p>
      <p>Finally, we accumulated all schemas, instances, and knowledge map data in
an RDF store, and constructed an environment that is accessible from a variety
of services through SPARQL endpoints and a Web API. Currently, the program
information data-hub contains 1.89 million RDF triples created from
the experimental accumulation of 6,700 pieces of program data over a period of
two months.</p>
      <p>[Fig. 1: Structural example of a knowledge map associated with program information. The ConceptWord "Vinegar" is linked to the ConceptWord "Hypertension" by the ConceptMap relation "Prevention," and to the TV program "Dining with the chef" No. 112 "Vinegar-marinated Aji Salad" (URI: nhkdwc0112) by the ContentMap relation "topic."]</p>
    </sec>
    <sec id="sec-2b">
      <title>My Health Dictionary service using program information data-hub</title>
      <p>
We have developed "My Health Dictionary" (Fig. 2), a service that utilizes
the program information data-hub. My Health Dictionary is implemented as an
extension for Google Chrome. When the user selects a keyword of interest
while browsing the Web, the programs associated with the word are
displayed in a popup. Fig. 2 shows an example in which the keyword
"hypertension" is selected in the text of a web page. When the user right-clicks
the selected keyword, a knob showing the keyword string is displayed, and
related words linked to the keyword are displayed around the knob. If the user
then selects a related word such as "prevention," a list of programs related to
the "prevention of hypertension" is displayed. By clicking a program, the
user can watch the video of the program or browse the program web site.
Using the information from the concept map, the system can also list the
cooking program that introduced a recipe using "vinegar," which is said to help
in the prevention of hypertension. Because the data-hub stores program
information about various genres, cross-sectoral services that connect the various
programs can be realized by utilizing program-related knowledge such as that from
the concept map or the content map.</p>
      <p>[Fig. 2: The My Health Dictionary interface. Labeled elements: the selected word; the knob of "hypertension"; the relation word "prevention"; the list of programs; and a program about the "mineral" that helps in the "prevention" of "hypertension."]</p>
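      <p>The lookup behind this interaction can be sketched as a two-step traversal over the knowledge map: concept map relations first, then content map relations. The sketch below uses a small in-memory list of triples in place of the RDF store. The triple layout, the direction of the relations, the function names, and the sample data are assumptions made for illustration only; the actual service queries the data-hub.</p>
      <preformat><![CDATA[
```python
# Toy knowledge map as (subject, relation, object) triples.
# All data and the direction of each relation are assumptions for this sketch.
CONCEPT_MAP = [
    ("vinegar", "prevention", "hypertension"),
    ("mineral", "prevention", "hypertension"),
]
CONTENT_MAP = [
    ("vinegar", "topic", "nhkdwc0112"),
]
PROGRAM_TITLES = {"nhkdwc0112": "Dining with the chef"}


def related_words(keyword):
    """Relation words to display around the knob of the selected keyword."""
    return sorted({rel for _, rel, obj in CONCEPT_MAP if obj == keyword})


def programs_for(keyword, relation):
    """Programs whose topic is a concept linked to the keyword by the relation."""
    concepts = [
        subj for subj, rel, obj in CONCEPT_MAP
        if rel == relation and obj == keyword
    ]
    found = []
    for concept in concepts:
        for subj, rel, program in CONTENT_MAP:
            if subj == concept and rel == "topic" and program not in found:
                found.append(program)
    # Pair each program identifier with its title when one is known.
    return [(p, PROGRAM_TITLES.get(p, p)) for p in found]


print(related_words("hypertension"))
print(programs_for("hypertension", "prevention"))
```
]]></preformat>
      <p>Selecting "hypertension" and then "prevention" in this toy data reaches the cooking program through the shared concept "vinegar," which mirrors the cross-genre connection described above.</p>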
    </sec>
    <sec id="sec-3">
      <title>Conclusion and future work</title>
      <p>In this paper, we reported on a prototype LOD-based program information
data-hub constructed by linking external knowledge with existing program
information. Further, we demonstrated an example service that uses the data-hub. In future
work, we plan to study data-hub construction and utilization in
areas such as news, culture, and education, and to establish an improved and more
sophisticated data-hub for actual service operation.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1. <string-name>Yves Raimond</string-name>, <string-name>Tristan Ferne</string-name>, <string-name>Michael Smethurst</string-name>, <string-name>Gareth Adams</string-name>:
          <article-title>The BBC World Service Archive prototype</article-title>,
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>,
          Vol. <volume>27-28</volume>, pp. <fpage>2</fpage>-<lpage>9</lpage> (<year>2014</year>)
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. <string-name>Stijn De Saeger</string-name>, <string-name>Kentaro Torisawa</string-name>, <string-name>Jun'ichi Kazama</string-name>, <string-name>Kow Kuroda</string-name>, <string-name>Masaki Murata</string-name>:
          <article-title>Large Scale Relation Acquisition using Class Dependent Patterns</article-title>,
          <source>Proceedings of the IEEE International Conference on Data Mining (ICDM'09)</source>,
          pp. <fpage>764</fpage>-<lpage>769</lpage> (<year>2009</year>)
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. <string-name>Stijn De Saeger</string-name>, <string-name>Kentaro Torisawa</string-name>, <string-name>Masaaki Tsuchida</string-name>, <string-name>Jun'ichi Kazama</string-name>, <string-name>Chikara Hashimoto</string-name>, <string-name>Ichiro Yamada</string-name>, <string-name>Jong-Hoon Oh</string-name>, <string-name>Istvan Varga</string-name>, <string-name>Yulan Yan</string-name>:
          <article-title>Relation Acquisition using Word Classes and Partial Patterns</article-title>,
          <source>Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011)</source>,
          pp. <fpage>825</fpage>-<lpage>835</lpage> (<year>2011</year>)
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>