<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Ignet: A centrality and INO-based web system for analyzing and visualizing literature-mined networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Arzucan Özgür</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Junguk Hur</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zuoshuang Xiang</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Edison Ong</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dragomir R. Radev</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yongqun He</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Bogazici University</institution>
          ,
          <addr-line>34342 Istanbul</addr-line>
          ,
          <country country="TR">Turkey</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Michigan</institution>
          ,
          <addr-line>Ann Arbor</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of North Dakota</institution>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Ignet (Integrative Gene Network) is a web-based system for dynamically updating and analyzing gene interaction networks mined from all PubMed abstracts. Four centrality metrics, namely degree, eigenvector, betweenness, and closeness, are used to determine the importance of genes in the networks. Interaction types between genes are classified using the Interaction Network Ontology (INO), which organizes interaction types in an ontological hierarchy and lists individual keywords for each interaction type. An interactive user interface supports exploration of the interaction networks as well as the centrality- and ontology-based network analysis. Availability: http://ignet.hegroup.org.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>Many web systems exist for literature mining of gene
interactions, e.g., Chilibot (http://www.chilibot.net/) and iHOP
(http://www.ihop-net.org/UniPub/iHOP/). Some of these
tools mark the interaction keywords in the sentences. A
common limitation is that these interaction keywords are not
classified, so detailed interaction types cannot be studied.</p>
      <p>
        Ontology-based literature mining is an emerging research
field that applies ontology to support literature mining. The
Interaction Network Ontology (INO) is a newly developed
interaction ontology that supports biomedical literature
mining
        <xref ref-type="bibr" rid="ref3">(Hur et al., 2015)</xref>
        . INO was initially developed to
represent over 800 interaction keywords
        <xref ref-type="bibr" rid="ref6">(Ozgur et al., 2011)</xref>
        and
their hierarchical structure in an ontological format;
more interaction terms were later added to INO with
well-defined axioms
        <xref ref-type="bibr" rid="ref3">(Hur et al., 2015)</xref>
        .
      </p>
      <p>
        In our previous studies, we also ranked the genes in the
literature-mined gene networks using different types of
centralities: degree centrality, eigenvector centrality, closeness
centrality, and betweenness centrality
        <xref ref-type="bibr" rid="ref6">(Ozgur et al., 2011)</xref>
        .
These centralities capture different notions of importance.
For example, in betweenness centrality a node is considered
important if it occurs on many shortest paths between other
nodes, whereas in degree centrality a node is considered
important if it is connected to many other nodes.
      </p>
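      <p>As a minimal illustration of how two of these measures differ, the following Python sketch computes degree centrality and unnormalized betweenness centrality (using Brandes' algorithm) on a small undirected network. The gene names, the edge list, and the function names are hypothetical and chosen only for illustration; this is not the Ignet implementation.</p>
      <preformat preformat-type="code">
```python
from collections import deque

# Hypothetical toy gene network (undirected); names are illustrative only.
EDGES = [("IFNG", "STAT1"), ("IFNG", "IL12B"), ("IFNG", "TNF"), ("TNF", "IL6")]


def build_adjacency(edges):
    """Return an adjacency map {node: set(neighbors)} for an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized betweenness via Brandes' algorithm (unweighted graph)."""
    score = {node: 0.0 for node in adj}
    for source in adj:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        preds = {node: [] for node in adj}
        sigma = {node: 0 for node in adj}
        dist = {node: -1 for node in adj}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] == -1:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies, farthest nodes first.
        delta = {node: 0.0 for node in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {node: value / 2 for node, value in score.items()}


adjacency = build_adjacency(EDGES)
print(degree_centrality(adjacency))
print(betweenness_centrality(adjacency))
```
      </preformat>
      <p>In this toy network the hub gene scores highest on both measures, while a gene with only two neighbors still obtains a non-zero betweenness because it is the only route to a peripheral gene.</p>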
      <p>
        We have named our literature mining strategy Centrality
and Ontology-based Network Discovery using Literature
data (CONDL)
        <xref ref-type="bibr" rid="ref6">(Ozgur et al., 2011)</xref>
        . CONDL was
successfully applied to extract and analyze IFN-γ and
vaccine-related gene interaction networks as well as vaccine and
fever-related gene interaction networks
        <xref ref-type="bibr" rid="ref2">(Hur et al., 2012)</xref>
        .
Based on the CONDL strategy, we have developed Ignet
(http://ignet.hegroup.org), a web-based literature mining
database system that stores gene-gene interactions extracted
from PubMed abstracts. A gene–gene interaction in this
study corresponds to an interaction between genes and/or
gene products such as proteins.
      </p>
    </sec>
    <sec id="sec-2">
      <title>FEATURES AND USAGE</title>
      <p>
        Briefly, all article abstracts available in PubMed were
retrieved. The sentences, split using Java’s built-in sentence splitter
(BreakIterator), were examined using SciMiner
        <xref ref-type="bibr" rid="ref4">(Hur et al., 2009)</xref>
        to identify gene names and interaction keyword(s)
(e.g., interacts, binds, activates) represented in INO. We
obtained the dependency parse trees of the sentences using
the Stanford Parser
(http://nlp.stanford.edu/software/lexparser.shtml) and extracted the shortest dependency path
between each pair of genes in a sentence. Our assumption is
that the shortest path between two gene names in a
dependency tree is a good description of the semantic relation
between them in the corresponding sentence. We defined an
edit distance-based kernel function among these dependency
paths and used support vector machines (SVM) in the
SVMlight package
        <xref ref-type="bibr" rid="ref5">(Joachims, 1999)</xref>
        to classify each path as
describing an interaction between the gene pair or not
        <xref ref-type="bibr" rid="ref1">(Erkan et al., 2007)</xref>
        . The value output by the decision
function of the SVM classifier (i.e., the score field in Fig. 1B)
can be used as a confidence score for the
association between two genes in a sentence. A positive
score means that the SVM classifier predicts an
“interaction”, whereas a negative score corresponds to a prediction of
“not interaction”. The larger the absolute value of a score,
the more confident the classifier is in its classification
decision. The higher the score of a sentence, the more likely it
is that the sentence describes an interaction between the pair
of genes. The current database contains only those
interactions with a positive SVM score.
      </p>
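      <p>The path extraction, path comparison, and score filtering steps described above can be sketched in Python as follows. The dependency edges are hand-written for a made-up sentence rather than produced by the Stanford Parser, and the similarity function exp(-gamma * distance) is only one plausible way to turn an edit distance into a similarity; the exact kernel, its parameters, and all function names here are assumptions, not the Ignet implementation.</p>
      <preformat preformat-type="code">
```python
import math
from collections import deque

# Hypothetical (head, dependent) edges for the made-up sentence
# "GENE1 directly activates GENE2 in macrophages"; not real parser output.
DEP_EDGES = [
    ("activates", "GENE1"),
    ("activates", "directly"),
    ("activates", "GENE2"),
    ("activates", "macrophages"),
    ("macrophages", "in"),
]


def shortest_dependency_path(edges, start, goal):
    """Breadth-first search over the dependency tree, treated as undirected."""
    adj = {}
    for head, dep in edges:
        adj.setdefault(head, []).append(dep)
        adj.setdefault(dep, []).append(head)
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # the two tokens are not connected


def edit_distance(a, b):
    """Token-level Levenshtein distance between two paths."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def path_similarity(a, b, gamma=1.0):
    """Assumed similarity exp(-gamma * distance); gamma is illustrative."""
    return math.exp(-gamma * edit_distance(a, b))


def keep_predicted_interactions(scored):
    """Keep (sentence_id, score) pairs whose SVM decision value is positive."""
    return [(sid, score) for sid, score in scored if score > 0]


path = shortest_dependency_path(DEP_EDGES, "GENE1", "GENE2")
print(path)
print(path_similarity(path, ["GENE1", "binds", "GENE2"]))
print(keep_predicted_interactions([("s1", 1.3), ("s2", -0.4)]))
```
      </preformat>
      <p>Because the two gene mentions hang off the same verb, the path between them skips the adverb and the prepositional phrase, which is the intuition behind using the shortest dependency path as a compact description of the relation.</p>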
    </sec>
    <sec id="sec-3">
      <title>IGNET USE CASE DEMONSTRATION</title>
      <p>Ignet provides a user-friendly web query interface (Fig. 1). A
user can query one gene or two genes. Each gene has its
own centrality scores, which indicate its degree of
importance in a network. All sentences associated with the
queries are retrieved, with gene names and INO interaction
verbs highlighted. Clicking a specific INO interaction verb
opens a page that shows the hierarchy of the INO verbs
(Fig. 1C).</p>
      <p>Ignet also includes a subprogram called
Dignet (http://ignet.hegroup.org/dignet), which applies a
PubMed search to define the scope of papers for generating
the network, and then uses the Ignet execution pipeline to
generate gene-gene interactions and networks and to calculate
centrality scores for the genes in the networks.</p>
    </sec>
    <sec id="sec-4">
      <title>SUMMARY</title>
      <p>
        Ignet is a web-based literature mining system that integrates
the centrality-based literature mining approach with
INO-based ontology analysis of interaction types. The gene-gene
relationships are extracted using machine learning methods
that exploit the syntactic and semantic structures of the sentences.
To the best of our knowledge, Ignet is the first web system
that provides centrality analysis for literature-mined gene
interaction networks and ontology representation of
interaction types. Ignet not only provides access to
automatically extracted gene interactions, but also enables
the generation of new hypotheses
        <xref ref-type="bibr" rid="ref2">(Özgür et al., 2010; Özgür et
al., 2011; Hur et al., 2012)</xref>
        .
      </p>
    </sec>
    <sec id="sec-5">
      <title>ACKNOWLEDGEMENTS</title>
      <p>This work was supported by grant R01AI081062 from the
US NIH National Institute of Allergy and Infectious
Diseases and by Marie Curie FP7-Reintegration-Grants within the
7th European Community Framework Programme.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Erkan</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ozgur</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Radev</surname>
            <given-names>D</given-names>
          </string-name>
          :
          <article-title>Semi-Supervised Classification for Extracting Protein Interaction Sentences using Dependency Parsing</article-title>
          .
          <source>In Proceedings EMNLP-CoNLL</source>
          .
          <year>2007</year>
          :
          <fpage>228</fpage>
          -
          <lpage>237</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Hur</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ozgur</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>He</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Identification of fever and vaccine-associated gene interaction networks using ontology-based literature mining</article-title>
          .
          <source>J Biomed Semantics</source>
          <volume>3</volume>
          ,
          <fpage>18</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Hur</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ozgur</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>He</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          (
          <year>2015</year>
          ).
          <article-title>Development and application of an interaction network ontology for literature mining of vaccine-associated gene-gene interactions</article-title>
          .
          <source>J Biomed Semantics</source>
          <volume>6</volume>
          ,
          <fpage>2</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Hur</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schuyler</surname>
            ,
            <given-names>A.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>States</surname>
            ,
            <given-names>D.J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Feldman</surname>
            ,
            <given-names>E.L.</given-names>
          </string-name>
          (
          <year>2009</year>
          ).
          <article-title>SciMiner: web-based literature mining tool for target identification and functional enrichment analysis</article-title>
          .
          <source>Bioinformatics</source>
          <volume>25</volume>
          ,
          <fpage>838</fpage>
          -
          <lpage>840</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Joachims</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>1999</year>
          ).
          <article-title>Making large-scale support vector machine learning practical</article-title>
          , in
          <source>Advances in Kernel Methods: Support Vector Learning</source>
          , eds.
          <string-name>
            <given-names>B.</given-names>
            <surname>Schölkopf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.J.C.</given-names>
            <surname>Burges</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.J.</given-names>
            <surname>Smola</surname>
          </string-name>
          (Cambridge, MA: MIT Press),
          <fpage>169</fpage>
          -
          <lpage>184</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Ozgur</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Radev</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>He</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>Mining of vaccine-associated IFN-gamma gene interaction networks using the Vaccine Ontology</article-title>
          .
          <source>J Biomed Semantics</source>
          <volume>2 Suppl 2</volume>
          ,
          <fpage>S8</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Xiang</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mungall</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruttenberg</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>He</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          (
          <year>Year</year>
          ).
          <article-title>"Ontobee: A linked data server and browser for ontology terms"</article-title>
          ,
          <source>in: The 2nd International Conference on Biomedical Ontologies (ICBO), CEUR Workshop Proceedings</source>
          , Pages
          <fpage>279</fpage>
          -
          <lpage>281</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>