<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Exploring the Relation between Biomedical Entities and Government Funding</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fang Tan</string-name>
          <email>cathytf@163.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Siting Yang</string-name>
          <email>524228058@qq.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiaoyan Wu</string-name>
          <email>wxy1954174163@163.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jian Xu</string-name>
          <email>issxj@mail.sysu.edu.cn</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Information Management, Sun Yat-Sen University</institution>
          ,
          <addr-line>Guangzhou Guangdong</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <fpage>50</fpage>
      <lpage>53</lpage>
      <abstract>
        <p>To analyze how government funding promotes scientific research in medicine and to help the government manage research funds more rationally, this study proposes a framework for analyzing the relationship between biomedical entities and funding. The framework consists of four parts: acquisition of biomedical abstracts and NIH funding information together with biomedical entity extraction; development trend analysis of biomedical entities; analysis of the most funded entities; and analysis of the relationship between entity research popularity and government funding. The preliminary results are as follows: genetic research is in a period of rapid development, while species research is in a “flat period”; disease research has received continuous attention from NIH; and the stimulating effect of government funding on research popularity is decreasing and is affected by various factors.</p>
      </abstract>
      <kwd-group>
        <kwd>Entitymetrics</kwd>
        <kwd>Biomedical entities</kwd>
        <kwd>Evolutionary trend</kwd>
        <kwd>Funding</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <p>• Applied computing → Bioinformatics • Applied computing →
Computing in government • Information systems → Information
retrieval</p>
      <p>Keywords: Entitymetrics, biomedical entities, evolutionary trend, funding</p>
    </sec>
    <sec id="sec-2">
      <title>INTRODUCTION</title>
      <p>
        By 2019, the total number of articles indexed in PubMed, the
database of biomedical papers, had reached 29 million [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], and
statistically, nearly 1/3 of US patents come directly from federally
funded programs [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], meaning that the federal government plays
an important role in the development of scientific research.
Entitymetrics was originally proposed by Ding et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Current
research around entities in medicine mainly includes the
identification and classification of named entities [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and the
extraction of entity relationships [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], while research on
government funding is limited to quantifying the effects of
government funds in terms of institutions, patents, employment
resolution capacity, etc. [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ]. Meanwhile, most studies
of scientific research and funding are limited to exploring
the relationship between funding and a few indicators of research
output (e.g., publication counts and citations), and lack a
detailed study of the impact of funding at the entity level. Therefore,
this paper combines the PubMed database with funding
information published by the National Institutes of Health (NIH)
to compare the actual research focus and the funding focus in the
biomedical field from 1988 to 2017. First, the trajectory of the
field is mapped from an entity research perspective to
understand macro trends; second, the most funded entities are
counted and the focuses and tendencies of government funding on
biomedical entities are summarized; finally, the specific
relationship between biomedical research funding and research
popularity is further analyzed, which provides a reference for the
government's choice of funding recipients and funding levels.
      </p>
    </sec>
    <sec id="sec-3">
      <title>METHODOLOGY</title>
      <p>This paper proposes a framework for analyzing the
relationship between biomedical entities and funds, as shown in
Figure 1.</p>
      <p>Copyright 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        As shown in Figure 1, the analysis framework can be divided into four
main modules: data acquisition and entity extraction;
development trend analysis of biomedical entities; analysis of the
most funded entities; and analysis of the relationship between entity
research popularity and government funding.
1. Data acquisition and entity extraction. Biomedical
data from 1988 to 2017 are obtained from PubMed, funding
information and the research papers produced by funded projects are
obtained from the NIH funding database, and biomedical entities are extracted with
BERN [
        <xref ref-type="bibr" rid="ref10 ref8 ref9">8, 9, 10</xref>
        ]. BERN (Biomedical named entity
recognition and multi-type normalization) is a Web-based
biomedical text mining tool. Entity extraction
involves two steps: named entity recognition and entity
normalization. In total, 489,433 biomedical entities
from 1988 to 2017 and 2,082,652 research projects are obtained,
with about $1,0261.3 billion in funding [
        <xref ref-type="bibr" rid="ref10">10</xref>
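        ].
2. Development trend analysis of biomedical entities. Biomedical
entities are categorized into Species, Disease, Gene/Protein, and
Drug/Chemical for evolutionary analysis. Table 1 shows the
number of entities of each type.
      </p>
      <table-wrap>
        <label>Table 1</label>
        <caption>
          <p>Number of entities of each type</p>
        </caption>
        <table>
          <thead>
            <tr><th>Entity Type</th><th>Number</th></tr>
          </thead>
          <tbody>
            <tr><td>Species</td><td>84,203</td></tr>
            <tr><td>Disease</td><td>36,704</td></tr>
            <tr><td>Gene/Protein</td><td>25,489</td></tr>
            <tr><td>Drug/Chemical</td><td>134,574</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>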
3. Analysis of the most funded entities. Combined with the
biomedical entity data, the entities mentioned in the articles produced by
NIH projects are extracted to compute the amount of funding for each
entity. We define the funding for an entity as the sum of the
funding for all articles in which the entity appears.
4. Analysis of the relationship between entity research popularity
and government funding. We define the research popularity of an entity
as the number of papers in which the entity occurs. Thus, for each of the
four entity types, annual counts are computed according to the
publication year of the papers in which the entity appears. The years
1988, 1998, 2008, and 2017 are selected, and scatter plots are drawn with
the entity's research popularity on the vertical axis and the entity's
annual funding amount, calculated in step 3, on the horizontal axis.</p>
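      <p>The definitions in steps 3 and 4 can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each paper record carries a publication year, a single funding amount, and the set of normalized entities it mentions; the record layout, the entity names, and the amounts are illustrative and are not taken from the NIH data.</p>
      <preformat>
```python
from collections import defaultdict

# Hypothetical records: (paper_id, year, funding_amount, entities_in_paper).
# Layout and values are illustrative assumptions, not NIH data.
papers = [
    ("p1", 1988, 100.0, {"BRCA1", "breast cancer"}),
    ("p2", 1988, 50.0, {"BRCA1"}),
    ("p3", 1998, 200.0, {"breast cancer", "tamoxifen"}),
]


def entity_year_stats(records):
    """Return {(entity, year): (popularity, funding)}.

    popularity: number of papers of that year in which the entity occurs.
    funding: sum of the funding of those papers.
    """
    popularity = defaultdict(int)
    funding = defaultdict(float)
    for _paper_id, year, amount, entities in records:
        for entity in set(entities):  # count an entity once per paper
            popularity[(entity, year)] += 1
            funding[(entity, year)] += amount
    return {key: (popularity[key], funding[key]) for key in popularity}


stats = entity_year_stats(papers)
print(stats[("BRCA1", 1988)])  # (2, 150.0)
```
      </preformat>
      <p>Each (popularity, funding) pair in this sketch corresponds to one point of a scatter plot for the given year, with funding on the horizontal axis and popularity on the vertical axis.</p>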
    </sec>
    <sec id="sec-4">
      <title>PRELIMINARY RESULTS</title>
    </sec>
    <sec id="sec-5">
      <title>Development trend analysis of biomedical entity</title>
      <p>Based on the change in the number of research entities of each
type, the development trend of the biomedical field over the past three
decades is analyzed. The number of entities studied in each year is
the number of biomedical entities of each type mentioned in all papers
published in that year. Figure 2 shows the number of research
entities of each type over time. In terms of
development trend, the number of gene/protein entities is rising
the fastest and is in a stage of rapid development. Research
on species entities is in a flat stage, and these entities are fewer in number.</p>
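      <p>One way to read the yearly count described above is as the number of distinct entities of a type mentioned in that year's papers. The following Python sketch shows that reading; the mention tuples and the entity identifiers are hypothetical, and counting distinct identifiers is an assumption rather than a detail stated in the paper.</p>
      <preformat>
```python
from collections import defaultdict

# Hypothetical entity mentions: (publication_year, entity_type, entity_id).
mentions = [
    (1988, "gene/protein", "G1"),
    (1988, "gene/protein", "G1"),  # repeated mention of the same entity
    (1988, "species", "S1"),
    (1998, "gene/protein", "G1"),
    (1998, "gene/protein", "G2"),
]


def distinct_entities_per_year(rows):
    """Return {(year, entity_type): number of distinct entity ids}."""
    seen = defaultdict(set)
    for year, entity_type, entity_id in rows:
        seen[(year, entity_type)].add(entity_id)
    return {key: len(ids) for key, ids in seen.items()}


counts = distinct_entities_per_year(mentions)
print(counts[(1998, "gene/protein")])  # 2
```
      </preformat>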
    </sec>
    <sec id="sec-6">
      <title>Analysis of the relationship between research popularity of entity and government funding</title>
      <p>Based on biomedical entities of the four types (species,
disease, gene/protein, and drug/chemical), scatter plots are drawn for
the years 1988, 1998, 2008, and 2017, and the
relationship between entity research popularity and government
funding is analyzed visually to identify the driving effect of
funding on research in each field from the entity perspective.</p>
      <p>As shown in Figure 3, for species entities in 1988, a small increase in funding is
followed by a significant increase in research popularity. In the
three later years (1998, 2008, and 2017), the linear fits reveal that, with the passage of
time and the increase in funding amount, the stimulating effect
of funding on the popularity of species research slows
down.</p>
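      <p>The slope comparison behind this observation can be sketched with an ordinary least-squares line fit. The paper does not state how its linear fits were computed, so the method below is an assumption, and the two sets of (funding, popularity) points are made-up values chosen only to show a steeper and a flatter fit.</p>
      <preformat>
```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical points: x is funding, y is research popularity.
early = fit_line([1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
late = fit_line([10.0, 20.0, 30.0], [10.0, 20.0, 30.0])

# A smaller slope means each extra unit of funding is associated
# with a smaller gain in popularity.
print(early[0], late[0])
```
      </preformat>
      <p>In this toy data the fitted slope falls from the earlier to the later year, which is the pattern the text describes as a weakening stimulating effect.</p>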
      <p>As shown in Figure 4, the slopes obtained by
fitting linear trends to disease entities in the four years are slightly
larger than those obtained for species entities. As with species
entities, in 1988 a small increase in funding is followed by a
significant increase in research popularity. As the years go by, the
increase in funding amount exceeds the increase in research
popularity and the slope of the fitted line gradually decreases, which
means that the stimulating effect of the funding amount on
research popularity is gradually slowing down. In 2017, there are
more entities with high funding and low research popularity,
which could be related to the emergence of new types of
entities.</p>
      <p>As shown in Figure 5, for gene/protein entities, the initial trend
in 1988 is similar to that of the first two types (species and disease
entities), while later, especially in 2017, the upper
limit of research popularity has clearly declined.</p>
      <p>As shown in Figure 6, the pattern for
drug/chemical entities is similar to that of gene/protein entities.
The upper limit of research popularity has declined over time,
which indicates that, as the years go by, the amount of funding
no longer plays a significant role in the research popularity of
drug/chemical entities.</p>
      <p>The above analysis suggests that the factors influencing
changes in entity research popularity are multifaceted and
complex, rather than a simple linear effect of research
funding, and that this complexity has grown over the years.</p>
    </sec>
    <sec id="sec-7">
      <title>CONCLUSION AND FUTURE WORK</title>
    </sec>
    <sec id="sec-8">
      <title>Conclusion</title>
      <p>To our knowledge, studies that link entities to government funding and explore
trends from an entity perspective are rare. This study puts forward a preliminary research idea,
applying entitymetrics to the biomedical field from the
perspective of scientific research funds, and carries out a
preliminary exploration of research trends and knowledge discovery.
The conclusions are as follows: a) genetic research is
in a period of rapid development, while species
research is in a “flat period”; b) disease research has received NIH’s
continuous attention; c) the stimulating effect of government
funding on research popularity is decreasing and is affected
by various factors. These findings provide the basis for a
follow-up study.</p>
    </sec>
    <sec id="sec-9">
      <title>Future work</title>
      <p>Inspired by the initial results, our future work will focus on a
more in-depth exploration of the relationship between government
funding and entity development. In this study, we summarized the
trends in four categories of entities in the biomedical field and
counted the entities that received the highest funding. However, is
there any commonality among these entities? Is research on entities
with certain characteristics always more likely to be
funded by the government? In addition, the present study shows
that the incentive effect of increased government funding on
research in various fields is decreasing, while the impact of other
factors, such as the continuity of government funding, on research
popularity has not been explored. Therefore, further research will
examine the characteristics of the funded
entities and the rules of government funding.</p>
    </sec>
    <sec id="sec-10">
      <title>ACKNOWLEDGMENTS</title>
      <p>We thank the editors and the anonymous reviewers for their
insightful suggestions on this work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Nicolas</given-names>
            <surname>Fiorini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kathi</given-names>
            <surname>Canese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Grisha</given-names>
            <surname>Starchenko</surname>
          </string-name>
          , et al.,
          <year>2018</year>
          .
          <article-title>Best match: new relevance search for PubMed</article-title>
          .
          <source>PLOS Biology</source>
          <volume>16</volume>
          ,
          <issue>8</issue>
          (Aug, 2018),
          <elocation-id>e2005343</elocation-id>
          . DOI: https://doi.org/10.1371/journal.pbio.2005343.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Guanghui</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Junlian</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Baokun</given-names>
            <surname>Xing</surname>
          </string-name>
          , et al.,
          <year>2019</year>
          .
          <article-title>Study of Named Entity Recognition in Medical Treatment Based on Literatures of Chinese Case Reports</article-title>
          .
          <source>Journal of Medical Intelligence</source>
          <volume>40</volume>
          ,
          <issue>6</issue>
          (May,
          <year>2019</year>
          ),
          <fpage>54</fpage>
          -
          <lpage>59</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Ying</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Min</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jia</given-names>
            <surname>Han</surname>
          </string-name>
          , et al.,
          <year>2013</year>
          .
          <article-title>Entitymetrics: Measuring the impact of entities</article-title>
          .
          <source>PLoS ONE</source>
          <volume>8</volume>
          ,
          <issue>8</issue>
          (Aug, 2013),
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          . DOI: https://doi.org/10.1371/journal.pone.0071416.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Yuan</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Yanqiu</given-names>
            <surname>Ge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Qiang</given-names>
            <surname>Wang</surname>
          </string-name>
          , et al.,
          <year>2018</year>
          .
          <article-title>Medical Name Entity Recognition and Application in Chinese Admission Record of Stroke Patients Based on CRF and RUTA rule</article-title>
          .
          <source>Journal of Sun Yat-sen University (Medical Sciences)</source>
          <volume>39</volume>
          ,
          <issue>3</issue>
          (May, 2018),
          <fpage>455</fpage>
          -
          <lpage>462</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Xiuyan</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Lei</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <year>2013</year>
          .
          <article-title>Extract Semantic Relations Between Biomedical Entities Applied Hybrid Method</article-title>
          .
          <source>New Technology of Library and Information Service</source>
          <volume>29</volume>
          ,
          <issue>3</issue>
          (Mar, 2013),
          <fpage>77</fpage>
          -
          <lpage>82</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Yongjian</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jizhong</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <year>2008</year>
          .
          <article-title>A Study of the Relationship between Government R&amp;D Funding and Business Technology Innovation Activities</article-title>
          .
          <source>China Soft Science</source>
          ,
          <volume>11</volume>
          (Nov
          <year>2008</year>
          ),
          <fpage>141</fpage>
          -
          <lpage>148</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Paul A.</given-names>
            <surname>David</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Bronwyn H.</given-names>
            <surname>Hall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Andrew A.</given-names>
            <surname>Toole</surname>
          </string-name>
          ,
          <year>2000</year>
          .
          <article-title>Is Public R&amp;D a Complement or a Substitute for Private R&amp;D--A Review of the Econometric Evidence</article-title>
          .
          <source>Research Policy</source>
          <volume>29</volume>
          ,
          <issue>4-5</issue>
          (Apr, 2000),
          <fpage>497</fpage>
          -
          <lpage>529</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ming-Wei</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kenton</given-names>
            <surname>Lee</surname>
          </string-name>
          , et al.,
          <year>2018</year>
          .
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>
          . arXiv:1810.04805. Retrieved from https://arxiv.org/abs/1810.04805.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Donghyeon</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jinhyuk</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Chan Ho</given-names>
            <surname>So</surname>
          </string-name>
          , et al.,
          <year>2019</year>
          .
          <article-title>A neural named entity recognition and multi-type normalization tool for biomedical text mining</article-title>
          .
          <source>IEEE Access</source>
          <volume>7</volume>
          (Jan, 2019),
          <fpage>73729</fpage>
          -
          <lpage>73740</lpage>
          . DOI: https://doi.org/10.1109/ACCESS.2019.2920708.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Jian</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Sunkyu</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Min</given-names>
            <surname>Song</surname>
          </string-name>
          , et al.,
          <year>2020</year>
          .
          <article-title>Building a PubMed knowledge graph</article-title>
          .
          <source>Scientific Data</source>
          <volume>7</volume>
          ,
          <issue>1</issue>
          (Jun, 2020),
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          . DOI: https://doi.org/10.1038/s41597-020-0543-2.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>